I am currently using OllamaWeb-UI, and I am trying to figure out how RAG is implemented.
What I found out:
It looks like true RAG is implemented using LangChain and ChromaDB (cf. /backend/apps/rag/main.py).
I found several files related to this.
However, I need an explanation of the exact workflow. Is there documentation somewhere? I could not find any.
Here is what I see in the web browser:
_initNewChat
File { name: "cleanedMergedDiderot.txt_part2", lastModified: 1709209652720, webkitRelativePath: "", size: 6135799, type: "" }
Array [ {…} ]
length: 1
: Array []
siecle ; depuis ce tems, chaque secrétaire d'état a
ses archives ou son dépôt. ( G ) ( a ),Code des Grecs. Voyez Code canonique .,Mais M. Richer, célebre docteur de Sorbonne,
contrebalance cette autorité dans son histoire des conciles généraux, liv. I. chap. ij. num. 7. en rapportant,
[CUT TEXT]
mortel étoit une disposition indispensable pour la fréquente communion ; mais ils ont aussi pensé que
cette disposition étoit suffisante. . . . . . . . . .
vigostral-7b-chat.Q8_0.gguf:latest_
The HTTP request contains the following prompt template (it can be found in src/lib/utils/rag/index.ts):
_
Use the following context as your learned knowledge, inside <context></context> XML tags.
<context>
[context]
</context>
When answer to user:
If you don't know, just say that you don't know.
If you don't know when you are not sure, ask for clarification.
Avoid mentioning that you obtained the information from the context.
And answer according to the language of the user's question.
Given the context information, answer the query.
Query: [query]
_
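As far as I can tell, the [context] and [query] placeholders are simply substituted before the prompt is sent. Here is a minimal Python sketch of that step, to make the question concrete. The template is shortened, and `build_rag_prompt` is a hypothetical helper of mine, not the function in src/lib/utils/rag/index.ts, which may work differently.

```python
import re

# Shortened version of the template above; only the two placeholders matter here.
TEMPLATE = (
    "Use the following context as your learned knowledge.\n"
    "[context]\n"
    "Given the context information, answer the query.\n"
    "Query: [query]"
)


def build_rag_prompt(template: str, context: str, query: str) -> str:
    """Fill both placeholders in a single pass (hypothetical helper).

    A single pass avoids re-substituting a placeholder that happens to
    appear literally inside the retrieved context or the user's query.
    """
    values = {"[context]": context, "[query]": query}
    return re.sub(r"\[context\]|\[query\]", lambda m: values[m.group(0)], template)


prompt = build_rag_prompt(
    TEMPLATE,
    context="Paris is the capital of France.",
    query="What is the capital of France?",
)
print(prompt)
```

If this is right, what matters is what ends up in `context`: the whole file, or only selected chunks.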
My questions are:
I see that some text is sent to the model along with the query in the HTTP request. Are we simply extracting text from the provided document and adding it to the model's prompt? (That could not work for the full file, because it is too large for the model's context window.)
Or are we doing true RAG: searching for the parts of the text that are relevant to the request, and then adding only those to the model's context? What is the workflow? (I am not sure how RAG really works.)
I tried giving the model a 140 MB text file. I got no errors, but it did not seem to work: when I inspect the HTTP request to the model, the [context] part is empty. For a very small file (5 KB), the whole file seems to be placed in [context]. For a medium-sized text file (5 MB), only part of the text appears in [context], ending with ". . . . . . . . .". Can someone explain this, or point me to some documentation?
We are using "all-MiniLM-L6-v2" from sentence-transformers (https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2). According to its model card, this is a sentence-transformers model: it maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for tasks like clustering or semantic search.
So it takes our request, searches the vector database for chunks of text that match it, and returns them as plain text?
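To show what I mean by that last question, here is the workflow I have in mind as a minimal Python sketch. Everything in it is my assumption, not the project's code: the chunk sizes are arbitrary, `embed` is a toy word-count stand-in for all-MiniLM-L6-v2, and a plain list stands in for ChromaDB. Is this roughly what happens?

```python
import math
import re
from collections import Counter


def split_into_chunks(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Cut text into overlapping character windows (sizes chosen arbitrarily)."""
    if not text:
        return []
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]


def embed(text: str) -> Counter:
    """Toy stand-in for a sentence embedding: lowercase word counts."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query (a list stands in for the vector DB)."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


chunks = [
    "Each secretary of state has his own archives.",
    "Frequent communion requires a particular disposition.",
    "The history of general councils was written by Richer.",
]
# Only the best-matching chunk(s) would be pasted into [context] as plain text.
context = "\n".join(retrieve("Who wrote the history of the councils?", chunks, k=1))
print(context)
```

If the real pipeline works like this, it would explain why only part of a medium-sized file shows up in [context], but not why a very large file yields an empty one.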
Thank you for your answers