Optimize docs search hyper parameters #320

creatorrr · 2024-05-07T11:47:14Z

Let's run an automated evaluation on the RAG dataset (using a local model, for example) and then tune the doc search hyperparameters based on the results. The parameters are:

- number of docs: k_docs
- confidence: docs_confidence

https://github.com/julep-ai/julep/blob/dev/agents-api/agents_api/models/entry/proc_mem_context.py#L13

rag dataset: rag-12000
contains three columns: context, question, answer

evaluation recipe:

create an agent
add all the documents from the context column as agent docs
for every row in the dataset (use the train split only)
- create a session with the agent
- ask the question from the question column (you can set max_tokens to 1, since we don't care about the returned answer)
- note the document-ids returned from session.chat
- get all documents using the document ids
- check whether the context (the value in that row) is among the fetched documents
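
The recipe above boils down to a hit rate: the fraction of rows whose context shows up in the retrieved documents. Here is a minimal sketch of that metric, not the actual harness. The `retrieve` callable is a hypothetical stand-in for "create a session, call session.chat, fetch the documents by id", and `toy_retrieve` with its tiny corpus exists only so the example runs on its own. Treating docs_confidence as a minimum similarity threshold is also an assumption.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical signature: (question, k_docs, docs_confidence) -> retrieved document texts.
# In the real evaluation this would wrap the session/chat/document-fetch steps.
Retriever = Callable[[str, int, float], List[str]]


def hit_rate(
    rows: Iterable[Tuple[str, str]],
    retrieve: Retriever,
    k_docs: int,
    docs_confidence: float,
) -> float:
    """Fraction of (context, question) rows whose context is among the retrieved docs."""
    hits = 0
    total = 0
    for context, question in rows:
        docs = retrieve(question, k_docs, docs_confidence)
        hits += int(context in docs)
        total += 1
    return hits / total if total else 0.0


# Toy data and retriever, for illustration only.
CORPUS = [
    "Paris is the capital of France.",
    "The Nile is a river in Africa.",
    "Python is a programming language.",
]

ROWS = [
    ("Paris is the capital of France.", "what is the capital of france"),
    ("The Nile is a river in Africa.", "where is the nile river"),
]


def toy_retrieve(question: str, k_docs: int, docs_confidence: float) -> List[str]:
    """Rank docs by word overlap with the question; keep at most k_docs above the threshold."""
    q = set(question.lower().split())
    scored = []
    for doc in CORPUS:
        d = set(doc.lower().rstrip(".").split())
        score = len(q & d) / len(q) if q else 0.0
        if score >= docs_confidence:  # assumed meaning of docs_confidence
            scored.append((score, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k_docs]]


if __name__ == "__main__":
    print(hit_rate(ROWS, toy_retrieve, k_docs=1, docs_confidence=0.5))
```

Swapping `toy_retrieve` for a function that talks to the agent would leave `hit_rate` unchanged, which keeps the metric separate from the API calls.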

cool thing: optuna: https://optuna.org/


creatorrr assigned whiterabbit1983 May 7, 2024

creatorrr added the good first issue, bounty program, and difficult labels and removed the good first issue label May 30, 2024

julep-ai locked and limited conversation to collaborators May 30, 2024

creatorrr converted this issue into discussion #361 May 30, 2024


This issue was moved to a discussion.
