Add better few shot examples for response consistency eval #527

Open
sourabhagr opened this issue Feb 18, 2024 · 1 comment
Labels
good first issue Good for newcomers

Comments

@sourabhagr
Contributor

The few-shot example should include:

  1. An argument to justify why the given answer is appropriate for the given question.
  2. A score between 0 and 1, indicating how logical the argument is.
  3. An explanation for the score.

The relevant code is in "uptrain/operators/language/prompts/few_shots.py", in the variable RESPONSE_CONSISTENCY_FEW_SHOT__COT.
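To make the three requirements above concrete, here is a minimal sketch of how one few-shot example could be assembled as text. The `FewShotExample` class and its `render` method are hypothetical helpers, not part of UpTrain, and the `[Question]`, `[Context]`, and `[Response]` input labels are an assumption about how the prompt presents its inputs.

```python
from dataclasses import dataclass


@dataclass
class FewShotExample:
    """Hypothetical container for one few-shot example (not an UpTrain class)."""

    question: str
    context: str
    response: str
    argument: str      # why the response is appropriate for the question
    score: float       # how logical the argument is, from 0 to 1
    explanation: str   # why that score was given

    def render(self) -> str:
        # Reject scores outside the range the issue asks for.
        if not 0.0 <= self.score <= 1.0:
            raise ValueError(f"score must be in [0, 1], got {self.score}")
        # The bracketed labels are an assumed layout, one field per line.
        return "\n".join(
            [
                f"[Question]: {self.question}",
                f"[Context]: {self.context}",
                f"[Response]: {self.response}",
                f"[Argument]: {self.argument}",
                f"[Score]: {self.score}",
                f"[Explanation]: {self.explanation}",
            ]
        )


# Placeholder values only; a real example needs real content.
example = FewShotExample(
    question="<question text>",
    context="<context text>",
    response="<response text>",
    argument="<argument for the response>",
    score=0.8,
    explanation="<why the score is 0.8>",
)
print(example.render())
```

Keeping the fields in a small structure like this makes it harder to forget one of the three required parts when adding more examples.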

sourabhagr added the "bug" and "good first issue" labels and then removed the "bug" label on Feb 18, 2024
@sky-2002
Contributor

Something like this (I generated it using ChatGPT)? @sourabhagr

RESPONSE_CONSISTENCY_FEW_SHOT__COT = """
[Question]: Which Alex is being referred to in the last line?
[Context]: In a story, Alex is a renowned chef famous for their culinary skills, especially in Italian cuisine. They've recently been experimenting with French recipes, trying to fuse them with Italian dishes to create something unique. Alex's restaurant, which used to serve exclusively Italian dishes, now offers a hybrid menu that's gaining popularity. However, Alex has a twin named Alex, who is not involved in the culinary world but is an artist in the local community. The artist Alex's paintings are not good. But, her food is also delicious and is tasty.
[Response]: In the last line, it is referring to the renowned chef Alex, whose food is delicious and tasty.
[Argument]: The LLM's response identifies the renowned chef Alex as the subject of the last line, focusing on the established narrative that this Alex is known for their culinary expertise. This interpretation maintains consistency with the broader story arc, where chef Alex's skills and experimentation with cuisine are central themes.
[Score]: 0.8
[Explanation]: The response correctly identifies the renowned chef Alex as the subject of the last line based on the established narrative about culinary skills. However, it overlooks the possibility of the last line introducing a twist regarding the artist Alex's cooking abilities. The score of 0.8 reflects the response's strong alignment with the main storyline but acknowledges a slight deviation from addressing the potential new aspect introduced in the last line.
"""
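A candidate string like the one above could be sanity-checked before it goes into few_shots.py. The sketch below is only an illustration: `check_few_shot` is a hypothetical helper, not an UpTrain function, and it assumes each field sits on its own line in the `[Name]: value` form used in this comment.

```python
import re
from typing import List

# The three parts the issue asks every few-shot example to include.
REQUIRED_FIELDS = ("Argument", "Score", "Explanation")


def check_few_shot(text: str) -> List[str]:
    """Return a list of problems found in a few-shot string (empty if none)."""
    problems = []
    # Collect "[Name]: value" lines; [ \t]* keeps the match on a single line.
    fields = dict(re.findall(r"^\[(\w+)\]:[ \t]*(.*)$", text, flags=re.MULTILINE))

    for name in REQUIRED_FIELDS:
        if not fields.get(name, "").strip():
            problems.append(f"missing or empty field: {name}")

    raw_score = fields.get("Score", "").strip()
    if raw_score:
        try:
            score = float(raw_score)
        except ValueError:
            problems.append("score is not a number")
        else:
            if not 0.0 <= score <= 1.0:
                problems.append("score is outside [0, 1]")
    return problems
```

Calling `check_few_shot(RESPONSE_CONSISTENCY_FEW_SHOT__COT)` on the string above should return an empty list, since the argument, score, and explanation are all present and the score is 0.8.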
