Guide on how to evaluate models #180

Open
kisimoff opened this issue Apr 3, 2024 · 1 comment
Comments


kisimoff commented Apr 3, 2024

I'm willing to test a few models and share the results.
I've looked at the README, but couldn't wrap my head around how to benchmark a model. Any help would be appreciated!

the-crypt-keeper (Owner) commented

The docs definitely need a rewrite, my apologies.

The general flow is:

  1. prepare.py
  2. interview*.py
  3. eval.py

In the dark days we had to deal with dozens of prompt formats, but these days prepare.py can be run with --chat <hfmodel> and it will sort the prompt format out for you. Roughly, a run looks like the sketch below.
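(Sketch only: the --chat flag is the one mentioned above, while the interview script name and the remaining arguments are placeholders, so check each script's --help and the README for the actual options.)

```
# 1. Build the prompts; --chat <hf-model-id> lets prepare.py work out
#    the prompt format for that model automatically.
python3 prepare.py --chat <hf-model-id>

# 2. Run whichever interview*.py script matches your backend
#    (script name below is just an example) against the prepared prompts.
python3 interview_cuda.py ...

# 3. Score the answers.
python3 eval.py ...
```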

Note that there are two interviews, junior-v2 and senior; I usually only run senior on strong models that score >90% on junior.
