Hi, @wchen-github. The answer_md5 is calculated from the label field of the data entry. As you can see, the correct answer is assumed to be INSERT INTO Filmography (Year, Title, Role, Notes) VALUES ('2019', 'New Movie', 'Lead Actor', '-'). The capitalized 'Lead Actor' is probably what causes the difference in hash. We'll try to do better at data filtering and validation. There shouldn't be many similar exceptions. Thank you for your report!
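To illustrate why a case difference alone is enough to break the match, here is a minimal Python sketch. It does not reproduce the benchmark's actual hashing code: `table_md5` is a hypothetical helper that hashes each row and then hashes the sorted row digests, and the lowercase 'lead actor' value is only an assumption about what the model might have inserted.

```python
import hashlib


def table_md5(rows):
    """Hypothetical helper: order-independent MD5 digest of a table's rows."""
    # Hash each row separately, then hash the sorted row digests so that
    # row order does not affect the result.
    row_hashes = sorted(
        hashlib.md5(",".join(map(str, row)).encode("utf-8")).hexdigest()
        for row in rows
    )
    return hashlib.md5("".join(row_hashes).encode("utf-8")).hexdigest()


# One existing row from the Filmography table, for brevity.
existing = [("1985", "Back to the Future", "Jennifer Parker", "-")]

# Table state after the labeled INSERT (capitalized role).
expected = existing + [("2019", "New Movie", "Lead Actor", "-")]

# Assumed table state after a model INSERT that used a lowercase role.
produced = existing + [("2019", "New Movie", "lead actor", "-")]

print(table_md5(expected) == table_md5(produced))  # False
```

MD5 is case-sensitive, so any comparison based on it treats 'Lead Actor' and 'lead actor' as different answers even though the description does not specify the capitalization.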
I looked into one particular DBBench task. GPT-4 seems to have given the right answer, but the MD5 doesn't match.
Steps to reproduce the behavior:
Run a task with line #106 of dbbench/standard.jsonl:
{"description": "The film titled 'New Movie' will be added to the Filmography table with the lead actor role and a note of '-' for the year 2019.", "label": ["INSERT INTO Filmography (Year, Title, Role, Notes) VALUES ('2019', 'New Movie', 'Lead Actor', '-')"], "create": {"database": "fetaqa", "init": "fetaqa_init.sql"}, "table": {"table_name": "Filmography", "table_info": {"columns": [{"name": "Year", "type": "INT"}, {"name": "Title", "type": "TEXT"}, {"name": "Role", "type": "TEXT"}, {"name": "Notes", "type": "TEXT"}], "rows": [["1985", "Back to the Future", "Jennifer Parker", "-"], ["2008", "Still Waters Burn", "Laura Harper", "-"], ["2011", "Alien Armageddon", "Eileen Daly", "-"], ["2013", "You Are Not Alone", "Cristina's Mom", "Short film"], ["2013", "Max", "Mom", "Short film"], ["2014", "Starship: Rising", "Captain Savage", "-"], ["2015", "EP/Executive Protection", "Pam Travis", "-"], ["2015", "Back in Time", "Herself", "Back to the Future documentary"], ["2015", "Back to the 2015 Future", "Jennifer Parker", "Short film"], ["2017", "Vitals", "Margaret Parks", "-"], ["2018", "Groove Street", "Julie", "-"], ["1999", "The Matrix", "Trinity", "-"], ["2005", "Batman Begins", "Rachel Dawes", "-"], ["2010", "Inception", "Mal", "-"], ["2012", "The Avengers", "Black Widow/Natasha Romanoff", "-"], ["2014", "Interstellar", "Brand", "-"], ["2016", "La La Land", "Mia Dolan", "-"], ["2017", "Wonder Woman", "Wonder Woman/Diana Prince", "-"], ["2019", "Avengers: Endgame", "Black Widow/Natasha Romanoff", "-"], ["2021", "The Suicide Squad", "Harley Quinn", "-"], ["2022", "Black Panther: Wakanda Forever", "Okoye", "-"]]}}, "evaluation": "", "example": "", "type": ["INSERT"], "heads": ["Year", "Title", "Role", "Notes"], "add_description": "The name of this table is Filmography, and the headers of this table are Year,Title,Role,Notes.", "source": "fetaqa", "answer_md5": "[('ae2213ddbcb907c43fd757035b363328',)]"}
Get the output SQL command and MD5 from the output/runs.jsonl file:
This is only one example I collected. There are many errors of a similar kind. Can you please help me identify the issues I am facing?