Use of OCR in Evaluation #95

bruceisme · 2024-04-27T08:33:46Z

Appendix A's Image-text Data Collection section states: "It is important to note that the OCR detector is utilized solely for generating enriched data and is not employed during testing." But the TextVQA script uses llava_textvqa_val_v051_ocr.jsonl, which contains OCR tokens. Have you ever tested a version without OCR on TextVQA, and was it worse than with llava_textvqa_val_v051_ocr.jsonl? Can we take this to mean that the model gets better results with OCR input?

yanwei-li · 2024-05-03T04:49:57Z

Hi, the statement in Appendix A means that we do not run an extra PaddleOCR detector for evaluation. For TextVQA, we keep the OCR tokens, the same as in LLaVA. Results should be worse without the original OCR tokens.
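
For anyone who wants to run the comparison asked about above, here is a minimal sketch for producing a question file without OCR tokens. It is not taken from the repository. It assumes each JSONL record keeps its prompt under a `text` key and that the OCR tokens sit on their own line beginning with `Reference OCR token:`; check both against your copy of llava_textvqa_val_v051_ocr.jsonl before relying on it. The function names and the output file name are hypothetical.

```python
import json

# Assumed prefix of the prompt line that carries the OCR tokens.
OCR_PREFIX = "Reference OCR token:"


def strip_ocr_line(prompt: str, prefix: str = OCR_PREFIX) -> str:
    """Drop every line of the prompt that starts with the OCR prefix."""
    kept = [
        line for line in prompt.split("\n")
        if not line.strip().startswith(prefix)
    ]
    return "\n".join(kept)


def strip_ocr_from_jsonl(src_path: str, dst_path: str, text_key: str = "text") -> int:
    """Copy a JSONL question file, removing the OCR line from each prompt.

    text_key is an assumption about the record layout.
    Returns the number of records written.
    """
    count = 0
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for raw in src:
            if not raw.strip():
                continue  # skip blank lines
            record = json.loads(raw)
            record[text_key] = strip_ocr_line(record[text_key])
            dst.write(json.dumps(record) + "\n")
            count += 1
    return count
```

Calling `strip_ocr_from_jsonl("llava_textvqa_val_v051_ocr.jsonl", "textvqa_val_no_ocr.jsonl")` and pointing the TextVQA evaluation at the new file would give the no-OCR number to compare against.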
