
Tool sometimes misinterprets multiple-choice questions #15

Open · timpaul opened this issue Apr 23, 2024 · 7 comments
Labels: bug (Something isn't working)

timpaul commented Apr 23, 2024

Accurately interpreting multiple-choice questions (beyond simple yes/no) is a challenge. Let's capture examples of the tool doing this successfully and unsuccessfully, to work out how we might improve its performance.

timpaul commented Apr 23, 2024

Here's a partially successful example for this image:

[image]

The different options for question 20 in the doc were correctly parsed, but the hint text was not.

timpaul commented Apr 23, 2024

Another mostly successful example from the same form as above:

[image]

The options for question 23 in the form were correctly determined, as was the fact that only one response is allowed.

The conditional date fields were not picked up, but this isn't surprising as the multiple-choice component doesn't support them.

This is a good example of where you might choose to structure this question differently in the web version anyway, using multiple pages and routing.
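
For illustration, here's one hedged sketch of that restructuring: the choice and the follow-up date sit on separate pages, joined by a routing rule. Every field name and all the wording below are hypothetical placeholders, not the tool's actual output format.

```python
# Hypothetical sketch: question 23 split across two pages with a routing
# rule, instead of a single question with conditional date fields.
pages = [
    {
        "id": "q23-choice",
        "question": "Which of these applies to you?",  # placeholder wording
        "answer_type": "single_choice",
        "options": ["Option A", "Option B"],
        # Only the answer that needs a follow-up date routes to the date page.
        "routes": {"Option A": "q23-date", "Option B": "next-section"},
    },
    {
        "id": "q23-date",
        "question": "When did this start?",  # placeholder wording
        "answer_type": "date",
        "routes": {"default": "next-section"},
    },
]
```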

timpaul commented Apr 23, 2024

Here's an example of it getting it wrong, from the same form:

[image]

It made two errors (both sketched below):

  1. It treated the hint text as the first option
  2. It assumed only one response was allowed
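
To make the failure concrete, here's a hedged sketch of the wrong and intended extractions. Again, the field names and wording are hypothetical placeholders, not the tool's actual output format.

```python
# Hypothetical sketch of the two failure modes described above.

# What the tool produced: the hint absorbed as an option, single response.
extracted = {
    "question": "Which of these apply?",  # placeholder wording
    "hint": None,
    "answer_type": "single_choice",  # wrong: renders as radios
    "options": ["Select all that apply", "Option A", "Option B"],
}

# What it should have produced: hint kept separate, multiple responses allowed.
intended = {
    "question": "Which of these apply?",
    "hint": "Select all that apply",
    "answer_type": "multiple_choice",  # right: renders as checkboxes
    "options": ["Option A", "Option B"],
}
```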

What's interesting (and frustrating) is that the question is nearly identical to this one, which was successfully parsed.

It does occasionally get it right:

[image]

timpaul changed the title from "Measure and improve performance on multiple-choice questions" to "Tool sometimes misinterprets multiple-choice questions" on Apr 23, 2024
timpaul added the bug label on Apr 23, 2024
timpaul commented Apr 23, 2024

Here's another example of a mostly successful extraction, from question 42 of this image:

[image]

The hint text isn't carried over as hint text; instead it's appended to the question title.
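
Roughly, the difference looks like this (hypothetical field names and wording again, not the tool's actual output format):

```python
# Hypothetical sketch of the question-42 failure mode: the hint survives,
# but lands in the wrong field.
extracted = {"question": "Question text. Hint text here.", "hint": None}
intended = {"question": "Question text", "hint": "Hint text here"}
```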

timpaul commented May 1, 2024

It's now getting an isolated version of this example right:

[image]

timpaul commented May 2, 2024

Another fail, from this image:

[image]

It chose checkboxes instead of radios. I wonder if I can get it to understand the difference based on the hint text?

timpaul commented May 2, 2024

Yes, I can!

[image]

This was fixed in this commit by adding the following to the description text for the answer_type object in the schema:

If any part of the question contains text like 'Tick the boxes...' it's a multiple_choice question.

I'd tried a few other variants before finding one that worked, which is interesting. I think what made it work was the confidence of the statement: saying 'if any part of the question', and stating that it is (rather than 'probably is') a multiple_choice question. It also helped to express it as a standalone sentence, rather than appending it as a clause to another sentence.

Notice that the question in the example doesn't contain the exact text that I cite in the schema, but it still matches.
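
As a rough illustration of where that sentence sits, here's a minimal sketch of an answer_type definition in a JSON-Schema-style structure. Only the quoted sentence and the answer_type name come from this thread; the surrounding structure and the enum values are assumptions.

```python
# Minimal sketch; the real schema's structure and enum values may differ.
answer_type = {
    "type": "string",
    "enum": ["text", "date", "yes_no", "single_choice", "multiple_choice"],  # assumed
    "description": (
        "The type of answer the question expects. "
        "If any part of the question contains text like 'Tick the boxes...' "
        "it's a multiple_choice question."
    ),
}
```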
