matching strategy "all" is not matching expected hits #4636

Open
mdostmann opened this issue May 15, 2024 · 3 comments
Labels
support Issues related to support questions

Comments

@mdostmann

Describe the bug
When using matching strategy 'all', I expect that documents where all search terms of the query match at least one searchable attribute are considered as hits.
However, it seems that for some reason documents that should get considered as valid hits are ignored and not returned in the result.

To Reproduce
Steps to reproduce the behavior:

  1. set up a fresh Meilisearch v1.8 instance
  2. add movies.json to the index
  3. do a search with matchingStrategy set to 'all' with one search term
    Request POST /indexes/movies/search:
    { "q": "Taisto", "matchingStrategy": "all" }
    Result
    1 hit, movie "Ariel", as expected
  4. do a search with matchingStrategy set to 'all' with two search terms
    Request POST /indexes/movies/search
    { "q": "Kasu Taisto", "matchingStrategy": "all" }
    Result
    no hits
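
The two requests above can be scripted for easier reproduction. This is a minimal sketch using only the Python standard library; the host `http://localhost:7700`, the optional API key, and the helper names `build_search_request` and `count_hits` are assumptions, not part of the original report.

```python
import json
import urllib.request

# Assumption: a local instance on the default port; adjust as needed.
HOST = "http://localhost:7700"


def build_search_request(query, api_key=None):
    """Build the POST /indexes/movies/search request used in the steps above."""
    body = json.dumps({"q": query, "matchingStrategy": "all"}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        f"{HOST}/indexes/movies/search",
        data=body,
        headers=headers,
        method="POST",
    )


def count_hits(query, api_key=None):
    """Send the request to a running instance and count the returned hits."""
    with urllib.request.urlopen(build_search_request(query, api_key)) as resp:
        return len(json.load(resp)["hits"])
```

Calling `count_hits("Taisto")` and `count_hits("Kasu Taisto")` against the instance from step 1 should reproduce the two results listed above.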

Expected behavior
In the given example, the overview attribute of the movie contains both search terms, so the movie should be returned as a hit.

Meilisearch version:
V1.8

Additional context
The behaviour can also be reproduced via the matching strategy example

@dureuill
Contributor

dureuill commented May 16, 2024

Hello 👋

In your query "Kasu Taisto", it looks like the first word is a prefix of "Kasurinen" from the documents.

Prefix search is only applied on the last query term, even with the all matching strategy.

Looking for Kasurinen Taisto does return the expected hit.

I can see why the behavior is surprising. Applying prefix search only to the last term is for performance reasons.
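
The rule described in this comment can be illustrated with a toy model. This is only a sketch of the stated behaviour (exact match for every term except the last, which may match as a prefix); it is not Meilisearch's implementation, it ignores typo tolerance and ngrams, and the `overview` string is a hypothetical stand-in for the real document text.

```python
def matches_all(query, document_text):
    """Toy model of the 'all' strategy with prefix search on the last term only."""
    words = document_text.lower().split()
    terms = query.lower().split()
    for i, term in enumerate(terms):
        is_last = i == len(terms) - 1
        if is_last:
            # Only the last query term may match as a prefix of a word.
            found = any(word.startswith(term) for word in words)
        else:
            # Every earlier term must match a whole word.
            found = term in words
        if not found:
            return False
    return True


# Hypothetical stand-in for the movie's overview attribute.
overview = "Taisto Kasurinen is a Finnish coal miner"

print(matches_all("Kasu Taisto", overview))       # "Kasu" is not the last term
print(matches_all("Kasurinen Taisto", overview))  # both terms are whole words
print(matches_all("Taisto Kasu", overview))       # "Kasu" is last, prefix allowed
```

In this model, moving the incomplete word to the end of the query is what makes the match succeed.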

@mdostmann
Author

Hi @dureuill and thank you for your response.
I think I do not get how this is supposed to work. E.g., searching for "Ted John" will return documents where both terms match, same for "Irving John".
"Irvi John", on the other hand, does not match any document.
"Taisto miner Finnish Kasu" does match, as does "Taisto miner Finnis Kasu" but not "Taisto miner Finn Kasu".

This is probably the intended behaviour, although I'm having a rough time figuring out what I can expect to match, especially due to this blog: "This matching strategy is called: all because all query terms must be present in the document for it to be returned."

The examples above all use query terms that seem to be present in the documents, either in full or as proper prefixes.

Is this really the intended behaviour?

@dureuill
Contributor

Let's break down what happens in all the cases you're mentioning. When trying to understand relevancy, it can help to enable showRankingScoreDetails, which will give some information on which rules were applied.
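
As a small illustration, a search body with score details enabled might look like the sketch below. The query string is taken from this thread; the variable name `payload` is only for the example. Each hit in the response should then carry a `_rankingScoreDetails` field.

```python
import json

# Search body for POST /indexes/movies/search with score details switched on.
payload = {
    "q": "Kasurinen Taisto",
    "matchingStrategy": "all",
    "showRankingScoreDetails": True,
}

print(json.dumps(payload, indent=2))
```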

The TL;DR is that there are multiple effects at work, prefix is one of them, but there's also typo tolerance (which will depend on the length of words and the number of missing letters) and ngrams (can't find docs about them right now but the intuition is that they will help you find documents when the space is at the wrong location, e.g. finding documents for "batman" when querying "bat man").

  • "Ted John" and "Irving John" -> both words are in documents
  • "Irvi John" -> Unlikely to have documents containing "Irvi". Going to "Irving" via typo requires two typos; in default settings two typos are only allowed for query words of 9+ characters
  • "Taisto miner Finnish Kasu" -> prefix on Kasu
  • "Taisto miner Finnis Kasu" -> prefix on Kasu + one typo on "Finnish" (one typo triggers for words of 5+ characters in the query in default settings)
  • "Taisto miner Finn Kasu" -> prefix on Kasu, too many typos to go back to Finnish.
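
The typo part of the cases above can be approximated with a short sketch. It assumes the default thresholds mentioned in this comment (one typo from 5 characters, two typos from 9 characters) and uses plain Levenshtein distance as a stand-in for the engine's typo counting, which is a simplification. The function names are hypothetical.

```python
def allowed_typos(term, one_typo_min=5, two_typos_min=9):
    """Typo budget for a query term, based on its length (default thresholds)."""
    if len(term) >= two_typos_min:
        return 2
    if len(term) >= one_typo_min:
        return 1
    return 0


def edit_distance(a, b):
    """Plain Levenshtein distance: insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def typo_match(term, word):
    """True if `word` is within the typo budget of the query term."""
    return edit_distance(term.lower(), word.lower()) <= allowed_typos(term)


for term, word in [("Finnis", "Finnish"), ("Finn", "Finnish"), ("Irvi", "Irving")]:
    print(term, "->", word, typo_match(term, word))
```

Under these assumptions "Finnis" reaches "Finnish" with its single allowed typo, while "Finn" and "Irvi" are too short to get any typo budget at all.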

> Is this really the intended behaviour?

Yes. If you don't want typo tolerance, you can disable it. However it is generally useful to humans typing queries.
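
For reference, disabling typo tolerance goes through the index's typo-tolerance settings route. The sketch below only builds the request; the host, the optional API key, and the helper name `build_disable_typo_request` are assumptions for illustration.

```python
import json
import urllib.request

# Assumption: a local instance on the default port; adjust as needed.
HOST = "http://localhost:7700"


def build_disable_typo_request(index_uid="movies", api_key=None):
    """Build a PATCH request that turns typo tolerance off for one index."""
    body = json.dumps({"enabled": False}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        f"{HOST}/indexes/{index_uid}/settings/typo-tolerance",
        data=body,
        headers=headers,
        method="PATCH",
    )
```

Passing the result to `urllib.request.urlopen` against a running instance would apply the setting. With typo tolerance off, a query like "Taisto miner Finnis Kasu" would be expected to stop matching.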

@curquiza curquiza added the support Issues related to support questions label May 16, 2024

3 participants