Add near text search with multiple target vectors #4955
Will be an excellent addition! I would like to suggest the ability to configure scoring/ranking. For example, in a nearText query, results could be sorted by the minimum weighted average distance under a chosen distance metric (such as cosine). With 3 vectors, weights of [0.4, 0.3, 0.3] would weight the first vector more heavily. Depending on the distance metric, some normalization may be needed, especially if the vectors come from different embedding models.
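To make the suggestion concrete, here is a minimal sketch of weighted-average ranking over several target vectors. It is plain Python and not the scoring this PR implements; the function names, the `normalize` flag, and the min-max normalization are assumptions chosen for illustration.

```python
import math


def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity) between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def min_max_normalize(values):
    """Rescale values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def rank_by_weighted_distance(candidates, query_vectors, weights, normalize=False):
    """Rank candidates by the weighted average of per-target distances.

    candidates:    dict mapping object id -> list of vectors, one per target
    query_vectors: list of query vectors, one per target (same order)
    weights:       list of weights, one per target, e.g. [0.4, 0.3, 0.3]
    normalize:     hypothetical option; min-max normalizes each target's
                   distances across candidates before weighting
    """
    ids = list(candidates)
    per_target = []  # per_target[t][i] = distance of candidate i to query t
    for t, query in enumerate(query_vectors):
        column = [cosine_distance(candidates[obj_id][t], query) for obj_id in ids]
        per_target.append(min_max_normalize(column) if normalize else column)

    total_weight = sum(weights)
    scored = []
    for idx, obj_id in enumerate(ids):
        score = sum(w * per_target[t][idx] for t, w in enumerate(weights)) / total_weight
        scored.append((obj_id, score))
    # Smallest combined distance first.
    return sorted(scored, key=lambda pair: pair[1])
```

Dividing by the sum of the weights keeps the score on the same scale as a single distance even when the weights do not add up to 1.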
Related to this, but a different usage scenario, is a query that extends across collections and involves more than one vector. Given a data model like: Document (Collection), Topic (Collection), Image (Collection) Document: Topic: Image: Query:
So the Topic collection has two vectors and serves to "join" the two embedding spaces, allowing queries to traverse across them. One scenario where this arises is when there is an existing dataset for "documents" and an existing dataset for "images", and you want to query across them without having to modify the current data (or the processes that maintain it).
For the parallel N-vector query case, is there a concept of optimizing the ordering, such that the vector with the fewest nearby results acts as a gating factor on the others? In document search, if you were querying for "happy" AND "aardvark", you would search for "aardvark" first, since it is presumably less frequent and would help filter the "happy" results. The situation with vectors is not exactly the same, but I thought a similar process might help.
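As a rough illustration of the gating idea in the question, the sketch below evaluates the most selective target first and lets its matches restrict the remaining searches. It is a brute-force toy, not how the PR or any real index works; `matches`, `gated_search`, and the distance threshold are hypothetical, and the exact match count stands in for what would have to be a cheap selectivity estimate in practice.

```python
import math


def matches(vectors, query, max_distance, allowed=None):
    """Return ids whose vector lies within max_distance (Euclidean) of query.

    If `allowed` is given, only those ids are examined.
    """
    ids = vectors.keys() if allowed is None else allowed
    return {i for i in ids if math.dist(vectors[i], query) <= max_distance}


def gated_search(targets, max_distance):
    """Intersect per-target matches, most selective target first.

    targets: list of (vectors, query) pairs, one per target vector, where
             `vectors` maps object id -> vector. All targets are assumed
             to cover the same set of ids.
    """
    # Assumption: selectivity is measured by an exact brute-force count.
    # A real system would need a much cheaper estimate for this to pay off.
    ordered = sorted(targets, key=lambda t: len(matches(t[0], t[1], max_distance)))

    allowed = None
    for vectors, query in ordered:
        # Each later search only looks at ids that survived earlier targets.
        allowed = matches(vectors, query, max_distance, allowed)
        if not allowed:
            break
    return allowed if allowed is not None else set()
```

In the "happy" AND "aardvark" analogy, the target with the fewest nearby objects plays the role of "aardvark": it is searched first, and the broader target is only checked against its survivors.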
Co-authored-by: Marcin Antas <antas.marcin@gmail.com> Co-authored-by: Parker Duckworth <parkerduckworth@gmail.com>
Quality Gate failed
What's being changed:
Adds support for searching multiple target vectors in a single search. This should support all vector-related searches, but not aggregate. Can be tested with the Python client 4.7.0a0.
Review checklist