Blog: Unreasonable Effectiveness of literally using your brain #339

jxnl · 2024-01-12T04:36:12Z

This is so incomplete it's laughable, but I need some eyes.

hurrrsh · 2024-01-12T05:09:47Z

docs/blog/posts/look-at-it.md

+**Failure Modes**
+
+- You literally just can't service many of the unless you bake them into your index.
+- If you don't have a maps index, you can't answer most directions questions.


hurrrsh · 2024-01-12T05:11:21Z

docs/blog/posts/look-at-it.md

+
+## Eval Benefits of classifying by capability
+
+Once you've classified your questions by capability, you can also start specializing not only to generate response using specific capabilities, but also specialize your evaluation metrics. Imagine you're a construction site, and you have 'ownership' questions. If you knew that apriori, you'd know to add in the prompt that the responses of 'ownership' must result in not only the name of a person, but a full name and contact information. You can then, also validate in an eval if that criteria is met!


If you knew that a priori, you'd know how to add in the prompt
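
A minimal sketch of the eval check this hunk describes, assuming an 'ownership' answer arrives as plain text. The regular expressions and the `ownership_answer_is_complete` name are hypothetical and not part of the draft; real data would likely need stricter, locale-aware rules.

```python
import re

# Hypothetical patterns, chosen only for illustration.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{6,}\d")
# Two or more capitalized words in a row, e.g. "Dana Whitfield".
FULL_NAME_RE = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)+\b")


def ownership_answer_is_complete(answer: str) -> bool:
    """Return True if the answer has a full name and some contact detail."""
    has_full_name = FULL_NAME_RE.search(answer) is not None
    has_contact = bool(EMAIL_RE.search(answer) or PHONE_RE.search(answer))
    return has_full_name and has_contact
```

Running this only over questions already classified as 'ownership' keeps the metric specific to that capability, which is the point the paragraph makes.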

hurrrsh · 2024-01-12T05:12:17Z

docs/blog/posts/look-at-it.md

+    This is a work in progress I'm going to use bullet points to outline the main points of the article.
+
+- I've been consulting some and a common question is "How do I improve my RAG?"
+- Theres always the black box responses like 'lets add cohere' or 'lets change chunk size', but generic solutinos get you generic results.


hurrrsh · 2024-01-12T05:13:49Z

docs/blog/posts/look-at-it.md

+
+- I've been consulting some and a common question is "How do I improve my RAG?"
+- Theres always the black box responses like 'lets add cohere' or 'lets change chunk size', but generic solutinos get you generic results.
+- Instead my recommendation is to simply... look at your data.


My recommendation is to simply look at your data.

hurrrsh · 2024-01-12T05:15:13Z

docs/blog/posts/look-at-it.md

+- I've been consulting some and a common question is "How do I improve my RAG?"
+- Theres always the black box responses like 'lets add cohere' or 'lets change chunk size', but generic solutinos get you generic results.
+- Instead my recommendation is to simply... look at your data.
+- Once we look at the data we'll have plenty of information we need to itentify the best intervention strategy and also to figure out where we might want to specialize our model.


hurrrsh · 2024-01-12T05:16:37Z

docs/blog/posts/look-at-it.md

+- Instead my recommendation is to simply... look at your data.
+- Once we look at the data we'll have plenty of information we need to itentify the best intervention strategy and also to figure out where we might want to specialize our model.
+
+In this blog we'll cover a range of things that can lead us into the right direction. Go over some examples of companies that can do this kind of exploration. We'll leave it open ended as to what the interventions are, but give you the tools to drill down into your data and figure out what you need to do. For example if if google learns that a large portion of queries are looking for directions or a location, they might want to build a seperate index and release a maps product rather than expecting a HTML page to be the best response.


For example, if Google learns that a large portion of queries are looking for directions or a location... ("if" is repeated in your draft.)

hurrrsh · 2024-01-12T05:16:49Z

docs/blog/posts/look-at-it.md

+
+## What do I look for?
+
+- The first time we shuold be looking is simply looking at the questions we're asking.


*should
The first thing we should be looking at is simply examining the questions we're asking.

hurrrsh · 2024-01-12T05:21:02Z

docs/blog/posts/look-at-it.md

+
+You can imagine day one of google, they can spend tonnes of time looking at the data and trying to figure out what to do, but they can also just look at the data and see what people are searching for.
+
+we might discover that there are tonnes of search questions that look like directions that doo poorly because there are only few websites that give directinos from one place to another, so they identiy that they might want to support a maps feature, same with photos, or shopping or videos.


*do poorly
*directions

hurrrsh · 2024-01-12T05:21:47Z

docs/blog/posts/look-at-it.md

+
+we might discover that there are tonnes of search questions that look like directions that doo poorly because there are only few websites that give directinos from one place to another, so they identiy that they might want to support a maps feature, same with photos, or shopping or videos.
+
+we might also notice that for sports games and showtimes, and weather, they might want to return smaller modals rather than a complete new 'page' of results. All these decisions are likely something that could be done by inspecting the data early on in the business


hurrrsh · 2024-01-12T05:22:14Z

docs/blog/posts/look-at-it.md

+
+### Topics
+
+I see topics as the data that could be retrieved via text or semantic search. For embedding search it could be the types of text that isearched. "Privacy documents, legal documents, etc" are all topics since they can completed be generated by a searc hquery.


*is searched
*completely
*search query

hurrrsh · 2024-01-12T05:25:58Z

docs/blog/posts/look-at-it.md

+
+### Capabilities
+
+Capabilities are the things that you can do with your index itself. For example if you have a plain jane search index only over text. being able to answer comparisone questions and timeline questions are going to be capabilities that you can't do unless you bake them into your index. otherwise we'll embed to somethign strange "What happened last week" NEEDS to be embedded to a date, otherwise you're going to have a bad time. This is somethign we covered a lot [Rag is more than embedding search](./rag-and-beyond.md).


*Being able to answer comparison
*otherwise we will embed it to something strange
*This is something

hurrrsh · 2024-01-12T05:26:24Z

docs/blog/posts/look-at-it.md

+
+Capabilities are the things that you can do with your index itself. For example if you have a plain jane search index only over text. being able to answer comparisone questions and timeline questions are going to be capabilities that you can't do unless you bake them into your index. otherwise we'll embed to somethign strange "What happened last week" NEEDS to be embedded to a date, otherwise you're going to have a bad time. This is somethign we covered a lot [Rag is more than embedding search](./rag-and-beyond.md).
+
+Heres some more examples of capabilities:


Here are some more examples of capabilities
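
A small sketch of the timeline capability mentioned in this hunk: resolving a phrase such as "What happened last week" to an explicit date range before searching, rather than embedding it as text. The function name and the handful of supported phrases are assumptions made for illustration.

```python
from datetime import date, timedelta
from typing import Optional, Tuple


def resolve_relative_range(query: str, today: date) -> Optional[Tuple[date, date]]:
    """Map a few relative-time phrases to an inclusive (start, end) date range."""
    text = query.lower()
    if "yesterday" in text:
        day = today - timedelta(days=1)
        return day, day
    if "last week" in text:
        # Previous Monday-to-Sunday calendar week.
        this_monday = today - timedelta(days=today.weekday())
        return this_monday - timedelta(days=7), this_monday - timedelta(days=1)
    if "today" in text:
        return today, today
    # No recognized time phrase: the caller falls back to plain text search.
    return None
```

The resulting range could then be applied as a metadata filter alongside the text query, assuming the index stores a date for each document.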

ivanleomk · 2024-01-19T03:39:38Z

docs/blog/posts/look-at-it.md

+
+## Rerankers
+
+Every company I've worked with so far has tried something like Kohiroi Rancors and have generally liked it very much. I don't have much to say here except for the fact that if you haven't tried it already, definitely consider running these and comparing how your metrics improve.


Cohere Re-Ranker?

ivanleomk · 2024-01-19T03:43:08Z

docs/blog/posts/look-at-it.md

+
+## Summary Index
+
+In a production-ranked application, we can't guarantee that the LLM will be able to answer the question correctly every time. But what we can do is give confidence to the user that the documents that are relevant are being shown. An easy way of doing that is to run some kind of summarization like [chain of density](./chain-of-density.md), embedding that summary, and doing retrieval of documents first, and then putting them into context rather than using chunks themselves.


Is this what you were trying to say? I wasn't sure whether the summary is generated for the user or for retrieval.

We can increase the quality of our LLM responses by generating succinct summaries of our sources using methods such as chain of density and then providing these to the user to verify. This has the added benefit of increasing the quality of the context that the LLM has access to.
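
One possible reading of the hunk, sketched under loud assumptions: summaries are what gets indexed, and the matching whole documents (not chunks) are what goes into the context. The `Document` class and function names are hypothetical, and the Jaccard word overlap is only a toy stand-in for embedding similarity.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Document:
    doc_id: str
    text: str     # full document, placed into the LLM context
    summary: str  # short summary, the only field that is searched


def tokens(text: str) -> Set[str]:
    return set(text.lower().split())


def similarity(a: str, b: str) -> float:
    """Jaccard word overlap; a toy stand-in for embedding cosine similarity."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def retrieve_documents(query: str, docs: List[Document], k: int = 2) -> List[Document]:
    """Rank whole documents by how well their summary matches the query."""
    ranked = sorted(docs, key=lambda d: similarity(query, d.summary), reverse=True)
    return ranked[:k]


def build_context(query: str, docs: List[Document], k: int = 2) -> str:
    """Join the full text of the top-k documents for use as LLM context."""
    return "\n\n".join(d.text for d in retrieve_documents(query, docs, k))
```

If the intent was instead to show summaries to the user for verification, the same `retrieve_documents` result could be displayed via `summary` rather than `text`.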

ivanleomk · 2024-01-19T03:46:24Z

docs/blog/posts/look-at-it.md

+
+## Specific Advice
+
+This part becomes much more challenging because these issues are not going to be solved by simply playing around with re-rankers and chunk size. It requires examining your data and applying a range of data science techniques to gain a deep understanding of the data. This will enable us to make informed decisions on how to construct new indices and develop specific search engines that cater to your customers' needs.


To cater to your customers' needs, we need to look beyond playing with re-rankers and chunk sizes. Instead, we need to look deeper and examine the type of queries that we're receiving so that we can identify the most effective interventions.

In this blog, we'll show you a quick framework and some tools that you can use to identify these issues and ultimately fix them.

jxnl added 3 commits January 11, 2024 23:35

bump

58f04c3

add failure modes

4e59f20

bump

2f6678f

hurrrsh reviewed Jan 12, 2024

clean up

f7fba61

ivanleomk reviewed Jan 19, 2024

bump

b298f3e
