ThinkAi

ThinkAi is an LLM app with Retrieval Augmented Generation (RAG) that talks Philosophy built using InstructGPT embeddings, Chroma's Vector Search, LangChain tokenizers for text chunking, Meta's bart-large-cnn model for summarizing, and OpenAI's gpt-3.5-turbo model for structuring the final response. This is wrapped with a NextJS web app hosted completely on AWS (AWS Amplify, AWS Elastic Beanstalk, and AWS EC2), code here - ThinkAi UI

LLMs are trained on the internet making it hard to know if the generated response comes from a reliable source or even if it is a product of its hallucination. RAG helps with adding external knowledge forcing the model to generate a response using this context. It enables more factual consistency, improves reliability of the generated responses, and helps to mitigate the problem of hallucination. This is exactly what ThinkAI aims to achieve.

Performance evaluation of ThinkAI with ChatGPT here: ThinkAI v. ChatGPT

System Architecture:

Basic User Flow:

Here is how ThinkAi processes each user query;

User queries the web client.
Chroma DB encodes this query using embeddings.
The closest documents to this query are pulled using Chroma's vector search.
The summaries for these documents are pulled from the preprocessed JSON file and concatenated serving as context to the prompt.
API call prompting gpt-3.5-turbo and the response is sent back to the user.

Code for all the preprocessing step - COMING SOON!

Preprocessing: Data Collection

All the data has been collected from articles published and owned by Stanford Encyclopedia of Philosphy. Using beautifulsoup, all scraped pages were dumped into files which were then cleaned removing noise like html tags, media, etc. and structured into JSON files.

Preprocessing: Create Embeddings

Using Chroma DB, embeddings are created on the entire text of each article with an index on the URL of each article. For a given query, Chroma creates embeddings and then using Vector search, Chroma pulls out the closest top 3 articles for the given query.

Preprocessing: Summarize each article

There are a lot of summarization models, however, the max token size is at 1024 tokens (~800 words) for summarizing any text. This does not fit the use case since the text here is at least a couple of pages long. One way to work through this is to chunk text, summarize individually, and combine again. This may result in loss of continuity in the flow of the text, to overcome that, we add few tokens overlap in each chunk so we stay with the flow. This has proven to extract the main context of the article. Each JSON file is broken into chunks of 1000 tokens using LangChain's Text Splitters. These chunks are individually summarized using Meta's BART model published on HuggingFace. Finally, these individual summaries are combined together by simple concatenation.

All the generated summaries for each article is stored in a JSON file for fast access.

Name		Name	Last commit message	Last commit date
Latest commit History 18 Commits
assets		assets
docs		docs
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
constants.py		constants.py
data.zip		data.zip
get_docs.py		get_docs.py
get_nearest_links.py		get_nearest_links.py
get_response.py		get_response.py
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

assets

assets

docs

docs

.gitignore

.gitignore

LICENSE

LICENSE

README.md

README.md

constants.py

constants.py

data.zip

data.zip

get_docs.py

get_docs.py

get_nearest_links.py

get_nearest_links.py

get_response.py

get_response.py

requirements.txt

requirements.txt

Repository files navigation

ThinkAi

System Architecture:

Contents

Basic User Flow:

Preprocessing: Data Collection

Preprocessing: Create Embeddings

Preprocessing: Summarize each article

About

Releases

Packages

Languages

License

maanvithag/thinkai

Folders and files

Latest commit

History

Repository files navigation

ThinkAi

System Architecture:

Contents

Basic User Flow:

Preprocessing: Data Collection

Preprocessing: Create Embeddings

Preprocessing: Summarize each article

About

Topics

Resources

License

Stars

Watchers

Forks

Languages