| title | description |
|---|---|
| Basics | Phidata is a framework for building AI Assistants with memory, knowledge and tools. |
- Memory: Stores chat history, summaries, facts in a database.
- Knowledge: Stores business context in a vector database.
- Tools: Let LLMs take actions like calling APIs, sending emails or querying a database.
Let's work through a few examples.
Open the `Terminal` and create a Python virtual environment.
<CodeGroup>
```bash Mac
python3 -m venv ~/.venvs/aienv
source ~/.venvs/aienv/bin/activate
```
```bash Windows
python3 -m venv aienv
aienv\Scripts\activate
```
</CodeGroup>
Install phidata using pip:
<CodeGroup>
```bash Mac
pip install -U phidata
```
```bash Windows
pip install -U phidata
```
</CodeGroup>
Tools are functions that an Assistant can run to achieve tasks like searching the web, running SQL, sending an email, calling APIs.
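To make the idea concrete, here is a minimal, framework-free sketch of what "a function the Assistant can run" means. Plain Python functions are registered by name, and a tool call (written by hand here as a dictionary) is dispatched to the matching function. The names `TOOLS`, `register_tool`, `run_tool_call` and `get_word_count` are hypothetical and not part of phidata; this illustrates the concept only, not how phidata implements it.

```python
from typing import Callable, Dict

# Hypothetical registry mapping tool names to plain Python functions.
TOOLS: Dict[str, Callable[..., str]] = {}


def register_tool(func: Callable[..., str]) -> Callable[..., str]:
    """Register a function under its own name so it can be called by name."""
    TOOLS[func.__name__] = func
    return func


@register_tool
def get_word_count(text: str) -> str:
    """Toy tool: count the words in a piece of text."""
    return str(len(text.split()))


def run_tool_call(call: dict) -> str:
    """Dispatch a tool call of the form {"name": ..., "arguments": {...}}."""
    func = TOOLS.get(call["name"])
    if func is None:
        return f"Unknown tool: {call['name']}"
    return func(**call["arguments"])


# A tool call as a model might request it (hand-written here for illustration).
result = run_tool_call(
    {"name": "get_word_count", "arguments": {"text": "Latest developments in AI"}}
)
print(result)
```

In a real Assistant the model chooses the tool name and arguments; the dispatch step is the part this sketch tries to show.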
Let's create an Assistant that prepares a research report by searching the web.
Create a file `research.py`:
```python research.py
from phi.assistant import Assistant
from phi.tools.duckduckgo import DuckDuckGo
from phi.tools.newspaper4k import Newspaper4k

assistant = Assistant(
    tools=[DuckDuckGo(), Newspaper4k()],
    show_tool_calls=True,
    description="You are a senior NYT researcher writing an article on a topic.",
    instructions=[
        "For the provided topic, search for the top 3 links.",
        "Then read each URL and extract the article text, if a URL isn't available, ignore and let it be.",
        "Analyse and prepare an NYT worthy article based on the information.",
    ],
    add_datetime_to_instructions=True,
)
assistant.print_response("Latest developments in AI", markdown=True)
```
Set your OpenAI API key:
<CodeGroup>
```bash Mac
export OPENAI_API_KEY=sk-***
```
```bash Windows
setx OPENAI_API_KEY sk-***
```
</CodeGroup>
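Before running the Assistant, it can help to confirm that the key is visible to Python. The following is a small sketch using only the standard library; the helper name `api_key_is_set` is hypothetical.

```python
import os
from typing import Mapping


def api_key_is_set(env: Mapping[str, str] = os.environ) -> bool:
    """Return True if OPENAI_API_KEY is present and non-empty."""
    return bool(env.get("OPENAI_API_KEY", "").strip())


if not api_key_is_set():
    print("OPENAI_API_KEY is not set; set it before running the Assistant.")
```

On Windows, `setx` only affects terminals opened after it runs, so open a new terminal before checking.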
Install libraries
```shell
pip install openai duckduckgo-search newspaper4k lxml_html_clean
```
Run the Assistant
```shell
python research.py
```
To provide LLMs with business context, we store information in a vector database that the Assistant can search to improve its responses. Let's use PgVector as our vector store as it can also provide storage for our Assistants.
Install Docker Desktop and run PgVector on port 5532 using:
```bash
docker run -d \
  -e POSTGRES_DB=ai \
  -e POSTGRES_USER=ai \
  -e POSTGRES_PASSWORD=ai \
  -e PGDATA=/var/lib/postgresql/data/pgdata \
  -v pgvolume:/var/lib/postgresql/data \
  -p 5532:5432 \
  --name pgvector \
  phidata/pgvector:16
```
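The values in this command determine the connection URL used in the examples that follow. As a small sketch, the URL can be assembled from them. The helper `build_db_url` is hypothetical; the `postgresql+psycopg` prefix is the SQLAlchemy dialect and driver form that appears later in this guide.

```python
def build_db_url(user: str, password: str, host: str, port: int, database: str) -> str:
    """Assemble a SQLAlchemy-style URL for the psycopg driver."""
    return f"postgresql+psycopg://{user}:{password}@{host}:{port}/{database}"


# Values taken from the docker command: POSTGRES_USER, POSTGRES_PASSWORD,
# the host side of `-p 5532:5432`, and POSTGRES_DB.
db_url = build_db_url(user="ai", password="ai", host="localhost", port=5532, database="ai")
print(db_url)  # postgresql+psycopg://ai:ai@localhost:5532/ai
```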
Retrieval Augmented Generation (RAG) means "stuffing the prompt with relevant information" to improve LLM responses. This is a two-step process:
- Retrieve relevant information from a vector database.
- Augment the prompt to provide context to the LLM.
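
The two steps above can be sketched without any framework. In this toy version, word overlap stands in for vector similarity search, and the documents are made-up snippets used only for illustration. The names `DOCUMENTS`, `tokenize`, `retrieve` and `augment` are hypothetical and are not phidata APIs.

```python
import re
from typing import List, Set

# Tiny in-memory "knowledge base" (made-up snippets for illustration only).
DOCUMENTS = [
    "Pad thai is made with rice noodles, tamarind, and peanuts.",
    "Green curry uses coconut milk and green chilies.",
    "Mango sticky rice is a dessert made with glutinous rice.",
]


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(query: str, documents: List[str], top_k: int = 1) -> List[str]:
    """Step 1: rank documents by word overlap with the query (stand-in for vector search)."""
    query_words = tokenize(query)
    ranked = sorted(
        documents, key=lambda doc: len(query_words & tokenize(doc)), reverse=True
    )
    return ranked[:top_k]


def augment(query: str, references: List[str]) -> str:
    """Step 2: place the retrieved references in the prompt ahead of the question."""
    context = "\n".join(f"- {ref}" for ref in references)
    return f"Use the following references to answer.\n{context}\n\nQuestion: {query}"


question = "How do I make pad thai?"
references = retrieve(question, DOCUMENTS)
prompt = augment(question, references)
print(prompt)
```

A vector database replaces the word-overlap ranking with similarity between embeddings, but the retrieve-then-augment flow stays the same.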
Let's build a PDF Assistant that helps us with food recipes using RAG.
Install the required libraries using pip:
<CodeGroup>
```bash Mac
pip install -U pgvector pypdf "psycopg[binary]" sqlalchemy
```
```bash Windows
pip install -U pgvector pypdf "psycopg[binary]" sqlalchemy
```
</CodeGroup>
```python rag_assistant.py
from phi.assistant import Assistant
from phi.knowledge.pdf import PDFUrlKnowledgeBase
from phi.vectordb.pgvector import PgVector2

knowledge_base = PDFUrlKnowledgeBase(
    # Read PDF from this URL
    urls=["https://phi-public.s3.amazonaws.com/recipes/ThaiRecipes.pdf"],
    # Store embeddings in the `ai.recipes` table
    vector_db=PgVector2(
        collection="recipes",
        db_url="postgresql+psycopg://ai:ai@localhost:5532/ai",
    ),
)
# Load the knowledge base
knowledge_base.load(recreate=False)

assistant = Assistant(
    knowledge_base=knowledge_base,
    # The add_references_to_prompt will update the prompt with references from the knowledge base.
    add_references_to_prompt=True,
)
assistant.print_response("How do I make pad thai?", markdown=True)
```
<CodeGroup>
```bash Mac
python rag_assistant.py
```
```bash Windows
python rag_assistant.py
```
</CodeGroup>
<br />
<Note>
The Assistant uses the `OpenAI` LLM and Embeddings by default.
</Note>
To use local PDFs instead, use the `PDFKnowledgeBase`:
```python
from phi.knowledge.pdf import PDFKnowledgeBase
...
knowledge_base = PDFKnowledgeBase(
    path="data/pdfs",
    vector_db=PgVector2(
        collection="pdf_documents",
        db_url=vector_db.get_db_connection_local(),
    ),
)
...
```
With the RAG assistant above, `add_references_to_prompt=True` always adds information from the knowledge base to the prompt, regardless of whether it is relevant to the question.
With Autonomous assistants, we let the LLM decide if it needs to access the knowledge base and what search parameters it needs to query the knowledge base.
Make the Assistant autonomous by setting the `search_knowledge` and `read_chat_history` flags, giving it tools to search the knowledge base and chat history on demand.
```python auto_assistant.py
from phi.assistant import Assistant
from phi.knowledge.pdf import PDFUrlKnowledgeBase
from phi.vectordb.pgvector import PgVector2

knowledge_base = PDFUrlKnowledgeBase(
    urls=["https://phi-public.s3.amazonaws.com/recipes/ThaiRecipes.pdf"],
    vector_db=PgVector2(
        collection="recipes",
        db_url="postgresql+psycopg://ai:ai@localhost:5532/ai",
    ),
)
# Comment out as the knowledge base is already loaded.
# knowledge_base.load(recreate=False)

assistant = Assistant(
    knowledge_base=knowledge_base,
    # Show tool calls in the response
    show_tool_calls=True,
    # Enable the assistant to search the knowledge base
    search_knowledge=True,
    # Enable the assistant to read the chat history
    read_chat_history=True,
)
assistant.print_response("How do I make pad thai?", markdown=True)
assistant.print_response("What was my last question?", markdown=True)
```
<CodeGroup>
```bash Mac
python auto_assistant.py
```
```bash Windows
python auto_assistant.py
```
</CodeGroup>
> Notice how it searches the knowledge base and chat history when needed
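
The difference between always adding references and searching on demand can be sketched as plain control flow. In a real Assistant the LLM itself decides whether to call the search tool; here a keyword rule (`toy_decision`) stands in for that decision. All names in this block are hypothetical and are not phidata APIs.

```python
from typing import Callable, List

search_calls: List[str] = []  # records every knowledge base lookup


def search_knowledge_base(query: str) -> str:
    """Stand-in for a vector database search."""
    search_calls.append(query)
    return "Reference: pad thai uses rice noodles."


def build_prompt_always(question: str) -> str:
    """Always-on mode: every question triggers a search and gets references added."""
    return f"{search_knowledge_base(question)}\n\nQuestion: {question}"


def build_prompt_on_demand(question: str, wants_search: Callable[[str], bool]) -> str:
    """On-demand mode: search only when the decision function asks for it."""
    if wants_search(question):
        return f"{search_knowledge_base(question)}\n\nQuestion: {question}"
    return f"Question: {question}"


def toy_decision(question: str) -> bool:
    """Hypothetical rule standing in for the LLM's own choice to call a tool."""
    return "pad thai" in question.lower()


# Always-on mode searches even when the question is about the chat itself.
build_prompt_always("What was my last question?")
# On-demand mode skips the search here...
build_prompt_on_demand("What was my last question?", toy_decision)
# ...and performs it when the question needs recipe knowledge.
build_prompt_on_demand("How do I make pad thai?", toy_decision)
print(search_calls)
```

Only two of the three calls reach the knowledge base: the always-on call and the on-demand call about pad thai.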
Assistants come with built-in memory, but it only lasts while the session is active. To continue conversations across sessions, we can store chat history in a database like PostgreSQL.
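As a rough illustration of the idea, the sketch below stores messages under a run ID and later looks up the most recent run for a user so the conversation can resume. It uses the standard library's `sqlite3` as a stand-in for PostgreSQL, and the `ChatStore` class and its methods are hypothetical; this is not how phidata's storage is implemented.

```python
import sqlite3
import uuid
from typing import List, Optional


class ChatStore:
    """Toy chat-history store keyed by run ID (illustration only)."""

    def __init__(self, path: str = ":memory:") -> None:
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS messages ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "run_id TEXT, user_id TEXT, role TEXT, content TEXT)"
        )

    def add_message(self, run_id: str, user_id: str, role: str, content: str) -> None:
        """Append one message to a run."""
        self.conn.execute(
            "INSERT INTO messages (run_id, user_id, role, content) VALUES (?, ?, ?, ?)",
            (run_id, user_id, role, content),
        )
        self.conn.commit()

    def latest_run_id(self, user_id: str) -> Optional[str]:
        """Return the run ID of the user's most recent message, if any."""
        row = self.conn.execute(
            "SELECT run_id FROM messages WHERE user_id = ? ORDER BY id DESC LIMIT 1",
            (user_id,),
        ).fetchone()
        return row[0] if row else None

    def history(self, run_id: str) -> List[str]:
        """Return the message contents of a run in insertion order."""
        rows = self.conn.execute(
            "SELECT content FROM messages WHERE run_id = ? ORDER BY id", (run_id,)
        ).fetchall()
        return [content for (content,) in rows]


store = ChatStore()  # pass a file path instead of ":memory:" to survive restarts

# First session: start a new run and save a message.
run_id = str(uuid.uuid4())
store.add_message(run_id, "user", "user", "How do I make pad thai?")

# Later session: look up the user's most recent run and continue it.
resumed_run_id = store.latest_run_id("user")
print(store.history(resumed_run_id))
```

With the in-memory database the "later session" is only simulated within one process; a file-backed or server database is what makes the history outlive the program.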
Let's wrap up by building an Assistant that can read PDFs, store them in a knowledge base and continue conversations across sessions by storing chat history in a database.
Create a file `pdf_assistant.py`:
```python pdf_assistant.py
import typer
from rich.prompt import Prompt
from typing import Optional, List

from phi.assistant import Assistant
from phi.storage.assistant.postgres import PgAssistantStorage
from phi.knowledge.pdf import PDFUrlKnowledgeBase
from phi.vectordb.pgvector import PgVector2

db_url = "postgresql+psycopg://ai:ai@localhost:5532/ai"

knowledge_base = PDFUrlKnowledgeBase(
    urls=["https://phi-public.s3.amazonaws.com/recipes/ThaiRecipes.pdf"],
    vector_db=PgVector2(collection="recipes", db_url=db_url),
)
# Comment out after first run
knowledge_base.load()

storage = PgAssistantStorage(table_name="pdf_assistant", db_url=db_url)


def pdf_assistant(new: bool = False, user: str = "user"):
    run_id: Optional[str] = None

    if not new:
        existing_run_ids: List[str] = storage.get_all_run_ids(user)
        if len(existing_run_ids) > 0:
            run_id = existing_run_ids[0]

    assistant = Assistant(
        run_id=run_id,
        user_id=user,
        knowledge_base=knowledge_base,
        storage=storage,
        # Show tool calls in the response
        show_tool_calls=True,
        # Enable the assistant to search the knowledge base
        search_knowledge=True,
        # Enable the assistant to read the chat history
        read_chat_history=True,
    )
    if run_id is None:
        run_id = assistant.run_id
        print(f"Started Run: {run_id}\n")
    else:
        print(f"Continuing Run: {run_id}\n")

    # Runs the assistant as a cli app
    assistant.cli_app(markdown=True)


if __name__ == "__main__":
    typer.run(pdf_assistant)
```
Run the Assistant
<CodeGroup>
```bash Mac
python pdf_assistant.py
```
```bash Windows
python pdf_assistant.py
```
</CodeGroup>
Ask Questions
Now the conversation continues across sessions. Ask a question:
```
How do I make pad thai?
```
Then message `bye` to exit, start the app again and ask:
```
What was my last message?
```
Run the `pdf_assistant.py` file with the `--new` flag to start a new run.
<CodeGroup>
```bash Mac
python pdf_assistant.py --new
```
```bash Windows
python pdf_assistant.py --new
```
</CodeGroup>
Congratulations on building an AI Assistant with memory, knowledge and tools. You can serve this Assistant using FastAPI, Streamlit or Django. Here's the architecture we just built:
Continue your journey by learning about:
Or check out the following examples:
If you need help, reach out to us on Discord.