Backslashes being added by the agent to the tool call #560

Closed
mdkberry opened this issue May 4, 2024 · 1 comment

mdkberry commented May 4, 2024

I can't figure out the cause, and it is intermittent: it works maybe one run in ten.

I am in a conda environment, using VSCode with Python 3.10.14.
I am using the code shown here, with very little changed: https://medium.com/@foadmk/optimizing-everyday-tasks-with-crewai-fc655ca08944

The error I get during a run shows the defined Action "fetch_pdf_content" being called incorrectly, with backslashes added before the underscores. As I said, it is intermittent: some runs it happens and some it doesn't, but mostly it does.

The error message:

 [DEBUG]: == Working Agent: PDF Content Extractor
 [INFO]: == Starting Task: Read and preprocess the text from the PDF at this URL: https://scholarworks.calstate.edu/downloads/2j62s9350

> Entering new CrewAgentExecutor chain...
 I need to extract and preprocess the text from the PDF at the given URL using the provided tool.

Action Input: {'url': 'https://scholarworks.calstate.edu/downloads/2j62s9350'}

(Waiting for the observation of the `fetch_pdf_content` action) 

Action 'fetch\_pdf\_content' don't exist, these are the only available Actions: fetch_pdf_content: fetch_pdf_content(url: str) -> str - Fetches and preprocesses content from a PDF given its URL.
Returns the text of the PDF.

The code for the agent:

# PDF Reader Agent
pdf_reader = Agent(
    role='PDF Content Extractor',
    goal="""Extract and preprocess text from a PDF""",
    backstory="""Specializes in handling and interpreting PDF documents""",
    verbose=True,
    tools=[fetch_pdf_content],
    allow_delegation=False,
    # llm=model
    llm=llm
)

The code for fetch_pdf_content is below, but execution doesn't get that far when the agent decides to include the backslashes:

# Tool to fetch and preprocess PDF content from the internet URL
@tool
def fetch_pdf_content(url: str) -> str:
    """
    Fetches and preprocesses content from a PDF given its URL.
    Returns the text of the PDF.
    """
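    response = requests.get(url, timeout=30)
    response.raise_for_status()  # fail early on HTTP errors instead of parsing an error page
    with open('temp.pdf', 'wb') as f:
        f.write(response.content)

    with open('temp.pdf', 'rb') as f:
        pdf = PdfReader(f)
        # Extract each page once and skip pages with no extractable text
        pages = (page.extract_text() for page in pdf.pages)
        text = '\n'.join(p for p in pages if p)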

    # Optional preprocessing of text
    processed_text = re.sub(r'\s+', ' ', text).strip()
    return processed_text
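
The preprocessing step at the end of the tool can be tried on its own, without downloading anything. This is only a minimal sketch: the helper name `collapse_whitespace` and the sample string are made up for illustration and are not part of the original code.

```python
import re


def collapse_whitespace(text: str) -> str:
    """Replace every run of whitespace (spaces, tabs, newlines) with one space."""
    return re.sub(r'\s+', ' ', text).strip()


# Made-up sample resembling text extracted from a PDF page.
sample = "  Page one\n\nline two\t\tend  "
cleaned = collapse_whitespace(sample)
print(cleaned)
```

Running the regex separately like this makes it easy to confirm that the tool body itself is not where the backslashes come from.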

I have stepped through the code in a debugger and can't catch where this happens; as far as I can see from the debug output, the name is being passed correctly. If anyone has an idea of what is going on, I would appreciate hearing it. As far as I can tell, the problem is with the Agents at some level. (I am using a local Mistral model with Ollama, if that makes a difference, but I have used it in a number of other projects without this problem.)
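
For anyone hitting the same error, the mismatch in the log can be reproduced without CrewAI. The sketch below is not CrewAI's actual lookup code: `TOOLS`, `find_tool`, and `unescape_markdown` are hypothetical names, and it assumes the action name is matched against the registered tool names by exact string comparison, which is what the error message suggests.

```python
import re

# Hypothetical registry standing in for the agent's tool list.
TOOLS = {"fetch_pdf_content": lambda url: f"(text of {url})"}


def unescape_markdown(name: str) -> str:
    """Drop backslashes that precede Markdown punctuation, such as an underscore."""
    return re.sub(r'\\([_*`\[\]()#+\-.!])', r'\1', name)


def find_tool(name: str, normalize: bool = False):
    """Look up a tool by exact name, optionally unescaping the name first."""
    key = unescape_markdown(name.strip()) if normalize else name
    return TOOLS.get(key)


escaped = r"fetch\_pdf\_content"  # the action name as it appears in the log

# Exact comparison fails because of the backslashes.
print(find_tool(escaped))

# After unescaping, the same name resolves to the registered tool.
print(find_tool(escaped, normalize=True) is not None)
```

This only illustrates why the name in the log does not match; whether the framework offers a hook for normalizing action names is not something this thread establishes.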


mdkberry commented May 5, 2024

Sharing my findings. It seems the model (I was using a local Ollama setup with Mistral 7B, adjusted with the recommended Modelfile) was intermittently looking for the wrong function name, adding backslashes before the underscores. I changed the function name to "fetchpdfcontent", which helped, but after that it would go off and do other things instead. After I constrained it further (see below), it behaved itself. This explains why I could not see the problem when stepping through the code in the debugger: the code wasn't the problem, the lack of constraints on the model was.
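
The rename described in the paragraph above can be applied mechanically to other tool names. This is a small sketch, assuming that the characters the model tends to escape are the usual Markdown ones; `markdown_safe_name` is a hypothetical helper, not part of CrewAI or LangChain.

```python
import re


def markdown_safe_name(name: str) -> str:
    """Remove characters that a model might backslash-escape as Markdown."""
    return re.sub(r'[_*`~]', '', name)


print(markdown_safe_name("fetch_pdf_content"))  # fetchpdfcontent
```

The trade-off is readability: names without underscores are harder to read, so this is a workaround for models that escape them rather than a general naming recommendation.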

Since then it has started doing other things, such as deciding to sit and think instead of continuing with the process, but it reported this in the action output on screen, so I was able to spot it.

Here is my current change, which will no doubt need further additions before it works every time. I changed the goal and backstory of the pdf_reader Agent to try to constrain it further:

    role='PDF Content Extractor',
    goal="""Extract text from a PDF. Important: Only perform the action you have been asked to perform. Once you've found the information, immediately stop searching for additional information.""",
    backstory="""As a PDF Content Extractor you specialize in handling text extracted from PDF documents.""",

mdkberry closed this as completed May 5, 2024