Bug: tool_calls is sometimes undefined #20

Open
Ademsk1 opened this issue May 1, 2024 · 0 comments
Ademsk1 commented May 1, 2024

When attempting to access content that might be blocked, I'd like to handle the failure safely. When I do so, however, I come across the following error, which crashes my server:

...ab/src/node_modules/llm-scraper/dist/models.js:41
  const c = completion.choices[0].message.tool_calls[0].function.arguments;
                                                    ^
TypeError: Cannot read properties of undefined (reading '0')
    at generateOpenAICompletions

Digging into completion.choices in the response, we see something like:

[
  {
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "The content you provided shows that access to the requested webpage has been blocked due to security measures implemented by Cloudflare, likely triggered by specific actions or commands deemed suspicious. This type of response is commonly served when automated systems (like web scrapers) or aggressive browsing behaviors are detected. There is no job-related content or other typical webpage elements displayed in the provided HTML. Instead, it provides information about why the access was denied, suggesting methods to resolve the issue such as contacting the site owner."
    },
    "logprobs": null,
    "finish_reason": "stop"
  }
]
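Since the model answered with plain text instead of a tool call, message.tool_calls is undefined at the failing line. A defensive sketch of how that access could be guarded (a hypothetical helper, not llm-scraper's actual code) would fall back to message.content:

```javascript
// Hypothetical guard (sketch, not the library's actual code): read the tool
// call arguments if present; otherwise surface the assistant's text reply
// (e.g. a Cloudflare block notice) instead of crashing with a TypeError.
function extractToolArguments(completion) {
  const message = completion?.choices?.[0]?.message
  const toolCall = message?.tool_calls?.[0]
  if (!toolCall) {
    // No tool call was made — return the text content as an error instead.
    return { error: message?.content ?? 'No message returned by the model' }
  }
  return JSON.parse(toolCall.function.arguments)
}
```

With a guard like this, a blocked page yields an error object the caller can inspect, rather than an uncaught exception.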

My schema description contains this at the end:

If the content is inaccessible, e.g. behind a paywall, or has been blocked, the scraper will describe the error in the error field, and the appropriate status code (e.g. 401: Unauthorized, or 403: Forbidden).

Could my schema be affecting the completion content?
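For context, a sketch of what such a schema might look like, assuming zod (which llm-scraper accepts) — the field names here are illustrative, not the actual schema from my project:

```javascript
// Hypothetical schema sketch (field names are illustrative): optional
// error/statusCode fields give the model a place to report blocked or
// paywalled pages instead of forcing it to invent page data.
const { z } = require('zod')

const schema = z.object({
  jobs: z
    .array(z.object({ title: z.string(), company: z.string() }))
    .optional(),
  error: z
    .string()
    .optional()
    .describe('If the content is inaccessible, describe the error here'),
  statusCode: z
    .number()
    .optional()
    .describe('Appropriate HTTP status code, e.g. 401 or 403'),
})
```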
Here's also the code that I use; wrapping it in try/catch doesn't seem to do much.

try {
    const openai = initialise()
    const browser = await chromium.launch()
    const scraper = new LLMScraper(browser, openai)
    const pages = await scraper.run(url, {
      model: "gpt-4-turbo",
      schema,
      mode: "html",
      closeOnFinish: true,
    })
    const stream = []
    for await (const page of pages) {
      stream.push(page)
    }
    console.log(stream[0].data)
    return stream[0].data
} catch (error) {
    console.error('Scrape failed:', error)
    return null
}