feat: http(s) support for file readers #754
    fs: GenericFileSystem = defaultFS,
  ): Promise<Document[]> {
    const dataBuffer = await fs.readRawFile(file);
    const blob = new Blob([dataBuffer]);
-   return [new ImageDocument({ image: blob, id_: file })];
+   return [new ImageDocument({ image: blob, id_: `${file}` })];
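The switch to a template literal for id_ presumably matters once file can be a URL object as well as a path string, since id_ has to be a plain string. A minimal sketch of that coercion, assuming file is typed string | URL (the parameter type is not shown in the excerpt, and toDocumentId is a hypothetical helper, not part of the PR):

```typescript
// Assumption: `file` may be either a path string or a URL object.
// `toDocumentId` is a hypothetical name used only for illustration.
function toDocumentId(file: string | URL): string {
  // A template literal calls toString() on its operand;
  // for a URL object that returns the serialized href.
  return `${file}`;
}

const fromPath = toDocumentId("./images/cat.png");
const fromHttp = toDocumentId(new URL("https://example.com/cat.png"));
const fromFile = toDocumentId(new URL("file:/path/to/file"));

console.log(fromPath); // ./images/cat.png
console.log(fromHttp); // https://example.com/cat.png
console.log(fromFile); // file:///path/to/file
```

Note that the file: URL is normalized during parsing, so the resulting id is the serialized form rather than the exact text that was passed in.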
If file is a URL, you can also use
[new ImageDocument({ image: file, id_: `${file}` })];
because image can be a URL. (I guess this only makes sense for URLs with an http(s):// prefix, as we might send the image's URL to an LLM.)
The file system can also read URLs that use the file: scheme, for example:
fs.readFile(new URL('file:/path/to/file'))
So passing a file: URL or a non-public URL to an LLM could cause an error.
Yes, you're right. I forgot that an http(s):// URL can still be non-public, and we can't find out whether it is public or not, so it's better to use Blob as a general case.
It might be worth adding a dedicated function for reading a public URL, though.
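One possible shape for such a dedicated function is sketched below. It is only an illustration under stated assumptions: readImageFromUrl, toHttpUrl, and the FetchLike type are hypothetical names that do not come from the PR, and the sketch relies on the standard fetch API being available globally. It rejects anything that is not http(s) and returns a Blob, so the caller still decides whether the URL itself is safe to forward to an LLM.

```typescript
// Hypothetical type: the minimal subset of fetch this sketch relies on.
type FetchLike = (
  url: string,
) => Promise<{ ok: boolean; status: number; blob(): Promise<Blob> }>;

// Hypothetical helper: accept only http(s) URLs, reject file: and others.
function toHttpUrl(input: string | URL): URL {
  const url = new URL(input); // throws TypeError on unparseable strings
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    throw new TypeError(`Unsupported protocol: ${url.protocol}`);
  }
  return url;
}

// Hypothetical function: download the resource and return it as a Blob.
// `fetchImpl` is injectable so the function can be exercised offline.
async function readImageFromUrl(
  input: string | URL,
  fetchImpl: FetchLike = fetch,
): Promise<Blob> {
  const url = toHttpUrl(input);
  const response = await fetchImpl(url.href);
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.blob();
}
```

A successful download only shows that the URL is reachable from the machine running the code. It does not prove that the URL is public, which is the limitation raised above.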
So there are many cases:
1. string
  1.1 base64 ✅
  1.2 http(s) ✅
  1.3 others
2. URL
  2.1 image URL
    2.1.1 nonpublic
    2.1.2 public ✅
  2.2 non-image URL
3. blob
  3.1 image blob (png/jpg...) ✅
  3.2 non-image blob (docx, pdf...)
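The case list above could be turned into a small classifier, sketched here under several assumptions. ImageInputKind and classifyImageInput are hypothetical names, not part of the PR. "base64" is assumed to mean a data: URL string, and an image blob is assumed to be one whose MIME type starts with image/. Whether a URL is public or points at an image cannot be decided from the URL alone, so the sketch only separates URLs by scheme.

```typescript
// Hypothetical labels mirroring the three top-level cases in the list.
type ImageInputKind =
  | "string:base64"
  | "string:http"
  | "string:other"
  | "url:http"
  | "url:other"
  | "blob:image"
  | "blob:other";

// Hypothetical classifier; the detection rules are assumptions.
function classifyImageInput(input: string | URL | Blob): ImageInputKind {
  if (typeof input === "string") {
    // Assumption: base64 images arrive as data: URLs.
    if (input.startsWith("data:")) return "string:base64";
    if (/^https?:\/\//i.test(input)) return "string:http";
    return "string:other";
  }
  if (input instanceof URL) {
    const isHttp = input.protocol === "http:" || input.protocol === "https:";
    return isHttp ? "url:http" : "url:other";
  }
  // Remaining case is a Blob; an untyped Blob has an empty `type`.
  return input.type.startsWith("image/") ? "blob:image" : "blob:other";
}
```

A Blob built as new Blob([dataBuffer]), as in the diff, has an empty type and would land in "blob:other" here, so telling 3.1 from 3.2 would need the MIME type to be set or inferred some other way.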
Fixes: #495