enhancement: non-english youtube rag #1960

Closed
2 of 4 tasks
atassis opened this issue May 4, 2024 · 4 comments · Fixed by #1965


atassis commented May 4, 2024

Bug Report

Description

Bug Summary:
If you use RAG on a YouTube video that has no English transcript but does have a transcript in another language, the request fails with an error from youtube-transcript-api.

Steps to Reproduce:
Execute the following prompt: #https://www.youtube.com/watch?v=FuRem6-sTmQ

Expected Behavior:
Prefetch the list of transcript languages available for the video, using the same package:

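from langchain_community.document_loaders import YoutubeLoader
from youtube_transcript_api import YouTubeTranscriptApi

video_url = "https://www.youtube.com/watch?v=FuRem6-sTmQ"  # video from the steps above

# List every transcript available for the video, then read the language codes.
transcript_list = YouTubeTranscriptApi.list_transcripts(YoutubeLoader.extract_video_id(video_url))
languages = [transcript.language_code for transcript in transcript_list]
print(languages[0])  # code of the first available transcript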

Actual Behavior:
A toast error appears after several seconds.

Environment

  • Open WebUI Version: 0.1.123

  • Ollama (if applicable): 0.1.32

  • Operating System: Fedora 39 (both client and server, different machines)

  • Browser (if applicable): Chrome (version doesn't matter)

Reproduction Details

Confirmation:

  • I have read and followed all the instructions provided in the README.md.
  • I am on the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.

Logs and Screenshots

Browser Console Logs:
I can provide these if needed; they seem irrelevant.

Docker Container Logs:
I can provide these if needed; they seem irrelevant.

Screenshots (if applicable):
image

Installation Method

Docker

Additional Information

The problem is that langchain's YoutubeLoader does not itself verify that the requested language (or the default, 'en') is available for the video. We need to handle that check ourselves.
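
One way that check could look, as a minimal sketch rather than the project's actual fix: compare the requested codes with what the video offers and fall back to the first available transcript. The names pick_transcript_language and load_transcript are hypothetical, and the wiring assumes the list_transcripts call shown above plus YoutubeLoader's language argument.

```python
from typing import Sequence


def pick_transcript_language(requested: Sequence[str], available: Sequence[str]) -> str:
    """Return the first requested code the video offers, else its first available code."""
    if not available:
        raise ValueError("video has no transcripts")
    for code in requested:
        if code in available:
            return code
    # None of the requested languages exist: fall back instead of failing.
    return available[0]


def load_transcript(video_url: str, requested: Sequence[str] = ("en",)):
    """Hypothetical wiring; needs network access and both packages installed."""
    # Imported here so the helper above stays usable without these packages.
    from langchain_community.document_loaders import YoutubeLoader
    from youtube_transcript_api import YouTubeTranscriptApi

    video_id = YoutubeLoader.extract_video_id(video_url)
    # list_transcripts is the call used in this issue; other releases of
    # youtube-transcript-api may expose the listing differently.
    available = [t.language_code for t in YouTubeTranscriptApi.list_transcripts(video_id)]
    language = pick_transcript_language(requested, available)
    return YoutubeLoader.from_youtube_url(video_url, language=language).load()
```

With this, a request for 'en' against a video that only has an Italian transcript would resolve to 'it' instead of raising.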

tjbck changed the title from "Youtube video with non-english generated transcription fails to fetch" to "enhancement: non-english youtube rag" on May 4, 2024
tjbck self-assigned this on May 8, 2024
tjbck linked a pull request on May 8, 2024 that will close this issue

grigio commented May 9, 2024

@tjbck
Screenshot from 2024-05-09 15-59-15

I've updated, but I still get a similar error.

Collaborator

justinh-rahb commented May 9, 2024

@grigio You're still requesting 'en'; change it to 'it':

Screenshot 2024-05-09 at 10 17 33 AM


grigio commented May 9, 2024

> @grigio You're still requesting 'en'; change it to 'it':

Thanks, it works. I had assumed it was controlled by the UI language.

Author

atassis commented May 9, 2024

What about adding a separate toggle for automatic language detection, using the code I suggested?
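
A rough sketch of how such a toggle might behave, assuming the transcript entries expose language_code and is_generated as in youtube-transcript-api. TranscriptInfo and resolve_languages are hypothetical names, not part of Open WebUI. With the toggle off, the configured language is used unchanged. With it on, the result is a priority list built from what the video offers, with manually created transcripts ranked ahead of auto-generated ones.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class TranscriptInfo:
    """Hypothetical stand-in for the two fields read from a transcript entry."""
    language_code: str
    is_generated: bool


def resolve_languages(
    configured: str, transcripts: Sequence[TranscriptInfo], auto_detect: bool
) -> List[str]:
    """Return language codes to try, in descending priority."""
    if not auto_detect:
        return [configured]

    # sorted() is stable: manual transcripts (is_generated=False) come first,
    # and the original order is kept inside each group.
    detected: List[str] = []
    for info in sorted(transcripts, key=lambda t: t.is_generated):
        if info.language_code not in detected:
            detected.append(info.language_code)

    # Keep honoring the configured language when the video actually has it.
    if configured in detected:
        detected.remove(configured)
        detected.insert(0, configured)

    # Nothing detected: fall back to the configured value.
    return detected or [configured]
```

The returned list could be passed as YoutubeLoader's language argument, which accepts several codes in priority order.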

4 participants