mp3 text extraction Exception - 5MB~ file #460

Open
RiccardoRomagnoli opened this issue Apr 8, 2023 · 0 comments

Describe the bug
textract fails with an HTTP 400 (Bad Request) error raised through SpeechRecognition when extracting text from an mp3 file of about 5 MB.

Desktop:

  • OS: Ubuntu
  • Textract version: 1.6.5
  • Python version: 3.8

Additional context

File "/home/riccardo/.conda/envs/stochastic/lib/python3.8/site-packages/speech_recognition/__init__.py", line 840, in recognize_google
response = urlopen(request, timeout=self.operation_timeout)
File "/home/riccardo/.conda/envs/stochastic/lib/python3.8/urllib/request.py", line 222, in urlopen
return opener.open(url, data, timeout)
File "/home/riccardo/.conda/envs/stochastic/lib/python3.8/urllib/request.py", line 531, in open
response = meth(req, response)
File "/home/riccardo/.conda/envs/stochastic/lib/python3.8/urllib/request.py", line 640, in http_response
response = self.parent.error(
File "/home/riccardo/.conda/envs/stochastic/lib/python3.8/urllib/request.py", line 569, in error
return self._call_chain(*args)
File "/home/riccardo/.conda/envs/stochastic/lib/python3.8/urllib/request.py", line 502, in _call_chain
result = func(*args)
File "/home/riccardo/.conda/envs/stochastic/lib/python3.8/urllib/request.py", line 649, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 400: Bad Request

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "", line 1, in
File "/home/riccardo/.conda/envs/stochastic/lib/python3.8/site-packages/textract/parsers/__init__.py", line 79, in process
return parser.process(filename, input_encoding, output_encoding, **kwargs)
File "/home/riccardo/.conda/envs/stochastic/lib/python3.8/site-packages/textract/parsers/utils.py", line 46, in process
byte_string = self.extract(filename, **kwargs)
File "/home/riccardo/.conda/envs/stochastic/lib/python3.8/site-packages/textract/parsers/audio.py", line 28, in extract
speech = self.extract(temp_filename, method, **kwargs)
File "/home/riccardo/.conda/envs/stochastic/lib/python3.8/site-packages/textract/parsers/audio.py", line 39, in extract
speech = r.recognize_google(audio)
File "/home/riccardo/.conda/envs/stochastic/lib/python3.8/site-packages/speech_recognition/__init__.py", line 842, in recognize_google
raise RequestError("recognition request failed: {}".format(e.reason))
speech_recognition.RequestError: recognition request failed: Bad Request
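
For readers following the two chained tracebacks above: the second one appears because the HTTPError from urllib is caught inside recognize_google and a RequestError is raised from within the except block. The stdlib-only sketch below mimics that pattern so the chaining can be inspected without network access. It is an illustration under assumptions, not the library's code: RequestError here is a local stand-in class, and fake_urlopen, recognize, and describe_failure are hypothetical names introduced for this example.

```python
from email.message import Message
from urllib.error import HTTPError


class RequestError(Exception):
    """Local stand-in for speech_recognition.RequestError (not the real class)."""


def fake_urlopen(url):
    """Hypothetical replacement for urlopen that always fails with HTTP 400."""
    raise HTTPError(url, 400, "Bad Request", Message(), None)


def recognize(url):
    """Mimic the catch-and-re-raise visible in the traceback."""
    try:
        return fake_urlopen(url)
    except HTTPError as e:
        # Raising inside the except block chains the two exceptions, which is
        # why Python prints "During handling of the above exception, ...".
        raise RequestError("recognition request failed: {}".format(e.reason))


def describe_failure(url):
    """Return the outer message and the type name of the chained inner error."""
    try:
        recognize(url)
    except RequestError as err:
        return str(err), type(err.__context__).__name__
```

Calling describe_failure("http://example.invalid") returns the outer message together with the name of the inner exception type, which shows that the only detail carried into the RequestError is the HTTP reason phrase. The sketch does not explain why the server rejected the request for this particular file.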
