Error when downloading pics using chrome #46

Open
qiuzhewei opened this issue Aug 3, 2021 · 2 comments

Comments

@qiuzhewei

Hi,
The following error occurs when I try to run the script:

selenium.common.exceptions.WebDriverException: Message: unknown error: cannot find Chrome binary

Any help will be appreciated!

@caop-kie

You need to install the Chrome browser on your operating system first in order to use the selenium package.
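As a rough way to check whether a Chrome executable is visible at all, the sketch below looks for a few commonly used executable names on `PATH` with the standard library. The helper name `find_chrome_binary` and the list of candidate names are assumptions made for illustration; they are not part of this project or of selenium.

```python
import shutil

# Executable names that Chrome/Chromium installs commonly use.
# This list is an illustrative assumption, not an exhaustive one.
CANDIDATE_NAMES = [
    "google-chrome",
    "google-chrome-stable",
    "chromium",
    "chromium-browser",
    "chrome",
]


def find_chrome_binary(candidates=CANDIDATE_NAMES):
    """Return the path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path is not None:
            return path
    return None


chrome_path = find_chrome_binary()
if chrome_path is None:
    print("No Chrome executable found on PATH")
else:
    print("Found Chrome at", chrome_path)
```

A `None` result does not prove Chrome is missing: on macOS and Windows the browser is often installed outside `PATH`. In that case, setting `binary_location` on selenium's `ChromeOptions` to the full path of the executable may help.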

@madiskoivopuu

For anyone still having this issue, the problem lies in the regex that parses the image URL. It captures extra junk after the URL, which breaks the image link. To fix the code, change the body of the google_image_url_from_webpage function in crawler.py to this:

# (line 121)
# Needs `re`, `unquote` (urllib.parse) and `By` (selenium.webdriver.common.by) imported in crawler.py.

    image_elements = driver.find_elements(By.CLASS_NAME, "islib")
    image_urls = list()
    # \S  -> match any non-whitespace character
    # *?  -> match the previous token (\S) zero or more times, lazily,
    #        i.e. stop at the first & rather than the last one
    url_pattern = r"imgurl=\S*?&"

    for image_element in image_elements[:max_number]:
        outer_html = image_element.get_attribute("outerHTML")
        re_group = re.search(url_pattern, outer_html)
        if re_group is not None:
            # Strip the leading "imgurl=" and the trailing "&", then URL-decode.
            image_url = unquote(re_group.group()[len("imgurl=") : -len("&")])
            image_urls.append(image_url)
    return image_urls
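
To see what the lazy pattern does without launching a browser, here is a small standalone sketch that applies the same regex and slicing to a made-up `outerHTML` string. The helper name `extract_image_url` and the sample HTML are hypothetical and only mimic the shape of a link that carries an `imgurl=` parameter.

```python
import re
from urllib.parse import unquote

# Same pattern as in the fix above: lazy match up to the first "&".
URL_PATTERN = r"imgurl=\S*?&"


def extract_image_url(outer_html):
    """Return the decoded imgurl value from an HTML snippet, or None."""
    match = re.search(URL_PATTERN, outer_html)
    if match is None:
        return None
    # Drop the leading "imgurl=" and the trailing "&", then URL-decode.
    return unquote(match.group()[len("imgurl="):-len("&")])


# Made-up sample; real markup from the results page may differ.
sample_html = (
    '<a class="islib" href="/imgres?imgurl=https%3A%2F%2Fexample.com%2Fcat.jpg'
    '&amp;imgrefurl=https%3A%2F%2Fexample.com%2F&amp;h=100"></a>'
)

print(extract_image_url(sample_html))  # prints https://example.com/cat.jpg
```

With a greedy `\S*` in place of `\S*?`, the match would run on to the last `&` in the same run of non-whitespace characters, so the extracted value would include the parameters that follow the image URL.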
