Selecting Icons #32

llermaly · 2024-01-21T02:39:43Z

Hi! ,

I'm trying to automate using the search bar on a list of unknown sites.

In most cases the bar is not visible but there is an icon I must click before to display the search bar.

This example, I want to detect and click the magnifying glass:

The problem is it shows this way in the text [ @ 18 ] so GPT can not pick it (I'm using the llamaindex agent)

I read @asim-shrestha mentions GPT-V mode in another issue but I'm not sure on how activate that one, I'm following the docs without success.

Any advice? thanks

The text was updated successfully, but these errors were encountered:

asim-shrestha · 2024-02-01T20:48:07Z

Hey @llermaly, because Tarsier is typically for text parsing, we currently don't support icons. (Not sure how we'd best go about it in the future either)

For images it is quite straight forward. There is a page_to_image function in tarsier that will return the bytes of the image. Then you can pass that in as an image to a vision language model likeGPT-4-V. Let me know if that helps!

asim-shrestha · 2024-02-01T20:49:14Z

If you still want to go the text approach, you can manually find out which of those elements may be related to a search icon (through image name, or some other tag in the html itself) and provide that information in the prompt as well

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Selecting Icons #32

Selecting Icons #32

llermaly commented Jan 21, 2024 •

edited

asim-shrestha commented Feb 1, 2024

asim-shrestha commented Feb 1, 2024

Selecting Icons #32

Selecting Icons #32

Comments

llermaly commented Jan 21, 2024 • edited

asim-shrestha commented Feb 1, 2024

asim-shrestha commented Feb 1, 2024

llermaly commented Jan 21, 2024 •

edited