How to customize search and crawler api? #42

Open
wxs77577 opened this issue Mar 28, 2019 · 2 comments

Comments

@wxs77577

The crawler picks up useless parts of each site, such as the navigation menu, header, and footer. How can I exclude them from the crawler or from the search API results?

@ltienphat1307

+1

@ltienphat1307

Hi @medcl

The GOPA tool crawls the whole HTML content of each site. It cannot categorize the content by title, excerpt, or site description, and stores all of it in a single field named "text".

Is there any way to configure it to split the content into title, excerpt, and site description? I'd like to crawl only the necessary data.

Thanks
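
For illustration only, the kind of extraction being asked for in this thread could look like the sketch below. It is not GOPA's API or configuration. It is a standalone Python example using the standard library's `html.parser`, and the names `FieldExtractor`, `extract_fields`, `SKIP_TAGS`, and the 160-character excerpt length are all hypothetical choices made for the example. It assumes the unwanted parts of a page are marked up with `<nav>`, `<header>`, and `<footer>` tags, which is not true of every site.

```python
from html.parser import HTMLParser

# Tags whose content is treated as page chrome and skipped (an assumption).
SKIP_TAGS = {"nav", "header", "footer", "script", "style"}


class FieldExtractor(HTMLParser):
    """Collects the title, meta description, and body text of a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self.text_parts = []
        self._skip_depth = 0   # > 0 while inside any tag listed in SKIP_TAGS
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1
        elif tag == "title":
            self._in_title = True
        elif tag == "meta":
            attr = dict(attrs)
            if (attr.get("name") or "").lower() == "description":
                self.description = attr.get("content") or ""

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data.strip()
        elif self._skip_depth == 0 and data.strip():
            # Keep only text that is outside nav/header/footer/script/style.
            self.text_parts.append(data.strip())


def extract_fields(html, excerpt_len=160):
    """Return separate title, description, excerpt, and text fields."""
    parser = FieldExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.text_parts)
    return {
        "title": parser.title,
        "description": parser.description,
        "excerpt": text[:excerpt_len],
        "text": text,
    }
```

Calling `extract_fields(page_html)` returns a dictionary with separate `title`, `description`, `excerpt`, and `text` keys instead of one combined field. Sites that build their menus from plain `<div>` elements would need class- or id-based rules instead of the tag list used here.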
