Not able to load pretrained models #721

Closed
Arwin567 opened this issue May 14, 2024 · 3 comments

Arwin567 commented May 14, 2024

I am trying to use llmware's pretrained model 'industry-bert-contracts' in my code and am getting the following error: "AttributeError: 'HFEmbeddingModel' object has no attribute 'max_input_len'"

I tried loading every other model in the HFEmbeddingModel family listed in model_config.py and still hit the same error.

My code works with other models, such as the GGUFGenerativeModel "bling-phi-3-gguf" and "llmware/bling-sheared-llama-1.3b-0.1".

doberst (Contributor) commented May 14, 2024

@Arwin567 - thanks for sharing this, and sorry that you ran into an issue. I suspect the problem is that "industry-bert-contracts" is an embedding model, not a generative model. So, if you are looking to run a prompt/LLM inference, then "bling-phi-3-gguf" and "llmware/bling-sheared-llama-1.3b-0.1" are great choices, as they are both "generative" models.

On the other hand, if you are interested in building a semantic embedding space for knowledge retrieval with a vector database, then "industry-bert-contracts" is a great choice. It outputs a 768-dimensional embedding vector, not generated text. You may want to check out some of the Embedding examples, which use that model.
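To make the retrieval idea concrete, here is a minimal sketch of how embedding vectors get used: passages and the query are each mapped to a vector, and the closest passage by cosine similarity is returned. The 3-dimensional vectors and passage labels below are made up for illustration and stand in for the 768-dimensional vectors an embedding model would produce; no llmware call is involved.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Toy vectors (assumed values), standing in for real 768-dimensional embeddings.
passages = {
    "termination clause": [0.9, 0.1, 0.0],
    "payment terms": [0.1, 0.9, 0.1],
    "governing law": [0.0, 0.2, 0.9],
}

# Toy embedding of a query about ending the contract.
query = [0.8, 0.2, 0.1]

# Retrieval step: pick the passage whose vector is most similar to the query.
best = max(passages, key=lambda name: cosine(query, passages[name]))
print(best)
```

A vector database performs the same nearest-neighbor lookup at scale; a generative model is then the piece that turns the retrieved passage into an answer.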

Hope this resolves the issue. Please confirm back. (Really glad that you caught this. We will note it as something to add to the documentation, to clarify for others too.)

Within the ModelCatalog, you can use the discovery methods: ModelCatalog().list_all_models(), ModelCatalog().list_generative_models(), or ModelCatalog().list_embedding_models().
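As a rough sketch of how those discovery methods could be used to avoid this kind of mix-up, the snippet below checks whether a model is an embedding or a generative model before deciding how to use it. It assumes that list_all_models() returns a list of dicts with "model_name" and "model_category" keys and that ModelCatalog is importable from llmware.models; the helper names (model_kind, load_catalog_cards) and the SAMPLE_CARDS fallback are hypothetical and exist only for this example.

```python
from typing import Iterable, Optional

# Illustrative stand-in cards. The field names are an assumption about the
# shape of llmware's catalog entries, not a confirmed schema.
SAMPLE_CARDS = [
    {"model_name": "industry-bert-contracts",
     "model_category": "embedding",
     "model_family": "HFEmbeddingModel"},
    {"model_name": "bling-phi-3-gguf",
     "model_category": "generative_local",
     "model_family": "GGUFGenerativeModel"},
]


def model_kind(model_name: str, cards: Iterable[dict]) -> Optional[str]:
    """Return 'embedding', 'generative', or None if the model is not listed."""
    for card in cards:
        if card.get("model_name") == model_name:
            category = str(card.get("model_category", ""))
            if "embedding" in category:
                return "embedding"
            if "generative" in category:
                return "generative"
            return category or None
    return None


def load_catalog_cards() -> list:
    """Use the real catalog when llmware is installed, else the sample cards."""
    try:
        from llmware.models import ModelCatalog  # assumed import path
    except ImportError:
        return SAMPLE_CARDS
    return ModelCatalog().list_all_models()


if __name__ == "__main__":
    cards = load_catalog_cards()
    for name in ("industry-bert-contracts", "bling-phi-3-gguf"):
        print(name, "->", model_kind(name, cards))
```

If the check reports "embedding", the model belongs in an embedding/retrieval step rather than in a prompt.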

Arwin567 (Author) commented

Understood, thank you for the clarification and the quick response.

I am trying to extract data from contracts into a CSV file using 15 questions, based on your contract_analysis_on_laptop code. Can I use the industry-bert-contracts model in this case?

doberst (Contributor) commented May 21, 2024

@Arwin567 - yes, definitely, the industry-bert-contracts model is great for building semantic embeddings on contracts and other legal documents. Hope it is progressing well! Will close this thread.

doberst closed this as completed May 21, 2024