
[FEATURE] Add structured generation for InferenceEndpointsLLM #657

Open
davanstrien opened this issue May 21, 2024 · 1 comment · Fixed by #680

@davanstrien
Contributor

Is your feature request related to a problem? Please describe.

It would be nice to be able to use structured generation via InferenceEndpointsLLM. This is directly possible on the server side for models using TextGenerationInference under the hood.

Describe the solution you'd like

It's currently possible to use grammars via hosted Inference Endpoints LLMs using the huggingface_hub library, i.e. something like:

from pydantic import BaseModel
from huggingface_hub import InferenceClient

class Sentences(BaseModel):
    positive: list[str]
    negative: list[str]

client = InferenceClient("meta-llama/Meta-Llama-3-70B-Instruct")

client.text_generation(
    "Return sentences with positive or negative sentiment. Return a JSON object with two keys, positive and negative, each containing a list of 5 sentences",
    grammar={"type": "json", "value": Sentences.model_json_schema()},
)
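For reference, the grammar argument above is a thin wrapper around a standard JSON schema. Here is a minimal sketch of the payload shape that TGI's structured generation accepts, with the schema written out by hand (equivalent to what Sentences.model_json_schema() produces, modulo Pydantic's extra metadata) so the structure is explicit:

```python
import json

# Hand-written JSON schema matching the Sentences model above:
# an object with two required list-of-string fields.
schema = {
    "type": "object",
    "properties": {
        "positive": {"type": "array", "items": {"type": "string"}},
        "negative": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["positive", "negative"],
}

# TGI's structured-generation payload: a type tag plus the schema itself.
grammar = {"type": "json", "value": schema}

print(json.dumps(grammar, indent=2))
```

In practice this dict is what gets passed as `grammar=` to `client.text_generation`, constraining the model's output to valid instances of the schema.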

Describe alternatives you've considered
It's also possible to subclass the current InferenceEndpointsLLM to get similar behavior.
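As an illustration of that alternative, here is a minimal, hypothetical sketch of a wrapper that injects a JSON-schema grammar into every text-generation call. The `GrammarLLM` class and its `generate` method are invented for this example (they are not distilabel's actual `InferenceEndpointsLLM` API); the underlying client is duck-typed, so any object exposing a `text_generation(prompt, grammar=...)` method, such as `huggingface_hub.InferenceClient`, would work:

```python
from typing import Any


class GrammarLLM:
    """Hypothetical wrapper: forwards prompts to a text-generation client,
    always attaching a JSON-schema grammar (the payload shape TGI expects)."""

    def __init__(self, client: Any, schema: dict) -> None:
        self.client = client
        # TGI's structured-generation payload: {"type": "json", "value": <schema>}
        self.grammar = {"type": "json", "value": schema}

    def generate(self, prompt: str, **kwargs: Any) -> str:
        # Inject the grammar into every call so outputs conform to the schema.
        return self.client.text_generation(prompt, grammar=self.grammar, **kwargs)


# Usage with a stand-in client (a real InferenceClient would replace this):
class FakeClient:
    def text_generation(self, prompt: str, grammar: dict = None, **kwargs: Any) -> str:
        return f"grammar type={grammar['type']}"


llm = GrammarLLM(FakeClient(), {"type": "object"})
print(llm.generate("hello"))
```

A real subclass of distilabel's `InferenceEndpointsLLM` would override its generation method analogously, passing the grammar through to the underlying Inference Endpoints call.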

@alvarobartt alvarobartt self-assigned this May 21, 2024
@alvarobartt alvarobartt added this to the 1.2.0 milestone May 21, 2024
@alvarobartt alvarobartt assigned plaguss and unassigned plaguss May 22, 2024
@davanstrien
Contributor Author

For now, I've created a custom LLM to use this. In case it's useful, you can see it here.
