Is your feature request related to a problem? Please describe.
It would be nice to be able to use structured generation via InferenceEndpointsLLM. This is already possible on the server side for models served with Text Generation Inference (TGI) under the hood.
Describe the solution you'd like
It's currently possible to use grammars via hosted Inference Endpoints LLMs using the huggingface_hub library, i.e. something like:
from pydantic import BaseModel
from huggingface_hub import InferenceClient

class Sentences(BaseModel):
    positive: list[str]
    negative: list[str]

client = InferenceClient("meta-llama/Meta-Llama-3-70B-Instruct")
client.text_generation(
    "Return sentences with positive or negative sentiment. Return as a JSON "
    "object with two keys, positive and negative, each containing a list of "
    "5 sentences.",
    # Constrain decoding to the JSON schema derived from the Pydantic model.
    grammar={"type": "json", "value": Sentences.model_json_schema()},
)
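Because the grammar constrains decoding, the returned string is guaranteed to be valid JSON matching the schema, so it can be parsed directly. A minimal sketch, using a hand-written sample response since no endpoint is called here:

```python
import json

# Hypothetical response string; a real call would return text shaped like
# this because the grammar restricts generation to the JSON schema.
raw = '{"positive": ["Great service."], "negative": ["Too slow."]}'

data = json.loads(raw)
# With pydantic installed, Sentences.model_validate_json(raw) would give a
# typed object instead of a plain dict.
positive, negative = data["positive"], data["negative"]
```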
huggingface_hub docs: https://huggingface.co/docs/huggingface_hub/package_reference/inference_client#huggingface_hub.InferenceClient.text_generation.grammar
Describe alternatives you've considered
It's also possible to subclass the current InferenceEndpointsLLM to get similar behavior.
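The subclassing workaround could look roughly like the sketch below: a subclass that injects a fixed JSON-schema grammar into every generation call. distilabel's actual InferenceEndpointsLLM API may differ, so the base class here is a network-free stand-in for huggingface_hub.InferenceClient, and all names are illustrative:

```python
class GrammarClient:
    """Stand-in for huggingface_hub.InferenceClient (no network calls)."""
    def __init__(self, model: str):
        self.model = model

    def text_generation(self, prompt: str, **kwargs):
        # A real client would hit the endpoint; here we echo the kwargs so
        # the injected grammar is visible.
        return kwargs

class StructuredLLM(GrammarClient):
    """Hypothetical subclass that always passes a JSON-schema grammar."""
    def __init__(self, model: str, schema: dict):
        super().__init__(model)
        self.grammar = {"type": "json", "value": schema}

    def text_generation(self, prompt: str, **kwargs):
        # Inject the grammar unless the caller supplied one explicitly.
        kwargs.setdefault("grammar", self.grammar)
        return super().text_generation(prompt, **kwargs)

# Hand-written schema mirroring the Sentences model above.
schema = {
    "type": "object",
    "properties": {
        "positive": {"type": "array", "items": {"type": "string"}},
        "negative": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["positive", "negative"],
}
llm = StructuredLLM("meta-llama/Meta-Llama-3-70B-Instruct", schema)
sent = llm.text_generation("List positive and negative sentences.")
```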