Post
2034
Understanding the json format response with HF's Serverless Inference API 🤗
As it stands, there seems to be an inconsistency with the OpenAI documentation on the question of implementing the JSON response format using the InferenceClient completion API.
After investigating the InferenceClient source code, I share the official solution using a JSON Schema. This consolidates the structure of the response and simplifies parsing as part of an automated process for extracting metadata, information:
As a reminder, json mode is activated with the OpenAI client as follows:
One question remains unanswered, however, and will perhaps be answered by the community: it seems that an incompatibility persists for list of dictionaries generation, and currently, the production of simple dictionaries seems to be the only functional option.
As it stands, there seems to be an inconsistency with the OpenAI documentation on the question of implementing the JSON response format using the InferenceClient completion API.
After investigating the InferenceClient source code, I share the official solution using a JSON Schema. This consolidates the structure of the response and simplifies parsing as part of an automated process for extracting metadata, information:
from huggingface_hub import InferenceClient
client = InferenceClient("meta-llama/Meta-Llama-3-70B-Instruct")
messages = [
{
"role": "user",
"content": "I saw a puppy a cat and a raccoon during my bike ride in the park. What did I saw and when?",
},
]
response_format = {
"type": "json",
"value": {
"properties": {
"location": {"type": "string"},
"activity": {"type": "string"},
"animals_seen": {"type": "integer", "minimum": 1, "maximum": 5},
"animals": {"type": "array", "items": {"type": "string"}},
},
"required": ["location", "activity", "animals_seen", "animals"],
},
}
response = client.chat_completion(
messages=messages,
response_format=response_format,
max_tokens=500,
)
print(response.choices[0].message.content)
As a reminder, json mode is activated with the OpenAI client as follows:
response = client.chat.completions.create(
model="gpt-3.5-turbo-0125",
messages=[...],
response_format={"type": "json_object"}
)
One question remains unanswered, however, and will perhaps be answered by the community: it seems that an incompatibility persists for list of dictionaries generation, and currently, the production of simple dictionaries seems to be the only functional option.