Documentation required for parameters of the Inference API
Currently there is no documentation of the parameters of the Inference API, yet the API call returns an error when the input is too long. Parameters such as max tokens should therefore be documented in the README.md.
Unfortunately, max_tokens will not necessarily work; you can probably use truncation.
import json
import os

import requests

# Read the API token from the environment rather than hard-coding it.
API_TOKEN = os.getenv("HF_API_TOKEN")
API_URL = "https://api-inference.huggingface.co/models/roberta-base-openai-detector"
headers = {"Authorization": f"Bearer {API_TOKEN}"}

def query(payload):
    # Send the payload as JSON and decode the JSON response.
    data = json.dumps(payload)
    response = requests.request("POST", API_URL, headers=headers, data=data)
    return json.loads(response.content.decode("utf-8"))

# The input is deliberately too long for the model; "truncation": True asks
# the pipeline to cut it down instead of returning an error.
data = query({"inputs": "I like you. I love you" * 100, "parameters": {"truncation": True}})
print(data)
This is undocumented indeed; I will work on it since this model is getting a lot more usage. In general, most pipeline parameters from transformers are supported and knowingly not necessarily documented (so we can modify/accept/reject them in the API where it makes sense). When we do that, it is mostly to prevent abuse.
Is that clearer to you? Does the solution work for you?
Cheers,
Nicolas
Hey @Narsil, thanks for such a wonderful explanation. I just had one more small question: could you explain to me what truncation actually does?
Thanks in advance.
Regards,
Raihan
truncation works by dropping some tokens. In this case the rightmost ones (so the end of the text).
This is usually preferred for text-classification, since the initial parts of the text are usually good enough (for regular text-classification, like "is this text about politics or science?").
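To make the "drop the rightmost tokens" behaviour concrete, here is a minimal sketch. It is not the actual transformers implementation: the helper name truncate_right is made up for illustration, and whitespace splitting stands in for the model's real tokenizer.

```python
def truncate_right(tokens, max_length):
    """Keep the first max_length tokens and drop the rest (the end of the text)."""
    return tokens[:max_length]


# Assumption: whitespace splitting is only a stand-in for the real tokenizer.
tokens = ("I like you. I love you. " * 100).split()

# Hypothetical limit of 8 tokens, chosen small so the effect is easy to see.
kept = truncate_right(tokens, max_length=8)

print(len(tokens), "tokens before,", len(kept), "tokens after")
print(" ".join(kept))
```

Everything after the limit is simply never seen by the model, which is why this is fine when the start of the text is representative and risky when it is not.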
There is no real way to make a model with a finite input range work on arbitrarily long input. The best you can do is send each part of the text and try to come up with some kind of aggregate decision.
If the start is not OpenAI-generated, and the middle is, does that mean that the text was generated? There's just no way to tell tbh. So the API, just like the transformers pipeline, reflects that by just throwing errors by default.
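For what it's worth, the "send each part and aggregate" idea could look roughly like the sketch below. Everything here is an assumption for illustration: fake_score is a hypothetical stand-in for one API call per chunk, the chunk size is arbitrary, and mean/max are just two possible aggregation rules, not something the API does for you.

```python
from statistics import mean


def split_into_chunks(tokens, chunk_size):
    """Split a token list into consecutive chunks of at most chunk_size tokens."""
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]


def aggregate_scores(chunk_scores):
    """Combine per-chunk scores; which rule is 'right' depends on the use case."""
    return {"mean": mean(chunk_scores), "max": max(chunk_scores)}


def fake_score(chunk):
    # Hypothetical stand-in for querying the model on one chunk:
    # here, the fraction of tokens marked "GEN" in the chunk.
    return sum(token == "GEN" for token in chunk) / len(chunk)


# Human-written start and end, generated middle (toy data).
tokens = ["human"] * 4 + ["GEN"] * 4 + ["human"] * 4

chunks = split_into_chunks(tokens, chunk_size=4)
scores = [fake_score(chunk) for chunk in chunks]
summary = aggregate_scores(scores)

print(scores)
print(summary)
```

On this toy input the mean suggests the text is mostly not generated while the max flags one fully generated chunk, which is exactly the ambiguity described above: the aggregation rule is a decision you have to make yourself.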
Does that answer your question?
Yeah. Thanks for the awesome explanation!