I see. If you need to call it from a compiled language, for example, that is possible, but it takes a bit more work.
For HF, the easiest way is to use InferenceClient from Python or JS. If that's not possible, the next best option is to call an API that returns JSON.
From what I've heard, this is what everyone used before InferenceClient was developed, so there should be a lot you can do with it. Because of that history, though, much of the information you'll find about it dates from before 2023. It's still used inside older Gradio versions and so on, so it should still work.
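For reference, here is a rough sketch of what calling the JSON API directly can look like, using only the Python standard library so the same steps are easy to port to another language. The endpoint URL pattern, the `{"inputs": ...}` payload shape, and the helper names `build_request` and `query` are my assumptions for illustration, so please check them against the current HF docs for the model you are using.

```python
import json
import urllib.request

# Assumption: the classic serverless endpoint pattern. It may differ for
# your model or may have changed, so verify it in the current docs.
API_BASE = "https://api-inference.huggingface.co/models"


def build_request(model_id: str, token: str, inputs: str) -> urllib.request.Request:
    """Build a POST request with a JSON body and a bearer token header."""
    body = json.dumps({"inputs": inputs}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/{model_id}",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def query(model_id: str, token: str, inputs: str, timeout: float = 30.0):
    """Send the request and decode the JSON response (performs a network call)."""
    req = build_request(model_id, token, inputs)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Usage would be something like `query("some-user/some-model", "hf_your_token", "Hello")`, where both arguments are placeholders. In a compiled language the steps are the same: serialize the JSON body, set the two headers, POST, and parse the JSON that comes back.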