Have a conversation with an open-source language model via the Inference Providers API.
Tool calling support depends on the model/provider. The default chat model is
optimized for a low-friction first call; for tool-calling examples, use a
tool-capable model such as "Qwen/Qwen2.5-72B-Instruct".
Usage
hf_chat(
message,
system = NULL,
model = hf_default_model("chat"),
max_tokens = 500,
temperature = 0.7,
token = NULL,
tools = NULL,
tool_choice = NULL,
stream = FALSE,
callback = NULL,
image = NULL,
endpoint_url = NULL,
...
)
Arguments
- message
Character string. The user message to send to the model.
- system
Character string or NULL. Optional system prompt to set behavior.
- model
Character string. Model ID from the Hugging Face Hub. Default: "meta-llama/Llama-3.1-8B-Instruct". Append a `:provider` suffix to select a specific provider (e.g., "meta-llama/Llama-3.1-8B-Instruct:together").
- max_tokens
Integer. Maximum tokens to generate. Default: 500.
- temperature
Numeric. Sampling temperature (0-2). Default: 0.7.
- token
Character string or NULL. API token for authentication.
- tools
A list of tool definitions created by hf_tool(), or NULL.
- tool_choice
Character string or list controlling tool use. The public Hugging Face router currently supports "auto" and "none"; a tool name, "required", or full list can be used with compatible custom endpoints. Default: NULL.
- stream
Logical. If TRUE, stream response deltas and return the reassembled final response. Default: FALSE.
- callback
Function called with each streamed text delta. When NULL and stream = TRUE, deltas are printed to the console.
- image
Optional image input for vision-capable chat models. Can be a URL, local file path, raw vector, or list/vector of those.
- endpoint_url
Character string or NULL. A custom Inference Endpoint URL. When provided, requests are sent to this URL instead of the public Inference API. The endpoint must support the chat completions format.
- ...
Additional parameters passed to the model.
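The `:provider` suffix described under `model` is plain string concatenation, so it can be built programmatically. The sketch below is one way to do that; `with_provider()` is a hypothetical helper written for illustration and is not part of the package. The final call is guarded because it assumes a valid token and network access.

```r
# Hypothetical helper (not part of the package): append a provider suffix
# to a Hub model ID, or return the ID unchanged when no provider is given.
with_provider <- function(model, provider = NULL) {
  stopifnot(is.character(model), length(model) == 1L, nzchar(model))
  if (is.null(provider)) {
    return(model)
  }
  stopifnot(is.character(provider), length(provider) == 1L, nzchar(provider))
  paste0(model, ":", provider)
}

model_id <- with_provider("meta-llama/Llama-3.1-8B-Instruct", "together")
model_id
# "meta-llama/Llama-3.1-8B-Instruct:together"

if (FALSE) {
  # Assumes the package providing hf_chat() is loaded, a valid token is
  # available, and the chosen provider serves this model.
  hf_chat("Hello!", model = model_id, max_tokens = 50, temperature = 0.2)
}
```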
Examples
if (FALSE) { # \dontrun{
# Simple question
hf_chat("What is the capital of France?")
# With system prompt
hf_chat(
"Explain gradient descent",
system = "You are a statistics professor. Use simple analogies."
)
# Use a specific provider
hf_chat("Hello!", model = "meta-llama/Llama-3.1-8B-Instruct:together")
# Stream response deltas
hf_chat("Reply with exactly: OK", stream = TRUE)
} # }
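The `callback` argument can also be used to capture streamed deltas instead of printing them. The following is a minimal sketch, assuming that `hf_chat()` passes each delta to the callback as a single character string in the first argument; `make_delta_collector()` is a hypothetical helper, not a package function. The collector is first exercised offline with simulated deltas, and the real call is guarded because it needs a token and network access.

```r
# Hypothetical helper (not part of the package): build a callback that
# stores every streamed delta it receives.
# Assumption: each delta arrives as a character string in the first argument.
make_delta_collector <- function() {
  deltas <- character(0)
  list(
    callback = function(delta, ...) {
      deltas <<- c(deltas, delta)  # append to the enclosing vector
      invisible(NULL)
    },
    text = function() paste(deltas, collapse = ""),
    count = function() length(deltas)
  )
}

collector <- make_delta_collector()

# Offline check with simulated deltas; no request is sent.
for (piece in c("Hel", "lo", "!")) collector$callback(piece)
collector$text()
# "Hello!"

if (FALSE) {
  # Assumes the package providing hf_chat() is loaded and a valid token
  # is available.
  live <- make_delta_collector()
  hf_chat("Reply with exactly: OK", stream = TRUE, callback = live$callback)
  live$text()
}
```

Because the deltas live in a closure, each collector keeps its own state, so separate collectors can be used for separate conversations.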