Have a conversation with an open-source language model via the Inference Providers API. Tool-calling support depends on both the model and the provider. The default chat model is chosen for a low-friction first call; for tool-calling examples, use a tool-capable model such as "Qwen/Qwen2.5-72B-Instruct".

Usage

hf_chat(
  message,
  system = NULL,
  model = hf_default_model("chat"),
  max_tokens = 500,
  temperature = 0.7,
  token = NULL,
  tools = NULL,
  tool_choice = NULL,
  stream = FALSE,
  callback = NULL,
  image = NULL,
  endpoint_url = NULL,
  ...
)

Arguments

message

Character string. The user message to send to the model.

system

Character string or NULL. Optional system prompt to set behavior.

model

Character string. Model ID from the Hugging Face Hub. Default: hf_default_model("chat"), which is "meta-llama/Llama-3.1-8B-Instruct". Append a `:provider` suffix to select a specific provider (e.g., "meta-llama/Llama-3.1-8B-Instruct:together").

max_tokens

Integer. Maximum tokens to generate. Default: 500.

temperature

Numeric. Sampling temperature (0-2). Default: 0.7.

token

Character string or NULL. API token for authentication.

tools

A list of tool definitions created by hf_tool(), or NULL.

tool_choice

Character string or list controlling tool use. The public Hugging Face router currently supports "auto" and "none"; a tool name, "required", or a full tool-choice list can be used only with compatible custom endpoints. Default: NULL.

stream

Logical. If TRUE, stream response deltas and return the reassembled final response. Default: FALSE.

callback

Function called with each streamed text delta. When NULL and stream = TRUE, deltas are printed to the console.

image

Optional image input for vision-capable chat models. Can be a URL, local file path, raw vector, or list/vector of those.

endpoint_url

Character string or NULL. A custom Inference Endpoint URL. When provided, requests are sent to this URL instead of the public Inference API. The endpoint must support the chat completions format.

...

Additional parameters passed to the model.

Value

A tibble with columns: role, content, model, tokens_used, and tool_calls.

Examples

if (FALSE) { # \dontrun{
# Simple question
hf_chat("What is the capital of France?")

# With system prompt
hf_chat(
  "Explain gradient descent",
  system = "You are a statistics professor. Use simple analogies."
)
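
# Control response length and randomness.
# A sketch: the values below are illustrative, not recommendations.
# max_tokens caps the generated tokens; temperature is documented as 0-2.
hf_chat(
  "Summarize the central limit theorem in two sentences",
  max_tokens = 100,
  temperature = 0.2
)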

# Use a specific provider
hf_chat("Hello!", model = "meta-llama/Llama-3.1-8B-Instruct:together")
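
# Send the request to a custom Inference Endpoint instead of the public API.
# A sketch: the URL is a hypothetical placeholder; replace it with your own
# endpoint, which must support the chat completions format.
hf_chat(
  "Hello!",
  endpoint_url = "https://my-endpoint.example.com"
)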

# Stream response deltas
hf_chat("Reply with exactly: OK", stream = TRUE)
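
# Collect streamed deltas with a callback instead of printing them.
# A sketch: assumes each delta reaches the callback as a character string,
# as described under `callback`. `chunks` and `res` are illustrative names.
chunks <- character()
res <- hf_chat(
  "Count from 1 to 5",
  stream = TRUE,
  callback = function(delta) chunks <<- c(chunks, delta)
)
paste(chunks, collapse = "")  # text assembled from the streamed deltas
res$content                   # reassembled final response from the tibble
res$tokens_used

# Ask about an image with a vision-capable model.
# A sketch: the model ID and image URL are assumptions for illustration;
# substitute a vision-capable model and an image you have access to.
hf_chat(
  "Describe this image in one sentence.",
  model = "Qwen/Qwen2.5-VL-7B-Instruct",
  image = "https://example.com/cat.png"
)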
} # }