Convert unstructured text into tidy columns using a chat model with structured
JSON output. The schema argument can be a lightweight named character
vector such as c(name = "string", score = "number") or a full JSON
Schema list. The function returns one row per input text and one column per
schema field.
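The two schema forms can be compared side by side. The sketch below is only an illustration: `expand_schema()` is a hypothetical helper, not part of the package, and the `required` and `additionalProperties` entries are assumptions about what a strict JSON Schema would typically contain. The function's actual internal expansion may differ.

```r
# Shorthand form: a named character vector mapping field names to JSON types.
schema_short <- c(name = "string", score = "number")

# Full form: a JSON Schema written as a nested list.
# `required` and `additionalProperties` are assumptions, not documented behaviour.
schema_full <- list(
  type = "object",
  properties = list(
    name = list(type = "string"),
    score = list(type = "number")
  ),
  required = c("name", "score"),
  additionalProperties = FALSE
)

# Hypothetical helper showing how the shorthand could map onto the full form.
expand_schema <- function(fields) {
  stopifnot(is.character(fields), !is.null(names(fields)))
  list(
    type = "object",
    # One property per field, each carrying only its JSON type.
    properties = lapply(as.list(fields), function(type) list(type = type)),
    required = names(fields),
    additionalProperties = FALSE
  )
}

expanded <- expand_schema(schema_short)
```

The shorthand is convenient for flat records. The list form is the one to reach for when a field needs nesting, enumerations, or descriptions.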
Usage
hf_extract(
  text,
  schema,
  model = hf_default_model("chat"),
  strict = TRUE,
  system = paste("Extract the requested fields from the user's text.",
    "Return only JSON that matches the schema."),
  token = NULL,
  endpoint_url = NULL,
  ...
)
Arguments
- text
Character vector of text(s) to extract from.
- schema
A named character vector of field names and JSON types, or a JSON Schema list with object
properties.
- model
Character string. Model ID from Hugging Face Hub. Default: "meta-llama/Llama-3.1-8B-Instruct".
- strict
Logical. Whether to request strict JSON Schema adherence. Default: TRUE.
- system
Character string. System prompt sent with each extraction request. Default: a concise extraction instruction.
- token
Character string or NULL. API token for authentication.
- endpoint_url
Character string or NULL. A custom Inference Endpoint URL. The endpoint must support the chat completions format.
- ...
Additional parameters passed to the chat-completions request.
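Examples
A minimal sketch of a call, assuming the package that provides `hf_extract()` is already attached and that a valid API token is available. The input sentences are made up for illustration, and the call is guarded so the snippet can be sourced without the package or network access.

```r
# Illustrative input: one extraction request is made per element.
reviews <- c(
  "Anna rated the course 9 out of 10.",
  "Ben gave it a 6."
)

# Shorthand schema: field name -> JSON type.
schema <- c(name = "string", score = "number")

# Only attempt the request when hf_extract() is available in this session.
# The call needs network access and an API token.
if (exists("hf_extract")) {
  out <- hf_extract(reviews, schema = schema)
  # Per the description, `out` has one row per element of `reviews`
  # and one column per schema field (`name`, `score`).
  print(out)
}
```

Model output is not guaranteed to be correct even when it matches the schema, so the extracted values are worth spot-checking against the source text.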