Generate short captions for images. By default this uses a vision-capable chat model, because during verification the public `hf-inference` provider did not expose a broadly available image-to-text captioning model.

Usage

hf_caption_image(
  image,
  prompt = "Write a short, factual caption for this image.",
  model = hf_default_model("caption_image"),
  max_tokens = 80,
  token = NULL,
  endpoint_url = NULL,
  ...
)

Arguments

image

Image input: a local file path, URL, raw vector, or vector/list of paths/URLs.

prompt

Prompt used to request the caption.

model

Character string. Vision-capable chat model ID. Default: the result of hf_default_model("caption_image"), documented as "google/gemma-3-4b-it".

max_tokens

Integer. Maximum tokens to generate.

token

Character string or NULL. API token for authentication.

endpoint_url

Character string or NULL. A custom Inference Endpoint URL.

...

Additional arguments passed to hf_describe_image().

Value

A tibble with columns: image, caption.
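As a rough illustration of working with a result of this shape, the sketch below uses a hand-built data frame as a stand-in for the returned tibble, so it runs without any API call. The file names and captions are made up, and it assumes the image column holds each input path or URL as text, which this page does not state explicitly.

```r
# Stand-in for a returned result: a hand-built data frame with the two
# documented columns (image, caption). All values are made up.
captions <- data.frame(
  image = c("cat.png", "dog.png"),
  caption = c("A cat sitting on a windowsill.", "A dog running on grass."),
  stringsAsFactors = FALSE
)

# Look up the caption for one image by its path.
cat_caption <- captions$caption[captions$image == "cat.png"]

# Build a named character vector (image -> caption), e.g. for alt text.
alt_text <- setNames(captions$caption, captions$image)
alt_text[["dog.png"]]
```

The same column access should carry over to a real result if the columns behave as assumed here.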

Examples

if (FALSE) { # \dontrun{
hf_caption_image("cat.png")
} # }
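A slightly fuller call might look like the sketch below. It uses only the arguments listed under Usage and passes a character vector of paths, which the image argument is documented to accept. The file names and the prompt are placeholders, and the call is guarded with if (FALSE) because it needs network access and presumably valid credentials.

```r
# Placeholder image paths; replace with real files or URLs.
images <- c("cat.png", "dog.png")

if (FALSE) {
  # Assumes the package that provides hf_caption_image() is attached.
  captions <- hf_caption_image(
    images,
    prompt = "Describe the main subject in one short sentence.",
    max_tokens = 40
  )

  # The result is documented as a tibble with columns image and caption.
  captions$caption
}
```

Lowering max_tokens is one way to keep captions short; the value 40 here is arbitrary, not a recommendation from the package.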