Transcribe speech from an audio file, URL, or raw vector using automatic speech recognition via the Hugging Face Inference Providers API.

Usage

hf_transcribe(
  audio,
  return_timestamps = FALSE,
  model = hf_default_model("transcribe"),
  token = NULL,
  endpoint_url = NULL,
  content_type = NULL,
  ...
)

Arguments

audio

Audio input: a local file path, a URL, a raw vector, or a vector or list of paths and URLs.

return_timestamps

Logical or character. Use `FALSE` for text only, `TRUE` for chunk timestamps, or a model-supported value such as `"word"`.

model

Character string. Model ID from the Hugging Face Hub. Defaults to the value of `hf_default_model("transcribe")`, documented as "openai/whisper-large-v3".

token

Character string or NULL. API token for authentication.

endpoint_url

Character string or NULL. A custom Inference Endpoint URL.

content_type

Character string or NULL. MIME type to use for raw audio inputs. For paths and URLs, the type is inferred when possible.

...

Additional arguments (currently unused).
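
As an illustration of the raw-vector input described under `audio` and `content_type`, here is a minimal sketch. `read_audio_raw()` is a hypothetical helper, not part of the package, and "audio/flac" is assumed to be an accepted MIME type for FLAC data. The call to `hf_transcribe()` is guarded because it needs a real file and API access.

```r
# Hypothetical helper (not part of the package): read a whole file into a
# raw vector using base R only.
read_audio_raw <- function(path) {
  # file.size() returns the byte count, so readBin() reads the entire file.
  readBin(path, what = "raw", n = file.size(path))
}

if (FALSE) {
  # "interview.flac" is the file name used in the Examples section.
  audio_bytes <- read_audio_raw("interview.flac")

  # A raw vector carries no file extension, so the MIME type is passed
  # explicitly. "audio/flac" is an assumption about what the provider accepts.
  hf_transcribe(audio_bytes, content_type = "audio/flac")
}
```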

Value

A tibble with columns: audio, text, chunks.

Examples

if (FALSE) { # \dontrun{
hf_transcribe("interview.flac")
hf_transcribe("interview.flac", return_timestamps = "word")
} # }
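
Because `audio` also accepts a vector of paths, several files can be passed in one call. The sketch below is illustrative only: `find_audio_files()` is a hypothetical helper, the "recordings" directory and the extension list are assumptions, and the call to `hf_transcribe()` is guarded because it requires API access.

```r
# Hypothetical helper (not part of the package): list audio files in a
# directory by extension. The default extension set is illustrative.
find_audio_files <- function(dir, exts = c("flac", "wav", "mp3")) {
  # Build a pattern such as "\\.(flac|wav|mp3)$" and match it case-insensitively.
  pattern <- paste0("\\.(", paste(exts, collapse = "|"), ")$")
  list.files(dir, pattern = pattern, full.names = TRUE, ignore.case = TRUE)
}

if (FALSE) {
  # "recordings" is a hypothetical directory of audio files.
  files <- find_audio_files("recordings")
  result <- hf_transcribe(files)

  # `text` is one of the documented columns of the returned tibble.
  result$text
}
```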