Generate dense vector representations (embeddings) for text using transformer models. Embeddings are useful for measuring semantic similarity, for clustering, and as input features for ML models.

Usage

hf_embed(text, model = "BAAI/bge-small-en-v1.5", token = NULL, ...)

Arguments

text

Character vector of text(s) to embed.

model

Character string. Model ID from Hugging Face Hub. Default: "BAAI/bge-small-en-v1.5" (384-dim embeddings).

token

Character string or NULL. API token for authentication.

...

Additional arguments (currently unused).

Value

A tibble with columns: text, embedding (a list-column of numeric vectors), and n_dims.

Examples

if (FALSE) { # \dontrun{
# Generate embeddings
embeddings <- hf_embed(c("Hello world", "Goodbye world"))

# Access embedding vectors
embeddings$embedding[[1]]  # First embedding vector
} # }
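
One common next step is comparing texts by cosine similarity. The following is a minimal sketch of that, not part of the package: it assumes the result has the text, embedding, and n_dims columns described under Value, and it builds a hypothetical stand-in with made-up 3-dimensional vectors (a plain data frame rather than a tibble) so that it runs without calling the API. The names emb_mat, unit_mat, and cos_sim are illustrative only.

```r
# Hypothetical stand-in for the result of hf_embed(); the vectors are made up
# and only 3-dimensional so the example runs offline.
embeddings <- data.frame(text = c("Hello world", "Goodbye world"))
embeddings$embedding <- list(c(1, 0, 1), c(1, 1, 0))
embeddings$n_dims <- lengths(embeddings$embedding)

# Stack the list-column into a matrix with one row per text
emb_mat <- do.call(rbind, embeddings$embedding)

# Cosine similarity: scale each row to unit length, then take dot products
row_norms <- sqrt(rowSums(emb_mat^2))
unit_mat <- emb_mat / row_norms
cos_sim <- unit_mat %*% t(unit_mat)
dimnames(cos_sim) <- list(embeddings$text, embeddings$text)

cos_sim
```

With real output from hf_embed(), the same steps should apply as long as every vector in the embedding column has the same length, which can be checked through n_dims.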