Generate dense vector representations (embeddings) for text using transformer models.
Useful for semantic similarity, clustering, and as features for ML models.
Usage
hf_embed(text, model = "BAAI/bge-small-en-v1.5", token = NULL, ...)
Arguments
- text
Character vector of text(s) to embed.
- model
Character string. Model ID from the Hugging Face Hub.
Default: "BAAI/bge-small-en-v1.5" (384-dim embeddings).
- token
Character string or NULL. Hugging Face API token used for authentication. Default: NULL.
- ...
Additional arguments (currently unused).
Value
A tibble with columns text, embedding (a list-column of numeric vectors), and n_dims (the number of dimensions in each embedding).
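As a minimal sketch of working with this return shape for semantic similarity, the snippet below builds a small stand-in by hand and compares two embeddings. The stand-in is a base data frame rather than a tibble, and its 3-dimensional vectors are made up for illustration, not real model output. `cosine_sim` is a hypothetical helper and is not part of this package.

```r
# Stand-in for the result of hf_embed(); the values are made up.
# A real result with the default model would hold 384-dim vectors.
embeddings <- data.frame(
  text = c("Hello world", "Goodbye world"),
  n_dims = c(3L, 3L),
  stringsAsFactors = FALSE
)
embeddings$embedding <- list(c(1, 0, 1), c(1, 1, 0))

# Hypothetical helper: cosine similarity between two numeric vectors.
cosine_sim <- function(a, b) {
  sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

# Compare the first and second embedding vectors.
sim <- cosine_sim(embeddings$embedding[[1]], embeddings$embedding[[2]])
```

Because the embedding column is a list-column, individual vectors are pulled out with `[[`, the same way as in the Examples section.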
Examples
if (FALSE) { # \dontrun{
# Generate embeddings
embeddings <- hf_embed(c("Hello world", "Goodbye world"))
# Access embedding vectors
embeddings$embedding[[1]] # First embedding vector
} # }
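For clustering or as features for a model, the embedding list-column usually needs to become a numeric matrix first. The following is a sketch under stated assumptions: `embedding_col` stands in for `embeddings$embedding`, its 2-dimensional values are made up, and hierarchical clustering is just one possible choice of method.

```r
# Stand-in for embeddings$embedding; the values are made up.
embedding_col <- list(c(0, 0), c(0, 1), c(10, 10), c(10, 11))

# Stack the vectors row-wise: one row per text, one column per dimension.
emb_matrix <- do.call(rbind, embedding_col)

# Hierarchical clustering on Euclidean distances, cut into two groups.
clusters <- cutree(hclust(dist(emb_matrix)), k = 2)
```

The same `emb_matrix` can be passed to other base R routines that expect a numeric matrix, such as `kmeans()`.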