Create text embeddings using a Hugging Face model as part of a tidymodels recipe. This step converts text columns into embedding features for downstream modeling.
Arguments
- recipe
A recipe object.
- ...
One or more text column selectors (see recipes::selections()).
- role
Character string. Role for the new embedding variables. Default: "predictor".
- trained
Logical. Internal use only.
- model
Character string. Hugging Face model ID for embeddings. Default: "BAAI/bge-small-en-v1.5".
- token
Character string or NULL. API token for authentication.
- embeddings
List. Internal use only (stores embeddings during training).
- skip
Logical. Should the step be skipped when the recipe is baked? Default: FALSE.
- id
Character string. Unique ID for this step.
- x
A step_hf_embed object.
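Since token accepts either a character string or NULL, one way to avoid hard-coding a secret in a script is to read it from an environment variable. The following is a minimal sketch using base R only; the variable name HF_TOKEN is an assumption chosen for illustration, not something this step is documented to read on its own.

```r
# "HF_TOKEN" is an assumed variable name; use whichever name you set yourself
# (for example in ~/.Renviron).
raw_token <- Sys.getenv("HF_TOKEN", unset = "")

# Fall back to NULL when the variable is unset or empty, matching the
# documented "Character string or NULL" type of the `token` argument.
hf_token <- if (nzchar(raw_token)) raw_token else NULL

# The value can then be supplied to the step, e.g.:
# rec <- recipe(sentiment ~ text, data = train_data) |>
#   step_hf_embed(text, token = hf_token)
```

Keeping the token out of the source file means the recipe code can be shared or committed without exposing credentials.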
Examples
if (FALSE) { # \dontrun{
library(tidymodels)
library(dplyr)

# train_data is assumed to be a data frame with a character column `text`
# and a two-level factor column `sentiment`.

# Create a recipe with embeddings
rec <- recipe(sentiment ~ text, data = train_data) |>
  step_hf_embed(text, model = "BAAI/bge-small-en-v1.5")

# Use in a workflow
wf <- workflow() |>
  add_recipe(rec) |>
  add_model(logistic_reg()) |>
  fit(data = train_data)
} # }
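To make "converts text columns into embedding features" more concrete, the sketch below mimics the shape of the transformation in base R, without any network access. Everything in it is an assumption for illustration: toy_embed() is a hypothetical stand-in that is not a real embedding model and is not what step_hf_embed() calls, the text_emb_ column names are invented, and whether the real step drops the original text column is not stated in this page.

```r
# Hypothetical stand-in for an embedding model: maps each string to a
# fixed-length numeric vector. NOT a Hugging Face model.
toy_embed <- function(x, dim = 4) {
  embed_one <- function(s) {
    codes <- utf8ToInt(s)
    # Bucket i sums the character codes at positions i, i + dim, i + 2 * dim, ...
    vapply(seq_len(dim), function(i) {
      as.numeric(sum(codes[seq_along(codes) %% dim == i %% dim]))
    }, numeric(1))
  }
  # One row per input string, one column per embedding dimension
  t(vapply(x, embed_one, numeric(dim), USE.NAMES = FALSE))
}

# Hypothetical toy data with the same columns as the example above
train_data <- data.frame(
  text = c("great product", "terrible service", "would buy again"),
  sentiment = factor(c("pos", "neg", "pos")),
  stringsAsFactors = FALSE
)

emb <- toy_embed(train_data$text, dim = 4)
colnames(emb) <- paste0("text_emb_", seq_len(ncol(emb)))  # invented names

# Replace the raw text column with the numeric embedding columns
# (dropping the original column is an assumption of this sketch)
baked <- cbind(train_data[setdiff(names(train_data), "text")], as.data.frame(emb))
str(baked)
```

The point is only the shape: one text column becomes a fixed number of numeric columns per row, which a downstream model such as logistic_reg() can consume. A real embedding model would produce many more dimensions than the four used here.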