Predict the most likely words for a masked token (such as [MASK]) in text. Commonly used with BERT-style masked language models.

Usage

hf_fill_mask(
  text,
  model = "google-bert/bert-base-uncased",
  mask_token = "[MASK]",
  top_k = 5,
  token = NULL,
  ...
)

Arguments

text

Character vector of one or more texts, each containing the mask token ([MASK] by default).

model

Character string. Model ID from Hugging Face Hub. Default: "google-bert/bert-base-uncased".

mask_token

Character string. The mask token that appears in text. Default: "[MASK]". Some models use a different token, such as "<mask>".

top_k

Integer. Number of top predictions to return. Default: 5.

token

Character string or NULL. Hugging Face API token for authentication. Default: NULL.

...

Additional arguments (currently unused).
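
Because the mask token varies by model, it can help to write prompts with a neutral placeholder and swap in the right token before calling the function. The following is a minimal sketch in base R; `with_mask()` and the `"___"` placeholder are hypothetical names used for illustration and are not part of the package.

```r
# Hypothetical helper (not part of the package): replace a neutral
# placeholder with the mask token expected by a given model.
with_mask <- function(text, mask_token = "[MASK]", placeholder = "___") {
  # fixed = TRUE treats the placeholder as a literal string, not a regex
  sub(placeholder, mask_token, text, fixed = TRUE)
}

# Same prompt, prepared for a BERT-style and a RoBERTa-style mask token
bert_text    <- with_mask("The capital of France is ___.")
roberta_text <- with_mask("The capital of France is ___.", mask_token = "<mask>")
```

The resulting strings could then be passed as `text`, with the matching value supplied as `mask_token`.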

Value

A tibble with columns text, token, score, and filled (the complete text).
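
Since the result is an ordinary tibble, it can be post-processed with the usual tools. The sketch below keeps only the highest-scoring prediction per input text using dplyr. It assumes the documented column names; the `preds` tibble is a hand-built stand-in with made-up scores, not real model output.

```r
library(tibble)
library(dplyr)

# Stand-in for the return value of hf_fill_mask(); the tokens and scores
# here are invented purely to illustrate the column layout.
preds <- tibble(
  text   = rep("The capital of France is [MASK].", 3),
  token  = c("paris", "lyon", "lille"),
  score  = c(0.42, 0.07, 0.05),
  filled = paste0("The capital of France is ", c("paris", "lyon", "lille"), ".")
)

# Keep the single best-scoring row for each input text
best <- preds %>%
  group_by(text) %>%
  slice_max(score, n = 1, with_ties = FALSE) %>%
  ungroup()
```

Grouping by `text` matters when several inputs are passed at once, since each input contributes its own set of predictions.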

Examples

if (FALSE) { # \dontrun{
# Fill in the blank
hf_fill_mask("The capital of France is [MASK].")

# Return only the top 3 predictions
hf_fill_mask("Paris is the [MASK] of France.", top_k = 3)

# Use a model with a different mask token (RoBERTa models use <mask>)
hf_fill_mask(
  "The capital of France is <mask>.",
  model = "FacebookAI/roberta-base",
  mask_token = "<mask>"
)
} # }