If a model_id is provided, the prediction is made through the Hugging Face Inference API. To run predictions locally instead of through the Inference API, first download a model or pipeline with one of the hf_load_*_model() or hf_load_pipeline() functions and pass the result as model.

Usage

hf_inference(
  model,
  payload,
  flatten = TRUE,
  use_gpu = FALSE,
  use_cache = FALSE,
  wait_for_model = FALSE,
  use_auth_token = NULL,
  stop_on_error = FALSE
)

Arguments

model

Either a downloaded model or pipeline from the Hugging Face Hub (using hf_load_pipeline()), or a model_id. Run hf_search_models(...) for model_ids.

payload

The data to predict on. Use one of the hf_*_payload() functions to create.

flatten

Whether to flatten the results into a data frame. Defaults to TRUE (flatten the results).

use_gpu

API Only - Whether to use GPU for inference.

use_cache

API Only - Whether to use cached inference results for previously seen inputs.

wait_for_model

API Only - Whether to wait for the model to finish loading instead of receiving a 503 error while it is not yet ready.

use_auth_token

API Only - The token to use as HTTP bearer authorization for the Inference API. Defaults to the HUGGING_FACE_HUB_TOKEN environment variable.

stop_on_error

API Only - Whether to throw an error if an API error is encountered. Defaults to FALSE (do not throw an error).

Value

The results of the inference. If flatten = TRUE, the results are flattened into a data frame.
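
Examples

A minimal sketch of calling hf_inference() through the Inference API and against a locally downloaded pipeline. The model_id and the hf_text_classification_payload() helper shown here are illustrative assumptions; substitute a model_id returned by hf_search_models() and the hf_*_payload() constructor that matches your task. The exact arguments accepted by hf_load_pipeline() may also differ.

# Build a payload with one of the hf_*_payload() helpers
# (hf_text_classification_payload() is assumed here for illustration)
payload <- hf_text_classification_payload("I love this package!")

# Predict through the Inference API by passing a model_id
results <- hf_inference(
  model = "distilbert-base-uncased-finetuned-sst-2-english",  # example model_id
  payload = payload,
  wait_for_model = TRUE  # wait instead of receiving a 503 while the model loads
)

# Alternatively, download a pipeline and run the same prediction locally
pipeline <- hf_load_pipeline("distilbert-base-uncased-finetuned-sst-2-english")
results <- hf_inference(model = pipeline, payload = payload)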