
Hugging Face Inference Endpoints facts

tool_hugging_face_inference_endpoints

Platform

Hugging Face Inference Endpoints

Managed infrastructure for deploying Hugging Face models as dedicated inference endpoints.

last verified: 2026-06-10
pricing checked: 2026-05-18


Fact summary

Name: Hugging Face Inference Endpoints
Company / maker: Hugging Face
Category: Model & Infrastructure
Tags: Model Hosting, Inference, Deployment
Entity type: Platform
Availability: Public
Pricing model: Usage Based
Pricing unit: compute time
Primary pricing route: API Usage
Free plan: No
Delivery modes: Web App, API, SDK
Supported platforms: Web
Data state: source backed

Official links

Formal source rows with URL, source type, and freshness.

Link	URL	source type	checked
Official website	huggingface.co/docs/inference-endpoints/en/index	official-site	2026-06-10
Pricing page	huggingface.co/docs/inference-endpoints/en/pricing	official-pricing-page	2026-05-18
Documentation	huggingface.co/docs/inference-endpoints/en/index	official-documentation	2026-05-06
API docs	huggingface.co/docs/inference-endpoints/en/api_reference	official-documentation	2026-05-06
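Because each source row carries its own checked date, freshness can be evaluated per link rather than per record. The sketch below is one possible way to flag rows older than a threshold; the helper name `stale_rows`, the 30-day threshold, and the choice of 2026-06-10 as the reference date are illustrative assumptions, not part of the registry.

```python
from datetime import date

# Rows copied from the table above: (link label, source type, checked date).
SOURCE_ROWS = [
    ("Official website", "official-site", date(2026, 6, 10)),
    ("Pricing page", "official-pricing-page", date(2026, 5, 18)),
    ("Documentation", "official-documentation", date(2026, 5, 6)),
    ("API docs", "official-documentation", date(2026, 5, 6)),
]


def stale_rows(rows, as_of, max_age_days):
    """Return labels of rows whose checked date is older than max_age_days.

    Hypothetical helper: the registry does not define a staleness rule.
    """
    return [
        label
        for label, _source_type, checked in rows
        if (as_of - checked).days > max_age_days
    ]


# Assumed threshold of 30 days, measured from the record's last verified date.
stale = stale_rows(SOURCE_ROWS, as_of=date(2026, 6, 10), max_age_days=30)
print(stale)
```

With these assumed inputs, the two documentation rows (checked 35 days earlier) are flagged, while the website and pricing rows are not.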

Access and delivery

Supported platforms and delivery modes are intentionally separate.

Supported platforms

Web

Delivery modes

Web App, API, SDK
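One way to picture the separation between platforms and delivery modes is a record with two independent fields. This is a minimal sketch, not the registry's actual schema: the class name `AccessRecord`, its field names, and the `offers` helper are hypothetical; only the values come from this page.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class AccessRecord:
    """Hypothetical shape: where the tool runs vs. how it is delivered."""

    supported_platforms: Tuple[str, ...]
    delivery_modes: Tuple[str, ...]


# Values taken from this record.
record = AccessRecord(
    supported_platforms=("Web",),
    delivery_modes=("Web App", "API", "SDK"),
)


def offers(rec: AccessRecord, mode: str) -> bool:
    """True if the given delivery mode is listed for the record."""
    return mode in rec.delivery_modes


print(offers(record, "API"))
```

Keeping the fields separate means a query for "API" checks delivery modes only and never matches a platform name by accident.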

Capabilities

Capability tags

Model Hosting, Model APIs, Evaluation Monitoring

Input types

Structured Data, Function Call, Text, Image, Audio, Video

Output types

Structured JSON, Text, Image, Audio, Video, Embeddings, Document

Pricing summary

Derived from official pricing routes where available

pricingModel: Usage Based
pricingUnit: compute time
currency: USD
hasFreePlan: No
pricingLastChecked: 2026-05-18

Pricing routes

Each route is a first-class record with source URL and checked date.

Request a quote

routeType: enterprise-sales

The pricing page includes a request-a-quote path and notes quota requests for unavailable instance types.

Dedicated endpoint compute

routeType: api-usage

primary route

Dedicated endpoints are priced by selected instance type; the pricing page states hourly rates are shown and actual cost is calculated by the minute while deployed endpoints are initializing or running.
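The per-minute calculation described above can be sketched as follows. The function name `endpoint_cost` and the 0.50 USD/hour rate are hypothetical placeholders, not figures from the pricing page, and the sketch assumes billed minutes are already known as a whole number and that totals are rounded to cents.

```python
from decimal import Decimal, ROUND_HALF_UP


def endpoint_cost(hourly_rate_usd: str, billed_minutes: int) -> Decimal:
    """Cost in USD for a number of billed minutes at a given hourly rate.

    Billed minutes are the minutes an endpoint spent initializing or running.
    Rounding to cents is an assumption made for display purposes.
    """
    if billed_minutes < 0:
        raise ValueError("billed_minutes must be non-negative")
    # Multiply before dividing so exact rates stay exact in Decimal arithmetic.
    total = Decimal(hourly_rate_usd) * billed_minutes / Decimal(60)
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Hypothetical example: 0.50 USD/hour instance, deployed for 90 minutes.
cost = endpoint_cost("0.50", 90)
print(cost)
```

Passing the rate as a string keeps the arithmetic in `Decimal` and avoids binary floating-point error in currency values.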

Pricing plans

plan	track	usage limit	features	source	checked
Usage-based dedicated endpoints	api	Costs are calculated per minute from the selected instance hourly rate while endpoints are initializing and running.	Dedicated Inference Endpoints, Select instance type to deploy and scale models, CPU, GPU, and accelerator instance hourly pricing, Accessible to Hugging Face accounts with an active subscription and credit card on file		2026-05-18

Evidence and sources

source URL	source type	used for	checked
huggingface.co/docs/inference-endpoints/en/pricing	official-pricing-page	pricing-model, pricing-routes	2026-05-18
huggingface.co/docs/inference-endpoints/en/api_reference	official-documentation	api-access, delivery-modes, capability-claims, relationships, entity-boundary	2026-05-06
huggingface.co/docs/inference-endpoints/index	official-documentation	identity, capabilities, delivery-modes	2026-05-06

Freshness and recent changes

Facts verified

2026-06-10

Identity and core facts

Pricing checked

2026-05-18

Pricing routes and plans

Record updated

2026-05-06

Public record freshness

One-hop registry pages derived from this record's modeled facts. This section does not infer peer tools.

Related pages

Hugging Face

Model & Infrastructure

Usage Based

API Usage
