hf-mem

CLI to estimate the VRAM required to load Safetensors models from the Hugging Face Hub for inference (Transformers, Diffusers, and Sentence Transformers).

668 stars

13.4k downloads

Updated 2/4/2026

hf-mem

What it does

Estimates inference memory requirements for models on the Hugging Face Hub by reading Safetensors metadata through HTTP Range requests, so the weights themselves are not downloaded.
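To make the metadata idea concrete, here is a minimal sketch of how a Safetensors header can be turned into a weight-size estimate. It relies on the published Safetensors layout (an 8-byte little-endian length, followed by that many bytes of JSON describing each tensor's dtype and shape). The helper names (header_length, range_header, weight_bytes) and the reduced dtype table are illustrative assumptions, not hf-mem's actual implementation, and the example runs on synthetic bytes rather than a real Hub file.

```python
import json
import math
import struct

# Bytes per element for common Safetensors dtypes (illustrative subset).
DTYPE_BYTES = {
    "F64": 8, "I64": 8,
    "F32": 4, "I32": 4,
    "F16": 2, "BF16": 2, "I16": 2,
    "I8": 1, "U8": 1, "BOOL": 1,
}


def header_length(first_8_bytes: bytes) -> int:
    """Decode the little-endian u64 that prefixes every Safetensors file."""
    return struct.unpack("<Q", first_8_bytes)[0]


def range_header(start: int, length: int) -> dict:
    """Build an HTTP Range header covering `length` bytes from `start` (end is inclusive)."""
    return {"Range": f"bytes={start}-{start + length - 1}"}


def weight_bytes(header_json: bytes) -> int:
    """Sum dtype size times element count over every tensor in the header."""
    header = json.loads(header_json)
    total = 0
    for name, info in header.items():
        if name == "__metadata__":  # optional free-form metadata, not a tensor
            continue
        total += math.prod(info["shape"]) * DTYPE_BYTES[info["dtype"]]
    return total


# Synthetic file prefix standing in for the bytes two Range requests would return.
payload = json.dumps({
    "w": {"dtype": "F16", "shape": [4, 8], "data_offsets": [0, 64]},
    "b": {"dtype": "F32", "shape": [8], "data_offsets": [64, 96]},
}).encode("utf-8")
blob = struct.pack("<Q", len(payload)) + payload

n = header_length(blob[:8])          # first request: bytes 0-7
header = blob[8:8 + n]               # second request: bytes 8 to 8+n-1
print(range_header(0, 8), range_header(8, n))
print(weight_bytes(header), "bytes of weights")
```

Only the header is ever read, which is why an estimate of this kind needs a few kilobytes of transfer even for very large checkpoints.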

Requirements

uv package manager (for uvx command)
HF_TOKEN environment variable (only for gated/private models)

When to use

User asks about model VRAM/memory needs
User wants to check if a model fits in their GPU
User provides a Hugging Face model URL or model ID
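When the user supplies a URL rather than a model ID, the ID has to be extracted before it can be passed to --model-id. The sketch below is one hedged way to do that; to_model_id is a hypothetical helper, and it assumes the org/model-name form shown in the usage line, so single-segment legacy IDs and non-model URLs are out of scope.

```python
from urllib.parse import urlparse


def to_model_id(url_or_id: str) -> str:
    """Return 'org/model-name' from either a Hub URL or an ID that is already in that form."""
    text = url_or_id.strip()
    if "://" not in text:
        # Already looks like a plain ID; just trim stray slashes.
        return text.strip("/")
    # Keep the first two path segments and drop suffixes such as /tree/main.
    parts = [p for p in urlparse(text).path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"cannot find org/model-name in {url_or_id!r}")
    return "/".join(parts[:2])


print(to_model_id("https://huggingface.co/mistralai/Mistral-7B-v0.1/tree/main"))
print(to_model_id("black-forest-labs/FLUX.1-dev"))
```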

Usage

uvx hf-mem --model-id <org/model-name>

Add --experimental (available since hf-mem 0.4.3) to also include KV cache estimates for LLMs and VLMs.
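For intuition about why the KV cache matters, the following sketch applies the textbook sizing formula for a standard attention cache. It is not necessarily the calculation hf-mem performs, and the layer count, KV head count, head dimension, and sequence length are illustrative assumptions rather than values read from any model config.

```python
def kv_cache_bytes(
    num_layers: int,
    num_kv_heads: int,
    head_dim: int,
    seq_len: int,
    batch_size: int = 1,
    bytes_per_value: int = 2,  # 2 bytes for fp16/bf16 cache entries
) -> int:
    """Textbook KV cache size: one key and one value tensor per layer."""
    return (
        2  # keys and values
        * num_layers
        * num_kv_heads
        * head_dim
        * seq_len
        * batch_size
        * bytes_per_value
    )


# Hypothetical 32-layer model with 8 KV heads of dimension 128 and an 8192-token context.
size = kv_cache_bytes(num_layers=32, num_kv_heads=8, head_dim=128, seq_len=8192)
print(f"{size / 2**30:.2f} GiB")  # prints "1.00 GiB"
```

The cache grows linearly with both sequence length and batch size, so it can rival the weights themselves at long contexts.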

Examples

uvx hf-mem --model-id black-forest-labs/FLUX.1-dev
uvx hf-mem --model-id mistralai/Mistral-7B-v0.1 --experimental

When it fails

HTTP 401: the model is gated or private; set HF_TOKEN to a token with read access to it.
HTTP 404: the provided --model-id does not exist on the Hugging Face Hub.
RuntimeError: the repository contains none of model.safetensors, model.safetensors.index.json, or model_index.json.
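The failure cases above can be wrapped in a small helper when the CLI is called from a script. This is a sketch under stated assumptions: build_command, explain_failure, and run_hf_mem are hypothetical names, only the flags documented here are used, and matching on the substrings "401", "404", and "RuntimeError" in the captured error output is an assumption about how the failure is printed.

```python
import os
import shutil
import subprocess

# Hints follow the failure list above; keying on these substrings is an assumption.
FAILURE_HINTS = {
    "401": "Model is gated or private: set HF_TOKEN to a token with read access.",
    "404": "Model ID not found on the Hugging Face Hub: check the --model-id value.",
    "RuntimeError": "No model.safetensors, model.safetensors.index.json, "
                    "or model_index.json found in the repository.",
}


def build_command(model_id: str, experimental: bool = False) -> list:
    """Assemble the uvx invocation using only the documented flags."""
    cmd = ["uvx", "hf-mem", "--model-id", model_id]
    if experimental:
        cmd.append("--experimental")
    return cmd


def explain_failure(stderr: str) -> str:
    """Map captured error text to a hint; falls back to a generic message."""
    for marker, hint in FAILURE_HINTS.items():
        if marker in stderr:
            return hint
    return "Unrecognized failure; inspect the full error output."


def run_hf_mem(model_id: str, experimental: bool = False) -> str:
    """Run hf-mem and return its stdout, raising with a hint on failure."""
    if shutil.which("uvx") is None:
        raise FileNotFoundError("uvx not found on PATH; install uv first")
    result = subprocess.run(
        build_command(model_id, experimental),
        capture_output=True,
        text=True,
        env=os.environ.copy(),  # passes HF_TOKEN through if it is set
    )
    if result.returncode != 0:
        raise RuntimeError(explain_failure(result.stderr))
    return result.stdout
```

A call such as run_hf_mem("mistralai/Mistral-7B-v0.1", experimental=True) would then either return the report text or raise with one of the hints.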

Install

Requires askill CLI v1.0+

AI Quality Score

95/100, analyzed 2/6/2026

An excellent skill document for a VRAM estimation tool. It provides clear triggers, installation-free usage via uvx, and detailed error handling scenarios.


Metadata

License: unknown

Version: -

Updated: 2/4/2026

Publisher: alvarobartt

Tags

api