askill
mediapipe-usage

Provides guidance for Google MediaPipe Pose Landmarker on web using @mediapipe/tasks-vision. Covers setup, landmark indices, running modes, and real-time video patterns. Use when working with MediaPipe, pose detection, body landmarks, or @mediapipe/tasks-vision.

Updated 2/17/2026

SKILL.md

Google MediaPipe Usage (Web / Pose Landmarker)

Quick Start

  1. Install @mediapipe/tasks-vision and resolve the WASM files from a CDN.
  2. Create a PoseLandmarker with createFromOptions.
  3. Use detect() for a single image, or detectForVideo() in a throttled requestAnimationFrame loop.

Setup

Install (prefer pnpm):

pnpm add @mediapipe/tasks-vision

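WASM root: Point FilesetResolver at the package's wasm directory on a CDN before creating the task. In production, consider pinning a specific version instead of @latest so the WASM files match the installed package:

import { FilesetResolver } from "@mediapipe/tasks-vision";

const vision = await FilesetResolver.forVisionTasks(
  "https://cdn.jsdelivr.net/npm/@mediapipe/tasks-vision@latest/wasm",
);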

Create the Pose Landmarker Task

Use PoseLandmarker.createFromOptions(vision, options):

import { PoseLandmarker, FilesetResolver } from "@mediapipe/tasks-vision";

const vision = await FilesetResolver.forVisionTasks(
  "https://cdn.jsdelivr.net/npm/@mediapipe/tasks-vision@latest/wasm",
);

const poseLandmarker = await PoseLandmarker.createFromOptions(vision, {
  baseOptions: {
    modelAssetPath: modelUrl, // see reference.md for lite/full/heavy URLs
    delegate: "GPU", // preferred; may fall back to CPU in some environments
  },
  runningMode: "VIDEO", // or "IMAGE" for single image
  numPoses: 1,
  minPoseDetectionConfidence: 0.5,
  minPosePresenceConfidence: 0.5,
  minTrackingConfidence: 0.5,
});
  • runningMode: IMAGE for single image → use detect(image). VIDEO for stream → use detectForVideo(video, timestamp).
  • baseOptions.modelAssetPath: URL to a .task model (lite / full / heavy). See reference.md for URLs.
  • delegate: "GPU" preferred; some environments fall back to CPU.
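Because the fallback to CPU is not guaranteed everywhere, one option is to retry creation explicitly. The following is a minimal sketch, not part of the library: createWithDelegateFallback and the createTask callback are hypothetical names, and it assumes that task creation rejects when a delegate cannot be used.

```javascript
// Hypothetical helper: try each delegate in order until task creation succeeds.
// `createTask` is an assumed caller-supplied async function, for example:
//   (delegate) => PoseLandmarker.createFromOptions(vision, {
//     baseOptions: { modelAssetPath: modelUrl, delegate },
//     runningMode: "VIDEO",
//   })
async function createWithDelegateFallback(createTask, delegates = ["GPU", "CPU"]) {
  let lastError;
  for (const delegate of delegates) {
    try {
      const task = await createTask(delegate);
      return { task, delegate }; // report which delegate was actually used
    } catch (err) {
      lastError = err; // remember the failure and try the next delegate
    }
  }
  throw lastError ?? new Error("No delegates were provided");
}
```

Returning the delegate alongside the task lets the UI show which backend is active.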

Run the Task

Single image (runningMode IMAGE):

const result = poseLandmarker.detect(imageElement);

Video / webcam (runningMode VIDEO):

Call detectForVideo(video, timestamp) inside a requestAnimationFrame loop. Throttle by time (e.g. ~33 ms between frames) to avoid excessive work:

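let lastFrameTime = 0; // time of the last processed frame (ms)
function detectLoop() {
  const now = performance.now();
  // readyState >= 2 (HAVE_CURRENT_DATA): a video frame is available
  if (video.readyState >= 2 && now - lastFrameTime > 33) {
    lastFrameTime = now;
    // timestamps passed to detectForVideo must increase on every call
    const result = poseLandmarker.detectForVideo(video, now);
    if (result.landmarks?.length) {
      const landmarks = result.landmarks[0]; // first person
      // use landmarks
    }
  }
  requestAnimationFrame(detectLoop); // schedule the next check
}
requestAnimationFrame(detectLoop);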

Result Shape

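  • result.landmarks: Array of poses; each pose is NormalizedLandmark[] (33 points). Each landmark has x and y (normalized to 0–1 by image width and height), z (depth relative to the hip midpoint, on roughly the same scale as x; smaller values are closer to the camera), and visibility (0–1).
  • result.worldLandmarks: Optional 3D coordinates in meters, with the origin at the hip midpoint (same indices).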
  • Single person: use result.landmarks[0].
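Since x and y are normalized, drawing on a canvas usually requires scaling them by the frame size. A minimal sketch of that conversion follows; toPixel and poseToPixels are hypothetical helper names, not library functions.

```javascript
// Convert one normalized landmark to pixel coordinates for a frame of the given size.
function toPixel(landmark, width, height) {
  return { x: landmark.x * width, y: landmark.y * height };
}

// Map a whole pose (array of normalized landmarks) to pixel space.
function poseToPixels(landmarks, width, height) {
  return landmarks.map((lm) => toPixel(lm, width, height));
}
```

With a video element, this could be called as poseToPixels(result.landmarks[0], video.videoWidth, video.videoHeight).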

Practical Patterns (Know-how)

  • State machine: idle → loading (load model) → ready (can start) → active (webcam + detection) → error. When switching model variant, close the old PoseLandmarker instance and create a new one.
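  • Throttle: Run detectForVideo only when performance.now() - lastFrameTime > 33 (at most about 30 fps). detectForVideo is synchronous, so each call blocks the main thread until inference finishes; throttling limits how often that happens.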
  • Smoothing: Apply a smoothing factor (e.g. 0.3) to derived values (pitch, bank) to reduce jitter; use a dead zone (in degrees) to ignore small movements.
  • Confidence: Use each landmark’s visibility; ignore or downweight points below a threshold. Helper: getLandmark(landmarks, index, minConfidence) returning the point only if visibility >= minConfidence.
  • Gestures: e.g. “hands forward” = compare shoulder vs wrist z; “hands overhead” = compare wrist y to shoulder y. Use consecutive-frame counters for toggles (e.g. require N frames in pose before firing an action).
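The confidence, smoothing, and gesture patterns above can be sketched as plain functions over a landmark array. This is an illustrative sketch under assumptions: getLandmark follows the signature named above, while smooth, handsOverhead, and createFrameCounter are hypothetical names, and the default thresholds are placeholders to tune. The indices (11/12 for shoulders, 15/16 for wrists) are those of the 33-point pose model.

```javascript
// Landmark indices in the 33-point pose model.
const LEFT_SHOULDER = 11, RIGHT_SHOULDER = 12, LEFT_WRIST = 15, RIGHT_WRIST = 16;

// Return the landmark only if it exists and is visible enough; otherwise null.
function getLandmark(landmarks, index, minConfidence = 0.5) {
  const lm = landmarks?.[index];
  return lm && (lm.visibility ?? 0) >= minConfidence ? lm : null;
}

// Exponential smoothing with a dead zone: changes smaller than deadZone are ignored.
function smooth(previous, next, factor = 0.3, deadZone = 0) {
  if (Math.abs(next - previous) < deadZone) return previous;
  return previous + factor * (next - previous);
}

// "Hands overhead": each wrist is above its same-side shoulder.
// Image y grows downward, so "above" means a smaller y.
function handsOverhead(landmarks, minConfidence = 0.5) {
  const pts = [LEFT_SHOULDER, RIGHT_SHOULDER, LEFT_WRIST, RIGHT_WRIST]
    .map((i) => getLandmark(landmarks, i, minConfidence));
  if (pts.some((p) => p === null)) return false; // not confident enough to decide
  const [ls, rs, lw, rw] = pts;
  return lw.y < ls.y && rw.y < rs.y;
}

// Returns true exactly once, on the frame where the condition has held
// for requiredFrames consecutive frames; any miss resets the count.
function createFrameCounter(requiredFrames) {
  let count = 0;
  return function update(conditionMet) {
    count = conditionMet ? count + 1 : 0;
    return count === requiredFrames;
  };
}
```

Inside the detection loop, a toggle could then be driven by something like overheadCounter(handsOverhead(landmarks)), where overheadCounter was created once with createFrameCounter(10).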

Cleanup

  • Stop webcam: stream.getTracks().forEach(t => t.stop()).
  • Release task: poseLandmarker.close() when done or before creating a new instance.
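The cleanup steps, together with the rule of closing the old instance before creating a new one, can be kept in one place. The sketch below is one possible arrangement, not a prescribed API: createPoseSession is a hypothetical name, and createTask is an assumed async factory that wraps PoseLandmarker.createFromOptions for a given model URL.

```javascript
// Hypothetical holder for the active task and webcam stream.
// `createTask(modelUrl)` is an assumed async factory that returns an object
// with a close() method, such as a PoseLandmarker instance.
function createPoseSession(createTask) {
  let task = null;
  let stream = null;

  return {
    // Close the previous task (if any) before creating one for the new model.
    async switchModel(modelUrl) {
      task?.close();
      task = null;
      task = await createTask(modelUrl);
      return task;
    },
    // Remember the MediaStream so dispose() can stop it later.
    setStream(newStream) {
      stream = newStream;
    },
    // Stop every webcam track and release the task; safe to call twice.
    dispose() {
      stream?.getTracks().forEach((t) => t.stop());
      stream = null;
      task?.close();
      task = null;
    },
  };
}
```

Clearing the references after closing keeps dispose() idempotent, which helps when it is called from both an error handler and a component teardown.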

Additional Resources

Metadata

License: unknown
Version: -
Updated: 2/17/2026
Publisher: liuchiawei
