linux-perf


Linux perf profiler skill for CPU performance analysis. Use when collecting sampling profiles with perf record, generating perf report, measuring hardware counters (cache misses, branch mispredicts, IPC), identifying hot functions, or feeding perf data into flamegraph tools. Activates on queries about perf, Linux performance counters, PMU events, off-CPU profiling, perf stat, perf annotate, or sampling-based profiling on Linux.

Updated 2/20/2026

Linux perf

Purpose

Guide agents through perf for CPU profiling: sampling, hardware counter measurement, hotspot identification, and integration with flamegraph generation.

Triggers

  • "Which function is consuming the most CPU?"
  • "How do I measure cache misses / IPC?"
  • "How do I use perf to find hotspots?"
  • "How do I generate a flamegraph from perf data?"
  • "perf shows [unknown] or [kernel] frames"

Workflow

1. Prerequisites

# Install
sudo apt install linux-perf    # Debian (version-matched)
sudo apt install linux-tools-common linux-tools-$(uname -r)    # Ubuntu: perf ships in linux-tools
sudo dnf install perf          # Fedora/RHEL

# Check permissions
# Unprivileged use is governed by perf_event_paranoid; kernel profiling needs root or a level ≤ 1
cat /proc/sys/kernel/perf_event_paranoid
# 2 = user-space profiling only (no kernel), 1 = user+kernel for own processes,
# 0 = also system-wide CPU events, -1 = no restrictions

# Temporarily lower (session only)
sudo sysctl -w kernel.perf_event_paranoid=1

# Persistent
echo 'kernel.perf_event_paranoid=1' | sudo tee /etc/sysctl.d/99-perf.conf
sudo sysctl -p /etc/sysctl.d/99-perf.conf
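The level meanings above can be turned into a small lookup helper. This is a minimal sketch: the function name describe_paranoid is hypothetical, and the wording of each level is a paraphrase of the kernel's documented behaviour, not output from perf itself. Some distributions add stricter levels above 2, which the fallback branch covers.

```shell
# describe_paranoid LEVEL: print what an unprivileged user may do at that level.
# Hypothetical helper for illustration; it only maps a number to a description.
describe_paranoid() {
  case "$1" in
    -1) echo "no restrictions" ;;
    0)  echo "user+kernel profiling and system-wide CPU events" ;;
    1)  echo "user+kernel profiling of own processes" ;;
    2)  echo "user-space profiling of own processes only" ;;
    *)  echo "unrecognized or stricter distribution-specific level" ;;
  esac
}

# Usage on a Linux host:
#   describe_paranoid "$(cat /proc/sys/kernel/perf_event_paranoid)"
```

If the helper reports the last case, running perf with sudo is usually the simplest way forward.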

Compile the target with debug symbols for useful frame data:

gcc -g -O2 -fno-omit-frame-pointer -o prog main.c
# -fno-omit-frame-pointer: essential for frame-pointer-based unwinding
# Alternative: compile with DWARF CFI and use --call-graph=dwarf

2. perf stat — quick counters

# Basic hardware counters
perf stat ./prog

# With specific events
perf stat -e cache-misses,cache-references,instructions,cycles,branch-misses ./prog

# Repeat the run 5 times and report the mean and standard deviation of each counter
perf stat -r 5 ./prog

# Attach to existing process
perf stat -p 12345 sleep 10

Interpret perf stat output:

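  • IPC (instructions / cycles) < 1.0: often memory-bound or a stalled pipeline; treat this as a rough rule of thumb, since achievable IPC varies by CPU and workload
  • cache-miss rate (cache-misses / cache-references) > 5%: likely significant cache pressure
  • branch-miss rate (branch-misses / branch-instructions) > 5%: branch predictor struggling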
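The ratios above can be computed from machine-readable perf stat output. The sketch below assumes the CSV layout produced by perf stat -x, (counter value in field 1, event name in field 3) and plain event names without PMU prefixes; the function name summarize_perf_stat is hypothetical.

```shell
# summarize_perf_stat: read `perf stat -x,` CSV on stdin and print derived ratios.
# Assumed CSV fields: value,unit,event,... (lines without a numeric value are skipped).
summarize_perf_stat() {
  awk -F, '
    $1 ~ /^[0-9]+$/ {
      name = $3
      sub(/:.*/, "", name)            # drop modifiers such as :u
      count[name] = $1
    }
    END {
      if (count["cycles"] > 0)
        printf "IPC %.2f\n", count["instructions"] / count["cycles"]
      if (count["cache-references"] > 0)
        printf "cache-miss-rate %.2f%%\n", 100 * count["cache-misses"] / count["cache-references"]
      if (count["branch-instructions"] > 0)
        printf "branch-miss-rate %.2f%%\n", 100 * count["branch-misses"] / count["branch-instructions"]
    }'
}

# Usage on a Linux host (perf must be installed):
#   perf stat -x, -o stat.csv -e instructions,cycles,cache-misses,cache-references ./prog
#   summarize_perf_stat < stat.csv
```

A ratio is printed only when its denominator was actually counted, so unsupported events simply drop out of the summary.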

3. perf record — sampling

# Default: sample the cycles event (4000 Hz in current perf versions; see man perf-record)
perf record -g ./prog

# Specify frequency
perf record -F 999 -g ./prog

# Specific event
perf record -e cache-misses -g ./prog

# Attach to running process
perf record -F 999 -g -p 12345 sleep 30

# Off-CPU profiling (time spent waiting)
perf record -e sched:sched_switch -ag sleep 10

# DWARF call graphs (better for binaries without frame pointers)
perf record -F 999 --call-graph=dwarf ./prog

# Save to named file
perf record -o myapp.perf.data -g ./prog
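The kernel caps the sampling rate at the value of the kernel.perf_event_max_sample_rate sysctl, so a large -F value may not be honoured. The following sketch picks the lower of a requested rate and that cap; clamp_freq is a hypothetical helper name, and the kernel may also lower the cap on its own while profiling runs.

```shell
# clamp_freq REQUESTED MAX: print the lower of the two sampling rates (Hz).
# Hypothetical helper for illustration; both arguments must be integers.
clamp_freq() {
  if [ "$1" -gt "$2" ]; then
    echo "$2"
  else
    echo "$1"
  fi
}

# Usage on a Linux host (perf must be installed):
#   max="$(cat /proc/sys/kernel/perf_event_max_sample_rate)"
#   perf record -F "$(clamp_freq 9999 "$max")" -g ./prog
```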

4. perf report — interactive analysis

perf report                          # reads perf.data
perf report -i myapp.perf.data
perf report --no-children            # self time only (not cumulative)
perf report --sort comm,dso,sym      # sort by fields
perf report --stdio                  # non-interactive text output
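The --stdio output is easy to post-process. As a minimal sketch, the filter below keeps entries at or above an overhead threshold. It assumes the usual layout of perf report --stdio --no-children, with the overhead percentage in the first column and the symbol name last; the function name hot_symbols is hypothetical, and symbol names containing spaces (for example demangled C++ signatures) would be cut short.

```shell
# hot_symbols [MIN_PERCENT]: read `perf report --stdio` text on stdin and print
# "overhead symbol" for entries at or above MIN_PERCENT (default 5).
hot_symbols() {
  awk -v min="${1:-5}" '
    /^#/ || NF == 0 { next }          # skip header comments and blank lines
    $1 ~ /%$/ {                       # data rows start with an overhead percentage
      pct = $1
      sub(/%/, "", pct)
      if (pct + 0 >= min) print $1, $NF
    }'
}

# Usage on a Linux host (perf must be installed):
#   perf report --stdio --no-children | hot_symbols 5
```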

Navigation in TUI:

  • Enter — expand a symbol
  • a — annotate (show assembly with hit counts)
  • s — in the annotate view, toggle source lines (needs debug info)
  • d — filter by DSO (library)
  • t — filter by thread
  • ? — help

5. perf annotate — hot instructions

# Show assembly with hit percentages
perf annotate sym_name

# From report: press 'a' on a symbol
# Or directly:
perf annotate -i perf.data --symbol=hot_function --stdio

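A high hit count on a mov or vmovdqa suggests a cache miss at that load. Because of sampling skid, the stall may actually belong to the instruction just before the one shown as hot.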

6. perf top — live profiling

# Live top, like 'top' but for functions
sudo perf top -g

# Filter by process
sudo perf top -p 12345

7. Feed into flamegraphs

# Generate perf script output
perf script > out.perf

# Use Brendan Gregg's FlameGraph tools
git clone https://github.com/brendangregg/FlameGraph
./FlameGraph/stackcollapse-perf.pl out.perf > out.folded
./FlameGraph/flamegraph.pl out.folded > flamegraph.svg

# Open flamegraph.svg in browser

See skills/profilers/flamegraphs for reading flamegraphs and interpreting results.
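The folded file is plain text, one line per unique stack: frames joined by semicolons, followed by a space and a sample count. As a rough sketch of what can be done with it before rendering, the function below sums samples by leaf frame, which approximates self time per function. The name top_leaf_frames and the sample stacks in the usage note are illustrative only.

```shell
# top_leaf_frames: read folded stacks ("a;b;c COUNT") on stdin and print
# "COUNT leaf" for each leaf frame, highest count first.
top_leaf_frames() {
  awk '
    {
      n = $NF                          # sample count is the last field
      stack = $0
      sub(/ [0-9]+$/, "", stack)       # strip the trailing count
      k = split(stack, frames, ";")    # the last frame is the leaf
      self[frames[k]] += n
    }
    END { for (f in self) print self[f], f }' | sort -rn
}

# Usage:
#   top_leaf_frames < out.folded | head
```

For example, the stacks "prog;main;parse 30", "prog;main;parse;memcpy 50" and "prog;main;emit;memcpy 20" yield 70 samples for memcpy and 30 for parse.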

8. Common issues

| Problem | Cause | Fix |
|---|---|---|
| Permission denied | perf_event_paranoid too high | Lower paranoid level or run with sudo |
| [unknown] frames | Missing frame pointers or debug info | Recompile with -fno-omit-frame-pointer or use --call-graph=dwarf |
| [kernel] everywhere | Kernel symbols not visible | Use sudo perf record; install linux-image-$(uname -r)-dbgsym |
| No kallsyms | Kernel symbols unavailable | `echo 0 …` |
| Empty report for short program | Program exits too fast | Use -F 9999 or instrument longer workload |
| DWARF unwinding slow | Large DWARF stack | Limit with --call-graph dwarf,512 |

9. Useful events

# List all available events
perf list

# Common hardware events
cycles
instructions
cache-references
cache-misses
branch-instructions
branch-misses
stalled-cycles-frontend
stalled-cycles-backend

# Software events
context-switches
cpu-migrations
page-faults

# Tracepoints (requires root)
sched:sched_switch
syscalls:sys_enter_read

For a counter reference and interpretation guide, see references/events.md.

Related skills

  • Use skills/profilers/flamegraphs for SVG flamegraph generation and reading
  • Use skills/profilers/valgrind for cache simulation and memory profiling
  • Use skills/compilers/gcc or skills/compilers/clang for PGO from perf data (AutoFDO)

Metadata

License: unknown
Version: -
Updated: 2/20/2026
Publisher: mohitmishra786

Tags

ci-cd, github, github-actions