Samply delivers command-line CPU profiling across macOS, Linux, and Windows. Run samply record ./your-app args, and it captures a profile, then opens it directly in profiler.firefox.com. No IDE plugins, no complex setup: just stacks, flame graphs, and source views in a browser. This matters because CPU bottlenecks kill performance in high-stakes apps like trading bots or crypto miners, and samply reduces getting a first profile to a single command.
By default, samply samples at 1000 Hz (one sample per millisecond) and captures a stack trace per thread. macOS and Windows collect both on-CPU and off-CPU samples, revealing lock contention or I/O waits. Linux collects only on-CPU samples via perf events, so off-CPU time is invisible there for now. Profiles stay local, on disk and in RAM, unless you choose to upload them to share.firefox.dev. Example: profiling Mozilla’s dump_syms on macOS shows exact functions, line-level samples, and call trees. Double-click a function, and source code pops up.
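The rate-to-interval arithmetic above, plus a record-now, view-later workflow, can be sketched as follows. This is a minimal sketch under assumptions: the --rate, --save-only, and -o flags and the samply load subcommand are taken from my reading of samply record --help, so verify them against your installed version. The helper interval_us is hypothetical, and ./your-app is the placeholder binary from the text.

```shell
#!/bin/sh
# Hypothetical helper: convert a sampling rate in Hz to the gap
# between samples, in microseconds.
interval_us() {
    echo $(( 1000000 / $1 ))
}

RATE=1000                  # the 1000 Hz rate mentioned above
APP=${APP:-./your-app}     # placeholder binary name

echo "sampling every $(interval_us "$RATE") us"

# Only record if samply is installed and the target exists.
# Flags are assumptions; check `samply record --help`.
if command -v samply >/dev/null 2>&1 && [ -x "$APP" ]; then
    # Write the profile to disk without opening the browser.
    samply record --rate "$RATE" --save-only -o profile.json "$APP"
    # Later, open the saved profile in the Firefox Profiler UI:
    # samply load profile.json
fi
```

Raising the rate shortens the interval (4000 Hz gives 250 us) at the cost of more overhead and larger profiles.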
Platform-Specific Setup
On Linux, perf demands privileges. Temporarily loosen with
echo -1 | sudo tee /proc/sys/kernel/perf_event_paranoid
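for full access, or write 1 instead to allow unprivileged sampling, which is the less permissive choice. A reboot resets the value; to make it permanent, set kernel.perf_event_paranoid in /etc/sysctl.conf. Without this, sampling fails, so test it first.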
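Before changing anything, it helps to see what the current setting allows. The sketch below maps a perf_event_paranoid value to a rough description, based on my understanding of the kernel's documented levels. The function name describe_paranoid is hypothetical, and values of 3 and above are a distribution-specific extension (for example on Debian and Ubuntu), not part of the mainline kernel.

```shell
#!/bin/sh
# Hypothetical helper: describe what a perf_event_paranoid level
# permits for unprivileged users (approximate summary).
describe_paranoid() {
    if [ "$1" -le -1 ]; then
        echo "unrestricted"
    elif [ "$1" -le 1 ]; then
        echo "user and kernel sampling allowed"
    elif [ "$1" -eq 2 ]; then
        echo "user-space sampling only"
    else
        # Distribution-specific levels (3 and up).
        echo "perf events blocked for unprivileged users"
    fi
}

PARANOID_FILE=/proc/sys/kernel/perf_event_paranoid
if [ -r "$PARANOID_FILE" ]; then
    current=$(cat "$PARANOID_FILE")
    echo "perf_event_paranoid=$current: $(describe_paranoid "$current")"
fi

# To persist a value across reboots, add this line to /etc/sysctl.conf
# and reload with `sudo sysctl -p`:
#   kernel.perf_event_paranoid = 1
```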
Windows shines for system-wide profiling: samply record -a hits all processes. Symbols are key for readability. Chain servers like Microsoft’s (https://msdl.microsoft.com/download/symbols), Mozilla’s Breakpad (https://symbols.mozilla.org/try/), and Chromium’s. Full command:
samply record -a --windows-symbol-server https://msdl.microsoft.com/download/symbols --breakpad-symbol-server https://symbols.mozilla.org/try/ --windows-symbol-server https://chromium-browser-symsrv.commondatastorage.googleapis.com
Without symbols, stacks read like gibberish.
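Because each symbol server is passed as its own repeated flag, the command gets long quickly. One way to keep it manageable is to hold the servers in a list and generate the flags. This is only a sketch, assuming a POSIX shell such as Git Bash on Windows; the symbol_args helper and its "windows"/"breakpad" keywords are made up for illustration, while the flags and URLs are the ones from the command above.

```shell
#!/bin/sh
# Hypothetical helper: read "kind url" pairs from stdin and print
# the matching samply flags, each preceded by a space.
symbol_args() {
    while read -r kind url; do
        case $kind in
            windows)  printf ' %s %s' "--windows-symbol-server" "$url" ;;
            breakpad) printf ' %s %s' "--breakpad-symbol-server" "$url" ;;
            *) ;;  # ignore blank or unknown lines
        esac
    done
}

ARGS=$(symbol_args <<'EOF'
windows https://msdl.microsoft.com/download/symbols
breakpad https://symbols.mozilla.org/try/
windows https://chromium-browser-symsrv.commondatastorage.googleapis.com
EOF
)

# Print the assembled command for inspection rather than running it.
echo "samply record -a$ARGS"
```

Printing the command instead of executing it keeps the sketch safe to run anywhere; paste the output into a Windows shell to use it.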
macOS runs smoothly out of the box, leveraging native sampling. On all platforms samply launches your app as a subprocess, so you don't need to rebuild or instrument it.
Installation and Reliability
Grab prebuilts via scripts—fastest for testing:
# macOS/Linux
curl --proto '=https' --tlsv1.2 -LsSf https://github.com/mstange/samply/releases/download/samply-v0.13.1/samply-installer.sh | sh
# Windows
powershell -ExecutionPolicy Bypass -c "irm https://github.com/mstange/samply/releases/download/samply-v0.13.1/samply-installer.ps1 | iex"
Rust users: cargo install --locked samply. Or clone https://github.com/mstange/samply and run cargo build --release. Version 0.13.1 is current at the time of writing; check GitHub for updates. It’s a single binary with no dependencies beyond the kernel’s perf events interface on Linux.
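If the program you are profiling is itself written in Rust, note that Cargo's release profile leaves out debug info by default, which limits line-level and inlined-frame detail. A common workaround is a dedicated profile that inherits from release and turns debug info on. The sketch below appends such a profile to a manifest. The function name add_profiling_profile and the binary name your-app are placeholders, and the profile name "profiling" is a convention, not a requirement.

```shell
#!/bin/sh
# Hypothetical helper: append a [profile.profiling] section to a
# Cargo manifest unless one is already present.
add_profiling_profile() {
    manifest=$1
    if grep -q '^\[profile\.profiling\]' "$manifest" 2>/dev/null; then
        return 0
    fi
    cat >> "$manifest" <<'EOF'

[profile.profiling]
inherits = "release"
debug = true
EOF
}

# Only act when run inside a Cargo project with cargo available.
if [ -f Cargo.toml ] && command -v cargo >/dev/null 2>&1; then
    add_profiling_profile Cargo.toml
    cargo build --profile profiling
    # The binary lands under target/profiling/; then, for example:
    # samply record ./target/profiling/your-app
fi
```

Keeping this separate from the release profile means shipping builds stay unchanged while profiling builds carry the extra symbols.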
Skeptical take: samply isn’t revolutionary. It’s a thin wrapper on proven tech (perf, Apple’s sampler, ETW). But it unifies them under Firefox Profiler, which crushes clunky natives like Instruments or VTune in usability. Flame graphs beat tables; shareable links beat screenshots. Limitations? The lack of off-CPU samples on Linux hurts workloads that spend their time blocked on locks or I/O rather than burning CPU. Symbol hunting adds friction on Windows. Still, for user-space CLI apps (scripts, servers, crypto tools) it’s a no-brainer. In finance/crypto, where 1ms latency swings profits, quick profiling spots hot spots before they cost you.
Why this matters now: As Rust grows in perf-critical code (e.g., Solana validators, DeFi engines), samply fits naturally, since it installs with a single cargo install. Pair with flamegraph tools or cargo-flamegraph for deeper dives. Test it on your hot loop; the first profile often reveals easy wins hiding in allocations or syscalls. It is open source, dual-licensed under MIT and Apache 2.0, and maintained by Mozilla’s Markus Stange (mstange on GitHub), so adopting it is low risk.