Gesture Instrument

Play a synth with your hands. Webcam hand-tracking turns 21 points per hand into pitch and volume — two hands, two voices. A browser-playable version is landing soon.

MediaPipe · Web Audio · WebGL

The idea

An instrument with no keys — you play it with your hands in the air. The webcam tracks your hands and turns movement into sound: lift a hand to raise the pitch, open it to get louder, pinch your fingers to mute.

Pitch is locked to an A-minor pentatonic scale, so there are no wrong notes — you can play melodies just by moving. Two hands give you two independent voices.

It’s the body-as-controller half of a bigger idea: if the AI is the luthier that builds the instrument, the body is the player. The browser version runs entirely on-device; a deeper desktop prototype explores extra timbres and tempo detection.

How it works

MediaPipe hand-tracking

21 landmarks per hand, tracked in-browser on the GPU — the neon skeleton you see is the live data driving the sound.

Pitch from height

Vertical hand position maps to an A-minor pentatonic scale, so every position lands on a musical note.
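A minimal sketch of how a height-to-scale mapping like this could work, not the project's actual code. It assumes a normalized vertical coordinate where 0 is the top of the frame and 1 is the bottom, and an illustrative two-octave range starting at A3. The names heightToMidi, midiToHz, BASE_MIDI and OCTAVES are hypothetical.

```typescript
// A-minor pentatonic as semitone offsets from A: A, C, D, E, G.
const PENTATONIC_OFFSETS = [0, 3, 5, 7, 10];

// Assumed range for illustration: two octaves upward from A3 (MIDI note 57).
const BASE_MIDI = 57;
const OCTAVES = 2;

/** Convert a MIDI note number to a frequency in Hz (A4 = MIDI 69 = 440 Hz). */
function midiToHz(midi: number): number {
  return 440 * Math.pow(2, (midi - 69) / 12);
}

/**
 * Snap a normalized vertical position to the nearest scale note.
 * y is assumed to use image coordinates: 0 = top of frame, 1 = bottom.
 */
function heightToMidi(y: number): number {
  const clamped = Math.min(1, Math.max(0, y));
  const height = 1 - clamped; // invert so a raised hand gives a higher pitch
  const steps = PENTATONIC_OFFSETS.length * OCTAVES;
  // index runs 0..steps, ending on the tonic OCTAVES octaves above the base
  const index = Math.round(height * steps);
  const octave = Math.floor(index / PENTATONIC_OFFSETS.length);
  const degree = index % PENTATONIC_OFFSETS.length;
  return BASE_MIDI + 12 * octave + PENTATONIC_OFFSETS[degree];
}

// Usage: a hand at mid-frame height lands on A4.
const hz = midiToHz(heightToMidi(0.5));
```

Because the position is quantized to a scale index before it becomes a frequency, any hand height produces a note from the scale, which is what makes the "no wrong notes" behaviour possible.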

Two hands, two voices

Each hand drives an independent oscillator voice (saw + detuned triangle) through a shared filter and delay shimmer.

Gesture control

Hand openness sets volume and opens the low-pass filter; a finger pinch gates the voice to silence.
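One way the openness and pinch readings could be derived from the 21 landmarks is sketched below; it is an illustration under stated assumptions, not the instrument's real implementation. The landmark indices follow MediaPipe's hand model (0 = wrist, 4 = thumb tip, 8/12/16/20 = fingertips, 9 = middle-finger knuckle). The thresholds, the cutoff range, and the names readGesture and voiceParams are hypothetical, and distances are taken directly on the landmark coordinates without correcting for frame aspect ratio.

```typescript
interface Landmark { x: number; y: number; z: number; }

// MediaPipe hand landmark indices.
const WRIST = 0, THUMB_TIP = 4, INDEX_TIP = 8, MIDDLE_MCP = 9;
const FINGERTIPS = [8, 12, 16, 20];

// Assumed tuning constants, chosen for illustration only.
const CLOSED_RATIO = 1.0;  // fingertip reach / palm length for a fist
const OPEN_RATIO = 2.0;    // same ratio for a fully spread hand
const PINCH_RATIO = 0.25;  // thumb-to-index gap / palm length that counts as a pinch
const MIN_CUTOFF_HZ = 200;
const MAX_CUTOFF_HZ = 8000;

function dist(a: Landmark, b: Landmark): number {
  return Math.hypot(a.x - b.x, a.y - b.y, a.z - b.z);
}

function clamp01(v: number): number {
  return Math.min(1, Math.max(0, v));
}

/** Reduce one hand's landmarks to an openness value (0..1) and a pinch flag. */
function readGesture(hand: Landmark[]): { openness: number; pinched: boolean } {
  // Palm length normalizes for hand size and distance from the camera.
  const palm = dist(hand[WRIST], hand[MIDDLE_MCP]) || 1e-6;
  const reach =
    FINGERTIPS.reduce((sum, i) => sum + dist(hand[i], hand[WRIST]), 0) /
    FINGERTIPS.length;
  const openness = clamp01(
    (reach / palm - CLOSED_RATIO) / (OPEN_RATIO - CLOSED_RATIO)
  );
  const pinched = dist(hand[THUMB_TIP], hand[INDEX_TIP]) / palm < PINCH_RATIO;
  return { openness, pinched };
}

/** Turn a gesture into voice settings: a pinch gates gain to 0, openness sets the rest. */
function voiceParams(g: { openness: number; pinched: boolean }) {
  const gain = g.pinched ? 0 : g.openness;
  // Exponential sweep so the filter opens evenly to the ear.
  const cutoffHz =
    MIN_CUTOFF_HZ * Math.pow(MAX_CUTOFF_HZ / MIN_CUTOFF_HZ, g.openness);
  return { gain, cutoffHz };
}
```

In a Web Audio graph, the returned gain and cutoffHz would typically be written to a GainNode's gain parameter and a low-pass BiquadFilterNode's frequency parameter on each tracking frame, ideally with a short ramp to avoid clicks.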

What’s next

Ship the in-browser instrument, then fold it into the Synth Lab stack as the live gesture input.

A browser-playable version — live hand-tracking right in this page, running entirely on-device — is built and landing in a follow-up update.