Automatic BPM and Key Detection: How It Works (2025)

StemSplit Team

Most DJs and producers spend hours manually tagging BPM and key in their music libraries. What if every track came with that metadata automatically — accurate, consistent, and ready to use?

TL;DR: StemSplit now automatically detects BPM (tempo) and musical key for every processed song using librosa — the industry-standard Python library for audio analysis. This data appears on job detail pages and is available via our API and RapidAPI endpoints. BPM detection analyzes 60 seconds of audio, while key detection uses 120 seconds with chroma features and key-profile correlation.

What Is BPM and Key Detection?

BPM (Beats Per Minute) tells you the tempo of a track — how fast the beat is. At 120 BPM, for example, a beat lands every half second (60 ÷ 120 = 0.5 s). It's essential for DJs who need to match tempos between songs and for producers who want to know the exact speed of a track.

Musical Key identifies the harmonic center of a song — like "C major" or "A minor." It's critical for harmonic mixing, where DJs transition between songs in compatible keys for smoother blends.

Together, BPM and key metadata transform how you organize and work with music. No more guessing, no more manual entry.

How StemSplit Detects BPM and Key

We built this feature using librosa — the same Python library used by Spotify, YouTube Music, and major music production software. Here's why it's the right choice and how it works.

Why librosa?

Industry Standard: librosa is the de facto standard for music information retrieval in Python. It's used by:

  • Spotify for audio analysis
  • YouTube Music for content identification
  • Research institutions for music information retrieval
  • Professional audio software for tempo/key detection

Proven Accuracy: The algorithms in librosa are based on decades of research in music information retrieval. They're battle-tested on millions of songs and refined through academic research.

Open Source & Maintained: Unlike proprietary solutions, librosa is open source, actively maintained, and transparent about its methods. You can verify exactly how detection works.

BPM Detection Process

Our BPM detection analyzes 60 seconds of audio — the sweet spot between accuracy and speed.

How it works:

  1. Onset Detection — Identifies the start of musical events (beats, notes, transients)
  2. Tempo Estimation — Analyzes the timing between onsets to find the underlying tempo
  3. Beat Tracking — Refines the tempo estimate by tracking the actual beat pattern

The result: A precise BPM value rounded to one decimal place (e.g., 128.3 BPM).

Why 60 seconds? Research shows that 60 seconds captures enough musical content for reliable tempo detection. Shorter samples (<20 seconds) can be inaccurate, especially with tempo changes. Longer samples (>60 seconds) provide diminishing returns — the extra time doesn't significantly improve accuracy.
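
StemSplit's exact pipeline isn't published, but the three steps above map directly onto standard librosa calls. Here's a minimal sketch (the file path and the plain defaults are illustrative):

import librosa
import numpy as np

# Load the first 60 seconds as mono audio at librosa's default 22,050 Hz.
y, sr = librosa.load("track.mp3", duration=60.0)

# Step 1: onset strength envelope (how "event-like" each frame sounds).
onset_env = librosa.onset.onset_strength(y=y, sr=sr)

# Steps 2-3: estimate tempo and track beats from the onset envelope.
tempo, beat_frames = librosa.beat.beat_track(onset_envelope=onset_env, sr=sr)

# Recent librosa versions may return tempo as a one-element array.
tempo = float(np.atleast_1d(tempo)[0])
print(f"Estimated tempo: {tempo:.1f} BPM ({len(beat_frames)} beats tracked)")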

Key Detection Process

Our key detection analyzes 120 seconds of audio for maximum accuracy.

How it works:

  1. Chroma Feature Extraction — Analyzes the pitch class profile (which notes are present)
  2. Temporal Averaging — Averages chroma features across time for stability
  3. Key Profile Correlation — Compares the chroma profile to major and minor key templates (Krumhansl-Schmuckler profiles)
  4. Mode Detection — Determines whether the key is major or minor based on correlation strength

The result: A key like "C", "Am", "F#", or "Dm".

Why 120 seconds? Key detection needs more audio than BPM because harmonic content can vary throughout a song. 120 seconds ensures we capture the overall harmonic character, not just a single section. This is especially important for songs with key changes or complex harmonic progressions.

Why Krumhansl-Schmuckler Profiles? These are the most widely validated key profiles in music psychology research. They're based on how human listeners perceive key relationships — making our detection align with how DJs and producers actually hear music.
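
As with BPM, the four steps above translate into a short librosa-plus-NumPy sketch. The profile values are the published Krumhansl-Schmuckler ratings; the function name, file path, and chroma variant are our illustration of the approach, not StemSplit's actual code:

import librosa
import numpy as np

# Krumhansl-Schmuckler key profiles: perceived stability of each pitch
# class relative to the tonic (index 0 = tonic).
MAJOR = np.array([6.35, 2.23, 3.48, 2.33, 4.38, 4.09,
                  2.52, 5.19, 2.39, 3.66, 2.29, 2.88])
MINOR = np.array([6.33, 2.68, 3.52, 5.38, 2.60, 3.53,
                  2.54, 4.75, 3.98, 2.69, 3.34, 3.17])
NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def detect_key(path, duration=120.0):
    # Step 1: chroma features, i.e. energy per pitch class over time.
    y, sr = librosa.load(path, duration=duration)
    chroma = librosa.feature.chroma_cqt(y=y, sr=sr)

    # Step 2: temporal averaging into one 12-bin harmonic fingerprint.
    fingerprint = chroma.mean(axis=1)

    # Steps 3-4: correlate against all 24 rotations (12 tonics x 2 modes)
    # and keep the best match.
    best_key, best_corr = None, -np.inf
    for tonic in range(12):
        rotated = np.roll(fingerprint, -tonic)  # align tonic with index 0
        for template, suffix in ((MAJOR, ""), (MINOR, "m")):
            corr = np.corrcoef(rotated, template)[0, 1]
            if corr > best_corr:
                best_key, best_corr = NOTES[tonic] + suffix, corr
    return best_key

print(detect_key("track.mp3"))  # e.g. "Am"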

Where You'll See BPM and Key

On Job Detail Pages

Every completed job now shows BPM and key prominently at the top of the page — right after the title and duration. They appear in styled badges that make the information impossible to miss.

Displayed for:

  • Uploaded stem separation jobs
  • YouTube jobs
  • SoundCloud jobs

In the API Response

BPM and key are included in the audioMetadata field for all job types.

API Endpoints:

  • GET /api/v1/jobs/{id} — Returns audioMetadata.bpm and audioMetadata.key
  • GET /api/v1/youtube-jobs/{id} — Returns audioMetadata.bpm and audioMetadata.key
  • GET /api/v1/soundcloud-jobs/{id} — Returns audioMetadata.bpm and audioMetadata.key

RapidAPI Endpoints:

  • GET /rapidapi/v1/jobs/{id} — Returns audioMetadata.bpm and audioMetadata.key
  • GET /rapidapi/v1/youtube-jobs/{id} — Returns audioMetadata.bpm and audioMetadata.key
  • GET /rapidapi/v1/soundcloud-jobs/{id} — Returns audioMetadata.bpm and audioMetadata.key

Example API Response

{
  "id": "clxxx123...",
  "status": "COMPLETED",
  "audioMetadata": {
    "bpm": 128.3,
    "key": "Am",
    "waveformPeaks": {
      "vocals": [0.2, 0.5, 0.8, ...],
      "instrumental": [0.3, 0.6, 0.7, ...]
    }
  },
  "outputs": {
    "vocals": {
      "url": "https://storage.example.com/vocals.mp3",
      "expiresAt": "2025-01-15T13:00:00Z"
    }
  }
}

For complete API documentation, see our Developer Reference.


Building an app that needs BPM and key data? Our API makes it easy to access this metadata programmatically. Check out our developer documentation to get started.


Technical Deep Dive: The Detection Algorithms

BPM Detection Algorithm

librosa uses a multi-stage approach:

Stage 1: Onset Detection

  • Analyzes the audio signal for sudden changes in energy
  • Identifies the start of musical events (drums, notes, transients)
  • Creates an onset envelope — a representation of when musical events occur

Stage 2: Tempo Estimation

  • Analyzes the spacing between onsets
  • Uses autocorrelation to find repeating patterns
  • Identifies the most likely tempo candidates

Stage 3: Beat Tracking

  • Refines the tempo estimate by tracking actual beats
  • Handles tempo variations and changes
  • Outputs a precise BPM value

Why this works: Unlike simple peak detection, this approach understands musical structure. It doesn't just find loud sounds — it finds the underlying rhythmic pattern that defines the tempo.
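
To make Stage 2 concrete, here is a bare-bones autocorrelation tempo estimator. It is intentionally simpler than librosa's tempogram-based estimation (no prior over plausible tempos, no beat tracking), so treat it as a sketch of the idea rather than a drop-in replacement:

import librosa
import numpy as np

y, sr = librosa.load("track.mp3", duration=60.0)
hop = 512  # frames of the onset envelope are hop/sr seconds apart
onset_env = librosa.onset.onset_strength(y=y, sr=sr, hop_length=hop)

# Autocorrelation peaks appear at lags where the rhythm repeats,
# i.e. at the beat period and its multiples.
ac = librosa.autocorrelate(onset_env)

# Convert each candidate lag (in frames) to BPM: one beat per
# lag * hop / sr seconds means BPM = 60 * sr / (hop * lag).
lags = np.arange(1, len(ac))
bpms = 60.0 * sr / (hop * lags)

# Keep the strongest autocorrelation peak in a plausible tempo range.
valid = (bpms >= 60) & (bpms <= 200)
best = lags[valid][np.argmax(ac[1:][valid])]
print(f"Autocorrelation estimate: {60.0 * sr / (hop * best):.1f} BPM")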

Key Detection Algorithm

Our key detection uses chroma-based analysis:

Stage 1: Chroma Feature Extraction

  • Converts audio to chroma features — a 12-dimensional representation
  • Each dimension represents one of the 12 pitch classes (C, C#, D, D#, E, F, F#, G, G#, A, A#, B)
  • Shows which pitch classes are present and how strongly

Stage 2: Temporal Averaging

  • Averages chroma features across the entire analyzed segment
  • Creates a stable representation of the song's harmonic content
  • Reduces the impact of momentary harmonic variations

Stage 3: Key Profile Correlation

  • Compares the averaged chroma profile to 24 key templates (12 major + 12 minor)
  • Uses Krumhansl-Schmuckler key profiles — validated through music psychology research
  • Calculates correlation coefficients for each possible key

Stage 4: Mode Selection

  • Selects the key with the highest correlation
  • Determines major vs. minor by comparing major and minor correlations
  • Outputs the final key (e.g., "C" for C major, "Am" for A minor)

Why this works: Chroma features capture the harmonic "fingerprint" of a song. By comparing this fingerprint to known key profiles, we can identify the tonal center — the same way human listeners do.

Accuracy and Limitations

BPM Detection Accuracy

What works well:

  • Clear, consistent tempos
  • Well-produced commercial releases
  • Songs with prominent rhythmic elements

Challenges:

  • Songs with tempo changes (rubato, accelerando)
  • Very slow or very fast tempos outside the typical 60-200 BPM range, which often come back at half or double the true value
  • Ambient or rhythmically ambiguous music

Typical accuracy: Within ±1 BPM for most commercial music.

Key Detection Accuracy

What works well:

  • Songs with clear tonal centers
  • Standard major/minor keys
  • Well-produced commercial releases

Challenges:

  • Modal music (Dorian, Mixolydian, etc.) — may detect relative major/minor
  • Songs with frequent key changes
  • Atonal or highly chromatic music
  • Very short songs (<30 seconds)

Typical accuracy: 85-95% correct key identification for standard pop/rock/electronic music.

Why Not 100% Accuracy?

Music is complex. A song might:

  • Start in one key and modulate to another
  • Use modal scales that don't fit major/minor templates
  • Have ambiguous harmonic content

Our detection provides the primary key — the tonal center that dominates most of the song. For songs with key changes, it identifies the most prominent key.

Use Cases for BPM and Key Data

For DJs

Harmonic Mixing: Match keys between songs for smooth, musical transitions. Songs in compatible keys (like C major and A minor) blend naturally.

Tempo Matching: Know the exact BPM before mixing. No more guessing or manually tapping tempo.

Library Organization: Sort and filter your collection by BPM and key. Build playlists that flow musically.
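
To make harmonic mixing concrete in code: the Camelot wheel assigns every key a number from 1 to 12 plus a letter (A for minor, B for major), and two keys mix well when they share a number or sit one step apart with the same letter. The mapping below is the standard wheel; the helper itself is our illustration, not part of the StemSplit API:

# Camelot wheel positions for the sharp-spelled keys StemSplit reports
# (flat spellings like "Db" omitted for brevity).
CAMELOT = {
    "B": "1B", "F#": "2B", "C#": "3B", "G#": "4B", "D#": "5B", "A#": "6B",
    "F": "7B", "C": "8B", "G": "9B", "D": "10B", "A": "11B", "E": "12B",
    "G#m": "1A", "D#m": "2A", "A#m": "3A", "Fm": "4A", "Cm": "5A", "Gm": "6A",
    "Dm": "7A", "Am": "8A", "Em": "9A", "Bm": "10A", "F#m": "11A", "C#m": "12A",
}

def compatible(key_a: str, key_b: str) -> bool:
    """True if two keys mix well under the basic Camelot rules: same
    slot (including relative major/minor) or one step around the wheel."""
    a, b = CAMELOT[key_a], CAMELOT[key_b]
    num_a, num_b = int(a[:-1]), int(b[:-1])
    if num_a == num_b:
        return True
    # Adjacent numbers with the same mode letter, wrapping 12 -> 1.
    return a[-1] == b[-1] and abs(num_a - num_b) in (1, 11)

print(compatible("C", "Am"))  # True: relative major/minor
print(compatible("C", "G"))   # True: one step around the wheel
print(compatible("C", "F#"))  # False: a tritone apart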

For Producers

Remix Planning: Know the original key and tempo before starting a remix. Maintain harmonic compatibility or plan key changes intentionally.

Sample Matching: Find samples that match your project's key and tempo automatically.

Reference Tracks: Quickly identify the key and tempo of reference tracks for your own productions.

For Developers

Music Apps: Build apps that organize music by BPM and key automatically.

DJ Software Integration: Use our API to populate BPM/key fields in DJ software automatically.

Music Analysis Tools: Create tools that analyze music libraries and suggest compatible tracks.

API Integration Examples

JavaScript/TypeScript

// Get job with BPM and key
const response = await fetch(`https://api.stemsplit.io/v1/jobs/${jobId}`, {
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY'
  }
});

const job = await response.json();

if (job.audioMetadata) {
  console.log(`BPM: ${job.audioMetadata.bpm}`);
  console.log(`Key: ${job.audioMetadata.key}`);
}

Python

import requests

response = requests.get(
    f'https://api.stemsplit.io/v1/jobs/{job_id}',
    headers={'Authorization': f'Bearer {api_key}'}
)

job = response.json()

if job.get('audioMetadata'):
    print(f"BPM: {job['audioMetadata']['bpm']}")
    print(f"Key: {job['audioMetadata']['key']}")

RapidAPI

curl --request GET \
  --url 'https://stemsplit-api.p.rapidapi.com/v1/jobs/{jobId}' \
  --header 'X-RapidAPI-Key: YOUR_RAPIDAPI_KEY' \
  --header 'X-RapidAPI-Host: stemsplit-api.p.rapidapi.com'

For complete API documentation with all endpoints and examples, see our Developer Reference.

Controlling BPM and Key Detection

For Uploaded Files

When uploading a file for stem separation, you can choose to enable or disable audio analysis:

  • Enabled (default): BPM and key are detected automatically
  • Disabled: Faster processing, no BPM/key detection

This option appears in the upload interface. For most users, we recommend leaving it enabled — the analysis adds only 2-3 seconds to processing time.

For YouTube and SoundCloud Jobs

BPM and key detection are always enabled for YouTube and SoundCloud jobs. Since these jobs already include audio analysis for metadata extraction, BPM and key detection adds minimal overhead.

FAQ

How accurate is the BPM detection?

For most commercial music with consistent tempos, BPM detection is accurate within ±1 BPM. Songs with tempo changes or ambiguous rhythms may have less accurate results.

How accurate is the key detection?

Key detection achieves 85-95% accuracy for standard pop, rock, and electronic music. Modal music or songs with frequent key changes may be less accurate.

Can I disable BPM and key detection?

Yes — for uploaded files only. Use the "Enable Audio Analysis" toggle in the upload interface. YouTube and SoundCloud jobs always include BPM and key detection.

What if a song changes key or tempo?

The detection identifies the primary key and tempo — the ones that dominate most of the song. For songs with changes, it reports the most prominent values.

Is this data available via API?

Yes. BPM and key are included in the audioMetadata field for all job types. See our Developer Reference for complete API documentation.

What library does StemSplit use for detection?

We use librosa — the industry-standard Python library for music information retrieval. It's the same library used by Spotify, YouTube Music, and major audio software.

Why librosa instead of other libraries?

librosa is:

  • Industry standard (used by major platforms)
  • Open source and transparent
  • Based on validated research
  • Actively maintained
  • Proven accurate on millions of songs

Can I use this data commercially?

Yes. BPM and key metadata detected by StemSplit can be used in your applications, DJ software, or music analysis tools. The data is provided as-is — you're responsible for how you use it.

How long does detection take?

BPM and key detection adds 2-3 seconds to processing time. This happens automatically during stem separation, so there's no additional wait.

The Bottom Line

Automatic BPM and key detection transforms how you work with music. No more manual tagging, no more guessing. Every song processed through StemSplit comes with accurate tempo and key metadata — ready to use in your DJ sets, productions, or applications.

Whether you're mixing tracks, planning remixes, or building music apps, having BPM and key data automatically available saves time and opens up new creative possibilities.


Start Using BPM and Key Detection

Every song you process through StemSplit now includes automatic BPM and key detection.

  • ✅ Powered by librosa — industry-standard accuracy
  • ✅ Available via API and RapidAPI
  • ✅ Displayed prominently on job pages
  • ✅ Works for uploads, YouTube, and SoundCloud

Try Stem Separation with BPM/Key Detection →


Tags

#bpm detection, #key detection, #music analysis, #audio metadata, #api, #librosa