
Endpoint

GET /api/v1/music/timeline?clip_id={clip_id}
Synchronous: returns the start and end timestamps of each lyric word in a clip immediately. Use it to build synchronized lyric displays (karaoke-style), subtitle files, or animated text overlays.

Query parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| clip_id | string | Yes | Clip ID to retrieve timing for. Obtain it from Upload Audio (data.clipId) or a completed generation task (results[].id). |

Response

{
  "code": 0,
  "message": "ok",
  "request_id": "req-1710000000000",
  "data": {
    "aligned": [
      { "word": "Hello", "start": 2.14, "end": 2.56 },
      { "word": "world", "start": 2.58, "end": 2.99 },
      { "word": "this", "start": 3.1, "end": 3.3 },
      { "word": "is", "start": 3.32, "end": 3.45 },
      { "word": "music", "start": 3.5, "end": 3.95 }
    ]
  }
}

| Field | Type | Description |
| --- | --- | --- |
| aligned | array | Array of word timing objects |
| aligned[].word | string | The lyric word |
| aligned[].start | float | Start time in seconds |
| aligned[].end | float | End time in seconds |
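
Since subtitle files are one of the intended uses, here is a minimal sketch of turning the aligned array into SubRip (SRT) text. The helper names toSrtTime and alignedToSrt, and the rule of grouping a fixed number of words per cue, are assumptions made for illustration; they are not part of the API.

```javascript
// Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm
function toSrtTime(seconds) {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  const pad = (n, width) => String(n).padStart(width, "0");
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)},${pad(ms, 3)}`;
}

// Group words into cues of at most wordsPerCue words.
// The fixed-size grouping is a hypothetical rule; a real player might
// split on pauses or line breaks instead.
function alignedToSrt(aligned, wordsPerCue = 4) {
  const cues = [];
  for (let i = 0; i < aligned.length; i += wordsPerCue) {
    const group = aligned.slice(i, i + wordsPerCue);
    const text = group.map(w => w.word).join(" ");
    const start = toSrtTime(group[0].start);
    const end = toSrtTime(group[group.length - 1].end);
    // Each cue: sequence number, time range, text, trailing newline
    cues.push(`${cues.length + 1}\n${start} --> ${end}\n${text}\n`);
  }
  // A blank line separates consecutive cues
  return cues.join("\n");
}
```

Calling alignedToSrt(response.data.aligned) on the sample response above would produce two cues: one covering "Hello world this is" and one covering "music".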

Example

curl "https://api.example.com/api/v1/music/timeline?clip_id=abc123def456" \
  -H "Authorization: Bearer sk-mm-your-key"
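
The same request can be sketched in JavaScript with the standard fetch API. The function names buildTimelineUrl and fetchTimeline are hypothetical, the base URL and key are the placeholders from the curl example, and treating code 0 as success is an assumption based on the sample response rather than documented behavior.

```javascript
// Placeholder host, taken from the curl example above
const BASE_URL = "https://api.example.com";

// Build the request URL; URLSearchParams takes care of encoding clip_id
function buildTimelineUrl(clipId) {
  const url = new URL("/api/v1/music/timeline", BASE_URL);
  url.searchParams.set("clip_id", clipId);
  return url.toString();
}

// fetchImpl is injectable so the function can be exercised without a network
async function fetchTimeline(clipId, apiKey, fetchImpl = globalThis.fetch) {
  const res = await fetchImpl(buildTimelineUrl(clipId), {
    headers: { Authorization: `Bearer ${apiKey}` },
  });
  if (!res.ok) throw new Error(`HTTP ${res.status}`);
  const body = await res.json();
  // Assumption: code 0 signals success, as in the sample response
  if (body.code !== 0) {
    throw new Error(`API error ${body.code}: ${body.message}`);
  }
  return body.data.aligned;
}
```

A caller would then write something like `const aligned = await fetchTimeline("abc123def456", "sk-mm-your-key");` and pass the result to whatever renders the lyrics.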

Use case: karaoke display

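Iterate over the aligned words and highlight the one whose time range contains the current audio playback time:
// Returns the word being sung at currentTime (seconds), or undefined
// when currentTime falls in a gap between words.
function getCurrentWord(aligned, currentTime) {
  return aligned.find(w => currentTime >= w.start && currentTime <= w.end);
}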
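
Because getCurrentWord returns undefined in the short gaps between words (for example at 2.57 s in the sample response), a display that also marks already-sung words may be easier to build on an index lookup. The sketch below is one possible approach, not the service's recommended method; findWordIndex and wordStates are hypothetical names, and the binary search assumes aligned is sorted by start time, as it is in the sample response.

```javascript
// Return the index of the last word whose start <= currentTime,
// or -1 if playback has not reached the first word yet.
// Assumes `aligned` is sorted by start time.
function findWordIndex(aligned, currentTime) {
  let lo = 0;
  let hi = aligned.length - 1;
  let result = -1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (aligned[mid].start <= currentTime) {
      result = mid; // candidate; keep looking to the right
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return result;
}

// Classify every word as "sung", "current", or "upcoming" for rendering.
function wordStates(aligned, currentTime) {
  const idx = findWordIndex(aligned, currentTime);
  return aligned.map((w, i) => {
    if (i < idx) return "sung";
    // The most recently started word is current only until its end time
    if (i === idx) return currentTime <= w.end ? "current" : "sung";
    return "upcoming";
  });
}
```

In a browser, wordStates could be called with an audio element's currentTime inside a timeupdate or requestAnimationFrame handler, applying one CSS class per state.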