For vocals, see: Vocal Dataset Recording Specifications

What to play

<aside>

Play songs or musical pieces only

The AI needs to learn the emotion, flow, dynamic changes, and articulations found in real musical content.

❌ Don’t play:

<aside>

Play like you’re on stage

Capture the expressiveness and dynamics of a live performance as fully as you can in the recording, rather than reproducing the score note for note.

✅ Try your best to:

<aside>

Use articulations in your tracks

The AI needs to learn when and how articulations are used in a real performance.

We don’t need separate tracks dedicated to a single articulation. Instead, work as many articulations as you can into your playing, even if the original score doesn’t include any.

</aside>

<aside>

Length of each track

80% of the tracks should be complete pieces of music.

20% of the tracks should contain several short phrases, each 2 to 8 bars long.

We need 1 hour of recordings in total, and each track should be bounced as a single audio file.

</aside>
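The split and the total above can be checked with a few lines of code before delivery. The following is a minimal sketch, not part of the spec itself: the `Track` record and the `summarize` helper are hypothetical names, and reading "80%" as a share of track count (rather than of duration) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Track:
    """Hypothetical record for one bounced audio file."""
    name: str
    seconds: float
    is_full_piece: bool  # False means a track made of short 2 to 8 bar phrases


def summarize(tracks):
    """Report total length and the share of full-piece tracks.

    Assumption: the 80% / 20% split is counted per track, not per minute.
    """
    total_seconds = sum(t.seconds for t in tracks)
    full_count = sum(1 for t in tracks if t.is_full_piece)
    full_share = full_count / len(tracks) if tracks else 0.0
    return {
        "total_minutes": total_seconds / 60,
        "full_share": full_share,
        "meets_total": total_seconds >= 3600,  # 1 hour in total
    }


# Usage: list your own tracks here, then inspect the summary.
session = [
    Track("piece_01.wav", 420.0, True),
    Track("phrases_01.wav", 150.0, False),
]
report = summarize(session)
```

A `full_share` close to 0.8 and `meets_total` equal to `True` would suggest the set matches the targets above, under the stated assumption.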

<aside>

You are the performer behind the model

The AI model learns everything from your recordings. It plays as you play, feels as you feel. It is your digital avatar.

</aside>

How to record

  1. Follow the click track of each song. A few small timing offsets are OK.
  2. Record dry instrument tracks: no reverb, no delay, and no backing tracks.
  3. For polyphonic instruments, feel free to play polyphonically.
  4. Avoid background noise and strong room reflections.
  5. Avoid audible bleed from your headphones. (Try lowering your headphone volume.)
  6. When joining two clips, use a cross-fade. Place it only over silence, a breath, or a consonant, so that it never covers a word. There is no need to remove breaths.
  7. Leave at least 1 second of silence at the beginning and end of each track.
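The last rule (at least 1 second of silence at both ends) is easy to verify automatically. Below is a rough sketch using only the Python standard library. The function names, the 16-bit PCM WAV input, and the amplitude threshold of 0.001 (about -60 dBFS) used to decide what counts as "silence" are all assumptions for illustration, not requirements from this spec.

```python
import array
import sys
import wave


def read_pcm16_wav(path):
    """Load a 16-bit PCM WAV file as interleaved floats in [-1, 1]."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        pcm = array.array("h")
        pcm.frombytes(wav.readframes(wav.getnframes()))
        if sys.byteorder == "big":
            pcm.byteswap()  # WAV data is little-endian
        samples = [s / 32768.0 for s in pcm]
        return samples, wav.getframerate(), wav.getnchannels()


def edge_silence_seconds(samples, sample_rate, channels=1, threshold=0.001):
    """Return (leading, trailing) silence in seconds.

    A frame counts as silent when every channel is at or below `threshold`
    (an assumed floor). A fully silent file is reported as all leading silence.
    """
    frames = len(samples) // channels

    def frame_is_silent(i):
        start = i * channels
        return all(abs(s) <= threshold for s in samples[start:start + channels])

    lead = 0
    while lead < frames and frame_is_silent(lead):
        lead += 1
    trail = 0
    while trail < frames - lead and frame_is_silent(frames - 1 - trail):
        trail += 1
    return lead / sample_rate, trail / sample_rate


def has_required_padding(lead, trail, minimum=1.0):
    """True when both ends have at least `minimum` seconds of silence."""
    return lead >= minimum and trail >= minimum
```

To check a file, call `read_pcm16_wav` on it, pass the three returned values to `edge_silence_seconds`, and pass that result to `has_required_padding`. Real recordings have a noise floor, so the threshold may need adjusting for your setup.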

Samples

Violin:

Violin sample.mp3