AI music generation

An overview of AI music generation tools: music generation in general, real-world examples, WaveNet for voice, Magenta for music, some AI theory, and the state of the art.

Introduction

Generative composition

WaveNet (Google / TensorFlow)

Voice generation

# Call Google wavenet API with curl on text/dreambank.quotes
# (one call per line in file, with voice parameters)
./quote2speech text/dreambank.quotes output/voice/mp3
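The quote2speech script's contents aren't shown here, but each call boils down to a POST of a JSON body (text, voice name, audio encoding) to the Cloud Text-to-Speech API. A minimal sketch of building that request body; the voice name en-US-Wavenet-D and the loop are assumptions, not the script's actual parameters:

```shell
# Sketch only: build the JSON body for one synthesize call.
# Voice name and loop below are assumptions; quote2speech's internals aren't shown.
make_payload() {
  printf '{"input":{"text":"%s"},"voice":{"languageCode":"en-US","name":"en-US-Wavenet-D"},"audioConfig":{"audioEncoding":"MP3"}}' "$1"
}

# One call per line of the quotes file (requires gcloud credentials):
# while IFS= read -r quote; do
#   curl -s -X POST \
#     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
#     -H "Content-Type: application/json" \
#     -d "$(make_payload "$quote")" \
#     "https://texttospeech.googleapis.com/v1/text:synthesize"
# done < text/dreambank.quotes
```

The response carries the audio as base64 in an audioContent field, which still has to be decoded and written to the output directory.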

Magenta models (Google / TensorFlow)

Generative composition. Several pre-trained models are available in Magenta.

Installing Magenta

See https://github.com/tensorflow/magenta#installation.

# Python 2.x or 3.x
sudo apt install python-pip
# See https://pypi.org/project/magenta/
sudo pip install magenta
# If you want GPU support
pip install magenta-gpu

Or build from source

bazel build //magenta/tools/pip:build_pip_package
./magenta/tools/build.sh

To build and then run a specific target

bazel build //magenta/models/music_vae:music_vae_generate
bazel-bin/magenta/models/music_vae/music_vae_generate

Downloading models

mkdir -p models
mkdir -p checkpoints

# Drum kit RNN
wget -O models/drum_kit_rnn.mag "http://download.magenta.tensorflow.org/models/drum_kit_rnn.mag"

# Melody RNN
wget -O models/attention_rnn.mag "http://download.magenta.tensorflow.org/models/attention_rnn.mag"

# Polyphony RNN
wget -O models/polyphony_rnn.mag "http://download.magenta.tensorflow.org/models/polyphony_rnn.mag"

# Music VAE (2 GB per checkpoint)
wget -O checkpoints/mel_2bar_big.ckpt.tar "http://download.magenta.tensorflow.org/models/music_vae/checkpoints_bundled/mel_2bar_big.ckpt.tar"
wget -O checkpoints/mel_16bar_hierdec.ckpt.tar "http://download.magenta.tensorflow.org/models/music_vae/checkpoints_bundled/mel_16bar_hierdec.ckpt.tar"
wget -O checkpoints/trio_16bar_hierdec.ckpt.tar "http://download.magenta.tensorflow.org/models/music_vae/checkpoints_bundled/trio_16bar_hierdec.ckpt.tar"
wget -O checkpoints/drums_2bar_small.lokl.ckpt.tar "http://download.magenta.tensorflow.org/models/music_vae/checkpoints_bundled/drums_2bar_small.lokl.ckpt.tar"
wget -O checkpoints/drums_2bar_small.hikl.ckpt.tar "http://download.magenta.tensorflow.org/models/music_vae/checkpoints_bundled/drums_2bar_small.hikl.ckpt.tar"
wget -O checkpoints/drums_2bar_nade.full.ckpt.tar "http://download.magenta.tensorflow.org/models/music_vae/checkpoints_bundled/drums_2bar_nade.full.ckpt.tar"

# NSynth (2 GB per checkpoint)
wget -O checkpoints/baseline-ckpt.tar "http://download.magenta.tensorflow.org/models/nsynth/baseline-ckpt.tar"
wget -O checkpoints/wavenet-ckpt.tar "http://download.magenta.tensorflow.org/models/nsynth/wavenet-ckpt.tar"

Drums RNN (rhythm generation)

# Generate a score from a single kick drum (the default primer)
# (num_outputs: number of MIDI files to generate)
# (num_steps: number of steps to generate; here double the primer length)
# (temperature: the lower the value, the closer the output stays to the primer)
drums_rnn_generate --config=drum_kit --bundle_file=models/drum_kit_rnn.mag --output_dir=output/drums_rnn/drum_kit/basic --num_outputs=10 --num_steps=64 --temperature=1

# The tool also logs each sequence's log-likelihood: the closer to zero,
# the more confident the model is in the generated sequence
# (close to the primer)
# INFO:tensorflow:Beam search yields sequence with log-likelihood: -2.849966
# (far from the primer)
# INFO:tensorflow:Beam search yields sequence with log-likelihood: -95.074890
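Temperature divides the model's logits before sampling, which is why lower values keep the output closer to the most likely (primer-like) continuations. A toy illustration with three made-up logits (the values 2.0, 1.0, 0.5 are arbitrary, not taken from the drums model):

```shell
# Toy softmax-with-temperature: logits are divided by t before normalizing.
# The logits (2.0, 1.0, 0.5) are arbitrary example values.
softmax_with_temp() {
  awk -v t="$1" 'BEGIN {
    n = split("2.0 1.0 0.5", l, " ")
    s = 0
    for (i = 1; i <= n; i++) { e[i] = exp(l[i] / t); s += e[i] }
    for (i = 1; i <= n; i++) printf "%.3f ", e[i] / s
    print ""
  }'
}

softmax_with_temp 1.0   # fairly spread out: 0.629 0.231 0.140
softmax_with_temp 0.1   # nearly deterministic: 1.000 0.000 0.000
```

At temperature 1.5, as in the jazz example below, the distribution flattens instead, giving more surprising (less primer-like) sequences.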

# Generate a score from a primer: one step with bass drum and hi-hat together,
# one step of rest, then one step with just hi-hat
# (primer_drums: Python list of tuples of MIDI pitches; 36 = bass drum, 42 = closed hi-hat)
drums_rnn_generate --config=drum_kit --bundle_file=models/drum_kit_rnn.mag --output_dir=output/drums_rnn/drum_kit/bd_hat --num_outputs=10 --num_steps=64 --temperature=1 --primer_drums="[(36, 42), (), (42,)]"

# Generate score based on jazz primer
# (primer_midi: midi file to prime the model)
drums_rnn_generate --config=drum_kit --bundle_file=models/drum_kit_rnn.mag --output_dir=output/drums_rnn/drum_kit/jazz --num_outputs=10 --num_steps=64 --temperature=1 --primer_midi=primer/jazz-drum-basic.mid

# Generate score based on jazz primer with high temperature
drums_rnn_generate --config=drum_kit --bundle_file=models/drum_kit_rnn.mag --output_dir=output/drums_rnn/drum_kit/jazz_temp --num_outputs=10 --num_steps=64 --temperature=1.5 --primer_midi=primer/jazz-drum-basic.mid

Melody / Polyphony RNN (melody generation)

# Generate score based on mario theme song (monophonic)
# (primer length: 7 bars * 16 steps per bar = 112 steps)
# (num_steps: 112 primer steps * 2 = 224, so the generated part matches the primer length)
melody_rnn_generate --config=attention_rnn --bundle_file=models/attention_rnn.mag --output_dir=output/melody/attention_rnn/mario_mono --num_outputs=10 --num_steps=224 --primer_midi=primer/mario-cut-mono.mid
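The step arithmetic in the comments above generalizes to any primer length, assuming 4/4 time and Magenta's default resolution of 16 steps per bar (the variable names here are just for illustration):

```shell
# Compute num_steps for a primer of a given length, assuming 4/4 time and
# the default 16 steps per bar; generate twice the primer length in total.
bars=7
steps_per_bar=16
primer_steps=$((bars * steps_per_bar))   # 112
num_steps=$((primer_steps * 2))          # 224
echo "$num_steps"
```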

# Generate score based on mario theme song (polyphonic)
# (this doesn't work well: the attention_rnn model wasn't trained on polyphonic data)
melody_rnn_generate --config=attention_rnn --bundle_file=models/attention_rnn.mag --output_dir=output/melody/attention_rnn/mario_poly --num_outputs=10 --num_steps=224 --primer_midi=primer/mario-cut-poly.mid

# Generate score based on mario theme song (polyphonic, take two)
# (condition_on_primer / inject_primer_during_generation: options that control
# how the primer feeds the model, to avoid a gap at the first generated step)
polyphony_rnn_generate --config=polyphony_rnn --bundle_file=models/polyphony_rnn.mag --output_dir=output/polyphony/polyphony_rnn/mario_poly --num_outputs=10 --num_steps=224 --primer_midi=primer/mario-cut-poly.mid --condition_on_primer=false --inject_primer_during_generation=true

Music VAE (score interpolation)

# Generate simple melodies
# (checkpoint_file: model training checkpoint)
# (mode=sample: generate new outputs from the trained model)
music_vae_generate --config=cat-mel_2bar_big --checkpoint_file=checkpoints/mel_2bar_big.ckpt.tar --mode=sample --num_outputs=5 --output_dir=output/music_vae/cat-mel_2bar_big/sample

# Generate more complex melodies (16 bars)
music_vae_generate --config=hierdec-mel_16bar --checkpoint_file=checkpoints/mel_16bar_hierdec.ckpt.tar --mode=sample --num_outputs=5 --output_dir=output/music_vae/hierdec-mel_16bar/sample

# Generate complex rhythm, melody, and bass (a full track!)
music_vae_generate --config=hierdec-trio_16bar --checkpoint_file=checkpoints/trio_16bar_hierdec.ckpt.tar --mode=sample --num_outputs=5 --output_dir=output/music_vae/hierdec-trio_16bar/sample

# Interpolate between two melodies (mario to zelda)
# (mode=interpolate: generate outputs between the two inputs provided)
music_vae_generate --config=cat-mel_2bar_big --checkpoint_file=checkpoints/mel_2bar_big.ckpt.tar --mode=interpolate --num_outputs=5 --input_midi_1=primer/mario-cut-2bar-mono.mid --input_midi_2=primer/zelda-cut-2bar-mono.mid --output_dir=output/music_vae/cat-mel_2bar_big/interpolate
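Interpolate mode encodes both MIDI inputs into the VAE's latent space and decodes points spaced evenly along a path between them. A toy sketch of that even spacing on a single latent dimension (the real latent vectors are high-dimensional, and the actual implementation may interpolate spherically rather than linearly; this only shows the spacing):

```shell
# Toy linear interpolation: n points evenly spaced from a to b
lerp() {
  awk -v a="$1" -v b="$2" -v n="$3" 'BEGIN {
    for (i = 0; i < n; i++) { t = i / (n - 1); printf "%.2f ", a + t * (b - a) }
    print ""
  }'
}

lerp 0 1 5   # 0.00 0.25 0.50 0.75 1.00
```

With num_outputs=5 as above, the outputs sweep gradually from the first melody to the second.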

# Then append the interpolation results into a single MIDI file
python append.py output/music_vae/cat-mel_2bar_big/interpolate/mario-to-zelda.mid output/music_vae/cat-mel_2bar_big/interpolate/cat-mel_2bar_big_interpolate_*.mid

# Generate more drum patterns
music_vae_generate --config=cat-drums_2bar_small --checkpoint_file=checkpoints/drums_2bar_small.lokl.ckpt.tar --mode=sample --num_outputs=5 --output_dir=output/music_vae/cat-drums_2bar_small/lokl/sample
music_vae_generate --config=cat-drums_2bar_small --checkpoint_file=checkpoints/drums_2bar_small.hikl.ckpt.tar --mode=sample --num_outputs=5 --output_dir=output/music_vae/cat-drums_2bar_small/hikl/sample
music_vae_generate --config=nade-drums_2bar_full --checkpoint_file=checkpoints/drums_2bar_nade.full.ckpt.tar --mode=sample --num_outputs=5 --output_dir=output/music_vae/nade-drums_2bar_full/sample

NSynth (audio synthesis)

This doesn’t work locally (checkpoint too old?). Try the online version instead: https://experiments.withgoogle.com/ai/sound-maker/view, and see https://github.com/tensorflow/magenta/issues/1315.

# TODO "error NotFoundError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a Variable name or other graph key that is missing from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint"
nsynth_generate --checkpoint_path=checkpoints/wavenet-ckpt/wavenet-ckpt/model.ckpt-200000.index --source_path=wav --save_path=output/nsynth/wavenet --batch_size=4

Getting help

# For any command, use "--helpfull" to list all parameters
music_vae_generate --helpfull

Training

AI theory

Tools (libraries / MIDI / audio)

State of the art