Coursera 吴恩达DeepLearning.AI 第五课 sequence model 序列模型第一周 Improvise a Jazz Solo with an LSTM Network

本文主要是介绍Coursera 吴恩达DeepLearning.AI 第五课 sequence model 序列模型第一周 Improvise a Jazz Solo with an LSTM Network，希望对大家解决编程问题提供一定的参考价值，需要的开发者们随着小编来一起学习吧！

本周的习题有些地方有点坑，主要是题目讲述的不够清楚，以下答案供参考

Improvise a Jazz Solo with an LSTM Network

Welcome to your final programming assignment of this week! In this notebook, you will implement a model that uses an LSTM to generate music. You will even be able to listen to your own music at the end of the assignment.

You will learn to:

Apply an LSTM to music generation.
Generate your own jazz music with deep learning.

Please run the following cell to load all the packages required in this assignment. This may take a few minutes.

In [ ]: 

 from __future__ import print_function import IPython import sys from music21 import * import numpy as np from grammar import * from qa import * from preprocess import *  from music_utils import * from data_utils import * from keras.models import load_model, Model from keras.layers import Dense, Activation, Dropout, Input, LSTM, Reshape, Lambda, RepeatVector from keras.initializers import glorot_uniform from keras.utils import to_categorical from keras.optimizers import Adam from keras import backend as K 

1 - Problem statement

You would like to create a jazz music piece specially for a friend's birthday. However, you don't know any instruments or music composition. Fortunately, you know deep learning and will solve this problem using an LSTM netwok.

You will train a network to generate novel jazz solos in a style representative of a body of performed work.

1.1 - Dataset

You will train your algorithm on a corpus of Jazz music. Run the cell below to listen to a snippet of the audio from the training set:

In [ ]: 

 IPython.display.Audio('./data/30s_seq.mp3') 

We have taken care of the preprocessing of the musical data to render it in terms of musical "values." You can informally think of each "value" as a note, which comprises a pitch and a duration. For example, if you press down a specific piano key for 0.5 seconds, then you have just played a note. In music theory, a "value" is actually more complicated than this--specifically, it also captures the information needed to play multiple notes at the same time. For example, when playing a music piece, you might press down two piano keys at the same time (playng multiple notes at the same time generates what's called a "chord"). But we don't need to worry about the details of music theory for this assignment. For the purpose of this assignment, all you need to know is that we will obtain a dataset of values, and will learn an RNN model to generate sequences of values.

Our music generation system will use 78 unique values. Run the following code to load the raw music data and preprocess it into values. This might take a few minutes.

In [ ]: 

 X, Y, n_values, indices_values = load_music_utils() print('shape of X:', X.shape) print('number of training examples:', X.shape[0]) print('Tx (length of sequence):', X.shape[1]) print('total # of unique values:', n_values) print('Shape of Y:', Y.shape) 

You have just loaded the following:

X: This is an (m, Tx, 78) dimensional array. We have m training examples, each of which is a snippet of Tx=30 musical values. At each time step, the input is one of 78 different possible values, represented as a one-hot vector. Thus for example, X[i,t,:] is a one-hot vector representating the value of the i-th example at time t.
Y: This is essentially the same as X, but shifted one step to the left (to the past). Similar to the dinosaurus assignment, we're interested in the network using the previous values to predict the next value, so our sequence model will try to predict y⟨t⟩ given x⟨1⟩,…,x⟨t⟩. However, the data in Y is reordered to be dimension (Ty,m,78), where Ty=Tx. This format makes it more convenient to feed to the LSTM later.