深度学习作业L5W3(2):Improvise a Jazz Solo with an LSTM Network

2023-10-22 12:10

本文主要是介绍深度学习作业L5W3(2):Improvise a Jazz Solo with an LSTM Network,希望对大家解决编程问题提供一定的参考价值,需要的开发者们随着小编来一起学习吧!

构建一个识别触发语言的模型(siri)

数据构建

我们手里有:
背景音
正例音(符合trigger word)
反例音 (不符合trigger word)

我们把正例音和反例音随机插在背景音当中作为训练集数据,同时如下设置标签:在每个正例音结束之后,将一定时间段的y值设为1(增加1的个数提高学习效果)

判断时间段是否冲突

# GRADED FUNCTION: is_overlappingdef is_overlapping(segment_time, previous_segments):"""Checks if the time of a segment overlaps with the times of existing segments.Arguments:segment_time -- a tuple of (segment_start, segment_end) for the new segmentprevious_segments -- a list of tuples of (segment_start, segment_end) for the existing segmentsReturns:True if the time segment overlaps with any of the existing segments, False otherwise"""segment_start, segment_end = segment_time### START CODE HERE ### (≈ 4 line)# Step 1: Initialize overlap as a "False" flag. (≈ 1 line)overlap = False# Step 2: loop over the previous_segments start and end times.# Compare start/end times and set the flag to True if there is an overlap (≈ 3 lines)for previous_start, previous_end in previous_segments:if not(previous_start>segment_end or previous_end<segment_start):overlap = Truebreak### END CODE HERE ###return overlap

设置插入方法来构造训练集数据

# GRADED FUNCTION: insert_audio_clipdef insert_audio_clip(background, audio_clip, previous_segments):"""Insert a new audio segment over the background noise at a random time step, ensuring that the audio segment does not overlap with existing segments.Arguments:background -- a 10 second background audio recording.  audio_clip -- the audio clip to be inserted/overlaid. previous_segments -- times where audio segments have already been placedReturns:new_background -- the updated background audio"""# Get the duration of the audio clip in mssegment_ms = len(audio_clip)### START CODE HERE ### # Step 1: Use one of the helper functions to pick a random time segment onto which to insert # the new audio clip. (≈ 1 line)segment_time = get_random_time_segment(segment_ms)# Step 2: Check if the new segment_time overlaps with one of the previous_segments. If so, keep # picking new segment_time at random until it doesn't overlap. (≈ 2 lines)while is_overlapping(segment_time, previous_segments):segment_time = get_random_time_segment(segment_ms)# Step 3: Add the new segment_time to the list of previous_segments (≈ 1 line)previous_segments.append(segment_time)### END CODE HERE #### Step 4: Superpose audio segment and backgroundnew_background = background.overlay(audio_clip, position = segment_time[0])return new_background, segment_time

构建标签

# GRADED FUNCTION: insert_onesdef insert_ones(y, segment_end_ms):"""Update the label vector y. The labels of the 50 output steps strictly after the end of the segment should be set to 1. By strictly we mean that the label of segment_end_y should be 0 while, the50 followinf labels should be ones.Arguments:y -- numpy array of shape (1, Ty), the labels of the training examplesegment_end_ms -- the end time of the segment in msReturns:y -- updated labels"""# duration of the background (in terms of spectrogram time-steps)segment_end_y = int(segment_end_ms * Ty / 10000.0)# Add 1 to the correct index in the background label (y)### START CODE HERE ### (≈ 3 lines)for i in range(segment_end_y + 1, segment_end_y + 51):if i < Ty:y[0, i] = 1### END CODE HERE ###return y

构建带标签的数据

# GRADED FUNCTION: create_training_exampledef create_training_example(background, activates, negatives):"""Creates a training example with a given background, activates, and negatives.Arguments:background -- a 10 second background audio recordingactivates -- a list of audio segments of the word "activate"negatives -- a list of audio segments of random words that are not "activate"Returns:x -- the spectrogram of the training exampley -- the label at each time step of the spectrogram"""# Set the random seednp.random.seed(18)# Make background quieterbackground = background - 20### START CODE HERE #### Step 1: Initialize y (label vector) of zeros (≈ 1 line)y = np.zeros((1, Ty))# Step 2: Initialize segment times as empty list (≈ 1 line)previous_segments = []### END CODE HERE #### Select 0-4 random "activate" audio clips from the entire list of "activates" recordingsnumber_of_activates = np.random.randint(0, 5)random_indices = np.random.randint(len(activates), size=number_of_activates)random_activates = [activates[i] for i in random_indices]### START CODE HERE ### (≈ 3 lines)# Step 3: Loop over randomly selected "activate" clips and insert in backgroundfor random_activate in random_activates:# Insert the audio clip on the backgroundbackground, segment_time = insert_audio_clip(background, random_activate, previous_segments)# Retrieve segment_start and segment_end from segment_timesegment_start, segment_end = segment_time# Insert labels in "y"y = insert_ones(y, segment_end)### END CODE HERE #### Select 0-2 random negatives audio recordings from the entire list of "negatives" recordingsnumber_of_negatives = np.random.randint(0, 3)random_indices = np.random.randint(len(negatives), size=number_of_negatives)random_negatives = [negatives[i] for i in random_indices]### START CODE HERE ### (≈ 2 lines)# Step 4: Loop over randomly selected negative clips and insert in backgroundfor random_negative in random_negatives:# Insert the audio clip on the background background, _ = insert_audio_clip(background, random_negative, previous_segments)### END CODE HERE #### Standardize the volume of the audio clip background = match_target_amplitude(background, -20.0)# Export new training example file_handle = background.export("train" + ".wav", format="wav")print("File (train.wav) was saved in your directory.")# Get and plot spectrogram of the new recording (background with superposition of positive and negatives)x = graph_spectrogram("train.wav")return x, y

模型建立

接下来我们利用已经处理好的训练集数据进行训练
在这里插入图片描述
模型先是对输入进行一维卷积,随后通过两层GRU单元以及softmax得到输出

# GRADED FUNCTION: modeldef model(input_shape):"""Function creating the model's graph in Keras.Argument:input_shape -- shape of the model's input data (using Keras conventions)Returns:model -- Keras model instance"""X_input = Input(shape = input_shape)### START CODE HERE #### Step 1: CONV layer (≈4 lines)X = Conv1D(196, 15, strides=4)(X_input)                                 # CONV1DX = BatchNormalization()(X)                                 # Batch normalizationX = Activation('relu')(X)                                # ReLu activationX = Dropout(0.8)(X)                                 # dropout (use 0.8)# Step 2: First GRU Layer (≈4 lines)X = GRU(128, return_sequences=True)(X)                                 # GRU (use 128 units and return the sequences)X = Dropout(0.8)(X)                                  # dropout (use 0.8)X = BatchNormalization()(X)                                # Batch normalization# Step 3: Second GRU Layer (≈4 lines)X = GRU(128, return_sequences=True)(X)                                   # GRU (use 128 units and return the sequences)X = Dropout(0.8)(X)                                 # dropout (use 0.8)X = BatchNormalization()(X)                                # Batch normalizationX = Dropout(0.8)(X)                              # dropout (use 0.8)# Step 4: Time-distributed dense layer (≈1 line)X = TimeDistributed(Dense(1, activation = "sigmoid"))(X) # time distributed  (sigmoid)### END CODE HERE ###model = Model(inputs = X_input, outputs = X)return model  
model = model(input_shape = (Tx, n_freq))
model = load_model('./models/tr_model.h5')
opt = Adam(lr=0.0001, beta_1=0.9, beta_2=0.999, decay=0.01)
model.compile(loss='binary_crossentropy', optimizer=opt, metrics=["accuracy"])
model.fit(X, Y, batch_size = 5, epochs=1)

最终获得了不错的效果

这篇关于深度学习作业L5W3(2):Improvise a Jazz Solo with an LSTM Network的文章就介绍到这儿,希望我们推荐的文章对编程师们有所帮助!



http://www.chinasem.cn/article/261361

相关文章

深度解析Java DTO(最新推荐)

《深度解析JavaDTO(最新推荐)》DTO(DataTransferObject)是一种用于在不同层(如Controller层、Service层)之间传输数据的对象设计模式,其核心目的是封装数据,... 目录一、什么是DTO?DTO的核心特点:二、为什么需要DTO?(对比Entity)三、实际应用场景解析

深度解析Java项目中包和包之间的联系

《深度解析Java项目中包和包之间的联系》文章浏览阅读850次,点赞13次,收藏8次。本文详细介绍了Java分层架构中的几个关键包:DTO、Controller、Service和Mapper。_jav... 目录前言一、各大包1.DTO1.1、DTO的核心用途1.2. DTO与实体类(Entity)的区别1

深度解析Python装饰器常见用法与进阶技巧

《深度解析Python装饰器常见用法与进阶技巧》Python装饰器(Decorator)是提升代码可读性与复用性的强大工具,本文将深入解析Python装饰器的原理,常见用法,进阶技巧与最佳实践,希望可... 目录装饰器的基本原理函数装饰器的常见用法带参数的装饰器类装饰器与方法装饰器装饰器的嵌套与组合进阶技巧

深度解析Spring Boot拦截器Interceptor与过滤器Filter的区别与实战指南

《深度解析SpringBoot拦截器Interceptor与过滤器Filter的区别与实战指南》本文深度解析SpringBoot中拦截器与过滤器的区别,涵盖执行顺序、依赖关系、异常处理等核心差异,并... 目录Spring Boot拦截器(Interceptor)与过滤器(Filter)深度解析:区别、实现

深度解析Spring AOP @Aspect 原理、实战与最佳实践教程

《深度解析SpringAOP@Aspect原理、实战与最佳实践教程》文章系统讲解了SpringAOP核心概念、实现方式及原理,涵盖横切关注点分离、代理机制(JDK/CGLIB)、切入点类型、性能... 目录1. @ASPect 核心概念1.1 AOP 编程范式1.2 @Aspect 关键特性2. 完整代码实

SpringBoot开发中十大常见陷阱深度解析与避坑指南

《SpringBoot开发中十大常见陷阱深度解析与避坑指南》在SpringBoot的开发过程中,即使是经验丰富的开发者也难免会遇到各种棘手的问题,本文将针对SpringBoot开发中十大常见的“坑... 目录引言一、配置总出错?是不是同时用了.properties和.yml?二、换个位置配置就失效?搞清楚加

Go学习记录之runtime包深入解析

《Go学习记录之runtime包深入解析》Go语言runtime包管理运行时环境,涵盖goroutine调度、内存分配、垃圾回收、类型信息等核心功能,:本文主要介绍Go学习记录之runtime包的... 目录前言:一、runtime包内容学习1、作用:① Goroutine和并发控制:② 垃圾回收:③ 栈和

Python中文件读取操作漏洞深度解析与防护指南

《Python中文件读取操作漏洞深度解析与防护指南》在Web应用开发中,文件操作是最基础也最危险的功能之一,这篇文章将全面剖析Python环境中常见的文件读取漏洞类型,成因及防护方案,感兴趣的小伙伴可... 目录引言一、静态资源处理中的路径穿越漏洞1.1 典型漏洞场景1.2 os.path.join()的陷

Android学习总结之Java和kotlin区别超详细分析

《Android学习总结之Java和kotlin区别超详细分析》Java和Kotlin都是用于Android开发的编程语言,它们各自具有独特的特点和优势,:本文主要介绍Android学习总结之Ja... 目录一、空安全机制真题 1:Kotlin 如何解决 Java 的 NullPointerExceptio

Spring Boot拦截器Interceptor与过滤器Filter深度解析(区别、实现与实战指南)

《SpringBoot拦截器Interceptor与过滤器Filter深度解析(区别、实现与实战指南)》:本文主要介绍SpringBoot拦截器Interceptor与过滤器Filter深度解析... 目录Spring Boot拦截器(Interceptor)与过滤器(Filter)深度解析:区别、实现与实