一文梳理ChatTTS的进阶用法,手把手带你实现个性化配音,音色、语速、停顿,口语,全搞定

本文主要是介绍一文梳理ChatTTS的进阶用法,手把手带你实现个性化配音,音色、语速、停顿,口语,全搞定,希望对大家解决编程问题提供一定的参考价值,需要的开发者们随着小编来一起学习吧!

前几天和大家分享了如何从0到1搭建一套语音交互系统。

其中,语音合成(TTS)是提升用户体验的关键所在。于是,上一篇接着和大家聊了聊:全网爆火的AI语音合成工具-ChatTTS,有人已经拿它赚到了第一桶金,送增强版整合包。

后台有小伙伴反应实测中发现了一些常见的问题,今天,单独开一篇关于ChatTTS的进阶教程,手把手带你实现如何固定音色、设置语速、添加停顿词、口头语、笑声等,以及超长文本生成背后的原理

全程干货满满,赶紧上车~

全文目录

# ChatTTS 简介
# ChatTTS 基础-环境准备
# ChatTTS 进阶-固定音色
# ChatTTS 进阶-固定语速
# ChatTTS 进阶-添加停顿
## 添加停顿词
## 添加笑声
## 添加口头语
# ChatTTS 进阶-超长文本语音生成

ChatTTS 简介

GitHub 仓库地址:https://github.com/2noise/ChatTTS

ChatTTS 是一个文本转语音的开源项目,短短2周左右的时间,在 GitHub 上已经斩获了 24.4k 的 Star,绝对的爆火级项目。
在这里插入图片描述
还没有体验过 ChatTTS 的小伙伴,出门右转看这篇👉::全网爆火的AI语音合成工具-ChatTTS,有人已经拿它赚到了第一桶金,送增强版整合包。

下面我们直接进入实操,先准备好本地环境,然后开启进阶用法。

ChatTTS 基础-环境准备

1. 安装环境

在自己本地,打开一个终端:

git clone https://github.com/2noise/ChatTTS
cd ChatTTS
pip install -r requirements.txt

2.下载模型

最后先下载模型到本地(代码中自动下载非常慢)。

**方式一:**从huggingface下载(太慢了,很有可能失败,不推荐

# 先安装需要用到的包
pip install huggingface-hub
pip install hf_transfer
conda install -c conda-forge pynini=2.1.5 && pip install WeTextProcessing
pip install nemo_text_processing# 编写代码
import os
os.environ['HF_ENDPOINT']="https://hf-mirror.com"
os.environ['HF_HUB_ENABLE_HF_TRANSFER']="1"
from huggingface_hub import snapshot_downloadsnapshot_download(repo_id="2Noise/ChatTTS", # 模型名称 or 地址local_dir='../models/ChatTTS', # 保存模型到指定位置,供下次加载使用cache_dir='../models' # 缓存地址)

方式二:从modelscope上下载 (速度更快,强烈推荐

git clone https://www.modelscope.cn/henjicc/ChatTTS.git

3.小试牛刀-文本转语音

进行模型推理实现中文文本转语音,注意将 local_path 改为你自己下载好的模型地址。

import torch
import torchaudio
import ChatTTStorch._dynamo.config.cache_size_limit = 64
torch._dynamo.config.suppress_errors = True
torch.set_float32_matmul_precision('high')chat = ChatTTS.Chat()
# compile=True效果好,但是更慢
chat.load_models(source='local', local_path='/path/to/models/ChatTTS', compile=False)texts = ["你好啊,我是猴哥,欢迎您关注 猴哥的AI知识库 公众号,下次更新不迷路",]
wavs = chat.infer(texts)torchaudio.save("output.wav", torch.from_numpy(wavs[0]), 24000)

如果没有报错,上述代码会将生成结果保存在当前目录,打开试试吧~

你会发现每次生成的音频音色是不一样的,这是因为 ChatTTS 每次都会随机选择一个音色进行合成。

如果想要固定音色,以及更多个性化设置,继续往下看~

ChatTTS 进阶-固定音色

在ChatTTS中,控制音色的主要是参数 spk_emb,让我们看下源码:

def sample_random_speaker(self, ):dim = self.pretrain_models['gpt'].gpt.layers[0].mlp.gate_proj.in_features# dim=768std, mean = self.pretrain_models['spk_stat'].chunk(2)# std.size:(768),mean.size(768)return torch.randn(dim, device=std.device) * std + mean

sample_random_speaker 返回的是一个 768 维的向量,也就是说如果你把这个 768 维的向量固定住,就可以获取一个固定的音色了。

为了节省大家抽卡获取音色的时间,这里猴哥梳理了两个有特点的音色。保存下来,直接取用~

音色一:男-铿锵有力:

'3.281,2.916,2.316,2.280,-0.884,0.723,-10.224,-0.266,2.000,4.442,1.427,-1.075,-3.027,0.438,1.224,1.043,-0.423,-3.808,0.740,-4.704,10.824,-0.362,-0.291,10.761,1.843,6.068,-4.458,5.142,4.183,4.333,-4.859,-1.444,0.901,3.424,-0.992,-1.645,0.230,0.818,-2.294,1.417,2.839,-0.533,-2.302,-1.379,-3.898,0.998,-0.196,14.249,1.641,-6.927,-1.283,6.005,-0.125,1.531,3.420,8.003,-0.202,-2.823,4.435,2.378,5.186,4.951,1.945,-2.100,1.644,1.921,-1.138,6.229,-5.203,3.610,-0.919,-5.326,2.130,-4.196,21.073,-0.012,11.642,-4.142,-0.581,-0.138,0.882,-1.235,2.377,6.930,3.499,-0.239,4.493,0.411,-2.122,-7.575,0.809,0.885,0.008,0.900,9.930,-3.983,-0.462,-0.699,-4.674,5.071,-3.361,-2.448,0.742,-3.553,-6.620,-0.570,2.859,-3.424,8.462,2.013,-0.204,6.866,-1.946,0.946,-0.440,-4.125,1.194,-2.351,5.368,-5.819,0.049,-0.918,-1.812,-2.288,4.262,12.440,2.084,-0.156,-5.392,0.106,3.279,4.475,0.139,1.006,5.619,0.716,-5.918,-0.774,-5.805,-7.207,-0.402,0.228,-1.422,4.098,1.244,-1.481,-0.672,-3.778,1.220,-0.947,0.807,-1.658,5.977,7.529,-1.070,-2.112,-2.946,-1.739,-2.284,-1.045,-0.442,-1.813,1.269,-2.797,0.321,0.155,-0.379,-2.554,-1.409,2.421,0.318,4.154,0.647,0.610,-3.516,4.489,3.286,3.552,-0.358,1.132,-2.382,-0.612,-5.613,-0.168,-0.680,1.723,-20.408,-2.504,5.870,1.631,4.703,-0.359,0.367,-1.661,-2.172,-9.155,-3.521,-1.058,4.984,-4.086,-3.736,8.959,-1.759,-1.117,17.161,1.497,1.160,2.355,2.584,-2.312,-2.436,0.787,-4.522,5.730,-2.144,1.276,-5.419,-5.749,8.496,5.873,-0.472,-1.497,1.162,8.649,0.326,-0.714,4.123,-3.431,-1.579,0.016,-4.171,-15.907,0.607,1.909,-3.616,-0.803,-0.046,1.836,5.704,-0.730,3.511,7.312,-0.066,0.972,0.199,1.777,-3.750,0.089,18.306,1.156,-2.320,1.280,-2.491,-2.765,-19.535,-1.302,-8.004,-1.833,0.941,-3.547,-0.741,-1.860,-1.626,-2.555,-0.086,0.221,7.494,8.971,-2.007,-2.635,-2.494,-1.794,-5.080,4.963,-1.599,-1.862,-0.225,0.518,2.853,1.657,-0.257,-1.165,-7.366,2.681,5.232,0.592,-2.189,1.606,-1.407,-2.102,-2.945,2.853,8.901,0.029,-5.533,2.786,-0.961,2.544,1.200,-1.171,1.578,-1.681,-0.955,-2.784,2.281,-3.195,2.811,-1.718,0.810,-0.603,-1.736,-1.373,-0.891,-5.330,-1.381,9.614,4.155,0.047,-1.053,3.823,1.010,0.042,-2.849,1.482,0.798,19.538,-12.677,3.608,-1.660,4.493,2.087,-0.906,1.919,1.318,-1.815,-1.287,9.502,0.361,-4.073,-0.589,-1.339,5.567,1.668,0.479,0.333,-26.911,-3.391,-1.020,-4.389,-4.838,2.512,1.478,-3.802,-1.676,-0.469,0.898,3.632,-1.620,0.432,-1.906,-1.674,-7.568,0.361,1.675,-1.164,-13.232,-1.971,-3.983,-2.921,8.597,-0.231,-0.449,0.392,4.953,-5.841,0.047,4.419,-5.355,-4.528,-1.681,-8.136,0.295,3.060,-7.177,-1.483,1.266,2.680,2.265,-8.495,-2.133,1.485,-1.022,-3.286,-1.811,-1.127,-4.633,0.473,6.305,-4.399,-0.452,1.234,-0.245,4.392,2.415,-2.063,3.707,-4.495,3.831,29.219,-11.790,-0.259,1.078,-7.773,1.709,-1.397,-0.067,0.768,2.663,-1.454,-3.022,-1.810,5.015,1.889,1.119,-1.413,-3.240,4.198,14.864,-2.950,1.651,-3.228,-1.095,-2.941,1.231,-8.138,-1.936,-2.160,5.722,-5.481,0.544,-0.786,-1.140,-1.677,2.252,0.031,-2.385,-1.336,3.705,-3.270,-0.495,2.222,1.482,-6.423,-0.509,-1.519,5.775,0.420,-2.140,-3.850,2.567,8.821,-6.976,-6.974,-1.999,-3.226,-1.778,-6.607,9.911,-9.480,1.611,-0.308,-9.211,-1.476,1.751,-0.466,-2.363,-2.358,-4.866,-1.147,-3.444,-6.753,-0.507,-1.245,-2.413,-1.962,2.347,-0.094,-0.745,0.668,2.071,-0.755,-2.450,6.657,7.899,-6.277,2.189,-2.127,2.291,6.095,0.927,-0.250,-1.662,4.031,6.098,-17.201,-9.840,-0.729,1.427,3.222,-0.208,2.380,1.103,1.147,4.051,2.980,0.371,4.143,-1.133,0.486,-0.588,-0.392,-1.507,-0.803,-6.217,-0.947,-5.456,-9.972,-1.221,-1.677,0.376,-0.066,-0.987,-1.144,2.161,-0.282,1.924,1.364,10.975,-6.569,-5.145,-1.047,-1.040,-6.781,-0.387,-21.836,0.683,3.180,-0.757,-0.025,3.418,0.626,1.140,-4.496,-2.867,-2.973,10.865,2.019,-6.465,9.939,-0.107,-2.146,4.367,0.148,-1.867,1.160,-13.273,7.023,0.068,-1.266,-4.770,-1.669,-0.643,1.833,-0.005,-3.790,3.982,-4.962,-0.175,2.481,-0.260,-1.224,1.529,1.225,5.921,5.203,2.303,-2.092,-0.826,5.299,-0.106,0.513,-3.272,1.970,-2.140,-0.033,-3.860,2.144,2.398,-1.478,4.514,1.553,0.901,-2.297,4.627,-0.360,-0.540,0.280,4.952,-2.520,-3.369,19.334,-0.363,0.924,-0.099,-1.544,1.592,-0.163,6.088,-2.863,2.072,-0.336,-1.897,4.755,-3.300,-6.699,-3.852,0.276,-5.298,-0.012,-6.344,3.159,9.078,-2.398,-1.005,-5.604,1.775,4.505,1.172,-3.949,-1.302,5.698,-0.228,-8.495,-4.654,2.533,-4.065,0.098,0.392,2.624,0.928,-0.316,-0.798,-1.814,-3.183,-3.962,-2.339,-2.051,3.898,-1.971,-2.940,8.407,0.673,2.367,7.541,3.822,-6.822,-2.951,-2.204,10.548,-0.255,-1.850,1.263,1.779,1.912,0.608,-1.325,2.220,2.637,1.699,-8.412,-1.397,-0.422,-4.485,3.247,0.382,7.712,-2.262,-1.700,-0.506,7.307,7.266,-0.498,-6.427,-5.879,14.950,0.991,3.057,10.976,-3.662,-0.781,-0.486,7.302,1.052,-3.824,-9.337,-0.783,3.696,-1.935,1.468,-2.889,-1.950,-1.511,2.164,2.791,0.533,-0.986,0.150,1.650,0.909,1.949,-4.902,1.626,0.463,2.622,-1.272,2.164,2.691,-4.966,-0.418,-0.016,-8.490,-3.339,5.198,-3.525,2.392,0.445,1.293,3.217,6.489,1.553,2.088,-1.272,-1.077,3.093,-7.780,4.537,0.824,-1.956,1.215,0.527,2.576,-3.828,2.519,3.089,-4.081,-2.068,-0.241,0.574,-0.245,-1.646,-0.643,3.113,0.861,9.045,-4.593,5.160,3.660,-1.495'

音色二:女-铿锵有力:

'-4.741,0.419,-3.355,3.652,-1.682,-1.254,9.719,1.436,0.871,12.334,-0.175,-2.653,-3.132,0.525,1.573,-0.351,0.030,-3.154,0.935,-0.111,-6.306,-1.840,-0.818,9.773,-1.842,-3.433,-6.200,-4.311,1.162,1.023,11.552,2.769,-2.408,-1.494,-1.143,12.412,0.832,-1.203,5.425,-1.481,0.737,-1.487,6.381,5.821,0.599,6.186,5.379,-2.141,0.697,5.005,-4.944,0.840,-4.974,0.531,-0.679,2.237,4.360,0.438,2.029,1.647,-2.247,-1.716,6.338,1.922,0.731,-2.077,0.707,4.959,-1.969,5.641,2.392,-0.953,0.574,1.061,-9.335,0.658,-0.466,4.813,1.383,-0.907,5.417,-7.383,-3.272,-1.727,2.056,1.996,2.313,-0.492,3.373,0.844,-8.175,-0.558,0.735,-0.921,8.387,-7.800,0.775,1.629,-6.029,0.709,-2.767,-0.534,2.035,2.396,2.278,2.584,3.040,-6.845,7.649,-2.812,-1.958,8.794,2.551,3.977,0.076,-2.073,-4.160,0.806,3.798,-1.968,-4.690,5.702,-4.376,-2.396,1.368,-0.707,4.930,6.926,1.655,4.423,-1.482,-3.670,2.988,-3.296,0.767,3.306,1.623,-3.604,-2.182,-1.480,-2.661,-1.515,-2.546,3.455,-3.500,-3.163,-1.376,-12.772,1.931,4.422,6.434,-0.386,-0.704,-2.720,2.177,-0.666,12.417,4.228,0.823,-1.740,1.285,-2.173,-4.285,-6.220,2.479,3.135,-2.790,1.395,0.946,-0.052,9.148,-2.802,-5.604,-1.884,1.796,-0.391,-1.499,0.661,-2.691,0.680,0.848,3.765,0.092,7.978,3.023,2.450,-15.073,5.077,3.269,2.715,-0.862,2.187,13.048,-7.028,-1.602,-6.784,-3.143,-1.703,1.001,-2.883,0.818,-4.012,4.455,-1.545,-14.483,-1.008,-3.995,2.366,3.961,1.254,-0.458,-1.175,2.027,1.830,2.682,0.131,-1.839,-28.123,-1.482,16.475,2.328,-13.377,-0.980,9.557,0.870,-3.266,-3.214,3.577,2.059,1.676,-0.621,-6.370,-2.842,0.054,-0.059,-3.179,3.182,3.411,4.419,-1.688,-0.663,-5.189,-5.542,-1.146,2.676,2.224,-5.519,6.069,24.349,2.509,4.799,0.024,-2.849,-1.192,-16.989,1.845,6.337,-1.936,-0.585,1.691,-3.564,0.931,0.223,4.314,-2.609,0.544,-1.931,3.604,1.248,-0.852,2.991,-1.499,-3.836,1.774,-0.744,0.824,7.597,-1.538,-0.009,0.494,-2.253,-1.293,-0.475,-3.816,8.165,0.285,-3.348,3.599,-4.959,-1.498,-1.492,-0.867,0.421,-2.191,-1.627,6.027,3.667,-21.459,2.594,-2.997,5.076,0.197,-3.305,3.998,1.642,-6.221,3.177,-3.344,5.457,0.671,-2.765,-0.447,1.080,2.504,1.809,1.144,2.752,0.081,-3.700,0.215,-2.199,3.647,1.977,1.326,3.086,34.789,-1.017,-14.257,-3.121,-0.568,-0.316,11.455,0.625,-6.517,-0.244,-8.490,9.220,0.068,-2.253,-1.485,3.372,2.002,-3.357,3.394,1.879,16.467,-2.271,1.377,-0.611,-5.875,1.004,12.487,2.204,0.115,-4.908,-6.992,-1.821,0.211,0.540,1.239,-2.488,-0.411,2.132,2.130,0.984,-10.669,-7.456,0.624,-0.357,7.948,2.150,-2.052,3.772,-4.367,-11.910,-2.094,3.987,-1.565,0.618,1.152,1.308,-0.807,1.212,-4.476,0.024,-6.449,-0.236,5.085,1.265,-0.586,-2.313,3.642,-0.766,3.626,6.524,-1.686,-2.524,-0.985,-6.501,-2.558,0.487,-0.662,-1.734,0.275,-9.230,-3.785,3.031,1.264,15.340,2.094,1.997,0.408,9.130,0.578,-2.239,-1.493,11.034,2.201,6.757,3.432,-4.133,-3.668,2.099,-6.798,-0.102,2.348,6.910,17.910,-0.779,4.389,1.432,-0.649,5.115,-1.064,3.580,4.129,-4.289,-2.387,-0.327,-1.975,-0.892,5.327,-3.908,3.639,-8.247,-1.876,-10.866,2.139,-3.932,-0.031,-1.444,0.567,-5.543,-2.906,1.399,-0.107,-3.044,-4.660,-1.235,-1.011,9.577,2.294,6.615,-1.279,-2.159,-3.050,-6.493,-7.282,-8.546,5.393,2.050,10.068,3.494,8.810,2.820,3.063,0.603,1.965,2.896,-3.049,7.106,-0.224,-1.016,2.531,-0.902,1.436,-1.843,1.129,6.746,-2.184,0.801,-0.965,-7.555,-18.409,6.176,-3.706,2.261,4.158,-0.928,2.164,-3.248,-4.892,-0.008,-0.521,7.931,-10.693,4.320,-0.841,4.446,-1.591,-0.702,4.075,3.323,-3.406,-1.198,-5.518,-0.036,-2.247,-2.638,2.160,-9.644,-3.858,2.402,-2.640,1.683,-0.961,-3.076,0.226,5.106,0.712,0.669,2.539,-4.340,-0.892,0.732,0.775,-2.757,4.365,-2.368,5.368,0.342,-0.655,0.240,0.775,3.686,-4.008,16.296,4.973,1.851,4.747,0.652,-2.117,6.470,2.189,-8.467,3.236,3.745,-1.332,3.583,-2.504,5.596,-2.440,0.995,-2.267,-3.322,3.490,1.156,1.716,0.669,-3.640,-1.709,5.055,6.265,-3.963,2.863,14.129,5.180,-3.590,0.393,0.234,-3.978,6.946,-0.521,1.925,-1.497,-0.283,0.895,-3.969,5.338,-1.808,-3.578,2.699,2.728,-0.895,-2.175,-2.717,2.574,4.571,1.131,2.187,3.620,-0.388,-3.685,0.979,2.731,-2.164,1.628,-1.006,-7.766,-11.033,-10.985,-2.413,-1.967,0.790,0.826,-1.623,-1.783,3.021,1.598,-0.931,-0.605,-1.684,1.408,-2.771,-2.354,5.564,-2.296,-4.774,-2.830,-5.149,2.731,-3.314,-1.002,3.522,3.235,-1.598,1.923,-2.755,-3.900,-3.519,-1.673,-2.049,-10.404,6.773,1.071,0.247,1.120,-0.794,2.187,-0.189,-5.591,4.361,1.772,1.067,1.895,-5.649,0.946,-2.834,-0.082,3.295,-7.659,-0.128,2.077,-1.638,0.301,-0.974,4.331,11.711,4.199,1.545,-3.236,-4.404,-1.333,0.623,1.414,-0.240,-0.816,-0.808,-1.382,0.632,-5.238,0.120,10.634,-2.026,1.702,-0.469,1.252,1.173,3.015,-8.798,1.633,-5.323,2.149,-6.481,11.635,3.072,5.642,5.252,4.702,-3.523,-0.594,4.150,1.392,0.554,-4.377,3.646,-0.884,1.468,0.779,2.372,-0.101,-5.702,0.539,-0.440,5.149,-0.011,-1.899,-1.349,-0.355,0.076,-0.100,-0.004,5.346,6.276,0.966,-3.138,-2.633,-3.124,3.606,-3.793,-3.332,2.359,-0.739,-3.301,-2.775,-0.491,3.283,-1.394,-1.883,1.203,1.097,2.233,2.170,-2.980,-15.800,-6.791,-0.175,-4.600,-3.840,-4.179,6.568,5.935,-0.431,4.623,4.601,-1.726,0.410,2.591,4.016,8.169,1.763,-3.058,-1.340,6.276,4.682,-0.089,1.301,-4.817'

将音色向量放到代码中,就可以实现生成固定音色的音频,代码奉上:

speaker_vector = ' xxx ' # 768维向量
speaker = torch.tensor([float(x) for x in speaker_vector.split(',')])
my_text = """你好啊,我是猴哥,欢迎您关注 猴哥的AI知识库 公众号,下次更新不迷路"""
params_infer_code = {'prompt':'[speed_2]', 'temperature':.1,'top_P':0.9, 'top_K':1, 'spk_emb' : speaker}
params_refine_text = {'prompt':'[oral_0][laugh_0][break_6]'}wavs = chat.infer([my_text],do_text_normalization=True,params_refine_text=params_refine_text,params_infer_code=params_infer_code)torchaudio.save("output.wav", torch.from_numpy(wavs[0]), 24000)

ChatTTS 进阶-固定语速

有心的同学可能已经发现了,上述代码中多了两个参数。

没错,ChatTTS 中,语速是 speed 参数实现的,设置播放的语速,从慢到快,共有10个等级[speed_0]~[speed_9]。

params_infer_code = {'prompt':'[speed_2]'}

ChatTTS 进阶-添加停顿

ChatTTS 中,主要的停顿有三种,分别是:

  • 停顿词
  • 笑声
  • 口头语

添加停顿词

在ChatTTS中,停顿词主要有10个等级 [break_0]~[break_9] 。其中[break_0]是完全没有停顿,[break_9] 停顿最长。当然,模型会根据文本内容自行添加停顿,你也可以在文本中手动添加停顿 [uv_break]。

如果需要 ChatTTS 模型自动添加停顿,只需加上参数 refine_text_only=True

my_text="""不跟生活认输,不向岁月低头。如果风景不够好,就让我们多走几步。如果道路不好走,就让我们放慢脚步。只要还在路上,总能看到想要的风景,总能找到属于自己的方向。"""
params_infer_code = {'prompt':'[speed_4]', 'temperature':.1,'top_P':0.9, 'top_K':1, 'spk_emb' : girl_speaker}
params_refine_text = {'prompt':'[oral_0][laugh_0][break_6]'}
text = chat.infer([my_text], skip_refine_text=False,refine_text_only=True,params_refine_text=params_refine_text,params_infer_code=params_infer_code)
print("\n".join(text))

输出效果如下:

'不 跟 生 活 认 输 , 不 向 岁 月 低 头 [uv_break] 。 如 果 风 景 不 够 好 [uv_break] , 就 让 我 们 多 走 几 步 。 如 果 道 路 不 好 走 , 就 让 我 们 放 慢 脚 步 [uv_break] 。 只 要 还 在 路 上 , 总 能 看 到 想 要 的 风 景 [uv_break] , 总 能 找 到 属 于 自 己 的 方 向 。'

如果需要自行添加停顿,只需在 文本中添加 [uv_break]

示例如下:

my_text="不跟生活[uv_break]认输,不向岁月低头[laugh]。"

添加笑声

在ChatTTS中,添加笑声主要是通过 [laugh], 共有10个等级[laugh_0]~[laugh_9]。其中[laugh_0]是没有笑声,[laugh_9]笑声较多。当然,模型会根据文本自动添加笑声,也可以像上面的示例一样手动添加 [laugh].

添加口头语

口头语,就是:“那么,就是,嗯” 等语气词。

在ChatTTS中,添加口头语主要是通过 [oral],共有10个等级[oral_0]~[oral_9]。其中[oral-0]基本不添加口头语,[oral_9]添加最多口头语。

举个例子而言,你可以将下面这个参数放入模型试试看哈:

params_refine_text = {'prompt':'[oral_9][laugh_9][break_6]'}

ChatTTS 进阶-超长文本语音生成

目前 ChatTTS ,受模型限制,一次性只支持 30 秒以内的语音生成,当你一次生成文本过多时,模型只会截取前30秒的,之后的文本将被截断。

而要实现超长文本语音生成,只能通过过将文本拆分成多个句子,然后在合并音频的方式。

比如按行拆分,代码实现如下:

text = """如果本文对你有帮助,欢迎点赞收藏备用!猴哥一直在做 AI 领域的研发和探索,会陆续跟大家分享路上的思考和心得。最近开始运营一个公众号,旨在分享关于AI效率工具、自媒体副业的一切。用心做内容,不辜负每一份关注。新朋友欢迎关注 “猴哥的AI知识库” 公众号,下次更新不迷路。"""
params_infer_code = {'prompt':'[speed_2]', 'temperature':.1,'top_P':0.9, 'top_K':1, 'spk_emb' : speaker}
params_refine_text = {'prompt':'[oral_6][laugh_3][break_6]'}
# 将文本按照句号拆分为句子
inputs_text = text.split("\n")
wavs = chat.infer(inputs_text,do_text_normalization=True,params_refine_text=params_refine_text,params_infer_code=params_infer_code)
# 合并音频
finally_wavs = torch.tensor(np.concatenate(wavs,axis=-1))
# 将输出的语音保存为音频文件
torchaudio.save("output.wav",finally_wavs, 24000)

写在最后

这是关于 ChatTTS 的第三篇文章,希望这些进阶用法,可以帮你轻松打造更个性化的 AI 语音助手。

如果本文对你有帮助,欢迎 点赞收藏 备用!

猴哥一直在做 AI 领域的研发和探索,会陆续跟大家分享路上的思考和心得。

最近开始运营一个公众号,旨在分享关于AI效率工具、自媒体副业的一切。用心做内容,不辜负每一份关注。

新朋友欢迎关注 “猴哥的AI知识库” 公众号,下次更新不迷路。

这篇关于一文梳理ChatTTS的进阶用法,手把手带你实现个性化配音,音色、语速、停顿,口语,全搞定的文章就介绍到这儿,希望我们推荐的文章对编程师们有所帮助!



http://www.chinasem.cn/article/1072044

相关文章

Python xmltodict实现简化XML数据处理

《Pythonxmltodict实现简化XML数据处理》Python社区为提供了xmltodict库,它专为简化XML与Python数据结构的转换而设计,本文主要来为大家介绍一下如何使用xmltod... 目录一、引言二、XMLtodict介绍设计理念适用场景三、功能参数与属性1、parse函数2、unpa

C#实现获得某个枚举的所有名称

《C#实现获得某个枚举的所有名称》这篇文章主要为大家详细介绍了C#如何实现获得某个枚举的所有名称,文中的示例代码讲解详细,具有一定的借鉴价值,有需要的小伙伴可以参考一下... C#中获得某个枚举的所有名称using System;using System.Collections.Generic;usi

Go语言实现将中文转化为拼音功能

《Go语言实现将中文转化为拼音功能》这篇文章主要为大家详细介绍了Go语言中如何实现将中文转化为拼音功能,文中的示例代码讲解详细,感兴趣的小伙伴可以跟随小编一起学习一下... 有这么一个需求:新用户入职 创建一系列账号比较麻烦,打算通过接口传入姓名进行初始化。想把姓名转化成拼音。因为有些账号即需要中文也需要英

C# 读写ini文件操作实现

《C#读写ini文件操作实现》本文主要介绍了C#读写ini文件操作实现,文中通过示例代码介绍的非常详细,对大家的学习或者工作具有一定的参考学习价值,需要的朋友们下面随着小编来一起学习学习吧... 目录一、INI文件结构二、读取INI文件中的数据在C#应用程序中,常将INI文件作为配置文件,用于存储应用程序的

C#实现获取电脑中的端口号和硬件信息

《C#实现获取电脑中的端口号和硬件信息》这篇文章主要为大家详细介绍了C#实现获取电脑中的端口号和硬件信息的相关方法,文中的示例代码讲解详细,有需要的小伙伴可以参考一下... 我们经常在使用一个串口软件的时候,发现软件中的端口号并不是普通的COM1,而是带有硬件信息的。那么如果我们使用C#编写软件时候,如

Python使用qrcode库实现生成二维码的操作指南

《Python使用qrcode库实现生成二维码的操作指南》二维码是一种广泛使用的二维条码,因其高效的数据存储能力和易于扫描的特点,广泛应用于支付、身份验证、营销推广等领域,Pythonqrcode库是... 目录一、安装 python qrcode 库二、基本使用方法1. 生成简单二维码2. 生成带 Log

一文带你理解Python中import机制与importlib的妙用

《一文带你理解Python中import机制与importlib的妙用》在Python编程的世界里,import语句是开发者最常用的工具之一,它就像一把钥匙,打开了通往各种功能和库的大门,下面就跟随小... 目录一、python import机制概述1.1 import语句的基本用法1.2 模块缓存机制1.

Go语言使用Buffer实现高性能处理字节和字符

《Go语言使用Buffer实现高性能处理字节和字符》在Go中,bytes.Buffer是一个非常高效的类型,用于处理字节数据的读写操作,本文将详细介绍一下如何使用Buffer实现高性能处理字节和... 目录1. bytes.Buffer 的基本用法1.1. 创建和初始化 Buffer1.2. 使用 Writ

基于WinForm+Halcon实现图像缩放与交互功能

《基于WinForm+Halcon实现图像缩放与交互功能》本文主要讲述在WinForm中结合Halcon实现图像缩放、平移及实时显示灰度值等交互功能,包括初始化窗口的不同方式,以及通过特定事件添加相应... 目录前言初始化窗口添加图像缩放功能添加图像平移功能添加实时显示灰度值功能示例代码总结最后前言本文将

Redis延迟队列的实现示例

《Redis延迟队列的实现示例》Redis延迟队列是一种使用Redis实现的消息队列,本文主要介绍了Redis延迟队列的实现示例,文中通过示例代码介绍的非常详细,对大家的学习或者工作具有一定的参考学习... 目录一、什么是 Redis 延迟队列二、实现原理三、Java 代码示例四、注意事项五、使用 Redi