笔记摘录:2018.09.01---Kaldi构建一个简单的英文数字串识别系统

本文主要是介绍笔记摘录:2018.09.01---Kaldi构建一个简单的英文数字串识别系统,希望对大家解决编程问题提供一定的参考价值,需要的开发者们随着小编来一起学习吧!

本文主要参考的是 kaldi-asr.org,主要讲述的是用自己的录音来构建一个数字串识别系统。

本文将主要分为以下几个部分:

录制语音

这里是英文数字串识别,因此需要一些用英语朗读数字的语音。我录制了 128 个语音文件,分别是两个人朗读,其中每个文件只包含三个数字。这 128 文件中 80 个用于训练, 48 个用于测试。并且训练数据和测试数据都被分成了 8 部分(可以假装成 8 个人),每部分分别 10 个 和 6 个。读者可以到我的 GitHub 上下载这些语音数据。训练集和测试集的前五个目录是我(男)朗读,后面三个是女士朗读的。

我的录音是 16 kHz 采样,16 位量化的。如果是 8 kHz 采样,需要在提特征时指明采样频率。即将 steps/make_mfcc.sh 里面 compute-mfcc-feats 命令加上 --sample-frequency=8000 的选项。

目录结构如下:

当然您也可以选择自己录音,也欢迎你把你的录音共享。

从数据准备到编写脚本一直都是在为最后做准备,如果都完成了的话,你的目录应该长这样:

数据准备

数据的准备包括声学数据和语言数据。首先建立一个目录 data/

声学数据

声学数据主要包括训练集和测试集的数据,以及一个语料库,在 data/ 目录下新建两个目录:train/test/local

声学训练数据准备

这些数据主要是四个文件:spk2genderwav.scptextutt2spk,都保存在 data/train/ 目录下。

spk2gender

这个文件主要描述说话人编号和性别的对应关系,具有这样的形式:<speakerID> <gender>,比如我这里训练集是“8”个人,前 5 个男(Male),后 3 个女(Female)。比如:

wav.scp

这个文件主要是保存了语音编号和语音文件的对应关系,具有这样的形式:<uterranceID> <full_path_to_audio_file>。比如:

text

这个文件主要是标注,描述了语音编号和标注之间的对应关系,具有这样的形式:<utterranceID> <text_transcription>。比如

utt2spk

这个文件是联系说话人编号和语音编号,具有这样的形式:<uterranceID> <speakerID>。比如:

这里我们建议把说话人的编号放在语音编号的前缀。至此,训练集的声学数据准备好了。

声学测试数据准备

和训练集数据一样,这些数据也主要是四个文件:spk2genderwav.scptextutt2spk,都保存在 data/test/ 目录下。这些目录的形式和之前一样,只是 wav.scp 需要映射到测试集的数据,重新修改语音编号。

语料库

我们还需要一个语料库,其包含了所有训练集数据的标注,请命名为 corpus.txt,并保存在 data/local/ 目录下。比如:

语言数据

语言数据主要包括 lexicon.txtoptional_silence.txtnonsilence_phones.txtsilence_phones.txt,并在 data/local/ 目录下新建文件夹 dict,将这四个文件保存到那里。

lexicon.txt

这个文件应该包括你标注里所有出现的词的发音,即音素表达,由于这里只有十个单词,再加上静音,因此不管是词还是音素,数量都比较少。

nonsilence_phones.txt

这个文件列出了上面出现的所有的非静音音素。

silence_phones.txt

这里面包含了静音音素。

optional_silence.txt

只有可选的静音音素。

环境准备

回到项目根目录。首先我们需要定义文件 cmd.shpath.sh,其中前者主要包括运行的形式,而后者主要包括 kaldi 依赖的路径。

cmd.sh

export train_cmd="run.pl"
export decode_cmd="run.pl"
export mkgraph_cmd="run.pl"

path.sh

export train_cmd="run.pl"
export decode_cmd="run.pl"
export cuda_cmd="run.pl"
export mkgraph_cmd="run.pl"
Huang-Lus-MacBook-Air:en huanglu$ cat path.sh 
export KALDI_ROOT=`pwd`/../../..
[ -f $KALDI_ROOT/tools/env.sh ] && . $KALDI_ROOT/tools/env.sh
export PATH=$PWD/utils/:$KALDI_ROOT/tools/openfst/bin:$KALDI_ROOT/tools/irstlm/bin/:$PWD:$PATH
[ ! -f $KALDI_ROOT/tools/config/common_path.sh ] && echo >&2 "The standard file $KALDI_ROOT/tools/config/common_path.sh is not present -> Exit!" && exit 1
. $KALDI_ROOT/tools/config/common_path.sh
export LC_ALL=C
Huang-Lus-MacBoo

此外我们还需要一些 kaldi 已经写好的脚本,比如 steps/decode.shutils/mkgraph.sh,简单起见,我们直接这两个文件夹链接过来。

$ ln -s ../../timit/s5/steps steps
$ ln -s ../../timit/s5/utils utils

然后就是为了打分以及处理提特征时采样率非 16 kHz 时,需要在根目录下建立 local 文件夹,然后从别的那里拷贝来 make_mfcc.shscore.sh

最后就是建立目录 conf,存放一些配置文件。包括:

decode.config

first_beam=10.0
beam=13.0
lattice_beam=6.0

mfcc.conf

--use-energy=false

编写运行脚本

在根目录下建立一个 run.sh 脚本,输入以下内容(有注释,我不解释了):

#!/bin/bash. ./path.sh || exit 1
. ./cmd.sh || exit 1nj=4
lm_order=1. utils/parse_options.sh || exit 1
[[ $# -ge 1 ]] && { echo "Wrong arguments!"; exit 1; }# Removing previously created data (from last run.sh execution)
rm -rf exp mfcc data/train/spk2utt data/train/cmvn.scp data/train/feats.scp \data/train/split1 data/test/spk2utt data/test/cmvn.scp data/test/feats.scp \data/test/split1 data/local/lang data/lang data/local/tmp \data/local/dict/lexiconp.txtecho
echo "===== PREPARING ACOUSTIC DATA ====="
echo# Making spk2utt files
utils/utt2spk_to_spk2utt.pl data/train/utt2spk > data/train/spk2utt
utils/utt2spk_to_spk2utt.pl data/test/utt2spk > data/test/spk2uttecho
echo "===== FEATURES EXTRACTION ====="
echo# Making feats.scp files
mfccdir=mfcc
# Uncomment and modify arguments in scripts below if you have any problems with data sorting
# utils/validate_data_dir.sh data/train     # script for checking prepared data - here: for data/train directory
# utils/validate_data_dir.sh data/test
# utils/fix_data_dir.sh data/train          # tool for data proper sorting if needed - here: for data/train directory
# utils/fix_data_dir.sh data/teststeps/make_mfcc.sh --nj $nj --cmd "$train_cmd" data/train \exp/make_mfcc/train $mfccdir
steps/make_mfcc.sh --nj $nj --cmd "$train_cmd" data/test \exp/make_mfcc/test $mfccdir# Making cmvn.scp files
steps/compute_cmvn_stats.sh data/train exp/make_mfcc/train $mfccdir
steps/compute_cmvn_stats.sh data/test exp/make_mfcc/test $mfccdirecho
echo "===== PREPARING LANGUAGE DATA ====="
echo
# Preparing language data
utils/prepare_lang.sh data/local/dict "<UNK>" data/local/lang data/langecho
echo "===== LANGUAGE MODEL CREATION ====="
echo "===== MAKING lm.arpa ====="
echoloc=`which ngram-count`;
if [ -z $loc ]; thenif uname -a | grep 64 >/dev/null; thensdir=$KALDI_ROOT/tools/srilm/bin/i686-m64elsesdir=$KALDI_ROOT/tools/srilm/bin/i686fiif [ -f $sdir/ngram-count ]; thenecho "Using SRILM language modelling tool from $sdir"export PATH=$PATH:$sdirelseecho "SRILM toolkit is probably not installed. Instructions: tools/install_srilm.sh"exit 1fi
filocal=data/local
mkdir $local/tmp
ngram-count -order $lm_order -write-vocab $local/tmp/vocab-full.txt \-wbdiscount -text $local/corpus.txt -lm $local/tmp/lm.arpaecho
echo "===== MAKING G.fst ====="
echolang=data/lang
arpa2fst --disambig-symbol=#0 --read-symbol-table=$lang/words.txt \$local/tmp/lm.arpa $lang/G.fstecho
echo "===== MONO TRAINING ====="
echosteps/train_mono.sh --nj $nj --cmd "$train_cmd" data/train data/lang exp/mono  || exit 1echo
echo "===== MONO DECODING ====="
echoutils/mkgraph.sh --mono data/lang exp/mono exp/mono/graph || exit 1
steps/decode.sh --config conf/decode.config --nj $nj --cmd "$decode_cmd" \exp/mono/graph data/test exp/mono/decode
local/score.sh data/test data/lang exp/mono/decode/echo
echo "===== MONO ALIGNMENT ====="
echosteps/align_si.sh --nj $nj --cmd "$train_cmd" data/train data/lang exp/mono exp/mono_ali || exit 1echo
echo "===== TRI1 (first triphone pass) TRAINING ====="
echosteps/train_deltas.sh --cmd "$train_cmd" 2000 11000 data/train data/lang exp/mono_ali exp/tri1 || exit 1echo
echo "===== TRI1 (first triphone pass) DECODING ====="
echoutils/mkgraph.sh data/lang exp/tri1 exp/tri1/graph || exit 1
steps/decode.sh --config conf/decode.config --nj $nj --cmd "$decode_cmd" \exp/tri1/graph data/test exp/tri1/decode
local/score.sh data/test data/lang exp/tri1/decode/echo
echo "===== run.sh script is finished ====="
echo

运行脚本

最后在运行 run.sh 之前先进行下面的命令:

$ chmod u+x local/*.sh *.sh

运行结果(省去了部分重复的):

$ ./run.sh ===== PREPARING ACOUSTIC DATA ========== FEATURES EXTRACTION =====steps/make_mfcc.sh --nj 4 --cmd run.pl data/train exp/make_mfcc/train mfcc
utils/validate_data_dir.sh: Successfully validated data-directory data/train
steps/make_mfcc.sh: [info]: no segments file exists: assuming wav.scp indexed by utterance.
Succeeded creating MFCC features for train
steps/make_mfcc.sh --nj 4 --cmd run.pl data/test exp/make_mfcc/test mfcc
utils/validate_data_dir.sh: Successfully validated data-directory data/test
steps/make_mfcc.sh: [info]: no segments file exists: assuming wav.scp indexed by utterance.
Succeeded creating MFCC features for test
steps/compute_cmvn_stats.sh data/train exp/make_mfcc/train mfcc
Succeeded creating CMVN stats for train
steps/compute_cmvn_stats.sh data/test exp/make_mfcc/test mfcc
Succeeded creating CMVN stats for test===== PREPARING LANGUAGE DATA =====utils/prepare_lang.sh data/local/dict <UNK> data/local/lang data/lang
Checking data/local/dict/silence_phones.txt ...
--> reading data/local/dict/silence_phones.txt
--> data/local/dict/silence_phones.txt is OKChecking data/local/dict/optional_silence.txt ...
--> reading data/local/dict/optional_silence.txt
--> data/local/dict/optional_silence.txt is OKChecking data/local/dict/nonsilence_phones.txt ...
--> reading data/local/dict/nonsilence_phones.txt
--> data/local/dict/nonsilence_phones.txt is OKChecking disjoint: silence_phones.txt, nonsilence_phones.txt
--> disjoint property is OK.Checking data/local/dict/lexicon.txt
--> reading data/local/dict/lexicon.txt
--> data/local/dict/lexicon.txt is OKChecking data/local/dict/extra_questions.txt ...
--> data/local/dict/extra_questions.txt is empty (this is OK)
--> SUCCESS [validating dictionary directory data/local/dict]**Creating data/local/dict/lexiconp.txt from data/local/dict/lexicon.txt
fstaddselfloops data/lang/phones/wdisambig_phones.int data/lang/phones/wdisambig_words.int 
prepare_lang.sh: validating output directory
utils/validate_lang.pl data/lang
Checking data/lang/phones.txt ...
--> data/lang/phones.txt is OKChecking words.txt: #0 ...
--> data/lang/words.txt is OKChecking disjoint: silence.txt, nonsilence.txt, disambig.txt ...
--> silence.txt and nonsilence.txt are disjoint
--> silence.txt and disambig.txt are disjoint
--> disambig.txt and nonsilence.txt are disjoint
--> disjoint property is OKChecking sumation: silence.txt, nonsilence.txt, disambig.txt ...
--> summation property is OKChecking data/lang/phones/context_indep.{txt, int, csl} ...
--> 10 entry/entries in data/lang/phones/context_indep.txt
--> data/lang/phones/context_indep.int corresponds to data/lang/phones/context_indep.txt
--> data/lang/phones/context_indep.csl corresponds to data/lang/phones/context_indep.txt
--> data/lang/phones/context_indep.{txt, int, csl} are OKChecking data/lang/phones/nonsilence.{txt, int, csl} ...
--> 80 entry/entries in data/lang/phones/nonsilence.txt
--> data/lang/phones/nonsilence.int corresponds to data/lang/phones/nonsilence.txt
--> data/lang/phones/nonsilence.csl corresponds to data/lang/phones/nonsilence.txt
--> data/lang/phones/nonsilence.{txt, int, csl} are OKChecking data/lang/phones/silence.{txt, int, csl} ...
--> 10 entry/entries in data/lang/phones/silence.txt
--> data/lang/phones/silence.int corresponds to data/lang/phones/silence.txt
--> data/lang/phones/silence.csl corresponds to data/lang/phones/silence.txt
--> data/lang/phones/silence.{txt, int, csl} are OKChecking data/lang/phones/optional_silence.{txt, int, csl} ...
--> 1 entry/entries in data/lang/phones/optional_silence.txt
--> data/lang/phones/optional_silence.int corresponds to data/lang/phones/optional_silence.txt
--> data/lang/phones/optional_silence.csl corresponds to data/lang/phones/optional_silence.txt
--> data/lang/phones/optional_silence.{txt, int, csl} are OKChecking data/lang/phones/disambig.{txt, int, csl} ...
--> 2 entry/entries in data/lang/phones/disambig.txt
--> data/lang/phones/disambig.int corresponds to data/lang/phones/disambig.txt
--> data/lang/phones/disambig.csl corresponds to data/lang/phones/disambig.txt
--> data/lang/phones/disambig.{txt, int, csl} are OKChecking data/lang/phones/roots.{txt, int} ...
--> 22 entry/entries in data/lang/phones/roots.txt
--> data/lang/phones/roots.int corresponds to data/lang/phones/roots.txt
--> data/lang/phones/roots.{txt, int} are OKChecking data/lang/phones/sets.{txt, int} ...
--> 22 entry/entries in data/lang/phones/sets.txt
--> data/lang/phones/sets.int corresponds to data/lang/phones/sets.txt
--> data/lang/phones/sets.{txt, int} are OKChecking data/lang/phones/extra_questions.{txt, int} ...
--> 9 entry/entries in data/lang/phones/extra_questions.txt
--> data/lang/phones/extra_questions.int corresponds to data/lang/phones/extra_questions.txt
--> data/lang/phones/extra_questions.{txt, int} are OKChecking data/lang/phones/word_boundary.{txt, int} ...
--> 90 entry/entries in data/lang/phones/word_boundary.txt
--> data/lang/phones/word_boundary.int corresponds to data/lang/phones/word_boundary.txt
--> data/lang/phones/word_boundary.{txt, int} are OKChecking optional_silence.txt ...
--> reading data/lang/phones/optional_silence.txt
--> data/lang/phones/optional_silence.txt is OKChecking disambiguation symbols: #0 and #1
--> data/lang/phones/disambig.txt has "#0" and "#1"
--> data/lang/phones/disambig.txt is OKChecking topo ...Checking word_boundary.txt: silence.txt, nonsilence.txt, disambig.txt ...
--> data/lang/phones/word_boundary.txt doesn't include disambiguation symbols
--> data/lang/phones/word_boundary.txt is the union of nonsilence.txt and silence.txt
--> data/lang/phones/word_boundary.txt is OKChecking word-level disambiguation symbols...
--> data/lang/phones/wdisambig.txt exists (newer prepare_lang.sh)
Checking word_boundary.int and disambig.int
--> generating a 67 word sequence
--> resulting phone sequence from L.fst corresponds to the word sequence
--> L.fst is OK
--> generating a 16 word sequence
--> resulting phone sequence from L_disambig.fst corresponds to the word sequence
--> L_disambig.fst is OKChecking data/lang/oov.{txt, int} ...
--> 1 entry/entries in data/lang/oov.txt
--> data/lang/oov.int corresponds to data/lang/oov.txt
--> data/lang/oov.{txt, int} are OK--> data/lang/L.fst is olabel sorted
--> data/lang/L_disambig.fst is olabel sorted
--> SUCCESS [validating lang directory data/lang]===== LANGUAGE MODEL CREATION =====
===== MAKING lm.arpa ========== MAKING G.fst =====arpa2fst --disambig-symbol=#0 --read-symbol-table=data/lang/words.txt data/local/tmp/lm.arpa data/lang/G.fst 
LOG (arpa2fst:Read():arpa-file-parser.cc:96) Reading \data\ section.
LOG (arpa2fst:Read():arpa-file-parser.cc:151) Reading \1-grams: section.
LOG (arpa2fst:RemoveRedundantStates():arpa-lm-compiler.cc:355) Reduced num-states from 1 to 1===== MONO TRAINING =====steps/train_mono.sh --nj 4 --cmd run.pl data/train data/lang exp/mono
steps/train_mono.sh: Initializing monophone system.
steps/train_mono.sh: Compiling training graphs
steps/train_mono.sh: Aligning data equally (pass 0)
steps/train_mono.sh: Pass 1
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 2
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 3
···············
steps/train_mono.sh: Pass 38
steps/train_mono.sh: Aligning data
steps/train_mono.sh: Pass 39
steps/diagnostic/analyze_alignments.sh --cmd run.pl data/lang exp/mono
steps/diagnostic/analyze_alignments.sh: see stats in exp/mono/log/analyze_alignments.log
311 warnings in exp/mono/log/update.*.log
84 warnings in exp/mono/log/align.*.*.log
exp/mono: nj=4 align prob=-79.54 over 0.06h [retry=0.0%, fail=0.0%] states=70 gauss=1007
steps/train_mono.sh: Done training monophone system in exp/mono===== MONO DECODING =====WARNING: the --mono, --left-biphone and --quinphone options are now deprecated and ignored.
tree-info exp/mono/tree 
tree-info exp/mono/tree 
fstdeterminizestar --use-log=true 
fsttablecompose data/lang/L_disambig.fst data/lang/G.fst 
fstminimizeencoded 
fstpushspecial 
fstisstochastic data/lang/tmp/LG.fst 
-0.0421835 -0.0423222
[info]: LG not stochastic.
fstcomposecontext --context-size=1 --central-position=0 --read-disambig-syms=data/lang/phones/disambig.int --write-disambig-syms=data/lang/tmp/disambig_ilabels_1_0.int data/lang/tmp/ilabels_1_0 
fstisstochastic data/lang/tmp/CLG_1_0.fst 
-0.0421835 -0.0423222
[info]: CLG not stochastic.
make-h-transducer --disambig-syms-out=exp/mono/graph/disambig_tid.int --transition-scale=1.0 data/lang/tmp/ilabels_1_0 exp/mono/tree exp/mono/final.mdl 
fsttablecompose exp/mono/graph/Ha.fst data/lang/tmp/CLG_1_0.fst 
fstminimizeencoded 
fstrmepslocal 
fstdeterminizestar --use-log=true 
fstrmsymbols exp/mono/graph/disambig_tid.int 
fstisstochastic exp/mono/graph/HCLGa.fst 
0.000319138 -0.0427566
HCLGa is not stochastic
add-self-loops --self-loop-scale=0.1 --reorder=true exp/mono/final.mdl 
steps/decode.sh --config conf/decode.config --nj 4 --cmd run.pl exp/mono/graph data/test exp/mono/decode
decode.sh: feature type is delta
steps/diagnostic/analyze_lats.sh --cmd run.pl exp/mono/graph exp/mono/decode
steps/diagnostic/analyze_lats.sh: see stats in exp/mono/decode/log/analyze_alignments.log
Overall, lattice depth (10,50,90-percentile)=(1,1,2) and mean=1.2
steps/diagnostic/analyze_lats.sh: see stats in exp/mono/decode/log/analyze_lattice_depth_stats.log
exp/mono/decode/wer_10
%WER 1.39 [ 2 / 144, 1 ins, 0 del, 1 sub ]
%SER 4.17 [ 2 / 48 ]
exp/mono/decode/wer_11
%WER 1.39 [ 2 / 144, 1 ins, 0 del, 1 sub ]
%SER 4.17 [ 2 / 48 ]
exp/mono/decode/wer_12
····················
%SER 4.17 [ 2 / 48 ]
exp/mono/decode//wer_8
%WER 1.39 [ 2 / 144, 1 ins, 0 del, 1 sub ]
%SER 4.17 [ 2 / 48 ]
exp/mono/decode//wer_9
%WER 1.39 [ 2 / 144, 1 ins, 0 del, 1 sub ]
%SER 4.17 [ 2 / 48 ]===== MONO ALIGNMENT =====steps/align_si.sh --nj 4 --cmd run.pl data/train data/lang exp/mono exp/mono_ali
steps/align_si.sh: feature type is delta
steps/align_si.sh: aligning data in data/train using model from exp/mono, putting alignments in exp/mono_ali
steps/diagnostic/analyze_alignments.sh --cmd run.pl data/lang exp/mono_ali
steps/diagnostic/analyze_alignments.sh: see stats in exp/mono_ali/log/analyze_alignments.log
steps/align_si.sh: done aligning data.===== TRI1 (first triphone pass) TRAINING =====steps/train_deltas.sh --cmd run.pl 2000 11000 data/train data/lang exp/mono_ali exp/tri1
steps/train_deltas.sh: accumulating tree stats
steps/train_deltas.sh: getting questions for tree-building, via clustering
steps/train_deltas.sh: building the tree
WARNING (gmm-init-model:InitAmGmm():gmm-init-model.cc:55) Tree has pdf-id 1 with no stats; corresponding phone list: 6 7 8 9 10 
** The warnings above about 'no stats' generally mean you have phones **
** (or groups of phones) in your phone set that had no corresponding data. **
** You should probably figure out whether something went wrong, **
** or whether your data just doesn't happen to have examples of those **
** phones. **
steps/train_deltas.sh: converting alignments from exp/mono_ali to use current tree
steps/train_deltas.sh: compiling graphs of transcripts
steps/train_deltas.sh: training pass 1
steps/train_deltas.sh: training pass 2
··················
steps/train_deltas.sh: aligning data
steps/train_deltas.sh: training pass 31
steps/train_deltas.sh: training pass 32
steps/train_deltas.sh: training pass 33
steps/train_deltas.sh: training pass 34
steps/diagnostic/analyze_alignments.sh --cmd run.pl data/lang exp/tri1
steps/diagnostic/analyze_alignments.sh: see stats in exp/tri1/log/analyze_alignments.log
382 warnings in exp/tri1/log/update.*.log
1 warnings in exp/tri1/log/questions.log
1 warnings in exp/tri1/log/build_tree.log
12 warnings in exp/tri1/log/align.*.*.log
28 warnings in exp/tri1/log/init_model.log
1 warnings in exp/tri1/log/mixup.log
exp/tri1: nj=4 align prob=-79.41 over 0.06h [retry=0.0%, fail=0.0%] states=88 gauss=1003 tree-impr=5.48
steps/train_deltas.sh: Done training system with delta+delta-delta features in exp/tri1===== TRI1 (first triphone pass) DECODING =====tree-info exp/tri1/tree 
tree-info exp/tri1/tree 
fstcomposecontext --context-size=3 --central-position=1 --read-disambig-syms=data/lang/phones/disambig.int --write-disambig-syms=data/lang/tmp/disambig_ilabels_3_1.int data/lang/tmp/ilabels_3_1 
fstisstochastic data/lang/tmp/CLG_3_1.fst 
0 -0.0423222
[info]: CLG not stochastic.
make-h-transducer --disambig-syms-out=exp/tri1/graph/disambig_tid.int --transition-scale=1.0 data/lang/tmp/ilabels_3_1 exp/tri1/tree exp/tri1/final.mdl 
fsttablecompose exp/tri1/graph/Ha.fst data/lang/tmp/CLG_3_1.fst 
fstdeterminizestar --use-log=true 
fstminimizeencoded 
fstrmsymbols exp/tri1/graph/disambig_tid.int 
fstrmepslocal 
fstisstochastic exp/tri1/graph/HCLGa.fst 
0.000384301 -0.0999308
HCLGa is not stochastic
add-self-loops --self-loop-scale=0.1 --reorder=true exp/tri1/final.mdl 
steps/decode.sh --config conf/decode.config --nj 4 --cmd run.pl exp/tri1/graph data/test exp/tri1/decode
decode.sh: feature type is delta
steps/diagnostic/analyze_lats.sh --cmd run.pl exp/tri1/graph exp/tri1/decode
steps/diagnostic/analyze_lats.sh: see stats in exp/tri1/decode/log/analyze_alignments.log
Overall, lattice depth (10,50,90-percentile)=(1,1,1) and mean=1.1
steps/diagnostic/analyze_lats.sh: see stats in exp/tri1/decode/log/analyze_lattice_depth_stats.log
exp/tri1/decode/wer_10
···················===== run.sh script is finished =====

 

 

 

 

 

 

 

 

 

 

这篇关于笔记摘录:2018.09.01---Kaldi构建一个简单的英文数字串识别系统的文章就介绍到这儿,希望我们推荐的文章对编程师们有所帮助!



http://www.chinasem.cn/article/404569

相关文章

利用Python编写一个简单的聊天机器人

《利用Python编写一个简单的聊天机器人》这篇文章主要为大家详细介绍了如何利用Python编写一个简单的聊天机器人,文中的示例代码讲解详细,感兴趣的小伙伴可以跟随小编一起学习一下... 使用 python 编写一个简单的聊天机器人可以从最基础的逻辑开始,然后逐步加入更复杂的功能。这里我们将先实现一个简单的

使用IntelliJ IDEA创建简单的Java Web项目完整步骤

《使用IntelliJIDEA创建简单的JavaWeb项目完整步骤》:本文主要介绍如何使用IntelliJIDEA创建一个简单的JavaWeb项目,实现登录、注册和查看用户列表功能,使用Se... 目录前置准备项目功能实现步骤1. 创建项目2. 配置 Tomcat3. 项目文件结构4. 创建数据库和表5.

使用PyQt5编写一个简单的取色器

《使用PyQt5编写一个简单的取色器》:本文主要介绍PyQt5搭建的一个取色器,一共写了两款应用,一款使用快捷键捕获鼠标附近图像的RGB和16进制颜色编码,一款跟随鼠标刷新图像的RGB和16... 目录取色器1取色器2PyQt5搭建的一个取色器,一共写了两款应用,一款使用快捷键捕获鼠标附近图像的RGB和16

四种简单方法 轻松进入电脑主板 BIOS 或 UEFI 固件设置

《四种简单方法轻松进入电脑主板BIOS或UEFI固件设置》设置BIOS/UEFI是计算机维护和管理中的一项重要任务,它允许用户配置计算机的启动选项、硬件设置和其他关键参数,该怎么进入呢?下面... 随着计算机技术的发展,大多数主流 PC 和笔记本已经从传统 BIOS 转向了 UEFI 固件。很多时候,我们也

Python中构建终端应用界面利器Blessed模块的使用

《Python中构建终端应用界面利器Blessed模块的使用》Blessed库作为一个轻量级且功能强大的解决方案,开始在开发者中赢得口碑,今天,我们就一起来探索一下它是如何让终端UI开发变得轻松而高... 目录一、安装与配置:简单、快速、无障碍二、基本功能:从彩色文本到动态交互1. 显示基本内容2. 创建链

基于Qt开发一个简单的OFD阅读器

《基于Qt开发一个简单的OFD阅读器》这篇文章主要为大家详细介绍了如何使用Qt框架开发一个功能强大且性能优异的OFD阅读器,文中的示例代码讲解详细,有需要的小伙伴可以参考一下... 目录摘要引言一、OFD文件格式解析二、文档结构解析三、页面渲染四、用户交互五、性能优化六、示例代码七、未来发展方向八、结论摘要

Golang使用etcd构建分布式锁的示例分享

《Golang使用etcd构建分布式锁的示例分享》在本教程中,我们将学习如何使用Go和etcd构建分布式锁系统,分布式锁系统对于管理对分布式系统中共享资源的并发访问至关重要,它有助于维护一致性,防止竞... 目录引言环境准备新建Go项目实现加锁和解锁功能测试分布式锁重构实现失败重试总结引言我们将使用Go作

MyBatis框架实现一个简单的数据查询操作

《MyBatis框架实现一个简单的数据查询操作》本文介绍了MyBatis框架下进行数据查询操作的详细步骤,括创建实体类、编写SQL标签、配置Mapper、开启驼峰命名映射以及执行SQL语句等,感兴趣的... 基于在前面几章我们已经学习了对MyBATis进行环境配置,并利用SqlSessionFactory核

csu 1446 Problem J Modified LCS (扩展欧几里得算法的简单应用)

这是一道扩展欧几里得算法的简单应用题,这题是在湖南多校训练赛中队友ac的一道题,在比赛之后请教了队友,然后自己把它a掉 这也是自己独自做扩展欧几里得算法的题目 题意:把题意转变下就变成了:求d1*x - d2*y = f2 - f1的解,很明显用exgcd来解 下面介绍一下exgcd的一些知识点:求ax + by = c的解 一、首先求ax + by = gcd(a,b)的解 这个

hdu2289(简单二分)

虽说是简单二分,但是我还是wa死了  题意:已知圆台的体积,求高度 首先要知道圆台体积怎么求:设上下底的半径分别为r1,r2,高为h,V = PI*(r1*r1+r1*r2+r2*r2)*h/3 然后以h进行二分 代码如下: #include<iostream>#include<algorithm>#include<cstring>#include<stack>#includ