Reproducing CNN-text (Yoon Kim): Convolutional Neural Networks for Sentence Classification

2023-10-17 11:40

This article walks through reproducing CNN-text (Yoon Kim), a convolutional neural network for sentence classification; hopefully it is a useful reference for developers working on the same task.

Software:

1. When running the code, use a Python 3.5 environment; I used Anaconda3-4.0.0, which happens to ship it;

2. The TensorFlow version should be below 1.5; I switched to 1.4.1. Otherwise, errors occur during testing;

1. Introduction

TextCNN is an algorithm that classifies text with a convolutional neural network. It was proposed by Yoon Kim in "Convolutional Neural Networks for Sentence Classification" (see reference [1]) in 2014.

2. Parameters and Hyperparameters
sequence_length
Q: For a CNN the input and output sizes are fixed, but sentences vary in length. How is this handled?
A: The sentences are brought to a fixed length n: longer ones are truncated and shorter ones are padded with 0. The padded zeros do not affect the result, because the later max-pooling keeps only the maximum value, so the zero entries are filtered out (a small sketch follows this parameter list).
num_classes
The number of target classes in the multi-class classification.
vocabulary_size
The vocabulary size of the corpus, denoted |D|.
embedding_size
The dimension of the word vectors, reduced from the original |D| down to embedding_size.
filter_size_arr
Multiple filters of different sizes.
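
As mentioned under sequence_length, the fixed-length preprocessing is simply pad-and-truncate. A minimal sketch (the function name pad_sentences and the padding id 0 are illustrative assumptions, not part of the original code):

def pad_sentences(token_id_seqs, n, pad_id=0):
    # Truncate sequences longer than n and pad shorter ones with pad_id
    return [seq[:n] + [pad_id] * max(0, n - len(seq)) for seq in token_id_seqs]

# Example with n = 5:
# pad_sentences([[3, 7], [1, 2, 3, 4, 5, 6]], 5) -> [[3, 7, 0, 0, 0], [1, 2, 3, 4, 5]]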
3.Embedding Layer
A hidden layer projects the one-hot encoded words into a low-dimensional space.
It is essentially a feature extractor that encodes semantic features in the chosen number of dimensions, so that semantically similar words end up close to each other in Euclidean or cosine distance.
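
A minimal sketch of such an embedding lookup in the TensorFlow 1.x style used by this project (the sizes are illustrative; the variable names mirror the appendix code):

import tensorflow as tf

vocab_size, embedding_size, sequence_length = 10000, 128, 56  # illustrative values
input_x = tf.placeholder(tf.int32, [None, sequence_length], name="input_x")
# Embedding table, initialized uniformly in [-1, 1] and trained with the rest of the network
W = tf.Variable(tf.random_uniform([vocab_size, embedding_size], -1.0, 1.0), name="W")
# Shape: [batch_size, sequence_length, embedding_size]
embedded_chars = tf.nn.embedding_lookup(W, input_x)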

4.Convolution Layer
A convolution layer is built for each filter size, so there are multiple feature maps.
An image is two-dimensional pixel data, sometimes with three RGB channels, so its convolution kernels are at least two-dimensional.
To some extent, word is to text as pixel is to image, so here the kernel size and stride are a little different: the kernel always spans the full embedding dimension and slides only along the word axis.

x_i
x_i ∈ R^k: the word vector of the i-th word in a sentence of length n; each word vector has dimension k.
x_{i:j}
x_{i:j} = x_i ⊕ x_{i+1} ⊕ ... ⊕ x_j: the concatenation of the word vectors of words i through j of the sentence.
h
The number of words covered by the convolution window; the kernel size is therefore h × k.
w
w ∈ R^{hk}: the weight matrix of the convolution kernel.
c_i
c_i = f(w · x_{i:i+h-1} + b): the output of the kernel at word position i, where b ∈ R is a bias term and f is a nonlinearity such as the hyperbolic tangent.
c = [c_1, c_2, ..., c_{n-h+1}]
The feature map obtained by sliding the filter over every possible window of words in the sentence.
5.Max-Pooling Layer
Max-pooling outputs only the maximum value of each feature map, which also filters out the zero padding in the input (see the sketch below).
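
A small NumPy sketch of the formulas above, covering one filter and the max-over-time pooling (all arrays and sizes are illustrative, not the project's actual code):

import numpy as np

n, k, h = 7, 4, 3                      # sentence length, embedding dim, filter height
x = np.random.randn(n, k)              # word vectors x_1 .. x_n, each in R^k
w = np.random.randn(h * k)             # filter weights, w in R^{hk}
b = 0.1                                # scalar bias, b in R

# c_i = f(w . x_{i:i+h-1} + b) with f = tanh, for i = 1 .. n-h+1
c = np.array([np.tanh(w.dot(x[i:i + h].reshape(-1)) + b) for i in range(n - h + 1)])

c_hat = c.max()                        # max-over-time pooling: one value per filter
print(c.shape, c_hat)                  # (5,) and a single number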

6.Softmax Classification Layer
Finally, a fully connected softmax layer outputs the probability of each class.
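
A hedged sketch of this output layer in TensorFlow 1.x, close to what the appendix code does (the _out names and the concrete sizes are illustrative):

import tensorflow as tf

num_filters_total, num_classes = 384, 2       # e.g. 3 filter sizes x 128 filters, 2 classes
h_drop = tf.placeholder(tf.float32, [None, num_filters_total])  # pooled features after dropout

W = tf.get_variable("W_out", shape=[num_filters_total, num_classes],
                    initializer=tf.contrib.layers.xavier_initializer())
b = tf.Variable(tf.constant(0.1, shape=[num_classes]), name="b_out")
scores = tf.nn.xw_plus_b(h_drop, W, b, name="scores")    # unnormalized class scores
probabilities = tf.nn.softmax(scores)                     # per-class probabilities
predictions = tf.argmax(scores, 1, name="predictions")    # predicted class ids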
7. Environment Setup

1) Install Visual Studio 2019
Download Visual Studio Community Edition
Download link: https://visualstudio.microsoft.com/zh-hans/downloads/


Note: during installation, check "Python development" and "Desktop development with C++".
2) Download and install the NVIDIA GPU driver
First check your GPU model in Device Manager; for example, mine is a Titan XP.

NVIDIA driver download: https://www.nvidia.cn/Download/index.aspx?lang=cn (download the driver that matches your NVIDIA GPU).

After downloading, just click Next until the installation finishes. Then run the following in cmd:

nvidia-smi

If you see the error:
'nvidia-smi' is not recognized as an internal or external command, operable program or batch file.
add C:\Program Files\NVIDIA Corporation\NVSMI to the PATH environment variable and reopen the cmd window. If the GPU information shown in the figure is printed, the driver was installed successfully.

Note: the CUDA Version shown in the output is the highest CUDA version the current driver supports.

3) Download CUDA
CUDA version 10.2 is used.
CUDA download link: https://developer.nvidia.com/cuda-downloads?target_os=Windows&target_arch=x86_64&target_version=10&target_type=exelocal

After downloading you get the file: cuda_10.2.89_441.22_win10.exe

4) Download cuDNN
cuDNN download link: https://developer.nvidia.com/cudnn (an NVIDIA account is required).

After downloading you get the file: cudnn-10.2-windows10-x64-v7.6.5.32.zip
5) Install CUDA
(1) Run the CUDA installer; the default install path is recommended.

 

During installation you can check Visual Studio Integration.
(2) After installation, set the environment variables

Right-click This PC and open Properties -> Advanced system settings -> Environment Variables; you will see that the installer added the CUDA_PATH and CUDA_PATH_V10_2 variables.
Next, add the following environment variables as well (the first path is the default CUDA Samples install location, C:\ProgramData\NVIDIA Corporation\CUDA Samples\v10.2):
CUDA_SDK_PATH = C:\ProgramData\NVIDIA Corporation\CUDA Samples\v10.2
CUDA_LIB_PATH = %CUDA_PATH%\lib\x64
CUDA_BIN_PATH = %CUDA_PATH%\bin
CUDA_SDK_BIN_PATH = %CUDA_SDK_PATH%\bin\win64
CUDA_SDK_LIB_PATH = %CUDA_SDK_PATH%\common\lib\x64

Then add the following directories to the Path variable:
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\lib\x64
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\include
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\extras\CUPTI\lib64
C:\ProgramData\NVIDIA Corporation\CUDA Samples\v10.2\bin\win64
C:\ProgramData\NVIDIA Corporation\CUDA Samples\v10.2\common\lib\x64

Note: the paths related to CUDA Samples can also be omitted.

6) Install cuDNN
Copy the cuDNN files
Unzip the cuDNN archive, then copy the contents of its bin, include, and lib folders into the CUDA install directory C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2.
Note: copy and paste the bin, include, and lib folders in their entirety.

7) Test the CUDA installation
Finally, check whether CUDA is configured correctly: open CMD and run:

nvcc -V

You should see the CUDA version information.

8) Install Anaconda
Anaconda is a Python distribution for scientific computing. It supports Linux, macOS, and Windows and bundles many popular packages for scientific computing and data analysis.
(1) Download the installer
Download Anaconda for Windows: https://www.anaconda.com/products/individual

Older releases can be downloaded from the official archive (the "Index of /" page on anaconda.com).

(2) Install Anaconda.
(3) Configure a domestic (China) mirror for Anaconda

Tsinghua TUNA provides a mirror of the Anaconda repositories; run the following commands:

conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/

conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/

conda config --set show_channel_urls yes

9) Install TensorFlow 1.4

Create a virtual environment; you can choose any name, and here I use tensorflow35:

(To list virtual environments: conda info --env. To delete one: first leave it with conda deactivate, then remove it with conda remove -n <env-name> --all.)

conda create -n tensorflow35 python=3.5

(or: conda create -c https://conda.anaconda.org/conda-forge -n nlp-book python=3.8.5)

After it is created successfully, activate the tensorflow35 environment (with older conda versions you must use the command activate tensorflow35):

conda activate tensorflow35

In the newly created tensorflow35 environment, install TensorFlow by running:

conda install tensorflow==1.4  (or: pip install tensorflow==1.4)

When installing with pip install tensorflow==1.4, you may hit a timeout; simply retry the installation a few times until it succeeds.

Note: Python version 3.5, TensorFlow version 1.4.
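
As a quick sanity check (a small sketch that is not part of the original article), you can verify the installation inside the tensorflow35 environment:

import tensorflow as tf
from tensorflow.python.client import device_lib

print(tf.__version__)  # expect 1.4.x
# Listing local devices shows whether the GPU is visible to TensorFlow
print([d.name for d in device_lib.list_local_devices()])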

8. Data Preprocessing

Run the following in a Windows console:

(tensorflow35) D:\cnn-text>python data_helpers.py

(tensorflow35) D:\cnn-text>

Check the output:
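
For reference, this is roughly what data_helpers.load_data_and_labels returns: one cleaned sentence per line plus one-hot positive/negative labels. The sketch below is an approximation (the label encoding and the latin-1 file encoding are assumptions), not the project's actual helper:

import numpy as np

def load_data_and_labels_sketch(positive_data_file, negative_data_file):
    # One example per line; labels are one-hot: positive -> [0, 1], negative -> [1, 0]
    positive_examples = [s.strip() for s in open(positive_data_file, encoding='latin-1')]
    negative_examples = [s.strip() for s in open(negative_data_file, encoding='latin-1')]
    x_text = positive_examples + negative_examples
    y = np.concatenate([[[0, 1]] * len(positive_examples),
                        [[1, 0]] * len(negative_examples)], axis=0)
    return x_text, y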

9. Training cnn-text

Activate the tensorflow35 virtual environment, change to the cnn-text directory on drive D,

and run: python train.py

On Windows, however, this raised an error:

tensorflow.python.framework.errors_impl.PermissionDeniedError

It turned out that the cnn-text folder I had created did not have the required permissions. Right-click the folder, open Properties -> Security, and grant Full Control. After redoing this, GPU training starts normally.

After training, the results are saved under the runs folder.

10. Testing with eval.py

Running python eval.py raises an error: the vocab file cannot be found and therefore cannot be read. (The garbled sequence in the message is a mis-encoded Chinese Windows error string.)

Traceback (most recent call last):
  File "eval.py", line 56, in <module>
    vocab_processor = learn.preprocessing.VocabularyProcessor.restore(vocab_path)
  File "C:\Users\zql10\Anaconda3\envs\tensorflow35\lib\site-packages\tensorflow\contrib\learn\python\learn\preprocessing\text.py", line 226, in restore
    return pickle.loads(f.read())
  File "C:\Users\zql10\Anaconda3\envs\tensorflow35\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 119, in read
    self._preread_check()
  File "C:\Users\zql10\Anaconda3\envs\tensorflow35\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 79, in _preread_check
    compat.as_bytes(self.__name), 1024 * 512, status)
  File "C:\Users\zql10\Anaconda3\envs\tensorflow35\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 473, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.NotFoundError: NewRandomAccessFile failed to Create/Open: ..\vocab : \u03f5\u0373\udcd5\u04b2\udcbb\udcb5\udcbd\u05b8\udcb6\udca8\udcb5\udcc4\udcce\u013c\udcfe\udca1\udca3
; No such file or directory

Modify the code:

# Map data into vocabulary
# vocab_path = os.path.join(FLAGS.checkpoint_dir, "..", "vocab")
# path = os.path.join(os.getcwd(), 'images')
vocab_path = os.path.join(FLAGS.checkpoint_dir, "vocab")  # changed 2023.1.29

and also change the checkpoint path:

 tf.flags.DEFINE_string("checkpoint_dir", "runs/1674888681", "Checkpoint directory from training run")
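
Alternatively, since the modified train.py (see the appendix) writes its output directory to model_dir.txt, a small sketch that avoids hard-coding the run timestamp (this assumes FLAGS is defined as in eval.py; it is not part of the original fix):

import os

# Read the run directory recorded by train.py (the 2023.2.3 change in the appendix)
model_dir = open('model_dir.txt').readline().strip()
FLAGS.checkpoint_dir = os.path.join(model_dir, "checkpoints")
vocab_path = os.path.join(model_dir, "vocab")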

If it still will not run, you can use PyCharm instead:

under the tensorflow35 virtual environment (Windows),

open PyCharm, open the cnn-text project, select the interpreter from the tensorflow35 environment, and then run it.

Adjust the path:

# Evaluation
# ==================================================
FLAGS.checkpoint_dir = './runs/1675947435/checkpoints'
checkpoint_file = tf.train.latest_checkpoint(FLAGS.checkpoint_dir)

Then simply run eval.py.

# Whether to evaluate on the full training/test data; if False, only the two example sentences are used. Set to True here.
tf.flags.DEFINE_boolean("eval_train", True, "Evaluate on all training data")

This produces the results on the evaluation data set:

 Total number of test examples: 10662
Accuracy: 0.971769

and the predictions are saved to:
Saving evaluation to ./runs/1675947435/checkpoints\..\prediction.csv

A perfect reproduction!

This completes the entire testing workflow.

Appendix: code

train.py

#! /usr/bin/env python

import tensorflow as tf
import numpy as np
import os
import time
import datetime
import data_helpers
from text_cnn import TextCNN
from tensorflow.contrib import learn

# Parameters
# ==================================================

# Data loading params
tf.flags.DEFINE_float("dev_sample_percentage", .1, "Percentage of the training data to use for validation")
tf.flags.DEFINE_string("positive_data_file", "./data/rt-polaritydata/rt-polarity.pos", "Data source for the positive data.")
tf.flags.DEFINE_string("negative_data_file", "./data/rt-polaritydata/rt-polarity.neg", "Data source for the negative data.")

# Model Hyperparameters
tf.flags.DEFINE_integer("embedding_dim", 128, "Dimensionality of character embedding (default: 128)")
tf.flags.DEFINE_string("filter_sizes", "3,4,5", "Comma-separated filter sizes (default: '3,4,5')")
tf.flags.DEFINE_integer("num_filters", 128, "Number of filters per filter size (default: 128)")
tf.flags.DEFINE_float("dropout_keep_prob", 0.5, "Dropout keep probability (default: 0.5)")
tf.flags.DEFINE_float("l2_reg_lambda", 0.0, "L2 regularization lambda (default: 0.0)")

# Training parameters
tf.flags.DEFINE_integer("batch_size", 64, "Batch Size (default: 64)")
tf.flags.DEFINE_integer("num_epochs", 200, "Number of training epochs (default: 200)")
tf.flags.DEFINE_integer("evaluate_every", 100, "Evaluate model on dev set after this many steps (default: 100)")
tf.flags.DEFINE_integer("checkpoint_every", 100, "Save model after this many steps (default: 100)")
tf.flags.DEFINE_integer("num_checkpoints", 5, "Number of checkpoints to store (default: 5)")

# Misc Parameters
tf.flags.DEFINE_boolean("allow_soft_placement", True, "Allow device soft device placement")
tf.flags.DEFINE_boolean("log_device_placement", False, "Log placement of ops on devices")

FLAGS = tf.flags.FLAGS
# FLAGS._parse_flags()
# print("\nParameters:")
# for attr, value in sorted(FLAGS.__flags.items()):
#     print("{}={}".format(attr.upper(), value))
# print("")


def preprocess():
    # Data Preparation
    # ==================================================

    # Load data
    print("Loading data...")
    x_text, y = data_helpers.load_data_and_labels(FLAGS.positive_data_file, FLAGS.negative_data_file)

    # Build vocabulary
    max_document_length = max([len(x.split(" ")) for x in x_text])
    vocab_processor = learn.preprocessing.VocabularyProcessor(max_document_length)
    x = np.array(list(vocab_processor.fit_transform(x_text)))

    # Randomly shuffle data
    np.random.seed(10)
    shuffle_indices = np.random.permutation(np.arange(len(y)))
    x_shuffled = x[shuffle_indices]
    y_shuffled = y[shuffle_indices]

    # Split train/test set
    # TODO: This is very crude, should use cross-validation
    dev_sample_index = -1 * int(FLAGS.dev_sample_percentage * float(len(y)))
    x_train, x_dev = x_shuffled[:dev_sample_index], x_shuffled[dev_sample_index:]
    y_train, y_dev = y_shuffled[:dev_sample_index], y_shuffled[dev_sample_index:]

    del x, y, x_shuffled, y_shuffled

    print("Vocabulary Size: {:d}".format(len(vocab_processor.vocabulary_)))
    print("Train/Dev split: {:d}/{:d}".format(len(y_train), len(y_dev)))
    return x_train, y_train, vocab_processor, x_dev, y_dev


def train(x_train, y_train, vocab_processor, x_dev, y_dev):
    # Training
    # ==================================================
    with tf.Graph().as_default():
        session_conf = tf.ConfigProto(
            allow_soft_placement=FLAGS.allow_soft_placement,
            log_device_placement=FLAGS.log_device_placement)
        sess = tf.Session(config=session_conf)
        with sess.as_default():
            cnn = TextCNN(
                sequence_length=x_train.shape[1],
                num_classes=y_train.shape[1],
                vocab_size=len(vocab_processor.vocabulary_),
                embedding_size=FLAGS.embedding_dim,
                filter_sizes=list(map(int, FLAGS.filter_sizes.split(","))),
                num_filters=FLAGS.num_filters,
                l2_reg_lambda=FLAGS.l2_reg_lambda)

            # Define Training procedure
            global_step = tf.Variable(0, name="global_step", trainable=False)
            optimizer = tf.train.AdamOptimizer(1e-3)
            grads_and_vars = optimizer.compute_gradients(cnn.loss)
            train_op = optimizer.apply_gradients(grads_and_vars, global_step=global_step)

            # Keep track of gradient values and sparsity (optional)
            grad_summaries = []
            for g, v in grads_and_vars:
                if g is not None:
                    grad_hist_summary = tf.summary.histogram("{}/grad/hist".format(v.name), g)
                    sparsity_summary = tf.summary.scalar("{}/grad/sparsity".format(v.name), tf.nn.zero_fraction(g))
                    grad_summaries.append(grad_hist_summary)
                    grad_summaries.append(sparsity_summary)
            grad_summaries_merged = tf.summary.merge(grad_summaries)

            # Output directory for models and summaries
            timestamp = str(int(time.time()))
            out_dir = os.path.abspath(os.path.join(os.path.curdir, "runs", timestamp))
            print("Writing to {}\n".format(out_dir))
            file = open('model_dir.txt', 'w')  # 2023.2.3
            file.write(out_dir)
            file.close()

            # Summaries for loss and accuracy
            loss_summary = tf.summary.scalar("loss", cnn.loss)
            acc_summary = tf.summary.scalar("accuracy", cnn.accuracy)

            # Train Summaries
            train_summary_op = tf.summary.merge([loss_summary, acc_summary, grad_summaries_merged])
            train_summary_dir = os.path.join(out_dir, "summaries", "train")
            train_summary_writer = tf.summary.FileWriter(train_summary_dir, sess.graph)

            # Dev summaries
            dev_summary_op = tf.summary.merge([loss_summary, acc_summary])
            dev_summary_dir = os.path.join(out_dir, "summaries", "dev")
            dev_summary_writer = tf.summary.FileWriter(dev_summary_dir, sess.graph)

            # Checkpoint directory. Tensorflow assumes this directory already exists so we need to create it
            model_dir = open('model_dir.txt').readline()  # 2023.2.3
            vocab_path = model_dir + "\\vocab"
            checkpoint_dir = os.path.abspath(os.path.join(out_dir, "checkpoints"))
            # checkpoint_dir = 'D:\\cnn-text\\runs\\1675912493\\checkpoints'
            checkpoint_prefix = os.path.join(checkpoint_dir, "model")
            if not os.path.exists(checkpoint_dir):
                os.makedirs(checkpoint_dir)
            saver = tf.train.Saver(tf.global_variables(), max_to_keep=FLAGS.num_checkpoints)

            # Write vocabulary
            vocab_processor.save(os.path.join(out_dir, "vocab"))

            # Initialize all variables
            sess.run(tf.global_variables_initializer())

            def train_step(x_batch, y_batch):
                """A single training step"""
                feed_dict = {
                    cnn.input_x: x_batch,
                    cnn.input_y: y_batch,
                    cnn.dropout_keep_prob: FLAGS.dropout_keep_prob
                }
                _, step, summaries, loss, accuracy = sess.run(
                    [train_op, global_step, train_summary_op, cnn.loss, cnn.accuracy],
                    feed_dict)
                time_str = datetime.datetime.now().isoformat()
                print("{}: step {}, loss {:g}, acc {:g}".format(time_str, step, loss, accuracy))
                train_summary_writer.add_summary(summaries, step)

            def dev_step(x_batch, y_batch, writer=None):
                """Evaluates model on a dev set"""
                feed_dict = {
                    cnn.input_x: x_batch,
                    cnn.input_y: y_batch,
                    cnn.dropout_keep_prob: 1.0
                }
                step, summaries, loss, accuracy = sess.run(
                    [global_step, dev_summary_op, cnn.loss, cnn.accuracy],
                    feed_dict)
                time_str = datetime.datetime.now().isoformat()
                print("{}: step {}, loss {:g}, acc {:g}".format(time_str, step, loss, accuracy))
                if writer:
                    writer.add_summary(summaries, step)

            # Generate batches
            batches = data_helpers.batch_iter(
                list(zip(x_train, y_train)), FLAGS.batch_size, FLAGS.num_epochs)
            # Training loop. For each batch...
            for batch in batches:
                x_batch, y_batch = zip(*batch)
                train_step(x_batch, y_batch)
                current_step = tf.train.global_step(sess, global_step)
                if current_step % FLAGS.evaluate_every == 0:
                    print("\nEvaluation:")
                    dev_step(x_dev, y_dev, writer=dev_summary_writer)
                    print("")
                if current_step % FLAGS.checkpoint_every == 0:
                    path = saver.save(sess, checkpoint_prefix, global_step=current_step)
                    print("Saved model checkpoint to {}\n".format(path))


def main(argv=None):
    x_train, y_train, vocab_processor, x_dev, y_dev = preprocess()
    train(x_train, y_train, vocab_processor, x_dev, y_dev)


if __name__ == '__main__':
    tf.app.run()

eval.py:

#! /usr/bin/env python

import tensorflow as tf
import numpy as np
import os
import time
import datetime
import data_helpers
from text_cnn import TextCNN
from tensorflow.contrib import learn
import csv

# Parameters
# ==================================================
# tf.reset_default_graph()

# Data Parameters
tf.flags.DEFINE_string("positive_data_file", "./data/rt-polaritydata/rt-polarity.pos", "Data source for the positive data.")
tf.flags.DEFINE_string("negative_data_file", "./data/rt-polaritydata/rt-polarity.neg", "Data source for the negative data.")

# Eval Parameters
tf.flags.DEFINE_integer("batch_size", 64, "Batch Size (default: 64)")
tf.flags.DEFINE_string("checkpoint_dir", "", "Checkpoint directory from training run")
# Whether to evaluate on the full training/test data; if False, use the two example sentences below
tf.flags.DEFINE_boolean("eval_train", True, "Evaluate on all training data")  # all sentences in the data set
# tf.flags.DEFINE_boolean("eval_train", False, "Evaluate on all training data")  # only the two example sentences

# Misc Parameters (device parameters)
tf.flags.DEFINE_boolean("allow_soft_placement", True, "Allow device soft device placement")
tf.flags.DEFINE_boolean("log_device_placement", False, "Log placement of ops on devices")

FLAGS = tf.flags.FLAGS
FLAGS._parse_flags()
x = FLAGS.checkpoint_dir
print("\nParameters:")
for attr, value in sorted(FLAGS.__flags.items()):
    print("{}={}".format(attr.upper(), value))
print("")

# CHANGE THIS: Load data. Load your own data here
if FLAGS.eval_train:
    x_raw, y_test = data_helpers.load_data_and_labels(FLAGS.positive_data_file, FLAGS.negative_data_file)
    y_test = np.argmax(y_test, axis=1)
else:
    x_raw = ["a masterpiece four years in the making", "everything is off."]
    y_test = [1, 0]

# Map data into vocabulary
# model_dir = open('model_dir.txt').readline()  # 2023.2.3
# vocab_path = model_dir + "/vocab"
# vocab_path = "./runs/1516092210/vocab"
FLAGS.checkpoint_dir = './runs/1675947435/checkpoints'
vocab_path = os.path.join(FLAGS.checkpoint_dir, "..", "vocab")
vocab_processor = learn.preprocessing.VocabularyProcessor.restore(vocab_path)
x_test = np.array(list(vocab_processor.transform(x_raw)))

print("\nEvaluating...\n")

# Evaluation
# ==================================================
# Use the most recently saved model
FLAGS.checkpoint_dir = './runs/1675947435/checkpoints'
checkpoint_file = tf.train.latest_checkpoint(FLAGS.checkpoint_dir)
# checkpoint_dir = 'D:\\cnn-text\\runs\\1675912493\\checkpoints'

graph = tf.Graph()
with graph.as_default():
    session_conf = tf.ConfigProto(
        allow_soft_placement=FLAGS.allow_soft_placement,
        log_device_placement=FLAGS.log_device_placement)
    sess = tf.Session(config=session_conf)
    with sess.as_default():
        # Load the saved meta graph and restore variables
        saver = tf.train.import_meta_graph("{}.meta".format(checkpoint_file))
        saver.restore(sess, checkpoint_file)

        # Get the placeholders from the graph by name
        input_x = graph.get_operation_by_name("input_x").outputs[0]
        # input_y = graph.get_operation_by_name("input_y").outputs[0]
        dropout_keep_prob = graph.get_operation_by_name("dropout_keep_prob").outputs[0]

        # Tensors we want to evaluate
        predictions = graph.get_operation_by_name("output/predictions").outputs[0]

        # Generate batches for one epoch
        batches = data_helpers.batch_iter(list(x_test), FLAGS.batch_size, 1, shuffle=False)

        # Collect the predictions here
        all_predictions = []
        for x_test_batch in batches:
            batch_predictions = sess.run(predictions, {input_x: x_test_batch, dropout_keep_prob: 1.0})
            all_predictions = np.concatenate([all_predictions, batch_predictions])

# Print accuracy if y_test is defined
if y_test is not None:
    correct_predictions = float(sum(all_predictions == y_test))
    print("Total number of test examples: {}".format(len(y_test)))
    print("Accuracy: {:g}".format(correct_predictions / float(len(y_test))))

# Save the evaluation to a csv
predictions_human_readable = np.column_stack((np.array(x_raw), all_predictions))
out_path = os.path.join(FLAGS.checkpoint_dir, "..", "prediction.csv")
print("Saving evaluation to {0}".format(out_path))
with open(out_path, 'w') as f:
    csv.writer(f).writerows(predictions_human_readable)

Model class (text_cnn.py):

import tensorflow as tf
import numpy as np


class TextCNN(object):
    """
    A CNN for text classification.
    Uses an embedding layer, followed by a convolutional, max-pooling and softmax layer.
    """
    def __init__(self, sequence_length, num_classes, vocab_size,
                 embedding_size, filter_sizes, num_filters, l2_reg_lambda=0.0):

        # Placeholders for input, output and dropout
        self.input_x = tf.placeholder(tf.int32, [None, sequence_length], name="input_x")
        self.input_y = tf.placeholder(tf.float32, [None, num_classes], name="input_y")
        self.dropout_keep_prob = tf.placeholder(tf.float32, name="dropout_keep_prob")

        # Keeping track of l2 regularization loss (optional)
        l2_loss = tf.constant(0.0)

        # Embedding layer: map words to vectors, initialized uniformly in [-1, 1]
        with tf.device('/cpu:0'), tf.name_scope("embedding"):
            # vocab_size: vocabulary size; embedding_size: word-vector dimension.
            # self.W is the embedding table: vocab_size vectors of size embedding_size, randomly initialized in [-1, 1].
            # self.embedded_chars is the word-vector representation of input_x; shape: [num_sentences, sequence_length, embedding_size].
            self.W = tf.Variable(
                tf.random_uniform([vocab_size, embedding_size], -1.0, 1.0),
                name="W")
            self.embedded_chars = tf.nn.embedding_lookup(self.W, self.input_x)  # word-vector representation of the input
            # self.embedded_chars_expanded adds one dimension so the shape becomes
            # [num_sentences, sequence_length, embedding_size, 1], which is convenient for convolution.
            self.embedded_chars_expanded = tf.expand_dims(self.embedded_chars, -1)

        # Create a convolution + maxpool layer for each filter size
        pooled_outputs = []
        for i, filter_size in enumerate(filter_sizes):
            with tf.name_scope("conv-maxpool-%s" % filter_size):
                # Convolution Layer
                filter_shape = [filter_size, embedding_size, 1, num_filters]
                W = tf.Variable(tf.truncated_normal(filter_shape, stddev=0.1), name="W")
                b = tf.Variable(tf.constant(0.1, shape=[num_filters]), name="b")  # one bias entry of 0.1 per filter
                conv = tf.nn.conv2d(
                    self.embedded_chars_expanded,
                    W,
                    strides=[1, 1, 1, 1],
                    padding="VALID",
                    name="conv")
                # Apply nonlinearity
                h = tf.nn.relu(tf.nn.bias_add(conv, b), name="relu")
                # Maxpooling over the outputs
                pooled = tf.nn.max_pool(
                    h,
                    ksize=[1, sequence_length - filter_size + 1, 1, 1],
                    strides=[1, 1, 1, 1],
                    padding='VALID',
                    name="pool")
                pooled_outputs.append(pooled)

        # Combine all the pooled features
        # Concatenate and flatten the outputs of the different filter sizes for the fully connected layer
        num_filters_total = num_filters * len(filter_sizes)
        self.h_pool = tf.concat(pooled_outputs, 3)
        self.h_pool_flat = tf.reshape(self.h_pool, [-1, num_filters_total])

        # Add dropout
        with tf.name_scope("dropout"):
            self.h_drop = tf.nn.dropout(self.h_pool_flat, self.dropout_keep_prob)

        # Final (unnormalized) scores and predictions (fully connected output layer)
        with tf.name_scope("output"):
            W = tf.get_variable(
                "W",
                shape=[num_filters_total, num_classes],
                initializer=tf.contrib.layers.xavier_initializer())
            b = tf.Variable(tf.constant(0.1, shape=[num_classes]), name="b")
            l2_loss += tf.nn.l2_loss(W)
            l2_loss += tf.nn.l2_loss(b)
            self.scores = tf.nn.xw_plus_b(self.h_drop, W, b, name="scores")
            self.predictions = tf.argmax(self.scores, 1, name="predictions")

        # Calculate mean cross-entropy loss
        with tf.name_scope("loss"):
            losses = tf.nn.softmax_cross_entropy_with_logits(logits=self.scores, labels=self.input_y)
            self.loss = tf.reduce_mean(losses) + l2_reg_lambda * l2_loss

        # Accuracy
        with tf.name_scope("accuracy"):
            correct_predictions = tf.equal(self.predictions, tf.argmax(self.input_y, 1))
            self.accuracy = tf.reduce_mean(tf.cast(correct_predictions, "float"), name="accuracy")

Reproduced paper:

Convolutional Neural Networks for Sentence Classification

Yoon Kim

arXiv:1408.5882, https://arxiv.org/abs/1408.5882

References:

[1] Convolutional Neural Networks for Sentence Classification
[2] Tensorflow版TextCNN主要代码解析 (TextCNN code walkthrough in TensorFlow)
[3] Recurrent Neural Network for Text Classification with Multi-Task Learning
[4] Implementing a CNN for Text Classification in TensorFlow
[5] Understanding Convolutional Neural Networks for NLP
[6] textcnn实现-github (TextCNN implementation on GitHub)

Project address: https://github.com/finisky/TextCNN

The implementation there is based on: https://github.com/Shawn1993/cn



