NLP Study Notes 1: Basic PyTorch Operations and Implementing the Perceptron and FF Networks

2023-10-25 12:20


Some of my own NLP study notes.

Part 1: Some basic PyTorch operations

1 Creating tensors

import torch
import numpy as np

x = torch.Tensor(2, 3)  # create a 2x3 torch tensor (its values are uninitialized)
print(x.type())         # type() is a method of the Tensor class; it returns a Python string
# torch.FloatTensor is the default type for real numbers; GPUs generally handle it well
x = torch.rand(2, 3)    # uniform distribution
x = torch.randn(2, 3)   # normal distribution
x = torch.zeros(2, 3)   # all-zeros tensor
x = torch.ones(2, 3)    # all-ones tensor
x.fill_(5)              # fill the tensor in place with one value

# Tensor from list
x = torch.Tensor([[1, 2, 3], [4, 5, 6]])         # build a tensor from a nested list

# From numpy to torch
a = np.random.rand(2, 3)
x = torch.from_numpy(a)                          # from_numpy converts a NumPy array to a tensor
x = torch.from_numpy(a).type(torch.FloatTensor)  # type() can set the data type
y = torch.from_numpy(a).type_as(x)               # type_as() matches the data type of another tensor

# Data types and type conversion; the default is usually FloatTensor
z = x.long()                                     # convert to long
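
One detail worth checking when converting from NumPy is the resulting data type. The snippet below is a small sketch (the variable names `x64`, `x32`, and `xi` are mine, not from the notes) showing that `from_numpy` keeps NumPy's float64 and shares memory with the array, while `.float()` produces a float32 copy.

```python
import numpy as np
import torch

a = np.random.rand(2, 3)                   # NumPy defaults to float64
x64 = torch.from_numpy(a)                  # keeps float64 and shares memory with `a`
x32 = x64.float()                          # float32 copy, same effect as .type(torch.FloatTensor)
xi = torch.tensor([[1, 2, 3], [4, 5, 6]])  # lowercase torch.tensor infers int64 from integer data

print(x64.dtype, x32.dtype, xi.dtype)

# Because from_numpy shares memory, an in-place edit of the array is visible in the tensor.
a[0, 0] = 42.0
print(x64[0, 0].item())                    # the float32 copy x32 is not affected
```

If the tensor is going to be fed to a model, converting to float32 early avoids dtype mismatches with the model's parameters.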

2 Basic tensor operations

# Addition and sum
print(torch.add(x, x))
print(torch.sum(x, dim=0))  # sum along dim 0, giving one sum per column

# Element-wise product
print(torch.mul(x, x))
print(x * x)

# Range tensor
print(torch.arange(6))

# Return a tensor with a different shape
print(x.view(3, 2))
x1 = torch.arange(6).view(2, 3)

# Indexing + sum
x2 = torch.ones(3, 2).long()
x2[:, 1] += 1
print('x1 =', x1)
print('x2 =', x2)

# Matrix multiplication
print(torch.mm(x1, x2))
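
To make the shapes concrete, here is the same matrix product worked through with its values written out. This is just a check of the arithmetic, using the same `x1` and `x2` as above; the name `product` is mine.

```python
import torch

x1 = torch.arange(6).view(2, 3)   # values: [[0, 1, 2], [3, 4, 5]]
x2 = torch.ones(3, 2).long()
x2[:, 1] += 1                     # values: [[1, 2], [1, 2], [1, 2]]

product = torch.mm(x1, x2)        # (2x3) @ (3x2) gives a 2x2 result
print(product.tolist())           # [[3, 6], [12, 24]]
print(torch.sum(x1, dim=0).tolist())  # one sum per column: [3, 5, 7]
print(torch.sum(x1, dim=1).tolist())  # one sum per row: [3, 12]
```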

3 Checking the hardware available to PyTorch

import torch

print(torch.cuda.is_available())
# The calls below assume at least one CUDA device is present
print(torch.cuda.current_device())
print(torch.cuda.device(0))
print(torch.cuda.device_count())
print(torch.cuda.get_device_name(0))
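
On a machine without a GPU some of these calls fail, so a common pattern is to pick the device once and fall back to the CPU. A minimal sketch, assuming nothing beyond the standard `torch.device` API (the variable name `device` is my choice):

```python
import torch

# Fall back to the CPU when no CUDA device is present.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print('Using device:', device)

x = torch.ones(2, 3).to(device)   # move (or keep) the tensor on the chosen device
print(x.device.type)

if torch.cuda.is_available():
    # Only query the GPU name when a GPU is actually there.
    print(torch.cuda.get_device_name(torch.cuda.current_device()))
```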

4 Automatic differentiation in PyTorch

x = torch.ones(1, requires_grad=True)
print(x)
y = x + 42
print(y)
z = 3 * y * y
print(z)
z.backward()     # compute the gradients
print(x.grad)    # ∂z/∂x = 6(x+42) = 6*1+252 = 258
print(y.grad)    # None: y is a non-leaf tensor, so its gradient is not kept unless y.retain_grad() is called
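
`y` is an intermediate (non-leaf) tensor, so autograd does not keep its gradient by default. If you do want to inspect it, one option is `retain_grad()`; the sketch below repeats the computation with that single extra call.

```python
import torch

x = torch.ones(1, requires_grad=True)
y = x + 42
y.retain_grad()          # ask autograd to keep the gradient of this non-leaf tensor
z = 3 * y * y
z.backward()

print(x.grad)            # dz/dx = 6 * (x + 42) = 258
print(y.grad)            # dz/dy = 6 * y = 258
```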

Part 2: The Perceptron

import torch
import torch.nn as nn

# nn.Module is the base class for all neural networks
class Perceptron(nn.Module):
    """Our perceptron class"""

    def __init__(self, input_dim):
        """Constructor"""
        super().__init__()
        self.fc = nn.Linear(input_dim, 1)
        self.sigmoid = torch.nn.Sigmoid()

    def forward(self, x_in):
        # squeeze unwraps the result from the singleton list
        return self.sigmoid(self.fc(x_in))  # .squeeze()

print(Perceptron(10).forward(torch.ones(10)))

Activation functions

Sigmoid: $f(x) = \frac{1}{1 + e^{-x}}$

Tanh: $f(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$

ReLU: $f(x) = \max(0, x)$
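
As a quick sanity check, the three formulas can be written out directly with tensor operations and compared against PyTorch's built-in versions. The variable names are mine; this is only a sketch for checking the definitions, not how you would write them in a model.

```python
import torch

x = torch.tensor([-2.0, 0.0, 2.0])

# The formulas above, written out element-wise.
sigmoid = 1 / (1 + torch.exp(-x))
tanh = (torch.exp(x) - torch.exp(-x)) / (torch.exp(x) + torch.exp(-x))
relu = torch.clamp(x, min=0)      # max(0, x)

# Compare with PyTorch's built-ins.
print(torch.allclose(sigmoid, torch.sigmoid(x)))
print(torch.allclose(tanh, torch.tanh(x)))
print(torch.equal(relu, torch.relu(x)))
```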

Loss function

MSE loss: $L(y, \hat{y}) = \frac{1}{n}\sum_{i=1}^{n}(y_{i} - \hat{y}_{i})^{2}$

import torch
import torch.nn as nn

mse_loss = nn.MSELoss()
produced = torch.randn(2, 4, requires_grad=True)
print(produced)
expected = torch.randn(2, 4)
print(expected)
loss = mse_loss(produced, expected)
print(loss)
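
To connect the formula with the code: with its default settings, `nn.MSELoss` averages the squared differences over every element of the tensor. The sketch below computes that by hand and compares; the names `manual` and `builtin` are mine.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
produced = torch.randn(2, 4)
expected = torch.randn(2, 4)

# Mean of squared differences over all 8 elements.
manual = ((expected - produced) ** 2).mean()
builtin = nn.MSELoss()(produced, expected)
print(manual.item(), builtin.item())
```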

Categorical cross-entropy loss

$L(y, \hat{y}) = -\sum_{i=1}^{n} y_{i} \log(\hat{y}_{i})$

import torch
import torch.nn as nn

ce_loss = nn.CrossEntropyLoss()  # for binary classification, we can use nn.BCELoss()
produced = torch.randn(2, 4, requires_grad=True)  # 2*4, normal distribution
print(produced)
# input is an index for each vector indicating the correct category/class
expected = torch.tensor([1, 0], dtype=torch.int64)
loss = ce_loss(produced, expected)
print(loss)
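
Note that `nn.CrossEntropyLoss` takes raw scores (logits) and applies log-softmax internally, then averages the negative log-probability of the correct class over the batch. A small sketch that reproduces this by hand (the names `logits`, `picked`, `manual`, and `builtin` are mine):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(2, 4)          # raw scores, one row per sample
targets = torch.tensor([1, 0])      # index of the correct class for each row

log_probs = F.log_softmax(logits, dim=1)
picked = log_probs[torch.arange(2), targets]   # log-probability of the correct class
manual = -picked.mean()
builtin = nn.CrossEntropyLoss()(logits, targets)
print(manual.item(), builtin.item())
```

Since the target here is a class index, the sum over classes in the formula reduces to the single term for the correct class.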

Part 3: Language classification with the Perceptron

1 Setup

from random import randint

import torch
from torch.utils.data import Dataset, DataLoader
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

2 Data Preparation

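We build a LanguageRecognitionDataset class that processes the raw data and produces the dataset needed to train our language classifier. Each sentence becomes a binary vector that marks which character bigrams from the training vocabulary it contains.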

class LanguageRecognitionDataset(Dataset):
    """An automatically generated dataset for our language classification task."""

    def _get_bigrams(self, sentence_list):
        bigrams = {}
        # for each sentence
        for s in sentence_list:
            # for each bigram
            for k in range(len(s) - 1):
                bigrams[s[k:k+2]] = 1.0
        return bigrams.keys()

    def _get_bigram_vector(self, sentence):
        sent_bigrams = self._get_bigrams([sentence])
        vector = []
        for bigram in self.bigrams:
            vector.append(1.0 if bigram in sent_bigrams else 0.0)
        return vector

    def __init__(self, sample, training_bigrams=None):
        """
        Args:
            sample: List of sentences with their classification (True/False)
        """
        self.num_samples = len(sample)
        if not training_bigrams:
            # lowercase here too, so the vocabulary matches the lowercased sentences below
            self.bigrams = self._get_bigrams([x.lower() for x, _ in sample])
        else:
            self.bigrams = training_bigrams
        self.data = []
        for sentence, gold_label in sample:
            sentence = sentence.lower()
            item = {'inputs': torch.tensor(self._get_bigram_vector(sentence)), 'outputs': torch.tensor([gold_label])}
            self.data.append(item)

    def __len__(self):
        return self.num_samples

    def __getitem__(self, idx):
        return self.data[idx]

LanguageRecognitionDataset([("ciao ciao pippo", 1), ("la casa si trova in collina", 1)])[1]
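
The feature extraction is easier to follow on a tiny example without any PyTorch. The sketch below is a pure-Python restatement of the same idea; the helper names `char_bigrams` and `to_vector` are hypothetical and not part of the class above.

```python
def char_bigrams(sentence):
    """Return the set of character bigrams in a sentence."""
    return {sentence[k:k + 2] for k in range(len(sentence) - 1)}

def to_vector(sentence, vocabulary):
    """Mark with 1.0 every vocabulary bigram that occurs in the sentence."""
    present = char_bigrams(sentence)
    return [1.0 if bigram in present else 0.0 for bigram in vocabulary]

# Build the vocabulary from one "training" sentence.
vocabulary = sorted(char_bigrams("ciao ciao"))
print(vocabulary)                       # [' c', 'ao', 'ci', 'ia', 'o ']

print(to_vector("ciao", vocabulary))    # [0.0, 1.0, 1.0, 1.0, 0.0]
print(to_vector("mamma", vocabulary))   # [0.0, 0.0, 0.0, 0.0, 0.0]
```

The second sentence shares no bigram with the vocabulary, so it maps to the all-zeros vector. This is why the validation and test datasets must reuse the training bigrams: bigrams never seen in training simply have no slot in the input vector.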

3 Building a simple dataset

training_sentences = [
    ("Scienziata italiana scopre la più grande esplosione nell’Universo.", 1.0),
    ("Nell’ammasso di galassie di Ofiuco, distante 390 milioni di anni luce.", 1.0),
    ("Ha rilasciato una quantità di energia 5 volte più grande della precedente che deteneva il primato.", 1.0),
    ("Syria war: Turkey says thousands of migrants have crossed to EU.", 0.0),
    ("Turkey could no longer deal with the amount of people fleeing Syria's civil war, he added.", 0.0),
    ("Greece says it has blocked thousands of migrants from entering illegally from Turkey.", 0.0),
    ("Tutto perfetto? Non proprio. Ci sono elementi problematici che vanno considerati.", 1.0),
    ("Il primo è l’autonomia degli studenti, che devono essere in grado di gestire la tecnologia.", 1.0),
    ("Il secondo, è la durata e la cadenza delle lezioni.", 1.0),
    ("Per motivi di connessione, di competenze, di strumenti.", 1.0),
    ("Serve un’assistenza dedicata.", 1.0),
    ("Potremmo completare l’anno scolastico in versione virtuale?", 1.0),
    ("Siamo preparati per affiancare la didattica tradizionale a quella virtuale, ma non siamo pronti per sostituirla", 1.0),
    ("Various architectures of recurrent neural networks have been successful.", 0.0),
    ("They perform tasks relating to sequence measuring", 0.0),
    ("The networks operate by processing input components sequentially", 0.0),
    ("They retain a hidden vector between iterations", 0.0),
    ("It is constantly used and modified throughout the sequence.", 0.0),
    ("They are able to model arbitrarily complicated programs.", 0.0),
    ("L’Istituto, che raccoglie studenti di liceo scientifico, linguistico e tecnico economico, è l’esempio ideale.", 1.0),
]

validation_sentences = [
    ("L’Istituto superiore di sanità ha confermato tutti i casi esaminati.", 1.0),
    ("Measures announced after an emergency cabinet meeting also include the cancellation of the Paris half-marathon which was to be held on Sunday.", 0.0),
    ("Lavagne in condivisione, documenti scaricabili sulla piattaforma gratuita, esercizi collaborativi.", 1.0),
    ("Each encoder consists of two major components", 0.0),
]

test_sentences = [
    ("Il ministro della Salute francese ha raccomandato di salutarsi mantenendo le distanze, mentre l’Organizzazione mondiale della sanità alza l’allerta a molto alta.", 1.0),
    ("Possiamo riammalarci ma in questo caso si parla di ricaduta.", 1.0),
    ("The vast majority of infections and deaths are in China, where the virus originated late last year.", 0.0),
    ("France has banned all indoor gatherings of more than 5,000 people, as part of efforts to contain the country's coronavirus outbreak", 0.0),
]

def test_dataset_class():
    simple_dataset = LanguageRecognitionDataset(training_sentences)
    print('Dataset test:')
    for i in range(len(training_sentences)):
        print(f'  sample {i}: {simple_dataset[i]}')

test_dataset_class()

4 Model training 

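We build a Trainer class made up of the following parts:

  • training loop: iterates over the dataset with the model to solve our problem
  • evaluation function: measures how well the model is learning
  • prediction function: returns the model's output

For the model to learn properly, we need a loss function to measure the gap between the model's output and the true value, and an optimizer to update the model's parameters based on that loss.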
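
Stripped of logging and evaluation, the training loop comes down to four calls per step. Here is a bare-bones sketch on a toy regression problem; the data (learning y = 2x) and the 200-epoch setting are assumptions of mine, chosen only to keep the example small.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

# Toy data (an assumption for illustration): learn y = 2x.
inputs = torch.tensor([[1.0], [2.0], [3.0], [4.0]])
labels = 2 * inputs

model = nn.Linear(1, 1)
loss_function = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

with torch.no_grad():
    initial_loss = loss_function(model(inputs), labels).item()

for epoch in range(200):
    optimizer.zero_grad()                      # clear gradients from the previous step
    predictions = model(inputs)                # forward pass
    loss = loss_function(predictions, labels)  # how far off are we?
    loss.backward()                            # compute gradients of the loss
    optimizer.step()                           # update the parameters

final_loss = loss.item()
print(initial_loss, final_loss)
```

The loss function only measures the error; it is `optimizer.step()` that actually changes the weights, using the gradients that `backward()` stored on each parameter.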

class Trainer():
    """Utility class to train and evaluate a model."""

    def __init__(self, model, loss_function, optimizer):
        """
        Args:
            model: the model we want to train.
            loss_function: the loss_function to minimize.
            optimizer: the optimizer used to minimize the loss_function.
        """
        self.model = model
        self.loss_function = loss_function
        self.optimizer = optimizer

    def train(self, train_dataset, valid_dataset, epochs=1):
        """
        Args:
            train_dataset: a Dataset or DataLoader instance containing
                the training instances.
            valid_dataset: a Dataset or DataLoader instance used to evaluate
                learning progress.
            epochs: the number of times to iterate over train_dataset.

        Returns:
            avg_train_loss: the average training loss on train_dataset over
                epochs.
        """
        # >= 1 (not > 1), so that the default epochs=1 is accepted
        assert isinstance(epochs, int) and epochs >= 1
        print('Training...')

        train_loss = 0.0
        for epoch in range(epochs):
            print(' Epoch {:03d}'.format(epoch + 1))

            epoch_loss = 0.0
            for step, sample in enumerate(train_dataset):
                inputs = sample['inputs']
                labels = sample['outputs']

                # we need to set the gradients to zero before starting to do backpropagation
                # because PyTorch accumulates the gradients on subsequent backward passes
                self.optimizer.zero_grad()

                predictions = self.model(inputs)
                sample_loss = self.loss_function(predictions, labels)
                #print("Before BP:", list(model.parameters()))
                sample_loss.backward()
                self.optimizer.step()
                #print("After BP:", list(model.parameters()))

                # sample_loss is a Tensor, tolist returns a float (alternative: use float() instead of .tolist())
                epoch_loss += sample_loss.tolist()
                print('    [E: {:2d} @ step {}] current avg loss = {:0.4f}'.format(epoch, step, epoch_loss / (step + 1)))

            avg_epoch_loss = epoch_loss / len(train_dataset)
            train_loss += avg_epoch_loss
            print('  [E: {:2d}] train loss = {:0.4f}'.format(epoch, avg_epoch_loss))

            valid_loss = self.evaluate(valid_dataset)
            print('  [E: {:2d}] valid loss = {:0.4f}'.format(epoch, valid_loss))

        print('... Done!')

        avg_epoch_loss = train_loss / epochs
        return avg_epoch_loss

    def evaluate(self, valid_dataset):
        """
        Args:
            valid_dataset: the dataset to use to evaluate the model.

        Returns:
            avg_valid_loss: the average validation loss over valid_dataset.
        """
        valid_loss = 0.0
        # no gradient updates here
        with torch.no_grad():
            for sample in valid_dataset:
                inputs = sample['inputs']
                labels = sample['outputs']
                predictions = self.model(inputs)
                sample_loss = self.loss_function(predictions, labels)
                valid_loss += sample_loss.tolist()
        return valid_loss / len(valid_dataset)

    def predict(self, x):
        """
        Returns: hopefully the right prediction.
        """
        return self.model(x).tolist()

5 Finally, define the datasets, set up the trainer, and train our model

training_dataset = DataLoader(LanguageRecognitionDataset(training_sentences), batch_size=6)
validation_dataset = DataLoader(LanguageRecognitionDataset(validation_sentences, training_dataset.dataset.bigrams), batch_size=2)
test_dataset = DataLoader(LanguageRecognitionDataset(test_sentences, training_dataset.dataset.bigrams), batch_size=2)

print("Number of input dimensions", len(training_dataset.dataset.bigrams))
model = Perceptron(len(training_dataset.dataset.bigrams))
trainer = Trainer(
    model,
    loss_function=nn.MSELoss(),
    optimizer=optim.SGD(model.parameters(), lr=0.01)
)
avg_epoch_loss = trainer.train(training_dataset, validation_dataset, epochs=50)
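
It helps to see what the DataLoader actually hands to the training loop. Each dataset item is a dict of tensors, and the default collation stacks them key by key into batches. The sketch below uses made-up samples shaped like the dataset's items (the values and the name `samples` are mine):

```python
import torch
from torch.utils.data import DataLoader

# Five toy samples shaped like the items the dataset returns (values are made up).
samples = [
    {'inputs': torch.tensor([float(i), 0.0, 1.0]), 'outputs': torch.tensor([float(i % 2)])}
    for i in range(5)
]

loader = DataLoader(samples, batch_size=2)
shapes = [(tuple(batch['inputs'].shape), tuple(batch['outputs'].shape)) for batch in loader]

print(len(loader))   # number of batches, not number of samples: 3
print(shapes)        # [((2, 3), (2, 1)), ((2, 3), (2, 1)), ((1, 3), (1, 1))]
```

Two things follow for the code above: `len(train_dataset)` in the Trainer counts batches, so the reported losses are averages per batch, and the last batch can be smaller than `batch_size`.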

6 Evaluation

Check whether our model has actually learned something.

trainer.evaluate(test_dataset)

for step, batch in enumerate(test_dataset):
    print(step, trainer.predict(batch['inputs']), batch['outputs'])
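
The loss alone does not say how many sentences were classified correctly. One simple way to get an accuracy figure is to threshold the sigmoid output at 0.5. The notes do not do this, so treat it as a sketch; the predictions below are made-up numbers, not results from the model.

```python
import torch

# Made-up sigmoid outputs and gold labels, shaped (batch, 1) like the model's output.
predictions = torch.tensor([[0.91], [0.35], [0.62], [0.08]])
labels = torch.tensor([[1.0], [1.0], [1.0], [0.0]])

# In this dataset 1.0 marks Italian and 0.0 marks English.
predicted_labels = (predictions >= 0.5).float()
accuracy = (predicted_labels == labels).float().mean().item()
print(accuracy)   # 3 of 4 correct: 0.75
```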

Part 4: Language classification with a Feedforward Neural Network

1 Model Definition

class LanguageRecognitionFF(nn.Module):
    """A simple model that classifies language"""

    def __init__(self, input_dim, hparams):
        super().__init__()
        # Hidden layer: transforms the input bigram vector into
        # a hidden vector representation.
        self.fc1 = nn.Linear(input_dim, hparams.hidden_size)
        self.relu = nn.ReLU()
        # Output layer: transforms the hidden vector representation
        # into a single score, which the sigmoid squashes into (0, 1).
        self.fc2 = nn.Linear(hparams.hidden_size, 1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        hidden = self.fc1(x)
        relu = self.relu(hidden)
        result = self.fc2(relu)
        return self.sigmoid(result)
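
To get a feel for the size of this network, the same stack of layers can be written with `nn.Sequential` and inspected. The sizes below (10 inputs, 16 hidden units) are hypothetical; the real input dimension is the number of training bigrams.

```python
import torch
import torch.nn as nn

input_dim, hidden_size = 10, 16   # hypothetical sizes for illustration

# Same layer order as the class above: Linear -> ReLU -> Linear -> Sigmoid.
model = nn.Sequential(
    nn.Linear(input_dim, hidden_size),
    nn.ReLU(),
    nn.Linear(hidden_size, 1),
    nn.Sigmoid(),
)

batch = torch.rand(4, input_dim)          # a batch of 4 input vectors
output = model(batch)
n_params = sum(p.numel() for p in model.parameters())

print(tuple(output.shape))   # (4, 1): one probability-like score per sample
print(n_params)              # (10*16 + 16) + (16*1 + 1) = 193
```

Compared with the perceptron, the only structural change is the extra hidden layer with its non-linearity, which lets the model combine bigram features instead of weighting each one independently.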

2 Model Building

Try to keep the hyperparameters separate from the model definition, so that we can change them without touching the model.

class HParams():
    hidden_size = 16

Create an instance of the model:

model_ff = LanguageRecognitionFF(len(training_dataset.dataset.bigrams), HParams)
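
The notes pass the `HParams` class itself, which works because `hidden_size` is a class attribute. If you want several configurations side by side, one alternative is a dataclass whose instances can be overridden individually. This is a sketch of that variation, not what the notes use; the `learning_rate` field is a hypothetical addition.

```python
from dataclasses import dataclass

@dataclass
class HParams:
    hidden_size: int = 16
    learning_rate: float = 1e-5   # hypothetical extra field, not in the original HParams

default = HParams()
wider = HParams(hidden_size=64)   # override one value, keep the rest

print(default)
print(wider)
```

Any object with a `hidden_size` attribute works with `LanguageRecognitionFF`, so an instance such as `wider` could be passed in place of the class.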

3 Model Training

trainer = Trainer(
    model=model_ff,
    loss_function=nn.MSELoss(),
    optimizer=optim.SGD(model_ff.parameters(), lr=1e-5)
)
trainer.train(training_dataset, validation_dataset, 50)

4 Model Evaluation

trainer.evaluate(test_dataset)

for step, batch in enumerate(test_dataset):
    print(trainer.predict(batch['inputs']), batch['outputs'])




http://www.chinasem.cn/article/282525
