Hand-Rolled Series

Nostalgia Trip: Hand-Rolling the Transformer I Once Built by Hand

Full code:

    import math
    import paddle
    import paddle.nn as nn
    import paddle.nn.functional as F

    class MaskMultiHeadAttention(nn.Layer):
        def __init__(self, hidden_size, num_heads):
            super(MaskMultiHeadAttention, se