This article walks through using TensorFlow Lite (TFLite) with a 1D CNN to process sequence data: training a Keras model, converting it to tflite, running it on a phone, and quantizing it.
Development environment
tf.__version__
'2.0.0-beta1'
tf.keras.__version__
'2.2.4-tf'
Data source
http://www.cis.fordham.edu/wisdm/dataset.php
The task is to classify accelerometer sensor readings (x, y, z) into 6 activity classes: Downstairs, Upstairs, Jogging, Sitting, Standing, Walking.
Data preprocessing functions
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
from sklearn import metrics
from sklearn.metrics import classification_report
from sklearn import preprocessing
import keras
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Flatten, Reshape, GlobalAveragePooling1D
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Conv1D, MaxPooling1D, LSTM
columns = ['user','activity','timestamp', 'x-axis', 'y-axis', 'z-axis']
def read_data(file_path):
    train = pd.read_csv(file_path, header=None, names=columns)
    train = train.dropna()
    return train
# mean and std
def feature_normalize(feature):
    mu = np.mean(feature, axis=0)
    sigma = np.std(feature, axis=0)
    return (feature - mu) / sigma
# x1,y1,z1, x2, y2, z2
def create_segments_for_rnn(df, step, time_steps=200):
    X_train = []
    hot_test = []
    for i in range(0, df.shape[0] - time_steps, step):
        xyz_data = df[['x-axis', 'y-axis', 'z-axis']][i:i + time_steps]
        X_train.append(np.array(xyz_data))
        # the most frequent activity inside the window becomes the window label
        label = stats.mode(df['activity'][i: i + time_steps])[0][0]
        hot_test.append(label)
    # How do we recover the mapping between the one-hot columns and the label strings?
    # (see the sketch right after this function)
    Y_train = np.asarray(pd.get_dummies(hot_test), dtype=np.float32)
    return np.array(X_train), Y_train
# x1, x2, x3, -----, x(timesteps), y1,...z1,...
def create_segments_for_cnn(df, step, time_steps=200):
    X_train = []
    hot_test = []
    for i in range(0, df.shape[0] - time_steps, step):
        xs = df['x-axis'].values[i: i + time_steps]
        ys = df['y-axis'].values[i: i + time_steps]
        zs = df['z-axis'].values[i: i + time_steps]
        X_train.append([xs, ys, zs])
        label = stats.mode(df['activity'][i: i + time_steps])[0][0]
        hot_test.append(label)
    # note: the (n, 3, time_steps) array is reshaped, not transposed, into (n, time_steps, 3)
    reshaped_segments = np.asarray(X_train, dtype=np.float32).reshape(-1, time_steps, 3)
    # the one-hot <-> label-string mapping can be recovered as shown above
    Y_train = np.asarray(pd.get_dummies(hot_test), dtype=np.float32)
    return reshaped_segments, Y_train
def shuffle_data(X, Y):
    np.random.seed(10)
    randomList = np.arange(X.shape[0])
    np.random.shuffle(randomList)
    return X[randomList], Y[randomList]
def split_data(X, Y, rate):
    X_train = X[int(X.shape[0] * rate):]
    Y_train = Y[int(Y.shape[0] * rate):]
    X_val = X[:int(X.shape[0] * rate)]
    Y_val = Y[:int(Y.shape[0] * rate)]
    return X_train, Y_train, X_val, Y_val
Network models
RNN (LSTM)
Parameter meanings: 64 is the number of LSTM hidden units; input_length is the length of the time sequence; input_dim is the number of input features.
return_sequences controls many-to-many versus many-to-one, i.e. whether the layer returns the full output sequence or only the last output.
def build_rnn_model(shape, num_classes):
    model = Sequential()
    model.add(LSTM(64, input_length=shape[1], input_dim=shape[2], return_sequences=True))
    model.add(LSTM(64, return_sequences=False))
    model.add(Dense(num_classes, activation='softmax'))
    print(model.summary())
    model.compile(loss='categorical_crossentropy',
                  optimizer='adam', metrics=['accuracy'])
    return model
CNN
Parameter meanings: 30 is the number of filters (how many features to extract); 10 is the kernel size; input_shape is the shape of the input the filters slide over.
def build_cnn_model(shape, num_classes):
    model = Sequential()
    model.add(Conv1D(30, 10, activation='relu', input_shape=(shape[1], shape[2])))
    model.add(Conv1D(30, 10, activation='relu'))
    model.add(MaxPooling1D(3))
    model.add(Conv1D(48, 10, activation='relu'))
    model.add(Conv1D(48, 10, activation='relu'))
    model.add(GlobalAveragePooling1D())
    model.add(Dropout(0.5))
    model.add(Dense(num_classes, activation='softmax'))
    print(model.summary())
    model.compile(loss='categorical_crossentropy',
                  optimizer='adam', metrics=['accuracy'])
    return model
Here it is worth understanding what a 1D CNN is and how it differs from a 2D CNN:
https://juejin.im/post/5beb7432f265da61524cf27c
[Translation] Using 1D convolutional neural networks in Keras to process time-series data
The corresponding reference code:
https://github.com/ni79ls/har-keras-cnn/blob/master/20180903_Keras_HAR_WISDM_CNN_v1.0_for_medium.py
Whether one-, two-, or three-dimensional, convolutional neural networks (CNNs) share the same characteristics and the same processing approach. The key difference is the dimensionality of the input data and how the feature detector (filter) slides across it.
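A quick way to see the difference is to compare the input layouts and output shapes of the two layer types. A small standalone sketch (not part of the training code; tf was already imported above):

# 1D conv: input is (batch, time_steps, channels); the kernel slides along the time axis only.
x1d = tf.zeros((1, 80, 3))
print(tf.keras.layers.Conv1D(30, 10)(x1d).shape)        # (1, 71, 30): 80 - 10 + 1 positions, 30 filters
# 2D conv: input is (batch, height, width, channels); the kernel slides along both spatial axes.
x2d = tf.zeros((1, 28, 28, 1))
print(tf.keras.layers.Conv2D(30, (10, 10))(x2d).shape)  # (1, 19, 19, 30)

The (1, 71, 30) shape is exactly the first Conv1D output that shows up later in the TFLite tensor dump.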
Model training
df = read_data("WISDM_ar_v1.1_raw.txt")
df['x-axis'] = feature_normalize(df['x-axis'])
df['y-axis'] = feature_normalize(df['y-axis'])
df['z-axis'] = feature_normalize(df['z-axis'])
df.head()
#_x, _y = create_segments_for_cnn(df, 40, 80)
_x, _y = create_segments_for_rnn(df, 40, 80)
_xs, _ys = shuffle_data(_x, _y)
x_train, y_train, x_test, y_test = split_data(_xs, _ys, 0.1)
model = build_cnn_model(x_train.shape, 6)
callbacks_list = [
    keras.callbacks.ModelCheckpoint(
        filepath='callback_test.h5',
    ),
    # keras.callbacks.TensorBoard(log_dir='my_log_dir', histogram_freq=1,)
]
# Hyper-parameters
BATCH_SIZE = 400
EPOCHS = 30
# Enable validation to use ModelCheckpoint and EarlyStopping callbacks.
history = model.fit(x_train,
                    y_train,
                    batch_size=BATCH_SIZE,
                    epochs=EPOCHS,
                    validation_split=0.2,
                    # callbacks=callbacks_list,   (not well supported in this tf.keras version)
                    verbose=1)
Choosing the data-preparation format
Different data layouts give different training results:
_x, _y = create_segments_for_rnn(df, 40, 80)   better accuracy, but generating the training data is slow
_x, _y = create_segments_for_cnn(df, 40, 80)   somewhat worse accuracy, but generating the training data is fast
A likely reason: the cnn variant stacks [xs, ys, zs] and then reshapes without transposing, so the x/y/z values are no longer aligned per time step.
About callbacks and TensorBoard
The tf.keras version used here has poor callback support; in practice only saving the h5 file (ModelCheckpoint) worked. To use callbacks, go through keras.callbacks:
callbacks_list = [
    keras.callbacks.ModelCheckpoint(
        filepath='lstm_model.h5',
    ),
    keras.callbacks.TensorBoard(
        log_dir='my_log_dir',
        histogram_freq=1,
    )
]
Plotting the training curves
Some examples found online fail with "key not found" errors because the metric names differ between versions; just check history.history.keys() to see which fields are actually available.
In [54]: history.history.keys()
Out[54]: ['loss', 'val_accuracy', 'val_loss', 'accuracy']
print("\n--- Learning curve of model training ---\n")
# summarize history for accuracy and loss
plt.figure(figsize=(6, 4))
plt.plot(history.history['accuracy'], "g--", label="Accuracy of training data")
plt.plot(history.history['val_accuracy'], "g", label="Accuracy of validation data")
plt.plot(history.history['loss'], "r--", label="Loss of training data")
plt.plot(history.history['val_loss'], "r", label="Loss of validation data")
plt.title('Model Accuracy and Loss')
plt.ylabel('Accuracy and Loss')
plt.xlabel('Training Epoch')
plt.ylim(0)
plt.legend()
plt.show()
Results on the test data
score = model.evaluate(x_test, y_test, verbose=1)
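classification_report was imported above but never used; a sketch of per-class precision/recall on the test split (argmax converts one-hot / softmax rows back to class indices):

print("test loss, test accuracy:", score)
y_pred = np.argmax(model.predict(x_test), axis=1)
y_true = np.argmax(y_test, axis=1)
print(classification_report(y_true, y_pred))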
Saving the model as an h5 file
model.save('0710_cnn.h5')
Converting the h5 file to tflite format
Using tflite_convert (recommended)
usage: tflite_convert [-h] --output_file OUTPUT_FILE
(--saved_model_dir SAVED_MODEL_DIR | --keras_model_file KERAS_MODEL_FILE)
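For example, converting the h5 file saved above with the flags from the usage string (file names are the ones used elsewhere in this article):

tflite_convert --keras_model_file=0710_cnn.h5 --output_file=30_cnn.tflite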
tf.lite.TFLiteConverter.from_keras_model(model)
model = tf.keras.models.load_model('path_to_my_model.h5')
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()
open("converted_model.tflite", "wb").write(tflite_model)
Using the tflite model file on a phone
Modify and build the minimal example
bazel build --cxxopt='--std=c++11' //tensorflow/lite/examples/minimal:minimal \
--crosstool_top=//external:android/crosstool \
--host_crosstool_top=@bazel_tools//tools/cpp:toolchain \
--cpu=arm64-v8a
tflite::PrintInterpreterState(interpreter.get());
This call prints the structure of the model, for example:
Interpreter has 41 tensors and 12 nodes
inputs/outputs
Inputs: 1
Outputs: 0
tensor
Tensor 0 Identity kTfLiteFloat32 kTfLiteArenaRw 24 bytes ( 0.0 MB) 1 6
Tensor 1 conv1d_input kTfLiteFloat32 kTfLiteArenaRw 960 bytes ( 0.0 MB) 1 80 3
Tensor 2 sequential/conv1d/Relu kTfLiteFloat32 kTfLiteArenaRw 8520 bytes ( 0.0 MB) 1 1 71 30
Tensor 3 sequential/conv1d/conv1d/ExpandDims kTfLiteFloat32 kTfLiteArenaRw 960 bytes ( 0.0 MB) 1 1 80 3
Tensor 4 sequential/conv1d/conv1d/ExpandDims/dim_0 kTfLiteInt32 kTfLiteMmapRo 16 bytes ( 0.0 MB) 1 4
Tensor 5 sequential/conv1d/conv1d/ExpandDims_1 kTfLiteFloat32 kTfLiteMmapRo 3600 bytes ( 0.0 MB) 30 1 10 3
Tensor 6 sequential/conv1d/conv1d_bias kTfLiteFloat32 kTfLiteMmapRo 120 bytes ( 0.0 MB) 30
Tensor 7 sequential/conv1d_1/conv1d/ExpandDims_1 kTfLiteFloat32 kTfLiteMmapRo 36000 bytes ( 0.0 MB) 30 1 10 30
Tensor 8 sequential/conv1d_1/conv1d/Squeeze kTfLiteFloat32 kTfLiteArenaRw 7440 bytes ( 0.0 MB) 1 1 62 30
Tensor 9 sequential/conv1d_1/conv1d_bias kTfLiteFloat32 kTfLiteMmapRo 120 bytes ( 0.0 MB) 30
Tensor 10 sequential/conv1d_2/conv1d/ExpandDims kTfLiteFloat32 kTfLiteArenaRw 2400 bytes ( 0.0 MB) 1 1 20 30
Tensor 11 sequential/conv1d_2/conv1d/ExpandDims/dim_0 kTfLiteInt32 kTfLiteMmapRo 16 bytes ( 0.0 MB) 1 4
Tensor 12 sequential/conv1d_2/conv1d/ExpandDims_1 kTfLiteFloat32 kTfLiteMmapRo 57600 bytes ( 0.1 MB) 48 1 10 30
Tensor 13 sequential/conv1d_2/conv1d/Squeeze kTfLiteFloat32 kTfLiteArenaRw 2112 bytes ( 0.0 MB) 1 1 11 48
Tensor 14 sequential/conv1d_2/conv1d_bias kTfLiteFloat32 kTfLiteMmapRo 192 bytes ( 0.0 MB) 48
Tensor 15 sequential/conv1d_3/Relu kTfLiteFloat32 kTfLiteArenaRw 384 bytes ( 0.0 MB) 1 2 48
Tensor 16 sequential/conv1d_3/conv1d/ExpandDims_1 kTfLiteFloat32 kTfLiteMmapRo 92160 bytes ( 0.1 MB) 48 1 10 48
Tensor 17 sequential/conv1d_3/conv1d/Squeeze kTfLiteFloat32 kTfLiteArenaRw 384 bytes ( 0.0 MB) 1 1 2 48
Tensor 18 sequential/conv1d_3/conv1d/Squeeze_shape kTfLiteInt32 kTfLiteMmapRo 12 bytes ( 0.0 MB) 3
Tensor 19 sequential/conv1d_3/conv1d_bias kTfLiteFloat32 kTfLiteMmapRo 192 bytes ( 0.0 MB) 48
Tensor 20 sequential/dense/BiasAdd kTfLiteFloat32 kTfLiteArenaRw 24 bytes ( 0.0 MB) 1 6
Tensor 21 sequential/dense/MatMul/ReadVariableOp/transpose kTfLiteFloat32 kTfLiteMmapRo 1152 bytes ( 0.0 MB) 6 48
Tensor 22 sequential/dense/MatMul_bias kTfLiteFloat32 kTfLiteMmapRo 24 bytes ( 0.0 MB) 6
Tensor 23 sequential/global_average_pooling1d/Mean kTfLiteFloat32 kTfLiteArenaRw 192 bytes ( 0.0 MB) 1 48
Tensor 24 sequential/global_average_pooling1d/Mean/reduction_indices kTfLiteInt32 kTfLiteMmapRo 4 bytes ( 0.0 MB)
Tensor 25 sequential/max_pooling1d/ExpandDims kTfLiteFloat32 kTfLiteArenaRw 7440 bytes ( 0.0 MB) 1 62 1 30
Tensor 26 sequential/max_pooling1d/ExpandDims/dim_0 kTfLiteInt32 kTfLiteMmapRo 16 bytes ( 0.0 MB) 1 4
Tensor 27 sequential/max_pooling1d/MaxPool kTfLiteFloat32 kTfLiteArenaRw 2400 bytes ( 0.0 MB) 1 20 1 30
Tensor 28 (null) kTfLiteInt32 kTfLiteArenaRw 12 bytes ( 0.0 MB) 3
Tensor 29 (null) kTfLiteInt32 kTfLiteArenaRw 4 bytes ( 0.0 MB) 1
Tensor 30 (null) kTfLiteFloat32 kTfLiteArenaRw 192 bytes ( 0.0 MB) 48
Tensor 31 (null) kTfLiteNoType kTfLiteMemNone 0 bytes ( 0.0 MB) (null)
Tensor 32 (null) kTfLiteNoType kTfLiteMemNone 0 bytes ( 0.0 MB) (null)
Tensor 33 (null) kTfLiteFloat32 kTfLiteArenaRw 8520 bytes ( 0.0 MB) 1 1 71 30
Tensor 34 (null) kTfLiteFloat32 kTfLiteArenaRwPersistent 3600 bytes ( 0.0 MB) 30 30
Tensor 35 (null) kTfLiteFloat32 kTfLiteArenaRw 74400 bytes ( 0.1 MB) 1 1 62 300
Tensor 36 (null) kTfLiteFloat32 kTfLiteArenaRwPersistent 36000 bytes ( 0.0 MB) 300 30
Tensor 37 (null) kTfLiteFloat32 kTfLiteArenaRw 13200 bytes ( 0.0 MB) 1 1 11 300
Tensor 38 (null) kTfLiteFloat32 kTfLiteArenaRwPersistent 57600 bytes ( 0.1 MB) 300 48
Tensor 39 (null) kTfLiteFloat32 kTfLiteArenaRw 3840 bytes ( 0.0 MB) 1 1 2 480
Tensor 40 (null) kTfLiteFloat32 kTfLiteArenaRwPersistent 92160 bytes ( 0.1 MB) 480 48
Node (in this dump the builtin codes map to TFLite ops: 22 = RESHAPE, 3 = CONV_2D, 17 = MAX_POOL_2D, 40 = MEAN, 9 = FULLY_CONNECTED, 25 = SOFTMAX)
Node 0 Operator Builtin Code 22
Inputs: 1 4
Outputs: 3
Node 1 Operator Builtin Code 3
Inputs: 3 5 6
Outputs: 2
Node 2 Operator Builtin Code 3
Inputs: 2 7 9
Outputs: 8
Node 3 Operator Builtin Code 22
Inputs: 8 26
Outputs: 25
Node 4 Operator Builtin Code 17
Inputs: 25
Outputs: 27
Node 5 Operator Builtin Code 22
Inputs: 27 11
Outputs: 10
Node 6 Operator Builtin Code 3
Inputs: 10 12 14
Outputs: 13
Node 7 Operator Builtin Code 3
Inputs: 13 16 19
Outputs: 17
Node 8 Operator Builtin Code 22
Inputs: 17 18
Outputs: 15
Node 9 Operator Builtin Code 40
Inputs: 15 24
Outputs: 23
Node 10 Operator Builtin Code 9
Inputs: 23 21 22
Outputs: 20
Node 11 Operator Builtin Code 25
Inputs: 20
Outputs: 0
Tensor names and shapes
Tensor 0 Identity kTfLiteFloat32 kTfLiteArenaRw 24 bytes ( 0.0 MB) 1 6
Tensor 1 conv1d_input kTfLiteFloat32 kTfLiteArenaRw 960 bytes ( 0.0 MB) 1 80 3
How to fill the input tensor
Does the shape of the input (a 1D vector, a 2D matrix, and so on) change how the tensor data has to be written? In other words, does the input have to be represented according to its shape, for example as a 2D array for a matrix? It does not: as long as the values are stored in the same row-major order as the multi-dimensional layout, a flat buffer is enough; on the Python side a single np.reshape is all that is needed.
float sensor_test[] = {  // [0, 1, 0...]
    -1.02622092e-01, 1.73495474e+00, 1.96860060e+00, -7.71851445e-03,
    -5.41881331e-01, 7.67533877e-01, -3.54595601e-02, -2.29101321e+00,
    // -----
};
// Fill input buffers
// TODO(user): Insert code to fill input tensors
Interpreter *interp = interpreter.get();
TfLiteTensor* input = interp->tensor(interp->inputs()[0]);
for (int i = 0; i < input->bytes / sizeof(float); i++) {
    //input->data.f[i] = feature_test[i];
    input->data.f[i] = sensor_test[i];  // same as the 1D case: a flat array, no multi-dimensional layout needed
    //input->data.f[i] = tensor_test_1[i];
}
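The sensor_test values themselves can be dumped from a Python test window. Because the TFLite input tensor is laid out row-major (1 x 80 x 3), flattening the (80, 3) window already gives the x1, y1, z1, x2, ... order the loop above expects; a sketch:

window = x_test[0].astype(np.float32)            # shape (80, 3)
flat = window.reshape(-1)                        # x1, y1, z1, x2, y2, z2, ...
print(", ".join("%.8e" % v for v in flat))       # paste into the C++ float array
print("expected label (one-hot):", y_test[0])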
Quantizing the TFLite model
The toco binary used here was built from source rather than installed from a package; the arguments that follow (input/output array names and shapes) can be read from the interpreter dump printed above.
From the output it is clear that not every float tensor is converted to uint8.
./toco --input_file='30_cnn.tflite' --input_format=TFLITE --output_format=TFLITE --output_file='quanized_30_cnn.tflite' --inference_type=FLOAT --input_type=FLOAT --input_arrays=conv1d_input --output_arrays=Identity --input_shapes=1,80,3 --post_training_quantize
2019-07-11 12:06:26.592093: W tensorflow/lite/toco/toco_cmdline_flags.cc:283] --input_type is deprecated. It was an ambiguous flag that set both --input_data_types and --inference_input_type. If you are trying to complement the input file with information about the type of input arrays, use --input_data_type. If you are trying to control the quantization/dequantization of real-numbers input arrays in the output file, use --inference_input_type.
2019-07-11 12:06:26.594579: I tensorflow/lite/toco/graph_transformations/graph_transformations.cc:39] Before Removing unused ops: 12 operators, 28 arrays (0 quantized)
2019-07-11 12:06:26.594937: I tensorflow/lite/toco/graph_transformations/graph_transformations.cc:39] Before general graph transformations: 12 operators, 28 arrays (0 quantized)
2019-07-11 12:06:26.595742: I tensorflow/lite/toco/graph_transformations/graph_transformations.cc:39] After general graph transformations pass 1: 12 operators, 28 arrays (0 quantized)
2019-07-11 12:06:26.596726: I tensorflow/lite/toco/graph_transformations/graph_transformations.cc:39] Before dequantization graph transformations: 12 operators, 28 arrays (0 quantized)
2019-07-11 12:06:26.597362: I tensorflow/lite/toco/allocate_transient_arrays.cc:345] Total transient array allocated size: 17024 bytes, theoretical optimal value: 16064 bytes.
2019-07-11 12:06:26.597556: I tensorflow/lite/toco/toco_tooling.cc:397] Estimated count of arithmetic ops: 0.00166014 billion (note that a multiply-add is counted as 2 ops).
2019-07-11 12:06:26.598574: I tensorflow/lite/toco/tflite/export.cc:569] Quantizing TFLite model after conversion to flatbuffer. dump_graphviz will only output the model before this transformation. To visualize the output graph use lite/tools/optimize.py.
2019-07-11 12:06:26.603058: I tensorflow/lite/tools/optimize/quantize_weights.cc:199] Skipping quantization of tensor sequential/conv1d/conv1d/ExpandDims_1 because it has fewer than 1024 elements (900).
2019-07-11 12:06:26.603184: I tensorflow/lite/tools/optimize/quantize_weights.cc:278] Quantizing tensor sequential/conv1d_1/conv1d/ExpandDims_1 with 9000 elements for hybrid evaluation.
2019-07-11 12:06:26.603352: I tensorflow/lite/tools/optimize/quantize_weights.cc:278] Quantizing tensor sequential/conv1d_2/conv1d/ExpandDims_1 with 14400 elements for hybrid evaluation.
2019-07-11 12:06:26.603476: I tensorflow/lite/tools/optimize/quantize_weights.cc:278] Quantizing tensor sequential/conv1d_3/conv1d/ExpandDims_1 with 23040 elements for hybrid evaluation.
2019-07-11 12:06:26.603670: I tensorflow/lite/tools/optimize/quantize_weights.cc:199] Skipping quantization of tensor sequential/dense/MatMul/ReadVariableOp/transpose because it has fewer than 1024 elements (288).
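The same post-training weight quantization can also be done from the Python converter API instead of toco; a sketch, assuming the tf.lite.Optimize flag is available in this TF version, and writing to the same output name used with toco above:

model = tf.keras.models.load_model('0710_cnn.h5')
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]   # post-training weight quantization
tflite_quant_model = converter.convert()
open("quanized_30_cnn.tflite", "wb").write(tflite_quant_model)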
Converting an LSTM h5 model to tflite
The original plan was to use LSTM, and LSTM does appear in the list of TFLite operators, but the conversion failed:
ValueError: Cannot find the Placeholder op that is an input to the ReadVariableOp.
The corresponding source code is shown below; if the op is not a Placeholder, what is it?
165 if map_name_to_node[input_name].op != "Placeholder":
166 raise ValueError("Cannot find the Placeholder op that is an input "
167 "to the ReadVariableOp.")
Adding log statements directly in that source file produced no output; the reason is unclear.
Setting a breakpoint in PyCharm showed that map_name_to_node[input_name].op is actually Switch.
The GraphDef contains a large number of Switch ops, presumably control-flow ops, so I gave up on LSTM and switched to the CNN, which also works well.
This concludes the walk-through of processing sequence data with a 1D CNN and TFLite.