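This post walks through the Raspberry Pi (rpi) build to see how the TFLite code is put together, following tensorflow/lite/g3doc/rpi.md.
Fixing the directory paths in the .sh scripts
The lite code was moved out of contrib, one directory level up, but the .sh scripts used for the rpi build were not updated to match. Each of them needs one fewer ".." in its relative path, for example: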
tensorflow/lite/tools/make/download_dependencies.sh
-cd "$SCRIPT_DIR/../../../../.."
+cd "$SCRIPT_DIR/../../../.."
tensorflow/lite/tools/make/build_rpi_lib.sh
-cd "$SCRIPT_DIR/../../../../.."
+cd "$SCRIPT_DIR/../../../.."
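The effect of dropping one ".." can be checked with a small sketch. The helper name walk_up and the checkout path below are assumptions made for illustration; they are not part of the TFLite scripts.

```bash
#!/usr/bin/env bash
# Sketch: walk up N parent directories from a start path (string-only, no cd).

walk_up() {
  local dir="$1" levels="$2" i
  for ((i = 0; i < levels; i++)); do
    dir="$(dirname "$dir")"
  done
  printf '%s\n' "$dir"
}

# Hypothetical checkout location, used only for illustration.
script_dir="/home/ws/code/tensorflow/tensorflow/lite/tools/make"

new_root="$(walk_up "$script_dir" 4)"   # same as "$SCRIPT_DIR/../../../.."
old_root="$(walk_up "$script_dir" 5)"   # same as "$SCRIPT_DIR/../../../../.."

echo "four levels up: $new_root"
echo "five levels up: $old_root"
```

With four levels the script lands in the repository root; with the stale five levels it ends up one directory above the checkout, which is why the build scripts fail before the fix.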
1 Steps to build libtensorflow-lite.a
1.1 Install the toolchain and libs
To cross compile TensorFlow Lite, first install the toolchain and libs.
```bash
sudo apt-get update
sudo apt-get install crossbuild-essential-armhf
```
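After installation the toolchain and its libraries are in place. A toolchain is simply a set of tools such as the compiler and the linker.
/usr/arm-linux-gnueabihf :ls
bin include lib
The lib directory contains, among others, libm.so for math functions: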
libm.so
libstdc++.so.6
libdl.so
libpthread.so
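Using libdl.so
How dynamic library loading works
Looking up functions in a dynamic library at run time is wrapped by the libdl.so library.
libdl.so provides four functions:
dlopen // open a dynamic library
dlsym // look up a symbol (for example a function) in an opened library
dlclose // close the dynamic library
dlerror // return a description of the most recent error
What does armhf mean?
armhf is the Debian/Ubuntu name for the 32-bit ARM hard-float port (ARMv7, with floating-point arguments passed in FPU registers). The machine name of a running system can be checked with uname -m. The output below comes from a 64-bit ARM device, which reports aarch64 rather than a 32-bit ARM name: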
# uname --help
usage: uname [-asnrvm]
Print system information.
-s System name
-n Network (domain) name
-r Kernel Release number
-v Kernel Version
-m Machine (hardware) name
-a All of the above
cepheus:/ # uname -a
Linux localhost 4.14.83-perf-g43a3e8b #1 SMP PREEMPT Wed Apr 10 21:42:45 CST 2019 aarch64
cepheus:/ # uname -v
#1 SMP PREEMPT Wed Apr 10 21:42:45 CST 2019
cepheus:/ # uname -m
aarch64
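As a rough sketch of how the machine name relates to a cross-toolchain prefix: the function name toolchain_prefix_for is made up for this example, and the aarch64 entry is an assumption based on the usual Debian/Ubuntu cross-compiler naming. Only the armv7l entry comes from the rpi build discussed here.

```bash
#!/usr/bin/env bash
# Sketch: map a `uname -m` machine name to a cross-toolchain prefix.

toolchain_prefix_for() {
  case "$1" in
    armv7l)  echo "arm-linux-gnueabihf-" ;;  # 32-bit ARM, hard-float (armhf)
    aarch64) echo "aarch64-linux-gnu-" ;;    # 64-bit ARM (assumed naming)
    *)       echo "" ;;                      # unknown: no prefix suggested
  esac
}

machine="$(uname -m)"
echo "machine: $machine"
echo "prefix for this machine: $(toolchain_prefix_for "$machine")"
echo "prefix for armv7l: $(toolchain_prefix_for armv7l)"
```

On an x86_64 build host the second line prints an empty prefix, since the host is not one of the listed ARM machines.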
1.2 Download all the dependencies
### Building
Clone this TensorFlow repository, then run this script at the root of the repository to download all the dependencies:
```bash
./tensorflow/lite/tools/make/download_dependencies.sh
```
1.3 The shell script invokes the Makefile
To compile:
```bash
./tensorflow/lite/tools/make/build_rpi_lib.sh
```
This should compile a static library in:
`tensorflow/lite/gen/lib/rpi_armv7/libtensorflow-lite.a`.
2. Analysis of download_dependencies.sh
2.1 set -e
set -e
"Exit immediately if a simple command exits with a non-zero status."
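In other words, once "set -e" is in effect, the whole script exits immediately as soon as a command returns a non-zero status. Some people like this option because it makes scripts safer: a failed step cannot be silently ignored while later steps keep running. Commands whose status is tested, for example in an if condition or on the left of && or ||, do not trigger the exit.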
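A minimal sketch of this behavior, run in a child bash so that the calling shell survives. The helper name run_with_set_e is made up for the example.

```bash
#!/usr/bin/env bash
# Sketch: with `set -e`, the first failing command ends the script.

run_with_set_e() {
  # `false` returns status 1, so "after" is never printed.
  bash -c 'set -e; echo "before"; false; echo "after"'
}

# Capture both the output and the exit status of the child script.
output="$(run_with_set_e)" && status=0 || status=$?

echo "output: $output"
echo "status: $status"
```

The child prints only "before" and exits with the status of the failing command, which is what download_dependencies.sh relies on to stop at the first failed download or extraction.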
2.2 SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
https://blog.csdn.net/davidhopper/article/details/78989369
Recently I have often seen statements like the following in bash scripts:
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
cd "${DIR}/.."
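At first I could not work out what this meant. After studying bash more closely I finally understood it, and I record the detailed explanation here for future reference.
${BASH_SOURCE[0]} is the path of the script file currently being executed or sourced, exactly as it was given on the command line. It is not one of the script's own arguments. For example, in:
bash modules/tools/planning_traj_plot/run.sh
modules/tools/planning_traj_plot/example_data/1_planning.pb.txt
modules/tools/planning_traj_plot/example_data/1_localization.pb.tx
"${BASH_SOURCE[0]}" is "modules/tools/planning_traj_plot/run.sh".
dirname extracts the directory part of its argument, so dirname "${BASH_SOURCE[0]}" yields the directory that contains the script. For "modules/tools/planning_traj_plot/run.sh" that is "modules/tools/planning_traj_plot".
cd "$( dirname "${BASH_SOURCE[0]}" )" switches to that directory. In the example above this means changing from the current directory into the subdirectory "modules/tools/planning_traj_plot".
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )" then means: if the cd succeeds, run pwd to print the current directory (now an absolute path) and assign that output to the variable DIR. Because the cd runs inside $( ... ), which is a subshell, the caller's working directory is not changed.
cd "${DIR}/.." needs little explanation: it switches to the parent of the directory stored in DIR.
Here is a complete example:
Contents of run.sh (the part that does the real work is omitted):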
#!/bin/bash
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
echo ${DIR}
cd "${DIR}/.."
Run the following in a shell (current directory: /home/davidhopper/code/apollo):
bash modules/tools/planning_traj_plot/run.sh
modules/tools/planning_traj_plot/example_data/1_planning.pb.txt
modules/tools/planning_traj_plot/example_data/1_localization.pb.tx
The result is:
/home/davidhopper/code/apollo/modules/tools/planning_traj_plot
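The idiom can be tried out in isolation with the sketch below. It writes a throwaway script (the name where.sh and the temporary directory are assumptions for the demo) and checks that the script reports its own directory both when executed and when sourced, regardless of the caller's working directory.

```bash
#!/usr/bin/env bash
# Sketch: "${BASH_SOURCE[0]}" locates the script file itself.

tmpdir="$(mktemp -d)"

# A throwaway script that prints the directory it lives in.
cat > "$tmpdir/where.sh" <<'EOF'
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
echo "$SCRIPT_DIR"
EOF

expected="$(cd "$tmpdir" && pwd)"

# Executed from an unrelated working directory.
ran="$(cd / && bash "$tmpdir/where.sh")"

# Sourced from an unrelated working directory; $0 would be the caller here.
sourced="$(cd / && source "$tmpdir/where.sh")"

echo "executed: $ran"
echo "sourced:  $sourced"

rm -rf "$tmpdir"
```

Both lines print the temporary directory, which is the property the TFLite scripts rely on when they compute SCRIPT_DIR and then cd relative to it.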
2.3 if [ ! -f $BZL_FILE_PATH ];
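if [ -f file ] true if file exists and is a regular file
if [ -d … ] true if the directory exists
if [ -s file ] true if file exists and is not empty
if [ -r file ] true if file exists and is readable
if [ -w file ] true if file exists and is writable
if [ -x file ] true if file exists and is executable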
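A short sketch of these tests in use. The helper name describe and the temporary files are made up for the example.

```bash
#!/usr/bin/env bash
# Sketch: classify a path with the [ ... ] file tests.

describe() {
  local path="$1"
  if [ ! -e "$path" ]; then
    echo "missing"
  elif [ -d "$path" ]; then
    echo "directory"
  elif [ -s "$path" ]; then
    echo "non-empty file"
  elif [ -f "$path" ]; then
    echo "empty file"
  else
    echo "other"
  fi
}

tmpdir="$(mktemp -d)"
: > "$tmpdir/empty.txt"                 # create an empty file
printf 'data\n' > "$tmpdir/full.txt"    # create a file with content

r_missing="$(describe "$tmpdir/nope")"
r_dir="$(describe "$tmpdir")"
r_empty="$(describe "$tmpdir/empty.txt")"
r_full="$(describe "$tmpdir/full.txt")"

echo "$r_missing / $r_dir / $r_empty / $r_full"
rm -rf "$tmpdir"
```

The "! -f" form used in section 2.3 is simply the negation of the regular-file test: the branch runs when the file does not exist.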
2.4 Downloaded dependencies
download_and_extract "${EIGEN_URL}" "${DOWNLOADS_DIR}/eigen"
https://bitbucket.org/eigen/eigen/overview
Eigen is a C++ template library for linear algebra: matrices, vectors, numerical solvers, and related algorithms.
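In short, Eigen is a high-level, header-only C++ template library for linear algebra, matrix and vector operations, geometric transformations, numerical solvers, and related algorithms.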
download_and_extract "${GEMMLOWP_URL}" "${DOWNLOADS_DIR}/gemmlowp"
Low-precision matrix multiplication:
gemmlowp: a small self-contained low-precision GEMM library
This is not a full linear algebra library, only a GEMM library: it only does general matrix multiplication ("GEMM").
download_and_extract "${GOOGLETEST_URL}" "${DOWNLOADS_DIR}/googletest"
download_and_extract "${ABSL_URL}" "${DOWNLOADS_DIR}/absl"
https://github.com/abseil/abseil-cpp
Abseil is an open-source collection of C++ library code designed to augment the C++ standard library.
The Abseil library code is collected from Google's own C++ code base,
has been extensively tested and used in production, and is the same code we depend on in our daily coding lives.
download_and_extract "${NEON_2_SSE_URL}" "${DOWNLOADS_DIR}/neon_2_sse"
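ARM NEON is a 128-bit SIMD (single instruction, multiple data) extension for ARM Cortex-A and Cortex-R52 series processors.
Early ARM CPUs had only general-purpose registers and could perform only basic operations on basic data types. VFP (Vector Floating Point) instructions were introduced from ARMv5 on to accelerate floating-point work.
NEON was formally introduced with ARMv7. It far outperforms VFP for vector work, so VFP's short-vector mode was deprecated in favor of NEON.
Like the MMX/SSE/AVX/FMA instructions on Intel CPUs, NEON instructions on ARM CPUs speed code up through vectorization. The neon_2_sse download is a header that maps NEON intrinsics onto x86 SSE, so that NEON code can also be built for x86.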
download_and_extract "${FLATBUFFERS_URL}" "${DOWNLOADS_DIR}/flatbuffers"
FlatBuffers is a cross platform serialization library architected for maximum memory efficiency.
It allows you to directly access serialized data without parsing/unpacking it first,
while still having great forwards/backwards compatibility.
download_and_extract "${FARMHASH_URL}" "${DOWNLOADS_DIR}/farmhash"
https://github.com/google/farmhash
FarmHash, a family of hash functions.
download_and_extract "${FFT2D_URL}" "${DOWNLOADS_DIR}/fft2d"
FFT (Fast Fourier/Cosine/Sine Transform) package.
3 Analysis of the Makefile
3.1 makefile: origin
ifeq ($(origin MAKEFILE_DIR), undefined)
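Unlike other functions, origin does not operate on the value of the variable given as its argument.
It only reports information about that variable: where it came from, that is, how it was defined.
Syntax:
$(origin VARIABLE)
Purpose: origin reports the origin of the variable named VARIABLE. The result is one of undefined, default, environment, environment override, file, command line, override, or automatic, so the ifeq above is true only when MAKEFILE_DIR has not been defined anywhere.
3.2 Getting the Makefile's directory
MAKEFILE_DIR := $(shell dirname $(realpath $(lastword $(MAKEFILE_LIST))))
$(lastword $(MAKEFILE_LIST)) is the makefile most recently read (this Makefile, as long as nothing has been included yet), realpath turns it into an absolute path, and dirname strips the file name.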
3.3 uname
UNAME_S := $(shell uname -s)
ifeq ($(UNAME_S),Linux)
HOST_OS := linux
uname --help
Usage: uname [OPTION]...
Print certain system information. With no OPTION, same as -s.
-a, --all print all information, in the following order,
except omit -p and -i if unknown:
-s, --kernel-name print the kernel name
-n, --nodename print the network node hostname
-r, --kernel-release print the kernel release
-v, --kernel-version print the kernel version
-m, --machine print the machine hardware name
-p, --processor print the processor type (non-portable)
-i, --hardware-platform print the hardware platform (non-portable)
-o, --operating-system print the operating system
--help display this help and exit
--version output version information and exit
GNU coreutils online help: <http://www.gnu.org/software/coreutils/>
Full documentation at: <http://www.gnu.org/software/coreutils/uname>
or available locally via: info '(coreutils) uname invocation'
3.4 makefile includes
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
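This yields the directory that contains the .sh script:
/home/ws/code/tensorflow/tensorflow/lite/tools/make
cd "$SCRIPT_DIR/../../../.." [the path is adjusted to match the new location of lite]
This yields the root of the TensorFlow source tree: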
/home/ws/code/tensorflow
INCLUDES := \
-I. \
-I$(MAKEFILE_DIR)/../../../../../ \
-I$(MAKEFILE_DIR)/../../../../../../ \
-I$(MAKEFILE_DIR)/downloads/ \
-I$(MAKEFILE_DIR)/downloads/eigen \
-I$(MAKEFILE_DIR)/downloads/absl \
-I$(MAKEFILE_DIR)/downloads/gemmlowp \
-I$(MAKEFILE_DIR)/downloads/neon_2_sse \
-I$(MAKEFILE_DIR)/downloads/farmhash/src \
-I$(MAKEFILE_DIR)/downloads/flatbuffers/include \
-I$(OBJDIR)
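My first impression was that these -I paths only serve <> includes and that the "" includes in the code are not covered by them. In fact the compiler looks for a "" include in the directory of the including file first and then falls back to the same -I directories, so repository-relative quoted includes resolve through -I. when make is run from the repository root.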
3.5 The .cc/.c source files in the Makefile
3.5.1 Grouping the files
PROFILER_SRCS
PROFILE_SUMMARIZER_SRCS
CORE_CC_ALL_SRCS
MINIMAL_SRCS
# A small example program that shows how to link against the library.
MINIMAL_SRCS := \
tensorflow/lite/examples/minimal/minimal.cc
PROFILER_SRCS := \
tensorflow/lite/profiling/time.cc
PROFILE_SUMMARIZER_SRCS := \
tensorflow/lite/profiling/profile_summarizer.cc \
tensorflow/core/util/stats_calculator.cc
3.5.2 wildcard
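Name: the file-name pattern matching function, wildcard.
Purpose: lists all existing files whose names match the pattern PATTERN.
Return value: a space-separated list of the existing files that match PATTERN. The pattern may include directory parts, as in the Makefile excerpt below.
Notes: PATTERN uses shell-style wildcards, including "?" (a single character) and "*" (any number of characters).
Example: $(wildcard *.c)
returns the list of all .c source files in the current directory.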
CORE_CC_ALL_SRCS := \
$(wildcard tensorflow/lite/*.cc) \
$(wildcard tensorflow/lite/*.c) \
$(wildcard tensorflow/lite/c/*.c) \
$(wildcard tensorflow/lite/core/api/*.cc)
# For MCU (micro) builds the sources under kernels/ are left out. Are they not required there?
ifneq ($(BUILD_TYPE),micro)
CORE_CC_ALL_SRCS += \
$(wildcard tensorflow/lite/kernels/*.cc) \
$(wildcard tensorflow/lite/kernels/internal/*.cc) \
$(wildcard tensorflow/lite/kernels/internal/optimized/*.cc) \
$(wildcard tensorflow/lite/kernels/internal/reference/*.cc) \
$(PROFILER_SRCS) \
$(wildcard tensorflow/lite/kernels/*.c) \
$(wildcard tensorflow/lite/kernels/internal/*.c) \
$(wildcard tensorflow/lite/kernels/internal/optimized/*.c) \
$(wildcard tensorflow/lite/kernels/internal/reference/*.c) \
$(wildcard tensorflow/lite/tools/make/downloads/farmhash/src/farmhash.cc) \
$(wildcard tensorflow/lite/tools/make/downloads/fft2d/fftsg.c)
endif
3.5.3 Remove any duplicates
# Remove any duplicates.
CORE_CC_ALL_SRCS := $(sort $(CORE_CC_ALL_SRCS))
3.5.4 Filter out all the excluded files
CORE_CC_EXCLUDE_SRCS
CORE_CC_EXCLUDE_SRCS := \
$(wildcard tensorflow/lite/*test.cc) \
$(wildcard tensorflow/lite/*/*test.cc) \
$(wildcard tensorflow/lite/*/*/*test.cc) \
$(wildcard tensorflow/lite/*/*/*/*test.cc) \
$(wildcard tensorflow/lite/kernels/test_util.cc)
# micro: microcontroller (MCU) builds
ifeq ($(BUILD_TYPE),micro)
CORE_CC_EXCLUDE_SRCS += \
tensorflow/lite/mmap_allocation.cc \
tensorflow/lite/nnapi_delegate.cc
endif
# Filter out all the excluded files.
$(filter-out PATTERN...,TEXT)
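Name: the inverse filter function, filter-out.
Purpose: the opposite of filter. It removes every word in TEXT that matches PATTERN and keeps the words that do not. Several patterns may be given, separated by spaces.
Return value: the space-separated words of TEXT that match none of the patterns.
Notes: filter-out can also be used to remove specific strings from a variable (the reverse of what filter does).
Example: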
objects=main1.o foo.o main2.o bar.o
mains=main1.o main2.o
$(filter-out $(mains),$(objects))
This removes from the variable objects the names (file names) listed in mains. The return value is "foo.o bar.o".
TF_LITE_CC_SRCS := $(filter-out $(CORE_CC_EXCLUDE_SRCS), $(CORE_CC_ALL_SRCS))
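For readers more at home in shell than in make, the exact-word case of filter-out can be mimicked as below. This is only an analogue written for illustration (the function name filter_out is made up, and "%" patterns are not handled); it is not how the Makefile itself works.

```bash
#!/usr/bin/env bash
# Sketch: a shell analogue of make's $(filter-out WORDS,TEXT) for exact words.

filter_out() {
  # $1: space-separated words to drop, $2: space-separated input words.
  local excludes=" $1 " word kept=()
  for word in $2; do   # unquoted on purpose: split the list on whitespace
    [[ "$excludes" == *" $word "* ]] || kept+=("$word")
  done
  echo "${kept[*]}"
}

objects="main1.o foo.o main2.o bar.o"
mains="main1.o main2.o"

remaining="$(filter_out "$mains" "$objects")"
echo "$remaining"
```

In the Makefile the same idea is applied to source files: everything collected in CORE_CC_ALL_SRCS is kept unless it also appears in CORE_CC_EXCLUDE_SRCS.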
3.6 Makefile build options
CXXFLAGS += --std=c++11
CC_PREFIX=arm-linux-gnueabihf- make -j 3 -f tensorflow/lite/tools/make/Makefile TARGET=rpi TARGET_ARCH=armv7l
# These target-specific makefiles should modify or replace options like
# CXXFLAGS or LIBS to work for a specific targetted architecture. All logic
# based on platforms or architectures should happen within these files, to
# keep this main makefile focused on the sources and dependencies.
# The build configuration for each target is defined in its own target file.
include $(wildcard $(MAKEFILE_DIR)/targets/*_makefile.inc)
# Settings for Raspberry Pi.
ifeq ($(TARGET),rpi)
# Default to the architecture used on the Pi Two/Three (ArmV7), but override this
# with TARGET_ARCH=armv6 to build for the Pi Zero or One.
TARGET_ARCH := armv7l
TARGET_TOOLCHAIN_PREFIX := arm-linux-gnueabihf-
ifeq ($(TARGET_ARCH), armv7l)
CXXFLAGS += \
-march=armv7-a \
-mfpu=neon-vfpv4 \
-funsafe-math-optimizations \
-ftree-vectorize \
-fPIC
CCFLAGS += \
-march=armv7-a \
-mfpu=neon-vfpv4 \
-funsafe-math-optimizations \
-ftree-vectorize \
-fPIC
LDFLAGS := \
-Wl,--no-export-dynamic \
-Wl,--exclude-libs,ALL \
-Wl,--gc-sections \
-Wl,--as-needed
endif
LIBS := \
-lstdc++ \
-lpthread \
-lm \
-ldl
endif
-lz
links zlib (http://zlib.net/), a library for compression and decompression.
3.7 Where compiled objects are stored
# Where compiled objects are stored.
GENDIR := $(MAKEFILE_DIR)/gen/$(TARGET)_$(TARGET_ARCH)/
OBJDIR := $(GENDIR)obj/
BINDIR := $(GENDIR)bin/
LIBDIR := $(GENDIR)lib/
# LIB_NAME := libtensorflow-lite.a
LIB_PATH := $(LIBDIR)$(LIB_NAME)
MINIMAL_BINARY := $(BINDIR)minimal
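3.8 The actual build tools: CC and AR. Why is there no LD?
No separate LD is needed: the static library is assembled by AR, and the final executable is linked by invoking the compiler driver (CXX), which calls the linker itself.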
which arm-linux-gnueabihf-g++
/usr/bin/arm-linux-gnueabihf-g++
TARGET_TOOLCHAIN_PREFIX := arm-linux-gnueabihf-
CC_PREFIX=arm-linux-gnueabihf-
CXX := $(CC_PREFIX)${TARGET_TOOLCHAIN_PREFIX}g++
CC := $(CC_PREFIX)${TARGET_TOOLCHAIN_PREFIX}gcc
AR := $(CC_PREFIX)${TARGET_TOOLCHAIN_PREFIX}ar
3.9 Compiling the source files
3.9.1 addprefix
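Name: the add-prefix function, addprefix.
Purpose: prepends the prefix PREFIX to every file name in NAMES. NAMES is a space-separated list of file names, and PREFIX is added in front of each of them.
Return value: the list of file names with PREFIX prepended, separated by single spaces.
Example:
$(addprefix src/,foo bar)
returns "src/foo src/bar".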
3.9.2 patsubst
$(patsubst PATTERN,REPLACEMENT,TEXT)
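Name: the pattern substitution function, patsubst.
Purpose: splits TEXT into space-separated words and replaces every word that matches PATTERN with REPLACEMENT.
Return value: the new string after substitution.
# Replace .c/.cc with .o, then add the OBJDIR prefix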
MINIMAL_OBJS := $(addprefix $(OBJDIR), \
$(patsubst %.cc,%.o,$(patsubst %.c,%.o,$(MINIMAL_SRCS))))
LIB_OBJS := $(addprefix $(OBJDIR), \
$(patsubst %.cc,%.o,$(patsubst %.c,%.o,$(TF_LITE_CC_SRCS))))
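The combined effect of the nested patsubst calls and addprefix can be mimicked in shell, which may make the resulting object paths easier to picture. This is an illustrative analogue only: the function name to_objs and the "gen/obj/" prefix are assumptions, not values taken from the Makefile.

```bash
#!/usr/bin/env bash
# Sketch: shell analogue of
#   $(addprefix $(OBJDIR), $(patsubst %.cc,%.o,$(patsubst %.c,%.o,SRCS)))

to_objs() {
  local objdir="$1" src obj out=()
  shift
  for src in "$@"; do
    case "$src" in
      *.cc) obj="${src%.cc}.o" ;;   # C++ source -> object
      *.c)  obj="${src%.c}.o" ;;    # C source   -> object
      *)    obj="$src" ;;           # anything else is left unchanged
    esac
    out+=("${objdir}${obj}")
  done
  echo "${out[*]}"
}

objs="$(to_objs "gen/obj/" \
  tensorflow/lite/examples/minimal/minimal.cc \
  tensorflow/lite/tools/make/downloads/fft2d/fftsg.c)"
echo "$objs"
```

Each object keeps the directory structure of its source file under the object directory, which is why the compile rules run mkdir -p $(dir $@) before compiling.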
3.9.3 The actual compilation with CXX/CC (C and C++ are compiled separately)
# For normal manually-created TensorFlow C++ source files.
$(OBJDIR)%.o: %.cc
@mkdir -p $(dir $@)
$(CXX) $(CXXFLAGS) $(INCLUDES) -c $< -o $@
# For normal manually-created TensorFlow C source files.
$(OBJDIR)%.o: %.c
@mkdir -p $(dir $@)
$(CC) $(CCFLAGS) $(INCLUDES) -c $< -o $@
4. Building the static library
# Hack for generating schema file bypassing flatbuffer parsing
tensorflow/lite/schema/schema_generated.h:
@cp -u tensorflow/lite/schema/schema_generated.h.OPENSOURCE tensorflow/lite/schema/schema_generated.h
# Gathers together all the objects we've compiled into a single '.a' archive.
# Note: a static library target can also list a .h file as a prerequisite.
$(LIB_PATH): tensorflow/lite/schema/schema_generated.h $(LIB_OBJS)
@mkdir -p $(dir $@)
$(AR) $(ARFLAGS) $(LIB_PATH) $(LIB_OBJS)
5. Building the executable (CXX with -o drives the link step, playing the role of ld)
$(MINIMAL_BINARY): $(MINIMAL_OBJS) $(LIB_PATH)
@mkdir -p $(dir $@)
$(CXX) $(CXXFLAGS) $(INCLUDES) \
-o $(MINIMAL_BINARY) $(MINIMAL_OBJS) \
$(LIBFLAGS) $(LIB_PATH) $(LDFLAGS) $(LIBS)
Open questions
download_and_extract "${FARMHASH_URL}" "${DOWNLOADS_DIR}/farmhash"
download_and_extract "${FFT2D_URL}" "${DOWNLOADS_DIR}/fft2d"
Only farmhash and fft2d are compiled, while the libraries below are not. Why is that? Are the libraries below required?
download_and_extract "${EIGEN_URL}" "${DOWNLOADS_DIR}/eigen"
download_and_extract "${GEMMLOWP_URL}" "${DOWNLOADS_DIR}/gemmlowp"
download_and_extract "${ABSL_URL}" "${DOWNLOADS_DIR}/absl"
download_and_extract "${NEON_2_SSE_URL}" "${DOWNLOADS_DIR}/neon_2_sse"
download_and_extract "${FLATBUFFERS_URL}" "${DOWNLOADS_DIR}/flatbuffers"
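1. First guess: the math library in use is libm.so, and eigen/gemmlowp are not used for optimization.
That conclusion is wrong. The matrix operations need Eigen, and Eigen is used purely through its headers. gemmlowp is likewise a template library pulled in through header files, so neither produces object files of its own. From the Eigen documentation:
In fact, the header files in the Eigen subdirectory are the only files required to compile programs using Eigen. The header files are the same for all platforms. It is not necessary to use CMake or install anything.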
A simple first program
Here is a rather simple program to get you started.
#include <iostream>
#include <Eigen/Dense>
using Eigen::MatrixXd;
int main()
{
MatrixXd m(2,2);
m(0,0) = 3;
m(1,0) = 2.5;
m(0,1) = -1;
m(1,1) = m(1,0) + m(0,1);
std::cout << m << std::endl;
}
2. schema_generated.h contains not only data structure definitions but also inline functions, for example:
inline flatbuffers::Offset<void> BuiltinOptionsUnion::Pack()
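Because these are inline functions defined in a header, no separate .o file is generated for them.
3. Are absl and neon_2_sse unused? [How is that controlled?]
References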
https://blog.csdn.net/u011964923/article/details/73297443
Generating and using dynamic and static link libraries in Makefiles on Linux