AI Compute Fundamentals: Why Systolic Architectures?


Reading summary

Author: H. T. Kung
Year: 1982
Title: "Why Systolic Architectures?"
Keywords:
cost-effectiveness
concurrency
decompose
massive parallelism

Why Systolic Architectures? H. T. Kung, 1982
Systolic architectures, which permit multiple computations for each memory access, can speed execution of compute-bound problems without increasing I/O requirements.

In short: because each data item fetched from memory is used in several computations, a systolic architecture accelerates compute-bound problems without raising I/O requirements.

Key architectural issues in designing special-purpose systems

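①Simple and regular design: lowers design cost; a modular design lets cost scale in proportion to performance.

②Concurrency and communication: because device speed is limited, computation is sped up through massive concurrency and by reducing routing (communication) cost.

③Balancing computation with I/O: I/O limits the maximum computation rate, so a computation must be decomposed to reduce its I/O. This means balancing I/O requirements, system size, and memory size against one another, and examining how I/O bandwidth affects speed.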
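The third point can be made concrete with a back-of-the-envelope model. The sketch below assumes throughput is capped either by the processing elements or by memory bandwidth multiplied by the number of operations performed per fetched word. The function name and all numbers are hypothetical, chosen only for illustration.

```python
def attainable_ops_per_sec(peak_ops_per_sec, words_per_sec, ops_per_word):
    """Upper bound on throughput: limited by compute or by memory traffic."""
    io_limited_rate = words_per_sec * ops_per_word
    return min(peak_ops_per_sec, io_limited_rate)


# Illustrative numbers only (assumed, not taken from the text above).
MEMORY_WORDS_PER_SEC = 5_000_000   # memory delivers 5 million words per second
PEAK_OPS_PER_SEC = 30_000_000      # combined peak rate of all processing elements

# Each fetched word is used once: memory bandwidth is the bottleneck.
one_use = attainable_ops_per_sec(PEAK_OPS_PER_SEC, MEMORY_WORDS_PER_SEC, 1)

# Each fetched word is reused in six operations: same I/O, six times the work.
six_uses = attainable_ops_per_sec(PEAK_OPS_PER_SEC, MEMORY_WORDS_PER_SEC, 6)

print(one_use, six_uses)
```

Under these assumed numbers, raising reuse from one to six operations per word lifts the bound from the memory rate to the compute peak without changing the I/O bandwidth at all.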

Systolic architectures: the basic principle

Basic definition
A systolic system consists of a set of interconnected cells, each capable of performing some simple operation.

Cells in a systolic system are typically interconnected to form a systolic array or a systolic tree. Information in a systolic system flows between cells in a pipelined fashion, and communication with the outside world occurs only at the “boundary cells.” For example, in a systolic array, only those cells on the array boundaries may be I/O ports for the system.

Classification of computational tasks
Computational tasks can be conceptually classified into two families: compute-bound computations and I/O-bound computations.
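One way to make this distinction operational is to compare the number of operations with the number of input and output elements: a task is treated as compute-bound when operations outnumber I/O elements, and as I/O-bound otherwise. The sketch below applies that rule to n x n matrix multiplication and matrix addition. The helper names are hypothetical, and the operation counts are simplified (one multiply-add counted as one operation).

```python
def classify(num_ops, num_io_elements):
    """Compute-bound if operations outnumber input/output elements."""
    return "compute-bound" if num_ops > num_io_elements else "I/O-bound"


def matmul_counts(n):
    """n x n matrix product: n^3 multiply-adds, two input matrices, one output."""
    return n ** 3, 3 * n ** 2


def matadd_counts(n):
    """n x n matrix sum: n^2 additions, two input matrices, one output."""
    return n ** 2, 3 * n ** 2


for n in (2, 16, 256):
    print(n, classify(*matmul_counts(n)), classify(*matadd_counts(n)))
```

Matrix addition stays I/O-bound at every size because its work grows no faster than its data, while matrix multiplication becomes compute-bound once n exceeds 3 under this counting. Compute-bound tasks are the ones a systolic design can accelerate.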

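[Figure: a single processing element replaced by an array of PEs fed from memory]
As the figure shows, the conventional single processing element is replaced by an array of PEs. Data flows out of memory and passes along the array through every PE, so each fetched item is reused several times.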
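The reuse described here can be imitated with a small cycle-by-cycle simulation. This is a minimal sketch under simplifying assumptions, not a design taken from the paper: each PE permanently holds one weight, input values shift one PE to the right per cycle, and a single global sum stands in for the result path. The function name `systolic_fir` and the counters are hypothetical.

```python
def systolic_fir(weights, xs):
    """Simulate k PEs computing y_t = sum_j weights[j] * x_(t-j).

    PE j keeps weights[j]. Each cycle one x is read from memory into PE 0,
    and the x values already in the array move one PE to the right.
    """
    k = len(weights)
    regs = [None] * k      # x value currently held by each PE
    memory_reads = 0
    macs = 0
    outputs = []
    for x in xs:
        memory_reads += 1               # the only memory access in this cycle
        regs = [x] + regs[:-1]          # every PE passes its x to its neighbour
        if regs[-1] is not None:        # array is full: a complete result exists
            outputs.append(sum(w * r for w, r in zip(weights, regs)))
            macs += k                   # all k PEs did one multiply-add
    return outputs, memory_reads, macs


outputs, memory_reads, macs = systolic_fir([1, 2, 3], [1, 2, 3, 4, 5])
print(outputs, memory_reads, macs)  # [10, 16, 22] 5 9
```

Each input value is read from memory exactly once but takes part in up to k multiply-adds as it travels through the array, which is the sense in which the array performs multiple computations per memory access.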