Stanford哈工大编译原理

本文主要是介绍Stanford哈工大编译原理，希望对大家解决编程问题提供一定的参考价值，需要的开发者们随着小编来一起学习吧！

文章目录

1 introduction to compilers
2 Lexical analysis
- 2.1 regular language & formal language(P9+P10+P11)
3 Parsing(P18)
- 3.1 CFG: context-free grammar(P19+P20)
- 3.2 derivation: 派生、推导(P20)
4 Semantic analysis(P42)
- 4.1 scope(P43)
- 4.2 symbol tables(P44)
6 语法分析
- 6.1 预测分析法
- 6.2 算符优先分析法
7 语义分析

1 introduction to compilers

https://lagunita.stanford.edu/courses/Engineering/Compilers/Fall2014/info

compiler: offline
interpreter: online, 程序每次都要重新解释
the structure of compiler

lexical analysis 词法分析：divide program text into words or tokens
parsing 语法分析：分清主语谓语
semantic analysis 语义分析
optimization
code generation

COOL: classroom object oriented language

2 Lexical analysis

classify program substrings according to role
communicate tokens to the parser
token记号标志: identifier标识符, keywords, (), number， whitespace
在fortan中空格可以忽略，这会导致关键词和变量名不容易区分

2.1 regular language & formal language(P9+P10+P11)

The regular expressions over $\Sigma$ are the smallest set of expressions including.

union并集
concatenation交集
iteration

Let $\Sigma$ be a set of characters (an alphabet). A language over $\Sigma$ is a set of strings of characters drawn from $\Sigma$ .

lexical specification：
rugular languages are a language specification语言规范

3 Parsing(P18)

parser output: parser tree

3.1 CFG: context-free grammar(P19+P20)

可以参考https://blog.csdn.net/qq_37171771/article/details/79342819

Begin with a string with only the start symbol S
Replace any non-terminal X in the string by the right-hand side of some production X->Y1…Yn

3.2 derivation: 派生、推导(P20)

a derivation can be drawn as a tree

start symbol is the tree’s root
for a production X->Y1Y2…Yn, add children Y1Y2…Yn to node X

4 Semantic analysis(P42)

4.1 scope(P43)

The scope of an identifier is the portion of a program in which that identifier is accessible

static scope(静态作用域): depends on the program text, not run-time behavior
dynamically scoped: depend on execution of the program

4.2 symbol tables(P44)

a symbol table is a data structure that tracks the current bindings of identifiers
for a simple symbol table we can use a stack

6 语法分析

最右推导：每次选择句型的最右非终结符进行替换
在自底向上的分析中，总是采用最左规约的方式，所以最左规约称为规范归约，最右推导称为规范推导
自顶向下的语法分析采用最左推导方式
递归下降分析：由一组过程组成，每个过程对应一个非终结符
预测分析：是递归下降分析的一个通过在输入中向前看固定个数符号来确定正确的产生式，预测分析不需要回溯，是一种确定的自顶向下分析方法
含有 $A->A\alpha$ 形式产生式的文法称为直接左递归
S文法（简单的确定性文法）：每个产生式的右部都以终结符开始，同一非终结符的各个候选式的首终结符都不同，这就要求S文法不含 $\epsilon$
非终结符A的后继符号集：可能在某个句型中紧跟在A后边的终结符a的集合，记为FOLLOW(A)
产生式的可选集：产生式 $A->\beta$ 的可选集指可以选用该产生式进行推导时，对应的输入符号的集合，记为 $SELECT(A->\beta)$
q文法：每个产生式的右部或为 $\epsilon$ ，或以终结符开始，而且具有相同左部的产生式具有不相交的可选集
串首终结符：串首的第一个符号，并且是终结符，简称首终结符
给定一个文法符号串 $\alpha$ ， $\alpha$ 的串首终结符集 $FIRST(\alpha)$ 定义为可以从 $\alpha$ 推导出的所有串首终结符构成的集合
对于产生式 $A->\alpha$ 的可选集SELECT
如果 $\epsilon \notin FIRST(\alpha)$ ，那么 $SELECT(A->\alpha)=FIRST(\alpha)$
如果 $\epsilon \in FIRST(\alpha)$ ，那么 $SELECT(A->\alpha)=FIRST(\alpha) \cup FOLLOW(A)$

6.1 预测分析法

在这里插入图片描述

递归的预测分析法：在递归下降分析中，编写每一个非终结符对应的过程时，根据预测分析表进行产生式的选择
非递归的预测分析法：不需要为每个非终极符编写递归下降过程，而是根据预测分析表构造一个自动机，也叫表驱动的预测分析
预测分析法的实现步骤：

构造文法
改造文法
求变量的FIRST集和FOLLOW集，从而求得SELECT集
如果是LL(1)文法，则构造预测分析表
对于递归的预测分析，根据预测分析表为每个非终结符编写一个过程；对于非递归的预测分析，实现表驱动的预测分析算法

句柄：每次规约的符号串
规范句型：最右推导每次得到的句型
LR文法：对输入进行从左到右的扫描，反向构造出一个最右推导序列
如果LR(0)分析表中没有语法分析动作冲突，那么给定的文法称为LR(0)文法，不是所有的上下文无关文法是LR(0)文法
增广文法：如果G是一个以S为开始符号的文法，则G的增广文法G’就是在G中加上新开始符号S‘和产生式S’->S而得到的文法
引入新的开始产生式的目的是使得文法开始符号仅出现在一个产生式的左边，从而使分析器只有一个接收状态
项目集闭包：所有等价的项目组成的项目集，每个项目集闭包对应着自动机的一个状态
SLR分析法：简单的考察下一个输入符号b是否属于与规约项目 $A-\alpha$ 相关联的FOLLOW(A)
问题在于，在特定位置，A的后继符集合是FOLLOW(A)的子集

6.2 算符优先分析法

简单优先分析法：PDA读入一个单词后，比较栈顶符号和该单词的优先级级，以决定是入栈还是规约
算符优先分析法：优先表里只有非终结符之间的优先关系，节省了空间
算符文法：给定上下文无关文法G，若G中所有产生式右部都不包含两个相继的非终结符，则G是算符文法

7 语义分析

这篇关于Stanford哈工大编译原理的文章就介绍到这儿，希望我们推荐的文章对编程师们有所帮助！

Stanford哈工大编译原理

文章目录

1 introduction to compilers

2 Lexical analysis

2.1 regular language & formal language(P9+P10+P11)