Differences

This shows the differences between the revision you selected and the current version.

--- cs:comp_n_arch:courses:fnti_i:week_6 [2025/05/19 08:56] – [pre-defined symbols] codinghare
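+++ cs:comp_n_arch:courses:fnti_i:week_6 [2025/05/19 14:24] (current) – [Developing a Hack Assembler] codinghare
@@ Line 90: / Line 90: @@
 Because these symbols appear only in A-instructions, the assembler just looks each one up in the reference table and replaces it with the corresponding address. For example: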
 <code nand2tetris-hdl>
-// R1 cooresponds address 1
+// R1 corresponds address 1
 @R1->@1
 </code>
+===Label symbols===
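+A label symbol appears in two places:
+  * As the target of a goto, for example ''@loop''. Such a reference is replaced directly by the symbol's value (just like a pre-defined symbol); in the example below, ''@LOOP'' becomes ''@4''.
+  * As a pseudo-instruction declaration, for example ''(loop)''. The declaration itself generates no machine code, but when the assembler meets it, it associates the symbol with the number of the instruction that follows it. For example: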
+<code nand2tetris-hdl>
+@i // line 0
+M=1
+@sum
+M=0
+(LOOP) // not an instruction line
+    @i // line 4
+    ...
+    @LOOP // line 16
+</code>
+Here the ''(LOOP)'' label marks a block whose first instruction is line ''4''. The assembler records this relationship in a table:
+^Symbol^value^
+|LOOP|4|
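The label bookkeeping described above can be sketched in Python. This is only an illustrative sketch, not the course's reference assembler: the function name `collect_labels` and the list-of-strings input format are assumptions made for the example.

```python
def collect_labels(lines):
    """Map each (LABEL) declaration to the number of the next instruction."""
    labels = {}
    address = 0  # number of real instructions seen so far
    for raw in lines:
        line = raw.split("//")[0].strip()  # drop comments and whitespace
        if not line:
            continue  # blank or comment-only line
        if line.startswith("(") and line.endswith(")"):
            labels[line[1:-1]] = address  # a declaration emits no code
        else:
            address += 1  # A- or C-instruction occupies one ROM slot
    return labels


program = [
    "@i // line 0",
    "M=1",
    "@sum",
    "M=0",
    "(LOOP) // not an instruction line",
    "    @i // line 4",
    "    D=M",
    "    @LOOP",
    "    0;JMP",
]
print(collect_labels(program))  # {'LOOP': 4}
```

Because the counter advances only on real instructions, the label line itself does not shift the addresses of the instructions after it.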
+===Variable symbols===
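+A variable symbol is any symbol that is neither pre-defined nor defined elsewhere by a label directive (such as ''(LOOP)''). In the example above, ''@i'' and ''@sum'' are both variable symbols. As noted earlier, each such symbol is stored in its own memory cell, with addresses assigned from ''16'' upward, so the value of a variable symbol is the address of its memory cell: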
+^Symbol^value^
+|i|16|
+|sum|17|
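+A variable may be used many times. On first use, the symbol and its assigned address are added to the mapping table; on every later use, the assembler looks the symbol up in the table and translates it to that address.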
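A minimal Python sketch of this first-use allocation follows. The helper name `allocate_variables` and its parameters are hypothetical, chosen only for illustration; `known` stands in for symbols already in the table (pre-defined symbols and labels).

```python
def allocate_variables(symbols, known=None, first_free=16):
    """Assign consecutive RAM addresses to symbols not already in the table."""
    table = dict(known or {})  # copy so the caller's table is untouched
    next_address = first_free
    for name in symbols:
        if name not in table:  # first use: allocate a new cell
            table[name] = next_address
            next_address += 1
        # later uses simply find the existing entry
    return table


# 'i' is used twice but allocated once; 'LOOP' is already known as a label.
table = allocate_variables(["i", "sum", "i", "LOOP"], known={"LOOP": 4})
print(table)  # {'LOOP': 4, 'i': 16, 'sum': 17}
```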
+===Symbol table===
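+All of the translations above depend on one data structure, the //Symbol table//, which holds the mapping used for translation. The Hack assembler builds its symbol table in several stages:
+  * Initialization: create an empty table and write the pre-defined symbol mappings into it
+  * Finding label declarations (//first pass//): scan the whole instruction sequence and pick out every label declaration
+    * A line that starts with ''('' is recognized as a label declaration
+    * A counter is maintained at the same time. It records **the number of instructions scanned so far** (label declarations are not counted): for the earlier ''(LOOP)'', the recorded value is ''4'', the number of instructions already scanned
+    * Scanning then continues until the next declaration starting with ''('' and the steps above repeat, until all label entries of the symbol table have been built
+  * Variable handling (//second pass//): scan the instruction sequence again, look for **variables**, associate each with a RAM address, and write the mapping into the //Symbol table//:
+    * A new variable is added to the symbol table
+    * An existing variable is read back from its mapping in the symbol table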
+===Assembly process===
+<code cpp>
+// init
+- construct an empty symbol table
+- add the pre-defined symbols to the symbol table
+- set n to 16 (the first free variable address)
+// first pass
+// add label declarations to the symbol table
+- scan the program
+- for each line of the form (xxx):
+    - add the pair (xxx, address) to the symbol table, where address is the number of the instruction following (xxx)
+// second pass
+// add variable symbols to the symbol table and translate
+- scan the program again
+- for each instruction:
+    - if the instruction is @symbol, look up the symbol in the table
+        - if (symbol, value) is found, use value to complete the instruction's translation
+        - if not found
+            - add (symbol, n) to the symbol table
+            - use n to complete the instruction's translation
+            - ++n
+    - if the instruction is a C-instruction, complete the instruction's translation
+    - write the translated instruction to the output file
+</code>
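The whole procedure can be sketched as one short Python function. This is an illustrative sketch under stated assumptions, not the assembler from the course: `resolve_symbols` and `clean` are hypothetical names, and the function stops at symbol-free assembly (for example ''@16'') rather than emitting 16-bit binary, since C-instruction encoding is outside this section. The pre-defined table uses the standard Hack values.

```python
# Pre-defined Hack symbols: R0..R15, the pointer names, and the I/O maps.
PREDEFINED = {f"R{i}": i for i in range(16)}
PREDEFINED.update({"SP": 0, "LCL": 1, "ARG": 2, "THIS": 3, "THAT": 4,
                   "SCREEN": 16384, "KBD": 24576})


def clean(raw):
    """Strip a trailing // comment and surrounding whitespace."""
    return raw.split("//")[0].strip()


def resolve_symbols(lines):
    """Return the program with every @symbol replaced by @address."""
    table = dict(PREDEFINED)  # init: start from the pre-defined symbols

    # First pass: record labels, keep only real instructions.
    instructions = []
    for raw in lines:
        line = clean(raw)
        if not line:
            continue
        if line.startswith("(") and line.endswith(")"):
            table[line[1:-1]] = len(instructions)  # next instruction number
        else:
            instructions.append(line)

    # Second pass: allocate variables from address 16 and substitute.
    next_free = 16
    output = []
    for line in instructions:
        if line.startswith("@") and not line[1:].isdigit():
            symbol = line[1:]
            if symbol not in table:  # first use of a variable
                table[symbol] = next_free
                next_free += 1
            output.append(f"@{table[symbol]}")
        else:
            output.append(line)  # numeric A-instruction or C-instruction
    return output


source = ["@i // line 0", "M=1", "@sum", "M=0", "(LOOP)",
          "    @i", "    D=M", "    @R1", "    D=D-M", "    @LOOP", "    D;JGT"]
resolved = resolve_symbols(source)
print(resolved)
```

Running the first pass to completion before the second matters here: a reference such as ''@LOOP'' may appear before its ''(LOOP)'' declaration, and without the label already in the table it would be mistaken for a new variable.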
