MIT 6.828 Lab 1

2023-10-23 08:59
Part 1

Exercise 1

The book teaches assembly for the NASM assembler (Intel syntax), but this lab actually uses the GNU assembler (AT&T syntax).

Exercise 2

Debug JOS with GDB: first run make qemu-gdb in a terminal in the lab directory, then open a second terminal and run make gdb.

Part 2

Exercise 3

See the lab tools guide, which includes some GDB techniques that are specific to debugging an OS:

See the GDB manual for a full guide to GDB commands. Here are some particularly useful commands for 6.828, some of which don't typically come up outside of OS development.
Ctrl-c
Halt the machine and break in to GDB at the current instruction. If QEMU has multiple virtual CPUs, this halts all of them.
c (or continue)
Continue execution until the next breakpoint or Ctrl-c.
si (or stepi)
Execute one machine instruction.
b function or b file:line (or breakpoint)
Set a breakpoint at the given function or line.
b *addr (or breakpoint)
Set a breakpoint at the EIP addr.
set print pretty
Enable pretty-printing of arrays and structs.
info registers
Print the general purpose registers, eip, eflags, and the segment selectors. For a much more thorough dump of the machine register state, see QEMU's own info registers command.
x/Nx addr
Display a hex dump of N words starting at virtual address addr. If N is omitted, it defaults to 1. addr can be any expression.
x/Ni addr
Display the N assembly instructions starting at addr. Using $eip as addr will display the instructions at the current instruction pointer.
symbol-file file
(Lab 3+) Switch to symbol file file. When GDB attaches to QEMU, it has no notion of the process boundaries within the virtual machine, so we have to tell it which symbols to use. By default, we configure GDB to use the kernel symbol file, obj/kern/kernel. If the machine is running user code, say hello.c, you can switch to the hello symbol file using symbol-file obj/user/hello.
QEMU represents each virtual CPU as a thread in GDB, so you can use all of GDB's thread-related commands to view or manipulate QEMU's virtual CPUs.
thread n
GDB focuses on one thread (i.e., CPU) at a time. This command switches that focus to thread n, numbered from zero.
info threads
List all threads (i.e., CPUs), including their state (active or halted) and what function they're in.
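  1. Set a breakpoint at address 0x7c00 with b *0x7c00
  2. Run c to continue until the breakpoint is hit
  3. Single-step with si and compare the instructions against boot.S
  4. Use x/8i $eip to display the 8 instructions starting at the current address
  • Set a breakpoint on bootmain() in boot/main.c (not done yet; I could not get a breakpoint on this function to hit)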


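  1. At what point does the processor start executing 32-bit code? What exactly causes the switch from 16-bit to 32-bit mode?
     # Jump to next instruction, but in 32-bit code segment.
     # Switches processor into 32-bit mode.
     ljmp    $PROT_MODE_CSEG, $protcseg
  2. What is the last instruction of the boot loader? What is the first instruction executed after the kernel has been loaded?
  3. What is the address of the kernel's first instruction?
  4. How does the boot loader decide how many sectors it must read to load the whole kernel from disk into memory? Where does it find this information?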
Exercise 4
  • It is worth understanding every detail of the program below, to make sure your C is solid enough for the labs that follow
#include <stdio.h>
#include <stdlib.h>

void
f(void)
{
    int a[4];
    int *b = malloc(16);
    int *c;
    int i;

    printf("1: a = %p, b = %p, c = %p\n", a, b, c);

    c = a;
    for (i = 0; i < 4; i++)
        a[i] = 100 + i;
    c[0] = 200;
    printf("2: a[0] = %d, a[1] = %d, a[2] = %d, a[3] = %d\n",
           a[0], a[1], a[2], a[3]);

    c[1] = 300;
    *(c + 2) = 301;
    3[c] = 302;    // a[3], 3[a], *(a+3) and *(3+a) are all equivalent
    printf("3: a[0] = %d, a[1] = %d, a[2] = %d, a[3] = %d\n",
           a[0], a[1], a[2], 3[a]);

    c = c + 1;
    *c = 400;
    printf("4: a[0] = %d, a[1] = %d, a[2] = %d, a[3] = %d\n",
           a[0], a[1], a[2], a[3]);

    c = (int *) ((char *) c + 1);
    *c = 500;
    printf("5: a[0] = %d, a[1] = %d, a[2] = %d, a[3] = %d\n",
           a[0], a[1], a[2], a[3]);

    b = (int *) a + 1;               // address = a + sizeof(int) * 1 (pointer arithmetic)
    c = (int *) ((char *) a + 1);    // address = a + sizeof(char) * 1
    printf("6: a = %p, b = %p, c = %p\n", a, b, c);
}

int
main(int ac, char **av)
{
    f();
    return 0;
}


1: a = 0x7fffb9dbf680, b = 0x117f010, c = 0x1
2: a[0] = 200, a[1] = 101, a[2] = 102, a[3] = 103
3: a[0] = 200, a[1] = 300, a[2] = 301, a[3] = 302
4: a[0] = 200, a[1] = 400, a[2] = 301, a[3] = 302
5: a[0] = 200, a[1] = 128144, a[2] = 256, a[3] = 302
6: a = 0x7fffb9dbf680, b = 0x7fffb9dbf684, c = 0x7fffb9dbf681
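  1. In C, the array and the index of a subscript expression are interchangeable. This follows from how subscripting is defined in terms of pointers: because addition is commutative, it only matters that one operand is a pointer and the other an integer, not which one comes first. a[3] is equivalent to 3[a], to *(a+3), and to *(3+a).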

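When a C program is compiled and linked, the compiler turns each C source file into an object file (.o) containing assembly instructions encoded in binary form, and the linker combines all the object files into a single binary image in the ELF (Executable and Linkable Format) format.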

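  • ELF
  1. An ELF file consists of four parts: the ELF header, the program header table, the sections, and the section header table
  2. Commonly used ELF sections:
  • .text: the program's executable instructions
  • .rodata: read-only data, such as C string constants
  • .data: the program's initialized data, such as initialized global variables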


$ objdump -h obj/kern/kernel

obj/kern/kernel:     file format elf32-i386

Sections:
Idx Name             Size      VMA       LMA       File off  Algn
  0 .text            000019e9  f0100000  00100000  00001000  2**4
                     CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .rodata          000006c0  f0101a00  00101a00  00002a00  2**5
                     CONTENTS, ALLOC, LOAD, READONLY, DATA
  2 .stab            00003b95  f01020c0  001020c0  000030c0  2**2
                     CONTENTS, ALLOC, LOAD, READONLY, DATA
  3 .stabstr         00001948  f0105c55  00105c55  00006c55  2**0
                     CONTENTS, ALLOC, LOAD, READONLY, DATA
  4 .data            00009300  f0108000  00108000  00009000  2**12
                     CONTENTS, ALLOC, LOAD, DATA
  5 .got             00000008  f0111300  00111300  00012300  2**2
                     CONTENTS, ALLOC, LOAD, DATA
  6 .got.plt         0000000c  f0111308  00111308  00012308  2**2
                     CONTENTS, ALLOC, LOAD, DATA
  7 .data.rel.local  00001000  f0112000  00112000  00013000  2**12
                     CONTENTS, ALLOC, LOAD, DATA
  8                  00000044  f0113000  00113000  00014000  2**2
                     CONTENTS, ALLOC, LOAD, DATA
  9 .bss             00000648  f0113060  00113060  00014060  2**5
                     CONTENTS, ALLOC, LOAD, DATA
 10 .comment         0000002b  00000000  00000000  000146a8  2**0
                     CONTENTS, READONLY


Exercise 5


(gdb) b *0x7c2a
Breakpoint 1 at 0x7c2a
(gdb) c
[   0:7c2a] => 0x7c2a:	mov    %eax,%cr0
Breakpoint 1, 0x00007c2a in ?? ()
(gdb) si
[   0:7c2d] => 0x7c2d:	ljmp   $0x8,$0x7c42
0x00007c2d in ?? ()   // an error is reported here
Exercise 6


(gdb) x /8wx 0x00100000
0x100000:	0x00000000	0x00000000	0x00000000	0x00000000
0x100010:	0x00000000	0x00000000	0x00000000	0x00000000


(gdb) b *0x10000c
Breakpoint 2 at 0x10000c
(gdb) c
The target architecture is assumed to be i386
=> 0x10000c:	movw   $0x1234,0x472
Breakpoint 2, 0x0010000c in ?? ()
(gdb) x /8wx 0x00100000
0x100000:	0x1badb002	0x00000000	0xe4524ffe	0x7205c766
0x100010:	0x34000004	0x2000b812	0x220f0011	0xc0200fd8
Exercise 7

Trace to the movl %eax, %cr0 instruction, then examine memory at 0x00100000 and at 0xf0100000.

(gdb) b *0x100025
Breakpoint 1 at 0x100025
(gdb) c
The target architecture is assumed to be i386
=> 0x100025:	mov    %eax,%cr0
Breakpoint 1, 0x00100025 in ?? ()
(gdb) x/8x 0x00100000
0x100000:	0x1badb002	0x00000000	0xe4524ffe	0x7205c766
0x100010:	0x34000004	0x2000b812	0x220f0011	0xc0200fd8
(gdb) x/8x 0xf0100000
0xf0100000 <_start+4026531828>:	0x00000000	0x00000000	0x00000000	0x00000000
0xf0100010 <entry+4>:	0x00000000	0x00000000	0x00000000	0x00000000
(gdb) si
=> 0x100028:	mov    $0xf010002f,%eax
0x00100028 in ?? ()
(gdb) x/8x 0xf0100000
0xf0100000 <_start+4026531828>:	0x1badb002	0x00000000	0xe4524ffe	0x7205c766
0xf0100010 <entry+4>:	0x34000004	0x2000b812	0x220f0011	0xc0200fd8
(gdb) x/8x 0x00100000
0x100000:	0x1badb002	0x00000000	0xe4524ffe	0x7205c766
0x100010:	0x34000004	0x2000b812	0x220f0011	0xc0200fd8

After commenting out movl %eax, %cr0 and rebuilding the project, running it fails with the following error:

qemu: fatal: Trying to execute code outside RAM or ROM at 0xf010002c
Exercise 8


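  • Variadic argument support is declared in stdarg.h
  • va_list declares a pointer into the argument list
  • void va_start(va_list ap, last_arg); initializes the va_list; last_arg is the named parameter just before the ellipsis
  • type va_arg(va_list ap, type); fetches the next argument and returns it as type
  • void va_end(va_list ap); releases the argument list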


#include <cstdarg>
#include <cstdlib>
#include <iostream>

int sum(const char* msg, ...);

int main()
{
    int total = 0;
    // The trailing 0 is the sentinel that tells sum() where the arguments end.
    total = sum("hello world", 1, 2, 3, 0);
    std::cout << "total = " << total << std::endl;
    system("pause");    // Windows only: keep the console window open
    return 0;
}

int sum(const char* msg, ...)
{
    va_list vaList;           // cursor over the variadic arguments
    va_start(vaList, msg);    // start right after msg, the last named parameter
    std::cout << msg << std::endl;
    int sumNum = 0;
    int step;
    // va_arg returns the next argument as an int and advances the cursor.
    // A nonzero value means there are more arguments; 0 ends the list.
    while (0 != (step = va_arg(vaList, int))) {
        sumNum += step;
    }
    va_end(vaList);           // done with the argument list
    return sumNum;
}
  • Fill in the %o (octal) output
case 'o':
    // Replace this with your code.
    num = getint(&ap, lflag);
    if ((long long) num < 0) {    // negative number: print a minus sign first
        putch('-', putdat);
        num = -(long long) num;   // abs(num)
    }
    base = 8;                     // octal
    goto number;


  • question 1:
    Explain the interface between printf.c and console.c. Specifically, what function does console.c export? How is this function used by printf.c?
  • question 2:
    Explain the following from console.c:
    if (crt_pos >= CRT_SIZE) {
            int i;
            memmove(crt_buf, crt_buf + CRT_COLS, (CRT_SIZE - CRT_COLS) * sizeof(uint16_t));
            for (i = CRT_SIZE - CRT_COLS; i < CRT_SIZE; i++)
                    crt_buf[i] = 0x0700 | ' ';
            crt_pos -= CRT_COLS;
    }

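Answer: when the cursor runs past the end of the screen, the display scrolls up by one line. With the 25-row by 80-column text screen, rows 1-24 are moved up to rows 0-23, every character of the last row (row 24) is set to ' ', and the cursor moves back by one row.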

  • question 3
    int x = 1, y = 3, z = 4;
    cprintf("x %d, y %x, z %d\n", x, y, z);
    In the call to cprintf(), to what does fmt point? To what does ap point?
    List (in order of execution) each call to cons_putc, va_arg, and vcprintf. For cons_putc, list its argument as well. For va_arg, list what ap points to before and after the call. For vcprintf list the values of its two arguments.

  • question 4
    Run the following code.

    unsigned int i = 0x00646c72;
    cprintf("H%x Wo%s", 57616, &i);


He110 World

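Reason: 57616 in hexadecimal is 0xe110, which gives "He110". On a little-endian machine the bytes of i are stored in memory as 0x72, 0x6c, 0x64, 0x00, that is 'r', 'l', 'd', '\0', so %s prints "rld".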

  • question 5
    In the following code, what is going to be printed after ‘y=’? (note: the answer is not a specific value.) Why does this happen?
   cprintf("x=%d y=%d", 3);


Exercise 9

The stack is initialized in entry.S, around line 75:

	movl	$0x0,%ebp			# nuke frame pointer

	# Set the stack pointer
	movl	$(bootstacktop),%esp

	# now to C code
	call	i386_init

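From the instructions above, the frame pointer of the outermost frame is 0x0 and the stack starts at bootstacktop. Calling i386_init() pushes the return eip; the function's prologue then pushes ebp and sets ebp to the new esp.
The kernel reserves its stack in entry.S with the .space directive at the bootstack label: .space KSTKSIZE, where KSTKSIZE is 8 * 4096 bytes, i.e. 32 KB.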


Exercise 10

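Using GDB, set a breakpoint on test_backtrace() (its address can be found in obj/kern/kernel.asm) and look at the details of the function call. The call instruction pushes the return address (step 1); the callee's prologue then pushes ebp (step 2) and sets ebp to the value of esp with mov %esp,%ebp (step 3).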
disas test_backtrace

   0xf0100040 <+0>:	push   %ebp
=> 0xf0100041 <+1>:	mov    %esp,%ebp
   0xf0100043 <+3>:	push   %esi
   0xf0100044 <+4>:	push   %ebx
   0xf0100045 <+5>:	call   0xf01001bc <__x86.get_pc_thunk.bx>
   0xf010004a <+10>:	add    $0x112be,%ebx
   0xf0100050 <+16>:	mov    0x8(%ebp),%esi
   0xf0100053 <+19>:	sub    $0x8,%esp
   0xf0100056 <+22>:	push   %esi
   0xf0100057 <+23>:	lea    -0xf868(%ebx),%eax
   0xf010005d <+29>:	push   %eax
   0xf010005e <+30>:	call   0xf0100a79 <cprintf>
   0xf0100063 <+35>:	add    $0x10,%esp
   0xf0100066 <+38>:	test   %esi,%esi
   0xf0100068 <+40>:	jg     0xf0100095 <test_backtrace+85>
   0xf010006a <+42>:	sub    $0x4,%esp
   0xf010006d <+45>:	push   $0x0
   0xf010006f <+47>:	push   $0x0
   0xf0100071 <+49>:	push   $0x0
   0xf0100073 <+51>:	call   0xf0100883 <mon_backtrace>
   0xf0100078 <+56>:	add    $0x10,%esp
   0xf010007b <+59>:	sub    $0x8,%esp

Exercise 11


Exercise 12

objdump -G obj/kern/kernel > output.md writes the kernel's symbol table (stabs) information to output.md, where the following fragment can be found:

Symnum n_type n_othr n_desc n_value  n_strx String
118    FUN    0      0      f01000a6 2987   i386_init:F(0,25)
119    SLINE  0      24     00000000 0      
120    SLINE  0      34     00000012 0      
121    SLINE  0      36     00000017 0      
122    SLINE  0      39     0000002b 0      
123    SLINE  0      43     0000003a 0      


objdump -h kernel

kernel:     file format elf32-i386

Sections:
Idx Name             Size      VMA       LMA       File off  Algn
  0 .text            00001ad9  f0100000  00100000  00001000  2**4
                     CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .rodata          00000714  f0101ae0  00101ae0  00002ae0  2**5
                     CONTENTS, ALLOC, LOAD, READONLY, DATA
  2 .stab            00003cd9  f01021f4  001021f4  000031f4  2**2
                     CONTENTS, ALLOC, LOAD, READONLY, DATA
  3 .stabstr         0000196b  f0105ecd  00105ecd  00006ecd  2**0
                     CONTENTS, ALLOC, LOAD, READONLY, DATA
  4 .data            00009300  f0108000  00108000  00009000  2**12
                     CONTENTS, ALLOC, LOAD, DATA
  5 .got             00000008  f0111300  00111300  00012300  2**2
                     CONTENTS, ALLOC, LOAD, DATA
  6 .got.plt         0000000c  f0111308  00111308  00012308  2**2
                     CONTENTS, ALLOC, LOAD, DATA
  7 .data.rel.local  00001000  f0112000  00112000  00013000  2**12
                     CONTENTS, ALLOC, LOAD, DATA
  8                  00000044  f0113000  00113000  00014000  2**2
                     CONTENTS, ALLOC, LOAD, DATA
  9 .bss             00000648  f0113060  00113060  00014060  2**5
                     CONTENTS, ALLOC, LOAD, DATA
 10 .comment         0000002b  00000000  00000000  000146a8  2**0
                     CONTENTS, READONLY

Note this printf usage: printf("%.*s", length, string) prints at most length characters of string.

  • Lab 1 scored 50 points
struct Eipdebuginfo info;
// p is the current frame pointer (ebp); p[1] is the saved return address.
if (debuginfo_eip(p[1], &info) == 0) {
    cprintf("\t%s:%d: %.*s+%d\n", info.eip_file, info.eip_line,
            info.eip_fn_namelen, info.eip_fn_name, p[1] - info.eip_fn_addr);
}


MMU (memory management unit): translates virtual addresses into physical addresses and enforces hardware access permissions.

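  • Inline assembly in GCC

  • The in and out instructions in assembly

  IN AL,21H     ; read one byte from port 21H into AL
  IN AX,21H     ; read one byte from port 21H into AL and one byte from port 22H into AH
  MOV DX,379H
  IN AL,DX      ; read one byte from port 379H into AL
  OUT 21H,AL    ; write AL to port 21H
  OUT 21H,AX    ; write AX to the two consecutive ports starting at 21H (port[21H]=AL, port[22H]=AH)
  MOV DX,378H
  OUT DX,AX     ; write AL to port 378H and AH to port 379H
  • cld and the DF flag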

  • On memory usage

  • On the GDT and LDT
     Base: 32 bits, the base address of the segment.

Limit: 20 bits, the size of the segment.

Flags: 12 bits, the access permissions of the segment.



