⌈xv6-fall2021⌋ lab 2：System calls


 2022/03/16 

LAB

INTRO

If you run make grade, you will see that the grading script cannot exec trace and sysinfotest. Your job is to add the necessary system calls and stubs to make them work.

In short: make grade fails on trace and sysinfotest at first, and the lab consists of adding the system calls and stubs that make both pass.

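If the term stub is unfamiliar: here it means the small user-space routine that stands in for the system call, doing nothing but loading the syscall number and trapping into the kernel.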

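Before the lab proper, one thing needs to be clear: adding code to xv6 is not enough on its own, you also have to wire up the complete system call interface. Having finished the lab, I will share up front the flowchart I ended up with for adding a syscall. If you would rather work it out yourself, skip this part.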

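In user/usys.pl, add the stub: append an entry("SYSCALL-NAME") line at the end
In kernel/syscall.h, append the macro that defines the syscall number (SYS_<SYSCALL-NAME>)
In kernel/syscall.c:
- Declare your handler with extern (extern uint64 sys_<SYSCALL-NAME>(void);). The function body can live anywhere; I put mine in kernel/sysproc.c, but a new file in the same directory works too
  - [Note] The parameter list must be void, because the xv6 dispatch table only holds pointers to functions that take no parameters
  - [Note] The arguments the user passes to the syscall are saved in the process's trapframe; see xv6-2021-handout, p44, Section 4.4, paragraph 1
- Add the new function to the function pointer array
- Add the new syscall's name to the syscall name array (this array is new; the trace assignment asks for it). From here on, "syscall" means a system call in general and syscall() means the xv6 function of that name
In user/user.h, add the prototype of the new syscall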

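At this point you may not see why the procedure looks like this. You may even wonder: in kernel/syscall.c the system call appears to be defined as sys_trace(), so why do I invoke it as trace in xv6?

To answer that, look at how a system call is actually invoked:

When the user runs trace, the user program calls trace(), which resolves to the stub in user/usys.S, the assembly file that the Perl script user/usys.pl generates at build time
In that script, sub entry() takes the entry name (here "trace") and builds SYS_trace from it, so the stub loads SYS_trace into register a7 and executes ecall
In the kernel, syscall() takes the number that kernel/syscall.h defines as SYS_trace and uses it to index the function pointer array in kernel/syscall.c, which leads to sys_trace(void)

That completes the path a newly added system call takes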

System call tracing (MODERATE)

要求

In this assignment you will add a system call tracing feature that may help you when debugging later labs. You'll create a new trace system call that will control tracing. It should take one argument, an integer "mask", whose bits specify which system calls to trace. For example, to trace the fork system call, a program calls trace(1 << SYS_fork), where SYS_fork is a syscall number from kernel/syscall.h. You have to modify the xv6 kernel to print out a line when each system call is about to return, if the system call's number is set in the mask. The line should contain the process id, the name of the system call and the return value; you don't need to print the system call arguments. The trace system call should enable tracing for the process that calls it and any children that it subsequently forks, but should not affect other processes.

In short: trace(mask) turns on tracing for the calling process. For every system call whose number has its bit set in the mask, the kernel prints one line (process id, syscall name, return value, but not the arguments) just before the call returns. The mask applies to the caller and to the children it forks afterwards, and must leave unrelated processes alone.

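The examples reveal how the mask works:

First example: trace 32 prints the read syscalls, and the text below it explains "32 is 1 << SYS_read", where SYS_read is a macro defined in kernel/syscall.h. Bit 5 is set, so syscall number 5, which is read, is traced
Second example: trace 2147483647 prints many different syscalls. 2147483647 is 2^31 - 1, so all 31 low bits (bits 0 through 30) are set and every syscall numbered 30 or below is traced
Third example: trace 2 prints the fork syscalls, because bit 1 is set and syscall number 1 is fork. Note that tracing covers the parent and all of its child processes.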

We provide a trace user-level program that runs another program with tracing enabled (see user/trace.c)

In other words, you do not invoke the system call by hand. The user-level trace program makes the trace system call itself and then runs the command you gave it, so that command executes with tracing enabled.

提示

Run make qemu and you will see that the compiler cannot compile user/trace.c, because the user-space stubs for the system call don't exist yet: add a prototype for the system call to user/user.h, a stub to user/usys.pl, and a syscall number to kernel/syscall.h. The Makefile invokes the perl script user/usys.pl, which produces user/usys.S, the actual system call stubs, which use the RISC-V ecall instruction to transition to the kernel. Once you fix the compilation issues, run trace 32 grep hello README; it will fail because you haven't implemented the system call in the kernel yet.

In short: the build breaks until the prototype, the stub entry and the syscall number all exist, and even after that trace 32 grep hello README fails until the kernel side is implemented. This is the installation procedure described at the start of this post, so I will not repeat it here.

Add a sys_trace() function in kernel/sysproc.c that implements the new system call by remembering its argument in a new variable in the proc structure (see kernel/proc.h). The functions to retrieve system call arguments from user space are in kernel/syscall.c, and you can see examples of their use in kernel/sysproc.c.

In short: sys_trace() in kernel/sysproc.c stores its argument (the integer mask) in a new field of struct proc. The helpers that fetch system call arguments are defined in kernel/syscall.c, and the existing handlers in kernel/sysproc.c show how to use them.

Modify fork() (see kernel/proc.c) to copy the trace mask from the parent to the child process.

In short: fork() in kernel/proc.c must hand the parent's trace mask on to the child.

Modify the syscall() function in kernel/syscall.c to print the trace output. You will need to add an array of syscall names to index into.

In short: syscall() does the printing, and it needs a new array that maps each syscall number to its name.

Summary

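To summarize the task:

If a syscall's number is set in the mask, that syscall must print the required line when it returns
See user/trace.c for how the user-level trace program works
The mask argument: whichever bit is set, the syscall whose number equals that bit position is traced
Follow the syscall installation procedure above so the new syscall compiles and runs
Add a field to struct proc in kernel/proc.h to record the argument passed to trace
Fetch the user's argument with the helpers from kernel/syscall.c; the other syscall handlers in kernel/sysproc.c show how
Modify kernel/proc.c so that fork() passes the parent's mask on to the child
Modify syscall() in kernel/syscall.c to print the trace line, and add an array holding the syscall names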

Solution

//// path: kernel/syscall.h
#define SYS_trace   22

//// path: kernel/syscall.c
// In the block of extern declarations (around line 107)
extern uint64 sys_trace(void);
// In the *syscalls[] function pointer array
[SYS_trace]   sys_trace,
// New array of syscall names, indexed by syscall number (place it after syscalls[])
static char *syscallname[NELEM(syscalls)] = {
  "",       "fork",   "exit",   "wait",   "pipe",
  "read",   "kill",   "exec",   "fstat",  "chdir",
  "dup",    "getpid", "sbrk",   "sleep",  "uptime",
  "open",   "write",  "mknod",  "unlink", "link",
  "mkdir",  "close",  "trace",
};
// In syscall(), inside the if() branch, after a0 has received the return value
if ((p->tmask >> num) & 1) {
  // a0 is a uint64; cast it so the argument matches %d
  printf("%d: syscall %s -> %d\n", p->pid, syscallname[num], (int)p->trapframe->a0);
}

//// path: kernel/proc.h
// Inside struct proc {}
int tmask;  // trace mask passed in by the trace system call

//// path: kernel/proc.c
// Inside int fork(void), once np has been allocated
np->tmask = p->tmask;

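//// path: kernel/sysproc.c
uint64
sys_trace(void)
{
  int mask;

  // Fetch trace()'s argument with the helpers from kernel/syscall.c;
  // the other handlers in this file show the pattern.
  // Why arguments are fetched this way: see the xv6 handout, Section 4.4.
  if(argint(0, &mask) < 0)  // read the user's argument from a0 into mask
    return -1;
  myproc()->tmask = mask;   // store the mask in this process's struct proc
  return 0;
}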

//// path: user/user.h
int trace(int);

//// path: user/usys.pl
entry("trace");

Sysinfo (MODERATE)

要求

In this assignment you will add a system call, sysinfo, that collects information about the running system. The system call takes one argument: a pointer to a struct sysinfo (see kernel/sysinfo.h). The kernel should fill out the fields of this struct: the freemem field should be set to the number of bytes of free memory, and the nproc field should be set to the number of processes whose state is not UNUSED. We provide a test program sysinfotest; you pass this assignment if it prints "sysinfotest: OK".

In short: sysinfo takes a pointer to a struct sysinfo and the kernel fills it in. freemem is the number of bytes of free memory, and nproc is the number of processes whose state is not UNUSED. The assignment passes when the provided test program sysinfotest prints "sysinfotest: OK".

提示

To declare the prototype for sysinfo() in user/user.h you need to predeclare the existence of struct sysinfo:
struct sysinfo;
int sysinfo(struct sysinfo *);

In short: the forward declaration of struct sysinfo has to come before the sysinfo() prototype in user/user.h, exactly as shown above.

sysinfo needs to copy a struct sysinfo back to user space; see sys_fstat() (kernel/sysfile.c) and filestat() (kernel/file.c) for examples of how to do that using copyout().

In short: the kernel fills in its own struct sysinfo and then copies it to the user's address with copyout(); sys_fstat() and filestat() are the models to follow.

To collect the amount of free memory, add a function to kernel/kalloc.c

In short: the free-memory count gets its own function in kernel/kalloc.c.

To collect the number of processes, add a function to kernel/proc.c

In short: the process count gets its own function in kernel/proc.c.

Summary

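The hard part of this assignment is working out how to count free memory and processes

For free memory, read kernel/kalloc.c carefully and work out what each function and the run and kmem structs do:

struct run: a node of the free list. Each free page stores a struct run in its first bytes, and next points to the following free page
struct kmem: the allocator's global state, a spinlock plus freelist, the head of the linked list of free pages (see kalloc() in kernel/kalloc.c)

Counting free memory therefore comes down to walking the free list in kmem. Keep in mind that each list node is one free page of PGSIZE bytes, and the value to return is a number of bytes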

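For the number of processes, read kernel/proc.c and work out how processes are organized; I got there by reading procinit(), allocproc(), scheduler() and kill(). In short, processes live in a fixed-size array, so any operation over all processes is just a loop over that array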

Solution

//// path: kernel/syscall.h
#define SYS_sysinfo 23

//// path: kernel/syscall.c
// In the block of extern declarations (around line 107)
extern uint64 sys_sysinfo(void);
// In the *syscalls[] function pointer array
[SYS_sysinfo] sys_sysinfo,
// In the syscall name array added for trace
static char *syscallname[NELEM(syscalls)] = {
  ... ... "sysinfo",
};

//// path: kernel/defs.h
// Under the kalloc.c comment
int freenum(void);
// Under the proc.c comment
int procnum(void);

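//// path: kernel/kalloc.c
// Return the number of bytes of free physical memory.
int
freenum(void)
{
  struct run *p;
  int cr = 0;

  acquire(&kmem.lock);  // keep the list stable while walking it
  for(p = kmem.freelist; p; p = p->next)
    ++cr;               // one node per free page
  release(&kmem.lock);

  return cr * PGSIZE;
}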

//// path: kernel/proc.c
// Return the number of processes whose state is not UNUSED.
int
procnum(void)
{
  struct proc *p;
  int cr = 0;

  for(p = proc; p < &proc[NPROC]; p++)
    if(p->state != UNUSED)
      cr++;
  return cr;
}

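//// path: kernel/sysproc.c
#include "sysinfo.h"  // with the other includes at the top of the file, after types.h
uint64
sys_sysinfo(void)
{
  uint64 uptr;
  struct sysinfo info;
  struct proc *p = myproc();

  if(argaddr(0, &uptr) < 0)  // user-space address of the caller's struct sysinfo
    return -1;
  info.freemem = freenum();
  info.nproc = procnum();
  // copy the filled-in struct out to the user's address space
  if(copyout(p->pagetable, uptr, (char *)&info, sizeof(info)) < 0)
    return -1;
  return 0;
}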

//// path: user/user.h
struct sysinfo;
int sysinfo(struct sysinfo *);

//// path: user/usys.pl
entry("sysinfo");