⌈xv6-fall2021⌋ lab 9：file system

 2022/04/17 

Large files (MODERATE)

Lab requirements (excerpt)

In this assignment you'll increase the maximum size of an xv6 file. Currently xv6 files are limited to 268 blocks, or 268*BSIZE bytes (BSIZE is 1024 in xv6). This limit comes from the fact that an xv6 inode contains 12 "direct" block numbers and one "singly-indirect" block number, which refers to a block that holds up to 256 more block numbers, for a total of 12+256=268 blocks.

You'll change the xv6 file system code to support a "doubly-indirect" block in each inode, containing 256 addresses of singly-indirect blocks, each of which can contain up to 256 addresses of data blocks. The result will be that a file will be able to consist of up to 65803 blocks, or 256*256+256+11 blocks (11 instead of 12, because we will sacrifice one of the direct block numbers for the double-indirect block).
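
The two limits quoted above can be checked with a few lines of arithmetic. This is only a sketch: `max_blocks` is a hypothetical helper, not an xv6 function, and it assumes a 4-byte `unsigned int`, as on the platforms xv6 targets.

```c
#include <assert.h>

#define BSIZE 1024u                               /* xv6 block size in bytes */
#define NINDIRECT (BSIZE / sizeof(unsigned int))  /* block numbers per index block */

/* Maximum file size in blocks for a layout with `ndirect` direct slots,
 * one singly-indirect block and, optionally, one doubly-indirect block.
 * Hypothetical helper for illustration only. */
unsigned long max_blocks(unsigned ndirect, int has_double)
{
    assert(sizeof(unsigned int) == 4);            /* assumption: 256 entries per block */
    unsigned long n = ndirect + NINDIRECT;        /* direct + singly-indirect */
    if (has_double)
        n += (unsigned long)NINDIRECT * NINDIRECT;
    return n;
}
```

Under these assumptions `max_blocks(12, 0)` gives the original limit of 268 blocks and `max_blocks(11, 1)` gives the new limit of 65803 blocks.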

The format of an on-disk inode is defined by struct dinode in fs.h. You're particularly interested in NDIRECT, NINDIRECT, MAXFILE, and the addrs[] element of struct dinode.

The code that finds a file's data on disk is in bmap() in fs.c. Have a look at it and make sure you understand what it's doing. bmap() is called both when reading and writing a file. When writing, bmap() allocates new blocks as needed to hold file content, as well as allocating an indirect block if needed to hold block addresses.

Modify bmap() so that it implements a doubly-indirect block, in addition to direct blocks and a singly-indirect block. You'll have to have only 11 direct blocks, rather than 12, to make room for your new doubly-indirect block; you're not allowed to change the size of an on-disk inode. The first 11 elements of ip->addrs[] should be direct blocks; the 12th should be a singly-indirect block (just like the current one); the 13th should be your new doubly-indirect block.

bigfile will take at least a minute and a half to run.

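Lab hints

The lab hints are fairly simple, so I only summarize them:

Understand bmap(), mainly by following how it reads the singly-indirect block
Think about how a logical block number relates to the singly-indirect and doubly-indirect blocks
Only two functions need real changes: bmap() and itrunc()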

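Implementation approach

Just follow the way bmap() already reads the singly-indirect block.

Reading the singly-indirect block:

First read the address of the singly-indirect block (it is really a block number)
Call bread() with that block number to read the disk block (really the in-memory copy of the disk block, represented by struct buf)
xv6 represents a cached block as metadata plus the actual data, and the data is reached through the buffer (bp->data). Because we are still inside the singly-indirect block, that data is itself an array of block numbers. Index into it to get the number of the actual data block, and return that number

Reading the doubly-indirect block (by analogy with the steps above):

Read the address of the doubly-indirect block (again, really a block number)
Read the disk block with that block number. This block is the doubly-indirect block, so its data holds the addresses of singly-indirect blocks
Once you have the address of a singly-indirect block, the rest is exactly the same as above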
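
The walk described above can be tried out in user space. The following is a minimal sketch, not xv6 code: the toy disk, `toy_balloc`, `lookup_or_alloc` and `walk_double` are all hypothetical names, and the "disk" is just an in-memory array in which block number 0 means "not allocated", as in xv6.

```c
#include <assert.h>
#include <string.h>

#define NINDIRECT 256            /* block numbers per index block */
#define NBLOCKS   1024           /* size of the toy disk (hypothetical) */

static unsigned disk[NBLOCKS][NINDIRECT];  /* each block viewed as 256 block numbers */
static unsigned next_free = 1;             /* block 0 is reserved: it means "unallocated" */

/* Hand out the next unused block, zeroed. */
unsigned toy_balloc(void)
{
    assert(next_free < NBLOCKS);
    memset(disk[next_free], 0, sizeof disk[next_free]);
    return next_free++;
}

/* Return the block number stored in *entry, allocating a block first if it is 0. */
unsigned lookup_or_alloc(unsigned *entry)
{
    if (*entry == 0)
        *entry = toy_balloc();
    return *entry;
}

/* Walk doubly-indirect -> singly-indirect -> data block.
 * `root` stands for the inode's doubly-indirect slot; `bn` is the block
 * index relative to the start of the doubly-indirect range. */
unsigned walk_double(unsigned *root, unsigned bn)
{
    assert(bn < NINDIRECT * NINDIRECT);
    unsigned dbl = lookup_or_alloc(root);                        /* doubly-indirect block */
    unsigned sgl = lookup_or_alloc(&disk[dbl][bn / NINDIRECT]);  /* singly-indirect block */
    return lookup_or_alloc(&disk[sgl][bn % NINDIRECT]);          /* data block */
}
```

Calling `walk_double` twice with the same `bn` returns the same data block, and moving from `bn = 255` to `bn = 256` forces a second singly-indirect block to be allocated, which mirrors what the real bmap() has to do with bread(), balloc() and log_write().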

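One thing to watch, though, is how a logical block number relates to the direct blocks and the singly- and doubly-indirect blocks. Here is an example:

The original code supports at most 12 + 256 blocks, so logical block numbers run from 0 to 267. To access logical block #250, first subtract the 12 direct blocks, which gives 238. That 238 is the index of the entry inside the singly-indirect block. Both numbers are 0-based, so there is no off-by-one: logical block 12 maps to entry 0, and logical block 250 maps to entry 238.

With the doubly-indirect level added there are 11 + 256 + 256 * 256 blocks. To access a block in the new range, first subtract the direct blocks and the singly-indirect range. Then bn / NINDIRECT selects the entry in the doubly-indirect block, and bn % NINDIRECT selects the entry in the chosen singly-indirect block.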
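
To make the index arithmetic concrete, here is a small user-space sketch. `locate`, `struct slot` and the `LVL_*` names are made up for illustration and do not exist in xv6; the number of direct slots is a parameter so that both the old layout (12) and the new one (11) can be tried.

```c
#include <assert.h>

#define NINDIRECT 256u   /* block numbers per index block */

enum level { LVL_DIRECT, LVL_SINGLE, LVL_DOUBLE, LVL_NONE };

struct slot {
    enum level level;
    unsigned outer;   /* entry in the doubly-indirect block (LVL_DOUBLE only) */
    unsigned inner;   /* entry in addrs[] or in a singly-indirect block */
};

/* Map a logical block number to the index level and entries that hold it. */
struct slot locate(unsigned ndirect, unsigned bn)
{
    struct slot s = { LVL_NONE, 0, 0 };

    if (bn < ndirect) {                 /* one of the direct slots */
        s.level = LVL_DIRECT;
        s.inner = bn;
        return s;
    }
    bn -= ndirect;

    if (bn < NINDIRECT) {               /* inside the singly-indirect block */
        s.level = LVL_SINGLE;
        s.inner = bn;
        return s;
    }
    bn -= NINDIRECT;

    if (bn < NINDIRECT * NINDIRECT) {   /* under the doubly-indirect block */
        s.level = LVL_DOUBLE;
        s.outer = bn / NINDIRECT;
        s.inner = bn % NINDIRECT;
        assert(s.outer < NINDIRECT);
    }
    return s;                           /* LVL_NONE: beyond the maximum file size */
}
```

For instance, `locate(12, 250)` lands on entry 238 of the singly-indirect block, while with 11 direct slots logical block 267 is the first block under the doubly-indirect block (outer 0, inner 0) and 65802 is the last one (outer 255, inner 255).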

Solution

// path: kernel/file.h
struct inode {
    ...
    uint addrs[NDIRECT+2];
};

// path: kernel/fs.h
#define NDIRECT 11
#define NINDIRECT (BSIZE / sizeof(uint))
#define MAXFILE (NDIRECT + NINDIRECT + NINDIRECT * NINDIRECT)

// On-disk inode structure
struct dinode {
    short type;           // File type
    short major;          // Major device number (T_DEVICE only)
    short minor;          // Minor device number (T_DEVICE only)
    short nlink;          // Number of links to inode in file system
    uint size;            // Size of file (bytes)
    uint addrs[NDIRECT+2];   // Data block addresses
};

// path: kernel/fs.c
static uint
bmap(struct inode *ip, uint bn) {
    ...
    if(bn < NDIRECT) {
        ...
    }
    bn -= NDIRECT;
    
    if(bn < NINDIRECT) {
        ...
    }

    // added for lab fs
    bn -= NINDIRECT;
    if (bn < NINDIRECT * NINDIRECT) {
        // load the doubly-indirect block, allocating it if needed (0 means unallocated)
        if((addr = ip->addrs[NDIRECT + 1]) == 0)
            ip->addrs[NDIRECT + 1] = addr = balloc(ip->dev);
        bp = bread(ip->dev, addr);// read the doubly-indirect block
        a = (uint*)bp->data;// view its data as an array of block numbers
        // look up the singly-indirect block
        if((addr = a[bn / NINDIRECT]) == 0) {// 0 means unallocated
            a[bn / NINDIRECT] = addr = balloc(ip->dev);
            log_write(bp);
        }
        brelse(bp);

        // look up the data block
        bp = bread(ip->dev, addr);// read the singly-indirect block
        a = (uint*)bp->data;// view its data as an array of block numbers
        if ((addr = a[bn % NINDIRECT]) == 0) {
            a[bn % NINDIRECT] = addr = balloc(ip->dev);
            log_write(bp);
        }
        brelse(bp);
        return addr;
    }

    ...
}

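void
itrunc(struct inode *ip)
{
    int i, j, k;
    struct buf *bp, *bp2;
    uint *a, *a2;

    ...

    // doubly-indirect block
    if (ip->addrs[NDIRECT + 1]) {
        bp2 = bread(ip->dev, ip->addrs[NDIRECT + 1]);
        a2 = (uint*)bp2->data;
        for (j = 0; j < NINDIRECT; ++j) {
            if (a2[j] == 0)    // this singly-indirect block was never allocated
                continue;
            bp = bread(ip->dev, a2[j]);
            a = (uint*)bp->data;
            // singly-indirect block: free every data block it points to
            for (k = 0; k < NINDIRECT; ++k) {
                if (a[k])    bfree(ip->dev, a[k]);
            }
            brelse(bp);
            bfree(ip->dev, a2[j]);    // free the singly-indirect block itself
        }
        brelse(bp2);
        // note the index is NDIRECT + 1, not NINDIRECT + 1
        bfree(ip->dev, ip->addrs[NDIRECT + 1]);
        ip->addrs[NDIRECT + 1] = 0;
    }

    ...
}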
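
One way to sanity-check itrunc() is to count how many blocks a file occupies once index blocks are included, since every one of them has to be freed. The sketch below is a hypothetical helper, not part of xv6, and it assumes the file was written densely from block 0 with no holes.

```c
#include <assert.h>

#define NDIRECT   11u
#define NINDIRECT 256u

/* Blocks occupied by a file with `ndata` data blocks, index blocks included.
 * Assumes a dense file (no holes). Hypothetical helper for illustration. */
unsigned blocks_in_use(unsigned ndata)
{
    assert(ndata <= NDIRECT + NINDIRECT + NINDIRECT * NINDIRECT);

    unsigned total = ndata;
    if (ndata <= NDIRECT)
        return total;                        /* direct blocks only */

    total += 1;                              /* the singly-indirect block */
    if (ndata <= NDIRECT + NINDIRECT)
        return total;

    unsigned rest = ndata - NDIRECT - NINDIRECT;   /* data blocks under the doubly-indirect block */
    total += 1;                                    /* the doubly-indirect block */
    total += (rest + NINDIRECT - 1) / NINDIRECT;   /* singly-indirect blocks below it */
    return total;
}
```

Under that assumption a maximum-size file of 65803 data blocks occupies 66061 blocks in total, so a truncation that frees only the data blocks would leak 258 index blocks.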

Symbolic links (MODERATE)

Lab requirements (excerpt)

Symbolic links (or soft links) refer to a linked file by pathname; when a symbolic link is opened, the kernel follows the link to the referred file.

You will implement the symlink(char *target, char *path) system call, which creates a new symbolic link at path that refers to file named by target.

Lab hints (excerpt)

Note that target does not need to exist for the system call to succeed. You will need to choose somewhere to store the target path of a symbolic link, for example, in the inode's data blocks.

Modify the open system call to handle the case where the path refers to a symbolic link. If the file does not exist, open must fail. When a process specifies O_NOFOLLOW in the flags to open, open should open the symlink (and not follow the symbolic link).

Other system calls (e.g., link and unlink) must not follow symbolic links; these system calls operate on the symbolic link itself.

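Implementation approach

For the steps needed to add a system call, see the summary at the start of my post on the system call lab.

First, what is a symbolic link? Based on a CSDN post about lab fs:

It is also a file, and what it stores is the target path string
It can be opened in two ways:
- follow: open the file that target refers to
- nofollow: open the link file itself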

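A call like symlink("/usr/a", "/usr/b") creates a file b under /usr/ that refers to another file a in the same directory. Opening b in follow mode actually opens a; otherwise b itself is opened. I don't know exactly what the opened link looks like, and for this lab it doesn't matter.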
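
The follow and nofollow behaviour, together with a hop limit that guards against cycles, can be modelled in user space. This is a toy sketch only: the table, `toy_namei`, `toy_open` and `MAX_HOPS` are hypothetical stand-ins, and a real kernel would look up inodes instead of scanning an array.

```c
#include <assert.h>
#include <string.h>

enum ftype { T_FILE = 2, T_SYMLINK = 4 };
#define MAX_HOPS 10   /* give up after this many links; treated as a cycle */

struct entry { const char *path; enum ftype type; const char *target; };

/* Toy namespace (hypothetical contents). */
static const struct entry table[] = {
    { "/usr/a", T_FILE,    0 },
    { "/usr/b", T_SYMLINK, "/usr/a" },        /* b -> a */
    { "/usr/c", T_SYMLINK, "/usr/b" },        /* c -> b -> a */
    { "/usr/x", T_SYMLINK, "/usr/y" },        /* x and y form a cycle */
    { "/usr/y", T_SYMLINK, "/usr/x" },
    { "/usr/d", T_SYMLINK, "/usr/missing" },  /* dangling link */
};

/* Look a path up in the table; returns 0 if it does not exist. */
const struct entry *toy_namei(const char *path)
{
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++)
        if (strcmp(table[i].path, path) == 0)
            return &table[i];
    return 0;
}

/* Open a path: follow links unless nofollow is set; 0 means failure. */
const struct entry *toy_open(const char *path, int nofollow)
{
    const struct entry *e = toy_namei(path);
    int hops = 0;

    while (e && e->type == T_SYMLINK && !nofollow) {
        if (hops++ == MAX_HOPS)
            return 0;                 /* too many links: assume a cycle */
        assert(e->target != 0);
        e = toy_namei(e->target);     /* 0 if the target does not exist */
    }
    return e;
}
```

In this model `toy_open("/usr/c", 0)` ends up at `/usr/a`, `toy_open("/usr/b", 1)` returns the link itself, and both the cycle `/usr/x` and the dangling link `/usr/d` fail in follow mode.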

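So symlink() really has only one piece of logic: create a file, which a call to create() takes care of. Note that create() returns the new inode locked and with a reference held. Once the target path has been recorded there is nothing more to do with the inode, so call iunlockput() to release the lock and drop the reference. When the inode is needed again, namei() on the file name gets it back.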

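One last thing worth noting: to run make grade I had to raise the timeout. On my machine both the previous assignment and this one take close to 600 s, which does not fit within the original timeout. I'm not sure whether changing it counts as breaking the rules..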

Solution

// path: ./Makefile
UPROGS=\
    ...
    $U/_symlinktest\

// path: ./grade-lab-fs
...
@test(40, "running bigfile")
def test_bigfile():
    r.run_qemu(shell_script([
        'bigfile'
    ]), timeout=2000)
    r.match('^wrote 65803 blocks$')
    r.match('^bigfile done; ok$')
...
@test(19, "usertests")
def test_usertests():
    r.run_qemu(shell_script([
        'usertests'
    ]), timeout=2000)
    r.match('^ALL TESTS PASSED$')

// path: user/usys.pl
entry("symlink");

// path: user/user.h
// ulib.c
...
int symlink(char *, char *);

// path: kernel/syscall.h
#define SYS_symlink 23

// path: kernel/syscall.c
extern uint64 sys_symlink(void);

static uint64 (*syscalls[])(void) = {
...
[SYS_symlink] sys_symlink,
};

// path: kernel/stat.h
#define T_SYMLINK 4

// path: kernel/fcntl.h
#define O_NOFOLLOW 0x100// may be passed to open()

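// path: kernel/file.h
struct inode {
    ...
    // symlink target pathname
    // note: this field lives only in the in-memory inode; it is not copied to the on-disk dinode
    char symlink[128];
    ...
};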

// path: kernel/sysfile.c
uint64
sys_open(void) {
    ...
    if(omode & O_CREATE) {
        ...
    } else {
        if((ip = namei(path)) == 0) {
            ...
        }

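        // lab fs
        // a link may point to another link, so follow iteratively
        int depth = 11;
        while (depth--) {
            if (ip->type == T_SYMLINK && (omode & O_NOFOLLOW) == 0) {
                if (depth == 0) {// still a link after 10 hops: treat as a cycle
                    iput(ip);// drop the reference obtained from namei()
                    end_op();
                    return -1;
                }
                // O_NOFOLLOW is clear, so open in follow mode
                struct inode *next = namei(ip->symlink);
                iput(ip);// done with the link itself, drop its reference
                if (next == 0) {
                    end_op();
                    return -1;
                }
                ip = next;
            } else    break;
        }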

        ...
    }

    ...
}

// Create a symbolic link: a file at path that refers to target.
// target does not need to exist for this system call to succeed.
uint64
sys_symlink(void) {
    char target[MAXPATH], path[MAXPATH];

    if(argstr(0, target, MAXPATH) < 0 || argstr(1, path, MAXPATH) < 0)
        return -1;

    struct inode *ip;
    begin_op();

    // a symbolic link is also a file
    // create() returns the inode locked
    ip = create(path, T_SYMLINK, 0, 0);
    if(ip == 0){
        end_op();
        return -1;
    }

    strncpy(ip->symlink, target, MAXPATH);

    iunlockput(ip);
    end_op();
    return 0;
}