Finding register contents and arguments at routine entry (fregs)
================================================================

The fregs command attempts to determine the contents of CPU registers at the time of entry to kernel routines in a stack trace. It also attempts to identify arguments to those routines and, in some cases, decode information from the arguments. This is a helpful tool in vmcore analysis.

Options provided by 'fregs'::

   crash> fregs -h
   usage: fregs [-h] [-V] [-a] [-l] [-r ROUTINE] [-u | -A] [pid|taskp|cmd]

   Show register contents at routine entry.

   positional arguments:
     pid|taskp|cmd         PID or task struct pointer or command (if omitted, use
                           current context)

   optional arguments:
     -h, --help            show this help message and exit
     -V, --version         show program's version number and exit
     -a, --args            identify arguments (-aa for more detail)
     -l, --lines           show source code line numbers
     -r ROUTINE, --routine ROUTINE
                           only show routines whose names include ROUTINE
     -u, --unint           do all uninterruptible tasks
     -A, --all             do ALL tasks (this may take a while!)

**NOTE: fregs is supported on vmcores from x86_64 systems only.**

* `Overview`_
* `Limitations`_
* `Show a specific process or command (pid|taskp|cmd)`_
* `Identify arguments (-a, --args)`_
* `Show source code line numbers (-l, --lines)`_
* `Show only matching routines (-r, --routine)`_
* `Show all uninterruptible tasks (-u, --unint)`_
* `Show all tasks (-A, --all)`_

Overview
--------

Determining the arguments to kernel routines and the contents of CPU registers is a fundamental part of crash dump analysis. This is commonly done by disassembling kernel code, manually following the code flow, and examining stack contents. This technique can be tedious and time-consuming. The fregs command automates much (though not all) of this work. It is not a complete replacement for manual analysis (see "Limitations" below) but can save significant time and effort.

This example is from a forced crash of a hung system. There is a 'grep' process that has been stuck in uninterruptible mode for over 43 minutes::

   crash> ps -m 84981
   [ 0 00:43:30.858] [UN] PID: 84981 TASK: ffff8a000315c340 CPU: 4 COMMAND: "grep"

The stack trace shows that it is waiting for a mutex while trying to write to a file::

   #0 [ffff8a00027c5bc0] schedule at ffffffff8143f369
   #1 [ffff8a00027c5d08] __mutex_lock_slowpath at ffffffff8144056f
   #2 [ffff8a00027c5d78] mutex_lock at ffffffff8143fffa
   #3 [ffff8a00027c5d90] generic_file_aio_write at ffffffff810f188e
   #4 [ffff8a00027c5e00] do_sync_write at ffffffff8114f910
   #5 [ffff8a00027c5f10] vfs_write at ffffffff8114ff2e
   #6 [ffff8a00027c5f40] sys_write at ffffffff811500a3
   #7 [ffff8a00027c5f80] system_call_fastpath at ffffffff81449392

It would be helpful to identify the file being written to. The vfs_write() routine is a good candidate for this because its first argument is a pointer to a file structure::

   crash> whatis vfs_write
   ssize_t vfs_write(struct file *, const char *, size_t, loff_t *);

In the x86_64 architecture, the first six arguments are passed in registers RDI, RSI, RDX, RCX, R8, and R9. (Subsequent arguments would be stored on the stack, but Linux kernel routines generally do not take more than six arguments.) In this case the file pointer is the first argument, so it is passed in RDI.

By disassembling the code at the beginning of vfs_write(), we see that the routine saves several registers on the stack, whose contents could be examined with 'bt -f'::

   crash> dis vfs_write | less
   0xffffffff8114fe60 : sub $0x28,%rsp
   0xffffffff8114fe64 : mov %rbx,0x8(%rsp)
   0xffffffff8114fe69 : mov %rbp,0x10(%rsp)
   0xffffffff8114fe6e : mov $0xfffffffffffffff7,%rbx
   0xffffffff8114fe75 : mov %r12,0x18(%rsp)
   0xffffffff8114fe7a : mov %r13,0x20(%rsp)

Unfortunately, RDI is not among the registers saved on the stack; only RBX, RBP, R12, and R13 are saved.
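
The check described above (which registers does the prologue save, and is an argument register among them?) can be sketched in a few lines of Python. This is only an illustration of the manual step, not how fregs itself is implemented. The names ``PROLOGUE``, ``SAVE`` and ``saved_registers`` are hypothetical, the disassembly text is copied from the example above, and the pattern only recognizes ``mov %reg,offset(%rsp)`` stores; a prologue that uses ``push`` would need additional handling.

```python
import re

# Integer/pointer argument registers on x86_64, in argument order.
ARG_REGS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]

# First instructions of vfs_write(), copied from the 'dis' output above.
PROLOGUE = """\
0xffffffff8114fe60 : sub $0x28,%rsp
0xffffffff8114fe64 : mov %rbx,0x8(%rsp)
0xffffffff8114fe69 : mov %rbp,0x10(%rsp)
0xffffffff8114fe6e : mov $0xfffffffffffffff7,%rbx
0xffffffff8114fe75 : mov %r12,0x18(%rsp)
0xffffffff8114fe7a : mov %r13,0x20(%rsp)
"""

# Matches a register being stored at an offset from the stack pointer.
SAVE = re.compile(r"\bmov\s+%(\w+),(0x[0-9a-f]+)?\(%rsp\)$")


def saved_registers(disasm):
    """Return {register: stack offset} for registers stored relative to RSP."""
    saved = {}
    for line in disasm.strip().splitlines():
        m = SAVE.search(line.strip())
        if m:
            saved[m.group(1)] = int(m.group(2) or "0", 16)
    return saved


saved = saved_registers(PROLOGUE)
print(saved)
print([r for r in ARG_REGS if r in saved])
```

For this prologue the sketch reports RBX, RBP, R12 and R13 at offsets 0x8, 0x10, 0x18 and 0x20, and an empty list of saved argument registers, which confirms that RDI cannot be read directly from the vfs_write() stack frame.
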
However, examining the code in the calling routine up to the point where vfs_write() is called may provide additional clues. The address of the call is shown in the next frame of the stack trace::

   #6 [ffff8a00027c5f40] sys_write at ffffffff811500a3

   crash> dis -r 0xffffffff811500a3 | tail -10
   0xffffffff81150082 : mov %rax,%rbx
   0xffffffff81150085 : je 0xffffffff811500b7
   0xffffffff81150087 : mov 0x40(%rax),%rax
   0xffffffff8115008b : lea 0x8(%rsp),%rcx
   0xffffffff81150090 : mov %rbx,%rdi
   0xffffffff81150093 : mov %r12,%rdx
   0xffffffff81150096 : mov %r13,%rsi
   0xffffffff81150099 : mov %rax,0x8(%rsp)
   0xffffffff8115009e : call 0xffffffff8114fe60
   0xffffffff811500a3 : mov %rax,%rbp

The argument is loaded into RDI from register RBX by the instruction ``mov %rbx,%rdi``, and neither RDI nor RBX is modified again before vfs_write() is called. The value of RBX is then saved on the stack at the beginning of vfs_write(), so it can be found with 'bt -f' and is almost certainly the same value contained in RDI, i.e. the file pointer argument. (There is a slight chance that the value could be wrong; see Limitations below.) The file pointer can then be used with the 'files -d' command to determine the name of the file that is being written.

The fregs command automates much of this kind of manual analysis.
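
As a rough illustration of the manual step just described, the following Python sketch walks backward from the call instruction and records which register was copied into each argument register, together with its distance from the call in instructions. It is a simplified model and makes no claim about how fregs is actually implemented. The names ``CALLER_DISASM``, ``REG_MOVE`` and ``trace_arg_sources`` are hypothetical, the disassembly text is copied from the example above, only register-to-register ``mov`` instructions are considered, and the sketch does not check whether the source register is modified between the move and the call.

```python
import re

# Integer/pointer argument registers on x86_64, in argument order.
ARG_REGS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]

# Instructions in sys_write() leading up to the call, copied from above.
CALLER_DISASM = """\
0xffffffff81150082 : mov %rax,%rbx
0xffffffff81150085 : je 0xffffffff811500b7
0xffffffff81150087 : mov 0x40(%rax),%rax
0xffffffff8115008b : lea 0x8(%rsp),%rcx
0xffffffff81150090 : mov %rbx,%rdi
0xffffffff81150093 : mov %r12,%rdx
0xffffffff81150096 : mov %r13,%rsi
0xffffffff81150099 : mov %rax,0x8(%rsp)
0xffffffff8115009e : call 0xffffffff8114fe60
"""

# Matches a plain register-to-register move: mov %src,%dst
REG_MOVE = re.compile(r"\bmov\s+%(\w+),%(\w+)$")


def trace_arg_sources(disasm):
    """Return {arg register: (source register, instructions before call)}."""
    lines = [line.strip() for line in disasm.strip().splitlines()]
    if "call" not in lines[-1]:
        raise ValueError("last instruction must be the call")
    found = {}
    # Walk backward from the instruction just before the call.
    for distance, line in enumerate(reversed(lines[:-1]), start=1):
        m = REG_MOVE.search(line)
        # Keep only the move closest to the call for each argument register.
        if m and m.group(2) in ARG_REGS and m.group(2) not in found:
            found[m.group(2)] = (m.group(1), distance)
    return found


for reg, (src, distance) in sorted(trace_arg_sources(CALLER_DISASM).items()):
    print(f"{reg} <- {src}, {distance} instructions before the call")
```

For this call site the sketch finds RDI loaded from RBX four instructions before the call, RDX from R12 three instructions before, and RSI from R13 two instructions before. RBX, R12 and R13 are exactly the registers that vfs_write() saves on its stack, which is why their values can be recovered.
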
Instead of disassembling the code manually and searching for the correct value on the stack, we can run the command::

   crash> fregs
   PID: 84981 TASK: ffff8a000315c340 CPU: 4 COMMAND: grep
   #0 schedule called from 0xffffffff8144056f <__mutex_lock_slowpath+223>
      +RBP: 0xffff8b9f3f4aeaa8
   (snip)
   #5 vfs_write called from 0xffffffff811500a3
      +R12: 0x1000
      +R13: 0x7f66b19ad000
      +RBP: 0xfffffffffffffff7
      +RBX: 0xffff8a1c33e817c0
      4 RDI: 0xffff8a1c33e817c0
      3 RDX: 0x1000
      2 RSI: 0x7f66b19ad000
   #6 sys_write called from 0xffffffff81449392
      +R13: 0x1000
   #7 system_call_fastpath
      RIP: 00007f66b14e5bc0  RSP: 00007fff229de988  RFLAGS: 00010202
      RAX: 0000000000000001  RBX: ffffffff81449392  RCX: 0000000000000002
      RDX: 0000000000001000  RSI: 00007f66b19ad000  RDI: 0000000000000001
      RBP: 0000000000001000   R8: 676f6c2030323a37   R9: 0000000000000000
      R10: 0000000000000000  R11: 0000000000000246  R12: 00007f66b178b7a0
      R13: 00007f66b19ad000  R14: 0000000000001000  R15: 0000000000648000
      ORIG_RAX: 0000000000000001  CS: 0033  SS: 002b

   ** Execution took 0.03s (real) 0.04s (CPU)

Under vfs_write, the output shows the value 0xffff8a1c33e817c0 for both RDI and RBX. This is the value contained in those registers at the time of entry to the routine. (The plus sign ("+") or number before the register name indicates confidence in the value; this is discussed in the Limitations section.)

This value can then be used with 'files -d' to determine the file name from its dentry (directory entry)::

   crash> file.f_path.dentry 0xffff8a1c33e817c0
     f_path.dentry = 0xffff8896a9084780
   crash> files -d 0xffff8896a9084780
        DENTRY           INODE           SUPERBLK     TYPE PATH
   ffff8896a9084780 ffff8b9f3f4aea70 ffff891f428da800 REG  /opt/omni/scripts/log/arch_all.lst

However, fregs can go further than the basic output above.
By adding the '-a' option, fregs will identify each of the argument registers::

   crash> fregs -a
   PID: 84981 TASK: ffff8a000315c340 CPU: 4 COMMAND: grep
   (snip)
   #5 vfs_write called from 0xffffffff811500a3
      +R12: 0x1000
      +R13: 0x7f66b19ad000
      +RBP: 0xfffffffffffffff7
      +RBX: 0xffff8a1c33e817c0
      4 RDI: 0xffff8a1c33e817c0 arg0 struct file *
      3 RDX: 0x1000 arg2 size_t
      2 RSI: 0x7f66b19ad000 arg1 char *

Here RDI is identified as "arg0" (the first argument to the routine), whose type is a pointer to a file struct.

In some cases, fregs can go even further. By adding a second "-a", it will attempt to extract useful information from some common data types, of which a file struct is one::

   crash> fregs -aa
   PID: 84981 TASK: ffff8a000315c340 CPU: 4 COMMAND: grep
   (snip)
   #5 vfs_write called from 0xffffffff811500a3
      +R12: 0x1000
      +R13: 0x7f66b19ad000
      +RBP: 0xfffffffffffffff7
      +RBX: 0xffff8a1c33e817c0
      4 RDI: 0xffff8a1c33e817c0 arg0 struct file * = /opt/omni/scripts/log/arch_all.lst
      3 RDX: 0x1000 arg2 size_t
      2 RSI: 0x7f66b19ad000 arg1 char *

The RDI line includes the file name, so in this case fregs has been able to find the desired information in a single command. A list of the data types for which extra information is extracted is included in the discussion of the "-a" option below.

Limitations
-----------

While fregs is much faster than manual analysis, it is not a complete replacement due to some limitations. These include:

fregs is supported only on vmcores from x86_64 systems.

fregs examines only a limited number of instructions at the beginning of the called routine and prior to the point of its call. As such, it will miss useful register operations outside these boundaries that would be detected manually.

The number or plus sign at the beginning of each register line indicates the level of confidence in the value printed.
The plus sign indicates that the register contents were retrieved from the stack after they were saved there at the beginning of the called routine, as was the case with RBX in the above example. Such values are virtually certain to be correct, barring the unlikely possibility that something has improperly overwritten the stack. When the register line is preceded by a number, the number is the count of instructions prior to the call at which the value was determined. This could result in an incorrect value due to the flow of control in the calling routine; for example, an argument register could be loaded elsewhere in the routine, with control then jumping to the call. This becomes more likely as the distance between the register load and the call increases. As such, lower numbers indicate higher confidence.

If a routine is part of a DLKM (dynamically loadable kernel module), fregs will not be able to identify its arguments until the module's symbols and debugging information are loaded with the 'mod -s' command.

Show a specific process or command (pid|taskp|cmd)
--------------------------------------------------

By default, fregs will operate on the process in the current context. To display registers for a different process or command, you can specify a PID, task structure pointer, or command name.
For example::

   crash> fregs 85555
   PID: 85555 TASK: ffff8b0721a9c140 CPU: 30 COMMAND: su
   #0 schedule called from 0xffffffff8144056f <__mutex_lock_slowpath+223>
      +RBP: 0xffff8b1f21e03dd8
   (snip)

   crash> fregs ffff8b0721a9c140
   PID: 85555 TASK: ffff8b0721a9c140 CPU: 30 COMMAND: su
   #0 schedule called from 0xffffffff8144056f <__mutex_lock_slowpath+223>
      +RBP: 0xffff8b1f21e03dd8
   (snip)

When a command name is used, all processes with that command name are displayed::

   crash> fregs su
   PID: 85555 TASK: ffff8b0721a9c140 CPU: 30 COMMAND: su
   (snip)
   PID: 85558 TASK: ffff8a8e3d3d4580 CPU: 1 COMMAND: su
   (snip)

Identify arguments (-a, --args)
-------------------------------

The "-a" option will cause fregs to identify registers corresponding to routine arguments. It will also show the argument types if symbols and debugging information are available; for DLKMs, this information may need to be loaded with the 'mod -s' command::

   #5 vfs_write called from 0xffffffff811500a3
      +R12: 0x1000
      +R13: 0x7f66b19ad000
      +RBP: 0xfffffffffffffff7
      +RBX: 0xffff8a1c33e817c0
      4 RDI: 0xffff8a1c33e817c0 arg0 struct file *
      3 RDX: 0x1000 arg2 size_t
      2 RSI: 0x7f66b19ad000 arg1 char *

When a second "-a" is added, fregs will also attempt to extract useful information from some common data types, for example::

   #5 vfs_write called from 0xffffffff811500a3
      +R12: 0x1000
      +R13: 0x7f66b19ad000
      +RBP: 0xfffffffffffffff7
      +RBX: 0xffff8a1c33e817c0
      4 RDI: 0xffff8a1c33e817c0 arg0 struct file * = /opt/omni/scripts/log/arch_all.lst
      3 RDX: 0x1000 arg2 size_t
      2 RSI: 0x7f66b19ad000 arg1 char *

Information is currently extracted from the following data types. If it is not possible to extract the information, strings such as "N/A", "UNK", "", or a blank string may be displayed.

.. list-table::
   :widths: 50 50
   :header-rows: 1

   * - Data type
     - Information extracted
   * - struct dentry
     - file name
   * - struct file
     - file name
   * - struct path
     - file name
   * - struct nameidata
     - file name
   * - struct filename
     - file name
   * - struct qstr
     - file name
   * - struct vfsmount
     - mount point
   * - struct task_struct
     - PID
   * - struct device
     - device name
   * - struct scsi_device
     - SCSI device name and state
   * - struct scsi_target
     - SCSI target name and state
   * - struct Scsi_Host
     - SCSI host number
   * - struct bio
     - device name in major:minor format
   * - struct mutex
     - PID of owning task
   * - struct linux_binprm
     - file name

Suggestions for additional data types are welcome.

Show source code line numbers (-l, --lines)
-------------------------------------------

The "-l" option adds source code line numbers to the fregs output, similar to the 'bt -l' command in crash. Note: if a routine is part of a DLKM, its symbol and debugging information may need to be loaded with 'mod -s' first::

   crash> fregs -l
   PID: 84981 TASK: ffff8a000315c340 CPU: 4 COMMAND: grep
   #0 schedule called from 0xffffffff8144056f <__mutex_lock_slowpath+223>
      /usr/src/debug/kernel-default-3.0.34/linux-3.0/kernel/sched.c: 3344
      +RBP: 0xffff8b9f3f4aeaa8
   #1 __mutex_lock_slowpath called from 0xffffffff8143fffa
      /usr/src/debug/kernel-default-3.0.34/linux-3.0/include/linux/spinlock.h: 285
      +R14: 0x0
      +R15: 0xffff8a1c33e817c0
   #2 mutex_lock called from 0xffffffff810f188e
      /usr/src/debug/kernel-default-3.0.34/linux-3.0/arch/x86/include/asm/current.h: 14
      +RBP: 0xffff8a00027c5e08
      +RBX: 0xffff8b9f3f4aeaa8
      1 RDI: 0xffff8b9f3f4aeaa8
   (snip)

Example of a routine from a DLKM::

   #10 bond_get_stats called from 0xffffffffaff90ecb
      +R13: 0xff25c6bbfee18000
      +R14: 0xff25c6b9ffa6c800
      +R15: 0xff25c6baff5a7700

After loading the module's symbols and debugging information, the source code line number is displayed::

   crash> mod -s bonding
        MODULE       NAME      BASE         SIZE  OBJECT FILE
   ffffffffc10a0d40  bonding  ffffffffc107a000  196608  /data/martin/spinlock/usr/lib/debug/lib/modules/4.18.0-305.25.1.el8_4.x86_64/kernel/drivers/net/bonding/bonding.ko.debug

   #10 bond_get_stats called from 0xffffffffaff90ecb
      /usr/src/debug/kernel-4.18.0-305.25.1.el8_4/linux-4.18.0-305.25.1.el8_4.x86_64/./include/linux/string.h: 368
      +R13: 0xff25c6bbfee18000
      +R14: 0xff25c6b9ffa6c800
      +R15: 0xff25c6baff5a7700

Show only matching routines (-r, --routine)
-------------------------------------------

The full output of fregs can be quite long, especially with long stack traces, while usually only a few routines are of interest for analysis. The "-r" option limits the output to routines matching a specified name::

   crash> fregs -aa -r vfs_write
   PID: 84981 TASK: ffff8a000315c340 CPU: 4 COMMAND: grep
   #5 vfs_write called from 0xffffffff811500a3
      +R12: 0x1000
      +R13: 0x7f66b19ad000
      +RBP: 0xfffffffffffffff7
      +RBX: 0xffff8a1c33e817c0
      4 RDI: 0xffff8a1c33e817c0 arg0 struct file * = /opt/omni/scripts/log/arch_all.lst
      3 RDX: 0x1000 arg2 size_t
      2 RSI: 0x7f66b19ad000 arg1 char *

Partial matches are accepted and will return all matching routines.
For example, using "-r write" in the example stack returns information for several routines::

   crash> fregs -r write
   PID: 84981 TASK: ffff8a000315c340 CPU: 4 COMMAND: grep
   #3 generic_file_aio_write called from 0xffffffff8114f910
      +R12: 0xffff8a1c33e817c0
      +R13: 0xffff8a00027c5f50
      +R14: 0x0
      +R15: 0x649000
      5 RAX: 0xffffffffa00f9280
      +RBP: 0xffff8a00027c5ed8
      +RBX: 0x1000
      4 RDX: 0x1
      2 RSI: 0xffff8a00027c5ed8
   #4 do_sync_write called from 0xffffffff8114ff2e
      +R12: 0xffff8a00027c5f50
      +R13: 0x7f66b19ad000
      +RBP: 0xffff8a1c33e817c0
      +RBX: 0x1000
      3 RCX: 0xffff8a00027c5f50
      1 RDI: 0xffff8a1c33e817c0
      +RDX: 0x1000
      +RSI: 0x7f66b19ad000
   #5 vfs_write called from 0xffffffff811500a3
      +R12: 0x1000
      +R13: 0x7f66b19ad000
      +RBP: 0xfffffffffffffff7
      +RBX: 0xffff8a1c33e817c0
      4 RDI: 0xffff8a1c33e817c0
      3 RDX: 0x1000
      2 RSI: 0x7f66b19ad000
   #6 sys_write called from 0xffffffff81449392
      +R13: 0x1000

Show all uninterruptible tasks (-u, --unint)
--------------------------------------------

The "-u" option causes fregs to process all uninterruptible (UN) tasks, which can be helpful in analyzing forced crash dumps from hung systems. Depending on the number of UN tasks, this can take a long time and the output can be quite lengthy, so it's often a good idea to redirect the output to a file.

Show all tasks (-A, --all)
--------------------------

Similar to the "-u" option, "-A" causes fregs to process **all** tasks in the vmcore. This will take a long time and produce a very large output, so it is strongly recommended to redirect the output to a file.
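
Once "-u" or "-A" output has been redirected to a file, it can be post-processed with a small script. The following Python sketch is not part of fregs; it assumes only the register-line format shown in the examples in this document (a "+" or a number, the register name, a colon, and a hexadecimal value). The names ``REG_LINE``, ``parse_registers`` and ``SAMPLE`` are hypothetical, and the sample text is copied from the vfs_write frame above. It returns each register with its value and its distance from the call, using ``None`` for values recovered from the stack ("+").

```python
import re

# A confidence marker ('+' or a number), a register name, and a 0x value.
REG_LINE = re.compile(r"(?<!\w)(\+|\d+)\s*([A-Z][A-Z0-9]*):\s+(0x[0-9a-f]+)")

SAMPLE = """\
#5 vfs_write called from 0xffffffff811500a3
   +R12: 0x1000
   +R13: 0x7f66b19ad000
   +RBP: 0xfffffffffffffff7
   +RBX: 0xffff8a1c33e817c0
   4 RDI: 0xffff8a1c33e817c0 arg0 struct file *
   3 RDX: 0x1000 arg2 size_t
   2 RSI: 0x7f66b19ad000 arg1 char *
"""


def parse_registers(text):
    """Return a list of (register, value, distance) tuples.

    distance is None for '+' entries (value recovered from the stack),
    otherwise the number of instructions before the call.
    """
    regs = []
    for marker, name, value in REG_LINE.findall(text):
        distance = None if marker == "+" else int(marker)
        regs.append((name, int(value, 16), distance))
    return regs


for name, value, distance in parse_registers(SAMPLE):
    source = "stack" if distance is None else f"{distance} before call"
    print(f"{name:4} {value:#x} ({source})")
```

A script along these lines could, for instance, list only the values recovered from the stack, or flag registers whose distance from the call exceeds some threshold and therefore deserve manual verification.
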