Finding register contents and arguments at routine entry (fregs)

The fregs command attempts to determine the contents of CPU registers at the time of entry to kernel routines in a stack trace. It also attempts to identify arguments to those routines, and in some cases, decode information from the arguments. This is a helpful tool in vmcore analysis.

Options provided by 'fregs':

crash> fregs -h
usage: fregs [-h] [-V] [-a] [-l] [-r ROUTINE] [-u | -A] [pid|taskp|cmd]

Show register contents at routine entry.

positional arguments:
  pid|taskp|cmd         PID or task struct pointer or command (if omitted, use current context)

optional arguments:
  -h, --help            show this help message and exit
  -V, --version         show program's version number and exit
  -a, --args            identify arguments (-aa for more detail)
  -l, --lines           show source code line numbers
  -r ROUTINE, --routine ROUTINE
                        only show routines whose names include ROUTINE
  -u, --unint           do all uninterruptible tasks
  -A, --all             do ALL tasks (this may take a while!)

NOTE: fregs is supported on vmcores from x86_64 systems only.

Overview

Determining the arguments to kernel routines and the contents of CPU registers is a fundamental part of crash dump analysis. This is commonly done by disassembling kernel code, manually following the code flow, and examining stack contents, a technique that can be tedious and time-consuming. The fregs command automates much (though not all) of this work. It is not a complete replacement for manual analysis (see "Limitations" below), but it can save significant time and effort.

This example is from a forced crash of a hung system. There is a 'grep' process that has been stuck in uninterruptible mode for over 43 minutes:

crash> ps -m 84981
[  0 00:43:30.858] [UN]  PID: 84981    TASK: ffff8a000315c340  CPU: 4    COMMAND: "grep"

The stack trace shows that it is waiting for a mutex while trying to write to a file (do_sync_write() is a synchronous write that goes through the kernel's AIO write path):

#0 [ffff8a00027c5bc0] schedule at ffffffff8143f369
#1 [ffff8a00027c5d08] __mutex_lock_slowpath at ffffffff8144056f
#2 [ffff8a00027c5d78] mutex_lock at ffffffff8143fffa
#3 [ffff8a00027c5d90] generic_file_aio_write at ffffffff810f188e
#4 [ffff8a00027c5e00] do_sync_write at ffffffff8114f910
#5 [ffff8a00027c5f10] vfs_write at ffffffff8114ff2e
#6 [ffff8a00027c5f40] sys_write at ffffffff811500a3
#7 [ffff8a00027c5f80] system_call_fastpath at ffffffff81449392

It would be helpful to identify the file being written to. The vfs_write() routine is a good candidate for this because its first argument is a pointer to a file structure:

crash> whatis vfs_write
ssize_t vfs_write(struct file *, const char *, size_t, loff_t *);

On x86_64, the first six integer or pointer arguments are passed in registers RDI, RSI, RDX, RCX, R8, and R9. (Any further arguments are passed on the stack, but Linux kernel routines rarely take more than six.) In this case the file pointer is the first argument, so it is passed in RDI. Disassembling the beginning of vfs_write() shows that the routine saves several registers on the stack, whose contents could be examined with 'bt -f':

crash> dis vfs_write | less
0xffffffff8114fe60 <vfs_write>: sub    $0x28,%rsp
0xffffffff8114fe64 <vfs_write+4>:       mov    %rbx,0x8(%rsp)
0xffffffff8114fe69 <vfs_write+9>:       mov    %rbp,0x10(%rsp)
0xffffffff8114fe6e <vfs_write+14>:      mov    $0xfffffffffffffff7,%rbx
0xffffffff8114fe75 <vfs_write+21>:      mov    %r12,0x18(%rsp)
0xffffffff8114fe7a <vfs_write+26>:      mov    %r13,0x20(%rsp)
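The prologue above can also be read mechanically. The following Python sketch is only an illustration of that reading, not the way fregs itself is implemented; the names PROLOGUE, SAVE_RE, and saved_registers are made up for this example. It collects every register that is stored relative to RSP and then lists which argument registers were not saved:

```python
import re

# First instructions of vfs_write(), copied from the 'dis' output above.
PROLOGUE = """\
0xffffffff8114fe60 <vfs_write>: sub    $0x28,%rsp
0xffffffff8114fe64 <vfs_write+4>:       mov    %rbx,0x8(%rsp)
0xffffffff8114fe69 <vfs_write+9>:       mov    %rbp,0x10(%rsp)
0xffffffff8114fe6e <vfs_write+14>:      mov    $0xfffffffffffffff7,%rbx
0xffffffff8114fe75 <vfs_write+21>:      mov    %r12,0x18(%rsp)
0xffffffff8114fe7a <vfs_write+26>:      mov    %r13,0x20(%rsp)
"""

# Registers that carry the first six integer or pointer arguments on x86_64.
ARG_REGS = ["RDI", "RSI", "RDX", "RCX", "R8", "R9"]

# Matches a register store relative to the stack pointer: mov %reg,0xNN(%rsp)
SAVE_RE = re.compile(r"mov\s+%(\w+),(0x[0-9a-f]+)\(%rsp\)")


def saved_registers(disassembly):
    """Return {REGISTER: offset from RSP} for each register stored on the stack."""
    saved = {}
    for line in disassembly.splitlines():
        match = SAVE_RE.search(line)
        if match:
            saved[match.group(1).upper()] = int(match.group(2), 16)
    return saved


saved = saved_registers(PROLOGUE)
# Argument registers that the prologue did NOT save.
unsaved_args = [reg for reg in ARG_REGS if reg not in saved]
print(saved)
print(unsaved_args)
```

The sketch only recognizes the "mov %reg,offset(%rsp)" form seen in this prologue. Routines that save registers with push instructions would need additional handling.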

Unfortunately, RDI is not among the registers saved on the stack; only RBX, RBP, R12, and R13 are saved. However, examining the code in the calling routine up to the point where vfs_write() is called may provide additional clues. The address of the call is shown in the next frame of the stack trace:

#6 [ffff8a00027c5f40] sys_write at ffffffff811500a3

crash> dis -r 0xffffffff811500a3 | tail -10
0xffffffff81150082 <sys_write+50>:      mov    %rax,%rbx
0xffffffff81150085 <sys_write+53>:      je     0xffffffff811500b7 <sys_write+103>
0xffffffff81150087 <sys_write+55>:      mov    0x40(%rax),%rax
0xffffffff8115008b <sys_write+59>:      lea    0x8(%rsp),%rcx
0xffffffff81150090 <sys_write+64>:      mov    %rbx,%rdi
0xffffffff81150093 <sys_write+67>:      mov    %r12,%rdx
0xffffffff81150096 <sys_write+70>:      mov    %r13,%rsi
0xffffffff81150099 <sys_write+73>:      mov    %rax,0x8(%rsp)
0xffffffff8115009e <sys_write+78>:      call   0xffffffff8114fe60 <vfs_write>
0xffffffff811500a3 <sys_write+83>:      mov    %rax,%rbp

The argument is loaded into RDI from register RBX by the instruction mov %rbx,%rdi, and neither RDI nor RBX is modified again before vfs_write() is called. The value of RBX is then saved on the stack at the beginning of vfs_write(), so it can be found with 'bt -f' and is almost certainly the same value contained in RDI, i.e. the file pointer argument. (There is a slight chance that the value could be wrong; see Limitations below.) The file pointer can then be used with the 'files -d' command to determine the name of the file that is being written.
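The backward walk through the caller described above can be sketched in Python. This is a simplified illustration under stated assumptions, not the fregs implementation: it handles only plain register-to-register moves, it does not check whether the source register is modified between the load and the call, and the names CALLER, REG_MOVE_RE, and argument_sources are hypothetical.

```python
import re

# Instructions in sys_write() around the call, from the 'dis -r' output above.
CALLER = """\
0xffffffff81150082 <sys_write+50>:      mov    %rax,%rbx
0xffffffff81150085 <sys_write+53>:      je     0xffffffff811500b7 <sys_write+103>
0xffffffff81150087 <sys_write+55>:      mov    0x40(%rax),%rax
0xffffffff8115008b <sys_write+59>:      lea    0x8(%rsp),%rcx
0xffffffff81150090 <sys_write+64>:      mov    %rbx,%rdi
0xffffffff81150093 <sys_write+67>:      mov    %r12,%rdx
0xffffffff81150096 <sys_write+70>:      mov    %r13,%rsi
0xffffffff81150099 <sys_write+73>:      mov    %rax,0x8(%rsp)
0xffffffff8115009e <sys_write+78>:      call   0xffffffff8114fe60 <vfs_write>
0xffffffff811500a3 <sys_write+83>:      mov    %rax,%rbp
"""

ARG_REGS = {"rdi", "rsi", "rdx", "rcx", "r8", "r9"}

# Only plain register-to-register moves are recognized: mov %src,%dst
REG_MOVE_RE = re.compile(r"\smov\s+%(\w+),%(\w+)\s*$")


def argument_sources(disassembly):
    """Map each argument register to (instructions before the call, source register)."""
    lines = disassembly.splitlines()
    call_index = next(i for i, line in enumerate(lines) if re.search(r"\scall\s", line))
    sources = {}
    # Walk backwards, starting with the instruction just before the call.
    for distance, line in enumerate(reversed(lines[:call_index]), start=1):
        match = REG_MOVE_RE.search(line)
        if not match:
            continue
        src, dst = match.groups()
        # Keep only the load closest to the call for each argument register.
        if dst in ARG_REGS and dst.upper() not in sources:
            sources[dst.upper()] = (distance, src.upper())
    return sources


sources = argument_sources(CALLER)
print(sources)
```

For this caller the sketch reports that RDI was loaded from RBX four instructions before the call, RDX from R12 three instructions before, and RSI from R13 two instructions before. RCX is skipped because it is set by a lea instruction rather than a register move.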

The fregs command automates much of this kind of manual analysis. Instead of disassembling the code manually and searching for the correct value on the stack, we can run the command:

crash> fregs

PID: 84981  TASK: ffff8a000315c340  CPU: 4  COMMAND: grep

#0 schedule called from 0xffffffff8144056f <__mutex_lock_slowpath+223>
 +RBP: 0xffff8b9f3f4aeaa8

(snip)

#5 vfs_write called from 0xffffffff811500a3 <sys_write+83>
 +R12: 0x1000
 +R13: 0x7f66b19ad000
 +RBP: 0xfffffffffffffff7
 +RBX: 0xffff8a1c33e817c0
4 RDI: 0xffff8a1c33e817c0
3 RDX: 0x1000
2 RSI: 0x7f66b19ad000

#6 sys_write called from 0xffffffff81449392 <system_call_fastpath+22>
 +R13: 0x1000

#7 system_call_fastpath
    RIP: 00007f66b14e5bc0  RSP: 00007fff229de988  RFLAGS: 00010202
    RAX: 0000000000000001  RBX: ffffffff81449392  RCX: 0000000000000002
    RDX: 0000000000001000  RSI: 00007f66b19ad000  RDI: 0000000000000001
    RBP: 0000000000001000   R8: 676f6c2030323a37   R9: 0000000000000000
    R10: 0000000000000000  R11: 0000000000000246  R12: 00007f66b178b7a0
    R13: 00007f66b19ad000  R14: 0000000000001000  R15: 0000000000648000
    ORIG_RAX: 0000000000000001  CS: 0033  SS: 002b

 ** Execution took   0.03s (real)   0.04s (CPU)

Under vfs_write, the output shows the value 0xffff8a1c33e817c0 for both RDI and RBX. This is the value contained in those registers at the time of entry to the routine. (The plus sign ("+") or number before the register name indicates confidence in the value; this is discussed in the Limitations section.) This value can then be used with 'files -d' to determine the file name from its dentry (directory entry):

crash> file.f_path.dentry 0xffff8a1c33e817c0
  f_path.dentry = 0xffff8896a9084780
crash> files -d 0xffff8896a9084780
     DENTRY           INODE           SUPERBLK     TYPE PATH
ffff8896a9084780 ffff8b9f3f4aea70 ffff891f428da800 REG  /opt/omni/scripts/log/arch_all.lst

However, fregs can go further than the basic output above. By adding the '-a' option, it will identify each of the argument registers:

crash> fregs -a

PID: 84981  TASK: ffff8a000315c340  CPU: 4  COMMAND: grep

(snip)

#5 vfs_write called from 0xffffffff811500a3 <sys_write+83>
 +R12: 0x1000
 +R13: 0x7f66b19ad000
 +RBP: 0xfffffffffffffff7
 +RBX: 0xffff8a1c33e817c0
4 RDI: 0xffff8a1c33e817c0 arg0 struct file *
3 RDX: 0x1000 arg2 size_t
2 RSI: 0x7f66b19ad000 arg1 char *

Here RDI is identified as "arg0" (the first argument to the routine) whose type is a pointer to a file struct.
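The pairing of argument registers with prototype positions can be illustrated with a short Python sketch. It is a minimal example that assumes a simple prototype like the one printed by 'whatis vfs_write'; it is not how fregs obtains type information, and label_arguments is a hypothetical helper name.

```python
# Prototype as printed by 'whatis vfs_write' earlier in this section.
PROTOTYPE = "ssize_t vfs_write(struct file *, const char *, size_t, loff_t *);"

# Registers for the first six integer or pointer arguments on x86_64, in order.
ARG_REGS = ["RDI", "RSI", "RDX", "RCX", "R8", "R9"]


def label_arguments(prototype):
    """Return (register, 'argN', type) for each register-passed argument."""
    # Take the text between the outermost parentheses and split on commas.
    # This is enough for simple prototypes; function-pointer arguments, which
    # contain nested parentheses and commas, are NOT handled.
    inner = prototype[prototype.index("(") + 1 : prototype.rindex(")")]
    types = [t.strip() for t in inner.split(",")]
    return [(reg, f"arg{i}", ctype)
            for i, (reg, ctype) in enumerate(zip(ARG_REGS, types))]


labels = label_arguments(PROTOTYPE)
for reg, name, ctype in labels:
    print(f"{reg}: {name} {ctype}")
```

Arguments are numbered from zero, so arg0 is the first argument (RDI) and arg3 would be the loff_t pointer in RCX.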

In some cases, fregs can go even further. By adding a second "-a", it will attempt to extract useful information from some common data types, of which a file struct is one:

crash> fregs -aa

PID: 84981  TASK: ffff8a000315c340  CPU: 4  COMMAND: grep

(snip)

#5 vfs_write called from 0xffffffff811500a3 <sys_write+83>
 +R12: 0x1000
 +R13: 0x7f66b19ad000
 +RBP: 0xfffffffffffffff7
 +RBX: 0xffff8a1c33e817c0
4 RDI: 0xffff8a1c33e817c0 arg0 struct file * = /opt/omni/scripts/log/arch_all.lst
3 RDX: 0x1000 arg2 size_t
2 RSI: 0x7f66b19ad000 arg1 char *

The RDI line includes the file name, so in this case fregs has been able to find the desired information in a single command. A list of the data types for which extra information is extracted is included in the discussion of the "-a" option below.

Limitations

While fregs is much faster than manual analysis, it is not a complete replacement due to some limitations. These include:

fregs is supported only on vmcores from x86_64 systems.

fregs examines only a limited number of instructions at the beginning of the called routine and before the point of its call. As such, it will miss useful register operations outside these boundaries that manual analysis would find.

The number or plus sign at the beginning of each register line indicates the level of confidence in the value printed. The plus sign indicates that the register contents were retrieved from the stack, where they were saved at the beginning of the called routine, as was the case with RBX in the above example. Such values are virtually certain, barring the unlikely possibility that something has improperly overwritten the stack.

When the register line is preceded by a number, the number is the count of instructions before the call at which the value was determined. Such a value can be incorrect because of the flow of control in the calling routine; for example, the argument register could be loaded elsewhere in the routine, followed by a jump to the call. This becomes more likely as the distance between the register load and the call increases, so lower numbers indicate higher confidence.

If a routine is part of a DLKM (dynamically loadable kernel module), fregs will not be able to identify its arguments until the module's symbols and debugging information are loaded with the 'mod -s' command.
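When post-processing saved fregs output, the confidence markers described above can be separated from the register values. The Python sketch below is one possible way to do this; the line format is taken from the examples in this document, and the threshold of 3 instructions is an arbitrary assumption chosen for illustration, not a value recommended by fregs.

```python
import re

# Register lines copied from the vfs_write frame in the example output.
FRAME = """\
 +R12: 0x1000
 +R13: 0x7f66b19ad000
 +RBP: 0xfffffffffffffff7
 +RBX: 0xffff8a1c33e817c0
4 RDI: 0xffff8a1c33e817c0
3 RDX: 0x1000
2 RSI: 0x7f66b19ad000
"""

# A leading "+" or a decimal distance, then the register name and a hex value.
LINE_RE = re.compile(r"^\s*(\+|\d+)\s*(\w+):\s+(0x[0-9a-f]+)")


def parse_register_lines(text):
    """Return a list of (register, value, distance); distance is None for '+' lines."""
    parsed = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            marker, reg, value = match.groups()
            distance = None if marker == "+" else int(marker)
            parsed.append((reg, int(value, 16), distance))
    return parsed


def needs_manual_check(parsed, max_distance=3):
    """Registers whose value was inferred more than max_distance instructions before the call."""
    return [reg for reg, _, dist in parsed if dist is not None and dist > max_distance]


registers = parse_register_lines(FRAME)
print(needs_manual_check(registers))
```

With this threshold only RDI is flagged, since its value was determined four instructions before the call. Values marked with a plus sign are never flagged because they were read from the stack.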

Show a specific process or command (pid|taskp|cmd)

By default, fregs operates on the process in the current context. To display registers for a different process, specify a PID, a task struct pointer, or a command name. For example:

crash> fregs 85555

PID: 85555  TASK: ffff8b0721a9c140  CPU: 30  COMMAND: su

#0 schedule called from 0xffffffff8144056f <__mutex_lock_slowpath+223>
 +RBP: 0xffff8b1f21e03dd8

(snip)


crash> fregs ffff8b0721a9c140

PID: 85555  TASK: ffff8b0721a9c140  CPU: 30  COMMAND: su

#0 schedule called from 0xffffffff8144056f <__mutex_lock_slowpath+223>
 +RBP: 0xffff8b1f21e03dd8

(snip)

When a command name is used, all processes with that command name are displayed:

crash> fregs su
PID: 85555  TASK: ffff8b0721a9c140  CPU: 30  COMMAND: su

(snip)

PID: 85558  TASK: ffff8a8e3d3d4580  CPU: 1  COMMAND: su

(snip)

Identify arguments (-a, --args)

The "-a" option will cause fregs to identify registers corresponding to routine arguments. It will also show the argument types if symbols and debugging information are available; for DLKMs, this information may need to be loaded with the 'mod -s' command:

#5 vfs_write called from 0xffffffff811500a3 <sys_write+83>
 +R12: 0x1000
 +R13: 0x7f66b19ad000
 +RBP: 0xfffffffffffffff7
 +RBX: 0xffff8a1c33e817c0
4 RDI: 0xffff8a1c33e817c0 arg0 struct file *
3 RDX: 0x1000 arg2 size_t
2 RSI: 0x7f66b19ad000 arg1 char *

When a second "-a" is added, fregs will also attempt to extract useful information from some common data types, for example:

#5 vfs_write called from 0xffffffff811500a3 <sys_write+83>
 +R12: 0x1000
 +R13: 0x7f66b19ad000
 +RBP: 0xfffffffffffffff7
 +RBX: 0xffff8a1c33e817c0
4 RDI: 0xffff8a1c33e817c0 arg0 struct file * = /opt/omni/scripts/log/arch_all.lst
3 RDX: 0x1000 arg2 size_t
2 RSI: 0x7f66b19ad000 arg1 char *

Information is currently extracted from the following data types. If it is not possible to extract the information, strings such as "N/A", "UNK", "<invalid>", or a blank string may be displayed.

Data type              Information extracted
struct dentry          file name
struct file            file name
struct path            file name
struct nameidata       file name
struct filename        file name
struct qstr            file name
struct vfsmount        mount point
struct task_struct     PID
struct device          device name
struct scsi_device     SCSI device name and state
struct scsi_target     SCSI target name and state
struct Scsi_Host       SCSI host number
struct bio             device name in major:minor format
struct mutex           PID of owning task
struct linux_binprm    file name

Suggestions for additional data types are welcome.

Show source code line numbers (-l, --lines)

The "-l" option adds source code line numbers to the fregs output, similar to the 'bt -l' command in crash. Note: if a routine is part of a DLKM, its symbols and debugging information may need to be loaded with 'mod -s' first:

crash> fregs -l

PID: 84981  TASK: ffff8a000315c340  CPU: 4  COMMAND: grep

#0 schedule called from 0xffffffff8144056f <__mutex_lock_slowpath+223>
    /usr/src/debug/kernel-default-3.0.34/linux-3.0/kernel/sched.c: 3344
 +RBP: 0xffff8b9f3f4aeaa8

#1 __mutex_lock_slowpath called from 0xffffffff8143fffa <mutex_lock+26>
    /usr/src/debug/kernel-default-3.0.34/linux-3.0/include/linux/spinlock.h: 285
 +R14: 0x0
 +R15: 0xffff8a1c33e817c0

#2 mutex_lock called from 0xffffffff810f188e <generic_file_aio_write+62>
    /usr/src/debug/kernel-default-3.0.34/linux-3.0/arch/x86/include/asm/current.h: 14
 +RBP: 0xffff8a00027c5e08
 +RBX: 0xffff8b9f3f4aeaa8
1 RDI: 0xffff8b9f3f4aeaa8

 (snip)

Example of a routine from a DLKM:

#10 bond_get_stats called from 0xffffffffaff90ecb <dev_get_stats+91>
    <No line numbers; possibly you need to load a module>
 +R13: 0xff25c6bbfee18000
 +R14: 0xff25c6b9ffa6c800
 +R15: 0xff25c6baff5a7700

After loading the module's symbols and debugging information, the source code line number is displayed:

crash> mod -s bonding
     MODULE       NAME                        BASE           SIZE  OBJECT FILE
ffffffffc10a0d40  bonding               ffffffffc107a000   196608  /data/martin/spinlock/usr/lib/debug/lib/modules/4.18.0-305.25.1.el8_4.x86_64/kernel/drivers/net/bonding/bonding.ko.debug

#10 bond_get_stats called from 0xffffffffaff90ecb <dev_get_stats+91>
    /usr/src/debug/kernel-4.18.0-305.25.1.el8_4/linux-4.18.0-305.25.1.el8_4.x86_64/./include/linux/string.h: 368
 +R13: 0xff25c6bbfee18000
 +R14: 0xff25c6b9ffa6c800
 +R15: 0xff25c6baff5a7700

Show only matching routines (-r, --routine)

The full output of fregs can be quite long, especially with long stack traces, while usually only a few routines are of interest for analysis. The "-r" option limits the output to routines matching a specified name:

crash> fregs -aa -r vfs_write

PID: 84981  TASK: ffff8a000315c340  CPU: 4  COMMAND: grep

#5 vfs_write called from 0xffffffff811500a3 <sys_write+83>
 +R12: 0x1000
 +R13: 0x7f66b19ad000
 +RBP: 0xfffffffffffffff7
 +RBX: 0xffff8a1c33e817c0
4 RDI: 0xffff8a1c33e817c0 arg0 struct file * = /opt/omni/scripts/log/arch_all.lst
3 RDX: 0x1000 arg2 size_t
2 RSI: 0x7f66b19ad000 arg1 char *

Partial matches are accepted and will return all matching routines. For example, using "-r write" in the example stack returns information for several routines:

crash> fregs -r write

PID: 84981  TASK: ffff8a000315c340  CPU: 4  COMMAND: grep

#3 generic_file_aio_write called from 0xffffffff8114f910 <do_sync_write+192>
 +R12: 0xffff8a1c33e817c0
 +R13: 0xffff8a00027c5f50
 +R14: 0x0
 +R15: 0x649000
5 RAX: 0xffffffffa00f9280
 +RBP: 0xffff8a00027c5ed8
 +RBX: 0x1000
4 RDX: 0x1
2 RSI: 0xffff8a00027c5ed8

#4 do_sync_write called from 0xffffffff8114ff2e <vfs_write+206>
 +R12: 0xffff8a00027c5f50
 +R13: 0x7f66b19ad000
 +RBP: 0xffff8a1c33e817c0
 +RBX: 0x1000
3 RCX: 0xffff8a00027c5f50
1 RDI: 0xffff8a1c33e817c0
 +RDX: 0x1000
 +RSI: 0x7f66b19ad000

#5 vfs_write called from 0xffffffff811500a3 <sys_write+83>
 +R12: 0x1000
 +R13: 0x7f66b19ad000
 +RBP: 0xfffffffffffffff7
 +RBX: 0xffff8a1c33e817c0
4 RDI: 0xffff8a1c33e817c0
3 RDX: 0x1000
2 RSI: 0x7f66b19ad000

#6 sys_write called from 0xffffffff81449392 <system_call_fastpath+22>
 +R13: 0x1000
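The partial matching shown above behaves like a substring test on the routine name. The Python sketch below illustrates that idea against the example stack trace. It assumes a plain, case-sensitive substring match, which is consistent with the output above but is not confirmed to be exactly what fregs does; matching_frames is a hypothetical name.

```python
# Routine names from the example stack trace, innermost frame (#0) first.
STACK = [
    "schedule",
    "__mutex_lock_slowpath",
    "mutex_lock",
    "generic_file_aio_write",
    "do_sync_write",
    "vfs_write",
    "sys_write",
    "system_call_fastpath",
]


def matching_frames(stack, pattern):
    """Return (frame number, routine) pairs whose name contains pattern."""
    return [(num, name) for num, name in enumerate(stack) if pattern in name]


frames = matching_frames(STACK, "write")
print(frames)
```

For the pattern "write" this selects frames #3 through #6, the same frames that appear in the 'fregs -r write' output above.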

Show all uninterruptible tasks (-u, --unint)

The "-u" option causes fregs to process all uninterruptible (UN) tasks, which can be helpful in analyzing forced crash dumps from hung systems. Depending on the number of UN tasks, this can take a long time and the output can be quite lengthy, so it's often a good idea to redirect the output to a file.

Show all tasks (-A, --all)

Similar to the "-u" option, "-A" causes fregs to process all tasks in the vmcore. This will take a long time and produce very large output, so it is strongly recommended to redirect the output to a file.
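Once the output of "-u" or "-A" has been redirected to a file, it can be split into per-task sections for easier searching. The Python sketch below shows one way to do that, assuming each task starts with a "PID: ... TASK: ... CPU: ... COMMAND: ..." header line as in the examples in this document. The sample text is embedded so the example is self-contained, and split_by_task is a hypothetical helper, not part of fregs.

```python
import re

# Task header line as printed by fregs, for example:
# "PID: 84981  TASK: ffff8a000315c340  CPU: 4  COMMAND: grep"
HEADER_RE = re.compile(
    r"^PID:\s+(\d+)\s+TASK:\s+([0-9a-f]+)\s+CPU:\s+(\d+)\s+COMMAND:\s+(.*)$"
)


def split_by_task(text):
    """Return {pid: (command, [output lines for that task])}."""
    tasks = {}
    current = None
    for line in text.splitlines():
        match = HEADER_RE.match(line)
        if match:
            pid, command = int(match.group(1)), match.group(4).strip()
            # Start collecting lines for this task.
            current = tasks.setdefault(pid, (command, []))[1]
        elif current is not None:
            current.append(line)
    return tasks


# Shortened sample in the format shown earlier in this document.
SAMPLE = """\
PID: 85555  TASK: ffff8b0721a9c140  CPU: 30  COMMAND: su

#0 schedule called from 0xffffffff8144056f <__mutex_lock_slowpath+223>
 +RBP: 0xffff8b1f21e03dd8

PID: 85558  TASK: ffff8a8e3d3d4580  CPU: 1  COMMAND: su
"""

tasks = split_by_task(SAMPLE)
print(sorted(tasks))
```

In practice the text would be read from the redirected output file instead of the embedded sample.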