crash C-module


This module implements Python bindings to crash and GDB internal commands and structures. Some of these subroutines are intended for the framework itself. Those of general interest are available after import pykdump.API, so in most cases you do not need to import crash in your own programs.

Basic Info about struct/union/enum

crash.struct_size(structname)
Parameters:

structname -- a string containing the struct name, e.g. struct task_struct. While crash lets you specify simplified names without struct, this subroutine needs proper C syntax.

Returns:

size as an integer, or -1 if there is no symbolic info for this struct

crash.union_size(unionname)

Similar to struct_size() but for unions instead of structs

crash.member_offset(sname, smember)
Parameters:
  • sname -- a string containing the struct name

  • smember -- a string containing the member name

Returns:

offset as an integer, or -1 if there is no such member

crash.member_size(sname, smember)
Parameters:
  • sname -- a string containing the struct name

  • smember -- a string containing the member name

Returns:

size as an integer, or -1 if there is no such member
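
The -1 return convention is easy to overlook, so a caller may want to check it explicitly. The sketch below is a hypothetical helper (container_of is not part of the crash module) that computes the address of an enclosing struct from the address of an embedded member. It takes the offset lookup as an argument so that it can be tried outside crash; fake_member_offset and its offset value are invented for illustration.

```python
def container_of(member_offset, member_addr, sname, smember):
    """Return the address of the struct that embeds the member at member_addr.

    member_offset is expected to behave like crash.member_offset:
    it returns an integer offset, or -1 if there is no such member.
    """
    offset = member_offset(sname, smember)
    if offset == -1:
        raise LookupError("no member %r in %r" % (smember, sname))
    return member_addr - offset


# Stand-in for crash.member_offset with a made-up offset (illustration only).
def fake_member_offset(sname, smember):
    known = {("struct task_struct", "tasks"): 0x398}
    return known.get((sname, smember), -1)


base = container_of(fake_member_offset, 0xffff880012340398,
                    "struct task_struct", "tasks")
print(hex(base))
```

Inside crash, the first argument would be crash.member_offset itself.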

crash.enumerator_value(ename)

Interface to crash internal subroutine int enumerator_value(char *e, long *value)

Parameters:

ename -- a string containing the enum name

Returns:

an int with numeric value

Example: WORK_CPU_NONE = enumerator_value("WORK_CPU_NONE")

Symbol/Address Subroutines

crash.symbol_exists(symname)

Tests whether symbol symname exists in this kernel (as listed by crash builtin sym command)

Returns a value that evaluates to True if the symbol exists and False if it does not.

crash.sym2addr(symbolname)
Parameters:

symbolname -- a string containing the symbol name

Returns:

address as an integer. 0 means that there is no such symbol. If there are multiple variables with this name (e.g. in different DLKMs), address of the first one is returned.

crash.sym2alladdr(symbolname)

Similar to sym2addr() but returns a list of addresses. If there are no matches at all, returns an empty list. If there is one match only, returns a list of one element.

crash.addr2sym(addr, loose_match=False)

Tries to find a symbol matching the given address. By default, it tries to find an exact match and if found, returns a string. If no exact match is found, returns None.

With loose_match=True, it looks for an approximate match and returns a tuple (name, offset).

Example: there is a symbol tcp_shutdown with address 0xffffffff8147e580:

print(crash.addr2sym(0xffffffff8147e581, True))

('tcp_shutdown', 1)

When there is no match for loose matching we return a tuple of (None, None).
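
Because loose matching returns (None, None) rather than None, callers can always unpack the result. A minimal sketch of turning that tuple into a printable label follows; format_addr is a hypothetical helper, and fake_addr2sym (including its 0x100 symbol size and its offset of 0 for an exact hit) is an invented stand-in so the code can be tried outside crash.

```python
def format_addr(addr2sym, addr):
    """Render addr as 'symbol+0xoff', or as plain hex if nothing matches."""
    name, offset = addr2sym(addr, True)
    if name is None:
        return hex(addr)
    if offset == 0:
        return name
    return "%s+0x%x" % (name, offset)


# Stand-in for crash.addr2sym(addr, True) that knows a single symbol.
# The symbol size of 0x100 is made up for this example.
def fake_addr2sym(addr, loose_match=False):
    start, size = 0xffffffff8147e580, 0x100
    if start <= addr < start + size:
        return ("tcp_shutdown", addr - start)
    return (None, None)


print(format_addr(fake_addr2sym, 0xffffffff8147e581))
```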

crash.addr2mod(addr)
Parameters:

addr -- address as an integer

Returns:

a string containing the module name where this address belongs, or None

Reading Memory

There are different types of memory, e.g. KVADDR. Some of the following subroutines let you specify the memory type as an extra argument and some rely on default.

crash.mem2long(bytestr, signed, array)

This is a Swiss-army-knife subroutine that converts a byte string into an integer or a list of integers. In C, we have integers of different sizes, signed and unsigned, and arrays of integers (this subroutine handles 1-dimensional arrays only). After we read a chunk of memory, it is represented by a byte string, and this subroutine converts it as specified by its arguments. We assume the byte string consists of int values for this architecture. As such, you cannot use this subroutine for objects such as short int a[10], only for C objects like int a[10] or signed int a[10].

Parameters:
  • bytestr -- a byte-string with data

  • signed -- True/False to specify whether integers are signed or not, unsigned by default

  • array -- if specified, we will return a list of array integers instead of one value
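
To make the conversion concrete, here is a pure-Python illustration of the same idea. It is not the module's implementation: mem2long_sketch is a hypothetical name, and the 4-byte int size and little-endian byte order are assumptions that hold on x86_64 but not on every architecture.

```python
import struct


def mem2long_sketch(bytestr, signed=False, array=None,
                    int_size=4, byteorder="little"):
    """Convert a byte string to one integer, or to a list of `array` integers."""
    if array is None:
        # Single value: interpret the whole byte string as one integer.
        return int.from_bytes(bytestr, byteorder, signed=signed)
    if len(bytestr) < array * int_size:
        raise ValueError("byte string too short for %d integers" % array)
    values = []
    for i in range(array):
        chunk = bytestr[i * int_size:(i + 1) * int_size]
        values.append(int.from_bytes(chunk, byteorder, signed=signed))
    return values


raw = struct.pack("<ii", 1, -2)      # what reading int a[2] might yield
print(mem2long_sketch(raw, signed=True, array=2))
```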

crash.readmem(addr, size[, mtype])

Interface to crash builtin readmem().

Parameters:
  • addr -- address to read from

  • size -- how many bytes to read

  • mtype -- memory type to read, by default KVADDR

Returns:

a byte-string with data
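
Since readmem returns raw bytes, the caller decides how to decode them. The sketch below decodes a struct list_head with the standard struct module; read_list_head is a hypothetical helper, fake_readmem stands in for crash.readmem, and the 16-byte layout of two little-endian 64-bit pointers is an assumption that matches x86_64.

```python
import struct


def read_list_head(readmem, addr):
    """Read a struct list_head at addr and return its (next, prev) pointers."""
    raw = readmem(addr, 16)           # two 8-byte pointers (assumed layout)
    return struct.unpack("<QQ", raw)  # little-endian unsigned 64-bit values


# Stand-in for crash.readmem backed by a tiny fake memory image.
fake_memory = {0x1000: struct.pack("<QQ", 0x2000, 0x3000)}


def fake_readmem(addr, size):
    return fake_memory[addr][:size]


nxt, prev = read_list_head(fake_readmem, 0x1000)
print(hex(nxt), hex(prev))
```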

crash.readPtr(addr[, mtype])

Assuming that addr contains a pointer, read pointer value.

Parameters:
  • addr -- address

  • mtype -- memory type, by default KVADDR

crash.readInt(addr, size[, signedvar[, mtype]])

Given an address, read an integer of given size

Parameters:
  • addr -- address to read from

  • size -- integer size, according to C char/short/int/long/longlong specification for this architecture

  • signedvar -- False for unsigned, True for signed. If not specified, we assume unsigned.

  • mtype -- memory type, by default KVADDR

crash.set_readmem_task(taskaddr)
Parameters:

taskaddr -- task address or zero

  • if taskaddr=0, reset readmem operations to use KVADDR

  • if taskaddr is a valid task address, set readmem operations to UVADDR and set the current context to this task

Returns:

nothing
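
Because set_readmem_task changes global state, it is easy to forget the reset to KVADDR after an error. One possible pattern is a context manager that always resets; user_memory_of is a hypothetical wrapper, not part of the module, and the list used below merely records the calls that crash.set_readmem_task would receive.

```python
from contextlib import contextmanager


@contextmanager
def user_memory_of(set_readmem_task, taskaddr):
    """Switch reads to the given task's user memory, then reset to kernel memory."""
    set_readmem_task(taskaddr)
    try:
        yield
    finally:
        # Passing 0 resets readmem operations, as described above.
        set_readmem_task(0)


# Record the calls instead of touching a real crash session.
calls = []
with user_memory_of(calls.append, 0xffff880012340000):
    calls.append("reading user memory")
print(calls)
```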

Conversion between Memory Types

crash.uvtop(taskaddr, vaddr)

Interface to crash builtin uvtop(tskaddr, vaddr) - converts a virtual address to physical address in the context of specified task

Parameters:
  • taskaddr -- address of struct task_struct

  • vaddr -- virtual address

Returns:

physical address as an integer

crash.phys_to_page(physaddr)

Interface to crash builtin phys_to_page(physaddr_t phys, ulong *pp)

Parameters:

physaddr -- physical address

Returns:

page as an integer

crash.PAGEOFFSET(vaddr)

Interface to crash builtin PAGEOFFSET(vaddr)

Miscellaneous

crash.getListSize(addr, offset[, maxel = 1000])

Assuming that addr points to a list head, find the total number of elements. The same can be done in Python easily, but C is faster for big lists.

Parameters:
  • addr -- address of a structure representing a list head

  • offset -- offset of next pointer in this structure

  • maxel -- maximum number of elements to search for, i.e. we stop iteration if we reach this limit

Returns:

number of list elements found (not counting the list head itself)
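
For comparison, a pure-Python walk of the same kind might look as follows. This is a sketch under assumptions, not a description of the C code: list_size is a hypothetical name, read_ptr stands in for a pointer reader such as crash.readPtr, and stopping on a NULL pointer as well as on return to the head is a guess.

```python
def list_size(read_ptr, head, offset, maxel=1000):
    """Count elements reachable from head, not counting the head itself."""
    count = 0
    node = read_ptr(head + offset)
    # Stop when we come back to the head, hit NULL, or reach the limit.
    while node != head and node != 0 and count < maxel:
        count += 1
        node = read_ptr(node + offset)
    return count


# Fake memory: a circular list with a head at 0x1000 and two elements.
memory = {0x1000: 0x2000, 0x2000: 0x3000, 0x3000: 0x1000}
print(list_size(memory.__getitem__, 0x1000, 0))
```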

crash.getFullBuckets(start, bsize, items, chain_off)

Find full buckets in hash-tables. If we have hash tables consisting of many buckets (>100,000) but just a few of them are non-empty, this subroutine is significantly faster than trying to do the same in pure Python. Useful for networking tables.

Parameters:
  • start -- address of the hash-table

  • bsize -- hash-bucket size

  • items -- how many buckets (hash-size)

  • chain_off -- chain offset

Returns:

a list of addresses of full buckets
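
The slow pure-Python equivalent that this subroutine replaces could be sketched as below. The details are assumptions: full_buckets is a hypothetical name, a bucket is treated as full when the pointer at chain_off inside it is non-NULL, and the returned addresses are taken to be the bucket start addresses.

```python
def full_buckets(read_ptr, start, bsize, items, chain_off):
    """Return addresses of buckets whose chain pointer is non-NULL."""
    found = []
    for i in range(items):
        bucket = start + i * bsize
        if read_ptr(bucket + chain_off) != 0:
            found.append(bucket)
    return found


# Fake hash table with 4 buckets of 8 bytes each; unlisted addresses read as 0.
memory = {0x1008: 0xdead, 0x1018: 0xbeef}
result = full_buckets(lambda addr: memory.get(addr, 0), 0x1000, 8, 4, 0)
print([hex(a) for a in result])
```

Each loop iteration performs one memory read, which is why doing this in C pays off when the table has hundreds of thousands of buckets.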

crash.getFullBucketsH(start, bsize, items, chain_off)

Similar to getFullBuckets() but for different hash table structure.

crash.FD_ISSET(i, fileparray)

Interface to C-macro FD_ISSET

Parameters:
  • i -- an index in fileparray

  • fileparray -- address of struct fdtable *fdt in struct files_struct

crash.get_NR_syscalls(void)
Returns:

number of system calls registered in sys_call_table

crash.get_pathname(dentry, vfsmnt)
Parameters:
  • dentry -- dentry address

  • vfsmnt -- vfsmnt address

Returns:

a string containing the pathname of this object

crash.setprocname(name)

Changes the name of the currently running process; needed if we want to implement daemons or background processes.

Parameters:

name -- a string containing a new name

crash.is_task_active(taskaddr)

Interface to internal crash subroutine is_task_active

Parameters:

taskaddr -- address of a task

Returns:

True for active tasks, False for inactive ones

crash.pid_to_task(pid)

Interface to internal crash subroutine pid_to_task

Returns:

address of the task

crash.task_to_pid(taskaddr)

Interface to internal crash subroutine task_to_pid

Returns:

PID of this task

crash.get_uptime()

Interface to crash builtin subroutine get_uptime(NULL, &jiffies)

Returns:

an integer - seconds since boot

crash.get_task_mem_usage()

Conversion of Integers

Python integers are always signed and have arbitrary precision. As a result, they do not behave in the same way as in C; e.g. they do not overflow. So to emulate C behavior, we need to use special functions.
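
The difference can be seen with plain Python arithmetic. The sketch below emulates C wraparound by masking; to_unsigned and to_signed are hypothetical helper names, and the default width of 64 bits assumes a 64-bit long.

```python
def to_unsigned(i, bits=64):
    """Keep only the low `bits` bits, as a C unsigned integer would."""
    return i & ((1 << bits) - 1)


def to_signed(i, bits=64):
    """Reinterpret the low `bits` bits as a two's-complement signed value."""
    value = to_unsigned(i, bits)
    if value >= 1 << (bits - 1):
        value -= 1 << bits
    return value


# In Python the sum simply grows; in C a 32-bit unsigned would wrap to 0.
print(0xffffffff + 1, to_unsigned(0xffffffff + 1, bits=32))
```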

crash.sLong(i)

In C, the same bit sequence can represent either a signed or unsigned integer. In Python, there is no native unsigned integer. This subroutine lets you convert a Python integer to signed, assuming that integer size is that for long type of this architecture.

Parameters:

i -- Python integer of any size/value

Returns:

the lower sizeof(long) bytes of the provided integer, taken as a C unsigned long and reinterpreted as a signed long

An example:

l = 0xffffffffffffffff
print(l, sLong(l))

# Prints 18446744073709551615 -1

crash.le32_to_cpu(ulong)

Interface to __le32_to_cpu C macro

Parameters:

ulong -- unsigned integer

Returns:

converts Python integer to C ulong val, applies __le32_to_cpu(val) and returns a Python integer

crash.le16_to_cpu(uint)

Similar to le32_to_cpu() but invokes C macro __le16_to_cpu

crash.cpu_to_le32(uint)

Similar to le32_to_cpu() but invokes C macro __cpu_to_le32
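
What these macros do depends on the byte order of the host: on a little-endian CPU they leave the value unchanged, on a big-endian CPU they swap bytes. The following pure-Python illustration makes the host byte order an explicit parameter; le32_to_cpu_sketch is a hypothetical name and is not how the module is implemented.

```python
import sys


def le32_to_cpu_sketch(val, host_byteorder=sys.byteorder):
    """Interpret a 32-bit value stored as little-endian on the given host."""
    raw = (val & 0xffffffff).to_bytes(4, host_byteorder)
    return int.from_bytes(raw, "little")


print(hex(le32_to_cpu_sketch(0x12345678, "little")))
print(hex(le32_to_cpu_sketch(0x12345678, "big")))
```

The conversion in the opposite direction is the same byte operation, which is why a single sketch covers both le32_to_cpu and cpu_to_le32.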

Executing Commands

crash.exec_crash_command(cmd, no_stdout=0)

Execute a built-in crash command and return output as a string. There is no timeout mechanism for this subroutine.

Parameters:

cmd -- a string containing the command name and arguments

crash.exec_crash_command_bg2(cmd, no_stdout=0)

This command opens and writes to a FIFO, so we expect someone to read it. Execution is done in the background; we fork() a child process that executes with output redirected to a pipe.

This function is used in the high-level subroutine exec_crash_command_bg(cmd, timeout = None)

Parameters:

cmd -- a string containing the command name and arguments

Returns:

a tuple of (fileno, pid) where fileno is OS file descriptor and pid is PID of the child process
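
A caller that receives such a file descriptor has to read it until EOF while watching the clock. The sketch below shows one way to do that with select; read_all_with_timeout is a hypothetical helper, not the code used by exec_crash_command_bg, and it leaves out handling of the child PID. An ordinary pipe stands in for the descriptor returned by the module.

```python
import os
import select
import time


def read_all_with_timeout(fileno, timeout):
    """Read from fileno until EOF; raise TimeoutError if that takes too long."""
    deadline = time.monotonic() + timeout
    chunks = []
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise TimeoutError("no EOF within %s seconds" % timeout)
        ready, _, _ = select.select([fileno], [], [], remaining)
        if not ready:
            continue              # the deadline check above ends the loop
        data = os.read(fileno, 4096)
        if not data:              # empty read means the writer closed its end
            break
        chunks.append(data)
    return b"".join(chunks).decode(errors="replace")


# Demonstration with an ordinary pipe instead of a crash command.
rfd, wfd = os.pipe()
os.write(wfd, b"hello")
os.close(wfd)
output = read_all_with_timeout(rfd, 1.0)
os.close(rfd)
print(output)
```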

crash.exec_epython_command(cmd)
Parameters:

cmd -- a string containing the command name and arguments

Returns:

nothing - we just execute the command and output goes to stdout

crash.set_default_timeout(timeout)

Set default timeout for execution of crash built-in commands as done via exec_crash_command_bg2()

Parameters:

timeout -- default timeout in seconds

Registering Commands

Normally you execute your own programs with epython progname. But if you develop a program to be included in PyKdump for general consumption, it makes sense to register it so that you can execute it in crash simply as progname, without specifying epython every time. For example, xportshow is implemented in Python but is registered.

crash.register_epython_prog(progname, description, shorthelp, longhelp)
Parameters:
  • progname -- a string containing the program name

  • description -- a string containing the program description

  • shorthelp -- a string containing short help

  • longhelp -- a string containing detailed help

An example:

help = '''
Print information about tasks in more detail than the built-in 'ps'
command
'''

register_epython_prog("taskinfo", "Detailed info about tasks",
      "-h   - list available options",
      help)

crash.get_epython_cmds()

Get a list of registered epython commands. Used internally in higher-level PyKdump API.

Returns:

a list of strings

GDB Interface

This section describes GDB-specific subroutines, intended primarily for use by framework developers, not end users.

When we use the whatis or struct command in crash, we really execute internal gdb commands whatis and ptype and they print information in C syntax. Programmatically in GDB we rely on struct symbol obtained by calling different internal GDB functions.

Python bindings to GDB internals return type info as a dictionary with the following keys:

  • basetype - type name, e.g. 'int' or 'struct net_protocol'

  • codetype - GDB type, e.g. TYPE_CODE_INT

  • fname - field or variable name

  • typelength - an integer, sizeof() for this type

  • dims - for array, a list of integers with dimensions

  • stars - for pointers, how many stars in C syntax

  • ptrbasetype - for pointers, base type of object

  • uint - 0 for signed, 1 for unsigned

  • bitsize - for bitfields, the size in bits. For normal fields, this key is not present in the dictionary.

  • bitoffset - for bitfields, offset from the word boundary, in bits

  • edef - for enumeration types, a list of pairs (name, value)

For struct, we have an extra key - body - which is a list of dictionaries for all fields. These entries have bitoffset keys with values showing the offset in bits from the beginning of this struct. This is true even for normal fields (when there is no bitsize key).

To make this clearer, here are some examples.

In crash:

crash64> whatis int
SIZE: 4

crash64> whatis struct task_struct
struct task_struct {
    volatile long state;
    void *stack;
...

crash64> whatis inet_protos
const struct net_protocol *inet_protos[256];

crash64> struct list_head
struct list_head {
    struct list_head *next;
    struct list_head *prev;
}
SIZE: 16

Now the same in PyKdump program:

pp.pprint(crash.gdb_whatis("int"))
pp.pprint(crash.gdb_whatis("struct task_struct"))
pp.pprint(crash.gdb_whatis("inet_protos"))
pp.pprint(crash.gdb_typeinfo("struct list_head"))

This results in output:

{'basetype': 'int', 'codetype': 8, 'fname': 'int', 'typelength': 4, 'uint': 0}

{   'basetype': 'struct task_struct',
    'codetype': 3,
    'fname': 'struct task_struct',
    'typelength': 2648}

{   'basetype': 'struct net_protocol',
    'codetype': 1,
    'dims': [256],
    'fname': 'inet_protos',
    'ptrbasetype': 3,
    'stars': 1,
    'typelength': 8}

{   'basetype': 'struct list_head',
    'body': [   {   'basetype': 'struct list_head',
                    'bitoffset': 0,
                    'codetype': 1,
                    'fname': 'next',
                    'ptrbasetype': 3,
                    'stars': 1,
                    'typelength': 8},
                {   'basetype': 'struct list_head',
                    'bitoffset': 64,
                    'codetype': 1,
                    'fname': 'prev',
                    'ptrbasetype': 3,
                    'stars': 1,
                    'typelength': 8}],
    'codetype': 3,
    'typelength': 16}

crash.get_GDB_output(cmd)

Execute a GDB command and return its output as a string

crash.gdb_whatis(varname)

Interface to gdb_whatis GDB internal subroutine

Parameters:

varname -- a string that will be passed to gdb_whatis

Returns:

a dictionary describing this object

crash.gdb_typeinfo(typename)
Parameters:

typename -- a string containing the data type, e.g. struct task_struct

Returns:

a dictionary describing this type
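
The returned dictionaries are plain Python data, so they can be processed without further calls into GDB. The sketch below uses two hypothetical helpers: one converts the bit offsets in body to byte offsets, the other rebuilds a C-like declaration from the basetype, stars, fname and dims keys. The input dictionaries are trimmed copies of the sample output shown earlier; qualifiers such as const are not recorded in the dictionary and are therefore lost.

```python
def field_byte_offsets(typeinfo):
    """Map field name -> byte offset, for a struct description with a 'body'."""
    # For bitfields this yields the byte that contains the first bit.
    return {f["fname"]: f["bitoffset"] // 8 for f in typeinfo["body"]}


def c_declaration(info):
    """Rebuild a C-like declaration for a variable or field description."""
    decl = info["basetype"] + " " + "*" * info.get("stars", 0) + info["fname"]
    for dim in info.get("dims", []):
        decl += "[%d]" % dim
    return decl + ";"


list_head = {
    "basetype": "struct list_head",
    "body": [
        {"basetype": "struct list_head", "bitoffset": 0,
         "fname": "next", "stars": 1, "typelength": 8},
        {"basetype": "struct list_head", "bitoffset": 64,
         "fname": "prev", "stars": 1, "typelength": 8},
    ],
    "typelength": 16,
}
inet_protos = {"basetype": "struct net_protocol", "dims": [256],
               "fname": "inet_protos", "stars": 1, "typelength": 8}

print(field_byte_offsets(list_head))
print(c_declaration(inet_protos))
```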

gdb/gdbtypes.h from GDB source defines

enum type_code
  {
    TYPE_CODE_BITSTRING = -1,   /* Deprecated  */
    TYPE_CODE_UNDEF = 0,        /* Not used; catches errors */
    TYPE_CODE_PTR,              /* Pointer type */
    ...

Some of these values are accessible as module constants, namely:

crash.TYPE_CODE_PTR
crash.TYPE_CODE_ARRAY
crash.TYPE_CODE_STRUCT
crash.TYPE_CODE_UNION
crash.TYPE_CODE_ENUM
crash.TYPE_CODE_FUNC
crash.TYPE_CODE_INT
crash.TYPE_CODE_FLT
crash.TYPE_CODE_VOID
crash.TYPE_CODE_BOOL
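
Since the numeric values come from the GDB sources and may differ between GDB versions, it is safer to look them up by name than to hard-code them. A small sketch of building a reverse map for printing follows; type_code_names is a hypothetical helper, and fake_crash is a stand-in namespace whose three values are taken from the sample output above.

```python
from types import SimpleNamespace


def type_code_names(module):
    """Build a {numeric code: constant name} map from TYPE_CODE_* attributes."""
    return {getattr(module, name): name
            for name in dir(module) if name.startswith("TYPE_CODE_")}


# Stand-in for the crash module, with only a few constants defined.
fake_crash = SimpleNamespace(TYPE_CODE_PTR=1, TYPE_CODE_STRUCT=3,
                             TYPE_CODE_INT=8)
names = type_code_names(fake_crash)
print(names.get(3, "unknown"))
```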

Other Module-level Constants

crash.error

Exception raised when we have a problem executing a crash internal subroutine, e.g. bad address

crash.version

A string containing the crash module version, e.g. "3.2.0"

The following constants are copied from crash sources, namely from defs.h

crash.KVADDR
crash.UVADDR
crash.PHYSADDR
crash.XENMACHADDR
crash.FILEADDR
crash.AMBIGUOUS
crash.PAGESIZE
crash.PAGE_CACHE_SHIFT
crash.HZ

An integer with the value of HZ for this vmcore

crash.WARNING

A string to be used while printing warnings, at this moment set to "++WARNING+++"

When we build PyKdump, we use headers from a specific crash version's sources. We do not necessarily need to load the extension using exactly the same version of crash; typically extensions are compatible with any crash binary as long as its major version is the same. So it is OK to build extensions using e.g. crash-7.2.3 and use them with the binary of crash-7.2.8. But when the major version of crash changes, extensions built with a previous major version will likely not work.
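
A program could check this rule at startup by comparing the major components of crash.Crash_run and crash.Crash_build. The sketch below assumes that both are strings beginning with a dotted numeric version such as "7.2.8"; major_version and same_major are hypothetical helper names.

```python
import re


def major_version(version):
    """Extract the leading major number from a version string like '7.2.8'."""
    match = re.match(r"\s*(\d+)", version)
    if match is None:
        raise ValueError("cannot parse version %r" % version)
    return int(match.group(1))


def same_major(run_version, build_version):
    """True if both versions share the same major number."""
    return major_version(run_version) == major_version(build_version)


print(same_major("7.2.8", "7.2.3"), same_major("8.0.0", "7.2.3"))
```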

crash.Crash_run

Version of crash utility that we are using at this moment

crash.Crash_build

Version of crash used for building the extension