:mod:`crash` C-module
=======================================

.. module:: crash
   :synopsis: provides crash/GDB bindings

.. moduleauthor:: Alex Sidorenko

--------------

This module implements Python bindings to ``crash`` and ``GDB``
internal commands and structures. Some of these subroutines are
intended for the framework itself. Those of general interest are
available after ``import pykdump.API``; there is no need to
``import crash`` to use them. In most cases, therefore, you do not
need to import this module in your own programs.

Basic Info about struct/union/enum
----------------------------------

.. function:: struct_size(structname)

   :param structname: a string containing the struct name,
      e.g. *struct task_struct*. While ``crash`` lets you specify
      simplified names without *struct*, this subroutine needs proper
      C syntax.
   :return: size as an integer, or -1 if there is no symbolic info
      for this struct

.. function:: union_size(unionname)

   Similar to :func:`struct_size` but for unions instead of structs

.. function:: member_offset(sname, smember)

   :param sname: a string containing the struct name
   :param smember: a string containing the member name
   :return: offset as an integer, or -1 if there is no such member

.. function:: member_size(sname, smember)

   :param sname: a string containing the struct name
   :param smember: a string containing the member name
   :return: size as an integer, or -1 if there is no such member

.. function:: enumerator_value(ename)

   Interface to ``crash`` internal subroutine
   ``int enumerator_value(char *e, long *value)``

   :param ename: a string containing the enumerator name
   :return: the numeric value as an integer

   Example: ``WORK_CPU_NONE = enumerator_value("WORK_CPU_NONE")``

Symbol/Address Subroutines
--------------------------

.. function:: symbol_exists(symname)

   Tests whether symbol *symname* exists in this kernel (as listed by
   the ``crash`` builtin ``sym`` command).

   Returns a value that evaluates to *True* if it does and *False* if
   it does not.

.. function:: sym2addr(symbolname)

   :param symbolname: a string containing the symbol name
   :return: address as an integer. 0 means that there is no such
      symbol. If there are multiple variables with this name (e.g. in
      different DLKMs), the address of the first one is returned.

.. function:: sym2alladdr(symbolname)

   Similar to :func:`sym2addr` but returns a list of addresses. If
   there are no matches at all, returns an empty list. If there is
   only one match, returns a list of one element.

.. function:: addr2sym(addr, loose_match = False)

   Tries to find a symbol matching the given address. By default, it
   looks for an exact match and, if one is found, returns the symbol
   name as a string. If no exact match is found, returns *None*.

   When called with ``loose_match=True``, it looks for an approximate
   match and returns a tuple ``(name, offset)``.

   Example: there is a symbol *tcp_shutdown* with address
   0xffffffff8147e580::

      print(crash.addr2sym(0xffffffff8147e581, True))
      ('tcp_shutdown', 1)

   When loose matching finds no match, it returns the tuple
   ``(None, None)``.

.. function:: addr2mod(addr)

   :param addr: address as an integer
   :return: a string containing the name of the module this address
      belongs to, or *None*

.. _reading_memory:

Reading Memory
--------------

There are different types of memory, e.g. :data:`KVADDR`. Some of the
following subroutines let you specify the memory type as an extra
argument and some rely on the default.

.. function:: mem2long(bytestr, signed, array)

   This is a Swiss-army knife subroutine to convert a byte string
   into an integer or a list of integers.

   In C, we have integers of different sizes, signed/unsigned, and
   arrays of integers (this subroutine can handle 1-dimensional
   arrays only). After we read a chunk of memory, it is represented
   by a byte string. This subroutine converts it as specified by its
   arguments.

   We assume the byte string consists of *int* for this architecture.
   As such, you cannot use this subroutine to handle e.g.
   ``short int a[10]``, only C objects like ``int a[10]`` or
   ``signed int[10]``.

   :param bytestr: a byte-string with data
   :param signed: *True/False* to specify whether integers are signed
      or not, *unsigned* by default
   :param array: if specified, return a list of *array* integers
      instead of one value

.. function:: readmem(addr, size [, mtype])

   Interface to ``crash`` builtin ``readmem()``.

   :param addr: address to read from
   :param size: how many bytes to read
   :param mtype: memory type to read, by default :data:`KVADDR`
   :return: a byte-string with data

.. function:: readPtr(addr [, mtype])

   Assuming that *addr* contains a pointer, read the pointer value.

   :param addr: address
   :param mtype: memory type, by default :data:`KVADDR`

.. function:: readInt(addr, size [, signedvar [, mtype]])

   Given an address, read an integer of the given *size*

   :param addr: address to read from
   :param size: integer size, according to the C
      char/short/int/long/longlong specification for this architecture
   :param signedvar: False for ``unsigned``, True for ``signed``. If
      not specified, we assume ``unsigned``.
   :param mtype: memory type, by default :data:`KVADDR`

.. function:: set_readmem_task(taskaddr)

   :param taskaddr: task address or zero

      * if taskaddr=0, reset readmem operations to use KVADDR
      * if taskaddr is a valid task address, set readmem operations
        to UVADDR and set the current context to this task

   :return: nothing

Conversion between Memory Types
-------------------------------

.. function:: uvtop(taskaddr, vaddr)

   Interface to ``crash`` builtin ``uvtop(tskaddr, vaddr)``: converts
   a virtual address to a physical address in the context of the
   specified task

   :param taskaddr: address of ``struct task_struct``
   :param vaddr: virtual address
   :return: physical address as an integer

.. function:: phys_to_page(physaddr)

   Interface to ``crash`` builtin
   ``phys_to_page(physaddr_t phys, ulong *pp)``

   :param physaddr: physical address
   :return: page as an integer

.. function:: PAGEOFFSET(vaddr)

   Interface to ``crash`` builtin ``PAGEOFFSET(vaddr)``

Miscellaneous
-------------

.. function:: getListSize(addr, offset[, maxel = 1000])

   Assuming that *addr* points to a list head, find the total number
   of elements. The same can be done in Python easily, but C is
   faster for big lists.

   :param addr: address of a structure representing a list head
   :param offset: offset of the ``next`` pointer in this structure
   :param maxel: maximum number of elements to search for, i.e. we
      stop iterating if we reach this limit
   :return: number of list elements found (not counting the list
      head itself)

.. function:: getFullBuckets(start, bsize, items, chain_off)

   Find full buckets in hash tables. If a hash table consists of many
   buckets (>100,000) but just a few of them are non-empty, this
   subroutine is significantly faster than doing the same in pure
   Python. Useful for networking tables.

   :param start: address of the hash table
   :param bsize: hash-bucket size
   :param items: how many buckets (hash size)
   :param chain_off: chain offset
   :return: a list of addresses of full buckets

.. function:: getFullBucketsH(start, bsize, items, chain_off)

   Similar to :func:`getFullBuckets` but for a different hash table
   structure.

.. function:: FD_ISSET(i, fileparray)

   Interface to C-macro ``FD_ISSET``

   :param i: an index in ``fileparray``
   :param fileparray: address of ``struct fdtable *fdt`` in
      ``struct files_struct``

.. function:: get_NR_syscalls()

   :return: number of system calls registered in *sys_call_table*

.. function:: get_pathname(dentry, vfsmnt)

   :param dentry: dentry address
   :param vfsmnt: vfsmnt address
   :return: a string containing the pathname of this object

.. function:: setprocname(name)

   Changes the name of the currently running process; needed if we
   want to implement daemons or background processes.

   :param name: a string containing the new name

.. function:: is_task_active(taskaddr)

   Interface to internal ``crash`` subroutine ``is_task_active``

   :param taskaddr: address of a task
   :return: *True* for active tasks, *False* for inactive ones

.. function:: pid_to_task(pid)

   Interface to internal ``crash`` subroutine ``pid_to_task``

   :return: address of the task

.. function:: task_to_pid(taskaddr)

   Interface to internal ``crash`` subroutine ``task_to_pid``

   :return: PID of this task

.. function:: get_uptime()

   Interface to ``crash`` builtin subroutine
   ``get_uptime(NULL, &jiffies)``

   :return: seconds since boot, as an integer

.. function:: get_task_mem_usage

Conversion of Integers
----------------------

Python integers are always signed and have arbitrary precision. As a
result, they do not behave in the same way as in C; e.g. they do not
overflow. To emulate C behavior, we need to use special functions.

.. function:: sLong(i)

   In C, the same bit sequence can represent either a *signed* or an
   *unsigned* integer. In Python, there is no native *unsigned*
   integer. This subroutine lets you convert a Python integer to
   *signed*, assuming that the integer size is that of the *long*
   type of this architecture.

   :param i: Python integer of any size/value
   :return: the low-order ``sizeof(long)`` bytes of the provided
      integer, taken as a C ``unsigned long`` and returned as
      ``signed long``

   An example::

      l = 0xffffffffffffffff
      print(l, sLong(l))
      # Prints 18446744073709551615 -1

.. function:: le32_to_cpu(ulong)

   Interface to ``__le32_to_cpu`` C macro

   :param ulong: unsigned integer
   :return: converts the Python integer to C ``ulong`` val, applies
      ``__le32_to_cpu(val)`` and returns a Python integer

.. function:: le16_to_cpu(uint)

   Similar to :func:`le32_to_cpu` but invokes C macro
   ``__le16_to_cpu``

.. function:: cpu_to_le32(uint)

   Similar to :func:`le32_to_cpu` but invokes C macro
   ``__cpu_to_le32``

Executing Commands
------------------

.. function:: exec_crash_command(cmd, no_stdout = 0)

   Execute a built-in ``crash`` command and return its output as a
   string. There is no timeout mechanism for this subroutine.

   :param cmd: a string containing the command name and arguments

.. function:: exec_crash_command_bg2(cmd, no_stdout = 0)

   This command opens and writes to a FIFO, so we expect someone to
   read it. Execution is done in the background: we fork() a child
   process that executes the command with output redirected to a
   pipe. This function is used in the high-level subroutine
   ``exec_crash_command_bg(cmd, timeout = None)``

   :param cmd: a string containing the command name and arguments
   :return: a tuple of (fileno, pid) where *fileno* is the OS file
      descriptor and *pid* is the PID of the child process

.. function:: exec_epython_command(cmd)

   :param cmd: a string containing the command name and arguments
   :return: nothing - we just execute the command and output goes to
      stdout

.. function:: set_default_timeout(timeout)

   Set the default timeout for execution of ``crash`` built-in
   commands as done via :func:`exec_crash_command_bg2`

   :param timeout: default timeout in seconds

Registering Commands
--------------------

Normally you execute your own programs with ``epython progname``. But
if you develop a program to be included in PyKdump for general
consumption, it makes sense to register it so that you can execute it
in ``crash`` simply as ``progname``, without specifying ``epython``
every time. For example, ``xportshow`` is implemented in Python but
is registered.

.. function:: register_epython_prog(progname, description, shorthelp, longhelp)

   :param progname: a string containing the program name
   :param description: a string containing the program description
   :param shorthelp: a string containing short help
   :param longhelp: a string containing detailed help

   An example::

      help = '''
      Print information about tasks in more detail than the built-in 'ps' command
      '''

      register_epython_prog("taskinfo", "Detailed info about tasks",
                            "-h - list available options", help)

.. function:: get_epython_cmds()

   Get a list of registered ``epython`` commands. Used internally in
   the higher-level PyKdump API.

   :return: a list of strings

GDB Interface
-------------

This section describes GDB-specific subroutines, intended primarily
for use by framework developers, not end users.

When we use the ``whatis`` or ``struct`` command in ``crash``, we
really execute the internal ``gdb`` commands *whatis* and *ptype*,
and they print information in C syntax. Programmatically in ``GDB``
we rely on ``struct symbol`` obtained by calling different internal
``GDB`` functions.

Python bindings to ``GDB`` internals return type info as a dictionary
with the following keys:

* basetype - type name, e.g. 'int' or 'struct net_protocol'
* codetype - GDB type, e.g. :data:`TYPE_CODE_INT`
* fname - field or variable name
* typelength - an integer, sizeof() for this type
* dims - for arrays, a list of integers with dimensions
* stars - for pointers, how many stars in C syntax
* ptrbasetype - for pointers, base type of object
* uint - 0 for signed, 1 for unsigned
* bitsize - for bitfields, the size in bits. For normal fields, this
  key is not present in the dictionary.
* bitoffset - for bitfields, offset from the word boundary, in bits
* edef - for enumeration types, a list of pairs (name, value)

For *struct*, we have an extra key - *body* - which is a list of
dictionaries for all fields. These entries have *bitoffset* keys with
values showing the offset in bits from the beginning of this
*struct*. This is true even for normal fields (when there is no
*bitsize* key).

To make this clearer, here are some examples. In crash::

   crash64> whatis int
   SIZE: 4

   crash64> whatis struct task_struct
   struct task_struct {
       volatile long state;
       void *stack;
   ...

   crash64> whatis inet_protos
   const struct net_protocol *inet_protos[256];

   crash64> struct list_head
   struct list_head {
       struct list_head *next;
       struct list_head *prev;
   }
   SIZE: 16

Now the same in a PyKdump program::

   pp.pprint(crash.gdb_whatis("int"))
   pp.pprint(crash.gdb_whatis("struct task_struct"))
   pp.pprint(crash.gdb_whatis("inet_protos"))
   pp.pprint(crash.gdb_typeinfo("struct list_head"))

This results in output:

.. code-block:: text

   {'basetype': 'int', 'codetype': 8, 'fname': 'int', 'typelength': 4, 'uint': 0}
   {   'basetype': 'struct task_struct',
       'codetype': 3,
       'fname': 'struct task_struct',
       'typelength': 2648}
   {   'basetype': 'struct net_protocol',
       'codetype': 1,
       'dims': [256],
       'fname': 'inet_protos',
       'ptrbasetype': 3,
       'stars': 1,
       'typelength': 8}
   {   'basetype': 'struct list_head',
       'body': [   {   'basetype': 'struct list_head',
                       'bitoffset': 0,
                       'codetype': 1,
                       'fname': 'next',
                       'ptrbasetype': 3,
                       'stars': 1,
                       'typelength': 8},
                   {   'basetype': 'struct list_head',
                       'bitoffset': 64,
                       'codetype': 1,
                       'fname': 'prev',
                       'ptrbasetype': 3,
                       'stars': 1,
                       'typelength': 8}],
       'codetype': 3,
       'typelength': 16}

.. function:: get_GDB_output(cmd)

   Execute a ``GDB`` command and return its output as a string

.. function:: gdb_whatis(varname)

   Interface to ``gdb_whatis`` GDB internal subroutine

   :param varname: a string that will be passed to ``gdb_whatis``
   :return: a dictionary describing this object

.. function:: gdb_typeinfo(typename)

   :param typename: a string containing the data type, e.g.
      ``struct task_struct``
   :return: a dictionary describing this type

``gdb/gdbtypes.h`` from GDB source defines

.. code-block:: c

   enum type_code
   {
       TYPE_CODE_BITSTRING = -1,  /* Deprecated */
       TYPE_CODE_UNDEF = 0,       /* Not used; catches errors */
       TYPE_CODE_PTR,             /* Pointer type */
       ...

Some of these values are accessible as module constants, namely:

.. data:: TYPE_CODE_PTR
.. data:: TYPE_CODE_ARRAY
.. data:: TYPE_CODE_STRUCT
.. data:: TYPE_CODE_UNION
.. data:: TYPE_CODE_ENUM
.. data:: TYPE_CODE_FUNC
.. data:: TYPE_CODE_INT
.. data:: TYPE_CODE_FLT
.. data:: TYPE_CODE_VOID
.. data:: TYPE_CODE_BOOL

Other Module-level Constants
----------------------------

.. data:: error

   Exception raised when we have a problem executing a ``crash``
   internal subroutine, e.g. bad address

.. data:: version

   A string containing the ``crash`` module version, e.g. "3.2.0"

The following constants are copied from ``crash`` sources, namely
from ``defs.h``

.. data:: KVADDR
.. data:: UVADDR
.. data:: PHYSADDR
.. data:: XENMACHADDR
.. data:: FILEADDR
.. data:: AMBIGUOUS
.. data:: PAGESIZE
.. data:: PAGE_CACHE_SHIFT

.. data:: HZ

   An integer with the value of HZ for this vmcore

.. data:: WARNING

   A string to be used while printing warnings, at this moment set to
   "++WARNING+++"

When we build PyKdump, we use headers from the sources of a specific
``crash`` version. We do not necessarily need to load the extension
using exactly the same version of ``crash``; typically extensions are
compatible with any ``crash`` binary as long as its major version is
the same. So it is OK to build extensions using e.g. crash-7.2.3 and
use them with the binary of crash-7.2.8. But when the major version
of ``crash`` changes, extensions built with a previous major version
will likely not work.

.. data:: Crash_run

   Version of the ``crash`` utility that we are using at this moment

.. data:: Crash_build

   Version of ``crash`` used for building the extension
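
The major-version rule described above can be checked at run time. The following is a minimal sketch, assuming that :data:`Crash_run` and :data:`Crash_build` are version strings such as "7.2.8" or "crash-7.2.3"; their exact format is not specified here. ``crash_major`` and ``same_crash_major`` are hypothetical helper names used for illustration only and are not part of this module.

```python
import re


def crash_major(version):
    """Return the leading (major) number of a version string.

    Assumption: the string looks like "7.2.8" or "crash-7.2.3".
    """
    match = re.match(r"\D*(\d+)", version)
    if match is None:
        raise ValueError("no version number in %r" % (version,))
    return int(match.group(1))


def same_crash_major(run_version, build_version):
    """True if both version strings share the same major number."""
    return crash_major(run_version) == crash_major(build_version)


# Stand-alone demonstration with literal strings:
print(same_crash_major("7.2.8", "crash-7.2.3"))  # True

# Inside a PyKdump session one might instead write (assumed usage):
#   same_crash_major(crash.Crash_run, crash.Crash_build)
```

Comparing only the major number follows the compatibility statement above; a stricter check could compare the full version strings instead.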