Debugging Aids

There are many ready-to-use PyKdump programs included in its distribution. To use them, you do not need to know anything about PyKdump internals; they can be used just like crash builtins.

This section is mainly of interest to developers, both framework developers and user-program developers.

Global Options

Some options are parsed by pykdump.API itself before arguments are passed to programs. They change global behavior and, after processing, are stripped from the argument list. The most important are:

op.add_option("--timeout", dest="timeout", default=120,
          action="store", type="int",
          help="set default timeout for crash commands")

op.add_option("--maxel", dest="Maxel", default=10000,
          action="store", type="int",
          help="set maximum number of list elements to traverse")

op.add_option("--usens", dest="usens",
          action="store", type="int",
          help="use namespace of the specified PID")

op.add_option("--reload", dest="reload", default=0,
          action="store_true",
          help="reload already imported modules from LinuxDump")
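
For example, to give slow crash commands more time on a large vmcore, you can pass --timeout to any bundled program (taskinfo is used here just for illustration; these options are stripped by pykdump.API before the program sees its own arguments):

```shell
crash64> taskinfo --summ --timeout=300
```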

Running Code From Your Source Tree

By default, all needed framework modules are loaded directly from the mpykdump.so file. You can check the module search path with:

crash64> epython -p
3.8.3 (default, May 21 2020, 13:02:14)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-18)]
['.', '/usr/local/lib/mpykdump64.so/pylib', '/usr/local/lib/mpykdump64.so', '/usr/local/lib/mpykdump64.so/dist-packages']

Please note that the current directory is checked before searching for modules in mpykdump.so. This makes it possible to quickly write simple ad-hoc programs by creating a file in your current directory (usually the directory where the vmcore resides).

But if you are participating in PyKdump development using its Git repository, you will want your local repository copy to be searched before mpykdump.so.

You rarely need to rebuild mpykdump.so. This is only needed if you want to prepare a new binary file for your organization or if you are working on the PyKdump C-module.

To change the search path used by PyKdump, set the shell environment variable PYKDUMPPATH, e.g.:

$ export PYKDUMPPATH=~alexs/tools/pykdump/progs:~alexs/tools/pykdump/experiments

After that, PyKdump will search these locations before using the built-in mpykdump.so:

crash64> epython -p
3.8.3 (default, May 21 2020, 13:02:14)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-18)]
['.', '/usr/local/lib/mpykdump64.so/pylib', '/home/alexs/tools/pykdump/progs', '/home/alexs/tools/pykdump/experiments', '/usr/local/lib/mpykdump64.so', '/usr/local/lib/mpykdump64.so/dist-packages']

Please note one important exception: we still search /usr/local/lib/mpykdump64.so/pylib before the paths you added. This section of the binary module contains parts of the Python Standard Library; we need them and do not want them overridden (except in some very special cases).

Controlling Debugging

A typical program consists of a main program importing several modules. You can add a --debug option or something similar to your main program, but how do you pass this value to the modules? We could always pass it as a parameter to subroutines/methods, but this is not very convenient. Assuming we have modules mod1 and mod2, we can set their debug variables from the main module like this:

import mod1
mod1.debug = debug

import mod2
mod2.debug = debug

This works, but is rather ugly. There is another problem: importing a module can execute code at import time. What if we want to display debugging info at that stage?

PyKdump initializes the Python machine only once, when you load the extension. Before running each program, it does some cleanup but preserves many things, such as already cached symbolic information. This significantly improves the performance if you execute the same program multiple times (possibly with different options or arguments). This persistence lets us implement per-module debugging controls, somewhat similar to what the Linux kernel does with procfs/sysfs.

In your module, you register an attribute in the following way:

registerModuleAttr("debugDLKM",
                   default=0, help="Debug DLKM debuginfo subroutines")

This creates a variable debugDLKM in the module, initializes it to 0, and registers the option in a session cache. Registration happens when the module is imported. There is a programmatic interface for changing the values of such attributes externally; the easiest way is the pyctl command, modeled after the Linux sysctl command.

While the pyctl command is included in the binary mpykdump.so, it is not registered as a top-level command (i.e. it is not visible in man or help). This is done to avoid confusing normal users. To avoid prepending it with epython every time, you can create an alias and put it into your .crashrc file.
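
For example, a line like the following in your .crashrc (using crash's alias facility; check "help alias" in your crash version for the exact syntax) lets you type pyctl directly:

```shell
alias pyctl epython pyctl
```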

Examples:

crash64> epython pyctl -a
    debugDLKM       - Debug DLKM debuginfo subroutines
             currentvalue=0 default=0
    debugDeref
             currentvalue=0 default=0
    debugMP_KW      - Debug Monkey-Patching of default keywords
             currentvalue=0 default=0
    debugMemoize    - Debug Memoization
             currentvalue=0 default=0
    debugReload     - Debug reloading Python modules
             currentvalue=0 default=0

crash64> epython pyctl debugMemoize=2

 ** Execution took   0.00s (real)   0.00s (CPU)
crash64> epython pyctl -v debugMemoize
  debugMemoize    - Debug Memoization
           currentvalue=2 default=0
      pykdump.Generic  func=__func
      {'pyctlname': None, 'default': 0, 'type': <class 'int'>, 'help': 'Debug Memoization'}

If you try to assign a value to a non-registered attribute, you will get an error message:

crash64> epython pyctl debugNN=2
  Unknown key: <debugNN>, skipping it

Reloading Modules

As we do not reinitialize the Python machine every time we start a program (as long as we do not exit crash), modules imported during a previous command execution stay in memory and are not reimported every time. This is good for performance, but what if we are working on a module and would like to force its reimport, to pick up the changes we made?
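
The effect can be illustrated in plain Python (a sketch, not PyKdump's implementation; reload_prefixed is an invented name): already-imported modules live in sys.modules, and importlib.reload re-executes their code in place:

```python
import importlib
import sys

def reload_prefixed(prefix):
    """Reload every already-imported module whose dotted name is
    <prefix> or starts with '<prefix>.' - similar in spirit to what
    --reload does for the LinuxDump modules."""
    for name in sorted(sys.modules):          # snapshot; reload may touch sys.modules
        if name == prefix or name.startswith(prefix + "."):
            print("--reloading", name)
            importlib.reload(sys.modules[name])
```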

To reload your modules, just add --reload to your command. To see what is being reloaded, set debugReload. An example:

crash64> taskinfo --summ
Number of Threads That Ran Recently
-----------------------------------
   last second     114
   last     5s     161
   last    60s     266

 ----- Total Numbers of Threads per State ------
  TASK_INTERRUPTIBLE                         896
  TASK_NONINTERACTIVE                          2
  TASK_RUNNING                                 2
  TASK_STOPPED                                 1
  TASK_TRACED                                  1
  TASK_UNINTERRUPTIBLE                       161


 ** Execution took   0.95s (real)   0.93s (CPU)

crash64> epython pyctl debugReload=2

 ** Execution took   0.00s (real)   0.01s (CPU)
crash64> taskinfo --summ --reload
LinuxDump /home/alexs/tools/pykdump/progs/LinuxDump/__init__.py
--reloading LinuxDump
LinuxDump.percpu /home/alexs/tools/pykdump/progs/LinuxDump/percpu.py
--reloading LinuxDump.percpu
LinuxDump.inet /home/alexs/tools/pykdump/progs/LinuxDump/inet/__init__.py
--reloading LinuxDump.inet
LinuxDump.Time /home/alexs/tools/pykdump/progs/LinuxDump/Time.py
--reloading LinuxDump.Time
LinuxDump.inet.proto /home/alexs/tools/pykdump/progs/LinuxDump/inet/proto.py
--reloading LinuxDump.inet.proto
LinuxDump.BTstack /home/alexs/tools/pykdump/progs/LinuxDump/BTstack.py
--reloading LinuxDump.BTstack
LinuxDump.fs /home/alexs/tools/pykdump/progs/LinuxDump/fs/__init__.py
--reloading LinuxDump.fs
LinuxDump.Tasks /home/alexs/tools/pykdump/progs/LinuxDump/Tasks.py
--reloading LinuxDump.Tasks
Number of Threads That Ran Recently
-----------------------------------
   last second     114
   last     5s     161
   last    60s     266

 ----- Total Numbers of Threads per State ------
  TASK_INTERRUPTIBLE                         896
  TASK_NONINTERACTIVE                          2
  TASK_RUNNING                                 2
  TASK_STOPPED                                 1
  TASK_TRACED                                  1
  TASK_UNINTERRUPTIBLE                       161


 ** Execution took   0.39s (real)   0.39s (CPU)

At present we do not reload the modules of the framework itself - the contents of the pykdump directory - as this is difficult to implement properly. (Is it OK to reload the module that is responsible for reloading, while we are executing code from it?) So this approach works well for developing user programs, but not the framework itself.

Monkey-Patching Default Values for Keywords

This started as a fun project (to better understand Python internals) but can be really useful in some cases.

If you look at the sources of several list-traversal subroutines, e.g. readList(), you will see that we can optionally specify the maximum number of elements to traverse, otherwise we use a default:

def readList(start, offset=0, *, maxel = _MAXEL, inchead = True, warn = True):

There are several subroutines of this type in pykdump/highlevel.py, and they all take their default from the _MAXEL global variable set at the beginning of that file.

The idea is to limit the number of list elements to traverse, both to prevent infinite iteration and to warn you about something unexpected. For example, if the list size for some kernel table is normally not greater than 10000, finding more elements during iteration probably means memory corruption.

But in most cases, we do not specify this keyword argument and expect that the default value is good enough.
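
The bounded traversal can be sketched as follows. This is an illustration only, not the real readList (which reads linked structures from vmcore memory); here next_of is a callable standing in for pointer dereference:

```python
def read_list_bounded(start, next_of, maxel=10000, warn=True):
    """Follow 'next' pointers from <start>, collecting at most maxel
    elements, so a corrupted (e.g. circular) list cannot make us
    loop forever.  A zero/None pointer terminates the list."""
    out = []
    addr = start
    while addr:
        if len(out) >= maxel:
            if warn:
                print("We have reached the limit while reading"
                      " a list maxel=%d" % maxel)
            break
        out.append(addr)
        addr = next_of(addr)
    return out
```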

If we reach a limit during traversal, a warning is printed. To demonstrate this, let us set the default to an unreasonably low value:

crash64> xportshow --summ --maxel=100
...
    We have reached the limit while reading a list maxel=100
                from sk_for_each <- get_AF_UNIX <- TCPIP_Summarize

We print a warning and then the sequence of subroutine calls.

There are currently 7 subroutines/methods using this, so how can we change the default externally? Originally we used the following approach (this is not a real subroutine; it is just for illustration):

def a(a1, maxel = None):
  maxel = maxel if (maxel is not None) else _MAXEL

and in pykdump.API we did:

import highlevel
...
highlevel._MAXEL = newvalue

Now we do it in the following way:

def setListMaxel(newval):
    patch_default_kw(getCurrentModule(), 'maxel', newval)

A new subroutine patch_default_kw(mod, kname, newval) replaces all default keyword arguments that have name kname in functions/methods defined in module mod with the new value newval.
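
One way to implement such a subroutine in plain Python (a sketch of the idea, not PyKdump's actual code) relies on the fact that default values are stored on the function object itself - in __defaults__ for positional-or-keyword parameters and in __kwdefaults__ for keyword-only ones:

```python
import inspect
import types

def patch_default_kw(mod, kname, newval):
    """Replace the default value of parameter <kname> in all
    functions defined in module <mod>."""
    for obj in vars(mod).values():
        if not isinstance(obj, types.FunctionType):
            continue
        # Keyword-only parameters (declared after '*') keep their
        # defaults in __kwdefaults__, a plain dict
        if obj.__kwdefaults__ and kname in obj.__kwdefaults__:
            obj.__kwdefaults__[kname] = newval
            continue
        # Positional-or-keyword defaults live in __defaults__,
        # aligned with the *last* len(__defaults__) parameters
        # (no *args assumed here, which keeps the sketch simple)
        defaults = obj.__defaults__ or ()
        params = list(inspect.signature(obj).parameters)
        tail = params[len(params) - len(defaults):]
        if kname in tail:
            d = list(defaults)
            d[tail.index(kname)] = newval
            obj.__defaults__ = tuple(d)
```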

Interactive Development

The standard Python interpreter can be run as a REPL (Read-Eval-Print Loop), which lets the user enter Python code and see the results interactively. PyKdump supports this mode of operation as well, with a built-in program that can be run using epython repl:

crash> epython repl
PyKdump Embedded REPL: Python 3.8.3 (default, Jul 17 2020, 16:58:36)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-39.0.3)]
Use Ctrl-D to return to crash

from pykdump.API import *

>>> from LinuxDump.Tasks import TaskTable
>>> tt = TaskTable()
>>> tt.getByPid(332005)
PID=332005 <struct task_struct 0xffff88341422af70> CMD=awk
>>>
Returning to crash

 ** Execution took  32.47s (real)   0.00s (CPU)
crash>

This allows you to import any Python code included by PyKdump and execute it interactively.

For best results, your mpykdump.so should be compiled with a static readline (the default; see the installation instructions). This allows line editing and command history. The REPL still works without readline, but it is less convenient to use.

When you are done using the REPL, use Ctrl-D to exit it. If you re-open the REPL (by running epython repl again), your variables will be preserved:

crash> epython repl
PyKdump Embedded REPL: Python 3.8.3 (default, Jul 17 2020, 16:58:36)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-39.0.3)]
Use Ctrl-D to return to crash

from pykdump.API import *

>>> x = 5
>>>
Returning to crash

 ** Execution took   3.14s (real)   0.00s (CPU)
crash> epython repl
PyKdump Embedded REPL: Python 3.8.3 (default, Jul 17 2020, 16:58:36)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-39.0.3)]
Use Ctrl-D to return to crash

from pykdump.API import *

>>> print(x)
5

Profiling

The PyKdump extension includes Python's cProfile module. Please see the Python documentation for usage details. In the simplest form, you can do something like this:

crash64> epython -m cProfile -s tottime /home/alexs/tools/pykdump/progs/xportshow.py --every

to profile the execution of /home/alexs/tools/pykdump/progs/xportshow.py.
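
The same can be done programmatically from your own code; the following is a minimal sketch (profiled is our own illustrative helper, not part of PyKdump):

```python
import cProfile
import io
import pstats

def profiled(fn, *args, **kwargs):
    """Run fn under cProfile and print the top 5 entries sorted by
    total time, like 'epython -m cProfile -s tottime prog.py'."""
    pr = cProfile.Profile()
    result = pr.runcall(fn, *args, **kwargs)
    buf = io.StringIO()
    pstats.Stats(pr, stream=buf).sort_stats("tottime").print_stats(5)
    print(buf.getvalue())
    return result
```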