spartakus: Using sparse to have semantic checks for kernel ABI breakages

I have been working on this project I have named spartakus that deals with kernel ABI checks through semantic processing of the kernel source code. I have made the source code available on Github some time back and this does deserve a blog post.

Introduction
spartakus is a tool that can be used to generate checksums for exported kernel symbols through semantic processing of the source code using sparse. These checksums would constitute the basis for kernel ABI checks, with changes in checksums meaning a change in the kABI.

spartakus (which is currently a WIP) is forked from sparse and has been modified to fit the requirements of semantic processing of the kernel source for kernel ABI checks. This adds a new binary ‘check_kabi‘ upon compilation, which can be used during the linux kernel build process to generate the checksums for all the exported symbols. These checksums are stored in Module.symvers file which is generated during the build process if the variable CONFIG_MODVERSIONS is set in the .config file.

What purpose does spartakus serve?
In an earlier post I had spoken a bit on exported symbols constituting the kernel ABI and how its stability can be kept track of through CRC checksums. genksyms has been the tool that had been doing this job of generating the checksums for exported symbols that constitute the kernel ABI so far, but it has several problems/limitations that make the job of developers difficult wrt maintaining the stability of the kernel ABI. Some of these limitations as determined by me are mentioned below:

1.genksyms generates different checksums for structs/unions for declarations which are semantically similar. For example, for the following 2 declarations for the struct ‘list_head’:

struct list_head {
    struct list_head *next, *prev;
};

struct list_head {
    struct list_head *next;
    struct list_head *prev;
};

both declarations are essentially the same and should not result in a change in kABI wrt ‘list_head’. Sparse treats these 2 declarations as semantically the same and different checksums are not generated.

2. For variable declarations with just unsigned/signed specification with no type specified, sparse considers the type as int by default.

Hence,
‘unsigned foo’ is converted to ‘unsigned int foo’ and then processed to get the corresponding checksum. On the other hand, genksyms would generate different checksums for 2 different definitions which are semantically the same.

License
sparse is licensed under the MIT license which is GPL compatible. The added files as mentioned above are under a GPLv2 license since I have used code from genksyms which is licensed under GPLv2.

Source Code
Development on spartakus is in progress and the corresponding source code is hosted on Github.

Lastly, I know there will be at least who will wonder why the name ‘spartakus’. Well I do not totally remember, but it had something to do with sparse and Spartacus. Does sound cool though?

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s