You should submit the following TWO files:
For each question, include the following (where applicable or otherwise unspecified):
Note that, depending on the question, some of the above requirements will not necessarily exist. Also, #1 and #2 from above can often be combined into a single screencap. Also note that some questions may explicitly state what to submit. This is to ensure you submit the correct information we're looking for, and the other requirements are expected as well (see list above).
These requirements may seem tedious and unnecessary; however, they are useful for markers to see that you completed each question, explained that you understood the question, and provided proof that the task was successfully completed.
When in doubt, explain as much as you can. I need to see that you understand the answer and the process you used to get the answer. Not including an explanation or providing too little explanation may result in lost marks.
johnsmith-assignment2.pdf
Any code included in the tar archive is for the TA to test, and should therefore be submitted in a format that is easy to run/compile, e.g., with an appropriate directory structure (if needed), required Makefiles, etc.
Code that you include in your report is for the TA to read, and may therefore be formatted to best explain what you did, e.g., by sectioning the code, highlighting lines of interest and including additional comments, etc.
SUBMITTING CODE WITHOUT EXPLANATION WILL RESULT IN SEVERE PENALTY!
This assignment will be done on the SCS OpenStack platform (please consult the General Page for further details and instructions). The name of the VM snapshot for this assignment is comp4108-w24-assignment-02. To protect your VM against abuse, you will automatically be prompted to choose a new password upon your first login; please do so, especially if you enable SSH access.
At the end of Assignment 1 you saw how you could exploit a setuid binary with a race condition in order to gain root privilege on the box. Now what? If you were a real attacker it could be a matter of minutes before the system admin patches the vulnerable program and gives you the boot. How do you hide your tracks? How do you install a backdoor that ensures you aren't a one-trick pony?
The history command. You can use it to refresh your memory on commands you may have entered.
The script command, which records input and output to a file automatically. See man script for more information.
The answer to both questions (at least as far as this assignment is concerned) is a rootkit. A garden variety Linux rootkit is generally written as a Loadable Kernel Module, or LKM. An LKM allows the system administrator to load new code to extend the kernel's functionality while the machine is running. Many device drivers are implemented as LKMs.
Only root can insert and remove modules (using insmod and rmmod respectively), but it just so happens you're root today. Lucky you. There is a free guide to programming Linux Kernel Modules available online. This guide explains benign kernel module functionality, and you'll likely want to read (or at least skim) Sections 1, 2, 3, and 8.
Rootkit LKMs alter the state of the system to present sanitized information to processes interacting with the kernel, or to add new functionality convenient for an attacker. A classic way this is done is by hooking system calls. For an idea of what system calls a process invokes, you should revisit your use of the strace command in Assignment #1. The guide to programming Linux Kernel Modules introduces syscall hooking briefly.
To hide files from directory listings, for instance, you would find the syscall that ls uses to get filesystem directory entries and hook it. By hooking syscalls related to the filesystem a malicious rootkit might hide all files with a $sys$ prefix, allowing it to stash its own files from the system. Rootkits also frequently serve as backdoors that allow a user to elevate their privileges, or get a remote shell without logging in.
You no doubt saw the grave warnings affixed to both the getdents man page and the LKM guide section on hooking syscalls. Not only are these warnings correct, things are worse than you might imagine. These techniques are dangerous, and often unreliable. No sane engineer would design their device driver in this fashion; we're hacking in the truest sense of the word.
The details in the LKM guide are unfortunately specific to Linux Kernel versions < 2.6.x and a 32bit architecture. We're living in 2024 and running Linux Kernel version 5.4.x on a 64bit architecture. This affects us in two major ways:
The sys_call_table symbol is no longer exported by the kernel to LKMs. This is to prevent developers from doing stupid things with it. We're going to have to find the address manually so that we can do stupid things with it.
The memory page where sys_call_table lives is now marked read only to prevent things from going wrong. We can't write a new hook into the table without first making the page writable, thereby allowing things to go wrong.
Writing a rootkit from scratch is going to be a grueling endeavour. Thankfully your connections in the underground have hooked you up with some super eleet warez. With their C code you should be able to write a respectable piece of kernel malware without losing your mind. Unfortunately your hookup only got you so far. The code's author must have uploaded it before it was completely finished. It looks like you'll have to pick up where they left off...
You're writing code that runs in kernel space, with full privileges. The slightest mistake in your code is going to lead to legitimately weird things happening including (but not limited to):
Don't keep anything on your VM you aren't ready to lose! Keep your code on your own machine and copy it over to compile/test.
You're going to want to work in very small, verifiable steps. Do not attempt to sit down and program the whole assignment. Instead, start with very small steps in mind and progress further only when you get that step working.
For example, in the file hiding task: start by figuring out what to hook, then try hooking it and keeping the original behavior intact. Once you can do that without crashing your VM, try printing all filenames in a directory to syslog from your hook. Once that's working, start writing code to identify the files you want to hide among those being printed to syslog. Finally, attempt to remove the entry from the results.
/usr/src/linux-5.4.0-171.
wget command. THE USERNAME AND PASSWORD CAN BE FOUND IN A POST IN THE "Announcements" DIRECTORY ON BRIGHTSPACE.
Run sudo bash to give yourself a bash shell with root privileges. We'll pretend that you got this from the race condition in A1. For most of this assignment you're going to be switching between a root user and a normal user, so I recommend you keep two windows open (the gurus might want to try the screen tool, or a terminal multiplexer with a somewhat steep learning curve).
Find the sys_call_table symbol by inspecting /proc/kallsyms.
Edit the rootkit.c file to provide the right symbol as an argument to kallsyms_lookup_name() in the get_syscall_table_bf() function. It should be the same as the symbol you found in Q3.
Build the rootkit by running make. You can safely ignore the warning about defined but not used variables, as you will be fixing that as you complete the assignment.
Insert the module by running ./insert.sh as root. Ensure it was inserted by running lsmod and by checking the syslog.
Remove the module by running ./eject.sh as root. Ensure it was ejected by running lsmod and by checking the syslog.
Modify the code so that the open() hook works. Look for the TODO markers. Show a snippet of the syslog output it generates once loaded.
Use tail /var/log/syslog to display the last few lines of the syslog. You may also want to try tail -f /var/log/syslog to interactively tail the syslog file. In interactive mode, your terminal updates immediately as new lines are printed to the log. Press ctrl-c (that is, ctrl and then c) to end the tail command and get back to the shell.
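Searching /proc/kallsyms by hand works, but the lookup can also be sketched in a few lines of userland C. The following is only an illustration: the function names match_kallsyms_line and find_symbol are hypothetical and are not part of the provided framework. It assumes the usual kallsyms line layout of a hexadecimal address, a one-character type, and a symbol name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse one line in /proc/kallsyms format ("<hex address> <type> <name>").
 * Returns 1 and stores the address in *addr if the line names `wanted`,
 * otherwise returns 0. (Hypothetical helper, for illustration only.)
 */
int match_kallsyms_line(const char *line, const char *wanted,
                        unsigned long long *addr)
{
    unsigned long long a;
    char type;
    char name[256];

    assert(line != NULL && wanted != NULL && addr != NULL);
    if (sscanf(line, "%llx %c %255s", &a, &type, name) != 3)
        return 0;                 /* not a well-formed kallsyms line */
    if (strcmp(name, wanted) != 0)
        return 0;                 /* exact match only, no substrings */
    *addr = a;
    return 1;
}

/* Scan a kallsyms-style file for `wanted`. Returns 1 if it was found. */
int find_symbol(const char *path, const char *wanted,
                unsigned long long *addr)
{
    char line[512];
    int found = 0;
    FILE *f = fopen(path, "r");

    if (f == NULL)
        return 0;
    while (!found && fgets(line, sizeof line, f) != NULL)
        found = match_kallsyms_line(line, wanted, addr);
    fclose(f);
    return found;
}
```

Calling find_symbol("/proc/kallsyms", "sys_call_table", &addr) from a small main() would report the address. Matching the whole name matters because other symbols contain sys_call_table as a substring. When the file is read without root privileges, the addresses are typically shown as zeros.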
Hook the execve syscall using the framework code from Part A. Consult the execve man page to learn the details and function signature of execve(). You will need to know which __NR_X define is used to find the offset in sys_call_table to hook for execve (where X will vary syscall to syscall). You might find https://elixir.bootlin.com/linux/v5.4.171/source/arch/sh/include/uapi/asm/unistd_64.h useful in this regard.
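One quick way to check an __NR_X value is to print it from userland, since the same defines are exposed through <sys/syscall.h>. This is a minimal sketch and not part of the framework; the function names are hypothetical, and it assumes the userland headers on your VM match the running kernel.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/syscall.h>

/* Index into sys_call_table for execve, taken from the userland headers. */
long execve_index(void)
{
    return __NR_execve;
}

/* Index into sys_call_table for getdents64. */
long getdents64_index(void)
{
    return __NR_getdents64;
}

/* Format both indices into buf; returns the snprintf() result. */
int describe_indices(char *buf, size_t size)
{
    assert(buf != NULL);
    return snprintf(buf, size, "execve=%ld getdents64=%ld",
                    execve_index(), getdents64_index());
}
```

On x86-64 these defines are 59 for execve and 217 for getdents64; other architectures use different numbers, so confirm the values on the assignment VM before relying on them.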
Have your hook log the name of the file being executed and the effective UID of the caller using printk. Example output:
Jan 28 20:49:17 COMP4108-A2 kernel: [81423.749198] Executing /usr/bin/tail
Jan 28 20:49:17 COMP4108-A2 kernel: [81423.749200] Effective UID 0
Jan 28 20:49:19 COMP4108-A2 kernel: [81425.950497] Executing /bin/ls
Jan 28 20:49:19 COMP4108-A2 kernel: [81425.950499] Effective UID 1000
The current_* macros defined in the https://elixir.bootlin.com/linux/v5.4.171/source/include/linux/cred.h include will help you get the information you need to include in your printk message.
Given the architecture reported by uname -a, we can find the corresponding argument registers for x86-64 by looking at the second table in the Architecture Calling Conventions section of the syscall man page, and compare this to how the openat() hook code is able to access the pathname argument.
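The x86-64 row of that table lists the syscall argument registers in the order rdi, rsi, rdx, r10, r8, r9. The sketch below models that mapping in userland C. The struct fake_regs type and the syscall_arg helper are hypothetical stand-ins invented for illustration; how the framework's hooks actually receive their arguments is something to confirm by reading the provided openat() hook.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a saved register set. Only the six registers
 * that carry syscall arguments on x86-64 are modelled here.
 */
struct fake_regs {
    unsigned long di, si, dx, r10, r8, r9;
};

/* Return syscall argument n (0-based) using the x86-64 register order. */
unsigned long syscall_arg(const struct fake_regs *regs, int n)
{
    assert(regs != NULL && n >= 0 && n < 6);
    switch (n) {
    case 0:  return regs->di;   /* 1st argument */
    case 1:  return regs->si;   /* 2nd argument */
    case 2:  return regs->dx;   /* 3rd argument */
    case 3:  return regs->r10;  /* 4th argument */
    case 4:  return regs->r8;   /* 5th argument */
    default: return regs->r9;   /* 6th argument */
    }
}
```

Under this mapping, the first argument of execve() (the pathname) would sit in rdi, while openat() takes its pathname as the second argument, which would sit in rsi.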
When a user's UID matches the root_uid parameter, they are given uid/euid 0 (i.e. root privs). The root_uid parameter must be provided via the insmod command in insert.sh. Note that the root_uid parameter should be set to your user's UID to get root, not root's UID. You will need to add this behaviour.
See the prepare_kernel_cred() and commit_creds() functions.
Set the root_uid param in insert.sh equal to your user's UID, and provide the input/output from:
Running whoami as a normal user in one terminal.
Running ./insert.sh in a second terminal.
Running whoami again and being told you are root.
comp4108@NodeX:/A2/code/rootkit_framework$ whoami
comp4108
comp4108@NodeX:/A2/code/rootkit_framework$ whoami
root
ls and the OS-provided directory abstraction.
Hook the getdents64 system call (see its man page). Once again this will require finding the __NR_* define for the syscall number.
Print each directory entry name returned by getdents64() to syslog using printk. Sample output:
Oct 1 11:44:36 COMP4108-A2 kernel: [ 2266.441674] getdents64() hook invoked.
Oct 1 11:44:36 COMP4108-A2 kernel: [ 2266.441704] entry: rootkit.o
Oct 1 11:44:36 COMP4108-A2 kernel: [ 2266.441706] entry: .rootkit.mod.o.cmd
Oct 1 11:44:36 COMP4108-A2 kernel: [ 2266.441708] entry: ..
Oct 1 11:44:36 COMP4108-A2 kernel: [ 2266.441710] entry: insert.sh
Oct 1 11:44:36 COMP4108-A2 kernel: [ 2266.441711] entry: rootkit.c
Oct 1 11:44:36 COMP4108-A2 kernel: [ 2266.441712] entry: rootkit.mod.c
Oct 1 11:44:36 COMP4108-A2 kernel: [ 2266.441714] entry: rootkit.ko
<snipped>
Modify your hook so that the struct linux_dirent* buffer you return to the calling process does not include any dirents for filenames that start with magic_prefix. The magic_prefix character array should be provided as a kernel module parameter given to insmod in the insert.sh script. You will need to implement this parameter yourself.
After completing your getdents64 hook and implementing the magic_prefix parameter you'll want to test it in action:
Edit the insert.sh script and set the magic_prefix parameter to $sys$
Run make
Create a file called $sys$_lol_hidden.txt in your current directory.
Run ls -l to see if your $sys$_lol_hidden.txt file was created.
Run ./insert.sh as root.
Run the ls -l command again to validate the $sys$_lol_hidden.txt file is no longer included. It shouldn't be in ls -la either (i.e. it isn't just a regular 'hidden' dotfile).
If you use $sys$ as your magic_prefix value you must remember to escape the $s in the bash shell. The easiest way is to use \$ instead of $ when trying to create, edit, delete, or otherwise interact with one of your hidden files.
comp4108@COMP4108-A2:/A2/code/rootkit_framework/test$ touch \$sys\$_lol_hidden.txt
comp4108@COMP4108-A2:/A2/code/rootkit_framework/test$ ls -la
total 8
-rw-rw-r-- 1 comp4108 comp4108 0 Oct 1 11:59 bar.txt
-rw-rw-r-- 1 comp4108 comp4108 0 Oct 1 11:59 baz.txt
-rw-rw-r-- 1 comp4108 comp4108 0 Oct 1 11:59 foo.txt
-rw-rw-r-- 1 comp4108 comp4108 0 Oct 1 12:00 $sys$_lol_hidden.txt
comp4108@COMP4108-A2:/A2/code/rootkit_framework/test$ ls -la
total 8
drwxrwxr-x 2 comp4108 comp4108 4096 Oct 1 12:00 .
drwxrwxr-x 5 comp4108 comp4108 4096 Oct 1 11:59 ..
-rw-rw-r-- 1 comp4108 comp4108 0 Oct 1 11:59 bar.txt
-rw-rw-r-- 1 comp4108 comp4108 0 Oct 1 11:59 baz.txt
-rw-rw-r-- 1 comp4108 comp4108 0 Oct 1 11:59 foo.txt
Manipulating the dirents to hide files is the trickiest bit of the assignment. Luckily it is no more difficult than a typical data structure question (with a few twists).
The directory entries are returned in the struct linux_dirent *dirp buffer provided as the 2nd argument to the getdents syscall. This buffer is allocated by the calling process (i.e. they make sure there is enough memory malloc'd for the struct linux_dirents that the syscall puts into the buffer).
The dirp buffer is not an array of struct linux_dirents of equal size. To save memory each dirent struct is only as big as it needs to be. In order to allow iterating through the dirent structs in the buffer, each dirent struct stores its length to be used as an offset to the next dirent in the buffer (see the figure). You will need to use this knowledge to determine how you can remove a dirent from the buffer. The man page for getdents() has example code for iterating the buffer.
The dirp buffer is userland memory. You cannot edit it directly or bad things will happen. Instead you must first allocate a kernel memory buffer of equivalent size. To do this you must use kmalloc and kfree, not their userland counterparts malloc and free. Once you have a kernel buffer of the right size you can use the copy_from_user and copy_to_user functions to copy the userland buffer into your kernel buffer and vice versa.
Call the original getdents64() syscall with the dirp buffer your hook receives to have it populated with dirent structs.
Allocate a kernel buffer of the same size with kmalloc.
Copy the dirp buffer into your kernel buffer with copy_from_user.
Remove the dirent structs you don't want seen.
Copy the kernel buffer back into the dirp buffer with copy_to_user.
Call kfree() on the kernel buffer to free it and avoid a leak.
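The removal step can be prototyped in userland before touching the kernel. The sketch below is one possible approach, not the expected solution: struct demo_dirent is a hypothetical mirror of the variable-length record layout, and append_entry, hide_prefixed and count_entries are made-up helper names. It works on an ordinary memory buffer, so the kmalloc, copy_from_user, copy_to_user and kfree steps are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical userland mirror of a variable-length dirent record. */
struct demo_dirent {
    uint64_t       d_ino;
    int64_t        d_off;
    unsigned short d_reclen;  /* total size of this record, in bytes */
    unsigned char  d_type;
    char           d_name[];  /* NUL-terminated file name */
};

/* Append a record for `name` at offset `used`; returns the new length. */
size_t append_entry(char *buf, size_t used, const char *name)
{
    struct demo_dirent *d = (struct demo_dirent *)(buf + used);
    size_t reclen = offsetof(struct demo_dirent, d_name) + strlen(name) + 1;

    reclen = (reclen + 7) & ~(size_t)7;  /* keep records 8-byte aligned */
    memset(d, 0, reclen);
    d->d_reclen = (unsigned short)reclen;
    strcpy(d->d_name, name);
    return used + reclen;
}

/* Drop every record whose name starts with `prefix`; returns new length. */
size_t hide_prefixed(char *buf, size_t len, const char *prefix)
{
    size_t plen = strlen(prefix);
    size_t pos = 0;

    assert(buf != NULL && prefix != NULL);
    while (pos < len) {
        struct demo_dirent *d = (struct demo_dirent *)(buf + pos);
        size_t reclen = d->d_reclen;

        if (reclen == 0 || pos + reclen > len)
            break;  /* malformed record: stop rather than loop forever */
        if (strncmp(d->d_name, prefix, plen) == 0) {
            /* Slide the remaining records over the hidden one. */
            memmove(buf + pos, buf + pos + reclen, len - pos - reclen);
            len -= reclen;
        } else {
            pos += reclen;
        }
    }
    return len;
}

/* Count the records in the first `len` bytes of buf. */
size_t count_entries(const char *buf, size_t len)
{
    size_t pos = 0, n = 0;

    while (pos < len) {
        const struct demo_dirent *d = (const struct demo_dirent *)(buf + pos);
        if (d->d_reclen == 0)
            break;
        pos += d->d_reclen;
        n++;
    }
    return n;
}
```

Because getdents() reports the number of bytes it placed in the buffer, a hook built along these lines would presumably need to return the shortened length as well, so that the caller does not read past the last remaining record.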