Showing posts with label reversing. Show all posts
Showing posts with label reversing. Show all posts

Monday, September 12, 2016

LuaBot: Malware targeting cable modems

During mid-2015 I disclosed some vulnerabilities affecting multiple ARRIS cable modems. I wrote a blogpost about ARRIS' nested backdoor and detailed some of my cable modem research during the 2015 edition from NullByte Security Conference.

CERT/CC released the Vulnerability Note VU#419568 and it got lots of media coverage. I did not provide any POC's during that time because I was pretty sure that those vulnerabilities were easily wormable... And guess what? Someone is actively exploiting those devices since May/2016.

The malware targets Puma 5 (ARM/Big Endian) cable modems, including the ARRIS TG862 family. The infection happens in multiple stages and the dropper is very similar to many common worm that targets embedded devices from multiple architectures. The final stage is an ARMEB version from the LuaBot Malware.


The ARMEL version from the LuaBot Malware was dissected on a blogpost from Malware Must Die, but this specific ARMEB was still unknown/undetected for the time being. The malware was initially sent to VirusTotal on 2016-05-26 and it still has a 0/0 detection rate.



Cable Modem Security and ARRIS Backdoors

Before we go any further, if you want to learn about cable modem security, grab the slides from my talk "Hacking Cable Modems: The Later Years". The talk covers many aspects of the technology used to manage cable modems, how the data is protected, how the ISPs upgrade the firmwares and so on.


Pay special attention to the slide #86:


I received some reports that malware creators are remotely exploiting those devices in order to dump the modem's configuration and steal private certificates. Some users also reported that those certificates are being sold for bitcoin to modem cloners all around the world. The report from Malware Must Die! also points that the LuaBot is being used for flooding/DDoS attacks.


Exploit and Initial Infection

Luabot malware is part of a bigger botnet targeting embedded devices from multiple architectures. After verifying some infected systems, I noticed that most cable modems were compromised by a command injection in the restricted CLI accessible via the ARRIS Password of The Day Backdoor.

Telnet honeypots like the one from nothink.org have been logging these exploit attempts for some time. They are logging many attempts to bruteforce the username "system" and the password "ping ; sh", but they're, in fact, commands used to escape from the restricted ARRIS telnet shell.


The initial dropper is created by echoing shell commands to the terminal to create a standard ARM ELF.


I have cross compiled and uploaded a few debugging tools to my cross-utils repository, including gdbserver, strace and tcpdump. I also happen to have a vulnerable ARRIS TG862 so I can perform dynamic analysis in a controlled environment.

If you run the dropper using strace to monitor the network syscalls, you can see the initial connection attempt:

./strace -v -s 9999 -e poll,select,connect,recvfrom,sendto -o network.txt ./mw/drop
connect(6, {sa_family=AF_INET, sin_port=htons(4446), sin_addr=inet_addr("46.148.18.122")}, 16) = -1 ENODEV (No such device)

The command is a simple download and exec ARMEB shellcode. The malicious IP 46.148.18.122 is known for bruteforcing SSH servers and trying to exploit Linksys router command injections in the wild. After downloading the second stage malware, the script will echo the following string:
echo -e 61\\\\\\x30ck3r

This pattern is particularly interesting because it is quite similar to the one reported by ProtectWise while Observing Large-Scale Router Exploit Attempts:
cmd=cd /var/tmp && echo -ne \\x3610cker > 610cker.txt && cat 610cker.txt

The second stage binary ".nttpd" (MD5 c867d00e4ed65a4ae91ee65ee00271c7) performs some internal checks and creates iptables rules allowing remote access from very specific subnets and blocking external access to ports 8080, 80, 433, 23 and 22:


These rules block external exploit attempts to ARRIS services/backdoors, restricting access to networks controlled by the attacker.

After setting up the rules, two additional binaries were transferred/started by the attacker. The first one, .sox.rslv (889100a188a42369fd93e7010f7c654b) is a simple DNS query tool based on udns 0.4.



The other binary, .sox (4b8c0ec8b36c6bf679b3afcc6f54442a), sets the device's DNS servers to 8.8.8.8 and 8.8.4.4 and provides multiple tunneling functionalities including SOCKS/proxy, DNS and IPv6.



Parts of the code resembles some shadowsocks-libev functionalities and there's an interesting reference to the whrq[.]net domain, which seems to be used as a dnscrypt gateway:



All these binaries are used as auxiliary tools for the LuaBot's final stage, arm_puma5 (061b03f8911c41ad18f417223840bce0), which seems to be selectively installed on vulnerable cable modems.

UPDATE: According to this interview with the supposed malware author, "reversers usually get it wrong and say there’s some modules for my bot, but those actually are other bots, some routers are infected with several bots at once. My bot never had any binary modules and always is one big elf file and sometimes only small <1kb size dropper"


Final Stage: LuaBot

The malware's final stage is a 716KB ARMEB ELF binary, statically linked, stripped and compiled using the same Puma5 toolchain as the one I made available on my cross-utils repository.


If we use strace to perform a dynamic analysis we can see the greetings from the bot's author and the creation of a mutex (bbot_mutex_202613). Then the bot will start listening on port 11833 (TCP) and will try to contact the command and control server at  80.87.205.92.


In order to understand how the malware works, let's mix some manual and dynamic analysis. Time to analyse the binary using IDA Pro and...

Reversing stripped binaries

The binaries are stripped and IDA Pro's F.L.I.R.T. didn't recognize standard function calls for our ARMEB binary. Instead of spending hours manually reviewing the code, we can use @matalaz's diaphora diffing plugin to port all the symbols.

First, we need to export the symbols from uClibC's Puma5 toolchain. Download the prebuilt toolchain here and open the library "armeb-linux\ti-puma5\lib\libuClibc-0.9.29.so" using IDA Pro. Choose File/Script File (Alt+F7), load diaphora.py, select a location to Export IDA Database to SQLite, mark "Export only non-IDA generated functions" and hit OK.

When it finishes, close the current IDA database and open the binary arm_puma5. Rerun the diaphora.py script and now choose a SQLite database to diff against:


After a while, it will show various tabs with all the unmatched functions in both databases, as well as the "Best", "Partial" and "Unreliable" matches tabs.

Browse the "Best matches" tab, right click on the list and select "Import *all* functions" and choose not to relaunch the diffing process when it finishes. Now head to the "Partial matches" tab, delete everything with a low ratio (I removed everything below 0.8), right click in the list and select "Import all data for sub_* function":


The IDA strings window display lots of information related to the Lua scripting language. For this reason, I also cross-compiled Lua to ARMEB, loaded the "lua" binary into IDA Pro and repeated the diffing process with diaphora:


We're almost done now. If you google for some debug messages present on the code, you can find a deleted Pastebin that was cached by Google.



I downloaded the C code (evsocketlib.c), created some dummy structs for everything that wasn't included there and cross-compiled it to ARMEB too. And now what? Diffing again =)



Reversing the malware is way more legible now. There's builtin Lua interpreter and some native code related to event sockets. The list of the botnet commands is stored at 0x8274: bot_daemonize, rsa_verify, sha1, fork, exec, wait_pid, pipe, evsocket, ed25519, dnsparser, struct, lpeg, evserver, evtimer and lfs:


The bot starts by setting up the Lua environment, unpacks the code and then forks, waiting for instructions from the Command and Control server. The malware author packed the lua source code as a GZIP blob, making the entire reversing job easier for us, as we don't have to deal with Lua Bytecode.


The blob at 0xA40B8 contains a standard GZ header with the last modified timestamp from 2016-04-18 17:35:34:


Another easy way to unpack the lua code is to attach the binary to your favorite debugger (gef, of course) and dump the process memory (heap).

First, copy gdbserver to the cable modem, run the malware (arm_puma5) and attach the debugger to the corresponding PID:
./gdbserver --multi localhost:12345 --attach 1058


Then, start gef/GDB and attach it to the running server:
gdb-multiarch -q
set architecture arm
set endian big
set follow-fork-mode child
gef-remote 192.168.100.1:12345


Lastly, list the memory regions and dump the heap:
vmmap
dump memory arm_puma5-heap.mem 0x000c3000 0x000df000


That's it, now you have the full source code from the LuaBot:


The LuaBot source code is composed of several modules:


The bot settings, including the DNS recurser and the CnC settings are hardcoded:


The code is really well documented and it includes proxy checking functions and a masscan log parser:


Bot author is seeding random with /dev/urandom (crypgtographers rejoice):


LuaBot integrates an embedded JavaScript engine and executes scripts signed with the author's RSA key:


Meterpreter is so 2000's, the V7 JavaScript interpreter is named shiterpreter:


There's a catchy function named checkanus.penetrate_sucuri, on what seems to be some sort of bypass for Sucuri's Denial of Service (DDoS) Protection:



LuaBot has its own lua resolver function for DNS queries:


Most of the bot capabilities are in line with the ones described on the Malware Must Die! blogpost. It's interesting to note that the IPs from the CnC server and iptables rules don't overlap, probably because they're using different environments for different bot families (or they were simply updated).

I did not analise the remote botnet structure, but the modular approach and the interoperability of the malware indicates that there's a professional and ongoing effort.


Conclusion

The analysed malware doesn't have any persistence mechanism to survive reboots. It wouldn't try to reflash the firmware or modify volatile partitions (NVRAM for example), but the first stage payload restricts remote access to the device using custom iptables rules.

This is a quite interesting approach because they can quickly masscan the Internet and block external access to those IoT devices and selectively infect them using the final stage payloads.

On 2015, when I initially reported about the ARRIS backdoors, there were over 600.000 vulnerable ARRIS devices exposed on the Internet and 490.000 of them had telnet services enabled:
If we perform the same query nowadays (September/2016) we can see that the number of exposed devices was reduced to approximately 35.000:
I know that the media coverage and the security bulletins contributed to that, but I wonder how much of those devices were infected and had external access restricted by some sort of malware...

The high number of Linux devices with Internet-facing administrative interfaces, the use of proprietary Backdoors, the lack of firmware updates and the ease to craft IoT exploits make them easy targets for online criminals.

IoT botnets are becoming a thing: manufacturers have to start building secure and reliable products, ISPs need to start shipping updated devices/firmwares and the final user has to keep his home devices patched/secured.

We need to find better ways to detect, block and contain this new trend. Approaches like the one from SENRIO can help ISPs and Enterprises to have a better visibility of their IoT ecosystems. Large scale firmware analysis can also contribute and provide a better understanding of the security issues for those devices.


Indicators of Compromise (IOCs)

LuaBot ARMEB Binaries:
  • drop (5deb17c660de9d449675ab32048756ed)
  • .nttpd (c867d00e4ed65a4ae91ee65ee00271c7)
  • .sox (4b8c0ec8b36c6bf679b3afcc6f54442a)
  • .sox.rslv (889100a188a42369fd93e7010f7c654b)
  • .arm_puma5 (061b03f8911c41ad18f417223840bce0)

GCC Toolchains:
  • GCC: (Buildroot 2015.02-git-00879-g9ff11e0) 4.8.4
  • GCC: (GNU) 4.2.0 TI-Puma5 20100224

Dropper and CnC IPs:
  • 46.148.18.122
  • 80.87.205.92

IP Ranges whitelisted by the Attacker:
  • 46.148.18.0/24
  • 185.56.30.0/24
  • 217.79.182.0/24
  • 85.114.135.0/24
  • 95.213.143.0/24
  • 185.53.8.0/24

Monday, February 9, 2015

Firmware Forensics: Diffs, Timelines, ELFs and Backdoors

This post covers some common techniques that I use to analyze and reverse firmware images. These techniques are particularly useful to dissect malicious firmwares, spot backdoors and detect unwanted modifications.

Backdooring and re-flashing firmware images is becoming mainstream: malicious guys are infecting embedded devices and inserting trojans in order to achieve persistence. Recent articles covered the increasing number of trojanized android firmwares and routers that are being permanently modified.

Attackers with a privileged network position may MITM your requests and forge fake updates containing malicious firmwares. Writing Evilgrade modules for this is really simple, as most vendors keep failing to deliver updates securely, right ASUS?

All your HTTP packets are belong to us...
Older versions of ASUS firmwares were vulnerable to MITM attacks (CVE-2014-2718) because it transmitted updates over HTTP and there were no security/signature checks. ASUS silently patched the issue on 3.0.0.4.376+ and they're now verifying RSA signatures via /sbin/rsasign_check.:

Valid signature -> nvram_set("rsasign_check", "1")

NoConName 2014 CTF Finals: Vodka

I'll keep my tradition of writing posts based on CTF challenges because everybody upvotes CTF posts on reddit it's cool.

The challenge "Vodka", from NoConName 2014 CTF Finals was created by @MarioVilas, who kindly provided the files here (thanks dude!).

I did not participate on the CTF finals, but I found the challenge really interesting because there were many different ways to solve it, summarizing the actions needed to audit a compromised firmware. In my opinion, the best CTF challenges are the ones that require us to develop/use new techniques and improve existing tools.

NoConName 2014 Finals: Vodka
Challenge Category: Forensics
Description: No hints :( just get the flag.

This challenge description is not very intriguing, so I hired a couple of marketing specialists to design a new logo add some Infosec drama and reformulate it:

A mysterious bug affected one of the core routers at a major Internet service provider in Syria. The failure of this router caused the whole country to suddenly lose all connection to the Internet. The Syrian government recorded a traffic capture right before the crash and hired you to perform a forensic analysis.

Download provided: https://github.com/MarioVilas/write-ups/blob/master/ncn-ctf-2014/Vodka/vodka


Network Forensics

The download provided is a packet capture using the PCAP-NG format. Wireshark is too mainstream, so let's convert the PCAP-NG to PCAP and open it using Network Miner:



Network Miner makes it very easy for us to understand what's going on: there's some sort of file transfer via TFTP and the filename seems to be related to an OpenWRT firmware image.


Firmware structure

We always binwalk all the things but very few people stop to analyze and understand the firmware structure properly. We know that the firmware image was downloaded using TFTP, a common way used by many routers to transfer config files/updates and it is probably based on the OpenWRT project.

So what does binwalk tell us?



The Commom Firmware Environment (CFE) is a firmware interface/bootloader present on Broadcom SOCs. It is analogous to the BIOS on PC platforms and it is responsible for CPU initialization and bootstrap code on embedded processors. The CFE is also referred as PMON and it is generally mapped to mtd0.

The JFFS2/NVRAM filesystem is the non-volatile partition. They store all the configuration parameters, including router settings, passwords and logs.

Bear in mind firmware updates generally do not include the CFE/NVRAM partition. You can access the CFE console using serial and you can also dump them on a live system using DD or via SPI. Let's focus on the firmware sections included on the provided image (openwrt-wrtsl54gs-squasfs.bin):

TRX (Offset 0x20)

The TRX header is just an encapsulation, describing a series of information from the firmware, including the image size, CRC, flags, version information and partition offsets. Binwalk wasn't recognizing the header and the relative offsets properly so I submitted these two pull requests. Creating custom signatures for binwalk is pretty straightforward.


Some firmwares (like the newer ones from ASUS and Netgear) use this TRX structure but don't include a loader: the Linux Kernel and the RootFS may be shifted on this occasion.


If the firmware includes any extra header before the TRX, you have to sum their size with the displayed partition offsets in order to find the real values. Some firmwares for SOHO modems out there won't include it, so these values should be right on most cases. The downloaded OpenWRT image had the following offsets:

  • Loader: 0x20 + 0x1C = 0x3C
  • Kernel:  0x20 + 0x8D8 = 0x8F8
  • RootFS: 0x20 + 0x7E400 = 0x7E420

In this specific case, we have a BinHeader right before the TRX, indicating the board ID, the FW Date and the Hardware Date. The struct is described on cyutils.h:

This extra header appears on a few routers like the WRT54G series: the Web GUI checks for this pattern before actually writing the firmware.



We are particularly interested on the fwdate field (Firmware Date), composed by the hex values 07 02 03. According to addpattern.c, the first byte defines the year, the second one is the month and the third byte refers to the day the firmware was created. The fwdate seems to be 03-February-2007, save that for later, we will need that =)

GZ'd LZMA Loader (Offset 0x3C)

According to OpenWRT Wiki, the boot loader has no concept of filesystems: it assumes that the start of the TRX data section is executable code.

The boot loader boots into an LZMA program which decompresses the kernel into RAM and executes it. It turns out the boot loader does know gzip compression, so we have a gzip-compressed LZMA decompression program at 0x3C.

You can find the source code for this lzma-loader here and here. Note the TEXT_START offset at 0x80001000: we may need to adjust the Loading Address on our Disassembler in order to reverse the compiled loader. Don't forget to decompress it (gunzip) before reversing the file.


Most embedded toolchains would strip the binaries in order to reduce the firmware size. If you want to reverse a friendlier version of the loader, grab the latest OpenWRT ImageBuilder and search for loader.elf:


Woohoo, blue code =)

Note that if we modify the loader to include a backdoor, we would have our very own Router Bootkit, cool isn't it?

LZMA'd Kernel (Offset 0x8F8)

Instead of just putting a kernel directly onto flash, most embedded devices compress the kernel using LZMA. The boot loader boots into an LZMA program which decompresses the kernel into RAM and executes it.

Binwalk has a signature to find Kernel strings in raw Linux Kernels. The identified string lists the toolchain used to compile the Kernel, as well as the compiled date and version information:


And why did binwalk manage to find all these information from the Kernel? The answer can be found on the toolchain's Makefile:

If we follow the steps from my previous post we can build a customized Kernel for OpenWRT. The generated vmlinux is generally an ELF file, but in our case, the object was stripped using objcopy:

Did you notice the compile date was 03-February-2007? Let's save that for later as well.

SquashFS (Offset 0x72420)

The last part is the actual filesystem. Most embedded Linux devices use SquashFS and many vendors hack it in order to get better compression and faster performance. Hopefully we don't have to worry about that as Sasquatch handles different SquashFS header/compression formats.

The filesystem has the standard OpenWRT directories and files, including a banner from the 0.9 build (White Russian).


Both binwalk and sasquatch display the SquashFS superblock information, including the creation/last append time:


Did you spot the date 29-October-2014? There's definitely something going on here =)


Directory Tree Diff & Fuzzy Hashing

Now that we have unpacked & unsquashed the firmware, let's use binwally to compare the directory tree and find the needle in the haystack.

After googling the filename (openwrt-wrtsl54gs-squashfs.bin), we get three possible candidates:

https://downloads.openwrt.org/whiterussian/0.9/default/openwrt-wrtsl54gs-squashfs.bin
https://downloads.openwrt.org/whiterussian/0.9/micro/openwrt-wrtsl54gs-squashfs.bin
https://downloads.openwrt.org/whiterussian/0.9/pptp/openwrt-wrtsl54gs-squashfs.bin

OpenWRT offers different builds for the same device because of constraints like limited flash size. Let's download these three candidates, unpack and compare them:

binwally.py ctf/_openwrt-wrtsl54gs-squashfs.bin.extracted/ micro/_openwrt-wrtsl54gs-squashfs.bin.extracted/


The "micro" build has the highest overall match score (99%), let's spot the differences:

binwally.py ctf/_openwrt-wrtsl54gs-squashfs.bin.extracted/ micro/_openwrt-wrtsl54gs-squashfs.bin.extracted/ | grep -E -v "ignored|matches"


After carefully reviewing these files, we notice that the "/etc/profile" was modified to include a call to the nc backdoor.


The LZMA'd Kernel (offset 0x8F8) is the same on both images, even though binwally reports a difference. This happens because binwalk extraction doesn't know when to stop and both files also contain additional data like the SquashFS partition.

The backdoor located at "/bin/nc" is a simple bash script that checks the MD5 from "/etc/profile" and draws a Nyan Cat along with the challenge key. In order to get the proper key, we simply modify the file location to the relative path "./etc/banner", to avoid overlapping with the file from the original system.


After running the file, we get the key NCNdeadb6adec4c77a40c23e04770924d3c5b18face.


This was just too easy right? But what if we didn't have a known template for comparison?


Timeline Analysis

My tool of choice to perform timeline analysis is Plaso, created by @el_killerdwarf. The tool is python-based, modular and very fast. What I like most about it is the ease to output results to ELK. If you don't know about Plaso and the ELK stack, read this quick tutorial and set up your environment.

Let's use log2timeline to create a dump file, pointing to the extracted SquashFS path:

log2timeline.py output.dump squashfs-root/


Let's fire up psort and include data in the timeline:

psort.py -o elastic output.dump


That's all, Plaso uses the filestat parser to extract metadata from the files, outputting results to Elasticsearch.

We already identified the following dates from the firmware:

  • 03 February 2007 (??:??:??): BinHeader firmware creation date
  • 03 February 2007 (13:16:08): Linux Kernel compile date
  • 29 October  2014 (16:53:25): SquashFS creation or last append time

First let's filter the filesystem attributes: we just want to display the mtime (modified) timestamp, so we are going to perform a micro analysis to include the value. The filter should be something like this: field must | field timestamp_desc | query: "mtime".


The histogram view is very helpful to get a big picture of what's going on:


We can clearly see that the files included/modified on 2014-10-29 had a malicious nature. The state sponsored attacker did not modify other files from the OpenWRT base image.


At this point it is pretty clear that the firmware was modified using the OpenWRT Image Builder, which is a pre-compiled OpenWrt build environment. The BinHeader and the Kernel timestamps were left untouched and the only partition modified was the SquashFS one.

Of course these timestamps, like any kind of metadata, could be tampered by the malicious hacker. However, they are very helpful during the initial phases, speeding up investigations and narrowing the analysis to a smaller set of data.


ELF Structural Information

I always get impressed when AV vendors manage to profile APT and State-sponsored attackers based on PE timestamps. Techniques like the imphash are generally used exclusively on Windows.

PE Imports are the functions that a piece of software calls from other files (typically DLLs). To track these imports, a hash is created based on library/API names and their specific order within the executable. Because of the way a PE’s import table is generated, we can use the imphash value to identify related malware samples, for example.

Everybody does that for Windows binaries but what about Linux? Virustotal recently included detailed ELF information on their engine. We can also use these sections to identify useful information from the binaries, including the toolchain used to compile them.

We generally don't have any timestamp information on the ELF section, but there are many other interesting fields. This quick guide on using strip summarizes some topics:

When an executable is produced from source code, there are two stages - compilation and linking. Compiling takes a source file and produces an object file. Linking concatenates these object files into a single executable. The concatenation occurs by section. For example, the .comment section for the final executable will contain the contents of the .comment section of each object file that was linked into the executable.
If we examine the contents of the .comment section we can see the compiler used, plus the version of the compiler
It's pretty simple to read and parse the .comment sections from ELF files. GNU readelf (part of binutils) and pyelftools include all the necessary functions parse them.

I always try to display information from object files using different toolchains in order to find out which one understands the file structure properly. On this specific case, I'm going to use mipsel-linux-gnu-readelf (part of Emdebian toolchain), but the regular readelf also does the job.

for i in $(find .) ; do echo $i ; mipsel-linux-gnu-readelf -p .comment $i ; done > comment-section.txt
./lib/modules/2.4.30/diag.o

String dump of section '.comment':
  [     1]  GCC: (GNU) 3.4.4 (OpenWrt-1.0)

./lib/modules/2.4.30/switch-adm.o

String dump of section '.comment':
  [     1]  GCC: (GNU) 3.4.4 (OpenWrt-1.0)

./lib/modules/2.4.30/switch-robo.o

String dump of section '.comment':
  [     1]  GCC: (GNU) 3.4.4 (OpenWrt-1.0)

./lib/modules/2.4.30/switch-core.o

String dump of section '.comment':
  [     1]  GCC: (GNU) 3.4.4 (OpenWrt-1.0)

./lib/modules/2.4.30/wlcompat.o

String dump of section '.comment':
  [     1]  GCC: (GNU) 3.4.4 (OpenWrt-1.0)

Just a few ELF files included the comment section, others got stripped during the compilation/linking phase. If we download OpenWRT 0.9 sources we can see that GCC 3.4.4 was indeed used:


TheMoon Worm exploited a command injection to infect Linksys wireless routers with a self-replicating malware. If we analyze its .comment section, we can see that it was probably compiled and linked using GCC 4.2.4 and 3.3.2. If we search for a .comment section on the router E4200, targeted by the worm, we can't find any reference because the toolchain stripped all of them. Having a file compiled with a different toolchain or containing extra ELF sections (that others files don't) is something highly suspicious.


The .comment section for the final executable includes the contents of the .comment section of each object file that was linked into the executable. If we compare the comment section on ASUS RT-AC87U Firmwares v3.0.0.4.378.3885 and v3.0.0.4.376.2769, we can spot an extra line on the newer version from tfat.ko:


If you want to dump all sections from the ELF file you may use this command line (kind of hacky, but works):

for i in $(find .) ; do echo "$i" ; for j in $(readelf -S "$i" | grep \\[ | cut -d"]" -f2 | cut -d " " -f2 | grep -v "Name") ; do mipsel-linux-gnu-readelf -p "$j" "$i" ; done ; done > list.txt

The output will be a bit too verbose, you may want to narrow the analysis to the following sections:

  • .comment - contains version control information
  • .modinfo - displays information from a kernel module
  • .notes - comments put there by the compiler/linker toolchain
  • .debug - contains information for symbol debugging
  • .interp - contains the name of the dynamic loader

For more information regarding the ELF file structure, check the ELF man and the Chapter 5 from Malware Forensics Field Guide for Linux Systems.


Conclusion

Without further clues or context these information may not be relevant, but in conjunction with other data they're helpful to get a big picture of what's going on:

  • Diffing the content from previous firmwares may be useful to find out when backdoors were first installed, modified and/or removed.

  • Artifact timeline creation and analysis also helps to speed up investigations by correlating the vast amount of information found on system.

  • The contents from the ELF section will likely reveal the toolchain and the compiler version used to compile a suspect executable. Clues such as this are attribution identifiers, contributing towards identifying the platform used by the attacker to craft his code.

We can use the timestamps from the kernel partition to correlate different firmwares from the same family, for example. We can also compare the timestamps from each partition to find deviations: a firmware header created on 2007, with a Kernel timestamp from 2007 and a SquashFS partition dated to 2014 is highly suspicious.

The Firmware.RE project is performing a large scale analysis, providing a better understanding of the security issues related to firmwares. A broader view on firmwares is not only beneficial, but necessary to discover new vulnerabilities and backdoors, correlating different device families and showing how vulnerabilities reappear across different products. This is a really cool project to track how firmwares are evolving and getting security fixes.