Recently, we came across some firmware samples from D-Link routers that we were unable to unpack properly. Luckily, we got our hands on an older, cheaper, but similar device (DIR882) that we could analyze more closely. The goal is to find a way around the firmware encryption that was put in place to prevent tampering and static analysis. This series walks through the results and the steps needed to write a custom decryption routine that also works for numerous other models, but more about that later on. First, let's take a look at the problem.
The problem:
The latest D-Link 3060 firmware (at the time of writing) can be downloaded from here. I'll be examining v1.02B03, which was released on 10/22/19. A brief initial analysis shows the following:
> md5sum DIR3060A1_FW102B03.bin
86e3f7baebf4178920c767611ec2ba50 DIR3060A1_FW102B03.bin
> file DIR3060A1_FW102B03.bin
DIR3060A1_FW102B03.bin: data
> binwalk DIR3060A1_FW102B03.bin
DECIMAL HEXADECIMAL DESCRIPTION
--------------------------------------------------------------------------------
> hd -n 128 DIR3060A1_FW102B03.bin
00000000 53 48 52 53 01 13 1f 9e 01 13 1f a0 67 c6 69 73 |SHRS........g.is|
00000010 51 ff 4a ec 29 cd ba ab f2 fb e3 46 2e 97 e7 b1 |Q.J.)......F....|
00000020 56 90 b9 16 f8 0c 77 b8 bf 13 17 46 7b e3 c5 9c |V.....w....F{...|
00000030 39 b5 59 6b 75 8d b8 b0 a3 1d 28 84 33 13 65 04 |9.Yku.....(.3.e.|
00000040 61 de 2d 56 6f 38 d7 eb 43 9d d9 10 eb 38 20 88 |a.-Vo8..C....8 .|
00000050 1f 21 0e 41 88 ff ee aa 85 46 0e ee d7 f6 23 04 |.!.A.....F....#.|
00000060 fa 29 db 31 9c 5f 55 68 12 2e 32 c3 14 5c 0a 53 |.).1._Uh..2..\.S|
00000070 ed 18 24 d0 a6 59 c0 de 1c f3 8b 67 1d e6 31 36 |..$..Y.....g..16|
00000080
So, all we got from the file command is that we have some form of (binary) data file at hand, which is not very useful. Our go-to choice for initial recon, binwalk, is also unable to identify any file sections within the firmware image; it does not even produce false positives. Lastly, the hex dump of the first 128 bytes shows seemingly random data right after the 4-byte "SHRS" marker at offset 0x0. These are indicators of an encrypted image, which an entropy analysis can confirm:
> binwalk -E DIR3060A1_FW102B03.bin
DECIMAL HEXADECIMAL ENTROPY
--------------------------------------------------------------------------------
0 0x0 Rising entropy edge (0.978280)
There's not a single drop in the entropy curve, leaving no room for us to extract any kind of information about the target...
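To make the entropy argument more tangible, here is a minimal Python sketch of a block-wise Shannon entropy measurement, similar in spirit to what binwalk -E reports. The function names and the 1024-byte block size are our own choices for illustration and are not taken from binwalk's implementation.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of `data`, normalised to the range 0.0 .. 1.0."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    # Entropy in bits per byte (0 .. 8), then scaled down to 0 .. 1.
    bits = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return bits / 8.0


def block_entropies(data: bytes, block_size: int = 1024):
    """Return a list of (offset, entropy) tuples, one per block."""
    return [
        (offset, shannon_entropy(data[offset:offset + block_size]))
        for offset in range(0, len(data), block_size)
    ]


def looks_encrypted(data: bytes, threshold: float = 0.95) -> bool:
    """Heuristic (our assumption): every block stays above `threshold`."""
    blocks = block_entropies(data)
    return bool(blocks) and all(e >= threshold for _, e in blocks)
```

Feeding the raw bytes of a firmware image into block_entropies() and looking for blocks that drop well below 1.0 is the manual equivalent of searching for a dip in the entropy curve. Compressed data also scores high, so a uniformly high value is only a hint, not proof of encryption.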
The attempt:
As we were reluctant to buy the D-Link DIR 3060 for around $200, we checked similar, cheaper models from D-Link with the goal of finding at least one alternative that deploys the same encryption scheme. In the end, we came across the D-Link DIR 882, which was considerably cheaper.
On a side note, even if we had not been able to find a similar encryption scheme, looking at different firmware headers could have provided some hints about what their go-to mechanism to 'secure' their firmware looks like.
Once we stumbled upon the DIR 882, we checked the firmware v1.30B10 that was released on 02/20/20, and it shows the same behavior as its big brother, the DIR3060, including the constant entropy of nearly 1. One thing the attentive reader might notice is the same 4-byte sequence at the start, "SHRS". We will come back to that one later.
> md5sum DIR_882_FW120B06.BIN
89a80526d68842531fe29170cbd596c3 DIR_882_FW120B06.BIN
> file DIR_882_FW120B06.BIN
DIR_882_FW120B06.BIN: data
> binwalk DIR_882_FW120B06.BIN
DECIMAL HEXADECIMAL DESCRIPTION
--------------------------------------------------------------------------------
> hd -n 128 DIR_882_FW120B06.BIN
00000000 53 48 52 53 00 d1 d9 a6 00 d1 d9 b0 67 c6 69 73 |SHRS........g.is|
00000010 51 ff 4a ec 29 cd ba ab f2 fb e3 46 fd a7 4d 06 |Q.J.)......F..M.|
00000020 a4 66 e6 ad bf c4 9d 13 f3 f7 d1 12 98 6b 2a 35 |.f...........k*5|
00000030 1d 0e 90 85 b7 83 f7 4d 3a 2a 25 5a b8 13 0c fb |.......M:*%Z....|
00000040 2a 17 7a b2 99 04 60 66 eb c2 58 98 82 74 08 e3 |*.z...`f..X..t..|
00000050 54 1e e2 51 44 42 e8 d6 8e 46 6e 2c 16 57 d3 0b |T..QDB...Fn,.W..|
00000060 07 d7 7c 9e 11 ec 72 1d fb 87 a2 5b 18 ec 53 82 |..|...r....[..S.|
00000070 85 b9 84 39 b6 b4 dd 85 de f0 28 3d 36 0e be aa |...9......(=6...|
00000080
Another thing this firmware confirms for us is that the same crypto scheme is still used in early 2020.
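The similarity between the two headers goes beyond the "SHRS" marker. As a minimal sketch, the following Python snippet compares the first 32 bytes of both hex dumps shown above and reports which byte ranges are identical. The helper name matching_runs is ours, and we deliberately make no claim here about what the individual fields mean.

```python
# First 32 bytes of each image, copied from the hex dumps above.
HDR_DIR3060 = bytes.fromhex(
    "53485253 01131f9e 01131fa0 67c66973"
    "51ff4aec 29cdbaab f2fbe346 2e97e7b1"
)
HDR_DIR882 = bytes.fromhex(
    "53485253 00d1d9a6 00d1d9b0 67c66973"
    "51ff4aec 29cdbaab f2fbe346 fda74d06"
)


def matching_runs(a: bytes, b: bytes):
    """Return half-open (start, end) ranges where `a` and `b` are identical."""
    runs, start = [], None
    for i, (x, y) in enumerate(zip(a, b)):
        if x == y and start is None:
            start = i                      # a new identical run begins
        elif x != y and start is not None:
            runs.append((start, i))        # the run ended at the previous byte
            start = None
    if start is not None:
        runs.append((start, min(len(a), len(b))))
    return runs


for start, end in matching_runs(HDR_DIR3060, HDR_DIR882):
    print(f"0x{start:02x}-0x{end - 1:02x}: {HDR_DIR3060[start:end].hex()}")
```

Apart from the magic bytes, a 16-byte run starting at offset 0x0c is shared by both images, while the 8 bytes in between differ per firmware. That is another hint that both devices use the same container format.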
The solution:
Once we acquired the DIR882, we could open a serial console on the device and look around the file systems for clues and candidates that handle the en-/decryption of firmware updates. (Attaching to the UART console is out of scope for this article and not particularly interesting, as it involves no 'hardware hacking' besides attaching 4 cables.) We could quickly identify a suitable candidate:
> file imgdecrypt
imgdecrypt: ELF 32-bit LSB executable, MIPS, MIPS32 rel2 version 1 (SYSV), dynamically linked, interpreter /lib/ld-, stripped
> md5sum imgdecrypt
a5474af860606f035e4b84bd31fc17a1 imgdecrypt
As we were only interested in this particular binary, we dumped it the crudest way possible:
> base64 < imgdecrypt
After copying the output to our local machine and converting the base64 back to binary, we can start taking a closer look!
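For completeness, here is a minimal Python sketch of the "convert back" step. It assumes the base64 text was captured from the serial console into a string (line wrapping and stray carriage returns included) and compares the result against the md5sum we recorded on the device. The function names are ours.

```python
import base64
import hashlib

# MD5 of imgdecrypt as reported by md5sum on the router (see above).
EXPECTED_MD5 = "a5474af860606f035e4b84bd31fc17a1"


def decode_dump(b64_text: str) -> bytes:
    """Strip all whitespace (serial consoles love \\r\\n) and decode."""
    return base64.b64decode("".join(b64_text.split()))


def md5_hex(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()


def restore_binary(b64_text: str, expected_md5: str = EXPECTED_MD5) -> bytes:
    """Decode the dump and refuse to continue if the checksum differs."""
    binary = decode_dump(b64_text)
    if md5_hex(binary) != expected_md5:
        raise ValueError("MD5 mismatch, the console capture is likely corrupted")
    return binary
```

Checking the hash matters here because a single dropped character in a console capture would silently corrupt the binary we are about to disassemble.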
Binary Reconnaissance:
We have already seen above that we're dealing with a 32-bit ELF binary for MIPS, which is dynamically linked (as expected) and stripped. Let's see what good old strings can do for us here:
> strings -n 10 imgdecrypt | uniq
/lib/ld-uClibc.so.0
[...]
SHA512_Init
SHA512_Update
SHA512_Final
RSA_verify
AES_set_encrypt_key
AES_cbc_encrypt
AES_set_decrypt_key
PEM_write_RSAPublicKey
OPENSSL_add_all_algorithms_noconf
PEM_read_RSAPublicKey
PEM_read_RSAPrivateKey
RSA_generate_key
EVP_aes_256_cbc
PEM_write_RSAPrivateKey
decrypt_firmare
encrypt_firmare
[...]
libcrypto.so.1.0.0
[...]
no image matic found
check SHA512 post failed
check SHA512 before failed %d %d
check SHA512 vendor failed
static const char *pubkey_n = "%s";
static const char *pubkey_e = "%s";
Read RSA private key failed, maybe the key password is incorrect
/etc_ro/public.pem
%s <sourceFile>
/tmp/.firmware.orig
0123456789ABCDEF
%s sourceFile destFile
[...]
Sweet! There is still a lot of useful stuff in there. I only removed the garbage lines, indicated by "[...]". Most noteworthy are the following things:
- Uses uClibc and libcrypto
- Calculates/Checks SHA512 hash digests
- Uses AES_CBC mode to en-/decrypt things
- Has an RSA certificate check with the certificate path pinned to /etc_ro/public.pem
- The RSA private key is protected by a password
- /tmp/.firmware.orig could be a hint towards where things get temporarily decrypted to
- General usage of imgdecrypt binary
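If strings is not available on the analysis machine, the same triage can be approximated in a few lines of Python. This is a rough sketch only: it treats bytes 0x20 to 0x7e as printable, which is close to but not identical to what GNU strings does, and the list of "interesting" prefixes is our own pick based on the output above.

```python
import re

# Prefixes of libcrypto symbols we saw in the strings output above.
CRYPTO_PREFIXES = ("AES_", "RSA_", "SHA512_", "PEM_", "EVP_")


def ascii_strings(data: bytes, min_len: int = 10):
    """Return all runs of printable ASCII that are at least `min_len` long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def crypto_hints(strings):
    """Keep only the strings that mention one of the crypto prefixes."""
    return [s for s in strings if any(p in s for p in CRYPTO_PREFIXES)]
```

Running ascii_strings() over the raw bytes of imgdecrypt and passing the result to crypto_hints() should surface the same AES, RSA, and SHA512 symbols listed above.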
Intermediate Summary:
So far, we already learned multiple interesting things that should help us further down the road!
- D-Link probably re-uses the same encryption scheme across multiple devices.
- These devices are based on the MIPS32 architecture
- (Access to a UART serial console on the DIR 882 is doable without a problem)
- Linked against uClibc and libcrypto
  - Potential usage of AES, RSA, and SHA512 routines
- Binary seems to be responsible for both en- and decryption
- There is a public certificate
- The usage of imgdecrypt seems to be ./imgdecrypt myInFile
- Usage of a /tmp/ path for storing results?
Next up, we will dive into the static analysis of the imgdecrypt binary to understand how firmware updates are controlled! But before that, for those of you who feel a bit rusty or are new to the MIPS32 assembly language, here is a short primer on it.
Primer on MIPS32 disassembly
Most of you are probably familiar with x86/x86_64 disassembly, so here are a few general rules on how MIPS does things and how it differs from the x86 world. First, there are two calling conventions (O32 vs N32/N64). I'll be discussing the O32 one, as it seems to be the most common one around. Discussing both in depth would be out of scope for this article!
Registers:
In MIPS32 there are 32 registers you can use. The O32 calling convention defines them as follows:
+---------+-----------+------------------------------------------------+
| Name    | Number    | Usage                                          |
+---------+-----------+------------------------------------------------+
| $zero   | $0        | Is always 0, writes to it are discarded.       |
+---------+-----------+------------------------------------------------+
| $at     | $1        | Assembler temporary register (pseudo instr.)   |
+---------+-----------+------------------------------------------------+
| $v0─$v1 | $2─$3     | Function returns/expression evaluation         |
+---------+-----------+------------------------------------------------+
| $a0─$a3 | $4─$7     | Function arguments, remaining are in stack     |
+---------+-----------+------------------------------------------------+
| $t0─$t7 | $8─$15    | Temporary registers                            |
+---------+-----------+------------------------------------------------+
| $s0─$s7 | $16─$23   | Saved temporary registers                      |
+---------+-----------+------------------------------------------------+
| $t8─$t9 | $24─$25   | Temporary registers                            |
+---------+-----------+------------------------------------------------+
| $k0─$k1 | $26─$27   | Reserved for kernel                            |
+---------+-----------+------------------------------------------------+
| $gp     | $28       | Global pointer                                 |
+---------+-----------+------------------------------------------------+
| $sp     | $29       | Stack pointer                                  |
+---------+-----------+------------------------------------------------+
| $fp     | $30       | Frame pointer                                  |
+---------+-----------+------------------------------------------------+
| $ra     | $31       | Return address                                 |
+---------+-----------+------------------------------------------------+
The most important things to remember are:
- The first four function arguments are moved into $a0 - $a3, while the remaining ones are placed on top of the stack
- Function returns are placed in $v0, and additionally in $v1 when there is a second return value
- Return addresses are stored in the $ra register when a function call is executed via jump and link (JAL) or jump and link register (JALR)
- $sX registers are preserved across procedure calls (a subroutine can use them but has to restore them before returning)
- $gp points to the middle of the 64k block of memory in the static data segment
- $sp points to the last location of the stack
- Distinction between leaf vs nonleaf subroutines:
  - Leaf: Do not call any other subroutines and do not use any memory space on the stack. As a result, they don't build up a stack frame (and hence don't need to change $sp)
  - Leaf with data: Same as leaf, but they require stack space, e.g. for local variables. They will push a stack frame but can omit stack frame sections they do not need
  - Non-leaf: Those will call other subroutines. These will most likely have a full-fledged stack frame
- On Linux with PIC, $t9 is supposed to contain the address of the called function
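The argument passing rule can be sketched as a tiny lookup. Note the assumptions: this only models word-sized integer or pointer arguments, and it relies on the O32 detail (not spelled out above) that the caller reserves 16 bytes of stack for the four register arguments, so the fifth argument ends up at 16($sp) at the time of the call. The function name is ours.

```python
ARG_REGISTERS = ("$a0", "$a1", "$a2", "$a3")
WORD_SIZE = 4  # bytes, MIPS32


def o32_arg_location(index: int) -> str:
    """Where the caller places word-sized argument number `index` (0-based)."""
    if index < 0:
        raise ValueError("argument index must not be negative")
    if index < len(ARG_REGISTERS):
        return ARG_REGISTERS[index]
    # Stack slots 0..3 are reserved as home space for $a0-$a3 (assumption
    # stated above), so argument i lives at offset 4*i from the caller's $sp.
    return f"{WORD_SIZE * index}($sp)"


for i in range(6):
    print(f"arg {i}: {o32_arg_location(i)}")
```

When reading disassembly, this means that stores to 16($sp), 20($sp), and so on right before a jalr are a good indicator of a call with more than four arguments.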
            +   +-------------------+  +-+
            |   |                   |    |
            |   +-------------------+    |
            |   |                   |    |   Previous
            |   +-------------------+    +-> Stack
            |   |                   |    |   Frame
            |   +-------------------+    |
            |   |                   |    |
            |   +-------------------+  +-+
            |   |  local data x─1   |  +-+
            |   +-------------------+    |
            |   |                   |    |
            |   +-------------------+    |
            |   |  local data 0     |    |
            |   +-------------------+    |
            |   |  empty            |    |
 Stack      |   +-------------------+    |
 Growth     |   |  return value     |    |
 Direction  |   +-------------------+    |
            |   |  saved reg k─1    |    |
            |   +-------------------+    |   Current
            |   |                   |    +-> Stack
            |   +-------------------+    |   Frame
            |   |  saved reg 0      |    |
            |   +-------------------+    |
            |   |  arg n─1          |    |
            |   +-------------------+    |
            |   |                   |    |
            |   +-------------------+    |
            |   |  arg 4            |    |
            |   +-------------------+    |
            |   |  arg 3            |    |
            |   +-------------------+    |
            |   |  arg 2            |    |
            |   +-------------------+    |
            |   |  arg 1            |    |
            |   +-------------------+    |
            |   |  arg 0            |    |
            v   +-------------------+  +-+
                          |
                          |
                          v
Common operations
There are a bunch of very common operations, and if you're already familiar with other assembly languages, you'll catch on quickly. Here is a selection to give you a head start for part 2 of this series:
+----------+--------------------------------------------------+-----------------------+--------------------------------------------------------+
| Mnemonic | Full name                                        | Syntax                | Operation                                              |
+----------+--------------------------------------------------+-----------------------+--------------------------------------------------------+
| ADD      | Add (with overflow)                              | add $a, $b, $c        | $a = $b + $c                                           |
+----------+--------------------------------------------------+-----------------------+--------------------------------------------------------+
| ADDI     | Add immediate (with overflow)                    | addi $a, $b, imm      | $a = $b + imm                                          |
+----------+--------------------------------------------------+-----------------------+--------------------------------------------------------+
| ADDIU    | Add immediate unsigned (no overflow)             | addiu $a, $b, imm     | see ADDI                                               |
+----------+--------------------------------------------------+-----------------------+--------------------------------------------------------+
| ADDU     | Add unsigned (no overflow)                       | addu $a, $b, $c       | see ADD                                                |
+----------+--------------------------------------------------+-----------------------+--------------------------------------------------------+
| AND*     | Bitwise and                                      | and $a, $b, $c        | $a = $b & $c                                           |
+----------+--------------------------------------------------+-----------------------+--------------------------------------------------------+
| B**      | Branch to offset unconditionally                 | b offset              | goto offset                                            |
+----------+--------------------------------------------------+-----------------------+--------------------------------------------------------+
| BEQ      | Branch on equal                                  | beq $a, $b, offset    | if $a == $b goto offset                                |
+----------+--------------------------------------------------+-----------------------+--------------------------------------------------------+
| BEQZ     | Branch on equal to zero                          | beqz $a, offset       | if $a == 0 goto offset                                 |
+----------+--------------------------------------------------+-----------------------+--------------------------------------------------------+
| BGEZ     | Branch on greater than or equal to zero          | bgez $a, offset       | if $a >= 0 goto offset                                 |
+----------+--------------------------------------------------+-----------------------+--------------------------------------------------------+
| BGEZAL   | Branch on greater than or equal to zero and link | bgezal $a, offset     | if $a >= 0: $ra = PC+8 and goto offset                 |
+----------+--------------------------------------------------+-----------------------+--------------------------------------------------------+
| BAL      | Branch and link                                  | bal offset            | $ra=PC+8 and goto offset                               |
+----------+--------------------------------------------------+-----------------------+--------------------------------------------------------+
| BNE      | Branch on not equal                              | bne $a, $b, offset    | if $a != $b: goto offset                               |
+----------+--------------------------------------------------+-----------------------+--------------------------------------------------------+
| DIV(U)   | Divide (unsigned)                                | div $a, $b            | $LO = $a/$b, $HI = $a%$b (LO/HI are special registers) |
+----------+--------------------------------------------------+-----------------------+--------------------------------------------------------+
| J**      | Jump                                             | j target              | PC=target                                              |
+----------+--------------------------------------------------+-----------------------+--------------------------------------------------------+
| JR       | Jump register                                    | jr $a                 | PC=$a                                                  |
+----------+--------------------------------------------------+-----------------------+--------------------------------------------------------+
| JALR     | Jump and link register                           | jalr $a               | $ra=PC+8, PC=$a                                        |
+----------+--------------------------------------------------+-----------------------+--------------------------------------------------------+
| L(B/W)   | Load (byte/word)                                 | l(b/w) $a, offset($b) | $a = memory[$b + offset]                               |
+----------+--------------------------------------------------+-----------------------+--------------------------------------------------------+
| LWL      | Load word left                                   | lwl $a, offset(base)  | Merge high-order bytes of unaligned word into $a       |
+----------+--------------------------------------------------+-----------------------+--------------------------------------------------------+
| LWR      | Load word right                                  | lwr $a, offset(base)  | Merge low-order bytes of unaligned word into $a        |
+----------+--------------------------------------------------+-----------------------+--------------------------------------------------------+
| OR*      | Bitwise or                                       | or $a, $b, $c         | $a = $b|$c                                             |
+----------+--------------------------------------------------+-----------------------+--------------------------------------------------------+
| S(B/W)   | Store (byte/word)                                | s(b/w) $a, offset($b) | memory[$b + offset] = $a                               |
+----------+--------------------------------------------------+-----------------------+--------------------------------------------------------+
| SLL**    | Shift left logical                               | sll $a, $b, h         | $a = $b << h                                           |
+----------+--------------------------------------------------+-----------------------+--------------------------------------------------------+
| SRL**    | Shift right logical                              | srl $a, $b, h         | $a = $b >> h                                           |
+----------+--------------------------------------------------+-----------------------+--------------------------------------------------------+
| SYSCALL  | System call                                      | syscall               | PC+=4                                                  |
+----------+--------------------------------------------------+-----------------------+--------------------------------------------------------+
| XOR*     | Bitwise exclusive or                             | xor $a, $b, $c        | $a = $b^$c                                             |
+----------+--------------------------------------------------+-----------------------+--------------------------------------------------------+
Note 1: Instructions that do not explicitly state a change in PC can be assumed to have PC+=4 upon execution.
Note 2: Those marked with an asterisk (*) also have at least one immediate version.
Note 3: Those marked with a double asterisk (**) have a multitude of other variants!
Note 4: The ADD variants only have SUB(U) as a counterpart!
Note 5: The DIV variants have a MULT(U) counterpart.
Note 6: The general difference between j and b instructions is that branching uses PC-relative displacements, whereas jumps use absolute addresses. This is rather important when you consider PIC.
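The difference between branches and jumps is easiest to see when computing the targets by hand. The following Python sketch follows the MIPS32 encoding rules: a branch carries a signed 16-bit word offset relative to the delay slot (PC+4), while a jump carries a 26-bit word index that replaces the low 28 bits of PC+4. The function names and example addresses are ours.

```python
MASK32 = 0xFFFFFFFF


def branch_target(pc: int, offset16: int) -> int:
    """Target of b/beq/bne/...: PC-relative, signed 16-bit word offset."""
    offset16 &= 0xFFFF
    if offset16 & 0x8000:          # sign-extend the 16-bit immediate
        offset16 -= 0x10000
    return (pc + 4 + (offset16 << 2)) & MASK32


def jump_target(pc: int, index26: int) -> int:
    """Target of j/jal: absolute within the current 256 MB region."""
    return ((pc + 4) & 0xF0000000) | ((index26 & 0x03FFFFFF) << 2)


# The same two encodings, executed at two different load addresses:
for pc in (0x00400100, 0x00401100):
    print(f"pc={pc:#010x}  "
          f"b -> {branch_target(pc, 0x0003):#010x}  "
          f"j -> {jump_target(pc, 0x00100040):#010x}")
```

Moving the code by 0x1000 moves the branch target by the same amount, whereas the jump still points at the old absolute address. This is why position-independent code sticks to branches and register-based calls (jalr via $t9) instead of j/jal.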
Okay, now that I have lost all of you, we'll end the initial, somewhat dry recon phase here. However, it is a necessary evil to learn more about our target. Finally, keep in mind that the MIPS32 assembly table above is only a subset of all available instructions. Even if you are not familiar with MIPS assembly, it should be enough to follow along in part 2!
See you in part 2, where we will deep dive into the imgdecrypt binary in IDA :).
Stay tuned!