Hello and welcome to the second part of this crackme series. These binaries progressively get (somewhat) harder to reverse engineer, but nothing is too hard for me!
Let's start with a high-level view of what the binary does:
As we can see, there are now 2 check functions after the user has entered a string.
Let's work on check1 first.
Once again we have a loop, as seen from the control flow of the loc_8048499 block.
Recall that in the previous binary, Ex1_2, the check function looped through the user input.
Similarly, in Ex2_2 the input is looped through, as seen from the movzx instruction, which extracts one character at a time.
We know that characters are represented as numbers using ASCII in computers. This loop subtracts 0x61 (which is 'a' in ASCII) from the character in the current iteration.
The result is then compared to 0x19 (25). If it is above that, the check1 function returns -1 (0xFFFFFFFF in two's complement). Otherwise, once every character has passed, the function returns 1.
From this we can assume that the purpose of check1 is to check that all the characters in the entered string are lowercase [a-z].
Using GDB confirms this: look at the contents of eax directly after main calls check1.
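The check1 behaviour described above can be sketched in Python. This is only a model of the logic, not a decompilation: the function name mirrors the binary, and the 32-bit wrap plus the unsigned "above" comparison are assumptions based on the wording of the disassembly.

```python
def check1(entered: str) -> int:
    """Model of check1: 1 if every character is in [a-z], otherwise -1."""
    for ch in entered:
        # sub 0x61, kept to 32 bits so characters below 'a' wrap to huge values
        offset = (ord(ch) - 0x61) & 0xFFFFFFFF
        # assumed unsigned "above" comparison against 0x19 (25)
        if offset > 0x19:
            return -1  # appears as 0xFFFFFFFF in eax
    return 1


print(check1("abc"), check1("aBc"))  # prints: 1 -1
```

The wrap-around is why a single comparison is enough: anything below 'a' becomes a very large unsigned number and fails the same test as anything above 'z'.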
Now let's take a look at check2:
As we can see just by looking at the blocks and the direction of the arrows, we have another loop.
The most striking part of the start block of check2 is the repne scasb followed by not ecx.
Reading online:
(https://stackoverflow.com/questions/26783797/repnz-scas-assembly-instruction-specifics)
Together, these two instructions calculate the length + 1 of the string pointed to by edi and leave the result in the counter register, ecx.
In this case, the entered string was stored on the stack in [esp+0x14+arg_0], which was then moved to edx and then over to edi.
The program then calculates the length of the string by subtracting 1 from ecx and storing the result in eax.
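To make the length idiom concrete, here is a small Python emulation of the sequence. It assumes the usual setup for this idiom, al = 0 and ecx = 0xFFFFFFFF before the scan, which is not shown in the text above; the function name scasb_strlen is made up for illustration.

```python
def scasb_strlen(memory: bytes, edi: int = 0) -> int:
    """Emulate repne scasb / not ecx / subtract 1 on a NUL-terminated buffer."""
    al = 0            # assumed: the byte being searched for is the terminator
    ecx = 0xFFFFFFFF  # assumed: counter starts at -1
    # repne scasb: compare [edi] with al, advance edi, decrement ecx,
    # and repeat while the bytes differ and ecx is not zero
    while ecx != 0:
        byte = memory[edi]
        edi += 1
        ecx = (ecx - 1) & 0xFFFFFFFF
        if byte == al:
            break
    ecx = ~ecx & 0xFFFFFFFF       # not ecx -> length + 1
    eax = (ecx - 1) & 0xFFFFFFFF  # subtract 1 -> length
    return eax


print(scasb_strlen(b"ad\x00"))  # prints: 2
```

The scan counts the terminator as well, which is where the extra 1 comes from before the final subtraction.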
The two lea instructions are used to set up the loop. The first lea points to the second character of the string (you will see why in the body of the loop), and the second lea points to the end of the string, which is used as the loop's end condition.
Now let us move to the body of the loop. To be honest, I do not understand the overall purpose of this loop; from the assembly it looks like a series of arbitrary calculations and bit manipulations. I worked out what it computes by using GDB and examining the contents of the registers.
The assembly can be summarised by the following Python code:
    sum = 0
    for i in range(len(enteredString) - 1):
        a = ord(enteredString[i])
        b = ord(enteredString[i + 1])
        eax = (a * b * 0x55555556) & 0xFFFFFFFF          # low 32 bits of the product
        edx = ((a * b * 0x55555556) >> 32) & 0xFFFFFFFF  # high 32 bits of the product
        ecx = (((a * b) & 0xFFFFFFFF) >> 0x1F) & 0x1     # top (sign) bit of a*b
        edx -= ecx
        sum += edx + (2 // a)
The register edi is used for the sum. This is easy to spot, since edi is only updated towards the end of the loop. In addition, edi is used after the loop, which confirms our theory.
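A possible reading of the multiply-and-shift in the loop body: 0x55555556 equals (2^32 + 2) / 3, and multiplying by it, keeping the high 32 bits, then subtracting the sign bit is a pattern compilers commonly emit for signed division by 3. Whether that is what the original source did is an assumption, but the sketch below checks that the two agree for every pair of lowercase letters. The names MAGIC and magic_div3 are made up for illustration.

```python
MAGIC = 0x55555556  # equals (2**32 + 2) // 3


def magic_div3(x: int) -> int:
    """The edx computation from the loop body, applied to a product x = a*b."""
    edx = ((x * MAGIC) >> 32) & 0xFFFFFFFF  # high half of the 64-bit product
    ecx = (x >> 31) & 1                     # sign bit; 0 for these small products
    return edx - ecx


# Compare against plain floor division for every pair of lowercase letters.
lowercase_codes = range(ord("a"), ord("z") + 1)
agrees = all(
    magic_div3(a * b) == (a * b) // 3
    for a in lowercase_codes
    for b in lowercase_codes
)
print(agrees)  # prints: True
```

Under that reading, each loop iteration adds (a*b) // 3 to the sum; the (2 // a) term is 0 for any lowercase letter.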
After the end of the loop, in the final part of the check2 function, we can see that another set of calculations is performed. The most important part of this is the div instruction. The div instruction divides the 64-bit value in edx:eax by its operand (ecx here, though it could be another register) and stores the quotient in eax and the remainder in edx.
Looking at the registers used, the last block of assembly essentially computes $sum \bmod (len(enteredString) + 1)$, where sum is the variable that holds the running total from the loop.
So based on this function we can see that an integer value in the range [0, len(enteredString)] is returned.
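Putting the loop and the final division together, check2 can be modelled as below. This is a sketch under assumptions: div32 is a made-up helper that mimics the unsigned div instruction, and edx is assumed to be zero before the division, which the text above does not state.

```python
def div32(edx: int, eax: int, operand: int):
    """Mimic x86 div: divide edx:eax by operand, return (quotient, remainder)."""
    dividend = (edx << 32) | eax
    return divmod(dividend, operand)


def check2(entered: str) -> int:
    """Model of check2: the loop sum modulo (length + 1)."""
    total = 0
    for i in range(len(entered) - 1):
        a = ord(entered[i])
        b = ord(entered[i + 1])
        edx = ((a * b * 0x55555556) >> 32) & 0xFFFFFFFF
        ecx = ((a * b) >> 0x1F) & 0x1
        total += (edx - ecx) + (2 // a)
    # assumed: edx is cleared before div, so the dividend is just the sum
    _quotient, remainder = div32(0, total & 0xFFFFFFFF, len(entered) + 1)
    return remainder  # the value that div leaves in edx


print(check2("ad"))  # prints: 2
```

For "ad" the loop runs once: 97 * 100 = 9700, the loop adds 3233, and 3233 mod 3 is 2.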
Now going back to the main function, we can see that the return values of the check1 and check2 functions are stored in the ebx and edx registers respectively. We know this because, directly after each function is called, the contents of eax are moved into those registers.
Once again we see the repne scasb sequence on ecx, followed by subtracting 1 from ecx. This means the ecx register now contains the length of the entered string.
In the final part of this block of code in main, the return values from check1 and check2 are multiplied together and stored in edx.
This value is then compared with the length of the string. If they are not equal, the password fails; otherwise it passes.
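To summarise, the password must satisfy the following rules:
1. Every character is a lowercase letter [a-z], so that check1 returns 1.
2. The value returned by check2, that is sum mod (length + 1), equals the length of the password.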
An example of a valid password is ad. To generate more solutions, you can use the following Python code:
    import string
    from itertools import chain, product

    # https://stackoverflow.com/questions/11747254/python-brute-force-algorithm
    def bruteforce(charset, maxlength):
        return (''.join(candidate)
                for candidate in chain.from_iterable(product(charset, repeat=i)
                                                     for i in range(1, maxlength + 1)))

    # Generates all strings up to 4 characters long; change as required.
    charset = string.ascii_lowercase  # string.lowercase on Python 2
    for attempt in bruteforce(charset, 4):
        sum = 0
        # inner body of the loop
        for i in range(len(attempt) - 1):
            a = ord(attempt[i])
            b = ord(attempt[i + 1])
            eax = (a * b * 0x55555556) & 0xFFFFFFFF
            edx = ((a * b * 0x55555556) >> 32) & 0xFFFFFFFF
            ecx = (((a * b) & 0xFFFFFFFF) >> 0x1F) & 0x1
            edx -= ecx
            sum += edx + (2 // a)
        # post loop
        returnVal = sum % (len(attempt) + 1)
        # main function
        if returnVal == len(attempt):
            print(attempt)