
My First Crackme (Part 2)

Ex2_2


Hello and welcome to the second part of this crackme series. These binaries get progressively (somewhat) harder to reverse engineer, but nothing is too hard for me!


Let's start with a high-level view of what the binary does:


Figure 1: Checking control flow of main with IDA.

As we can see, there are now two check functions that run after the user has entered a string.



Let's work on check1 first.


Figure 2: Examining the check1 function.

Once again we have a loop, as shown by the control flow around the loc_8048499 block. Recall that in the previous binary, Ex1_2, the check function looped through the user input. Ex2_2 does the same: the movzx instruction extracts one character at a time.



We know that characters are represented as numbers using ASCII. The loop subtracts 0x61 (which is 'a' in ASCII) from the character in the current iteration and compares the result with 0x19 = 25. If the result is above 25, check1 returns -1 (0xFFFFFFFF in two's complement). Otherwise the function returns 1.
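The behaviour described above can be sketched in Python. This is a rough reconstruction, not a decompilation: the name check1 comes from the disassembly, but the parameter name and the 32-bit mask (used to imitate an unsigned "above" comparison) are assumptions.

```python
def check1(entered: str) -> int:
    """Return 1 if every character is in [a-z], otherwise -1."""
    for ch in entered:
        # Subtract 'a' and wrap to 32 bits, so characters below 'a'
        # become huge unsigned values (assumed unsigned comparison).
        diff = (ord(ch) - 0x61) & 0xFFFFFFFF
        if diff > 0x19:
            return -1  # 0xFFFFFFFF when viewed as an unsigned register
    return 1


print(check1("abc"), check1("aBc"))
```

The single subtract-then-compare covers both ends of the range: anything below 'a' wraps around to a large value, and anything above 'z' is simply greater than 25.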


From this we can assume that the purpose of check1 is to verify that every character in the entered string is a lowercase letter [a-z]. GDB confirms this: look at the contents of eax directly after main calls check1.


Now let's take a look at check2:


Figure 3: Examining the check2 function.

As we can see just by looking at the blocks and the direction of the arrows, we have another loop.


The most striking part of the start block of check2 is the repne scasb instruction followed by not ecx. Reading online: (https://stackoverflow.com/questions/26783797/repnz-scas-assembly-instruction-specifics)


This instruction pair calculates the length + 1 of the string pointed to by edi and leaves the result in ecx. In this case, the pointer to the entered string was stored on the stack at [esp+0x14+arg_0], moved to edx, and then moved to edi.


The program then calculates the length of the string by subtracting 1 from ecx and storing the result in eax.
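To make the counting concrete, here is a small Python emulation of the idiom. It assumes the usual setup for this pattern, which is not shown in the text above: ecx starts at 0xFFFFFFFF and al is 0. The function name strlen_via_scasb is hypothetical.

```python
def strlen_via_scasb(buf: bytes) -> int:
    """Emulate repne scasb / not ecx / subtract 1 on a NUL-terminated buffer."""
    ecx = 0xFFFFFFFF  # assumed initial counter value
    edi = 0           # index standing in for the pointer in edi
    while ecx != 0:
        ecx = (ecx - 1) & 0xFFFFFFFF  # repne decrements ecx on every byte
        byte = buf[edi]
        edi += 1
        if byte == 0:  # scasb compares against al, assumed to be 0
            break
    ecx = ~ecx & 0xFFFFFFFF  # not ecx gives length + 1
    return ecx - 1           # subtracting 1 gives the length


print(strlen_via_scasb(b"ad\x00"))
```

For the buffer b"ad\x00" the counter is decremented three times (two letters plus the terminator), so not ecx yields 3 and the final subtraction yields 2.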


The two lea instructions set up the loop. The first points to the second character of the string (the loop body shows why), and the second points to the end of the string, which serves as the loop's end condition.


Now let us move to the body of the loop. To be honest, I do not understand the overall purpose of this loop; judging by the assembly, it is a lot of seemingly arbitrary calculations and bit manipulations. I worked out what it computes by using GDB and examining the contents of the registers.


The assembly can be summarised by the following Python code:


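          sum = 0  # running total, kept in edi
          for i in range(len(enteredString) - 1):
              a = ord(enteredString[i])
              b = ord(enteredString[i + 1])

              eax = (a * b * 0x55555556) & 0xFFFFFFFF          # low 32 bits of the product
              edx = ((a * b * 0x55555556) >> 32) & 0xFFFFFFFF  # high 32 bits of the product
              ecx = (((a * b) & 0xFFFFFFFF) >> 0x1F) & 0x1     # sign bit of a*b
              edx -= ecx

              sum += edx + (2 // a)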
      

The register edi holds the sum. This is easy to spot, since edi is only updated towards the end of the loop body. It is also used after the loop, which confirms the theory.
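One possible reading of the loop body: 0x55555556 is the constant compilers commonly emit to divide a signed 32-bit value by 3 using a multiplication, so for lowercase input the edx term appears to be simply (a*b)//3, and 2//a is always 0 because a is at least 97. The sketch below checks that claim for every pair of lowercase letters. pair_term is a hypothetical helper that wraps the arithmetic from the Python summary above.

```python
import string


def pair_term(a: int, b: int) -> int:
    """Value added to the running sum for one pair of adjacent characters."""
    product = a * b
    edx = ((product * 0x55555556) >> 32) & 0xFFFFFFFF  # high half of the multiply
    ecx = ((product & 0xFFFFFFFF) >> 0x1F) & 0x1        # sign bit, 0 for these inputs
    return edx - ecx + (2 // a)


letters = [ord(c) for c in string.ascii_lowercase]
matches = all(pair_term(a, b) == (a * b) // 3 for a in letters for b in letters)
print(matches)
```

If this reading holds, the loop just sums the products of adjacent characters, each divided by 3.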


After the loop, in the final part of the check2 function, another set of calculations is performed. The most important part is the div instruction. div divides the 64-bit value held in edx:eax by its operand (ecx here, although another register can be given) and stores the quotient in eax and the remainder in edx.
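As a small illustration of how the 32-bit form of div behaves, here is a Python model. The function name div32 and the choice of exception are assumptions made for the sketch; a real CPU raises a divide error instead.

```python
def div32(edx: int, eax: int, operand: int) -> tuple:
    """Model of 32-bit unsigned div: returns (new eax, new edx)."""
    dividend = (edx << 32) | eax  # the dividend is the 64-bit pair edx:eax
    quotient, remainder = divmod(dividend, operand)
    if quotient > 0xFFFFFFFF:
        # The quotient must fit in eax; hardware faults in this case.
        raise OverflowError("quotient does not fit in 32 bits")
    return quotient, remainder


print(div32(0, 3233, 3))
```

With edx cleared first, this reduces to the plain eax divided by operand case, and the remainder left in edx is what a modulo computation reads.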


Judging by the registers used, the last block of assembly essentially computes $sum \bmod (len(enteredString) + 1)$, where sum is the variable used to store the sum in the loop.


So this function returns an integer value in the range [0, len(enteredString)].
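Putting the loop and the final modulo together gives a rough Python model of check2. It reuses the arithmetic from the summary code above; the function signature and the variable name total are assumptions made for illustration.

```python
def check2(entered: str) -> int:
    """Rough model of check2: pairwise sum, then modulo (length + 1)."""
    total = 0  # plays the role of edi
    for i in range(len(entered) - 1):
        a = ord(entered[i])
        b = ord(entered[i + 1])
        edx = ((a * b * 0x55555556) >> 32) & 0xFFFFFFFF
        ecx = (((a * b) & 0xFFFFFFFF) >> 0x1F) & 0x1
        total += (edx - ecx) + (2 // a)
    # The final div leaves the remainder that becomes the return value.
    return total % (len(entered) + 1)


print(check2("ad"), check2("ab"), check2("a"))
```

A single-character string never enters the loop, so the model returns 0 for it.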


Going back to the main function, we can see that the return values of check1 and check2 are stored in the ebx and edx registers respectively. We know this because the contents of eax are moved into those registers directly after each function is called.


Once again we see the repne scasb and not ecx pair, followed by subtracting 1 from ecx. This means the ecx register now contains the length of the entered string.


In the final part of this block of code in main, the return values from check1 and check2 are multiplied together and the product is stored in edx.


This value is then compared with the length of the string. If they are not equal the password fails; otherwise it passes.
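The final comparison in main can be modelled as below. The function name password_ok is hypothetical, and the 32-bit mask is an assumption that mimics a register-sized multiplication; it matters only when check1 has returned -1.

```python
def password_ok(check1_result: int, check2_result: int, length: int) -> bool:
    """Model of main's test: (check1 * check2) must equal the string length."""
    product = (check1_result * check2_result) & 0xFFFFFFFF  # value left in edx
    return product == (length & 0xFFFFFFFF)


# Values for the string "ad": check1 gives 1, check2 gives 2, length is 2.
print(password_ok(1, 2, 2))
# The same check2 result with a failed check1 (-1) no longer matches.
print(password_ok(-1, 2, 2))
```

Because a failed check1 contributes -1, the product becomes negative and can only match the length in the degenerate case where both sides are 0.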


To summarise, the password must satisfy the following rules:


  1. The password must only consist of lowercase characters [a-z]
  2. From the code given above, $sum \bmod (len(enteredString) + 1) == len(enteredString)$

An example of a valid password is ad. To generate more solutions, you can use the following Python code:


        
      import string
      from itertools import chain, product
      # https://stackoverflow.com/questions/11747254/python-brute-force-algorithm

      def bruteforce(charset, maxlength):
          """Yield every string over charset with length 1..maxlength."""
          return (''.join(candidate)
                  for candidate in chain.from_iterable(
                      product(charset, repeat=i)
                      for i in range(1, maxlength + 1)))

      # Generates all strings of up to 4 characters; change as required.
      # string.ascii_lowercase is the Python 3 name for string.lowercase.
      charset = string.ascii_lowercase
      for attempt in bruteforce(charset, 4):
          sum = 0
          # inner body of the loop
          for i in range(len(attempt) - 1):
              a = ord(attempt[i])
              b = ord(attempt[i + 1])

              eax = (a * b * 0x55555556) & 0xFFFFFFFF
              edx = ((a * b * 0x55555556) >> 32) & 0xFFFFFFFF
              ecx = (((a * b) & 0xFFFFFFFF) >> 0x1F) & 0x1
              edx -= ecx

              sum += edx + (2 // a)

          # post loop
          returnVal = sum % (len(attempt) + 1)
          # main function
          if returnVal == len(attempt):
              print(attempt)