ByteCode Reverse Engineering Season 1 Episode 2

Sunday, July 6, 2008 8:45
Posted in category Java, Security

Okay, last time I deviated from the topic a bit, but it was still worth mentioning how important secure programming is: it keeps novices from breaking a system with nothing more than some unvalidated input.

Now, let’s get back to bytecode. :-) Let me put the disclaimer first: these posts of mine are not tutorials, just reference pointers along the way to reverse engineering code. So, you need to be an expert, or at least something near that level, to achieve your goals in reverse engineering. For detailed information, tutorials and other stuff, google the net.

I’ll start with a simple example (as I always feel that examples are better than numerous pages of theory :-) )

Consider the following Java code:

public class Test {
    public static void main(String args[]){
        System.out.println("Hello F@^#ers !!!");
    }
}

Now, let’s look at the bytecode generated for the above code. (I used the javap utility to get it: the command is “javap -c” followed by the class name.)

Compiled from "Test.java"
public class Test extends java.lang.Object{
public Test();
  Code:
   0:   aload_0
   1:   invokespecial   #1; //Method java/lang/Object."<init>":()V
   4:   return
 
public static void main(java.lang.String[]);
  Code:
   0:   getstatic       #2; //Field java/lang/System.out:Ljava/io/PrintStream;
   3:   ldc     #3; //String  Hello F@^#ers !!!
   5:   invokevirtual   #4; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
   8:   return
}

Looks simple, right? :-)

Now, the first thing you can notice is the default constructor. So, all that talk about a default constructor is not bogus: it really does exist and is implicitly supplied by the compiler.

Let’s take the statements one by one.
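If you want to confirm this without reading bytecode, reflection can show it too. Here is a minimal sketch; the class names DefaultConstructorDemo and NoExplicitConstructor are made up for this illustration and are not part of the Test example.

```java
import java.lang.reflect.Constructor;

class DefaultConstructorDemo {
    // Hypothetical class with no constructor written in the source, like Test.
    static class NoExplicitConstructor {
    }

    // Returns how many constructors the compiled class actually declares.
    static int declaredConstructorCount(Class<?> type) {
        return type.getDeclaredConstructors().length;
    }

    public static void main(String[] args) {
        // List every constructor found in the compiled class.
        for (Constructor<?> c : NoExplicitConstructor.class.getDeclaredConstructors()) {
            System.out.println(c + " takes " + c.getParameterCount() + " parameter(s)");
        }
        System.out.println("declared constructors: "
                + declaredConstructorCount(NoExplicitConstructor.class));
    }
}
```

The count comes out as 1 even though the source declares none, which is the same thing javap is showing us.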

   0:   aload_0

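This statement contains what is called an opcode (operation code) in Java. The number ‘0’ is not part of the instruction itself; it is the instruction’s byte offset within the method’s code. aload_0 pushes the reference held in local variable 0 onto the operand stack, and in a constructor or instance method that is the “this” reference, i.e. the object of this class.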

   1:   invokespecial   #1; //Method java/lang/Object."<init>":()V

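“invokespecial” is the opcode used, among other things, to call constructors (the <init> methods); here it invokes the constructor of the parent class, java.lang.Object. #1 is the index of that method’s entry in the constant table (the constant pool).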

   4:   return

This is just the return instruction, which marks the exit point of the constructor.

Now comes the main method, which is the entry point of the application.
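The numbers in front of each instruction (0, 1, 4 here) are byte offsets, and they follow from the instruction sizes: aload_0 and return are one byte each, while invokespecial is one opcode byte plus a two-byte constant table index. A small sketch of that arithmetic (the class BytecodeOffsets and its method are my own names, only for illustration):

```java
class BytecodeOffsets {
    // Computes the starting offset of each instruction from its length in bytes.
    static int[] offsets(int[] lengths) {
        int[] result = new int[lengths.length];
        int position = 0;
        for (int i = 0; i < lengths.length; i++) {
            result[i] = position;      // this instruction starts here
            position += lengths[i];    // the next one starts after it
        }
        return result;
    }

    public static void main(String[] args) {
        // Constructor: aload_0 (1 byte), invokespecial (3 bytes), return (1 byte)
        String[] names = {"aload_0", "invokespecial", "return"};
        int[] starts = offsets(new int[] {1, 3, 1});
        for (int i = 0; i < names.length; i++) {
            System.out.println(starts[i] + ": " + names[i]);
        }
        // prints:
        // 0: aload_0
        // 1: invokespecial
        // 4: return
    }
}
```

The same arithmetic with lengths 3, 2, 3, 1 (getstatic, ldc, invokevirtual, return) gives the offsets 0, 3, 5, 8 seen in the main method.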

0:	getstatic	#2; //Field java/lang/System.out:Ljava/io/PrintStream;

This line includes the opcode getstatic. As you might guess, this opcode gets a static field (in this case, System.out) and pushes it onto the operand stack. #2 refers to that field in our constant table. Let’s go on to our next line.

3:	ldc	#3; //String Hello F@^#ers !!!

This line uses the opcode ldc, which loads a constant onto the operand stack. At this point, we’re going to load whatever constant is at index #3 of our constant table. That constant is our String, “Hello F@^#ers !!!”. Going forward, we run into this line:

5:	invokevirtual	#4; //Method java/io/PrintStream.println:(Ljava/lang/String;)V

This line invokes the println method on our object System.out. The process involved here is to pop the two operands off the stack (the String argument and the System.out reference that receives the call) and then execute the method. At this point, our method is over and we return.
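To make the stack traffic of the main method concrete, here is a rough sketch that imitates the four instructions with an ordinary Deque. It is only a model of what the VM does, not how the VM is implemented, and the names OperandStackSketch and runMain are invented for this illustration.

```java
import java.io.PrintStream;
import java.util.ArrayDeque;
import java.util.Deque;

class OperandStackSketch {
    // Imitates getstatic, ldc, invokevirtual and return with an explicit stack.
    // Returns the number of values left on the stack when the method ends.
    static int runMain(PrintStream out, String constant) {
        Deque<Object> stack = new ArrayDeque<>();

        stack.push(out);       // getstatic: push the System.out reference
        stack.push(constant);  // ldc: push the String constant

        // invokevirtual: the argument is on top, the receiver is below it
        String argument = (String) stack.pop();
        PrintStream receiver = (PrintStream) stack.pop();
        receiver.println(argument);

        // return: println is void, so nothing was pushed back
        return stack.size();
    }

    public static void main(String[] args) {
        // "Hello" is a placeholder string for this sketch.
        int left = runMain(System.out, "Hello");
        System.out.println("values left on the stack: " + left);
    }
}
```

Note the order: the receiver goes on the stack first and the argument last, so they come off in the opposite order.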

Now, from this example you can see that even a small program like this one executes several instructions before actually printing the desired result.

Similarly, one needs to understand more complex bytecode and the instruction set before actually playing with it. To read more about bytecode and the instruction set, please go through the Java Virtual Machine Specification; then you’ll be able to understand the nitty-gritty of the VM and bytecode.

In the next episode, I’ll touch upon how to change the code at the bytecode level to change the output of this program (without considering the bytecode verification side of it, which I’ll discuss after that). That means the program will appear to print something other than the intended string.

Meanwhile, Happy Learning about bytecodes…

Quote of the Day: "The individual desires judgment. Without that desire, the cohesion of groups is impossible, and so is civilization." - Morpheus