Retrologic Systems
RetroGuard for Java Obfuscation

About Obfuscation - RetroGuard Documentation



Java bytecode (*.class files) contains all of the information, apart from comments, that is in Java source (*.java) files. Using a tool called a decompiler, a hostile competitor can easily reverse engineer your Java classes. To counter this threat, you can obfuscate your class files before distributing your software.

The obfuscation process strips all unnecessary information from the classes. This includes the line number tables, local variable names and source file names used by debuggers. Also, class, interface, field and method identifiers are renamed to render them meaningless. The Java virtual machine, which runs your bytecode, does not care at all about these changes. However, the decompiled version of these classes is extremely difficult to understand, frustrating any attempt to reverse engineer your code. The changes that an obfuscator makes to your Java classes are not reversible - there is no automated way for a reverse engineer to recover the lost information about your code.
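As a rough illustration of the renaming step, the sketch below shows a small class next to a hand-renamed equivalent. It is written by hand for this page and is not actual RetroGuard output; the class `InvoiceCalculator` and all of its members are hypothetical names chosen for the example.

```java
// Hand-written illustration; the names are hypothetical and this is not
// actual RetroGuard output.
class InvoiceCalculator {
    private final double taxRate;

    InvoiceCalculator(double taxRate) {
        this.taxRate = taxRate;
    }

    double totalWithTax(double netAmount) {
        return netAmount * (1.0 + taxRate);
    }
}

// The same logic after the class, field, method and parameter names have
// been replaced with short, meaningless identifiers.
class a {
    private final double a;

    a(double a) {
        this.a = a;
    }

    double a(double b) {
        return b * (1.0 + a);
    }
}

class RenamingDemo {
    public static void main(String[] args) {
        double before = new InvoiceCalculator(0.25).totalWithTax(100.0);
        double after = new a(0.25).a(100.0);
        // Both versions perform the same computation.
        System.out.println(before == after);
    }
}
```

Both classes behave identically at run time, but the second gives a reader no hint about what the code is for.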

An additional benefit of obfuscation is a substantial reduction in the size of your Java classes, due to the removal of unnecessary information and the replacement of large, human-readable identifiers with small machine-generated names. This size reduction leads to faster download times for your Java applets, and the ability to pack more features into your MIDlets running on small devices like cellphones and PDAs.

To determine which classes are to be obfuscated, most obfuscators start at a single entry point (usually the 'main' method of an application, or the 'Applet'-derived class for an applet), and construct a tree of all classes accessible from that point. Unfortunately, this method is quite limiting and works only in simple cases. If your Java code has multiple entry points (several applications, applets, or JavaBeans), or if it is intended to be used as a Java library, then this method is just not flexible enough.
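The limitation of the single-entry-point approach can be sketched as a simple graph traversal. The example below is an illustration only, not the algorithm of any particular obfuscator; the dependency map and the class names `Main`, `Parser`, `Util`, `ReportApplet` and `Chart` are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ReachabilityDemo {
    // Returns every class reachable from the given entry points by following
    // the edges of a (hypothetical) "class references class" graph.
    static Set<String> reachable(Map<String, List<String>> references,
                                 Collection<String> entryPoints) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> pending = new ArrayDeque<>(entryPoints);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            if (seen.add(current)) {
                // First visit: queue everything this class refers to.
                pending.addAll(references.getOrDefault(current, Collections.<String>emptyList()));
            }
        }
        return seen;
    }

    // Builds a small example graph with two independent entry points.
    static Map<String, List<String>> exampleGraph() {
        Map<String, List<String>> references = new HashMap<>();
        references.put("Main", Arrays.asList("Parser", "Util"));
        references.put("Parser", Arrays.asList("Util"));
        references.put("ReportApplet", Arrays.asList("Chart"));
        return references;
    }

    public static void main(String[] args) {
        Map<String, List<String>> references = exampleGraph();
        // Starting only from Main misses the applet and the class it uses.
        System.out.println(reachable(references, Arrays.asList("Main")));
        // Listing both entry points covers every class in the graph.
        System.out.println(reachable(references, Arrays.asList("Main", "ReportApplet")));
    }
}
```

Starting from `Main` alone, the traversal never reaches `ReportApplet` or `Chart`, which is the situation described above for code with more than one entry point.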

Instead, RetroGuard obfuscates all classes and interfaces within a JAR file. JAR files are the industry standard mechanism for packaging Java classes for distribution - it is easy to package your classes as a JAR using the 'jar' utility distributed with the Java Development Kit from Sun Microsystems. Any number of entry points to the JAR can be specified using a RetroGuard script file. This allows the obfuscation process to be completely flexible.

A technique used by several obfuscators is to introduce corrupt bytecode into the obfuscated Java classes. These corruptions are prohibited by the definitive text, The Java Virtual Machine Specification by Lindholm and Yellin, but do not happen to be noticed by the current virtual machine implementations. The corruptions are sufficient to break some of the simpler decompilers on the market. This class corruption is a very dangerous course to take, however, since virtual machines will certainly enforce the constraints of the Specification much more strictly in the future. At that point, code which uses this 'corrupting obfuscation' will simply fail.

Note that, as of Java SE 6, class file corruptions are disallowed by the latest virtual machine. From Sun's compatibility notes for Java SE 6:
"Some early bytecode obfuscators produced class files that violated the class file format as given in the virtual machine specification. Such improperly formatted class files will not run on the JDK virtual machine, though some of them could have run on earlier versions of the virtual machine. To remedy this problem, regenerate the class files with a newer obfuscator that produces properly formatted class files."

Corruption of classes is unacceptable - one cannot afford to ship Java bytecode which only sometimes runs, or fails completely on some virtual machines. For this reason the RetroGuard obfuscator produces only verifiable bytecode in full compliance with the Java Virtual Machine Specification. Instead of corrupting the bytecode, RetroGuard uses heavy overloading of identifiers (multiple uses of method names within a class) and the introduction of Java source-code keywords as identifiers to make it almost impossible to understand decompiled Java classes.
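The effect of identifier overloading can be illustrated with a short hand-written sketch. This is not actual RetroGuard output; the class `b` and its methods are hypothetical, and only overloads that are legal in Java source are shown.

```java
// Hand-written illustration of identifier overloading; hypothetical names,
// not actual RetroGuard output.
class b {
    // Four unrelated operations share the single name 'a'. They are told
    // apart only by their parameter lists.
    int a(int a) {
        return a + 1;
    }

    String a(String a) {
        return a + a;
    }

    int a(int a, int b) {
        return a * b;
    }

    double a(double a) {
        return a / 2;
    }
}

class OverloadingDemo {
    public static void main(String[] args) {
        b x = new b();
        System.out.println(x.a(3));     // calls a(int)
        System.out.println(x.a("x"));   // calls a(String)
        System.out.println(x.a(3, 4));  // calls a(int, int)
        System.out.println(x.a(3.0));   // calls a(double)
    }
}
```

A reader of the decompiled code has to resolve every call to `a` by its argument types before it can even be matched to a method body. Identifiers that are Java source-code keywords cannot be shown in a source example like this one, because they are accepted in bytecode but rejected by the Java compiler.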

Another technique that is often suggested to prevent decompilation is encryption of Java classes and the use of a custom classloader to decrypt them. However, since the decrypted classes can always be intercepted using a modified version of the 'java.lang.ClassLoader' method 'defineClass', that technique is fundamentally flawed. The issue is explained very clearly in Vladimir Roubtsov's article at JavaWorld.
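The weakness can be sketched in a few lines. The example below is a toy under stated assumptions: the XOR "encryption", the classes `Secret`, `DecryptingLoader` and `InterceptDemo`, and the `intercepted` field are all hypothetical, and the field merely stands in for what a modified 'defineClass' could record. It also assumes the compiled `Secret.class` file is readable as a classpath resource.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// The class whose bytecode the vendor wants to hide (hypothetical).
class Secret {
    static String message() {
        return "hello";
    }
}

// A custom class loader that decrypts class bytes before defining them.
class DecryptingLoader extends ClassLoader {
    private static final byte KEY = 0x5A; // toy XOR key, for illustration only
    private final String className;
    private final byte[] encrypted;
    byte[] intercepted; // stands in for what a modified defineClass could capture

    DecryptingLoader(String className, byte[] encrypted) {
        super(null); // bootstrap parent only, so findClass is used for our class
        this.className = className;
        this.encrypted = encrypted;
    }

    static byte[] xor(byte[] input) {
        byte[] output = new byte[input.length];
        for (int i = 0; i < input.length; i++) {
            output[i] = (byte) (input[i] ^ KEY);
        }
        return output;
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        if (!name.equals(className)) {
            throw new ClassNotFoundException(name);
        }
        byte[] plain = xor(encrypted);
        // The virtual machine needs plain bytecode, so the decrypted bytes
        // are fully exposed at this point, whatever cipher was used.
        intercepted = plain.clone();
        return defineClass(name, plain, 0, plain.length);
    }
}

class InterceptDemo {
    // Reads the compiled bytes of a class from the classpath.
    static byte[] readClassBytes(Class<?> type) throws IOException {
        String resource = type.getName().replace('.', '/') + ".class";
        try (InputStream in = type.getClassLoader().getResourceAsStream(resource)) {
            if (in == null) {
                throw new IOException("class file not found: " + resource);
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        }
    }

    // Returns true when the bytes captured at defineClass equal the original
    // class file, meaning the encryption protected nothing at load time.
    static boolean plaintextWasExposed() {
        try {
            String name = Secret.class.getName();
            byte[] original = readClassBytes(Secret.class);
            byte[] encrypted = DecryptingLoader.xor(original);
            DecryptingLoader loader = new DecryptingLoader(name, encrypted);
            Class<?> loaded = loader.loadClass(name);
            return loaded.getClassLoader() == loader
                    && Arrays.equals(original, loader.intercepted);
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(plaintextWasExposed());
    }
}
```

However strong the cipher, the loader must hand plain bytecode to 'defineClass', so anyone who controls that method can save the decrypted class files and decompile them as usual.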


 Copyright © 1998-2010 Retrologic Systems.
 All rights reserved.