Android ProGuard & Serialisation - exempting files from obfuscation

After battling with a problem involving obfuscated variables I wanted to share the solution.

Left: a green Android robot.  Right: the same robot with its body jumbled up.  In the centre is a wiggly arrow and the word "Obfuscate..."

I have been working hard to finish off app-integrated registration in eVitabu recently, making the onboarding process easier for our end users.  All of my code was working fine: the cloud component (eVitabu management, EVM) had an API that would register new users, and the app had a registration form that would send data to the cloud component.  All my tests were fine until I told Android Studio to build me a release.

I have two build variants when working on the Android app.  First there is the debug variant which is the build I use most of the time.  When testing on my phone, tablet, or Android Virtual Devices (AVDs) it's often the debug build that gets used.  As the name implies, the debug build allows the app to send debug information back to Logcat so I can see what's happening.  Where System.out.println() has been used I'll see that output also.  Uploading a debug build to the Google Play Store results in an error.

My problem came with the release build.  Debug output is turned off in a release and, critically, the build is run through ProGuard.

What is ProGuard?

ProGuard does several jobs.  One of these is to reduce the size of your app by shrinking it (minification): classes, fields and methods that appear to be unused are removed, so not everything you write becomes part of the compiled binary.  Another job is to obfuscate code, so if your app is reverse engineered it is more difficult to understand.  From a technical perspective there's nothing particularly special about eVitabu, so I could turn obfuscation off, but it was a default setting and all has been fine up until now...

Problems with obfuscated variables

For the registration process, the app fires JSON [1] encoded data at EVM.  EVM then takes the submission and parses it into an object which in turn gets saved as a user.  My problem was that now the variable names in the JSON data were obfuscated, and the first I knew about it was when I attempted a registration from the release build.  Luckily this was something I attempted before making the latest version available to everyone else!

Unfortunately, for EVM this obfuscation meant that whereas initially the app sent field to content pairs (e.g. forename --> Jonathan), after ProGuard had run EVM was receiving obfuscated field names (e.g. a --> Jonathan).  This broke registration, as EVM expects pairings to have a known field name.  Fortunately I was able to see the runtime errors on the EVM web server, and could see I was receiving ErrorException: Undefined array key "forename" in /AddUser.php.  This was confusing initially because I knew I had not changed my code.

To do some more investigating I enabled debug within EVM and could see it was receiving obfuscated variable names.  While that's probably obvious to you, dear reader, given the build up to this point, it had taken me about fifteen minutes of thinking and testing to come to this conclusion.  An example of the JSON data, parsed into a PHP array is below.

Array
(
    [a] => Jonathan
    [b] => Haddock
    [c] => Just a bio for this test.
    [d] => Somerhays
    [e] => United Kingdom
)

When it should be getting:

Array
(
    [forename] => Jonathan
    [surname] => Haddock
    [bio] => Just a bio for this test.
    [town] => Somerhays
    [country] => United Kingdom
)

Researching online I found this article from GuardSquare which said:

By default, ProGuard obfuscates the code: it assigns new short random names to classes and class members. It removes internal attributes that are only useful for debugging, such as source files names, variable names, and line numbers

This confirmed that obfuscation was my problem.

Workaround - don't obfuscate anything

In the file app/proguard-rules.pro there's a section marked "# Add any project specific keep options here".  Under that section you can add the -dontobfuscate flag to turn obfuscation off entirely:

# Add project specific ProGuard rules here.
# By default, the flags in this file are appended to flags specified
# in /opt/android_sdk/tools/proguard/proguard-android.txt
# You can edit the include path and order by changing the proguardFiles
# directive in build.gradle.
#
# For more details, see
#   http://developer.android.com/guide/developing/tools/proguard.html

# Add any project specific keep options here:
-dontobfuscate

This means that no obfuscation will take place, which is not ideal from a security perspective.  Nonetheless, I tried disabling obfuscation initially and while that confirmed the obfuscation was my problem (I got my variable names back), it highlighted another problem.  Minification was removing some of my variables.  EVM should also receive information about the user's language choices and town, but this information was missing from the output.

Clearly this wasn't going to resolve my issue, and more research was required.

Fix 1 - ensuring variable names

I talked the problem through with my friend Mike, a Java programmer by trade and someone who was originally on the eVitabu project.  He mentioned that exporting to JSON (serialisation [2]) uses the reflected variable name by default.  Essentially this means the conversion process takes the variable's name, as it exists in the compiled app, and uses it for the JSON field name.  This is why after obfuscation my fields were coming through as a, b, c etc.
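To make Mike's point concrete, here is a minimal sketch of how a reflection-based serialiser picks its JSON keys.  It is not the library eVitabu uses: FieldNameJson, PlainUser and ShrunkUser are hypothetical names invented for this illustration, and ShrunkUser is simply a hand-written guess at what a class might look like once its fields have been renamed.

```java
import java.lang.reflect.Field;
import java.util.StringJoiner;

// Hypothetical stand-in for a reflection-based JSON library:
// every declared field becomes a key, named after the field itself.
class FieldNameJson {
    static String toJson(Object source) {
        StringJoiner json = new StringJoiner(",", "{", "}");
        for (Field field : source.getClass().getDeclaredFields()) {
            if (field.isSynthetic()) {
                continue; // skip compiler-generated fields
            }
            try {
                field.setAccessible(true);
                // The key comes from the field's name at runtime.
                json.add("\"" + field.getName() + "\":\"" + field.get(source) + "\"");
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return json.toString();
    }
}

// The class as written in the source code.
class PlainUser {
    public String forename = "Jonathan";
    public String surname = "Haddock";
}

// Assumption: roughly what the same class looks like after renaming.
class ShrunkUser {
    public String a = "Jonathan";
    public String b = "Haddock";
}

class FieldNameDemo {
    public static void main(String[] args) {
        // Keys are "forename" and "surname".
        System.out.println(FieldNameJson.toJson(new PlainUser()));
        // Same values, but the keys are now "a" and "b".
        System.out.println(FieldNameJson.toJson(new ShrunkUser()));
    }
}
```

The serialiser has no way of knowing what the fields used to be called, so whatever names survive the build are the names the server receives.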

Fortunately there's a fix for this - you override the variable's serialised name.  To do this we specify our chosen name just above the declaration using the @SerializedName("") annotation:

@SerializedName("forename")
public String forename;

Applying this at every declaration allowed the variable name to be preserved after obfuscation.  I was still missing some fields though - onwards....
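Why does the annotation survive when the variable name doesn't?  The name inside the annotation is a string value rather than an identifier, so renaming the field leaves it alone.  The sketch below illustrates the idea under some assumptions: JsonName is a hypothetical annotation playing the role of @SerializedName, NamedJson is a toy serialiser written for this example, and RenamedUser imitates a class whose field has been renamed to a.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.StringJoiner;

// Hypothetical annotation standing in for @SerializedName.
// RUNTIME retention is needed so reflection can read it.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface JsonName {
    String value();
}

// Toy serialiser: prefers the annotated name, falls back to the field name.
class NamedJson {
    static String toJson(Object source) {
        StringJoiner json = new StringJoiner(",", "{", "}");
        for (Field field : source.getClass().getDeclaredFields()) {
            if (field.isSynthetic()) {
                continue; // skip compiler-generated fields
            }
            try {
                field.setAccessible(true);
                JsonName override = field.getAnnotation(JsonName.class);
                String key = (override != null) ? override.value() : field.getName();
                json.add("\"" + key + "\":\"" + field.get(source) + "\"");
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return json.toString();
    }
}

// Assumption: the field has been renamed to "a", the annotation is untouched.
class RenamedUser {
    @JsonName("forename")
    public String a = "Jonathan";
}

class JsonNameDemo {
    public static void main(String[] args) {
        System.out.println(NamedJson.toJson(new RenamedUser())); // {"forename":"Jonathan"}
    }
}
```

Note that this only protects the name.  A field that has been removed altogether has nothing left to annotate, which is the other half of the problem.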

Fix 2 - avoid minification for some files

For some reason, the minification process was removing fields that it "thought" weren't going to be used.  I never got to the bottom of why that was presumed to be the case, but fixing it ahead of the deadline was the important bit!!

Using ProGuard rules it's possible to tell the process to avoid minification for only specific files.  This is good because it means I can keep the app size down while ensuring the app doesn't have important fields dropped.

Opening up app/proguard-rules.pro again I can add two rules with the -keepclassmembers instruction:

# Add any project specific keep options here:
-keepclassmembers class org.africanpastors.e_vitabu.util.DataProcessor {*;}
-keepclassmembers class org.africanpastors.e_vitabu.activity.AddUser {*;}

We specify the names of the classes that we want to preserve, and importantly end the line with {*;}.  As far as I understand the syntax, the * is a wildcard that matches every field and method, so essentially we're telling ProGuard not to remove (or rename) any members of those classes.

Conclusion

This process showed, again, why testing is so important in software development.  If I'd just pushed my latest version out to everyone, new users would have been unable to register within the app which was a key feature of the new release.


Banner image: Demonstrating my artistic merit, I took the Android logo and jumbled it up 🙂.

[1] JavaScript Object Notation

[2] Serialization to my American friends

Disclaimer: URLs in this post don't reflect their true names.  Ironically, they've been obfuscated!