Having Fun With JavaScript Obfuscation

The dynamic nature of JavaScript can be a blessing and a curse, but one thing is certain: it allows for some pretty neat obfuscation techniques.

A simple function call in JavaScript looks like this:

console.log("hello");
// hello

But did you know it could also look like this?

console["log"]("hello");
// hello

Both produce the same output despite having different syntax. Let's look at what's happening here.

AST Explorer is a tool that displays a tree view of a code snippet's Abstract Syntax Tree. An Abstract Syntax Tree (AST) is just a tree representation of our source code. This is useful for parsing and transforming source code.

Pasting the two earlier statements into AST Explorer gives us the following view:

Tree view for two console.log statements as shown above

On the left are our original code snippets, and on the right is the tree view representing the AST of the code. This tree view can be expanded to view the children, but we will use the JSON representation instead.

As you may have noticed, the body property contains both of our console.log statements as ExpressionStatement nodes. The JSON representation of the first ExpressionStatement in the body looks like this:

// Statement One console.log("Hello")
{
    "type": "ExpressionStatement",
    "expression": {
        "type": "CallExpression",
        "callee": {
            "type": "MemberExpression",
            "object": {
                "type": "Identifier",
                "name": "console"
            },
            "computed": false,
            "property": {
                "type": "Identifier",
                "name": "log"
            }
        },
        "arguments": [{
            "type": "StringLiteral",
            "value": "Hello"
        }]
    }
}

Let's break this down:

Our statement is defined as an ExpressionStatement whose expression is of type CallExpression. The callee of this CallExpression is a MemberExpression where the object is an Identifier with the name console and the property is an Identifier with the name log. The only argument passed into the CallExpression is a StringLiteral with the value Hello.

Breaking source code down into its building blocks can be tough to wrap your mind around, which is why I urge you to tinker in AST Explorer with various code snippets to learn how they're constructed.
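To make that nesting concrete, here is a minimal sketch that walks the same JSON depth-first and records each node's type. The helper name collectTypes is made up for this illustration; real tooling would use a proper traversal library rather than a hand-rolled walker.

```javascript
// The first statement's AST, copied from the JSON above.
const statement = {
  type: "ExpressionStatement",
  expression: {
    type: "CallExpression",
    callee: {
      type: "MemberExpression",
      object: { type: "Identifier", name: "console" },
      computed: false,
      property: { type: "Identifier", name: "log" },
    },
    arguments: [{ type: "StringLiteral", value: "Hello" }],
  },
};

// Hypothetical helper: visit every nested object/array and collect node types in order.
function collectTypes(node, out = []) {
  if (Array.isArray(node)) {
    node.forEach((child) => collectTypes(child, out));
  } else if (node && typeof node === "object") {
    if (typeof node.type === "string") out.push(node.type);
    for (const key of Object.keys(node)) collectTypes(node[key], out);
  }
  return out;
}

console.log(collectTypes(statement).join(" > "));
// ExpressionStatement > CallExpression > MemberExpression > Identifier > Identifier > StringLiteral
```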

So what makes this different from our second console log statement? Let's view the AST of that.

{
    "type": "ExpressionStatement",
    "expression": {
      "type": "CallExpression",
      "callee": {
        "type": "MemberExpression",
        "object": {
          "type": "Identifier",
          "name": "console"
        },
        "computed": true,
        "property": {
          "type": "StringLiteral",
          "value": "log"
        }
      },
      "arguments": [
        {
          "type": "StringLiteral",
          "value": "Hello"
        }
      ]
    }
  }

Let's break this down again and see what's changed:

Our statement is defined as an ExpressionStatement whose expression is of type CallExpression. The callee of this CallExpression is a MemberExpression where the object is an Identifier with the name console and, oh, the property is a StringLiteral with the value log instead of an Identifier with the name log. We also notice that computed is now set to true, which marks bracket-style access where the property is an expression evaluated at runtime. The argument remains the same: a StringLiteral with the value Hello.
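Because a computed property is just an expression, the string does not have to appear literally in the source at all. The following sketch (the variable names are only for illustration) shows a few spellings that all resolve to the same function:

```javascript
// Four ways to reach the same property; only the first is non-computed.
const viaDot = console.log;
const viaLiteral = console["log"];
const viaConcat = console["l" + "og"];
const viaCharCodes = console[String.fromCharCode(108, 111, 103)]; // "l", "o", "g"

console.log(viaDot === viaLiteral, viaDot === viaConcat, viaDot === viaCharCodes);
// true true true
```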

Ok, ok.. We know that JavaScript is weird and that we can represent code in multiple ways. How can we (ab)use this for obfuscation?

A common transformation of most obfuscators is string concealment. By hiding text in our source we make it annoying for those trying to statically analyze our program.

const a = ["log", "hello"];

// ... imagine a ton of code between here

console[a[0]](a[1]); // hello

By placing our strings in an array and doing an array lookup for our console.log call and argument we've made the source slightly more annoying to work with. While it's very easy in this scenario to map these array indices to their string counterparts using your brain, imagine if the array was composed of hundreds of strings.
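Obfuscators often go one step beyond a plain array and hide the real indices as well. The sketch below is one possible variation, not taken from any particular tool: the names table, OFFSET, and lookup are invented for this example, and the offset value is arbitrary.

```javascript
// Hypothetical string table; the order no longer matches the order of use.
const table = ["hello", "log"];

// Arbitrary offset so that the indices in the code are not the real array indices.
const OFFSET = 0x1f4;

// All string access goes through one accessor that undoes the offset.
function lookup(index) {
  return table[index - OFFSET];
}

console[lookup(0x1f5)](lookup(0x1f4)); // hello
```

A reader now has to understand lookup before any index means anything, although a Babel pass can still evaluate lookup statically and inline the results.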

It may not be feasible to replace each array lookup with its string by hand, but it is super trivial to do with the Babel libraries. @babel/parser takes source code and turns it into an AST, @babel/traverse lets us traverse and modify that syntax tree, and @babel/generator takes our modified AST and returns valid JavaScript code for our viewing pleasure.

const parse = require('@babel/parser').parse;
const traverse = require('@babel/traverse').default;
const generate = require('@babel/generator').default;
const t = require('@babel/types');

const obfuscatedCode = `
const a = ["log", "hello"];
console[a[0]](a[1]);
`

const ast = parse(obfuscatedCode);
let stringArr = [];

traverse(ast, {
    VariableDeclarator(path) {
        const node = path.node;
        if (node.id.name === 'a') { // Find array with the name a
            stringArr = node.init.elements.map(e => e.value); // Add all its elements to an array in our program
            path.parentPath.remove(); // Remove this node as array lookup would no longer be necessary
        }
    },
    MemberExpression(path) {
        const node = path.node;
        if(node.object.name === 'a') { // Find all expressions where the object's name is "a"
            const arrayIndex = node.property.value; // get the index of the lookup
            path.replaceWith(t.stringLiteral(stringArr[arrayIndex])); // replace the array lookup with the string literal
        }
    }
});

console.log(generate(ast).code); // console["log"]("hello");

This is a super barebones example of how we would use Babel to achieve this. It assumes the array is named a, holds only literals, and is always indexed with a numeric literal, so real code would need error checking and stricter AST checks, but it works perfectly for this example.

Another variant of this string concealment is to use encoded or encrypted strings. This can look like a simple Base64 decode:

console[atob("bG9n")](atob("aGVsbG8=")); // hello

or even a cipher of some sort:

function xor(str) {
  const key = "supersecretkey";
  return [...str].map((c, i) => String.fromCharCode(c.charCodeAt(0) ^ key.charCodeAt(i % key.length))).join('');
}

console[xor('\x1F\x1A\x17')](xor('\x1B\x10\x1C\t\x1D')) // hello
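Since XOR with a fixed key is its own inverse, the same function can produce the encoded literals in the first place. Here is a small sketch of that build step; toEscapedLiteral is a helper invented for this example, and the key is the one used above.

```javascript
// Same XOR routine as above, with the key as a parameter.
function xor(str, key = "supersecretkey") {
  return [...str]
    .map((c, i) => String.fromCharCode(c.charCodeAt(0) ^ key.charCodeAt(i % key.length)))
    .join("");
}

// Hypothetical helper: render every character as a \xNN escape for pasting into source.
function toEscapedLiteral(str) {
  return [...str]
    .map((c) => "\\x" + c.charCodeAt(0).toString(16).padStart(2, "0").toUpperCase())
    .join("");
}

console.log(toEscapedLiteral(xor("log"))); // \x1F\x1A\x17
console.log(xor(xor("hello")));            // hello
```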

Transforming these examples is also pretty trivial and can be done in about 30 lines of code or less. But let's say we don't want to use an AST to figure out these strings. How else could we go about it?

A Debugger!

Adding a debugger statement to our last example and placing it into your Browser's Developer Tools should run the code and pause at the breakpoint. From here, we can reveal those concealed strings simply by hovering over the function calls.

Chrome DevTools showing "log" when hovering over a xor function call.

Hm, is there any way we could make this any more annoying to work with? One trick used in the malware world is to obfuscate API calls by hashing the desired names and then traversing an export table, hashing each name in the table and comparing it to the hashed value of the API we desire (Read more here). Can we apply this to JavaScript?

The equivalent of the Export Address Table in the browser world is the window object; in Node.js it is the global object, and globalThis refers to the global object in both environments. This means that the function call console.log("Hello") can also be written as window.console.log("Hello"). We can get a list of all of an object's own property names, including non-enumerable ones, using Object.getOwnPropertyNames:

Object.getOwnPropertyNames(window)

// ['Object', 'Function', 'Array', 'Number', 'parseFloat', 'parseInt', 'Infinity', 'NaN', 'undefined', 'Boolean', 'String', 'Symbol', 'Date', 'Promise', 'RegExp', 'Error', 'AggregateError', 'EvalError', 'RangeError', 'ReferenceError', 'SyntaxError', 'TypeError', 'URIError', 'globalThis', 'JSON', 'Math', 'console', 'Intl', 'ArrayBuffer', 'Uint8Array', 'Int8Array', 'Uint16Array', 'Int16Array', 'Uint32Array', 'Int32Array', 'Float32Array', 'Float64Array', 'Uint8ClampedArray', 'BigUint64Array', 'BigInt64Array', 'DataView', 'Map', 'BigInt', 'Set', 'WeakMap', 'WeakSet', 'Proxy', 'Reflect', 'FinalizationRegistry', 'WeakRef', 'decodeURI', 'decodeURIComponent', 'encodeURI', 'encodeURIComponent', 'escape', 'unescape', 'eval', 'isFinite', 'isNaN', 'Option', 'Image', 'Audio', 'webkitURL', …]
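Object.getOwnPropertyNames matters here because most built-ins are non-enumerable, so the more familiar Object.keys would miss them. A quick sketch using globalThis so it runs in both the browser and Node.js:

```javascript
// Built-in globals and Math methods are non-enumerable own properties.
const enumerableMathKeys = Object.keys(Math);
const allMathNames = Object.getOwnPropertyNames(Math);
const allGlobalNames = Object.getOwnPropertyNames(globalThis);

console.log(enumerableMathKeys.length);      // 0
console.log(allMathNames.includes("min"));   // true
console.log(allGlobalNames.includes("Math")); // true
```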

To implement the obfuscation technique we described, we can traverse this list, hash each name, and compare the result to the hardcoded hash of the desired object or function.

To do this we will use a JavaScript implementation of the SuperFastHash algorithm. This is not a cryptographically secure hashing algorithm, but it should be fine for the purposes of this example. The hashing function is defined as follows:

function hash(message) {
  let { length } = message;

  let index = 0;
  let digest = length;

  for (let n = length >> 2; n > 0; n--) {
    digest = digest + (message[index++] | message[index++] << 8) >>> 0;
    digest ^= digest << 16 >>> 0 ^
              (message[index++] | message[index++] << 8) << 11;
    digest = digest + (digest >>> 11) >>> 0;
  }

  switch (length & 3) {
  case 3:
    digest = digest + (message[index++] | message[index++] << 8) >>> 0;
    digest ^= digest << 16 >>> 0;
    digest ^= message[index++] << 18;
    digest = digest + (digest >>> 11) >>> 0;
    break;
  case 2:
    digest = digest + (message[index++] | message[index++] << 8) >>> 0;
    digest ^= digest << 11 >>> 0;
    digest = digest + (digest >>> 17) >>> 0;
    break;
  case 1:
    digest = digest + message[index++] >>> 0;
    digest ^= digest << 10 >>> 0;
    digest = digest + (digest >>> 1) >>> 0;
  }

  digest ^= digest << 3 >>> 0;
  digest = digest + (digest >>> 5) >>> 0;
  digest ^= digest << 4 >>> 0;
  digest = digest + (digest >>> 17) >>> 0;
  digest ^= digest << 25 >>> 0;
  digest = digest + (digest >>> 6) >>> 0;

  return digest | 0;
}

The parameter is the message as a Uint8Array and the result is a 32-bit digest.

Running this function on the string log would look like:

hash(new Uint8Array([..."log"].map(c => c.charCodeAt(0))))

and our output is -1739799878.

And utilizing this for function lookups and function calls would look like:

function resolve(obj, hashConstant) {
    for(const name of Object.getOwnPropertyNames(obj)) {
        const hashedName = hash(new Uint8Array([...name].map(c => c.charCodeAt(0))));
        if(hashedName === hashConstant) {
            return obj[name];
        }
    }
}

resolve(console, -1739799878); // reference to the log function
resolve(console, -1739799878)("hello"); // equivalent to console.log("hello")
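Note that resolve quietly returns undefined when nothing matches, and with a 32-bit hash two names could in principle collide. The sketch below shows a stricter variant. To stay short it swaps in 32-bit FNV-1a as a stand-in hash rather than SuperFastHash, and the names fnv1a and resolveOrThrow are invented for this example.

```javascript
// Stand-in hash: 32-bit FNV-1a over char codes (byte-wise for ASCII names).
// This is NOT the SuperFastHash function used above.
function fnv1a(text) {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0;
}

// Hypothetical stricter resolver: fail loudly on zero or multiple matches.
function resolveOrThrow(obj, hashConstant, hashFn = fnv1a) {
  const matches = Object.getOwnPropertyNames(obj).filter((name) => hashFn(name) === hashConstant);
  if (matches.length !== 1) {
    throw new Error(`expected exactly one match, found ${matches.length}`);
  }
  return obj[matches[0]];
}

// In real use this constant would be computed at build time and pasted in as a number.
const MIN_HASH = fnv1a("min");
const min = resolveOrThrow(Math, MIN_HASH);
console.log(min(2, 4)); // 2
```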

We could take this a step further and use it not only for function lookups but also for calling our functions.

function resolve(obj, hashConstant, callFunction, ...args) {
    for(const name of Object.getOwnPropertyNames(obj)) {
        const hashedName = hash(new Uint8Array([...name].map(c => c.charCodeAt(0))));
        if(hashedName === hashConstant) {
            if(callFunction) {
                return obj[name](...args);
            }
            return obj[name];
        }
    }
}

resolve(console, -1739799878, true, "Hello"); // equivalent to console.log("Hello")

This looks fairly readable, but with some code minification, concealed variable names, and string concealment in the actual resolve function, we can get something truly ugly:

let IIiIIiIIiiIIiIIii = window;
function IiiIIiIIiIIiIIiIIiiiI(text) {
    let result = '';
    for (let i = 0; i < text.length; i++) {
        result += String.fromCharCode(text.charCodeAt(i) ^ '2246822507478238'.charCodeAt(i % '2246822507478238'.length));
    }
    return result;
}

function IiIIiIIIiIIiiiI(emojis) {
    return [...emojis].map(e  => String.fromCharCode([...'😀😃😄😁😆😅😂🤣😊😇🙂🙃😉😌😍🥰😘😗😙😚😋😛😝😜🤪🤨🧐🤓😎🤩🥳😏😒😞😔😟😕🙁☹️😣😖😫😩🥺😢😭😤😠😡🤬🤯😳🥵🥶😱😨😰😥😓🤗🤔🤭🤫🤥😶😐😑😬🙄😯😦😧😮😲🥱😴🤤😪😵🤐🥴🤢🤮🤧😷🤒🤕🤑🤠😈👿👹👺🤡💩👻💀☠️👽👾🤖🎃😺😸😹😻😼😽🙀😿😾🕐🕑🕒🕓🕔🕕🕖🕗🕘🕙🕚🕛🕜🕝🕞🕟🕠🕡🕢🕣🕤🕥🕦🕧'].findIndex(a => a === e))).join('')
}

const IIIiIIiIIiiIiiIIi = function(str, seed = 0) {
    let h1 = 0xdeadbeef ^ seed, h2 = 0x41c6ce57 ^ seed;
    for (let i = 0, ch; i < str.length; i++) {
        ch = str.charCodeAt(i);
        h1 = Math.imul(h1 ^ ch, 2654435761);
        h2 = Math.imul(h2 ^ ch, 1597334677);
    }
    h1 = Math.imul(h1 ^ (h1>>>16), 2246822507) ^ Math.imul(h2 ^ (h2>>>13), 3266489909);
    h2 = Math.imul(h2 ^ (h2>>>16), 2246822507) ^ Math.imul(h1 ^ (h1>>>13), 3266489909);
    return 4294967296 * (2097151 & h2) + (h1>>>0);
};

function IiIiiIIiIIIIIIii(IIiIIiIIiIIiIiIIi, IIIIiIiIiIIIiIiIiIii, IIIiiIIiIIiiIII, ...IIiIiIiiiIiIiIiIi) { return this[IiiIIiIIiIIiIIiIIiiiI(IiIIiIIIiIIiiiI('🕜🤐🤡🤮👿😯'))][IiiIIiIIiIIiIIiIIiiiI(IiIIiIIIiIIiiiI('😷🤕🤥🕘😵👹☠😦💩😦🥴🙄😴🥱🕜🤠💩🤕😦'))](IIiIIiIIiIIiIiIIi)[IiiIIiIIiIIiIIiIIiiiI(IiIIiIIIiIIiiiI('🤧👿🤑😐👺🤥'))](IiiIiIIIiIIi => IIIiIIiIIiiIiiIIi(IiiIiIIIiIIi) === IIIIiIiIiIIIiIiIiIii)[IiiIIiIIiIIiIIiIIiiiI(IiIIiIIIiIIiiiI('💩🤮😬'))](IIiIIiIIiIIiIiiiIi => IIIiiIIiIIiiIII ? IIiIIiIIiIIiIiIIi[IIiIIiIIiIIiIiiiIi] : IIiIIiIIiIIiIiIIi[IIiIIiIIiIIiIiiiIi](...IIiIiIiiiIiIiIiIi))[+![]]}

console.log(IiIiiIIiIIIIIIii(IiIiiIIiIIIIIIii(IIiIIiIIiiIIiIIii, 1333326451136123, true), 2318998113347069, false, 2, 4));

The above code is the equivalent of console.log(Math.min(2, 4)). Don't believe me? Run it in your browser's console. The hashing algorithm here is different because this was an early prototype I made while toying with the idea.

And this could be made to be even uglier! But I'll leave that as an exercise for you.

While this is pretty annoying, it doesn't make messing around in the debugger annoying enough.

Trolling the Debugger

trollface

Is there a way we can troll the debugger?

Well, we could always have a setInterval callback that hits a debugger statement, trapping whoever is snooping around in a debugger loop:

setInterval((()=>{debugger}), 10);

But this is too easy, and even if concealed it can be bypassed by renaming the debugger keyword in the browser's source code. I cover this in another blog post.

Well what else can we do?

Generators

JavaScript generators are resumable functions that can yield multiple values. They differ from normal functions in that calling one does not run its body; it returns a special generator object used to manage the execution of the generator, and each call to next() on that object runs the body up to the next yield.

Let's take a look at how they're used.

function* gen() {
  yield "hello";
  yield "world";
}

const generator = gen();

console.log(generator.next().value); // hello
console.log(generator.next().value); // world

How can we apply this to our obfuscated code?

function xor(str) {
  const key = "supersecretkey";
  return [...str].map((c, i) => String.fromCharCode(c.charCodeAt(0) ^ key.charCodeAt(i % key.length))).join('');
}

function* gen() {
  yield xor('\x1F\x1A\x17');
  yield xor('\x1B\x10\x1C\t\x1D');
}

const generator = gen();

console[generator.next().value](generator.next().value);

Now, let's use the debugger to find out what our code is doing.

Generator value is showing undefined in the Chrome Devtools

Huh? The value is undefined. This is because hovering makes the debugger evaluate generator.next().value again to fetch the next value, but we already consumed both values when calling the function, so the exhausted generator returns undefined.

Alright, alright, so let's just put the debugger statement before we yield the generator's values.

Generator values showing "hello" then undefined when being hovered over the same place.

!?!?!?

The Chrome debugger is causing our generator to advance on hover, before the expression is reached in the code. Each hover over the same expression advances the generator again, so the value shown at a given spot is not the value the program would have seen. If you resume the code, you're met with an error, since the generator is now yielding undefined.

Console.log function can't be found due to hovering.
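The same failure can be reproduced without a debugger by making one extra next() call by hand, which is effectively what a single hover does. This sketch uses plain strings instead of the xor helper to keep it short; the variable names peeked and errorName are only for illustration.

```javascript
function* gen() {
  yield "log";
  yield "hello";
}

const generator = gen();

// Simulate one hover: the debugger evaluates the expression and consumes "log".
const peeked = generator.next().value;

let errorName = null;
try {
  // The real call now sees "hello" as the method name and undefined as the argument.
  console[generator.next().value](generator.next().value);
} catch (err) {
  errorName = err.name;
}

console.log(peeked, errorName); // log TypeError
```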

You could obfuscate this further and dynamically create your generator function like so:

Function("return function* a(){}")().constructor // returns a GeneratorFunction

And if you wanted to go a step further, you could use string concealment, export hashing, and other obfuscation techniques to end up with something pretty horrifying.
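As a rough sketch of the dynamic approach, the GeneratorFunction constructor accepts the generator body as a string, much like Function does. The body string and variable names below are made up for this example, and in a browser a Content Security Policy without unsafe-eval would block this kind of construction.

```javascript
// Obtain the GeneratorFunction constructor without writing function* at the call site.
const GeneratorFunction = Function("return function* a(){}")().constructor;

// Build a generator from a string body; in obfuscated code this string would itself be concealed.
const dynamicGen = new GeneratorFunction('yield "log"; yield "hello";');

const it = dynamicGen();
const method = it.next().value;
const message = it.next().value;

console[method](message); // hello
```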

LASTLY

There are always pros and cons to consider when coming up with new obfuscation techniques. For example, IE doesn't support generator functions, which can be an issue if IE is a browser you must support. The code provided is also not complete, as it doesn't account for yielding values in loops, running in different code paths, etc. This is something you would have to sit down and design further. And lastly, while JavaScript's dynamic nature does make for some pretty neat obfuscation techniques, obfuscation will only delay a dedicated reverser from achieving their goal, not prevent them.