Dissecting a VM

"Democritus called these smallest units atoms. The word 'a-tom' means 'un-cuttable'"

What is a virtual machine?

At first glance, the difference between a "virtual" machine and our "non-virtual" ones seems trivial. Why is one virtual and the other non-virtual? Outputs land on the same screen, inputs come from the same keyboard.

In the same way we can construct simulations of the universe, render CGI, or mimic instruments on a computer, we can also simulate a computer on a computer. At their core, our machines are but lightning through a rock: electrons either in a tube or not in a tube, represented respectively with 1 or 0. By recreating these sequences of 1s and 0s, placeholders for physical electrons, we create a virtual machine, a placeholder for our physical one.

1, Language

Have you ever wondered why programming languages are called languages?

Let us first dissect human spoken language. There is English. Arabic. Chinese. Russian. German. Latin. Hebrew. Swedish. Enough languages, each with its own rules and grammar, to stretch the Rosetta Stone to Mars. A language is but a medium to express ideas. The arts (music, painting, theater, dance) are also mediums. Within these are subcategories, ballet or contemporary dance, classical or electronic music, each for a niche. Similarly, programming languages are as varied as spoken ones, each serving a different purpose, to convey different ideas most effectively.

All languages, however, serve a purpose. The general language "mathematics" seeks to quantify a system into parts with clear borders and rules. It is a medium for humans to interact with the natural world. Spoken language cannot quite convey numbers, but it allows for the expression of emotions and time. It is a medium for humans to interact with other humans. And programming languages allow humans to interact with machines. At its core, is a machine not part of the natural world? Well, at its core, is programming not mathematics? And in the same sense that set theory and category theory are niches, programming for data analysis and programming for cyber security have their respective sub-languages as well.

With all these mediums, each serving a different purpose, how can we generalize them into one umbrella term: "language"?

This is where the atom comes into play. In any system, there is an indivisible building block, an atom if you will. For English, it is one of the 26 letters of the alphabet. For physics, it is one of the 6 flavors of quarks. And for computers, at their core, it is 1 or 0.

On its own this is uninteresting, as a block is a block. A singular Lego does not represent or evoke anything. A medium is much more than its blocks: it must create rules for the blocks to stack together and create meaning. It is the 6 pegs on the Lego that dictate the rules for stacking, that allow a pile of bricks to become an effigy of the Eiffel Tower or the Colosseum. In physics, these rules are forces, 4 fundamental ones: strong, weak, gravitational, and electromagnetic. In computers, the rules are called opcodes (the number differs between computers and processors; by one count, x86 has 1503). These are fundamental instructions that cannot be broken down further, but can be chained to create complex executions.

For example, the tidal force we learn about in middle school, the flow of the wave above and the ebb of the water under, pressure from all sides via Pascal's law, coupled with Bernoulli's principle for fluid dynamics: it all breaks down into those 4 primal forces. And for a computer, the forces are the opcodes.
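To make the chaining idea concrete, here is a minimal sketch in Python. The three opcodes (PUSH, ADD, MUL), their numbers, and the stack they act on are all made up for illustration and do not come from any real processor or VM. Each handler is a primitive that cannot be split further, yet chaining five of them evaluates (2 + 3) * 4.

```python
# Hypothetical opcode numbers; a real machine picks its own.
PUSH, ADD, MUL = 0, 1, 2

def op_push(stack, value):
    stack.append(value)

def op_add(stack, _operand=None):
    b, a = stack.pop(), stack.pop()
    stack.append(a + b)

def op_mul(stack, _operand=None):
    b, a = stack.pop(), stack.pop()
    stack.append(a * b)

# The "rules": each opcode number maps to exactly one primitive action.
HANDLERS = {PUSH: op_push, ADD: op_add, MUL: op_mul}

def chain(steps):
    """Apply (opcode, operand) pairs in order and return the final stack."""
    stack = []
    for opcode, operand in steps:
        HANDLERS[opcode](stack, operand)
    return stack

# (2 + 3) * 4, built purely by chaining primitives.
result = chain([(PUSH, 2), (PUSH, 3), (ADD, None), (PUSH, 4), (MUL, None)])
print(result)  # [20]
```

No single opcode here knows what "(2 + 3) * 4" means. The meaning only appears in the order they are chained.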

2, Memory

A machine (virtual or non-virtual) would be useless without state. Imagine a calculator that could add and subtract numbers, but could only hold one number at a time. It would be quite useless. Imagine a book, its paper able to convey any thought or idea, but only able to hold a single word at a time. We see that simply having blocks, and rules to chain the blocks, is not enough. We need storage for multiple blocks in order for meaning and purpose to be realized.

Here we begin to distinguish virtual and non-virtual machines. If you unscrew the cover of a non-virtual machine, you see wires, tubes, and boards jutting out at odd angles. It uses physical pieces of hardware to save the state of its electrons: its 1s and 0s. The hard drive, which fills up as you store more and more data, is a disk with tiny magnetized regions to represent state. A virtual machine, however, runs on the physical one. To represent state and storage, a plethora of methods can be used. A simple but effective one is to set aside an area of memory on the real, non-virtual machine to store numbers. Most programming languages have the concept of an array, a buffer, a linked list. These are virtual storage systems that ultimately do become physical transistors way down the line. The difference here is that our non-virtual machine has access to the physical storage. Our virtual machine does not: it only has access to the non-virtual machine, which in turn converts its reads and writes to physical storage.
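As a rough sketch of that idea, a VM's entire "hardware" memory can be nothing more than a byte buffer that the host program allocates. The class name `Memory`, the 256-byte size, and the 16-bit little-endian cell layout below are assumptions chosen for illustration, not how any particular VM does it.

```python
class Memory:
    """A VM's storage: just a byte buffer owned by the host program."""

    def __init__(self, size=256):
        # The host (non-virtual) machine allocates this; the VM only sees indexes.
        self.cells = bytearray(size)

    def _check(self, address):
        # Reject addresses whose 2-byte cell would fall outside the buffer.
        if not 0 <= address <= len(self.cells) - 2:
            raise IndexError(f"address {address} is outside VM memory")

    def store(self, address, value):
        """Write a 16-bit unsigned value as two little-endian bytes."""
        self._check(address)
        self.cells[address:address + 2] = value.to_bytes(2, "little")

    def load(self, address):
        """Read a 16-bit unsigned value back from two bytes."""
        self._check(address)
        return int.from_bytes(self.cells[address:address + 2], "little")

mem = Memory()
mem.store(0, 1234)
mem.store(2, 42)
print(mem.load(0) + mem.load(2))  # 1276
```

The VM never touches a transistor or a disk. It asks the host for `cells[address]`, and the host worries about where those bytes physically live.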

3, Interpretation

If Dostoevsky had written Crime and Punishment in Latin, and no one else spoke Latin, then the book would not be a masterpiece. It would be senseless. In the same way, machines need an "interpreter" to understand the sequence we define for the opcodes and properly execute them. This is perhaps the most abstract part of the entire article.

"An opcode is a force that dictates how fundamental units should act. Is that in itself not enough?"

Well, say we had one more physical force, one that acted on the fifth dimension. We, however, live in a four-dimensional world. Though the action of the force is defined, there is no way to understand or even witness it. We must create a dimension, a channel, a mode of interpretation for said forces. In practice, this is actually a multitude of steps for computers. There is a linker, an assembler, a compiler, an interpreter... However, the ultimate purpose is akin to reading a book. We have the words on paper, with syntax and grammar being the rules, characters being the building blocks, and pages being our storage. We just need someone who can associate the words with meaning. We need an interpreter to breathe life into our lines of opcodes.
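Stripped of the metaphor, the interpreter is usually a loop: fetch the opcode at the current position, decide what it means, act on storage, move on. The sketch below assumes a made-up instruction set (SET, ADDM, DEC, JNZ, HALT) and a flat list of integers as bytecode. The sample program sums 5 + 4 + 3 + 2 + 1 into memory cell 0.

```python
# Hypothetical instruction set, numbered 0..4 for illustration only.
SET, ADDM, DEC, JNZ, HALT = range(5)

def run(program, mem_size=4):
    """Fetch-decode-execute loop over a flat list of integers."""
    mem = [0] * mem_size   # the VM's storage
    pc = 0                 # program counter: index of the next opcode
    while True:
        op = program[pc]                      # fetch
        if op == SET:                         # mem[a] = constant
            mem[program[pc + 1]] = program[pc + 2]
            pc += 3
        elif op == ADDM:                      # mem[a] += mem[b]
            mem[program[pc + 1]] += mem[program[pc + 2]]
            pc += 3
        elif op == DEC:                       # mem[a] -= 1
            mem[program[pc + 1]] -= 1
            pc += 2
        elif op == JNZ:                       # jump if mem[a] != 0
            pc = program[pc + 2] if mem[program[pc + 1]] != 0 else pc + 3
        elif op == HALT:
            return mem
        else:
            raise ValueError(f"unknown opcode {op} at offset {pc}")

program = [
    SET, 0, 0,    # offset 0:  mem[0] = 0   (accumulator)
    SET, 1, 5,    # offset 3:  mem[1] = 5   (counter)
    ADDM, 0, 1,   # offset 6:  mem[0] += mem[1]
    DEC, 1,       # offset 9:  mem[1] -= 1
    JNZ, 1, 6,    # offset 11: loop back to offset 6 while counter != 0
    HALT,         # offset 14
]
print(run(program)[0])  # 15
```

Without `run`, the `program` list is only a row of numbers. The loop is what associates each number with an action.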

Closing remarks

To reverse a VM, begin by looking for these 3 components:

  1. Opcodes: Fundamental instructions for blocks to build upon each other
  2. Storage: Arrays or lists to store states and instructions
  3. Interpreter: A stepper that chains one opcode to the next, that reads and writes to memory
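Once the three components have been located, a common first move is to write a small disassembler that turns the raw numbers back into readable instructions. The sketch below is only an illustration: `OPCODE_TABLE`, its mnemonics, and the operand counts are hypothetical stand-ins for whatever you recover from the target, and the bytecode is invented.

```python
# Hypothetical result of reversing: opcode number -> (mnemonic, operand count).
OPCODE_TABLE = {
    0: ("SET", 2),
    1: ("ADDM", 2),
    2: ("DEC", 1),
    3: ("JNZ", 2),
    4: ("HALT", 0),
}

def disassemble(bytecode):
    """Turn a flat list of integers into 'offset  MNEMONIC operands' lines."""
    lines = []
    pc = 0
    while pc < len(bytecode):
        # Unrecognized numbers are shown as UNKNOWN and skipped one at a time.
        name, argc = OPCODE_TABLE.get(bytecode[pc], ("UNKNOWN", 0))
        args = bytecode[pc + 1 : pc + 1 + argc]
        lines.append(f"{pc:04d}  {name} {' '.join(map(str, args))}".rstrip())
        pc += 1 + argc
    return lines

listing = disassemble([0, 1, 5, 2, 1, 3, 1, 3, 4])
print("\n".join(listing))
# 0000  SET 1 5
# 0003  DEC 1
# 0005  JNZ 1 3
# 0008  HALT
```

Keeping the table as plain data means that when a VM reshuffles its opcode numbers between builds, only the table needs to be rebuilt, not the disassembler.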

As I understand it, a majority of this blog's readers are into antibot deobfuscation, so I will give some hints:

Fingerprint js's VM has 51 opcodes at the time of writing (May 2021).

Shape Security's VM has a fluctuating range of 2-500 opcodes.

See how they chain opcodes together, how instruction sequences are defined. See how they store state, see what the opcodes fundamentally do. Good luck.

-musicbot