Virtualization in Commercial Products

Hi all.

Last April, I looked at a piece of commercial software to see how it was protected. It was using Themida 3.0, a very good virtualizer with custom handlers, optimized bytecode and so on. But after looking around for an hour, I figured out that some very important parts of the code were not virtualized. In fact, the entire license system was plain x64!

Today I will try to talk about it without revealing the software for obvious reasons.

So how can a good protection like Themida lead to this result? How is this protection applied during the release process?



Well, in general, software that is even a little bit serious protects the things that should be protected. But if you look closely, you may notice a problem: some important parts of the code are not protected by the virtualization at all! It can be summed up like this: some software vendors want to apply a protection to their software without understanding the point of it. They just tick security features in the virtualizer's menu and click compile. The result: software that looks heavily obfuscated at first glance, but is terribly unprotected in the end.

You may think that this software is not widely used and has not faced enough cracks in the past to justify a new protection. But it is actually software downloaded by over 10 million people and made by a big company.

So let's illustrate, and of course, I can't show that much code for obvious reasons.

This is the code section after applying the virtualization:

The virtualized code is in fact a jump to the VM engine with the corresponding opcode. But as you can see, there is a lot of code remaining here, and this code is not virtualized. And those parts are very important: license checks, functions tied to communication with the server, and DRM stuff.
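To make the difference concrete, here is a minimal toy sketch of what "a jump to the VM engine" means. Everything in it is an assumption made up for illustration: the opcode numbers, the handler names, the stack-based design and the check itself have nothing to do with Themida's real VM.

```python
# Hypothetical toy VM: opcode numbers, handlers and bytecode layout are
# invented for illustration and are unrelated to Themida's real ones.

def h_push(stack, arg):
    stack.append(arg)

def h_add(stack, _arg):
    b, a = stack.pop(), stack.pop()
    stack.append(a + b)

def h_xor(stack, _arg):
    b, a = stack.pop(), stack.pop()
    stack.append(a ^ b)

# Handler table: the dispatcher looks up each opcode here.
HANDLERS = {0x10: h_push, 0x22: h_add, 0x31: h_xor}

def vm_enter(bytecode):
    """Dispatcher: fetch each (opcode, argument) pair and run its handler."""
    stack = []
    for opcode, arg in bytecode:
        HANDLERS[opcode](stack, arg)
    return stack.pop()

def check_native(x):
    # What an unprotected function looks like: the logic is readable as-is.
    return (x + 5) ^ 0x2A

def check_virtualized(x):
    # What remains after virtualization in this toy model: a stub that hands
    # a bytecode stream to the VM. The logic now lives in the bytecode.
    return vm_enter([(0x10, x), (0x10, 5), (0x22, None),
                     (0x10, 0x2A), (0x31, None)])

print(check_native(7) == check_virtualized(7))
```

Both functions compute the same thing, but to understand the second one you first have to understand the handlers and the bytecode format. The code left unvirtualized in the target is in the situation of check_native.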

Because all of the license verifications are visible, I could understand the license system and, by performing a timing attack, simply set the license struct to something valid.

The funny part is that the main license check is not virtualized, but the functions in the vtable that "delete" the object in memory are!

Something very legit that I can show you:

As you can see, the algorithm is readable.

But what can lead to this result? Some people who are highly skilled with VMs told me that Themida and others like VMP can virtualize the entire executable if asked to. So why this result?

In reality, virtualizers have to take into account code performance and their ability to find functions without the PDB file. Virtualizing an entire executable could slow the application down in frequently called procedures. But the most important point is that x86 and x64 executables are tightly locked down after compilation. Everything is melted together, and if you add one byte to the code without fixing up everything that depends on it, the rest will crash. That matters because the same instruction does not have the same length in x86 and in Themida bytecode. The virtualizer has to deal with a lot of problems, and sometimes the solution chosen by Themida is simply not to virtualize.

For example, the x86 instruction set has jump instructions that jump from the current position to another one at a given offset. These are called relative jumps, and they are easy to handle in a new virtualized language if the code that follows is also virtualized. But if the code between the jump and its target changes size, the virtualizer has to recalculate the jump offset to keep the control flow intact. Those jumps are manageable, but they can become complex very fast.
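The offset recalculation can be sketched with a toy model. This is only an illustration under stated assumptions: the Insn class, the instruction names and the byte sizes are hypothetical, and a jump's displacement is taken as the target address minus the address of the end of the jump, as for x86 relative jumps.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Insn:
    name: str
    size: int                      # encoded length in bytes
    target: Optional[int] = None   # index of the jump target in the list, if any


def layout(code):
    """Return the start address of every instruction, assuming base address 0."""
    addrs, pos = [], 0
    for insn in code:
        addrs.append(pos)
        pos += insn.size
    return addrs


def rel_displacements(code):
    """Displacement of each jump: target address minus end address of the jump."""
    addrs = layout(code)
    return {
        i: addrs[insn.target] - (addrs[i] + insn.size)
        for i, insn in enumerate(code)
        if insn.target is not None
    }


# Toy function: a 2-byte jump over a 3-byte instruction.
original = [
    Insn("jmp", 2, target=2),
    Insn("mov", 3),
    Insn("ret", 1),
]

# Same control flow after the middle instruction was re-encoded to 7 bytes
# (sizes are made up for illustration).
rewritten = [
    Insn("jmp", 2, target=2),
    Insn("mov", 7),
    Insn("ret", 1),
]

before = rel_displacements(original)
after = rel_displacements(rewritten)
print(before, after)
```

The jump itself did not change, yet its displacement has to go from 3 to 7 because the code in between grew. On real x86 it gets worse: a short jump only has a signed 8-bit displacement, so a growing gap can force the jump to be re-encoded in a longer form, which in turn moves everything after it.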

There are two other kinds of jump that are as complex as the relative jump, but maybe not as manageable for the virtualizer. The first is the absolute jump: it is a jump like the previous one, but it targets an absolute address in the executable. That could be anything, anywhere, so if the compiler emits an x86 absolute jump from one function to another, the virtualizer has to make a choice: virtualize both functions and hope that the absolute address is preserved, virtualize one function and not the other, or (as in most cases where the destination is unidentified) not virtualize at all.

 


The last kind, and the most problematic, is the absolute indirect jump. It is a jump to an address contained in a register, so the target could be anything: something calculated, something that changes depending on context... So when Themida parses a function to build its tree of basic blocks (blocks of code), it is impossible to know the target basic block of such a jump (the next code executed after it). You could try to recover it by computing it from the preceding instructions, but that is too complex to give an accurate result. You could also use symbolic execution to get the value in the register at that point, but you have no guarantee that the code will be executed, and even if it is, the value could change from one execution to another. We don't know how Themida handles this case, but it is certainly not accurate, and the end result is probably not to virtualize.
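Here is a minimal sketch of why the indirect jump breaks function parsing. It assumes a simple recursive-traversal walk over a toy program; the addresses, the mnemonics (including the made-up "jmp_reg") and the explore function are hypothetical, and this says nothing about how Themida really parses functions.

```python
# Toy program: address -> (mnemonic, operand). Addresses and mnemonics are
# illustrative only; this is not a real x86 decoder.
PROGRAM = {
    0: ("cmp", None),
    1: ("jz", 4),            # conditional direct jump: both successors known
    2: ("mov", None),
    3: ("jmp", 6),           # unconditional direct jump: one known successor
    4: ("mov", None),
    5: ("jmp_reg", "rax"),   # indirect jump: target only known at run time
    6: ("ret", None),
    7: ("mov", None),        # only reachable through the indirect jump
    8: ("ret", None),
}


def explore(program, entry):
    """Recursive-traversal walk that follows only statically known edges."""
    reachable, unresolved = set(), []
    worklist = [entry]
    while worklist:
        addr = worklist.pop()
        if addr in reachable or addr not in program:
            continue
        reachable.add(addr)
        mnemonic, operand = program[addr]
        if mnemonic == "ret":
            continue                              # end of a path
        if mnemonic == "jmp":
            worklist.append(operand)              # single known target
        elif mnemonic == "jz":
            worklist.extend([operand, addr + 1])  # taken and fall-through
        elif mnemonic == "jmp_reg":
            unresolved.append(addr)               # no successor can be added
        else:
            worklist.append(addr + 1)             # plain fall-through
    return reachable, unresolved


reachable, unresolved = explore(PROGRAM, 0)
missed = sorted(set(PROGRAM) - reachable)
print(sorted(reachable), unresolved, missed)
```

The walk ends with one unresolved jump at address 5, and the code at addresses 7 and 8 is never discovered even though it belongs to the function. A tool in that position has an incomplete control-flow graph, and leaving the function alone is the safe choice.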

There are tons of other instructions and circumstances that lead to the same result, but you get the idea: virtualization is not perfect, and virtualizing everything is not a good idea.

But STILL, the developers have an SDK that lets them define which code should be virtualized.

In the end, we can't really say who is at fault here, but the worst could easily be avoided. I know this thread seems very empty, but giving more details could mean action against me from the company. So I hope you found this as interesting as I do :)

 ~r0da
