Initial workaround for difficult Win32 targets
Dongleling
by fravia+
Oh my dongle! Just one day after the spectacular essay by Quine, here we have another incredibly interesting essay by Spyder... Dongle protection schemes 'as wrappers' seem to be all the rage now, and here you have another very sound approach to 'unwrapping' in order to undongle... and I know that some friends working on Cubase will be VERY happy to read these recent essays... Moreover I'm happy to notice that we are (finally) moving from the boring 'how I cracked this specific proggy gna gna gna' (who cares?) to the much more interesting, and important for anybody, 'object oriented' cracking (in this case 'Sentinel Oriented' of course :-) I like this work by Spyder a lot... maybe a little note about this passage he wrote: "...I haven't looked at the exact details of the main encryption, but it isn't weak. I'm pretty sure it is a rolling encryption... So... and now I'll leave you to this tasty reading, prepared for your reversing pleasure by our +contributor Shadow, who gave us a couple of months ago his sound essay about a fundamental tool of the trade: BoundsChecker. Enjoy!
Related: Packers
This isn't a complete crack of anything, just a workaround which makes a seemingly impossible crack possible. I know little about dongles, or about the companies that make them. For this target the company is called "Sentinel", and if you see references to Sentinel, or SSI, then we are talking about the same thing. If you tried for a dead listing and found megabytes of garbage, that's also a good (or bad?) sign. I know little about the actual dongle code, since I never had to look at it. Setting breakpoints on printer port I/O or the like is the *last* way to start cracking any dongle-protected (aka dongled) program (IMHO).
Why was this crack going to be so impossible? Well, the dongle code is a wrapper for a semi-protected executable. When the wrapper is applied it encrypts most of the executable code, attaches itself to the end of the program and takes over the program entry point. When the program is run, the dongle code gets control, and if it is happy with the dongle it will decrypt the executable and pass control to it. The executable can then call back into the dongle code to make whatever specific dongle checks it wants, in order to confirm that the dongle is still present at various times during execution. So even if you crack the initial dongle code you will still need to understand and patch the main executable. The encryption makes understanding difficult and patching almost impossible.

Fully cracking the dongle code is unlikely to be an option. These types of dongles do things like returning a value computed from a variable string passed to them. The algorithm and keys to do this are actually sitting inside the dongle, so there is no easy way of patching the dongle driver code to get the same result. I haven't looked at the exact details of the main encryption, but it isn't weak. I'm pretty sure it is a rolling encryption where changing one byte will affect all following bytes, which is why patching is so hard. I am also pretty sure that the encryption keys come out of the dongle (they should work this way if the dongle suppliers have any sense), which means you are going to need the dongle at least for a couple of hours. But you might also have SOME LUCK. "Cracking without luck it's as impossible as cracking without feeling". If the program has a demo mode, or perhaps some kind of option for network licensing, then it has to be decrypted without the dongle present, so that it can run and then decide if it really needs a dongle or a network license or whatever in order to run fully.

So, knowing this, how do we work around it?
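To see why a "rolling" encryption makes in-place patching so painful, here is a toy sketch in Python. It is NOT the Sentinel algorithm (the essay does not examine that); the cipher, the key value and the function names are all invented for illustration. The only point is that the key state depends on every earlier ciphertext byte, so flipping a single bit in the encrypted file garbles that byte and every byte after it once decrypted.

```python
# Toy "rolling" cipher: hypothetical, for illustration only.
# The running state is updated from each ciphertext byte, so it
# depends on everything encrypted so far.

def _next_state(state: int, cipher_byte: int) -> int:
    # 5 is odd, hence invertible mod 256: a difference in the state
    # never cancels out on later bytes.
    return (state * 5 + cipher_byte + 1) & 0xFF

def roll_encrypt(plain: bytes, key: int) -> bytes:
    state = key & 0xFF
    out = bytearray()
    for p in plain:
        c = p ^ state
        out.append(c)
        state = _next_state(state, c)
    return bytes(out)

def roll_decrypt(cipher: bytes, key: int) -> bytes:
    state = key & 0xFF
    out = bytearray()
    for c in cipher:
        out.append(c ^ state)
        state = _next_state(state, c)
    return bytes(out)

if __name__ == "__main__":
    key = 0x5A                      # assumed key, would come from the dongle
    plain = bytes(range(32))        # stand-in for program code
    cipher = roll_encrypt(plain, key)

    patched = bytearray(cipher)
    patched[4] ^= 0x01              # "patch" one byte of the encrypted file
    result = roll_decrypt(bytes(patched), key)

    damaged = [i for i in range(len(plain)) if result[i] != plain[i]]
    print("first damaged offset:", damaged[0])
    print("damaged bytes:", len(damaged), "of", len(plain))
```

With a scheme of this kind, a patch to the encrypted file would require re-encrypting everything that follows it with the right key, which is why the essay works on a decrypted copy instead.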
My approach is to undo what was done when the wrapper was applied, re-creating something near enough to the original program before it was wrapped, so that normal reversing techniques can be used. Let's look at some interesting bits of a tdump for a sample program:

Entry RVA    0003D950
Image base   00400000

Object table:
#   Name      VirtSize  RVA       PhysSize  Phys off  Flags
--  --------  --------  --------  --------  --------  --------
01  .text     0001E200  00001000  0001E200  00000400  E0000020 [CERW]
02  .rdata    00000800  00020000  00000800  0001E600  C0000040 [IRW]
03  .data     00013E24  00021000  0000D200  0001EE00  C0000040 [IRW]
04  .PAD000   00000E00  00035000  00000E00  00000000  C0000080 [URW]
05  .SSINod   00007A00  00036000  00007A00  0002C000  E0000040 [IERW]
06  .rsrc     00000800  0003E000  00000800  00033A00  C0000040 [IRW]
07  .idata    00000C00  0003F000  00000C00  00034200  C0000040 [IRW]
08  .reloc    00000200  00040000  00000200  00034E00  42000040 [IDR]

SSINod is the added dongle code. PAD000 is where the original program's .reloc segment (which contains import addresses patched up during program load) was. Image base is where the program gets loaded in a flat memory model (most addresses are relative to this base). Entry RVA is the program start address.

The next step is to get the program loaded, decrypted and into Turbo Debugger. If the program runs and stays running while it presents an error message, then the best way is to fire up TD32 in a DOS window and attach to the program. This is one of the wonders of Win32 debugging (dunno how it works): just pick File | Attach and select the window; after a while TD32 will get control of the program and you can look around. You might also be able to start the program with TD32, but getting control once it has run and decrypted isn't so easy. You may even be able to trust the memory image when the program has terminated, or maybe look at the stack and find some locations where a hardware breakpoint can be set.
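The relationship between RVAs, memory addresses and file offsets in that object table can be sketched in a few lines of Python. The numbers are copied from the tdump above; the function name `locate` and the choice to treat a zero "Phys off" as "no data in the file" are my own assumptions for this sketch.

```python
# Section data copied from the sample tdump listing:
# (name, virt_size, rva, phys_size, phys_off)
IMAGE_BASE = 0x00400000
ENTRY_RVA = 0x0003D950

SECTIONS = [
    (".text",   0x1E200, 0x01000, 0x1E200, 0x00400),
    (".rdata",  0x00800, 0x20000, 0x00800, 0x1E600),
    (".data",   0x13E24, 0x21000, 0x0D200, 0x1EE00),
    (".PAD000", 0x00E00, 0x35000, 0x00E00, 0x00000),
    (".SSINod", 0x07A00, 0x36000, 0x07A00, 0x2C000),
    (".rsrc",   0x00800, 0x3E000, 0x00800, 0x33A00),
    (".idata",  0x00C00, 0x3F000, 0x00C00, 0x34200),
    (".reloc",  0x00200, 0x40000, 0x00200, 0x34E00),
]

def locate(rva):
    """Return (section name, file offset or None) for an RVA."""
    for name, vsize, sec_rva, psize, poff in SECTIONS:
        if sec_rva <= rva < sec_rva + max(vsize, psize):
            delta = rva - sec_rva
            # Assumption: Phys off 0 means the section has no bytes on disk
            # (as for the uninitialized .PAD000); also nothing on disk
            # beyond PhysSize.
            if poff == 0 or delta >= psize:
                return name, None
            return name, poff + delta
    return None, None

if __name__ == "__main__":
    name, off = locate(ENTRY_RVA)
    print(f"entry point: VA {IMAGE_BASE + ENTRY_RVA:08X}, section {name}, "
          f"file offset {off:#x}")
    name, off = locate(0x1000)
    print(f".text start: VA {IMAGE_BASE + 0x1000:08X}, file offset {off:#x}")
```

Run against the sample table, this places the entry point inside .SSINod, which matches the description of the wrapper taking over the program entry point.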
I'm sure it's a lot easier with SoftICE, but SoftICE (as far as I know) can't do what TD32 is about to. Anyhow, I assume you somehow managed to get control of the program using TD32. Have a look at the image base + .text RVA and make sure it looks like code, to confirm that the program was decrypted. If not, then maybe you need to borrow a dongle from some friends and start again.

Now open up a dump window, go to (Ctrl-G) the image base address and start marking a block by holding shift and dragging the mouse from the required hex byte (block marking in TD32 is a bit strange and shift cursor keys don't always work). Now go to the end of the .text segment (which should extend the marked block). Select Block | Write (Ctrl-B, W) and TD32 will kindly dump the executable header and the whole decrypted .text segment to a file for you. TD32 seems to write such blocks at about 1-2kB per second, which is pretty slow, so you might take a coffee/cocktail break. (Sorry if I am being too specific about "driving TD32", but the first time I did this I didn't realise TD32 could dump memory blocks to disk. I spent hours writing memory dump 'panes' to the log file and then wrote a whole home-made program to turn the dumps back into binary, so I want to make sure that my readers will know this trick - a little knowledge sometimes goes a long way.)

Now one more thing before we are done with TD32. Take a look at the PAD000 segment. It will be mostly zeros and you will see a block of 32 bit addresses. These are the import entries which were patched up at load time in the .reloc segment; part of the dongle code copies them here, where the main executable expects them to be. The order and number of entries is not the same as in the .reloc segment, so you can't do a simple copy of it, nor fiddle with RVAs to get them in the right place. Set a hardware memory write breakpoint on any of these locations and then restart the program from TD32 (Ctrl-F2).
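As a small worked example of which addresses to mark for the block write, here is the arithmetic in Python, using the image base and the .text RVA and VirtSize from the sample tdump listing. The function name `dump_range` is just a label for this sketch.

```python
# Values taken from the sample tdump listing in this essay.
IMAGE_BASE = 0x00400000
TEXT_RVA = 0x00001000
TEXT_VIRT_SIZE = 0x0001E200

def dump_range(image_base, text_rva, text_size):
    """Return (start, end) of the block to dump; end is exclusive."""
    start = image_base                        # executable header starts here
    end = image_base + text_rva + text_size   # first byte after .text
    return start, end

if __name__ == "__main__":
    start, end = dump_range(IMAGE_BASE, TEXT_RVA, TEXT_VIRT_SIZE)
    print(f"mark from {start:08X} to {end - 1:08X} "
          f"({end - start:#x} bytes)")
```

For the sample program that is a block starting at the image base and ending on the last byte of .text, header included.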
TD32 should break in the middle of the dongle code which is copying across these entries. Take a look at the code and, if it is like the stuff I have seen, you will find a table of pairs (couples) of 32 bit addresses pointed to by ebx. These are pairs of relative (to image load) addresses: an address inside .reloc, together with the required relative address in .PAD000 for the same import. Mark this table and dump it to disk as before. Now we are done with TD32 and, probably, with the dongle.

Putting it back together.

Now we have to insert the decrypted dump back into the executable, which is really just binary editing of a sort, but rather tricky. Take another look at the tdump object table and understand what it means...

Entry RVA    0003D950
Image base   00400000

Object table:
#   Name      VirtSize  RVA       PhysSize  Phys off  Flags
--  --------  --------  --------  --------  --------  --------
01  .text     0001E200  00001000  0001E200  00000400  E0000020 [CERW]
02  .rdata    00000800  00020000  00000800  0001E600  C0000040 [IRW]
03  .data     00013E24  00021000  0000D200  0001EE00  C0000040 [IRW]
04  .PAD000   00000E00  00035000  00000E00  00000000  C0000080 [URW]
05  .SSINod   00007A00  00036000  00007A00  0002C000  E0000040 [IERW]
06  .rsrc     00000800  0003E000  00000800  00033A00  C0000040 [IRW]
07  .idata    00000C00  0003F000  00000C00  00034200  C0000040 [IRW]
08  .reloc    00000200  00040000  00000200  00034E00  42000040 [IDR]

The .text segment RVA is 1000, which means that in memory it starts at the image base + 1000 = 401000. The Phys off is where .text lives in the executable file, here 400. The dump starts at the image base, so in the dump .text begins at offset 1000, while in the file it begins at offset 400. That means the decrypted dump has c00 bytes between 400 and 1000 which need to be cut. The executable file header is not encrypted, and in my experience it is always identical in the memory image. With your favorite editor, hack out the c00 bytes and paste the trimmed dump on top of the start of the executable (overwriting, not inserting).
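The cut-and-paste step can also be done with a short script instead of a hex editor. The following Python sketch assumes the dump begins at the image base and runs to the end of .text, exactly as described; the function name `splice_dump` and the file names in the usage comment are placeholders of mine.

```python
def splice_dump(exe: bytes, dump: bytes,
                text_rva: int, text_phys_off: int, text_phys_size: int) -> bytes:
    """Overlay a memory dump (header + decrypted .text) onto the file image.

    dump[0:text_phys_off]     -> the header, same layout as in the file
    dump[text_phys_off:rva]   -> alignment padding present only in memory (cut)
    dump[rva:rva+phys_size]   -> decrypted .text
    """
    header = dump[:text_phys_off]
    text = dump[text_rva:text_rva + text_phys_size]
    if len(text) != text_phys_size:
        raise ValueError("dump is too short to contain the whole .text segment")
    patch = header + text
    rebuilt = patch + exe[len(patch):]     # overwrite, never insert
    if len(rebuilt) != len(exe):
        raise ValueError("rebuilt file changed size, something is off")
    return rebuilt

# Usage with the sample numbers (file names are hypothetical):
#
#   exe = open("target.exe", "rb").read()
#   dump = open("text.dmp", "rb").read()
#   fixed = splice_dump(exe, dump, 0x1000, 0x400, 0x1E200)
#   open("target_unwrapped.exe", "wb").write(fixed)
```

With the sample values the overlay covers file offsets 0 to 1E5FF, stopping just where .rdata begins at 1E600, and the file size is unchanged.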
Fixing the imports.

We have this dump of pairs of addresses, which the dongle code uses to move import addresses between the .reloc and .PAD000 segments at start up. We should have plenty of space for our own code in the .SSINod segment, as once the program is really cracked it shouldn't ever have to call inside there. I turned the binary dump into hex, then manually edited it into db assembler statements for any old assembler that can create a binary output file. Add A1h in front of the first address of each pair for a mov eax,[????] and A3h in front of the second for a mov [????],eax. Remember to add the image load address to all addresses, as they are also relative. Double check that you are loading from .reloc and storing into .PAD000. I just dumped this binary fragment into the executable at the start of the .SSINod segment. You might then want to patch up the Entry RVA in the executable header to point at this code (or patch a jump in later).
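If you would rather not hand-edit db statements, the same stub can be generated by a script. This Python sketch assumes the dumped table is a plain sequence of little-endian 32 bit pairs in the order (address in .reloc, address in .PAD000), both relative to the image base; that layout is my reading of the description above, so check it against your own dump. The section ranges default to the sample tdump values, and the names `parse_pairs` and `build_stub` are hypothetical.

```python
import struct

def parse_pairs(table: bytes):
    """Split the dumped table into (src_rva, dst_rva) tuples.

    Assumed layout: consecutive little-endian 32 bit pairs.
    """
    return list(struct.iter_unpack("<II", table))

def build_stub(pairs, image_base,
               src_range=(0x40000, 0x40200),    # .reloc in the sample
               dst_range=(0x35000, 0x35E00)):   # .PAD000 in the sample
    """Emit 'mov eax,[src]' / 'mov [dst],eax' for every pair."""
    code = bytearray()
    for src, dst in pairs:
        # The double check from the essay: load from .reloc, store to .PAD000.
        if not src_range[0] <= src < src_range[1]:
            raise ValueError(f"source {src:#x} is not inside .reloc")
        if not dst_range[0] <= dst < dst_range[1]:
            raise ValueError(f"destination {dst:#x} is not inside .PAD000")
        code += b"\xA1" + struct.pack("<I", image_base + src)  # mov eax,[src]
        code += b"\xA3" + struct.pack("<I", image_base + dst)  # mov [dst],eax
    return bytes(code)

if __name__ == "__main__":
    # Made-up pair, only to show the encoding.
    table = struct.pack("<II", 0x40010, 0x35004)
    print(build_stub(parse_pairs(table), 0x400000).hex())
```

Each pair costs ten bytes of stub, so even a few hundred imports fit comfortably inside the 7A00 bytes of .SSINod in the sample.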
Where are we now?

At this stage you should have an executable that has no encryption and will start running a lump of code that moves import addresses to where the original program expects them. The entry point probably points at this code, but don't run it yet, as the import patch is currently followed by dongle garbage. The executable should be exactly the same size as the original; if not, I'm afraid you messed up the editing somewhere. Now you have something you can start cracking. It is fine for dead listing, but before you can try to run it you will have to find the original program entry point and patch in a jump to it at the end of the import patching code.

If you are using IDA it may well discover the program entry point as a standard runtime library startup function, although IDA seems not to automatically recognize the compiler in these reconstructed executables (select a signature manually). The other alternative is to debug another program built with the same tools to find what the entry point code looks like. You could even search for likely looking code; startup code will probably call GetVersion pretty quickly anyway.

Once you have found the original program entry point and patched in the jump to it, the workaround is done. You now have an executable with the same function as the original before it was wrapped. The chances are that this program will still call into the dongle code, but as the authors expected the dongle wrapper to make examination difficult and patching impossible, they are unlikely to have done anything too sophisticated. And if they have, that suits us well: even more challenges!
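The jump at the end of the import patching code is an ordinary near jmp (opcode E9h followed by a 32 bit displacement counted from the end of the instruction). A minimal Python sketch of the encoding follows; the stub location uses the .SSINod start from the sample tdump, while the pair count and the original entry point address are made-up values for illustration.

```python
import struct

def jmp_rel32(jmp_va: int, target_va: int) -> bytes:
    """Encode 'jmp target' placed at jmp_va as E9 + rel32."""
    rel = (target_va - (jmp_va + 5)) & 0xFFFFFFFF   # 5 = length of the jmp
    return b"\xE9" + struct.pack("<I", rel)

if __name__ == "__main__":
    image_base = 0x400000
    stub_va = image_base + 0x36000   # start of .SSINod in the sample
    n_pairs = 3                      # hypothetical number of import pairs
    jmp_va = stub_va + 10 * n_pairs  # each mov/mov pair is 10 bytes
    oep_va = 0x401000                # hypothetical original entry point
    print(f"at {jmp_va:08X}: {jmp_rel32(jmp_va, oep_va).hex()}")
```

Append these five bytes right after the last mov [????],eax of the stub, so that execution falls out of the import copying straight into the original startup code.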
Well, I hope you found this interesting and perhaps useful. I haven't seen this particular Sentinel dongle encrypt anything more than the .text segment; indeed, it doesn't always encrypt the whole .text segment. The same recovery technique could be used for other encryption based schemes, but things get tricky where more than the .text segment is encrypted. You would have to break and capture an image before the program runs enough to 'smudge' its own data segments. The .reloc segment is particularly painful, as the OS 'corrupts' that during loading (however, this kind of stuff is equally difficult for the protectors).

After having written this essay I read Quine's excellent HASP essay, published yesterday by fravia+, where Quine shows how to dump large areas of memory with SoftICE, among other marvels. The HASP protection sounds quite similar to my Sentinel one. I suspect the Sentinel rolling encryption is more robust. I have also seen Win32 HASP protected targets with no encryption (possibly because they also had network licensing options).

Spyder