Code Guts: Grunt Work

Today seemed more like work than anything has so far. All because of Turbo Pascal overlays.

I noticed before that some parts of the LORD code weren't automatically disassembled. Sometimes I went ahead and disassembled the area manually, and just assumed these were functions that weren't used anymore, with no CALLs pointing at them to lead the disassembler there. I still had a nagging suspicion about it though.

Well, today I decided to go through and find all of these and convert them into data and code, just to see what was there. I did this to what I could (with some problems here and there) until I eventually got up into an area of some kind of labeled structures, sometimes followed by JMPs to somewhere in the overlay. Sometimes these jumps seemed to go to places which didn't necessarily make sense compared to what the upper level function doing the calling would have intended to do. And some of the time functions were calling into one of these areas but it was crap code there instead of jumps. I knew I had suddenly found a serious problem with disassembling the overall program if main parts of the game were trying to call things which were simply broken.

Well, the first part of these mystery structures was a call to interrupt 3F. This is the overlay manager interrupt. But that still didn't help me any. I looked and looked trying to find information on how Turbo Pascal overlays work, but was coming up very empty. Luckily, through a link on some forum, I found a single website which explained aspects of it. And I'm going to go ahead and link it here, just for anyone else who might stumble onto this post trying to find the same information that I was.

The article covers Turbo Pascal 5 and 6 overlays, and points out that TP7 had an implementation to take advantage of protected mode on 286 hardware for storing overlays in higher memory. But I guess LORD was compiled to not use that (I don't remember what kind of options you had for overlays in the TP configuration), because I found the information in the article to still hold true.

In a nutshell, Turbo Pascal puts stub segments into your main EXE, one for each unit in the overlay. Each stub starts with a 32-byte structure of information about the unit, such as where it's located in the overlay file. The article documents the structure's format. Following the 32-byte structure are a number of 5-byte records, one for each function exported from the unit. The first two bytes of a record are an interrupt 3F call again. The next two bytes are the offset in the unit of the function you're trying to call. The fifth byte is zero. When you make a call in your main EXE to a function in the overlay, the call lands on one of these 5-byte records in a stub. Through Turbo Pascal magic, the interrupt call loads the unit and dynamically rewrites that 5-byte location into a far jmp to the function, wherever the unit ended up in memory. All future calls to this overlay function then go immediately to the code you want through this far jump. That is, until you call something in another unit and the overlay manager needs to free the memory of a previously loaded unit to make room for the new one. Then it sets the old record back to its original 5 bytes (so that it'll call the overlay manager interrupt again if you try to use it later), and goes on to repeat this whole process to load the new unit for whatever function you called.
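To make the two states of a record concrete, here's a minimal Python sketch of the flip described above. It only models the 5 bytes as I've described them (CD 3F, offset, zero, versus EA, offset, segment). The function names and the example load segment are made up for illustration, and this is not how the real overlay manager is written.

```python
import struct

INT_3F = b"\xCD\x3F"   # INT 3Fh, the overlay manager interrupt
JMP_FAR = 0xEA         # 8086 opcode for a direct far jump (offset, then segment)

def decode_record(rec):
    """Describe one 5-byte stub record in either of its two states."""
    if len(rec) != 5:
        raise ValueError("a stub record is exactly 5 bytes")
    if rec[:2] == INT_3F and rec[4] == 0:
        # Unit not loaded: INT 3Fh, 16-bit function offset, zero byte.
        (offset,) = struct.unpack_from("<H", rec, 2)
        return ("unloaded", offset)
    if rec[0] == JMP_FAR:
        # Unit loaded: far jump to segment:offset.
        offset, segment = struct.unpack_from("<HH", rec, 1)
        return ("loaded", segment, offset)
    return ("unknown",)

def load_record(rec, load_segment):
    """Model the rewrite done when the unit gets loaded (hypothetical helper)."""
    state = decode_record(rec)
    if state[0] != "unloaded":
        raise ValueError("record is not in its INT 3Fh form")
    return struct.pack("<BHH", JMP_FAR, state[1], load_segment)

def unload_record(rec):
    """Model restoring the original bytes when the unit is evicted."""
    state = decode_record(rec)
    if state[0] != "loaded":
        raise ValueError("record is not in its far-jump form")
    return INT_3F + struct.pack("<H", state[2]) + b"\x00"
```

For example, a record of CD 3F 34 01 00 decodes as an unloaded function at offset 0x0134. "Loading" it at a pretend segment of 0x2000 gives EA 34 01 00 20, and unloading that gives back the original five bytes.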

Once I understood this, I also had to understand that IDA (the disassembler I'm using) had already tried to translate some of the bytes in these 5-byte records in each stub. Sometimes it did it correctly and created proper jumps into the overlay. Other times it wrote the far jump, but it actually jumped to the wrong function in a unit. And other times it just made a royal mess out of the bytes, to the point that it was junk code. I didn't even know these 5-byte records actually existed in the EXE until I looked at it in a hex editor, outside of IDA. It wasn't until then that I actually knew the website I linked was still relevant to TP7. In the hex editor, I could clearly see that each 5-byte record started with an interrupt 3F, so I knew it was what I was looking for.
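If you'd rather not eyeball the hex editor, something like this rough Python sketch could hunt for the same pattern in the raw EXE bytes: back-to-back runs of CD 3F, two offset bytes, and a zero. The function name and the min_run parameter are my own invention, and it's only a heuristic. The same bytes can turn up by coincidence (a stub's 32-byte header also begins with CD 3F), so treat hits as places to go look, not as proof.

```python
def find_stub_record_runs(data, min_run=1):
    """Return (file_offset, count) for runs of back-to-back CD 3F ?? ?? 00 records."""
    runs = []
    i = 0
    n = len(data)
    while i + 5 <= n:
        count = 0
        j = i
        # Walk forward 5 bytes at a time while each chunk looks like a record.
        while (j + 5 <= n and data[j] == 0xCD and data[j + 1] == 0x3F
               and data[j + 4] == 0x00):
            count += 1
            j += 5
        if count >= min_run:
            runs.append((i, count))
            i = j          # continue scanning after the run
        else:
            i += 1         # no usable run here, move on one byte
    return runs
```

You'd call it on the whole file, e.g. find_stub_record_runs(open(path, "rb").read(), min_run=2), where path is wherever your copy of the EXE lives. Raising min_run cuts down on the coincidental matches, at the cost of missing stubs that only export one function.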

So now came the task of manually fixing IDA's disassembly. This meant doing the following for each stub:

1. Go to a stub segment in IDA (e.g. stub002). Match up bytes in the 32-byte header structure against the bytes in the external hex editor, so that you know you're looking at the same stub in both.
2. Back in IDA, bring up the segments list and find the overlay segment which corresponds to the stub (e.g. stub002 and ovr002). Make a note of that segment's base address.
3. In your hex editor, look up the first 5-byte record below the 32-byte structure you located. It'll start with "CD 3F", which is the interrupt call (not to be confused with the same interrupt call at the start of the 32-byte structure above). You can ignore those two bytes. The next two are what you want: they're the function offset. Make a note of them.
4. In IDA, go into the stub segment again and click on the first byte of the first 5-byte record below the header structure. Switch to the Hex View window in IDA (not your hex editor!), and you should be synchronized to the corresponding byte (assuming you have synchronization on, which you should).
5. Right-click on the byte and pick Edit. Now you can type in new hex values. Start with "EA", which is the far jump opcode. Next, type in the two offset bytes in the same order they appeared in your hex editor. Then type in the low and high bytes of the overlay segment's base address which you already noted (remember to swap the bytes, since little-endian is low byte first).
6. Right-click and Commit your changes, then go back to the normal IDA code view. You should be able to convert that 5-byte record into code (if it isn't already), and it should turn into a legal far jump to somewhere in the overlay. If you then convert that into a function, its name will stay synced to whatever the overlay function gets named (just with a j_ prefix), which makes it easier to keep up with when code in the main EXE calls the overlay function through this stub.

You now have to repeat this for the rest of the 5-byte records in the stub (the number of which is held in the 32-byte structure above; check the linked site to see where), and then repeat it for the rest of the stubs, too. If I were you, I'd double-check the ones which are already legal far jumps, because they might very well be jumping to the wrong thing!
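The byte juggling in that procedure can be sketched in a few lines of Python, which is handy for working out what to type before touching IDA. This is just arithmetic on the bytes, not an IDA script. The function names are hypothetical, and it assumes the base address you noted from IDA's segments list is a plain 16-bit segment value.

```python
import struct

def far_jump_patch(original_record, overlay_segment_base):
    """Return the 5 bytes to type into IDA's Hex View for one stub record."""
    if len(original_record) != 5 or original_record[:2] != b"\xCD\x3F":
        raise ValueError("expected a 5-byte record starting with CD 3F")
    # The offset is already stored low byte first, so keep it in file order.
    offset_bytes = original_record[2:4]
    # The segment base gets written low byte first as well.
    segment_bytes = struct.pack("<H", overlay_segment_base)
    return b"\xEA" + offset_bytes + segment_bytes

def jump_matches(current_bytes, original_record, overlay_segment_base):
    """Check whether the bytes IDA currently shows are the jump we'd expect."""
    return bytes(current_bytes) == far_jump_patch(original_record,
                                                  overlay_segment_base)
```

So for a record of CD 3F 34 01 00 and a (made-up) overlay segment base of 0x1A2B, far_jump_patch gives EA 34 01 2B 1A. The jump_matches helper is for the double-checking step: feed it the five bytes IDA already turned into a far jump and see whether they agree with what the original record says they should be.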

Luckily I only had to do about a dozen 5-byte records in all for LORD. But it was still rather tedious. And if I hadn't found that website with the stub structures, I would have been up the creek without a paddle!

So hopefully this helps anyone else who might have had overlay problems in IDA. Or maybe just someone who wants to better understand TP overlays in general. If that website ever goes down, or if anyone has any questions about something I said, feel free to ask. Trust me, I know what it's like to not be able to find any information on this stuff!


Friday, April 27, 2012
