Re-creating source code from bootable games

This is a step by step guide to how I re-create the source code from bootable diskettes. Since it's a tedious process, certainly some improvements can be made. Please give me feedback or suggestions.

Tools I use:
dsk2img - reads a diskette and creates an image file of it. (part of the flopper project)
img2dsk - inverse of the above utility.
soft ice debugger - good for poking around and watching code execute.
sourcer disassembler - converts binary executables back to x86 assembly source code.
hex editor - handy for quick edits and extracting code/data sections from disk image.
code editor - DOS brief is all I ever needed.
masm/link - compiler to put it all back together.

The first task in converting a game that came on a floppy disk is to extract the data off the diskette and get it to the hard drive. There are many utilities to make uncompressed disk image files; I used the one from the flopper project, dsk2img.

Once the entire disk contents were saved as a disk image file, it was then a matter of figuring out what sections of the disk were loaded when the game was booted. I then want to take those sections out of the disk image and convert them back to source code.

The easiest way to figure out what code is loaded during the game's boot is to watch it happen with a debugger. I load up soft-ice in DOS and set a breakpoint on BIOS service INT 13 (diskette access) and "boot" the machine from s-ice, using an actual floppy disk with the game saved to it. Since my machine doesn't have a 5.25" drive anymore, I use the img2dsk program to re-write the disk image to a 3.5" diskette. Provided there is no copy protection in the game, writing a 360k diskette image to a 1.44MB diskette will work just fine. The game won't know the difference.

As the game boots under soft-ice, I can watch every INT 13 get called and see what data is being loaded from the disk to memory. The boot loader routine is often jumped to immediately after the first sector of the diskette is loaded and executed, so I don't have to trace too much code to get to the good stuff. Copy protection may be checked at this stage, so watch out.

I keep a fair amount of notes during this process so I know if I've gotten all the main code off the diskette:

destination address     # of sectors   data loaded
50:0                    8 sectors      "c4 e6 10 23 10 bd a1 90 90 cd 19 57"
50:1000                 8 sectors      "89 16 48 59 29 de 92 7a 16 aa 97 34"
50:2000                 8 sectors      "74 27 95 66 18 48 7d 71 7a d7 e2 d0"
50:3000                 8 sectors      "56 90 16 47 7a 8c 6d dd 1f 6a 8e 1a"

From the INT 13 interface I can tell how many sectors are loaded, from what section of the diskette and where the data is loaded to in memory.
After each INT 13 load sector is called, I go look at the destination address and check what data was copied into memory.
Check with ralph brown if you're a little fuzzy on the INT 13 parameters.

After the main code section of the game is loaded, the loader will then jump to it in memory, often with a far jump. I make note of where in the code the 1st jump is; it's not always at the very beginning of the code. After watching the game load itself and jump to the first instruction, I now know what bits of code I need to extract from the disk image.

Once running, the game might load additional sections of data or code off the disk, but I can handle those at another time. Getting the 1st big chunk is a huge step.

Isolate the code from the disk image.

Now that I know how much code is loaded during the main boot load, I need to extract this data from the disk image file so I can convert it back to source code again.

With my notes in hand, I start up a hex editor with the capabilities of cutting/pasting and copying data to a new file. I think most hex editors nowadays have these features; I use my own homebrewed program.

I search the disk image file for the strings of data I have in my notes. Since the game loaded 8 sectors of data each time it read from the disk, and I know each sector is 512 bytes in length, I know I need to copy 4096 (or 1000h) bytes of data from the disk image and paste that data into a new file. If I'm lucky, I might even find the next 3 chunks of data all located contiguously in the disk image, so it's just a matter of copying 4 * 4096 bytes of data to a new file. If I'm not so lucky, I might have to search the disk image to locate all 4 data strings and copy the data and copy/paste them to 1 new file with all the data concatenated together.
It's often fairly obvious where the code/data starts and stops in a disk image file. Empty sectors are often filled with the "F6" character. We don't want any of that data, since that was put there by DOS format.

Now that the data has been extracted into a new file, we essentially have an executable binary image of the game.
It won't run directly the way it is, but it can be disassembled. It's time for sourcer to work its magic.

┌──────────────────────────────────────────────────────────────────────────────┐
│ ▀▀▀▀▀ ▀▀▀▀▀ ▀   ▀ ▀▀▀▀▀ ▀▀▀▀▀ ▀▀▀▀▀ ▀▀▀▀▀    αпΓπΣσ   V COMMUNICATIONS, INC. │
│ ▀▀▀▀▀ ▀   ▀ ▀   ▀ ▀     ▀     ▀▀▀▀  ▀         Б≥≤⌠    ЕτΦΘΩδ∞φ⌡ї≈А∙М√ⁿВ■ ЋЛ  │
│ ▀▀▀▀▀ ▀▀▀▀▀ ▀▀▀▀▀ ▀     ▀▀▀▀▀ ▀▀▀▀▀ ▀  ε∩                     V3.09          │
│                                                                              │
│  F1 - help    ┌ File format, list(.lst)/source(.asm)  Analysis options menu  │
│  ┌────────────┴───────────────────────────────────────────────────────────┐  │
│   ┌ Output filename   ┌ Header title            Xref (cross reference) OFF   │
│  ┌┴───────────┐┌──────┴──────────────────────────┐  Press return when done   │
│   bcw.asm       000000: 20 00 00 CF 20 00 CF F0 0                   Page 3   │
│                                xor   dl,dl           ; Zero register         │
│                                int   11h             ; Put equip bits in ax  │
│                                mov   cx,data_6       ; (6033:A1E4=3C1h)      │
│  └┬───┘              └┬─────────────────└┬─────────┘  └──┬─────────────────┘ │
│   └ Segment display   └ Word case style  └ Label type    └ Remarks - all     │
│         on/off                             (decimal)                         │
│  Target Assembler: MASM-5.1                                Press "Q" to Quit │
├───────────────────────────────┬──────────────┬───────────────────────────────┤
│  Input filename  bcw.bin      │ Label Counts │  Drive g: used for output     │
│  Beginning addr  76A7:0000    │  Data     0  │         > 100 Megs available  │
│  Ending address  76A7:1E00    │  Sub      0  │  217K bytes free (using EMS)  │
│  Math  off   uP  8086/8088    │  Loc      0  │  Code zero start    Passes 5  │
├───────────────────────────────┴──────────────┴───────────────────────────────┤
│  Select highlighted options desired, and enter "G" to Go begin processing.   │
└──────────────────────────────────────────────────────────────────────────────┘

I've not really found the optimal settings for sourcer yet. I do have it attempt to build a MASM compatible .asm file.

The output from sourcer is often a big mess. It probably won't compile without some major rework, and it certainly won't run correctly. It's time to start digging through the code and figure out what's going on.

This portion of the project is very much like figuring out a jigsaw puzzle. At first glance, it looks like an impossible task, but you'll find a small piece to work on and those small pieces help you figure out bigger sections, and slowly things fall into place.

I like to start with the smallest and most obvious subroutines first. If I find a subroutine that has INT 10h functions, it's going to be something involving the video; either a print routine or a set video mode. I rename sourcer's label of "sub_###" to something more human readable, such as "setVideoMode" I then change all references of "sub_###" to setVideoMode. Even if it turns out later that I named a subroutine incorrectly for what its real purpose is, it's easy to change again later on, as the code evolves into something more readable.

Similarly, any subroutine that appears to be accessing memory in the 0B800h segment is working with the screen. (at least in programs that use graphic video modes) Any subroutine that accesses port 61h is likely to be a speaker routine. Sourcer often gives hints in the comments as to what the code is likely to be doing. As the smaller routines get labelled correctly, it becomes easier to figure out what the bigger routines do, and eventually it'll all make sense. Or at least we hope.

The next step is to convert all the variables that sourcer has placed into the code with temporary variable names. When doing this, I like to keep a 2nd computer running with the actual game loaded under soft-ice. Until I figure out what a variable's true purpose is, I name them based on the memory location of that variable in the real game. This way if I see a variable named "v8022" in my code, I can always reference the physical address 8022 in the real game and see if my varaible has the same value in it. This also makes for quick cross references between the sourcer code and the original game code. Once the true purpose of the variable is found (if it's ever found) I'll rename it to a more human readable name, although I'll keep a note of the physical address in the comments in case I ever need to investigate its function in the real game.

Here's a code snippet example of the evolution of a routine in 3 steps.

Step 1: Original code.
Icky. I can't make heads or tails of this.

loc_336::
                mov     dx,0Dh
                call    sub_39
                mov     bl,byte ptr ds:[4FAh]
                xor     bh,bh                   ; Zero register
                add     bx,60h
                call    sub_65
                xor     dx,dx                   ; Zero register
                call    sub_39
                mov     bx,5Eh
                call    sub_65
                cmp     byte ptr ds:[4FBh],0FFh
                jne     loc_337                 ; Jump if not equal
                mov     bx,5Fh
                call    sub_65
loc_337::
                mov     dx,1700h
                call    sub_39
                mov     bx,6Ch
                cmp     byte ptr ds:data_9e,1
                je      loc_338                 ; Jump if equal
                inc     bx
loc_338::
                call    sub_65
                call    sub_10
loc_339::
                mov     ah,1
                int     16h                     ; Keyboard i/o  ah=function 01h
                                                ;  get status, if zf=0  al=char
                jnz     loc_340                 ; Jump if not zero
                call    sub_13
                jmp     short loc_339
loc_340::
                mov     ah,0
                int     16h                     ; Keyboard i/o  ah=function 00h
                                                ;  get keybd char in al, ah=scan
                cmp     al,0Dh
                je      loc_343                 ; Jump if equal
                cmp     al,3
                jne     loc_341                 ; Jump if not equal
                xor     byte ptr ds:data_9e,21h ; '!'

Start replacing the hard-coded memory references with variables named by their phyisical location in the original code. To do this, I cross reference the code snippet against the real code under soft-ice. Once I do a variable, I just search/replace the rest of the code, which makes this portion of the job a bit easier. This is probably the most frustrating and boring phase of the operation. Unfortunately, this is also the most bug-prone phase. It's easy to make mistakes, such as a typoed search/replace, which will introduce bugs and haunt the final game for many generations of future code.

Step 2: Variables fixed, some call routines figured out.

loc_336::
                mov     dx,0Dh
                call    locate
                mov     bl,byte ptr ds:[v4FA]
                xor     bh,bh                   ; Zero register
                add     bx,60h                  ; 96
                call    sub_49                  ;
                xor     dx,dx                   ; Zero register
                call    locate
                mov     bx,5Eh
                call    sub_49                  ;
                cmp     byte ptr ds:[v4FB],0FFh
                jne     loc_337                 ; Jump if not equal
                mov     bx,5Fh
                call    sub_49                  ;
loc_337::
                mov     dx,1700h
                call    locate
                mov     bx, 6Ch                 ;
                cmp     byte ptr ds:[v0c],1     ;data_9e,1
                je      loc_338                 ; Jump if equal
                inc     bx
loc_338::
                call    sub_49
                call    getakey
loc_339::
                mov     ah,1
                int     16h                     ; Keyboard i/o  ah=function 01h
                                                ;  get status, if zf=0  al=char
                jnz     loc_340                 ; Jump if not zero
                call    sub_13
                jmp     short loc_339
loc_340::
                mov     ah,0
                int     16h                     ; Keyboard i/o  ah=function 00h
                                                ;  get keybd char in al, ah=scan
                cmp     al, CR
                je      loc_343                 ; Jump if equal
                cmp     al,3
                jne     loc_341                 ; Jump if not equal
                xor     byte ptr ds:[v0c], 21h  ;data_9e,21h    ; '!'

After replacing the memory references with variables, some things are starting to shape up. I've figured out what one of the calls did (locate) since it was a tiny subroutine and its obvious use of INT 10 was a giveaway.

I can see there is some user input going on because of the INT 16. With the locate subroutine in place, it appears this code prints a lot of something to the screen at various positions, probably text.

Step 3: final code

loc_336::
                mov     dx,0Dh                  ; row, col
                call    locate
                mov     bl,byte ptr cs:[skillLevel] ;0-4
                xor     bh,bh                   ; Zero register
                add     bx,60h                  ; 96
                call    printMsg                ; input bx=message # to print
                xor     dx,dx                   ; Zero register
                call    locate
                mov     bx,5Eh
                call    printMsg                ; "skill level..."
 
                cmp     byte ptr cs:[resumeGame],0FFh
                jne     loc_337                 ; Jump if not equal
                mov     bx, 5Fh
                call    printMsg                ; "D) resume game"
loc_337::
                mov     dx,1700h
                call    locate
                mov     bx, 6Ch                 ; monitor type
                cmp     byte ptr cs:[monitorType],1     ;data_9e,1
                je      loc_338                 ; Jump if equal
                inc     bx
loc_338::
                call    printMsg
                call    getakey
loc_339::
                mov     ah,1
                int     16h                     ; Keyboard i/o  ah=function 01h
                                                ;  get status, if zf=0  al=char
                jnz     loc_340                 ; Jump if not zero
                call    getRand                 ; seed the rnd generator while we wait
                jmp     short loc_339
loc_340::
                mov     ah,0
                int     16h                     ; Keyboard i/o  ah=function 00h
                                                ;  get keybd char in al, ah=scan
                cmp     al, CR
                je      loc_343                 ; Jump if equal

                cmp     al, ESCAPE
                je      exitProgram

                cmp     al,3
                jne     loc_341                 ; Jump if not equal
                xor     byte ptr cs:[monitorType], 21h  ;data_9e,21h    ; '!'

Here, through a bit of guesswork and code deduction, the last of the variables have been renamed and the often-used "sub_49" turns out to simply be a print string function, used everywhere throughout the code.

This section of code prints the current skill level and monitor type to the screen during the main menu. It is also responsible for getting the user input. Pretty straightforward overall, except a neat surprise was that while the routine is waiting for user input, it is continually seeding the random number generator, giving a very random human input variable into the randomness of the game. Nice touch.

At this section in the code, I added a check for the Escape key to allow the user to exit back to DOS.

The code labels have not been changed from loc_xxx:: because the code is fairly readable at this state and retaining the original loc_xxx labels allows me to cross reference code against previous versions in case of bugs that may have crept in.

Now that the code is in the final stages of revision, it's time to make minor enhancements and additions to the game. I first like to attempt to get full backwards compatibility for the game; I want to make it run exactly the same on an IBM XT as it does on a pentium IV 2.5GHz machine, thus the only changes will be "behind the scenes" to retain the original look and feel. Essentially, I am attempting to reproduce the game the way it originally should have been released, had the original author been able to see 20 years or more into the future of computer technology and adjusted for it.

Speed fixing and throttling:
The biggest issue in playing old games on new computers is speed. Not only have CPU clock speeds increased, but today's CPUs run smarter and more optimally than their predecessors. If you took a pentium based CPU and throttled the clock speed back to the original IBM PC speed of 4.77MHz, the pentium would still run too fast. Caching and branch prediction tricks aside, pentium CPUs simply do things in less clock cycles than the early CPUs did 20 years ago.

The trick is then providing the illusion that the game is running at the same speed on any computer. Traditionaly, if you wanted to play an old game on a fast computer, you'd reach for a slowdown program which interrupts the computer at regular intervals (several times per second) and make it delay a tiny amount, thus simulating a slower computer. As speeds have increased, the amount of delay has to increase, which eventually leads to very jerky motion and unresponsive gameplay, since the computer is doing more waiting than game processing.

By restoring the source code to the game, strategicaly placed, real-time based delays can be added to restore a game to its original playability without sacrificing responsiveness or video quality. Using delays based in real time, rather than delay loops ensure that the game will run at the proper speed on any cpu, regardless of processor speed.

The most effective area to start throttling a game is the video. I'm currently using a method that ensures that sprite movements and screen refreshes are always done in a minimum of 1ms. I start a timer at the start of a video routine, then at the end of that routine, I make sure at least 1ms has passed before returning from the procedure.

Since video updates happen often in action games, a well placed delay can have an incredible impact on overall game speed without impacting other aspects of play. Using specific delays like this results in games visually operating at the same speed regardless of CPU speed. The only thing to be careful of is running the game on an original 4.77MHz machine. The added overhead of checking the time before and after a video update may very well take over 1ms in time, which can then result it a game running too slow! A reasonable fix for this is to detect the CPU type at startup, and if running on an 8088 or 8086 processor, simply skip all delays that were added to the original code. Those folks running on an overclocked 8088 or 10MHz 8086 can just use their "turbo" button. ;)

The next hurdle in converting this game was the voice samples played over the PC speaker. The timing on the playback of these samples is extremely precise, or at least as precise as one could be back in 1985. Machines running too fast played the samples too quick; just a fast blurp of sound. If I put too much delay in, the samples draw out like satan himself was speaking. Because of the precise timing requirements, it was difficult to find a solution that worked on all speeds of machines, and in fact I'm not sure if I've even found one yet. Windows complicated matters even further. With proper timings that worked in a DOS environment, the game's sounds were simply noise under windows 2000 or XP. I believe this is due to the overhead of the hardware abstraction layer. I'm not certain if an all encompassing solution can be made for this issue.

A side-solution and game enhancement was to use the soundblaster to play samples from the original game. It was an odd solution indeed; playing 1 bit samples from a PC speaker on a sound card capabable of CD quality playback! Anything in the name of video game conservation I guess. I suppose this is no different than playing a game based on IBM PC hardware on a new Xeon 2.5G.

Adding soundblaster support opens up many possibilities for enhancing the game play in future revisions. Better voice samples, sound effects, mood music, and much more can be added to bring gameplay to another level. Of course all of the enhanced features must always have the ability to be turned off so that the game remains true to its origins.