DSx86 - Blog

Jun 24th, 2012 - DSx86 bug fixing

The additional project I have been working on has still required some fixing and enhancing, so I have not been able to focus on DSx86-related work fully. However, I decided to get back to working on DSx86 by first porting the much-enhanced tester program I have coded for DS2x86 back to DSx86. This turned out to be a good idea, as I have found quite a few bugs in DSx86 code that I would not have found without the new enhanced tester program. Here is a list of the bugs I have found so far:

Looking at that list of issues, the first impression is that it is a wonder that DSx86 has worked at all! However, on closer inspection nearly all of those problems are things that are either very rare, or need some special situation to occur for them to cause any problems. Even so, I am currently fixing them, as they may cause problems in some games.

The overflow flag handling problem in the adc opcode is worth a special mention. The adc (add with carry) opcode works like the normal add opcode if the input carry flag is not set, but if the input carry is set, it will add one to the result. Since I am using the high 16 bits of the 32-bit ARM registers when emulating the x86 registers, I have not been able to use the ARM adc opcode directly. Instead, I first add one to one of the input values if carry is set, and then perform the addition. This works fine in all other cases except when the input value is 0xFFFF (which would become 0 after adding 1 to it). So, I used to have special code to detect this situation, like this (where the input operands are in the high 16 bits of registers r0 and r1, with the result put into the high 16 bits of r0, and the ARM flags set):

	addcss	r1, #0x00010000			@ If input Carry is set, the right operand = (register value + 1).
	bcs	adc_pass_carry_r0		@ If Carry is now set, it means the right operand was 0xFFFF and carry was set, so need special handling.
	adds	r0, r1				@ Perform the actual addition, setting the resulting flags.
If the input carry was set, and a carry is still set after adding 1 to the r1 value, it meant that the input register r1 value was 0xFFFF0000, and it is now 0. Adding zero to the r0 register does not change r0 value, so I had a special common handling for this situation:
adc_pass_carry_r0:
	ands	r0, r0				@ Set Sign and Zero flags, keep Carry set, Overflow flag is not changed
	mrs	r0,cpsr				@ Put the flags into r0
	bic	r0, #0x10000000			@ Clear the Overflow flag
	b	restore_flags_from_r0		@ Back to loop, setting the proper flags.
In other words, I always cleared the overflow flag, which however is not the correct behaviour.

After some thought I figured out a way to let the ARM processor calculate the proper flags for me, so that I don't need to attempt to calculate the overflow flag myself. The new code looks like this:

	addcs	r0, #0x00010000			@ If input Carry is set, adjust the input operand so that ...
	subcs	r0, #0x00000001			@ ... we can use the ARM ADC opcode for the actual operation.
	adcs	r0, r1				@ Perform the actual addition, setting the resulting flags.
It is much cleaner than the original code, as there is no need for special-case handling. The idea is that if the input carry is set, instead of adding 1 (shifted to the high 16 bits) to an operand, we set the low 16 bits of the input register to 0xFFFF. When the ARM adc then adds the input carry to the actual lowest bit, the low 16 bits roll over to zero and the carry ripples into bit 16 (the lowest bit of the x86 register value). This way all the resulting condition flags, and the resulting r0 register, will get correct values in one go.
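
The trick can be checked with a small model. This is only a Python sketch of the three-instruction sequence, not DSx86 code; the function names (arm_adcs, emulated_adc16, reference_adc16) are made up for this illustration, and the flag model assumes the usual ARM N, Z, C, V definitions for a 32-bit add with carry.

```python
MASK32 = 0xFFFFFFFF

def arm_adcs(a, b, carry_in):
    """Model of ARM 'adcs': 32-bit a + b + carry, returning (result, N, Z, C, V)."""
    total = a + b + carry_in
    result = total & MASK32
    n = result >> 31
    z = int(result == 0)
    c = int(total > MASK32)
    # Signed overflow: both operands have a sign that the result does not share.
    v = (((a ^ result) & (b ^ result)) >> 31) & 1
    return result, n, z, c, v

def emulated_adc16(x, y, carry_in):
    """16-bit x86 ADC with the operands held in the high halves of r0 and r1."""
    r0 = x << 16
    r1 = y << 16
    if carry_in:
        r0 = (r0 + 0x00010000) & MASK32   # addcs r0, #0x00010000
        r0 = (r0 - 0x00000001) & MASK32   # subcs r0, #0x00000001
    result, n, z, c, v = arm_adcs(r0, r1, carry_in)
    return result >> 16, n, z, c, v

def reference_adc16(x, y, carry_in):
    """Plain 16-bit add-with-carry, used only to check the model above."""
    total = x + y + carry_in
    result = total & 0xFFFF
    v = (((x ^ result) & (y ^ result)) >> 15) & 1
    return result, result >> 15, int(result == 0), int(total > 0xFFFF), v

# Compare both models on the values where sign changes and wrap-around happen.
EDGE_VALUES = (0x0000, 0x0001, 0x7FFF, 0x8000, 0xFFFE, 0xFFFF)
MISMATCHES = [
    (x, y, cf)
    for x in EDGE_VALUES for y in EDGE_VALUES for cf in (0, 1)
    if emulated_adc16(x, y, cf) != reference_adc16(x, y, cf)
]
```

In this model the add-then-subtract pair is just a way of adding 0xFFFF, presumably because that constant cannot be written as a single ARM immediate. The list of mismatches comes out empty for the edge values, including the 0xFFFF operand with carry set.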

The corresponding sbb handling is somewhat trickier. I could not figure out a way to use the ARM sbc opcode for that, so I decided to handle the subtraction using two code branches, one for a simple subtraction when the input carry is not set, and one for a situation where the input carry is set (where I then have to calculate the proper flags myself). It is a somewhat slower solution, but the end result has all the correct flags.
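
For reference, the flags that the carry-set branch has to end up with can be written down as a small model. This is a Python sketch of the x86 16-bit sbb semantics, not the actual ARM code in DSx86; sbb16 is a name invented here.

```python
def sbb16(x, y, carry_in):
    """x86 16-bit SBB: x - y - carry_in, returning (result, SF, ZF, CF, OF)."""
    diff = x - y - carry_in
    result = diff & 0xFFFF
    sf = result >> 15
    zf = int(result == 0)
    cf = int(diff < 0)      # on x86 the carry flag after a subtraction means "borrow"
    # Signed overflow: the operands differ in sign and the result's sign differs from x.
    of = (((x ^ y) & (x ^ result)) >> 15) & 1
    return result, sf, zf, cf, of
```

Note that the ARM carry flag after a subtraction means "no borrow", the opposite of the x86 convention, so the flag has to be flipped somewhere along the way in any case.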

I hope to be able to do all these fixes during the next week, so that I can then release a fixed DSx86 version on the first of July. This depends a bit on how much work I still have to do for the additional project, but hopefully not a lot any more.

Jun 17th, 2012 - Status update

Sorry for the lack of updates recently, but I have been busy with the additional project. However, I have just about finished my work on this extra project, and it is currently undergoing testing. So, if the customer does not find anything major wrong with it, I should be able to get back to working on DSx86 and DS2x86 in the very near future. I still might need to do some minor enhancements and cleanup work for this extra project, but I would think that by the time my summer vacation begins (in two weeks) I should be fully back to working on DSx86 and DS2x86.

The first step is to try to remember what I was working on when I had to abandon them for a while, but this is where this blog is very useful. It works as a reminder for me just as much as it gives you readers info about what I am doing. :-)

Apr 29th, 2012 - DSx86 and DS2x86 on a small hiatus

Last week was again a busy week at the office. In addition to that, I was given an additional project that I need to spend my free time on, so I have not been able to work on DSx86 or DS2x86 at all during the past week. It seems that I need to spend my free time mostly working on other things besides DSx86/DS2x86 also for the next few weeks at least. Sorry about that, but occasionally real life interferes with your hobbies. I do plan to continue working on DS2x86 Windows 3.11 support and other enhancements as soon as I have more time to spend on it. I have my summer vacation in July, and I am pretty confident that I will have these other tasks finished by then, so I should be able to concentrate on DS2x86 fully during my summer vacation. It is quite possible that I am able to continue long before that, but we shall see.

Thanks again for your continued interest in DSx86 and DS2x86, and sorry for not being able to enhance them for a few weeks. I hope they are currently at a sufficiently usable state, so that you can manage for a while without new updates.

Apr 22nd, 2012 - DSx86 version 0.42 released!

During the past week I have been busy with other things besides working on DSx86 or DS2x86. We have a major customer installation coming up at my work in the beginning of May, so I have been busy with work-related stuff. During this weekend, instead of continuing with the DS2x86 Windows 3.11 support (which seems to be a bit of a never-ending project), I decided to attempt to fix the issues with the experimental proportional text mode font system that were left in the 0.41 version of DSx86. It took me the whole Saturday to hunt down the bugs, so I did not have time to do much else to it. But, the Smooth text mode should now be somewhat more usable. I used the Moria game and Norton Sysinfo for testing the system, and at least those do not seem to have the problems that were present in the previous 0.41 version.

 

The next week looks to be as busy as the previous week, so do not expect much progress with DS2x86 during the next week either. I hope to get back to working on it when the work situation eases up. Thanks again for your continued interest in DSx86 and DS2x86!

Apr 15th, 2012 - DS2x86 progress

During the past week I have continued trying to get Windows 3.11 running in DS2x86. This work has progressed slowly but steadily. I keep running into new problems, all of which seem to take a day or two of debugging and studying before I understand the cause of the problem and can implement a fix for it.

The problem of dropping back to DOS that I had last Sunday was caused by my RETF opcode not adjusting the CS register properly when in VM mode. It was an easy fix. The next problem was also reasonably easy to fix: I needed to add support for INT 2F AX=1603 (MS Windows/386 - GET INSTANCE DATA) and INT 2F AX=1607 (MS Windows - DOSMGR VIRTUAL DEVICE API) software interrupts. I looked at DOSBox to see what it returns for those interrupts, and created similar handling in DS2x86.

The next problem was a bit more difficult. Windows 3.11 again dropped back to DOS, but this time with a message "Insufficient memory to run Windows". Windows 3.11 should run fine in 16MB of RAM, so I knew it was not a question of having too little RAM, but some problem with the memory managers or such. After some debugging I noticed that Windows calls the DOS interrupt INT 21 AH=52 (DOS 2+ internal - SYSVARS - GET LIST OF LISTS), and then performs all sorts of checks of the list pointers. In DS2x86 I had left many of the list pointers (like pointer to directory array, pointer to next device driver after NUL device, pointer to FCB file pointers, and the next pointer of the master file table) as 0xFFFF:0xFFFF, which means the list end. However, Windows assumed all of those to be valid pointers, and attempted to make some size checks and memory allocations using those pointer values, which then of course failed. I needed to create new fake data in the DOS memory segments for many new data structures and have those pointers point to these new structures before I was able to make Windows continue past the checks. I also added a proper CON device header, which had been missing until now. This will make the DOS device list look a bit more realistic, so it might be useful for other software besides Windows as well.
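
The difference between a pointer that is the end marker and a pointer to a (fake) structure can be sketched like this. It is a simplified Python illustration, not the DS2x86 implementation: the node layout (a far "next" pointer in the first four bytes), the node address 0080:0000 and all the names are assumptions made for the example.

```python
import struct

LIST_END = (0xFFFF, 0xFFFF)    # segment:offset value used as the end-of-list marker

def linear(seg, off):
    """Real-mode segment:offset to linear address."""
    return (seg << 4) + off

def read_far_ptr(mem, addr):
    """Far pointers are stored as offset first, then segment (little-endian)."""
    off, seg = struct.unpack_from("<HH", mem, addr)
    return seg, off

def write_far_ptr(mem, addr, seg, off):
    struct.pack_into("<HH", mem, addr, off, seg)

def walk(mem, head):
    """Follow the 'next' pointer kept in the first 4 bytes of each node."""
    nodes = []
    ptr = head
    while ptr != LIST_END:
        nodes.append(ptr)
        ptr = read_far_ptr(mem, linear(*ptr))
    return nodes

mem = bytearray(0x100000)                      # 1 MB of emulated real-mode memory
FAKE_NODE = (0x0080, 0x0000)                   # made-up location for one fake node
write_far_ptr(mem, linear(*FAKE_NODE), *LIST_END)

careful = walk(mem, FAKE_NODE)                 # stops when it sees the marker
naive_target = linear(*LIST_END)               # where a caller lands if it skips the check
```

A caller that checks for the marker sees a one-entry list. A caller that treats FFFF:FFFF as a real pointer ends up at linear address 0x10FFEF, past the end of the first megabyte, which is why pointing the list heads at real (if fake) structures is the safer choice.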

After fixing that problem, Windows next dropped back to DOS with a message "Unsupported expanded memory driver". This problem took several days to find and fix, as I was first looking for the problem in the code that gets executed after the previous fix I made. It took me a while to realize that Windows stores the EMM driver features almost as soon as it starts, but only checks these stored values much later in the bootup process. In the end it turned out that this problem was caused by my not supporting the DOS interrupt INT 21 AX=4402 (Memory Managers - GET EMM IMPORT STRUCTURE ADDRESS). I still don't know exactly what Windows checks from within this structure, I just used the same values as what DOSBox puts into that structure. DOSBox creates the structure into the VGA BIOS segment 0xC000, so I assume the contents do not need to actually reflect the current EMS manager state.

The current problem with Windows 3.11 is that it causes some unsupported I/O port accesses (it writes to VGA port 0x3CA, which is a read-only port). This port access does not happen in DOSBox, so somewhere before this happens the execution path has differed between DS2x86 and DOSBox. After a few of these port writes Windows 3.11 in DS2x86 shows a BSOD with a message: "Invalid VxD dynamic link call to device number 0001, service 2484. Your Windows configuration is invalid. Run the Windows Setup program again to correct this problem." I strongly suspect this is caused by the VGA driver, but I have not yet managed to find where the difference between DOSBox and DS2x86 execution begins. After I locate that, I can then check what VGA features my emulation is still missing.

So, all in all, slow progress, and I have no idea yet how many problems are still ahead before I get Windows 3.11 to actually start up in DS2x86. After that it might still take a while before any actual Windows programs will start running properly.

Apr 8th, 2012 - DS2x86 progress

Sorry, no new version released today. I have been working on Windows 3.11 support so extensively that I have not had time to do any other improvements. I did check Alien Legacy, and found out that it uses CON device (stdin) input when selecting the sound support (in the beginning of the game). It reads the user selection with a C-language equivalent of fscanf(stdin, "%d", &value). I do not have proper support for stdin buffered input via file handle yet in DS2x86 (nor in DSx86). I tried to quickly code a hack for that, but it did not help with the problem, so I moved back to working on Windows 3.11.

The Windows 3.11 work has progressed slowly, as I keep running into various difficult problems that take several days to solve. The first weird problem was that after going to VM86 mode and executing a real-mode interrupt call in VM mode, the code returned to inside a string in emulated BIOS area! That is, the IRET opcode that pops the return address from stack, popped a real-mode address F4E4:1637, which was inside the BIOS ROM area.

I began looking for a problem in my code, tracing backwards for code that sets those invalid values to variables later used as the return address. Eventually I found code that scans from the end of the BIOS area downwards until it finds a lower case 'c' letter, and then stores the address of this letter as the return address for all DOS interrupt calls! I didn't think this made any sense whatsoever, so I began debugging the Windows 3.11 behaviour in DOSBox. After a while I realized that it behaves exactly the same way also in DOSBox! It still took me a while to believe that this was indeed the correct behaviour. Windows 3.11 returns from VM mode back to proper protected mode using a General Protection Fault exception. This fault is generated by trying to execute a protected mode ARPL opcode in VM mode. This opcode has a hex value of 0x63, the same as the ASCII letter 'c'. So, Windows just saves some code space by executing the faulting opcode directly from BIOS. I added a GPF generation code to that opcode, and then finally managed to progress a bit further.
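
The scan itself is simple enough to sketch. The Python below is only an illustration of the idea, not code from Windows or DS2x86; the BIOS contents and the find_last helper are invented for the example.

```python
ARPL_OPCODE = 0x63            # the same byte value as ASCII 'c'
BIOS_SEGMENT = 0xF000

def find_last(mem, value):
    """Scan from the end of the area downwards; return the offset of the first hit."""
    for offset in range(len(mem) - 1, -1, -1):
        if mem[offset] == value:
            return offset
    return None

bios = bytearray(0x10000)                      # stand-in for the 64 KB BIOS area
bios[0xE000:0xE008] = b"(c) 1992"              # invented text; real BIOS contents differ
offset = find_last(bios, ARPL_OPCODE)
return_address = (BIOS_SEGMENT, offset)        # execution "returns" to this byte and faults
```

Any 0x63 byte will do, whether it is part of a copyright string or of real code, since all that matters is that executing it in VM mode raises the fault.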

The next problem was that I got a BSOD exception in DS2x86, and the exception address pointed to within my emulated file routines. After some thinking it occurred to me that with paging on, the page tables I use in my file routines might not have been properly set up. I haven't had to check this before adding VM mode, as the DOS file routines could only be used in real mode with paging off. Now Windows wants to call them with paging on in VM mode, so I needed to add a much more complex page table address calculation to every DOS routine that needs a memory address (which is pretty much all of them). This will cause a slight slowdown to every DOS operation, but since those are very slow compared to the actual CPU emulation in any case, this is probably pretty much unnoticeable. It is a major architectural change, though, so it is possible that some game will get broken because of this change.

After fixing the BSOD problem, I again felt I managed to get good progress done, as I ran into several opcodes that needed VM mode enhancements, like GPF generation for PUSHF and STI opcodes when the current I/O protection level did not allow the VM-mode program to access the CPU flags without the protected mode monitor intervening. This work progressed nicely, until Windows simply dropped back to DOS with a message "Video initialization failed".

This problem was again somewhat more difficult to track down. Windows 3.11 reads all the VGA register values into its memory during the logo screen display (which is in 640x480 16-color mode). It then goes to the EGA-compatible 640x350 16-color mode, and then to the normal text mode. What was weird was that after going to the text mode it checked that the register values it had saved when in 640x480 graphics mode were for a text mode! This again seemed to make no sense, so again I needed to debug the behaviour in DOSBox. In DOSBox the values saved to memory were indeed for text mode. I then added a watch for the memory address that seemed to change somewhere between storing and later reading the value. I noticed that the memory values changed when DOSBox was executing the INT 10 mode change routine (the C-language routine within the DOSBox sources, not in the emulated machine code)! This was pretty strange, but I finally found out that DOSBox detects if the INT 10 code is running in VM mode, and actually launches a separate emulation core (called IOFaultCore) for every port access (of which there are several dozen when changing the video mode)! This separate core then causes a GPF and calls the Windows 3.11 trap routine, which then checks that the VGA registers are getting changed and stores the new values into memory areas.

In DS2x86 and DSx86 I have not paid much attention to the VGA registers when changing the modes. My mode change C routine simply copied the new mode-specific register values to the memory addresses that the emulated I/O port routines used, in case a program wants to read the VGA ports after the mode change. I had used simple memcpy() calls, which obviously will not work at all for the Windows needs.

So, I spent a day changing the DS2x86 mode change routine and INT 10 emulation so that after calling the mode change C routine, I execute within the emulated BIOS area an x86 routine that outputs the proper mode-specific VGA CRTC and Graphics register values, so that Windows can trap each of these accesses and handle them as it wishes. I actually first thought that I could simply read the port value and then write it back, so that I don't need to actually access the VideoParameterTable, but that did not work. Windows also traps the VGA port read routines, and replaces the actually read value with the value from its memory table! I believe this is how it virtualizes the screen access so that it can prohibit a game from changing the graphics mode for real, but still make the game think that the graphics mode has been changed. When running such code within DS2x86, there are sort of two layers of emulation on top of each other!
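
Why a plain copy is invisible to Windows, and why reading a port and writing the value back does not help either, can be shown with a toy model. This Python sketch is not DS2x86 or Windows code: the VirtualVGA and Monitor classes are hypothetical stand-ins, only the CRTC index/data ports 0x3D4/0x3D5 are modelled, and the register values are placeholders rather than a real mode table.

```python
CRTC_INDEX, CRTC_DATA = 0x3D4, 0x3D5
CRTC_COUNT = 0x19                       # CRTC registers 0x00..0x18

class VirtualVGA:
    """The emulator's own idea of the VGA CRTC state."""
    def __init__(self):
        self.crtc = [0] * CRTC_COUNT
        self.index = 0

    def out(self, port, value):
        if port == CRTC_INDEX:
            self.index = value
        elif port == CRTC_DATA:
            self.crtc[self.index] = value

    def inp(self, port):
        if port == CRTC_INDEX:
            return self.index
        if port == CRTC_DATA:
            return self.crtc[self.index]
        return 0xFF

class Monitor:
    """Stand-in for a protected-mode monitor that traps the guest's port I/O."""
    def __init__(self, vga):
        self.vga = vga
        self.shadow = [0] * CRTC_COUNT  # what the monitor believes the guest has set
        self.index = 0

    def guest_out(self, port, value):
        if port == CRTC_INDEX:
            self.index = value
        elif port == CRTC_DATA:
            self.shadow[self.index] = value
        self.vga.out(port, value)       # a real monitor may choose not to forward this

    def guest_in(self, port):
        if port == CRTC_DATA:
            return self.shadow[self.index]   # the guest is shown the shadow copy
        return self.vga.inp(port)

def set_mode_by_copy(vga, values):
    vga.crtc[:len(values)] = values     # like memcpy(): no port access, nothing to trap

def set_mode_by_ports(monitor, values):
    for index, value in enumerate(values):
        monitor.guest_out(CRTC_INDEX, index)
        monitor.guest_out(CRTC_DATA, value)

MODE_VALUES = [0x10 + i for i in range(CRTC_COUNT)]   # placeholder register values
```

After set_mode_by_copy the monitor's shadow table is still all zeros, and a read through guest_in returns the shadow value rather than the copied one, so writing it back would only store the stale value again. Only set_mode_by_ports leaves the shadow table and the device state in agreement.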

After I finally got Windows to think that it is running on 100% VGA compatible display adapter, I had a simple "Unsupported INT 15 call!" problem (Windows tried to detect the BIOS mouse device type), but after implementing that, Windows startup simply drops back to DOS with no messages. This is again a bit more difficult to find and fix, and this is what I am currently doing. I still have various missing features in the VM mode support, which may be the cause, but I still need to spend some time debugging this problem and comparing the behaviour of DS2x86 with that of DOSBox.

Have a Happy Easter, and I hope you can wait for a while longer for the next version. It has been interesting learning about Windows internals while attempting to make it run in DS2x86, so I plan to continue with that for a while longer, at least.

Apr 1st, 2012 - DS2x86 progress

DS2x86 runs Windows 95!

Well, not really, but that was too good a title to pass by on April Fool's Day. :-) What I am actually working on, is support for Windows 3.11. The problem with it in the current DS2x86 was that it crashed immediately when switching to protected mode. This was caused by it storing the Global Descriptor Table into virtual memory before switching to protected mode. I did not have support for that in DS2x86. It is actually a bit of a chicken and egg situation, as virtual memory needs the GDT table to be correct for page fault handling. Windows 3.11 actually accesses virtual memory using the page tables from real mode before going to protected mode and activating paging! Adding support for such allowed it to continue to protected mode correctly.

After fixing that problem, I needed to improve pushad opcode behaviour, add better support for accessing CPU debug registers, add support for reading secondary DMA controller ports, etc. All of these were pretty easy and straightforward, so it was nice to see some real progress without having to hunt for bugs for a change. After those I did run into a bigger unsupported issue, namely Virtual 8086 mode. Windows 3.11 seems to use VM86 mode when calling some real-mode interrupt vectors from protected mode, and until now I have not had any support for VM86. Or to be more precise, I have checked that if the processor is supposed to be in VM86 mode, I simply drop into the debugger. However, to support Windows 3.11, I need to add proper support for VM86 mode.

So, for the last couple of days I have been working on adding the VM86 mode handling into DS2x86. Many opcodes need only minor changes, but the actual going into VM86 mode and back using the IRET opcode is a bit complex, so adding support for that will still take some time. I have also found some weird behaviour in Windows 3.11 on DS2x86, so there seems to be something else besides the VM86 mode that is also missing or broken. So, I don't expect Windows 3.11 working in the next version yet, but we shall see.

I have also checked Alien Legacy, which seems to loop displaying the sound selection menu. I originally thought the problem is in the key input routine, but it seems that the keyboard interrupt INT 16 is never even called by Alien Legacy in DS2x86, even though it is called in DOSBox, so this needs some further studying and debugging. I hope to get that at least working in the next version.

Thanks for the debug logs and error reports again, I plan to fix some issues in those as well for the next version.

Mar 25th, 2012 - DS2x86 version 0.36 released!

This version has mostly fixes for Borland RTM extender and Jazz Jackrabbit. The full list of changes looks like this:

During the last week I managed to fix the graphics problems in Jazz Jackrabbit. Those were caused by the game reading data from file directly to Mode-X graphics memory. So far I had only supported reading data directly to EGA memory (which is done by Heimdall, for example), but not to Mode-X graphics memory as I have not had a suitable program to test this with. Now both EGA and Mode-X direct reading is supported. I also finally implemented proper AdLib timer handling. That is used by many games to detect the presence of AdLib-compatible audio system, including SoundBlaster FM audio. Now the AdLib timers increment at the proper speed, so that the AdLib/SoundBlaster detection in various games should now be more reliable.

After I got Jazz Jackrabbit working, I also spent some time trying to track down the bug that causes various games to crash after a while, especially if SoundBlaster digital audio is in use. I did not manage to find the problem, though. I added various counters and tracing features, but the problem is that even when I let the games run for up to half an hour, they did not crash, and my counters showed that the SB IRQ emulation had been executed for hundreds of thousands of times. It is really difficult to find a problem that happens so rarely that executing the potential problem code works fine for thousands of times before failing.

I did, however, manage to find and fix a bug in the SoundBlaster digital audio buffering scheme that in certain situations could cause the whole ARM9 side to hang. I encountered this problem when testing Mortal Kombat after the AdLib detection change, so I used that game to track down the cause for the bug. I had earlier noticed that this hang also happens occasionally in Supaplex, but at that time I was not able to track the bug down. Now neither of those games should cause a hang any more.

Next I would like to study Windows 3.11, as it would be interesting to get it running. I have already logged it in DOSBox, so I just need to trace and compare that log with the behaviour of DS2x86, similar to what I did with Jazz Jackrabbit, to find out why it currently crashes. I suspect there are still various features in my protected mode and paging support that are missing, so it is not a big surprise that Windows 3.11 does not work quite yet.

Mar 18th, 2012 - DS2x86 progress

For the past week I have been working on getting Jazz Jackrabbit running. It uses Borland RTM DOS Extender instead of the more common dos4gw extender. It has been a while since I last worked on it, and it has never progressed very far, so I thought that it was about time that I really looked into supporting the Borland RTM extender properly. Thus, I began debugging the game. Here is a list of the changes and improvements I have so far needed to do for that game.

  1. Pretty soon after I began working on it, I noticed that it does not progress even as far as it did when I last worked on it a long time ago. It crashed with a "Page Fault in InitPage!" message when running a "repe stosw" opcode. The segment it tried to write to had a base address of 0xF0000004, which obviously was not within the 16MB of emulated memory. I spent some hours debugging it, comparing the behaviour to DOSBox, and finally I noticed that the game runs some memory allocation stuff twice in DS2x86, but only once in DOSBox. It seemed to enable various memory allocation handlers depending on the memory managers present on the system, and DS2x86 reported that both HIMEM.SYS and extended memory are available. I looked at DOSBox sources, and it reports that no extended memory is available if HIMEM.SYS features are enabled. I made DS2x86 also report that no extended memory is available, and then Jazz progressed up to the same crash location as it did a while ago. A bit silly that the game enables several memory managers at the same time, but of course on a real system only one method of handling extended memory is available at any one time.
  2. The next problem was that the DS segment got an invalid value 0x15B9 in a "mov ds,ax" opcode. The segment descriptor tables only had valid values in the range 0x0000..0x011F, so that value was far outside that range. This problem was a bit weird, as the code looked like this:
    00E7:48C2	mov	ax,15B9
    		inc	bp
    		push	bp
    		mov	bp,sp
    		push	ds
    00E7:48CA	mov	ds,ax
    
    That is, the invalid value was loaded into AX register as an immediate value from the code segment! I spent a couple of days trying to debug this, but could not find the cause. Finally I decided to run the game in DOSBox, writing to a log file every single opcode and the resulting register values, up to the point where the game crashes in DS2x86. This log file was around 30 megabytes in size and had 483957 rows (for 161319 opcodes). Of course I did not need to check the result of every single opcode in DS2x86 and compare it with the log, but even checking the situation only after various function calls meant quite a lot of searching.
     
    Pretty soon I discovered that the game allocates some DOS RAM using a "LAST FIT" memory allocation strategy, that is, it requests a block of memory from the end of the DOS 640KB memory area. DS2x86 has never supported this strategy, it always allocates the next sufficiently large free block. This has not caused problems before, but this was obviously a potential problem. I wanted to make sure that this was the cause, though, so I actually spent a couple of days comparing the behaviour of the game in DS2x86 to the DOSBox log file. Finally I found the problem. The game frees a block of memory that still contains a part of the RTM DOS extender data, then allocates the LAST FIT memory block, clears it, and then copies the already freed RTM DOS extender data to the extended memory. Since DS2x86 allocated the new block over the freed RTM DOS extender data, it was cleared when the game cleared the newly allocated memory block. This cleared block was then copied to the extended memory, and when it was later used to set up various segment selectors, the selectors got invalid values because the block did not contain the data it should have contained. I implemented the "LAST FIT" DOS memory allocation strategy into DS2x86, and this enabled Jazz Jackrabbit to progress further.
  3. The next problem was that the game crashed in a "les di,[bp+06]" opcode. This opcode loads both the ES segment selector and a DI register from memory. The selector was 0x01A7, which was in the correct range, but the corresponding descriptor had a base value of 0x545404AA! This was obviously again far outside of the supported 16MB emulated memory area. I also noticed that the descriptor did not have the "Present" bit on. This bit is used to cause a fault when a segment selector is loaded, if the segment is not present in memory (strictly speaking, the x86 raises a Segment Not Present exception here rather than a Page Fault). I checked in DOSBox, and indeed the game wants to cause this fault at that point. I did not support such faults in that opcode in DS2x86 yet, and adding the fault handling to that opcode again allowed the game to progress further.
  4. After fixing the previous problem, the game simply exited to DOS with a message "Loader error (0000): unrecognized error". This seemed to mean that all the work I had done so far was only for the Borland RTM loader, and that the actual game had not even started to run yet. I again ran the game in DOSBox and generated an even larger log. This time it was over 120 megabytes in size by the time I stopped it running. After some more searching in this file, I noticed that at some point it runs interrupt 0x2F with AX=0xFB42 and BX=0x1002. Looking at Ralf Brown's Interrupt List this interrupt was called "Borland RTM.EXE 1.0 - EXECUTE COMPILED PROGRAM", which looked like it might be the "loader" that the error message talks about. I tried to break the game execution in DS2x86 at this interrupt, but the game did not get that far before printing the error. The code in question looked like the following:
    02E6:0710	push	es
    		mov	ax,ds
    		mov	es,ax
    		mov	bx,0146
    		mov	dx,0154		Pointer to "C:\GAMES\JJRABBIT\rtm.000" string
    		mov	ax,4B00
    		int	21		DOS 2+ - EXEC - LOAD AND EXECUTE PROGRAM
    		jnc	0725
    		jmp	0858
    02E6:0725	mov	ax,4D00
    		int	21		DOS 2+ - GET RETURN CODE (ERRORLEVEL)
    		cmp	ax,0300		Is the return code == 0x0300 (terminate and stay resident, no errors)?
    		je	0734
    		call	0805
    		jmp	076B
    02E6:0734	pop	es
    		mov	ax,FB42
    		mov	bx,1002
    		mov	dx,[01E0]
    		int	2F		Borland RTM.EXE 1.0 - EXECUTE COMPILED PROGRAM
    
    I noticed that DS2x86 ran fine up to the 02E6:0725 address, but did not get to the 02E6:0734 address. So, it looked like the return code was not correct. And indeed, even though the "rtm.000" program stayed resident in memory, my INT 21 AX=4D00 handling never returned anything in the AX high byte! I fixed the INT 21 AX=4D00 handler to set the AX high byte properly, and then the game again progressed further in DS2x86.
  5. The next problem was similar to the "les di,[bp+06]" problem above, but this time the opcode was "mov es,[bp-02]". I added the Page Fault handling to this opcode as well, and then I got the actual game to start and print the welcome text strings!
  6. The problem was that after printing the welcome strings, nothing more happened. It looked like the whole MIPS side had hung, as I could not even drop into the debugger. So, I began hunting for the location where the hang occurred, by looking for returns from routines, and then tracing over the subroutines they called until I got to higher and higher level, and found the routine that made the system hang. After that I then burrowed deeper and deeper into the hanging routines, until I was at the lowest level, and found out that the system did hang immediately after the game gave a 0x1C (start 8bit auto-init DMA playing) command to the SoundBlaster DSP port. I checked the emulated SoundBlaster variables in DS2x86, and noticed that at that point (which actually begins the digitized audio playing) the length of the memory block to play was still zero! When using auto-init DMA, the length needs to be given using a separate DSP command 0x48, but the game never does that! Instead, it looks like there is a bug in the game code, as it has set the DMA block size to 0x7FF before starting the playing, but the DSP command 0x1C is given like this:
    0417:25DE	cmp	word [22ED],0200	
    		jbe	25F8			Jump if SoundBlaster DSP version <= 2.00
    		mov	al,1C
    		call	22F5			Send DSP command 0x1C (start 8-bit autoinit DMA)
    		mov	ax,[22F1]		ax = 0x0800
    		dec	ax			ax = 0x07FF
    		call	22F5			Send DSP command 0xFF (undocumented!)
    		mov	al,ah
    		call	22F5			Send DSP command 0x07 (undocumented!)
    0417:25F7	ret
    
    Looks like the game coders have mistakenly used the same system as when using the DSP command 0x14 for normal 8-bit DMA playing, which needs the length of the playing buffer as parameters, first the low byte and then the high byte. DOSBox simply ignores the zero length, same as those 0xFF and 0x07 DSP commands, but all of these caused problems in DS2x86. I hacked the DSP command 0x1C handling in DS2x86 so that if the SB transfer length is zero, the DMA transfer length is used instead. This seemed to help, as the game dropped into the debugger for those unsupported 0xFF and 0x07 SB DSP commands, but then started up properly, began to play music and then started up the demo game!
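
To make the DSP command mix-up described above a bit more concrete, here is a minimal Python sketch. It is not the actual DS2x86 code; the function name run_dsp_commands and the table of parameter counts are only illustrative assumptions, and the length values are kept as the raw bytes written to the port.

```python
# Parameter byte counts for the few DSP commands used in this sketch.
DSP_PARAM_COUNT = {
    0x14: 2,  # 8-bit single-cycle DMA output: length, low byte then high byte
    0x48: 2,  # set auto-init block size: length, low byte then high byte
    0x1C: 0,  # start 8-bit auto-init DMA output: takes no parameters
}

def run_dsp_commands(stream, dma_length):
    """Hypothetical helper: feed bytes written to the DSP port.

    Returns (play_length, unknown_commands). When 0x1C is given while the
    SB block length is still zero, the DMA length is used as a fallback.
    """
    sb_length = 0
    play_length = None
    unknown = []
    it = iter(stream)
    for cmd in it:
        if cmd not in DSP_PARAM_COUNT:
            unknown.append(cmd)          # e.g. the stray 0xFF and 0x07 bytes
            continue
        params = [next(it) for _ in range(DSP_PARAM_COUNT[cmd])]
        if cmd == 0x48:
            sb_length = params[0] | (params[1] << 8)
        elif cmd == 0x14:
            play_length = params[0] | (params[1] << 8)
        elif cmd == 0x1C:
            play_length = sb_length if sb_length else dma_length
    return play_length, unknown

# What the game sends, versus what it probably should have sent:
print(run_dsp_commands([0x1C, 0xFF, 0x07], dma_length=0x7FF))
print(run_dsp_commands([0x48, 0xFF, 0x07, 0x1C], dma_length=0))
```

In the first call the two length bytes are seen as separate unknown commands, which matches the debugger drops mentioned above.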

There are still some graphics problems in Jazz Jackrabbit, but it looks like it will be playable in the next DS2x86 version. These fixes will probably help with other games using Borland RTM DOS Extender as well, though I haven't tested other such games (I'm not even sure if I currently have any other game that uses it). I'll still need to silently ignore those invalid SB DSP commands, and look into the graphics problems. Also, it looks like the AdLib hardware detection fix I made for Warcraft 2 in the 0.35 version broke the detection method that Mortal Kombat uses! That was a bit annoying. It looks like I still need to work on that.

Thanks again for the debug logs and other information you have sent me! I have not yet had much time to look into those, as I have been focusing on Jazz Jackrabbit, but it looks like I should be able to look into some other problems during the next week as well.

Mar 11th, 2012 - DS2x86 version 0.35 released!

This version has the following fixes and improvements:

After I got the DSx86 version 0.41 released last weekend, I continued improving DS2x86. I had already managed to fix the keyboard emulation for LBA and the SB effects for Mortal Kombat, so I focused on the weird partial BSOD error message that happened in Warcraft 2 after moving a mouse. Before I could test that, I needed to use the setup program to disable audio, and while doing that I noticed that the setup program crashed due to an unsupported AAM opcode. This opcode behaves the same in real and protected mode, I had just forgotten to add it to the protected mode opcode pointer table, so that was an easy fix. However, finding the cause for the BSOD was considerably harder.

It took me three days to work on the BSOD problem. It seemed I got nowhere during the first two days, so on Wednesday I then decided to rewrite the whole BSOD system. Ever since I originally coded the BSOD system it has been the MIPS side that generates the full exception string and sends the whole screen back to the NDS side. Now I thought that perhaps with the new transfer system it would be better to transfer just the raw data from the MIPS side, and then build the string on the NDS side. After I did that, I got some proper-looking addresses on the BSOD screen. However, the exception cause was weird, it was 0x1F which is not mentioned in any MIPS reference material I have. Also, the crash address was in the middle of a page table clearing routine, which did not make much sense. If there was something wrong at that point, the code should have crashed on the first of the 16 similar opcodes, not on the 10th!

My exception handler walks the stack and looks for values that are in the range 0x80000000 .. 0x801DAFD0 (or wherever the first symbol from the library file is located). These numbers are between -2147483648 and -2145538096 in decimal notation, and such values are rarely used as parameters to functions, so this will usually give a proper list of function return addresses. The first eight of these values have been shown in the BSOD crash dump ever since I originally coded it, but this time the values it printed did not make much sense in relation to the crash location. Thus, the next step was that I added a full register dump into my BSOD crash printout, hoping that it would give some more info about what exactly caused the crash. This is what the new BSOD crash dump displayed:

Now I finally had enough information to help me find the crash cause. The registers could not possibly have such values at the crash location 0x801B5C14, so it seemed that the crash location was reported wrong. However, the stack and register values very clearly pointed to my MouseEvent() routine which was called from the MouseMoved() routine. Since the crash happened after moving a mouse, these seemed to be correct values. But the crash address still pointed to within the PageTable clearing routine... And at that point I finally figured out the cause of the crash! The MouseEvent() routine gets called inside an interrupt generated originally from the NDS side data transfer request. The crash location points inside the PageTable clearing routine, probably because that was what the code was doing when the interrupt occurred! And my MouseEvent() routine uses the PageTable and does not expect it to be clear!
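
The stack-walking heuristic mentioned earlier can be sketched roughly like this in Python. This is only an illustration, not the real handler: the function name find_return_addresses is made up, and the sketch assumes the stack is available as a block of little-endian 32-bit words.

```python
import struct

CODE_START = 0x80000000   # start of the code area (from the text)
CODE_END = 0x801DAFD0     # first library symbol (value quoted in the text)

def find_return_addresses(stack_bytes, limit=8):
    """Hypothetical helper: scan stack words, keep those that look like code addresses."""
    count = len(stack_bytes) // 4
    words = struct.unpack("<%dI" % count, stack_bytes[:count * 4])
    hits = [w for w in words if CODE_START <= w <= CODE_END]
    return hits[:limit]   # the crash dump shows the first eight

# Example: only two of these four words fall inside the code range.
stack = struct.pack("<4I", 0x00000010, 0x80001234, 0x801DAFD1, 0x801B5C14)
print([hex(a) for a in find_return_addresses(stack)])
```

Such a scan cannot tell a real return address from a stale value left on the stack, which is why the list is only "usually" correct.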

I changed the MouseEvent() routine not to use the PageTable (as it did not actually need to use it, it was just a convenient way to determine the location of some memory variables), and that got rid of the BSOD crash! Some time later I also found out why the original message was cut after 4 characters: I overwrote the string in the new transfer system. This also caused the exception cause to show 0x1F instead of the correct 0x03 "TLB miss on store" it should have displayed in the rewritten BSOD handling. This is also fixed now.

After that fix I then did some other minor fixes, for example I added handling for the 80x50 character text mode, which was used by the Little Big Adventure setup program. I also found that WarCraft 2 sometimes detected AdLib hardware and sometimes not, so I spent some time debugging that problem and noticed that my handling of the detection routine was very poor and unreliable. I improved that routine so that now both WarCraft 2 and Little Big Adventure seem to play both FM music and SoundBlaster digital effects fine. There is still a potential problem where some SB IRQs might get missed when paging is on, which may cause SB digital audio to stop working after a while. This needs some better IRQ handling in general, so fixing this is on hold until I can figure out some improved ideas for that.

Anyways, this version should now have some worthwhile improvements, let me know of the problems you encounter, and again crash logs and BSOD error info are very welcome! Thanks again for your interest in DS2x86 and DSx86!

Mar 4th, 2012 - DSx86 version 0.41 released!

DSx86 v0.41 release notes

It was in November last year when I last released a version of the original DSx86, so I wanted to spend some time bringing it closer to the current level of DS2x86. Here is a list of the fixes and improvements I had time to add into it, during the past week:

The new experimental proportional font system was originally implemented by "sverx" a few weeks ago, after we had a discussion about various possibilities to fit the whole 80x25 text mode area into the NDS 256x192 pixel screen. I asked him whether a proportional font would be possible, and he then spent some time experimenting. After looking at the results we agreed that the system might actually work. I have been busy with DS2x86 so I only had time to implement his algorithm into DSx86 this weekend.

The algorithm divides the horizontal 256 pixels (for 80 characters) into four blocks of 64 pixels (20 characters), and then draws characters from a 4x8 pixel font, using only 3 or even 2 pixels for narrow characters (like a space, 'i' and 'l'), and if that still results in more than 64 pixels, the algorithm forcibly strips some other characters down to 3 pixels as well. Thus, when you type on the command prompt also the previously typed characters may move, and the cursor position is only an approximation.
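
As a rough illustration of the idea (not the actual implementation by "sverx"), the width selection for one 20-character block could look something like the Python sketch below. The character sets NARROW_2 and NARROW_3, the left-to-right stripping order and the function names are all assumptions made just for this example.

```python
BLOCK_PIXELS = 64            # 256 pixels / 4 blocks
BLOCK_CHARS = 20             # 80 characters / 4 blocks
FULL_WIDTH = 4               # glyph width in the 4x8 font
NARROW_2 = set(" il")        # assumed: characters drawn 2 pixels wide
NARROW_3 = set("!.,:;'|1")   # assumed: characters drawn 3 pixels wide

def block_widths(chars):
    """Pick a pixel width for each character of one 20-character block."""
    widths = [2 if c in NARROW_2 else 3 if c in NARROW_3 else FULL_WIDTH
              for c in chars]
    excess = sum(widths) - BLOCK_PIXELS
    for i, w in enumerate(widths):
        if excess <= 0:
            break
        if w == FULL_WIDTH:      # forcibly strip a full-width glyph to 3 pixels
            widths[i] = 3
            excess -= 1
    return widths

def line_positions(line):
    """X pixel position of each of the 80 characters on one text row."""
    line = line.ljust(80)[:80]
    xs = []
    for b in range(4):
        x = b * BLOCK_PIXELS
        for w in block_widths(line[b * BLOCK_CHARS:(b + 1) * BLOCK_CHARS]):
            xs.append(x)
            x += w
    return xs

print(sum(block_widths("M" * 20)))   # a block of wide characters still fits
print(line_positions("dir")[:4])
```

Since every width depends on the whole block, typing one more character can change the positions of the characters before it, which is the moving effect described above.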

Here are some examples of the result. The SYSINFO screen looks a bit weird, as the algorithm has had to move the line drawing characters around in an attempt to make the text parts readable. However, even with that problem, the text is far clearer than what the "Scale" or "Jitter" modes will achieve.

 

I was also requested to look into Titus the Fox, so I implemented a few previously missing features to make it run. However, there are still problems with the palette, so the game does not look quite correct. It does look correct in DS2x86, as there I use separate EGA and VGA mode palettes. In DSx86 I have only a single palette which attempts to work in both modes, and due to the way that game changes the palette registers the result is not quite correct. I plan to look into this problem in the future versions, but it might be a bit tricky to fix without causing problems in other games.

DS2x86 and Little Big Adventure

After releasing the latest DS2x86 version, several games needing VESA SVGA screen mode have managed to at least start up. One of them is Little Big Adventure (LBA). However, there is a problem in the current DS2x86 that makes LBA hang whenever a key is pressed. I spent a little while looking into this problem, and found out that it is caused by DS2x86 not clearing the "Buffer Full" bit in the keyboard status register. The LBA keyboard reading is programmed so that it loops reading the keyboard data register until the "Buffer Full" bit is clear, and thus the game hangs after a key is pressed.
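
The hang can be illustrated with a small Python simulation. This is only a sketch of the behaviour described above; the KeyboardController class and the drain_keyboard function are made-up names, and the loop is capped so that the "hang" case returns instead of looping forever.

```python
BUFFER_FULL = 0x01   # bit 0 of the keyboard status register (port 0x64)

class KeyboardController:
    """Hypothetical model of the emulated keyboard ports."""
    def __init__(self, clear_on_read):
        self.clear_on_read = clear_on_read
        self.status = 0
        self.data = 0

    def press(self, scancode):
        self.data = scancode
        self.status |= BUFFER_FULL

    def read_status(self):        # like "in al,64"
        return self.status

    def read_data(self):          # like "in al,60"
        if self.clear_on_read:
            self.status &= ~BUFFER_FULL
        return self.data

def drain_keyboard(kbd, max_iterations=1000):
    """Loop like the game does; return the read count, or None if it would hang."""
    reads = 0
    while kbd.read_status() & BUFFER_FULL:   # like "test al,01"
        kbd.read_data()
        reads += 1
        if reads >= max_iterations:
            return None
    return reads

for fixed in (False, True):
    kbd = KeyboardController(clear_on_read=fixed)
    kbd.press(0x1C)
    print(fixed, drain_keyboard(kbd))
```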

There is a way to hack the game using the DS2x86 inbuilt debugger, so that the game will not hang. Here is the procedure for doing that. Note that the actual values may change a bit depending on the version of LBA you have, so I try to give instructions about how to find the correct address to hack.

  1. Start LBA on DS2x86, and let it run up to the initial "Little Big Adventure" splash screen.
  2. Press a key, so that LBA hangs.
  3. Click on the X button on the top right of the virtual touchpad keyboard, to drop into the debugger. You should get something like the following showing on the bottom screen:

    The important things to check here are that the second row shows CPU: PROT, USE32, CPL=0, Paging=0. Also, there should be a row stating DS=0160 ES=0160 SS=0088 CS=0158, and below the "NV UP ..." etc row the number should start with 0015 with 4 numbers or letters following.
  4. If the important values mentioned above are not similar to what you got, simply click on the G (for 'Go') character on the bottom row, and then on the rightmost v character (which is 'Enter') to continue the game, and then try dropping into the debugger again. Note that if you make a mistake and click on the wrong character, you can use the < symbol (for 'Backspace') to erase the last typed character.
  5. Next, we need to find the location in the code where the game tests for the "Buffer Full" bit. This is done by opcode in al,64 followed by opcode test al,01. In my version of LBA those two opcodes are at addresses 159D76 and 159D78, like this:

    To look for those opcodes, let's first check whether your version is the same as mine, so using the bottom row, give the following command: U159D60 and click on the rightmost v. If you can see the in al,64 and test al,01 opcodes in the result, check the address of the test al,01 opcode and add one to it. Note that the addresses are in hexadecimal notation, so the next number after 9 is A, not 10. In my case the address we are interested in is 159D79.
  6. If you cannot see those opcodes in your case, then you need to look for them, starting from wherever you dropped into the debugger. The keyboard reading code in my version of LBA is at addresses 159C81 .. 159D7A, with the location we want being very near the end of that area. So, depending on where you dropped into the debugger in your version of LBA, you should be able to locate those opcodes after a few tries.
  7. After we have found the opcode, we need to change it. That is done with the E command in the debugger. In my case the full command is E158:159D79 2, that is, we want to change the immediate value 01 in the test al,01 opcode to 02, so that the opcode will become test al,02. This test will always give a zero result, so the game will continue.
  8. After giving that command, you may use the U command again to check that the opcode did actually change.
  9. If the opcode was changed, give the G command (for 'Go'), and now the game should not hang after a key press any more!
Note that this only changes the instance that is running, so the next time you start LBA you need to do this again. However, the next version of DS2x86 has this problem fixed, so if you don't want to experiment or have trouble locating the correct opcode, you only need to wait for a week or so until I get the next DS2x86 version released.

DS2x86 progress

I have also done some minor improvements to DS2x86 during the past week. One was that LBA keyboard status bit fix, and I also worked on improving the quality of the scaled SVGA screen modes. I changed the SVGA graphics transfer code to transfer 512x240 pixels instead of the 320x240 pixels per frame. This dropped the maximum framerate down to 38 fps, but since the original code did not quite reach 60 fps either, this has no noticeable effect. However, moving more data allows me to have more pixels on the NDS side to scale, so the scaling quality will improve.

Lastly, I also debugged a SoundBlaster problem in Mortal Kombat. It did play SB sound effects, but they were just screeching instead of proper samples. I found out that the physical RAM address where my SB emulation tried to play the samples was at the beginning of the emulated RAM area, which meant that the program had given a zero address for the sample memory area! I debugged the game also on DOSBox, and there I found out that the game detects SoundBlaster by first attempting to use DMA channel 0, checking for a SB IRQ, and if it does not happen, moving on to DMA channel 1 etc. In DS2x86 I only emulate DMA channel 1, but I launched an SB IRQ regardless of the DMA channel in use. Thus the DMA 1 address was left at zero, but the game detected SB at DMA channel 0, and then used that channel within the game as well. I fixed that by skipping SB IRQ launching if the correct DMA channel is not running, and this fixed the SB problems in Mortal Kombat.
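
The detection problem can be shown with a tiny Python sketch. The function names and the exact list of probed channels are assumptions for illustration only; the real game code and the DS2x86 handler look nothing like this.

```python
EMULATED_DMA_CHANNEL = 1   # DS2x86 only emulates this SB DMA channel

def sb_irq_old(active_channel):
    """Old behaviour: launch the SB IRQ whatever DMA channel is running."""
    return True

def sb_irq_fixed(active_channel):
    """Fixed behaviour: skip the IRQ unless the emulated channel is running."""
    return active_channel == EMULATED_DMA_CHANNEL

def detect_dma_channel(irq_fires, channels=(0, 1, 3)):
    """Probe like the game does: the first channel that gives an IRQ wins.

    The channel list (0, 1, 3) is an assumption; the text only says
    "channel 0, then channel 1 etc".
    """
    for channel in channels:
        if irq_fires(channel):
            return channel
    return None

print(detect_dma_channel(sb_irq_old))     # game settles on the wrong channel
print(detect_dma_channel(sb_irq_fixed))   # game finds the emulated channel
```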

I am currently attempting to find the nasty bug that causes both Warcraft 2 and Command & Conquer to crash with a BSOD exception that only shows the beginning of the exception string. It looks like this is caused by something clearing much of the RAM area, so that also the error strings get cleared. I have not been able to find out what causes this, but I'll keep working on this during the next week.

Feb 26th, 2012 - DS2x86 version 0.34 released!

This version has the following changes and improvements:

In addition to the above fixes, I also tried to enhance the speed of the graphics transfer algorithms. I managed to make some of the routines faster, so that the transfer can now achieve 60 fps in all but the new SVGA and the weird Mode-X modes. I hope to be able to improve the SVGA mode handling (both the transfer speed and the quality of the scaled modes) in future versions. Below is the revised frame rate table for the various graphics modes.

Mode             Zoom       Scale      Notes
80x25 Text       >200 fps   >200 fps   (Varies by the number of changed characters)
320x200 CGA      229 fps    229 fps
640x200 CGA      222 fps    229 fps
320x200 EGA      128 fps    128 fps    (When logical screen width = 320 pixels)
320x200 EGA      119 fps    119 fps    (When logical screen width > 320 pixels)
640x200 EGA      178 fps    77 fps
640x350 EGA      178 fps    87 fps     (In Scale mode 320x175 bytes per frame, interlaced)
640x480 VGA      178 fps    64 fps     (In Scale mode 320x240 bytes per frame, interlaced)
320x200 MCGA     77 fps     77 fps
360x240 Mode-X   56 fps     56 fps     (Used in "Settlers")
320x480 Mode-X   64 fps     64 fps     (Used in "LineWars II", 320x240 bytes per frame, interlaced)
640x480 SVGA     100 fps    59 fps     (In Scale mode 320x240 bytes per frame, interlaced)

I have now added various enhancements to DS2x86 which would be useful also in the original DSx86, and it has been several months since I released the most recent version of it. I think I will next spend a few weeks enhancing the original DSx86, to bring it up to the current level of DS2x86, regarding bug fixes and graphics quality improvements. After that, I think I will work on the SoundBlaster emulation and general game compatibility problems in DS2x86. I am finally happy with the new transfer system in DS2x86, so I can now move on to actually enhancing the programs with new features! Please send me again the debug logs and test reports about this new version!

Feb 19th, 2012 - DS2x86 progress

SoundBlaster transfer system improvements

Okay, during the past week I finally managed to rewrite the SoundBlaster data transfer routines for the new DSTwo transfer system I am using. Now I am happy with the method, it is much cleaner than before, and it handles all the various types of playing (auto-init, 8-bit WAV, ADPCM) properly. It can also support Direct DAC output, but I think my actual SB emulation does not support that yet. There are also other problems in the actual emulation, as Warcraft digitized sounds are still not working, but I believe those problems are in the actual emulation running on the MIPS side, not in the transfer system.

Bug fixing and VESA SVGA support

After I got the SB audio working, I spent some time trying to get Command & Conquer running. I only got a black screen, with the HDD "led" blinking constantly. It took me quite a while to realize that the problem was not in the game looping, but instead I had broken the graphics blitting when paging is active in the 0.33 version! Sorry about that.. I broke it while moving the emulated VGA VRAM area into a physically different location on the DSTwo RAM (for the future SVGA support), and forgot that the paging routines still used the original memory address. Thus nothing was shown on the screen.

After fixing that, I began working on the VESA BIOS support. I again used my own old LineWars II game as a test bench, as it has the option to use VESA 640x480 mode with 256 colors. I did my first test after I had coded only a couple of the most essential VESA BIOS calls, enough to make LW2 think the VESA mode is available. The intro did run for a few seconds, and then the game crashed! That was rather unexpected, as I thought I had not changed anything that might have caused a crash.

So, I began debugging the reason for the weird crash. I found out that the game crashed because a jump table was overwritten with garbage. I have a memory watch feature in the DS2x86 inbuilt debugger, so I used that to find out the opcode that overwrites the jump table. The overwriting happened in a routine that draws the sun image to the offscreen buffer I use when blitting the 3D viewport to the actual SVGA VRAM. Having the source code of the misbehaving game was again very useful, as I could easily see what the misbehaving routine attempts to do and why it might fail. Since the offscreen buffer is 512x256 pixels (or 128KB) and LW2 is a real-mode game, it needs to access two 64KB segments while drawing the images to the screen buffer. I had used the shld r/m16, r16, imm8 opcode with the imm8 value of 21 to simultaneously copy and shift one register to another register, to help select the correct segment. Comparing my algorithm with the DOSBox algorithm made me realize that my version does not work properly when the shift count is more than 16, and thus the segment where the game drew the sun image was wrong and so it overwrote some code outside of the screen buffer. Seems like the DOSBox people had also had problems with that variation of the shld opcode, as the comment in the DOSBox source code says "let's hope bochs has it correct with the higher than 16 shifts" :-). Anyways, I copied the same algorithm (converted to ASM) into my shld implementation, and after that LineWars II stopped crashing.
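
For reference, the 16-bit shld behaviour with shift counts above 16 can be sketched in Python as below. This follows the DOSBox-style approach as far as I understand it, where the source bits are fed in a second time once the count exceeds 16; it is not the ASM code used in DS2x86, and the function name shld16 is just for this example. (The x86 documentation leaves the result undefined for counts larger than the operand size, so this models what existing emulators do.)

```python
def shld16(dst, src, count):
    """Sketch of shld r/m16, r16, imm8; returns the new 16-bit destination."""
    count &= 0x1F                       # the CPU masks the count to 5 bits
    if count == 0:
        return dst & 0xFFFF
    # Treat dst:src as one 32-bit value and shift it left, dropping overflow.
    temp = ((((dst & 0xFFFF) << 16) | (src & 0xFFFF)) << count) & 0xFFFFFFFF
    if count > 16:
        # All original dst bits are gone; src bits start to enter a second time.
        temp |= (src & 0xFFFF) << (count - 16)
    return (temp >> 16) & 0xFFFF

print(hex(shld16(0x1234, 0xABCD, 4)))    # ordinary case
print(hex(shld16(0x1234, 0xABCD, 21)))   # the count used by LineWars II
```

Another way to look at it is that the result is taken from the bit pattern dst:src:src shifted left by the count.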

The next problem was that the polygons that LineWars II draws to the viewport sometimes were not correct. Even though the ship was clearly in the middle of screen, some polygons were drawn all the way towards the left edge of the viewport. After some more debugging I found a bug in my idiv opcode handling. If the result was negative, the high 16 bits of the eax register got overwritten with 0xFFFF! LineWars II used the high 16 bits of eax as a temporary save for the polygon X coordinate, so occasionally the X coordinate became -1 and thus the polygon began from the left edge of the viewport. After I fixed that problem, LineWars II began to run fine in the SVGA 640x480 with 256 colors mode on DS2x86! (The screen copy below is from DOSBox, as I don't have screen copy feature in the current DS2x86.)
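
The idiv issue can be sketched in Python like this. It shows the intended behaviour of the 16-bit idiv (the quotient goes to ax and the remainder to dx, with the high halves of eax and edx left untouched); the function name idiv16 and the use of Python exceptions for the divide error are assumptions for this example only.

```python
def idiv16(eax, edx, divisor):
    """Sketch of idiv r/m16: signed dx:ax / divisor; returns new (eax, edx)."""
    dividend = ((edx & 0xFFFF) << 16) | (eax & 0xFFFF)
    if dividend & 0x80000000:
        dividend -= 1 << 32               # sign-extend the 32-bit dividend
    d = divisor & 0xFFFF
    if d & 0x8000:
        d -= 1 << 16                      # sign-extend the 16-bit divisor
    if d == 0:
        raise ZeroDivisionError("divide error")
    quotient = abs(dividend) // abs(d)    # x86 truncates toward zero
    if (dividend < 0) != (d < 0):
        quotient = -quotient
    remainder = dividend - quotient * d
    if not -0x8000 <= quotient <= 0x7FFF:
        raise OverflowError("divide error: quotient does not fit in ax")
    # Only the low 16 bits change; a negative quotient must not leak 0xFFFF
    # into the high half of eax.
    new_eax = (eax & 0xFFFF0000) | (quotient & 0xFFFF)
    new_edx = (edx & 0xFFFF0000) | (remainder & 0xFFFF)
    return new_eax, new_edx

# -7 / 2 with a saved coordinate 0x0123 in the high half of eax:
eax, edx = idiv16(0x0123FFF9, 0x0000FFFF, 2)
print(hex(eax), hex(edx))
```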

It's a bit unnerving having to fix such serious bugs in the core CPU emulation, it makes me wonder how serious problems there still are in the code! LineWars II only uses 386 features when using the VESA graphics mode, in the lesser graphics modes it runs using only 80186 opcodes. This is why I have not noticed these problems before, but I would have expected these to cause problems in other games as well. Perhaps games usually either use 80186 opcodes or fully 386 opcodes, and not a mixture of them as LW2 does. Both of these problems were caused by mixing 16-bit and 32-bit opcodes in the same routines. Anyways, the more bugs I find and fix, the fewer there are remaining!

Next I plan to continue improving the VESA graphics mode support (currently it only works in Zoom mode), and I think I will need to revisit the current scaling modes as well, they do not produce quite good enough quality in the higher resolution modes yet. Perhaps I need to reimplement the Smooth scaling mode for the higher resolution graphics modes after all.

Feb 12th, 2012 - DS2x86 version 0.33 released!

This version has the following changes and improvements:

I did not have enough time to work on the SoundBlaster emulation rewrite, so I plan to focus on that during the next two-week period. After that I have plans to at least experiment with adding some VESA SVGA graphics modes, mainly 640x400 and 640x480 with 256 colors. Many games seem to require the presence of a VESA BIOS, and if the game would otherwise work, there is no reason to not support those graphics modes. Playability might suffer due to the small physical screen resolution, but I hope the new Zoom/2 scaling mode helps with this potential problem.

I have also studied a few misbehaving games, but did not yet manage to find out why they don't work. I plan to continue debugging them also while rewriting the SB code, hoping to make a few more games playable in the next version. Please send me the debug logs again from this version, those will help me in finding out the problems in the code!

Feb 5th, 2012 - DS2x86 progress

Graphics transfer improvements

During the past week I have finally had some time to work on DS2x86, I have spent an hour or two almost every day on it. My goal was to improve the graphics transfer and blitting code so that also the high resolution modes reach at least 30fps screen refresh rate. I managed to reach speeds of over 60fps in Zoom mode in all the standard graphics modes, but in 640x480 scaled mode and in the weird Mode-X resolutions the blitting speed will sadly still not reach 60fps. But in any case all graphics transfer is now faster than what it was in the original DSTwo transfer system. Below is a table showing the current graphics refresh rates in the various resolutions, in both Zoom and Scaled modes.

Mode             Zoom       Scale      Notes
80x25 Text       >200 fps   >200 fps   Varies by the number of changed characters
320x200 CGA      229 fps    229 fps
640x200 CGA      222 fps    229 fps
320x200 EGA      128 fps    128 fps    When logical screen width = 320 pixels
320x200 EGA      119 fps    119 fps    When logical screen width > 320 pixels
640x200 EGA      156 fps    70 fps
640x350 EGA      156 fps    80 fps     In Scale mode 320x175 bytes transferred
640x480 VGA      156 fps    58 fps     In Scale mode 320x240 bytes transferred
320x200 MCGA     70 fps     70 fps
360x240 Mode-X   56 fps     56 fps     Used in "Settlers"
320x480 Mode-X   58 fps     58 fps     Used in "LineWars II", 320x240 bytes transferred

To reach these speeds in the >=350 scanline Scale modes, I had to combine two adjacent scanlines on the MIPS side before sending them as a single scanline. This will cause the graphics quality to suffer (since I don't want to go to 16-bit color mode where I could do some palette averaging, but which would again drop the refresh rate down to around 40 fps). In the Mode-X 320x400 and 320x480 modes this joining of two scanlines is done also in the Zoom mode, to improve the aspect ratio, so those modes will suffer the most. Luckily, those are rather uncommon screen modes, so not many games are affected. In the 640x??? Zoom modes I only transfer the 256x192 pixels of the visible screen area, so those will run much faster than the corresponding Scale modes.

Serious FS and GS segment register bug fixed

While testing various games with the new graphics transfer routines, I noticed a problem in Chaos Engine. It went into the game fine, but then immediately the screen got filled with seemingly random multicolour pixels. I tested the game with the previous 0.23 version, but the problem was present in there as well! So, it was not caused by the new transfer system, but something more serious. I tested also with DS2x86 version 0.22, in which the game worked fine, so the problem was obviously caused by the "major internal rewrite" that I did for version 0.23.

Next I tried to stop the game into the debugger immediately after the problem began to occur, and almost by chance I managed to get inside some graphics drawing routine, where I immediately noticed that the game uses the FS segment register. It is rather uncommon for real-mode games to use the FS and GS segment registers, and since in version 0.23 I had changed the FS and GS registers to be handled differently, this immediately made me suspect the new handling for those segment registers. The FS register had a value 0x03D2, and the graphics code tested whether a byte in that segment was 0xFF. When I looked at the address of that byte, I noticed that it actually seems to point to code and not data! This was very suspicious, so I checked what the FS register value was in DS2x86 version 0.22 when running that code, and there the FS register value was 0x3D2F!

So, now it was just a matter of determining which of my opcode handlers cause the FS segment register to shift 4 bit positions to the right. Pretty soon I found that the problem was in the software interrupt (INT opcode) handler. It was supposed to shift both FS and GS registers 4 bit positions right (to adjust from effective to actual segment register value when calling the interrupt), but due to a copy-paste bug I shifted FS register two times and did not shift the GS register at all! So, whenever a software interrupt was called, both FS and GS registers got invalid values!
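
The copy-paste bug is easy to show with a few lines of Python. This is only a sketch: the function names are made up, and it assumes the "effective" value is simply the segment shifted left by 4 bits, as the description above suggests.

```python
def to_actual_buggy(fs_effective, gs_effective):
    """The buggy version: FS gets shifted twice, GS not at all."""
    fs = fs_effective >> 4
    fs = fs >> 4                 # copy-paste bug: this line should handle GS
    gs = gs_effective
    return fs, gs

def to_actual_fixed(fs_effective, gs_effective):
    """The intended version: both registers shifted right by 4 bits."""
    return fs_effective >> 4, gs_effective >> 4

fs_eff = 0x3D2F << 4             # the FS value seen in Chaos Engine
gs_eff = 0x1234 << 4             # an arbitrary example value for GS
print([hex(v) for v in to_actual_buggy(fs_eff, gs_eff)])
print([hex(v) for v in to_actual_fixed(fs_eff, gs_eff)])
```

With the FS value 0x3D2F the buggy version produces 0x03D2, which is the value I saw in the debugger.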

Since those registers are rarely used in real mode, and the protected mode handling for those registers was not broken, this only caused problems in a few games, Chaos Engine being one of them. After I fixed that, I tested also Elder Scrolls: Arena, which had also started to misbehave during the same time, dropping back to DOS with a "Memory list blown" error message. This FS and GS segment register fix did seem to fix that game also, so in the next 0.33 version Arena should again be playable.

Future work

I haven't yet had time to recode the SoundBlaster support, so that is what I plan to work on next. I would like to be able to implement all the ADPCM digitized audio modes, and fix the timing and skipping problems in the current system. This will be rather a lot of work, so I am not sure if I have time to do all of that during the next week, but we shall see. I would also like to soon have time to work on the game compatibility again, but it looks like that will have to wait for a little while longer. In any case, thanks for your test reports for the previous 0.32 version, I will eventually get around to fixing the problems!

Jan 29th, 2012 - DS2x86 version 0.32 released!

This version only has a few minor improvements. I have only had a couple of days to work on DS2x86 during the previous two week period. As it will probably take me another two weeks to get the graphics blitting improved, I decided to release a new version today, even though it has not had much work done compared to the previous version. This is what is new:

The lack of progress is mostly due to an extended electrical blackout that occurred last Monday, and which (indirectly) caused two of my three computers to die. My laptop (which is almost 9 years old and still had the original battery) ran its battery empty during the blackout, and it looks like this finally caused the battery to fail completely. The machine does not run without a battery, so I had to order a new battery for it. Somewhat surprisingly some online stores still sell (and have in stock) batteries suitable for a laptop that old!

The bigger problem was that the motherboard in my HTPC also died. It was less than two years old, so it should have lasted longer, but of course it only had a one-year warranty. So, I had to spend all my free time last week with first familiarizing myself with the current status of suitable hardware for a HTPC machine, then online shopping for parts, modifying my silent cooling system to fit a new socket type, and then finally building and configuring the new machine. That left no time for me to work on DS2x86, as I only got the new machine up and running yesterday evening.

I hope to finally get back on track with working on DS2x86 during the next week. I first want to improve the screen blitting speed and quality in the higher-resolution modes, and then I really need to re-implement the SoundBlaster digital audio support. Sorry that it will take so long for me to get around to these, but sometimes unexpected complications arise.

Jan 22nd, 2012 - DS2x86 progress

The last week has again seen only some slow progress with DS2x86. There were a couple of snow storms, causing blackouts and the need to shovel snow again last week. That meant I only had two evenings where I was able to work on DS2x86 at all. I spent those fixing the mouse behaviour in the new transfer system, so that now both the D-Pad mouse and the Touchscreen mouse seem to work in DS2x86 just like they do in the original DSx86.

Currently I am attempting to improve the screen blitting in the high resolution modes (640x???). I will need to move some of the blitting code back to the MIPS side to be able to avoid sending unnecessary data via the slow card interface. This will cause a slight slowdown to the CPU emulation, but since most of the more CPU-hungry games will use the 256-color modes (where the 640x resolution is not available), this should not cause much of a problem.

There are still problems in the SoundBlaster emulation, but it looks like those will be very difficult to fix using my current transfer method. I think I will still need to rethink the SB digital audio transfer system one more time, to be able to handle all the different methods that the SB card can play digital audio. I have not yet figured out a system that would support all the needs, so I will probably leave this improvement for the later versions.

Thanks for your test reports about the problems in the previous version again! Sorry I probably won't get around to improving the specific problems in the various games before I get the new transfer system working better. This new transfer system was a major architectural change, so it will still take a few releases to get it working properly.

Jan 15th, 2012 - DS2x86 version 0.31 released!

Okay, this version has various fixes to bugs introduced in the previous version. Sadly I did not have time to fix all the bugs in the SoundBlaster handling, as it took me many days to find and fix a problem that made Supaplex lose digital sounds immediately when the game began. I managed to finally fix this problem, but it still loses the digital sounds occasionally for a little while. Anyways, here are the fixes in this version:

I will try to continue improving the SoundBlaster features and other still missing features during the upcoming weeks. My time to work on DS2x86 has been somewhat limited recently, as it is winter and we get quite a lot of snow, which also sometimes causes electrical black-outs. So, I need to spend time shoveling snow and just waiting for the electricity to return instead of working on DS2x86! Hopefully things improve in a couple of months as spring comes. :-)

Jan 8th, 2012 - DS2x86 progress

Since the release of 0.30 last week, I have been working on the remaining problems in the new transfer system. These are the problems that I have now managed to fix, and which will be included in the 0.31 version. The current plan is to release DS2x86 version 0.31 next Sunday. I hope to still add some additional fixes and improvements during the next week.

The new EGA mode 0x0D blitting code now has two working modes. If the logical screen layout is 320 pixels wide (so no horizontal scrolling or additional trickery is used), the blitting speed is 6.7 ms (149 fps), and if the logical screen width is more than 320, a separate transfer code is used on the MIPS side. This code only sends 8 extra pixels per screen row (to handle possible smooth pixel panning function), and thus the blitting speed drops only slightly, to 7.9 ms (126 fps). This change got rid of the screen tearing problem in Supaplex and Commander Keen 4 intro.

The EGA and VGA graphics cards have an option to jump back to the beginning of the graphics VRAM memory at a certain scanline (when the card is drawing the image on the monitor). This is activated by giving the EGA/VGA line compare register a scanline number that is less than the number of screen rows. There is also a bit in another register that tells the graphics card to reset the pixel panning to zero in this situation. The pixel panning register is used to shift the screen image 0..7 pixels left during the graphics VRAM scanning and drawing onto the monitor. Since the screen image start address needs to be at a byte boundary, and each byte in the 16-color modes contains 8 adjacent pixels, the pixel panning register is needed when smoothly panning the image horizontally by less than 8 pixels at a time.
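As a rough illustration of how a horizontal scroll position divides between the byte-aligned start address and the pixel panning register, here is a minimal Python sketch. It is not code from DS2x86; the function name `split_scroll` is made up for this example, and it assumes a 16-color mode where one byte address covers 8 adjacent pixels.

```python
PIXELS_PER_BYTE = 8  # one byte address covers 8 adjacent pixels in the 16-color modes

def split_scroll(x_pixels):
    """Split a horizontal scroll position (in pixels) into
    (start address offset in bytes, pixel panning value 0..7)."""
    byte_offset, panning = divmod(x_pixels, PIXELS_PER_BYTE)
    return byte_offset, panning

# Scrolling by 13 pixels means moving the start address by 1 byte
# and letting the pixel panning register shift the remaining 5 pixels.
for x in (0, 5, 8, 13):
    print(x, split_scroll(x))
```

Only the second value of each pair ever needs to go through the pixel panning register, which is why the register only has to cover the range 0..7.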

In DS2x86 version 0.30 I simplified the screen blitting code (compared to DSx86 and previous DS2x86 versions) so that I don't handle the pixel panning value in the code (by shifting the pixels before blitting them to the screen), but instead I use the Nintendo DS graphics background registers to emulate the pixel panning, much like the actual EGA/VGA card does it. However, it only occurred to me last weekend that I can also handle the line compare pixel panning reset using Nintendo DS hardware! Since the NDS graphics features include a VCount interrupt, I can use that to get an interrupt at the line compare scanline, and reset the NDS background register horizontal position to zero! The end result is exactly the same as the EGA/VGA card behaviour, with much of the functionality done by the NDS graphics hardware! This is a change I plan to port back to the original DSx86, as it will simplify the EGA blitting code there as well. This change made the Supaplex bottom score panel stay put while the upper area is panning.
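The idea of resetting the horizontal position at the line compare scanline can be simulated in a few lines of Python. This is only a sketch of the principle, not the actual interrupt handler: the `Background` class, its `hofs` field and `render_frame` are hypothetical names, the loop stands in for the hardware drawing scanlines, and the exact off-by-one behaviour of the real line compare register is ignored.

```python
class Background:
    """Stand-in for a background layer with a horizontal offset register."""
    def __init__(self):
        self.hofs = 0

def render_frame(bg, panning, line_compare, rows=192):
    """Return the horizontal offset that is in effect on each scanline."""
    used = []
    bg.hofs = panning              # set once per frame, before drawing starts
    for line in range(rows):
        if line == line_compare:   # this is where the VCount interrupt would fire
            bg.hofs = 0            # emulate the line compare pixel panning reset
        used.append(bg.hofs)
    return used

offsets = render_frame(Background(), panning=5, line_compare=160)
print(offsets[159], offsets[160], offsets[191])
```

Everything above the split keeps the panning value, while the area from the split downwards (the score panel in Supaplex) always uses offset zero.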

Supaplex also helped me in finding the problem in my AdLib emulation. The music seemed to skip a lot of notes during the beginning. I logged the AdLib notes in DOSBox to a file, and also wrote code to log the notes that the MIPS side sends to the ARM9 to a file on the SD card, and noticed that there were no differences. The exact same notes get sent with nearly identical timing from the MIPS to the ARM9 side. And since the ARM7 uses the same code as the original DSx86 (which works fine), it was easy to figure out that the problem must be in the new ARM9 code. And there the problem indeed was. I had a minor bug in the buffering scheme, where the last command in the buffer was never sent from ARM9 to ARM7 until the buffer got additional data from MIPS to ARM9. In Supaplex music there are places where only one instrument is playing, and in these places the game sends only three commands: Note Off, Note Frequency, and Note On. So, when the last new command never got sent, this in effect made the ARM7 see the music as a sequence of Note On, Note Off, Note Frequency commands, so there was no sound output.
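The effect of holding back the last command can be reproduced with a small Python simulation. This is a sketch under assumptions, not the real ARM9 buffering code: the two forwarding functions and the `pending` list are invented for the illustration, and each call stands for one batch of commands arriving from the MIPS side.

```python
NOTE = ["Note Off", "Note Frequency", "Note On"]  # the three commands sent per note

def forward_buggy(pending, incoming):
    """Buggy forwarding: the newest command stays in the buffer until more data arrives."""
    pending.extend(incoming)
    sent = pending[:-1]
    del pending[:-1]
    return sent

def forward_fixed(pending, incoming):
    """Fixed forwarding: everything in the buffer is sent immediately."""
    pending.extend(incoming)
    sent = list(pending)
    pending.clear()
    return sent

buggy, fixed = [], []
for _ in range(2):  # two consecutive notes of a single instrument
    print("buggy:", forward_buggy(buggy, NOTE), "| fixed:", forward_fixed(fixed, NOTE))
```

With the buggy version the second batch reaches the receiver as Note On, Note Off, Note Frequency, so every note is switched off right after it is switched on.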

After those fixes I then began debugging the Warcraft BSOD crash problem. It is somewhat weird, as the location of the crash (as reported by the BSOD texts) seems to jump all over the MIPS code area. What is even more strange, the location often seems to point to code that cannot crash, that is, code that has only some simple arithmetic operations or such. So, my first theory was that perhaps this is some interrupt routine re-entrancy problem in the new more accurate SoundBlaster IRQ emulation. I checked the Warcraft SB emulation code (which I reverse engineered some time ago, when an earlier DS2x86 version had problems with it), and noticed that it sets up an auto-init DMA audio transfer with a buffer length of 2 samples! That is, the SoundBlaster will send an IRQ after every 2 samples have been played! As the playing frequency was 22 kHz, this meant that my emulation code began getting over 11000 IRQs per second!
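The IRQ rate follows directly from the sample rate and the auto-init buffer length. A quick check in Python, assuming that "22 kHz" means the common 22050 Hz rate (the exact rate is not stated above) and using a hypothetical helper name:

```python
def irqs_per_second(sample_rate_hz, samples_per_irq):
    """One IRQ is raised each time the auto-init buffer has been played through."""
    return sample_rate_hz / samples_per_irq

SAMPLE_RATE_HZ = 22050  # assumption: "22 kHz" taken as 22050 Hz
print(irqs_per_second(SAMPLE_RATE_HZ, 2))
```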

I experimented by forcibly limiting the auto-init IRQ frequency, but rather annoyingly, even at an IRQ frequency of 366 Hz (24000000/65536) the BSOD problem remained. Only at a frequency of 183 Hz (24000000/(2*65536)) did I get rid of the BSOD problem. This made me realize that the IRQ speed itself cannot be the actual cause of the BSOD, as for example Windows sets the PC timer to run at 1000 Hz, which is also emulated similarly using a hardware IRQ at that speed, and it does work fine. Finally I realized that my buffer copying code inside the IRQ handler expects the pointers to be word-aligned, and with the transfer buffer length of only 2 samples, the pointer was actually only halfword-aligned! Since Warcraft only uses this buffer setup when testing for a SoundBlaster, it is not so important to play the correct samples, and thus I forcibly aligned the pointers to be word-aligned. This got rid of the BSOD, but the SB audio still does not work quite correctly in Warcraft. I'll continue working on this problem during the next week. There are still various other problems in the new transfer code as well, which I also hope to be able to fix and/or implement during the upcoming weeks. But, you can expect at least the above fixes to be included in the next version.
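The alignment problem and the forced fix can be illustrated with plain integer addresses in Python. This is a sketch only: the base address is hypothetical, and it assumes 8-bit samples, so that a 2-sample buffer moves the pointer forward by 2 bytes at a time.

```python
WORD_SIZE = 4  # bytes in a 32-bit word

def is_word_aligned(address):
    return address % WORD_SIZE == 0

def align_down_to_word(address):
    """Force an address down to the previous word boundary."""
    return address & ~(WORD_SIZE - 1)

BASE = 0x1000      # hypothetical word-aligned buffer start
BUFFER_BYTES = 2   # assumption: two 8-bit samples per auto-init block

for block in range(4):
    address = BASE + block * BUFFER_BYTES
    print(hex(address), is_word_aligned(address), hex(align_down_to_word(address)))

# The two limited IRQ frequencies mentioned above, rounded to whole Hz.
LIMIT_CLOCK_HZ = 24_000_000
print(round(LIMIT_CLOCK_HZ / 65536), round(LIMIT_CLOCK_HZ / (2 * 65536)))
```

Every second address in this example is only halfword-aligned. Aligning it down makes the copy safe, at the cost of copying slightly wrong samples, which does not matter much for a detection routine.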

Jan 1st, 2012 - DS2x86 version 0.30 released!

Happy New Year! It is again time to start a new blog page, as I like to have my blog pages not contain more than half a year's worth of blog posts. Makes it faster for you to read/download the latest entries as well.

DS2x86 v0.30 release notes

This is the first DS2x86 version to use my own completely rewritten transfer system between the MIPS and ARM processors for the DSTwo flash cart. Pretty much nothing in the new transfer code is copied directly from the SuperCard SDK sources. I have used the ideas from their sources, but the actual code is quite completely rewritten. The main differences are that now it is the ARM side that controls when and what data to transfer, the graphics are transferred in the native format of the emulated PC graphics memory and drawn on the ARM9 side, and the AdLib sound is fully generated on the ARM7 side.

I have not had time to implement all the features I had coded for the original transfer system, so not everything is fully working yet. Also, note that there are no compatibility improvements in this version, so if a game did not work in version 0.25, it most certainly will not work in this version either. Rather the opposite, if a game did work on 0.25, it might not work in this version! Here below is a list of features still missing from this version, so any game that needs one or more of these might not work properly:

However, there are obviously also some advantages in using the new transfer system. Here is a list of the current advantages, and as I get around to improving the still missing and buggy features, this list will hopefully get longer and the above list will get shorter:

DS2x86 v0.30 screen blitting speed

The original SDK transfer system always transferred the graphics data using 16-bit color video buffers, so transferring one screen frame meant sending 256x192x2 = 98304 bytes. Since the card interface runs at 4.2 MHz, transferring this much data took 23.4 milliseconds, and the frames could be transferred at a maximum speed of 42.7 fps. These numbers are the theoretical maximum; the real speed was somewhat lower because audio data and various commands also needed to be transferred.
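These figures can be reproduced with a short Python calculation. It assumes that the 4.2 MHz card interface moves one byte per clock cycle, which is the interpretation that matches the numbers above; the function name is made up for this example.

```python
def frame_transfer(width, height, bytes_per_pixel, bytes_per_second):
    """Return (frame size in bytes, transfer time in ms, maximum frames per second)."""
    size = width * height * bytes_per_pixel
    ms = size / bytes_per_second * 1000
    fps = 1000 / ms
    return size, ms, fps

CARD_BYTES_PER_SECOND = 4_200_000  # assumption: 4.2 MHz at one byte per cycle

size, ms, fps = frame_transfer(256, 192, 2, CARD_BYTES_PER_SECOND)
print(f"{size} bytes, {ms:.1f} ms, {fps:.1f} fps")
```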

In the new system I transfer only as much data as is needed, so the amount is usually much less than in the original system, except in the high-resolution modes like 640x480, where I might need to transfer more data per frame than in the original SDK. Also, since the data needs to be transferred in 1024-byte blocks without direct random access, I can not currently skip bytes that are outside of the visible area if the game has set up the graphics memory to have a logical screen width larger than the visible screen width. For example, Commander Keen 4 uses a logical screen width of 1260 pixels in the EGA 320x200 mode during the intro scroller, so that I need to transfer almost 1000 extra pixels for each screen row! This will drop the blitting framerate down to 24 fps, which will cause visible tearing of the screen image. Luckily the game itself uses the normal 320x200 mode.

Here below is a table showing the most common graphics modes, and the maximum (theoretical) frame rate in each of them using the new transfer method:

Mode             Original SDK   New method      Max FPS       Notes
80x25 Text       23.4 ms        0.8..4.5 ms     >200 fps      Depending on changed characters
320x200 CGA      23.4 ms        4.1 ms          244 fps
640x200 CGA      23.4 ms        4.2 ms          238 fps
320x200 EGA      23.4 ms        6.7..40.9 ms    24..149 fps   Depending on logical screen width
640x200 EGA      23.4 ms        13.1 ms         76 fps
640x350 EGA      23.4 ms        22.9 ms         43 fps
640x480 VGA      23.4 ms        31.2 ms         32 fps
320x200 MCGA     23.4 ms        13.1 ms         76 fps
360x240 Mode-X   23.4 ms        17.8 ms         56 fps        Used in "Settlers"
320x480 Mode-X   23.4 ms        35.0 ms         28 fps        Used in "LineWars II"
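
The frame rate column is simply the inverse of the blitting time. The Python sketch below recomputes it for the rows of the table that have a single time value; the names are hypothetical. Because the listed times are rounded to 0.1 ms, the recomputed values can differ from the table by a fraction of a frame.

```python
# (mode, blitting time in ms, max fps as listed in the table above)
TABLE = [
    ("320x200 CGA", 4.1, 244),
    ("640x200 CGA", 4.2, 238),
    ("640x200 EGA", 13.1, 76),
    ("640x350 EGA", 22.9, 43),
    ("640x480 VGA", 31.2, 32),
    ("320x200 MCGA", 13.1, 76),
    ("360x240 Mode-X", 17.8, 56),
    ("320x480 Mode-X", 35.0, 28),
]

def max_fps(blit_ms):
    """Theoretical maximum frame rate when one frame takes blit_ms to transfer."""
    return 1000.0 / blit_ms

for mode, ms, listed in TABLE:
    print(f"{mode:15s} {ms:5.1f} ms -> {max_fps(ms):6.1f} fps (table: {listed})")
```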

Future work

For the coming week or two, I plan to still work on the new transfer system, fine tuning it and adding the missing features. First, I hope to come up with a faster way to transfer the screen graphics in the high resolution modes, and in special EGA modes that use a very wide logical screen (like Commander Keen 4 or Supaplex). After that, I need to improve and fix the SoundBlaster audio handling, and then look into the hardware mouse cursor and other mouse features. Also, I can now add all the keyboard-related features (visible upper/lower case changes, key flash when clicked, etc) that are in the original DSx86 but have so far not been possible to implement into DS2x86. I don't plan to work on any game-specific compatibility improvements until I have improved the new transfer system still quite a bit further.

Please do report games that seriously misbehave using this new transfer system, or any core functionality that is missing and that I have not mentioned above, though! It helps me in improving the new system if I have a list of games to test, so I don't need to guess and hope that my change fixes something. Thanks again for your interest in DS2x86!
