Charles MacDonald Member
Joined: 07 Dec 2005 Posts: 35
|
Posted: Wed Jun 21, 2006 9:42 pm Post subject: Compiled sprites |
|
|
After some discussion with FraGMarE about how to handle animated sprites efficiently, I came up with an adaptation of the compiled sprite technique for the PCE which involves a crude form of hardware-driven decompression.
Normally when writing VRAM data, the fastest way to send data is with TIA, which has a 6 cycle overhead per byte transferred. A compiled sprite is a sequence of ST1 and ST2 opcodes which encode the bitplane data as the operand, and these only use 4 cycles per byte. This typically means the source data is twice the size it would normally be, but it can be compressed to some degree:
Because the LSB of every word written to VRAM is latched, you get free decompression: for any sequential words in the data stream with the same LSB, you only need to output ST1 once and then multiple ST2 opcodes, saving 2 bytes for every word whose LSB is skipped. For a bunch of ripped Sonic sprites that had some empty space in each 48x48 animation frame, the sequential runs of 0 at the edges compressed to almost exactly 1/2 the size. Not bad!
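As a rough illustration of the encoding described above, here is a minimal C sketch of what a PC-side conversion tool might do. The function name `compile_sprite` and the buffer layout are hypothetical, not the actual utility; the opcode values ($13 for ST1, $23 for ST2, $60 for RTS) are the standard HuC6280 encodings, and the trailing RTS assumes the sprite is called as a subroutine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OP_ST1 0x13  /* ST1 #imm: write the low byte (latched by the VDC) */
#define OP_ST2 0x23  /* ST2 #imm: write the high byte, committing the word */
#define OP_RTS 0x60  /* return to the caller */

/* Hypothetical helper: turn `count` 16-bit VRAM words into a compiled
 * sprite. `out` must hold at least count * 4 + 1 bytes (worst case).
 * Returns the number of bytes written. */
size_t compile_sprite(const uint16_t *words, size_t count, uint8_t *out)
{
    size_t n = 0;
    int latched = -1;            /* low byte currently latched; -1 = unknown */

    assert(out != NULL);
    for (size_t i = 0; i < count; i++) {
        uint8_t lo = (uint8_t)(words[i] & 0xFF);
        uint8_t hi = (uint8_t)(words[i] >> 8);

        if (latched != lo) {     /* reload the latch only when it changes */
            out[n++] = OP_ST1;
            out[n++] = lo;
            latched = lo;
        }
        out[n++] = OP_ST2;       /* always emitted: this triggers the write */
        out[n++] = hi;
    }
    out[n++] = OP_RTS;
    return n;
}
```

For example, the four words $0000, $0000, $0000, $1234 compile to 13 bytes instead of the 17-byte worst case, because the second and third words reuse the latched low byte.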
The savings really add up: to transfer 1152 bytes takes 6912 cycles with TIA (counting transfers only) and at worst 4608 cycles with a compiled sprite, but probably less due to compression. That's in terms of CPU cycles; I'm not sure what delays the VDC might insert (I haven't been able to get any conclusive test results in that area).
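The arithmetic behind those figures can be written out as a small C sketch. The function names are made up for illustration, and the model only counts the per-byte costs quoted in this post (6 cycles per byte for TIA, 4 cycles per ST1/ST2 opcode); instruction setup, the JSR/RTS pair, and any VDC wait states are ignored.

```c
#include <assert.h>
#include <stdint.h>

enum { TIA_CYCLES_PER_BYTE = 6, ST_CYCLES_PER_OPCODE = 4 };

/* Per-byte cost of a TIA block transfer (fixed setup cost not counted). */
uint32_t tia_cycles(uint32_t bytes)
{
    return bytes * TIA_CYCLES_PER_BYTE;
}

/* Cost of a compiled sprite: one ST2 per word, plus one ST1 per word
 * unless its low byte was already latched (st1_skipped of them). */
uint32_t compiled_cycles(uint32_t words, uint32_t st1_skipped)
{
    assert(st1_skipped <= words);
    return (2 * words - st1_skipped) * ST_CYCLES_PER_OPCODE;
}
```

A 48x48 frame is 576 words (1152 bytes), so `tia_cycles(1152)` gives 6912 and `compiled_cycles(576, 0)` gives the 4608 worst case; every skipped ST1 takes another 4 cycles off.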
Because of the small 64K address space the HuC6280 can access, the applications of this technique are limited. However, I think it would be quite useful for Arcade Card programs where there is plenty of RAM and various ways to sequentially access memory through the data ports. A more complex analysis program could find identical runs of data in a source image and thread them together with some JSRs to reduce code size.
If anyone is interested, I could write a little demonstration program and include a utility to create compiled sprites too. The same idea also works with background tiles (or even name table data), but sprites seem to be the most appropriate. |
ccovell Regular
Joined: 19 Dec 2005 Posts: 100 Location: Japan
|
Posted: Thu Jun 22, 2006 3:43 am Post subject: |
|
|
That's a really smart technique! I don't really need compressed sprites for the game I'm making, but for action games with lots of sprites being shuffled around, it would be great!
On a related note, so many PCE games use some sort of compression for their tiles. Does anybody know the compression method? And, is the method the same or different for each game company? _________________ http://www.chrismcovell.com |
Tomaitheous Elder
Joined: 27 Sep 2005 Posts: 306 Location: Tucson
|
Posted: Thu Jun 22, 2006 5:55 am Post subject: |
|
|
Charles MacDonald wrote: | The savings really add up, to transfer 1152 bytes takes 6912 cycles with TIA (counting transfers only) and at worst 4608 cycles with a compiled sprite, but probably less due to compression. |
That's a pretty clever idea, setting up the sprite as a string of ST1/ST2 opcodes. Too bad there aren't equivalent opcodes for the VCE.
ccovell wrote: | On a related note, so many PCE games use some sort of compression for their tiles. Does anybody know the compression method? |
I was wondering this myself. Most CD games that I looked at didn't use compression for the tiles and sprites, but Gate of Thunder does and it decompresses the tiles and sprites on the fly.
If it helps, GOT stores the sprite/tile pixel data in a different (non-PCE) format before compressing it. I assume this is so the compression algorithm can be more efficient.
32 byte tile format:
A single byte holds two nibbles of linear plane pixel data. The first 4 bytes hold the first 8 pixels of each plane (one row), and the following bytes hold the additional rows of pixels. The plane order within a nibble is PL1, PL2, PL3, and PL4. The conversion routine reads the bits/pixels and shifts them into place, so the first nibble of the 4-byte block is pixel 1, counting from left to right.
byte1 byte2 byte3 byte4
12341234 12341234 12341234 12341234
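To make the layout concrete, here is a hedged C sketch that converts one tile from this packed format into the usual PCE planar tile (16 words: planes 1 and 2 in words 0-7, planes 3 and 4 in words 8-15). This is not the game's routine. It assumes the high nibble of each byte is the left pixel and that bit 3 of a nibble is PL1 and bit 0 is PL4, following the diagram; the name `packed_to_planar` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert a 32-byte packed tile (4 bytes per row, one nibble per pixel)
 * into 16 words of PCE planar tile data. */
void packed_to_planar(const uint8_t src[32], uint16_t dst[16])
{
    assert(src != NULL && dst != NULL);
    for (int row = 0; row < 8; row++) {
        uint8_t plane[4] = {0, 0, 0, 0};

        for (int x = 0; x < 8; x++) {
            uint8_t b = src[row * 4 + x / 2];
            /* assumed: high nibble is the left pixel of the pair */
            uint8_t nib = (x & 1) ? (uint8_t)(b & 0x0F) : (uint8_t)(b >> 4);

            for (int p = 0; p < 4; p++) {
                /* assumed: nibble bit 3 is PL1, bit 0 is PL4 */
                if (nib & (0x8 >> p))
                    plane[p] |= (uint8_t)(0x80 >> x);
            }
        }
        /* PCE tile layout: low byte = odd plane, high byte = even plane */
        dst[row]     = (uint16_t)(plane[0] | (plane[1] << 8));
        dst[row + 8] = (uint16_t)(plane[2] | (plane[3] << 8));
    }
}
```

If the real bit or nibble order turns out to be different, only the two lines marked "assumed" need to change.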
I haven't fully figured out the compression routine, but I'll try to explain what I know so far. The compression routine uses some sort of progressive pattern building. The beginning of the data is compressed the least, unless it contains a string of zeros. As the routine starts to build/decompress the data, it starts to use patterns from the uncompressed/build area more often. A single byte mask is loaded and shifted on each pass until the loop expires. A 0 bit means uncompressed data, and a 1 bit is used to append the start and end points of a string of bytes from the uncompressed/build area. If I recall correctly, the byte following the mask byte is used as part of assembling the start/end points of a pattern.
-Rich _________________ www.pcedev.net |
fragmare Member
Joined: 01 Feb 2003 Posts: 59
|
Posted: Thu Jun 22, 2006 3:22 pm Post subject: |
|
|
Anyone know if any of this sprite/tile compression and streaming data from HuCard > VRAM on the fly can be efficiently accomplished in HuC? _________________ Rec'ing Minds... |
Charles MacDonald Member
Joined: 07 Dec 2005 Posts: 35
|
Posted: Thu Jun 22, 2006 11:28 pm Post subject: |
|
|
fragmare wrote: | Anyone know if any of this sprite/tile compression and streaming data from HuCard > VRAM on the fly can be efficiently accomplished in HuC? |
It could be entirely done in HuC, providing it supports function pointers. I don't know how closely HuC adheres to the C standard.
I found a convenient way to load sprites was something like this in assembly:
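; call site
    lda #sprite_frame_num     ; index of the animation frame to load
    jsr load_sprite
; (rest of the calling code)
load_sprite:
    asl a                     ; each table entry is 2 bytes
    tax
    jmp [frame_table, x]      ; the compiled sprite's RTS returns to the caller
frame_table:
    .dw $abcd                 ; address of 1st compiled sprite
    .dw $abcd                 ; address of 2nd compiled sprite
    :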
.. and then each compiled sprite ended in an RTS, returning from the subroutine. In HuC the translation would be an array of function pointers that gets called with the desired animation frame as the index. The right banks have to be loaded up in memory beforehand.
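For the C side, a minimal sketch of that array-of-function-pointers idea could look like the following. It is plain standard C and assumes HuC accepts function pointers, which is not confirmed here. The names `sprite_loader`, `frame0`, `frame1` and `last_loaded` are placeholders; on hardware each table entry would point at a compiled sprite rather than at a stub.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*sprite_loader)(void);

/* Stand-in side effect so the sketch can be checked on a PC. */
int last_loaded = -1;

/* Placeholders: on hardware each of these would be a compiled sprite,
 * i.e. a stream of ST1/ST2 opcodes ending in RTS. */
static void frame0(void) { last_loaded = 0; }
static void frame1(void) { last_loaded = 1; }

/* Same role as frame_table in the assembly version. */
static const sprite_loader frame_table[] = { frame0, frame1 };
#define FRAME_COUNT (sizeof frame_table / sizeof frame_table[0])

/* Call the compiled sprite for the given animation frame. The banks
 * holding the sprite code must already be mapped in by the caller. */
void load_sprite(unsigned frame)
{
    assert(frame < FRAME_COUNT);
    frame_table[frame]();
}
```

Calling `load_sprite(1)` runs the second entry, mirroring `lda #1` / `jsr load_sprite` in the assembly version.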
The compiled sprite data could be output as a binary file to be incbin'd into an assembly program, or an array of bytes that is compiled/included into a HuC program. It should be possible to use it without using any (or minimal) assembly. |
Tomaitheous Elder
Joined: 27 Sep 2005 Posts: 306 Location: Tucson
|
Posted: Fri Jun 23, 2006 2:01 am Post subject: |
|
|
You could also have someone write you a custom routine in ASM.
HuC has a "#pragma fastcall" directive that allows you to write ASM routines with the same argument passing as the Magickit lib functions, i.e. ax, bx, cx, dx, di, si, reg a, reg x, reg y, bypassing the stack passing style of HuC in favor of speed. To the user, the function is called just like any other C function.
You can even use a '.n' suffix on the ASM label to handle different numbers of arguments, i.e.
#pragma fastcall decompress_tile( word acc )
#pragma fastcall decompress_tile( word acc , word ax)
#asm
decompress_tile: ; If no arguments are passed.
<code>
rts
decompress_tile.1:
<code>
rts
decompress_tile.2:
<code>
rts
#endasm
The only catch is the fastcall function code must be placed in the CONST bank or added to the HuC library.
-Rich _________________ www.pcedev.net |
fragmare Member
Joined: 01 Feb 2003 Posts: 59
|
Posted: Fri Jun 23, 2006 3:37 pm Post subject: |
|
|
Charles and Tomaitheous,
That's awesome news. Sounds like we can not only do this in HuC, but also make little "core" engines of ASM code embedded within the other HuC code for things that require more speed like collision, tile swapping, etc. I'll need to suggest this to Bt. _________________ Rec'ing Minds... |
Paranoia Dragon Elder
Joined: 09 Apr 2005 Posts: 366 Location: New Iacon
|
Posted: Sat Jun 24, 2006 6:07 am Post subject: |
|
|
Wow, this all goes over my head, but it sounds like something Nodveidt should know about for some future titles. Won't come in handy now exactly, though maybe for Neutopia 3, since that's an action game. But, ultimately I'm talking out of my arse _________________ Sentience is the right of all freedom beings? |
Keranu Elder
Joined: 05 Aug 2002 Posts: 669 Location: Neo Geo Land
|
Posted: Sun Jun 25, 2006 6:39 am Post subject: |
|
|
Ditto! _________________ LaZer Dorks |