Welcome!! new blood to rebirth this awesome machine!
I recently restarted working on Sonic Z-Treme, but I'm slowly transitioning to the newer version of the Z-Treme engine.
I pretty much need to discard almost everything I did with Sonic Z-Treme and restart, but it also forced me to make the engine more modular, which is good for future projects.
Right now I'm mainly working on implementing new audio functions to stop using PCM streams and properly use the Saturn's sound CPUs, but I haven't made much progress yet.
I probably won't be showing anything about Sonic Z-Treme until Sage 2018.
My big next things :
-Custom view frustum culling function
-PVS and/or portals!
-Improve the collision detection
-Add basic enemies/shooting collision
-Add (maybe) splitscreen - done
(Don't mind the Sonic face, it's just a placeholder)
You mean with the Jo Engine functions?
Well, I don't know if Johannes implemented a function to help debug that specific part of the rendering pipeline... With SGL/the SDK, well, it's possible... I remember some documentation on debugging, but I don't know if it's useful for that.
I think it's just how it reads one sector (2048 bytes) at a time and puts it in a buffer.
I just transfer everything right into memory without any intermediate buffer.
As you can see, the loading is super fast.
I could work hard to make it load 1 or 2 seconds faster, but it's not worth the extra work I think since it's super fast already.
For the mipmapping, I don't generate the texture now if the 2 quads are too different. It increases the geometry, but it looks much better.
I'm just using the GFS_Load function from SBL.
Here is the game running on stock hardware (30 fps) :
https://youtu.be/VsHibSGWKuw
So I used a basic SBL cd function to load the maps instead.
The result? The maps now load in...5 seconds! 1.4 MB in 5 seconds instead of 100-120 seconds.
I couldn't believe my eyes how fast it was.
The VDP1 limit is pretty well known, but there are ways to speed it up : use VDP2 CRAM, avoid overdraw, use lower quality textures, avoid some effects, do preclipping, etc.
For the difference, SGL is just poorly designed for memory consumption. 4 bytes per vertex coordinate (16.16 fixed) is fine I guess (vs 2 bytes, 8.8 fixed, for Slavedriver), but the quads are wasting way too much memory. One quad in SGL (using realtime light) takes 44 bytes, without counting the vertices. If you add an average of 2 vertices per quad at 12 bytes each (since some vertices are reused), that's 68 bytes.
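As a rough sketch of that arithmetic (the byte counts are the ones quoted in the post, not values read from the SGL headers, and the function names are made up for illustration):

```c
#include <assert.h>
#include <stddef.h>

/* Figures quoted in the post; assumptions, not taken from the SGL headers. */
enum {
    SGL_QUAD_BYTES_LIT = 44,  /* one quad with realtime light, vertices excluded */
    SGL_VERTEX_BYTES   = 12   /* three 16.16 fixed-point coordinates, 4 bytes each */
};

/* Estimated bytes per quad when each quad contributes `verts_per_quad`
   unique vertices on average (shared vertices are counted once). */
static size_t sgl_bytes_per_quad(size_t verts_per_quad)
{
    assert(verts_per_quad <= 4);  /* a quad has at most 4 corners */
    return SGL_QUAD_BYTES_LIT + verts_per_quad * SGL_VERTEX_BYTES;
}

/* Estimated bytes for a whole map of `quad_count` quads. */
static size_t sgl_map_bytes(size_t quad_count, size_t verts_per_quad)
{
    return quad_count * sgl_bytes_per_quad(verts_per_quad);
}
```

With 2 unique vertices per quad this gives 68 bytes per quad, so a map of 13 000 quads would come to roughly 884 000 bytes under these assumptions.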
There is nothing to be done, it's just how it is with SGL.
For lighting, again that's how Slavedriver works : they use 8 bytes per vertex, 2 for each axis (x, y, z) and 2 for the color.
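A minimal sketch of what such an 8-byte vertex could look like in C. The struct, its field names and the conversion helper are hypothetical; only the sizes (2 bytes per axis, 2 bytes of color) and the 8.8 fixed-point format come from the posts here, not from the Slavedriver source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: 2 bytes per axis plus 2 bytes of color. */
typedef struct {
    int16_t  x, y, z;  /* position, assumed 8.8 fixed point */
    uint16_t color;    /* baked (static) vertex light/color */
} packed_vertex_t;

_Static_assert(sizeof(packed_vertex_t) == 8, "vertex must pack into 8 bytes");

/* Convert a 16.16 fixed-point value to 8.8 by dropping 8 fractional bits
   (truncating toward zero). Only valid when the result fits in 16 bits,
   i.e. for values between -128.0 and just under 128.0. */
static int16_t fixed16_16_to_8_8(int32_t v)
{
    int32_t scaled = v / 256;
    assert(scaled >= INT16_MIN && scaled <= INT16_MAX);
    return (int16_t)scaled;
}
```

The trade-off is range and precision: an 8.8 coordinate only covers about ±128 units, so a layout like this would presumably need coordinates relative to some local origin.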
You can do the same thing (static light) with SGL and gouraud shading without major issues simply by re-using the same gouraud addresses. For dynamic light, you need another vector (12 bytes). Even if you have the normals, you still need that extra vector as Sega found out their method had issues, so they just added something else on top of everything else.
Duke Nukem 3D is using 16-color CLUT, not 16-bit RGB.
For mipmap, no, I can't use the VDP2 RAM. I really want to use the CRAM, but you only have a maximum of 2048 colors, and even if you use 16 colors per sprite, you still need to have these 16 colors in the same area. When you have 500 sprites, it's nearly impossible. Using 256 colors is possible of course, but then you double the VRAM consumption.
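To put rough numbers on that, here is a small sketch. The 2048-color limit is the figure from the post; the function names are made up, and it assumes each palette sits in one contiguous run of CRAM entries.

```c
#include <assert.h>

/* Figure quoted in the post: at most 2048 colors available in CRAM. */
enum { CRAM_COLORS = 2048 };

/* Number of distinct palettes that fit in CRAM when every palette
   occupies one contiguous run of `colors_per_palette` entries. */
static unsigned cram_palette_slots(unsigned colors_per_palette)
{
    assert(colors_per_palette > 0);
    return CRAM_COLORS / colors_per_palette;
}

/* Bytes of texture memory for a w x h sprite at `bpp` bits per pixel,
   rounded up to a whole byte. */
static unsigned long texture_bytes(unsigned w, unsigned h, unsigned bpp)
{
    return ((unsigned long)w * h * bpp + 7) / 8;
}
```

With 16-color palettes that is 128 slots, so 500 sprites would have to share palettes about four to a slot, and going from 4 bpp to 8 bpp doubles the bytes per texture.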
Gouraud shading can be used with VDP2 Color bank as proved by the "Chrome" demo, but Sega writes in its documentation that they can't guarantee good results.
Per texture light is great, but then again you need to store many more sprites, which isn't easy.
For the PVS, I haven't seen much code online, but unless someone else here codes something first, I'll have to do it myself.
I will try to find something in C, since that's the same language; as an example or a starting point it would be something.
For the cd functions, the problem is that duplicating work isn't efficient, we all have a job, girlfriends, and all, so we can't spend as much time as we'd like on these projects.
I'll take your word for it... I have a child XD But this is a fun and epic "mission"!
For audio RAM, for PCM, I never saw an example, but maybe some homebrew emulator does it, I really don't know.
I will try to find something on that as well.
For the SCU : like I mentioned before, SGL doesn't work with that. You just cannot use it for lights since it's done during the polygon processing and you can't intercept it unless you do some really complicated tricks, so forget it, it won't happen. Plus it wouldn't help with the framerate as the main framerate bottleneck is the VDP1 more than the CPUs. I might use the SCU for some more advanced physics down the line, but I'm not holding my breath.
For RAM, of course I know! Else how would I have been able to load my maps (13 000+ quads, 30 000+ vertices), fill the VDP1 RAM to the limit with my generated textures (500+ different sprites with 4 bpp CLUT)?
Plus you can't compare my engine with the Slavedriver engine : SGL takes MINIMUM 32 bytes per quad, EXCLUDING the vertices (12 bytes each). The Slavedriver engine takes 5 bytes per quad and 8 bytes per vertex (x, y, z and the vertex color data) plus some per-plane data.
You are comparing 2 very different things. SGL is super fast, but it's wasting a lot of memory and I'm stuck with that.
For realtime lighting : each realtime lighting polygon requires 12 bytes of extra vector data. At some 13 000 quads per map, we are talking about 156 KB. Yes the Saturn has 2 MB of Work-RAM, but you need to store the code, the tables, the entities (Sonic, enemies, etc.) and other stuff. My maps are totally filling the low-work RAM. I want to keep high-work RAM for the PVS and I need to find a way to store PCM in audio RAM (I'm not even sure if it's possible with the Sega audio driver as I never saw any examples).
"Realtime lighting polygon requires 12 bytes of extra vector data", plus the 32 bytes and 8 bytes mentioned before for SGL? So 52 bytes total for a distorted sprite?
Btw, I'm not using gouraud on everything : only on the LOD model to reduce both CPU and VDP1 usage. Since the LOD quads are large, it doesn't look fully smooth, but realtime lighting would look terrible because of that (like one large quad getting lit all at once). Palette swaps or per-texture lighting (like in Wipeout) might be a better choice for static light, but then the issue is VDP1 RAM. The depth gouraud is only taking 64 bytes, so it's really not much.
For the PVS or Portal, the problem is the implementation, it's not easy.
For the CD, obviously I know exactly what I'm loading as everything is tightly packed in memory, so I know I'm not loading more than 1.3 or 1.4 MB : I load the textures first to LWRAM and DMA them to the VDP1 RAM, then the actual map data, all of which I keep in the low-work RAM. The problem is that the current jo_fs_read_next_bytes function fetches 1 sector (2048 bytes) at a time, so it's not making good use of the SH1 512 KB buffer since it keeps stopping and transferring the data. It could fetch 10 sectors at a time without stopping. The async load function, as-is, can't be used for what I do, so I'll have to see what Johannes has in mind or continue writing my own CD function using SBL.
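The idea of fetching several sectors per request can be sketched like this. It is only an illustration that runs on a PC: `read_sectors_fn`, `load_in_chunks` and `mem_read_sectors` are hypothetical names, not SBL or Jo Engine functions, and the in-memory reader just stands in for a real CD read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 2048u

/* Hypothetical reader callback: reads `count` sectors starting at sector
   `first` into `dst` and returns how many sectors were actually read. */
typedef size_t (*read_sectors_fn)(void *ctx, uint32_t first, size_t count,
                                  uint8_t *dst);

/* Load `total_sectors` into `dst`, asking for up to `sectors_per_call`
   sectors per request. Returns the number of read calls issued, or 0 if
   a read came back short (or if there was nothing to load). */
static size_t load_in_chunks(read_sectors_fn read, void *ctx, uint8_t *dst,
                             size_t total_sectors, size_t sectors_per_call)
{
    size_t done = 0, calls = 0;
    assert(read != NULL && sectors_per_call > 0);
    while (done < total_sectors) {
        size_t want = total_sectors - done;
        if (want > sectors_per_call)
            want = sectors_per_call;
        if (read(ctx, (uint32_t)done, want, dst + done * SECTOR_SIZE) != want)
            return 0;
        done += want;
        calls++;
    }
    return calls;
}

/* Stand-in reader for testing on a PC: `ctx` points at a disc image held
   in memory, and sectors are simply copied out of it. */
static size_t mem_read_sectors(void *ctx, uint32_t first, size_t count,
                               uint8_t *dst)
{
    const uint8_t *image = (const uint8_t *)ctx;
    memcpy(dst, image + (size_t)first * SECTOR_SIZE, count * SECTOR_SIZE);
    return count;
}
```

Loading 25 sectors this way takes 25 requests at 1 sector per call but only 3 at 10 sectors per call; on real hardware the gain would depend on how much each stop-and-transfer cycle actually costs.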
So my available memory is : VDP2 RAM, audio RAM and SH-1 RAM. The rest is fully used or will be soon.
Probably not, but right now it wouldn't work well for sure, both on the CPU and the VDP1.
Anyway RAM is so tight that it wouldn't fit since I would need to keep space for the light vectors, so I will stick with static lights.
I don't plan to add dynamic lights.
On real hardware, the framerate goes between 20 and 30 FPS (or 60 fps sometimes if left uncapped). The video seen here is on emulator, which stays at 30 fps (or 60 if I leave the framerate uncapped).
There is a lot of overdraw (sprites written over other sprites), if I find a solution for that the framerate should be much better.
The CD loading functions are really terribly slow (2 minutes to load a 1.3 MB map!!!) and prone to failure, so I'm not showing real hardware footage until I can bring that loading time down to something better (15 seconds or so).
I tested the demo on real hardware : 20 to 60 fps, which is great considering it's not optimized.
Here is a video with 2 more Quake maps :
https://youtu.be/2B48PMqd1zg