For using untextured quads for far away objects, I won't need it to improve the framerate as I think I'm on the right track with the current technique (the per-quad collision might cause some issues thought if it's not fully optimized for early rejection).
For the clipping, my plan is still the per-octree node pvs, the main issue will be the compression used to store the data, but I'm not there yet as the new collision technique might take a while to get it right.
Sorry for not answer sooner... Absolutely you are in the right path. Your actual work and improvement in the engine talk about the quality work it for you.
My only intention is try to add something possible ideas to get new paths to improve the use of the machine.
In other hand your goals are impressive: Clipping with per-octree node pvs, the compression for store the data(in SDK SS can see the RLE compression are been implemented) and finally a new collision technique. All goal are important and impressive to build a strong engine for a "SS Sonic 3D" game
Still lot of work to be done on the per-quad collision, the view frustum culling and the LOD (or tessalation) texture generation, but here is, just to show my newer version of the engine, an old "map" revisited with the map converter (I didn't try to fix the geometry for the Saturn other than turning triangles to quads and subdividing everything once).
The intro level isn't the biggest level, but it shows that Mario 64 could have worked on the Saturn (with perhaps some compromises of course).
Video here : https://youtu.be/ScGVHMFB58M
Is a outstanding work! Really! You are use the two SH2?? Because is possible will go more far... and If you could use de DSP of SCU it be a crazy improvement!
In the video is possible see a peak of near 1000 quads in screen at 30FPS!!! is a madness!! XD Is possiible that you cap the limit FPS to 30, is possible that you rendering at more FPS, near 60FPS??
Thanks and encourage!!!! GO GO GO!
For RLE, you mean using SBL?
I thought of something else, but everything is on the table.
It might take a few months before I truly focus on pvs.
I am using both SH2, but not the SCU DSP. The problem with the SCU DSP is that it needs to be coded in assembly (I haven't learnt it yet) and there is some overhead since you need to DMA transfer data, start the DSP program, wait for it to complete the processing, stop the program and then fetch the data. Unless you are dealing with complex maths in parallel, it's not really worth it. Most people used it for lightning calculations, but SGL for some reasons didn't use it.
It also has really limited RAM and can't do divisions, so you can't do that much with it.
For 60 fps, it would work with around 450 quads on screen, no gouraud shading, low res textures and/or no overdraw, so it's not really a solution for now with complex levels.
But it works nice with Sonic X-Treme levels.