Messages - Jorhlok

1
Just a quick update to mention that I got it to draw using a framebuffer and also got it to draw a distorted sprite with Gouraud shading, first with hardcoded values and then dynamically through uniforms. After testing the various batch methods, I'll come back with some data.

2
Just a quick update: I have a C++ program using SDL2, GLES2, and GLSL that draws a moving, colored, textured triangle, which covers all the basic shader elements I was using previously.

Here I wanted to scribble down the ideas I'd like to try so far. They could be done in GLES, GL, or Vulkan, but I don't think the API will make much of a difference for what I'm doing, except maybe the maximum batch size.
  • I already tried one quad at a time, with the quad defined in uniforms/global shader variables, back in MonoGame.
  • I've tried batches with quads encoded into a texture. I'd like to try it again to compare the speed of MonoGame's GL against using GL directly.
  • I had the idea to pass quad parameters in as vertex attributes. This would duplicate a lot of data. Maybe I'll try it?
  • It seems like uniforms can be arrays. I'd like to try a big uniform array as a quad batch.
  • Some other sort of buffer object?
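To put a rough number on the duplication worry in the vertex-attribute idea, here is a back-of-the-envelope sketch. It assumes a 32-float quad record (4 corner positions, 4 uv pairs, 4 RGBA colors) and 32-bit floats. The names `FLOATS_PER_QUAD` and `bytes_per_quad` are made up for illustration and do not come from any of the code discussed here.

```python
FLOAT_BYTES = 4  # assumption: 32-bit floats

# Assumed per-quad record: 4 corners * (x, y) + 4 corners * (u, v) + 4 corners * RGBA
FLOATS_PER_QUAD = 4 * 2 + 4 * 2 + 4 * 4  # 32 floats


def bytes_per_quad(copies):
    """Bytes uploaded for one quad when its full record is stored `copies` times."""
    return FLOATS_PER_QUAD * FLOAT_BYTES * copies


once = bytes_per_quad(1)       # one texture row or one uniform array entry
indexed = bytes_per_quad(4)    # record repeated on 4 indexed vertices
unindexed = bytes_per_quad(6)  # record repeated on 2 independent triangles
print(once, indexed, unindexed)
```

Under these assumptions the vertex-attribute route uploads four to six times the data of the texture or uniform-array routes. Whether that matters in practice is exactly what the comparison would have to show.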

The alternative to hacking a triangle rasterizer to display distorted sprites is a purpose-built rasterizer possibly built in OpenCL like this one: https://github.com/a2flo/oclraster
Non-triangle primitives (like quads) are specifically mentioned in the thesis (even if not implemented).
Besides allowing more appropriate primitives, processing could be organized per polygon rather than per vertex and per fragment, which, I've been discovering, is one of the most important differences between antiquated polygon graphics and modern 3D graphics.

3
Share your code / Re: Monogame/HLSL Effect/Shader for Saturn-like graphics.
« on: December 21, 2018, 05:48:43 pm »
Wanted to post that I got back into experimenting with this again. It started when I thought about experimenting with Vulkan, and I tried adding extra vertex attributes with MonoGame and HLSL, but I finally got fed up with wrangling some random HLSL version that was going through one or more translation layers anyway and decided to switch over to managing OpenGL myself.

Presently, I have a C++, SDL2, and GLES 2 project that will compile, run, and draw the same triangle from the same source code on three targets: Visual Studio on Windows (with ANGLE installed in my dev directories, but my code doesn't need to be aware of it), Linux with a simple makefile (I think Mesa 3D provides the GLES compatibility), and WebGL via Emscripten. I'm tentatively targeting OpenGL ES 2.0 because it'll run on damn near any modernish platform, at least through one translation layer or another. However, I also plan to test out different methods, including ones that are too advanced for GLES2, to see how they perform and to try to get more of the work (optionally) done on the GPU.

I also have a tentative name for the shader: Saturnalia, the ancient Roman celebration of the god Saturn held around the winter solstice, and one of the holidays those pesky Christians stole.

4
Project announcement / Re: Sonic Z-Treme
« on: December 21, 2018, 05:35:05 pm »
That reminds me, is the Z-treme source available anywhere? I was never able to find it or a consistent place to download the iso.

5
So, why doesn't the mesh effect create artifacts that the apple II and Atari 8 bit line used often for color (green and pink) in high-res mode?

Short answer: it did.

IIRC the original Apple II hardware is monochrome under the hood and made color on composite monitors with patterns of white and black pixels. The patterns run horizontally along a scanline.
I'm not as familiar with Atari computers, but the high-resolution modes of many computers, such as the 80-column mode on a Commodore 128, are hardly legible on a composite monitor or TV. Thankfully you could use them with monochrome monitors or the equivalent of S-Video monitors.

6
Is SCART basically RGB in another form factor?

I'm an American so I might have something wrong, but my understanding is that SCART is a multipurpose audiovisual cable/port specification. It carries analog video and audio and supports the equivalents of composite, S-Video, component, and RGB. Not every device that uses SCART supports every signal, and even if it does, you need a cable that has the pins for it. This is sort of like an old VGA cable in my drawer that only has the bare minimum of pins and wires, so things like the monitor sending its specs over one of the pins won't work with that cable.

7
You mentioned the FM Towns Marty 3D sphere, which gave me an idea, but then I looked it up on YouTube and it was nothing like my idea.

The idea is to use a normal VDP2 layer with small premade tiles, and have a CPU decide how to composite the tiles to recreate something that looks like the shadows needed.


The VDP1 draws arbitrary lines from the left edge of the quad to the right edge using a greedy algorithm, which causes some pixels to be overdrawn. This avoids PS1-like gaps but causes the moire pattern with transparency. The other thing that causes the moire pattern was demonstrated well in one of Jon Burton's Sonic R videos. So unless each line is perfectly vertical or horizontal, it'll overdraw, making transparency bollocks. This applies to skewed and rotated quads, not just perspective-corrected quads.
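The overdraw can be illustrated with a toy model. The sketch here is a deliberately simplified stand-in, not a description of the real VDP1: it draws one plain DDA line per step along the longer side edge and counts how many pixel writes land on a pixel that was already written. All function names are hypothetical.

```python
import math
from collections import Counter


def _round(v):
    # Round half up, so results do not depend on Python's banker's rounding.
    return int(math.floor(v + 0.5))


def line_pixels(p, q):
    """Pixels of a simple DDA line between integer points p and q."""
    steps = max(abs(q[0] - p[0]), abs(q[1] - p[1]))
    if steps == 0:
        return [p]
    return [(_round(p[0] + (q[0] - p[0]) * k / steps),
             _round(p[1] + (q[1] - p[1]) * k / steps))
            for k in range(steps + 1)]


def count_overdraw(a, b, c, d):
    """Corners: a top-left, b top-right, c bottom-right, d bottom-left.

    Draws lines from edge a-d to edge b-c and returns the number of
    pixel writes that hit an already written pixel.
    """
    n = max(abs(d[0] - a[0]), abs(d[1] - a[1]),
            abs(c[0] - b[0]), abs(c[1] - b[1]))
    hits = Counter()
    for i in range(n + 1):
        t = i / n if n else 0.0
        left = (_round(a[0] + (d[0] - a[0]) * t), _round(a[1] + (d[1] - a[1]) * t))
        right = (_round(b[0] + (c[0] - b[0]) * t), _round(b[1] + (c[1] - b[1]) * t))
        for px in line_pixels(left, right):
            hits[px] += 1
    return sum(count - 1 for count in hits.values())


print(count_overdraw((0, 0), (7, 0), (7, 7), (0, 7)))   # axis-aligned square
print(count_overdraw((0, 0), (10, 0), (10, 9), (0, 3)))  # distorted quad
```

In this model the axis-aligned square writes every pixel once, while the distorted quad has lines converging on its short left edge, so some pixels are written repeatedly. With half transparency, each repeated write would blend again, which is one way the uneven pattern could arise.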


One of the things you might have heard is that CRTs and composite video blur the dithering enough that just drawing a dithered black oval over the ground makes a convincing shadow. I didn't understand just how effective this was until I saw this composite capture of Z-treme on YouTube: https://youtu.be/x7cW9wgIW00?t=50
Anyways you can tell how, back in the day, a team on a deadline scraping for ounces of power on a system not quite adequate for 3D would just draw an oval and be done with it.

I think one of the major issues with performance is that we're using an incredibly old jack-of-all-trades graphics library instead of tightly focused custom routines. Unlike with a more straightforward graphics system, it's really hard to write efficient graphics routines for the Saturn, or even to wrap your head around it, especially the memory scheduling. If two chips try to access the same memory chip in the same cycle, one of them is halted until it gets access. This can throw away many cycles of work the Saturn would otherwise be able to accomplish.

Anyways we got some interesting ideas floating around.

8
Project announcement / Re: Sonic Z-Treme
« on: August 12, 2018, 11:42:42 pm »
Via the last post on SAGE from SonicRetro: http://sonicretro.org/2018/07/sage-2018-information-delay-booths-trailer-and-more/

"SAGE 2018 is DELAYED to August 25th through September 1st."
"We will now be accepting submissions to SAGE up until August 18th."

I imagine at least until the 18th XL2 will be polishing Z-treme as much as possible.

9
I was thinking of making a Python script to do it and I may make it soon. However, I think one or two of the edges of the triangle need to span an entire edge of the sprite, because that's how it'll be distorted when drawn as a quad.

Well, the Python script I was imagining would stretch either columns or rows of pixels, but maybe a full UV preprocessor could be made instead. I might make the simple version first and think on the latter later.
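For the simple version, a minimal sketch of the row-stretching idea might look like this. It is only a guess at what such a script could do: the image is assumed to be a list of rows, empty texels are marked with `None`, and sampling is nearest neighbour. `stretch_rows` is a hypothetical name.

```python
def stretch_rows(image, empty=None):
    """Stretch the filled span of every row to the full row width.

    Rows with no filled texels are copied unchanged. Empty texels that sit
    inside a filled span are kept as part of the span.
    """
    out = []
    for row in image:
        filled = [i for i, px in enumerate(row) if px != empty]
        if not filled:
            out.append(list(row))
            continue
        lo, hi = filled[0], filled[-1]
        span = hi - lo + 1
        width = len(row)
        # Nearest-neighbour resample of row[lo..hi] onto the full width.
        out.append([row[lo + (x * span) // width] for x in range(width)])
    return out


triangle = [
    ["a", None, None, None],
    ["b", "c", None, None],
    ["d", "e", "f", "g"],
]
for row in stretch_rows(triangle):
    print(row)
```

Here the bottom row already spans the whole sprite edge and is left as it is, while the shorter rows are widened to fill the rectangle. Stretching columns instead would be the same loop applied to the transposed image.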

10
More to the point, it's pre-processed; the image on the sphere is pretty much just a screenshot. You'll notice the heavy lag time before the ball comes into play after the game stops.

Oh yeah, the background wouldn't be in the VDP1 framebuffer so it would necessarily need to be touched up to include what the VDP2 would normally produce.

11
Another example is Rayman at those end level sequences, where it takes the screen and, (maybe) using software, wraps the screen on a sphere, rolls it up like a sheet of paper, or does water ripple distortion, while maintaining a constant framerate of 60 fps! https://youtu.be/G4SyK0fSwKM?t=11m43s. Now that is impressive. I think the lag is because of the efficiency of the code in Tempest 2000.

I watched that and was thinking about it. I think a way to do that is to copy chunks of the framebuffer into sprites in VRAM and then, using those sprites, draw a sphere with the regular VDP1 functionality. This works because the image on the sphere doesn't change.

12
From that video Jon says he planned to have an environment mapped metal sonic but scrapped it and only used the software renderer for the trophies.

From what I know of the VDP1, it draws in strips of pixels from the left side of the quad to the right. This way it reads from the sprite linearly. So I think it would take all the pixels from start to end in a row and stretch them from the left side of the quad to the right, instead of leaving blank pixels on either side. Having not programmed the Saturn yet, I can't say for certain, though.

13
Sorry Jorhlok. I totally missed your great contribution!

It is a great project! Have you thought about the possibility of adding your real preview of the VDP1 graphics to @XL2's work on exporting 3D geometry from Blender 3D to the SS?

That would make it possible to create 3D graphics in a tool like Blender and actually see how the graphics will look, and later to export the data to a format optimized for the machine. It is one of the tasks that remained pending in the past. There were no plugins, or at least none that I have seen, for 3DS Max or other tools for SS graphics. For the PSX, however, there were many. This made it easier to take advantage of both the artist's and the machine's capabilities.

What do you think?

I hadn't heard of exporter plugins for the PSX. I'll have to check that out and see what's feasible.

I decided to try storing the quad data in a texture that the triangles would then reference in the shader. I'm using a Vector4 format texture so I can store 4 arbitrary floats per texel, including values outside of 0-1. Texture width for this proof of concept is 8 and texture height is the buffer size. The inverse bilinear algorithm doesn't use z values so I left them out. So the first two texels in a row store the quad's coordinates, the next two store the uv coordinates, and the last 4 texels store the gouraud colors for each corner. I'm sending triangles in with position and uv values. Every vertex of a triangle should carry the same uv value so that it reaches the pixel shader unchanged. There, the V coordinate points to the row in the quad buffer. U doesn't do anything yet but will carry options such as sprite vs quad, screendoors, etc.

Here's what it looks like. That strip along the left is the first 240 quads in the last drawn buffer. Values outside of 0-1 are clamped so most of the first two texels just appear as white.
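The row layout described above can be sketched on the CPU side roughly like this. It is an illustration only. Two details are assumptions: two corners are packed per texel in corner order, and the V coordinate samples the centre of the row. `pack_quad_row` and `row_coordinate` are hypothetical names.

```python
def pack_quad_row(corners, uvs, colors):
    """Return one texture row: 8 texels of 4 floats each.

    corners: four (x, y) pairs   -> texels 0-1 (assumed two corners per texel)
    uvs:     four (u, v) pairs   -> texels 2-3
    colors:  four (r, g, b, a)   -> texels 4-7, one corner per texel
    """
    if not (len(corners) == len(uvs) == len(colors) == 4):
        raise ValueError("a quad needs exactly four corners, uvs and colors")

    def pairs_to_texels(pairs):
        flat = [float(v) for pair in pairs for v in pair]  # 8 floats
        return [tuple(flat[0:4]), tuple(flat[4:8])]

    row = pairs_to_texels(corners) + pairs_to_texels(uvs)
    row += [tuple(float(ch) for ch in color) for color in colors]
    return row


def row_coordinate(quad_index, buffer_height):
    """V coordinate for a quad's row (assumes sampling at the texel centre)."""
    return (quad_index + 0.5) / buffer_height


row = pack_quad_row(
    corners=[(10, 20), (110, 25), (100, 90), (5, 80)],
    uvs=[(0, 0), (1, 0), (1, 1), (0, 1)],
    colors=[(1, 0, 0, 1), (0, 1, 0, 1), (0, 0, 1, 1), (1, 1, 1, 1)],
)
print(len(row), row[0], row_coordinate(0, 4))
```

Position values like 110 are well outside 0-1, which fits the note that most of the first two texels show up as white when the buffer is displayed directly.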


Although the method is a bit unorthodox I'm getting good results compared to last time.

591 kqps depth 8000 buf 1M quad
489 kqps depth 2048 buf 1M quad
288 kqps no depth 8000 buf 1M quad

        /* depth 8000 buf 1M quad
         + 1688.0965
         + 1686.0964
         + 1691.0967
         + 1744.0998
         + 1682.0962
         + 1665.0952
         + 1699.0972
         + 1682.0963
         + 1685.0963
         + 1710.0978
         * avg 1693.29684
         * qps 590563
         * 60fps 9842
         * 30fps 19685
         *
         * depth 2048 buf 1M quad
         + 2060.1178
         + 2058.1177
         + 2043.1169
         + 2053.1175
         + 2031.1161
         + 2032.1162
         + 2042.1168
         + 2043.1169
         + 2071.1185
         + 2021.1156
         * avg 2045.517
         * qps 488873
         * 60fps 8147
         * 30fps 16295
         *
         * no depth 8000 buf 1M quad (disabled depth buffer so it needs to process all the fragments in a triangle)
         + 3495.1999
         + 3487.1994
         + 3428.1961
         + 3482.1992
         + 3469.1984
         + 3461.198
         + 3441.1968
         + 3487.1995
         + 3441.1969
         + 3490.1996
         * avg 3468.29838
         * qps 288325
         * 60fps 4805
         * 30fps 9610
         */

14
I don't understand what's happening but you have my interest.
IIRC the Saturn graphics hardware has no Z-buffer anyway, perhaps ignore it then?

Question: What graphics API is this? [OGL/Vulkan/DX]
[my own research indicates: various, depending on build target]

I'm using the cross-platform desktop OpenGL project, so the MonoGame content pipeline automagically converts the HLSL to GLSL. I don't think I'm doing anything platform-specific at this point. Shaders are written in HLSL because that's what XNA used. I think I only have access to shader model 2 or equivalent, so no geometry shaders or other things that might come in handy.
One of the limiting things in this case is that rasterization and z buffering are out of my hands. Writing in Vulkan would take care of this (I think), but as far as I know the end result of all this finagling will run on older hardware than Vulkan supports.

15
Finagling that inverse bilinear shader into my own wasn't so hard. For testing I had it tell the gpu to draw two triangles over the entire framebuffer and let it clip fragments. Unfortunately that also means the z buffer becomes useless so I finagled a couple more algorithms to fit a convex shape over the arbitrary quads. Now that I had several ways to draw a quad, I wondered how the performance would be. So I had it draw a million random quads with each method and recorded the time it took in milliseconds 11 times and threw the first one away just in case of shader initialization or whatever.
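For reference, the derived figures in the result blocks (avg, qps, 60fps, 30fps) appear to follow from the timings by simple arithmetic. The sketch here reproduces them for the ten kept DrawQuadQuick timings. It assumes the per-frame numbers are truncated rather than rounded, which matches the posted values. `summarize` is a hypothetical helper, not part of the test program.

```python
def summarize(times_ms, quads_per_run=1_000_000):
    """Turn per-run timings (ms) into average time and throughput figures."""
    avg_ms = sum(times_ms) / len(times_ms)
    qps = quads_per_run / (avg_ms / 1000.0)  # quads per second
    return {
        "avg_ms": avg_ms,
        "qps": int(qps),                 # truncated, as in the posted figures
        "per_frame_60": int(qps / 60),   # quads available per frame at 60 fps
        "per_frame_30": int(qps / 30),   # quads available per frame at 30 fps
    }


# The ten kept DrawQuadQuick runs (the discarded first run is not listed).
draw_quad_quick = [6867.3928, 6995.4001, 6851.3919, 6944.3972, 6838.3911,
                   6815.3898, 6864.3927, 6919.3958, 6901.3948, 6857.3922]
print(summarize(draw_quad_quick))
```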

Performance is incredibly low due to sending quads one at a time to the GPU. The results below are ordered from fastest to slowest.

/* DrawQuadQuick Colored quad. Tessellated as a triangle fan around the midpoint of the 4 vertices. That's 4 triangles. Sort of accurate for convex quads.
         * 6867.3928
         * 6995.4001
         * 6851.3919
         * 6944.3972
         * 6838.3911
         * 6815.3898
         * 6864.3927
         * 6919.3958
         * 6901.3948
         * 6857.3922
         * avg 6885.49384
         * qps 145232
         * 60fps 2420
         * 30fps 4841
        */
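A minimal sketch of the fan tessellation described for DrawQuadQuick, assuming the midpoint is the plain average of the four corners and the corners are given in winding order. The function names are hypothetical.

```python
def fan_triangles(quad):
    """Split a quad (four (x, y) corners in winding order) into 4 triangles
    that all share the average of the corners."""
    cx = sum(p[0] for p in quad) / 4.0
    cy = sum(p[1] for p in quad) / 4.0
    center = (cx, cy)
    return [(center, quad[i], quad[(i + 1) % 4]) for i in range(4)]


def triangle_area(tri):
    """Unsigned area of a triangle given as three (x, y) points."""
    (px, py), (qx, qy), (rx, ry) = tri
    return abs((qx - px) * (ry - py) - (qy - py) * (rx - px)) / 2.0


square = [(0, 0), (2, 0), (2, 2), (0, 2)]
tris = fan_triangles(square)
print(len(tris), sum(triangle_area(t) for t in tris))
```

For a convex quad the average of the corners lies inside the shape, so the four triangles tile it without overlap. For a concave quad that is not guaranteed, which would be consistent with the "sort of accurate for convex quads" caveat.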

/* DrawSpriteQuick Gouraud sprite. 4 tris. Accurate for squares only.
         * 8050.4605
         * 8045.4601
         * 8059.4609
         * 7989.457
         * 8070.4616
         * 8042.46
         * 8043.46
         * 8056.4608
         * 7890.4514
         * 8038.4598
         * avg 8028.65921
         * qps 124553
         * 60fps 2075
         * 30fps 4151
         */

/* DrawSpriteBilinearToScreen Gouraud sprite using the inverse bilinear shader. Draws 2 tris over the entire screen. Z buffer is useless.
         * 9706.5551
         * 9698.5547
         * 9731.5566
         * 9703.555
         * 9732.5567
         * 9648.5518
         * 9703.555
         * 9668.553
         * 9691.5543
         * 9756.5581
         * avg 9704.15503
         * qps 103048
         * 60fps 1717
         * 30fps 3434
        */
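For readers unfamiliar with the inverse bilinear step, here is a CPU-side sketch of one commonly published formulation. It is not the shader used for these tests. Given a point p and quad corners a, b, c, d, it solves p = a + (b-a)u + (d-a)v + (a-b+c-d)uv for (u, v) via a quadratic in v. The function names and the `eps` tolerance are assumptions.

```python
import math


def cross2(a, b):
    """2D cross product (z component of the 3D cross product)."""
    return a[0] * b[1] - a[1] * b[0]


def inverse_bilinear(p, a, b, c, d, eps=1e-9):
    """Return (u, v) in [0, 1]^2 for point p inside quad a, b, c, d, else None."""
    e = (b[0] - a[0], b[1] - a[1])
    f = (d[0] - a[0], d[1] - a[1])
    g = (a[0] - b[0] + c[0] - d[0], a[1] - b[1] + c[1] - d[1])
    h = (p[0] - a[0], p[1] - a[1])

    # Quadratic k2*v^2 + k1*v + k0 = 0 in v.
    k2 = cross2(g, f)
    k1 = cross2(e, f) + cross2(h, g)
    k0 = cross2(h, e)

    if abs(k2) < eps:                 # degenerates to a linear equation
        if abs(k1) < eps:
            return None
        candidates = [-k0 / k1]
    else:
        disc = k1 * k1 - 4.0 * k0 * k2
        if disc < 0.0:
            return None
        root = math.sqrt(disc)
        candidates = [(-k1 - root) / (2.0 * k2), (-k1 + root) / (2.0 * k2)]

    for v in candidates:
        dx = e[0] + g[0] * v
        dy = e[1] + g[1] * v
        # Solve for u with whichever component has the larger denominator.
        if abs(dx) >= abs(dy):
            if abs(dx) < eps:
                continue
            u = (h[0] - f[0] * v) / dx
        else:
            u = (h[1] - f[1] * v) / dy
        if 0.0 <= u <= 1.0 and 0.0 <= v <= 1.0:
            return (u, v)
    return None


print(inverse_bilinear((1.5, 1.125), (0, 0), (4, 0), (5, 3), (1, 2)))
```

In a fragment shader the same test doubles as the clip: fragments for which no (u, v) in range exists are discarded, which is why two screen-sized triangles are enough to draw the quad.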

/* DrawSpriteBilinear Gouraud sprite using inverse bilinear shader with fitted shape. 1 tri for concave, else 2.
         * 10868.6216
         * 10942.6258
         * 10955.6266
         * 11025.6306
         * 10959.6269
         * 11031.631
         * 10975.6278
         * 11082.6339
         * 10812.6185
         * 10827.6193
         * avg 10948.2262
         * qps 91338
         * 60fps 1522
         * 30fps 3044
         */
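Choosing between 1 and 2 triangles for the fitted shape needs a convexity test. One simple way to do that, sketched here as an assumption about how it could be done rather than how DrawSpriteBilinear does it, is to check that the turn direction has the same sign at every corner. The names are hypothetical.

```python
def is_convex(quad):
    """True if four (x, y) corners in winding order form a convex quad."""
    signs = []
    for i in range(4):
        p, q, r = quad[i], quad[(i + 1) % 4], quad[(i + 2) % 4]
        # Cross product of edge p->q with edge q->r: sign gives turn direction.
        turn = (q[0] - p[0]) * (r[1] - q[1]) - (q[1] - p[1]) * (r[0] - q[0])
        if turn != 0:
            signs.append(turn > 0)
    return len(signs) > 0 and (all(signs) or not any(signs))


def fitted_triangle_count(quad):
    """Triangle count for the fitted shape, following the rule in the comment."""
    return 2 if is_convex(quad) else 1


print(is_convex([(0, 0), (4, 0), (4, 4), (0, 4)]))  # square
print(is_convex([(0, 0), (4, 0), (1, 1), (0, 4)]))  # dart shape
```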

/* DrawQuad Colored quad tessellated into horizontal strips. 2 tris per strip. Reasonable accuracy with enough strips.
         * 32 strips
         * 11216.6415
         * 11114.6357
         * 10986.6284
         * 11144.6374
         * 11135.6369
         * 11160.6384
         * 11101.635
         * 11102.635
         * 11217.6416
         * 11071.6332
         * avg 11125.23631
         * qps 89885
         * 60fps 1498
         * 30fps 2996
        */

/* DrawSprite Gouraud sprite tessellated into horizontal strips. 2 tris per strip. Very accurate when strips == texture height. Reasonable accuracy with enough strips.
         * 32 strips
         * 13811.79
         * 14014.8016
         * 13926.7966
         * 13998.8007
         * 14025.8023
         * 13840.7916
         * 13935.7971
         * 13998.8007
         * 13912.7958
         * 13902.7952
         * avg 13936.89716
         * qps 71751
         * 60fps 1195
         * 30fps 2391
         *
         * 96 strips
         * 25656.4675
         * 25565.4623
         * 25539.4608
         * 25433.4547
         * 25344.4496
         * 25326.4486
         * 25285.4462
         * 25299.447
         * 25298.4469
         * 25495.4583
         * avg 25424.45419
         * qps 39332
         * 60fps 655
         * 30fps 1311
        */
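The strip tessellation used by DrawQuad and DrawSprite can be sketched as follows, assuming the strips are formed by interpolating along the left and right edges in equal steps. The function name and the corner order are hypothetical.

```python
def strip_triangles(quad, strips):
    """Tessellate a quad into `strips` bands of 2 triangles each.

    quad = (top_left, top_right, bottom_right, bottom_left) as (x, y) pairs.
    """
    if strips < 1:
        raise ValueError("strips must be at least 1")
    a, b, c, d = quad

    def lerp(p, q, t):
        return (p[0] + (q[0] - p[0]) * t, p[1] + (q[1] - p[1]) * t)

    tris = []
    for i in range(strips):
        t0, t1 = i / strips, (i + 1) / strips
        l0, r0 = lerp(a, d, t0), lerp(b, c, t0)  # upper edge of the strip
        l1, r1 = lerp(a, d, t1), lerp(b, c, t1)  # lower edge of the strip
        tris.append((l0, r0, r1))
        tris.append((l0, r1, l1))
    return tris


print(len(strip_triangles(((0, 0), (4, 0), (4, 4), (0, 4)), 32)))
```

With 32 strips each quad becomes 64 triangles and with 96 strips it becomes 192, which goes some way to explaining why throughput drops as the strip count rises.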

145 kqps DrawQuadQuick
125 kqps DrawSpriteQuick
103 kqps DrawSpriteBilinearToScreen
091 kqps DrawSpriteBilinear
090 kqps DrawQuad 32 strips
072 kqps DrawSprite 32 strips
039 kqps DrawSprite 96 strips (texture height)

Below from: https://segaretro.org/Sega_Saturn/Technical_specifications#VDP1
Polygon rendering performance: Lighting

    800,000 polygons/s: Flat shading, 32-pixel polygons
    500,000 polygons/s: Flat shading, 50-pixel polygons
    200,000 polygons/s: Gouraud shading, 32-pixel polygons

Texture mapping performance: Lighting

    300,000 polygons/s: 32-texel textures
    200,000 polygons/s: 70-texel textures
    140,000 polygons/s: Gouraud shading, 32-texel textures


This is all on my desktop computer, which can run recent games at mediumish settings depending on the game. On more modest machines like my laptop, it might be well below Saturn-level performance.
Anyways, I think the performance will be more than satisfactory once I figure out how to send quads in bulk to the GPU.

Just for fun, here's a million quads with and without z buffer.

