Messages - Jorhlok

So, why doesn't the mesh effect create artifacts that the apple II and Atari 8 bit line used often for color (green and pink) in high-res mode?

Short answer: it did.

IIRC the original Apple II hardware is monochrome under the hood and made color on composite monitors from patterns of white and black pixels laid out horizontally along a scanline.
I'm not as familiar with Atari computers, but high-resolution modes on many computers, such as 80-column mode on a Commodore 128, are hardly legible on a composite monitor or TV. Thankfully you could use them with monochrome monitors or the equivalent of S-Video monitors.

Is SCART basically RGB in another form factor?

I'm an American so I might have something wrong, but my understanding is that SCART is a multipurpose audiovisual cable/port specification. It carries analog video and audio and supports the equivalent of composite, S-Video, component, and RGB. Not every device that uses SCART supports all of these, and even when it does you need a cable that has the pins for the signal you want. This is sort of like an old VGA cable in my drawer that only has the bare minimum of pins and wires, so things like the monitor sending its specs over one of the pins won't work with that cable.

You mentioned the fm towns marty 3D sphere which gave me an idea but then I looked it up on youtube and it was nothing like my idea.

The idea is to use a normal VDP2 layer with small premade tiles and a CPU can decide how to composite the tiles to recreate what looks like any shadows.

The VDP1 draws arbitrary lines from the quad's left edge to its right edge using a greedy algorithm, which causes pixels to be overdrawn. This avoids PS1-like gaps but causes the moire pattern with transparency. The other thing that causes the moire pattern was demonstrated well in one of Jon Burton's Sonic R videos. So unless each line is perfectly vertical or horizontal, it'll overdraw, causing transparency to be bollocks. This applies to skewed and rotated quads, not just perspective corrected quads.

One of the things you might have heard is that CRTs and composite video blur the dithering enough that just drawing a dithered black oval over the ground makes a convincing shadow. I didn't understand just how effective this was until I saw this composite capture of Z-treme on youtube:
Anyways you can tell how, back in the day, a team on a deadline scraping for ounces of power on a not quite adequate for 3D system would just draw an oval and be done with it.

I think one of the major issues with performance is that we're using an incredibly old jack-of-all-trades graphics library instead of tightly focused custom routines. Unlike a more straightforward graphics system, the Saturn makes it really hard to write efficient graphics routines, or even to wrap your head around it, especially the memory scheduling. If two chips try to access the same memory chip in the same cycle, one of them will be halted until it gets access. This can basically throw away many cycles of work the Saturn would otherwise be able to accomplish.

Anyways we got some interesting ideas floating around.

Project announcement / Re: Sonic Z-Treme
« on: August 12, 2018, 11:42:42 pm »
Via the last post on SAGE from SonicRetro:

"SAGE 2018 is DELAYED to August 25th through September 1st."
"We will now be accepting submissions to SAGE up until August 18th."

I imagine at least until the 18th XL2 will be polishing Z-treme as much as possible.

I was thinking of making a python script to do it and I may make it soon. However, I think one or two of the edges of the triangle need to span an entire edge of the sprite because that's how it'll be distorted when drawn as a quad.

Well, the python script I was imagining would stretch either columns or rows of pixels but maybe a full uv preprocessor could be made instead. I might make the simple version first and think on the latter later.

More to the point, it's pre-processed; the image on the sphere is pretty much just a screenshot. You'll notice the heavy lag time before the ball comes into play after the game stops.

Oh yeah, the background wouldn't be in the VDP1 framebuffer so it would necessarily need to be touched up to include what the VDP2 would normally produce.

Another example is Rayman at those end level sequences, where it takes the screen and, (maybe) using software, wraps the screen on a sphere, rolls it up like a sheet of paper, or does water ripple distortion, while maintaining a constant framerate of 60 fps! Now that is impressive. I think the lag is because of the efficiency of the code in Tempest 2000.

I watched that and was thinking about it. I think a way to do that is to copy chunks of the framebuffer into sprites in VRAM and then use those sprites to draw a sphere with the regular VDP1 functionality. This works because the image on the sphere doesn't change.

From that video Jon says he planned to have an environment mapped metal sonic but scrapped it and only used the software renderer for the trophies.

From what I know of the VDP1 it draws in strips of pixels from the left side of the quad to the right. This way it reads from the sprite linearly. So it would take all the pixels from start to end in a row and stretch that from the left side of the quad to the right instead of leaving blank pixels to either side, I think? Having not programmed the saturn yet I can't say for certain, though.
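To make that left-edge-to-right-edge idea concrete, here is a rough Python sketch that counts how often each pixel gets written. It is a simplification under my own assumptions: a plain DDA line, one line per step along the longer side edge, and integer corner coordinates. The real VDP1 line algorithm differs in its details, and all function names here are made up for illustration.

```python
def lerp(a, b, t):
    return a + (b - a) * t

def chebyshev(p, q):
    # Number of pixel steps needed to walk from p to q.
    return max(abs(round(q[0]) - round(p[0])), abs(round(q[1]) - round(p[1])))

def draw_line(p0, p1, hits):
    """Plot a simple DDA line and count the writes per pixel in `hits`."""
    steps = chebyshev(p0, p1)
    for i in range(steps + 1):
        t = i / steps if steps else 0.0
        px = round(lerp(p0[0], p1[0], t))
        py = round(lerp(p0[1], p1[1], t))
        hits[(px, py)] = hits.get((px, py), 0) + 1

def forward_map_quad(a, b, c, d):
    """a = top-left, b = top-right, c = bottom-right, d = bottom-left.

    Walks down the left edge (a to d) and the right edge (b to c) together
    and draws one line between them per step (assumed behaviour).
    """
    hits = {}
    lines = max(chebyshev(a, d), chebyshev(b, c))
    for i in range(lines + 1):
        t = i / lines if lines else 0.0
        start = (lerp(a[0], d[0], t), lerp(a[1], d[1], t))
        end = (lerp(b[0], c[0], t), lerp(b[1], c[1], t))
        draw_line(start, end, hits)
    return hits

def overdrawn(hits):
    """Count pixels that were written more than once."""
    return sum(1 for n in hits.values() if n > 1)

square = forward_map_quad((0, 0), (7, 0), (7, 7), (0, 7))
pinched = forward_map_quad((0, 0), (7, 0), (7, 7), (0, 0))
print(overdrawn(square), overdrawn(pinched))
```

In this sketch the axis-aligned square has no overdraw, while the quad with its left edge pinched to a point writes the shared corner once per line. That repeated write is the kind of thing that would break half-transparency.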

Sorry Jorhlok. I totally missed your great contribution!

It is a great project! Have you thought about adding your real preview of VDP1 graphics to @XL2's work on exporting 3D geometry from Blender 3D to the SS?

That would make it possible to create 3D graphics in a tool like Blender and really be able to visualize how the graphics will look, and then later export the data to a format optimized for the machine. It is one of the tasks that remained pending in the past. There were no plugins for SS graphics, or at least I have not seen any for 3DS Max or others. However, for PSX there were many. This made it easier to take advantage of both the artist's and the machine's possibilities.

What do you think?
I hadn't heard of exporter plugins for the PSX. I'll have to check that out and see what's feasible.

I decided to try storing the quad data in a texture that triangles would then reference in the shader. I'm using a Vector4 format texture so I can store 4 arbitrary floats per pixel, including values outside of 0-1. Texture width for this proof of concept is 8 and texture height is the buffer size. The inverse bilinear algorithm doesn't use z values so I left them out. So the first two texels in a row store the quad's coordinates, the next two are uv coordinates, and the next 4 texels are gouraud colors for each corner. I'm sending triangles in with position and uv values. The uv of each vertex of a triangle should be the same so the value makes it to the pixel shader without being changed by interpolation. There, the V coordinate points to the row in the quad buffer. U doesn't do anything yet but will carry options such as sprite vs quad, screendoors, etc.
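As a rough illustration of that row layout, here is a Python sketch that packs one quad into 8 texels of 4 floats each. The order of values inside each texel and the half-texel offset in `row_to_v` are my assumptions, and the function names are hypothetical; the post only fixes which texels hold positions, uvs, and colors.

```python
def pack_quad_row(corners, uvs, colors):
    """Pack one quad into a row of 8 RGBA-float texels.

    corners: four (x, y) pairs, uvs: four (u, v) pairs,
    colors: four (r, g, b, a) tuples, one per corner.
    Layout (assumed ordering inside texels):
      texels 0-1: x0 y0 x1 y1 | x2 y2 x3 y3
      texels 2-3: u0 v0 u1 v1 | u2 v2 u3 v3
      texels 4-7: one gouraud color per corner
    """
    if not (len(corners) == len(uvs) == len(colors) == 4):
        raise ValueError("a quad needs exactly four corners, uvs and colors")
    flat_xy = [value for point in corners for value in point]
    flat_uv = [value for point in uvs for value in point]
    row = [
        tuple(flat_xy[0:4]),
        tuple(flat_xy[4:8]),
        tuple(flat_uv[0:4]),
        tuple(flat_uv[4:8]),
    ]
    row.extend(tuple(color) for color in colors)
    return row

def row_to_v(row_index, buffer_size):
    """V coordinate that samples the middle of a row (assumed convention)."""
    return (row_index + 0.5) / buffer_size

row = pack_quad_row(
    [(0, 0), (10, 0), (10, 10), (0, 10)],
    [(0, 0), (1, 0), (1, 1), (0, 1)],
    [(1, 0, 0, 1)] * 4,
)
print(len(row), row_to_v(0, 8000))
```

Every vertex of the triangles covering this quad would then carry the same V from `row_to_v`, so the pixel shader can fetch all 8 texels of that row.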

Here's what it looks like. That strip along the left is the first 240 quads in the last drawn buffer. Values outside of 0-1 are clamped so most of the first two texels just appear as white.

Although the method is a bit unorthodox I'm getting good results compared to last time.

591 kqps depth 8000 buf 1M quad
489 kqps depth 2048 buf 1M quad
288 kqps no depth 8000 buf 1M quad

        /* depth 8000 buf 1M quad
         + 1688.0965
         + 1686.0964
         + 1691.0967
         + 1744.0998
         + 1682.0962
         + 1665.0952
         + 1699.0972
         + 1682.0963
         + 1685.0963
         + 1710.0978
         * avg 1693.29684
         * qps 590563
         * 60fps 9842
         * 30fps 19685
         * depth 2048 buf 1M quad
         + 2060.1178
         + 2058.1177
         + 2043.1169
         + 2053.1175
         + 2031.1161
         + 2032.1162
         + 2042.1168
         + 2043.1169
         + 2071.1185
         + 2021.1156
         * avg 2045.517
         * qps 488873
         * 60fps 8147
         * 30fps 16295
         * no depth 8000 buf 1M quad (disabled depth buffer so it needs to process all the fragments in a triangle)
         + 3495.1999
         + 3487.1994
         + 3428.1961
         + 3482.1992
         + 3469.1984
         + 3461.198
         + 3441.1968
         + 3487.1995
         + 3441.1969
         + 3490.1996
         * avg 3468.29838
         * qps 288325
         * 60fps 4805
         * 30fps 9610

I don't understand what's happening but you have my interest.
IIRC the Saturn graphics hardware has no Z-buffer anyway, perhaps ignore it then?

Question: What graphics API is this? [OGL/Vulkan/DX]
[my own research indicates: various, depending on build target]

I'm using the cross-platform desktop OpenGL project, so the MonoGame content pipeline automagically converts the HLSL to GLSL. I don't think I'm doing anything platform specific at this point. Shaders are written in HLSL because that's what XNA used. I think I only have access to shader model 2 or equivalent, so no geometry shader or other things that might come in handy.
One of the limiting things in this case is that rasterization and z buffering are out of my hands. Writing in Vulkan would take care of this (I think), but as far as I know the end result of all this finagling will run on older hardware than Vulkan supports.

Finagling that inverse bilinear shader into my own wasn't so hard. For testing I had it tell the gpu to draw two triangles over the entire framebuffer and let it clip fragments. Unfortunately that also means the z buffer becomes useless so I finagled a couple more algorithms to fit a convex shape over the arbitrary quads. Now that I had several ways to draw a quad, I wondered how the performance would be. So I had it draw a million random quads with each method and recorded the time it took in milliseconds 11 times and threw the first one away just in case of shader initialization or whatever.
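For reference, the avg / qps / 60fps / 30fps figures in these listings follow from simple arithmetic on the timings. A small Python sketch of it, assuming the first run is the warm-up that gets thrown away and that results are truncated to whole numbers (the function name and dictionary keys are made up):

```python
def summarize(times_ms, quads_per_run=1_000_000):
    """Turn a list of per-run timings (milliseconds) into throughput figures."""
    kept = times_ms[1:]  # drop the first run (shader initialization etc.)
    avg_ms = sum(kept) / len(kept)
    qps = quads_per_run / (avg_ms / 1000.0)  # quads per second
    return {
        "avg_ms": avg_ms,
        "qps": int(qps),
        "per_frame_60": int(qps / 60),  # quads per frame at 60 fps
        "per_frame_30": int(qps / 30),  # quads per frame at 30 fps
    }

# A placeholder warm-up value followed by the ten "depth 8000" timings.
runs = [9999.0, 1688.0965, 1686.0964, 1691.0967, 1744.0998, 1682.0962,
        1665.0952, 1699.0972, 1682.0963, 1685.0963, 1710.0978]
print(summarize(runs))
```

Fed the "depth 8000 buf 1M quad" timings, this gives the same 590563 qps, 9842 quads per frame at 60fps, and 19685 at 30fps as the listing.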

Performance is incredibly low due to sending quads one at a time to the GPU. The results below are ordered from fastest to slowest.

/* DrawQuadQuick Colored quad. Tessellated as a triangle fan around the midpoint of the 4 vertices. That's 4 triangles. Sort of accurate for convex quads.
         * 6867.3928
         * 6995.4001
         * 6851.3919
         * 6944.3972
         * 6838.3911
         * 6815.3898
         * 6864.3927
         * 6919.3958
         * 6901.3948
         * 6857.3922
         * avg 6885.49384
         * qps 145232
         * 60fps 2420
         * 30fps 4841

/* DrawSpriteQuick Gouraud sprite. 4 tris. Accurate for squares only.
         * 8050.4605
         * 8045.4601
         * 8059.4609
         * 7989.457
         * 8070.4616
         * 8042.46
         * 8043.46
         * 8056.4608
         * 7890.4514
         * 8038.4598
         * avg 8028.65921
         * qps 124553
         * 60fps 2075
         * 30fps 4151

/* DrawSpriteBilinearToScreen Gouraud sprite using the inverse bilinear shader. Draws 2 tris over the entire screen. Z buffer is useless.
         * 9706.5551
         * 9698.5547
         * 9731.5566
         * 9703.555
         * 9732.5567
         * 9648.5518
         * 9703.555
         * 9668.553
         * 9691.5543
         * 9756.5581
         * avg 9704.15503
         * qps 103048
         * 60fps 1717
         * 30fps 3434

/* DrawSpriteBilinear Gouraud sprite using inverse bilinear shader with fitted shape. 1 tri for concave, else 2.
         * 10868.6216
         * 10942.6258
         * 10955.6266
         * 11025.6306
         * 10959.6269
         * 11031.631
         * 10975.6278
         * 11082.6339
         * 10812.6185
         * 10827.6193
         * avg 10948.2262
         * qps 91338
         * 60fps 1522
         * 30fps 3044

/* DrawQuad Colored quad tessellated into horizontal strips. 2 tris per strip. Reasonable accuracy with enough strips.
         * 32 strips
         * 11216.6415
         * 11114.6357
         * 10986.6284
         * 11144.6374
         * 11135.6369
         * 11160.6384
         * 11101.635
         * 11102.635
         * 11217.6416
         * 11071.6332
         * avg 11125.23631
         * qps 89885
         * 60fps 1498
         * 30fps 2996

/* DrawSprite Gouraud sprite tessellated into horizontal strips. 2 tris per strip. Very accurate when strips == texture height. Reasonable accuracy with enough strips.
         * 32 strips
         * 13811.79
         * 14014.8016
         * 13926.7966
         * 13998.8007
         * 14025.8023
         * 13840.7916
         * 13935.7971
         * 13998.8007
         * 13912.7958
         * 13902.7952
         * avg 13936.89716
         * qps 71751
         * 60fps 1195
         * 30fps 2391
         * 96 strips
         * 25656.4675
         * 25565.4623
         * 25539.4608
         * 25433.4547
         * 25344.4496
         * 25326.4486
         * 25285.4462
         * 25299.447
         * 25298.4469
         * 25495.4583
         * avg 25424.45419
         * qps 39332
         * 60fps 655
         * 30fps 1311

145 kqps DrawQuadQuick
125 kqps DrawSpriteQuick
103 kqps DrawSpriteBilinearToScreen
091 kqps DrawSpriteBilinear
090 kqps DrawQuad 32 strips
072 kqps DrawSprite 32 strips
039 kqps DrawSprite 96 strips (texture height)

Below from:
Polygon rendering performance: Lighting

    800,000 polygons/s: Flat shading, 32-pixel polygons
    500,000 polygons/s: Flat shading, 50-pixel polygons
    200,000 polygons/s: Gouraud shading, 32-pixel polygons

Texture mapping performance: Lighting

    300,000 polygons/s: 32-texel textures
    200,000 polygons/s: 70-texel textures
    140,000 polygons/s: Gouraud shading, 32-texel textures

This is all on my desktop computer which can do recent games with mediumish settings depending on the game. On more modest machines like my laptop it might be well below saturn-level performance.
Anyways, I think the performance will be more than satisfactory once I figure out how to send quads in bulk to the GPU.

Just for fun, here's a million quads with and without z buffer.

I saw this and had an idea to try so I did a screen cap of manipulating an image in gimp and the result it drew. It got reasonable results so I cut the videos together and slapped it on youtube.

For this I took each vertical column of pixels and stretched them vertically to fill the space.
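A minimal Python sketch of that column stretch, assuming the image is a list of rows and transparent pixels are marked with `None`. The nearest-neighbour resampling and the function name are my choices for illustration, not what gimp did in the video.

```python
def stretch_columns(image, transparent=None):
    """Stretch the occupied part of every pixel column to the full height.

    image: list of rows, each row a list of pixel values.
    Columns that are fully transparent are left transparent.
    """
    height = len(image)
    width = len(image[0])
    out = [[transparent] * width for _ in range(height)]
    for x in range(width):
        column = [image[y][x] for y in range(height)]
        filled = [y for y, pixel in enumerate(column) if pixel != transparent]
        if not filled:
            continue
        top, bottom = filled[0], filled[-1]
        span = bottom - top + 1
        for y in range(height):
            # Nearest-neighbour sample from the occupied span of this column.
            out[y][x] = column[top + (y * span) // height]
    return out

source = [
    [None, "c"],
    ["a", "d"],
    ["b", "e"],
    [None, "f"],
]
for row in stretch_columns(source):
    print(row)
```

Swapping rows for columns gives the horizontal version of the same pre-distortion.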

EDIT: I was worried about the texture warping horizontally, so I added some lines to the corrected texture to see how it would be affected. I was expecting the grey lines to bend either left or right, but to my surprise they didn't.


The shader itself supports textured/untextured tris, additive/subtractive gouraud for textured tris, color gouraud for untextured tris, screen-door style dithering transparency at different sizes, and configurable "magic pink" style transparency with any color. It's all in RGBA as of writing, so no indexed color or CLUTs at present. External to the shader are things like the depth buffer, which is still usable.
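The screen-door and "magic pink" tests boil down to a couple of per-pixel checks. Here is a CPU-side Python sketch of the idea, not the actual HLSL; the checkerboard parity, the default key color, and the names are assumptions.

```python
MAGIC_PINK = (255, 0, 255)  # assumed default key color

def mesh_visible(x, y, size=1):
    """Checkerboard 'screen door': keep only cells with even parity."""
    return ((x // size) + (y // size)) % 2 == 0

def shade_pixel(src, dst, x, y, mesh=False, size=1, key=MAGIC_PINK):
    """Return the color that ends up in the framebuffer at (x, y)."""
    if src == key:
        return dst  # key color is fully transparent
    if mesh and not mesh_visible(x, y, size):
        return dst  # this pixel falls in a hole of the screen door
    return src

background = (1, 2, 3)
grey = (9, 9, 9)
print(shade_pixel(grey, background, 0, 0, mesh=True))
print(shade_pixel(grey, background, 1, 0, mesh=True))
```

With `size=2` the holes become 2x2 blocks, which is one way to read "at different sizes".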

Then using (too) many triangles you can get the same sprite distortion as the saturn.

Here's some screenshots showing what I mean. More here:

This code will draw quads with a variable number of horizontal strips. This is an improvement over my last thing, which drew in pixel-thin strips and used the default shader, so it only had multiplicative gouraud and had to switch to a second framebuffer, draw, and mask over it to accomplish the dithering effect.
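The strip approach can be sketched in a few lines of Python. This is only the geometry side under my own assumptions (corner order, uv convention, and winding are guesses, and the names are hypothetical), not the actual MonoGame code.

```python
def lerp2(p, q, t):
    return (p[0] + (q[0] - p[0]) * t, p[1] + (q[1] - p[1]) * t)

def tessellate_strips(a, b, c, d, strips):
    """Split a quad into horizontal strips of two triangles each.

    a = top-left, b = top-right, c = bottom-right, d = bottom-left.
    Returns a list of triangles; each vertex is (position, uv).
    """
    if strips < 1:
        raise ValueError("need at least one strip")
    triangles = []
    for i in range(strips):
        t0, t1 = i / strips, (i + 1) / strips
        # Each strip spans one slice of the left edge and one of the right.
        left0 = (lerp2(a, d, t0), (0.0, t0))
        right0 = (lerp2(b, c, t0), (1.0, t0))
        left1 = (lerp2(a, d, t1), (0.0, t1))
        right1 = (lerp2(b, c, t1), (1.0, t1))
        triangles.append((left0, right0, right1))
        triangles.append((left0, right1, left1))
    return triangles

tris = tessellate_strips((0, 0), (8, 0), (8, 8), (0, 8), 32)
print(len(tris))
```

Because each strip maps one slice of texture rows straight across from the left edge to the right edge, more strips get closer to the row-by-row distortion, at the cost of two triangles per strip.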

Anyways I was thinking of making a low poly 3D modeller/animator targeting saturn style graphics also supporting billboards and camera facing strips in model.
One thing I'm not sure of is how important indexed color effects may be or how they're used on the saturn.

Don't get your hopes up or anything, my last thing only had keyboard/table input, no saving or textures, and looked like this.

EDIT: Okay I've been reading some of the VDP1 User's Manual (thanks google and redacted) and I'm getting the difference between VDP1 CLUT and VDP2 Color Banks and also mixed RGB/indexed color in the same framebuffer. Also I noticed that "Table 5.3 Relationship between Gouraud Shading Table Settings and Correction Values" is lining up with how I got the shading to match up with the example in the gallery by using color values from 0.25f and 0.75f so the shader will add +-0.5f to any pixel. The "correction values" go from -16 to +15 or half of 5-bit color space.
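As I read that table, each 5-bit gouraud table value maps to a signed correction by subtracting 16. A small integer-only Python sketch of that reading; clamping the result to 0-31 is my assumption, and the function names are made up.

```python
def gouraud_correction(table_value):
    """Map a 5-bit gouraud table value (0..31) to a correction (-16..+15)."""
    if not 0 <= table_value <= 31:
        raise ValueError("gouraud table values are 5 bits")
    return table_value - 16

def apply_gouraud(channel, table_value):
    """Add the correction to a 5-bit color channel, clamped to 0..31 (assumed)."""
    return max(0, min(31, channel + gouraud_correction(table_value)))

print(gouraud_correction(0), gouraud_correction(16), gouraud_correction(31))
print(apply_gouraud(20, 31), apply_gouraud(5, 0))
```

A table value of 16 leaves the pixel unchanged, which is why a mid-grey vertex color acts as "no shading" in a float version of the same idea.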

Anyways, still not sure if it's worth working indexed color into the shader or if fudging it with software is good enough.

EDIT2: Okay after much reading and confusion over the VDP1 and VDP2 User's Manuals yesterday, I think I have a grip on the terminology. I'll briefly explain how I see it so call me out if I'm off base. "Color Bank" is just what the VDP1 calls the other bits in a pixel beyond the index specified by the image. This can be more significant bits in the CRAM index (up to 11 bits total index) and also color calc or other special bits. The VDP2 has 1024 or 2048 colors in CRAM. CLUTs are how the VDP1 decompresses a 4-bit image to arbitrary 16 bit patterns. They're specified with a pointer per draw call. I think much of my confusion was each chip calling the same thing by different names or treating the same data slightly differently.
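To check my own understanding, here is how I'd sketch the two paths for a 4-bit image in Python. Treating the color bank as "upper bits of a 16-bit word with the low 4 bits replaced by the pixel" is my reading of the manuals, and the function names are made up.

```python
def expand_with_clut(pixels_4bit, clut):
    """CLUT path: each 4-bit pixel picks one of 16 arbitrary 16-bit values."""
    if len(clut) != 16:
        raise ValueError("a 4-bit CLUT has exactly 16 entries")
    return [clut[pixel & 0xF] for pixel in pixels_4bit]

def apply_color_bank(pixels, bank_bits, index_bits=4):
    """Color bank path: keep the bank's upper bits, drop in the pixel index."""
    mask = (1 << index_bits) - 1
    upper = bank_bits & ~mask & 0xFFFF
    return [upper | (pixel & mask) for pixel in pixels]

clut = [0x8000 + i for i in range(16)]  # placeholder 16-bit patterns
print([hex(v) for v in expand_with_clut([0, 15, 3], clut)])
print([hex(v) for v in apply_color_bank([0x3, 0xF], 0x0120)])
```

The CLUT output can be any 16-bit pattern, including RGB values, while the color bank output is still an index (plus extra bits) that something later has to resolve against CRAM.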

Real quick, I noticed an error in my shader as of writing: a colored polygon is specified only by the vertex colors, so if I were to take a blue quad and shade it light to dark, it would lose saturation in between. I need to add polygon color as a variable in the shader. I was also thinking about other changes to make. Other than dealing with indexed color, I was thinking that since each object drawn basically needs to update the shader variables again, I could specify all four quad vertices as global variables and use inverse bilinear to calculate the uv coordinates, which would give accurate texturing without needing to draw in many, many strips. Like this example here:
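For anyone following along, the inverse bilinear step can be sketched on the CPU in Python. This uses the usual closed-form approach of solving a quadratic for v and then recovering u; I'm assuming the corners go a, b, c, d around the quad with u running a to b and v running a to d. It is an illustration of the technique, not the shader code itself.

```python
import math

def cross2(p, q):
    return p[0] * q[1] - p[1] * q[0]

def solve_u(h, e, f, g, v):
    # Use whichever component has the larger denominator to avoid dividing by ~0.
    dx = e[0] + g[0] * v
    dy = e[1] + g[1] * v
    if abs(dx) >= abs(dy):
        return (h[0] - f[0] * v) / dx if dx else None
    return (h[1] - f[1] * v) / dy

def inverse_bilinear(p, a, b, c, d):
    """Return (u, v) with p = a + u*e + v*f + u*v*g, or None if p is outside."""
    e = (b[0] - a[0], b[1] - a[1])
    f = (d[0] - a[0], d[1] - a[1])
    g = (a[0] - b[0] + c[0] - d[0], a[1] - b[1] + c[1] - d[1])
    h = (p[0] - a[0], p[1] - a[1])
    k2 = cross2(g, f)
    k1 = cross2(e, f) + cross2(h, g)
    k0 = cross2(h, e)
    if abs(k2) < 1e-9:
        # Parallelogram: the equation for v is linear.
        if abs(k1) < 1e-9:
            return None
        candidates = [-k0 / k1]
    else:
        disc = k1 * k1 - 4.0 * k0 * k2
        if disc < 0:
            return None
        root = math.sqrt(disc)
        candidates = [(-k1 - root) / (2.0 * k2), (-k1 + root) / (2.0 * k2)]
    for v in candidates:
        u = solve_u(h, e, f, g, v)
        if u is not None and 0 <= u <= 1 and 0 <= v <= 1:
            return (u, v)
    return None

print(inverse_bilinear((1.9375, 2.625), (0, 0), (4, 0), (6, 5), (1, 3)))
```

Running this per pixel gives the uv to sample directly, so one or two triangles covering the quad are enough, and pixels that come back as `None` are simply discarded.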

I don't think I said it before but the purpose of the shader is to experiment with saturn-like graphics but also could be used to co-develop an enhanced port of saturn homebrew or similar.

This comes into play with what I was thinking about implementing any indexed color in the shader. I was thinking that usually indexed color in a 3D saturn game is going to be used for the transparency effect and also saving VRAM. Transparency can be accomplished in RGBA and saving VRAM isn't an issue. Then I was thinking about advanced effects like that metallic shading XL2 showed off in Z-Treme recently. So finally I thought, "It sure would be a shame if an 'enhanced' port had to cut out effects," thus coming to my decision to try and implement CLUT and CRAM effects in the shader. It'll require some bullshittery with pulling integers of various sizes out of vector4 color but if anything is, a GPU is up to the task.
