Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GPU: Clip depth properly when also clamping #16049

Merged
merged 2 commits into from
Sep 19, 2022

Conversation

unknownbrackets
Copy link
Collaborator

This takes care of #11399 and #11216, but only for devices which support user clip spaces (i.e. not Mali.) Note that some other devices (I think older Intel, Adreno, and PowerVR) also don't support user clip spaces, but generally don't support depth clamp either.

We still need to do something about a geo shader for the range culling, and that could be used for this depth clamp as well.

Anyway, this at least solves it for quite a few devices.

-[Unknown]

Will be used later, for now just the enable/disable logic.
Helps situations like hrydgard#11216, where only one side should be clamped.
Keeps depth clamp (i.e. hrydgard#7932) working.  See hrydgard#11399.
@hrydgard
Copy link
Owner

Very cool! I'll review this tomorrow.

Yeah need to resurrect my geometry shader branch at some point and figure out why it's not working right, and how to implement all of these things with them...

@unknownbrackets
Copy link
Collaborator Author

unknownbrackets commented Sep 18, 2022

Here's a larger change (which is probably slow...) that would also allow things to work on other devices, as a fallback. I'm sure a geometry shader (where supported) would be faster, but I think Metal doesn't (didn't?) support those.

master...unknownbrackets:ppsspp:simulate-clip-range

It means a conditional discard practically always, though. Maybe it could be optional or something, for people on more powerful devices without user clip/cull and without geometry shaders, etc.

Depth clamp could be supported more places by just always writing depth in fragment shader when clamp unsupported, too. Could allow us to get rid of the depth range multipliers, but all these things can be a bit slow.

-[Unknown]

@ghost
Copy link

ghost commented Sep 19, 2022

Does my phone support this?
Screenshot_2022-09-19-09-10-32-74
Screenshot_2022-09-19-09-10-23-32

@unknownbrackets
Copy link
Collaborator Author

shaderClipDistance being marked available / enabled is what would say - unfortunately, it's cut off in the screenshot.

If you have an Adreno GPU, I think it's likely to be supported.

-[Unknown]

@ghost
Copy link

ghost commented Sep 19, 2022

shaderClipDistance being marked available / enabled is what would say - unfortunately, it's cut off in the screenshot.

If you have an Adreno GPU, I think it's likely to be supported.

-[Unknown]

Thanks my phone is supported!
Screenshot_2022-09-19-11-25-59-15

Comment on lines +1214 to +1219
// This is similar, but for maxz when it's below 65535.0. -1/0 don't matter here.
WRITE(p, " if (u_depthRange.x + u_depthRange.y <= 65534.0) {\n");
WRITE(p, " %sgl_ClipDistance%s = 65535.0 - integerZ;\n", compat.vsOutPrefix, clip1);
WRITE(p, " } else {\n");
WRITE(p, " %sgl_ClipDistance%s = 0.0;\n", compat.vsOutPrefix, clip1);
WRITE(p, " }\n");
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MaxZ is quite often at 65535. wonder if it would be beneficial to add a shader bit and omit this check in that case. (Not for this PR, of course).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying to avoid shader variants, obviously could have added shader bits for one or both of 0 and 65535, which are both common.

-[Unknown]

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, just philosophizing if it's worth it or not.

Shader variants are only "bad" if a game actually uses both variants, each with a lot of different shaders of the other type (fragment vs vs) - if there's only one extra shader being created (maybe only used in a post processing pass or something), it doesn't realy hurt at all.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I definitely have seen games swap between them, but yes it's usually not changed frequently.

-[Unknown]

@hrydgard
Copy link
Owner

Good time to merge this now between two releases, always scary to increase the use of GPU features :)

@hrydgard hrydgard merged commit 4b165f9 into hrydgard:master Sep 19, 2022
@unknownbrackets
Copy link
Collaborator Author

What do you think about #16049 (comment)?

-[Unknown]

@unknownbrackets unknownbrackets deleted the depth-clamp branch September 19, 2022 13:55
@hrydgard
Copy link
Owner

@unknownbrackets I think we probably want it, but likely behind a compat flag, or similar, so we only use it when really needed..

@stuken
Copy link
Contributor

stuken commented Sep 19, 2022

This has caused a regression with toca 2. Previously there was an area of very subtle warping of the ground texture. Now its appearing as a black area. Happening on both opengl and vulkan on windows.
ULES00040_00000

@unknownbrackets
Copy link
Collaborator Author

unknownbrackets commented Sep 19, 2022

@unknownbrackets I think we probably want it, but likely behind a compat flag, or similar, so we only use it when really needed..

Well, I don't really have the time or patience to go through a few hundred games start to finish to end up guessing that wrong anyway, so I guess we should skip it.

This has caused a regression with toca 2. Previously there was an area of very subtle warping of the ground texture. Now its appearing as a black area. Happening on both opengl and vulkan on windows.

I did notice there was a half pixel offset issue in the depth range (probably from copying code D3D9 -> D3D11 -> Vulkan), and tried to account for it to avoid necessarily changing that part, maybe that's why. Can you include a GE frame dump?

-[Unknown]

@hrydgard
Copy link
Owner

@unknownbrackets Understandable obviously, but it might be interesting to actually do some benchmarking and see the magnitude of the hit on a few devices. I might take that on at some point. Do keep the branch around.

@stuken
Copy link
Contributor

stuken commented Sep 19, 2022

ULES00040_0001.zip

@unknownbrackets
Copy link
Collaborator Author

Hm, seems to happen at draw 40/591 of that. minz=0, maxz=0xfff0, so it should clip against maxz.

WRITE(p, " if (u_depthRange.x + u_depthRange.y <= 65534.0) {\n"); is what's triggering, and that's the right one to trigger and it should in this case because maxz != 0xFFFF. Adjusting it to allow clamp to 0xfff0 makes it draw, but that's not correct.

The depthrange is scale=65520.000000, center=0.001953. Although center should probably be zero, this doesn't seem to be the issue. Viewport is (32760 * z) - 32760, which means the clipping is simply 0 -> 1 (vpZOffset is 0, vpDepthScale is -1.)

I tried WRITE(p, " %sgl_ClipDistance%s = 1.0 - projPos.z;\n", compat.vsOutPrefix, clip1); which ought to be equivalent in this case (and avoid any precision issues), as well as highp - neither makes any difference.

Depth clamp disabled is also fine, which should also be clipping against 1.0.

Even 85535.0 - integerZ incorrectly clips some. But the actual depth is well below that and ought to be fine. I wonder what's going on here... I guess this must mean vertex depth rounding is wrong in this case as well? But even turning that on doesn't affect it, and doesn't write awful depth values either.

-[Unknown]

@unknownbrackets
Copy link
Collaborator Author

In RenderDoc, currently pass 5, event 286. One of the triangles missing includes vertices 267-269, and the other is 144-146. Maybe also 147-149.

For 144,145,146 the clip distances are: -74524, +59085, -12484. The z/w on each output coord is: 2.137671732, 0.098436966, 1.190772977. The middle coord is the topmost coord, so it's bottom right, top middle, bottom left. This seems about right to me - clipping should occur on the portion of the triangle close to the top middle point.

Also note that +59085 is approximately 65535 * (1.0 - 0.098436966).

So now I'm thinking: maybe clip distance is affected by w? Since these points have different ws, specifically 0.38432, 8.34595, 0.68993. Hm.

-[Unknown]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants