Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add optional driver workaround to RenderingDevice for Adreno 6XX. #91514

Merged
merged 1 commit into from May 14, 2024

Conversation

DarioSamo
Copy link
Contributor

@DarioSamo DarioSamo commented May 3, 2024

Fixes #90459. This was a perceived regression by the end user as #84976 was the one that introduced the crash. However, the reality turned out to be the graph exposed a driver bug we were successfully dodging in Mobile. This very well could explain why Forward+ was prone to crashing on this family of devices.

Quoting from the PR itself for the in-depth explanation.

Workaround for the Adreno 6XX family of devices.

There's a known issue with the Vulkan driver in this family of devices where it'll crash if a dynamic state for drawing is used in a command buffer before a dispatch call is issued. As both dynamic scissor and viewport are basic requirements for the engine to not bake this state into the PSO, the only known way to fix this issue is to reset the command buffer entirely.

As the render graph has no built in limitations of whether it'll issue compute work before anything needs to draw on the frame, and there's no guarantee that compute work will never be dependent on rasterization in the future, this workaround will end recording on the current command buffer any time a compute list is encountered after a draw list was executed. A new command buffer will be created afterwards and the appropriate synchronization primitives will be inserted.

Executing this workaround has the added cost of synchronization between all the command buffers that are created as well as all the individual submissions. This performance hit is accepted for the sake of being able to support these devices without limiting the design of the renderer.

TODO:

  • Implement the logic for enabling this workaround here only for the affected devices. Tagging @clayjohn as he has a device that can reproduce the issue.

@DarioSamo
Copy link
Contributor Author

Clay has added the device detection and confirmed this to be working.

@TCROC
Copy link
Contributor

TCROC commented May 3, 2024

I just tested and can confirm this PR fixes it! Great work everyone! Shoutout to @clayjohn @DarioSamo and @TokageItLab !! :)

@darksylinc
Copy link
Contributor

The device detection code has the same problem as checking for "Windows 9". It will falsely flag Adreno 6000 if it ever comes out.

The proper way to check is via vendor id + device id + name string. Like this:

if( physical_device_properties.vendorID == 0x5143 && // Qualcomm
     physical_device_properties.deviceID >= 0x6000000 && // Adreno 6xx
     physical_device_properties.deviceID < 0x7000000 &&
     strstr( physical_device_properties.deviceName, "Turnip" ) == nullptr
)
{
    // Enable workaround.
}

Of course there's still always the chance Qualcomm one day starts using unused deviceID entries in the range [0x6000000; 0x7000000) but it should narrow down the chance of false positives.

We still use string evaluation to rule out Turnip (i.e. Mesa's FOSS driver).

@DarioSamo
Copy link
Contributor Author

@darksylinc I agree with your comment, but I'll leave @clayjohn to make the decision on that as he coded the current detection. I have no problems with upgrading it to do that.

@clayjohn
Copy link
Member

clayjohn commented May 6, 2024

That sounds great to me!

I just copied what we do for OpenGL which comes with a TODO:

//Adreno 3xx Compatibility
const String rendering_device_name = String::utf8((const char *)glGetString(GL_RENDERER));
//TODO: Check the number between 300 and 399(?)
adreno_3xx_compatibility = (rendering_device_name.left(13) == "Adreno (TM) 3");

Edit: On second thought, this will not be trivial. It seems to rely on Vulkan-specific properties that we don't expose through the RD right now. So implementing @darksylinc's solution will require exposing a lot more information than we currently have access to.

@DarioSamo
Copy link
Contributor Author

DarioSamo commented May 7, 2024

Edit: On second thought, this will not be trivial. It seems to rely on Vulkan-specific properties that we don't expose through the RD right now. So implementing @darksylinc's solution will require exposing a lot more information than we currently have access to.

Vendor ID is already present (VENDOR_QUALCOMM = 0x5143). We can extend the current Device struct in RenderingContextDriver to provide the device ID. D3D12 has this information and can return it easily. I'd ping @stuartcarnie about Metal, but it might or might not need this as the set of hardware it runs on is more limited.

@darksylinc
Copy link
Contributor

This is a vulkan-specific workaround though. Shouldn't the flag be set at the Vk implementation level?

@DarioSamo
Copy link
Contributor Author

DarioSamo commented May 7, 2024

This is a vulkan-specific workaround though. Shouldn't the flag be set at the Vk implementation level?

It's a bit of a mixed bag as the workaround must be implemented at a level above the driver due to how the ARG works, so detecting the workaround could be one of two possibilities:

  • We add something to the RHI's interface to check if the workaround is required and only the Vulkan driver implements this check or
  • We just check if it's using the Vulkan RDD along with the device name/id detection.

@darksylinc
Copy link
Contributor

I think these workarounda should be centralized in a global section so they can be documented and analyzed if they're still needed.

e.g.

struct Workarounds
{
   // Explanation about the workaround.
   bool adreno_compute_clip = false;
};

Workarounds workarounds;

Thinking about proper design is pointless IMHO because by its very nature workarounds can be anything, anywhere and cross multiple isolation boundaries.

Thus a global bool is the best choice, all in one place.

@DarioSamo
Copy link
Contributor Author

DarioSamo commented May 9, 2024

I've pushed a new version of the workaround detection based on driver version instead. For this I moved the logic specifically to be a part of the Vulkan driver. The workarounds structure is now part of RenderingContextDriver's Device structure, which makes it pretty straightforward to only enable it where it's relevant and with as much information as possible.

The only additional question here is if we want to specifiy a driver version range or just enable this on all devices previous to this version. Right now, this PR is implementing the latter.

Tagging @clayjohn and @TCROC to test again if possible to confirm the detection works.

@TCROC
Copy link
Contributor

TCROC commented May 11, 2024

I've pushed a new version of the workaround detection based on driver version instead. For this I moved the logic specifically to be a part of the Vulkan driver. The workarounds structure is now part of RenderingContextDriver's Device structure, which makes it pretty straightforward to only enable it where it's relevant and with as much information as possible.

The only additional question here is if we want to specifiy a driver version range or just enable this on all devices previous to this version. Right now, this PR is implementing the latter.

Tagging @clayjohn and @TCROC to test again if possible to confirm the detection works.

I tested and it worked! :)

Co-authored-by: Clay John <claynjohn@gmail.com>
Copy link
Member

@clayjohn clayjohn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Let's go ahead and merge.

For maintainers, this shouldn't be cherrypicked

@akien-mga akien-mga merged commit db3c4a4 into godotengine:master May 14, 2024
16 checks passed
@akien-mga
Copy link
Member

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants