586 Commits

Author SHA1 Message Date
squidbus
5b3e156197
externals: Update MoltenVK (#2492) 2025-02-21 12:41:36 +02:00
georgemoralis
cc583b6189 hot-fix: rasterizer 2025-02-19 13:55:18 +02:00
TheTurtle
da0ab005c7
video_core: Fix some cases of "Attempted to track non-GPU memory" (#2447)
* memory: Consider flexible mappings as gpu accessible

Multiple guest apps do this with perfectly valid sharps in simple shaders. This needs some hardware testing to see how it is handled, but for now it doesn't hurt to handle it.

* memory: Clamp large buffers to mapped area

Sometimes huge buffers can be bound that start on some valid mapping but aren't fully contained by it. It is not reasonable to expect the game to need all of the memory, so clamp the size to avoid the GPU tracking assert.

* clang-format fix

---------

Co-authored-by: georgemoralis <giorgosmrls@gmail.com>
2025-02-19 13:31:35 +02:00
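
The clamping described in the commit above can be illustrated with a minimal sketch. The `MappedRegion` type and the `ClampToMapping` function are hypothetical names introduced here for illustration, not the project's actual code, and the sketch assumes the buffer's start address already lies inside the mapping.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical description of one contiguous mapped region
// (illustrative only; not the emulator's real data structure).
struct MappedRegion {
    std::uint64_t base; // first mapped address
    std::uint64_t size; // length of the mapping in bytes
};

// Returns the number of bytes of [addr, addr + size) that actually fall
// inside the region. Assumes addr lies within the region.
inline std::uint64_t ClampToMapping(const MappedRegion& region, std::uint64_t addr,
                                    std::uint64_t size) {
    assert(addr >= region.base && addr < region.base + region.size);
    const std::uint64_t available = region.base + region.size - addr;
    return std::min(size, available);
}
```

With a 0x1000-byte mapping at 0x1000, a 0x10000-byte buffer bound at 0x1800 would be clamped to the 0x800 bytes that remain in the mapping, while a buffer that already fits is left unchanged.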
squidbus
fd3d3c4158
shader_recompiler: Implement AMD buffer bounds checking behavior. (#2448)
* shader_recompiler: Implement AMD buffer bounds checking behavior.

* shader_recompiler: Use SRT flatbuf for bounds check size.

* shader_recompiler: Fix buffer atomic bounds check.

* buffer_cache: Prevent false image-to-buffer sync.

Lowering vertex fetch to formatted buffer surfaced an issue where a CPU modified range may be overwritten with stale GPU modified image data.

* Address review comments.
2025-02-17 16:13:39 +02:00
squidbus
e13fb2e366
renderer_vulkan: Bind descriptors to specific stages in layout. (#2458) 2025-02-16 15:08:16 +02:00
TheTurtle
82cacec8eb
shader_recompiler: Remove special case buffers and add support for aliasing (#2428)
* shader_recompiler: Move shared mem lowering into emitter

* IR can be quite verbose during the first stages of translation, before the SSA and constant propagation passes that drastically simplify it have run. This lowering can also be done during emission, so do it then to save some compilation time

* runtime_info: Pack PsColorBuffer into 8 bytes

* Drops the size of the total structure by roughly half, from 396 to 204 bytes. Also should make comparison of the array a bit faster, since it's a hot path done every draw

* emit_spirv_context: Add infrastructure for buffer aliases

* Splits out the buffer creation function so it can be reused when defining multiple type aliases

* shader_recompiler: Merge srt_flatbuf into buffers list

* It's no longer a special case, yay

* shader_recompiler: Complete buffer aliasing support

* Add a bunch more types into buffers, such as F32 for float reads/writes and 8/16 bit integer types for formatted buffers

* shader_recompiler: Remove existing shared memory emulation

* The current impl relies on backend-side implementation and hooking into every shared memory access. It also doesn't handle atomics. Will be replaced by an IR pass that solves these issues

* shader_recompiler: Reintroduce shared memory on ssbo emulation

* Now it is performed with an IR pass and, combined with the previous commit's cleanup, is fully transparent to the backend, other than requiring workgroup_index to be provided as an attribute (computing this on every shared memory access would be too verbose)

* clang format

* buffer_cache: Reduce buffer sizes

* vk_rasterizer: Cleanup resource binding code

* Reduce noise in the functions, also remove some arguments which are class members

* Fix gcc
2025-02-15 14:06:56 +02:00
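
As a rough illustration of the "pack into 8 bytes" idea from the commit above, the sketch below folds several small fields into one 64-bit word so that comparing two entries becomes a single integer comparison. The field names and bit widths are assumptions chosen for the example; the actual `PsColorBuffer` layout is not shown in this log.

```cpp
#include <cstdint>

// Hypothetical per-render-target state. Field names and widths are
// illustrative assumptions, not the project's actual layout.
struct ColorBufferState {
    std::uint8_t data_format;
    std::uint8_t num_format;
    std::uint8_t num_conversion;
    std::uint8_t export_format;
    std::uint32_t swizzle; // four 8-bit channel selectors
};

// Pack every field into a single 64-bit word (8 bytes).
constexpr std::uint64_t PackColorBuffer(const ColorBufferState& s) {
    return std::uint64_t{s.data_format} |
           (std::uint64_t{s.num_format} << 8) |
           (std::uint64_t{s.num_conversion} << 16) |
           (std::uint64_t{s.export_format} << 24) |
           (std::uint64_t{s.swizzle} << 32);
}

// Recover one field from the packed word.
constexpr std::uint8_t UnpackNumFormat(std::uint64_t packed) {
    return static_cast<std::uint8_t>((packed >> 8) & 0xFF);
}
```

Explicit shifts are used here instead of bit-fields because bit-field layout is implementation-defined, whereas the shifted form has the same bit positions on every compiler.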
squidbus
6e12642151
shader_recompiler: Lower non-compute shared memory into spare VGPRs. (#2403) 2025-02-12 20:10:13 -08:00
squidbus
c9d425dc08 fix: Correct number of allocated VGPRs. 2025-02-12 17:53:52 -08:00
squidbus
2188895b40
buffer_cache: Give null buffer full usage flags. (#2400) 2025-02-11 00:19:38 -08:00
squidbus
843cd01308 fix: Disable VK_EXT_tooling_info on AMD proprietary for now. 2025-02-09 16:34:20 -08:00
psucien
04fe3a79b9
fix: lower UBO max size to account buffer cache offset (#2388)
* fix: lower UBO max size to account buffer cache offset

* review comments

* remove UBO size from spec and always set it to max on shader side
2025-02-09 22:03:20 +01:00
squidbus
15b520f4a2
renderer_vulkan: Skip tessellation isolines if not supported. (#2384) 2025-02-09 10:20:13 -08:00
psucien
5d4812d1a6 hot-fix: fix for unintended gamma correction bypass when HDR is disabled 2025-02-09 18:22:07 +01:00
psucien
8f2883a388
video_out: HDR support (#2381)
* Initial HDR support

* fix for crashes when debug tools used
2025-02-09 15:54:54 +01:00
squidbus
cfe249debe
shader_recompiler: Replace texel buffers with in-shader buffer format interpretation (#2363)
* shader_recompiler: Replace texel buffers with in-shader buffer format interpretation

* shader_recompiler: Move 10/11-bit float conversion to functions and address some comments.

* vulkan: Remove VK_KHR_maintenance5 as it is no longer needed for buffer views.

* shader_recompiler: Add helpers for composites and bitfields in pack/unpack.

* shader_recompiler: Use initializer_list for bitfield insert helper.
2025-02-06 20:40:49 -08:00
DanielSvoboda
46cbee1585
RemapSwizzle formatting (#2368)
This doesn't change anything, it just reduces duplicate information.
2025-02-06 18:18:02 -08:00
squidbus
78ea536c95 hotfix: 4444 swizzle order 2025-02-06 17:55:46 -08:00
squidbus
1a00b1af24
vulkan: Use more supported 4444 format. (#2366) 2025-02-06 17:45:47 -08:00
squidbus
1eb0affdea
vk_instance: Clean up extension management. (#2342) 2025-02-06 16:38:02 -08:00
Stephen Miller
e972a8805d
Bump size of buffer_views (#2357)
Uncharted 4 (and perhaps some other games) fill this up, causing an `Unhandled exception: boost::container::bad_alloc thrown` exception
2025-02-05 19:54:13 +02:00
squidbus
b879dd59c6
shader_recompiler: Add workaround for drivers with unexpected unorm rounding behavior. (#2310) 2025-02-04 01:01:59 -08:00
Vladislav Mikhalin
1d8c607c15 hotfix: stronger conditions for the vtx offset error message 2025-02-01 20:44:10 +03:00
squidbus
84c27eea2a
texture_cache: Make sure left-overlapped mips get marked for rebind. (#2268) 2025-02-01 01:54:40 -08:00
poly
eed4de1da9
renderer_vulkan: use LDS buffer as SSBO on unsupported shared memory size (#2245)
* renderer_vulkan: use LDS buffer as SSBO on unsupported shared memory size

* shader_recompiler: add `v_trunc_f64` on inst format table
2025-01-31 13:52:31 +02:00
tomboylover93
e805b97520
Add Vulkan debug options to the Debug tab (#2254)
Co-authored-by: DanielSvoboda <daniel.svoboda@hotmail.com>
2025-01-30 18:34:31 +01:00
squidbus
d2127b38de
vk_rasterizer: Keep viewport depth offset even without native depth clip control. (#2257) 2025-01-28 11:12:48 +03:00
Vladislav Mikhalin
191e64bfa1
renderer: respect zmin/zmax even if clipping is disabled (#2250) 2025-01-27 00:17:23 -08:00
squidbus
a5a1253185
liverpool: Implement PM4 MEM_SEMAPHORE. (#2235) 2025-01-25 04:12:18 -08:00
squidbus
a51c8c17e0
shader_recompiler: Fix image write swizzles. (#2236) 2025-01-24 12:47:04 -08:00
squidbus
56f4b8a2b8
shader_recompiler: Implement shader export formats. (#2226) 2025-01-24 10:41:58 -08:00
squidbus
d1b9a5adcc
texture_cache: Do not overwrite overlap hit with a miss. (#2217) 2025-01-24 10:23:18 +02:00
squidbus
74710116f6
renderer_vulkan: Remove dead code. (#2228) 2025-01-24 10:21:56 +02:00
squidbus
91444a0545
liverpool: Fix tiled check for color buffer. (#2227) 2025-01-24 10:21:32 +02:00
panzone91
d7c2cb17f3
update extension vector capacity (#2210) 2025-01-22 23:53:54 +02:00
squidbus
2a4798cfa6
tile: Fix some tile thickness calculation errors. (#2203)
* tile: Fix some tile thickness calculation errors.

* tile: Do not pad mip height to tile height.
2025-01-22 09:40:00 +01:00
squidbus
95a30b2b3e
texture_cache: Lock when updating image. (#2198) 2025-01-20 22:38:09 +01:00
squidbus
a3967ccdb4
externals: Update vulkan-headers (#2197) 2025-01-20 14:48:32 +02:00
squidbus
e1132db197
texture_cache: Prevent unregistered images from being tracked. (#2196) 2025-01-20 08:33:37 +01:00
squidbus
d14e57f6a8 hotfix: Move some command buffer references down.
Prevents references becoming stale due to stream buffer flushes.
2025-01-19 18:45:37 -08:00
DanielSvoboda
80092b6367
Fix SurfaceFormat Format4_4_4_4 (#2193)
* Fix SurfaceFormat Format4_4_4_4

Pac-Man 256

* add_extension
2025-01-19 15:09:10 -08:00
DanielSvoboda
201f2817ca
Fix SurfaceFormat Format1_5_5_5 - Format5_5_5_1 (#2191)
* Fix SurfaceFormat Format1_5_5_5 - again

* Fix Format5_5_5_1
2025-01-19 13:55:27 -08:00
DanielSvoboda
17ac63d23a
Fix SurfaceFormat (#2188) 2025-01-19 17:47:40 +02:00
Quang Ngô
ec0dfb32b5
Some ImGui tweaks for the game window (#2183)
* Remove window border
* Remove window rounding
* Set background color to black
2025-01-19 09:03:15 -03:00
squidbus
746f2e091d
tile: Account for thickness in micro tiled size calculation. (#2185) 2025-01-19 12:06:31 +01:00
Vladislav Mikhalin
269ce12614 fix build on arch 2025-01-18 16:54:06 +03:00
squidbus
c80151adde
vk_presenter: Fix splash issues. (#2180) 2025-01-18 02:29:19 -08:00
squidbus
d361579618
texture_cache: Fix image mip overlap. (#2177) 2025-01-18 10:35:44 +01:00
squidbus
12364b197a
renderer_vulkan: Remove swapchain image reinterpretation. (#2176) 2025-01-18 01:13:16 -08:00
Quang Ngô
81ad575b22
video_core: Use adaptive mutex on Linux (#2105)
Fix performance regression with #1973 on SteamDeck
2025-01-17 23:47:38 -08:00
Quang Ngô
9a956f5ed0
renderer_vulkan: Clear blank frame (#2095)
* renderer_vulkan: Clear blank frame

Fix display of garbage images on startup on some drivers.

* Remove duplicated attachment declarations

* Remove duplicated rendering_info declarations
2025-01-17 23:08:45 -08:00
Vladislav Mikhalin
7b8177f48e
renderer: handle disabled clipping (#2146)
Co-authored-by: IndecisiveTurtle <47210458+raphaelthegreat@users.noreply.github.com>
2025-01-18 09:20:38 +03:00
polybiusproxy
99a04357d1
don't compile cs with higher shared memory than supported (#2175) 2025-01-17 21:51:33 +01:00
squidbus
9e5b50c866
vk_platform: Clean up unnecessary debug message filters. (#2171) 2025-01-17 10:16:15 +02:00
squidbus
1d3427780a
renderer_vulkan: Fix present related validation errors. (#2169) 2025-01-17 10:16:03 +02:00
squidbus
1e5b316ac4
renderer_vulkan: Add debug markers for presenter. (#2167) 2025-01-17 10:15:43 +02:00
squidbus
3b474a12f9
shader_recompiler: Improvements to buffer addressing implementation. (#2123) 2025-01-16 18:40:03 -08:00
squidbus
eb49193309
liverpool: Revert queue scope markers. (#2166) 2025-01-16 18:24:29 -08:00
Vinicius Rangel
56a6c95730
Render without rendering (#2152)
* presenter: render the game inside a ImGui window

* presenter: render the previous frame to keep the render rendering

* swapchain: fix swapchain image view format not being converted to unorm

* devtools: fix frame graph timing
2025-01-16 21:27:23 +02:00
squidbus
b3739bea92
renderer_vulkan: Simplify debug marker settings. (#2159)
* renderer_vulkan: Simplify debug marker settings.

* liverpool: Add scope markers for graphics/compute queues.

* liverpool: Remove unneeded extra label from command buffer markers.

* vk_rasterizer: Add scopes around filtered draw passes.
2025-01-16 12:14:34 +02:00
squidbus
53d0a309cc
liverpool_to_vk: Add R32Uint depth promote. (#2145) 2025-01-15 18:33:15 +03:00
squidbus
5040be1640
renderer_vulkan: Handle depth-stencil copies through depth render overrides. (#2134) 2025-01-15 08:48:40 +03:00
psucien
394331f206
video_core: detiler: display micro 64bpp (#2137) 2025-01-12 19:25:25 +01:00
squidbus
5c845d4ecc hotfix: Constrain view layers to actual layers. 2025-01-10 16:30:28 -08:00
squidbus
6ec68f66a9 hotfix: Check correct template for setting binding divisor. 2025-01-10 15:59:20 -08:00
squidbus
e656093d85
shader_recompiler: Fix some image view type issues. (#2118) 2025-01-10 12:35:03 -08:00
squidbus
562ed2a025
renderer_vulkan: Simplify vertex binding logic and properly handle null buffers. (#2104)
* renderer_vulkan: Simplify vertex binding logic and properly handle null buffers.

* renderer_vulkan: Remove need for empty bindVertexBuffers2EXT.
2025-01-10 10:52:12 +02:00
squidbus
4563b6379d
amdgpu: Handle 8-bit float format case for stencil. (#2092) 2025-01-10 10:49:08 +02:00
squidbus
725814ce01
shader_recompiler: Improvements to array and cube handling. (#2083)
* shader_recompiler: Account for instruction array flag in image type.

* shader_recompiler: Check da flag for all mimg instructions.

* shader_recompiler: Convert cube images into 2D arrays.

* shader_recompiler: Move image resource functions into sharp type.

* shader_recompiler: Use native AMD cube instructions when possible.

* specialization: Fix buffer storage mistake.
2025-01-10 10:48:12 +02:00
squidbus
b0d7feb292
video_core: Implement conversion for uncommon/unsupported number formats. (#2047)
* video_core: Implement conversion for uncommon/unsupported number formats.

* shader_recompiler: Reinterpret image sample output as well.

* liverpool_to_vk: Remove mappings for remapped number formats.

These were poorly supported by drivers anyway.

* resource_tracking_pass: Fix image write swizzle mistake.

* amdgpu: Add missing specialization and move format mapping data to types

* reinterpret: Fix U/SToF input type.
2025-01-07 12:21:49 +02:00
squidbus
c08fc85b72
renderer_vulkan: Fix null buffer views with wrong format. (#2079) 2025-01-07 07:00:07 +02:00
psucien
5559f35905 hot-fix: buffers resolve barriers fixed 2025-01-06 22:50:09 +01:00
squidbus
fb67d948b6
vk_resource_pool: Handle eErrorFragmentedPool. (#2071) 2025-01-06 15:31:45 +02:00
squidbus
7cdeb51670
renderer_vulkan: Add debug names to pipelines. (#2069) 2025-01-06 15:31:25 +02:00
squidbus
c0f57df4e6
vk_instance: Enable additional debug tagging if crash diagnostics is enabled. (#2066) 2025-01-06 00:45:54 +02:00
Mahmoud Adel
79663789bd
bump up vector size to 64 in image_info and image_binding (#2055)
solves `boost::bad_alloc` error when compiling shaders
2025-01-05 00:02:37 +02:00
psucien
9d3143231c macOS build fixed; indirect_args_addr moved out from queues context 2025-01-04 22:44:46 +01:00
psucien
7459d9c333 hot-fix: amdgpu: use different indirect dispatch packet on ASC 2025-01-04 22:23:12 +01:00
squidbus
78a32a3c0f
image_info: Add Neo mode macro tile extents. (#2045) 2025-01-04 11:44:14 +01:00
squidbus
7153bc8d8f
kernel: Check PSF for neo mode support. (#2028) 2025-01-04 00:29:09 +01:00
psucien
8e8671323a
texture_cache: slight detilers refactoring (#2036) 2025-01-03 21:42:23 +01:00
squidbus
c2be12f009
amdgpu: Add some resource bits for Neo mode. (#2035) 2025-01-03 21:25:20 +01:00
squidbus
9434cae458
gnmdriver: Implement neo mode differences. (#2011)
* gnmdriver: Implement neo mode differences.

* gnmdriver: Move init sequences to separate file.
2025-01-03 21:22:27 +01:00
¥IGA
2951788afc
texture_cache: Adding some missing textures (#2031) 2025-01-03 20:11:24 +01:00
psucien
345d55669e texture_cache: 8bpp macro detiler 2025-01-02 23:27:18 +01:00
TheTurtle
77d2172441
renderer_vulkan: Cleanup and improve barriers in caches (#1865)
* texture_cache: Stricter barriers on image upload

* buffer_cache: Stricter barrier for vkCmdUpdateBuffer

* vk_rasterizer: Barrier also normal buffers and make it apply to all stages

* texture_cache: Minor barrier cleanup

* Batch image and buffer barriers in a single command

* clang format
2025-01-02 19:43:56 +01:00
psucien
f7a8e2409c hot-fix: debug build 2025-01-02 19:41:15 +01:00
liberodark
596f4cdf0e
Fix amdgpu & other issues (#2000) 2025-01-02 15:39:39 +02:00
TheTurtle
c25447097e
buffer_cache: Improve buffer cache locking contention (#1973)
* Improve buffer cache locking contention

* buffer_cache: Revert some changes

* clang fmt 1

* clang fmt 2

* clang fmt 3

* buffer_cache: Fix build
2025-01-02 15:39:02 +02:00
hspir404
6862c9aad7
Speed up LiverpoolToVK::SurfaceFormat (#1982)
* Speed up LiverpoolToVK::SurfaceFormat

In Bloodborne this shows up as the function with the very highest cumulative "exclusive time". This is true both in scenes that perform poorly, and scenes that perform well.

I took (approximately) 10 s samples using an 8 kHz sampling profiler.

In the Nightmare Grand Cathedral (looking towards the stairs, at the rest of the level):
- Reduced total time from 757.34ms to 82.61ms (out of ~10000ms).
- Reduced average frame times by 2ms (though according to the graph, the gap may be as big as 9ms every N frames).

In the Hunter's Dream (in the spawn position):
- Reduced the total time from 486.50ms to 53.83ms (out of ~10000ms).
- Average frame times appear to be roughly the same.

These are profiles of the change vs the version currently in the main branch. These improvements also improve things in the `threading` branch. They might improve them even more in that branch, but I didn't bother keeping track of my measurements as well in that branch. I believe this change will still be useful even when that branch is stabilized and merged.

It could be there are other bottlenecks in rendering on this branch that are preventing this code from being the critical path in places like the Hunter's Dream, where performance isn't currently as constrained. That might explain why the reduction in call times isn't resulting in a higher frame rate.

* Implement SurfaceFormat with derived lookup table instead of switch

* Clang format fixes
2025-01-02 15:38:51 +02:00
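
The "derived lookup table instead of switch" approach from the commit above can be sketched as follows. This is a simplified illustration under stated assumptions: the enums, their values, and the function names are hypothetical, and the real function likely keys on more than one format field. The idea is that the existing switch stays as the single source of truth, and a table is generated from it once at compile time so the hot path is a plain array index.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical guest and host format enums (illustrative only).
enum class DataFormat : std::uint8_t { Invalid, Format8, Format16, Format32, Count };
enum class HostFormat : std::uint8_t { Undefined, R8, R16, R32 };

// Reference mapping written as a switch; used only to derive the table.
constexpr HostFormat SurfaceFormatSwitch(DataFormat format) {
    switch (format) {
    case DataFormat::Format8:
        return HostFormat::R8;
    case DataFormat::Format16:
        return HostFormat::R16;
    case DataFormat::Format32:
        return HostFormat::R32;
    default:
        return HostFormat::Undefined;
    }
}

constexpr std::size_t kNumFormats = static_cast<std::size_t>(DataFormat::Count);

// Evaluate the switch for every enum value at compile time (requires C++17).
constexpr std::array<HostFormat, kNumFormats> BuildSurfaceFormatTable() {
    std::array<HostFormat, kNumFormats> table{};
    for (std::size_t i = 0; i < kNumFormats; ++i) {
        table[i] = SurfaceFormatSwitch(static_cast<DataFormat>(i));
    }
    return table;
}

inline constexpr auto kSurfaceFormatTable = BuildSurfaceFormatTable();

// Hot-path lookup: a bounds check in debug builds plus one array read.
inline HostFormat SurfaceFormat(DataFormat format) {
    const auto index = static_cast<std::size_t>(format);
    assert(index < kNumFormats);
    return kSurfaceFormatTable[index];
}
```

Deriving the table from the switch, rather than writing it by hand, keeps the two from drifting apart when a format is added.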
Mahmoud Adel
099e685bff
add R16Uint to Format Detiler (#1995)
helps with Matterfall
2025-01-02 14:29:57 +02:00
polybiusproxy
a76e8f0211
clang-format 2025-01-01 13:21:00 +01:00
psucien
d69341fd31 hot-fix: detiler: forgotten lut optimizations 2025-01-01 03:40:28 +01:00
squidbus
927dc6d95c
vk_platform: Fix incorrect type for MVK debug flag. (#1993) 2024-12-31 12:38:30 +02:00
squidbus
41d64a200d
shader_recompiler: Add swizzle support for unsupported formats. (#1869)
* shader_recompiler: Add swizzle support for unsupported formats.

* renderer_vulkan: Rework MRT swizzles and add unsupported format swizzle support.

* shader_recompiler: Clean up swizzle handling and handle ImageRead storage swizzle.

* shader_recompiler: Fix type errors

* liverpool_to_vk: Remove redundant clear color swizzles.

* shader_recompiler: Reduce CompositeConstruct to constants where possible.

* shader_recompiler: Fix ImageRead/Write and StoreBufferFormatF32 types.

* amdgpu: Add a few more unsupported format remaps.
2024-12-31 06:14:47 +02:00
squidbus
38f1cc2652
renderer_vulkan: Render polygons using triangle fans. (#1969) 2024-12-29 12:30:37 +01:00
Quang Ngô
1bc27135e3
renderer_vulkan: fix deadlock when resizing the SDL window (#1860)
* renderer_vulkan: Fix deadlock when resizing the SDL window

* Address review comment
2024-12-29 13:22:35 +02:00
TheTurtle
f09a95453e
hot-fix: Correct queue id in dispatch indirect
I missed this
2024-12-29 12:48:45 +02:00
Mahmoud Adel
e952013fe0
add EventWrite and DispatchIndirect to ProcessCompute (#1948)
* add EventWrite and DispatchIndirect to ProcessCompute

helps Alienation go in-game

* apply review changes

Co-authored-by: TheTurtle <47210458+raphaelthegreat@users.noreply.github.com>

---------

Co-authored-by: TheTurtle <47210458+raphaelthegreat@users.noreply.github.com>
2024-12-29 12:47:15 +02:00
Quang Ngô
202c1046a1
Fix loading RenderDoc in offline mode for Linux (#1968) 2024-12-29 12:36:29 +02:00
Quang Ngô
99e1e028c0
texture_cache: Don't read max aniso value if not aniso filter (#1942)
Fix Sonic Forces.
2024-12-28 13:18:56 +02:00