Commit Graph

151 Commits

Author SHA1 Message Date
TheTurtle
e9ede8d627
Revert "DmaData and Recompiler fixes (#1775)" (#1784)
This reverts commit cafd40f2c2.
2024-12-14 16:17:14 +02:00
Vladislav Mikhalin
cafd40f2c2
DmaData and Recompiler fixes (#1775)
* liverpool: fix dmadata packet handling

* recompiler: emit a label right after s_branch to prevent dead code interference

* specialize barriers
2024-12-14 14:33:06 +02:00
baggins183
3c0c921ef5
Tessellation (#1528)
* shader_recompiler: Tessellation WIP

* fix compiler errors after merge

DONT MERGE set log file to /dev/null

DONT MERGE linux pthread bb fix

save work

DONT MERGE dump ir

save more work

fix mistake with ES shader

skip list

add input patch control points dynamic state

random stuff

* WIP Tessellation partial implementation. Squash commits

* test: make local/tcs use attr arrays

* attr arrays in TCS/TES

* dont define empty attr arrays

* switch to special opcodes for tess tcs/tes reads and tcs writes

* impl tcs/tes read attr insts

* rebase fix

* save some work

* save work probably broken and slow

* put Vertex LogicalStage after TCS and TES to fix bindings

* more refactors

* refactor pattern matching and optimize modulos (disabled)

* enable modulo opt

* copyright

* rebase fixes

* remove some prints

* remove some stuff

* Add TCS/TES support for shader patching and use LogicalStage

* refactor and handle wider DS instructions

* get rid of GetAttributes for special tess constants reads. Immediately replace some upon seeing readconstbuffer. Gets rid of some extra passes over IR

* stop relying on GNMX HsConstants struct. Change runtime_info.hs_info and some regs

* delete some more stuff

* update comments for current implementation

* some cleanup

* uint error

* more cleanup

* remove patch control points dynamic state (because runtime_info already depends on it)

* fix potential problem with determining passthrough

---------

Co-authored-by: IndecisiveTurtle <47210458+raphaelthegreat@users.noreply.github.com>
2024-12-14 12:56:17 +02:00
TheTurtle
722a0e36be
graphics: Improve handling of color buffer and storage image swizzles (#1763)
* liverpool_to_vk: Remove wrong component swap formats

* shader_recompiler: Handle storage and buffer format swizzles

* shader_recompiler: Skip unsupported depth export

* image_view: Remove image format swizzle

* Platform support is not always guaranteed
2024-12-13 21:49:37 +02:00
TheTurtle
cfbd869126
texture_cache: Improve support for stencil reads (#1758)
* texture_cache: Improve support for stencil reads

* libraries: Suppress some spammy logs

* core: Support loading font libraries

* texture_cache: Remove assert
2024-12-13 18:28:19 +02:00
Daniel R.
2a953391ef
liverpool: implement Rewind and IndirectBuffer packets 2024-12-11 19:40:45 +01:00
Daniel R.
a88850fec6
video_core/amdgpu: fix calculation of lod range 2024-12-08 16:02:38 +01:00
Vladislav Mikhalin
8eacb88a86
recompiler: fixed fragment shader built-in attribute access (#1676)
* recompiler: fixed fragment shader built-in attribute access

* handle en/addr separately

* handle other registers as well
2024-12-07 01:20:09 +02:00
squidbus
17abbcd74d
misc: Fix clang format (#1673) 2024-12-06 02:21:35 +02:00
IndecisiveTurtle
77da8bac00 core: Return proper address of eh frame/add more opcodes 2024-12-06 00:47:11 +02:00
TheTurtle
22a2741ea0
shader_recompilers: Improvements to SSA phi generation and lane instruction elimination (#1667)
* shader_recompiler: Add use tracking for Insts

* ssa_rewrite: Recursively remove phis

* ssa_rewrite: Correct recursive trivial phi elimination

* ir: Improve read lane folding pass

* control_flow: Avoid adding unnecessary divergent blocks

* clang format

* externals: Update ext-boost

---------

Co-authored-by: Frodo Baggins <baggins31084@proton.me>
2024-12-05 23:14:16 +02:00
Marcin Mikołajczyk
642dedea8c
Handle INDIRECT_BUFFER_CONST in ProcessCeUpdate (#1613) 2024-12-05 23:09:59 +02:00
Daniel R.
98f0cb65d7
The way to Unity, pt.1 (#1659) 2024-12-05 17:21:35 +01:00
squidbus
920acb8d8b
renderer_vulkan: Parse fetch shader per-pipeline (#1656)
* shader_recompiler: Read image format info directly from sharps instead of storing in shader info.

* renderer_vulkan: Parse fetch shader per-pipeline

* Few minor fixes.

* shader_recompiler: Specialize on vertex attribute number types.

* shader_recompiler: Move GetDrawOffsets to fetch shader
2024-12-04 13:03:47 +02:00
psucien
d6d1ec4f22 hot-fix: apply vgt index offset to draw commands 2024-11-29 14:17:53 +01:00
psucien
16e1d679dc
video_core: clean-up of indirect draws logic (#1589) 2024-11-24 15:43:28 +01:00
TheTurtle
c4506da0ae
kernel: Rewrite pthread emulation (#1440)
* libkernel: Cleanup some function places

* kernel: Refactor thread functions

* kernel: It builds

* kernel: Fix a bunch of bugs, kernel thread heap

* kernel: File cleanup pt1

* File cleanup pt2

* File cleanup pt3

* File cleanup pt4

* kernel: Add missing funcs

* kernel: Add basic exceptions for linux

* gnmdriver: Add workload functions

* kernel: Fix new pthreads code on macOS. (#1441)

* kernel: Downgrade edeadlk to log

* gnmdriver: Add sceGnmSubmitCommandBuffersForWorkload

* exception: Add context register population for macOS. (#1444)

* kernel: Pthread rewrite touchups for Windows

* kernel: Multiplatform thread implementation

* mutex: Remove spamming log

* pthread_spec: Make assert into a log

* pthread_spec: Zero initialize array

* Attempt to fix non-Windows builds

* hotfix: change incorrect NID for scePthreadAttrSetaffinity

* scePthreadAttrSetaffinity implementation

* Attempt to fix Linux

* windows: Address a bunch of address space problems

* address_space: Fix unmap of region surrounded by placeholders

* libs: Reduce logging

* pthread: Implement condvar with waitable atomics and sleepqueue

* sleepq: Separate and make faster

* time: Remove delay execution

* Causes high cpu usage in Touhou Luna Nights

* kernel: Cleanup files again

* pthread: Add missing include

* semaphore: Use binary_semaphore instead of condvar

* Seems more reliable

* libraries/sysmodule: log module on `sceSysmoduleIsLoaded`

* libraries/kernel: implement `scePthreadSetPrio`

---------

Co-authored-by: squidbus <175574877+squidbus@users.noreply.github.com>
Co-authored-by: Daniel R. <47796739+polybiusproxy@users.noreply.github.com>
2024-11-21 22:59:38 +02:00
Daniel R.
e968b1c23f
video_core/amdgpu: heuristic for shader binary info
Some games (e.g. LittleBigPlanet 3) strip the first shader instruction (meant for debugging), which we rely on for obtaining shader information. For this reason, we search from the start of the code until we arrive at the shader binary info.
2024-11-21 19:24:13 +01:00
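The fallback search described in e968b1c23f can be sketched roughly as follows. This is a minimal illustration and not the emulator's code: the marker bytes, the function name `find_binary_info`, the 4-byte scan step, and the search limit are all assumptions made for the example.

```python
from typing import Optional

# Assumed marker for the shader binary info block (illustrative only).
BINARY_INFO_SIGNATURE = b"OrbShdr"


def find_binary_info(code: bytes, search_limit: int = 0x2000) -> Optional[int]:
    """Return the byte offset of the binary info block, or None if absent.

    Scans from the start of the shader code in 4-byte (dword) steps and
    stops at the first position where the signature matches.
    """
    sig = BINARY_INFO_SIGNATURE
    limit = min(len(code), search_limit)
    for offset in range(0, limit - len(sig) + 1, 4):
        if code[offset:offset + len(sig)] == sig:
            return offset
    return None
```

A bounded scan like this avoids depending on the first instruction being present, at the cost of a linear search when it has been stripped.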
Vladislav Mikhalin
c45af9a2ca
Fix border color (#1548) 2024-11-19 18:55:05 +02:00
psucien
8fbd9187f8
libraries: gnmdriver: few more functions implemented (#1544) 2024-11-18 11:23:21 +02:00
Lander Gallastegi
aa4c6c0178
shader_recompiler: patch fmask access instructions (#1439)
* Fix multisample texture fetch

* Patch some fmask reads

* clang-format

* Assert instead of ignore, coordinate fixes

* Patch ImageQueryDimensions
2024-11-05 22:39:57 +01:00
psucien
a8d2684929 hot-fix: proper calculation of image samples num 2024-10-23 23:11:01 +02:00
TheTurtle
87f8fea4de
renderer_vulkan: Commonize and adjust buffer bindings (#1412)
* shader_recompiler: Implement finite cmp class

* shader_recompiler: Implement more opcodes

* renderer_vulkan: Commonize buffer binding

* liverpool: More dma data impl

* fix

* copy_shader: Handle additional instructions from Knack

* translator: Add V_CMPX_GE_I32
2024-10-19 15:30:58 +03:00
Vinicius Rangel
25de4d6b65
Devtools improvements I (#1392)
* devtools: fix showing entire depth instead of bits

* devtools: show button for stage instead of menu bar

- fix batch view dockspace not rendering when window collapsed

* devtools: removed useless "Batch" collapse & don't collapse last batch

* devtools: refactor DrawRow to templating

* devtools: reg popup size adjusted to the content

* devtools: better window names

* devtools: regview layout compacted

* devtools: option to show collapsed frame dump

keep most popups open when selection changes
best popup windows positioning

* devtools: show compute shader regs

* devtools: tips popup
2024-10-16 13:12:46 +03:00
Lander Gallastegi
877cda9b9a
video_core: Rework clear values (#1381)
* Clear color conversion

* Add missing formats

* Add swap handling

* Format bits and offsets

* clang-format

* Make num_components const

* Initialize alpha to 1

* Handle SnormNz as Snorm

* Don't leave accidental nonzero values

* parallel3 for linux-qt

* Move number_utils to common
2024-10-16 12:55:45 +03:00
Vinicius Rangel
cf2e617f08
Devtools - Inspect regs/User data/Shader disassembly (#1358)
* devtools: pm4 - show markers

* SaveDataDialogLib: fix compile with mingw

* devtools: pm4 - show program state

* devtools: pm4 - show program disassembly

* devtools: pm4 - show frame regs

* devtools: pm4 - show color buffer info as popup

add ux improvements for open new windows with shift+click
better window titles

* imgui: skip all textures to avoid hanging with crash diagnostic enabled

not sure why this happens :c

* devtools: pm4 - show reg depth buffer
2024-10-13 15:02:22 +03:00
korenkonder
6e986f8133
video_core: Implement sceGnmInsertPushColorMarker (#989) 2024-10-10 18:03:12 +03:00
TheTurtle
100036aecf
spirv: Flush denormals if possible (#1302) 2024-10-10 17:47:39 +03:00
psucien
927bb0c175
Initial support of Geometry shaders (#1244)
* video_core: initial GS support

* fix for components mapping; missing prim type
2024-10-06 01:26:50 +03:00
Vinicius Rangel
af398e3684
Devtools: PM4 Explorer (#1094)
* Devtools: Pause system

* Devtools: pm4 viewer

- new menu bar
- refactored video_info layer
- dump & inspect pm4 packets
- removed dumpPM4 config
- renamed System to DebugState
- add docking space
- simple video info constrained to window size

* Devtools: pm4 viewer - add combo to select the queue

* Devtools: pm4 viewer - add hex editor

* Devtools: pm4 viewer - dump current cmd

* add monospaced font to devtools

* Devtools: pm4 viewer - use spec op name

avoid some allocations
2024-10-03 22:43:23 +02:00
Vladislav Mikhalin
7d96c9d634
Use correct scissor rects (#1146)
* WIP

* Proper combination of scissors

* convert static functions to lambdas
2024-10-01 21:42:01 +03:00
Daniel R.
80bf46da4c
core/memory: Pooled memory implementation (#1085) 2024-09-29 10:28:41 +03:00
squidbus
11c155d0f1
amdgpu: Fix buffer comparison by naming padding fields for initialization. (#1050) 2024-09-25 14:08:10 +03:00
squidbus
36ef61908d
renderer_vulkan: Refactor surface and depth format mapping. (#1067)
* renderer_vulkan: Refactor surface and depth format mapping.

* image: Convert usage to feature flags for format support checks.
2024-09-25 12:10:44 +03:00
psucien
5f4ddc14fc
Image subresources barriers (#904)
* video_core: texture: image subresources state tracking

* shader_recompiler: use one binding if the same image is read and written

* video_core: added rebinding of changed textures after overlap resolve

* don't use pointers; slight `FindTexture` refactoring

* video_core: buffer_cache: don't copy over the image size

* redundant barriers removed; fixes

* regression fixes

* texture_cache: 3d texture layers count fixup

* shader_recompiler: support for partially bound cubemaps

* added support for cubemap arrays

* don't bind unused color buffers

* fixed depth promotion to not use stencil

* doors

* bonfire lit

* cubemap array index calculation

* final touches
2024-09-21 21:45:56 +02:00
korenkonder
60f315a54d
video_core: stride fix (#986)
I don't know why it was forced to be 1 while in reality it should be as is
2024-09-19 21:43:03 +02:00
TheTurtle
b09b28c7f3
graphics_pipeline: Move some depth configuration to dynamic state (#931)
* graphics_pipeline: More proper masking

* pipeline_cache: Skip setting depth/stencil fields when test is disabled

* pipeline_cache: More fixes to depth stencil state

* vk_rasterizer: Use dynamic state for depth bounds and bias

* pipeline_cache: Missed depth bias enable

* vk_rasterizer: Add stencil dynamic states

* thread: Reduce spammy log

* Remove some leftover state

* pipeline_cache: Allocate pipelines from pools

* vk_graphics_pipeline: Remove bindings member

Saves about 1KB from each pipeline
2024-09-15 22:42:14 +02:00
Raven
1879c9d12f
Add PM4 opcodes 2024-09-15 01:46:39 +08:00
baggins183
bc66fe8fb5
Fix copyGpuBuffers when resize invalidates commands in flight (#876)
* Fix copyGpuBuffers when resize invalidates commands in flight

* Use _MB macro for size constant
2024-09-12 21:54:54 +02:00
Pipi86
1c0dfc60a1
Typo fix (#820)
* Update pm4_cmds.h

* Update pm4_cmds.h
2024-09-11 13:40:19 +03:00
TheTurtle
b0bbb16aae
video_core: Add fallback path for pipelines with more than 32 bindings (#837)
* video_core: Small fixes

* renderer_vulkan: Add fallback path for pipelines with more than 32 bindings

* vk_resource_pool: Rewrite desc heap

* work
2024-09-10 20:54:39 +03:00
psucien
adfb3af95f hot-fix: nullGpu functionality restored 2024-09-09 08:59:47 +02:00
TheTurtle
13743b27fc
shader_recompiler: Implement data share append and consume operations (#814)
* shader_recompiler: Add more format swap modes

* texture_cache: Handle stencil texture reads

* emulator: Support loading font library

* readme: Add thanks section

* shader_recompiler: Constant buffers as integers

* shader_recompiler: Typed buffers as integers

* shader_recompiler: Separate thread bit scalars

* We can assume the guest shader never mixes them with normal sgprs. This helps avoid errors where ssa could view an sgpr write dominating a thread bit read, due to how control flow is structurized, even though it's not possible in actual control flow

* shader_recompiler: Implement data append/consume operations

* clang format

* buffer_cache: Simplify invalidation scheme

* video_core: Remove some invalidation remnants

* adjust
2024-09-07 00:14:51 +03:00
psucien
34ffd95306
video_core: added VK_LAYER_LUNARG_crash_diagnostic (#751) 2024-09-03 21:56:23 +02:00
TheTurtle
f087f43736
shader_recompiler: Implement render target swizzles when no format is available (#739)
* shader_recompiler: Use null image when shader is compiled with unbound sharp

* video_core: Refactor and render target swizzles

* liverpool_to_vk: Add missing swap format from RDR

* video_core: Refactor shader recompiler interface

* Makes it much easier to pass runtime information to the recompiler and have it treated as part of the shader key. Also pulls out most runtime state from Info struct

* shader_recompiler: Avoid some asserts
2024-09-03 14:04:30 +03:00
baggins183
3f8a8d3a24
video_core: Add bounds checking for subspan use in liverpool functions (#717) 2024-09-03 13:58:45 +03:00
psucien
ca1613258f
video_core: added support for indirect draws (#678)
* video_core: added support for indirect draws

* barriers simplified
2024-08-30 22:59:56 +02:00
TheTurtle
66e96dd944
video_core: Account for runtime state changes when compiling shaders (#575)
* video_core: Compile shader permutations

* spirv: Only specific storage image format for atomics

* ir: Avoid cube coord patching for storage image

* spirv: Fix default attributes

* data_share: Add more instructions

* video_core: Query storage flag with runtime state

* kernel: Use std::list for semaphore

* video_core: Use texture buffers for untyped format load/store

* buffer_cache: Limit view usage

* vk_pipeline_cache: Fix invalid iterator

* image_view: Reduce log spam when alpha=1 in storage swizzle

* video_core: More features and proper spirv feature detection

* video_core: Attempt no2 for specialization

* spirv: Remove conflict

* vk_shader_cache: Small cleanup
2024-08-29 19:29:54 +03:00
psucien
9d349a1308 video_core: added support for indirect dispatches (gfx only) 2024-08-29 12:32:37 +02:00
georgemoralis
be49871c68
Merge pull request #618 from vertver/main
video_core: Added copyGPUCmdBuffers option
2024-08-28 14:00:26 +03:00