Commit Graph

342 Commits

Author SHA1 Message Date
Stephen Miller
122fe22a32
Implement IMAGE_GATHER4 and IMAGE_GATHER4_O (#1939)
* Implement IMAGE_GATHER4_O

Used by The Last of Us Remastered.

* Fix type on IMAGE_GATHER4_C_LZ

Had a different set of types compared to the other IMAGE_GATHER4 ops.

* IMAGE_GATHER4
2024-12-28 02:42:41 +02:00
georgemoralis
3218c36b22
knack fixes by niko (#1933) 2024-12-27 23:03:03 +02:00
squidbus
b1f74660df
shader_recompiler: Implement S_BCNT1_I32_B64 and S_FF1_I32_B64 (#1889)
* shader_recompiler: Implement S_BCNT1_I32_B64

* shader_recompiler: Implement S_FF1_I32_B64

* shader_recompiler: Implement IEqual for 64-bit.

* shader_recompiler: Fix immediate type in S_FF1_I32_B32
2024-12-27 16:46:07 +02:00
squidbus
a89c29c2ca
shader_recompiler: Rework image read/write emit. (#1819) 2024-12-25 01:13:32 +02:00
IndecisiveTurtle
7b24b42711 data_share: Emit barrier before reads
* Fixes artifacts in TLG when using NVIDIA gpus. When LDS is written and read in the same basic block, the barrier pass wont handle it properly, so insert a barrier before reads
2024-12-24 16:04:30 +02:00
Daniel R.
c284cf72e1
Switch remaining CRLF terminated files to LF 2024-12-24 13:56:31 +01:00
TheTurtle
092d42e981
renderer_vulkan: Implement rectlist emulation with tessellation (#1857)
* renderer_vulkan: Implement rectlist emulation with tessellation

* clang format

* renderer_vulkan: Use tessellation for quad primitive as well

* vk_rasterizer: Handle viewport enable flags

* review

* shader_recompiler: Fix quad/rect list FS passthrough semantics.

* spirv: Bump to 1.5

* remove pragma

---------

Co-authored-by: squidbus <175574877+squidbus@users.noreply.github.com>
2024-12-24 13:28:47 +02:00
georgemoralis
b0b74243af clang-fix 2024-12-19 10:25:03 +02:00
TheTurtle
188eebb92a
ir: Add heuristic based LDS barrier pass (#1801)
* ir: Add heuristic based LDS barrier pass

* Attempts to insert barriers after zero-depth divergant conditional blocks in shaders that use shared memory

* lds_barriers: Limit to nvidia

* Intel has historically had problems with cs barriers, will debug other time
2024-12-19 10:18:28 +02:00
Stephen Miller
32435674f2
Misc UE4 fixes (#1821)
* Add ExecLo case to S_SAVEEXEC_B64

Seen in CUSA38209

* S_BCNT1_I32_B32

Turtle said our implementation of S_BCNT1_I32_B64 was meant to be for S_BCNT1_I32_B32, so renaming the opcode is the fix.
2024-12-18 22:05:35 +02:00
Marcin Mikołajczyk
b1b4c8c487
Handle setting Vcc in Translator::SetDst64 (#1826) 2024-12-18 21:57:58 +02:00
Marcin Mikołajczyk
be4c38bf1c
Handle 32bit int ImageFormat (#1823) 2024-12-18 21:48:00 +02:00
squidbus
8a4e03228a
spirv_emit_context: Prevent double-add of GS in attributes to interface. (#1800) 2024-12-16 02:11:15 +02:00
baggins183
9aa1c13c7e
Fix some compiler problems with ds3 (#1793)
- Implement S_CMOVK_I32
- Handle Isoline abstract patch type
2024-12-15 16:30:19 +02:00
psucien
0fd1ab674b
GPU processor refactoring (#1787)
* coroutine code prettification

* asc queues submission refactoring

* better asc ring context handling

* final touches and review notes

* even more simplification for context saving
2024-12-15 00:54:46 +02:00
Vladislav Mikhalin
876445faf1
recompiler: emit a label right after s_branch to prevent dead code interferrence (#1785) 2024-12-14 22:46:55 +02:00
squidbus
f93677b953
resource_tracking_pass: Fix converting dimensions to float for normalization. (#1790) 2024-12-14 22:46:35 +02:00
TheTurtle
e9ede8d627
Revert "DmaData and Recompiler fixes (#1775)" (#1784)
This reverts commit cafd40f2c2.
2024-12-14 16:17:14 +02:00
squidbus
27447537c3
externals: Update sirit to fix debug assert (#1783) 2024-12-14 16:12:41 +02:00
squidbus
e752f04cde
shader_recompiler: Fixups from stencil changes (#1776) 2024-12-14 14:33:24 +02:00
Vladislav Mikhalin
cafd40f2c2
DmaData and Recompiler fixes (#1775)
* liverpool: fix dmadata packet handling

* recompiler: emit a label right after s_branch to prevent dead code interferrence

* specialize barriers
2024-12-14 14:33:06 +02:00
baggins183
3c0c921ef5
Tessellation (#1528)
* shader_recompiler: Tessellation WIP

* fix compiler errors after merge

DONT MERGE set log file to /dev/null

DONT MERGE linux pthread bb fix

save work

DONT MERGE dump ir

save more work

fix mistake with ES shader

skip list

add input patch control points dynamic state

random stuff

* WIP Tessellation partial implementation. Squash commits

* test: make local/tcs use attr arrays

* attr arrays in TCS/TES

* dont define empty attr arrays

* switch to special opcodes for tess tcs/tes reads and tcs writes

* impl tcs/tes read attr insts

* rebase fix

* save some work

* save work probably broken and slow

* put Vertex LogicalStage after TCS and TES to fix bindings

* more refactors

* refactor pattern matching and optimize modulos (disabled)

* enable modulo opt

* copyright

* rebase fixes

* remove some prints

* remove some stuff

* Add TCS/TES support for shader patching and use LogicalStage

* refactor and handle wider DS instructions

* get rid of GetAttributes for special tess constants reads. Immediately replace some upon seeing readconstbuffer. Gets rid of some extra passes over IR

* stop relying on GNMX HsConstants struct. Change runtime_info.hs_info and some regs

* delete some more stuff

* update comments for current implementation

* some cleanup

* uint error

* more cleanup

* remove patch control points dynamic state (because runtime_info already depends on it)

* fix potential problem with determining passthrough

---------

Co-authored-by: IndecisiveTurtle <47210458+raphaelthegreat@users.noreply.github.com>
2024-12-14 12:56:17 +02:00
squidbus
8caca4df32
shader_recompiler: Support VK_AMD_shader_image_load_store_lod for IMAGE_STORE_MIP (#1770)
* shader_recompiler: Support VK_AMD_shader_image_load_store_lod for IMAGE_STORE_MIP

* emit_spirv: Fix missing extension declaration.
2024-12-14 12:03:42 +02:00
squidbus
f1c23d514b
shader_recompiler: Implement FREXP instructions. (#1766) 2024-12-13 21:51:39 +02:00
TheTurtle
722a0e36be
graphics: Improve handling of color buffer and storage image swizzles (#1763)
* liverpool_to_vk: Remove wrong component swap formats

* shader_recompiler: Handle storage and buffer format swizzles

* shader_recompiler: Skip unsupported depth export

* image_view: Remove image format swizzle

* Platform support is not always guaranteed
2024-12-13 21:49:37 +02:00
squidbus
028be3ba5d
shader_recompiler: Emulate unnormalized sampler coordinates in shader. (#1762)
* shader_recompiler: Emulate unnormalized sampler coordinates in shader.

* Address review comments.
2024-12-13 21:49:07 +02:00
TheTurtle
5be807fc8a
hot-fix: Fix order of operands 2024-12-13 00:31:49 +02:00
squidbus
0b59ebb22f
shader_recompiler: Implement S_ABS_I32 (#1713) 2024-12-09 12:12:33 +01:00
squidbus
71a82199ed
shader_recompiler: Fix check for fragment depth store. (#1694) 2024-12-08 10:20:05 +02:00
IndecisiveTurtle
cde84e4bac shader_recompiler: Fix mistake 2024-12-07 23:45:23 +02:00
squidbus
c076ba69e8
shader_recompiler: Implement V_LSHL_B64 for immediate arguments. (#1674) 2024-12-07 23:28:17 +02:00
¥IGA
2266622dcf
Support for Vulkan 1.4 (#1665) 2024-12-07 19:41:41 +02:00
Vladislav Mikhalin
8eacb88a86
recompiler: fixed fragment shader built-in attribute access (#1676)
* recompiler: fixed fragment shader built-in attribute access

* handle en/addr separately

* handle other registers as well
2024-12-07 01:20:09 +02:00
TheTurtle
9e618c0e0c
video_core: Add multipler to handle special cases of texture buffer stride mismatch (#1640)
* page_manager: Enable userfaultfd by default

* Much faster than page faults and causes less problems

* shader_recompiler: Add texel buffer multiplier

* Fixes format mismatch assert when vsharp stride is multiple of format stride

* shader_recompiler: Specialize UBOs on size

* Some games can perform manual vertex pulling and thus bind read only buffers of varying size. We only recompile when the vsharp size is larger than size in shader, in opposite case its not needed

* clang format
2024-12-06 19:54:59 +02:00
squidbus
d05846a327
specialization: Fix fetch shader field type (#1675) 2024-12-06 12:59:55 +02:00
IndecisiveTurtle
77da8bac00 core: Return proper address of eh frame/add more opcodes 2024-12-06 00:47:11 +02:00
TheTurtle
22a2741ea0
shader_recompilers: Improvements to SSA phi generation and lane instruction elimination (#1667)
* shader_recompiler: Add use tracking for Insts

* ssa_rewrite: Recursively remove phis

* ssa_rewrite: Correct recursive trivial phi elimination

* ir: Improve read lane folding pass

* control_flow: Avoid adding unnecessary divergant blocks

* clang format

* externals: Update ext-boost

---------

Co-authored-by: Frodo Baggins <baggins31084@proton.me>
2024-12-05 23:14:16 +02:00
squidbus
920acb8d8b
renderer_vulkan: Parse fetch shader per-pipeline (#1656)
* shader_recompiler: Read image format info directly from sharps instead of storing in shader info.

* renderer_vulkan: Parse fetch shader per-pipeline

* Few minor fixes.

* shader_recompiler: Specialize on vertex attribute number types.

* shader_recompiler: Move GetDrawOffsets to fetch shader
2024-12-04 13:03:47 +02:00
TheTurtle
eb844b9b63
shader_recompiler: Implement manual barycentric interpolation path (#1644)
* shader_recompiler: Implement manual barycentric interpolation path

* clang format

* emit_spirv: Fix typo

* emit_spirv: Simplify variable definition

* spirv_emit: clang format
2024-12-02 23:20:54 +02:00
Jamie Tong
b0860d6e8c
implement DS_AND_B32, DS_OR_B32, DS_XOR_B32 (#1593)
* implement DS_OR_B32

* implement DS_AND_B32, DS_XOR_B32
2024-11-30 22:39:11 +02:00
Vladislav Mikhalin
086c5802f4
Fixed DS_SWIZZLE_32 (#1619) 2024-11-29 22:13:36 +02:00
psucien
d6d1ec4f22 hot-fix: apply vgt index offset to draw commands 2024-11-29 14:17:53 +01:00
psucien
3d95ad0e3a
Image binding and texture cache interface refactor (1/2) (#1481)
* video_core: texture_cache: interface refactor and better overlap handling

* resources binding moved into vk_rasterizer

* remove `virtual` flag leftover
2024-11-24 17:07:51 +01:00
baggins183
fde1726af5
recompiler: fix how srt pass handles step rate sharps in special case (#1587) 2024-11-24 11:49:59 +01:00
psucien
5976300788 hot-fix: proper offset calculation for single offset lds instructions 2024-11-23 10:14:19 +01:00
TheTurtle
c4506da0ae
kernel: Rewrite pthread emulation (#1440)
* libkernel: Cleanup some function places

* kernel: Refactor thread functions

* kernel: It builds

* kernel: Fix a bunch of bugs, kernel thread heap

* kernel: File cleanup pt1

* File cleanup pt2

* File cleanup pt3

* File cleanup pt4

* kernel: Add missing funcs

* kernel: Add basic exceptions for linux

* gnmdriver: Add workload functions

* kernel: Fix new pthreads code on macOS. (#1441)

* kernel: Downgrade edeadlk to log

* gnmdriver: Add sceGnmSubmitCommandBuffersForWorkload

* exception: Add context register population for macOS. (#1444)

* kernel: Pthread rewrite touchups for Windows

* kernel: Multiplatform thread implementation

* mutex: Remove spamming log

* pthread_spec: Make assert into a log

* pthread_spec: Zero initialize array

* Attempt to fix non-Windows builds

* hotfix: change incorrect NID for scePthreadAttrSetaffinity

* scePthreadAttrSetaffinity implementation

* Attempt to fix Linux

* windows: Address a bunch of address space problems

* address_space: Fix unmap of region surrounded by placeholders

* libs: Reduce logging

* pthread: Implement condvar with waitable atomics and sleepqueue

* sleepq: Separate and make faster

* time: Remove delay execution

* Causes high cpu usage in Tohou Luna Nights

* kernel: Cleanup files again

* pthread: Add missing include

* semaphore: Use binary_semaphore instead of condvar

* Seems more reliable

* libraries/sysmodule: log module on `sceSysmoduleIsLoaded`

* libraries/kernel: implement `scePthreadSetPrio`

---------

Co-authored-by: squidbus <175574877+squidbus@users.noreply.github.com>
Co-authored-by: Daniel R. <47796739+polybiusproxy@users.noreply.github.com>
2024-11-21 22:59:38 +02:00
Daniel R.
6904764aab
shader_recompiler/frontend: implement V_MIN3_U32 2024-11-21 19:52:48 +01:00
Daniel R.
e968b1c23f
video_core/amdgpu: heuristic for shader binary info
Games can strip the first shader instruction (meant for debugging) which we rely on for obtaining shader information (e.g. LittleBigPlanet 3). For this reason, we start a search through the code start until we arrive at the shader binary info.
2024-11-21 19:24:13 +01:00
¥IGA
c83ac654ce
Bump to Clang 18 (#1549) 2024-11-21 12:08:22 +02:00
Ruah Devlin
96cd79f272
Implement V_MED3_U32 vector ALU Opcode (#1553) 2024-11-20 17:23:59 +01:00
Daniel R.
17c47bcd96
shader_recompiler/frontend: Implement bitcmp instructions (#1550) 2024-11-19 21:38:32 +01:00
Lander Gallastegi
b64dcd2f56
Assert fix (#1521) 2024-11-12 09:26:48 +02:00
psucien
204bba9be8 hot-fix: pr merge conflict resolved 2024-11-05 22:59:45 +01:00
Lander Gallastegi
aa4c6c0178
shader_recompiler: patch fmask access instructions (#1439)
* Fix multisample texture fetch

* Patch some fmask reads

* clang-format

* Assert insteed of ignore, coordinate fixes

* Patch ImageQueryDimensions
2024-11-05 22:39:57 +01:00
baggins183
9ec75c3feb
Implement shader resource tables (#1165)
* Implement shader resource tables

* fix after rebase + squash

* address some review comments

* fix pipeline_common

* cleanup debug stuff

* switch to using single codegenerator
2024-11-01 08:55:53 +02:00
TheTurtle
87f8fea4de
renderer_vulkan: Commize and adjust buffer bindings (#1412)
* shader_recompiler: Implement finite cmp class

* shader_recompiler: Implement more opcodes

* renderer_vulkan: Commonize buffer binding

* liverpool: More dma data impl

* fix

* copy_shader: Handle additional instructions from Knack

* translator: Add V_CMPX_GE_I32
2024-10-19 15:30:58 +03:00
Herman Semenoff
96ea686eb6
Fixed return strict const iterator, replace to range-based loop C++17 and code refactor (#548)
Signed-off-by: Herman Semenov <GermanAizek@yandex.ru>
Co-authored-by: georgemoralis <giorgosmrls@gmail.com>
2024-10-18 11:06:11 +03:00
squidbus
0f91661660
resource_tracking_pass: Make sure immediate offset is accessed as correct type. (#1339) 2024-10-10 23:58:01 +03:00
squidbus
2f80d7565d
resource_tracking_pass: Fix type handling of sample offsets. (#1337) 2024-10-10 23:30:09 +03:00
squidbus
21eb175aa1
shader_recompiler: Add asserts for get/set register bounds. (#1336) 2024-10-10 23:14:50 +03:00
squidbus
dcc4057dd8
shader_recompiler: Set correct operand field for VOP3b sdst. (#1335) 2024-10-10 23:04:51 +03:00
squidbus
0df0d0cb66
shader_recompiler: Fix last image sample address parameter. (#1334) 2024-10-10 22:51:11 +03:00
squidbus
0a5f46942e
shader_recompiler: Make sure RuntimeInfo is zero initialized. (#1332) 2024-10-10 20:32:13 +03:00
squidbus
09cbccb40b
shader_recompiler: Implement V_SUBB_U32 and V_SUBBREV_U32. (#1331) 2024-10-10 19:40:19 +03:00
squidbus
d91ad6174e
shader_recompiler: Move sampling parameter resolution to tracking pass and support more derivative types. (#1290)
* shader_recompiler: Move sampling parameter resolution to tracking pass and support more derivative types.

* shader_recompiler: Only track sampler sharp on sample instructions.

* shader_recompiler: Fix Inst args size.
2024-10-10 19:27:34 +03:00
TheTurtle
100036aecf
spirv: Flush denormals if possible (#1302) 2024-10-10 17:47:39 +03:00
Stephen Miller
aba803bd04
IMAGE_GATHER4_C_LZ_O (#1274)
Used by Crash Team Racing Nitro Fueled.
2024-10-08 10:35:07 +03:00
baggins183
3c0255b953
DebugPrintf in shaders (#1252)
* Add shader debug print opcode that uses NonSemantic DebugPrintf extension

* small correction for flags in Inst

* Fix IR Debug Print. Add StringLiteral op

* add missing microinstruction changes for debugprint

* cleanup. delete vaarg stuff. Smuggle format string in Info and flags

* more cleanup

* more

* (dont merge??) update sirit submodule

* fix num args 4 -> 5

* add notes about DebugPrint IR op

* use NumArgsOf again

* copyright

* update sirit submodule

* fix clangformat

* add new Value variant for string literal. Use arg0 for fmt string

* remove string pool changes

* Update src/shader_recompiler/ir/value.cpp

Co-authored-by: TheTurtle <47210458+raphaelthegreat@users.noreply.github.com>

---------

Co-authored-by: TheTurtle <47210458+raphaelthegreat@users.noreply.github.com>
2024-10-06 22:34:40 +03:00
TheTurtle
310814ac71
shader_recompiler: Support for more offset layouts (#1270) 2024-10-06 18:43:59 +02:00
squidbus
2a7d56dbf2
shader_recompiler: Remove outdated image array warning. (#1256) 2024-10-06 01:42:58 +03:00
psucien
927bb0c175
Initial support of Geometry shaders (#1244)
* video_core: initial GS support

* fix for components mapping; missing prim type
2024-10-06 01:26:50 +03:00
squidbus
8576d5e72c
shader_recompiler: Set array size to max UBO size when 0. (#1251)
* shader_recompiler: Set array size to max UBO size when 0.

* vulkan: Account for fallbacks when setting depth attachment format.
2024-10-05 22:31:52 +03:00
squidbus
ee57c2fd69
vulkan: Fix two more validation errors. (#1250) 2024-10-05 21:35:02 +03:00
Mahmoud Adel
76644a0169
add Opcodes to switch case (#1233)
* add Opcodes to switch case

Added Opcodes to switch case, they were done here but weren't added to switch 9f79764b01 (diff-9a6c2e2027c03231e88aaaab30908baecae202661839f35c31a777fec2500c7aR659)

* clang
2024-10-04 11:24:45 +03:00
korenkonder
9f79764b01
Add various V_CVT opcodes (#1223) 2024-10-04 08:48:05 +02:00
korenkonder
da519f9091
Moved opcode to it's proper location (#1221) 2024-10-03 22:47:26 +02:00
ElBread3
ff13aff862
video_core: IMAGEGATHER4_C_O (#1210) 2024-10-03 18:48:54 +02:00
dbz400
54dafce541
Add V_CVT_F64_I32 (#1219) 2024-10-03 18:48:28 +02:00
squidbus
7209b7d786
shader_recompiler: Shader param fixups (#1199) 2024-10-03 10:50:51 +03:00
squidbus
e68774d449
shader_recompiler: Define fragment output type based on number format. (#1097)
* shader_recompiler: Define fragment output type based on number format.

* shader_recompiler: Fix GetAttribute SPIR-V output type.

* shader_recompiler: Don't bitcast on SetAttribute unless integer target.
2024-10-01 23:42:37 +03:00
squidbus
75adf7c8d1
vulkan: Fix some common validation errors. (#1101)
* vulkan: Fix some extension support related validation errors.

* vulkan: Fix validation error on zero-size buffer.

* vulkan: Fix primitive list restart validation error.
2024-10-01 23:42:20 +03:00
dbz400
c7ff0419ad
Fix V_CMP_CLASS_F32 (#1153) 2024-09-30 11:36:26 +03:00
squidbus
45e206e248
shader_recompiler: Sample images using correct result type. (#1068) 2024-09-25 14:20:28 +03:00
jnack
beb809b612
add V_CMPX_LE_I32 (#1056) 2024-09-24 18:22:31 +03:00
baggins183
7f9bc0abbd
fix lane inst decoding (#1051) 2024-09-24 12:29:57 +03:00
squidbus
a5001d11a8
shader_recompiler: Increase push constants user data to full capacity. (#1032) 2024-09-23 13:40:33 +03:00
TheTurtle
ee38eec7fe
shader_recompiler: Additional scope handling and user data as push constants (#1013)
* shader_recompiler: Use push constants for user data regs

* shader: Add some GR2 instructions

* shader: Add some instructions

* shader: Add instructions for knack

* touchups

* spirv: Better names

* buffer_cache: Ignore non gpu modified images

* clang format

* Add log

* more fixes
2024-09-23 08:55:43 +02:00
psucien
fb5bc371cb hot-fix: unnecessary optimization removed 2024-09-22 19:56:07 +02:00
IndecisiveTurtle
e1d03c35fd hotfix: Fix mipmap query for images 2024-09-22 19:17:54 +03:00
squidbus
a18419dd73
shader_recompiler: Exclude non-float results from output modifiers. (#1016) 2024-09-22 15:03:17 +03:00
korenkonder
8811cc5cc6
Add V_CVT_PK_U8_F32 opcode (#1022) 2024-09-22 15:02:34 +03:00
korenkonder
5db27109c9
Optimise out unnecessary shifts (#1021) 2024-09-22 15:02:20 +03:00
TheTurtle
edde0a3e7e
hotfix: Revert ADDC change 2024-09-22 01:53:10 +03:00
squidbus
dd184fd95d
shader_recompiler: Use SetDst in more instructions. (#1015) 2024-09-22 01:41:19 +03:00
psucien
5f4ddc14fc
Image subresources barriers (#904)
* video_core: texture: image subresources state tracking

* shader_recompiler: use one binding if the same image is read and written

* video_core: added rebinding of changed textures after overlap resolve

* don't use pointers; slight `FindTexture` refactoring

* video_core: buffer_cache: don't copy over the image size

* redundant barriers removed; fixes

* regression fixes

* texture_cache: 3d texture layers count fixup

* shader_recompiler: support for partially bound cubemaps

* added support for cubemap arrays

* don't bind unused color buffers

* fixed depth promotion to do not use stencil

* doors

* bonfire lit

* cubemap array index calculation

* final touches
2024-09-21 21:45:56 +02:00
squidbus
913a46173a
resource_tracking_pass: Allow derivatives for 2D array images. (#1000) 2024-09-21 14:19:01 +02:00
korenkonder
07de1ee977
Sort opcodes by their indices. Group them too when applicable (#945) 2024-09-19 20:29:56 +02:00
Raven
84e2c4d3bb
Add other 64-bit floating point shader instructions (#944) 2024-09-17 18:01:33 +02:00
Daniel R.
dcf245b814
shader_recompiler: Implement basic 64-bit floating point support (#915)
* shader_recompiler: Implement basic 64-bit floating point support

* Fix formatting
2024-09-15 22:53:08 +02:00
Raven
b14f447060
Add DS_READ2ST64_B32 (#916)
* Add DS_READ2ST64_B32

* Fix CLANG

* Fix CI again

* Parameter update for DS_READ
2024-09-14 21:16:12 +03:00
Raven
5c5c02cb04
Add S_XOR_B32 opcode (#911)
* Add S_XOR_B32

* Stub S_OR_B32
2024-09-14 18:52:30 +03:00
Raven
12a0a02e38
Map BUFFER_AUTOMIC SMIN/SAMX/AND/OR/XOR/INC/DEC (#910) 2024-09-14 18:52:20 +03:00
Emulator-Team-2
c924457e28
Implement IMAGE_SAMPLE_L_O opcode (#899) 2024-09-13 19:20:35 +02:00
Luke Warner
c181102a02
Implement S_ABSDIFF_I32 shader instruction (#902) 2024-09-13 19:02:17 +02:00
TheTurtle
1b6cc447b4
hotfix: Restore unreachable 2024-09-12 23:46:29 +03:00
illusion0001
b911c70d35
Silence unhandled case warns (#823) 2024-09-12 23:01:13 +03:00
squidbus
a49c7e9dcb
shader_recompiler: Add buffer offset calculation when swizzle is enabled. (#834) 2024-09-12 22:59:52 +03:00
squidbus
136f6072b9
shader_recompiler: Use correct integer type for OpImageWrite. (#871) 2024-09-11 23:04:02 +03:00
TheTurtle
b0bbb16aae
video_core: Add fallback path for pipelines with more than 32 bindings (#837)
* video_core: Small fixes

* renderer_vulkan: Add fallback path for pipelines with more than 32 bindings

* vk_resource_pool: Rewrite desc heap

* work
2024-09-10 20:54:39 +03:00
Andrew Middendorp
4502a5ddcc
Added S_ANDN2_B32 and S_NAND_B32 opcodes (#833)
* Added S_ANDN2_B32 and S_NAND_B32 opcodes

* Update src/shader_recompiler/frontend/translate/scalar_alu.cpp

Co-authored-by: baggins183 <baggins31084@proton.me>

* Fix result and src1

Co-authored-by: baggins183 <baggins31084@proton.me>

* update result

Co-authored-by: baggins183 <baggins31084@proton.me>

* Update src1

Co-authored-by: baggins183 <baggins31084@proton.me>

---------

Co-authored-by: baggins183 <baggins31084@proton.me>
2024-09-09 22:46:57 +03:00
TheTurtle
13743b27fc
shader_recompiler: Implement data share append and consume operations (#814)
* shader_recompiler: Add more format swap modes

* texture_cache: Handle stencil texture reads

* emulator: Support loading font library

* readme: Add thanks section

* shader_recompiler: Constant buffers as integers

* shader_recompiler: Typed buffers as integers

* shader_recompiler: Separate thread bit scalars

* We can assume guest shader never mixes them with normal sgprs. This helps avoid errors where ssa could view an sgpr write dominating a thread bit read, due to how control flow is structurized, even though its not possible in actual control flow

* shader_recompiler: Implement data append/consume operations

* clang format

* buffer_cache: Simplify invalidation scheme

* video_core: Remove some invalidation remnants

* adjust
2024-09-07 00:14:51 +03:00
Stephen Miller
09ce12a868
shader_recompiler: Add more opcodes (#802)
* Implement some missing shader opcodes

Implements TBUFFER_STORE_FORMAT_XYZW, IMAGE_SAMPLE_CD, and IMAGE_GATHER4_C_LZ.

These are seen in https://github.com/shadps4-emu/shadPS4/issues/496.

* Implement IMAGE_STORE_MIP

Not sure if this is the right way to do this, let me know if this needs changing.

* Revert "Implement IMAGE_STORE_MIP"

This reverts commit cff78b5924.
2024-09-06 23:51:20 +03:00
baggins183
bb29224daf
Implement V_MOVREL variants (#745)
* shader_recompiler: Implement V_MOVRELS_B32, V_MOVRELD_B32,
V_MOVRELSD_B32

Generates a ton of OpSelects to hardcode reading or writing from each
possible vgpr depending on the value of m0

Future work is to do range analysis to put an upper bound on m0 and
check fewer registers.

* fix runtime info after rebase
2024-09-06 23:47:47 +03:00
squidbus
1d7ee198e1
shader_recompiler: Add ConvertF16F32 to FP16 detection. (#800) 2024-09-06 14:12:07 +03:00
georgemoralis
0dd6e257c5
Merge pull request #573 from polybiusproxy/shader_recompiler/format
shader_recompiler/frontend: Miscellaneous fixes
2024-09-04 23:21:23 +03:00
Sebastian Kassai
0a5c36482e
shader_recompiler: change ir.SetScalarReg() -> SetDst() (#777)
Fixes an out-of-bounds crash on Amplitude and Rock Band 4 startup.
2024-09-04 17:30:43 +03:00
squidbus
b87e6f3838
vulkan: Emulate depth clip control when extension is not available. (#762) 2024-09-04 01:07:05 +03:00
squidbus
d48836d5ae
shader_recompiler: Limit src0 to 4-bit in V_CVT_OFF_F32_I4 (#759) 2024-09-03 21:37:52 +03:00
TheTurtle
f087f43736
shader_recompiler: Implement render target swizzles when no format is available (#739)
* shader_recompiler: Use null image when shader is compiled with unbound sharp

* video_core: Refactor and render target swizzles

* liverpool_to_vk: Add missing swap format from RDR

* video_core: Refactor shader recompiler interface

* Makes it much easier to pass runtime information to the recompiler and have it treated as part of the shader key. Also pulls out most runtime state from Info struct

* shader_recompiler: Avoid some asserts
2024-09-03 14:04:30 +03:00
baggins183
101aeb920d
Implement V_BFM_B32 and V_FFBH_U32 (#663)
* Implement V_BFM_B32

* Render.Recompiler: Implement V_FFBH_U32

* fix clang-format
2024-09-01 22:20:42 +03:00
Grégoire Hage
1bd9317509
Implement V_READFIRSTLANE_B32 (#681)
* Implement V_READFIRSTLANE_B32

* refactor
2024-09-01 21:49:42 +03:00
Daniel R.
84f1690dfb
Merge branch 'shadps4-emu:main' into shader_recompiler/format 2024-08-30 15:40:17 +02:00
Grégoire Hage
1651db24fe
Implement S_XNOR_B64 (#654) 2024-08-30 02:43:12 +03:00
IndecisiveTurtle
6fbbe3d79b translator: Add missed flow instruction 2024-08-30 00:26:01 +03:00
IndecisiveTurtle
fab390b860 spirv: More correct texel buffer usage 2024-08-30 00:25:56 +03:00
TheTurtle
66e96dd944
video_core: Account of runtime state changes when compiling shaders (#575)
* video_core: Compile shader permutations

* spirv: Only specific storage image format for atomics

* ir: Avoid cube coord patching for storage image

* spirv: Fix default attributes

* data_share: Add more instructions

* video_core: Query storage flag with runtime state

* kernel: Use std::list for semaphore

* video_core: Use texture buffers for untyped format load/store

* buffer_cache: Limit view usage

* vk_pipeline_cache: Fix invalid iterator

* image_view: Reduce log spam when alpha=1 in storage swizzle

* video_core: More features and proper spirv feature detection

* video_core: Attempt no2 for specialization

* spirv: Remove conflict

* vk_shader_cache: Small cleanup
2024-08-29 19:29:54 +03:00
georgemoralis
18e95ae4c0
Merge branch 'main' into shader_recompiler/format 2024-08-29 10:18:12 +03:00
psucien
3fbb68048e
shader_recompiler: frontend: SOPC and SOPK handling separated; more cmp opcodes (#634) 2024-08-28 22:27:47 +02:00
0xsegf4ult
9f4e55a8e7
shader_recompiler: constant propagation bitwise operations + S_CMPK_EQ_U32 fix (#613)
* rebase on main branch impl of V_LSHL_B64

* remove V_LSHR_B64

* fix S_CMPK_EQ_u32

* fix conflicts

* fix broken merge

* remove duplicate cases

* remove duplicate declaration
2024-08-28 13:10:21 +03:00
psucien
371d1d009a Added missing headers and 2D MSAA image type 2024-08-27 19:17:23 +02:00
Grégoire Hage
288db9a0cf
Implement V_LSHL_B64 (#608) 2024-08-27 14:15:32 +03:00
psucien
af4356bfe1 shader_recompiler: fix for pattern detection in TryDisableAnisoLod0
Also fix for forgotten log message params.
2024-08-26 23:49:36 +02:00
Plínio Larrubia
ad8373095a
fix typo in LOG_INFO (#559)
fix: file name typo constant_propogation_pass.cpp

fix typo from 'symbol_vitrual_addr' variable

fix typo in emit_spirv_context_get_set.cpp

fix typo from constant_propagation_pass.cpp in CMakeLists

fix typo in these some config.cpp functions
- setSliderPosition
- setSliderPositionGrid
- getSliderPosition
- getSliderPositionGrid

fix typo inside src\core\aerolib\stubs.cpp

fix typo in a comment from src\core\file_format\pkg.cpp

fix typo inside src\core\file_sys\fs.cpp + fs.h
- NeedsCaseInsensiveSearch -> NeedsCaseInsensitiveSearch

fix 2 function typos: sceAppContentAddcontEnqueueDownloadByEntitlemetId and sceAppContentAddcontMountByEntitlemetId

fix typo on comment inside src\core\libraries\kernel\file_system.cpp

fix typo on src\core\libraries\videoout\driver.cpp

fix typo in src\core\memory.cpp

fix typo from comment in src\qt_gui\game_list_utils.h

fix typo in src\video_core\amdgpu\liverpool.h
- window_offset_disble to window_offset_disable

fix typo from comments in src\video_core\host_shaders\detile_m32x1.comp + detile_m32x2.comp
- subotimal -> suboptimal

fix typo from comment in src\video_core\renderer_vulkan\renderer_vulkan.cpp
- dimentions -> dimensions

fix typo from enum in src\common\debug.h and other files
- MarkersPallete -> MarkersPalette

fix last typo in src\video_core\amdgpu\pm4_opcodes.h
- PremableCntl -> PreambleCntl
2024-08-26 15:22:11 +03:00
Lizardy
aae6e5be73
shader_recompiler: BUFFER_ATOMIC_SWAP Opcode (#566)
* shader_recompiler: BUFFER_ATOMIC_SWAP Opcode

* clang

* follow 32 convention

---------

Co-authored-by: microsoftv <6063922+microsoftv@users.noreply.github.com>
2024-08-26 15:21:20 +03:00
greggameplayer
86870e7c8d Implement TBUFFER_STORE_FORMAT_XY 2024-08-26 03:39:38 +02:00
DanielSvoboda
2a737d0800
V_NOP | PfpSyncMe | S_CMPK_EQ_U32 (#426)
* V_NOP

V_NOP = Do nothing

* PfpSyncMe

PfpSyncMe ensures that all previous commands are completed before continuing.
'break' should be enough for now

* S_CMPK_EQ_U32

S_CMPK_EQ_U32
SCC = (D.u == SIMM16)

* S_CMPK_EQ_U32

* OperandField::Undefined:

* Update translate.cpp

remove  OperandField::Undefined:

* Update image_view.cpp

[Render.Vulkan] <Error> image_view.cpp:ImageViewInfo:109: Storage image (num_comps = 4) requires swizzling [BGRA]
format 43 dst_sel 3886

* Update liverpool_to_vk.cpp

* S_CMPK_EQ_U32

* S_CMPK_EQ_U32
2024-08-25 22:07:46 +02:00
psucien
b687ae5e34
GnmDriver: Clear context support (#567)
* gnmdriver: added support for gpu context reset

* shader_recompiler: minor validation fixes

* shader_recompiler: added `V_CMPX_GT_I32`

* shader_recompiler: fix for crash on inline sampler access

* compilation warnings and dead code elimination

* amdgpu: fix for registers addressing

* libraries: videoout: reduce logging pressure

* shader_recompiler: fix for devergence scope detection
2024-08-25 23:01:05 +03:00
Daniel R
3a8a666df0
shader_recompiler/frontend: fix IMAGE_SAMPLE_CD format
* Seen on Dark Souls
2024-08-25 19:53:45 +02:00
Daniel R.
977371e7e1
shader_recompiler/frontend: fix IMAGE_GATHER4_C_LZ format 2024-08-25 14:06:41 +02:00
Daniel R
ba140b9680
shader_recompiler/frontend: add information on instruction format assert 2024-08-25 13:17:59 +02:00
Daniel R
5691838eca
shader_recompiler/frontend: fix V_NOP instruction format 2024-08-25 13:17:24 +02:00
Daniel R.
d241867c7b
shader_recompiler/frontend: implement V_NOP 2024-08-24 23:18:04 +02:00
TheTurtle
c79b10edc1
video_core: Bloodborne stabilization pt1 (#543)
* shader_recompiler: Writelane elimination pass + null image fix

* spirv: Implement image derivatives

* texture_cache: Reduce page bit size

* clang format

* slot_vector: Back to debug assert

* vk_graphics_pipeline: Handle null tsharp

* spirv: Revert some change

* vk_instance: Support primitive restart on list topology

* page_manager: Adjust windows exception handler

* clang format

* Remove subres tracking

* Will be done separately
2024-08-24 22:51:47 +03:00
Vinicius Rangel
9e4fc17e6c
shader_recompiler: handle fetch shader address offsets (#538)
* shader_recompiler: handle fetch shader address offsets

parse index & offset sgpr from fetch shader and propagate them to vkBindVertexBuffers

* shader_recompiler: fix fetch_shader when offset is not present

* video_core: propagate index/offset SGPRs to vkCmdDraw instead of offsetting the buffer address

* video_core: add vertex_offset to non-indexed draw calls

renamed fetch offset fields
2024-08-24 17:36:40 +02:00
Herman Semenov
243fd0be78
core,shader_recompiler: align structures for 64-bit platforms (#447)
Decreased sizes:
 * TextureDefinition 32 bytes -> 24 bytes
 * PortOut 72 bytes -> 64 bytes
 * Request 48 bytes -> 40 bytes
 * WindowSystemInfo 32 bytes -> 24 bytes
2024-08-24 16:18:12 +03:00
psucien
2c540fbecb
Merge pull request #497 from xezrunner/xezrunner/cfg-msb-fix
shader_recompiler: fix BranchTarget sign flip for sopp.simm
2024-08-24 11:39:10 +02:00
Xphalnos
d4be3dbb31 Lot of small fixes 2024-08-22 18:01:30 +02:00
Herman Semenov
aed9a737d6 Added const reference params if possible, removed less 16 size 2024-08-22 02:56:01 +03:00
TheTurtle
3f9c86ad33
vk_pipeline_cache: Avoid recompiling new shaders on each new PL (#480)
* cfg: Add one more divergence case

* Seen in RDR shaders

* renderer_vulkan: Reduce number of compiled shaders

* vk_pipeline_cache: Remove some unnecessary checks
2024-08-21 02:00:24 +03:00
xezrunner
42c4d8353a Fix control.sopp.simm flipping sign in CFG label generation
This used to cause a fatal crash that would prevent Amplitude [CUSA02480] from booting beyond initialization.

A conditional true label would get an address starting with 0xffff...., which wasn't realistic with the given shader.

The multiplication by 4 causes the value to have its MSB set due to the smaller type.
2024-08-20 22:48:28 +02:00