Commit Graph

106 Commits

Author SHA1 Message Date
dbz400
54dafce541
Add V_CVT_F64_I32 (#1219) 2024-10-03 18:48:28 +02:00
squidbus
e68774d449
shader_recompiler: Define fragment output type based on number format. (#1097)
* shader_recompiler: Define fragment output type based on number format.

* shader_recompiler: Fix GetAttribute SPIR-V output type.

* shader_recompiler: Don't bitcast on SetAttribute unless integer target.
2024-10-01 23:42:37 +03:00
dbz400
c7ff0419ad
Fix V_CMP_CLASS_F32 (#1153) 2024-09-30 11:36:26 +03:00
jnack
beb809b612
add V_CMPX_LE_I32 (#1056) 2024-09-24 18:22:31 +03:00
baggins183
7f9bc0abbd
fix lane inst decoding (#1051) 2024-09-24 12:29:57 +03:00
TheTurtle
ee38eec7fe
shader_recompiler: Additional scope handling and user data as push constants (#1013)
* shader_recompiler: Use push constants for user data regs

* shader: Add some GR2 instructions

* shader: Add some instructions

* shader: Add instructions for knack

* touchups

* spirv: Better names

* buffer_cache: Ignore non gpu modified images

* clang format

* Add log

* more fixes
2024-09-23 08:55:43 +02:00
squidbus
a18419dd73
shader_recompiler: Exclude non-float results from output modifiers. (#1016) 2024-09-22 15:03:17 +03:00
korenkonder
8811cc5cc6
Add V_CVT_PK_U8_F32 opcode (#1022) 2024-09-22 15:02:34 +03:00
TheTurtle
edde0a3e7e
hotfix: Revert ADDC change 2024-09-22 01:53:10 +03:00
squidbus
dd184fd95d
shader_recompiler: Use SetDst in more instructions. (#1015) 2024-09-22 01:41:19 +03:00
psucien
5f4ddc14fc
Image subresources barriers (#904)
* video_core: texture: image subresources state tracking

* shader_recompiler: use one binding if the same image is read and written

* video_core: added rebinding of changed textures after overlap resolve

* don't use pointers; slight `FindTexture` refactoring

* video_core: buffer_cache: don't copy over the image size

* redundant barriers removed; fixes

* regression fixes

* texture_cache: 3d texture layers count fixup

* shader_recompiler: support for partially bound cubemaps

* added support for cubemap arrays

* don't bind unused color buffers

* fixed depth promotion to do not use stencil

* doors

* bonfire lit

* cubemap array index calculation

* final touches
2024-09-21 21:45:56 +02:00
korenkonder
07de1ee977
Sort opcodes by their indices. Group them too when applicable (#945) 2024-09-19 20:29:56 +02:00
Raven
84e2c4d3bb
Add other 64-bit floating point shader instructions (#944) 2024-09-17 18:01:33 +02:00
Daniel R.
dcf245b814
shader_recompiler: Implement basic 64-bit floating point support (#915)
* shader_recompiler: Implement basic 64-bit floating point support

* Fix formatting
2024-09-15 22:53:08 +02:00
Raven
b14f447060
Add DS_READ2ST64_B32 (#916)
* Add DS_READ2ST64_B32

* Fix CLANG

* Fix CI again

* Parameter update for DS_READ
2024-09-14 21:16:12 +03:00
Raven
5c5c02cb04
Add S_XOR_B32 opcode (#911)
* Add S_XOR_B32

* Stub S_OR_B32
2024-09-14 18:52:30 +03:00
Raven
12a0a02e38
Map BUFFER_AUTOMIC SMIN/SAMX/AND/OR/XOR/INC/DEC (#910) 2024-09-14 18:52:20 +03:00
Emulator-Team-2
c924457e28
Implement IMAGE_SAMPLE_L_O opcode (#899) 2024-09-13 19:20:35 +02:00
Luke Warner
c181102a02
Implement S_ABSDIFF_I32 shader instruction (#902) 2024-09-13 19:02:17 +02:00
TheTurtle
b0bbb16aae
video_core: Add fallback path for pipelines with more than 32 bindings (#837)
* video_core: Small fixes

* renderer_vulkan: Add fallback path for pipelines with more than 32 bindings

* vk_resource_pool: Rewrite desc heap

* work
2024-09-10 20:54:39 +03:00
Andrew Middendorp
4502a5ddcc
Added S_ANDN2_B32 and S_NAND_B32 opcodes (#833)
* Added S_ANDN2_B32 and S_NAND_B32 opcodes

* Update src/shader_recompiler/frontend/translate/scalar_alu.cpp

Co-authored-by: baggins183 <baggins31084@proton.me>

* Fix result and src1

Co-authored-by: baggins183 <baggins31084@proton.me>

* update result

Co-authored-by: baggins183 <baggins31084@proton.me>

* Update src1

Co-authored-by: baggins183 <baggins31084@proton.me>

---------

Co-authored-by: baggins183 <baggins31084@proton.me>
2024-09-09 22:46:57 +03:00
TheTurtle
13743b27fc
shader_recompiler: Implement data share append and consume operations (#814)
* shader_recompiler: Add more format swap modes

* texture_cache: Handle stencil texture reads

* emulator: Support loading font library

* readme: Add thanks section

* shader_recompiler: Constant buffers as integers

* shader_recompiler: Typed buffers as integers

* shader_recompiler: Separate thread bit scalars

* We can assume guest shader never mixes them with normal sgprs. This helps avoid errors where ssa could view an sgpr write dominating a thread bit read, due to how control flow is structurized, even though its not possible in actual control flow

* shader_recompiler: Implement data append/consume operations

* clang format

* buffer_cache: Simplify invalidation scheme

* video_core: Remove some invalidation remnants

* adjust
2024-09-07 00:14:51 +03:00
Stephen Miller
09ce12a868
shader_recompiler: Add more opcodes (#802)
* Implement some missing shader opcodes

Implements TBUFFER_STORE_FORMAT_XYZW, IMAGE_SAMPLE_CD, and IMAGE_GATHER4_C_LZ.

These are seen in https://github.com/shadps4-emu/shadPS4/issues/496.

* Implement IMAGE_STORE_MIP

Not sure if this is the right way to do this, let me know if this needs changing.

* Revert "Implement IMAGE_STORE_MIP"

This reverts commit cff78b5924.
2024-09-06 23:51:20 +03:00
baggins183
bb29224daf
Implement V_MOVREL variants (#745)
* shader_recompiler: Implement V_MOVRELS_B32, V_MOVRELD_B32,
V_MOVRELSD_B32

Generates a ton of OpSelects to hardcode reading or writing from each
possible vgpr depending on the value of m0

Future work is to do range analysis to put an upper bound on m0 and
check fewer registers.

* fix runtime info after rebase
2024-09-06 23:47:47 +03:00
Sebastian Kassai
0a5c36482e
shader_recompiler: change ir.SetScalarReg() -> SetDst() (#777)
Fixes an out-of-bounds crash on Amplitude and Rock Band 4 startup.
2024-09-04 17:30:43 +03:00
squidbus
d48836d5ae
shader_recompiler: Limit src0 to 4-bit in V_CVT_OFF_F32_I4 (#759) 2024-09-03 21:37:52 +03:00
TheTurtle
f087f43736
shader_recompiler: Implement render target swizzles when no format is available (#739)
* shader_recompiler: Use null image when shader is compiled with unbound sharp

* video_core: Refactor and render target swizzles

* liverpool_to_vk: Add missing swap format from RDR

* video_core: Refactor shader recompiler interface

* Makes it much easier to pass runtime information to the recompiler and have it treated as part of the shader key. Also pulls out most runtime state from Info struct

* shader_recompiler: Avoid some asserts
2024-09-03 14:04:30 +03:00
baggins183
101aeb920d
Implement V_BFM_B32 and V_FFBH_U32 (#663)
* Implement V_BFM_B32

* Render.Recompiler: Implement V_FFBH_U32

* fix clang-format
2024-09-01 22:20:42 +03:00
Grégoire Hage
1bd9317509
Implement V_READFIRSTLANE_B32 (#681)
* Implement V_READFIRSTLANE_B32

* refactor
2024-09-01 21:49:42 +03:00
Grégoire Hage
1651db24fe
Implement S_XNOR_B64 (#654) 2024-08-30 02:43:12 +03:00
IndecisiveTurtle
6fbbe3d79b translator: Add missed flow instruction 2024-08-30 00:26:01 +03:00
TheTurtle
66e96dd944
video_core: Account of runtime state changes when compiling shaders (#575)
* video_core: Compile shader permutations

* spirv: Only specific storage image format for atomics

* ir: Avoid cube coord patching for storage image

* spirv: Fix default attributes

* data_share: Add more instructions

* video_core: Query storage flag with runtime state

* kernel: Use std::list for semaphore

* video_core: Use texture buffers for untyped format load/store

* buffer_cache: Limit view usage

* vk_pipeline_cache: Fix invalid iterator

* image_view: Reduce log spam when alpha=1 in storage swizzle

* video_core: More features and proper spirv feature detection

* video_core: Attempt no2 for specialization

* spirv: Remove conflict

* vk_shader_cache: Small cleanup
2024-08-29 19:29:54 +03:00
psucien
3fbb68048e
shader_recompiler: frontend: SOPC and SOPK handling separated; more cmp opcodes (#634) 2024-08-28 22:27:47 +02:00
0xsegf4ult
9f4e55a8e7
shader_recompiler: constant propagation bitwise operations + S_CMPK_EQ_U32 fix (#613)
* rebase on main branch impl of V_LSHL_B64

* remove V_LSHR_B64

* fix S_CMPK_EQ_u32

* fix conflicts

* fix broken merge

* remove duplicate cases

* remove duplicate declaration
2024-08-28 13:10:21 +03:00
Grégoire Hage
288db9a0cf
Implement V_LSHL_B64 (#608) 2024-08-27 14:15:32 +03:00
Lizardy
aae6e5be73
shader_recompiler: BUFFER_ATOMIC_SWAP Opcode (#566)
* shader_recompiler: BUFFER_ATOMIC_SWAP Opcode

* clang

* follow 32 convention

---------

Co-authored-by: microsoftv <6063922+microsoftv@users.noreply.github.com>
2024-08-26 15:21:20 +03:00
greggameplayer
86870e7c8d Implement TBUFFER_STORE_FORMAT_XY 2024-08-26 03:39:38 +02:00
DanielSvoboda
2a737d0800
V_NOP | PfpSyncMe | S_CMPK_EQ_U32 (#426)
* V_NOP

V_NOP = Do nothing

* PfpSyncMe

PfpSyncMe ensures that all previous commands are completed before continuing.
'break' should be enough for now

* S_CMPK_EQ_U32

S_CMPK_EQ_U32
SCC = (D.u == SIMM16)

* S_CMPK_EQ_U32

* OperandField::Undefined:

* Update translate.cpp

remove  OperandField::Undefined:

* Update image_view.cpp

[Render.Vulkan] <Error> image_view.cpp:ImageViewInfo:109: Storage image (num_comps = 4) requires swizzling [BGRA]
format 43 dst_sel 3886

* Update liverpool_to_vk.cpp

* S_CMPK_EQ_U32

* S_CMPK_EQ_U32
2024-08-25 22:07:46 +02:00
psucien
b687ae5e34
GnmDriver: Clear context support (#567)
* gnmdriver: added support for gpu context reset

* shader_recompiler: minor validation fixes

* shader_recompiler: added `V_CMPX_GT_I32`

* shader_recompiler: fix for crash on inline sampler access

* compilation warnings and dead code elimination

* amdgpu: fix for registers addressing

* libraries: videoout: reduce logging pressure

* shader_recompiler: fix for devergence scope detection
2024-08-25 23:01:05 +03:00
TheTurtle
c79b10edc1
video_core: Bloodborne stabilization pt1 (#543)
* shader_recompiler: Writelane elimination pass + null image fix

* spirv: Implement image derivatives

* texture_cache: Reduce page bit size

* clang format

* slot_vector: Back to debug assert

* vk_graphics_pipeline: Handle null tsharp

* spirv: Revert some change

* vk_instance: Support primitive restart on list topology

* page_manager: Adjust windows exception handler

* clang format

* Remove subres tracking

* Will be done separately
2024-08-24 22:51:47 +03:00
Vinicius Rangel
9e4fc17e6c
shader_recompiler: handle fetch shader address offsets (#538)
* shader_recompiler: handle fetch shader address offsets

parse index & offset sgpr from fetch shader and propagate them to vkBindVertexBuffers

* shader_recompiler: fix fetch_shader when offset is not present

* video_core: propagate index/offset SGPRs to vkCmdDraw instead of offsetting the buffer address

* video_core: add vertex_offset to non-indexed draw calls

renamed fetch offset fields
2024-08-24 17:36:40 +02:00
Xphalnos
d4be3dbb31 Lot of small fixes 2024-08-22 18:01:30 +02:00
Lizardy
63938ba8dd
shader_recompiler: BUFFER_ATOMIC & DS_* Opcodes (#428)
* BUFFER_ATOMIC | DS_MINMAX_U32

- Emission of BufferAtomicU32
- Addition of Buffer opcodes to IR
- Translator for BUFFER_ATOMIC Opcode
- Translators for DS_MAXMIN_U32 Opcodes

* Clang Format & UNREACHABLE_MSG

* clang

* no crash on compile

* clang

* Shared Atomics

* reuse

* rm vscode

* resolve

* opcodes

* side effects

* attempt fix shader comp

* failed attempt to fix

* clang

* do correct vdata set (still fails)

* clang

* fixed BUFFER_ATOMIC_ADD, DS_ADD_U32 fails

* data share should work

* clang

* resource tracking for buffer atomic

* clang

* distinguish RTN opcodes

* clean IsBufferInstruction

---------

Co-authored-by: microsoftv <6063922+microsoftv@users.noreply.github.com>
2024-08-17 22:06:06 +03:00
TheTurtle
1d1c88ad31
control_flow_graph: Initial divergence handling (#434)
* control_flow_graph: Initial divergence handling

* cfg: Handle additional case

* spirv: Handle tgid enable bits

* clang format

* spirv: Use proper format

* translator: Add more instructions
2024-08-16 20:05:37 +03:00
psucien
9adc638220
shader_recompiler: basic implementation of BUFFER_STORE_FORMAT_ (#431)
* shader_recompiler: basic implementation of buffer store w\ fmt conversion

* added `Format16` dfmt
2024-08-15 00:15:07 +02:00
TheTurtle
d332a5e611
spirv: Simplify shared memory handling (#427)
* spirv: Simplify shared memory handling

* spirv: Ignore clip plane

* spirv: Fix image offsets

* ir_pass: Implement shared memory lowering pass

* NVIDIA doesn't like using shared mem in fragment shader and softlocks driver

* spirv: Add log for ignoring pos1
2024-08-14 19:01:17 +03:00
TheTurtle
d8b9d82ffa
video_core: Various fixes (#423)
* video_core: Various fixes

* clang format
2024-08-13 20:05:10 +03:00
TheTurtle
1fb0da9b89
video_core: Crucial buffer cache fixes + proper GPU clears (#414)
* translator: Use templates for stronger type guarantees

* spirv: Define buffer offsets upfront

* Saves a lot of shader instructions

* buffer_cache: Use dynamic vertex input when available

* Fixes issues when games like dark souls rebind vertex buffers with different stride

* externals: Update boost

* spirv: Use runtime array for ssbos

* ssbos can be large and typically their size will vary, especially in generic copy/clear cs shaders

* fs: Lock when doing case insensitive search

* Dark Souls does fs lookups from different threads

* texture_cache: More precise invalidation from compute

* Fixes unrelated render targets being cleared

* texture_cache: Use hashes for protect gpu modified images from reupload

* translator: Treat V_CNDMASK as float

* Sometimes it can have input modifiers. Worst this will cause is some extra calls to uintBitsToFloat and opposite. But most often this is used as float anyway

* translator: Small optimization for V_SAD_U32

* Fix review

* clang format
2024-08-13 09:21:48 +03:00
Vinicius Rangel
dfcfd62d4f
spirv: fix image sample lod/clamp/offset translation (#402)
* spirv: fix image sample lod/clamp translation

* spirv: fix image sample offsets

* fix ImageSample opcodes & offset emission
2024-08-13 09:12:38 +03:00
TheTurtle
381ba8c7a5
video_core: Implement guest buffer manager (#373)
* video_core: Introduce buffer cache

* video_core: Use multi level page table for caches

* renderer_vulkan: Remove unused stream buffer

* fix build

* oops forgot optimize off
2024-08-08 15:02:10 +03:00