psucien
3fbb68048e
shader_recompiler: frontend: SOPC and SOPK handling separated; more cmp opcodes (#634)
2024-08-28 22:27:47 +02:00
0xsegf4ult
9f4e55a8e7
shader_recompiler: constant propagation bitwise operations + S_CMPK_EQ_U32 fix (#613)
...
* rebase on main branch impl of V_LSHL_B64
* remove V_LSHR_B64
* fix S_CMPK_EQ_U32
* fix conflicts
* fix broken merge
* remove duplicate cases
* remove duplicate declaration
2024-08-28 13:10:21 +03:00
Grégoire Hage
288db9a0cf
Implement V_LSHL_B64 (#608)
2024-08-27 14:15:32 +03:00
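For orientation, the 64-bit left shift implemented in this commit can be modelled on the CPU roughly as follows. This is a minimal sketch and not the recompiler's code: the names `VLshlB64` and `CheckVLshlB64` are hypothetical, and it assumes the shift count comes from the low 6 bits of the second operand, following the GCN ISA description `D.u64 = S0.u64 << S1.u[5:0]`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical reference model of V_LSHL_B64 (not the emulator's translator).
// Assumption: only bits [5:0] of s1 are used as the shift amount.
inline std::uint64_t VLshlB64(std::uint64_t s0, std::uint32_t s1) {
    const std::uint32_t shift = s1 & 63u; // mask keeps the shift below 64
    return s0 << shift;
}

// Small self-check showing the effect of the masking.
inline void CheckVLshlB64() {
    assert(VLshlB64(1, 4) == 16);
    assert(VLshlB64(1, 64) == 1); // 64 & 63 == 0, so nothing is shifted
}
```

Masking the count also keeps the C++ shift well defined, since shifting a 64-bit value by 64 or more is undefined behaviour.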
Lizardy
aae6e5be73
shader_recompiler: BUFFER_ATOMIC_SWAP Opcode (#566)
...
* shader_recompiler: BUFFER_ATOMIC_SWAP Opcode
* clang
* follow 32 convention
---------
Co-authored-by: microsoftv <6063922+microsoftv@users.noreply.github.com>
2024-08-26 15:21:20 +03:00
greggameplayer
86870e7c8d
Implement TBUFFER_STORE_FORMAT_XY
2024-08-26 03:39:38 +02:00
DanielSvoboda
2a737d0800
V_NOP | PfpSyncMe | S_CMPK_EQ_U32 (#426)
...
* V_NOP
V_NOP = Do nothing
* PfpSyncMe
PfpSyncMe ensures that all previous commands are completed before continuing.
'break' should be enough for now
* S_CMPK_EQ_U32
S_CMPK_EQ_U32
SCC = (D.u == SIMM16)
* S_CMPK_EQ_U32
* OperandField::Undefined:
* Update translate.cpp
remove OperandField::Undefined:
* Update image_view.cpp
[Render.Vulkan] <Error> image_view.cpp:ImageViewInfo:109: Storage image (num_comps = 4) requires swizzling [BGRA]
format 43 dst_sel 3886
* Update liverpool_to_vk.cpp
* S_CMPK_EQ_U32
* S_CMPK_EQ_U32
2024-08-25 22:07:46 +02:00
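The rule quoted in this commit, `SCC = (D.u == SIMM16)`, can be sketched as a plain function. This is an illustration under stated assumptions, not the translator's implementation: `SCmpkEqU32` and `CheckSCmpkEqU32` are hypothetical names, and the sketch assumes the 16-bit immediate is zero-extended for the unsigned compare.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical reference model of S_CMPK_EQ_U32.
// Assumption: SIMM16 is zero-extended to 32 bits before the comparison.
inline bool SCmpkEqU32(std::uint32_t d, std::uint16_t simm16) {
    return d == static_cast<std::uint32_t>(simm16);
}

// Small self-check: the upper 16 bits of D take part in the comparison.
inline void CheckSCmpkEqU32() {
    assert(SCmpkEqU32(0x0000FFFFu, 0xFFFF));
    assert(!SCmpkEqU32(0xFFFFFFFFu, 0xFFFF));
}
```

Under that assumption, a register holding 0xFFFFFFFF does not compare equal to an immediate of 0xFFFF, which is where a sign-extending implementation would differ.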
psucien
b687ae5e34
GnmDriver: Clear context support (#567)
...
* gnmdriver: added support for gpu context reset
* shader_recompiler: minor validation fixes
* shader_recompiler: added `V_CMPX_GT_I32`
* shader_recompiler: fix for crash on inline sampler access
* compilation warnings and dead code elimination
* amdgpu: fix for registers addressing
* libraries: videoout: reduce logging pressure
* shader_recompiler: fix for divergence scope detection
2024-08-25 23:01:05 +03:00
TheTurtle
c79b10edc1
video_core: Bloodborne stabilization pt1 (#543)
...
* shader_recompiler: Writelane elimination pass + null image fix
* spirv: Implement image derivatives
* texture_cache: Reduce page bit size
* clang format
* slot_vector: Back to debug assert
* vk_graphics_pipeline: Handle null tsharp
* spirv: Revert some change
* vk_instance: Support primitive restart on list topology
* page_manager: Adjust windows exception handler
* clang format
* Remove subres tracking
* Will be done separately
2024-08-24 22:51:47 +03:00
Vinicius Rangel
9e4fc17e6c
shader_recompiler: handle fetch shader address offsets (#538)
...
* shader_recompiler: handle fetch shader address offsets
parse index & offset sgpr from fetch shader and propagate them to vkBindVertexBuffers
* shader_recompiler: fix fetch_shader when offset is not present
* video_core: propagate index/offset SGPRs to vkCmdDraw instead of offsetting the buffer address
* video_core: add vertex_offset to non-indexed draw calls
renamed fetch offset fields
2024-08-24 17:36:40 +02:00
Xphalnos
d4be3dbb31
Lots of small fixes
2024-08-22 18:01:30 +02:00
Lizardy
63938ba8dd
shader_recompiler: BUFFER_ATOMIC & DS_* Opcodes (#428)
...
* BUFFER_ATOMIC | DS_MINMAX_U32
- Emission of BufferAtomicU32
- Addition of Buffer opcodes to IR
- Translator for BUFFER_ATOMIC Opcode
- Translators for DS_MAXMIN_U32 Opcodes
* Clang Format & UNREACHABLE_MSG
* clang
* no crash on compile
* clang
* Shared Atomics
* reuse
* rm vscode
* resolve
* opcodes
* side effects
* attempt fix shader comp
* failed attempt to fix
* clang
* do correct vdata set (still fails)
* clang
* fixed BUFFER_ATOMIC_ADD, DS_ADD_U32 fails
* data share should work
* clang
* resource tracking for buffer atomic
* clang
* distinguish RTN opcodes
* clean IsBufferInstruction
---------
Co-authored-by: microsoftv <6063922+microsoftv@users.noreply.github.com>
2024-08-17 22:06:06 +03:00
TheTurtle
1d1c88ad31
control_flow_graph: Initial divergence handling (#434)
...
* control_flow_graph: Initial divergence handling
* cfg: Handle additional case
* spirv: Handle tgid enable bits
* clang format
* spirv: Use proper format
* translator: Add more instructions
2024-08-16 20:05:37 +03:00
psucien
9adc638220
shader_recompiler: basic implementation of BUFFER_STORE_FORMAT_ (#431)
...
* shader_recompiler: basic implementation of buffer store w/ fmt conversion
* added `Format16` dfmt
2024-08-15 00:15:07 +02:00
TheTurtle
d332a5e611
spirv: Simplify shared memory handling (#427)
...
* spirv: Simplify shared memory handling
* spirv: Ignore clip plane
* spirv: Fix image offsets
* ir_pass: Implement shared memory lowering pass
* NVIDIA doesn't like using shared mem in fragment shader and softlocks driver
* spirv: Add log for ignoring pos1
2024-08-14 19:01:17 +03:00
TheTurtle
d8b9d82ffa
video_core: Various fixes (#423)
...
* video_core: Various fixes
* clang format
2024-08-13 20:05:10 +03:00
TheTurtle
1fb0da9b89
video_core: Crucial buffer cache fixes + proper GPU clears (#414)
...
* translator: Use templates for stronger type guarantees
* spirv: Define buffer offsets upfront
* Saves a lot of shader instructions
* buffer_cache: Use dynamic vertex input when available
* Fixes issues when games like dark souls rebind vertex buffers with different stride
* externals: Update boost
* spirv: Use runtime array for ssbos
* ssbos can be large and typically their size will vary, especially in generic copy/clear cs shaders
* fs: Lock when doing case insensitive search
* Dark Souls does fs lookups from different threads
* texture_cache: More precise invalidation from compute
* Fixes unrelated render targets being cleared
* texture_cache: Use hashes to protect GPU-modified images from reupload
* translator: Treat V_CNDMASK as float
* Sometimes it can have input modifiers. At worst this causes some extra calls to uintBitsToFloat and the reverse conversion, but most often it is used as float anyway
* translator: Small optimization for V_SAD_U32
* Fix review
* clang format
2024-08-13 09:21:48 +03:00
Vinicius Rangel
dfcfd62d4f
spirv: fix image sample lod/clamp/offset translation (#402)
...
* spirv: fix image sample lod/clamp translation
* spirv: fix image sample offsets
* fix ImageSample opcodes & offset emission
2024-08-13 09:12:38 +03:00
TheTurtle
381ba8c7a5
video_core: Implement guest buffer manager (#373)
...
* video_core: Introduce buffer cache
* video_core: Use multi level page table for caches
* renderer_vulkan: Remove unused stream buffer
* fix build
* oops forgot optimize off
2024-08-08 15:02:10 +03:00
TheTurtle
159be2c7f4
video_core: Minor fixes (#366)
...
* data_share: Fix DS instruction
* vk_graphics_pipeline: Fix unnecessary invalidate
* spirv: Remove subgroup id
* vector_alu: Simplify mbcnt pattern
* shader_recompiler: More instructions
* clang format
* kernel: Fix cond memory leak and reduce spam
* liverpool: Print error on exception
* build fix
2024-08-05 13:45:28 +03:00
TheTurtle
a7c9bfa5c5
shader_recompiler: Small instruction parsing refactor/bugfixes (#340)
...
* translator: Implement f32 to f16 convert
* shader_recompiler: Add bit instructions
* shader_recompiler: More data share instructions
* shader_recompiler: Remove exec contexts, fix S_MOV_B64
* shader_recompiler: Split instruction parsing into categories
* shader_recompiler: Better BFS search
* shader_recompiler: Constant propagation pass for cmp_class_f32
* shader_recompiler: Partial readfirstlane implementation
* shader_recompiler: Stub readlane/writelane only for non-compute
* hack: Fix swizzle on RDR
* Will properly fix this when merging this
* clang format
* address_space: Bump user area size to full
* shader_recompiler: V_INTERP_MOV_F32
* Should work the same as spirv will emit flat decoration on demand
* kernel: Add MAP_OP_MAP_FLEXIBLE
* image_view: Attempt to apply storage swizzle on format
* vk_scheduler: Barrier attachments on renderpass end
* clang format
* liverpool: cs state backup
* shader_recompiler: More instructions and formats
* vector_alu: Proper V_MBCNT_U32_B32
* shader_recompiler: Port some dark souls things
* file_system: Implement sceKernelRename
* more formats
* clang format
* resource_tracking_pass: Back to assert
* translate: Tracedata
* kernel: Remove tracy lock
* Solves random crashes in Dark Souls
* code: Review comments
2024-07-30 23:32:40 +02:00
Vinicius Rangel
680192a0c4
64 bits OP, impl V_ADDC_U32 & V_MAD_U64_U32 (#310)
...
* impl V_ADDC_U32 & V_MAD_U64_U32
* shader recompiler: add 64 bits version to get register / GetSrc
* fix V_ADDC_U32 carry
* shader recompiler: removed automatic conversion to force_flt in GetSrc
* shader recompiler: auto cast between u32 and u64 during ssa pass
* shader recompiler: fix SetVectorReg64 & standardize switch-case
* shader translate: fix overflow detection in V_ADD_I32
use vcc lo instead of vcc thread bit
* shader recompiler: more 64-bit work
- removed bit_size parameter from Get[Scalar/Vector]Register
- add BitwiseOr64
- add SetDst64 as a replacement for SetScalarReg64 & SetVectorReg64
- add GetSrc64 for 64-bit value
* shader recompiler: add V_MAD_U64_U32 vcc output
- add V_MAD_U64_U32 vcc output
- ILessThan for 64-bits
* shader recompiler: removed unnecessary changes & missing consts
* shader_recompiler: Add s64 type in constant propagation
2024-07-27 17:23:59 +03:00
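The two instructions named in this commit both produce a carry bit alongside their result. A minimal CPU-side sketch is given below, assuming the usual GCN semantics (`V_ADDC_U32`: `D.u = S0.u + S1.u + carry-in`, carry-out to VCC; `V_MAD_U64_U32`: `D.u64 = S0.u32 * S1.u32 + S2.u64`, carry-out to VCC). All type and function names here are hypothetical and the per-lane VCC handling is reduced to a single bool.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result types: a value plus the carry bit that would go to VCC.
struct AddcResult {
    std::uint32_t value;
    bool carry_out;
};
struct Mad64Result {
    std::uint64_t value;
    bool carry_out;
};

// Sketch of V_ADDC_U32 for one lane: add with carry-in, report carry-out.
inline AddcResult VAddcU32(std::uint32_t s0, std::uint32_t s1, bool carry_in) {
    const std::uint64_t wide = std::uint64_t{s0} + s1 + (carry_in ? 1u : 0u);
    return {static_cast<std::uint32_t>(wide), (wide >> 32) != 0};
}

// Sketch of V_MAD_U64_U32 for one lane: 32x32 -> 64 multiply, then 64-bit add.
inline Mad64Result VMadU64U32(std::uint32_t s0, std::uint32_t s1, std::uint64_t s2) {
    const std::uint64_t product = std::uint64_t{s0} * s1; // cannot overflow 64 bits
    const std::uint64_t sum = product + s2;               // may wrap around
    return {sum, sum < product};                          // wrap means carry-out
}

// Small self-check of the carry behaviour.
inline void CheckCarryOps() {
    assert(VAddcU32(0xFFFFFFFFu, 0u, true).carry_out);
    assert(!VMadU64U32(0xFFFFFFFFu, 0xFFFFFFFFu, 0u).carry_out);
}
```

Widening to 64 bits for the add and comparing the wrapped sum against the product are two common ways to recover a carry without compiler intrinsics.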
DanielSvoboda
b2ba84aa11
BUFFER_STORE_DWORDX2
2024-07-26 00:25:29 -03:00
squidbus
6a6d5bad42
Fix off-by-one bug with user data registers.
2024-07-21 22:36:12 +03:00
IndecisiveTurtle
cd009cfec6
shader_recompiler: Normal gathers
2024-07-17 16:49:45 +03:00
Vladislav Mikhalin
f9e96793cc
Implemented load_buffer_format_* conversions (#295)
...
* Implemented load_buffer_format_* conversions
* clang-format insists on ugly things
2024-07-16 15:03:07 +03:00
IndecisiveTurtle
58d1cbd9b7
ssa_rewrite_pass: Correct phi node type for thread bitmask
2024-07-15 13:34:34 +03:00
georgemoralis
b4df90d8e4
Merge pull request #292 from shadps4-emu/games/00144
...
Missing graphics features for flOw & Flower
2024-07-14 23:07:46 +03:00
psucien
f041276b04
recompiler: added support for discard on export with masked EXEC
2024-07-13 14:57:01 +02:00
Daniel R
83c8204d23
shader_recompiler/frontend: Implement opcodes (#289)
...
`S_ASHR_I32` and `BUFFER_LOAD_DWORD`.
2024-07-13 12:37:25 +03:00
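As a rough illustration of the scalar shift named in this commit, `S_ASHR_I32` can be modelled as an arithmetic right shift that also sets SCC. This sketch assumes the GCN description (`D.i = signext(S0.i) >> S1.u[4:0]`, with SCC set when the result is non-zero); `ScalarResult`, `SAshrI32` and `CheckSAshrI32` are hypothetical names, not identifiers from the repository.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type: destination value plus the SCC flag.
struct ScalarResult {
    std::int32_t value;
    bool scc;
};

// Sketch of S_ASHR_I32. Assumption: only bits [4:0] of s1 select the shift.
inline ScalarResult SAshrI32(std::int32_t s0, std::uint32_t s1) {
    const std::uint32_t shift = s1 & 31u;
    // Shift the complement for negative inputs so the code does not rely on
    // how the compiler shifts negative values (guaranteed only since C++20).
    const std::int32_t value = s0 < 0 ? ~(~s0 >> shift) : (s0 >> shift);
    return {value, value != 0};
}

// Small self-check: the sign is preserved and SCC tracks a non-zero result.
inline void CheckSAshrI32() {
    assert(SAshrI32(-8, 1).value == -4);
    assert(!SAshrI32(1, 1).scc);
}
```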
psucien
1b94f07a6a
recompiler: proper VS inputs initialization
2024-07-13 01:00:24 +02:00
Vinicius Rangel
94d1a6b0b9
impl V_CMP_CLASS_F32 common filter masks (#276)
2024-07-10 02:24:01 +03:00
DanielSvoboda
63b0465a33
add V_MAD_U32_U24 (#262)
...
* V_MAD_U32_U24
* adjust V_MAD_I32_I24 for bit extraction
* optional bit extraction parameter
* Update vector_alu.cpp
* clang-format
* Update src/shader_recompiler/frontend/translate/vector_alu.cpp
Co-authored-by: TheTurtle <47210458+raphaelthegreat@users.noreply.github.com>
* Update vector_alu.cpp
* Update translate.h
---------
Co-authored-by: TheTurtle <47210458+raphaelthegreat@users.noreply.github.com>
2024-07-09 01:35:01 +03:00
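The "bit extraction" mentioned in this commit refers to the 24-bit operands of the MAD instructions. A minimal sketch follows, assuming the GCN semantics `D.u = S0.u[23:0] * S1.u[23:0] + S2.u` for the unsigned form and sign-extended 24-bit operands for the signed form. The function names are hypothetical and the code does not mirror the optional parameter added in `vector_alu.cpp`.

```cpp
#include <cassert>
#include <cstdint>

// Zero-extend the low 24 bits of an operand.
inline std::uint32_t ExtractU24(std::uint32_t x) {
    return x & 0xFFFFFFu;
}

// Sign-extend the low 24 bits, kept in an unsigned type (two's complement).
inline std::uint32_t ExtractI24(std::uint32_t x) {
    return (ExtractU24(x) ^ 0x800000u) - 0x800000u;
}

// Sketch of V_MAD_U32_U24: 24-bit unsigned multiply, 32-bit add, low 32 bits kept.
inline std::uint32_t VMadU32U24(std::uint32_t s0, std::uint32_t s1, std::uint32_t s2) {
    return ExtractU24(s0) * ExtractU24(s1) + s2;
}

// Sketch of V_MAD_I32_I24: same shape with sign-extended 24-bit operands.
// Unsigned arithmetic gives the same low 32 bits and avoids signed overflow.
inline std::uint32_t VMadI32I24(std::uint32_t s0, std::uint32_t s1, std::uint32_t s2) {
    return ExtractI24(s0) * ExtractI24(s1) + s2;
}

// Small self-check: bits above 23 are ignored; 0xFFFFFF is -1 when signed.
inline void CheckMad24() {
    assert(VMadU32U24(0xFF000002u, 0x01000003u, 4u) == 10u);
    assert(VMadI32I24(0x00FFFFFFu, 2u, 0u) == 0xFFFFFFFEu);
}
```

The only difference between the two forms is how the 24-bit operands are widened, which is why a single translator function with an extraction flag can plausibly serve both.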
Stolas
2620919f0b
Added Legacy Min/Max ops (#266)
...
* Forwarding V_MAX_LEGACY_F32 to V_MAX3_F32. Fixes Translation error in Geometry Wars 3.
* Forwarded to correct op
* Implemented Legacy Max/Min using NMax/NMin
* Added extra argument to Min/Max op codes
* Removed extra translator functions, replaced with bool
* Formatting
2024-07-08 12:24:12 +03:00
psucien
19c85c78cf
recompiler: switch instance data to storage buffers
2024-07-07 13:08:39 +02:00
psucien
bf4bf4ccb2
recompiler: fix for gather4 components return
2024-07-07 13:00:52 +02:00
psucien
cfbe8b9e6d
renderer: added support for instance step rates
2024-07-06 18:03:43 +02:00
TheTurtle
38080b60af
shader_recompiler: Check usage before enabling capabilities (#245)
...
* vk_instance: Better feature check
* shader_recompiler: Make most features optional
* vk_instance: Bump extension vector size
* resource_tracking_pass: Perform BFS for sharp tracking
* The Witness triggered this
2024-07-06 02:42:16 +03:00
TheTurtle
6ceab6dfac
shader_recompiler: Implement most integer image atomics, workgroup barriers and shared memory load/store (#231)
...
* shader_recompiler: Add LDEXP
* shader_recompiler: Add most image integer atomic ops
* shader_recompiler: Implement shared memory load/store
* shader_recompiler: More image atomics
* externals: Update sirit
* clang format
* cmake: Add missing files
* shader_recompiler: Fix some atomic bugs
* shader_recompiler: Vs outputs
* shader_recompiler: Shared mem has side-effects, fix format component order
* shader_recompiler: Inline constant buffer impl
* video_core: Fix regressions
* Work
* Fixup a few things
2024-07-05 00:15:44 +03:00
IndecisiveTurtle
a603bc7d88
shader_recompiler: More instructions
2024-07-01 22:42:45 +03:00
IndecisiveTurtle
7d4f0da40e
video_core: Fix some regressions
2024-07-01 18:26:22 +03:00
IndecisiveTurtle
6774216038
shader_recompiler: Apply buffer swizzle on vertex attribs
2024-07-01 13:56:14 +03:00
IndecisiveTurtle
22b930ba5e
video_core: Track renderpass scopes properly
2024-07-01 13:56:14 +03:00
IndecisiveTurtle
ad10020836
video_core: Fix a few problems
2024-07-01 13:56:14 +03:00
IndecisiveTurtle
5da79d4798
spirv: Add fragdepth and implement image query
2024-07-01 13:56:14 +03:00
georgemoralis
0ada442cbc
Stabilization8 (#218)
...
* disable configured flexible memory size (caused issues in some games)
* fixed case S_OR_B64 for blazing chrome
* submodules updates and fixes for latest SDL
* stubbed _sigprocmask (not handled and spams too much)
* added ReplaceOp case in Stencilop
* dummy ajm module added
2024-06-27 16:37:17 +03:00
IndecisiveTurtle
c8ed338d5a
kernel: Const correctness
2024-06-26 18:24:06 +03:00
IndecisiveTurtle
c081663aac
translator: Merge ANDN2 with AND and impl ORN2
2024-06-26 18:16:01 +03:00
IndecisiveTurtle
4846704832
shader_recompiler: More instructions and fix for swords of ditto
2024-06-26 18:03:09 +03:00
psucien
cb6b21de1f
Initial instancing and asynchronous compute queues (#207)
...
* gnm_driver: added `sceGnmRegisterOwner` and `sceGnmRegisterResource`
* video_out: `sceVideoOutGetDeviceCapabilityInfo` for sdk runtime
* gnm_driver: correct vqid index range
* amdgpu: indirect buffer, release mem and some additional irq modes
* amdgpu: added ASC commands processor
* shader_recompiler: added support for fetch instance id
* amdgpu: classic bitfields for T# representation (debugging experience)
* renderer_vulkan: skip zero sized VBs from binding
* texture_cache: image upload logic moved into `Image` object
* gnm_driver: `sceGnmDingDong` implementation
* texture_cache: `Image` usage flags moved; correct VO buffer pitch
2024-06-22 19:50:20 +03:00
georgemoralis
32225f4a8b
more clang fix
2024-06-22 18:15:42 +03:00