* Organize settings and fix defaults
setDefaultValues was missing several rather important config options, and some of the defaults were inaccurate.
* Set alternative settings based on defaults
Reduces how many times we redefine the defaults for each setting.
I avoided changing behavior for installDirs and installDirsEnabled because I don't want to introduce any changes I can't easily test.
* Run save after loading to fill in missing config entries
A fix for the growing complaints about config files not updating when new settings are added.
Thanks to the prior changes, default settings are new properly defined everywhere, so running save here will properly fill in missing values with their expected defaults instead of the weird settings load would sometimes use.
* Clang
* Only update config when necessary
Instead of updating the config on each emulator boot, calculate the number of entries stored in the config and compare to a hardcoded constant.
If the config doesn't have the right number of entries, then regenerate it.
* Clang
* Clang
* buffer_cache: Fix various thread races on data upload and invalidation
* memory_tracker: Improve locking more on invalidation
---------
Co-authored-by: georgemoralis <giorgosmrls@gmail.com>
* vector_alu: Improve handling of mbcnt append/consume patterns
The existing implementation was written to handle a single pattern of mbcnt before the DS_APPEND instruction
v_mbcnt_hi_u32_b32 vX, exec_hi, 0
v_mbcnt_lo_u32_b32 vX, exec_lo, vX
ds_append vY offset:4 gds
v_add_i32 vX, vcc, vY, vX
In this case however the DS_APPEND is before the mbcnt pattern
ds_append vX gds
v_mbcnt_hi_u32_b32 vY, exec_hi, vX
v_mbcnt_lo_u32_b32 vZ, exec_lo, vY
The mbcnt instructions are always in pairs of hi/lo and in general are quite flexible. But they assume the subgroup size is 64 so they are not recompiled literally. Together with DS_APPEND they are used to derive a unique per thread index in a buffer (different from using thread_id as order could be random). DS_APPEND instruction works on per subgroup level, by adding number of active threads of subgroup to the GDS counter, essentially giving a multiple-of-64 base index to all threads. Then each thread executes the mbcnt pair which returns the number of active threads with id less than the itself and adds it with the base.
The recompiler translates DS_APPEND into an atomic increment of a storage buffer counter, which already gives the desired unique index, so this pattern is a no-op. On main it was set to zero as per the first pattern to avoid altering the DS_APPEND result. The new handling passes through the initial value of the pattern instead, which has the same effect but works on either case.
* vk_rasterizer: Always sync DMA buffers
* Added V_CMP_EQ_U64 shader opcode support and added 64-bit relational operators (<,>,<=,>=)
* Fixed clang-format crying because I typed xargs clang-format instead of xargs clang-format-19
* Replaced V_CMP_EQ_U64 code to match V_CMP_U32 to test
* Updated V_CMP_U64 for future addons
* Remapping GUI V2 - initial commit
* Unmap button with escape key
* Allow combination inputs
* Use separate class for SDL event signals so that i can work with the SDL window event loop
* Automatically pause game when GUI open to better manage event queue
* Move sd;_gamepad_added event from remap object to GUI object to avoid conflicts with sdl window
* Use signals on button/trigger to release to make GUI more responsive
* pause game while KBM window is open for consistency
* don't check gamepad when game is running to avoid conflicts
* Block all other sdl events instead of pausing game, automatic parse inputs after saving
* Don't block window restored or window exposed cases
* Properly exit event loop thread on exit
* Changed symbol bindings to their names
* Fixed kakposfos' requests on GitHub
* Fine, I'll do it myself.
---------
Co-authored-by: kalaposfos13 <153381648+kalaposfos13@users.noreply.github.com>
Older games aren't fond of how our sceVideodec2GetPictureInfo implementation outputs AVC picture info after the struct size increase.
Adding the old struct, and additional code using it for these games works around this problem.
* shader_recompiler: Simplify dma types
Only U32 is needed for S_LOAD_DWORD
* shader_recompiler: Perform address shift on IR level
Buffer instructions now expect address in the data unit they work on. Doing the shift on IR level will allow us to optimize some operations away on common case
* shader_recompiler: Optimize common buffer access pattern
* emit_spirv: Use 32-bit integer ops for fault buffer
Not many GPUs have 8-bit bitwise or operations so that would probably require some overhead to emulate from the driver
* resource_tracking_pass: Fix texel buffer shift
* Favorites in the game list (#2649)
Changed how favorites are saved to match PR #2984. Adjusted the favorite
icon size. Fixed bug where favorites were inconsistent when changing to
list mode. Instantly sort list when adding or removing a favorite.
Co-authored-by: David Antunes <david.f.antunes@tecnico.ulisboa.pt>
* fix formatting
* Favorites in the game list (#2649)
Fixed issue where background change was inconsistent while adding
favorites, unselect row when adding favorites, cleaned code, changed
right click menu options to match the game's favorite status.
* fixed right click bug
* keep row selection when adding favorites
* fixed sorting on game grid after using search bar
* change the way favorites are saved to match #3119
* Validate requested dmem range in MapMemory
Handles a rare edge case that only comes up when modding Driveclub
* Specify type
auto has failed us once again.
* Types cleanup
Just some basic tidying up.
* Clang
* fixed nonload issues with background music (#3094)
* Fixing my pull request branch
* Pull request change part 2
* Continued changes to project and altered kbm_help_dialog.h text to QStringLiterals
* Finalized commit and changed kbm_help_dialog.h
* KBM Input Bug Fixes / Added Binds
Fixed input issues where some inputs would not bind when pressing (side mouse buttons, some symbols, etc). Also, fixed up code formatting in altered files (removed C-style casts and replaced with C++ <static_casts>, added a few macros and one member functions).
This is v2 of my commit, addressing all issues brought up by @kalaposfos
* Updated C-style casts in kbm_gui.cpp
* Fixed formatting from clang-format
* Updated expendable sections location and changed order of appearance
* Merged PR #3098 into kbm_gui.cpp
* Updates from running clang-format
* Potential MacOS error fix
Changes std::string to std::string_view, which prevented MacOS from building
* Undid MacOS commit for new PR
* Revert "Undid MacOS commit for new PR"
This reverts commit fc376c5e1f.
* Updated SDL_INVALID_ID=UINT32_MAX macro to SDL_UNMAPPED=UINT32_MAX-1
* Update from merge conflicts
Updated SDL_INVALID_ID=UINT32_MAX macro to SDL_UNMAPPED=UINT32_MAX-1
* FIxed memory.cpp errors from testing PR #3117 (MacOS fixes)
* Removed "kp;"
* Fixed help dialogue from kalaposfos' changes
Fixed 3 edits made by kalaposfos from a recent commit.
---------
Co-authored-by: georgemoralis <giorgosmrls@gmail.com>
* Rework framework to allow for more types of mouse-to-something emulation and hook up gyro to it
* Remove the unnecessary null check now that deltatime is handled differently
* Fix toggle key
* Basic gyro emulation working for two out of the three dimensions
* clang
* Added bindable key to hold for switching from looking to the sides to rolling
* documentation
* Bit array test
* Some corrections
* Fix AVX path on SetRange
* Finish bitArray
* Batched protect progress
* Inclusion fix
* Last logic fixes for BitArray
* Page manager: batch protect, masked ranges
* Page manager bitarray
* clang-format
* Fix out of bounds read
* clang
* clang
* Lock during callbacks
* Rename untracked to writeable
* Construct and mask in one step
* Sync on region mutex for thw whole protection
This is a temporary workarround until a fix is found for the page manager having issues when multiple threads update the same page at the same time.
* Bring back the gpu masking until properly handled
* Sync page manager protections
* clang-format
* Rename and fixups
* I fucked up clang-formatting one more time...
* kek
* added recentFiles save/load
* gui language
* fixups for language
* fixed language issue with savedata (it was saving based on gui language and not on console language)
* clang fix
* elf dirs added
* added theme
* Fix rename
We shouldn't be leaving a copy of the original filename laying around.
This fixes one of a few broken savedata checks in DRAGON BALL XENOVERSE (CUSA01341)
* sceKernelWrite hack
Seems like std::fwrite has some weird edge cases we aren't handling properly.
Until we get to the bottom of this issue, here's a hack that bypasses it.
This fixes saves in DRAGON BALL XENOVERSE (CUSA01341)
* hack fix
* Improved "hack"
* Fix rename for Windows users
Turns out, we're using copy instead of rename for a reason, and that same reason came up when adding the remove call.
Also adds a log for the sceKernelWrite issue, since that's definitely a hack that needs to be debugged.
* A real fix for the sceKernelWrite issue
Turns out, some data was just buffered.
Running Flush fixes that problem.
* Move fflush call to WriteRaw
To prevent future cases of this issue.
* If duplicate unique inputs found, show which buttons have duplicates
* remove duplicate button from list
* cleanup
* Set clang-format off for long translatable string
* initial save classes for gui save file
* fixup
* some more settings passed to the new saving file
* even more variables parsing
* more settings
* fixup
* more settings
* more settings
* clang fix
* fixed wrong setting
* more setting
* more setting
* added ca_ES
* rename to general_settings
* added set-addon-folder in main
* fixup
* fixup2
* added sr_CS
* Update CMakeLists.txt
---------
Co-authored-by: kalaposfos13 <153381648+kalaposfos13@users.noreply.github.com>
Neither sceVideodec2Decode or sceVideodec2Flush should be modifying the output's `thisSize`, doing so breaks older games now that we have the updated structs.
We should also only set frameFormat and framePitchInBytes if the game inputted the newer struct, since otherwise we're modifying memory the game never gave us.
These changes might fix the regression in Hatsune Miku Project Diva X, though it's hard to tell due to some weird caching issue with Windows, and the ancient regression this game had on Linux.
* Merge dmem areas
* Fix DirectMemoryArea::CanMergeWith
Don't merge dmem areas if the memory types are different.
* Reduce some warnings to info
Both functions should behave properly now, there's no reason to warn about their use.
* Clang
* Change default context and handle values
libSceNpToolkit internally uses context/handle values of zero to indicate NpTrophy calls failed.
This PR returns handle/context as index + 1 instead, avoiding this issue.
* Fix log message
* Update file_system.cpp
* libSceVideodec2 struct fixes
Our code was based on an old version of the libSceVideodec2 library. Based on what I've decompiled, these structs changed somewhere around firmware 6.50, and newer versions of the library have these flexible checks to accommodate both variants of the structs.
* Static assert for AvcPictureInfo struct
All the other Videodec2 structs have static asserts, might as well use one here too.
* Initialize new values
Set proper values for frameFormat and framePitchInBytes.
`frame->linesize[0]` appears to be in bytes already, I'm not sure if that means framePitch is being set wrong though.
* texture_cache: Avoid gpu tracking assert on sparse image
At the moment just take the easy way of creating the entire image normally and uploading unmapped subresources are zero
* tile_manager: Downgrade assert to error
* fix macos
* pixel_format: Remove unused tables, refactor
* host_compatibilty: Cleanup and support uncompressed views of compressed formats
* texture_cache: Handle compressed views of uncompressed images
* tile_manager: Bump max supported mips to 16
Fixes a crash during start
* oops
* texture_cache: Fix order of format compat check
* More descriptive sceKernelMapNamedFlexibleMemory logging
* Misc exports
These functions are used by Overwatch: Origins Edition
* Clang
* Function parameter cleanup
Changes the parameters on our sceKernelMapNamedFlexibleMemory and sceKernelMapFlexibleMemory functions to better align with our current standards.
* posix_pthread_attr_getschedparam
* Fixes for scePthreadGetAffinity and scePthreadSetAffinity
Looking at FreeBSD source, and our other pthread functions, we should be using our FindThread function to get the appropriate thread if thread != g_curthread.
* Swap do-while to while
If we use a do-while loop, we waste time if `aligned_size = 0`. This is also still accurate to FreeBSD behavior, where it returns success if `start == end` during mprotect.
This also effectively prevents the memory assert seen in updated versions of RESIDENT EVIL 2 (CUSA09193)
* Move prot validation outside loop
The prot variable shouldn't change during a mprotect call, so we can check the flags before protecting instead.
Also cleans up the code for prot validation.
This should improve performance, and is more accurate to FreeBSD code.
* Add logging for protect calls
This will help in debugging future problems
* Change touchpad handling and orientation calculation
* remove unnecessary includes in pad.cpp
* remove the cmake command arguments
* remove the weird file
* try to fix formatting
* limit new gyro and touchpad logic to controller 1
* remove cout
* fix formatting and add the handle check to scePadRead
* swap y and z back
* Only perform GPU memory mapping when GPU can access it
This better aligns with hardware observations, and should also speed up unmaps and decommits, since they don't need to be compared with the GPU max address anymore.
* Reserve fixes
ReserveVirtualRange seems to follow the 0x200000000 base address like MemoryPoolReserve does.
Both also need checks in their flags Fixed path to ensure we're mapping in-bounds. If we're not in mapping to our address space, we'll end up reserving and returning the wrong address, which could lead to weird memory issues in games.
I'll need to test on real hardware to verify if such changes are appropriate.
* Better sceKernelMmap
Handles errors where we would previously throw exceptions. Also moves the file logic to MapFile, since that way all the possible errors are in one place.
Also fixes some function parameters to align with our current standards.
* Major refactor
MapDirectMemory, MapFlexibleMemory, ReserveVirtualRange, and MemoryPoolReserve all internally use mmap to perform their mappings. Naturally, this means that all functions have similar behaviors, and a lot of duplicate code.
This add necessary conditional behavior to MapMemory so MemoryPoolReserve and ReserveVirtualRange can use it, without disrupting the behavior of MapDirectMemory or MapFlexibleMemory calls.
* Accurate phys_addr for non-direct mappings
* Properly handle GPU access rights
Since my first commit restricts GPU mappings to memory areas with GPU access permissions, we also need to be updating the GPU mappings appropriately during Protect calls too.
* Update memory.cpp
* Update memory.h
* Update memory.cpp
* Update memory.cpp
* Update memory.cpp
* Revert "Update memory.cpp"
This reverts commit 2c55d014c0.
* Coalesce dmem map
Aligns with hardware observations, hopefully shouldn't break anything since nothing should change hardware-wise when release dmem calls and unmap calls are performed?
Either that or Windows breaks because Windows, will need to test.
* Implement posix_mprotect
Unity calls this
Also fixes the names of sceKernelMprotect and sceKernelMtypeprotect, though that's more of a style change and can be reverted if requested.
* Fix sceKernelSetVirtualRangeName
Partially addresses a "regression" introduced when I fixed up some asserts.
As noted in the code, this implementation is still slightly inaccurate, as handling this properly could cause regressions on Windows.
* Unconditional assert in MapFile
* Remove protect warning
This is expected behavior, shouldn't need any logging.
* Respect alignment
Forgot to properly do this when updating ReserveVirtualRange and MemoryPoolReserve
* Fix Mprotect on free memory
On real hardware, this just does nothing. If something did get protected, there's no way to query that information.
Therefore, it seems pretty safe to just behave like munmap and return size here.
* Minor tidy-up
No functional difference, but looks better.
* Import memory
* 64K pages and fix memory mapping
* Queue coverage
* Buffer syncing, faulted readback adn BDA in Buffer
* Base DMA implementation
* Preparations for implementing SPV DMA access
* Base impl (pending 16K pages and getbuffersize)
* 16K pages and stack overflow fix
* clang-format
* clang-format but for real this time
* Try to fix macOS build
* Correct decltype
* Add testing log
* Fix stride and patch phi node blocks
* No need to check if it is a deleted buffer
* Clang format once more
* Offset in bytes
* Removed host buffers (may do it in another PR)
Also some random barrier fixes
* Add IR dumping from my read-const branch
* clang-format
* Correct size insteed of end
* Fix incorrect assert
* Possible fix for NieR deadlock
* Copy to avoid deadlock
* Use 2 mutexes insteed of copy
* Attempt to range sync error
* Revert "Attempt to range sync error"
This reverts commit dd287b48682b50f215680bb0956e39c2809bf3fe.
* Fix size truncated when syncing range
And memory barrier
* Some fixes (and async testing (doesn't work))
* Use compute to parse fault buffer
* Process faults on submit
* Only sync in the first time we see a readconst
Thsi is partialy wrong. We need to save the state into the submission context itself, not the rasterizer since we can yield and process another sumission (if im not understanding wrong).
* Use spec const and 32 bit atomic
* 32 bit counter
* Fix store_index
* Better sync (WIP, breaks PR now)
* Fixes for better sync
* Better sync
* Remove memory coveragte logic
* Point sirit to upstream
* Less waiting and barriers
* Correctly checkout moltenvk
* Bring back applying pending operations in wait
* Sync the whole buffer insteed of only the range
* Implement recursive shared/scoped locks
* Iterators
* Faster syncing with ranges
* Some alignment fixes
* fixed clang format
* Fix clang-format again
* Port page_manager from readbacks-poc
* clang-format
* Defer memory protect
* Remove RENDERER_TRACE
* Experiment: only sync on first readconst
* Added profiling (will be removed)
* Don't sync entire buffers
* Added logging for testing
* Updated temporary workaround to use 4k pages
* clang.-format
* Cleanup part 1
* Make ReadConst a SPIR-V function
---------
Co-authored-by: georgemoralis <giorgosmrls@gmail.com>
* Fix reading not existing savedata
* alloc save memory instead return error
* save memory saze to save slot instead of global
* remove unused enum
* remove unneeded memory clean
* Properly handle ENOMEM error return in MapMemory
Needed for Assassin's Creed Unity to behave properly.
* Change error message
If I left the message as-is, we'd probably see inexperienced people claiming the assert means your device needs more memory, which is completely false.
* Clang
You know you're doing something right when Clang complains.
* Attempt to handle MemoryMapFlags::NoOverwrite
Based on my interpretation of red_prig's descriptions. These changes are untested, as I'm not able to test right now.
* Fix flag description
* Move overwrite check to while condition
* Implement sceKernelMapDirectMemory2
Behaves similarly to sceKernelMapDirectMemory, but has a type parameter.
* Simplify
No need to copy all the MapDirectMemory code over, can just call the function, then do the SetDirectMemoryType call
* Clang