* More descriptive sceKernelMapNamedFlexibleMemory logging
* Misc exports
These functions are used by Overwatch: Origins Edition
* Clang
* Function parameter cleanup
Changes the parameters on our sceKernelMapNamedFlexibleMemory and sceKernelMapFlexibleMemory functions to better align with our current standards.
* posix_pthread_attr_getschedparam
* Fixes for scePthreadGetAffinity and scePthreadSetAffinity
Looking at FreeBSD source, and our other pthread functions, we should be using our FindThread function to get the appropriate thread if thread != g_curthread.
* Swap do-while to while
If we use a do-while loop, we waste time if `aligned_size = 0`. This is also still accurate to FreeBSD behavior, where it returns success if `start == end` during mprotect.
This also effectively prevents the memory assert seen in updated versions of RESIDENT EVIL 2 (CUSA09193)
* Move prot validation outside loop
The prot variable shouldn't change during a mprotect call, so we can check the flags before protecting instead.
Also cleans up the code for prot validation.
This should improve performance, and is more accurate to FreeBSD code.
* Add logging for protect calls
This will help in debugging future problems
* Change touchpad handling and orientation calculation
* remove unnecessary includes in pad.cpp
* remove the cmake command arguments
* remove the weird file
* try to fix formatting
* limit new gyro and touchpad logic to controller 1
* remove cout
* fix formatting and add the handle check to scePadRead
* swap y and z back
* Only perform GPU memory mapping when GPU can access it
This better aligns with hardware observations, and should also speed up unmaps and decommits, since they don't need to be compared with the GPU max address anymore.
* Reserve fixes
ReserveVirtualRange seems to follow the 0x200000000 base address like MemoryPoolReserve does.
Both also need checks in their flags Fixed path to ensure we're mapping in-bounds. If we're not in mapping to our address space, we'll end up reserving and returning the wrong address, which could lead to weird memory issues in games.
I'll need to test on real hardware to verify if such changes are appropriate.
* Better sceKernelMmap
Handles errors where we would previously throw exceptions. Also moves the file logic to MapFile, since that way all the possible errors are in one place.
Also fixes some function parameters to align with our current standards.
* Major refactor
MapDirectMemory, MapFlexibleMemory, ReserveVirtualRange, and MemoryPoolReserve all internally use mmap to perform their mappings. Naturally, this means that all functions have similar behaviors, and a lot of duplicate code.
This add necessary conditional behavior to MapMemory so MemoryPoolReserve and ReserveVirtualRange can use it, without disrupting the behavior of MapDirectMemory or MapFlexibleMemory calls.
* Accurate phys_addr for non-direct mappings
* Properly handle GPU access rights
Since my first commit restricts GPU mappings to memory areas with GPU access permissions, we also need to be updating the GPU mappings appropriately during Protect calls too.
* Update memory.cpp
* Update memory.h
* Update memory.cpp
* Update memory.cpp
* Update memory.cpp
* Revert "Update memory.cpp"
This reverts commit 2c55d014c0.
* Coalesce dmem map
Aligns with hardware observations, hopefully shouldn't break anything since nothing should change hardware-wise when release dmem calls and unmap calls are performed?
Either that or Windows breaks because Windows, will need to test.
* Implement posix_mprotect
Unity calls this
Also fixes the names of sceKernelMprotect and sceKernelMtypeprotect, though that's more of a style change and can be reverted if requested.
* Fix sceKernelSetVirtualRangeName
Partially addresses a "regression" introduced when I fixed up some asserts.
As noted in the code, this implementation is still slightly inaccurate, as handling this properly could cause regressions on Windows.
* Unconditional assert in MapFile
* Remove protect warning
This is expected behavior, shouldn't need any logging.
* Respect alignment
Forgot to properly do this when updating ReserveVirtualRange and MemoryPoolReserve
* Fix Mprotect on free memory
On real hardware, this just does nothing. If something did get protected, there's no way to query that information.
Therefore, it seems pretty safe to just behave like munmap and return size here.
* Minor tidy-up
No functional difference, but looks better.
* Import memory
* 64K pages and fix memory mapping
* Queue coverage
* Buffer syncing, faulted readback adn BDA in Buffer
* Base DMA implementation
* Preparations for implementing SPV DMA access
* Base impl (pending 16K pages and getbuffersize)
* 16K pages and stack overflow fix
* clang-format
* clang-format but for real this time
* Try to fix macOS build
* Correct decltype
* Add testing log
* Fix stride and patch phi node blocks
* No need to check if it is a deleted buffer
* Clang format once more
* Offset in bytes
* Removed host buffers (may do it in another PR)
Also some random barrier fixes
* Add IR dumping from my read-const branch
* clang-format
* Correct size insteed of end
* Fix incorrect assert
* Possible fix for NieR deadlock
* Copy to avoid deadlock
* Use 2 mutexes insteed of copy
* Attempt to range sync error
* Revert "Attempt to range sync error"
This reverts commit dd287b48682b50f215680bb0956e39c2809bf3fe.
* Fix size truncated when syncing range
And memory barrier
* Some fixes (and async testing (doesn't work))
* Use compute to parse fault buffer
* Process faults on submit
* Only sync in the first time we see a readconst
Thsi is partialy wrong. We need to save the state into the submission context itself, not the rasterizer since we can yield and process another sumission (if im not understanding wrong).
* Use spec const and 32 bit atomic
* 32 bit counter
* Fix store_index
* Better sync (WIP, breaks PR now)
* Fixes for better sync
* Better sync
* Remove memory coveragte logic
* Point sirit to upstream
* Less waiting and barriers
* Correctly checkout moltenvk
* Bring back applying pending operations in wait
* Sync the whole buffer insteed of only the range
* Implement recursive shared/scoped locks
* Iterators
* Faster syncing with ranges
* Some alignment fixes
* fixed clang format
* Fix clang-format again
* Port page_manager from readbacks-poc
* clang-format
* Defer memory protect
* Remove RENDERER_TRACE
* Experiment: only sync on first readconst
* Added profiling (will be removed)
* Don't sync entire buffers
* Added logging for testing
* Updated temporary workaround to use 4k pages
* clang.-format
* Cleanup part 1
* Make ReadConst a SPIR-V function
---------
Co-authored-by: georgemoralis <giorgosmrls@gmail.com>
* Fix reading not existing savedata
* alloc save memory instead return error
* save memory saze to save slot instead of global
* remove unused enum
* remove unneeded memory clean
* Properly handle ENOMEM error return in MapMemory
Needed for Assassin's Creed Unity to behave properly.
* Change error message
If I left the message as-is, we'd probably see inexperienced people claiming the assert means your device needs more memory, which is completely false.
* Clang
You know you're doing something right when Clang complains.
* Attempt to handle MemoryMapFlags::NoOverwrite
Based on my interpretation of red_prig's descriptions. These changes are untested, as I'm not able to test right now.
* Fix flag description
* Move overwrite check to while condition
* Implement sceKernelMapDirectMemory2
Behaves similarly to sceKernelMapDirectMemory, but has a type parameter.
* Simplify
No need to copy all the MapDirectMemory code over, can just call the function, then do the SetDirectMemoryType call
* Clang
* implementation
* add backend (WIP)
* now should be good
- fix implementation based on homebrew tests
- demote log to debug
- make squidbus happy (hopefully)
* fix moved m_name
* fix clang
* replace existing event when its same id and filter
* run timercallback after addEvent and remove useless code
* move KernelSignalRequest to the end
* clang (i hate you)
* dummy returns in p2p sockets
* added logging for sceNetSetsockopt
* possible fix for ORBIS_NET_SO_LINGER set
* logging for getsockoption as well
* disable kernel getsockname (seems to create issues with cyberpunk)
* some fixes with SetSocketOptions
* arggg
* posix_getsockname try
* mutex protection
* removed duplicated include (diegolix29)
* posix_getsockname appears to have issues in cyberpunk , comment it for now
* Implement sceKernelMemoryPoolBatch
I've tested Commit and Decommit on real hardware, haven't tested Protect or TypeProtect yet.
Implementation is primarily based on our sceKernelBatchMap implementation.
* Clang
* Update sceKernelMemoryPoolExpand
Hardware tests show that this function is basically the same as sceKernelAllocateDirectMemory, with some minor differences.
Update the memory searching code to match my updated AllocateDirectMemory code, with appropriate error conditions.
* Update MemoryPoolReserve
Only difference between real hw and our code is behavior with addr = 0.
* Don't coalesce PoolReserved areas.
Real hardware doesn't coalesce them.
* Update PoolCommit
Plenty of edge case behaviors to handle here.
Addresses are treated as fixed, EINVAL is returned for bad mappings, name should be preserved from PoolReserving, committed areas should coalesce, reserved areas get their phys_base updated
* Formatting
* Adjust fixed PoolReserve path
Hardware tests suggest this will overwrite all VMAs in the range. Run UnmapMemoryImpl on the full area, then reserve. Same logic applies to normal reservations too.
Also adjusts logic of the non-fixed path to more closely align with hardware observations.
* Remove phys_base modifications
This can be handled later. Doing the logic properly would likely take work in MergeAdjacent, and would probably need to be applied to normal dmem mappings too.
* Use VMAHandle.Contains()
Why do extra math when we have a function specifically for this?
* Update memory.cpp
* Remove unnecessary code
Since I've removed those two asserts, these two lines of code effectively do nothing.
* Clang
* Fix names
* Fix PoolDecommit
Should fix the address space regressions in UE titles on Windows.
* Fix error log
Should make the cause of this clearer?
* Clang
* Oops
* Remove coalesce on PoolCommit
Windows makes this more difficult.
* Track pool budgets
If you try to commit more pooled memory than is allocated, PoolCommit returns ENOMEM.
Also fixes error conditions for PoolDecommit, that should return EINVAL if given an address that isn't part of the pool.
Note: Seems like the pool budget can't hit zero? I used a <= comparison based on hardware tests, otherwise we're able to make more mappings than real hardware can.
* Reduce bitfield size
Linux compilers automatically convert this, Windows not so much.
* Static assert for VirtualQueryInfo struct size
Since compilers can be weird, having a static assert for this will be helpful.
Granted, this probably wont need changing after this PR.
* Fix VirtualQuery behavior on low addresses.
* Fix VirtualQuery struct
Somewhere in our BitField and array use, the size of our VirtualQuery struct became larger than the struct used on real hardware.
Fixing this fixes some data corruption visible in the name parameter during my tests.
* Default name to anon
On real hardware, nameless mappings are given the name "anon:address" where address appears to be the address that made the memory call.
For simplicity sake, I'll stick to the name "anon" for now.
* Place an upper bound on returns from SearchFree
Right now, this upper bound is set based on the limitations of our GPU buffer cache and page table.
Someone with more experience in that area of code should probably fix that at some point.
* More anons
* Clang
* Fix name in sceKernelMapNamedDirectMemory
* strncpy instead of strcpy
Hardcoded the constant size for now, I need to review how real hardware behaves here to determine if anything else is necessary for this to be accurate.
* Fix name behavior
All memory naming functions restrict the name size to a 31 character limit, and return `ORBIS_KERNEL_ERROR_ENAMETOOLONG` if that limit is exceeded.
Since this value is constant for all functions involving names, I've defined it as a constant in kernel's memory.h, and used that in place of any hardcoded 32 character limits.
* Error logging
Hopefully this helps in catching the UFC regression?
* Increase address space upper bound
Probably needs heavy testing, especially on Mac/Windows.
This increases the address space, as needed to accommodate strange memory behaviors seen in UFC.
* VirtualQuery fix
Due to limitations of certain platforms, we initialize our vma_map with 3 separate free mappings.
As such, we need to use a while loop here to accurately query mappings with high addresses
* Fix mappings to high addresses
The PS4's GPU can only handle 40bit addresses. Our texture cache and buffer cache were designed around these limits, and mapping to higher addresses would cause segmentation faults and access violations.
To fix these crashes, only map to the GPU if the mapping is fully contained within the address space the GPU should access.
I'm open to suggestions on how to make this cleaner
* Revert "Increase address space upper bound"
This reverts commit 3d50eeeebb.
* Revert VirtualQuery while loop
Windows wasn't happy with this, again.
Will try to debug and properly fix this when I have a good chance.
* Fix asserts
FindVMA, due to the way it's programmed, never actually returns vma_map.end(), the furthest it ever returns is the last valid memory area. All those asserts we involving vma_map.end() never actually trigger due to this.
This commit removes redundant asserts, adds messages to asserts that were lacking them, and fixes all asserts designed to detect out of bounds memory accesses so they actually trigger.
I've also fixed some potential memory safety issues.
* Proper error behavior in QueryProtection
Might as well handle this properly while I'm here.
* Clang
* More information about ReserveVirtualRange results
Should help debug issues like the one in The Order: 1886 (CUSA00076)
* Fix assert message
* Update assert message
Extra space
* Fix my bug
Oh hey, finally something that's my fault.
* Fix rasterizer unmaps
Should use adjusted_size here, otherwise we could unmap too much.
Thanks to diegolix29 for spotting this.
* Fix edge case in MapMemory
Code comments explain everything.
This should fix some memory asserts.
* Fix fix
Avoid running the code path if it's unnecessary, since there are many additional edge cases to handle when the VMA map is small.
* Fix fix fix
Should prevent infinite loops, haven't tested properly yet though.
* Split logging for inputs and out_addr in ReserveVirtualRange
Addresses review comments.
We do this in order to be able to actually fit in all possible values from AmdGpu::NumberConversion.
Fixes gcc compiler warnings:
warning: ‘Shader::PsColorBuffer::num_conversion’ is too small to hold all values of ‘enum class AmdGpu::NumberConversion’
* Fix module map addresses
Most modules are mapped starting at 0x800000000, with no gaps between mappings.
* Hardcode hardware accurate base address
Looking at our address space, all platforms will have this base address mapped, so there shouldn't be any problem in using it.
* Clang
* Swap module mapping to NoFlags, remove offset code
Since real hardware has no gap between module mappings, the Fixed flag is just an annoyance to work around, and has no impact on the actual mappings.
Swapping the module mappings to use flags NoFlags instead simplifies our code slightly.
* Fix module mapping names
On real hardware, the file extension is part of the mapping name. Easiest way to manage this is to swap the name to be `file.filename().string()` instead of `file.stem().string()`
* Fix patches
Completely missed this, whoops.
* Proper handling of whence 3 & 4
* Accurate directory handling in open
Directories can be opened, and can be created in open, these changes should handle that more accurately.
* Mount /app0 as read only
On real hardware, it's read only.
* Proper directory flag handling.
Even when directory is specified, it will still succeed to open non-directories.
* Check for read only directories
* Earlier ro check in posix_rmdir
Hardware tests suggest these checks are in a different order
* Clear temp folder on boot
My tests rely on this, and some games do too.
Two birds with one stone
* Clang
* Add missing DeleteHandle calls
Whoops
* Final flags adjustment in sceKernelOpen
All my current tests are now hardware accurate.
* Fix truncates
Host ftruncate consistently fails on EINVAL, I'll need to test if this issue affected Windows too.
* Windows hacks
Windows is more limiting about how folders are opened and things like that. For now, pretend these calls didn't error.
Also fixes compilation for Windows
* Final touch-ups
After expanding my test suite further, I found a couple more edge cases that needed addressing.
Bloodborne audio is still broken, I'll look into that soon.
* Remove hacky read-only behavior in posix_stat
Bloodborne apparently uses the mode parameter here when querying it's audio files, and the mode we returned led to it disabling audio entirely.
* Clang
* Cleaner code
* Combine fsync and sync flags
According to FreeBSD docs, the "sync" flag is synonymous with the fsync flag, and is only included to meet the POSIX spec.
* Log if any currently unhandled flags are encountered.
These are rare and probably not too important, but log a warning when they're seen.
* Update file_system.cpp
* Update file_system.cpp
* Clang
* Revert truncate fix
Using ftruncate works fine after moving the call to before the proper file opening code.
* Truncate before open
Open the file as read-write, then try truncating.
This fixes read | truncate flag behavior on Windows.
* Slightly adjust check for invalid flags
Any open call with invalid flags should return EINVAL, regardless of other errors parameters might cause.
* New translations en_us.ts (Vietnamese)
* New translations en_us.ts (Vietnamese)
* New translations en_us.ts (Vietnamese)
* New translations en_us.ts (Vietnamese)
* shader_recompiler: Add lowering pass for when 64-bit float is unsupported.
* shader_recompiler: Fix PackDouble2x32/UnpackDouble2x32 type.
* shader_recompiler: Remove extra bit cast implementations.
* Fix GetModule exception
Simple mistake
* Prevent OOB writes in add_segment
Due to mistakes in our linker logic, OpenOrbis' libSceFios2 causes OOB writes here.
While the ideal solution would be to fix the erroneous behavior, the best I'm capable of right now is just preventing the OOB writes.
* Implement sceKernelGetModuleInfo, sceKernelGetModuleInfoInternal, sceKernelGetModuleList
These are implemented based on hardware observations and a homebrew sample made by red_prig. I've yet to test what error cases can show up.
* Clang
* Accurate error returns
If there are more modules than provided space, then return kernel ENOMEM.
If either handles or out_count are null, return kernel EFAULT.
* Accurate error checks in ModuleInfo functions
* Clang
* Readable VideoOutEvent data packing
Inspired by the work of former shadPS4 devs and mostly based on red_prig's current code.
* Apply DceData struct to sceVideoOutGetEventCount
Makes the code easier to read
* Update equeue.h
* Update main.cpp
* Update equeue.h
* Proper struct names
* Fix hint mask
Thanks to red_prig for catching my mistake here.
* Clang
* Fix header discrepancy
* Some sysmodules inconsistencies fixed. Based on Visual studio flags if they are irrelevant lmk
* Suggestions - info passed to sceKernelGetModuleInfoForUnwind and if name field matches it gets zeroed
* Final suggestions
* reverting OrbisModuleInfoForUnwind and modifing header.
* Implement sceImeDialogGetPanelSize
* Fix header
* Clang
* Adjust values that are different from Ime
* Add original sizes as comments
* clang
* At this point half of the PR is from squidbus, and I'm just typing out what they say
---------
Co-authored-by: squidbus <175574877+squidbus@users.noreply.github.com>
* Update memory.cpp
* Clean logic
FindDmemArea guarantees that the first dmem area we check contains search_start. Any dmem areas beyond the first one will be entirely past search_start, so checking against it in the loop is unnecessary.