* Fix flag handling on Windows
Fixes a weird homebrew kalaposfos made
* Fix backing protects
Windows requires that protections on areas committed through MapViewOfFile functions are no more permissive than those of the original mapping.
The best way to make sure everything works is to VirtualProtect the code area with the requested protection instead of applying prot directly.
* Fix error code for sceKernelMapDirectMemory2
Real hardware returns EINVAL instead of EACCES here
* Fix prot setting in ProtectBytes
* Handle some extra protection-related edge cases.
Real hardware treats read and write as separate perms, but appends read if you call with write-only (this is visible in VirtualQuery calls)
Additionally, execute permissions are ignored when protecting dmem mappings.
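The two rules above can be sketched roughly as follows. This is only an illustration, not the emulator's code: the bit values, the `NormalizeProt` name, and the `is_dmem` parameter are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>

// Assumed bit values, chosen for illustration only.
enum Prot : std::uint32_t { CpuRead = 1, CpuWrite = 2, CpuExec = 4 };

// Hypothetical helper: applies the two observed rules to a requested protection.
constexpr std::uint32_t NormalizeProt(std::uint32_t prot, bool is_dmem) {
    // A write-only request has read appended.
    if ((prot & CpuWrite) != 0 && (prot & CpuRead) == 0) {
        prot |= CpuRead;
    }
    // Execute is dropped when protecting a direct memory mapping.
    if (is_dmem) {
        prot &= ~static_cast<std::uint32_t>(CpuExec);
    }
    return prot;
}
```

With these assumed values, a write-only request on a non-dmem mapping comes back as read plus write, and read plus exec on a dmem mapping comes back as read only.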
* Properly handle exec permission behavior for memory pools
Calling sceKernelMemoryPoolCommit with executable permissions returns EINVAL, and mprotect on pooled mappings ignores the exec protection.
* Clang
* Allow execution protection for direct memory
Further hardware tests show that the dmem area is actually executable; this permission is just hidden from the end user.
* Clang
* More descriptive assert message
* Align address and size in mmap
Like most POSIX functions, mmap aligns the address down to the nearest page boundary, and aligns the length up to the nearest page boundary.
Since mmap is the only memory mapping function that doesn't error early on a misaligned address or size, handle the alignment in the libkernel code.
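A minimal sketch of that alignment, for illustration only. The 16 KiB page size and the helper names `AlignDown`, `AlignUp`, and `AlignMapping` are assumptions of this example, and it does not cover whether the size should also absorb the offset removed from the address.

```cpp
#include <cassert>
#include <cstdint>

// Assumed page size for this example (16 KiB).
constexpr std::uint64_t kPageSize = 0x4000;

// Round value down to a multiple of align (align must be a power of two).
inline std::uint64_t AlignDown(std::uint64_t value, std::uint64_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return value & ~(align - 1);
}

// Round value up to a multiple of align (align must be a power of two).
inline std::uint64_t AlignUp(std::uint64_t value, std::uint64_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (value + align - 1) & ~(align - 1);
}

struct AlignedRange {
    std::uint64_t addr;
    std::uint64_t size;
};

// Hypothetical helper: address goes down, length goes up.
inline AlignedRange AlignMapping(std::uint64_t addr, std::uint64_t size) {
    return {AlignDown(addr, kPageSize), AlignUp(size, kPageSize)};
}
```

Under these assumptions, a request at `0x7fff` with a size of 1 becomes a mapping at `0x4000` with a size of `0x4000`.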
* Clang
* Fix valid flags
After changing the value, games that specify just CpuWrite would hit the error return.
* Fix prot conversion functions
The True(bool) function returns true whenever value is greater than 0. While this rarely manifested before because of our wrongly defined CpuReadWrite prot, it's now causing trouble with the corrected values.
Technically this could've also caused trouble with games mapping GpuRead permissions, but that seems to be a rare enough use case that I guess it never happened?
I've also added a warning for the case where `write & !read`, since we don't properly handle write-only permissions, and I'm not entirely sure what it would take to deal with that.
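One way to read the problem: a "non-zero" test against a multi-bit mask matches when any single bit is set. The sketch below contrasts that with an all-bits test. The enum values and the `HasAny`/`HasAll` helpers are assumptions for illustration and are not the project's actual conversion functions.

```cpp
#include <cassert>
#include <cstdint>

// Assumed values for illustration; CpuReadWrite is the union of both bits.
enum class MemoryProt : std::uint32_t {
    NoAccess = 0,
    CpuRead = 1,
    CpuWrite = 2,
    CpuReadWrite = 3,
};

constexpr std::uint32_t Bits(MemoryProt p) {
    return static_cast<std::uint32_t>(p);
}

// "Any bit set" test: also true for CpuRead alone when the mask is CpuReadWrite.
constexpr bool HasAny(MemoryProt value, MemoryProt mask) {
    return (Bits(value) & Bits(mask)) != 0;
}

// "All bits set" test: true only when every bit of the mask is present.
constexpr bool HasAll(MemoryProt value, MemoryProt mask) {
    return (Bits(value) & Bits(mask)) == Bits(mask);
}
```

With the assumed values, `HasAny(CpuRead, CpuReadWrite)` is true while `HasAll(CpuRead, CpuReadWrite)` is false, which is the kind of mismatch described above.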
* Fix some lingering dmem issues
ReleaseDirectMemory was always unmapping with the size parameter, which could cause it to unmap too much. Since multiple mappings can reference the same dmem area, I've calculated how much of each VMA we're supposed to unmap.
Additionally, I've adjusted the logic for carving out the free dmem area to properly work if ReleaseDirectMemory is called over multiple dmem areas.
Finally, I've patched a bug with my code in UnmapMemory.
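The per-VMA calculation amounts to intersecting the released physical range with each mapping's physical range. A rough sketch, assuming a hypothetical `Range` type and `Intersect` helper (neither is from the actual code):

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical half-open range [base, base + size).
struct Range {
    std::uint64_t base;
    std::uint64_t size;
};

// Returns the part of `mapping` covered by `released`, if any.
inline std::optional<Range> Intersect(Range released, Range mapping) {
    const std::uint64_t start = std::max(released.base, mapping.base);
    const std::uint64_t end =
        std::min(released.base + released.size, mapping.base + mapping.size);
    if (start >= end) {
        return std::nullopt; // No overlap, nothing to unmap for this mapping.
    }
    return Range{start, end - start};
}
```

Only the returned sub-range would be unmapped for a given mapping, rather than the full size passed to the release call.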
* Remove mapped dmem type
Since physical addresses can be mapped multiple times, tracking mapped pages is not necessary.
This also allows me to significantly simplify the MapMemory physical address validation logic.
* Proper implementation for sceKernelMtypeprotect
I've rewritten SetDirectMemoryType to use virtual addresses instead of physical addresses, allowing it to be used in sceKernelMtypeprotect.
To accommodate this change, I've also moved address and size alignment out of MemoryManager::Protect
* Apply memory type in sceKernelMemoryPoolCommit
* Organization
Added some potentially important missing mutexes, removed some unnecessary mutexes, moved some mutexes after early error returns, and updated copyright dates.
* Iterator logic cleanup
Added a missing end check in ClampRangeSize, and adjusted VirtualQuery and DirectMemoryQuery.
* Clang
* Adjustments
* Properly account for behavior differences in MapDirectMemory2
Undid the changes to direct memory areas, added more robust logic for changing dma types, and fixed DirectMemoryQuery to return hardware-accurate direct memory information in cases where dmas split here, but not on real hardware.
I've also changed MapMemory's is_exec flag to a validate_dmem flag, used to handle alternate behavior in MapDirectMemory2. is_exec is now determined by the use of MemoryProt::CpuExec instead.
* Clang
* Add execute permissions to physical backing
Needed for executable mappings to work properly on Windows, fixes regression in RE2 with prior commit.
* Minor variable cleanup
* Update memory.h
* Prohibit direct memory mappings with exec protections
Did a quick hardware test to confirm, only seems to be prohibited for dmem mappings though.
* Update memory.cpp
* Implement sceKernelMemoryPoolGetBlockStats
Not entirely sure how the logic behind cached blocks works, but flushed blocks seems to just be based on committed direct memory.
* Fix comment
* Refactor direct memory areas
At this point, swapping the multiple booleans for an enum is cleaner, and makes it easier to track the state of a direct memory area.
I've also sped up the logic for mapping direct memory by checking for out-of-bounds physical addresses before looping, and made the logic more solid using my dma type logic.
* Fix PoolCommit assert
Windows devices will throw an access violation if we don't check for iterator reaching end.
* Fix isDevKit
Previously, isDevKit could increase the physical memory used above the length we reserve in the backing file.
* Physical backing for flexible allocations
I took the simple approach here, creating a separate map for flexible allocations and pretty much just copying over the logic used in the direct memory map.
* Various fixups
* Fix mistake #1
* Assert + clang
* Fix 2
* Clang
* Fix CanMergeWith
Validate physical base for flexible mappings
* Clang
* Physical backing for pooled memory
* Allow VMA splitting in NameVirtualRange
This should be safe, since with the changes in this PR, the only issues that come from discrepancies between address space and vma_map are issues related to vmas being larger than address space mappings. NameVirtualRange will only ever shrink VMAs by naming part of one.
* Fix
* Fix NameVirtualRange
* Revert NameVirtualRange changes
Seems like it doesn't play nice with Windows.
* Clean up isDevKit logic
We already log both isNeo and isDevKit in Emulator::Run, so the additional logging in MemoryManager::SetupMemoryRegions isn't really necessary.
I've also added a separate constant for non-pro devkit memory, as suggested.
Finally, I've changed a couple constants to use the ORBIS prefix we generally follow here, instead of the SCE prefix.
* Erase flexible memory contents from physical memory on unmap
Flexible memory should not be preserved on unmap, so erase flexible contents from the physical backing when unmapping.
* Expand flexible memory map
Some games will end up fragmenting the physical backing space used for flexible memory. To reduce the frequency of this happening under normal circumstances, allocate the entirety of the remaining physical backing to the flexible memory map.
This is effectively a workaround to the problem, but at the moment I think this should suffice.
* Clang
* Validate requested dmem range in MapMemory
Handles a rare edge case that only comes up when modding Driveclub
* Specify type
auto has failed us once again.
* Types cleanup
Just some basic tidying up.
* Clang
* Merge dmem areas
* Fix DirectMemoryArea::CanMergeWith
Don't merge dmem areas if the memory types are different.
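A minimal sketch of such a merge check, assuming a simplified area with hypothetical `base`, `size`, and `memory_type` fields (the real DirectMemoryArea likely tracks more state than this):

```cpp
#include <cassert>
#include <cstdint>

// Simplified, hypothetical stand-in for a direct memory area.
struct DirectMemoryArea {
    std::uint64_t base;
    std::uint64_t size;
    int memory_type;

    // Areas merge only if they are physically adjacent and share a memory type.
    bool CanMergeWith(const DirectMemoryArea& next) const {
        return base + size == next.base && memory_type == next.memory_type;
    }
};
```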
* Reduce some warnings to info
Both functions should behave properly now, there's no reason to warn about their use.
* Clang
* texture_cache: Avoid gpu tracking assert on sparse image
At the moment, just take the easy way of creating the entire image normally and uploading unmapped subresources as zero.
* tile_manager: Downgrade assert to error
* fix macos
* Only perform GPU memory mapping when GPU can access it
This better aligns with hardware observations, and should also speed up unmaps and decommits, since they don't need to be compared with the GPU max address anymore.
* Reserve fixes
ReserveVirtualRange seems to follow the 0x200000000 base address like MemoryPoolReserve does.
Both also need checks in their Fixed flag path to ensure we're mapping in-bounds. If we're not mapping into our address space, we'll end up reserving and returning the wrong address, which could lead to weird memory issues in games.
I'll need to test on real hardware to verify if such changes are appropriate.
* Better sceKernelMmap
Handles errors where we would previously throw exceptions. Also moves the file logic to MapFile, since that way all the possible errors are in one place.
Also fixes some function parameters to align with our current standards.
* Major refactor
MapDirectMemory, MapFlexibleMemory, ReserveVirtualRange, and MemoryPoolReserve all internally use mmap to perform their mappings. Naturally, this means that all functions have similar behaviors, and a lot of duplicate code.
This adds the necessary conditional behavior to MapMemory so MemoryPoolReserve and ReserveVirtualRange can use it, without disrupting the behavior of MapDirectMemory or MapFlexibleMemory calls.
* Accurate phys_addr for non-direct mappings
* Properly handle GPU access rights
Since my first commit restricts GPU mappings to memory areas with GPU access permissions, we also need to be updating the GPU mappings appropriately during Protect calls too.
* Update memory.cpp
* Update memory.h
* Update memory.cpp
* Update memory.cpp
* Update memory.cpp
* Revert "Update memory.cpp"
This reverts commit 2c55d014c0.
* Coalesce dmem map
Aligns with hardware observations, hopefully shouldn't break anything since nothing should change hardware-wise when release dmem calls and unmap calls are performed?
Either that or Windows breaks because Windows, will need to test.
* Implement posix_mprotect
Unity calls this
Also fixes the names of sceKernelMprotect and sceKernelMtypeprotect, though that's more of a style change and can be reverted if requested.
* Fix sceKernelSetVirtualRangeName
Partially addresses a "regression" introduced when I fixed up some asserts.
As noted in the code, this implementation is still slightly inaccurate, as handling this properly could cause regressions on Windows.
* Unconditional assert in MapFile
* Remove protect warning
This is expected behavior, shouldn't need any logging.
* Respect alignment
Forgot to properly do this when updating ReserveVirtualRange and MemoryPoolReserve
* Fix Mprotect on free memory
On real hardware, this just does nothing. If something did get protected, there's no way to query that information.
Therefore, it seems pretty safe to just behave like munmap and return size here.
* Minor tidy-up
No functional difference, but looks better.
* Implement sceKernelMapDirectMemory2
Behaves similarly to sceKernelMapDirectMemory, but has a type parameter.
* Simplify
No need to copy all the MapDirectMemory code over, can just call the function, then do the SetDirectMemoryType call
* Clang
* Update sceKernelMemoryPoolExpand
Hardware tests show that this function is basically the same as sceKernelAllocateDirectMemory, with some minor differences.
Update the memory searching code to match my updated AllocateDirectMemory code, with appropriate error conditions.
* Update MemoryPoolReserve
Only difference between real hw and our code is behavior with addr = 0.
* Don't coalesce PoolReserved areas.
Real hardware doesn't coalesce them.
* Update PoolCommit
Plenty of edge case behaviors to handle here.
Addresses are treated as fixed, EINVAL is returned for bad mappings, name should be preserved from PoolReserving, committed areas should coalesce, reserved areas get their phys_base updated
* Formatting
* Adjust fixed PoolReserve path
Hardware tests suggest this will overwrite all VMAs in the range. Run UnmapMemoryImpl on the full area, then reserve. Same logic applies to normal reservations too.
Also adjusts logic of the non-fixed path to more closely align with hardware observations.
* Remove phys_base modifications
This can be handled later. Doing the logic properly would likely take work in MergeAdjacent, and would probably need to be applied to normal dmem mappings too.
* Use VMAHandle.Contains()
Why do extra math when we have a function specifically for this?
* Update memory.cpp
* Remove unnecessary code
Since I've removed those two asserts, these two lines of code effectively do nothing.
* Clang
* Fix names
* Fix PoolDecommit
Should fix the address space regressions in UE titles on Windows.
* Fix error log
Should make the cause of this clearer?
* Clang
* Oops
* Remove coalesce on PoolCommit
Windows makes this more difficult.
* Track pool budgets
If you try to commit more pooled memory than is allocated, PoolCommit returns ENOMEM.
Also fixes error conditions for PoolDecommit, that should return EINVAL if given an address that isn't part of the pool.
Note: Seems like the pool budget can't hit zero? I used a <= comparison based on hardware tests, otherwise we're able to make more mappings than real hardware can.
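A rough sketch of the budget check as described, including the `<=` comparison. The `PoolBudget` type and the numeric error value are hypothetical stand-ins and are not taken from the actual code.

```cpp
#include <cassert>
#include <cstdint>

constexpr int kOk = 0;
constexpr int kErrorNoMem = 12; // Hypothetical stand-in for the ENOMEM error return.

// Hypothetical tracker for the remaining pooled memory budget.
struct PoolBudget {
    std::uint64_t remaining;

    int Commit(std::uint64_t size) {
        // "<=" rather than "<": the budget is never allowed to reach zero.
        if (remaining <= size) {
            return kErrorNoMem;
        }
        remaining -= size;
        return kOk;
    }
};
```

With a budget of `0x10000`, committing exactly `0x10000` fails in this sketch, while committing `0xC000` succeeds and leaves `0x4000`.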
* Fix VirtualQuery behavior on low addresses.
* Fix VirtualQuery struct
Somewhere in our BitField and array use, the size of our VirtualQuery struct became larger than the struct used on real hardware.
Fixing this fixes some data corruption visible in the name parameter during my tests.
* Default name to anon
On real hardware, nameless mappings are given the name "anon:address" where address appears to be the address that made the memory call.
For simplicity's sake, I'll stick to the name "anon" for now.
* Place an upper bound on returns from SearchFree
Right now, this upper bound is set based on the limitations of our GPU buffer cache and page table.
Someone with more experience in that area of code should probably fix that at some point.
* More anons
* Clang
* Fix name in sceKernelMapNamedDirectMemory
* strncpy instead of strcpy
Hardcoded the constant size for now, I need to review how real hardware behaves here to determine if anything else is necessary for this to be accurate.
* Fix name behavior
All memory naming functions restrict the name size to a 31 character limit, and return `ORBIS_KERNEL_ERROR_ENAMETOOLONG` if that limit is exceeded.
Since this value is constant for all functions involving names, I've defined it as a constant in kernel's memory.h, and used that in place of any hardcoded 32 character limits.
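The name check can be sketched like this. The constant name, the `SetName` helper, and the numeric error value are assumptions for illustration; only the 31 character limit and the use of strncpy come from the notes above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

constexpr std::size_t kMaxNameLength = 31; // Hypothetical name for the shared limit.
constexpr int kErrorNameTooLong = 63;      // Hypothetical stand-in for ORBIS_KERNEL_ERROR_ENAMETOOLONG.

// Copies src into a 32-byte name buffer, rejecting names over 31 characters.
inline int SetName(char (&dst)[kMaxNameLength + 1], const char* src) {
    assert(src != nullptr);
    if (std::strlen(src) > kMaxNameLength) {
        return kErrorNameTooLong;
    }
    // src fits, so strncpy copies it and pads the rest of dst with NULs.
    std::strncpy(dst, src, sizeof(dst));
    return 0;
}
```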
* Error logging
Hopefully this helps in catching the UFC regression?
* Increase address space upper bound
Probably needs heavy testing, especially on Mac/Windows.
This increases the address space, as needed to accommodate strange memory behaviors seen in UFC.
* VirtualQuery fix
Due to limitations of certain platforms, we initialize our vma_map with 3 separate free mappings.
As such, we need to use a while loop here to accurately query mappings with high addresses
* Fix mappings to high addresses
The PS4's GPU can only handle 40bit addresses. Our texture cache and buffer cache were designed around these limits, and mapping to higher addresses would cause segmentation faults and access violations.
To fix these crashes, only map to the GPU if the mapping is fully contained within the address space the GPU should access.
I'm open to suggestions on how to make this cleaner
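A minimal sketch of that containment check, assuming the GPU-visible range is simply everything below 2^40. The constant and function names are hypothetical, and the real bound may be derived differently.

```cpp
#include <cassert>
#include <cstdint>

// Assumed upper bound of GPU-addressable memory (40-bit addresses).
constexpr std::uint64_t kGpuAddressLimit = 1ULL << 40;

// True only if [addr, addr + size) lies entirely below the limit.
// Written to avoid overflow in addr + size.
constexpr bool IsGpuMappable(std::uint64_t addr, std::uint64_t size) {
    return addr < kGpuAddressLimit && size <= kGpuAddressLimit - addr;
}
```

A mapping that starts below the limit but crosses it is rejected, the same as one that starts above it.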
* Revert "Increase address space upper bound"
This reverts commit 3d50eeeebb.
* Revert VirtualQuery while loop
Windows wasn't happy with this, again.
Will try to debug and properly fix this when I have a good chance.
* Fix asserts
FindVMA, due to the way it's programmed, never actually returns vma_map.end(); the furthest it ever returns is the last valid memory area. As a result, all those asserts involving vma_map.end() never actually trigger.
This commit removes redundant asserts, adds messages to asserts that were lacking them, and fixes all asserts designed to detect out of bounds memory accesses so they actually trigger.
I've also fixed some potential memory safety issues.
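To illustrate why an `end()` check cannot fire, here is a sketch that assumes the lookup has the common "previous of upper_bound" shape. The `Vma` type and `FindVma` function are simplified stand-ins, not the real implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <map>

// Simplified, hypothetical memory area.
struct Vma {
    std::uint64_t base;
    std::uint64_t size;

    bool Contains(std::uint64_t addr, std::uint64_t len) const {
        return addr >= base && addr + len <= base + size;
    }
};

using VmaMap = std::map<std::uint64_t, Vma>;

// Returns the last area whose base is <= addr. For an address past every
// area this is the final element, never end().
inline VmaMap::const_iterator FindVma(const VmaMap& vmas, std::uint64_t addr) {
    assert(!vmas.empty() && addr >= vmas.begin()->first);
    return std::prev(vmas.upper_bound(addr));
}
```

Under that assumption, an out-of-bounds check has to test something like `Contains` on the returned area instead of comparing the iterator against `end()`.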
* Proper error behavior in QueryProtection
Might as well handle this properly while I'm here.
* Clang
* More information about ReserveVirtualRange results
Should help debug issues like the one in The Order: 1886 (CUSA00076)
* Fix assert message
* Update assert message
Extra space
* Fix my bug
Oh hey, finally something that's my fault.
* Fix rasterizer unmaps
Should use adjusted_size here, otherwise we could unmap too much.
Thanks to diegolix29 for spotting this.
* Fix edge case in MapMemory
Code comments explain everything.
This should fix some memory asserts.
* Fix fix
Avoid running the code path if it's unnecessary, since there are many additional edge cases to handle when the VMA map is small.
* Fix fix fix
Should prevent infinite loops, haven't tested properly yet though.
* Split logging for inputs and out_addr in ReserveVirtualRange
Addresses review comments.
* Implement protecting multiple VMAs
A handful of games expect this to work, and updated versions of Grand Theft Auto V crash if it doesn't work.
* Clang
* memory: Consider flexible mappings as gpu accessible
Multiple guest apps do this with perfectly valid sharps in simple shaders. This needs some hw testing to see how it is handled, but for now it doesn't hurt to handle it.
* memory: Clamp large buffers to mapped area
Sometimes huge buffers can be bound that start on some valid mapping but aren't fully contained by it. It is not reasonable to expect the game to need all of the memory, so clamp the size to avoid the GPU tracking assert.
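The clamp can be sketched as below, assuming the buffer is known to start inside the mapping. The function and parameter names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Limits buf_size so the buffer does not extend past the end of the mapping
// that contains its start address.
inline std::uint64_t ClampToMapping(std::uint64_t buf_addr, std::uint64_t buf_size,
                                    std::uint64_t map_base, std::uint64_t map_size) {
    // The buffer must begin inside the mapping.
    assert(buf_addr >= map_base && buf_addr - map_base < map_size);
    const std::uint64_t available = map_size - (buf_addr - map_base);
    return std::min(buf_size, available);
}
```

A buffer that already fits keeps its size; one that overruns is cut down to the bytes left in the mapping.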
* clang-format fix
---------
Co-authored-by: georgemoralis <giorgosmrls@gmail.com>
* Unmap memory in chunks if spanning over multiple VMAs
* clang
* Merge fixups
* Minor code style changes
* Update function declarations
---------
Co-authored-by: Marcin Mikołajczyk <marcinmikolajcz@gmail.com>
* libkernel: Cleanup some function places
* kernel: Refactor thread functions
* kernel: It builds
* kernel: Fix a bunch of bugs, kernel thread heap
* kernel: File cleanup pt1
* File cleanup pt2
* File cleanup pt3
* File cleanup pt4
* kernel: Add missing funcs
* kernel: Add basic exceptions for linux
* gnmdriver: Add workload functions
* kernel: Fix new pthreads code on macOS. (#1441)
* kernel: Downgrade edeadlk to log
* gnmdriver: Add sceGnmSubmitCommandBuffersForWorkload
* exception: Add context register population for macOS. (#1444)
* kernel: Pthread rewrite touchups for Windows
* kernel: Multiplatform thread implementation
* mutex: Remove spamming log
* pthread_spec: Make assert into a log
* pthread_spec: Zero initialize array
* Attempt to fix non-Windows builds
* hotfix: change incorrect NID for scePthreadAttrSetaffinity
* scePthreadAttrSetaffinity implementation
* Attempt to fix Linux
* windows: Address a bunch of address space problems
* address_space: Fix unmap of region surrounded by placeholders
* libs: Reduce logging
* pthread: Implement condvar with waitable atomics and sleepqueue
* sleepq: Separate and make faster
* time: Remove delay execution
* Causes high CPU usage in Touhou Luna Nights
* kernel: Cleanup files again
* pthread: Add missing include
* semaphore: Use binary_semaphore instead of condvar
* Seems more reliable
* libraries/sysmodule: log module on `sceSysmoduleIsLoaded`
* libraries/kernel: implement `scePthreadSetPrio`
---------
Co-authored-by: squidbus <175574877+squidbus@users.noreply.github.com>
Co-authored-by: Daniel R. <47796739+polybiusproxy@users.noreply.github.com>
* I hate programming and will furiously smash my monitor if I ever see another oversight of this caliber ever again in my goddamn life
* Merge both protect functions together
The headers for these functions were technically not the same as the actual function definition. This didn't cause any emulation issues, but caused some weird issues with my IDE.
* memory: Size direct memory based on requested flexible memory.
* memory: Guard against OrbisProcParam without an OrbisKernelMemParam.
* memory: Account for alignment in direct memory suitability checks and add more debugging.
* Fix in searchFree, should fix #337
* clang format fix
* sceKernelSetVirtualRangeName implementation
* improved vaddr conversion
* updated VirtualQuery to include name too
* unmap also removed name thanks @red_prig
* fixed copy...
* Implement `sceKernelFtruncate` and `sceKernelUnlink`.
* Remove unused variable.
* Implement `sceKernelReserveVirtualRange`, misc fixes
* Fix `sceKernelReserveVirtualRange`.
* Add TODO on reserve
* Replace comment with assert.
* Add missing copyright header
* Add `UNREACHABLE` for `IOFile::Unlink`.
* Move NT API initialization out of the header
* Fix bug where files were always mapped as read only.
* `clang-format`
* video_core: Add a few missed things
* libkernel: More proper memory mapped files
* memory: Fix tessellation buffer mapping
* Cuphead work
* sceKernelPollSema fix
* clang format
* fixed ngs2 lle loading and rtc lib
* draft pthreads keys implementation
* fixed return codes
* return error code if sceKernelLoadStartModule module is invalid
* re-enabled system modules and disable debug in libs.h
* Improve linux support
* fix windows build
* kernel: Rework keys
---------
Co-authored-by: georgemoralis <giorgosmrls@gmail.com>
* core: Split module code from linker
* linker: Properly implement thread local storage
* kernel: Fix a few memory functions
* kernel: Implement module loading
* Now it's easy to do anyway with new module rework
* video_core: Remove hack in rasterizer
* The hack was to skip the first draw as the display buffer had not been created yet and the texture cache couldn't create one itself. With this patch it now can, using the color buffer parameters from registers
* shader_recompiler: Implement attribute loads/stores
* video_core: Add basic vertex, index buffer handling and pipeline caching
* externals: Make xxhash lowercase