Ryujinx/Ryujinx.HLE/HOS/Kernel/Process/KProcess.cs

Add a new JIT compiler for CPU code (#693) * Start of the ARMeilleure project * Refactoring around the old IRAdapter, now renamed to PreAllocator * Optimize the LowestBitSet method * Add CLZ support and fix CLS implementation * Add missing Equals and GetHashCode overrides on some structs, misc small tweaks * Implement the ByteSwap IR instruction, and some refactoring on the assembler * Implement the DivideUI IR instruction and fix 64-bits IDIV * Correct constant operand type on CSINC * Move division instructions implementation to InstEmitDiv * Fix destination type for the ConditionalSelect IR instruction * Implement UMULH and SMULH, with new IR instructions * Fix some issues with shift instructions * Fix constant types for BFM instructions * Fix up new tests using the new V128 struct * Update tests * Move DIV tests to a separate file * Add support for calls, and some instructions that depends on them * Start adding support for SIMD & FP types, along with some of the related ARM instructions * Fix some typos and the divide instruction with FP operands * Fix wrong method call on Clz_V * Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes * Implement SIMD logical instructions and more misc. fixes * Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations * Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. 
fixes * Implement SIMD shift instruction and fix Dup_V * Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table * Fix check with tolerance on tester * Implement FP & SIMD comparison instructions, and some fixes * Update FCVT (Scalar) encoding on the table to support the Half-float variants * Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes * Use old memory access methods, made a start on SIMD memory insts support, some fixes * Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes * Fix arguments count with struct return values, other fixes * More instructions * Misc. fixes and integrate LDj3SNuD fixes * Update tests * Add a faster linear scan allocator, unwinding support on windows, and other changes * Update Ryujinx.HLE * Update Ryujinx.Graphics * Fix V128 return pointer passing, RCX is clobbered * Update Ryujinx.Tests * Update ITimeZoneService * Stop using GetFunctionPointer as that can't be called from native code, misc. 
fixes and tweaks * Use generic GetFunctionPointerForDelegate method and other tweaks * Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics * Remove some unused code on the assembler * Fix REX.W prefix regression on float conversion instructions, add some sort of profiler * Add hardware capability detection * Fix regression on Sha1h and revert Fcm** changes * Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator * Fix silly mistake introduced on last commit on CpuId * Generate inline stack probes when the stack allocation is too large * Initial support for the System-V ABI * Support multiple destination operands * Fix SSE2 VectorInsert8 path, and other fixes * Change placement of XMM callee save and restore code to match other compilers * Rename Dest to Destination and Inst to Instruction * Fix a regression related to calls and the V128 type * Add an extra space on comments to match code style * Some refactoring * Fix vector insert FP32 SSE2 path * Port over the ARM32 instructions * Avoid memory protection races on JIT Cache * Another fix on VectorInsert FP32 (thanks to LDj3SNuD * Float operands don't need to use the same register when VEX is supported * Add a new register allocator, higher quality code for hot code (tier up), and other tweaks * Some nits, small improvements on the pre allocator * CpuThreadState is gone * Allow changing CPU emulators with a config entry * Add runtime identifiers on the ARMeilleure project * Allow switching between CPUs through a config entry (pt. 
2) * Change win10-x64 to win-x64 on projects * Update the Ryujinx project to use ARMeilleure * Ensure that the selected register is valid on the hybrid allocator * Allow exiting on returns to 0 (should fix test regression) * Remove register assignments for most used variables on the hybrid allocator * Do not use fixed registers as spill temp * Add missing namespace and remove unneeded using * Address PR feedback * Fix types, etc * Enable AssumeStrictAbiCompliance by default * Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
using ARMeilleure.State;
using Ryujinx.Common;
using Ryujinx.Common.Logging;
using Ryujinx.Cpu;
using Ryujinx.HLE.Exceptions;
using Ryujinx.HLE.HOS.Kernel.Common;
using Ryujinx.HLE.HOS.Kernel.Memory;
using Ryujinx.HLE.HOS.Kernel.Threading;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
namespace Ryujinx.HLE.HOS.Kernel.Process
{
class KProcess : KSynchronizationObject
{
public const int KernelVersionMajor = 10;
public const int KernelVersionMinor = 4;
public const int KernelVersionRevision = 0;
public const int KernelVersionPacked =
(KernelVersionMajor << 19) |
(KernelVersionMinor << 15) |
(KernelVersionRevision << 0);
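// Worked example (illustrative, derived only from the constants above):
// KernelVersionPacked = (10 << 19) | (4 << 15) | (0 << 0)
//                     = 0x500000 | 0x20000 | 0x0
//                     = 0x520000.
// ParseProcessInfo unpacks a required version with the same layout:
// major = value >> 19 and minor = (value >> 15) & 0xf, so 0x520000 yields major 10, minor 4.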
public KMemoryManager MemoryManager { get; private set; }
private SortedDictionary<ulong, KTlsPageInfo> _fullTlsPages;
private SortedDictionary<ulong, KTlsPageInfo> _freeTlsPages;
public int DefaultCpuCore { get; set; }
public bool Debug { get; private set; }
public KResourceLimit ResourceLimit { get; private set; }
public ulong PersonalMmHeapPagesCount { get; private set; }
public ProcessState State { get; private set; }
private object _processLock;
private object _threadingLock;
public KAddressArbiter AddressArbiter { get; private set; }
public long[] RandomEntropy { get; private set; }
private bool _signaled;
private bool _useSystemMemBlocks;
public string Name { get; private set; }
private int _threadCount;
public int MmuFlags { get; private set; }
private MemoryRegion _memRegion;
public KProcessCapabilities Capabilities { get; private set; }
public ulong TitleId { get; private set; }
public long Pid { get; private set; }
private long _creationTimestamp;
private ulong _entrypoint;
private ulong _imageSize;
private ulong _mainThreadStackSize;
private ulong _memoryUsageCapacity;
private int _version;
public KHandleTable HandleTable { get; private set; }
public ulong UserExceptionContextAddress { get; private set; }
private LinkedList<KThread> _threads;
public bool IsPaused { get; private set; }
public MemoryManager CpuMemory { get; private set; }
public CpuContext CpuContext { get; private set; }
public HleProcessDebugger Debugger { get; private set; }
public KProcess(KernelContext context) : base(context)
{
_processLock = new object();
_threadingLock = new object();
AddressArbiter = new KAddressArbiter(context);
_fullTlsPages = new SortedDictionary<ulong, KTlsPageInfo>();
_freeTlsPages = new SortedDictionary<ulong, KTlsPageInfo>();
Capabilities = new KProcessCapabilities();
RandomEntropy = new long[KScheduler.CpuCoresCount];
_threads = new LinkedList<KThread>();
Debugger = new HleProcessDebugger(this);
}
public KernelResult InitializeKip(
ProcessCreationInfo creationInfo,
int[] caps,
KPageList pageList,
KResourceLimit resourceLimit,
MemoryRegion memRegion)
{
ResourceLimit = resourceLimit;
_memRegion = memRegion;
AddressSpaceType addrSpaceType = (AddressSpaceType)((creationInfo.MmuFlags >> 1) & 7);
InitializeMemoryManager(addrSpaceType, memRegion);
bool aslrEnabled = ((creationInfo.MmuFlags >> 5) & 1) != 0;
ulong codeAddress = creationInfo.CodeAddress;
ulong codeSize = (ulong)creationInfo.CodePagesCount * KMemoryManager.PageSize;
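// Read the flag from creationInfo: the MmuFlags property is only assigned later, in ParseProcessInfo.
KMemoryBlockAllocator memoryBlockAllocator = (creationInfo.MmuFlags & 0x40) != 0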
? KernelContext.LargeMemoryBlockAllocator
: KernelContext.SmallMemoryBlockAllocator;
KernelResult result = MemoryManager.InitializeForProcess(
addrSpaceType,
aslrEnabled,
!aslrEnabled,
memRegion,
codeAddress,
codeSize,
memoryBlockAllocator);
if (result != KernelResult.Success)
{
return result;
}
if (!ValidateCodeAddressAndSize(codeAddress, codeSize))
{
return KernelResult.InvalidMemRange;
}
result = MemoryManager.MapPages(
codeAddress,
pageList,
MemoryState.CodeStatic,
MemoryPermission.None);
if (result != KernelResult.Success)
{
return result;
}
result = Capabilities.InitializeForKernel(caps, MemoryManager);
if (result != KernelResult.Success)
{
return result;
}
Pid = KernelContext.NewKipId();
if (Pid == 0 || (ulong)Pid >= KernelConstants.InitialProcessId)
{
throw new InvalidOperationException($"Invalid KIP Id {Pid}.");
}
result = ParseProcessInfo(creationInfo);
return result;
}
public KernelResult Initialize(
ProcessCreationInfo creationInfo,
int[] caps,
KResourceLimit resourceLimit,
MemoryRegion memRegion)
{
ResourceLimit = resourceLimit;
_memRegion = memRegion;
ulong personalMmHeapSize = GetPersonalMmHeapSize((ulong)creationInfo.PersonalMmHeapPagesCount, memRegion);
ulong codePagesCount = (ulong)creationInfo.CodePagesCount;
ulong neededSizeForProcess = personalMmHeapSize + codePagesCount * KMemoryManager.PageSize;
if (neededSizeForProcess != 0 && resourceLimit != null)
{
if (!resourceLimit.Reserve(LimitableResource.Memory, neededSizeForProcess))
{
return KernelResult.ResLimitExceeded;
}
}
void CleanUpForError()
{
if (neededSizeForProcess != 0 && resourceLimit != null)
{
resourceLimit.Release(LimitableResource.Memory, neededSizeForProcess);
}
}
PersonalMmHeapPagesCount = (ulong)creationInfo.PersonalMmHeapPagesCount;
KMemoryBlockAllocator memoryBlockAllocator;
if (PersonalMmHeapPagesCount != 0)
{
memoryBlockAllocator = new KMemoryBlockAllocator(PersonalMmHeapPagesCount * KMemoryManager.PageSize);
}
else
{
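// Read the flag from creationInfo: the MmuFlags property is only assigned later, in ParseProcessInfo.
memoryBlockAllocator = (creationInfo.MmuFlags & 0x40) != 0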
? KernelContext.LargeMemoryBlockAllocator
: KernelContext.SmallMemoryBlockAllocator;
}
AddressSpaceType addrSpaceType = (AddressSpaceType)((creationInfo.MmuFlags >> 1) & 7);
InitializeMemoryManager(addrSpaceType, memRegion);
bool aslrEnabled = ((creationInfo.MmuFlags >> 5) & 1) != 0;
ulong codeAddress = creationInfo.CodeAddress;
ulong codeSize = codePagesCount * KMemoryManager.PageSize;
KernelResult result = MemoryManager.InitializeForProcess(
addrSpaceType,
aslrEnabled,
!aslrEnabled,
memRegion,
codeAddress,
codeSize,
memoryBlockAllocator);
if (result != KernelResult.Success)
{
CleanUpForError();
return result;
}
if (!ValidateCodeAddressAndSize(codeAddress, codeSize))
{
CleanUpForError();
return KernelResult.InvalidMemRange;
}
result = MemoryManager.MapNewProcessCode(
codeAddress,
codePagesCount,
MemoryState.CodeStatic,
MemoryPermission.None);
if (result != KernelResult.Success)
{
CleanUpForError();
return result;
}
result = Capabilities.InitializeForUser(caps, MemoryManager);
if (result != KernelResult.Success)
{
CleanUpForError();
return result;
}
Pid = KernelContext.NewProcessId();
if (Pid == -1 || (ulong)Pid < KernelConstants.InitialProcessId)
{
throw new InvalidOperationException($"Invalid Process Id {Pid}.");
}
result = ParseProcessInfo(creationInfo);
if (result != KernelResult.Success)
{
CleanUpForError();
}
return result;
}
private bool ValidateCodeAddressAndSize(ulong address, ulong size)
{
ulong codeRegionStart;
ulong codeRegionSize;
switch (MemoryManager.AddrSpaceWidth)
{
case 32:
codeRegionStart = 0x200000;
codeRegionSize = 0x3fe00000;
break;
case 36:
codeRegionStart = 0x8000000;
codeRegionSize = 0x78000000;
break;
case 39:
codeRegionStart = 0x8000000;
codeRegionSize = 0x7ff8000000;
break;
default: throw new InvalidOperationException("Invalid address space width on memory manager.");
}
ulong endAddr = address + size;
ulong codeRegionEnd = codeRegionStart + codeRegionSize;
if (endAddr <= address ||
endAddr - 1 > codeRegionEnd - 1)
{
return false;
}
if (MemoryManager.InsideHeapRegion (address, size) ||
MemoryManager.InsideAliasRegion(address, size))
{
return false;
}
return true;
}
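// Worked example for ValidateCodeAddressAndSize (illustrative values, not taken from a real process):
// with a 39-bit address space, codeRegionEnd = 0x8000000 + 0x7ff8000000 = 0x8000000000.
// For address = 0x8000000 and size = 0x1000, endAddr = 0x8001000, so:
//   endAddr <= address               -> false (the range is non-empty and did not wrap)
//   endAddr - 1 > codeRegionEnd - 1  -> false (0x8000fff <= 0x7fffffffff)
// and the range passes this first check. A size of 0 gives endAddr == address and is rejected.
// The heap and alias region checks that follow still apply.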
private KernelResult ParseProcessInfo(ProcessCreationInfo creationInfo)
{
// Ensure that the current kernel version is equal to or above the minimum required.
uint requiredKernelVersionMajor = (uint)Capabilities.KernelReleaseVersion >> 19;
uint requiredKernelVersionMinor = ((uint)Capabilities.KernelReleaseVersion >> 15) & 0xf;
if (KernelContext.EnableVersionChecks)
{
if (requiredKernelVersionMajor > KernelVersionMajor)
{
return KernelResult.InvalidCombination;
}
if (requiredKernelVersionMajor != KernelVersionMajor && requiredKernelVersionMajor < 3)
{
return KernelResult.InvalidCombination;
}
if (requiredKernelVersionMinor > KernelVersionMinor)
{
return KernelResult.InvalidCombination;
}
}
KernelResult result = AllocateThreadLocalStorage(out ulong userExceptionContextAddress);
if (result != KernelResult.Success)
{
return result;
}
UserExceptionContextAddress = userExceptionContextAddress;
MemoryHelper.FillWithZeros(CpuMemory, (long)userExceptionContextAddress, KTlsPageInfo.TlsEntrySize);
Name = creationInfo.Name;
State = ProcessState.Created;
_creationTimestamp = PerformanceCounter.ElapsedMilliseconds;
MmuFlags = creationInfo.MmuFlags;
_version = creationInfo.Version;
TitleId = creationInfo.TitleId;
_entrypoint = creationInfo.CodeAddress;
_imageSize = (ulong)creationInfo.CodePagesCount * KMemoryManager.PageSize;
_useSystemMemBlocks = ((MmuFlags >> 6) & 1) != 0;
switch ((AddressSpaceType)((MmuFlags >> 1) & 7))
{
case AddressSpaceType.Addr32Bits:
case AddressSpaceType.Addr36Bits:
case AddressSpaceType.Addr39Bits:
_memoryUsageCapacity = MemoryManager.HeapRegionEnd -
MemoryManager.HeapRegionStart;
break;
case AddressSpaceType.Addr32BitsNoMap:
_memoryUsageCapacity = MemoryManager.HeapRegionEnd -
MemoryManager.HeapRegionStart +
MemoryManager.AliasRegionEnd -
MemoryManager.AliasRegionStart;
break;
default: throw new InvalidOperationException($"Invalid MMU flags value 0x{MmuFlags:x2}.");
}
GenerateRandomEntropy();
return KernelResult.Success;
}
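// Worked example of the MmuFlags fields decoded in this file (0x27 is a made-up value):
//   (0x27 >> 1) & 7 = 3  -> address space type field, cast to AddressSpaceType
//   (0x27 >> 5) & 1 = 1  -> ASLR enabled
//   (0x27 >> 6) & 1 = 0  -> same bit as the 0x40 mask: system memory blocks not requested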
public KernelResult AllocateThreadLocalStorage(out ulong address)
{
KernelContext.CriticalSection.Enter();
KernelResult result;
if (_freeTlsPages.Count > 0)
{
// If we have free TLS pages available, just use the first one.
KTlsPageInfo pageInfo = _freeTlsPages.Values.First();
if (!pageInfo.TryGetFreePage(out address))
{
throw new InvalidOperationException("Unexpected failure getting free TLS page!");
}
if (pageInfo.IsFull())
{
_freeTlsPages.Remove(pageInfo.PageAddr);
_fullTlsPages.Add(pageInfo.PageAddr, pageInfo);
}
result = KernelResult.Success;
}
else
{
// Otherwise, we need to create a new one.
result = AllocateTlsPage(out KTlsPageInfo pageInfo);
if (result == KernelResult.Success)
{
if (!pageInfo.TryGetFreePage(out address))
{
throw new InvalidOperationException("Unexpected failure getting free TLS page!");
}
_freeTlsPages.Add(pageInfo.PageAddr, pageInfo);
}
else
{
address = 0;
}
}
KernelContext.CriticalSection.Leave();
return result;
}
private KernelResult AllocateTlsPage(out KTlsPageInfo pageInfo)
{
pageInfo = default;
if (!KernelContext.UserSlabHeapPages.TryGetItem(out ulong tlsPagePa))
{
return KernelResult.OutOfMemory;
}
ulong regionStart = MemoryManager.TlsIoRegionStart;
ulong regionSize = MemoryManager.TlsIoRegionEnd - regionStart;
ulong regionPagesCount = regionSize / KMemoryManager.PageSize;
KernelResult result = MemoryManager.AllocateOrMapPa(
1,
KMemoryManager.PageSize,
tlsPagePa,
true,
regionStart,
regionPagesCount,
MemoryState.ThreadLocal,
MemoryPermission.ReadAndWrite,
out ulong tlsPageVa);
if (result != KernelResult.Success)
{
KernelContext.UserSlabHeapPages.Free(tlsPagePa);
}
else
{
pageInfo = new KTlsPageInfo(tlsPageVa);
MemoryHelper.FillWithZeros(CpuMemory, (long)tlsPageVa, KMemoryManager.PageSize);
}
return result;
}
public KernelResult FreeThreadLocalStorage(ulong tlsSlotAddr)
{
ulong tlsPageAddr = BitUtils.AlignDown(tlsSlotAddr, KMemoryManager.PageSize);
KernelContext.CriticalSection.Enter();
KernelResult result = KernelResult.Success;
KTlsPageInfo pageInfo = null;
if (_fullTlsPages.TryGetValue(tlsPageAddr, out pageInfo))
{
// TLS page was full, free slot and move to free pages tree.
_fullTlsPages.Remove(tlsPageAddr);
_freeTlsPages.Add(tlsPageAddr, pageInfo);
}
else if (!_freeTlsPages.TryGetValue(tlsPageAddr, out pageInfo))
{
result = KernelResult.InvalidAddress;
}
if (pageInfo != null)
{
pageInfo.FreeTlsSlot(tlsSlotAddr);
if (pageInfo.IsEmpty())
{
// TLS page is now empty, we should ensure it is removed
// from all trees, and free the memory it was using.
_freeTlsPages.Remove(tlsPageAddr);
KernelContext.CriticalSection.Leave();
FreeTlsPage(pageInfo);
return KernelResult.Success;
}
}
KernelContext.CriticalSection.Leave();
return result;
}
private KernelResult FreeTlsPage(KTlsPageInfo pageInfo)
{
if (!MemoryManager.TryConvertVaToPa(pageInfo.PageAddr, out ulong tlsPagePa))
{
throw new InvalidOperationException("Unexpected failure translating virtual address to physical.");
}
KernelResult result = MemoryManager.UnmapForKernel(pageInfo.PageAddr, 1, MemoryState.ThreadLocal);
if (result == KernelResult.Success)
{
KernelContext.UserSlabHeapPages.Free(tlsPagePa);
}
return result;
}
private void GenerateRandomEntropy()
{
// TODO.
}
public KernelResult Start(int mainThreadPriority, ulong stackSize)
{
lock (_processLock)
{
if (State > ProcessState.CreatedAttached)
{
return KernelResult.InvalidState;
}
if (ResourceLimit != null && !ResourceLimit.Reserve(LimitableResource.Thread, 1))
{
return KernelResult.ResLimitExceeded;
}
KResourceLimit threadResourceLimit = ResourceLimit;
KResourceLimit memoryResourceLimit = null;
if (_mainThreadStackSize != 0)
{
throw new InvalidOperationException("Trying to start a process with an invalid state!");
}
ulong stackSizeRounded = BitUtils.AlignUp(stackSize, KMemoryManager.PageSize);
ulong neededSize = stackSizeRounded + _imageSize;
// Check that the size needed for the code and the stack fits within the
// memory usage capacity of this process. Also check for a possible overflow
// in the addition above.
if (neededSize > _memoryUsageCapacity ||
neededSize < stackSizeRounded)
{
threadResourceLimit?.Release(LimitableResource.Thread, 1);
return KernelResult.OutOfMemory;
}
if (stackSizeRounded != 0 && ResourceLimit != null)
{
memoryResourceLimit = ResourceLimit;
if (!memoryResourceLimit.Reserve(LimitableResource.Memory, stackSizeRounded))
{
threadResourceLimit?.Release(LimitableResource.Thread, 1);
return KernelResult.ResLimitExceeded;
}
}
KernelResult result;
KThread mainThread = null;
ulong stackTop = 0;
void CleanUpForError()
{
HandleTable.Destroy();
mainThread?.DecrementReferenceCount();
if (_mainThreadStackSize != 0)
{
ulong stackBottom = stackTop - _mainThreadStackSize;
ulong stackPagesCount = _mainThreadStackSize / KMemoryManager.PageSize;
MemoryManager.UnmapForKernel(stackBottom, stackPagesCount, MemoryState.Stack);
_mainThreadStackSize = 0;
}
memoryResourceLimit?.Release(LimitableResource.Memory, stackSizeRounded);
threadResourceLimit?.Release(LimitableResource.Thread, 1);
}
if (stackSizeRounded != 0)
{
ulong stackPagesCount = stackSizeRounded / KMemoryManager.PageSize;
ulong regionStart = MemoryManager.StackRegionStart;
ulong regionSize = MemoryManager.StackRegionEnd - regionStart;
ulong regionPagesCount = regionSize / KMemoryManager.PageSize;
result = MemoryManager.AllocateOrMapPa(
stackPagesCount,
KMemoryManager.PageSize,
0,
false,
regionStart,
regionPagesCount,
MemoryState.Stack,
MemoryPermission.ReadAndWrite,
out ulong stackBottom);
if (result != KernelResult.Success)
{
CleanUpForError();
return result;
}
_mainThreadStackSize += stackSizeRounded;
stackTop = stackBottom + stackSizeRounded;
}
ulong heapCapacity = _memoryUsageCapacity - _mainThreadStackSize - _imageSize;
result = MemoryManager.SetHeapCapacity(heapCapacity);
if (result != KernelResult.Success)
{
CleanUpForError();
return result;
}
HandleTable = new KHandleTable(KernelContext);
result = HandleTable.Initialize(Capabilities.HandleTableSize);
if (result != KernelResult.Success)
{
CleanUpForError();
return result;
}
mainThread = new KThread(KernelContext);
result = mainThread.Initialize(
_entrypoint,
0,
stackTop,
mainThreadPriority,
DefaultCpuCore,
this);
if (result != KernelResult.Success)
{
CleanUpForError();
return result;
}
result = HandleTable.GenerateHandle(mainThread, out int mainThreadHandle);
if (result != KernelResult.Success)
{
CleanUpForError();
return result;
}
mainThread.SetEntryArguments(0, mainThreadHandle);
ProcessState oldState = State;
ProcessState newState = State != ProcessState.Created
? ProcessState.Attached
: ProcessState.Started;
SetState(newState);
// TODO: We can't call KThread.Start from a non-guest thread.
// We will need to make some changes to allow the creation of
// dummy threads that will be used to initialize the current
// thread on KCoreContext so that GetCurrentThread doesn't fail.
/* result = mainThread.Start();
if (result != KernelResult.Success)
{
SetState(oldState);
CleanUpForError();
} */
mainThread.Reschedule(ThreadSchedState.Running);
if (result == KernelResult.Success)
{
mainThread.IncrementReferenceCount();
}
mainThread.DecrementReferenceCount();
return result;
}
}
private void SetState(ProcessState newState)
{
if (State != newState)
{
State = newState;
_signaled = true;
Signal();
}
}
public KernelResult InitializeThread(
KThread thread,
ulong entrypoint,
ulong argsPtr,
ulong stackTop,
int priority,
int cpuCore)
{
lock (_processLock)
{
return thread.Initialize(entrypoint, argsPtr, stackTop, priority, cpuCore, this);
}
}
public void SubscribeThreadEventHandlers(ARMeilleure.State.ExecutionContext context)
{
context.Interrupt += InterruptHandler;
context.SupervisorCall += KernelContext.SyscallHandler.SvcCall;
context.Undefined += UndefinedInstructionHandler;
}
private void InterruptHandler(object sender, EventArgs e)
{
KernelContext.Scheduler.ContextSwitch();
}
public void IncrementThreadCount()
{
Interlocked.Increment(ref _threadCount);
KernelContext.ThreadCounter.AddCount();
}
public void DecrementThreadCountAndTerminateIfZero()
{
KernelContext.ThreadCounter.Signal();
if (Interlocked.Decrement(ref _threadCount) == 0)
{
Terminate();
}
}
public void DecrementToZeroWhileTerminatingCurrent()
{
KernelContext.ThreadCounter.Signal();
while (Interlocked.Decrement(ref _threadCount) != 0)
{
Destroy();
TerminateCurrentProcess();
}
// Nintendo panics here because, if execution reaches this point, the current thread should already be dead.
// As we handle the death of the thread in the post-SVC handler and inside the CPU emulator, we don't panic here.
}
public ulong GetMemoryCapacity()
{
ulong totalCapacity = (ulong)ResourceLimit.GetRemainingValue(LimitableResource.Memory);
totalCapacity += MemoryManager.GetTotalHeapSize();
totalCapacity += GetPersonalMmHeapSize();
totalCapacity += _imageSize + _mainThreadStackSize;
if (totalCapacity <= _memoryUsageCapacity)
{
return totalCapacity;
}
return _memoryUsageCapacity;
}
public ulong GetMemoryUsage()
{
return _imageSize + _mainThreadStackSize + MemoryManager.GetTotalHeapSize() + GetPersonalMmHeapSize();
}
public ulong GetMemoryCapacityWithoutPersonalMmHeap()
{
return GetMemoryCapacity() - GetPersonalMmHeapSize();
}
public ulong GetMemoryUsageWithoutPersonalMmHeap()
{
return GetMemoryUsage() - GetPersonalMmHeapSize();
}
private ulong GetPersonalMmHeapSize()
{
return GetPersonalMmHeapSize(PersonalMmHeapPagesCount, _memRegion);
}
private static ulong GetPersonalMmHeapSize(ulong personalMmHeapPagesCount, MemoryRegion memRegion)
{
if (memRegion == MemoryRegion.Applet)
{
return 0;
}
return personalMmHeapPagesCount * KMemoryManager.PageSize;
}
public void AddThread(KThread thread)
{
lock (_threadingLock)
{
thread.ProcessListNode = _threads.AddLast(thread);
}
}
public void RemoveThread(KThread thread)
{
lock (_threadingLock)
{
_threads.Remove(thread.ProcessListNode);
}
}
public bool IsCpuCoreAllowed(int core)
{
return (Capabilities.AllowedCpuCoresMask & (1L << core)) != 0;
}
public bool IsPriorityAllowed(int priority)
{
return (Capabilities.AllowedThreadPriosMask & (1L << priority)) != 0;
}
public override bool IsSignaled()
{
return _signaled;
}
public KernelResult Terminate()
{
KernelResult result;
bool shallTerminate = false;
KernelContext.CriticalSection.Enter();
lock (_processLock)
{
if (State >= ProcessState.Started)
{
if (State == ProcessState.Started ||
State == ProcessState.Crashed ||
State == ProcessState.Attached ||
State == ProcessState.DebugSuspended)
{
SetState(ProcessState.Exiting);
shallTerminate = true;
}
result = KernelResult.Success;
}
else
{
result = KernelResult.InvalidState;
}
}
KernelContext.CriticalSection.Leave();
if (shallTerminate)
{
UnpauseAndTerminateAllThreadsExcept(KernelContext.Scheduler.GetCurrentThread());
HandleTable.Destroy();
SignalExitToDebugTerminated();
SignalExit();
}
return result;
}
public void TerminateCurrentProcess()
{
bool shallTerminate = false;
KernelContext.CriticalSection.Enter();
lock (_processLock)
{
if (State >= ProcessState.Started)
{
if (State == ProcessState.Started ||
State == ProcessState.Attached ||
State == ProcessState.DebugSuspended)
{
SetState(ProcessState.Exiting);
shallTerminate = true;
}
}
}
KernelContext.CriticalSection.Leave();
if (shallTerminate)
{
UnpauseAndTerminateAllThreadsExcept(KernelContext.Scheduler.GetCurrentThread());
HandleTable.Destroy();
// NOTE: This is supposed to be called upon receiving the mailbox.
SignalExitToDebugExited();
SignalExit();
}
}
private void UnpauseAndTerminateAllThreadsExcept(KThread currentThread)
{
lock (_threadingLock)
{
KernelContext.CriticalSection.Enter();
foreach (KThread thread in _threads)
{
if ((thread.SchedFlags & ThreadSchedState.LowMask) != ThreadSchedState.TerminationPending)
{
thread.PrepareForTermination();
}
}
KernelContext.CriticalSection.Leave();
}
KThread blockedThread = null;
lock (_threadingLock)
{
foreach (KThread thread in _threads)
{
if (thread != currentThread && (thread.SchedFlags & ThreadSchedState.LowMask) != ThreadSchedState.TerminationPending)
{
thread.IncrementReferenceCount();
blockedThread = thread;
break;
}
}
}
if (blockedThread != null)
{
blockedThread.Terminate();
blockedThread.DecrementReferenceCount();
}
}
private void SignalExitToDebugTerminated()
{
// TODO: Debug events.
}
private void SignalExitToDebugExited()
{
// TODO: Debug events.
}
private void SignalExit()
{
if (ResourceLimit != null)
{
ResourceLimit.Release(LimitableResource.Memory, GetMemoryUsage());
}
KernelContext.CriticalSection.Enter();
SetState(ProcessState.Exited);
KernelContext.CriticalSection.Leave();
}
public KernelResult ClearIfNotExited()
{
KernelResult result;
KernelContext.CriticalSection.Enter();
lock (_processLock)
{
if (State != ProcessState.Exited && _signaled)
{
_signaled = false;
result = KernelResult.Success;
}
else
{
result = KernelResult.InvalidState;
}
}
KernelContext.CriticalSection.Leave();
return result;
}
private void InitializeMemoryManager(AddressSpaceType addrSpaceType, MemoryRegion memRegion)
{
int addrSpaceBits = addrSpaceType switch
{
AddressSpaceType.Addr32Bits => 32,
AddressSpaceType.Addr36Bits => 36,
AddressSpaceType.Addr32BitsNoMap => 32,
AddressSpaceType.Addr39Bits => 39,
_ => throw new ArgumentException($"Invalid address space type \"{addrSpaceType}\".", nameof(addrSpaceType))
};
CpuMemory = new MemoryManager(KernelContext.Memory, 1UL << addrSpaceBits, InvalidAccessHandler);
CpuContext = new CpuContext(CpuMemory);
// TODO: This should eventually be removed.
// The GPU shouldn't depend on the CPU memory manager at all.
KernelContext.Device.Gpu.SetVmm(CpuMemory);
MemoryManager = new KMemoryManager(KernelContext, CpuMemory);
}
private bool InvalidAccessHandler(ulong va)
{
KernelContext.Scheduler.GetCurrentThreadOrNull()?.PrintGuestStackTrace();
Logger.Error?.Print(LogClass.Cpu, $"Invalid memory access at virtual address 0x{va:X16}.");
return false;
}
private void UndefinedInstructionHandler(object sender, InstUndefinedEventArgs e)
{
KernelContext.Scheduler.GetCurrentThreadOrNull()?.PrintGuestStackTrace();
throw new UndefinedInstructionException(e.Address, e.OpCode);
}
protected override void Destroy()
{
CpuMemory.Dispose();
}
}
}