More efficient code can be generated if the shift amount is known at
compile time. Similar optimizations were present in JitArm64 already,
but were missing in Jit64.
- By using an 8-bit immediate we can eliminate the need for ECX as a
scratch register, thereby reducing register pressure and occasionally
eliminating a spill.
Before:
B9 18 00 00 00 mov ecx,18h
41 8B F7 mov esi,r15d
48 D3 E6 shl rsi,cl
8B F6 mov esi,esi
After:
41 8B CF mov ecx,r15d
C1 E1 18 shl ecx,18h
- PowerPC has unusual shift amount masking behavior: the low six bits of
the shift amount are used, and amounts of 32 to 63 yield zero. This is
emulated using 64-bit shifts, even though we only care about a 32-bit
result. If the shift amount is known, we can handle the zero-result
case separately, and use 32-bit shift instructions otherwise. We also no
longer need to clear the upper 32 bits of the register.
Before:
BE F8 FF FF FF mov esi,0FFFFFFF8h
8B CE mov ecx,esi
41 8B F4 mov esi,r12d
48 D3 E6 shl rsi,cl
8B F6 mov esi,esi
After:
No code is emitted; the register is set to constant zero.
- A shift by zero becomes a simple MOV.
Before:
BE 00 00 00 00 mov esi,0
8B CE mov ecx,esi
41 8B F3 mov esi,r11d
48 D3 E6 shl rsi,cl
8B F6 mov esi,esi
After:
41 8B FB mov edi,r11d
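The masking behavior above can be modeled in C++ roughly as follows. This is an illustrative sketch, not the Jit64 code: the function names are hypothetical, and it assumes the emulated instruction is PowerPC's `slw` (low six bits of the shift amount are used, amounts of 32 to 63 yield zero). It shows why a 64-bit shift followed by truncation reproduces that behavior, and checks the three constant amounts from the listings (18h, 0FFFFFFF8h, 0).

```cpp
#include <cstdint>

// Reference semantics (assumed: PowerPC slw). Only the low six bits of the
// shift amount matter; if bit 5 is set (amounts 32..63) the result is zero.
constexpr std::uint32_t SlwReference(std::uint32_t value, std::uint32_t amount)
{
  return ((amount & 0x20) != 0) ? 0u : (value << (amount & 0x1F));
}

// Generic emulation: zero-extend to 64 bits, shift by the low six bits
// (which is how a 64-bit x86 SHL by CL masks its count), then keep only
// the low 32 bits. For amounts 32..63 every set bit leaves the low half.
constexpr std::uint32_t SlwVia64BitShift(std::uint32_t value, std::uint32_t amount)
{
  return static_cast<std::uint32_t>(static_cast<std::uint64_t>(value) << (amount & 0x3F));
}

// In-range constant: equivalent to a 32-bit shift by an 8-bit immediate.
static_assert(SlwReference(0x12345678u, 0x18u) == 0x78000000u, "shift by 18h");
// Out-of-range constant: the result is a constant zero.
static_assert(SlwReference(0x12345678u, 0xFFFFFFF8u) == 0u, "shift by 0FFFFFFF8h");
// Zero constant: the result is just a copy of the source.
static_assert(SlwReference(0x12345678u, 0u) == 0x12345678u, "shift by 0");
// The 64-bit emulation agrees with the reference in all three cases.
static_assert(SlwVia64BitShift(0x12345678u, 0x18u) == 0x78000000u, "emulated 18h");
static_assert(SlwVia64BitShift(0x12345678u, 0xFFFFFFF8u) == 0u, "emulated 0FFFFFFF8h");
static_assert(SlwVia64BitShift(0x12345678u, 0u) == 0x12345678u, "emulated 0");
```

Since the amount is a compile-time constant in these cases, the choice between the three outcomes can be made once when the block is compiled rather than on every execution.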
More efficient code can be generated if the shift amount is known at
compile time. Similar optimizations were present in JitArm64 already,
but were missing in Jit64.
- By using an 8-bit immediate we can eliminate the need for ECX as a
scratch register, thereby reducing register pressure and occasionally
eliminating a spill.
Before:
B9 18 00 00 00 mov ecx,18h
45 8B C1 mov r8d,r9d
49 D3 E8 shr r8,cl
After:
45 8B C1 mov r8d,r9d
41 C1 E8 18 shr r8d,18h
- PowerPC has unusual shift amount masking behavior: the low six bits of
the shift amount are used, and amounts of 32 to 63 yield zero. This is
emulated using 64-bit shifts, even though we only care about a 32-bit
result. If the shift amount is known, we can handle the zero-result
case separately, and use 32-bit shift instructions otherwise.
Before:
B9 F8 FF FF FF mov ecx,0FFFFFFF8h
45 8B C1 mov r8d,r9d
49 D3 E8 shr r8,cl
After:
No code is emitted; the register is set to constant zero.
- A shift by zero becomes a simple MOV.
Before:
B9 00 00 00 00 mov ecx,0
45 8B C1 mov r8d,r9d
49 D3 E8 shr r8,cl
After:
45 8B C1 mov r8d,r9d
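The three cases above amount to a small decision made once per known shift amount. The sketch below models that decision in C++ for a right shift; it assumes the instruction is PowerPC's `srw`, and `ShiftPlan`, `PlanForKnownAmount`, and `SrwWithKnownAmount` are hypothetical names that do not come from Jit64. It emits no machine code and only mirrors which of the three outcomes applies.

```cpp
#include <cstdint>

// Hypothetical classification of a shift amount known at compile time.
enum class ShiftPlan
{
  ConstantZero,        // amounts 32..63 after masking: result is always 0
  PlainMove,           // amount 0: result is a copy of the source
  Shift32ByImmediate,  // amounts 1..31: 32-bit shift by an 8-bit immediate
};

constexpr ShiftPlan PlanForKnownAmount(std::uint32_t amount)
{
  return ((amount & 0x20) != 0) ? ShiftPlan::ConstantZero :
         ((amount & 0x1F) == 0) ? ShiftPlan::PlainMove :
                                  ShiftPlan::Shift32ByImmediate;
}

// Value the destination register ends up with under each plan.
constexpr std::uint32_t SrwWithKnownAmount(std::uint32_t value, std::uint32_t amount)
{
  return PlanForKnownAmount(amount) == ShiftPlan::ConstantZero ? 0u :
         PlanForKnownAmount(amount) == ShiftPlan::PlainMove    ? value :
                                                                 (value >> (amount & 0x1F));
}

// Generic path for comparison: 64-bit shift of the zero-extended value.
constexpr std::uint32_t SrwVia64BitShift(std::uint32_t value, std::uint32_t amount)
{
  return static_cast<std::uint32_t>(static_cast<std::uint64_t>(value) >> (amount & 0x3F));
}

// The constants from the listings: 18h, 0FFFFFFF8h and 0.
static_assert(PlanForKnownAmount(0x18u) == ShiftPlan::Shift32ByImmediate, "18h");
static_assert(PlanForKnownAmount(0xFFFFFFF8u) == ShiftPlan::ConstantZero, "0FFFFFFF8h");
static_assert(PlanForKnownAmount(0u) == ShiftPlan::PlainMove, "0");
static_assert(SrwWithKnownAmount(0x87654321u, 0x18u) == 0x87u, "shift by 18h");
static_assert(SrwWithKnownAmount(0x87654321u, 0x18u) == SrwVia64BitShift(0x87654321u, 0x18u),
              "matches the generic path");
```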
The enumerated LOG_TYPE "OSREPORT" is currently used in both EXI_DeviceIPL.cpp and HLE_OS.cpp. In many games, the large number of game functions that HLE_OS.cpp detects and logs under OSREPORT makes the log hard to read. This Pull Request remedies that by adding a new enumerated LOG_TYPE "OSREPORT_HLE" for use in HLE_OS.cpp.
Further changes to how logging in HLE_OS.cpp works may be desirable in the future. As it is, some of the detected game functions send a single character to the log, which is a major source of poor readability.
Introduces the system class that will eventually contain all relevant
system state, as opposed to everything being distributed all over the
place as global variables.
Throughout the codebase we have code whose interface does not describe
the dependencies it actually relies on, and we routinely run into
initialization issues where a facility is used before it has been
initialized. These cases are annoying to debug, because the reader
needs to go through the codebase to see what order things get
initialized in and how they're being used. This is a particularly
frequent issue in the video code.
Further, we also have a lot of code that makes use of file-scope
variables (many of which are non-trivial), which must all be
initialized before the application can actually enter main(). While this
may not be a huge issue in itself, some of these allocate, which means
that the application uses memory that it otherwise wouldn't need (e.g.
this excess memory is held even when a game isn't running).
Being able to wrap all these subsystems into objects would be nicer,
since they can be constructed when they're actually needed. Them being
objects also means we can express dependencies on subsystems as types
directly in the interface. That makes them explicit to the reader,
instead of a change randomly blowing up and the reader having to
inspect it, only to find out that something needed to be initialized
beforehand. With the global turned into a function parameter, the
reader knows just by reading the signature that the given subsystem
needs to be in a valid state before the function is called.
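As a rough illustration of the idea, the C++ sketch below contrasts nothing with the real codebase: `System`, `VideoBackend`, and `DescribeFrame` are hypothetical names chosen for this example only. It shows a subsystem owned by a system object, constructed when first needed, and passed to a function as an explicit parameter.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical subsystem; stands in for any facility that used to live in
// file-scope globals.
class VideoBackend
{
public:
  explicit VideoBackend(std::string name) : m_name(std::move(name)) {}
  const std::string& Name() const { return m_name; }

private:
  std::string m_name;
};

// Hypothetical system object that owns its subsystems. Nothing is
// allocated until a subsystem is actually requested.
class System
{
public:
  bool HasVideo() const { return m_video != nullptr; }

  VideoBackend& Video()
  {
    if (!m_video)
      m_video = std::make_unique<VideoBackend>("null");
    return *m_video;
  }

private:
  std::unique_ptr<VideoBackend> m_video;
};

// The dependency is part of the signature: a caller cannot reach this
// function without already holding a constructed VideoBackend.
std::string DescribeFrame(const VideoBackend& video, int frame)
{
  return video.Name() + " frame " + std::to_string(frame);
}
```

A caller would write something like `DescribeFrame(system.Video(), 3)`, so the need for an initialized video subsystem is visible at the call site instead of being hidden behind a global.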
For a prior example of an emulator that has moved to this model, see
yuzu, which has been migrated from scattered global variables to a
system instance (and has now reached the stage where the singleton can
be removed).
We want to clear/memset the padding bytes, not just each member,
so using assignment or {} initialization is not an option.
To silence the warnings, cast the object pointer to u8* (which is not
undefined behavior) to make it explicit to the compiler that we want
to fill the object representation.
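A minimal sketch of the pattern follows. It assumes the warning in question is of the kind GCC reports under -Wclass-memaccess, and `Entry`, `ClearObjectRepresentation`, and `AllBytesZero` are hypothetical names, not taken from the codebase. The struct has default member initializers, so its default constructor is non-trivial, and on common ABIs there are padding bytes between `flag` and `value`.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

using u8 = std::uint8_t;
using u32 = std::uint32_t;

// Hypothetical struct. Value-initialization or member-wise assignment is
// not guaranteed to touch the padding bytes after `flag`.
struct Entry
{
  u8 flag = 1;
  u32 value = 0xDEADBEEF;
};

// Zero the whole object representation, padding included. The cast to u8*
// tells the compiler that a raw byte fill is intended.
inline void ClearObjectRepresentation(Entry* entry)
{
  std::memset(reinterpret_cast<u8*>(entry), 0, sizeof(Entry));
}

// Inspect every byte of the object, including padding.
inline bool AllBytesZero(const Entry& entry)
{
  const u8* bytes = reinterpret_cast<const u8*>(&entry);
  for (std::size_t i = 0; i < sizeof(Entry); ++i)
  {
    if (bytes[i] != 0)
      return false;
  }
  return true;
}
```

This matters when the bytes of the object are later hashed, compared with memcmp, or written out, since leftover padding would otherwise make identical values look different.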