When performing a default compilation with recent GCC & glibc,
the use of -Werror=nonnull causes a build error.
The error is given as IOFile::ClearError() can call std::clearerr()
with a null file, which can trigger a null-pointer dereference in libc.
Change the std::clearerr() call to be conditional on a file being open.
Unlike ADD (immediate), MOV (register) treats SP as ZR. Therefore the
ADDI2R optimization that was added in 67791d227c can't optimize ADD to
MOV when exactly one of the registers is SP.
There currently isn't any code in Dolphin that calls ADDI2R with
parameters that would trigger this case.
Use 1 of the same type as the stored value when shifting left. This
prevents undefined behavior caused by shifting an int more than 31 bits.
Previously iterator incrementation could either hang or prematurely
report it had reached the end of the bitset.
So somehow I forgot that AArch64 uses three-operand encoding...
Fixes a regression from 6303416201 which manifested in various ways,
such as incorrect rendering of the Wind Waker title screen.
The codegen for the functions themselves, not for the emitted code.
This seems to save 32 bytes per function. We also get rid of the oddity
we had before where ANDI2R would do masking for 32-bit operations but
the other functions wouldn't.
The alias for __m128i is typically something like:
typedef long long __m128i __attribute__((__vector_size__(16), __may_alias__));
and the part that ends up not getting preserved is the __may_alias__
attribute specifier.
So, in order to preserve that, we can just use a wrapper struct, so the
data type itself isn't being passed through the template.
We don't need to enforce the use of std::string instances with
AddSetting(). We can accept views and only construct one string,
rather than three temporaries.
Internal details: The large region is split into individual same-sized blocks of memory. On creation, we allocate a single block of memory that will always remain zero, and map that into the entire memory region. Then, the first time any of these blocks is written to, we swap the mapped zero block out with a newly allocated block of memory. On clear, we swap back to the zero block and deallocate the data blocks. That way we only actually allocate one zero block as well as a handful of real data blocks where the JitCache actually writes to.
This is a little trick I came up with that lets us restructure our float
classification code so we can exit earlier when the float is normal,
which is the case more often than not.
First we shift left by 1 to get rid of the sign bit, and then we count
the number of leading sign bits. If the result is less than 10 (for
doubles) or 7 (for floats), the float is normal. This is because, if the
float isn't normal, the exponent is either all zeroes or all ones.
With this, situations where multiple arguments need to be moved
from multiple registers become easy to handle, and we also get
compile-time checking that the number of arguments is correct.
This is needed so that the checks added in the previous commit will be
reevaluated if the value of m_enable_dcache changes.
JitArm64 was already recompiling its asm routines on cache clear by
necessity. It doesn't have the same setup as Jit64 where the asm
routines are in a separate region, so clearing the JitArm64 cache
results in the asm routines being cleared too.
This value is used in a multiplication. The result of this
multiplication is then subtracted from m_base. By negating m_dec, we are
free to use an addition instead.
On x64, this saves an instruction.