Lime3DS/externals/CMakeLists.txt
Wunk e13735b624
video_core: Implement an arm64 shader-jit backend (#7002)
* externals: Add oaksim submodule

Used for emitting ARM64 assembly

* common: Implement aarch64 ABI

Utilize oaknut to implement a stack frame.

* tests: Allow shader-jit tests for x64 and a64

Run the shader-jit tests for both x86_64 and arm64 targets

* video_core: Initialize arm64 shader-jit backend

Passes all current unit tests!

* shader_jit_a64: protect/unprotect memory when jit-ing

Required on MacOS. Memory needs to be fully unprotected and then
re-protected when writing or there will be memory access errors on
MacOS.

* shader_jit_a64: Fix ARM64-Imm overflow

These conditionals were throwing exceptions since the immediate values
were overflowing the available space in the `EOR` instructions. Instead
they are generated from `MOV` and then `EOR`-ed after.

* shader_jit_a64: Fix Geometry shader conditional

* shader_jit_a64: Replace `ADRL` with `MOVP2R`

Fixes some immediate-generation exceptions.

* common/aarch64: Fix CallFarFunction

* shader_jit_a64: Optimize `SantitizedMul`

Co-authored-by: merryhime <merryhime@users.noreply.github.com>

* shader_jit_a64: Fix address register offset behavior

Based on https://github.com/citra-emu/citra/pull/6942
Passes unit tests.

* shader_jit_a64: Fix `RET` address offset

A64 stack is 16-byte aligned rather than 8. So a direct port of the x64
code won't work. Fixes weird branches into invalid memory for any
shaders with subroutines.

* shader_jit_a64: Increase max program size

Tuned for A64 program size.

* shader_jit_a64: Use `UBFX` for extracting loop-state

Co-authored-by: JosJuice <JosJuice@users.noreply.github.com>

* shader_jit_a64: Optimize `SUB+CMP` to `SUBS`

Co-authored-by: JosJuice <JosJuice@users.noreply.github.com>

* shader_jit_a64: Optimize `CMP+B` to `CBNZ`

Co-authored-by: JosJuice <JosJuice@users.noreply.github.com>

* shader_jit_a64: Use `FMOV` for `ONE` vector

Co-authored-by: JosJuice <JosJuice@users.noreply.github.com>

* shader_jit_a64: Remove x86-specific documentation

* shader_jit_a64: Use `UBFX` to extract exponent

Co-authored-by: JosJuice <JosJuice@users.noreply.github.com>

* shader_jit_a64: Remove redundant MIN/MAX `SRC2`-NaN check

Special handling only needs to check SRC1 for NaN, not SRC2.
It would work as follows in the four possible cases:

No NaN: No special handling needed.
Only SRC1 is NaN: The special handling is triggered because SRC1 is NaN, and SRC2 is picked.
Only SRC2 is NaN: FMAX automatically picks SRC2 because it always picks the NaN if there is one.
Both SRC1 and SRC2 are NaN: The special handling is triggered because SRC1 is NaN, and SRC2 is picked.

Co-authored-by: JosJuice <JosJuice@users.noreply.github.com>

* shader_jit/tests:: Add catch-stringifier for vec2f/vec3f

* shader_jit/tests: Add Dest Mask unit test

* shader_jit_a64: Fix Dest-Mask `BSL` operand order

Passes the dest-mask unit tests now.

* shader_jit_a64: Use `MOVI` for DestEnable mask

Accelerate certain cases of masking with MOVI as well

Co-authored-by: JosJuice <JosJuice@users.noreply.github.com>

* shader_jit/tests: Add source-swizzle unit test

This is not expansive. Generating all `4^4` cases seems to make Catch2
crash. So I've added some component-masking(non-reordering) tests based
on the Dest-Mask unit-test and some additional ones to test
broadcasts/splats and component re-ordering.

* shader_jit_a64: Fix swizzle index generation

This was still generating `SHUFPS` indices and not the ones that we wanted for the `TBL` instruction. Passes all unit tests now.

* shader_jit/tests: Add `ShaderSetup` constructor to `ShaderTest`

Rather than using the direct output of `CompileShaderSetup` allow a
`ShaderSetup` object to be passed in directly.  This enabled the ability
emit assembly that is not directly supported by nihstro.

* shader_jit/tests: Add `CALL` unit-test

Tests nested `CALL` instructions to eventually reach an `EX2`
instruction.

EX2 is picked in particular since it is implemented as an even deeper
dispatch and ensures subroutines are properly implemented between `CALL`
instructions and implementation-calls.

* shader_jit_a64: Fix nested `BL` subroutines

`lr` was getting writen over by nested calls to `BL`, causing undefined
behavior with mixtures of `CALL`, `EX2`, and `LG2` instructions.

Each usage of `BL` is now protected with a stach push/pop to preserve
and restore teh `lr` register to allow nested subroutines to work
properly.

* shader_jit/tests: Allocate generated tests on heap

Each of these generated shader-test objects were causing the stack to
overflow.  Allocate each of the generated tests on the heap and use
unique_ptr so they only exist within the life-time of the `REQUIRE`
statement.

* shader_jit_a64: Preserve `lr` register from external function calls

`EMIT` makes an external function call, and should be preserving `lr`

* shader_jit/tests: Add `MAD` unit-test

The Inline Asm version requires an upstream fix:
https://github.com/neobrain/nihstro/issues/68

Instead, the program code is manually configured and added.

* shader_jit/tests: Fix uninitialized instructions

These `union`-type instruction-types were uninitialized, causing tests
to indeterminantly fail at times.

* shader_jit_a64: Remove unneeded `MOV`

Residue from the direct-port of x64 code.

* shader_jit_a64: Use `std::array` for `instr_table`

Add some type-safety and const-correctness around this type as well.

* shader_jit_a64: Avoid c-style offset casting

Add some more const-correctness to this function as well.

* video_core: Add arch preprocessor comments

* common/aarch64: Use X16 as the veneer register

https://developer.arm.com/documentation/102374/0101/Procedure-Call-Standard

* shader_jit/tests: Add uniform reading unit-test

Particularly to ensure that addresses are being properly truncated

* common/aarch64: Use `X0` as `ABI_RETURN`

`X8` is used as the indirect return result value in the case that the
result is bigger than 128-bits. Principally `X0` is the general-case
return register though.

* common/aarch64: Add veneer register note

`LR` is generally overwritten by `BLR` anyways, and would also be a safe
veneer to utilize for far-calls.

* shader_jit_a64: Remove unneeded scratch register from `SanitizedMul`

* shader_jit_a64: Fix CALLU condition

Should be `EQ` not `NE`. Fixes the regression on Kid Icarus.
No known regressions anymore!

---------

Co-authored-by: merryhime <merryhime@users.noreply.github.com>
Co-authored-by: JosJuice <JosJuice@users.noreply.github.com>
2023-11-05 21:40:31 +01:00

367 lines
11 KiB
CMake
Vendored

# Definitions for all external bundled libraries
# Suppress warnings from external libraries
if (CMAKE_CXX_COMPILER_ID MATCHES "MSVC")
add_compile_options(/W0)
else()
add_compile_options(-w)
endif()
set(CMAKE_MODULE_PATH ${CMAKE_MODULE_PATH} ${PROJECT_SOURCE_DIR}/CMakeModules)
include(DownloadExternals)
include(ExternalProject)
# Boost
if (NOT USE_SYSTEM_BOOST)
message(STATUS "Including vendored Boost library")
set(BOOST_ROOT "${CMAKE_SOURCE_DIR}/externals/boost" CACHE STRING "")
set(Boost_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/externals/boost" CACHE STRING "")
set(Boost_NO_SYSTEM_PATHS ON CACHE BOOL "")
add_library(boost INTERFACE)
target_include_directories(boost SYSTEM INTERFACE ${Boost_INCLUDE_DIR})
# Boost::serialization
file(GLOB boost_serialization_SRC "${CMAKE_SOURCE_DIR}/externals/boost/libs/serialization/src/*.cpp")
add_library(boost_serialization STATIC ${boost_serialization_SRC})
target_link_libraries(boost_serialization PUBLIC boost)
# Boost::iostreams
add_library(
boost_iostreams
STATIC
${CMAKE_SOURCE_DIR}/externals/boost/libs/iostreams/src/file_descriptor.cpp
${CMAKE_SOURCE_DIR}/externals/boost/libs/iostreams/src/mapped_file.cpp
)
target_link_libraries(boost_iostreams PUBLIC boost)
# Add additional boost libs here; remember to ALIAS them in the root CMakeLists!
else()
unset(BOOST_ROOT CACHE)
unset(Boost_INCLUDE_DIR CACHE)
set(Boost_NO_SYSTEM_PATHS OFF CACHE BOOL "" FORCE)
endif()
# Catch2
set(CATCH_INSTALL_DOCS OFF CACHE BOOL "")
set(CATCH_INSTALL_EXTRAS OFF CACHE BOOL "")
add_subdirectory(catch2)
# Crypto++
if(USE_SYSTEM_CRYPTOPP)
find_package(cryptopp REQUIRED)
add_library(cryptopp INTERFACE)
target_link_libraries(cryptopp INTERFACE cryptopp::cryptopp)
else()
set(CRYPTOPP_BUILD_DOCUMENTATION OFF CACHE BOOL "")
set(CRYPTOPP_BUILD_TESTING OFF CACHE BOOL "")
set(CRYPTOPP_INSTALL OFF CACHE BOOL "")
set(CRYPTOPP_SOURCES "${CMAKE_SOURCE_DIR}/externals/cryptopp" CACHE STRING "")
add_subdirectory(cryptopp-cmake)
endif()
# dds-ktx
add_library(dds-ktx INTERFACE)
target_include_directories(dds-ktx INTERFACE ./dds-ktx)
# fmt and Xbyak need to be added before dynarmic
# libfmt
if(USE_SYSTEM_FMT)
add_library(fmt INTERFACE)
find_package(fmt REQUIRED)
target_link_libraries(fmt INTERFACE fmt::fmt)
else()
option(FMT_INSTALL "" ON)
add_subdirectory(fmt EXCLUDE_FROM_ALL)
endif()
# Xbyak
if ("x86_64" IN_LIST ARCHITECTURE)
if(USE_SYSTEM_XBYAK)
find_package(xbyak REQUIRED)
add_library(xbyak INTERFACE)
target_link_libraries(xbyak INTERFACE xbyak::xbyak)
else()
add_subdirectory(xbyak EXCLUDE_FROM_ALL)
endif()
endif()
# Oaknut
if ("arm64" IN_LIST ARCHITECTURE)
add_subdirectory(oaknut EXCLUDE_FROM_ALL)
endif()
# Dynarmic
if ("x86_64" IN_LIST ARCHITECTURE OR "arm64" IN_LIST ARCHITECTURE)
if(USE_SYSTEM_DYNARMIC)
find_package(dynarmic REQUIRED)
add_library(dynarmic INTERFACE)
target_link_libraries(dynarmic INTERFACE dynarmic::dynarmic)
# The dynarmic package's cmake files are helpfully completely silent
# so we have to inform the user of its status ourselves
if(TARGET dynarmic::dynarmic)
message(STATUS "Found dynarmic")
endif()
else()
set(DYNARMIC_TESTS OFF CACHE BOOL "")
set(DYNARMIC_FRONTENDS "A32" CACHE STRING "")
set(DYNARMIC_USE_PRECOMPILED_HEADERS ${CITRA_USE_PRECOMPILED_HEADERS} CACHE BOOL "")
add_subdirectory(dynarmic EXCLUDE_FROM_ALL)
endif()
endif()
# getopt
if (MSVC)
add_subdirectory(getopt)
endif()
# Glad
add_subdirectory(glad)
# glslang
if(USE_SYSTEM_GLSLANG)
find_package(glslang REQUIRED)
add_library(glslang INTERFACE)
add_library(SPIRV INTERFACE)
target_link_libraries(glslang INTERFACE glslang::glslang)
target_link_libraries(SPIRV INTERFACE glslang::SPIRV)
# System include path is different from submodule include path
get_target_property(GLSLANG_PREFIX glslang::SPIRV INTERFACE_INCLUDE_DIRECTORIES)
target_include_directories(SPIRV SYSTEM INTERFACE "${GLSLANG_PREFIX}/glslang")
else()
set(SKIP_GLSLANG_INSTALL ON CACHE BOOL "")
set(ENABLE_GLSLANG_BINARIES OFF CACHE BOOL "")
set(ENABLE_SPVREMAPPER OFF CACHE BOOL "")
set(ENABLE_CTEST OFF CACHE BOOL "")
set(ENABLE_HLSL OFF CACHE BOOL "")
set(BUILD_EXTERNAL OFF CACHE BOOL "")
add_subdirectory(glslang)
endif()
# inih
if(USE_SYSTEM_INIH)
find_package(inih REQUIRED COMPONENTS inih inir)
add_library(inih INTERFACE)
target_link_libraries(inih INTERFACE inih::inih inih::inir)
else()
add_subdirectory(inih)
endif()
# MicroProfile
add_library(microprofile INTERFACE)
target_include_directories(microprofile SYSTEM INTERFACE ./microprofile)
# Nihstro
add_library(nihstro-headers INTERFACE)
target_include_directories(nihstro-headers SYSTEM INTERFACE ./nihstro/include)
if (MSVC)
# TODO: For some reason MSVC still applies this warning even with /W0 for externals.
target_compile_options(nihstro-headers INTERFACE /wd4715)
endif()
# Open Source Archives
add_subdirectory(open_source_archives)
# faad2
add_subdirectory(faad2 EXCLUDE_FROM_ALL)
# Dynamic library headers
add_library(library-headers INTERFACE)
if (USE_SYSTEM_FFMPEG_HEADERS)
find_path(SYSTEM_FFMPEG_INCLUDES NAMES libavutil/avutil.h)
if (SYSTEM_FFMPEG_INCLUDES STREQUAL "SYSTEM_FFMPEG_INCLUDES-NOTFOUND")
message(WARNING "System FFmpeg headers not found. Falling back on bundled headers.")
else()
message(STATUS "Using system FFmpeg headers.")
target_include_directories(library-headers SYSTEM INTERFACE ${SYSTEM_FFMPEG_INCLUDES})
set(FOUND_FFMPEG_HEADERS ON)
endif()
endif()
if (NOT FOUND_FFMPEG_HEADERS)
message(STATUS "Using bundled ffmpeg headers.")
target_include_directories(library-headers SYSTEM INTERFACE ./library-headers/ffmpeg/include)
endif()
# SoundTouch
if(NOT USE_SYSTEM_SOUNDTOUCH)
set(INTEGER_SAMPLES ON CACHE BOOL "")
set(SOUNDSTRETCH OFF CACHE BOOL "")
set(SOUNDTOUCH_DLL OFF CACHE BOOL "")
add_subdirectory(soundtouch EXCLUDE_FROM_ALL)
target_compile_definitions(SoundTouch PUBLIC SOUNDTOUCH_INTEGER_SAMPLES)
endif()
# sirit
add_subdirectory(sirit EXCLUDE_FROM_ALL)
# Teakra
add_subdirectory(teakra EXCLUDE_FROM_ALL)
# SDL2
if (ENABLE_SDL2 AND NOT USE_SYSTEM_SDL2)
add_subdirectory(sdl2)
endif()
# libusb
if (ENABLE_LIBUSB AND NOT USE_SYSTEM_LIBUSB)
add_subdirectory(libusb)
set(LIBUSB_INCLUDE_DIR "" PARENT_SCOPE)
set(LIBUSB_LIBRARIES usb PARENT_SCOPE)
endif()
# Zstandard
if(USE_SYSTEM_ZSTD)
find_package(zstd REQUIRED)
add_library(zstd INTERFACE)
if(TARGET zstd::libzstd_shared)
message(STATUS "Found system Zstandard")
endif()
target_link_libraries(zstd INTERFACE zstd::libzstd_shared)
else()
set(ZSTD_LEGACY_SUPPORT OFF)
set(ZSTD_BUILD_PROGRAMS OFF)
set(ZSTD_BUILD_SHARED OFF)
add_subdirectory(zstd/build/cmake EXCLUDE_FROM_ALL)
target_include_directories(libzstd_static INTERFACE $<BUILD_INTERFACE:${CMAKE_SOURCE_DIR}/externals/zstd/lib>)
add_library(zstd ALIAS libzstd_static)
endif()
# ENet
if(USE_SYSTEM_ENET)
find_package(libenet REQUIRED)
add_library(enet INTERFACE)
target_link_libraries(enet INTERFACE libenet::libenet)
else()
add_subdirectory(enet)
target_include_directories(enet INTERFACE ./enet/include)
endif()
# Cubeb
if (ENABLE_CUBEB)
if(USE_SYSTEM_CUBEB)
find_package(cubeb REQUIRED)
add_library(cubeb INTERFACE)
target_link_libraries(cubeb INTERFACE cubeb::cubeb)
if(TARGET cubeb::cubeb)
message(STATUS "Found system cubeb")
endif()
else()
set(BUILD_TESTS OFF CACHE BOOL "")
set(BUILD_TOOLS OFF CACHE BOOL "")
set(BUNDLE_SPEEX ON CACHE BOOL "")
add_subdirectory(cubeb EXCLUDE_FROM_ALL)
endif()
endif()
# DiscordRPC
if (USE_DISCORD_PRESENCE)
add_subdirectory(discord-rpc EXCLUDE_FROM_ALL)
target_include_directories(discord-rpc INTERFACE ./discord-rpc/include)
endif()
# JSON
add_library(json-headers INTERFACE)
if (USE_SYSTEM_JSON)
find_package(nlohmann_json REQUIRED)
target_link_libraries(json-headers INTERFACE nlohmann_json::nlohmann_json)
get_target_property(NLOHMANN_PREFIX nlohmann_json::nlohmann_json INTERFACE_INCLUDE_DIRECTORIES)
# The nlohmann-json3 package expects "#include <nlohmann/json.hpp>"
# Citra uses "#include <json.hpp>" so we have to add this manually
target_include_directories(json-headers SYSTEM INTERFACE "${NLOHMANN_PREFIX}/nlohmann")
else()
target_include_directories(json-headers SYSTEM INTERFACE ./json)
endif()
# OpenSSL
if (USE_SYSTEM_OPENSSL)
find_package(OpenSSL 1.1)
if (OPENSSL_FOUND)
set(OPENSSL_LIBRARIES OpenSSL::SSL OpenSSL::Crypto)
endif()
endif()
if (NOT OPENSSL_FOUND)
# LibreSSL
set(LIBRESSL_SKIP_INSTALL ON CACHE BOOL "")
set(OPENSSLDIR "/etc/ssl/")
add_subdirectory(libressl EXCLUDE_FROM_ALL)
target_include_directories(ssl SYSTEM INTERFACE ./libressl/include)
target_compile_definitions(ssl PRIVATE -DHAVE_INET_NTOP)
get_directory_property(OPENSSL_LIBRARIES
DIRECTORY libressl
DEFINITION OPENSSL_LIBS)
endif()
# httplib
add_library(httplib INTERFACE)
if(USE_SYSTEM_CPP_HTTPLIB)
find_package(CppHttp 0.14.1)
if(CppHttp_FOUND)
target_link_libraries(httplib INTERFACE httplib::httplib)
else()
message(STATUS "Cpp-httplib not found or not suitable version! Falling back to bundled...")
target_include_directories(httplib SYSTEM INTERFACE ./httplib)
endif()
else()
target_include_directories(httplib SYSTEM INTERFACE ./httplib)
endif()
target_compile_options(httplib INTERFACE -DCPPHTTPLIB_OPENSSL_SUPPORT)
target_link_libraries(httplib INTERFACE ${OPENSSL_LIBRARIES})
if(ANDROID)
add_subdirectory(android-ifaddrs)
target_link_libraries(httplib INTERFACE ifaddrs)
endif()
# cpp-jwt
if (ENABLE_WEB_SERVICE)
if (USE_SYSTEM_CPP_JWT)
find_package(cpp-jwt REQUIRED)
add_library(cpp-jwt INTERFACE)
target_link_libraries(cpp-jwt INTERFACE cpp-jwt::cpp-jwt)
else()
add_library(cpp-jwt INTERFACE)
target_include_directories(cpp-jwt SYSTEM INTERFACE ./cpp-jwt/include)
target_compile_definitions(cpp-jwt INTERFACE CPP_JWT_USE_VENDORED_NLOHMANN_JSON)
endif()
endif()
# lodepng
add_subdirectory(lodepng)
# (xperia64): Only use libyuv on Android b/c of build issues on Windows and mandatory JPEG
if(ANDROID)
# libyuv
add_subdirectory(libyuv)
target_include_directories(yuv INTERFACE ./libyuv/include)
endif()
# OpenAL Soft
if (ENABLE_OPENAL)
set(ALSOFT_EMBED_HRTF_DATA OFF CACHE BOOL "")
set(ALSOFT_EXAMPLES OFF CACHE BOOL "")
set(ALSOFT_INSTALL OFF CACHE BOOL "")
set(ALSOFT_INSTALL_CONFIG OFF CACHE BOOL "")
set(ALSOFT_INSTALL_HRTF_DATA OFF CACHE BOOL "")
set(ALSOFT_INSTALL_AMBDEC_PRESETS OFF CACHE BOOL "")
set(ALSOFT_UTILS OFF CACHE BOOL "")
set(LIBTYPE "STATIC" CACHE STRING "")
add_subdirectory(openal-soft EXCLUDE_FROM_ALL)
endif()
# VMA
add_library(vma INTERFACE)
target_include_directories(vma SYSTEM INTERFACE ./vma/include)
# vulkan-headers
add_library(vulkan-headers INTERFACE)
target_include_directories(vulkan-headers SYSTEM INTERFACE ./vulkan-headers/include)
if (APPLE)
target_include_directories(vulkan-headers SYSTEM INTERFACE ${CMAKE_CURRENT_SOURCE_DIR}/MoltenVK)
endif()
# adrenotools
if (ANDROID AND "arm64" IN_LIST ARCHITECTURE)
add_subdirectory(libadrenotools)
endif()