path: root/src/video_core/shader/decode (follow)
Commit message (Author, Age, Files, Lines)
* shader: Remove old shader management (ReinUsesLisp, 2021-07-22, 28 files, -4919/+0)
|
* Review 1 (Kelebek1, 2021-02-15, 1 file, -1/+1)
|
* Implement texture offset support for TexelFetch and TextureGather and add offsets for Tlds (Kelebek1, 2021-02-15, 2 files, -2/+10)
| Formatting
* video_core: Reimplement the buffer cache (ReinUsesLisp, 2021-02-13, 1 file, -0/+1)
| Reimplement the buffer cache using cached bindings and page level granularity for modification tracking. This also drops the usage of shared pointers and virtual functions from the cache.
|
| - Bindings are cached, allowing work to be skipped when the game changes few bits between draws.
| - OpenGL Assembly shaders no longer copy when a region has been modified from the GPU to emulate constant buffers; instead, GL_EXT_memory_object is used to alias sub-buffers within the same allocation.
| - OpenGL Assembly shaders stream constant buffer data using glProgramBufferParametersIuivNV, from NV_parameter_buffer_object. In theory this should save one hash table resolve inside the driver compared to glBufferSubData.
| - A new OpenGL stream buffer is implemented based on fences for drivers other than Nvidia's proprietary one, due to their low performance on partial glBufferSubData calls synchronized with 3D rendering (that some games use a lot).
| - Most optimizations are shared between APIs now, allowing Vulkan to cache more bindings than before, skipping unnecessary work.
|
| This commit adds the necessary infrastructure to use Vulkan objects from OpenGL. Overall, it improves performance and fixes some bugs present in the old cache. There are still some edge cases hit by some games that harm performance on some vendors; these are planned to be fixed in later commits.
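The "page level granularity for modification tracking" mentioned above can be pictured with a small sketch. This is not the cache's actual code: the class name `PageTracker`, its methods, and the 4 KiB page size are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical dirty-page tracker. One flag per page; the 4 KiB page size
// (kPageBits = 12) is an assumption, not a value taken from the commit.
class PageTracker {
public:
    explicit PageTracker(std::size_t size_bytes)
        : dirty_((size_bytes + kPageSize - 1) >> kPageBits, false) {}

    // Flags every page overlapped by [offset, offset + size) as modified.
    void MarkModified(std::size_t offset, std::size_t size) {
        for (std::size_t page = FirstPage(offset); page < EndPage(offset, size); ++page) {
            dirty_[page] = true;
        }
    }

    // Returns true if any page overlapped by [offset, offset + size) is flagged.
    bool IsModified(std::size_t offset, std::size_t size) const {
        for (std::size_t page = FirstPage(offset); page < EndPage(offset, size); ++page) {
            if (dirty_[page]) {
                return true;
            }
        }
        return false;
    }

private:
    static constexpr std::size_t kPageBits = 12;
    static constexpr std::size_t kPageSize = std::size_t{1} << kPageBits;

    static std::size_t FirstPage(std::size_t offset) {
        return offset >> kPageBits;
    }

    // One past the last page touched by the range, clamped to the tracked size.
    std::size_t EndPage(std::size_t offset, std::size_t size) const {
        if (size == 0) {
            return FirstPage(offset);
        }
        const std::size_t end = (offset + size + kPageSize - 1) >> kPageBits;
        return end < dirty_.size() ? end : dirty_.size();
    }

    std::vector<bool> dirty_;
};
```

A two-byte write that straddles a page boundary flags both pages, while untouched pages stay clean, which is what lets a cache skip work for regions that did not change.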
* half_set: Resolve -Wmaybe-uninitialized warnings (Lioncash, 2020-12-30, 1 file, -7/+7)
|
* video_core: Rewrite the texture cache (ReinUsesLisp, 2020-12-30, 2 files, -32/+35)
| The current texture cache has several points that hurt maintainability and performance. It's easy to break unrelated parts of the cache when doing minor changes. The cache can easily forget valuable information about the cached textures by CPU writes or simply by its normal usage.
|
| This commit aims to address those issues.
* video_core: Remove unnecessary enum class casting in logging messages (Lioncash, 2020-12-07, 9 files, -48/+38)
| | | | | | | fmt now automatically prints the numeric value of an enum class member by default, so we don't need to use casts any more. Reduces the line noise a bit.
* video_core: Resolve more variable shadowing scenarios pt.3 (Lioncash, 2020-12-05, 1 file, -3/+4)
| | | | | Cleans out the rest of the occurrences of variable shadowing and makes any further occurrences of shadowing compiler errors.
* video_core: Resolve more variable shadowing scenarios pt.2 (Lioncash, 2020-12-05, 2 files, -10/+10)
| | | | | | | Migrates the video core code closer to enabling variable shadowing warnings as errors. This primarily sorts out shadowing occurrences within the Vulkan code.
* Merge pull request #3681 from lioncash/component (Rodrigo Locatti, 2020-11-24, 1 file, -2/+2)
|\ | | | | decoder/image: Fix incorrect G24R8 component sizes in GetComponentSize()
| * decode/image: Fix typo in assert in GetComponentSize() (Lioncash, 2020-04-15, 1 file, -3/+3)
| |
| * decoder/image: Fix incorrect G24R8 component sizes in GetComponentSize() (Lioncash, 2020-04-15, 1 file, -2/+2)
| | | | | | | | The components' sizes were mismatched. This corrects that.
* | Merge pull request #4854 from ReinUsesLisp/cube-array-shadow (bunnei, 2020-11-05, 1 file, -1/+0)
|\ \ | | | | | | shader: Partially implement texture cube array shadow
| * | shader: Partially implement texture cube array shadow (ReinUsesLisp, 2020-10-28, 1 file, -1/+0)
| | | | | | | | | | | | | | | | | | | | | | | | This implements texture cube arrays with shadow comparisons but doesn't fix the asserts related to it. Fixes out of bounds reads on swizzle constructors and makes them use bounds checked ::at instead of the unsafe operator[].
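The difference between the bounds-checked `::at` and the unchecked `operator[]` can be seen in a minimal sketch. The helper `TryGetSwizzle` and the four-component array are hypothetical and only stand in for whatever table the swizzle constructors index.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>

// Hypothetical lookup helper: returns false instead of reading out of bounds.
// std::array::at throws std::out_of_range for a bad index, whereas
// operator[] with the same index would be undefined behavior.
inline bool TryGetSwizzle(const std::array<char, 4>& swizzle, std::size_t index, char& out) {
    try {
        out = swizzle.at(index);
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```

With `at`, an invalid index becomes a detectable error at the point of access rather than a silent read of unrelated memory.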
* | | shader/arithmetic: Implement FCMP immediate + register variant (ReinUsesLisp, 2020-10-28, 1 file, -1/+2)
|/ / | | | | | | Trivially add the encoding for this.
* | shader/texture: Implement CUBE texture type for TMML and fix arrays (ReinUsesLisp, 2020-10-07, 1 file, -19/+22)
| | | | | | | | | | | | | | | | TMML takes an array argument that has no known meaning, this one appears as the first component in gpr8 followed by s, t and r. Skip this component when arrays are being used. Also implement CUBE texture types. - Used by Pikmin 3: Deluxe Demo.
* | arithmetic_integer_immediate: Make use of std::move where applicable (Lioncash, 2020-09-24, 1 file, -16/+19)
| | | | | | | | | | Same behavior, minus any redundant atomic reference count increments and decrements.
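The reference-count saving can be illustrated with `std::shared_ptr`. This is a sketch under assumptions: `Node` here is a stand-in alias for `std::shared_ptr<int>`, and `Holder`, `CountAfterCopy` and `CountAfterMove` are hypothetical names.

```cpp
#include <memory>
#include <utility>

using Node = std::shared_ptr<int>;  // stand-in for a reference-counted IR node

struct Holder {
    Node node;
};

// Copying the parameter into the holder leaves two owners alive, which costs
// one atomic increment now and one atomic decrement when 'n' is destroyed.
long CountAfterCopy(Node n) {
    Holder h{n};
    return h.node.use_count();
}

// Moving transfers ownership, so the reference count is never touched.
long CountAfterMove(Node n) {
    Holder h{std::move(n)};
    return h.node.use_count();
}
```

Called with a freshly created pointer, the copying version observes two owners and the moving version observes one; the resulting holder is the same in both cases.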
* | Merge pull request #4672 from lioncash/narrowing (Rodrigo Locatti, 2020-09-17, 1 file, -1/+1)
|\ \ | | | | | | decoder/texture: Eliminate narrowing conversion in GetTldCode()
| * | decoder/texture: Eliminate narrowing conversion in GetTldCode() (Lioncash, 2020-09-17, 1 file, -1/+1)
| | | | | | | | | | | | The assignment was previously truncating a u64 value to a bool.
* | | decode/image: Eliminate switch fallthrough in DecodeImage() (Lioncash, 2020-09-17, 1 file, -0/+1)
|/ / | | | | | | | | Fortunately this didn't result in any issues, given the block that code was falling through to would immediately break.
* | video_core: Enforce -Werror=switch (ReinUsesLisp, 2020-09-16, 2 files, -4/+13)
| | | | | | | | This forces us to fix all -Wswitch warnings in video_core.
* | shader/memory: Amend UNIMPLEMENTED_IF_MSG without a message (Lioncash, 2020-08-14, 1 file, -1/+2)
| | | | | | | | | | We need to provide a message for this variant of the macro, so we can simply log out the type being used.
* | General: Tidy up clang-format warnings part 2 (Lioncash, 2020-08-13, 1 file, -3/+3)
| |
* | Merge pull request #4391 from lioncash/nrvo (bunnei, 2020-07-24, 3 files, -20/+20)
|\ \ | | | | | | video_core: Allow copy elision to take place where applicable
| * | video_core: Allow copy elision to take place where applicable (Lioncash, 2020-07-21, 3 files, -20/+20)
| | | | | | | | | | | | | | | Removes const from some variables that are returned from functions, as this allows the move assignment/constructors to execute for them.
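Why `const` blocks the move can be shown with a small sketch. It uses by-value function parameters instead of local variables, because copy elision is never applied to parameters, so the outcome does not depend on the optimizer. `Tracker`, `ReturnMutable` and `ReturnConst` are hypothetical names.

```cpp
#include <utility>

// Records which constructor produced this object.
struct Tracker {
    bool copied = false;
    bool moved = false;
    Tracker() = default;
    Tracker(const Tracker&) : copied(true) {}
    Tracker(Tracker&&) noexcept : moved(true) {}
};

// A non-const object named in a return statement is treated as an rvalue,
// so the move constructor is selected.
Tracker ReturnMutable(Tracker value) {
    return value;
}

// A const object cannot bind to Tracker&&, so the copy constructor runs.
Tracker ReturnConst(const Tracker value) {
    return value;
}
```

For a type that owns heap memory or a reference count, the copy in the second function is the cost that dropping `const` avoids.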
* | | Merge pull request #4361 from ReinUsesLisp/lane-id (Rodrigo Locatti, 2020-07-21, 1 file, -2/+1)
|\ \ \ | | | | | | | | decode/other: Implement S2R.LaneId
| * | | decode/other: Implement S2R.LaneId (ReinUsesLisp, 2020-07-16, 1 file, -2/+1)
| |/ / | | | | | | | | | | | | | | | This maps to host's thread id. - Fixes graphical issues on Paper Mario.
* / / video_core: Rearrange pixel format names (ReinUsesLisp, 2020-07-13, 1 file, -27/+27)
|/ / | | | | | | | | | | Normalizes pixel format names to match Vulkan names. Previous to this commit pixel formats had no convention, leading to confusion and potential bugs.
* | Merge pull request #4147 from ReinUsesLisp/hset2-imm (bunnei, 2020-06-26, 1 file, -21/+67)
|\ \ | | | | | | shader/half_set: Implement HSET2_IMM
| * | shader/half_set: Implement HSET2_IMM (ReinUsesLisp, 2020-06-22, 1 file, -21/+67)
| | | | | | | | | | | | | | | | | | Add HSET2_IMM. Due to the complexity of the encoding avoid using BitField unions and read the relevant bits from the code itself. This is less error prone.
* | | decode/image: Implement B10G11R11F (Morph, 2020-06-20, 1 file, -9/+17)
|/ / | | | | | | - Used by Kirby Star Allies
* | shader/texture: Join separate image and sampler pairs offline (ReinUsesLisp, 2020-06-05, 1 file, -18/+37)
| | Games using D3D idioms can join images and samplers when a shader executes, instead of baking them into a combined sampler image. This is also possible on Vulkan.
| |
| | One approach would be to use separate samplers on Vulkan and leave this unimplemented on OpenGL, but we can't do this because there's no consistent way of determining which constant buffer holds a sampler and which one an image. We could in theory find the first bit and if it's in the TIC area, it's an image; but this falls apart when an image or sampler handle uses an index of zero.
| |
| | The approach used is to look for a LOP.OR operation (this is done at an IR level, not at an ISA level), then track the constant buffers used as its sources and store this pair. Then, outside of shader execution, join the sampler and image pair with a bitwise or operation.
| |
| | This approach won't work on games that truly use separate samplers in a meaningful way. For example, pooling textures in a 2D array and determining at runtime what sampler to use.
| |
| | This invalidates OpenGL's disk shader cache :)
| |
| | - Used mostly by D3D ports to Switch
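The bitwise join described above can be sketched as follows. The bit layout is an assumption made for illustration (image descriptor index in the low 20 bits, sampler descriptor index in the high 12 bits), and the function names are hypothetical.

```cpp
#include <cstdint>

// Assumed layout: bits [0, 20) hold the image (TIC) index and
// bits [20, 32) hold the sampler (TSC) index.
constexpr std::uint32_t JoinHandles(std::uint32_t image_handle, std::uint32_t sampler_handle) {
    // Each half only populates its own bit range, so OR-ing them combines
    // both indices without either disturbing the other.
    return image_handle | sampler_handle;
}

constexpr std::uint32_t ImageIndex(std::uint32_t handle) {
    return handle & 0xFFFFFu;
}

constexpr std::uint32_t SamplerIndex(std::uint32_t handle) {
    return handle >> 20;
}
```

Under this layout a handle whose index is zero has no bits set at all, which is why inspecting the bits alone cannot tell an image handle from a sampler handle.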
* | Merge pull request #4016 from ReinUsesLisp/invocation-info (LC, 2020-06-02, 1 file, -1/+1)
|\ \ | | | | | | shader/other: Fix hardcoded value in S2R INVOCATION_INFO
| * | shader/other: Fix hardcoded value in S2R INVOCATION_INFO (ReinUsesLisp, 2020-05-30, 1 file, -1/+1)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Geometry shaders built from Nvidia's compiler check for bits[16:23] to be less than or equal to 0 with VSETP to default to a "safe" value of 0x8000'0000 (safe from hardware's perspective). To avoid hitting this path in the shader, return 0x00ff'0000 from S2R INVOCATION_INFO. This seems to be the maximum number of vertices a geometry shader can emit in a primitive.
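A quick way to check the value quoted above is to extract bits[16:23] from it. `ExtractBits` is a hypothetical helper written for this note, not a function from the code base.

```cpp
#include <cstdint>

// Returns bits [first, last] (inclusive) of 'value', shifted down to bit 0.
constexpr std::uint32_t ExtractBits(std::uint32_t value, unsigned first, unsigned last) {
    const unsigned count = last - first + 1;
    const std::uint32_t mask = count >= 32 ? 0xFFFF'FFFFu : ((1u << count) - 1u);
    return (value >> first) & mask;
}
```

For 0x00ff'0000 the field evaluates to 255, so a "less than or equal to 0" test on it fails and the fallback path is not taken.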
* | | shader/other: Implement MEMBAR.CTS (ReinUsesLisp, 2020-05-27, 1 file, -2/+12)
|/ / | | | | | | | | This silences an assertion we were hitting and uses workgroup memory barriers when the game requests it.
* | Merge pull request #3981 from ReinUsesLisp/bar (bunnei, 2020-05-26, 1 file, -0/+5)
|\ \ | | | | | | shader/other: Implement BAR.SYNC 0x0
| * | shader/other: Implement BAR.SYNC 0x0 (ReinUsesLisp, 2020-05-21, 1 file, -0/+5)
| | | | | | | | | | | | | | | Trivially implement this particular case of BAR. Unless games use OpenCL or CUDA barriers, we shouldn't hit any other case here.
* | | Merge pull request #3980 from ReinUsesLisp/red-op (bunnei, 2020-05-26, 1 file, -2/+1)
|\ \ \ | | | | | | | | shader/memory: Implement non-addition operations in RED
| * | | shader/memory: Implement non-addition operations in RED (ReinUsesLisp, 2020-05-21, 1 file, -2/+1)
| |/ / | | | | | | | | | Trivially implement these instructions. They are used in Astral Chain.
* / / shader/other: Implement thread comparisons (NV_shader_thread_group) (ReinUsesLisp, 2020-05-21, 1 file, -0/+21)
|/ / | | | | | | | | | | | | | | | | | | | | Hardware S2R special registers match gl_Thread*MaskNV. We can trivially implement these using Nvidia's extension on OpenGL or naively stubbing them with the ARB instructions to match. This might cause issues if the host device warp size doesn't match Nvidia's. That said, this is unlikely on proper shaders. Refer to the attached url for more documentation about these flags. https://www.khronos.org/registry/OpenGL/extensions/NV/NV_shader_thread_group.txt
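The five thread masks can be derived from a lane index alone. The sketch below assumes a 32-wide warp, which is the mismatch risk the message mentions for other hosts; `ThreadMasks` and `MakeThreadMasks` are hypothetical names.

```cpp
#include <cstdint>

// Bit i of each mask refers to lane i of an assumed 32-wide warp.
struct ThreadMasks {
    std::uint32_t eq;  // only this lane
    std::uint32_t ge;  // lanes with an index >= this lane
    std::uint32_t gt;  // lanes with an index >  this lane
    std::uint32_t le;  // lanes with an index <= this lane
    std::uint32_t lt;  // lanes with an index <  this lane
};

// 'lane_id' must be in the range [0, 31].
constexpr ThreadMasks MakeThreadMasks(unsigned lane_id) {
    const std::uint32_t eq = 1u << lane_id;
    const std::uint32_t lt = eq - 1u;
    const std::uint32_t le = lt | eq;
    const std::uint32_t gt = ~le;
    const std::uint32_t ge = ~lt;
    return ThreadMasks{eq, ge, gt, le, lt};
}
```

On a host whose subgroup width differs from 32, masks built this way would cover the wrong set of lanes.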
* | shader_ir: Separate floating-point comparisons in ordered and unordered (ReinUsesLisp, 2020-05-09, 1 file, -6/+6)
| | | | | | | | | | This allows us to use native SPIR-V instructions without having to manually check for NAN.
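The ordered/unordered distinction is about what a comparison yields when an operand is NaN. A minimal sketch for the less-than case, with hypothetical function names:

```cpp
#include <cmath>

// Ordered less-than: false whenever either operand is NaN.
// The built-in '<' on IEEE 754 floats already behaves this way.
bool OrderedLessThan(float a, float b) {
    return a < b;
}

// Unordered less-than: true whenever either operand is NaN,
// otherwise the same result as the ordered comparison.
bool UnorderedLessThan(float a, float b) {
    return std::isnan(a) || std::isnan(b) || a < b;
}
```

SPIR-V exposes both flavors directly (for example OpFOrdLessThan and OpFUnordLessThan), so a backend targeting it does not need the explicit `isnan` test shown here.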
* | Merge pull request #3693 from ReinUsesLisp/clean-samplers (bunnei, 2020-05-02, 2 files, -94/+116)
|\ \ | | | | | | shader/texture: Support multiple unknown sampler properties
| * | shader/texture: Support multiple unknown sampler properties (ReinUsesLisp, 2020-04-23, 1 file, -51/+74)
| | | | | | | | | | | | | | | | | | | | | | | | | | | This allows deducing some properties from the texture instruction before asking the runtime. By doing this we can handle type mismatches in some instructions from the renderer instead of the shader decoder. Fixes texelFetch issues with games using 2D texture instructions on a 1D sampler.
| * | shader_ir: Turn classes into data structures (ReinUsesLisp, 2020-04-23, 2 files, -59/+58)
| | |
* | | shader/arithmetic_integer: Fix tracking issue in temporary (ReinUsesLisp, 2020-04-28, 1 file, -4/+0)
| | | | | | | | | | | | | | | This temporary is not needed as we mark Rd.CC + IADD.X as unimplemented. It caused issues when tracking global buffers.
* | | shader/arithmetic_integer: Fix edge case and mark IADD.X Rd.CC as unimplemented (ReinUsesLisp, 2020-04-25, 1 file, -1/+6)
| | | | | | | | | | | | | | | IADD.X Rd.CC requires some extra logic that is not currently implemented. Abort when this is hit.
* | | shader/arithmetic_integer: Change IAdd to UAdd to avoid signed overflow (ReinUsesLisp, 2020-04-25, 1 file, -2/+2)
| | | | | | | | | | | | | | | | | | Signed integer addition overflow might be undefined behavior. It's free to change operations to UAdd and use unsigned integers to avoid potential bugs.
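In C++ terms the same idea looks like the sketch below: do the addition on unsigned operands, where wraparound is well defined, and reinterpret the bits afterwards. `WrappingAdd` is a hypothetical name, not an operation from the shader IR.

```cpp
#include <cstdint>
#include <cstring>

// Adds two 32-bit values with two's complement wraparound. The sum is
// computed on unsigned operands (defined modulo 2^32); computing it on
// signed operands would be undefined behavior on overflow.
std::int32_t WrappingAdd(std::int32_t a, std::int32_t b) {
    const std::uint32_t sum = static_cast<std::uint32_t>(a) + static_cast<std::uint32_t>(b);
    std::int32_t result;
    std::memcpy(&result, &sum, sizeof(result));  // reinterpret the bit pattern
    return result;
}
```

The resulting bit pattern is identical to what a two's complement signed add produces, which is why switching the operation type costs nothing.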
* | | shader/arithmetic_integer: Implement IADD.X (ReinUsesLisp, 2020-04-25, 1 file, -0/+6)
| | | | | | | | | | | | | | | IADD.X takes the carry flag and adds it to the result. This is generally used to emulate 64-bit operations with 32-bit registers.
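How a carry flag lets 32-bit registers emulate a 64-bit addition can be sketched in plain C++. `AddWithCarry` plays the role of an add that consumes and produces a carry; both it and `Add64` are hypothetical names used only for illustration.

```cpp
#include <cstdint>

struct AddResult {
    std::uint32_t value;
    bool carry;
};

// 32-bit addition with carry in and carry out.
AddResult AddWithCarry(std::uint32_t a, std::uint32_t b, bool carry_in) {
    const std::uint64_t wide = std::uint64_t{a} + b + (carry_in ? 1u : 0u);
    return {static_cast<std::uint32_t>(wide), (wide >> 32) != 0};
}

// 64-bit addition built from two 32-bit halves: the low halves are added
// first, and their carry is fed into the addition of the high halves.
std::uint64_t Add64(std::uint64_t a, std::uint64_t b) {
    const AddResult lo = AddWithCarry(static_cast<std::uint32_t>(a),
                                      static_cast<std::uint32_t>(b), false);
    const AddResult hi = AddWithCarry(static_cast<std::uint32_t>(a >> 32),
                                      static_cast<std::uint32_t>(b >> 32), lo.carry);
    return (std::uint64_t{hi.value} << 32) | lo.value;
}
```

The first addition corresponds to an add that writes the carry flag and the second to an add that reads it, which is the pairing the commit message describes.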
* | | shader/arithmetic_integer: Implement CC for IADD (ReinUsesLisp, 2020-04-25, 1 file, -3/+19)
| | |
* | | decode/register_set_predicate: Implement CC (ReinUsesLisp, 2020-04-25, 1 file, -9/+14)
| | | | | | | | | | | | | | | | | | P2R CC takes the state of condition codes and puts them into a register. We already have this implemented for PR (predicates). This commit implements CC over that.