- Emil Persson @Humus
- Matt Pettineo @mynameismjp
struct vec3f {float x, y, z;};
struct vec4f {float x, y, z, w;};
struct mat44f {vec4f x, y, z, w;};
//============================================================================
// sphere_screen_extents
//============================================================================
// Calculates the exact screen extents xyzw=[left, bottom, right, top] in
// normalized screen coordinates [-1, 1] for a sphere in view space. For
// performance, the projection matrix (v2p) is assumed to be setup so that
- 2011 - A trip through the Graphics Pipeline 2011
- 2013 - Performance Optimization Guidelines and the GPU Architecture behind them
- 2015 - Life of a triangle - NVIDIA's logical pipeline
- 2015 - Render Hell 2.0
- 2016 - How bad are small triangles on GPU and why?
- 2017 - GPU Performance for Game Artists
- 2019 - Understanding the anatomy of GPUs using Pokémon
(This is a translation of the original article in Japanese by moratorium08.)
(UPDATE (22/3/2019): Added some corrections provided by the original author.)
Writing your own OS to run on a handmade CPU is a pretty ambitious project, but I've managed to get it working pretty well so I'm going to write some notes about how I did it.
In shader programming, you often run into a problem where you want to iterate over an array in memory for all pixels in a compute shader group (tile). Tiled deferred lighting is the most common case: an 8x8 tile loops over a light list culled for that tile.
Simplified HLSL code looks like this:
Buffer<float4> lightDatas;
Texture2D<uint2> lightStartCounts;
RWTexture2D<float4> output;
[numthreads(8, 8, 1)]
- 🌏 The official ISO C++ Get Started! page
- 🎥 Herb Sutter: (Not Your Father’s) C++
- 🎥 Beginning with C++ by Jens Weller
This is a short post that explains how to write a high-performance matrix multiplication program on modern processors. In this tutorial I will use a single core of the Skylake-client CPU with AVX2, but the principles in this post also apply to other processors with different instruction sets (such as AVX512).
Matrix multiplication is a mathematical operation that defines the product of
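For orientation, here is a minimal sketch of the unoptimized baseline that a high-performance kernel would be measured against. It is not the post's code: the function name `matmul_naive`, the row-major layout, and the `std::vector<float>` storage are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference implementation (not from the original post).
// Computes C (m x n) = A (m x k) * B (k x n), with all matrices stored
// row-major in flat vectors.
std::vector<float> matmul_naive(const std::vector<float>& a,
                                const std::vector<float>& b,
                                std::size_t m, std::size_t k, std::size_t n) {
    assert(a.size() == m * k);  // A must hold m*k elements
    assert(b.size() == k * n);  // B must hold k*n elements

    std::vector<float> c(m * n, 0.0f);
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t p = 0; p < k; ++p) {
            // Hoist A[i][p]; the inner loop then walks B and C contiguously.
            const float a_ip = a[i * k + p];
            for (std::size_t j = 0; j < n; ++j) {
                c[i * n + j] += a_ip * b[p * n + j];
            }
        }
    }
    return c;
}
```

The i-p-j loop order is chosen here only so that the innermost loop reads `b` and writes `c` at unit stride. A tuned AVX2 kernel would add register blocking and cache tiling on top of this.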
// Named tuple for C++
// Example code from http://vitiy.info/
// Written by Victor Laskin (victor.laskin@gmail.com)
// Parts of code were taken from: https://gist.github.com/Manu343726/081512c43814d098fe4b
namespace foonathan {
namespace string_id {
namespace detail
{