#import <Cocoa/Cocoa.h>
#import <QuartzCore/CAMetalLayer.h>
#import <Metal/Metal.h>
#include <cstdlib>
#include <iostream>
#include <webgpu/webgpu.h>
// Custom delegate class to handle window close events
@interface WindowDelegate : NSObject <NSWindowDelegate> |
i10 jab / punish:
1,2,2: mid NC, -10 block, +8 CFT hit
1,2,4: low, -13 block, +3 hit. CH: ff3 followup = 34 dmg
+8 CFT mixup:
1: mid i20, -9 block, hit KD, CH ff1+2 followup = 59 dmg
4: low i19, -26 block, ff3 followup = 26 dmg
ff2 i14 long range mid:
block: -2 pushback -> backdash |
Top moves:
2: i11 high, +1 block, +9 hit, 10 dmg, followup (NC): (1) mid, -2 block, +3 hit, NC = 22 dmg
f2: i10 high, -12 block, +5 hit, 17 dmg, CH followups: ff1 = 40 dmg, ff3 = 48 dmg, f3+4 = 42 dmg
f1: i14 mid, -6 block, +5 hit, 15 dmg, followup (NC): (1) high, -7 block, NC = 40 dmg
df1: i14 mid, -4 block, +3 hit, 12 dmg, followups (CH NC): (2,1) delayable high,high, launch, (1) mid, -12 block. CH NC = 55 dmg
db1: i12 low, -12 block, +2 hit, 13 dmg
df2: i15 mid, -14 block (safe tip range), launch
f1+2: i15 mid, -19 block (pushback), wall bounce
db,d,df1: i24 low, high crush, -37 block, 30 dmg
FC db1: i12 low, high crush, -8 block, +6 hit, 15 dmg |
Top moves (close):
1,2,4/3: i10 high, -2 block, followups: mid(-8 push),mid(-12 push), low(-11 / 0 hit)
1+2: i16 mid, -9 block, launch
df1,2/4: i13 mid,high/mid, -3 block, followups: high(-1), mid(-12)
df2: i15 mid, -6 block, launch (no crouch)
d1+2: i20 low, -18 block, high crush, 36 damage minicombo (d2, f2)
d3+4: i14 low,high, -6 block (push), low crush, CH launch
db1,2: i14 mid,high, -9 block, high crush, followup CH launch
db2: i20 mid, -11 block, high crush, launch
b2,1,4/4/1+2,4: i15 mid, -4 block, followups: mid,low/high (-7,-6 push), low(-11), high,mid (-9,-13) |
Source: https://www.anandtech.com/show/16214/amd-zen-3-ryzen-deep-dive-review-5950x-5900x-5800x-and-5700x-tested
Format:
TestName (lower = better): 3700X -> 5600X (performance difference)
Less than 1% difference = tie
Office and Science
Agisoft Photoscan (lower = better): 2377 -> 2133 (+11.4%)
GIMP (lower = better): 20.72 -> 17.15 (+20.8%)
3D particle movement non-AVX: 2768 -> 2452 (-11.4%)
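The percentages above can be reproduced with a small calculation. The following is a minimal C++ sketch, not taken from the source: the helper names `perfDiffPercent` and `isTie` are hypothetical, and it assumes that results without a "(lower = better)" marker are scores where higher is better, which is the reading that reproduces the -11.4% figure for the 3D particle movement line.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not from the source notes).
// Returns the performance difference of `newer` relative to `older`, in percent.
// lowerIsBetter = true for time-like results, false for score-like results
// (the latter is an assumption about unmarked tests).
double perfDiffPercent(double older, double newer, bool lowerIsBetter) {
    const double ratio = lowerIsBetter ? older / newer : newer / older;
    return (ratio - 1.0) * 100.0;
}

// "Less than 1% difference = tie", applied in either direction.
bool isTie(double diffPercent) {
    return std::fabs(diffPercent) < 1.0;
}
```

For example, `perfDiffPercent(2377, 2133, true)` gives roughly +11.4 for Agisoft Photoscan, and `perfDiffPercent(2768, 2452, false)` gives roughly -11.4 for the non-AVX 3D particle movement result.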
Parry:
b2 mid+high:
3 frame startup and can interrupt many (non-NC) strings.
1 or 2 followup = 30 damage
Against slow recovery moves can launch with b3 or uf4.
3+4 (Hermit stance) low:
3 frame startup and can interrupt many (non-NC) strings. Hermit string transitions parry dick jab even at -9.
4,1+2 followup = 56 damage |
All current buffer types in shading languages are slightly different ways to present homogeneous arrays (a single struct or type repeated N times in memory).
DirectX has raw buffers (RWByteAddressBuffer), but they are limited to 32-bit integer types, and the implementation doesn't require natural alignment for wide loads, resulting in suboptimal codegen on Nvidia GPUs.
Complex use cases, such as tree traversal in spatial data structures (physics, ray tracing, etc.), require a non-homogeneous data structure. You want different node payloads and a tight memory layout.
The ability to mix 8/16/32-bit data types and 1d/2d/4d vectors in the same data structure, to facilitate GPU wide loads (max bandwidth), is crucial for complex use cases like this.
On the other hand, we want cleaner, more readable and maintainable syntax than DirectX raw buffers, without manual bit packing/extracting and reinterpret casting. The goal should be to allow modern GPUs to use sub-register addressing (SDWA on AMD hardware). Saving both ALU and register
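To make the idea of a non-homogeneous, tightly packed buffer concrete, here is a minimal CPU-side sketch in C++. It is an illustration under stated assumptions, not a proposal for shader syntax: `ByteBuffer`, `LeafNode`, and `InnerNode` are hypothetical names, and the node layouts are invented for the example. It only shows nodes of different sizes, built from 8/16/32-bit fields, stored back to back and read with typed loads at byte offsets.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical byte-addressed buffer; the layout below is illustrative only.
struct ByteBuffer {
    std::vector<std::uint8_t> bytes;

    // Appends a trivially copyable value and returns its byte offset.
    template <typename T>
    std::size_t push(const T& value) {
        const std::size_t offset = bytes.size();
        bytes.resize(offset + sizeof(T));
        std::memcpy(bytes.data() + offset, &value, sizeof(T));
        return offset;
    }

    // Typed load from a byte offset; memcpy sidesteps alignment and aliasing issues.
    template <typename T>
    T load(std::size_t offset) const {
        T value;
        std::memcpy(&value, bytes.data() + offset, sizeof(T));
        return value;
    }
};

// Two node kinds with different payload sizes, mixing 8/16/32-bit fields.
struct LeafNode {
    std::uint8_t kind;
    std::uint8_t count;
    std::uint16_t firstItem;
};
struct InnerNode {
    std::uint8_t kind;
    std::uint8_t axis;
    std::uint16_t leftChild;
    float split;
};

static_assert(sizeof(LeafNode) == 4, "leaf nodes are expected to pack into 4 bytes");
static_assert(sizeof(InnerNode) == 8, "inner nodes are expected to pack into 8 bytes");
```

Pushing an `InnerNode` followed by a `LeafNode` yields offsets 0 and 8 and a 12-byte buffer, and a traversal could read the first byte at an offset with `load<std::uint8_t>` to decide which node type to load next.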
#version 450
#extension GL_ARB_separate_shader_objects : enable
#extension GL_KHR_shader_subgroup_basic : enable
#extension GL_KHR_shader_subgroup_ballot : enable
#extension GL_KHR_shader_subgroup_vote : enable
#extension GL_KHR_shader_subgroup_arithmetic : enable
layout(location = 0) out vec4 outColor;
//#define VISUALIZE_WAVES |
In shader programming, you often run into a problem where every pixel in a compute shader group (tile) needs to iterate over the same array in memory.
Tiled deferred lighting is the most common case: an 8x8 tile loops over a light list culled for that tile.
Simplified HLSL code looks like this:
Buffer<float4> lightDatas;
Texture2D<uint2> lightStartCounts;
RWTexture2D<float4> output;
[numthreads(8, 8, 1)] |
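The access pattern described above, where every pixel of an 8x8 tile walks the same culled light range, can be sketched as a CPU reference in C++. This is not a continuation of the HLSL listing: `shadeTiles`, `LightRange`, and `kTileSize` are hypothetical names, each light is reduced to a single float, and the "shading" is a plain sum standing in for real lighting math.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr int kTileSize = 8; // matches the 8x8 tile in the text

// Per-tile entry: first light index and number of lights (hypothetical layout).
struct LightRange {
    std::uint32_t start;
    std::uint32_t count;
};

// Hypothetical CPU reference: every pixel of a tile iterates the same light range.
std::vector<float> shadeTiles(const std::vector<float>& lightData,
                              const std::vector<LightRange>& tileRanges,
                              int tilesX, int tilesY) {
    const int width = tilesX * kTileSize;
    const int height = tilesY * kTileSize;
    std::vector<float> output(static_cast<std::size_t>(width) * height, 0.0f);
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            // The range depends only on the tile, not on the pixel inside it.
            const LightRange range =
                tileRanges[(y / kTileSize) * tilesX + (x / kTileSize)];
            float sum = 0.0f;
            for (std::uint32_t i = range.start; i < range.start + range.count; ++i) {
                sum += lightData[i]; // same address for all pixels of the tile
            }
            output[static_cast<std::size_t>(y) * width + x] = sum;
        }
    }
    return output;
}
```

Because the loop bounds and the loaded light are identical for all 64 pixels of a tile, the inner loop's loads are uniform across the group, which is the property the surrounding discussion is concerned with.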
PerfTest
To select adapter, use: PerfTest.exe [ADAPTER_INDEX]
Adapters found:
0: Radeon (TM) RX 480 Graphics
1: Intel(R) HD Graphics 530
2: Microsoft Basic Render Driver
Using adapter 1
Running 5 warm-up frames and 5 benchmark frames: |