Egor Bogatov EgorBo

EgorBo / Dynamic PGO in .NET 6.0.md
Last active January 25, 2024 15:15

Dynamic PGO in .NET 6.0

Dynamic PGO (profile-guided optimization) is a JIT compiler optimization technique: the JIT collects additional runtime information (a profile) in tier0 code and relies on it later, when hot methods are promoted from tier0 to tier1, to make them even more efficient.

What exactly can PGO optimize for us?

  1. Profile-driven inlining - the inliner relies on PGO data, so it can be very aggressive on hot paths and care less about cold ones; see dotnet/runtime#52708 and dotnet/runtime#55478. A good example where it has a visible effect is this StringBuilder benchmark:

  2. Guarded devirtualization - most monomorphic virtual/interface calls can be devirtualized using PGO data, e.g.:

void DisposeMe(IDisposable d)
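
To make the guarded devirtualization item above concrete, here is a minimal hand-written sketch of the idea. It is not the JIT's actual output: `MyResource`, `CallDispose` and `CallDisposeGuarded` are hypothetical names introduced only for illustration, and the sketch assumes the profile showed that the argument was almost always a `MyResource`.

```csharp
using System;

// Hypothetical type used only for this illustration.
sealed class MyResource : IDisposable
{
    public int DisposeCount;
    public void Dispose() => DisposeCount++;
}

static class GdvSketch
{
    // What the source code looks like: an ordinary interface call.
    public static void CallDispose(IDisposable d) => d.Dispose();

    // Roughly how optimized code behaves when the profile says
    // "d" was almost always a MyResource (an assumption of this sketch):
    // a cheap exact-type check guards a direct call.
    public static void CallDisposeGuarded(IDisposable d)
    {
        if (d.GetType() == typeof(MyResource))
            ((MyResource)d).Dispose(); // direct call, a candidate for inlining
        else
            d.Dispose();               // fallback: regular interface dispatch
    }

    public static void Main()
    {
        var r = new MyResource();
        CallDispose(r);
        CallDisposeGuarded(r);
        Console.WriteLine(r.DisposeCount); // prints 2
    }
}
```

Both paths have the same observable behavior; the guarded version only pays off when the type check almost always succeeds, which is what the profile is there to tell the JIT.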
EgorBo / blog-parser.cs
Created November 13, 2023 17:35
using System.Text.RegularExpressions;
string file = await new HttpClient().GetStringAsync(
"https://devblogs.microsoft.com/dotnet/performance-improvements-in-net-8/");
MatchCollection pullRequestUrls =
Regex.Matches(file, @"https:\/\/github.com\/[a-zA-Z-]+\/[a-zA-Z-]+\/pull\/[0-9]+");
int total = pullRequestUrls.Count;
int i = 1;
EgorBo / jit-diffs for changes in C# code.md
Last active January 16, 2023 18:29

How to run jit-diffs for changes in the managed code

All commands are in PowerShell; they should be pretty much the same for bash and on non-Windows platforms.

  1. Build everything we're going to need:
.\build.cmd Clr+Libs -c Release ;; .\build.cmd Clr -c Checked ;; cd .\src\tests\ ;; .\build.cmd Release generatelayoutonly ;; cd ..\..
  2. Make a copy of the test core_root (will be a baseline):
EgorBo / blog-parser.cs
Last active November 6, 2022 20:03
using System.Text.RegularExpressions;
/*
Top25 Authors of all 512 PRs in https://devblogs.microsoft.com/dotnet/performance_improvements_in_net_7/
(by count):
stephentoub -- 148
EgorBo -- 45
tannergooding -- 26
using System;
using System.Text.Json;
using System.Text.Json.Serialization;
User? user = JsonSerializer.Deserialize<User>(
await new HttpClient().GetStringAsync("https://jsonplaceholder.typicode.com/todos/1"), MyJsonContext.Default.User);
Console.WriteLine($"id={user?.id}, title={user?.title}");
public record User(int userId, int id, string title, bool completed);

Legend

  • Statistical Test threshold: 10%, the noise filter: 2 ns
  • Result is the conclusion: Slower|Faster|Same|Noise|Unknown. Noise means that the difference was larger than 10% but not larger than 2 ns.
  • Ratio = Base/Diff (the higher the better).
  • Alloc Delta = Allocated bytes diff - Allocated bytes base (the lower the better)
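
As a small worked example of the two formulas in the legend, the sketch below computes Ratio and Alloc Delta for one benchmark. The measurements are made-up numbers, and `LegendSketch`, `Ratio` and `AllocDelta` are hypothetical helper names; the statistical test behind the Slower/Faster/Same conclusion is deliberately not reproduced here.

```csharp
using System;

static class LegendSketch
{
    // Ratio = Base/Diff: a value above 1.0 means the diff run took less time.
    public static double Ratio(double baseNs, double diffNs) => baseNs / diffNs;

    // Alloc Delta = allocated bytes (diff) - allocated bytes (base):
    // a negative value means the diff run allocated less.
    public static long AllocDelta(long baseBytes, long diffBytes) => diffBytes - baseBytes;

    public static void Main()
    {
        // Made-up measurements: base 100 ns / 256 B, diff 80 ns / 192 B.
        double ratio = Ratio(100.0, 80.0);   // 1.25, the diff is faster
        long delta = AllocDelta(256, 192);   // -64, the diff allocates less
        Console.WriteLine($"Ratio={ratio}, AllocDelta={delta}");
    }
}
```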

Statistics

Total: 52498

EgorBo / RC1_Report.md
Created September 13, 2022 14:10

Legend

  • Statistical Test threshold: 10%, the noise filter: 2 ns
  • Result is the conclusion: Slower|Faster|Same|Noise|Unknown. Noise means that the difference was larger than 10% but not larger than 2 ns.
  • Ratio = Base/Diff (the higher the better).
  • Alloc Delta = Allocated bytes diff - Allocated bytes base (the lower the better)

Statistics

Total: 52498

EgorBo / AddrSan.cs
Last active September 9, 2022 10:25
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
using System.Runtime.Intrinsics;
unsafe class Program
{
[DllImport("kernel32")]
static extern byte* VirtualAlloc(IntPtr addr, nuint size, uint typ, uint prot);
EgorBo / GDV_for_delegates.cs
Last active June 17, 2022 08:02
using System;
using System.Collections.Generic;
using System.Linq;
using System.Runtime.CompilerServices;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Configs;
using BenchmarkDotNet.Jobs;
using BenchmarkDotNet.Running;
//
EgorBo / TE_Inline_Cont_Sockets.md
Last active January 11, 2022 02:59

DOTNET_SYSTEM_NET_SOCKETS_INLINE_COMPLETIONS=1 noticeably improves simple TE benchmarks, such as the ones below, on all Unix architectures. As I understand it, it avoids dispatching from the event thread to the thread pool and does the work on the same thread that received the request.

| TE Benchmark | Baseline, RPS | MyTest, RPS | diff, % |
| --- | ---: | ---: | ---: |
| ARM64 Platform-JSON PGO | 661,663 | 778,925 | +17.72% |
| ARM64 Platform-Caching PGO | 186,188 | 218,004 | +17.09% |
| ARM64 Platform-Plaintext PGO | 6,933,964 | 7,563,428 | +9.08% |
| x64 Platform-JSON PGO | 1,299,388 | 1,432,200 | +10.22% |
| x64 Platform-Caching PGO | 413,123 | 445,144 | +7.75% |
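
The "diff, %" column appears to be the relative change in requests per second between the two runs. The sketch below assumes the formula (MyTest - Baseline) / Baseline * 100, which matches the rows above; `DiffSketch` and `DiffPercent` are hypothetical names.

```csharp
using System;
using System.Globalization;

static class DiffSketch
{
    // Relative change of the test run over the baseline, in percent.
    public static double DiffPercent(double baselineRps, double myTestRps) =>
        (myTestRps - baselineRps) / baselineRps * 100.0;

    public static void Main()
    {
        // ARM64 Platform-JSON PGO row from the table above.
        double diff = DiffPercent(661_663, 778_925);
        Console.WriteLine(diff.ToString("F2", CultureInfo.InvariantCulture)); // prints 17.72
    }
}
```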