@chadbrewbaker
Last active October 13, 2022 03:48
EasyPerf 1 August 2021 Twitter Space

Guest Thomas Dullien

https://twitter.com/halvarflake

https://optimyze.cloud/

  • Offensive security is understanding large scale legacy systems.
  • Whole stack analysis required.
  • BinDiff, a tool built for Microsoft patch analysis, is also useful for seeing compiler changes.
  • Weird machines, exploitability, and provable unexploitability
  • Two ARM cores in a modern USB cable.
  • Security is human conflict - performance is environmental impact.
  • Moore's law has many versions; cost per transistor is not falling.
  • What is a software vendor? Snowflake, Spotify? Vendors now have hardware and performance costs.
  • Hyperscalers started building performance monitoring in 2009, which led to the TPU and YouTube transcoding hardware. FFmpeg was a huge bill for Google.
  • Torch/TensorFlow are Python interfaces to GPUs. Expect to see more specialized accelerator libraries.
  • All cloud providers including Azure are adopting ARM.
  • Telemetry can build better chips. Apple and Amazon in house processors. Intel at a disadvantage - has to buy the data.
  • There will be a C-level executive in charge of digital operating expense.
  • Utility computing has low margins. Cloud providers make profit from value added services.
  • Compiler design integration will improve.
  • There is no typical workload, but there are classes: Java, database, Python calling into C++. Linux is ubiquitous. AWS Graviton is possible because of open source: one recompile between CPU architectures.
  • Most money spent on 3rd party open source packages.
  • Bottlenecks are mostly the database. Amazon Linux has a default that spends 15% of its time on getting clocks. Kubernetes does a disk quota check that eats CPU.
  • There is way more Java in the routine enterprise. Allocation overhead is huge - 15%.
  • Publicly traded SaaS companies: gross margin is growing linearly with market cap.
  • (Thomas' mobile battery died, short ironic intermission) - Denis discussion of power use.
  • Open source alternative to Thomas' Prodfiler: Pyroscope. Secret sauce for the logger is the bcc tools: https://github.com/iovisor/bcc/blob/master/tools/profile.py
  • Netflix invested in Brendan Gregg's system tracing.
  • Car manufacturers know cost down to a screw. We need to measure at that granularity to optimize cost.
  • Google and Facebook built fleet wide profilers.
  • std::map data visible etc. https://abseil.io/
  • FB and Google compile with frame pointers to help profiling.
  • Optimyze.cloud profiler works on the operating system interrupt to log all stack traces. File ID and offset pairs. Index of major package debug symbols - can upload your own symbols.
  • Extra info for JVM, Ruby, Python virtual machine traces.
  • Last branch record in hardware could help (Denis).
  • ELF format tutorial
  • 5 minute DWARF 5 overview
  • ld and gold linkers
  • lld linker
  • Regex JIT was turned off. Lots of time spent on UTF-8 when it was ASCII. Low hanging fruit on many web back ends.
  • A/B testing of compiler flags, especially with LTO/PGO. Hard problem to isolate root cause.
  • LTO can increase code size with inlining - create cache issues - because of bad profiling data.
  • REST API in the Fall for PGO with profiling data.
  • Too much time in serialization/deserialization. Python <-> Java in Spark workloads eating 30%.
  • Microservices need both profiling and distributed tracing: Akita, OpenTelemetry.
  • Hash each call stack and keep a counter per hash to build a histogram.
  • Reduce the sampling rate on larger fleets.
  • Have some workers increase their sample rate for finer-grained data. Target less than 1% of CPU.
  • fork()-heavy code has a lot of stack profiling overhead, like a full Linux kernel compile.
  • Want IO profiling.
  • Want better memory profiling.
  • (My mobile battery died. Lol) My phone also shut down while charging when I opened the Twitter app after a short recharge. The app is an extreme power hog when running Twitter Spaces.
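
The sampling notes above (hash the call stack, keep a counter for a histogram, reduce the sampling rate on larger fleets) can be sketched roughly as follows. This is only an illustration, not how Prodfiler or any other profiler mentioned here is implemented. The names `stack_id`, `StackHistogram`, and `per_host_hz`, the choice of BLAKE2 as the hash, and the budget, floor, and cap values are all assumptions made up for the example.

```python
import hashlib
from collections import Counter


def stack_id(frames):
    """Hash a call stack (outermost frame first) into a short fixed-size key."""
    joined = ";".join(frames).encode("utf-8")
    # 8-byte digest: compact key, assumed good enough for a histogram.
    return hashlib.blake2b(joined, digest_size=8).hexdigest()


class StackHistogram:
    """Counts how often each distinct call stack was sampled."""

    def __init__(self):
        self.counts = Counter()  # stack hash -> number of samples
        self.stacks = {}         # stack hash -> frames, stored once

    def record(self, frames):
        key = stack_id(frames)
        self.counts[key] += 1
        self.stacks.setdefault(key, tuple(frames))

    def top(self, n=10):
        """Return the n most frequently sampled stacks with their counts."""
        return [(self.stacks[k], c) for k, c in self.counts.most_common(n)]


def per_host_hz(fleet_size, fleet_budget_hz=2000.0, floor_hz=1.0, cap_hz=20.0):
    """Hypothetical policy: split a fixed fleet-wide sample budget across
    hosts, so that larger fleets sample each host less often."""
    return min(cap_hz, max(floor_hz, fleet_budget_hz / fleet_size))


# Example: three samples, two of which share the same stack.
hist = StackHistogram()
hist.record(["main", "parse", "regex_match"])
hist.record(["main", "parse", "regex_match"])
hist.record(["main", "serialize"])
hottest = hist.top(1)
```

Storing only a hash and a counter per sample keeps the per-sample cost small, which matters when the target is less than 1% of CPU. A worker picked for finer-grained data would simply be given a higher rate than `per_host_hz` returns.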