@nickdesaulniers
Created Jun 22, 2022
perf report of clang with no input
perf record -e cycles:pp --call-graph lbr -- clang -Wfoo -c -x c /dev/null
perf report --no-children --sort=dso,symbol --total-cycles
+ 20.18% [unknown] [k] 0xffffffffaba01087 ▒
+ 9.04% ld-2.33.so [.] do_lookup_x ▒
- 8.47% libc-2.33.so [.] __strlen_evex ▒
- std::basic_ios<char, std::char_traits<char> >::init@plt ◆
- 5.48% clang::Builtin::Context::initializeBuiltins ▒
clang::FrontendAction::BeginSourceFile ▒
clang::CompilerInstance::ExecuteAction ▒
clang::ExecuteCompilerInvocation ▒
cc1_main ▒
ExecuteCC1Tool ▒
std::vector<char const*, std::allocator<char const*> >::_M_assign_aux<char const* const*> ▒
llvm::CrashRecoveryContext::RunSafely ▒
clang::driver::CC1Command::Execute ▒
clang::driver::Compilation::ExecuteCommand ▒
clang::driver::Compilation::ExecuteJobs ▒
clang::driver::Driver::ExecuteCompilation ▒
clang_main ▒
__libc_start_main ▒
_start ▒
+ 1.09% llvm::cl::opt<bool, false, llvm::cl::parser<bool> >::opt<char [20], llvm::cl::OptionHidden, llvm::cl::desc, llvm::cl::initializer<bool> > ▒
+ 1.01% llvm::cl::opt<unsigned long, false, llvm::cl::parser<unsigned long> >::opt<char [30], llvm::cl::desc, llvm::cl::initializer<int>, llvm::cl::OptionHidden> ▒
+ 0.89% llvm::cl::opt<bool, false, llvm::cl::parser<bool> >::opt<char [33], llvm::cl::desc, llvm::cl::OptionHidden, llvm::cl::initializer<bool> > ▒
+ 6.05% ld-2.33.so [.] _dl_lookup_symbol_x ▒
+ 4.68% ld-2.33.so [.] _dl_relocate_object ▒
- 4.64% libc-2.33.so [.] __memcmp_evex_movbe ▒
- 0x6ad8300 ▒
+ 2.40% llvm::ARM::parseArch ▒
+ 2.24% matchOption ▒
+ 2.91% clang-15 [.] clang::DiagnosticIDs::getNearestOption ▒
+ 2.72% clang-15 [.] llvm::opt::OptTable::OptTable ▒
+ 2.70% clang-15 [.] clang::Lexer::LexTokenInternal ▒
+ 2.68% clang-15 [.] llvm::opt::StrCmpOptionName ▒
+ 2.64% libc-2.33.so [.] realloc ▒
+ 2.62% clang-15 [.] clang::EmitBackendOutput
  1. Didn't disable kptr_restrict, so the kernel symbol at the top is unresolved. Probably kernel calls backing malloc.
  2. Lots of time spent in the dynamic loader. Was clang linked with -z now? Can we create a cmake config var to NOT DO THAT?
  3. Why waste time initializing builtins?
  4. llvm::ARM::parseArch does a lot of work, even when not targeting ARM!
  5. Why do we start the Lexer for no input? Or EmitBackendOutput? Maybe for default preprocessor defines? Maybe the new -fdriver-only flag could speed things up? https://reviews.llvm.org/rGc12577c61dbf
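
To put a number on point 2, here is a rough sketch that sums the top-level rows of the report by DSO. The helper name `totals_by_dso` and the regex are my own, not part of perf, and the rows are hand-copied from the paste above with the nested call-chain rows left out, so treat it as illustrative rather than a general perf parser.

```python
import re
from collections import defaultdict

# Top-level rows copied from the report above (nested call-chain rows omitted).
REPORT = """\
+ 20.18% [unknown] [k] 0xffffffffaba01087
+ 9.04% ld-2.33.so [.] do_lookup_x
- 8.47% libc-2.33.so [.] __strlen_evex
+ 6.05% ld-2.33.so [.] _dl_lookup_symbol_x
+ 4.68% ld-2.33.so [.] _dl_relocate_object
- 4.64% libc-2.33.so [.] __memcmp_evex_movbe
+ 2.91% clang-15 [.] clang::DiagnosticIDs::getNearestOption
+ 2.72% clang-15 [.] llvm::opt::OptTable::OptTable
+ 2.70% clang-15 [.] clang::Lexer::LexTokenInternal
+ 2.68% clang-15 [.] llvm::opt::StrCmpOptionName
+ 2.64% libc-2.33.so [.] realloc
+ 2.62% clang-15 [.] clang::EmitBackendOutput
"""

# Assumed row shape: "<+/-> <pct>% <dso> [<k or .>] <symbol>".
# Call-chain rows have no "[.]" column after the first token, so they do not match.
ROW = re.compile(r"^[+-]\s+(\d+(?:\.\d+)?)%\s+(\S+)\s+\[(.)\]\s+(\S.*)$")


def totals_by_dso(report):
    """Sum the percentage column per DSO for rows that match ROW."""
    totals = defaultdict(float)
    for line in report.splitlines():
        m = ROW.match(line.strip())
        if m:
            totals[m.group(2)] += float(m.group(1))
    # Round to two decimals to match the precision perf prints.
    return {dso: round(pct, 2) for dso, pct in totals.items()}


if __name__ == "__main__":
    ranked = sorted(totals_by_dso(REPORT).items(), key=lambda kv: -kv[1])
    for dso, pct in ranked:
        print(f"{pct:6.2f}%  {dso}")
```

On the rows shown, ld-2.33.so adds up to 19.77% (9.04 + 6.05 + 4.68), about the same as the unresolved kernel entry and more than the 13.63% attributed to clang-15 itself.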