JIT notes
Links
https://github.com/dotnet/coreclr/blob/master/Documentation/botr/ryujit-tutorial.md
https://github.com/dotnet/coreclr/blob/release/1.1.0/Documentation/botr/ryujit-overview.md
https://github.com/dotnet/coreclr/blob/release/1.1.0/Documentation/botr/porting-ryujit.md
https://github.com/dotnet/coreclr/blob/release/1.1.0/Documentation/building/viewing-jit-dumps.md
https://github.com/dotnet/coreclr/blob/release/1.1.0/Documentation/project-docs/clr-configuration-knobs.md
https://github.com/dotnet/coreclr/blob/release/1.1.0/Documentation/building/debugging-instructions.md
https://github.com/dotnet/coreclr/blob/release/1.1.0/Documentation/botr/clr-abi.md
https://github.com/dotnet/coreclr/blob/release/1.1.0/Documentation/design-docs/finally-optimizations.md
https://github.com/dotnet/coreclr/blob/release/1.1.0/Documentation/design-docs/jit-call-morphing.md
https://github.com/dotnet/coreclr/blob/release/1.1.0/Documentation/botr/type-system.md
https://github.com/dotnet/coreclr/blob/release/1.1.0/Documentation/botr/type-loader.md
https://github.com/dotnet/coreclr/blob/release/1.1.0/Documentation/botr/method-descriptor.md
https://github.com/dotnet/coreclr/blob/release/1.1.0/Documentation/botr/virtual-stub-dispatch.md
https://msdn.microsoft.com/en-us/library/system.reflection.emit.opcodes(v=vs.110).aspx
https://www.microsoft.com/en-us/research/wp-content/uploads/2001/01/designandimplementationofgenerics.pdf
https://www.cs.rice.edu/~keith/EMBED/dom.pdf
https://www.usenix.org/legacy/events/vee05/full_papers/p132-wimmer.pdf
http://aakinshin.net/ru/blog/dotnet/typehandle/
https://en.wikipedia.org/wiki/List_of_CIL_instructions
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.arn0008c/index.html
http://wiki.osdev.org/X86-64_Instruction_Encoding
https://github.com/dotnet/coreclr/issues/12383
https://github.com/dotnet/coreclr/issues/14414
http://ref.x86asm.net/
https://www.onlinedisassembler.com/odaweb/
JIT entry point
Compiler::compCompile
How to dump the IR
env COMPlus_JitDump=Main ./coreapp_jit
How the first function gets JIT-compiled
core-setup corehost\cli\coreclr.cpp
coreclr::initialize
return coreclr_initialize
dlls\mscoree\unixinterface.cpp
CorHost2::CreateAppDomainWithManager
hr = host->CreateAppDomainWithManager
vm\corhost.cpp
CorHost2::CreateAppDomain
_gc.setupInfo=prepareDataForSetup.Call_RetOBJECTREF(args);
vm\callhelpers.h
MDCALLDEF_REFTYPE( Call, FALSE, _RetOBJECTREF, Object*, OBJECTREF)
vm\callhelpers.cpp
MethodDescCallSite::CallTargetWorker
CallDescrWorkerWithHandler(&callDescrData);
vm\callhelpers.cpp
CallDescrWorkerWithHandler
CallDescrWorker(pCallDescrData);
vm\amd64\calldescrworkeramd64.S
CallDescrWorkerInternal
(frame not resolved)
(frame not resolved)
vm\amd64\theprestubamd64.S
ThePreStub
vm\prestub.cpp
PreStubWorker
pbRetVal = pMD->DoPrestub(pDispatchingMT);
vm\prestub.cpp
MethodDesc::DoPrestub
pStub = MakeUnboxingStubWorker(this);
vm\prestub.cpp
MakeUnboxingStubWorker
pstub = CreateUnboxingILStubForSharedGenericValueTypeMethods(pUnboxedMD);
vm\prestub.cpp
CreateUnboxingILStubForSharedGenericValueTypeMethods
RETURN Stub::NewStub(JitILStub(pStubMD));
vm\dllimport.cpp
JitILStub
pCode = pStubMD->MakeJitWorker(NULL, dwFlags, 0);
vm\prestub.cpp
Thread-safe JIT compilation of a method; if several threads compile it at the same time, they all get back the same compiled code
MethodDesc::MakeJitWorker
pCode = UnsafeJitFunction(this, ILHeader, flags, flags2, &sizeOfCode);
vm\jitinterface.cpp
Non-thread-safe JIT compilation of a method
UnsafeJitFunction
res = CallCompileMethodWithSEHWrapper(jitMgr
vm\jitinterface.cpp
CallCompileMethodWithSEHWrapper
pParam->res = invokeCompileMethod
vm\jitinterface.cpp
invokeCompileMethod
CorJitResult ret = invokeCompileMethodHelper
vm\jitinterface.cpp
invokeCompileMethodHelper
ret = jitMgr->m_jit->compileMethod(comp, ...)
jit\ee_il_dll.cpp
CILJit::compileMethod
result = jitNativeCode(methodHandle
jit\compiler.cpp
jitNativeCode
pParam->pComp->compCompile(pParam->methodHnd, pParam->classPtr, pParam->compHnd, pParam->methodInfo,
jit\compiler.cpp
Compiler::compCompile
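One way to read the chain above is as lazy compilation: a method's entry point initially leads to ThePreStub, which has the method compiled and then lets later calls go straight to the compiled code. The sketch below is a toy model of that reading only. ToyMethod, ToyJit, ToyPreStub, CallMethod and MakeToyMethod are hypothetical names, not CoreCLR APIs, and the "compiled code" is just an ordinary C++ function.

```cpp
#include <cassert>

struct ToyMethod;                                // hypothetical stand-in, not the real MethodDesc
using EntryPoint = int (*)(ToyMethod&, int);

struct ToyMethod {
    EntryPoint entry;       // where callers jump; starts out at the prestub
    int        compileCount; // how many times the toy "JIT" ran for this method
};

// Stands in for the native code the JIT would produce.
int CompiledBody(ToyMethod&, int x) { return x * 2; }

// Stands in for MakeJitWorker -> ... -> Compiler::compCompile.
EntryPoint ToyJit(ToyMethod& md) {
    ++md.compileCount;
    return &CompiledBody;
}

// Stands in for ThePreStub / PreStubWorker: compile, patch the entry, then run.
int ToyPreStub(ToyMethod& md, int x) {
    md.entry = ToyJit(md);  // later calls bypass the prestub
    return md.entry(md, x);
}

ToyMethod MakeToyMethod() { return ToyMethod{&ToyPreStub, 0}; }

int CallMethod(ToyMethod& md, int x) {
    assert(md.entry != nullptr);
    return md.entry(md, x);
}
```

Calling CallMethod twice on the same ToyMethod runs ToyJit only once; the second call reaches CompiledBody directly.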
Abbreviations
https://github.com/dotnet/coreclr/blob/master/Documentation/project-docs/glossary.md
HFA: homogeneous floating-point aggregate
HVA: homogeneous short vector aggregate
LSRA: Linear Scan Register Allocation
GT_ASG: Assign (gtlist.h)
GT_CHS: flipsign
IND: load indirection (*ptr)
STOREIND: store indirection (*ptr = value)
CSE: Common subexpression elimination (https://en.wikipedia.org/wiki/Common_subexpression_elimination)
GS Cookie: https://msdn.microsoft.com/en-us/library/8dbf701c.aspx
SSA: Static Single Assignment
RMW: Read Modify Write
PSPSym: Previous Stack Pointer Symbol
LCG: Lightweight Code Generation. An early name for dynamic methods
Questions
How a BasicBlock ends (block.h)
BBjumpKinds
Each basic block ends with a jump which is described as a value of the following enumeration
BBJ_EHFINALLYRET, // block ends with 'endfinally' (for finally or fault)
BBJ_EHFILTERRET, // block ends with 'endfilter'
BBJ_EHCATCHRET, // block ends with a leave out of a catch (only #if FEATURE_EH_FUNCLETS)
BBJ_THROW, // block ends with 'throw'
BBJ_RETURN, // block ends with 'ret'
BBJ_NONE, // block flows into the next one (no jump)
BBJ_ALWAYS, // block always jumps to the target
BBJ_LEAVE, // block always jumps to the target, maybe out of guarded
// region. Used temporarily until importing
BBJ_CALLFINALLY, // block always calls the target finally
BBJ_COND, // block conditionally jumps to the target
BBJ_SWITCH, // block ends with a switch statement
BBJ_COUNT
BasicBlock flags (block.h)
#define BBF_VISITED 0x00000001 // BB visited during optimizations
#define BBF_MARKED 0x00000002 // BB marked during optimizations
#define BBF_CHANGED 0x00000004 // input/output of this block has changed
#define BBF_REMOVED 0x00000008 // BB has been removed from bb-list
#define BBF_DONT_REMOVE 0x00000010 // BB should not be removed during flow graph optimizations
#define BBF_IMPORTED 0x00000020 // BB byte-code has been imported
#define BBF_INTERNAL 0x00000040 // BB has been added by the compiler
#define BBF_FAILED_VERIFICATION 0x00000080 // BB has verification exception
#define BBF_TRY_BEG 0x00000100 // BB starts a 'try' block
#define BBF_FUNCLET_BEG 0x00000200 // BB is the beginning of a funclet
#define BBF_HAS_NULLCHECK 0x00000400 // BB contains a null check
#define BBF_NEEDS_GCPOLL 0x00000800 // This BB is the source of a back edge and needs a GC Poll
#define BBF_RUN_RARELY 0x00001000 // BB is rarely run (catch clauses, blocks with throws etc)
#define BBF_LOOP_HEAD 0x00002000 // BB is the head of a loop
#define BBF_LOOP_CALL0 0x00004000 // BB starts a loop that sometimes won't call
#define BBF_LOOP_CALL1 0x00008000 // BB starts a loop that will always call
#define BBF_HAS_LABEL 0x00010000 // BB needs a label
#define BBF_JMP_TARGET 0x00020000 // BB is a target of an implicit/explicit jump
#define BBF_HAS_JMP 0x00040000 // BB executes a JMP instruction (instead of return)
#define BBF_GC_SAFE_POINT 0x00080000 // BB has a GC safe point (a call). More abstractly, BB does not
// require a (further) poll -- this may be because this BB has a
// call, or, in some cases, because the BB occurs in a loop, and
// we've determined that all paths in the loop body leading to BB
// include a call.
#define BBF_HAS_VTABREF 0x00100000 // BB contains reference of vtable
#define BBF_HAS_IDX_LEN 0x00200000 // BB contains simple index or length expressions on an array local var.
#define BBF_HAS_NEWARRAY 0x00400000 // BB contains 'new' of an array
#define BBF_HAS_NEWOBJ 0x00800000 // BB contains 'new' of an object type.
#if FEATURE_EH_FUNCLETS && defined(_TARGET_ARM_)
#define BBF_FINALLY_TARGET 0x01000000 // BB is the target of a finally return: where a finally will return during
// non-exceptional flow. Because the ARM calling sequence for calling a
// finally explicitly sets the return address to the finally target and jumps
// to the finally, instead of using a call instruction, ARM needs this to
// generate correct code at the finally target, to allow for proper stack
// unwind from within a non-exceptional call to a finally.
#endif // FEATURE_EH_FUNCLETS && defined(_TARGET_ARM_)
#define BBF_BACKWARD_JUMP 0x02000000 // BB is surrounded by a backward jump/switch arc
#define BBF_RETLESS_CALL 0x04000000 // BBJ_CALLFINALLY that will never return (and therefore, won't need a paired
// BBJ_ALWAYS); see isBBCallAlwaysPair().
#define BBF_LOOP_PREHEADER 0x08000000 // BB is a loop preheader block
#define BBF_COLD 0x10000000 // BB is cold
#define BBF_PROF_WEIGHT 0x20000000 // BB weight is computed from profile data
#ifdef LEGACY_BACKEND
#define BBF_FORWARD_SWITCH 0x40000000 // Aux flag used in FP codegen to know if a jmptable entry has been forwarded
#else // !LEGACY_BACKEND
#define BBF_IS_LIR 0x40000000 // Set if the basic block contains LIR (as opposed to HIR)
#endif // LEGACY_BACKEND
#define BBF_KEEP_BBJ_ALWAYS 0x80000000 // A special BBJ_ALWAYS block, used by EH code generation. Keep the jump kind
// as BBJ_ALWAYS. Used for the paired BBJ_ALWAYS block following the
// BBJ_CALLFINALLY block, as well as, on x86, the final step block out of a
// finally.
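Since every BBF_ value above occupies its own bit, a block's flags fit in one 32-bit word and are handled with ordinary bit operations. A minimal sketch follows. The four constants are copied from the list above; ToyBlock and the three helper functions are hypothetical and only illustrate the pattern.

```cpp
#include <cassert>
#include <cstdint>

// Values copied from the list above.
constexpr std::uint32_t BBF_DONT_REMOVE     = 0x00000010;
constexpr std::uint32_t BBF_IMPORTED        = 0x00000020;
constexpr std::uint32_t BBF_RUN_RARELY      = 0x00001000;
constexpr std::uint32_t BBF_KEEP_BBJ_ALWAYS = 0x80000000;

// Hypothetical stand-in for BasicBlock; only the flags word is modelled.
struct ToyBlock {
    std::uint32_t bbFlags;
};

inline void SetFlag(ToyBlock& b, std::uint32_t f)   { b.bbFlags |= f; }
inline void ClearFlag(ToyBlock& b, std::uint32_t f) { b.bbFlags &= ~f; }
inline bool HasFlag(const ToyBlock& b, std::uint32_t f) { return (b.bbFlags & f) != 0; }
```

Several flags can be set in one call by OR-ing them together, for example SetFlag(b, BBF_IMPORTED | BBF_RUN_RARELY).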
GenTree flags (gentree.h)
#define GTF_ASG 0x00000001 // sub-expression contains an assignment
#define GTF_CALL 0x00000002 // sub-expression contains a func. call
#define GTF_EXCEPT 0x00000004 // sub-expression might throw an exception
#define GTF_GLOB_REF 0x00000008 // sub-expression uses global variable(s)
#define GTF_ORDER_SIDEEFF 0x00000010 // sub-expression has a re-ordering side effect
// If you set these flags, make sure that code:gtExtractSideEffList knows how to find the tree,
// otherwise the C# (run csc /o-)
// var v = side_eff_operation
// with no use of v will drop your tree on the floor.
#define GTF_PERSISTENT_SIDE_EFFECTS (GTF_ASG | GTF_CALL)
#define GTF_SIDE_EFFECT (GTF_PERSISTENT_SIDE_EFFECTS | GTF_EXCEPT)
#define GTF_GLOB_EFFECT (GTF_SIDE_EFFECT | GTF_GLOB_REF)
#define GTF_ALL_EFFECT (GTF_GLOB_EFFECT | GTF_ORDER_SIDEEFF)
// The extra flag GTF_IS_IN_CSE is used to tell the consumer of these flags
// that we are calling in the context of performing a CSE, thus we
// should allow the run-once side effects of running a class constructor.
//
// The only requirement of this flag is that it not overlap any of the
// side-effect flags. The actual bit used is otherwise arbitrary.
#define GTF_IS_IN_CSE GTF_BOOLEAN
#define GTF_PERSISTENT_SIDE_EFFECTS_IN_CSE (GTF_ASG | GTF_CALL | GTF_IS_IN_CSE)
// Can any side-effects be observed externally, say by a caller method?
// For assignments, only assignments to global memory can be observed
// externally, whereas simple assignments to local variables can not.
//
// Be careful when using this inside a "try" protected region as the
// order of assignments to local variables would need to be preserved
// wrt side effects if the variables are alive on entry to the
// "catch/finally" region. In such cases, even assignments to locals
// will have to be restricted.
#define GTF_GLOBALLY_VISIBLE_SIDE_EFFECTS(flags) \
(((flags) & (GTF_CALL | GTF_EXCEPT)) || (((flags) & (GTF_ASG | GTF_GLOB_REF)) == (GTF_ASG | GTF_GLOB_REF)))
#define GTF_REVERSE_OPS \
0x00000020 // operand op2 should be evaluated before op1 (normally, op1 is evaluated first and op2 is evaluated
// second)
#define GTF_REG_VAL \
0x00000040 // operand is sitting in a register (or part of a TYP_LONG operand is sitting in a register)
#define GTF_SPILLED 0x00000080 // the value has been spilled
#ifdef LEGACY_BACKEND
#define GTF_SPILLED_OPER 0x00000100 // op1 has been spilled
#define GTF_SPILLED_OP2 0x00000200 // op2 has been spilled
#else
#define GTF_NOREG_AT_USE 0x00000100 // tree node is in memory at the point of use
#endif // LEGACY_BACKEND
#define GTF_ZSF_SET 0x00000400 // the zero(ZF) and sign(SF) flags set to the operand
#if FEATURE_SET_FLAGS
#define GTF_SET_FLAGS 0x00000800 // Requires that codegen for this node set the flags
// Use gtSetFlags() to check this flags
#endif
#define GTF_IND_NONFAULTING 0x00000800 // An indir that cannot fault. GTF_SET_FLAGS is not used on indirs
#define GTF_MAKE_CSE 0x00002000 // Hoisted Expression: try hard to make this into CSE (see optPerformHoistExpr)
#define GTF_DONT_CSE 0x00004000 // don't bother CSE'ing this expr
#define GTF_COLON_COND 0x00008000 // this node is conditionally executed (part of ? :)
#define GTF_NODE_MASK (GTF_COLON_COND)
#define GTF_BOOLEAN 0x00040000 // value is known to be 0/1
#define GTF_SMALL_OK 0x00080000 // actual small int sufficient
#define GTF_UNSIGNED 0x00100000 // with GT_CAST: the source operand is an unsigned type
// with operators: the specified node is an unsigned operator
#define GTF_LATE_ARG \
0x00200000 // the specified node is evaluated to a temp in the arg list, and this temp is added to gtCallLateArgs.
#define GTF_SPILL 0x00400000 // needs to be spilled here
#define GTF_SPILL_HIGH 0x00040000 // shared with GTF_BOOLEAN
#define GTF_COMMON_MASK 0x007FFFFF // mask of all the flags above
#define GTF_REUSE_REG_VAL 0x00800000 // This is set by the register allocator on nodes whose value already exists in the
// register assigned to this node, so the code generator does not have to generate
// code to produce the value.
// It is currently used only on constant nodes.
// It CANNOT be set on var (GT_LCL*) nodes, or on indir (GT_IND or GT_STOREIND) nodes, since
// it is not needed for lclVars and is highly unlikely to be useful for indir nodes
//---------------------------------------------------------------------
// The following flags can be used only with a small set of nodes, and
// thus their values need not be distinct (other than within the set
// that goes with a particular node/nodes, of course). That is, one can
// only test for one of these flags if the 'gtOper' value is tested as
// well to make sure it's the right operator for the particular flag.
//---------------------------------------------------------------------
// NB: GTF_VAR_* and GTF_REG_* share the same namespace of flags, because
// GT_LCL_VAR nodes may be changed to GT_REG_VAR nodes without resetting
// the flags. These are also used by GT_LCL_FLD.
#define GTF_VAR_DEF 0x80000000 // GT_LCL_VAR -- this is a definition
#define GTF_VAR_USEASG 0x40000000 // GT_LCL_VAR -- this is a use/def for a x<op>=y
#define GTF_VAR_USEDEF 0x20000000 // GT_LCL_VAR -- this is a use/def as in x=x+y (only the lhs x is tagged)
#define GTF_VAR_CAST 0x10000000 // GT_LCL_VAR -- has been explicitly cast (variable node may not be type of local)
#define GTF_VAR_ITERATOR 0x08000000 // GT_LCL_VAR -- this is an iterator reference in the loop condition
#define GTF_VAR_CLONED 0x01000000 // GT_LCL_VAR -- this node has been cloned or is a clone
// Relevant for inlining optimizations (see fgInlinePrependStatements)
// Cleanup: Currently, GTF_REG_BIRTH is used only by stackfp
// We should consider using it more generally for VAR_BIRTH, instead of
// GTF_VAR_DEF && !GTF_VAR_USEASG
#define GTF_REG_BIRTH 0x04000000 // GT_REG_VAR -- enregistered variable born here
#define GTF_VAR_DEATH 0x02000000 // GT_LCL_VAR, GT_REG_VAR -- variable dies here (last use)
#define GTF_VAR_ARR_INDEX 0x00000020 // The variable is part of (the index portion of) an array index expression.
// Shares a value with GTF_REVERSE_OPS, which is meaningless for local var.
#define GTF_LIVENESS_MASK (GTF_VAR_DEF | GTF_VAR_USEASG | GTF_VAR_USEDEF | GTF_REG_BIRTH | GTF_VAR_DEATH)
#define GTF_CALL_UNMANAGED 0x80000000 // GT_CALL -- direct call to unmanaged code
#define GTF_CALL_INLINE_CANDIDATE 0x40000000 // GT_CALL -- this call has been marked as an inline candidate
#define GTF_CALL_VIRT_KIND_MASK 0x30000000
#define GTF_CALL_NONVIRT 0x00000000 // GT_CALL -- a non virtual call
#define GTF_CALL_VIRT_STUB 0x10000000 // GT_CALL -- a stub-dispatch virtual call
#define GTF_CALL_VIRT_VTABLE 0x20000000 // GT_CALL -- a vtable-based virtual call
#define GTF_CALL_NULLCHECK 0x08000000 // GT_CALL -- must check instance pointer for null
#define GTF_CALL_POP_ARGS 0x04000000 // GT_CALL -- caller pop arguments?
#define GTF_CALL_HOISTABLE 0x02000000 // GT_CALL -- call is hoistable
#define GTF_CALL_REG_SAVE 0x01000000 // GT_CALL -- This call preserves all integer regs
// For additional flags for GT_CALL node see GTF_CALL_M_
#define GTF_NOP_DEATH 0x40000000 // GT_NOP -- operand dies here
#define GTF_FLD_NULLCHECK 0x80000000 // GT_FIELD -- need to nullcheck the "this" pointer
#define GTF_FLD_VOLATILE 0x40000000 // GT_FIELD/GT_CLS_VAR -- same as GTF_IND_VOLATILE
#define GTF_INX_RNGCHK 0x80000000 // GT_INDEX -- the array reference should be range-checked.
#define GTF_INX_REFARR_LAYOUT 0x20000000 // GT_INDEX -- same as GTF_IND_REFARR_LAYOUT
#define GTF_INX_STRING_LAYOUT 0x40000000 // GT_INDEX -- this uses the special string array layout
#define GTF_IND_VOLATILE 0x40000000 // GT_IND -- the load or store must use volatile semantics (this is a nop
// on X86)
#define GTF_IND_REFARR_LAYOUT 0x20000000 // GT_IND -- the array holds object refs (only affects layout of Arrays)
#define GTF_IND_TGTANYWHERE 0x10000000 // GT_IND -- the target could be anywhere
#define GTF_IND_TLS_REF 0x08000000 // GT_IND -- the target is accessed via TLS
#define GTF_IND_ASG_LHS 0x04000000 // GT_IND -- this GT_IND node is (the effective val) of the LHS of an
// assignment; don't evaluate it independently.
#define GTF_IND_UNALIGNED 0x02000000 // GT_IND -- the load or store is unaligned (we assume worst case
// alignment of 1 byte)
#define GTF_IND_INVARIANT 0x01000000 // GT_IND -- the target is invariant (a prejit indirection)
#define GTF_IND_ARR_LEN 0x80000000 // GT_IND -- the indirection represents an array length (of the REF
// contribution to its argument).
#define GTF_IND_ARR_INDEX 0x00800000 // GT_IND -- the indirection represents an (SZ) array index
#define GTF_IND_FLAGS \
(GTF_IND_VOLATILE | GTF_IND_REFARR_LAYOUT | GTF_IND_TGTANYWHERE | GTF_IND_NONFAULTING | GTF_IND_TLS_REF | \
GTF_IND_UNALIGNED | GTF_IND_INVARIANT | GTF_IND_ARR_INDEX)
#define GTF_CLS_VAR_ASG_LHS 0x04000000 // GT_CLS_VAR -- this GT_CLS_VAR node is (the effective val) of the LHS
// of an assignment; don't evaluate it independently.
#define GTF_ADDR_ONSTACK 0x80000000 // GT_ADDR -- this expression is guaranteed to be on the stack
#define GTF_ADDRMODE_NO_CSE 0x80000000 // GT_ADD/GT_MUL/GT_LSH -- Do not CSE this node only, forms complex
// addressing mode
#define GTF_MUL_64RSLT 0x40000000 // GT_MUL -- produce 64-bit result
#define GTF_MOD_INT_RESULT 0x80000000 // GT_MOD, -- the real tree represented by this
// GT_UMOD node evaluates to an int even though
// its type is long. The result is
// placed in the low member of the
// reg pair
#define GTF_RELOP_NAN_UN 0x80000000 // GT_<relop> -- Is branch taken if ops are NaN?
#define GTF_RELOP_JMP_USED 0x40000000 // GT_<relop> -- result of compare used for jump or ?:
#define GTF_RELOP_QMARK 0x20000000 // GT_<relop> -- the node is the condition for ?:
#define GTF_RELOP_SMALL 0x10000000 // GT_<relop> -- We should use a byte or short sized compare (op1->gtType
// is the small type)
#define GTF_RELOP_ZTT 0x08000000 // GT_<relop> -- Loop test cloned for converting while-loops into do-while
// with explicit "loop test" in the header block.
#define GTF_QMARK_CAST_INSTOF 0x80000000 // GT_QMARK -- Is this a top (not nested) level qmark created for
// castclass or instanceof?
#define GTF_BOX_VALUE 0x80000000 // GT_BOX -- "box" is on a value type
#define GTF_ICON_HDL_MASK 0xF0000000 // Bits used by handle types below
#define GTF_ICON_SCOPE_HDL 0x10000000 // GT_CNS_INT -- constant is a scope handle
#define GTF_ICON_CLASS_HDL 0x20000000 // GT_CNS_INT -- constant is a class handle
#define GTF_ICON_METHOD_HDL 0x30000000 // GT_CNS_INT -- constant is a method handle
#define GTF_ICON_FIELD_HDL 0x40000000 // GT_CNS_INT -- constant is a field handle
#define GTF_ICON_STATIC_HDL 0x50000000 // GT_CNS_INT -- constant is a handle to static data
#define GTF_ICON_STR_HDL 0x60000000 // GT_CNS_INT -- constant is a string handle
#define GTF_ICON_PSTR_HDL 0x70000000 // GT_CNS_INT -- constant is a ptr to a string handle
#define GTF_ICON_PTR_HDL 0x80000000 // GT_CNS_INT -- constant is a ldptr handle
#define GTF_ICON_VARG_HDL 0x90000000 // GT_CNS_INT -- constant is a var arg cookie handle
#define GTF_ICON_PINVKI_HDL 0xA0000000 // GT_CNS_INT -- constant is a pinvoke calli handle
#define GTF_ICON_TOKEN_HDL 0xB0000000 // GT_CNS_INT -- constant is a token handle
#define GTF_ICON_TLS_HDL 0xC0000000 // GT_CNS_INT -- constant is a TLS ref with offset
#define GTF_ICON_FTN_ADDR 0xD0000000 // GT_CNS_INT -- constant is a function address
#define GTF_ICON_CIDMID_HDL 0xE0000000 // GT_CNS_INT -- constant is a class or module ID handle
#define GTF_ICON_BBC_PTR 0xF0000000 // GT_CNS_INT -- constant is a basic block count pointer
#define GTF_ICON_FIELD_OFF 0x08000000 // GT_CNS_INT -- constant is a field offset
#define GTF_BLK_VOLATILE 0x40000000 // GT_ASG, GT_STORE_BLK, GT_STORE_OBJ, GT_STORE_DYNBLK
// -- is a volatile block operation
#define GTF_BLK_UNALIGNED 0x02000000 // GT_ASG, GT_STORE_BLK, GT_STORE_OBJ, GT_STORE_DYNBLK
// -- is an unaligned block operation
#define GTF_BLK_INIT 0x01000000 // GT_ASG, GT_STORE_BLK, GT_STORE_OBJ, GT_STORE_DYNBLK -- is an init block operation
#define GTF_OVERFLOW 0x10000000 // GT_ADD, GT_SUB, GT_MUL, - Need overflow check
// GT_ASG_ADD, GT_ASG_SUB,
// GT_CAST
// Use gtOverflow(Ex)() to check this flag
#define GTF_NO_OP_NO 0x80000000 // GT_NO_OP --Have the codegenerator generate a special nop
#define GTF_ARR_BOUND_INBND 0x80000000 // GT_ARR_BOUNDS_CHECK -- have proved this check is always in-bounds
#define GTF_ARRLEN_ARR_IDX 0x80000000 // GT_ARR_LENGTH -- Length which feeds into an array index expression
#define GTF_LIST_AGGREGATE 0x80000000 // GT_LIST -- Indicates that this list should be treated as an
// anonymous aggregate value (e.g. a multi-value argument).
//----------------------------------------------------------------
#define GTF_STMT_CMPADD 0x80000000 // GT_STMT -- added by compiler
#define GTF_STMT_HAS_CSE 0x40000000 // GT_STMT -- CSE def or use was substituted
//----------------------------------------------------------------
#if defined(DEBUG)
#define GTF_DEBUG_NONE 0x00000000 // No debug flags.
#define GTF_DEBUG_NODE_MORPHED 0x00000001 // the node has been morphed (in the global morphing phase)
#define GTF_DEBUG_NODE_SMALL 0x00000002
#define GTF_DEBUG_NODE_LARGE 0x00000004
#define GTF_DEBUG_NODE_MASK 0x00000007 // These flags are all node (rather than operation) properties.
#define GTF_DEBUG_VAR_CSE_REF 0x00800000 // GT_LCL_VAR -- This is a CSE LCL_VAR node
#endif // defined(DEBUG)
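Two points from the flag list above are easier to see in code: what GTF_GLOBALLY_VISIBLE_SIDE_EFFECTS actually tests, and why an operator-specific flag is only meaningful once the operator has been checked (GTF_VAR_DEF and GTF_CALL_UNMANAGED are both 0x80000000). The sketch below copies the flag values from the list; ToyOper, ToyNode and the helper functions are hypothetical names for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Values copied from the list above.
constexpr std::uint32_t GTF_ASG            = 0x00000001;
constexpr std::uint32_t GTF_CALL           = 0x00000002;
constexpr std::uint32_t GTF_EXCEPT         = 0x00000004;
constexpr std::uint32_t GTF_GLOB_REF       = 0x00000008;
constexpr std::uint32_t GTF_VAR_DEF        = 0x80000000;  // only valid on GT_LCL_VAR
constexpr std::uint32_t GTF_CALL_UNMANAGED = 0x80000000;  // only valid on GT_CALL

// Same test as the GTF_GLOBALLY_VISIBLE_SIDE_EFFECTS macro, written as a function:
// calls and exceptions are always visible; an assignment only if it touches global memory.
constexpr bool GloballyVisibleSideEffects(std::uint32_t flags) {
    return (flags & (GTF_CALL | GTF_EXCEPT)) != 0 ||
           (flags & (GTF_ASG | GTF_GLOB_REF)) == (GTF_ASG | GTF_GLOB_REF);
}

// Hypothetical miniature node: an operator plus a flags word.
enum ToyOper { TOY_LCL_VAR, TOY_CALL };
struct ToyNode {
    ToyOper       oper;
    std::uint32_t flags;
};

// The operator is checked first because the bit value is shared between operators.
inline bool IsLocalDef(const ToyNode& n) {
    return n.oper == TOY_LCL_VAR && (n.flags & GTF_VAR_DEF) != 0;
}
inline bool IsUnmanagedCall(const ToyNode& n) {
    return n.oper == TOY_CALL && (n.flags & GTF_CALL_UNMANAGED) != 0;
}
```

An assignment to a local (GTF_ASG alone) is not globally visible under this test, while GTF_ASG | GTF_GLOB_REF is.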
bbCatchTyp values
// Some non-zero value that will not collide with real tokens for bbCatchTyp
#define BBCT_NONE 0x00000000
#define BBCT_FAULT 0xFFFFFFFC
#define BBCT_FINALLY 0xFFFFFFFD
#define BBCT_FILTER 0xFFFFFFFE
#define BBCT_FILTER_HANDLER 0xFFFFFFFF
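The comment above says the BBCT_ values are chosen so they cannot collide with real tokens, which suggests bbCatchTyp is either one of these sentinels or a class token. The sketch below classifies a value on that assumption. The constants are copied from the list; HandlerStart, Classify and the sample token value used in testing are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Values copied from the list above.
constexpr std::uint32_t BBCT_NONE           = 0x00000000;
constexpr std::uint32_t BBCT_FAULT          = 0xFFFFFFFC;
constexpr std::uint32_t BBCT_FINALLY        = 0xFFFFFFFD;
constexpr std::uint32_t BBCT_FILTER         = 0xFFFFFFFE;
constexpr std::uint32_t BBCT_FILTER_HANDLER = 0xFFFFFFFF;

// Hypothetical classification of what kind of handler a block begins.
enum class HandlerStart { NotHandler, Fault, Finally, Filter, FilterHandler, TypedCatch };

inline HandlerStart Classify(std::uint32_t bbCatchTyp) {
    switch (bbCatchTyp) {
        case BBCT_NONE:           return HandlerStart::NotHandler;
        case BBCT_FAULT:          return HandlerStart::Fault;
        case BBCT_FINALLY:        return HandlerStart::Finally;
        case BBCT_FILTER:         return HandlerStart::Filter;
        case BBCT_FILTER_HANDLER: return HandlerStart::FilterHandler;
        default:                  return HandlerStart::TypedCatch;  // assumed: a class token
    }
}
```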
Data structures in the JIT and how they relate
MethodTable (vm\methodtable.h)
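Represents each ordinary, standalone type; this is the type information an Object points to
Every instantiation of a generic type produces a new MethodTable
TypeDesc (vm\typedesc.h)
Represents special types, including:
TypeVarTypeDesc: a generic type variable, e.g. the T in List<T>; not shared
FnPtrTypeDesc: a function pointer; not supported by C#, used by Managed C++
ParamTypeDesc: a byref or pointer type; byref is a type passed with ref or out, pointer types are used by unsafe C# or Managed C++
ArrayTypeDesc: an array type
TypeHandle (vm\typehandle.h)
Holds a pointer to either a MethodTable or a TypeDesc
When it holds a TypeDesc, the pointer value is OR-ed with 2 (|= 2), which is how the two kinds of pointer are told apart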
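The tagging trick works because both kinds of object are aligned so that bit 1 of their address is always zero, leaving it free to carry the "this is a TypeDesc" mark. A minimal sketch of the idea, with hypothetical ToyMethodTable, ToyTypeDesc and ToyTypeHandle types standing in for the real ones:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins; 8-byte alignment guarantees bit 1 of the address is zero.
struct alignas(8) ToyMethodTable { int id; };
struct alignas(8) ToyTypeDesc    { int kind; };

class ToyTypeHandle {
public:
    explicit ToyTypeHandle(ToyMethodTable* mt)
        : m_bits(reinterpret_cast<std::uintptr_t>(mt)) {}
    explicit ToyTypeHandle(ToyTypeDesc* td)
        : m_bits(reinterpret_cast<std::uintptr_t>(td) | 2) {}  // tag: TypeDesc

    bool IsTypeDesc() const { return (m_bits & 2) != 0; }

    ToyMethodTable* AsMethodTable() const {
        assert(!IsTypeDesc());
        return reinterpret_cast<ToyMethodTable*>(m_bits);
    }
    ToyTypeDesc* AsTypeDesc() const {
        assert(IsTypeDesc());
        // Strip the tag bit before using the value as a pointer again.
        return reinterpret_cast<ToyTypeDesc*>(m_bits & ~static_cast<std::uintptr_t>(2));
    }

private:
    std::uintptr_t m_bits;  // pointer value, possibly with the tag bit set
};
```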
CorElementType (inc\corhdr.h)
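Enumeration of element types, e.g. ELEMENT_TYPE_BOOLEAN, ELEMENT_TYPE_CHAR
EEClass (vm\class.h)
Type information used by the EE; holds cold data such as whether the type is abstract or an interface (data not needed at run time, only during type loading, reflection, and JIT compilation)
Normally one-to-one with MethodTable, unless the MethodTable is a generic instantiation
The MethodTables of several generic instantiations share a single EEClass, and EEClass.GetMethodTable returns the canonical MT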
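The sharing relationship can be pictured as several MethodTables pointing at one EEClass, which in turn points back at only one of them. The sketch below models just that shape. ToyEEClass, ToyMethodTable and the IsCanonical helper are hypothetical, and the instantiation names are plain labels chosen for illustration.

```cpp
#include <cassert>

struct ToyMethodTable;

// Hypothetical stand-in for EEClass: cold data plus a back pointer to the canonical MT.
struct ToyEEClass {
    bool            isInterface;
    ToyMethodTable* canonicalMT;
    ToyMethodTable* GetMethodTable() const { return canonicalMT; }
};

// Hypothetical stand-in for MethodTable: one per instantiation, pointing at the shared class.
struct ToyMethodTable {
    const char* name;
    ToyEEClass* eeClass;
    bool IsCanonical() const { return eeClass->GetMethodTable() == this; }
};
```

With one ToyEEClass shared by a canonical table and a second instantiation, both tables reach the same class, but GetMethodTable leads back only to the canonical one.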
MethodDesc (vm\method.hpp)
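Method information; referenced by EEClass and stored in MDChunks
FieldDesc (vm\field.hpp)
Field information; referenced by EEClass
CORINFO_METHOD_INFO
Public method information obtained from a MethodDesc, including ILCode and ILCodeSize
On the JIT path it is obtained through getMethodInfoHelper
CORINFO_EH_CLAUSE
Exception handler information
Members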
Flags
CORINFO_EH_CLAUSE_FLAGS
CORINFO_EH_CLAUSE_NONE = 0,
CORINFO_EH_CLAUSE_FILTER = 0x0001, // If this bit is on, then this EH entry is for a filter
CORINFO_EH_CLAUSE_FINALLY = 0x0002, // This clause is a finally clause
CORINFO_EH_CLAUSE_FAULT
TryOffset start offset of the try block
TryLength length of the try block
HandlerOffset start offset of the catch or finally block
HandlerLength length of the catch or finally block
union { ClassToken, FilterOffset }
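Given offset/length pairs, a common question is whether a particular IL offset lies inside the try or the handler of a clause. The sketch below answers it on the assumption that each region is the half-open range [offset, offset + length). ToyEhClause and the helper functions are hypothetical; the field names and the three flag values follow the list above.

```cpp
#include <cassert>
#include <cstdint>

// Flag values copied from the list above.
constexpr std::uint32_t CORINFO_EH_CLAUSE_NONE    = 0;
constexpr std::uint32_t CORINFO_EH_CLAUSE_FILTER  = 0x0001;
constexpr std::uint32_t CORINFO_EH_CLAUSE_FINALLY = 0x0002;

// Hypothetical stand-in mirroring the members listed above.
struct ToyEhClause {
    std::uint32_t Flags;
    std::uint32_t TryOffset;
    std::uint32_t TryLength;
    std::uint32_t HandlerOffset;
    std::uint32_t HandlerLength;
    union {
        std::uint32_t ClassToken;    // meaningful for a typed catch
        std::uint32_t FilterOffset;  // meaningful when the FILTER bit is set
    };
};

// Assumed: regions are half-open, so the end offset itself is outside the region.
inline bool InTry(const ToyEhClause& c, std::uint32_t ilOffset) {
    return ilOffset >= c.TryOffset && ilOffset < c.TryOffset + c.TryLength;
}
inline bool InHandler(const ToyEhClause& c, std::uint32_t ilOffset) {
    return ilOffset >= c.HandlerOffset && ilOffset < c.HandlerOffset + c.HandlerLength;
}
// Decides which member of the union should be read.
inline bool UsesFilterOffset(const ToyEhClause& c) {
    return (c.Flags & CORINFO_EH_CLAUSE_FILTER) != 0;
}
```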
EHblkDsc
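Block-level information about an exception handler
Members
ebdTryBeg first BasicBlock of the try
ebdTryLast last BasicBlock of the try
ebdHndBeg first BasicBlock of the handler
ebdHndLast last BasicBlock of the handler
union {
ebdFilter for a filter, the BasicBlock where the filter starts
ebdTyp for a catch, the class token being caught
}
ebdHandlerType handler kind: catch, filter, fault, or finally
ebdEnclosingTryIndex if the try is nested, index of the enclosing try's entry
ebdEnclosingHndIndex if the handler is nested, index of the enclosing handler's entry
ebdFuncIndex index of the EH funclet
ebdTryBegOffset IL offset where the try starts
ebdTryEndOffset IL offset where the try ends
ebdFilterBegOffset IL offset where the filter starts
ebdHndBegOffset IL offset where the handler starts
ebdHndEndOffset IL offset where the handler ends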
LIR::Range
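Covers one or more IL instructions
GenTree
A syntax-tree node, built from IL instructions
Members
gtOper the operator, e.g. GT_NOP, GT_ADDR
gtType the type the node evaluates to, e.g. TYP_VOID, TYP_INT
gtOperSave keeps the old gtOper when the gentree is destroyed; debug only
gtCSEnum set during CSE, when the node is found to be a candidate, to its index in optCSEhash
gtLIRFlags flags used in LIR
gtAssertionNum index into optAssertionTabPrivate assigned to the tree; an assertion can state, for example, that two trees are equal
gtCostsInitialized whether the gtCost values have been initialized
_gtCostEx execution cost
_gtCostSz code size cost
gtRegTag whether gtRegNum or gtRegPair has been assigned; debug only
union {
_gtRegNum the register assigned to the node
_gtRegPair the register pair assigned to the node; used only when two registers are needed
}
gtFlags flags, see the values starting with GTF_
gtDebugFlags debugging flags, see the values starting with GTF_DEBUG_
gtVNPair the node's value number; tells whether two trees are guaranteed to have the same value, which CSE can use
gtRsvdRegs set of registers that are clobbered once the node has executed
gtLsraInfo information used when LSRA allocates registers
gtNext next tree in the LIR
gtPrev previous tree in the LIR
gtTreeID id of the tree, unique within the method; debug only
gtSeqNum sequence order of the tree in the LIR; debug only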
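The gtNext and gtPrev members make the nodes of a range a doubly linked list that can be walked in either direction. The sketch below shows such a walk over three nodes. ToyNode, LinkAfter, Forward and Backward are hypothetical, and the operator names are plain strings here rather than real GT_ enum values.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature node with only the linkage members from the list above.
struct ToyNode {
    std::string gtOper;
    ToyNode*    gtNext;
    ToyNode*    gtPrev;
    explicit ToyNode(std::string oper)
        : gtOper(std::move(oper)), gtNext(nullptr), gtPrev(nullptr) {}
};

// Links 'next' directly after 'prev'.
inline void LinkAfter(ToyNode& prev, ToyNode& next) {
    prev.gtNext = &next;
    next.gtPrev = &prev;
}

// Walks from the first node to the last, following gtNext.
inline std::vector<std::string> Forward(const ToyNode* first) {
    std::vector<std::string> out;
    for (const ToyNode* n = first; n != nullptr; n = n->gtNext) out.push_back(n->gtOper);
    return out;
}

// Walks from the last node to the first, following gtPrev.
inline std::vector<std::string> Backward(const ToyNode* last) {
    std::vector<std::string> out;
    for (const ToyNode* n = last; n != nullptr; n = n->gtPrev) out.push_back(n->gtOper);
    return out;
}
```

For an expression such as x + 1, the sketch links the two operand nodes before the ADD node, so a forward walk visits operands first.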
GenTreeStmt
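A complete expression (statement); a BasicBlock is made up of one or more GenTreeStmt
BasicBlock
Holds a run of IL instructions; the last instruction may be a jump or a return, and no other instruction in the block should be one
Members
LIR::Range : LIR::ReadOnlyRange {
m_firstNode, first tree in the LIR
m_lastNode last tree in the LIR
}
bbNext next BasicBlock
bbPrev previous BasicBlock
bbNum sequence number, following the original instruction order
bbPostOrderNum position of the block in a postorder enumeration
bbRefs reference count; 0 means dead code
bbFlags flags, see the BasicBlock flags above
bbWeight weight of the block; the larger the value, the hotter the block (the more likely it is to run)
bbJumpKind how the block transfers to other BasicBlocks, see BBjumpKinds above
union {
bbJumpOffs, IL offset of the jump target; replaced by bbJumpDest later
bbJumpDest, target BasicBlock of the jump
bbJumpSwt, switch information used by the jump
}
bbEntryState stack state; records whether this has been initialized, plus an array of StackEntry
bbStkTempsIn first index of the spill temps coming in from predecessors
bbStkTempsOut first index of the temps this block itself spills
bbTryIndex if the code is inside a try, its index in the EH table
bbHndIndex if the code is inside a handler (catch etc.), its index in the EH table
bbCatchTyp set on the first BasicBlock of a catch
union { bbStkDepth, bbFPinVars }
union { bbCheapPreds, bbPreds }
bbReach set of blocks that can reach this block; computed transitively and includes the block itself
bbIDom the block's immediate dominator; see "Difference between Reachable and Dominates" below
bbDfsNum position of the block when blocks are explored in DFS reverse post order
bbDoms set of all dominators of the block; used only by assertion prop (assertion propagation)
bbCodeOffs IL offset where the block's instructions start
bbCodeOffsEnd IL offset where the block's instructions end, exclusive (the instruction at this offset is not part of the block)
bbVarUse set of local variables the block uses
bbVarDef set of local variables the block modifies
bbVarTmp temporary variables
bbLiveIn set of variables live on entry to the block
bbLiveOut set of variables live on exit from the block
bbHeapUse whether the block uses the global heap
bbHeapDef whether the block modifies the global heap
bbHeapLiveIn whether the global heap is live on entry to the block
bbHeapLiveOut whether the global heap is live on exit from the block
bbHeapHavoc whether the block puts the global heap into an unknown state
bbHeapSsaPhiFunc EmptyHeapPhiDefn, or a linked list of HeapPhiArg
bbHeapSsaNumIn SSA number of the global heap on entry to the block
bbHeapSsaNumOut SSA number of the global heap on exit from the block
bbScope which variables are in scope in the block, for debug support (they are not necessarily in bbVarUse or bbVarDef)
union { bbCseGen, bbAssertionGen } bit set of the indexes of the CSEs or assertions generated by the block
union { bbAssertionKill } bit set of the indexes of the assertions the block kills
union { bbCseIn, bbAssertionIn } bit set of the indexes of the CSEs or assertions valid on entry to the block
union { bbCseOut, bbAssertionOut } bit set of the indexes of the CSEs or assertions valid on exit from the block
bbEmitCookie the insGroup* for the block (pointer to its group of machine instructions)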
bbLoopNum
bbNatLoopNum
bbTgtStkDepth
bbStmtNum
bbTraversalStamp
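The four sets bbVarUse, bbVarDef, bbLiveIn and bbLiveOut fit the textbook backward liveness equations: liveOut is the union of the successors' liveIn, and liveIn = use ∪ (liveOut − def). The sketch below iterates those equations to a fixed point. It is the standard algorithm, not a copy of the JIT's code, and it assumes bbVarUse holds variables read before any write in the block. ToyBlock, VarSet, kMaxVars, succs and ComputeLiveness are hypothetical names.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kMaxVars = 8;       // hypothetical cap on tracked variables
using VarSet = std::bitset<kMaxVars>;

// Hypothetical miniature block: the four sets from the list above plus successor indexes.
struct ToyBlock {
    VarSet bbVarUse;
    VarSet bbVarDef;
    VarSet bbLiveIn;
    VarSet bbLiveOut;
    std::vector<std::size_t> succs;
};

// Repeats the two equations until no set changes.
inline void ComputeLiveness(std::vector<ToyBlock>& blocks) {
    bool changed = true;
    while (changed) {
        changed = false;
        // Visiting blocks in reverse order tends to converge faster for a backward problem.
        for (std::size_t i = blocks.size(); i-- > 0;) {
            ToyBlock& b = blocks[i];
            VarSet out;
            for (std::size_t s : b.succs) out |= blocks[s].bbLiveIn;
            const VarSet in = b.bbVarUse | (out & ~b.bbVarDef);
            if (in != b.bbLiveIn || out != b.bbLiveOut) {
                b.bbLiveIn  = in;
                b.bbLiveOut = out;
                changed     = true;
            }
        }
    }
}
```

For a straight-line chain where block 0 defines V0, block 1 reads V0 and defines V1, and block 2 reads V1, V0 is live only between blocks 0 and 1, and V1 only between blocks 1 and 2.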
Functions
BasicBlock::NumSucc(Compiler* comp)
Returns the number of successor BasicBlocks
BBJ_THROW and BBJ_RETURN return 0
BBJ_COND returns bbJumpDest == bbNext ? 1 : 2
BBJ_SWITCH returns bbsCount, and so on
BasicBlock::GetSucc(unsigned i, Compiler* comp)
Returns the successor block at the given index
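The NumSucc rules above can be restated as a small switch over the jump kind. In the sketch below, the BBJ_THROW, BBJ_RETURN, BBJ_COND and BBJ_SWITCH cases follow the notes; returning 1 for BBJ_NONE and BBJ_ALWAYS is an assumption of this sketch, and bbsCount is flattened into the block instead of living in the switch descriptor. ToyJumpKind and ToyBlock are hypothetical.

```cpp
#include <cassert>

// Subset of the jump kinds listed earlier; the enum itself is a stand-in.
enum ToyJumpKind { BBJ_THROW, BBJ_RETURN, BBJ_NONE, BBJ_ALWAYS, BBJ_COND, BBJ_SWITCH };

// Hypothetical miniature block with only the members the rule needs.
struct ToyBlock {
    ToyJumpKind bbJumpKind;
    ToyBlock*   bbNext;      // fall-through block
    ToyBlock*   bbJumpDest;  // jump target, if any
    unsigned    bbsCount;    // number of switch targets (flattened here)
};

inline unsigned NumSucc(const ToyBlock& b) {
    switch (b.bbJumpKind) {
        case BBJ_THROW:
        case BBJ_RETURN:
            return 0;                                    // control leaves the method
        case BBJ_NONE:
        case BBJ_ALWAYS:
            return 1;                                    // assumed: exactly one way out
        case BBJ_COND:
            return b.bbJumpDest == b.bbNext ? 1u : 2u;   // both arms may be the same block
        case BBJ_SWITCH:
            return b.bbsCount;
    }
    return 0;
}
```

The BBJ_COND case is the interesting one: when the branch target is also the fall-through block, the two edges collapse into a single successor.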
Compiler
The class used to compile a single method
A separate instance is created for each method
Members
hbvGlobalData
verbose
dumpIR
dumpIRNodes
dumpIRTypes
dumpIRKinds
dumpIRLocals
dumpIRRegs
dumpIRSsa
dumpIRValnums
dumpIRCosts
dumpIRNoLists
dumpIRNoStmts
dumpIRTrees
dumpIRLinear
dumpIRDataflow
dumpIRBlockHeaders
dumpIRExit
dumpIRPhase
dumpIRFormat
verboseTrees
asciiTrees
verboseSsa
treesBeforeAfterMorph
morphNum
expensiveDebugCheckLevel
ehnTree
ehnNext
m_blockToEHPreds
fgNeedToSortEHTable
fgSafeBasicBlockCreation
lvaRefCountingStarted
lvaLocalVarRefCounted
lvaSortAgain
lvaTrackedFixed
lvaCount
lvaRefCount
lvaTable
lvaTableCnt
lvaRefSorted
lvaTrackedCount
lvaTrackedCountInSizeTUnits
lvaFirstStackIncomingArgNum
lvaTrackedVars
lvaFloatVars
lvaCurEpoch
lvaTrackedToVarNum
lvaVarPref
lvaVarargsHandleArg
lvaInlinedPInvokeFrameVar
lvaReversePInvokeFrameVar
lvaPInvokeFrameRegSaveVar
lvaMonAcquired
lvaArg0Var
lvaInlineeReturnSpillTemp
lvaOutgoingArgSpaceVar
lvaOutgoingArgSpaceSize
lvaReturnEspCheck
lvaGenericsContextUsed
lvaCachedGenericContextArgOffs
lvaLocAllocSPvar
lvaNewObjArrayArgs
lvaGSSecurityCookie
lvaSecurityObject
lvaStubArgumentVar
lvaPSPSym
impInlineInfo
m_inlineStrategy
fgNoStructPromotion
fgNoStructParamPromotion
lvaMarkRefsCurBlock
lvaMarkRefsCurStmt
lvaMarkRefsWeight
lvaHeapPerSsaData
lvaHeapNumSsaNames
impStkSize
impSmallStack
impTreeList
impTreeLast
impTokenLookupContextHandle
impCurOpcOffs
impCurOpcName
impNestedStackSpill
impLastILOffsStmt
impCurStmtOffs
impPendingList
impPendingFree
impPendingBlockMembers
impCanReimport
impSpillCliquePredMembers
impSpillCliqueSuccMembers
impBlockListNodeFreeList
seenConditionalJump
fgFirstBB
fgLastBB
fgFirstColdBlock
fgFirstFuncletBB
fgFirstBBScratch
fgReturnBlocks
fgEdgeCount
fgBBcount
fgBBcountAtCodeGen
fgBBNumMax
fgDomBBcount
fgBBInvPostOrder
fgDomTreePreOrder
fgDomTreePostOrder
fgBBVarSetsInited
fgCurBBEpoch
fgCurBBEpochSize
fgBBSetCountInSizeTUnits
fgMultipleNots
fgModified
fgComputePredsDone
fgCheapPredsValid
fgDomsComputed
fgHasSwitch
fgHasPostfix
fgIncrCount
fgEnterBlks
fgReachabilitySetsValid
fgEnterBlksSetValid
fgRemoveRestOfBlock
fgStmtRemoved
fgOrder
fgStmtListThreaded
fgCanRelocateEHRegions
fgEdgeWeightsComputed
fgHaveValidEdgeWeights
fgSlopUsedInEdgeWeights
fgRangeUsedInEdgeWeights
fgNeedsUpdateFlowGraph
fgCalledWeight
fgFuncletsCreated
fgGlobalMorph
fgExpandInline
impBoxTempInUse
jitFallbackCompile
impInlinedCodeSize
fgReturnCount
fgMarkIntfUnionVS
m_opAsgnVarDefSsaNums
m_indirAssignMap
fgSsaPassesCompleted
vnStore
fgVNPassesCompleted
fgCurHeapVN
fgGCPollsCreated
m_switchDescMap
fgLoopCallMarked
fgBBs
fgProfileData_ILSizeMismatch
fgProfileBuffer
fgProfileBufferCount
fgNumProfileRuns
fgTreeSeqNum
fgTreeSeqBeg
fgPtrArgCntCur
fgPtrArgCntMax
fgOutgoingArgTemps
fgCurrentlyInUseArgTemps
fgPreviousCandidateSIMDFieldAsgStmt
fgMorphStmt
fgCurUseSet
fgCurDefSet
fgCurHeapUse
fgCurHeapDef
fgCurHeapHavoc
fgAddCodeList
fgAddCodeModf
fgRngChkThrowAdded
fgExcptnTargetCache
fgBigOffsetMorphingTemps
fgPrintInlinedMethods
fgHasLoops
optLoopTable
optLoopCount
optCallCount
optIndirectCallCount
optNativeCallCount
optLoopsCloned
optCSEhash
optCSEtab
optDoCSE
optValnumCSE_phase
optCSECandidateTotal
optCSECandidateCount
optCSEstart
optCSEcount
optCSEweight
optCopyPropKillSet
optMethodFlags
apTraits
apFull
apEmpty
optAddCopyLclNum
optAddCopyAsgnNode
optLocalAssertingProp
optAssertionPropagated
optAssertionPropagatedCurrentStmt
optAssertionPropCurrentTree
optComplementaryAssertionMap
optAssertionDep
optAssertionTabPrivate
optAssertionCount
optMaxAssertionCount
bbJtreeAssertionOut
optValueNumToAsserts
optRngChkCount
optLoopsMarked
raRegVarsMask
rpFrameType
rpMustCreateEBPCalled
m_pLinearScan
eeInfo
eeInfoInitialized
eeBoundariesCount
eeBoundaries
eeVarsCount
eeVars
tmpCount
tmpSize
tmpGetCount
tmpFree
tmpUsed
codeGen
genIPmappingList
genIPmappingLast
genCallSite2ILOffsetMap
genReturnLocal
genReturnBB
compFuncInfos
compCurrFuncIdx
compFuncInfoCount
compCurLife
compCurLifeTree
m_promotedStructDeathVars
featureSIMD
lvaSIMDInitTempVarNum
SIMDFloatHandle
SIMDDoubleHandle
SIMDIntHandle
SIMDUShortHandle
SIMDUByteHandle
SIMDLongHandle
SIMDUIntHandle
SIMDULongHandle
SIMDVector2Handle
SIMDVector3Handle
SIMDVector4Handle
SIMDVectorHandle
SIMDVectorFloat_set_Item
SIMDVectorFloat_get_Length
SIMDVectorFloat_op_Addition
InlineeCompiler
compInlineResult
compDoAggressiveInlining
compJmpOpUsed
compLongUsed
compFloatingPointUsed
compTailCallUsed
compLocallocUsed
compQmarkUsed
compQmarkRationalized
compUnsafeCastUsed
compQMarks
compBlkOpUsed
bRangeAllowStress
compCodeGenDone
compNumStatementLinksTraversed
fgNormalizeEHDone
compSizeEstimate
compCycleEstimate
fgLocalVarLivenessDone
fgLocalVarLivenessChanged
compStackProbePrologDone
compLSRADone
compRationalIRForm
compUsesThrowHelper
compGeneratingProlog
compGeneratingEpilog
compNeedsGSSecurityCookie
compGSReorderStackLayout
lvaDoneFrameLayout
inlRNG
s_compMethodsCount
compGenTreeID
compCurBB
compCurStmt
compCurStmtNum
compInfoBlkSize
compInfoBlkAddr
compHndBBtab
compHndBBtabCount
compHndBBtabAllocCount
syncStartEmitCookie
syncEndEmitCookie
previousCompletedPhase
compLclFrameSize
compCalleeRegsPushed
compCalleeFPRegsSavedMask
compVSQuirkStackPaddingNeeded
compQuirkForPPPflag
compArgSize
genMemStats
m_loopsConsidered
m_curLoopHasHoistedExpression
m_loopsWithHoistedExpressions
m_totalHoistedExpressions
compVarScopeMap
compAsIAllocator
compAsIAllocatorBitset
compAsIAllocatorGC
compAsIAllocatorLoopHoist
compAsIAllocatorDebugOnly
tiVerificationNeeded
tiIsVerifiableCode
tiRuntimeCalloutNeeded
tiSecurityCalloutNeeded
verCurrentState
verTrackObjCtorInitState
compMayHaveTransitionBlocks
raMaskDontEnregFloat
raLclRegIntfFloat
raCntStkStackFP
raCntWtdStkDblStackFP
raCntStkParamDblStackFP
raPayloadStackFP
raHeightsStackFP
raHeightsNonWeightedStackFP
compDebugBreak
gsGlobalSecurityCookieAddr
gsGlobalSecurityCookieVal
gsShadowVarInfo
gsMarkPtrsAndAssignGroups
gsReplaceShadowParams
pCompJitTimer
s_compJitTimerSummary
m_compCyclesAtEndOfInlining
m_compCycles
m_compTickCountAtEndOfInlining
compJitFuncInfoFilename
compJitFuncInfoFile
prevCompiler
m_nodeTestData
m_loopHoistCSEClass
m_fieldSeqStore
m_zeroOffsetFieldMap
m_arrayInfoMap
m_heapSsaMap
m_refAnyClass
typeInfo
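Type information; holds a type tag (TI_REF, TI_I_IMPL, etc.) and a class handle (or a method handle when the tag is TI_METHOD)
BasicBlockList
A linked list of BasicBlocks, with block and next members
flowList
A linked list of BasicBlocks that also records block, edge weight, and dup count
block is the source of the edge
For edge weight, see the explanation below
dup count is the number of duplicate edges when a block has several edges to the same target (only happens with switch blocks)
StackEntry
Holds a gentree and a typeInfo; stored as an array in EntryState
EHSuccessorIter
Enumerates the EH successors of a block
For example, if block 1 is inside a try block, the first block of the catch (or finally) is an EH successor of block 1
AllSuccessorIter
Enumerates both the normal successors and the EH successors of a block
FieldSeqNode
A linked list of field information (CORINFO_FIELD_HANDLE) used by gentree
It is kept even after GT_FIELD has been lowered to GT_IND and similar nodes
FieldSeqStore
Returns the same FieldSeqNode for the same CORINFO_FIELD_HANDLE (one canonical instance per handle)
GenTreeUseEdgeIterator
Enumerator over the use edges of a tree
gentree has the function IteratorPair<GenTreeUseEdgeIterator> UseEdges() to get the range of uses
The enumerator yields GenTree** (a pointer to the use edge)
GenTreeOperandIterator
Enumerator over the operands of a tree
For example, a unary node has one operand, a binary node has two, and some trees have more
The enumerator yields GenTree*
GenTreeUnOp
Tree for a unary operator, with one operand: GT_ARR_LENGTH, GT_BOX, etc.
GenTreeVal
Tree carrying a single size_t value; a general-purpose type (GT_JMP, GT_END_LFIN, etc.)
GenTreePhysReg
Tree carrying a register, GT_PHYSREG (when it appears, the value in that register is used)
GenTreeIntCon
Tree for a constant int, holds the int constant, GT_CNS_INT
GenTreeLngCon
Tree for a constant long, holds the long constant, GT_CNS_LNG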
ICodeManager, EECodeManager (inc\eetwain.h, vm\eetwain.cpp)
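Holds the frame and GC information of JIT-compiled functions
Responsible for exception handling, unwinding frames, enumerating GC roots, and so on
RangeSection (codeman.h)
Holds the PC range of JIT-compiled functions and the corresponding PTR_IJitManager
Used to find which function a PC belongs to
IJitManager
Manages JIT-compiled code; contains an ICodeManager
EEJitManager : IJitManager
Contains an ICorJitCompiler, obtained from the getJit() function in jit\ee_il_dll.cpp
Other implementations such as NativeImageJitManager and ReadyToRunJitManager exist, but they are not analyzed here
ExecutionManager (codeman.h)
Contains the linked list of RangeSections (m_CodeRangeList)
Contains the global EEJitManager (m_pEEJitManager)
Contains the global NativeImageJitManager (m_pNativeImageJitManager)
Contains the global ReadyToRunJitManager (m_pReadyToRunJitManager)
Contains the global EECodeManager (m_pDefaultCodeMan)
MorphAddrContext
Stores address-related context information for a GenTree
For example, whether a byref node will be dereferenced (ind) right away or its value will be used, e.g. passed as an argument; this affects how the null check is done
There are three kinds: MACK_Ind, MACK_Addr, MACK_CopyBlock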
CodeGen
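The class responsible for the JIT backend (code generation)
emitter
The class responsible for writing out the machine code
emitLocation
Records the position of an instruction (the ig it belongs to and the offset within that ig); see CaptureLocation
insGroup
A group of instructions, similar in role to a BasicBlock; abbreviated as ig
A jump instruction only appears at the end of an ig, and a jump target can only be the start of an ig
Besides jumps, an ig is also split by size limits and by whether it is interruptible, which differs from BasicBlock
instrDesc
Information for a single instruction; there are many subtypes, e.g. instrDescJmp and instrDescCns
GCInfo
Holds the GC information used by the current function; GcInfoEncoder writes it into the function header
GcInfoEncoder
The class that writes GC information into the function header
It first writes m_Info1 and m_Info2, then merges and copies them to pRealCodeHeader->phdrJitGCInfo
varPtrDsc
Records the lifetime of a GC variable on the stack; used when generating gcinfo
regPtrDsc
Records the lifetime of a GC variable in a register; used when generating gcinfo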
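Difference between lclVar and lclFld
lclVar reads a local variable, e.g. reading a after var a = 0;
lclFld reads a field of a local, e.g. reading b.x after var b = new MyStruct();
What is a STUB
It is used to JIT-compile a method the first time it is called; later calls go to the JIT-compiled code
For example, before JIT it is call PrecodeFixupThunk; pop esi; dword pMethodDesc;
After JIT it becomes jmp target; pop edi; dword pMethodDesc
What is a Funclet
A small function generated for regions such as finally and catch
It has its own prolog and epilog and is invoked with call
Funclets are not generated on x86
A funclet receives the stack pointer of the parent function so it can access the parent's locals; it may be called by the EH machinery, so the stack address has to be passed explicitly
A catch (or filter-handler) funclet returns the address at which execution resumes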
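Format and contents of a funclet
The environment described below is x64; it does not apply to other platforms
Source: codegencommon.cpp:9875
Incoming arguments of a funclet
catch/filter-handler: rcx = InitialSP, rdx = the caught exception object (GT_CATCH_ARG)
filter: rcx = InitialSP, rdx = the caught exception object (GT_CATCH_ARG)
finally/fault: rcx = InitialSP
Return value of a funclet
catch/filter-handler: rax = the address at which to resume execution (BBJ_EHCATCHRET)
filter: rax = nonzero if the handler should handle this exception, zero if it should not
finally/fault: no return value
Layout of the funclet frame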
incoming arguments
===================== (Caller's SP)
return address
saved ebp
callee saved registers
possible 8 byte pad for alignment
PSP slot (the funclet's own PSPSym, [rsp+0x20])
Outgoing arg space (0x20 bytes, present if the funclet calls other functions)
the current rsp
Funclet example
Code
int x = GetX();
try {
Console.WriteLine(x);
throw new Exception("abc");
} catch (Exception ex) {
Console.WriteLine(ex);
Console.WriteLine(x);
}
Generated main function
00007FFF0FEC0480 55 push rbp // save the original rbp
00007FFF0FEC0481 56 push rsi // save the original rsi
00007FFF0FEC0482 48 83 EC 38 sub rsp,38h // reserve space for locals, size 0x38
00007FFF0FEC0486 48 8D 6C 24 40 lea rbp,[rsp+40h] // rbp = the value of rsp right after push rbp (0x38 + 0x8 for the saved rsi)
00007FFF0FEC048B 48 89 65 E0 mov qword ptr [rbp-20h],rsp // save rsp (after the locals were reserved) into local [rbp-0x20], which is PSPSym
00007FFF0FEC048F E8 24 FC FF FF call 00007FFF0FEC00B8 // call GetX()
00007FFF0FEC0494 89 45 F4 mov dword ptr [rbp-0Ch],eax // store the return value in local [rbp-0x0c], which is x
185: try {
186: Console.WriteLine(x);
00007FFF0FEC0497 8B 4D F4 mov ecx,dword ptr [rbp-0Ch] // x => first argument
00007FFF0FEC049A E8 B9 FE FF FF call 00007FFF0FEC0358 // call Console.WriteLine
187: throw new Exception("abc");
00007FFF0FEC049F 48 B9 B8 58 6C 6E FF 7F 00 00 mov rcx,7FFF6E6C58B8h // MethodTable of Exception => first argument
00007FFF0FEC04A9 E8 A2 35 B1 5F call 00007FFF6F9D3A50 // call CORINFO_HELP_NEWFAST (JIT_New, or the assembly version)
00007FFF0FEC04AE 48 8B F0 mov rsi,rax // keep the exception object in rsi
00007FFF0FEC04B1 B9 12 02 00 00 mov ecx,212h // rid => first argument
00007FFF0FEC04B6 48 BA 78 4D D6 0F FF 7F 00 00 mov rdx,7FFF0FD64D78h // module handle => second argument
00007FFF0FEC04C0 E8 6B 20 AF 5F call 00007FFF6F9B2530 // call CORINFO_HELP_STRCNS (JIT_StrCns), which lazily loads the string literal object
00007FFF0FEC04C5 48 8B D0 mov rdx,rax // string literal object => second argument
00007FFF0FEC04C8 48 8B CE mov rcx,rsi // exception object => first argument
00007FFF0FEC04CB E8 20 07 43 5E call 00007FFF6E2F0BF0 // call System.Exception:.ctor
00007FFF0FEC04D0 48 8B CE mov rcx,rsi // exception object => first argument
00007FFF0FEC04D3 E8 48 FC A0 5F call 00007FFF6F8D0120 // call CORINFO_HELP_THROW (IL_Throw)
00007FFF0FEC04D8 CC int 3 // unreachable
00007FFF0FEC04D9 48 8D 65 F8 lea rsp,[rbp-8] // restore rsp to its value right after rbp and rsi were saved
00007FFF0FEC04DD 5E pop rsi // restore rsi
00007FFF0FEC04DE 5D pop rbp // restore rbp
00007FFF0FEC04DF C3 ret
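Generated funclet
00007FFF0FEC04E0 55 push rbp // save rbp
00007FFF0FEC04E1 56 push rsi // save rsi
00007FFF0FEC04E2 48 83 EC 28 sub rsp,28h // reserve 0x28 on the funclet's own stack (PSP slot 0x8 + outgoing arg space 0x20, needed if the funclet calls other functions)
00007FFF0FEC04E6 48 8B 69 20 mov rbp,qword ptr [rcx+20h] // rcx is InitialSP (rsp after the locals were reserved)
// in the main function rbp and rsp differ by 0x40, so [InitialSP+20h] equals [rbp-20h], which is PSPSym
// in this example there is only one level, so the value stored in PSPSym is the same as the incoming rcx (InitialSP)
00007FFF0FEC04EA 48 89 6C 24 20 mov qword ptr [rsp+20h],rbp // copy PSPSym into the funclet's own frame
00007FFF0FEC04EF 48 8D 6D 40 lea rbp,[rbp+40h] // rbp and rsp of the main function differ by 0x40, so this recomputes the main function's rbp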
188: } catch (Exception ex) {
189: Console.WriteLine(ex);
00007FFF0FEC04F3 48 8B CA mov rcx,rdx // rdx is the exception object, move it to the first argument
00007FFF0FEC04F6 E8 7D FE FF FF call 00007FFF0FEC0378 // call Console.WriteLine
190: Console.WriteLine(x);
00007FFF0FEC04FB 8B 4D F4 mov ecx,dword ptr [rbp-0Ch] // [rbp-0xc] is variable x, move it to the first argument
00007FFF0FEC04FE E8 55 FE FF FF call 00007FFF0FEC0358 // call Console.WriteLine
00007FFF0FEC0503 48 8D 05 CF FF FF FF lea rax,[7FFF0FEC04D9h] // address at which to resume execution
00007FFF0FEC050A 48 83 C4 28 add rsp,28h // release the space reserved on the funclet's stack
00007FFF0FEC050E 5E pop rsi // restore rsi
00007FFF0FEC050F 5D pop rbp // restore rbp
00007FFF0FEC0510 C3 ret
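What are Spill Temps
When a BasicBlock ends, values still left on the ExecutionStack must first be saved to local variables
These temporaries are called Spill Temps
What are Spill Cliques
A Spill Clique is the group of blocks that must agree on the same set of Spill Temps (the blocks that spill them and the blocks that read them back)
The base (first) temp is recorded in bbStkTempsIn and bbStkTempsOut
When are internal BasicBlocks inserted
Internal blocks are marked with BBF_INTERNAL
The first block (fgFirstBB)
Inserted in fgEnsureFirstBBisScratch
The block that merges multiple returns into a single return
Inserted in fgAddInternal
THEN and ELSE of QMARK
In QMARK COLON, op1 is the else part and op2 is the then part; see the ThenNode and ElseNode functions for details
What are Back Edges
An edge that is a return, or that jumps to an earlier BasicBlock
If a BasicBlock sits on such an edge, it is marked BBF_NEEDS_GCPOLL
What is GC Poll
In some situations the GC needs to stop all managed threads, so checks for the GC have to be inserted into the code
If a block is marked BBF_NEEDS_GCPOLL, code that calls CORINFO_HELP_POLL_GC (JIT_PollGC) is inserted at its end
For example, the jumping block of a loop (a back edge) needs one; otherwise an infinite loop would keep the GC waiting forever
JIT_PollGC calls PulseGCMode
PulseGCMode calls EnablePreemptiveGC and then DisablePreemptiveGC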
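What is Value Numbering
Reference: https://en.wikipedia.org/wiki/Global_value_numbering
A VN is a tag identifying the value of a tree; if two trees have the same VN they are known to produce the same value, and CSE can then be applied
Computing VNs requires SSA to be computed first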
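A minimal sketch of the idea, not the JIT's vnStore: hash-based value numbering over straight-line expressions where every name is assigned once (the SSA assumption). The tuple format and the function name value_number are made up for illustration.

```python
def value_number(exprs):
    """Assign value numbers to (dest, op, lhs, rhs) tuples, in order.

    Assumes every dest is assigned exactly once (SSA-like input).
    Returns (name -> value number, list of dests that recompute a known value).
    """
    table = {}    # ("leaf", name) or (op, vn_lhs, vn_rhs) -> value number
    var_vn = {}   # name -> value number
    redundant = []

    def vn_of(operand):
        # A name defined earlier reuses its number; anything else is a leaf.
        if operand in var_vn:
            return var_vn[operand]
        key = ("leaf", operand)
        if key not in table:
            table[key] = len(table)
        return table[key]

    for dest, op, lhs, rhs in exprs:
        key = (op, vn_of(lhs), vn_of(rhs))
        if key in table:
            redundant.append(dest)   # same VN already computed: CSE candidate
        else:
            table[key] = len(table)
        var_vn[dest] = table[key]
    return var_vn, redundant


exprs = [
    ("t1", "+", "a", "b"),
    ("t2", "+", "a", "b"),    # same value as t1
    ("t3", "*", "t1", "c"),
    ("t4", "*", "t2", "c"),   # same value as t3, because t1 and t2 share a VN
]
var_vn, redundant = value_number(exprs)
print(redundant)
```

t4 is only recognized as redundant because t1 and t2 already share a number, which is why the numbering works on values rather than on the text of the expression.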
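What is a Cold Block
Code (basic blocks) that rarely runs is classified as cold
When the machine code is actually generated, hot and cold code are stored in two separate places
Hot and cold code are split to raise the CPU cache hit rate and so improve execution performance
What is Block Weight
bbWeight in BasicBlock is the weight of the block; the larger the value, the hotter the block (the more likely it is to run)
What is Edge Weight
flEdgeWeightMin and flEdgeWeightMax in flowList* bbPreds
See the comment in block.h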
In compiler terminology the control flow between two BasicBlocks
is typically referred to as an "edge". Most well known are the
backward branches for loops, which are often called "back-edges".
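A jump between two blocks can be called an edge
The edge weight indicates how often that jump is taken; the larger the value, the more frequent
flEdgeWeightMin and flEdgeWeightMax give the range of the value; they are usually equal
Edge weights are computed in fgComputeEdgeWeights, and the result can be used by the fgReorderBlocks optimization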
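Difference between Reachable and Dominates
Two blocks can be reachable from each other, but they cannot dominate each other
In the dom tree a node dominates the nodes below it, so the higher (ancestor) node is a dominator of the lower one
The dom tree is built by visiting blocks in reverse DFS postorder
Given A -> B, A -> C, B -> D, C -> D, the dominator of D is A rather than B or C, meaning that execution must pass through A to reach D
Reference: https://en.wikipedia.org/wiki/Dominator_(graph_theory)
Reference: https://www.cs.rice.edu/~keith/EMBED/dom.pdf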
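A small sketch that checks the diamond example above. It uses the plain set-based iterative formulation (dom(n) = {n} plus the intersection of dom over the predecessors), not the faster algorithm from the dom.pdf paper. The function names are made up for illustration.

```python
def dominators(succs, entry):
    """Return {node: set of nodes that dominate it}, computed iteratively."""
    nodes = list(succs)
    preds = {n: [] for n in nodes}
    for n, targets in succs.items():
        for t in targets:
            preds[t].append(n)

    dom = {n: set(nodes) for n in nodes}   # start from "everything dominates"
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes:
            if n == entry:
                continue
            new = set(nodes)
            for p in preds[n]:
                new &= dom[p]              # must be on every incoming path
            new |= {n}                     # a node always dominates itself
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


def immediate_dominator(dom, n):
    """The strict dominator of n that all other strict dominators dominate."""
    strict = dom[n] - {n}
    for d in strict:
        if strict <= dom[d]:
            return d
    return None


cfg = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
dom = dominators(cfg, "A")
print(sorted(dom["D"]), immediate_dominator(dom, "D"))
```

Neither B nor C ends up in dom["D"], because each of them can be bypassed through the other branch.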
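What is a Dominance Frontier
Given A -> B, A -> C, B -> D, C -> D, the Dominance Frontier of B and of C is D
D is where the different branches join, so for the nodes on the way to the join D is the Dominance Frontier (the first node they reach but no longer dominate)
For the algorithm, see: https://en.wikipedia.org/wiki/Static_single_assignment_form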
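A sketch of the usual "walk up from each predecessor of a join node" computation, applied to the same diamond. The immediate dominators are written out by hand here (every node other than A has idom A); the function name is made up for illustration.

```python
def dominance_frontiers(preds, idom):
    """Return {node: set of nodes in its dominance frontier}."""
    df = {n: set() for n in preds}
    for n, ps in preds.items():
        if len(ps) < 2:
            continue                  # only join points contribute
        for p in ps:
            runner = p
            # Every node from p up to (not including) idom(n) reaches n
            # without dominating it, so n is in its frontier.
            while runner is not None and runner != idom[n]:
                df[runner].add(n)
                runner = idom[runner]
    return df


# A -> B, A -> C, B -> D, C -> D
preds = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}
idom = {"A": None, "B": "A", "C": "A", "D": "A"}   # given by hand for this graph
df = dominance_frontiers(preds, idom)
print(df)
```

This is the set SSA construction consults when it places phi nodes: a definition in B or C needs a phi at D.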
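What is the Cookie in Varargs
If the target function takes __arglist, a cookie must be passed; its value is a pointer to a VASigCookie
VASigCookie contains
sizeOfArgs: the size of the arguments on the stack (arguments in registers are not counted; normal arguments that are not part of __arglist are counted)
Currently not supported by CoreCLR
When do Phi Nodes disappear
They do not disappear; they are kept all the way to CodeGen
What is ArgPlace
If a call argument does not need to be saved in a temporary and is passed in a register,
the argument is replaced with an argplace node and the original expression is moved to gtCallLateArgs
Arguments marked needPlace (e.g. a nested call) are also replaced with argPlace nodes
In the end gtCallLateArgs holds the arguments passed in registers and those marked needPlace, while gtCallArgs holds the arguments passed through the outgoing arg area or the stack
Reference: https://github.com/dotnet/coreclr/blob/fe98261c0fef181197380d0b42c482eb6eddcf94/Documentation/design-docs/jit-call-morphing.md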
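What is the difference between RBM and REG
RBM stands for register bit mask
For example, REG_EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI are 0 1 2 3 4 5 6 7 respectively
while RBM_EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI are 0x01 0x02 0x04 0x08 0x10 0x20 0x40 0x80 respectively
In addition, RAX is an alias of EAX, and likewise for the others
See register.h and target.h for details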
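The relation between the two tables above is RBM = 1 << REG. A small sketch using only the eight registers listed; the dictionaries and helper names are made up for illustration and are not the macros from register.h.

```python
REG_NAMES = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]

# REG_xxx: register number; RBM_xxx: one bit per register number.
REG = {name: number for number, name in enumerate(REG_NAMES)}
RBM = {name: 1 << number for name, number in REG.items()}


def mask_of(*names):
    """Combine several registers into one bit mask."""
    mask = 0
    for name in names:
        mask |= RBM[name]
    return mask


def regs_in(mask):
    """List the registers whose bit is set in the mask."""
    return [name for name in REG_NAMES if mask & RBM[name]]


print(hex(mask_of("EAX", "ESI")), regs_in(0x40))
```

A mask lets a whole register set be stored and tested in one integer, as in the gcrefRegs=00000040 {rsi} annotations of the JIT dumps further down.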
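What is GTF_CONTAINED
This node is contained (executed as part of its parent)
What is an Instruction Group
It can be seen as a BasicBlock that holds machine instructions
Relationship between genProduceReg and genConsumeReg
genConsumeReg
Called when the value in a register is needed
Makes sure the register the tree needs holds the required contents
If the tree is marked GTF_SPILLED, the value has to be reloaded from the stack
This function also updates
codeGen.gcInfo.gcRegByrefSetCur // set of registers currently holding a byref
codeGen.gcInfo.gcRegGCrefSetCur // set of registers currently holding a gcref
codeGen.gcInfo.gcVarPtrSetCur // set of stack variables currently holding a byref or gcref
codeGen.regSet // set of registers currently live
compiler.compCurLife // local variables currently live
genProduceReg
Called when a register has received a new value
Indicates that the tree has produced a value in a register
If the tree is marked GTF_SPILL, the value has to be stored to the stack
This function also updates
codeGen.gcInfo.gcRegByrefSetCur // set of registers currently holding a byref
codeGen.gcInfo.gcRegGCrefSetCur // set of registers currently holding a gcref
codeGen.gcInfo.gcVarPtrSetCur // set of stack variables currently holding a byref or gcref
codeGen.regSet // set of registers currently live
compiler.compCurLife // local variables currently live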
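Structure of insGroup and instrDesc
Used while building the current ig
emitCurIGfreeBase start address of the instrDesc array of the current ig; accessible as ((instrDesc*)(emitCurIGfreeBase))[0]
emitCurIGfreeNext address where the next instrDesc is added; advances by the size of each instrDesc added
emitCurIGfreeEndp end address of the instrDesc array; going past it creates the next ig
If there is not enough space (the buffer starting at emitCurIGfreeBase is used up)
emitNxtIG is called, which first calls emitSavIG and then emitNewIG
When building of an ig is finished
emitSavIG is called
Copies emitCurIGfreeBase ~ emitCurIGfreeNext to ig->igData
Stores the number of instructions in ig->igInsCnt
Resets emitCurIGfreeNext = emitCurIGfreeBase
When an ig is created
Calls emitNewIG => emitGenIG(emitAllocAndLinkIG())
The linked list that holds the igs
emitIGlist first element of the ig list
emitIGlast last element of the ig list
emitCurIG the ig currently being processed
The list can be traversed with emitIGlist->igNext
The instrDescs of an ig can be accessed through emitIGlist->igData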
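What is ReJIT
Reference: https://github.com/dotnet/coreclr/blob/master/Documentation/botr/clr-abi.md
It exists to support profiler attach
The JIT makes the first 5 bytes of every function non-interruptible and never a jump target
If ReJIT is enabled and the prolog is smaller than 5 bytes, it is padded with nops
Later, when the code has to be hot-swapped, all threads can be stopped and a jump instruction written into those 5 bytes
What is PSPSym
Reference: https://github.com/dotnet/coreclr/blob/master/Documentation/botr/clr-abi.md
When EH funclets are supported,
an EH funclet needs the main function's stack pointer value when it is called, so that it can access the main function's locals
PSPSym stands for Previous Stack Pointer Symbol; it is a pointer-sized value holding the stack address of the parent function
On x64 its value is InitialSP, the value of the stack pointer after the fixed-size portion (locals, temporaries) has been allocated, not including space allocated with alloca
On other platforms its value is CallerSP, the stack pointer value before the function was called, which includes the space previously allocated with alloca
When a funclet is called, the caller passes an Establisher Frame Pointer (e.g. in rcx on x64)
The funclet then finds the address of PSPSym through the Establisher Frame Pointer, and from PSPSym finds the earlier stack value
Finally the funclet can rebuild the frame the main function used (in the x64 example above, rbp is recomputed from the saved rsp value)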
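How to get from a PC to the function's information
First use the PC to find the corresponding pCode in the Nibble Map
A CodeHeader sits right before pCode
The CodeHeader contains pRealCodeHeader, a pointer to a _hpRealCodeHdr
pRealCodeHeader contains
phdrDebugInfo the index from PC to IL offset
Structure: see "Generation and structure of DebugInfo" below
phdrJitEHInfo the array of EH clauses
Structure: see "Structure of EHInfo" below
phdrJitGCInfo the information the GC uses to scan the stack and registers
Structure: see "Structure of GCInfo" below
phdrMDesc the MethodDesc of the function
nUnwindInfos the number of unwindInfos
unwindInfos unwind information (for unwinding the stack)
Structure: see "Structure of Unwind Info" below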
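What are u1 and u2 in the emitter
They record the state of ref values pushed on the stack
u1 is used when emitSimpleStkUsed is enabled; it stores the state in a bitmask
u2 is used when emitSimpleStkUsed is not enabled; it stores the state in an array
For example
emitArgTrackTab: [ TYPE_GC_NONE, TYPE_GC_NONE, TYPE_GC_REF, TYPE_GC_BYREF, TYPE_GC_NONE, ... ]
emitArgTrackTop: emitArgTrackTab + 5 // the next push goes into the sixth element
emitGcArgTrackCnt: 2
The format after IF
Source: emitfmtsxarch.h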
IF_XYY
X = // first operand
R - register
M - memory
S - stack
A - address mode
T - x87 st(x)
YY = // second operand
RD - read
WR - write
RW - read write
IF_CNS constant
IF_SHF - shift constant
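Where is JIT-compiled code stored
JIT-compiled code is stored in the loader heap
The flow is allocCode => allocCodeRaw => AllocMemForCode_NoThrow => UnlockedAllocMemForCode_NoThrow
The allocation is laid out as [ CodeHeader, code ]; the function header always comes before the code (machine code)
The function header itself holds only one pointer, which points to the real header (_hpRealCodeHdr)
The real header is placed after the code for dynamic methods; otherwise it is placed in the heap returned by GetLowFrequencyHeap
Which thread does the JIT compile on
Normally compilation happens on the thread that calls the function
unless MulticoreJitProfilePlayer is used to compile the function manually
How is a function guaranteed to be JIT-compiled only once
Thread locks are used
ListLock the linked list of entries kept in the appdomain
ListLockEntry each entry holds a methoddesc that is being jitted and a lock
The JIT first locks ListLock (a global lock), then gets or creates the ListLockEntry, then releases the lock on ListLock
All threads compiling the same methoddesc get the same ListLockEntry
In practice, however, taking the lock can fail, so "a function is JIT-compiled only once" is not strictly guaranteed
If locking fails, several threads run the JIT at the same time, but only one result is written to the stub; the other compilation result is wasted (memory space)
Normally, while one thread is compiling, the other threads wait on the lock, and once compilation finishes all waiting threads get the same pCode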
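A toy model of the normal (lock succeeds) path described above: a global lock guards the entry table, and each method gets its own lock so that only one thread compiles while the others wait and then pick up the same result. Only the names ListLock and ListLockEntry come from the notes; ToyJit, get_or_compile and the fake pCode string are made up for illustration and do not mirror the runtime's real API.

```python
import threading


class ListLock:
    """Global table of per-method entries, guarded by one lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._entries = {}            # method desc -> per-method lock

    def get_entry(self, method_desc):
        # Hold the global lock only while finding or creating the entry.
        with self._lock:
            entry = self._entries.get(method_desc)
            if entry is None:
                entry = threading.Lock()   # plays the role of ListLockEntry
                self._entries[method_desc] = entry
            return entry


class ToyJit:
    def __init__(self):
        self.list_lock = ListLock()
        self.code = {}                # method desc -> compiled "pCode"
        self.compile_count = 0

    def get_or_compile(self, method_desc):
        entry = self.list_lock.get_entry(method_desc)
        with entry:                   # other threads wait here
            if method_desc not in self.code:
                self.compile_count += 1
                self.code[method_desc] = "pCode(" + method_desc + ")"
            return self.code[method_desc]


def call_from_threads(jit, method_desc, thread_count):
    """Call the same method from several threads and collect the results."""
    results = []
    threads = [
        threading.Thread(
            target=lambda: results.append(jit.get_or_compile(method_desc)))
        for _ in range(thread_count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


jit = ToyJit()
results = call_from_threads(jit, "Program.Main", 8)
print(jit.compile_count, len(results))
```

The sketch does not model the failure case mentioned above, where two threads both compile and one result is thrown away.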
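What triggers JIT compilation (lazy compilation)
Every function that is lazily JIT-compiled has a fixup precode (stub)
Before JIT, the precode is a call that triggers JIT compilation
After JIT, the precode is a jmp to the JIT-compiled code
JIT compilation is triggered the *first time the function is called*
For the detailed flow see "How the JIT stub is called and replaced" below
How the JIT stub is called and replaced
Before JIT: call => Fixup Precode => Fixup Precode Chunk => The PreStub => PreStub Worker => ...
After JIT: call => Fixup Precode => Compile Result
Reference: https://github.com/dotnet/coreclr/blob/release/1.1.0/Documentation/botr/method-descriptor.md
Walkthrough in lldb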
(lldb) b CallDescrWorkerInternal
(lldb) process handle -s false SIGUSR1 SIGUSR2
(lldb) r
(lldb) p *((CallDescrData*)$rdi)
(CallDescrData) $0 = {
pSrc = 0x00007fffffffb7e8
numStackSlots = 0
pArgumentRegisters = 0x00007fffffffb780
pFloatArgumentRegisters = 0x0000000000000000
fpReturnSize = 0
pTarget = 140735275988392
returnValue = ([0] = 140737312810959, [1] = 140737488337632)
}
calldescrworkeramd64.S
-> 0x7ffff5bd95ac <CallDescrWorkerInternal+121>: ff 53 28 callq *0x28(%rbx)
(lldb) p *(intptr_t*)($rbx+0x28)
(intptr_t) $10 = 140735275787688
Fixup Precode
(lldb) di --frame --bytes
-> 0x7fff7c21f5a8: e8 2b 6c fe ff callq 0x7fff7c2061d8
0x7fff7c21f5ad: 5e popq %rsi
0x7fff7c21f5ae: 19 05 e8 23 6c fe sbbl %eax, -0x193dc18(%rip)
0x7fff7c21f5b4: ff 5e a8 lcalll *-0x58(%rsi)
0x7fff7c21f5b7: 04 e8 addb $-0x18, %al
0x7fff7c21f5b9: 1b 6c fe ff sbbl -0x1(%rsi,%rdi,8), %ebp
0x7fff7c21f5bd: 5e popq %rsi
0x7fff7c21f5be: 00 03 addb %al, (%rbx)
0x7fff7c21f5c0: e8 13 6c fe ff callq 0x7fff7c2061d8
0x7fff7c21f5c5: 5e popq %rsi
0x7fff7c21f5c6: b0 02 movb $0x2, %al
(lldb) di --frame --bytes
-> 0x7fff7c2061d8: e9 13 3f 9d 79 jmp 0x7ffff5bda0f0 ; PrecodeFixupThunk
0x7fff7c2061dd: cc int3
0x7fff7c2061de: cc int3
0x7fff7c2061df: cc int3
0x7fff7c2061e0: 49 ba 00 da d0 7b ff 7f 00 00 movabsq $0x7fff7bd0da00, %r10
0x7fff7c2061ea: 40 e9 e0 ff ff ff jmp 0x7fff7c2061d0
Fixup Precode Chunk
(lldb) di --frame --bytes
-> 0x7ffff5bda0f0 <PrecodeFixupThunk>: 58 popq %rax ; rax = 0x7fff7c21f5ad
0x7ffff5bda0f1 <PrecodeFixupThunk+1>: 4c 0f b6 50 02 movzbq 0x2(%rax), %r10 ; r10 = 0x05 (precode chunk index)
0x7ffff5bda0f6 <PrecodeFixupThunk+6>: 4c 0f b6 58 01 movzbq 0x1(%rax), %r11 ; r11 = 0x19 (methoddesc chunk index)
0x7ffff5bda0fb <PrecodeFixupThunk+11>: 4a 8b 44 d0 03 movq 0x3(%rax,%r10,8), %rax ; rax = 0x7fff7bdd5040 (methoddesc chunk)
0x7ffff5bda100 <PrecodeFixupThunk+16>: 4e 8d 14 d8 leaq (%rax,%r11,8), %r10 ; r10 = 0x7fff7bdd5108 (methoddesc)
0x7ffff5bda104 <PrecodeFixupThunk+20>: e9 37 ff ff ff jmp 0x7ffff5bda040 ; ThePreStub
(lldb) me re -s1 -fx -c 51 0x7fff7c21f5ad
0x7fff7c21f5ad: 0x5e 0x19 0x05 0xe8 0x23 0x6c 0xfe 0xff
0x7fff7c21f5b5: 0x5e 0xa8 0x04 0xe8 0x1b 0x6c 0xfe 0xff
0x7fff7c21f5bd: 0x5e 0x00 0x03 0xe8 0x13 0x6c 0xfe 0xff
0x7fff7c21f5c5: 0x5e 0xb0 0x02 0xe8 0x0b 0x6c 0xfe 0xff
0x7fff7c21f5cd: 0x5e 0x3f 0x01 0xe8 0x03 0x6c 0xfe 0xff
0x7fff7c21f5d5: 0x5e 0xb8 0x00 [0x40 0x50 0xdd 0x7b 0xff
0x7fff7c21f5dd: 0x7f 0x00 0x00]
(lldb) dumpmd 0x7fff7bdd5108
Method Name: System.AppDomain.SetupDomain(Boolean, System.String, System.String, System.String[], System.String[])
Class: 00007fff7bce1af0
MethodTable: 00007fff7cc39918
mdToken: 0000000006002DE2
Module: 00007fff7bc2a000
IsJitted: yes
CodeAddr: 00007fff7c5c7d50
Transparency: Critical
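The address arithmetic done by PrecodeFixupThunk can be replayed on the 51 bytes dumped above. The sketch below hard-codes that dump and follows the four instructions (two movzbq, one movq, one leaq); the helper names are made up for illustration.

```python
import struct

# The bytes printed by `me re -s1 -fx -c 51 0x7fff7c21f5ad` above.
DUMP_BASE = 0x7fff7c21f5ad
DUMP = bytes([
    0x5e, 0x19, 0x05, 0xe8, 0x23, 0x6c, 0xfe, 0xff,
    0x5e, 0xa8, 0x04, 0xe8, 0x1b, 0x6c, 0xfe, 0xff,
    0x5e, 0x00, 0x03, 0xe8, 0x13, 0x6c, 0xfe, 0xff,
    0x5e, 0xb0, 0x02, 0xe8, 0x0b, 0x6c, 0xfe, 0xff,
    0x5e, 0x3f, 0x01, 0xe8, 0x03, 0x6c, 0xfe, 0xff,
    0x5e, 0xb8, 0x00, 0x40, 0x50, 0xdd, 0x7b, 0xff,
    0x7f, 0x00, 0x00,
])


def read_u8(addr):
    return DUMP[addr - DUMP_BASE]


def read_u64(addr):
    return struct.unpack_from("<Q", DUMP, addr - DUMP_BASE)[0]


def method_desc_from_return_address(rax):
    """rax is the return address popped by the thunk (right after the call)."""
    precode_chunk_index = read_u8(rax + 2)        # movzbq 0x2(%rax), %r10
    method_desc_chunk_index = read_u8(rax + 1)    # movzbq 0x1(%rax), %r11
    # movq 0x3(%rax,%r10,8), %rax : pointer stored after the last precode
    chunk_base = read_u64(rax + precode_chunk_index * 8 + 3)
    # leaq (%rax,%r11,8), %r10 : method desc inside the chunk
    return chunk_base + method_desc_chunk_index * 8


print(hex(method_desc_from_return_address(0x7fff7c21f5ad)))
```

The result is the address that was passed to dumpmd above, 0x7fff7bdd5108.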
The PreStub (theprestubamd64.S)
(lldb) di --frame --bytes
-> 0x7ffff5bda040 <ThePreStub>: 55 pushq %rbp
0x7ffff5bda041 <ThePreStub+1>: 48 89 e5 movq %rsp, %rbp
0x7ffff5bda044 <ThePreStub+4>: 53 pushq %rbx
0x7ffff5bda045 <ThePreStub+5>: 41 57 pushq %r15
0x7ffff5bda047 <ThePreStub+7>: 41 56 pushq %r14
0x7ffff5bda049 <ThePreStub+9>: 41 55 pushq %r13
0x7ffff5bda04b <ThePreStub+11>: 41 54 pushq %r12
0x7ffff5bda04d <ThePreStub+13>: 41 51 pushq %r9
0x7ffff5bda04f <ThePreStub+15>: 41 50 pushq %r8
0x7ffff5bda051 <ThePreStub+17>: 51 pushq %rcx
0x7ffff5bda052 <ThePreStub+18>: 52 pushq %rdx
0x7ffff5bda053 <ThePreStub+19>: 56 pushq %rsi
0x7ffff5bda054 <ThePreStub+20>: 57 pushq %rdi
0x7ffff5bda055 <ThePreStub+21>: 48 8d a4 24 78 ff ff ff leaq -0x88(%rsp), %rsp ; allocate transition block
0x7ffff5bda05d <ThePreStub+29>: 66 0f 7f 04 24 movdqa %xmm0, (%rsp) ; fill transition block
0x7ffff5bda062 <ThePreStub+34>: 66 0f 7f 4c 24 10 movdqa %xmm1, 0x10(%rsp) ; fill transition block
0x7ffff5bda068 <ThePreStub+40>: 66 0f 7f 54 24 20 movdqa %xmm2, 0x20(%rsp) ; fill transition block
0x7ffff5bda06e <ThePreStub+46>: 66 0f 7f 5c 24 30 movdqa %xmm3, 0x30(%rsp) ; fill transition block
0x7ffff5bda074 <ThePreStub+52>: 66 0f 7f 64 24 40 movdqa %xmm4, 0x40(%rsp) ; fill transition block
0x7ffff5bda07a <ThePreStub+58>: 66 0f 7f 6c 24 50 movdqa %xmm5, 0x50(%rsp) ; fill transition block
0x7ffff5bda080 <ThePreStub+64>: 66 0f 7f 74 24 60 movdqa %xmm6, 0x60(%rsp) ; fill transition block
0x7ffff5bda086 <ThePreStub+70>: 66 0f 7f 7c 24 70 movdqa %xmm7, 0x70(%rsp) ; fill transition block
0x7ffff5bda08c <ThePreStub+76>: 48 8d bc 24 88 00 00 00 leaq 0x88(%rsp), %rdi ; arg 1 = transition block*
0x7ffff5bda094 <ThePreStub+84>: 4c 89 d6 movq %r10, %rsi ; arg 2 = methoddesc
0x7ffff5bda097 <ThePreStub+87>: e8 44 7e 11 00 callq 0x7ffff5cf1ee0 ; PreStubWorker at prestub.cpp:958
0x7ffff5bda09c <ThePreStub+92>: 66 0f 6f 04 24 movdqa (%rsp), %xmm0
0x7ffff5bda0a1 <ThePreStub+97>: 66 0f 6f 4c 24 10 movdqa 0x10(%rsp), %xmm1
0x7ffff5bda0a7 <ThePreStub+103>: 66 0f 6f 54 24 20 movdqa 0x20(%rsp), %xmm2
0x7ffff5bda0ad <ThePreStub+109>: 66 0f 6f 5c 24 30 movdqa 0x30(%rsp), %xmm3
0x7ffff5bda0b3 <ThePreStub+115>: 66 0f 6f 64 24 40 movdqa 0x40(%rsp), %xmm4
0x7ffff5bda0b9 <ThePreStub+121>: 66 0f 6f 6c 24 50 movdqa 0x50(%rsp), %xmm5
0x7ffff5bda0bf <ThePreStub+127>: 66 0f 6f 74 24 60 movdqa 0x60(%rsp), %xmm6
0x7ffff5bda0c5 <ThePreStub+133>: 66 0f 6f 7c 24 70 movdqa 0x70(%rsp), %xmm7
0x7ffff5bda0cb <ThePreStub+139>: 48 8d a4 24 88 00 00 00 leaq 0x88(%rsp), %rsp
0x7ffff5bda0d3 <ThePreStub+147>: 5f popq %rdi
0x7ffff5bda0d4 <ThePreStub+148>: 5e popq %rsi
0x7ffff5bda0d5 <ThePreStub+149>: 5a popq %rdx
0x7ffff5bda0d6 <ThePreStub+150>: 59 popq %rcx
0x7ffff5bda0d7 <ThePreStub+151>: 41 58 popq %r8
0x7ffff5bda0d9 <ThePreStub+153>: 41 59 popq %r9
0x7ffff5bda0db <ThePreStub+155>: 41 5c popq %r12
0x7ffff5bda0dd <ThePreStub+157>: 41 5d popq %r13
0x7ffff5bda0df <ThePreStub+159>: 41 5e popq %r14
0x7ffff5bda0e1 <ThePreStub+161>: 41 5f popq %r15
0x7ffff5bda0e3 <ThePreStub+163>: 5b popq %rbx
0x7ffff5bda0e4 <ThePreStub+164>: 5d popq %rbp
0x7ffff5bda0e5 <ThePreStub+165>: 48 ff e0 jmpq *%rax
%rax should be patched fixup precode = 0x7fff7c21f5a8
(%rsp) should be the return address before calling "Fixup Precode"
PreStub Worker (prestub.cpp)
957 extern "C" PCODE STDCALL PreStubWorker(TransitionBlock * pTransitionBlock, MethodDesc * pMD)
958 {
-> 959 PCODE pbRetVal = NULL;
SetupDomain has prejit code, calling flow would be,
PreStubWorker => DoPreStub => GetPreImplementedCode
frame #0: 0x00007ffff5cf3772 libcoreclr.so`MethodDesc::DoPrestub(this=0x00007fff7bdd5108, pDispatchingMT=0x0000000000000000) + 3970 at prestub.cpp:1585
1582
1583 if (pCode != NULL)
1584 {
-> 1585 if (HasPrecode())
1586 GetPrecode()->SetTargetInterlocked(pCode);
1587 else
1588 if (!HasStableEntryPoint())
frame #0: 0x00007ffff5874224 libcoreclr.so`MethodDesc::GetPrecode(this=0x00007fff7bdd5108) + 68 at method.hpp:293
290
291 PRECONDITION(HasPrecode());
292 Precode* pPrecode = Precode::GetPrecodeFromEntryPoint(GetStableEntryPoint());
-> 293 PREFIX_ASSUME(pPrecode != NULL);
294 return pPrecode;
295 }
296
(lldb) p GetSlot()
(WORD) $79 = 69
(lldb) p pPrecode
(Precode *) $76 = 0x00007fff7c21f5a8
precode type is 5f (PRECODE_FIXUP = FixupPrecode::Type)
FixupPrecode::SetTargetInterlocked will alter the assembly code here
(lldb) di --bytes -s 0x7fff7c21f5a8
0x7fff7c21f5a8: e9 a3 87 3a 00 jmp 0x7fff7c5c7d50
0x7fff7c21f5ad: 5f popq %rdi
0x7fff7c21f5ae: 19 05 e8 23 6c fe sbbl %eax, -0x193dc18(%rip)
0x7fff7c21f5b4: ff 5e a8 lcalll *-0x58(%rsi)
0x7fff7c21f5b7: 04 e8 addb $-0x18, %al
0x7fff7c21f5b9: 1b 6c fe ff sbbl -0x1(%rsi,%rdi,8), %ebp
0x7fff7c21f5bd: 5e popq %rsi
0x7fff7c21f5be: 00 03 addb %al, (%rbx)
0x7fff7c21f5c0: e8 13 6c fe ff callq 0x7fff7c2061d8
0x7fff7c21f5c5: 5e popq %rsi
0x7fff7c21f5c6: b0 02 movb $0x2, %al
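What is a TransitionBlock
Before calling into the JIT, ThePreStub allocates a TransitionBlock structure on the stack to save the register state
On x64 it is used to save xmm0 ~ xmm7; the original register values are restored after the JIT call completes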
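Generation and structure of DebugInfo
DebugInfo is generated in invokeCompileMethodHelper => CompressDebugInfo => CompressBoundariesAndVars
It comes from the following variables
CEEJitInfo::m_pOffsetMapping // of type ICorDebugInfo::OffsetMapping, holds nativeOffset and ilOffset
CEEJitInfo::m_iOffsetMapping // length of the offset mapping array
CEEJitInfo::m_pNativeVarInfo // of type ICorDebugInfo::NativeVarInfo, holds the scope (native offset range) of each local variable
CEEJitInfo::m_iNativeVarInfo // length of the native var array
Format
What is stored in phdrDebugInfo is a nibble stream: numbers are stored in units of 4 bits, where the low 3 bits of each nibble carry data and the top bit says another nibble follows; the low nibble of each byte comes first
For example, 0xa9 0xa0 0x03 represents the two numbers 80 and 19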
0xa9 = 0b1010'1001
0xa0 = 0b1010'0000
0x03 = 0b0000'0011
001 010 000 => 80
010 011 => 19
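A short decoder sketch that reproduces the worked example above. It assumes the rules the example implies: the low nibble of each byte comes first, the top bit of a nibble is a continuation flag, and the 3 payload bits are concatenated most significant group first. The function name is made up for illustration.

```python
def decode_nibble_stream(data, count):
    """Decode `count` unsigned numbers from a nibble stream."""
    nibbles = []
    for byte in data:
        nibbles.append(byte & 0x0F)   # low nibble first
        nibbles.append(byte >> 4)     # then the high nibble

    values = []
    it = iter(nibbles)
    for _ in range(count):
        value = 0
        while True:
            nibble = next(it)
            value = (value << 3) | (nibble & 0x7)   # 3 payload bits
            if not (nibble & 0x8):                  # top bit clear: last nibble
                break
        values.append(value)
    return values


print(decode_nibble_stream(bytes([0xA9, 0xA0, 0x03]), 2))
```

Small numbers, which are the common case for offset deltas, take a single nibble this way.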
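The layout is
header, two numbers: the first is the encoded length of the offset mapping (bytes), the second is the encoded length of the native vars (bytes)
offset mapping
number of offset mappings
native offset, written as the delta from the previous record
il offset
source flags, such as SOURCE_TYPE_INVALID, SEQUENCE_POINT, STACK_EMPTY
native vars
number of native vars
startOffset start offset of the scope
endOffset end offset of the scope, written as the delta from start
var number index of the variable
var type (register or stack)
The remaining fields depend on the var type; see DoNativeVarInfo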
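Structure of EHInfo
EHInfo is stored in pRealCodeHeader->phdrJitEHInfo, in the following format
phdrJitEHInfo - sizeof(size_t) = number of EH clauses
phdrJitEHInfo = contents of the CorILMethod_Sect_FatFormat structure, holding the kind and the size
phdrJitEHInfo + sizeof(CorILMethod_Sect_FatFormat) = array of EE_ILEXCEPTION_CLAUSE
For how it is generated, see the explanation of genReportEH below; here is a real example of parsing EHInfo
Source code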
var x = GetString();
try {
Console.WriteLine(x);
throw new Exception("abc");
} catch (Exception ex) {
Console.WriteLine(ex);
Console.WriteLine(x);
}
Assembly code
IN0016: 000000 push rbp
IN0017: 000001 push rbx
IN0018: 000002 sub rsp, 24
IN0019: 000006 lea rbp, [rsp+20H]
IN001a: 00000B mov qword ptr [V06 rbp-20H], rsp
G_M21556_IG02: ; offs=00000FH, size=0009H, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref
IN0001: 00000F call ConsoleApplication.Program:GetString():ref
IN0002: 000014 mov gword ptr [V01 rbp-10H], rax
G_M21556_IG03: ; offs=000018H, size=0043H, gcVars=0000000000000001 {V01}, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, gcvars, byref
IN0003: 000018 mov rdi, gword ptr [V01 rbp-10H]
IN0004: 00001C call System.Console:WriteLine(ref)
IN0005: 000021 mov rdi, 0x7F78892D3CE8
IN0006: 00002B call CORINFO_HELP_NEWSFAST
IN0007: 000030 mov rbx, rax
IN0008: 000033 mov edi, 1
IN0009: 000038 mov rsi, 0x7F78881BCE70
IN000a: 000042 call CORINFO_HELP_STRCNS
IN000b: 000047 mov rsi, rax
IN000c: 00004A mov rdi, rbx
IN000d: 00004D call System.Exception:.ctor(ref):this
IN000e: 000052 mov rdi, rbx
IN000f: 000055 call CORINFO_HELP_THROW
IN0010: 00005A int3
G_M21556_IG04: ; offs=00005BH, size=0007H, gcVars=0000000000000000 {}, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, gcvars, byref, epilog, nogc
IN001b: 00005B lea rsp, [rbp-08H]
IN001c: 00005F pop rbx
IN001d: 000060 pop rbp
IN001e: 000061 ret
G_M21556_IG05: ; func=01, offs=000062H, size=000EH, gcrefRegs=00000040 {rsi}, byrefRegs=00000000 {}, byref, funclet prolog, nogc
IN001f: 000062 push rbp
IN0020: 000063 push rbx
IN0021: 000064 push rax
IN0022: 000065 mov rbp, qword ptr [rdi]
IN0023: 000068 mov qword ptr [rsp], rbp
IN0024: 00006C lea rbp, [rbp+20H]
G_M21556_IG06: ; offs=000070H, size=0018H, gcVars=0000000000000001 {V01}, gcrefRegs=00000040 {rsi}, byrefRegs=00000000 {}, gcvars, byref, isz
IN0011: 000070 mov rdi, rsi
IN0012: 000073 call System.Console:WriteLine(ref)
IN0013: 000078 mov rdi, gword ptr [V01 rbp-10H]
IN0014: 00007C call System.Console:WriteLine(ref)
IN0015: 000081 lea rax, G_M21556_IG04
G_M21556_IG07: ; offs=000088H, size=0007H, funclet epilog, nogc, emitadd
IN0025: 000088 add rsp, 8
IN0026: 00008C pop rbx
IN0027: 00008D pop rbp
IN0028: 00008E ret
LLDB commands
(lldb) p *codePtr
(void *) $1 = 0x00007fff7ceef920
(lldb) p *(CodeHeader*)(0x00007fff7ceef920-8)
(CodeHeader) $2 = {
pRealCodeHeader = 0x00007fff7cf35c78
}
(lldb) p *(_hpRealCodeHdr*)(0x00007fff7cf35c78)
(_hpRealCodeHdr) $3 = {
phdrDebugInfo = 0x0000000000000000
phdrJitEHInfo = 0x00007fff7cf35ce0
phdrJitGCInfo = 0x0000000000000000
phdrMDesc = 0x00007fff7baf9200
nUnwindInfos = 2
unwindInfos = {}
}
(lldb) me re -s8 -c20 -fx 0x00007fff7cf35ce0-8
0x7fff7cf35cd8: 0x0000000000000001 0x0000000000002040
0x7fff7cf35ce8: 0x0000001800000000 0x000000620000005b
0x7fff7cf35cf8: 0x000000000000008f 0x000000000100000e
0x7fff7cf35d08: 0x0000000000000030 0x0000000000000001
0x7fff7cf35d18: 0x00007ffff628f550 0x0000000000000b4a
0x7fff7cf35d28: 0x0000000000000000 0x0000000000000000
0x7fff7cf35d38: 0x0000000000000000 0x0000000000000000
0x7fff7cf35d48: 0x0000000000000000 0x0000000000000000
0x7fff7cf35d58: 0x0000000000000000 0x0000000000000000
0x7fff7cf35d68: 0x0000000000000000 0x0000000000000000
Parsing the contents
0x0000000000000001:
phdrJitEHInfo - sizeof(size_t) is num clauses, here is 1
0x0000000000002040:
member from base class IMAGE_COR_ILMETHOD_SECT_FAT
Kind = 0x40 = CorILMethod_Sect_FatFormat
DataSize = 0x20 = 32 = 1 * sizeof(EE_ILEXCEPTION_CLAUSE)
(lldb) p ((EE_ILEXCEPTION_CLAUSE*)(0x00007fff7cf35ce0+8))[0]
(EE_ILEXCEPTION_CLAUSE) $29 = {
Flags = COR_ILEXCEPTION_CLAUSE_NONE
TryStartPC = 24
TryEndPC = 91
HandlerStartPC = 98
HandlerEndPC = 143
= (TypeHandle = 0x000000000100000e, ClassToken = 16777230, FilterOffset = 16777230)
}
(lldb) sos Token2EE * 0x000000000100000e
Module: 00007fff7bc04000
Assembly: System.Private.CoreLib.ni.dll
<invalid module token>
--------------------------------------
Module: 00007fff7baf6e70
Assembly: coreapp_jit.dll
Token: 000000000100000E
MethodTable: 00007fff7cc0dce8
EEClass: 00007fff7bcb9400
Name: mdToken: 0100000e (/home/ubuntu/git/coreapp_jitnew/bin/Release/netcoreapp1.1/ubuntu.16.04-x64/publish/coreapp_jit.dll)
(lldb) dumpmt 00007fff7cc0dce8
EEClass: 00007FFF7BCB9400
Module: 00007FFF7BC04000
Name: System.Exception
mdToken: 0000000002000249
File: /home/ubuntu/git/coreapp_jitnew/bin/Release/netcoreapp1.1/ubuntu.16.04-x64/publish/System.Private.CoreLib.ni.dll
BaseSize: 0x98
ComponentSize: 0x0
Slots in VTable: 51
Number of IFaces in IFaceMap: 2
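The first six 8-byte words of the EHInfo memory dump above can be unpacked as follows. The sketch assumes the little-endian x64 layout that the lldb output shows: a clause count, then a 4-byte section header (Kind in the low 8 bits, DataSize in the upper 24) padded to 8 bytes, then 32-byte clauses with 4 bytes of padding before the pointer-sized union. The function name and dictionary keys are made up for illustration.

```python
import struct

# First six words printed by `me re -s8 -c20 -fx 0x00007fff7cf35ce0-8` above.
WORDS = [
    0x0000000000000001, 0x0000000000002040,
    0x0000001800000000, 0x000000620000005b,
    0x000000000000008f, 0x000000000100000e,
]
RAW = b"".join(struct.pack("<Q", w) for w in WORDS)


def parse_eh_info(raw):
    (num_clauses,) = struct.unpack_from("<Q", raw, 0)   # phdrJitEHInfo - 8
    (section,) = struct.unpack_from("<I", raw, 8)       # phdrJitEHInfo
    clauses = []
    offset = 16                                         # phdrJitEHInfo + 8
    for _ in range(num_clauses):
        flags, try_start, try_end, handler_start, handler_end, type_handle = \
            struct.unpack_from("<IIIII4xQ", raw, offset)
        clauses.append({
            "Flags": flags,
            "TryStartPC": try_start,
            "TryEndPC": try_end,
            "HandlerStartPC": handler_start,
            "HandlerEndPC": handler_end,
            "TypeHandle": type_handle,
        })
        offset += 32
    return {
        "num_clauses": num_clauses,
        "Kind": section & 0xFF,
        "DataSize": section >> 8,
        "clauses": clauses,
    }


info = parse_eh_info(RAW)
print(info["Kind"], info["DataSize"], info["clauses"][0])
```

The values match the lldb print of EE_ILEXCEPTION_CLAUSE above: the try region 24..91 is IG03 in the assembly listing, and the handler 98..143 is the funclet.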
Structure of GCInfo
GCInfo is stored in pRealCodeHeader->phdrJitGCInfo and is a bit array
For how it is generated, see the explanation of genCreateAndStoreGCInfo below; here is a real example of parsing GCInfo
Source code
var x = GetString();
try {
Console.WriteLine(x);
throw new Exception("abc");
} catch (Exception ex) {
Console.WriteLine(ex);
Console.WriteLine(x);
}
Assembly code
IN0016: 000000 push rbp
IN0017: 000001 push rbx
IN0018: 000002 sub rsp, 24
IN0019: 000006 lea rbp, [rsp+20H]
IN001a: 00000B mov qword ptr [V06 rbp-20H], rsp
G_M21556_IG02: ; offs=00000FH, size=0009H, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref
IN0001: 00000F call ConsoleApplication.Program:GetString():ref
IN0002: 000014 mov gword ptr [V01 rbp-10H], rax
G_M21556_IG03: ; offs=000018H, size=0043H, gcVars=0000000000000001 {V01}, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, gcvars, byref
IN0003: 000018 mov rdi, gword ptr [V01 rbp-10H]
IN0004: 00001C call System.Console:WriteLine(ref)
IN0005: 000021 mov rdi, 0x7F78892D3CE8
IN0006: 00002B call CORINFO_HELP_NEWSFAST
IN0007: 000030 mov rbx, rax
IN0008: 000033 mov edi, 1
IN0009: 000038 mov rsi, 0x7F78881BCE70
IN000a: 000042 call CORINFO_HELP_STRCNS
IN000b: 000047 mov rsi, rax
IN000c: 00004A mov rdi, rbx
IN000d: 00004D call System.Exception:.ctor(ref):this
IN000e: 000052 mov rdi, rbx
IN000f: 000055 call CORINFO_HELP_THROW
IN0010: 00005A int3
G_M21556_IG04: ; offs=00005BH, size=0007H, gcVars=0000000000000000 {}, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, gcvars, byref, epilog, nogc
IN001b: 00005B lea rsp, [rbp-08H]
IN001c: 00005F pop rbx
IN001d: 000060 pop rbp
IN001e: 000061 ret
G_M21556_IG05: ; func=01, offs=000062H, size=000EH, gcrefRegs=00000040 {rsi}, byrefRegs=00000000 {}, byref, funclet prolog, nogc
IN001f: 000062 push rbp
IN0020: 000063 push rbx
IN0021: 000064 push rax
IN0022: 000065 mov rbp, qword ptr [rdi]
IN0023: 000068 mov qword ptr [rsp], rbp
IN0024: 00006C lea rbp, [rbp+20H]
G_M21556_IG06: ; offs=000070H, size=0018H, gcVars=0000000000000001 {V01}, gcrefRegs=00000040 {rsi}, byrefRegs=00000000 {}, gcvars, byref, isz
IN0011: 000070 mov rdi, rsi
IN0012: 000073 call System.Console:WriteLine(ref)
IN0013: 000078 mov rdi, gword ptr [V01 rbp-10H]
IN0014: 00007C call System.Console:WriteLine(ref)
IN0015: 000081 lea rax, G_M21556_IG04
G_M21556_IG07: ; offs=000088H, size=0007H, funclet epilog, nogc, emitadd
IN0025: 000088 add rsp, 8
IN0026: 00008C pop rbx
IN0027: 00008D pop rbp
IN0028: 00008E ret
LLDB commands
(lldb) p *codePtr
(void *) $1 = 0x00007fff7cee3920
(lldb) p *(CodeHeader*)(0x00007fff7cee3920-8)
(CodeHeader) $2 = {
pRealCodeHeader = 0x00007fff7cf29c78
}
(lldb) p *(_hpRealCodeHdr*)(0x00007fff7cf29c78)
(_hpRealCodeHdr) $3 = {
phdrDebugInfo = 0x0000000000000000
phdrJitEHInfo = 0x00007fff7cf29ce0
phdrJitGCInfo = 0x00007fff7cf29d28 "\x91\x81G"
phdrMDesc = 0x00007fff7baed200
nUnwindInfos = 2
unwindInfos = {}
}
(lldb) me re -s8 -c20 -fx 0x00007fff7cf29d28
0x7fff7cf29d28: 0x1963d80000478191 0x171f412003325ca8
0x7fff7cf29d38: 0xee92864c5ffe0280 0x1c5c1c1f09bea536
0x7fff7cf29d48: 0xed8a93e5c6872932 0x00000000000000c4
0x7fff7cf29d58: 0x000000000000002a 0x0000000000000001
0x7fff7cf29d68: 0x00007ffff628f550 0x0000000000000b2e
0x7fff7cf29d78: 0x0000000000000000 0x0000000000000000
0x7fff7cf29d88: 0x0000000000000000 0x0000000000000000
0x7fff7cf29d98: 0x0000000000000000 0x0000000000000000
0x7fff7cf29da8: 0x0000000000000000 0x0000000000000000
0x7fff7cf29db8: 0x0000000000000000 0x0000000000000000
Contents of the bit array
10001001
1: use fat encoding
0: no var arg
0: no security object
0: no gc cookie
1: have pspsym stack slot
0 0: no generic context parameter
1: have stack base register
1000000
1: wants report only leaf
0: no edit and continue preserved area
0: no reverse pinvoke frame
0 0 0 0: return kind is RT_Scalar
1'11100010
0 10001111: code length is 143
0000000
0 000000: pspsym stack slot is 0
0'0000000
0 000: stack base register is rbp (rbp is 5, normalize function will ^5 so it's 0)
0 000: size of stack outgoing and scratch area is 0
0'000110
0 00: 0 call sites
1 0 0 1: 2 interruptible ranges
11'11000
0 001111: interruptible range 1 begins from 15
110'10011000'000
1 001011 0 000001: interruptible range 1 finished at 91 (15 + 75 + 1)
10101'00
0 010101: interruptible range 2 begins from 112 (91 + 21)
111010'01001100
0 010111: interruptible range 2 finished at 136 (112 + 23 + 1)
1: have register slots
1 00 0 01: 4 register slots
110000
1: have stack slots
0 01: 1 tracked stack slots
0 0: 0 untracked stack slots
00'0000010
0 000: register slot 1 is rax(0)
00: register slot 1 flag is GC_SLOT_IS_REGISTER(8 & 0b11 = 0)
0 10: register slot 2 is rbx(3) (0 + 2 + 1)
0'10000
0 10: register slot 3 is rsi(6) (3 + 2 + 1)
0 00: register slot 4 is rdi(7) (6 + 0 + 1)
010'11111000
01: stack slot 1 base on GC_FRAMEREG_REL(2)
0 111110: stack slot 1 offset is -16 (-16 / 8 = -2)
00: stack slot 1 flag is GC_SLOT_BASE(0)
111 01000
111: num bits per pointer is 7
00000001
0 0000001: chunk 0's bit offset is 0 (1-1)
01000000: chunk 1's bit offset is 63 (64-1)
011111
011111: chunk 0 could be live slot list, simple format, all could live
11'111
11111: chunk 0 final state, all slot lives
1 1010'00
1 000101: transition of register slot 1(rax) at 0x14 (20 = 15 + 5), becomes live
110010'01100001
1 001001: transition of register slot 1(rax) at 0x18 (24 = 15 + 9), becomes dead
1 100001: transition of register slot 1(rax) at 0x30 (48 = 15 + 33), becomes live
01001001
0: terminator, no more transition of register slot 1(rax) in this chunk
1 100100: transition of register slot 2(rbx) at 0x33 (51 = 15 + 36), becomes live
01110111
0: terminator, no more transition of register slot 2(rbx) in this chunk
1 111110: transition of register slot 3(rsi) at 0x4d (77 = 15 + 62), becomes live
01101100
0: terminator, no more transition of register slot 3(rsi) in this chunk
1 001101: transition of register slot 4(rdi) at 0x1c (28 = 15 + 13), becomes live
1010010
1 010010: transition of register slot 4(rdi) at 0x21 (33 = 15 + 18), becomes dead
1'0111110
1 111110: transition of register slot 4(rdi) at 0x4d (77 = 15 + 62), becomes live
0: terminator, no more transition of register slot 4(rdi) in this chunk
1'1001000
1 001001: transition of stack slot 1(rbp-16) at 0x18 (24 = 15 + 9), becomes live
0: terminator, no more transition of stack slot 1(rbp-16) in this chunk
0'11111
0 11111: chunk 1 could be live slot list, simple format, all could live
000'00
00000: chunk 1 final state, all slot dead
111000'00
1 000011: transition of register slot 1(rax) at 0x52 (15 + 64 + 3), becomes dead
0: terminator, no more transition of register slot 1(rax) in this chunk
111010'00
1 001011: transition of register slot 2(rbx) at 0x5a (15 + 64 + 11), becomes dead
0: terminator, no more transition of register slot 2(rbx) in this chunk
111000'01001100
1 000011: transition of register slot 3(rsi) at 0x52 (15 + 64 + 3), becomes dead
1 001100: transition of register slot 3(rsi) at 0x70 (0x70 + (64+12 - (0x5b-0xf))), becomes live
10010100
1 010100: transition of register slot 3(rsi) at 0x78 (0x70 + (64+20 - (0x5b-0xf))), becomes dead
0: terminator, no more transition of register slot 3(rsi) in this chunk
1110000
1 000011: transition of register slot 4(rdi) at 0x52 (15 + 64 + 3), becomes dead
1'011000
1 000110: transition of register slot 4(rdi) at 0x55 (15 + 64 + 6), becomes live
11'10100
1 001011: transition of register slot 4(rdi) at 0x5a (15 + 64 + 11), becomes dead
111'1100
1 001111: transition of register slot 4(rdi) at 0x73 (0x70 + (64+15 - (0x5b-0xf))), becomes live
1001'010
1 010100: transition of register slot 4(rdi) at 0x78 (0x70 + (64+20 - (0x5b-0xf))), becomes dead
10001'10
1 011000: transition of register slot 4(rdi) at 0x7c (0x70 + (64+24 - (0x5b-0xf))), becomes live
110111'00
1 011101: transition of register slot 4(rdi) at 0x81 (0x70 + (64+29 - (0x5b-0xf))), becomes dead
0: terminator, no more transition of register slot 4(rdi) in this chunk
100011'00
1 011000: transition of stack slot 1(rbp-16) at 0x7c (0x70 + (64+24 - (0x5b-0xf))), becomes dead
0: terminator, no more transition of stack slot 1(rbp-16) in this chunk
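The offsets in the transition records above are counted over interruptible code only, split into chunks of 64 offsets, which is why the second chunk needs the `0x70 + (64+n - (0x5b-0xf))` arithmetic. The sketch below reproduces that mapping for the two interruptible ranges of this example. The struct and function names are hypothetical and do not come from the CoreCLR sources, and the chunk size of 64 is taken from the decoded notes above.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Half-open interruptible range [start, stop) in real code offsets.
struct InterruptibleRange {
    uint32_t start;
    uint32_t stop;
};

// Map an offset counted over interruptible code only back to a real code offset.
// Returns UINT32_MAX when the offset lies past the last range.
uint32_t denormalizeOffset(const std::vector<InterruptibleRange>& ranges,
                           uint32_t interruptibleOffset) {
    for (const InterruptibleRange& r : ranges) {
        uint32_t length = r.stop - r.start;
        if (interruptibleOffset < length) {
            return r.start + interruptibleOffset;
        }
        interruptibleOffset -= length; // skip this range and continue in the next one
    }
    return UINT32_MAX;
}

// A transition is stored as (chunk index, delta inside the chunk), 64 offsets per chunk.
uint32_t transitionCodeOffset(const std::vector<InterruptibleRange>& ranges,
                              uint32_t chunk, uint32_t delta) {
    return denormalizeOffset(ranges, chunk * 64 + delta);
}
```

With the ranges [15, 91) and [112, 136) decoded above, chunk 1 with delta 12 lands on 0x70, the first offset of the second range, which matches the hand calculation in the notes.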
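Structure of Unwind Info
Unwind Info is stored in pRealCodeHeader->nUnwindInfos and pRealCodeHeader->unwindInfos
pRealCodeHeader->unwindInfos is an array of RUNTIME_FUNCTION with pRealCodeHeader->nUnwindInfos elements
The element count equals the main function plus the number of funclets
Each RUNTIME_FUNCTION refers (through UnwindData) to an UNWIND_INFO, whose UnwindCode array records what the function does to the stack pointer
The following analyzes a real example
Source code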
var x = GetString();
try {
Console.WriteLine(x);
throw new Exception("abc");
} catch (Exception ex) {
Console.WriteLine(ex);
Console.WriteLine(x);
} finally {
Console.WriteLine("finally");
}
Assembly code
G_M21556_IG01: ; func=00, offs=000000H, size=000FH, gcVars=0000000000000000 {}, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, gcvars, byref, nogc <-- Prolog IG
IN001e: 000000 push rbp
IN001f: 000001 push rbx
IN0020: 000002 sub rsp, 24
IN0021: 000006 lea rbp, [rsp+20H]
IN0022: 00000B mov qword ptr [V06 rbp-20H], rsp
G_M21556_IG02: ; offs=00000FH, size=0009H, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref
IN0001: 00000F call ConsoleApplication.Program:GetString():ref
IN0002: 000014 mov gword ptr [V01 rbp-10H], rax
G_M21556_IG03: ; offs=000018H, size=0043H, gcVars=0000000000000001 {V01}, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, gcvars, byref
IN0003: 000018 mov rdi, gword ptr [V01 rbp-10H]
IN0004: 00001C call System.Console:WriteLine(ref)
IN0005: 000021 mov rdi, 0x7F94DDF9CCE8
IN0006: 00002B call CORINFO_HELP_NEWSFAST
IN0007: 000030 mov rbx, rax
IN0008: 000033 mov edi, 1
IN0009: 000038 mov rsi, 0x7F94DCE85E70
IN000a: 000042 call CORINFO_HELP_STRCNS
IN000b: 000047 mov rsi, rax
IN000c: 00004A mov rdi, rbx
IN000d: 00004D call System.Exception:.ctor(ref):this
IN000e: 000052 mov rdi, rbx
IN000f: 000055 call CORINFO_HELP_THROW
IN0010: 00005A int3
G_M21556_IG04: ; offs=00005BH, size=0001H, gcVars=0000000000000000 {}, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, gcvars, byref
IN0011: 00005B nop
G_M21556_IG05: ; offs=00005CH, size=0008H, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref
IN0012: 00005C mov rdi, rsp
IN0013: 00005F call G_M21556_IG11
G_M21556_IG06: ; offs=000064H, size=0001H, nogc, emitadd
IN0014: 000064 nop
G_M21556_IG07: ; offs=000065H, size=0007H, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref, epilog, nogc
IN0023: 000065 lea rsp, [rbp-08H]
IN0024: 000069 pop rbx
IN0025: 00006A pop rbp
IN0026: 00006B ret
G_M21556_IG08: ; func=01, offs=00006CH, size=000EH, gcVars=0000000000000001 {V01}, gcrefRegs=00000040 {rsi}, byrefRegs=00000000 {}, gcvars, byref, funclet prolog, nogc
IN0027: 00006C push rbp
IN0028: 00006D push rbx
IN0029: 00006E push rax
IN002a: 00006F mov rbp, qword ptr [rdi]
IN002b: 000072 mov qword ptr [rsp], rbp
IN002c: 000076 lea rbp, [rbp+20H]
G_M21556_IG09: ; offs=00007AH, size=0018H, gcVars=0000000000000001 {V01}, gcrefRegs=00000040 {rsi}, byrefRegs=00000000 {}, gcvars, byref, isz
IN0015: 00007A mov rdi, rsi
IN0016: 00007D call System.Console:WriteLine(ref)
IN0017: 000082 mov rdi, gword ptr [V01 rbp-10H]
IN0018: 000086 call System.Console:WriteLine(ref)
IN0019: 00008B lea rax, G_M21556_IG04
G_M21556_IG10: ; offs=000092H, size=0007H, funclet epilog, nogc, emitadd
IN002d: 000092 add rsp, 8
IN002e: 000096 pop rbx
IN002f: 000097 pop rbp
IN0030: 000098 ret
G_M21556_IG11: ; func=02, offs=000099H, size=000EH, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref, funclet prolog, nogc
IN0031: 000099 push rbp
IN0032: 00009A push rbx
IN0033: 00009B push rax
IN0034: 00009C mov rbp, qword ptr [rdi]
IN0035: 00009F mov qword ptr [rsp], rbp
IN0036: 0000A3 lea rbp, [rbp+20H]
G_M21556_IG12: ; offs=0000A7H, size=0013H, gcVars=0000000000000000 {}, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, gcvars, byref
IN001a: 0000A7 mov rdi, 0x7F94C8001068
IN001b: 0000B1 mov rdi, gword ptr [rdi]
IN001c: 0000B4 call System.Console:WriteLine(ref)
IN001d: 0000B9 nop
G_M21556_IG13: ; offs=0000BAH, size=0007H, funclet epilog, nogc, emitadd
IN0037: 0000BA add rsp, 8
IN0038: 0000BE pop rbx
IN0039: 0000BF pop rbp
IN003a: 0000C0 ret
LLDB commands
(lldb) p *codePtr
(void *) $0 = 0x00007fff7ceee920
(lldb) p *(CodeHeader*)(0x00007fff7ceee920-8)
(CodeHeader) $1 = {
pRealCodeHeader = 0x00007fff7cf34c78
}
(lldb) p *(_hpRealCodeHdr*)(0x00007fff7cf34c78)
(_hpRealCodeHdr) $2 = {
phdrDebugInfo = 0x0000000000000000
phdrJitEHInfo = 0x0000000000000000
phdrJitGCInfo = 0x0000000000000000
phdrMDesc = 0x00007fff7baf8200
nUnwindInfos = 3
unwindInfos = {}
}
(lldb) p ((_hpRealCodeHdr*)(0x00007fff7cf34c78))->unwindInfos[0]
(RUNTIME_FUNCTION) $3 = (BeginAddress = 2304, EndAddress = 2412, UnwindData = 2500)
(lldb) p ((_hpRealCodeHdr*)(0x00007fff7cf34c78))->unwindInfos[1]
(RUNTIME_FUNCTION) $4 = (BeginAddress = 2412, EndAddress = 2457, UnwindData = 2516)
(lldb) p ((_hpRealCodeHdr*)(0x00007fff7cf34c78))->unwindInfos[2]
(RUNTIME_FUNCTION) $5 = (BeginAddress = 2457, EndAddress = 2497, UnwindData = 2532)
first unwind info:
(lldb) p (void*)(((CEEJitInfo*)compiler->info.compCompHnd)->m_moduleBase + 2304)
(void *) $13 = 0x00007fff7ceee920
(lldb) p (void*)(((CEEJitInfo*)compiler->info.compCompHnd)->m_moduleBase + 2412)
(void *) $14 = 0x00007fff7ceee98c
# range is [0, 0x6c)
(lldb) p *(UNWIND_INFO*)(((CEEJitInfo*)compiler->info.compCompHnd)->m_moduleBase + 2500)
(UNWIND_INFO) $16 = {
Version = '\x01'
Flags = '\x03'
SizeOfProlog = '\x06'
CountOfUnwindCodes = '\x03'
FrameRegister = '\0'
FrameOffset = '\0'
UnwindCode = {
[0] = {
= (CodeOffset = '\x06', UnwindOp = '\x02', OpInfo = '\x02')
EpilogueCode = (OffsetLow = '\x06', UnwindOp = '\x02', OffsetHigh = '\x02')
FrameOffset = 8710
}
}
}
(lldb) p ((UNWIND_INFO*)(((CEEJitInfo*)compiler->info.compCompHnd)->m_moduleBase + 2500))->UnwindCode[0]
(UNWIND_CODE) $17 = {
= (CodeOffset = '\x06', UnwindOp = '\x02', OpInfo = '\x02')
EpilogueCode = (OffsetLow = '\x06', UnwindOp = '\x02', OffsetHigh = '\x02')
FrameOffset = 8710
}
(lldb) p ((UNWIND_INFO*)(((CEEJitInfo*)compiler->info.compCompHnd)->m_moduleBase + 2500))->UnwindCode[1]
(UNWIND_CODE) $18 = {
= (CodeOffset = '\x02', UnwindOp = '\0', OpInfo = '\x03')
EpilogueCode = (OffsetLow = '\x02', UnwindOp = '\0', OffsetHigh = '\x03')
FrameOffset = 12290
}
(lldb) p ((UNWIND_INFO*)(((CEEJitInfo*)compiler->info.compCompHnd)->m_moduleBase + 2500))->UnwindCode[2]
(UNWIND_CODE) $19 = {
= (CodeOffset = '\x01', UnwindOp = '\0', OpInfo = '\x05')
EpilogueCode = (OffsetLow = '\x01', UnwindOp = '\0', OffsetHigh = '\x05')
FrameOffset = 20481
}
Debug output generated with COMPlus_JitDump
Unwind Info:
>> Start offset : 0x000000 (not in unwind data)
>> End offset : 0x00006c (not in unwind data)
Version : 1
Flags : 0x00
SizeOfProlog : 0x06
CountOfUnwindCodes: 3
FrameRegister : none (0)
FrameOffset : N/A (no FrameRegister) (Value=0)
UnwindCodes :
CodeOffset: 0x06 UnwindOp: UWOP_ALLOC_SMALL (2) OpInfo: 2 * 8 + 8 = 24 = 0x18
CodeOffset: 0x02 UnwindOp: UWOP_PUSH_NONVOL (0) OpInfo: rbx (3)
CodeOffset: 0x01 UnwindOp: UWOP_PUSH_NONVOL (0) OpInfo: rbp (5)
allocUnwindInfo(pHotCode=0x00007F94DE27E920, pColdCode=0x0000000000000000, startOffset=0x0, endOffset=0x6c, unwindSize=0xa, pUnwindBlock=0x0000000002029516, funKind=0 (main function))
Unwind Info:
>> Start offset : 0x00006c (not in unwind data)
>> End offset : 0x000099 (not in unwind data)
Version : 1
Flags : 0x00
SizeOfProlog : 0x03
CountOfUnwindCodes: 3
FrameRegister : none (0)
FrameOffset : N/A (no FrameRegister) (Value=0)
UnwindCodes :
CodeOffset: 0x03 UnwindOp: UWOP_ALLOC_SMALL (2) OpInfo: 0 * 8 + 8 = 8 = 0x08
CodeOffset: 0x02 UnwindOp: UWOP_PUSH_NONVOL (0) OpInfo: rbx (3)
CodeOffset: 0x01 UnwindOp: UWOP_PUSH_NONVOL (0) OpInfo: rbp (5)
allocUnwindInfo(pHotCode=0x00007F94DE27E920, pColdCode=0x0000000000000000, startOffset=0x6c, endOffset=0x99, unwindSize=0xa, pUnwindBlock=0x0000000002029756, funKind=1 (handler))
Unwind Info:
>> Start offset : 0x000099 (not in unwind data)
>> End offset : 0x0000c1 (not in unwind data)
Version : 1
Flags : 0x00
SizeOfProlog : 0x03
CountOfUnwindCodes: 3
FrameRegister : none (0)
FrameOffset : N/A (no FrameRegister) (Value=0)
UnwindCodes :
CodeOffset: 0x03 UnwindOp: UWOP_ALLOC_SMALL (2) OpInfo: 0 * 8 + 8 = 8 = 0x08
CodeOffset: 0x02 UnwindOp: UWOP_PUSH_NONVOL (0) OpInfo: rbx (3)
CodeOffset: 0x01 UnwindOp: UWOP_PUSH_NONVOL (0) OpInfo: rbp (5)
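Take the first RUNTIME_FUNCTION (the main function) as an example
It contains 3 UnwindCodes, which record, respectively
push rbp
push rbx
sub rsp, 24
At run time: use the current pc to find the top of the current frame => get the return address => use the return address to find the top of the previous frame => repeat
This yields the call chain and the frame top of every caller; the process is called stack walking (or stack crawling)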
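To make the unwind codes concrete, the sketch below decodes the 16-bit values that lldb printed as FrameOffset (8710, 12290, 20481) into CodeOffset, UnwindOp and OpInfo, and adds up how many bytes the prolog took from rsp. Only the two operations that appear in this example are handled. The helper names are hypothetical; the bit layout (low byte is the code offset, then a 4-bit op and a 4-bit op info) is the one visible in the lldb output above.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct DecodedUnwindCode {
    uint8_t codeOffset; // offset just past the prolog instruction
    uint8_t unwindOp;   // low 4 bits of the second byte
    uint8_t opInfo;     // high 4 bits of the second byte
};

enum : uint8_t {
    kPushNonvol = 0, // UWOP_PUSH_NONVOL: one 8-byte push
    kAllocSmall = 2  // UWOP_ALLOC_SMALL: OpInfo * 8 + 8 bytes
};

DecodedUnwindCode decodeUnwindCode(uint16_t raw) {
    DecodedUnwindCode c;
    c.codeOffset = static_cast<uint8_t>(raw & 0xFF);
    c.unwindOp = static_cast<uint8_t>((raw >> 8) & 0x0F);
    c.opInfo = static_cast<uint8_t>((raw >> 12) & 0x0F);
    return c;
}

// Total bytes the prolog subtracted from rsp, or -1 for an op this sketch does not know.
// After the prolog has run, the return address is found at rsp + this value.
int prologStackBytes(const std::vector<uint16_t>& rawCodes) {
    int total = 0;
    for (uint16_t raw : rawCodes) {
        DecodedUnwindCode c = decodeUnwindCode(raw);
        if (c.unwindOp == kPushNonvol) {
            total += 8;
        } else if (c.unwindOp == kAllocSmall) {
            total += c.opInfo * 8 + 8;
        } else {
            return -1;
        }
    }
    return total;
}
```

For the main function this gives 24 + 8 + 8 = 40 bytes, so a stack walker standing after the prolog would read the return address at rsp + 40 and continue with the caller's frame.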
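Generating GCInfo
GCInfo is generated in genCreateAndStoreGCInfo and then stored in pRealCodeHeader->phdrJitGCInfo
Generation uses the GcInfoEncoder class; its code is in the gcinfo folder
Generating EHInfo
EHInfo is generated in genReportEH and then stored in pRealCodeHeader->phdrJitEHInfo
Generating Unwind Info
Unwind Info is generated in unwindEmit and then stored in pRealCodeHeader->nUnwindInfos and pRealCodeHeader->unwindInfos[]
Handling the PersonalityRoutine
The PersonalityRoutine is code stored in the heapList (code block) that handles exceptions
By default it is code that jmps to ProcessCLRException
Inspecting with lldb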
(lldb) b GetCLRPersonalityRoutineValue
(lldb) finish
(lldb) p *pPersonalityRoutine
(ULONG) $3 = 62
(lldb) p (_HeapList*)baseAddress
(_HeapList *) $5 = 0x00007fff7ce93020
(lldb) p *(_HeapList*)baseAddress
(_HeapList) $6 = {
hpNext = 0x0000000000000000
pHeap = 0x00000000006e1710
startAddress = 140735289045104
endAddress = 140735289046236
mapBase = 140735289044992
pHdrMap = 0x00007fff7bae7090
maxCodeHeapSize = 262032
cBlocks = 2
bFull = false
bFullForJumpStubs = false
CLRPersonalityRoutine = ([0] = 'H', [1] = '\xb8', [2] = '\x10', [3] = 'r', [4] = '\xbb', [5] = '\xf5', [6] = '\xff', [7] = '\x7f', [8] = '\0', [9] = '\0', [10] = '\xff', [11] = '\xe0')
}
(lldb) di -s (char*)((_HeapList*)baseAddress)->CLRPersonalityRoutine
0x7fff7ce9305e: movabsq $0x7ffff5bb7210, %rax
0x7fff7ce93068: jmpq *%rax
0x7fff7ce9306a: addb %al, (%rax)
0x7fff7ce9306c: addb %al, (%rax)
0x7fff7ce9306e: addb %al, (%rax)
0x7fff7ce93070: addb (%rax), %al
0x7fff7ce93072: addb %al, (%rax)
0x7fff7ce93074: addb %al, (%rax)
0x7fff7ce93076: addb %al, (%rax)
0x7fff7ce93078: callq 0x7ffff5bda0f0 ; PrecodeFixupThunk
0x7fff7ce9307d: popq %rsi
(lldb) di -s 0x7ffff5bb7210
0x7ffff5bb7210 <ProcessCLRException>: pushq %rbp
0x7ffff5bb7211 <ProcessCLRException+1>: movq %rsp, %rbp
0x7ffff5bb7214 <ProcessCLRException+4>: subq $0x340, %rsp
0x7ffff5bb721b <ProcessCLRException+11>: movq %fs:0x28, %rax
0x7ffff5bb7224 <ProcessCLRException+20>: movq %rax, -0x8(%rbp)
0x7ffff5bb7228 <ProcessCLRException+24>: movq %rdi, -0x1c8(%rbp)
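The 12 bytes of CLRPersonalityRoutine shown above are a `mov rax, imm64` followed by `jmp rax`. As a small illustration, the sketch below pulls the jump target out of such a thunk. The function name is hypothetical, and it recognizes only this exact byte pattern.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns the absolute target of a "mov rax, imm64; jmp rax" thunk,
// or 0 when the bytes do not match that pattern.
uint64_t thunkTarget(const uint8_t* code, size_t size) {
    if (size < 12) {
        return 0;
    }
    if (code[0] != 0x48 || code[1] != 0xB8) { // REX.W prefix + mov rax, imm64
        return 0;
    }
    if (code[10] != 0xFF || code[11] != 0xE0) { // jmp rax
        return 0;
    }
    uint64_t target = 0;
    for (int i = 7; i >= 0; --i) {
        target = (target << 8) | code[2 + i]; // the immediate is little-endian
    }
    return target;
}
```

Fed with the bytes from the _HeapList dump ('H' is 0x48 and 'r' is 0x72), it yields 0x7ffff5bb7210, the address that lldb resolves to ProcessCLRException.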
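How the IL is obtained
The IL of an ordinary function can be obtained through MethodDesc->GetILHeader
GetILHeader fetches it with pModule->GetIL(GetRVA())
The first function whose IL can be fetched by calling GetILHeader is Main
How IL is converted to BasicBlocks
Where the IL lives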
info.compMethodInfo->ILCode
info.compMethodInfo->ILCodeSize
compCompileHelper copies these to
info.compCode
info.compILCodeSize
Flow
compCompileHelper
compInitDebuggingInfo
fgEnsureFirstBBisScratch
Insert a BasicBlock at the very beginning to support debugging
bbFlags |= BBF_INTERNAL | BBF_IMPORTED
fgInsertStmtAtEnd(fgFirstBB, gtNewNothingNode())
Modify the inserted BasicBlock so that it holds a GenTree containing only a nop
block->bbTreeList = stmt; // block->setBBTreeList(stmt)
fgFindBasicBlocks
fgFindJumpTargets
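Parse the instructions one by one and analyze the jumps in them; instruction sizes come from opcodeSizes
Call fgMarkJumpTarget on each jump target
jumpTarget is the array that records jump targets; it is created by fgFindBasicBlocks
offs is the offset of the jump target from the start address
jumpTarget[offs] |= (jumpTarget[offs] & JT_JUMP) << 1 | JT_JUMP
The first mark is JT_JUMP; from the second mark on it is JT_JUMP | JT_MULTI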
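The marking expression above works because JT_MULTI is the bit directly above JT_JUMP: if JT_JUMP is already set, shifting it left by one produces JT_MULTI. The sketch below shows this in isolation. The numeric flag values are an assumption made for the illustration (the notes do not list them), chosen so that JT_MULTI == JT_JUMP << 1.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Flag values assumed for this sketch; what matters is JT_MULTI == JT_JUMP << 1.
enum : uint8_t {
    JT_NONE = 0x00,  // not a jump target
    JT_ADDR = 0x01,  // address taken (used for EH boundaries)
    JT_JUMP = 0x02,  // target of one jump
    JT_MULTI = 0x04  // target of more than one jump
};

// Same expression as in the notes: the first call sets JT_JUMP,
// any later call additionally sets JT_MULTI.
void markJumpTarget(std::vector<uint8_t>& jumpTarget, unsigned offs) {
    jumpTarget[offs] |= static_cast<uint8_t>(((jumpTarget[offs] & JT_JUMP) << 1) | JT_JUMP);
}
```

Marking the same offset a third time changes nothing, so the array only distinguishes "never", "once" and "more than once".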
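For CEE_LDARG, call pushedStack.PushArgument
pushedStack has type FgStack, an execution stack with a maximum depth of 2 that is used only to record instructions in an inlinee
If certain conditions hold (for example an incoming argument is a constant and that argument is used), the chance that the inline succeeds later is raised (multiplier)
See the fgObserveInlineConstants function
For CEE_LDLEN, call pushedStack.PushArrayLen; same as above
If the function being compiled is an inlinee
If the function must be judged by profitability (CALLEE_IS_DISCRETIONARY_INLINE)
Call compInlineResult->DetermineProfitability; if it decides against inlining, the JIT is aborted
m_CalleeNativeSizeEstimate = DetermineNativeSizeEstimate() // native code size estimated with a state machine
m_CallsiteNativeSizeEstimate = DetermineCallsiteNativeSizeEstimate(methodInfo) // estimated native code size of the instructions that call this function
m_Multiplier = DetermineMultiplier() // multiplier; the larger the value, the more likely the inline; see DetermineMultiplier
const int threshold = (int)(m_CallsiteNativeSizeEstimate * m_Multiplier) // threshold
if (m_CalleeNativeSizeEstimate > threshold)
Mark as not inlined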
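The profitability rule quoted above boils down to one comparison. The sketch below restates it as a free function so the effect of the multiplier is easy to try out. The function name and the sample numbers in the usage note are hypothetical; only the threshold formula comes from the notes.

```cpp
#include <cassert>

// Returns true when a discretionary inline passes the size check quoted above:
// the callee estimate must not exceed callsite estimate * multiplier.
bool isProfitableInline(int calleeNativeSizeEstimate,
                        int callsiteNativeSizeEstimate,
                        double multiplier) {
    const int threshold = static_cast<int>(callsiteNativeSizeEstimate * multiplier);
    return calleeNativeSizeEstimate <= threshold;
}
```

For instance, with a call-site estimate of 85 and a multiplier of 3.0 the threshold is 255, so a callee estimated at 200 passes and one estimated at 300 does not.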
Set jumpTarget according to the exception handlers
When compXcptnsCount (methodInfo->EHcount) > 0
Enumerate the exception handlers
Get the information
CORINFO_EH_CLAUSE clause;
info.compCompHnd->getEHinfo(info.compMethodHnd, XTnum, &clause);
Split before the try
jumpTarget[clause.TryOffset] = JT_ADDR;
Split after the try
tmpOffset = clause.TryOffset + clause.TryLength;
jumpTarget[tmpOffset] = JT_ADDR;
Split before the handler code
jumpTarget[clause.HandlerOffset] = JT_ADDR;
Split after the handler code
tmpOffset = clause.HandlerOffset + clause.HandlerLength;
jumpTarget[tmpOffset] = JT_ADDR;
If a filter is used, split before the filter
jumpTarget[clause.FilterOffset] = JT_ADDR;
fgMakeBasicBlocks
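Enumerate the instructions
The address of the next instruction is kept in nxtBBoffs
If the next instruction is the target of some jump instruction, a new block must be started
if (jmpKind == BBJ_NONE)
bool makeBlock = (jumpTarget[nxtBBoffs] != JT_NONE);
If the current instruction is a jump, a new block must be started
For the kinds of jmpKind, see BBjumpKinds above
Create the block
curBBdesc = fgNewBasicBlock(jmpKind);
curBBdesc->bbFlags |= bbFlags;
curBBdesc->bbRefs = 0;
curBBdesc->bbCodeOffs = curBBoffs;
curBBdesc->bbCodeOffsEnd = nxtBBoffs;
Additional information
If jmpKind is BBJ_SWITCH
curBBdesc->bbJumpSwt = swtDsc
If jmpKind is BBJ_COND, BBJ_ALWAYS, or BBJ_LEAVE
curBBdesc->bbJumpOffs = jmpAddr;
jmpAddr is the offset of the jump target from the start address
Save the block
fgFirstBB points to the first BasicBlock
fgLastBB points to the last BasicBlock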
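The splitting rule described for fgMakeBasicBlocks (end a block at every jump, and before every marked jump target) can be sketched on a toy instruction list. This is a simplified model, not the JIT's code: `Instr`, `Block` and `makeBlocks` are hypothetical names, and the jump-target array is reduced to a plain bool per offset.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Instr {
    unsigned offset; // IL offset of the instruction
    unsigned size;   // encoded size in bytes
    bool endsBlock;  // true for branches, ret, throw and similar
};

struct Block {
    unsigned codeOffs;    // plays the role of bbCodeOffs
    unsigned codeOffsEnd; // plays the role of bbCodeOffsEnd
};

// Walk the instructions and close the current block when the instruction is a jump,
// when the next offset is a jump target, or when the input ends.
std::vector<Block> makeBlocks(const std::vector<Instr>& instrs,
                              const std::vector<bool>& isJumpTarget) {
    std::vector<Block> blocks;
    if (instrs.empty()) {
        return blocks;
    }
    unsigned curBBoffs = instrs.front().offset;
    for (size_t i = 0; i < instrs.size(); ++i) {
        const Instr& in = instrs[i];
        unsigned nxtBBoffs = in.offset + in.size;
        bool nextIsTarget = nxtBBoffs < isJumpTarget.size() && isJumpTarget[nxtBBoffs];
        bool isLast = (i + 1 == instrs.size());
        if (in.endsBlock || nextIsTarget || isLast) {
            blocks.push_back(Block{curBBoffs, nxtBBoffs});
            curBBoffs = nxtBBoffs;
        }
    }
    return blocks;
}
```

A five-instruction method with a branch at offset 1 that targets offset 5 comes out as three blocks: [0, 3), [3, 5) and [5, 6).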
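Call fgLinkBasicBlocks
Call fgInitBBLookup
Set Compiler::fgBBs (BasicBlock**), storing each BasicBlock* of the linked list in an array
If jmpKind is BBJ_COND, BBJ_ALWAYS, or BBJ_LEAVE
Convert bbJumpOffs to bbJumpDest
Increment the bbRefs of the target BasicBlock
If jmpKind is BBJ_NONE
Increment the bbRefs of the next BasicBlock
If jmpKind is BBJ_SWITCH
Increment the bbRefs of all target BasicBlocks
If the number of the target BasicBlock is smaller than the number of the current BasicBlock
Call fgMarkBackwardJump
For every BasicBlock from the target to the current one, bbFlags |= BBF_BACKWARD_JUMP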
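The linking pass can be illustrated on a small block list: count incoming references and flag every block that lies between a backward jump's target and its source. The types below are hypothetical stand-ins for BasicBlock and BBJ_*. Two details are assumptions of this sketch rather than statements from the notes: a conditional branch also counts its fall-through successor, and a block that jumps to itself is treated as a backward jump.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum JumpKind { JK_NONE, JK_ALWAYS, JK_COND, JK_RETURN }; // simplified stand-ins for BBJ_*

struct Block {
    JumpKind kind;
    int jumpDest;      // index of the target block, -1 if there is none
    unsigned refs;     // plays the role of bbRefs
    bool backwardJump; // plays the role of BBF_BACKWARD_JUMP
};

void linkBlocks(std::vector<Block>& blocks) {
    for (size_t i = 0; i < blocks.size(); ++i) {
        Block& b = blocks[i];
        bool hasTarget = (b.kind == JK_ALWAYS || b.kind == JK_COND);
        bool fallsThrough = (b.kind == JK_NONE || b.kind == JK_COND); // assumption, see lead-in
        if (hasTarget) {
            size_t dest = static_cast<size_t>(b.jumpDest);
            blocks[dest].refs++;
            if (dest <= i) {
                // Backward jump: flag every block from the target up to the current one.
                for (size_t j = dest; j <= i; ++j) {
                    blocks[j].backwardJump = true;
                }
            }
        }
        if (fallsThrough && i + 1 < blocks.size()) {
            blocks[i + 1].refs++;
        }
    }
}
```

The backward-jump flag is what later phases can use to recognize blocks that may sit inside a loop.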
Check whether inlining is possible
if (compIsForInlining())
The processing below continues only when exception handlers exist
if (info.compXcptnsCount == 0)
return
Report an error when there are more than 65534 exception handlers
if (info.compXcptnsCount > MAX_XCPTN_INDEX)
IMPL_LIMITATION("too many exception clauses");
fgAllocEHTable
Allocate the array of exception handler block descriptors; a descriptor holds the BasicBlocks where the try begins and ends, where the handler begins and ends, and so on
compHndBBtab = new (this, CMK_BasicBlock) EHblkDsc[compHndBBtabAllocCount];
compHndBBtabCount = info.compXcptnsCount;
verInitEHTree
Initialize the nodes used by the EH tree
ehnNext = new (this, CMK_BasicBlock) EHNodeDsc[numEHClauses * 3];
ehnTree = nullptr;
Fill the exception handler block descriptor array compHndBBtab and the EH tree ehnTree
for (XTnum = 0, HBtab = compHndBBtab; XTnum < compHndBBtabCount; XTnum++, HBtab++)
Fill compHndBBtab
Build ehnTree
verInsertEhNode(&clause, HBtab)
Nodes have attributes such as ehnNext, ehnChild, ehnTryNode, ehnHandlerNode
try { } catch (ex_a) { } catch (ex_b) { } finally { } produces the following tree
try (=>next) finally
(=>child) try (=>next) catch (=>next) catch
fgSortEHTable
Sort the exception handler block descriptor array compHndBBtab
A try/catch nested inside is placed before the one that encloses it
Make the BasicBlocks inside a try, catch, or finally refer to the indices of the sorted compHndBBtab
Call setHndIndex to update bbHndIndex
Call setTryIndex to update bbTryIndex
Update ebdEnclosingTryIndex
Update ebdEnclosingHndIndex
fgNormalizeEH
Insert empty BasicBlocks for nested try/catch
The comments on fgNormalizeEH in jiteh.cpp explain this more clearly
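How BasicBlocks are converted to GenTrees
Flow
compCompile (3-argument overload, compiler.cpp:4078)
The flow below follows compphases.h
PHASE_PRE_IMPORT
hashBv::Init(this)
Clear compiler->hbvGlobalData
It is used to allocate bitmap objects (hashBv*), and acts like an allocator
fgOutgoingArgTemps and fgCurrentlyInUseArgTemps are allocated from it later
VarSetOps::AssignAllowUninitRhs(this, compCurLife, VarSetOps::UninitVal())
Set the set of currently live variables to uninitialized; it is reset to the empty set later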
PHASE_IMPORTATION
fgImport
impImport(fgFirstBB)
Initialize the execution stack
verCurrentState.esStack = impSmallStack (the SmallStack is used when maxstack is less than 16, otherwise new)
verInitCurrentState()
Initialize the members used to find Spill Cliques
inlineRoot->impPendingBlockMembers.Reset(fgBBNumMax * 2)
inlineRoot->impSpillCliquePredMembers.Reset(fgBBNumMax * 2)
inlineRoot->impSpillCliqueSuccMembers.Reset(fgBBNumMax * 2)
Handle BasicBlocks whose bbFlags carry BBF_INTERNAL
Skip these BasicBlocks
for (; method->bbFlags & BBF_INTERNAL; method = method->bbNext)
Set BBF_IMPORTED, bbFlags |= BBF_IMPORTED
impImportBlockPending
Add the BasicBlock to impPendingList
This imports the first BasicBlock that is not BBF_INTERNAL
Later this function is also used to import the BasicBlocks that are jump targets
Process impPendingList until the list is empty
Call impImportBlock(dsc->pdBB)
Skip BBF_INTERNAL nodes and add all of their Successors to impPendingList
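Call impVerifyEHBlock(pParam->block, true)
Local variable HBtab = the exception information that the block (try block) belongs to
If the block is a try start, the execution stack must be empty
Back up the current state of the execution stack
while HBtab != nullptr
If the block is a try start
this must already be initialized, unless there is a fault block
Process HBtab->ebdHndBeg (processed recursively)
If the handler block receives an exception object
Call hndBegBB = impPushCatchArgOnStack(hndBegBB, clsHnd)
If other blocks jump to hndBegBB, a new hndBegBB must be inserted and clsHnd spilled to a temp
Call impImportBlockPending(hndBegBB) // add the handler to the processing queue
If there is a filter
Call filterBB = impPushCatchArgOnStack(filterBB, impGetObjectClass()); same as above
Call impImportBlockPending(filterBB); same as above
If there is an enclosing try (ebdEnclosingTryIndex), then HBtab = the HBtab of the enclosing try
Restore the current state of the execution stack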
Call impImportBlockCode(pParam->block)
This is the main function that converts a BasicBlock into GenTrees; it is more than 5000 lines long
Call impBeginTreeList
Set the initial impTreeList = impTreeLast = new (this, GT_BEG_STMTS) GenTree(GT_BEG_STMTS, TYP_VOID)
Process each instruction in the BasicBlock
switch (opcode)
CEE_NOP
op1 = new (this, GT_NO_OP) GenTree(GT_NO_OP, TYP_VOID)
impAppendTree(op1, (unsigned)CHECK_SPILL_ALL, impCurStmtOffs)
Append it after impTreeLast and update impTreeLast
CEE_NEWOBJ
Call impSpillSpecialSideEff
If the current BasicBlock is the first BasicBlock in a catch or finally (the start of a funclet), then
Enumerate the ExecutionStack
If the node is a CatchArg (the exception object) and it has the GTF_ORDER_SIDEEFF flag
impSpillStackEntry(level, BAD_VAR_NUM)
Generate an expression that stores the node into a temporary variable and append it to impTreeList
Replace the node with a node that reads the temporary variable
Set resolvedToken (CORINFO_RESOLVED_TOKEN)
_impResolveToken(CORINFO_TOKENKIND_NewObj)
#define _impResolveToken(kind) impResolveToken(codeAddr, &resolvedToken, kind)
Read the token that follows the instruction
pResolvedToken->token = getU4LittleEndian(addr)
Get the TypeHandle and MethodDesc that correspond to the token
info.compCompHnd->resolveToken(pResolvedToken) (CEEINFO::resolveToken)
Afterwards the resulting TypeHandle and MethodDesc can be inspected
dumpmt resolvedToken->hClass
dumpmd resolvedToken->hMethod
Call eeGetCallInfo
info.compCompHnd->getCallInfo(pResolvedToken, pConstrainedToken, info.compMethodHnd, flags, pResult) (CEEINFO::getCallInfo)
Fill in callInfo and check whether the function may be called
Call impHandleAccessAllowedInternal
Decide whether the function may be called
Skip when CORINFO_ACCESS_ALLOWED
Throw an exception when CORINFO_ACCESS_ILLEGAL
Insert run-time check code when CORINFO_ACCESS_RUNTIME_CHECK (callInfo.callsiteCalloutHelper)
Branch on callInfo.classFlags
There are three different cases for new
Object size is variable (depends on arguments)
1) Object is an array (arrays treated specially by the EE)
2) Object is some other variable sized object (e.g. String)
3) Class Size can be determined beforehand (normal case)
In the first case, we need to call a NEWOBJ helper (multinewarray)
in the second case we call the constructor with a '0' this pointer
In the third case we alloc the memory, then call the constructor
In the third case
Grab a temporary variable to hold the result of the memory allocation
lclNum = lvaGrabTemp(true DEBUGARG("NewObj constructor temp"))
Increments lvaCount and returns the value before the increment
Generate the MethodTable argument node (Icon)
op1 = impParentClassTokenToHandle(&resolvedToken, nullptr, TRUE)
impTokenToHandle(pResolvedToken, pRuntimeLookup, mustRestoreHandle, TRUE)
impLookupToTree
gtNewIconEmbHndNode
Icon here means Int Const
Generate the node for JIT_New (AllocObj)
op1 = gtNewAllocObjNode(
info.compCompHnd->getNewHelper(&resolvedToken, info.compMethodHnd),
resolvedToken.hClass, TYP_REF, op1)
new (this, GT_ALLOCOBJ) GenTreeAllocObj(type, helper, clsHnd, op1)
Generate a node that stores the allocation result into the temporary variable, then append it to impTreeList
impAssignTempGen(lclNum, op1, (unsigned)CHECK_SPILL_NONE);
GenTreePtr asg = gtNewTempAssign(tmp, val);
impAppendTree(asg, curLevel, impCurStmtOffs)
Generate a node that reads the temporary variable
newObjThisPtr = gtNewLclvNode(lclNum, TYP_REF)
Jump to the code that generates the call node
goto CALL
Decide whether to optimize tail recursion
bool isRecursive = (callInfo.hMethod == info.compMethodHnd)
Append the node that calls the constructor to impTreeList
impImportCall
Create a call node
call = gtNewCallNode(CT_USER_FUNC, callInfo->hMethod, callRetTyp, nullptr, ilOffset)
Set the argument nodes
args = call->gtCall.gtCallArgs = impPopList(sig->numArgs, &argFlags, sig, extraArg)
Set this
if (opcode == CEE_NEWOBJ) { obj = newobjThis; }
call->gtFlags |= obj->gtFlags & GTF_GLOB_EFFECT;
call->gtCall.gtCallObjp = obj;
Append to impTreeList
impAppendTree(call, (unsigned)CHECK_SPILL_ALL, impCurStmtOffs)
Push this onto the ExecutionStack
impPushOnStack(gtNewLclvNode(newobjThis->gtLclVarCommon.gtLclNum, TYP_REF), typeInfo(TI_REF, clsHnd))
verCurrentState.esStack[verCurrentState.esStackDepth].seTypeInfo = ti;
verCurrentState.esStack[verCurrentState.esStackDepth++].val = tree;
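Call impMarkInlineCandidate(call, exactContextHnd, callInfo)
Decide whether the function can be inlined
No inlining when optimization is disabled
No inlining when the call is a tail call
No inlining when the call's gtFlags & GTF_CALL_VIRT_KIND_MASK is not GTF_CALL_NONVIRT
No inlining when the call is a helper call
No inlining when the call is an indirect call
When the environment sets COMPlus_AggressiveInlining, set CORINFO_FLG_FORCEINLINE
No inlining when CORINFO_FLG_FORCEINLINE is not set and the call is inside a catch or filter
No inlining when an earlier inline attempt failed and marked CORINFO_FLG_DONT_INLINE
Synchronized functions (CORINFO_FLG_SYNCH) are not inlined
Functions that need a security check (CORINFO_FLG_SECURITYCHECK) are not inlined
Call impCheckCanInline to decide whether the function can be inlined
Call impCanInlineIL to decide whether the function can be inlined
No inlining if the function has exception handlers
No inlining if the function is empty (size = 0)
No inlining if the function takes varargs
No inlining when the number of locals in methodInfo is greater than MAX_INL_LCLS (32)
No inlining when the number of arguments in methodInfo is greater than MAX_INL_LCLS
Call inlineResult->NoteInt to report CALLEE_NUMBER_OF_LOCALS
Not handled by the inline policy
Call inlineResult->NoteInt to report CALLEE_NUMBER_OF_ARGUMENTS
Not handled by the inline policy
Call inlineResult->NoteBool to report CALLEE_IS_FORCE_INLINE
Set m_IsForceInline = value
Call inlineResult->NoteInt to report CALLEE_IL_CODE_SIZE
If codesize <= ALWAYS_INLINE_SIZE (16), mark CALLEE_BELOW_ALWAYS_INLINE_SIZE
If force inline, mark CALLEE_IS_FORCE_INLINE
If codesize <= m_RootCompiler->m_inlineStrategy->GetMaxInlineILSize()
(100 on this machine, i.e. DEFAULT_MAX_INLINE_SIZE)
mark CALLEE_IS_DISCRETIONARY_INLINE, to be judged by profitability later
Otherwise mark as not inlined (CALLEE_TOO_MUCH_IL)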
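The IL-size classification just described is a short decision chain, sketched below. The enum and function names are hypothetical; the two limits default to the values quoted in the notes (16 and 100), and the order of the checks follows the order in which the notes list them.

```cpp
#include <cassert>

enum class SizeVerdict {
    BelowAlwaysInlineSize, // small enough to always be a candidate
    ForceInline,           // larger, but inlining was explicitly requested
    Discretionary,         // decided later by the profitability check
    TooMuchIL              // rejected outright
};

SizeVerdict classifyByILSize(unsigned ilCodeSize, bool isForceInline,
                             unsigned alwaysInlineSize = 16,
                             unsigned maxInlineILSize = 100) {
    if (ilCodeSize <= alwaysInlineSize) {
        return SizeVerdict::BelowAlwaysInlineSize;
    }
    if (isForceInline) {
        return SizeVerdict::ForceInline;
    }
    if (ilCodeSize <= maxInlineILSize) {
        return SizeVerdict::Discretionary;
    }
    return SizeVerdict::TooMuchIL;
}
```

Only the Discretionary outcome goes on to the profitability estimate; the other three are decided by size and flags alone.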
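Call inlineResult->NoteInt to report CALLEE_MAXSTACK
No inlining if force inline was not requested and maxstack is larger than SMALL_STACK_SIZE (16)
Call CEEInfo::initClass to initialize the class that contains the function
If the class is not initialized
If the function belongs to a generic definition, it cannot be inlined
If the type must be initialized before any field access (IsBeforeFieldInit), it cannot be inlined
If no other early-out condition applied, initializing the class was attempted and it failed, it cannot be inlined
Call CEEInfo::canInline to decide whether the function can be inlined
Boundary method
- A function that creates a StackCrawlMark to find its caller
- A function that calls a function meeting the condition above (marked IsMdRequireSecObject)
- A function that calls a virtual method (the virtual method may meet the conditions above)
Functions that call a Boundary method are not inlined
No inlining if the grant set or refuse set of caller and callee differ
Call canReplaceMethodOnStack
If both are in the same assembly, inlining is allowed
If they are in different assemblies, any one of the following must hold
The caller is full trust and its refused set is empty
The appdomain's IsHomogenous holds, and the refused sets of both caller and callee are empty
IsHomogenous: https://msdn.microsoft.com/en-us/library/system.appdomain.ishomogenous(v=vs.110).aspx
If callee and caller are in different modules and the callee's string pool is per module
mark dwRestrictions |= INLINE_NO_CALLEE_LDSTR (the callee must not contain ldstr)
If all the previous checks passed
call->gtFlags |= GTF_CALL_INLINE_CANDIDATE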
CEE_DUP
Pop the value at the top of the ExecutionStack
op1 = impPopStack(tiRetVal);
Clone the expression
op1 = impCloneExpr(op1, &op2, tiRetVal.GetClassHandle(), (unsigned)CHECK_SPILL_ALL,
nullptr DEBUGARG("DUP instruction"));
Push onto the ExecutionStack
impPushOnStack(op1, tiRetVal)
Push onto the ExecutionStack
impPushOnStack(op2, tiRetVal)
CEE_LDC_I4_S
Read the constant
cval.intVal = getI1LittleEndian(codeAddr);
Jump to PUSH_I4CON
goto PUSH_I4CON
Push onto the ExecutionStack
impPushOnStack(gtNewIconNode(cval.intVal), typeInfo(TI_INT))
CEE_CALL
Get callInfo
_impResolveToken(CORINFO_TOKENKIND_Method);
eeGetCallInfo
Continue at CALL
The rest is the same as described above
In addition, in impImportCall, if the return value is not void
and the function is an inline candidate
build a retExpr with gtNewInlineCandidateReturnExpr and push it onto the ExecutionStack
otherwise
push the result of the call onto the ExecutionStack (possibly with a cast in between)
CEE_STLOC_0
Get the local variable being assigned
lclNum = (opcode - CEE_STLOC_0)
lclNum += numArgs (skip the locals that hold arguments)
Pop from the top of the ExecutionStack
StackEntry se = impPopStack(clsHnd);
op1 = se.val;
tiRetVal = se.seTypeInfo;
Generate the node for the local variable being assigned
op2 = gtNewLclvNode(lclNum, lclTyp, opcodeOffs + sz + 1);
Generate the assignment node
op1 = gtNewAssignNode(op2, op1);
Append to impTreeList
impAppendTree(op1, (unsigned)CHECK_SPILL_ALL, impCurStmtOffs)
CEE_RET
impReturnInstruction
Get the node for the return value
StackEntry se = impPopStack(retClsHnd)
op2 = se.val
Generate the return node
op1 = gtNewOperNode(GT_RETURN, genActualType(info.compRetType), op2)
Append to impTreeList
impAppendTree(op1, (unsigned)CHECK_SPILL_NONE, impCurStmtOffs)
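The opcode cases above share one pattern: operands are popped from the execution stack as trees, combined into a bigger tree, and either pushed back or appended to the statement list. The toy importer below shows that pattern for a handful of opcodes, rendering trees as strings. It is a minimal sketch under strong assumptions: the opcode set, the `Op` and `Instr` types and the textual node names are all invented for illustration, and the input is assumed to be well-formed IL (no stack underflow checks).

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class Op { LdcI4, LdLoc, Add, StLoc, Ret };

struct Instr {
    Op op;
    int operand; // constant value or local number; unused for Add and Ret
};

static std::string popStack(std::vector<std::string>& stack) {
    std::string top = stack.back();
    stack.pop_back();
    return top;
}

// Returns the appended statements in order (the counterpart of impTreeList).
std::vector<std::string> importBlock(const std::vector<Instr>& il) {
    std::vector<std::string> stack; // execution stack holding expression trees
    std::vector<std::string> stmts; // statement list
    for (const Instr& in : il) {
        switch (in.op) {
        case Op::LdcI4:
            stack.push_back("CNS_INT(" + std::to_string(in.operand) + ")");
            break;
        case Op::LdLoc:
            stack.push_back("LCL_VAR(V" + std::to_string(in.operand) + ")");
            break;
        case Op::Add: {
            std::string op2 = popStack(stack);
            std::string op1 = popStack(stack);
            stack.push_back("ADD(" + op1 + ", " + op2 + ")");
            break;
        }
        case Op::StLoc: {
            std::string value = popStack(stack);
            stmts.push_back("ASG(LCL_VAR(V" + std::to_string(in.operand) + "), " + value + ")");
            break;
        }
        case Op::Ret:
            stmts.push_back("RETURN(" + popStack(stack) + ")");
            break;
        }
    }
    return stmts;
}
```

Importing `ldc 1; ldc 2; add; stloc 0; ldloc 0; ret` produces two statements: an assignment whose right-hand side is the whole ADD tree, followed by the return of the local.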
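Call impVerifyEHBlock(pParam->block, /* isTryStart */ false); same as above
If values remain on the ExecutionStack, handle them
if (verCurrentState.esStackDepth != 0)
Determine whether the next BasicBlock has multiple predecessors
unsigned multRef = impCanReimport ? unsigned(~0) : 0
Set the starting temp number to the target's bbStkTempsIn
baseTmp = block->bbNext->bbStkTempsIn
In some cases the last expression must be removed from the GenTree list, so that the statements inserted below go in front of it
addStmt = impTreeLast;
impTreeLast = impTreeLast->gtPrev;
If the temps have not been allocated, allocate as many as there are leftover values and set bbStkTempsIn and bbStkTempsOut
baseTmp = impGetSpillTmpBase(block)
lvaGrabTemps
After allocation lvaCount has grown by verCurrentState.esStackDepth, and baseTmp equals lvaCount before the allocation
impWalkSpillCliqueFromPred
Compute the predecessor BasicBlocks cheaply
fgComputeCheapPreds
Enumerate all successor BasicBlocks
If bbStkTempsIn has not been set for the BasicBlock
impSpillCliqueSuccMembers.Get(blk->bbInd()) == 0
Call SetSpillTempsBase::Visit to set bbStkTempsIn
Add it to the succCliqueToDo list
Enumerate all predecessor BasicBlocks
If bbStkTempsOut has not been set for the BasicBlock
impSpillCliquePredMembers.Get(blk->bbInd()) == 0
Call SetSpillTempsBase::Visit to set bbStkTempsOut
Add it to the predCliqueToDo list
Enumerate the values remaining on the ExecutionStack
Call impSpillStackEntry(level, tempNum)
Generate an expression that stores the node into a temporary variable and append it to impTreeList
Replace the node with a node that reads the temporary variable
Add addStmt back to impTreeList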
Save the generated GenTrees (impTreeList) to the BasicBlock
impEndTreeList(block)
impEndTreeList(block, firstTree, impTreeLast)
firstStmt->gtPrev = lastStmt
block->bbTreeList = firstStmt
block->bbFlags |= BBF_IMPORTED
#ifdef DEBUG
impTreeList = impTreeLast = nullptr
If reimportSpillClique is set, call impReimportSpillClique
If a spilled temporary has a type conversion (int => native int, float => double)
Re-import all successor BasicBlocks
Add all Successors to impPendingList
for (unsigned i = 0; i < block->NumSucc(); i++)
impImportBlockPending(block->GetSucc(i))
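Seen from a distance, the import loop is a worklist traversal: take a block from the pending list, import it, then queue its successors. The sketch below shows only that skeleton. It is a simplified model with hypothetical names; it processes the list first-in first-out (an assumption, the notes do not state the order), imports each block once, and leaves out the re-import and stack-state handling described above.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

struct Block {
    std::vector<int> succs; // indices of successor blocks
    bool imported = false;  // plays the role of BBF_IMPORTED
};

// Imports every block reachable from `first` and returns the import order.
std::vector<int> importAll(std::vector<Block>& blocks, int first) {
    std::vector<int> order;
    std::vector<bool> pending(blocks.size(), false);
    std::deque<int> pendingList;
    pendingList.push_back(first);
    pending[static_cast<size_t>(first)] = true;
    while (!pendingList.empty()) {
        int cur = pendingList.front();
        pendingList.pop_front();
        pending[static_cast<size_t>(cur)] = false;
        blocks[static_cast<size_t>(cur)].imported = true;
        order.push_back(cur);
        for (int s : blocks[static_cast<size_t>(cur)].succs) {
            size_t si = static_cast<size_t>(s);
            if (!blocks[si].imported && !pending[si]) {
                pendingList.push_back(s);
                pending[si] = true;
            }
        }
    }
    return order;
}
```

Blocks that are never reached keep `imported == false`, which mirrors how blocks without BBF_IMPORTED can later be removed as unreachable.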
fgRemovePreds
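Delete the bbPreds generated earlier by fgComputeCheapPreds, to keep inlining from running into problems
fgRemoveEmptyBlocks
Delete unreachable BasicBlocks; this runs only when inlining, and when inlining the function returns after this step
PHASE_POST_IMPORT
fgRemoveEH
If the current environment does not support exception handling, delete all BasicBlocks associated with catch, but keep the BasicBlocks associated with try
fgInstrumentMethod
If the method is being profiled, insert code that calls JIT_LogMethodEnter into the first BasicBlock
PHASE_MORPH
NewBasicBlockEpoch
Update the current BasicBlock epoch information
fgCurBBEpoch++
fgCurBBEpochSize = fgBBNumMax + 1 // number of BasicBlocks in the current epoch; it keeps its value even after fgBBcount grows
fgBBSetCountInSizeTUnits = size needed to store BasicBlocks in a bitset, in units of size_t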
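The last value is simply the epoch size rounded up to whole machine words. A minimal sketch of that calculation, assuming one bit per BasicBlock and a hypothetical helper name:

```cpp
#include <cassert>
#include <cstddef>

// Number of size_t-sized words needed for a bitset with one bit per block.
// epochSize corresponds to fgCurBBEpochSize (fgBBNumMax + 1).
unsigned bbSetCountInSizeTUnits(unsigned epochSize,
                                unsigned bitsPerUnit = static_cast<unsigned>(sizeof(size_t) * 8)) {
    return (epochSize + bitsPerUnit - 1) / bitsPerUnit; // round up
}
```

On a 64-bit target, methods with up to 64 blocks in the epoch fit in a single word, which keeps the block sets cheap for small methods.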
fgMorph
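If the type needs dynamic initialization
For example, the type is generic and has a static constructor
Insert code that calls JIT_ClassInitDynamicClass into the first BasicBlock
If in debug mode
If opts.compGcChecks is set
Insert code that calls JIT_CheckObj into the first BasicBlock, to check that the pointers of all reference-type arguments are valid
If opts.compStackCheckOnRet is set
Add a temporary variable lvaReturnEspCheck (TYP_INT)
If opts.compStackCheckOnCall is set
Add a temporary variable lvaCallEspCheck (TYP_INT)
Delete unreachable BasicBlocks
Delete BasicBlocks not marked BBF_IMPORTED
If the BasicBlock that corresponds to a try has been deleted, delete the matching element of the EH Table as well
Call fgRenumberBlocks to renumber the BasicBlocks
fgAddInternal
Add internal code to the first BasicBlock (which must be BBF_INTERNAL)
If argument 0 is not the this argument (lvaArg0Var != info.compThisArg)
Set argument 0 to be the this argument
If opts.compNeedSecurityCheck is set
Add a temporary variable lvaSecurityObject (TYP_REF)
Determine whether only one ret should be generated; the result is kept in oneReturn
If the current platform is not x86 (32-bit), generate code for synchronized methods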
fgAddSyncMethodEnterExit
unsigned byte acquired = 0;
try {
JIT_MonEnterWorker(<lock object>, &acquired);
*** all the preexisting user code goes here ***
JIT_MonExitWorker(<lock object>, &acquired);
} fault {
JIT_MonExitWorker(<lock object>, &acquired);
}
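If oneReturn, generate a BasicBlock used for merging
genReturnBB = fgNewBBinRegion(BBJ_RETURN)
If oneReturn and there is a return value
Add a temporary variable genReturnLocal
If the function calls unmanaged functions
Add the local variable lvaInlinedPInvokeFrameVar (TYP_BLK)
Set the variable's size to eeGetEEInfo()->inlinedCallFrameInfo.size
If JustMyCode is enabled
Add the code if (*pFlag != 0) call JIT_DbgIsJustMyCode to the first BasicBlock
Note that this code contains a branch and is marked GTF_RELOP_QMARK (?:)
The QMARK node is later split into multiple BasicBlocks
If tiSecurityCalloutNeeded
Add the code call JIT_Security_Prolog(MethodHnd, &SecurityObject) to the first BasicBlock
If the current platform is x86 (32-bit), generate code for synchronized methods
Insert JIT_MonEnterWorker(<lock object>) into the first BasicBlock
Insert JIT_MonExitWorker(<lock object>) into the return BasicBlock, and make sure oneReturn holds
On x86 the vm releases the lock automatically when the function hits an exception
If tiRuntimeCalloutNeeded
Add the code call verificationRuntimeCheck(MethodHnd) to the first BasicBlock
If opts.IsReversePInvoke (a C# function called from C)
Insert code that calls CORINFO_HELP_JIT_REVERSE_PINVOKE_ENTER into the first BasicBlock
Insert code that calls CORINFO_HELP_JIT_REVERSE_PINVOKE_EXIT into the return BasicBlock
Both of these functions are currently undefined
If oneReturn
Generate a GT_RETURN node and insert it into the return BasicBlock (the genReturnBB created earlier)
If there is a return value, use the genReturnLocal variable created earlier
At this point the original returning BasicBlocks still do not point to the newly created genReturnBB; that is changed later in fgMorphBlocks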
fgInline
Get an InlineContext rootContext
Set gtInlineContext of every GenTreeStmt in every BasicBlock to rootContext
Enumerate every GenTreeStmt in every BasicBlock
If the expr of the stmt is a GT_CALL and is an inline candidate (GTF_CALL_INLINE_CANDIDATE)
Call fgMorphCallInline
Call fgMorphCallInlineHelper
If there are 512 or more local variables, mark the inline as failed
If the call is virtual, mark the inline as failed
If the method needs a security check (compNeedSecurityCheck), mark the inline as failed
Call fgInvokeInlineeCompiler
Call fgCheckInlineDepthAndRecursion
If the inline is recursive, e.g. A inlines B and B inlines A, do not inline
If the depth exceeds InlineStrategy::IMPLEMENTATION_MAX_INLINE_DEPTH (1000), do not inline
Call inlineResult->NoteInt(InlineObservation::CALLSITE_DEPTH, depth)
If the inline depth exceeds m_RootCompiler->m_inlineStrategy->GetMaxInlineDepth()
(20 on my machine, i.e. DEFAULT_MAX_INLINE_DEPTH)
do not inline (CALLSITE_IS_TOO_DEEP)
Return the inline depth
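The depth and recursion check described above can be pictured as a walk up the chain of inline contexts. The following is only a minimal sketch: InlineCtx, InlineCheck, kMaxInlineDepth and checkInlineDepthAndRecursion are hypothetical names invented for illustration, not the JIT's own types, and the limit of 20 is simply the value quoted in these notes.

```cpp
#include <cassert>

// Hypothetical stand-in for an inline context: one node per inlined call,
// linked to the context of its caller.
struct InlineCtx {
    const InlineCtx* parent;  // nullptr for the root method
    const void*      method;  // identity of the method this context represents
};

enum class InlineCheck { Ok, Recursive, TooDeep };

// Assumed limit, taken from the "20" mentioned in the notes.
constexpr unsigned kMaxInlineDepth = 20;

// Walk from the call site's context up to the root, counting levels and
// rejecting the candidate if the callee is already on the chain.
InlineCheck checkInlineDepthAndRecursion(const InlineCtx* site,
                                         const void*      callee,
                                         unsigned*        depthOut) {
    assert(depthOut != nullptr);
    unsigned depth = 0;
    for (const InlineCtx* c = site; c != nullptr; c = c->parent) {
        ++depth;
        if (c->method == callee) {
            *depthOut = depth;
            return InlineCheck::Recursive;  // e.g. A inlines B, B inlines A
        }
    }
    *depthOut = depth;
    return depth > kMaxInlineDepth ? InlineCheck::TooDeep : InlineCheck::Ok;
}
```

In this sketch the depth is reported through depthOut in every case, which mirrors the notes' point that the depth is both recorded as an observation and returned.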
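Call impInlineInitVars
Initialize pInlineInfo, which is passed to jitNativeCode below
Record the this arg in pInlineInfo->inlArgInfo[argNum], pInlineInfo->lclVarInfo[argNum]
Record the passed arguments in pInlineInfo->inlArgInfo[argNum], pInlineInfo->lclVarInfo[argNum]
Record the callee's own local variables in pInlineInfo->lclVarInfo[argNum]
Call jitNativeCode
Generate the BasicBlocks and GenTrees of the inlinee and store them in InlineeCompiler
The profitability analysis for the inlinee runs here; if inlining is judged not worthwhile, failure is returned
The flow is: jitNativeCode => compCompile => compCompileHelper => fgFindBasicBlocks =>
fgFindJumpTargets => InlineResult::DetermineProfitability =>
LegacyPolicy::DetermineProfitability
If the callee has a return type but no return expression, mark the inline as failed
For example, a throw along the way left the returning BasicBlock unimported
If calling initClass immediately is allowed but the initialization fails, mark the inline as failed
* From this point on the inline can no longer be marked as failed
Call fgInsertInlineeBlocks
If InlineeCompiler contains only one BasicBlock
Insert all stmts of that BasicBlock after the original stmt
Mark the original stmt as empty
If InlineeCompiler contains multiple BasicBlocks
Split the containing BasicBlock at the original stmt into topBlock and bottomBlock
Insert the BasicBlocks produced by inlining between topBlock and bottomBlock
Mark the original stmt as empty; the original stmt stays in topBlock
The call under the original stmt is replaced with the return expression of the inlinee
iciCall->CopyFrom(pInlineInfo->retExpr, this)
The gtInlineCandidate of other retExprs points at this call node; the pointer stays the same but the contents change
See the inlining of System.IO.ConsoleStream.Flush for an example
Mark the inline as successful
If the inline failed
Clean up the newly created local variables and restore the original local variable count (lvaCount)
If the inline failed
If the call result is not void
Set the expr of the stmt to empty
The original stmt is still referenced by retExpr and is substituted back later
Clear the inline-candidate flag (GTF_CALL_INLINE_CANDIDATE) on the original expr (call)
If the original stmt was set to empty
Remove the original stmt
Replace the inline placeholders (retExpr) with the result of inlining
fgWalkTreePre(&stmt->gtStmtExpr, fgUpdateInlineReturnExpressionPlaceHolder);
If stmt is a GT_RET_EXPR
Get stmt->gtRetExpr.gtInlineCandidate
Replace the expression with gtInlineCandidate, repeating until no GT_RET_EXPR is left
gtInlineCandidate may be a call, or it may be a lclVar or lclFld
For an example of replacing the expression with a lclFld, see how Sys.GetLastErrorInfo is inlined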
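The "repeat until no GT_RET_EXPR is left" step above is a chain walk. A minimal sketch follows, assuming a simplified node type; Oper, Node and resolveRetExpr are hypothetical names for illustration and do not match the JIT's GenTree classes.

```cpp
#include <cassert>

// Simplified node kinds: a placeholder (RetExpr) or something "real".
enum class Oper { Call, LclVar, LclFld, RetExpr };

struct Node {
    Oper  oper;
    Node* inlineCandidate;  // only meaningful when oper == Oper::RetExpr
};

// Follow the placeholder chain until a non-placeholder node is reached.
// A placeholder may point at another placeholder, hence the loop.
Node* resolveRetExpr(Node* n) {
    assert(n != nullptr);
    while (n->oper == Oper::RetExpr) {
        n = n->inlineCandidate;
        assert(n != nullptr);  // a placeholder must always have a target
    }
    return n;
}
```

The loop ends on a call when inlining failed and the original call was put back, or on a lclVar or lclFld when inlining succeeded and the return value lives in a local.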
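RecordStateAtEndOfInlining
In debug mode, record the time at which inlining finished
m_compTickCountAtEndOfInlining = GetTickCount()
fgMarkImplicitByRefArgs
Iterate over the local variables
If a local variable is TYP_STRUCT and its size is unusual (on x64: 3, 5, 6, 7 or more than 8 bytes; on arm64: more than 16 bytes), change its type to TYP_BYREF
fgPromoteStructs
Promotes local struct variables (extracts each field into a separate variable)
Iterate over the local variables
Decide whether to promote
If there are 512 or more local variables, do not promote
If the variable is not a struct, do not promote
Call lvaCanPromoteStructVar, which returns whether promotion is possible and the list of fields
If the variable is used in SIMD instructions, do not promote
If the variable is an HFA (homogeneous floating-point aggregate) type, do not promote
Call lvaCanPromoteStructType
If the struct is larger than sizeof(double) * 4, do not promote
If the struct has more than 4 fields, do not promote
If the struct has fields whose addresses overlap (e.g. a union), do not promote
If the struct has a custom layout and is an HFA type, do not promote
If the struct contains a field of non-primitive type, do not promote
If the struct contains a field with unusual alignment ((fldOffset % fldSize) != 0), do not promote
Mark the struct as promotable and sort the fields in StructPromotionInfo by offset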
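The lvaCanPromoteStructType checks listed above can be sketched as a single predicate. This is an illustration under stated assumptions: FieldInfo and canPromoteStruct are hypothetical names, the HFA and custom-layout check is left out, and overlap is detected after sorting, which is not necessarily how the JIT does it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified description of one struct field.
struct FieldInfo {
    std::size_t offset;
    std::size_t size;
    bool        isPrimitive;
};

// Returns true, and leaves `fields` sorted by offset, when the struct passes
// the size, field-count, primitive-type, alignment and overlap checks.
bool canPromoteStruct(std::size_t structSize, std::vector<FieldInfo>& fields) {
    if (structSize > sizeof(double) * 4) return false;  // too large
    if (fields.size() > 4) return false;                // too many fields
    for (const FieldInfo& f : fields) {
        assert(f.size > 0);
        if (!f.isPrimitive) return false;               // nested struct etc.
        if (f.offset % f.size != 0) return false;       // unusual alignment
    }
    std::sort(fields.begin(), fields.end(),
              [](const FieldInfo& a, const FieldInfo& b) { return a.offset < b.offset; });
    for (std::size_t i = 1; i < fields.size(); ++i) {
        // A field starting before the previous one ends means a union-like layout.
        if (fields[i - 1].offset + fields[i - 1].size > fields[i].offset) return false;
    }
    return true;
}
```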
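If promotion was decided, call lvaPromoteStructVar(lclNum, &structPromotionInfo)
Check whether any field is a float
If so, record it in Compiler::compFloatingPointUsed
LSRA (linear scan register allocation) later tracks the use of floating-point registers
Add each field as a separate local variable
The original struct fields are still kept
fgMarkAddressExposedLocals
Mark every local variable whose address is exposed (passed to another function, or stored into a global variable); these locals can no longer be optimized into registers
For example addr byref
\--* lclVar int (AX) V01 loc1
Also rewrites the fields of promoted structs, changing field into lclVar
Iterate over every stmt in every BasicBlock
Call fgMarkAddrTakenLocalsPreCB
If the tree is a GT_FIELD, check whether it can be replaced with a lclVar
Call fgMorphStructField
If the struct that owns the field has been promoted, change it into a lclVar
If the address of the variable is exposed
Call lvaSetVarAddrExposed
Call fgMarkAddrTakenLocalsPostCB
If compiling in debug mode
lvaStressLclFld
Convert some local variables to TYP_BLK and add padding
Then convert the GT_LCL_VAR nodes of those locals into GT_LCL_FLD
fgStress64RsltMul
Convert intOp1*intOp2 into int(long(nop(intOp1))*long(intOp2))
Only int*int without checked is converted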
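The fgStress64RsltMul rewrite is safe because truncating a 64-bit product to 32 bits gives the same bits as a wrapping 32-bit multiply. The sketch below checks that claim; mulInt32 and mulViaInt64 are hypothetical helper names, and unsigned arithmetic is used so the wraparound is well defined in C++.

```cpp
#include <cassert>
#include <cstdint>

// Plain unchecked 32-bit multiply with wraparound. The final unsigned-to-signed
// conversion is modular on mainstream compilers (and guaranteed since C++20).
std::int32_t mulInt32(std::int32_t a, std::int32_t b) {
    return static_cast<std::int32_t>(static_cast<std::uint32_t>(a) *
                                     static_cast<std::uint32_t>(b));
}

// The stress form: widen both operands to 64 bits, multiply, truncate back.
std::int32_t mulViaInt64(std::int32_t a, std::int32_t b) {
    std::uint64_t wide = static_cast<std::uint64_t>(static_cast<std::int64_t>(a)) *
                         static_cast<std::uint64_t>(static_cast<std::int64_t>(b));
    return static_cast<std::int32_t>(static_cast<std::uint32_t>(wide));
}
```

A checked multiply is different: it must observe the overflow, so the truncating form would hide exactly the condition it is supposed to report, which fits the notes' remark that only unchecked int*int is converted.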
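fgMorphBlocks
Enumerate the BasicBlocks
Set fgGlobalMorph = true
Call fgMorphStmts(block, &mult, &lnot, &loadw)
Enumerate the GenTreeStmts in the BasicBlock
When fgRemoveRestOfBlock is set, remove the remaining expressions in the block
Call fgMorphCombineSIMDFieldAssignments to combine SIMD field assignments
var a = new Vector3(1, 2, 3);
var b = new Vector3();
b.X = a.X; b.Y = a.Y; b.Z = a.Z;
The three assignment expressions are combined into a single simd12 (copy) expression
Call fgMorphTree(tree)
In debug mode with compStressCompile, copy the tree in order to catch reference counts that were not updated
Call optAssertionProp (if optLocalAssertionProp)
Modify nodes according to the assertions (AssertionProp)
Where are the assertions generated?
Assertions are generated in optAssertionPropMain and stored in optAssertionTabPrivate
They are retrieved with optGetAssertion and created with optCreateAssertion
A bitmap marks which assertions hold for a node
Call optAssertionProp_LclVar
If a local variable is known to equal a constant, replace it with that constant
If a local variable is known to equal another local variable, replace it with that variable (e.g. the temporary used for alloc obj)
Call optAssertionProp_Ind
If the operand of the indir is a lclVar and that node is known to be non-null
tree->gtFlags &= ~GTF_EXCEPT // this expression cannot throw
tree->gtFlags |= GTF_ORDER_SIDEEFF // prevent reordering
Call optAssertionProp_BndsChk
If the array index is a constant and is known to be in range, mark the bounds check as unnecessary
arrBndsChk->gtFlags |= GTF_ARR_BOUND_INBND
Call optAssertionProp_Comma
If the bounds check was marked unnecessary above, remove it
(comma bound_check, expr) => (expr)
Call optAssertionProp_Cast
If the cast widens a small-range type to a larger-range type, tree->gtFlags &= ~GTF_OVERFLOW
If the cast narrows a larger-range type to a smaller-range type and is known not to overflow, remove the cast
Call optAssertionProp_Call
If this is known to be non-null
tree->gtFlags &= ~GTF_CALL_NULLCHECK
tree->gtFlags &= ~GTF_EXCEPT
If the call is a JIT helper that casts a type and the cast is known to succeed
Replace the call with its first argument, using comma to attach the side-effect list
Call optAssertionProp_RelOp
If optLocalAssertionProp is set (an optimization option)
If the value of x is known, replace x == const with true or false
Otherwise
If the value of x is known, replace x == const with const == const or !(const == const)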
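If the node's kind & GTK_CONST, call fgMorphConst
Clear the flags tree->gtFlags &= ~(GTF_ALL_EFFECT | GTF_REVERSE_OPS)
These flags may be left over from another node being turned into a const
If the node is a const string
If the jump kind of the containing block is BBJ_THROW, meaning the block rarely runs
Replace the node with a call to the getLazyStringLiteralHelper helper, so that the string object is constructed lazily
Otherwise
Get the string object and build indir \--* const long <object pointer>
If the node's kind & GTK_LEAF, call fgMorphLeaf
If tree->gtOper == GT_LCL_VAR, call fgMorphLocalVar
If the lclVar's type is TYP_BOOL or TYP_SHORT and lvNormalizeOnLoad holds, cast it to int first
If tree->gtOper == GT_LCL_FLD and _TARGET_X86_, call fgMorphStackArgForVarArgs
For arguments passed on the stack to a varargs function, rewrite the access to fetch the argument through the varargs cookie
Not needed on non-x86 platforms
If tree->gtOper == GT_FTN_ADDR
Convert the ftnAddr node into ind \--* const long <function address> or nop \--* const long <function address>
If the node's kind & GTK_SMPOP, call fgMorphSmpOp (simple unary or binary operations)
Call fgMorphForRegisterFP
If the operation is a simple add, subtract, multiply or divide with a floating-point result, cast both operands to the result type
If the operation is a comparison and the operand types differ, cast the float side to double
If it is an assignment GT_ASG
Cast the rhs if a cast is needed
If it needs to become a SIMD copy, generate the SIMD statement
If it is an arithmetic assignment GT_ASG_ADD, GT_ASG_SUB, etc.
If the lhs is a local variable or is not TYP_STRUCT, disable CSE (lvalue)
If it is an address-of GT_ADDR
Disable CSE (lvalue)
If it is GT_QMARK, mark the cond of cond ? x : y with GTF_RELOP_JMP_USED | GTF_DONT_CSE
If it is GT_INDEX, call fgMorphArrayIndex
Rewrite the GT_INDEX node that accesses an array element into a COMMA(GT_ARR_BOUND_CHK, GT_IND) node
If it is GT_CAST, call fgMorphCast
Modify the GT_CAST node, removing it when overflow is known to be impossible
If a helper is required (e.g. long => double), convert it into a helper call
If it is GT_MUL
When multiplying on 32-bit, if the result is long and may overflow, a helper call is required
If it is GT_DIV
When dividing on 32-bit, if the result is long, a helper call is required
If it is GT_UDIV
When dividing on 32-bit, if the result is long, a helper call is required
If it is GT_MOD
If the result type is float, cast the node to double and use a helper call
Otherwise a helper call is used as well, because signed mod needs more handling
If it is GT_UMOD
If the result of a%b is long and b is an int constant between 2 and 0x3fffffff, no helper call is used
Otherwise a helper call is used
On arm there is no instruction for mod, so a % b is rewritten as a - (a / b) * b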
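The identity used on arm, a % b = a - (a / b) * b, holds for truncating integer division, including negative operands. A small check of the identity, with modViaDiv as a hypothetical helper name (the INT_MIN / -1 case is not handled here):

```cpp
#include <cassert>

// Remainder computed without a mod instruction: one divide, one multiply,
// one subtract. Relies on the division truncating toward zero.
int modViaDiv(int a, int b) {
    assert(b != 0);  // division by zero is a separate, throwing path
    return a - (a / b) * b;
}
```

For example, modViaDiv(-17, 5) computes -17 - (-3 * 5) = -2, the same result as -17 % 5.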
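If it is GT_RETURN
If the return value is of a type smaller than int, cast it to int first
If it is GT_EQ or GT_NE
Optimize typeof(...) == obj.GetType() or typeof(...) == typeof(...)
Optimize Nullable<T> == null by reading the hasValue field directly (e.g. IsNull<T>(T arg) { arg == null })
If it is GT_INTRINSIC and the target is arm
If gtIntrinsicId == CORINFO_INTRINSIC_Round, convert it into a helper call
If the cpu does not support floating-point arithmetic, call fgMorphToEmulatedFP
Convert the node into a helper call
If the expression may throw an exception
tree->gtFlags |= GTF_EXCEPT
Process op1
If tree is a QMARK COLON, op1 is the then part
Copy optAssertionTabPrivate to origAssertionTab
Decide which MorphAddrContext fgMorphTree uses for op1
If the current node is GT_ADDR and there was no context, op1 uses a new MACK_Addr context
If the current node is GT_COMMA, op1 uses an empty context
If the current node is GT_ASG and is a block assignment, op1 uses a new MACK_Ind context
If the current node is GT_OBJ, GT_BLK, GT_DYN_BLK or GT_IND, op1 uses a new MACK_Ind context
If the current node is GT_ADD and mac is not null
mac should then be IND or ADDR
If op2 is a constant and adding it does not overflow, add it to m_totalOffset; otherwise set m_allConstantOffsets = false
Call fgMorphTree(op1, subMac1)
If tree is a QMARK COLON, op1 is the then part
Copy optAssertionTabPrivate to thenAssertionTab
Update tree->gtFlags
If tree is not GT_INTRINSIC, or is a GT_INTRINSIC that needs no helper call, gtFlags &= ~GTF_CALL
If tree cannot throw an exception, gtFlags &= ~GTF_EXCEPT
Copy the side-effect flags of op1: tree->gtFlags |= (op1->gtFlags & GTF_ALL_EFFECT)
If tree is GT_ADDR and op1 is GT_LCL_VAR or GT_CLS_VAR, gtFlags &= ~GTF_GLOB_REF (no global reference)
Process op2
If tree is a QMARK COLON, op2 is the else part
Copy origAssertionTab back to optAssertionTabPrivate
Decide which MorphAddrContext fgMorphTree uses for op2
If the current node is GT_ADD and mac is MACK_Ind
Check whether op1 is a constant; if so add it to m_totalOffset, otherwise set m_allConstantOffsets = false
If the current node is GT_ASG and is a block assignment, op2 uses a new MACK_Ind context
Call fgMorphTree(op2, mac)
Update tree->gtFlags
Copy the side-effect flags of op2: tree->gtFlags |= (op2->gtFlags & GTF_ALL_EFFECT)
If tree is a QMARK COLON, op2 is the else part
Merge the AssertionDsc of the then part and the else part
An assertion is kept if it exists in both, otherwise it is removed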
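On non-64-bit targets (long)(x shift non_const) is converted into a helper call; set gtFlags |= GTF_CALL
If the type of tree is a GC type (ref byref array) but op1 and op2 are both non-GC types
If tree is GT_COMMA, change the type of tree to the type of op2, otherwise to the type of op1
Call gtFoldExpr(tree)
Simplify the tree
If tree is a unary op and op1 is a constant, return gtFoldExprConst(tree)
If tree is a binary op (comparisons are left alone when compiling debug code)
If op1 and op2 are both constants and tree is not an atomic op, return gtFoldExprConst(tree)
If one of op1 and op2 is a constant, e.g. op1+0 or op1*1, return op1; for op1==null return !op1
If op1 equals op2, e.g. a==a or a+b==b+a, return true or false
If both sides of a QMARK COLON are identical, convert it into a COMMA
If the result is any one of op1, op2, qmarkOp1, qmarkOp2, tree can be returned directly
If the result is a throw, fgMorphTree(tree->gtOp.gtOp1) is needed
If the result differs from the original tree, or the original tree is a variable, tree can be returned directly
If tree is a comparison and op2 is 0
Call op1->gtRequestSetFlags() to set gtFlags |= GTF_SET_FLAGS
Do the post-order morphing according to oper
GT_ASG
If op1 is a const, convert it into (ind const): 0x123 = 1 => *0x123 = 1
The cast can be omitted when copying small types (bool~ushort)
If op2 is a comparison and op1 is a byte, no extra padding is needed (op2->gtType = TYP_BYTE)
If CSE turned op1 into a lclVar, change the type of op2 from TYP_BYTE to the type of op1
Set gtFlags |= GTF_VAR_DEF on the local variable that op1 of tree refers to
GT_ASG_ADD, GT_ASG_SUB, ..., GT_ASG_RSZ
Do not apply CSE to the left side of the assignment: op1->gtFlags |= GTF_DONT_CSE
GT_EQ, GT_NE
Convert (expr +/- icon1) ==/!= (non-zero-icon2)
For example "x+icon1==icon2" into "x==icon2-icon1"
Convert "(== (comma x (op 1 2)) 0)" into "((rev op) (comma x 1) 2)"
Convert "(== (comma (= tmp (op 1 2)) tmp) 0)" into "(== (op 1 2) 0)"
Convert "(== (op 1 2) 0)" into "((rev op) 1 2)"
GT_LT, GT_LE, GT_GE, GT_GT
If x is of type int
Convert "x >= 1" into "x > 0"
Convert "x < 1" into "x <= 0"
Convert "x <= -1" into "x < 0"
Convert "x > -1" into "x >= 0"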
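The four integer comparison rewrites above all move the constant to 0, which is cheaper to compare against. A minimal sketch of the rewrite table, with RelOp, Cmp, normalizeAgainstZero and eval as hypothetical names used only for this illustration:

```cpp
#include <cassert>

enum class RelOp { LT, LE, GE, GT };

// "x <op> cns" for an int x.
struct Cmp {
    RelOp op;
    int   cns;
};

// Rewrite comparisons against 1 or -1 into equivalent comparisons against 0.
// Anything else is returned unchanged.
Cmp normalizeAgainstZero(Cmp c) {
    if (c.cns == 1 && c.op == RelOp::GE) return {RelOp::GT, 0};   // x >= 1  => x > 0
    if (c.cns == 1 && c.op == RelOp::LT) return {RelOp::LE, 0};   // x < 1   => x <= 0
    if (c.cns == -1 && c.op == RelOp::LE) return {RelOp::LT, 0};  // x <= -1 => x < 0
    if (c.cns == -1 && c.op == RelOp::GT) return {RelOp::GE, 0};  // x > -1  => x >= 0
    return c;
}

// Evaluate a comparison for a concrete x, to check that a rewrite is equivalent.
bool eval(Cmp c, int x) {
    switch (c.op) {
        case RelOp::LT: return x < c.cns;
        case RelOp::LE: return x <= c.cns;
        case RelOp::GE: return x >= c.cns;
        case RelOp::GT: return x > c.cns;
    }
    return false;
}
```

The equivalence only holds for integers, which is why the notes restrict the rewrite to x of type int.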
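GT_QMARK
Hoist a common assignment target, e.g. convert (cond?(x=a):(x=b)) into (x=(cond?a:b))
If then and else are both nop, return cond
If else is nop, convert (cond then else) into ((rev cond) else then)
Convert (cond)?0:1 into cond
https://github.com/dotnet/coreclr/issues/12383
GT_MUL
If the operation checks for overflow (checked), call
fgAddCodeRef(compCurBB, bbThrowIndex(compCurBB),
SCK_OVERFLOW, fgPtrArgCntCur)
Check whether a throw basic block of this kind has already been created for the current BasicBlock
If not, create it
The remaining handling is the same as for GT_OR, GT_XOR, GT_AND
Change "(x * 0)" into "0", using COMMA if there are side effects
Check whether op2 is a power of two
If it is negative, change op1 into (neg op1), and TODO
If op2 is a constant and op1 is also a constant, copy the gtFieldSeq of op2 to op1
When multiplying by 1, return op1
Change op2 into log2(op2) and set changeToShift (oper is changed to GT_LSH later)
For example, convert 7 * 8 into 7 << 3
Check whether lowestBit is 1, 2, 4 or 8 and op2 >> log2(lowestBit) is 3, 5 or 9
shift = genLog2(lowestBit)
factor = abs_mult >> shift
Change op1 into op1 * factor and set changeToShift (oper is changed to GT_LSH later)
For example, convert 7 * 72 into (7 * 9) << 3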
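The two multiply rewrites above (pure power of two, and a factor of 3, 5 or 9 times a small power of two) can be sketched as a decomposition of the constant. MulPlan and planMul are hypothetical names; the sketch follows the conditions as written in these notes, skips negative multipliers, and does not model the target-specific gating the real JIT applies.

```cpp
#include <cassert>
#include <cstdint>

// x * c is rewritten as (x * factor) << shift.
struct MulPlan {
    std::int64_t factor;
    int          shift;
};

// Try to decompose a positive constant multiplier. Returns false when the
// multiply should be left as it is.
bool planMul(std::int64_t c, MulPlan* out) {
    assert(out != nullptr);
    if (c <= 0) return false;                // negative case not covered here
    std::int64_t lowestBit = c & -c;         // lowest set bit of c
    int shift = 0;
    while ((std::int64_t{1} << shift) < lowestBit) ++shift;  // log2(lowestBit)
    std::int64_t factor = c >> shift;
    if (factor == 1) {                       // c is a power of two: x << shift
        *out = {1, shift};
        return true;
    }
    bool scaleOk = lowestBit == 1 || lowestBit == 2 || lowestBit == 4 || lowestBit == 8;
    if (scaleOk && (factor == 3 || factor == 5 || factor == 9)) {
        *out = {factor, shift};              // e.g. 72 = 9 << 3
        return true;
    }
    return false;
}
```

The factors 3, 5 and 9 are of the form 1 + 2^k, which is why multiplying by them is cheap on x86-style address arithmetic; that reading is an assumption about the motivation, not something the notes state.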
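GT_SUB
Change "op1 - cns2" into "op1 + (-cns2)"
Change "cns1 - op2" into "(cns1 + (-op2))"
As for GT_MUL, check for overflow and add the BasicBlock that throws on overflow
The remaining handling is the same as for GT_OR, GT_XOR, GT_AND
GT_DIV
ARM only: if the type is not float, add the BasicBlock that throws on overflow or division by zero
GT_UDIV
ARM only: add the BasicBlock that throws on division by zero
GT_ADD
Change "((x+icon1)+(y+icon2))" into "((x+y)+(icon1+icon2))"
Change "((x+icon1)+icon2)" into "(x+(icon1+icon2))"
Change "(x + 0)" into "x"
As for GT_MUL, check for overflow and add the BasicBlock that throws on overflow
The remaining handling is the same as for GT_OR, GT_XOR, GT_AND
GT_OR, GT_XOR, GT_AND
If op1 is a constant and is not a ref, swap op1 and op2 (the constant goes into op2)
If oper is GT_OR or GT_XOR, call fgRecognizeAndMorphBitwiseRotation
Convert certain special patterns into a circular shift (rotate)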
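A typical pattern that can be recognized as a circular shift is an OR of a left shift and a complementary right shift of the same value. The notes do not list the exact patterns the JIT accepts, so the following only illustrates the general idiom; rotlViaShifts is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// Rotate-left written as the shift/OR idiom: (x << n) | (x >> (32 - n)).
// The n == 0 case is handled separately because shifting a 32-bit value
// by 32 is undefined behavior in C++.
std::uint32_t rotlViaShifts(std::uint32_t x, unsigned n) {
    n &= 31;
    if (n == 0) return x;
    assert(n > 0 && n < 32);
    return (x << n) | (x >> (32 - n));
}
```

Because the two shifted halves never share a set bit, OR and XOR give the same result here, which is consistent with the notes mentioning both GT_OR and GT_XOR.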
GT_CHS, GT_NOT, GT_NEG
If optimization is enabled and the phase is not optValnumCSE_phase, assert that op1 is not a constant (already folded)
GT_CKFINITE
Create a throw basic block of kind SCK_ARITH_EXCPN for the current BasicBlock
GT_OBJ
If the X in GT_OBJ(GT_ADDR(X)) has GTF_GLOB_REF
Set gtFlags |= GTF_GLOB_REF on the current node
GT_IND
Change "*(&X)" into "X"
Change "*(&lcl + cns)" into "lcl[cns]" (GT_LCL_FLD)
Change "IND(COMMA(x, ..., z))" into "COMMA(x, ..., IND(z))"
GT_ADDR
Change "ADDR(IND(...))" into "(...)"
Change "ADDR(OBJ(...))" into "(...)"
Change "ADDR(COMMA(x, ..., z))" into "COMMA(x, ..., ADDR(z))"
GT_COLON
Call fgWalkTreePre(&tree, gtMarkColonCond)
Mark the nodes under the QMARK COLON with gtFlags |= GTF_COLON_COND
GT_COMMA
If op2 produces no value, set typ = tree->gtType = TYP_VOID
Extract the side-effect list of op1 (gtExtractSideEffList)
If there is one, replace op1 with that side-effect list
If there is none, return op2
If op2 is a void nop and op1 is void, return op1
GT_JTRUE
If fgRemoveRestOfBlock, convert into COMMA(op1, nop)
If the phase is not optValnumCSE_phase, and oper is not GT_ASG, GT_COLON or GT_LIST (arglist)
If op1 is a COMMA and the op1 of op1 is a throw
Set fgRemoveRestOfBlock
If tree is a COMMA, set op1 to the throw and return tree
If the type of tree equals the type of op1, return op1
If the type of tree is void, return the throw
Otherwise set op1->gtType = commaOp2->gtType = the type of tree, and return op1
If op2 is a COMMA and the op1 of op2 is a throw
Set fgRemoveRestOfBlock
If op1 has no side effects
If tree is an assignment, return the op1 of op2 (the throw)
If tree is GT_ARR_BOUNDS_CHECK, return the op1 of op2 (the throw)
If tree is a COMMA, return the op1 of op2 (the throw)
Adjust the type of op2 if necessary and return op2
If CLFLG_TREETRANS (an optimization option) is enabled, call fgMorphSmpOpOptional
Check whether oper is OperIsCommutative (commutative)
If tree->gtFlags & GTF_REVERSE_OPS, swap op1 and op2
Change "(a op (b op c))" into "((a op b) op c)"
Change "((x+icon)+y)" into "((x+y)+icon)"
Convert "a = a <op> x" into "a <op>= x"
Convert "a = x <op> a" into "a <op>= x" if the operation is commutative
Convert "(val + icon) * icon" into "(val * icon) + (icon * icon)"
Convert "val / 1" into "val"
Convert "(val + icon) << icon" into "(val << icon) + (icon << icon)"
Convert "x ^ -1" into "~x"
If the node's tree->OperGet() == GT_FIELD, call fgMorphField
Convert the field node into an ind node
(field (lclVar V00) member) =>
(comma (nullcheck (lclVar V00)) (indir (+ (lclVar V00) (const long 8))))
If mac.m_totalOffset + fldOffset <= MAX_UNCHECKED_OFFSET_FOR_NULL_OBJECT, the nullcheck can be omitted
If the node's tree->OperGet() == GT_CALL, call fgMorphCall
Rewrite each argument of the call; an argument that is not a simple expression has to be saved in a temporary variable
If tail-call optimization is possible, remove the return after the call; other changes are still needed
If canFastTailCall
compCurBB->bbFlags |= BBF_HAS_JMP
Otherwise
compCurBB->bbJumpKind = BBJ_THROW
If the call produces a result (is not void), return an empty node (placeholder); the GT_RETURN node above uses this empty node
If the node's tree->OperGet() == GT_ARR_BOUNDS_CHECK or GT_SIMD_CHK, call fgSetRngChkTarget
If delay or inlining, defer the handling; otherwise
Create the BasicBlock that calls CORINFO_HELP_RNGCHKFAIL (fgRngChkTarget)
Set gtIndRngFailBB of tree to gtNewCodeRef(rngErrBlk)
If the node's tree->OperGet() == GT_ARR_ELEM
tree->gtArrElem.gtArrObj = fgMorphTree(tree->gtArrElem.gtArrObj)
Call fgSetRngChkTarget(tree, false), as above
If the node's tree->OperGet() == GT_ARR_OFFSET
tree->gtArrOffs.gtOffset = fgMorphTree(tree->gtArrOffs.gtOffset)
tree->gtArrOffs.gtIndex = fgMorphTree(tree->gtArrOffs.gtIndex)
tree->gtArrOffs.gtArrObj = fgMorphTree(tree->gtArrOffs.gtArrObj)
Call fgSetRngChkTarget(tree, false), as above
If the node's tree->OperGet() == GT_CMPXCHG
tree->gtCmpXchg.gtOpLocation = fgMorphTree(tree->gtCmpXchg.gtOpLocation)
tree->gtCmpXchg.gtOpValue = fgMorphTree(tree->gtCmpXchg.gtOpValue)
tree->gtCmpXchg.gtOpComparand = fgMorphTree(tree->gtCmpXchg.gtOpComparand)
If the node's tree->OperGet() == GT_STORE_DYN_BLK
tree->gtDynBlk.Data() = fgMorphTree(tree->gtDynBlk.Data())
If the node's tree->OperGet() == GT_DYN_BLK
tree->gtDynBlk.Addr() = fgMorphTree(tree->gtDynBlk.Addr())
tree->gtDynBlk.gtDynamicSize = fgMorphTree(tree->gtDynBlk.gtDynamicSize)
Call fgMorphTreeDone(tree, oldTree DEBUGARG(thisMorphNum))
If optAssertionCount is non-zero and tree is an assignment to a local variable, call fgKillDependentAssertions
Remove the assertions for the local variable and for the local variables promoted from it
Call optAssertionGen
Create new assertions from tree
If it is GT_ASG, OAK_EQUAL(op1, op2)
If it is GT_NULLCHECK or GT_ARR_LENGTH, OAK_NOT_EQUAL(op1, nullptr)
If it is GT_ARR_BOUNDS_CHECK, OAK_NOT_EQUAL(tree, nullptr)
If it is GT_ARR_ELEM, OAK_NOT_EQUAL(tree->gtArrElem.gtArrObj, nullptr)
If it is GT_CALL and gtFlags & GTF_CALL_NULLCHECK, OAK_NOT_EQUAL(thisArg, nullptr)
If it is GT_CAST, OAK_SUBRANGE(op1, tree)
If it is GT_JTRUE, call optAssertionGenJtrue
assertionKind = GT_EQ ? OAK_EQUAL, GT_NE ? OAK_NOT_EQUAL
Call optCreateJtrueAssertions(op1, op2, assertionKind);
If the call node was changed into a return, tail-call optimization was applied
Check here whether the original call was a tail call
If compCurBB was modified, tail-call optimization was applied
Check here whether the original call was a tail call
In debug mode with compStressCompile, copy the tree in order to catch reference counts that were not updated
If morph is a COMMA and op1 is a throw, then morph = op1 and fgRemoveRestOfBlock = true
If fgRemoveRestOfBlock, skip the rest of the processing and return
Call fgCheckRemoveStmt; if the stmt has no side effects, remove it, skip the rest of the processing and return
Call fgFoldConditional, TODO
Call ehBlockHasExnFlowDsc, TODO
Detect consecutive += or -=; if found, set *mult = true
Detect "x = a[i] & icon; x |= a[i] << 8"; if found, set *loadw = true (the detection looks unfinished)
If fgRemoveRestOfBlock and block->bbJumpKind is BBJ_COND or BBJ_SWITCH
If bbJumpKind == BBJ_COND and lastStmt->gtOper == GT_JTRUE
or bbJumpKind == BBJ_SWITCH and lastStmt->gtOper == GT_SWITCH
Set last->gtStmt.gtStmtExpr = fgMorphTree(op1) (drop the branch and keep only the condition)
Call fgConvertBBToThrowBB
Set block->bbJumpKind = BBJ_THROW
If endsWithTailCallConvertibleToLoop (whether the tail call can be turned into a loop)
Call fgMorphRecursiveFastTailCallIntoLoop
Pull each argument out of the call, for example
call(arg0 - 1, arg1, tmp0 = concat(arg1, arg2))
is converted into
tmp0 = concat(arg1, arg2)
tmp1 = arg0 - 1
arg2 = tmp0
arg0 = tmp1
Remove the call, change the jumpKind of the block to BBJ_ALWAYS, and jump to the first non-scratch block
Set fgRemoveRestOfBlock = false
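The effect of turning a recursive tail call into a loop can be shown at source level. In the sketch below, gcdRecursive and gcdLoop are hypothetical example functions; the loop version evaluates every new argument into a temporary before overwriting any parameter, which is the same ordering as the tmp0/tmp1 example above.

```cpp
#include <cassert>

// Recursive form: the self-call is in tail position.
unsigned gcdRecursive(unsigned a, unsigned b) {
    if (b == 0) return a;
    return gcdRecursive(b, a % b);
}

// Loop form: the call is replaced by parameter assignments and a jump back
// to the start of the method body.
unsigned gcdLoop(unsigned a, unsigned b) {
    for (;;) {
        if (b == 0) return a;
        unsigned tmp0 = b;      // evaluate all arguments first...
        unsigned tmp1 = a % b;
        a = tmp0;               // ...then overwrite the parameters
        b = tmp1;
    }
}
```

Assigning a = b directly before computing a % b would use the already overwritten a, which is why the temporaries are needed.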
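Combine consecutive += and -=
From #if OPT_MULT_ADDSUB to #endif
If oneReturn was set earlier, merge all return blocks into genReturnBB
If there is a return value, it has to be stored in genReturnLocal
Set fgGlobalMorph = false
fgSetOptions
Set the codeGen options, including
genInterruptible: whether to generate fully interruptible code, used by the debugger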