Unity mono code structure analysis and reading (6): IL bytecode decoding and translation

内托体头 · Posted on 2020-12-15 09:35

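This is the sixth post in the series on reading the Unity mono source code.
The previous post gave a rough analysis of the mono runtime's overall framework. This post takes two representative opcodes, Add and Call, and looks closely at how a CIL instruction is decoded inside the mono CLR and translated into machine code for the target platform.
All right, let's get started.
We start with the simple one, add: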
ldloc num1
ldloc num2
// add the two values
add
Searching the source for CEE_ADD finds it right away:
case CEE_ADD:
case CEE_SUB:
case CEE_DIV:
case CEE_DIV_UN:
case CEE_REM:
case CEE_REM_UN:
case CEE_AND:
case CEE_OR:
case CEE_XOR:
case CEE_SHL:
case CEE_SHR:
case CEE_SHR_UN:
CHECK_STACK (2);

MONO_INST_NEW (cfg, ins, (*ip));
sp -= 2;
ins->sreg1 = sp [0]->dreg;
ins->sreg2 = sp [1]->dreg;
type_from_op (ins, sp [0], sp [1]);
CHECK_TYPE (ins);
ADD_WIDEN_OP (ins, sp [0], sp [1]);
ins->dreg = alloc_dreg ((cfg), (ins)->type);
/* FIXME: Pass opcode to is_inst_imm */
/* Use the immediate opcodes if possible */
if (((sp [1]->opcode == OP_ICONST) || (sp [1]->opcode == OP_I8CONST)) && mono_arch_is_inst_imm (sp [1]->opcode == OP_ICONST ? sp [1]->inst_c0 : sp [1]->inst_l)) {
int imm_opcode;

imm_opcode = mono_op_to_op_imm_noemul (ins->opcode);
if (imm_opcode != -1) {
                           //...
}
}
MONO_ADD_INS ((cfg)->cbb, (ins));

*sp++ = mono_decompose_opcode (cfg, ins);
ip++;
break;
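This code is fairly simple, and it covers many of the basic operations. What does it actually do? It stores the current opcode in a structure called MonoInst, which carries the source operands (sreg1, sreg2), the destination register (dreg) and the type information. For plain arithmetic and logic operations such as Add and And, mono also tries to switch to the immediate opcodes when the second operand is a constant.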
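To make the pattern in the CEE_ADD case concrete, here is a minimal sketch of the same idea: check the stack, pop two entries, wire their destination registers into a new instruction, and push the result. All of the Toy* names are hypothetical stand-ins invented for illustration. They are not Mono's types, and the sketch leaves out type checking, widening and the immediate-opcode path.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for MonoInst and its opcodes. */
enum { TOY_OP_ICONST, TOY_OP_IADD };

typedef struct ToyInst {
    int opcode;
    int dreg;          /* virtual register that holds the result */
    int sreg1, sreg2;  /* source virtual registers, -1 if unused */
    int imm;           /* constant value for TOY_OP_ICONST */
} ToyInst;

typedef struct ToyCompile {
    ToyInst  code[64];   /* emitted instructions, in order */
    int      ninsts;
    ToyInst *stack[16];  /* evaluation stack, modeled at compile time */
    int      sp;
    int      next_vreg;
} ToyCompile;

static ToyInst *toy_new_inst(ToyCompile *cfg, int opcode)
{
    assert(cfg->ninsts < 64);
    ToyInst *ins = &cfg->code[cfg->ninsts++];
    ins->opcode = opcode;
    ins->dreg = cfg->next_vreg++;   /* every result gets a fresh vreg */
    ins->sreg1 = ins->sreg2 = -1;
    ins->imm = 0;
    return ins;
}

/* Load a constant: emit an instruction and push it on the stack. */
static void toy_emit_const(ToyCompile *cfg, int value)
{
    ToyInst *ins = toy_new_inst(cfg, TOY_OP_ICONST);
    ins->imm = value;
    assert(cfg->sp < 16);
    cfg->stack[cfg->sp++] = ins;
}

/* Same shape as the CEE_ADD case: check, pop two, wire sregs, push. */
static ToyInst *toy_emit_add(ToyCompile *cfg)
{
    assert(cfg->sp >= 2);                 /* CHECK_STACK (2) */
    cfg->sp -= 2;                         /* sp -= 2 */
    ToyInst *a = cfg->stack[cfg->sp];
    ToyInst *b = cfg->stack[cfg->sp + 1];
    ToyInst *ins = toy_new_inst(cfg, TOY_OP_IADD);
    ins->sreg1 = a->dreg;
    ins->sreg2 = b->dreg;
    cfg->stack[cfg->sp++] = ins;          /* *sp++ = ins */
    return ins;
}
```

After two constants are pushed and toy_emit_add is called, the evaluation stack holds a single entry whose sources are the two constants' registers. The stack machine has been turned into three-address form.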
One function in there deserves attention: mono_decompose_opcode.
/*
* mono_decompose_opcode:
*
* Decompose complex opcodes into ones closer to opcodes supported by
* the given architecture.
* Returns a MonoInst which represents the result of the decomposition, and can
* be pushed on the IL stack. This is needed because the original instruction is
* nullified.
* Sets the cfg exception if an opcode is not supported.
*/
MonoInst*
mono_decompose_opcode (MonoCompile *cfg, MonoInst *ins)
{
MonoInst *repl = NULL;
int type = ins->type;
int dreg = ins->dreg;
      ...
}

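The gist is that it decomposes the incoming opcode into opcodes closer to what the current architecture (x86, for example) supports. The main idea is still to specialize generic code at runtime so that it can be optimized.
So now we have a rough picture: inside mono_method_to_ir, ordinary opcodes are decoded one by one into MonoInst structures. MonoInst is presumably short for "mono instruction".
Up to here everything matches what we guessed: mono translates each opcode into a MonoInst, and then mono_codegen calls mono_arch_output_basic_block to generate machine code for the current architecture.
For a simple operation like ADD, mono also considers whether to turn the stack-style operation of the CLR virtual machine into a register operation, and these register operations are of course platform-specific.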
/*
* mono_peephole_pass_1:
*
* Perform peephole opts which should/can be performed before local regalloc
*/
void
mono_arch_peephole_pass_1 (MonoCompile *cfg, MonoBasicBlock *bb)
{
MonoInst *ins, *n;

MONO_BB_FOR_EACH_INS_SAFE (bb, n, ins) {
MonoInst *last_ins = ins->prev;

switch (ins->opcode) {
case OP_IADD_IMM:
case OP_ADD_IMM:
if ((ins->sreg1 < MONO_MAX_IREGS) && (ins->dreg >= MONO_MAX_IREGS)) {
/*
   * X86_LEA is like ADD, but doesn't have the
   * sreg1==dreg restriction.
   */
ins->opcode = OP_X86_LEA_MEMBASE;
ins->inst_basereg = ins->sreg1;
} else if ((ins->inst_imm == 1) && (ins->dreg == ins->sreg1))
ins->opcode = OP_X86_INC_REG;
break;
                     ...
            }
}

void
mono_arch_output_basic_block (MonoCompile *cf, MonoBasicBlock *bb)
{
.....
case OP_X86_INC_REG:
x86_inc_reg (code, ins->dreg);
break;
....
}
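For example, on x86 the OP_ADD_IMM we just looked at is rewritten to OP_X86_INC_REG when the immediate is 1 and the destination register is the same as the source register. mono_arch_output_basic_block then emits it through x86_inc_reg to improve efficiency.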
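The rewrite rule is small enough to sketch on its own. The following is a toy peephole pass, assuming a made-up instruction record (ToyIns) and made-up opcode names. It mirrors only the "add 1 to the same register becomes an increment" branch of the listing above, not the LEA case.

```c
#include <assert.h>

/* Hypothetical toy opcodes; the real ones are OP_ADD_IMM and OP_X86_INC_REG. */
enum { TOY_ADD_IMM, TOY_INC_REG, TOY_OTHER };

typedef struct {
    int opcode;
    int dreg, sreg1;
    int imm;
} ToyIns;

/* Rewrite "dreg = dreg + 1" into an increment; return the number of rewrites. */
static int toy_peephole(ToyIns *code, int n)
{
    assert(n >= 0);
    int rewritten = 0;
    for (int i = 0; i < n; i++) {
        ToyIns *ins = &code[i];
        if (ins->opcode == TOY_ADD_IMM && ins->imm == 1 && ins->dreg == ins->sreg1) {
            ins->opcode = TOY_INC_REG;   /* shorter encoding, same result */
            rewritten++;
        }
    }
    return rewritten;
}
```

An add whose destination differs from its source, or whose immediate is not 1, is left untouched, which matches the conditions in the listing.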
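With ADD done, we have a rough understanding of how mono decodes: a decoded ordinary operation is stored in a MonoInst, and the instructions held in MonoInst structures are eventually replaced by platform-specific machine opcodes.
If the CIL we need to decode contained only simple logical operations or simple data manipulation, the analysis could end here. But CIL is more than that: it also contains function calls and jumps.
For example, the TestCPlus.cs we used for testing in the previous post is compiled to the following CIL: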
.method private hidebysig static void  Main() cil managed
{
  // Code size    11 (0xb)
  .maxstack  8
  IL_0000:  ldstr    "Hello World."
  IL_0005:  call    void [mscorlib]System.Console::WriteLine(string)
  IL_000a:  ret
} // end of method Program::Main

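After ldstr pushes "Hello World." onto the stack, the call instruction invokes [mscorlib]System.Console::WriteLine.
The related decoding logic can be looked up in the file Opcode.def: the CIL call instruction is translated into CEE_CALL inside the mono runtime. Let's do a quick search in mono_method_to_ir.
Sure enough, it is there: CEE_CALL, CEE_CALLI and CEE_CALLVIRT are handled in the same switch case.
If you care to leaf through this code, you will find it rather frightening:
from this case to the next case there are a full 686 lines.
Fortunately we have some groundwork from the ADD instruction above, so let's be patient and go step by step. CALL really is complicated, though, so it may be easier to start from jumps, conditional branches, ret and throw.
Readers may want to try reading those first.
All right, let's formally start analyzing Call-style transfers of control.
mono describes this kind of control transfer with basic blocks (the MonoBasicBlock structure). The mono_arch_output_basic_block we met above is likewise called in a loop inside codegen, once for each basic block in the chain: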
void
mono_codegen (MonoCompile *cfg)
{
   ....
      /* emit code all basic blocks */
for (bb = cfg->bb_entry; bb; bb = bb->next_bb) {
bb->native_offset = cfg->code_len;
//if ((bb == cfg->bb_entry) || !(bb->region == -1 && !bb->dfn))
mono_arch_output_basic_block (cfg, bb);

if (bb == cfg->bb_exit) {
cfg->epilog_begin = cfg->code_len;

if (cfg->prof_options & MONO_PROFILE_ENTER_LEAVE) {
code = cfg->native_code + cfg->code_len;
code = mono_arch_instrument_epilog (cfg, mono_profiler_method_leave, code, FALSE);
cfg->code_len = code - cfg->native_code;
g_assert (cfg->code_len < cfg->code_size);
}

mono_arch_emit_epilog (cfg);
}
}
      ....
}
So what is a basic block?
/*
* The IR-level extended basic block.
*
* A basic block can have multiple exits just fine, as long as the point of
* 'departure' is the last instruction in the basic block. Extended basic
* blocks, on the other hand, may have instructions that leave the block
* midstream. The important thing is that they cannot be _entered_
* midstream, ie, execution of a basic block (or extened bb) always start
* at the beginning of the block, never in the middle.
*/
struct MonoBasicBlock {
MonoInst *last_ins;

/* the next basic block in the order it appears in IL */
MonoBasicBlock *next_bb;

/*
   * Before instruction selection it is the first tree in the
   * forest and the first item in the list of trees. After
   * instruction selection it is the first instruction and the
   * first item in the list of instructions.
   */
MonoInst *code;
      ....
}
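Translated literally, this is an "IR-level extended basic block". A basic block can have several exits, and an extended basic block may even be left from the middle, but execution can never enter a block in the middle. Honestly, this description reminds me of a very standard kind of diagram.
The output of IDA is also made up of such blocks, and mono is similar: it groups the operations into basic blocks. Ordinary opcodes normally do not start a new basic block, because they contain no branch or jump; only a handful of opcodes should need one. Let's start from the basic blocks and see under what circumstances a new one is created.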
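As a rough illustration of where block boundaries come from, here is the textbook "leader" rule on a toy instruction stream: the first instruction, every branch target, and every instruction that follows a branch or return starts a new block. This is a sketch of the general technique under made-up types (ToyIL); it is not claimed to be how mono itself splits a method.

```c
#include <assert.h>
#include <stdbool.h>

enum { TOY_PLAIN, TOY_BR, TOY_BRCOND, TOY_RET };

typedef struct {
    int kind;
    int target;   /* index of the branch target, unused otherwise */
} ToyIL;

/* Mark the first instruction of every basic block; return the block count. */
static int toy_find_leaders(const ToyIL *il, int n, bool *leader)
{
    assert(n > 0);
    for (int i = 0; i < n; i++)
        leader[i] = false;
    leader[0] = true;                       /* method entry */
    for (int i = 0; i < n; i++) {
        if (il[i].kind == TOY_BR || il[i].kind == TOY_BRCOND) {
            assert(il[i].target >= 0 && il[i].target < n);
            leader[il[i].target] = true;    /* a branch target starts a block */
        }
        if (il[i].kind != TOY_PLAIN && i + 1 < n)
            leader[i + 1] = true;           /* so does whatever follows a branch or ret */
    }
    int blocks = 0;
    for (int i = 0; i < n; i++)
        blocks += leader[i] ? 1 : 0;
    return blocks;
}
```

Plain instructions never set a leader flag, which is the sense in which ordinary opcodes do not introduce basic blocks.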
/**
* link_bblock: Links two basic blocks
*
* links two basic blocks in the control flow graph, the 'from'
* argument is the starting block and the 'to' argument is the block
* the control flow ends to after 'from'.
*/
static void
link_bblock (MonoCompile *cfg, MonoBasicBlock *from, MonoBasicBlock* to)
{
MonoBasicBlock **newa;
int i, found;
}
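There is a key function here. Since the blocks form a linked structure, mono has a single function, link_bblock, for connecting them. Looking up the references to link_bblock quickly confirms our guess: it appears only in conditional branches, unconditional jumps, the various calls, and the various exception-handling paths.
That makes sense: only these operations change the control flow and therefore need a new basic block to describe it.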
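The listing above elides the body of link_bblock, so the following is only a guess at the general shape of such a function: record the edge once in the successor list of 'from' and once in the predecessor list of 'to'. The ToyBlock type, its field names and the fixed-size arrays are assumptions made to keep the sketch short.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_EDGES 8

typedef struct ToyBlock {
    struct ToyBlock *out_bb[TOY_MAX_EDGES];  /* successors */
    struct ToyBlock *in_bb[TOY_MAX_EDGES];   /* predecessors */
    int out_count, in_count;
} ToyBlock;

static bool toy_contains(ToyBlock *const *list, int count, const ToyBlock *bb)
{
    for (int i = 0; i < count; i++)
        if (list[i] == bb)
            return true;
    return false;
}

/* Record the edge from -> to once, in both directions. */
static void toy_link_bblock(ToyBlock *from, ToyBlock *to)
{
    if (!toy_contains(from->out_bb, from->out_count, to)) {
        assert(from->out_count < TOY_MAX_EDGES);
        from->out_bb[from->out_count++] = to;
    }
    if (!toy_contains(to->in_bb, to->in_count, from)) {
        assert(to->in_count < TOY_MAX_EDGES);
        to->in_bb[to->in_count++] = from;
    }
}
```

Linking the same pair twice leaves the graph unchanged, so a caller can link on every branch it decodes without tracking which edges already exist.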
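mono_arch_output_basic_block is a very important function: it is where every opcode is translated into platform-specific code.
This brings us back to the question raised above: how does this function handle an address jump or a Call?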
case OP_BR:
if (ins->inst_target_bb->native_offset) {
x86_jump_code (code, cfg->native_code + ins->inst_target_bb->native_offset);
} else {
mono_add_patch_info (cfg, offset, MONO_PATCH_INFO_BB, ins->inst_target_bb);
if ((cfg->opt & MONO_OPT_BRANCH) &&
      x86_is_imm8 (ins->inst_target_bb->max_offset - cpos))
x86_jump8 (code, 0);
else
x86_jump32 (code, 0);
}
break;
case OP_BR_REG:
x86_jump_reg (code, ins->sreg1);
break;
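Look at OP_BR, the unconditional branch. If the target block's native_offset is already set, the target basic block has already been emitted and its position in the native code is known (mono_codegen assigns bb->native_offset right before it emits each block), so the code emits the platform jump straight to that address. Otherwise the address is not known yet, so the translator emits a jump with a placeholder operand of 0 and uses mono_add_patch_info to record everything about the pending jump in a patch_info. Targets that are not basic blocks, such as a managed method being called, are recorded through the same patch mechanism. So where is this patch_info finally processed?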
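The "emit a placeholder now, fix it later" idea can be sketched in a few lines. The emitter below is a toy: code is an array of ints rather than x86 bytes, a jump stores an absolute offset, and an unemitted block is marked with -1 instead of 0. All names are hypothetical.

```c
#include <assert.h>

#define TOY_JMP 0x100   /* hypothetical opcode: next slot holds the absolute target */
#define TOY_NOP 0x101

typedef struct { int native_offset; } ToyBB;          /* -1 until the block is emitted */
typedef struct { int slot; const ToyBB *target; } ToyPatch;

typedef struct {
    int code[64];
    int len;
    ToyPatch patches[16];
    int npatches;
} ToyEmitter;

static void toy_begin_block(ToyEmitter *e, ToyBB *bb)
{
    bb->native_offset = e->len;           /* like bb->native_offset = cfg->code_len */
}

static void toy_emit_nop(ToyEmitter *e)
{
    assert(e->len < 64);
    e->code[e->len++] = TOY_NOP;
}

static void toy_emit_jump(ToyEmitter *e, const ToyBB *target)
{
    assert(e->len + 2 <= 64);
    e->code[e->len++] = TOY_JMP;
    if (target->native_offset >= 0) {
        e->code[e->len++] = target->native_offset;   /* backward: address known */
    } else {
        assert(e->npatches < 16);
        e->patches[e->npatches].slot = e->len;       /* forward: remember the hole */
        e->patches[e->npatches].target = target;
        e->npatches++;
        e->code[e->len++] = 0;                       /* placeholder, like x86_jump32 (code, 0) */
    }
}

/* Run once every block has an offset: fill in the recorded holes. */
static void toy_apply_patches(ToyEmitter *e)
{
    for (int i = 0; i < e->npatches; i++) {
        assert(e->patches[i].target->native_offset >= 0);
        e->code[e->patches[i].slot] = e->patches[i].target->native_offset;
    }
    e->npatches = 0;
}
```

A backward jump is resolved immediately, while a forward jump stays 0 until toy_apply_patches runs after the last block has been emitted.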
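Debugging shows the answer:
mono_codegen eventually calls mono_arch_patch_code, which goes through mono_resolve_patch_target. There the recorded targets are resolved, and the calls end up pointing at Trampolines that stand in for the real jump targets.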
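What is a Trampoline? It is a concept the mono runtime introduces to handle call, jump and virtual-method dispatch, that is, transfers of control whose target address may or may not be known yet.
Before reading on, I recommend reading my translation of the official mono introduction to Trampolines.
Once jumps and calls of this kind have been replaced by a Specific_Trampoline, executing one of these Specific_Trampolines at runtime leads into a function called mono_magic_trampoline.
In the call stack captured in the debugger, a frame such as 0x31e0066 is an address in code that mono_codegen has just generated dynamically. When that generated code hits one of these Specific_Trampolines at runtime, it enters mono_magic_trampoline.
What mono_magic_trampoline does is simple. It checks the target of the jump; if the target has not been compiled yet, it triggers compilation, and once compilation finishes it jumps to the resulting address. After compiling, it also replaces this magic Trampoline with the address of the compiled method.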
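The behavior described here (compile on first call, then patch the call target so later calls go straight to the compiled code) can be imitated with a function pointer. This is only an analogy under invented names: toy_compile stands in for the JIT and simply returns an existing C function, and a single global method keeps the trampoline from needing any extra context.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*ToyCode)(int);

typedef struct ToyMethod {
    ToyCode entry;         /* what callers jump through; starts as the trampoline */
    ToyCode body;          /* stands in for the IL that still has to be compiled */
    int     compile_count;
} ToyMethod;

static int toy_square(int x) { return x * x; }

static ToyMethod toy_method;   /* a single method keeps the sketch short */

/* Stand-in for the JIT: "compiling" just hands back the native body. */
static ToyCode toy_compile(ToyMethod *m)
{
    m->compile_count++;
    return m->body;
}

/* Plays the role of the magic trampoline: compile, patch the slot, run the real code. */
static int toy_trampoline(int arg)
{
    ToyMethod *m = &toy_method;
    m->entry = toy_compile(m);     /* later calls skip the trampoline */
    return m->entry(arg);
}

static void toy_method_init(void)
{
    toy_method.entry = toy_trampoline;
    toy_method.body = toy_square;
    toy_method.compile_count = 0;
}

static int toy_call(int arg)
{
    assert(toy_method.entry != NULL);
    return toy_method.entry(arg);
}
```

The first toy_call goes through the trampoline and pays the compilation cost once. Every later call reaches toy_square directly, and a method that is never called is never compiled.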
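As for why mono does this, I think it is because, given the nature of CIL, we often do not know whether something will ever be called. Take an if statement that calls a different function in each branch: in many cases we probably only need to compile one branch, and the other is not taken 99% of the time.
Trampolines work like a stub planted at the call site. They give mono lazy compilation and save a great deal of compile time.
Another point is that CIL supports polymorphism, that is, operations such as callvirt: until the actual object is available, the method that will finally be called is not determined. With Trampolines as the intermediary, mono can get as close as possible to compiling only when the code is actually used.
Well, it's getting late, so that's all for today. Good night, everyone.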
