[Translation] CoreRT - A .NET Runtime for AOT

2024-01-12 00:10


Original: http://mattwarren.org/2018/06/07/CoreRT-.NET-Runtime-for-AOT/
Copyright belongs to the original author.

Firstly, what exactly is CoreRT? From its GitHub repo:

.. a .NET Core runtime optimized for AOT (ahead of time compilation) scenarios, with the accompanying .NET native compiler toolchain

The rest of this post will look at what that actually means.

Contents

  1. Existing .NET ‘AOT’ Implementations
  2. High-Level Overview
  3. The Compiler
  4. The Runtime
  5. ‘Hello World’ Program
  6. Limitations
  7. Further Reading

Existing .NET ‘AOT’ Implementations

However, before we look at what CoreRT is, it’s worth pointing out that there are existing .NET ‘Ahead-of-Time’ (AOT) implementations that have been around for a while:

Mono

  • Ahead of Time Compilation in Mono (August 2006)
  • Mono Docs - AOT (also see this link)
  • How Xamarin.Android AOT Works
  • Xamarin.iOS - Architecture - AOT

.NET Native (Windows 10/UWP apps only, a.k.a. ‘Project N’)

  • Announcing .NET Native Preview (April 2014)
  • The .NET Native Tool-Chain
  • Archive of ‘.NET Native’ Blogs Posts
  • Compiling Apps with .NET Native (docs)
  • .NET Native – What it means for Universal Windows Platform (UWP) developers
  • Introduction to .NET Native

So if there were existing implementations, why was CoreRT created? The official announcement gives us some idea:

If we want to shortcut this two-step compilation process and deliver a 100% native application on Windows, Mac, and Linux, we need an alternative to the CLR. The project that is aiming to deliver that solution with an ahead-of-time compilation process is called CoreRT.

The main difference is that CoreRT is designed to support .NET Core scenarios, i.e. .NET Standard, cross-platform, etc.

Also worth pointing out is that whilst .NET Native is a separate product, the two are related and in fact “.NET Native shares many CoreRT parts”.

High-Level Overview

Because all the code is open source, we can very easily identify the main components and understand where the complexity is. Firstly, let’s look at where the most ‘lines of code’ are:

[Figure: Source Code - LOC in Main Components]

We clearly see that the majority of the code is written in C#, with only the Native component written in C++. The largest single component is System.Private.CoreLib, which is all C# code, although there are other sub-components that contribute to it (‘System.Private.XXX’), such as System.Private.Interop (36,547 LOC), System.Private.TypeLoader (30,777) and System.Private.Reflection.Core (24,964). Other significant components are the ‘Intermediate Language (IL) Compiler’ and the Common code that is re-used by everything else.

All these components are discussed in more detail below.

The Compiler

So whilst CoreRT is a run-time, it also needs a compiler to put everything together. From Intro to .NET Native and CoreRT:

.NET Native is a native toolchain that compiles CIL byte code to machine code (e.g. X64 instructions). By default, .NET Native (for .NET Core, as opposed to UWP) uses RyuJIT as an ahead-of-time (AOT) compiler, the same one that CoreCLR uses as a just-in-time (JIT) compiler. It can also be used with other compilers, such as LLILC, UTC for UWP apps and IL to CPP (an IL to textual C++ compiler we have built as a reference prototype).

But what does this actually look like in practice? As they say, ‘a picture paints a thousand words’:

[Image: diagram of the compilation process (see the original post for the full-size version)]

To give more detail, the main compilation phases (started from \ILCompiler\src\Program.cs) are the following:

  1. Calculate the reachable modules/types/classes, i.e. the ‘compilation roots’, using ILScanner.cs
  2. Allow for reflection, via an optional rd.xml file, and generate the necessary metadata using ILCompiler.MetadataWriter
  3. Compile the IL using the specific back-end (generic/shared code is in Compilation.cs)
    • RyuJIT RyuJitCompilation.cs
    • Web Assembly (WASM) WebAssemblyCodegenCompilation.cs
    • C++ Code CppCodegenCompilation.cs
  4. Finally, write out the compiled methods using ObjectWriter, which in turn uses LLVM under-the-hood

But it’s not just your code that ends up in the final .exe. Along the way the CoreRT compiler also generates several ‘helper methods’ to cover the following scenarios:

  • IL Code (via the ‘EmitIL()’ method)
    • Delegates
    • P/Invoke Delegates
    • Inlined Array methods
    • Boxing
    • Dynamically Invoked methods
    • Enum GetHashCode()
    • Assembly GetExecutingAssembly()
  • Assembly Code (via the ‘EmitCode()’ method) (different implementations for each CPU architecture)
    • Unboxing (x64)
    • Jump Stubs (ARM64)
    • ‘Ready to Run’ Generic helper (x86)

Fortunately the compiler doesn’t blindly include all the code it finds; it is intelligent enough to only include code that’s actually used:

We don’t use ILLinker, but everything gets naturally treeshaken by the compiler itself (we start with compiling Main/NativeCallable exports and continue compiling other methods and generating necessary data structures as we go). If there’s a type or method that is not used, the compiler doesn’t even look at it.
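The ‘start at Main and keep compiling whatever gets referenced’ idea in that quote can be illustrated with a small worklist traversal. This is only a toy sketch, not how ILCompiler is actually written: the call graph, the method names and the `reachable_methods` helper are all made up for the example.

```python
from collections import deque

# Hypothetical call graph: each method maps to the methods it references.
# The names are invented for illustration; they are not real CoreRT symbols.
CALL_GRAPH = {
    "Program.Main": ["Console.WriteLine", "Program.Greet"],
    "Program.Greet": ["String.Concat"],
    "Console.WriteLine": ["Console.get_Out"],
    "Console.get_Out": [],
    "String.Concat": [],
    "Program.UnusedHelper": ["Json.Serialize"],
    "Json.Serialize": [],
}


def reachable_methods(roots, call_graph):
    """Return every method reachable from the compilation roots."""
    seen = set(roots)
    worklist = deque(roots)
    while worklist:
        method = worklist.popleft()
        for callee in call_graph.get(method, []):
            if callee not in seen:
                seen.add(callee)
                worklist.append(callee)
    return seen


compiled = reachable_methods(["Program.Main"], CALL_GRAPH)
skipped = set(CALL_GRAPH) - compiled
print(sorted(compiled))
print(sorted(skipped))
```

Anything that is never reached from a root (here `Program.UnusedHelper` and the method it calls) is never visited, which is the sense in which the code is ‘naturally treeshaken’.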

The Runtime

All the user/helper code then sits on-top of the CoreRT runtime, from Intro to .NET Native and CoreRT:

CoreRT is the .NET Core runtime that is optimized for AOT scenarios, which .NET Native targets. This is a refactored and layered runtime. The base is a small native execution engine that provides services such as garbage collection(GC). This is the same GC used in CoreCLR. Many other parts of the traditional .NET runtime, such as the type system, are implemented in C#. We’ve always wanted to implement runtime functionality in C#. We now have the infrastructure to do that. In addition, library implementations that were built deep into CoreCLR, have also been cleanly refactored and implemented as C# libraries.

This last point is interesting: why is it advantageous to implement ‘runtime functionality in C#’? Well, it turns out that it’s hard to do in an unmanaged language because there are some very subtle and hard-to-track-down ways that you can get it wrong:

[Image: tweet by Jan Kotas]
Details: https://twitter.com/JanKotas7/status/988622367973720064

These are known as ‘GC Holes’ and the BOTR (Book of the Runtime) provides more detail on them. The author of that tweet is significant: Jan Kotas has worked on the .NET runtime for a long time, so if he thinks something is hard, it really is!

Runtime Components

As previously mentioned it’s a layered runtime, i.e. made up of several distinct components, as explained in this comment:

At the core of CoreRT, there’s a runtime that provides basic services for the code to run (think: garbage collection, exception handling, stack walking). This runtime is pretty small and mostly depends on C/C++ runtime (even the C++ runtime dependency is not a hard requirement as Jan pointed out - #3564). This code mostly lives in src/Native/Runtime, src/Native/gc, and src/Runtime.Base. It’s structured so that the places that do require interacting with the underlying platform (allocating native memory, threading, etc.) go through a platform abstraction layer (PAL). We have a PAL for Windows, Linux, and macOS, but others can be added.

And you can see the PAL Components in the following locations:

  • Windows
  • Unix
  • MacOS ‘Apple’ and ‘OSX’

C# Code shared with CoreCLR

One interesting aspect of the CoreRT runtime is that wherever possible it shares code with the CoreCLR runtime. This is part of a larger effort to ensure that code is shared across multiple repositories wherever possible:

This directory contains the shared sources for System.Private.CoreLib. These are shared between dotnet/corert, dotnet/coreclr and dotnet/corefx. The sources are synchronized with a mirroring tool that watches for new commits on either side and creates new pull requests (as @dotnet-bot) in the other repository.

Recently there has been a significant amount of work done to move more and more code over into the ‘shared partition’, to ensure work isn’t duplicated and any fixes are shared across both locations. You can see how this works by looking at the links below:

  • CoreRT
    • ‘shared partition’ commits
    • Normal System.Private.Corelib
    • Shared System.Private.Corelib
  • CoreCLR
    • ‘shared partition’ commits
    • Normal mscorlib
    • Shared mscorlib

What this means is that about 2/3 of the C# code in System.Private.CoreLib is shared with CoreCLR and only 1/3 is unique to CoreRT:

Group     C# LOC     Files
shared    170,106      759
src        96,733      351
Total     266,839    1,110
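As a quick check of the ‘about 2/3 shared’ figure, the proportions can be worked out directly from the table. The variable names below are just labels chosen for this sketch.

```python
# Figures copied from the table above: C# lines of code in System.Private.CoreLib.
shared_loc = 170_106
corert_only_loc = 96_733

total_loc = shared_loc + corert_only_loc
shared_fraction = shared_loc / total_loc

print(total_loc)
print(f"{shared_fraction:.1%} shared, {1 - shared_fraction:.1%} unique to CoreRT")
```

The shared share comes out a little under two thirds, which matches the rough ‘2/3 vs 1/3’ split quoted in the text.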

Native Code

Finally, whilst it is advantageous to write as much code as possible in C#, there are certain components that have to be written in C++, these include the GC (the majority of which is one file, gc.cpp which is almost 37,000 LOC!!), the JIT Interface, ObjWriter (based on LLVM) and most significantly the Core Runtime that contains code for activities like:

  • Threading
  • Stack Frame handling
  • Debugging/Profiling
  • Interfacing to the OS
  • CPU specific helpers for:
    • Exception handling
    • GC Write Barriers
    • Stubs/Thunks
    • Optimised object allocation

‘Hello World’ Program

One of the first things people asked about CoreRT is “what is the size of a ‘Hello World’ app” and the answer is ~3.93 MB (if you compile in Release mode), but there is work being done to reduce this. At a ‘high-level’, the .exe that is produced looks like this:

[Figure: Exe Components]

Note that the different colours correspond to the original format of a component; the output itself is a single, native, executable file.

This file comes with a full .NET specific ‘base runtime’ or ‘class libraries’ (‘System.Private.XXX’), so you get a lot of functionality; it is not the absolute bare-minimum app. Fortunately there is a way to see what a ‘bare-minimum’ runtime would look like, by compiling against the Test.CoreLib project included in the CoreRT source. By using this you end up with an .exe that looks like this:

[Figure: Exe Components - Reduced CoreLib]

But it’s so minimal that OOTB you can’t even write ‘Hello World’ to the console as there is no System.Console type! After a bit of hacking I was able to build a version that did have a working Console output (if you’re interested, this diff is available here). To make it work I had to include the following components:

  • System.Console
  • System.Text.UnicodeEncoding
  • String handling
  • P/Invoke and Marshalling support (to call an OS function)

So Test.CoreLib really is a minimal runtime! But the difference in size is dramatic: it shrinks down to 0.49 MB, compared to 3.93 MB for the fully-featured runtime.

Type            Standard (bytes)   Test.CoreLib (bytes)   Difference
.data                    163,840                 36,864     -126,976
.managed               1,540,096                 65,536   -1,474,560
.pdata                   147,456                 20,480     -126,976
.rdata                 1,712,128                 81,920   -1,630,208
.reloc                    98,304                  4,096      -94,208
.text                    360,448                299,008      -61,440
rdata                     98,304                  4,096      -94,208
Total (bytes)          4,120,576                512,000   -3,608,576
Total (MB)                  3.93                   0.49        -3.44

These data sizes were obtained by using the Microsoft DUMPBIN tool and the /DISASM command-line switch (zip file of the full output), which produces the following summary (note: size values are in HEX):

  Summary

       28000 .data
      178000 .managed
       24000 .pdata
      1A2000 .rdata
       18000 .reloc
       58000 .text
       18000 rdata
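Because the summary is in hexadecimal, it is worth showing how those values map onto the decimal byte counts in the table. The sketch below simply parses the summary lines for the standard build and totals them; `parse_summary` is a helper written for this example, not part of DUMPBIN.

```python
# Section sizes exactly as DUMPBIN printed them (hexadecimal) for the
# standard 'Hello World' build.
SUMMARY = """
28000 .data
178000 .managed
24000 .pdata
1A2000 .rdata
18000 .reloc
58000 .text
18000 rdata
"""


def parse_summary(text):
    """Turn 'HEXSIZE name' lines into a list of (name, size_in_bytes)."""
    sections = []
    for line in text.strip().splitlines():
        hex_size, name = line.split()
        sections.append((name, int(hex_size, 16)))
    return sections


sections = parse_summary(SUMMARY)
total_bytes = sum(size for _, size in sections)

for name, size in sections:
    print(f"{name:<10}{size:>12,}")
print(f"{'Total':<10}{total_bytes:>12,} bytes = {total_bytes / 2**20:.2f} MB")
```

The total works out to 4,120,576 bytes, i.e. the 3.93 MB quoted for the fully-featured runtime (taking 1 MB as 2^20 bytes).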

Also contained in the output is the assembly code for a simple Hello World method:

HelloWorld_HelloWorld_Program__Main:
  0000000140004C50: 48 8D 0D 19 94 37  lea         rcx,[__Str_Hello_World__E63BA1FD6D43904697343A373ECFB93457121E4B2C51AF97278C431E8EC85545]
                    00
  0000000140004C57: 48 8D 05 DA C5 00  lea         rax,[System_Console_System_Console__WriteLine_12]
                    00
  0000000140004C5E: 48 FF E0           jmp         rax
  0000000140004C61: 90                 nop
  0000000140004C62: 90                 nop
  0000000140004C63: 90                 nop

and if we dig further we can see the code for System.Console.WriteLine(..):

System_Console_System_Console__WriteLine_12:
  0000000140011238: 56                 push        rsi
  0000000140011239: 48 83 EC 20        sub         rsp,20h
  000000014001123D: 48 8B F1           mov         rsi,rcx
  0000000140011240: E8 33 AD FF FF     call        System_Console_System_Console__get_Out
  0000000140011245: 48 8B C8           mov         rcx,rax
  0000000140011248: 48 8B D6           mov         rdx,rsi
  000000014001124B: 48 8B 00           mov         rax,qword ptr [rax]
  000000014001124E: 48 8B 40 68        mov         rax,qword ptr [rax+68h]
  0000000140011252: 48 83 C4 20        add         rsp,20h
  0000000140011256: 5E                 pop         rsi
  0000000140011257: 48 FF E0           jmp         rax
  000000014001125A: 90                 nop
  000000014001125B: 90                 nop

Limitations

Missing Functionality

There have been some people who’ve successfully run complex apps using CoreRT but, as it stands, CoreRT is still an alpha product, at least according to the NuGet package ‘1.0.0-alpha-26529-02’ that the official samples instruct you to use. I’ve not seen any information about when a full 1.0 release will be available.

So there is some functionality that is not yet implemented, e.g. F# support, GC.GetMemoryInfo or canGetCookieForPInvokeCalliSig (a calli to a p/invoke). For more information on this I recommend this entertaining presentation on Building Native Executables from .NET with CoreRT by Mark Rendle. In the 2nd half he chronicles all the issues that he ran into when he was trying to run an ASP.NET app under CoreRT (some of which may well be fixed now).

Reflection

But more fundamentally, because of the nature of AOT compilation, there are 2 main stumbling blocks that you may also run into: Reflection and Runtime Code-Generation.

Firstly, if you want to use reflection in your code you need to tell the CoreRT compiler about the types you expect to reflect over, because by default it only includes the types it knows about. You can do this by using a file called rd.xml, as shown here. Unfortunately this will always require manual intervention, for the reasons explained in this issue. More information is available in this comment ‘…some details about CoreRT’s restriction on MakeGenericType and MakeGenericMethod’.

To make reflection work the compiler adds the required metadata to the final .exe using this process:

This would reuse the same scheme we already have for the RyuJIT codegen path:

  • The compiler generates a blob of bytes that describes the metadata (namespaces, types, their members, their custom attributes, method parameters, etc.). The data is generated as a byte array in the ComputeMetadata method.
  • The metadata gets embedded as a data blob into the executable image. This is achieved by adding the blob to a “ready to run header”. Ready to run header is a well known data structure that can be located by the code in the framework at runtime.
  • The ready to run header along with the blobs it refers to is emitted into the final executable.
  • At runtime, pointer to the byte array is located using the RhFindBlob API, and a parser is constructed over the array, to be used by the reflection stack.
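To make the ‘blob referenced from a header, located at runtime’ scheme a little more concrete, here is a toy model of it. The binary layout, the `METADATA_BLOB_ID` value and both helper functions are assumptions invented for this sketch; the real ready to run header format and the RhFindBlob API are not reproduced here.

```python
import struct

# Hypothetical layout, invented for this sketch (not CoreRT's real format):
#   uint32 entry_count
#   entry_count * (uint32 blob_id, uint32 offset, uint32 length)
#   raw blob bytes
ENTRY = struct.Struct("<III")
COUNT = struct.Struct("<I")

METADATA_BLOB_ID = 1  # made-up identifier


def build_image(blobs):
    """Build-time step: write a header followed by the blobs it refers to."""
    header_size = COUNT.size + ENTRY.size * len(blobs)
    header = COUNT.pack(len(blobs))
    body = b""
    for blob_id, data in blobs.items():
        header += ENTRY.pack(blob_id, header_size + len(body), len(data))
        body += data
    return header + body


def find_blob(image, blob_id):
    """Run-time step: look a blob up by ID, loosely in the spirit of RhFindBlob."""
    (count,) = COUNT.unpack_from(image, 0)
    for i in range(count):
        entry_id, offset, length = ENTRY.unpack_from(image, COUNT.size + i * ENTRY.size)
        if entry_id == blob_id:
            return image[offset:offset + length]
    return None


image = build_image({METADATA_BLOB_ID: b"namespaces,types,members", 2: b"other"})
metadata = find_blob(image, METADATA_BLOB_ID)
print(metadata)
```

The point of the design is that the reflection stack only needs a well-known header location and a blob ID; it can then construct its parser over the returned bytes without any runtime code generation.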

Runtime Code-Generation

In .NET you often use reflection once (because it can be slow) followed by ‘dynamic’ or ‘runtime’ code-generation with Reflection.Emit(..). This technique is widely used in .NET libraries for Serialisation/Deserialisation, Dependency Injection, Object Mapping and ORM.

The issue is that ‘runtime’ code generation is problematic in an ‘AOT’ scenario:

ASP.NET dependency injection introduced dependency on Reflection.Emit in aspnet/DependencyInjection#630 unfortunately. It makes it incompatible with CoreRT.

We can make it functional in CoreRT AOT environment by introducing IL interpretter (#5011), but it would still perform poorly. The dependency injection framework is using Reflection.Emit on performance critical paths.

It would be really up to ASP.NET to provide AOT-friendly flavor that generates all code at build time instead of runtime to make this work well. It would likely help the startup without CoreRT as well.

I’m sure this will be solved one way or the other (see #5011), but at the moment it’s still ‘work-in-progress’.


Further Reading

If you’ve got this far, here’s some other links that you might be interested in:

  • What’s the difference between .NET CoreCLR, CoreRT, Roslyn and LLILC
  • What I’ve learned about .NET Native
  • Channel 9 - CoreRT & .NET Native
  • Channel 9 - Going Deep - Inside .NET Native
  • Building ILCompiler in Visual Studio 2017
  • Type System Overview (botr)
  • Interfaces API surface on Type System
  • How Xamarin.Android AOT Works
  • An introduction to IL2CPP internals
  • .NET Native Performance and Internals
  • Dynamic Tracing of .NET Core Methods
  • Generic sharing for valuetypes (Mono)
  • .NET Internals and Native Compiling
