I've been trying to understand how all the moving parts of CoreFX, CoreCLR & ReferenceSource fit together, especially
where mscorlib is concerned. These are my notes / conclusions. If you have more information, or see something that is
wrong, please let me know!

The CoreCLR repo has an embedded mscorlib inside of it. When diffing this against the referencesource mscorlib,
it looks like it forked at some point and has received some minor cleanups. The most frequently occurring changes:
- change license header
- remove [ResourceExposure] and [ResourceConsumption] attributes.
- some modest improvements. (Task.cs, Thread.cs and AppDomain.cs are files with a relatively high number of changes)
- cleanup. if referencesource code had #if DOTNETCORE, that define has been removed in the coreclr one, as it is now always true.
- all filenames have been camelcased.
But by and large, these corlibs are pretty much the same.

What it looks like is that the intention is to move many of the types that are in corlib today into separate assemblies.
You can see this intention in the CoreFX framework, where it actually has a System.Reflection.dll assembly, but today
that only has typeforwarders to mscorlib. It is my interpretation that this is to prepare for a future where these types
can move to System.Reflection.dll, without breaking user code. (I cannot actually find these typeforwarders in the corefx
repo, but if you inspect a build of corefx, and use ILSpy to look in the assembly, the typeforwarders are there. Maybe
some other tool makes them, not sure.)
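
For reference, this is roughly what a typeforwarder looks like when written by hand in C#. It is only a sketch: the class name ForwarderDemo and the choice of forwarded types are mine, and I am not claiming this is how the CoreFX facades are actually produced. A real facade assembly would contain nothing but the attributes; the Main method is only there so the file runs and shows which assembly really defines a type on the runtime you try it on.

```csharp
using System;
using System.Reflection;
using System.Runtime.CompilerServices;

// Each attribute records "this type lives in the assembly that really defines it".
// Code compiled against this assembly is redirected there at load time.
[assembly: TypeForwardedTo(typeof(System.Reflection.MethodInfo))]
[assembly: TypeForwardedTo(typeof(System.Reflection.FieldInfo))]

static class ForwarderDemo // hypothetical name, for illustration only
{
    // Returns the simple name of the assembly that actually contains the type.
    public static string DefiningAssembly(Type type)
    {
        return type.Assembly.GetName().Name;
    }

    static void Main()
    {
        // The name printed depends on the runtime the program is executed on.
        Console.WriteLine(DefiningAssembly(typeof(MethodInfo)));
        Console.WriteLine(DefiningAssembly(typeof(string)));
    }
}
```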

There are also some parts of the BCL that CoreFX seems to implement itself, without typeforwarding to the "legacy" mscorlib.
An example of this is System.IO.File here: https://github.com/dotnet/corefx/blob/master/src/System.IO.FileSystem/src/System/IO/File.cs
This is a bit awkward, because System.IO.File is also in the legacy mscorlib, but it looks like that code is dead, and could
probably be removed, or maybe it needs to be kept around to make syncing with some internal Microsoft full .NET source tree
easy or something. Apparently there is a tool called BclRewriter.exe (not (yet) open source), which the CoreCLR build
process downloads over nuget. On my build it doesn't run, but on kangaroo's build, it apparently makes internal all the
types inside mscorlib that you should no longer be using directly (like the System.Reflection ones, that you should be
using through the System.Reflection.dll type forwarders).

An important realization is that unlike the Mono BCL, the CoreFX BCL is not platform agnostic. CoreFX wants to stop using
icalls for everything, and start using pinvoke. This leads to the implementation of File.Delete being done with a pinvoke
to a windows native library:
https://github.com/dotnet/corefx/blob/master/src/Common/src/Interop/Windows/mincore/Interop.DeleteFile.cs (pinvokes into: api-ms-win-core-file-l1-1-0.dll.
note to self: we should check what the windows-backward-compatibility story is on these native libraries. do they work on win7? winxp?)

So System.IO.dll compiled for windows always pinvokes into a windows lib. If you want to run it on mac, you need a
different System.IO.dll compiled for mac, that pinvokes into an osx/posix lib. For System.IO.File, an osx/unix
implementation already lives in the corefx repo, but we need to realize that we need to ship separate BCLs (or at least
a subset) per platform. Also note that the work of the "cross os porting" of these libraries is shared between all
runtimes that decide to support CoreFX. As long as the runtime supports PInvoke, which Mono, IL2CPP and CoreCLR do,
System.IO.dll only needs to be ported to platformX once.
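
To make the "one managed API, one native binding per platform" idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the FileSketch class, the DeleteCore helper and the WINDOWS build symbol are made up, and I bind to kernel32.dll and libc directly rather than to whatever CoreFX really uses. The shared half of the class is identical everywhere; which pinvoke gets compiled in is decided at build time, so you end up with a different binary per platform.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;

// Shared, platform-neutral half: argument checks and error handling.
static partial class FileSketch // hypothetical stand-in for System.IO.File
{
    public static void Delete(string path)
    {
        if (path == null) throw new ArgumentNullException(nameof(path));
        if (!DeleteCore(path))
            throw new IOException("Could not delete '" + path + "'.");
    }
}

#if WINDOWS // hypothetical symbol, defined only when building the Windows binary
static partial class FileSketch
{
    // Win32 DeleteFileW; returns nonzero on success.
    [DllImport("kernel32.dll", EntryPoint = "DeleteFileW", CharSet = CharSet.Unicode, SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool DeleteFileW(string path);

    private static bool DeleteCore(string path) { return DeleteFileW(path); }
}
#else
static partial class FileSketch
{
    // POSIX unlink(2); returns 0 on success.
    [DllImport("libc", EntryPoint = "unlink", SetLastError = true)]
    private static extern int unlink(string path);

    private static bool DeleteCore(string path) { return unlink(path) == 0; }
}
#endif

static class Program
{
    static void Main()
    {
        string path = Path.GetTempFileName(); // creates an empty temp file
        FileSketch.Delete(path);
        Console.WriteLine(File.Exists(path) ? "still there" : "deleted");
    }
}
```

Built without the WINDOWS symbol, this only works on a POSIX system, which is exactly the point: the managed caller never changes, but the compiled assembly is tied to one platform.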

Does the CoreFX BCL strip/treeshake better than the Mono or referencesource BCL? I have been hoping that it would.
(int.ToString() pulling in ThaiBuddhistCalendar is not something I hear game devs request a lot :) ). I have not tried
yet, but since the CoreFX corlib at least today seems almost identical to referencesource, I see no reason to believe that
a HelloWorld app gets smaller after stripping/treeshaking than it did on a referencesource BCL.

Tagging a few more folks that might want to correct me or add to what I said :-)
One minor note. You don't really need to ship multi-BCLs if you embrace nuget. Basically you'd only ship multiple mscorlib's, and then nuget all the packages that the build requires.
That said, you'll probably need to host private builds of the nuget pkgs which extend and add support for platforms that CoreFx does not support.
For the libraries, we'd rather leverage NuGet to select the appropriate
implementation. I think the implementation of System.IO would use source
sharing, #if, and partial classes, so that we can isolate the OS specific
pieces into a small set. We believe this makes us more agile. So yes, we
believe that P/Invokes and source sharing are the way to go there.
Does that mean have platform specific (or OS API specific, i.e. POSIX/Win32) versions of each assembly needing native resources?
You don't really need to ship multi-BCLs if you embrace nuget. Basically you'd only ship multiple mscorlib's, and then nuget all the packages that the build requires.
Correct. You do, however, need the runtime library (e.g. mscorlib) to support a certain set of contracts which is done via type forwarding, i.e. something needs to type forward `System.Runtime!System.String` to `mscorlib!System.String`. Those type forwarders could be shipped by the runtime or in the derived packages (I'm not up to speed where we currently ship those with NuGet v3).
Does that mean have platform specific (or OS API specific, i.e. POSIX/Win32) versions of each assembly needing native resources?
Pretty much. In the past, we only had to support multiple architectures (x86, x64, ARM). Now our native resources would be multiplied by platforms. A binary-based ecosystem using native code is much harder than MSIL which is why we'll probably rethink some of our native dependencies. For example, `System.IO.Compression` uses a native implementation of deflate called `zlib`. We didn't do it for speed but mostly because `zlib` provides a superior compression quality and porting it to managed code would have required more work than simply P/Invoking.
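
As an illustration of why the native dependency is invisible to callers, here is a small round trip through the public `DeflateStream` API. This is only a sketch: the `DeflateDemo` class and its helper names are hypothetical, and nothing in the code depends on whether the deflate implementation underneath is native or managed.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

static class DeflateDemo // hypothetical name, for illustration only
{
    // Compresses a buffer with the deflate algorithm.
    public static byte[] Compress(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            // Dispose the DeflateStream before reading the result so all data is flushed.
            using (var deflate = new DeflateStream(output, CompressionMode.Compress, true))
            {
                deflate.Write(data, 0, data.Length);
            }
            return output.ToArray();
        }
    }

    // Reverses Compress.
    public static byte[] Decompress(byte[] data)
    {
        using (var input = new MemoryStream(data))
        using (var deflate = new DeflateStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            deflate.CopyTo(output);
            return output.ToArray();
        }
    }

    static void Main()
    {
        byte[] original = Encoding.UTF8.GetBytes(new string('a', 1000));
        byte[] packed = Compress(original);
        byte[] unpacked = Decompress(packed);
        Console.WriteLine(original.Length + " -> " + packed.Length + " -> " + unpacked.Length);
    }
}
```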
Great article. Here are my two cents. Let me know if you have any questions.
Relationship between .NET Core and .NET Framework
The `mscorlib` that is part of CoreCLR is a fork of the `mscorlib` that is part of the .NET Framework. You can think of it as Silverlight's copy.
In general, we don't have (and don't want to have) automatic code flow between .NET Core and the .NET Framework. The reason is the significant implementation differences and the compatibility requirements with the 1.8 billion installs of the .NET Framework.
BCL rewriter
The BCL rewriter was created in the Silverlight days to make it easier for us to share the same code base for .NET Framework and Silverlight and yet get the footprint down to something that works for Silverlight.
We currently still use the rewriter because it's already there and thus was easier for us to use. Long term, since we don't share the implementations, we can physically refactor CoreCLR's `mscorlib`.
As far as implementation dependencies on higher level components such as globalization go: that's a good point which we're (painfully) aware of.
I think there are two answers for this:
Code relationship between CoreCLR's mscorlib and CoreFX
The version of `mscorlib` that is part of CoreCLR is bigger than it needs to be. In a perfect world, it would only contain the code that is runtime specific and have no overlap with CoreFX. For example, `String` should live here but `Console` shouldn't.
There are two reasons for this duplication:
- [...] `mscorlib` is simply dead.
- `mscorlib` might need to have an implementation for, say, `IList<T>`. However, there is no reason why the implementation has to be `List<T>`. It's better if we can version the widely used type `List<T>` independently of the runtime itself. One way to do this is by having a simplified copy of `List<T>` that is private to `mscorlib`.
P/Invokes vs. runtime calls
Originally, the idea was that the managed pieces of the CLI are operating system agnostic and that the runtime provides the OS specific implementations.
However, we believe that this creates a factoring nightmare. First of all, runtime calls (QCalls, FCalls, etc) are fairly complicated. Secondly, it would force all OS specific implementations into a single spot, which forces the runtime to version at the same rate as the fastest component that needs OS specific logic. In other words, it doesn't scale.
We believe that the runtime should only provide, well, the runtime specific pieces, such as the GC, the JIT etc. We're even thinking about breaking the runtime into multiple pieces so that, for example, we could update the JIT independently of the GC.
For the libraries, we'd rather leverage NuGet to select the appropriate implementation. I think the implementation of `System.IO` would use source sharing, `#if`, and partial classes, so that we can isolate the OS specific pieces into a small set. We believe this makes us more agile. So yes, we believe that P/Invokes and source sharing are the way to go there.
Please note that because of NuGet, consumers don't have to know that. For them, it doesn't matter whether `System.IO` is a single DLL that runs on all operating systems or whether there are multiple implementations. In the end, you reference the same package and rely on the build to select the right implementation for deployment.
api-ms-*
Windows has done engineering work to improve the dependencies in Win32. The result is called API sets, which have funky names like `api-ms-win-core-console-l1-1-0.dll`.
My understanding is that there are differences between operating systems. AFAIK for CoreCLR we support Win7 and higher.