@Sunbreak
Forked from mraleph/ffi.md
Last active March 20, 2022 15:14

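Dart VM FFI Vision

The aim of the Dart FFI project (tracked as Issue #34452) is to provide a low-boilerplate, low-ceremony and low-overhead way of interoperating with native C/C++ code.

The motivation behind this project is twofold:

  1. One of the most common requests for Flutter is a low-overhead synchronous mechanism for interacting with native (C/C++) code (see Issue #7053).
  2. We want a replacement for the Dart VM C API that reflects how the Dart language looks today and the contexts in which it is used.

Currently Flutter supports interacting with platform-specific code written in Java (Kotlin) and Objective-C (Swift) via platform channels. This mechanism is based on asynchronous message passing and requires people to write glue code in both Dart and the respective platform language. It is a high-overhead solution, both in terms of performance and in the boilerplate code the programmer is required to write.

The Dart VM provides a C API defined in the dart_api.h header and a mechanism for binding Dart code to native C/C++ code via native extensions. However, this mechanism is not integrated with Flutter and cannot be used out of the box.

While it is possible to make the necessary changes to the Flutter engine and tools to enable developers to write native extensions based on the VM API (see Native Flutter Extensions Prototype doc), we believe that using the Dart VM C API is not the right way forward, for the following reasons:

  • The C API is name based, e.g. Dart_Handle Dart_GetField(Dart_Handle container, Dart_Handle name);

    • This makes it AOT unfriendly;
    • This makes it slow, because the results of name resolution are not cached;
  • It is reflective: native functions have the signature void (Dart_NativeArguments args), which allows them to accept any arguments and return any results, even though the signature of the function on the Dart side is usually much stricter and gives the compiler enough information to perform the necessary marshalling automatically. Unwrapping arguments and wrapping results requires multiple round trips through the API boundary and cannot be optimized by the Dart compilation toolchain. One of the core ideas behind FFI is:

    If a native function with a statically known signature is bound to a Dart function with a known signature, then marshalling of arguments and results based on statically known types is more efficient than reflective marshalling based on the Dart C API.

  • It is verbose.

Based on these observations we also expect that a leaner way to integrate with native code should benefit current users of the Dart VM C API. For example, we expect that moving the Flutter engine from the C API to FFI should significantly reduce the overheads associated with crossing the boundary between Dart and native code.

In general we try to fit the FFI design into the existing Dart type system as much as possible, so that things like code completion and static errors work as expected.

However, as the sections below make evident, this is not always possible, usually due to the lack of type system features that would allow us to encode the necessary information into static types and enforce additional typing rules.

This means that the FFI implementation would potentially have to come with its own extensions to the Dart type system, with rules enforced as an additional Kernel transformation at the CFE level and as a linter at the analyzer level. The incomplete unification of Dart front-ends unfortunately means that this work will have to be duplicated, just as it is duplicated for other language features.

The first pillar of an FFI is a way to access native memory from Dart. The design of how this is expressed in Dart code is constrained by Dart semantics:

  • Dart types are reference types;
  • The mapping between native types and Dart builtin types is usually many to one. For example, the native int8_t and int32_t both correspond to the int type on the Dart side.
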
library dart.ffi;

/// Classes representing native width integers from the native side.
/// They are not constructible in the Dart code and serve purely as
/// markers in type signatures.
class _NativeType { }
class _NativeInteger extends _NativeType { }
class _NativeDouble extends _NativeType { }
class Int8 extends _NativeInteger { }
class Int16 extends _NativeInteger { }
class Int32 extends _NativeInteger { }
class Int64 extends _NativeInteger { }
class Uint8 extends _NativeInteger { }
class Uint16 extends _NativeInteger { }
class Uint32 extends _NativeInteger { }
class Uint64 extends _NativeInteger { }
class IntPtr extends _NativeInteger { }
class Float extends _NativeDouble { }
class Double extends _NativeDouble { }
class Void extends _NativeType {}

// Note: do we need to have Char type?
// Note: do we need to have ConstPointer type that only supports loads?

/// A class representing a pointer into the native heap.
abstract class Pointer<T extends _NativeType> extends _NativeType {
  /// Cast Pointer<T> to Pointer<U>.
  Pointer<U> cast<U extends _NativeType>();

  /// Pointer arithmetic (takes element size into account).
  Pointer<T> elementAt(int index);

  /// Pointer arithmetic (byte offset).
  Pointer<T> offsetBy(int offsetInBytes);

  /// Store a value of Dart type R into this location.
  void store<R>(R value);
  
  /// Load a value of Dart type R from this location.
  R load<R>();

  /// Access to the raw pointer value and construction from raw value.
  int toInt();
  factory Pointer.fromInt(int ptr);
}

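Note how the store and load methods have their own type parameter R, which denotes the Dart representation of the stored/loaded value. Unfortunately the Dart type system does not allow us to express mutual constraints between T and R (e.g. if T extends _NativeInteger then R should be int) - this would have to be reported by an "FFI typing pass".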

ffi.Pointer<ffi.Int32> ptr;
final i = ptr.load<int>();  // valid
final s = ptr.load<String>();  // compile time error 

Note that we would rely on the FFI typing pass to outlaw any use of Pointer<T> in which T (or R) is not statically known.

// Compile time error: Pointer<T> has to be statically instantiated.
int load<T extends _NativeType>(Pointer<T> p) => p.load();

// Compile time error: R has to be statically instantiated.
R load(Pointer<Int32> p) => p.load();

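This restriction exists to ensure that the backend can generate the simplest, monomorphic code for pointer loads.

Note: load<R> and store<R> could become extension methods if Dart supported them; then you could write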

on Pointer<T extends _NativeInteger> {
  int load();
  void store(int value);
}

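A question here is what kind of operations can be performed with Pointer<T>. The obvious idea is to allow loading and storing values of type T via this pointer: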

abstract class Pointer<T extends _NativeType> {
  void store(T value);
  T load();
}

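But that does not make sense, because it would mean that Pointer<Int32> dereferences to Int32, whereas we would like it to dereference to int - a type that Dart programmers understand how to use. This is similar to how Int32List.[] returns int and not Int32.

Unfortunately the Dart type system does not allow us to write something like this: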

abstract class Pointer<T extends _NativeType> {
  void store(Representation(T) value);
  Representation(T) load();
}

Where Representation(T) is:

  • int when T extends _NativeInteger;
  • double when T extends _NativeDouble;
  • T when T extends Pointer.

A possible approach is to introduce Pointer subclasses for those different cases:

/// A class representing a pointer into the native heap.
abstract class Pointer<T extends _NativeType> {
  /// Cast Pointer<T> to any other pointer type.
  P cast<P extends Pointer>();
}

abstract class IntPointer<T extends _NativeInteger> extends Pointer<T> {
  void store(int value);
  int load();
}

abstract class DoublePointer<T extends _NativeDouble> extends Pointer<T> {
  void store(double value);
  double load(); 
}

abstract class PointerPointer<T extends Pointer> extends Pointer<T> {
  void store(T value);
  T load();
}

Another possible way to look at this is to consider that any Pointer<T> can be converted to a typed array:

/// A class representing a pointer into the native heap.
abstract class Pointer<T extends _NativeType> {
  U asList<U extends TypedData>();
}

Then you could write code like this:

import 'dart:ffi' as ffi;

ffi.Pointer<ffi.Void> ptr;
// Equivalent of ptr.cast<IntPointer<Int32>>().store(10);
ptr.asList<Int32List>()[0] = 10; 

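Note: the Dart type system does not allow us to express a constraint on U that would ensure that U is a concrete subclass of TypedData and not, for example, TypedData itself.

The Pointer subclass approach, in turn, leads to very verbose types such as PointerPointer<IntPointer<Int32>>.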

library dart.ffi;

/// Allocate [count] elements of type [T] and return a pointer
/// to the newly allocated memory.
Pointer<T> allocate<T extends _NativeType>({int count: 1});

/// Free memory pointed to by [p].
void free<P extends Pointer>(P p);

/// Return a pointer object that has a finalizer attached to it. When this
/// pointer object is collected by GC the given finalizer is invoked.
///
/// Note: the pointer object passed to the finalizer is not the same as 
/// the pointer object that is returned from [finalizable] - it points
/// to the same memory region but has different identity. 
Pointer<T> finalizable<T extends _NativeType>(Pointer<T> p, void finalizer(Pointer<T> ptr));

In general, pointers by themselves are enough to work with structured data:

import 'dart:ffi' as ffi;

/// Same as
///
///     struct Point { 
///       double x;
///       double y; 
///       Point* next;
///    };
///
class Point {
  final ffi.Pointer<ffi.Uint8> _ptr;

  Point.fromPtr(ffi.Pointer<ffi.Void> ptr) : _ptr = ptr.cast<ffi.Uint8>();
  
  Point(double x, double y, Point next) : 
    _ptr = ffi.allocate<ffi.Uint8>(
       count: ffi.sizeOf<ffi.Double>() * 2 + 
              ffi.sizeOf<ffi.Pointer<ffi.Void>>()) {
    this.x = x;
    this.y = y;
    this.next = next;
  }

  ffi.Pointer<ffi.Double> get _xPtr => 
    _ptr.offsetBy(0).cast<ffi.Double>();
  set x (double v) { _xPtr.store(v); }
  double get x => _xPtr.load();

  ffi.Pointer<ffi.Double> get _yPtr => 
    _ptr.offsetBy(ffi.sizeOf<ffi.Double>() * 1).cast<ffi.Double>();
  set y (double v) { _yPtr.store(v); }
  double get y => _yPtr.load();

  ffi.Pointer<ffi.Pointer<ffi.Void>> get _nextPtr =>
    _ptr.offsetBy(ffi.sizeOf<ffi.Double>() * 2).cast<ffi.Pointer<ffi.Void>>();
  set next (Point v) { _nextPtr.store(v._ptr); }
  Point get next => Point.fromPtr(_nextPtr.load()); 
}

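However, this sort of code is very verbose, so we want to hide it under a layer of syntactic sugar. The core idea is that we use normal field declarations to describe the layout, and each field has two types associated with it:

  • the normal Dart type of the field specifies how the field is exposed to Dart code;
  • an annotation specifies the native storage format for the corresponding field.

For example, a declaration like this: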

import 'dart:ffi' as ffi;

@ffi.struct  // Specifies layout (either ffi.struct or ffi.union)
class Point extends ffi.Pointer<Point> {
  @ffi.Double()  // () are confusing :-(
  double x;
  
  @ffi.Double()
  double y;

  @ffi.Pointer()  // To distinguish from the case when one struct embeds
  Point next;     // another by value.
}

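can be transformed by the front-end into the more verbose declaration above.

Note: a few questions remain to be answered here:

  • How to conveniently cast Pointer<Point> to Point?
  • What kind of constructors should Point have?
  • ...

Structure layouts are inherently non-portable between platforms. For example, struct stat, used by the POSIX file status APIs, has a different layout on Mac OS X and Linux.

Dart does not have an equivalent of the preprocessor, so specifying platform-specific layouts requires some other mechanism.

One potential way to do this could be something like this: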

@ffi.struct({
  'x64 && linux': { // Layout on 64-bit Linux
    'x': ffi.Field(ffi.Double, 0),
    'y': ffi.Field(ffi.Double, 8),
    'next': ffi.Field(ffi.Pointer, 16)
  },
  'arm && ios': {  // Layout on 32-bit iOS
    'x': ffi.Field(ffi.Float, 4),
    'y': ffi.Field(ffi.Float, 8),
    'next': ffi.Field(ffi.Pointer, 0)
  },
})
class Point extends ffi.Pointer<Point> {
  double x;
  double y;
  Point next;
}

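Before we dive into the details of how function pointers can be represented in Dart, let us outline what has to happen to invoke a native function from Dart and vice versa.

To invoke a native function from Dart we need to:

  1. Convert outgoing arguments from their Dart representation into the native representation. Note: an important decision here is how much automatic argument marshalling we want to allow, e.g. does String get converted to uint8_t* automatically, or must the programmer do this conversion explicitly when invoking the function?

  2. If the callee can re-enter Dart ("non-leaf"): record the exit frame information so that the Dart GC is able to find it.

    Note: declaring that a function is a leaf (= will not enter Dart code) is an optimization, because it simplifies the marshalling of arguments and the transition from Dart into native code, and also allows optimizations across such a function call, because such a function cannot affect pure Dart objects. If a function is a leaf then converting a String into a const uint8_t* parameter might be as simple as passing a pointer into the String's body (if the string is a one-byte string).

    Note: in general this also means that FFI cannot (easily) interoperate with native non-local control flow (longjmp or exceptions), where control is transferred from one native frame to another native frame, bypassing the Dart frames sandwiched in between. (There are ways to interoperate with exceptions, but they are non-trivial and are left out of scope for now.)

  3. Arrange outgoing arguments on the stack and in registers according to the calling convention of the callee.

  4. Invoke the callee.

  5. When the callee returns, convert the result into its Dart representation and tear down the exit frame. Note: an important question here is how to represent structs returned by value. [The closest idea is to allocate them on the native heap and return a pointer with a finalizer instead of a value.]

Invoking a Dart function from native code is not that different from the process described above - the steps are just somewhat inverted. There are just a few questions to answer:

  • Do we allow invocation of closures and class methods, or do we limit ourselves to static functions?
  • If yes, how are these represented in native code, and how are receivers represented in native code? (Note: previously we talked only about passing native data back and forth. Passing Dart objects into native code requires a handle system, so that the GC knows about them.)
  • Do we expect the thread invoking a Dart function to be attached to the isolate (e.g. via the Dart_EnterIsolate API call)? Do we guard against the possibility that a user might misuse the FFI and try to invoke a function (e.g. a callback) on the wrong thread? Should the FFI be structured in a way that highlights the possibility of such an error and allows it to be reported, or should we just crash?

Imagine we want to convert this code to Dart FFI: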

typedef int32_t (*binary_t)(int32_t x, int32_t y); 
struct Ops {
  binary_t add;
  binary_t sub;
};

// Invoke by pointer
int32_t invoke(binary_t f, int32_t x, int32_t y) {
  return f(x, y);
}

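We can follow the same design we had for fields: use a combination of two types, one that describes the native nature of a function pointer and one that describes how it will be used from Dart. For example, we could extend the Pointer class with a way to coerce it to a Dart function, and also create a NativeFunction class that represents the type of native functions: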

library dart.ffi;

abstract class Pointer<T extends _NativeType> {
  // Should only be valid if T is a function type. Creates a function that
  // will marshall all incoming parameters, perform an invocation via
  // this pointer and then unmarshall the result. 
  U asFunction<U extends Function>();
} 

class NativeFunction<T extends Function> extends _NativeType {
}

Which can be used like this:

import 'dart:ffi' as ffi;

typedef ffi.Int32 NativeBinaryOp(ffi.Int32, ffi.Int32);
typedef int BinaryOp(int, int);

@ffi.struct 
class Ops extends ffi.Pointer<Ops> {
  // Front-end ensures that type of the annotation is 
  @ffi.NativeFunction<ffi.Int32 Function(ffi.Int32, ffi.Int32)>()
  BinaryOp add;

  @ffi.NativeFunction<ffi.Int32 Function(ffi.Int32, ffi.Int32)>()
  BinaryOp sub;
}

// Invoke by pointer. Note: have to write ffi.Pointer<NativeFunction<...>>
// because Pointer constraints T to be a subtype of _NativeType.
int invoke(ffi.Pointer<ffi.NativeFunction<NativeBinaryOp>> op, int x, int y) {
  return op.asFunction<BinaryOp>()(x, y);
}

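Note: we want the code to be AOT compilable, so we will specify that the invocation Pointer<F>.asFunction<G>() depends only on the static values of F and G and not on the reified type of the receiver - otherwise we cannot precompile all necessary marshalling stubs. (A language feature to specify generic invariance / exactness would be beneficial here.)

This looks relatively clean, but unfortunately it does not capture some of the required information:

  • the calling convention;
  • whether the function is a leaf or not.

Unfortunately it is not entirely clear what the best way is to encode this information into the type of the pointer. One possible way is to do something like this: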

class _CallingConvention {}
class Cdecl extends _CallingConvention {}
class StdCall extends _CallingConvention {}

class _Leafness {}
class Leaf extends _Leafness {}
class NotLeaf extends _Leafness {}

class NativeFunction<T extends Function, 
                     CC extends _CallingConvention, 
                     L extends _Leafness> extends _NativeType {
}

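But this might be too verbose (especially because Dart does not support default values for type parameters).

TODO(vegorov) describe helpers that can be used to convert, for example, between a pointer and a string or a pointer and an array; note: external typed data and strings can be used for efficient conversions.

What if a native function requires you to pass in a callback?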

typedef intptr_t (*callback_t)(void* baton, void* something);
void with_something(callback_t cb, void* baton);

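If we want to call this from Dart, how do we pass a function?

For simplicity, initially we should only allow passing static methods - this is very simple to implement, because static methods can simply have redirecting trampolines.

For APIs that allow batons to be associated with callbacks, users could use handmade persistent handles to pass closures around. The idea is along these lines: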

typedef int Callback(ffi.Pointer<ffi.Void> something);

int _id = 0;
final _i2cb = <int, Callback>{};
final _cb2i = <Callback, int>{};

int _trampoline(ffi.Pointer<ffi.Void> baton, ffi.Pointer<ffi.Void> something) {
  return _i2cb[baton.toInt()](something);
}

ffi.Pointer<ffi.Void> _toHandle(Callback cb) {
  return ffi.Pointer<ffi.Void>.fromInt(_cb2i.putIfAbsent(cb, () {
    _i2cb[_id] = cb;
    return _id++;
  }));
}

void withSomething(Callback cb) {
  with_something(_trampoline, _toHandle(cb));
}

Note that this leaks memory, so it is really only applicable to singletons or to APIs that support unregistering.
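The handle-table idea is not specific to Dart. As an illustration, here is a sketch in C of the same pattern, including the unregistration step that avoids the leak. Everything here is hypothetical: Closure stands in for a Dart closure, with_something is a toy implementation of the API declared above that invokes the callback once with the value 40, and last_result exists only so that the effect can be observed.

```c
#include <stddef.h>
#include <stdint.h>

typedef intptr_t (*callback_t)(void* baton, void* something);

/* Stand-in for a closure: a function plus its captured state. */
typedef struct {
  intptr_t (*fn)(void* env, void* something);
  void* env;
} Closure;

#define MAX_HANDLES 16
static Closure handle_table[MAX_HANDLES];
static int handle_used[MAX_HANDLES];

/* Store the closure and return a baton for it (NULL if the table is full). */
void* handle_register(Closure c) {
  for (intptr_t i = 0; i < MAX_HANDLES; i++) {
    if (!handle_used[i]) {
      handle_used[i] = 1;
      handle_table[i] = c;
      return (void*)(i + 1); /* +1 so that a valid baton is never NULL */
    }
  }
  return NULL;
}

/* Release the slot so that the table does not grow without bound. */
void handle_unregister(void* baton) {
  intptr_t i = (intptr_t)baton - 1;
  if (i >= 0 && i < MAX_HANDLES) handle_used[i] = 0;
}

/* The single static function handed to the native API. */
intptr_t trampoline(void* baton, void* something) {
  intptr_t i = (intptr_t)baton - 1;
  if (i < 0 || i >= MAX_HANDLES || !handle_used[i]) return -1;
  return handle_table[i].fn(handle_table[i].env, something);
}

/* Toy native API: calls the callback once and remembers the result. */
static intptr_t last_result;
void with_something(callback_t cb, void* baton) {
  int value = 40;
  last_result = cb(baton, &value);
}

/* Example closure body: adds the captured int to the incoming int. */
intptr_t add_env(void* env, void* something) {
  return *(int*)env + *(int*)something;
}
```

The baton is just a small integer in disguise, which is why it can cross the boundary without the native side knowing anything about closures.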

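For APIs without batons there is still a way to pass closures as function pointers: by having a closure-specific trampoline for each distinct closure. However, this only works if the number of closures passed to the other side is low (because AOT has to pregenerate a fixed number of trampolines), and it also only really works with APIs that support registering and unregistering.

The previous section already introduced the possibility of invoking native code from Dart through function pointers. So if the dart:ffi library provides primitives like dlopen/dlsym, that is already enough to cross the boundary in this direction.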

library dart.ffi;

class DynamicLibrary {
  // Equivalent of dlopen
  factory DynamicLibrary.open(String name);

  // Equivalent of dlsym
  Pointer<SymbolType> lookup<SymbolType extends _NativeType>(String symbolName);

  // Helper that combines lookup and cast to a Dart function.  
  // Note: user code is would not be permitted to be generic like this.
  // However FFI own code can.
  // Note: ignoring leafness and calling convention for brevity.
  F lookupFunction<SymbolType extends Function, F extends Function>(String symbolName) {
    return lookup<NativeFunction<SymbolType>>(symbolName)?.asFunction<F>();
  }
}

import 'dart:ffi' as ffi;

// Invoke int32_t add(int32_t, int32_t) from library libfoo.so
final lib = ffi.DynamicLibrary.open('libfoo.so');
final add = lib.lookupFunction<ffi.Int32 Function(ffi.Int32, ffi.Int32), int Function(int, int)>('add');
print(add(1, 2));

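However, this style of code is unnecessarily verbose, so we should also provide a declarative way of binding Dart functions to native functions. For example: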

library dart.ffi;

/// An annotation that can be used to make FE/VM generate binding code
/// between an extern static method declaration and native code.
class Import<NativeType> {
  /// Native library that contains the target native method.
  /// Can be null - then the symbol is resolved globally.
  final String library;

  /// Symbol to bind to.
  final String symbol;

  /// Specifies whether the target function is expected to call 
  /// the Dart code back.
  final bool isLeaf;

  final callingConvention;

  const Import({
    this.library,
    this.symbol,
    this.isLeaf: true,
    this.callingConvention: Cdecl  // Note: Cdecl is a Type literal.
  });
}

import 'dart:ffi' as ffi;

@ffi.Import<ffi.Int32 Function(ffi.Int32, ffi.Int32)>(
  library: 'foo',  // Q: should mangle library name in platform specific way?
  symbol: 'add',
)
extern int nativeAdd(int a, int b);

@ffi.Import<ffi.Int32>(symbol: 'g_counter')
extern int globalCounter;

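TODO(vegorov) We need to think about how bindings that are portable across platforms (such as Linux, Android, macOS, iOS and Windows) have to be structured.

This will potentially require the use of conditional imports.

TODO(vegorov) As a stretch goal, we could imagine a tool that generates Dart bindings from C header files.

[Note: this is a stretch goal and will not be included in the initial prototype.]

Imagine the inverse of the problem described in the previous section: our Dart program defines a static method int add(int x, int y) and we want to call it from C++ code.

The core idea here is to introduce an annotation, ffi.Export: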

library foo;

@ffi.Export<ffi.Int32 Function(ffi.Int32, ffi.Int32)>(symbol: 'add')
int add(int a, int b) => a + b;

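This annotation instructs the VM to generate an externally callable trampoline with the corresponding native signature int32_t (int32_t, int32_t).

Then, in native code, the developer can do: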

typedef int32_t (*add_t)(int32_t, int32_t);
add_t f = Dart_LookupFFIExport("foo", "add");
f(1, 2);

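Note that this can be taken further: we could have a tool that generates a binding module from the annotations, containing code like the following: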

#if defined(DART_AOT_USING_DLL)
// AOT compiler would generate a symbol that can be hooked up by 
// the normal dynamic linkage process.
extern "C" int32_t dart_foo_add(int32_t x, int32_t y);
#else
// In JIT or blob based AOT we have to lookup dynamically.
int32_t dart_foo_add(int32_t x, int32_t y) {
  static int32_t (*f) (int32_t, int32_t) = Dart_LookupFFIExport("foo", "add");
  return f(x, y);
}
#endif

Translated with www.deepl.com

Dart VM FFI Vision

Background

The aim of the Dart FFI project (tracked as Issue #34452) is to provide a low-boilerplate, low-ceremony and low-overhead way of interoperating with native C/C++ code.

The motivation behind this project is twofold:

  1. One of the most common requests for Flutter is a low-overhead synchronous mechanism for interacting with native (C/C++) code (see Issue #7053).
  2. We want a replacement for the Dart VM C API that reflects how the Dart language looks today and the contexts in which it is used.

Currently Flutter supports interacting with platform-specific code written in Java (Kotlin) and Objective-C (Swift) via platform channels. This mechanism is based on asynchronous message passing and requires people to write glue code in both Dart and the respective platform language. It is a high-overhead solution, both in terms of performance and in the boilerplate code the programmer is required to write.

The Dart VM provides a C API defined in the dart_api.h header and a mechanism for binding Dart code to native C/C++ code via native extensions. However, this mechanism is not integrated with Flutter and cannot be used out of the box.

While it is possible to make the necessary changes to the Flutter engine and tools to enable developers to write native extensions based on the VM API (see Native Flutter Extensions Prototype doc), we believe that using the Dart VM C API is not the right way forward, for the following reasons:

  • C API is name based e.g. Dart_Handle Dart_GetField(Dart_Handle container, Dart_Handle name);

    • This makes it AOT unfriendly;
    • This makes it slow - because the results of the name resolution are not cached;
  • It is reflective: native functions have signature void (Dart_NativeArguments args) which allows them to accept any arguments and return any results, even though the signature of the function on the Dart side is usually much more strict and provides enough information to the compiler to perform necessary marshalling automatically. Unwrapping arguments and wrapping results requires multiple roundtrips through API boundary - and can't be optimized by the Dart compilation toolchain. One of the core ideas behind FFI is:

    If a native function with a statically known signature is bound to a Dart function with a known signature then marshalling of arguments and results based on statically known types is more efficient than reflective marshalling of arguments based on Dart C API.

  • It is verbose.

Based on these observations we also expect that a leaner way to integrate with native code should benefit current users of the Dart VM C API. For example, we expect that moving the Flutter engine from the C API to FFI should significantly reduce the overheads associated with crossing the boundary between Dart and native code.

Design Sketch

Note about type system

In general we try to fit FFI design into existing Dart type system as much as possible so that things like code completion and static errors work as expected.

However, as the sections below make evident, this is not always possible, usually due to the lack of type system features that would allow us to encode the necessary information into static types and enforce additional typing rules.

This means that the FFI implementation would potentially have to come with its own extensions to the Dart type system, with rules enforced as an additional Kernel transformation at the CFE level and as a linter at the analyzer level. The incomplete unification of Dart front-ends unfortunately means that this work will have to be duplicated, just as it is duplicated for other language features.

Accessing Native Types from Dart

The first pillar of an FFI is a way to access native memory from Dart. Design of how this is expressed in Dart code is constrained by Dart semantics:

  • Dart types are reference types
  • The mapping between native types and Dart builtin types is usually many to one. For example native int8_t and int32_t both correspond to int type on the Dart side.
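The many-to-one mapping can be illustrated from the native side. The following is a minimal sketch in C, assuming (this is an assumption, not part of the design described here) that a Dart int is modelled as int64_t; the helper names load_int8, load_int32, store_int8 and store_int32 are hypothetical. Loads widen the native value, while stores have to narrow it, which is where the native width matters.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: int64_t stands in for the Dart-side int. */
typedef int64_t dart_int;

/* Load an int8_t from native memory and widen it. */
dart_int load_int8(const void* p) {
  int8_t v;
  memcpy(&v, p, sizeof v);
  return v;
}

/* Load an int32_t from native memory and widen it. */
dart_int load_int32(const void* p) {
  int32_t v;
  memcpy(&v, p, sizeof v);
  return v;
}

/* Store the low 8 bits of value; higher bits are dropped. */
void store_int8(void* p, dart_int value) {
  uint8_t bits = (uint8_t)value;
  memcpy(p, &bits, sizeof bits);
}

/* Store the low 32 bits of value; higher bits are dropped. */
void store_int32(void* p, dart_int value) {
  uint32_t bits = (uint32_t)value;
  memcpy(p, &bits, sizeof bits);
}
```

Both loads return the same Dart-side type even though the native types differ, which is why a pointer type needs a marker such as Int8 or Int32 to select the right width.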

Pointers and Primitives

library dart.ffi;

/// Classes representing native width integers from the native side.
/// They are not constructible in the Dart code and serve purely as
/// markers in type signatures.
class _NativeType { }
class _NativeInteger extends _NativeType { }
class _NativeDouble extends _NativeType { }
class Int8 extends _NativeInteger { }
class Int16 extends _NativeInteger { }
class Int32 extends _NativeInteger { }
class Int64 extends _NativeInteger { }
class Uint8 extends _NativeInteger { }
class Uint16 extends _NativeInteger { }
class Uint32 extends _NativeInteger { }
class Uint64 extends _NativeInteger { }
class IntPtr extends _NativeInteger { }
class Float extends _NativeDouble { }
class Double extends _NativeDouble { }
class Void extends _NativeType {}

// Note: do we need to have Char type?
// Note: do we need to have ConstPointer type that only supports loads?

/// A class representing a pointer into the native heap.
abstract class Pointer<T extends _NativeType> extends _NativeType {
  /// Cast Pointer<T> to Pointer<U>.
  Pointer<U> cast<U extends _NativeType>();

  /// Pointer arithmetic (takes element size into account).
  Pointer<T> elementAt(int index);

  /// Pointer arithmetic (byte offset).
  Pointer<T> offsetBy(int offsetInBytes);

  /// Store a value of Dart type R into this location.
  void store<R>(R value);
  
  /// Load a value of Dart type R from this location.
  R load<R>();

  /// Access to the raw pointer value and construction from raw value.
  int toInt();
  factory Pointer.fromInt(int ptr);
}

Note how the store and load methods have their own type parameter R, which denotes the Dart representation of the stored/loaded value. Unfortunately the Dart type system does not allow us to express mutual constraints between T and R (e.g. if T extends _NativeInteger then R should be int) - this would have to be reported by an "FFI typing pass".

ffi.Pointer<ffi.Int32> ptr;
final i = ptr.load<int>();  // valid
final s = ptr.load<String>();  // compile time error 

Note that we would rely on the FFI typing pass to outlaw usage of Pointer<T> in such a way that T (or R) are not statically known.

// Compile time error: Pointer<T> has to be statically instantiated.
int load<T extends _NativeType>(Pointer<T> p) => p.load();

// Compile time error: R has to be statically instantiated.
R load(Pointer<Int32> p) => p.load();

This restriction exists to ensure that the backend can generate the simplest, monomorphic code for pointer loads.
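To see why statically known types matter for code generation, here is a sketch in C contrasting the two situations. The names NativeKind, load_dynamic and load_int32_mono are hypothetical and only illustrate the idea; they are not part of the proposed API.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical runtime tag describing the native element type. */
typedef enum { KIND_INT8, KIND_INT32, KIND_INT64 } NativeKind;

/* Element type known only at run time: every load pays for a dispatch. */
int64_t load_dynamic(const void* p, NativeKind kind) {
  switch (kind) {
    case KIND_INT8:  { int8_t v;  memcpy(&v, p, sizeof v); return v; }
    case KIND_INT32: { int32_t v; memcpy(&v, p, sizeof v); return v; }
    case KIND_INT64: { int64_t v; memcpy(&v, p, sizeof v); return v; }
  }
  return 0;
}

/* Element type known statically: the load is one fixed-width read. */
int64_t load_int32_mono(const void* p) {
  int32_t v;
  memcpy(&v, p, sizeof v);
  return v;
}
```

The second form is what a compiler can emit when both type arguments are fixed at the call site; the first is what it would be forced into otherwise.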

Note: load<R> and store<R> could become extension methods if Dart supported them; then you could write

on Pointer<T extends _NativeInteger> {
  int load();
  void store(int value);
}

Alternatives considered to load<R>/store<R>

A question here is what kind of operations can be performed with Pointer<T>. The obvious idea is to allow loading and storing values of type T via this pointer:

abstract class Pointer<T extends _NativeType> {
  void store(T value);
  T load();
}

But that does not make sense, because that would mean that Pointer<Int32> dereferences to Int32 where we would like it to dereference to int - a type that Dart programmers understand how to use. This is similar to how Int32List.[] returns int and not Int32.

Unfortunately the Dart type system does not allow us to write something like this:

abstract class Pointer<T extends _NativeType> {
  void store(Representation(T) value);
  Representation(T) load();
}

Where Representation(T) is:

  • int when T extends _NativeInteger;
  • double when T extends _NativeDouble;
  • T when T extends Pointer.

A possible approach is to introduce Pointer subclasses for those different cases:

/// A class representing a pointer into the native heap.
abstract class Pointer<T extends _NativeType> {
  /// Cast Pointer<T> to any other pointer type.
  P cast<P extends Pointer>();
}

abstract class IntPointer<T extends _NativeInteger> extends Pointer<T> {
  void store(int value);
  int load();
}

abstract class DoublePointer<T extends _NativeDouble> extends Pointer<T> {
  void store(double value);
  double load(); 
}

abstract class PointerPointer<T extends Pointer> extends Pointer<T> {
  void store(T value);
  T load();
}

Another possible way to look at this is to consider that any Pointer<T> can be converted to a typed array:

/// A class representing a pointer into the native heap.
abstract class Pointer<T extends _NativeType> {
  U asList<U extends TypedData>();
}

Then you could write code like this:

import 'dart:ffi' as ffi;

ffi.Pointer<ffi.Void> ptr;
// Equivalent of ptr.cast<IntPointer<Int32>>().store(10);
ptr.asList<Int32List>()[0] = 10; 

Note: the Dart type system does not allow us to express a constraint on U that would ensure that U is a concrete subclass of TypedData and not, for example, TypedData itself.

The Pointer subclass approach, in turn, leads to very verbose types such as PointerPointer<IntPointer<Int32>>.

Allocating and Freeing Memory

library dart.ffi;

/// Allocate [count] elements of type [T] and return a pointer
/// to the newly allocated memory.
Pointer<T> allocate<T extends _NativeType>({int count: 1});

/// Free memory pointed to by [p].
void free<P extends Pointer>(P p);

/// Return a pointer object that has a finalizer attached to it. When this
/// pointer object is collected by GC the given finalizer is invoked.
///
/// Note: the pointer object passed to the finalizer is not the same as 
/// the pointer object that is returned from [finalizable] - it points
/// to the same memory region but has different identity. 
Pointer<T> finalizable<T extends _NativeType>(Pointer<T> p, void finalizer(Pointer<T> ptr));

Structures/Unions

In general just pointers themselves are enough to work with structured data:

import 'dart:ffi' as ffi;

/// Same as
///
///     struct Point { 
///       double x;
///       double y; 
///       Point* next;
///    };
///
class Point {
  final ffi.Pointer<ffi.Uint8> _ptr;

  Point.fromPtr(ffi.Pointer<ffi.Void> ptr) : _ptr = ptr.cast<ffi.Uint8>();
  
  Point(double x, double y, Point next) : 
    _ptr = ffi.allocate<ffi.Uint8>(
       count: ffi.sizeOf<ffi.Double>() * 2 + 
              ffi.sizeOf<ffi.Pointer<ffi.Void>>()) {
    this.x = x;
    this.y = y;
    this.next = next;
  }

  ffi.Pointer<ffi.Double> get _xPtr => 
    _ptr.offsetBy(0).cast<ffi.Double>();
  set x (double v) { _xPtr.store(v); }
  double get x => _xPtr.load();

  ffi.Pointer<ffi.Double> get _yPtr => 
    _ptr.offsetBy(ffi.sizeOf<ffi.Double>() * 1).cast<ffi.Double>();
  set y (double v) { _yPtr.store(v); }
  double get y => _yPtr.load();

  ffi.Pointer<ffi.Pointer<ffi.Void>> get _nextPtr =>
    _ptr.offsetBy(ffi.sizeOf<ffi.Double>() * 2).cast<ffi.Pointer<ffi.Void>>();
  set next (Point v) { _nextPtr.store(v._ptr); }
  Point get next => Point.fromPtr(_nextPtr.load()); 
}
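
For comparison, the same offset-based access can be sketched on the native side in C. This is only an illustration, under the assumption that the memory is laid out the way the C compiler lays out struct Point; the helper names (point_alloc, point_set_x and so on) are hypothetical. Using offsetof instead of summing field sizes avoids hard-coding assumptions about padding.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct Point {
  double x;
  double y;
  struct Point* next;
};

/* Allocate raw, zeroed bytes large enough for one Point. */
uint8_t* point_alloc(void) {
  return (uint8_t*)calloc(1, sizeof(struct Point));
}

/* Each accessor is a byte pointer plus a field offset. */
void point_set_x(uint8_t* p, double v) {
  memcpy(p + offsetof(struct Point, x), &v, sizeof v);
}

double point_get_x(const uint8_t* p) {
  double v;
  memcpy(&v, p + offsetof(struct Point, x), sizeof v);
  return v;
}

void point_set_y(uint8_t* p, double v) {
  memcpy(p + offsetof(struct Point, y), &v, sizeof v);
}

double point_get_y(const uint8_t* p) {
  double v;
  memcpy(&v, p + offsetof(struct Point, y), sizeof v);
  return v;
}

void point_set_next(uint8_t* p, uint8_t* next) {
  struct Point* n = (struct Point*)next;
  memcpy(p + offsetof(struct Point, next), &n, sizeof n);
}

uint8_t* point_get_next(const uint8_t* p) {
  struct Point* n;
  memcpy(&n, p + offsetof(struct Point, next), sizeof n);
  return (uint8_t*)n;
}
```

Values written through these accessors are visible through an ordinary struct Point* view of the same memory, which is the property the Dart wrapper class relies on.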

However, this sort of code is very verbose, so we want to hide it under a layer of syntactic sugar. The core idea is that we use normal field declarations to describe the layout, and each field has two types associated with it:

  • the normal Dart type of the field specifies how the field is exposed to Dart code;
  • an annotation specifies the native storage format for the corresponding field.

For example, a declaration like this:

import 'dart:ffi' as ffi;

@ffi.struct  // Specifies layout (either ffi.struct or ffi.union)
class Point extends ffi.Pointer<Point> {
  @ffi.Double()  // () are confusing :-(
  double x;
  
  @ffi.Double()
  double y;

  @ffi.Pointer()  // To distinguish from the case when one struct embeds
  Point next;     // another by value.
}

can be transformed by the front-end into the more verbose declaration above.

Note: a few questions remain to be answered here:

  • How to conveniently cast Pointer<Point> to Point?
  • What kind of constructors should Point have?
  • ...

Structure Layouts and Portability

Structure layouts are inherently non-portable between platforms. For example, struct stat, used by the POSIX file status APIs, has a different layout on Mac OS X and Linux.

Dart does not have an equivalent of the preprocessor, so specifying platform-specific layouts requires some other mechanism.
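
One way to obtain the per-platform numbers is to let a C compiler for each target compute them. The sketch below is an assumption about how such a helper might look (the FieldLayout type and the point_layout table are hypothetical); it simply records offsetof and sizeof for each field of struct Point.

```c
#include <stddef.h>

struct Point {
  double x;
  double y;
  struct Point* next;
};

/* Hypothetical record describing where one field lives. */
typedef struct {
  const char* name;
  size_t offset;
  size_t size;
} FieldLayout;

/* Layout as seen by the compiler for the current target. */
static const FieldLayout point_layout[] = {
  {"x",    offsetof(struct Point, x),    sizeof(double)},
  {"y",    offsetof(struct Point, y),    sizeof(double)},
  {"next", offsetof(struct Point, next), sizeof(struct Point*)},
};

/* Number of entries in point_layout. */
size_t point_layout_count(void) {
  return sizeof point_layout / sizeof point_layout[0];
}
```

Compiling this table for each target of interest yields the offsets and sizes that a per-platform layout description on the Dart side would need to carry.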

One potential way to do this could be something like this:

@ffi.struct({
  'x64 && linux': { // Layout on 64-bit Linux
    'x': ffi.Field(ffi.Double, 0),
    'y': ffi.Field(ffi.Double, 8),
    'next': ffi.Field(ffi.Pointer, 16)
  },
  'arm && ios': {  // Layout on 32-bit iOS
    'x': ffi.Field(ffi.Float, 4),
    'y': ffi.Field(ffi.Float, 8),
    'next': ffi.Field(ffi.Pointer, 0)
  },
})
class Point extends ffi.Pointer<Point> {
  double x;
  double y;
  Point next;
}

Function types

Before we dive into the details of how function pointers can be represented in Dart, let us outline what has to happen to invoke a native function from Dart and vice versa.

Invoking Native Function from Dart

To invoke a native function from Dart we need to:

  1. Convert outgoing arguments from their Dart representation into the native representation;
    Note: an important decision here is how much automatic argument marshalling we want to allow, e.g. does String get converted to uint8_t* automatically, or must the programmer do this conversion explicitly when invoking the function?
  2. If the callee can re-enter Dart ("non-leaf"): record the exit frame information so that the Dart GC is able to find it;
    Note: declaring that a function is a leaf (= will not enter Dart code) is an optimization, because it simplifies the marshalling of arguments and the transition from Dart into native code, and also allows optimizations across such a function call, because such a function cannot affect pure Dart objects. If a function is a leaf then converting a String into a const uint8_t* parameter might be as simple as passing a pointer into the String's body (if the string is a one-byte string);
    Note: in general this also means that FFI cannot (easily) interoperate with native non-local control flow (longjmp or exceptions), where control is transferred from one native frame to another native frame, bypassing the Dart frames sandwiched in between. (There are ways to interoperate with exceptions, but they are non-trivial and are left out of scope for now.)
  3. Arrange outgoing arguments on the stack and in registers according to the calling convention of the callee;
  4. Invoke the callee;
  5. When the callee returns, convert the result into its Dart representation and tear down the exit frame.
    Note: an important question here is how to represent structs returned by value. [The closest idea is to allocate them on the native heap and return a pointer with a finalizer instead of a value.]
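
The steps above can be sketched for one concrete signature. The following C code is a hand-written stand-in for the kind of stub a compiler could generate for a leaf function of type int32_t (int32_t, int32_t). It assumes that Dart integers are modelled as int64_t, and the names call_binary_int32 and native_add are hypothetical. Step 2 and the exit frame part of step 5 are omitted because the callee is treated as a leaf.

```c
#include <stdint.h>

typedef int32_t (*binary_t)(int32_t x, int32_t y);

/* Example native callee. */
int32_t native_add(int32_t x, int32_t y) {
  return x + y;
}

/* Sketch of a marshalling stub for a leaf call.
   Assumption: both 64-bit arguments are within int32_t range. */
int64_t call_binary_int32(binary_t f, int64_t x, int64_t y) {
  /* Step 1: convert arguments to their native representation. */
  int32_t nx = (int32_t)x;
  int32_t ny = (int32_t)y;
  /* Steps 3 and 4: the C compiler places the arguments according
     to the platform calling convention and performs the call. */
  int32_t result = f(nx, ny);
  /* Step 5: widen the result back to the Dart-side representation. */
  return (int64_t)result;
}
```

Because the signature is fixed, each conversion is a single fixed-width operation with no inspection of argument types at run time.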

Invoking Dart Function from Native

Invoking a Dart function from native code is not that different from the process described above - the steps are just somewhat inverted. There are just a few questions to answer:

  • Do we allow invocation of closures and class methods, or do we limit ourselves to static functions?
  • If yes, how are these represented in native code, and how are receivers represented in native code? (Note: previously we talked only about passing native data back and forth. Passing Dart objects into native code requires a handle system, so that the GC knows about them.)
  • Do we expect the thread invoking a Dart function to be attached to the isolate (e.g. via the Dart_EnterIsolate API call)? Do we guard against the possibility that a user might misuse the FFI and try to invoke a function (e.g. a callback) on the wrong thread? Should the FFI be structured in a way that highlights the possibility of such an error and allows it to be reported, or should we just crash?

Representing Function Pointers

Imagine we want to convert this code to Dart FFI:

typedef int32_t (*binary_t)(int32_t x, int32_t y); 
struct Ops {
  binary_t add;
  binary_t sub;
};

// Invoke by pointer
int32_t invoke(binary_t f, int32_t x, int32_t y) {
  return f(x, y);
}

We can follow the same design we had for fields: use a combination of two types, one that describes the native nature of a function pointer and one that describes how it will be used from Dart. For example, we could extend the Pointer class with a way to coerce it to a Dart function, and also create a NativeFunction class that represents the type of native functions:

library dart.ffi;

abstract class Pointer<T extends _NativeType> {
  // Should only be valid if T is a function type. Creates a function that
  // will marshall all incoming parameters, perform an invocation via
  // this pointer and then unmarshall the result. 
  U asFunction<U extends Function>();
} 

class NativeFunction<T extends Function> extends _NativeType {
} 

Which can be used like this:

import 'dart:ffi' as ffi;

typedef ffi.Int32 NativeBinaryOp(ffi.Int32, ffi.Int32);
typedef int BinaryOp(int, int);

@ffi.struct 
class Ops extends ffi.Pointer<Ops> {
  // Front-end ensures that the type of the annotation is compatible with
  // the declared Dart type of the field.
  @ffi.NativeFunction<ffi.Int32 Function(ffi.Int32, ffi.Int32)>()
  BinaryOp add;

  @ffi.NativeFunction<ffi.Int32 Function(ffi.Int32, ffi.Int32)>()
  BinaryOp sub;
}

// Invoke by pointer. Note: have to write ffi.Pointer<ffi.NativeFunction<...>>
// because Pointer constrains T to be a subtype of _NativeType.
int invoke(ffi.Pointer<ffi.NativeFunction<NativeBinaryOp>> op, int x, int y) {
  return op.asFunction<BinaryOp>()(x, y);
}

Note: we want the code to be AOT compilable, so we will specify that the invocation Pointer<F>.asFunction<G>() depends only on the static values of F and G and not on the reified type of the receiver; otherwise we cannot precompile all necessary marshalling stubs. (A language feature to specify generic invariance / exactness would be beneficial here.)
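
To make the precompilation argument concrete, here is a minimal C sketch of what a marshalling stub specialised for one static (F, G) pair might do. It assumes Dart ints reach the stub as 64-bit values; `stub_i32_i32_to_i32` and `native_add` are hypothetical names for illustration, not part of any VM or FFI API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int32_t (*binary_t)(int32_t x, int32_t y);

// Hypothetical native target with the C signature int32_t(int32_t, int32_t).
int32_t native_add(int32_t x, int32_t y) {
  return x + y;
}

// Hypothetical stub specialised ahead of time for the static pair
//   F = Int32 Function(Int32, Int32), G = int Function(int, int).
// Dart ints are assumed to arrive as 64-bit values. The stub narrows each
// argument to int32_t, calls through the pointer, and widens the result.
// Nothing here depends on a runtime type, so it can be generated in advance.
int64_t stub_i32_i32_to_i32(binary_t target, int64_t a, int64_t b) {
  assert(target != NULL);
  int32_t x = (int32_t)a;  // out-of-range values narrow in an implementation-defined way
  int32_t y = (int32_t)b;
  return (int64_t)target(x, y);
}
```

Because the narrowing and widening steps are fixed by F and G alone, one such stub per signature pair would suffice, which is why the reified type of the receiver must not influence the call.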

This looks relatively clean, but unfortunately it does not capture some of the information required:

  • Calling convention;
  • Whether the function is a leaf or not.

Unfortunately, it is not entirely clear what the best way is to encode this information into the type of the pointer. One possible way is to do something like this:

class _CallingConvention {}
class Cdecl extends _CallingConvention {}
class StdCall extends _CallingConvention {}

class _Leafness {}
class Leaf extends _Leafness {}
class NotLeaf extends _Leafness {}

class NativeFunction<T extends Function, 
                     CC extends _CallingConvention, 
                     L extends _Leafness> extends _NativeType {
}

But this might be too verbose (especially because Dart does not support default values for type parameters).

Conversion between builtin types and native types

TODO(vegorov) describe helpers that can be used to convert for example between pointer and a string, pointer and array; note: can use external typed data and strings for efficient conversions.

Converting Dart Functions to Function Pointers

What if a native function requires you to pass in a callback?

typedef intptr_t (*callback_t)(void* baton, void* something);
void with_something(callback_t cb, void* baton);

If we want to invoke this from Dart, how do we pass a function in?

For simplicity, initially we should only allow passing down _static methods_. This is very simple to implement because static methods could simply have redirecting trampolines.

For APIs that allow associating _batons_ with callbacks, users can use handmade persistent handles to pass closures along these lines:

typedef int Callback(ffi.Pointer<ffi.Void> something);

int _id = 0;
final _i2cb = <int, Callback>{};
final _cb2i = <Callback, int>{};

int _trampoline(ffi.Pointer<ffi.Void> baton, ffi.Pointer<ffi.Void> something) {
  return _i2cb[baton.toInt()](something);
}

ffi.Pointer<ffi.Void> _toHandle(Callback cb) {
  return ffi.Pointer<ffi.Void>.fromInt(_cb2i.putIfAbsent(cb, () {
    _i2cb[_id] = cb;
    return _id++;
  }));
}

void withSomething(Callback cb) {
  with_something(_trampoline, _toHandle(cb));
}

Note that this leaks memory, because entries are never removed from the two maps, so it only really works for APIs that are one-shot or support deregistration.
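
The same baton technique, with the deregistration step that avoids the leak, can be sketched in C. This is only an illustration under stated assumptions: `closure_t`, `handle_acquire`, `handle_release` and `add_into_env` are hypothetical names, and the body given to `with_something` is a stand-in for a real native API that invokes its callback exactly once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef intptr_t (*callback_t)(void* baton, void* something);

// Stand-in for a Dart closure: a function plus its captured environment.
typedef struct {
  intptr_t (*fn)(void* env, void* something);
  void* env;
} closure_t;

#define MAX_HANDLES 16
static closure_t g_closures[MAX_HANDLES];
static int g_in_use[MAX_HANDLES];

// Stores the closure and returns its slot encoded as a baton. Slots are
// offset by one so a valid baton is never NULL; NULL means "table full".
void* handle_acquire(closure_t closure) {
  for (size_t i = 0; i < MAX_HANDLES; i++) {
    if (!g_in_use[i]) {
      g_in_use[i] = 1;
      g_closures[i] = closure;
      return (void*)(uintptr_t)(i + 1);
    }
  }
  return NULL;
}

// Deregistration: frees the slot so the table does not fill up permanently.
void handle_release(void* baton) {
  size_t i = (size_t)(uintptr_t)baton - 1;
  assert(i < MAX_HANDLES && g_in_use[i]);
  g_in_use[i] = 0;
}

// The single static trampoline handed to native code for every closure.
intptr_t trampoline(void* baton, void* something) {
  size_t i = (size_t)(uintptr_t)baton - 1;
  assert(i < MAX_HANDLES && g_in_use[i]);
  return g_closures[i].fn(g_closures[i].env, something);
}

// Stand-in body for with_something: calls the callback once with a pointer to 42.
void with_something(callback_t cb, void* baton) {
  int value = 42;
  cb(baton, &value);
}

// Example closure body: adds the int behind `something` to the int behind `env`.
intptr_t add_into_env(void* env, void* something) {
  *(int*)env += *(int*)something;
  return *(int*)env;
}
```

A caller would acquire a handle, pass `trampoline` together with the handle to `with_something`, and release the handle once the one-shot callback has run; the freed slot then becomes available for the next closure.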

For APIs that don't have batons there is still a way to pass closures as a function pointer: by having a closure-specific trampoline for each different closure. However, this works only if the number of closures passed to the other side is small (because AOT has to pregenerate a fixed number of trampolines), and again it only really works for APIs which support both registration and deregistration.
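
A fixed pool of pregenerated trampolines could look roughly like the following C sketch. The pool size of 4 and the names `closure_bind`, `closure_unbind` and `scale_by_env` are assumptions made for illustration; a real implementation would choose the pool size at build time.

```c
#include <assert.h>
#include <stddef.h>

// Baton-less callback type: the native API offers nowhere to stash context.
typedef int (*plain_cb_t)(int value);

// Stand-in for a Dart closure: a function plus its captured environment.
typedef struct {
  int (*fn)(void* env, int value);
  void* env;
} closure_t;

#define POOL_SIZE 4
static closure_t g_slots[POOL_SIZE];
static int g_busy[POOL_SIZE];

// One trampoline per slot, generated ahead of time. Each one hardwires its
// slot index, which replaces the missing baton.
#define DEFINE_TRAMPOLINE(n)                       \
  static int trampoline_##n(int value) {           \
    return g_slots[n].fn(g_slots[n].env, value);   \
  }
DEFINE_TRAMPOLINE(0)
DEFINE_TRAMPOLINE(1)
DEFINE_TRAMPOLINE(2)
DEFINE_TRAMPOLINE(3)

static const plain_cb_t g_trampolines[POOL_SIZE] = {
  trampoline_0, trampoline_1, trampoline_2, trampoline_3,
};

// Binds a closure to the first free trampoline; NULL means the pool is exhausted.
plain_cb_t closure_bind(closure_t closure) {
  for (size_t i = 0; i < POOL_SIZE; i++) {
    if (!g_busy[i]) {
      g_busy[i] = 1;
      g_slots[i] = closure;
      return g_trampolines[i];
    }
  }
  return NULL;
}

// Returns the trampoline's slot to the pool after native deregistration.
void closure_unbind(plain_cb_t cb) {
  for (size_t i = 0; i < POOL_SIZE; i++) {
    if (g_trampolines[i] == cb) {
      assert(g_busy[i]);
      g_busy[i] = 0;
      return;
    }
  }
}

// Example closure body: multiplies the argument by the int stored in env.
int scale_by_env(void* env, int value) {
  return *(int*)env * value;
}
```

The hard limit is visible in the sketch: once all slots are busy, `closure_bind` fails until some callback is deregistered and its slot is released with `closure_unbind`.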

Binding Native Code to Dart Methods

A previous section already covers the possibility of invoking native code from Dart via function pointers. So if the dart:ffi library provides dlopen / dlsym like primitives, that alone would already be enough to cross the boundary in that direction.

library dart.ffi;

class DynamicLibrary {
  // Equivalent of dlopen
  factory DynamicLibrary.open(String name);

  // Equivalent of dlsym
  Pointer<SymbolType> lookup<SymbolType extends _NativeType>(String symbolName);

  // Helper that combines lookup and cast to a Dart function.
  // Note: user code would not be permitted to be generic like this.
  // However, the FFI's own code can be.
  // Note: ignoring leafness and calling convention for brevity.
  F lookupFunction<SymbolType extends Function, F extends Function>(String symbolName) {
    return lookup<NativeFunction<SymbolType>>(symbolName)?.asFunction<F>();
  }
}

import 'dart:ffi' as ffi;

// Invoke int32_t add(int32_t, int32_t) from library libfoo.so
final lib = ffi.DynamicLibrary.open('libfoo.so');
final add = lib.lookupFunction<ffi.Int32 Function(ffi.Int32, ffi.Int32), int Function(int, int)>('add');
print(add(1, 2));
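
For comparison, the POSIX primitives that DynamicLibrary.open and lookup are modelled on can be used from C as in the sketch below. `lookup_symbol` and `call_abs_dynamically` are hypothetical helpers; the example resolves `abs` from the C library already loaded into the process rather than from libfoo.so, so that it does not depend on a library that may not exist. On some older systems the program has to be linked with -ldl.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

typedef int (*abs_t)(int);

// Hypothetical helper: opens `library` (NULL means the running process and
// the libraries already loaded into it) and resolves `symbol`, or returns
// NULL on failure. The handle is deliberately never closed so that the
// returned address stays valid.
void* lookup_symbol(const char* library, const char* symbol) {
  assert(symbol != NULL);
  void* handle = dlopen(library, RTLD_NOW);
  if (handle == NULL) return NULL;
  return dlsym(handle, symbol);
}

// Resolves abs at run time and calls it through the function pointer.
// Returns -1 if the lookup fails.
int call_abs_dynamically(int x) {
  abs_t f = (abs_t)lookup_symbol(NULL, "abs");
  return f != NULL ? f(x) : -1;
}
```

The cast from the `void*` returned by dlsym to a function pointer type plays the same role as asFunction in the Dart code above: the caller asserts a signature that the loader cannot check.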

However, code in this style is unnecessarily verbose, so we should also provide a declarative way of binding Dart functions to native functions. For example:

library dart.ffi;

/// An annotation that can be used to make FE/VM generate binding code
/// between an external static method declaration and native code.
class Import<NativeType> {
  /// Native library that contains the target native method.
  /// Can be null - then the symbol is resolved globally.
  final String library;

  /// Symbol to bind to.
  final String symbol;

  /// Specifies whether the target function is a leaf, i.e. is not
  /// expected to call back into Dart code.
  final bool isLeaf;

  final Type callingConvention;

  const Import({
    this.library,
    this.symbol,
    this.isLeaf = true,
    this.callingConvention = Cdecl,  // Note: Cdecl is a Type literal.
  });
}

import 'dart:ffi' as ffi;

@ffi.Import<ffi.Int32 Function(ffi.Int32, ffi.Int32)>(
  library: 'foo',  // Q: should mangle library name in platform specific way?
  symbol: 'add',
)
external int nativeAdd(int a, int b);

@ffi.Import<ffi.Int32>(symbol: 'g_counter')
external int globalCounter;

About platform specific bindings

TODO(vegorov) we need to consider how bindings that are portable between different platforms (e.g. Linux, Android, macOS, iOS, Windows, etc.) would have to be structured.

This potentially requires using conditional imports.

Generating Dart bindings from C headers

TODO(vegorov) As a stretch goal we could imagine a tool that could generate Dart bindings from C Headers.

Binding Native Functions to Dart Code

[Note: this is a stretch goal and is not going to be included into the initial prototype]

Imagine the inverse of the problem described in the previous section. We have a Dart program defining a static method int add(int x, int y) and we would like to invoke it from C++ code.

The core of the idea here is to introduce an annotation, ffi.Export:

library foo;

@ffi.Export<ffi.Int32 Function(ffi.Int32, ffi.Int32)>(symbol: 'add')
int add(int a, int b) => a + b;

This annotation would instruct the VM to generate an externally callable trampoline with a corresponding native signature int32_t (int32_t, int32_t).

From native code, a developer can then do:

typedef int32_t (*add_t)(int32_t, int32_t);
add_t f = Dart_LookupFFIExport("foo", "add");
f(1, 2);

Note that this can be taken further: we can have a tool that generates binding modules from annotations, containing the following code:

#if defined(DART_AOT_USING_DLL)
// AOT compiler would generate a symbol that can be hooked up by 
// the normal dynamic linkage process.
extern "C" int32_t dart_foo_add(int32_t x, int32_t y);
#else
// In JIT or blob based AOT we have to lookup dynamically.
int32_t dart_foo_add(int32_t x, int32_t y) {
  static int32_t (*f) (int32_t, int32_t) = Dart_LookupFFIExport("foo", "add");
  return f(x, y);
}
#endif