@mklbtz
Created March 24, 2018 19:13
SE Proposal: New literal for Data from contents of file

New literal for Data from contents of file

  • Proposal: SE-NNNN
  • Authors: Michael Bates
  • Review Manager: TBD
  • Status: Awaiting implementation

Introduction

In Swift today, literals give us the ability to embed files, images, and colors directly into our programs. This proposal adds a new literal that embeds the contents of a file as Data. The proposed syntax is:

let contents: Data = #dataLiteral(contentsOf: "path/file.ext")

Swift-evolution thread: New literal for string from contents of file

More info on Xcode’s support for literals: the Apple Developer blog

Motivation

The best way to motivate this feature is with example use cases.

For starters, consider a web server with dozens of SQL queries all written as multi-line string literals. This has numerous drawbacks.

  • The Swift code quickly becomes a cluttered mess as the number and size of these queries grow.
  • We lose out on IDE features when we write code that’s embedded in a string.
  • It is far more difficult to grok any code when it’s all one color.

A natural remedy is to move each query into its own .sql file, but then you would have to read those files at runtime using Foundation APIs. This has a performance cost and creates a dependency on the file system that isn’t desirable.
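For reference, the runtime approach described above might look like the following minimal sketch. The helper name loadQuery(atPath:) is hypothetical; the only Foundation API used is String(contentsOfFile:encoding:).

```swift
import Foundation

// Hypothetical helper (not part of the proposal): reads a query from disk
// every time it is called, so the file must exist wherever the binary runs.
func loadQuery(atPath path: String) throws -> String {
    // Throws if the file is missing, unreadable, or not valid UTF-8.
    return try String(contentsOfFile: path, encoding: .utf8)
}
```

Every call site has to handle a thrown error and pay for file I/O, which is the cost an embedded literal is meant to avoid.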

Also, bear in mind that there is no way to embed resource types with a binary format (images, PDFs, etc.) without a lot of manual effort. You have to depend on the file system.

How can we get the best of both worlds? We want the efficiency and reliability of embedded data with the convenience and flexibility of keeping these resources in separate source files. This is the purpose of #dataLiteral.

Proposed solution

Introduce this "macro" syntax to the compiler: #dataLiteral(contentsOf: "…"). The contentsOf parameter must be a relative or absolute path.

The compiler will handle this by looking for a file at that path. The file contents will be included in the generated SIL code as though you had manually typed out the content bytes in the source:

Data(bytes: [0xA, 0xB, 0xC, …])

It is considered a build-time error to provide an empty path or a path to a file that does not exist or cannot be read. Paths must be written using Unix conventions. When a relative path is written, the compiler will use the project directory as the current working directory. Here, “project directory” refers to the directory containing the .xcodeproj or Package.swift.
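The path rules above can be modeled in ordinary Swift. This is only a sketch of the intended behavior, not compiler code: DataLiteralError and resolveLiteralContents(path:projectDirectory:) are hypothetical names, and the errors thrown here stand in for what would be build-time diagnostics.

```swift
import Foundation

// Hypothetical error cases mirroring the rules above.
enum DataLiteralError: Error, Equatable {
    case emptyPath
    case unreadableFile(String)
}

// Models the lookup: absolute Unix paths are used as-is, relative paths
// are resolved against the project directory, and the bytes are returned.
func resolveLiteralContents(path: String, projectDirectory: URL) throws -> Data {
    guard !path.isEmpty else { throw DataLiteralError.emptyPath }
    let url = path.hasPrefix("/")
        ? URL(fileURLWithPath: path)
        : projectDirectory.appendingPathComponent(path)
    do {
        return try Data(contentsOf: url)
    } catch {
        // Covers both "does not exist" and "cannot be read".
        throw DataLiteralError.unreadableFile(url.path)
    }
}
```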

Detailed design

I used this example above: Data(bytes: [0xA, …]). I used an Array literal here for syntactic convenience, but the Data initializer will accept any Sequence of bytes. We should choose a sequence type (TBD) that has acceptable performance characteristics for this feature, which may or may not be Array. Indeed, a different Data initializer may be preferred to this one.
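To illustrate the range of choices, here is a small sketch that builds the same Data value from three different byte sources. The byte values are placeholders, and this sketch uses the unlabeled Data(_:) initializer, which accepts any sequence of UInt8. Which of these forms, if any, the compiler should emit is still TBD.

```swift
import Foundation

// Placeholder bytes; a real expansion would contain the file's contents.
let bytes: [UInt8] = [0x0A, 0x0B, 0x0C]

// From an Array, via the generic sequence-of-UInt8 initializer.
let fromArray = Data(bytes)

// From a ContiguousArray, through the same generic initializer.
let fromContiguous = Data(ContiguousArray(bytes))

// From an UnsafeBufferPointer; Data copies the bytes out of the buffer,
// so the result stays valid after the closure returns.
let fromPointer = bytes.withUnsafeBufferPointer { buffer in
    Data(buffer: buffer)
}
```

All three values compare equal, so the choice between them is about code size and initialization cost rather than observable behavior.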

Source compatibility

N/A — this syntax is purely additive.

Effect on ABI stability

N/A — we should build this feature so that it is purely a compiler convenience. As far as the binary is concerned, the layout of the embedded data should be no different from writing out the binary data manually: Data(bytes: [0xA, 0xB, ...]).

Effect on API resilience

This feature does add public API in a sense, but it is only available via the compiler. There is no binary interface to this feature. As such, it could be changed or removed at any point with no effect on ABI.

Alternatives considered

The core of this feature is embedding file contents. The main debate was around which type to use. There were three dominant ideas with varying levels of convenience.

  1. Embed as a buffer of bytes e.g. Array<UInt8>, UnsafeBufferPointer, or equivalent.
  2. This proposal — embed as Data and require Foundation. Most similar to #fileLiteral and #imageLiteral.
  3. Embed as any type that conforms to a protocol that provides "init from buffer of bytes". Build in conformance for Data, String, and others. Foundation not required. Most similar to #colorLiteral.

My personal preference was for option three, but the general feeling was that option two provided the best balance between convenience and complexity. Furthermore, if it becomes popular or if a convincing proposal is made, option three could always be built atop option two.
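For the curious, option three could look roughly like the sketch below. The protocol name ExpressibleByEmbeddedBytes, its initializer label, and the embedded(_:) helper are all invented for illustration and were never specified in the discussion. The helper stands in for the compiler, which would pick the conforming type from context.

```swift
import Foundation

// Hypothetical protocol for option three: "init from buffer of bytes".
protocol ExpressibleByEmbeddedBytes {
    init(embeddedBytes: UnsafeRawBufferPointer)
}

extension Array: ExpressibleByEmbeddedBytes where Element == UInt8 {
    init(embeddedBytes: UnsafeRawBufferPointer) {
        self.init(embeddedBytes)  // copies the bytes
    }
}

extension String: ExpressibleByEmbeddedBytes {
    init(embeddedBytes: UnsafeRawBufferPointer) {
        // Invalid UTF-8 is repaired with replacement characters.
        self.init(decoding: embeddedBytes, as: UTF8.self)
    }
}

// Only this conformance needs Foundation.
extension Data: ExpressibleByEmbeddedBytes {
    init(embeddedBytes: UnsafeRawBufferPointer) {
        self.init(embeddedBytes)  // copies the bytes
    }
}

// Stand-in for the compiler: hands the embedded bytes to whichever
// conforming type the surrounding context asks for.
func embedded<T: ExpressibleByEmbeddedBytes>(_ bytes: [UInt8]) -> T {
    return bytes.withUnsafeBytes { T(embeddedBytes: $0) }
}

let text: String = embedded([0x53, 0x51, 0x4C])
let data: Data = embedded([0x53, 0x51, 0x4C])
```

This mirrors how #colorLiteral produces whatever type the context expects, at the price of a new protocol in the standard library.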
