Skip to content

Instantly share code, notes, and snippets.

@Rajveer100
Last active March 30, 2024 13:24
Show Gist options
  • Save Rajveer100/63f177c6ac6756b8b97d79ca232ce34b to your computer and use it in GitHub Desktop.
Save Rajveer100/63f177c6ac6756b8b97d79ca232ce34b to your computer and use it in GitHub Desktop.
Potential Proposal - Lexical Scopes for swift-syntax

Lexical Scopes - A Scope Lookup Library for swift-syntax

Overview

Swift source code is organized into a set of scopes, each of which can introduce names that can be found in that scope and other scopes nested within it. This project involves implementing a notion of lexical scopes to answer questions by introducing a new library for the swift-syntax package. These APIs can form the basis of IDE features like “Edit All in Scope” and are an important step toward replacing the C++ implementation of scopes within the Swift compiler.

Current State of ASTScope and its Dependencies

At the current state, the Swift compiler has a class ASTScopeImpl which essentially represents a lexical scope:

namespace ast_scope {
class ASTScopeImpl;

// ...
} // namespace ast_scope
class ASTScopeImpl : public ASTAllocated<ASTScopeImpl> {
  // Friend classes...
  
  // ...

protected:
  using Children = SmallVector<ASTScopeImpl *, 4>;
  /// Whether the given parent is the accessor node for an abstract
  /// storage declaration or is directly descended from it.

private:
  /// The kind of scope node.
  ScopeKind kind;

This serves as the base class with essential methods for retrieving various properties and information like children, source ranges, queries, debugging, tree creation and most importantly the lookup with customised constraints.

Once we define the requirements in the base class, we can then inherit these properties for each of the cases which include the following:

/// The root of the scope tree.
class ASTSourceFileScope final : public ASTScopeImpl { ... }

/// The scope introduced by a conditional clause initializer in an
/// if/while/guard statement.
class ConditionalClauseInitializerScope final : public ASTScopeImpl

/// If, while, & guard statements all start with a conditional clause, then some
/// later part of the statement, (then, body, or after the guard) circumvents
/// the normal lookup rule to pass the lookup scope into the deepest conditional
/// clause.
class ConditionalClausePatternUseScope final : public ASTScopeImpl

/// For a closure with named parameters, this scope does the local bindings.
class ClosureParametersScope final : public ASTScopeImpl

Indeed, there can be quite many possibilities like enumerations, macros and also each of these child scopes can have their sub-classes with different conditions like consider the following:

func sum(x: Int, y: Int?) -> Int {
  guard let y = y else {
    return 0
  }

  return x + y
}

This is being handled by GuardStmtBodyScope in the current file, so we should be able to cover all such corner cases (or patterns) and appropriately handle their existence in different blocks.

Approach

Now we can consider the approach from the Swift perspective to implement and potentially move towards the replacement of the existing files with swift packages. We can utilises protocols to define the generic abstract for our lexical scope.

Defining and Representing Scopes

We can have a protocol which will have the basic declarations for handling the different kinds of statements and expressions. This includes the children and the optional parent scope to retrieve, for example, during a lookup and also various kinds of traversals like (pre, post, in)-order.

protocol ASTScope {
  typealias Children = [ASTScope]
  
  // Scope kind.
  var kind: ScopeKind { get set }
  
  // Children for the current scope.
  var children: Children { get set }
  
  // Parent scope that can be non-existent
  var parent: ASTScope? { get set }
  
  // Get the scope kind.
  func getKind() -> ScopeKind
  
  // Get the parent scope
  func getParent() -> ASTScope?
}

Each type of scope can be broken down with the help of enumerations, say a ScopeKind:

enum ScopeKind {
  case AbstractFunctionDecl
  case AbstractStmtDecl
  case If
  case For
  case While
  case Switch
  case Try
  // ...
  // Closures, Guard Statements, If Let, etc...
}

The above kinds of scope will have their own individual characteristics and special cases, which can be handled, but in general at the top level, we can classify them into functions, bodys and statements.

Name Lookup API

Depending the user the lookup can be further extended on the basic idea and additional features can be added depending on the requirements. But we can consider a few general cases below:

  • An unqualified lookup with no constraints specified.
  • A constrained lookup (like by starting point).
  • Scope-specific lookup.
  • Get outermost or innermost scope from a certain scope.
  • Captured environment/binding lookup.

At the very basic top level, consider the non-constrained lookup:

static func unqualifiedLookup(file: SourceFile, loc: SourceLoc, decl: DeclConsumer) { }
// ...
// Statements, Fall-throughs, Macros, etc.

Similarly, we can have such lookup methods for statements, functions, bodies, captured bindings and other relevant cases, our top-down approach will basically look at each level in the AST tree, which would look something like this:

|-------------------------------|
        |---------------|
            |-------|
                 |--|
                  .

We can also have a buffer/cache in the form of a dictionary to store key-value pairs to store recently used searches by the user, improving efficiency where traversals are little complex.

The swift-syntax library offers useful functions to retrieve useful information about the visibility of scopes.

Few methods are listed below:

/// Returns the first token node that is part of this syntax node.
func firstToken(viewMode: SyntaxTreeViewMode) -> TokenSyntax? { }
  
/// Recursively walks through the tree to find the next token semantically
/// after this node.
func nextToken(viewMode: SyntaxTreeViewMode) -> TokenSyntax? { }

/// Recursively walks through the tree to find the token semantically before
/// this node.
func previousToken(viewMode: SyntaxTreeViewMode) -> TokenSyntax?

As described with the top-down closing-interval recursive DFS, the following if quite useful:

/// Find the syntax token at the given absolute position within this
/// syntax node or any of its children.
func token(at position: AbsolutePosition) -> TokenSyntax? {
  // If the position isn't within this node at all, return early.
  guard position >= self.position && position < self.endPosition else {
    return nil
  }

  // If we are a token syntax, that's it!
  if let token = Syntax(self).as(TokenSyntax.self) {
    return token
  }

  // Otherwise, it must be one of our children.
  for child in children(viewMode: .sourceAccurate) {
    if let token = child.token(at: position) {
      return token
    }
  }

  fatalError("Children of syntax node do not cover all positions in it")
}

Integration with SourceKit-LSP

Once the APIs are all set we can consider integrating our library with SourceKit-LSP to enhance IDE support. This could involve building features like code completion, navigation, and refactoring based on lexical scope information. My personal favourite is the jump-to-definition (and callers) which really helps developers to navigate with ease.

Testing

The testing part will require a lot of work to handle comprehensive unit tests and validate the correctness of the scope tracking and name lookup implementation. Various Swift constructs can be used to cover a wide range of scenarios.

Potential Challenges

  • In my understanding the lookup part is the one that needs to be optimised the most since it would be called quite often, we should have logarithmic expected time complexity on average.

  • Tests may not be able to cover all scenarios initially and hence will need frequent updates in the beginning with swift-syntax, a lot of emphasis has been put on making it easy to write good and comprehensive unit tests.

  • Integration with SourceKitLSP and Swift Syntax could get complex since the logic and implementation needs to conform and adapt with the libraries to have consistency in the codebase while still having good performance.

  • Currently the C++ implementation needs to be replaced which would require some work, as it is an important step in replacing scope-related logic within the codebase.

Timeline

Here's an approximate timeline, considering the community bonding period from May 1 - May 27:

May 27 - July 8 (Mid-Term Evaluation) => We should have the initial codebase ready at this point with support for all/most scopes and coverage for the basic name lookups with efficient test benchmarks.

July 12 - August 19 (Final Evaluation) => A lot of tests must be added for the corner cases and be handled for appropriate functionality and optimal performance. As discussed earlier, IDE Support can be added for various scope-build queries as in SourceKit-LSP.

September 3 - November 4 => As a potential extension, we could improve further which depends on our benchmarks and overall state of tests for different methods in our package. Since the goal is to steadily replace existing implementations, we can potentially integrate Swift-specific terminologies to the best possible extent, although this would be a significant stretch for the project.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment