@jkup
Last active October 12, 2023 14:32
Hackathon Notes

Sep 22, 2023 | Source map hackathon

Attendees: artem.kobzar@jetbrains.com Benedikt Meurer Daniel Ehrenberg denise.l.mccoy@gmail.com hmanilla@mozilla.com holger@replay.io Jaroslav Sevcik jon.kuperman@gmail.com Jutta Simon Eric Leese luca.forstner@sentry.io Meysam Sarabadani Mathias Bynens Ryan Day tobias.koppers@googlemail.com

Notes

  • Going through issues first, starting with the correctness stream.

  • The list of source map implementations looks very complete, and it is not a correctness issue.

  • Define which points in JS should have source map mappings

  • Historical problem: merging/combining source maps with the Mozilla library needs mappings at specific points, otherwise fidelity is lost. This is currently worked around by generating too many mappings (one on every token boundary). Ideally only the minimum number of points should be generated.

  • Source maps also don’t have a way to communicate the “offsetting” of a range of source from original to generated code. This happens a lot with intermediate source maps (not necessarily final source maps).

  • Proposal:

    1. a recommendation for which points in the JS syntax should generate mappings in a high-resolution sourcemap.
    2. an extension to the sourcemap file format to express that two mappings describe an offset range
    3. a standard algorithmic definition of the applySourceMap combining function
  • Webpack infers the bit by looking at both sources and comparing them.

  • Would be nice if source maps stand on their own.

  • Something like applySourceMap will also need to scale with new additions to the specification.
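
As a rough illustration of what an algorithmic definition of the combining step could look like, here is a minimal sketch that composes two maps on already-decoded mappings. The `Mapping` type, `lookup`, `composeMappings`, and the "last mapping at or before the position on the same line" rule are assumptions made for this example; this is not the Mozilla library's API.

```typescript
// Hypothetical decoded mapping: a generated position pointing at an original one.
interface Mapping {
  genLine: number;    // 0-based line in the generated file
  genColumn: number;  // 0-based column in the generated file
  source: string;     // name of the file the mapping points into
  origLine: number;   // 0-based line in that file
  origColumn: number; // 0-based column in that file
}

// Find the mapping covering (line, column): the last mapping on the same
// generated line whose column is <= the queried column.
function lookup(mappings: Mapping[], line: number, column: number): Mapping | undefined {
  let best: Mapping | undefined;
  for (const m of mappings) {
    if (m.genLine !== line || m.genColumn > column) continue;
    if (best === undefined || m.genColumn > best.genColumn) best = m;
  }
  return best;
}

// `outer` maps final output -> intermediate file,
// `inner` maps intermediate file -> original sources.
function composeMappings(outer: Mapping[], inner: Mapping[], intermediate: string): Mapping[] {
  const result: Mapping[] = [];
  for (const o of outer) {
    if (o.source !== intermediate) {
      result.push(o); // points at some other file: keep unchanged
      continue;
    }
    const hit = lookup(inner, o.origLine, o.origColumn);
    if (hit === undefined) continue; // nothing known about this position: drop it
    result.push({
      genLine: o.genLine,
      genColumn: o.genColumn,
      source: hit.source,
      origLine: hit.origLine,
      // Snaps to the start of the inner mapping. Without offset-range
      // information it is not safe to add (o.origColumn - hit.genColumn).
      origColumn: hit.origColumn,
    });
  }
  return result;
}
```

The snapping in the last step is where fidelity is lost when the inner map has too few points, which is why generators currently emit a mapping on every token boundary.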

  • Document the meaning of the names array

  • Seems like Chrome DevTools is the only tool that seriously uses it.

  • Scopes would make the names obsolete, but not all use cases would be covered by scopes (necessarily).

  • Proposal:

    1. DevTools to document what debugger does today
    2. Start by defining that unreferenced names (and sources) can be discarded safely.
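
If unreferenced names can be discarded safely, a tool could compact the names array along these lines. This is a sketch on decoded segments; the `Segment` shape and `pruneNames` are hypothetical names for this example, and it also reorders the surviving names by first use.

```typescript
// Hypothetical decoded segment; only the optional name index matters here.
interface Segment {
  nameIndex?: number;
}

// Drop names that no segment refers to and rewrite the remaining indices.
function pruneNames(
  names: string[],
  segments: Segment[],
): { names: string[]; segments: Segment[] } {
  const remap = new Map<number, number>(); // old index -> new index
  const kept: string[] = [];
  const rewritten = segments.map((s) => {
    if (s.nameIndex === undefined) return { ...s };
    let next = remap.get(s.nameIndex);
    if (next === undefined) {
      next = kept.length;
      remap.set(s.nameIndex, next);
      kept.push(names[s.nameIndex]);
    }
    return { ...s, nameIndex: next };
  });
  return { names: kept, segments: rewritten };
}
```

The same approach would apply to unreferenced entries in sources, with the source index rewritten instead of the name index.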
  • Discussion about how source maps are linked from .wasm files. The specification should also describe the exact format of the custom section in the WebAssembly module.

  • Support for //@ is unclear. Chrome DevTools should have UMA to understand how much this is still being used.

  • Also understand where this is used, i.e. inside eval or in classic scripts or whatever

  • x_google_ignoreList discussion: a safe upgrade path is needed, plus guidance on what to do when both x_google_ignoreList and ignoreList are present.

  • Given that there is some discussion with others upfront anyway, maybe there is not really a point in the x_ prefix?
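
One possible upgrade path, sketched from the consumer side. The precedence rule here (the unprefixed field wins when both are present) is an assumption for illustration only; the notes leave that guidance open. `RawSourceMap` and `resolveIgnoreList` are hypothetical names.

```typescript
// Only the fields relevant to the ignore list are modeled here.
interface RawSourceMap {
  sources: string[];
  ignoreList?: number[];
  x_google_ignoreList?: number[];
}

// ASSUMPTION: prefer `ignoreList`, fall back to `x_google_ignoreList`.
function resolveIgnoreList(map: RawSourceMap): number[] {
  const list = map.ignoreList ?? map.x_google_ignoreList ?? [];
  // Discard entries that do not index into `sources`.
  return list.filter((i) => Number.isInteger(i) && i >= 0 && i < map.sources.length);
}
```

A generator that wants to stay compatible with older consumers could emit both fields with identical contents during the transition.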

  • Demo for the Debug ID by Luca.

    • Associate debug_id with the source file by injecting a tiny snippet into each generated file that throws an error and stashes the mapping into some global table
    • Content hashes in filenames would eliminate the need? Sentry cannot rely on the filename being globally unique.
    • debug_id can separate the shipping of production bundles from debug information, ideally very similar to Microsoft’s debug server solution.
    • Seems like everyone gave up on sanitizing / extending Error.stack.
    • The sourcemap part is uncontroversial and good to go. The prototype from Sentry is a working proof. Next step is to propose some stage 0 idea to TC39.
    • Having the sourcemap logic in V8 would make it easier to get source map information out (also in case of Node.js).
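
The injected snippet from the demo could look roughly like the sketch below. It is not Sentry's actual code: the global table name `__debugIds` and the function `makeDebugIdSnippet` are made up for this example. The idea is that a stack trace captured while the file loads names the running file, so it can serve as the key for that file's debug ID.

```typescript
// Hypothetical name for the global table; not taken from any real tool.
const TABLE = "__debugIds";

// Returns JS source text to prepend to a generated file. When the file loads,
// the snippet creates an Error, reads its stack (which names the running
// file), and records the debug ID under that stack string.
function makeDebugIdSnippet(debugId: string): string {
  return (
    `;(function(){try{` +
    `var g=typeof globalThis!=="undefined"?globalThis:this;` +
    `var stack=new Error().stack;` +
    `if(stack){(g.${TABLE}=g.${TABLE}||{})[stack]=${JSON.stringify(debugId)};}` +
    `}catch(e){}})();`
  );
}
```

An error reporter can later match the file names in a reported stack against the keys of the table to find the debug ID, without relying on the filename being globally unique.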
  • Demo of Babel test file format for source map tests.

  • Jaro demos the scopes prototype.

    • Ideally the outcome should allow for generating the scope information later (using language specific machinery).
    • Discussion about scope in terms of source vs. generated code.
    • Concerns about the size and what kind of scope information can be omitted.
  • Discussion about names and scopes

    • Jaro presents his thinking
  • stay similar to https://github.com/bloomberg/pasta-sourcemaps

  • Scopes (source only information)

  • Scopes have bindings, something like “sourceVarRef” pointing to generated expressions. Are these ranges?

  • For inlined functions:

    1. callsite with source coordinates
    2. Generated range (start to end)
    3. Inlined children (nested structure)
    • Tobias presented a different (DWARF-like) proposal:
    • scopes
        name?: str
        callsite?: org s-e
        bindings: [ ]
        range: gen s-e
        definition: org s-e
        flags: inline | function | inherit |...
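
The field list of the DWARF-like scope record can be written down as a type, shown here as a sketch. Reading "org s-e" and "gen s-e" as original and generated start-to-end ranges is an interpretation of the shorthand. The element type of `bindings`, the exclusive range end, and the helper `innermostScope` are assumptions for this example, and only the three flags named in the notes are listed.

```typescript
interface Position { line: number; column: number }
interface Range { start: Position; end: Position } // ASSUMPTION: end is exclusive

// The notes list further, unspecified flags; only the named ones appear here.
type ScopeFlag = "inline" | "function" | "inherit";

interface ScopeRecord {
  name?: string;
  callsite?: Range;   // original range of the call site (inlined functions)
  bindings: string[]; // ASSUMPTION: placeholder element type
  range: Range;       // range in generated code
  definition: Range;  // range in original code
  flags: ScopeFlag[];
}

function comparePos(a: Position, b: Position): number {
  return a.line - b.line || a.column - b.column;
}

function contains(r: Range, p: Position): boolean {
  return comparePos(r.start, p) <= 0 && comparePos(p, r.end) < 0;
}

// With properly nested ranges, the innermost scope at a generated position is
// the containing scope whose range starts last.
function innermostScope(scopes: ScopeRecord[], p: Position): ScopeRecord | undefined {
  let best: ScopeRecord | undefined;
  for (const s of scopes) {
    if (!contains(s.range, p)) continue;
    if (best === undefined || comparePos(s.range.start, best.range.start) >= 0) best = s;
  }
  return best;
}
```

A debugger paused at a generated position would use a lookup like this to decide which bindings and which original function name to show.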

Full scopes proposal ended up here: https://gist.github.com/sokra/97a53a869b9a421accadbc9681cb26f3

  • Encoded efficiently via VLQ+Base64 just like the mappings field
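
For reference, the VLQ+Base64 scheme used by the mappings field can be sketched as follows. The function names are chosen for this example, and the sketch only handles values that fit in 32-bit bitwise arithmetic.

```typescript
const BASE64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Encode one signed integer as Base64 VLQ digits.
function encodeVlq(value: number): string {
  // The sign is stored in the lowest bit of the first digit.
  let vlq = value < 0 ? (-value << 1) | 1 : value << 1;
  let out = "";
  do {
    let digit = vlq & 31; // five payload bits per digit
    vlq >>>= 5;
    if (vlq > 0) digit |= 32; // sixth bit: more digits follow
    out += BASE64.charAt(digit);
  } while (vlq > 0);
  return out;
}

// Decode a run of Base64 VLQ digits into the integers it contains.
function decodeVlq(text: string): number[] {
  const values: number[] = [];
  let shift = 0;
  let acc = 0;
  for (let i = 0; i < text.length; i++) {
    const digit = BASE64.indexOf(text.charAt(i));
    if (digit < 0) throw new Error("invalid Base64 digit: " + text.charAt(i));
    acc |= (digit & 31) << shift;
    if (digit & 32) {
      shift += 5; // continuation bit set: keep accumulating
      continue;
    }
    values.push(acc & 1 ? -(acc >>> 1) : acc >>> 1);
    shift = 0;
    acc = 0;
  }
  return values;
}
```

For example, `decodeVlq("AAgBC")` yields `[0, 0, 16, 1]`, and `encodeVlq(16)` yields `"gB"`.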
  • What about shadowing? We need an optimized_out sentinel value.
    • Should the specification recommend to avoid shadowing? In the native world, developers are used to optimized vs. debug builds. That same statement does not apply in the JS world.
  • Differences between the proposals
    • Tobias’ approach might lead to bigger source maps, since scope information has to be repeated for every inlined function (JS tools try to inline only once however, but for Kotlin this might be a problem).
    • Referring to things from outer scopes (not a problem for C++ in DWARF).
  • What about expressions? In the context of Wasm? This should definitely be captured as part of the spec.
    • Split up the workstream into “scopes” and “expressions”.
  • Action items:
    • Could we do a concrete write up for the two different proposals and their implementation details?
    • Artem and Holger are working on this
    • We’ll need significant follow-up conversations around Wasm expressions and how those will work
    • We still need to talk about CSS
    • We still need to decide on encoding
    • We also need to talk about degradation, what information do stack traces need vs debuggers?
    • Stages will be 1. Structure (Holger and Artem’s document) 2. Encoding (Feature calls) 3. Expressions (Feature calls later)
    • Jaro adjusting his Chrome DevTools and Terser prototypes, Artem to provide Kotlin, Holger to do some more examples.
    • What about hiding functions? For the ZoneJS use case.
  • Testing (different class of tests)
    • Different kinds of consumers (DevTools vs. Sentry) need different kinds of tests.
    • Tests for tools that compose source maps (different levels of code)
    • This group should not worry about test implementation, but rather focus on a test harness, which can be integrated with concrete implementations.
    • Ideally we’d work on four sets of tests that can be hooked into
  • Consumer tests:
    1. A consumer like Chrome DevTools could take the source code, generated code and put in breakpoints and compare
    2. A consumer like Sentry could take source code, generated code, a source map and an error stack and do a similar suite of tests
    3. Tests for libraries that compose source maps. Some of these tools may run different sets of optimizations so this might be a little trickier. Maybe we could have the input, the output and a series of predicates that run against the output.
  1. There is more follow-up discussion to be had on the earlier code point vs. code unit question. Judging from this issue, no two browsers agree 100% on the results.
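
To make the code point vs. code unit difference concrete, the sketch below computes the column of a token both ways. The function names are chosen for this example. A character outside the Basic Multilingual Plane, such as an emoji, takes two UTF-16 code units but is one code point, so the two columns diverge for anything after it on the line.

```typescript
// Column counted in UTF-16 code units (what JS string indices use).
function columnInCodeUnits(line: string, needle: string): number {
  return line.indexOf(needle);
}

// Column counted in Unicode code points.
function columnInCodePoints(line: string, needle: string): number {
  const units = line.indexOf(needle);
  if (units < 0) return -1;
  let points = 0;
  for (let i = 0; i < units; i++) {
    const c = line.charCodeAt(i);
    // A high surrogate followed by a low surrogate is a single code point.
    if (c >= 0xd800 && c <= 0xdbff && i + 1 < units) {
      const d = line.charCodeAt(i + 1);
      if (d >= 0xdc00 && d <= 0xdfff) i++;
    }
    points++;
  }
  return points;
}

// "\uD83D\uDE00" is one emoji written as its two UTF-16 code units.
const sampleLine = 'const s = "\uD83D\uDE00"; throw new Error(s);';
console.log(columnInCodeUnits(sampleLine, "throw"));  // 16
console.log(columnInCodePoints(sampleLine, "throw")); // 15
```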
  • Generator tests:
  1. For JS-to-JS tools, or even Chrome DevTools, a test could run a source file through Terser with source maps enabled, execute both the source and the generated file, stop at a breakpoint in each, and validate that both have the same scope information.
  2. In some cases, like the Kotlin one, the best we can do is just validation. No duplicate mappings, offsets are within range, is every line mapped, check statements.
    • For example, for testing code units vs. code points, you could have an executable JS file that throws an error with an emoji in it. Then you could parse that error and validate that the column numbers are correct.
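
The validation-only checks (no duplicate mappings, offsets within range, every line mapped) could be sketched like this on decoded segments. The `Seg` shape and `validateMappings` are hypothetical names. Treating blank generated lines as exempt from the "every line mapped" check is an assumption, and statement checks are not covered.

```typescript
// Hypothetical decoded segment with 0-based positions.
interface Seg {
  genLine: number;
  genColumn: number;
  sourceIndex: number;
  origLine: number;
  origColumn: number;
}

// Returns a list of human-readable problems; an empty list means all checks passed.
function validateMappings(
  segments: Seg[],
  generatedLines: string[],
  sourceContents: string[][], // lines of each original source, by source index
): string[] {
  const problems: string[] = [];
  const seen = new Set<string>();
  const mappedLines = new Set<number>();

  for (const s of segments) {
    const key = `${s.genLine}:${s.genColumn}`;
    if (seen.has(key)) problems.push(`duplicate mapping at ${key}`);
    seen.add(key);

    const genText = generatedLines[s.genLine];
    if (genText === undefined || s.genColumn < 0 || s.genColumn > genText.length) {
      problems.push(`generated position out of range at ${key}`);
    } else {
      mappedLines.add(s.genLine);
    }

    const sourceLines = sourceContents[s.sourceIndex];
    const origText = sourceLines === undefined ? undefined : sourceLines[s.origLine];
    if (origText === undefined || s.origColumn < 0 || s.origColumn > origText.length) {
      problems.push(`original position out of range at ${key}`);
    }
  }

  // ASSUMPTION: blank generated lines do not need a mapping.
  generatedLines.forEach((text, i) => {
    if (text.trim() !== "" && !mappedLines.has(i)) {
      problems.push(`generated line ${i} has no mapping`);
    }
  });
  return problems;
}
```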
  • Debug IDs
    • Useful not just for Sentry, but also for DevTools.
    • How to convince JavaScript engines?
    • How big does it need to be? UUID v3 or v5?
    • The lack of error.stack proposal progress in TC39 shouldn’t block us.
  • DevTools formats
  • Ignore List
    • Stepping in Wasm doesn’t respect the ignore list.
    • Discussion about Stopping Behavior.
    • What’s preventing us from adding ignoreList now?
  • We might want to use scopes instead.
  • ZoneJS works with scopes?
  • No documentation on how to do RFCs for (other) people. Process documentation is missing.
  • Stages
  • All correctness bugs.

Action items
