nrc/tools.md

## tools.md

      
            
          
    Rust developer tools - status and strategy

Availability and quality of developer tools are an important factor in the success of a programming language. C/C++ has remained dominant in the systems space in part because of the huge number of tools tailored to these languages. Successful modern languages have had excellent tool support (Java in particular, Scala, JavaScript, etc.). Finally, LLVM has been successful in part because it is much easier to extend than GCC. So far, Rust has done pretty well with developer tools: we have a compiler which produces good quality code in reasonable time, and good support for debug symbols, which lets us leverage C++/language-agnostic tools such as debuggers and profilers. There are also syntax highlighting, cross-reference, code completion, and documentation tools.
In this document I want to lay out what Rust tools exist and where to find them, highlight opportunities for tool development in the short and long term, and start a discussion about where to focus our time and energy to have maximum impact. Note that much of this document concerns long term development and most goals will be low priority until post-1.0. Some of the issues here are not pure tooling issues (the compiler (!), syntax extensions, ...) and hopefully the discussion will outgrow this document pretty quickly.
Existing tools

(Please help expand this list!)


The compiler - rustc - (see section, below)


Cargo package manager


Debugging support - debug symbols via rustc

GDB, LLDB

WIP rust-lldb wrapper to provide Rust pretty printing (in tree)


enables a whole bunch of tools which are based on debug symbols (Valgrind, rr, dtrace, various other profiling tools, ...)
lots of work to do, work is ongoing (Windows support, executing code, bugs, ... issues)


Profiling

Servo has a profiler, not sure how generally applicable it is (see https://github.com/servo/servo/blob/master/src/components/util/time.rs, https://github.com/servo/servo/blob/master/src/components/util/memory.rs)
perf - apparently this works, we have some kind of make file perf.mk in the repo, no idea what it does
gprof - needs some more compiler support
instruments (OS X) - works
callgrind/kcachegrind - works
more?


rustfmt WIP


Compiler built-in tools

unit testing
benchmarking
logging
pretty printer - can also output explicit types, etc.
lints (built in, also lint plugins)


Editor support (syntax highlighting, etc.)

Vim, Emacs, Kate, and gedit (in tree)
Sublime Text

linter


Geany works out of the box - syntax highlighting and basic build support
Atom
ctags (make TAGS)


IDE support

Visual Studio

Would be nice to have debugging and profiling support (cmr knows something about this, maybe)
Debugging "kinda works" via cv2pdb. Possible way forward to improve Windows/VS debugging would be to improve Rust support in cv2pdb


IntelliJ IDEA
Eclipse WIP
Xcode?


Testing

Hamcrest-Rust
Stainless
Quickcheck and many more
Rust Enforce - fluid assertions (TODO is this the right place? more details)


code search/cross referencing

rustfind (https://github.com/dobkeratops/rustfind)
dxr (https://github.com/nick29581/dxr/tree/rust2)
It would be nice to make DXR more easily usable for users rather than be a server-hosted thing (e.g., without requiring a web server and having long indexing time).
mxr (http://mxr.mozilla.org/rust/; http://mxr.mozilla.org/servo/)
I would love to have more in this space, like sophisticated, type-based queries over a code base.


code completion

Racer

Racer support for Sublime Text
Racer support for Atom


syntax highlighting

Pygments
gist/GitHub


experimentation tools

Play-pen


RustDoc (in tree)


RustCI


BindGen


spellck - spell checking Rust programs, compiler plugin


C-Reduce - test case reduction, designed for C/C++ but mostly works with Rust


Tools it would be nice to have


Refactoring tools - basic things such as rename method/field/variable and more complex things like inline/outline function (note that this is slightly more complicated in Rust than other languages because Rust is not compositional with respect to outlining).


RustFix - a tool for translating code from an old version of Rust to a new version of Rust. Cf. GoFix. I think this is just a refactoring tool which can generate code that is invalid for the current compiler (i.e., code that is only valid in a later version of Rust). The refactorings it can do are more language-level though.


A REPL - Read, eval, print loop. Useful for introducing new users to Rust, tutorials, experimentation with small programs.

related - an embeddable JIT


Deadlock detection.


Instrumentation of tasks to give a picture of how concurrent tasks are communicating.


Style checking (as opposed to style fixing (RustFmt)). Lints and make tidy do some of this. See, for example, FxCop.


Code coverage

gcov - rust-lang/rust#690
kcov - apparently works (modulo macros, see hackndev/zinc#168)


Xcode fix-its - I couldn't find much information about how these work or even if it is possible to make them work for another language. Could be trivial or impossible or anywhere in between.


Implementation specific debugging aids - e.g., what is the vtable layout of this object? What is the layout of this struct?


ccache
This is part of a crazy plan to simplify rustc's build system - if no-op compiles can be extremely quick, effectively zero-cost, then we might never need make files. We would not care about dependencies and could just have a script which compiles everything. We would require accurate dependency information and stable/deterministic builds. It might need to be built in to the compiler. I expect if we have proper incremental compilation this could use the same infrastructure but be at the scale of crates rather than parts of a crate.


distcc
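The ccache idea above boils down to keying compiled output on everything that can affect it, so that a no-op compile is just a lookup. The following is only a toy sketch of that idea: `CompileCache`, the fake "artifact" string, and the use of `DefaultHasher` are all illustrative assumptions, not how ccache or rustc actually work. A real cache would also need to hash dependencies and the compiler version.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical cache: maps a hash of (source, flags) to a compiled artifact.
struct CompileCache {
    artifacts: HashMap<u64, String>,
    /// How many times we actually had to "compile".
    compiles: usize,
}

impl CompileCache {
    fn new() -> CompileCache {
        CompileCache { artifacts: HashMap::new(), compiles: 0 }
    }

    /// The cache key covers every input that can change the output.
    fn key(source: &str, flags: &[&str]) -> u64 {
        let mut hasher = DefaultHasher::new();
        (source, flags).hash(&mut hasher);
        hasher.finish()
    }

    /// Returns the cached artifact if the inputs are unchanged,
    /// otherwise "compiles" (here: fabricates a string) and stores the result.
    fn compile(&mut self, source: &str, flags: &[&str]) -> String {
        let key = Self::key(source, flags);
        if let Some(hit) = self.artifacts.get(&key) {
            return hit.clone();
        }
        self.compiles += 1;
        let artifact = format!("object code for {} bytes of source", source.len());
        self.artifacts.insert(key, artifact.clone());
        artifact
    }
}

fn main() {
    let mut cache = CompileCache::new();
    let src = "fn main() {}";
    cache.compile(src, &["-O"]);
    cache.compile(src, &["-O"]); // no-op compile: served from the cache
    cache.compile(src, &["-g"]); // different flags: a real compile
    println!("real compiles: {}", cache.compiles);
}
```

This only pays off if builds are deterministic; otherwise identical inputs would not be guaranteed to map to identical outputs.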


And some more research-ey ideas:


Lifetime visualisation/explanation


Rust-specific memory profiling (this might be a silly idea, but I wonder if we can use the static lifetime info from Rust with dynamic memory profiling to give useful information to the programmer)


Static/dynamic analysis for unsafe blocks


Some general issues

It's worth thinking about how Rust can get an awesome tools ecosystem. I think tooling is a great way for people to get involved with Rust - projects tend to be small-ish and immediately useful. Since there are lots of UI issues around tooling, it is often useful to have several different versions of a tool, rather than a canonical implementation.
I believe the best way for the core Rust community to foster tool development (as well as working with/helping people who want to work on tools) is to provide useful and comprehensible APIs to the compiler (that is, APIs which can be used without understanding the details of the compiler) and to provide re-usable abstractions which can be leveraged by many tools (debuginfo is a great example of this, although it is kind of unique due to the level of support from existing tools).
As for why tooling requires special consideration rather than just plain reuse of compiler (or extension/plugin) APIs: compiling is basically a one-way street from source code to machine code. The only time we really need to go backwards is when generating error messages. Tooling often wants to do a round trip, e.g., start at source code, compile to annotated machine code, and then get back from the machine code annotations to the source code (debugging), or go from source code to type checking and then back to source code (cross-referencing). These backwards steps mean that tools often need different APIs, e.g., debuginfo, or many more spans than required for errors. It also means we need more flexibility - different tools need radically different information.
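As a small illustration of one of these "backwards" steps, here is a sketch of mapping a span (a byte range in the source) back to a line and column, which is roughly what a cross-referencing tool has to do with every span it is handed. `Span` and `LineIndex` are hypothetical names for this example and are much simpler than the compiler's real codemap.

```rust
/// A half-open byte range into a source string, in the spirit of a compiler span.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Span {
    lo: usize,
    hi: usize,
}

/// Maps byte offsets back to 1-based (line, column) positions.
struct LineIndex {
    /// Byte offset at which each line starts.
    line_starts: Vec<usize>,
}

impl LineIndex {
    fn new(src: &str) -> LineIndex {
        let mut line_starts = vec![0];
        for (i, b) in src.bytes().enumerate() {
            if b == b'\n' {
                line_starts.push(i + 1);
            }
        }
        LineIndex { line_starts }
    }

    /// Returns the 1-based line and byte column of `offset`.
    fn position(&self, offset: usize) -> (usize, usize) {
        // Find the last line start that is <= offset.
        let line = match self.line_starts.binary_search(&offset) {
            Ok(i) => i,
            Err(i) => i - 1,
        };
        (line + 1, offset - self.line_starts[line] + 1)
    }
}

fn main() {
    let src = "fn main() {\n    let x = 1;\n}\n";
    let index = LineIndex::new(src);
    // Pretend a tool was handed this span for the identifier `x`.
    let lo = src.find('x').unwrap();
    let span = Span { lo, hi: lo + 1 };
    let (line, col) = index.position(span.lo);
    println!("`{}` is at {}:{}", &src[span.lo..span.hi], line, col);
}
```

Columns here are counted in bytes; a real tool would have to decide how to treat multi-byte characters and tabs.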
Other high level questions:

where to focus to have maximum impact?
what is most important?
where is the momentum?

Open questions

What exactly do tools need from the compiler (information and interfaces)?
Brain dump:

macro support (we've got a good start with span stacks, I hope we can do better in order to make tools which work well with macros)
spans
stringified idents/names, etc.
type info (and other info, e.g., cfg, metadata, dependencies)

APIs for this (both methods provided by the compiler, and dumps of this info as JSON or other data)
how to best organise and present this information? Currently tools have to do a lot of data juggling to make this information useful, can we move some of that to the compiler or to some low level tool?


identifiers, hashes for various items in the language
deterministic/stable builds

mostly this is about symbol names. We use hashes in those which (I think) are non-deterministic in some ways. That means we don't get stable builds.
furthermore, I would like to be able to change the internals of one function without changing the symbol names for other items (does this have a name?). Currently, those hashes depend indirectly on node ids, which change very easily. It would be nice to not use node ids, etc. in these hashes so we have more stability in our builds.
there are other issues here which I have forgotten (FIXME)


printing, pretty printing (this is provided in several different ways by the compiler, we should think about unifying some of these methods and perhaps moving full pretty printing (of programs, rather than snippets) into a separate tool and instead provide more fundamental information for printing. The trouble is that the compiler often uses some of this functionality for messages, etc. so it is not as simple as just ripping out a bunch of code)
identification/search
flexibility - e.g., incremental/partial compilation

especially for tools like IDEs
useful for: realtime error reporting, code completion, navigation/cross-reference, etc.
need fine grained incrementality - be able to re-compile a single item without parsing, resolving, etc. the rest of the file.
error tolerance - be able to keep compiling despite errors; especially in the parser
codemap in libsyntax is not a suitable abstraction for incremental/long running compilation
be able to output messages (errors, warnings, etc) in different ways (e.g., for an IDE or warnings for DXR)


if the compiler is going to be long running (as part of an IDE), then we must be more careful about memory management, making sure everything is freed properly, etc.
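To make the symbol-name point from the brain dump concrete, here is a toy comparison of a hash that depends on a node id with one that depends only on an item's own path and signature. `Item`, `symbol_from_node_id`, and `symbol_from_signature` are made up for illustration and do not reflect rustc's actual mangling scheme. `DefaultHasher` is used only for convenience; its algorithm is not guaranteed to be stable across Rust releases, so a real compiler would need its own hash.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// A toy item: `node_id` stands in for a compiler-internal id that shifts
/// whenever unrelated code is edited; `path` and `signature` do not.
struct Item {
    node_id: u32,
    path: &'static str,
    signature: &'static str,
}

fn hash_of<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

/// Fragile: the symbol changes whenever the node id does.
fn symbol_from_node_id(item: &Item) -> String {
    format!("{}::h{:016x}", item.path, hash_of(&(item.path, item.node_id)))
}

/// More stable: depends only on the item's own path and signature.
fn symbol_from_signature(item: &Item) -> String {
    format!("{}::h{:016x}", item.path, hash_of(&(item.path, item.signature)))
}

fn main() {
    let before = Item { node_id: 41, path: "krate::parse", signature: "fn(&str) -> Ast" };
    // The same function after an unrelated edit elsewhere renumbered the nodes.
    let after = Item { node_id: 57, path: "krate::parse", signature: "fn(&str) -> Ast" };

    println!("node-id based:   {} -> {}", symbol_from_node_id(&before), symbol_from_node_id(&after));
    println!("signature based: {} -> {}", symbol_from_signature(&before), symbol_from_signature(&after));
}
```

With the second scheme, editing one function's body leaves every other item's symbol untouched, which is the property asked for above.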

How can the compiler be reused/extended? And how do we make this straightforward? By which I mean, how do we make the compiler most usable as a library and easiest to customise using plugins?
We should make any exposed APIs as separate from the compiler internals as possible. That gives us the most flexibility in refactoring the compiler later. Any exposed APIs should be guarded with the most flexible stability attributes.
The compiler

I also want to think about what we want to do with the compiler. This could be a separate topic, but seeing as the compiler is the most important developer tool, I thought I'd chuck my thoughts in here.
What are our goals for the compiler? Here are my ideas, in very rough order of priority:

complete and correct - i.e., a faithful implementation of the Rust language
emit high quality code
flexible - both in how it operates and in the different flavours of code it can produce:

different levels of optimisation
various debugging outputs
incremental compilation
oneliner execution - i.e., compile and execute code snippets in any context, for example when debugging and paused at a break point. Even, one day, edit and continue (be able to patch code into executable whilst debugging)


fast - the faster we compile code, the better - waiting for the compiler sucks
we need automation here to ensure improvement and prevent regressions
possible idea - have a dedicated timing server: pull rust, build with time_passes three times, take the average, make a graph of the results, if there is a regression, email anyone who merged a patch since the previous pull, repeat. LLVM build time shouldn't matter because we only look at the time_passes results.
extensible

compiler plugins/syntax extensions (see http://doc.rust-lang.org/guide-plugin.html)
compiler API, à la Clang, Roslyn
plugins to the compiler middle/back end, as well as the front end (see, for example, https://www.haskell.org/ghc/docs/7.2.1/html/users_guide/compiler-plugins.html)


engineering quality
easy to improve/extend
less susceptible to bugs
well documented
modularity (could be part of extensibility and engineering quality) - librustc is currently monolithic
useful for tooling
memory efficient - the less memory we use during compilation, the better
backend agnostic - it would be nice to be able to swap out the LLVM backend for something else
be an exemplar of well-written Rust (it is pretty much the opposite right now)

Once we get to 1.0, or shortly before, we should think about how well we meet these goals and how we can improve things. Only the first has really been high priority up till now.
Expanding on a couple of those goals:
Engineering quality

By improving the quality of the software, we make the compiler easier to improve and extend and less susceptible to bugs. This should save developer time and attract more contributors.
I have some ideas for large scale change below. On a smaller scale, the compiler could benefit from auditing of older code for refactoring opportunities, adhering to modern style conventions (this will be much easier if we have refactoring tools), using more idiomatic Rust patterns and modern language features, better documentation, removing obsolete or under-utilised code, and use of clearer abstractions.
Useful for tooling

Tools will use the compiler in two ways - as a library and as a framework (current examples: Rustdoc uses the compiler as a library, DXR uses it as a framework). In both cases, the compiler is more useful if it has a high quality and stable API.
For better use as a library, we should aim to stabilise some parts of the compiler as an API, generally the highest level functionality. In order to preserve flexibility in our implementation, we should probably add an extra API layer, rather than exposing the internals of the compiler. However, to some extent, we will have to commit to exposing and stabilising some data structures. For use as a framework, we need to identify parts of the compiler which can be used as hooks, both on a small and large scale (e.g., a callback when visiting a node of the AST during an existing pass, vs adding a pass).
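The two scales of hook mentioned above can be sketched on a toy AST. Everything here (`Expr`, `NodeHook`, `Pass`, `walk`, `run_passes`) is hypothetical and only meant to show the shape of the two extension points, not a proposed rustc API.

```rust
/// A toy expression AST; the real compiler's AST is far richer.
enum Expr {
    Num(i64),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

/// Small-scale hook: called for each node that an existing traversal visits.
trait NodeHook {
    fn visit_expr(&mut self, expr: &Expr);
}

/// Large-scale hook: a whole extra pass run over the tree.
trait Pass {
    fn name(&self) -> &str;
    fn run(&self, root: &Expr) -> String;
}

/// An "existing pass": walks the tree in pre-order and notifies the hook.
fn walk(expr: &Expr, hook: &mut dyn NodeHook) {
    hook.visit_expr(expr);
    match expr {
        Expr::Num(_) => {}
        Expr::Add(l, r) | Expr::Mul(l, r) => {
            walk(l, hook);
            walk(r, hook);
        }
    }
}

/// The driver runs every registered pass and collects their reports.
fn run_passes(root: &Expr, passes: &[Box<dyn Pass>]) -> Vec<String> {
    passes.iter().map(|p| format!("{}: {}", p.name(), p.run(root))).collect()
}

/// Example node hook: records every literal it sees.
struct CollectLiterals {
    seen: Vec<i64>,
}

impl NodeHook for CollectLiterals {
    fn visit_expr(&mut self, expr: &Expr) {
        if let Expr::Num(n) = expr {
            self.seen.push(*n);
        }
    }
}

/// Example pass: reports the maximum nesting depth.
struct DepthPass;

impl DepthPass {
    fn depth(expr: &Expr) -> usize {
        match expr {
            Expr::Num(_) => 1,
            Expr::Add(l, r) | Expr::Mul(l, r) => 1 + Self::depth(l).max(Self::depth(r)),
        }
    }
}

impl Pass for DepthPass {
    fn name(&self) -> &str {
        "depth"
    }
    fn run(&self, root: &Expr) -> String {
        format!("max depth = {}", Self::depth(root))
    }
}

/// Builds (1 + 2) * (3 * 4).
fn sample() -> Expr {
    let num = |n| Box::new(Expr::Num(n));
    Expr::Mul(
        Box::new(Expr::Add(num(1), num(2))),
        Box::new(Expr::Mul(num(3), num(4))),
    )
}

fn main() {
    let tree = sample();
    let mut hook = CollectLiterals { seen: Vec::new() };
    walk(&tree, &mut hook);
    println!("literals: {:?}", hook.seen);

    let passes: Vec<Box<dyn Pass>> = vec![Box::new(DepthPass)];
    for report in run_passes(&tree, &passes) {
        println!("{}", report);
    }
}
```

The node hook is cheap to write but can only observe what the host traversal chooses to visit; the pass has full control but must be scheduled by the driver.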
I would like this high level API to include as much information as we can from the compiler, such as debuginfo, intermediate information from type checking, borrow checking, etc., metadata (even where the compiler would not normally generate it), and so forth.
For syntax extensions, I would like to separate the AST generated and used by libsyntax and the AST used by rustc. The former would be very close to the source code and exposed as part of the API to syntax extensions. The latter would not be exposed and would be a more transformed version of the AST. We would commit to only changing the libsyntax AST according to the semver rules, the rustc AST could change however we like. The first stage of rustc would convert the libsyntax AST to its internal AST.
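A minimal sketch of this split, using a made-up toy language rather than Rust's real AST: `SyntaxExpr` plays the role of the stable, source-like libsyntax AST, `CoreExpr` the private rustc one, and `lower` the first stage that converts between them. The desugaring of `+=` shown here is purely illustrative and is not how Rust treats compound assignment.

```rust
/// Source-level AST (the "libsyntax" side): close to what the user wrote,
/// including sugar such as parentheses and compound assignment.
#[derive(Debug, Clone, PartialEq)]
enum SyntaxExpr {
    Lit(i64),
    Var(String),
    Paren(Box<SyntaxExpr>),
    Add(Box<SyntaxExpr>, Box<SyntaxExpr>),
    /// `x += e`
    AddAssign(String, Box<SyntaxExpr>),
}

/// Internal AST (the "rustc" side): desugared, and free to change at will
/// because nothing outside the compiler sees it.
#[derive(Debug, Clone, PartialEq)]
enum CoreExpr {
    Lit(i64),
    Var(String),
    Add(Box<CoreExpr>, Box<CoreExpr>),
    Assign(String, Box<CoreExpr>),
}

/// The first compiler stage: convert the public AST into the internal one.
fn lower(expr: &SyntaxExpr) -> CoreExpr {
    match expr {
        SyntaxExpr::Lit(n) => CoreExpr::Lit(*n),
        SyntaxExpr::Var(name) => CoreExpr::Var(name.clone()),
        // Parentheses only matter for parsing and printing, so they vanish here.
        SyntaxExpr::Paren(inner) => lower(inner),
        SyntaxExpr::Add(l, r) => CoreExpr::Add(Box::new(lower(l)), Box::new(lower(r))),
        // `x += e` becomes `x = x + e` in this toy language.
        SyntaxExpr::AddAssign(name, rhs) => CoreExpr::Assign(
            name.clone(),
            Box::new(CoreExpr::Add(
                Box::new(CoreExpr::Var(name.clone())),
                Box::new(lower(rhs)),
            )),
        ),
    }
}

fn main() {
    // x += (y + 1)
    let surface = SyntaxExpr::AddAssign(
        "x".to_string(),
        Box::new(SyntaxExpr::Paren(Box::new(SyntaxExpr::Add(
            Box::new(SyntaxExpr::Var("y".to_string())),
            Box::new(SyntaxExpr::Lit(1)),
        )))),
    );
    println!("{:?}", lower(&surface));
}
```

Syntax extensions would only ever see `SyntaxExpr`, so `CoreExpr` could gain or lose variants without breaking them.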
I would also like to make available more high level information about a program being compiled which is currently implicit in the compiler. For example, finding all implementations of a trait or uses of a variable. This could be computed by an external tool, but this kind of information is likely to be widely used and we would be helping tool authors by making it available. I expect a blessed compiler plugin which could be used by other plugins is the best solution.
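As a rough idea of what such shared, higher-level information could look like to a tool author, here is a sketch of a tiny index answering the two example queries. `ProgramIndex` and its methods are invented for this example; a real version would be populated by the compiler and keyed on proper item ids rather than strings.

```rust
use std::collections::HashMap;

/// Hypothetical index a blessed plugin might build once and share with other tools.
struct ProgramIndex {
    /// Trait name -> names of the types implementing it.
    impls: HashMap<String, Vec<String>>,
    /// Variable name -> (line, column) of each use.
    uses: HashMap<String, Vec<(usize, usize)>>,
}

impl ProgramIndex {
    fn new() -> ProgramIndex {
        ProgramIndex { impls: HashMap::new(), uses: HashMap::new() }
    }

    /// Called by the indexer when it sees `impl Trait for Type`.
    fn record_impl(&mut self, trait_name: &str, type_name: &str) {
        self.impls
            .entry(trait_name.to_string())
            .or_default()
            .push(type_name.to_string());
    }

    /// Called by the indexer for every use of a variable.
    fn record_use(&mut self, var: &str, line: usize, col: usize) {
        self.uses.entry(var.to_string()).or_default().push((line, col));
    }

    /// Query: all implementations of a trait (empty if there are none).
    fn impls_of(&self, trait_name: &str) -> &[String] {
        self.impls.get(trait_name).map(Vec::as_slice).unwrap_or(&[])
    }

    /// Query: all uses of a variable (empty if there are none).
    fn uses_of(&self, var: &str) -> &[(usize, usize)] {
        self.uses.get(var).map(Vec::as_slice).unwrap_or(&[])
    }
}

fn main() {
    let mut index = ProgramIndex::new();
    index.record_impl("Show", "Point");
    index.record_impl("Show", "Rect");
    index.record_use("x", 3, 9);
    index.record_use("x", 7, 14);

    println!("impls of Show: {:?}", index.impls_of("Show"));
    println!("uses of x: {:?}", index.uses_of("x"));
}
```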
Again, good documentation is really important here - the compiler is only a useful component for tool authors if the exposed APIs are well documented and there are good guides to using the APIs.
If we do this right, I hope to see the compiler used as a library in sophisticated tools such as IDEs - incremental compilation, type information, warnings (including lints), macro expansion, code search, etc. all available as APIs.
Plans for the compiler

General things

If we're going to get serious about compiler speed (and I think we should, as well as getting proper incremental compilation, etc.), we need better infrastructure to prevent regressions there. Currently, we really only have isrustfastyet. A possible idea is to have a dedicated timing server (not a VM) which will pull rust, build three times with time_passes, find the average for each pass, make a graph of the results, if there is a regression, email anyone who merged a patch since the previous pull, repeat.
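The averaging and regression check for such a timing server could look something like the sketch below. The pass names, the 10% threshold, and all function names are assumptions made for the example; parsing real time_passes output, graphing, and sending email are left out.

```rust
use std::collections::BTreeMap;

/// Per-pass timings in seconds for one build (pass name -> seconds).
type Timings = BTreeMap<String, f64>;

/// Convenience constructor for the example data.
fn timings(pairs: &[(&str, f64)]) -> Timings {
    pairs.iter().map(|(pass, secs)| (pass.to_string(), *secs)).collect()
}

/// Averages each pass over several builds of the same revision.
fn average(runs: &[Timings]) -> Timings {
    let mut sums: BTreeMap<String, (f64, usize)> = BTreeMap::new();
    for run in runs {
        for (pass, secs) in run {
            let entry = sums.entry(pass.clone()).or_insert((0.0, 0));
            entry.0 += *secs;
            entry.1 += 1;
        }
    }
    sums.into_iter()
        .map(|(pass, (total, n))| (pass, total / n as f64))
        .collect()
}

/// Names of passes whose time grew by more than `threshold` (0.10 means 10%).
fn regressions(previous: &Timings, current: &Timings, threshold: f64) -> Vec<String> {
    let mut slower = Vec::new();
    for (pass, now) in current {
        if let Some(before) = previous.get(pass) {
            if *now > *before * (1.0 + threshold) {
                slower.push(pass.clone());
            }
        }
    }
    slower
}

fn main() {
    // Averages recorded for the previous pull.
    let previous = timings(&[("type checking", 9.0), ("translation", 20.0)]);
    // Three builds of the current pull.
    let runs = vec![
        timings(&[("type checking", 10.0), ("translation", 20.0)]),
        timings(&[("type checking", 11.0), ("translation", 20.0)]),
        timings(&[("type checking", 12.0), ("translation", 20.0)]),
    ];
    let current = average(&runs);
    for pass in regressions(&previous, &current, 0.10) {
        println!("regression in {}: {:.1}s -> {:.1}s", pass, previous[&pass], current[&pass]);
    }
}
```

Three runs is a small sample, so a real server would probably want to look at variance as well as the mean before emailing anyone.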