I've been getting a lot of questions about the status of "Non-lexical lifetimes" (NLL) -- or, as I prefer to call it these days, the MIR-based borrow checker -- so I wanted to post a status update.

The single most important fact is that the MIR-based borrow check is feature complete and available on nightly. What this means is that the behavior of #![feature(nll)] is roughly what we intend to ship for "version 1", except that (a) the performance needs work and (b) we are still improving the diagnostics. (More on those points later.)

The MIR-based borrow check as currently implemented represents a huge step forward from the existing borrow checker, for two reasons. First, it eliminates a ton of borrow check errors, resulting in a much smoother compilation experience. Second, it has far fewer bugs. More on this point later, too.

You may be wondering how this all relates to the "alias-based borrow check" that I outlined in [my previous post][pp], which we have since dubbed Polonius. We have implemented that analysis and solved the performance hurdles that it used to have, but it will still take some effort to get it fully ready to ship. The plan is to defer that work and ultimately ship Polonius as a second step: it will basically be a "MIR-based borrow check 2.0", offering even fewer errors.

[pp]: {{ site.baseurl }}/blog/2018/04/27/an-alias-based-formulation-of-the-borrow-checker/

Would you like to help?

If you'd like to be involved, we'd love to have you! The NLL working group hangs out on the #wg-nll stream in Zulip. We have weekly meetings on Tuesdays (3:30pm Eastern time) where we discuss the priorities for the week and try to dole out tasks. If that time doesn't work for you, you can of course pop in any time and communicate asynchronously. You can also always go look for work to do amongst [the list of GitHub issues].

Transition period

As I mentioned earlier, the MIR-based borrow checker fixes a lot of bugs -- this is largely a side effect of making the check operate over the MIR. This is great, but it does mean that we can't just "flip the switch" and enable the MIR-based borrow checker by default, since that would break existing crates (I don't really know how many yet). The plan therefore is to have a transition period.

During the transition period, we will issue warnings if your program used to compile with the old borrow checker but is rejected by the new one (because we fixed a bug in the borrow check). The way we do this is to run both the old and the new borrow checker. If the new checker would report an error, we first check whether the old checker would also report an error. If so, we issue the error as normal. If not, we issue only a warning, since that represents a case that used to compile but would now be rejected.
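
In pseudocode, the decision boils down to something like this (a minimal sketch of the logic just described, not the actual compiler code; all names are made up):

enum Diagnostic { Error, Warning, None }

fn diagnostic_for(new_checker_rejects: bool, old_checker_rejects: bool) -> Diagnostic {
  match (new_checker_rejects, old_checker_rejects) {
    // Both checkers reject: issue the error as normal.
    (true, true) => Diagnostic::Error,
    // Only the new checker rejects: this code used to compile,
    // so downgrade to a warning during the transition period.
    (true, false) => Diagnostic::Warning,
    // The new checker accepts: nothing to report, even if the
    // old checker would have complained.
    (false, _) => Diagnostic::None,
  }
}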

The good news is that while the MIR-based checker fixes a lot of bugs, it also accepts a lot more code. This lessens the overall impact. That is, there is a lot of code which ought to have gotten errors from the old borrow check (but never did), but most of that code won't get any errors at all under the new check. No harm, no foul. =)

Performance

One of the main things we are working on is the performance of the MIR-based checker, since enabling the MIR-based borrow checker currently implies significant overhead during compilation. Take a look at this chart, which plots rustc build times for the clap crate:

clap-rs performance

The black line ("clean") represents the "from scratch" build time with rustc today. The orange line ("nll") represents "from scratch" build times when NLL is enabled. (The other lines represent incremental build times in various combinations.) You can see we've come a long way, but there is still plenty of work to do.

The biggest problem at this point is that we effectively have to "re-run" the type check a second time on the MIR, in order to compute all the lifetimes. This means we are doing two type-checks, and that is expensive. However, this second type check can be significantly simpler than the original: most of the "heavy lifting" has been done. Moreover, there are lots of opportunities to cache work between them so that it only has to be done once. So I'm confident we'll make big strides here. (For example, I've got a PR up right now that adds some simple memoization for a 20% win, and I'm working on follow-ups that add much more aggressive memoization.)

(There is an interesting corollary to this: after the transition period, the first type check will have no need to consider lifetimes at all, which I think means we should be able to make it run quite a bit faster as well. That should mean a shorter "time till first error" and also help things like computing autocompletion information for the RLS.)

Diagnostics

It's not enough to point out problems in the code; we also have to explain the error in an understandable way. We've put a lot of effort into our existing borrow checker's error messages. In some cases, the MIR-based borrow checker actually does better here. It has access to more information, which means it can be more specific than the older checker. As an example1, consider this error that the old borrow checker gives:

error[E0597]: `json` does not live long enough
  --> src\main.rs:38:17
   |
38 |         let v = json["data"]["search"]["edges"].as_array();
   |                 ^^^^ borrowed value does not live long enough
...
52 |     }
   |     - `json` dropped here while still borrowed
...
90 | }
   | - borrowed value needs to live until here

The error isn't bad, but you'll note that while it says "borrowed value needs to live until here" it doesn't tell you why the borrowed value needs to live that long -- only that it does. Compare that to the new error you get from the same code:

error[E0597]: `json` does not live long enough
  --> src\main.rs:39:17
   |
39 |         let v = json["data"]["search"]["edges"].as_array();
   |                 ^^^^ borrowed value does not live long enough
...
53 |     }
   |     - borrowed value only lives until here
...
70 |             ", last_cursor))
   |                ----------- borrow later used here

The new error doesn't tell you "how long" the borrow must last; instead, it points to a concrete use. That's great.

Other times, though, the errors from the new checker are not as good. This is particularly true when it comes to suggestions and tips for how to fix things. We've gone through all of our internal diagnostic tests and drawn up a list of about 37 issues, documenting each point where the new checker's message is not as good as the old one's, and we're now working through this list.

Polonius

In my [previous blog post][pp], I described a new version of the borrow check, which we have since dubbed Polonius. That analysis further improves on the MIR-based borrow check that is in Nightly now. The most significant improvement that Polonius brings has to do with "conditional returns". Consider this example:

fn foo<T>(vec: &mut Vec<T>) -> &T {
  let r = &vec[0];
  if some_condition(r) {
    return r;
  }
  
  // Question: can we mutate `vec` here? On Nightly,
  // you get an error, because a reference that is returned (like `r`)
  // is considered to be in scope until the end of the function,
  // even if that return only happens conditionally. Polonius can
  // accept this code.
  vec.push(...);
}

In this example, vec is borrowed to produce r, and r is then returned -- but only sometimes. In the MIR borrowck on nightly, this will give an error -- when r is returned, the borrow is forced to last until the end of foo, no matter what path we take. The Polonius analysis is more precise, and understands that, outside of the if, vec is no longer referenced by any live references.

We originally intended for NLL to accept examples like this: in the RFC, this was called Problem Case #3. However, we had to remove that support because it was simply killing compilation times, and there were also cases where it wasn't as precise as we wanted. Of course, some of you may recall that in my [previous post about Polonius][pp] I wrote:

...the performance has a long way to go ([Polonius] is currently slower than existing analysis).

I'm happy to report that this problem is basically solved. Despite the increased precision, the Polonius analysis is now easily as fast as the existing Nightly analysis, thanks to a smarter encoding of the rules as well as the move to datafrog. We haven't done detailed comparisons yet, but the performance gap is essentially gone.

If you'd like, you can try Polonius today using the -Zpolonius switch on Nightly. Keep in mind, though, that it is in a 'pre-alpha' state: there are still known bugs that we have not yet prioritized fixing.

Conclusion

The key take-aways here:

  • NLL is in a "feature complete" state on Nightly.
  • We are doing a focused push on diagnostics and performance, primarily.
  • Even once it ships, we can expect further improvements in the future, as we bring in the Polonius analysis.

Footnotes

  1. Hat tip to steveklabnik for providing this example!

Now that the final Rust 2018 Release Candidate has shipped, I thought it would be a good idea to do another update on the state of the MIR-based borrow check (aka NLL). The [last update][] was in June, when we were still hard at work on getting things to work.

[last update]: {{ site.baseurl }}/blog/2018/06/15/mir-based-borrow-check-nll-status-update/

Rust 2018 will use NLL now, Rust 2015 support coming soon

Let's get the highlights out of the way. Most importantly, Rust 2018 crates will use NLL by default. Once the Rust 2018 release candidate becomes stable, I hope to switch Rust 2015 crates to use NLL as well, but we're holding off until we have some more experience with people using it in the wild.

NLL is awesome

I've been using NLL in practice for quite some time now, and I can't imagine going back. Recently I've been working in my spare time on the salsa crate1, which uses Rust 2018, and I've really noticed how NLL makes a lot of "complex" borrowing interactions work out quite smoothly. These are all instances of the [problem cases #1 and #2] I highlighted way back when2, but they interact in interesting ways I did not fully anticipate.

Let me give you a hypothetical example. Imagine I am writing some bit of code that routes messages, which look like this:

enum Message {
    Letter { recipient: String, data: String },
    // ... other cases here ...
}

When I receive a letter, I want to inspect its recipient. If that matches my name, I will process the data using process:

fn process(data: &str) { .. }

but otherwise I'll forward it along to the next person in the chain. Using NLL, I can write this code like so (playground):

use std::sync::mpsc::{Receiver, Sender};

fn router(me: &str, rx: Receiver<Message>, tx: Sender<Message>) {
  for message in rx {
    match &message {
      Message::Letter { recipient, data } => {
        if recipient != me {
          tx.send(message).unwrap();
        } else {
          process(data);
        }
      }
    }
  }
}

What's interesting about this code is how uninteresting it is -- it basically just does what you expect, and didn't require any special action to please the borrow checker3. But before NLL, this would have required some significant contortions to achieve (try it yourself if you like -- that's a link to the same code, but with Rust 2015 edition set).
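
To give a flavor of those contortions (this is my own sketch, not the exact code behind the playground link above): without NLL you have to make the decision while the borrow of message is active, and only move or re-borrow message after that borrow has lexically ended.

fn router(me: &str, rx: Receiver<Message>, tx: Sender<Message>) {
  for message in rx {
    // First, inspect the message and record the decision...
    let forward = match &message {
      Message::Letter { recipient, .. } => recipient != me,
    };
    // ...then act on `message` once that borrow has expired.
    if forward {
      tx.send(message).unwrap();
    } else {
      match &message {
        Message::Letter { data, .. } => process(data),
      }
    }
  }
}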

Diagnostics, migration, and performance

We've also put a lot of effort into NLL diagnostics and I think that by and large they are even better than the old borrow checker's (which were already quite good). This is particularly true for the 'lifetime error messages'. Unfortunately, you won't see all of those improvements yet on Rust 2018 -- the reason has to do with migration.

What is this migration you ask? Well, it's our way of dealing with the fact that the new MIR-based borrow checker has fixed a ton of soundness bugs from the old checker. Unfortunately, in practice, that means that some existing code will not compile anymore (because it never should have compiled in the first place!). To give people time to make that transition, we are running the NLL code in "migration mode", which means that if you have code that used to compile, but no longer does, we issue warnings instead of errors. This migration mode will eventually change to issue hard errors instead (probably in a few releases, but that depends a bit on what we find in the wild).

One downside of migration mode is that it requires keeping around the older code. In some cases, this older code can produce errors that wind up masking the newer, nicer errors that are produced by the MIR-based checker. The good news is that once we finish the migration, this means that errors will just get better.

Finally, those of you who read the previous posts may remember that the performance of the NLL checker was a big stumbling block. I'm happy to report that the performance issues were largely addressed: there remains some slight overhead to using NLL, but it is largely not noticeable in practice, and I expect we'll continue to improve it over time.

What next?

So, now that NLL is shipping, what is next for ownership and borrowing in Rust? That's a big question, and it has a few different answers, depending on the "scale" of time we are looking at. The immediate answer is that we've still got some bugs to nail down (small ones) and of course we expect that once more people start banging on the new code, they'll encounter new problems that have to be fixed. In addition, we've got to put some energy into writing up documentation for how the new checker works and similar things (we wound up deviating from the RFC analysis in various ways, and it'd be nice to document those).

In the medium term, the plan is to push more on the Polonius formulation of NLL that [I described here][alias-based]. In addition to offering a crisp formalization of our analysis, Polonius promises to fix the [Problem Case #3][pc3-link] that I identified in the original NLL introduction, along with some other cases where the current analysis falls short.

In the longer term, well, that's an open question, and one where I would like to hear from you, dear reader. Over the next week or so, I am planning to write up a series of blog posts. Each will describe what I consider to be a common "tricky scenario" where people hit problems with the borrow checker, none of which are solved by NLL. I'll also describe the current fixes required. Then I hope to do a survey, trying to get a picture of which of these challenges cause the most problems for folks, so that we can try to decide how to prioritize future improvements to Rust.

[alias-based]: {{ site.baseurl }}/blog/2018/04/27/an-alias-based-formulation-of-the-borrow-checker/
[pc3-link]: {{ site.baseurl }}/blog/2016/04/27/non-lexical-lifetimes-introduction/#problem-case-3-conditional-control-flow-across-functions

Footnotes

  1. Did you see how smoothly I worked in that plug for salsa? I'll write a post about it soon, I promise.

  2. Note that the current NLL implementation does not solve Problem Case #3. See [the "What Next?" section][wn] for more. [wn]: #what-next

  3. Interestingly, I remember an example almost exactly like this being shown to me by a Servo intern -- I forget which one -- many years ago. At the time, it didn't seem like a big deal to do the workarounds, but I realize now I was wrong about that. Ah well.

In my previous post on the status of NLL, I promised to talk about "What is next?" for ownership and borrowing in Rust. I want to lay out the various limitations of Rust's ownership and borrowing system that I see, as well as -- where applicable -- current workarounds. I'm curious to get feedback on which problems affect folks the most.

The first limitation I wanted to focus on is interprocedural conflicts. In fact, I've covered a special case of this before -- where a closure conflicts with its creator function -- in my post on [Precise Closure Capture Clauses][cc]. But the problem is more general.

[cc]: {{ site.baseurl }}/blog/2018/04/24/rust-pattern-precise-closure-capture-clauses/

The problem

Oftentimes, it happens that we have a big struct that contains a number of fields, not all of which are used by all the methods. Consider a struct like this:

use std::sync::mpsc::Sender;

struct MyStruct {
  widgets: Vec<MyWidget>,
  counter: usize,
  listener: Sender<()>,
}

struct MyWidget { .. }

Perhaps we have a method signal_event, which increments the counter each time some sort of event occurs. It also fires off a message to some listener to let them know.

impl MyStruct {
  fn signal_event(&mut self) {
    self.counter += 1;
    self.listener.send(()).unwrap();
  }
}

The problem arises when we try to invoke this method while we are simultaneously using some of the other fields of MyStruct. Suppose we are "checking" our widgets, and this process might generate the events we are counting; that might look like so:

impl MyStruct {
  fn check_widgets(&mut self) {
    for widget in &self.widgets {
      if widget.check() {
        self.signal_event();
      }
    }
  }
}  

Unfortunately, this code is going to yield a compilation error. The error I get presently is:

error[E0502]: cannot borrow `*self` as mutable because it is also borrowed as immutable
  --> src/main.rs:26:17
     |
  24 |         for widget in &self.widgets {
     |                       -------------
     |                       |
     |                       immutable borrow occurs here
     |                       immutable borrow used here, in later iteration of loop
  25 |             if widget.check() {
  26 |                 self.signal_event();
     |                 ^^^^^^^^^^^^^^^^^^^ mutable borrow occurs here

What this message is trying to tell you1 is that:

  • During the loop, you are holding a borrow of self.widgets.
  • You are then giving away access to self in order to call signal_event.
    • The danger here is that signal_event may mutate self.widgets, which you are currently iterating over (the hypothetical sketch below makes this concrete).
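
To see why the compiler worries about this, imagine a hypothetical variant of signal_event (my own illustration, not part of the example) that really did touch the widgets:

impl MyStruct {
  fn signal_event(&mut self) {
    self.counter += 1;
    self.listener.send(()).unwrap();
    // Hypothetical extra line: this clears the very vector that
    // check_widgets is iterating over, dropping elements the loop
    // may still hold references to.
    self.widgets.clear();
  }
}

Nothing in the signature of signal_event rules this variant out, which is why the call inside the loop is rejected.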

Now, you and I know that signal_event is not going to touch the self.widgets field, so there should be no problem here. But the compiler doesn't know that, because it only examines one function at a time.

Inlining as a possible fix

The simplest way to fix this problem is to modify check_widgets to inline the body of signal_event:

impl MyStruct {
  fn check_widgets(&mut self) {
    for widget in &self.widgets {
      if widget.check() {
        // Inline `self.signal_event()`:
        self.counter += 1;
        self.listener.send(()).unwrap(); 
      }
    }
  }
}  

Now the compiler can clearly see that distinct fields of self are being used, so everything is hunky dory. Of course, now we've created a "DRY"-failure -- we have two bits of code that know how to signal an event, and they could easily fall out of sync.

Factoring as a possible fix

One way to address the DRY failure is to factor our types better. For example, perhaps we can extract an EventSignal type and move the signal_event method there:

struct EventSignal {
  counter: usize,
  listener: Sender<()>,
}

impl EventSignal {
  fn signal_event(&mut self) {
    self.counter += 1;
    self.listener.send(()).unwrap();
  }
}

Now we can modify the MyStruct type to embed an EventSignal:

struct MyStruct {
  widgets: Vec<MyWidget>,
  signal: EventSignal,
}

Finally, instead of writing self.signal_event(), we will write self.signal.signal_event():

impl MyStruct {
  fn check_widgets(&mut self) {
    for widget in &self.widgets {
      if widget.check() {
        self.signal.signal_event(); // <-- Changed
      }
    }
  }
}  

This code compiles fine, since the compiler now sees access to two distinct fields: widgets and signal. Moreover, we can invoke self.signal.signal_event() from as many places as we want without duplication.

Truth be told, factoring sometimes makes for cleaner code: e.g., in this case, there was a kind of "mini type" hiding within MyStruct, and it's nice that we can extract it. But definitely not always. It can be more verbose, and I sometimes find that it makes things more opaque, simply because there are now more structs running around that I have to look at. Some things are so simple that the complexity of having a separate struct outweighs the win of isolating a distinct bit of functionality.

The other problem with factoring is that it doesn't always work: sometimes we have methods that each use a specific set of fields, but those fields don't factor nicely. For example, if we return to our original MyStruct (where everything was inlined), perhaps we might have a method that used both self.counter and self.widgets but not self.listener -- the factoring we did can't help us identify a function that uses counter but not listener.
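
For example, sticking with the original MyStruct, we might want a (hypothetical) method like this, which needs counter and widgets but not listener:

impl MyStruct {
  fn events_per_widget(&self) -> usize {
    // Uses `counter` and `widgets`, but not `listener` -- a set of
    // fields that doesn't line up with the `EventSignal` grouping.
    self.counter / self.widgets.len().max(1)
  }
}

No grouping of counter and listener into a sub-struct serves both this method and signal_event; their field sets overlap without either containing the other.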

Free functions as a general, but extreme solution

One very general way to sidestep our problem is to move things out of method form and into a "free function". The idea is that instead of &mut self, you will take a separate &mut parameter for each field that you use. So signal_event might look like:

fn signal_event(counter: &mut usize, listener: &Sender<()>) {
  *counter += 1;
  listener.send(()).unwrap();
}

Then we would replace self.signal_event() with:

signal_event(&mut self.counter, &self.listener)

Obviously, this is a significant ergonomic regression. However, it is very effective at exposing the set of fields that will be accessed to our caller.
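
Plugged back into check_widgets, the call looks like this (a sketch; the compiler now sees borrows of three distinct fields, just as in the inlined version, so it compiles):

impl MyStruct {
  fn check_widgets(&mut self) {
    for widget in &self.widgets {
      if widget.check() {
        // Borrow exactly the fields we need; the loop's borrow of
        // `self.widgets` is left untouched.
        signal_event(&mut self.counter, &self.listener);
      }
    }
  }
}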

Moving to a free function also gives us some extra flexibility. You may have noted, for example, that the signal_event function takes a &Sender<()> and not a &mut Sender<()>. This is because the send method on Sender only requires &self, so a shared borrow is all we need. This means that we could invoke signal_event in some location where we needed another shared borrow of self.listener (perhaps another method or function).

View structs as a general, but extreme solution

I find moving to a free function to be OK in a pinch, but it's pretty annoying if you have a lot of fields, or if the method you are converting calls other methods (in which case you need to identify the transitive set of fields). There is another technique I have used from time to time, though it's fairly heavyweight. The idea is to define a "view struct" which has all the same fields as the original, but uses references to indicate whether each field is used in a "shared" (immutable) or "mutable" way.

For example, we might define a CheckWidgetsView:

struct CheckWidgetsView<'me> {
  widgets: &'me Vec<MyWidget>,
  counter: &'me mut usize,
  listener: &'me mut Sender<()>,
}

Now we can define methods on the view without a problem:

impl<'me> CheckWidgetsView<'me>  {
  fn signal_event(&mut self) {
    *self.counter += 1;
    self.listener.send(()).unwrap();
  }

  fn check_widgets(&mut self) {
    for widget in self.widgets {
      if widget.check() {
        self.signal_event();
      }
    }
  }
}  

You might wonder why this solved the problem. After all, the check_widgets method here looks basically the same as before -- on the surface, it still seems to involve two overlapping borrows:

  • a borrow of self.widgets, for the duration of the for loop
  • a mutable borrow of self, when invoking signal_event

The difference lies in the type of self.widgets: because it is a &Vec<MyWidget>, the for loop only needs to copy that reference out of self -- the iterator then borrows the vector's data, which lives outside of self, rather than borrowing self itself. And since signal_event only ever sees the widgets through a shared reference, it has no way to mutate the vector we are iterating over. So there is nothing that signal_event could do to mess up our iteration.

(Note that if we needed to mutate the widgets as we iterated, this "view struct" trick would not work here, and we'd be back where we started -- or rather, we'd need a new view struct just for signal_event.)
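
For completeness, here is roughly how MyStruct itself might construct the view and hand off to it -- this is my own sketch of the plumbing, assuming the original three fields. Each field is borrowed once, with exactly the access it needs, so the borrows are disjoint and this compiles:

impl MyStruct {
  fn check_widgets(&mut self) {
    let mut view = CheckWidgetsView {
      widgets: &self.widgets,
      counter: &mut self.counter,
      listener: &mut self.listener,
    };
    view.check_widgets();
  }
}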

One nice thing about view structs is that we can have more than one, and each can borrow a different set of fields in different ways. So, for example, one sometimes has "double buffering"-like algorithms that use one field for input and another for output, where which field plays which role alternates depending on the phase (and other fields are perhaps used in a shared capacity). View structs can handle this quite elegantly.

Relation to closures

As I mentioned, one common place where this problem arises is actually with closures. This occurs because closures always capture entire local variables; so if a closure only uses some particular field of a local, it can create an unnecessary conflict. For example:

fn check_widgets(&mut self) {
  // Make a closure that uses `self.counter`
  // and `self.listener`; but it will actually
  // capture all of `self`.
  let mut signal_event = || {
    self.counter += 1;
    self.listener.send(()).unwrap(); 
  };
  
  for widget in &self.widgets {
    if widget.check() {
      signal_event();
    }
  }
}

Even though it's an instance of the same general problem, it's worth calling out specially, because it can be solved in different ways. In fact, we've accepted RFC #2229, which proposes to change the closure desugaring. In this case, the closure would only capture self.counter and self.listener, avoiding the problem.
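
In the meantime, a common workaround -- sketched here against the same hypothetical struct as above -- is to borrow the needed fields into locals first, so the closure captures only those references rather than all of self:

fn check_widgets(&mut self) {
  // Borrow just the fields the closure needs...
  let counter = &mut self.counter;
  let listener = &self.listener;
  let mut signal_event = || {
    *counter += 1;
    listener.send(()).unwrap();
  };

  // ...leaving `self.widgets` free for the loop to borrow.
  for widget in &self.widgets {
    if widget.check() {
      signal_event();
    }
  }
}

This is essentially the manual capture pattern from the [Precise Closure Capture Clauses][cc] post linked earlier; RFC #2229 would have the compiler infer it for you.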

Extending the language to solve this problem

There has been discussion on and off about how to solve this problem. Clearly, there is a need to permit methods to expose information about which fields they access and how they access them, but it's not yet clear what the best way to do that is. There are a number of tradeoffs at play:

  • Adding more concepts to the surface language.
  • Core complexity; this probably involves extending the base borrow checker rules.
  • Annotation burden.
  • Semver considerations (see below).

There is some discussion of the view idea in this internals thread; I've also tinkered with the idea of merging views and traits, as described in this internals post. I've also toyed with the idea of trying to infer some of this information for private functions (or perhaps even crate-private functions), but I think it'd be best to start with some form of explicit syntax.

Semver considerations. One of the things you'll notice about all of the solutions to the problem is that they are all ways of exposing information to the compiler about which fields will be used in signal_event or (in the case of view structs) how they will be used. This has semver implications: imagine you have a public function fn get(&self) -> &Foo that returns a reference to something in self. If we now permit your clients to invoke other methods while that borrow is live (because we know somehow that they won't interfere), that is a semver commitment. The current version, where your struct is considered an atomic unit, gives you maximal freedom to change your implementation in the future, because it is maximally conservative with respect to what your clients can do.

Conclusion

The general problem here I think is being able to identify which fields are used by a method (or set of methods) and how. I've shown a number of workarounds you can use today. I'm interested to hear, however, how often this problem affects you, and which (if any) of the workarounds might have helped you. (As noted, I would break out closures into their own subcategory of this problem.)

Footnotes

  1. This message is actually a bit confusing.
