Skip to content

Instantly share code, notes, and snippets.

@jonhoo
Last active April 20, 2024 07:44
Show Gist options
  • Star 22 You must be signed in to star a gist
  • Fork 8 You must be signed in to fork a gist
  • Save jonhoo/7cfdfe581e5108b79c2a4e9fbde38de8 to your computer and use it in GitHub Desktop.
Save jonhoo/7cfdfe581e5108b79c2a4e9fbde38de8 to your computer and use it in GitHub Desktop.
cell-refcell-rc
// these aren't _quite_ functional tests,
// and should all be compile_fail,
// but may be illustrative
#[test]
fn concurrent_set() {
use std::sync::Arc;
let x = Arc::new(Cell::new(42));
let x1 = Arc::clone(&x);
std::thread::spawn(move || {
x1.set(43);
});
let x2 = Arc::clone(&x);
std::thread::spawn(move || {
x2.set(44);
});
}
#[test]
fn set_during_get() {
let x = Cell::new(String::from("hello"));
let first = x.get();
x.set(String::new());
x.set(String::from("world"));
eprintln!("{}", first);
}
#[test]
fn concurrent_set_take2() {
use std::sync::Arc;
let x = Arc::new(Cell::new([0; 40240]));
let x1 = Arc::clone(&x);
let jh1 = std::thread::spawn(move || {
x1.set([1; 40240]);
});
let x2 = Arc::clone(&x);
let jh2 = std::thread::spawn(move || {
x2.set([2; 40240]);
});
jh1.join().unwrap();
jh2.join().unwrap();
let xs = x.get();
for &i in xs.iter() {
eprintln!("{}", i);
}
}
#[test]
fn concurrent_get_set() {
use std::sync::Arc;
let x = Arc::new(Cell::new(0));
let x1 = Arc::clone(&x);
let jh1 = std::thread::spawn(move || {
for _ in 0..1000000 {
let x = x1.get();
x1.set(x + 1);
}
});
let x2 = Arc::clone(&x);
let jh2 = std::thread::spawn(move || {
for _ in 0..1000000 {
let x = x2.get();
x2.set(x + 1);
}
});
jh1.join().unwrap();
jh2.join().unwrap();
assert_eq!(x.get(), 2000000);
}
use std::cell::UnsafeCell;
pub struct Cell<T> {
value: UnsafeCell<T>,
}
// implied by UnsafeCell
// impl<T> !Sync for Cell<T> {}
impl<T> Cell<T> {
pub fn new(value: T) -> Self {
Cell {
value: UnsafeCell::new(value),
}
}
pub fn set(&self, value: T) {
// SAFETY: we know no-one else is concurrently mutating self.value (because !Sync)
// SAFETY: we know we're not invalidating any references, because we never give any out
unsafe { *self.value.get() = value };
}
pub fn get(&self) -> T
where
T: Copy,
{
// SAFETY: we know no-one else is modifying this value, since only this thread can mutate
// (because !Sync), and it is executing this function instead.
unsafe { *self.value.get() }
}
}
use crate::cell::Cell;
use std::marker::PhantomData;
use std::ptr::NonNull;
struct RcInner<T> {
value: T,
refcount: Cell<usize>,
}
pub struct Rc<T> {
inner: NonNull<RcInner<T>>,
_marker: PhantomData<RcInner<T>>,
}
impl<T> Rc<T> {
pub fn new(v: T) -> Self {
let inner = Box::new(RcInner {
value: v,
refcount: Cell::new(1),
});
Rc {
// SAFETY: Box does not give us a null pointer.
inner: unsafe { NonNull::new_unchecked(Box::into_raw(inner)) },
_marker: PhantomData,
}
}
}
impl<T> std::ops::Deref for Rc<T> {
type Target = T;
fn deref(&self) -> &Self::Target {
// SAFETY: self.inner is a Box that is only deallocated when the last Rc goes away.
// we have an Rc, therefore the Box has not been deallocated, so deref is fine.
&unsafe { self.inner.as_ref() }.value
}
}
impl<T> Clone for Rc<T> {
fn clone(&self) -> Self {
let inner = unsafe { self.inner.as_ref() };
let c = inner.refcount.get();
inner.refcount.set(c + 1);
Rc {
inner: self.inner,
_marker: PhantomData,
}
}
}
// TODO: #[may_dangle]
impl<T> Drop for Rc<T> {
fn drop(&mut self) {
let inner = unsafe { self.inner.as_ref() };
let c = inner.refcount.get();
if c == 1 {
drop(inner);
// SAFETY: we are the _only_ Rc left, and we are being dropped.
// therefore, after us, there will be no Rc's, and no references to T.
let _ = unsafe { Box::from_raw(self.inner.as_ptr()) };
} else {
// there are other Rcs, so don't drop the Box!
inner.refcount.set(c - 1);
}
}
}
use crate::cell::Cell;
use std::cell::UnsafeCell;
#[derive(Copy, Clone)]
enum RefState {
Unshared,
Shared(usize),
Exclusive,
}
pub struct RefCell<T> {
value: UnsafeCell<T>,
state: Cell<RefState>,
}
// implied by UnsafeCell
// impl<T> !Sync for RefCell<T> {}
impl<T> RefCell<T> {
pub fn new(value: T) -> Self {
Self {
value: UnsafeCell::new(value),
state: Cell::new(RefState::Unshared),
}
}
pub fn borrow(&self) -> Option<Ref<'_, T>> {
match self.state.get() {
RefState::Unshared => {
self.state.set(RefState::Shared(1));
Some(Ref { refcell: self })
}
RefState::Shared(n) => {
self.state.set(RefState::Shared(n + 1));
Some(Ref { refcell: self })
}
RefState::Exclusive => None,
}
}
pub fn borrow_mut(&self) -> Option<RefMut<'_, T>> {
if let RefState::Unshared = self.state.get() {
self.state.set(RefState::Exclusive);
// SAFETY: no other references have been given out since state would be
// Shared or Exclusive.
Some(RefMut { refcell: self })
} else {
None
}
}
}
pub struct Ref<'refcell, T> {
refcell: &'refcell RefCell<T>,
}
impl<T> std::ops::Deref for Ref<'_, T> {
type Target = T;
fn deref(&self) -> &Self::Target {
// SAFETY
// a Ref is only created if no exclusive references have been given out.
// once it is given out, state is set to Shared, so no exclusive references are given out.
// so dereferencing into a shared reference is fine.
unsafe { &*self.refcell.value.get() }
}
}
impl<T> Drop for Ref<'_, T> {
fn drop(&mut self) {
match self.refcell.state.get() {
RefState::Exclusive | RefState::Unshared => unreachable!(),
RefState::Shared(1) => {
self.refcell.state.set(RefState::Unshared);
}
RefState::Shared(n) => {
self.refcell.state.set(RefState::Shared(n - 1));
}
}
}
}
pub struct RefMut<'refcell, T> {
refcell: &'refcell RefCell<T>,
}
impl<T> std::ops::Deref for RefMut<'_, T> {
type Target = T;
fn deref(&self) -> &Self::Target {
// SAFETY
// see safety for DerefMut
unsafe { &*self.refcell.value.get() }
}
}
impl<T> std::ops::DerefMut for RefMut<'_, T> {
fn deref_mut(&mut self) -> &mut Self::Target {
// SAFETY
// a RefMut is only created if no other references have been given out.
// once it is given out, state is set to Exclusive, so no future references are given out.
// so we have an exclusive lease on the inner value, so mutably dereferencing is fine.
unsafe { &mut *self.refcell.value.get() }
}
}
impl<T> Drop for RefMut<'_, T> {
fn drop(&mut self) {
match self.refcell.state.get() {
RefState::Shared(_) | RefState::Unshared => unreachable!(),
RefState::Exclusive => {
self.refcell.state.set(RefState::Unshared);
}
}
}
}
@poppindouble
Copy link

Hi, Jon, Great great tutorial! love it.

One question:

// TODO: #[may_dangle]
impl<T> Drop for Rc<T> {
    fn drop(&mut self) {
        let inner = unsafe { self.inner.as_ref() };
        let c = inner.refcount.get();
        if c == 1 {
            drop(inner);
            // SAFETY: we are the _only_ Rc left, and we are being dropped.
            // therefore, after us, there will be no Rc's, and no references to T.
            let _ = unsafe { Box::from_raw(self.inner.as_ptr()) };
        } else {
            // there are other Rcs, so don't drop the Box!
            inner.refcount.set(c - 1);
        }
    }
}

why this one have a may_dangle issue?

@jonhoo
Copy link
Author

jonhoo commented Oct 5, 2020

Hi! It doesn't have a may_dangle issue, it's more that we could add #[may_dangle] to it to make the type nicer to work with. :)

@poppindouble
Copy link

Cool, Cool! Thanks

@tony612
Copy link

tony612 commented Jul 4, 2021

@jonhoo Thanks for your wonderful videos. I have a question about rc: I found I can still use inner and can even set and get inner.value in Rc's Drop, why? I added the example code in // Why? section:
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=e11759fb8fd3d8c21d605e0523dda419

@jonhoo
Copy link
Author

jonhoo commented Jul 8, 2021

Ah, yes, that's because I was being silly — drop(inner) doesn't actually do anything when inner is a reference. It doesn't remove it from scope, even though it kind of reads that way. The better way to safe-guard this code would be to intentionally shadow inner with:

let inner = ();

@tony612
Copy link

tony612 commented Jul 8, 2021

@jonhoo 👍 But what I still don't understand is why I can still set and get refcount after Box::from_raw. I thought the heap memory has been freed, so we can't access it. But I can still use it without panic.

@jonhoo
Copy link
Author

jonhoo commented Jul 9, 2021

Ah, that's because freeing memory just makes it available for future allocations, it doesn't actually make the memory inaccessible. The memory is still there, it's just undefined behavior to access it.

@tony612
Copy link

tony612 commented Jul 10, 2021

Got it. Thanks!

@f3lixding
Copy link

How come for RefCell and Cell Drop is not implemented but it is for Rc?

@jonhoo
Copy link
Author

jonhoo commented Sep 3, 2022

Because neither RefCell nor Cell need to do anything special on Drop — they just have to drop their inner type, which happens automatically when dropping their inner UnsafeCell.

@f3lixding
Copy link

Ahh I see. Thanks!

@1717chen
Copy link

Hi, thanks for the amazing video.

at line https://gist.github.com/jonhoo/7cfdfe581e5108b79c2a4e9fbde38de8#file-rc-rs-L57

let inner = unsafe { self.inner.as_ref() };
    let c = inner.refcount.get();
    if c == 1 {
        drop(inner);
        // SAFETY: we are the _only_ Rc left, and we are being dropped.
        // therefore, after us, there will be no Rc's, and no references to T.
        let _ = unsafe { Box::from_raw(self.inner.as_ptr()) };

for drop(inner);

I am noticing we are passing inner, a reference to RcInner into drop which is the following function:
pub fn drop<T>(_x: T) {}

since drop would take ownership, my question is would rust compiler convert &Rcinnert into RcInner type when we passing inner to drop()

@1717chen
Copy link

another question would be

let inner = Box::new(RcInner {

what if we don't use Box for inner. I understand that it would mean that the pointer would be stored on stack instead of heap. But would like to know more how does it affect our program behaviour.
For example, we should worry if the pointer got clean up after a particular stack exit?

thanks

@jonhoo
Copy link
Author

jonhoo commented Apr 20, 2024

@1717chen No to the first question — that line actually doesn't do anything except document to the reader that we are not allowed to use the reference any more.

To the second question, yes, the problem is exactly making sure the inner value, and any references to it, are cleaned up when the hosting stack frame exits. That is doable with lifetimes, but would require quite a bit of re-structuring I think.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment