@dubrowgn
Last active September 14, 2023 13:54
Move, Copy, Borrow

Move/Copy/Borrow Semantics in Programming

Or, ramblings and complaints about the general state of programming and other possibly related grievances.

There are 3 primary ways to pass data into functions: move, copy, or borrow (aka a reference). Since mutability is inherently intertwined with data passing (this function can borrow my data, but only if it promises not to mess with it), we end up with 6 distinct combinations.

Move, Copy, Borrow, Mutable, Immutable

Every language has its own level of support and take on these semantics:

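| Language | Move | Copy | Borrow | Default (Primitives) | Default (Complex Types) | Immutable Parameters |
|----------|------|------|--------|----------------------|-------------------------|----------------------|
| C        | -    | Y    | -*     | Copy                 | -                       | Y (opt-in via 'const') |
| C#       | -    | Y    | Y      | Copy                 | Mutable Borrow          | -                    |
| Java     | -    | Y    | Y      | Copy                 | Mutable Borrow          | Y (opt-in via 'final') |
| Rust     | Y    | Y    | Y      | Copy                 | Move/Copy**             | Y (opt-out via 'mut') |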

* Technically, C supports borrowing via pointers. However, the actual data (i.e. the address stored in the pointer) is always copied. So, you could argue that C supports indirect borrowing, as opposed to direct borrowing as in Rust.
** In Rust move is the default, but types that implement the Copy trait are copied by default
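
The "the address is always copied" point in the first footnote can be seen in Rust too, since a shared reference is itself a small value that gets copied into the callee. A minimal sketch; `retarget` and `pointer_is_copied` are hypothetical names made up for illustration:

```rust
// Hypothetical helper: `r` is the callee's own copy of the caller's reference.
fn retarget<'a>(mut r: &'a i32, other: &'a i32) -> (i32, i32) {
    let before = *r;
    r = other; // changes only the local copy of the reference
    (before, *r)
}

fn pointer_is_copied() -> (i32, i32, i32) {
    let a = 1;
    let b = 2;
    let r = &a;
    let (before, after) = retarget(r, &b);
    // The caller's `r` still refers to `a`; only the callee's copy was re-pointed.
    (before, after, *r)
}
```

Re-pointing the reference inside the function is invisible to the caller, which is the same thing that happens when a C function reassigns its pointer parameter.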

Who knew things were this complicated? In C, everything is a primitive type. When you want to pass a "reference," you pass the memory address via a pointer. This pointer is really just an integer value, which is copied just like everything else. In most garbage collected languages, like C# or Java, complex types are passed by mutable reference. Rust mixes things up a bit by supporting all 6 semantics, and by changing the default for complex types to move instead of borrow.
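
To make the 6 combinations concrete, here is a rough sketch of how each one can be spelled in a Rust signature. All function names are hypothetical and exist only for illustration:

```rust
// Six parameter-passing combinations, written as ordinary Rust signatures.
fn take(s: String) -> usize { s.len() }                  // immutable move
fn take_mut(mut s: String) -> String { s.push('!'); s }  // mutable move
fn copy_in(n: i32) -> i32 { n + 1 }                      // immutable copy
fn copy_in_mut(mut n: i32) -> i32 { n += 1; n }          // mutable copy
fn peek(s: &String) -> usize { s.len() }                 // immutable borrow
fn poke(s: &mut String) { s.push('?'); }                 // mutable borrow

fn six_ways() -> (usize, String, i32, i32, usize, String) {
    let n = 41;
    let mut kept = String::from("hi");
    let a = take(String::from("abc"));
    let b = take_mut(String::from("abc"));
    let c = copy_in(n);
    let d = copy_in_mut(n); // n is still 41 afterwards; only the copy changed
    let e = peek(&kept);
    poke(&mut kept);
    (a, b, c, d, e, kept)
}
```

Note that nothing in the signatures of `take` and `copy_in` says "move" or "copy"; that is decided by the parameter's type.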

Inconsistency

Interestingly, out of all of these languages, only C is consistent here: all values are always copied when you call a function. Easy. In all the other languages, taking a working function and swapping one parameter type for another is not guaranteed to work as expected. The following are some examples.

C

int example1(int value, char *message) {
	// value is copied
	// message is copied, memory pointed to by message is not
	return value;
}

char* msg = malloc(100);
example1(1, msg);
// example1 is implicitly obligated to not give msg to anyone else (ie borrow)
free(msg);

int example2(int value, char *message) {
	// value is copied
	// message is copied, memory pointed to by message is not
	
	free(message);
	return value;
}

char* msg = malloc(100);
example2(1, msg);
// example2 is implicitly obligated to free msg (ie move)

int example3(int value, const char *message) {
	// value is copied
	// message is copied, memory pointed to by message is not
	
	// message is immutable, however we can remove immutability via cast...
	char *mut_message = (char *)message;
	return value;
}

char* msg = malloc(100);
example3(1, msg);
free(msg);

Upsides:

  1. Perfect consistency, since everything is always copied.
  2. Mutability is explicit in the function signature via 'const'

Downsides:

  1. No move semantics, so the compiler cannot check for trivial memory leaks or double frees
  2. Functions given a const value can cast it back into a non-const value

Observations:

  1. C only supports copy semantics, but optimizing compilers will convert some copies into moves where possible. In these cases the optimized code is generally equivalent to passing a pointer instead.

C#

int example1(int value, string message) {
	// value is copied
	// message is mutable borrowed (passed by reference)
	return value;
}

example1(1, "hello world");

int example2(ref int value, string message) {
	// value is mutable borrowed (passed by reference)
	// message is mutable borrowed (passed by reference)
	return value;
}

example2(ref int_var, "hello world");

Upsides:

  1. Passing primitives by ref is explicit in the function signature

Downsides:

  1. Passing primitives by ref is explicit for the caller as well. Changing the function signature means changing all the callsites.
  2. Complex types can only be passed by reference (they are even called "reference types" in C#)
  3. No immutable borrow semantics
  4. No move semantics

Observations:

  1. Being garbage collected, languages like C# or Java probably don't have much use for move semantics, since the garbage collector is always ultimately responsible for all memory allocated on the heap.

Rust

fn example1(value: i32, message: String) -> i32 {
	// value is copied
	// message is moved
	value
}

example1(1, "hello world".to_string());

fn example2(value: i32, message: SomeTypeThatImplementsCopy) -> i32 {
	// value is copied
	// message is copied
	value
}

example2(1, some_copy_var);

fn example3(value: &i32, message: &mut String) -> i32 {
	// value is immutable borrowed
	// message is mutable borrowed
	*value
}

let val = 1;
let mut msg = "hello world".to_string();
example3(&val, &mut msg);

Upsides:

  1. Support for the main move/copy/borrow semantic cases
  2. Borrowing is explicit in the function signature

Downsides:

  1. Move-by-default or copy-by-default is decided for each type, meaning a function signature cannot tell you which will be used for a given parameter.
  2. If the default move/copy behavior is undesirable in a specific situation, there are two ways to opt-out depending on which direction you want to go (move->copy vs copy->move).
  3. Primitive types cannot be moved. This is probably not a big deal, but it is inconsistent.
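
A small sketch of the first downside: the same generic signature moves one argument and copies another, and the only thing that decides which is a derive on the type. `Point`, `Label`, `consume`, and `signature_hides_it` are hypothetical names used only for this illustration:

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point { x: i32, y: i32 } // implements Copy: passed by copy

#[derive(Debug, Clone, PartialEq)]
struct Label { text: String } // no Copy: passed by move

// Nothing in this signature says whether the argument is moved or copied.
fn consume<T>(value: T) -> T { value }

fn signature_hides_it() -> (Point, Point, Label) {
    let p = Point { x: 1, y: 2 };
    let l = Label { text: String::from("a") };
    let p1 = consume(p);
    let p2 = consume(p); // fine: p was copied, so it is still usable
    let l1 = consume(l);
    // consume(l); // would not compile: l was moved by the call above
    (p1, p2, l1)
}
```

Adding or removing `Copy` from a type's derive list silently changes what every call site passing that type does.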

Observations:

  1. Rust improves on C by supporting move semantics, but the implicit move vs copy makes for a mess. From the function implementation side, it does not matter which is used because the data is owned by the function either way. But, from the caller's perspective, you always have to keep in mind which types implement the Copy trait. It can also make things fail in non-obvious ways:
fn foo<T>(value: T) {
	// ...
}

// this works
let c: char = 'c';
foo(c); // c was copied here
foo(c); // c was copied here

// this does not
let s: String = "c".to_string();
foo(s); // s was moved here
foo(s); // s is no longer in this scope; compilation failure

As someone who gets annoyed by this kind of inconsistency in languages, I have wondered if there is a better way.

Copy is Move in Disguise

What is the difference between copy and move? Once you think about it, you will realize that copy is really just a data copy followed by a move. The only difference, from the caller's perspective, is whether you move the original data or a copy of it. From the function's perspective, there is no difference; the data is owned either way. Which raises the question, why is copy tied up in function signatures in the first place? Wouldn't it be better to have some explicit, language-level support for copy that has nothing to do with functions or their signatures? Rust, for example, goes through a lot of trouble to keep data allocations explicit, but at the end of the day, the main indicator that something will do a copy or a move comes down to how it is named. Why not something simple and explicit, like so?

// sends a complex type to another thread, or inserts into work queue, etc.
fn send(item: move ComplexType) {
	...
}

let original: ComplexType = ComplexType::new();

// create an explicit copy of the item
let clone = copy original;

// send a copy of original
send(clone);
// - or -
send(copy original);

// send the original item
send(original);

This keeps copies explicit, and prevents them from being conflated with function calling.
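
For comparison, something close to this can already be approximated in today's Rust for types that do not implement Copy, with `.clone()` standing in for the `copy` operator. This is only a sketch: `ComplexType`, its `payload` field, and the queue-based `send` are assumptions invented for the example.

```rust
#[derive(Clone, Debug, PartialEq)]
struct ComplexType { payload: Vec<u8> }

// Takes ownership of `item`; whether the caller hands over the original
// or a clone is decided entirely at the call site.
fn send(queue: &mut Vec<ComplexType>, item: ComplexType) {
    queue.push(item);
}

fn send_copy_then_original() -> Vec<ComplexType> {
    let mut queue = Vec::new();
    let original = ComplexType { payload: vec![1, 2, 3] };
    send(&mut queue, original.clone()); // explicit copy, then move the copy
    send(&mut queue, original);         // move the original itself
    queue
}
```

The approximation breaks down for Copy types, where the copy happens implicitly and there is no keyword at the call site to show it.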

What I Actually Want Out of a Language

My big complaints about languages so far are:

  1. Lack of consistency.
  2. Copy is coupled to calling functions.

The obvious solution here seems to be explicit move/borrow semantics with a consistent default, and an explicit copy operator.

  1. Semantically, default is always immutable borrow, regardless of type
  2. Explicitly annotate function parameters for all other cases. Immutable and borrow annotations can be optional.
  3. Explicit annotations not required for callsites
  4. Mutable is implicitly convertible to immutable. Immutable is never convertible to mutable.
  5. Copy is always explicit
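
| Language     | Move | Copy | Borrow | Default (Primitives) | Default (Complex Types) | Immutable Parameters  |
|--------------|------|------|--------|----------------------|-------------------------|-----------------------|
| Hypothetical | Y    | Y*   | Y      | Immutable Borrow     | Immutable Borrow        | Y (opt-out via 'mut') |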

* Copy doesn't really belong in this table for our hypothetical language, because it is now a separate feature altogether.
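
Rule 4 (mutable converts to immutable, never the other way) already has a close analogue in Rust, where a `&mut T` is accepted wherever a `&T` is expected. A minimal sketch with hypothetical names `total` and `mutable_to_immutable`:

```rust
// Only needs read access.
fn total(values: &Vec<i32>) -> i32 {
    values.iter().sum()
}

fn mutable_to_immutable() -> i32 {
    let mut data = vec![1, 2];
    let writer = &mut data;
    writer.push(3);
    // A mutable borrow is implicitly accepted where an immutable one is expected.
    total(writer)
    // The reverse does not compile: a `&Vec<i32>` cannot be passed as `&mut Vec<i32>`.
}
```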

Borrow

fn example1(value: i32, message: String) {
	// same as fn example1(value: ref i32, message: ref String) {
	// value and message are immutable borrowed
}

let mut val = 1; // val is mutable in this scope
let msg = "hello world"; // msg is immutable in this scope
example1(val, msg); // both can be immutable borrowed
example1(val, msg); // multiple times

fn example2(value: mut i32, message: mut String) {
	// same as fn example2(value: mut ref i32, message: mut ref String) {
	// value and message are mutable borrowed
	value += 2;
}

let val = 1;
let msg = "hello world";
example2(val, msg); // compilation failure, both val and msg are immutable

let mut val = 1;
let mut msg = "hello world";
example2(val, msg);
example2(val, msg);
// val is 5
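
For reference, here is roughly what the two borrow examples look like in real Rust today. The function bodies and `borrow_demo` are made up for illustration; the main difference from the hypothetical syntax is that every call site has to repeat `&` or `&mut`:

```rust
fn example1(value: &i32, message: &str) -> String {
    // value and message are immutable borrowed
    format!("{}: {}", value, message)
}

fn example2(value: &mut i32, message: &mut String) {
    // value and message are mutable borrowed
    *value += 2;
    message.push('!');
}

fn borrow_demo() -> (i32, String, String) {
    let mut val = 1;
    let mut msg = String::from("hello world");
    let seen = example1(&val, &msg); // immutable borrow, annotated at the call site
    example2(&mut val, &mut msg);    // mutable borrow, also annotated at the call site
    example2(&mut val, &mut msg);
    (val, msg, seen)
}
```

After the two `example2` calls `val` ends up at 5, matching the hypothetical version.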

Move

fn example3(value: move i32, message: move String) {
	// value and message are immutable moved
}

let mut val = 1;
let msg = "hello world";
example3(val, msg); // val becomes immutable upon move
example3(val, msg); // compilation failure, val and msg have moved into example3

fn example4(value: mut move i32, message: mut move String) {
	// value and message are mutable moved
}

let mut val = 1;
let msg = "hello world";
example4(val, msg); // compilation failure, msg is immutable
example4(val, msg); // compilation failure, val and msg have moved into example4

Copy

fn example5(value: move i32, message: move String) {
	// value and message are immutable moved
}

let val = 1;
let msg = "hello world";
example5(copy val, copy msg);
example5(copy val, copy msg);

fn example6(value: mut move i32, message: mut move String) {
	// value and message are mutable moved (the caller passes copies below)
	value += 2;
}

let mut val = 1;
let msg = "hello world";

// mutability is determined by the alias (message in this case)
// so, this is legal, even though msg is immutable
example6(copy val, copy msg);
example6(copy val, copy msg);
// val is still 1
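
The closest thing in current Rust looks something like the sketch below. The body of `example6` and the `copy_demo` wrapper are assumptions for illustration. Note that the `i32` is copied implicitly while the `String` needs an explicit `.clone()`, which is exactly the inconsistency the `copy` operator is meant to remove:

```rust
fn example6(mut value: i32, mut message: String) -> (i32, String) {
    // value and message are owned and mutable inside the function
    value += 2;
    message.push('!');
    (value, message)
}

fn copy_demo() -> (i32, String, i32, String) {
    let val = 1;
    let msg = String::from("hello world");
    // val is copied implicitly; msg must be cloned explicitly to keep the original
    let (new_val, new_msg) = example6(val, msg.clone());
    // the caller's originals are untouched
    (val, msg, new_val, new_msg)
}
```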