trungnt13/Rust.md


Rust

VSCode and rust-analyzer

rust-analyzer failed to discover workspace
rust-analyzer block build
rust-analyzer CARGO_HOME


Control Flow

If-then-else
Match
Loop


Data Types

Scalar Types
Compound Types
References and Ownership Types

Ownership Types


Arrays, Tuple and Vector
Struct, Enum and Traits
Other Types
Mutable Reference
Automatic Dereferencing


Ownership and Borrowing by Examples

Move Out of an Array
Move and Scope
When to use String vs &str
Lifetime
MARSAW - Multiple Active Readers or Single Active Writer


Operators and Symbols

Pointers operators
Arithmetic operators
Range and Indexing
Other operators


Generic
Closures

Async Closures
Closures with Moves


Error Handling

Result, Option and unwrap

Optional
Result


Concurrent

Tokio Async
Async and Thread


Project, Lib and Module
Cargo Project Layout
Pyo3, Maturin and Extension-Module

Pyo3 Project Structure
Pyo3 Python Type Hint & Stub
Pyo3 Cargo.toml

Pyo3 Root's Cargo.toml
Pyo3 Sub-package Cargo.toml


Pyo3 pyproject.toml
Pyo3 and Rust Project Configuration
Rust Format
Pyo3 serde_json::Value to PyObject


SIMD and Auto Vectorization

Use "reslicing" to make it as obvious as possible to LLVM that your array indexes are in-bounds
For tiny loop bodies, always use internal iteration instead of external iteration
Floating-point is non-associative
make it really obvious what order you want it to do things


VSCode and rust-analyzer

rust-analyzer failed to discover workspace

Too many Cargo.toml files in the workspace, set this:
"rust-analyzer.linkedProjects": [
    "/home/tngo/codes/melia/tools/compiler/ml3x_compiler_accel/Cargo.toml"
]
rust-analyzer block build

"rust-analyzer.check.extraArgs": [
    "--target-dir=target/rust-analyzer"
]
rust-analyzer CARGO_HOME

"rust-analyzer.server.extraEnv": {
    "CARGO_HOME": "/home/tngo/codes/ml3x/insim/build/.cargo"
},

Control Flow

If-then-else

if x > 0 {
    println!("x is positive");
} else if x < 0 {
    println!("x is negative");
} else {
    println!("x is zero");
}
Match

match x {
    1 => println!("one"),
    2 => println!("two"),
    3 => println!("three"),
    _ => println!("anything"),
}
Loop

// Forever loop
loop {
    println!("again!");
}
// labeled loop
'outer: loop {
    'inner: loop {
        break 'outer;
    }
}
// while loop
while number != 0 {
    println!("{}!", number);
    number -= 1;
}
// for loop
for i in 1..4 {
    println!("{}", i);
}

Data Types

Scalar Types


Signed: i8, i16, i32, i64, i128 (100_000 is valid syntax for 100000 in Rust)
Unsigned: u8, u16, u32, u64, u128
Floating: f32, f64
Boolean: bool
Character: char

Compound Types


Arrays (fixed-size, homogeneous data structures): let a: [i32; 3] = [1, 2, 3];
Tuples (fixed-size, heterogeneous data structures): let tup: (i32, f64, char) = (500, 6.4, 'a');

References and Ownership Types


Immutable &T and mutable &mut T references
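str is an unsized type, so it is almost always used behind a reference: let s: &str = "Hello, world!"; A string literal is a string slice pointing at read-only data stored in the program binary.
to_string returns a String (owned, heap-allocated), while as_str returns a &str: the reference itself lives on the stack and, when taken from a String, points into that String's heap buffer.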

Ownership Types


Stored in heap: let owned_string: String = String::from("Hello, Rust!");
Vectors: let v: Vec<i32> = vec![1, 2, 3]; or let mut v: Vec<i32> = Vec::new();
Hash maps: let mut scores: HashMap<String, i32> = HashMap::new();
Hash sets: let mut set: HashSet<i32> = HashSet::new();
Structs: struct User { username: String, email: String }
Enums: enum IpAddr { V4(u8, u8, u8, u8), V6(String) }
Traits: trait Summary { fn summarize(&self) -> String; }
Closures: let expensive_closure = |num| { println!("calculating slowly..."); thread::sleep(Duration::from_secs(2)); num };

Arrays, Tuple and Vector


Tuple (a, b, c): A tuple is a fixed-size, stack-allocated collection of potentially different types. Once a tuple is created, you cannot add or remove elements from it.
Array [a; n]: An array is a fixed-size, stack-allocated collection of elements of the same type. Once an array is created, its size cannot be changed. All elements in an array must be of the same type.
Vector Vec<T>: A vector is a growable, heap-allocated collection of elements of the same type. You can add or remove elements from a vector dynamically. All elements in a vector must be of the same type.

let my_array = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
// Create a slice with a step of 2
let sliced_array: Vec<_> = my_array.iter().skip(1).step_by(2).take(3).collect();
println!("{:?}", sliced_array); // Output: [2, 4, 6]
Struct, Enum and Traits

Self is a keyword used to represent the type of the implementing struct or enum within a trait definition or implementation
trait Example {
    fn generic_method(value: Self) -> Self;
}

enum Color {
    Red,
    Green,
    Blue,
}
// classic struct
struct MyStruct {
    value: i32,
}
// tuple struct
struct RGB(Color, Color, Color);
// unit struct (fieldless)
struct UnitStruct;

impl Example for MyStruct {
    fn generic_method(value: MyStruct) -> MyStruct {
        // Some generic logic
        value
    }
}
Other Types


Option (optional values, Some(T) or None): let x: Option<i32> = Some(42); or let x: Option<i32> = None;
Result (error handling, Ok(T) or Err(E)): let x: Result<i32, String> = Ok(-3); or let x: Result<i32, String> = Err("Some error message".to_string());

Mutable Reference

&mut self syntax is used when defining methods that modify the contents of the instance they are called on.
Passing by reference borrows the value without copying or moving it. Passing by value moves it (or copies it, for types that implement Copy).
Automatic Dereferencing

In Rust, there's no difference in syntax when calling a method on an instance of a type or a reference to an instance. This is because Rust has a feature called automatic referencing and dereferencing.
When you call a method with the . operator, Rust automatically adds in any necessary &, &mut, or * so you can call methods on the value no matter how it's referenced.
However, the method itself may require a certain type of receiver (the type of self). If a method takes self, &self, or &mut self, it's called on a value, a reference, or a mutable reference, respectively.

If the method takes self by value and you only have a reference, the call compiles only when the type is Copy; otherwise you need to clone the value or take ownership of it first. If the method requires a reference and you have a value, Rust will automatically reference the value.


"It [the deref algorithm] will deref as many times as possible (&&String -> &String -> String -> str) and then reference at max once (str -> &str)".

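The receiver rules above can be sketched with a small type. Counter and auto_ref_demo are hypothetical names introduced only for this illustration; the comments show what the compiler is assumed to insert at each call.

```rust
// Hypothetical type used only to illustrate receiver kinds.
struct Counter {
    count: u32,
}

impl Counter {
    fn get(&self) -> u32 {
        self.count
    }
    fn bump(&mut self) {
        self.count += 1;
    }
    fn into_inner(self) -> u32 {
        self.count
    }
}

fn auto_ref_demo() -> u32 {
    let mut c = Counter { count: 0 };
    c.bump(); // auto-ref: same as (&mut c).bump()

    let rr: &&Counter = &&c;
    let seen = rr.get(); // auto-deref: same as (**rr).get()

    let s = String::from("abc");
    let rs: &&String = &&s;
    // &&String -> &String -> String -> str, then str::len(&self)
    let len = rs.len() as u32;

    // By-value receiver: c is moved here and cannot be used afterwards.
    seen + len + c.into_inner()
}
```

The same dot syntax is used for all three receiver kinds; only the last call consumes the value.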

Ownership and Borrowing by Examples

Compiler ensures:

No dangling references (using lifetime)
No double free

Move Out of an Array

In Rust, variables are moved by default when you assign them to another variable or pass them to a function. This means that the original variable can no longer be used after the move.
When you assign x[0] to y, you are moving the Option<String> out of the array. This is not allowed because the size of the array is fixed, and moving an element out would leave a hole.
fn pr(x: [Option<String>; 4]) {
    let y = x[0]; // ERROR
    let y = &x[0]; // OK
    let z = y.unwrap(); // ERROR: cannot move out of y because it's borrowed reference
    let z = (*y).unwrap(); // ERROR: same as above, automatic dereference here
    let z0: String = x[0].clone().unwrap(); // OK
    let z1: &String = x[0].as_ref().unwrap(); // OK
}
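When an element really has to be moved out of the array, one option is to swap a placeholder into its slot. The sketch below uses Option::take and std::mem::take from the standard library; the function names take_first and take_string are made up for this example.

```rust
// Moves the first element out and leaves None in its place,
// so the array never has a "hole".
fn take_first(mut x: [Option<String>; 4]) -> (Option<String>, [Option<String>; 4]) {
    let first = x[0].take();
    (first, x)
}

// Same idea for a plain String: std::mem::take leaves String::default() behind.
fn take_string(x: &mut [String; 2]) -> String {
    std::mem::take(&mut x[0])
}
```

Both helpers keep every slot of the array initialized, which is why the borrow checker accepts them.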
Move and Scope

let x = String::from("Hello, world!"); // x is the owner of the string
let y = x; // x is moved to y, x is no longer valid
println!("{}", x); // Compile error (not a panic): x was moved into y and is no longer valid

{
    let z = y; // y is moved to z
} // z is dropped here; y was already moved into z
println!("{}", y); // Compile error: y was moved into z and is no longer valid

/*==== Move is hard to detect ====*/
let v = vec![1, 2, 3];

fn sum_take_ownership(v: Vec<i32>) -> i32 {
    let mut sum = 0;
    for i in v {
        sum += i;
    }
    sum
    // v is moved to the function, v is no longer valid
}
let s = sum_take_ownership(v);
dbg!(s);
dbg!(v); // Compile error: v was moved into sum_take_ownership

// Alternative: clone before the call, so only the clone is moved
let s = sum_take_ownership(v.clone());
dbg!(s);
dbg!(v); // this works because only the clone was moved

/*==== ownership and Option ====*/
let food = Option::Some(Food::Apple);
let chop = Chopped(food.unwrap());
dbg!(food); // Compile error: food was moved by unwrap() into chop
dbg!(chop);

/*==== function call with borrow to avoid move ====*/
fn sum_ref(v: &Vec<i32>) -> i32 {
    let mut sum = 0;
    for i in v {
        sum += i;
    }
    sum
}
let s = sum_ref(&v); // sum_ref borrow v
dbg!(v);
When to use String vs &str

#[derive(Debug)]
pub struct ApiKeyNotFound {
    target_key: String,
    known_keys: String,
}
ApiKeyNotFound { target_key: "api_key".to_string(), known_keys: "keys".to_string() }; // String ownership is moved to the struct

/// This indicates that the `ApiKeyNotFound` struct has a lifetime `'a`, and the `target_key` and `known_keys` references must live at least as long as `'a`.
#[derive(Debug)]
pub struct ApiKeyNotFound<'a> {
    target_key: &'a str,
    known_keys: &'a str,
}

impl<'a> ApiKeyNotFound<'a> {
    pub fn new(target_key: &'a str, known_keys: &'a str) -> Self {
        ApiKeyNotFound {
            target_key,
            known_keys,
        }
    }
}

/*==== In this case you cannot borrow the str; the struct must own a String ====*/
fn test<'a>() -> Result<(), ApiKeyNotFound<'a>> {
    let target = String::from("local");
    Err(ApiKeyNotFound::new("missing", target.as_str())) // Compile error: cannot return a value referencing the local variable `target`
}

String: This is an owned string type. If you want the struct to own the data (i.e., you want to store the string data directly in the struct), you should use String. This means that the struct will be responsible for deallocating the string data when it is no longer needed. This is generally easier to work with because you don't have to worry about lifetimes, but it can cause more memory allocation/deallocation.
&str: This is a borrowed string type. If you want the struct to borrow the data (i.e., you want to store a reference to string data that is owned by something else), you should use &str. This means that the struct will not be responsible for deallocating the string data. This can be more efficient because it avoids unnecessary memory allocation/deallocation, but it can be harder to work with because you have to ensure that the string data outlives the struct (i.e., the string data is not deallocated while the struct still exists).
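As a rough illustration of the owned variant, the sketch below builds the error from borrowed inputs and hands ownership to the caller, so no lifetime parameter is needed. ApiKeyNotFoundOwned and lookup are hypothetical names chosen for this example and are not part of the original code.

```rust
// Owned variant: the error carries its own copies of the strings.
#[derive(Debug, PartialEq)]
pub struct ApiKeyNotFoundOwned {
    pub target_key: String,
    pub known_keys: String,
}

// Returns the index of `target` in `known`, or an owned error describing the miss.
pub fn lookup(target: &str, known: &[&str]) -> Result<usize, ApiKeyNotFoundOwned> {
    known
        .iter()
        .position(|k| *k == target)
        .ok_or_else(|| ApiKeyNotFoundOwned {
            target_key: target.to_string(), // allocate only on the error path
            known_keys: known.join(", "),
        })
}
```

Because the error owns its data, it can be returned even when the inputs were local to the caller of lookup.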

If target_key and known_keys are relatively short and the struct is not created in a hot path, String is more convenient and the performance impact is negligible.
If they are very large or the struct is created very often, consider &str to avoid repeated memory allocation and deallocation. That requires a lifetime parameter on the struct, as in the ApiKeyNotFound<'a> definition above.
Lifetime

The scope within which a borrowed reference is valid. The aim of lifetimes is to prevent dangling references, which cause a program to reference data that has been deallocated.
let alice = "Alice";
let bob = "Bob";
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}
let result = longest(alice, bob);
println!("The longest string is {}", result);

// this also work
fn longest<'a>(x: &'a str, _y: &str) -> &'a str {
    x
}

// Struct with reference need lifetime annotation
struct Person<'a> {
    name: &'a str
}
In the first longest, the function returns a borrowed value, but the compiler cannot tell whether it is borrowed from x or from y. The annotation 'a ties the result to both inputs, so the result is only valid while both are.
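A small sketch of how the shared lifetime plays out when the two inputs live in different scopes. longest is repeated from the example above so the block is self-contained; longest_demo is a hypothetical helper name.

```rust
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

fn longest_demo() -> String {
    let outer = String::from("a long string");
    let result;
    {
        let inner = String::from("short");
        // The returned reference is only valid while both `outer` and `inner`
        // are alive, so it is copied into an owned String before `inner` is dropped.
        result = longest(outer.as_str(), inner.as_str()).to_string();
    }
    result
}
```

Without the to_string() call, keeping the returned &str past the inner block would be rejected by the compiler, because 'a cannot outlive inner.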
MARSAW - Multiple Active Readers or Single Active Writer

Mutability constrains your ability to borrow references; only one of these two kinds of borrows can be active at a time:

one or more references (&T) to a resource,
exactly one mutable reference (&mut T).

let mut writer = vec![1, 2, 3];
let reader = &writer;
writer.push(4); // OK: reader is never used again, so its borrow has already ended

/*==== but this does not work ====*/
let mut writer = vec![1, 2, 3];
let reader = &writer;
writer.push(4);
println!("{:?}", reader); // Compile error on the push above: reader is still an active immutable borrow

/*==== implicit borrow as immutable (active reader )====*/
let mut writer = vec![1, 2, 3];
for i in writer.iter() { // borrow as immutable
    writer.push(i * 2); // error: borrow as mutable
}
Solutions for MARSAW:
/*==== Reorganize code ====*/
let mut writer = vec![1, 2, 3];
{
    let item = writer.last();
}
writer.push(4); // this work because the borrow is no longer valid
/*==== clone ====*/
let item = writer.last().cloned(); // Option<i32>: an owned copy, so no borrow is kept
writer.push(4); // this works because item no longer borrows writer
/*==== interior mutability ====*/
use std::cell::RefCell;
let data = RefCell::new(vec![1, 2, 3]);
{
    let im_ref = data.borrow(); // borrow immutably
}
let mut mut_ref = data.borrow_mut(); // borrow mutably
for i in mut_ref.iter_mut() {
    *i += 1;
}
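For the iterate-and-push case shown earlier, one way to reorganize the code is to finish reading before writing. This is a minimal sketch; double_and_append is a hypothetical name.

```rust
// Collect the new items first (immutable borrow ends at the `;`),
// then append them (mutable borrow).
fn double_and_append(v: &mut Vec<i32>) {
    let doubled: Vec<i32> = v.iter().map(|i| i * 2).collect();
    v.extend(doubled);
}
```

The temporary Vec costs one extra allocation, but the reader and the writer are never active at the same time.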

Operators and Symbols

Pointers operators


Address-of/Reference operator: &
Dereference operator: *

Arithmetic operators


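Arithmetic: +, -, *, /, % (remainder)
Logical: && - and, || - or, ! - not, (a || b) && !(a && b) - xor
Bitwise: & - and, | - or, ^ - xor, ! - not, << - left shift, >> - right shift
Field access: . - access a field or method of a struct, also through a reference (Rust auto-dereferences, so there is no -> field operator; -> only marks a function's return type)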

Range and Indexing


Range: .. - exclusive, ..= - inclusive (i.e. 1..=5 is equivalent to 1, 2, 3, 4, 5)

The real type is: std::ops::Range and std::ops::RangeInclusive


Array indexing: x[0] - access element of an array

Slicing: &x[1..3] - access a slice of an array. Reference is needed because slicing does not copy the data, it creates a view or reference to the original array.


Iterator:

The iterator returned by into_iter may yield any of T, &T or &mut T, depending on the context.
The iterator returned by iter will yield &T, by convention.
The iterator returned by iter_mut will yield &mut T, by convention.

String slicing:
let s = "Hello, world!";
let slice = &s[0..5]; // Slice from byte index 0 to 5
println!("Slice: {}", slice); // Prints: Slice: Hello

let s = "Здравствуйте"; // "Hello" in Russian
let slice = &s[0..1]; // This will panic! Byte index 1 is not a char boundary (each of these Cyrillic chars takes 2 bytes in UTF-8)
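To slice by characters instead of bytes without risking a panic, the byte offset can be looked up with char_indices, or the slice can be requested with str::get, which returns None on an invalid boundary. first_chars is a hypothetical helper written for this sketch.

```rust
// Returns the prefix of `s` containing at most `n` chars.
fn first_chars(s: &str, n: usize) -> &str {
    match s.char_indices().nth(n) {
        Some((byte_idx, _)) => &s[..byte_idx], // byte_idx is always a char boundary
        None => s,                             // fewer than n + 1 chars: return everything
    }
}
```

Note that this counts Unicode scalar values (char), not user-perceived grapheme clusters.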
Other operators


Lambda: |x| x + 1 or |x, y| { x + y }
Error propagation: ? - return the error early to the caller (up the call stack)
Ignored values: _
Macro expansion: ! - println! is a macro, not a function
Lifetime: 'a - a named lifetime parameter for references

let r: &'a i32 = &x; - r is a reference that is valid for the lifetime 'a


Generic

When type parameters cannot be inferred, specify them explicitly with the "turbofish" syntax ::<>, e.g. Type::<TypeParameter>::function() or function::<TypeParameter>().
fn identity<T>(value: T) -> T {
    value
}
struct Wrapper<T> {
    value: T,
}
let result: i32 = identity(42);
let wrapper: Wrapper<i32> = Wrapper { value: 42 };
let vector = Vec::<i32>::new();
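A few common places where the turbofish is needed because nothing else pins down the type. turbofish_demo is a hypothetical function name; the calls themselves are from the standard library.

```rust
fn turbofish_demo() -> (i32, Vec<u8>, usize) {
    // Method turbofish: tells parse which FromStr implementation to use.
    let n = "42".parse::<i32>().unwrap();
    // collect can build many container types, so the target is spelled out.
    let v = (1..=3).collect::<Vec<u8>>();
    // Function turbofish on a function with no value arguments.
    let size = std::mem::size_of::<u64>();
    (n, v, size)
}
```

A type annotation on the binding (let n: i32 = "42".parse().unwrap();) is an equivalent alternative in the first two cases.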

Closures

The generic parameter <F> is needed because every closure has its own anonymous type; any closure or function that satisfies the Fn trait bound in the where clause can be passed to the function.
fn apply_operation<F>(x: i32, y: i32, operation: F) -> i32
where
    F: Fn(i32, i32) -> i32,
{
    operation(x, y)
}

let add_closure = |a, b| a + b;
let result = apply_operation(3, 5, add_closure);
Three types of closures:

FnOnce(self): may consume (move out of) the variables it captures from its enclosing scope, known as the closure's environment, so it is only guaranteed to be callable once.
FnMut(&mut self): can change the environment because it mutably borrows values.
Fn(&self): borrows values from the environment immutably.

The move keyword makes the closure take ownership of the variables it captures (a move, or a copy for Copy types such as i32), even if borrowing would be enough.
let x = 42;
// Closure without `move` (borrows x)
let borrow_closure = || {
    println!("borrowed: {}", x);
};
// Closure with `move` (takes ownership of x; x is Copy, so the closure gets its own copy)
let move_closure = move || {
    println!("moved: {}", x);
};
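The three closure traits can be compared side by side by passing closures to functions with different bounds. The helper names (call_fn, call_fn_mut, call_fn_once, closure_kinds_demo) are invented for this sketch.

```rust
fn call_fn<F: Fn() -> usize>(f: F) -> usize {
    f() + f() // Fn: can be called any number of times through &self
}

fn call_fn_mut<F: FnMut()>(mut f: F) {
    f(); // FnMut: each call may mutate the captured state
    f();
}

fn call_fn_once<F: FnOnce() -> String>(f: F) -> String {
    f() // FnOnce: calling consumes the closure
}

fn closure_kinds_demo() -> (usize, i32, String) {
    let name = String::from("rust");
    let len_twice = call_fn(|| name.len()); // borrows `name` immutably

    let mut counter = 0;
    call_fn_mut(|| counter += 1); // borrows `counter` mutably

    let owned = call_fn_once(move || name); // moves `name` out; callable only once
    (len_twice, counter, owned)
}
```

Every closure implements FnOnce; those that do not move captured values out also implement FnMut, and those that do not mutate their captures also implement Fn.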
Async Closures

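The async block is returned from the closure and may outlive the call, so it cannot borrow the parameter a; async move moves a into the returned future.
let closure = |a: u64| async move {
    for i in 1..10 {
        println!("Hi number {} from the closure (a = {})!", i, a);
        // NOTE: thread::sleep blocks the executor thread; in real async code prefer an async sleep such as tokio::time::sleep
        thread::sleep(Duration::from_millis(1));
    }
};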
Closures with Moves

When a closure captures variables from its enclosing scope, Rust picks the least restrictive capture mode the body needs: an immutable borrow, then a mutable borrow, then a move. A closure that modifies a captured variable borrows it mutably, so both the variable and the closure binding must be declared mut.
// Example of FnMut (x is borrowed mutably)
let mut x = 7;
let mut add_two = |y| x += y;
add_two(5);
add_two(3);
println!("{}", x); // prints 15; x can be read again once add_two is no longer used

// Example with move (x is Copy, so the closure mutates its own copy)
let mut x = 7;
let mut add_two = move |y| x += y;
add_two(5);
add_two(3); // still allowed: this closure is FnMut, not FnOnce
println!("{}", x); // prints 7, the original x is unchanged
// A closure is FnOnce only if it moves a captured value out when called, e.g. move || drop(s) for a String s

Error Handling


panic! macro: terminates the current thread (and usually the program) immediately; for unrecoverable errors.
Result type: Ok(T) or Err(E), recoverable.

Result, Option and unwrap

Optional

let x: Option<i32> = Some(5);
let y: Option<i32> = None;
let sum = x.unwrap() + y.unwrap(); // This will panic! Because y is None
let sum = x.unwrap() + y.unwrap_or(1); // this return 6
let sum = x.unwrap() + y.unwrap_or_else(|| 2); // this return 7
let sum = x.unwrap() + y.unwrap_or_default(); // this return 5
let sum = x.unwrap() + y.expect("y is None"); // this will panic! with message "y is None"
// using map
let sum = x.map(|v| v + 1).unwrap(); // this returns 6; it would panic only if x were None
// using match
let z = match x {
    Some(v) => v, // `v` is the value inside `Some`, this automatically unwrap the value
    None => 0,
};
// using and_then
let sum = x.and_then(|v: i32| Some(v + 1)); // return Some(6)
let sum1 = x.map(|v: i32| v + 1); // return Some(6)
let sum2 = x.map(|v: i32| Some(v + 1)).flatten(); // return Some(6), without flatten it return Some(Some(6))
// when the closure itself returns an Option, map produces a nested Option while and_then produces a flattened one
#[derive(Debug)] enum Food {Apple, Carrot, Potato}
#[derive(Debug)] struct Chopped(Food);
#[derive(Debug)] struct Cooked(Food);
let food = Option::Some(Food::Apple);
let cook = food.map(|f| Chopped(f)).map(|Chopped(f)| Cooked(f));
Result

let x: Result<i32, &str> = Ok(5);
let y: Result<i32, &str> = Err("error");
dbg!(x.unwrap()); // this return 5
dbg!(y.unwrap()); // this will panic! with message "error"

fn multiply(x: &str, y: &str) -> Result<i32, std::num::ParseIntError> {
    let x: i32 = x.parse()?;
    let y: i32 = y.parse()?;
    Ok(x * y)
}

// Equivalent version, written with combinators instead of `?`
fn multiply(x: &str, y: &str) -> Result<i32, std::num::ParseIntError> {
    x.parse::<i32>().and_then(|x| y.parse::<i32>().map(|y| x * y))
}

type IoResult<T> = std::result::Result<T, std::io::Error>;
Multiple Error Type:
// Pulling results out of Options, map_or means (if-None, if-Some)
let x: Option<Result<String, &str>> = Some(Ok("hello".to_string()));
let y = x.map_or(Ok(None), |r| r.map(Some)); // NOTE: x is moved to y

// DoubleError is assumed to be a custom error type implementing std::error::Error
let x: Option<String> = Some("hello".to_string());
let r: Result<String, Box<dyn Error>> = x.ok_or_else(|| DoubleError.into()); // ok_or_else: Option -> Result

let x: Result<String, &str> = Ok("hello".to_string());
let r: Result<String, Box<dyn Error>> = x.map_err(|_e| DoubleError.into()); // map_err: converts the error type of a Result

Define a custom Error type
Boxing Error: Box<dyn Error>, use alias type Result<T> = std::result::Result<T, Box<dyn Error>>

use DoubleError.into() to convert DoubleError to Box<dyn Error>
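The two notes above can be put together in a short sketch: a custom error type plus a boxed-error alias, with ? doing the conversion into Box<dyn Error>. The alias is named BoxResult here (an assumption of this example) to avoid shadowing the prelude Result, and double_first is a hypothetical function.

```rust
use std::error::Error;
use std::fmt;

// Alias for results whose error can be any type implementing Error.
type BoxResult<T> = std::result::Result<T, Box<dyn Error>>;

// Custom error type: Debug + Display + Error are all that is required.
#[derive(Debug)]
struct DoubleError;

impl fmt::Display for DoubleError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "no first item to double")
    }
}

impl Error for DoubleError {}

fn double_first(items: &[&str]) -> BoxResult<i32> {
    // `?` converts DoubleError into Box<dyn Error> via From.
    let first = items.first().ok_or(DoubleError)?;
    // `?` also converts ParseIntError, so two error types share one return type.
    let n: i32 = first.parse()?;
    Ok(2 * n)
}
```

Boxing erases the concrete error type, which keeps signatures short; callers that need to branch on the error kind would have to downcast or use an enum instead.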


Concurrent

Tokio Async

Must be called in an async context (i.e. async fn main).
let handle = tokio::task::spawn_blocking(|| {
    // some potentially blocking operation, offload onto a separate thread pool
});

let handle = tokio::spawn(async {
    // some async code, schedule on an async executor
});
Can be called in a sync context.
let rt = tokio::runtime::Runtime::new().unwrap();
rt.block_on(async {
    // some async code, schedule on an async executor
});
Async and Thread

Using Async and Thread side-by-side in a sync function.
fn main_sync() {
    println!("Hello, world!");

    let rt = tokio::runtime::Runtime::new().unwrap();
    rt.block_on(async {
        println!("block_on");

        join!(test0(), test1());
    });

    println!("Main thread {:?}", thread::current().id());
    let data = Arc::new(Mutex::new(0));
    let mut handles = vec![];
    for _ in 0..10 {
        let data = Arc::clone(&data);
        let handle = thread::spawn(move || {
            println!("thread {:?}", thread::current().id());
            let mut data = data.lock().unwrap();
            *data += 1;
        });
        handles.push(handle);
    }

    for handle in handles {
        handle.join().unwrap();
    }
    println!("Result: {}", *data.lock().unwrap());
}
Using Async and Thread side-by-side in an async function.
async fn main_async() {
    // Start an async task using tokio::spawn
    let closure = |a: u64| async move {
        for i in 1..a {
            println!("Hi number {} from the closure!", i);
            thread::sleep(Duration::from_millis(1));
        }
    };
    let async_handle = task::spawn(closure(5));

    // Start a new thread using std::thread::spawn
    let thread_handle = thread::spawn(|| {
        for i in 1..5 {
            println!("Hi number {} from the new thread!", i);
            thread::sleep(Duration::from_millis(1));
        }
    });

    // Wait for both tasks to complete
    async_handle.await.unwrap();
    thread_handle.join().unwrap();
}

Project, Lib and Module

Project template:
my_project
├── Cargo.toml
├── src
│   ├── main.rs
│   ├── lib.rs
│   ├── mod0.rs
│   └── mod1.rs
└── tests
    └── lib.rs

Define Modules: in lib.rs
mod mod0;
mod mod1;
Modules Codes: in mod0.rs and mod1.rs
// mod0.rs
pub fn mod0_func() {
    println!("mod0_func");
}
// mod1.rs
pub fn mod1_func() {
    println!("mod1_func");
}
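How the module functions are reached from other code can be sketched in a single file by writing the modules inline; with separate files, the mod mod0; declarations in lib.rs play the same role as the inline blocks here. The string return values and call_both are changes made only so this example is easy to check.

```rust
mod mod0 {
    // `pub` makes the function visible outside the module.
    pub fn mod0_func() -> &'static str {
        "mod0_func"
    }
}

mod mod1 {
    pub fn mod1_func() -> &'static str {
        "mod1_func"
    }
}

// Bring one function into scope; the other is called by its full path.
use mod0::mod0_func;

fn call_both() -> String {
    format!("{} {}", mod0_func(), mod1::mod1_func())
}
```

If main.rs or an integration test in tests/ needs these functions, the modules themselves would also have to be declared pub mod in lib.rs.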
Cargo Project Layout

https://doc.rust-lang.org/cargo/guide/project-layout.html
├── Cargo.lock
├── Cargo.toml
├── src/
│   ├── lib.rs
│   ├── main.rs
│   └── bin/
│       ├── named-executable.rs
│       ├── another-executable.rs
│       └── multi-file-executable/
│           ├── main.rs
│           └── some_module.rs
├── benches/
│   ├── large-input.rs
│   └── multi-file-bench/
│       ├── main.rs
│       └── bench_module.rs
├── examples/
│   ├── simple.rs
│   └── multi-file-example/
│       ├── main.rs
│       └── ex_module.rs
└── tests/
    ├── some-integration-tests.rs
    └── multi-file-test/
        ├── main.rs
        └── test_module.rs


Pyo3, Maturin and Extension-Module


Maturin create new project: maturin new [name] --name=[name] --bindings=pyo3 --mixed
build-essential and gcc are needed for cargo test and for building the rlib

Pyo3 Project Structure

root
├── Cargo.toml
├── pyrust-lib
│   ├── Cargo.toml
│   ├── src
│   │   └── lib.rs
│   ├── bin
│   │   └── main.rs
│   ├── tests
│   │   └── test.rs
│   ├── examples
│   │   └── example.rs
├── pyrust-rs
│   ├── Cargo.toml
│   ├── src
│   │   └── main.rs
└── pyrust-py
    ├── Cargo.toml
    ├── pyproject.toml
    ├── .rustfmt.toml
    ├── src/lib.rs
    ├── python/pyrust_py/__init__.py /py.typed /pyrust_py.pyi
    └── tests/test_integration.rs

Pyo3 Python Type Hint & Stub


Create an empty py.typed marker file so type checkers know the package ships type information (PEP 561)
Create a pyrust_py.pyi stub file to declare the Python interface of the Rust module

For Rust only project:
pyrust
    ├── Cargo.toml
    ├── pyproject.toml
    ├── .rustfmt.toml
    ├── src/lib.rs
    ├── pyrust.pyi

For mixed Python and Rust project:
pyrust
    ├── Cargo.toml
    ├── pyproject.toml
    ├── src/lib.rs
    └── python/pyrust_py/__init__.py /py.typed /pyrust.pyi

Example of .pyi file:
from typing import final
__all__ = ["sum_as_string"]
@final
def sum_as_string(a: int, b: int) -> str:
    """Sum two integers (add something) and return the result as a string."""
    ...
Pyo3 Cargo.toml

Pyo3 Root's Cargo.toml

[workspace]
members = [ "pyrust-lib", "pyrust-rs", "pyrust-py" ]
resolver = "2"
Pyo3 Sub-package Cargo.toml

[package]
name = "pyrust-py"
version = "0.1.0"
edition = "2021"
include = ["src", "python/pyrust_py", "pyproject.toml", "README.md", "!*.so"]

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[lib]
name = "pyrust_py"
# "cdylib" is necessary to produce a shared library for Python to import from.
# Downstream Rust code (including code in `bin/`, `examples/`, and `tests/`) will not be able
# to `use string_sum;` unless the "rlib" or "lib" crate type is also included, e.g.:
crate-type = ["cdylib", "rlib"]

[dependencies]
pyo3 = { version = "0.20.2" }
pyrust-lib = { path = "../pyrust-lib" }

[dev-dependencies]
pyo3 = { version = "0.20.2", features = ["auto-initialize"] }

[build-dependencies]
version_check = "0.9.4"

[features]
extension-module = ["pyo3/extension-module"]
# default = ["extension-module"]
Pyo3 pyproject.toml

[build-system]
requires = ["maturin>=1.4,<2.0"]
build-backend = "maturin"

[project]
name = "pyrust-py"
requires-python = ">=3.11"
classifiers = [
    "Programming Language :: Rust",
    "Programming Language :: Python :: Implementation :: CPython",
    "Programming Language :: Python :: Implementation :: PyPy",
]
dynamic = ["version"] # maturin fills in the version from Cargo.toml

# pip install -e .[tests]
[project.optional-dependencies]
tests = ["pytest"]

[tool.maturin]
python-source = "python"
# module-name = "pyrust_py._pyrust_py"
bindings = "pyo3"
features = ["pyo3/extension-module"]
NOTE: module-name must contain both the Python package name and the Rust lib name, separated by a dot (e.g. pyrust_py._pyrust_py means import pyrust_py for the Python package and import ._pyrust_py for the Rust lib).
If the Python module name is the same as the Rust lib name, module-name can be omitted. Otherwise, it must be defined.
Pyo3 and Rust Project Configuration

Hierarchical structure:

/projects/foo/bar/.cargo/config.toml
/projects/foo/.cargo/config.toml
/projects/.cargo/config.toml
$CARGO_HOME/config.toml

paths = ["/projects/foo/target", "/projects/target"] # dependency overrides

[alias] # command aliases
t = "test -- --nocapture"
rr = "run --release"

[build]
jobs = 1
rustflags = ["-C", "target-cpu=native"] #compiler flags

[profile.dev] # cargo build

[profile.test] # cargo test

[profile.bench] # cargo bench

[profile.doc] # cargo doc

[profile.release] # cargo build --release
opt-level = 3
lto = "fat" # link-time optimization: "off", "thin" - part, "fat" - whole
codegen-units = 1 # When you set `codegen-units = 1`, it means that the compiler will generate the machine code for your program in a single unit. This can potentially lead to more optimized code because the compiler has full visibility into all parts of your program when performing optimizations.

# Profile overrides
[profile.dev.package."*"]
opt-level = 1
incremental = true # Enable incremental compilation
codegen-units = 4 # Set the number of codegen units for parallel compilation

[doc]
browser = "chromium"

[env]
RUST_BACKTRACE = "full" # full, minimal, 1
RUST_TEST_THREADS = "1"
PYO3_PRINT_DEBUG = "1"
CARGO_LOG = "info" # trace, debug, info, warn, error
Rust Format

# .rustfmt.toml
max_width = 100
tab_spaces = 4
hard_tabs = false
newline_style = "auto" # auto, native, unix, windows
use_small_heuristics = "Default" # Max,Off, Default, a positive integer
edition = "2021"
Pyo3 serde_json::Value to PyObject

use std::collections::HashMap;

use pyo3::prelude::*;
use serde_json::Value;

fn value_to_object( val: &Value, py: Python<'_> ) -> PyObject {
    match val {
        Value::Null => py.None(),
        Value::Bool( x ) => x.to_object( py ),
        Value::Number( x ) => {
            let oi64 = x.as_i64().map( |i| i.to_object( py ) );
            let ou64 = x.as_u64().map( |i| i.to_object( py ) );
            let of64 = x.as_f64().map( |i| i.to_object( py ) );
            oi64.or( ou64 ).or( of64 ).expect( "number too large" )
        },
        Value::String( x ) => x.to_object( py ),
        Value::Array( x ) => {
            let inner: Vec<_> = x.iter().map(|x| value_to_object(x, py)).collect();
            inner.to_object( py )
        },
        Value::Object( x ) => {
            let inner: HashMap<_, _> =
                x.iter()
                    .map( |( k, v )| ( k, value_to_object( v, py ) ) ).collect();
            inner.to_object( py )
        },
    }
}

#[repr(transparent)]
#[derive( Clone, Debug )]
struct ParsedValue( Value );

impl ToPyObject for ParsedValue {
    fn to_object( &self, py: Python<'_> ) -> PyObject {
        value_to_object( &self.0, py )
    }
}

#[pyfunction]
pub fn parse() -> PyResult<PyObject> {
    let mapping: HashMap<i64, HashMap<String, ParsedValue>> = HashMap::from( [
        ( 1, HashMap::from( [
            ( "test11".to_string(), ParsedValue( "Foo".into() ) ),
            ( "test12".to_string(), ParsedValue( 123.into() ) ),
        ] ) ),
        ( 2, HashMap::from( [
            ( "test21".to_string(), ParsedValue( "Bar".into() ) ),
            ( "test22".to_string(), ParsedValue( 123.45.into() ) ),
        ] ) ),
    ] );

    Ok( pyo3::Python::with_gil( |py| {
        mapping.to_object( py )
    } ) )
}

#[pymodule]
fn parser( _py: Python, m: &PyModule ) -> PyResult<()> {
    m.add_function( wrap_pyfunction!( parse, m )? )?;

    Ok( () )
}

SIMD and Auto Vectorization

Rust Lulz: Godbolt assembly exploring without crate limitations, in Visual Studio Code
https://saveriomiroddi.github.io/Rust-lulz-godbolt-assembly-exploring-without-crate-limitations-in-visual-studio-code/

LLVM (and GCC) don't know how to auto-vectorize loops whose trip-count can't be calculated up front. This rules out search loops that exit early as soon as a match is found.
Probably your only hope would be to manually loop over 2, 4, or 8-element chunks of the arrays, branchlessly calculating your condition based on all those elements. If you're lucky, LLVM might turn that into operations on one SIMD vector. So using that inner loop inside a larger loop could result in getting the compiler to make vectorized asm, for example using AVX vptest (which sets CF according to bitwise a AND (not b) having any non-zero bits). In other words, manually express the "unrolling" of SIMD elements in your source, for a specific vector width.

Use "reslicing" to make it as obvious as possible to LLVM that your array indexes are in-bounds

pub fn demo_slow(x: &[i32], y: &[i32], z: &mut [i32]) {
    for i in 0..z.len() {
        z[i] = x[i] * y[i];
    }
}

pub fn demo_fast(x: &[i32], y: &[i32], z: &mut [i32]) {
    let n = z.len();
    let (x, y, z) = (&x[..n], &y[..n], &mut z[..n]); // NOTE: reslicing
    for i in 0..z.len() {
        z[i] = x[i] * y[i];
    }
}
for i in 0..n loops are generally the slowest in Rust. If you can't use iterators (which eliminate bounds checks), then the trick is to hint LLVM that the slices are large enough: let x = &x[0..n];, but that very much depends on whether LLVM will figure out that the range of the slice matches range used in the for loop.
rustflags = ["-C", "target-cpu=native", "-C", "llvm-args=-ffast-math", "-C", "opt-level=3", "-C", "llvm-args=-force-vector-width=16"]
For tiny loop bodies, always use internal iteration instead of external iteration

It uses the shortest length between x, y, and z
pub fn demo_iter(x: &[i32], y: &[i32], z: &mut [i32]) {
    let products = std::iter::zip(x, y).map(|(&x, &y)| x * y);
    std::iter::zip(z, products).for_each(|(z, p)| *z = p);
}
Floating-point is non-associative

// This vectorizes great
pub fn dot_int(x: &[i32], y: &[i32]) -> i32 {
    std::iter::zip(x, y).map(|(&x, &y)| x * y).sum()
}
// But this only unrolls, without vectorizing
pub fn dot_float(x: &[f32], y: &[f32]) -> f32 {
    std::iter::zip(x, y).map(|(&x, &y)| x * y).sum()
}
make it really obvious what order you want it to do things

pub fn dot<const N: usize>(x: &[f32; N], y: &[f32; N]) -> f32 {
    let (x, x_tail) = x.as_chunks::<4>();
    let (y, y_tail) = y.as_chunks::<4>();

    assert!(x_tail.is_empty() && y_tail.is_empty(), "N must be a multiple of 4");

    let mut sums = [0.0; 4];
    for (x, y) in std::iter::zip(x, y) {
        let [x0, x1, x2, x3] = *x;
        let [y0, y1, y2, y3] = *y;
        let [p0, p1, p2, p3] = [x0 * y0, x1 * y1, x2 * y2, x3 * y3];
        sums[0] += p0;
        sums[1] += p1;
        sums[2] += p2;
        sums[3] += p3;
    }
    
    (sums[0] + sums[1]) + (sums[2] + sums[3])
}
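The same fixed-order accumulation can be sketched for plain slices whose length is not a multiple of 4, using chunks_exact and handling the leftover elements separately. dot_chunked is a hypothetical name, and whether LLVM actually vectorizes it depends on the target and compiler version, so check the generated assembly.

```rust
pub fn dot_chunked(x: &[f32], y: &[f32]) -> f32 {
    // Reslice so both inputs have the same, known length.
    let n = x.len().min(y.len());
    let (x, y) = (&x[..n], &y[..n]);

    let xc = x.chunks_exact(4);
    let yc = y.chunks_exact(4);
    // Elements that do not fill a whole chunk of 4.
    let (x_tail, y_tail) = (xc.remainder(), yc.remainder());

    // Four independent accumulators: the summation order is explicit,
    // so the compiler does not need to reassociate floating-point adds.
    let mut sums = [0.0f32; 4];
    for (a, b) in xc.zip(yc) {
        for lane in 0..4 {
            sums[lane] += a[lane] * b[lane];
        }
    }

    let tail: f32 = x_tail.iter().zip(y_tail).map(|(a, b)| a * b).sum();
    (sums[0] + sums[1]) + (sums[2] + sums[3]) + tail
}
```

Because the additions happen in a different order than a plain sequential sum, the result can differ in the last bits for general inputs.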