Arvid Gerstmann Leandros

## custom_game_engines_small_study.md

      
raysan5 / custom_game_engines_small_study.md (last active July 2, 2024 07:04; 1 file, 60 forks, 142 comments, 1309 stars)

A small state-of-the-art study on custom engines

CUSTOM GAME ENGINES: A Small Study


A couple of weeks ago I played (and finished) A Plague Tale, a game by Asobo Studio. I was really captivated by the game, not only by the beautiful graphics but also by the story and the locations. I decided to investigate the game's tech a bit and was surprised to see it was developed with a custom engine by a relatively small studio. I know some companies use custom engines, but it's very difficult to find a detailed market study with that kind of information curated and up to date. Hence this article.
Nowadays lots of companies choose engines like Unreal or Unity for their games (or that's what a lot of people think) because d

  
## cache-counters-rant.md

      
travisdowns / cache-counters-rant.md (created October 13, 2019 16:46; 1 file, 1 fork, 0 comments, 20 stars)

Discussion of x86 L1D related cache counters

The counters that are the easiest to understand and the best for making ratios that are internally consistent (i.e., always fall in the range 0.0 to 1.0) are the mem_load_retired events, e.g., mem_load_retired.l1_hit and mem_load_retired.l1_miss.
These count at the instruction level, i.e., over the universe of retired instructions. For example, you could make a reasonable hit ratio from mem_load_retired.l1_hit / mem_inst_retired.all_loads and it will be sane (it will never indicate a hit rate above 100%, say).
That one isn't perfect though, in that it may not reflect the true costs of cache misses and the behavior of the program for at least the following reasons:

- It applies only to loads and can't catch misses imposed by stores (AFAICT there is no event that counts store misses).
- It only counts loads that retire: a lot of the load activity in your process may be due to loads on a speculative path that never retire. Loads on a speculative path may bring in data that is never used, causing misses and d
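
The hit ratio described above is plain arithmetic on two counter readings. A minimal sketch in C, assuming the two values have already been read out of a profiler by some other means; the `LoadCounters` struct and `l1_hit_ratio` function are hypothetical names for illustration, not part of any real tool's API:

```c
#include <stdint.h>

/* Hypothetical container for two raw counter readings taken over the
 * same measurement interval. How they are collected is out of scope. */
typedef struct {
    uint64_t l1_hit;    /* reading of mem_load_retired.l1_hit */
    uint64_t all_loads; /* reading of mem_inst_retired.all_loads */
} LoadCounters;

/* L1 hit ratio = l1_hit / all_loads. Both events count retired load
 * instructions, which is why the text expects the result to stay
 * within 0.0..1.0. Returns 0.0 when no loads were counted. */
double l1_hit_ratio(LoadCounters c)
{
    if (c.all_loads == 0) {
        return 0.0;
    }
    return (double)c.l1_hit / (double)c.all_loads;
}
```

Feeding it made-up readings such as 900 hits out of 1000 retired loads gives a ratio of about 0.9. As the caveats above note, this covers retired loads only, so it says nothing about stores or speculative loads.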


## borrow_inference.md

      
paniq / borrow_inference.md (last active January 22, 2024 06:59; 1 file, 2 forks, 0 comments, 11 stars)

Borrow Inference

by Leonard Ritter, Duangle GbR
This document has only historical significance and does not describe the borrow checker as it is now implemented. Please see this document for a more recent description.
This is a description of borrow inference, an alternative to borrow checking
that requires no declarative annotations to support proper management of
unique values and borrowed references at compile time.

  
## Matrix.md

      
nadavrot / Matrix.md (last active July 1, 2024 17:31; 7 files, 73 forks, 17 comments, 867 stars)

Efficient matrix multiplication

High-Performance Matrix Multiplication

This is a short post that explains how to write a high-performance matrix
multiplication program on modern processors. In this tutorial I will use a
single core of the Skylake-client CPU with AVX2, but the principles in this post
also apply to other processors with different instruction sets (such as AVX512).
Intro

Matrix multiplication is a mathematical operation that defines the product of

  
## Example.cpp
#include "Reflection.h"
#include <string>
#include <vector>


namespace mks
{
	struct Rect
	{
		int top;

## amazon.md

      
terabyte / amazon.md (created December 6, 2017 02:27; 1 file, 37 forks, 21 comments, 270 stars)

Amazon's Build System

Prologue

I wrote this answer on stackexchange, here:
https://stackoverflow.com/posts/12597919/
It was wrongly deleted for containing "proprietary information" years later.  I think that's bullshit so I am posting it here.  Come at me.
The Question

Amazon is a SOA system with 100s of services (or so says Amazon Chief Technology Officer Werner Vogels). How do they handle build and release?

  
## bob
#!/bin/bash
#
# This is a sketch of an experimental design for a build system where project definitions are written in C.
# The build system is packaged as a single shell script which contains the C library code as a heredoc string.
# The shell script creates a concatenation from the library code and the file specified as the first command line argument.
# Then that is compiled and the result is run, which takes any required build-related actions; for this toy implementation,
# the only build action is running the C compiler directly, but it could also generate makefiles or VS solutions like premake.
#
# Note that using some tricks you can write a single file that is both a valid Unix shell script and Windows batch file,
# which combined with cross-platform code for the C library would allow a portable single-file build system solution.

## soa.cpp
#include <vector>
#include <cstdio>
#include <cstdint>
#include <cmath>
#include <xmmintrin.h>

namespace Soa
{
	template<typename TYPE, typename... ARGS>
	void Load(TYPE&, ARGS...);

## zero_copy_buffer_splice.c
#include <stdlib.h>
#include <stdio.h>
#include <assert.h>
#include <string.h>
#include <windows.h>

typedef struct {
    char *start;
    char *current;
    char *end;

## rw_ring_sync.txt
// assume sequential consistency.

// this technique prevents frequent synchronization (cache line thrashing) of the read/write positions
// in the case where the ring buffer is running neither too close to full nor too close to empty. it
// relies on the fact that an out-of-date notion of the read/write positions is a conservative approximation.

// globals in shared memory. assume in different cache lines to prevent false sharing.
int read_pos, write_pos;

// reader
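
The snippet is cut off at this point, so the following is only a sketch of how the idea in the comments could look, not the original author's code. It assumes a single reader and a single writer, uses C11 sequentially consistent atomics to match the "assume sequential consistency" note, and introduces hypothetical names (`Ring`, `ring_init`, `ring_push`, `ring_pop`, `RING_CAPACITY`, and the two `cached_*` fields). Each side keeps a private, possibly stale copy of the other side's position and only re-reads the shared value when the stale copy makes the buffer look full (writer) or empty (reader). A stale read position can only underestimate free space, and a stale write position can only underestimate available items, so acting on them is safe.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define RING_CAPACITY 8u /* hypothetical size; must be a power of two */
static_assert((RING_CAPACITY & (RING_CAPACITY - 1u)) == 0u,
              "capacity must be a power of two");

typedef struct {
    /* Shared positions, on separate cache lines (64 bytes assumed). */
    _Alignas(64) atomic_uint read_pos;
    _Alignas(64) atomic_uint write_pos;
    /* Writer-private stale copy of read_pos. */
    _Alignas(64) unsigned cached_read_pos;
    /* Reader-private stale copy of write_pos. */
    _Alignas(64) unsigned cached_write_pos;
    int slots[RING_CAPACITY];
} Ring;

void ring_init(Ring *r)
{
    atomic_init(&r->read_pos, 0u);
    atomic_init(&r->write_pos, 0u);
    r->cached_read_pos = 0u;
    r->cached_write_pos = 0u;
}

/* Writer side: returns false if the ring is really full. */
bool ring_push(Ring *r, int value)
{
    unsigned w = atomic_load(&r->write_pos);
    if (w - r->cached_read_pos == RING_CAPACITY) {
        /* Looks full by the stale copy: refresh from shared memory. */
        r->cached_read_pos = atomic_load(&r->read_pos);
        if (w - r->cached_read_pos == RING_CAPACITY) {
            return false;
        }
    }
    r->slots[w % RING_CAPACITY] = value;
    atomic_store(&r->write_pos, w + 1u); /* publish the new item */
    return true;
}

/* Reader side: returns false if the ring is really empty. */
bool ring_pop(Ring *r, int *out)
{
    unsigned rd = atomic_load(&r->read_pos);
    if (r->cached_write_pos == rd) {
        /* Looks empty by the stale copy: refresh from shared memory. */
        r->cached_write_pos = atomic_load(&r->write_pos);
        if (r->cached_write_pos == rd) {
            return false;
        }
    }
    *out = r->slots[rd % RING_CAPACITY];
    atomic_store(&r->read_pos, rd + 1u); /* release the slot */
    return true;
}
```

In the common case, where the buffer is neither nearly full nor nearly empty, `ring_push` touches only `write_pos` and `ring_pop` touches only `read_pos`, so the cache line holding the other side's position is not pulled back and forth on every operation.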