Kenta Sato bicycle1885

## shift_dfa.md

      
pervognsen / shift_dfa.md: Shift-based DFAs
Last active January 27, 2024 19:54. 1 file, 4 forks, 6 comments, 93 stars.

A traditional table-based DFA implementation looks like this:
uint8_t table[NUM_STATES][256];

uint8_t run(const uint8_t *start, const uint8_t *end, uint8_t state) {
    for (const uint8_t *s = start; s != end; s++)
        state = table[state][*s];
    return state;
}
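
As a minimal sketch of how such a table might be filled in and driven, the example below builds a two-state DFA that tracks whether the input contains an even or odd number of `a` bytes. The parity automaton, `init_table`, and `run_str` are illustrative assumptions and do not come from the gist; only `table` and `run` follow the snippet above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical two-state automaton: parity of the number of 'a' bytes seen. */
enum { NUM_STATES = 2, EVEN = 0, ODD = 1 };

uint8_t table[NUM_STATES][256];

/* Fill the transition table: every byte keeps the current state,
   except 'a', which flips between EVEN and ODD. */
void init_table(void) {
    for (int state = 0; state < NUM_STATES; state++)
        for (int byte = 0; byte < 256; byte++)
            table[state][byte] = (uint8_t)state;
    table[EVEN]['a'] = ODD;
    table[ODD]['a'] = EVEN;
}

/* Same loop as in the snippet above: one table lookup per input byte. */
uint8_t run(const uint8_t *start, const uint8_t *end, uint8_t state) {
    for (const uint8_t *s = start; s != end; s++)
        state = table[state][*s];
    return state;
}

/* Convenience wrapper (assumed helper): run the DFA over a C string,
   starting from the EVEN state. */
uint8_t run_str(const char *text) {
    const uint8_t *start = (const uint8_t *)text;
    return run(start, start + strlen(text), EVEN);
}
```

After calling `init_table()` once, `run_str("banana")` ends in `ODD` because the string contains three `a` bytes. Each input byte costs one dependent load, since the next lookup cannot start until the previous state is known.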


## JuliaAtomics.md

      
vtjnash / JuliaAtomics.md
Last active June 17, 2024 11:33. 1 file, 0 forks, 0 comments, 8 stars.

Introduction

This proposal aims to define the memory model of Julia and to provide certain guarantees in the presence of data races, both by default and through providing intrinsics to allow the user to specify the level of guarantees required. This should allow native implementation in Julia of simple system primitives (like mutexes), interoperate with native system code, and aim to give generally explainable behaviors without incurring significant performance cost. Additionally, it strives to be general-purpose and yet clear about the user's intent—particularly with respect to ensuring that an atomic-type field is accessed with proper care for synchronization.
The last two points deserve particular attention, as Julia has always provided strong reflection and generic programming capabilities that have not been seen, in this synergistic combination, in any other language. Therefore, we want to be careful to observe a distinction between the asymmetries of reading vs. writing that we have felt is often not given

  
## Matrix.md

      
nadavrot / Matrix.md: Efficient matrix multiplication
Last active July 1, 2024 17:31. 7 files, 73 forks, 17 comments, 867 stars.

High-Performance Matrix Multiplication

This is a short post that explains how to write a high-performance matrix
multiplication program on modern processors. In this tutorial I will use a
single core of the Skylake-client CPU with AVX2, but the principles in this post
also apply to other processors with different instruction sets (such as AVX512).
Intro

Matrix multiplication is a mathematical operation that defines the product of

  
## WhatIsStrictAliasingAndWhyDoWeCare.md

      
shafik / WhatIsStrictAliasingAndWhyDoWeCare.md: What is Strict Aliasing and Why do we Care?
Last active July 2, 2024 08:29. 1 file, 50 forks, 35 comments, 489 stars.

What is the Strict Aliasing Rule and Why do we care?

(OR Type Punning, Undefined Behavior and Alignment, Oh My!)

What is strict aliasing? First we will describe what aliasing is, and then we can learn what being strict about it means.
In C and C++, aliasing has to do with which expression types we are allowed to access stored values through. In both C and C++, the standard specifies which expression types are allowed to alias which types. The compiler and optimizer are allowed to assume we follow the aliasing rules strictly, hence the term strict aliasing rule. If we attempt to access a value through a type that is not allowed, the access is classified as undefined behavior (UB). Once we have undefined behavior, all bets are off: the results of our program are no longer reliable.
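
To make the rule concrete, here is a minimal sketch in C of reading the bits of a `float` as an integer. The helper name `float_bits` is an assumption for illustration, and the example assumes a platform where `float` and `uint32_t` have the same size. Copying the bytes with `memcpy` is one widely used way to stay within the rules; the pointer cast shown in the comment is the kind of access the rule forbids.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: float is 32 bits wide, as on mainstream platforms. */
_Static_assert(sizeof(float) == sizeof(uint32_t),
               "this sketch assumes a 32-bit float");

/* Return the object representation of a float as a 32-bit integer.
   memcpy copies the bytes, so no object is accessed through an
   lvalue of an incompatible type. */
uint32_t float_bits(float value) {
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);
    return bits;
}

/* For contrast, this access violates the aliasing rules because a
   float object is read through a uint32_t lvalue:

       uint32_t bits = *(uint32_t *)&value;   // undefined behavior

   It is kept in a comment so that the file itself stays well-defined. */
```

On a platform with IEEE 754 single-precision floats, `float_bits(1.0f)` yields `0x3F800000`. Optimizing compilers typically turn the `memcpy` into a single register move, so the well-defined version usually costs nothing extra.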
Unfortunately, with strict aliasing violations we will often obtain the results we expect, leaving the possibility that a future version of a compiler with a new optimization will break code we th

  
## 01-mac-profiling.md

      
loderunner / 01-mac-profiling.md: Profiling an application in Mac OS X
Last active June 8, 2024 09:44. 3 files, 22 forks, 5 comments, 123 stars.

Profiling an application in Mac OS X

Finding which process to profile

If your system is running slowly, perhaps a process is using too much CPU time and won't let other processes run smoothly. To find out which processes are taking up a lot of CPU time, you can use Apple's Activity Monitor.
The CPU pane shows how processes are affecting CPU (processor) activity:


## gist:0b7dab3e75bfbf96f895

      
shyouhei / gist:0b7dab3e75bfbf96f895: Things new working adults should keep in mind
Created March 31, 2015 15:26. 1 file, 0 forks, 0 comments, 59 stars.

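Essential for anyone newly entering the workforce:

A written "employment contract" with your employer. Outsourcing (contractor) agreements and the like are not acceptable.
A salary that is paid every month, whatever the amount. Late payment and the like are out of the question.
Health insurance.
Workers' accident compensation insurance.
Employment insurance.
An Article 36 agreement (三六協定, the labor-management agreement on overtime).
Annual paid leave.
Work at a company that has a childcare leave system and where employees actually take it.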


## latency.txt
Latency Comparison Numbers (~2012)
----------------------------------
L1 cache reference                           0.5 ns
Branch mispredict                            5   ns
L2 cache reference                           7   ns                      14x L1 cache
Mutex lock/unlock                           25   ns
Main memory reference                      100   ns                      20x L2 cache, 200x L1 cache
Compress 1K bytes with Zippy             3,000   ns        3 us
Send 1K bytes over 1 Gbps network       10,000   ns       10 us
Read 4K randomly from SSD*             150,000   ns      150 us          ~1GB/sec SSD

## 256color.pl
#!/usr/bin/perl
# Author: Todd Larason <jtl@molehill.org>
# $XFree86: xc/programs/xterm/vttests/256colors2.pl,v 1.2 2002/03/26 01:46:43 dickey Exp $

# use the resources for colors 0-15 - usually more-or-less a
# reproduction of the standard ANSI colors, but possibly more
# pleasing shades

# colors 16-231 are a 6x6x6 color cube
for ($red = 0; $red < 6; $red++) {