Skip to content

Instantly share code, notes, and snippets.

@craftslab
Last active January 29, 2024 10:39
Show Gist options
  • Save craftslab/4f90f8f161ea470e52a97e90258d450d to your computer and use it in GitHub Desktop.
Save craftslab/4f90f8f161ea470e52a97e90258d450d to your computer and use it in GitHub Desktop.
go simd

Go SIMD

Instruction Support Detection: CPUID Instruction

package main

import (
    "fmt"

    "github.com/klauspost/cpuid"
)

func main() {
    if cpuid.CPU.AVX() {
        fmt.Println("AVX: supported")
    } else {
        fmt.Println("AVX: unsupported")
    }
}

cpu-feature-identification-for-go

Three Ways to Use SIMD Instructions

  • Compiler auto-vectorization

    Let the compiler figure out how to use SIMD

  • Intrinsic functions

    Special low-level functions that easily map to processor instructions

  • Implement in another language, link with Go

accelerating-data-processing-in-go-with-simd-instructions

Compiler auto-vectorization in Go

  • Google’s Go compiler does not support auto-vectorization

  • gccgo can do auto-vectorization

    Pass -ftree-vectorize (or -O3) option to gccgo

    Set -march option to enable additional SIMD extensions

    -march=native to use everything available on host

    Enable -ffast-math to unlock additional floating-point vectorization opportunities (use with care: it affects results)

    With go utility: go build

    -compiler=gccgo -gccgoflags=”-O3 -march=native -ffast-math”

accelerating-data-processing-in-go-with-simd-instructions

Linking with External Implementation

Go provides 4 mechanisms to call into code implemented in other languages:

  • Built-in assembler
  • syso
  • cgo
  • extern

accelerating-data-processing-in-go-with-simd-instructions

Using Go Assembler

Both Google’s and gccgo toolchains include assemblers, but the syntax is quite different:

gccgo toolchain uses GNU assembler, which accepts traditional AT&T and Intel syntax Google’s Go assembler (6a) follows Plan 9 syntax, which is different from everything!

accelerating-data-processing-in-go-with-simd-instructions

6a Assembler

6a assembler weirdness:

Unconventional names for instructions and registers Branch instructions use names from Motorola 68020 arch! Some widely used constructs are not supported E.g. alignment, jump tables 6a lacks SIMD instructions later than SSE3 (2004) Can be inserted directly as opcode bytes Function names must start with Unicode middle dot character!

multidimensional-arrays-for-the-go-language

accelerating-data-processing-in-go-with-simd-instructions

Generating Low-level Code with PeachPy

PeachPy is a Python framework for writing high-performance low-level codes.

For Go users, PeachPy can generate SysO, Plan 9 assembly, as well as object files for Windows, Linux, and OS X

peachpy

peachpy-examples

accelerating-data-processing-in-go-with-simd-instructions

Reference

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment