Edmond Cote edcote

@edcote
edcote / ulimit.md
Last active Jul 10, 2018
ulimit gotcha
View ulimit.md
edcote / cmake.md
Last active Jun 14, 2018
CMake Notes
View cmake.md

Why CMake? Why? Is it only me, or is CMake convoluted and non-intuitive?

  • Debug vs. Release:

Technically, this is all that is needed:

cmake -DCMAKE_BUILD_TYPE=Release ..
cmake -DCMAKE_BUILD_TYPE=Debug ..
edcote / scala.md
Last active Jun 12, 2018
Scala Notes
View scala.md

Scala Notes

Variables

Immutable and mutable

Scala has two kinds of variables, vals and vars:

  • val is immutable, cannot be reassigned
  • var is mutable, can be reassigned
View riscv_debug_spec.md

Chapter 2

Each hart in the platform is controlled by exactly one DM (Debug Module). Usually, all harts in a single core are controlled by the same DM.

Abstract commands provide access to GPRs. Additional registers are accessible through abstract commands or by writing programs to the optional program buffer.

The program buffer allows the debugger to execute arbitrary instructions on a hart. A bus access block allows memory access without using a RISC-V hart to perform the access.

Chapter 3

edcote / primer_consistency_coherence.md
Last active Jul 31, 2018
Primer on Memory Consistency and Cache Coherence
View primer_consistency_coherence.md

Chapter 1

Consistency models define correct shared memory behavior in terms of loads and stores, without reference to caches or coherence.

Chapter 2

  1. Single-Writer, Multiple-Reader (SWMR) Invariant: For any memory location A, at any given time, there exists either a single core that may write to A (and can also read it) or some number of cores that may only read A.
  2. Data-Value Invariant: The value of the memory location at the start of an epoch is the same as the value of the memory location at the end of its last read-write epoch.
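
As a rough illustration of the SWMR invariant above, the sketch below checks one memory location at one instant. The function name `swmr_holds` and the `"RW"`/`"R"` permission encoding are assumptions made for this example, not notation from the primer.

```python
def swmr_holds(permissions):
    """Check the SWMR invariant for one memory location at one instant.

    `permissions` maps a core id to "RW" (read-write) or "R" (read-only).
    Cores with no access to the location are simply absent.
    This encoding is hypothetical, chosen only for illustration.
    """
    writers = [core for core, perm in permissions.items() if perm == "RW"]
    readers = [core for core, perm in permissions.items() if perm == "R"]
    if writers:
        # A writer must be alone: no second writer and no concurrent readers.
        return len(writers) == 1 and not readers
    # With no writer, any number of read-only cores is allowed.
    return True
```

For example, `swmr_holds({0: "RW", 1: "R"})` is false because core 1 could read while core 0 writes, whereas `swmr_holds({0: "R", 1: "R"})` is true.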

Chapter 3 - Memory Consistency Motivation and Sequential Consistency

edcote / shen_lipasti.md
Last active Jul 31, 2018
Modern Processor Design - Shen, Lipasti
View shen_lipasti.md

Chapter 1

  • "Iron law": 1/Perf = time/program = instructions/program (instruction count) * cycles/instruction (CPI) * time/cycle (cycle time)
  • "Amdahl's law": speedup = 1 / time = 1 / ((1-f) + (f/N)), where f is the fraction of the program that can be sped up by a factor of N
    • speedup is limited by the sequential bottleneck
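
The two laws above can be tried out numerically with a minimal sketch like the following. The function and parameter names are chosen for this example only and do not come from the book.

```python
def execution_time(instruction_count, cpi, cycle_time_s):
    """Iron law: time/program = instruction count * CPI * cycle time."""
    return instruction_count * cpi * cycle_time_s


def amdahl_speedup(f, n):
    """Amdahl's law: f is the fraction of work sped up by a factor of n."""
    return 1.0 / ((1.0 - f) + f / n)
```

With f = 0.9 the speedup stays below 10 no matter how large N gets, which is the sequential bottleneck: the remaining 10% of the work sets a floor on execution time.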

Chapter 2

  • There are three possible data dependences between two instructions: true (RAW), anti (WAR), and output (WAW). These also apply to memory data dependences (not applicable in a simple five-stage pipeline).
  • There are also control dependences.
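
The three register data dependences listed above can be detected with a small sketch such as this one. The `(dest, sources)` tuple used to represent an instruction is an assumption made for illustration.

```python
def classify_dependences(first, second):
    """Return the data dependences of `second` on the earlier `first`.

    Each instruction is a hypothetical (dest, sources) pair,
    e.g. ("r1", ("r2", "r3")) for r1 = r2 op r3.
    """
    dest1, srcs1 = first
    dest2, srcs2 = second
    found = []
    if dest1 is not None and dest1 in srcs2:
        found.append("RAW")  # true: second reads what first writes
    if dest2 is not None and dest2 in srcs1:
        found.append("WAR")  # anti: second overwrites what first reads
    if dest1 is not None and dest1 == dest2:
        found.append("WAW")  # output: both write the same register
    return found
```

For example, `classify_dependences(("r1", ("r2", "r3")), ("r4", ("r1", "r5")))` reports a RAW dependence because the second instruction reads `r1`.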
edcote / tilelink.md
Created May 26, 2018
SiFive TileLink Specification
View tilelink.md
edcote / ddr3_features.md
Created May 26, 2018
Features of DDR3 SDRAM
View ddr3_features.md

Source

  • DDR3 SDRAM has eight banks, which allows more efficient interleave

Output driver impedance (Ron), ODT and ZQ calibration

  • The output driver impedance (Ron) of DQ, DQS, /DQS, and DM is selectable. Ron may fluctuate with PVT (process, voltage, and temperature), so DDR3 uses ZQ calibration to compensate.
  • ODT (On-Die Termination): a termination resistor is provided on the chip to suppress signal reflection. The ODT resistance Rtt can be adjusted by MR2.
  • ZQ calibration long (ZQCL) is issued during initialization. ZQ calibration short (ZQCS) is issued periodically during operation.
edcote / riscv-user.md
Created May 26, 2018
RISC-V User-Level ISA
View riscv-user.md

Base

  • ISA separated into small base ISA and support for extensions
  • JAL stores the address of the instruction following the jump (pc+4) into register rd. The standard calling convention uses x1 as the return address register and x5 as an alternate link register. Using x1 or x5 in JAL/JALR also acts as a hint to push or pop the return-address prediction stack.
  • Aligned loads and stores are guaranteed to execute atomically; misaligned loads and stores are not.
  • Each hart observes its own memory operations as if they are executed in sequential program order. RISC-V uses a relaxed memory model between harts, so explicit FENCE instructions are required to guarantee ordering between memory operations from different harts.
  • FENCE is used to order I/O and memory accesses as viewed by other RISC-V harts, external devices, and co-processors. No other hart or external device can observe any operation in the successor set following a FENCE operation before any operation in the predecessor set before the FENCE.
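
The JAL behavior noted in the list above can be modeled in a few lines. This is a simplified sketch that assumes RV32 (32-bit registers) and a plain Python list as the register file; the function name `jal` and its signature are made up for this example.

```python
XLEN_MASK = 0xFFFFFFFF  # assumption: RV32, so values wrap at 32 bits


def jal(pc, regs, rd, offset):
    """Model JAL: write pc+4 to rd, then jump to pc+offset.

    `regs` is a list of 32 integers standing in for x0..x31.
    Returns the new pc.
    """
    if rd != 0:  # x0 is hardwired to zero, so writes to it are dropped
        regs[rd] = (pc + 4) & XLEN_MASK
    return (pc + offset) & XLEN_MASK
```

Calling `jal(0x1000, regs, 1, 0x20)` leaves the return address `0x1004` in `x1` and returns the target `0x1020`. Passing `rd=0` gives a plain jump with no link.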

Atomic "A"

edcote / boomv2.md
Last active Oct 3, 2018
BOOM v2: An Open-Source OoO RISC-V Core
View boomv2.md

Notes

Link to tech report

Alpha 21264 has 15 FO4 delays. (An FO4 delay is the delay of an inverter driven by an inverter 4x smaller than itself and driving an inverter 4x bigger than itself.) BOOMv2 is 35 FO4.

BOOMv1 follows the 6-stage pipeline structure of the MIPS R10K: fetch, decode/rename, issue/register-read, execute, memory, and writeback.

Frontend fetches instructions for execution in the backend.
