@lffg
Last active June 14, 2023 23:40

Basics of RISC-V

  • Notes.
    • See An Introduction to Assembly Programming with RISC-V.
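    • ISA (Instruction Set Architecture) is the portion of the architecture that is visible to programmers and compiler writers. I.e., a contract between software and hardware.
    • There are 33 unprivileged registers: 32 general-purpose registers (x0 ~ x31, listed here by their ABI names) plus the program counter:
      • zero, hardwired to the value 0.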
      • t0 ~ t6 for temporary values, which are not to be persisted across calls.
      • s0 ~ s11 for saved values.
      • a0 ~ a7 for arguments and return values.
      • ra, the return address.
      • sp, the stack pointer.
      • gp, the global pointer.
      • tp, the thread pointer.
      • pc, the program counter (not a general-purpose register; it cannot be named as an instruction operand).
    • Basic instruction formats are:

      • INS <rd>, <rs1>, <rs2>.
        • Example, ADD s0, s1, s2.
      • INS <rd>, <rs>, <imm>, with an immediate argument.
        • Example, ADDI s0, s1, 9.
      • INS <rd>, <imm>.
        • Example, LUI s0, 9.
    • Basic instruction reference

      • Arithmetic instructions:

        • Add ADD <rd>, <rs1>, <rs2> (format R).
        • Add immediate ADDI <rd>, <rs>, <imm> (format I).
        • Sub SUB <rd>, <rs1>, <rs2> (format R).
        • Load Upper Immediate LUI <rd>, <imm> (format U).
        • Mul MUL <rd>, <rs1>, <rs2> (format R).
        • Div DIV{U} <rd>, <rs1>, <rs2> (format R). The U suffix indicates that the values in rs1 and rs2 are treated as unsigned.
        • Rem REM{U} <rd>, <rs1>, <rs2> (format R).
        • When the M extension (which adds multiplication and division) is not enabled, one may use shifts and additions to perform multiplications and divisions.
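The shift-and-add idea can be sketched in Python as a model of what such a routine would compute. This is only an illustration under the assumption of 32-bit registers; the name `mul_shift_add` is hypothetical and not part of RISC-V or of the notes.

```python
MASK32 = 0xFFFFFFFF  # registers are 32 bits wide in RV32


def mul_shift_add(a, b):
    """Multiply two 32-bit values using only shifts, bit tests and additions."""
    a &= MASK32
    b &= MASK32
    product = 0
    while b != 0:
        if b & 1:                              # andi: is the lowest multiplier bit set?
            product = (product + a) & MASK32   # add: accumulate the shifted multiplicand
        a = (a << 1) & MASK32                  # slli: next power-of-two multiple
        b >>= 1                                # srli: move on to the next multiplier bit
    return product


print(mul_shift_add(6, 7))                # 42
print(hex(mul_shift_add(0xFFFFFFFF, 3)))  # 0xfffffffd, i.e. -3 in two's complement
```

The result is truncated to 32 bits, which matches the low word that MUL would produce.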

      • Logic instructions:

        • XOR XOR <rd>, <rs1>, <rs2> (format R).
        • XOR immediate XORI <rd>, <rs>, <imm> (format I).
        • Same for OR, ORI, AND and ANDI.
      • Shift instructions:

        • Shift left SLL <rd>, <rs1>, <rs2> (format R).
        • Shift left immediate SLLI <rd>, <rs>, <imm> (format I).
        • Note that $X << n$ is equivalent to $X \times 2^n$.

        • Shift right SRL <rd>, <rs1>, <rs2> (format R).
        • Shift right immediate SRLI <rd>, <rs>, <imm> (format I).
        • Note that $X >> n$ is equivalent to $X \times 2^{-n}$. Or... is it?

          When shifting to the right, the rightmost bits are discarded, just as the logical left shift discards the leftmost bits. The new leftmost bits are set to zero.

          However, though shift right is "equivalent" to division for non-negative (unsigned) numbers, it is not valid for negative numbers. This is due to the two's complement representation: when the leftmost bits are set to zero, the sign bit is cleared and one gets a positive number!

          Hence, there is an arithmetic shift right. Instead of setting the new leftmost bits to zero, it fills them with copies of the original leftmost (sign) bit. Note that the result rounds towards negative infinity, whereas DIV rounds towards zero.

        • Shift right arithmetic SRA <rd>, <rs1>, <rs2> (format R).
        • Shift right arithmetic immediate SRAI <rd>, <rs>, <imm> (format I).
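The difference between the logical and the arithmetic right shift can be checked with a small Python model. This is a sketch assuming 32-bit registers; `srl`, `sra` and `to_signed` are illustrative helper names that mimic the instructions, not a real library.

```python
MASK32 = 0xFFFFFFFF


def to_signed(x):
    """Interpret a 32-bit pattern as a two's complement signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def srl(x, n):
    """Logical right shift: the vacated high bits become zero."""
    return (x & MASK32) >> (n & 31)


def sra(x, n):
    """Arithmetic right shift: the vacated high bits copy the sign bit."""
    x &= MASK32
    n &= 31
    if x & 0x80000000:
        # Fill the top n bits with ones.
        return ((x >> n) | (MASK32 << (32 - n))) & MASK32
    return x >> n


minus_eight = -8 & MASK32                  # 0xFFFFFFF8
print(to_signed(srl(minus_eight, 1)))      # 2147483644 (sign bit lost)
print(to_signed(sra(minus_eight, 1)))      # -4
print(to_signed(sra(-7 & MASK32, 1)))      # -4 (rounds down; a signed division would give -3)
```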
      • Comparison instructions

        • Set less than SLT <rd>, <rs1>, <rs2> sets rd with 1 if the signed value in rs1 is less than the signed value in rs2. Else, set rd with 0.
        • Set less than immediate SLTI <rd>, <rs>, <imm>.
        • Set less than unsigned SLTU <rd>, <rs1>, <rs2>.
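        • Set less than immediate unsigned SLTIU <rd>, <rs>, <imm>.
        • There are also the pseudo-instructions SEQZ <rd>, <rs> (set if equal to zero, assembled as SLTIU <rd>, <rs>, 1) and SNEZ <rd>, <rs> (set if not equal to zero, assembled as SLTU <rd>, zero, <rs>).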
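To see why signed and unsigned comparisons differ, here is a short Python model of the comparison instructions. It is a sketch assuming 32-bit registers; the function names mirror the mnemonics only for readability.

```python
MASK32 = 0xFFFFFFFF


def to_signed(x):
    """Interpret a 32-bit pattern as a two's complement signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def slt(rs1, rs2):
    """Signed comparison: 1 if rs1 < rs2, else 0."""
    return 1 if to_signed(rs1) < to_signed(rs2) else 0


def sltu(rs1, rs2):
    """Unsigned comparison on the same bit patterns."""
    return 1 if (rs1 & MASK32) < (rs2 & MASK32) else 0


def seqz(rs):
    return sltu(rs, 1)   # only 0 is below 1 when compared as unsigned


def snez(rs):
    return sltu(0, rs)   # 0 is below every non-zero unsigned value


minus_one = 0xFFFFFFFF       # bit pattern of -1
print(slt(minus_one, 1))     # 1: -1 < 1 as signed values
print(sltu(minus_one, 1))    # 0: 4294967295 is not below 1
print(seqz(0), seqz(5))      # 1 0
print(snez(0), snez(5))      # 0 1
```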
      • Data movement (memory) instructions

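        • RISC-V is a load/store architecture. This means that arithmetic and logic instructions operate only on registers: values must be loaded from memory into registers before being used, and results must be stored back to memory explicitly.
        • Load word LW <rd>, <offset>(<rs>). E.g. LW a5, 0(a0).
        • Store word SW <rs2>, <offset>(<rs1>). E.g. SW a5, 0(a0).
        • Notice that store instructions have no destination register: the first operand is the source register whose value is stored, and the register in parentheses holds the base address.

        • There is also LB (byte), LBU (byte unsigned), LH (half-word), LHU (half-word unsigned), SB and SH. Stores have no unsigned variants, since only the low bits of the register are written.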
      • Endianness

        • RISC-V is little-endian. Hence, when a multi-byte value is loaded, the byte at the smallest memory address (leftmost) is loaded into the register's least significant (rightmost) byte.
          • Analogously, when storing, the register's least significant (rightmost) byte is stored at the smallest address (leftmost).
        • For example, when the LBU (load byte unsigned) instruction is used, the byte at the designated position is loaded into the least significant byte of the register. The other three bytes are set to zero. E.g. result is \x00 \x00 \x00 \xAB.
          • When LB is used, if the byte is negative as per two's complement format, the three leftmost (most significant) bytes are set to FF (i.e., all ones). This sign extension preserves the sign and the signed value itself, which is the expected behavior.
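The byte order and the two flavors of byte load can be modeled in a few lines of Python. This is a sketch in which `memory` is a made-up four-byte region and `lw`, `lbu` and `lb` are illustrative helpers named after the instructions.

```python
# Bytes stored at addresses 0, 1, 2 and 3 (hypothetical memory contents).
memory = bytes([0xAB, 0xCD, 0xEF, 0x01])


def lw(mem, addr):
    """Load word: the lowest address supplies the least significant byte."""
    return int.from_bytes(mem[addr:addr + 4], "little")


def lbu(mem, addr):
    """Load byte unsigned: the upper 24 bits are zero."""
    return mem[addr]


def lb(mem, addr):
    """Load byte: the upper 24 bits copy bit 7 of the byte (sign extension)."""
    byte = mem[addr]
    return byte | 0xFFFFFF00 if byte & 0x80 else byte


print(format(lw(memory, 0), "#010x"))   # 0x01efcdab
print(format(lbu(memory, 0), "#010x"))  # 0x000000ab
print(format(lb(memory, 0), "#010x"))   # 0xffffffab
print(format(lb(memory, 3), "#010x"))   # 0x00000001
```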
      • Pseudo-instructions

        • A pseudo-instruction doesn't have a corresponding machine instruction in the architecture; the assembler automatically translates it into one or more real instructions. Some examples include...
        • nop as addi x0, x0, 0.
        • mv <rd>, <rs> as addi <rd>, <rs>, 0.
        • li <rd>, <imm>, which is "automatically converted by the assembler to the best sequence of machine instructions to compose the desired value". One should remember that, since all instructions in the RV32I set are 32 bits long, an immediate cannot fill a whole register at once: I-format immediates are 12-bit signed values (-2048 to 2047) and LUI carries a 20-bit immediate. Larger constants therefore typically need a LUI followed by an ADDI.
        • la <rd>, <symbol>, which loads the 32-bit address indicated by the label, symbol, into the destination register.
        • call, which uses jal or jalr.
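As an illustration of how li can be composed from LUI and ADDI, the following Python sketch splits a 32-bit constant into the two immediates. The helper names `split_li` and `rebuild` are hypothetical, and a real assembler may pick a shorter sequence (for example a single ADDI for small values).

```python
MASK32 = 0xFFFFFFFF


def split_li(value):
    """Return (upper20, lower12) so that 'lui rd, upper20; addi rd, rd, lower12' yields value."""
    value &= MASK32
    lower = value & 0xFFF
    if lower >= 0x800:
        lower -= 0x1000            # addi sign-extends its 12-bit immediate
    upper = ((value - lower) >> 12) & 0xFFFFF
    return upper, lower


def rebuild(upper, lower):
    """Model the effect of the lui + addi pair on a 32-bit register."""
    return ((upper << 12) + lower) & MASK32


for constant in (0x12345678, 0x12345FFF, 2048):
    upper, lower = split_li(constant)
    print(hex(constant), "->", hex(upper), lower)
# 0x12345678 -> 0x12345 1656
# 0x12345fff -> 0x12346 -1
# 0x800 -> 0x1 -2048
```

Whenever the low 12 bits have their top bit set, the upper part is bumped by one to compensate for the negative ADDI immediate.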
      • Control-flow (branch) instructions

        • Conditional control-flow

          • Whether the normal execution flow changes depends on whether a given condition is satisfied. I.e., jump [to dest label] if true; otherwise, continue with the next instruction.

          • BEQ <rs1>, <rs2>, <lab> if equal.
          • BNE <rs1>, <rs2>, <lab> if not equal.
          • BEQZ <rs>, <lab> if equal to zero (pseudo-instruction for BEQ <rs>, zero, <lab>).
          • BNEZ <rs>, <lab> if not equal to zero (pseudo-instruction for BNE <rs>, zero, <lab>).
          • BLT <rs1>, <rs2>, <lab> if signed rs1 is smaller than signed rs2.
          • BLTU <rs1>, <rs2>, <lab> if unsigned rs1 is smaller than unsigned rs2.
          • BGE <rs1>, <rs2>, <lab> if signed rs1 is greater than or equal to signed rs2.
          • BGEU <rs1>, <rs2>, <lab> if unsigned rs1 is greater than or equal to unsigned rs2.
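          • Notice that there are no BLE (branch if less than or equal) or BGT (branch if greater than) machine instructions. Assemblers provide them as pseudo-instructions by swapping the operands of BGE and BLT, respectively.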

        • Unconditional control-flow

          • J <lab> jumps to lab (pseudo-instruction for JAL zero, <lab>).
          • JR <rs> jumps to the address stored in the register rs ("indirect jump"; pseudo-instruction for JALR zero, 0(<rs>)).
          • JAL <lab> stores the return address (pc + 4) in the return address register (ra) and jumps to lab. "Jump and link".
          • JAL <rd>, <lab> is the same as above, but stores the return address in rd.
          • RET jumps to the address stored in the return address register, ra (pseudo-instruction for JALR zero, 0(ra)).
    • Labels

      • Symbolic and numeric labels.
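      • References to numeric labels require the suffix b (backward, the closest definition before) or f (forward, the closest definition after). E.g. j 1b jumps to the closest numeric label 1 defined before the instruction.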
    • Directives

      • Directives are commands that control the assembler.
      • For example, the .section .data directive instructs the assembler to turn the .data section into the active section. The .word 10 directive tells the assembler to assemble a 32-bit (a word) value (10) and add it to the active section.
      • Examples are .byte (8-bit value), .half, .word, .dword (64-bit value), .string / .asciz and .ascii.
      • The .set directive may be used to define a custom symbol. E.g. .set max_temp, 100.
    • Routines

      • Allocating memory on the stack is common when calling routines. The following grows the stack by one word:
      • addi sp, sp, -4     # allocates 4 bytes (a word)
        sw   a0, 0(sp)      # stores `a0` into the stack
        
      • The stack grows towards lower addresses. Writes to it occur from lower to higher addresses.
      • To shrink the stack, one adds to sp instead of subtracting. E.g.,
      • lw   a0, 0(sp)     # loads into `a0`
        addi sp, sp, 4     # deallocates 4 bytes (a word)
        
      • Who should save each register on the stack?
        • ra caller *
        • t0 ~ t6 caller
        • a0 ~ a7 caller *
        • sp callee
        • s0 ~ s11 callee
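The push and pop sequences used when saving registers can be modeled with a tiny Python class. This is only a sketch: the `Stack` class, its method names and the initial sp value of 0x1000 are assumptions made for illustration.

```python
MASK32 = 0xFFFFFFFF


class Stack:
    """Toy model of a downward-growing stack holding 32-bit words."""

    def __init__(self, top=0x1000):
        self.mem = bytearray(top)   # addresses 0 .. top-1
        self.sp = top               # sp starts at the highest address

    def push_word(self, value):
        self.sp -= 4                                    # addi sp, sp, -4
        self.mem[self.sp:self.sp + 4] = (value & MASK32).to_bytes(4, "little")  # sw value, 0(sp)

    def pop_word(self):
        value = int.from_bytes(self.mem[self.sp:self.sp + 4], "little")  # lw value, 0(sp)
        self.sp += 4                                    # addi sp, sp, 4
        return value


stack = Stack()
stack.push_word(11)        # e.g. a caller saving ra
stack.push_word(22)        # e.g. a caller saving a0
print(hex(stack.sp))       # 0xff8: two words below the initial sp
print(stack.pop_word())    # 22: the last value pushed comes back first
print(stack.pop_word())    # 11
print(hex(stack.sp))       # 0x1000
```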
    • System calls

      • Instruction is ECALL.
      • In the emulator, the operation is selected through the t0 register, and the result is returned in the a0 register. E.g.,
      • li t0, 4   # Selects `read integer` syscall
        ecall      # Now the integer is in `a0`
        
      • System calls are:
        • 1 to write integer from a0.
        • 2 to write character from a0.
        • 3 to write string from a0 of length a1.
        • 4 to read integer into a0.
        • 5 to read character into a0.
        • 6 to read string into a0 of length a1.
        • 7 SBRK, to allocate a0 bytes of memory and return the pointer in a0.
      • https://ascslab.org/research/briscv/simulator/simulator.html