@MRobertEvers
Last active May 23, 2022 23:21
Boot PocketBeagle into Bare Metal Code

Introduction

BeagleBoard came out with a new iteration called PocketBeagle in late September 2017. The new Beagle is minimalistic in nature, consisting primarily of an SD Card Reader, a Micro USB Interface, and the Octavo Systems OSD3358 1GHz ARM® Cortex-A8 System-in-Package (SIP).
The getting started page for the BeagleBoard gets you going quickly with Linux. However, those of us looking to learn embedded systems are left high and dry. Resources on using the PocketBeagle/BeagleBoard for "bare metal" programming are far fewer than those for getting started with Linux.
When I bought the PocketBeagle, it was still very new, and there were few resources online for "bare metal" booting/programming. This gist documents some of the problems I encountered while learning how to boot the PocketBeagle into my own code.

The Manual

The manual for the AM335x is 5000+ pages. Be prepared to visit its table of contents often. Despite its length, it does actually contain a lot of information - it's not just fluff. It contains all the information you need to start processing - sometimes it's connecting the ideas that's the hard part.

The Software

Before we can write code to boot, we must get an assembler and compiler for the ARM system. Fortunately, someone else has made a cross-compiler for us. See here for the GNU ARM compiler for any OS. Download the compiler for the right platform, and make sure it is accessible from the command line. TODO: See Windows.

Make is also needed to be successful in this venture. I tried TI's Code Composer Studio (CCS), but I couldn't get it to work easily, and it hid a lot of the details that are good for learning (e.g. automated toolchain configuration). I often find that when things like that don't work, it's harder to figure out how to get them to work than to just do what you wanted to do in the first place. I suggest learning 'make.' It will help smooth this process a little bit. My goal is to eliminate as much of the 'magic' as possible. The GNU Make page can be found here.

Finally, a Hex Editor will come in handy. For this gist at least, we will need to explore some memory spaces with no file system. I suggest HxD.

Part 1: The Boot Process (RAW Mode)

The PocketBeagle is run by the AM3358 TI Sitara chip. The manual contains all the information we need to know to boot into our own code. Since the PocketBeagle's features are few, and one of them is an SD card reader, that is a good place to start. The most natural way is probably the boot method that allows us to use the FAT file system on the SD Card. But this method is still quite tricky, and is described in detail in manual section (26.1.8.5 MMC/SD Cards). I was unsuccessful in getting that method to work. I found that whenever I formatted the SD card on Windows, the SD card still contained a lot of garbage data.
The only other SD card boot option is 'raw' mode. This is the mode that we will explore for now. The starting goal will be to see some sign of life from the board. A convenient way to do that is to turn on the USR LEDs that are already on the board. From the System Reference Manual, the on-board LEDs are the USR0, USR1, USR2, and USR3 LEDs. These LEDs are connected to GPIO pins documented in this table from the reference manual.

LED	Signal Name
USR0	GPIO1_21
USR1	GPIO1_22
USR2	GPIO1_23
USR3	GPIO1_24

Manual section 26.1.8.5.5 MMC/SD Read Sector Procedure in Raw Mode contains the information to boot in raw mode. Here is what we need to do to act on that information.

  1. Write the TOC.
  2. Write the GP Header.
  3. Write YOUR code.
  4. Load it onto the SD card.

Write the Table of Contents (TOC)

The manual describes the TOC data block as a rather magical block. The TOC section (26.1.11 Table of Contents) in the manual describes everything this data block should contain and what the block should look like. Below is an example TOC - it may be an example, but this TOC will work with most (all?) raw boot configurations.

40 00 00 00 0C 00 00 00 00 00 00 00 00 00 00 00  @...............
00 00 00 00 43 48 53 45 54 54 49 4E 47 53 00 00  ....CHSETTINGS..
FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF  ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF  ÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
C1 C0 C0 C0 00 01 00 00 00 00 00 00 00 00 00 00  ÁÀÀÀ............
00 [0x50-0x1FF]...

Notice that bytes at addresses between 0x50 and 0x1FF are all 0. This is part of the TOC definition.
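The byte layout above can also be produced programmatically instead of typed into a hex editor. The following is a minimal host-side sketch in C that fills a 512-byte buffer with exactly the bytes from the dump. The function name build_toc_sector and the SECTOR_SIZE constant are hypothetical, and the field descriptions in the comments are my reading of section 26.1.11, so check them against the manual.

```c
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512u

/* Fill `sector` (512 bytes) with the example TOC from the hex dump above.
 * Hypothetical helper; the field meanings in the comments are an assumption
 * based on section 26.1.11 and should be checked against the manual. */
void build_toc_sector(uint8_t sector[SECTOR_SIZE])
{
    static const uint8_t toc_entry[] = {
        0x40, 0x00, 0x00, 0x00,   /* 0x00: offset of the settings block (0x40) */
        0x0C, 0x00, 0x00, 0x00,   /* 0x04: size of the settings block (0x0C)   */
    };
    static const uint8_t settings[] = {
        0xC1, 0xC0, 0xC0, 0xC0,   /* 0x40..0x43, as in the dump */
        0x00, 0x01, 0x00, 0x00,   /* 0x44..0x47, as in the dump */
    };

    memset(sector, 0x00, SECTOR_SIZE);                  /* 0x50..0x1FF stay 0 */
    memcpy(&sector[0x00], toc_entry, sizeof toc_entry);
    memcpy(&sector[0x14], "CHSETTINGS", 10);            /* 0x1E..0x1F stay 0  */
    memset(&sector[0x20], 0xFF, 0x20);                  /* 0x20..0x3F are FF  */
    memcpy(&sector[0x40], settings, sizeof settings);
}
```

Writing the resulting buffer to a file gives a sector that can be copied to address 0x0 of the SD card with any raw disk writer.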

Now we want to boot into our own code. The location of our 'image' is also defined in the manual (26.1.8.5.5 MMC/SD Read Sector Procedure in Raw Mode). It states "In the case of a GP Device, a Configuration Header (CH) must be located in the first sector, followed by a GP header."

[Image: TOC in HxD]

The GP Header

An SD card acts as a General Purpose (GP) device in this case. We have already seen how to write the TOC, now we just need a GP header. At this point, I was worried because section 26.1.8.5.5 didn't mention anything about an image. Fortunately, the GP Header definition section (26.1.10.2 Image Format for GP Device) includes the location of our image.
The GP Header contains two words of information (8 bytes). The first word is the size of the image - this determines how much of the data on the SD Card is copied into RAM. The second word is the location to put the image in RAM. It is also the location that the ROM boot loader jumps to upon completion, so the first instruction there is the first instruction of our code that is run. This is all laid out in section 26.1.10.2. Our image starts at address 0x208 on the SD Card - just after the 8 byte GP Header.
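To make the two-word layout concrete, here is a small host-side sketch in C that encodes a GP Header into 8 bytes. The helper names put_le32 and build_gp_header are hypothetical. The little-endian byte order matches the header bytes used later in this gist (size 0xFF, load address 0x402F0400).

```c
#include <stdint.h>

#define GP_HEADER_SIZE 8u

/* Store a 32-bit value least significant byte first (little endian). */
void put_le32(uint8_t *dst, uint32_t value)
{
    dst[0] = (uint8_t)(value & 0xFFu);
    dst[1] = (uint8_t)((value >> 8) & 0xFFu);
    dst[2] = (uint8_t)((value >> 16) & 0xFFu);
    dst[3] = (uint8_t)((value >> 24) & 0xFFu);
}

/* Hypothetical helper: word 0 is the image size in bytes, word 1 is the
 * RAM address the image is copied to (and where execution starts). */
void build_gp_header(uint8_t header[GP_HEADER_SIZE],
                     uint32_t image_size, uint32_t load_address)
{
    put_le32(&header[0], image_size);
    put_le32(&header[4], load_address);
}
```

Calling build_gp_header(header, 0xFF, 0x402F0400) yields the bytes FF 00 00 00 00 04 2F 40.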
We're almost ready to run our code on the PocketBeagle. For the sake of example, a snippet of code to turn on the USR built-in LEDs is provided below.

The Image - Our Code

The annotated assembly code below does three things: it enables the clock for GPIO1, sets the GPIO pins to output mode, and then drives those pins high. The loop section is left over from an untested attempt to blink the LEDs. Just ignore that.

.equ GPIO1, 0x4804C000
.equ GPIO_OE, 0x134
.equ CM_PER, 0x44E00000
.equ CM_PER_GPIO1_CLKCTRL, 0xAC
.equ GPIO_DATAOUT, 0x13C

.globl _start

_start:
    ldr r0, =CM_PER                @Clocks control register bus.
    ldr r1, =CM_PER_GPIO1_CLKCTRL  @Offset of the clock register for GPIO1
    add r0, r1
    mov r1, #2                     @Set the enable bit. Man 8.1.12.1.29
    str r1, [r0]
_led_enable:
    ldr r0, =GPIO1     @Register bank for GPIO1
    ldr r1, =GPIO_OE   @Register that controls output enable.
    add r0, r1
    mov r1, #0         @0 means output mode...
    str r1, [r0]
_main:
    ldr r0, =GPIO1
    ldr r1, =GPIO_DATAOUT  @Register that controls the output of GPIO1
    add r0, r1
    ldr r1, =0xFFFFFFFF    @Drive every GPIO1 pin high, including the USR LEDs
    str r1, [r0]
loop:
    ldr r2, =0x00010000     @The start of a loop that may or may not work.
    sub r2, #1              @I tried to make the USR LEDs blink, but I posted
    cmp r2, #0              @this before I tested it.
    beq _main
    mov r1, #0
    str r1, [r0]
_hang:
    b _main @We don't want the processor to run into uncontrolled memory. So loop here.

TODO: Document manual pages.
Before compiling, note the main points in this code.

  1. Between _start and _led_enable, the GPIO1 'interface' clock is turned on. Section 8.1.6.1 describes interface clocks: a module's interface clock must be enabled before the module is used, otherwise its registers can't be set. This applies whenever any module is initialized. The registers that control these clocks are described in section 8.1.12. Knowing which clock register to use is a bit cryptic to me. The name CM_PER_GPIO1_CLKCTRL loosely translates to Clock Module_Peripheral_GPIO1_Clock Control. I haven't been able to verify this 100%, but the Clock Module 'domain' (i.e. whether to use CM_PER, CM_WKUP, etc.) appears to be tied to the power domain that the module is on. There isn't a table that lists which module is on which clock domain, but there is a table that ties each module to a power domain (section 8.1.11), and the power domains seem to share a naming convention with the clock domains. Thus, modules on the Peripheral (PER) power domain are controlled by the CM_PER registers, and modules on the Wakeup power domain are controlled by the CM_WKUP registers (...or so it seems). Follow up: it looks like the clock domain is stated in each module's own manual section. Take a look at the UART0 connectivity attributes. However, it looks like those signal names (in the UART section) are outdated.
  2. This is more of a note. Often, modules will use pins that are used for multiplexing. When that is the case, the pin must be configured to the mode that we want. See here. Note that these registers correspond to the ball map pins from section 5.1 of the OSD335x Datasheet. See the System Reference manual, section 7.1.1, for Pin Muxing Mode Assignments. Generally speaking, the TRM refers to the Pin by its Mode 1 name. If we look at a Pin Usage section for any module, we can cross reference the names in the Module's Pinout table, to the names in the pinout table linked in the system reference manual. By default, each pin is configured to mode 0.
  3. The GPIO pins can be set to input or output. Since we want to drive the onboard LEDs, the pins must be set to output. To do this, the GPIO_OE bits must be set to 0 for the pins that are outputs: 1 is input, 0 is output. See the GPIO_OE Register.
  4. Finally, we can set the output and see a result...
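The assembly above writes whole registers: 0 to GPIO_OE and 0xFFFFFFFF to GPIO_DATAOUT, which affects all 32 pins of GPIO1. As a sketch of a narrower approach, the C below computes the register addresses and the values needed to touch only the four USR LED bits (GPIO1_21 to GPIO1_24). The mask and helper names (USR_LED_MASK, oe_with_outputs, dataout_with_high) are hypothetical, and the read-modify-write style is an assumption of this sketch, not what the gist's code does.

```c
#include <stdint.h>

/* Base addresses and offsets, same values as the .equ lines above. */
#define CM_PER               0x44E00000u
#define CM_PER_GPIO1_CLKCTRL 0xACu
#define GPIO1                0x4804C000u
#define GPIO_OE              0x134u
#define GPIO_DATAOUT         0x13Cu

/* USR0..USR3 are on GPIO1_21..GPIO1_24, i.e. bits 21..24 (hypothetical names). */
#define USR_LED_FIRST_BIT 21u
#define USR_LED_MASK      (0xFu << USR_LED_FIRST_BIT)

/* New GPIO_OE value: clear the bits of the pins that should become outputs
 * (0 = output, 1 = input); all other pins keep their direction. */
uint32_t oe_with_outputs(uint32_t current_oe, uint32_t pin_mask)
{
    return current_oe & ~pin_mask;
}

/* New GPIO_DATAOUT value: set the bits of the pins that should be driven high. */
uint32_t dataout_with_high(uint32_t current_out, uint32_t pin_mask)
{
    return current_out | pin_mask;
}
```

Here CM_PER + CM_PER_GPIO1_CLKCTRL is the address the first block of assembly writes to, and GPIO1 + GPIO_OE and GPIO1 + GPIO_DATAOUT are the addresses used by the second and third blocks.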

This code can be compiled with the following make file. Note here that the above code is assumed to be in a boot.asm file.

ARMGNU = arm-none-eabi

boot.bin: boot.asm
    $(ARMGNU)-as boot.asm -o boot.o
    $(ARMGNU)-ld -T linker.ld boot.o -o boot.elf
    $(ARMGNU)-objdump -D boot.elf > boot.list
    $(ARMGNU)-objcopy boot.elf -O srec boot.srec
    $(ARMGNU)-objcopy boot.elf -O binary boot.bin

Make will call the arm-none-eabi GNU assembler and linker to assemble and link this code. See the software section for a link. The output we care about now is the boot.bin file. This contains the binary code of the assembly above. The produced code should look like this (in Hex):

50 00 9F E5 AC 10 A0 E3 01 00 80 E0 02 10 A0 E3 
00 10 80 E5 40 00 9F E5 4D 1F A0 E3 01 00 80 E0 
00 10 A0 E3 00 10 80 E5 2C 00 9F E5 4F 1F A0 E3 
01 00 80 E0 00 10 E0 E3 00 10 80 E5 01 28 A0 E3
01 20 42 E2 00 00 52 E3 F6 FF FF 0A 00 10 A0 E3
00 10 80 E5 F3 FF FF EA 00 00 E0 44 00 C0 04 48

The code produced is 0x60 bytes long (six rows of 16 bytes above), so in our GP Header we need to state that the image size is at least 0x60. I have stated that the image was 0xFF bytes long for convenience. Finally, our GP Header looks like this:

FF 00 00 00 00 04 2F 40

The second word (bytes 4-7) contains the address in memory to which the image will be copied. This word is little endian; the value is 0x402F0400. This number is the beginning of "Public RAM" on the AM335x chip. Manual section 26.1.4.2 Public RAM Memory Map has information on that. The GP Header simply indicates that the image should be copied to the first public RAM address. Finally, putting the image and GP Header together:

FF 00 00 00 00 04 2F 40 50 00 9F E5 AC 10 A0 E3 
01 00 80 E0 02 10 A0 E3 00 10 80 E5 40 00 9F E5
4D 1F A0 E3 01 00 80 E0 00 10 A0 E3 00 10 80 E5 
2C 00 9F E5 4F 1F A0 E3 01 00 80 E0 00 10 E0 E3
00 10 80 E5 01 28 A0 E3 01 20 42 E2 00 00 52 E3 
F6 FF FF 0A 00 10 A0 E3 00 10 80 E5 F3 FF FF EA 
00 00 E0 44 00 C0 04 48 00 00 00 00 00 00 00 00

An SD Card with this data at address 0x200, and the 512 byte TOC at address 0x0, will light up the PocketBeagle's USR LEDs when booted. Below is an image of this code at address 0x200, written to the SD Card using HxD.
[Image: GP Header and image in HxD]
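The on-card layout described above (TOC at 0x0, GP Header at 0x200, image at 0x208) can be sketched as a host-side C function that assembles everything into one buffer, which could then be written to the start of the card. The names layout_raw_boot and store_le32 are hypothetical. This sketch also writes the actual image length into the size word rather than a rounded-up value such as 0xFF.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE      512u
#define GP_HEADER_OFFSET 0x200u   /* second sector of the card       */
#define IMAGE_OFFSET     0x208u   /* right after the 8-byte GP Header */

/* Store a 32-bit value least significant byte first (little endian). */
void store_le32(uint8_t *dst, uint32_t value)
{
    dst[0] = (uint8_t)(value & 0xFFu);
    dst[1] = (uint8_t)((value >> 8) & 0xFFu);
    dst[2] = (uint8_t)((value >> 16) & 0xFFu);
    dst[3] = (uint8_t)((value >> 24) & 0xFFu);
}

/* Hypothetical helper: copy the TOC sector, GP Header and image into `card`.
 * Returns the number of bytes used, or 0 if `card` is too small. */
size_t layout_raw_boot(uint8_t *card, size_t card_len,
                       const uint8_t toc[SECTOR_SIZE],
                       const uint8_t *image, uint32_t image_len,
                       uint32_t load_address)
{
    size_t total = (size_t)IMAGE_OFFSET + image_len;

    if (card_len < total) {
        return 0;
    }
    memcpy(card, toc, SECTOR_SIZE);                          /* 0x000: TOC      */
    store_le32(card + GP_HEADER_OFFSET, image_len);          /* 0x200: size     */
    store_le32(card + GP_HEADER_OFFSET + 4, load_address);   /* 0x204: address  */
    memcpy(card + IMAGE_OFFSET, image, image_len);           /* 0x208: our code */
    return total;
}
```

With the contents of boot.bin as the image and 0x402F0400 as the load address, the buffer holds the same layout that was written by hand with HxD.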

Part 2: Moving Up - The C Toolchain

Generating binaries with the GNU C Toolchain comes with some "gotchas." (The C compiler comes with the GNU ARM Toolchain) For starters, it generates extra data/code that will not work on the PocketBeagle (Read: AM335x). It's tricky but not too difficult to configure the C toolchain to produce a simple binary file that will run on the PocketBeagle.

The Entry Point

Ideally, we want the binary's entry point to be the first instruction. This makes it a simple matter to load the binary onto the SD Card and boot it.

A first attempt to do this might look like this.

@ system.s
.global Register_Write

@ Register_Write
@ Arg 1 r0: Register Bank
@ Arg 2 r1: Register Offset
@ Arg 3 r2: Word Value to Write
Register_Write:
  add r0, r1
  mov r1, r2
  str r1, [r0]
  bx  lr


// main.c
#define CM_PER 0x44E00000
#define GPIO1 0x4804C000

#define CM_PER_GPIO1_CLKCTRL 0xAC
#define GPIO_OE 0x134
#define GPIO_DATAOUT 0x13C

void Register_Write(unsigned int, unsigned int, unsigned int);

int main(void)
{
  Register_Write(CM_PER, CM_PER_GPIO1_CLKCTRL, 2);
  Register_Write(GPIO1, GPIO_OE, 0);
  Register_Write(GPIO1, GPIO_DATAOUT, 0xFFFFFFFF);
  while(1);
}

with the following make file...

ARMGNU = arm-none-eabi

main.bin: system.o main.o 
  $(ARMGNU)-ld  system.o main.o -o main.bin

system.o: system.s
  $(ARMGNU)-as system.s -c -o system.o

main.o: main.c
  $(ARMGNU)-gcc main.c -c -o main.o

Unfortunately, this doesn't work for a few reasons. The PocketBeagle will look for the entry point of our code at the first address of the loaded image. Additionally, the C toolchain adds extra binary data that is normally used by operating systems and for things like exception handling. We need to remove this data before we can load the image onto the PocketBeagle. An example of the extra data is shown later, in The Executable Code.

The objdump of the output reveals that our entry point is not the first instruction. This depends on how we write our makefile (here, the order of the object files passed to the linker), but it would be nice if it didn't.

00008000 <Register_Write>:
    8000:	e0800001 	add	r0, r0, r1
    8004:	e1a01002 	mov	r1, r2
    8008:	e5801000 	str	r1, [r0]
    800c:	e12fff1e 	bx	lr

00008010 <main>:
    8010:	e92d4800 	push	{fp, lr}
    8014:	e28db004 	add	fp, sp, #4
    8018:	e3a02002 	mov	r2, #2
    801c:	e3a010ac 	mov	r1, #172	; 0xac
    8020:	e59f0024 	ldr	r0, [pc, #36]	; 804c <main+0x3c>
    8024:	ebfffff5 	bl	8000 <Register_Write>
    8028:	e3a02000 	mov	r2, #0
    802c:	e3a01f4d 	mov	r1, #308	; 0x134
    8030:	e59f0018 	ldr	r0, [pc, #24]	; 8050 <main+0x40>
    8034:	ebfffff1 	bl	8000 <Register_Write>
    8038:	e3e02000 	mvn	r2, #0
    803c:	e3a01f4f 	mov	r1, #316	; 0x13c
    8040:	e59f0008 	ldr	r0, [pc, #8]	; 8050 <main+0x40>
    8044:	ebffffed 	bl	8000 <Register_Write>
    8048:	eafffffe 	b	8048 <main+0x38>
    804c:	44e00000 	strbtmi	r0, [r0], #0
    8050:	4804c000 	stmdami	r4, {lr, pc}

The GNU linker can take a linker script file that can order the binary in the right way. The code below is in a linker.ld file.

MEMORY
{
    ram : ORIGIN = 0x402F0400, LENGTH = 0x10000
}

ENTRY(main)
SECTIONS
{
    .text :
    {
        *(.text.main);
        *(.text*)
    } > ram
}

What this does is state that the .text.main section should come before the remaining .text sections. Now we must ensure that the .text.main section contains our entry point. The GNU compiler supports attributes that put a function in a chosen section. This looks like the following.

int main(void) __attribute__ ((section (".text.main")));
int main(void)
{
  Register_Write(CM_PER, CM_PER_GPIO1_CLKCTRL, 2);
  Register_Write(GPIO1, GPIO_OE, 0);
  Register_Write(GPIO1, GPIO_DATAOUT, 0xFFFFFFFF);
  while(1);
}

When this is compiled with a makefile that includes the linker script, main will be put at the beginning. Note that I also indicated the memory start address in the linker script.

ARMGNU = arm-none-eabi

main.bin: system.o main.o 
  $(ARMGNU)-ld  -T linker.ld system.o main.o -o main.bin
  $(ARMGNU)-objdump -D main.bin > main.list

system.o: system.s
  $(ARMGNU)-as system.s -c -o system.o

main.o: main.c
  $(ARMGNU)-gcc main.c -c -o main.o

Here is what the object dump looks like now.

402f0400 <main>:
402f0400:	e92d4800 	push	{fp, lr}
402f0404:	e28db004 	add	fp, sp, #4
402f0408:	e3a02002 	mov	r2, #2
402f040c:	e3a010ac 	mov	r1, #172	; 0xac
402f0410:	e59f0024 	ldr	r0, [pc, #36]	; 402f043c <main+0x3c>
402f0414:	eb00000a 	bl	402f0444 <Register_Write>
402f0418:	e3a02000 	mov	r2, #0
402f041c:	e3a01f4d 	mov	r1, #308	; 0x134
402f0420:	e59f0018 	ldr	r0, [pc, #24]	; 402f0440 <main+0x40>
402f0424:	eb000006 	bl	402f0444 <Register_Write>
402f0428:	e3e02000 	mvn	r2, #0
402f042c:	e3a01f4f 	mov	r1, #316	; 0x13c
402f0430:	e59f0008 	ldr	r0, [pc, #8]	; 402f0440 <main+0x40>
402f0434:	eb000002 	bl	402f0444 <Register_Write>
402f0438:	eafffffe 	b	402f0438 <main+0x38>
402f043c:	44e00000 	strbtmi	r0, [r0], #0
402f0440:	4804c000 	stmdami	r4, {lr, pc}

402f0444 <Register_Write>:
402f0444:	e0800001 	add	r0, r0, r1
402f0448:	e1a01002 	mov	r1, r2
402f044c:	e5801000 	str	r1, [r0]
402f0450:	e12fff1e 	bx	lr

The Executable Code

The next thing we need to accommodate is the extra code that is added. The two images below show extra code at 0x0 and at about 0x8058.
[Images: ExtraELF, ExtraOSCode]

Isolating just the executable code is only a matter of using objcopy. See the make file below with $(ARMGNU)-objcopy -O binary main.elf main.bin. This command extracts the binary code from the linked ELF file.

ARMGNU = arm-none-eabi

main.bin: system.o main.o 
  $(ARMGNU)-ld  -T linker.ld system.o main.o -o main.elf
  $(ARMGNU)-objcopy -O binary main.elf main.bin
  $(ARMGNU)-objdump -D main.elf > main.list

system.o: system.s
  $(ARMGNU)-as system.s -c -o system.o

main.o: main.c
  $(ARMGNU)-gcc main.c -c -o main.o

Now the binary produced can be placed at address 0x208 on the SD Card. With the appropriate modification of the size word of the GP Header, the code will execute and turn on the USR LEDs.
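As a possible next step, the Register_Write helper in system.s could be written in C as well, which would remove the need for a separate assembly file. The sketch below is one common way to do that with a volatile pointer. The name register_write_c is hypothetical, and I have not booted this variant on the board, so treat it as an assumption rather than a tested replacement.

```c
#include <stdint.h>

/* Write `value` to the 32-bit register at address `bank + offset`.
 * `volatile` tells the compiler that the store has a side effect and must
 * not be optimized away or merged with other stores. */
void register_write_c(uintptr_t bank, uintptr_t offset, uint32_t value)
{
    volatile uint32_t *reg = (volatile uint32_t *)(bank + offset);
    *reg = value;
}
```

On the board, a call such as register_write_c(GPIO1, GPIO_DATAOUT, 0xFFFFFFFF) would take the place of the corresponding Register_Write call in main.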
