@benjamindoron
Last active September 12, 2022 17:42

GSoC 2022 projects

S3 resume implementation

Overview

I successfully implemented firmware support for resume from S3 sleep on MinPlatform, mentored by Nate DeSimone and Ankit Sinha. It suspends from and resumes to an operating system on my Acer Aspire VN7-572G (Skylake). The board-specific code is presently implemented only for KabylakeOpenBoardPkg, but I've attempted to keep the implementation as silicon-agnostic as possible. Once one last straightforward bug is resolved (involving the conditions for detecting the power state), it will be entirely ready for daily use. I performed my testing with Fedora 34, with no relevant modifications.

Other platforms are expected to require more work than toggling the S3 feature PCD, and developers may need to debug their port of the KabylakeOpenBoardPkg commit, but this should be quite straightforward. Therefore, I maintain that the implementation is almost entirely generic across at least all Intel client MinPlatform boards presently in the tree.

I primarily use laptops, so power management features are fairly important to me. I actually attempted to enable working S3 resume while I was finalising my board port project for GSoC 2021. While the work performed then was insufficient, the issues uncovered then formed the beginning of my work here.

I hope that implementing this important power management feature, particularly for laptops, helps make MinPlatform a more competitive open-source firmware.

Work performed

The primary work-product for this project is the following patch-set (v3):

I also wrote some assorted improvements to other code. The below have been merged:

The below have not been merged:

Challenges

  • MemoryInit() and BaseMemoryTest() failures: After further development and enabling the HDMI debug port, the issue went away. It's unclear what corrected this, but the implementation was very WIP at this time.
    • Furthermore, the SPI log becomes quite corrupted. After further thought, I've realised that it's not a true ringbuffer: it neither seeks past valid text from previous boots nor always calls Spi2Ppi->FlashErase(). NOR flash programming can only clear bits (1 to 0), and only an erase sets them back to 1, so new text written over old text cannot flip every bit it needs to.
  • Confirm BootMode using PMC WAK_STS: Potentially related to the above, the PMC's 'last wake' registers should be validated against WAK_STS, which indicates whether this boot is a wake-up at all. Otherwise, an incorrect BootMode may be reported. Use a silicon library function, such as GetSleepTypeAfterWakeup(), rather than reading only the wake type register.
  • Install valid memory on S3 flow: As of last year, I was aware that MinPlatform's FspWrapperHobProcessLib erroneously does not support the S3 resume flow. Therefore, memory is allocated on cold boots for consumption on S3 resume.
  • Implement full SmmAccess and SmmControl support for SMM communication use: As of last year, I was aware that at least true SmmAccess support is required for LockBox and security lockdown. SMM communication will also require SmmControl, potentially optional but assumed necessary for dispatching callbacks.
  • All FSP HOBs are only installed after FSP-S: S3Pei now defers installing SmmAccess and SmmControl until notification of FspSiliconInitDonePpi.
  • Applying boot CPU structures (GDT, IDT, MTRRs): As of last year, I was aware that FSP can perform this task, but that the UPDs cannot be set as we do not write the GUID-ed structure in SMRAM. It turns out that this structure is produced by CpuInitDxe in closed-source code. Therefore, to stay as silicon-agnostic as possible, CpuS3DataDxe is included so that PiSmmCpuDxeSmm can perform this task.
  • On BootScriptExecutorDxe: This module is a special case: It's a DXE_DRIVER, but is by definition guaranteed to perform primary execution after end-of-BS. This apparent contradiction is enabled and complicated by copying the driver to the LockBox. The debug library stack is a particularly common target for modification by developers, so care must be taken to satisfy this module's special requirements. More details below.
    • Some background:
      • DebugLibReportStatusCode cannot be used for debugging after end-of-BS, as the serial port RSC handler is uninstalled. See here. So, DebugLibSerialPort is preferred here.
        • As an aside, I now better understand why DebugLibSerialPort often cannot be linked to a phase core: AutoGen, which orders all library constructors, places the debug constructor before all others (such as the phase-specific service table pointer libraries). This library stack is therefore incompatible with a SerialPortInitialize() that uses such services.
      • BootScriptExecutorDxe is copied into the LockBox at DxeSmmReadyToLock. While it seems as though a boolean to track end-of-BS in a SerialPortLib that uses gBS is responsible practice, the event that fires will not patch the correct data section. This was interesting to debug, because code can appear valid, but contain bugs due to not fully comprehending the implications of boot flow. I attempted to raise this on the mailing-list, but ultimately resolved it myself, here.
    • Ankit and I discussed this, and as I expected, there doesn't seem to be a way around it: this module is part of the root-of-trust and must be protected for the integrity of platform security. It must be impossible to patch this module in memory, which executes before platform lockdown. So, it must be copied into the LockBox before untrusted code is dispatched.
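The WAK_STS check described in the challenges above can be sketched as follows. This is a minimal illustration in Python rather than firmware code: the bit positions follow the ACPI PM1 register layout (WAK_STS is bit 15 of PM1_STS, SLP_TYP is bits 12:10 of PM1_CNT), while the S3 encoding of 5 and the function and return names are assumptions for illustration, not taken from the patch-set. On real hardware, the silicon library function should be used instead.

```python
# Sketch only: register values are passed in as plain integers.
WAK_STS = 1 << 15       # PM1_STS bit 15: set when the system woke from a sleep state
SLP_TYP_SHIFT = 10      # PM1_CNT bits 12:10: last requested sleep type
SLP_TYP_MASK = 0x7
SLP_TYP_S3 = 0x5        # assumed encoding for suspend-to-RAM; check the datasheet


def boot_mode(pm1_sts: int, pm1_cnt: int) -> str:
    """Return 's3_resume' only if the sleep type is S3 AND WAK_STS confirms a wake."""
    if not (pm1_sts & WAK_STS):
        # A stale SLP_TYP without WAK_STS must not be trusted.
        return "full_boot"
    slp_typ = (pm1_cnt >> SLP_TYP_SHIFT) & SLP_TYP_MASK
    return "s3_resume" if slp_typ == SLP_TYP_S3 else "full_boot"
```

The point of the design is that the sleep type field alone is not sufficient: it is only meaningful once WAK_STS confirms that a wake event actually occurred.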

After all these challenges were addressed, the system suspended and resumed to an operating system successfully for the first time.

Replication

  • For a Kabylake board in the tree, enable gS3FeaturePkgTokenSpaceGuid.PcdS3FeatureEnable. Otherwise, first port the KabylakeOpenBoardPkg commit to your board.
  • Build, then flash an image.
  • Recovering logs on the Aspire VN7-572G:
    • USE_HDMI_DEBUG_PORT: Follow the instructions below to get set-up, then connect with a terminal. I use GNU screen, though PuTTY is another good option.
    • The USE_PEI_SPI_LOGGING log is corrupted, and USE_MEMORY_LOGGING does not support S3 resume.

Bugs/Work for the future

Debugging stacks

  • SPI flash log: Implement true ringbuffer and Spi2Ppi->FlashErase() as required.
  • In-memory debug log: Implement support for S3 resume and merge upstream.
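To illustrate why the SPI log needs both behaviours, here is a small Python model of NOR flash programming. It is a sketch under stated assumptions: the helper names are hypothetical, the "flash" is a bytearray, and the log is assumed to be ASCII text so that an erased byte (0xFF) never appears inside valid log data.

```python
ERASED = 0xFF  # state of every NOR flash byte after an erase


def nor_program(cell: int, value: int) -> int:
    """Programming can only clear bits (1 -> 0); only an erase sets them to 1."""
    return cell & value


def write_log(flash: bytearray, offset: int, text: bytes) -> None:
    """Program text into flash at offset, without erasing first."""
    for i, byte in enumerate(text):
        flash[offset + i] = nor_program(flash[offset + i], byte)


def next_free_offset(flash: bytearray) -> int:
    """Seek past text from previous boots to the first erased byte."""
    for i, byte in enumerate(flash):
        if byte == ERASED:
            return i
    return len(flash)  # full: an erase is needed before wrapping around


flash = bytearray([ERASED] * 16)
write_log(flash, 0, b"boot1")

# Writing over old text without an erase yields a mix of both messages.
overwritten = bytearray(flash)
write_log(overwritten, 0, b"BOOT2")

# Seeking to the first erased byte keeps both messages intact.
write_log(flash, next_free_offset(flash), b"boot2")
```

When `next_free_offset()` reports that the region is full, a true ringbuffer would erase the oldest block before wrapping around to reuse it.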

Complementary debug capabilities

Overview

Initially, my complementary project was enabling a debug port over HDMI's DDC pins (I2C). I anticipated that a fair amount of work would be required in this area, but that was not entirely the case, as Nate published some working code for this task. I spent the community bonding and preparation period planning my approach, reading specifications and code, and writing some of the code I had planned; this was ultimately unnecessary, though still educational.

After resolving HDMI debug-specific challenges, the remainder of my work in this area was performed concurrently, during and specifically for my time working on and debugging the S3 support.

So, right before midterm evaluations, I had finished all my planned work for this summer. Therefore, what else to do but take on another project? Early on, a HII form with some silicon config for all of MinPlatform was considered, but I had written one for my board already and wanted to do something new.

I would ultimately settle on implementing a new debug feature, similar to rescue capsules, that would allow reflashing firmware delivered by a new userspace tool over a serial port, as early as PEI APRIORI. Together with the HDMI debug port, this enables an improved debug flow, far superior to in-memory logging: flash broken image, retrieve logs, reboot and reflash a new image. With more work, this will be a competitive alternative to flashing externally, which will minimise tedium and increase the developer's productivity.

Work performed

Challenges

  • Set-up a Bus Pirate: The Bus Pirate can be bricked easily while attempting to upgrade the bootloader if power is removed before writing the flash pages. I recommend having a PIC programmer available in-case recovery is necessary: I have a PICkit2 clone and a PICkit4 for this task.
  • On HDMI's Hot-Plug Detect pin: In theory, only the SCL, SDA and GND pins are required for this task. In practice, I failed to detect a signal with a logic analyser despite a multimeter reporting good continuity. Discussing the issue with my mentors, and recalling the Parade level-shifter interposed between the port and PCH on my board, revealed that the HPD pin can be crucially important. I then found that datasheets from Parade commonly indicate that HPD signals their firmware. Referring to the HDMI specification, I resolved the issue with a 1 kOhm resistor between HPD and 5V.
  • On serial FIFOs: It seems obvious now, but the interactive console cannot be opened while the userspace tool makes a connection, per the definition of FIFO.
  • Packetising: If the implementation layer cannot transfer 4K at once, then the userspace tool must not either. To keep each side synchronised, more wait_for_ack_on() calls are used. Presently, the tool uses arguments to determine which code flow modifications to apply.
  • On differing baudrates: While packets transfer at 115200 baud from the host, the board must not attempt to read the serial port. Therefore, delays are implemented both between packets in WriteBlock() and also before reading the command packet, in-case the block number is being flushed more slowly through the host.
  • XIP code:
    • Some background: To be capable of recovering from as many bugs as possible, this PEIM should execute as early as possible: in PEI APRIORI, using cache-as-RAM (note that SEC might be possible). This early in boot, execution is in-place, backed by SPI MMIO. Writing over our own blocks risks misbehaviour as code pointed at by EIP is inevitably overwritten, and I don't know if other CAR guarantees (implicit caching) mitigate this. Therefore, the PEIM presently copies itself into NEM for execution, but these assumptions should be reviewed too.
    • However, during no-evict mode, cache coherency is by definition not maintained. Consequently, it's quite plausible that L1i and L3 (CAR) become out of sync. This results in the CPU throwing #UD for valid code and instructions loaded into CAR. As has been explained to me, this is because uOps in L1i do not match instructions in CAR. Therefore, the PEIM attempts to saturate the L1 caches so that prior contents are demoted to L2; any access to CAR will then cache-miss L1. This diagnosis is supported by the fact that the issue does not occur in DRAM.
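The packetising challenge above can be sketched as follows. This is a minimal Python illustration, not the actual userspace tool: `send` and `wait_for_ack` are hypothetical callbacks standing in for the serial write and for the tool's wait_for_ack_on() calls, whose real signatures are not shown here, and the packet size is an assumed parameter.

```python
from typing import Callable, Iterator

BLOCK_SIZE = 4096  # one 4K flash block


def packetise(block: bytes, max_packet: int) -> Iterator[bytes]:
    """Split a block into packets no larger than the transport can carry."""
    for start in range(0, len(block), max_packet):
        yield block[start:start + max_packet]


def send_block(block: bytes, max_packet: int,
               send: Callable[[bytes], None],
               wait_for_ack: Callable[[], None]) -> int:
    """Send one block packet by packet, waiting for an ACK after each one.

    Returns the number of packets sent.
    """
    count = 0
    for packet in packetise(block, max_packet):
        send(packet)
        wait_for_ack()  # keeps both sides synchronised before the next packet
        count += 1
    return count
```

Waiting for an acknowledgement after every packet trades throughput for synchronisation, which matters when the board side cannot buffer a whole block.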

After these challenges were addressed, the shell app could repeatedly and successfully flash the SPI chip. At the time of writing, the rescue PEIM can start, but I suspect it requires the disabling of DEBUG() to prevent interference on the serial transport that leads to lost commands.

Replication

Set-up a Bus Pirate and HDMI debug cable. The guide here can largely be followed, though mind the challenges section above. I:

  • Soldered wire from a spool to the exposed DDC pins (SCL, SDA and GND) on the HDMI breakout board and soldered each of these to jumper wires, for convenience.
  • Soldered a 1 kOhm resistor between HPD and 5V. Potentially optional, but required on my board. I recommend that others examine their board's schematics.
  • Applied heat shrink for convenience and durability.
  • Also consider using test leads. Soldering might not be required.

Either build FlashRescueBoardApp from the new repo, or build an image with FlashRescueBoardPei. When prompted, disconnect the terminal and execute the userspace tool.

Bugs/Work for the future

  • Port the Bus Pirate implementation: Native support for I2C targets offers major speed improvements over bit-banged emulation. An SBC such as the Pi, an ARM dev board (such as those by Adafruit, with Cortex M-based SoCs), or an FPGA are all potential candidates.
  • Consider implementing a "patch table": With the Bus Pirate as the underlying implementation layer, a sequence of 5 bytes corresponding to "F12" will disconnect the emulated terminal from the board. It's unlikely, but possible, for a BIOS image to contain this sequence. Scanning the binary for such sequences and implementing either a patch command, or entries in the structure, will make the protocol more reliable in this case.
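One possible shape for such a patch table is sketched below in Python. Everything here is an assumption for illustration: the function names are hypothetical, the exact 5-byte sequence is passed in as a parameter (the usage example uses the common VT-style encoding of F12, ESC [ 2 4 ~, which may not be what the Bus Pirate matches), and the table format of (offset, original byte) pairs is invented for this sketch.

```python
from typing import List, Tuple


def build_patch_table(image: bytes, pattern: bytes,
                      filler: int = 0xFF) -> Tuple[bytes, List[Tuple[int, int]]]:
    """Break every occurrence of pattern by replacing its last byte with filler.

    Returns the wire-safe image and a table of (offset, original_byte) entries
    that the receiver can use to restore the original contents.
    """
    if filler in pattern:
        raise ValueError("filler must not be a byte of the pattern")
    safe = bytearray(image)
    table: List[Tuple[int, int]] = []
    pos = safe.find(pattern)
    while pos != -1:
        last = pos + len(pattern) - 1
        table.append((last, safe[last]))
        safe[last] = filler  # filler is not in pattern, so no new match can form
        pos = safe.find(pattern, pos + 1)
    return bytes(safe), table


def apply_patch_table(safe: bytes, table: List[Tuple[int, int]]) -> bytes:
    """Receiver side: restore the original bytes after the transfer."""
    restored = bytearray(safe)
    for offset, original in table:
        restored[offset] = original
    return bytes(restored)


F12_ASSUMED = b"\x1b[24~"  # assumed escape sequence, for illustration only
image = b"AA" + F12_ASSUMED + b"BB" + F12_ASSUMED
safe_image, patches = build_patch_table(image, F12_ASSUMED)
```

The table itself would have to be sent in a form that cannot contain the sequence either, for example as separate offset and value fields.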