@smccarney
Last active December 4, 2023 23:02
Design for improved power sequencer pgood failure isolation

High level design of pgood isolation enhancements

Contents

  • Obtain pgood status for rails connected to analog pins
  • Enhance JSON file format
  • Enhance pgood failure isolation algorithm for UCD 90XXX devices
  • Update existing JSON files to conform to the new syntax

Obtain pgood status for rails connected to analog pins

  • The UCD90160 and UCD90320 devices support connecting a voltage rail to two different types of GPIO pins.
    • Analog monitor pin. The UCD reads a voltage level and compares it to a pgood range. The pin is normally connected to the voltage output from a voltage regulator. These pins cannot be read directly from firmware.
    • Digital monitor pin. The UCD reads a simple 0/1 value from this pin to determine the pgood status. The pin is normally connected to a pgood signal (e.g. VR_RDY) from the voltage regulator. These pins can be read directly from firmware.
  • The power sequencer application currently does not obtain the pgood value for analog monitor pins. This causes pgood isolation to fail when a fault occurs during power on. The UCD90XXX device does not report the pgood failure using the expected PMBus status registers.
  • Obtain the state of the analog pins indirectly by comparing the rail output voltage to the critical undervoltage and overvoltage thresholds.
    • The output voltage is available over PMBus using the READ_VOUT command. The UCD device driver makes this value available in a sysfs file named in[0-*]_input.
    • The UCD defines critical undervoltage and overvoltage levels for each rail. Those levels are available in the PMBus commands VOUT_UV_FAULT_LIMIT and VOUT_OV_FAULT_LIMIT. The UCD device driver makes these values available in sysfs files named in[0-*]_lcrit and in[0-*]_crit.
      • Note: the driver also produces in[0-*]_lcrit_alarm and in[0-*]_crit_alarm files. These contain a 0 or 1 that is supposed to indicate whether the output voltage is out of bounds. However, the driver does not set them during boot because the corresponding bits must also be set in STATUS_VOUT, and the UCD does not set those bits.
    • The * portion of the preceding sysfs file names is not the PMBus page number. To find the page, read the corresponding in[0-*]_label file. The UCD driver sets the file contents to a string like vout14. The numeric suffix is the PMBus page + 1, so vout14 corresponds to page 13.
    • If the output voltage is below the undervoltage limit or above the overvoltage limit, the rail is considered to have a pgood value of false.
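
The label-to-page mapping and the limit comparison described above can be sketched as follows. This is a minimal illustration, not the application's actual code: the function names (label_to_page, build_page_map, voltage_pgood) and the hwmon_dir parameter are hypothetical, and the sketch assumes the input, lcrit, and crit files for a rail hold integers in the same unit.

```python
import re
from pathlib import Path
from typing import Dict, Optional


def label_to_page(label: str) -> Optional[int]:
    """Convert label file contents such as 'vout14' to a PMBus page (13)."""
    match = re.fullmatch(r"vout(\d+)", label.strip())
    if match is None:
        return None  # not an output voltage label
    return int(match.group(1)) - 1  # numeric suffix is page + 1


def build_page_map(hwmon_dir: Path) -> Dict[int, str]:
    """Map each PMBus page to its sysfs file prefix (e.g. 13 -> 'in15')."""
    pages: Dict[int, str] = {}
    for label_file in hwmon_dir.glob("in*_label"):
        page = label_to_page(label_file.read_text())
        if page is not None:
            pages[page] = label_file.name[: -len("_label")]
    return pages


def voltage_pgood(hwmon_dir: Path, prefix: str) -> bool:
    """Return False if the output voltage is below lcrit or above crit."""

    def read_value(suffix: str) -> int:
        return int((hwmon_dir / f"{prefix}_{suffix}").read_text().strip())

    vout = read_value("input")
    return read_value("lcrit") <= vout <= read_value("crit")
```

A caller would build the page map once, look up the prefix for a rail's page, and pass that prefix to voltage_pgood. A voltage exactly equal to a limit is treated as in bounds, matching the "below" and "above" wording above.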

Enhance JSON file format

  • Change file format to be generic rather than specific to UCD90XXX devices.
  • Rail objects/array
    • Add optional integer page property
      • Value is PMBus PAGE number of rail
    • Add optional boolean check_status_vout property
      • Default is false
      • If true, check the value of the STATUS_VOUT register for this rail. If one of the error bits is set, the rail pgood will be considered false.
    • Add optional boolean compare_voltage_to_limits property
      • Default is false
      • If true, compare the voltage output to the critical undervoltage and overvoltage limits. If the voltage is out of bounds, the rail pgood will be considered false.
    • Add optional object gpio property
      • Used to identify a digital GPIO that can be read directly.
      • Required integer line property
        • Value is the libgpiod line offset of the GPIO
      • Optional boolean active_low property
        • Default is false
        • If true, a 0 value indicates pgood is asserted.
  • Change the meaning of the rail order in the array
    • Previously the rail order implied the PMBus PAGE. The first rail was PAGE 0, the second PAGE 1, etc. This caused two problems:
      • Sometimes one or more UCD PAGEs were associated with a non-voltage rail, such as a telemetry data source. These PAGEs should not be checked for pgood errors. However, they could not be omitted from the array because that would disrupt the PAGE numbers.
      • The PAGE order is usually not the same as the power on sequence. The pgood failure isolation algorithm needs to go in power on sequence order rather than PAGE order.
    • Now the rail position in the array will imply the power on sequence order. The first rail in the array will be the first rail in the power on sequence, the second rail the next in the sequence, etc.
  • Pin objects/array
    • Deprecate this array. The distinction between pins and rails is somewhat specific to the UCD90XXX device family. Having both pin and rail objects results in multiple types of duplicate data across several JSON files.
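
To make the new rail semantics concrete, here is a hedged sketch of what a file in the new format might look like, embedded in a short Python snippet that loads it. The rail names, page numbers, GPIO line offset, and inventory path are invented for illustration and do not come from any real system.

```python
import json

# Hypothetical example file. Array position is the power on sequence order;
# the PMBus PAGE is now given explicitly and need not follow that order.
EXAMPLE_CONFIG = """
{
  "rails": [
    {
      "name": "VDD_A",
      "page": 3,
      "check_status_vout": true,
      "compare_voltage_to_limits": true
    },
    {
      "name": "VDD_B",
      "presence": "/hypothetical/inventory/path/cpu1",
      "gpio": { "line": 60, "active_low": true }
    },
    {
      "name": "VDD_C",
      "page": 0,
      "compare_voltage_to_limits": true
    }
  ]
}
"""

config = json.loads(EXAMPLE_CONFIG)
sequence = [rail["name"] for rail in config["rails"]]
pages = [rail.get("page") for rail in config["rails"]]
```

Here VDD_A powers on first even though it is on PAGE 3, and VDD_B has no page at all because its pgood is read from a digital GPIO. Pages that carry telemetry rather than a voltage rail can simply be left out.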

Complete JSON file syntax with old and new properties

  • rails array (rail objects are in power on sequence order)
    • rail object
      • name property
        • String
        • Required
        • Must be unique within the file
      • presence property
        • String
        • Optional
        • Inventory path of a system component which must be present in order for the rail to be present.
      • page property
        • Integer
        • Optional
        • PMBus PAGE number of the rail
      • check_status_vout property
        • Boolean
        • Optional. Default value is false.
        • If true, check the value of the STATUS_VOUT register for this rail. If one of the error bits is set, the rail pgood will be considered false.
      • compare_voltage_to_limits property
        • Boolean
        • Optional. Default value is false.
        • If true, compare the voltage output to the critical undervoltage and overvoltage limits. If the voltage is out of bounds, the rail pgood will be considered false.
      • gpio property
        • Object
        • Optional
        • Used to identify a pgood signal that can be read directly from a digital GPIO.
          • line property
            • Integer
            • Required
            • The libgpiod line offset of the GPIO
          • active_low property
            • Boolean
            • Optional. Default value is false.
            • If true, a 0 value indicates pgood is asserted. If false, a 1 value indicates pgood is asserted.
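
The syntax above, including its required properties and defaults, could be loaded along the following lines. This is a sketch under assumptions: the Rail and Gpio classes and the parse_rails function are hypothetical names, and only the checks stated in the list (required name, unique names, required line) are enforced.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class Gpio:
    line: int                 # required: libgpiod line offset
    active_low: bool = False  # optional, default false


@dataclass
class Rail:
    name: str                                # required, unique within the file
    presence: Optional[str] = None           # optional inventory path
    page: Optional[int] = None               # optional PMBus PAGE number
    check_status_vout: bool = False          # optional, default false
    compare_voltage_to_limits: bool = False  # optional, default false
    gpio: Optional[Gpio] = None              # optional digital pgood signal


def parse_rails(config: Dict[str, Any]) -> List[Rail]:
    """Parse the rails array, preserving power on sequence order."""
    rails: List[Rail] = []
    seen = set()
    for obj in config["rails"]:
        name = obj["name"]  # raises KeyError if the required name is missing
        if name in seen:
            raise ValueError(f"duplicate rail name: {name}")
        seen.add(name)
        gpio = None
        if "gpio" in obj:
            gpio_obj = obj["gpio"]
            # gpio_obj["line"] raises KeyError if the required line is missing
            gpio = Gpio(gpio_obj["line"], gpio_obj.get("active_low", False))
        rails.append(
            Rail(
                name=name,
                presence=obj.get("presence"),
                page=obj.get("page"),
                check_status_vout=obj.get("check_status_vout", False),
                compare_voltage_to_limits=obj.get(
                    "compare_voltage_to_limits", False
                ),
                gpio=gpio,
            )
        )
    return rails
```

Keeping the defaults in the dataclass definitions means a rail that specifies only a name parses to an object with every check disabled, which matches the defaults listed above.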

Enhance pgood failure isolation algorithm for UCD 90XXX devices

  • Loop through all label files in sysfs to build a mapping from label number to PMBus PAGE.
  • Loop through all the rails in power on sequence order
    • If the presence property is defined, verify that the inventory item is present
      • Cache presence within this isolation attempt
    • If page is defined
      • If check_status_vout is true
        • If any error bits are set, identify this rail
      • If compare_voltage_to_limits is true
        • Get output voltage, critical undervoltage, and critical overvoltage values from sysfs files.
        • If output voltage is outside bounds, identify this rail
    • If gpio is defined
      • Read GPIO line value
      • If active_low is true, negate the value
      • If value is false, identify this rail
  • If no rail is identified, log a generic pgood failure error without identifying a specific rail.
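
The loop above can be sketched as a single function. This is an illustrative outline only: find_failed_rail and its callable parameters (is_present, read_status_vout, read_voltages, read_gpio) are hypothetical stand-ins for the real inventory, PMBus, sysfs, and libgpiod access. The sketch also makes three assumptions that the steps above do not spell out: a rail whose component is absent is skipped, isolation stops at the first failing rail, and any nonzero STATUS_VOUT value counts as an error.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple


def find_failed_rail(
    rails: List[Dict[str, Any]],
    is_present: Callable[[str], bool],
    read_status_vout: Callable[[int], int],
    read_voltages: Callable[[int], Tuple[int, int, int]],
    read_gpio: Callable[[int], int],
) -> Optional[str]:
    """Return the name of the first rail with pgood false, or None."""
    presence_cache: Dict[str, bool] = {}  # valid for this attempt only
    for rail in rails:  # rails are in power on sequence order
        path = rail.get("presence")
        if path is not None:
            if path not in presence_cache:
                presence_cache[path] = is_present(path)
            if not presence_cache[path]:
                continue  # assumption: absent rails are not checked

        page = rail.get("page")
        if page is not None:
            if rail.get("check_status_vout", False):
                # assumption: any set bit is treated as an error bit
                if read_status_vout(page) != 0:
                    return rail["name"]
            if rail.get("compare_voltage_to_limits", False):
                vout, lcrit, crit = read_voltages(page)
                if vout < lcrit or vout > crit:
                    return rail["name"]

        gpio = rail.get("gpio")
        if gpio is not None:
            value = bool(read_gpio(gpio["line"]))
            if gpio.get("active_low", False):
                value = not value
            if not value:
                return rail["name"]
    return None  # caller logs a generic pgood failure
```

Passing the hardware accessors as callables keeps the ordering logic testable without a UCD device; a None result corresponds to the final step, where a generic pgood failure is logged.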

Update existing JSON files to conform to the new syntax

  • Sort existing rails in power on sequence order
  • Add new properties as needed
  • Move information from pins array into gpio properties of rail objects
  • Remove rail objects that are not true voltage rails (e.g. telemetry) or that do not contribute to the overall chassis pgood status (e.g. 1.1V on Rainier systems)