@rahujosh
rahujosh / failed_case_job81081_3_detailed.md
Created April 28, 2026 05:22
PR #495 LAVA job 81081 case 3 detailed analysis

PR 495 | Job 81081 | SoC sm8750-mtp | Case 3

Detailed analysis

Summary: The failure is a boot-stage software regression, not an infra/power-control issue. LAVA successfully resets the board and executes fastboot boot, but no Linux banner appears, so auto-login-action times out after 578 seconds. During that wait, the log shows repeated SM8750 boot ROM/SBL/UEFI sequences instead of kernel output, which indicates a boot loop or a failed kernel handoff. The final job failure is a consequence of timeout budgeting after the first fastboot-boot attempt fails.

rahujosh / failed_case_job81081_2_detailed.md
Created April 28, 2026 05:20
PR #495 LAVA job 81081 case 2 detailed analysis

PR 495 | Job 81081 | SoC sm8750-mtp | Case 2

Detailed analysis

Summary: fastboot-boot itself transfers and boots boot.img successfully, but the board never reaches the point where LAVA detects a Linux prompt. The serial log stays in the Qualcomm bootloader stages for sm8750-mtp and shows an AVB verification error on dtbo_a. LAVA then times out in auto-login-action and consumes the full block timeout, so retries are skipped. This points to a deploy/artifact consistency problem on DUT storage, not to worker power-control instability.
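The "retries are skipped" behaviour can be illustrated with a small budget calculation. This is a simplified model and not LAVA's actual implementation: the function name `retries_possible`, the 600-second block timeout, and the 120-second minimum attempt length are all hypothetical values chosen for illustration.

```python
def retries_possible(block_timeout_s, elapsed_s, min_attempt_s):
    """Return how many further attempts fit in the remaining block budget.

    Simplified model: every retry needs at least `min_attempt_s` seconds,
    and all attempts share one block-level timeout.
    """
    if min_attempt_s <= 0:
        raise ValueError("min_attempt_s must be positive")
    remaining = max(0.0, block_timeout_s - elapsed_s)
    return int(remaining // min_attempt_s)


# Hypothetical numbers: the first attempt burns the whole block timeout.
exhausted = retries_possible(block_timeout_s=600, elapsed_s=600, min_attempt_s=120)

# Hypothetical numbers: an early failure leaves room for more attempts.
early_fail = retries_possible(block_timeout_s=600, elapsed_s=200, min_attempt_s=120)

print(exhausted, early_fail)
```

Under this model, an attempt that fails quickly leaves budget for retries, while an attempt that waits out the full block timeout leaves none, which matches the pattern described in this case.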

rahujosh / failed_case_job81081_1_detailed.md
Created April 28, 2026 05:19
PR #495 LAVA job 81081 case 1 detailed analysis

PR 495 | Job 81081 | SoC sm8750-mtp | Case 1

Detailed analysis

Summary: The failing case is an infra boot-flow failure before Linux starts, not a kernel runtime crash. LAVA waits for `Linux version [0-9]`, but the console remains in the Qualcomm bootloader stages and never prints a kernel banner. On sm8750-mtp, boot chain validation flags a dtbo_a integrity mismatch, which blocks normal kernel handoff. TAC debugboard power control succeeded, so the worker/PDU path is healthy.
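A minimal sketch of the banner check, assuming the console log is available as a list of text lines. Only the pattern `Linux version [0-9]` comes from the analysis above; the helper name `first_banner_line` and the sample console lines are hypothetical.

```python
import re

# Pattern quoted in the analysis: LAVA waits for this before auto-login.
KERNEL_BANNER = re.compile(r"Linux version [0-9]")


def first_banner_line(console_lines):
    """Return the index of the first kernel banner line, or None if absent."""
    for index, line in enumerate(console_lines):
        if KERNEL_BANNER.search(line):
            return index
    return None


# Hypothetical console excerpts, not taken from the real job log.
stuck_in_bootloader = [
    "bootloader stage A (placeholder text)",
    "bootloader stage B (placeholder text)",
]
booted = stuck_in_bootloader + ["[    0.000000] Linux version 6.x (placeholder)"]

print(first_banner_line(stuck_in_bootloader), first_banner_line(booted))
```

A `None` result over the whole wait window corresponds to the situation described here: the console never leaves the bootloader stages, so the login action can only time out.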

rahujosh / failed_case_job81078_2_detailed.md
Created April 28, 2026 05:17
PR #495 LAVA job 81078 case 2 detailed analysis

PR 495 | Job 81078 | SoC qcs9100-ride | Case 2

Detailed analysis

Summary: The failed case is not caused by worker power/control infra; execution reaches the late suites and only shmbridge reports FAIL. The failure is tied to the qcs9100 secure monitor interaction (a qcom_scm_* video protection call returning -5) found during boot-log checks. After this subtest failure, the remaining testcases continue, but the run is still marked failed because it was not explicitly closed with ENDRUN. So this is a SoC-specific runtime/firmware-path issue plus a test-runner signaling gap.

rahujosh / failed_case_job81078_1_detailed.md
Created April 28, 2026 05:15
PR #495 LAVA job 81078 case 1 detailed analysis

PR 495 | Job 81078 | SoC qcs9100-ride | Case 1

Detailed analysis

Summary: shmbridge itself does not crash; it fails by log policy. The testcase confirms CONFIG_QCOM_SCM and the presence of qcom_scm in sysfs, then scans the current boot's dmesg, finds repeated Iris SCM errors, and marks FAIL. The errors come from the qcs9100-ride video path (qcom-iris aa00000.video-codec) and occur well before the testcase starts. This indicates a kernel/software integration regression in the secure video memory protection flow, not a LAVA worker infrastructure problem.
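The "fails by log policy" behaviour can be sketched as a dmesg scan that is independent of when the testcase starts. This is an assumed reconstruction, not the shmbridge testcase's real logic: the regular expression, the function name `log_policy_verdict`, and the message text in the sample lines are hypothetical. Only the device string `qcom-iris aa00000.video-codec` and the `-5` return value come from the analysis.

```python
import re

# Assumed policy: any qcom-iris line ending in a negative error code is a hit.
SCM_ERROR = re.compile(r"qcom-iris\s+\S+:.*-\d+\s*$")


def log_policy_verdict(dmesg_lines):
    """Return ("FAIL", hits) if any line matches the policy, else ("PASS", [])."""
    hits = [line for line in dmesg_lines if SCM_ERROR.search(line)]
    return ("FAIL" if hits else "PASS"), hits


# Hypothetical dmesg excerpt; the message wording is a placeholder.
sample = [
    "[    4.812345] qcom-iris aa00000.video-codec: placeholder protect message: -5",
    "[    4.812990] qcom-iris aa00000.video-codec: placeholder protect message: -5",
    "[   61.000100] placeholder: shmbridge testcase started",
]
verdict, hits = log_policy_verdict(sample)
print(verdict, len(hits))
```

Because the scan covers the whole boot log, errors emitted early in boot still fail a testcase that starts much later, which is why the verdict reflects the kernel's state rather than anything the testcase itself did.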

rahujosh / failed_case_job81080_2_detailed.md
Created April 28, 2026 05:13
PR #495 LAVA job 81080 case 2 detailed analysis

PR 495 | Job 81080 | SoC qcs8300-ride | Case 2

Detailed analysis

Summary: This failure is a LAVA run-control/infrastructure issue, not a board power or debugboard failure. The test run starts with a malformed STARTRUN line and later exits without a matching ENDRUN, which triggers LAVA’s “unfinished test run” failure. qcs8300-ride uses serial console-driven control flow, and printk interleaving on that console is the proximate cause. Individual testcases continue to report correctly, which confirms that signal parsing remains partly intact.
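A rough sketch of how STARTRUN/ENDRUN pairing could be checked offline, and how printk interleaving breaks it. It assumes signals appear on one console line in the form `<LAVA_SIGNAL_STARTRUN run_id uuid>`; the function name `check_run_signals`, the run IDs, and the interleaved kernel text are hypothetical, and LAVA's own parser may behave differently.

```python
import re

# Assumed single-line signal format: <LAVA_SIGNAL_STARTRUN run_id uuid>
SIGNAL = re.compile(r"<LAVA_SIGNAL_(STARTRUN|ENDRUN) (\S+) (\S+)>")


def check_run_signals(lines):
    """Return (unfinished_runs, malformed_lines) for a console log."""
    open_runs = set()
    malformed = []
    for line in lines:
        match = SIGNAL.search(line)
        if match:
            kind, run_id, uuid = match.groups()
            if kind == "STARTRUN":
                open_runs.add((run_id, uuid))
            else:
                open_runs.discard((run_id, uuid))
        elif "LAVA_SIGNAL_STARTRUN" in line or "LAVA_SIGNAL_ENDRUN" in line:
            # The marker is present but the line no longer parses cleanly.
            malformed.append(line)
    return sorted(open_runs), malformed


# Hypothetical logs: a clean run, and one with printk text spliced into STARTRUN.
clean = [
    "<LAVA_SIGNAL_STARTRUN 0_suite 1_1.1>",
    "<LAVA_SIGNAL_ENDRUN 0_suite 1_1.1>",
]
interleaved = ["<LAVA_SIGNAL_STARTRUN 0_suite [   42.000000] placeholder printk 1_1.1>"]
missing_end = ["<LAVA_SIGNAL_STARTRUN 0_suite 1_1.1>"]

print(check_run_signals(clean))
print(check_run_signals(interleaved))
print(check_run_signals(missing_end))
```

The interleaved case shows why per-testcase signals can still parse while the run-level signal does not: only the line that kernel output happened to split becomes unreadable.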

rahujosh / failed_case_job81080_1_detailed.md
Created April 28, 2026 05:12
PR #495 LAVA job 81080 case 1 detailed analysis

PR 495 | Job 81080 | SoC qcs8300-ride | Case 1

Detailed analysis

Summary: Probe_Failure_Check failed because it found exactly one probe-related kernel error and no deferred-probe backlog. The matched line is an Aquantia AQR115C probe failure on stmmac-0:08 with -22, which is consistent with the earlier boot log line `failed to read firmware-name: -22`. On qcs8300-ride this is the external Ethernet PHY path, and Ethernet itself is not a blocking functional signal in this run (`Ethernet: skip`). This is therefore a probe-gating issue rather than a symptom of a system crash or regression in core SoC bring-up. The failure is deterministic and tied to Ethernet PHY probe semantics on this platform.
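As an illustration of the probe check, the sketch below extracts probe failures from kernel log lines and decodes the error number with Python's standard `errno` module. The regular expression, the function name `probe_failures`, and the exact wording of the sample line are assumptions; the real Probe_Failure_Check may match different text.

```python
import errno
import re

# Assumed kernel message shapes: "probe of <dev> failed with error -N"
# or "probe with driver <name> failed with error -N".
PROBE_FAIL = re.compile(r"probe (?:of|with driver) .*failed with error (-\d+)")


def probe_failures(lines):
    """Return a list of (error_code, errno_name) for each probe failure line."""
    results = []
    for line in lines:
        match = PROBE_FAIL.search(line)
        if match:
            code = int(match.group(1))
            results.append((code, errno.errorcode.get(-code, "UNKNOWN")))
    return results


# Hypothetical log line built from the device and error named in the analysis.
sample = [
    "[    7.100000] stmmac-0:08: probe with driver Aquantia AQR115C failed with error -22",
    "[    7.200000] placeholder: unrelated message",
]
found = probe_failures(sample)
print(found)
```

Decoding the number makes the link to the earlier firmware-name message easier to read: error 22 is `EINVAL` (invalid argument) in the standard errno table.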

rahujosh / failed_case_job81086_2_detailed.md
Created April 28, 2026 05:09
PR #495 LAVA job 81086 case 2 detailed analysis

PR 495 | Job 81086 | SoC qcs6490 | Case 2

Detailed analysis

Summary: This is primarily an infra/lab dependency failure in the qcs6490 board flow, not a board power-control or dispatcher crash. The suite executes normally, but USBHost reports no enumerated USB devices and records a fail. After testcase collection, LAVA reports `Marking unfinished test run as failed` because only STARTRUN is seen and no matching ENDRUN is emitted. Power-off/finalize via `debugboard.py --board alpaca` succeeds, which further points away from worker connectivity/PDU issues.

rahujosh / failed_case_job81086_1_detailed.md
Created April 28, 2026 05:07
PR #495 LAVA job 81086 case 1 detailed analysis

PR 495 | Job 81086 | SoC qcs6490 | Case 1

Detailed analysis

Summary: USBHost fails immediately after testcase start with “No USB devices found,” with no preceding kernel crash tied to this case. The same run shows PCIe passing just before and other suites continuing, which indicates the target stayed healthy and the failure is localized to USBHost setup/enumeration. Kernel boot logs include USB core/controller bring-up on RB3gen2, so this is not a full USB stack bring-up failure. For qcs6490, this pattern is most consistent with missing lab-side attach/mux state for a physical host-test USB device.

rahujosh / failed_case_job81084_4_detailed.md
Created April 28, 2026 05:06
PR #495 LAVA job 81084 case 4 detailed analysis

PR 495 | Job 81084 | SoC qcs615-ride | Case 4

Detailed analysis

Summary: The failure is infra-side in the LAVA boot/login path, not a clear kernel crash regression. The board does start Linux (`Linux version ...`, correct QCS615 Ride model, `console=ttyMSM0`), but later the serial stream shows malformed fastboot command text and a watchdog reset in firmware. After the reset, LAVA stays in auto-login-action retry loops and times out. The job then fails at fastboot-boot due to exhausted block time, and that failure propagates to the case's job result.