Executive Snapshot
Bottom line: the weakest link is smoke reliability, not test speed. The suite can still provide signal, but deploy confidence is being taxed by failed or noisy smoke attempts.
What Matters
- Daily regression passed 3 of 7 runs (42.9%), with a current green streak of 0 and a best streak of 2 in this window. The latest daily run (150328) failed, so the system is ending the week under tension rather than in a clean state.
- Smoke passed 11 of 13 attempts (84.6%) across 7 production pipelines. Two pipelines went green only after a rerun, which is useful for continuity but also a sign that first-pass deploy signal is noisier than it should be.
- Failure concentration is not random: Library carries both the highest strict failure ratio and the broadest non-pass footprint, each at 1.00%.
- Frontend is the weakest smoke surface in this window at 5/7 green (71.4%).
- Daily-suite runtime averaged 19m 28s.
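The streak and pass-rate figures above can be recomputed from the run statuses alone. A minimal Python sketch; the `statuses` list mirrors the seven daily runs in this window, in chronological order:

```python
# Statuses of the seven daily regression runs in this window, oldest first.
statuses = ["PASSED", "PASSED", "FAILED", "FAILED", "FAILED", "PASSED", "FAILED"]

def pass_rate(statuses):
    """Fraction of runs that passed, as a percentage."""
    return 100.0 * statuses.count("PASSED") / len(statuses)

def green_streaks(statuses):
    """Return (current trailing streak of passes, best streak in the window)."""
    best = run = 0
    for s in statuses:
        run = run + 1 if s == "PASSED" else 0
        best = max(best, run)
    current = 0
    for s in reversed(statuses):
        if s != "PASSED":
            break
        current += 1
    return current, best

# 3 of 7 passed -> 42.9%; latest run failed -> current streak 0, best streak 2.
```

Running this against the window reproduces the headline numbers: a 42.9% pass rate, a current green streak of 0, and a best streak of 2.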
Engineering Analysis
- A release gate should fail loudly for product regressions and quietly for infrastructure noise. Rerun recoveries and incomplete smoke attempts suggest those two failure modes are still partially mixed together.
- The failure profile is concentrated enough to act on. Library and Frontend are carrying the strongest signal, which means reliability work should be assigned by category ownership instead of treating the suite as one undifferentiated problem.
- The broader daily suite is carrying more instability than smoke, which usually means product regressions are escaping into wider coverage areas even when the narrow deploy gate looks acceptable.
Recommended Actions
- Assign one owner to Library for the next cycle and expect a short written burn-down: top failing tests, suspected root causes, flake versus regression breakdown, and what gets fixed or quarantined first.
- Treat the daily regression suite like an operations queue until it is calm again: triage failures after each red run, close known-noise items fast, and avoid letting multiple unrelated red signals pile up between runs.
- Put Frontend smoke under closer guardrails for the next release cycle. It is the best place to improve first-pass deploy confidence quickly.
Improvement Ideas
- Introduce a small reliability budget for tests: every flaky or quarantined case needs an owner and an expiry, and the team should review that budget weekly the same way it reviews bugs or incidents.
- Track first-fail to root-cause time as a core metric. Fast diagnosis is as important as raw pass rate because the practical value of a test gate depends on how quickly it helps the team recover.
- Define a runtime budget per suite and require justification when test count or duration grows. Reliable feedback systems stay trusted when they remain both stable and proportionate.
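A runtime budget like the one suggested above can be enforced mechanically. A minimal sketch, assuming hypothetical per-suite budgets (the budget values and the `over_budget` helper are illustrative, not taken from this report):

```python
# Hypothetical per-suite runtime budgets in seconds (illustrative values only).
BUDGET_SECONDS = {"daily": 25 * 60, "smoke": 6 * 60}

def over_budget(suite, duration_seconds, tolerance=0.10):
    """Flag a run whose duration exceeds its budget by more than the tolerance."""
    return duration_seconds > BUDGET_SECONDS[suite] * (1 + tolerance)

# The week's average daily runtime (19m 28s) fits comfortably in a 25m budget.
assert not over_budget("daily", 19 * 60 + 28)
```

A check like this turns runtime growth from a silent drift into an explicit, reviewable decision: crossing the budget requires either a fix or a written justification.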
Category Execution Ratios
How computed
Category total executions means the sum of that category's observed test executions across every daily-suite run in the selected window.
Strict Failure Ratio = failed executions for that category divided by total executions for that category across the window.
Non-pass Ratio = (failed + pending + skipped) executions for that category divided by total executions for that category across the window.
Example: if Billing executed 800 times across the week and 2 of those executions failed, Billing's strict failure ratio is 0.25%. That does not mean 0.25% of pipelines failed; it means 0.25% of observed Billing executions ended in failure.
Strict Failure Ratio: share of category executions that ended in failed across all daily runs in this window.
Non-pass Ratio: share of category executions that ended in failed, pending, or skipped across all daily runs in this window.
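As a sanity check, the two ratios defined above reduce to a few lines of Python; the counts used below are the Billing example from the text and the Library and Frontend rows from the aggregate table:

```python
def strict_failure_ratio(failed, total):
    """Failed executions as a percentage of all executions in the window."""
    return 100.0 * failed / total

def non_pass_ratio(failed, pending, skipped, total):
    """Failed + pending + skipped executions as a percentage of the total."""
    return 100.0 * (failed + pending + skipped) / total

# Billing example from the text: 2 failures across 800 executions -> 0.25%.
assert round(strict_failure_ratio(2, 800), 2) == 0.25
# Library row from the aggregate table: 6 failures across 602 executions -> 1.00%.
assert round(strict_failure_ratio(6, 602), 2) == 1.0
```

With no pending or skipped executions in this window, the non-pass ratio collapses to the strict failure ratio for every category, which is why the two columns match in the table below.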
Category Aggregate Table
| Category | Total | Failed | Pending | Skipped | Strict Failure Ratio | Non-pass Ratio | Runs With Failures |
|---|---|---|---|---|---|---|---|
| Billing | 756 | 0 | 0 | 0 | 0.00% | 0.00% | 0 |
| Web | 5201 | 0 | 0 | 0 | 0.00% | 0.00% | 0 |
| Frontend | 1764 | 4 | 0 | 0 | 0.23% | 0.23% | 3 |
| Library | 602 | 6 | 0 | 0 | 1.00% | 1.00% | 3 |
Recent Runs
Recent Daily Suite Runs
| Date | Pipeline | Suites | Status | Summary |
|---|---|---|---|---|
| 2026-03-21 18:23 | 149394 | Billing, Web, Frontend, Library | PASSED | Total 1189, Passed 1189, Failed 0 |
| 2026-03-22 18:23 | 149456 | Billing, Web, Frontend, Library | PASSED | Total 1189, Passed 1189, Failed 0 |
| 2026-03-23 18:23 | 149694 | Billing, Web, Frontend, Library | FAILED | Total 1189, Passed 1185, Failed 4 |
| 2026-03-24 18:22 | 149866 | Billing, Web, Frontend, Library | FAILED | Total 1189, Passed 1186, Failed 3 |
| 2026-03-25 18:22 | 150059 | Billing, Web, Frontend, Library | FAILED | Total 1189, Passed 1187, Failed 2 |
| 2026-03-26 18:22 | 150180 | Billing, Web, Frontend, Library | PASSED | Total 1189, Passed 1189, Failed 0 |
| 2026-03-27 18:22 | 150328 | Billing, Web, Frontend, Library | FAILED | Total 1189, Passed 1188, Failed 1 |
Recent Smoke Attempts
| Date | Suite | Pipeline | Job | Status | Passed | Failed | Duration |
|---|---|---|---|---|---|---|---|
| 2026-03-23 12:20 | Frontend | 149520 | Frontend smoke | PASSED | 110 | 0 | 3m 08s |
| 2026-03-23 15:20 | University | 149634 | University smoke | PASSED | 10 | 0 | 2m 09s |
| 2026-03-23 15:25 | Frontend | 149634 | Frontend smoke | FAILED | 103 | 7 | 5m 41s |
| 2026-03-23 16:25 | University | 149671 | University smoke | PASSED | 10 | 0 | 2m 17s |
| 2026-03-23 16:31 | Frontend | 149671 | Frontend smoke | FAILED | 103 | 7 | 5m 57s |
| 2026-03-23 17:03 | University | 149684 | University smoke | PASSED | 10 | 0 | 2m 33s |
| 2026-03-23 17:05 | Frontend | 149684 | Frontend smoke | PASSED | 110 | 0 | 3m 03s |
| 2026-03-24 20:52 | University | 149788 | University smoke | PASSED | 10 | 0 | 2m 56s |
| 2026-03-24 20:53 | Frontend | 149788 | Frontend smoke | PASSED | 110 | 0 | 3m 10s |
| 2026-03-25 12:14 | University | 149902 | University smoke | PASSED | 10 | 0 | 2m 23s |
| 2026-03-25 12:16 | Frontend | 149902 | Frontend smoke | PASSED | 110 | 0 | 3m 05s |
| 2026-03-27 14:33 | University | 150306 | University smoke | PASSED | 10 | 0 | 2m 23s |
| 2026-03-27 14:35 | Frontend | 150306 | Frontend smoke | PASSED | 110 | 0 | 3m 15s |