
Verifying Real-Time Constraints: Case Study of Integrating RocqStat into a CI Pipeline


2026-02-02


How a Tier‑1 team integrated RocqStat with VectorCAST in CI to catch WCET regressions early—32 regressions blocked, 75% fewer field incidents.

Real-time failures cost projects time and trust: here's how we stopped them in CI

Teams building safety-critical embedded software face a recurrent, painful problem: functional tests pass, CI is green, and yet the system misses deadlines in integration tests or—worse—in the field. The root cause is often timing: undetected increases in worst-case execution time (WCET) that slip through traditional unit and integration testing. In 2026, with Vector's acquisition of RocqStat and tighter industry focus on timing safety, integrating timing analysis into CI is no longer optional — it's essential.

Executive summary

What we did: An anonymized Tier-1 automotive component team integrated RocqStat's WCET analysis into their VectorCAST-enabled CI pipeline to enforce timing budgets automatically on every merge request.

Results in 6 months:

32 timing regressions caught before mainline merge
75% reduction in field timing incidents vs the prior year
40% faster root-cause resolution for timing failures
MTTD (mean time to detect) for timing regressions improved from days to minutes

Why it matters: Automated WCET checks reduce late-stage surprises, enforce reproducible timing budgets, and shift timing verification left — aligning with 2026 trends around unified verification toolchains after Vector's RocqStat acquisition.

Context: Why timing belongs in CI in 2026

Late 2025 and early 2026 have seen two clear industry signals: software-defined systems proliferate across automotive, avionics, and industrial controls; and toolchains are consolidating to deliver unified verification workflows. Vector's acquisition of RocqStat (announced January 2026) accelerated availability of integrated timing analysis inside VectorCAST-style toolchains, making automation easier and more trustworthy.

Safety standards (ISO 26262, DO-178C and its derivatives) and performance budgets call for reproducible WCET estimates and traceability. Teams that do not build WCET checks into CI risk shipping regressions that are expensive to diagnose and, in the worst case, lead to recalls.

Case study overview: team, constraints, and objectives

Team profile

Size: 35 engineers (firmware, platform, and test)
Domain: Automotive body control unit (BCU) with real-time tasks
Toolchain: VectorCAST for unit and integration testing; GitLab CI; in-house build and test automation. Target: ARM Cortex-M with an AUTOSAR runtime

Primary objectives

Automate WCET estimation and regression detection in CI.
Fail merges that introduce timing budget violations.
Increase developer ownership of timing through local tooling and quick feedback.
Produce audit trails and reproducible artifacts for certification evidence.

Solution design: integrating RocqStat + VectorCAST into CI

The team implemented a three-tier verification model inside the CI pipeline:

Local pre-commit: lightweight timing checks via a pared-down RocqStat runner and static heuristics.
Merge-request gate: full VectorCAST + RocqStat WCET analysis on a dedicated analysis runner.
Nightly baseline: cross-configuration WCET sweep including HIL traces and hardware profiling.

Architecture

Developer laptop  ->  Pre-commit RocqStat (fast)  ->  GitLab MR  ->  CI runner (VectorCAST + RocqStat)  ->  Results & thresholds
                                  |                                                   |
                                  +-- local artifacts & cache for reproducibility  +-- artifact storage, signed results

Key integration points

VectorCAST adapter in CI to run unit/integration tests and provide instrumentation output.
RocqStat engine invoked after instrumentation to produce WCET estimates from control-flow and measurement inputs.
JSON export and a small parser that compares WCET estimates against per-task budgets and fails the pipeline when exceeded.

Implementation: concrete commands and pipeline snippets

Below are simplified examples used by the team. Adapt paths and options to your environment.

1) Local pre-commit script (pre-commit.sh)

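#!/usr/bin/env bash
# Quick RocqStat smoke test, run locally before committing.
set -euo pipefail  # stop at the first failing step instead of analysing stale output

BUILD_DIR=build/quick
mkdir -p "$BUILD_DIR"
make -C src all -j4 O="$BUILD_DIR"
# Run the tests and export instrumentation data for the timing analysis.
vectorcast_cli --project myproj --build "$BUILD_DIR" --run-tests --export-instrumentation inst.xml
rocqstat_cli --input inst.xml --mode quick --output wcet_quick.json
# A non-zero exit from the budget check aborts the script and blocks the commit.
python tools/check_wcet.py wcet_quick.json --budget-file budgets.json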

2) GitLab CI job (excerpt)

wcet_analysis:
  stage: test
  tags: [highmem]
  script:
    - mkdir -p analysis
    - vectorcast_cli --project myproj --build build/ci --run-all --export-instrumentation inst_full.xml
    - rocqstat_cli --input inst_full.xml --cfg configs/rocq.yml --output wcet_full.json
    - python tools/compare_wcet_and_budget.py wcet_full.json budgets/merged_budgets.json
  artifacts:
    paths:
      - wcet_full.json
    when: always
  rules:
    - if: '$CI_MERGE_REQUEST_IID'

3) Sample compare script (simplified)

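#!/usr/bin/env python3
"""Compare WCET estimates against per-task budgets; exit 2 on any violation."""
import json
import sys

# Usage: compare script <wcet_report.json> <budgets.json>
with open(sys.argv[1]) as f:
    wcet = json.load(f)
with open(sys.argv[2]) as f:
    budgets = json.load(f)

violations = []
for task, data in wcet["tasks"].items():
    budget = budgets.get(task)
    # A task with no budget entry is reported rather than silently passed.
    if budget is None or data["wcet"] > budget:
        violations.append((task, data["wcet"], budget))

if violations:
    for task, value, budget in violations:
        print(f"VIOLATION: {task} wcet={value} budget={budget}")
    sys.exit(2)
print("OK: all WCETs within budget")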
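To make the input shapes concrete, here is the same comparison as a small function with sample data. This is a sketch: the task names and microsecond values are hypothetical, the report layout is only what the script above implies (a top-level "tasks" map with a "wcet" field per task), and flagging tasks that have no budget entry is our assumption about the safer default.

```python
def find_violations(wcet_report, budgets):
    """Return (task, wcet, budget) for each task over budget or lacking a budget."""
    violations = []
    for task, data in wcet_report["tasks"].items():
        budget = budgets.get(task)
        # Assumption: a missing budget is treated as a violation, not a pass.
        if budget is None or data["wcet"] > budget:
            violations.append((task, data["wcet"], budget))
    return violations


# Hypothetical report and budget contents; values in microseconds.
sample_report = {
    "tasks": {
        "door_lock_isr": {"wcet": 118.0},
        "window_lift_10ms": {"wcet": 640.0},
        "diag_poll": {"wcet": 75.0},
    }
}
sample_budgets = {"door_lock_isr": 120.0, "window_lift_10ms": 600.0}

for task, value, budget in find_violations(sample_report, sample_budgets):
    print(f"VIOLATION: {task} wcet={value} budget={budget}")
```

With this sample data, window_lift_10ms is reported because it exceeds its budget and diag_poll is reported because it has none. Keeping the check in a function also makes it easy to unit test separately from the CI job.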

Thresholds, margins and stability rules

Failing the pipeline outright on every small increase is noisy. The team adopted a practical rule-set to balance strictness and signal-to-noise:

Hard fail: WCET > budget (absolute deadline miss).
Soft alert: WCET increase > 5% relative to baseline — creates a warning and requires developer sign-off for merge.
Regression tracking: If a task's WCET increase over baseline exceeds 2% in three consecutive MRs, enforce a hard block. This catches slow drift that never trips the 5% soft alert.
Baseline selection: Baselines are taken from the last approved mainline build and are reproducibly stored alongside the signed artifact.
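The rule-set above can be sketched as a single classification function. This is a minimal illustration, not the team's implementation: the function and constant names are ours, the per-task history list stands in for whatever store holds earlier MR results, and we read the drift rule as "more than 2% over baseline in each of the last three MRs, including the current one".

```python
SOFT_ALERT_PCT = 5.0  # soft alert above 5% over baseline
DRIFT_PCT = 2.0       # drift threshold per MR
DRIFT_WINDOW = 3      # consecutive MRs above DRIFT_PCT before blocking


def pct_increase(wcet, baseline):
    """Percentage increase of wcet relative to baseline."""
    return (wcet - baseline) / baseline * 100.0


def classify(wcet, budget, baseline, previous_pcts):
    """Return 'hard_fail', 'soft_alert', or 'pass' for one task.

    previous_pcts: percentage increases over baseline recorded for this task
    in earlier MRs, oldest first (hypothetical history store).
    """
    if wcet > budget:
        return "hard_fail"  # absolute budget violation
    pct = pct_increase(wcet, baseline)
    recent = (list(previous_pcts) + [pct])[-DRIFT_WINDOW:]
    if len(recent) == DRIFT_WINDOW and all(p > DRIFT_PCT for p in recent):
        return "hard_fail"  # slow drift across consecutive MRs
    if pct > SOFT_ALERT_PCT:
        return "soft_alert"  # needs developer sign-off
    return "pass"


print(classify(130.0, 120.0, 100.0, []))          # over budget
print(classify(106.0, 120.0, 100.0, []))          # 6% over baseline
print(classify(103.0, 120.0, 100.0, [2.5, 3.0]))  # third MR in a row above 2%
```

The drift check runs before the soft-alert check so that a third small increase blocks the merge even though it never reaches 5%.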

What failures did this catch?

The integration uncovered three common failure modes that previously escaped detection:

Compiler optimization regressions: A change to compiler flags to address code size unexpectedly increased WCET for an ISR by 18%. The CI gate flagged it and the flag was reverted within hours.
Library update side-effects: An updated math library function introduced a rare conditional that caused worst-case pathing to change; WCET grew by 9% and was caught pre-merge.
Unbounded loops in new feature: A newly introduced debug loop (guarded only by an assert) increased worst-case time under certain configurations. The MR was blocked and the assert was changed into a safe timeout.

Collectively these incidents would previously have required multiple integration lab cycles. With the pipeline gates, they were resolved by the authors in hours instead of weeks.

Metrics we tracked (and the tooling to track them)

To quantify impact we instrumented the process and tracked:

Number of timing regressions detected per month
Mean time to detect (developer push to CI indicator)
Mean time to remediate (CI fail to MR resolution)
Field incidents attributable to timing (per quarter)
False positive rate (soft alerts that proved harmless)

Tools used: Prometheus + Grafana for CI metrics, VectorCAST reports for coverage and instrumentation, RocqStat JSON outputs, and an internal dashboard to correlate MR IDs with WCET deltas.
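As a rough illustration of how two of these metrics can be derived from raw CI events, the sketch below computes a median detection time and a soft-alert false positive rate. The record layout, field names, and timestamps are hypothetical; in practice the data would come from the CI system and the dashboard described above.

```python
from datetime import datetime
from statistics import median


def median_minutes(pairs):
    """Median elapsed minutes across (start, end) datetime pairs."""
    return median((end - start).total_seconds() / 60.0 for start, end in pairs)


def false_positive_rate(soft_alerts):
    """Fraction of soft alerts that were later judged harmless."""
    if not soft_alerts:
        return 0.0
    harmless = sum(1 for alert in soft_alerts if alert["harmless"])
    return harmless / len(soft_alerts)


# Hypothetical (developer push, CI failure reported) timestamps.
detect_pairs = [
    (datetime(2026, 1, 12, 10, 0), datetime(2026, 1, 12, 10, 11)),
    (datetime(2026, 1, 13, 14, 0), datetime(2026, 1, 13, 14, 9)),
    (datetime(2026, 1, 14, 9, 30), datetime(2026, 1, 14, 9, 42)),
]
# Hypothetical triage outcomes for soft alerts.
alerts = [{"harmless": True}, {"harmless": False},
          {"harmless": False}, {"harmless": False}]

print(f"median time to detect: {median_minutes(detect_pairs)} min")
print(f"soft-alert false positive rate: {false_positive_rate(alerts):.0%}")
```

Mean time to remediate follows the same pattern with (CI failure, MR resolution) pairs. We use the median here because a single long-running incident can dominate a mean.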

Results: quantitative outcomes

After six months of running the integrated pipeline we observed:

32 timing regressions caught in MR gates (all fixed before merge).
MTTD decreased from a median of 3 days (manual detection during integration tests) to under 12 minutes (CI job runtime + reporting).
MTTR decreased by 40% because the MR author had the precise failing context.
Field timing incidents dropped by 75% year-over-year.
False positive rate on soft alerts was ~9%; the team judged this acceptable and tuned the thresholds accordingly.

Cultural changes and process shifts

Integrating WCET checks into CI impacted team behavior beyond technical automation.

Developer ownership

Developers took responsibility for running pre-commit quick checks locally. That reduced the friction of CI failures landing on reviewers and encouraged quicker fixes.

Shift-left timing thinking

Timing became part of the definition of done. Feature MR templates now include a checkbox for 'WCET impact considered' and require attaching the RocqStat summary for changed modules.

Cross-functional collaboration

Firmware, platform, and testing teams instituted weekly 'timing triage' sessions to review persistent soft alerts and to decide on budget adjustments where necessary. SREs helped automate baselines and storage of signed WCET artifacts for audit trails.

Training and enablement

Engineers received targeted workshops on interpreting RocqStat outputs, controlling worst-case paths via coding patterns, and creating reproducible builds for timing analysis.

Advanced strategies and 2026 trends to adopt

Building on this baseline, teams should look to these 2026-forward strategies:

ML-assisted regression prioritization: Use anomaly detection on historical WCET trends to reduce false positives dynamically.
Hardware-in-the-loop (HIL) CI stages: Incorporate run-to-completion hardware traces nightly to improve precision of WCET bounds for hardware-specific code paths.
Provenance and signed artifacts: Store RocqStat and VectorCAST artifacts with digital signatures to support certification evidence — a growing requirement post-2025 toolchain standardization. Consider governance and trust models from community cloud co‑op playbooks.
Integrated dashboards: Centralize test coverage, WCET deltas, and SBOM/third-party component timing implications for holistic risk assessment.
Automated mitigation suggestions: Link common anti-pattern detectors (e.g., unbounded loops, deep recursion) to MR comments automatically with suggested fixes.
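As a starting point for the first item, anomaly detection on WCET history does not need a trained model. The sketch below uses a plain statistical stand-in, the modified z-score based on the median absolute deviation, rather than ML. The function name, the minimum history length, and the sample values are our assumptions; the 0.6745 constant and 3.5 cutoff are the values commonly used with this score.

```python
from statistics import median


def is_anomalous(history, latest, threshold=3.5, min_history=5):
    """Flag `latest` if its modified z-score against `history` exceeds threshold."""
    if len(history) < min_history:
        return False  # too little history to judge
    med = median(history)
    mad = median(abs(x - med) for x in history)
    if mad == 0:
        # History is perfectly flat, so any deviation at all stands out.
        return latest != med
    score = 0.6745 * abs(latest - med) / mad
    return score > threshold


# Hypothetical WCET history for one task, in microseconds.
history = [100, 101, 99, 100, 102, 100, 101]
print(is_anomalous(history, 101))  # within normal variation
print(is_anomalous(history, 110))  # well outside it
```

A check like this can decide which soft alerts get surfaced first. It complements the budget gate and does not replace it.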

Practical pitfalls and how to avoid them

Pitfall: Failing MR on tiny WCET noise.
Fix: Soft alerts with escalation rules and regression tracking across consecutive MRs.
Pitfall: Non-reproducible baselines due to environment drift.
Fix: Containerize analysis runners and store exact toolchain hashes and config files with signed artifacts.
Pitfall: Overloading developers with raw RocqStat output.
Fix: Provide summarized MR comments that surface only the affected tasks and suggested causes.
Pitfall: Ignoring cross-configuration variability (e.g., different optimization levels).
Fix: Nightly sweeps across key configs and hardware targets to detect worst-case divergence early.
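The summarized MR comment from the third fix can be sketched as follows. The function name, the 2% reporting threshold, the "us" unit label, and the comment wording are all hypothetical; posting the text to the MR is left to whatever API client the pipeline already uses.

```python
def summarize(current, baseline, budgets, min_delta_pct=2.0):
    """Build an MR comment listing only tasks whose WCET changed or broke budget.

    current, baseline: task name -> WCET; budgets: task name -> budget.
    """
    lines = []
    for task in sorted(current):
        wcet = current[task]
        base = baseline.get(task)
        if base is None:
            lines.append(f"- {task}: new task, WCET {wcet:.1f} us (no baseline)")
            continue
        delta_pct = (wcet - base) / base * 100.0
        over_budget = task in budgets and wcet > budgets[task]
        if abs(delta_pct) < min_delta_pct and not over_budget:
            continue  # unchanged tasks stay out of the comment
        status = "OVER BUDGET" if over_budget else "changed"
        lines.append(
            f"- {task}: {base:.1f} -> {wcet:.1f} us ({delta_pct:+.1f}%), {status}"
        )
    if not lines:
        return "WCET: no tasks changed beyond the reporting threshold."
    return "WCET summary (affected tasks only):\n" + "\n".join(lines)


# Hypothetical values in microseconds.
print(summarize(
    current={"task_a": 100.0, "task_b": 110.0, "task_c": 50.0},
    baseline={"task_a": 100.0, "task_b": 100.0},
    budgets={"task_b": 105.0},
))
```

In this example task_a is omitted because it did not move, which is the point: reviewers see two lines instead of the full report.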

Evidence & traceability: what to keep for audits

For certification and audits, the team stored:

Signed WCET JSON reports from RocqStat per CI run.
VectorCAST coverage and instrumentation artifacts (linked to the source commit hash).
Toolchain version metadata (compiler, flags, RocqStat, VectorCAST) as immutable build artifacts.
MR IDs and comments showing review and remediation steps.
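One way to tie these items together is a per-run manifest that records the commit, the tool versions, and a digest of each stored artifact. The sketch below is an assumption about how that could look, not the team's format: the function names and manifest fields are hypothetical, and it only computes SHA-256 digests. The actual signing step would be applied to the manifest with whatever signing tool the organization already trusts.

```python
import hashlib
import json


def sha256_of(path):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(commit, tool_versions, artifact_paths):
    """Collect commit, tool versions, and artifact digests for one CI run."""
    return {
        "commit": commit,
        "tools": dict(sorted(tool_versions.items())),
        "artifacts": {path: sha256_of(path) for path in sorted(artifact_paths)},
    }


def write_manifest(manifest, out_path):
    """Write the manifest with stable key order so identical runs diff cleanly."""
    with open(out_path, "w") as f:
        json.dump(manifest, f, indent=2, sort_keys=True)


# Example call (paths and version strings are placeholders):
# manifest = build_manifest("<commit hash>", {"compiler": "<version>"},
#                           ["wcet_full.json", "inst_full.xml"])
# write_manifest(manifest, "wcet_manifest.json")
```

Storing digests next to the reports lets an auditor confirm that the WCET JSON they are reading is the one produced for a given commit and toolchain.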

Why Vector + RocqStat in 2026 changes the calculus

Vector's acquisition of RocqStat formalizes the integration path between unit/integration test frameworks and advanced timing analysis. For teams, that means lower integration friction, vendor-supported workflows, and a clearer upgrade path for toolchain updates — critical when you need reproducible, auditable WCET evidence for certification authorities.

"Timing safety is becoming a critical requirement for software-defined systems." — Vector announcement, Jan 2026

Actionable checklist to get started this week

Install a RocqStat evaluation on a representative module and run a quick local WCET estimation.
Create a lightweight pre-commit wrapper that runs a fast RocqStat smoke test.
Add a dedicated CI job to run full VectorCAST + RocqStat analysis on merge requests and export JSON artifacts to artifact storage.
Define and document timing budgets per task and add them to the MR template.
Instrument dashboards for WCET deltas and set soft/hard gating rules.

Final lessons learned

Automating WCET in CI shifts discovery earlier, reduces surprise regressions, and shortens remediation time.
Tune thresholds to avoid noise — use a mix of soft warnings and hard failures to balance agility and safety.
Store reproducible artifacts and provenance for certification and incident forensics.
Invest in developer enablement so timing verification becomes part of daily development, not an afterthought.

Call to action

If your team struggles with late timing regressions, start a pilot: run RocqStat on a critical module, wire a merge-request CI job to block on timing budget breaches, and measure MTTD/MTTR after 90 days. Need a jumpstart? Contact us to see a sample VectorCAST+RocqStat CI template, dashboard examples, and a migration checklist tailored to automotive and other real-time domains.
