Understanding KPIs

Comprehensive guide to all KPIs and metrics in the Reports dashboard, including calculation methods and interpretation.

Every metric in the Reports dashboard is calculated from real data. This guide explains how each KPI is computed and what it means for your testing strategy.

Time Window Filtering

All metrics support 7, 14, or 30-day time windows for historical analysis. The time window changes which data is retrieved, but the calculation methods described below stay the same.

Team Overview KPIs

Team Health Score

  • Definition: A composite metric combining multiple signals to give a holistic view of team health.
  • Calculation: Weighted average of three components:
    • 60% Pass Rate: (passed tests / total tests) × 100 across all team projects
    • 25% Automation Rate: (enabled tests / total tests) × 100 across all team projects
    • 15% Maintenance Health: 100% - (tests needing maintenance / total tests) × 100
  • Formula: Health Score = (Pass Rate × 0.6) + (Automation Rate × 0.25) + (Maintenance Health × 0.15)
  • Color Coding: Green (80%+), Amber (60-79%), Red (<60%)
  • Interpretation: A high health score indicates good pass rates, strong automation coverage, and low maintenance burden across the team.
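
To make the weighting concrete, here is a minimal TypeScript sketch of the formula above. The TeamTestCounts shape and its field names are illustrative assumptions, not the dashboard's actual data model.

```typescript
// Illustrative aggregation shape; the dashboard's actual data model may differ.
interface TeamTestCounts {
  total: number;            // all tests across team projects
  passed: number;           // tests whose latest result is passed
  enabled: number;          // tests enabled for automated execution
  needsMaintenance: number; // disabled tests or tests slower than 1 minute
}

function teamHealthScore(c: TeamTestCounts): number {
  if (c.total === 0) return 0;
  const passRate = (c.passed / c.total) * 100;
  const automationRate = (c.enabled / c.total) * 100;
  const maintenanceHealth = 100 - (c.needsMaintenance / c.total) * 100;
  // Weighted average: 60% pass rate, 25% automation rate, 15% maintenance health.
  return passRate * 0.6 + automationRate * 0.25 + maintenanceHealth * 0.15;
}
```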

Cross-Project Status

  • Pass Rate: (passed tests / total tests) × 100
  • Fail Rate: (failed tests / total tests) × 100
  • Skip Rate: (pending tests / total tests) × 100
  • Data Source: Aggregated test execution results
  • Interpretation: Shows current execution state across all team projects

Automation Coverage

  • Scope: Team-wide (aggregated across all projects in the selected team)
  • Calculation: (enabled tests / total tests) × 100 aggregated across all team projects
  • Growth Tracking: Counts tests created within the selected time window (7/14/30 days)
  • Interpretation: Higher percentages indicate better automation adoption and reduced manual testing overhead

Quality Metrics

  • Success Rate: (passed tests / total tests) × 100
  • Tests Needing Maintenance: Count of disabled tests or tests with execution time > 1 minute
  • Interpretation: Tracks test reliability and identifies maintenance bottlenecks

Project Overview KPIs

Scope note

All project-level KPIs below are scoped to the selected project, and when you apply tags in the Reports header they are further scoped to tests that have ANY of the selected tags. If no tags are selected, these metrics fall back to true project-wide values.
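
As a sketch of the ANY-tag semantics, the helper below keeps a test when it carries at least one of the selected tags; the tags field name is an assumption for illustration.

```typescript
// Keep tests that carry at least one of the selected tags (ANY semantics).
// With no tags selected, the full project-wide set is returned.
// The `tags` field is an illustrative assumption about the test shape.
function scopeByTags<T extends { tags: string[] }>(tests: T[], selectedTags: string[]): T[] {
  if (selectedTags.length === 0) return tests;
  const selected = new Set(selectedTags);
  return tests.filter(test => test.tags.some(tag => selected.has(tag)));
}
```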

Health Score

  • Definition: A composite metric combining multiple signals to give a holistic view of project health.
  • Calculation: Weighted average of three components:
    • 60% Pass Rate: (passed tests / total tests) × 100
    • 25% Automation Rate: (enabled tests / total tests) × 100
    • 15% Data Quality: (tests with duration / total tests) × 100
  • Formula: Health Score = (Pass Rate × 0.6) + (Automation Rate × 0.25) + (Data Quality × 0.15)
  • Color Coding: Green (80%+), Amber (60-79%), Red (<60%)
  • Interpretation: A high health score indicates good pass rates, strong automation coverage, and complete test data. Low scores suggest issues in one or more of these areas.

Test Count

  • Data Source: Total count of tests in the project
  • Scope: Respects the selected tags (only tagged tests are counted). With no tags selected, this is the true project-wide count.
  • Growth Tracking: Real count of tests created in the selected time window
  • Interpretation: Shows project (or tagged segment) scale and recent activity

Automation Rate

  • Calculation: (enabled tests / total tests) × 100 within the current scope
  • Scope: Respects selected tags. With tags applied, both the numerator and denominator are computed only over tagged tests. With no tags, this is the project-wide automation rate.
  • Interpretation: Measures the percentage of tests in the current slice (project or tagged subset) that are enabled for execution, indicating automation adoption

Execution Status

  • Components: Passed, Failed, Pending, Running, Regenerating
  • Calculation: Real-time counts from project status with percentage breakdowns
  • Interpretation: Shows current test execution state and identifies bottlenecks

Performance Metrics

  • Total Duration: Sum of all test execution times
  • Average Duration: Total duration / number of tests with recorded times
  • Fastest Duration: Minimum execution time among tests with recorded times
  • Slowest Duration: Maximum execution time among tests with recorded times
  • Data Source: Tests with recorded execution duration
  • Interpretation: Identifies performance bottlenecks and optimization opportunities. Use "Fastest" to benchmark best-case latency and "Slowest" to highlight worst-case outliers.
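
A minimal sketch of these duration aggregates, assuming each test optionally records its execution time in milliseconds (the durationMs field name is illustrative):

```typescript
// Aggregate duration statistics over tests that have a recorded duration.
function performanceMetrics(tests: { durationMs?: number }[]) {
  const durations = tests
    .map(t => t.durationMs)
    .filter((d): d is number => typeof d === "number");
  if (durations.length === 0) return null; // no recorded durations
  const total = durations.reduce((sum, d) => sum + d, 0);
  return {
    totalDuration: total,
    averageDuration: total / durations.length,
    fastestDuration: Math.min(...durations),
    slowestDuration: Math.max(...durations),
  };
}
```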

Tests Trend

  • Definition: Daily count of updated or newly created tests over the selected time window
  • Scope: Project-scoped, respects the selected time window (7/14/30 days)
  • Learn more: See Tests Trend for details and interpretation

Quality Insights

  • Enabled Tests: Count of tests with enabled=true
  • Disabled Tests: total tests - enabled tests
  • Maintenance Needed: Tests disabled OR execution time > 60 seconds
  • Interpretation: Helps prioritize test maintenance and optimization efforts
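
The maintenance rule can be expressed as a simple predicate. This sketch assumes illustrative enabled and durationMs fields:

```typescript
// A test needs maintenance if it is disabled or its execution time exceeds 60 seconds.
// Field names are illustrative assumptions.
function needsMaintenance(test: { enabled: boolean; durationMs?: number }): boolean {
  const tooSlow = typeof test.durationMs === "number" && test.durationMs > 60_000;
  return !test.enabled || tooSlow;
}

// Quality Insights counts derived from the same predicate.
function qualityInsights(tests: { enabled: boolean; durationMs?: number }[]) {
  const enabledTests = tests.filter(t => t.enabled).length;
  return {
    enabledTests,
    disabledTests: tests.length - enabledTests,
    maintenanceNeeded: tests.filter(needsMaintenance).length,
  };
}
```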

Latest Run Insights

  • Scope: Project-level and tag-aware. Uses the latest run per test within the selected project and tag filters.
  • Data Source: Latest test run per test
  • Median Duration (p50): 50th percentile of execution durations for tests with recorded times
  • P95 Duration: 95th percentile of execution durations (high-tail latency)
  • Median/Mean %: Median duration divided by average duration, as a percentage
  • Hanging Runs: Count of tests whose latest status is running or regenerating and whose startTime is older than 10 minutes
  • Interpretation: Highlights typical vs tail performance and detects potentially stuck executions
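
A sketch of the percentile and hanging-run logic, assuming a simplified latest-run shape (status, startTime, and durationMs are illustrative field names) and a nearest-rank percentile method:

```typescript
interface LatestRun {
  status: "passed" | "failed" | "pending" | "running" | "regenerating";
  startTime?: Date;
  durationMs?: number;
}

// Nearest-rank percentile over sorted durations (one of several common methods).
function percentile(sortedDurations: number[], p: number): number {
  const rank = Math.ceil((p / 100) * sortedDurations.length);
  return sortedDurations[Math.max(0, rank - 1)];
}

function latestRunInsights(runs: LatestRun[], now = new Date()) {
  const durations = runs
    .map(r => r.durationMs)
    .filter((d): d is number => typeof d === "number")
    .sort((a, b) => a - b);
  const median = durations.length ? percentile(durations, 50) : 0;
  const p95 = durations.length ? percentile(durations, 95) : 0;
  const mean = durations.length
    ? durations.reduce((s, d) => s + d, 0) / durations.length
    : 0;
  // A run is hanging if it is still running/regenerating and started
  // more than 10 minutes ago.
  const tenMinutesMs = 10 * 60 * 1000;
  const hangingRuns = runs.filter(
    r =>
      (r.status === "running" || r.status === "regenerating") &&
      r.startTime !== undefined &&
      now.getTime() - r.startTime.getTime() > tenMinutesMs
  ).length;
  return { median, p95, medianOverMeanPct: mean ? (median / mean) * 100 : 0, hangingRuns };
}
```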

Freshness & Quality

  • Scope: Project-level and tag-aware. Uses the latest run per tagged test (or all tests when no tags are selected).
  • Data Source: Latest test run per test
  • Stale Tests %: Percentage of tests with no endTime or whose endTime is older than 14 days
  • Latest Error Rate: Percentage of tests whose latest run has an error or a non-zero totalFailCount
  • Avg Failed Steps: Average stepFailCount per test (shown alongside average totalFailCount)
  • Median Recency: Median time since the latest run completion (or last update if missing), shown as a duration
  • Interpretation: Measures data freshness and failure signals to guide quality triage
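
A sketch of the staleness and error-rate rules, assuming illustrative endTime, error, and totalFailCount fields on the latest run:

```typescript
interface LatestRunRecord {
  endTime?: Date;
  error?: string;
  totalFailCount: number;
}

function freshnessAndQuality(latestRuns: LatestRunRecord[], now = new Date()) {
  const fourteenDaysMs = 14 * 24 * 60 * 60 * 1000;
  // Stale: no endTime, or endTime older than 14 days.
  const stale = latestRuns.filter(
    r => !r.endTime || now.getTime() - r.endTime.getTime() > fourteenDaysMs
  ).length;
  // Latest error rate: latest run has an error or a non-zero totalFailCount.
  const erroring = latestRuns.filter(r => r.error !== undefined || r.totalFailCount > 0).length;
  const total = latestRuns.length || 1; // avoid division by zero
  return {
    staleTestsPct: (stale / total) * 100,
    latestErrorRatePct: (erroring / total) * 100,
  };
}
```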

Top Lists

Top 5 Slowest Tests

  • Definition: Tests with the longest execution durations over the selected time window
  • Data Source: Project tests with recorded duration, sorted by execution time
  • Time Window: Respects 7/14/30-day selection
  • Usage: Identify candidates for optimization or parallelization

Top 5 Flakiest Tests

  • Definition: Tests with the highest flakiness rate among those that have both passed and failed in the selected time window.
  • Calculation:
    • For each test in the selected project (and matching any selected tags):
      • Let totalRuns be the number of runs in the time window.
      • Let failedRuns be the number of runs with status = 'failed'.
      • Let passedRuns be the number of runs with status = 'passed'.
      • Mark the test as flaky if failedRuns > 0 and passedRuns > 0.
      • Define Flakiness(test) = failedRuns / totalRuns.
    • Sort flaky tests by Flakiness(test) in descending order, with tiebreakers on failedRuns and regression failure steps. Take the top 5.
  • Time Window: Respects 7/14/30-day selection.
  • Scope: Respects selected tags. With tags applied, only tagged tests and their runs are considered.
  • Usage: Highlights tests that are truly unstable (sometimes pass, sometimes fail) rather than those that simply fail often.
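
The steps above translate into a short ranking routine. The run shape below is an illustrative assumption, and the regression-failure-steps tiebreaker is omitted for brevity:

```typescript
interface RunSummary {
  testId: string;
  status: "passed" | "failed" | string;
}

// Rank tests by flakiness: only tests that both passed and failed in the
// window count as flaky; Flakiness(test) = failedRuns / totalRuns.
function topFlakiestTests(runsInWindow: RunSummary[], topN = 5) {
  const byTest = new Map<string, { total: number; failed: number; passed: number }>();
  for (const run of runsInWindow) {
    const entry = byTest.get(run.testId) ?? { total: 0, failed: 0, passed: 0 };
    entry.total += 1;
    if (run.status === "failed") entry.failed += 1;
    if (run.status === "passed") entry.passed += 1;
    byTest.set(run.testId, entry);
  }
  return [...byTest.entries()]
    .filter(([, e]) => e.failed > 0 && e.passed > 0) // flaky: both outcomes seen
    .map(([testId, e]) => ({ testId, flakiness: e.failed / e.total, failedRuns: e.failed }))
    .sort((a, b) => b.flakiness - a.flakiness || b.failedRuns - a.failedRuns)
    .slice(0, topN);
}
```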

Top 5 Most Failing Tests

  • Definition: Tests with the highest number of failed runs over the selected time window.
  • Calculation:
    • For each test in the selected project (and matching any selected tags):
      • Consider only runs where status = 'failed' within the time window.
      • Define Failures(test) = count_failed_runs_in_window(test).
    • Sort tests by Failures(test) in descending order. Take the top 5.
  • Time Window: Respects 7/14/30-day selection.
  • Scope: Respects selected tags. With tags applied, only tagged tests and their runs are counted.
  • Usage: Focus fixes on tests that fail most often in practice, not just those with a single bad run.

Reliability Metrics

Stability & Maintenance

  • Definition: Stability indicators based on test failures and disabled tests.
  • Metrics:
    • Tests with Failed Steps: Count and percentage of tests whose latest run has stepFailCount > 0 or totalFailCount > 0.
    • Avg Failed Steps: Average number of failed steps per test (among tests with failures).
    • High Failure Tests: Tests with totalFailCount > 3 in their latest run.
    • Disabled Tests: Count and percentage of tests with enabled = false.
    • Active Tests: Count of enabled tests.
    • Stability Score: 100% - (tests with failed steps %). Higher is better.
  • Interpretation: High failure rates indicate flaky tests or unstable environments. High disabled counts suggest maintenance debt.
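
A minimal sketch of the stability score, assuming illustrative stepFailCount and totalFailCount fields on each test's latest run:

```typescript
// Stability Score = 100 - percentage of tests whose latest run has failed steps.
function stabilityScore(latestRuns: { stepFailCount: number; totalFailCount: number }[]): number {
  const total = latestRuns.length || 1; // avoid division by zero
  const withFailedSteps = latestRuns.filter(
    r => r.stepFailCount > 0 || r.totalFailCount > 0
  ).length;
  return 100 - (withFailedSteps / total) * 100;
}
```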

Data Quality

  • Definition: Measures completeness and reliability of recorded test data.
  • Metrics:
    • Missing Duration: Count of tests without recorded execution duration.
    • Missing Timestamps: Count of tests without createdAt or updatedAt.
    • Quality Score: Percentage of tests with complete data (no missing duration or timestamps). Calculated as (tests without issues / total tests) × 100.
  • Note: Tests missing both duration and timestamps are counted once (not double-counted).
  • Interpretation: Low quality scores can skew other KPIs. Improve data collection and reporting to get accurate metrics.
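
A sketch of the quality score that counts a test with multiple issues only once, assuming illustrative field names:

```typescript
interface TestRecord {
  durationMs?: number;
  createdAt?: Date;
  updatedAt?: Date;
}

function dataQuality(tests: TestRecord[]) {
  const missingDuration = tests.filter(t => t.durationMs === undefined).length;
  const missingTimestamps = tests.filter(t => !t.createdAt || !t.updatedAt).length;
  // A test with any issue counts once toward the quality score,
  // even if it is missing both duration and timestamps.
  const testsWithIssues = tests.filter(
    t => t.durationMs === undefined || !t.createdAt || !t.updatedAt
  ).length;
  const total = tests.length || 1;
  return {
    missingDuration,
    missingTimestamps,
    qualityScore: ((total - testsWithIssues) / total) * 100,
  };
}
```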

Environment Breakdown

  • Definition: Breakdown of test execution environments extracted from test logs.
  • Metrics:
    • Browser Distribution: Percentage of tests run on each browser type (Chrome, Firefox, Safari, etc.).
    • Runner Distribution: Percentage of tests run on each runner/executor type.
  • Interpretation: Helps identify environment-specific issues and optimize test distribution.

Test Insights KPIs

Scope note

In the Test Insights and project-level sections, KPIs respect the selected tags. When tags are applied, all metrics are computed over the tagged subset of tests. When no tags are selected, they use project-wide data. Team Overview remains strictly team-wide and is never tag-scoped.

Automation Coverage

  • Enabled Tests: Count of enabled tests within the current scope (tagged subset or entire project).
  • Disabled Tests: Count of disabled tests within the same scope.
  • Coverage Rate: (enabled tests / total tests) × 100 within the scope.
  • Fail Rate: Failure percentage computed from tests in the current scope.
  • Interpretation: Automation status for the currently focused slice of tests. Use tags (e.g. @critical, @smoke) to understand coverage for key segments.

Test Health

  • Flakiness Rate: Percentage of tests in the current scope that have both passed and failed at least once in the selected time window (using the same definition as Top Flakiest Tests).
  • Quality Score: Excellent (80%+), Good (60-79%), Poor (<60%) based on pass rate within the scope.
  • Active Tests: Count of currently running tests (or "Idle" if none) within the project.
  • Interpretation: Reliability and health metrics for the current slice of tests.

Test Execution

  • Stability: Excellent (95%+ pass rate), Good (85-94%), Fair (70-84%), Poor (<70%) based on pass rate within the current scope.
  • Running: Count of tests currently executing in the selected project.
  • Regenerating: Count of tests being regenerated in the selected project.
  • Interpretation: Real-time execution status for the focused set of tests.

Data Freshness

All KPIs are calculated in real-time from your current test data. Metrics update automatically when you change team/project selections or tags.

  • Team Overview metrics are always team-wide and ignore tags for consistency.
  • Project-level metrics (Overview, Trends, Insights, Reliability, Top Lists) are tag-aware: when tags are selected, all calculations are performed over tagged tests only.

No cached or stale data is used in any calculations.

Data Limits & Performance

To ensure fast dashboard performance, the Reports page uses the following optimizations:

  • Server-side aggregations: Duration percentiles, averages, and other statistics are computed on the server to minimize data transfer.
  • Data limit: A maximum of 10,000 tests or test runs is processed per query. For projects exceeding this limit, a truncation indicator is shown.
  • Parallel queries: Team-wide metrics fetch data from all projects in parallel for faster load times.

If your project has more than 10,000 tests, consider using tag filters to focus on specific subsets for accurate metrics.

Color Coding System

The Reports dashboard uses consistent color coding across all metrics:

  • Green: Excellent performance (80%+)
  • Amber: Good performance (60-79%)
  • Red: Needs attention (<60%)
  • Blue: Neutral/informational metrics
  • Purple: Active/processing states

This visual system helps you quickly identify areas that need attention and celebrate successes.
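
For reference, a score-to-color mapping consistent with the thresholds above might look like this sketch:

```typescript
// Map a percentage score to the dashboard's traffic-light color.
type ScoreColor = "green" | "amber" | "red";

function scoreColor(score: number): ScoreColor {
  if (score >= 80) return "green"; // excellent
  if (score >= 60) return "amber"; // good, room to improve
  return "red";                    // needs attention
}
```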