6.3 Analytics and Latency

Draft — not verified

Theoretical Background: Analytics and Latency Measurement

Learning Objectives

  • Understand the useAnalyticsEngine data model and statistical metrics
  • Derive the statistical quantities (mean, variance, p95) computed over control samples
  • Understand the latency_tracker.py stamped-command interception pattern
  • Interpret the HTTP API exposed by the latency service
  • Apply analytics data to evaluate and compare controller performance
  • Connect the analytics pipeline to the broader integrated system (6.1, 6.2, 6.4)

6.3.1 Analytics Engine Overview

The useAnalyticsEngine composable implements a singleton performance recorder: module-level reactive refs persist data across page navigations, allowing continuous collection during a multi-page session.

// useAnalyticsEngine.ts  — module-level (singleton pattern)
const _isCollecting   = ref(false)
const _dataPoints     = ref<DataPoint[]>([])
const _collectionRate = ref(10)        // Hz
const _maxDataPoints  = ref(1000)      // circular buffer size

DataPoint Interface:

interface DataPoint {
  timestamp:     number   // Date.now() milliseconds
  x:             number   // robot x position (m)
  y:             number   // robot y position (m)
  theta:         number   // robot heading (rad)
  linearVel:     number   // commanded linear velocity (m/s)
  angularVel:    number   // commanded angular velocity (rad/s)
  distanceError: number   // Euclidean distance to current goal (m)
  headingError:  number   // heading error e_theta (rad)
  algorithm:     string   // active control algorithm name
  latencyMs:     number   // round-trip latency at this sample (ms)
}

Data points are collected at collectionRate Hz by the control loop in useMobileRobotSimulation or the ROS 2 telemetry handler, depending on the active connection mode.
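The bounded buffer behind _maxDataPoints can be illustrated with a short Python sketch. The source does not show how the composable evicts old samples, so the drop-oldest behaviour here is an assumption, and make_buffer is a hypothetical helper name. At the default 10 Hz, a 1000-point buffer holds roughly the last 100 seconds of data.

```python
from collections import deque

MAX_DATA_POINTS = 1000  # mirrors _maxDataPoints in the listing above


def make_buffer(max_points: int = MAX_DATA_POINTS) -> deque:
    """Return a FIFO buffer that drops its oldest entry once it is full.

    Assumption: the analytics engine evicts the oldest sample first.
    """
    return deque(maxlen=max_points)


# Small capacity so the eviction is easy to see.
buffer = make_buffer(max_points=3)
for i in range(5):
    buffer.append({"timestamp": i, "distanceError": 0.1 * i})

# Only the three most recent samples remain.
print([p["timestamp"] for p in buffer])  # [2, 3, 4]
```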

6.3.2 Statistical Metrics

6.3.2.1 AnalyticsStats Interface

interface AnalyticsStats {
  sampleCount:       number  // N
  avgLinearVel:      number  // mean |v| (m/s)
  avgAngularVel:     number  // mean |omega| (rad/s)
  maxLinearVel:      number  // peak |v| (m/s)
  avgDistanceError:  number  // mean e_d (m)
  avgHeadingError:   number  // mean |e_theta| (rad)
  stdDevError:       number  // standard deviation of e_d (m)
  p95Error:          number  // 95th percentile of e_d (m)
  totalPathLength:   number  // sum of inter-sample distances (m)
}

6.3.2.2 Mean and Standard Deviation

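For $N$ samples of distance error $e_{d,i}$, the mean and the (population) standard deviation are:

$$\bar{e}_d = \frac{1}{N}\sum_{i=1}^{N} e_{d,i}, \qquad \sigma_{e_d} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(e_{d,i} - \bar{e}_d\right)^2}$$

The implementation divides by $N$, not $N-1$: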

// useAnalyticsEngine.ts
const avgDist  = pts.reduce((s, p) => s + p.distanceError, 0) / n
const variance = pts.reduce((s, p) =>
  s + (p.distanceError - avgDist) ** 2, 0) / n
const stdDev   = Math.sqrt(variance)
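For offline checks, the same quantities can be reproduced in Python. This is a minimal sketch, not the engine's code: error_stats is a hypothetical helper and the sample values are illustrative. Because the listing divides by n, the result matches statistics.pstdev (population form) rather than statistics.stdev (sample form).

```python
import math
import statistics


def error_stats(errors: list[float]) -> tuple[float, float]:
    """Return (mean, population standard deviation) of the distance errors."""
    n = len(errors)
    if n == 0:
        return 0.0, 0.0
    mean = sum(errors) / n
    variance = sum((e - mean) ** 2 for e in errors) / n  # divide by n, not n - 1
    return mean, math.sqrt(variance)


errors = [0.10, 0.20, 0.30, 0.40]  # illustrative distance errors in metres
mean, std = error_stats(errors)

print(round(mean, 4))                       # 0.25
print(round(std, 4))                        # 0.1118 (population form)
print(round(statistics.stdev(errors), 4))   # 0.1291 (sample form, for comparison)
```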

6.3.2.3 95th Percentile Error (p95)

The p95 error characterises worst-case controller performance more robustly than the maximum, which is sensitive to single outlier samples:

const sortedErrors = [...pts.map(p => p.distanceError)].sort((a, b) => a - b)
const p95 = sortedErrors[Math.floor(n * 0.95)] || 0
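The index rule above has one consequence worth seeing on small inputs: for 20 or fewer samples, floor(n * 0.95) equals n - 1, so the reported p95 is simply the maximum. The Python sketch below mirrors the rule; p95_floor_index is a hypothetical name and the data are illustrative.

```python
import math


def p95_floor_index(errors: list[float]) -> float:
    """95th percentile using the floor(n * 0.95) index rule from the listing above."""
    if not errors:
        return 0.0
    ordered = sorted(errors)
    return ordered[math.floor(len(ordered) * 0.95)]


# 100 samples 1.0 .. 100.0: index floor(95.0) = 95 selects the 96th value.
large = [float(i) for i in range(1, 101)]
print(p95_floor_index(large))   # 96.0

# 10 samples with one outlier: index floor(9.5) = 9 is the last element,
# so the outlier itself is reported as p95.
small = [0.1] * 9 + [5.0]
print(p95_floor_index(small))   # 5.0
```

In practice this means p95 only starts to discount outliers once a run contains more than 20 samples, which is about two seconds of collection at 10 Hz.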

6.3.2.4 Total Path Length

Odometric path length is accumulated by summing Euclidean inter-sample displacements:

let totalPath = 0
for (let i = 1; i < pts.length; i++) {
  const dx = pts[i].x - pts[i-1].x
  const dy = pts[i].y - pts[i-1].y
  totalPath += Math.hypot(dx, dy)
}
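Because each step is a straight chord between samples, the accumulated length is a lower bound on the true length of a curved path, and the bound tightens as the sampling rate increases. The Python sketch below illustrates this on a unit-radius quarter circle; both helper names and the test path are assumptions made for illustration.

```python
import math


def total_path_length(points: list[tuple[float, float]]) -> float:
    """Sum of straight-line distances between consecutive (x, y) samples."""
    return sum(
        math.hypot(x1 - x0, y1 - y0)
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


def quarter_circle(samples: int) -> list[tuple[float, float]]:
    """Evenly spaced samples along a unit-radius quarter circle (illustrative path)."""
    step = (math.pi / 2) / (samples - 1)
    return [(math.cos(i * step), math.sin(i * step)) for i in range(samples)]


coarse = total_path_length(quarter_circle(2))    # one chord, about 1.414
fine = total_path_length(quarter_circle(100))    # approaches pi / 2, about 1.571

print(round(coarse, 3), round(fine, 3), round(math.pi / 2, 3))
```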

6.3.3 Data Export

useAnalyticsEngine exports the collected data in two formats.

CSV Export:

const exportCSV = () => {
  const headers = ['timestamp','x','y','theta','linearVel','angularVel',
                   'distanceError','headingError','algorithm','latencyMs']
  const rows    = dataPoints.value.map(p =>
    headers.map(h => p[h as keyof DataPoint]).join(','))
  const blob    = new Blob(
    [headers.join(',') + '\n' + rows.join('\n')],
    { type: 'text/csv' }
  )
  saveAs(blob, `arbiter_analytics_${Date.now()}.csv`)
}

JSON Export:

The JSON export includes both the raw data points array and the computed AnalyticsStats summary, enabling offline post-processing with MATLAB, Python, or R.
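As one possible post-processing step, the CSV export can be grouped by its algorithm column to compare controllers. The sketch below uses only the column names from the export listing; the algorithm labels (pid, pure_pursuit), the sample rows, and the function name are made up for illustration.

```python
import csv
import io
from collections import defaultdict

# Illustrative export content; real files come from exportCSV().
SAMPLE_CSV = """timestamp,x,y,theta,linearVel,angularVel,distanceError,headingError,algorithm,latencyMs
1000,0.0,0.0,0.0,0.2,0.0,1.00,0.10,pid,0.3
1100,0.1,0.0,0.0,0.2,0.0,0.80,0.05,pid,0.4
1200,0.0,0.0,0.0,0.3,0.1,1.00,0.20,pure_pursuit,0.3
1300,0.2,0.0,0.1,0.3,0.1,0.60,0.10,pure_pursuit,0.5
"""


def mean_error_by_algorithm(csv_text: str) -> dict[str, float]:
    """Return the mean distanceError for each value of the algorithm column."""
    errors = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        errors[row["algorithm"]].append(float(row["distanceError"]))
    return {name: sum(vals) / len(vals) for name, vals in errors.items()}


summary = mean_error_by_algorithm(SAMPLE_CSV)
for name, value in summary.items():
    print(f"{name}: {value:.2f} m")
```

A fair comparison would also require that both runs follow the same goal sequence, since distanceError depends on the goals as much as on the controller.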

6.3.4 Latency Measurement Architecture

ArbiterROS measures the one-way command latency (browser to backend, not a full round trip) between the moment a velocity command is generated in the browser and the moment it is processed by the backend node. The measurement uses a stamped-command pattern implemented across the frontend, latency_tracker.py, and the HTTP API.

6.3.4.1 The Stamped Command Pattern

A standard geometry_msgs/Twist carries no timestamp. To measure latency, the frontend wraps the twist in a JSON envelope with a millisecond-resolution wall-clock timestamp (Date.now()) before publishing it as a string to /stamped_cmd_vel:

// Frontend — useROSBridge.ts or useMobileRobotWithROS2.ts
const sendStampedCommand = (linear: number, angular: number) => {
  const stamped = {
    header: { stamp: Date.now() },   // milliseconds since epoch
    twist:  { linear: { x: linear }, angular: { z: angular } }
  }
  publishTopic('/stamped_cmd_vel', 'std_msgs/String',
               { data: JSON.stringify(stamped) })
}

6.3.4.2 latency_tracker.py

The backend interceptor subscribes to /stamped_cmd_vel, extracts the frontend timestamp, computes the one-way latency, and republishes the raw twist to /cmd_vel_measured for the robot:

# latency_tracker.py
def _on_stamped_cmd(self, msg: String):
    payload   = json.loads(msg.data)
    send_ns   = payload['header']['stamp'] * 1_000_000  # ms -> ns
    recv_ns   = self.get_clock().now().nanoseconds
    latency_ms = (recv_ns - send_ns) / 1_000_000.0

    # Store the sample and publish it on the report topic
    record = {
        'latency_ms':  round(latency_ms, 4),
        'linear_x':    payload['twist']['linear']['x'],
        'angular_z':   payload['twist']['angular']['z'],
        'timestamp':   time.time(),
        'deploy_mode': self.deploy_mode,
    }
    self.records.append(record)
    self.report_pub.publish(String(data=json.dumps(record)))

    # Forward to robot (float() because JSON may carry integers such as 0,
    # and Twist fields must be Python floats)
    twist = Twist()
    twist.linear.x  = float(payload['twist']['linear']['x'])
    twist.angular.z = float(payload['twist']['angular']['z'])
    self.forward_pub.publish(twist)

6.3.4.3 Latency Measurement Equation

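The one-way latency in milliseconds is:

$$L_{\mathrm{ms}} = \frac{t_{\mathrm{recv}} - t_{\mathrm{send}}}{10^{6}}$$

where $t_{\mathrm{send}}$ is Date.now() in the browser at the moment of publication, converted from milliseconds to nanoseconds, and $t_{\mathrm{recv}}$ is self.get_clock().now().nanoseconds in the Python node. Both are treated as wall-clock times (the ROS 2 node clock follows system time unless use_sim_time is enabled). The measurement assumes the browser and backend clocks are synchronised (NTP or same host).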
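The calculation can be isolated from ROS 2 and checked with plain numbers. The Python sketch below follows the envelope layout from 6.3.4.1; the function name and the timestamps are hypothetical. The second call shows why clock synchronisation matters: a backend clock running 2 ms behind the browser turns a 0.34 ms latency into a negative reading.

```python
import json

NS_PER_MS = 1_000_000


def latency_ms(envelope_json: str, recv_ns: int) -> float:
    """One-way latency from a stamped-command envelope and a receive time in ns."""
    payload = json.loads(envelope_json)
    send_ns = payload["header"]["stamp"] * NS_PER_MS  # browser ms -> ns
    return (recv_ns - send_ns) / NS_PER_MS


# Illustrative envelope, as the frontend would serialise it.
envelope = json.dumps({
    "header": {"stamp": 1_712_345_678_900},
    "twist": {"linear": {"x": 0.4}, "angular": {"z": 0.0}},
})

true_recv_ns = 1_712_345_678_900_340_000      # 0.34 ms after the stamp
synced = latency_ms(envelope, true_recv_ns)

skew_ns = 2 * NS_PER_MS                       # backend clock 2 ms behind
skewed = latency_ms(envelope, true_recv_ns - skew_ns)

print(synced)   # 0.34
print(skewed)   # -1.66
```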

6.3.5 HTTP Latency API

latency_tracker.py runs an embedded HTTP server on port 8085 alongside the ROS 2 node, serving two endpoints:

GET /api/latency

Returns current statistics and recent records:

{
  "stats": {
    "count":    142,
    "mean_ms":  0.34,
    "min_ms":   0.12,
    "max_ms":   1.87,
    "p95_ms":   0.71,
    "stddev_ms": 0.18
  },
  "records": [
    { "latency_ms": 0.31, "linear_x": 0.4, "angular_z": 0.0,
      "timestamp": 1712345678.9, "deploy_mode": "docker" },
    ...
  ],
  "deploy_mode": "docker"
}

GET /api/health

{ "status": "ok", "node": "latency_tracker",
  "uptime_s": 142.3, "records_count": 142 }

The /latency frontend page polls /api/latency at 2 Hz to populate the live dashboard. useLatencyDashboard stores a rolling window of recent samples for the sparkline chart.
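The source does not show how the stats object of /api/latency is computed. One plausible reading, sketched below in Python, reuses the floor-index percentile and population standard deviation from 6.3.2; the function name summarise and the sample records are assumptions, while the field names are taken from the response shown above.

```python
import math
import statistics


def summarise(records: list[dict]) -> dict:
    """Build a stats dict with the field names of the /api/latency response."""
    values = sorted(r["latency_ms"] for r in records)
    n = len(values)
    if n == 0:
        return {"count": 0, "mean_ms": 0.0, "min_ms": 0.0,
                "max_ms": 0.0, "p95_ms": 0.0, "stddev_ms": 0.0}
    return {
        "count": n,
        "mean_ms": statistics.fmean(values),
        "min_ms": values[0],
        "max_ms": values[-1],
        # Assumed percentile rule: same floor(n * 0.95) index as the frontend.
        "p95_ms": values[math.floor(n * 0.95)],
        # Assumed population form (divide by n), as in the analytics engine.
        "stddev_ms": statistics.pstdev(values),
    }


records = [{"latency_ms": v} for v in (0.2, 0.4, 0.3, 0.5)]  # illustrative
stats = summarise(records)
print(stats["count"], round(stats["mean_ms"], 2), stats["p95_ms"])  # 4 0.35 0.5
```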

6.3.6 Performance Benchmarks by Deployment Mode

Mode                     | Typical mean (ms) | p95 (ms) | Bottleneck
Simulation (client-side) | N/A               | N/A      | No network path
ROS 2 Virtual (Docker)   | 0.1–0.5           | 0.8      | Docker bridge + loopback
ROS 2 Virtual (native)   | 0.1–0.3           | 0.5      | Loopback socket
Hardware Pi 4B (WiFi)    | 0.2–0.8           | 1.5      | WiFi round-trip + GPIO
Cloud (DigitalOcean)     | 5–30              | 50       | Internet RTT

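Clock synchronisation is critical for accurate measurement, because any offset between the browser and backend clocks adds directly to the computed latency. On the Pi 4B, run chronyc tracking to verify that NTP synchronisation is within 10 ms before interpreting latency data. Sub-millisecond readings are only meaningful when the residual offset is well below the latency being measured, for example when browser and backend run on the same host.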

Integration: Theory to Practice

The analytics pipeline is connected to every other module in Chapter 6. The web dashboard (6.2) triggers startCollection() and stopCollection() on the analytics engine and reads AnalyticsStats to display live performance cards. The autonomous behaviour node (6.1) produces the /cmd_vel commands that are time-stamped and measured by the latency tracker. The Docker deployment (6.4) runs latency_tracker.py as a declared service in docker-compose.yml, and its :8085 port is exposed alongside the rosbridge :9090 port.

Theoretical Design Choices

Why p95 instead of maximum error? The maximum is dominated by transient events (initialisation, mode transitions) that do not characterise steady-state controller performance. The 95th percentile gives a more robust near-worst-case estimate by discarding the largest 5% of samples, consistent with standard embedded systems timing analysis practice.

Why a singleton analytics state? The /control and /analytics pages are separate routes. If analytics state were scoped inside the composable function, navigating away from /control would destroy the collected data before the user reaches /analytics. Module-level refs persist for the lifetime of the browser tab, enabling cross-page data continuity.

Why /stamped_cmd_vel instead of modifying /cmd_vel? The /cmd_vel topic uses the standard geometry_msgs/Twist type expected by Nav2 and most other ROS 2 velocity consumers (and by move_base in ROS 1). Changing its type would break interoperability. Publishing a parallel stamped string topic adds measurement capability without disturbing the standard interface.

Why embed the HTTP server in the ROS 2 node? The alternative — a separate Flask service reading from a shared file or database — requires process synchronisation and adds deployment complexity. Running the HTTP server in a daemon thread inside the ROS 2 node keeps the latency data co-located with its source and avoids additional containers in the Compose file.
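The daemon-thread arrangement can be sketched with the Python standard library alone. This is not the code of latency_tracker.py: start_health_server is a hypothetical helper, only /api/health is served, and the sketch binds to 127.0.0.1 where a containerised service would more likely bind to all interfaces. The response fields follow the example in 6.3.5.

```python
import json
import threading
import time
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def start_health_server(records: list, port: int = 8085) -> ThreadingHTTPServer:
    """Serve /api/health from a daemon thread and return the server object."""
    started = time.monotonic()

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path != "/api/health":
                self.send_error(404)
                return
            body = json.dumps({
                "status": "ok",
                "node": "latency_tracker",
                "uptime_s": round(time.monotonic() - started, 1),
                "records_count": len(records),  # shared list, read only here
            }).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):
            pass  # keep request logging out of the node's console output

    server = ThreadingHTTPServer(("127.0.0.1", port), Handler)
    # daemon=True: the HTTP thread never blocks process shutdown.
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

In a ROS 2 node, the main thread would typically keep spinning the executor while this thread answers HTTP requests. Passing the same records list that the subscription callback appends to is what keeps the data co-located with its source.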