Skip to main content
🩺 DiagnosticsROS 2 · June 2026

ROS 2 Diagnostics Guide 2026

Monitor robot health with the ROS 2 diagnostics stack: diagnostic_updater for per-node status reporting, FrequencyStatus and TimeStampStatus for common checks, diagnostic_aggregator with GenericAnalyzer YAML config, and rqt_robot_monitor for real-time visualization.

Why Diagnostics Matter

Without diagnostics, a crashed sensor driver looks identical to a healthy robot from the outside — the robot just stops working. With diagnostics, you get a timestamped record of when each component went ERROR or STALE, which component caused the failure, and the exact error message. Essential for field robots where you can't attach a debugger.

1

ROS 2 Diagnostics Overview

The diagnostics system provides a standardized way for nodes to report their health status. Every component publishes to /diagnostics, which aggregators collect, group, and escalate.

text
ROS 2 Diagnostics Architecture
═══════════════════════════════════════════════════════
Node (sensor driver, motor controller, etc.)
  │
  │  publishes DiagnosticArray → /diagnostics (1 Hz)
  │
  ▼
diagnostic_aggregator
  │  subscribes to /diagnostics
  │  groups by Analyzer (namespace, prefix, name)
  │  publishes → /diagnostics_agg (1 Hz)
  │
  ▼
rqt_robot_monitor  /  robot_monitor  /  custom dashboard

DiagnosticStatus levels:
  OK    (0) — component healthy
  WARN  (1) — degraded, may need attention
  ERROR (2) — component failed
  STALE (3) — no updates received (timeout)

Key packages:
  diagnostic_updater     — C++ and Python helper classes for publishers
  diagnostic_aggregator  — subscriber that groups and analyzes statuses
  rqt_robot_monitor      — GUI tool to visualize aggregated diagnostics

Install:
  sudo apt install -y     ros-humble-diagnostic-updater     ros-humble-diagnostic-aggregator     ros-humble-rqt-robot-monitor

Always add diagnostics to production robot nodes. When a field robot fails, the /diagnostics_agg topic is often the first place to check — it shows which component raised an error and when it went STALE.

2

diagnostic_updater — Python Node

Use diagnostic_updater.Updater to publish health status from your node. Add DiagnosticTask callbacks that set the status level and message.

python
import rclpy
from rclpy.node import Node
import diagnostic_updater
import diagnostic_msgs.msg


class MotorControllerNode(Node):
    def __init__(self):
        super().__init__("motor_controller")

        # Create the updater (publishes to /diagnostics at 1 Hz)
        self.updater = diagnostic_updater.Updater(self)
        self.updater.setHardwareID("MotorController-v2.1")

        # Add named diagnostic tasks
        self.updater.add("Motor Temperature", self._check_temperature)
        self.updater.add("Motor Current",     self._check_current)
        self.updater.add("Encoder Status",    self._check_encoder)

        # Simulated sensor values
        self._temperature = 45.0   # °C
        self._current = 2.3        # Amps
        self._encoder_ok = True

        # Timer to update sensor readings
        self.create_timer(0.1, self._sensor_update)

        self.get_logger().info("Motor controller ready")

    def _sensor_update(self):
        """Update sensor readings (replace with real hardware reads)."""
        self._temperature += 0.01   # simulate heating
        # self.updater.update() is called automatically at 1 Hz by the updater

    def _check_temperature(self, stat: diagnostic_updater.DiagnosticStatusWrapper):
        stat.add("Temperature (°C)", f"{self._temperature:.1f}")
        stat.add("Threshold WARN (°C)", "70.0")
        stat.add("Threshold ERROR (°C)", "85.0")

        if self._temperature > 85.0:
            stat.summary(diagnostic_msgs.msg.DiagnosticStatus.ERROR,
                         f"OVERHEATING: {self._temperature:.1f}°C")
        elif self._temperature > 70.0:
            stat.summary(diagnostic_msgs.msg.DiagnosticStatus.WARN,
                         f"High temperature: {self._temperature:.1f}°C")
        else:
            stat.summary(diagnostic_msgs.msg.DiagnosticStatus.OK,
                         f"Temperature normal: {self._temperature:.1f}°C")
        return stat

    def _check_current(self, stat: diagnostic_updater.DiagnosticStatusWrapper):
        stat.add("Current (A)", f"{self._current:.2f}")
        if self._current > 10.0:
            stat.summary(diagnostic_msgs.msg.DiagnosticStatus.ERROR,
                         "Overcurrent detected")
        elif self._current > 7.0:
            stat.summary(diagnostic_msgs.msg.DiagnosticStatus.WARN,
                         "High current draw")
        else:
            stat.summary(diagnostic_msgs.msg.DiagnosticStatus.OK,
                         f"Current normal: {self._current:.2f}A")
        return stat

    def _check_encoder(self, stat: diagnostic_updater.DiagnosticStatusWrapper):
        stat.add("Encoder connected", str(self._encoder_ok))
        if not self._encoder_ok:
            stat.summary(diagnostic_msgs.msg.DiagnosticStatus.ERROR,
                         "Encoder disconnected")
        else:
            stat.summary(diagnostic_msgs.msg.DiagnosticStatus.OK,
                         "Encoder OK")
        return stat


def main():
    rclpy.init()
    node = MotorControllerNode()
    rclpy.spin(node)
3

FrequencyStatus and TimeStampStatus

diagnostic_updater provides pre-built tasks for common checks: FrequencyStatus monitors topic publish rate, TimeStampStatus checks message timestamp freshness.

python
import rclpy
from rclpy.node import Node
from sensor_msgs.msg import LaserScan
import diagnostic_updater


class LidarNode(Node):
    def __init__(self):
        super().__init__("lidar_node")

        self.updater = diagnostic_updater.Updater(self)
        self.updater.setHardwareID("RPLIDAR-S3-SN12345")

        # ── FrequencyStatus ───────────────────────────────────
        # Monitors how often tick() is called — warns if rate drifts
        freq_params = diagnostic_updater.FrequencyStatusParam(
            {"min": 9.5, "max": 10.5},   # expected: 10 Hz ±5%
            tolerance=0.05,
            window_size=10
        )
        self.freq_status = diagnostic_updater.FrequencyStatus(freq_params)
        self.updater.add(self.freq_status)

        # ── TimeStampStatus ───────────────────────────────────
        # Warns if message.header.stamp is stale (older than max_acceptable)
        ts_params = diagnostic_updater.TimeStampStatusParam(
            min_acceptable=-1.0,   # allow 1s in the past (bag replay)
            max_acceptable=0.5,    # warn if older than 0.5s
        )
        self.ts_status = diagnostic_updater.TimeStampStatus(ts_params)
        self.updater.add(self.ts_status)

        # Subscriber
        self.sub = self.create_subscription(
            LaserScan, "/scan", self._scan_cb, 10
        )

    def _scan_cb(self, msg: LaserScan):
        # Tick frequency counter
        self.freq_status.tick()

        # Tick timestamp checker
        stamp_sec = msg.header.stamp.sec + msg.header.stamp.nanosec * 1e-9
        self.ts_status.tick(stamp_sec)

FrequencyStatus is the most useful built-in task — it catches sensor drivers that drop packets, get throttled by a busy CPU, or silently stop publishing. Set the window_size to cover ~10 seconds of data so brief glitches don't trigger WARN.

4

diagnostic_aggregator — Group and Namespace Diagnostics

diagnostic_aggregator subscribes to /diagnostics and organizes statuses by prefix/namespace. Configure it with a YAML analyzers file.

yaml
# analyzers.yaml — configuration for diagnostic_aggregator
# Launch: ros2 run diagnostic_aggregator aggregator_node
#         --ros-args -p analyzers:=<path>/analyzers.yaml

analyzers:
  ros__parameters:
    path: /Robot
    analyzers:

      # Group all motor-related diagnostics under /Robot/Motors
      motors:
        type: diagnostic_aggregator/GenericAnalyzer
        path: Motors
        prefix: Motor
        # Matches any DiagnosticStatus whose name starts with "Motor"
        # e.g., "Motor Temperature", "Motor Current", "Motor Controller"

      # Group all sensor diagnostics under /Robot/Sensors
      sensors:
        type: diagnostic_aggregator/GenericAnalyzer
        path: Sensors
        contains:
          - Lidar
          - Camera
          - IMU
          - Encoder
        # Matches DiagnosticStatus names CONTAINING any of these strings

      # Group by exact hardware ID
      power:
        type: diagnostic_aggregator/GenericAnalyzer
        path: Power
        prefix: Battery
        num_items: 3    # expect exactly 3 battery status messages
        timeout: 5.0    # go STALE if no update in 5 seconds

      # Catch-all for unmatched diagnostics
      other:
        type: diagnostic_aggregator/GenericAnalyzer
        path: Other
        regex: ".*"
        remove_prefix: ""
5

Launch Aggregator and Monitor

Launch diagnostic_aggregator alongside your robot nodes and view results with rqt_robot_monitor or by echoing /diagnostics_agg.

python
# launch/diagnostics.launch.py
from launch import LaunchDescription
from launch_ros.actions import Node
import os

def generate_launch_description():
    analyzers_config = os.path.join(
        get_package_share_directory("my_robot"),
        "config", "analyzers.yaml"
    )

    return LaunchDescription([
        # Your robot nodes
        Node(package="my_robot", executable="motor_controller"),
        Node(package="my_robot", executable="lidar_driver"),

        # Diagnostic aggregator
        Node(
            package="diagnostic_aggregator",
            executable="aggregator_node",
            name="diagnostic_aggregator",
            parameters=[analyzers_config],
            output="screen",
        ),

        # Optional: republish aggregated to /diagnostics for rqt_robot_monitor
        # rqt_robot_monitor reads /diagnostics_agg by default
    ])

# ── From terminal ────────────────────────────────────────────
# View raw diagnostics (per-node output)
ros2 topic echo /diagnostics

# View aggregated diagnostics (grouped output)
ros2 topic echo /diagnostics_agg

# Launch rqt_robot_monitor GUI
ros2 run rqt_robot_monitor rqt_robot_monitor

# Check for errors at any level (one-liner for CI)
ros2 topic echo /diagnostics_agg --once | grep -E "level: [12]"
# level 1 = WARN, level 2 = ERROR

In CI pipelines, pipe /diagnostics_agg to a script that exits with code 1 if any status is ERROR. This catches sensor failures in automated tests before the robot ships to the field.

6

STALE Detection and Timeout Configuration

A diagnostic status goes STALE when the aggregator stops receiving updates from a node. This catches nodes that silently crash or freeze.

python
# Configuring STALE timeout in diagnostic_aggregator
# analyzers.yaml:
# sensors:
#   type: diagnostic_aggregator/GenericAnalyzer
#   path: Sensors
#   prefix: Lidar
#   timeout: 5.0   # mark STALE after 5s with no update

# In your node: updater publishes at 1 Hz by default.
# Force a specific rate:
from diagnostic_updater import Updater

class MyNode(Node):
    def __init__(self):
        super().__init__("my_node")
        self.updater = Updater(self)
        self.updater.setHardwareID("sensor-001")

        # Change publish period (default = 1.0s)
        self.updater.period = 0.5   # 2 Hz

        # Add a "heartbeat" task — always publishes OK while node is running
        self.updater.add("Heartbeat", lambda stat: (
            stat.summary(0, "Node alive"),
            stat
        )[-1])

# Manually force an update (useful after a state change):
self.updater.force_update()

# Force an error (useful for testing):
self.updater.broadcast(2, "Emergency stop triggered")
# All status in the next publish will be ERROR with this message

Quick Reference

Concept / APIDetails
Installsudo apt install ros-humble-diagnostic-updater ros-humble-diagnostic-aggregator ros-humble-rqt-robot-monitor
Updater (Python)diagnostic_updater.Updater(self); updater.add('name', callback)
setHardwareIDself.updater.setHardwareID('MyDevice-SN001') — sets the hardware_id field
Status levels0=OK, 1=WARN, 2=ERROR, 3=STALE
FrequencyStatusMonitors publish rate; tick() on each message received
TimeStampStatusMonitors header.stamp freshness; tick(stamp_sec) per message
aggregator_noderos2 run diagnostic_aggregator aggregator_node --ros-args -p analyzers:=file.yaml
Analyzer prefixprefix: Motor matches any name starting with 'Motor'
Analyzer containscontains: [Lidar, IMU] matches names containing any of these strings
STALE timeouttimeout: 5.0 in analyzer config — mark STALE after 5s no update