SENSE RTMon
Dynamic Network Real-Time Monitoring for Multi-Domain Science Workflows
Overview
SENSE RTMon automates real-time monitoring and visualization of network services provisioned by the SENSE Orchestrator (SENSE-O). It generates Grafana dashboards, integrates with Prometheus and SiteRM, and provides end-to-end visibility of cross-domain network paths.
Quick Start
Prerequisites
Cloud Stack Host:
- Docker and Docker Compose installed.
- HTTPS/TLS certificates (e.g., Let’s Encrypt).
- Access to SENSE Orchestrator API endpoints.
Site Stack Hosts:
- Linux hosts with Docker support for exporter containers.
- SNMP/SSH access to monitored network devices.
Cloud Stack Deployment
(Central monitoring services: Grafana, Prometheus, Pushgateway)
1. Configuration
Edit config_cloud/config.yml:
hostIP: "YOUR_HOST_IP"
grafana_host: "http://your-grafana-host:3000"
grafana_api_token: "YOUR_GRAFANA_API_KEY"
siterm_url_map:
  "urn:ogf:network:example": "https://sense-fe.example.org/sitefe/json/frontend"
ssl_certificate: "/etc/letsencrypt/live/example.com/fullchain.pem"
ssl_certificate_key: "/etc/letsencrypt/live/example.com/privkey.pem"
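Before starting the stack, it can help to sanity-check the edited configuration. The following is a minimal sketch, not part of RTMon itself: `validate_config` and `REQUIRED_KEYS` are hypothetical names, and treating all six keys shown above as mandatory is an assumption. It works on an already-parsed mapping; loading the file (for example with PyYAML's `yaml.safe_load`) is left to the caller.

```python
from urllib.parse import urlparse

# Assumption: every key shown in the sample config.yml is required.
REQUIRED_KEYS = (
    "hostIP", "grafana_host", "grafana_api_token",
    "siterm_url_map", "ssl_certificate", "ssl_certificate_key",
)


def validate_config(cfg: dict) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for key in REQUIRED_KEYS:
        if not cfg.get(key):
            problems.append(f"missing or empty key: {key}")

    # Template placeholders such as YOUR_HOST_IP should have been replaced.
    for key in ("hostIP", "grafana_api_token"):
        value = cfg.get(key)
        if isinstance(value, str) and value.startswith("YOUR_"):
            problems.append(f"placeholder not replaced: {key}")

    # grafana_host must be an absolute http(s) URL.
    host = cfg.get("grafana_host")
    if isinstance(host, str) and host:
        parsed = urlparse(host)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append("grafana_host is not an http(s) URL")

    # siterm_url_map maps a site URN to a SiteRM frontend URL.
    url_map = cfg.get("siterm_url_map")
    if url_map and not isinstance(url_map, dict):
        problems.append("siterm_url_map is not a mapping")
    elif isinstance(url_map, dict):
        for urn, url in url_map.items():
            if not str(urn).startswith("urn:"):
                problems.append(f"siterm_url_map key is not a URN: {urn}")
            if urlparse(str(url)).scheme not in ("http", "https"):
                problems.append(f"siterm_url_map value is not a URL: {url}")
    return problems
```

An empty result only means these basic checks passed; it does not confirm that the token or the SiteRM endpoints are valid.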
2. Installation
# Install dependencies and deploy Docker containers
./install.sh # Installs Docker, Python modules
./start.sh # Launches Grafana, Prometheus, and RTMon workers
3. Dashboard Generation
# Sync dashboards with SENSE-O manifests
./update.sh
4. Access Dashboards
- Open Grafana at http://your-grafana-host:3000.
- Navigate to the Autogole folder for auto-generated dashboards.
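The same check can be scripted against Grafana's HTTP search API instead of the web UI. This is a sketch under stated assumptions: the helper names are hypothetical, the token is the `grafana_api_token` from the configuration, and the folder title is taken to be exactly `Autogole`.

```python
import json
import urllib.request


def fetch_dashboards(grafana_host: str, api_token: str) -> list:
    """Query Grafana's /api/search endpoint for dashboards visible to the token."""
    req = urllib.request.Request(
        f"{grafana_host.rstrip('/')}/api/search?type=dash-db",
        headers={"Authorization": f"Bearer {api_token}"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def dashboards_in_folder(search_results: list, folder_title: str) -> list:
    """Return the sorted titles of dashboards whose folderTitle matches."""
    return sorted(
        item["title"]
        for item in search_results
        if item.get("type") == "dash-db"
        and item.get("folderTitle") == folder_title
    )
```

For example, `dashboards_in_folder(fetch_dashboards("http://your-grafana-host:3000", token), "Autogole")` should list the generated dashboards; an empty list suggests the dashboards have not been synced yet.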
Site Stack Deployment
(Lightweight exporters for switches/hosts)
1. Node Exporter (Host Metrics)
cd exporters/node_exporter
docker compose up -d # Run on each monitored host (e.g., Rocky Linux)
2. SNMP Exporter (Switch Metrics)
- Edit snmp_exporter/snmp.yml to target your switches.
- Deploy:
cd exporters/snmp_exporter
docker compose up -d
3. Verify Exporters
Check metrics endpoints:
- Node: http://[HOST]:9100/metrics
- SNMP: http://[HOST]:9116/metrics
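The endpoint check can also be automated. The sketch below assumes the default ports listed above and uses hypothetical helper names; it fetches each /metrics page and counts the distinct metric names found in the Prometheus text format, reporting `None` when an endpoint cannot be reached.

```python
import urllib.request


def fetch_text(url: str, timeout: float = 5.0) -> str:
    """Download a metrics page as text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def metric_names(exposition: str) -> set:
    """Extract metric names from Prometheus text-format output."""
    names = set()
    for line in exposition.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # A sample line is "name{labels} value" or "name value".
        names.add(line.split("{", 1)[0].split(" ", 1)[0])
    return names


def check_exporters(host: str, fetch=fetch_text) -> dict:
    """Map exporter label to its metric-name count, or None if unreachable."""
    results = {}
    for label, port in (("node", 9100), ("snmp", 9116)):
        url = f"http://{host}:{port}/metrics"
        try:
            results[label] = len(metric_names(fetch(url)))
        except OSError:  # urllib's URLError and socket timeouts are OSErrors
            results[label] = None
    return results
```

The `fetch` parameter exists only so the function can be exercised without a network; in normal use, `check_exporters("HOST")` is enough.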
Key Features
- Automated Dashboards:
  - Real-time topology maps (Mermaid diagrams).
  - Layer 2/3 metrics (latency, packet loss, throughput).
- Network Diagnostics:
  - Integrated ping/traceroute via SiteRM.
  - Annotations for performance tests.
- Multi-Instance Support: Monitor multiple SENSE-O deployments.
Troubleshooting
| Issue | Solution |
|---|---|
| Dashboards not updating | Run ./update.sh and check SENSE-O logs. |
| Exporters offline | Verify Docker containers with docker ps. |
| SSL errors | Renew Let’s Encrypt certificates. |
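For the SSL row, knowing how many days remain on a certificate shows whether renewal is actually the problem. This is a rough sketch using only Python's standard library; the function names are hypothetical, and the host is assumed to serve its certificate on port 443.

```python
import socket
import ssl
import time
from typing import Optional


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Days left before a certificate's notAfter time (negative if expired)."""
    expires = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires - current) / 86400


def remote_not_after(hostname: str, port: int = 443) -> str:
    """Fetch the notAfter field of the certificate served at hostname:port.

    With the default context, an already expired certificate makes the
    handshake fail with ssl.SSLCertVerificationError, which is itself a
    sign that renewal is needed.
    """
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()["notAfter"]
```

For example, `days_until_expiry(remote_not_after("example.com"))` returns the remaining validity in days for the certificate that host presents.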