Introduction
The Operations Central Monitoring setup collects monitoring information from across the instrument and provides monitoring dashboards and an alarm-management system on top of it. It offers the following user services:
A Grafana monitoring & alerting system, exposed on http://localhost:3001 (credentials: admin/admin),
An Alerta alarm-management system implementing the ISA 18.2 alarm model, exposed on http://localhost:8081 (credentials: admin/alerta).
It also includes the following backing services to support the setup:
A Prometheus database that collects monitoring information from across the instrument, exposed on http://localhost:9091,
A Node Exporter scraper that collects monitoring information of the host running this software stack, exposed on http://localhost:9100.
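As a rough sketch of how the Prometheus backing service can be queried directly, the snippet below builds a request against the Prometheus HTTP API endpoint `/api/v1/query`. The base URL is the localhost one listed above. The helper names `build_query_url` and `run_query` are illustrative assumptions and not part of this software package.

```python
import json
import urllib.parse
import urllib.request

# Base URL as listed above; adjust the host if Prometheus runs elsewhere.
PROMETHEUS_URL = "http://localhost:9091"


def build_query_url(expr, base_url=PROMETHEUS_URL):
    """Return the URL of an instant query for the PromQL expression `expr`."""
    return f"{base_url}/api/v1/query?" + urllib.parse.urlencode({"query": expr})


def run_query(expr, base_url=PROMETHEUS_URL, timeout=5.0):
    """Run an instant query and return the decoded JSON response.

    Requires a running Prometheus instance at `base_url`.
    """
    url = build_query_url(expr, base_url)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

With the stack running, `run_query("up")` should return a JSON document whose `data.result` list holds one entry per scrape target, where a value of `"1"` means Prometheus could reach that target on its last scrape.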
Hint
The URLs assume you’re running this software on localhost. If you’re accessing it on a server, replace localhost with the hostname of the hosting system.
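The hostname substitution described in this hint can be sketched as follows. The service names and ports are the ones listed in this section. The function name `urls_for_host` and the hostname `monitor.example.org` are hypothetical and used for illustration only.

```python
from urllib.parse import urlsplit, urlunsplit

# Default URLs as listed in this section.
DEFAULT_URLS = {
    "grafana": "http://localhost:3001",
    "alerta": "http://localhost:8081",
    "prometheus": "http://localhost:9091",
    "node_exporter": "http://localhost:9100",
}


def urls_for_host(hostname, urls=DEFAULT_URLS):
    """Return a copy of `urls` with the host replaced, keeping scheme and port."""
    result = {}
    for name, url in urls.items():
        parts = urlsplit(url)
        # Keep the original port, if any, and swap only the host part.
        netloc = f"{hostname}:{parts.port}" if parts.port else hostname
        result[name] = urlunsplit(parts._replace(netloc=netloc))
    return result


# Hypothetical server name, for illustration only.
for name, url in urls_for_host("monitor.example.org").items():
    print(f"{name}: {url}")
```

Keeping the defaults in one mapping means the ports only need to change in one place if the deployment exposes the services differently.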
The services are connected as follows. The green components are part of this software package, the gray components are external:
![digraph monitoring_setup {
layout=dot;
nodesep=1.2;
fontname="Helvetica,Arial,sans-serif"
node [fontname="Helvetica,Arial,sans-serif" fontsize="20pt" style=filled fixedsize=true]
edge [fontname="Helvetica,Arial,sans-serif" fontsize="20pt"]
rankdir=TB;
node [shape=ellipse height=1 width=2 color=gray];
slack;
node [shape=rectangle width=1 color=gray];
user;
subgraph cluster_operational_central_management {
color=black;
label="Operational Central Management";
node [shape=ellipse height=1 width=2 color=aquamarine];
prometheus; grafana; alerta; node_exporter;
prometheus -> grafana [label="query results"];
grafana -> alerta [label="alerts"];
node_exporter -> prometheus [label="metrics"];
grafana -> prometheus [label="metrics"];
prometheus -> prometheus [label="metrics"];
}
subgraph cluster_station {
label="LOFAR2.0 Station";
node [shape=ellipse height=1 width=2 color=gray];
station_prometheus [label="prometheus"];
station_grafana [label="grafana"]
station_node_exporter [label="node_exporter"]
hardware
tango_devices;
exporter;
jupyter;
station_node_exporter -> station_prometheus [label="metrics"];
station_prometheus -> station_grafana [label="query results"];
station_grafana -> station_prometheus [label="metrics"];
hardware -> tango_devices [label="M&C"];
tango_devices -> exporter [label="metrics"];
exporter -> station_prometheus [label="metrics"];
station_prometheus -> jupyter [label="metrics"];
tango_devices -> jupyter [label="M&C"]
}
station_prometheus -> prometheus [label="metrics" minlen=1];
station_grafana -> user [label="dashboards"]
jupyter -> user [label="M&C"];
alerta -> slack [label="notifications"];
grafana -> user [label="dashboards"];
alerta -> user [label="notifications"];
slack -> user [label="notifications"];
}](_images/graphviz-fa337bd3143e8c185237854c6168a8550c274c7f.png)