Centralized Monitoring

Monitor clusters, created with Kommander, on any connected cluster

Kommander provides centralized monitoring in a multi-cluster environment, using the monitoring stack running on the managed clusters. Centralized monitoring is provided by default in every Kommander cluster.

NOTE: Centralized monitoring is available and supported for Konvoy and non-Konvoy clusters.

Managed clusters are distinguished by a monitoring ID. The monitoring ID corresponds to the kube-system namespace UID of the cluster. To find a cluster’s monitoring ID, go to the Clusters tab in the Kommander UI (in the relevant workspace):

https://<CLUSTER_URL>/ops/portal/kommander/ui/#/clusters

Click the View Details link on the managed cluster card. The monitoring ID is listed under Monitoring ID (clusterId).

You can also search or filter by monitoring ID on the Clusters page linked above.

Alternatively, run the following kubectl command, using the target cluster’s context or kubeconfig, to look up that cluster’s kube-system namespace UID. The result tells you which cluster a given set of metrics and alerts corresponds to:

$ kubectl get namespace kube-system -o jsonpath='{.metadata.uid}'
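
When several clusters are attached, it can help to list the monitoring ID of each one in a single pass. The following is a minimal sketch, assuming each cluster is reachable through a named context in your kubeconfig. The function name `list_monitoring_ids` and the context names are hypothetical and not part of Kommander.

```shell
# Print "<context><TAB><monitoring ID>" for every kubeconfig context given
# as an argument. The monitoring ID is the UID of the kube-system namespace.
# Assumption: each argument is a valid context name in the active kubeconfig.
list_monitoring_ids() {
  local ctx uid
  for ctx in "$@"; do
    # Same lookup as the single-cluster command, with an explicit context.
    uid=$(kubectl --context "$ctx" get namespace kube-system \
      -o jsonpath='{.metadata.uid}') || {
      echo "failed to query context: $ctx" >&2
      continue
    }
    printf '%s\t%s\n' "$ctx" "$uid"
  done
}
```

For example, `list_monitoring_ids cluster-a cluster-b` prints one tab-separated line per context, which you can match against the Monitoring ID (clusterId) values shown in the UI. Contexts that cannot be queried are reported on standard error and skipped.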

Centralized Metrics

The Kommander cluster collects and presents metrics from all managed clusters remotely using Thanos. You can visualize these metrics in Grafana using a set of provided dashboards.

The Thanos Query component is installed on the Kommander cluster. Thanos Query queries the Prometheus instances running on the managed clusters through a Thanos sidecar that runs alongside each Prometheus container. Grafana is configured with Thanos Query as its data source and comes with a pre-installed dashboard, named Kubernetes / Compute Resources / Cluster [Global], that gives a global view of all managed clusters. The Thanos Query dashboard is also installed by default to monitor the Thanos Query component.

NOTE: Kommander reads metrics remotely from the managed clusters; it does not back them up. If a managed cluster goes down, Kommander can no longer collect or present its metrics, including past data.

You can access the centralized Grafana UI at:

https://<CLUSTER_URL>/ops/portal/kommander/monitoring/grafana

NOTE: This Grafana instance is separate from the one installed on all Konvoy clusters. It is dedicated specifically to components related to centralized monitoring.

Optionally, you can access the Thanos Query UI (essentially the Prometheus UI) at:

https://<CLUSTER_URL>/ops/portal/kommander/monitoring/query

You can also check that the managed clusters’ Thanos sidecars have been successfully added to Thanos Query by going to:

https://<CLUSTER_URL>/ops/portal/kommander/monitoring/query/stores
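
Because Thanos Query serves the Prometheus-compatible HTTP API, the same metrics can also be fetched from a script. The sketch below assumes that the API is reachable under the same URL prefix as the Thanos Query UI and that any authentication your ops portal requires (omitted here) is handled separately. The function name `thanos_instant_query` is hypothetical.

```shell
# Run an instant PromQL query against a Thanos Query endpoint.
#   $1: base URL of Thanos Query (assumed to be the UI URL shown above)
#   $2: PromQL expression
# Assumption: authentication is not needed or is supplied by other means.
thanos_instant_query() {
  local base_url promql
  base_url=${1%/}   # drop a trailing slash, if any
  promql=$2
  # -G sends the URL-encoded data as a GET query string.
  curl -sS -G "${base_url}/api/v1/query" --data-urlencode "query=${promql}"
}
```

A call such as `thanos_instant_query "https://<CLUSTER_URL>/ops/portal/kommander/monitoring/query" 'up'` would return a JSON document with one series per scrape target that Thanos Query can reach.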

The preferred method to view the metrics for a specific cluster is to go directly to that cluster’s Grafana UI.

Adding custom dashboards

You can also define custom dashboards for centralized monitoring on Kommander. There are a few methods to import dashboards into Grafana. For simplicity, assume the desired dashboard definition is in JSON format:

{
    "annotations":
    ...
    # Complete json file here
    ...
    "title": "Some Dashboard",
    "uid": "abcd1234",
    "version": 1
}

After creating your custom dashboard, configure Kommander to deploy it by modifying the cluster.yaml file as follows:

- name: kommander
  enabled: true
  values: |
    grafana:
      dashboards:
        default:
          some-dashboard:
            json: |
              {
                "annotations":
                ...
                # Complete json file here
                ...
                "title": "Some Dashboard",
                "uid": "abcd1234",
                "version": 1
              }
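
Pasting a large dashboard into cluster.yaml by hand makes it easy to get the indentation wrong. As a rough sketch, the helper below prints a dashboard entry with the JSON body indented to sit under `json: |`, matching the nesting in the fragment above. The function name `render_dashboard_entry` and the file name used in the example are hypothetical, and the input is assumed to be a complete, valid JSON file.

```shell
# Print a dashboard entry for the grafana.dashboards.default map.
#   $1: dashboard key (for example, some-dashboard)
#   $2: path to the dashboard JSON file
render_dashboard_entry() {
  local name json_file
  name=$1
  json_file=$2
  # 10 spaces: the entry sits one level below "default:".
  printf '          %s:\n' "$name"
  # 12 spaces: the block scalar that holds the dashboard definition.
  printf '            json: |\n'
  # 14 spaces: every line of the JSON body is nested under "json: |".
  sed 's/^/              /' "$json_file"
}
```

Running `render_dashboard_entry some-dashboard dashboard.json` prints text that can be pasted under `default:` in the kommander addon values.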

Centralized Alerts

Karma, an alert dashboard, provides a centralized view of alerts from managed clusters. Karma aggregates all alerts from the Alertmanagers running in the managed clusters, allowing you to visualize these alerts on one page. Using the Karma dashboard, you can get an overview of each alert and filter by alert type, cluster, and more.

NOTE: Silencing alerts using the Karma UI is currently not supported.

You can access the Karma dashboard UI at:

https://<CLUSTER_URL>/ops/portal/kommander/monitoring/karma

NOTE: When there are no managed clusters, the Karma UI displays an error message `Get https://placeholder.invalid/api/v2/status: dial tcp: lookup placeholder.invalid on 10.0.0.10:53: no such host`. This is expected, and the error disappears when clusters are connected.

Federating Prometheus Alerting Rules

You can define additional Prometheus alerting rules on the Kommander cluster and federate them to all of the managed clusters by following these instructions:

  1. Enable the PrometheusRule type for federation.

    kubefedctl enable PrometheusRules --kubefed-namespace kommander
    
  2. Modify the existing alertmanager rules.

    kubectl edit PrometheusRules/prometheus-kubeaddons-prom-alertmanager.rules -n kubeaddons
    
  3. Append a sample rule.

    - alert: MyFederatedAlert
      annotations:
        message: A custom alert that will always fire.
      expr: vector(1)
      labels:
        severity: warning
    
  4. Federate the rules you just modified.

    kubefedctl federate PrometheusRules prometheus-kubeaddons-prom-alertmanager.rules --kubefed-namespace kommander -n kubeaddons
    
  5. Ensure that the clusters selection (status.clusters) is appropriately set for your desired federation strategy and check the propagation status.

    kubectl get federatedprometheusrules prometheus-kubeaddons-prom-alertmanager.rules -n kubeaddons -oyaml
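
The non-interactive steps above (1, 4, and 5) can be wrapped in small shell functions so that the resource name and namespaces are defined in one place. This is only a sketch: the variable and function names are hypothetical, while the commands themselves are the ones listed in the steps. Editing the rules (steps 2 and 3) is still done by hand.

```shell
# Values taken from the steps above.
RULE_NAME="prometheus-kubeaddons-prom-alertmanager.rules"
RULE_NAMESPACE="kubeaddons"
KUBEFED_NAMESPACE="kommander"

# Step 1: enable the PrometheusRule type for federation.
enable_rule_federation() {
  kubefedctl enable PrometheusRules --kubefed-namespace "$KUBEFED_NAMESPACE"
}

# Step 4: federate the rules after they have been edited.
federate_rules() {
  kubefedctl federate PrometheusRules "$RULE_NAME" \
    --kubefed-namespace "$KUBEFED_NAMESPACE" -n "$RULE_NAMESPACE"
}

# Step 5: print the federated resource to inspect the cluster
# selection and the propagation status.
show_propagation() {
  kubectl get federatedprometheusrules "$RULE_NAME" \
    -n "$RULE_NAMESPACE" -o yaml
}
```

Call `enable_rule_federation`, edit the rules as described in steps 2 and 3, then call `federate_rules` followed by `show_propagation`.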