Monitoring
Why Prometheus
Prometheus has become the mainstream open source monitoring tool in the container and microservice world.
It offers ready-to-use exporters for all the services in our system and implements a pull mechanism that is well suited to microservice architectures.
Furthermore, it supports additional features such as Alertmanager and a query language (PromQL) to retrieve the data, and it integrates with the visualization tool Grafana.
Kubernetes Prometheus Stack
We are using the Kube-Prometheus Stack developed by the Prometheus community.
Installing the Helm chart
Make sure your kubectx is set to the Airy Core instance and you have Helm installed.
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install prometheus prometheus-community/kube-prometheus-stack
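After the install, a quick check along the following lines can confirm that the release came up. This is a minimal sketch: it assumes the release name `prometheus` from the command above, the current namespace, and that the chart labels its pods with `release=<name>`, which may differ between chart versions.

```shell
# Release name used in the install command above.
RELEASE="prometheus"

if command -v helm >/dev/null 2>&1; then
  # Show the deployment status of the release.
  helm status "$RELEASE" || echo "release $RELEASE not found in the current namespace"
fi

if command -v kubectl >/dev/null 2>&1; then
  # Assumption: pods created by the chart carry a release=<name> label.
  kubectl get pods -l "release=${RELEASE}" --request-timeout=10s \
    || echo "could not reach the cluster"
fi
```

If the label selector returns nothing, a plain `kubectl get pods` in the same namespace shows what was actually created.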
Customizing Prometheus
In this section you can customize the Prometheus chart by changing the defaults in infrastructure/tools/prometheus/values.yaml to ones that suit your requirements.
To access Prometheus, Grafana and Alertmanager from outside the cluster you have to put your hostname in the respective hosts: [] variable.
If you make Grafana publicly accessible, you should also set the adminPassword to something secure.
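The settings described above could look roughly like the following. This is a sketch under assumptions: the key layout follows the upstream kube-prometheus-stack chart, the hostname `monitoring.example.com` and the password are placeholders, and the file is written to a scratch location purely for illustration. In practice you would make the equivalent changes in infrastructure/tools/prometheus/values.yaml, which may already define ingress paths and other options that are left out here.

```shell
# Hypothetical hostname, replace with the one that points at your cluster.
MONITORING_HOST="monitoring.example.com"
# Illustrative scratch file; the real settings live in
# infrastructure/tools/prometheus/values.yaml.
VALUES_SKETCH="$(mktemp)"

cat > "$VALUES_SKETCH" <<EOF
prometheus:
  ingress:
    enabled: true
    hosts:
      - ${MONITORING_HOST}
alertmanager:
  ingress:
    enabled: true
    hosts:
      - ${MONITORING_HOST}
grafana:
  # Placeholder: replace with a strong secret before exposing Grafana publicly.
  adminPassword: "change-me"
  ingress:
    enabled: true
    hosts:
      - ${MONITORING_HOST}
EOF

# Print the result for review.
cat "$VALUES_SKETCH"
```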
You can apply those changes by running:
helm upgrade prometheus prometheus-community/kube-prometheus-stack --values infrastructure/tools/prometheus/values.yaml
Grafana Dashboards
Grafana is a very powerful visualization tool which can be used for all sorts of dashboarding and monitoring tasks.
Before you can access Grafana, there is one more step: edit the prometheus-grafana ConfigMap and add the following settings to the [server] section.
kubectl edit configmap prometheus-grafana
[server]
domain = <your_hostname>
root_url = <your_hostname>/grafana
serve_from_sub_path = true
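Grafana reads its configuration at startup, so a manually edited ConfigMap is usually not picked up by a pod that is already running. A restart along these lines may be needed. This is a sketch that assumes the Grafana Deployment has the same name as the ConfigMap, `prometheus-grafana`, and lives in the current namespace.

```shell
# Assumption: the Grafana Deployment shares the ConfigMap's name.
GRAFANA_DEPLOYMENT="prometheus-grafana"

if command -v kubectl >/dev/null 2>&1; then
  # Restart the pods so they load the edited configuration,
  # then wait for the rollout to finish.
  kubectl rollout restart deployment "$GRAFANA_DEPLOYMENT" --request-timeout=10s \
    && kubectl rollout status deployment "$GRAFANA_DEPLOYMENT" --timeout=120s \
    || echo "could not restart $GRAFANA_DEPLOYMENT"
fi
```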
Access Grafana under /grafana and log in with the adminPassword you set in the values.yaml.
Access Predefined Dashboards
If defaultDashboardsEnabled is set to true, you can find the default Kubernetes dashboards under /grafana/dashboards.
The Grafana website provides a lot more dashboards that can be added to your instance by importing them.
Here is a list of dashboards we recommend adding to monitor your Airy Core instance:
Receiving alerts
To get notifications for the default alerts, all you have to do is set up a receiver as described here. That way you can be notified via Slack or PagerDuty about issues such as crashing components and nodes running out of free storage space.
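As an illustration, a Slack receiver could be sketched as follows. The assumptions here are that the receiver is configured through the chart's `alertmanager.config` values, that the webhook URL and channel are placeholders, and that the receiver name `slack-notifications` is free to choose. The `null` receiver is kept because the chart's default routes may still reference it.

```shell
# Hypothetical webhook URL and channel, replace with your own.
SLACK_WEBHOOK_URL="https://hooks.slack.com/services/REPLACE/ME"
SLACK_CHANNEL="#alerts"
# Illustrative scratch file; merge the result into your values file.
RECEIVER_SKETCH="$(mktemp)"

cat > "$RECEIVER_SKETCH" <<EOF
alertmanager:
  config:
    route:
      # Send everything that is not matched elsewhere to Slack.
      receiver: slack-notifications
      group_by: ['alertname', 'namespace']
    receivers:
      # Kept so that default routes pointing at 'null' still resolve.
      - name: 'null'
      - name: slack-notifications
        slack_configs:
          - api_url: "${SLACK_WEBHOOK_URL}"
            channel: "${SLACK_CHANNEL}"
            send_resolved: true
EOF

# Print the result for review.
cat "$RECEIVER_SKETCH"
```

A PagerDuty receiver would follow the same pattern, with a `pagerduty_configs` entry in place of `slack_configs`.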