Upgrading to Prometheus 2.0: Configuration
on Oct 26, 2017
The GA release of Prometheus 2.0 is just around the corner and brings a host of improvements to the storage engine, removing bottlenecks and massively improving all-round performance. This blog post covers some of the config changes you’ll need to make to upgrade to Prometheus 2.0.
First off, the command line flags have changed. Instead of a single dash they all need a double dash. Common flags (--config.file, --web.listen-address and --web.external-url) are still the same but beyond that, almost all the storage-related flags have been removed.
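If you have a long 1.x command line in a script or manifest, fixing the dashes by hand gets tedious. Here’s a minimal shell sketch that rewrites single-dash flags as double-dash ones. The to_double_dash helper is hypothetical (it isn’t part of Prometheus), and it only fixes the dashes: it won’t spot flags that were removed in 2.0, so you’ll still need to check the result by hand.

```shell
# Hypothetical helper, not part of Prometheus: print each argument on its own
# line, turning single-dash flags (-config.file=...) into double-dash flags
# (--config.file=...). Removed flags are NOT detected.
to_double_dash() {
  for arg in "$@"; do
    case "$arg" in
      --*) printf '%s\n' "$arg" ;;    # already double-dash, keep as-is
      -?*) printf '%s\n' "-$arg" ;;   # single dash: add the missing one
      *)   printf '%s\n' "$arg" ;;    # not a flag, leave untouched
    esac
  done
}
```

For example, to_double_dash -config.file=prometheus.yml -web.listen-address=:9090 prints --config.file=prometheus.yml and --web.listen-address=:9090, one per line.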
Alertmanager service discovery
Alertmanager service discovery was introduced in Prometheus 1.4, allowing Prometheus to dynamically discover Alertmanager replicas using the same mechanism as monitoring targets. In Prometheus 2.0, the command line flags for static Alertmanager config have been removed, so the following command line flag:
./prometheus -alertmanager.url=http://alertmanager.default.svc.cluster.local/admin/alertmanager
might look like the following snippet in the prometheus.yml config file:
alerting:
  alertmanagers:
  - path_prefix: /admin/alertmanager
    kubernetes_sd_configs:
    - role: pod
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    relabel_configs:
    - source_labels: [__meta_kubernetes_pod_label_name]
      regex: alertmanager
      action: keep
    - source_labels: [__meta_kubernetes_namespace]
      regex: default
      action: keep
    - source_labels: [__meta_kubernetes_pod_container_port_number]
      regex:
      action: drop
In this snippet I’m instructing Prometheus to search for Kubernetes pods in the default namespace, with the label name: alertmanager and with a non-empty port.
Recording rules and alerts
Another thing that has changed is the format for alerting and recording rules. promtool has a command to automate the conversion, and Conor at Robust Perception published a blog post on this recently. An example of a recording rule in the old format:
namespace:container_cpu_usage_seconds_total:sum_rate =
sum(rate(container_cpu_usage_seconds_total{image!=""}[5m])) by (namespace)
might look like this in the new format:
groups:
- name: node.rules
  rules:
  - record: "namespace:container_cpu_usage_seconds_total:sum_rate"
    expr: "sum(rate(container_cpu_usage_seconds_total{image!=\"\"}[5m])) by (namespace)"
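If you have more than a couple of rule files, the promtool conversion mentioned above can be looped over a directory. This is a rough sketch: I’m assuming the promtool update rules subcommand that shipped with Prometheus 2.0 (later releases may not include it), that your old files end in .rules, and the convert_rules helper name is made up for illustration.

```shell
# Hypothetical helper: run the promtool rule converter over every *.rules
# file in a directory. Assumes `promtool update rules <file>` is available,
# as in the Prometheus 2.0 release; check `promtool --help` on your version.
convert_rules() {
  dir="$1"
  for f in "$dir"/*.rules; do
    [ -e "$f" ] || continue         # no matches: the glob stays literal, skip
    promtool update rules "$f"      # convert one old-format rule file
  done
}
```

Call it with the directory holding your old rules, for example convert_rules ./rules, and review the converted files before pointing Prometheus 2.0 at them.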
Prometheus non-root user
It’s also worth noting that the Prometheus Docker image is now built to run Prometheus as a non-root user. If you want the Prometheus UI/API to listen on a low port number (say, port 80), you’ll need to override the user in your Kubernetes YAML:
apiVersion: v1
kind: Pod
metadata:
  name: security-context-demo-2
spec:
  securityContext:
    runAsUser: 0
...
See https://kubernetes.io/docs/tasks/configure-pod-container/security-context/ for more details.
Prometheus Lifecycle
Finally, if you use the Prometheus /-/reload HTTP endpoint to automatically reload your Prometheus config when it changes, you’ll find the lifecycle endpoints are disabled by default in Prometheus 2.0. To enable them, set the --web.enable-lifecycle flag.
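Once the flag is set, triggering a reload is a single HTTP POST. Here’s a small sketch using curl; the reload_prometheus helper and the default http://localhost:9090 address are my assumptions, so swap in whatever address your Prometheus actually listens on.

```shell
# Hypothetical helper: ask a running Prometheus to reload its config.
# Only works if Prometheus 2.0 was started with --web.enable-lifecycle.
# The default base URL below is an assumption, not taken from your setup.
reload_prometheus() {
  base_url="${1:-http://localhost:9090}"
  # -f: exit non-zero on HTTP errors; -sS: quiet, but still print errors.
  # ${base_url%/} drops one trailing slash so the path is not doubled.
  curl -fsS -X POST "${base_url%/}/-/reload"
}
```

For example, reload_prometheus http://localhost:9090 sends the POST and returns a non-zero exit status if the endpoint is disabled or unreachable.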
Next
In the next blog post we’ll look at the data format changes between Prometheus 1.8 and 2.0, and how to keep access to your historic monitoring data after migration.