Setting up Alertmanager and Rules
---------------------------------
Now that you have Prometheus set up, you need to give it some instructions. The next step is to create a values file that specifies:
1) what the alert rules are,
2) what the Prometheus targets are (i.e. the definition of what to scrape and how) and any jobs for Prometheus, and
3) where alerts should be routed (in this case, Slack).
Alert Rules
------------
vi prometheus.values
## Prometheus server ConfigMap entries
##
serverFiles:
  ## Alerts configuration
  ## Ref: https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/
  alerts:
    groups:
      - name: Instances
        rules:
          - alert: InstanceDown
            expr: up == 0
            for: 5m
            labels:
              severity: page
            annotations:
              description: '{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes.'
              summary: 'Instance {{ $labels.instance }} down'
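The `for: 5m` clause means `up == 0` must hold continuously for five minutes before the alert fires; a single failed scrape only puts the alert into a pending state. The following sketch illustrates that behaviour (a hypothetical simulation for intuition, not Prometheus code):

```python
from datetime import datetime, timedelta

def alert_state(samples, hold=timedelta(minutes=5)):
    """Simulate the 'for:' semantics of an 'up == 0' alert rule.

    samples: list of (timestamp, up_value) pairs, oldest first.
    Returns 'inactive', 'pending', or 'firing'.
    """
    pending_since = None
    state = "inactive"
    for ts, up in samples:
        if up == 0:
            if pending_since is None:
                pending_since = ts  # condition first became true here
            state = "firing" if ts - pending_since >= hold else "pending"
        else:
            pending_since = None    # any successful scrape resets the clock
            state = "inactive"
    return state

t0 = datetime(2020, 1, 1)
scrapes = [(t0 + timedelta(minutes=m), 0) for m in range(7)]
print(alert_state(scrapes))  # down for 6 minutes -> firing
```

With only two or three minutes of downtime the same function stays in "pending", which is why brief scrape hiccups do not page anyone.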
Prometheus Targets
------------------
Next, you would set up the Prometheus targets and specify the jobs. Fortunately, Kubernetes scraping is already configured out of the box in the stable/prometheus chart, so for our purposes no additional action is required for this step.
Alert Routing
-------------
Here, the alerts are routed to Slack via an incoming webhook.
api_url / WEBHOOK_URL: https://hooks.slack.com/services/xxxxxxxxx/xxxxxxxx/xxxxxxxxxxxxx
To get your webhook URL, log in to Slack and open the following page:
https://curai.slack.com/apps/A0F7XDUAZ-incoming-webhooks?next_id=0
You can test whether the webhook URL is working by sending a message to Slack with the curl commands below.
curl -X POST --data-urlencode "payload={\"channel\": \"#MYCHANNELNAME\", \"username\": \"sanitybot\", \"text\": \"Just a sanity check that slack webhook is working.\", \"icon_emoji\": \":ghost:\"}" MY_WEBHOOK_URL
curl -X POST --data-urlencode "payload={\"channel\": \"#devops\", \"username\": \"webhookbot\", \"text\": \"This is posted to #devops and comes from a bot named webhookbot.\", \"icon_emoji\": \":ghost:\"}" https://hooks.slack.com/services/xxxxxxxxx/xxxxxxxx/xxxxxxxxxxxxx
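The same sanity check can be done from Python with only the standard library. This is a minimal sketch; the webhook URL and channel name are placeholders you must replace:

```python
import json
import urllib.request

def slack_payload(channel, username, text, icon_emoji=":ghost:"):
    # Build the JSON body that Slack's incoming-webhook endpoint expects.
    return json.dumps({
        "channel": channel,
        "username": username,
        "text": text,
        "icon_emoji": icon_emoji,
    }).encode("utf-8")

def post_to_slack(webhook_url, payload):
    # Incoming webhooks accept a plain JSON POST body.
    req = urllib.request.Request(
        webhook_url,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status

# Usage (replace with your real webhook URL before running):
# post_to_slack("https://hooks.slack.com/services/xxxxxxxxx/xxxxxxxx/xxxxxxxxxxxxx",
#               slack_payload("#devops", "sanitybot",
#                             "Just a sanity check that the Slack webhook is working."))
```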
vi prometheus.values
## alertmanager ConfigMap entries
##
alertmanagerFiles:
  alertmanager.yml:
    global: {}
      # slack_api_url: ''
    receivers:
      - name: default-receiver
        slack_configs:
          - channel: "#devops"
            send_resolved: true
            api_url: 'https://hooks.slack.com/services/xxxxxxxxx/xxxxxxxx/xxxxxxxxxxxxx'
            text: "description: {{ .CommonAnnotations.description }}\nsummary: {{ .CommonAnnotations.summary }}"
    route:
      group_by: [cluster]
      receiver: default-receiver
      routes:
        - match:
            severity: critical
          receiver: default-receiver
          repeat_interval: 1m
          group_wait: 10s
          group_interval: 5m
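The route block is a tree: an incoming alert is tested against each child route in order, and falls back to the top-level receiver and timing defaults when nothing matches. A simplified sketch of that matching logic (not Alertmanager's actual implementation; the 3h fallback is the chart's rendered default):

```python
def resolve_route(labels, route):
    """Walk a simplified Alertmanager route tree and return the receiver
    plus repeat_interval that apply to an alert with the given labels."""
    chosen = {
        "receiver": route.get("receiver"),
        "repeat_interval": route.get("repeat_interval", "3h"),
    }
    for child in route.get("routes", []):
        match = child.get("match", {})
        if all(labels.get(k) == v for k, v in match.items()):
            # A matching child route overrides the parent's settings.
            chosen["receiver"] = child.get("receiver", chosen["receiver"])
            chosen["repeat_interval"] = child.get("repeat_interval",
                                                  chosen["repeat_interval"])
            break
    return chosen

route = {
    "receiver": "default-receiver",
    "routes": [
        {"match": {"severity": "critical"},
         "receiver": "default-receiver",
         "repeat_interval": "1m"},
    ],
}
print(resolve_route({"severity": "critical"}, route))
```

So a `severity: critical` alert re-notifies every minute, while everything else repeats on the much slower default interval.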
Once your values.yaml file is prepared, you’re ready to upgrade.
# helm upgrade -f prometheus.values prometheus stable/prometheus
Now check your Prometheus alerts page:
https://monitoring.abtest.tk/alerts
You can also confirm that the settings you provided are correct using the command below:
# kubectl describe configmap prometheus-alertmanager -n prometheus
Name:         prometheus-alertmanager
Namespace:    prometheus
Labels:       app=prometheus
              chart=prometheus-9.3.1
              component=alertmanager
              heritage=Tiller
              release=prometheus
Annotations:
Data
====
alertmanager.yml:
----
global: {}
receivers:
- name: default-receiver
  slack_configs:
  - api_url: https://hooks.slack.com/services/xxxxxxxxx/xxxxxxxx/xxxxxxxxxxxxx
    channel: '#devops'
    send_resolved: true
    text: |-
      description: {{ .CommonAnnotations.description }}
      summary: {{ .CommonAnnotations.summary }}
route:
  group_by:
  - cluster
  group_interval: 5m
  group_wait: 10s
  receiver: default-receiver
  repeat_interval: 3h
  routes:
  - group_interval: 5m
    group_wait: 10s
    match:
      severity: critical
    receiver: default-receiver
    repeat_interval: 1m
Events:
That is all! Cheers :-)