
Prometheus Monitoring Setup in K8s cluster with Grafana for GUI

Prometheus setup
################

Prerequisites:
--------------

> PV & PVC setup with NFS. >> https://jinojoseph.blogspot.com/2019/11/persistent-volume-in-k8s-multinode.html
> helm setup with tiller. >> https://jinojoseph.blogspot.com/2019/10/setting-up-helm-chart-for-k8s-cluster.html

git clone https://github.com/jinojosep/k8s.git

cd k8s/prometheus

vi 1.6-deployment.yaml

Replace the values of NFS_SERVER & NFS_PATH with your NFS server's IP and exported path:

NFS_SERVER : 10.0.1.9
NFS_PATH : /srv/nfs/k8sdata
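
For reference, a sketch of the section being edited, assuming 1.6-deployment.yaml follows the standard nfs-client-provisioner layout (the fuseim.pri/ifs provisioner name is the upstream default and is an assumption here; keep whatever the repo's copy uses):

        env:
          - name: PROVISIONER_NAME
            value: fuseim.pri/ifs      # assumption: upstream default
          - name: NFS_SERVER
            value: 10.0.1.9            # your NFS server IP
          - name: NFS_PATH
            value: /srv/nfs/k8sdata    # your exported path
      volumes:
        - name: nfs-client-root
          nfs:
            server: 10.0.1.9           # must match NFS_SERVER
            path: /srv/nfs/k8sdata     # must match NFS_PATH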



vi 1.6-class.yaml

Add the following annotation under metadata:

  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
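
After the edit, 1.6-class.yaml should look roughly like the sketch below. The managed-nfs-storage name comes from the create output further down; the provisioner value is an assumption and must match PROVISIONER_NAME in 1.6-deployment.yaml:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: managed-nfs-storage
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: fuseim.pri/ifs   # assumption: must match the deployment's PROVISIONER_NAME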
 

ubuntu@namenode:~/myk8syamls/nfs-provisioner$ kubectl create -f rbac.yaml -f 1.6-class.yaml -f 1.6-deployment.yaml
serviceaccount/nfs-client-provisioner created
clusterrole.rbac.authorization.k8s.io/nfs-client-provisioner-runner created
clusterrolebinding.rbac.authorization.k8s.io/run-nfs-client-provisioner created
role.rbac.authorization.k8s.io/leader-locking-nfs-client-provisioner created
rolebinding.rbac.authorization.k8s.io/leader-locking-nfs-client-provisioner created
storageclass.storage.k8s.io/managed-nfs-storage created
deployment.apps/nfs-client-provisioner created


ubuntu@namenode:~$ helm version --client --short
Client: v2.15.1+gcf1de4f

# helm inspect values stable/prometheus > /home/ubuntu/myk8syamls/prometheus/prometheus.values

Around line 857 of the values file, change the prometheus-server service settings as below so the UI is reachable on a NodePort:

nodePort: 32322
type: NodePort
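
In context, the edited block of the stable/prometheus values looks roughly like this sketch (surrounding keys elided):

server:
  service:
    servicePort: 80
    type: NodePort        # was ClusterIP
    nodePort: 32322       # fixed port exposed on every node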


# helm install stable/prometheus --name prometheus --values /home/ubuntu/myk8syamls/prometheus/prometheus.values --namespace prometheus
NAME:   prometheus
LAST DEPLOYED: Wed Nov 20 04:35:18 2019
NAMESPACE: prometheus
STATUS: DEPLOYED

# watch kubectl get all -n prometheus

NAME                                                READY   STATUS    RESTARTS   AGE
pod/prometheus-alertmanager-977545d7b-kctwz         2/2     Running   0          94s
pod/prometheus-kube-state-metrics-dd4fcf989-ldxlj   1/1     Running   0          94s
pod/prometheus-node-exporter-n4c7g                  1/1     Running   0          94s
pod/prometheus-node-exporter-t6vhn                  1/1     Running   0          94s
pod/prometheus-node-exporter-tvjh9                  1/1     Running   0          94s
pod/prometheus-pushgateway-644868fb9c-zdjd7         1/1     Running   0          94s
pod/prometheus-server-d6c7dbd-vmpqh                 2/2     Running   0          94s

NAME                                    TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
service/prometheus-alertmanager         ClusterIP   10.109.24.42     <none>        80/TCP         94s
service/prometheus-kube-state-metrics   ClusterIP   None             <none>        80/TCP         94s
service/prometheus-node-exporter        ClusterIP   None             <none>        9100/TCP       94s
service/prometheus-pushgateway          ClusterIP   10.106.50.223    <none>        9091/TCP       94s
service/prometheus-server               NodePort    10.105.205.128   <none>        80:32322/TCP   94s

NAME                                       DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/prometheus-node-exporter    3         3         3       3            3           <none>          94s

NAME                                            READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/prometheus-alertmanager         1/1     1            1           94s
deployment.apps/prometheus-kube-state-metrics   1/1     1            1           94s
deployment.apps/prometheus-pushgateway          1/1     1            1           94s
deployment.apps/prometheus-server               1/1     1            1           94s

NAME                                                      DESIRED   CURRENT   READY   AGE
replicaset.apps/prometheus-alertmanager-977545d7b         1         1         1       94s
replicaset.apps/prometheus-kube-state-metrics-dd4fcf989   1         1         1       94s
replicaset.apps/prometheus-pushgateway-644868fb9c         1         1         1       94s
replicaset.apps/prometheus-server-d6c7dbd                 1         1         1       94s
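
At this point the prometheus-server UI should answer on port 32322 of every node. A quick sanity check (a sketch; substitute one of your node IPs for <node-ip>, and note that Prometheus serves a /-/healthy endpoint that returns HTTP 200 when the server is up):

$ curl -s -o /dev/null -w '%{http_code}\n' http://<node-ip>:32322/-/healthy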

Now create an Ingress resource like the one below to expose the Prometheus URL externally. The nginx.org/rewrites annotation is specific to the NGINX Inc. ingress controller; it rewrites the matched path to / before proxying to prometheus-server:
# kubectl get ing -n prometheus -oyaml

apiVersion: v1
items:
- apiVersion: extensions/v1beta1
  kind: Ingress
  metadata:
    annotations:
      nginx.org/rewrites: serviceName=prometheus-server rewrite=/;
    creationTimestamp: "2019-11-20T09:39:49Z"
    generation: 1
    name: ingress-resource-1
    namespace: prometheus
    resourceVersion: "1333268"
    selfLink: /apis/extensions/v1beta1/namespaces/prometheus/ingresses/ingress-resource-1
    uid: xxxxxxx-xxxxx-xxxx-xxxx-xxxxxxxxxxxx
  spec:
    rules:
    - host: monitoring.yourdomain.com
      http:
        paths:
        - backend:
            serviceName: prometheus-server
            servicePort: 80
          path: /
  status:
    loadBalancer: {}
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
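
Equivalently, the resource above can be created from a minimal manifest (kubectl apply -f ingress-resource-1.yaml); this is just the dump above with the server-generated metadata stripped:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: ingress-resource-1
  namespace: prometheus
  annotations:
    nginx.org/rewrites: serviceName=prometheus-server rewrite=/;
spec:
  rules:
  - host: monitoring.yourdomain.com
    http:
      paths:
      - path: /
        backend:
          serviceName: prometheus-server
          servicePort: 80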

Now browse to http://monitoring.yourdomain.com (with DNS for that name pointing at your ingress controller), which should load the Prometheus page.

Grafana setup
############




# helm inspect values stable/grafana > /home/ubuntu/myk8syamls/grafana/grafana.values

>> Change the service type from ClusterIP to NodePort and set nodePort to 32323.

>> Also change the adminPassword to a strong password.

>> Enable persistence ( persistence: enabled: true ) so dashboards survive pod restarts; see the sketch after this list.
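
A sketch of the relevant keys in grafana.values after those edits (key names as in the stable/grafana chart; the password value is a placeholder):

service:
  type: NodePort        # was ClusterIP
  nodePort: 32323
adminPassword: "<strong-password-here>"   # placeholder, choose your own
persistence:
  enabled: true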

# helm install stable/grafana --name grafana --values /home/ubuntu/myk8syamls/grafana/grafana.values --namespace grafana

NOTES:
1. Get your 'admin' user password by running:

   kubectl get secret --namespace grafana grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo

2. The Grafana server can be accessed via port 80 on the following DNS name from within your cluster:

   grafana.grafana.svc.cluster.local

   Get the Grafana URL to visit by running these commands in the same shell:
     export NODE_PORT=$(kubectl get --namespace grafana -o jsonpath="{.spec.ports[0].nodePort}" services grafana)
     export NODE_IP=$(kubectl get nodes --namespace grafana -o jsonpath="{.items[0].status.addresses[0].address}")
     echo http://$NODE_IP:$NODE_PORT


3. Login with the password from step 1 and the username: admin



# watch kubectl get all -n grafana


To delete the Prometheus and Grafana releases:

# helm delete prometheus --purge

# helm delete grafana --purge


Errors & Fixes
------------------
# helm install stable/prometheus --name prometheus --values /home/ubuntu/myk8syamls/prometheus/prometheus.values --namespace prometheus
Error: validation failed: [unable to recognize "": no matches for kind "DaemonSet" in version "extensions/v1beta1", unable to recognize "": no matches for kind "Deployment" in version "extensions/v1beta1"]

Fix:
-----

Kubernetes 1.16 removed the extensions/v1beta1 API for Deployments and DaemonSets, so Tiller has to be re-deployed with its manifest rewritten to apps/v1:

# helm reset

# helm init --service-account tiller --output yaml | sed 's@apiVersion: extensions/v1beta1@apiVersion: apps/v1@' | sed 's@  replicas: 1@  replicas: 1\n  selector: {"matchLabels": {"app": "helm", "name": "tiller"}}@' | kubectl apply -f -

deployment.apps/tiller-deploy created
service/tiller-deploy created
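
Before retrying the install, it is worth confirming Tiller came back up on the new API (a hedged check, not part of the original fix; the second command should print apps/v1):

# kubectl -n kube-system rollout status deploy/tiller-deploy
# kubectl -n kube-system get deploy tiller-deploy -o jsonpath='{.apiVersion}{"\n"}'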

ubuntu@namenode:~$ helm install stable/prometheus --name prometheus --values /home/ubuntu/myk8syamls/prometheus/prometheus.values --namespace prometheus

Errors & Fixes
-------------------
When enabling " persistence: enabled: true " in the Grafana values, the pods fail with errors like:

No data sources of type Prometheus AlertManager found


ubuntu@namenode:~$ kubectl describe pod/grafana-c6566b4b7-q8srd -n grafana
    Image ID:      docker-pullable://busybox@sha256:4b6ad3a68d34da29bf7c8ccb5d355ba8b4babcad1f99798204e7abb43e54ee3d
    Port:          <none>
    Host Port:     <none>
    Command:
      chown
      -R
      472:472
      /var/lib/grafana
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1

Fix:
------

This is usually an NFS permission issue on the NFS server: the init container runs chown as root, and a root_squash export maps root to nobody, so the chown fails. Make sure no_root_squash is set on the export, like below:

[ec2-user@nfs-server]$ sudo exportfs -v
/srv/nfs/k8sdata
                <world>(rw,sync,wdelay,hide,no_subtree_check,sec=sys,insecure,no_root_squash,no_all_squash)
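
If the option is missing, a sketch of the matching /etc/exports entry and re-export (the * client wildcard is an assumption; scope it to your cluster's subnet in production):

/srv/nfs/k8sdata *(rw,sync,insecure,no_subtree_check,no_root_squash,no_all_squash)

[ec2-user@nfs-server]$ sudo exportfs -ra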
