
Prometheus Monitoring Setup in K8s cluster with Grafana for GUI

Prometheus setup
################

Prerequisites:
--------------

> PV & PVC setup with NFS. >> https://jinojoseph.blogspot.com/2019/11/persistent-volume-in-k8s-multinode.html
> helm setup with tiller. >> https://jinojoseph.blogspot.com/2019/10/setting-up-helm-chart-for-k8s-cluster.html

git clone https://github.com/jinojosep/k8s.git

cd k8s/prometheus

vi 1.6-deployment.yaml

Replace the values of NFS_SERVER & NFS_PATH with your NFS server's IP and exported path:

NFS_SERVER : 10.0.1.9
NFS_PATH : /srv/nfs/k8sdata
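
For reference, a sketch of the section being edited, assuming 1.6-deployment.yaml follows the standard nfs-client-provisioner layout (the fuseim.pri/ifs provisioner name is the upstream default and is an assumption here; keep whatever the repo's copy uses):

        env:
          - name: PROVISIONER_NAME
            value: fuseim.pri/ifs      # assumption: upstream default
          - name: NFS_SERVER
            value: 10.0.1.9            # your NFS server IP
          - name: NFS_PATH
            value: /srv/nfs/k8sdata    # your exported path
      volumes:
        - name: nfs-client-root
          nfs:
            server: 10.0.1.9           # must match NFS_SERVER
            path: /srv/nfs/k8sdata     # must match NFS_PATH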



vi 1.6-class.yaml

Add the following annotation under metadata:

  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
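
After the edit, 1.6-class.yaml should look roughly like the sketch below. The managed-nfs-storage name comes from the create output further down; the provisioner value is an assumption and must match PROVISIONER_NAME in 1.6-deployment.yaml:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: managed-nfs-storage
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: fuseim.pri/ifs   # assumption: must match the deployment's PROVISIONER_NAME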
 

ubuntu@namenode:~/myk8syamls/nfs-provisioner$ kubectl create -f rbac.yaml -f 1.6-class.yaml -f 1.6-deployment.yaml
serviceaccount/nfs-client-provisioner created
clusterrole.rbac.authorization.k8s.io/nfs-client-provisioner-runner created
clusterrolebinding.rbac.authorization.k8s.io/run-nfs-client-provisioner created
role.rbac.authorization.k8s.io/leader-locking-nfs-client-provisioner created
rolebinding.rbac.authorization.k8s.io/leader-locking-nfs-client-provisioner created
storageclass.storage.k8s.io/managed-nfs-storage created
deployment.apps/nfs-client-provisioner created


ubuntu@namenode:~$ helm version --client --short
Client: v2.15.1+gcf1de4f

# helm inspect values stable/prometheus > /home/ubuntu/myk8syamls/prometheus/prometheus.values

Around line 857 of the values file, change the prometheus-server service settings as below so the UI is reachable on a NodePort:

nodePort: 32322
type: NodePort
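
In context, the edited block of the stable/prometheus values looks roughly like this sketch (surrounding keys elided):

server:
  service:
    servicePort: 80
    type: NodePort        # was ClusterIP
    nodePort: 32322       # fixed port exposed on every node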


# helm install stable/prometheus --name prometheus --values /home/ubuntu/myk8syamls/prometheus/prometheus.values --namespace prometheus
NAME:   prometheus
LAST DEPLOYED: Wed Nov 20 04:35:18 2019
NAMESPACE: prometheus
STATUS: DEPLOYED

# watch kubectl get all -n prometheus

NAME                                                READY   STATUS    RESTARTS   AGE
pod/prometheus-alertmanager-977545d7b-kctwz         2/2     Running   0          94s
pod/prometheus-kube-state-metrics-dd4fcf989-ldxlj   1/1     Running   0          94s
pod/prometheus-node-exporter-n4c7g                  1/1     Running   0          94s
pod/prometheus-node-exporter-t6vhn                  1/1     Running   0          94s
pod/prometheus-node-exporter-tvjh9                  1/1     Running   0          94s
pod/prometheus-pushgateway-644868fb9c-zdjd7         1/1     Running   0          94s
pod/prometheus-server-d6c7dbd-vmpqh                 2/2     Running   0          94s

NAME                                    TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
service/prometheus-alertmanager         ClusterIP   10.109.24.42     <none>        80/TCP         94s
service/prometheus-kube-state-metrics   ClusterIP   None             <none>        80/TCP         94s
service/prometheus-node-exporter        ClusterIP   None             <none>        9100/TCP       94s
service/prometheus-pushgateway          ClusterIP   10.106.50.223    <none>        9091/TCP       94s
service/prometheus-server               NodePort    10.105.205.128   <none>        80:32322/TCP   94s

NAME                                       DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/prometheus-node-exporter    3         3         3       3            3           <none>          94s

NAME                                            READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/prometheus-alertmanager         1/1     1            1           94s
deployment.apps/prometheus-kube-state-metrics   1/1     1            1           94s
deployment.apps/prometheus-pushgateway          1/1     1            1           94s
deployment.apps/prometheus-server               1/1     1            1           94s

NAME                                                      DESIRED   CURRENT   READY   AGE
replicaset.apps/prometheus-alertmanager-977545d7b         1         1         1       94s
replicaset.apps/prometheus-kube-state-metrics-dd4fcf989   1         1         1       94s
replicaset.apps/prometheus-pushgateway-644868fb9c         1         1         1       94s
replicaset.apps/prometheus-server-d6c7dbd                 1         1         1       94s
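
At this point the prometheus-server UI should answer on port 32322 of every node. A quick sanity check (a sketch; substitute one of your node IPs for <node-ip>, and note that Prometheus serves a /-/healthy endpoint that returns HTTP 200 when the server is up):

$ curl -s -o /dev/null -w '%{http_code}\n' http://<node-ip>:32322/-/healthy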

Now create an Ingress resource like the one below to expose the Prometheus URL externally. The nginx.org/rewrites annotation is specific to the NGINX Inc. ingress controller; it rewrites the matched path to / before proxying to prometheus-server:
# kubectl get ing -n prometheus -oyaml

apiVersion: v1
items:
- apiVersion: extensions/v1beta1
  kind: Ingress
  metadata:
    annotations:
      nginx.org/rewrites: serviceName=prometheus-server rewrite=/;
    creationTimestamp: "2019-11-20T09:39:49Z"
    generation: 1
    name: ingress-resource-1
    namespace: prometheus
    resourceVersion: "1333268"
    selfLink: /apis/extensions/v1beta1/namespaces/prometheus/ingresses/ingress-resource-1
    uid: xxxxxxx-xxxxx-xxxx-xxxx-xxxxxxxxxxxx
  spec:
    rules:
    - host: monitoring.yourdomain.com
      http:
        paths:
        - backend:
            serviceName: prometheus-server
            servicePort: 80
          path: /
  status:
    loadBalancer: {}
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
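
Equivalently, the resource above can be created from a minimal manifest (kubectl apply -f ingress-resource-1.yaml); this is just the dump above with the server-generated metadata stripped:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: ingress-resource-1
  namespace: prometheus
  annotations:
    nginx.org/rewrites: serviceName=prometheus-server rewrite=/;
spec:
  rules:
  - host: monitoring.yourdomain.com
    http:
      paths:
      - path: /
        backend:
          serviceName: prometheus-server
          servicePort: 80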

Now browse to http://monitoring.yourdomain.com (with DNS for that name pointing at your ingress controller), which should load the Prometheus page.

Grafana setup
############




# helm inspect values stable/grafana > /home/ubuntu/myk8syamls/grafana/grafana.values

>> Change the service type from ClusterIP to NodePort and set nodePort to 32323.

>> Also change the adminPassword to a strong password.

>> Enable persistence ( persistence: enabled: true ) so dashboards survive pod restarts; see the sketch after this list.
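
A sketch of the relevant keys in grafana.values after those edits (key names as in the stable/grafana chart; the password value is a placeholder):

service:
  type: NodePort        # was ClusterIP
  nodePort: 32323
adminPassword: "<strong-password-here>"   # placeholder, choose your own
persistence:
  enabled: true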

# helm install stable/grafana --name grafana --values /home/ubuntu/myk8syamls/grafana/grafana.values --namespace grafana

NOTES:
1. Get your 'admin' user password by running:

   kubectl get secret --namespace grafana grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo

2. The Grafana server can be accessed via port 80 on the following DNS name from within your cluster:

   grafana.grafana.svc.cluster.local

   Get the Grafana URL to visit by running these commands in the same shell:
     export NODE_PORT=$(kubectl get --namespace grafana -o jsonpath="{.spec.ports[0].nodePort}" services grafana)
     export NODE_IP=$(kubectl get nodes --namespace grafana -o jsonpath="{.items[0].status.addresses[0].address}")
     echo http://$NODE_IP:$NODE_PORT


3. Login with the password from step 1 and the username: admin



# watch kubectl get all -n grafana


To delete the Prometheus and Grafana releases:

# helm delete prometheus --purge

# helm delete grafana --purge


Errors & Fixes
------------------
# helm install stable/prometheus --name prometheus --values /home/ubuntu/myk8syamls/prometheus/prometheus.values --namespace prometheus
Error: validation failed: [unable to recognize "": no matches for kind "DaemonSet" in version "extensions/v1beta1", unable to recognize "": no matches for kind "Deployment" in version "extensions/v1beta1"]

Fix:
-----

Kubernetes 1.16 removed the extensions/v1beta1 API for Deployments and DaemonSets, so Tiller has to be re-deployed with its manifest rewritten to apps/v1:

# helm reset

# helm init --service-account tiller --output yaml | sed 's@apiVersion: extensions/v1beta1@apiVersion: apps/v1@' | sed 's@  replicas: 1@  replicas: 1\n  selector: {"matchLabels": {"app": "helm", "name": "tiller"}}@' | kubectl apply -f -

deployment.apps/tiller-deploy created
service/tiller-deploy created
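
Before retrying the install, it is worth confirming Tiller came back up on the new API (a hedged check, not part of the original fix; the second command should print apps/v1):

# kubectl -n kube-system rollout status deploy/tiller-deploy
# kubectl -n kube-system get deploy tiller-deploy -o jsonpath='{.apiVersion}{"\n"}'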

ubuntu@namenode:~$ helm install stable/prometheus --name prometheus --values /home/ubuntu/myk8syamls/prometheus/prometheus.values --namespace prometheus

Errors & Fixes
-------------------
When enabling " persistence: enabled: true " in the Grafana values, the pods fail with errors like:

No data sources of type Prometheus AlertManager found


ubuntu@namenode:~$ kubectl describe pod/grafana-c6566b4b7-q8srd -n grafana
    Image ID:      docker-pullable://busybox@sha256:4b6ad3a68d34da29bf7c8ccb5d355ba8b4babcad1f99798204e7abb43e54ee3d
    Port:          <none>
    Host Port:     <none>
    Command:
      chown
      -R
      472:472
      /var/lib/grafana
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1

Fix:
------

This is usually an NFS permission issue on the NFS server: the init container runs chown as root, and a root_squash export maps root to nobody, so the chown fails. Make sure no_root_squash is set on the export, like below:

[ec2-user@nfs-server]$ sudo exportfs -v
/srv/nfs/k8sdata
                <world>(rw,sync,wdelay,hide,no_subtree_check,sec=sys,insecure,no_root_squash,no_all_squash)
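
If the option is missing, a sketch of the matching /etc/exports entry and re-export (the * client wildcard is an assumption; scope it to your cluster's subnet in production):

/srv/nfs/k8sdata *(rw,sync,insecure,no_subtree_check,no_root_squash,no_all_squash)

[ec2-user@nfs-server]$ sudo exportfs -ra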
