
Prometheus Monitoring Setup in K8s cluster with Grafana for GUI

Prometheus setup
################

Prerequisites:
--------------

> PV & PVC setup with NFS. >> https://jinojoseph.blogspot.com/2019/11/persistent-volume-in-k8s-multinode.html
> helm setup with tiller. >> https://jinojoseph.blogspot.com/2019/10/setting-up-helm-chart-for-k8s-cluster.html

git clone https://github.com/jinojosep/k8s.git

cd k8s/prometheus

vi 1.6-deployments.yaml

Replace the values of NFS_SERVER & NFS_PATH with your NFS server's IP and export path, for example:

NFS_SERVER : 10.0.1.9
NFS_PATH : /srv/nfs/k8sdata
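For reference, in the upstream nfs-client provisioner these values live in the container env and in the nfs volume of the deployment, so after the edit the relevant fragment would look roughly like this (the exact layout of 1.6-deployments.yaml in the repo may differ slightly):

          env:
            - name: NFS_SERVER
              value: 10.0.1.9          # IP of your NFS server
            - name: NFS_PATH
              value: /srv/nfs/k8sdata  # exported path on the NFS server
      volumes:
        - name: nfs-client-root
          nfs:
            server: 10.0.1.9           # must match NFS_SERVER
            path: /srv/nfs/k8sdata     # must match NFS_PATH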



vi 1.6-class.yaml

Add the below annotation under metadata:

  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
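With that annotation, the StorageClass becomes the cluster default, so PVCs that do not name a storageClassName (such as the ones the Prometheus chart creates) bind to it. Roughly, 1.6-class.yaml then looks like the sketch below; the class name matches what gets created in the next step, and the provisioner should stay whatever the repo's file already uses:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: managed-nfs-storage
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: fuseim.pri/ifs   # keep the provisioner already set in the file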
 

ubuntu@namenode:~/myk8syamls/nfs-provisioner$ kubectl create -f rbac.yaml -f 1.6-class.yaml -f 1.6-deployment.yaml
serviceaccount/nfs-client-provisioner created
clusterrole.rbac.authorization.k8s.io/nfs-client-provisioner-runner created
clusterrolebinding.rbac.authorization.k8s.io/run-nfs-client-provisioner created
role.rbac.authorization.k8s.io/leader-locking-nfs-client-provisioner created
rolebinding.rbac.authorization.k8s.io/leader-locking-nfs-client-provisioner created
storageclass.storage.k8s.io/managed-nfs-storage created
deployment.apps/nfs-client-provisioner created
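To verify that the class was created and is now the cluster default, list the storage classes; managed-nfs-storage should be shown with "(default)" next to its name:

# kubectl get storageclass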


ubuntu@namenode:~$ helm version --client --short
Client: v2.15.1+gcf1de4f

# helm inspect values stable/prometheus > /home/ubuntu/myk8syamls/prometheus/prometheus.values

Edit the server service section of the values file (around line 857 in this chart version) so that Prometheus is exposed on a NodePort:

nodePort: 32322
type: NodePort
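For context, this change sits under server.service in prometheus.values; assuming the key names of this chart version, the block ends up roughly like:

server:
  service:
    servicePort: 80
    type: NodePort       # was ClusterIP
    nodePort: 32322      # Prometheus UI exposed on this port of every node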


# helm install stable/prometheus --name prometheus --values /home/ubuntu/myk8syamls/prometheus/prometheus.values --namespace prometheus
NAME:   prometheus
LAST DEPLOYED: Wed Nov 20 04:35:18 2019
NAMESPACE: prometheus
STATUS: DEPLOYED

# watch kubectl get all -n prometheus

NAME                                                READY   STATUS    RESTARTS   AGE
pod/prometheus-alertmanager-977545d7b-kctwz         2/2     Running   0          94s
pod/prometheus-kube-state-metrics-dd4fcf989-ldxlj   1/1     Running   0          94s
pod/prometheus-node-exporter-n4c7g                  1/1     Running   0          94s
pod/prometheus-node-exporter-t6vhn                  1/1     Running   0          94s
pod/prometheus-node-exporter-tvjh9                  1/1     Running   0          94s
pod/prometheus-pushgateway-644868fb9c-zdjd7         1/1     Running   0          94s
pod/prometheus-server-d6c7dbd-vmpqh                 2/2     Running   0          94s

NAME                                    TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
service/prometheus-alertmanager         ClusterIP   10.109.24.42     <none>        80/TCP         94s
service/prometheus-kube-state-metrics   ClusterIP   None             <none>        80/TCP         94s
service/prometheus-node-exporter        ClusterIP   None             <none>        9100/TCP       94s
service/prometheus-pushgateway          ClusterIP   10.106.50.223    <none>        9091/TCP       94s
service/prometheus-server               NodePort    10.105.205.128   <none>        80:32322/TCP   94s

NAME                                      DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/prometheus-node-exporter   3         3         3       3            3           <none>          94s

NAME                                            READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/prometheus-alertmanager         1/1     1            1           94s
deployment.apps/prometheus-kube-state-metrics   1/1     1            1           94s
deployment.apps/prometheus-pushgateway          1/1     1            1           94s
deployment.apps/prometheus-server               1/1     1            1           94s

NAME                                                      DESIRED   CURRENT   READY   AGE
replicaset.apps/prometheus-alertmanager-977545d7b         1         1         1       94s
replicaset.apps/prometheus-kube-state-metrics-dd4fcf989   1         1         1       94s
replicaset.apps/prometheus-pushgateway-644868fb9c         1         1         1       94s
replicaset.apps/prometheus-server-d6c7dbd                 1         1         1       94s

Now create an Ingress resource like the one below to expose the Prometheus UI outside the cluster:
# kubectl get ing -n prometheus -oyaml

apiVersion: v1
items:
- apiVersion: extensions/v1beta1
  kind: Ingress
  metadata:
    annotations:
      nginx.org/rewrites: serviceName=prometheus-server rewrite=/;
    creationTimestamp: "2019-11-20T09:39:49Z"
    generation: 1
    name: ingress-resource-1
    namespace: prometheus
    resourceVersion: "1333268"
    selfLink: /apis/extensions/v1beta1/namespaces/prometheus/ingresses/ingress-resource-1
    uid: xxxxxxx-xxxxx-xxxx-xxxx-xxxxxxxxxxxx
  spec:
    rules:
    - host: monitoring.yourdomain.com
      http:
        paths:
        - backend:
            serviceName: prometheus-server
            servicePort: 80
          path: /
  status:
    loadBalancer: {}
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
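If you would rather create it from a manifest than read it back, a minimal sketch producing an equivalent Ingress (assuming the NGINX ingress controller is already installed in the cluster) is:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: ingress-resource-1
  namespace: prometheus
  annotations:
    nginx.org/rewrites: serviceName=prometheus-server rewrite=/;
spec:
  rules:
  - host: monitoring.yourdomain.com
    http:
      paths:
      - path: /
        backend:
          serviceName: prometheus-server
          servicePort: 80

Save it as ingress-resource-1.yaml and apply it with kubectl apply -f ingress-resource-1.yaml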

Now open http://monitoring.yourdomain.com (with the hostname pointing at your ingress controller), which will load the Prometheus UI.

Grafana setup
############




# helm inspect values stable/grafana > /home/ubuntu/myk8syamls/grafana/grafana.values

>> Change the service type from ClusterIP to NodePort and set nodePort to 32323

>> Also change adminPassword to a strong password.

>> Under persistence, set enabled: true (see the sketch below).
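Assuming the key names of the stable/grafana chart, the edited sections of grafana.values end up roughly like this (adjust to match your copy of the file):

service:
  type: NodePort          # was ClusterIP
  nodePort: 32323
  port: 80

adminPassword: "use-a-strong-password"   # placeholder - replace with your own

persistence:
  enabled: true           # the PVC will bind to the default managed-nfs-storage class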

# helm install stable/grafana --name grafana --values /home/ubuntu/myk8syamls/grafana/grafana.values --namespace grafana

NOTES:
1. Get your 'admin' user password by running:

   kubectl get secret --namespace grafana grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo

2. The Grafana server can be accessed via port 80 on the following DNS name from within your cluster:

   grafana.grafana.svc.cluster.local

   Get the Grafana URL to visit by running these commands in the same shell:
     export NODE_PORT=$(kubectl get --namespace grafana -o jsonpath="{.spec.ports[0].nodePort}" services grafana)
     export NODE_IP=$(kubectl get nodes --namespace grafana -o jsonpath="{.items[0].status.addresses[0].address}")
     echo http://$NODE_IP:$NODE_PORT


3. Login with the password from step 1 and the username: admin



# watch kubectl get all -n grafana


To delete the Prometheus and Grafana releases:

# helm delete prometheus --purge

# helm delete grafana --purge


Errors & Fixes
------------------
# helm install stable/prometheus --name prometheus --values /home/ubuntu/myk8syamls/prometheus/prometheus.values --namespace prometheus
Error: validation failed: [unable to recognize "": no matches for kind "DaemonSet" in version "extensions/v1beta1", unable to recognize "": no matches for kind "Deployment" in version "extensions/v1beta1"]

Fix:
-----

This error shows up on Kubernetes 1.16+, where the extensions/v1beta1 APIs for Deployment and DaemonSet were removed; re-initializing Tiller so that its own deployment uses apps/v1 resolved it here:

# helm reset

# helm init --service-account tiller --output yaml | sed 's@apiVersion: extensions/v1beta1@apiVersion: apps/v1@' | sed 's@  replicas: 1@  replicas: 1\n  selector: {"matchLabels": {"app": "helm", "name": "tiller"}}@' | kubectl apply -f -

deployment.apps/tiller-deploy created
service/tiller-deploy created

ubuntu@namenode:~$ helm install stable/prometheus --name prometheus --values /home/ubuntu/myk8syamls/prometheus/prometheus.values --namespace prometheus

Errors & Fixes
-------------------
When persistence: enabled: true is set in the Grafana values, the Grafana pod fails to start, with errors such as:

No data sources of type Prometheus AlertManager found


ubuntu@namenode:~$ kubectl describe pod/grafana-c6566b4b7-q8srd -n grafana
    Image ID:      docker-pullable://busybox@sha256:4b6ad3a68d34da29bf7c8ccb5d355ba8b4babcad1f99798204e7abb43e54ee3d
    Port:          <none>
    Host Port:     <none>
    Command:
      chown
      -R
      472:472
      /var/lib/grafana
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1

Fix:
------

This is usually an issue with the NFS export options on the NFS server: the init container needs root to chown /var/lib/grafana on the NFS-backed volume, so make sure no_root_squash is set on the export, like below:

[ec2-user@nfs-server]$ sudo exportfs -v
/srv/nfs/k8sdata
                <world>(rw,sync,wdelay,hide,no_subtree_check,sec=sys,insecure,no_root_squash,no_all_squash)
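For reference, the matching line in /etc/exports on the NFS server would look something like the sketch below (the client spec and extra options will vary with your setup); run sudo exportfs -ra afterwards to re-export:

/srv/nfs/k8sdata    *(rw,sync,no_subtree_check,insecure,no_root_squash)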
