Prometheus setup
################
Prerequisites:
--------------
> PV & PVC setup with NFS. >> https://jinojoseph.blogspot.com/2019/11/persistent-volume-in-k8s-multinode.html
> Helm setup with Tiller. >> https://jinojoseph.blogspot.com/2019/10/setting-up-helm-chart-for-k8s-cluster.html
git clone https://github.com/jinojosep/k8s.git
cd k8s/prometheus
vi 1.6-deployment.yaml
Replace the values of NFS_SERVER & NFS_PATH with your NFS server's IP and export path:
NFS_SERVER : 10.0.1.9
NFS_PATH : /srv/nfs/k8sdata
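For reference, after the edit the provisioner container's env block in 1.6-deployment.yaml should look roughly like the sketch below. The PROVISIONER_NAME value shown is only the upstream nfs-client-provisioner default and is an assumption here; keep whatever the repo manifest already ships with.
        env:
          - name: PROVISIONER_NAME
            value: fuseim.pri/ifs        # assumption: keep the value from the repo manifest
          - name: NFS_SERVER
            value: 10.0.1.9
          - name: NFS_PATH
            value: /srv/nfs/k8sdata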
vi 1.6-class.yaml
Add the below annotation under metadata:
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
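After the change, 1.6-class.yaml should look roughly like the sketch below. The class name managed-nfs-storage matches the create output further down; the provisioner value is an assumption and must match the PROVISIONER_NAME used by the deployment.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: managed-nfs-storage
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: fuseim.pri/ifs   # assumption: must match PROVISIONER_NAME in 1.6-deployment.yaml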
ubuntu@namenode:~/myk8syamls/nfs-provisioner$ kubectl create -f rbac.yaml -f 1.6-class.yaml -f 1.6-deployment.yaml
serviceaccount/nfs-client-provisioner created
clusterrole.rbac.authorization.k8s.io/nfs-client-provisioner-runner created
clusterrolebinding.rbac.authorization.k8s.io/run-nfs-client-provisioner created
role.rbac.authorization.k8s.io/leader-locking-nfs-client-provisioner created
rolebinding.rbac.authorization.k8s.io/leader-locking-nfs-client-provisioner created
storageclass.storage.k8s.io/managed-nfs-storage created
deployment.apps/nfs-client-provisioner created
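A quick check that the provisioner came up and that the new storage class is now the cluster default (managed-nfs-storage should be listed with "(default)"):
kubectl get deployment nfs-client-provisioner
kubectl get storageclass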
ubuntu@namenode:~$ helm version --client --short
Client: v2.15.1+gcf1de4f
# helm inspect values stable/prometheus > /home/ubuntu/myk8syamls/prometheus/prometheus.values
Change the server service section of prometheus.values (around line 857 in this chart version) as below:
nodePort: 32322
type: NodePort
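For orientation, those two keys sit under server: -> service: in prometheus.values; after the edit that block reads roughly as below (other keys left untouched, and the exact line number varies by chart version):
server:
  service:
    # other service keys left at their defaults
    type: NodePort
    nodePort: 32322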
# helm install stable/prometheus --name prometheus --values /home/ubuntu/myk8syamls/prometheus/prometheus.values --namespace prometheus
NAME: prometheus
LAST DEPLOYED: Wed Nov 20 04:35:18 2019
NAMESPACE: prometheus
STATUS: DEPLOYED
# watch kubectl get all -n prometheus
NAME READY STATUS RESTARTS AGE
pod/prometheus-alertmanager-977545d7b-kctwz 2/2 Running 0 94s
pod/prometheus-kube-state-metrics-dd4fcf989-ldxlj 1/1 Running 0 94s
pod/prometheus-node-exporter-n4c7g 1/1 Running 0 94s
pod/prometheus-node-exporter-t6vhn 1/1 Running 0 94s
pod/prometheus-node-exporter-tvjh9 1/1 Running 0 94s
pod/prometheus-pushgateway-644868fb9c-zdjd7 1/1 Running 0 94s
pod/prometheus-server-d6c7dbd-vmpqh 2/2 Running 0 94s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/prometheus-alertmanager ClusterIP 10.109.24.42 <none> 80/TCP 94s
service/prometheus-kube-state-metrics ClusterIP None <none> 80/TCP 94s
service/prometheus-node-exporter ClusterIP None <none> 9100/TCP 94s
service/prometheus-pushgateway ClusterIP 10.106.50.223 <none> 9091/TCP 94s
service/prometheus-server NodePort 10.105.205.128 <none> 80:32322/TCP 94s
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/prometheus-node-exporter 3 3 3 3 3 <none> 94s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/prometheus-alertmanager 1/1 1 1 94s
deployment.apps/prometheus-kube-state-metrics 1/1 1 1 94s
deployment.apps/prometheus-pushgateway 1/1 1 1 94s
deployment.apps/prometheus-server 1/1 1 1 94s
NAME DESIRED CURRENT READY AGE
replicaset.apps/prometheus-alertmanager-977545d7b 1 1 1 94s
replicaset.apps/prometheus-kube-state-metrics-dd4fcf989 1 1 1 94s
replicaset.apps/prometheus-pushgateway-644868fb9c 1 1 1 94s
replicaset.apps/prometheus-server-d6c7dbd 1 1 1 94s
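At this point Prometheus is already reachable on every node via the NodePort, even before the Ingress below exists; a quick sanity check (replace the placeholder with any node's IP):
curl -s http://<node-ip>:32322/graph | head -n 5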
Now create an Ingress resource like the one below to expose the Prometheus URL from outside the cluster:
# kubectl get ing -n prometheus -oyaml
apiVersion: v1
items:
- apiVersion: extensions/v1beta1
  kind: Ingress
  metadata:
    annotations:
      nginx.org/rewrites: serviceName=prometheus-server rewrite=/;
    creationTimestamp: "2019-11-20T09:39:49Z"
    generation: 1
    name: ingress-resource-1
    namespace: prometheus
    resourceVersion: "1333268"
    selfLink: /apis/extensions/v1beta1/namespaces/prometheus/ingresses/ingress-resource-1
    uid: xxxxxxx-xxxxx-xxxx-xxxx-xxxxxxxxxxxx
  spec:
    rules:
    - host: monitoring.yourdomain.com
      http:
        paths:
        - backend:
            serviceName: prometheus-server
            servicePort: 80
          path: /
  status:
    loadBalancer: {}
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
Now open http://monitoring.yourdomain.com , which will load the Prometheus page.
Grafana setup
############
# helm inspect values stable/grafana > /home/ubuntu/myk8syamls/grafana/grafana.values
>> Change the service type from ClusterIP to NodePort and set nodePort to 32323.
>> Also change the adminPassword to a strong password.
>> Under persistence:, set enabled: true (the edited parts of grafana.values are sketched below).
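A minimal sketch of the edited parts of grafana.values, assuming the stable/grafana chart layout (key names can differ slightly between chart versions):
service:
  type: NodePort
  nodePort: 32323
  port: 80

adminPassword: "use-a-strong-password-here"   # placeholder, not a real credential

persistence:
  enabled: true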
# helm install stable/grafana --name grafana --values /home/ubuntu/myk8syamls/grafana/grafana.values --namespace grafana
NOTES:
1. Get your 'admin' user password by running:
kubectl get secret --namespace grafana grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo
2. The Grafana server can be accessed via port 80 on the following DNS name from within your cluster:
grafana.grafana.svc.cluster.local
Get the Grafana URL to visit by running these commands in the same shell:
export NODE_PORT=$(kubectl get --namespace grafana -o jsonpath="{.spec.ports[0].nodePort}" services grafana)
export NODE_IP=$(kubectl get nodes --namespace grafana -o jsonpath="{.items[0].status.addresses[0].address}")
echo http://$NODE_IP:$NODE_PORT
3. Login with the password from step 1 and the username: admin
# watch kubectl get all -n grafana
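Optionally, Prometheus can be provisioned as a Grafana data source straight from grafana.values instead of adding it by hand in the UI; a sketch, assuming the stable/grafana datasources block and the in-cluster DNS name of the prometheus-server service installed above:
datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:
    - name: Prometheus
      type: prometheus
      url: http://prometheus-server.prometheus.svc.cluster.local
      access: proxy
      isDefault: true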
To delete the Prometheus and Grafana releases:
# helm delete prometheus --purge
# helm delete grafana --purge
Errors & Fixes
------------------
# helm install stable/prometheus --name prometheus --values /home/ubuntu/myk8syamls/prometheus/prometheus.values --namespace prometheus
Error: validation failed: [unable to recognize "": no matches for kind "DaemonSet" in version "extensions/v1beta1", unable to recognize "": no matches for kind "Deployment" in version "extensions/v1beta1"]
Fix:
-----
On Kubernetes 1.16+, the extensions/v1beta1 API no longer serves Deployment and DaemonSet objects; re-deploying Tiller with the apps/v1 API resolved it here:
# helm reset
# helm init --service-account tiller --output yaml | sed 's@apiVersion: extensions/v1beta1@apiVersion: apps/v1@' | sed 's@ replicas: 1@ replicas: 1\n selector: {"matchLabels": {"app": "helm", "name": "tiller"}}@' | kubectl apply -f -
deployment.apps/tiller-deploy created
service/tiller-deploy created
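Before re-running the install, you can confirm the re-deployed Tiller is reachable (the labels below are the ones helm init normally applies; adjust if yours differ):
kubectl -n kube-system get pods -l app=helm,name=tiller
helm version --short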
ubuntu@namenode:~$ helm install stable/prometheus --name prometheus --values /home/ubuntu/myk8syamls/prometheus/prometheus.values --namespace prometheus
Errors & Fixes
-------------------
When enabling "persistence: enabled: true" in the Grafana values, the pods run into errors such as:
No data sources of type Prometheus AlertManager found
ubuntu@namenode:~$ kubectl describe pod/grafana-c6566b4b7-q8srd -n grafana
    pullable://busybox@sha256:4b6ad3a68d34da29bf7c8ccb5d355ba8b4babcad1f99798204e7abb43e54ee3d
    Port:           <none>
    Host Port:      <none>
    Command:
      chown
      -R
      472:472
      /var/lib/grafana
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
Fix:
------
This is most likely an issue with the NFS export options on the NFS server; make sure no_root_squash is included in the export, like below:
[ec2-user@nfs-server]$ sudo exportfs -v
/srv/nfs/k8sdata
(rw,sync,wdelay,hide,no_subtree_check,sec=sys,insecure,no_root_squash,no_all_squash)
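If the option is missing, a minimal /etc/exports entry that yields options like the ones above could look like this (the client specification is only an example; restrict it to your node subnet), followed by re-exporting:
[ec2-user@nfs-server]$ cat /etc/exports
/srv/nfs/k8sdata    *(rw,sync,no_subtree_check,insecure,no_root_squash)
[ec2-user@nfs-server]$ sudo exportfs -rav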