k8s series 24: Calico basics (monitoring)


Preface

Calico officially supports Prometheus monitoring and documents the relevant metrics in detail.

Since we installed Calico with the default settings and did not enable the Typha component, only two major components are running in the Kubernetes cluster: Felix and kube-controllers.

Detailed Felix metrics: docs.projectcalico.org/reference/f…

Detailed kube-controllers metrics: docs.projectcalico.org/reference/k…

Calico component configuration

Although Prometheus monitoring is provided out of the box, it is disabled by default. You have to enable it manually, and you also need to expose endpoints for Prometheus to scrape the metrics from.

Felix configuration

Enable Felix's Prometheus metrics:

calicoctl patch felixConfiguration default --patch '{"spec":{"prometheusMetricsEnabled": true}}'
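Once enabled, Felix serves metrics on port 9091 of every node in the Prometheus text exposition format: one `name value` sample per line, with `# HELP`/`# TYPE` comment lines interleaved. A minimal parsing sketch (the sample payload below is illustrative, not captured from a live node):

```python
# Illustrative sample of the text exposition format Felix serves on :9091.
sample = """\
# HELP felix_active_local_endpoints Number of active endpoints on this host.
# TYPE felix_active_local_endpoints gauge
felix_active_local_endpoints 12
felix_cluster_num_hosts 3
"""

def parse_metrics(text):
    """Return {metric_name: float_value}, skipping comments and blank lines."""
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, value = line.rsplit(" ", 1)
        metrics[name] = float(value)
    return metrics

print(parse_metrics(sample))
```

This ignores labels and timestamps, which the full format allows; it is just enough to sanity-check a scrape by hand.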

Create the Felix metrics endpoint:

kubectl apply -f - <<EOF
apiVersion: v1
kind: Service
metadata:
  name: felix-metrics-svc
  namespace: kube-system
spec:
  selector:
    k8s-app: calico-node
  ports:
  - port: 9091
    targetPort: 9091
EOF
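The Service above finds its backing endpoints through label selection: every pod whose labels contain all of the selector's key/value pairs (here, the calico-node pods labeled `k8s-app: calico-node`) becomes an endpoint. A sketch of that matching rule:

```python
def selector_matches(selector, pod_labels):
    """A pod matches when every selector key/value pair appears in its labels.
    Extra pod labels are allowed; missing or mismatched ones disqualify it."""
    return all(pod_labels.get(k) == v for k, v in selector.items())

selector = {"k8s-app": "calico-node"}

# A calico-node pod (extra labels are fine).
print(selector_matches(selector, {"k8s-app": "calico-node", "pod-template-generation": "1"}))
# A different pod is not selected.
print(selector_matches(selector, {"k8s-app": "calico-kube-controllers"}))
```

Prometheus will later discover these endpoints via its `kubernetes_sd_configs`, which is why no scrape targets need to be listed by hand.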

kube-controllers configuration

The Prometheus metrics of kube-controllers are enabled by default, so no change is needed. If you want to change its metrics port, use the following command; setting the port to 0 disables metrics entirely.

# The metrics port defaults to 9094
calicoctl patch kubecontrollersconfiguration default --patch '{"spec":{"prometheusMetricsPort": 9094}}'
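`calicoctl patch --patch` applies the given document as a merge-style patch: nested objects are merged key by key rather than replaced wholesale, so only `prometheusMetricsPort` changes while the rest of the spec survives. A simplified sketch of that merge rule (RFC 7386 style, ignoring null-deletion; the `logSeverityScreen` field is just an example of an untouched sibling):

```python
def merge_patch(target, patch):
    """Recursively merge `patch` into a copy of `target` (simplified RFC 7386:
    nested dicts merge, everything else overwrites; nulls are not handled)."""
    result = dict(target)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge_patch(result[key], value)
        else:
            result[key] = value
    return result

current = {"spec": {"prometheusMetricsPort": 9094, "logSeverityScreen": "Info"}}
patched = merge_patch(current, {"spec": {"prometheusMetricsPort": 9095}})
print(patched)  # port updated, logSeverityScreen preserved
```

This is why the patch document only needs to name the fields being changed.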

Create the kube-controllers metrics endpoint:

kubectl apply -f - <<EOF
apiVersion: v1
kind: Service
metadata:
  name: kube-controllers-metrics-svc
  namespace: kube-system
spec:
  selector:
    k8s-app: calico-kube-controllers
  ports:
  - port: 9094
    targetPort: 9094
EOF

After the Services for both components have been created, list them to confirm.


Prometheus installation and configuration

Before installing Prometheus, create the required service account and permissions.

Create a namespace

Create a dedicated namespace for monitoring:

kubectl apply -f - <<EOF
apiVersion: v1
kind: Namespace
metadata:
  name: calico-monitoring
  labels:
    app: ns-calico-monitoring
    role: monitoring
EOF

Create a service account

Create an account that can collect data from Calico, then grant it the necessary permissions.

The configuration below has three parts: creating the role, creating the account, and binding the account to the role.

kubectl apply -f - <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: calico-prometheus-user
rules:
- apiGroups: [""]
  resources:
  - endpoints
  - services
  - pods
  verbs: ["get", "list", "watch"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: calico-prometheus-user
  namespace: calico-monitoring
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: calico-prometheus-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: calico-prometheus-user
subjects:
- kind: ServiceAccount
  name: calico-prometheus-user
  namespace: calico-monitoring
EOF

Prometheus configuration file

Create the Prometheus configuration file. If you have ever installed Prometheus from a binary, you will recognize that the configuration below is almost identical; to adjust anything later, simply edit this ConfigMap.

kubectl apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: calico-monitoring
data:
  prometheus.yml: |-
    global:
      scrape_interval: 15s
      external_labels:
        monitor: 'tutorial-monitor'
    scrape_configs:
    - job_name: 'prometheus'
      scrape_interval: 5s
      static_configs:
      - targets: ['localhost:9090']
    - job_name: 'felix_metrics'
      scrape_interval: 5s
      scheme: http
      kubernetes_sd_configs:
      - role: endpoints
      relabel_configs:
      - source_labels: [__meta_kubernetes_service_name]
        regex: felix-metrics-svc
        replacement: $1
        action: keep
    - job_name: 'typha_metrics'
      scrape_interval: 5s
      scheme: http
      kubernetes_sd_configs:
      - role: endpoints
      relabel_configs:
      - source_labels: [__meta_kubernetes_service_name]
        regex: typha-metrics-svc
        replacement: $1
        action: keep
    - job_name: 'kube_controllers_metrics'
      scrape_interval: 5s
      scheme: http
      kubernetes_sd_configs:
      - role: endpoints
      relabel_configs:
      - source_labels: [__meta_kubernetes_service_name]
        regex: kube-controllers-metrics-svc
        replacement: $1
        action: keep
EOF
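All three scrape jobs use the same relabeling trick: endpoint discovery returns every endpoint in the cluster, and the `keep` action drops any target whose `__meta_kubernetes_service_name` does not match the job's regex. Prometheus fully anchors relabel regexes, which `fullmatch` mimics in this sketch:

```python
import re

def keep_targets(targets, source_label, pattern):
    """Mimic a relabel_configs `keep` action: retain targets whose label
    value fully matches the (anchored) regex, as Prometheus does."""
    regex = re.compile(pattern)
    return [t for t in targets if regex.fullmatch(t.get(source_label, ""))]

# Discovery returns every endpoint; only the Felix service survives the filter.
discovered = [
    {"__meta_kubernetes_service_name": "felix-metrics-svc"},
    {"__meta_kubernetes_service_name": "kube-dns"},
]
kept = keep_targets(discovered, "__meta_kubernetes_service_name", "felix-metrics-svc")
print(kept)
```

Because the regexes are anchored, `felix-metrics-svc` will not accidentally match a longer name like `felix-metrics-svc-v2`.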

Install Prometheus

Once the steps above succeed, apply the following to install Prometheus:

kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: prometheus-pod
  namespace: calico-monitoring
  labels:
    app: prometheus-pod
    role: monitoring
spec:
  serviceAccountName: calico-prometheus-user
  containers:
  - name: prometheus-pod
    image: prom/prometheus
    resources:
      limits:
        memory: "128Mi"
        cpu: "500m"
    volumeMounts:
    - name: config-volume
      mountPath: /etc/prometheus/prometheus.yml
      subPath: prometheus.yml
    ports:
    - containerPort: 9090
  volumes:
  - name: config-volume
    configMap:
      name: prometheus-config
EOF

Check the installation progress; if the returned status is Running, the installation is complete.

kubectl get pods prometheus-pod -n calico-monitoring
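If you would rather script this check than re-run `kubectl get` by hand, the pod's phase lives at `.status.phase` of the JSON output. A sketch parsing a trimmed, illustrative sample of `kubectl get pod prometheus-pod -o json`:

```python
import json

# Trimmed, illustrative sample of `kubectl get pod ... -o json` output.
raw = '{"metadata": {"name": "prometheus-pod"}, "status": {"phase": "Running"}}'

def is_running(kubectl_json):
    """True once the pod reports the Running phase."""
    pod = json.loads(kubectl_json)
    return pod["status"]["phase"] == "Running"

print(is_running(raw))
```

Note that `Running` only means the container started; a crash-looping pod also passes through this phase, so checking container readiness is stricter if you need it.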

Access Prometheus

Since we have not created a Service for Prometheus yet, use port forwarding first to quickly verify that Prometheus is collecting Calico's data:

kubectl port-forward --address 0.0.0.0 pod/prometheus-pod 9090:9090 -n calico-monitoring
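With the port-forward in place you can also query the Prometheus HTTP API at `http://ip:9090/api/v1/query` instead of clicking through the web UI. A sketch of parsing an instant-vector response (the payload below is an illustrative sample, not live data):

```python
import json

# Illustrative sample of a /api/v1/query?query=felix_active_local_endpoints response.
raw = json.dumps({
    "status": "success",
    "data": {"resultType": "vector", "result": [
        {"metric": {"__name__": "felix_active_local_endpoints",
                    "instance": "10.0.0.1:9091"},
         "value": [1636000000, "12"]},
    ]},
})

def vector_values(response_json):
    """Return {instance: float} from an instant-vector query response.
    Prometheus encodes sample values as strings, hence the float()."""
    body = json.loads(response_json)
    assert body["status"] == "success"
    return {r["metric"]["instance"]: float(r["value"][1])
            for r in body["data"]["result"]}

print(vector_values(raw))
```

An empty `result` list here is the quickest sign that a scrape job's relabeling is not keeping the target you expected.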

Then open http://ip:9090 in a browser.

Grafana installation and configuration

Before configuring Grafana, we need to declare how Prometheus is reached, so Grafana can query the data and render its charts.

Create the Prometheus Service

kubectl apply -f - <<EOF
apiVersion: v1
kind: Service
metadata:
  name: prometheus-dashboard-svc
  namespace: calico-monitoring
spec:
  selector:
    app: prometheus-pod
    role: monitoring
  ports:
  - port: 9090
    targetPort: 9090
EOF

Create the Grafana configuration

Declare the datasource type, address, port, and access mode that Grafana uses to connect to Prometheus:

kubectl apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-config
  namespace: calico-monitoring
data:
  prometheus.yaml: |-
    {
      "apiVersion": 1,
      "datasources": [
        {
          "access": "proxy",
          "editable": true,
          "name": "calico-demo-prometheus",
          "orgId": 1,
          "type": "prometheus",
          "url": "http://prometheus-dashboard-svc.calico-monitoring.svc:9090",
          "version": 1
        }
      ]
    }
EOF
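Note the datasource `url`: Grafana reaches Prometheus through the cluster's internal DNS name, which follows the `<service>.<namespace>.svc` pattern, so no node IP or NodePort is involved. A tiny helper sketch for composing such in-cluster URLs:

```python
def cluster_svc_url(service, namespace, port, scheme="http"):
    """Build an in-cluster URL of the form <scheme>://<service>.<namespace>.svc:<port>."""
    return f"{scheme}://{service}.{namespace}.svc:{port}"

print(cluster_svc_url("prometheus-dashboard-svc", "calico-monitoring", 9090))
# http://prometheus-dashboard-svc.calico-monitoring.svc:9090
```

Because both pods live in the same cluster, this name resolves from inside the Grafana pod even though it is meaningless from your workstation.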

Felix dashboard configuration

kubectl apply -f https://docs.projectcalico.org/manifests/grafana-dashboards.yaml

Install Grafana

Apply the configuration below; it pulls the latest image from the official Grafana registry.

kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: grafana-pod
  namespace: calico-monitoring
  labels:
    app: grafana-pod
    role: monitoring
spec:
  containers:
  - name: grafana-pod
    image: grafana/grafana:latest
    resources:
      limits:
        memory: "128Mi"
        cpu: "500m"
    volumeMounts:
    - name: grafana-config-volume
      mountPath: /etc/grafana/provisioning/datasources
    - name: grafana-dashboards-volume
      mountPath: /etc/grafana/provisioning/dashboards
    - name: grafana-storage-volume
      mountPath: /var/lib/grafana
    ports:
    - containerPort: 3000
  volumes:
  - name: grafana-storage-volume
    emptyDir: {}
  - name: grafana-config-volume
    configMap:
      name: grafana-config
  - name: grafana-dashboards-volume
    configMap:
      name: grafana-dashboards-config
EOF

Access Grafana

Since there is no Service for it yet, forward the port first to verify that monitoring works:

kubectl port-forward --address 0.0.0.0 pod/grafana-pod 3000:3000 -n calico-monitoring

Open http://IP:3000 to reach the Grafana login page; the default credentials are admin/admin.

After logging in, you will be prompted to change the password or skip; it can also be changed later in the settings.


Right after logging in there is nothing on the home page; you need to visit this address: http://ip:3000/d/calico-felix-dashboard/felix-dashboard-calico?orgId=1

This opens the dashboard Calico provides for us. Click the star so the panel can be found on the home page later.


Create a Service

Use the expose command to create a NodePort-type Service directly:

# Create the Service
kubectl expose pod grafana-pod --port=3000 --target-port=3000 --type=NodePort -n calico-monitoring
# Check the exposed port
kubectl get svc -n calico-monitoring

Visit any cluster node's IP plus the NodePort reported by the command above (30538 in my case) to open Grafana.

Uninstall

If you find that this monitoring stack uses too many cluster resources, or you only wanted to see the effect, run the following commands to remove it:

kubectl delete service felix-metrics-svc -n kube-system
kubectl delete service typha-metrics-svc -n kube-system
kubectl delete service kube-controllers-metrics-svc -n kube-system
kubectl delete clusterrole calico-prometheus-user
kubectl delete clusterrolebinding calico-prometheus-user
kubectl delete namespace calico-monitoring
