Kubernetes Custom-Metrics HPA

Background

Out of the box, Kubernetes only offers CPU and memory as HPA scaling metrics. For more complex scenarios, such as autoscaling on the per-replica QPS of a service, you can install prometheus-adapter to drive Pod autoscaling from custom metrics.

What Prometheus-adapter Does

The metrics Prometheus collects cannot be consumed by Kubernetes directly, because the two data formats are incompatible. An extra component, prometheus-adapter, converts Prometheus metrics into a format the Kubernetes API can understand. Because prometheus-adapter is a custom API service, it must also be registered with the main API server through the Kubernetes aggregator, so that it can be reached directly under /apis/.

The Kubernetes apiserver exposes three APIs for metrics-related operations:

  • resource metrics API: designed to supply monitoring metrics to core Kubernetes components, e.g. kubectl top;
  • custom metrics API: designed to supply metrics to the HPA controller;
  • external metrics API: designed for scaling on metrics from outside the cluster (covered in detail later).

prometheus-adapter supports all three of these APIs, including the resource metrics behind kubectl top node/pod, so we can use prometheus-adapter as a replacement for metrics-server:

  • resource metrics API
  • custom metrics API
  • external metrics API
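
A quick way to check which of these APIs a given cluster actually serves is to query them directly; each call returns an APIResourceList when the corresponding APIService is registered (jq here is only for pretty-printing):

kubectl get --raw /apis/metrics.k8s.io/v1beta1 | jq .
kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1 | jq .
kubectl get --raw /apis/external.metrics.k8s.io/v1beta1 | jq .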

Kubernetes API Aggregation

Kubernetes 1.7 introduced the aggregation layer, which lets third-party applications register themselves with kube-apiserver and have their new APIs accessed and operated on through the API server's ordinary HTTP URLs. To make this possible, Kubernetes added an API Aggregation Layer inside the kube-apiserver service that forwards requests for extension APIs to the user's own service.

[Figure: custom-metrics-hpa-1]

When you access apis/metrics.k8s.io/v1beta1, you are actually talking to a proxy called kube-aggregator. kube-apiserver is one backend of that proxy, and Metrics Server is another. This mechanism makes it very easy to extend the Kubernetes API.
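
As a sketch, registering an aggregated API comes down to an APIService object like the one below; the service name and namespace here assume the monitoring/prometheus-adapter Service deployed later in this article:

apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1beta1.custom.metrics.k8s.io
spec:
  # requests under /apis/custom.metrics.k8s.io/v1beta1 are proxied to this Service
  service:
    name: prometheus-adapter
    namespace: monitoring
  group: custom.metrics.k8s.io
  version: v1beta1
  insecureSkipTLSVerify: true
  groupPriorityMinimum: 100
  versionPriority: 100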

If you deployed with kubeadm, this is enabled by default. If you deployed from binaries, add the following flags to the kube-apiserver startup configuration:

# vim /opt/kubernetes/cfg/kube-apiserver.conf
......
--proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt
--proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key
--requestheader-allowed-names=front-proxy-client
--requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
--requestheader-extra-headers-prefix=X-Remote-Extra-
--requestheader-group-headers=X-Remote-Group
--requestheader-username-headers=X-Remote-User
......

Resource Metrics and Custom Metrics Workflows

[Figure: custom-metrics-hpa-2]

(1) Resource metrics flow: hpa -> apiserver -> kube aggregation -> metrics-server -> kubelet (cadvisor)
(2) Custom metrics flow: hpa -> apiserver -> kube aggregation -> prometheus-adapter -> prometheus -> pods

Deploying Prometheus-adapter

Project: https://github.com/qist/k8s/tree/main/k8s-yaml/monitoring

Download the custom-metrics-api and prometheus-adapter manifests from this GitHub project and apply them. custom-metrics-api creates three APIServices, while prometheus-adapter deploys the prometheus-adapter Pods and additionally adds the v1beta1.metrics.k8s.io APIService to replace metrics-server.
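
Applying them looks roughly like this (the directory paths are assumed from the project layout):

git clone https://github.com/qist/k8s.git
kubectl apply -f k8s/k8s-yaml/monitoring/custom-metrics-api/
kubectl apply -f k8s/k8s-yaml/monitoring/prometheus-adapter/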

The following command shows the four APIServices that were created. When the corresponding API paths are accessed, the requests are reverse-proxied to the monitoring/prometheus-adapter Service, and the Pods behind it return the data to the caller.

[root@k8s-master-01 prometheus-adapter]# kubectl get apiservice
....
v1beta1.authentication.k8s.io Local True 621d
v1beta1.authorization.k8s.io Local True 621d
v1beta1.batch Local True 621d
v1beta1.certificates.k8s.io Local True 621d
v1beta1.coordination.k8s.io Local True 621d
v1beta1.custom.metrics.k8s.io monitoring/prometheus-adapter True 31h
v1beta1.events.k8s.io Local True 621d
v1beta1.extensions Local True 621d
v1beta1.external.metrics.k8s.io monitoring/prometheus-adapter True 31h
v1beta1.metrics.k8s.io monitoring/prometheus-adapter True 15s
v1beta1.networking.k8s.io Local True 621d
v1beta1.node.k8s.io Local True 621d
v1beta1.policy Local True 621d
v1beta1.rbac.authorization.k8s.io Local True 621d
v1beta1.scheduling.k8s.io Local True 621d
v1beta1.storage.k8s.io Local True 621d
v1beta2.custom.metrics.k8s.io monitoring/prometheus-adapter True 31h
v2beta1.autoscaling Local True 621d
v2beta2.autoscaling Local True 621d
....
[root@k8s-master-01 prometheus-adapter]# kubectl get pod -n monitoring
NAME READY STATUS RESTARTS AGE
kube-state-metrics-69d87f556-dcccq 4/4 Running 0 43d
node-exporter-249cv 2/2 Running 0 41d
node-exporter-4xbm2 2/2 Running 0 6d6h
node-exporter-545k4 2/2 Running 0 41d
node-exporter-8hchf 2/2 Running 0 41d
node-exporter-kxgx7 2/2 Running 0 41d
node-exporter-mfwxl 2/2 Running 0 41d
node-exporter-qrpv5 2/2 Running 0 41d
node-exporter-xhr6b 2/2 Running 0 41d
prometheus-adapter-6bc9d8bd9c-86v5q 1/1 Running 0 30h
prometheus-adapter-6bc9d8bd9c-96t6q 1/1 Running 0 30h
prometheus-k8s-0 2/2 Running 0 7d8h
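
Since v1beta1.metrics.k8s.io is now served by prometheus-adapter, kubectl top should work even without metrics-server; a quick sanity check:

kubectl top node
kubectl top pod -n monitoring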

Prometheus-adapter Configuration

The adapter uses the fields rules, resourceRules, and externalRules for custom metrics, resource metrics, and external metrics respectively, as in this example:

apiVersion: v1
kind: ConfigMap
metadata:
  name: adapter-config
  namespace: monitoring
data:
  config.yaml: |
    rules:
    - seriesQuery: '{__name__=~"^container_.*",container!="POD",namespace!="",pod!=""}'
      seriesFilters: []
      resources:
        overrides:
          namespace:
            resource: namespace
          pod:
            resource: pod
      name:
        matches: ^container_(.*)_seconds_total$
        as: ""
      metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>,container!="POD"}[1m])) by (<<.GroupBy>>)
    - seriesQuery: '{__name__=~"^container_.*",container!="POD",namespace!="",pod!=""}'
      seriesFilters:
      - isNot: ^container_.*_seconds_total$
      resources:
        overrides:
          namespace:
            resource: namespace
          pod:
            resource: pod
      name:
        matches: ^container_(.*)_total$
        as: ""
      metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>,container!="POD"}[1m])) by (<<.GroupBy>>)
    - seriesQuery: '{__name__=~"^container_.*",container!="POD",namespace!="",pod!=""}'
      seriesFilters:
      - isNot: ^container_.*_total$
      resources:
        overrides:
          namespace:
            resource: namespace
          pod:
            resource: pod
      name:
        matches: ^container_(.*)$
        as: ""
      metricsQuery: sum(<<.Series>>{<<.LabelMatchers>>,container!="POD"}) by (<<.GroupBy>>)
    - seriesQuery: '{namespace!="",__name__!~"^container_.*"}'
      seriesFilters:
      - isNot: .*_total$
      resources:
        template: <<.Resource>>
      name:
        matches: ""
        as: ""
      metricsQuery: sum(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)
    - seriesQuery: '{namespace!="",__name__!~"^container_.*"}'
      seriesFilters:
      - isNot: .*_seconds_total
      resources:
        template: <<.Resource>>
      name:
        matches: ^(.*)_total$
        as: ""
      metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>)
    - seriesQuery: '{namespace!="",__name__!~"^container_.*"}'
      seriesFilters: []
      resources:
        template: <<.Resource>>
      name:
        matches: ^(.*)_seconds_total$
        as: ""
      metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>)
    "resourceRules":
      "cpu":
        "containerLabel": "container"
        "containerQuery": |
          sum by (<<.GroupBy>>) (
            irate (
              container_cpu_usage_seconds_total{<<.LabelMatchers>>,pod!=""}[120s]
            )
          )
        "nodeQuery": |
          sum by (<<.GroupBy>>) (
            1 - irate(
              node_cpu_seconds_total{mode="idle"}[60s]
            )
            * on(namespace, pod) group_left(node) (
              node_namespace_pod:kube_pod_info:{<<.LabelMatchers>>}
            )
          )
          or sum by (<<.GroupBy>>) (
            1 - irate(
              windows_cpu_time_total{mode="idle", job="windows-exporter",<<.LabelMatchers>>}[4m]
            )
          )
        "resources":
          "overrides":
            "namespace":
              "resource": "namespace"
            "node":
              "resource": "node"
            "pod":
              "resource": "pod"
      "memory":
        "containerLabel": "container"
        "containerQuery": |
          sum by (<<.GroupBy>>) (
            container_memory_working_set_bytes{<<.LabelMatchers>>,pod!=""}
          )
        "nodeQuery": |
          sum by (<<.GroupBy>>) (
            node_memory_MemTotal_bytes{job="node-exporter",<<.LabelMatchers>>}
            -
            node_memory_MemAvailable_bytes{job="node-exporter",<<.LabelMatchers>>}
          )
          or sum by (<<.GroupBy>>) (
            windows_cs_physical_memory_bytes{job="windows-exporter",<<.LabelMatchers>>}
            -
            windows_memory_available_bytes{job="windows-exporter",<<.LabelMatchers>>}
          )
        "resources":
          "overrides":
            "node":
              "resource": "node"
            "namespace":
              "resource": "namespace"
            "pod":
              "resource": "pod"
      "window": "5m"
    externalRules:
    - seriesQuery: '{__name__=~"^.*_queue_(length|size)$",namespace!=""}'
      resources:
        overrides:
          namespace:
            resource: namespace
      name:
        matches: ^.*_queue_(length|size)$
        as: "$0"
      metricsQuery: max(<<.Series>>{<<.LabelMatchers>>})
    - seriesQuery: '{__name__=~"^.*_queue$",namespace!=""}'
      resources:
        overrides:
          namespace:
            resource: namespace
      name:
        matches: ^.*_queue$
        as: "$0"
      metricsQuery: max(<<.Series>>{<<.LabelMatchers>>})

The adapter configuration has four main parts:

Discovery: selects which Prometheus metrics to handle. seriesQuery picks out the set of metrics to process, and seriesFilters can filter them more precisely.

seriesQuery can select by labels (as below) or by metric name directly:

seriesQuery: '{__name__=~"^container_.*_total",container_name!="POD",namespace!="",pod_name!=""}'
seriesFilters:
- isNot: "^container_.*_seconds_total"
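
For comparison, an illustrative seriesQuery that selects a single metric by name rather than by label pattern:

seriesQuery: 'http_requests_total{namespace!="",pod!=""}'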

seriesFilters:

is: <regex> — keeps metrics whose names match the regular expression.
isNot: <regex> — keeps metrics whose names do not match the regular expression.

Association: maps metric labels to Kubernetes resources (listable with kubectl api-resources). overrides associates a Prometheus metric label with a Kubernetes resource (a deployment in the example below). Note that the label must refer to a real Kubernetes resource: a metric's pod_name label can be mapped to the Kubernetes pod resource, but container_image cannot, and a wrong mapping means the custom metrics API cannot return correct values. In other words, the metric must carry a real resource name that can be mapped to a Kubernetes resource.

resources:
  overrides:
    microservice: {group: "apps", resource: "deployment"}

Naming: converts Prometheus metric names into the names exposed through the custom metrics API. It does not change the underlying metric name; curl http://$(kubectl get service sample-app -o jsonpath='{ .spec.clusterIP }')/metrics still returns the old name. You can skip this step if you don't need renaming.

# matches rewrites any name of the form <name>_total to <name>_per_second
# e.g. http_requests_total becomes http_requests_per_second
name:
  matches: "^(.*)_total$"
  as: "${1}_per_second"

With this in place, the HPA can later fetch the metric from /apis/{APIService-name}/v1beta1/namespaces/{namespaces-name}/pods/*/http_requests_per_second.
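
You can verify that the renamed metric is being served, for example (the default namespace here is just for illustration):

kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/http_requests_per_second" | jq .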

Querying: shapes the metric values returned by the custom metrics API; these values are what the HPA ultimately uses to scale.

# convert cumulative cAdvisor metrics into rates calculated over 2 minutes
metricsQuery: "sum(rate(<<.Series>>{<<.LabelMatchers>>,container_name!="POD"}[2m])) by (<<.GroupBy>>)"

The metricsQuery field uses a Go template to turn the API URL request into a Prometheus query. It extracts the fields of a custom metrics API request and splits them into the metric name, the group-resource, and one or more objects of that group-resource, mapped to the following fields:

  • Series: the metric name;
  • LabelMatchers: a comma-separated list of label matchers for the requested objects; currently this is the label for the particular group-resource, plus the namespace label if the group-resource is namespaced;
  • GroupBy: a comma-separated list of labels to group by; currently this holds the group-resource label from LabelMatchers.

Suppose the metric http_requests_per_second looks like this:

http_requests_per_second{pod="pod1",service="nginx1",namespace="somens"}
http_requests_per_second{pod="pod2",service="nginx2",namespace="somens"}

When you call kubectl get --raw "/apis/{APIService-name}/v1beta1/namespaces/somens/pods/*/http_requests_per_second", the metricsQuery template fields are populated as follows:

  • Series: "http_requests_total"
  • LabelMatchers: pod=~"pod1|pod2",namespace="somens"
  • GroupBy: pod
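
Substituting these into a metricsQuery template such as sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>) would therefore produce roughly this PromQL:

sum(rate(http_requests_total{pod=~"pod1|pod2",namespace="somens"}[2m])) by (pod)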

HPA Configuration

Depending on the metric type, the HPA pulls metrics from the resource paths of the aggregated APIs (metrics.k8s.io, custom.metrics.k8s.io, external.metrics.k8s.io).

HPA supports four metric types (shown below in v2beta2 format):

resource: currently only cpu and memory. The target can be a raw value (targetAverageValue) or a utilization ratio (targetAverageUtilization). The HPA fetches resource metrics from metrics.k8s.io.
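
For symmetry with the examples below, a Resource-type entry in v2beta2 form looks like this (the 60% threshold is illustrative):

type: Resource
resource:
  name: cpu
  target:
    type: Utilization
    averageUtilization: 60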

pods: custom metrics describing the Pods themselves. The target only supports scaling on a raw value (targetAverageValue), which is compared against the average of the metric across all relevant Pods.

type: Pods
pods:
  metric:
    name: packets-per-second
  target:
    type: AverageValue
    averageValue: 1k

The HPA fetches custom metrics from custom.metrics.k8s.io.

object: custom metrics describing a (non-Pod) object in the same namespace. The target supports Value and AverageValue: the former compares the metric directly against the target, while the latter divides the metric by the number of relevant Pods before comparing against the target.

type: Object
object:
  metric:
    name: requests-per-second
  describedObject:
    apiVersion: extensions/v1beta1
    kind: Ingress
    name: main-route
  target:
    type: Value
    value: 2k

external: added in Kubernetes 1.10+. As an example, Prometheus can usually scrape RabbitMQ directly; unfortunately, RabbitMQ's metrics endpoint does not include a queue length metric. To collect it we use the RabbitMQ Exporter. Once it is connected to RabbitMQ, we get a large set of RabbitMQ metrics that Prometheus scrapes and stores in its time-series database, and which we can then use as the basis for scaling. Any metric in Prometheus whose name starts with rabbitmq_queue can be served, as a rate over a 1-minute window, through the new external.metrics.k8s.io API. (In other words, any Pod can scale on these values, even if the two workloads are completely unrelated, as long as the metric has been registered among the api-resources.)

  • custom.metrics.k8s.io only supports scaling on the Pod's own metrics;
  • external.metrics.k8s.io lets other workloads scale on a metric's value (for example, nginx metrics could scale mysql, or mysql_exporter values could scale mysql).

For instance, we can scale nginx on the connection count reported by mongo_exporter. The HPA fetches external metrics from external.metrics.k8s.io.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: deployment-first
spec:
  replicas: 2
  selector:
    matchLabels:
      app: deployment-first
  template:
    metadata:
      labels:
        app: deployment-first
    spec:
      containers:
      - name: deployment-first
        image: nginx
        imagePullPolicy: Always
        ports:
        - containerPort: 80
          protocol: TCP
        resources:
          requests:
            cpu: "1m"
          limits:
            cpu: "100m"
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: app-server-mongo-conn-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: deployment-first
  minReplicas: 2
  maxReplicas: 5
  metrics:
  - type: External
    external:
      metric:
        name: mongodb_current_connection
        selector:
          matchLabels:
            queue: "worker_tasks"
      target:
        type: AverageValue
        averageValue: 30

[Figure: custom-metrics-hpa-3]

externalRules:
- seriesQuery: '{__name__=~"^.*_queue_(length|size)$",namespace!=""}'
  resources:
    overrides:
      namespace:
        resource: namespace
  name:
    matches: ^.*_queue_(length|size)$
    as: "$0"
  metricsQuery: max(<<.Series>>{<<.LabelMatchers>>})
- seriesQuery: '{__name__=~"^.*_queue$",namespace!=""}'
  resources:
    overrides:
      namespace:
        resource: namespace
  name:
    matches: ^.*_queue$
    as: "$0"
  metricsQuery: max(<<.Series>>{<<.LabelMatchers>>})

Metrics matched by the rules above are registered as new resource objects:

[root@k8s-master-01 ~]# kubectl api-resources
.....
ipamhandles crd.projectcalico.org false IPAMHandle
ippools crd.projectcalico.org false IPPool
networkpolicies crd.projectcalico.org true NetworkPolicy
networksets crd.projectcalico.org true NetworkSet
events ev events.k8s.io true Event
ingresses ing extensions true Ingress
node_cpu_core_throttles_total external.metrics.k8s.io true ExternalMetricValueList
node_network_transmit_queue_length external.metrics.k8s.io true ExternalMetricValueList
prometheus_notifications_queue_length external.metrics.k8s.io true ExternalMetricValueList
nodes metrics.k8s.io false NodeMetrics
pods metrics.k8s.io true PodMetrics
.....
[root@k8s-master-01 ~]# kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/monitoring/node_cpu_core_throttles_total" | jq .
{
  "kind": "ExternalMetricValueList",
  "apiVersion": "external.metrics.k8s.io/v1beta1",
  "metadata": {},
  "items": [
    {
      "metricName": "node_cpu_core_throttles_total",
      "metricLabels": {
        "__name__": "node_cpu_core_throttles_total",
        "app_kubernetes_io_name": "node-exporter",
        "app_kubernetes_io_version": "v1.2.2",
        "container": "node-exporter",
        "core": "0",
        "endpoint": "https",
        "instance": "172.16.50.1:9100",
        "job": "node-exporter",
        "namespace": "monitoring",
        "package": "0",
        "pod": "node-exporter-8hchf",
        "service": "node-exporter",
        "service_name": "node-exporter-172.16.50.1"
      },
      "timestamp": "2021-12-23T08:21:49Z",
      "value": "0"
    },
    {
      "metricName": "node_cpu_core_throttles_total",
      "metricLabels": {
        "__name__": "node_cpu_core_throttles_total",
        "app_kubernetes_io_name": "node-exporter",
        "app_kubernetes_io_version": "v1.2.2",
        "container": "node-exporter",
        "core": "0",
        "endpoint": "https",
        "instance": "172.16.50.29:9100",
        "job": "node-exporter",
        "namespace": "monitoring",
        "package": "0",
        "pod": "node-exporter-mfwxl",
        "service": "node-exporter",
        "service_name": "node-exporter-172.16.50.29"
      },
      "timestamp": "2021-12-23T08:21:49Z",
      "value": "0"
    }
  ]
}

Fetching Kubernetes Metrics

Assume the registered APIService is custom.metrics.k8s.io/v1beta1. Once it is registered, the HorizontalPodAutoscaler controller fetches metrics from paths rooted at /apis/custom.metrics.k8s.io/v1beta1. The metric API paths come in namespaced and non-namespaced flavors. You can verify that the HPA will be able to fetch metrics as follows:

namespaced

Get the metrics of a specific object type and name in a given namespace:

kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/{namespace-name}/{object-type}/{object-name}/{metric-name...}" | jq .   

For example, get the start_time_seconds metric of the pod named grafana in the monitor namespace:

kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/monitor/pods/grafana/start_time_seconds" | jq .   

Get the metrics of all objects of a given type in a namespace:

kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/{namespace-name}/pods/*/{metric-name...}" | jq .   

For example, get the start_time_seconds metric of all pods in the monitor namespace:

kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/monitor/pods/*/start_time_seconds" | jq .   

A labelSelector can select objects carrying a particular label:

kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/{namespace-name}/{object-type}/{object-name}/{metric-name...}?labelSelector={label-name}" | jq .   
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/{namespace-name}/pods/*/{metric-name...}?labelSelector={label-name}" | jq .   

non-namespaced

non-namespaced works like namespaced and mainly covers node, namespace, PersistentVolume, and so on. Some of the non-namespaced access paths differ from what the custom metrics API documentation describes.

Access metrics on namespace objects like this:

kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/{namespace-name}/metrics/{metric-name...}" | jq .   
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/*/metrics/{metric-name...}" | jq .   

Access node metrics like this:

kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/nodes/{node-name}/{metric-name...}" | jq .   

Example

---
apiVersion: v1
kind: Service
metadata:
  name: sample-app
  labels:
    app: sample-app
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "8080"
spec:
  ports:
  - name: http
    port: 8080
    targetPort: 8080
  selector:
    app: sample-app
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-app
  labels:
    app: sample-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sample-app
  template:
    metadata:
      labels:
        app: sample-app
        istio: ingressgateway
        version: v1
      annotations:
        sidecar.istio.io/inject: "true"
    spec:
      containers:
      - image: luxas/autoscale-demo:v0.1.2
        name: metrics-provider
        resources:
          requests:
            memory: "256Mi"
            cpu: "500m"
          limits:
            memory: "256Mi"
            cpu: "500m"
        ports:
        - name: http
          containerPort: 8080
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: sample-app
spec:
  maxReplicas: 10
  metrics:
  - pods:
      metric:
        name: http_requests
      target:
        averageValue: 500m
        type: AverageValue
    type: Pods
  minReplicas: 2
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: sample-app
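
After applying the manifests, you can confirm that the demo app actually exposes the underlying counter; one way is a throwaway curl pod (the image choice and the default namespace are assumptions):

kubectl run curl-test --rm -it --restart=Never --image=curlimages/curl -- \
  curl -s http://sample-app.default.svc:8080/metrics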

HPA Structure in Kubernetes

First, let's look at how an HPA is structured in Kubernetes. Here is an official HPA example, with comments added on the key fields to aid understanding.

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
  namespace: default
spec:
  # the object the HPA scales; the HPA dynamically adjusts its pod count
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  # minimum and maximum pod counts
  minReplicas: 1
  maxReplicas: 10
  # array of metrics to watch; multiple metric types can coexist
  metrics:
  # Object-type metric
  - type: Object
    object:
      metric:
        # metric name
        name: requests-per-second
      # the object the metric data comes from
      describedObject:
        apiVersion: networking.k8s.io/v1beta1
        kind: Ingress
        name: main-route
      # Value-type target; Object metrics only support Value and AverageValue targets
      target:
        type: Value
        value: 10k
  # Resource-type metric
  - type: Resource
    resource:
      name: cpu
      # Utilization-type target; Resource metrics only support Utilization and AverageValue targets
      target:
        type: Utilization
        averageUtilization: 50
  # Pods-type metric
  - type: Pods
    pods:
      metric:
        name: packets-per-second
      # AverageValue-type target; Pods metrics only support AverageValue targets
      target:
        type: AverageValue
        averageValue: 1k
  # External-type metric
  - type: External
    external:
      metric:
        name: queue_messages_ready
        # this selector matches the labels on the third-party metric
        # (the official docs are wrong here; the form below is correct)
        selector:
          matchLabels:
            env: "stage"
            app: "myapp"
      # External metrics only support Value and AverageValue targets
      target:
        type: AverageValue
        averageValue: 30

Metrics are matched against the adapter's configuration rules from top to bottom; whichever rule matches first is the one used for the calculation.

[root@k8s-master-01 prometheus-adapter]# kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pod/*/http_requests" | jq .
{
  "kind": "MetricValueList",
  "apiVersion": "custom.metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pod/%2A/http_requests"
  },
  "items": [
    {
      "describedObject": {
        "kind": "Pod",
        "namespace": "default",
        "name": "sample-app-b7b48448c-ljk2f",
        "apiVersion": "/v1"
      },
      "metricName": "http_requests",
      "timestamp": "2021-12-24T03:07:20Z",
      "value": "100m",
      "selector": null
    }
  ]
}

It was matched by this rule:

- seriesQuery: '{namespace!="",__name__!~"^container_.*"}'
  seriesFilters:
  - isNot: .*_seconds_total
  resources:
    template: <<.Resource>>
  name:
    matches: ^(.*)_total$
    as: ""
  metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>)

Now generate load against the sample app with ab:

[root@k8s-master-01 prometheus-adapter]# ab -c 100 -n 1000000 http://10.96.152.103:8080/metrics
This is ApacheBench, Version 2.3 <$Revision: 1430300 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 10.96.152.103 (be patient)
[root@k8s-master-01 ~]# kubectl get pod
NAME READY STATUS RESTARTS AGE
nfs-client-provisioner-7b889f4bc9-rb9xj 1/1 Running 2 222d
sample-app-b7b48448c-ljk2f 1/1 Running 0 54m
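
While ab is running, you can also watch the HPA react (a typical check; actual output will vary):

kubectl get hpa sample-app -w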

[Figure: custom-metrics-hpa-4]

Scale-out succeeded:

[root@k8s-master-01 ~]# kubectl get pod -w
NAME READY STATUS RESTARTS AGE
nfs-client-provisioner-7b889f4bc9-rb9xj 1/1 Running 2 222d
sample-app-b7b48448c-2dlvm 1/1 Running 0 81s
sample-app-b7b48448c-9tm79 1/1 Running 0 66s
sample-app-b7b48448c-9xzmb 1/1 Running 0 82s
sample-app-b7b48448c-h6lj6 1/1 Running 0 82s
sample-app-b7b48448c-jvpgn 1/1 Running 0 98s
sample-app-b7b48448c-knzh9 1/1 Running 0 66s
sample-app-b7b48448c-ljk2f 1/1 Running 0 56m
sample-app-b7b48448c-qhtgp 1/1 Running 0 98s
sample-app-b7b48448c-tc89k 1/1 Running 0 82s
sample-app-b7b48448c-vgn8n 1/1 Running 0 98s
