The architecture of QoS ensurance is shown below. It contains three modules.
The main process:
AvoidanceAction defines the operations to perform after interference is detected, such as disable scheduling, throttle, and eviction, together with their related parameters.
NodeQOS defines how metrics are collected and with which parameters, the watermark parameters, and the avoidance action to take when metrics are abnormal. A series of selectors binds these settings to the specified nodes.
PodQOS defines the AvoidanceActions that may be executed on a specified pod. It is usually paired with NodeQOS to limit the scope of actions along both the node and the pod dimension. PodQOS supports label selectors as well as filtering by QOSClass (“BestEffort”, “Guaranteed”, etc.), Priority, and Namespace of pods. All configured selectors are combined with “AND”.
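The “AND” combination of PodQOS selectors can be illustrated with a small Python sketch. The `Pod` and `PodSelector` classes below are hypothetical stand-ins written for this illustration, not Crane or Kubernetes API types; the sketch only shows that a pod is in scope when every configured selector matches.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Pod:
    """Minimal stand-in for a pod (hypothetical, not the Kubernetes object)."""
    name: str
    namespace: str
    qos_class: str  # e.g. "BestEffort", "Burstable", "Guaranteed"
    priority: int
    labels: Dict[str, str] = field(default_factory=dict)


@dataclass
class PodSelector:
    """Hypothetical model of PodQOS selectors; None means "no constraint"."""
    match_labels: Optional[Dict[str, str]] = None
    qos_classes: Optional[Set[str]] = None
    priorities: Optional[Set[int]] = None
    namespaces: Optional[Set[str]] = None

    def matches(self, pod: Pod) -> bool:
        # Every configured selector must match: the selectors are ANDed.
        if self.match_labels is not None and any(
            pod.labels.get(key) != value
            for key, value in self.match_labels.items()
        ):
            return False
        if self.qos_classes is not None and pod.qos_class not in self.qos_classes:
            return False
        if self.priorities is not None and pod.priority not in self.priorities:
            return False
        if self.namespaces is not None and pod.namespace not in self.namespaces:
            return False
        return True


selector = PodSelector(
    match_labels={"preemptible_job": "true"},
    qos_classes={"BestEffort"},
)
batch = Pod("batch", "default", "BestEffort", 0, {"preemptible_job": "true"})
web = Pod("web", "default", "Guaranteed", 1000, {"preemptible_job": "true"})

print(selector.matches(batch))  # True: label and QOSClass both match
print(selector.matches(web))    # False: label matches, QOSClass does not
```

A selector with no constraints matches every pod in this sketch; whether an empty PodQOS behaves the same way in Crane is not stated in this document.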
The following AvoidanceAction and NodeQOS can be defined so that the disable scheduling action is executed on the node when the node CPU usage reaches the threshold.
The sample YAML is shown below:
```yaml
apiVersion: ensurance.crane.io/v1alpha1
kind: AvoidanceAction
metadata:
  labels:
    app: system
  name: disablescheduling
spec:
  description: disable schedule new pods to the node
  coolDownSeconds: 300 # The minimum wait time of the node from scheduling disable status to normal status
---
apiVersion: ensurance.crane.io/v1alpha1
kind: NodeQOS
metadata:
  name: "watermark1"
spec:
  nodeQualityProbe:
    timeoutSeconds: 10
    nodeLocalGet:
      localCacheTTLSeconds: 60
  rules:
    - name: "cpu-usage"
      avoidanceThreshold: 2 #(1)
      restoreThreshold: 2 #(2)
      actionName: "disablescheduling" #(3)
      strategy: "None" #(4)
      metricRule:
        name: "cpu_total_usage" #(5)
        value: 4000 #(6)
---
apiVersion: ensurance.crane.io/v1alpha1
kind: PodQOS
metadata:
  name: all-elastic-pods
spec:
  allowedActions:
    - disablescheduling
  labelSelector:
    matchLabels:
      preemptible_job: "true"
```
Please check the video to learn more about the scheduling disable actions.
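To make the role of `avoidanceThreshold` and `restoreThreshold` more concrete, here is a minimal Python sketch. It assumes, and this document does not confirm, that the action is triggered after the metric has reached `metricRule.value` on that many consecutive samples and is restored after the same kind of run of normal samples. The `RuleState` class is hypothetical, and `coolDownSeconds` is deliberately not modeled.

```python
from dataclasses import dataclass


@dataclass
class RuleState:
    """Tracks one rule; the field semantics are assumptions, not Crane code."""
    metric_threshold: float   # corresponds to metricRule.value
    avoidance_threshold: int  # consecutive breaches needed to trigger
    restore_threshold: int    # consecutive normal samples needed to restore
    breaches: int = 0
    normals: int = 0
    active: bool = False      # True while the avoidance action is in effect

    def observe(self, value: float) -> bool:
        """Feed one metric sample and return whether the action is active."""
        if value >= self.metric_threshold:
            self.breaches += 1
            self.normals = 0
            if not self.active and self.breaches >= self.avoidance_threshold:
                self.active = True
        else:
            self.normals += 1
            self.breaches = 0
            if self.active and self.normals >= self.restore_threshold:
                self.active = False
        return self.active


rule = RuleState(metric_threshold=4000, avoidance_threshold=2, restore_threshold=2)
samples = [4200, 3900, 4100, 4300, 3800, 3700]
states = [rule.observe(sample) for sample in samples]
print(states)  # [False, False, False, True, True, False]
```

Under these assumptions a single spike (the first sample) does not disable scheduling; only the second consecutive breach does, and two consecutive normal samples are needed before the node returns to normal.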
The following AvoidanceAction and NodeQOS can be defined so that the throttle action is executed on the node when the node CPU usage reaches the threshold.
The sample YAML is shown below:
```yaml
apiVersion: ensurance.crane.io/v1alpha1
kind: AvoidanceAction
metadata:
  name: throttle
  labels:
    app: system
spec:
  coolDownSeconds: 300
  throttle:
    cpuThrottle:
      minCPURatio: 10 #(1)
      stepCPURatio: 10 #(2)
  description: "throttle low priority pods"
---
apiVersion: ensurance.crane.io/v1alpha1
kind: NodeQOS
metadata:
  name: "watermark2"
spec:
  nodeQualityProbe:
    timeoutSeconds: 10
    nodeLocalGet:
      localCacheTTLSeconds: 60
  rules:
    - name: "cpu-usage"
      avoidanceThreshold: 2
      restoreThreshold: 2
      actionName: "throttle"
      strategy: "None"
      metricRule:
        name: "cpu_total_usage"
        value: 6000
---
apiVersion: ensurance.crane.io/v1alpha1
kind: PodQOS
metadata:
  name: all-be-pods
spec:
  allowedActions:
    - throttle
  scopeSelector:
    matchExpressions:
      - operator: In
        scopeName: QOSClass
        values:
          - BestEffort
```
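The interplay of `minCPURatio` and `stepCPURatio` can be sketched in Python. This is an illustration under stated assumptions, not Crane's implementation: it assumes that each triggered avoidance lowers a pod's CPU quota ratio by `stepCPURatio` percentage points, that the ratio never drops below `minCPURatio`, and that each restore raises it by the same step up to 100. The function names are hypothetical.

```python
def throttle_step(current_ratio: int, min_ratio: int, step_ratio: int) -> int:
    """Lower the CPU quota ratio (percent) by one step, never below min_ratio."""
    return max(min_ratio, current_ratio - step_ratio)


def restore_step(current_ratio: int, step_ratio: int, max_ratio: int = 100) -> int:
    """Raise the CPU quota ratio (percent) by one step, never above max_ratio."""
    return min(max_ratio, current_ratio + step_ratio)


ratio = 100
history = []
for _ in range(3):
    # Three consecutive avoidance triggers with minCPURatio=10, stepCPURatio=10.
    ratio = throttle_step(ratio, min_ratio=10, step_ratio=10)
    history.append(ratio)
print(history)  # [90, 80, 70]

# The floor applies once a full step would go below minCPURatio.
print(throttle_step(15, min_ratio=10, step_ratio=10))  # 10
```

A stepwise reduction of this kind lets the node shed load gradually rather than starving low-priority pods on the first trigger.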
The following YAML shows another case: low-priority pods on the node are evicted when the node CPU usage reaches the threshold.
```yaml
apiVersion: ensurance.crane.io/v1alpha1
kind: AvoidanceAction
metadata:
  name: eviction
  labels:
    app: system
spec:
  coolDownSeconds: 300
  eviction:
    terminationGracePeriodSeconds: 30 #(1)
  description: "evict low priority pods"
---
apiVersion: ensurance.crane.io/v1alpha1
kind: NodeQOS
metadata:
  name: "watermark3"
  labels:
    app: "system"
spec:
  nodeQualityProbe:
    timeoutSeconds: 10
    nodeLocalGet:
      localCacheTTLSeconds: 60
  rules:
    - name: "cpu-usage"
      avoidanceThreshold: 2
      restoreThreshold: 2
      actionName: "eviction"
      strategy: "Preview" #(1)
      metricRule:
        name: "cpu_total_usage"
        value: 6000
---
apiVersion: ensurance.crane.io/v1alpha1
kind: PodQOS
metadata:
  name: all-elastic-pods
spec:
  allowedActions:
    - eviction
  labelSelector:
    matchLabels:
      preemptible_job: "true"
```
Name | Description |
---|---|
cpu_total_usage | Node CPU usage |
cpu_total_utilization | Node CPU utilization, in percent |
memory_total_usage | Node memory usage |
memory_total_utilization | Node memory utilization, in percent |
For details, please refer to the examples under examples/ensurance.
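The metric table lists both absolute usage and utilization percentages. As a rough illustration of how the two relate, the Python sketch below converts a usage value into a percentage of node capacity. It assumes that `cpu_total_usage` is expressed in millicores, which the sample threshold values suggest but this document does not state; the function name and the capacity figure are hypothetical.

```python
def cpu_utilization_percent(usage_millicores: float, capacity_millicores: float) -> float:
    """Return CPU usage as a percentage of the node's CPU capacity."""
    if capacity_millicores <= 0:
        raise ValueError("capacity_millicores must be positive")
    return 100.0 * usage_millicores / capacity_millicores


# A usage of 4000 millicores on a hypothetical 8-core (8000 millicore) node.
print(cpu_utilization_percent(4000, 8000))  # 50.0
```

An absolute threshold such as `cpu_total_usage` means a different load level on nodes of different sizes, whereas a utilization threshold scales with node capacity.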
To keep active avoidance operations from affecting high-priority services, for example by wrongly evicting an important service, it is recommended to use PodQOS to select the workloads that use dynamic resources. Actions then affect only the workloads that run on idle resources, which protects the stability of the core business on the node.
For details on dynamic resources, see Dynamic resource oversold and limit.