Crane-scheduler

Introduction to Crane-scheduler

Crane-scheduler is a set of scheduling plugins built on the Kubernetes scheduler framework.

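Getting Started

Install Prometheus

Make sure Prometheus is installed in your Kubernetes cluster. If it is not, see Install Prometheus.

Configure Prometheus Rules

Configure Prometheus recording rules to produce the expected aggregated data: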

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-record
spec:
  groups:
  - name: cpu_mem_usage_active
    interval: 30s
    rules:
    - record: cpu_usage_active
      expr: 100 - (avg by (instance) (irate(node_cpu_seconds_total{mode="idle"}[30s])) * 100)
    - record: mem_usage_active
      expr: 100*(1-node_memory_MemAvailable_bytes/node_memory_MemTotal_bytes)
  - name: cpu-usage-5m
    interval: 5m
    rules:
    - record: cpu_usage_max_avg_1h
      expr: max_over_time(cpu_usage_avg_5m[1h])
    - record: cpu_usage_max_avg_1d
      expr: max_over_time(cpu_usage_avg_5m[1d])
  - name: cpu-usage-1m
    interval: 1m
    rules:
    - record: cpu_usage_avg_5m
      expr: avg_over_time(cpu_usage_active[5m])
  - name: mem-usage-5m
    interval: 5m
    rules:
    - record: mem_usage_max_avg_1h
      expr: max_over_time(mem_usage_avg_5m[1h])
    - record: mem_usage_max_avg_1d
      expr: max_over_time(mem_usage_avg_5m[1d])
  - name: mem-usage-1m
    interval: 1m
    rules:
    - record: mem_usage_avg_5m
      expr: avg_over_time(mem_usage_active[5m])

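!!! warning "Troubleshooting"

    The Prometheus scrape interval must be less than 30 seconds. Otherwise, rules such as `cpu_usage_active` may not work properly, because `irate` over a `[30s]` window needs at least two samples inside that window.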
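
To confirm that the recording rules above are producing data, each recorded series can be queried through the Prometheus HTTP API. This is a minimal sketch, not part of Crane-scheduler: `PROM_ADDR` and the helper names `prom_query_url` and `check_recorded_metrics` are assumptions introduced for illustration, while `/api/v1/query` is the standard Prometheus instant-query endpoint.

```bash
#!/usr/bin/env bash
# Assumption: PROM_ADDR points at your Prometheus server; adjust as needed.
PROM_ADDR="${PROM_ADDR:-http://localhost:9090}"

# Build the instant-query URL for one metric name (hypothetical helper).
prom_query_url() {
  printf '%s/api/v1/query?query=%s' "$PROM_ADDR" "$1"
}

# Query every series recorded by the rules above and print the raw JSON.
# An empty "result" array means the rule has not produced data yet.
check_recorded_metrics() {
  local metric
  for metric in cpu_usage_active cpu_usage_avg_5m cpu_usage_max_avg_1h cpu_usage_max_avg_1d \
                mem_usage_active mem_usage_avg_5m mem_usage_max_avg_1h mem_usage_max_avg_1d; do
    echo "== ${metric}"
    curl -s "$(prom_query_url "$metric")"
    echo
  done
}
```

Run `check_recorded_metrics` once Prometheus has evaluated the rules a few times. If `cpu_usage_active` stays empty, check the scrape interval first.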

Install Crane-scheduler

There are two options:

  • Install Crane-scheduler as a second scheduler
  • Replace the native Kube-scheduler with Crane-scheduler

Install Crane-scheduler as a second scheduler

=== "Main"

   ```bash
   helm repo add crane https://gocrane.github.io/helm-charts
   helm install scheduler -n crane-system --create-namespace --set global.prometheusAddr="REPLACE_ME_WITH_PROMETHEUS_ADDR" crane/scheduler
   ```

=== "Mirror"

   ```bash
   helm repo add crane https://finops-helm.pkg.coding.net/gocrane/gocrane
   helm install scheduler -n crane-system --create-namespace --set global.prometheusAddr="REPLACE_ME_WITH_PROMETHEUS_ADDR" crane/scheduler
   ```
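
Once either install command above has finished, the release and its pods can be inspected. This is a minimal sketch that assumes the release name `scheduler` and the namespace `crane-system` used in the commands above; the function name `check_scheduler_release` is hypothetical.

```bash
#!/usr/bin/env bash
# Show the Helm release status and the pods in its namespace.
# Defaults mirror the install commands above; pass other values if you changed them.
check_scheduler_release() {
  local release="${1:-scheduler}"
  local namespace="${2:-crane-system}"
  helm status "$release" -n "$namespace"
  kubectl get pods -n "$namespace"
}
```

Calling `check_scheduler_release` with no arguments checks the default release. Pods that are not in the `Running` state usually point to an image pull problem or a wrong `global.prometheusAddr` value.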

Replace the native Kube-scheduler with Crane-scheduler

  1. Back up /etc/kubernetes/manifests/kube-scheduler.yaml:
cp /etc/kubernetes/manifests/kube-scheduler.yaml /etc/kubernetes/
  1. Enable the Dynamic scheduling plugin and configure its arguments by editing the kube-scheduler configuration file (scheduler-config.yaml):
apiVersion: kubescheduler.config.k8s.io/v1beta2
kind: KubeSchedulerConfiguration
...
profiles:
- schedulerName: default-scheduler
  plugins:
    filter:
      enabled:
      - name: Dynamic
    score:
      enabled:
      - name: Dynamic
        weight: 3
  pluginConfig:
  - name: Dynamic
    args:
      policyConfigPath: /etc/kubernetes/policy.yaml
...
  1. Create /etc/kubernetes/policy.yaml to serve as the scheduling policy for the Dynamic plugin:
 apiVersion: scheduler.policy.crane.io/v1alpha1
 kind: DynamicSchedulerPolicy
 spec:
   syncPolicy:
     ##cpu usage
     - name: cpu_usage_avg_5m
       period: 3m
     - name: cpu_usage_max_avg_1h
       period: 15m
     - name: cpu_usage_max_avg_1d
       period: 3h
     ##memory usage
     - name: mem_usage_avg_5m
       period: 3m
     - name: mem_usage_max_avg_1h
       period: 15m
     - name: mem_usage_max_avg_1d
       period: 3h

   predicate:
     ##cpu usage
     - name: cpu_usage_avg_5m
       maxLimitPecent: 0.65
     - name: cpu_usage_max_avg_1h
       maxLimitPecent: 0.75
     ##memory usage
     - name: mem_usage_avg_5m
       maxLimitPecent: 0.65
     - name: mem_usage_max_avg_1h
       maxLimitPecent: 0.75

   priority:
     ##cpu usage
     - name: cpu_usage_avg_5m
       weight: 0.2
     - name: cpu_usage_max_avg_1h
       weight: 0.3
     - name: cpu_usage_max_avg_1d
       weight: 0.5
     ##memory usage
     - name: mem_usage_avg_5m
       weight: 0.2
     - name: mem_usage_max_avg_1h
       weight: 0.3
     - name: mem_usage_max_avg_1d
       weight: 0.5

   hotValue:
     - timeRange: 5m
       count: 5
     - timeRange: 1m
       count: 2
  1. Edit kube-scheduler.yaml and replace the kube-scheduler image with the Crane-scheduler image:
...
 image: docker.io/gocrane/crane-scheduler:0.0.23
...
  1. Install crane-scheduler-controller:

=== "Main"

      kubectl apply -f https://raw.githubusercontent.com/gocrane/crane-scheduler/main/deploy/controller/rbac.yaml
      kubectl apply -f https://raw.githubusercontent.com/gocrane/crane-scheduler/main/deploy/controller/deployment.yaml

=== "Mirror"

  ```bash
  kubectl apply -f https://gitee.com/finops/crane-scheduler/raw/main/deploy/controller/rbac.yaml
  kubectl apply -f https://gitee.com/finops/crane-scheduler/raw/main/deploy/controller/deployment.yaml
  ```
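
After the replacement steps above, it can help to confirm that the running scheduler pod actually uses the Crane-scheduler image. This is a minimal sketch that assumes a kubeadm-style cluster, where the scheduler static pod runs in `kube-system` and carries the label `component=kube-scheduler`. The function names `scheduler_image` and `is_crane_scheduler` are hypothetical.

```bash
#!/usr/bin/env bash
# Print the image(s) used by the running scheduler pod.
# Assumption: kubeadm-style static pod labelled component=kube-scheduler in kube-system.
scheduler_image() {
  kubectl get pods -n kube-system -l component=kube-scheduler \
    -o jsonpath='{.items[*].spec.containers[*].image}'
}

# Succeed only if the running scheduler uses a crane-scheduler image.
is_crane_scheduler() {
  case "$(scheduler_image)" in
    *crane-scheduler*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `is_crane_scheduler && echo "replaced"` prints `replaced` only when the image name contains `crane-scheduler`. If it does not, the kubelet may not have picked up the edited manifest yet, or the backup copy from step 1 can be restored.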

Schedule Pods with Crane-scheduler

Use the following example to test Crane-scheduler:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: cpu-stress
spec:
  selector:
    matchLabels:
      app: cpu-stress
  replicas: 1
  template:
    metadata:
      labels:
        app: cpu-stress
    spec:
      schedulerName: crane-scheduler
      hostNetwork: true
      tolerations:
      - key: node.kubernetes.io/network-unavailable
        operator: Exists
        effect: NoSchedule
      containers:
      - name: stress
        image: docker.io/gocrane/stress:latest
        command: ["stress", "-c", "1"]
        resources:
          requests:
            memory: "1Gi"
            cpu: "1"
          limits:
            memory: "1Gi"
            cpu: "1"

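!!! note

    If you want to use `crane-scheduler` as the default scheduler, change `crane-scheduler` to `default-scheduler` in the `schedulerName` field of the example above.

If the test pod is scheduled successfully, an event like the following is recorded: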

Type    Reason     Age   From             Message
----    ------     ----  ----             -------
Normal  Scheduled  28s   crane-scheduler  Successfully assigned default/cpu-stress-7669499b57-zmrgb to vm-162-247-ubuntu
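
The event above can be looked up with standard `kubectl` commands. This is a minimal sketch that assumes the sample Deployment was created in the `default` namespace, as the event message suggests; the function names `scheduled_events` and `stress_pod_nodes` are hypothetical.

```bash
#!/usr/bin/env bash
# List events whose reason is Scheduled in a namespace (default: "default").
scheduled_events() {
  local namespace="${1:-default}"
  kubectl get events -n "$namespace" --field-selector reason=Scheduled
}

# Show which node each cpu-stress pod was placed on.
stress_pod_nodes() {
  local namespace="${1:-default}"
  kubectl get pods -n "$namespace" -l app=cpu-stress -o wide
}
```

Run `scheduled_events` to list the scheduling events, then `stress_pod_nodes` to see the node chosen for the test pod.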