Manually Deploying Elastic Stack 9.0.1 on Kubernetes v1.32.3

Deployment Overview

This post walks through manually deploying Elastic Stack 9.0.1 on Kubernetes v1.32.3, without using the official Helm chart.

The official Helm chart is not used because, as of this writing, it only offers version 8.5.1.

The Elastic Stack deployment in this post targets production use and includes the following features:

  1. Elasticsearch runs with 3 replicas

  2. xpack.security is enabled

  3. The certificates Elasticsearch needs are generated automatically

  4. Passwords are stored in a Secret

  5. Data streams are enabled

  6. Index lifecycle management is enabled

Prerequisites

This post assumes the reader is already comfortable with basic kubectl operations and writing resource YAML files; those basics are not covered here.

Complete YAML

# Elasticsearch Service
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch
  namespace: elk
spec:
  selector:
    app: elasticsearch
  ports:
    - protocol: TCP
      port: 9200
      targetPort: 9200

---
# Elasticsearch Headless Service
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch-discovery
  namespace: elk
spec:
  selector:
    app: elasticsearch
  ports:
    - protocol: TCP
      port: 9300
      targetPort: 9300
      name: transport
    - port: 9200
      name: http
      targetPort: 9200

---
# Elasticsearch ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: elasticsearch-config
  namespace: elk
data:
  elasticsearch.yml: |-
    network.host: 0.0.0.0
    discovery.type: multi-node
    cluster.name: my-cluster
    xpack.security.enabled: true
    xpack.security.transport.ssl.enabled: true
    xpack.security.transport.ssl.verification_mode: certificate
    xpack.security.transport.ssl.keystore.path: /usr/share/elasticsearch/config/certs/elastic-certificates.p12
    xpack.security.transport.ssl.truststore.path: /usr/share/elasticsearch/config/certs/elastic-certificates.p12
    xpack.security.http.ssl.enabled: false
    discovery.initial_state_timeout: 60m
    discovery.seed_hosts: 
    - elasticsearch-0.elasticsearch-discovery.elk.svc.cluster.local:9300
    - elasticsearch-1.elasticsearch-discovery.elk.svc.cluster.local:9300
    - elasticsearch-2.elasticsearch-discovery.elk.svc.cluster.local:9300
    cluster.initial_master_nodes:
    - elasticsearch-0
    - elasticsearch-1
    - elasticsearch-2

---
# Password Secret
apiVersion: v1
kind: Secret
metadata:
  name: password-secret
  namespace: elk
type: Opaque
stringData:
  elasticsearch: "Changeme"
  logstash: "password"
  kibana: "123456"

---
# Elasticsearch Certificates PersistentVolumeClaim
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: elasticsearch-certs-pvc
  namespace: elk
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi

---
# Elasticsearch StatefulSet
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: elasticsearch
  namespace: elk
spec:
  serviceName: "elasticsearch-discovery"
  replicas: 3
  selector:
    matchLabels:
      app: elasticsearch
  template:
    metadata:
      labels:
        app: elasticsearch
    spec:
      initContainers:
      - name: generate-certs
        image: docker.elastic.co/elasticsearch/elasticsearch:9.0.1
        command: ["/bin/sh", "-c"]
        args:
        - |
          if [ ! -f /certs/elastic-certificates.p12 ]; then
            mkdir -p /certs && \
            /usr/share/elasticsearch/bin/elasticsearch-certutil ca --silent --pass "" --out /certs/elastic-ca.p12
            /usr/share/elasticsearch/bin/elasticsearch-certutil cert --silent --pass "" --ca /certs/elastic-ca.p12 --ca-pass "" --out /certs/elastic-certificates.p12
          else
            echo "Certificates already exist"
          fi
        volumeMounts:
        - name: certs-volume
          mountPath: /certs
          
      - name: chown
        image: busybox
        command: ["sh", "-c"]
        args:
          - "chown -R 1000:1000 /usr/share/elasticsearch/data"
        volumeMounts:
        - name: data
          mountPath: /usr/share/elasticsearch/data
      containers:
      - name: elasticsearch
        env:
        - name: ELASTIC_PASSWORD
          valueFrom:
            secretKeyRef:
              name: password-secret
              key: elasticsearch
        image: docker.elastic.co/elasticsearch/elasticsearch:9.0.1
        resources:
          limits:
            memory: 5Gi
            cpu: 4000m
          requests:
            memory: 4Gi
            cpu: 1000m
        ports:
        - containerPort: 9200
        - containerPort: 9300
        volumeMounts:
        - name: elasticsearch-config
          mountPath: /usr/share/elasticsearch/config/elasticsearch.yml
          subPath: elasticsearch.yml
        - name: data
          mountPath: /usr/share/elasticsearch/data
        - name: certs-volume
          mountPath: /usr/share/elasticsearch/config/certs
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchLabels:
                  app: elasticsearch
              topologyKey: kubernetes.io/hostname
      volumes:
        - name: certs-volume
          persistentVolumeClaim:
            claimName: elasticsearch-certs-pvc
        - name: elasticsearch-config
          configMap:
            name: elasticsearch-config                  
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 50Gi

---
# Kibana Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kibana
  namespace: elk
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kibana
  template:
    metadata:
      labels:
        app: kibana
    spec:
      initContainers:
      - name: wait-for-elasticsearch
        image: busybox
        command: ["/bin/sh", "-c"]
        args:
        - >
          until nc -z elasticsearch 9200; do echo waiting for elasticsearch...; sleep 2; done;
      containers:
      - name: kibana
        startupProbe:
          tcpSocket:
            port: 5601
          failureThreshold: 330
          periodSeconds: 10   
          initialDelaySeconds: 300
          timeoutSeconds: 5
        livenessProbe:
          tcpSocket:
            port: 5601
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 5
          failureThreshold: 10
        readinessProbe:
          tcpSocket:
            port: 5601
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 5
          failureThreshold: 3
        image: docker.elastic.co/kibana/kibana:9.0.1
        resources:
          limits:
            memory: 4Gi
            cpu: 2000m
        ports:
        - containerPort: 5601
        env:
        - name: ELASTICSEARCH_HOSTS
          value: http://elasticsearch:9200
        - name: ELASTICSEARCH_USERNAME
          value: kibana_system
        - name: ELASTICSEARCH_PASSWORD
          valueFrom:
            secretKeyRef:
              name: password-secret
              key: kibana
        - name: ELASTICSEARCH_SSL_VERIFICATIONMODE
          value: none
        volumeMounts:
        - name: config-volume
          mountPath: /usr/share/kibana/config/kibana.yml
          subPath: kibana.yml
      volumes:
      - name: config-volume
        configMap:
          name: kibana-config

---
# Kibana ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: kibana-config
  namespace: elk
data:
  kibana.yml: |-
    server.host: "0.0.0.0"
    elasticsearch.hosts: ["http://elasticsearch.elk.svc.cluster.local:9200"]
    elasticsearch.username: "${ELASTICSEARCH_USERNAME}"
    elasticsearch.password: "${ELASTICSEARCH_PASSWORD}"
    elasticsearch.ssl.verificationMode: none
    elasticsearch.requestTimeout: 60000

---
# Kibana Service
apiVersion: v1
kind: Service
metadata:
  name: kibana
  namespace: elk
spec:
  selector:
    app: kibana
  ports:
  - protocol: TCP
    port: 5601
    targetPort: 5601

---
# Logstash ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: logstash-config
  namespace: elk
data:
  logstash.conf: |
    input {
      beats {
        port => 5044
      }
    }

    filter {
      mutate {
        add_field => {
          "[data_stream][dataset]" => "k8s.%{[kubernetes][namespace]}" 
        }
      }
    }
    
    output {
      # stdout {
      #   codec => rubydebug
      # }

      elasticsearch {
        hosts => ["http://elasticsearch:9200"]
        # index => "k8s"
        user => "logstash_writer"
        password => "${LOGSTASH_WRITER_PASSWORD}"
        ssl_enabled => false
        # ilm_enabled          => true
        # ilm_policy          => "k8s-logs-policy"
        # manage_template     => true
        # template_name   => "k8s-template"
        data_stream => true
        # data_stream_dataset => "k8s.%{[kubernetes][namespace]}"
        data_stream_type => "logs"
      }
    }

---
# Logstash Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: logstash
  namespace: elk
spec:
  replicas: 1
  selector:
    matchLabels:
      app: logstash
  template:
    metadata:
      labels:
        app: logstash
    spec:
      initContainers:
      - name: wait-for-elasticsearch
        image: busybox
        command: ["/bin/sh", "-c"]
        args:
        - >
          until nc -z elasticsearch 9200; do echo waiting for elasticsearch...; sleep 2; done;
      containers:
      - name: logstash
        startupProbe:
          tcpSocket:
            port: 5044
          failureThreshold: 600 
          periodSeconds: 10   
          initialDelaySeconds: 150
          timeoutSeconds: 5
        livenessProbe:
          tcpSocket:
            port: 5044
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 5
        readinessProbe:
          tcpSocket:
            port: 5044
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 5
        image: docker.elastic.co/logstash/logstash:9.0.1
        resources:
          limits:
            memory: 4Gi
            cpu: 4000m
        ports:
        - containerPort: 5044
        env:
        - name: LOGSTASH_WRITER_PASSWORD
          valueFrom:
            secretKeyRef:
              name: password-secret
              key: logstash
        volumeMounts:
        - name: config-volume
          mountPath: /usr/share/logstash/pipeline/logstash.conf
          subPath: logstash.conf
        - name: shared-data
          mountPath: /usr/share/elasticsearch/config
      volumes:
      - name: config-volume
        configMap:
          name: logstash-config
      - name: shared-data
        emptyDir: {}

---
# Logstash Service
apiVersion: v1
kind: Service
metadata:
  name: logstash
  namespace: elk
spec:
  selector:
    app: logstash
  ports:
  - protocol: TCP
    port: 5044
    targetPort: 5044

---
# Filebeat ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: filebeat
rules:
- apiGroups: [""]
  resources: ["nodes", "pods", "namespaces"]
  verbs: ["get", "list", "watch"]

---
# Filebeat ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: filebeat
subjects:
- kind: ServiceAccount
  name: filebeat
  namespace: elk
roleRef:
  kind: ClusterRole
  name: filebeat
  apiGroup: rbac.authorization.k8s.io

---
# Filebeat ServiceAccount
apiVersion: v1
kind: ServiceAccount
metadata:
  name: filebeat
  namespace: elk

---
# Filebeat DaemonSet
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: filebeat
  namespace: elk
spec:
  selector:
    matchLabels:
      name: filebeat
  template:
    metadata:
      labels:
        name: filebeat
    spec:
      serviceAccountName: filebeat
      initContainers:
      - name: wait-for-logstash
        image: busybox
        command: ["/bin/sh", "-c"]
        args:
        - >
          until nc -z logstash 5044; do echo waiting for logstash...; sleep 2; done;
      containers:
      - name: filebeat
        env:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        - name: POD_NAME     
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: NAMESPACE     
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        image: docker.elastic.co/beats/filebeat:9.0.1
        securityContext:
          runAsUser: 0
        resources:
          limits:
            memory: 200Mi
            cpu: 100m
        volumeMounts:
        - name: config-volume
          mountPath: /usr/share/filebeat/filebeat.yml
          subPath: filebeat.yml
        - name: containers
          mountPath: /var/log/pods/
          readOnly: true
      volumes:
      - name: config-volume
        configMap:
          defaultMode: 0600
          name: filebeat-config
      - name: containers
        hostPath:
          path: /var/log/pods/

---
# Filebeat ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: filebeat-config
  namespace: elk
data:
  filebeat.yml: |
    filebeat.inputs:
    - type: filestream
      id: unique-input-id
      prospector.scanner.symlinks: true
      enabled: true
      paths:
        - /var/log/pods/*/*/*.log
      parsers:
      - container: 
          stream: stdout
          format: auto
    processors:
      - add_kubernetes_metadata:
          in_cluster: true 
          host: ${NODE_NAME}
          default_indexers.enabled: false
          default_matchers.enabled: false
          indexers: 
          - pod_uid:
          matchers:
          - logs_path:
              logs_path: "/var/log/pods/"
              resource_type: 'pod'
    output.logstash:
      hosts: ["logstash:5044"]

---
# Kibana WhiteList Middleware
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: whitelist
  namespace: elk
spec:
  ipWhiteList:
    sourceRange:
      - "172.16.0.0/21"

---
# Kibana IngressRoute
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: kibana
  namespace: elk
spec:
  entryPoints:
    - websecure
  routes:
    - kind: Rule
      match: Host(`kibana.example.com`) && PathPrefix(`/`)
      middlewares:
        - name: whitelist
      services:
        - name: kibana
          port: 5601
          scheme: http

---
# Init Job
apiVersion: batch/v1
kind: Job
metadata:
  name: init
  namespace: elk
spec:
  template:
    spec:
      initContainers:
      - name: wait-for-elasticsearch
        image: busybox
        command: ["/bin/sh", "-c"]
        args:
        - >
          until nc -z elasticsearch 9200; do echo waiting for elasticsearch...; sleep 2; done;
      containers:
        - name: init
          image: appropriate/curl
          env:
            - name: ELASTIC_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: password-secret
                  key: elasticsearch
            - name: KIBANA_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: password-secret
                  key: kibana
            - name: LOGSTASH_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: password-secret
                  key: logstash
          command:
            - /bin/sh
            - -c
            - |
              curl -X POST "http://elasticsearch:9200/_security/user/kibana_system/_password" \
                  -u elastic:$ELASTIC_PASSWORD \
                  -H "Content-Type: application/json" \
                  -d '{"password": "'"$KIBANA_PASSWORD"'"}' && \
              echo "kibana_system password updated!"
              curl -X PUT "http://elasticsearch:9200/_security/role/logstash_writer" \
                  -u elastic:$ELASTIC_PASSWORD \
                  -H "Content-Type: application/json" \
                  -d '{
                        "cluster": [
                          "manage_index_templates",
                          "monitor",
                          "manage_ilm",
                          "manage_data_streams"
                        ],
                        "indices": [
                          {
                            "names": ["*"],
                            "privileges": ["write", "create_index"]
                          }
                        ]
                      }' && \
              echo "logstash_writer role created!"
              curl -X PUT "http://elasticsearch:9200/_security/user/logstash_writer" \
              -u elastic:$ELASTIC_PASSWORD \
              -H "Content-Type: application/json" \
              -d '{
                "password" : "'"$LOGSTASH_PASSWORD"'",
                "roles" : [ "logstash_writer" ],
                "full_name" : "Logstash Writer User"
              }' && \
              echo "logstash_writer user created!"
              curl -X PUT "http://elasticsearch:9200/_ilm/policy/k8s-logs-policy?pretty" \
              -u elastic:$ELASTIC_PASSWORD \
              -H "Content-Type: application/json" \
              -d '{
                "policy": {
                  "phases": {
                    "hot": {
                      "actions": {
                        "rollover": {
                          "max_size": "50GB",
                          "max_age": "30d"
                        }
                      }
                    },
                    "delete": {
                      "min_age": "180d",
                      "actions": {
                        "delete": {}
                      }
                    }
                  }
                }
              }' && \
              echo "k8s-logs-policy ILM policy created!"
      restartPolicy: Never

Resource-by-Resource YAML Explanation

Elasticsearch Service

# Elasticsearch Service
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch
  namespace: elk
spec:
  selector:
    app: elasticsearch
  ports:
    - protocol: TCP
      port: 9200
      targetPort: 9200

Elasticsearch's externally exposed Service, used by Kibana and Logstash to access the cluster.

Nothing here needs special attention.

Elasticsearch Headless Service

# Elasticsearch Headless Service
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch-discovery
  namespace: elk
spec:
  selector:
    app: elasticsearch
  ports:
    - protocol: TCP
      port: 9300
      targetPort: 9300
      name: transport
    - port: 9200
      name: http
      targetPort: 9200

The headless Service used for intra-cluster communication between the Elasticsearch nodes.
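Each StatefulSet Pod behind a headless Service gets a stable DNS name of the form <pod-name>.<service-name>.<namespace>.svc.cluster.local. A quick shell sketch of the transport addresses this yields for the three replicas:

```shell
# Build the per-Pod transport addresses used later in discovery.seed_hosts:
# <pod-name>.<service-name>.<namespace>.svc.cluster.local:<port>
seed_hosts=""
for i in 0 1 2; do
  seed_hosts="$seed_hosts elasticsearch-$i.elasticsearch-discovery.elk.svc.cluster.local:9300"
done
seed_hosts=${seed_hosts# }   # strip the leading space
echo "$seed_hosts"
```

These names only resolve because the Service is headless (no cluster IP), which is why the StatefulSet's serviceName points at elasticsearch-discovery.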

Elasticsearch ConfigMap

# Elasticsearch ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: elasticsearch-config
  namespace: elk
data:
  elasticsearch.yml: |-
    network.host: 0.0.0.0
    discovery.type: multi-node
    cluster.name: my-cluster
    xpack.security.enabled: true
    xpack.security.transport.ssl.enabled: true
    xpack.security.transport.ssl.verification_mode: certificate
    xpack.security.transport.ssl.keystore.path: /usr/share/elasticsearch/config/certs/elastic-certificates.p12
    xpack.security.transport.ssl.truststore.path: /usr/share/elasticsearch/config/certs/elastic-certificates.p12
    xpack.security.http.ssl.enabled: false
    discovery.initial_state_timeout: 60m
    discovery.seed_hosts: 
    - elasticsearch-0.elasticsearch-discovery.elk.svc.cluster.local:9300
    - elasticsearch-1.elasticsearch-discovery.elk.svc.cluster.local:9300
    - elasticsearch-2.elasticsearch-discovery.elk.svc.cluster.local:9300
    cluster.initial_master_nodes:
    - elasticsearch-0
    - elasticsearch-1
    - elasticsearch-2

The ConfigMap holding the Elasticsearch configuration file, mounted into the Elasticsearch StatefulSet below.

Two points need attention:

  1. The certificates must be placed under /usr/share/elasticsearch/config/certs/ , otherwise Elasticsearch fails its security checks and refuses to start.

  2. discovery.seed_hosts and cluster.initial_master_nodes must be configured exactly as follows, otherwise the nodes cannot discover each other.

        discovery.seed_hosts: 
        - elasticsearch-0.elasticsearch-discovery.elk.svc.cluster.local:9300
        - elasticsearch-1.elasticsearch-discovery.elk.svc.cluster.local:9300
        - elasticsearch-2.elasticsearch-discovery.elk.svc.cluster.local:9300
        cluster.initial_master_nodes:
        - elasticsearch-0
        - elasticsearch-1
        - elasticsearch-2

Password Secret

# Password Secret
apiVersion: v1
kind: Secret
metadata:
  name: password-secret
  namespace: elk
type: Opaque
stringData:
  elasticsearch: "Changeme"
  logstash: "password"
  kibana: "123456"

The Secret that defines the passwords. Choose your own values; this Secret is referenced by the other resources below.
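With stringData, Kubernetes base64-encodes the values and stores them under data in the persisted Secret, so kubectl get secret -o yaml shows only the encoded form. A small sketch of what that encoding looks like (remember this is encoding, not encryption):

```shell
# A stringData value as it ends up under .data in the stored Secret.
encoded=$(printf '%s' "Changeme" | base64)
echo "$encoded"      # Q2hhbmdlbWU=
# Decoding recovers the plaintext, which is why Secrets alone are not
# sufficient protection for sensitive credentials.
decoded=$(printf '%s' "$encoded" | base64 -d)
echo "$decoded"      # Changeme
```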

Elasticsearch Certificates PersistentVolumeClaim

# Elasticsearch Certificates PersistentVolumeClaim
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: elasticsearch-certs-pvc
  namespace: elk
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi

Because every node must share the same certificates, a single PVC is used so all Elasticsearch Pods mount one set of certificates.

Note that a StorageClass was already deployed in this setup, so declaring the PVC automatically provisions a PV. If your StorageClass cannot provision PVs automatically, create the PV manually first and then declare the PVC. Make sure the PVC requests the RWX (ReadWriteMany) access mode.
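If the PV does have to be created by hand, an NFS-backed volume is one common way to get ReadWriteMany. A sketch of such a PV (the server address and export path are placeholders to adapt to your environment):

```yaml
# Hypothetical manually created NFS PV for the shared certificates.
# NFS supports the ReadWriteMany access mode the cert PVC requires.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: elasticsearch-certs-pv
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: 192.168.1.100            # placeholder NFS server
    path: /exports/elasticsearch-certs  # placeholder export path
```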

Elasticsearch StatefulSet

# Elasticsearch StatefulSet
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: elasticsearch
  namespace: elk
spec:
  serviceName: "elasticsearch-discovery"
  replicas: 3
  selector:
    matchLabels:
      app: elasticsearch
  template:
    metadata:
      labels:
        app: elasticsearch
    spec:
      initContainers:
      - name: generate-certs
        image: docker.elastic.co/elasticsearch/elasticsearch:9.0.1
        command: ["/bin/sh", "-c"]
        args:
        - |
          if [ ! -f /certs/elastic-certificates.p12 ]; then
            mkdir -p /certs && \
            /usr/share/elasticsearch/bin/elasticsearch-certutil ca --silent --pass "" --out /certs/elastic-ca.p12
            /usr/share/elasticsearch/bin/elasticsearch-certutil cert --silent --pass "" --ca /certs/elastic-ca.p12 --ca-pass "" --out /certs/elastic-certificates.p12
          else
            echo "Certificates already exist"
          fi
        volumeMounts:
        - name: certs-volume
          mountPath: /certs
          
      - name: chown
        image: busybox
        command: ["sh", "-c"]
        args:
          - "chown -R 1000:1000 /usr/share/elasticsearch/data"
        volumeMounts:
        - name: data
          mountPath: /usr/share/elasticsearch/data
      containers:
      - name: elasticsearch
        env:
        - name: ELASTIC_PASSWORD
          valueFrom:
            secretKeyRef:
              name: password-secret
              key: elasticsearch
        image: docker.elastic.co/elasticsearch/elasticsearch:9.0.1
        resources:
          limits:
            memory: 5Gi
            cpu: 4000m
          requests:
            memory: 4Gi
            cpu: 1000m
        ports:
        - containerPort: 9200
        - containerPort: 9300
        volumeMounts:
        - name: elasticsearch-config
          mountPath: /usr/share/elasticsearch/config/elasticsearch.yml
          subPath: elasticsearch.yml
        - name: data
          mountPath: /usr/share/elasticsearch/data
        - name: certs-volume
          mountPath: /usr/share/elasticsearch/config/certs
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchLabels:
                  app: elasticsearch
              topologyKey: kubernetes.io/hostname
      volumes:
        - name: certs-volume
          persistentVolumeClaim:
            claimName: elasticsearch-certs-pvc
        - name: elasticsearch-config
          configMap:
            name: elasticsearch-config                  
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 50Gi

The Elasticsearch cluster workload. Since Elasticsearch is a stateful service, it is deployed as a StatefulSet.

Among the initContainers, generate-certs creates the certificates and chown fixes the ownership of the data directory on the PVC; both containers are required.

Note that this StatefulSet must not define probes. Probes prevent the nodes from discovering each other, so the cluster never becomes ready. I originally added probes and the second Pod would never start, leaving the cluster permanently unready. After some research, it turns out developers on the official forum have answered this exact question: probes are not recommended on Elasticsearch containers, because restarting a stateful service like Elasticsearch via probes is inappropriate and Elasticsearch has its own node-management mechanism. All probes were therefore removed.
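Without probes, cluster state can still be checked by hand, e.g. with curl -u elastic:$ELASTIC_PASSWORD http://elasticsearch:9200/_cluster/health from inside the cluster (HTTP works here because xpack.security.http.ssl.enabled is false). A sketch of pulling the status field out of a sample response with plain shell; the JSON below uses illustrative values:

```shell
# Sample _cluster/health response (illustrative). In practice this comes from:
#   curl -u elastic:$ELASTIC_PASSWORD http://elasticsearch:9200/_cluster/health
health='{"cluster_name":"my-cluster","status":"green","number_of_nodes":3}'
# Extract the "status" value without needing jq.
status=$(printf '%s' "$health" | sed -n 's/.*"status":"\([a-z]*\)".*/\1/p')
echo "cluster status: $status"
```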

Kibana Deployment

# Kibana Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kibana
  namespace: elk
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kibana
  template:
    metadata:
      labels:
        app: kibana
    spec:
      initContainers:
      - name: wait-for-elasticsearch
        image: busybox
        command: ["/bin/sh", "-c"]
        args:
        - >
          until nc -z elasticsearch 9200; do echo waiting for elasticsearch...; sleep 2; done;
      containers:
      - name: kibana
        startupProbe:
          tcpSocket:
            port: 5601
          failureThreshold: 330
          periodSeconds: 10   
          initialDelaySeconds: 300
          timeoutSeconds: 5
        livenessProbe:
          tcpSocket:
            port: 5601
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 5
          failureThreshold: 10
        readinessProbe:
          tcpSocket:
            port: 5601
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 5
          failureThreshold: 3
        image: docker.elastic.co/kibana/kibana:9.0.1
        resources:
          limits:
            memory: 4Gi
            cpu: 2000m
        ports:
        - containerPort: 5601
        env:
        - name: ELASTICSEARCH_HOSTS
          value: http://elasticsearch:9200
        - name: ELASTICSEARCH_USERNAME
          value: kibana_system
        - name: ELASTICSEARCH_PASSWORD
          valueFrom:
            secretKeyRef:
              name: password-secret
              key: kibana
        - name: ELASTICSEARCH_SSL_VERIFICATIONMODE
          value: none
        volumeMounts:
        - name: config-volume
          mountPath: /usr/share/kibana/config/kibana.yml
          subPath: kibana.yml
      volumes:
      - name: config-volume
        configMap:
          name: kibana-config

Kibana's Deployment. An initContainer waits for Elasticsearch to come up, so Kibana does not keep restarting on failed connections while Elasticsearch is still starting.

Kibana ConfigMap

# Kibana ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: kibana-config
  namespace: elk
data:
  kibana.yml: |-
    server.host: "0.0.0.0"
    elasticsearch.hosts: ["http://elasticsearch.elk.svc.cluster.local:9200"]
    elasticsearch.username: "${ELASTICSEARCH_USERNAME}"
    elasticsearch.password: "${ELASTICSEARCH_PASSWORD}"
    elasticsearch.ssl.verificationMode: none
    elasticsearch.requestTimeout: 60000

Kibana's ConfigMap. elasticsearch.username and elasticsearch.password are read from the environment variables set in the Kibana Deployment.

Kibana Service

# Kibana Service
apiVersion: v1
kind: Service
metadata:
  name: kibana
  namespace: elk
spec:
  selector:
    app: kibana
  ports:
  - protocol: TCP
    port: 5601
    targetPort: 5601

Kibana's Service. Since an Ingress Controller is deployed here, the Service uses the ClusterIP type and is exposed through the Ingress. Without an Ingress, change the Kibana Service to the NodePort type instead.
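For reference, a NodePort variant of the Service might look like the following (30561 is an arbitrary example port within the default 30000-32767 NodePort range):

```yaml
# NodePort variant of the Kibana Service for clusters without an Ingress.
apiVersion: v1
kind: Service
metadata:
  name: kibana
  namespace: elk
spec:
  type: NodePort
  selector:
    app: kibana
  ports:
  - protocol: TCP
    port: 5601
    targetPort: 5601
    nodePort: 30561   # arbitrary example; any free port in 30000-32767
```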

Logstash ConfigMap

# Logstash ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: logstash-config
  namespace: elk
data:
  logstash.conf: |
    input {
      beats {
        port => 5044
      }
    }

    filter {
      mutate {
        add_field => {
          "[data_stream][dataset]" => "k8s.%{[kubernetes][namespace]}" 
        }
      }
    }
    
    output {
      elasticsearch {
        hosts => ["http://elasticsearch:9200"]
        user => "logstash_writer"
        password => "${LOGSTASH_WRITER_PASSWORD}"
        ssl_enabled => false
        data_stream => true
        data_stream_type => "logs"
      }
    }

Logstash's ConfigMap: it receives data from Filebeat on port 5044 and ships it to Elasticsearch. Because data streams are used, note that the dataset must be set dynamically in the filter block; it cannot be computed dynamically in the output block.
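Elasticsearch names data streams <type>-<dataset>-<namespace>, where the last part is the data stream namespace ("default" unless overridden, not the Kubernetes namespace). With the filter above, events from a Kubernetes namespace such as kube-system therefore land in a stream composed like this:

```shell
# Compose the data stream name: <type>-<dataset>-<namespace>.
ds_type="logs"                 # from data_stream_type in the output block
ds_dataset="k8s.kube-system"   # k8s.%{[kubernetes][namespace]} resolved by the filter
ds_namespace="default"         # Elasticsearch's default data stream namespace
stream="${ds_type}-${ds_dataset}-${ds_namespace}"
echo "$stream"
```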

Logstash Deployment

# Logstash Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: logstash
  namespace: elk
spec:
  replicas: 1
  selector:
    matchLabels:
      app: logstash
  template:
    metadata:
      labels:
        app: logstash
    spec:
      initContainers:
      - name: wait-for-elasticsearch
        image: busybox
        command: ["/bin/sh", "-c"]
        args:
        - >
          until nc -z elasticsearch 9200; do echo waiting for elasticsearch...; sleep 2; done;
      containers:
      - name: logstash
        startupProbe:
          tcpSocket:
            port: 5044
          failureThreshold: 600 
          periodSeconds: 10   
          initialDelaySeconds: 150
          timeoutSeconds: 5
        livenessProbe:
          tcpSocket:
            port: 5044
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 5
        readinessProbe:
          tcpSocket:
            port: 5044
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 5
        image: docker.elastic.co/logstash/logstash:9.0.1
        resources:
          limits:
            memory: 4Gi
            cpu: 4000m
        ports:
        - containerPort: 5044
        env:
        - name: LOGSTASH_WRITER_PASSWORD
          valueFrom:
            secretKeyRef:
              name: password-secret
              key: logstash
        volumeMounts:
        - name: config-volume
          mountPath: /usr/share/logstash/pipeline/logstash.conf
          subPath: logstash.conf
        - name: shared-data
          mountPath: /usr/share/elasticsearch/config
      volumes:
      - name: config-volume
        configMap:
          name: logstash-config
      - name: shared-data
        emptyDir: {}

Logstash's Deployment, again with an initContainer that waits for Elasticsearch to start. Probes against port 5044 ensure Logstash is actually able to serve its clients.

Logstash Service

# Logstash Service
apiVersion: v1
kind: Service
metadata:
  name: logstash
  namespace: elk
spec:
  selector:
    app: logstash
  ports:
  - protocol: TCP
    port: 5044
    targetPort: 5044

Logstash's Service, consumed primarily by Filebeat.

Filebeat ClusterRole

# Filebeat ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: filebeat
rules:
- apiGroups: [""]
  resources: ["nodes", "pods", "namespaces"]
  verbs: ["get", "list", "watch"]

Filebeat's ClusterRole. Together with the ClusterRoleBinding and the processor in the Filebeat config, it lets the documents written to Elasticsearch carry Kubernetes metadata, which makes logs easy to query.

Filebeat ClusterRoleBinding

# Filebeat ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: filebeat
subjects:
- kind: ServiceAccount
  name: filebeat
  namespace: elk
roleRef:
  kind: ClusterRole
  name: filebeat
  apiGroup: rbac.authorization.k8s.io

Filebeat's ClusterRoleBinding, which grants the ClusterRole above to the filebeat ServiceAccount.

Filebeat ServiceAccount

# Filebeat ServiceAccount
apiVersion: v1
kind: ServiceAccount
metadata:
  name: filebeat
  namespace: elk

The dedicated ServiceAccount for Filebeat, referenced by the ClusterRoleBinding.

Filebeat DaemonSet

# Filebeat DaemonSet
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: filebeat
  namespace: elk
spec:
  selector:
    matchLabels:
      name: filebeat
  template:
    metadata:
      labels:
        name: filebeat
    spec:
      serviceAccountName: filebeat
      initContainers:
      - name: wait-for-logstash
        image: busybox
        command: ["/bin/sh", "-c"]
        args:
        - >
          until nc -z logstash 5044; do echo waiting for logstash...; sleep 2; done;
      containers:
      - name: filebeat
        env:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        - name: POD_NAME     
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: NAMESPACE     
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        image: docker.elastic.co/beats/filebeat:9.0.1
        securityContext:
          runAsUser: 0
        resources:
          limits:
            memory: 200Mi
            cpu: 100m
        volumeMounts:
        - name: config-volume
          mountPath: /usr/share/filebeat/filebeat.yml
          subPath: filebeat.yml
        - name: containers
          mountPath: /var/log/pods/
          readOnly: true
      volumes:
      - name: config-volume
        configMap:
          defaultMode: 0600
          name: filebeat-config
      - name: containers
        hostPath:
          path: /var/log/pods/

Filebeat's DaemonSet. An initContainer waits for Logstash to become ready, and the host's Pod log directory is mounted read-only into the container so Filebeat can read the logs.

Filebeat ConfigMap

# Filebeat ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: filebeat-config
  namespace: elk
data:
  filebeat.yml: |
    filebeat.inputs:
    - type: filestream
      id: unique-input-id
      prospector.scanner.symlinks: true
      enabled: true
      paths:
        - /var/log/pods/*/*/*.log
      parsers:
      - container: 
          stream: stdout
          format: auto
    processors:
      - add_kubernetes_metadata:
          in_cluster: true 
          host: ${NODE_NAME}
          default_indexers.enabled: false
          default_matchers.enabled: false
          indexers: 
          - pod_uid:
          matchers:
          - logs_path:
              logs_path: "/var/log/pods/"
              resource_type: 'pod'
    output.logstash:
      hosts: ["logstash:5044"]

Filebeat's ConfigMap. Because Filebeat 9 removed the `container` input, collection has to go through `filestream`. To attach Kubernetes metadata, be sure to configure both `processors` and `parsers`, and make sure the ServiceAccount, ClusterRoleBinding, and ClusterRole are set up correctly; otherwise the metadata cannot be read.
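The `logs_path` matcher with `resource_type: 'pod'` works because the kubelet lays out logs as `/var/log/pods/<namespace>_<pod-name>_<pod-uid>/<container>/N.log`; the matcher recovers the pod UID from the path, and the `pod_uid` indexer then looks the pod up by that UID. A sketch of the path-to-UID step (the helper name is hypothetical, not Beats code):

```python
# Rough sketch of how a logs_path-style matcher can recover the pod UID
# from a kubelet log path; pod_uid_from_path is a hypothetical helper.
def pod_uid_from_path(path: str, logs_path: str = "/var/log/pods/") -> str:
    # kubelet layout: /var/log/pods/<namespace>_<pod-name>_<pod-uid>/<container>/N.log
    rel = path[len(logs_path):]
    pod_dir = rel.split("/", 1)[0]    # "<namespace>_<pod-name>_<pod-uid>"
    return pod_dir.rsplit("_", 1)[1]  # the UID is the last segment

uid = pod_uid_from_path(
    "/var/log/pods/elk_filebeat-abc12_0f1e2d3c-4b5a-6789-abcd-ef0123456789/filebeat/0.log"
)
print(uid)  # 0f1e2d3c-4b5a-6789-abcd-ef0123456789
```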

Init Job

---
# Init Job
apiVersion: batch/v1
kind: Job
metadata:
  name: init
  namespace: elk
spec:
  template:
    spec:
      initContainers:
      - name: wait-for-elasticsearch
        image: busybox
        command: ["/bin/sh", "-c"]
        args:
        - >
          until nc -z elasticsearch 9200; do echo waiting for elasticsearch...; sleep 2; done;
      containers:
        - name: init
          image: appropriate/curl
          env:
            - name: ELASTIC_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: password-secret
                  key: elasticsearch
            - name: KIBANA_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: password-secret
                  key: kibana
            - name: LOGSTASH_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: password-secret
                  key: logstash
          command:
            - /bin/sh
            - -c
            - |
              curl -X POST "http://elasticsearch:9200/_security/user/kibana_system/_password" \
                  -u elastic:$ELASTIC_PASSWORD \
                  -H "Content-Type: application/json" \
                  -d '{"password": "'"$KIBANA_PASSWORD"'"}' && \
echo "kibana_system password updated!"
              curl -X PUT "http://elasticsearch:9200/_security/role/logstash_writer" \
                  -u elastic:$ELASTIC_PASSWORD \
                  -H "Content-Type: application/json" \
                  -d '{
                        "cluster": [
                          "manage_index_templates",
                          "monitor",
                          "manage_ilm",
                          "manage_data_streams"
                        ],
                        "indices": [
                          {
                            "names": ["*"],
                            "privileges": ["write", "create_index"]
                          }
                        ]
                      }' && \
echo "logstash_writer role created!"
              curl -X PUT "http://elasticsearch:9200/_security/user/logstash_writer" \
              -u elastic:$ELASTIC_PASSWORD \
              -H "Content-Type: application/json" \
              -d '{
                "password" : "'"$LOGSTASH_PASSWORD"'",
                "roles" : [ "logstash_writer" ],
                "full_name" : "Logstash Writer User"
              }' && \
echo "logstash_writer user created!"
              curl -X PUT "http://elasticsearch:9200/_ilm/policy/k8s-logs-policy?pretty" \
              -u elastic:$ELASTIC_PASSWORD \
              -H "Content-Type: application/json" \
              -d '{
                "policy": {
                  "phases": {
                    "hot": {
                      "actions": {
                        "rollover": {
                          "max_size": "50GB",
                          "max_age": "30d"
                        }
                      }
                    },
                    "delete": {
                      "min_age": "180d",
                      "actions": {
                        "delete": {}
                      }
                    }
                  }
                }
              }' && \
echo "k8s-logs-policy created!"
      restartPolicy: Never

A Job that initializes Elasticsearch. It changes the password of the kibana_system account and creates a logstash_writer user with a matching role. If you want to use ILM, be sure to keep the last PUT request, which creates the ILM policy.
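The retention implied by the policy above is worth spelling out: an index rolls over after at most 30 days (or once it reaches 50 GB), and by default the delete phase's `min_age` is measured from the rollover point, so an index can sit on disk for roughly 210 days in the worst case. A back-of-the-envelope check:

```python
# Worst-case on-disk lifetime for an index under k8s-logs-policy:
# up to 30 days in the hot phase before rollover (or 50 GB, whichever
# comes first), then deleted 180 days after rollover.
HOT_MAX_AGE_DAYS = 30
DELETE_MIN_AGE_DAYS = 180

worst_case_days = HOT_MAX_AGE_DAYS + DELETE_MIN_AGE_DAYS
print(worst_case_days)  # 210
```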

Kibana WhiteList Middleware

# Kibana WhiteList Middleware
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: whitelist
  namespace: elk
spec:
  ipWhiteList:
    sourceRange:
      - "172.16.0.0/21"

Since I use the Traefik ingress and need access control via an IP whitelist, I added a Middleware for that purpose. Note that Traefik v3 renamed the `IPWhiteList` middleware to `IPAllowList`; if you run Traefik v3, use `ipAllowList` in the spec instead.
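To sanity-check the `sourceRange`: a /21 covers 2048 addresses, so `172.16.0.0/21` admits clients from 172.16.0.0 through 172.16.7.255 and nothing beyond. The example addresses below are arbitrary:

```python
import ipaddress

# Which client IPs does the 172.16.0.0/21 whitelist admit?
allowed = ipaddress.ip_network("172.16.0.0/21")

print(allowed.num_addresses)                          # 2048
print(ipaddress.ip_address("172.16.3.5") in allowed)  # True
print(ipaddress.ip_address("172.16.8.1") in allowed)  # False
```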

Kibana IngressRoute

---
# Kibana IngressRoute
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: kibana
  namespace: elk
spec:
  entryPoints:
    - websecure
  routes:
    - kind: Rule
      match: Host(`kibana.example.com`) && PathPrefix(`/`)
      middlewares:
        - name: whitelist
      services:
        - name: kibana
          port: 5601
          scheme: http

With Traefik I did not use the native Kubernetes Ingress; instead I used Traefik's recommended CRDs and enabled HTTPS via the `websecure` entrypoint.

References

  1. ECK - Kubernetes liveness and readiness probes - Elastic Orchestration / Elastic Cloud on Kubernetes (ECK) - Discuss the Elastic Stack

LICENSED UNDER CC BY-NC-SA 4.0