Prometheus Operator with blackbox-exporter: probing simple applications inside and outside a Kubernetes cluster

Preface

It has been a while since I last worked with Operators. Recently my company allocated two machines for my personal use.

Previously my blog and all my other services ran on a single cloud server, which was starting to struggle. With the new machines I quickly built a Kubernetes cluster, migrated my existing monitoring over, and merged it into Prometheus Operator. You know how it is.

After adding the usual host monitoring to Prometheus Operator, I suddenly felt something was missing. Right: black-box monitoring, the legendary blackbox exporter. It is a powerful tool that can probe network endpoints, ports, and so on.

Prometheus Operator, as the name suggests, is a custom controller responsible for automatically managing Prometheus in Kubernetes. For more details, see coreos/prometheus-operator.

So how do you use Prometheus Operator to install and deploy Prometheus in a Kubernetes cluster, and then add the Blackbox exporter component? Let's get started!

Installing Prometheus Operator

Follow the official coreos/kube-prometheus repository to install Prometheus Operator.

1. Add the following flags to the kubelet configuration (vim /etc/systemd/system/kubelet.service.d/10-kubeadm.conf), then run systemctl daemon-reload && systemctl restart kubelet for them to take effect:

--authentication-token-webhook=true
--authorization-mode=Webhook
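As a sanity check, the drop-in should end up containing both flags. A minimal local sketch of what to look for (the KUBELET_EXTRA_ARGS variable name and the file layout here are illustrative; kubeadm versions differ):

```shell
# Write a sample drop-in to a temp file and confirm the flag is present
conf=$(mktemp)
cat > "$conf" <<'EOF'
[Service]
Environment="KUBELET_EXTRA_ARGS=--authentication-token-webhook=true --authorization-mode=Webhook"
EOF
# Count lines containing the webhook authorization flag (expect 1)
matches=$(grep -c -- '--authorization-mode=Webhook' "$conf")
echo "$matches"
rm -f "$conf"
```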

2. Clone the source and check out the release matching your Kubernetes version (the compatibility matrix is in the GitHub repository):

git clone https://github.com/coreos/kube-prometheus.git
cd kube-prometheus
kubectl version
git branch -a
git checkout origin/release-0.4

3. Install Prometheus Operator:

# Create the namespace and CRDs, and then wait for them to be available before creating the remaining resources
kubectl create -f manifests/setup
until kubectl get servicemonitors --all-namespaces ; do date; sleep 1; echo ""; done
kubectl create -f manifests/

4. Verify the installation:

kubectl get crd | grep coreos
kubectl get pod -n monitoring
kubectl get svc -n monitoring

With that, Prometheus Operator is installed, and so is Prometheus itself.

PS: To uninstall Prometheus Operator:

kubectl delete --ignore-not-found=true -f manifests/ -f manifests/setup

An aside:

If you can't reach GitHub, then first of all you're not much of an internet IT professional.

That said, I have prepared an offline Prometheus Operator bundle for you here. Just download it and apply it directly.

Component version: 0.7.0

Supported Kubernetes versions: 1.19.*, 1.20.*

kube-prometheus dead-simple installation tutorial

Installing Blackbox exporter

1. Create the manifest blackbox-exporter.yaml:

apiVersion: v1
data:
  config.yml: |
    modules:
      http_2xx:
        prober: http
        http:
          method: GET
          preferred_ip_protocol: "ip4"
      http_post_2xx:
        prober: http
        http:
          method: POST
          preferred_ip_protocol: "ip4"
      tcp:
        prober: tcp
      ping:
        prober: icmp
        timeout: 3s
        icmp:
          preferred_ip_protocol: "ip4"
      dns_k8s:
        prober: dns
        timeout: 5s
        dns:
          transport_protocol: "tcp"
          preferred_ip_protocol: "ip4"
          query_name: "kubernetes.default.svc.cluster.local"
          query_type: "A"
kind: ConfigMap
metadata:
  name: blackbox-exporter
  namespace: monitoring
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    name: blackbox-exporter
    cluster: ali-huabei2-dev
  name: blackbox-exporter
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      name: blackbox-exporter
  template:
    metadata:
      labels:
        name: blackbox-exporter
        cluster: ali-huabei2-dev
    spec:
      containers:
      - image: prom/blackbox-exporter:v0.16.0
        name: blackbox-exporter
        ports:
        - containerPort: 9115
        volumeMounts:
        - name: config
          mountPath: /etc/blackbox_exporter
        args:
        - --config.file=/etc/blackbox_exporter/config.yml
        - --log.level=info
      volumes:
      - name: config
        configMap:
          name: blackbox-exporter
---
apiVersion: v1
kind: Service
metadata:
  #annotations:
  #  service.beta.kubernetes.io/alicloud-loadbalancer-address-type: intranet
  labels:
    name: blackbox-exporter
    cluster: ali-huabei2-dev
  name: blackbox-exporter
  namespace: monitoring
spec:
  #externalTrafficPolicy: Local
  selector:
    name: blackbox-exporter
  ports:
  - name: http-metrics
    port: 9115
    targetPort: 9115
  type: LoadBalancer

2. Apply the manifest and check the resources:

kubectl apply -f blackbox-exporter.yaml
kubectl get svc -n monitoring
kubectl get deploy -n monitoring

Configuring Blackbox exporter (the wrong way)

Configuring Blackbox exporter in plain Prometheus is straightforward: you just add the appropriate fields under scrape_configs. With Prometheus running in Kubernetes, however, things work a little differently.
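For reference, outside of Kubernetes the classic blackbox scrape job looks roughly like this (a sketch; the target URL and the exporter address are placeholders to adapt to your setup):

```yaml
scrape_configs:
- job_name: blackbox-http
  metrics_path: /probe
  params:
    module: [http_2xx]
  static_configs:
  - targets:
    - https://www.vlinux.cn
  relabel_configs:
  # The scraped address becomes the probe target...
  - source_labels: [__address__]
    target_label: __param_target
  # ...the target becomes the instance label...
  - source_labels: [__param_target]
    target_label: instance
  # ...and Prometheus actually scrapes the exporter itself
  - target_label: __address__
    replacement: blackbox-exporter.monitoring:9115
```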

1. Extract the current prometheus.yml:

kubectl get secrets -n monitoring prometheus-k8s -oyaml | grep prometheus.yaml.gz | awk '{print $2}' | base64 --decode | gzip -d > prometheus.yml

2. Inspect prometheus.yml; here is an excerpt:

global:
  evaluation_interval: 30s
  scrape_interval: 30s
  external_labels:
    prometheus: monitoring/k8s
    prometheus_replica: $(POD_NAME)
rule_files:
- /etc/prometheus/rules/prometheus-k8s-rulefiles-0/*.yaml
scrape_configs:
- job_name: monitoring/node-exporter/0
  honor_labels: false
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - monitoring
  scrape_interval: 15s
  scheme: https
  tls_config:
    insecure_skip_verify: true
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  relabel_configs:
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_k8s_app
    regex: node-exporter
  - action: keep
    source_labels:
    - __meta_kubernetes_endpoint_port_name
    regex: https
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Node;(.*)
    replacement: ${1}
    target_label: node
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Pod;(.*)
    replacement: ${1}
    target_label: pod
  - source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: service
  - source_labels:
    - __meta_kubernetes_pod_name
    target_label: pod
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: job
    replacement: ${1}
  - source_labels:
    - __meta_kubernetes_service_label_k8s_app
    target_label: job
    regex: (.+)
    replacement: ${1}
  - target_label: endpoint
    replacement: https
  - source_labels:
    - __meta_kubernetes_pod_node_name
    target_label: instance
    regex: (.*)
    replacement: $1
    action: replace
  - source_labels:
    - __meta_kubernetes_service_label_cluster
    target_label: cluster
    regex: (.*)
    replacement: $1
    action: replace

Here, job_name names the target group, kubernetes_sd_configs configures Kubernetes service discovery, and relabel_configs determines the labels that are ultimately shown. source_labels are the original labels on the sample and target_label is the label to write; regex matches against the value and replacement is the value that is finally written, where $1 refers to the first capture group of the regex.
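The Node;(.*) rule above can be mimicked on the command line: the two source labels are joined with the ; separator, the regex is matched against the joined string, and the first capture group becomes the new label value (a sketch with a made-up node name):

```shell
# Join of __meta_kubernetes_endpoint_address_target_kind and
# __meta_kubernetes_endpoint_address_target_name, separator ';'
joined="Node;worker-1"
# Apply the relabel regex Node;(.*) and keep capture group 1
node=$(echo "$joined" | sed -E 's/^Node;(.*)$/\1/')
echo "$node"   # -> worker-1
```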

3. Add a scrape config for blackbox exporter:

- job_name: monitoring/blackbox-exporter/0
  honor_labels: false
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - monitoring
  scrape_interval: 15s
  scheme: http
  tls_config:
    insecure_skip_verify: true
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  relabel_configs:
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_name
    regex: blackbox-exporter
  - source_labels:
    - __meta_kubernetes_service_label_name
    target_label: job
    regex: (.+)
    replacement: ${1}
  - source_labels:
    - __meta_kubernetes_service_label_cluster
    target_label: cluster
    regex: (.*)
    replacement: $1
    action: replace

4. Apply the new configuration:

# 1. compress prometheus.yml
cat prometheus.yml | gzip -f | base64 | tr -d "\n"
# 2. copy string
# 3. edit secret
kubectl edit secrets -n monitoring prometheus-k8s
# 4. replace prometheus.yaml.gz
# 5. get the latest config
kubectl get secrets -n monitoring prometheus-k8s -oyaml | grep prometheus.yaml.gz | awk '{print $2}' | base64 --decode | gzip -d | grep blackbox
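The gzip + base64 pipeline is easy to get wrong, so it is worth verifying the round trip on a throwaway string first (a local sanity check, no cluster needed):

```shell
orig='scrape_configs: []'
# Encode the way the secret stores it
encoded=$(printf '%s' "$orig" | gzip -f | base64 | tr -d '\n')
# Decode the way we extracted prometheus.yml earlier
decoded=$(printf '%s' "$encoded" | base64 --decode | gzip -d)
# The decoded text must match the original
[ "$decoded" = "$orig" ] && echo roundtrip-ok
```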

However, the resulting config contains no blackbox section at all; the configuration did not change! This proves that the Prometheus configuration is generated automatically, and manual edits have no effect. If you had studied Prometheus Operator systematically, you would never have attempted the steps above, because that is simply not how it is designed. Let's look at the correct way to do it.

Configuring Blackbox exporter (the right way)

In Prometheus Operator, targets are configured through dynamic discovery with ServiceMonitor resources.

1. Create the ServiceMonitor manifest, blackbox-exporter-sm.yaml:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    name: blackbox-exporter
    release: p
  name: blackbox-exporter
  namespace: monitoring
spec:
  namespaceSelector:
    matchNames:
    - monitoring
  selector:
    matchLabels:
      name: blackbox-exporter
  endpoints:
  - interval: 15s
    port: http-metrics
    path: /probe
    relabelings:
    - action: replace
      regex: (.*)
      replacement: $1
      sourceLabels:
      - __meta_kubernetes_service_label_cluster
      targetLabel: cluster
    - action: replace
      regex: (.*)
      replacement: $1
      sourceLabels:
      - __param_module
      targetLabel: module
    - action: replace
      regex: (.*)
      replacement: $1
      sourceLabels:
      - __param_target
      targetLabel: target
    params:
      module:
      - http_2xx
      target:
      - https://www.vlinux.cn
  - interval: 15s
    port: http-metrics
    path: /probe
    relabelings:
    - action: replace
      regex: (.*)
      replacement: $1
      sourceLabels:
      - __meta_kubernetes_service_label_cluster
      targetLabel: cluster
    - action: replace
      regex: (.*)
      replacement: $1
      sourceLabels:
      - __param_module
      targetLabel: module
    - action: replace
      regex: (.*)
      replacement: $1
      sourceLabels:
      - __param_target
      targetLabel: target
    params:
      module:
      - dns_k8s
      target:
      - 172.31.16.10 # dns ip address

2. Apply it to the cluster: kubectl apply -f blackbox-exporter-sm.yaml

3. Wait a minute or so, then verify: open the Prometheus graph page and query the blackbox-exporter metrics:

{job=~"blackbox-exporter",__name__!~"^go.*"}

The results show that in the params configuration, only the first target of the http_2xx probe took effect; the other two targets I tried produced no probe records at all. This experiment demonstrates that target can only hold a single address; extra entries are ignored. The simplest way to probe multiple sites is to configure multiple endpoints. As for probing N sites with M probe modules, if you know how to configure that, please leave a comment. Thanks!
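Under the hood, each endpoint above makes Prometheus scrape a URL of the following shape, which is why __param_module and __param_target are available for relabeling (the in-cluster service DNS name here is an assumption based on the Service defined earlier):

```shell
# Assemble the probe URL the way Prometheus does from port/path/params
base="http://blackbox-exporter.monitoring.svc:9115"
module="http_2xx"
target="https://www.vlinux.cn"
url="${base}/probe?module=${module}&target=${target}"
echo "$url"
```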

Configuring alerting

If you have learned the Prometheus basics, you know that configuring alerting means pointing the Prometheus config file at Alertmanager instances and alerting rule files. So how do you configure alerting for a Prometheus deployed by the operator? You define PrometheusRule resources carrying the labels prometheus=k8s and role=alert-rules. As an example, let's alert on the DNS service breaking, i.e. failing to resolve kubernetes.default.svc.cluster.local.

1. Check the Alertmanager configuration:

kubectl get secrets -n monitoring alertmanager-main -oyaml | grep "alertmanager.yaml" | awk '{print $2}' | base64 -d

2. Create prometheus-rule-dns.yaml:

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    prometheus: k8s
    role: alert-rules
  name: dns-alert-rules
  namespace: monitoring
spec:
  groups:
  - name: DNS
    rules:
    - alert: DNSServerError
      annotations:
        summary: No summary
        description: No description
        webhookToken: xxxxxxxxx
      expr: |
        probe_success{module="dns_k8s"} == 0
      for: 1m
      labels:
        severity: critical
        alertTag: k8s

3. Apply the rule: kubectl apply -f prometheus-rule-dns.yaml
