prometheus-operator Installation
minikube
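Introduction
minikube quickly sets up a local Kubernetes cluster on macOS, Linux, and Windows, which makes it convenient for developers to learn Kubernetes and run related experiments locally.
Installation
Prerequisites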
2 CPUs or more
2GB of free memory
20GB of free disk space
Internet connection
Container or virtual machine manager, such as: Docker, Hyperkit, Hyper-V, KVM, Parallels, Podman, VirtualBox, or VMware Fusion/Workstation
Installing on macOS
Install
brew install minikube
Start
minikube start --cpus=2 --memory=4000mb --image-mirror-country='cn' \
--registry-mirror=https://xxx.mirror.aliyuncs.com \
--kubernetes-version=v1.18.8
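Common configuration parameters
- --driver=***: Since version 1.5.0, minikube by default uses the system's preferred driver to create the local Kubernetes environment. For example, if Docker is already installed, minikube uses the docker driver. See the official documentation.
- --cpus=2: number of CPU cores allocated to the minikube VM
- --memory=4000mb: amount of memory allocated to the minikube VM
- --registry-mirror=***: to make pulling Docker Hub images more reliable, configure a registry mirror for the Docker daemon; see the Alibaba Cloud container registry service
- --kubernetes-version=***: the Kubernetes version the minikube VM will run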
Delete
minikube delete
helm
Introduction
Helm is the best way to find, share, and use software built for Kubernetes.
Install
brew install helm
Add a chart repository
helm repo add stable https://charts.helm.sh/stable
Other chart repositories: a collection of Helm repository sources
Search for prometheus-operator
helm search repo prometheus
prometheus-community/prometheus-operator 9.3.2 0.38.1 DEPRECATED - This chart will be renamed. See ht…
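Note
The prometheus-operator chart has been marked as deprecated; see the official "Moved" notice. In short, the chart has moved to prometheus-community/kube-prometheus-stack, which is maintained in a separate chart repository on GitHub at https://github.com/prometheus-community/helm-charts. This guide still uses the old chart.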
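For readers who prefer the successor chart, the move described in the note could look roughly like the sketch below. The repository URL is the one published by the prometheus-community project; the release name kube-prometheus-stack is an arbitrary choice made here for illustration, and --create-namespace assumes Helm 3.2 or later. By default the script only prints the commands; set RUN=1 to execute them.

```shell
#!/bin/sh
# Sketch: install the successor chart (kube-prometheus-stack) instead of the
# deprecated prometheus-operator chart.
REPO_NAME="prometheus-community"
REPO_URL="https://prometheus-community.github.io/helm-charts"
RELEASE="kube-prometheus-stack"   # assumed release name, pick your own
NAMESPACE="monitor"               # same namespace as used in this guide

ADD_CMD="helm repo add $REPO_NAME $REPO_URL"
INSTALL_CMD="helm install $RELEASE $REPO_NAME/kube-prometheus-stack -n $NAMESPACE --create-namespace"

if [ "${RUN:-0}" = "1" ]; then
  # Execute for real (requires helm and a reachable cluster).
  $ADD_CMD && helm repo update && $INSTALL_CMD
else
  # Dry run: only show what would be executed.
  echo "$ADD_CMD"
  echo "helm repo update"
  echo "$INSTALL_CMD"
fi
```

The resource names created by the new chart differ from those shown later in this guide, so the Pod, Service, and Secret names would need to be looked up again after installing it.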
prometheus-operator Installation
Install
Create the namespace monitor:
kubectl create ns monitor
Run the installation:
helm install prometheus-operator stable/prometheus-operator -n monitor
Or pull the chart first and install it from the local archive:
helm pull stable/prometheus-operator
helm install prometheus-operator ./prometheus-operator-9.3.2.tgz -n monitor
Verify
$ kubectl get pod -n monitor
NAME READY STATUS RESTARTS AGE
alertmanager-prometheus-operator-alertmanager-0 2/2 Running 0 14m
prometheus-operator-grafana-88648f94b-rgjlx 2/2 Running 0 15m
prometheus-operator-kube-state-metrics-69fcc8d48c-pt6mk 1/1 Running 0 15m
prometheus-operator-operator-6db7d95977-sdjjg 2/2 Running 0 15m
prometheus-operator-prometheus-node-exporter-bfbsp 1/1 Running 0 15m
prometheus-prometheus-operator-prometheus-0 3/3 Running 1 14m
External access
Port forwarding
kubectl port-forward --address [external_ip] -n monitor [prometheus pod name] 9090:9090
kubectl port-forward --address [external_ip] -n monitor [grafana pod name] 3000:3000
kubectl port-forward --address [external_ip] -n monitor [alertmanager pod name] 9093:9093
Adding custom alerting rules
Publish through the PrometheusRule custom resource
cat > prometheus-rule.yaml <<EOF
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example
  labels:
    app: prometheus-operator
    release: prometheus-operator
spec:
  groups:
  - name: example
    rules:
    - alert: ExampleAlert
      expr: vector(1)
      labels:
        foo: bar
        namespace: monitor
EOF
kubectl apply -n monitor -f prometheus-rule.yaml
Adding Alertmanager notification rules
View the default configuration
By default the configuration is stored in a Secret.
# Find the Alertmanager secret
$ kubectl get secret -n monitor|grep alertmanager
alertmanager-prometheus-operator-alertmanager Opaque 1 13d
prometheus-operator-alertmanager-token-z8jfs kubernetes.io/service-account-token 3 13d
# View the contents of the secret whose type is Opaque
$ kubectl get secret -n monitor alertmanager-prometheus-operator-alertmanager -o go-template='{{index .data "alertmanager.yaml"}}'|base64 --decode
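Values under a Secret's .data field are base64-encoded, which is why the command above pipes the result through base64 --decode. The round trip below illustrates this without a cluster; the sample configuration is made up for the demonstration and is not the chart's actual default.

```shell
#!/bin/sh
# Hypothetical Alertmanager config used only for this demonstration.
SAMPLE_CONFIG='route:
  receiver: "null"
receivers:
- name: "null"'

# Kubernetes stores Secret .data values base64-encoded; emulate that here.
# tr strips the line wrapping that some base64 implementations add.
ENCODED=$(printf '%s' "$SAMPLE_CONFIG" | base64 | tr -d '\n')

# Decoding gives back the original YAML, as in the kubectl pipeline above.
DECODED=$(printf '%s' "$ENCODED" | base64 --decode)

echo "$DECODED"
```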
Publish through the AlertmanagerConfig custom resource (does not work with this chart)
cat > alertmanager-config.yaml <<EOF
apiVersion: monitoring.coreos.com/v1alpha1
kind: AlertmanagerConfig
metadata:
  name: dingtalk
spec:
  route:
    receiver: dingtalk
    matchers:
    - name: foo
      value: bar
  receivers:
  - name: dingtalk
    webhookConfigs:
    - url: http://alertmanager-webhook-dingtalk/dingtalk/robot1/send
EOF
kubectl apply -n monitor -f alertmanager-config.yaml
Error:
error: unable to recognize "alertmanager-config.yaml": no matches for kind "AlertmanagerConfig" in version "monitoring.coreos.com/v1alpha1"
List the CRDs (CustomResourceDefinitions):
$ kubectl get crd -n monitor
NAME CREATED AT
alertmanagers.monitoring.coreos.com 2021-11-17T09:27:22Z
podmonitors.monitoring.coreos.com 2021-11-17T09:27:22Z
prometheuses.monitoring.coreos.com 2021-11-17T09:27:23Z
prometheusrules.monitoring.coreos.com 2021-11-17T09:27:23Z
servicemonitors.monitoring.coreos.com 2021-11-17T09:27:23Z
thanosrulers.monitoring.coreos.com 2021-11-17T09:27:23Z
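alertmanagerconfigs.monitoring.coreos.com is not in the list.
The likely cause is that the chart ships an older operator (app version 0.38.1 in the search output above) that does not include this CRD. That problem is left unsolved here; instead, the configuration is updated with helm upgrade.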
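A quick way to confirm whether a given CRD is installed is to search the kubectl get crd listing for its name. The helper below (has_crd is a name introduced here, not part of kubectl) reads such a listing from standard input; the sample listing is copied from the output above so the sketch runs without a cluster.

```shell
#!/bin/sh
# has_crd NAME: succeed if NAME appears in the first column of a
# "kubectl get crd" listing supplied on standard input.
has_crd() {
  awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Listing copied from this guide. On a live cluster use instead:
#   kubectl get crd | has_crd alertmanagerconfigs.monitoring.coreos.com
CRD_LIST='NAME CREATED AT
alertmanagers.monitoring.coreos.com 2021-11-17T09:27:22Z
podmonitors.monitoring.coreos.com 2021-11-17T09:27:22Z
prometheuses.monitoring.coreos.com 2021-11-17T09:27:23Z
prometheusrules.monitoring.coreos.com 2021-11-17T09:27:23Z
servicemonitors.monitoring.coreos.com 2021-11-17T09:27:23Z
thanosrulers.monitoring.coreos.com 2021-11-17T09:27:23Z'

for crd in prometheusrules.monitoring.coreos.com alertmanagerconfigs.monitoring.coreos.com; do
  if printf '%s\n' "$CRD_LIST" | has_crd "$crd"; then
    echo "$crd: installed"
  else
    echo "$crd: missing"
  fi
done
```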
Update via helm upgrade
cat > alertmanager-config.yaml <<EOF
alertmanager:
  config:
    route:
      repeat_interval: 12h
      receiver: dingtalk
      routes:
      - match:
          foo: bar
          namespace: monitor
        receiver: "null"
    receivers:
    - name: "null"
    - name: dingtalk
      webhook_configs:
      - url: http://alertmanager-webhook-dingtalk/dingtalk/robot1/send
EOF
helm upgrade -f alertmanager-config.yaml -n monitor prometheus-operator stable/prometheus-operator
Setting the Service type to NodePort
kubectl edit svc [prometheus svc name] -n monitor
kubectl edit svc [grafana svc name] -n monitor
kubectl edit svc [alertmanager svc name] -n monitor
kubectl get svc -n monitor
Check the randomly assigned nodePort (range 30000-32767) in the output.
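As a non-interactive alternative to kubectl edit, kubectl patch can switch the Service type, and the assigned nodePort can then be read from the PORT(S) column of kubectl get svc, which has the form port:nodePort/protocol. The helper name node_port_of and the sample value 9090:30090/TCP are illustrative assumptions, not output from a real cluster.

```shell
#!/bin/sh
# Non-interactive equivalent of the edit step (run against a live cluster):
#   kubectl patch svc [svc name] -n monitor -p '{"spec":{"type":"NodePort"}}'

# node_port_of PORTSPEC: print the nodePort from a PORT(S) entry such as
# "9090:30090/TCP"; prints nothing when the entry has no nodePort ("9090/TCP").
node_port_of() {
  printf '%s\n' "$1" | sed -n 's|^[0-9][0-9]*:\([0-9][0-9]*\)/.*$|\1|p'
}

node_port_of "9090:30090/TCP"   # sample value, not taken from a real cluster
```

With minikube, the service is then reachable at the address printed by minikube ip on that nodePort.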
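Problems
Error: INSTALLATION FAILED: failed to install CRD crds/crd-alertmanager.yaml: unable to recognize "": no matches for kind "CustomResourceDefinition" in version "apiextensions.k8s.io/v1beta1"
$ kubectl version
Client Version: version.Info{…, GitVersion:"v1.18.8",…}
Server Version: version.Info{…, GitVersion:"v1.22.2",…}
The client (v1.18.8) and server (v1.22.2) versions do not match. Kubernetes v1.22 no longer serves apiextensions.k8s.io/v1beta1, which the CRDs in this old chart use, so the fix is to run the cluster at the same version as the client, for example with --kubernetes-version=v1.18.8 as in the minikube start command earlier.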
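The version check can be scripted by comparing minor versions. The sketch below assumes version strings of the form vMAJOR.MINOR.PATCH with major version 1, as in the output above; the function names are introduced here for illustration, and the threshold 22 corresponds to the Kubernetes release in which the v1beta1 CRD API stopped being served.

```shell
#!/bin/sh
# minor_of VERSION: print the minor number of a version such as "v1.18.8".
minor_of() {
  printf '%s\n' "$1" | sed 's/^v\{0,1\}[0-9][0-9]*\.\([0-9][0-9]*\).*$/\1/'
}

# serves_v1beta1_crds VERSION: succeed if a 1.x server is older than v1.22,
# i.e. it still serves apiextensions.k8s.io/v1beta1.
serves_v1beta1_crds() {
  [ "$(minor_of "$1")" -lt 22 ]
}

# The two versions reported by "kubectl version" in this guide.
for v in v1.18.8 v1.22.2; do
  if serves_v1beta1_crds "$v"; then
    echo "$v: old chart CRDs can be installed"
  else
    echo "$v: apiextensions.k8s.io/v1beta1 is not served"
  fi
done
```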
Normal Pulling 77s (x4 over 2m39s) kubelet Pulling image "ccr.ccs.tencentyun.com/alvin_me/film-be-film:latest"
Warning Failed 77s (x4 over 2m39s) kubelet Failed to pull image "ccr.ccs.tencentyun.com/alvin_me/film-be-film:latest": rpc error: code = Unknown desc = Error response from daemon: Head "https://ccr.ccs.tencentyun.com/v2/alvin_me/film-be-film/manifests/latest": unauthorized: authentication required
kubectl apply -f k8s_deploy.yml -n monitor
These events appear after applying the deployment above: the image is hosted in a private registry, and the pull fails because no credentials are supplied.
Reference for the fix: Pull an Image from a Private Registry (Kubernetes documentation)
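Following that reference, the usual fix is to create a docker-registry Secret and reference it through imagePullSecrets in the Pod spec. The sketch below writes such a manifest to a temporary file; the Secret name regcred, the Pod name, the output path, and the credential placeholders are assumptions for illustration.

```shell
#!/bin/sh
# Step 1 (on a live cluster): store the registry credentials in a Secret.
#   kubectl create secret docker-registry regcred -n monitor \
#     --docker-server=ccr.ccs.tencentyun.com \
#     --docker-username=[username] --docker-password=[password]

# Step 2: reference the Secret from the Pod spec via imagePullSecrets.
MANIFEST="${TMPDIR:-/tmp}/private-image-pod.yaml"
cat > "$MANIFEST" <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: film-be-film        # hypothetical Pod name
spec:
  containers:
  - name: film-be-film
    image: ccr.ccs.tencentyun.com/alvin_me/film-be-film:latest
  imagePullSecrets:
  - name: regcred           # must match the Secret created in step 1
EOF

echo "wrote $MANIFEST; apply it with: kubectl apply -n monitor -f $MANIFEST"
```

In a Deployment, the same imagePullSecrets entry goes under spec.template.spec, and the Secret must live in the same namespace as the Pod.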
Commands
Find the node where the problem Pod is running
kubectl get pods [podname] -n [namespace] -o wide
Find the container images used by the Pod
kubectl get pods [pod name] -n monitor -o yaml | grep image:
Edit a resource's YAML configuration in place
kubectl edit [component] [component name] -n [namespace]
Apply resources
kubectl apply -n [namespace] -f [yaml file]
Delete applied resources
kubectl delete -n [namespace] -f [yaml file]
Delete resources
# Delete by resource file
kubectl delete -f [filename] -n [namespace]
# Delete by resource type and name
# --grace-period=0 --force forces immediate deletion
kubectl delete [component] [component name] [--grace-period=0 --force] -n [namespace]
View Pod details
kubectl -n [namespace] describe pod [podname]