This article describes how to configure Alertmanager alerting in ICP.
alertmanager.yml (monitoring-prometheus-alertmanager) is configured as follows (this configures how alerts are delivered):
    global:
      smtp_smarthost: 'smtp.163.com:25'
      smtp_from: 's@163.com'
      smtp_auth_username: 's@163.com'
      smtp_auth_password: 'password'
    receivers:
    - name: default-receiver
      email_configs:
      - to: 'songjxin@*.com,b@c.a'
    route:
      group_by: ['alertname']
      group_wait: 10s
      group_interval: 5m
      receiver: default-receiver
      repeat_interval: 1h
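The route above sends every alert to `default-receiver`, so that name has to match an entry under `receivers`. The following is a minimal sketch of that consistency check, assuming the configuration has already been loaded into a Python dict. The dict below mirrors the YAML by hand, and `undefined_receivers` is a hypothetical helper, not part of Alertmanager.

```python
def undefined_receivers(config):
    """Return receiver names used by routes but not defined under 'receivers'."""
    defined = {r["name"] for r in config.get("receivers", [])}
    missing = []

    def walk(route):
        # Check this route, then any nested child routes.
        name = route.get("receiver")
        if name is not None and name not in defined:
            missing.append(name)
        for child in route.get("routes", []):
            walk(child)

    walk(config.get("route", {}))
    return missing


# Hand-written mirror of the alertmanager.yml shown above (SMTP settings omitted).
config = {
    "receivers": [
        {"name": "default-receiver",
         "email_configs": [{"to": "songjxin@*.com,b@c.a"}]},
    ],
    "route": {
        "group_by": ["alertname"],
        "group_wait": "10s",
        "group_interval": "5m",
        "receiver": "default-receiver",
        "repeat_interval": "1h",
    },
}

print(undefined_receivers(config))
```

An empty list means every route points at a defined receiver. Renaming the receiver in only one of the two places would make the helper report the dangling name.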
alert.rules (configMap-monitoring-prometheus-alertrules) is configured as follows (this defines the alerting rules):
    groups:
    - name: test-rule
      rules:
      - alert: clients
        expr: sum(kube_node_info) < 9
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "{{$labels.instance}}: node is not ready"
          description: "{{$labels.instance}}: node number is less than 9 (current value is: {{ $value }})"
      - alert: NodeMemoryUsage
        expr: (((node_memory_MemTotal - node_memory_MemFree - node_memory_Cached) / (node_memory_MemTotal) * 100)) > 25
        for: 1m
        labels:
          severity: page
        annotations:
          DESCRIPTION: '{{$labels.instance}}: Memory usage is above 25% (current value is: {{ $value }})'
          SUMMARY: '{{$labels.instance}}: High memory usage detected'
      - alert: HighCPUUsage
        expr: ((sum(node_cpu{mode=~"user|nice|system|irq|softirq|steal|idle|iowait"}) BY (instance, job))
          - (sum(node_cpu{mode=~"idle|iowait"}) BY (instance, job)))
          / (sum(node_cpu{mode=~"user|nice|system|irq|softirq|steal|idle|iowait"}) BY (instance, job)) * 100 > 30
        for: 1m
        labels:
          service: backend
        annotations:
          description: This machine has had CPU usage above 30% for over 1m
          summary: High CPU Usage
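The NodeMemoryUsage and HighCPUUsage expressions are plain arithmetic on a few metric values. Here is a minimal sketch of both formulas with made-up inputs; the function names and sample numbers are illustrative assumptions, not readings from a real node. Note that in Prometheus the condition must also stay true for the whole `for: 1m` window before the alert fires, which this sketch does not model.

```python
GIB = 1024 ** 3


def memory_usage_percent(mem_total, mem_free, mem_cached):
    """Mirror of (MemTotal - MemFree - Cached) / MemTotal * 100."""
    return (mem_total - mem_free - mem_cached) / mem_total * 100


def cpu_busy_percent(all_modes_seconds, idle_iowait_seconds):
    """Mirror of (all modes - idle/iowait) / all modes * 100."""
    return (all_modes_seconds - idle_iowait_seconds) / all_modes_seconds * 100


# Hypothetical node: 16 GiB total, 6 GiB free, 4 GiB page cache.
mem = memory_usage_percent(16 * GIB, 6 * GIB, 4 * GIB)
print(mem, mem > 25)   # threshold taken from the NodeMemoryUsage rule

# Hypothetical CPU-second totals: 1000 s across all modes, 500 s idle or iowait.
cpu = cpu_busy_percent(1000, 500)
print(cpu, cpu > 30)   # threshold taken from the HighCPUUsage rule
```

With these sample values both conditions evaluate to true, so both rules would enter the pending state and fire once the condition had held for one minute.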