- f5c1c3e Fix incorrect state in Loadbalancer monitoring by JustHumanz · 3 weeks ago
- 333a193 Enhance `MySQLDown` alert (#2186) by Aldin Setiawan · 9 weeks ago
- cfea8c4 [ATMOSPHERE-367] Add the NodeTimeSkewDetected alert (#2151) by Dong Ma · 4 months ago
- a90d889 [ATMOSPHERE-578] Update NovaServiceGroupDown rule and Added failing tests (#2100) by Mohammed Naser · 4 months ago
- fa5d244 [ATMOSPHERE-523] Improve NeutronNetworkOutOfIPs alarm (#2063) by Dong Ma · 5 months ago
- f521227 [ATMOSPHERE-453] Remove NodeNonLTSKernel alert (#2043) by Dong Ma · 5 months ago
- bfb2ae8 [ATMOSPHERE-503] fix: remove softnet squeeze rules in kube-prometheus-stack (#2019) by Oleksandr K. · 5 months ago
- bf5b320 [ATMOSPHERE-508] Disable CephPGImbalance (#2022) by Dong Ma · 5 months ago
- 78a774a Add optional kubeconfig path to roles (#1871) by Austin Talbot · 6 months ago
- be509ea [ATMOSPHERE-428] Fix goldpinger grafana dashboard threshold for nodes (#1840) by Yaguang Tang · 7 months ago
- 2a165d3 Add TLS to node exporter (#1775) by Mohammed Naser · 7 months ago
- 8ba9975 [ATMOSPHERE-397] Add CommonName for monitoring stack (#1760) by Mohammed Naser · 7 months ago
- e186273 [ATMOSPHERE-340] Support collect softirq for node-exporter (#1732) by Yaguang Tang · 8 months ago
- 19bcfbf [ATMOSPHERE-175] Add support of ceph dashboard in grafana (#1688) by Yaguang Tang · 8 months ago
- d49adf8 [ATMOSPHERE-302] fix: set variables for cluster issuer name for keycloak and kube-prom-stack (#1676) by Oleksandr K. · 8 months ago
- 0bdfe94 Change Prometheus to use PVC for data store (#1652) by Yaguang Tang · 8 months ago
- 4569e9b Add Goldpinger + node-exporter-full (#1640) by Mohammed Naser · 8 months ago
- d05ee41 fix: nova capacity alert (#1306) by Michiel Piscaer · 8 months ago
- 2ce13b7 Add support to collect keycloak application metrics to prometheus (#1556) by Yaguang Tang · 8 months ago
- 1ca829c Fix `libvirt_exporter` missing `namespaceSelector` (#1517) by Yaguang Tang · 8 months ago
- 27022ca grafana: Allow user lookups by email (#1491) by Giovanni Tirloni · 8 months ago
- a10d0e5 ceph: Add CephHealthDetail alerts (#1495) by Giovanni Tirloni · 8 months ago
- cb3d38c Fix JSONNET rendering for alerts by Mohammed Naser · 9 months ago
- 08ff881 Add build request failure monitoring [ATMOSPHERE-249] (#1414) by Mohammed Naser · 9 months ago
- 7270870 Improve CI reliability (#1408) by Mohammed Naser · 9 months ago
- 3708ece fix: use openstack_helm_ingress_secret_name when set for monitoring (#1386) by Michiel Piscaer · 9 months ago
- f0836c2 fix: add CA mounts in the Prometheus oauth2 container (#1329) by Michiel Piscaer · 9 months ago
- 90128aa Switch docs to Sphinx (#1166) by Mohammed Naser · 11 months ago
- 44efb88 Remove FluxCD references (#1123) by Mohammed Naser · 11 months ago
- ed41210 Add monitoring for stuck VMs (#1129) by Mohammed Naser · 11 months ago
- d206f5d feat: Add openstack db exporter (#1039) by Rico Lin · 12 months ago
- 37ebfde fix: fix CI with aritubee not define issue (#989) by Rico Lin · 1 year, 1 month ago
- 91e2fa0 feat(monitoring): expose prom/am via sso (#987) by Mohammed Naser · 1 year, 1 month ago
- 0b59744 feat: increase EL compatibility (#963) by Tadas Sutkaitis · 1 year, 1 month ago
- 8dc7add fix(keycloak): add no_log and disable become by Mohammed Naser · 1 year, 2 months ago
- 2e937c9 fix: added monitoring for high 500s count by Mohammed Naser · 1 year, 3 months ago
- 93c165d chore: update doc for kube-prom-stack ingresses (#713) by Oleksandr Kozachenko · 1 year, 4 months ago
- 947a84a feat(libvirt): Enable exporter ootb (#573) by Oleksandr Kozachenko · 1 year, 5 months ago
- 2beb903 fix(monitoring): fire IpmiCollectorDown after 15m by Mohammed Naser · 1 year, 5 months ago
- 6589394 fix(monitoring): drop ethtool exporter (#572) by Mohammed Naser · 1 year, 6 months ago
- b009349 feat: Add keycloak (#510) by Oleksandr Kozachenko · 1 year, 6 months ago
- 5b49cbb feat(monitoring): refactor (#555) by Mohammed Naser · 1 year, 7 months ago
- 4a761bb fix: added NodeNetworkMulticast by Mohammed Naser · 1 year, 9 months ago
- 7ae2b65 fix: ignore vxlan- in node exporter by Mohammed Naser · 1 year, 9 months ago
- 610ff8c add alerts for node softnet by ricolin · 1 year, 9 months ago
- dce06d4 fix: ignore osa interfaces by Mohammed Naser · 1 year, 9 months ago
- 403a42a fix: Add NodeNonLTSKernel alert (#404) by Rico Lin · 1 year, 10 months ago
- 3e5885e Update main.yml by Mohammed Naser · 1 year, 11 months ago
- 55100d5 Add missing default for grafana host by ricolin · 1 year, 11 months ago
- d778add Correct grafana variable name by ricolin · 1 year, 11 months ago
- cc14968 feat: unify all monitoring via grafana by Mohammed Naser · 1 year, 11 months ago
- f0314a8 fix: implement isolated clusters by Mohammed Naser · 1 year, 11 months ago
- 574d650 fix: use updated vexxhost.k8s by Mohammed Naser · 2 years ago
- 6b7acca fix: tune net.core.netdev_budget by Mohammed Naser · 2 years ago
- 7538f02 chore: refactor to v.k8s.upload_helm_chart by Mohammed Naser · 2 years ago
- 31171f4 chore: refactor to vexxhost.k8s.docker_image by Mohammed Naser · 2 years ago
- 9118f67 feat(monitoring): add metrics for ingress-nginx by Mohammed Naser · 2 years ago
- 7500421 fix: misc monitoring updates by Mohammed Naser · 2 years ago
- 40eb429 doc: fix typo in grafana by Mohammed Naser · 2 years, 1 month ago
- 8a2c8fb feat: add logging via vector + loki by Mohammed Naser · 2 years, 1 month ago
- 36f1de2 docs: clean-up opsgenie integration by Mohammed Naser · 2 years, 1 month ago
- e119d8b docs(monitoring): fix opsgenie by Mohammed Naser · 2 years, 1 month ago
- 53c04a3 doc: update monitoring docs by Mohammed Naser · 2 years, 2 months ago
- 273d3ca chore: move monitoring to offline install by Mohammed Naser · 2 years, 2 months ago
- 8b5c306 fix: use atmosphere_images for an image manifest by Mohammed Naser · 2 years, 2 months ago
- 7d3c797 feat(monitoring): add to operator by Mohammed Naser · 2 years, 4 months ago
- 6ed255b build: fix galaxy publishing by Mohammed Naser · 2 years, 6 months ago
- b8d3432 fix: stop waiting for kube-prometheus-stack by Mohammed Naser · 2 years, 6 months ago
- 09b3b54 chore: refactor servicemonitors into kube-prometheus-stack by Mohammed Naser · 2 years, 6 months ago
- 08c6224 chore: refactor *monitors to kube-prometheus-stack by Mohammed Naser · 2 years, 6 months ago
- 64da5c6 feat: clean-up more code for helm repos by Mohammed Naser · 2 years, 6 months ago
- 2a8ce6a fix(metrics): don't wait for entire helmrelease, just deployment by Mohammed Naser · 2 years, 6 months ago
- 6bf6535 ci: move ansible-lint to pre-commit by Mohammed Naser · 2 years, 6 months ago
- c8e1a45 Add Flux CD for Helm deployment by Mohammed Naser · 2 years, 7 months ago
- ba40eb3 Add exception for gre_sys by ricolin · 2 years, 8 months ago
- bff9371 Add persistence to AlertManager by Mohammed Naser · 2 years, 8 months ago
- d92c5f7 Add exception for tbr instances by Mohammed Naser · 2 years, 8 months ago
- 0ae4144 Drop CephNodeDiskspaceWarning by Mohammed Naser · 2 years, 9 months ago
- 3a15345 monitoring: upgrade kube-prometheus-stack by Mohammed Naser · 2 years, 9 months ago
- 55cc241 monitoring: disable noisy alerts by Mohammed Naser · 2 years, 10 months ago
- 6cd7291 Fix webhook errors for monitoring by Mohammed Naser · 2 years, 10 months ago
- f3dffa8 Fix nodeSelector for services by Mohammed Naser · 2 years, 10 months ago
- 49e80bd Added ability to run overrides for monitoring by Mohammed Naser · 2 years, 11 months ago
- 511c3fa Add ansible-lint job by Mohammed Naser · 3 years ago
- b7b97d6 Added OpenStack services by Mohammed Naser · 3 years ago