This article serves as an introductory overview of the availability monitoring processes EmpowerID follows in its SaaS environments. It does not cover every aspect of Site Reliability Engineering or Security Information and Event Management performed by EmpowerID; performance monitoring, for instance, is not discussed in detail. Instead, it focuses on availability monitoring and describes the processes the EmpowerID DevOps team follows to ensure a base level of service with minimal impact on end users. The information is intended to help SaaS customers understand the monitoring EmpowerID performs and to help the in-house operations teams of non-SaaS customers know what they need to monitor to achieve parity. EmpowerID's availability monitoring can be broadly divided into three areas: front-end services, back-end services, and the underlying infrastructure monitored by EmpowerID DevOps.

Front-End Monitoring

To monitor site availability, EmpowerID DevOps primarily focuses on ensuring that the main web applications load without any issues. Azure Monitor is used for this purpose, and the following URLs are checked every two minutes per Azure region:

  • Core Login: https://<core-domain>/WebIdpForms/Login/Portal

  • IAM Shop (if applicable): https://<iamshop-domain>

  • My Identity (if applicable): https://<myid-domain>

Two primary regions are usually configured for monitoring. All requests are checked to ensure that they are successful. Three consecutive failures raise a high-priority alert, which is then handled by the EmpowerID DevOps team. In addition to this active front-end monitoring, passive error-rate monitoring is optionally performed for deployments with a large user base where end users rely on the EmpowerID UI frequently and regularly. For this, the Azure Application Gateway provides a failed-requests metric. If the error rate exceeds the 5% threshold for more than five minutes, a high-priority alert is raised.
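The two alerting rules described above (three consecutive failed checks, and an error rate above 5% sustained for more than five minutes) can be sketched roughly as follows. This is an illustrative sketch only, not EmpowerID's or Azure Monitor's implementation: the class name, function name, and the shape of the sample data are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class ConsecutiveFailureRule:
    """Hypothetical helper: signal an alert after `threshold` failed checks in a row."""
    threshold: int = 3   # three consecutive failures, as stated in the article
    failures: int = 0    # current run of consecutive failures

    def record(self, success: bool) -> bool:
        """Record one availability check; return True while the alert condition holds."""
        self.failures = 0 if success else self.failures + 1
        return self.failures >= self.threshold


def sustained_error_rate(samples, threshold=0.05, window_minutes=5):
    """Return True if the error rate stayed above `threshold` for more than `window_minutes`.

    `samples` is an assumed format: (minute, failed_requests, total_requests)
    tuples ordered oldest first.
    """
    breach_start = None
    for minute, failed, total in samples:
        rate = failed / total if total else 0.0
        if rate > threshold:
            if breach_start is None:
                breach_start = minute          # the breach begins here
            if minute - breach_start > window_minutes:
                return True                    # breach outlasted the window
        else:
            breach_start = None                # recovered; start over
    return False
```

A single successful check resets the failure counter, so intermittent failures never reach the threshold; likewise, one sample at or below 5% restarts the five-minute clock.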

Note: In the past, EmpowerID DevOps relied on automated web tests, which performed a sequence of activities automatically, such as simulating a user login. This facet of front-end availability monitoring is being retired, as it did not report anything beyond what the monitoring described above provides. It is mentioned here in case a need for such UI-driven process monitoring arises.

Back-End Monitoring

Most clients use EmpowerID partly or solely for its identity lifecycle automation, sometimes more so than for its UI-based functionality. Even where the primary use cases involve UI-driven processes, EmpowerID's backend processes are critical to overall system functionality, so considerable effort goes into monitoring them. Because EmpowerID stores all vital information, including process state information, in one database, a simple yet effective mechanism can be used to report process health. A stored procedure called Z_EmpowerID_Health checks process state information against predefined criteria and outputs a list of problematic conditions requiring attention. A complete listing of these health checks and their configurations is available at EmpowerID HealthCheck: SQL Procedure Z_EmpowerID_Health.

To monitor this process, EmpowerID DevOps deploys a monitoring container that invokes the health-check procedure every five minutes and submits any reported problem conditions to Azure Monitor. If a problem condition is reported in consecutive polling intervals, a medium-priority alert is raised. In this way, all of EmpowerID's backend processes are continually monitored to maintain overall system health.
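The polling behavior described above can be sketched as a small tracker that counts how many polls in a row each problem condition has been reported. This is a sketch under stated assumptions, not the actual monitoring container: the `HealthCheckPoller` class, the `fetch_conditions` callable (standing in for a call to Z_EmpowerID_Health), the condition names, and the default of two consecutive polls are all hypothetical, since the article does not state the procedure's output format or the exact number of intervals.

```python
from collections import defaultdict
from typing import Callable, Iterable, List


class HealthCheckPoller:
    """Hypothetical tracker for problem conditions reported in consecutive polls."""

    def __init__(self, fetch_conditions: Callable[[], Iterable[str]],
                 consecutive_polls: int = 2):
        # fetch_conditions is assumed to return the names of the problem
        # conditions currently reported by the health-check procedure.
        self.fetch_conditions = fetch_conditions
        self.consecutive_polls = consecutive_polls  # assumed value, not from the article
        self.streaks = defaultdict(int)             # condition name -> consecutive reports

    def poll_once(self) -> List[str]:
        """Run one poll and return the conditions that should raise an alert."""
        current = set(self.fetch_conditions())
        # Forget conditions that cleared since the previous poll.
        for name in list(self.streaks):
            if name not in current:
                del self.streaks[name]
        alerts = []
        for name in sorted(current):
            self.streaks[name] += 1
            if self.streaks[name] >= self.consecutive_polls:
                alerts.append(name)
        return alerts
```

In a real deployment, `poll_once` would be scheduled every five minutes and its result forwarded to the alerting system; requiring consecutive reports filters out conditions that resolve on their own between polls.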

Infrastructure Monitoring

EmpowerID SaaS is hosted on Azure and uses several Azure products, such as Azure Kubernetes Service (AKS) and SQL Database (as-a-Service). To detect issues before they affect front-end and back-end services, EmpowerID DevOps monitors specific metrics for each service. Depending on the metric and threshold, a medium- or high-severity alert is generated. Some of the metrics monitored for SQL Database include:

  • Remaining free space: less than 15% raises a medium-severity alert.

  • Deadlocks: more than three deadlocks within ten minutes raises a high-severity alert.

  • CPU utilization: an average of over 90% raises a medium-severity alert.
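The SQL Database thresholds above can be expressed as a small rule table that maps each metric to a breach condition and a severity. The following is a minimal sketch for illustration; the metric names, the `Rule` type, and the `evaluate` function are assumptions, and in practice these rules are configured in Azure Monitor rather than in application code.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Rule:
    metric: str                        # hypothetical metric name
    breached: Callable[[float], bool]  # returns True when the threshold is crossed
    severity: str                      # "medium" or "high"


# Thresholds taken from the article; metric names are illustrative only.
SQL_DATABASE_RULES = [
    Rule("free_space_percent", lambda v: v < 15, "medium"),
    Rule("deadlocks_last_10_min", lambda v: v > 3, "high"),
    Rule("avg_cpu_percent", lambda v: v > 90, "medium"),
]


def evaluate(metrics: Dict[str, float],
             rules: List[Rule] = SQL_DATABASE_RULES) -> List[Tuple[str, str]]:
    """Return (metric, severity) pairs for every rule whose threshold is breached."""
    return [
        (rule.metric, rule.severity)
        for rule in rules
        if rule.metric in metrics and rule.breached(metrics[rule.metric])
    ]
```

For example, `evaluate({"free_space_percent": 12, "avg_cpu_percent": 50})` reports only the free-space rule, because 12% is below the 15% floor while CPU is within bounds.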

Alert Handling

EmpowerID uses Azure Monitor to aggregate metrics, evaluate rules, and raise alerts. Actions are configured in Azure Monitor to trigger alerts in Atlassian Opsgenie, which then pages EmpowerID DevOps personnel. Depending on the severity, EmpowerID manages these alerts in the following way:

  • For high-severity alerts, on-call personnel are paged regardless of the time of day, and the alert is escalated if it is not acknowledged.

  • For medium-severity alerts, personnel are paged during waking hours, allowing for a timely follow-up.
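The severity-based paging policy above can be sketched as a single decision function. This is an illustration only: the article does not define "waking hours", so the 08:00 to 22:00 window below is an assumption, as are the function and constant names. The actual routing is configured in Opsgenie, not written as code like this.

```python
from datetime import time

# Assumed waking-hours window; the article does not specify the actual times.
WAKING_START = time(8, 0)
WAKING_END = time(22, 0)


def should_page_now(severity: str, local_time: time) -> bool:
    """Decide whether an alert of the given severity pages someone right away."""
    if severity == "high":
        return True  # high severity pages regardless of the time of day
    if severity == "medium":
        # medium severity pages only inside the assumed waking-hours window
        return WAKING_START <= local_time < WAKING_END
    raise ValueError(f"unknown severity: {severity!r}")
```

A medium-severity alert raised at 03:00 would therefore wait until the window opens, while a high-severity alert at the same time pages the on-call person immediately.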
