Execute the following stored procedure in SQL to perform a health check of core EmpowerID processes: Z_EmpowerID_Health.
This stored procedure is pre-programmed to check the health of a common set of EmpowerID functionality and is meant to be consumed by an enterprise monitoring tool that alerts IT staff when attention is needed. The stored procedure can be customized to include additional checks. It is intended to be run at a regular interval; 5 minutes is recommended, and an interval shorter than 1 minute may put undue load on the system.
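The interval guidance above can be sketched as a small polling helper. This is a minimal illustration, not part of EmpowerID: the names `check_interval` and `poll` are hypothetical, and rejecting intervals under 1 minute outright is one possible reading of the "undue load" warning. `run_check` stands in for whatever the monitoring tool uses to invoke the stored procedure.

```python
import time

RECOMMENDED_INTERVAL_SECONDS = 5 * 60  # the 5-minute interval recommended above
MINIMUM_INTERVAL_SECONDS = 60          # shorter intervals may put undue load on the system


def check_interval(seconds):
    """Return the interval unchanged, or raise if it is shorter than 1 minute."""
    if seconds < MINIMUM_INTERVAL_SECONDS:
        raise ValueError(
            f"interval of {seconds}s is below the {MINIMUM_INTERVAL_SECONDS}s minimum"
        )
    return seconds


def poll(run_check, interval_seconds=RECOMMENDED_INTERVAL_SECONDS,
         iterations=None, sleep=time.sleep):
    """Call run_check repeatedly, sleeping interval_seconds between calls.

    iterations=None loops forever; a number limits the run (useful for tests).
    Returns the number of times run_check was called.
    """
    interval = check_interval(interval_seconds)
    count = 0
    while iterations is None or count < iterations:
        run_check()
        count += 1
        # Sleep only if another iteration will follow.
        if iterations is None or count < iterations:
            sleep(interval)
    return count
```

Most enterprise monitoring tools have their own scheduler, in which case only the interval value matters and a loop like this is unnecessary.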
A few of the pre-programmed health checks in this stored procedure take effect only when 2 settings are configured. Both can be set through the EID UI: https://<EID-URL>/ui/#custom/EmpowerIDConfigSettings
EmpowerID_Health_EmpowerIDServerRoleNames_ToMonitor: This setting should be a comma-separated list of the EmpowerID Server Roles that are used in the EID instance. Most commonly, this value should be set to "Application Server Full, Web Front-End", the 2 roles most commonly assigned to servers.
EmpowerID_Health_EmpowerIDServiceFriendlyNames_ToMonitor: This setting should be a comma-separated list of the EmpowerID Services that are used in the EID instance. Most commonly, this value should be set to "Web Role Service, Worker Role Service", the 2 services most commonly running on servers.
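As an illustration of the comma-separated format these two settings use, the sketch below splits a value into individual names and joins names back into a value. The helper names are hypothetical, and trimming whitespace around each name is an assumption; whether the stored procedure itself tolerates extra spaces is not stated here.

```python
def parse_monitor_list(value):
    """Split a comma-separated setting value into trimmed, non-empty names."""
    return [name.strip() for name in value.split(",") if name.strip()]


def format_monitor_list(names):
    """Join names into the comma-separated form shown in the examples above."""
    return ", ".join(names)


# The two most common values given above.
server_roles = parse_monitor_list("Application Server Full, Web Front-End")
services = parse_monitor_list("Web Role Service, Worker Role Service")
```

A check like this can help catch an empty entry or a stray trailing comma before the value is saved in the EID UI.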
Invoking Z_EmpowerID_Health always returns a single resultset with a single column and at least 1 record. A default record with the text “Z_EmpowerID_Health Complete (disregard this message)” is always returned. If all checks succeed, it is the only record. Otherwise, it is the last record, preceded by records whose text indicates which checks are currently failing. This design is meant to simplify incorporation into almost any enterprise monitoring tool.
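A monitoring integration could interpret the resultset roughly as follows. This is a sketch under stated assumptions: `cursor` is any Python DB-API 2.0 cursor already connected to the EmpowerID database (the connection itself is not shown), the procedure is callable without a schema prefix, and the default record matches the quoted text exactly apart from surrounding whitespace. The function names are hypothetical.

```python
# Text of the default record, as quoted above (exact match is an assumption).
DEFAULT_RECORD = "Z_EmpowerID_Health Complete (disregard this message)"


def fetch_health_rows(cursor):
    """Run the stored procedure and return the single column as a list of strings."""
    cursor.execute("EXEC Z_EmpowerID_Health")
    return [row[0] for row in cursor.fetchall()]


def failing_checks(rows):
    """Return every record except the default one; each is a failing check."""
    return [text for text in rows if text.strip() != DEFAULT_RECORD]


def is_healthy(rows):
    """Healthy means the procedure returned records and none describe a failure."""
    return len(rows) > 0 and not failing_checks(rows)
```

Treating an empty resultset as unhealthy is a deliberate choice in this sketch: since at least 1 record is always expected, no records at all suggests the call itself did not complete normally.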
Numerous checks are built in, and additional checks can be added to the stored procedure to cover use-case-specific logic. The following is a list of the possible health checks, shown as the text returned when each check fails:
EmpowerID Service <ServiceName> Heartbeat Overdue (5 minutes or more elapsed)
EmpowerID Server Role <RoleName> Heartbeat Overdue (5 minutes or more elapsed)
Job Overdue: <JobName> (10 minutes or more elapsed)
Job Not Succeeding: <JobName>
Inventory Progress Stalled: <SystemName>
Enforcement Progress Stalled: <SystemName>
Membership Progress Stalled: <SystemName>
Projection Progress Stalled: <SystemName>
Inventory Overdue: <SystemName>
Enforcement Overdue: <SystemName>
Membership Overdue: <SystemName>
Projection Overdue: <SystemName>
Inventory Not Succeeding: <SystemName>
Enforcement Not Succeeding: <SystemName>
Membership Not Succeeding: <SystemName>
Projection Not Succeeding: <SystemName>
Job <JobName> on AccountStore <SystemName> failed <#Failures> times consecutively from <EarliestFailureDateTime> to <RecentFailureDateTime>
Permanent Workflow <PermanentWorkflowName> is marked Active but overdue
Compiled Set <SetName> is marked Enabled but overdue
Cloud Gateway Server <ServerName> has not contacted EmpowerID in more than 5 minutes
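If the monitoring tool needs to route alerts by type, the message formats listed above can be matched with regular expressions. The sketch below is illustrative only: the `kind` labels and the `classify` function are hypothetical, the patterns assume the messages appear exactly as listed, and any text that matches nothing (for example, a custom check added to the procedure) is returned as `unknown`.

```python
import re

# (kind, pattern) pairs derived from the message formats listed above.
PATTERNS = [
    ("service_heartbeat",
     re.compile(r"^EmpowerID Service (?P<target>.+) Heartbeat Overdue")),
    ("server_role_heartbeat",
     re.compile(r"^EmpowerID Server Role (?P<target>.+) Heartbeat Overdue")),
    ("job_overdue",
     re.compile(r"^Job Overdue: (?P<target>.+) \(\d+ minutes or more elapsed\)$")),
    ("job_not_succeeding",
     re.compile(r"^Job Not Succeeding: (?P<target>.+)$")),
    ("job_consecutive_failures",
     re.compile(r"^Job (?P<job>.+) on AccountStore (?P<target>.+) "
                r"failed (?P<failures>\d+) times consecutively")),
    ("system_stage",
     re.compile(r"^(?P<stage>Inventory|Enforcement|Membership|Projection) "
                r"(?P<state>Progress Stalled|Overdue|Not Succeeding): (?P<target>.+)$")),
    ("permanent_workflow",
     re.compile(r"^Permanent Workflow (?P<target>.+) is marked Active but overdue")),
    ("compiled_set",
     re.compile(r"^Compiled Set (?P<target>.+) is marked Enabled but overdue")),
    ("cloud_gateway",
     re.compile(r"^Cloud Gateway Server (?P<target>.+) has not contacted EmpowerID")),
]


def classify(message):
    """Return a dict with the check kind and the names captured from the message."""
    text = message.strip()
    for kind, pattern in PATTERNS:
        match = pattern.match(text)
        if match:
            return {"kind": kind, **match.groupdict()}
    # Unrecognized text, such as a custom check, is passed through unchanged.
    return {"kind": "unknown", "target": text}
```

For example, `classify("Inventory Overdue: Contoso AD")` (with a made-up system name) yields the kind `system_stage`, the stage `Inventory`, the state `Overdue`, and the target `Contoso AD`.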