The near-RT RIC Self-Health-Check flow fulfills the requirement that every system monitor its own health: internal subsystems, hosted software, and external interfaces. At configurable intervals, the RIC is to trigger Health-Check requests to its internal common platform modules and hosted xAPPs. Each platform module and each xAPP is required to support Health-Check requests and to perform a self-check. The RIC also exercises the E2 interface to ensure it is operational and receiving data from downstream RAN resources.
For NB interfaces, the RIC is responsible for checking its own interface functions (i.e., the O1 Termination module and the A1 Mediator), while heartbeats or keep-alive signals over O1 and A1 are verified by the NB clients that invoke O1 and A1 (see the sections below on Flows #2 and #3).
<<insert sequence diagram and plantuml syntax>>
The specific Health-Check validations are as follows (see the flow diagram in Figure 1):
- Health of RIC platform modules –
- Ability to initiate a Health-check on each of the common platform modules within the RIC (e.g., logging, tracing, conflict manager, xAPP manager, subscription management, O1 Termination, A1 Mediator, etc.), store results and declare alarm/alert conditions [5 in Figure 1]
- Ability for each common platform module in the RIC to perform a self-check [6]
- Implementation Option[1]: The self-check can potentially leverage Kubernetes Liveness and Readiness probes. Liveness probes can be configured to execute a command, issue an HTTP GET request, or open a TCP socket against the container/pod. Readiness probes can be configured to ensure the pod is ready before allowing it to handle traffic. To further check a module’s (pod) ability to communicate with other modules over RMR (RIC Message Router), each module could subscribe to its own topic, regularly send a hello-world message to itself, and verify that it can both send and receive messages.
- Any alarm/alert conditions or clearing of alarms/alerts are sent immediately via the O1 VES interface. [7-8]
- Health of xAPPs
- Ability of RIC to invoke Health-check requests to each of the xAPP instances deployed on the RIC [9]
- Ability of each xAPP to perform Health-checks on itself and respond back to the RIC [10]
- Implementation Option: See Implementation Option above for platform modules.
- Any alarm/alert conditions or clearing of alarms/alerts are sent immediately via the O1 VES interface. [7-8]
- Health of E2 interface - Ability to send requests to downstream RAN resources (O-CU/O-DU) via the E2 interface for PM collection and report generation, and to receive the PM report [13-15]
- Any alarm/alert conditions or clearing of alarms/alerts are sent immediately via the O1 VES interface. [16]
- Health of the overall RIC instance: Health-Check results (successes, failures and anomalies) are mapped to alarms and alerts, which represent the RIC’s operability, and are stored. [17]
- Ability to update alarms for queries by NB clients. For example, RIC alarms/alerts can be incorporated into the O1 NETCONF operational tree. The corresponding YANG model might need to be augmented (e.g., by defining a health state leaf with an alarm-list in the YANG model).
- Ability to make performance test results available to NB clients (for on-demand requests) [19-22]
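The loopback idea in the implementation option for platform modules (a module sends a hello-world message to its own topic and confirms that it arrives) could be sketched roughly as follows. This is an illustration only: `LoopbackChannel`, `DroppingChannel`, and `loopback_self_check` are hypothetical names, and the in-process queue is a stand-in for the messaging layer, not the RMR API.

```python
import queue
import time
import uuid


class LoopbackChannel:
    """Hypothetical stand-in for a messaging channel (NOT the RMR API)."""

    def __init__(self):
        self._q = queue.Queue()

    def send(self, topic, payload):
        self._q.put((topic, payload))

    def receive(self, timeout):
        # Return the next (topic, payload) pair, or None if nothing arrives in time.
        try:
            return self._q.get(timeout=timeout)
        except queue.Empty:
            return None


class DroppingChannel(LoopbackChannel):
    """Simulates a broken messaging path: sent messages never arrive."""

    def send(self, topic, payload):
        pass


def loopback_self_check(channel, topic, timeout=1.0):
    """Send a uniquely tagged hello message to our own topic and wait for it."""
    token = uuid.uuid4().hex
    channel.send(topic, token)
    deadline = time.monotonic() + timeout
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        msg = channel.receive(remaining)
        if msg is None:
            return False
        if msg == (topic, token):
            return True
        # Any other message (e.g. a stale hello) is ignored; keep waiting.
```

A unique token per probe keeps a late reply from an earlier cycle from being mistaken for the current one. The boolean result is the kind of value a module could report in response to a Health-Check request.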
To support this flow, a new Health-Check functional block within the RIC is being proposed, which can be implemented as a separate software module or as a distributed function across one or more existing modules. The Health-Check functional block has to perform the following:
- Perform health-checks on the underlying common platform functions/modules and on xAPP instances hosted on the RIC (self-checks at configured intervals and on-demand requests)
- Map failures and anomalies to alarms and alerts
- Send out notifications for alarms and alerts
- Determine the state of the RIC based on alarms and alerts
- Store health-check results for queries
- Clear alarms and alerts when conditions clear
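The responsibilities listed above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the class name `HealthCheckBlock`, the state names `HEALTHY` and `DEGRADED`, and the `notifications` list, which merely stands in for events that would be sent over the O1 VES interface. Each check is modeled as a zero-argument callable that returns True when its target is healthy.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class HealthCheckBlock:
    # Target name -> callable returning True when the target is healthy.
    checks: Dict[str, Callable[[], bool]]
    active_alarms: set = field(default_factory=set)
    # Latest result per target, kept for later queries.
    results: Dict[str, bool] = field(default_factory=dict)
    # Stand-in for outgoing alarm notifications (hypothetical format).
    notifications: List[Tuple[str, str]] = field(default_factory=list)

    def run_cycle(self) -> str:
        """Run every check once, update alarms, and return the overall state."""
        for name, check in self.checks.items():
            try:
                healthy = bool(check())
            except Exception:
                # A check that crashes is treated as a failure.
                healthy = False
            self.results[name] = healthy
            if not healthy and name not in self.active_alarms:
                self.active_alarms.add(name)
                self.notifications.append(("RAISE", name))
            elif healthy and name in self.active_alarms:
                self.active_alarms.discard(name)
                self.notifications.append(("CLEAR", name))
        return self.state()

    def state(self) -> str:
        """Derive the overall state from the currently active alarms."""
        return "DEGRADED" if self.active_alarms else "HEALTHY"
```

In this sketch an alarm is raised only on the transition from healthy to unhealthy and cleared only on the reverse transition, so a target that keeps failing does not generate a new notification on every cycle. A real implementation would likely distinguish alarm severities and more than two overall states.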
Figure 1 below shows the flow of RIC Self-Checks: regular heartbeats over O1 and A1, the Health-Check Module initiating health-check requests within the RIC to assess its overall health, and the issuing of alarms/alerts as appropriate based on the health results.
[1] Implementation options are suggested at the use case level, to be further fleshed out during the user stories phase.