Release K: Creating & running an ML-based rApp

 

 

Note: This page is a work in progress (WIP).

Introduction

This tutorial outlines the steps to deploy a QoE prediction model on KServe using RAppManager. It covers setting up the environment, installing the necessary components, and deploying the model.

For the UAV path prediction use case: TBD.

Case 1: Deploying the QoE Prediction Model

Prerequisites

Ensure you have the following setup:

  • Operating System: Ubuntu 22.04 Server

  • CPU: 16 cores

  • RAM: 32 GB

  • Disk Space: 100 GB

  • Root Access: Required

Installation Steps

  1. Install Required Components:

    • AIML Framework Manager (aimlfmw)

    • RAN PM (ranpm)

    • KServe Inference Service

  2. Create Feature Group, Training Function, and Start Training Job:

    • Set up the required feature group.

    • Create the training function.

    • Initiate the training job.

  3. Patch KServe:

    • Apply necessary patches to KServe after installation.

  4. Install RAppManager, DMEParticipant, and ACM:

    • Install the required modules for RAppManager.

    • Onboard, prime, and deploy the QoE rApp via RAppManager.

Cloning Repositories

git clone "https://gerrit.o-ran-sc.org/r/aiml-fw/aimlfw-dep"
git clone "https://gerrit.o-ran-sc.org/r/nonrtric/plt/ranpm"
git clone "https://gerrit.o-ran-sc.org/r/nonrtric/plt/rappmanager"

Install Training Host

  1. Switch to root and navigate to the directory:

    sudo su
    cd aimlfw-dep
  2. Edit the configuration file:

    nano RECIPE_EXAMPLE/example_recipe_latest_stable.yaml
  3. Run the installation script:
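    A minimal sketch, assuming the install script shipped in the aimlfw-dep repository (verify the script name against your checkout):

    # install the training host using the recipe edited in step 2
    bin/install_traininghost.sh RECIPE_EXAMPLE/example_recipe_latest_stable.yaml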

Postgres Fix (If Needed)

Note: This issue with the Bitnami Postgres chart can arise if resources from a previous installation were not completely cleaned up. In that case, the password for the postgres user will not be the one set by the secret.

If Postgres issues arise during setup (for example, the TM pod in the traininghost namespace is in CrashLoopBackOff because it cannot reach the database), follow these steps:

  1. Delete and reinstall the Postgres database with a custom password (see the command sketch after this list):

  2. Try psql -U postgres. If you cannot log in with the password mypwd, edit the pg_hba.conf file to temporarily disable password authentication:

  3. Try psql -U postgres again. In the rare case that the Postgres database still isn’t found after step 2:

  4. Set the Postgres password:

  5. Revert the pg_hba.conf configuration to md5 and reload:
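A combined command sketch for the steps above. The release name, namespace, pod name, and chart values are assumptions for a typical Bitnami Postgres deployment in the traininghost namespace; adapt them to your cluster:

    # 1. delete and reinstall Postgres with a custom password
    helm uninstall postgres -n traininghost
    kubectl delete pvc -n traininghost -l app.kubernetes.io/name=postgresql
    helm install postgres bitnami/postgresql -n traininghost --set auth.postgresPassword=mypwd

    # 2. if "psql -U postgres" still rejects the password, relax authentication:
    #    inside the Postgres pod, edit /opt/bitnami/postgresql/conf/pg_hba.conf,
    #    change "md5" to "trust", then reload the configuration
    kubectl exec -it postgres-postgresql-0 -n traininghost -- \
      psql -U postgres -c "SELECT pg_reload_conf();"

    # 4. set the Postgres password explicitly
    kubectl exec -it postgres-postgresql-0 -n traininghost -- \
      psql -U postgres -c "ALTER USER postgres WITH PASSWORD 'mypwd';"

    # 5. change "trust" back to "md5" in pg_hba.conf and reload again
    kubectl exec -it postgres-postgresql-0 -n traininghost -- \
      psql -U postgres -c "SELECT pg_reload_conf();"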

Restart Kubeflow Pods

To ensure smooth operation, restart all Kubeflow pods:
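A minimal sketch, assuming Kubeflow runs in the kubeflow namespace; the deleted pods are recreated automatically by their controllers:

    # delete all pods in the kubeflow namespace so they come back fresh
    kubectl delete pods --all -n kubeflow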

Install RAN PM

  1. Modify the image prefix in global-values.yaml:

  2. Run the installation script:

If Istio is not installed, you can install it manually or via the script installation step below:
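A sketch covering steps 1-2 above plus the optional Istio install. The directory layout and script name are assumptions based on the ranpm repository; verify them in your checkout:

    cd ranpm/install
    # 1. adjust the image prefix (registry) used by the charts
    nano helm/global-values.yaml
    # 2. run the installation script
    ./install-ranpm.sh
    # optional: install Istio manually if it is not already present
    istioctl install --set profile=default -y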

Model Creation via DME

Follow the AIML Framework Installation Guide to create the model via DME.

  1. Reinstall AIML:
    Update the example_recipe_latest_stable.yaml with your machine’s IP and Datalake information, then reinstall (see the sketch after this list):

  2. Get InfluxDB Token:
    Run the script to retrieve access tokens:
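A sketch for the two steps above. The reinstall relies on the aimlfw-dep scripts (the script names are assumptions; check the repository), and the token lookup assumes an InfluxDB 2.x pod in the traininghost namespace:

    # 1. reinstall the training host with the updated recipe
    cd aimlfw-dep
    bin/uninstall.sh
    bin/install_traininghost.sh RECIPE_EXAMPLE/example_recipe_latest_stable.yaml

    # 2. one way to list the InfluxDB access tokens if you do not use the repository script
    kubectl exec -it <influxdb-pod> -n traininghost -- influx auth list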

Feature Group and Training Job Creation

  1. SSH into the environment with port forwarding (see the example after this list):

  2. Run the QoE Pipeline Jupyter Notebook:
    Open qoe-pipeline.ipynb in Jupyter and run all cells. Ensure the response code is 200.
    Visit: http://localhost:32088/tree

  3. Create Feature Group:
    Visit: http://localhost:32005/TrainingJob/CreateFeatureGroup

  4. Push QoE Data:

  5. Verify Data in InfluxDB:

  6. Create Training Job:
    Visit: http://localhost:32005/TrainingJob/CreateTrainingJob
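For step 1 above, a port-forwarding SSH invocation might look like the following; the forwarded ports match the Jupyter and dashboard URLs used in steps 2-6, and the user and host are placeholders:

    # forward the Jupyter (32088) and AIML dashboard (32005) NodePorts to localhost
    ssh -L 32088:localhost:32088 -L 32005:localhost:32005 <user>@<aimlfw-host>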



No-UI steps (if you wish to curl the AIML dashboard directly)

1b. Create a feature group (change the "Host": "IP" and the "token": "TOKEN" values in the body; a sample request follows below).

2b. Create a training job.

3b. Start the training job.

4b. Get the training job information.
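As an illustration of step 1b, a feature-group creation request might look like the sketch below. The endpoint path and most body fields are placeholders (they vary with the AIML dashboard version); only the Host/IP and token substitution called out above comes from this page:

    # placeholder endpoint and body - align both with your AIML dashboard API
    curl -X POST "http://localhost:32005/<feature-group-endpoint>" \
      -H "Content-Type: application/json" \
      -d '{
            "featuregroup_name": "qoe-fg",
            "host": "<IP>",
            "token": "<TOKEN>"
          }'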



Deploy KServe Inference Service

  1. Install and patch KServe:
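    A sketch, assuming KServe is installed with its upstream quick-install script and then patched; the release branch and the patch shown (switching the default deployment mode) are assumptions, so align them with the rappmanager scripts you use:

    # install KServe (pick the release your setup expects)
    curl -s "https://raw.githubusercontent.com/kserve/kserve/release-0.11/hack/quick_install.sh" | bash

    # example patch: set the default deployment mode to RawDeployment
    kubectl patch configmap/inferenceservice-config -n kserve --type=strategic \
      -p '{"data": {"deploy": "{\"defaultDeploymentMode\": \"RawDeployment\"}"}}'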

Final Step: Deploying the QoE Prediction Service Using Postman

1. Install Only the Required Services (ACM, RAppManager, DMEParticipant, Capifcore, ServiceManager, Kong)

To deploy the QoE prediction service, you'll need to install only a subset of the services, such as ACM, RAppManager, and others. Follow these steps:

  1. Modify the install-nonrtric.sh Script:

    Open the script to enable only the services you need:

    • Enable the necessary services:

    • Disable all unnecessary services:

  2. Comment Out Unnecessary KServe Steps:

    In the install-all.sh script, comment out any steps related to KServe installation or patching since you’ve already handled them.

    Comment the following lines:

  3. Handling Kong PV and Proxy Port Issues
    If you encounter issues with the Kong Persistent Volume (PV) or proxy port, follow these steps (a combined command sketch follows this list):

    1. Edit Kong Persistent Volume:

      Open the Kong PV configuration and add storageClassName: nfs-client under the spec section.

    2. Update Kong Proxy Port:

      If the default port for Kong is unavailable, update the kongvalues.yaml file:

      Set a new available nodePort:

    3. Upgrade Kong Helm Chart (if Kong is already running; otherwise skip):

      Apply the changes by upgrading the Helm deployment for Kong:
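A combined sketch of the three Kong adjustments above. The file locations, release name, and namespace are assumptions for a typical Kong Helm deployment; verify them against the rappmanager install scripts:

    # 1. add a storage class to the Kong PV definition
    #    spec:
    #      storageClassName: nfs-client

    # 2. set an available NodePort for the Kong proxy in kongvalues.yaml, e.g.:
    #    proxy:
    #      http:
    #        nodePort: 32083

    # 3. apply the changes if Kong is already deployed
    helm upgrade kong kong/kong -n kong -f kongvalues.yaml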

 

Run the Installation Script:

Now, run the installation script for RAppManager and the other enabled services:
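A minimal sketch, assuming the rappmanager installation scripts live under scripts/install in the cloned repository (verify the path in your checkout):

    cd rappmanager/scripts/install
    ./install-all.sh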

Wait until the script exits successfully (in some cases this can take more than 30 minutes).

4. Create a Namespace for KServe Testing

Create a new Kubernetes namespace specifically for testing the KServe inference service:
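For example, using the kserve-test namespace referenced later on this page:

    kubectl create namespace kserve-test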

5. Deploy the QoE Prediction Service via Postman or cURL

Once the services are up and running, you can now deploy the QoE prediction service using Postman or cURL.

Watch the video below for the steps to onboard, prime, and instantiate the rApp using the Postman collection.

  1. Check the Status of the Inference Service:

    Verify that the inference service for QoE is deployed in the kserve-test namespace:

    The output should display the QoE model's status and endpoint.

  2. Prepare the Input for QoE Prediction:

    Create a JSON file (input_qoe.json) with test data to send to the prediction model:

  3. Send a Prediction Request Using Postman or cURL:

    To send a prediction request, you can use either Postman or curl; an example curl command is included in the combined sketch after this list.

    Replace <KONG_IP> with the IP address of your Kong proxy and <KONG_PORT> with the configured NodePort (e.g., 32083).

  4. Validate the Response:

    You should receive a JSON response with the prediction results. This confirms that the QoE model has been successfully deployed and is functioning correctly.
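A combined sketch for steps 1-3 above. The inference service name, the Host header, and the input payload shape are assumptions; they must match the model the rApp actually deployed.

Check the inference service status in the kserve-test namespace:

    kubectl get inferenceservice -n kserve-test

An illustrative input_qoe.json (the KServe v1 "instances" envelope is standard, but the feature values and shape below are placeholders for your trained model):

    {
      "signature_name": "serving_default",
      "instances": [[[2.56, 2.56], [2.56, 2.56], [2.56, 2.56], [2.56, 2.56], [2.56, 2.56]]]
    }

Send the prediction request through the Kong proxy:

    curl -v "http://<KONG_IP>:<KONG_PORT>/v1/models/<model-name>:predict" \
      -H "Host: <inference-service-hostname>" \
      -H "Content-Type: application/json" \
      -d @input_qoe.json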

Useful Files


Note: This CSAR package is generated via rappmanager/sample-rapp-generator/generate.sh

 

Video of the prediction


Demo_960x540-lores.mp4
Deploying the rApp

 

output2_960x540-lores.mp4
Undeploying the rApp