1. Introduction
In this use case, we use the "Cell Metrics" (RRU.PrbUsedDl) dataset provided by the O-RAN SC SIM space, which consists of synthetic data generated by a simulator, with all data recorded in Unix timestamp format.
The model training process is carried out on the O-RAN SC AI/ML Framework, including GPU support, and considers both traditional machine learning (ML) and deep learning (DL) approaches. For ML models, we use Random Forest and Support Vector Regression (SVR), while for DL models, we employ RNN, LSTM, and GRU architectures.
By managing the ON/OFF state of cells through traffic forecasting, we can reduce power consumption. Additionally, if the AI/ML models used for forecasting are operated in an eco-friendly manner, further power savings can be achieved. In this use case, we measure the carbon emissions and energy consumption during the cell traffic forecasting process using AI/ML to ensure that the forecasting model is not only effective but also environmentally sustainable.
...
2. Requirements
Configuring GPU Usage in a Machine Learning Pipeline
...
Step 1. Install the `nvidia-container-toolkit`

```bash
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
  && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
```
Step 2. Configure `containerd`

```bash
sudo nvidia-ctk runtime configure --runtime=containerd
```

```bash
sudo vim /etc/containerd/config.toml
```

```toml
[plugins."io.containerd.grpc.v1.cri".containerd]
  default_runtime_name = "nvidia"  # Change to "nvidia"
  # Additional configurations are omitted

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia]
  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia.options]
    BinaryName = "/usr/bin/nvidia-container-runtime"
    # Additional configurations are omitted

# Include the following content below
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia.env]
  LD_LIBRARY_PATH = "/usr/lib/x86_64-linux-gnu:/usr/local/cuda/lib64"
```

```bash
# Restart containerd service
sudo systemctl restart containerd
```
Step 3. Install the `nvidia-device-plugin`

```bash
kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.16.2/deployments/static/nvidia-device-plugin.yml
```
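Once the device-plugin pod is running, the GPU should appear as an allocatable resource on the node. The following helper is our own addition, not part of the official steps: a minimal sketch that checks this by parsing `kubectl get nodes -o json` (the `nvidia.com/gpu` resource key is the standard one registered by the plugin).

```python
# Sketch (our addition): confirm the device plugin registered nvidia.com/gpu.
import json
import subprocess

def gpu_allocatable(nodes_json=None):
    """Map node name -> allocatable GPU count, from `kubectl get nodes -o json`.

    Pass `nodes_json` to parse pre-captured output; otherwise kubectl is run.
    """
    if nodes_json is None:
        nodes_json = subprocess.run(
            ["kubectl", "get", "nodes", "-o", "json"],
            capture_output=True, text=True, check=True,
        ).stdout
    nodes = json.loads(nodes_json)["items"]
    return {
        n["metadata"]["name"]: n["status"]["allocatable"].get("nvidia.com/gpu", "0")
        for n in nodes
    }
```

A node that reports "0" here means the plugin has not (yet) registered the GPU.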
Step 4. Build the `traininghost/pipelinegpuimage` image

We built a new base image that recognizes the configured GPU to enable proper GPU usage in the ML pipeline components.
To build the required image, you can refer to the provided `Dockerfile` and `requirement.txt` at the following link, or modify the pipeline image available in the existing `aimlfw-dep`.

```bash
sudo buildctl --addr=nerdctl-container://buildkitd build \
  --frontend dockerfile.v0 \
  --opt filename=Dockerfile.pipeline_gpu \
  --local dockerfile=pipeline_gpu \
  --local context=pipeline_gpu \
  --output type=oci,name=traininghost/pipelinegpuimage:latest | sudo nerdctl load --namespace k8s.io
```
Step 5. Verify GPU usage with `nerdctl`

If the output is similar to the one below, the GPU setup is complete.

```bash
sudo nerdctl run -it --rm --gpus all --namespace k8s.io -p 8888:8888 -v $(pwd):/app_run traininghost/pipelinegpuimage:latest /bin/bash -c "nvidia-smi"
```
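The same verification can also be scripted from Python. This is an illustrative helper of our own, not part of the official steps; it assumes `nvidia-smi` is on the PATH inside the container and uses its standard `--query-gpu`/`--format=csv` options.

```python
# Sketch (our addition): list visible GPUs by parsing nvidia-smi CSV output.
import subprocess

def list_gpus(smi_output=None):
    """Return GPU names from `nvidia-smi --query-gpu=name --format=csv,noheader`.

    Pass `smi_output` to parse pre-captured text; otherwise the command is run.
    """
    if smi_output is None:
        smi_output = subprocess.run(
            ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True, text=True, check=True,
        ).stdout
    return [line.strip() for line in smi_output.splitlines() if line.strip()]
```

An empty list means the container cannot see any GPU, in which case recheck Steps 1–3.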
...
Step 1. Download or copy the `insert.py` file available in the "File List" on this page.

Step 2. Update the `insert.py` file with the appropriate values for your database (InfluxDB), including `token`, `org`, `bucket`, and the file location of the dataset (`csv_file`).
Note: the `bucket` for the Viavi dataset must already be created.

```python
# InfluxDB connection settings
url = "http://localhost:8086"
token = "JEQWYIOLvB4iOwJp8BwA"
org = "primary"
bucket = "Viavi_Dataset"
csv_file = "CellReports.csv"
```
Step 3. Run `insert.py`
It takes about 10 minutes to insert the entire Viavi dataset.

```bash
python3 insert.py
```
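For reference, here is a minimal sketch of what `insert.py` plausibly does, using the `influxdb-client` Python package. The measurement name `cell_metrics` matches the FeatureGroup setting used later on this page; the column names assume the Viavi `CellReports.csv` layout, and the actual script in the File List may differ.

```python
# Sketch (our assumption of insert.py's logic): read CellReports.csv and write
# each row to InfluxDB in line protocol. Connection values must match yours.
import csv

url = "http://localhost:8086"
token = "JEQWYIOLvB4iOwJp8BwA"   # replace with your token
org = "primary"
bucket = "Viavi_Dataset"
csv_file = "CellReports.csv"

def row_to_line(row):
    """Convert one CSV row to an InfluxDB line-protocol string.

    Assumes 'time' (Unix seconds), 'Viavi.Cell.Name' and 'RRU.PrbUsedDl'
    columns, as in the Viavi dataset.
    """
    cell = row["Viavi.Cell.Name"].replace(" ", "\\ ")  # escape spaces in tags
    value = float(row["RRU.PrbUsedDl"])
    ts_ns = int(row["time"]) * 1_000_000_000  # InfluxDB expects nanoseconds
    return f"cell_metrics,Viavi.Cell.Name={cell} RRU.PrbUsedDl={value} {ts_ns}"

def main():
    # influxdb-client is only needed when actually writing the data
    from influxdb_client import InfluxDBClient
    with InfluxDBClient(url=url, token=token, org=org) as client:
        write_api = client.write_api()
        with open(csv_file) as f:
            for row in csv.DictReader(f):
                write_api.write(bucket=bucket, record=row_to_line(row))

if __name__ == "__main__":
    main()
```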
2. Setting FeatureGroup
Refer to the image below, and make sure to set `_measurement` to "cell_metrics".
...
3. Upload AI/ML Pipeline Script (Jupyter Notebook)
...
...
Step 1. Download the pipeline script (`pipeline.ipynb`) provided in the "File List".

Step 2. Modify the pipeline script to satisfy your own requirements:
- Set the data features required for model training (using `FeatureStoreSdk`). The provided pipeline script uses the `RRU_PrbUsedDl` column from the Viavi Dataset; you can modify this based on your needs.
- Write a TensorFlow-based AI/ML model script. The provided pipeline script uses an LSTM (Long Short-Term Memory) model to predict downlink traffic. You can add other model prediction accuracy metrics (e.g., RMSE, MAE, MAPE).
- Configure energy and CO2 emission tracking for the Green Network use case (using `TrainCodeCarbon`). We provide: training duration, RAM/CPU/GPU energy consumption, and CO2 emissions.
- Upload the trained model along with its metrics (using `ModelMetricsSdk`).
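To make the modeling step above concrete, here is a minimal sketch of a windowing helper and a TensorFlow LSTM of the kind `pipeline.ipynb` uses. The window size and layer sizes are our illustrative assumptions, not the values in the provided notebook.

```python
# Sketch (our assumption): one-step-ahead forecasting of RRU_PrbUsedDl.

def make_windows(series, window=10):
    """Turn a 1-D series into (X, y) pairs for one-step-ahead forecasting."""
    X, y = [], []
    for i in range(len(series) - window):
        X.append(series[i:i + window])
        y.append(series[i + window])
    return X, y

def build_model(window=10):
    # TensorFlow is imported lazily so the windowing helper stays standalone
    import tensorflow as tf
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(window, 1)),  # one feature: RRU_PrbUsedDl
        tf.keras.layers.LSTM(64),
        tf.keras.layers.Dense(1),                  # next-step traffic value
    ])
    model.compile(optimizer="adam", loss="mse")
    return model
```

The same `make_windows` output can be fed to the RNN/GRU variants mentioned in the introduction by swapping the `LSTM` layer.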
Step 3. Compile the pipeline code to generate a Kubeflow pipeline YAML file
4. TrainingJob
- Set the TrainingJob name in lowercase.
- Configure the Feature Filter: for the query to work correctly, use backticks (`) around the column name when filtering for a specific cell site (e.g., `Viavi.Cell.Name` == "S1/B13/C1").
- Refer to the image below.
...
5. Result
These logs can be reviewed in the Kubeflow pod created during training execution; the details that can be checked are as follows:
...
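For context, the energy and CO2 figures in these logs can be produced with the CodeCarbon library. The sketch below is our own illustration (the pipeline's actual `TrainCodeCarbon` integration may differ); `train_with_tracking` is a hypothetical wrapper, and the 475 g/kWh grid intensity in `energy_to_co2` is an illustrative global average, not a value from this use case.

```python
# Sketch (our addition): measuring training emissions with CodeCarbon.

def train_with_tracking(train_fn):
    """Run `train_fn` under a CodeCarbon tracker; returns emissions in kg CO2."""
    from codecarbon import EmissionsTracker  # lazy import: optional dependency
    tracker = EmissionsTracker()
    tracker.start()
    try:
        train_fn()
    finally:
        emissions_kg = tracker.stop()  # stop() returns total kg CO2-eq
    return emissions_kg

def energy_to_co2(kwh, grams_per_kwh=475.0):
    """Rough conversion from energy (kWh) to grams of CO2.

    475 g/kWh is an illustrative average grid intensity; substitute the
    carbon intensity of your own region for meaningful numbers.
    """
    return kwh * grams_per_kwh
```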
6. Load Model
```yaml
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: traffic-forecasting-model
spec:
  predictor:
    tensorflow:
      storageUri: "http://tm.traininghost:32002/model/viavi_lstm_s1b13c3/1/Model.zip"
      runtimeVersion: "2.5.1"
      resources:
        requests:
          cpu: 0.1
          memory: 0.5Gi
        limits:
          cpu: 0.1
          memory: 0.5Gi
```
```bash
kubectl create namespace kserve-test
kubectl apply -f deploy.yaml -n kserve-test
```
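Once the InferenceService is ready, it can be queried over the TF Serving v1 predict protocol (presumably what `predict.sh` does with curl). Below is a standard-library-only sketch; the ingress URL and Host header are placeholders for your cluster, not values from this page.

```python
# Sketch (our addition): query the deployed traffic-forecasting-model.
import json
import urllib.request

def build_payload(window):
    """Wrap one input window (list of floats) in the TF Serving v1 format."""
    return json.dumps({"instances": [[[v] for v in window]]})

def predict(host, ingress_url, window):
    """POST a window of past traffic values; returns the model's predictions.

    `host` is the InferenceService hostname (kubectl get inferenceservice),
    `ingress_url` is your cluster's ingress address - both placeholders here.
    """
    req = urllib.request.Request(
        f"{ingress_url}/v1/models/traffic-forecasting-model:predict",
        data=build_payload(window).encode(),
        headers={"Host": host, "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["predictions"]
```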
...
4. File list
- `CellReports.csv`: contains traffic data from Viavi.
- `insert.py`: processes the GPU_dataset and inserts the data into InfluxDB. (Changes required: DATASET_PATH, INFLUX_IP, INFLUX_TOKEN)
- `pipeline.ipynb`: defines the model structure and training process.
- `deploy.yaml`: the YAML file used for deploying the model inference service.
- `predict.sh`: the script used for executing the model prediction.

To add: inference files (YAML, the shell script that runs inference, an example input file, an example output file).
...