
1. Introduction

  • In this use case, we use the "Cell Metrics" (RRU.PrbUsedDl) dataset provided by the O-RAN SC SIM space. The dataset is synthetic data generated by a simulator, and every record carries a Unix timestamp.

  • Model training is carried out on the O-RAN SC AI/ML Framework with GPU support, and covers both traditional machine learning (ML) and deep learning (DL) approaches. For ML models, we use Random Forest and Support Vector Regression (SVR); for DL models, we use RNN, LSTM, and GRU architectures.

  • Forecasting cell traffic lets us switch cells ON or OFF according to the expected load, which reduces power consumption. Operating the forecasting models themselves in an energy-efficient manner saves further power. In this use case, we therefore measure the energy consumption and carbon emissions of the AI/ML-based cell traffic forecasting process, so that the forecasting model is not only effective but also environmentally sustainable.

image-20241209-091900.png

2. Requirements

Configuring GPU Usage in a Machine Learning Pipeline

This section is based on contributions from Sungjin Lee's GitHub repository. For more details, visit this link.

  • Step 1. Install the nvidia-container-toolkit

    curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
    && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
    sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
  • Step 2. Configure containerd

    sudo nvidia-ctk runtime configure --runtime=containerd
    sudo vim /etc/containerd/config.toml
      [plugins."io.containerd.grpc.v1.cri".containerd]
        default_runtime_name = "nvidia" # Change to "nvidia"
        # Additional configurations are omitted
    
      [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia]
    
        [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia.options]
          BinaryName = "/usr/bin/nvidia-container-runtime"
          # Additional configurations are omitted
    
        # Include the following content below
        [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia.env]
          LD_LIBRARY_PATH = "/usr/lib/x86_64-linux-gnu:/usr/local/cuda/lib64"
    # Restart containerd service
    sudo systemctl restart containerd
  • Step 3. Install the nvidia-device-plugin

    kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.16.2/deployments/static/nvidia-device-plugin.yml
  • Step 4. Build the traininghost/pipelinegpuimage image

    • We built a new base image that recognizes the configured GPU to enable proper GPU usage in the ML pipeline components

    • To build the required image, you can refer to the provided Dockerfile and requirement.txt at the following link, or modify the pipeline image available in the existing aimlfw-dep

    sudo buildctl --addr=nerdctl-container://buildkitd build \
        --frontend dockerfile.v0 \
        --opt filename=Dockerfile.pipeline_gpu \
        --local dockerfile=pipeline_gpu \
        --local context=pipeline_gpu \
        --output type=oci,name=traininghost/pipelinegpuimage:latest | sudo nerdctl load --namespace k8s.io
  • Step 5. Verify GPU usage with nerdctl

    • Run the command below. If it prints the nvidia-smi table listing your GPU(s), the GPU setup is complete.

    sudo nerdctl run -it --rm --gpus all --namespace k8s.io -p 8888:8888 -v $(pwd):/app_run traininghost/pipelinegpuimage:latest /bin/bash -c "nvidia-smi"
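
As a complementary check from inside the container, the GPUs visible to the driver can be listed with nvidia-smi -L and counted from Python. The following is a minimal sketch only: the helper names parse_gpu_list and list_gpus are illustrative and are not part of the AI/ML Framework.

```python
import shutil
import subprocess


def parse_gpu_list(output: str) -> list[str]:
    """Return the GPU description lines from `nvidia-smi -L` output.

    Each GPU is reported on a line that starts with "GPU <index>:".
    """
    return [line.strip() for line in output.splitlines()
            if line.strip().startswith("GPU ")]


def list_gpus() -> list[str]:
    """Run `nvidia-smi -L`; return [] if the tool is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(["nvidia-smi", "-L"],
                            capture_output=True, text=True)
    if result.returncode != 0:
        return []
    return parse_gpu_list(result.stdout)


if __name__ == "__main__":
    gpus = list_gpus()
    print(f"{len(gpus)} GPU(s) visible")
    for gpu in gpus:
        print(gpu)
```

An empty list means that either nvidia-smi is not on the PATH or no GPU is exposed to the container.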

3. Getting Started

1. Viavi Dataset Insertion

  • Step 1. Download or copy the insert.py file available in the "File List" on this page

  • Step 2. Update the insert.py file with the appropriate values for your database (InfluxDB), including token, org, bucket, and the location of the dataset file (csv_file)

    • The bucket for the Viavi dataset must already exist

    # InfluxDB connection settings
    url = "http://localhost:8086"
    token = "JEQWYIOLvB4iOwJp8BwA"
    org = "primary"
    bucket = "Viavi_Dataset"
    csv_file = "CellReports.csv"
  • Step 3. Run insert.py

    • Inserting the entire Viavi dataset takes about 10 minutes

    python3 insert.py
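
The insert.py script itself is not reproduced on this page. To illustrate the kind of work such a script does, here is a minimal sketch that converts CSV rows into dictionary-style InfluxDB points and writes them with the influxdb-client package. The column names (timestamp, Viavi.Cell.Name, RRU.PrbUsedDl), the choice to store the cell name as a tag, and the helper names are assumptions for illustration. Only the cell_metrics measurement name is taken from the FeatureGroup setting on this page.

```python
import csv


def row_to_point(row: dict, measurement: str = "cell_metrics") -> dict:
    """Convert one CSV row into a dictionary-style InfluxDB point.

    The column names are assumptions; adjust them to the CSV header.
    """
    return {
        "measurement": measurement,
        "tags": {"Viavi.Cell.Name": row["Viavi.Cell.Name"]},
        "fields": {"RRU.PrbUsedDl": float(row["RRU.PrbUsedDl"])},
        "time": int(row["timestamp"]),  # Unix timestamp in seconds
    }


def read_points(lines) -> list[dict]:
    """Parse CSV lines (an open file or any iterable of strings)."""
    return [row_to_point(row) for row in csv.DictReader(lines)]


def write_points(points, url, token, org, bucket):
    """Write the points to InfluxDB (client imported lazily)."""
    from influxdb_client import InfluxDBClient, WritePrecision
    from influxdb_client.client.write_api import SYNCHRONOUS

    with InfluxDBClient(url=url, token=token, org=org) as client:
        write_api = client.write_api(write_options=SYNCHRONOUS)
        # Timestamps are in seconds, so the precision is set explicitly.
        write_api.write(bucket=bucket, org=org, record=points,
                        write_precision=WritePrecision.S)
```

With the connection settings shown above, a call could look like write_points(read_points(open(csv_file, newline="")), url, token, org, bucket).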

2. Setting FeatureGroup

  • Refer to the image below, and make sure to set _measurement to "cell_metrics"

Viavi_featuregroup.png

3. Upload AI/ML Pipeline Script (Jupyter Notebook)

Viavi_pipeline.png
  • Step 1. Download the pipeline script (pipeline.ipynb) provided in the "File List"

  • Step 2. Modify the pipeline script to satisfy your own requirements

    • Set data features required for model training (using FeatureStoreSdk)

      • We used the RRU_PrbUsedDl column from the Viavi Dataset
        Note: We extracted data at 30-minute intervals to train the model, focusing on capturing meaningful patterns in the traffic data.

    • Write a TensorFlow-based AI/ML model script

      • We used an LSTM (Long Short-Term Memory) model to predict downlink traffic

      • You can add other prediction accuracy metrics (e.g., RMSE, MAE, MAPE)

    • Configure Energy and CO2 emission tracking for the Green Network use case using CodeCarbon

      • The pipeline script reports:

        • Training duration, RAM/CPU/GPU energy consumption, and CO2 emissions

    • Upload the trained model along with its metrics (using ModelMetricsSdk)

  • Step 3. Compile the pipeline code to generate a Kubeflow pipeline YAML file
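
The data preparation mentioned in Step 2 (30-minute extraction followed by windowing for a sequence model) can be sketched as follows. This is an illustration under stated assumptions, not the code in pipeline.ipynb: it assumes samples arrive as (Unix timestamp, value) pairs sorted by time, keeps the first sample of each 30-minute bucket, and uses a hypothetical window length chosen by the caller.

```python
def downsample(samples, interval_s=1800):
    """Keep the first sample in each interval_s-second bucket.

    samples: iterable of (unix_timestamp_seconds, value), sorted by time.
    """
    kept, last_bucket = [], None
    for ts, value in samples:
        bucket = ts // interval_s
        if bucket != last_bucket:
            kept.append((ts, value))
            last_bucket = bucket
    return kept


def make_windows(values, window):
    """Split a series into (inputs, targets) for one-step-ahead forecasting.

    Each input holds `window` consecutive values shaped [[v], [v], ...]
    (one feature per time step); the target is the value that follows.
    """
    inputs, targets = [], []
    for start in range(len(values) - window):
        inputs.append([[v] for v in values[start:start + window]])
        targets.append(values[start + window])
    return inputs, targets
```

The nested [[v], ...] shape corresponds to (time steps, 1 feature), which is the per-sample input layout a Keras LSTM layer expects.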

4. TrainingJob

  • Set the TrainingJob name in lowercase

  • Configure the Feature Filter

    • For the query to work correctly, wrap the column name in backticks (`) when filtering on a specific cell site (e.g., `Viavi.Cell.Name` == "S10/B13/C3")

  • Refer to the image below

Viavi_create_trainingjobs.png
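
When TrainingJobs are created for several cell sites, the Feature Filter string can be generated rather than typed by hand. The helper below is a hypothetical sketch (cell_filter is not part of the framework). The backtick quoting follows the same convention as pandas DataFrame.query for column names containing dots; whether the framework evaluates the filter with pandas is an assumption.

```python
def cell_filter(cell_name: str, column: str = "Viavi.Cell.Name") -> str:
    """Build a Feature Filter expression for one cell site.

    The column name is wrapped in backticks because it contains dots.
    """
    if '"' in cell_name or "`" in column:
        raise ValueError("quotes or backticks in names are not handled here")
    return f'`{column}` == "{cell_name}"'


if __name__ == "__main__":
    print(cell_filter("S10/B13/C3"))
```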

5. Result

  • The training results can be reviewed in the logs of the Kubeflow pod created during training execution. The following details can be checked:

Viavi_codecarbon_hardware_log-20241209-083846.png

Viavi_model_summary.png

Viavi_pipeline_component_log.png
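
The energy and CO2 figures in these logs come from CodeCarbon. A minimal sketch of wrapping a training call with CodeCarbon's EmissionsTracker is shown below. The function name run_with_tracking and the injectable tracker argument are illustrative assumptions; the actual pipeline script may configure the tracker differently.

```python
import time


def run_with_tracking(train_fn, tracker=None):
    """Run train_fn() while tracking emissions.

    tracker: any object with start() and stop(). Defaults to
    codecarbon.EmissionsTracker, whose stop() returns the emissions
    in kg CO2-equivalent.
    Returns (train_fn result, emissions, duration in seconds).
    """
    if tracker is None:
        from codecarbon import EmissionsTracker  # imported only when needed
        tracker = EmissionsTracker()
    started = time.perf_counter()
    tracker.start()
    try:
        result = train_fn()
    finally:
        # Always stop the tracker, even if training raises.
        emissions_kg = tracker.stop()
    duration_s = time.perf_counter() - started
    return result, emissions_kg, duration_s
```

Passing the tracker as an argument keeps the training code testable without CodeCarbon installed. In the pipeline, train_fn would be a function that fits the model, for example lambda: model.fit(x_train, y_train).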

6. Load Model

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: traffic-forecasting-model
  namespace: kserve-test
spec:
  predictor:
    tensorflow:
      storageUri: "http://tm.traininghost:32002/model/viavi_lstm_s10b13c3/1/Model.zip"
      runtimeVersion: "2.17.0-gpu"
      resources:
        requests:
          cpu: 0.1
          memory: 1Gi
          nvidia.com/gpu: 1
        limits:
          cpu: 0.1
          memory: 1Gi
          nvidia.com/gpu: 1
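
Save the manifest above as deploy.yaml, then create the namespace and deploy the InferenceService:

kubectl create namespace kserve-test
kubectl apply -f deploy.yaml -n kserve-test

image-20241209-085120.png

Run the prediction script:

source predict.sh

image-20241210-084014.png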

7. Comparison

import json
import requests
import matplotlib.pyplot as plt

with open('input.json', 'r') as file:
    data = json.load(file)

instances = data['instances']
original_data = [instance[-1][0] for instance in instances[1:]]

print(f"Original: {original_data}")
print(f"Length: {len(original_data)}")

cluster_ip = "172.24.100.123"   # IP of where Kserve is deployed
ingress_port = "32667"          # Port of the Istio Ingress Gateway
url = f"http://{cluster_ip}:{ingress_port}/v1/models/traffic-forecasting-model:predict"

headers = {
    "Host": "traffic-forecasting-model.kserve-test.svc.cluster.local",
    "Content-Type": "application/json"
}

response = requests.post(url, headers=headers, json=data)

if response.status_code == 200:
    predictions = [pred[0] for pred in response.json()['predictions']]
else:
    print(f"Error: {response.status_code} - {response.text}")
    predictions = []

print(f"Predictions: {predictions}")

plt.figure(figsize=(35, 7))
plt.plot(original_data, label='Original Data', linestyle='-', marker='o')
if predictions:
    plt.plot(range(len(predictions)), predictions, label='Predictions', linestyle='--', marker='x')
plt.title('Comparison of Original Data and Predictions')
plt.xlabel('Index')
plt.ylabel('Value')
plt.legend()
plt.grid(True)
plt.show()
image-20241210-084549.png
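
Besides the visual comparison, the accuracy metrics mentioned earlier (RMSE, MAE, MAPE) can be computed from the same two lists. This is a minimal sketch in plain Python; the function name forecast_errors is illustrative, and skipping zero-valued actuals in MAPE is a design choice of this sketch.

```python
import math


def forecast_errors(actual, predicted):
    """Return RMSE, MAE and MAPE (%) for two equally long sequences.

    MAPE skips points whose actual value is 0 to avoid division by zero.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    errors = [p - a for a, p in zip(actual, predicted)]
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    mae = sum(abs(e) for e in errors) / len(errors)
    pct = [abs(e / a) for a, e in zip(actual, errors) if a != 0]
    mape = 100 * sum(pct) / len(pct) if pct else float("nan")
    return {"rmse": rmse, "mae": mae, "mape": mape}
```

In the script above, original_data is built from instances[1:] and is therefore one element shorter than predictions, so a call would be forecast_errors(original_data, predictions[:len(original_data)]). This alignment assumes that the target of each window is the last value of the next window, as the construction of original_data suggests.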

4. File list

  • CellReports.csv
    The file contains traffic data from Viavi.

  • insert.py
    The file processes the dataset and inserts the data into InfluxDB.
    (Changes required: DATASET_PATH, INFLUXDB_IP, INFLUXDB_TOKEN)

  • pipeline.ipynb
    The file defines the model structure and training process.

  • deploy.yaml
    The YAML file is used for deploying the model inference service.

  • predict.sh
    The script is used for executing the model prediction.

  • input.json
    The JSON file is used for inference.

5. Example

  • Input data

[
    [
        [
            0.8342111290054597
        ],
        [
            0.6010991527639614
        ],
        [
            0.6010991527639614
        ],
        [
            0.7556675063318373
        ],
        ...
    ]
]

  • Output data

"predictions": [
    [0.112432681], [0.0969874561], [0.100156009], [0.308645517],
    [0.303829], [0.41753903], [0.539769948], [0.618172765],
    [0.634754062], [0.625748396], [0.530416906], [0.333626628],
    [0.208134115], [0.137386888], [0.113081336], [0.105671018],
    [0.14362359], [0.258447438], [0.55385375], [0.567666292],
    [0.680490911], [0.738528132], [0.717839658], [0.653536677],
    [0.524426162], [0.453805923], [0.268508255], [0.170119792],
    [0.115568757], [0.0889923126], [0.0926849917], [0.143254936],
    [0.277066618], [0.629820824], [0.710377932], [0.728215456],
    [0.707016706], [0.638304889], [0.527941108], [0.411735266],
    [0.345081121], [0.171168745], [0.14139232], [0.0903047472],
    [0.231243849], [0.170835257], [0.178088814], [0.255523622],
    [0.405331433], [0.626908183], [0.738317609], [0.797301173],
    [0.575131953], [0.416216075], [0.296788305], [0.203006953],
    [0.15064013], [0.136724383], [0.159134328], [0.414751947],
    [0.386262804], [0.395934224], [0.432646126], [0.598033369],
    [0.657576084], [0.664435327], [0.585267484], [0.40978539],
    [0.28032586], [0.203143746], [0.136435121], [0.102216847],
    [0.109813243], [0.164212137], [0.438008606], [0.511521935]
]
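
The input data in this example is nested as windows of single-feature time steps, and the comparison script reads the windows from an "instances" key. The sketch below shows how such a request body could be assembled from plain lists of values. The function name build_payload is illustrative, and the values in the usage lines are made-up placeholders rather than dataset values.

```python
import json


def build_payload(windows):
    """Wrap windows into the {"instances": [...]} body of a predict call.

    windows: list of windows, each a list of floats (one per time step).
    Every value becomes a single-feature step [v], matching the nesting
    of the input data shown above.
    """
    if len({len(w) for w in windows}) > 1:
        raise ValueError("all windows must have the same length")
    return {"instances": [[[float(v)] for v in w] for w in windows]}


if __name__ == "__main__":
    payload = build_payload([[0.1, 0.2, 0.3], [0.2, 0.3, 0.4]])
    print(json.dumps(payload, indent=4))
```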

Contributors

  • Peter Moonki Hong - Samsung

  • Taewan Kim - Samsung

  • Corbin(Geon) Kim - Kyunghee Univ. MCL

  • Sungjin Lee - Kyunghee Univ. MCL

  • Hyuksun Kwon - Kyunghee Univ. MCL

  • Hoseong Choi - Kyunghee Univ. MCL

Version History

Date       | Ver.  | Author                                                    | Comment
2024-12-10 | 1.0.0 | Corbin(Geon) Kim, Sungjin Lee, Hyuksun Kwon, Hoseong Choi |
2024-01-08 | 1.0.1 | Sungjin Lee                                               | Corrected the input.json file
