AIMLFW pipeline for Generative AI
1. Objective
This document proposes native support for Generative AI pipelines in AIMLFW, with the following objectives:
Extend AIMLFW to support workflows for text, sequence, and synthetic pattern generation
Facilitate real-world use cases for O-RAN network performance improvement, such as:
Security: generate explanations for anomaly detection, automate incident response reports
RAN optimization: simulate traffic patterns, create synthetic datasets
Energy efficiency: generate network demand scenarios, auto-create scheduling policies
Provide reusable, modular support for custom generative models and datasets
Ensure seamless integration with existing components such as the Template Store and Training Manager
2. Scope
2.1 Identification
This section defines the scope of the Generative AI Pipeline Extension for AIMLFW and describes it at the system level.
System Name: AIMLFW-GAI Pipeline
Abbreviation: AIMLFW-GAI
Functional Scope:
Supports generative AI workflows, including text generation, synthetic traffic, and scenario simulation
Enables pipeline configuration via JSON/YAML templates
Manages the full lifecycle: training, generation validation, export, and deployment
Integration Scope:
Fully compatible with existing AIMLFW modules: Template Store, Training Manager, Model Manager
Runs as containers on Kubernetes clusters; compatible with Ubuntu 22.04 and Python 3.x
Intended Users:
Researchers working with generated data in wireless network studies
Contributors integrating generative AI into open-source O-RAN pipelines
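As an illustration of the template-driven configuration listed under Functional Scope, the following is a minimal sketch of how a JSON pipeline template might be parsed and checked. Every field name, task name, stage name, and the image reference are hypothetical and chosen for this example only; they do not reflect the actual AIMLFW Template Store schema.

```python
import json

# Hypothetical stage names, mirroring the lifecycle listed under Functional Scope.
LIFECYCLE_STAGES = ["training", "generation_validation", "export", "deployment"]

# Hypothetical task names, mirroring the supported generative workflows.
SUPPORTED_TASKS = {"text_generation", "synthetic_traffic", "scenario_simulation"}

# Hypothetical template; none of these keys come from the real AIMLFW schema.
TEMPLATE_JSON = """
{
  "pipeline_name": "synthetic-traffic-demo",
  "task": "synthetic_traffic",
  "model": {"type": "custom", "image": "example.registry/gen-model:0.1.0"},
  "stages": ["training", "generation_validation", "export", "deployment"]
}
"""


def load_template(text):
    """Parse a JSON template and return it, raising ValueError if it is invalid."""
    template = json.loads(text)
    for key in ("pipeline_name", "task", "model", "stages"):
        if key not in template:
            raise ValueError(f"missing required field: {key}")
    if template["task"] not in SUPPORTED_TASKS:
        raise ValueError(f"unsupported task: {template['task']}")
    # list.index raises ValueError for an unknown stage name.
    positions = [LIFECYCLE_STAGES.index(stage) for stage in template["stages"]]
    # Stages must be unique and listed in lifecycle order.
    if positions != sorted(set(positions)):
        raise ValueError("stages must be unique and in lifecycle order")
    return template


if __name__ == "__main__":
    template = load_template(TEMPLATE_JSON)
    print(template["pipeline_name"], template["stages"])
```

Rejecting a malformed template at load time, before any job is scheduled, is one possible design choice; a YAML template could be handled the same way once parsed into a dictionary.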
3. Current System or Situation
3.1 Background, Objectives, and Scope
AIMLFW currently supports supervised and unsupervised pipelines with preprocessing, feature extraction, and batch training. However, it does not natively support generative AI workloads such as pattern generation, simulation, or synthetic data creation. The goal of this enhancement is to add a first-class generative AI execution path to AIMLFW, enabling improved O-RAN network performance.
3.2 Operational Policies and Constraints
Only containerized, version-controlled generative components are permitted
Runs must be reproducible across environments
Container images must be signed and scanned for security
Rollback and upgrade procedures must be kept simple to protect uptime
Resource allocation follows cluster scheduling policies
Generated data must not contain personally identifiable information or subscriber-specific patterns
Full compliance with GDPR and telecom privacy regulations is required
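The constraint that generated data must not contain personally identifiable information could be enforced in part by an automated screen over generated records. The sketch below is illustrative only: the two patterns (an e-mail address and a 15-digit IMSI-like number) and the function name `find_pii` are assumptions made for this example, and a pattern match of this kind is not sufficient on its own to demonstrate regulatory compliance.

```python
import re

# Illustrative patterns only; a real deployment would need a reviewed,
# much broader rule set (these names and regexes are assumptions).
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "imsi_like": re.compile(r"\b\d{15}\b"),
}


def find_pii(records):
    """Return (record_index, pattern_name) for every record that matches a pattern."""
    findings = []
    for index, text in enumerate(records):
        for kind, pattern in PII_PATTERNS.items():
            if pattern.search(text):
                findings.append((index, kind))
    return findings


if __name__ == "__main__":
    samples = [
        "cell=12 prb_util=0.63 hour=14",
        "ue=001010123456789 throughput=12.4",
        "contact ops@example.com for details",
    ]
    # Prints [(1, 'imsi_like'), (2, 'email')]
    print(find_pii(samples))
```

A generation run could be blocked from export whenever `find_pii` returns a non-empty list, leaving the flagged records for manual review.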
3.3 Description of Current System
The Training Manager currently schedules supervised ML jobs with static templates. There is no mechanism for managing generative workflows; manual scripts are sometimes used but are not reusable or pipeline-compliant.
3.4 Users or Affected Personnel
Data Scientists: Design generative pipelines and create synthetic data
Platform Engineers: Ensure stable cluster deployment and compatibility
Operations: Monitor generation runs and validate output quality
3.5 Support Concept
Generative module containers are maintained through CI/CD
A version compatibility matrix will be provided
Monitoring dashboards will include generation quality, data distribution, and scenario statistics
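One way a dashboard could report on data distribution is to compare a histogram of generated values against a histogram of reference values. The sketch below uses total variation distance as the comparison; the choice of metric, the bin width, and the function names are assumptions for illustration, not part of the AIMLFW design.

```python
from collections import Counter


def histogram(values, bin_width):
    """Return the share of values that falls into each bin of width bin_width."""
    if not values:
        raise ValueError("values must not be empty")
    counts = Counter(int(v // bin_width) for v in values)
    total = len(values)
    return {bin_id: count / total for bin_id, count in counts.items()}


def total_variation_distance(reference, generated, bin_width=10.0):
    """Return a value in [0, 1]: 0 for identical histograms, 1 for disjoint ones."""
    p = histogram(reference, bin_width)
    q = histogram(generated, bin_width)
    bins = set(p) | set(q)
    return 0.5 * sum(abs(p.get(b, 0.0) - q.get(b, 0.0)) for b in bins)


if __name__ == "__main__":
    reference_load = [5, 15]   # hypothetical measured values
    generated_load = [5, 25]   # hypothetical generated values
    # Half of the probability mass sits in different bins, so this prints 0.5
    print(total_variation_distance(reference_load, generated_load))
```

A single bounded number like this is easy to plot over time and to alert on, although it depends on the chosen bin width and says nothing about temporal structure in the generated data.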
4. Analysis of the Proposed System
4.1 Summary of Advantages
Enables AIMLFW to handle generative AI use cases natively
Integrates seamlessly with existing modules
Reusable modular design supports various generative scenarios
4.2 Summary of Disadvantages or Limitations
Generative model training typically requires more time and compute resources than the supervised and unsupervised pipelines AIMLFW runs today
Additional quality assurance logic is needed for output validation
4.3 Alternatives and Trade-offs Considered
Alternative 1: Use external generative AI platforms (e.g., OpenAI API) and sync results back to AIMLFW
Alternative 2: Adapt existing supervised templates for temporary generative use
TBD
Date | Ver. | Author | Comment |
---|---|---|---|
2025-06-18 | 1.0.0 | Corbin(Geon) Kim | |