Simulated datasets

Description

For AI/ML in general, and for energy savings in particular, training datasets need to be available. The SIM project currently offers two datasets: one based on real network data from Telecom Italia (TIM), and one based on synthetic data created with the help of Viavi.


Telecom Italia dataset

Where

The TIM dataset represents Internet usage on the network over two months (November and December 2013, with a timestamp every 10 minutes). It is published in the O-RAN SC nexus:

How

Download the dataset and extract it.

Details

The data has been preprocessed: voice and SMS usage were removed, leaving only Internet traffic. The usage values are normalised by a factor known only to TIM, which provides some anonymisation.

It covers November and December 2013, with a timestamp every 10 minutes. There is a separate file for each day.

The data covers the city of Milano. As described in the references, the city was divided into 10,000 squares (roughly a 100 x 100 grid). The SquareID column (between 1 and 10000) identifies the square. The timestamp is the start of the measurement interval; the interval ends 10 minutes later. Note that timestamps are in GMT+1, Milano local time. InternetTraffic represents the “number of CDRs generated inside a given Square id during a given Time interval.” It has no unit of measure because it is normalised for anonymisation. The absolute value has no meaning in itself, but the way it rises and falls over time reveals usage patterns.
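The column semantics above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names are hypothetical, the row-by-row numbering of SquareID is an assumption (the dataset references define the real layout), and the sample values are made up rather than taken from the files.

```python
from datetime import datetime, timedelta, timezone

GRID_SIDE = 100                            # roughly a 100 x 100 grid of squares
INTERVAL = timedelta(minutes=10)           # each timestamp opens a 10-minute interval
MILANO_TZ = timezone(timedelta(hours=1))   # GMT+1, as stated for the timestamps


def square_to_cell(square_id: int) -> tuple[int, int]:
    """Map a SquareID (1..10000) to a zero-based (row, column) pair.

    Assumption: IDs are numbered row by row. Check the dataset references
    for the actual numbering before relying on the geometry.
    """
    if not 1 <= square_id <= GRID_SIDE * GRID_SIDE:
        raise ValueError(f"SquareID out of range: {square_id}")
    return divmod(square_id - 1, GRID_SIDE)


def interval_bounds(start: datetime) -> tuple[datetime, datetime]:
    """Return (start, end) of the measurement interval that begins at start."""
    return start, start + INTERVAL


def relative_traffic(values: list[float]) -> list[float]:
    """Scale a series by its own peak so that only the shape of the curve remains."""
    if not values:
        return []
    peak = max(values)
    return [v / peak if peak else 0.0 for v in values]


# Made-up sample values, for illustration only.
start = datetime(2013, 11, 1, 0, 0, tzinfo=MILANO_TZ)
print(square_to_cell(101))
print(interval_bounds(start)[1].isoformat())
print(relative_traffic([2.0, 4.0, 8.0]))
```

Because InternetTraffic is normalised and unitless, scaling each square's series by its own peak is one simple way to compare daily patterns between squares; other normalisations would work as well.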

Other references

Viavi dataset

Where

The two files of this dataset are stored here:

How

Download the dataset. The files are password protected. Anyone interested in using the data can comment on this page and will receive the file password by email.

Details

The dataset contains synthetic data generated by the simulator and consists of two files: cell metrics and UE metrics. Each file contains metrics specific to either cells or user equipment (UE). Data was collected over a 7-day period, from January 1, 2023, to January 7, 2023. The 4G cells use frequency bands B2 and B17, and the 5G cells use frequency band N77. The cells are distributed randomly within a 10 square kilometer simulation area, using the default scenario settings of the simulator.

The data is timestamped in Unix format.
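As a minimal sketch, a Unix timestamp can be converted to a readable UTC time with the Python standard library. Whether the files store seconds or milliseconds is not stated here, so the `in_milliseconds` flag and the function name `unix_to_utc` are assumptions made for illustration.

```python
from datetime import datetime, timezone


def unix_to_utc(ts: float, in_milliseconds: bool = False) -> datetime:
    """Convert a Unix timestamp to a timezone-aware UTC datetime.

    Assumption: the value counts seconds since 1970-01-01 UTC unless
    in_milliseconds is set. Check the files to see which applies.
    """
    seconds = ts / 1000 if in_milliseconds else ts
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# 1672531200 is the first second of the collection period described above.
print(unix_to_utc(1672531200).isoformat())  # 2023-01-01T00:00:00+00:00
```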

Cell Metric

| Parameter | Description | Unit |
|---|---|---|
| DRB.UEThpDl | Downlink throughput | Gbps |
| DRB.UEThpUl | Uplink throughput | Gbps |
| RRU.PrbUsedDl | Downlink Physical Resource Blocks (PRBs) used | N/A |
| RRU.PrbUsedUl | Uplink Physical Resource Blocks (PRBs) used | N/A |
| RRU.PrbAvailDl | Number of Physical Resource Blocks (PRBs) available for downlink | N/A |
| RRU.PrbAvailUl | Number of Physical Resource Blocks (PRBs) available for uplink | N/A |
| RRU.PrbTotUl | Total usage of Physical Resource Blocks (PRBs) on the uplink | |
| RRU.PrbTotDl | Total usage of Physical Resource Blocks (PRBs) on the downlink | |
| RRC.ConnMean | Mean number of UEs in RRC connected mode | N/A |
| RRC.ConnMax | Maximum number of UEs in RRC connected mode | N/A |
| QosFlow.TotPdcpPduVolumeUl | Uplink data volume (PDCP PDU) delivered from gNB-DU to gNB-CU | Mbits |
| QosFlow.TotPdcpPduVolumeDl | Downlink data volume (PDCP PDU) delivered from gNB-CU to gNB-DU | Mbits |
| PEE.AvgPower | Average power utilization | watts (W) |
| PEE.Energy | Energy utilization | kilowatt-hours (kWh) |
| RRU.MaxLayerDlMimo | Average maximum scheduled layer number under MIMO scenario in DL | N/A |
| CARR.AverageLayersDl | Average value of scheduled MIMO layers per PRB on the DL | N/A |
| RRC.ConnEstabAtt.mo-Data | Number of UE RRC connections to the cell by "mobile originated data" cause | N/A |
| RRC.ConnEstabAtt.mo-VoiceCall | Number of UE RRC connections to the cell by "mobile originated voice call" cause | N/A |
| RRC.ConnEstabAtt.mo-VideoCall | Number of UE RRC connections to the cell by "mobile originated video call" cause | N/A |
| RRC.ConnEstabSucc.mo-Data | Number of successful UE RRC connections to the cell by "mobile originated data" cause | N/A |
| RRC.ConnEstabSucc.mo-VoiceCall | Number of successful UE RRC connections to the cell by "mobile originated voice call" cause | N/A |
| RRC.ConnEstabSucc.mo-VideoCall | Number of successful UE RRC connections to the cell by "mobile originated video call" cause | N/A |
| RRC.ConnEstabFailCause.NetworkReject | Number of unsuccessful UE RRC connections to the cell rejected by the network | N/A |
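For energy-savings work, the raw cell counters are often combined into ratios. The sketch below shows one possible way to do that for a single cell sample. The derived names (`prb_utilisation_dl`, `rrc_success_ratio_data`, `mbits_per_kwh_dl`), the dictionary layout, and the sample values are all assumptions made for illustration; they are not part of the dataset.

```python
from typing import Optional


def ratio(numerator: float, denominator: float) -> Optional[float]:
    """Return numerator / denominator, or None when the denominator is zero."""
    if not denominator:
        return None
    return numerator / denominator


def derive_kpis(sample: dict) -> dict:
    """Combine raw cell counters into a few illustrative ratios."""
    return {
        # Share of the available downlink PRBs that were actually used.
        "prb_utilisation_dl": ratio(
            sample["RRU.PrbUsedDl"], sample["RRU.PrbAvailDl"]
        ),
        # Share of mo-Data RRC connection attempts that succeeded.
        "rrc_success_ratio_data": ratio(
            sample["RRC.ConnEstabSucc.mo-Data"], sample["RRC.ConnEstabAtt.mo-Data"]
        ),
        # Downlink volume delivered per unit of energy (Mbits per kWh).
        "mbits_per_kwh_dl": ratio(
            sample["QosFlow.TotPdcpPduVolumeDl"], sample["PEE.Energy"]
        ),
    }


# Made-up values for one cell and one reporting interval.
sample = {
    "RRU.PrbUsedDl": 50,
    "RRU.PrbAvailDl": 100,
    "RRC.ConnEstabSucc.mo-Data": 9,
    "RRC.ConnEstabAtt.mo-Data": 10,
    "QosFlow.TotPdcpPduVolumeDl": 1200.0,
    "PEE.Energy": 0.5,
}
print(derive_kpis(sample))
```

Returning `None` for a zero denominator keeps idle intervals (no PRBs available, no connection attempts, no energy reported) distinguishable from intervals where the ratio is genuinely zero.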

UE Metric

| Parameter | Description | Unit |
|---|---|---|
| RRU.PrbUsedUl | Mean uplink Physical Resource Blocks (PRBs) | N/A |
| RRU.PrbUsedDl | Mean downlink Physical Resource Blocks (PRBs) | N/A |
| DRB.UEThpUl | Uplink throughput | Gbps |
| DRB.UEThpDl | Downlink throughput | Gbps |
| TB.TotNbrUl | Total number of uplink Transport Blocks (TBs) | N/A |
| TB.TotNbrDl | Total number of downlink Transport Blocks (TBs) | N/A |
| DRB.UECqiUl | UE's uplink CQI | N/A |
| DRB.UECqiDl | UE's downlink CQI | N/A |