6.5 oneM2M and MEC support for Federated Learning

Federated Learning is a distributed machine learning approach in which model training is performed locally on multiple nodes and only model parameters or weights, not raw data, are exchanged with aggregators. Instead of sending sensitive data to a central location, each node trains a local model on its own data and shares only the resulting model updates with an aggregation point, which combines them to improve the global model. This enhances privacy, reduces bandwidth usage, and allows models to be adapted close to the data sources and IoT devices.
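
As a non-normative illustration of this exchange, the following Python sketch shows a weighted (FedAvg-style) combination of client updates, in which clients contributing more samples weigh more; the update structure and the sample counts are assumptions made for this example only.

    # Minimal sketch of weighted federated averaging (FedAvg-style).
    # Client updates and sample counts are hypothetical; no oneM2M/MEC API is implied.

    def federated_average(client_updates):
        """Combine per-client weight vectors into a new global model.

        client_updates: list of (weights, num_samples) tuples, where `weights`
        is a list of floats representing flattened model parameters.
        """
        total_samples = sum(n for _, n in client_updates)
        model_size = len(client_updates[0][0])
        global_weights = [0.0] * model_size
        for weights, n in client_updates:
            factor = n / total_samples          # clients with more data weigh more
            for i, w in enumerate(weights):
                global_weights[i] += factor * w
        return global_weights

    # Example: three FL Clients report updates after a local training round.
    updates = [
        ([0.10, 0.20, 0.30], 100),   # client A, 100 local samples
        ([0.12, 0.18, 0.33], 300),   # client B, 300 local samples
        ([0.08, 0.25, 0.28], 600),   # client C, 600 local samples
    ]
    print(federated_average(updates))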

In the MEC/oneM2M interworking context, local training can be executed within Application Entities (AEs) on Middle Nodes (MN-CSEs) or even on constrained edge devices, while aggregation roles can be hosted on Infrastructure Node CSEs (IN-CSEs) in the cloud, or on ETSI MEC applications acting as intermediate servers at the edge. ETSI MEC services could provide capabilities such as the Location API, the Radio Network Information API, or the IoT API to enrich local training with contextual features.
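
As a purely illustrative example of such a placement, the following fragment sketches a hypothetical deployment descriptor mapping FL roles onto oneM2M and MEC entities; the structure and field names are assumptions and are not defined by oneM2M or ETSI MEC.

    # Hypothetical deployment descriptor mapping FL roles to oneM2M/MEC entities.
    # Field names and values are assumptions used for illustration only.
    fl_deployment = {
        "fl_clients": [
            {"host": "oneM2M AE on a constrained device", "data": "local sensor data"},
            {"host": "MN-CSE at the edge",                "data": "gateway-level data"},
        ],
        "fl_server":       {"host": "MEC application on a MEC host"},
        "fl_aggregator":   {"host": "IN-CSE in the cloud"},
        "fl_orchestrator": {"host": "MEC Platform / IN-CSE coordination function"},
        "context_enrichment": ["Location API", "Radio Network Information API", "IoT API"],
    }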

In this clause, the entities involved are described as follows:

  • FL Client: a participating device or node (e.g., an IoT sensor, a oneM2M AE, a oneM2M MN-CSE, or a MEC node or application) that trains a machine learning model locally using its own private data. It holds the raw data (never shared outside the device), performs local model training based on the global model received from the FL Server/Aggregator, and sends only model updates/gradients (not raw data) back to the FL Server or FL Aggregator.

  • FL Server: coordinates the training process across all clients by distributing the initial global model, collecting updates, and managing the learning rounds. It selects which clients participate in each training round, sends the current global model parameters to the selected clients, receives the updated parameters or gradients from the clients after local training, and passes these updates to the FL Aggregator for combination. The FL Server could be hosted on a MN-CSE (at the edge) acting as a local server distributing models to nearby AEs, on an IN-CSE (in the cloud) serving as a central FL Server for large-scale coordination, on a MEC application or MEC host playing the role of a distributed FL Server close to the edge FL Clients, or on a MEC Platform hosting the control logic for FL Client selection and management/orchestration.

  • FL Aggregator: is responsible for combining model updates from multiple clients into a new, improved global model. It performs model aggregation (e.g., weighted averaging of client models), handles data heterogeneity by balancing contributions from clients (e.g., clients with more data may weigh more), and ensures robustness against noisy updates, stragglers, or malicious clients (e.g., through secure aggregation techniques). The FL Aggregator could be hosted on an IN-CSE acting as a central orchestration entity coordinating across multiple CSEs, on a MN-CSE for local orchestration, or on a MEC Platform exposing APIs for monitoring resources and optimizing FL Client/FL Aggregator placement.

  • FL Orchestrator: manages the overall lifecycle and policies of the federated learning process, ensuring scalability and resilience across distributed clients and servers. It decides which clients participate in each round (e.g., based on resource availability, network conditions, or data diversity), optimizes how training workloads are distributed, especially in heterogeneous edge/cloud environments, handles privacy rules, security requirements, energy constraints and fault tolerance, and orchestrates interactions across multiple FL Servers/Aggregators. The FL Orchestrator maps to high-level coordination functions in the MEC Platform or in the oneM2M IN-CSE, but can also exist locally at a MN-CSE for edge deployment scenarios. An illustrative sketch of these roles is given after this list.
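
The following non-normative sketch summarizes the separation of responsibilities among these entities as minimal interfaces; the class and method names are illustrative assumptions, not oneM2M or ETSI MEC definitions.

    # Illustrative role interfaces for the entities described above; class and
    # method names are assumptions, not normative oneM2M or ETSI MEC definitions.
    from abc import ABC, abstractmethod

    class FLClient(ABC):
        @abstractmethod
        def train_locally(self, global_model):
            """Train on private local data and return only model updates."""

    class FLServer(ABC):
        @abstractmethod
        def select_clients(self, registry, round_id):
            """Pick the FL Clients that participate in the given round."""

        @abstractmethod
        def distribute(self, global_model, clients):
            """Send the current global model to the selected clients."""

    class FLAggregator(ABC):
        @abstractmethod
        def aggregate(self, updates):
            """Combine client updates (e.g., weighted averaging) into a new model."""

    class FLOrchestrator(ABC):
        @abstractmethod
        def run_round(self, server, aggregator):
            """Initiate a training round and supervise it until completion."""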

Federated learning may be deployed according to the following options:

  • Option 1: In this option, the FL Orchestrator might be hosted in a cloud-based environment, while the FL Aggregator and the FL Server might run at the edge. FL Clients could be hosted at the edge or on the device, depending on their computational capabilities. The FL Orchestrator, as depicted in Figure 6.5.1, initiates the first training round and delivers the initial global model to the FL Server, which is in charge of distributing the model to each FL Client. Each FL Client trains the model locally on its private data and sends its model updates back to the FL Server. The FL Server forwards the collected client updates to the FL Aggregator (Global), which combines them to produce the new global model. The global model is returned to the FL Server, which passes it back to the FL Clients and the FL Orchestrator. The FL Orchestrator then starts the next training round. In this configuration, the FL Server coordinates participation in each round and forwards the collected updates to the FL Aggregator (Global) located at the edge, while the FL Orchestrator supervises the process by initiating training rounds, distributing models, and ensuring synchronization across all participants. Option 1 is suitable for applications that require rapid response, since aggregation is completed entirely at the edge (locally) and the process remains close to the data sources, addressing low-latency requirements.

Figure 6.5.1: Federated Learning Orchestration - Option 1

  • Option 2: In this option, both the FL Orchestrator and the FL Aggregator (Global) might be hosted in a cloud-based environment, while the FL Aggregators (Local) and the FL Servers might run at the edge. FL Clients could be hosted at the edge or on the device, depending on their computational capabilities. The FL Orchestrator, as depicted in Figure 6.5.2, initiates the first training round and distributes the initial global model to the FL Server, which is in charge of distributing the model to each FL Client. Each FL Client trains the model locally on its private dataset and sends its model updates to the FL Aggregator (Local), which performs a partial aggregation of the client updates to reduce communication overhead. The aggregated results are forwarded to the FL Server, which transmits them to the FL Aggregator (Global). The FL Aggregator (Global) performs the final global aggregation, producing the improved global model. This model is then returned to the FL Server and to the FL Orchestrator, which initiates the next training round. In this option, the FL Servers act as the local coordination layer, while the FL Orchestrator manages the end-to-end process. Option 2 is suitable for large-scale and geographically distributed deployment scenarios, as the aggregation occurs in two stages (first at the edge, then globally); an illustrative sketch of this two-stage aggregation follows Figure 6.5.2.

Figure 6.5.2: Federated Learning Orchestration - Option 2
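
As a non-normative illustration of the two-stage aggregation of Option 2, the following sketch organizes one training round under the assumption of simple weighted averaging, a stub local trainer, and in-memory message passing between the roles; all function names are illustrative.

    # Sketch of one Option 2 training round with two-stage aggregation.
    # The stub trainer and the in-memory message passing are illustrative
    # assumptions, not normative procedures.

    def weighted_average(updates):
        """updates: list of (weights, num_samples); returns the averaged weights."""
        total = sum(n for _, n in updates)
        size = len(updates[0][0])
        return [sum(w[i] * n / total for w, n in updates) for i in range(size)]

    def local_training(client_data, global_model):
        """Stand-in for local training on private data; returns (weights, num_samples)."""
        updated = [w + 0.01 for w in global_model]   # a real FL Client would run SGD here
        return updated, len(client_data)

    def option2_round(global_model, edge_sites):
        """edge_sites: one list of client datasets per edge site / FL Aggregator (Local)."""
        partial_results = []
        for clients_at_site in edge_sites:
            # FL Aggregator (Local): partial aggregation at the edge reduces the
            # communication overhead toward the cloud.
            site_updates = [local_training(data, global_model) for data in clients_at_site]
            site_samples = sum(n for _, n in site_updates)
            partial_results.append((weighted_average(site_updates), site_samples))
        # FL Aggregator (Global) in the cloud: final aggregation over the partial results.
        return weighted_average(partial_results)

    # Example: two edge sites, each with two FL Clients holding dummy datasets.
    sites = [[range(100), range(300)], [range(200), range(400)]]
    print(option2_round([0.0, 0.0, 0.0], sites))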

The orchestration mechanism should be able to perform the following steps:

  • Federation Group Selection: based on the federation registry, MEC/oneM2M nodes or instances with sufficient compute and local training capabilities are selected as participants in the federated learning process.
  • Model Distribution: a global training model, initialized at the IN-CSE or the MEC Orchestrator, is distributed to the selected participants.
  • Local Training: each selected node trains the model locally using its own data. The training parameters (e.g., weights, gradients) are stored locally and not exposed outside the node.
  • Aggregation and Synchronization: model updates are periodically transmitted to the aggregation nodes, which might be either an IN-CSE or a MEC host. The aggregator combines the updates using secure aggregation protocols and redistributes the improved global model, maintaining consistency between the IN-CSE/MEC host, the MEC/oneM2M nodes and the MEC orchestrator (aggregator). A non-normative sketch of these steps is given after this list.
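
The sketch below sequences these steps into a single training round, assuming a simple in-memory federation registry, hypothetical selection thresholds, and plain weighted averaging as a stand-in for a secure aggregation protocol.

    # Sketch of the orchestration steps above; registry fields, thresholds, and the
    # plain averaging used in place of a secure aggregation protocol are assumptions.

    def select_federation_group(registry, min_cpu=2.0, min_samples=50):
        """Federation Group Selection: keep nodes with enough compute and local data."""
        return [node for node in registry
                if node["cpu_cores"] >= min_cpu and node["num_samples"] >= min_samples]

    def run_training_round(registry, global_model, local_train):
        """One round: selection, distribution, local training, aggregation."""
        participants = select_federation_group(registry)
        updates = []
        for node in participants:
            # Model Distribution + Local Training: raw data never leaves the node;
            # only the resulting weights and the sample count are reported back.
            weights = local_train(node, list(global_model))
            updates.append((weights, node["num_samples"]))
        # Aggregation and Synchronization (plain weighted averaging as a placeholder
        # for a secure aggregation protocol).
        total = sum(n for _, n in updates)
        new_model = [sum(w[i] * n / total for w, n in updates)
                     for i in range(len(global_model))]
        return new_model

    # Example federation registry (hypothetical nodes).
    registry = [
        {"name": "MN-CSE-1",    "cpu_cores": 4, "num_samples": 500},
        {"name": "AE-sensor-7", "cpu_cores": 1, "num_samples": 200},   # filtered out
        {"name": "MEC-host-2",  "cpu_cores": 8, "num_samples": 1500},
    ]
    global_model = [0.0, 0.0, 0.0]
    new_model = run_training_round(
        registry, global_model,
        local_train=lambda node, model: [w + 0.01 for w in model])     # stub trainer
    print(new_model)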