Federated Learning for Autoencoder-based Anomaly Detection in the Industrial IoT
This project investigates the use of autoencoders and federated learning for condition monitoring in Industrial IoT (IIoT) environments, with a focus on resource-constrained edge devices and data privacy. It was developed as part of my Bachelor's thesis and published at IEEE BigData 2022.
System Architecture
The project follows a stepwise approach:
Lightweight Autoencoder Design
A compact and efficient autoencoder was developed for detecting anomalies in sensor data from industrial machinery. The model is optimized to run on edge devices with constrained compute and memory capacity.
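Autoencoder-based detectors usually flag a sample when its reconstruction error exceeds a threshold calibrated on healthy data. The following is a minimal sketch of that scoring step only, not the project's model: `toy_reconstruct` is a hypothetical stand-in for a trained autoencoder, and the "mean plus k standard deviations" threshold rule is an assumption that the source does not state.

```python
from statistics import mean, stdev
from typing import Callable, List, Sequence

Vector = Sequence[float]


def reconstruction_error(x: Vector, x_hat: Vector) -> float:
    """Mean squared error between a sample and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def calibrate_threshold(errors: Sequence[float], k: float = 3.0) -> float:
    """Assumed rule: mean + k * sample standard deviation of healthy errors."""
    return mean(errors) + k * stdev(errors)


def is_anomaly(x: Vector, reconstruct: Callable[[Vector], Vector],
               threshold: float) -> bool:
    """Flag a sample whose reconstruction error exceeds the threshold."""
    return reconstruction_error(x, reconstruct(x)) > threshold


def toy_reconstruct(x: Vector) -> List[float]:
    """Hypothetical stand-in for a trained autoencoder: it can only
    reproduce values inside the range [-1, 1] seen during training."""
    return [min(max(v, -1.0), 1.0) for v in x]


# Calibrate on (made-up) healthy sensor windows, then score new samples.
healthy = [[0.1, -0.2, 1.05], [0.0, 0.5, -1.1], [0.9, 1.02, 0.3]]
healthy_errors = [reconstruction_error(x, toy_reconstruct(x)) for x in healthy]
threshold = calibrate_threshold(healthy_errors)

normal_flag = is_anomaly([0.5, 0.5, 0.5], toy_reconstruct, threshold)
defect_flag = is_anomaly([3.0, 0.0, 0.0], toy_reconstruct, threshold)
```

On an edge device only `reconstruct` and a single stored threshold are needed at inference time, which keeps the per-sample cost to one forward pass and one comparison.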
Federated Learning for Privacy
To preserve data privacy, the system leverages a federated learning setup. Instead of transferring raw data, only model parameters are shared among devices. This allows local data to remain on-premise while enabling global model improvements through collaboration.
| Federated Learning IIoT Use Case Scenario | Federated Learning Training Cycle |
|---|---|
| ![Federated Learning IIoT Use Case Scenario]() | ![Federated Learning Training Cycle]() |
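The aggregation step of such a training cycle can be sketched as a sample-weighted average of client parameters (FedAvg-style). This is an illustration under assumptions: the source does not say which aggregation rule the project uses, and the function name, the dictionary layout of the weights, and the example values are hypothetical.

```python
from typing import Dict, List, Sequence

# Hypothetical layout: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def federated_average(client_weights: Sequence[Weights],
                      num_samples: Sequence[int]) -> Weights:
    """Average client parameters, weighted by local dataset size."""
    if not client_weights or len(client_weights) != len(num_samples):
        raise ValueError("need one sample count per client")
    total = sum(num_samples)
    if total <= 0:
        raise ValueError("total number of samples must be positive")

    aggregated: Weights = {}
    for name in client_weights[0]:
        length = len(client_weights[0][name])
        aggregated[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, num_samples)) / total
            for i in range(length)
        ]
    return aggregated


# Two clients send only their parameters; raw sensor data stays local.
clients = [{"encoder": [1.0, 2.0]}, {"encoder": [3.0, 4.0]}]
global_weights = federated_average(clients, num_samples=[1, 3])
```

Weighting by sample count lets clients with more local data influence the global model more; an unweighted mean is the special case where all counts are equal.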
Evaluation & Results
To evaluate the success of this approach, we conducted a case study on a real-world industrial application of anomaly detection in rotating machines, which are commonly found in manufacturing.
We compared the performance and resource demands of three configurations:
- A baseline model: centralized and resource-unconstrained
- A centralized, resource-efficient model: trained on pooled data
- A federated version of the resource-efficient model: multiple instances trained locally on disjoint data subsets, exchanging only model weights
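To make the "disjoint data subsets" of the federated configuration concrete, here is a minimal sketch of one way to split a dataset across clients. The source does not describe how the subsets were formed, so the seeded random split and the name `split_disjoint` are assumptions for illustration.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_disjoint(samples: Sequence[T], num_clients: int,
                   seed: int = 0) -> List[List[T]]:
    """Shuffle sample indices once, then deal them out round-robin so that
    every sample lands on exactly one client."""
    if num_clients < 1:
        raise ValueError("num_clients must be at least 1")
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    return [[samples[i] for i in indices[c::num_clients]]
            for c in range(num_clients)]


# Ten (made-up) samples dealt to three simulated clients.
shards = split_disjoint(list(range(10)), num_clients=3)
```

Each shard can then serve as the local training set of one client, while the centralized configurations train on the full, pooled list.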
Key Findings
Our research showed that:
- The proposed resource-efficient centralized model achieved anomaly detection performance similar to that of the baseline architecture.
- Even in the federated learning setting, where instances share only model weights rather than data, the resource-efficient model predicted defects with equal certainty.
- At the same time, this approach substantially reduced resource consumption and kept data private, since no training data ever had to leave an individual device.

Project Structure & Usage
Training & Models
- `src/models/`: Resource-efficient condition monitoring model for deployment at the edge
- `src/training/`: Training pipeline for the resource-efficient autoencoder, as well as the baseline (resource-unconstrained) condition monitoring model for comparison
- `src/federated_learning/`: Federated training, communication between models, and model aggregation logic
- `src/data/`: Data pipeline for loading, cleaning, and transformation
- `config.yaml`: Central configuration for training parameters (e.g., learning rate, batch size, number of clients)
Build & Deployment
- Dockerfiles for all components are located in `docker/`
- Persistent state and model transfer mechanisms are built in for simulated or real federated setups
- Supports local execution and deployment to cloud environments (e.g., Google Cloud Platform) via Ansible in `deployment/`
- For detailed information on deploying the KubeEdge testbed in GCP using Ansible, please refer to the Deployment README.
License
This project is licensed under the MIT License; see the LICENSE file for details.
Contributing
Contributions are welcome! Please see CONTRIBUTING.md for guidelines.

