Encoder-Decoder Network on the Shuttle Dataset
Université de Sherbrooke
This project implements an autoencoder neural network dedicated to anomaly detection on the Shuttle dataset. The model learns to reconstruct only normal data in order to identify deviations during the test phase.
Parameter selection, in particular the size of the latent space, is based on a statistical analysis of the data so that compression discards as little information as possible.
The choice of K=6 for the latent space is justified by the following spectral analysis, in which six eigenvalues clearly stand out from the rest:
Figure 1: MSE loss evolution (K=6) showing model stabilization.
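The spectral analysis mentioned above can be illustrated with a minimal sketch. It assumes the analysis is an eigendecomposition of the covariance matrix of the standardized features, which the text does not state explicitly. The data is synthetic (9 features driven by 6 latent factors), and `covariance_spectrum` and the largest-gap rule are hypothetical helpers, not the project's code.

```python
import numpy as np

def covariance_spectrum(X):
    """Eigenvalues of the covariance of standardized features, largest first."""
    Xc = X - X.mean(axis=0)
    std = Xc.std(axis=0)
    std[std == 0] = 1.0                      # guard against constant columns
    cov = np.cov(Xc / std, rowvar=False)
    return np.linalg.eigvalsh(cov)[::-1]     # eigvalsh returns ascending order

# Synthetic stand-in for the real data (assumption): 6 latent factors, 9 features.
rng = np.random.default_rng(0)
latent = rng.normal(size=(2000, 6))
mixing = np.hstack([np.eye(6), 0.5 * rng.normal(size=(6, 3))])   # shape (6, 9)
X = latent @ mixing + 0.01 * rng.normal(size=(2000, 9))

eigvals = covariance_spectrum(X)

# Pick k at the largest drop between consecutive eigenvalues.
gaps = eigvals[:-1] / eigvals[1:]
k = int(np.argmax(gaps)) + 1
print(eigvals.round(4), k)
```

On real data the drop is rarely this sharp, so a cumulative explained-variance threshold is a common alternative to the largest-gap rule.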
The network adopts a symmetric "hourglass" structure to extract the essential features from the signals.
ReLU layers after each linear layer introduce the non-linearity needed to capture complex relationships in the data. The absence of output activation allows unconstrained reconstruction of continuous values.
Figure 2: Encoder-Decoder network architecture diagram.
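A minimal PyTorch sketch of such an hourglass network is given here for illustration. The text fixes only the latent dimension (6), the ReLU activations and the linear output. The input width of 9 features and the hidden width of 8 are assumptions, and `HourglassAutoencoder` is a hypothetical name.

```python
import torch
from torch import nn

class HourglassAutoencoder(nn.Module):
    """Symmetric encoder-decoder; all layer widths except latent_dim are assumed."""

    def __init__(self, n_features=9, hidden=8, latent_dim=6):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(n_features, hidden), nn.ReLU(),
            nn.Linear(hidden, latent_dim), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_features),   # no output activation
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = HourglassAutoencoder()
x = torch.randn(4, 9)          # batch of 4 samples, 9 features each
z = model.encoder(x)           # latent code, shape (4, 6)
x_hat = model(x)               # reconstruction, shape (4, 9)
```

Leaving the last layer linear lets the reconstruction take any real value, which matters when the inputs are standardized and can be negative.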
Training uses normal samples only: anomalous samples are excluded from the training set, so the network never learns to reconstruct them.
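The training setup can be sketched as follows, under several assumptions not stated in the text: the data is synthetic, label 1 marks an anomaly, the optimizer is Adam with a learning rate of 1e-2, and the network is reduced to a single hidden bottleneck to keep the example short.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Synthetic stand-in data (assumption): 9 features, about 10% anomalies shifted away.
X = torch.randn(512, 9)
y = (torch.rand(512) < 0.1).long()      # 1 = anomaly, 0 = normal
X[y == 1] += 5.0

X_train = X[y == 0]                     # keep normal samples only

model = nn.Sequential(
    nn.Linear(9, 6), nn.ReLU(),         # bottleneck of dimension 6
    nn.Linear(6, 9),                    # linear output
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

losses = []
for epoch in range(50):                 # full-batch updates for brevity
    optimizer.zero_grad()
    loss = loss_fn(model(X_train), X_train)   # target is the input itself
    loss.backward()
    optimizer.step()
    losses.append(loss.item())
```

Because anomalies never contribute to the loss, the model has no incentive to reconstruct them well, which is what makes their reconstruction error informative at test time.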
Final classification relies on the per-sample reconstruction error (MSE): a sample whose error exceeds a chosen threshold is labeled "anomalous", otherwise "normal".
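As a small illustration of this decision rule, the sketch below computes one MSE value per sample and compares it with a threshold. The arrays and the threshold value of 0.5 are made up for the example, and the function names are hypothetical.

```python
import numpy as np

def reconstruction_errors(X, X_hat):
    """Mean squared error per sample (averaged over features)."""
    return np.mean((X - X_hat) ** 2, axis=1)

def flag_anomalies(errors, threshold):
    """True where the reconstruction error exceeds the threshold."""
    return errors > threshold

# Toy inputs and reconstructions (illustrative values only).
X = np.array([[0.0, 1.0, 2.0], [1.0, 1.0, 1.0], [3.0, 0.0, -1.0]])
X_hat = np.array([[0.1, 0.9, 2.0], [1.0, 1.0, 1.0], [0.0, 0.0, 0.0]])

errors = reconstruction_errors(X, X_hat)
flags = flag_anomalies(errors, threshold=0.5)
print(errors, flags)
```

Only the third sample, which is reconstructed poorly, ends up above the threshold.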
F-Measure formula used for evaluation.
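For reference, the standard definition of the F-measure is reproduced here, with anomalies taken as the positive class. The text does not say which weighting is used, so the balanced case (F1, i.e. beta = 1) is an assumption.

```latex
\[
P = \frac{TP}{TP + FP}, \qquad
R = \frac{TP}{TP + FN}, \qquad
F_\beta = \frac{(1 + \beta^2)\, P R}{\beta^2 P + R}
\]
\[
F_1 = \frac{2 P R}{P + R} = \frac{2\,TP}{2\,TP + FP + FN}
\]
```

Here TP, FP and FN are the counts of true positives, false positives and false negatives at a given threshold.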
Figure 3: Full distribution of reconstruction errors (normal data vs anomalies).
Figure 4: Zoomed view — separation between normal and anomalous distributions.
Figure 5: F-measure and Accuracy evolution as a function of the selected threshold.
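A threshold sweep of this kind can be sketched in a few lines. The error values, labels and threshold grid are toy data, label 1 is assumed to mark an anomaly, and `sweep_thresholds` is a hypothetical helper that computes F1 and accuracy for each candidate threshold.

```python
import numpy as np

def sweep_thresholds(errors, y_true, thresholds):
    """Return (F1, accuracy) arrays, one entry per threshold. y_true: 1 = anomaly."""
    f1s, accs = [], []
    for t in thresholds:
        y_pred = errors > t
        tp = np.sum(y_pred & (y_true == 1))
        fp = np.sum(y_pred & (y_true == 0))
        fn = np.sum(~y_pred & (y_true == 1))
        tn = np.sum(~y_pred & (y_true == 0))
        denom = 2 * tp + fp + fn
        f1s.append(2 * tp / denom if denom else 0.0)
        accs.append((tp + tn) / len(y_true))
    return np.array(f1s), np.array(accs)

# Toy reconstruction errors: four normal samples, two anomalies.
errors = np.array([0.1, 0.2, 0.15, 0.3, 2.0, 3.5])
y_true = np.array([0, 0, 0, 0, 1, 1])
thresholds = np.linspace(0.0, 4.0, 9)        # 0.0, 0.5, ..., 4.0

f1s, accs = sweep_thresholds(errors, y_true, thresholds)
best = thresholds[np.argmax(f1s)]            # first threshold with maximal F1
```

When anomalies are rare, accuracy stays high even for poor thresholds, so selecting the threshold on the F-measure is usually the more informative choice.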
Final performance, measured by F-measure and accuracy at the selected threshold, is high on this dataset.
Final anomaly detection results on the Shuttle dataset.
This project demonstrates that a lightweight autoencoder with a 6-dimensional latent space is sufficient to detect anomalies effectively in the Shuttle dataset. The reconstruction approach (training only on normal data, then thresholding the MSE) proves robust and interpretable.