Device Identification Based on Radio Signals Using Deep Metric Learning

Posted Sep 2, 2025

By Anh Dinh

1. Introduction
2. Related Work
3. Method
4. Experiment
5. Result

1. Introduction

Radio Frequency Fingerprinting (RFF) is a technique for identifying radio devices based on unique characteristics (fingerprints) generated by hardware imperfections during RF signal transmission.

Every transmitting device (WiFi, IoT, phones, drones, etc.) has small, unavoidable imperfections in its circuitry (oscillator, PA, DAC…).
These imperfections give the emitted signal a distinct electromagnetic “fingerprint.”
By extracting features from the received signal (e.g., IQ imbalance, phase noise, transient behavior, spectral features…), devices can be identified and distinguished even when they use the same communication standard and the same MAC/IP address.

Main applications of RFF

Wireless network security: detect impersonation, prevent spoofing.
IoT: lightweight device identification and authentication without heavy protocols.
Military & Drones: distinguish unmanned aerial vehicles.

2. Related Work

Transient Signal Identification

Studies show that the transient stage of a radio signal (between channel noise and steady state) contains key features for device identification. Detecting the transient boundary precisely is crucial, because boundary errors propagate into feature extraction and training.

Common detection methods:

Bayesian change point detection: based on fractal dimension changes.
Phase-based detection: relies on the phase varying linearly during the transient stage.
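
As a rough illustration of the phase-based idea, the sketch below flags the first window in which the sample-to-sample phase change stops looking like noise. It is a simplified variance-threshold variant, not the published algorithm: the function name `detect_transient_start`, the window length, the `ratio` threshold, and the synthetic signal are all assumptions made for this example.

```python
import numpy as np

def detect_transient_start(iq, window=32, ratio=0.1):
    """Return the sample index of the first window whose phase-increment
    variance falls well below the noise level, or None if none does.

    Hypothetical helper: assumes complex baseband samples and that the
    recording starts with a few windows of channel noise only.
    """
    phase = np.unwrap(np.angle(iq))            # continuous phase
    dphi = np.diff(phase)                      # phase increment per sample
    n_win = len(dphi) // window
    var = dphi[: n_win * window].reshape(n_win, window).var(axis=1)
    noise_var = np.median(var[:4])             # noise-only reference level
    below = np.flatnonzero(var < ratio * noise_var)
    return int(below[0]) * window if below.size else None

# Synthetic recording: 500 samples of noise, then a noisy complex tone.
rng = np.random.default_rng(0)
n_total, n_noise = 2000, 500
iq = 0.05 * (rng.standard_normal(n_total) + 1j * rng.standard_normal(n_total))
iq[n_noise:] += np.exp(2j * np.pi * 0.05 * np.arange(n_total - n_noise))

start = detect_transient_start(iq)
print("estimated transient start (samples):", start)
```

The estimate is only accurate to one window length, so a smaller `window` trades boundary resolution against a noisier variance estimate.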

Machine Learning and Deep Learning Applications

Once the transient signal is extracted, supervised machine learning models like SVM perform well but heavily rely on accurate transient segmentation.

To overcome this, deep learning (CNN, LSTM) leverages the full signal, reducing dependence on transient detection and improving accuracy. Recent models such as RiftNet have been applied to identify devices from smartphone Bluetooth signals.

3. Method

Signal Classification with Machine Learning

Instantaneous Phase Features

These features come from transient-based RF fingerprinting.
Extract statistical features, including higher-order ones (variance, skewness, kurtosis), from:
- Instantaneous amplitude
- Instantaneous frequency
- Instantaneous phase
These features reflect micro hardware imperfections, creating a unique “fingerprint” for each device.
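
A minimal sketch of this feature extraction is shown below. It assumes the transient has already been cut out and is available as complex baseband samples; the helper names (`moments`, `instantaneous_features`), the linear detrending of the phase, and the synthetic input are assumptions for illustration, not details taken from the study.

```python
import numpy as np

def moments(x):
    """Variance, skewness and excess kurtosis of a 1-D array."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    var = np.mean(centered**2)
    z = centered / (np.sqrt(var) + 1e-12)      # guard against constant input
    return [var, np.mean(z**3), np.mean(z**4) - 3.0]

def instantaneous_features(iq, fs):
    """9-element feature vector from amplitude, frequency and phase."""
    amplitude = np.abs(iq)
    phase = np.unwrap(np.angle(iq))
    frequency = np.diff(phase) * fs / (2 * np.pi)   # in Hz
    # Assumption: remove the linear phase trend caused by the carrier offset,
    # so the phase statistics describe only the nonlinear residue.
    n = np.arange(len(phase))
    phase = phase - np.polyval(np.polyfit(n, phase, 1), n)
    return np.array(moments(amplitude) + moments(frequency) + moments(phase))

# Synthetic transient: a 1 MHz offset tone with small amplitude and phase noise.
fs = 250e6
rng = np.random.default_rng(0)
t = np.arange(1000) / fs
iq = (1 + 0.05 * rng.standard_normal(t.size)) * np.exp(
    1j * (2 * np.pi * 1e6 * t + 0.02 * rng.standard_normal(t.size))
)
features = instantaneous_features(iq, fs)
print(features.shape)
```

The resulting vector is what a classifier such as SVM would receive for each recording.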

TFED Features

Use Hilbert-Huang transform to analyze signals in the time-frequency domain.
Extract feature groups:
- Transient Signal (e.g., duration, energy, phase entropy)
- Envelope (e.g., energy, envelope variance)
- TFED-Time (e.g., slope, variance, energy peak distribution)
- TFED-Frequency (e.g., peaks, energy variance)
Limitations: the features must be extracted manually, they depend on accurate transient detection, and information can be lost at low sampling rates.

Signal Classification with RiftNet

RiftNet architecture

RiftNet has two branches:
- Branch (A): processes long segments (~16 μs).
- Branch (B): processes short segments (~2.5 μs).
Both use Dilated Convolutional Cells (DCC) with different dilation rates to extract multi-scale temporal information.
Intermediate outputs are combined with skip connections and fed into the classification layer.
Advantage: effectively captures RF features, reduces reliance on manual feature extraction.
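
To make the multi-scale idea concrete, the sketch below implements a plain 1-D dilated convolution in NumPy and computes the receptive field of a small stack of such layers. The kernel size of 3 and the dilation rates (1, 2, 4, 8) are hypothetical values chosen for illustration, not RiftNet's actual configuration, and the segment lengths in samples assume the 250 Msps sampling rate reported for the dataset.

```python
import numpy as np

def dilated_conv1d(x, kernel, dilation):
    """'Valid' 1-D dilated convolution (cross-correlation, as in deep learning)."""
    span = (len(kernel) - 1) * dilation        # distance from first to last tap
    out_len = len(x) - span
    out = np.zeros(out_len)
    for i, w in enumerate(kernel):
        out += w * x[i * dilation : i * dilation + out_len]
    return out

def receptive_field(kernel_size, dilations):
    """Input samples seen by one output of a stack of dilated layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)

# Segment lengths in samples, assuming fs = 250 Msps.
fs = 250e6
long_len = round(16e-6 * fs)      # branch (A)
short_len = round(2.5e-6 * fs)    # branch (B)

# Hypothetical stack: kernel size 3, dilation doubling per layer.
rf = receptive_field(3, [1, 2, 4, 8])

# With dilation 2, the kernel [1, 0, -1] compares samples 4 apart.
x = np.arange(10.0)
y = dilated_conv1d(x, [1.0, 0.0, -1.0], dilation=2)
print(long_len, short_len, rf, y)
```

Increasing the dilation widens the receptive field without adding weights, which is why stacking cells with different rates captures both short and long temporal structure.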

Metric Learning and Open Set Problem

After training RiftNet, the softmax layer is removed.
Apply metric learning with contrastive loss, train for 20 additional epochs.
Extract latent vectors of known users and index them using FAISS for querying and evaluation.
A threshold of 1.0 (Euclidean distance) in the embedding space separates known from unknown users.
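
The two steps above can be sketched as follows: a contrastive loss over pairs of embeddings, and a nearest-neighbor query that labels anything farther than the threshold as unknown. This is a minimal sketch under stated assumptions: the margin value, the function names, and the toy 2-D embeddings are hypothetical, and a brute-force NumPy search stands in for the FAISS index used in the study. If a FAISS flat L2 index is used instead, note that it reports squared distances, so the threshold would have to be compared accordingly.

```python
import numpy as np

def contrastive_loss(z1, z2, same, margin=1.0):
    """Mean contrastive loss over embedding pairs.

    same[i] is 1 if pair i comes from the same device, else 0.
    The margin is an assumed value; some formulations add a factor 0.5.
    """
    d = np.linalg.norm(z1 - z2, axis=1)
    pull = same * d**2                                   # same device: pull together
    push = (1 - same) * np.maximum(margin - d, 0.0)**2   # different: push apart
    return float(np.mean(pull + push))

def open_set_predict(query, gallery, gallery_labels, threshold=1.0):
    """Nearest-neighbor label, or -1 (unknown) beyond the distance threshold."""
    d = np.linalg.norm(query[:, None, :] - gallery[None, :, :], axis=2)
    nearest = d.argmin(axis=1)
    nearest_dist = d[np.arange(len(query)), nearest]
    labels = np.where(nearest_dist <= threshold, gallery_labels[nearest], -1)
    return labels, nearest_dist

# Toy pairs: one same-device pair and one different-device pair, both 0.5 apart.
z1 = np.zeros((2, 2))
z2 = np.array([[0.5, 0.0], [0.5, 0.0]])
loss = contrastive_loss(z1, z2, same=np.array([1, 0]))

# Toy gallery of two known devices and three queries.
gallery = np.array([[0.0, 0.0], [3.0, 0.0]])
gallery_labels = np.array([0, 1])
query = np.array([[0.2, 0.1], [3.0, 0.5], [1.5, 2.0]])
labels, dists = open_set_predict(query, gallery, gallery_labels)
print(loss, labels)
```

The third query is 2.5 away from both known devices, so it falls outside the threshold and is reported as unknown.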

4. Experiment

Dataset

Public dataset from Uzundurukan et al. (2020), published in Data, MDPI, titled “A Database for the Radio Frequency Fingerprinting of Bluetooth Devices.”
Paper link: https://www.mdpi.com/2306-5729/5/2/55
Dataset link: https://doi.org/10.5281/zenodo.3876140
Bluetooth (BT) signals, sampling rate 250 Msps.
13 device models from 5 brands; about 1,950 recordings from 33 phones.
Data split: 80% train – 20% test.
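
One way to realize an 80/20 split is a per-class (stratified) split, sketched below. The source does not state whether the split was stratified or how recordings from the same physical phone were distributed, so this is only an assumption; the function name and the balanced placeholder labels (13 classes with 150 recordings each) are hypothetical.

```python
import numpy as np

def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split indices so each class keeps roughly the same test fraction."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    train_idx, test_idx = [], []
    for cls in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == cls))
        n_test = int(round(test_fraction * len(idx)))
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return np.array(train_idx), np.array(test_idx)

# Placeholder labels: 13 device models x 150 recordings = 1950 recordings.
labels = np.repeat(np.arange(13), 150)
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

Splitting by recording can place captures from the same phone in both sets; a per-phone split would be a stricter test of generalization.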

Device list: iPhone 4s, 5, 5s, 6, 6s, 7, 7 Plus, LG G4, Samsung Note 2, Note 3, S3, J7, Xiaomi Mi 6.

Machine Learning Classification

Models: SVM, LDA, Decision Tree, Random Forest, XGBoost, CatBoost, Gradient Boosting.
Libraries: scikit-learn, XGBoost, CatBoost.
All models use default hyperparameters.
Training and evaluation use the same dataset for all models.
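
The comparison protocol can be sketched with scikit-learn as below. This is an illustration only: the data is synthetic (generated with `make_classification`), XGBoost and CatBoost are left out to keep the example dependency-light, and `random_state` is fixed purely for reproducibility, so the printed numbers say nothing about the real dataset.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a hand-crafted feature table with 13 classes.
X, y = make_classification(
    n_samples=650, n_features=20, n_informative=10,
    n_classes=13, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Default hyperparameters everywhere, apart from the fixed seeds.
models = {
    "SVM": SVC(),
    "LDA": LinearDiscriminantAnalysis(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    results[name] = (model.score(X_train, y_train), model.score(X_test, y_test))
    print(f"{name:18s} train={results[name][0]:.3f} test={results[name][1]:.3f}")
```

Reporting train and test accuracy side by side, as here, is what exposes the gap between memorizing the training set and generalizing.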

RiftNet Classification

Trained for 100 epochs.
Optimizer: Adam, learning rate = 1e-4.
Loss function: cross-entropy.

5. Result

Machine Learning Classification

Instantaneous features (train / test accuracy):
- SVM: 67.42% / 42.22% → low effectiveness.
TFED features (train / test accuracy):
- LDA: 73.64% / 72.93%
- Decision Tree: 100% / 56.57%
- Random Forest: 100% / 69.19%
- XGBoost: 100% / 70.61%
- CatBoost: 99.12% / 72.12%
- Gradient Boosting: 99.55% / 65.35%

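👉 Most models overfit: the tree-based models reach about 99-100% training accuracy but only 56-72% on the test set, while LDA shows almost no gap but stays near 73%. The main cause is the difficulty of detecting the transient stage accurately, which leaves the hand-crafted features (instantaneous, TFED) insufficient.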
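
The size of the overfitting gap can be read off directly from the reported accuracies. The short script below only does that arithmetic on the numbers listed above; the dictionary layout and variable names are for illustration.

```python
# (train %, test %) as reported in this section.
reported = {
    "SVM (instantaneous)": (67.42, 42.22),
    "LDA": (73.64, 72.93),
    "Decision Tree": (100.0, 56.57),
    "Random Forest": (100.0, 69.19),
    "XGBoost": (100.0, 70.61),
    "CatBoost": (99.12, 72.12),
    "Gradient Boosting": (99.55, 65.35),
}

# Train-minus-test gap in percentage points.
gaps = {name: round(train - test, 2) for name, (train, test) in reported.items()}

for name, gap in sorted(gaps.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:20s} gap = {gap:5.2f} points")

largest = max(gaps, key=gaps.get)
smallest = min(gaps, key=gaps.get)
```

The Decision Tree has the largest gap and LDA the smallest, which matches the pattern of flexible models memorizing the training features.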

RiftNet Classification

Figures: training loss curve, train accuracy curve, test accuracy curve, test confusion matrix.

Model converges quickly and stably: loss drops sharply in early epochs.
Performance improves steadily on both train and test.
Best result at epoch 96:
- Train accuracy: 99.4%
- Test accuracy: 96.57%
The slight fluctuations in test accuracy are likely due to the small size and limited diversity of the test set, which can cause a distribution shift relative to the training data.

👉 RiftNet outperforms traditional machine learning models, demonstrating deep learning’s power in RFF identification.

Metric Learning in Open Set

t-SNE plot of the latent vectors

After metric learning, the latent space forms clear clusters corresponding to known users.
- Devices of the same class group tightly.
- Different classes are well separated.
- Previously unseen devices appear as outliers, easily identifiable.

Open-set problem on test set (with unknown users)

Tested on a mixed dataset (known + unknown devices):
- The confusion matrix shows accurate and stable predictions for the known classes.
- Most misclassifications involve the unknown class, which is expected in an open-set setting.
- With a Euclidean-distance threshold of 1.0, the system achieves 97.05% accuracy in separating known from unknown devices.

👉 These results indicate that RiftNet with metric learning generalizes well and has strong potential for device authentication using RF signals in open-set scenarios.

Paper published in APSIPA ASC 2025, check here.

Code used in the study is available in this repository, check here.

Machine Learning, Deep Learning, Radio Frequency Fingerprinting

This post is licensed under CC BY 4.0 by the author.
