Host-based Intrusion Detection System (HIDS)

Intrusion Detection using DistilBERT

This application demonstrates a machine learning-based intrusion detection system trained on the ADFA-LD dataset.

Model Repository: salsazufar/distilbert-base-hids-adfa

The system processes system calls through three stages:

1. Preprocessing - Converts the raw system call sequence into 18-gram sliding windows
2. Inference - Classifies each window as Normal or Attack using the DistilBERT model
3. Aggregation - Determines the final detection result from all window predictions
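
As a rough illustration of the preprocessing stage, the sketch below splits a sequence into 18-gram windows with a stride of 1. The function name sliding_windows and the handling of sequences shorter than one window are assumptions made for this example; the application's actual code may differ.

```python
from typing import List

WINDOW_SIZE = 18  # n-gram length stated in the text
STRIDE = 1        # stride stated in the Detection Summary


def sliding_windows(calls: List[int],
                    size: int = WINDOW_SIZE,
                    stride: int = STRIDE) -> List[List[int]]:
    """Return every full window of `size` consecutive system calls."""
    if size <= 0 or stride <= 0:
        raise ValueError("size and stride must be positive")
    # Assumption: a sequence shorter than one window yields no windows.
    # How the application treats short inputs is not stated in the source.
    return [calls[i:i + size]
            for i in range(0, len(calls) - size + 1, stride)]


if __name__ == "__main__":
    windows = sliding_windows(list(range(80)))
    print(len(windows))  # 80 - 18 + 1 = 63 windows
```

With these settings, an 80-call input such as the sample sequence produces 63 overlapping windows.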

Detection Strategy: If any window is classified as Attack, the final result is Attack.
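
The detection strategy amounts to a logical OR over the per-window predictions. A minimal sketch follows, assuming each window prediction is available as the string "Normal" or "Attack"; the label strings actually returned by the model and the function name aggregate are hypothetical.

```python
from typing import Iterable

ATTACK = "Attack"  # assumed label string, matching the wording in the text
NORMAL = "Normal"  # assumed label string, matching the wording in the text


def aggregate(window_labels: Iterable[str]) -> str:
    """Return "Attack" if any window is labelled Attack, otherwise "Normal"."""
    # Assumption: an empty list of predictions is reported as Normal.
    return ATTACK if any(label == ATTACK for label in window_labels) else NORMAL


if __name__ == "__main__":
    print(aggregate(["Normal", "Normal", "Attack", "Normal"]))  # Attack
    print(aggregate(["Normal", "Normal"]))                      # Normal
```

Under this rule a single misclassified window is enough to flag the whole sequence, so the per-window results are worth inspecting alongside the final verdict.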

Enter system calls as space-separated integers
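
For reference, input in this format could be parsed as in the sketch below. The function name parse_syscalls and the decision to reject anything other than non-negative integers are assumptions for illustration, not a description of the application's validation.

```python
from typing import List


def parse_syscalls(text: str) -> List[int]:
    """Parse a whitespace-separated string of system call numbers."""
    calls = []
    for token in text.split():
        # Assumption: only plain non-negative decimal integers are valid.
        if not (token.isascii() and token.isdigit()):
            raise ValueError(f"not a system call number: {token!r}")
        calls.append(int(token))
    return calls


if __name__ == "__main__":
    print(parse_syscalls("6 11 45 33 192"))  # [6, 11, 45, 33, 192]
```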

Sample data: Normal (25 calls) → Attack (30 calls) → Normal (25 calls)

Detection Summary

The model uses 18-gram sliding windows with stride=1, so consecutive windows overlap and every position in the sequence is covered.
Each window is classified independently, so the result can be inspected window by window.
The aggregation strategy flags the entire sequence as Attack if any window is classified as Attack.
The sample sequence demonstrates a transition from Normal → Attack → Normal behavior.

Developed as part of thesis research | Model trained on ADFA-LD dataset