Explainable artificial intelligence for botnet detection in internet of things – Scientific Reports

This section introduces the methodology of this study. It proposes a conceptual framework for integrating XAI techniques into the detection of botnet activities in IoT environments. The framework aims to enhance the interpretability and transparency of the botnet detection system. By integrating XAI, we provide deeper insights into the factors that distinguish benign from malicious network traffic in the IoT environment. We describe the dataset used for evaluation, the dataset preprocessing phase, the model evaluation phase, the integration of XAI techniques, and the evaluation methodology employed to assess the interpretability and performance of the proposed framework.

Proposed framework

Figure 1 illustrates the proposed conceptual framework, which consists of four primary phases. The first phase is dataset creation, which includes six steps:

  1. Sniffing benign IoT network traffic.
  2. Injecting botnet malware.
  3. Sniffing malicious network traffic.
  4. Labelling the traffic.
  5. Applying traffic statistics.
  6. Storing the traffic statistics.

Fig. 1

Proposed XAI methodology framework.
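The dataset-creation steps above can be sketched in Python. The `Packet` record, the labels, and the 100 ms aggregation window below are illustrative stand-ins for the capture pipeline, not the authors' tooling:

```python
from dataclasses import dataclass
from statistics import mean, pstdev

# Hypothetical minimal record for one captured packet.
@dataclass
class Packet:
    timestamp: float  # seconds since capture start
    size: int         # bytes on the wire
    label: str        # "benign" or "malicious", attached per capture session

def aggregate_window(packets, start, width):
    """Compute simple traffic statistics over one time window (step 5)."""
    sizes = [p.size for p in packets if start <= p.timestamp < start + width]
    if not sizes:
        return {"count": 0, "mean": 0.0, "std": 0.0}
    return {"count": len(sizes), "mean": mean(sizes), "std": pstdev(sizes)}

# Benign traffic is sniffed first; malicious traffic is sniffed after the
# malware injection, so labels follow the capture session (steps 1-4).
capture = [Packet(0.01, 60, "benign"), Packet(0.05, 1500, "benign"),
           Packet(0.12, 60, "malicious")]
stats = aggregate_window(capture, start=0.0, width=0.1)  # one 100 ms window
```

The resulting per-window statistics would then be stored (step 6) as rows of the dataset.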

The second phase is dataset preprocessing, an essential step in preparing data for model learning. It involves cleaning, transforming, and reformatting the raw data so that it is suitable for use in an AI model.

The quality of the preprocessing step plays a crucial role in determining the performance of the machine learning algorithm. This phase involves six stages:

  1. Dataset cleansing.
  2. Analyzing the dataset using exploratory data analysis (EDA).
  3. Feature selection to retain the most relevant features.
  4. Outlier treatment.
  5. Dataset balancing.
  6. Shuffling.
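A minimal sketch of these six stages, using only the standard library on a toy list of `(feature, label)` pairs (the IQR clipping rule and the seed are illustrative choices, not the paper's settings):

```python
import random
from statistics import quantiles

def preprocess(rows, seed=42):
    """Sketch of the six preprocessing stages on (feature, label) pairs
    with a single numeric feature."""
    # 1. Cleansing: drop rows with missing values.
    rows = [(x, y) for x, y in rows if x is not None]
    # 2./3. EDA and feature selection are analysis steps; with a single
    #       feature there is nothing to drop in this toy example.
    # 4. Outlier treatment: clip values outside 1.5 * IQR.
    xs = sorted(x for x, _ in rows)
    q1, _, q3 = quantiles(xs, n=4)
    lo, hi = q1 - 1.5 * (q3 - q1), q3 + 1.5 * (q3 - q1)
    rows = [(min(max(x, lo), hi), y) for x, y in rows]
    # 5. Balancing: undersample the majority class to the minority count.
    pos = [r for r in rows if r[1] == 1]
    neg = [r for r in rows if r[1] == 0]
    n = min(len(pos), len(neg))
    rng = random.Random(seed)
    rows = rng.sample(pos, n) + rng.sample(neg, n)
    # 6. Shuffling: randomize the order before training.
    rng.shuffle(rows)
    return rows
```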

The third phase is model evaluation. It focuses on training several algorithms and comparatively evaluating them. It includes four steps:

  1. Dataset splitting into two subsets: a training set and a test set.
  2. Model training: the training set is used to train each model.
  3. Model testing: the test set is used to evaluate each model's performance on new, unseen data and to estimate its generalization error, i.e., the expected error rate on new data.
  4. Model evaluation: each model's performance is evaluated and an evaluation matrix is formed. Based on the evaluation results, the best-performing model is selected.
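The four steps can be sketched with scikit-learn (assumed available); the synthetic data, the two candidate models, and the accuracy-only comparison are placeholders, not the paper's actual configuration:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Placeholder data standing in for the preprocessed traffic dataset.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Step 1: split into training and test sets.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

models = {"DT": DecisionTreeClassifier(random_state=0),
          "RF": RandomForestClassifier(random_state=0)}
scores = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)                        # Step 2: training
    y_pred = model.predict(X_te)                 # Step 3: testing on unseen data
    scores[name] = accuracy_score(y_te, y_pred)  # Step 4: evaluation

best = max(scores, key=scores.get)  # select the best-performing model
```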

The fourth phase integrates the selected model with XAI techniques. XAI is applied to the best-performing model to add interpretability to the black-box model. Two types of XAI explanation are used: model simplification and feature relevance. Explanation by simplification provides explanations through rule extraction and distillation. Feature relevance explanation works by ranking or measuring the influence each feature has on the prediction output.

The following is the pseudo code for the proposed framework:

Algorithm: pseudocode of the proposed framework (figure not reproduced here).

Explanation of the Pseudocode:

CreateDataset: This function encapsulates the steps for creating the dataset from both benign and malicious traffic.

PreprocessDataset: This function outlines the preprocessing steps, ensuring the data is cleaned and formatted appropriately for model training.

EvaluateModels: This function describes how to split the dataset, train different models, and evaluate their performance.

IntegrateXAI: This function integrates explainable AI techniques into the best-performing model, providing interpretability.

The Main Execution section brings together all phases, leading to the final output of the best model and its explanations.
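Since the pseudocode figure is not reproduced here, the described structure can be sketched as a Python skeleton; the function bodies are placeholders that mirror the four phases, not the authors' implementation:

```python
def create_dataset():
    """Phase 1: sniff benign and malicious traffic, label it, and
    compute and store traffic statistics."""
    return [([0.1, 0.2], 0), ([0.9, 0.8], 1)]  # placeholder rows

def preprocess_dataset(dataset):
    """Phase 2: cleanse, analyze (EDA), select features, treat outliers,
    balance, and shuffle."""
    return dataset  # placeholder: no-op

def evaluate_models(dataset):
    """Phase 3: split, train candidate models, test, and pick the best."""
    return "RF"  # placeholder: the paper's best performer

def integrate_xai(best_model):
    """Phase 4: attach simplification and feature-relevance explainers."""
    return {"model": best_model, "explainers": ["LIME", "SHAP"]}

def main():
    dataset = create_dataset()
    dataset = preprocess_dataset(dataset)
    best = evaluate_models(dataset)
    return integrate_xai(best)
```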

Dataset description

This section describes the dataset used in the experimental framework. The N-BaIoT24 dataset was selected for training and evaluation purposes, as it is widely accepted as a benchmark sequential dataset. It contains realistic network traffic and a variety of attack traffic. It was proposed by Meidan et al.25, who gathered traffic from nine commercially available IoT devices authentically infected by Mirai and Bashlite malware. The devices were two smart doorbells, one smart thermostat, one smart baby monitor, four security cameras, and one webcam. Traffic data was captured during normal device operation and after infection with malware. A network sniffing utility was employed to capture the traffic in raw pcap format, typically achieved through port mirroring. Five features were extracted from the network traffic, as summarized in Table 2. For each of these five features, three or more statistical measures were computed for data aggregation, resulting in a total of 23 features. These 23 distinct features were calculated over five separate time windows (100 ms, 500 ms, 1.5 s, 10 s, and 1 min).

Table 2 Dataset attributes information.

The use of time windows makes this dataset suitable for stateful Intrusion Detection Systems (IDS), resulting in a total of 115 features. The dataset consists of instances of network traffic data categorized into three groups: normal traffic (Benign data), Bashlite infected traffic, and Mirai infected traffic. Each data instance is represented by 115 features derived from 23 different traffic characteristics across five different time frames. Table 2 provides an abstracted overview of the dataset attributes.
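The 115-feature layout can be reconstructed as a quick cross-product of statistics and time windows; the column names below are schematic, not the dataset's exact attribute names:

```python
# 23 aggregate statistics, each computed over five time windows.
windows = ["100ms", "500ms", "1.5s", "10s", "1min"]
base_features = [f"stat_{i}" for i in range(1, 24)]  # schematic names

# One column per (statistic, window) pair: 23 * 5 = 115 features.
columns = [f"{feat}_{w}" for w in windows for feat in base_features]
```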

Dataset preprocessing

The attacks executed by botnets include Scan, which aims to discover vulnerable devices; flooding, which utilizes SYN, ACK, UDP, and TCP flooding techniques; and combo attacks, which involve opening connections and sending junk data. Figure 3 illustrates the unbalanced nature of the N-BaIoT dataset. Therefore, a subset of the dataset was selected to form a balanced, binary-class labeled dataset. All instances of benign traffic, totaling 555,932 instances, were included, and the malicious traffic datasets were merged. To achieve a balanced dataset, a number of instances equal to the benign count was selected from the merged malicious traffic. As a result, the balanced dataset consisted of a total of 1,111,864 instances, as indicated in Table 3.

Table 3 Dataset balancing.

Subsequently, the dataset was randomly shuffled to randomize the order of the training data before feeding it into the learning algorithms. Shuffling is performed so that no ordering pattern in the data can influence the learning algorithm.
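The balancing-then-shuffling procedure can be sketched with the standard library; the toy counts and the seed below are illustrative, not the 555,932-instance figures from the paper:

```python
import random

def balance_and_shuffle(benign, malicious, seed=1):
    """Undersample the majority (malicious) class to the benign count,
    merge the two classes, and shuffle the result."""
    rng = random.Random(seed)
    sampled = rng.sample(malicious, len(benign))  # equal class sizes
    merged = benign + sampled
    rng.shuffle(merged)  # randomize order before training
    return merged

benign = [("b", 0)] * 5
malicious = [("m", 1)] * 20
data = balance_and_shuffle(benign, malicious)
```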

Evaluation metrics

In conducting a thorough performance evaluation, it is crucial to consider various metrics beyond just accuracy. Effective detection requires not only the accurate identification of attacks but also the reduction of false positives. The following four evaluation metrics are utilized:

Accuracy: This metric assesses the overall correctness of the classification model and is calculated as the ratio of correctly classified instances to the total number of instances.

$$Accuracy= \frac{TP+TN}{TP+TN+FP+FN}$$

Precision: This represents the ratio of true positive predictions to all positive predictions made by the model.

$$Precision= \frac{TP}{TP+FP}$$

Recall: This measures the model’s ability to correctly identify positive instances from all actual positive cases.

$$Recall (Sensitivity)= \frac{TP}{TP+FN}$$

F1 Score: This combines precision and recall into a single metric, providing a balanced assessment of the model’s performance.

$$F1 Score= \frac{2 \times Precision \times Recall}{Precision+Recall}$$

All these metrics range from 0 to 1, with higher values indicating superior classification performance. Additionally, two further performance metrics, training time and testing time, are included for a comprehensive comparative evaluation.
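The four formulas above translate directly into code; the confusion-matrix counts in the example call are made up for illustration:

```python
def metrics(tp, tn, fp, fn):
    """Compute the four evaluation metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # sensitivity
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Toy counts: 90 true positives, 85 true negatives, 15 false positives,
# 10 false negatives.
acc, prec, rec, f1 = metrics(tp=90, tn=85, fp=15, fn=10)
```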

Models evaluation

The present study builds upon our prior research2, which focused on the empirical evaluation of six tree-based algorithms: DT, RF, BMC, ADB, GDB, and XGB. The results of that study2 are shown in Table 4; the RF model outperforms the other models on all evaluation measures.

Table 4 Evaluation results.

Table 5 compares the performance of this study with a previous study22. That study was selected because it utilized the same dataset and the same number of classes. The results show that the RF algorithm outperforms the HGB algorithm adopted in the previous study on all evaluation metrics.

Table 5 Results comparison.

Integration of XAI techniques

XAI refers to the techniques and methods used to make ML models more transparent and interpretable. The goal of XAI is to provide explanations for the decisions made by AI models, enabling users to understand and trust the reasoning behind those decisions.

To enhance the interpretability and transparency of the best-performing model (the Random Forest model) in the context of botnet detection, we apply XAI techniques. Specifically, we utilize two types of XAI explanation: model simplification and feature relevance.

Model simplification explanation

It provides insight into the decision-making process of an opaque, black-box model by extracting and distilling rules. This explanation type yields a simplified representation of the complex model's behavior.

To generate simplification explanations for individual predictions made by the RF model, we employed two techniques: rule extraction and distillation, and local interpretable model-agnostic explanations (LIME).

By leveraging model simplification explanations, we can gain insights into the key factors and conditions that contribute to the model’s predictions for botnet detection in the IoT environment. This enables security analysts and domain experts to better understand the decision-making process and identify potential vulnerabilities or biases within the model.

Rule extraction and distillation techniques aim to identify a set of rules that mimic the black-box model’s behavior while maintaining a high level of interpretability. The extracted rules provide a human-understandable representation of the decision logic employed by the model.
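Distillation can be sketched by fitting a shallow decision tree on the *predictions* of a black-box model rather than on the true labels, so that its rules mimic the black-box. This is a generic sketch with scikit-learn (assumed available) on synthetic data, not the paper's exact procedure:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic stand-in for the traffic data and the black-box model.
X, y = make_classification(n_samples=400, n_features=5, random_state=0)
black_box = RandomForestClassifier(random_state=0).fit(X, y)

# Train the surrogate on the black-box's outputs, not the true labels,
# so the extracted rules approximate the black-box's decision logic.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X, black_box.predict(X))

rules = export_text(surrogate)  # human-readable if/else rules
# Fidelity: how often the surrogate agrees with the black-box.
fidelity = (surrogate.predict(X) == black_box.predict(X)).mean()
```

The depth cap trades fidelity for interpretability: a deeper surrogate mimics the black-box more closely but produces longer, harder-to-read rules.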

LIME was first introduced by Ribeiro et al.26. It aims to explain the predictions of black-box models by approximating them with interpretable models, thus providing insight into the decision-making process. LIME operates by approximating the decision boundary around a specific instance and identifying the features most influential on the model's prediction: it fits an interpretable local model around the instance of interest, which can then be used to understand the model's decision. By generating explanations in the form of feature-importance weights, LIME provides insight into the factors driving the classification outcomes and highlights the distinguishing characteristics of benign and malicious network traffic in the IoT environment.
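LIME's core idea can be reproduced from scratch in a few lines (this is a didactic sketch with NumPy, not the `lime` library): perturb the instance, query the black-box on the perturbations, and fit a distance-weighted linear model whose coefficients serve as local feature-importance weights. The stand-in model and kernel width are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def black_box(X):
    """Stand-in model: probability driven mostly by feature 0."""
    return 1 / (1 + np.exp(-(3.0 * X[:, 0] + 0.2 * X[:, 1])))

def lime_explain(x, n_samples=2000, width=1.0):
    # 1. Perturb the instance of interest.
    Z = x + rng.normal(scale=0.5, size=(n_samples, x.size))
    # 2. Query the black-box model on the perturbations.
    p = black_box(Z)
    # 3. Weight samples by proximity to x (exponential kernel).
    w = np.exp(-np.sum((Z - x) ** 2, axis=1) / width ** 2)
    # 4. Fit a weighted least-squares linear model locally.
    A = np.hstack([Z, np.ones((n_samples, 1))])
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], p * sw, rcond=None)
    return coef[:-1]  # per-feature local importance weights

weights = lime_explain(np.array([0.0, 0.0]))
```

Since the stand-in model leans on feature 0, the local weight for feature 0 comes out much larger than for feature 1, which is exactly the kind of signal a LIME explanation surfaces.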

Feature relevance explanation

It provides an understanding of the influence that each feature has on the model’s prediction outputs. This type of explanation ranks or measures the relevance of individual features in driving the model’s decision-making process.

To generate feature relevance explanations, we employed the SHapley Additive exPlanations (SHAP) technique. SHAP was first introduced by Lundberg and Lee27. The technique utilizes game-theoretic principles, assigning an importance score to each feature based on its impact on the model's predictions. The higher the importance score, the more influential the corresponding feature is in determining the output.

SHAP is based on cooperative game theory and provides a unified approach to explaining the output of any machine learning model. It assigns each feature in the input a "SHAP value," which represents the contribution of that feature to the model's prediction for a specific instance. SHAP values provide a global perspective on feature importance and also enable instance-level explanations.
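For a model with only a few features, Shapley values can be computed exactly by enumerating all feature coalitions, which makes the definition concrete (this is a didactic sketch, not the `shap` library's tree-optimized algorithm; the baseline-substitution scheme for "absent" features is one common convention):

```python
from itertools import combinations
from math import factorial

def shap_values(predict, x, baseline):
    """Exact Shapley values by enumerating all coalitions; features
    outside a coalition are replaced by their baseline value."""
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):  # coalition sizes 0 .. n-1
            for S in combinations(others, k):
                # Shapley weight for a coalition of size k.
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                phi[i] += weight * (value(set(S) | {i}) - value(set(S)))
    return phi

# Toy linear model: Shapley values recover each term's contribution.
model = lambda z: 2.0 * z[0] + 1.0 * z[1]
phi = shap_values(model, x=[1.0, 1.0], baseline=[0.0, 0.0])
# Efficiency property: phi sums to model(x) - model(baseline).
```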

By utilizing feature relevance explanations, we can identify the most critical features for detecting IoT botnet activities. This information aids in understanding the underlying characteristics and patterns associated with botnet behavior in the IoT environment. It allows the intrusion detection process to be more transparent and improves interpretability of the detection system.


source: https://www.nature.com/articles/s41598-025-90420-6

