David Mohaisen

Our research interests are in the areas of network security and online privacy. Our work broadly combines the design, analysis, and development of security and privacy primitives and tools for various systems. Over the past ten years, our interests have evolved to include (big and small data) security analytics, social network security and privacy, Internet security, network security, and privacy. Our research in these areas draws on exploratory, constructive, and empirical methods. A common theme in our most recent work is the use of advanced machine learning techniques for security analytics: to understand code, traffic, and infrastructure usage in real-world deployments. Our earlier work focused on understanding security issues in multiple networking contexts through design and analysis.

Adversarial and Applied Machine Learning

Conventional machine learning approaches: Until recently, the majority of our work focused on conventional machine learning approaches, including supervised and unsupervised learning, to classify and automate the labeling of threat indicators (such as domain names, binaries, and vulnerability severity labels) as well as prediction (on time-series data). The supervised learning algorithms we used include SVM, MLP, RF, and ANFIS, among others. The unsupervised learning algorithms include k-means, fuzzy c-means, and hierarchical clustering. Recent problems we solved using conventional machine learning algorithms include Internet of Things malicious software detectors, a vulnerability severity score predictor and labeling system, a semi-supervised detector of cryptojacking code (malicious code that abuses computer systems for cryptomining), a malicious webpage classification system (to annotate malicious webpages based on capabilities and compromise vector), and a vulnerability cost assessment system (by stock performance prediction using ARIMA), among others.

Deep learning approaches: As the complexity and size of the data increased, we utilized different deep learning algorithms for both feature extraction and pattern recognition. Deep learning algorithms benefit from automatic feature extraction and learning, which not only improves model performance by extracting more meaningful features but also eliminates the separate feature extraction phase of conventional machine learning algorithms, which is laborious and requires domain knowledge. We utilized convolutional neural networks to build Internet of Things malicious software detectors, an intrusion detection system for software-defined networking, website fingerprinting (for improving privacy), authorship identification (for identifying malicious code authors and document forgers), and classification of binaries.

Adversarial machine learning approaches: Adversarial learning is concerned with generating input samples that are similar to original ones (with simple perturbations) yet fool machine learning algorithms (e.g., cause misclassification). Such samples can be used for improving the robustness of machine learning algorithms, highlighting the risk of machine learning algorithms through purposeful attacks, and understanding the practical limitations of such algorithms. Algorithms used for adversarial learning include MIM, FGSM, JSMA, PGD, DeepFool, and NewtonFool, among others. Problems that benefited from adversarial machine learning approaches include generating practical malware samples that not only fool classifiers but also remain executable, intrusion detection in software-defined networks, and website fingerprinting.
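
As an illustration of how a simple perturbation can cause misclassification, the following is a minimal sketch of FGSM (one of the algorithms listed above) applied to a toy logistic regression model. The weights, input, and epsilon are hypothetical values chosen for the example; this is not the setup used in the publications below.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(w, b, x):
    """Probability of class 1 under a logistic regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def sign(v):
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def fgsm(w, b, x, y, eps):
    """One FGSM step: x_adv = x + eps * sign(dLoss/dx).

    For logistic regression with cross-entropy loss, the gradient of the
    loss with respect to the input is (p - y) * w.
    """
    p = predict_proba(w, b, x)
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]


# Hypothetical toy model and sample (assumed values, for illustration only).
w, b = [2.0, -1.0], 0.0
x, y = [0.5, 0.2], 1

x_adv = fgsm(w, b, x, y, eps=0.5)
print("original score:", predict_proba(w, b, x))
print("adversarial score:", predict_proba(w, b, x_adv))
```

Each feature moves by at most eps, yet the score crosses the 0.5 decision threshold, so the perturbed sample is assigned to the other class.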

Representative Publications:

Multi-χ: Identifying Multiple Authors from Source Code Files. M. Abuhamad, T. Abuhmed, D. Nyang, and D. Mohaisen. Privacy Enhancing Technologies Symposium (PoPETS/PETS), 2020.
Soteria: Detecting Adversarial Examples in Control Flow Graph-based Malware Classifiers. H. Alasmary, A. Abusnaina, R. Jang, M. Abuhamad, A. Anwar, D. Nyang, and D. Mohaisen. The 40th IEEE International Conference on Distributed Computing Systems (IEEE ICDCS 2020)
Large-Scale and Language-Oblivious Authorship Identification. M. Abuhamad, T. Abuhmed, A. Mohaisen, and D. Nyang. ACM SIGSAC Conference on Computer and Communications Security (ACM CCS 2018)

Secure and Reliable Systems with Blockchains

Our work on blockchains covers a range of topics, from primitives and foundations to applications and translations. More precisely, we have been leading three thrusts of research: 1) foundational and principled research into distributed systems primitives (consensus algorithms) that would ensure desirable properties in blockchain systems, such as privacy, fairness, and decentralization; 2) distributed systems requirements and their translation into a blockchain framework by combining requirements engineering and composable designs; and 3) sustainability of system properties in the new ecosystem through active measurements (predictive models) and design evolution of alternatives and trade-offs. Related to the last thrust, we have been working on understanding the abuse of blockchains through a system attack surface analysis.

Representative Publications:

Towards Characterizing Blockchain-based Cryptocurrencies for Highly-Accurate Predictions. M. Saad, J. Choi, J. Kim, D. Nyang, A. Mohaisen. IEEE Systems Journal (IEEE ISJ 2020) (Best Paper Award)
Exploring the Attack Surface of Blockchain. Muhammad Saad, Jeffrey Spaulding, L. Njilla, C. A. Kamhoua, S. Shetty, D. Nyang, D. Mohaisen. IEEE Communications Surveys and Tutorials (IEEE CS&T 2020)
Exploring Spatial, Temporal, and Logical Attacks on the Bitcoin Network. M. Saad, V. Cook, L. Nguyen, My Thai, A. Mohaisen. IEEE International Conference on Distributed Computing Systems (IEEE ICDCS 2019)

Mobile and Internet of Things Security and Privacy

Mobile security threats have recently emerged because of the fast growth in mobile technologies and the essential role that mobile devices play in our daily lives. To address these threats, and malware in particular, various techniques have been developed in the literature, including ones that utilize static, dynamic, on-device, off-device, and hybrid approaches for identifying, classifying, and defending against mobile threats. Those techniques fail at times and succeed at others, while creating a trade-off between performance and operation. To this end, we contribute several systems: Andro-AutoPsy, AndroTracker, Andro-Dumpsys, and APHunter. In summary, we design efficient and accurate techniques for detecting and classifying mobile malware, techniques for improving privacy in mobile networks, and techniques for detecting malicious hardware access points.

In our work on IoT security, we systematize the security threats in smart home networks to gain a finer understanding of them, and we propose a comprehensive multi-layer and cross-layer analysis, recommendations, and design of primitives and functions for securing the home network. Towards that, the proposed research explores a quantification of the attack surface of home network devices, network gear, and services, to guide a system-aware design of a security layer that incorporates primitives at the device, network, and service layers of the home network for intrusion detection and prevention. For prevention, the security layer features various functions such as secure naming and resolution, safe cryptographic primitives, including identification and authentication, and various other layer-specific primitives. For intrusion detection, the cornerstone of our security layer is a behavioral logging and profiling capability at the device, network, and service layers, to facilitate real-time intrusion detection and notification. In summary, in this research theme we develop algorithms to improve the efficiency, security, and operation of mobile and wireless networks, including usable authentication techniques and intrusion detection mechanisms, among others, for use in IoT applications. En route, we also explore various aspects of IoT privacy.

Representative Publications:

AUToSen: Deep Learning-based Implicit Continuous Authentication Using Smartphone Sensors. Mohammed Abuhamad, Tamer Abuhmed, David Mohaisen, and DaeHun Nyang. IEEE Internet of Things Journal (IEEE IoTJ 2020)
Catch Me If You Can: Rogue Access Point Detection Using Intentional Channel Interference. R. Jang, J. Kang, A. Mohaisen, D. Nyang. IEEE Transactions on Mobile Computing (IEEE TMC 2019)
XLF: A Cross-layer Framework to Secure the Internet of Things (IoT). An Wang, Aziz Mohaisen and Songqing Chen. 39th IEEE International Conference on Distributed Computing Systems (IEEE ICDCS 2019)

Distributed Denial of Service Attacks and Defenses

Analyzing and understanding distributed denial of service (DDoS) attacks is another thrust of our work. Enormous efforts are continuously made by both academia and industry to understand DDoS attacks and defend against them. With an ever-improving defense posture, attack strategies are constantly changing as well, making DDoS attacks some of the most severe threats on the Internet. DDoS attacks are, by nature, difficult to defend against because it is hard to know in advance 1) when an attack will be launched, 2) where the attacking machines are, 3) how many attacking machines are involved, and 4) how long an attack will last (among others). Most Internet DDoS attacks today are attributed to large, interconnected, and overly complex entities that belong to various botnets. For such botnet-based (commercialized) DDoS attacks, understanding the underlying relationships between various attacks and attackers is fundamental to defending against them. In particular, are those relationships and efforts totally random? How do the attackers manage their resources? Can we estimate attack origins, sizes, duration, start time, and magnitude based on historical data? If there are patterns in these attacks, can we learn and utilize them to improve the existing defenses? Clearly, understanding the latest attack strategies and postures is key to the success of any defense. To pursue this work, and as a starting point, we relied on 50,704 different Internet DDoS attacks across the globe, for which data was collected operationally over a seven-month period. These attacks were launched by 674 botnet generations from 23 different botnet families, with a total of 9,026 victim IPs belonging to 1,074 organizations that are collectively located in 186 countries. To sum up, we design and develop a data-driven and model-guided approach to defending against application-level DDoS attacks by botnets.

Representative Publications:

Examining the Robustness of Learning-Based DDoS Detection in Software Defined Networks. Ahmed Abusnaina, Aminollah Khormali, Daehun Nyang, Murat Yuksel and Aziz Mohaisen. The 2019 IEEE Conference on Dependable and Secure Computing (IEEE DSC 2019) (Best Paper Runner-Up)
A Data-Driven Study of DDoS Attacks and Their Dynamics. A. Wang, W. Chang, S. Chen, and A. Mohaisen. IEEE Transactions on Dependable and Secure Computing (IEEE TDSC 2018)
An Adversary-Centric Behavior Modeling of DDoS Attacks. A. Wang, A. Mohaisen and S. Chen. IEEE International Conference on Distributed Computing Systems (IEEE ICDCS 2017)

Wearable Security and Privacy

Privacy leakage from elevation profiles: The extensive use of smartphones and wearable devices has facilitated many useful applications. With Global Positioning System (GPS)-equipped smart and wearable devices, many applications can gather, process, and share rich metadata, such as geolocation, trajectories, elevation, and time. For example, fitness applications, such as Runkeeper and Strava, utilize this information for activity tracking and have recently witnessed a boom in popularity. Those fitness tracker applications have their own web platforms and allow users to share activities on such platforms, or even on other social network platforms. To preserve the privacy of users while allowing sharing, several of those platforms allow users to disclose partial information, such as the elevation profile of an activity, which supposedly would not leak the location of the users. In this work, and as a cautionary tale, we create a proof of concept in which we examine the extent to which elevation profiles can be used to predict the location of users. To tackle this problem, we devise three plausible threat settings under which the city or borough of the targets can be predicted. Those threat settings define the amount of information available to the adversary to launch the prediction attacks. Having established that simple features of elevation profiles, e.g., spectral features, are insufficient, we devise both a natural language processing (NLP)-inspired text-like representation and a computer vision-inspired image-like representation of elevation profiles, and we convert the problem at hand into text and image classification problems. We use both traditional machine learning- and deep learning-based techniques, and achieve a prediction success rate ranging from 59.59% to 95.83%. The findings are alarming and highlight that sharing elevation information may carry significant location privacy risks.
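
To give a sense of what a text-like representation of an elevation profile could look like, here is a minimal sketch. The encoding (U/D/F symbols for up, down, and flat segments, followed by character n-gram counts) and the sample values are assumptions made for illustration; the actual representation used in the work may differ.

```python
from collections import Counter


def elevation_to_symbols(profile, step=1.0):
    """Encode consecutive elevation changes as letters.

    'U' = climb of more than `step` meters, 'D' = descent of more than
    `step` meters, 'F' = roughly flat. This alphabet is a hypothetical
    choice for the sketch.
    """
    symbols = []
    for prev, cur in zip(profile, profile[1:]):
        delta = cur - prev
        if delta > step:
            symbols.append("U")
        elif delta < -step:
            symbols.append("D")
        else:
            symbols.append("F")
    return "".join(symbols)


def ngram_counts(text, n=3):
    """Count character n-grams, a common bag-of-words style text feature."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


# Hypothetical elevation samples in meters.
profile = [10.0, 12.5, 15.0, 15.2, 13.0, 10.5, 10.4]
text = elevation_to_symbols(profile)
features = ngram_counts(text)
print(text)
print(dict(features))
```

Once a profile is expressed as a string of symbols, standard text classification features such as n-gram counts can be fed to any off-the-shelf classifier.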

AR/VR Security: By enabling users to push the limits of the physical world, augmented reality (AR) and virtual reality (VR) platforms opened a new chapter in human perception. The novel immersive experiences led to the emergence of new interaction methods for virtual environments, which came with security and privacy risks that had never been considered before. In this project, we explore a spatial side-channel keylogging attack to infer user inputs typed with in-air tapping keyboards in virtual environments. We exploit the observation that hands follow certain patterns while typing in the air to initiate our attack. We introduce three plausible attack scenarios under which the adversary obtains the hand traces of the victim by planting a small hand tracker near the victim, staying in close proximity to the victim, or tricking the victim into installing a malicious application. Our five-step pipeline takes the hand traces of the victim and outputs a set of inferences ordered from best to worst. In our experiments, we achieved pinpoint accuracy ranging from 40% to 87% within at most the top-500 candidate reconstructions. We discuss possible countermeasures, while the results presented provide a cautionary tale of the potential security and privacy risks of immersive mobile technology.
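
One plausible reading of accuracy "within top-k candidates" is the fraction of typed inputs whose true value appears among the first k ranked reconstructions. The sketch below computes that quantity under this assumption; the function name and the sample candidate lists are hypothetical and are not taken from the project.

```python
def top_k_accuracy(ranked_candidates, truths, k):
    """Fraction of inputs whose true value is among the first k candidates.

    ranked_candidates[i] is a best-to-worst list of reconstructions for
    the i-th typed input, and truths[i] is what the user actually typed.
    """
    if not truths:
        raise ValueError("truths must not be empty")
    hits = sum(
        1
        for candidates, truth in zip(ranked_candidates, truths)
        if truth in candidates[:k]
    )
    return hits / len(truths)


# Hypothetical ranked reconstructions for three typed words.
ranked = [
    ["hello", "jello", "cello"],
    ["wprld", "world", "would"],
    ["pass", "bass", "mass"],
]
truths = ["hello", "world", "lass"]

for k in (1, 2, 3):
    print(k, top_k_accuracy(ranked, truths, k))
```

Under this reading, accuracy can only stay the same or increase as k grows, which is why results of this kind are reported together with the candidate budget.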

Representative Publications:

Understanding the Potential Risks of Sharing Elevation Information on Fitness Applications. Ulku Meteriz, Necip Fazil Yıldıran, Joongheon Kim, and David Mohaisen. The 40th IEEE International Conference on Distributed Computing Systems (IEEE ICDCS 2020)
Deep Fingerprinting Defender: Adversarial Learning-based Approach to Defend Against Website Fingerprinting. Ahmed Abusnaina, Rhongho Jang, Aminollah Khormali, DaeHun Nyang, and Aziz Mohaisen. In Proceedings of the 39th IEEE International Conf. on Computer Communications (IEEE INFOCOM 2020)
You are a Game Bot!: Uncovering game bots in MMORPGs via self-similarity in the wild. E. Lee, J. Woo, H. Kim, A. Mohaisen, H. Kim. ISOC Network and Distributed System Security Symposium (ISOC NDSS 2016)

Website last updated on 04/28/2019