CADUE: Content-Agnostic Detection of Unwanted Emails for Enterprise Security
- 6 October 2021
- conference paper
- Published by Association for Computing Machinery (ACM) in 24th International Symposium on Research in Attacks, Intrusions and Defenses
Abstract
End-to-end email encryption (E2EE) ensures that an email can be decrypted and read only by its intended recipients. E2EE's strong security guarantee is particularly desirable for enterprises in the event of breaches: even if attackers break into an email server, under E2EE no email contents are leaked. At the same time, E2EE makes it significantly harder for an enterprise to detect and filter unwanted emails (spam and phishing emails). Most existing solutions rely heavily on email contents (i.e., email body and attachments), which becomes difficult when those contents are encrypted. In this paper, we investigate how to detect unwanted emails in a content-agnostic manner, that is, without any access to the contents of emails. Our key observation is that the communication patterns and relationships among internal users of an enterprise contain rich and reliable information about benign email communications. By combining such information with other email metadata (headers, and subjects when available), unwanted emails can be accurately distinguished from legitimate ones without access to email contents. Specifically, we propose two types of novel enterprise features derived from enterprise email logs: sender profiling features, which capture the patterns of past emails from external senders to internal recipients; and enterprise graph features, which capture the co-recipient and sender-recipient relationships between internal users. We design a classifier that uses these features along with existing metadata features. We run extensive experiments over a real-world enterprise email dataset and show that our approach, even without any content-based features, achieves a high true positive rate of 95.2% and a low false positive rate of 0.3% under these stringent constraints.
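To make the two feature families in the abstract more concrete, the following is a minimal sketch of how one sender profiling feature and one co-recipient graph feature might be computed from email logs. It is an illustration under stated assumptions, not the authors' implementation: the log layout (`LOG`), the domain `corp.example`, the function names, and the two specific features (a count of prior emails from an external sender, and the fraction of internal recipient pairs that have previously been co-recipients) are all hypothetical choices made for this example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical log records: (sender, [recipients]). Not the paper's schema.
LOG = [
    ("alice@corp.example", ["bob@corp.example", "carol@corp.example"]),
    ("bob@corp.example", ["carol@corp.example"]),
    ("vendor@ext.example", ["bob@corp.example"]),
    ("vendor@ext.example", ["bob@corp.example"]),
]
INTERNAL_DOMAIN = "corp.example"  # assumed enterprise domain


def is_internal(addr):
    """Treat an address as internal if it belongs to the assumed domain."""
    return addr.endswith("@" + INTERNAL_DOMAIN)


def build_history(log):
    """Summarise past traffic using metadata only (no email contents).

    Returns two counters:
      sent[(external_sender, internal_recipient)] -> number of past emails
      co_recipient[(a, b)] -> times internal users a < b were co-recipients
                              of an email from an internal sender
    """
    sent = Counter()
    co_recipient = Counter()
    for sender, recipients in log:
        internal = sorted({r for r in recipients if is_internal(r)})
        if is_internal(sender):
            # Assumption: the co-recipient graph is built from internal mail.
            for pair in combinations(internal, 2):
                co_recipient[pair] += 1
        else:
            for r in internal:
                sent[(sender, r)] += 1
    return sent, co_recipient


def features(sender, recipients, sent, co_recipient):
    """Compute two illustrative content-agnostic features for a new email."""
    internal = sorted({r for r in recipients if is_internal(r)})
    prior = sum(sent[(sender, r)] for r in internal)
    pairs = list(combinations(internal, 2))
    seen = sum(1 for p in pairs if co_recipient[p] > 0)
    fraction = seen / len(pairs) if pairs else 0.0
    return {
        "prior_emails_from_sender": prior,
        "co_recipient_pair_fraction": fraction,
    }


sent, co_recipient = build_history(LOG)
known = features("vendor@ext.example", ["bob@corp.example"], sent, co_recipient)
unknown = features(
    "stranger@ext.example",
    ["bob@corp.example", "carol@corp.example"],
    sent,
    co_recipient,
)
```

In this sketch, feature dictionaries such as `known` and `unknown` would be combined with header-derived metadata features and passed to a classifier; the choice of classifier and the full feature set are described in the paper itself and are not reproduced here.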