ABOUT

Cyber-attacks on schools, hospitals, critical infrastructure, and businesses are growing in number. As the global community pays increasing attention to the impact of cybersecurity events, several efforts have been made to mitigate their effects. Machine learning has played a key role as a cyber defense mechanism. However, much remains to be done, and greater effort is needed to enable the use of machine learning at the speed and scale required to detect and respond to new and emerging cybersecurity threats.

One of the critical issues preventing the development of robust, efficient, and trustworthy models for cybersecurity is related to data difficulties and other domain-specific constraints and requirements. For instance, many real-world cybersecurity applications involve developing predictive models that either use data sets with a highly imbalanced distribution of the target variable or are deployed in environments where the target variable distribution is skewed. The scarcest values are typically the most relevant to end users, and the models are required to excel at capturing these rare values. This scenario is common in cybersecurity problems such as intrusion detection, fraudulent transaction detection in cryptocurrency graphs, vulnerability detection in source code, phishing detection, and malware detection.

The imbalance problem has been studied thoroughly for almost three decades. However, the research community has given it little attention in a cybersecurity context. The cybersecurity field faces a critical challenge in the gap between research and practice, especially where machine learning methods are used. In fact, several works that build predictive models in a cybersecurity context either disregard the imbalance issue completely or address it only superficially. Other difficulties contribute to this, such as the limited availability of high-quality public data for cybersecurity research. The lack of data that is publicly available, up to date, and representative of cybersecurity problems is thus another critical challenge in the domain.

Tackling the issues raised by imbalanced domains in cybersecurity is highly relevant to both academia and industry. This workshop aims to contribute to the problem of learning with imbalanced domains and to other challenges in cybersecurity. The main goal is to raise awareness of these challenges among researchers and industry and to increase interest in, and contributions to, addressing the multiple issues raised. The workshop offers a set of talks that introduce the imbalance problem and other critical challenges in cybersecurity, followed by a session dedicated to peer-reviewed contributions. The paper session will be open to any other problems related to the application of machine learning to cybersecurity.

TOPICS OF INTEREST

The research topics of interest to the MALECIC'2023 workshop include (but are not limited to) the following:

KEY DATES

PROGRAM

SUBMISSION

Proceedings. Accepted papers will be included in the conference proceedings published in the ACM Digital Library.

ORGANIZATION