A Data Streaming Approach to Link Mining in Criminal Networks

Title: A Data Streaming Approach to Link Mining in Criminal Networks
Publication Type: Conference Paper
Year of Publication: 2017
Authors: Marciani, G., Porretta, M., Nardelli, M., Italiano, G. F.
Conference Name: 2017 5th International Conference on Future Internet of Things and Cloud Workshops (FiCloudW)
Date Published: August
Keywords: Apache Flink framework, Big Data analytics, big data security metrics, criminal actions, criminal law, criminal link detection, criminal networks, data mining, data stream processing, data streaming approach, evolving criminal network, flexible data stream processing application, Indexes, law enforcement agencies, link mining, Measurement, Metrics, pattern discovery, potential links, pubcrawl, public administration, Real-time Systems, Resiliency, Resource management, Scalability, security, similarity social network metrics, Social network services, social networking (online), social networks, social networks analysis, stream processing approach, time-sensible scenarios

The ability to discover patterns of interest in criminal networks can support and ease the investigation tasks of security and law enforcement agencies. By considering criminal networks as a special case of social networks, we can reuse most of the state-of-the-art techniques to discover patterns of interest, i.e., hidden and potential links. Nevertheless, in time-sensitive scenarios, such as those involving criminal actions, the ability to discover patterns in (near) real time can be of primary importance. In this paper, we investigate the identification of patterns for link detection and prediction on an evolving criminal network. To extract valuable information as soon as data is generated, we exploit a stream processing approach. To this end, we also propose three new social network similarity metrics, specifically tailored for criminal link detection and prediction. Then, we develop a flexible data stream processing application relying on the Apache Flink framework; this solution allows us to deploy and evaluate the newly proposed metrics as well as those existing in the literature. The experimental results show that the new metrics we propose can reach up to 83% accuracy in detection and 82% accuracy in prediction, making them competitive with state-of-the-art metrics.
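
As an illustration of the general idea of scoring potential links on an evolving network as edges arrive, the following is a minimal sketch in plain Python. It is not the authors' implementation: it does not use Apache Flink, and because the abstract does not define the three new metrics, it substitutes the classical Jaccard coefficient over neighbor sets. The class name `StreamingLinkScorer`, the `threshold` parameter, and the node labels are hypothetical and chosen only for this example.

```python
from collections import defaultdict


class StreamingLinkScorer:
    """Hypothetical helper: keeps adjacency sets for an undirected graph and,
    on each new edge, rescores unlinked node pairs that just gained a common
    neighbor, using the Jaccard coefficient (an assumption, not the paper's
    metrics)."""

    def __init__(self, threshold=0.5):
        self.neighbors = defaultdict(set)  # node -> set of adjacent nodes
        self.threshold = threshold         # minimum score to report a pair

    def jaccard(self, u, v):
        """|N(u) & N(v)| / |N(u) | N(v)|, or 0.0 when both sets are empty."""
        nu = self.neighbors.get(u, set())
        nv = self.neighbors.get(v, set())
        union = nu | nv
        if not union:
            return 0.0
        return len(nu & nv) / len(union)

    def process(self, u, v):
        """Ingest one edge (u, v) and return {pair: score} for unlinked pairs
        affected by this update whose score reaches the threshold."""
        self.neighbors[u].add(v)
        self.neighbors[v].add(u)
        candidates = {}
        # After the update, a and every other neighbor w of b share b.
        for a, b in ((u, v), (v, u)):
            for w in self.neighbors[b]:
                if w == a or w in self.neighbors[a]:
                    continue  # skip self-pairs and pairs already linked
                score = self.jaccard(a, w)
                if score >= self.threshold:
                    candidates[tuple(sorted((a, w)))] = score
        return candidates


if __name__ == "__main__":
    scorer = StreamingLinkScorer(threshold=0.5)
    for edge in [("a", "c"), ("b", "c"), ("a", "d"), ("b", "d")]:
        print(edge, scorer.process(*edge))
```

Only the pairs touched by the incoming edge are rescored, which keeps the per-event work local to the neighborhood of the new edge; a distributed stream processor would additionally need to partition this state across workers, which this sketch does not attempt.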

Citation Key: marciani_data_2017