A deep learning-based RNNs model for automatic security audit of short messages

Title: A deep learning-based RNNs model for automatic security audit of short messages
Publication Type: Conference Paper
Year of Publication: 2016
Authors: You, L., Li, Y., Wang, Y., Zhang, J., Yang, Y.
Conference Name: 2016 16th International Symposium on Communications and Information Technologies (ISCIT)
ISBN Number: 978-1-5090-4099-5
Keywords: artificial neural network, Artificial neural networks, bag of words, binary classification, Collaboration, Deep Learning, deep learning-based recurrent neural networks, deep learning-based RNN model, electronic messaging, feature extraction, governance, Government, learning (artificial intelligence), Logic gates, message authentication, pattern classification, police data processing, policy, policy-based governance, prisons, pubcrawl, recurrent neural nets, recurrent neural networks (RNNs), Resiliency, security, Semantics, sentence feature vector, sentiment analysis, sentiment classification, short message security audit, short messages automatic security audit, Support vector machines, text categorization, text classification, Training, Vectors, word order information, Word2Vec

Traditional text classification methods usually follow this process: a sentence is treated as a bag of words (BOW) and transformed into a sentence feature vector, which is then classified by a method such as maximum entropy (ME), Naive Bayes (NB), or support vector machines (SVM). However, these methods often do not yield ideal results when applied to text classification. The main reason is that the semantic relations between words are very important for text categorization, yet the traditional methods cannot capture them. Sentiment classification, as a special case of text classification, is a binary classification task (positive or negative). Inspired by sentiment analysis, we use a novel deep learning-based recurrent neural networks (RNNs) model for the automatic security audit of short messages from prisons, which classifies short messages as secure or non-secure. In this paper, the features of short messages are extracted by word2vec, which captures word order information, and each sentence is mapped to a feature vector. In particular, words with similar meanings are mapped to nearby positions in the vector space, and the resulting vectors are then classified by RNNs. RNNs are now widely used, and their network structure allows them to process sequence data easily. We preprocess short messages, extract typical features from existing secure and non-secure short messages via word2vec, and classify short messages through RNNs, which accept a fixed-sized vector as input and produce a fixed-sized vector as output. The experimental results show that the RNNs model achieves an average accuracy of 92.7%, which is higher than that of SVM.
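The contrast the abstract draws between bag-of-words features and a recurrent model can be illustrated with a minimal sketch. This is not the authors' implementation: the toy sentences, the random vectors standing in for word2vec embeddings, the untrained random weights, and the names `bag_of_words`, `make_embeddings`, and `rnn_score` are all assumptions made for illustration. It shows only that two sentences with identical word counts are indistinguishable to BOW, while a simple Elman-style RNN that reads the word vectors in order generally assigns them different scores.

```python
import math
import random
from collections import Counter


def bag_of_words(sentence):
    """Count tokens while discarding their order (the BOW representation)."""
    return Counter(sentence.lower().split())


def make_embeddings(vocab, dim, rng):
    """Hypothetical stand-in for word2vec: one random vector per word."""
    return {word: [rng.uniform(-1, 1) for _ in range(dim)] for word in vocab}


def rnn_score(tokens, emb, w_xh, w_hh, w_out):
    """Forward pass of a simple Elman RNN followed by a sigmoid output.

    The hidden state is updated once per token, so the final score
    depends on the order in which the words are read.
    """
    hidden = [0.0] * len(w_hh)
    for token in tokens:
        x = emb[token]
        hidden = [
            math.tanh(
                sum(w_xh[i][j] * x[j] for j in range(len(x)))
                + sum(w_hh[i][k] * hidden[k] for k in range(len(hidden)))
            )
            for i in range(len(hidden))
        ]
    logit = sum(w * h for w, h in zip(w_out, hidden))
    return 1.0 / (1.0 + math.exp(-logit))


# Toy sentences (assumed for illustration): same words, different order.
sentence_a = "dog bites man"
sentence_b = "man bites dog"

# BOW cannot tell the two sentences apart.
same_bag = bag_of_words(sentence_a) == bag_of_words(sentence_b)

# Untrained, randomly initialised parameters (assumption: a real model
# would learn these from labelled secure / non-secure messages).
rng = random.Random(0)
dim, hidden_size = 4, 3
emb = make_embeddings(set(sentence_a.split()), dim, rng)
w_xh = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(hidden_size)]
w_hh = [[rng.uniform(-1, 1) for _ in range(hidden_size)] for _ in range(hidden_size)]
w_out = [rng.uniform(-1, 1) for _ in range(hidden_size)]

score_a = rnn_score(sentence_a.split(), emb, w_xh, w_hh, w_out)
score_b = rnn_score(sentence_b.split(), emb, w_xh, w_hh, w_out)

print("identical bag of words:", same_bag)
print("RNN scores:", score_a, score_b)
```

Because the weights here are random, the scores carry no meaning as security labels; the point is only that the recurrent update makes the output sensitive to word order, which a bag-of-words vector is not.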

Citation Key: you_deep_2016