An applied pattern-driven corpus to predictive analytics in mitigating SQL injection attack

Title: An applied pattern-driven corpus to predictive analytics in mitigating SQL injection attack
Publication Type: Conference Paper
Year of Publication: 2017
Authors: Uwagbole, S. O., Buchanan, W. J., Fan, L.
Conference Name: 2017 Seventh International Conference on Emerging Security Technologies (EST)
Keywords: artificial intelligence, back-end database, Big Data, cloud computing, cloud-hosted Web applications, Collaboration, confidential data, data mining, Data models, Human Behavior, Internet of Things, IoT, learning (artificial intelligence), learning automata, Microsoft Azure Machine Learning, ML algorithms, pattern classification, pattern-driven corpus, policy, policy-based governance, Policy-Governed Secure Collaboration, predictive analytics, privacy, pubcrawl, receiver operating characteristic curve, Resiliency, ROC curve, secure backend storage, security, security of data, smart devices, SQL, SQL Injection, SQL injection attack, SQLIA, SQLIA big data, SQLIA Data analytics, SQLIA hashing, SQLIA Pattern-driven data set, Structured Query Language, structured query language injection attack, supervised learning, supervised learning model, Support vector machines

Emerging computing relies heavily on secure backend storage for the massive volume of big data originating from sources ranging from Internet of Things (IoT) smart devices to Cloud-hosted web applications. Structured Query Language (SQL) Injection Attack (SQLIA) remains an intruder's exploit of choice for pilfering confidential data from the back-end database, with damaging ramifications. Existing approaches all predate this emerging computing context of Internet-scale big data mining and, as such, lack the ability to cope with new signatures concealed in a large volume of web requests over time. These approaches are also string-lookup techniques aimed at the on-premise application domain boundary, and are not applicable to roaming Cloud-hosted services, from the edge Software-Defined Network (SDN) to application endpoints, which receive large numbers of web request hits. A Machine Learning (ML) approach provides scalable big data mining for SQLIA detection and prevention. Unfortunately, the absence of a corpus to train a classifier is a well-known issue in SQLIA research that applies Artificial Intelligence (AI) techniques. This paper presents an application-context, pattern-driven corpus to train a supervised learning model. The model is trained with two ML algorithms, Two-Class Support Vector Machine (TC SVM) and Two-Class Logistic Regression (TC LR), implemented in Microsoft Azure Machine Learning (MAML) Studio to mitigate SQLIA. The scheme presented here then forms the subject of an empirical evaluation using the Receiver Operating Characteristic (ROC) curve.
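
The workflow summarized in the abstract (pattern-derived features, a two-class classifier, ROC-based evaluation) can be illustrated with a minimal Python sketch. It is not the paper's implementation: the paper uses Microsoft Azure Machine Learning Studio, whereas this sketch uses only the standard library and a hand-written logistic regression as a stand-in for TC LR. The regular expressions in `PATTERNS`, the toy requests, and every function name are hypothetical choices made for illustration and do not come from the paper's corpus.

```python
import math
import re

# Hypothetical SQLIA indicator patterns (illustrative only, not the paper's corpus).
PATTERNS = [
    r"'",                          # stray single quote
    r"--|#",                       # SQL comment markers
    r"\bor\b\s+\d+\s*=\s*\d+",     # tautology such as OR 1=1
    r"\bunion\b\s+\bselect\b",     # UNION-based extraction
    r";",                          # stacked queries
]


def featurize(request):
    """Map a web request string to binary pattern-match features."""
    text = request.lower()
    return [1.0 if re.search(p, text) else 0.0 for p in PATTERNS]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, features):
    """Return the estimated probability that a request is an injection."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


def train_logistic_regression(X, y, lr=0.5, epochs=500):
    """Fit a two-class logistic regression with stochastic gradient descent."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = predict(weights, bias, xi) - yi
            weights = [w - lr * err * x for w, x in zip(weights, xi)]
            bias -= lr * err
    return weights, bias


def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labeled requests (1 = injection, 0 = benign); assumed data.
train = [
    ("id=5", 0),
    ("name=alice", 0),
    ("q=summer shoes", 0),
    ("id=5' or 1=1 --", 1),
    ("id=1 union select password from users", 1),
    ("name=x'; drop table users; --", 1),
]
test = [
    ("page=2", 0),
    ("user=bob", 0),
    ("id=9' or 2=2 #", 1),
    ("item=3 union select card from payments", 1),
]

weights, bias = train_logistic_regression(
    [featurize(r) for r, _ in train], [label for _, label in train]
)
test_scores = [predict(weights, bias, featurize(r)) for r, _ in test]
auc = roc_auc([label for _, label in test], test_scores)
print("ROC AUC on toy test set:", auc)
```

The AUC here is computed through its pairwise-ranking interpretation rather than by integrating the curve; the two are equivalent. A toy set this small says nothing about real-world detection performance.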

Citation Key: uwagbole_applied_2017