Biblio

Found 330 results

Filters: Keyword is data mining
Sadek, Mennatallah M., Khalifa, Amal, Khafga, Doaa.  2022.  An enhanced Skin-tone Block-map Image Steganography using Integer Wavelet Transforms. 2022 5th International Conference on Computing and Informatics (ICCI). :378–384.
Steganography is the technique of hiding a confidential message in an ordinary message, where the extraction of the embedded information is done at its destination. Among the different carrier file formats, digital images are the most popular. This paper presents a Wavelet-based method for hiding secret information in digital images, where skin areas are identified and used as a region of interest. The work presented here is an extension of a method published earlier by the authors that utilized a rule-based approach to detect skin regions. The proposed method embeds the secret data into the integer Wavelet coefficients of the approximation sub-band of the cover image. When compared to the original technique, experimental results showed a lower error percentage between the skin maps detected before embedding and during extraction. This eventually increased the similarity between the original and the retrieved secret image.
Ashlam, Ahmed Abadulla, Badii, Atta, Stahl, Frederic.  2022.  A Novel Approach Exploiting Machine Learning to Detect SQLi Attacks. 2022 5th International Conference on Advanced Systems and Emergent Technologies (IC\_ASET). :513–517.
The increasing use of Information Technology applications in distributed environments is increasing security exploits. Information about vulnerabilities is also available on the open web in an unstructured format that developers can take advantage of to fix vulnerabilities in their IT applications. SQL injection (SQLi) attacks are frequently launched with the objective of exfiltrating data, typically by targeting organisations' back-end database servers to compromise their customer databases. There have been a number of high-profile attacks against large enterprises in recent years. With the ever-increasing growth of online trading, it is possible to see how SQLi attacks can continue to be one of the leading routes for cyber-attacks in the future, as indicated by findings reported in OWASP. Various machine learning and deep learning algorithms have been applied to detect and prevent these attacks. However, such preventive attempts have not limited the incidence of cyber-attacks and the resulting compromised databases, as reported by the Common Vulnerabilities and Exposures (CVE) repository. In this paper, the potential of using data mining approaches is pursued in order to enhance the efficacy of SQL injection safeguards by reducing false-positive rates in SQLi detection. The proposed approach uses CountVectorizer to extract features and then applies various supervised machine-learning models to automate the classification of SQLi. The model that returns the highest accuracy is chosen among the available models. A new model, PALOSDM (Performance Analysis and Iterative Optimisation of the SQLi Detection Model), has also been created for reducing the false-positive and false-negative rates. The detection accuracy has also been improved significantly, from a baseline of 94% up to 99%.
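As an illustration of the feature-extraction approach the abstract describes, the sketch below pairs scikit-learn's CountVectorizer with a Random Forest on a handful of made-up queries; the corpus, token pattern, and model settings are assumptions for demonstration, not the paper's PALOSDM pipeline.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer

# Toy corpus: benign queries vs. SQLi payloads (illustrative only).
queries = [
    "SELECT name FROM users WHERE id = 42",
    "SELECT price FROM products WHERE sku = 'A1'",
    "UPDATE users SET email = 'a@b.c' WHERE id = 7",
    "' OR 1=1 --",
    "admin'; DROP TABLE users; --",
    "' UNION SELECT password FROM users --",
]
labels = [0, 0, 0, 1, 1, 1]  # 0 = benign, 1 = SQLi

# Bag-of-tokens features; the pattern keeps punctuation-heavy SQL tokens intact.
vectorizer = CountVectorizer(token_pattern=r"\S+", lowercase=True)
X = vectorizer.fit_transform(queries)

clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, labels)
pred = clf.predict(vectorizer.transform(["' OR 1=1 --"]))[0]
```

On a real dataset one would hold out a test set and compare several classifiers, keeping whichever scores highest, as the paper does.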
Wang, Mei.  2022.  Big Data Analysis and Mining Technology of Smart Grid Based on Privacy Protection. 2022 6th International Conference on Computing Methodologies and Communication (ICCMC). :868—871.
Aiming at the big data security and privacy protection issues in the smart grid, this paper sorts out the current key technologies for big data security and privacy protection in smart grids and proposes a privacy-preserving association rule mining scheme for the smart grid, following a technology route of privacy-preserving big data analysis and mining. The scheme specifically analyzes the risk factors in the operation of the new power grid and discusses the information security of power grid users from the user's perspective, focusing on the protection of privacy and security by using secure multi-party computation of the support and confidence of the association rules. Privacy-preserving smart grid big data mining enables power companies to improve service quality by 7.5% without divulging customers' private information.
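The support and confidence quantities at the heart of association-rule mining can be computed as sketched below (the paper wraps this computation in secure multi-party computation, which is not shown); the smart-meter transactions here are invented for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions containing every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) over the transactions."""
    return (support(transactions, set(antecedent) | set(consequent))
            / support(transactions, antecedent))

# Toy smart-meter event sets (illustrative only).
txns = [
    {"peak_load", "evening"},
    {"peak_load", "evening", "ev_charging"},
    {"off_peak", "night"},
    {"peak_load", "evening"},
]

s = support(txns, {"peak_load", "evening"})       # rule occurs in 3 of 4
c = confidence(txns, {"peak_load"}, {"evening"})  # every peak_load is evening
```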
Da Costa, Alessandro Monteiro, de Sá, Alan Oliveira, Machado, Raphael C. S..  2022.  Data Acquisition and extraction on mobile devices-A Review. 2022 IEEE International Workshop on Metrology for Industry 4.0 & IoT (MetroInd4.0&IoT). :294—299.
Forensic Science comprises a set of technical-scientific knowledge used to solve illicit acts. The increasing use of mobile devices, in particular smartphones, as the main computing platform makes the information they hold valuable for forensics. However, the blocking mechanisms imposed by the manufacturers and the variety of models and technologies make the task of reconstructing the data for analysis challenging. It is worth mentioning that the conclusion of a case requires more than the simple identification of evidence, as it is extremely important to correlate all the data and sources obtained to confirm a suspicion or to seek new evidence. This work carries out a systematic review of the literature, identifying the different types of image acquisition and the main extraction and encryption methods used in smartphones running the Android operating system.
Sewak, Mohit, Sahay, Sanjay K., Rathore, Hemant.  2022.  X-Swarm: Adversarial DRL for Metamorphic Malware Swarm Generation. 2022 IEEE International Conference on Pervasive Computing and Communications Workshops and other Affiliated Events (PerCom Workshops). :169–174.
Advanced metamorphic malware and ransomware use techniques like obfuscation to alter their internal structure with every attack. Therefore, any signature extracted from such an attack, and used to bolster endpoint defense, cannot avert subsequent attacks. Consequently, if even a single such malware intrudes on a single device of an IoT network, it will continue to infect the entire network. Scenarios where an entire network is targeted by a coordinated swarm of such malware are not beyond imagination. The IoT era therefore also requires Industry-4.0-grade AI-based solutions against such advanced attacks. But AI-based solutions need a large repository of data extracted from similar attacks to learn robust representations, whereas developing a metamorphic malware is a very complex task that requires extreme human ingenuity. Hence, abundant metamorphic malware does not exist to train AI-based defensive solutions, and there is currently no system that could generate enough functionality-preserving metamorphic variants of multiple malware to train AI-based defensive systems. To this end, we design and develop a novel system named X-Swarm. X-Swarm uses deep policy-based adversarial reinforcement learning to generate a swarm of metamorphic instances of any malware by obfuscating them at the opcode level and ensuring that they can evade even capable, adversarial-attack-immune endpoint defense systems.
Islam, Md Rofiqul, Cerny, Tomas.  2021.  Business Process Extraction Using Static Analysis. 2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE). :1202–1204.
Business process mining of a large-scale project has many benefits, such as finding vulnerabilities, improving processes, collecting data for data science, and generating clearer and simpler representations. The general approach to process mining is to turn event data such as application logs into insights and actions. Observing logs broad enough to depict the whole business logic of a large project can become very costly due to difficult environment setup, unavailability of users, the presence of unreachable or hardly reachable log statements, etc. Using static source code analysis to extract logs and arrange them in their runtime execution order is a potential way to solve the problem and reduce the cost of business process mining.
Singh, Karan Kumar, B S, Radhika, Shyamasundar, R K.  2021.  SEFlowViz: A Visualization Tool for SELinux Policy Analysis. 2021 12th International Conference on Information and Communication Systems (ICICS). :439—444.
SELinux policies used in practice are generally large and complex. As a result, it is difficult for policy writers to completely understand a policy and ensure that it meets the intended security goals. To remedy this, we have developed a tool called SEFlowViz that helps in visualizing the information flows of a policy and thereby helps in creating flow-secure policies. The tool uses the graph database Neo4j to visualize the policy. Along with visualization, the tool also supports extracting various kinds of information regarding the policy and its components through queries. Furthermore, the tool supports the addition and deletion of rules, which is useful in converting inconsistent policies into consistent ones.
Wu, Yue-hong, Zhuang, Shen, Sun, Qi.  2020.  A Steganography Algorithm Based on GM Model of optimized Parameters. 2020 International Conference on Computer Engineering and Application (ICCEA). :384—387.
In order to improve the concealment of image steganography, a new method is proposed. The algorithm first adopts a GM(1,1) model to detect texture and edge points of the carrier image, then embeds secret information in them. A GM(1,1) model with optimized parameters can make full use of pixel information: because the pixels used are the nearest to the detected point, detection accuracy improves. The method is a kind of steganography based on the human visual system. Testing stego images with different embedding capacities indicates that the concealment and image quality of the proposed algorithm are better than those of BPCS (Bit-Plane Complexity Segmentation) and PVD (Pixel-Value Differencing), which are also based on visual characteristics.
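The GM(1,1) grey model the algorithm builds on can be fitted with ordinary least squares; the sketch below only fits and forecasts a short series, and does not reproduce the paper's texture/edge-point detector (where, roughly, a large prediction error at a pixel would flag it as an edge or texture point).

```python
import math

def gm11_fit(x0):
    """Fit a GM(1,1) grey model to a short positive series x0 and
    return a function predicting the k-th value (k is 1-based)."""
    n = len(x0)
    x1 = [sum(x0[:i + 1]) for i in range(n)]              # accumulated series
    z = [0.5 * (x1[k - 1] + x1[k]) for k in range(1, n)]  # background values
    # Least squares for a, b in the grey equation x0(k) + a*z(k) = b.
    m = n - 1
    szz = sum(v * v for v in z)
    sz = sum(z)
    sy = sum(x0[1:])
    szy = sum(zi * yi for zi, yi in zip(z, x0[1:]))
    det = m * szz - sz * sz
    a = (sz * sy - m * szy) / det
    b = (szz * sy - sz * szy) / det

    def xhat1(k):  # fitted accumulated value at index k
        return (x0[0] - b / a) * math.exp(-a * (k - 1)) + b / a

    def predict(k):
        return x0[0] if k == 1 else xhat1(k) - xhat1(k - 1)

    return predict

# A short geometric series: the model should forecast its growth closely.
predict = gm11_fit([1, 1.5, 2.25, 3.375])
```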
Nahar, Nazmun, Ahmed, Md. Kawsher, Miah, Tareq, Alam, Shahriar, Rahman, Kh. Mustafizur, Rabbi, Md. Anayt.  2021.  Implementation of Android Based Text to Image Steganography Using 512-Bit Algorithm with LSB Technique. 2021 5th International Conference on Electrical Information and Communication Technology (EICT). :1—6.
Steganography security is a major concern in today's information-driven world; the point is that communication takes place while the information it carries stays hidden. Steganography is the technique of hiding secret data within an ordinary, non-secret file, text message, or image. The technique avoids detection of the secret data, which is then extracted at its destination. The main reason for using steganography is that any secret message can be hidden behind an ordinary file. This work presents a unique technique for image steganography based on a 512-bit algorithm. Producing a secure stego image is a very challenging task. Therefore, the least-significant-bit (LSB) technique is used for constructing the stego and cover images, while data encryption and decryption are used to embed text by replacing data in the least significant bits. An Android-based interface is used for the encryption and decryption techniques evaluated in this process. Contribution: this research works with 512 bits of data simultaneously in a block cipher to reduce the time complexity of the system; the Android platform is used for the data encryption and decryption process; and the steganography model works with the stego image, which interacts with the LSB technique for data hiding.
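Stripped of the 512-bit block cipher and the Android interface, plain LSB substitution of the kind the abstract mentions works as sketched below; the flat list of 8-bit pixel values and the 2-byte length header are illustrative assumptions.

```python
def embed_lsb(pixels, message):
    """Hide message bytes in the least significant bits of pixel values."""
    data = len(message).to_bytes(2, "big") + message  # 2-byte length header
    bits = []
    for byte in data:
        bits.extend((byte >> i) & 1 for i in range(7, -1, -1))  # MSB first
    if len(bits) > len(pixels):
        raise ValueError("cover image too small for message")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit  # overwrite the LSB only
    return stego

def extract_lsb(pixels):
    """Recover the hidden message from the LSBs."""
    bits = [p & 1 for p in pixels]
    def to_bytes(bs):
        return bytes(
            sum(bit << (7 - j) for j, bit in enumerate(bs[i:i + 8]))
            for i in range(0, len(bs), 8))
    length = int.from_bytes(to_bytes(bits[:16]), "big")
    return to_bytes(bits[16:16 + 8 * length])

cover = [i % 256 for i in range(300)]   # stand-in for image pixel values
stego = embed_lsb(cover, b"secret")
```

Each stego pixel differs from its cover pixel by at most 1, which is why LSB embedding is visually imperceptible.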
Li, Chunzhi.  2021.  A Phishing Detection Method Based on Data Mining. 2021 3rd International Conference on Applied Machine Learning (ICAML). :202—205.
Data mining is a very important technology in the current era of data explosion. With the informationization of society and the transparency and openness of information, network security issues have become a focus of concern for people all over the world. This paper compares the accuracy of multiple machine learning methods and two deep learning frameworks when using lexical features to detect and classify malicious URLs. The results show that Random Forest, an ensemble learning method for classification, is superior to the 8 other machine learning methods considered in this paper. Furthermore, Random Forest is even superior to some popular deep neural network models built with well-known frameworks such as TensorFlow and PyTorch when using lexical features to detect and classify malicious URLs.
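A few of the lexical URL features commonly used in such studies can be computed as follows; this particular feature set is an assumption for illustration, not the one used in the paper.

```python
import math
from urllib.parse import urlparse

def lexical_features(url):
    """A few common lexical features used for malicious-URL detection."""
    host = urlparse(url).netloc
    def entropy(s):  # Shannon entropy of the character distribution
        if not s:
            return 0.0
        freqs = [s.count(ch) / len(s) for ch in set(s)]
        return -sum(p * math.log2(p) for p in freqs)
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_digits": sum(ch.isdigit() for ch in url),
        "num_special": sum(ch in "-_@?%=&" for ch in url),
        "has_ip_host": host.replace(".", "").isdigit(),
        "url_entropy": round(entropy(url), 3),
    }

feats = lexical_features("http://198.51.100.7/login.php?acc=update&id=9")
```

Feature dictionaries like this would then be fed to the classifiers (Random Forest and others) for training.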
Williams, Joseph, MacDermott, Áine, Stamp, Kellyann, Iqbal, Farkhund.  2021.  Forensic Analysis of Fitbit Versa: Android vs iOS. 2021 IEEE Security and Privacy Workshops (SPW). :318–326.
The Fitbit Versa is the most popular device among its predecessors and successors in the Fitbit family. Increasingly, data stored on these smart fitness devices, their linked applications, and cloud datacenters is being used for criminal convictions. There is limited research for investigators on wearable devices, specifically exploring evidence identification and methods of extraction. In this paper we present our analysis of the Fitbit Versa using Cellebrite UFED and MSAB XRY. We present a clear scope for investigation and data significance based on the findings from our experiments. The data recovery includes logical and physical extractions using devices running Android 9 and iOS 12, comparing Cellebrite and XRY capabilities. This paper discusses the databases and datatypes that can be recovered using different extraction and analysis techniques, providing a robust outlook on data availability. We also discuss the accuracy of recorded data compared to planned test instances, verifying the accuracy of individual data types. The verifiable accuracy of some datatypes could prove useful if such data were required during the evidentiary processes of a forensic investigation.
Ndemeye, Bosco, Hussain, Shahid, Norris, Boyana.  2021.  Threshold-Based Analysis of the Code Quality of High-Performance Computing Software Packages. 2021 IEEE 21st International Conference on Software Quality, Reliability and Security Companion (QRS-C). :222—228.
Many popular metrics used for quantifying the quality or complexity of a codebase (e.g., cyclomatic complexity) were developed in the 1970s or 1980s, when source code sizes were significantly smaller than they are today and before a number of modern programming language features were introduced. Thus, many of the thresholds that researchers suggested for deciding whether a given function is lacking in a given quality dimension need to be updated. In pursuit of this goal, we study a number of open-source high-performance codes, each of which has been in development for more than 15 years (a characteristic which we take to imply good design), to score them in terms of their source code quality and to relax the above-mentioned thresholds. First, we employ the LLVM/Clang compiler infrastructure, introducing a Clang AST tool to gather AST-based metrics and an LLVM IR pass for those based on a source code's static call graph. Second, we perform statistical analysis to identify reference thresholds for 22 code quality and call-graph-related metrics at a fine-grained level.
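The paper gathers its metrics with a Clang AST tool over C/C++ sources; the same McCabe-style counting can be illustrated more compactly with Python's standard ast module (the exact set of decision-point node types below is an assumption of this sketch).

```python
import ast

# McCabe's formulation: complexity = 1 + number of branch points.
_DECISIONS = (ast.If, ast.For, ast.While, ast.ExceptHandler,
              ast.BoolOp, ast.IfExp, ast.comprehension)

def cyclomatic_complexity(source):
    """Rough per-function cyclomatic complexity for Python source."""
    result = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            count = 1
            for child in ast.walk(node):
                if isinstance(child, _DECISIONS):
                    # A BoolOp with n operands adds n - 1 branch points.
                    count += (len(child.values) - 1
                              if isinstance(child, ast.BoolOp) else 1)
            result[node.name] = count
    return result

code = """
def clamp(x, lo, hi):
    if x < lo or x > hi:
        return lo if x < lo else hi
    return x
"""
scores = cyclomatic_complexity(code)
```

A threshold study like the paper's would compute such scores across entire codebases and then examine the statistical distribution of the values.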
Xu, Rong-Zhen, He, Meng-Ke.  2020.  Application of Deep Learning Neural Network in Online Supply Chain Financial Credit Risk Assessment. 2020 International Conference on Computer Information and Big Data Applications (CIBDA). :224—232.
Against the background of "Internet+", in order to deeply mine the credit risk behind online supply chain financial big data, this paper proposes an online supply chain financial credit risk assessment method based on a deep belief network (DBN). First, a deep belief network evaluation model composed of a Restricted Boltzmann Machine (RBM) and a SOFTMAX classifier is established, and performance evaluation tests on three data sets are carried out using this model. Factor analysis is used to select 8 indicators from 21; these are input into the RBM for conversion into a more scientific evaluation index, and finally into SOFTMAX for evaluation. This DBN-based method of online supply chain financial credit risk assessment is applied to an example for verification. The results show that the evaluation accuracy of the method is 96.04%, which is higher, and the method more rational, compared with the SVM and Logistic methods.
Chao, Wang, Qun, Li, XiaoHu, Wang, TianYu, Ren, JiaHan, Dong, GuangXin, Guo, EnJie, Shi.  2020.  An Android Application Vulnerability Mining Method Based On Static and Dynamic Analysis. 2020 IEEE 5th Information Technology and Mechatronics Engineering Conference (ITOEC). :599–603.
Given the respective advantages and limitations of static and dynamic analysis, the two kinds of vulnerability mining methods for Android applications, this paper proposes an Android application vulnerability mining method that combines them. First, static analysis is used to obtain the application's baseline vulnerability analysis results, and input test cases for dynamic analysis are constructed on this basis. Fuzz testing is carried out in a real-device environment, application security vulnerabilities are verified with taint analysis technology, and finally an application vulnerability report is obtained. Experimental results show that, compared with static analysis results alone, the method can significantly improve the accuracy of vulnerability mining.
Bae, Jin Hee, Kim, Minwoo, Lim, Joon S..  2021.  Emotion Detection and Analysis from Facial Image using Distance between Coordinates Feature. 2021 International Conference on Information and Communication Technology Convergence (ICTC). :494—497.
Facial expression recognition has long been established as a subject of continuous research in various fields. In this study, feature extraction was conducted by calculating the distance between facial landmarks in an image. The extracted features describing the relationship between the landmarks were analyzed and used to classify five facial expressions. We increased data and label reliability through labeling work with multiple observers. Additionally, faces were recognized from the original data, and landmark coordinates were extracted and used as features. A genetic algorithm was used to select the features that were most helpful for classification. We performed facial expression classification and analysis using the proposed method, which demonstrated its validity and effectiveness.
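The distance-based feature extraction described above can be sketched as pairwise Euclidean distances between landmark coordinates; the four landmarks and the max-distance normalization below are illustrative assumptions, and the genetic-algorithm feature selection is not shown.

```python
import math

def distance_features(landmarks):
    """Pairwise Euclidean distances between facial landmark coordinates,
    normalized by the largest distance so the features are scale-invariant."""
    dists = [
        math.dist(landmarks[i], landmarks[j])
        for i in range(len(landmarks))
        for j in range(i + 1, len(landmarks))
    ]
    largest = max(dists)
    return [d / largest for d in dists]

# Four illustrative landmarks: two eye corners and two mouth corners.
points = [(30, 40), (70, 40), (38, 80), (62, 80)]
features = distance_features(points)  # n*(n-1)/2 = 6 features
```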
Yasa, Ray Novita, Buana, I Komang Setia, Girinoto, Setiawan, Hermawan, Hadiprakoso, Raden Budiarto.  2021.  Modified RNP Privacy Protection Data Mining Method as Big Data Security. 2021 International Conference on Informatics, Multimedia, Cyber and Information System (ICIMCIS). :30–34.
Privacy-Preserving Data Mining (PPDM) has become an exciting topic in recent decades due to the growing interest in big data and data mining: a technique for securing data while still preserving the privacy within it. This paper provides an alternative perturbation-based PPDM technique, carried out by modifying the RNP algorithm. The novelty of this paper lies in modifications to several steps of the method, each with a specific purpose. The first modification narrows the selection of the disturbance value, so that the number of attributes replaced in each record line is only as many as the attributes in the original data, no more, and no repetition is needed. The second derives the perturbation function from the cumulative distribution function and uses it to find the probability distribution function, so that the selection of replacement data has a clear basis. Experimental results on twenty-five perturbed data sets show that the modified RNP algorithm balances data utility and security level through selection of appropriate disturbance and perturbation values. The level of security is measured using privacy metrics in the form of value difference, average transformation of data, and percentage of retained data. The method presented in this paper is promising for application to real data that requires privacy preservation.
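The second modification derives replacement values from a cumulative distribution function; the generic mechanism behind that step is inverse-transform sampling, sketched below with an arbitrary example CDF F(x) = x² on [0, 1] (not the distribution used in the paper).

```python
import random

def make_sampler(inverse_cdf):
    """Inverse-transform sampling: if U ~ Uniform(0, 1), then
    inverse_cdf(U) is distributed according to the chosen CDF."""
    def sample(rng):
        return inverse_cdf(rng.random())
    return sample

# Example CDF F(x) = x^2 on [0, 1]  =>  F^{-1}(u) = sqrt(u).
sample = make_sampler(lambda u: u ** 0.5)

rng = random.Random(42)
draws = [sample(rng) for _ in range(20000)]
mean = sum(draws) / len(draws)  # theory: E[X] = integral of x * 2x dx = 2/3
```

A perturbation scheme would draw replacement attribute values this way, giving the substitution a clear probabilistic basis instead of an ad hoc choice.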
He, Weiyu, Wu, Xu, Wu, Jingchen, Xie, Xiaqing, Qiu, Lirong, Sun, Lijuan.  2021.  Insider Threat Detection Based on User Historical Behavior and Attention Mechanism. 2021 IEEE Sixth International Conference on Data Science in Cyberspace (DSC). :564–569.
Insider threats cause enterprises and organizations to suffer property loss and reputational damage. User behavior analysis is the mainstream method of insider threat detection, but due to the lack of fine-grained detection and the inability to effectively capture the behavior patterns of individual users, detection accuracy and precision are insufficient. To solve this problem, this paper designs an insider threat detection method based on user historical behavior and an attention mechanism: Long Short-Term Memory (LSTM) is used to extract user behavior sequence information, Attention Based on User Historical Behavior (ABUHB) learns the differences between different users' behaviors, and Bidirectional LSTM (Bi-LSTM) learns the evolution of different user behavior patterns, finally realizing fine-grained detection of anomalous user behavior. To evaluate the effectiveness of this method, experiments are conducted on the CMU-CERT Insider Threat Dataset. The experimental results show that the effectiveness of this method is 3.1% to 6.3% higher than that of comparable models, and that it can detect insider threats in different user behaviors at fine granularity.
Zhang, Xinyuan, Liu, Hongzhi, Wu, Zhonghai.  2020.  Noise Reduction Framework for Distantly Supervised Relation Extraction with Human in the Loop. 2020 IEEE 10th International Conference on Electronics Information and Emergency Communication (ICEIEC). :1–4.
Distant supervision is a widely used data labeling method for relation extraction. While aligning a knowledge base with the corpus, distant supervision produces a mass of wrong labels, which are defined as noise. Pattern-based denoising models have achieved great progress in selecting trustable sentences (instances). However, writing relation-specific patterns relies heavily on expert knowledge and is highly labor-intensive work. To solve these problems, we propose a noise reduction framework, NOIR, to iteratively select trustable sentences with a little human help. Under the guidance of experts, the iterative process can avoid semantic drift. Besides, NOIR can help experts discover relation-specific tokens that are hard to think of. Experimental results on three real-world datasets show the effectiveness of the proposed method compared with state-of-the-art methods.
Zhang, Xiaoyu, Fujiwara, Takanori, Chandrasegaran, Senthil, Brundage, Michael P., Sexton, Thurston, Dima, Alden, Ma, Kwan-Liu.  2021.  A Visual Analytics Approach for the Diagnosis of Heterogeneous and Multidimensional Machine Maintenance Data. 2021 IEEE 14th Pacific Visualization Symposium (PacificVis). :196–205.
Analysis of large, high-dimensional, and heterogeneous datasets is challenging, as no one technique is suitable for visualizing and clustering such data in order to make sense of the underlying information. For instance, heterogeneous logs detailing machine repair and maintenance in an organization often need to be analyzed to diagnose errors and identify abnormal patterns, formalize root-cause analyses, and plan preventive maintenance. Such real-world datasets are also beset by issues such as inconsistent and/or missing entries. To conduct an effective diagnosis, it is important to extract and understand patterns from the data with support from analytic algorithms (e.g., finding that certain kinds of machine complaints occur more in the summer) while keeping the human in the loop. To address these challenges, we adopt existing techniques for dimensionality reduction (DR) and clustering of numerical, categorical, and text data dimensions, and introduce a visual analytics approach that uses multiple coordinated views to connect DR + clustering results across each kind of data dimension. To help analysts label the clusters, each clustering view is supplemented with techniques and visualizations that contrast a cluster of interest with the rest of the dataset. Our approach helps analysts make sense of machine maintenance logs and their errors; the gained insights then help them carry out preventive maintenance. We illustrate and evaluate our approach through use cases and expert studies, respectively, and discuss generalization of the approach to other heterogeneous data.
Rabbani, Mustafa Raza, Bashar, Abu, Atif, Mohd, Jreisat, Ammar, Zulfikar, Zehra, Naseem, Yusra.  2021.  Text mining and visual analytics in research: Exploring the innovative tools. 2021 International Conference on Decision Aid Sciences and Application (DASA). :1087–1091.
The aim of the study is to present an advanced overview and potential applications of the innovative tools/software/methods used for data visualization, text mining, scientific mapping, and bibliometric analysis. Text mining and data visualization have been topics of research for several years for academic researchers and practitioners. With the advancement of technology and innovation in data analysis techniques, many online and offline software tools are available for text mining and visualisation. The purpose of this study is to present an advanced overview of the latest, sophisticated, and innovative tools available for this purpose. The unique characteristic of this study is that it provides an overview, with examples, of the five most widely adopted software tools in social science research: VOSviewer, Biblioshiny, Gephi, HistCite and CiteSpace. This study will contribute to the academic literature and will help researchers and practitioners apply these tools in future research to present their findings in a more scientific manner.
Sun, Dengdi, Lv, Xiangjie, Huang, Shilei, Yao, Lin, Ding, Zhuanlian.  2021.  Salient Object Detection Based on Multi-layer Cascade and Fine Boundary. 2021 17th International Conference on Computational Intelligence and Security (CIS). :299–303.
Due to the continuous improvement of deep learning, salient object detection based on deep learning has been a hot topic in computational vision. The Fully Convolutional Network (FCN) has become the mainstream method in salient object detection. In this article, we propose a new end-to-end multi-level feature fusion module (MCFB), successfully achieving the goal of extracting rich multi-scale global information by integrating semantic and detailed information. In our module, we obtain different levels of feature maps through convolution and then cascade them, fully considering global information, to get a rough saliency image. We also propose an optimization module on top of our base module to further optimize the feature map. To obtain a clearer boundary, we use a self-defined loss function that combines Intersection-over-Union (IoU), Binary Cross-Entropy (BCE), and Structural Similarity (SSIM) losses to optimize the learning process. The module can extract global information to a greater extent while obtaining clearer boundaries. Compared with some existing representative methods, this method achieves good results.
Bothos, Ioannis, Vlachos, Vasileios, Kyriazanos, Dimitris M., Stamatiou, Ioannis, Thanos, Konstantinos Georgios, Tzamalis, Pantelis, Nikoletseas, Sotirios, Thomopoulos, Stelios C.A..  2021.  Modelling Cyber-Risk in an Economic Perspective. 2021 IEEE International Conference on Cyber Security and Resilience (CSR). :372–377.
In this paper, we present a theoretical approach to econometric modelling for the estimation of cyber-security risk, using time-series analysis methods and, alternatively, a Machine Learning (ML) based deep learning methodology. We also present work performed in the framework of the SAINT H2020 Project [1] concerning innovative data mining techniques, based on automated web scraping, for retrieving the relevant time-series data. We conclude with a review of emerging challenges in cyber-risk assessment brought by the rapid development of adversarial AI.
Nair, Viswajit Vinod, van Staalduinen, Mark, Oosterman, Dion T..  2021.  Template Clustering for the Foundational Analysis of the Dark Web. 2021 IEEE International Conference on Big Data (Big Data). :2542—2549.
The rapid rise of the Dark Web and supporting technologies has served as the backbone facilitating online illegal activity worldwide. These illegal activities, supported by anonymisation technologies such as Tor, have become increasingly elusive to law enforcement agencies. Despite several successful law enforcement operations, illegal activity on the Dark Web is still growing. There are approaches to monitor, mine, and research the Dark Web, all with varying degrees of success. Given the complexity and dynamics of the services offered, we recognize the need for in-depth analysis of the Dark Web with regard to its infrastructures, actors, types of abuse, and their relationships. This involves the challenging task of information extraction from the very heterogeneous collection of web pages that make up the Dark Web. Most providers build their services on top of standard frameworks such as WordPress, Simple Machines Forum, and phpBB, and as a result publish a significant number of pages based on similar structural and stylistic templates. We propose an efficient, scalable, repeatable, and accurate approach to clustering Dark Web pages based on those structural and stylistic features. Extracting relevant information from those clusters should make it feasible to conduct in-depth Dark Web analysis. This paper presents our clustering algorithm to accelerate information extraction and, as a result, improve the attribution of digital traces to infrastructures or individuals in the fight against cybercrime.
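One simple way to cluster pages on structural templates is to fingerprint each page by its tag frequencies and greedily group pages whose fingerprints are cosine-similar; the fingerprint, threshold, and greedy single pass below are illustrative assumptions of this sketch, not the paper's algorithm or feature set.

```python
import math
from collections import Counter
from html.parser import HTMLParser

class TagCounter(HTMLParser):
    """Collect a tag-frequency profile: a crude structural fingerprint."""
    def __init__(self):
        super().__init__()
        self.tags = Counter()
    def handle_starttag(self, tag, attrs):
        self.tags[tag] += 1

def profile(html):
    parser = TagCounter()
    parser.feed(html)
    return parser.tags

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in set(a) | set(b))
    norm = lambda v: math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm(a) * norm(b))

def cluster(pages, threshold=0.9):
    """Greedy single-pass clustering on structural similarity."""
    clusters = []  # list of (representative profile, member indices)
    for i, page in enumerate(pages):
        prof = profile(page)
        for rep, members in clusters:
            if cosine(prof, rep) >= threshold:
                members.append(i)
                break
        else:
            clusters.append((prof, [i]))
    return [members for _, members in clusters]

# Two pages sharing a template, plus one structurally different page.
page_a = "<html><body><div><h1>x</h1><p>a</p><p>b</p></div></body></html>"
page_b = "<html><body><div><h1>y</h1><p>c</p><p>d</p></div></body></html>"
page_c = ("<html><body><table><tr><td>1</td><td>2</td></tr>"
          "<tr><td>3</td></tr></table></body></html>")
groups = cluster([page_a, page_b, page_c])
```

A production system would add stylistic features (CSS classes, inline styles) and a scalable clustering algorithm in place of the greedy pass.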
Markina, Maria S., Markin, Pavel V., Voevodin, Vladislav A., Burenok, Dmitry S..  2021.  Methodology for Quantifying the Materiality of Audit Evidence Using Expert Assessments and Their Ranking. 2021 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering (ElConRus). :2390—2393.
An information security audit is a process of obtaining audit evidence and evaluating it objectively for compliance with audit criteria. Given resource constraints, when developing an audit program it is advisable to focus on obtaining evidence that has a significant impact on the audit's effectiveness. The person managing the audit program thus faces the urgent task of developing an audit program that takes into account the information content of the extracted evidence and the resource constraints. In practice, evidence cannot be evaluated correctly directly on numerical scales, so less informative scales are used instead. The purpose of this research is to develop a methodology for assessing the materiality of audit evidence using expert assessments, their statistical processing, and a transition to quantitative scales. As a result, the person managing the audit program obtains a tool for developing an effective audit program.
Zhou, Zequan, Wang, Yupeng, Luo, Xiling, Bai, Yi, Wang, Xiaochao, Zeng, Feng.  2021.  Secure Accountable Dynamic Storage Integrity Verification. 2021 IEEE SmartWorld, Ubiquitous Intelligence Computing, Advanced Trusted Computing, Scalable Computing Communications, Internet of People and Smart City Innovation (SmartWorld/SCALCOM/UIC/ATC/IOP/SCI). :440—447.
Integrity verification of cloud data is of great importance for secure and effective cloud storage, since attackers can change the data even though it is encrypted. Traditional integrity verification schemes only let the client know the integrity status of the remote data; when the data is corrupted, the system cannot hold the server accountable. Besides, almost all existing schemes assume that the users are credible, whereas, especially in a dynamic operation environment, users can deny their behaviors and let the server bear the penalty of data loss. To address these issues, we propose an accountable dynamic storage integrity verification (ADS-IV) scheme which provides means to detect or eliminate misbehavior by all participants. Meanwhile, we modify the Invertible Bloom Filter (IBF) to recover corrupted data and use the Mahalanobis distance to calculate the degree of damage. We prove that our scheme is secure under the Computational Diffie-Hellman (CDH) and Discrete Logarithm (DL) assumptions and that the audit process is privacy-preserving. The experimental results demonstrate that the computational complexity of the audit is constant; the storage overhead is O(√n), which is only 1/400 of the size of the original data; and the whole communication overhead is O(1). As a result, the proposed scheme is suitable not only for large-scale cloud data storage systems but also for systems with sensitive data, such as banking systems, medical systems, and so on.