Biblio

Filters: Keyword is Writing
2021-07-08
Cao, Yetong, Zhang, Qian, Li, Fan, Yang, Song, Wang, Yu.  2020.  PPGPass: Nonintrusive and Secure Mobile Two-Factor Authentication via Wearables. IEEE INFOCOM 2020 - IEEE Conference on Computer Communications. :1917–1926.
Mobile devices are promising platforms for two-factor authentication, which can improve system security and better preserve user privacy. Existing solutions usually require some form of user effort, which might seriously affect user experience and delay authentication. In this paper, we propose PPGPass, a novel mobile two-factor authentication system, which leverages Photoplethysmography (PPG) sensors in wrist-worn wearables to extract individual characteristics of PPG signals. To be both nonintrusive and secure, we design a two-stage algorithm to separate clean heartbeat signals from PPG signals contaminated by motion artifacts, which allows verifying users without requiring them to intentionally stay still during authentication. In addition, to deal with the non-cancelable nature of compromised biometrics, we design a repeatable and non-invertible method to generate cancelable feature templates as alternative credentials, which enables defense against man-in-the-middle attacks and replay attacks. To the best of our knowledge, PPGPass is the first nonintrusive and secure mobile two-factor authentication system based on PPG sensors in wearables. We build a prototype of PPGPass and evaluate the system with comprehensive experiments involving multiple participants. PPGPass can achieve an average F1 score of 95.3%, which confirms its high effectiveness, security, and usability.
Sato, Masaya, Taniguchi, Hideo, Nakamura, Ryosuke.  2020.  Virtual Machine Monitor-based Hiding Method for Access to Debug Registers. 2020 Eighth International Symposium on Computing and Networking (CANDAR). :209–214.
To secure a guest operating system running on a virtual machine (VM), a monitoring method using hardware breakpoints by a virtual machine monitor is required. However, debug registers are visible to guest operating systems; thus, malicious programs on a guest operating system can detect or disable the monitoring method. This paper presents a method to hide access to debug registers from programs running on a VM. Our proposed method detects programs' access to debug registers and disguises the access as having succeeded. The register's actual value is not visible or modifiable to programs, so the monitoring method is hidden. This paper presents the basic design and evaluation results of our method.
2021-06-01
Ghosal, Sandip, Shyamasundar, R. K..  2020.  A Generalized Notion of Non-interference for Flow Security of Sequential and Concurrent Programs. 2020 27th Asia-Pacific Software Engineering Conference (APSEC). :51–60.
For the last two decades, a wide spectrum of interpretations of non-interference (the notion of non-interference discussed in this paper enforces flow security in a program and is different from the concept of non-interference used for establishing functional correctness of parallel programs [1]) have been used in the security analysis of programs, starting with the notion proposed by Goguen & Meseguer along with arguments of its impact on security practice. While the majority of works deal with sequential programs, several researchers have extended the notion of non-interference to enforce information flow-security in non-deterministic and concurrent programs. Major efforts of generalizations are based on (i) considering input sequences as a basic unit for input/output with semantic interpretation on a two-point information flow lattice, or (ii) typing of expressions as values for reading and writing, or (iii) typing of expressions along with its limited effects. Such approaches have limited compositionality and, thus, pose issues while extending these notions for concurrent programs. Further, in a general multi-point lattice, the notion of a public observer (or attacker) is not unique as it depends on the level of the attacker and the one attacked. In this paper, we first propose a compositional variant of non-interference for sequential systems that follow a general information flow lattice and place it in the context of earlier definitions of non-interference. We show that such an extension leads to the capturing of violations of information flow security in a concrete setting of a sequential language. Finally, we generalize non-interference for concurrent programs and illustrate its use for security analysis, particularly in the cases where information is transmitted through shared variables.
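As a rough illustration of the classical two-point notion this abstract starts from (not the generalized, compositional definition the paper proposes), the sketch below brute-forces a non-interference check: a program is treated as non-interfering if varying the secret (high) input never changes the public (low) output. The function names and the tiny input domains are hypothetical and chosen only for the example.

```python
def is_noninterfering(program, low_inputs, high_inputs):
    """Return True if the public output never depends on the secret input.

    `program(low, high)` must return the value a public observer can see.
    This is an exhaustive check over small, finite domains only.
    """
    for low in low_inputs:
        outputs = {program(low, high) for high in high_inputs}
        if len(outputs) > 1:  # two secrets produced different public results
            return False
    return True


def safe(low, high):
    # The public result is computed from the public input alone.
    return low * 2


def explicit_leak(low, high):
    # Direct flow: the secret is mixed into the public result.
    return low + high


def implicit_leak(low, high):
    # Indirect flow: the secret only steers a branch, yet still leaks.
    return 1 if high > 0 else 0
```

Calling `is_noninterfering(implicit_leak, range(3), range(3))` reports a violation even though the secret is never copied directly, which is the kind of flow that typing-based approaches also have to reject.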
2021-05-03
Xu, Shenglin, Xie, Peidai, Wang, Yongjun.  2020.  AT-ROP: Using static analysis and binary patch technology to defend against ROP attacks based on return instruction. 2020 International Symposium on Theoretical Aspects of Software Engineering (TASE). :209–216.
Return-Oriented Programming (ROP) is one of the most common techniques to exploit software vulnerabilities. Although many solutions to defend against ROP attacks have been proposed, they still have various drawbacks, such as requiring additional information (source code, debug symbols, etc.), increasing program running cost, and causing program instability. In this paper, we propose a method: using static analysis and binary patch technology to defend against ROP attacks based on return instruction. According to this method, we implemented the AT-ROP tool in a Linux 64-bit system environment. Compared to existing tools, it clears the parameter registers when the function returns. As a result, it enables the binary to defend against ROP attacks based on return instruction without requiring the binary's source code. We experiment with binary challenges from CTF competitions and binary programs commonly used in the Linux environment. The results show that AT-ROP enables a binary program to defend against ROP attacks based on return instruction with a small increase in the size of the binary program and without affecting its normal execution.
2021-03-15
Staicu, C.-A., Torp, M. T., Schäfer, M., Møller, A., Pradel, M..  2020.  Extracting Taint Specifications for JavaScript Libraries. 2020 IEEE/ACM 42nd International Conference on Software Engineering (ICSE). :198–209.

Modern JavaScript applications extensively depend on third-party libraries. Especially for the Node.js platform, vulnerabilities can have severe consequences to the security of applications, resulting in, e.g., cross-site scripting and command injection attacks. Existing static analysis tools that have been developed to automatically detect such issues are either too coarse-grained, looking only at package dependency structure while ignoring dataflow, or rely on manually written taint specifications for the most popular libraries to ensure analysis scalability. In this work, we propose a technique for automatically extracting taint specifications for JavaScript libraries, based on a dynamic analysis that leverages the existing test suites of the libraries and their available clients in the npm repository. Due to the dynamic nature of JavaScript, mapping observations from dynamic analysis to taint specifications that fit into a static analysis is non-trivial. Our main insight is that this challenge can be addressed by a combination of an access path mechanism that identifies entry and exit points, and the use of membranes around the libraries of interest. We show that our approach is effective at inferring useful taint specifications at scale. Our prototype tool automatically extracts 146 additional taint sinks and 7 840 propagation summaries spanning 1 393 npm modules. By integrating the extracted specifications into a commercial, state-of-the-art static analysis, 136 new alerts are produced, many of which correspond to likely security vulnerabilities. Moreover, many important specifications that were originally manually written are among the ones that our tool can now extract automatically.

Azahari, A. M., Ahmad, A., Rahayu, S. B., Halip, M. H. Mohamed.  2020.  CheckMyCode: Assignment Submission System with Cloud-Based Java Compiler. 2020 8th International Conference on Information Technology and Multimedia (ICIMU). :343–347.
Learning the Java programming language is a basic part of the Computer Science and Engineering curriculum. A Java compiler is required for writing code and converting it to an executable format. However, some locally installed Java compilers suffer from compatibility, portability and storage space issues. These issues sometimes affect students' interest in learning and slow down the learning process. This paper addresses such problems by offering a new programming assignment submission system with a cloud-based Java compiler, known as CheckMyCode. Leveraging cloud-computing technology in terms of its availability, prevalence and affordability, CheckMyCode implements a cloud-based Java compiler as a part of the assignment management system. CheckMyCode is a cloud-based system that allows both main users, lecturers and students, to access the system via a browser on a PC or smart device. The modules of the assignment submission system with cloud compiler allow lecturers and students to manage Java programming tasks on one platform. The framework, system modules, main users and features of CheckMyCode are presented. Future study directions and new enhancements of CheckMyCode are also considered.
2021-02-22
Alzahrani, A., Feki, J..  2020.  Toward a Natural Language-Based Approach for the Specification of Decisional-Users Requirements. 2020 3rd International Conference on Computer Applications Information Security (ICCAIS). :1–6.
The number of organizations adopting the Data Warehouse (DW) technology along with data analytics in order to improve the effectiveness of their decision-making processes is steadily increasing. Despite the efforts invested, DW design remains a challenging research domain. More precisely, the design quality of the DW depends on several aspects; among them, the requirement-gathering phase is a critical and complex task. In this context, we propose a natural language (NL) template-based design approach with a twofold aim. Firstly, it facilitates the involvement of decision-makers in the early step of the DW design; indeed, using NL is a good and natural means to encourage the decision-makers to express their requirements as query-like English sentences. Secondly, our approach aims to generate a DW multidimensional schema from a set of gathered requirements (as OLAP: On-Line-Analytical-Processing queries, written according to the suggested NL templates). This approach is built around: (i) two NL-templates for specifying multidimensional components, and (ii) a set of five heuristic rules for extracting the multidimensional concepts from requirements. We are developing a software prototype that accepts the decision-makers' requirements and then automatically identifies the multidimensional components of the DW model.
2020-12-11
Dabas, K., Madaan, N., Arya, V., Mehta, S., Chakraborty, T., Singh, G..  2019.  Fair Transfer of Multiple Style Attributes in Text. 2019 Grace Hopper Celebration India (GHCI). :1–5.

To preserve anonymity and obfuscate their identity on online platforms, users may morph their text and portray themselves as a different gender or demographic. Similarly, a chatbot may need to customize its communication style to improve engagement with its audience. This manner of changing the style of written text has gained significant attention in recent years. Yet these past research works largely cater to the transfer of single style attributes. The disadvantage of focusing on a single style alone is that this often results in target text where other existing style attributes behave unpredictably or are unfairly dominated by the new style. To counteract this behavior, it would be nice to have a style transfer mechanism that can transfer or control multiple styles simultaneously and fairly. Through such an approach, one could obtain obfuscated or written text incorporated with a desired degree of multiple soft styles such as female-quality, politeness, or formalness. To the best of our knowledge, this work is the first that shows and attempts to solve the issues related to multiple style transfer. We also demonstrate that the transfer of multiple styles cannot be achieved by sequentially performing multiple single-style transfers. This is because each single style-transfer step often reverses or dominates over the style incorporated by a previous transfer step. We then propose a neural network architecture for fairly transferring multiple style attributes in a given text. We test our architecture on the Yelp dataset to demonstrate our superior performance as compared to existing one-style transfer steps performed in a sequence.

2020-10-30
Kang, Qiao, Lee, Sunwoo, Hou, Kaiyuan, Ross, Robert, Agrawal, Ankit, Choudhary, Alok, Liao, Wei-keng.  2020.  Improving MPI Collective I/O for High Volume Non-Contiguous Requests With Intra-Node Aggregation. IEEE Transactions on Parallel and Distributed Systems. 31:2682–2695.

Two-phase I/O is a well-known strategy for implementing collective MPI-IO functions. It redistributes I/O requests among the calling processes into a form that minimizes the file access costs. As modern parallel computers continue to grow into the exascale era, the communication cost of such request redistribution can quickly overwhelm collective I/O performance. This effect has been observed from parallel jobs that run on multiple compute nodes with a high count of MPI processes on each node. To reduce the communication cost, we present a new design for collective I/O by adding an extra communication layer that performs request aggregation among processes within the same compute nodes. This approach can significantly reduce inter-node communication contention when redistributing the I/O requests. We evaluate the performance and compare it with the original two-phase I/O on Cray XC40 parallel computers (Theta and Cori) with Intel KNL and Haswell processors. Using I/O patterns from two large-scale production applications and an I/O benchmark, we show our proposed method effectively reduces the communication cost and hence maintains the scalability for a large number of processes.
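The idea of aggregating requests within a node before any inter-node redistribution can be sketched without MPI. The example below is only a toy model under stated assumptions: requests are hypothetical `(offset, length)` pairs, ranks are mapped to nodes in contiguous blocks (`rank // ranks_per_node`), and "aggregation" simply means merging ranges that touch or overlap. It is not the paper's implementation.

```python
def coalesce(requests):
    """Merge (offset, length) requests that touch or overlap into larger ones."""
    merged = []
    for offset, length in sorted(requests):
        if merged and offset <= merged[-1][0] + merged[-1][1]:
            last_offset, last_length = merged[-1]
            end = max(last_offset + last_length, offset + length)
            merged[-1] = (last_offset, end - last_offset)
        else:
            merged.append((offset, length))
    return merged


def aggregate_per_node(requests_by_rank, ranks_per_node):
    """Gather the requests of all ranks on a node, then coalesce them.

    Returns {node_id: coalesced request list}; in this toy model only these
    lists would take part in the later inter-node redistribution step.
    """
    by_node = {}
    for rank, requests in requests_by_rank.items():
        by_node.setdefault(rank // ranks_per_node, []).extend(requests)
    return {node: coalesce(reqs) for node, reqs in by_node.items()}


# Four ranks, two per node, each writing interleaved 4-byte pieces.
requests_by_rank = {
    0: [(0, 4), (8, 4)],
    1: [(4, 4), (12, 4)],
    2: [(16, 4), (24, 4)],
    3: [(20, 4), (28, 4)],
}
per_node = aggregate_per_node(requests_by_rank, ranks_per_node=2)
```

In this example the eight non-contiguous requests collapse into one contiguous range per node, so fewer and larger messages would need to cross the network.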

2020-07-10
Chen, Shuo-Han, Yang, Ming-Chang, Chang, Yuan-Hao, Wu, Chun-Feng.  2019.  Enabling File-Oriented Fast Secure Deletion on Shingled Magnetic Recording Drives. 2019 56th ACM/IEEE Design Automation Conference (DAC). :1–6.

Existing secure deletion approaches are inefficient in erasing data permanently because file systems have no knowledge of the data layout on the storage device, nor is the storage device aware of file information within the file systems. This inefficiency is exaggerated on the emerging shingled magnetic recording (SMR) drive due to its inherent sequential-write constraint. On SMR drives, secure deletion requests may lead to serious write amplification and performance degradation if the data layout is not properly configured. Such observation motivates us to propose a file-oriented fast secure deletion (FFSD) strategy to alleviate the negative impacts of SMR drives' sequential-write constraint and improve the efficiency of secure deletion operations on SMR drives. A series of experiments was conducted to demonstrate the capability of the proposed strategy on improving the efficiency of secure deletion on SMR drives.

2020-03-30
Miao, Hui, Deshpande, Amol.  2019.  Understanding Data Science Lifecycle Provenance via Graph Segmentation and Summarization. 2019 IEEE 35th International Conference on Data Engineering (ICDE). :1710–1713.
Increasingly modern data science platforms today have non-intrusive and extensible provenance ingestion mechanisms to collect rich provenance and context information, handle modifications to the same file using distinguishable versions, and use graph data models (e.g., property graphs) and query languages (e.g., Cypher) to represent and manipulate the stored provenance/context information. Due to the schema-later nature of the metadata, multiple versions of the same files, and unfamiliar artifacts introduced by team members, the resulting "provenance graphs" are quite verbose and evolving; further, it is very difficult for the users to compose queries and utilize this valuable information just using standard graph query model. In this paper, we propose two high-level graph query operators to address the verboseness and evolving nature of such provenance graphs. First, we introduce a graph segmentation operator, which queries the retrospective provenance between a set of source vertices and a set of destination vertices via flexible boundary criteria to help users get insight about the derivation relationships among those vertices. We show the semantics of such a query in terms of a context-free grammar, and develop efficient algorithms that run orders of magnitude faster than state-of-the-art. Second, we propose a graph summarization operator that combines similar segments together to query prospective provenance of the underlying project. The operator allows tuning the summary by ignoring vertex details and characterizing local structures, and ensures the provenance meaning using path constraints. We show the optimal summary problem is PSPACE-complete and develop effective approximation algorithms. We implement the operators on top of Neo4j, evaluate our query techniques extensively, and show the effectiveness and efficiency of the proposed methods.
2020-02-26
Naik, Nitin, Jenkins, Paul, Savage, Nick, Yang, Longzhi.  2019.  Cyberthreat Hunting - Part 2: Tracking Ransomware Threat Actors Using Fuzzy Hashing and Fuzzy C-Means Clustering. 2019 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE). :1–6.

Threat actors are constantly seeking new attack surfaces, with ransomware being one of the most successful attack vectors that have been used for financial gain. This has been achieved through the dispersion of unlimited polymorphic samples of ransomware whilst those responsible evade detection and hide their identity. Nonetheless, every ransomware threat actor adopts some similar style or uses some common patterns in their malicious code writing, which can be significant evidence contributing to their identification. The first step in attempting to identify the source of the attack is to cluster a large number of ransomware samples based on very little or no information about the samples, so that their traits and signatures can be analysed and identified. Therefore, this paper proposes an efficient fuzzy analysis approach to cluster ransomware samples based on the combination of two fuzzy techniques: fuzzy hashing and fuzzy c-means (FCM) clustering. Unlike other clustering techniques, FCM can directly utilise similarity scores generated by a fuzzy hashing method and cluster them into similar groups without requiring additional transformational steps to obtain distance among objects for clustering. Thus, it reduces the computational overheads by utilising fuzzy similarity scores obtained at the time of initial triaging of whether the sample is known or unknown ransomware. The performance of the proposed fuzzy method is compared against k-means clustering and the two fuzzy hashing methods SSDEEP and SDHASH which are evaluated based on their FCM clustering results to understand how the similarity score affects the clustering results.
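A minimal sketch of standard fuzzy c-means may help make the clustering step concrete. The abstract does not say exactly how the similarity scores are fed to FCM, so treating each sample's row of scores as its feature vector is an assumption here, as are the hand-made similarity matrix (scores 0 to 100), the choice of initial centers, and the fuzzifier `m=2.0`.

```python
import math


def fuzzy_c_means(points, init_centers, m=2.0, iterations=50):
    """Plain fuzzy c-means; returns (centers, memberships).

    memberships[i][j] is the degree (0..1) to which point i belongs to
    cluster j; each row sums to 1.
    """
    centers = [list(c) for c in init_centers]
    memberships = []
    for _ in range(iterations):
        # Membership update: closer centers receive a larger share.
        memberships = []
        for p in points:
            dists = [math.dist(p, c) for c in centers]
            if 0.0 in dists:  # the point sits exactly on a center
                row = [1.0 if d == 0.0 else 0.0 for d in dists]
            else:
                row = [d ** (-2.0 / (m - 1.0)) for d in dists]
            total = sum(row)
            memberships.append([v / total for v in row])
        # Center update: mean of the points weighted by membership ** m.
        for j in range(len(centers)):
            weights = [row[j] ** m for row in memberships]
            total = sum(weights)
            centers[j] = [
                sum(w * p[k] for w, p in zip(weights, points)) / total
                for k in range(len(points[0]))
            ]
    return centers, memberships


# Hypothetical pairwise similarity scores for six samples from two families.
similarity = [
    [100, 90, 85, 5, 0, 10],
    [90, 100, 88, 0, 5, 5],
    [85, 88, 100, 10, 0, 0],
    [5, 0, 10, 100, 92, 87],
    [0, 5, 0, 92, 100, 90],
    [10, 5, 0, 87, 90, 100],
]
centers, memberships = fuzzy_c_means(
    similarity, init_centers=[similarity[0], similarity[3]]
)
labels = [max(range(2), key=row.__getitem__) for row in memberships]
```

Unlike k-means, every sample keeps a graded membership in every cluster; `labels` only picks the strongest one for display.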

2020-02-10
Todorov, Vassil, Taha, Safouan, Boulanger, Frédéric, Hernandez, Armando.  2019.  Improved Invariant Generation for Industrial Software Model Checking of Time Properties. 2019 IEEE 19th International Conference on Software Quality, Reliability and Security (QRS). :334–341.
Modern automotive embedded software is mostly designed using model-based design tools such as Simulink or SCADE, and source code is generated automatically from the models. Formal proof using symbolic model checking has been integrated in these tools and can provide a higher assurance by proving safety-critical properties. Our experience shows that proving properties involving time is rather challenging when they involve long durations and timers. These properties are generally not inductive and even advanced techniques such as PDR/IC3 are unable to handle them on production models in reasonable time. In this paper, we first present our industrial use case and comment on the results obtained with the existing model checkers. Then we present our invariant generator and methodology for selecting invariants according to physical dimensions. They enable the proof of properties with long-running timers. Finally, we discuss their implementation and benchmarks.
2019-12-02
Abate, Carmine, Blanco, Roberto, Garg, Deepak, Hritcu, Catalin, Patrignani, Marco, Thibault, Jérémy.  2019.  Journey Beyond Full Abstraction: Exploring Robust Property Preservation for Secure Compilation. 2019 IEEE 32nd Computer Security Foundations Symposium (CSF). :256–25615.
Good programming languages provide helpful abstractions for writing secure code, but the security properties of the source language are generally not preserved when compiling a program and linking it with adversarial code in a low-level target language (e.g., a library or a legacy application). Linked target code that is compromised or malicious may, for instance, read and write the compiled program's data and code, jump to arbitrary memory locations, or smash the stack, blatantly violating any source-level abstraction. By contrast, a fully abstract compilation chain protects source-level abstractions all the way down, ensuring that linked adversarial target code cannot observe more about the compiled program than what some linked source code could about the source program. However, while research in this area has so far focused on preserving observational equivalence, as needed for achieving full abstraction, there is a much larger space of security properties one can choose to preserve against linked adversarial code. And the precise class of security properties one chooses crucially impacts not only the supported security goals and the strength of the attacker model, but also the kind of protections a secure compilation chain has to introduce. We are the first to thoroughly explore a large space of formal secure compilation criteria based on robust property preservation, i.e., the preservation of properties satisfied against arbitrary adversarial contexts. We study robustly preserving various classes of trace properties such as safety, of hyperproperties such as noninterference, and of relational hyperproperties such as trace equivalence. This leads to many new secure compilation criteria, some of which are easier to practically achieve and prove than full abstraction, and some of which provide strictly stronger security guarantees. 
For each of the studied criteria we propose an equivalent “property-free” characterization that clarifies which proof techniques apply. For relational properties and hyperproperties, which relate the behaviors of multiple programs, our formal definitions of the property classes themselves are novel. We order our criteria by their relative strength and show several collapses and separation results. Finally, we adapt existing proof techniques to show that even the strongest of our secure compilation criteria, the robust preservation of all relational hyperproperties, is achievable for a simple translation from a statically typed to a dynamically typed language.
2019-09-30
Jiao, Y., Hohlfield, J., Victora, R. H..  2018.  Understanding Transition and Remanence Noise in HAMR. IEEE Transactions on Magnetics. 54:1–5.

Transition noise and remanence noise are the two most important types of media noise in heat-assisted magnetic recording. We examine two methods (spatial splitting and principal components analysis) to distinguish them: both techniques show similar trends with respect to applied field and grain pitch (GP). It was also found that PW50 can be affected by GP and reader design, but is almost independent of write field and bit length (larger than 50 nm). Interestingly, our simulation shows a linear relationship between jitter and PW50NSRrem, which agrees qualitatively with experimental results.

2019-09-26
Khatchadourian, R., Tang, Y., Bagherzadeh, M., Ahmed, S..  2019.  Safe Automated Refactoring for Intelligent Parallelization of Java 8 Streams. 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE). :619-630.

Streaming APIs are becoming more pervasive in mainstream Object-Oriented programming languages. For example, the Stream API introduced in Java 8 allows for functional-like, MapReduce-style operations in processing both finite and infinite data structures. However, using this API efficiently involves subtle considerations like determining when it is best for stream operations to run in parallel, when running operations in parallel can be less efficient, and when it is safe to run in parallel due to possible lambda expression side-effects. In this paper, we present an automated refactoring approach that assists developers in writing efficient stream code in a semantics-preserving fashion. The approach, based on a novel data ordering and typestate analysis, consists of preconditions for automatically determining when it is safe and possibly advantageous to convert sequential streams to parallel and unorder or de-parallelize already parallel streams. The approach was implemented as a plug-in to the Eclipse IDE, uses the WALA and SAFE analysis frameworks, and was evaluated on 11 Java projects consisting of ~642K lines of code. We found that 57 of 157 candidate streams (36.31%) were refactorable, and an average speedup of 3.49 on performance tests was observed. The results indicate that the approach is useful in optimizing stream code to their full potential.

2019-05-08
Ölvecký, M., Gabriška, D..  2018.  Wiping Techniques and Anti-Forensics Methods. 2018 IEEE 16th International Symposium on Intelligent Systems and Informatics (SISY). :000127–000132.

This paper presents the theoretical background of our main research activity, which focuses on evaluating wiping/erasure standards that are mostly implemented in software products developed specifically for data wiping. The information saved on storage devices often includes metadata and trace data. These kinds of data, among others, are very important in forensic analysis because they sometimes contain information about links to other files. Most people save sensitive information on their local storage devices and later want to securely erase these files, but this operation is often problematic. Secure file destruction is one of many anti-forensics methods. The outcome of this paper is a definition of future research activities focused on establishing a suitable digital environment. This environment will be prepared for testing and evaluating selected wiping standards and appropriate eraser software.

2019-02-22
Gaston, J., Narayanan, M., Dozier, G., Cothran, D. L., Arms-Chavez, C., Rossi, M., King, M. C., Xu, J..  2018.  Authorship Attribution vs. Adversarial Authorship from a LIWC and Sentiment Analysis Perspective. 2018 IEEE Symposium Series on Computational Intelligence (SSCI). :920-927.

Although Stylometry has been effectively used for Authorship Attribution, there is a growing number of methods being developed that allow authors to mask their identity [2, 13]. In this paper, we investigate the usage of non-traditional feature sets for Authorship Attribution. By using non-traditional feature sets, one may be able to reveal the identity of adversarial authors who are attempting to evade detection from Authorship Attribution systems that are based on more traditional feature sets. In addition, we demonstrate how GEFeS (Genetic & Evolutionary Feature Selection) can be used to evolve high-performance hybrid feature sets composed of two non-traditional feature sets for Authorship Attribution: LIWC (Linguistic Inquiry & Word Count) and Sentiment Analysis. These hybrids were able to reduce the Adversarial Effectiveness on a test set presented in [2] by approximately 33.4%.

2019-02-08
Cui, S., Asghar, M. R., Russello, G..  2018.  Towards Blockchain-Based Scalable and Trustworthy File Sharing. 2018 27th International Conference on Computer Communication and Networks (ICCCN). :1-2.

In blockchain-based systems, malicious behaviour can be detected using auditable information in transactions managed by distributed ledgers. Besides cryptocurrency, blockchain technology has recently been used for other applications, such as file storage. However, most existing blockchain-based file storage systems cannot revoke a user efficiently when multiple users have access to the same encrypted file. Instead, they need to update file encryption keys and distribute new keys to the remaining users, which significantly increases computation and bandwidth overheads. In this work, we propose a blockchain and proxy re-encryption based design for encrypted file sharing that provides distributed access control and data management. By combining blockchain with proxy re-encryption, our approach not only ensures confidentiality and integrity of files, but also provides a scalable key management mechanism for file sharing among multiple users. Moreover, by storing encrypted files and related keys in a distributed way, our method can resist collusion attacks between revoked users and distributed proxies.

2018-08-23
Vassena, M., Breitner, J., Russo, A..  2017.  Securing Concurrent Lazy Programs Against Information Leakage. 2017 IEEE 30th Computer Security Foundations Symposium (CSF). :37–52.
Many state-of-the-art information-flow control (IFC) tools are implemented as Haskell libraries. A distinctive feature of this language is lazy evaluation. In his influential paper on why functional programming matters, John Hughes proclaims: "Lazy evaluation is perhaps the most powerful tool for modularization in the functional programmer's repertoire." Unfortunately, lazy evaluation makes IFC libraries vulnerable to leaks via the internal timing covert channel. The problem arises due to sharing, the distinguishing feature of lazy evaluation, which ensures that results of evaluated terms are stored for subsequent re-utilization. In this sense, the evaluation of a term in a high context represents a side-effect that eludes the security mechanisms of the libraries. A naïve approach to prevent that consists in forcing the evaluation of terms before entering a high context. However, this is not always possible in lazy languages, where terms often denote infinite data structures. Instead, we propose a new language primitive, lazyDup, which duplicates terms lazily. By using lazyDup to duplicate terms manipulated in high contexts, we make the security library MAC robust against internal timing leaks via lazy evaluation. We show that well-typed programs satisfy progress-sensitive non-interference in our lazy calculus with non-strict references. Our security guarantees are supported by mechanized proofs in the Agda proof assistant.
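The sharing problem described in this abstract can be loosely imitated outside Haskell. The sketch below is only an analogy in Python, not the paper's calculus or the MAC library: `Thunk`, `lazy_dup`, `is_evaluated`, and `high_context` are hypothetical names, `is_evaluated` stands in for the timing difference a public observer could measure, and this `lazy_dup` is a simplification that ignores nested thunks.

```python
class Thunk:
    """A suspended computation that is evaluated at most once (call-by-need)."""

    def __init__(self, compute):
        self._compute = compute
        self._done = False
        self._value = None

    def force(self):
        # Sharing: the first force stores the result for every later user.
        if not self._done:
            self._value = self._compute()
            self._done = True
        return self._value

    def is_evaluated(self):
        return self._done

    def lazy_dup(self):
        # Fresh, unevaluated copy: forcing it does not update the original.
        return Thunk(self._compute)


def high_context(thunk, secret):
    # Secret-dependent code that may or may not force the thunk it is given.
    if secret:
        thunk.force()


# Without duplication, the secret decides whether the shared thunk is cached.
shared = Thunk(lambda: sum(range(1000)))
high_context(shared, secret=True)
leaked = shared.is_evaluated()

# With duplication, the original thunk is untouched by the high context.
protected = Thunk(lambda: sum(range(1000)))
high_context(protected.lazy_dup(), secret=True)
not_leaked = protected.is_evaluated()
```

Here `leaked` reflects the secret while `not_leaked` does not, which mirrors why duplicating terms before they are manipulated in a high context closes the channel without forcing them eagerly.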
2018-03-19
Rocha, A., Scheirer, W. J., Forstall, C. W., Cavalcante, T., Theophilo, A., Shen, B., Carvalho, A. R. B., Stamatatos, E..  2017.  Authorship Attribution for Social Media Forensics. IEEE Transactions on Information Forensics and Security. 12:5–33.

The veil of anonymity provided by smartphones with pre-paid SIM cards, public Wi-Fi hotspots, and distributed networks like Tor has drastically complicated the task of identifying users of social media during forensic investigations. In some cases, the text of a single posted message will be the only clue to an author's identity. How can we accurately predict who that author might be when the message may never exceed 140 characters on a service like Twitter? For the past 50 years, linguists, computer scientists, and scholars of the humanities have been jointly developing automated methods to identify authors based on the style of their writing. All authors possess peculiarities of habit that influence the form and content of their written works. These characteristics can often be quantified and measured using machine learning algorithms. In this paper, we provide a comprehensive review of the methods of authorship attribution that can be applied to the problem of social media forensics. Furthermore, we examine emerging supervised learning-based methods that are effective for small sample sizes, and provide step-by-step explanations for several scalable approaches as instructional case studies for newcomers to the field. We argue that there is a significant need in forensics for new authorship attribution algorithms that can exploit context, can process multi-modal data, and are tolerant to incomplete knowledge of the space of all possible authors at training time.

Thankaraj, A., Nair, A. J., Vasudevan, N., Pathari, V..  2017.  Misclassifications: The Missing Link. 2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI). :1719–1722.

The notion of style is pivotal to literature. The choice of a certain writing style moulds and enhances the overall character of a book. Stylometry uses statistical methods to analyze literary style. This work aims to build a recommendation system based on the similarity in stylometric cues of various authors. The problem at hand is closely related to the author attribution problem. The work follows a supervised approach, using an initial corpus of books labelled with their respective authors as the training set, and generates recommendations based on the misclassified books. Results in book similarity are substantiated by domain experts.

Faust, C., Dozier, G., Xu, J., King, M. C..  2017.  Adversarial Authorship, Interactive Evolutionary Hill-Climbing, and Author CAAT-III. 2017 IEEE Symposium Series on Computational Intelligence (SSCI). :1–8.

We are currently witnessing the development of increasingly effective author identification systems (AISs) that have the potential to track users across the internet based on their writing style. In this paper, we discuss two methods for providing user anonymity with respect to writing style: Adversarial Stylometry and Adversarial Authorship. With Adversarial Stylometry, a user attempts to obfuscate their writing style by consciously altering it. With Adversarial Authorship, a user can select an author cluster target (ACT) and write toward this target with the intention of subverting an AIS so that the user's writing sample will be misclassified. Our results show that Adversarial Authorship via interactive evolutionary hill-climbing outperforms Adversarial Stylometry.

Keerthana, S., Monisha, C., Priyanka, S., Veena, S..  2017.  De Duplication Scalable Secure File Sharing on Untrusted Storage in Big Data. 2017 International Conference on Information Communication and Embedded Systems (ICICES). :1–6.

Data deduplication provides many benefits, but it raises security and privacy issues because users' sensitive data is at risk of both insider and outsider attacks. Traditional encryption (secret writing), which provides data confidentiality, is incompatible with data deduplication. Traditional encryption requires different users to encrypt their data with their own keys. Thus, identical data copies of different users result in different ciphertexts, which makes deduplication impossible. Convergent encryption has been proposed to enforce data confidentiality while making deduplication feasible. It encrypts/decrypts a data copy with a convergent key, which is obtained by computing the cryptographic hash value of the content of the data copy. After key generation and encryption, the user retains the keys and sends the ciphertext to the cloud.
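The key derivation described in this abstract can be sketched as follows. This is a minimal illustration, assuming SHA-256 as the hash; the function names are hypothetical, and the SHAKE-256 keystream is only a toy stand-in for a real cipher such as AES and is not suitable for production use.

```python
import hashlib


def convergent_key(data: bytes) -> bytes:
    """Derive the key from the content itself (SHA-256 of the plaintext)."""
    return hashlib.sha256(data).digest()


def _keystream(key: bytes, length: int) -> bytes:
    # Toy deterministic keystream; a stand-in for a real cipher, not secure practice.
    return hashlib.shake_256(key).digest(length)


def encrypt(data: bytes, key: bytes) -> bytes:
    """XOR the data with a keystream derived from the key."""
    return bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))


def decrypt(ciphertext: bytes, key: bytes) -> bytes:
    """XOR is its own inverse, so decryption repeats the same operation."""
    return encrypt(ciphertext, key)
```

Because both the key and the encryption are deterministic functions of the content, two users holding the same file produce the same ciphertext, so the storage provider can deduplicate without seeing the plaintext. The same property means anyone who can guess the content can confirm the guess, a known limitation of convergent encryption.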

2018-02-27
Bours, P., Brahmanpally, S..  2017.  Language Dependent Challenge-Based Keystroke Dynamics. 2017 International Carnahan Conference on Security Technology (ICCST). :1–6.

Keystroke Dynamics can be used as an unobtrusive method to enhance password authentication, by checking the typing rhythm of the user. Fixed passwords give an attacker the possibility to learn to mimic the typing behaviour of a victim. In this paper we investigate the performance of a keystroke dynamics (KD) system when the users have to type given (English) words. Under the assumption that it is easy to type words in your native language and difficult in a foreign language, we also test the performance of such a challenge-based KD system when the challenges are not common English words, but words in the native language of the user. We collected data from participants with 6 different native language backgrounds and had them type random 8-12 character words in each of the 6 languages. The participants also typed random English words and random French words. English was assumed to be a language familiar to all participants, while French was not a native language of any participant, and most participants were most likely not fluent in French. Analysis showed that using language-dependent words gave a better performance of the challenge-based KD system compared to an all-English challenge-based system. When words in a native language were used, participants whose mother tongue was that language performed similarly to the all-English challenge-based system, but the non-native speakers had a false match rate (FMR) that was significantly lower than that of the native language speakers. We found that native Telugu speakers had an FMR of less than 1% when writing Spanish or Slovak words. We also found that duration features were best to recognize genuine users, but latency features performed best to recognize non-native impostor users.
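The duration and latency features and the FMR mentioned in this abstract can be sketched as follows. The definitions here are assumptions, since the abstract does not give them: duration is taken as release time minus press time of one key, latency as the gap between releasing one key and pressing the next, and an impostor attempt counts as a false match when its distance score is at or below a threshold. All names are hypothetical.

```python
from typing import List, Tuple

# (key, press time in ms, release time in ms); an assumed event format.
KeyEvent = Tuple[str, int, int]


def durations(events: List[KeyEvent]) -> List[int]:
    """How long each key was held down."""
    return [release - press for _, press, release in events]


def latencies(events: List[KeyEvent]) -> List[int]:
    """Gap between releasing one key and pressing the next."""
    return [nxt[1] - cur[2] for cur, nxt in zip(events, events[1:])]


def false_match_rate(impostor_distances: List[float], threshold: float) -> float:
    """Fraction of impostor attempts accepted (distance at or below threshold)."""
    accepted = sum(1 for d in impostor_distances if d <= threshold)
    return accepted / len(impostor_distances)
```

For example, typing "cat" with events ("c", 0, 80), ("a", 120, 190), ("t", 260, 350) yields three durations and two latencies. Other latency definitions, such as press-to-press, are also common in the literature.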