Visible to the public Biblio

Found 1515 results

Filters: First Letter Of Title is S  [Clear All Filters]
A B C D E F G H I J K L M N O P Q R [S] T U V W X Y Z   [Show ALL]
S
Jouini, Mouna, Ben Arfa Rabai, Latifa.  2016.  A Scalable Threats Classification Model in Information Systems. Proceedings of the 9th International Conference on Security of Information and Networks. :141–144.

Threat classification is extremely important for individuals and organizations, as it is an important step towards realization of information security. In fact, with the progress of information technologies (IT) security becomes a major challenge for organizations which are vulnerable to many types of insiders and outsiders security threats. The paper deals with threats classification models in order to help managers to define threat characteristics and then protect their assets from them. Existing threats classification models are non complete and present non orthogonal threats classes. The aim of this paper is to suggest a scalable and complete approach that classifies security threat in orthogonal way.

Tahat, Amer, Joshi, Sarang, Goswami, Pronnoy, Ravindran, Binoy.  2019.  Scalable Translation Validation of Unverified Legacy OS Code. 2019 Formal Methods in Computer Aided Design (FMCAD). :1–9.

Formally verifying functional and security properties of a large-scale production operating system is highly desirable. However, it is challenging as such OSes are often written in multiple source languages that have no formal semantics - a prerequisite for formal reasoning. To avoid expensive formalization of the semantics of multiple high-level source languages, we present a lightweight and rigorous verification toolchain that verifies OS code at the binary level, targeting ARM machines. To reason about ARM instructions, we first translate the ARM Specification Language that describes the semantics of the ARMv8 ISA into the PVS7 theorem prover and verify the translation. We leverage the radare2 reverse engineering tool to decode ARM binaries into PVS7 and verify the translation. Our translation verification methodology is a lightweight formal validation technique that generates large-scale instruction emulation test lemmas whose proof obligations are automatically discharged. To demonstrate our verification methodology, we apply the technique on two OSes: Google's Zircon and a subset of Linux. We extract a set of 370 functions from these OSes, translate them into PVS7, and verify the correctness of the translation by automatically discharging hundreds of thousands of proof obligations and tests. This took 27.5 person-months to develop.

Chen, G., Wang, D., Li, T., Zhang, C., Gu, M., Sun, J..  2018.  Scalable Verification Framework for C Program. 2018 25th Asia-Pacific Software Engineering Conference (APSEC). :129-138.

Software verification has been well applied in safety critical areas and has shown the ability to provide better quality assurance for modern software. However, as lines of code and complexity of software systems increase, the scalability of verification becomes a challenge. In this paper, we present an automatic software verification framework TSV to address the scalability issues: (i) the extended structural abstraction and property-guided program slicing to solve large-scale program verification problem, saving time and memory without losing accuracy; (ii) automatically select different verification methods according to the program and property context to improve the verification efficiency. For evaluation, we compare TSV's different configurations with existing C program verifiers based on open benchmarks. We found that TSV with auto-selection performs better than with bounded model checking only or with extended structural abstraction only. Compared to existing tools such as CMBC and CPAChecker, it acquires 10%-20% improvement of accuracy and 50%-90% improvement of memory consumption.

Weitz, Konstantin, Woos, Doug, Torlak, Emina, Ernst, Michael D., Krishnamurthy, Arvind, Tatlock, Zachary.  2016.  Scalable Verification of Border Gateway Protocol Configurations with an SMT Solver. Proceedings of the 2016 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications. :765–780.

Internet Service Providers (ISPs) use the Border Gateway Protocol (BGP) to announce and exchange routes for de- livering packets through the internet. ISPs must carefully configure their BGP routers to ensure traffic is routed reli- ably and securely. Correctly configuring BGP routers has proven challenging in practice, and misconfiguration has led to worldwide outages and traffic hijacks. This paper presents Bagpipe, a system that enables ISPs to declaratively express BGP policies and that automatically verifies that router configurations implement such policies. The novel initial network reduction soundly reduces policy verification to a search for counterexamples in a finite space. An SMT-based symbolic execution engine performs this search efficiently. Bagpipe reduces the size of its search space using predicate abstraction and parallelizes its search using symbolic variable hoisting. Bagpipe's policy specification language is expressive: we expressed policies inferred from real AS configurations, policies from the literature, and policies for 10 Juniper TechLibrary configuration scenarios. Bagpipe is efficient: we ran it on three ASes with a total of over 240,000 lines of Cisco and Juniper BGP configuration. Bagpipe is effective: it revealed 19 policy violations without issuing any false positives.

Das, Arnab, Briggs, Ian, Gopalakrishnan, Ganesh, Krishnamoorthy, Sriram, Panchekha, Pavel.  2020.  Scalable yet Rigorous Floating-Point Error Analysis. SC20: International Conference for High Performance Computing, Networking, Storage and Analysis. :1–14.
Automated techniques for rigorous floating-point round-off error analysis are a prerequisite to placing important activities in HPC such as precision allocation, verification, and code optimization on a formal footing. Yet existing techniques cannot provide tight bounds for expressions beyond a few dozen operators-barely enough for HPC. In this work, we offer an approach embedded in a new tool called SATIHE that scales error analysis by four orders of magnitude compared to today's best-of-class tools. We explain how three key ideas underlying SATIHE helps it attain such scale: path strength reduction, bound optimization, and abstraction. SATIHE provides tight bounds and rigorous guarantees on significantly larger expressions with well over a hundred thousand operators, covering important examples including FFT, matrix multiplication, and PDE stencils.
Morrell, C., Ransbottom, J.S., Marchany, R., Tront, J.G..  2014.  Scaling IPv6 address bindings in support of a moving target defense. Internet Technology and Secured Transactions (ICITST), 2014 9th International Conference for. :440-445.

Moving target defense is an area of network security research in which machines are moved logically around a network in order to avoid detection. This is done by leveraging the immense size of the IPv6 address space and the statistical improbability of two machines selecting the same IPv6 address. This defensive technique forces a malicious actor to focus on the reconnaissance phase of their attack rather than focusing only on finding holes in a machine's static defenses. We have a current implementation of an IPv6 moving target defense entitled MT6D, which works well although is limited to functioning in a peer to peer scenario. As we push our research forward into client server networks, we must discover what the limits are in reference to the client server ratio. In our current implementation of a simple UDP echo server that binds large numbers of IPv6 addresses to the ethernet interface, we discover limits in both the number of addresses that we can successfully bind to an interface and the speed at which UDP requests can be successfully handled across a large number of bound interfaces.
 

Mirmohseni, M., Papadimitratos, P..  2014.  Scaling laws for secrecy capacity in cooperative wireless networks. INFOCOM, 2014 Proceedings IEEE. :1527-1535.

We investigate large wireless networks subject to security constraints. In contrast to point-to-point, interference-limited communications considered in prior works, we propose active cooperative relaying based schemes. We consider a network with nl legitimate nodes and ne eavesdroppers, and path loss exponent α ≥ 2. As long as ne2(log(ne))γ = o(nl) holds for some positive γ, we show one can obtain unbounded secure aggregate rate. This means zero-cost secure communication, given a fixed total power constraint for the entire network. We achieve this result with (i) the source using Wyner randomized encoder and a serial (multi-stage) block Markov scheme, to cooperate with the relays, and (ii) the relays acting as a virtual multi-antenna to apply beamforming against the eavesdroppers. Our simpler parallel (two-stage) relaying scheme can achieve the same unbounded secure aggregate rate when neα/2 + 1 (log(ne))γ+δ(α/2+1) = o(nl) holds, for some positive γ, δ.

Fujiwara, Yasuhiro, Marumo, Naoki, Blondel, Mathieu, Takeuchi, Koh, Kim, Hideaki, Iwata, Tomoharu, Ueda, Naonori.  2017.  Scaling Locally Linear Embedding. Proceedings of the 2017 ACM International Conference on Management of Data. :1479–1492.
Locally Linear Embedding (LLE) is a popular approach to dimensionality reduction as it can effectively represent nonlinear structures of high-dimensional data. For dimensionality reduction, it computes a nearest neighbor graph from a given dataset where edge weights are obtained by applying the Lagrange multiplier method, and it then computes eigenvectors of the LLE kernel where the edge weights are used to obtain the kernel. Although LLE is used in many applications, its computation cost is significantly high. This is because, in obtaining edge weights, its computation cost is cubic in the number of edges to each data point. In addition, the computation cost in obtaining the eigenvectors of the LLE kernel is cubic in the number of data points. Our approach, Ripple, is based on two ideas: (1) it incrementally updates the edge weights by exploiting the Woodbury formula and (2) it efficiently computes eigenvectors of the LLE kernel by exploiting the LU decomposition-based inverse power method. Experiments show that Ripple is significantly faster than the original approach of LLE by guaranteeing the same results of dimensionality reduction.
Shuhao Liu, Baochun Li.  2015.  "On scaling software-Defined Networking in wide-area networks". Tsinghua Science and Technology. 20:221-232.

Software-Defined Networking (SDN) has emerged as a promising direction for next-generation network design. Due to its clean-slate and highly flexible design, it is believed to be the foundational principle for designing network architectures and improving their flexibility, resilience, reliability, and security. As the technology matures, research in both industry and academia has designed a considerable number of tools to scale software-defined networks, in preparation for the wide deployment in wide-area networks. In this paper, we survey the mechanisms that can be used to address the scalability issues in software-defined wide-area networks. Starting from a successful distributed system, the Domain Name System, we discuss the essential elements to make a large scale network infrastructure scalable. Then, the existing technologies proposed in the literature are reviewed in three categories: scaling out/up the data plane and scaling the control plane. We conclude with possible research directions towards scaling software-defined wide-area networks.

Stein, Michael, Frömmgen, Alexander, Kluge, Roland, Wang, Lin, Wilberg, Augustin, Koldehofe, Boris, Mühlhäuser, Max.  2018.  Scaling Topology Pattern Matching: A Distributed Approach. Proceedings of the 33rd Annual ACM Symposium on Applied Computing. :996-1005.

Graph pattern matching in network topologies is a building block of many distributed algorithms. Based on a limited local view of the topology, pattern-based algorithms substantiate the decision-making of each device on the occurrence of graph patterns in its surrounding topology. Existing pattern-based algorithms require that each device has a sufficiently large local view to match patterns without support of other devices. In practical environments, the local view is often restricted to one hop. Thus, algorithms matching non-trivial patterns are locked out from such environments today. This paper presents the first algorithm for distributed topology pattern matching, enabling pattern matching beyond the local view. Outgoing from initiating devices, our pattern matcher delegates the matching procedure to further devices in the network. Exploring major contextual parameters of our algorithm, we show that the optimal local view size depends on scenario-specific conditions. Our pattern matcher provides the flexibility for adaptations of the local view size at runtime. Making use of this flexibility, we optimize the execution of an established pattern-based algorithm and evaluate our pattern matcher in two topology control case studies for the Internet of Things. By scaling the view size of each device in a distributed way, our adaptive approach achieves significant communication cost savings in face of dynamic conditions.

Chen, X., Qu, G., Cui, A., Dunbar, C..  2017.  Scan chain based IP fingerprint and identification. 2017 18th International Symposium on Quality Electronic Design (ISQED). :264–270.

Digital fingerprinting refers to as method that can assign each copy of an intellectual property (IP) a distinct fingerprint. It was introduced for the purpose of protecting legal and honest IP users. The unique fingerprint can be used to identify the IP or a chip that contains the IP. However, existing fingerprinting techniques are not practical due to expensive cost of creating fingerprints and the lack of effective methods to verify the fingerprints. In the paper, we study a practical scan chain based fingerprinting method, where the digital fingerprint is generated by selecting the Q-SD or Q'-SD connection during the design of scan chains. This method has two major advantages. First, fingerprints are created as a post-silicon procedure and therefore there will be little fabrication overhead. Second, altering the Q-SD or Q'-SD connection style requires the modification of test vectors for each fingerprinted IP in order to maintain the fault coverage. This enables us to verify the fingerprint by inspecting the test vectors without opening up the chip to check the Q-SD or Q'-SD connection styles. We perform experiment on standard benchmarks to demonstrate that our approach has low design overhead. We also conduct security analysis to show that such fingerprints are robust against various attacks.

Spreitzer, Raphael, Palfinger, Gerald, Mangard, Stefan.  2018.  SCAnDroid: Automated Side-Channel Analysis of Android APIs. Proceedings of the 11th ACM Conference on Security & Privacy in Wireless and Mobile Networks. :224-235.

Although the Android system has been continuously hardened against side-channel attacks, there are still plenty of APIs available that can be exploited. However, most side-channel analyses in the literature consider specifically chosen APIs (or resources) in the Android framework, after a manual analysis of APIs for possible information leaks has been performed. Such a manual analysis is a tedious, time consuming, and error-prone task, meaning that information leaks tend to be overlooked. To overcome this tedious task, we introduce SCANDROID, a framework that automatically profiles the Java-based Android API for possible information leaks. Events of interest, such as website launches, Google Maps queries, or application starts, are triggered automatically, and while these events are being triggered, the Java-based Android API is analyzed for possible information leaks that allow inferring these events later on. To assess the Android API for information leaks, SCANDROID relies on dynamic time warping. By applying SCANDROID on Android 8 (Android Oreo), we identified several Android APIs that allow inferring website launches, Google Maps queries, and application starts. The triggered events are by no means exhaustive but have been chosen to demonstrate the broad applicability of SCANDROID. Among the automatically identified information leaks are, for example, the java.io.File API, the android.os.storage.StorageManager API, and several methods within the android.net. Traffics tats API. Thereby, we identify the first side-channel leaks in the Android API on Android 8 (Android Oreo).

Bin Sun, Shutao Li, Jun Sun.  2014.  Scanned Image Descreening With Image Redundancy and Adaptive Filtering. Image Processing, IEEE Transactions on. 23:3698-3710.

Currently, most electrophotographic printers use halftoning technique to print continuous tone images, so scanned images obtained from such hard copies are usually corrupted by screen like artifacts. In this paper, a new model of scanned halftone image is proposed to consider both printing distortions and halftone patterns. Based on this model, an adaptive filtering based descreening method is proposed to recover high quality contone images from the scanned images. Image redundancy based denoising algorithm is first adopted to reduce printing noise and attenuate distortions. Then, screen frequency of the scanned image and local gradient features are used for adaptive filtering. Basic contone estimate is obtained by filtering the denoised scanned image with an anisotropic Gaussian kernel, whose parameters are automatically adjusted with the screen frequency and local gradient information. Finally, an edge-preserving filter is used to further enhance the sharpness of edges to recover a high quality contone image. Experiments on real scanned images demonstrate that the proposed method can recover high quality contone images from the scanned images. Compared with the state-of-the-art methods, the proposed method produces very sharp edges and much cleaner smooth regions.

Hlyne, C. N. N., Zavarsky, P., Butakov, S..  2016.  SCAP benchmark for Cisco router security configuration compliance. 2015 10th International Conference for Internet Technology and Secured Transactions (ICITST). :270–276.

Information security management is time-consuming and error-prone. Apart from day-to-day operations, organizations need to comply with industrial regulations or government directives. Thus, organizations are looking for security tools to automate security management tasks and daily operations. Security Content Automation Protocol (SCAP) is a suite of specifications that help to automate security management tasks such as vulnerability measurement and policy compliance evaluation. SCAP benchmark provides detailed guidance on setting the security configuration of network devices, operating systems, and applications. Organizations can use SCAP benchmark to perform automated configuration compliance assessment on network devices, operating systems, and applications. This paper discusses SCAP benchmark components and the development of a SCAP benchmark for automating Cisco router security configuration compliance.

Mohamed, F., AlBelooshi, B., Salah, K., Yeun, C. Y., Damiani, E..  2017.  A Scattering Technique for Protecting Cryptographic Keys in the Cloud. 2017 IEEE 2nd International Workshops on Foundations and Applications of Self* Systems (FAS*W). :301–306.

Cloud computing has become a widely used computing paradigm providing on-demand computing and storage capabilities based on pay-as-you-go model. Recently, many organizations, especially in the field of big data, have been adopting the cloud model to perform data analytics through leasing powerful Virtual Machines (VMs). VMs can be attractive targets to attackers as well as untrusted cloud providers who aim to get unauthorized access to the business critical-data. The obvious security solution is to perform data analytics on encrypted data through the use of cryptographic keys as that of the Advanced Encryption Standard (AES). However, it is very easy to obtain AES cryptographic keys from the VM's Random Access Memory (RAM). In this paper, we present a novel key-scattering (KS) approach to protect the cryptographic keys while encrypting/decrypting data. Our solution is highly portable and interoperable. Thus, it could be integrated within today's existing cloud architecture without the need for further modifications. The feasibility of the approach has been proven by implementing a functioning prototype. The evaluation results show that our approach is substantially more resilient to brute force attacks and key extraction tools than the standard AES algorithm, with acceptable execution time.

Bahirat, Paritosh, Sun, Qizhang, Knijnenburg, Bart P..  2018.  Scenario Context V/s Framing and Defaults in Managing Privacy in Household IoT. Proceedings of the 23rd International Conference on Intelligent User Interfaces Companion. :63:1–63:2.

The Internet of Things provides household device users with an ability to connect and manage numerous devices over a common platform. However, the sheer number of possible privacy settings creates issues such as choice overload. This article outlines a data-driven approach to understand how users make privacy decisions in household IoT scenarios. We demonstrate that users are not just influenced by the specifics of the IoT scenario, but also by aspects immaterial to the decision, such as the default setting and its framing.

Reinerman-Jones, L., Matthews, G., Wohleber, R., Ortiz, E..  2017.  Scenarios using situation awareness in a simulation environment for eliciting insider threat behavior. 2017 IEEE Conference on Cognitive and Computational Aspects of Situation Management (CogSIMA). :1–3.

An important topic in cybersecurity is validating Active Indicators (AI), which are stimuli that can be implemented in systems to trigger responses from individuals who might or might not be Insider Threats (ITs). The way in which a person responds to the AI is being validated for identifying a potential threat and a non-threat. In order to execute this validation process, it is important to create a paradigm that allows manipulation of AIs for measuring response. The scenarios are posed in a manner that require participants to be situationally aware that they are being monitored and have to act deceptively. In particular, manipulations in the environment should no differences between conditions relative to immersion and ease of use, but the narrative should be the driving force behind non-deceptive and IT responses. The success of the narrative and the simulation environment to induce such behaviors is determined by immersion, usability, and stress response questionnaires, and performance. Initial results of the feasibility to use a narrative reliant upon situation awareness of monitoring and evasion are discussed.

Baruah, Sanjoy.  2016.  Schedulability Analysis of Mixed-criticality Systems with Multiple Frequency Specifications. Proceedings of the 13th International Conference on Embedded Software. :24:1–24:10.

In mixed-criticality systems functionalities of different criticalities, that need to have their correctness validated to different levels of assurance, co-exist upon a shared platform. Multiple specifications at differing levels of assurance may be provided for such systems; the specifications that are trusted at very high levels of assurance tend to be more conservative than those at lower levels of assurance. Prior research on the scheduling of such mixed-criticality systems has primarily focused upon the case where multiple estimates of the worst-case execution time (WCET) of pieces of code are provided; in this paper, a model is considered in which multiple estimates are instead provided for the rate at which event-triggered processes are executed. An algorithm is derived for scheduling such systems upon a preemptive uniprocessor; the effectiveness of this algorithm is demonstrated quantitatively via the speedup factor metric.

Abbas, Waseem, Laszka, Aron, Vorobeychik, Yevgeniy, Koutsoukos, Xenofon.  2015.  Scheduling Intrusion Detection Systems in Resource-Bounded Cyber-Physical Systems. Proceedings of the First ACM Workshop on Cyber-Physical Systems-Security and/or PrivaCy. :55–66.

In order to be resilient to attacks, a cyber-physical system (CPS) must be able to detect attacks before they can cause significant damage. To achieve this, \emph{intrusion detection systems} (IDS) may be deployed, which can detect attacks and alert human operators, who can then intervene. However, the resource-constrained nature of many CPS poses a challenge, since reliable IDS can be computationally expensive. Consequently, computational nodes may not be able to perform intrusion detection continuously, which means that we have to devise a schedule for performing intrusion detection. While a uniformly random schedule may be optimal in a purely cyber system, an optimal schedule for protecting CPS must also take into account the physical properties of the system, since the set of adversarial actions and their consequences depend on the physical systems. Here, in the context of water distribution networks, we study IDS scheduling problems in two settings and under the constraints on the available battery supplies. In the first problem, the objective is to design, for a given duration of time $T$, scheduling schemes for IDS so that the probability of detecting an attack is maximized within that duration. We propose efficient heuristic algorithms for this general problem and evaluate them on various networks. In the second problem, our objective is to design scheduling schemes for IDS so that the overall lifetime of the network is maximized while ensuring that an intruder attack is always detected. Various strategies to deal with this problem are presented and evaluated for various networks.

Zhang, G., Qiu, X., Chang, W..  2017.  Scheduling of Security Resources in Software Defined Security Architecture. 2017 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC). :494–503.

With the development of Software Defined Networking, its software programmability and openness brings new idea for network security. Therefore, many Software Defined Security Architectures emerged at the right moment. Software Defined Security decouples security control plane and security data plane. In Software Defined Security Architectures, underlying security devices are abstracted as security resources in resource pool, intellectualized and automated security business management and orchestration can be realized through software programming in security control plane. However, network management has been becoming extremely complicated due to expansible network scale, varying network devices, lack of abstraction and heterogeneity of network especially. Therefore, new-type open security devices are needed in SDS Architecture for unified management so that they can be conveniently abstracted as security resources in resource pool. This paper firstly analyses why open security devices are needed in SDS architecture and proposes a method of opening security devices. Considering this new architecture requires a new security scheduling mechanism, this paper proposes a security resource scheduling algorithm which is used for managing and scheduling security resources in resource pool according to user s security demand. The security resource scheduling algorithm aims to allocate a security protection task to a suitable security resource in resource pool so that improving security protection efficiency. In the algorithm, we use BP neural network to predict the execution time of security tasks to improve the performance of the algorithm. The simulation result shows that the algorithm has ideal performance. Finally, a usage scenario is given to illustrate the role of security resource scheduling in software defined security architecture.

Chowdhary, A., Dixit, V. H., Tiwari, N., Kyung, S., Huang, D., Ahn, G. J..  2017.  Science DMZ: SDN based secured cloud testbed. 2017 IEEE Conference on Network Function Virtualization and Software Defined Networks (NFV-SDN). :1–2.

Software Defined Networking (SDN) presents a unique opportunity to manage and orchestrate cloud networks. The educational institutions, like many other industries face a lot of security threats. We have established an SDN enabled Demilitarized Zone (DMZ) — Science DMZ to serve as testbed for securing ASU Internet2 environment. Science DMZ allows researchers to conduct in-depth analysis of security attacks and take necessary countermeasures using SDN based command and control (C&C) center. Demo URL: https : //www.youtube.corn/watchlv = 8yo2lTNV 3r4.

Williams, Laurie.  2019.  Science Leaves Clues. IEEE Security Privacy. 17:4–6.
The elusive science of security. Science advances when research results build upon prior findings through the evolution of hypotheses and theories about the fundamental relationships among variables within a context and considering the threats and limitations of the work. Some hypothesize that, through this science of security, the industry can take a more principled and systematic approach to securing systems, rather than reacting to the latest move by attackers. Others debate the utility of a science of security.
van Oorschot, Paul C..  2017.  Science, Security and Academic Literature: Can We Learn from History? Proceedings of the 2017 Workshop on Moving Target Defense. :1–2.
A recent paper (Oakland 2017) discussed science and security research in the context of the government-funded Science of Security movement, and the history and prospects of security as a scientific pursuit. It drew on literature from within the security research community, and mature history and philosophy of science literature. The paper sparked debate in numerous organizations and the security community. Here we consider some of the main ideas, provide a summary list of relevant literature, and encourage discussion within the Moving Target Defense (MTD) sub-community1.
Stephan, E., Raju, B., Elsethagen, T., Pouchard, L., Gamboa, C..  2017.  A scientific data provenance harvester for distributed applications. 2017 New York Scientific Data Summit (NYSDS). :1–9.

Data provenance provides a way for scientists to observe how experimental data originates, conveys process history, and explains influential factors such as experimental rationale and associated environmental factors from system metrics measured at runtime. The US Department of Energy Office of Science Integrated end-to-end Performance Prediction and Diagnosis for Extreme Scientific Workflows (IPPD) project has developed a provenance harvester that is capable of collecting observations from file based evidence typically produced by distributed applications. To achieve this, file based evidence is extracted and transformed into an intermediate data format inspired in part by W3C CSV on the Web recommendations, called the Harvester Provenance Application Interface (HAPI) syntax. This syntax provides a general means to pre-stage provenance into messages that are both human readable and capable of being written to a provenance store, Provenance Environment (ProvEn). HAPI is being applied to harvest provenance from climate ensemble runs for Accelerated Climate Modeling for Energy (ACME) project funded under the U.S. Department of Energy's Office of Biological and Environmental Research (BER) Earth System Modeling (ESM) program. ACME informally provides provenance in a native form through configuration files, directory structures, and log files that contain success/failure indicators, code traces, and performance measurements. Because of its generic format, HAPI is also being applied to harvest tabular job management provenance from Belle II DIRAC scheduler relational database tables as well as other scientific applications that log provenance related information.