
SoS Newsletter- Advanced Book Block

Hash Algorithms

Hashing algorithms are used extensively in information security and forensics. Research focuses on new methods and techniques to optimize security. The articles cited here, from the first half of 2014, cover topics such as Secure Hash Algorithm (SHA)-1 and SHA-3, one-time password generation, the Keccak and McEliece algorithms, and Bloom filters.

  • Eddeen, L.M.H.N.; Saleh, E.M.; Saadah, D., "Genetic Hash Algorithm," Computer Science and Information Technology (CSIT), 2014 6th International Conference on, vol., no., pp.23,26, 26-27 March 2014. Security is becoming a major concern in computing. New techniques are evolving every day; one of these techniques is Hash Visualization. Hash Visualization uses complex, randomly generated images for security; these images can be used to hide data (watermarking). The proposed new technique improves hash visualization by using genetic algorithms, a search optimization technique based on the evolution of living creatures. The genetic algorithm used was far faster than previous traditional ones, and it improved hash visualization by evolving the tree used to generate the images, in order to obtain a better and larger tree that generates images with higher security. Security was assessed by calculating the fitness value for each chromosome based on a specifically designed algorithm.
    Keywords: cryptography; data encapsulation; genetic algorithms; image watermarking; trees (mathematics);complex random generated images; data hiding; genetic hash algorithm; hash visualization; search optimization technique; watermarking; Authentication; Biological cells; Data visualization; Genetic algorithms; Genetics; Visualization; Chromosome; Fitness value; Genetic Algorithms; Hash Visualization; Hash functions; Security (ID#:14-2113)
  • Kishore, N.; Kapoor, B., "An Efficient Parallel Algorithm For Hash Computation In Security And Forensics Applications," Advance Computing Conference (IACC), 2014 IEEE International, vol., no., pp.873,877, 21-22 Feb. 2014. Hashing algorithms are used extensively in information security and digital forensics applications. This paper presents an efficient parallel algorithm for hash computation. It is a modification of the SHA-1 algorithm for faster parallel implementation in applications such as digital signatures and data preservation in digital forensics. The algorithm implements a recursive hash to break the chain dependencies of the standard hash function. We discuss the theoretical foundation for the work, including the collision probability and the performance implications. The algorithm is implemented using the OpenMP API, and experiments were performed on machines with multicore processors. The results show a performance gain of more than a factor of 3 when running on the 8-core configuration of the machine.
    Keywords: application program interfaces; cryptography; digital forensics; digital signatures; file organization; parallel algorithms; probability; OpenMP API;SHA-1 algorithm; collision probability; data preservation; digital forensics; digital signature; hash computation; hashing algorithms; information security; parallel algorithm; standard hash function; Algorithm design and analysis; Conferences; Cryptography; Multicore processing; Program processors; Standards; Cryptographic Hash Function; Digital Forensics; Digital Signature; MD5; Multicore Processors; OpenMP;SHA-1 (ID#:14-2114)
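The chain-breaking idea in the abstract above can be sketched briefly: hash fixed-size chunks independently (this is the parallelizable work), then hash the concatenation of the chunk digests. This is an illustrative sketch, not the paper's exact SHA-1 modification; the chunk size and the final combining step are assumptions.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

def parallel_hash(data: bytes, chunk_size: int = 1 << 20) -> str:
    """Hash chunks independently (in parallel), then hash the digest list."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    with ThreadPoolExecutor() as pool:
        digests = pool.map(lambda c: hashlib.sha1(c).digest(), chunks)
    # the outer hash binds the per-chunk digests into one fixed-size value
    return hashlib.sha1(b"".join(digests)).hexdigest()
```

Because each chunk digest depends only on its own chunk, there is no chain dependency across chunks and the work scales with the number of cores; only the short final hash remains serial.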
  • Nemoianu, I-D.; Greco, C.; Cagnazzo, M.; Pesquet-Popescu, B., "On a Hashing-Based Enhancement of Source Separation Algorithms over Finite Fields for Network Coding Perspectives," Multimedia, IEEE Transactions on, vol. PP, no.99, pp.1, 1, July 2014. Blind Source Separation (BSS) deals with the recovery of source signals from a set of observed mixtures, when little or no knowledge of the mixing process is available. BSS can find an application in the context of network coding, where relaying linear combinations of packets maximizes the throughput and increases the loss immunity. By relieving the nodes from the need to send the combination coefficients, the overhead cost is largely reduced. However, the scaling ambiguity of the technique and the quasi-uniformity of compressed media sources makes it unfit, at its present state, for multimedia transmission. In order to open new practical applications for BSS in the context of multimedia transmission, we have recently proposed to use a non-linear encoding to increase the discriminating power of the classical entropy-based separation methods. Here, we propose to append to each source a non-linear message digest, which offers an overhead smaller than a per-symbol encoding and that can be more easily tuned. Our results prove that our algorithm is able to provide high decoding rates for different media types such as image, audio, and video, when the transmitted messages are less than 1.5 kilobytes, which is typically the case in a realistic transmission scenario.
    Keywords: (not provided) (ID#:14-2115)
  • Bayat-Sarmadi, S.; Mozaffari-Kermani, M.; Reyhani-Masoleh, A, "Efficient and Concurrent Reliable Realization of the Secure Cryptographic SHA-3 Algorithm," Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on, vol.33, no.7, pp.1105,1109, July 2014. The secure hash algorithm (SHA)-3 was selected in 2012 and will be used to provide security to any application which requires hashing, pseudo-random number generation, and integrity checking. This algorithm was selected based on various benchmarks such as security, performance, and complexity. In this paper, in order to provide reliable architectures for this algorithm, an efficient concurrent error detection scheme for the selected SHA-3 algorithm, i.e., Keccak, is proposed. To the best of our knowledge, effective countermeasures for potential reliability issues in the hardware implementations of this algorithm have not been presented to date. In proposing the error detection approach, our aim is to have acceptable complexity and performance overheads while maintaining high error coverage. In this regard, we present a low-complexity recomputing with rotated operands-based scheme, which is a step forward toward reducing the hardware overhead of the proposed error detection approach. Moreover, we perform injection-based fault simulations and show that error coverage of close to 100% is achieved. Furthermore, we have designed the proposed scheme and, through ASIC analysis, shown that acceptable complexity and performance overheads are reached. By utilizing the proposed high-performance concurrent error detection scheme, more reliable and robust hardware implementations for the newly-standardized SHA-3 are realized.
    Keywords: application specific integrated circuits; computational complexity; concurrency control; cryptography; error detection; parallel processing; ASIC analysis; Keccak;SHA-3 algorithm; acceptable complexity; error coverage; hardware overhead reduction; hashing; high-performance concurrent error detection scheme; injection-based fault simulations; integrity checking ;low-complexity recomputing; performance overheads; pseudorandom number generation; reliability; robust hardware implementations ;rotated operand-based scheme; secure hash algorithm; step-forward toward; Algorithm design and analysis; Application specific integrated circuits; Circuit faults; Cryptography; Hardware; Reliability; Transient analysis; Application-specific integrated circuit (ASIC);high performance; reliability; secure hash algorithm (SHA)-3; security (ID#:14-2116)
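Recomputing with rotated operands works when a step function commutes with lane rotation, i.e. f(rot(x, r)) == rot(f(x), r): the redundant computation on a rotated input then cross-checks the primary one, and any mismatch flags a fault. The single XOR-rotate step below is a toy stand-in for a full Keccak round, assumed only for illustration:

```python
MASK = (1 << 64) - 1

def rotl(x: int, r: int) -> int:
    """Rotate a 64-bit lane left by r bits."""
    return ((x << r) | (x >> (64 - r))) & MASK

def toy_round(x: int) -> int:
    # XOR of a lane with its rotation: commutes with any lane rotation
    return rotl(x, 1) ^ x

def check_with_rotated_operand(x: int, r: int = 13) -> bool:
    primary = toy_round(x)                # normal computation
    redundant = toy_round(rotl(x, r))     # recompute with rotated operand
    return rotl(primary, r) == redundant  # False would signal a fault
```

Since rotation distributes over XOR, a fault that corrupts either computation breaks the equality, which is what gives the scheme its concurrent error coverage.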
  • Ghosh, Santosh, "On the Implementation Of McEliece With CCA2 Indeterminacy by SHA-3," Circuits and Systems (ISCAS), 2014 IEEE International Symposium on, vol., no., pp.2804,2807, 1-5 June 2014. This paper deals with the design and implementation of the post-quantum public-key algorithm McEliece. Seamless incorporation of a new error generator and a new SHA-3 module provides higher indeterminacy and more randomization of the original McEliece algorithm and achieves the CCA2 security standard. Due to the lightweight and high-speed implementation of the SHA-3 module, the proposed 128-bit secure McEliece architecture provides 6% higher performance in only 0.78 times the area of the best known existing design.
    Keywords: Algorithm design and analysis; Clocks; Computer architecture; Encryption; Vectors; CCA2; Keccak ;McEliece; Post-quantum cryptography; SHA-3; Secure hash algorithm; public-key (ID#:14-2117)
  • Yakut, S.; Ozer, AB., "HMAC Based One Time Password Generator," Signal Processing and Communications Applications Conference (SIU), 2014 22nd, vol., no., pp.1563,1566, 23-25 April 2014. A One Time Password is a fixed-length string, used only once, to perform authentication in electronic media. In this paper, One Time Password production methods based on hash functions were investigated. The Keccak digest algorithm was used to produce the One Time Passwords. This algorithm was selected as the latest hash algorithm standard in October 2012 by the National Institute of Standards and Technology, and it is preferred because it is faster and safer than the alternatives. One Time Password production methods based on hash functions follow the Hash-Based Message Authentication Code (HMAC) structure: a key value is combined with the hash function to generate the HMAC value, and the produced One Time Password is based on this value. In this application, the One Time Password was eight characters long to be useful in practice.
    Keywords: cryptography; message authentication; HMAC; Keccak digest algorithm; electronic media; fixed length strings; hash algorithm; hash functions; hashing-based message authentication code structure; one time password generator; one time password production methods;Authentication;Conferences;Cryptography;Production;Signal processing; Signal processing algorithms; Standards; Hash Function; Hash-Based Message Authentication Code; Keccak; One Time Passwords (ID#:14-2118)
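The HMAC-based construction the authors describe can be sketched along the lines of the well-known HOTP truncation: compute an HMAC over a counter, then reduce the tag to a short decimal string. This sketch uses Python's sha3_256 (the standardized form of Keccak, which differs from the original competition submission in padding); the truncation rule and key/counter encoding are assumptions, not the paper's exact design.

```python
import hmac
import hashlib
import struct

def one_time_password(key: bytes, counter: int, digits: int = 8) -> str:
    """HMAC the counter, then truncate the tag to a short decimal code."""
    mac = hmac.new(key, struct.pack(">Q", counter), hashlib.sha3_256).digest()
    offset = mac[-1] & 0x0F                      # dynamic truncation offset
    code = int.from_bytes(mac[offset:offset + 4], "big") & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)
```

Each counter value yields one password; the server recomputes the same HMAC with the shared key to verify, so the password never needs to be stored.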
  • Hyesook Lim; Kyuhee Lim; Nara Lee; Kyong-Hye Park, "On Adding Bloom Filters to Longest Prefix Matching Algorithms," Computers, IEEE Transactions on , vol.63, no.2, pp.411,423, Feb. 2014. High-speed IP address lookup is essential to achieve wire-speed packet forwarding in Internet routers. Ternary content addressable memory (TCAM) technology has been adopted to solve the IP address lookup problem because of its ability to perform fast parallel matching. However, the applicability of TCAMs presents difficulties due to cost and power dissipation issues. Various algorithms and hardware architectures have been proposed to perform the IP address lookup using ordinary memories such as SRAMs or DRAMs without using TCAMs. Among the algorithms, we focus on two efficient algorithms providing high-speed IP address lookup: parallel multiple-hashing (PMH) algorithm and binary search on level algorithm. This paper shows how effectively an on-chip Bloom filter can improve those algorithms. A performance evaluation using actual backbone routing data with 15,000-220,000 prefixes shows that by adding a Bloom filter, the complicated hardware for parallel access is removed without search performance penalty in parallel-multiple hashing algorithm. Search speed has been improved by 30-40 percent by adding a Bloom filter in binary search on level algorithm.
    Keywords: DRAM chips ;Internet; SRAM chips; data structures; routing protocols; Bloom filters; DRAM; IP address lookup; Internet protocol; Internet routers; PMH algorithm; SRAM; TCAM technology; binary search; dynamic random access memory; fast parallel matching; hardware architectures; level algorithm; parallel multiple-hashing algorithm; parallel-multiple hashing algorithm; performance evaluation; prefix matching algorithms; static random access memory; ternary content addressable memory technology; wire-speed packet forwarding; Generators; IP networks; Indexes; Memory management; Routing; System-on-a-chip; Bloom filter; Generators; IP address lookup; IP networks; Indexes; Internet; Memory management; Routing; System-on-a-chip; binary search on levels; leaf pushing; longest prefix matching; multihashing; router (ID#:14-2119)
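The reason a Bloom filter can safely prune lookups in the PMH and binary-search-on-levels algorithms above is its one-sided error: it has false positives but no false negatives, so a negative answer is definitive and the corresponding memory access can be skipped. A minimal single-integer-bitmap sketch (the bit-array size, hash count, and salted-SHA-256 index derivation are assumptions):

```python
import hashlib

class BloomFilter:
    def __init__(self, m_bits: int = 1024, k_hashes: int = 3):
        self.m, self.k = m_bits, k_hashes
        self.bits = 0  # big integer used as a bitmap

    def _positions(self, item: bytes):
        # derive k bit positions from salted hashes of the item
        for i in range(self.k):
            h = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, item: bytes) -> None:
        for p in self._positions(item):
            self.bits |= 1 << p

    def might_contain(self, item: bytes) -> bool:
        return all(self.bits >> p & 1 for p in self._positions(item))
```

In an IP-lookup setting the items would be prefixes of each length; a miss in the filter means the hash table for that length need not be probed at all.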
  • Jae Min Cho; Kiyoung Choi, "An FPGA Implementation Of High-Throughput Key-Value Store Using Bloom Filter," VLSI Design, Automation and Test (VLSI-DAT), 2014 International Symposium on , vol., no., pp.1,4, 28-30 April 2014. This paper presents an efficient implementation of key-value store using Bloom filters on FPGA. Bloom filters are used to reduce the number of unnecessary accesses to the hash tables, thereby improving the performance. Additionally, for better hash table utilization, we use a modified cuckoo hashing algorithm for the implementation. They are implemented in FPGA to further improve the performance. Experimental results show significant performance improvement over existing approaches.
    Keywords: data structures field programmable gate arrays; file organization; Bloom filter; FPGA implementation; cuckoo hashing algorithm; hash tables; high-throughput key-value store; Arrays; Field programmable gate arrays; Hardware; Information filters; Random access memory; Software; Bloom filter; FPGA; Key-value Store (ID#:14-2120)
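The paper's modified cuckoo hashing is not spelled out in the abstract; as background, a plain two-table cuckoo scheme looks roughly like this: each key has one candidate slot per table, and an insertion that finds both occupied evicts an occupant and relocates it, bounded by a kick limit. Table size, hash derivation, and the eviction bound below are assumptions.

```python
import hashlib

class CuckooTable:
    """Two-table cuckoo hashing: every key has one candidate slot per table."""

    def __init__(self, size: int = 16):
        self.size = size
        self.tables = [[None] * size, [None] * size]

    def _slot(self, key: str, which: int) -> int:
        h = hashlib.sha256(bytes([which]) + key.encode()).digest()
        return int.from_bytes(h[:4], "big") % self.size

    def get(self, key: str):
        for which in (0, 1):
            entry = self.tables[which][self._slot(key, which)]
            if entry is not None and entry[0] == key:
                return entry[1]
        return None

    def put(self, key: str, value, max_kicks: int = 32) -> bool:
        for _ in range(max_kicks):
            for which in (0, 1):
                i = self._slot(key, which)
                if self.tables[which][i] is None or self.tables[which][i][0] == key:
                    self.tables[which][i] = (key, value)
                    return True
            # both slots taken: evict the occupant of the table-0 slot and retry
            i = self._slot(key, 0)
            (key, value), self.tables[0][i] = self.tables[0][i], (key, value)
        return False  # a real implementation would rehash here
```

Lookups touch at most two slots, which is what makes the structure attractive for hardware: both probes can be issued in parallel on an FPGA.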
  • Mokhtar, B.; Eltoweissy, M., "Towards a Data Semantics Management System for Internet Traffic," New Technologies, Mobility and Security (NTMS), 2014 6th International Conference on, vol., no., pp.1, 5, March 30 2014-April 2, 2014. Although current Internet operations generate voluminous data, they remain largely oblivious of traffic data semantics. This poses many inefficiencies and challenges due to emergent or anomalous behavior impacting the vast array of Internet elements such as services and protocols. In this paper, we propose a Data Semantics Management System (DSMS) for learning Internet traffic data semantics to enable smarter semantics-driven networking operations. We extract networking semantics and build and utilize a dynamic ontology of network concepts to better recognize and act upon emergent or abnormal behavior. Our DSMS utilizes: (1) the Latent Dirichlet Allocation (LDA) algorithm for latent feature extraction and semantics reasoning; (2) big tables as a cloud-like data storage technique to maintain large-scale data; and (3) the Locality Sensitive Hashing (LSH) algorithm for reducing data dimensionality. Our preliminary evaluation using real Internet traffic shows the efficacy of DSMS for learning behavior of normal and abnormal traffic data and for accurately detecting anomalies at low cost.
    Keywords: Internet; data reduction; learning (artificial intelligence); ontologies (artificial intelligence); storage management; telecommunication traffic; DSMS; Internet traffic data semantic learning; LSH; cloud-like data storage technique; data dimensionality reduction; data semantics management system; dynamic ontology; latent Dirichlet allocation algorithm; learning behavior; locality sensitive hashing algorithm; networking semantics; protocols; traffic data semantics; Algorithm design and analysis; Cognition; Data mining; Data models; Feature extraction; Internet; Semantics (ID#:14-2121)
  • Kafai, M.; Eshghi, K.; Bhanu, B., "Discrete Cosine Transform Locality-Sensitive Hashes for Face Retrieval," Multimedia, IEEE Transactions on, vol.16, no.4, pp.1090,1103, June 2014. Descriptors such as local binary patterns perform well for face recognition. Searching large databases using such descriptors has been problematic due to the cost of the linear search, and the inadequate performance of existing indexing methods. We present Discrete Cosine Transform (DCT) hashing for creating index structures for face descriptors. Hashes play the role of keywords: an index is created, and queried to find the images most similar to the query image. Common hash suppression is used to improve retrieval efficiency and accuracy. Results are shown on a combination of six publicly available face databases (LFW, FERET, FEI, BioID, Multi-PIE, and RaFD). It is shown that DCT hashing has significantly better retrieval accuracy and it is more efficient compared to other popular state-of-the-art hash algorithms.
    Keywords: cryptography; discrete cosine transforms; face recognition; image coding; image retrieval; BioID; DCT hashing; FEI; FERET; LFW; RaFD; discrete cosine transform hashing; face databases; face descriptors; face recognition; face retrieval; hash suppression; image querying; index structures; linear search; local binary patterns; locality-sensitive hashes; Multi-PIE; retrieval efficiency; Discrete cosine transforms; Face; Indexing; Kernel; Probes; Vectors; Discrete Cosine Transform (DCT) hashing; Local Binary Patterns (LBP); Locality-Sensitive Hashing (LSH); face indexing; image retrieval (ID#:14-2122)
  • Jingkuan Song; Yi Yang; Xuelong Li; Zi Huang; Yang Yang, "Robust Hashing With Local Models for Approximate Similarity Search," Cybernetics, IEEE Transactions on , vol.44, no.7, pp.1225,1236, July 2014. Similarity search plays an important role in many applications involving high-dimensional data. Due to the known dimensionality curse, the performance of most existing indexing structures degrades quickly as the feature dimensionality increases. Hashing methods, such as locality sensitive hashing (LSH) and its variants, have been widely used to achieve fast approximate similarity search by trading search quality for efficiency. However, most existing hashing methods make use of randomized algorithms to generate hash codes without considering the specific structural information in the data. In this paper, we propose a novel hashing method, namely, robust hashing with local models (RHLM), which learns a set of robust hash functions to map the high-dimensional data points into binary hash codes by effectively utilizing local structural information. In RHLM, for each individual data point in the training dataset, a local hashing model is learned and used to predict the hash codes of its neighboring data points. The local models from all the data points are globally aligned so that an optimal hash code can be assigned to each data point. After obtaining the hash codes of all the training data points, we design a robust method by employing l2,1-norm minimization on the loss function to learn effective hash functions, which are then used to map each database point into its hash code. Given a query data point, the search process first maps it into the query hash code by the hash functions and then explores the buckets, which have similar hash codes to the query hash code. Extensive experimental results conducted on real-life datasets show that the proposed RHLM outperforms the state-of-the-art methods in terms of search quality and efficiency.
    Keywords: computational complexity; file organization; query processing; RHLM; approximate similarity search; binary hash codes; database point; dimensionality curse; feature dimensionality; high-dimensional data; high-dimensional data point mapping; l2,1-norm minimization; local hashing model; local structural information; loss function; optimal hash code; query data point; query hash code; real-life datasets; robust hash function learning; robust hashing with local models; search efficiency; search quality; training data points; training dataset; Databases; Laplace equations; Linear programming; Nickel; Robustness; Training; Training data; Approximate similarity search; indexing; robust hashing (ID#:14-2123)
  • Balkesen, C.; Teubner, J.; Alonso, G.; Ozsu, M., "Main-Memory Hash Joins on Modern Processor Architectures," Knowledge and Data Engineering, IEEE Transactions on, vol. PP, no.99, pp.1, 1, March 2014. Existing main-memory hash join algorithms for multi-core can be classified into two camps. Hardware-oblivious hash join variants do not depend on hardware-specific parameters. Rather, they consider qualitative characteristics of modern hardware and are expected to achieve good performance on any technologically similar platform. The assumption behind these algorithms is that hardware is now good enough at hiding its own limitations--through automatic hardware prefetching, out-of-order execution, or simultaneous multi-threading (SMT)--to make hardware-oblivious algorithms competitive without the overhead of carefully tuning to the underlying hardware. Hardware-conscious implementations, such as (parallel) radix join, aim to maximally exploit a given architecture by tuning the algorithm parameters (e.g., hash table sizes) to the particular features of the architecture. The assumption here is that explicit parameter tuning yields enough performance advantages to warrant the effort required. This paper compares the two approaches under a wide range of workloads (relative table sizes, tuple sizes, effects of sorted data, etc.) and configuration parameters (VM page sizes, number of threads, number of cores, SMT, SIMD, prefetching, etc.). The results show that hardware-conscious algorithms generally outperform hardware-oblivious ones. However, on specific workloads and special architectures with aggressive simultaneous multi-threading, hardware-oblivious algorithms are competitive. The main conclusion of the paper is that, in existing multi-core architectures, it is still important to carefully tailor algorithms to the underlying hardware to get the necessary performance. But processor developments may require revisiting this conclusion in the future.
    Keywords: (not provided) (ID#:14-2124)
  • Yang Xu; Zhaobo Liu; Zhuoyuan Zhang; Chao, H.J., "High-Throughput and Memory-Efficient Multimatch Packet Classification Based on Distributed and Pipelined Hash Tables," Networking, IEEE/ACM Transactions on, vol.22, no.3, pp.982,995, June 2014. The emergence of new network applications, such as the network intrusion detection system and packet-level accounting, requires packet classification to report all matched rules instead of only the best matched rule. Although several schemes have been proposed recently to address the multimatch packet classification problem, most of them require either huge memory or expensive ternary content addressable memory (TCAM) to store the intermediate data structure, or they suffer from steep performance degradation under certain types of classifiers. In this paper, we decompose the operation of multimatch packet classification from the complicated multidimensional search to several single-dimensional searches, and present an asynchronous pipeline architecture based on a signature tree structure to combine the intermediate results returned from single-dimensional searches. By spreading edges of the signature tree across multiple hash tables at different stages, the pipeline can achieve a high throughput via the interstage parallel access to hash tables. To exploit further intrastage parallelism, two edge-grouping algorithms are designed to evenly divide the edges associated with each stage into multiple work-conserving hash tables. To avoid collisions involved in hash table lookup, a hybrid perfect hash table construction scheme is proposed. Extensive simulation using realistic classifiers and traffic traces shows that the proposed pipeline architecture outperforms HyperCuts and B2PC schemes in classification speed by at least one order of magnitude, while having a similar storage requirement. Particularly, with different types of classifiers of 4K rules, the proposed pipeline architecture is able to achieve a throughput between 26.8 and 93.1 Gb/s using perfect hash tables.
    Keywords: content-addressable storage; cryptography; security of data; signal classification; table lookup; tree data structures; TCAM; asynchronous pipeline architecture; bit rate 26.8 Gbit/s to 93.1 Gbit/s; complicated multidimensional search; distributed hash tables; edge-grouping; hash table lookup; high-throughput multimatch; hybrid perfect hash table construction; intermediate data structure; interstage parallel access; memory-efficient multimatch; multimatch packet classification problem; network intrusion detection; packet-level accounting; pipelined hash tables; signature tree structure; single-dimensional searches; steep performance degradation; ternary content addressable memory; throughput; traffic traces; Encoding; IEEE transactions; Memory management; Pipelines; Power demand; Search engines; Throughput; Hash table; packet classification; signature tree; ternary content addressable memory (TCAM) (ID#:14-2125)
  • Chi Sing Chum; Changha Jun; Xiaowen Zhang, "Implementation of Randomize-Then-Combine Constructed Hash Function," Wireless and Optical Communication Conference (WOCC), 2014 23rd, vol., no., pp.1,6, 9-10 May 2014. Hash functions, such as the SHA (secure hash algorithm) and MD (message digest) families that are built upon the Merkle-Damgard construction, suffer many attacks due to the iterative nature of block-by-block message processing. Chum and Zhang [4] proposed a new hash function construction that applies the randomize-then-combine technique, previously used in incremental hash functions, to the iterative hash function. In this paper, we implement this hash construction in three ways, distinguished by their corresponding padding methods. We conduct the experiment in parallel multi-threaded programming settings. The results show that the speed of the proposed hash function is no worse than that of SHA-1.
    Keywords: cryptography ;iterative methods; multi-threading; Merkle-Damgard construction; block-by-block message processing; hash function construction; incremental hash function; iterative hash function; parallel multithreaded programming; randomize-then-combine technique; Educational institutions; Message systems; Programming; Random access memory; Resistance; Vectors; Hash function implementation; incremental hash function; pair block chaining; randomize-then-combine (ID#:14-2126)
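The randomize-then-combine principle can be illustrated directly: each block is hashed together with its position (the "randomize" step), and the per-block values are folded with a commutative group operation (here XOR, the "combine" step). The three padding variants the paper compares are not reproduced; the block size and position encoding below are assumptions.

```python
import hashlib

def randomize_then_combine(message: bytes, block_size: int = 64) -> str:
    """Incremental hash: per-block randomization, commutative combine."""
    acc = 0
    for idx in range(0, len(message), block_size):
        block = message[idx:idx + block_size]
        # "randomize": bind each block to its byte offset before hashing
        h = hashlib.sha256(idx.to_bytes(8, "big") + block).digest()
        acc ^= int.from_bytes(h, "big")  # "combine": XOR group operation
    return acc.to_bytes(32, "big").hex()
```

The payoff is incrementality: replacing one block only requires XOR-ing out the old block's value and XOR-ing in the new one, rather than rehashing the whole message; and because the per-block hashes are independent, they parallelize naturally across threads.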
  • Pi-Chung Wang, "Scalable Packet Classification for Datacenter Networks," Selected Areas in Communications, IEEE Journal on , vol.32, no.1, pp.124,137, January 2014. The key challenge to a datacenter network is its scalability to handle many customers and their applications. In a datacenter network, packet classification plays an important role in supporting various network services. Previous algorithms store classification rules with the same length combinations in a hash table to simplify the search procedure. The search performance of hash-based algorithms is tied to the number of hash tables. To achieve fast and scalable packet classification, we propose an algorithm, encoded rule expansion, to transform rules into an equivalent set of rules with fewer distinct length combinations, without affecting the classification results. The new algorithm can minimize the storage penalty of transformation and achieve a short search time. In addition, the scheme supports fast incremental updates. Our simulation results show that more than 90% hash tables can be eliminated. The reduction of length combinations leads to an improvement on speed performance of packet classification by an order of magnitude. The results also show that the software implementation of our scheme without using any hardware parallelism can support up to one thousand customer VLANs and one million rules, where each rule consumes less than 60 bytes and each packet classification can be accomplished under 50 memory accesses.
    Keywords: computer centers; firewalls; local area networks; telecommunication network routing; virtual machines; VLAN; classification rules; datacenter networks; encoded rule expansion; hardware parallelism; hash table; length combinations; packet classification; packet forwarding; storage penalty; Data structures; Decision trees; Encoding; Hardware; IP networks ;Indexes; Software; Packet classification; VLANs; datacenter network ;firewalls; packet forwarding; router architectures; scalability (ID#:14-2127)
  • Kanizo, Y.; Hay, D.; Keslassy, I, "Maximizing the Throughput of Hash Tables in Network Devices with Combined SRAM/DRAM Memory," Parallel and Distributed Systems, IEEE Transactions on, vol. PP, no.99, pp.1, 1, April 2014. Hash tables form a core component of many algorithms as well as network devices. Because of their large size, they often require a combined memory model, in which some of the elements are stored in a fast memory (for example, cache or on-chip SRAM) while others are stored in much slower memory (namely, the main memory or off-chip DRAM). This makes the implementation of real-life hash tables particularly delicate, as a suboptimal choice of the hashing scheme parameters may result in a higher average query time, and therefore in a lower throughput. In this paper, we focus on multiple-choice hash tables. Given the number of choices, we study the tradeoff between the load of a hash table and its average lookup time. The problem is solved by analyzing an equivalent problem: the expected maximum matching size of a random bipartite graph with a fixed left-side vertex degree. Given two choices, we provide exact results for any finite system, and also deduce asymptotic results as the fast memory size increases. In addition, we further consider other variants of this problem and model the impact of several parameters. Finally, we evaluate the performance of our models on Internet backbone traces, and illustrate the impact of the memories speed difference on the choice of parameters. In particular, we show that the common intuition of entirely avoiding slow memory accesses by using highly efficient schemes (namely, with many fast-memory choices) is not always optimal.
    Keywords: Bipartite graph; Internet; Memory management; Performance evaluation; Random access memory; System-on-chip; Throughput (ID#:14-2128)
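The load/lookup-time tradeoff studied above comes from multiple-choice hashing: giving each key d candidate buckets and placing it in the least-loaded one flattens the load distribution sharply compared with a single choice, at the price of probing up to d buckets on lookup. A minimal d = 2 sketch (the salted-SHA-256 bucket derivation is an assumption):

```python
import hashlib

def two_choice_insert(buckets, key: bytes) -> int:
    """Insert key into the less-loaded of its two candidate buckets."""
    n = len(buckets)
    candidates = [
        int.from_bytes(hashlib.sha256(bytes([i]) + key).digest()[:8], "big") % n
        for i in (0, 1)
    ]
    target = min(candidates, key=lambda b: len(buckets[b]))
    buckets[target].append(key)
    return target
```

In a combined SRAM/DRAM design, more choices mean fewer overflow elements relegated to slow memory but more fast-memory probes per query, which is exactly the tradeoff the paper models.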
  • Plesca, Cezar; Morogan, Luciana, "Efficient And Robust Perceptual Hashing Using Log-Polar Image Representation," Communications (COMM), 2014 10th International Conference on, vol., no., pp.1,6, 29-31 May 2014. Robust image hashing seeks to transform a given input image into a shorter hashed version using a key-dependent non-invertible transform. These hashes find extensive applications in content authentication, image indexing for database search, and watermarking. Modern robust hashing algorithms consist of feature extraction, a randomization stage to introduce non-invertibility, followed by quantization and binary encoding to produce a binary hash. This paper describes a novel algorithm for generating an image hash based on Log-Polar transform features. The Log-Polar transform is a part of the Fourier-Mellin transformation, often used in image recognition and registration techniques due to its invariance properties under geometric operations. First, we show that the proposed perceptual hash is resistant to content-preserving operations like compression, noise addition, moderate geometric transformations, and filtering. Second, we illustrate the discriminative capability of our hash in order to rapidly distinguish between two perceptually different images. Third, we study the security of our method for image authentication purposes. Finally, we show that the proposed hashing method can provide both excellent security and robustness.
    Keywords: image authentication; log-polar transformation; multimedia security; perceptual hashing (ID#:14-2129)
  • Jin, Z.; Li, C.; Lin, Y.; Cai, D., "Density Sensitive Hashing," Cybernetics, IEEE Transactions on, vol. 44, no.8, pp.1362,1371, Aug. 2014. Nearest neighbor search is a fundamental problem in various research fields like machine learning, data mining and pattern recognition. Recently, hashing-based approaches, for example, locality sensitive hashing (LSH), have proved effective for scalable high dimensional nearest neighbor search. Many hashing algorithms find their theoretical roots in random projection. Since these algorithms generate the hash tables (projections) randomly, a large number of hash tables (i.e., long codewords) are required in order to achieve both high precision and recall. To address this limitation, we propose a novel hashing algorithm called density sensitive hashing (DSH) in this paper. DSH can be regarded as an extension of LSH. By exploring the geometric structure of the data, DSH avoids the purely random projection selection and uses those projective functions which best agree with the distribution of the data. Extensive experimental results on real-world data sets have shown that the proposed method achieves better performance compared to the state-of-the-art hashing approaches.
    Keywords: Binary codes; Databases; Entropy; Nearest neighbor searches; Principal component analysis; Quantization (signal); Vectors; Clustering; locality sensitive hashing; random projection (ID#:14-2130)
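The random-projection roots that DSH builds on can be shown in a few lines of classic sign-random-projection LSH: nearby vectors tend to fall on the same side of random hyperplanes, so their binary codes agree in most bits. DSH itself replaces the purely random planes with data-dependent ones; the sketch below is only the random baseline.

```python
import random

def random_planes(n_bits: int, dim: int, seed: int = 0):
    """Gaussian random hyperplanes, one per output bit."""
    rng = random.Random(seed)
    return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]

def lsh_signature(vec, planes) -> int:
    """Each bit is the sign of the dot product with one hyperplane."""
    bits = 0
    for plane in planes:
        dot = sum(v * p for v, p in zip(vec, plane))
        bits = (bits << 1) | (1 if dot >= 0 else 0)
    return bits
```

Candidates for a query are the database points whose signatures match (or nearly match) the query's signature, turning an expensive high-dimensional scan into cheap integer comparisons.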


Articles listed on these pages have been found on publicly available internet pages and are cited with links to those pages. Some of the information included herein has been reprinted with permission from the authors or data repositories. Direct any requests via Email to SoS.Project (at) for removal of the links or modifications to specific citations. Please include the ID# of the specific citation in your correspondence.