
SoS Newsletter



Acoustic Fingerprints
(2014 Year in Review)


Acoustic fingerprints can be used to identify an audio sample or to quickly locate similar items in an audio database. As a security tool, acoustic fingerprints offer a modality of biometric user identification. Current research explores a range of aspects and applications, including mobile device security, anti-forensics, image processing techniques, and client-side embedding. The research cited here was published in 2014.


Zurek, E.E.; Gamarra, A.M.R.; Escorcia, G.J.R.; Gutierrez, C.; Bayona, H.; Perez, R.; Garcia, X., "Spectral Analysis Techniques For Acoustic Fingerprints Recognition," Image, Signal Processing and Artificial Vision (STSIVA), 2014 XIX Symposium on, pp. 1-5, 17-19 Sept. 2014. doi: 10.1109/STSIVA.2014.7010154 This article presents results of recognizing acoustic fingerprints from a noise source using spectral characteristics of the signal. Principal Components Analysis (PCA) is applied to reduce the dimensionality of the extracted features, and a classifier based on the k-nearest neighbors (KNN) method is then implemented to identify the pattern of the audio signal. This classifier is compared with an Artificial Neural Network (ANN) implementation. A filtering system must be applied to the acquired signals to reduce the 60 Hz noise generated by imperfections in the acquisition system. The methods described in this paper were used for vessel recognition.

Keywords: acoustic noise; acoustic signal processing; audio signals; fingerprint identification; neural nets; principal component analysis; spectral analysis; ANN; PCA; acoustic fingerprints recognition; artificial neural network; audio signal; filtering system; frequency 60 Hz; k-nearest neighbors; noise reduction; noise source; principal components analysis; signal spectral characteristics; spectral analysis; vessel recognition; Acoustics; Artificial neural networks; Boats; Feature extraction; Fingerprint recognition; Finite impulse response filters; Principal component analysis; ANN; Acoustic Fingerprint; FFT; KNN; PCA; Spectrogram   (ID#: 15-3770)
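The pipeline described in the abstract above (spectral features, PCA dimensionality reduction, then a KNN classifier) can be sketched roughly as follows. The feature size, component count, and k are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def spectral_features(signal, n_bins=32):
    """Magnitude spectrum, truncated to the first n_bins FFT coefficients."""
    return np.abs(np.fft.rfft(signal))[:n_bins]

def pca_fit(X, n_components):
    """Return the mean and top principal axes of the feature matrix X."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_transform(X, mean, axes):
    """Project (centered) features onto the retained principal axes."""
    return (X - mean) @ axes.T

def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k nearest training points in PCA space."""
    d = np.linalg.norm(train_X - x, axis=1)
    nearest = train_y[np.argsort(d)[:k]]
    vals, counts = np.unique(nearest, return_counts=True)
    return vals[np.argmax(counts)]
```

In a vessel-recognition setting, each class would hold fingerprints of one noise source; a query signal is reduced with the stored PCA basis and voted on by its nearest neighbors.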



Moussallam, M.; Daudet, L., "A General Framework For Dictionary Based Audio Fingerprinting," Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on, pp. 3077-3081, 4-9 May 2014. doi: 10.1109/ICASSP.2014.6854166 Fingerprint-based audio recognition systems must address concurrent objectives: fingerprints must be both robust to distortions and discriminative, while their dimension must remain small to allow fast comparison. This paper proposes to restate these objectives as a penalized sparse representation problem. On top of this dictionary-based approach, we propose a structured sparsity model in the form of a probabilistic distribution for the sparse support. A practical suboptimal greedy algorithm is then presented and evaluated on robustness and recognition tasks. We show that some existing methods can be seen as particular cases of this algorithm and that the general framework allows reaching other points of a Pareto-like continuum.

Keywords: Pareto distribution; audio signal processing; fingerprint identification; greedy algorithms; Pareto-like continuum; concurrent objectives; dictionary based audio fingerprinting; fingerprint-based audio recognition system; general framework; penalized sparse representation problem; probabilistic distribution; sparse support; structured sparsity; suboptimal greedy algorithm; Atomic clocks; Dictionaries; Entropy; Fingerprint recognition; Robustness; Speech; Time-frequency analysis; Audio Fingerprinting; Sparse Representation  (ID#: 15-3771)
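A minimal illustration of the dictionary-based idea: a greedy pursuit selects a few dictionary atoms per signal, and the set of selected atoms serves as a compact, comparable fingerprint. This is plain matching pursuit over unit-norm atoms, not the paper's penalized, probability-weighted variant.

```python
import numpy as np

def matching_pursuit(x, D, n_atoms):
    """Greedily pick atoms (rows of D, assumed unit-norm); the selected
    indices form the sparse support used as a fingerprint."""
    residual = x.astype(float).copy()
    support = []
    for _ in range(n_atoms):
        scores = np.abs(D @ residual)          # correlation with each atom
        k = int(np.argmax(scores))
        support.append(k)
        residual -= (D[k] @ residual) * D[k]   # remove the atom's contribution
    return sorted(set(support))

def support_similarity(s1, s2):
    """Jaccard overlap between two sparse supports (a robustness proxy)."""
    a, b = set(s1), set(s2)
    return len(a & b) / len(a | b)
```

Two recordings of the same content should select overlapping supports even under mild distortion, which is what makes the support usable as a fingerprint.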



Hui Zeng; Tengfei Qin; Xiangui Kang; Li Liu, "Countering Anti-Forensics Of Median Filtering," Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on, pp. 2704-2708, 4-9 May 2014. doi: 10.1109/ICASSP.2014.6854091 The statistical fingerprints left by median filtering can be a valuable clue for image forensics. However, these fingerprints may be maliciously erased by a forger. Recently, a tricky anti-forensic method has been proposed to remove median filtering traces by restoring images' pixel difference distribution. In this paper, we analyze the traces of this anti-forensic technique and propose a novel counter method. The experimental results show that our method can reveal this anti-forensic manipulation effectively at low computational load. To the best of our knowledge, this is the first work on countering anti-forensics of median filtering.

Keywords: image coding; image forensics; image restoration; median filters; statistical analysis; antiforensic method; antiforensics countering; image forensics; image pixel difference distribution restoration; median filtering traces; statistical fingerprints; Detectors; Digital images; Discrete Fourier transforms; Filtering; Forensics; Noise; Radiation detectors; Image forensics; anti-forensic; median filtering; pixel difference  (ID#: 15-3772)
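As background, the sketch below shows why first-order pixel differences betray median filtering: the filter flattens local patches, inflating the fraction of zero differences. This illustrates the basic statistical fingerprint that the anti-forensic method erases; the paper's counter-detector itself is not reproduced here.

```python
import numpy as np

def zero_diff_ratio(img):
    """Fraction of zero-valued horizontal first differences. Median-filtered
    images show an elevated ratio because the filter flattens local patches."""
    d = np.diff(img.astype(int), axis=1)
    return np.mean(d == 0)

def median_filter(img, k=3):
    """Plain k x k median filter (border pixels left untouched for brevity)."""
    out = img.copy()
    r = k // 2
    for i in range(r, img.shape[0] - r):
        for j in range(r, img.shape[1] - r):
            out[i, j] = np.median(img[i - r:i + r + 1, j - r:j + r + 1])
    return out
```

On noise-like content the ratio jumps sharply after filtering, which is exactly the signature a forger's anti-forensic post-processing tries to restore to normal.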



Naini, R.; Moulin, P., "Fingerprint Information Maximization For Content Identification," Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on, pp. 3809-3813, 4-9 May 2014. doi: 10.1109/ICASSP.2014.6854314 This paper presents a novel design of content fingerprints based on maximization of the mutual information across the distortion channel. We use the information bottleneck method to optimize the filters and quantizers that generate these fingerprints. A greedy optimization scheme is used to select filters from a dictionary and allocate fingerprint bits. We test the performance of this method for audio fingerprinting and show substantial improvements over existing learning based fingerprints.

Keywords: information retrieval; optimisation; content fingerprint; content identification; distortion channel; filter optimization; fingerprint information maximization; greedy optimization; information bottleneck method; learning based fingerprint; mutual information; quantizer optimization; Approximation methods; Databases; Dictionaries; Joints; Mutual information; Optimization; Quantization (signal); Audio fingerprinting; Content Identification; Information bottleneck; Information maximization  (ID#: 15-3773)
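The flavor of the information-maximization design can be conveyed with a toy greedy selection: binarized features are ranked by their empirical mutual information with content identity, and the top bits form the fingerprint. The paper's actual method optimizes filters and quantizers jointly through the information bottleneck; this per-bit ranking is a deliberate simplification.

```python
import numpy as np

def empirical_mi(bits, labels):
    """Empirical mutual information (nats) between a binary feature and labels."""
    mi = 0.0
    for b in (0, 1):
        for c in set(labels):
            p_bc = np.mean((bits == b) & (labels == c))
            if p_bc > 0:
                mi += p_bc * np.log(p_bc / (np.mean(bits == b) * np.mean(labels == c)))
    return mi

def greedy_select(features, labels, n_bits):
    """Keep the n_bits binarized features that individually carry the most
    information about content identity."""
    mis = [empirical_mi(f, labels) for f in features]
    return list(np.argsort(mis)[::-1][:n_bits])
```

A perfectly informative bit attains log 2 nats on balanced labels, while an independent bit scores zero, so the ranking separates useful fingerprint bits from noise.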



Yuxi Liu; Hatzinakos, D., "Human Acoustic Fingerprints: A Novel Biometric Modality For Mobile Security," Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on, pp. 3784-3788, 4-9 May 2014. doi: 10.1109/ICASSP.2014.6854309 Recently, the demand for more robust protection against unauthorized use of mobile devices has been growing rapidly. This paper presents a novel biometric modality, the Transient Evoked Otoacoustic Emission (TEOAE), for mobile security. Prior works have investigated TEOAE for biometrics in a setting where an individual is to be identified among a pre-enrolled identity gallery. However, this limits the applicability to the mobile environment, where attacks in most cases come from impostors previously unknown to the system. Therefore, we employ an unsupervised learning approach based on an Autoencoder Neural Network to tackle this blind recognition problem. The learning model is trained on a generic dataset and used to verify an individual in a random population. We also introduce a framework for a mobile biometric system with practical application in mind. Experiments show the merits of the proposed method, and system performance is further evaluated by cross-validation, with an average EER of 2.41% achieved.

Keywords: acoustic signal processing; biometrics (access control); learning (artificial intelligence); mobile computing; mobile handsets; neural nets; otoacoustic emissions; autoencoder neural network; biometric modality; blind recognition problem; generic dataset; human acoustic fingerprints; learning model; mobile biometric system; mobile devices; mobile environment; mobile security; pre-enrolled identity gallery; transient evoked otoacoustic emission; unsupervised learning approach; Biometrics (access control); Feature extraction; Mobile communication; Neural networks; Security; Time-frequency analysis; Training; Autoencoder Neural Network; Biometric Verification; Mobile Security; Otoacoustic Emission; Time-frequency Analysis  (ID#: 15-3774)



Alias T, E.; Naveen, N.; Mathew, D., "A Novel Acoustic Fingerprint Method for Audio Signal Pattern Detection," Advances in Computing and Communications (ICACC), 2014 Fourth International Conference on, pp. 64-68, 27-29 Aug. 2014. doi: 10.1109/ICACC.2014.21 This paper presents a novel and efficient audio signal recognition algorithm with limited computational complexity. Since the audio recognition system will be used in real-world environments where background noise is high, conventional speech recognition techniques are not directly applicable, as they perform poorly in such environments. Here, we introduce a new audio recognition algorithm optimized for mechanical sounds such as car horns, telephone rings, etc. This is a hybrid time-frequency approach which uses an acoustic fingerprint for the recognition of audio signal patterns. The limited computational complexity is achieved through efficient use of the time domain and the frequency domain in two different processing phases, detection and recognition respectively, with the transition between the two phases carried out through a finite state machine (FSM) model. Simulation results show that the algorithm effectively recognizes audio signals within a noisy environment.

Keywords: acoustic noise; acoustic signal detection; acoustic signal processing; audio signal processing; computational complexity; finite state machines; pattern recognition; time-frequency analysis; FSM model; acoustic fingerprint method; audio signal pattern detection; background noises; computational complexity; efficient audio signal recognition algorithm; finite state machine; hybrid time-frequency approach; mechanical sounds; speech recognition techniques; Acoustics; Computational complexity; Correlation; Frequency-domain analysis; Noise measurement; Pattern recognition; Time-domain analysis; Acoustic fingerprint; Audio detection; Audio recognition; Finite State Machine (FSM); Pitch frequency; Spectral signature; Time-Frequency processing  (ID#: 15-3775)
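The two-phase idea above (a cheap time-domain energy gate in a detection state, switching to a frequency-domain match in a recognition state) can be caricatured as follows. The energy threshold and the single-bin "template" are stand-ins for the paper's spectral signatures.

```python
import numpy as np

def dominant_bin(frame):
    """Index of the strongest FFT bin of one frame."""
    return int(np.argmax(np.abs(np.fft.rfft(frame))))

def fsm_recognize(frames, energy_thresh, templates):
    """Two-state machine: DETECT applies a cheap time-domain energy gate;
    RECOGNIZE runs the costlier frequency-domain match only on loud frames.
    templates maps a sound name to its expected dominant FFT bin."""
    state = "DETECT"
    hits = []
    for frame in frames:
        if state == "DETECT" and np.mean(frame ** 2) > energy_thresh:
            state = "RECOGNIZE"
        if state == "RECOGNIZE":
            b = dominant_bin(frame)
            for name, tbin in templates.items():
                if b == tbin:
                    hits.append(name)
            state = "DETECT"
    return hits
```

Keeping the FFT out of the quiet frames is what gives the two-phase design its low average computational cost.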



Ghatak, S.; Lodh, A.; Saha, E.; Goyal, A.; Das, A.; Dutta, S., "Development of a Keyboardless Social Networking Website For Visually Impaired: SocialWeb," Global Humanitarian Technology Conference - South Asia Satellite (GHTC-SAS), 2014 IEEE, pp. 232-236, 26-27 Sept. 2014. doi: 10.1109/GHTC-SAS.2014.6967589 Over the past decade, we have witnessed a huge upsurge in social networking, which continues to touch and transform our lives to the present day. Social networks help us communicate with acquaintances and friends with whom we share similar interests on a common platform. Globally, there are more than 200 million visually impaired people. Visual impairment has many issues associated with it, but the one that stands out is the lack of accessible content for entertainment and safe socializing. This paper deals with the development of a keyboardless social networking website for the visually impaired. The term keyboardless signifies minimal use of the keyboard: the user explores the contents of the website using assistive technologies such as screen readers and speech-to-text (STT) conversion, which provides a user-friendly experience for the target audience. As soon as a user with minimal computer proficiency opens the website, the screen reader identifies the username and password fields. The user speaks the username, which is entered via STT conversion (using the Web Speech API). Control then moves to the password field, and the password is similarly obtained and matched against the one saved in the website database. The concept of acoustic fingerprinting has been implemented to validate the passwords of registered users and foil malicious attackers. On a successful password match, the user can enjoy the services of the website without further hassle. Once the access obstacles associated with social networking sites are resolved and proper technologies are put in place, social networking sites can be a rewarding, fulfilling, and enjoyable experience for visually impaired people.

Keywords: handicapped aids; human computer interaction; message authentication; social networking (online); STT conversion; SocialWeb; acoustic fingerprinting; assistive technologies; computer proficiency; keyboardless social networking Website; malicious attackers; screen readers; speech to text conversion technologies; user friendliness; visually impaired people; Communities; Computers; Fingerprint recognition; Media; Social network services; Speech; Speech recognition; Assistive technologies; STT conversion; Web Speech API; audio fingerprinting; screen reader  (ID#: 15-3776)



Severin, F.; Baradarani, A.; Taylor, J.; Zhelnakov, S.; Maev, R., "Auto-Adjustment Of Image Produced By Multi-Transducer Ultrasonic System," Ultrasonics Symposium (IUS), 2014 IEEE International, pp. 1944-1947, 3-6 Sept. 2014. doi: 10.1109/ULTSYM.2014.0483 Acoustic microscopy is characterized by a relatively long scanning time, which is required for the motion of the transducer over the entire scanning area. This time may be reduced by using a multi-channel acoustical system which has several identical transducers arranged as an array and mounted on a mechanical scanner, so that each transducer scans only a fraction of the total area. The resulting image is formed as a combination of all acquired partial data sets. The mechanical instability of the scanner, as well as differences in the parameters of the individual transducers, causes a misalignment of the image fractions. This distortion may be partially compensated for by introducing constant or dynamic signal leveling and data shift procedures. However, reducing the random instability component requires more advanced algorithms, including auto-adjustment of processing parameters. The described procedure was implemented in a prototype of an ultrasonic fingerprint reading system. The specialized cylindrical scanner provides a helical spiral lens trajectory which eliminates repeatable acceleration, reduces vibration, and allows constant data flow at the maximal rate. It is equipped with an array of four spherically focused 50 MHz acoustic lenses operating in pulse-echo mode. Each transducer is connected to a separate channel including a pulser, receiver, and digitizer. The output 3D data volume contains interlaced B-scans coming from each channel. Afterward, data processing includes pre-determined procedures of constant layer shift to compensate for transducer displacement, as well as phase shift and amplitude leveling to compensate for variation in transducer characteristics. Analysis of the statistical parameters of individual scans enables adaptive elimination of the axial misalignment and mechanical vibrations. Further 2D correlation of overlapping partial C-scans realizes an interpolative adjustment which essentially improves the output image. Implementation of this adaptive algorithm in the data processing sequence allows us to significantly reduce misreading due to hardware noise and finger motion during scanning. The system provides a high-quality acoustic image of the fingerprint including different levels of information: fingerprint pattern, sweat pore locations, and internal dermis structures. These additional features can effectively facilitate fingerprint-based identification. The developed principles and algorithm implementations improve the quality, stability, and reliability of acoustical data obtained with a mechanical scanner accommodating several transducers. The general principles developed during this work can be applied to other configurations of advanced ultrasonic systems designed for various biomedical and NDE applications. The data processing algorithm, developed for a specific biometric task, can be adapted to compensate for the mechanical imperfections of other devices.

Keywords: acoustic devices; acoustic microscopy; fingerprint identification; image processing; ultrasonic imaging; ultrasonic transducer arrays; acoustic lenses; acoustic microscopy; amplitude leveling; arrayed transducers; biometric task; cylindrical scanner; data processing sequence; data shift procedures; digitizer; dynamical signal leveling; frequency 50 MHz; helical spiral lens trajectory; high quality acoustic image; image autoadjustment; image fracture; multichannel acoustical system; multitransducer ultrasonic system; phase shift; pulse-echo mode operation; pulser; receiver; scanner mechanical instability; transducer displacement; ultrasonic fingerprint reading system; Acoustic distortion; Acoustics; Arrays; Fingerprint recognition; Lenses; Skin; Transducers; Acoustical microscopy; array transducer; image processing  (ID#: 15-3777)
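The "2D correlation of overlapping partial C-scans" step can be illustrated with a simple FFT cross-correlation alignment: correlate two overlapping scans and take the correlation peak as the integer offset between them. Real scan data would additionally need the amplitude leveling and sub-pixel refinement the abstract describes.

```python
import numpy as np

def estimate_shift(a, b):
    """Integer (row, col) shift r such that np.roll(b, r, axis=(0, 1))
    best matches a, found via the peak of the FFT cross-correlation."""
    corr = np.fft.ifft2(np.fft.fft2(a) * np.conj(np.fft.fft2(b))).real
    idx = np.unravel_index(np.argmax(corr), corr.shape)
    shift = [int(idx[0]), int(idx[1])]
    # wrap shifts beyond half the image back to negative offsets
    for k in (0, 1):
        if shift[k] > a.shape[k] // 2:
            shift[k] -= a.shape[k]
    return tuple(shift)
```

With the offset known, the partial images can be translated into a common frame before stitching, which is the essence of the auto-adjustment described above.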



Chitnis, P.V.; Lloyd, H.; Silverman, R.H., "An Adaptive Interferometric Sensor For All-Optical Photoacoustic Microscopy," Ultrasonics Symposium (IUS), 2014 IEEE International, pp. 353-356, 3-6 Sept. 2014. doi: 10.1109/ULTSYM.2014.0087 Conventional photoacoustic microscopy (PAM) involves detection of optically induced thermo-elastic waves using ultrasound transducers. This approach requires acoustic coupling, and the spatial resolution is limited by the focusing properties of the transducer. We present an all-optical PAM approach that involves detection of the photoacoustically induced surface displacements using an adaptive, two-wave mixing interferometer. The interferometer consisted of a 532-nm CW laser and a 5×5×5 mm³ Bismuth Silicon Oxide photorefractive crystal (PRC). The laser beam was expanded to 3 mm and split into two paths: a reference beam that passed directly through the PRC, and a signal beam that was focused at the surface through a 100X, infinity-corrected objective and returned to the PRC. The PRC matched the wave front of the reference beam to that of the signal beam for optimal interference. The interference of the two beams produced optical-intensity modulations that were correlated with surface displacements. A GHz-bandwidth photoreceiver, a low-noise 20-dB amplifier, and a 12-bit digitizer were employed for time-resolved detection of the surface-displacement signals. In combination with a 5-ns, 532-nm pump laser, the interferometric probe was employed for imaging ink patterns, such as a fingerprint, on a glass slide. The signal beam was focused at a reflective cover slip that was separated from the fingerprint by 5 mm of acoustic-coupling gel. A 3×5 mm² area of the coverslip was raster scanned with 100-μm steps, and surface-displacement signals at each location were averaged 20 times. Image reconstruction based on time reversal of the PA-induced displacement signals produced the photoacoustic image of the ink patterns. The reconstructed image of the fingerprint was consistent with its photograph, demonstrating the ability of the system to resolve micron-scale features at a depth of 5 mm.

Keywords: acoustic microscopy; acoustic signal detection; acoustic wave interferometry; analogue-digital conversion; biological techniques; biological tissues; bismuth compounds; image reconstruction; light interferometers; low noise amplifiers; multiwave mixing; optical microscopy; optical pumping; optical receivers; photoacoustic effect; photorefractive materials; thermoelasticity; ultrasonic focusing; ultrasonic transducers; BiSiO2; CW laser; acoustic coupling; acoustic-coupling gel; adaptive interferometric microscopy; adaptive interferometric sensor; bismuth silicon oxide photorefractive crystals; focusing properties; glass slide; image reconstruction; imaging ink patterns; laser beam; low-noise amplifier; noise figure 20 dB; optical PAM approach; optical photoacoustic microscopy; optical-intensity modulation; optically induced thermo-elastic wave detection; optimal interference; photoacoustic image; photoacoustically induced surface displacement detection; photoreceiver; reconstructed image; reflective cover slip; surface amplifier; surface-displacement signals; time-resolved detection; two-wave mixing interferometer; ultrasound transducers; wavelength 532 nm; Acoustic beams; Acoustics; Imaging; Laser beams; Laser excitation; Optical interferometry; Optical surface waves  (ID#: 15-3778)



Shakeri, S.; Leus, G., "Underwater Ultra-Wideband Fingerprinting-Based Sparse Localization," Signal Processing Advances in Wireless Communications (SPAWC), 2014 IEEE 15th International Workshop on, pp. 140-144, 22-25 June 2014. doi: 10.1109/SPAWC.2014.6941333 In this work, a new fingerprinting-based localization algorithm is proposed for an underwater medium by utilizing ultra-wideband (UWB) signals. In many conventional underwater systems, localization is accomplished by utilizing acoustic waves. Electromagnetic waves, on the other hand, have not been employed for underwater localization due to the high attenuation of the signal in water. However, it is possible to use UWB signals for short-range underwater localization. The feasibility of performing localization in an underwater medium is illustrated with a fingerprinting-based approach. By employing the concept of compressive sampling, we propose a sparsity-based localization method for which we define a system model exploiting the spatial sparsity.

Keywords: compressed sensing; ultra wideband communication; underwater acoustic communication; underwater acoustic propagation; UWB signal utilization; acoustic wave utilization; compressive sampling; grid matching; sparsity-based localization method; ultrawideband signal utilization; underwater ultrawideband fingerprinting-based sparse localization; Accuracy; Dictionaries; Indexes; Receiving antennas; Signal processing algorithms; Synchronization; Vectors; fingerprinting localization; grid matching; sparse recovery; underwater   (ID#: 15-3779)
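A sketch of the sparse-recovery step, assuming a dictionary D whose columns are unit-norm fingerprints of grid cells: a few iterations of orthogonal matching pursuit pick the cells that best explain the measurement. The grid, dictionary, and noise model here are illustrative, not the paper's underwater channel model.

```python
import numpy as np

def sparse_localize(y, D, n_targets=1):
    """Orthogonal matching pursuit over the fingerprint dictionary: each
    column of D is the fingerprint of one grid cell; the selected columns
    are the estimated occupied cells."""
    residual = y.astype(float).copy()
    cells = []
    for _ in range(n_targets):
        k = int(np.argmax(np.abs(D.T @ residual)))
        cells.append(k)
        A = D[:, cells]
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)  # refit on chosen cells
        residual = y - A @ coef
    return cells
```

Exploiting the fact that only a few grid cells are occupied is what lets the method work from far fewer measurements than grid points.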



Rafii, Z.; Coover, B.; Jinyu Han, "An Audio Fingerprinting System For Live Version Identification Using Image Processing Techniques," Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on, pp. 644-648, 4-9 May 2014. doi: 10.1109/ICASSP.2014.6853675 Suppose that you are at a music festival checking out an artist, and you would like to quickly learn about the song that is being played (e.g., title, lyrics, album, etc.). If you have a smartphone, you could record a sample of the live performance and compare it against a database of existing recordings from the artist. Services such as Shazam or SoundHound will not work here, as this is not the typical framework for audio fingerprinting or query-by-humming systems: a live performance is neither identical to its studio version (e.g., variations in instrumentation, key, tempo, etc.) nor is it a hummed or sung melody. We propose an audio fingerprinting system that can deal with live version identification by using image processing techniques. Compact fingerprints are derived using a log-frequency spectrogram and an adaptive thresholding method, and template matching is performed using the Hamming similarity and the Hough transform.

Keywords: Hough transforms; audio signal processing; fingerprint identification; image segmentation; Hamming similarity; Hough Transform; adaptive thresholding method; audio fingerprinting system; compact fingerprints; image processing techniques; live version identification; log-frequency spectrogram; music festival; smartphone; template matching; Degradation; Robustness; Spectrogram; Speech; Speech processing; Time-frequency analysis; Transforms; Adaptive thresholding; Constant Q Transform; audio fingerprinting; cover identification  (ID#: 15-3780)
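The adaptive-thresholding and Hamming-similarity steps can be sketched as follows: binarize a spectrogram against a local median threshold, then compare binary fingerprints cell by cell. The neighborhood size is an illustrative choice, and the Hough-transform alignment stage of the paper is omitted.

```python
import numpy as np

def binary_fingerprint(spectrogram, size=5):
    """Binarize a (log-)spectrogram with an adaptive threshold: a cell is 1
    when it exceeds the median of its local time-frequency neighborhood."""
    h, w = spectrogram.shape
    out = np.zeros((h, w), dtype=np.uint8)
    r = size // 2
    for i in range(h):
        for j in range(w):
            patch = spectrogram[max(0, i - r):i + r + 1, max(0, j - r):j + r + 1]
            out[i, j] = spectrogram[i, j] > np.median(patch)
    return out

def hamming_similarity(f1, f2):
    """Fraction of matching cells between two binary fingerprints."""
    return np.mean(f1 == f2)
```

Because the threshold is local, the binary pattern survives broadband level changes, which is part of what makes this robust to live-performance variation.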



Jun-Yong Lee; Hyoung-Gook Kim, "Audio Fingerprinting To Identify TV Commercial Advertisement In Real-Noisy Environment," Communications and Information Technologies (ISCIT), 2014 14th International Symposium on, pp. 527-530, 24-26 Sept. 2014. doi: 10.1109/ISCIT.2014.7011969 This paper proposes a high-performance audio fingerprint extraction method for identifying TV commercial advertisements. In the proposed method, salient audio peak-pair fingerprints based on the constant Q transform (CQT) are hashed and stored so that they can be efficiently compared to one another. Experimental results confirm that the proposed method is quite robust under different noise conditions and improves the accuracy of the audio fingerprinting system in real noisy environments.

Keywords: acoustic noise; audio signal processing; feature extraction; television broadcasting; transforms; CQT; TV commercial advertisement identification; audio fingerprinting extraction method; constant Q transform; real-noisy environment; salient audio peak pair fingerprints; Databases; Fingerprint recognition; Noise; Robustness; Servers; TV; Time-frequency analysis; audio content identification; audio fingerprinting; constant Q transform  (ID#: 15-3780)
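Peak-pair hashing can be sketched in the Shazam landmark style: each spectral peak is paired with a few following peaks, and the (freq1, freq2, time-delta) triple becomes a hash that is invariant to where the clip starts. Plain spectrogram peaks stand in here for the paper's CQT peaks.

```python
import numpy as np

def peaks(spec, thresh):
    """(freq_bin, frame) coordinates of spectrogram cells above thresh,
    ordered by time."""
    f, t = np.where(spec > thresh)
    order = np.argsort(t)
    return list(zip(f[order], t[order]))

def peak_pair_hashes(pks, fan_out=3, max_dt=10):
    """Pair each anchor peak with a few following peaks: the (f1, f2, dt)
    triple is the hash, stored with the anchor time for later lookup."""
    hashes = []
    for i, (f1, t1) in enumerate(pks):
        for f2, t2 in pks[i + 1:i + 1 + fan_out]:
            dt = t2 - t1
            if 0 < dt <= max_dt:
                hashes.append(((f1, f2, dt), t1))
    return hashes

def match_count(query_hashes, db_hashes):
    """Number of query hashes present in the database (a similarity score)."""
    db = {h for h, _ in db_hashes}
    return sum(1 for h, _ in query_hashes if h in db)
```

Since the hash stores only frequency pairs and time deltas, a clip captured mid-commercial still generates the same hashes as the stored advertisement.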



Hui Su; Hajj-Ahmad, A.; Min Wu; Oard, D.W., "Exploring the Use Of ENF For Multimedia Synchronization," Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on, pp. 4613-4617, 4-9 May 2014. doi: 10.1109/ICASSP.2014.6854476 The electric network frequency (ENF) signal can be captured in multimedia recordings due to electromagnetic influences from the power grid at the time of recording. Recent work has exploited ENF signals for forensic applications, such as authenticating and detecting forgery of ENF-containing multimedia signals, and inferring their time and location of creation. In this paper, we explore a new potential of ENF signals: automatic synchronization of audio and video. The ENF signal, as a time-varying random process, can be used as a timing fingerprint of multimedia signals, and synchronization of audio and video recordings can be achieved by aligning their embedded ENF signals. We demonstrate the proposed scheme with two applications: multi-view video synchronization and synchronization of historical audio recordings. The experimental results show that the ENF-based synchronization approach is effective and has the potential to solve problems that are intractable by other existing methods.

Keywords: audio recording; electromagnetic interference; random processes; synchronisation; video recording; ENF signal; electric network frequency signal; forensic applications; historical audio recording automatic synchronization; multimedia recordings; multimedia signal timing fingerprint; multiview video recording automatic synchronization; power grid; time-varying random process; Audio recording; Forensics; Frequency estimation; Multimedia communication; Streaming media; Synchronization; Video recording; ENF; audio; historical recordings; synchronization; video  (ID#: 15-3781)
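The alignment step can be sketched as a lag search over two ENF traces (assumed already extracted, e.g., one mains-frequency estimate per second): the lag maximizing their normalized correlation gives the offset between the recordings. The trace model below is synthetic.

```python
import numpy as np

def align_by_enf(enf_a, enf_b, max_lag):
    """Return the lag l such that enf_b[k] ~ enf_a[k + l], i.e. recording b
    started l ENF samples after recording a, via normalized correlation of
    the mean-removed traces."""
    a = enf_a - enf_a.mean()
    b = enf_b - enf_b.mean()
    best_lag, best_score = 0, -np.inf
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            x, y = a[lag:], b[:len(a) - lag]
        else:
            x, y = a[:len(a) + lag], b[-lag:]
        n = min(len(x), len(y))
        if n == 0:
            continue
        x, y = x[:n], y[:n]
        score = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y) + 1e-12)
        if score > best_score:
            best_score, best_lag = score, lag
    return best_lag
```

Because both devices recorded the same grid, their ENF wander matches sample for sample at the true offset, giving a sharp correlation peak.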



Hongbo Liu; Jie Yang; Sidhom, S.; Yan Wang; Yingying Chen; Fan Ye, "Accurate WiFi Based Localization for Smartphones Using Peer Assistance," Mobile Computing, IEEE Transactions on, vol. 13, no. 10, pp. 2199-2214, Oct. 2014. doi: 10.1109/TMC.2013.140 Highly accurate indoor localization of smartphones is critical to enable novel location-based features for users and businesses. In this paper, we first conduct an empirical investigation of the suitability of WiFi localization for this purpose. We find that although reasonable accuracy can be achieved, significant errors (e.g., 6-8 m) always exist. The root cause is the existence of distinct locations with similar signatures, a fundamental limit of pure WiFi-based methods. Inspired by the high density of smartphones in public spaces, we propose a peer-assisted localization approach to eliminate such large errors. It obtains accurate acoustic ranging estimates among peer phones, then maps their locations jointly against the WiFi signature map subject to the ranging constraints. We devise techniques for fast acoustic ranging among multiple phones and build a prototype. Experiments show that the approach can reduce the maximum and 80th-percentile errors to as small as 2 m and 1 m, in time no longer than the original WiFi scanning, with negligible impact on battery lifetime.

Keywords: indoor radio; radionavigation; smart phones; wireless LAN; WiFi based localization method; WiFi signature map; acoustic ranging estimates; battery lifetime; indoor localization; location based features; peer assisted localization approach; peer phones; smart phones; Accuracy; Acoustics; Distance measurement; IEEE 802.11 Standards; Servers; Smart phones; Testing; Peer Assisted Localization; Smartphone; WiFi fingerprint localization; peer assisted localization  (ID#: 15-3782)
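The peer-assistance idea reduces to a small toy: each phone has several WiFi-candidate positions with similar signatures, and acoustic ranging between phones disambiguates them by picking the joint assignment most consistent with the measured distances. Brute force over candidates here; the paper's joint mapping is far more scalable.

```python
import numpy as np
from itertools import product

def peer_assisted_pick(candidates, ranges):
    """candidates[i] = list of (x, y) WiFi-candidate positions for phone i;
    ranges[(i, j)] = measured acoustic distance between phones i and j.
    Return the joint assignment whose pairwise distances best match the
    ranging constraints (least-squares residual)."""
    best, best_err = None, np.inf
    for combo in product(*candidates):
        err = 0.0
        for (i, j), r in ranges.items():
            d = np.hypot(combo[i][0] - combo[j][0], combo[i][1] - combo[j][1])
            err += (d - r) ** 2
        if err < best_err:
            best_err, best = err, combo
    return best
```

Even one accurate inter-phone range is often enough to reject a WiFi candidate several meters from the truth, which is how peer assistance removes the large-error tail.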



Lan Zhang; Kebin Liu; Yonghang Jiang; Xiang-Yang Li; Yunhao Liu; Panlong Yang, "Montage: Combine Frames With Movement Continuity For Realtime Multi-User Tracking," INFOCOM, 2014 Proceedings IEEE, pp. 799-807, April 27-May 2, 2014. doi: 10.1109/INFOCOM.2014.6848007 In this work we design and develop Montage for real-time multi-user formation tracking and localization with off-the-shelf smartphones. Montage achieves submeter-level tracking accuracy by integrating temporal and spatial constraints from user movement vector estimation and distance measuring. In Montage we designed a suite of novel techniques to surmount a variety of challenges in real-time tracking, without infrastructure and fingerprints, and without any a priori user-specific (e.g., stride-length and phone-placement) or site-specific (e.g., digitalized map) knowledge. We implemented, deployed, and evaluated Montage in both outdoor and indoor environments. Our experimental results (847 traces from 15 users) show that the stride length estimated by Montage over all users has error within 9 cm, and the moving direction estimated by Montage is within 20°. For real-time tracking, Montage provides meter-second-level formation tracking accuracy with off-the-shelf mobile phones.

Keywords: smart phones; target tracking; meter-second-level formation tracking accuracy; mobile phones; movement continuity; moving-direction estimation; real-time multiuser formation tracking; smartphones; spatial constraints; submeter-level tracking; temporal constraints; user movement vector estimation; Acceleration; Acoustics; Distance measurement; Earth; Topology; Tracking; Vectors  (ID#: 15-3783)
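The bookkeeping behind stride-and-heading tracking is plain dead reckoning, sketched below; Montage's actual contribution is estimating stride length and direction accurately and fusing peer constraints, none of which is reproduced here.

```python
import numpy as np

def dead_reckon(start, steps):
    """Integrate per-step (stride_length_m, heading_deg) estimates into a
    2-D track, starting from the (x, y) position `start`."""
    x, y = start
    track = [(x, y)]
    for stride, heading in steps:
        x += stride * np.cos(np.radians(heading))
        y += stride * np.sin(np.radians(heading))
        track.append((x, y))
    return track
```

Errors in stride or heading accumulate over the track, which is why the abstract's 9 cm stride and 20° heading bounds matter so much for sustained accuracy.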



Kaghaz-Garan, S.; Umbarkar, A.; Doboli, A., "Joint Localization And Fingerprinting Of Sound Sources For Auditory Scene Analysis," Robotic and Sensors Environments (ROSE), 2014 IEEE International Symposium on, pp. 49-54, 16-18 Oct. 2014. doi: 10.1109/ROSE.2014.6952982 In the field of scene understanding, researchers have mainly focused on using video or images to extract the different elements in a scene; the computational as well as monetary costs associated with such implementations are high. This paper proposes a low-cost system which uses sound-based techniques to jointly perform localization and fingerprinting of the sound sources. A network of embedded nodes is used to sense the sound inputs. Phase-based sound localization and Support Vector Machine classification are used to locate and classify elements of the scene, respectively, and the fusion of all this data presents a complete "picture" of the scene. The proposed concepts are applied to a vehicular-traffic case study. Experiments show that the system has a fingerprinting accuracy of up to 97.5%, localization error of less than 4 degrees, and scene prediction accuracy of 100%.

Keywords: acoustic signal processing; pattern classification; sensor fusion; support vector machines; traffic engineering computing; auditory scene analysis; data fusion; embedded nodes; phase-based sound localization; scene element classification; sound source fingerprinting; sound source localization; sound-based techniques; support-vector machine classification; vehicular-traffic case study; Accuracy; Feature extraction; Image analysis; Sensors; Support vector machines; Testing; Vehicles  (ID#: 15-3784)
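Phase-based localization with a two-microphone node can be sketched as a time-difference-of-arrival (TDOA) estimate followed by a far-field angle conversion. The sampling rate and microphone spacing below are illustrative; the paper's multi-node fusion and SVM fingerprinting are not reproduced.

```python
import numpy as np

def tdoa_samples(x1, x2, max_lag):
    """Lag l (in samples) such that x2[n] ~ x1[n + l]; a negative lag means
    x2 is a delayed copy of x1. Found by brute-force cross-correlation."""
    best, best_lag = -np.inf, 0
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            s = np.dot(x1[lag:], x2[:len(x2) - lag])
        else:
            s = np.dot(x1[:len(x1) + lag], x2[-lag:])
        if s > best:
            best, best_lag = s, lag
    return best_lag

def doa_degrees(delay_samples, fs, mic_dist, c=343.0):
    """Direction of arrival from the inter-mic delay (far-field assumption)."""
    val = np.clip(delay_samples * c / (fs * mic_dist), -1.0, 1.0)
    return float(np.degrees(np.arcsin(val)))
```

Each embedded node would report such an angle, and fusing bearings from several nodes yields the source position used in the scene "picture."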



Luque, J.; Anguera, X., "On the Modeling Of Natural Vocal Emotion Expressions Through Binary Key," Signal Processing Conference (EUSIPCO), 2014 Proceedings of the 22nd European, pp. 1562, 1566, 1-5 Sept. 2014. This work presents a novel method to estimate naturally expressed emotions in speech through binary acoustic modeling. Standard acoustic features are mapped to a binary value representation, and a support vector regression model is used to correlate them with the three continuous emotional dimensions. Three different sets of speech features, two based on spectral parameters and one on prosody, are compared on the VAM corpus, a set of spontaneous dialogues from a German TV talk-show. The regression analysis, in terms of correlation coefficient and mean absolute error, shows that the binary key modeling is able to successfully capture speaker emotion characteristics. The proposed algorithm obtains results comparable to those reported in the literature while relying on a much smaller set of acoustic descriptors. Furthermore, the authors also report preliminary results based on the combination of the binary models, which brings further performance improvements.

Keywords: acoustic signal processing; emotion recognition; regression analysis; speech recognition; support vector machines; German TV talk-show; VAM corpus; acoustic descriptors; binary acoustic modeling; binary key modeling; binary value representation; correlation coefficient; mean absolute error; natural vocal emotion expression modelling; speaker emotion characteristics; spectral parameters; speech features; spontaneous dialogues; standard acoustic feature mapping; support vector regression model; three-continuous emotional dimensions; Acoustics; Emotion recognition; Feature extraction; Speech; Speech recognition; Training; Vectors; Emotion modeling; VAM corpus; binary fingerprint; dimensional emotions  (ID#: 15-3785)
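The binary-key idea, mapping variable-length continuous acoustic features to a single fixed-size binary representation, can be sketched generically. In the toy version below the anchor vectors are random rather than trained, and the `top_m` activation rule is an illustrative assumption, not the authors' exact mapping:

```python
import numpy as np

def binary_key(frames, anchors, top_m=2):
    """Map a sequence of feature frames to one fixed-size binary key.

    Each frame activates the bits of its top_m most similar anchors
    (cosine similarity); the utterance key is the OR over all frames.
    """
    f = frames / np.linalg.norm(frames, axis=1, keepdims=True)
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    sim = f @ a.T                          # frames x anchors similarity
    key = np.zeros(len(anchors), dtype=bool)
    for row in sim:
        key[np.argsort(row)[-top_m:]] = True
    return key

rng = np.random.default_rng(1)
anchors = rng.standard_normal((64, 13))    # 64-bit key over MFCC-like features
frames = rng.standard_normal((50, 13))     # one utterance (50 frames)
key = binary_key(frames, anchors)
```

A support vector regressor would then be trained on such keys to predict the continuous emotional dimensions (valence, activation, dominance).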



Van Vaerenbergh, S.; González, O.; Vía, J.; Santamaría, I., "Physical Layer Authentication Based On Channel Response Tracking Using Gaussian Processes," Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on, pp. 2410, 2414, 4-9 May 2014. doi: 10.1109/ICASSP.2014.6854032 Physical-layer authentication techniques exploit the unique properties of the wireless medium to enhance traditional higher-level authentication procedures. We propose to reduce the higher-level authentication overhead by using a state-of-the-art multi-target tracking technique based on Gaussian processes. The proposed technique has the additional advantage that it is capable of automatically learning the dynamics of the trusted user's channel response and the time-frequency fingerprint of intruders. Numerical simulations show very low intrusion rates, and an experimental validation using a wireless test bed with programmable radios demonstrates the technique's effectiveness.

Keywords: Gaussian processes; fingerprint identification; security of data; target tracking; telecommunication security; time-frequency analysis; wireless channels; Gaussian process; automatic learning; channel response tracking; higher level authentication overhead; higher level authentication procedure; intruder; multitarget tracking technique; numerical simulation; physical layer authentication; programmable radio; time-frequency fingerprint; trusted user channel response; wireless medium; wireless test bed; Authentication; Channel estimation; Communication system security; Gaussian processes; Time-frequency analysis; Trajectory; Wireless communication; Gaussian processes; multi-target tracking; physical-layer authentication; wireless communications  (ID#: 15-3786)
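The channel-tracking step can be illustrated with a minimal Gaussian-process regression: predict the trusted user's channel response at a new probe time and accept an observation only if it falls within the predictive uncertainty. The kernel, lengthscale, noise level, and 3-sigma acceptance rule below are illustrative assumptions, not the paper's model:

```python
import numpy as np

def rbf(a, b, ls=2.0):
    """Squared-exponential kernel between two 1-D time arrays."""
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ls) ** 2)

def gp_predict(t_train, y_train, t_test, noise=0.01):
    """Posterior mean and variance of a GP with unit prior variance."""
    K = rbf(t_train, t_train) + noise * np.eye(len(t_train))
    Ks = rbf(t_test, t_train)
    mean = Ks @ np.linalg.solve(K, y_train)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1) + noise
    return mean, var

# Smoothly drifting "channel response" of the trusted user at probe times 0..9.
t = np.arange(10.0)
y = np.sin(0.3 * t)

# Predict the response at a new probe time and gate new observations on it.
mean, var = gp_predict(t, y, np.array([4.5]))
accept = lambda obs: abs(obs - mean[0]) < 3.0 * np.sqrt(var[0])
```

A smooth continuation of the trusted channel passes the gate, while a sudden jump, such as an intruder's different time-frequency fingerprint would produce, is rejected without invoking higher-level authentication.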



Bianchi, T.; Piva, A., "TTP-free Asymmetric Fingerprinting Protocol Based On Client Side Embedding," Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on, pp. 3987, 3991, 4-9 May 2014. doi: 10.1109/ICASSP.2014.6854350 In this paper, we propose a scheme to employ an asymmetric fingerprinting protocol within a client-side embedding distribution framework. The scheme is based on a novel client-side embedding technique that is able to transmit a binary fingerprint. This enables secure distribution of personalized decryption keys containing the Buyer's fingerprint by means of existing asymmetric protocols, without using a trusted third party. Simulation results show that the fingerprint can be reliably recovered by using non-blind decoding, and it is robust with respect to common attacks. The proposed scheme can be a valid solution to both customer's rights and scalability issues in multimedia content distribution.

Keywords: client-server systems; cryptographic protocols; image coding; image watermarking; multimedia systems; trusted computing; Buyer's fingerprint; TTP-free asymmetric fingerprinting protocol; asymmetric protocols; binary fingerprint; client-side embedding distribution framework; client-side embedding technique; customer rights; multimedia content distribution; nonblind decoding; personalized decryption key distribution; scalability issues; trusted third party; Decoding; Encryption; Protocols; Servers; Table lookup; Watermarking; Buyer-Seller watermarking protocol; Client-side embedding; Fingerprinting; secure watermark embedding  (ID#: 15-3787)
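Client-side embedding in the style of lookup-table schemes (the family this paper builds on) can be sketched numerically: the server encrypts the content with a secret LUT, and each buyer's personalized decryption LUT removes the encryption while leaving a low-power, buyer-specific watermark. All sizes, the watermark strength, and the non-blind correlation detector below are illustrative assumptions, not the paper's protocol:

```python
import numpy as np

rng = np.random.default_rng(7)

# Server side: encrypt the content by adding entries of a secret LUT,
# addressed by a public index sequence.
content = rng.normal(size=1000)
lut_size = 256
E = rng.normal(size=lut_size)                        # secret encryption LUT
idx = rng.integers(0, lut_size, size=content.size)   # public index sequence
encrypted = content + E[idx]

# Client side: the buyer's personalized decryption LUT cancels the
# encryption but leaves a low-power +/-1 fingerprint in the content.
fingerprint = rng.integers(0, 2, size=lut_size) * 2 - 1
w = 0.05                                             # watermark strength
D = -E + w * fingerprint
decrypted = encrypted + D[idx]                       # = content + w*fingerprint[idx]

# Non-blind detection: correlate the residual with a candidate fingerprint.
residual = decrypted - content
score = np.mean(residual * fingerprint[idx]) / w     # high for the right buyer
```

The asymmetric part of the protocol, which this sketch omits, distributes each personalized LUT so that only the buyer learns it, removing the need for a trusted third party.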



Coover, B.; Jinyu Han, "A Power Mask Based Audio Fingerprint," Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on, pp. 1394, 1398, 4-9 May 2014. doi: 10.1109/ICASSP.2014.6853826 The Philips audio fingerprint[1] has been used for years, but its robustness against external noise has not been studied in detail. This paper shows the Philips fingerprint is noise resistant, and is capable of recognizing music that is corrupted by noise at a -4 to -7 dB signal-to-noise ratio. In addition, the drawbacks of the Philips fingerprint are addressed by utilizing a “Power Mask” in conjunction with the Philips fingerprint during the matching process. This Power Mask is a weight matrix over the fingerprint bits, which allows mismatched bits to be penalized according to their relevance in the fingerprint. The effectiveness of the proposed fingerprint was evaluated by experiments using a database of 1030 songs and 1184 query files that were heavily corrupted by two types of noise at varying levels. The experiments show the proposed method significantly improves the noise resistance of the standard Philips fingerprint.

Keywords: audio signal processing; music; Power Mask; audio fingerprint; fingerprint bits; music; noise resistance; weight matrix; 1/f noise; Bit error rate; Databases; Resistance; Robustness; Signal to noise ratio; Audio Fingerprint; Music Recognition  (ID#: 15-3788)
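The Philips-style fingerprint and the weighted matching idea can be sketched as follows: sub-fingerprint bits come from the sign of band-energy differences across adjacent bands and frames, and a "power mask" weights each bit's contribution to the Hamming distance. The band counts and the reliability weights below are illustrative assumptions, not the paper's exact mask:

```python
import numpy as np

def philips_bits(energies):
    """Philips-style sub-fingerprints: bit (n, m) is 1 when the band-energy
    difference E[n, m] - E[n, m+1] increases from frame n-1 to frame n."""
    d = energies[:, :-1] - energies[:, 1:]       # adjacent-band differences
    return (d[1:] > d[:-1]).astype(np.uint8)     # frame-to-frame comparison

def masked_distance(bits_a, bits_b, weights):
    """Weighted Hamming distance: mismatched bits are penalized by their
    reliability weight (the power-mask idea), normalized to [0, 1]."""
    mism = (bits_a != bits_b).astype(float)
    return np.sum(mism * weights) / np.sum(weights)

rng = np.random.default_rng(3)
E = rng.random((40, 33))                 # 40 frames x 33 bands -> 32 bits/frame
bits = philips_bits(E)
# Illustrative mask: the decision margin's magnitude as bit reliability.
weights = np.abs(np.diff(E[:, :-1] - E[:, 1:], axis=0))
same = masked_distance(bits, bits, weights)
```

Bits derived from near-zero energy differences get small weights, so a noise-induced flip of an unreliable bit barely moves the distance, which is the intuition behind the reported robustness gain.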



Articles listed on these pages have been found on publicly available internet pages and are cited with links to those pages. Some of the information included herein has been reprinted with permission from the authors or data repositories. Direct any requests for removal of the links or for modifications to specific citations via email, and please include the ID# of the specific citation in your correspondence.