A New Approach to Model Pitch Perception Using Sparse Coding
| Published in: | PLoS Computational Biology, Vol. 13, No. 1, p. e1005338 |
|---|---|
| Main authors: | , , |
| Format: | Journal Article |
| Language: | English |
| Published: | United States: Public Library of Science (PLoS), 01.01.2017 |
| Subjects: | |
| ISSN: | 1553-7358, 1553-734X |
| Summary: | Our acoustical environment abounds with repetitive sounds, some of which are related to pitch perception. It is still unknown how the auditory system, in processing these sounds, relates a physical stimulus to its percept. Since, in mammals, all auditory stimuli are conveyed into the nervous system through the auditory nerve (AN) fibers, a model should explain the perception of pitch as a function of this particular input. However, pitch perception is invariant to certain features of the physical stimulus. For example, missing-fundamental stimuli with resolved or unresolved harmonics, or low- and high-amplitude stimuli with the same spectral content, all give rise to the same pitch percept. In contrast, the AN representations of these different stimuli are not invariant to these effects. In fact, due to saturation and non-linearity of both cochlear and inner hair cell responses, these differences are enhanced by the AN fibers. It is therefore difficult to explain how the pitch percept arises from the activity of the AN fibers. We introduce a novel approach for extracting pitch cues from the AN population activity for a given arbitrary stimulus. The method is based on a technique known as sparse coding (SC), which represents pitch cues by a few spatiotemporal atoms (templates) chosen from a large set of possible ones (a dictionary). The amount of activity of each atom is represented by a non-zero coefficient, analogous to an active neuron. Such a technique has been successfully applied to other modalities, particularly vision. The model is composed of a cochlear model, an SC processing unit, and a harmonic sieve. We show that the model copes with different pitch phenomena: extracting resolved and non-resolved harmonics, missing fundamental pitches, stimuli with both high and low amplitudes, iterated rippled noises, and recorded musical instruments. |
|---|---|
| Bibliography: | Conceived and designed the experiments: OdB OmB MF. Performed the experiments: OdB. Analyzed the data: OdB. Wrote the paper: OdB MF OmB. The authors have declared that no competing interests exist. |
| DOI: | 10.1371/journal.pcbi.1005338 |
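
The summary describes sparse coding as representing a signal by a few atoms from a large dictionary, each with a non-zero coefficient. The sketch below illustrates that idea only in a toy setting: it uses greedy matching pursuit (a generic sparse-coding algorithm, not necessarily the one used in the paper) on a one-dimensional signal, with a hypothetical dictionary of unit-norm sinusoids standing in for the paper's spatiotemporal atoms. The names `matching_pursuit`, `dictionary`, and `max_atoms` are assumptions made for illustration.

```python
import math


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def matching_pursuit(signal, dictionary, max_atoms=3, tol=1e-9):
    """Greedy sparse coding (matching pursuit).

    At each step, pick the atom most correlated with the residual,
    add that correlation to the atom's coefficient, and subtract the
    atom's contribution from the residual.
    """
    residual = list(signal)
    coeffs = [0.0] * len(dictionary)
    for _ in range(max_atoms):
        corrs = [dot(residual, atom) for atom in dictionary]
        best = max(range(len(dictionary)), key=lambda k: abs(corrs[k]))
        if abs(corrs[best]) < tol:
            break  # nothing left that the dictionary can explain
        coeffs[best] += corrs[best]
        residual = [r - corrs[best] * a
                    for r, a in zip(residual, dictionary[best])]
    return coeffs, residual


# Hypothetical toy dictionary: unit-norm sinusoids with 1..8 cycles per window.
n = 64
freqs = [1, 2, 3, 4, 5, 6, 7, 8]
dictionary = [
    normalize([math.sin(2 * math.pi * f * t / n) for t in range(n)])
    for f in freqs
]

# Toy signal built from two atoms (indices 1 and 4) with known weights.
signal = [2.0 * a + 0.5 * b for a, b in zip(dictionary[1], dictionary[4])]

coeffs, residual = matching_pursuit(signal, dictionary, max_atoms=2)
active = [k for k, c in enumerate(coeffs) if abs(c) > 1e-6]
print("active atoms:", active)
```

Because the toy atoms are mutually orthogonal, two greedy steps recover the two atoms used to build the signal. With a correlated, overcomplete dictionary, as sparse coding usually assumes, matching pursuit gives only an approximate sparse solution.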