Computational paralinguistics : emotion, affect and personality in speech and language processing
This book presents the methods, tools and techniques that are currently being used to recognise (automatically) the affect, emotion, personality and everything else beyond linguistics ('paralinguistics') expressed by or embedded in human speech and language. It is the first book to provide...
| Main Authors: | , |
|---|---|
| Format: | eBook Book |
| Language: | English |
| Published: | Chichester: Wiley, 2013; John Wiley & Sons, Incorporated; Wiley-Blackwell |
| Edition: | 1 |
| Subjects: | |
| ISBN: | 1119971365, 9781119971368, 1118706625, 9781118706626 |
Table of Contents:
- Intro -- COMPUTATIONAL PARALINGUISTICS -- Contents -- Preface -- Acknowledgements -- List of Abbreviations -- Part I Foundations -- 1 Introduction -- 1.1 What is Computational Paralinguistics? A First Approximation -- 1.2 History and Subject Area -- 1.3 Form versus Function -- 1.4 Further Aspects -- 1.4.1 The Synthesis of Emotion and Personality -- 1.4.2 Multimodality: Analysis and Generation -- 1.4.3 Applications, Usability and Ethics -- 1.5 Summary and Structure of the Book -- References -- 2 Taxonomies -- 2.1 Traits versus States -- 2.2 Acted versus Spontaneous -- 2.3 Complex versus Simple -- 2.4 Measured versus Assessed -- 2.5 Categorical versus Continuous -- 2.6 Felt versus Perceived -- 2.7 Intentional versus Instinctual -- 2.8 Consistent versus Discrepant -- 2.9 Private versus Social -- 2.10 Prototypical versus Peripheral -- 2.11 Universal versus Culture-Specific -- 2.12 Unimodal versus Multimodal -- 2.13 All These Taxonomies - So What? -- 2.13.1 Emotion Data: The FAU AEC -- 2.13.2 Non-native Data: The C-AuDiT corpus -- References -- 3 Aspects of Modelling -- 3.1 Theories and Models of Personality -- 3.2 Theories and Models of Emotion and Affect -- 3.3 Type and Segmentation of Units -- 3.4 Typical versus Atypical Speech -- 3.5 Context -- 3.6 Lab versus Life, or Through the Looking Glass -- 3.7 Sheep and Goats, or Single Instance Decision versus Cumulative Evidence and Overall Performance -- 3.8 The Few and the Many, or How to Analyse a Hamburger -- 3.9 Reifications, and What You are Looking for is What You Get -- 3.10 Magical Numbers versus Sound Reasoning -- References -- 4 Formal Aspects -- 4.1 The Linguistic Code and Beyond -- 4.2 The Non-Distinctive Use of Phonetic Elements -- 4.2.1 Segmental Level: The Case of /r/ Variants -- 4.2.2 Supra-segmental Level: The Case of Pitch and Fundamental Frequency - and of Other Prosodic Parameters
- 4.2.3 In Between: The Case of Other Voice Qualities, Especially Laryngealisation -- 4.3 The Non-Distinctive Use of Linguistics Elements -- 4.3.1 Words and Word Classes -- 4.3.2 Phrase Level: The Case of Filler Phrases and Hedges -- 4.4 Disfluencies -- 4.5 Non-Verbal, Vocal Events -- 4.6 Common Traits of Formal Aspects -- References -- 5 Functional Aspects -- 5.1 Biological Trait Primitives -- 5.1.1 Speaker Characteristics -- 5.2 Cultural Trait Primitives -- 5.2.1 Speech Characteristics -- 5.3 Personality -- 5.4 Emotion and Affect -- 5.5 Subjectivity and Sentiment Analysis -- 5.6 Deviant Speech -- 5.6.1 Pathological Speech -- 5.6.2 Temporarily Deviant Speech -- 5.6.3 Non-native Speech -- 5.7 Social Signals -- 5.8 Discrepant Communication -- 5.8.1 Indirect Speech, Irony, and Sarcasm -- 5.8.2 Deceptive Speech -- 5.8.3 Off-Talk -- 5.9 Common Traits of Functional Aspects -- References -- 6 Corpus Engineering -- 6.1 Annotation -- 6.1.1 Assessment of Annotations -- 6.1.2 New Trends -- 6.2 Corpora and Benchmarks: Some Examples -- 6.2.1 FAU Aibo Emotion Corpus -- 6.2.2 aGender Corpus -- 6.2.3 TUM AVIC Corpus -- 6.2.4 Alcohol Language Corpus -- 6.2.5 Sleepy Language Corpus -- 6.2.6 Speaker Personality Corpus -- 6.2.7 Speaker Likability Database -- 6.2.8 NKI CCRT Speech Corpus -- 6.2.9 TIMIT Database -- 6.2.10 Final Remarks on Databases -- References -- Part II Modelling -- 7 Computational Modelling of Paralinguistics: Overview -- References -- 8 Acoustic Features -- 8.1 Digital Signal Representation -- 8.2 Short Time Analysis -- 8.3 Acoustic Segmentation -- 8.4 Continuous Descriptors -- 8.4.1 Intensity -- 8.4.2 Zero Crossings -- 8.4.3 Autocorrelation -- 8.4.4 Spectrum and Cepstrum -- 8.4.5 Linear Prediction -- 8.4.6 Line Spectral Pairs -- 8.4.7 Perceptual Linear Prediction -- 8.4.8 Formants -- 8.4.9 Fundamental Frequency and Voicing Probability
- 8.4.10 Jitter and Shimmer -- 8.4.11 Derived Low-Level Descriptors -- References -- 9 Linguistic Features -- 9.1 Textual Descriptors -- 9.2 Preprocessing -- 9.3 Reduction -- 9.3.1 Stopping -- 9.3.2 Stemming -- 9.3.3 Tagging -- 9.4 Modelling -- 9.4.1 Vector Space Modelling -- 9.4.2 On-line Knowledge -- References -- 10 Supra-segmental Features -- 10.1 Functionals -- 10.2 Feature Brute-Forcing -- 10.3 Feature Stacking -- References -- 11 Machine-Based Modelling -- 11.1 Feature Relevance Analysis -- 11.2 Machine Learning -- 11.2.1 Static Classification -- 11.2.2 Dynamic Classification: Hidden Markov Models -- 11.2.3 Regression -- 11.3 Testing Protocols -- 11.3.1 Partitioning -- 11.3.2 Balancing -- 11.3.3 Performance Measures -- 11.3.4 Result Interpretation -- References -- 12 System Integration and Application -- 12.1 Distributed Processing -- 12.2 Autonomous and Collaborative Learning -- 12.3 Confidence Measures -- References -- 13 'Hands-On': Existing Toolkits and Practical Tutorial -- 13.1 Related Toolkits -- 13.2 openSMILE -- 13.2.1 Available Feature Extractors -- 13.3 Practical Computational Paralinguistics How-to -- 13.3.1 Obtaining and Installing openSMILE -- 13.3.2 Extracting Features -- 13.3.3 Classification and Regression -- References -- 14 Epilogue -- Appendix -- A.1 openSMILE Feature Sets Used at Interspeech Challenges -- A.2 Feature Encoding Scheme -- References -- Index

