Using general-purpose compression algorithms for music analysis

Bibliographic details
Published in: Journal of New Music Research, Volume 45, Issue 1, pp. 1-16
Main authors: Louboutin, Corentin; Meredith, David
Format: Journal Article
Language: English
Published: Abingdon: Routledge (Taylor & Francis Ltd), 2 January 2016
ISSN: 0929-8215, 1744-5027
Abstract: General-purpose compression algorithms encode files as dictionaries of substrings with the positions of these strings' occurrences. We hypothesized that such algorithms could be used for pattern discovery in music. We compared LZ77, LZ78, Burrows-Wheeler and COSIATEC on classifying folk song melodies. A novel method was used, combining multiple viewpoints, the k-nearest-neighbour algorithm and a novel distance metric, corpus compression distance. Using single viewpoints, COSIATEC outperformed the general-purpose compressors, with a classification success rate of 85% on this task. However, by combining 8 of the 10 best-performing viewpoints, including seven that used LZ77, the classification success rate rose to over 94%. In a second experiment, we compared LZ77 with COSIATEC on the task of discovering subject and countersubject entries in fugues by J.S. Bach. When voice information was absent in the input data, COSIATEC outperformed LZ77 with a mean score of 0.123, compared with 0.053 for LZ77. However, when the music was processed a voice at a time, the score for LZ77 more than doubled to 0.124. We also discovered a significant correlation between compression factor and score for all the algorithms, supporting the hypothesis that the best analyses are those represented by the shortest descriptions.
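Illustrative sketch (not from the article): the abstract describes classifying melodies with a k-nearest-neighbour rule over a compression-based distance. The article's own metric, corpus compression distance, is not defined in this record, so the sketch below substitutes the well-known normalized compression distance (NCD) as a stand-in, computed with Python's zlib (DEFLATE, an LZ77-based compressor). The function names, the byte-string melody encoding and the toy data are all assumptions made for illustration only.

```python
import zlib
from collections import Counter


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib (DEFLATE: LZ77 plus Huffman coding) encoding."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: a stand-in metric, NOT the
    article's corpus compression distance."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def knn_classify(query: bytes, labelled: list[tuple[bytes, str]], k: int = 1) -> str:
    """Return the majority label among the k labelled items nearest to query."""
    nearest = sorted(labelled, key=lambda item: ncd(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy "melodies" encoded as byte strings (assumed encoding).
training = [(b"abcd" * 50, "family A"), (b"wxyz" * 50, "family B")]
predicted = knn_classify(b"abcd" * 45, training, k=1)
```

In this toy setting the query shares its repeating pattern with the first training item, so concatenating the two adds little to the compressed size and the distance is small. Real melodic data would need a deliberate choice of representation (the abstract's "viewpoints"), which this sketch does not attempt.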
Author affiliations:
Louboutin, Corentin: École Normale Supérieure de Rennes (contact email as listed in the record: dave@create.aau.dk)
Meredith, David: Aalborg University
Copyright: 2016 Taylor & Francis
DOI: 10.1080/09298215.2015.1133656
Funding: European Commission, grant 610859 (funder ID 10.13039/501100000780)
Subjects: data compression; folk music; Lempel-Ziv; music analysis; music information retrieval; pattern discovery; representation