Content selection and curation for web archiving: The gatekeepers vs. the masses
Any preservation effort must begin with an assessment of what content to preserve, and web archiving is no different. There have historically been two answers to the question "what should we archive?" The Internet Archive's broad entire-web crawls have been supplemented by narrower do...
| Published in: | JCDL '16 : proceedings of the 16th ACM/IEEE-CS Joint Conference on Digital Libraries : June 19-23, 2016, Newark, NJ, USA, pp. 107-110 |
|---|---|
| Main authors: | Milligan, Ian; Ruest, Nick; Lin, Jimmy |
| Medium: | Conference paper |
| Language: | English |
| Published: | ACM, 2016-06-01 |
| Subjects: | |
| Abstract | Any preservation effort must begin with an assessment of what content to preserve, and web archiving is no different. There have historically been two answers to the question "what should we archive?" The Internet Archive's broad entire-web crawls have been supplemented by narrower domain- or topic-specific collections gathered by numerous libraries. We can characterize this as content selection and curation by "gatekeepers". In contrast, we have witnessed the emergence of another approach driven by "the masses": archiving the pages that are linked in social media streams such as Twitter. The interesting question, of course, is how these approaches differ. We provide an answer to this question in the context of a case study about the 2015 Canadian federal elections. Based on our analysis, we recommend a hybrid approach that combines an effort driven by social media with more traditional curatorial methods. |
|---|---|
| Author | Lin, Jimmy Ruest, Nick Milligan, Ian |
| Author_xml | 1. Milligan, Ian (i2milligan@uwaterloo.ca); 2. Ruest, Nick (ruestn@yorku.ca); 3. Lin, Jimmy (jimmylin@uwaterloo.ca) |
| ContentType | Conference Proceeding |
| DOI | 10.1145/2910896.2910913 |
| Discipline | Library & Information Science |
| EISBN | 1450342299 9781450342292 |
| EndPage | 110 |
| ExternalDocumentID | 7559571 |
| Genre | orig-research |
| ISICitedReferencesCount | 9 |
| IsDoiOpenAccess | false |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | false |
| Language | English |
| OpenAccessLink | https://hdl.handle.net/10012/11649 |
| PageCount | 4 |
| PublicationCentury | 2000 |
| PublicationDate | 2016-June |
| PublicationDateYYYYMMDD | 2016-06-01 |
| PublicationDate_xml | – month: 06 year: 2016 text: 2016-June |
| PublicationDecade | 2010 |
| PublicationTitle | JCDL '16 : proceedings of the 16th ACM/IEEE-CS Joint Conference on Digital Libraries : June 19-23, 2016, Newark, NJ, USA |
| PublicationTitleAbbrev | JCDL |
| PublicationYear | 2016 |
| Publisher | ACM |
| Publisher_xml | – name: ACM |
| StartPage | 107 |
| SubjectTerms | Internet; Libraries; Logic gates; Media; Nominations and elections; Tagging |
| Title | Content selection and curation for web archiving: The gatekeepers vs. the masses |
| URI | https://ieeexplore.ieee.org/document/7559571 |