Scalable Strategies for Computing with Massive Data
| Published in: | Journal of statistical software Vol. 55; no. 14; pp. 1 - 19 |
|---|---|
| Main Authors: | Kane, Michael J.; Emerson, John; Weston, Stephen |
| Format: | Journal Article |
| Language: | English |
| Published: | Foundation for Open Access Statistics, 01.11.2013 |
| ISSN: | 1548-7660 |
| Abstract | This paper presents two complementary statistical computing frameworks that address challenges in parallel processing and the analysis of massive data. First, the foreach package allows users of the R programming environment to define parallel loops that may be run sequentially on a single machine, in parallel on a symmetric multiprocessing (SMP) machine, or in cluster environments without platform-specific code. Second, the bigmemory package implements memory- and file-mapped data structures that provide (a) access to arbitrarily large data while retaining a look and feel that is familiar to R users and (b) data structures that are shared across processor cores in order to support efficient parallel computing techniques. Although these packages may be used independently, this paper shows how they can be used in combination to address challenges that have effectively been beyond the reach of researchers who lack specialized software development skills or expensive hardware. |
|---|---|
| Author | Emerson, John; Kane, Michael J.; Weston, Stephen |
| Author_xml | – sequence: 1 givenname: Michael J. surname: Kane fullname: Kane, Michael J. – sequence: 2 givenname: John surname: Emerson fullname: Emerson, John – sequence: 3 givenname: Stephen surname: Weston fullname: Weston, Stephen |
| CitedBy_id | crossref_primary_10_1177_0081175018796871 crossref_primary_10_5194_hess_23_2939_2019 crossref_primary_10_1038_ncomms12083 crossref_primary_10_1007_s11004_019_09791_y crossref_primary_10_1007_s00180_019_00950_7 crossref_primary_10_3390_app12136670 crossref_primary_10_1002_pld3_53 crossref_primary_10_1016_j_ecolmodel_2017_12_010 crossref_primary_10_1016_j_jhydrol_2024_132502 crossref_primary_10_7554_eLife_43966 crossref_primary_10_1093_nargab_lqaf084 crossref_primary_10_3390_rs15184450 crossref_primary_10_1016_j_pocean_2019_02_006 crossref_primary_10_3390_fermentation9070672 crossref_primary_10_7717_peerj_cs_175 crossref_primary_10_1016_j_bdr_2017_07_003 crossref_primary_10_1093_molbev_msae098 crossref_primary_10_1093_nsr_nwaa244 crossref_primary_10_1080_03610918_2023_2300747 crossref_primary_10_1016_j_jmva_2022_105128 crossref_primary_10_1101_gr_277525_122 crossref_primary_10_1002_gepi_22605 crossref_primary_10_1007_s11524_018_0259_1 crossref_primary_10_1016_j_jmoldx_2019_08_006 crossref_primary_10_1109_MCI_2016_2532267 crossref_primary_10_1186_s12859_016_1006_9 crossref_primary_10_1534_g3_119_400018 crossref_primary_10_1016_j_nima_2016_10_006 crossref_primary_10_3835_plantgenome2016_09_0089 crossref_primary_10_3897_BDJ_4_e8357 crossref_primary_10_1016_j_ecosta_2021_11_008 crossref_primary_10_1186_s12863_017_0533_3 crossref_primary_10_1016_j_neuroimage_2014_02_024 crossref_primary_10_6339_24_JDS1132 crossref_primary_10_1038_srep10576 crossref_primary_10_1002_pst_2438 crossref_primary_10_3390_brainsci14040325 crossref_primary_10_1002_sam_11283 crossref_primary_10_1016_j_gpb_2020_10_007 crossref_primary_10_1007_s00606_018_1494_3 |
| ContentType | Journal Article |
| DBID | AAYXX CITATION DOA |
| DOI | 10.18637/jss.v055.i14 |
| DatabaseName | CrossRef Directory of Open Access Journals |
| DatabaseTitle | CrossRef |
| Database_xml | – sequence: 1 dbid: DOA name: DOAJ Directory of Open Access Journals url: https://www.doaj.org/ sourceTypes: Open Website |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Mathematics |
| EISSN | 1548-7660 |
| EndPage | 19 |
| ExternalDocumentID | oai_doaj_org_article_4f632e074659460e9a90312657087e60 10_18637_jss_v055_i14 |
| ISICitedReferencesCount | 95 |
| ISSN | 1548-7660 |
| IngestDate | Fri Oct 03 12:42:42 EDT 2025 Sat Nov 29 04:37:59 EST 2025 Tue Nov 18 21:52:33 EST 2025 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 14 |
| Language | English |
| LinkModel | DirectLink |
| OpenAccessLink | https://doaj.org/article/4f632e074659460e9a90312657087e60 |
| PageCount | 19 |
| ParticipantIDs | doaj_primary_oai_doaj_org_article_4f632e074659460e9a90312657087e60 crossref_primary_10_18637_jss_v055_i14 crossref_citationtrail_10_18637_jss_v055_i14 |
| PublicationCentury | 2000 |
| PublicationDate | 2013-11-01 |
| PublicationDateYYYYMMDD | 2013-11-01 |
| PublicationDate_xml | – month: 11 year: 2013 text: 2013-11-01 day: 01 |
| PublicationDecade | 2010 |
| PublicationTitle | Journal of statistical software |
| PublicationYear | 2013 |
| Publisher | Foundation for Open Access Statistics |
| Publisher_xml | – name: Foundation for Open Access Statistics |
| SSID | ssj0020495 |
| Score | 2.4129016 |
| SourceID | doaj crossref |
| SourceType | Open Website Enrichment Source Index Database |
| StartPage | 1 |
| Title | Scalable Strategies for Computing with Massive Data |
| URI | https://doaj.org/article/4f632e074659460e9a90312657087e60 |
| Volume | 55 |
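
The abstract describes combining foreach loops with shared bigmemory data structures. A minimal sketch of that combination follows; it is an illustration under stated assumptions, not code from the paper. The toy matrix, its dimensions, and the variable names are hypothetical. The sequential backend `registerDoSEQ()` ships with foreach; registering a parallel backend instead (for example via the doParallel package) should leave the loop body unchanged.

```r
library(foreach)
library(bigmemory)

# Register the sequential backend that ships with foreach. A parallel
# backend could be registered here instead; the loop below stays the same.
registerDoSEQ()

# Hypothetical toy data: a shared in-memory big.matrix with 1000 rows and
# 4 columns. filebacked.big.matrix() is the file-backed counterpart.
x <- big.matrix(nrow = 1000, ncol = 4, type = "double", init = 0)
x[, ] <- matrix(as.numeric(seq_len(4000)), nrow = 1000, ncol = 4)

# The descriptor is a small object that each worker uses to attach to the
# same underlying data rather than receiving a copy of it.
desc <- describe(x)

col_means <- foreach(j = seq_len(ncol(x)), .combine = c,
                     .packages = "bigmemory") %dopar% {
  m <- attach.big.matrix(desc)  # attach to the shared matrix
  mean(m[, j])                  # only column j is pulled into an R vector
}

print(col_means)
```

Passing the descriptor rather than the matrix itself is the design point: each iteration attaches to the existing data, so only the small descriptor and the per-column result travel between processes.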