Divide and recombine (D&R) data science projects for deep analysis of big data and high computational complexity

Bibliographic details
Published in: Japanese journal of statistics and data science, Volume 1, Issue 1, pp. 139-156
Main authors: Tung, Wen-wen; Barthur, Ashrith; Bowers, Matthew C.; Song, Yuying; Gerth, John; Cleveland, William S.
Format: Journal Article
Language: English
Publication details: Singapore: Springer Singapore, 1 June 2018
ISSN: 2520-8756, 2520-8764
Abstract The focus of data science is data analysis. This article begins with a categorization of the data science technical areas that play a direct role in data analysis. Next, big data are addressed, which create computational challenges due to the data size, as does the computational complexity of many analytic methods. Divide and recombine (D&R) is a statistical approach whose goal is to meet the challenges. In D&R, the data are divided into subsets, an analytic method is applied independently to each subset, and the outputs are recombined. This enables a large component of embarrassingly-parallel computation, the fastest parallel computation. DeltaRho open-source software implements D&R. At the front end, the analyst programs in R. The back end is the Hadoop distributed file system and parallel compute engine. The goals of D&R are the following: access to thousands of methods of machine learning, statistics, and data visualization; deep analysis of the data, which means analysis of the detailed data at their finest granularity; easy programming of analyses; and high computational performance. To succeed, D&R requires research in all of the technical areas of data science. Network cybersecurity and climate science are two subject-matter areas with big, complex data benefiting from D&R. We illustrate this by discussing two datasets, one from each area. The first is the measurements of 13 variables for each of 10,615,054,608 queries to the Spamhaus IP address blacklisting service. The second has 50,632 3-hourly satellite rainfall estimates at 576,000 locations.
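Illustration (not part of the catalog record): the three D&R steps named in the abstract, divide, apply a method to each subset independently, and recombine, can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the DeltaRho API and not the authors' code: it uses Python rather than R for self-containment, the function names `divide`, `fit_slope`, `recombine`, and `divide_and_recombine` are hypothetical, the data are synthetic, and averaging is only one possible recombination rule.

```python
import random
from statistics import mean


def divide(records, n_subsets):
    """Divide: split the records into n_subsets interleaved subsets."""
    return [records[i::n_subsets] for i in range(n_subsets)]


def fit_slope(subset):
    """Analytic method: least-squares slope of y on x for one subset."""
    xs = [x for x, _ in subset]
    ys = [y for _, y in subset]
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in subset)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


def recombine(outputs):
    """Recombine: average the per-subset estimates (an assumed rule)."""
    return mean(outputs)


def divide_and_recombine(records, n_subsets):
    subsets = divide(records, n_subsets)
    # Each call touches only its own subset, so the calls are independent
    # and could be dispatched to separate workers without communication.
    outputs = [fit_slope(s) for s in subsets]
    return recombine(outputs)


# Synthetic data (an assumption for illustration): y = 2x + unit Gaussian noise.
rng = random.Random(0)
records = []
for _ in range(10_000):
    x = rng.uniform(0.0, 10.0)
    records.append((x, 2.0 * x + rng.gauss(0.0, 1.0)))

dr_slope = divide_and_recombine(records, n_subsets=20)
full_slope = fit_slope(records)  # same method applied to all data at once
```

Comparing `dr_slope` with `full_slope` shows the trade-off the abstract alludes to: the D&R estimate is an approximation of the all-data estimate, obtained from computations that never need to see more than one subset at a time.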
Author Tung, Wen-wen
Barthur, Ashrith
Bowers, Matthew C.
Song, Yuying
Gerth, John
Cleveland, William S.
Author_xml – sequence: 1
  givenname: Wen-wen
  orcidid: 0000-0001-8627-1503
  surname: Tung
  fullname: Tung, Wen-wen
  email: wwtung@purdue.edu
  organization: Department of Earth, Atmospheric, and Planetary Sciences, Purdue University
– sequence: 2
  givenname: Ashrith
  surname: Barthur
  fullname: Barthur, Ashrith
  organization: CERIAS, Purdue University
– sequence: 3
  givenname: Matthew C.
  surname: Bowers
  fullname: Bowers, Matthew C.
  organization: Department of Earth, Atmospheric, and Planetary Sciences, Purdue University
– sequence: 4
  givenname: Yuying
  surname: Song
  fullname: Song, Yuying
  organization: Department of Statistics, Purdue University
– sequence: 5
  givenname: John
  surname: Gerth
  fullname: Gerth, John
  organization: Departments of Computer Science and Electrical Engineering, Stanford University
– sequence: 6
  givenname: William S.
  surname: Cleveland
  fullname: Cleveland, William S.
  organization: Department of Statistics, Purdue University
CitedBy_id crossref_primary_10_1029_2020JD033667
crossref_primary_10_1175_JCLI_D_17_0090_1
crossref_primary_10_1088_2752_5295_acdf0f
ContentType Journal Article
Copyright Japanese Federation of Statistical Science Associations 2018
Copyright_xml – notice: Japanese Federation of Statistical Science Associations 2018
DOI 10.1007/s42081-018-0008-4
Discipline Medicine
Economics
Law
Statistics
Physics
Computer Science
EISSN 2520-8764
EndPage 156
ExternalDocumentID 10_1007_s42081_018_0008_4
GrantInformation_xml – fundername: National Science Foundation
  grantid: DHS-0937123; CDSE-1228348
  funderid: http://dx.doi.org/10.13039/100000001
– fundername: Defense Advanced Research Projects Agency
  grantid: FA8750-12-2-0343
  funderid: http://dx.doi.org/10.13039/100000185
– fundername: National Aeronautics and Space Administration
  grantid: NNX16AO62H
  funderid: http://dx.doi.org/10.13039/100000104
ISICitedReferencesCount 4
ISSN 2520-8756
IngestDate Sat Nov 29 06:12:05 EST 2025
Tue Nov 18 22:39:03 EST 2025
Fri Feb 21 02:32:46 EST 2025
IsPeerReviewed true
IsScholarly true
Issue 1
Keywords Hadoop
Data science
Blacklisting IP addresses
Big data
Parallel
Weather and climate data analysis
Distributed computing
Language English
ORCID 0000-0001-8627-1503
PageCount 18
PublicationDate 20180600
2018-6-00
PublicationDateYYYYMMDD 2018-06-01
PublicationDate_xml – month: 6
  year: 2018
  text: 20180600
PublicationDecade 2010
PublicationPlace Singapore
PublicationPlace_xml – name: Singapore
PublicationTitle Japanese journal of statistics and data science
PublicationTitleAbbrev Jpn J Stat Data Sci
PublicationYear 2018
Publisher Springer Singapore
Publisher_xml – name: Springer Singapore
StartPage 139
SubjectTerms Chemistry and Earth Sciences
Computer Science
Economics
Finance
Health Sciences
Humanities
Insurance
Law
Management
Mathematics and Statistics
Medicine
Physics
Statistical Theory and Methods
Statistics
Statistics and Computing/Statistics Programs
Statistics for Business
Statistics for Engineering
Statistics for Life Sciences
Statistics for Social Sciences
Title Divide and recombine (D&R) data science projects for deep analysis of big data and high computational complexity
URI https://link.springer.com/article/10.1007/s42081-018-0008-4
Volume 1