Divide and recombine (D&R) data science projects for deep analysis of big data and high computational complexity
The focus of data science is data analysis. This article begins with a categorization of the data science technical areas that play a direct role in data analysis. Next, big data are addressed, which create computational challenges due to the data size, as does the computational complexity of many a...
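The abstract in this record describes D&R as three steps: divide the data into subsets, apply an analytic method independently to each subset, and recombine the outputs. The following is a minimal sketch of that pattern in plain Python, offered only as an illustration. It is not DeltaRho, R, or Hadoop code. The function names, the toy dataset, the choice of a least-squares slope as the analytic method, and averaging as the recombination are all assumptions made for the example.

```python
from statistics import fmean


def divide(rows, n_subsets):
    """Divide: split the rows into n_subsets interleaved subsets (hypothetical scheme)."""
    return [rows[i::n_subsets] for i in range(n_subsets)]


def fit_slope(subset):
    """Apply: least-squares slope of y on x, computed from one subset only."""
    xs = [x for x, _ in subset]
    ys = [y for _, y in subset]
    x_bar, y_bar = fmean(xs), fmean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in subset)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


def recombine(outputs):
    """Recombine: here, simply average the per-subset estimates."""
    return fmean(outputs)


def divide_and_recombine(rows, n_subsets):
    subsets = divide(rows, n_subsets)
    # Each call sees one subset and shares no state with the others,
    # so this map could be swapped for a parallel map without other changes.
    outputs = list(map(fit_slope, subsets))
    return recombine(outputs)


# Toy data (an assumption for illustration): y = 2x + 1 with no noise.
rows = [(float(x), 2.0 * x + 1.0) for x in range(100)]
estimate = divide_and_recombine(rows, n_subsets=4)
print(estimate)
```

Because the per-subset step depends on nothing outside its own subset, it is the part that lends itself to the embarrassingly parallel computation the abstract mentions. The recombined estimate is in general an approximation to the all-data result, not a replacement for it.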
| Published in: | Japanese journal of statistics and data science Vol. 1; no. 1; pp. 139-156 |
|---|---|
| Main authors: | Tung, Wen-wen; Barthur, Ashrith; Bowers, Matthew C.; Song, Yuying; Gerth, John; Cleveland, William S. |
| Format: | Journal Article |
| Language: | English |
| Publication details: | Singapore: Springer Singapore, 1 June 2018 |
| ISSN: | 2520-8756, 2520-8764 |
| Abstract | The focus of data science is data analysis. This article begins with a categorization of the data science technical areas that play a direct role in data analysis. Next, big data are addressed, which create computational challenges due to the data size, as does the computational complexity of many analytic methods. Divide and recombine (D&R) is a statistical approach whose goal is to meet the challenges. In D&R, the data are divided into subsets, an analytic method is applied independently to each subset, and the outputs are recombined. This enables a large component of embarrassingly-parallel computation, the fastest parallel computation. DeltaRho open-source software implements D&R. At the front end, the analyst programs in R. The back end is the Hadoop distributed file system and parallel compute engine. The goals of D&R are the following: access to thousands of methods of machine learning, statistics, and data visualization; deep analysis of the data, which means analysis of the detailed data at their finest granularity; easy programming of analyses; and high computational performance. To succeed, D&R requires research in all of the technical areas of data science. Network cybersecurity and climate science are two subject-matter areas with big, complex data benefiting from D&R. We illustrate this by discussing two datasets, one from each area. The first is the measurements of 13 variables for each of 10,615,054,608 queries to the Spamhaus IP address blacklisting service. The second has 50,632 3-hourly satellite rainfall estimates at 576,000 locations. |
|---|---|
| Author | Tung, Wen-wen; Barthur, Ashrith; Bowers, Matthew C.; Song, Yuying; Gerth, John; Cleveland, William S. |
| Author_xml | 1. Tung, Wen-wen (ORCID: 0000-0001-8627-1503; email: wwtung@purdue.edu), Department of Earth, Atmospheric, and Planetary Sciences, Purdue University; 2. Barthur, Ashrith, CERIAS, Purdue University; 3. Bowers, Matthew C., Department of Earth, Atmospheric, and Planetary Sciences, Purdue University; 4. Song, Yuying, Department of Statistics, Purdue University; 5. Gerth, John, Departments of Computer Science and Electrical Engineering, Stanford University; 6. Cleveland, William S., Department of Statistics, Purdue University |
| ContentType | Journal Article |
| Copyright | Japanese Federation of Statistical Science Associations 2018 |
| DOI | 10.1007/s42081-018-0008-4 |
| Discipline | Medicine; Economics; Law; Statistics; Physics; Computer Science |
| EISSN | 2520-8764 |
| EndPage | 156 |
| GrantInformation_xml | National Science Foundation (grant IDs: DHS-0937123; CDSE-1228348; funder ID: http://dx.doi.org/10.13039/100000001); Defense Advanced Research Projects Agency (grant ID: FA8750-12-2-0343; funder ID: http://dx.doi.org/10.13039/100000185); National Aeronautics and Space Administration (grant ID: NNX16AO62H; funder ID: http://dx.doi.org/10.13039/100000104) |
| ISICitedReferencesCount | 4 |
| ISSN | 2520-8756 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 1 |
| Keywords | Hadoop; Data science; Blacklisting IP addresses; Big data; Parallel; Weather and climate data analysis; Distributed computing |
| Language | English |
| ORCID | 0000-0001-8627-1503 |
| PageCount | 18 |
| PublicationCentury | 2000 |
| PublicationDate | 20180600 2018-6-00 |
| PublicationDateYYYYMMDD | 2018-06-01 |
| PublicationDecade | 2010 |
| PublicationPlace | Singapore |
| PublicationTitle | Japanese journal of statistics and data science |
| PublicationTitleAbbrev | Jpn J Stat Data Sci |
| PublicationYear | 2018 |
| Publisher | Springer Singapore |
| StartPage | 139 |
| SubjectTerms | Chemistry and Earth Sciences Computer Science Economics Finance Health Sciences Humanities Insurance Law Management Mathematics and Statistics Medicine Physics Statistical Theory and Methods Statistics Statistics and Computing/Statistics Programs Statistics for Business Statistics for Engineering Statistics for Life Sciences Statistics for Social Sciences |
| Title | Divide and recombine (D&R) data science projects for deep analysis of big data and high computational complexity |
| URI | https://link.springer.com/article/10.1007/s42081-018-0008-4 |
| Volume | 1 |