SpatialSSJP: QoS-Aware Adaptive Approximate Stream-Static Spatial Join Processor

The widespread adoption of Internet of Things (IoT) motivated the emergence of mixed workloads in smart cities, where fast arriving geo-referenced big data streams are joined with archive tables, aiming at enriching streams with descriptive attributes that enable insightful analytics. Applications a...

Detailed bibliography
Published in: IEEE Transactions on Parallel and Distributed Systems, Volume 35, Issue 1, pp. 73-88
Main authors: Jawarneh, Isam Mashhour Al; Bellavista, Paolo; Corradi, Antonio; Foschini, Luca; Montanari, Rebecca
Format: Journal Article
Language: English
Published: New York IEEE 01.01.2024
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects:
ISSN: 1045-9219, 1558-2183
Abstract The widespread adoption of the Internet of Things (IoT) has motivated the emergence of mixed workloads in smart cities, where fast-arriving geo-referenced big data streams are joined with archive tables to enrich the streams with descriptive attributes that enable insightful analytics. Applications now rely on finding, in real time, the geographical regions to which streaming tuples belong. This problem requires a computationally intensive stream-static join that joins a dynamic stream with a disk-resident static table. In addition, the time-varying fluctuation of geospatial data arriving online calls for an approximate solution that can trade off QoS constraints while ensuring that the system survives sudden spikes in data loads. In this paper, we present SpatialSSJP, an adaptive spatial-aware approximate query processing system that focuses specifically on stream-static joins, guarantees an agreed set of Quality-of-Service goals, and maintains geo-statistics of stateful online aggregations over stream-static join results. SpatialSSJP employs a state-of-the-art stratified-like sampling design to select well-balanced, representative geospatial data stream samples and serve them to a downstream stream-static geospatial join operator. We implemented a prototype atop Spark Structured Streaming. Our extensive evaluations on big real datasets show that our system can survive and mitigate harsh join workloads and outperform state-of-the-art baselines by significant margins, without risking rigorous error bounds on the accuracy of the output results. SpatialSSJP achieves a relative accuracy gain over plain Spark joins of approximately 10% in the worst cases, reaching up to 50% in best-case scenarios.
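The abstract describes sampling a geospatial stream per stratum and passing the sample to a join against a static table. The following is a minimal sketch of that general idea only, not the authors' algorithm or their Spark implementation. It assumes a uniform grid as a stand-in for whatever spatial stratification the paper uses, a plain dictionary as the static region table, and a fixed sampling fraction rather than a QoS-driven one. All function names (`cell_of`, `stratified_sample`, `join_with_regions`) are hypothetical.

```python
import random
from collections import defaultdict


def cell_of(lon, lat, cell_size):
    """Map a point to a grid cell id (assumed stand-in for a spatial index key)."""
    return (int(lon // cell_size), int(lat // cell_size))


def stratified_sample(points, fraction, rng, cell_size):
    """Draw the same fraction from every grid cell, keeping at least one point per cell."""
    strata = defaultdict(list)
    for p in points:
        strata[cell_of(p["lon"], p["lat"], cell_size)].append(p)

    sample = []
    for members in strata.values():
        # Per-stratum sample size; never zero, so sparse cells stay represented.
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


def join_with_regions(sample, cell_to_region, cell_size):
    """Enrich sampled points with a region name from the static table (inner join)."""
    joined = []
    for p in sample:
        region = cell_to_region.get(cell_of(p["lon"], p["lat"], cell_size))
        if region is not None:
            joined.append({**p, "region": region})
    return joined


if __name__ == "__main__":
    rng = random.Random(42)
    # One micro-batch of stream tuples: a dense cell and a sparse cell.
    batch = [{"id": i, "lon": 0.5, "lat": 0.5} for i in range(100)]
    batch += [{"id": 100 + i, "lon": 1.5, "lat": 0.5} for i in range(10)]
    static_table = {(0, 0): "district-A", (1, 0): "district-B"}

    sampled = stratified_sample(batch, fraction=0.5, rng=rng, cell_size=1.0)
    enriched = join_with_regions(sampled, static_table, cell_size=1.0)
    print(len(batch), len(sampled), len(enriched))
```

Sampling before the join reduces the number of lookups against the static table, and sampling per cell keeps sparse regions from vanishing, which a uniform random sample over the whole batch could not promise.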
Author Montanari, Rebecca
Corradi, Antonio
Bellavista, Paolo
Foschini, Luca
Jawarneh, Isam Mashhour Al
Author_xml – sequence: 1
  givenname: Isam Mashhour Al
  orcidid: 0000-0002-4796-2181
  surname: Jawarneh
  fullname: Jawarneh, Isam Mashhour Al
  email: isam.aljawarneh@studio.unibo.it
  organization: Dipartimento di Informatica - Scienza e Ingegneria (DISI), University of Bologna, Bologna, Italy
– sequence: 2
  givenname: Paolo
  orcidid: 0000-0003-0992-7948
  surname: Bellavista
  fullname: Bellavista, Paolo
  email: paolo.bellavista@unibo.it
  organization: Dipartimento di Informatica - Scienza e Ingegneria (DISI), University of Bologna, Bologna, Italy
– sequence: 3
  givenname: Antonio
  orcidid: 0000-0002-5107-1023
  surname: Corradi
  fullname: Corradi, Antonio
  email: antonio.corradi@unibo.it
  organization: Dipartimento di Informatica - Scienza e Ingegneria (DISI), University of Bologna, Bologna, Italy
– sequence: 4
  givenname: Luca
  orcidid: 0000-0001-9062-3647
  surname: Foschini
  fullname: Foschini, Luca
  email: luca.foschini@unibo.it
  organization: Dipartimento di Informatica - Scienza e Ingegneria (DISI), University of Bologna, Bologna, Italy
– sequence: 5
  givenname: Rebecca
  orcidid: 0000-0002-3687-0361
  surname: Montanari
  fullname: Montanari, Rebecca
  email: rebecca.montanari@unibo.it
  organization: Dipartimento di Informatica - Scienza e Ingegneria (DISI), University of Bologna, Bologna, Italy
CODEN ITDSEO
CitedBy_id crossref_primary_10_1109_ACCESS_2024_3467375
crossref_primary_10_3390_computers14020035
crossref_primary_10_1109_TPDS_2024_3453607
crossref_primary_10_1109_TC_2025_3575917
Cites_doi 10.1109/ICAC.2017.37
10.1145/2723372.2742797
10.1007/s10707-018-0330-9
10.1145/2820783.2820860
10.14778/2536222.2536227
10.1016/j.is.2016.09.007
10.1117/12.2177233
10.1145/3183713.3190664
10.1145/1206049.1206056
10.1007/978-3-540-28608-0_16
10.1109/CAMAD50429.2020.9209294
10.3389/fdata.2020.00030
10.1145/3448016.3457269
10.1109/ICDEW.2015.7129541
10.1145/2505515.2505728
10.1109/GLOBECOM38437.2019.9014291
10.1145/253260.253291
10.1109/ACCESS.2019.2904730
10.1145/2517349.2522737
10.1145/3300061.3300132
10.5555/1863103.1863113
10.1007/978-3-319-77525-8_154
10.1145/2882903.2915237
10.1145/3139958.3139963
10.1145/3221269.3223040
10.1109/TNSM.2020.3034150
10.1109/CAMAD52502.2021.9617784
10.1109/ICAC.2017.31
10.3390/s21124160
10.1109/ICDE.2015.7113382
10.14778/3236187.3236213
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2024
Copyright_xml – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2024
DBID 97E
ESBDL
RIA
RIE
AAYXX
CITATION
7SC
7SP
8FD
JQ2
L7M
L~C
L~D
DOI 10.1109/TPDS.2023.3330669
DatabaseName IEEE Xplore (IEEE)
IEEE Xplore Open Access Journals
IEEE All-Society Periodicals Package (ASPP) 1998–Present
IEEE Electronic Library (IEL)
CrossRef
Computer and Information Systems Abstracts
Electronics & Communications Abstracts
Technology Research Database
ProQuest Computer Science Collection
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional
DatabaseTitle CrossRef
Technology Research Database
Computer and Information Systems Abstracts – Academic
Electronics & Communications Abstracts
ProQuest Computer Science Collection
Computer and Information Systems Abstracts
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts Professional
DatabaseTitleList
Technology Research Database
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Engineering
Computer Science
EISSN 1558-2183
EndPage 88
ExternalDocumentID 10_1109_TPDS_2023_3330669
10309986
Genre orig-research
GrantInformation_xml – fundername: OntoTrans EU Horizon 2020 Project
  grantid: 862136
– fundername: H2020 SimDOME-Digital Ontology-Based Modelling Environment for Simulation of Materials
  grantid: 814492
ID FETCH-LOGICAL-c337t-853e977a22f1501d9c271b2ba03fc646e133ae198766899f50493a79d5568a1f3
IEDL.DBID RIE
ISICitedReferencesCount 5
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=001122809600002&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
ISSN 1045-9219
IngestDate Sun Jun 29 15:08:06 EDT 2025
Tue Nov 18 22:34:45 EST 2025
Sat Nov 29 07:00:01 EST 2025
Wed Oct 29 06:12:44 EDT 2025
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 1
Language English
License https://creativecommons.org/licenses/by/4.0/legalcode
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-c337t-853e977a22f1501d9c271b2ba03fc646e133ae198766899f50493a79d5568a1f3
Notes ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
ORCID 0000-0003-0992-7948
0000-0002-5107-1023
0000-0002-4796-2181
0000-0001-9062-3647
0000-0002-3687-0361
OpenAccessLink https://ieeexplore.ieee.org/document/10309986
PQID 2895011251
PQPubID 85437
PageCount 16
ParticipantIDs proquest_journals_2895011251
ieee_primary_10309986
crossref_primary_10_1109_TPDS_2023_3330669
crossref_citationtrail_10_1109_TPDS_2023_3330669
PublicationCentury 2000
PublicationDate 2024-Jan.
2024-1-00
20240101
PublicationDateYYYYMMDD 2024-01-01
PublicationDate_xml – month: 01
  year: 2024
  text: 2024-Jan.
PublicationDecade 2020
PublicationPlace New York
PublicationPlace_xml – name: New York
PublicationTitle IEEE transactions on parallel and distributed systems
PublicationTitleAbbrev TPDS
PublicationYear 2024
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Publisher_xml – name: IEEE
– name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
References_xml – ident: ref33
  doi: 10.1109/ICAC.2017.37
– ident: ref6
  doi: 10.1145/2723372.2742797
– ident: ref24
  doi: 10.1007/s10707-018-0330-9
– ident: ref27
  doi: 10.1145/2820783.2820860
– ident: ref31
  doi: 10.14778/2536222.2536227
– ident: ref30
  doi: 10.1016/j.is.2016.09.007
– ident: ref28
  doi: 10.1117/12.2177233
– ident: ref4
  doi: 10.1145/3183713.3190664
– ident: ref8
  doi: 10.1145/1206049.1206056
– ident: ref9
  doi: 10.1007/978-3-540-28608-0_16
– ident: ref10
  doi: 10.1109/CAMAD50429.2020.9209294
– volume-title: New York, NY, USA (N.Y.). Taxi and Limousine Commission. New York, NY, USA City Taxi Trip Data
  year: 2009-2018
  ident: ref17
– ident: ref22
  doi: 10.3389/fdata.2020.00030
– ident: ref12
  doi: 10.1145/3448016.3457269
– ident: ref32
  doi: 10.1109/ICDEW.2015.7129541
– ident: ref19
  doi: 10.1145/2505515.2505728
– ident: ref3
  doi: 10.1109/GLOBECOM38437.2019.9014291
– ident: ref15
  doi: 10.1145/253260.253291
– ident: ref20
  doi: 10.1109/ACCESS.2019.2904730
– ident: ref11
  doi: 10.1145/2517349.2522737
– ident: ref18
  doi: 10.1145/3300061.3300132
– ident: ref5
  doi: 10.5555/1863103.1863113
– ident: ref16
  doi: 10.1007/978-3-319-77525-8_154
– volume-title: Sampling: Design and Analysis
  year: 2009
  ident: ref14
– ident: ref23
  doi: 10.1145/2882903.2915237
– ident: ref7
  doi: 10.1145/3139958.3139963
– ident: ref21
  doi: 10.1145/3221269.3223040
– ident: ref29
  doi: 10.1109/TNSM.2020.3034150
– ident: ref1
  doi: 10.1109/CAMAD52502.2021.9617784
– ident: ref13
  doi: 10.1109/ICAC.2017.31
– ident: ref2
  doi: 10.3390/s21124160
– ident: ref25
  doi: 10.1109/ICDE.2015.7113382
– ident: ref26
  doi: 10.14778/3236187.3236213
SSID ssj0014504
Score 2.4372873
Snippet The widespread adoption of Internet of Things (IoT) motivated the emergence of mixed workloads in smart cities, where fast arriving geo-referenced big data...
SourceID proquest
crossref
ieee
SourceType Aggregation Database
Enrichment Source
Index Database
Publisher
StartPage 73
SubjectTerms Algorithms for data and knowledge management
Apache Spark
Big Data
Big Data Applications
Data Architecture
Data transmission
Geospatial analysis
Internet of Things
Microprocessors
Pipelines
QoS Data Management
Quality of service
Query Processing
Sampling designs
Smart cities
Sparks
Spatial data
Spatial databases and GIS
Spatial Indexes
Spatial Join
State of the art
Streams
Urban areas
Workload
Workloads
Title SpatialSSJP: QoS-Aware Adaptive Approximate Stream-Static Spatial Join Processor
URI https://ieeexplore.ieee.org/document/10309986
https://www.proquest.com/docview/2895011251
Volume 35
WOSCitedRecordID wos001122809600002&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
journalDatabaseRights – providerCode: PRVIEE
  databaseName: IEEE Electronic Library (IEL)
  customDbUrl:
  eissn: 1558-2183
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0014504
  issn: 1045-9219
  databaseCode: RIE
  dateStart: 19900101
  isFulltext: true
  titleUrlDefault: https://ieeexplore.ieee.org/
  providerName: IEEE