Scalable global and local hashing strategies for duplicate pruning in parallel A* graph search

For many applications of the A* algorithm, the state space is a graph rather than a tree. The implication of this for parallel A* algorithms is that different processors may perform significant duplicated work if interprocessor duplicates are not pruned. In this paper, we consider the problem of dup...

Bibliographic details
Published in:	IEEE transactions on parallel and distributed systems, Vol. 8, No. 7, pp. 738-756
Main authors:	Mahapatra, N.R., Dutt, S.
Format:	Journal Article
Language:	English
Published:	IEEE, 1997-07-01
Subjects:	Algorithm design and analysis; Costs; Delay; Hypercubes; Load management; Scalability; State-space methods; Traveling salesman problems; Tree graphs; Upper bound
ISSN:	1045-9219
Online access:	Full text

Abstract	For many applications of the A* algorithm, the state space is a graph rather than a tree. The implication of this for parallel A* algorithms is that different processors may perform significant duplicated work if interprocessor duplicates are not pruned. In this paper, we consider the problem of duplicate pruning in parallel A* graph-search algorithms implemented on distributed-memory machines. A commonly used method for duplicate pruning uses a hash function to associate with each distinct node of the search space a particular processor to which duplicate nodes arising in different processors are transmitted and thereby pruned. This approach has two major drawbacks. First, load balance is determined solely by the hash function. Second, node transmissions for duplicate pruning are global; this can lead to hot spots and slower message delivery. To overcome these problems, we propose two different duplicate pruning strategies: 1) To achieve good load balance, we decouple the task of duplicate pruning from load balancing, by using a hash function for the former and a load balancing scheme for the latter. 2) A novel search-space partitioning scheme that allocates disjoint parts of the search space to disjoint subcubes in a hypercube (or disjoint processor groups in the target architecture), so that duplicate pruning is achieved with only intrasubcube or adjacent intersubcube communication. Thus message latency and hot-spot probability are greatly reduced. The above duplicate pruning schemes were implemented on an nCUBE2 hypercube multicomputer to solve the Traveling Salesman Problem (TSP). For uniformly distributed intercity costs, our strategies yield a speedup improvement of 13 to 35 percent on 1,024 processors over previous methods that do not prune any duplicates, and 13 to 25 percent over the previous hashing-only scheme. For normally distributed data the corresponding figures are 135 percent and 10 to 155 percent. Finally, we analyze the scalability of our parallel A* algorithms on k-ary n-cube networks in terms of the isoefficiency metric, and show that they have isoefficiency lower and upper bounds of $\Theta(P \log P)$ and $\Theta(Pkn^2)$, respectively.
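The hashing-only baseline that the abstract describes (a hash function assigns each distinct node to one processor, and duplicates generated elsewhere are sent there and pruned) can be sketched as follows. This is a minimal sequential simulation, not the authors' implementation: the names `owner_of` and `prune_duplicates`, the SHA-256 based hash, the TSP state encoding, and the rule of keeping the cheaper copy of a duplicate are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict


def owner_of(state, num_procs):
    """Map a search state to the processor that prunes its duplicates.

    Assumption: any deterministic hash works; SHA-256 of repr(state) is
    used here only because it is stable across Python processes.
    """
    digest = hashlib.sha256(repr(state).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_procs


def prune_duplicates(generated, num_procs):
    """Simulate global hashing for duplicate pruning.

    generated: iterable of (source_proc, state, g_cost) triples.
    Returns (kept, messages), where kept maps owner -> {state: best g_cost}
    and messages counts nodes sent to a processor other than their source.
    """
    kept = defaultdict(dict)
    messages = 0
    for source, state, g_cost in generated:
        owner = owner_of(state, num_procs)
        if owner != source:
            messages += 1  # node must travel to its owner to be checked
        best = kept[owner].get(state)
        # Assumption: a duplicate is pruned unless it is strictly cheaper.
        if best is None or g_cost < best:
            kept[owner][state] = g_cost
    return kept, messages


# Hypothetical TSP-style states: (sorted tuple of visited cities, current city).
generated = [
    (0, ((0, 1, 2), 2), 17),
    (3, ((0, 1, 2), 2), 15),  # same state, generated on another processor
    (1, ((0, 1, 3), 3), 12),
]
kept, messages = prune_duplicates(generated, num_procs=4)
total_kept = sum(len(table) for table in kept.values())
```

In this sketch the owner of a node depends only on the hash value, so both the load on each processor and the distance each message travels are fixed by the hash function. These are the two drawbacks the abstract attributes to the hashing-only scheme and that the two proposed strategies are meant to address.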
AbstractList	The state space is a graph rather than a tree for many applications of the A* algorithm. This means that in parallel A* algorithms, different processors may perform significant duplicated work if interprocessor duplicates are not pruned. The problem of duplicate pruning in parallel A* graph-search algorithms implemented on distributed-memory machines is presented. Two different duplicate pruning strategies are discussed to solve this problem: first, duplicate pruning is decoupled from load balancing by using a hash function for pruning and a separate load balancing scheme to achieve good load balance; second, a search-space partitioning scheme allocates disjoint parts of the search space to disjoint subcubes in a hypercube, so that duplicate pruning is achieved with only intrasubcube or adjacent intersubcube communication.
Author	Mahapatra, N.R.; Dutt, S.
Author_xml	– sequence: 1 givenname: N.R. surname: Mahapatra fullname: Mahapatra, N.R. organization: Dept. of Electr. & Comput. Eng., State Univ. of New York, Buffalo, NY, USA – sequence: 2 givenname: S. surname: Dutt fullname: Dutt, S.
CODEN	ITDSEO
CitedBy_id	crossref_primary_10_1016_j_artint_2012_10_007 crossref_primary_10_1016_j_asoc_2007_10_011 crossref_primary_10_1016_j_parco_2004_05_001 crossref_primary_10_1109_ACCESS_2020_2973607 crossref_primary_10_1006_jpdc_2000_1664 crossref_primary_10_1016_j_jpdc_2005_05_028 crossref_primary_10_1109_69_755612
Cites_doi	10.1090/dimacs/022/09 10.1109/12.53599 10.1016/0167-6377(89)90038-2 10.1145/174130.174145 10.1109/DMCC.1990.556310 10.1016/0004-3702(89)90010-6 10.1287/opre.11.6.972 10.1016/j.jpdc.2007.05.013 10.1109/FMPC.1990.89450 10.1137/0804046 10.1145/2422.322422 10.1006/jpdc.1994.1106 10.1109/IPPS.1992.222970 10.1007/978-1-4757-2219-2 10.1109/IPPS.1993.262779
ContentType	Journal Article
DBID	AAYXX CITATION 7SC 8FD JQ2 L7M L~C L~D
DOI	10.1109/71.598348
DatabaseName	CrossRef Computer and Information Systems Abstracts Technology Research Database ProQuest Computer Science Collection Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Academic Computer and Information Systems Abstracts Professional
DatabaseTitle	CrossRef Computer and Information Systems Abstracts Technology Research Database Computer and Information Systems Abstracts – Academic Advanced Technologies Database with Aerospace ProQuest Computer Science Collection Computer and Information Systems Abstracts Professional
DatabaseTitleList	Computer and Information Systems Abstracts Computer and Information Systems Abstracts
DeliveryMethod	fulltext_linktorsrc
Discipline	Engineering Computer Science
EndPage	756
ExternalDocumentID	10_1109_71_598348 598348
GroupedDBID	--Z -~X .DC 0R~ 29I 4.4 5GY 5VS 6IK 97E AAJGR AARMG AASAJ AAWTH ABAZT ABFSI ABQJQ ABVLG ACGFO ACIWK AENEX AETIX AGQYO AGSQL AHBIQ AI. AIBXA AKJIK AKQYR ALLEH ALMA_UNASSIGNED_HOLDINGS ASUFR ATWAV BEFXN BFFAM BGNUA BKEBE BPEOZ CS3 DU5 E.L EBS EJD HZ~ H~9 ICLAB IEDLZ IFIPE IFJZH IPLJI JAVBF LAI M43 MS~ O9- OCL P2P PQQKQ RIA RIE RNI RNS RZB TN5 TWZ UHB VH1 AAYXX CITATION 7SC 8FD JQ2 L7M L~C L~D
ID	FETCH-LOGICAL-c374t-30465bca96c7993e224ed48f0ead00584d1207cf90e7f5b14758ecaa1a6972f53
IEDL.DBID	RIE
ISICitedReferencesCount	11
ISICitedReferencesURI	http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=10_1109_71_598348&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
ISSN	1045-9219
IngestDate	Sun Sep 28 12:04:10 EDT 2025 Wed Oct 01 17:07:45 EDT 2025 Sat Nov 29 03:35:56 EST 2025 Tue Nov 18 20:53:15 EST 2025 Wed Aug 27 02:52:19 EDT 2025
IsPeerReviewed	true
IsScholarly	true
Issue	7
Language	English
License	https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
LinkModel	DirectLink
MergedId	FETCHMERGED-LOGICAL-c374t-30465bca96c7993e224ed48f0ead00584d1207cf90e7f5b14758ecaa1a6972f53
Notes	ObjectType-Article-2 SourceType-Scholarly Journals-1 ObjectType-Feature-1 content type line 23
PQID	26399068
PQPubID	23500
PageCount	19
ParticipantIDs	crossref_primary_10_1109_71_598348 ieee_primary_598348 proquest_miscellaneous_26399068 crossref_citationtrail_10_1109_71_598348 proquest_miscellaneous_28570696
PublicationCentury	1900
PublicationDate	1997-07-01
PublicationDateYYYYMMDD	1997-07-01
PublicationDate_xml	– month: 07 year: 1997 text: 1997-07-01 day: 01
PublicationDecade	1990
PublicationTitle	IEEE transactions on parallel and distributed systems
PublicationTitleAbbrev	TPDS
PublicationYear	1997
Publisher	IEEE
Publisher_xml	– name: IEEE
SSID	ssj0014504
Score	1.6066258
SourceID	proquest crossref ieee
SourceType	Aggregation Database Enrichment Source Index Database Publisher
StartPage	738
SubjectTerms	Algorithm design and analysis; Costs; Delay; Hypercubes; Load management; Scalability; State-space methods; Traveling salesman problems; Tree graphs; Upper bound
Title	Scalable global and local hashing strategies for duplicate pruning in parallel A* graph search
URI	https://ieeexplore.ieee.org/document/598348 https://www.proquest.com/docview/26399068 https://www.proquest.com/docview/28570696
Volume	8
WOSCitedRecordID	wos10_1109_71_598348&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
hasFullText	1
inHoldings	1