A Stochastic Technique to Obtain Training Data for Word Segmentation
Unlike Western languages, Japanese text has no explicit word boundaries, which often makes Japanese documents hard to analyze. The difficulty increases in specialized domains such as medical, mechanical, and computer science documents. In this work, we discuss how to obtain pseudo t...
| Published in: | Proceedings of the 2009 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology - Volume 03, Vol. 3, pp. 283-286 |
|---|---|
| Main authors: | Fukuda, Takuya; Miura, Takao |
| Format: | Conference Proceeding |
| Language: | English |
| Published: | Washington, DC, USA: IEEE Computer Society, 15 September 2009; IEEE |
| Series: | ACM Conferences |
| Subjects: | Computing methodologies > Modeling and simulation > Model development and analysis > Modeling methodologies; Mathematics of computing > Probability and statistics > Probabilistic reasoning algorithms > Markov-chain Monte Carlo methods; Mathematics of computing > Probability and statistics > Probabilistic reasoning algorithms > Sequential Monte Carlo methods; Mathematics of computing > Probability and statistics > Probabilistic representations > Markov networks |
| ISBN: | 0769538010, 9780769538013 |
| Online access: | Full text |
| Abstract | Unlike Western languages, Japanese text has no explicit word boundaries, which often makes Japanese documents hard to analyze. The difficulty increases in specialized domains such as medical, mechanical, and computer science documents. In this work, we discuss how to obtain a pseudo test corpus based on the Markov chain Monte Carlo (MCMC) method, given a small amount of test data. In this setting, we show good results using our approach. |
|---|---|
| Author | Fukuda, Takuya; Miura, Takao |
| ContentType | Conference Proceeding |
| DOI | 10.1109/WI-IAT.2009.283 |
| Discipline | Computer Science |
| EISBN | 9781424453313 1424453313 |
| EndPage | 286 |
| ExternalDocumentID | 5285030 |
| Genre | orig-research |
| ISBN | 0769538010 9780769538013 |
| IsPeerReviewed | false |
| IsScholarly | false |
| Keywords | Word Segmentation; Stochastic Techniques; Markov Chain Monte Carlo (MCMC) method |
| Language | English |
| PageCount | 4 |
| PublicationCentury | 2000 |
| PublicationDate | 2009-09-15 (September 2009) |
| PublicationDateYYYYMMDD | 2009-09-15 2009-09-01 |
| PublicationDecade | 2000 |
| PublicationPlace | Washington, DC, USA |
| PublicationSeriesTitle | ACM Conferences |
| PublicationTitle | Proceedings of the 2009 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology - Volume 03 |
| PublicationTitleAbbrev | WIIAT |
| PublicationYear | 2009 |
| Publisher | IEEE Computer Society; IEEE |
| StartPage | 283 |
| SubjectTerms | Computing methodologies -- Artificial intelligence -- Natural language processing; Computing methodologies -- Machine learning; Computing methodologies -- Machine learning -- Learning paradigms; Computing methodologies -- Modeling and simulation -- Model development and analysis -- Modeling methodologies; Markov Chain Monte Carlo (MCMC) method; Mathematics of computing -- Probability and statistics -- Probabilistic algorithms; Mathematics of computing -- Probability and statistics -- Probabilistic reasoning algorithms -- Markov-chain Monte Carlo methods; Mathematics of computing -- Probability and statistics -- Probabilistic reasoning algorithms -- Sequential Monte Carlo methods; Mathematics of computing -- Probability and statistics -- Probabilistic representations -- Markov networks; Mathematics of computing -- Probability and statistics -- Stochastic processes; Mathematics of computing -- Probability and statistics -- Stochastic processes -- Markov processes; Stochastic processes; Stochastic Techniques; Theory of computation -- Theory and algorithms for application domains -- Machine learning theory -- Markov decision processes; Training data; Word Segmentation |
| Title | A Stochastic Technique to Obtain Training Data for Word Segmentation |
| URI | https://ieeexplore.ieee.org/document/5285030 |
| Volume | 3 |

