TiCA: A Tibetan Text Compression Algorithm
| Title: | TiCA: A Tibetan Text Compression Algorithm |
|---|---|
| Authors: | Chen Shuo, Suonan Jiancuo, Nima Zhaxi, Renqing Nuobu |
| Source: | Proceedings of the 2nd International Conference on Artificial Intelligence and Advanced Manufacture, pp. 8-12 |
| Publisher Information: | ACM, 2020. |
| Publication Year: | 2020 |
| Subject Terms: | 4. Education, 0202 electrical engineering, electronic engineering, information engineering, 02 engineering and technology |
| Description: | This paper proposes TiCA, a Tibetan text compression algorithm built on the observation that each Tibetan syllable consists of one to seven components, each with its own Unicode encoding. First, a fault-tolerant mapping dictionary is built through statistical analysis of a 20 GB Tibetan text corpus and serves as the TiCA dictionary. TiCA then compresses Tibetan text according to this dictionary by mapping the original codes to a single code. Finally, an experimental comparison shows that the proposed algorithm performs well in both compression rate and running time. |
| Document Type: | Article |
| DOI: | 10.1145/3421766.3421868 |
| Access URL: | https://dblp.uni-trier.de/db/conf/aiam/aiam2020.html#SuonanCRZ20 https://doi.org/10.1145/3421766.3421868 |
| Rights: | URL: https://www.acm.org/publications/policies/copyright_policy#Background |
| Accession Number: | edsair.doi.dedup.....665d19c20e9fb8cc734e86d18f7e38a5 |
| Database: | OpenAIRE |
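
The dictionary-mapping idea in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record does not specify the dictionary format, the code width, or how fault tolerance works. The sketch assumes that syllables are delimited by the tsheg mark (U+0F0B), that each known syllable maps to a 2-byte code ranked by corpus frequency, and that unknown syllables are stored literally behind a hypothetical `ESCAPE` marker. The names `build_dictionary`, `compress`, and `decompress` are illustrative only.

```python
from collections import Counter

TSHEG = "\u0f0b"   # Tibetan intersyllabic mark, used here as the syllable delimiter
ESCAPE = 0xFFFF    # hypothetical marker: the next syllable is stored literally


def build_dictionary(corpus: str, max_entries: int = 0xFFFF) -> dict[str, int]:
    """Assign codes 0..max_entries-1 to the most frequent syllables in the corpus."""
    counts = Counter(s for s in corpus.split(TSHEG) if s)
    return {syl: code for code, (syl, _) in enumerate(counts.most_common(max_entries))}


def compress(text: str, table: dict[str, int]) -> bytes:
    """Replace each known syllable by a 2-byte code; escape everything else."""
    out = bytearray()
    for syl in text.split(TSHEG):
        code = table.get(syl)
        if code is not None:
            out += code.to_bytes(2, "big")
        else:
            # Assumed fallback: marker, 4-byte length, then the raw UTF-8 bytes.
            raw = syl.encode("utf-8")
            out += ESCAPE.to_bytes(2, "big") + len(raw).to_bytes(4, "big") + raw
    return bytes(out)


def decompress(data: bytes, table: dict[str, int]) -> str:
    """Invert compress() using the same dictionary."""
    reverse = {code: syl for syl, code in table.items()}
    syllables = []
    i = 0
    while i < len(data):
        code = int.from_bytes(data[i:i + 2], "big")
        i += 2
        if code == ESCAPE:
            n = int.from_bytes(data[i:i + 4], "big")
            i += 4
            syllables.append(data[i:i + n].decode("utf-8"))
            i += n
        else:
            syllables.append(reverse[code])
    return TSHEG.join(syllables)
```

Because each Tibetan character takes three bytes in UTF-8, a syllable of several components plus its delimiter shrinks to two bytes whenever it appears in the dictionary. The escape path costs extra bytes, so under these assumptions the compression rate depends on how well the dictionary covers the input.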