This paper proposes a method based on linguistic word-formation rules
and dictionaries for determining reduplicative words in Vietnamese. The
key idea is to identify whether adjacent syllables in a text can form a
reduplicative word based on its formation rules. For 2-syllable
reduplicative words, this paper uses rules that describe the repetition
and opposition between pairs of initial consonants, rhymes, and tones.
The method is then extended from 2-syllable reduplicative words to those
with 3 or 4 syllables and applied to the Vietnamese word segmentation
task. Experimental results showed that the F1-score improved to
98.61% and that word segmentation errors were reduced significantly
(1.26%).
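
As a rough illustration of the 2-syllable rules described above, the sketch below splits a syllable into initial consonant, rhyme, and tone, then labels a pair of adjacent syllables as a candidate for full, initial-repeating, or rhyme-repeating reduplication. It is not the paper's rule set: the function names (`split_syllable`, `classify_pair`, `same_register`), the list of initials, and the high/low tone grouping are assumptions made for this example, and the dictionary lookup the paper relies on is left out.

```python
import unicodedata

# Combining marks that encode Vietnamese tones (ASCII names used as labels).
TONE_MARKS = {
    "\u0300": "huyen", "\u0301": "sac", "\u0309": "hoi",
    "\u0303": "nga", "\u0323": "nang",
}
# Assumed list of initial consonants, longest first so "ngh" wins over "ng".
INITIALS = sorted(
    ["ngh", "ng", "nh", "ch", "gh", "gi", "kh", "ph", "th", "tr", "qu",
     "b", "c", "d", "\u0111", "g", "h", "k", "l", "m", "n", "p", "r",
     "s", "t", "v", "x"],
    key=len, reverse=True,
)
# Assumed tone grouping used only to illustrate tone agreement.
HIGH_REGISTER = {"ngang", "hoi", "sac"}


def split_syllable(syllable):
    """Return (initial, rhyme, tone) for one Vietnamese syllable."""
    tone = "ngang"  # no tone mark means the level tone
    kept = []
    for ch in unicodedata.normalize("NFD", syllable.lower()):
        if ch in TONE_MARKS:
            tone = TONE_MARKS[ch]
        else:
            kept.append(ch)  # keeps vowel marks such as circumflex or horn
    base = unicodedata.normalize("NFC", "".join(kept))
    for initial in INITIALS:
        if base.startswith(initial) and len(base) > len(initial):
            return initial, base[len(initial):], tone
    return "", base, tone


def same_register(tone_a, tone_b):
    """True when both tones fall in the same assumed register group."""
    return (tone_a in HIGH_REGISTER) == (tone_b in HIGH_REGISTER)


def classify_pair(first, second):
    """Label a syllable pair as a reduplication candidate, or None."""
    i1, r1, _ = split_syllable(first)
    i2, r2, _ = split_syllable(second)
    if i1 == i2 and r1 == r2:
        return "full"
    if i1 and i1 == i2:
        return "initial"
    if r1 == r2:
        return "rhyme"
    return None


print(classify_pair("lung", "linh"))  # shared initial consonant
print(classify_pair("bối", "rối"))    # shared rhyme
```

A pair flagged here is only a candidate; in the approach the abstract describes, formation rules are combined with dictionaries before a pair is accepted as a word.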
Proceedings - 2015 IEEE RIVF International Conference on Computing and
Communication Technologies: Research, Innovation, and Vision for Future,
IEEE RIVF 2015, 25 February 2015, Article number 7049878, Pages 77-82