
Performance evaluation of part-of-speech tagging for Bengali text

By: Pan, Subrata.
Contributor(s): Saha, Diganta.
Publisher: New York: Springer, 2022
Edition: Vol.103(2), Apr.
Description: 577-589p.
Subject(s): Humanities and Applied Sciences
In: Journal of the Institution of Engineers (India): Series B
Summary: This paper proposes an approach to assessing the performance of part-of-speech (POS) tagging of Bengali text. Tagging can be viewed as the process of assigning a syntactic category to each token in a text document. The difficulty lies in choosing the appropriate part-of-speech category for a token. To address this difficulty, we propose a method for carrying out domain-independent part-of-speech tagging on five corpora. We then assess tagging performance to verify the efficiency of our system. The system is developed in 10 phases: dataset initialization, sentence boundary determination, tokenization, identification of unique tokens, part-of-speech tagging, retrieval of each token and its tagged part-of-speech class, recording of the retrieved outcomes, query processing, performance evaluation and rank generation. The system successfully tagged 98.97%, 98.35%, 89.93%, 88.46% and 90.01% of the tokens in the five experimental corpora. It obtained excellent tagging performance for the Common Noun category relative to the other POS categories. The efficiency of the system is shown through a detailed performance appraisal across 18 part-of-speech categories. In total, the system successfully tagged 16,504,118 of the 18,047,593 distinct tokens in the combined corpora, achieving 91.44% overall tagging effectiveness, an improvement of about 3.24% over the baseline method.
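The overall figure in the summary can be checked against the token counts it reports. The sketch below is only an illustration of that arithmetic, not the authors' evaluation code: the function name `tagging_coverage` and the constant names are hypothetical, and the assumption that "overall tagging effectiveness" means tagged tokens divided by total tokens is ours.

```python
def tagging_coverage(tagged: int, total: int) -> float:
    """Return the percentage of tokens that received a POS tag."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * tagged / total


# Figures quoted in the summary (names are illustrative only).
TAGGED_TOKENS = 16_504_118
TOTAL_TOKENS = 18_047_593
PER_CORPUS_PERCENT = [98.97, 98.35, 89.93, 88.46, 90.01]

# Pooled (token-weighted) coverage over all five corpora.
overall = tagging_coverage(TAGGED_TOKENS, TOTAL_TOKENS)

# Unweighted mean of the five per-corpus percentages, for comparison.
macro_average = sum(PER_CORPUS_PERCENT) / len(PER_CORPUS_PERCENT)

print(f"pooled coverage:    {overall:.4f}%")
print(f"unweighted average: {macro_average:.2f}%")
```

Under these assumptions the pooled ratio comes to roughly 91.45%, which agrees with the reported 91.44% up to rounding or truncation, while the unweighted mean of the five per-corpus figures is about 93.14%. This suggests, though the record does not state it, that the overall figure is pooled over tokens rather than averaged over corpora.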
Item type: Articles Abstract Database
Current location: School of Engineering & Technology, Archival Section
Status: Not for loan
Barcode: 2022-1663
Total holds: 0


Implemented and Maintained by AIKTC-KRRC (Central Library).
For suggestions or queries, contact the library or email: librarian@aiktc.ac.in | Ph: +91 22 27481247