Performance evaluation of part-of-speech tagging for Bengali text (Record no. 17561)

000 -LEADER
fixed length control field a
003 - CONTROL NUMBER IDENTIFIER
control field OSt
005 - DATE AND TIME OF LATEST TRANSACTION
control field 20220920152408.0
008 - FIXED-LENGTH DATA ELEMENTS--GENERAL INFORMATION
fixed length control field 220920b xxu||||| |||| 00| 0 eng d
040 ## - CATALOGING SOURCE
Original cataloging agency AIKTC-KRRC
Transcribing agency AIKTC-KRRC
100 ## - MAIN ENTRY--PERSONAL NAME
9 (RLIN) 17966
Author Pan, Subrata
245 ## - TITLE STATEMENT
Title Performance evaluation of part-of-speech tagging for Bengali text
250 ## - EDITION STATEMENT
Volume, Issue number Vol.103(2), Apr
260 ## - PUBLICATION, DISTRIBUTION, ETC.
Place of publication, distribution, etc. New York
Name of publisher, distributor, etc. Springer
Year 2022
300 ## - PHYSICAL DESCRIPTION
Pagination 577-589p.
520 ## - SUMMARY, ETC.
Summary, etc. This paper proposes an approach to assess the performance of part-of-speech (POS) tagging of Bengali text. Tagging can be viewed as the process of assigning a syntactic category to each token in a text document. The difficulty lies in choosing the appropriate part-of-speech category for each token. To address this difficulty, we propose a method for carrying out part-of-speech tagging on five corpora, independent of domain. We then assess tagging performance to verify the efficiency of our system. The system is developed in 10 phases: initialization of the dataset, sentence boundary determination, tokenization, identification of unique tokens, part-of-speech tagging, retrieval of each token and its tagged part-of-speech class, recording of the retrieved outcomes, query processing, performance evaluation and rank generation. Five corpora were used in the experiments. The system successfully tagged 98.97%, 98.35%, 89.93%, 88.46% and 90.01% of the tokens in the experimental corpora. It achieved its best tagging performance for the Common Noun category compared with the other POS categories. The efficiency of the system is shown through a detailed performance appraisal across 18 part-of-speech categories. Overall, the system successfully tagged 16,504,118 of the 18,047,593 distinct tokens in the combined corpora, achieving 91.44% overall tagging effectiveness, an improvement of about 3.24% over the baseline method.
650 #0 - SUBJECT ADDED ENTRY--TOPICAL TERM
9 (RLIN) 4642
Topical term or geographic name entry element Humanities and Applied Sciences
700 ## - ADDED ENTRY--PERSONAL NAME
9 (RLIN) 17967
Co-Author Saha, Diganta
773 0# - HOST ITEM ENTRY
International Standard Serial Number 2250-2106
Title Journal of the Institution of Engineers (India): Series B
856 ## - ELECTRONIC LOCATION AND ACCESS
URL https://link.springer.com/article/10.1007/s40031-021-00630-5
Link text Click here
942 ## - ADDED ENTRY ELEMENTS (KOHA)
Source of classification or shelving scheme
Koha item type Articles Abstract Database
Holdings
Withdrawn status
Lost status
Source of classification or shelving scheme
Damaged status
Not for loan
Permanent Location School of Engineering & Technology
Current Location School of Engineering & Technology
Shelving location Archival Section
Date acquired 2022-09-20
Barcode 2022-1663
Date last seen 2022-09-20
Price effective from 2022-09-20
Koha item type Articles Abstract Database