
Automatic Language Identification using Basic Signal Class

By: Panchanan, Supaarna.
Contributor(s): Arup, Saha.
Publisher: New Delhi STM Journals 2019
Edition: Vol.9(2), May-Aug.
Description: 12-19p.
Subject(s): Electrical Engineering
In: Trends in electrical engineering (TEE)
Summary: Automatic language identification (ASLID) is the problem of having a computer identify an unknown language from a spoken utterance. A segmental approach to ASLID is based on the assumption that the acoustic structure of a language can be estimated by segmenting speech into three basic classes of speech signals. This paper presents an ASLID procedure, with details of the methodology and the results, that does not recognize words but instead uses the segment lengths of three basic classes of signals: quasi-periodic (free voiced vowels and obstructed voice, e.g., murmurs and laterals), quasi-random (noise segments, sibilants, frication in affricates) and quiescent (plosives and affricates, silent periods, occlusions, and silences caused by breath pauses). The quasi-periodic class is further divided into fully voiced signals and obstructed vocalic signals. The classifier uses features from these four classes, which are extracted with more than 98.6% accuracy. The study is conducted with standard dialects of sixteen spoken languages, namely Assamese, Bengali, Hindi, Marathi, Gujarati, Panjabi, Urdu, Malayalam, Odia, Konkani, Maithili, Kannada, Manipuri, Nepali and Telugu. The sixteen languages were chosen so that they cover almost all the states of India. The corpus mainly contains spontaneous speech in conversational mode on various topics, viz. agriculture, social welfare, personal interviews, etc., spoken by speakers of both sexes. The database consists of more than 30 minutes of spoken data for each of these dialects and was collected from regional radio broadcasts.

The relative abundance of the aforesaid signal classes is expected to differ from language to language, so each language is expected to show a distinctive pattern. The collected database is therefore evaluated with a Relative Abundance Model (RAM) using a weighted Euclidean distance classifier. The proposed model explores the spoken data using time-domain parameters. Its distinguishing feature is that it does not use any of the linguistic information normally used for language identification. The segmental durations of the aforesaid signal types are observed to vary across languages, and RAM was developed to exploit this variation. With these sixteen languages from three language families, viz. Indo-Aryan, Dravidian and Tibeto-Burman, a recognition rate of 70% has been achieved.
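The summary does not give the details of the Relative Abundance Model, so the following is only a minimal sketch of the general idea: turn the labelled segment durations of an utterance into the share of time spent in each of the four signal classes, then assign the language whose mean profile is nearest under a weighted Euclidean distance. The class labels, the function names, the use of per-language mean profiles, and the inverse-variance weights are all assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import defaultdict

# Hypothetical labels for the four signal classes named in the summary.
CLASSES = ("voiced", "obstructed", "quasi_random", "quiescent")


def relative_abundance(segments):
    """Return the fraction of total duration spent in each class.

    segments: iterable of (class_label, duration_in_seconds) pairs.
    """
    totals = dict.fromkeys(CLASSES, 0.0)
    for label, duration in segments:
        totals[label] += duration
    total = sum(totals.values())
    if total == 0:
        raise ValueError("segments contain no duration")
    return [totals[c] / total for c in CLASSES]


def train(samples):
    """Estimate one mean profile per language plus per-feature weights.

    samples: list of (language, feature_vector) pairs.
    Weights are the inverse pooled within-language variance of each
    feature (an assumption; the paper's weighting is not specified).
    """
    by_lang = defaultdict(list)
    for lang, vec in samples:
        by_lang[lang].append(vec)
    centroids = {
        lang: [sum(col) / len(vecs) for col in zip(*vecs)]
        for lang, vecs in by_lang.items()
    }
    variance = [0.0] * len(CLASSES)
    for lang, vec in samples:
        for i, (x, mean) in enumerate(zip(vec, centroids[lang])):
            variance[i] += (x - mean) ** 2
    # A small constant keeps the weight finite when a feature never varies.
    weights = [1.0 / (v / len(samples) + 1e-9) for v in variance]
    return centroids, weights


def classify(vec, centroids, weights):
    """Return the language whose mean profile is nearest to vec."""
    def distance(centroid):
        return math.sqrt(
            sum(w * (x - c) ** 2 for w, x, c in zip(weights, vec, centroid))
        )
    return min(centroids, key=lambda lang: distance(centroids[lang]))
```

In use, each training utterance would be passed through `relative_abundance`, the resulting vectors and their language labels through `train`, and an unseen utterance's vector through `classify`. Because only durations enter the features, no phonetic or lexical information is needed, which matches the summary's description of a purely time-domain approach.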
Item type: Articles Abstract Database
Current location: School of Engineering & Technology, Archival Section
Status: Not for loan
Barcode: 2020631

