Proposed model for context topic identification of English and Hindi news article through LDA approach with NLP technique

By: Srivastav, Anukriti.
Contributor(s): Singh, Satwinder.
Publisher: New York: Springer, 2022
Edition: Vol. 103(2), Apr.
Description: 591-597p.
Subject(s): Humanities and Applied Sciences
In: Journal of the Institution of Engineers (India): Series B
Summary: According to the survey, India has the world's second-largest newspaper market, with more than 100,000 newspaper outlets, a circulation of approximately 240 million, and 1,300 million subscribers or readers. Topic modeling work is growing steadily: researchers have published many topic modeling papers and applied the technique in areas such as software engineering, political science, and medicine. This research uses LDA topic modeling because it has been applied successfully to topic modeling and classification, and because it measures the probability of a text based on the bag-of-words scheme, without considering word order. LDA is a common topic modeling algorithm with an excellent implementation in the Gensim Python package. The challenge, however, is how to extract good-quality topics that are simple, well separated, and meaningful. The purpose of this research is to find the main topics of news articles in the same category written in two different languages (Hindi and English), and then to classify these cross-language news topics using a similarity measurement. The corpus is constructed with bigrams. To achieve the research goal, we first build a headline and link extractor that scrapes the top news from Google News feeds for both English and Hindi (Google News collects news stories that have appeared on different news websites over the last 30 days and is accessible in 35 languages), and then analyze which two news headlines are similar.
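
The bigram corpus and the headline similarity step mentioned in the summary can be illustrated with a minimal sketch. It uses only the Python standard library rather than Gensim, and it makes several assumptions not stated in the abstract: whitespace tokenization, cosine similarity as the similarity measurement, and headlines that already share one vocabulary (the abstract does not say how Hindi and English headlines are brought into a common space). The function names and sample headlines are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def bigram_bag(text):
    """Count unigrams plus adjacent-word bigrams (assumed: whitespace tokens)."""
    tokens = text.lower().split()
    bigrams = [f"{a}_{b}" for a, b in zip(tokens, tokens[1:])]
    return Counter(tokens + bigrams)


def cosine_similarity(bag_a, bag_b):
    """Cosine similarity of two count vectors (an assumed choice of measure)."""
    dot = sum(bag_a[t] * bag_b[t] for t in bag_a.keys() & bag_b.keys())
    norm_a = sqrt(sum(v * v for v in bag_a.values()))
    norm_b = sqrt(sum(v * v for v in bag_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty headline is similar to nothing
    return dot / (norm_a * norm_b)


def most_similar_pair(headlines):
    """Return the two headlines whose bigram bags are most similar."""
    bags = [bigram_bag(h) for h in headlines]
    i, j = max(
        combinations(range(len(headlines)), 2),
        key=lambda pair: cosine_similarity(bags[pair[0]], bags[pair[1]]),
    )
    return headlines[i], headlines[j]


# Made-up headlines, for illustration only.
sample_headlines = [
    "heavy rain floods mumbai streets",
    "mumbai streets flooded after heavy rain",
    "cricket team wins final match",
]
```

Calling `most_similar_pair(sample_headlines)` pairs the two rain-related headlines, since they share both single words and the bigrams `heavy_rain` and `mumbai_streets`. In the setting the summary describes, the bags would presumably be replaced by LDA topic distributions before comparison; that step is left out here.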
Item type: Articles Abstract Database
Current location: School of Engineering & Technology, Archival Section
Status: Not for loan
Barcode: 2022-1664
Total holds: 0
