Ensemble learning approach to classifying documents based on formal and informal writing styles

Publication details: Hyderabad: IUP Publications, 2022
Edition: Vol. 18(3), Sep
Description: 27-49 p.
In: IUP Journal of Information Technology
Summary: With recent advances in technology, many students and scholars have been tempted to use the internet as their main educational resource, since they can obtain a variety of documents online. These documents can be classified as either formal or informal in writing style, each involving different linguistic features. The paper presents a method to automatically identify the style of a particular document. First, a dataset of online documents was compiled and preprocessed. Next, features were extracted via a term frequency-inverse document frequency (TF-IDF) vectorizer. Classification models were then built using six classification algorithms. Initially, five machine learning algorithms (random forest, decision tree, support vector machine, multilayer perceptron, and naive Bayes) were used. Of these five, the random forest algorithm performed best, obtaining an accuracy of 87.44%, high values for precision and recall, and an F-measure with the lowest error rate. In the second experiment, an ensemble learning method was used, whereby a vote algorithm combined the five algorithms. This method obtained an accuracy of 91.96%.
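
The two technical steps named in the summary, TF-IDF weighting and combining several classifiers by vote, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the record does not say which TF-IDF variant or which voting rule was used, so the sketch assumes the plain tf * log(N / df) weighting and simple majority (hard) voting. The function names `tfidf` and `majority_vote`, the sample documents, and the per-classifier predictions are all hypothetical.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: how many documents contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def majority_vote(predictions):
    """Combine per-classifier label lists into one label per document."""
    combined = []
    # zip(*predictions) groups the labels that each classifier gave one document.
    for labels in zip(*predictions):
        combined.append(Counter(labels).most_common(1)[0][0])
    return combined


# Hypothetical toy documents (not from the paper's dataset).
docs = [
    "dear sir please find attached the report",
    "hey wanna grab lunch lol",
    "please find the minutes attached",
]
weights = tfidf(docs)

# Hypothetical outputs of five already-trained classifiers, one list each.
predictions = [
    ["formal", "informal", "formal"],
    ["formal", "informal", "informal"],
    ["formal", "formal", "formal"],
    ["informal", "informal", "formal"],
    ["formal", "informal", "formal"],
]
print(majority_vote(predictions))  # ['formal', 'informal', 'formal']
```

With five voters and two classes a tie cannot occur, which is one practical reason to combine an odd number of classifiers. Individual classifiers disagree on every document in the toy data above, yet the combined vote is unanimous in favour of the majority label, which is the effect an ensemble relies on.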
Holdings
Item type: Articles Abstract Database
Current library: School of Engineering & Technology (Archival Section)
Status: Not for loan
Barcode: 2023-0123
Total holds: 0
