
Data Quality Evaluation Framework for Big Data

By: Onyeabor, Grace Amina.
Contributor(s): Azman Ta'a.
Publisher: Tamil Nadu: i-manager's, 2018
Edition: Vol. 5(2), Jul-Dec
Description: p. 27-35
Subject(s): Computer Engineering
In: i-manager's journal on cloud computing (JCC)
Item type: Articles Abstract Database
Current location: School of Engineering & Technology, Archival Section
Status: Not for loan
Barcode: 2019882
Total holds: 0

Summary: Data is an important asset in every business organization today, so poor data quality can have serious consequences, leading to erroneous insights. Data Quality (DQ) therefore needs to be evaluated before any Big Data (BD) is analyzed. Evaluating DQ in BD is challenging: the datasets are enormous, come in varied formats, and are generated at high speed, so traditional DQ evaluation methods cannot be applied. Instead, strategies and tools are required that assess and evaluate DQ in BD rapidly and efficiently. Assessing the quality of an entire BD set, however, can be very expensive, and the data transformation activities of BD also need improvement. This paper proposes a framework for DQ evaluation that applies a data sampling technique to BD sets from different data sources, reducing the data to samples that represent the population of the BD sets. The Bag of Little Bootstraps (BLB) sampling technique will be used. The target Data Quality Dimensions (DQDs) are completeness, consistency, and accuracy, each measured using metric functions relevant to that dimension. The measurements will be taken before and after an improved data transformation technique is applied, to check the improvement of DQ in BD.
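
To make the sampling idea in the summary concrete, the following is a minimal sketch of a Bag of Little Bootstraps style estimate for one dimension, completeness. It is not the paper's implementation: this record does not give the metric functions or parameters, so the function names `completeness` and `blb_completeness`, the definition of completeness as the share of non-missing values, and the defaults (`gamma=0.6`, 5 subsets, 20 resamples) are all assumptions chosen for illustration.

```python
import random
import statistics
from collections import Counter


def completeness(values, weights=None):
    """Fraction of non-missing (non-None) values, optionally weighted.

    Assumed metric: the source does not define its completeness function.
    """
    if weights is None:
        weights = [1] * len(values)
    total = sum(weights)
    present = sum(w for v, w in zip(values, weights) if v is not None)
    return present / total


def blb_completeness(values, n_subsets=5, n_resamples=20, gamma=0.6, seed=0):
    """Estimate completeness and its standard error with a BLB-style procedure.

    All parameter defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    n = len(values)
    b = max(1, int(n ** gamma))  # subset size b = n^gamma, much smaller than n
    subset_means, subset_errors = [], []
    for _ in range(n_subsets):
        subset = rng.sample(values, b)  # b records drawn without replacement
        estimates = []
        for _ in range(n_resamples):
            # Resample n records from the small subset, keeping only the
            # multinomial counts so that just b records are ever touched.
            counts = Counter(rng.choices(range(b), k=n))
            weights = [counts.get(i, 0) for i in range(b)]
            estimates.append(completeness(subset, weights))
        subset_means.append(statistics.mean(estimates))
        subset_errors.append(statistics.stdev(estimates))
    # Average the per-subset results for the final estimate and its error.
    return statistics.mean(subset_means), statistics.mean(subset_errors)
```

Calling `blb_completeness(column)` on a list that uses `None` for missing entries returns an estimated completeness together with an estimated standard error. The same pattern could in principle be repeated for consistency and accuracy by swapping in a different weighted metric function, and run before and after a transformation step to compare the two estimates.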
