
De-duplication avoidance in regional names using an approach based on pronunciation

By: Raykar, Nagesh.

Contributor(s): Kumbharkar, Prashant.

Publisher: South Africa AkiNik Publications 2023

Edition: Vol. 4(1), Jan-Jun.

Description: 10-17 p.

Subject(s): Electrical Engineering

In: International journal of advances in electrical engineering

Summary: Duplication of demographic data occurs in every field, including government, marketing, and opinion research, and is especially familiar to IT staff responsible for taking backups or transferring large amounts of data. Duplication arises both directly and indirectly, for example when the same backup is copied more than once or when different users store the same file in the same location, and it increases redundancy. There is therefore an inherent need to process or remove redundant data. The term "de-duplication" refers to removing duplicate copies and keeping only one, which is required for better utilization of data storage. Many scholars have already worked on demographic data de-duplication, and one requirement that emerges is a specific reduction rule suited to de-duplication algorithms for Indian demographic data. Based on a pronunciation rule, the researchers evaluate the regional name, first name, and last name. This requires testing various existing phonetic algorithms and then developing an efficient new one. A phonetic algorithm indexes words by their pronunciation, and most phonetic algorithms have been designed primarily for the English language. Demographic information describes individuals through features such as first name, surname, age, gender, contact number, and email ID. For names in Indian regional languages, the task is to identify an individual whose name is the same but is spelled differently. The proposed study compares traditional regional names, in first name and surname format, using the pronunciation rule. A prototype of an effective phonetic algorithm has been developed for the local languages.

An effort has been made, first, to avoid redundant information in the names and, second, to identify equivalent names even when their letters are arranged differently, so that an individual can be located in the e-governance of a region or in any industry. The findings of the proposed approach are encouraging, and it can be used in a real-world environment.
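To illustrate the general idea of phonetic indexing described in the summary, the following is a minimal Python sketch. It is not the authors' algorithm, which this record does not describe. It uses the classic American Soundex code as a stand-in, and the function names `soundex` and `group_by_sound` and the sample names are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

# Standard American Soundex digit classes (an English-oriented scheme,
# used here only as a stand-in for a regional-language phonetic rule).
_CODES = {}
for digit, letters in (("1", "BFPV"), ("2", "CGJKQSXZ"), ("3", "DT"),
                       ("4", "L"), ("5", "MN"), ("6", "R")):
    for ch in letters:
        _CODES[ch] = digit


def soundex(name: str) -> str:
    """Return the 4-character Soundex code of an ASCII name ('' if no letters)."""
    letters = [c for c in name.upper() if c.isascii() and c.isalpha()]
    if not letters:
        return ""
    code = letters[0]                      # the first letter is kept verbatim
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":                     # H and W do not separate equal codes
            continue
        digit = _CODES.get(ch, "")         # vowels and Y map to ""
        if digit and digit != prev:
            code += digit
        prev = digit                       # a vowel resets the previous code
    return (code + "000")[:4]              # pad or truncate to four characters


def group_by_sound(records):
    """Group (first name, surname) pairs whose Soundex keys coincide."""
    groups = defaultdict(list)
    for first, surname in records:
        groups[(soundex(first), soundex(surname))].append((first, surname))
    # Only groups with more than one record are candidate duplicates.
    return [group for group in groups.values() if len(group) > 1]


# Hypothetical sample records with variant spellings.
records = [
    ("Prashant", "Kulkarni"),
    ("Prashanth", "Kulkarnee"),
    ("Nagesh", "Deshpande"),
]
candidates = group_by_sound(records)
print(candidates)
```

In this sketch the two spellings of the first record share one key and are reported as a candidate duplicate pair, while the third record stays alone. Because Soundex keeps the first letter verbatim, spellings such as "Vijay" and "Wijay" receive different codes, which is one example of why an English-oriented scheme may not suit regional names.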

Item type	Current location	Call number	Status	Date due	Barcode	Item holds
Articles Abstract Database	School of Engineering & Technology Archieval Section		Not for loan		2023-0968
