Publication:
APPLYING MACHINE LEARNING SAMPLING TECHNIQUES TO ADDRESS DATA IMBALANCE IN A CHILEAN COVID-19 SYMPTOMS AND COMORBIDITIES DATASET

dc.creator: GASTÓN PATRICIO MÁRQUEZ ORTEGA
dc.date: 2025
dc.date.accessioned: 2025-11-24T19:19:17Z
dc.date.available: 2025-11-24T19:19:17Z
dc.date.issued: 2025
dc.description.abstract: RELIABLY DETECTING COVID-19 IS CRITICAL FOR DIAGNOSIS AND DISEASE CONTROL. HOWEVER, IMBALANCED DATA IN MEDICAL DATASETS POSE SIGNIFICANT CHALLENGES FOR MACHINE LEARNING MODELS, LEADING TO BIAS AND POOR GENERALIZATION. THE DATASET OBTAINED FROM THE EPIVIGILA SYSTEM AND THE CHILEAN EPIDEMIOLOGICAL SURVEILLANCE PROCESS CONTAINS INFORMATION ON OVER 6,000,000 PATIENTS, BUT, LIKE MANY CURRENT DATASETS, IT SUFFERS FROM CLASS IMBALANCE. TO ADDRESS THIS ISSUE, WE APPLIED VARIOUS MACHINE LEARNING ALGORITHMS, BOTH WITH AND WITHOUT SAMPLING METHODS, AND COMPARED THEM USING DIFFERENT CLASSIFICATION AND DIAGNOSTIC METRICS SUCH AS PRECISION, SENSITIVITY, SPECIFICITY, LIKELIHOOD RATIO POSITIVE, AND DIAGNOSTIC ODDS RATIO. OUR RESULTS SHOWED THAT APPLYING SAMPLING METHODS TO THIS DATASET IMPROVED THE METRIC VALUES AND CONTRIBUTED TO MODELS WITH BETTER GENERALIZATION. EFFECTIVELY MANAGING IMBALANCED DATA IS CRUCIAL FOR RELIABLE COVID-19 DIAGNOSIS. THIS STUDY ENHANCES THE UNDERSTANDING OF HOW MACHINE LEARNING TECHNIQUES CAN IMPROVE DIAGNOSTIC RELIABILITY AND CONTRIBUTE TO BETTER PATIENT OUTCOMES.
dc.format: application/pdf
dc.identifier.doi: 10.3390/app15031132
dc.identifier.issn: 2076-3417
dc.identifier.uri: https://repositorio.ubiobio.cl/handle/123456789/14292
dc.language:
dc.publisher: Applied Sciences-Basel
dc.relation.uri: 10.3390/app15031132
dc.rights: OPEN ACCESS
dc.subject: Sampling methods
dc.subject: Machine learning algorithms
dc.subject: Imbalanced data
dc.subject: Epidemiological dataset
dc.subject: Classification metrics
dc.subject: EPIVIGILA system
dc.subject: COVID-19 diagnosis
dc.title: APPLYING MACHINE LEARNING SAMPLING TECHNIQUES TO ADDRESS DATA IMBALANCE IN A CHILEAN COVID-19 SYMPTOMS AND COMORBIDITIES DATASET
dc.type: ARTICLE
dspace.entity.type: Publication
oaire.licenseCondition: CC BY 4.0
ubb.Estado: PUBLISHED
ubb.Otra Reparticion: DEPARTAMENTO DE CIENCIAS DE LA COMPUTACION Y TECNOLOGIA DE LA INFORMACION.
ubb.Sede: CHILLÁN
Files
Original bundle
Name: documento_publicacion_24_11_2025_16_19_03.pdf
Size: 835.44 KB
Format: Adobe Portable Document Format
Description: