Browsing by Author "ALEJANDRO MAURICIO VALDÉS JIMÉNEZ"
Showing 1 - 3 of 3
- Publication: A PARALLEL APPROACH TO TEXT DATA AUGMENTATION FOR SENTIMENT ANALYSIS USING THE POS WISE SYNONYM SUBSTITUTION ALGORITHM (IEEE CONFERENCIAS, 2023)
RODRIGO ANDRÉS GUTIÉRREZ BENÍTEZ; ALEJANDRO MAURICIO VALDÉS JIMÉNEZ; ALEJANDRA ANDREA SEGURA NAVARRETE
OVER THE LAST DECADE, THE USE OF SOCIAL MEDIA AS A MASS COMMUNICATION MEDIUM HAS GIVEN PEOPLE A TOOL TO EXPRESS THEIR OPINIONS. ON IT, PEOPLE WRITE THEIR THOUGHTS AND FEELINGS ABOUT MANY TOPICS, GENERATING LARGE AMOUNTS OF DATA THAT CAN BE ANALYZED BY COMPANIES AND RESEARCHERS. WITHIN NATURAL LANGUAGE PROCESSING, EMOTION ANALYSIS FOCUSES ON EXTRACTING THE UNDERLYING EMOTIONS IN TEXT, WHILE SENTIMENT ANALYSIS FOCUSES ON EXTRACTING ITS POLARITY. TO ACCOMPLISH THESE TWO TASKS, TRADITIONAL MACHINE LEARNING AND DEEP LEARNING TECHNIQUES ARE USED. HOWEVER, TO REACH GOOD GENERALIZATION PERFORMANCE, THESE TECHNIQUES REQUIRE LARGE LABELED DATASETS FOR TRAINING. FOR RESEARCHERS THIS IS AN ISSUE BECAUSE IN LANGUAGES LIKE SPANISH LABELED DATASETS ARE SCARCE. TO SOLVE THIS, DATA AUGMENTATION TECHNIQUES ARE USED TO GENERATE LARGER LABELED DATASETS FROM A SMALL LABELED DATASET. THIS WORK PRESENTS AN OPENMP VERSION FOR SHARED MEMORY SYSTEMS OF A DATA AUGMENTATION TECHNIQUE CALLED POS WISE SYNONYM SUBSTITUTION, WHICH REPLACES SOME OF THE WORDS OF A SENTENCE WITH SYNONYMS EXTRACTED FROM WORDNET TO CREATE NEW SENTENCES. WITH THE PARALLEL APPROACH, WE REDUCED THE EXECUTION TIME COMPARED TO THE ORIGINAL VERSION, REACHING A SPEEDUP OF UP TO 17.5X.
- Publication: IMPROVING THE DISCOVERY AND CLUSTERING OF THREE-DIMENSIONAL PROTEIN PATTERNS WITH OPENMP (IEEE CONFERENCIAS, 2023)
ALEJANDRO MAURICIO VALDÉS JIMÉNEZ
THE DISCOVERY OF CONSERVED THREE-DIMENSIONAL (3D) AMINO-ACID PATTERNS AMONG A SET OF PROTEIN STRUCTURES CAN BE USEFUL, FOR INSTANCE, TO PREDICT THE FUNCTIONS OF UNKNOWN PROTEINS OR FOR THE RATIONAL DESIGN OF MULTI-TARGET DRUGS. THERE ARE SEVERAL APPLICATIONS THAT PERFORM A THREE-DIMENSIONAL SEARCH OF PATTERNS IN THE STRUCTURES OF PROTEINS. HOWEVER, DISCOVERING CONSERVED 3D PATTERNS IN A SET OF PROTEINS WITH NO OTHER BASELINE PATTERNS IS A CHALLENGE. IN THIS PAPER, WE ANALYZE AND IMPROVE A STATE-OF-THE-ART ALGORITHM, 3D-PP, THAT IMPLEMENTS THIS DISCOVERY. IN THIS ALGORITHM, THE 3D PATTERNS ARE DETECTED AND CLUSTERED USING THE ROOT MEAN SQUARE DEVIATION VALUE, MEASURED BETWEEN EACH PAIR OF 3D PATTERNS (TOPOLOGICAL VARIABILITY INDICATOR). ALTHOUGH 3D-PP HANDLES THIS TASK, THE SIMULTANEOUS PROCESSING OF LARGE NUMBERS OF PROTEINS BECOMES A COMPUTATIONAL CHALLENGE AS THE SIZE AND THE NUMBER OF PROTEINS TO BE EVALUATED GROW. IN THIS WORK, WE PRESENT AND ANALYZE DIFFERENT SHARED MEMORY PARALLEL STRATEGIES FOR 3D-PP, USING OPENMP. THESE STRATEGIES IMPROVE THE OVERALL PERFORMANCE OF THE ORIGINAL IMPLEMENTATION BY REDUCING LOAD IMBALANCE AMONG THREADS AND INCREASING OVERALL PARALLELISM. THE RESULTS SHOW SIGNIFICANT PERFORMANCE IMPROVEMENTS COMPARED TO THE ORIGINAL VERSION, ACHIEVING UP TO 13X SPEEDUP FOR A SMALL NUMBER OF PROTEINS AND 17.7X FOR A LARGER SET.
- Publication: PARALLEL ALGORITHM FOR DISCOVERING AND COMPARING THREE-DIMENSIONAL PROTEINS PATTERNS (IEEE-ACM Transactions on Computational Biology and Bioinformatics, 2024)
ALEJANDRO MAURICIO VALDÉS JIMÉNEZ
IDENTIFYING CONSERVED (SIMILAR) THREE-DIMENSIONAL PATTERNS AMONG A SET OF PROTEINS CAN BE HELPFUL FOR THE RATIONAL DESIGN OF POLYPHARMACOLOGICAL DRUGS. SOME AVAILABLE TOOLS ALLOW THIS IDENTIFICATION FROM A LIMITED PERSPECTIVE, ONLY CONSIDERING THE AVAILABLE INFORMATION, SUCH AS KNOWN BINDING SITES OR PREVIOUSLY ANNOTATED STRUCTURAL MOTIFS. THUS, THESE APPROACHES DO NOT LOOK FOR SIMILARITIES AMONG ALL PUTATIVE ORTHOSTERIC AND/OR ALLOSTERIC BINDING SITES BETWEEN PROTEIN STRUCTURES. TO OVERCOME THIS TECHNICAL WEAKNESS, GEOMFINDER WAS DEVELOPED, AN ALGORITHM FOR THE ESTIMATION OF SIMILARITIES BETWEEN ALL PAIRS OF THREE-DIMENSIONAL AMINO ACID PATTERNS DETECTED IN ANY TWO GIVEN PROTEIN STRUCTURES, WHICH WORKS WITHOUT INFORMATION ABOUT THEIR KNOWN PATTERNS. EVEN THOUGH GEOMFINDER IS A FUNCTIONAL ALTERNATIVE FOR COMPARING SMALL PROTEIN STRUCTURES, IT IS COMPUTATIONALLY INFEASIBLE FOR LARGE PROTEINS, AND THE ALGORITHM NEEDS TO IMPROVE ITS PERFORMANCE. THIS WORK PRESENTS SEVERAL PARALLEL VERSIONS OF GEOMFINDER TO EXPLOIT SMPS, DISTRIBUTED MEMORY SYSTEMS, A HYBRID OF SMP AND DISTRIBUTED MEMORY SYSTEMS, AND GPU-BASED SYSTEMS. RESULTS SHOW SIGNIFICANT IMPROVEMENTS IN PERFORMANCE COMPARED TO THE ORIGINAL VERSION, ACHIEVING UP TO 24.5X SPEEDUP WHEN ANALYZING PROTEINS OF AVERAGE SIZE AND UP TO 95.4X FOR LARGER PROTEINS.