Publication: Improving the Discovery and Clustering of Three-Dimensional Protein Patterns with OpenMP

Date
2023
Journal title
Journal ISSN
Volume title
Publisher
IEEE Conferences
Abstract
The discovery of conserved three-dimensional (3D) amino-acid patterns among a set of protein structures can be useful, for instance, to predict the functions of unknown proteins or for the rational design of multi-target drugs. Several applications perform a three-dimensional search for patterns in protein structures. However, discovering conserved 3D patterns in a set of proteins with no other baseline patterns is a challenge. In this paper, we analyze and improve a state-of-the-art algorithm, 3D-PP, that implements this discovery. In this algorithm, the 3D patterns are detected and clustered using the root mean square deviation (RMSD) value, measured between each pair of 3D patterns (a topological variability indicator). Although 3D-PP handles this task, the simultaneous processing of large numbers of proteins becomes a computational challenge as the size and the number of proteins to be evaluated grow. In this work, we present and analyze different shared-memory parallel strategies for 3D-PP using OpenMP. These strategies improve the overall performance of the original implementation by reducing load imbalance among threads and increasing overall parallelism. The results show significant performance improvements compared to the original version, achieving up to 13× speedup for a small number of proteins and 17.7× for a larger set.
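
To make the pairwise RMSD step and the load-balancing concern more concrete, here is a minimal C++ sketch. It is not the 3D-PP implementation or any of the strategies evaluated in the paper. The names `Point`, `Pattern`, `rmsd`, and `pairwise_rmsd` are hypothetical, and the sketch assumes that every pattern has the same number of points, that points correspond by index, and that the patterns are already superposed (no alignment step is performed). Only the upper triangle of the distance matrix is computed, so row `i` carries less work as `i` grows; `schedule(dynamic)` is one common way to even out that imbalance among threads.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration: a pattern is a list of 3D points.
struct Point { double x, y, z; };
using Pattern = std::vector<Point>;

// RMSD between two patterns.
// Assumption: equal length, index-wise correspondence, already superposed.
double rmsd(const Pattern& a, const Pattern& b) {
    assert(a.size() == b.size() && !a.empty());
    double sum = 0.0;
    for (std::size_t k = 0; k < a.size(); ++k) {
        const double dx = a[k].x - b[k].x;
        const double dy = a[k].y - b[k].y;
        const double dz = a[k].z - b[k].z;
        sum += dx * dx + dy * dy + dz * dz;
    }
    return std::sqrt(sum / static_cast<double>(a.size()));
}

// Symmetric n x n RMSD matrix stored row-major in a flat vector.
std::vector<double> pairwise_rmsd(const std::vector<Pattern>& patterns) {
    const long n = static_cast<long>(patterns.size());
    std::vector<double> d(static_cast<std::size_t>(n * n), 0.0);

    // Row i compares against n - i - 1 patterns, so rows are uneven in cost.
    // Dynamic scheduling hands out rows on demand instead of in fixed blocks.
    // Each (i, j) pair is visited once, so no two threads write the same cell.
    #pragma omp parallel for schedule(dynamic)
    for (long i = 0; i < n; ++i) {
        for (long j = i + 1; j < n; ++j) {
            const double v = rmsd(patterns[static_cast<std::size_t>(i)],
                                  patterns[static_cast<std::size_t>(j)]);
            d[static_cast<std::size_t>(i * n + j)] = v;
            d[static_cast<std::size_t>(j * n + i)] = v;
        }
    }
    return d;
}
```

Compiled with an OpenMP flag (for example `-fopenmp` with GCC), the outer loop runs in parallel; without the flag the pragma is ignored and the code runs serially with the same result. A clustering step could then group patterns whose pairwise RMSD falls below a chosen threshold.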