Publication:
AN ONLINE MULTI-SOURCE SUMMARIZATION ALGORITHM FOR TEXT READABILITY IN TOPIC-BASED SEARCH

dc.creator: CLAUDIO ORLANDO GUTIÉRREZ SOTO
dc.date: 2021
dc.date.accessioned: 2025-01-10T15:25:12Z
dc.date.available: 2025-01-10T15:25:12Z
dc.date.issued: 2021
dc.description.abstract: WEB SEARCH USERS ARE LIKELY TO FACE PROBLEMS RELATED TO THE AVAILABILITY OF LARGE AMOUNTS OF DATA. AS THE QUANTITY OF ONLINE CONTENT GROWS, THE RISK OF MISSING RELEVANT INFORMATION DURING SEARCH CAN ONLY INCREASE. MOREOVER, EXTERNAL VARIABLES SUCH AS THE USERS' READING PROFICIENCY LEVEL CAN FURTHER COMPLICATE THE TASK. THIS ARTICLE PROPOSES AN ONLINE MULTI-DOCUMENT SUMMARIZATION ALGORITHM FOR TEXT READABILITY, AS A MEANS TO SIMPLIFY WEB SEARCH. THE ALGORITHM IS DESIGNED TO WORK OVER COLLECTIONS OF TOPIC-RELATED DOCUMENTS, SUCH AS THE ONES RETURNED AS THE RESULTS OF A WEB QUERY. CONTRARY TO MOST MODERN APPROACHES, NO PRELIMINARY TRAINING FOR THE ALGORITHM IS REQUIRED. THE ALGORITHM WAS TESTED ON BOTH ENGLISH- AND SPANISH-LANGUAGE DOCUMENTS, USING DIFFERENT METRICS OF TERM AND SENTENCE RELEVANCE. THE RESULTS WERE COMPARED AGAINST SUMMARIES CREATED BY BOTH HUMAN SUMMARIZERS AND THIRD-PARTY AUTOMATIC TEXT SUMMARIZATION (ATS) SYSTEMS IN TERMS OF TWO VARIABLES: READABILITY AND INFORMATION CONTENT. IN BOTH VARIABLES, THE RESULTS SHOW GENERALIZED GAINS WITH RESPECT TO BOTH THE HUMAN SUMMARIZERS AND THE THIRD-PARTY ATS SYSTEMS. FURTHERMORE, THE ALGORITHM ACHIEVED THESE RESULTS WITH A TIME COMPLEXITY STRICTLY LOWER THAN ; WELL BELOW TRADITIONAL MACHINE LEARNING APPROACHES.
dc.format: application/pdf
dc.identifier.doi: 10.1016/j.csl.2020.101143
dc.identifier.issn: 1095-8363
dc.identifier.issn: 0885-2308
dc.identifier.uri: https://repositorio.ubiobio.cl/handle/123456789/11928
dc.language: spa
dc.publisher: COMPUTER SPEECH AND LANGUAGE
dc.relation.uri: 10.1016/j.csl.2020.101143
dc.rights: PUBLISHED
dc.title: AN ONLINE MULTI-SOURCE SUMMARIZATION ALGORITHM FOR TEXT READABILITY IN TOPIC-BASED SEARCH
dc.title.alternative: UN ALGORITMO DE RESUMEN MULTIFUENTE EN LÍNEA PARA LA LEGIBILIDAD DEL TEXTO EN LA BÚSQUEDA BASADA EN TEMAS
dc.type: ARTICLE
dspace.entity.type: Publication
ubb.Estado: PUBLISHED
ubb.Otra Reparticion: DEPARTAMENTO DE SISTEMAS DE INFORMACION
ubb.Sede: CONCEPCIÓN
Files