Publication: AN ONLINE MULTI-SOURCE SUMMARIZATION ALGORITHM FOR TEXT READABILITY IN TOPIC-BASED SEARCH
dc.creator | CLAUDIO ORLANDO GUTIÉRREZ SOTO | |
dc.date | 2021 | |
dc.date.accessioned | 2025-01-10T15:25:12Z | |
dc.date.available | 2025-01-10T15:25:12Z | |
dc.date.issued | 2021 | |
dc.description.abstract | WEB SEARCH USERS ARE LIKELY TO FACE PROBLEMS RELATED TO THE AVAILABILITY OF LARGE AMOUNTS OF DATA. AS THE QUANTITY OF ONLINE CONTENT GROWS, THE RISK OF MISSING RELEVANT INFORMATION DURING SEARCH CAN ONLY INCREASE. MOREOVER, EXTERNAL VARIABLES SUCH AS THE USERS' READING PROFICIENCY LEVEL CAN FURTHER COMPLICATE THE TASK. THIS ARTICLE PROPOSES AN ONLINE MULTI-DOCUMENT SUMMARIZATION ALGORITHM FOR TEXT READABILITY, AS A MEANS TO SIMPLIFY WEB SEARCH. THE ALGORITHM IS DESIGNED TO WORK OVER COLLECTIONS OF TOPIC-RELATED DOCUMENTS, SUCH AS THE ONES RETURNED AS THE RESULTS TO A WEB QUERY. CONTRARY TO MOST MODERN APPROACHES, NO PRELIMINARY TRAINING FOR THE ALGORITHM IS REQUIRED. THE ALGORITHM WAS TESTED IN BOTH ENGLISH AND SPANISH LANGUAGE DOCUMENTS, USING DIFFERENT METRICS OF TERM AND SENTENCE RELEVANCE. THE RESULTS WERE COMPARED AGAINST SUMMARIES CREATED BY BOTH HUMAN SUMMARIZERS AND THIRD-PARTY AUTOMATIC TEXT SUMMARIZATION (ATS) SYSTEMS IN TERMS OF TWO VARIABLES: READABILITY AND INFORMATION CONTENT. IN BOTH VARIABLES, THE RESULTS SHOW GENERALIZED GAINS WITH RESPECT TO BOTH THE HUMAN SUMMARIZERS AND THE THIRD-PARTY ATS SYSTEMS. FURTHERMORE, THE ALGORITHM ACHIEVED THESE RESULTS WITH A TIME COMPLEXITY STRICTLY LOWER THAN ; WELL BELOW TRADITIONAL MACHINE LEARNING APPROACHES. | |
dc.format | application/pdf | |
dc.identifier.doi | 10.1016/j.csl.2020.101143 | |
dc.identifier.issn | 1095-8363 | |
dc.identifier.issn | 0885-2308 | |
dc.identifier.uri | https://repositorio.ubiobio.cl/handle/123456789/11928 | |
dc.language | spa | |
dc.publisher | COMPUTER SPEECH AND LANGUAGE | |
dc.relation.uri | 10.1016/j.csl.2020.101143 | |
dc.rights | PUBLISHED | |
dc.title | AN ONLINE MULTI-SOURCE SUMMARIZATION ALGORITHM FOR TEXT READABILITY IN TOPIC-BASED SEARCH | |
dc.title.alternative | UN ALGORITMO DE RESUMEN MULTIFUENTE EN LÍNEA PARA LA LEGIBILIDAD DEL TEXTO EN LA BÚSQUEDA BASADA EN TEMAS | |
dc.type | ARTICLE | |
dspace.entity.type | Publication | |
ubb.Estado | PUBLISHED | |
ubb.Otra Reparticion | DEPARTAMENTO DE SISTEMAS DE INFORMACION | |
ubb.Sede | CONCEPCIÓN |