Author: Saleh, Marwa Emad Eldeen./ Title: Improved Text Summarization using Artificial Intelligence Techniques /

Title

Improved Text Summarization using Artificial Intelligence Techniques /

Author

Saleh, Marwa Emad Eldeen.

Preparation Committee

Researcher / مروة عماد الدين صالح يوسف

Supervisor / عبد المجيد أمين علي

Supervisor / ياسر ماهر عبد المنطلب

Subject

Artificial intelligence. Systems engineering.

Publication Date

2023.

Number of Pages

206 p.

Language

English

Degree

Doctorate

Specialization

Computer Science (miscellaneous)

Date of Approval

21/9/2023

Place of Approval

Minia University - Faculty of Computers and Information - Computer Science

Abstract

With the ever-increasing amount of textual information available on the internet, there is a growing need for efficient and effective methods of summarizing large volumes of text. Automatic text summarization (ATS) is a field of natural language processing that aims to generate summaries of long texts, capturing the most important information while retaining the original meaning.
There are two main approaches to ATS: extractive and abstractive. Extractive summarization selects the most important sentences or phrases from the original text and combines them to form a summary. Abstractive summarization, by contrast, generates a summary that is not restricted to the wording of the original text and can include new phrases or sentences that capture its meaning.
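The extractive idea described above can be illustrated with a minimal sketch. It assumes a simple word-frequency score and a naive punctuation-based sentence splitter; neither is taken from the thesis, and the names `split_sentences` and `extractive_summary` are hypothetical.

```python
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter (an assumption): break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extractive_summary(text, k=2):
    """Return the k highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    # Word frequencies over the whole document.
    freq = Counter(re.findall(r"\w+", text.lower()))

    def score(sentence):
        # Average document frequency of the words in the sentence.
        tokens = re.findall(r"\w+", sentence.lower())
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    # Rank sentence indices by score, keep the top k, then restore document order.
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    return [sentences[i] for i in sorted(ranked[:k])]


text = "Cats sleep. Cats eat fish. Dogs bark loudly at night."
summary = extractive_summary(text, k=2)
```

Every sentence in `summary` is copied verbatim from the input, which is the defining property of extractive summarization. An abstractive system would instead generate new text.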
Recently, deep learning methods have significantly improved both extractive and abstractive ATS. Despite the availability of numerous state-of-the-art techniques for abstractive text summarization in English, only a few studies have applied these techniques to other languages, particularly Arabic, which is known for its complexity.
Furthermore, most current extractive summarization techniques rely on embedding vectors that may not capture crucial features such as sentence length, sentence position, and the TF-IDF of the sentence. These features are widely recognized as important in extractive summarization and can strongly affect the effectiveness of the summarization method.
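As a rough illustration of the statistical features named above, the following sketch computes length, position, and a TF-IDF score for each sentence. The exact definitions are assumptions, since the abstract does not specify them: each sentence is treated as one "document" for the IDF term, the position score decreases linearly, and the function name `sentence_features` is hypothetical.

```python
import math
import re
from collections import Counter


def sentence_features(sentences):
    """Return one [length, position, tfidf] vector per sentence."""
    tokenized = [re.findall(r"\w+", s.lower()) for s in sentences]
    n = len(tokenized)

    # Document frequency: in how many sentences each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    features = []
    for i, tokens in enumerate(tokenized):
        tf = Counter(tokens)
        # Sum of TF * IDF over the distinct terms of the sentence.
        tfidf = 0.0
        if tokens:
            tfidf = sum((c / len(tokens)) * math.log(n / df[t]) for t, c in tf.items())
        # Assumed position score: 1.0 for the first sentence, decreasing linearly.
        position = 1.0 - i / n
        features.append([len(tokens), position, tfidf])
    return features


feats = sentence_features(["Cats sleep.", "Cats eat fish."])
```

A vector like this could be concatenated with a sentence embedding so that the model sees both semantic and statistical information. How the thesis combines them is not stated in the abstract.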
Objectives:
This thesis has two main goals: first, to develop a deep learning model for Arabic text summarization that uses advanced preprocessing tools to aid the model’s comprehension; second, to enhance extractive summarization by incorporating ensemble features to represent the input text. The objectives of this thesis can be summarized as follows:
• Conduct a systematic literature review of deep learning-based text summarization covering extractive and abstractive techniques. This review aims to provide a comprehensive overview of techniques, input representations, training strategies for extractive summarization, mechanisms to improve abstractive summarization, datasets, evaluation metrics, and challenges in both types of text summarization, along with their potential solutions.
• Propose an abstractive text summarization technique for Arabic that employs advanced preprocessing tools to aid the model’s understanding.
• Propose an extractive summarization technique based on semantic information and statistical features to improve the quality of generated summaries.
Results:
The results indicate that the first technique performed best with three layers of BiLSTM hidden states at the encoder. In addition, abstractive summarization models using the skip-gram Word2Vec model outperformed those using the CBOW Word2Vec model.
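For readers unfamiliar with the two Word2Vec variants compared above, the sketch below shows how their training examples differ. It is a minimal illustration, not the thesis's training code: the window size and the helper names `context_window`, `skipgram_pairs`, and `cbow_pairs` are assumptions.

```python
def context_window(tokens, i, window):
    """Tokens within `window` positions of index i, excluding tokens[i] itself."""
    left = tokens[max(0, i - window):i]
    right = tokens[i + 1:i + 1 + window]
    return left + right


def skipgram_pairs(tokens, window=1):
    # Skip-gram: the center word predicts each context word separately,
    # so every (center, context_word) pair is its own training example.
    return [(tokens[i], c)
            for i in range(len(tokens))
            for c in context_window(tokens, i, window)]


def cbow_pairs(tokens, window=1):
    # CBOW: the whole context (as a bag of words) predicts the center word,
    # so there is one training example per position.
    return [(context_window(tokens, i, window), tokens[i])
            for i in range(len(tokens))]


tokens = ["a", "b", "c"]
sg = skipgram_pairs(tokens)
cb = cbow_pairs(tokens)
```

Skip-gram produces more, finer-grained examples from the same text, while CBOW pools each context into one example.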
The experiments also demonstrate that the second proposed technique effectively captures the semantic and statistical information of the document and outperforms the deep learning, machine learning, and state-of-the-art techniques used for comparison.