Title
Arabic Optical character Recognition Using Local Invariant Features /
Author
Alkholy, Mohamed Dahi Abdel-Zaher.
Committee
Researcher / محمد ضاحى عبد الظاهر الخولى
Supervisor / محيى محمد هدهود
Examiner / معوض إبراهيم معوض
Examiner / خالد محمد أمين
Subject
Optical scanners. Image processing -
Publication date
2016.
Number of pages
131 p.
Language
English
Degree
Master's
Specialization
Information Systems
Date of approval
9/8/2016
Place of approval
Menoufia University - Faculty of Computers and Information - Information Technology
Index
Only 14 pages are available for public view

Abstract

Arabic Optical Character Recognition (AOCR) is the conversion of images of typed, printed, or handwritten Arabic text into machine-encoded text. The role of OCR is to assist or replace humans in computerizing paperwork, speeding up the work and reducing cost, time, and effort. It also makes it possible to edit documents electronically, store them more compactly, and search them. OCR is not a recent research field; it started about 40 years ago. The need for it has become increasingly urgent because of the growing volume of paperwork in our societies. A great deal of research has been conducted on AOCR, since Arabic script is used for the mother tongues of over a quarter of the world's population. Despite this, a robust and reliable AOCR system remains a challenge, unlike Latin-script OCR, for which reliable printed-font systems have long been in use. This thesis aims to improve the recognition accuracy of printed Arabic characters using local invariant features. A comparative study of four recent algorithms with high reported recognition accuracy is presented. The algorithms are evaluated on a proposed computer-generated dataset, Primitive Arabic Characters Noise Free (PAC-NF), since no publicly available dataset exists for primitive printed Arabic text. The dataset contains two models, PAC-NFA and PAC-NFB. Algorithm accuracy is evaluated using the Character Recognition Rate (CRR) metric. The results show that one of the four approaches [1] achieved the highest CRR, averaging 99.36% on PAC-NFA and 75.21% on PAC-NFB. Taking this algorithm as the base technique to be improved, a combination of additional features is proposed to achieve higher recognition rates. Three types of classifiers (Random Forest Tree, ANN, and SVM) were used to test the features, and the results showed that the Random Forest Tree classifier achieved the highest CRR.
The proposed technique achieved an average CRR of 100% on PAC-NFA and 92.81% on PAC-NFB using the Random Forest Tree classifier. Its robustness against two types of noise (scanning noise and artificial Gaussian noise) was tested, and the results showed that the proposed technique is more robust to both types of noise than the base technique. A further stage, Optical Font Recognition (OFR), has been added to the AOCR system to automate the recognition of omni-font documents.
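
As an illustration of the CRR metric used in the abstract, the following minimal sketch assumes the common definition of CRR as the percentage of characters whose predicted label matches the ground-truth label; the thesis itself may define the metric differently. The function name `character_recognition_rate` and the sample labels are hypothetical and are not taken from the thesis or its datasets.

```python
from typing import Sequence


def character_recognition_rate(truth: Sequence[str], predicted: Sequence[str]) -> float:
    """Return CRR as a percentage, assuming CRR = correct / total * 100."""
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")
    if not truth:
        raise ValueError("at least one character is required")
    # Count positions where the predicted label equals the ground truth.
    correct = sum(1 for t, p in zip(truth, predicted) if t == p)
    return 100.0 * correct / len(truth)


# Hypothetical labels for four isolated character images (not thesis data).
truth = ["ا", "ب", "ت", "ث"]
predicted = ["ا", "ب", "ت", "ت"]
crr = character_recognition_rate(truth, predicted)
print(f"CRR = {crr:.2f}%")  # prints "CRR = 75.00%"
```

Under this assumed definition, an average CRR such as those reported above would be the mean of per-test-set values computed this way.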