Abstract The massive volume of data generated daily limits the ability of current data mining techniques to produce knowledge quickly. Because data changes constantly, previously discovered patterns must be updated constantly as well, and repeating the knowledge discovery process on the whole database after every update is computationally intensive. There is therefore a need to improve the performance of association rule mining methods under incremental updates. To that end, this thesis focuses on exploiting current hardware and software advances in high-performance computing. It proposes a distributed incremental association rule mining approach based on MPI, and a hybrid incremental mining approach based on OpenMP and MPI for high-performance computing environments. To reduce the need to reprocess the entire database, the thesis relies on the pre-large and negative border approaches. The approaches are evaluated on output accuracy, processing time, and acceleration. Experimental results show that the distributed method reduces processing time by 40% compared with the existing serial approach, and that the hybrid approach reduces processing time by 19% compared with the distributed approach.
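The reported reductions in processing time can also be read as speedup factors. The short Python sketch below does that conversion for the two figures quoted in the abstract (40% and 19%). The function names are hypothetical. The combined figure for the hybrid approach against the serial baseline is an assumption: it holds only if both percentages were measured on the same workload, which the abstract does not state.

```python
def speedup_from_reduction(reduction: float) -> float:
    """Convert a fractional reduction in processing time into a speedup factor.

    A reduction r means the new time is (1 - r) of the old time,
    so the speedup is old / new = 1 / (1 - r).
    """
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in the range [0, 1)")
    return 1.0 / (1.0 - reduction)


def combined_reduction(*reductions: float) -> float:
    """Chain successive reductions, each relative to the previous stage.

    Assumption: every stage was measured on the same workload.
    """
    remaining = 1.0
    for r in reductions:
        remaining *= 1.0 - r
    return 1.0 - remaining


# Figures quoted in the abstract.
distributed_vs_serial = speedup_from_reduction(0.40)
hybrid_vs_distributed = speedup_from_reduction(0.19)

# Derived figure, valid only under the same-workload assumption above.
hybrid_vs_serial = combined_reduction(0.40, 0.19)

print(f"distributed vs serial:  {distributed_vs_serial:.2f}x")
print(f"hybrid vs distributed:  {hybrid_vs_distributed:.2f}x")
print(f"hybrid vs serial:       {hybrid_vs_serial:.1%} less time (assumed)")
```

Under these assumptions, a 40% reduction corresponds to a speedup of about 1.67x, a 19% reduction to about 1.23x, and the two combined to roughly 51% less processing time than the serial approach.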