Mining Relationships in the Big Data Era Using an Improved Apriori Algorithm with a MapReduce Approach








Abstract

Technology is advancing rapidly and data is being generated at an ever faster rate, so the characteristics of data have shifted from conventional data to big data. Existing data mining algorithms struggle to mine relationships in a big data environment and take a long time to process the data. MapReduce is an efficient programming model within big data frameworks that handles huge amounts of data and returns results quickly. The Apriori algorithm is a powerful algorithm for mining interesting relationships among items in a dataset, within a single database or across databases of any type. Many MapReduce-based Apriori algorithms are available today, but their Map and Reduce functions run multiple times and they work only on transaction databases. This paper describes big data and its characteristics, the concept of association rules with the Apriori algorithm in big data, and the problems in the existing MapReduce-based Apriori algorithms. We propose a new, improved MapReduce-based Apriori algorithm for mining relationships, illustrated with a suitable example, in which the Reduce function runs only once after the Map function and which works on any type of database.


Modules


Algorithms

Data Mining algorithms
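
The abstract describes an Apriori variant in which the Reduce function runs only once after the Map function. The sketch below illustrates one way such a single-pass scheme could look in plain Python, without Hadoop. It is not the paper's algorithm: the function names (map_transaction, reduce_counts, frequent_itemsets), the max_size cap, and the sample transactions are all assumptions made for illustration. The Map step emits every candidate itemset of a record with a count of 1, and the single Reduce step sums the counts and keeps the itemsets that meet the minimum support.

```python
from collections import Counter
from itertools import combinations


def map_transaction(transaction, max_size):
    """Map step (hypothetical): emit (itemset, 1) for every itemset
    of up to max_size items found in one record."""
    items = sorted(set(transaction))  # canonical order so equal itemsets match
    for size in range(1, max_size + 1):
        for itemset in combinations(items, size):
            yield itemset, 1


def reduce_counts(pairs, min_support):
    """Reduce step (hypothetical), run once: sum the counts per itemset
    and keep only the itemsets that reach min_support."""
    totals = Counter()
    for itemset, count in pairs:
        totals[itemset] += count
    return {itemset: n for itemset, n in totals.items() if n >= min_support}


def frequent_itemsets(transactions, min_support, max_size=3):
    """Run the Map step over every record, then the Reduce step once."""
    pairs = (
        pair
        for transaction in transactions
        for pair in map_transaction(transaction, max_size)
    )
    return reduce_counts(pairs, min_support)


# Made-up sample data, for illustration only.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]

result = frequent_itemsets(transactions, min_support=2, max_size=2)
for itemset, support in sorted(result.items()):
    print(itemset, support)
```

The trade-off in this sketch is that the Map step enumerates all itemsets up to max_size instead of pruning candidates level by level as classic Apriori does, so the cap on itemset size keeps the output manageable for wide records.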


Software And Hardware

• Hardware:
  Processor: i3, i5
  RAM: 4 GB
  Hard disk: 16 GB
• Software:
  Operating system: Windows 2000/XP/7/8/10
  Tools: Anaconda, Jupyter, Spyder, Flask, Hadoop
  Frontend: Python
  Backend: MySQL