

Name of Algorithm: Apriori Algorithm

Description:

In computer science and data mining, Apriori is a classic algorithm for learning association rules. Apriori is designed to operate on databases containing transactions (for example, collections of items bought by customers, or details of visits to a website). The algorithm attempts to find item sets that are contained in at least a minimum number C of the transactions (the cutoff, usually called the minimum support threshold). Apriori uses a bottom-up approach, where frequent subsets are extended one item at a time (a step known as candidate generation), and groups of candidates are tested against the data. The algorithm terminates when no further successful extensions are found. Apriori uses breadth-first search and a hash tree structure to count candidate item sets efficiently.
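The level-by-level search described above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the function name `apriori`, the parameter `min_count` (standing in for the cutoff C), and the toy shopping baskets are all assumptions made for the example, and candidates are counted with a plain scan over the transactions rather than a hash tree.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Return {itemset: count} for every itemset found in at least
    min_count transactions (min_count is a hypothetical name for C)."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_count}
    frequent = dict(current)

    k = 2
    while current:
        # Candidate generation: join frequent (k-1)-itemsets into k-itemsets
        # and keep only those whose (k-1)-subsets are all frequent.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)

        # Test the candidates against the data (simple scan, no hash tree).
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_count}
        frequent.update(current)
        k += 1  # breadth-first: move on to the next itemset size

    return frequent


# Toy data, invented for illustration only.
transactions = [
    {"juice", "milk", "bread"},
    {"juice", "milk"},
    {"milk", "bread"},
    {"juice", "milk", "eggs"},
    {"bread", "eggs"},
]
result = apriori(transactions, min_count=2)
for itemset, count in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

The loop stops as soon as a level produces no frequent itemsets, which corresponds to the termination condition described in the text.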

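$$\mathrm{Support}(A \Rightarrow B) = \frac{\#\ \text{tuples containing both } A \text{ and } B}{\text{total } \#\ \text{of tuples}}$$

$$\mathrm{Confidence}(A \Rightarrow B) = \frac{\#\ \text{tuples containing both } A \text{ and } B}{\#\ \text{tuples containing } A}$$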
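As a small worked example of the two ratios, the following sketch computes support and confidence for a rule A => B over a list of transactions. The function names and the toy baskets are assumptions made for illustration; each transaction is treated as one tuple.

```python
def support(transactions, a, b):
    """Fraction of all transactions that contain every item of a and of b."""
    both = sum(1 for t in transactions if a <= t and b <= t)
    return both / len(transactions)


def confidence(transactions, a, b):
    """Among transactions containing a, the fraction that also contain b."""
    has_a = sum(1 for t in transactions if a <= t)
    both = sum(1 for t in transactions if a <= t and b <= t)
    return both / has_a if has_a else 0.0


# Toy data, invented for illustration only.
transactions = [
    {"juice", "milk", "bread"},
    {"juice", "milk"},
    {"milk", "bread"},
    {"juice", "milk", "eggs"},
    {"bread", "eggs"},
]

sup = support(transactions, {"juice"}, {"milk"})             # 3 of 5 transactions
conf_jm = confidence(transactions, {"juice"}, {"milk"})      # 3 of 3 juice baskets
conf_mj = confidence(transactions, {"milk"}, {"juice"})      # 3 of 4 milk baskets
print(sup, conf_jm, conf_mj)
```

Note that support is symmetric in A and B, while confidence is not: here juice => milk has a higher confidence than milk => juice.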

Interpretation of Algorithm: EM is a clustering algorithm. With the help of EM, one can group the records of a dataset into clusters that follow a common pattern. In the above dataset, the organization has collected information about the number of logins made on a system by various users through various ports. With the help of the EM clustering algorithm, the organization can find answers to the following questions:
- Which group logs in the greatest number of times?
- Which group logs in the least number of times?
- Which port is used most?
- Which group of employees uses which protocol?

The answers to the above questions help the organization understand:
- How efficient the system is
- Which port carries the most traffic
- Which protocol is most widely used by employees
These answers help the organization increase the efficiency of the system and control the traffic on it.

Interpretation of Algorithm: The algorithm helps to find the most frequently purchased items in the given dataset. In our example, the output shows that 10 association rules were found. The algorithm tells which two items best complement each other, based on the dataset. In the output, the first association rule is juice->milk, meaning that customers who buy juice tend to buy milk as well. Thus the two items complement each other well. In the same way, the output gives 9 more rules, listed in descending order of frequency. This algorithm is widely used on very large datasets, since discarding infrequent candidates early keeps the number of item sets to be counted manageable.
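To make the reading of such a rule list concrete, here is a rough sketch that builds rules between pairs of single items and lists them in descending order of frequency. The function `rank_pair_rules`, the `min_confidence` parameter, the tie-breaking by confidence, and the toy baskets are assumptions for illustration; they are not the tool or the dataset that produced the 10 rules mentioned above.

```python
from itertools import permutations


def rank_pair_rules(transactions, min_confidence):
    """Return rules (a, b, count, confidence) for a -> b, most frequent first."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in permutations(items, 2):
        count_a = sum(1 for t in transactions if a in t)
        count_ab = sum(1 for t in transactions if a in t and b in t)
        if count_ab == 0:
            continue  # the two items never occur together
        conf = count_ab / count_a
        if conf >= min_confidence:
            rules.append((a, b, count_ab, conf))
    # Assumed ordering: co-occurrence count first, then confidence, both descending.
    rules.sort(key=lambda r: (-r[2], -r[3], r[0], r[1]))
    return rules


# Toy data, invented for illustration only.
transactions = [
    {"juice", "milk", "bread"},
    {"juice", "milk"},
    {"milk", "bread"},
    {"juice", "milk", "eggs"},
    {"bread", "eggs"},
]
rules = rank_pair_rules(transactions, min_confidence=0.6)
for a, b, count, conf in rules:
    print(f"{a} -> {b}  count={count}  confidence={conf:.2f}")
```

On this toy data the top rule is juice -> milk, because every basket that contains juice also contains milk. A confidence below 1 would mean the rule holds for most, but not all, of the baskets containing the left-hand item.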