Introduction to Data Mining


Lesson 6.6.3.4

Reducing Scans via Partition

 

  1. Divide the dataset D into m partitions D1, D2, ..., Dm, sized so that each partition fits into memory.

  2. For each partition i, find the itemsets Fi that are frequent in Di, i.e. whose relative support within Di is ≥ minSupp. If an itemset is frequent in D, then it must be frequent in at least one Di.

  3. The union of all the Fi's forms the candidate set of frequent itemsets for D. Count the support in D of these candidates only.

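  4. This typically requires only 2 scans of D: one to find the frequent itemsets of each partition, and one to count the candidates over the whole dataset.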
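
The steps above can be sketched in Python as follows. This is only an illustrative sketch, not a reference implementation: the function names (local_frequent_itemsets, partition_frequent_itemsets) are hypothetical, minSupp is assumed to be a fraction of the transactions, and the per-partition miner is a brute-force subset counter standing in for whatever in-memory algorithm (e.g. Apriori) would be used in practice.

```python
from collections import Counter
from itertools import combinations


def local_frequent_itemsets(partition, min_supp):
    """Find itemsets frequent within one in-memory partition.

    Brute-force stand-in (assumption): enumerates every subset of every
    transaction, so it is only suitable for tiny examples.
    """
    threshold = min_supp * len(partition)
    counts = Counter()
    for transaction in partition:
        items = sorted(transaction)
        for k in range(1, len(items) + 1):
            for subset in combinations(items, k):
                counts[frozenset(subset)] += 1
    return {itemset for itemset, c in counts.items() if c >= threshold}


def partition_frequent_itemsets(transactions, m, min_supp):
    """Two-scan partition approach; returns {itemset: global count}."""
    n = len(transactions)
    size = max(1, -(-n // m))  # ceiling division: transactions per partition
    partitions = [transactions[i:i + size] for i in range(0, n, size)]

    # Scan 1: the union of the local frequent itemsets is the candidate set.
    candidates = set()
    for part in partitions:
        candidates |= local_frequent_itemsets(part, min_supp)

    # Scan 2: count only the candidates over the whole dataset.
    counts = Counter()
    for transaction in transactions:
        t = set(transaction)
        for cand in candidates:
            if cand <= t:
                counts[cand] += 1

    # Keep the candidates that are frequent in D as a whole.
    return {c: counts[c] for c in candidates if counts[c] >= min_supp * n}


transactions = [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}, {"b", "c"}]
result = partition_frequent_itemsets(transactions, m=2, min_supp=0.5)
```

In this toy run the itemset {a, b, c} is locally frequent in the second partition, so it becomes a candidate, but the second scan finds it in only 1 of the 4 transactions and it is dropped. A candidate may turn out not to be frequent in D, but no itemset that is frequent in D is ever missed.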

 
