org.knime.base.node.mine.subgroupminer.apriori
Class ArrayApriori

java.lang.Object
  extended by org.knime.base.node.mine.subgroupminer.apriori.ArrayApriori
All Implemented Interfaces:
AprioriAlgorithm

public class ArrayApriori
extends Object
implements AprioriAlgorithm

ArrayApriori uses the ArrayPrefixTreeNode data structure to find frequent itemsets, from which it constructs a prefix tree. In a prefix tree, all children of a node share the path from the root to that node; this path is their common prefix. The transactions are processed once per level by going to the node corresponding to the first item in the transaction and processing the rest of the transaction for that node. Thus, there is no candidate generation.
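
The prefix sharing described above can be illustrated with a minimal sketch. It does not reproduce ArrayPrefixTreeNode; the class PrefixTreeSketch, its Node type, and the put/get methods are hypothetical names chosen for illustration, and itemsets are assumed to be non-empty arrays of ascending item indices.

```java
/** Hypothetical array-backed prefix tree; not KNIME's ArrayPrefixTreeNode. */
final class PrefixTreeSketch {
    /** Slot i of a node describes the itemset "path to this node" plus item i. */
    static final class Node {
        final int[] counts;
        final Node[] children;

        Node(int nrItems) {
            counts = new int[nrItems];
            children = new Node[nrItems];
        }
    }

    private final int nrItems;
    private final Node root;

    PrefixTreeSketch(int nrItems) {
        this.nrItems = nrItems;
        this.root = new Node(nrItems);
    }

    /** Stores the support of a non-empty itemset given as ascending item indices. */
    void put(int support, int... itemset) {
        Node node = root;
        // Walk (and create, if needed) the path for the prefix of the itemset.
        for (int i = 0; i < itemset.length - 1; i++) {
            if (node.children[itemset[i]] == null) {
                node.children[itemset[i]] = new Node(nrItems);
            }
            node = node.children[itemset[i]];
        }
        node.counts[itemset[itemset.length - 1]] = support;
    }

    /** Returns the stored support, or 0 if the prefix path is not in the tree. */
    int get(int... itemset) {
        Node node = root;
        for (int i = 0; i < itemset.length - 1; i++) {
            node = node.children[itemset[i]];
            if (node == null) {
                return 0;
            }
        }
        return node.counts[itemset[itemset.length - 1]];
    }
}
```

In this sketch the itemsets {0, 2} and {0, 2, 3} share the path through item 0, so the prefix is stored only once.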

Author:
Fabian Dill, University of Konstanz

Constructor Summary
ArrayApriori(int bitSetLength, int dbsize)
          Creates an ArrayApriori instance with the given bitset length, which corresponds to the number of items.
 
Method Summary
 void findFrequentItems(List<BitVectorValue> transactions)
          First identifies those items that are frequent by themselves.
 void findFrequentItemSets(List<BitVectorValue> transactions, double minSupport, int maxDepth, FrequentItemSet.Type type, ExecutionMonitor exec)
          Finds the frequent itemsets by going down the tree until the current build level is reached, where it counts those items that are present in the transaction.
 List<AssociationRule> getAssociationRules(double confidence)
          Returns the association rules generated from the found frequent itemsets with the passed minimal confidence.
 List<FrequentItemSet> getFrequentItemSets(FrequentItemSet.Type type)
          Returns the found frequent itemsets according to their type, which can either be FREE, CLOSED or MAXIMAL.
 void setMinSupport(double minSupport)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

ArrayApriori

public ArrayApriori(int bitSetLength,
                    int dbsize)
Creates an ArrayApriori instance with the given bitset length, which corresponds to the number of items.

Parameters:
bitSetLength - the number of items
dbsize - the number of transactions

Method Detail

setMinSupport

public void setMinSupport(double minSupport)
Parameters:
minSupport - the minimum support

findFrequentItems

public void findFrequentItems(List<BitVectorValue> transactions)
First identifies those items that are frequent by themselves. It then creates a mapping from the positions of all items in a transaction to the array positions of the frequent items only. Thus, the algorithm works only with the array of frequent items, which is usually much shorter.

Parameters:
transactions - the database as bitsets
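
The mapping step can be sketched as follows. This is an illustration only, not the code of this class: java.util.BitSet stands in for BitVectorValue, the minimum support is assumed to be an absolute count, and FrequentItemMappingSketch and buildMapping are hypothetical names.

```java
import java.util.BitSet;
import java.util.List;

final class FrequentItemMappingSketch {
    /**
     * Returns an array m where m[item] is the position of the item among the
     * frequent items, or -1 if the item is infrequent.
     * Assumption: minSupport is an absolute transaction count.
     */
    static int[] buildMapping(List<BitSet> transactions, int nrItems, int minSupport) {
        // First pass: count how many transactions contain each item.
        int[] counts = new int[nrItems];
        for (BitSet t : transactions) {
            for (int i = t.nextSetBit(0); i >= 0 && i < nrItems; i = t.nextSetBit(i + 1)) {
                counts[i]++;
            }
        }
        // Assign consecutive positions to the frequent items only.
        int[] mapping = new int[nrItems];
        int next = 0;
        for (int i = 0; i < nrItems; i++) {
            mapping[i] = counts[i] >= minSupport ? next++ : -1;
        }
        return mapping;
    }
}
```

With such a mapping, every later pass can index arrays by the compact position instead of the original item index.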

findFrequentItemSets

public void findFrequentItemSets(List<BitVectorValue> transactions,
                                 double minSupport,
                                 int maxDepth,
                                 FrequentItemSet.Type type,
                                 ExecutionMonitor exec)
                          throws CanceledExecutionException
Finds the frequent itemsets by going down the tree until the current build level is reached, where it counts those items that are present in the transaction. This implies that it can count only those items for which a path is present in the tree, that is, items with frequent predecessors. When the counting is finished, new children are created for those itemsets that might become frequent in the next level, that is, itemsets with one more item. This is the method to start with when mining for frequent itemsets.

Specified by:
findFrequentItemSets in interface AprioriAlgorithm
Parameters:
transactions - a list of bit vectors, one per transaction, corresponding to the whole database
minSupport - the minimum support as an absolute value
maxDepth - the maximal length of an itemset
type - the desired type of the frequent itemsets
exec - the execution monitor
Throws:
CanceledExecutionException - if the execution was cancelled
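
The level-wise procedure described for this method might look roughly like the sketch below. It is a simplified stand-in under stated assumptions, not the actual implementation: transactions are plain int arrays of ascending item indices, minSupport is an absolute count, there is no itemset type handling or ExecutionMonitor, and all names (LevelWiseCountingSketch, Node, count, extend, mine, support) are hypothetical.

```java
import java.util.List;

final class LevelWiseCountingSketch {
    static final class Node {
        final int[] counts;
        final Node[] children;

        Node(int n) {
            counts = new int[n];
            children = new Node[n];
        }
    }

    /** Descends along existing paths to the build level and counts the remaining items there. */
    static void count(Node node, int[] t, int from, int level, int buildLevel) {
        if (level == buildLevel) {
            for (int i = from; i < t.length; i++) {
                node.counts[t[i]]++;
            }
            return;
        }
        for (int i = from; i < t.length; i++) {
            Node child = node.children[t[i]];
            if (child != null) { // only paths with frequent predecessors exist
                count(child, t, i + 1, level + 1, buildLevel);
            }
        }
    }

    /** Creates a child below every frequent slot at the build level; returns true if any was created. */
    static boolean extend(Node node, int level, int buildLevel, int minSupport) {
        boolean added = false;
        for (int i = 0; i < node.counts.length; i++) {
            if (level == buildLevel) {
                if (node.counts[i] >= minSupport) {
                    node.children[i] = new Node(node.counts.length);
                    added = true;
                }
            } else if (node.children[i] != null) {
                added |= extend(node.children[i], level + 1, buildLevel, minSupport);
            }
        }
        return added;
    }

    /** One pass over all transactions per level, up to itemsets of length maxDepth. */
    static Node mine(List<int[]> transactions, int nrItems, int minSupport, int maxDepth) {
        Node root = new Node(nrItems);
        for (int level = 0; level < maxDepth; level++) {
            for (int[] t : transactions) {
                count(root, t, 0, 0, level);
            }
            if (level + 1 == maxDepth || !extend(root, 0, level, minSupport)) {
                break;
            }
        }
        return root;
    }

    /** Looks up the counted support of a non-empty, ascending itemset (0 if its path is missing). */
    static int support(Node root, int... itemset) {
        Node node = root;
        for (int i = 0; i < itemset.length - 1; i++) {
            node = node.children[itemset[i]];
            if (node == null) {
                return 0;
            }
        }
        return node.counts[itemset[itemset.length - 1]];
    }
}
```

Because counting only follows paths that already exist, no explicit candidate list is built; the tree itself decides which itemsets can be counted at the next level.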

getAssociationRules

public List<AssociationRule> getAssociationRules(double confidence)
Returns the association rules generated from the found frequent itemsets with the passed minimal confidence.

Specified by:
getAssociationRules in interface AprioriAlgorithm
Parameters:
confidence - the desired minimal confidence of the rules
Returns:
a list of association rules with the minimum confidence
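
As an illustration of the confidence threshold, the sketch below uses the standard definition confidence(A -> b) = support(A with b) / support(A). How this class derives its rules internally is not stated here; the restriction to single-item consequents, the Rule holder, and the name RuleConfidenceSketch are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class RuleConfidenceSketch {
    /** Hypothetical rule holder: antecedent -> consequent with its confidence. */
    static final class Rule {
        final Set<Integer> antecedent;
        final int consequent;
        final double confidence;

        Rule(Set<Integer> antecedent, int consequent, double confidence) {
            this.antecedent = antecedent;
            this.consequent = consequent;
            this.confidence = confidence;
        }
    }

    /** Builds single-consequent rules from itemset supports (assumed to be absolute counts). */
    static List<Rule> rules(Map<Set<Integer>, Integer> supports, double minConfidence) {
        List<Rule> result = new ArrayList<>();
        for (Map.Entry<Set<Integer>, Integer> e : supports.entrySet()) {
            Set<Integer> itemset = e.getKey();
            if (itemset.size() < 2) {
                continue; // a rule needs an antecedent and a consequent
            }
            for (Integer item : itemset) {
                Set<Integer> antecedent = new HashSet<>(itemset);
                antecedent.remove(item);
                Integer antecedentSupport = supports.get(antecedent);
                if (antecedentSupport == null || antecedentSupport == 0) {
                    continue; // support of the antecedent is unknown
                }
                double confidence = e.getValue() / (double) antecedentSupport;
                if (confidence >= minConfidence) {
                    result.add(new Rule(antecedent, item, confidence));
                }
            }
        }
        return result;
    }
}
```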

getFrequentItemSets

public List<FrequentItemSet> getFrequentItemSets(FrequentItemSet.Type type)
Returns the found frequent itemsets according to their type, which can either be FREE, CLOSED or MAXIMAL.

Specified by:
getFrequentItemSets in interface AprioriAlgorithm
Parameters:
type - the desired type, either free, closed or maximal
Returns:
a list of the found frequent itemsets of the requested type
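
The difference between the itemset types can be sketched with the usual definitions: an itemset is closed if no proper superset has the same support, and maximal if no proper superset is frequent. The sketch assumes that FREE means no such filter is applied, which this page does not state explicitly; ItemSetTypeSketch and its methods are hypothetical names and do not use FrequentItemSet.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class ItemSetTypeSketch {
    static boolean isProperSuperset(Set<Integer> big, Set<Integer> small) {
        return big.size() > small.size() && big.containsAll(small);
    }

    /** Closed: no proper superset among the frequent itemsets has the same support. */
    static List<Set<Integer>> closed(Map<Set<Integer>, Integer> frequent) {
        List<Set<Integer>> result = new ArrayList<>();
        for (Map.Entry<Set<Integer>, Integer> e : frequent.entrySet()) {
            boolean isClosed = true;
            for (Map.Entry<Set<Integer>, Integer> other : frequent.entrySet()) {
                if (isProperSuperset(other.getKey(), e.getKey())
                        && other.getValue().equals(e.getValue())) {
                    isClosed = false;
                    break;
                }
            }
            if (isClosed) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    /** Maximal: no proper superset is among the frequent itemsets. */
    static List<Set<Integer>> maximal(Map<Set<Integer>, Integer> frequent) {
        List<Set<Integer>> result = new ArrayList<>();
        for (Set<Integer> candidate : frequent.keySet()) {
            boolean isMaximal = true;
            for (Set<Integer> other : frequent.keySet()) {
                if (isProperSuperset(other, candidate)) {
                    isMaximal = false;
                    break;
                }
            }
            if (isMaximal) {
                result.add(candidate);
            }
        }
        return result;
    }
}
```

Every maximal itemset is also closed, so under these definitions the maximal result is the smallest of the three lists and the unfiltered one the largest.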


Copyright, 2003 - 2010. All rights reserved.
University of Konstanz, Germany.
Chair for Bioinformatics and Information Mining, Prof. Dr. Michael R. Berthold.
You may not modify, publish, transmit, transfer or sell, reproduce, create derivative works from, distribute, perform, display, or in any way exploit any of the content, in whole or in part, except as otherwise expressly permitted in writing by the copyright owner or as specified in the license file distributed with this product.