org.knime.base.node.mine.decisiontree2.model
Class DecisionTreeNode

java.lang.Object
  extended by org.knime.base.node.mine.decisiontree2.model.DecisionTreeNode
All Implemented Interfaces:
Serializable, TreeNode
Direct Known Subclasses:
DecisionTreeNodeLeaf, DecisionTreeNodeSplit

public abstract class DecisionTreeNode
extends Object
implements TreeNode, Serializable

The abstract base implementation of a decision tree node. Separate implementations exist for leaf nodes and for split nodes (the split node class is itself abstract).

Author:
Michael Berthold, University of Konstanz, Christoph Sieb, University of Konstanz
See Also:
Serialized Form

Constructor Summary
(package private) DecisionTreeNode()
          Empty Constructor visible only within package.
protected DecisionTreeNode(int nodeId, DataCell majorityClass, LinkedHashMap<DataCell,Double> classCounts)
          Constructor of base class.
protected DecisionTreeNode(Node xmlNode, DataCellStringMapper mapper)
          Constructor of base class.
 
Method Summary
protected  void addColorToMap(Color col, double weight)
          Adds the given color to the color map.
abstract  void addCoveredColor(DataRow row, DataTableSpec spec, double weight)
          Add colors for a row of values if they fall within a specific node/branch.
abstract  void addCoveredPattern(DataRow row, DataTableSpec spec, double weight)
          Add patterns given as a row of values if they fall within a specific node.
abstract  boolean addNodeToTreeDepthFirst(DecisionTreeNode node, int ix)
          Add a new node to the tree structure based on a depth-first indexing strategy.
abstract  Enumeration<DecisionTreeNode> children()
           
 DataCell classifyPattern(DataRow row, DataTableSpec spec)
          Classify a new pattern given as a row of values.
 HashMap<Color,Double> coveredColors()
           
abstract  Set<RowKey> coveredPattern()
           
static DecisionTreeNode createNewNode(Node xmlNode, DataCellStringMapper mapper)
          Create new node from XML-information.
static DecisionTreeNode createNodeFromPredictorParams(ModelContentRO predParams, DecisionTreeNode parent)
          Creates a new DecisionTreeNode (and all its children) based on a model content object.
abstract  boolean getAllowsChildren()
           
abstract  TreeNode getChildAt(int pos)
           
abstract  int getChildCount()
           
 LinkedHashMap<DataCell,Double> getClassCounts()
          Return class counts, that is how many patterns (also fractions of) for each class were encountered in this branch during training.
abstract  LinkedHashMap<DataCell,Double> getClassCounts(DataRow row, DataTableSpec spec)
          Determine class counts for a new pattern given as a row of values.
abstract  int getCountOfSubtree()
          Returns the count of the subtree.
 Object getCustomData()
          To get the custom data object.
 double getEntireClassCount()
          Return number of patterns of all classes.
abstract  int getIndex(TreeNode node)
          Returns the index of node in the receiver's children.
 DataCell getMajorityClass()
          Return majority class of this node.
(package private)  double getOverallColorCount()
           
 double getOwnClassCount()
          Return number of patterns of correct class (= majority class in a non-risk decision tree).
 int getOwnIndex()
           
 TreeNode getParent()
           
 String getPrefix()
          Returns the prefix of this node representing the condition.
abstract  String getStringSummary()
           
static DataCell getWinner(LinkedHashMap<DataCell,Double> classCounts)
          Find the winning data cell.
abstract  boolean isLeaf()
           
 void loadFromPredictorParams(ModelContentRO predParams)
          Load node from a model content object.
abstract  void loadNodeInternalsFromPredParams(ModelContentRO pConf)
          Load internal node settings from a model content object.
(package private)  boolean newColors()
           
 void resetColorInformation()
          Clean all color information in this node and all children.
abstract  void saveNodeInternalsToPredParams(ModelContentWO pConf, boolean saveKeysAndPatterns)
          Save internal node settings to a model content object.
 void saveToPredictorParams(ModelContentWO predParams, boolean saveColorsAndKeys)
          Save node to a model content object.
 void setCoveredColors(HashMap<Color,Double> coveredColors)
          Sets the covered colors distribution directly.
 void setCustomData(Object customData)
          To set a custom data object.
 void setParent(DecisionTreeNode parent)
          Set parent of this node.
 void setPrefix(String pf)
          Set information about this node, e.g. the condition this branch needs to fulfill.
 String toString()
          
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

DecisionTreeNode

DecisionTreeNode()
Empty Constructor visible only within package.


DecisionTreeNode

protected DecisionTreeNode(Node xmlNode,
                           DataCellStringMapper mapper)
Constructor of base class. Read all type-invariant information from XML file.

Parameters:
xmlNode - XML node object
mapper - map translating column names to DataCells and vice versa

DecisionTreeNode

protected DecisionTreeNode(int nodeId,
                           DataCell majorityClass,
                           LinkedHashMap<DataCell,Double> classCounts)
Constructor of base class. The necessary data is provided directly in the constructor.

Parameters:
nodeId - the id of this node
majorityClass - the majority class of the records in this node
classCounts - the class distribution of the data in this node
Method Detail

createNewNode

public static DecisionTreeNode createNewNode(Node xmlNode,
                                             DataCellStringMapper mapper)
Create new node from XML information. Note that this method only constructs the node itself and does not generate any other nodes connected to it; it solely reads its children's indices from the XML file. This function serves as a factory for the subclasses, which are distinguished here: their constructors are called and read their individual information themselves. Most of the generic node information is handled here, however.

Parameters:
xmlNode - XML Information for this node
mapper - map translating column names to DataCells and vice versa
Returns:
new node initialized from XML node or null if type is not recognized
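
The factory behaviour described above (dispatch on a node type, null for an unrecognized type) can be sketched without KNIME on the classpath. Everything in this sketch is hypothetical: the type strings "leaf" and "split", the class names, and the plain String used in place of an XML node are stand-ins, not the actual KNIME format or API.

```java
/** Hypothetical stand-in for the abstract base class. */
abstract class SketchNode {
    abstract boolean isLeaf();
}

/** Hypothetical stand-in for the leaf subclass. */
final class SketchLeaf extends SketchNode {
    @Override
    boolean isLeaf() {
        return true;
    }
}

/** Hypothetical stand-in for a split subclass. */
final class SketchSplit extends SketchNode {
    @Override
    boolean isLeaf() {
        return false;
    }
}

final class NodeFactorySketch {

    /** Returns a new node for the given type string, or null if it is not recognized. */
    static SketchNode createNewNode(String type) {
        if ("leaf".equals(type)) {
            return new SketchLeaf();
        }
        if ("split".equals(type)) {
            return new SketchSplit();
        }
        return null; // unknown type: the caller has to check for null
    }

    public static void main(String[] args) {
        SketchNode node = createNewNode("leaf");
        System.out.println(node != null && node.isLeaf());
        System.out.println(createNewNode("unknown") == null);
    }
}
```

Because the factory can return null, callers should check the result before using it.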

addNodeToTreeDepthFirst

public abstract boolean addNodeToTreeDepthFirst(DecisionTreeNode node,
                                                int ix)
Add a new node to the tree structure based on a depth-first indexing strategy.

Parameters:
node - node to be inserted
ix - index of this node in depth first traversal order
Returns:
true only if the node was successfully inserted

getMajorityClass

public DataCell getMajorityClass()
Return majority class of this node.

Returns:
majority class

getClassCounts

public LinkedHashMap<DataCell,Double> getClassCounts()
Return class counts, that is how many patterns (also fractions of) for each class were encountered in this branch during training.

Returns:
class counts

classifyPattern

public final DataCell classifyPattern(DataRow row,
                                      DataTableSpec spec)
                               throws Exception
Classify a new pattern given as a row of values. Returns the class with the maximum count.

Parameters:
row - input pattern
spec - the corresponding table spec
Returns:
class of pattern the decision tree predicts
Throws:
Exception - if something went wrong (unknown attribute, for example)

getWinner

public static final DataCell getWinner(LinkedHashMap<DataCell,Double> classCounts)
Find the winning data cell. Returns the class with the maximum count.

Parameters:
classCounts - HashMap with the class counts
Returns:
class of pattern the decision tree predicts
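
As an illustration of "the class with the maximum count", here is a minimal stand-alone sketch. It is not the KNIME implementation: String stands in for DataCell, and both the tie-breaking rule (first entry wins) and the null result for an empty map are assumptions of this sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Stand-alone sketch; String stands in for KNIME's DataCell. */
final class WinnerSketch {

    /** Returns the key with the largest count, or null for an empty map. */
    static String winner(LinkedHashMap<String, Double> classCounts) {
        String best = null;
        double bestCount = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, Double> e : classCounts.entrySet()) {
            // Strict '>' keeps the first entry on ties (an assumption of this sketch).
            if (e.getValue() > bestCount) {
                bestCount = e.getValue();
                best = e.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        LinkedHashMap<String, Double> counts = new LinkedHashMap<>();
        counts.put("setosa", 2.5);      // fractional counts are allowed
        counts.put("versicolor", 7.0);
        counts.put("virginica", 0.5);
        System.out.println(winner(counts)); // prints "versicolor"
    }
}
```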

getClassCounts

public abstract LinkedHashMap<DataCell,Double> getClassCounts(DataRow row,
                                                              DataTableSpec spec)
                                                       throws Exception
Determine class counts for a new pattern given as a row of values. Returns a HashMap listing counts for all classes.

Parameters:
row - input pattern
spec - the corresponding table spec
Returns:
HashMap class/count
Throws:
Exception - if something went wrong (unknown attribute, for example)
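
The relationship between the final classifyPattern and the abstract getClassCounts(row, spec) resembles a template method: the concrete node reports class counts for a row, and the classification step picks the class with the maximum count. The following is a minimal sketch under assumptions: a double[] stands in for DataRow and DataTableSpec, String for DataCell, and the leaf and threshold-split classes are hypothetical, not KNIME's implementations.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

abstract class PatternNode {
    /** Subclasses decide which class counts apply to the given row. */
    abstract LinkedHashMap<String, Double> classCounts(double[] row);

    /** Final: always the class with the maximum count. Assumes a non-empty map. */
    final String classify(double[] row) {
        return Collections.max(classCounts(row).entrySet(),
                Map.Entry.comparingByValue()).getKey();
    }
}

/** Hypothetical leaf: reports the counts stored during training. */
final class PatternLeaf extends PatternNode {
    private final LinkedHashMap<String, Double> counts;

    PatternLeaf(LinkedHashMap<String, Double> counts) {
        this.counts = counts;
    }

    @Override
    LinkedHashMap<String, Double> classCounts(double[] row) {
        return counts;
    }
}

/** Hypothetical numeric split: routes the row to one of two children. */
final class PatternSplit extends PatternNode {
    private final int column;
    private final double threshold;
    private final PatternNode lower;
    private final PatternNode upper;

    PatternSplit(int column, double threshold, PatternNode lower, PatternNode upper) {
        this.column = column;
        this.threshold = threshold;
        this.lower = lower;
        this.upper = upper;
    }

    @Override
    LinkedHashMap<String, Double> classCounts(double[] row) {
        return (row[column] <= threshold ? lower : upper).classCounts(row);
    }
}

final class ClassifySketch {
    static LinkedHashMap<String, Double> counts(double yes, double no) {
        LinkedHashMap<String, Double> m = new LinkedHashMap<>();
        m.put("yes", yes);
        m.put("no", no);
        return m;
    }

    /** Splits on column 0 at 2.0; the lower leaf favours "yes", the upper leaf "no". */
    static PatternNode exampleTree() {
        return new PatternSplit(0, 2.0,
                new PatternLeaf(counts(8.0, 2.0)),
                new PatternLeaf(counts(1.0, 4.0)));
    }

    public static void main(String[] args) {
        PatternNode tree = exampleTree();
        System.out.println(tree.classify(new double[] {1.0})); // prints "yes"
        System.out.println(tree.classify(new double[] {5.0})); // prints "no"
    }
}
```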

getOwnClassCount

public double getOwnClassCount()
Return number of patterns of correct class (= majority class in a non-risk decision tree). Note that there can be fractions of patterns, for example when we have built a fuzzy decision tree or there were missing values.

Returns:
number (and fractions) of patterns of class of this node

getEntireClassCount

public double getEntireClassCount()
Return number of patterns of all classes. Note that there can be fractions of patterns, for example when we have built a fuzzy decision tree or there were missing values.

Returns:
number (and fractions) of patterns of all classes
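
To make the difference between the two counts concrete, the sketch below derives both from a class-count map. It assumes that the entire count is the sum over all classes and that the own count is the entry of the majority class; String stands in for DataCell, and the method names are hypothetical.

```java
import java.util.LinkedHashMap;

final class ClassCountSketch {

    /** Sum of the (possibly fractional) counts of all classes. */
    static double entireCount(LinkedHashMap<String, Double> counts) {
        double sum = 0.0;
        for (double c : counts.values()) {
            sum += c;
        }
        return sum;
    }

    /** Count of the node's own class; 0.0 if that class never occurred. */
    static double ownCount(LinkedHashMap<String, Double> counts, String majorityClass) {
        Double c = counts.get(majorityClass);
        return c == null ? 0.0 : c;
    }

    static LinkedHashMap<String, Double> exampleCounts() {
        LinkedHashMap<String, Double> counts = new LinkedHashMap<>();
        counts.put("yes", 6.5); // a fractional pattern, e.g. from a missing value
        counts.put("no", 2.0);
        counts.put("maybe", 0.5);
        return counts;
    }

    public static void main(String[] args) {
        LinkedHashMap<String, Double> counts = exampleCounts();
        System.out.println(ownCount(counts, "yes")); // prints 6.5
        System.out.println(entireCount(counts));     // prints 9.0
    }
}
```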

addCoveredPattern

public abstract void addCoveredPattern(DataRow row,
                                       DataTableSpec spec,
                                       double weight)
                                throws Exception
Add patterns given as a row of values if they fall within a specific node. Usually only leaves actually hold a list of RowKeys; all intermediate nodes collect "their" information recursively.

Parameters:
row - input pattern
spec - the corresponding table spec
weight - the weight of the row (between 0.0 and 1.0)
Throws:
Exception - if something went wrong (unknown attribute for example)

addCoveredColor

public abstract void addCoveredColor(DataRow row,
                                     DataTableSpec spec,
                                     double weight)
                              throws Exception
Add colors for a row of values if they fall within a specific node/branch. Used when the pattern itself is not (or can no longer be) stored but the color pie chart should still be correct.

Parameters:
row - input pattern
spec - the corresponding table spec
weight - the weight of the row (between 0.0 and 1.0)
Throws:
Exception - if something went wrong (unknown attribute for example)

coveredPattern

public abstract Set<RowKey> coveredPattern()
Returns:
set of row keys that are covered by all nodes of this branch

coveredColors

public final HashMap<Color,Double> coveredColors()
Returns:
list of colors and coverage counts covered by this node

resetColorInformation

public void resetColorInformation()
Clean all color information in this node and all children.


newColors

boolean newColors()
Returns:
true if the colors of this node were overwritten.

getOwnIndex

public int getOwnIndex()
Returns:
index of this node itself

setParent

public void setParent(DecisionTreeNode parent)
Set parent of this node.

Parameters:
parent - new parent

getOverallColorCount

double getOverallColorCount()
Returns:
overall accumulated count of all color weights in this node

toString

public final String toString()

Overrides:
toString in class Object

setPrefix

public void setPrefix(String pf)
Set information about this node, e.g. condition this branch needs to fulfill.

Parameters:
pf - string describing condition

getStringSummary

public abstract String getStringSummary()
Returns:
string summary of node content (split, leaf info...)

saveToPredictorParams

public void saveToPredictorParams(ModelContentWO predParams,
                                  boolean saveColorsAndKeys)
Save node to a model content object.

Parameters:
predParams - configuration object to attach decision tree to
saveColorsAndKeys - whether to save the colors and the row keys

saveNodeInternalsToPredParams

public abstract void saveNodeInternalsToPredParams(ModelContentWO pConf,
                                                   boolean saveKeysAndPatterns)
Save internal node settings to a model content object.

Parameters:
pConf - configuration object to attach decision tree to
saveKeysAndPatterns - whether to save the keys and patterns

loadFromPredictorParams

public void loadFromPredictorParams(ModelContentRO predParams)
                             throws InvalidSettingsException
Load node from a model content object.

Parameters:
predParams - configuration object to load decision tree from
Throws:
InvalidSettingsException - if something goes wrong

createNodeFromPredictorParams

public static DecisionTreeNode createNodeFromPredictorParams(ModelContentRO predParams,
                                                             DecisionTreeNode parent)
                                                      throws InvalidSettingsException
Creates a new DecisionTreeNode (and all its children) based on a model content object.

Parameters:
predParams - configuration object
parent - the parent node (or null if this is the root)
Returns:
newly created DecisionTreeNode
Throws:
InvalidSettingsException - if something goes wrong

loadNodeInternalsFromPredParams

public abstract void loadNodeInternalsFromPredParams(ModelContentRO pConf)
                                              throws InvalidSettingsException
Load internal node settings from a model content object.

Parameters:
pConf - configuration object to load decision tree from
Throws:
InvalidSettingsException - if something goes wrong

getChildCount

public abstract int getChildCount()
Specified by:
getChildCount in interface TreeNode
Returns:
count of children

getIndex

public abstract int getIndex(TreeNode node)
Returns the index of node in the receiver's children. If the receiver does not contain node, -1 is returned.

Specified by:
getIndex in interface TreeNode
Parameters:
node - that supposedly is a child of this one
Returns:
index of node (or -1 if not found)
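
The -1 convention matches the contract of javax.swing.tree.TreeNode, assuming that is the TreeNode interface meant here. The sketch below demonstrates the convention with the JDK's DefaultMutableTreeNode rather than with a DecisionTreeNode subclass.

```java
import java.util.Arrays;
import javax.swing.tree.DefaultMutableTreeNode;

final class GetIndexSketch {

    /** Returns the indices of two real children and of a node that is not a child. */
    static int[] demo() {
        DefaultMutableTreeNode root = new DefaultMutableTreeNode("root");
        DefaultMutableTreeNode first = new DefaultMutableTreeNode("first");
        DefaultMutableTreeNode second = new DefaultMutableTreeNode("second");
        DefaultMutableTreeNode stranger = new DefaultMutableTreeNode("stranger");
        root.add(first);
        root.add(second);
        return new int[] {
            root.getIndex(first),    // 0
            root.getIndex(second),   // 1
            root.getIndex(stranger)  // -1: not a child of root
        };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo())); // prints [0, 1, -1]
    }
}
```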

getChildAt

public abstract TreeNode getChildAt(int pos)
Specified by:
getChildAt in interface TreeNode
Parameters:
pos - position of child
Returns:
child node at index

getParent

public final TreeNode getParent()
Specified by:
getParent in interface TreeNode
Returns:
parent of node

getCountOfSubtree

public abstract int getCountOfSubtree()
Returns the count of the subtree.

Returns:
the count of the subtree
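
The description does not spell out what is counted. Assuming it means the number of nodes in the subtree rooted at this node (including the node itself), such a count can be computed recursively through the TreeNode methods getChildCount and getChildAt. The sketch uses javax.swing.tree classes from the JDK and a hypothetical helper name; it is not the KNIME implementation.

```java
import javax.swing.tree.DefaultMutableTreeNode;
import javax.swing.tree.TreeNode;

final class SubtreeCountSketch {

    /** Counts the given node plus all of its descendants. */
    static int countNodes(TreeNode node) {
        int count = 1; // the node itself
        for (int i = 0; i < node.getChildCount(); i++) {
            count += countNodes(node.getChildAt(i));
        }
        return count;
    }

    /** root -> (split -> (leafA, leafB), leafC) */
    static TreeNode exampleTree() {
        DefaultMutableTreeNode root = new DefaultMutableTreeNode("root");
        DefaultMutableTreeNode split = new DefaultMutableTreeNode("split");
        split.add(new DefaultMutableTreeNode("leafA"));
        split.add(new DefaultMutableTreeNode("leafB"));
        root.add(split);
        root.add(new DefaultMutableTreeNode("leafC"));
        return root;
    }

    public static void main(String[] args) {
        System.out.println(countNodes(exampleTree())); // prints 5
    }
}
```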

isLeaf

public abstract boolean isLeaf()
Specified by:
isLeaf in interface TreeNode
Returns:
true if node is a leaf

children

public abstract Enumeration<DecisionTreeNode> children()
Specified by:
children in interface TreeNode
Returns:
enumeration of all children

getAllowsChildren

public abstract boolean getAllowsChildren()
Specified by:
getAllowsChildren in interface TreeNode
Returns:
true if the receiver allows children

getPrefix

public String getPrefix()
Returns the prefix of this node representing the condition.

Returns:
the prefix of this node representing the condition

addColorToMap

protected void addColorToMap(Color col,
                             double weight)
Adds the given color to the color map.

Parameters:
col - the color to add
weight - the weight for the color count
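
A color map of this kind can be maintained with Map.merge. The sketch below assumes that adding a color that is already present accumulates its weight, and that the overall color count is the sum of all weights; both are assumptions about the behaviour, and the helper names are hypothetical. The helpers are generic over the key type, with java.awt.Color used in the demo.

```java
import java.awt.Color;
import java.util.HashMap;
import java.util.Map;

final class ColorMapSketch {

    /** Adds weight to the entry for key, creating the entry if it is missing. */
    static <K> void addWeight(Map<K, Double> map, K key, double weight) {
        map.merge(key, weight, Double::sum);
    }

    /** Sum of all weights stored in the map. */
    static <K> double overallCount(Map<K, Double> map) {
        double sum = 0.0;
        for (double w : map.values()) {
            sum += w;
        }
        return sum;
    }

    public static void main(String[] args) {
        HashMap<Color, Double> coveredColors = new HashMap<>();
        addWeight(coveredColors, Color.RED, 0.5);
        addWeight(coveredColors, Color.RED, 0.25); // accumulates to 0.75
        addWeight(coveredColors, Color.BLUE, 1.0);
        System.out.println(coveredColors.get(Color.RED)); // prints 0.75
        System.out.println(overallCount(coveredColors));  // prints 1.75
    }
}
```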

getCustomData

public Object getCustomData()
To get the custom data object.

Returns:
the custom data object

setCustomData

public void setCustomData(Object customData)
To set a custom data object.

Parameters:
customData - the custom data object to set

setCoveredColors

public void setCoveredColors(HashMap<Color,Double> coveredColors)
Sets the covered colors distribution directly.

Parameters:
coveredColors - the color distribution to set


Copyright, 2003 - 2010. All rights reserved.
University of Konstanz, Germany.
Chair for Bioinformatics and Information Mining, Prof. Dr. Michael R. Berthold.
You may not modify, publish, transmit, transfer or sell, reproduce, create derivative works from, distribute, perform, display, or in any way exploit any of the content, in whole or in part, except as otherwise expressly permitted in writing by the copyright owner or as specified in the license file distributed with this product.