org.knime.base.node.mine.decisiontree2.learner
Class Split

java.lang.Object
  extended by org.knime.base.node.mine.decisiontree2.learner.Split
Direct Known Subclasses:
SplitContinuous, SplitNominal

public abstract class Split
extends Object

Calculates the best split for a given attribute list and the original class distribution.

Author:
Christoph Sieb, University of Konstanz

Field Summary
protected  SplitQualityMeasure m_splitQualityMeasure
          The quality measure to be used for the best split point calculation.
 
Constructor Summary
Split(InMemoryTable table, int attributeIndex, SplitQualityMeasure splitQualityMeasure)
          Constructs the best split for the given attribute list and the class distribution.
 
Method Summary
abstract  boolean canBeFurtherUsed()
          Returns true if it makes sense to use this split's attribute further in deeper levels, false if not.
 int getAttributeIndex()
          Returns the index of the attribute this split object is responsible for.
 double getBestQualityMeasure()
          Returns the quality index of this split.
abstract  int getNumberPartitions()
          Returns the number of partitions resulting from this split.
abstract  int getPartitionForRow(DataRowWeighted row)
          Returns the partition the given row belongs to according to this split.
abstract  double[] getPartitionWeights()
          Returns the partition weights.
 String getQualityMeasureName()
          Returns the name of the quality measure used by this split.
 String getSplitAttributeName()
          Returns the name of this split's attribute.
 InMemoryTable getTable()
          Returns the InMemoryTable.
 boolean isValidSplit()
          Whether this split is a valid split.
protected  void setBestQualityMeasure(double bestGini)
          Sets the quality index once it has been calculated by the concrete implementations.
 String toString()
          
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

m_splitQualityMeasure

protected final SplitQualityMeasure m_splitQualityMeasure
The quality measure to be used for the best split point calculation.

Constructor Detail

Split

public Split(InMemoryTable table,
             int attributeIndex,
             SplitQualityMeasure splitQualityMeasure)
Constructs the best split for the given attribute list and the class distribution. The results can be retrieved from getter methods.

Parameters:
table - the table for which to create the split
attributeIndex - the index specifying the attribute for which to calculate the split
splitQualityMeasure - the quality measure to determine the best split (e.g. gini or gain ratio)
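Example (illustrative sketch):
The API of SplitQualityMeasure is not shown on this page, so the following is only a sketch of what a gini-based quality measure typically computes: the reduction in gini impurity achieved by dividing a class distribution into partitions. The class GiniSketch and its methods are hypothetical and are not part of KNIME; the actual implementation may differ.

```java
// Hypothetical helper for illustration only; not part of the KNIME API.
final class GiniSketch {
    private GiniSketch() { }

    /** Sums the (weighted) counts of a class distribution. */
    static double sum(double[] weights) {
        double total = 0.0;
        for (double w : weights) {
            total += w;
        }
        return total;
    }

    /** Gini impurity 1 - sum(p_i^2) of a weighted class distribution. */
    static double impurity(double[] classWeights) {
        double total = sum(classWeights);
        if (total <= 0.0) {
            return 0.0; // an empty distribution is treated as pure
        }
        double sumOfSquares = 0.0;
        for (double w : classWeights) {
            double p = w / total;
            sumOfSquares += p * p;
        }
        return 1.0 - sumOfSquares;
    }

    /**
     * Impurity reduction obtained by dividing the parent distribution
     * into the given partitions (one class distribution per partition).
     */
    static double gain(double[] parent, double[][] partitions) {
        double parentTotal = sum(parent);
        if (parentTotal <= 0.0) {
            return 0.0;
        }
        double weightedChildImpurity = 0.0;
        for (double[] partition : partitions) {
            weightedChildImpurity +=
                sum(partition) / parentTotal * impurity(partition);
        }
        return impurity(parent) - weightedChildImpurity;
    }
}
```

Under these assumptions, a split that separates two equally frequent classes perfectly yields the largest possible gain for that distribution, while a split that leaves the class proportions unchanged in every partition yields a gain of zero.
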
Method Detail

getTable

public InMemoryTable getTable()
Returns the InMemoryTable.

Returns:
the InMemoryTable

getBestQualityMeasure

public double getBestQualityMeasure()
Returns the quality index of this split.

Returns:
the quality index of this split

setBestQualityMeasure

protected void setBestQualityMeasure(double bestGini)
Sets the quality index once it has been calculated by the concrete implementations.

Parameters:
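bestGini - the quality index to set (a gini index or, depending on the quality measure in use, another index such as the gain ratio)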

getNumberPartitions

public abstract int getNumberPartitions()
Returns the number of partitions resulting from this split.

Returns:
the number of partitions resulting from this split

getSplitAttributeName

public String getSplitAttributeName()
Returns the name of this split's attribute.

Returns:
the name of this split's attribute

getQualityMeasureName

public String getQualityMeasureName()
Returns the name of the quality measure used by this split.

Returns:
the name of the quality measure used by this split

getAttributeIndex

public int getAttributeIndex()
Returns the index of the attribute this split object is responsible for.

Returns:
the index of the attribute this split object is responsible for

isValidSplit

public boolean isValidSplit()
Returns whether this split is valid, i.e. whether a valid quality measure exists for it.

Returns:
whether this split is a valid split

canBeFurtherUsed

public abstract boolean canBeFurtherUsed()
Returns true if it makes sense to use this split's attribute further in deeper levels, false if not.

Returns:
true if it makes sense to use this split's attribute further in deeper levels, false if not

getPartitionForRow

public abstract int getPartitionForRow(DataRowWeighted row)
Returns the partition the given row belongs to according to this split. If the value of the split attribute is missing (i.e. NaN), -1 is returned.

Parameters:
row - the row for which to get the partition index
Returns:
the partition the given row belongs to according to this split, or -1 if the value of the split attribute is missing (i.e. NaN)
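
Example (illustrative sketch):
The contract above (a partition index, or -1 for a missing value) can be pictured for a binary threshold split as follows. This is only a sketch: the class ThresholdPartitionSketch, the plain double argument used in place of DataRowWeighted, and the "less than or equal" comparison are assumptions, and the actual SplitContinuous implementation may behave differently.

```java
// Hypothetical helper for illustration only; not part of the KNIME API.
final class ThresholdPartitionSketch {
    private ThresholdPartitionSketch() { }

    /**
     * Returns 0 for values at or below the threshold, 1 for values above it,
     * and -1 if the value is missing (represented as NaN).
     */
    static int partitionFor(double value, double threshold) {
        if (Double.isNaN(value)) {
            return -1; // missing value: the row belongs to no single partition
        }
        return value <= threshold ? 0 : 1;
    }
}
```

A caller would check for -1 before using the result as an array index.
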

getPartitionWeights

public abstract double[] getPartitionWeights()
Returns the partition weights. The weights represent the relative frequency of valid rows per partition. They are normally used to adapt the weight of rows whose split value is missing: such a row is assigned to each partition with the adapted weight.

Returns:
the partition weights
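
Example (illustrative sketch):
One plausible reading of the description above is that a row with a missing split value is sent to every partition, with its weight scaled by that partition's relative frequency. The sketch below works on plain arrays; the class PartitionWeightSketch and its methods are hypothetical, not part of KNIME, and the multiplication used to adapt the weight is an assumption.

```java
// Hypothetical helper for illustration only; not part of the KNIME API.
final class PartitionWeightSketch {
    private PartitionWeightSketch() { }

    /** Converts the summed weights of valid rows per partition into relative frequencies. */
    static double[] relativeFrequencies(double[] validWeightPerPartition) {
        double total = 0.0;
        for (double w : validWeightPerPartition) {
            total += w;
        }
        if (total <= 0.0) {
            throw new IllegalArgumentException("no valid rows in any partition");
        }
        double[] frequencies = new double[validWeightPerPartition.length];
        for (int i = 0; i < frequencies.length; i++) {
            frequencies[i] = validWeightPerPartition[i] / total;
        }
        return frequencies;
    }

    /**
     * Distributes a row with a missing split value over all partitions:
     * entry i is the weight the row carries in partition i.
     */
    static double[] distribute(double rowWeight, double[] partitionWeights) {
        double[] adapted = new double[partitionWeights.length];
        for (int i = 0; i < adapted.length; i++) {
            adapted[i] = rowWeight * partitionWeights[i];
        }
        return adapted;
    }
}
```

Because the relative frequencies sum to 1, the adapted weights sum to the original row weight, so the total weight of the data is preserved across the split.
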

toString

public String toString()

Overrides:
toString in class Object


Copyright, 2003 - 2010. All rights reserved.
University of Konstanz, Germany.
Chair for Bioinformatics and Information Mining, Prof. Dr. Michael R. Berthold.
You may not modify, publish, transmit, transfer or sell, reproduce, create derivative works from, distribute, perform, display, or in any way exploit any of the content, in whole or in part, except as otherwise expressly permitted in writing by the copyright owner or as specified in the license file distributed with this product.