org.knime.base.node.mine.decisiontree2.learner
Class SplitContinuous

java.lang.Object
  extended by org.knime.base.node.mine.decisiontree2.learner.Split
      extended by org.knime.base.node.mine.decisiontree2.learner.SplitContinuous

public class SplitContinuous
extends Split

This class determines the best split for a numeric attribute.

Author:
Christoph Sieb, University of Konstanz

Field Summary
 
Fields inherited from class org.knime.base.node.mine.decisiontree2.learner.Split
m_splitQualityMeasure
 
Constructor Summary
SplitContinuous(InMemoryTable table, int attributeIndex, SplitQualityMeasure splitQualityMeasure, boolean averageSplitpoint, double minObjectsCount)
          Constructs the best split for the given numeric attribute list and the class distribution.
 
Method Summary
 boolean canBeFurtherUsed()
          For numeric splits it makes sense to use the corresponding attribute in deeper levels.
 double getBestSplitValue()
          Returns the split value which was evaluated as the best according to the induced partition purity.
 int getNumberPartitions()
          The number of partitions of a numeric split is always 2.
 int getPartitionForRow(DataRowWeighted row)
          Returns the partition the given row belongs to according to this split.
 double[] getPartitionWeights()
          Returns the partition weights.
 String toString()
          
 
Methods inherited from class org.knime.base.node.mine.decisiontree2.learner.Split
getAttributeIndex, getBestQualityMeasure, getQualityMeasureName, getSplitAttributeName, getTable, isValidSplit, setBestQualityMeasure
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

SplitContinuous

public SplitContinuous(InMemoryTable table,
                       int attributeIndex,
                       SplitQualityMeasure splitQualityMeasure,
                       boolean averageSplitpoint,
                       double minObjectsCount)
Constructs the best split for the given numeric attribute list and the class distribution. The results can be retrieved from getter methods.

Parameters:
table - the table with the data for which to create the split
attributeIndex - the index of the attribute for which to create the split
splitQualityMeasure - the quality measure (e.g. gini or gain ratio)
averageSplitpoint - if true, the split point is set as the average of the partition borders, else the upper value of the lower partition is used
minObjectsCount - the minimum number of objects that at least two of the resulting partitions must each contain
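The effect of averageSplitpoint can be illustrated with a minimal sketch. The class SplitPointSketch and its method splitPoint are hypothetical names introduced only for this illustration; they are not part of the KNIME API, and the actual implementation may differ.

```java
// Illustrative sketch only; SplitPointSketch is a hypothetical helper,
// not part of org.knime.base.node.mine.decisiontree2.learner.
final class SplitPointSketch {

    /**
     * Chooses the split point between two adjacent partition borders.
     *
     * @param lowerBorder the upper (largest) value of the lower partition
     * @param upperBorder the lowest value of the upper partition
     * @param averageSplitpoint if true, return the average of both borders;
     *            otherwise return the upper value of the lower partition
     */
    static double splitPoint(double lowerBorder, double upperBorder,
            boolean averageSplitpoint) {
        if (averageSplitpoint) {
            return (lowerBorder + upperBorder) / 2.0;
        }
        return lowerBorder;
    }

    public static void main(String[] args) {
        // Borders 2.0 and 4.0: average is 3.0, lower border is 2.0.
        System.out.println(splitPoint(2.0, 4.0, true));
        System.out.println(splitPoint(2.0, 4.0, false));
    }
}
```

With averaging, unseen values that fall between the two borders are divided evenly between the partitions; without it, every value above the lower border goes to the upper partition.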
Method Detail

getBestSplitValue

public double getBestSplitValue()
Returns the split value which was evaluated as the best according to the induced partition purity.

Returns:
the best split value for the underlying attribute
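How a best split value can be chosen by partition purity is sketched below. This is not the KNIME implementation: it assumes Gini impurity as the quality measure, unweighted rows, no missing values, and no minObjectsCount constraint. BestSplitSketch and its methods are hypothetical names.

```java
import java.util.Arrays;
import java.util.Comparator;

// Illustrative sketch only; all names here are hypothetical.
final class BestSplitSketch {

    /** Gini impurity of a class-count vector: 1 - sum of squared class fractions. */
    static double gini(int[] counts, int total) {
        if (total == 0) {
            return 0.0;
        }
        double sumSquares = 0.0;
        for (int c : counts) {
            double p = (double) c / total;
            sumSquares += p * p;
        }
        return 1.0 - sumSquares;
    }

    /**
     * Returns the threshold with the lowest size-weighted Gini impurity,
     * using the upper value of the lower partition as the split value.
     * Returns NaN if all values are equal (no split is possible).
     */
    static double bestSplitValue(double[] values, int[] classes, int numClasses) {
        int n = values.length;
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) {
            order[i] = i;
        }
        // Sort row indices by attribute value.
        Arrays.sort(order, Comparator.comparingDouble(i -> values[i]));

        int[] left = new int[numClasses];
        int[] right = new int[numClasses];
        for (int c : classes) {
            right[c]++;
        }

        double bestImpurity = Double.POSITIVE_INFINITY;
        double bestValue = Double.NaN;
        for (int k = 0; k < n - 1; k++) {
            int row = order[k];
            // Move the current row from the upper to the lower partition.
            left[classes[row]]++;
            right[classes[row]]--;
            double lower = values[row];
            double upper = values[order[k + 1]];
            if (lower == upper) {
                continue; // a threshold cannot separate equal values
            }
            int nLeft = k + 1;
            int nRight = n - nLeft;
            double impurity =
                (nLeft * gini(left, nLeft) + nRight * gini(right, nRight)) / n;
            if (impurity < bestImpurity) {
                bestImpurity = impurity;
                bestValue = lower;
            }
        }
        return bestValue;
    }

    public static void main(String[] args) {
        double[] values = {1, 2, 3, 10, 11, 12};
        int[] classes = {0, 0, 0, 1, 1, 1};
        System.out.println(bestSplitValue(values, classes, 2));
    }
}
```

In the example data, the threshold between 3 and 10 separates the two classes perfectly, so the sketch selects 3.0 as the split value.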

getNumberPartitions

public int getNumberPartitions()
The number of partitions of a numeric split is always 2. Returns the number of partitions resulting from this split.

Specified by:
getNumberPartitions in class Split
Returns:
the number of partitions resulting from this split

canBeFurtherUsed

public boolean canBeFurtherUsed()
For numeric splits it makes sense to use the corresponding attribute in deeper levels. Returns true if it makes sense to use this split's attribute further in deeper levels, false if not.

Specified by:
canBeFurtherUsed in class Split
Returns:
true if it makes sense to use this split's attribute further in deeper levels, false if not

getPartitionForRow

public int getPartitionForRow(DataRowWeighted row)
Returns the partition the given row belongs to according to this split. If the value of the split attribute is missing (i.e. NaN), -1 is returned.

Specified by:
getPartitionForRow in class Split
Parameters:
row - the row for which to get the partition index
Returns:
the partition the given row belongs to according to this split; if the value of the split attribute is missing (i.e. NaN) -1 is returned
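The documented contract can be sketched as follows. Only the -1 result for a missing (NaN) value is stated above; the use of index 0 for the lower partition, index 1 for the upper partition, and the `<=` comparison are assumptions made for this illustration. PartitionSketch is a hypothetical name, and the sketch takes a plain double instead of a DataRowWeighted.

```java
// Illustrative sketch only; PartitionSketch is a hypothetical helper.
final class PartitionSketch {

    /**
     * Maps an attribute value to a partition index of a binary numeric split.
     *
     * @return -1 for a missing value (NaN), 0 if the value is at most the
     *         split value (assumed lower partition), 1 otherwise
     */
    static int partitionFor(double value, double splitValue) {
        if (Double.isNaN(value)) {
            return -1; // missing value: no single partition applies
        }
        return value <= splitValue ? 0 : 1;
    }

    public static void main(String[] args) {
        System.out.println(partitionFor(2.5, 3.0));
        System.out.println(partitionFor(3.5, 3.0));
        System.out.println(partitionFor(Double.NaN, 3.0));
    }
}
```

Callers need to check for -1 before using the result as an array index.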

getPartitionWeights

public double[] getPartitionWeights()
Returns the partition weights. The weights represent the relative frequency of valid rows per partition. They are normally used to adapt the weight of rows whose split value is missing: such a row is then assigned to each partition with the adapted weight.

Specified by:
getPartitionWeights in class Split
Returns:
the partition weights
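The handling of missing values described above can be sketched as follows. The sketch assumes that the relative frequency is computed from row weights, that rows with NaN values are skipped, and that the lower partition holds values up to and including the split value. PartitionWeightSketch and its methods are hypothetical names, not the KNIME implementation.

```java
// Illustrative sketch only; all names here are hypothetical.
final class PartitionWeightSketch {

    /**
     * Computes the relative weight of valid (non-NaN) rows per partition.
     * Index 0 is the lower partition, index 1 the upper one (assumption).
     */
    static double[] partitionWeights(double[] values, double[] rowWeights,
            double splitValue) {
        double lower = 0.0;
        double upper = 0.0;
        for (int i = 0; i < values.length; i++) {
            if (Double.isNaN(values[i])) {
                continue; // missing values do not contribute
            }
            if (values[i] <= splitValue) {
                lower += rowWeights[i];
            } else {
                upper += rowWeights[i];
            }
        }
        double total = lower + upper;
        if (total == 0.0) {
            return new double[] {0.0, 0.0};
        }
        return new double[] {lower / total, upper / total};
    }

    /**
     * Splits the weight of a row with a missing split value across all
     * partitions, proportionally to the partition weights.
     */
    static double[] distributeMissingRow(double rowWeight, double[] weights) {
        double[] adapted = new double[weights.length];
        for (int i = 0; i < weights.length; i++) {
            adapted[i] = rowWeight * weights[i];
        }
        return adapted;
    }

    public static void main(String[] args) {
        double[] values = {1, 2, 3, Double.NaN, 8};
        double[] rowWeights = {1, 1, 1, 1, 1};
        double[] weights = partitionWeights(values, rowWeights, 3.0);
        // Three valid rows fall into the lower partition, one into the upper.
        System.out.println(weights[0] + " " + weights[1]);
        double[] adapted = distributeMissingRow(2.0, weights);
        System.out.println(adapted[0] + " " + adapted[1]);
    }
}
```

The adapted weights of a missing-value row sum to its original weight, so the total weight in the tree is preserved across the split.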

toString

public String toString()

Overrides:
toString in class Split


Copyright, 2003 - 2010. All rights reserved.
University of Konstanz, Germany.
Chair for Bioinformatics and Information Mining, Prof. Dr. Michael R. Berthold.
You may not modify, publish, transmit, transfer or sell, reproduce, create derivative works from, distribute, perform, display, or in any way exploit any of the content, in whole or in part, except as otherwise expressly permitted in writing by the copyright owner or as specified in the license file distributed with this product.