org.knime.base.node.mine.cluster.fuzzycmeans
Class FuzzyClusterNodeModel

java.lang.Object
  extended by org.knime.core.node.NodeModel
      extended by org.knime.base.node.mine.cluster.fuzzycmeans.FuzzyClusterNodeModel

public class FuzzyClusterNodeModel
extends NodeModel

Generate a fuzzy c-means clustering using a fixed number of cluster centers.

Author:
Michael Berthold, University of Konstanz, Nicolas Cebron, University of Konstanz

Field Summary
static String CFGKEY_KEEPALL
          Config key to keep all columns in include list.
static String CLUSTER_KEY
          Key for the Cluster Columns in the output DataTable.
static String DELTAVALUE_KEY
          Key to store the delta value in the config.
static String FUZZIFIER_KEY
          Key to store the fuzzifier in the settings.
static String INCLUDELIST_KEY
          Key to store the included column list in the settings.
(package private) static int INPORT
          The input port used here.
static String LAMBDAVALUE_KEY
          Key to store the lambda value in the config.
static String MAXITERATIONS_KEY
          Key to store the number of maximal iterations in the settings.
static String MEASURES_KEY
          Key to store whether cluster quality measures should be calculated.
static String MEMORY_KEY
          Key to store whether the clustering should be performed in memory in the PredParams.
static String NOISE_KEY
          Key to store whether a noise cluster is induced.
static String NOISESPEC_KEY
          Key for the Cluster Columns in the output DataTable.
static String NRCLUSTERS_KEY
          Key to store the number of clusters in the settings.
(package private) static int OUTPORT
          The output port used here.
 
Constructor Summary
FuzzyClusterNodeModel()
          Constructor, remember parent and initialize status.
 
Method Summary
protected  PortObjectSpec[] configure(PortObjectSpec[] inSpecs)
          Number of columns in the output table is not deterministic.
protected  PortObject[] execute(PortObject[] inData, ExecutionContext exec)
          Generate new clustering based on InputDataTable and specified number of clusters.
 double getBetweenClusterVariation()
          Calculates the Between-Cluster Variation.
 double[][] getClusterCentres()
           
 double[] getFuzzyHyperVolumes()
          Calculates the fuzzy hypervolumes for each cluster.
 double getPartitionCoefficient()
          Calculates the partition coefficient.
 double getPartitionEntropy()
          Calculates the partition entropy.
 double[][] getweightMatrix()
           
 double[] getWithinClusterVariations()
          Calculates the Within-Cluster Variation for each cluster.
 double getXieBeniIndex()
          Calculates the Xie Beni Index.
protected  void loadInternals(File internDir, ExecutionMonitor exec)
          Load internals into the derived NodeModel.
protected  void loadValidatedSettingsFrom(NodeSettingsRO settings)
          Loads the number of clusters and the maximum number of iterations from the settings.
 boolean noiseClustering()
           
 void reset()
          Override this function in the derived model and reset your NodeModel.
protected  void saveInternals(File internDir, ExecutionMonitor exec)
          Save internals of the derived NodeModel.
protected  void saveSettingsTo(NodeSettingsWO settings)
          Saves the number of clusters and the maximum number of iterations in the settings.
protected  void validateSettings(NodeSettingsRO settings)
          Validates the number of clusters and the maximum number of iterations in the settings.
 
Methods inherited from class org.knime.core.node.NodeModel
addWarningListener, configure, continueLoop, execute, executeModel, getInHiLiteHandler, getLoopEndNode, getLoopStartNode, getNrInPorts, getNrOutPorts, getOutHiLiteHandler, getWarningMessage, notifyViews, notifyWarningListeners, peekFlowVariableDouble, peekFlowVariableInt, peekFlowVariableString, peekScopeVariableDouble, peekScopeVariableInt, peekScopeVariableString, pushFlowVariableDouble, pushFlowVariableInt, pushFlowVariableString, pushScopeVariableDouble, pushScopeVariableInt, pushScopeVariableString, removeWarningListener, setInHiLiteHandler, setWarningMessage, stateChanged
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

CLUSTER_KEY

public static final String CLUSTER_KEY
Key for the Cluster Columns in the output DataTable.

See Also:
Constant Field Values

NOISESPEC_KEY

public static final String NOISESPEC_KEY
Key for the Cluster Columns in the output DataTable.

See Also:
Constant Field Values

NRCLUSTERS_KEY

public static final String NRCLUSTERS_KEY
Key to store the number of clusters in the settings.

See Also:
Constant Field Values

MAXITERATIONS_KEY

public static final String MAXITERATIONS_KEY
Key to store the number of maximal iterations in the settings.

See Also:
Constant Field Values

FUZZIFIER_KEY

public static final String FUZZIFIER_KEY
Key to store the fuzzifier in the settings.

See Also:
Constant Field Values

INCLUDELIST_KEY

public static final String INCLUDELIST_KEY
Key to store the included column list in the settings.

See Also:
Constant Field Values

NOISE_KEY

public static final String NOISE_KEY
Key to store whether a noise cluster is induced.

See Also:
Constant Field Values

DELTAVALUE_KEY

public static final String DELTAVALUE_KEY
Key to store the delta value in the config.

See Also:
Constant Field Values

LAMBDAVALUE_KEY

public static final String LAMBDAVALUE_KEY
Key to store the lambda value in the config.

See Also:
Constant Field Values

MEMORY_KEY

public static final String MEMORY_KEY
Key to store whether the clustering should be performed in memory in the PredParams.

See Also:
Constant Field Values

MEASURES_KEY

public static final String MEASURES_KEY
Key to store whether cluster quality measures should be calculated.

See Also:
Constant Field Values

CFGKEY_KEEPALL

public static final String CFGKEY_KEEPALL
Config key to keep all columns in include list.

See Also:
Constant Field Values

INPORT

static final int INPORT
The input port used here.

See Also:
Constant Field Values

OUTPORT

static final int OUTPORT
The output port used here. Contains the original rows with additional cluster membership information.

See Also:
Constant Field Values
Constructor Detail

FuzzyClusterNodeModel

public FuzzyClusterNodeModel()
Constructor, remember parent and initialize status.

Method Detail

execute

protected PortObject[] execute(PortObject[] inData,
                               ExecutionContext exec)
                        throws Exception
Generates a new clustering based on the input data table and the specified number of clusters. In the output table, each data row carries supplementary information about its membership in each cluster center. OUTPORT = original data rows with cluster membership information.

Execute method for general port types. The argument objects represent the input objects and are guaranteed to be subclasses of the PortObject classes that are defined through the PortTypes given in the constructor. Similarly, the returned output objects need to comply with their port types object class (otherwise an error is reported by the framework).

For a general description of the execute method refer to the description of the specialized NodeModel.execute(BufferedDataTable[], ExecutionContext) methods as it addresses more use cases.

Overrides:
execute in class NodeModel
Parameters:
inData - The input objects.
exec - For BufferedDataTable creation and progress.
Returns:
The output objects.
Throws:
Exception - If the node execution fails for any reason.
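
The page does not show how the clustering itself is computed. As a rough illustration only, the following is a minimal sketch of one iteration of the textbook fuzzy c-means update, not this node's actual implementation. The class name FuzzyCMeansStep, the method names, and the layout u[row][cluster] are assumptions made for this example; m is the fuzzifier (m > 1).

```java
/** Illustrative sketch of one textbook fuzzy c-means iteration (hypothetical class, not KNIME code). */
final class FuzzyCMeansStep {
    private FuzzyCMeansStep() { }

    /** Squared Euclidean distance between two vectors of equal length. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int d = 0; d < a.length; d++) {
            double diff = a[d] - b[d];
            sum += diff * diff;
        }
        return sum;
    }

    /** Membership update: u[k][i] = 1 / sum_j (d_ki^2 / d_kj^2)^(1/(m-1)). */
    static double[][] updateMemberships(double[][] data, double[][] centres, double m) {
        int c = centres.length;
        double exponent = 1.0 / (m - 1.0);
        double[][] u = new double[data.length][c];
        for (int k = 0; k < data.length; k++) {
            double[] dist = new double[c];
            int zeroAt = -1;
            for (int i = 0; i < c; i++) {
                dist[i] = squaredDistance(data[k], centres[i]);
                if (dist[i] == 0.0 && zeroAt < 0) {
                    zeroAt = i;
                }
            }
            if (zeroAt >= 0) {
                // The row coincides with a centre: give it full membership there.
                u[k][zeroAt] = 1.0;
                continue;
            }
            for (int i = 0; i < c; i++) {
                double sum = 0.0;
                for (int j = 0; j < c; j++) {
                    sum += Math.pow(dist[i] / dist[j], exponent);
                }
                u[k][i] = 1.0 / sum;
            }
        }
        return u;
    }

    /** Centre update: each centre is the mean of all rows weighted by u[k][i]^m. */
    static double[][] updateCentres(double[][] data, double[][] u, double m) {
        int c = u[0].length;
        int dims = data[0].length;
        double[][] centres = new double[c][dims];
        for (int i = 0; i < c; i++) {
            double weightSum = 0.0;
            for (int k = 0; k < data.length; k++) {
                double w = Math.pow(u[k][i], m);
                weightSum += w;
                for (int d = 0; d < dims; d++) {
                    centres[i][d] += w * data[k][d];
                }
            }
            // Assumes at least one row has non-zero membership in cluster i.
            for (int d = 0; d < dims; d++) {
                centres[i][d] /= weightSum;
            }
        }
        return centres;
    }
}
```

Alternating the two updates until the centres stop moving, or until a maximum number of iterations is reached, gives the usual algorithm. How this node handles the noise cluster, the delta and lambda values, or in-memory execution is not covered by this sketch.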

reset

public void reset()
Override this function in the derived model and reset your NodeModel. All components should unregister themselves from any observables (at least from the hilite handler right now). All internally stored data structures should be released. User settings should not be deleted/reset though.

Specified by:
reset in class NodeModel

saveSettingsTo

protected void saveSettingsTo(NodeSettingsWO settings)
Saves the number of clusters and the maximum number of iterations in the settings. Adds to the given NodeSettings the model specific settings. The settings don't need to be complete or consistent. If, right after startup, no valid settings are available this method can write either nothing or invalid settings.

Method is called by the Node if the current settings need to be saved or transferred to the node's dialog.

Specified by:
saveSettingsTo in class NodeModel
Parameters:
settings - The object to write settings into.
See Also:
NodeModel.loadValidatedSettingsFrom(NodeSettingsRO), NodeModel.validateSettings(NodeSettingsRO)

validateSettings

protected void validateSettings(NodeSettingsRO settings)
                         throws InvalidSettingsException
Validates the number of clusters and the maximum number of iterations in the settings. Validates the settings in the passed NodeSettings object. The specified settings should be checked for completeness and consistency. It must be possible to load a settings object validated here without any exception in the loadValidatedSettingsFrom(NodeSettingsRO) method. The method must not change the current settings in the model; it is supposed to just check them. If some settings are missing, invalid, inconsistent, or just not right, throw an exception with a message useful to the user.

Specified by:
validateSettings in class NodeModel
Parameters:
settings - The settings to validate.
Throws:
InvalidSettingsException - If the validation of the settings failed.
See Also:
NodeModel.saveSettingsTo(NodeSettingsWO), NodeModel.loadValidatedSettingsFrom(NodeSettingsRO)

loadValidatedSettingsFrom

protected void loadValidatedSettingsFrom(NodeSettingsRO settings)
                                  throws InvalidSettingsException
Loads the number of clusters and the maximum number of iterations from the settings. Sets new settings from the passed object in the model. You can safely assume that the object passed has been successfully validated by the validateSettings(NodeSettingsRO) method. The model must set its internal configuration according to the settings object passed.

Specified by:
loadValidatedSettingsFrom in class NodeModel
Parameters:
settings - The settings to read.
Throws:
InvalidSettingsException - If a property is not available.
See Also:
NodeModel.saveSettingsTo(NodeSettingsWO), NodeModel.validateSettings(NodeSettingsRO)
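
The contract between saveSettingsTo, validateSettings and loadValidatedSettingsFrom can be illustrated without the KNIME classes. The sketch below is a stand-in under stated assumptions: a plain Map replaces NodeSettingsRO/NodeSettingsWO, IllegalArgumentException replaces InvalidSettingsException, and the key strings and default values are hypothetical (the real constants are listed under Constant Field Values).

```java
import java.util.Map;

/** Stand-in for the save / validate / load contract (hypothetical class, not KNIME code). */
final class ClusterSettingsSketch {
    // Hypothetical key strings, chosen for this example only.
    static final String NR_CLUSTERS = "nrClusters";
    static final String MAX_ITERATIONS = "maxIterations";

    // Hypothetical defaults, chosen for this example only.
    private int nrClusters = 3;
    private int maxIterations = 99;

    /** Checks the settings only; it never modifies the model's fields. */
    static void validate(Map<String, Integer> settings) {
        Integer clusters = settings.get(NR_CLUSTERS);
        Integer iterations = settings.get(MAX_ITERATIONS);
        if (clusters == null || clusters < 1) {
            throw new IllegalArgumentException("Number of clusters must be at least 1.");
        }
        if (iterations == null || iterations < 1) {
            throw new IllegalArgumentException("Maximum number of iterations must be at least 1.");
        }
    }

    /** Assumes validate(settings) has already succeeded. */
    void load(Map<String, Integer> settings) {
        nrClusters = settings.get(NR_CLUSTERS);
        maxIterations = settings.get(MAX_ITERATIONS);
    }

    /** Writes the current configuration into the given settings object. */
    void save(Map<String, Integer> settings) {
        settings.put(NR_CLUSTERS, nrClusters);
        settings.put(MAX_ITERATIONS, maxIterations);
    }

    int getNrClusters() {
        return nrClusters;
    }

    int getMaxIterations() {
        return maxIterations;
    }
}
```

Because validate is static and has no access to instance state, it cannot change the model by accident, which mirrors the requirement that validation only checks the settings.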

configure

protected PortObjectSpec[] configure(PortObjectSpec[] inSpecs)
                              throws InvalidSettingsException
Number of columns in the output table is not deterministic. Configure method for general port types. The argument specs represent the input object specs and are guaranteed to be subclasses of the PortObjectSpecs that are defined through the PortTypes given in the constructor. Similarly, the returned output specs need to comply with their port types spec class (otherwise an error is reported by the framework). They may also be null.

For a general description of the configure method refer to the description of the specialized NodeModel.configure(DataTableSpec[]) methods as it addresses more use cases.

Overrides:
configure in class NodeModel
Parameters:
inSpecs - The input object specs.
Returns:
The output objects specs or null.
Throws:
InvalidSettingsException - If this node can't be configured.

getClusterCentres

public double[][] getClusterCentres()
Returns:
the cluster centers as 2-dimensional double matrix

getweightMatrix

public double[][] getweightMatrix()
Returns:
the 2-dimensional weight matrix

getBetweenClusterVariation

public double getBetweenClusterVariation()
Calculates the Between-Cluster Variation.

Returns:
the between cluster variation

getPartitionCoefficient

public double getPartitionCoefficient()
Calculates the partition coefficient.

Returns:
the partition coefficient
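
For orientation, the partition coefficient is commonly defined as the mean of the squared memberships, PC = (1/N) * sum_k sum_i u_ki^2. The sketch below follows that textbook definition and is not taken from this class; the class name and the layout u[row][cluster] are assumptions.

```java
/** Textbook partition coefficient (hypothetical class, not KNIME code). */
final class PartitionCoefficientSketch {
    private PartitionCoefficientSketch() { }

    /** Returns (1/N) * sum of all squared memberships; u[k][i] is row k's membership in cluster i. */
    static double partitionCoefficient(double[][] u) {
        double sum = 0.0;
        for (double[] row : u) {
            for (double membership : row) {
                sum += membership * membership;
            }
        }
        return sum / u.length;
    }
}
```

Under this definition the value ranges from 1/c (all memberships equal across c clusters) to 1 (a crisp partition).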

getPartitionEntropy

public double getPartitionEntropy()
Calculates the partition entropy.

Returns:
the partition entropy
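
The partition entropy is commonly defined as PE = -(1/N) * sum_k sum_i u_ki * ln(u_ki), with 0 * ln(0) taken as 0. The sketch below implements that textbook form; the logarithm base and the layout u[row][cluster] are assumptions, and the class name is hypothetical.

```java
/** Textbook partition entropy using the natural logarithm (hypothetical class, not KNIME code). */
final class PartitionEntropySketch {
    private PartitionEntropySketch() { }

    /** Returns -(1/N) * sum of u * ln(u) over all memberships; zero memberships contribute nothing. */
    static double partitionEntropy(double[][] u) {
        double sum = 0.0;
        for (double[] row : u) {
            for (double membership : row) {
                if (membership > 0.0) {
                    sum += membership * Math.log(membership);
                }
            }
        }
        return -sum / u.length;
    }
}
```

Under this definition a crisp partition gives 0, and equal memberships across c clusters give the maximum, ln(c).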

getXieBeniIndex

public double getXieBeniIndex()
Calculates the Xie Beni Index.

Returns:
the Xie Beni Index
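
The Xie-Beni index is usually given as the ratio of compactness to separation: the membership-weighted sum of squared distances to the centres, divided by N times the smallest squared distance between two centres. The sketch below follows that general form. The original definition uses squared memberships (m = 2); passing the fuzzifier m as a parameter is a choice made for this example, as are the class name and the layout u[row][cluster].

```java
/** Textbook Xie-Beni index (hypothetical class, not KNIME code). */
final class XieBeniSketch {
    private XieBeniSketch() { }

    /** Squared Euclidean distance between two vectors of equal length. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int d = 0; d < a.length; d++) {
            double diff = a[d] - b[d];
            sum += diff * diff;
        }
        return sum;
    }

    /** Compactness divided by (N * minimum squared centre separation); needs at least two centres. */
    static double xieBeni(double[][] data, double[][] centres, double[][] u, double m) {
        double compactness = 0.0;
        for (int k = 0; k < data.length; k++) {
            for (int i = 0; i < centres.length; i++) {
                compactness += Math.pow(u[k][i], m) * squaredDistance(data[k], centres[i]);
            }
        }
        double minSeparation = Double.POSITIVE_INFINITY;
        for (int i = 0; i < centres.length; i++) {
            for (int j = i + 1; j < centres.length; j++) {
                minSeparation = Math.min(minSeparation, squaredDistance(centres[i], centres[j]));
            }
        }
        return compactness / (data.length * minSeparation);
    }
}
```

Smaller values indicate compact, well-separated clusters under this definition.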

getWithinClusterVariations

public double[] getWithinClusterVariations()
Calculates the Within-Cluster Variation for each cluster. Memberships are treated as 'crisp' for this calculation: each data row is assigned to exactly one cluster center.

Returns:
withinClusterVariations
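
One plausible reading of the crisp assignment is sketched below: each row is assigned to the cluster in which it has the highest membership, and the variation of a cluster is the mean squared distance of its assigned rows to its centre. Both choices are assumptions for illustration, since the exact formula is not stated on this page; the class name and the layout u[row][cluster] are also hypothetical.

```java
/** Within-cluster variation with crisp assignment (hypothetical class, not KNIME code). */
final class WithinClusterVariationSketch {
    private WithinClusterVariationSketch() { }

    /** Returns, per cluster, the mean squared distance of its crisply assigned rows to its centre. */
    static double[] withinClusterVariations(double[][] data, double[][] centres, double[][] u) {
        int c = centres.length;
        double[] variations = new double[c];
        int[] counts = new int[c];
        for (int k = 0; k < data.length; k++) {
            // Crisp assignment: pick the cluster with the largest membership.
            int best = 0;
            for (int i = 1; i < c; i++) {
                if (u[k][i] > u[k][best]) {
                    best = i;
                }
            }
            double squared = 0.0;
            for (int d = 0; d < data[k].length; d++) {
                double diff = data[k][d] - centres[best][d];
                squared += diff * diff;
            }
            variations[best] += squared;
            counts[best]++;
        }
        for (int i = 0; i < c; i++) {
            // Clusters without assigned rows keep a variation of 0.
            if (counts[i] > 0) {
                variations[i] /= counts[i];
            }
        }
        return variations;
    }
}
```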

getFuzzyHyperVolumes

public double[] getFuzzyHyperVolumes()
Calculates the fuzzy hypervolumes for each cluster.

Returns:
fuzzy hypervolumes of all clusters

noiseClustering

public boolean noiseClustering()
Returns:
flag indicating whether a noise clustering was performed

loadInternals

protected void loadInternals(File internDir,
                             ExecutionMonitor exec)
                      throws IOException
Load internals into the derived NodeModel. This method is only called if the Node was executed. Read all your internal structures from the given file directory to create your internal data structure which is necessary to provide all node functionalities after the workflow is loaded, e.g. view content and/or hilite mapping.

Specified by:
loadInternals in class NodeModel
Parameters:
internDir - The directory to read from.
exec - Used to report progress and to cancel the load process.
Throws:
IOException - If an error occurs during reading from this dir.
See Also:
NodeModel.saveInternals(File,ExecutionMonitor)

saveInternals

protected void saveInternals(File internDir,
                             ExecutionMonitor exec)
                      throws IOException
Save internals of the derived NodeModel. This method is only called if the Node is executed. Write all your internal structures into the given file directory which are necessary to recreate this model when the workflow is loaded, e.g. view content and/or hilite mapping.

Specified by:
saveInternals in class NodeModel
Parameters:
internDir - The directory to write into.
exec - Used to report progress and to cancel the save process.
Throws:
IOException - If an error occurs during writing to this dir.
See Also:
NodeModel.loadInternals(File,ExecutionMonitor)


Copyright, 2003 - 2010. All rights reserved.
University of Konstanz, Germany.
Chair for Bioinformatics and Information Mining, Prof. Dr. Michael R. Berthold.
You may not modify, publish, transmit, transfer or sell, reproduce, create derivative works from, distribute, perform, display, or in any way exploit any of the content, in whole or in part, except as otherwise expressly permitted in writing by the copyright owner or as specified in the license file distributed with this product.