org.knime.base.node.mine.cluster.kmeans
Class ClusterNodeModel

java.lang.Object
  extended by org.knime.core.node.NodeModel
      extended by org.knime.base.node.mine.cluster.kmeans.ClusterNodeModel

public class ClusterNodeModel
extends NodeModel

Generate a clustering using a fixed number of cluster centers and the k-means algorithm. Right now this works only on DataTables holding DoubleCells (or derivatives thereof).
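
As a rough illustration of the approach described above, the following stand-alone sketch runs batch k-means on plain double arrays with a fixed number of centers and an iteration limit. It does not use the KNIME API. The class name KMeansSketch, the methods cluster and nearest, the Euclidean distance, and the choice to seed the centers with the first k rows are all assumptions made for this example, not the node's actual implementation.

```java
/** Hypothetical stand-alone sketch; not part of the KNIME API. */
final class KMeansSketch {

    /**
     * Runs batch k-means on rows of doubles.
     * Assumption: the centers are seeded with the first k rows.
     */
    static double[][] cluster(double[][] rows, int k, int maxIterations) {
        if (k < 1 || k > rows.length) {
            throw new IllegalArgumentException("k must be in [1, number of rows]");
        }
        int dim = rows[0].length;
        double[][] centers = new double[k][];
        for (int c = 0; c < k; c++) {
            centers[c] = rows[c].clone();
        }
        int[] assignment = new int[rows.length];
        java.util.Arrays.fill(assignment, -1);

        for (int iter = 0; iter < maxIterations; iter++) {
            boolean changed = false;
            // Assignment step: attach every row to its nearest center.
            for (int r = 0; r < rows.length; r++) {
                int best = nearest(rows[r], centers);
                if (best != assignment[r]) {
                    assignment[r] = best;
                    changed = true;
                }
            }
            if (!changed) {
                break; // converged before reaching the iteration limit
            }
            // Update step: move each center to the mean of its rows.
            double[][] sums = new double[k][dim];
            int[] counts = new int[k];
            for (int r = 0; r < rows.length; r++) {
                counts[assignment[r]]++;
                for (int d = 0; d < dim; d++) {
                    sums[assignment[r]][d] += rows[r][d];
                }
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] > 0) { // an empty cluster keeps its old center
                    for (int d = 0; d < dim; d++) {
                        centers[c][d] = sums[c][d] / counts[c];
                    }
                }
            }
        }
        return centers;
    }

    /** Index of the center with the smallest squared Euclidean distance. */
    static int nearest(double[] row, double[][] centers) {
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int c = 0; c < centers.length; c++) {
            double dist = 0.0;
            for (int d = 0; d < row.length; d++) {
                double diff = row[d] - centers[c][d];
                dist += diff * diff;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = c;
            }
        }
        return best;
    }
}
```

Calling KMeansSketch.cluster(rows, k, maxIterations) returns one prototype vector per cluster, each with the same dimension as the input rows.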

Author:
Michael Berthold, University of Konstanz

Field Summary
static String CFG_COLUMNS
          Config key for the used columns.
static String CFG_MAX_ITERATIONS
          Config key for the maximal number of iterations.
static String CFG_NR_OF_CLUSTERS
          Config key for the number of clusters.
static String CLUSTER
          Constant for the RowKey generation and identification in the view.
static int INITIAL_MAX_ITERATIONS
          Constant for the initial number of iterations used in the dialog.
static int INITIAL_NR_CLUSTERS
          Constant for the initial number of clusters used in the dialog.
 
Constructor Summary
ClusterNodeModel()
          Constructor, remember parent and initialize status.
 
Method Summary
protected  PortObjectSpec[] configure(PortObjectSpec[] inSpecs)
          Passes the current input spec through to the output spec, which is identical to the input specification; after all, the cluster centers are built in the original feature space.
protected  PortObject[] execute(PortObject[] data, ExecutionContext exec)
          Generate new clustering based on InputDataTable and specified number of clusters.
(package private)  double[] getClusterCenter(int c)
          Return prototype vector of cluster c.
(package private)  int getClusterCoverage(int c)
          Return coverage of a cluster.
(package private)  int getDimension()
          Return dimension of feature space (and hence also clusters).
(package private)  String getFeatureName(int i)
          Return name of column at i'th position within cluster prototype.
(package private)  HiLiteHandler getHiLiteHandler()
          Returns the cluster centers' hilite handler.
(package private)  int getMaxNumIterations()
          Get maximum number of iterations for batch mode.
(package private)  int getNrUsedColumns()
          Returns the number of used columns.
(package private)  int getNumClusters()
          Get number of clusters.
protected  HiLiteHandler getOutHiLiteHandler(int outIndex)
          Returns the HiLiteHandler for the given output index.
(package private)  boolean hasModel()
          Returns true if the model is executed (and not reset) and cluster centers are available.
protected  void loadInternals(File internDir, ExecutionMonitor exec)
          Load internals into the derived NodeModel.
protected  void loadValidatedSettingsFrom(NodeSettingsRO settings)
          Method is called when the NodeModel has to set its configuration using the given one.
protected  void reset()
          Clears the model.
protected  void saveInternals(File internDir, ExecutionMonitor exec)
          Save internals of the derived NodeModel.
protected  void saveSettingsTo(NodeSettingsWO settings)
          Appends to the given node settings the model-specific configuration, that is, the current settings (e.g. from the NodeDialogPane) as well as the NodeModel itself, if applicable.
protected  void setInHiLiteHandler(int inIndex, HiLiteHandler hiLiteHdl)
          This implementation is empty.
(package private)  void setMaxNumIterations(int i)
          Set maximum number of iterations for batch mode.
(package private)  void setNumClusters(int n)
          Set number of clusters.
protected  void validateSettings(NodeSettingsRO settings)
          Method is called before the model has to change its configuration (see loadValidatedSettingsFrom) using the given one.
 
Methods inherited from class org.knime.core.node.NodeModel
addWarningListener, configure, continueLoop, execute, executeModel, getInHiLiteHandler, getLoopEndNode, getLoopStartNode, getNrInPorts, getNrOutPorts, getWarningMessage, notifyViews, notifyWarningListeners, peekFlowVariableDouble, peekFlowVariableInt, peekFlowVariableString, peekScopeVariableDouble, peekScopeVariableInt, peekScopeVariableString, pushFlowVariableDouble, pushFlowVariableInt, pushFlowVariableString, pushScopeVariableDouble, pushScopeVariableInt, pushScopeVariableString, removeWarningListener, setWarningMessage, stateChanged
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

CLUSTER

public static final String CLUSTER
Constant for the RowKey generation and identification in the view.

See Also:
Constant Field Values

INITIAL_NR_CLUSTERS

public static final int INITIAL_NR_CLUSTERS
Constant for the initial number of clusters used in the dialog.

See Also:
Constant Field Values

INITIAL_MAX_ITERATIONS

public static final int INITIAL_MAX_ITERATIONS
Constant for the initial number of iterations used in the dialog.

See Also:
Constant Field Values

CFG_NR_OF_CLUSTERS

public static final String CFG_NR_OF_CLUSTERS
Config key for the number of clusters.

See Also:
Constant Field Values

CFG_MAX_ITERATIONS

public static final String CFG_MAX_ITERATIONS
Config key for the maximal number of iterations.

See Also:
Constant Field Values

CFG_COLUMNS

public static final String CFG_COLUMNS
Config key for the used columns.

See Also:
Constant Field Values
Constructor Detail

ClusterNodeModel

ClusterNodeModel()
Constructor, remember parent and initialize status.

Method Detail

getHiLiteHandler

final HiLiteHandler getHiLiteHandler()
Returns:
cluster centers' hilite handler

getOutHiLiteHandler

protected HiLiteHandler getOutHiLiteHandler(int outIndex)
Returns the HiLiteHandler for the given output index. This default implementation simply passes on the handler of input port 0 or generates a new one if this node has no inputs.

This method is intended to be overridden.

Overrides:
getOutHiLiteHandler in class NodeModel
Parameters:
outIndex - The output index.
Returns:
HiLiteHandler for the given output port.

setInHiLiteHandler

protected void setInHiLiteHandler(int inIndex,
                                  HiLiteHandler hiLiteHdl)
This implementation is empty. Subclasses may override this method in order to be informed when the hilite handler changes at the inport, e.g. when the node (or a preceding node) is newly connected.

Overrides:
setInHiLiteHandler in class NodeModel
Parameters:
inIndex - The index of the input.
hiLiteHdl - The HiLiteHandler at input index. May be null when not available, i.e. not properly connected.

saveSettingsTo

protected void saveSettingsTo(NodeSettingsWO settings)
Appends to the given node settings the model-specific configuration, that is, the current settings (e.g. from the NodeDialogPane) as well as the NodeModel itself, if applicable.

Method is called by the Node if the current configuration needs to be saved.

Specified by:
saveSettingsTo in class NodeModel
Parameters:
settings - to write into
See Also:
NodeModel.loadValidatedSettingsFrom(NodeSettingsRO), NodeModel.validateSettings(NodeSettingsRO)

validateSettings

protected void validateSettings(NodeSettingsRO settings)
                         throws InvalidSettingsException
Method is called before the model has to change its configuration (see loadValidatedSettingsFrom) using the given one. This method is also called by the Node.

Specified by:
validateSettings in class NodeModel
Parameters:
settings - to validate
Throws:
InvalidSettingsException - if a property is not available or doesn't fit
See Also:
NodeModel.saveSettingsTo(NodeSettingsWO), NodeModel.loadValidatedSettingsFrom(NodeSettingsRO)

loadValidatedSettingsFrom

protected void loadValidatedSettingsFrom(NodeSettingsRO settings)
                                  throws InvalidSettingsException
Method is called when the NodeModel has to set its configuration using the given one. This method is also called by the Node. Note that the settings should have been validated before this method is called.

Specified by:
loadValidatedSettingsFrom in class NodeModel
Parameters:
settings - to read from
Throws:
InvalidSettingsException - if a property is not available - which shouldn't happen...
See Also:
NodeModel.saveSettingsTo(NodeSettingsWO), NodeModel.validateSettings(NodeSettingsRO)
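
The three settings methods above form a save, validate, load cycle. The sketch below shows that cycle under simplifying assumptions: a plain Map stands in for NodeSettingsRO and NodeSettingsWO, the nested InvalidSettings class stands in for InvalidSettingsException, and the key strings and default values are placeholders, since the real constant values are not shown on this page. The "at least 1" rule is likewise an assumed validation criterion.

```java
import java.util.Map;

/** Hypothetical stand-in; a plain Map replaces NodeSettingsRO/NodeSettingsWO. */
final class SettingsSketch {
    // Placeholder key values; the real constants' values are not shown here.
    static final String CFG_NR_OF_CLUSTERS = "nrClusters";
    static final String CFG_MAX_ITERATIONS = "maxIterations";

    /** Stand-in for InvalidSettingsException. */
    static final class InvalidSettings extends Exception {
        InvalidSettings(String message) {
            super(message);
        }
    }

    private int numClusters = 3;    // placeholder default
    private int maxIterations = 99; // placeholder default

    /** Writes the current configuration into the settings object. */
    void saveSettingsTo(Map<String, Integer> settings) {
        settings.put(CFG_NR_OF_CLUSTERS, numClusters);
        settings.put(CFG_MAX_ITERATIONS, maxIterations);
    }

    /** Checks the settings without touching the model's own state. */
    void validateSettings(Map<String, Integer> settings) throws InvalidSettings {
        readPositive(settings, CFG_NR_OF_CLUSTERS);
        readPositive(settings, CFG_MAX_ITERATIONS);
    }

    /** Applies settings that are expected to have passed validateSettings. */
    void loadValidatedSettingsFrom(Map<String, Integer> settings) throws InvalidSettings {
        numClusters = readPositive(settings, CFG_NR_OF_CLUSTERS);
        maxIterations = readPositive(settings, CFG_MAX_ITERATIONS);
    }

    int getNumClusters() {
        return numClusters;
    }

    int getMaxNumIterations() {
        return maxIterations;
    }

    /** Reads one integer setting and rejects missing or non-positive values. */
    private static int readPositive(Map<String, Integer> settings, String key)
            throws InvalidSettings {
        Integer value = settings.get(key);
        if (value == null) {
            throw new InvalidSettings("Missing setting: " + key);
        }
        if (value < 1) {
            throw new InvalidSettings(key + " must be at least 1, got " + value);
        }
        return value;
    }
}
```

Keeping validation free of side effects means a rejected configuration leaves the model's previous values untouched.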

getNumClusters

int getNumClusters()
Get number of clusters.

Returns:
number of clusters

setNumClusters

void setNumClusters(int n)
Set number of clusters.

Parameters:
n - number of clusters

getMaxNumIterations

int getMaxNumIterations()
Get maximum number of iterations for batch mode.

Returns:
maximum number of iterations currently chosen

setMaxNumIterations

void setMaxNumIterations(int i)
Set maximum number of iterations for batch mode.

Parameters:
i - maximum number of iterations

getDimension

int getDimension()
Return dimension of feature space (and hence also clusters).

Returns:
dimension of feature space

getNrUsedColumns

int getNrUsedColumns()
Returns:
the number of used columns

hasModel

boolean hasModel()
Returns:
true if the model is executed (and not reset) and cluster centers are available

getClusterCenter

double[] getClusterCenter(int c)
Return prototype vector of cluster c. Do not call if the model has not been executed or has been reset.

Parameters:
c - index of cluster
Returns:
array of doubles holding prototype vector
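
The precondition above suggests checking hasModel() before asking for a prototype. The following sketch shows one way a caller and the model could cooperate. ModelStateSketch and setCenters are hypothetical names, and throwing IllegalStateException is an assumption; this page does not say how the real method behaves when no model is available.

```java
/** Hypothetical stand-in for the model's executed/reset state. */
final class ModelStateSketch {
    // null until the model has been executed, and null again after reset()
    private double[][] centers;

    /** Hypothetical hook that an execute step would call with its result. */
    void setCenters(double[][] newCenters) {
        centers = newCenters;
    }

    /** Clears the model. */
    void reset() {
        centers = null;
    }

    /** True only while cluster centers are available. */
    boolean hasModel() {
        return centers != null;
    }

    /** Returns the prototype vector of cluster c; fails fast without a model. */
    double[] getClusterCenter(int c) {
        if (!hasModel()) {
            // Assumed behavior, chosen here to make misuse visible.
            throw new IllegalStateException("No model: execute the node first");
        }
        return centers[c];
    }
}
```

A view would typically call hasModel() first and only read the centers when it returns true.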

getFeatureName

String getFeatureName(int i)
Return name of column at i'th position within cluster prototype.

Parameters:
i - index of (double compatible = not ignored) feature
Returns:
name

getClusterCoverage

int getClusterCoverage(int c)
Return coverage of a cluster.

Parameters:
c - index of cluster
Returns:
number of patterns covered by a cluster
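
To make the notion of coverage concrete, the sketch below counts, for every cluster, how many input rows have that cluster's prototype as their nearest center. It assumes Euclidean distance and that ties go to the lower cluster index; CoverageSketch and coverage are hypothetical names, and the real node may compute the value differently.

```java
/** Hypothetical stand-alone sketch; not part of the KNIME API. */
final class CoverageSketch {

    /** Counts, for each center, the rows for which it is the nearest center. */
    static int[] coverage(double[][] rows, double[][] centers) {
        int[] counts = new int[centers.length];
        for (double[] row : rows) {
            int best = 0;
            double bestDist = Double.POSITIVE_INFINITY;
            for (int c = 0; c < centers.length; c++) {
                double dist = 0.0;
                for (int d = 0; d < row.length; d++) {
                    double diff = row[d] - centers[c][d];
                    dist += diff * diff;
                }
                if (dist < bestDist) { // ties go to the lower cluster index
                    bestDist = dist;
                    best = c;
                }
            }
            counts[best]++;
        }
        return counts;
    }
}
```

The counts always sum to the number of rows, because every row is assigned to exactly one cluster.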

execute

protected PortObject[] execute(PortObject[] data,
                               ExecutionContext exec)
                        throws Exception
Generate new clustering based on InputDataTable and specified number of clusters. Currently the objective function only looks for cluster centers that are extremely similar to the first n patterns...

Execute method for general port types. The argument objects represent the input objects and are guaranteed to be subclasses of the PortObject classes that are defined through the PortTypes given in the constructor. Similarly, the returned output objects need to comply with their port types object class (otherwise an error is reported by the framework).

For a general description of the execute method, refer to the description of the specialized NodeModel.execute(BufferedDataTable[], ExecutionContext) method, as it addresses more use cases.

Overrides:
execute in class NodeModel
Parameters:
data - The input objects.
exec - For BufferedDataTable creation and progress.
Returns:
The output objects.
Throws:
Exception - If the node execution fails for any reason.

reset

protected void reset()
Clears the model.

Specified by:
reset in class NodeModel
See Also:
NodeModel.reset()

configure

protected PortObjectSpec[] configure(PortObjectSpec[] inSpecs)
                              throws InvalidSettingsException
Passes the current input spec through to the output spec, which is identical to the input specification; after all, the cluster centers are built in the original feature space.

Overrides:
configure in class NodeModel
Parameters:
inSpecs - the specifications of the input port(s) - should be one
Returns:
the copied input spec
Throws:
InvalidSettingsException - if PMML incompatible type was found

loadInternals

protected void loadInternals(File internDir,
                             ExecutionMonitor exec)
                      throws IOException
Load internals into the derived NodeModel. This method is only called if the Node was executed. Read all your internal structures from the given file directory to create your internal data structure which is necessary to provide all node functionalities after the workflow is loaded, e.g. view content and/or hilite mapping.

Specified by:
loadInternals in class NodeModel
Parameters:
internDir - The directory to read from.
exec - Used to report progress and to cancel the load process.
Throws:
IOException - If an error occurs during reading from this dir.
See Also:
NodeModel.saveInternals(File,ExecutionMonitor)

saveInternals

protected void saveInternals(File internDir,
                             ExecutionMonitor exec)
                      throws IOException,
                             CanceledExecutionException
Save internals of the derived NodeModel. This method is only called if the Node is executed. Write all your internal structures into the given file directory which are necessary to recreate this model when the workflow is loaded, e.g. view content and/or hilite mapping.

Specified by:
saveInternals in class NodeModel
Parameters:
internDir - The directory to write into.
exec - Used to report progress and to cancel the save process.
Throws:
IOException - If an error occurs during writing to this dir.
CanceledExecutionException - If the saving has been canceled.
See Also:
NodeModel.loadInternals(File,ExecutionMonitor)
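
The save and load pair above can be pictured as a simple round trip through the internals directory. The sketch below writes a matrix of cluster centers to a file and reads it back using only java.io. The file name centers.bin, the binary layout, and the class name InternalsSketch are assumptions for illustration; the ExecutionMonitor parameter is left out, and the real node may store different or additional data.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

/** Hypothetical sketch; the file name and binary layout are assumptions. */
final class InternalsSketch {
    private static final String FILE_NAME = "centers.bin"; // hypothetical name

    /** Writes the cluster centers into a file inside internDir. */
    static void saveInternals(File internDir, double[][] centers) throws IOException {
        File file = new File(internDir, FILE_NAME);
        try (DataOutputStream out = new DataOutputStream(new FileOutputStream(file))) {
            // Header: number of clusters, then dimension of the feature space.
            out.writeInt(centers.length);
            out.writeInt(centers.length == 0 ? 0 : centers[0].length);
            for (double[] center : centers) {
                for (double value : center) {
                    out.writeDouble(value);
                }
            }
        }
    }

    /** Reads the cluster centers back from the file inside internDir. */
    static double[][] loadInternals(File internDir) throws IOException {
        File file = new File(internDir, FILE_NAME);
        try (DataInputStream in = new DataInputStream(new FileInputStream(file))) {
            int k = in.readInt();
            int dim = in.readInt();
            double[][] centers = new double[k][dim];
            for (int c = 0; c < k; c++) {
                for (int d = 0; d < dim; d++) {
                    centers[c][d] = in.readDouble();
                }
            }
            return centers;
        }
    }
}
```

Writing the dimensions first lets the loader allocate the arrays before reading any values.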


Copyright, 2003 - 2010. All rights reserved.
University of Konstanz, Germany.
Chair for Bioinformatics and Information Mining, Prof. Dr. Michael R. Berthold.
You may not modify, publish, transmit, transfer or sell, reproduce, create derivative works from, distribute, perform, display, or in any way exploit any of the content, in whole or in part, except as otherwise expressly permitted in writing by the copyright owner or as specified in the license file distributed with this product.