org.knime.base.node.mine.scorer.accuracy
Class AccuracyScorerNodeModel

java.lang.Object
  extended by org.knime.core.node.NodeModel
      extended by org.knime.base.node.mine.scorer.accuracy.AccuracyScorerNodeModel
All Implemented Interfaces:
DataProvider

public class AccuracyScorerNodeModel
extends NodeModel
implements DataProvider

The hilite scorer node's model. The scoring is performed on two columns selected in the dialog. The row keys are stored for later hiliting.

Author:
Christoph Sieb, University of Konstanz
See Also:
AccuracyScorerNodeFactory

Field Summary
(package private) static String FIRST_COMP_ID
          Identifier in model spec to address first column name to compare.
(package private) static int INPORT
          The input port 0.
protected static NodeLogger LOGGER
          The node logger for this class.
(package private) static int OUTPORT_0
          The output port 0: confusion matrix.
(package private) static int OUTPORT_1
          The output port 1: accuracy measures.
(package private) static String SECOND_COMP_ID
          Identifier in model spec to address second column name to compare.
 
Fields inherited from interface org.knime.base.node.viz.plotter.DataProvider
END, START
 
Constructor Summary
AccuracyScorerNodeModel()
          Initializes a new AccuracyScorerNodeModel with one input and two outputs.
 
Method Summary
protected  DataTableSpec[] configure(DataTableSpec[] inSpecs)
          This function is called whenever the derived model should re-configure its output DataTableSpecs.
(package private)  boolean containsConfusionMatrixKeys(int x, int y, Set<RowKey> keys)
          Checks if the specified confusion matrix cell contains at least one of the given keys.
protected  DataCell[] determineColValues(BufferedDataTable in, int index1, int index2, ExecutionMonitor exec)
          Called to determine all possible values in the respective columns.
protected  BufferedDataTable[] execute(BufferedDataTable[] data, ExecutionContext exec)
          Starts the scoring in the scorer.
protected static int findValue(DataCell[] source, DataCell key)
          Finds the position where key is located in source.
 double getAccuracy()
           
(package private)  Point[] getCompleteHilitedCells(Set<RowKey> keys)
          Returns all cells of the confusion matrix (as Points) if the given key set contains all keys of that cell.
 int getCorrectCount()
          Get the correct classification count, i.e. where both columns agree.
 DataArray getDataArray(int index)
          Provides the data that should be visualized.
 double getError()
           
 int getFalseCount()
          Get the misclassification count, i.e. where both columns have different values.
 String getFirstCompareColumn()
          Returns the first column to compare.
 int getNrRows()
          Get the number of rows in the input table.
protected  HiLiteHandler getOutHiLiteHandler(int outIndex)
          Returns the HiLiteHandler for the given output index.
(package private)  BitSet getRocCurve()
          Returns a bit set with data for the ROC curve.
(package private)  int[][] getScorerCount()
           
 String getSecondCompareColumn()
          Returns the second column to compare.
(package private)  Set<RowKey> getSelectedSet(Point[] cells)
          Determines the row keys (as RowKeys) which belong to the given cells of the confusion matrix.
(package private)  String[] getValues()
           
protected  void loadInternals(File internDir, ExecutionMonitor exec)
          Load internals into the derived NodeModel.
protected  void loadValidatedSettingsFrom(NodeSettingsRO settings)
          Sets new settings from the passed object in the model.
protected  void reset()
          Resets all internal data.
protected  void saveInternals(File internDir, ExecutionMonitor exec)
          Save internals of the derived NodeModel.
protected  void saveSettingsTo(NodeSettingsWO settings)
          Adds to the given NodeSettings the model specific settings.
(package private)  void setCompareColumn(String first, String second)
          Sets the columns that will be compared during execution.
protected  void validateSettings(NodeSettingsRO settings)
          Validates the settings in the passed NodeSettings object.
 
Methods inherited from class org.knime.core.node.NodeModel
addWarningListener, configure, continueLoop, execute, executeModel, getInHiLiteHandler, getLoopEndNode, getLoopStartNode, getNrInPorts, getNrOutPorts, getWarningMessage, notifyViews, notifyWarningListeners, peekFlowVariableDouble, peekFlowVariableInt, peekFlowVariableString, peekScopeVariableDouble, peekScopeVariableInt, peekScopeVariableString, pushFlowVariableDouble, pushFlowVariableInt, pushFlowVariableString, pushScopeVariableDouble, pushScopeVariableInt, pushScopeVariableString, removeWarningListener, setInHiLiteHandler, setWarningMessage, stateChanged
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

LOGGER

protected static final NodeLogger LOGGER
The node logger for this class.


FIRST_COMP_ID

static final String FIRST_COMP_ID
Identifier in model spec to address first column name to compare.

See Also:
Constant Field Values

SECOND_COMP_ID

static final String SECOND_COMP_ID
Identifier in model spec to address second column name to compare.

See Also:
Constant Field Values

INPORT

static final int INPORT
The input port 0.

See Also:
Constant Field Values

OUTPORT_0

static final int OUTPORT_0
The output port 0: confusion matrix.

See Also:
Constant Field Values

OUTPORT_1

static final int OUTPORT_1
The output port 1: accuracy measures.

See Also:
Constant Field Values
Constructor Detail

AccuracyScorerNodeModel

AccuracyScorerNodeModel()
Initializes a new AccuracyScorerNodeModel with one input and two outputs.

Method Detail

execute

protected BufferedDataTable[] execute(BufferedDataTable[] data,
                                      ExecutionContext exec)
                               throws CanceledExecutionException
Starts the scoring in the scorer.

Overrides:
execute in class NodeModel
Parameters:
data - the input data of length one
exec - the execution monitor
Returns:
the confusion matrix
Throws:
CanceledExecutionException - if user canceled execution
See Also:
NodeModel.execute(BufferedDataTable[],ExecutionContext)
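
As an illustration of the scoring step, the following is a minimal sketch of how a confusion matrix could be counted from two columns. It uses plain Java arrays instead of BufferedDataTable; the class name ConfusionMatrixSketch, the method score, and the choice of the first column as the row index are assumptions made for this example and are not taken from the KNIME sources.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the scoring step; not the KNIME implementation.
class ConfusionMatrixSketch {

    /**
     * Counts how often each pair (first value, second value) occurs.
     * Rows with a null (missing) value in either column are skipped.
     * Every non-null value must be contained in the values list.
     */
    static int[][] score(String[] first, String[] second, List<String> values) {
        int[][] matrix = new int[values.size()][values.size()];
        for (int i = 0; i < first.length; i++) {
            if (first[i] == null || second[i] == null) {
                continue; // missing value: row is not scored
            }
            int row = values.indexOf(first[i]);  // assumption: first column selects the row
            int col = values.indexOf(second[i]); // assumption: second column selects the column
            matrix[row][col]++;
        }
        return matrix;
    }

    public static void main(String[] args) {
        String[] first = {"a", "a", "b", "b", null};
        String[] second = {"a", "b", "b", "b", "a"};
        int[][] m = score(first, second, Arrays.asList("a", "b"));
        System.out.println(Arrays.deepToString(m)); // prints [[1, 1], [0, 2]]
    }
}
```

The diagonal of the resulting matrix holds the rows where both columns agree.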

reset

protected void reset()
Resets all internal data.

Specified by:
reset in class NodeModel

setCompareColumn

void setCompareColumn(String first,
                      String second)
Sets the columns that will be compared during execution.

Parameters:
first - the first column
second - the second column
Throws:
NullPointerException - if one of the parameters is null

configure

protected DataTableSpec[] configure(DataTableSpec[] inSpecs)
                             throws InvalidSettingsException
This function is called whenever the derived model should re-configure its output DataTableSpecs. Based on the given input data table spec(s) and the current model's settings, the derived model has to calculate the output data table spec and return them.

The passed DataTableSpec elements are never null but can be empty. The model may return null data table spec(s) for the outputs. But still, the model may be in an executable state. Note, after the model has been executed this function will not be called anymore, as the output DataTableSpecs are then being pulled from the output DataTables. A derived NodeModel that cannot provide any DataTableSpecs at its outputs before execution (because the table structure is unknown at this point) can return an array containing just null elements.

Implementation note: This method is called from the NodeModel.configure(PortObjectSpec[]) method unless that method is overridden.

Overrides:
configure in class NodeModel
Parameters:
inSpecs - An array of DataTableSpecs (as many as this model has inputs). Do NOT modify the contents of this array. None of the DataTableSpecs in the array can be null, but they can be empty, for example if the predecessor node is not yet connected or doesn't provide a DataTableSpec at its output port.
Returns:
An array of DataTableSpecs (as many as this model has outputs). They will be propagated to connected successor nodes. null DataTableSpec elements are changed to empty ones.
Throws:
InvalidSettingsException - if configure() failed, that is, the settings are inconsistent with the given DataTableSpec elements.

getCorrectCount

public int getCorrectCount()
Get the correct classification count, i.e. where both columns agree.

Returns:
the count of rows where the two columns have an equal value or -1 if the node is not executed

getFalseCount

public int getFalseCount()
Get the misclassification count, i.e. where both columns have different values.

Returns:
the count of rows where the two columns have an unequal value or -1 if the node is not executed

getNrRows

public int getNrRows()
Get the number of rows in the input table. This count can be different from getFalseCount() + getCorrectCount(), though it must be at least the sum of both. The difference is the number of rows containing a missing value in either of the target columns.

Returns:
number of rows in input table
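
The relation between the three counters can be sketched in plain Java as follows. This is only an illustration under the assumption that a missing value is represented by null; the class ScoreCountsSketch and its fields are hypothetical and do not mirror the node model's internals.

```java
// Hypothetical stand-in for the counters described above; not KNIME code.
class ScoreCountsSketch {
    final int correct; // rows where both columns agree
    final int wrong;   // rows where both columns differ
    final int rows;    // all rows, including those with missing values

    ScoreCountsSketch(String[] first, String[] second) {
        int c = 0;
        int w = 0;
        for (int i = 0; i < first.length; i++) {
            if (first[i] == null || second[i] == null) {
                continue; // missing value: counted in rows only
            }
            if (first[i].equals(second[i])) {
                c++;
            } else {
                w++;
            }
        }
        correct = c;
        wrong = w;
        rows = first.length;
    }

    public static void main(String[] args) {
        ScoreCountsSketch s = new ScoreCountsSketch(
            new String[] {"a", "a", "b", null},
            new String[] {"a", "b", "b", "b"});
        // rows - (correct + wrong) is the number of rows with a missing value
        System.out.println(s.correct + " " + s.wrong + " " + s.rows); // prints 2 1 4
    }
}
```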

getError

public double getError()
Returns:
the ratio of wrongly classified patterns to all patterns

getAccuracy

public double getAccuracy()
Returns:
the ratio of correctly classified patterns to all patterns
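
The two ratios returned by getError() and getAccuracy() can be sketched from the counters as shown below. Note the assumption: "all patterns" is taken here to mean the scored rows (correct plus wrong), which excludes rows with missing values; the page does not state which denominator the node uses. The class AccuracySketch and the NaN result for zero scored rows are choices made for this example.

```java
// Hypothetical sketch of the two ratios; not the KNIME implementation.
class AccuracySketch {

    /** Share of correctly classified rows among all scored rows. */
    static double accuracy(int correct, int wrong) {
        int total = correct + wrong; // assumption: rows with missing values are excluded
        return total == 0 ? Double.NaN : (double) correct / total;
    }

    /** Share of wrongly classified rows among all scored rows. */
    static double error(int correct, int wrong) {
        int total = correct + wrong;
        return total == 0 ? Double.NaN : (double) wrong / total;
    }

    public static void main(String[] args) {
        System.out.println(accuracy(3, 1)); // prints 0.75
        System.out.println(error(3, 1));    // prints 0.25
    }
}
```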

loadValidatedSettingsFrom

protected void loadValidatedSettingsFrom(NodeSettingsRO settings)
                                  throws InvalidSettingsException
Sets new settings from the passed object in the model. You can safely assume that the object passed has been successfully validated by the validateSettings(NodeSettingsRO) method. The model must set its internal configuration according to the settings object passed.

Specified by:
loadValidatedSettingsFrom in class NodeModel
Parameters:
settings - The settings to read.
Throws:
InvalidSettingsException - If a property is not available.
See Also:
NodeModel.saveSettingsTo(NodeSettingsWO), NodeModel.validateSettings(NodeSettingsRO)

saveSettingsTo

protected void saveSettingsTo(NodeSettingsWO settings)
Adds to the given NodeSettings the model specific settings. The settings don't need to be complete or consistent. If, right after startup, no valid settings are available this method can write either nothing or invalid settings.

This method is called by the Node if the current settings need to be saved or transferred to the node's dialog.

Specified by:
saveSettingsTo in class NodeModel
Parameters:
settings - The object to write settings into.
See Also:
NodeModel.loadValidatedSettingsFrom(NodeSettingsRO), NodeModel.validateSettings(NodeSettingsRO)

validateSettings

protected void validateSettings(NodeSettingsRO settings)
                         throws InvalidSettingsException
Validates the settings in the passed NodeSettings object. The specified settings should be checked for completeness and consistency. It must be possible to load a settings object validated here without any exception in the loadValidatedSettingsFrom(NodeSettingsRO) method. The method must not change the current settings in the model; it is supposed to just check them. If some settings are missing, invalid, inconsistent, or just not right, throw an exception with a message useful to the user.

Specified by:
validateSettings in class NodeModel
Parameters:
settings - The settings to validate.
Throws:
InvalidSettingsException - If the validation of the settings failed.
See Also:
NodeModel.saveSettingsTo(NodeSettingsWO), NodeModel.loadValidatedSettingsFrom(NodeSettingsRO)
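
The contract between saveSettingsTo, validateSettings and loadValidatedSettingsFrom can be illustrated with a simplified sketch. A plain Map stands in for the KNIME settings objects, IllegalArgumentException stands in for InvalidSettingsException, and the key names "first" and "second" are assumptions; the real values of FIRST_COMP_ID and SECOND_COMP_ID are not shown on this page.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; a plain Map stands in for the KNIME settings objects.
class SettingsLifecycleSketch {
    static final String FIRST_KEY = "first";   // assumed key name
    static final String SECOND_KEY = "second"; // assumed key name

    private String first;
    private String second;

    /** Checks the settings without touching the model's fields. */
    void validate(Map<String, String> settings) {
        if (settings.get(FIRST_KEY) == null || settings.get(SECOND_KEY) == null) {
            throw new IllegalArgumentException("Both compare columns must be set");
        }
    }

    /** Applies settings that have already passed validate(). */
    void load(Map<String, String> settings) {
        first = settings.get(FIRST_KEY);
        second = settings.get(SECOND_KEY);
    }

    /** Writes the current state; it may be incomplete right after startup. */
    void save(Map<String, String> settings) {
        if (first != null) {
            settings.put(FIRST_KEY, first);
        }
        if (second != null) {
            settings.put(SECOND_KEY, second);
        }
    }

    public static void main(String[] args) {
        SettingsLifecycleSketch model = new SettingsLifecycleSketch();
        Map<String, String> in = new HashMap<>();
        in.put(FIRST_KEY, "class");
        in.put(SECOND_KEY, "prediction");
        model.validate(in); // throws if incomplete, changes nothing
        model.load(in);     // safe because validate() succeeded
        Map<String, String> out = new HashMap<>();
        model.save(out);
        System.out.println(out.equals(in)); // prints true
    }
}
```

Keeping validation free of side effects means a rejected settings object leaves the model in its previous, consistent state.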

getSelectedSet

Set<RowKey> getSelectedSet(Point[] cells)
Determines the row keys (as RowKeys) which belong to the given cells of the confusion matrix.

Parameters:
cells - the cells of the confusion matrix for which the keys should be returned
Returns:
a set of RowKeys containing the row keys of the given cells
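
A possible way to picture this lookup is a per-cell store of row keys whose contents are merged for the requested cells. The sketch below is an assumption about the general idea only: the nested-list key store, the use of String in place of RowKey, and int pairs in place of Point are all hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: keyStore.get(x).get(y) holds the row keys of cell (x, y).
class SelectedSetSketch {

    /** Returns the union of the row keys stored for the given cells. */
    static Set<String> selectedKeys(List<List<Set<String>>> keyStore, int[][] cells) {
        Set<String> result = new HashSet<>();
        for (int[] cell : cells) {
            result.addAll(keyStore.get(cell[0]).get(cell[1]));
        }
        return result;
    }

    static Set<String> set(String... keys) {
        return new HashSet<>(Arrays.asList(keys));
    }

    public static void main(String[] args) {
        List<List<Set<String>>> store = Arrays.asList(
            Arrays.asList(set("Row0"), set("Row1", "Row2")),
            Arrays.asList(set(), set("Row3")));
        // Selecting cells (0, 1) and (1, 1) yields Row1, Row2 and Row3.
        System.out.println(selectedKeys(store, new int[][] {{0, 1}, {1, 1}}));
    }
}
```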

determineColValues

protected DataCell[] determineColValues(BufferedDataTable in,
                                        int index1,
                                        int index2,
                                        ExecutionMonitor exec)
                                 throws CanceledExecutionException
Called to determine all possible values in the respective columns.

Parameters:
in - the input table
index1 - the first column to compare
index2 - the second column to compare
exec - object to check with if user canceled
Returns:
the order of rows and columns in the confusion matrix
Throws:
CanceledExecutionException - if user canceled operation
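
As a rough illustration, the distinct values of both columns could be collected as in the sketch below. The ordering (first occurrence while scanning rows) and the skipping of null values are assumptions for this example; the page does not specify how the node orders the values. The class ColumnValuesSketch and the String[][] table are hypothetical stand-ins for the KNIME types.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch; a String[][] stands in for the input table.
class ColumnValuesSketch {

    /** Collects the distinct non-null values of both columns in first-seen order. */
    static String[] distinctValues(String[][] rows, int index1, int index2) {
        Set<String> values = new LinkedHashSet<>(); // keeps insertion order
        for (String[] row : rows) {
            if (row[index1] != null) {
                values.add(row[index1]);
            }
            if (row[index2] != null) {
                values.add(row[index2]);
            }
        }
        return values.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[][] rows = {{"a", "b"}, {"b", "c"}, {"a", null}};
        System.out.println(Arrays.toString(distinctValues(rows, 0, 1))); // prints [a, b, c]
    }
}
```

The returned array then fixes the order of both the rows and the columns of the confusion matrix.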

findValue

protected static int findValue(DataCell[] source,
                               DataCell key)
Finds the position where key is located in source. It must be ensured that the key is indeed in the argument array.

Parameters:
source - the source array
key - the key to find
Returns:
the index in source where key is located
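
A lookup of this kind can be sketched as a linear search. Because the page only states that the key must be present, the behavior for a missing key is an assumption here: the sketch throws an IllegalStateException. Object is used in place of DataCell, and the class name FindValueSketch is hypothetical.

```java
// Hypothetical sketch; Object stands in for DataCell.
class FindValueSketch {

    /** Returns the index of the first element of source that equals key. */
    static int findValue(Object[] source, Object key) {
        for (int i = 0; i < source.length; i++) {
            if (source[i].equals(key)) {
                return i;
            }
        }
        // Assumption: callers guarantee the key exists, so reaching this is a bug.
        throw new IllegalStateException("Key not found: " + key);
    }

    public static void main(String[] args) {
        Object[] values = {"x", "y", "z"};
        System.out.println(findValue(values, "z")); // prints 2
    }
}
```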

containsConfusionMatrixKeys

boolean containsConfusionMatrixKeys(int x,
                                    int y,
                                    Set<RowKey> keys)
Checks if the specified confusion matrix cell contains at least one of the given keys.

Parameters:
x - the x value to specify the matrix cell
y - the y value to specify the matrix cell
keys - the keys to check
Returns:
true if at least one key is contained in the specified cell
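
The "at least one key" check amounts to testing whether two sets intersect. The following sketch shows that idea with java.util.Collections.disjoint; the nested-list key store and the use of String in place of RowKey are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: keyStore.get(x).get(y) holds the row keys of cell (x, y).
class ContainsKeysSketch {

    /** True if the cell at (x, y) shares at least one key with the given set. */
    static boolean containsKeys(List<List<Set<String>>> keyStore,
                                int x, int y, Set<String> keys) {
        return !Collections.disjoint(keyStore.get(x).get(y), keys);
    }

    static Set<String> set(String... keys) {
        return new HashSet<>(Arrays.asList(keys));
    }

    public static void main(String[] args) {
        List<List<Set<String>>> store = Arrays.asList(
            Arrays.asList(set("Row0"), set("Row1", "Row2")),
            Arrays.asList(set(), set("Row3")));
        System.out.println(containsKeys(store, 0, 1, set("Row2", "Row9"))); // prints true
        System.out.println(containsKeys(store, 1, 0, set("Row2", "Row9"))); // prints false
    }
}
```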

getCompleteHilitedCells

Point[] getCompleteHilitedCells(Set<RowKey> keys)
Returns all cells of the confusion matrix (as Points) if the given key set contains all keys of that cell.

Parameters:
keys - the keys to check for
Returns:
the cells that fulfill the above condition
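
In contrast to the "at least one key" check, a cell qualifies here only if every one of its keys is in the given set. The sketch below illustrates that with Set.containsAll. It skips empty cells, which is an assumption: the page does not say how cells without any row keys are treated. The key store layout, String keys, and int pairs in place of Point are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: keyStore.get(x).get(y) holds the row keys of cell (x, y).
class CompleteCellsSketch {

    /** Returns the {x, y} positions of all cells whose keys are all in the given set. */
    static List<int[]> completeCells(List<List<Set<String>>> keyStore, Set<String> keys) {
        List<int[]> result = new ArrayList<>();
        for (int x = 0; x < keyStore.size(); x++) {
            for (int y = 0; y < keyStore.get(x).size(); y++) {
                Set<String> cellKeys = keyStore.get(x).get(y);
                // Assumption: empty cells are skipped rather than reported.
                if (!cellKeys.isEmpty() && keys.containsAll(cellKeys)) {
                    result.add(new int[] {x, y});
                }
            }
        }
        return result;
    }

    static Set<String> set(String... keys) {
        return new HashSet<>(Arrays.asList(keys));
    }

    public static void main(String[] args) {
        List<List<Set<String>>> store = Arrays.asList(
            Arrays.asList(set("Row0"), set("Row1", "Row2")),
            Arrays.asList(set(), set("Row3")));
        for (int[] cell : completeCells(store, set("Row0", "Row1", "Row3"))) {
            System.out.println(Arrays.toString(cell)); // prints [0, 0] then [1, 1]
        }
    }
}
```

Cell (0, 1) is not reported in the example because only one of its two keys is in the given set.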

loadInternals

protected void loadInternals(File internDir,
                             ExecutionMonitor exec)
                      throws IOException
Load internals into the derived NodeModel. This method is only called if the Node was executed. Read all your internal structures from the given file directory to create your internal data structure which is necessary to provide all node functionalities after the workflow is loaded, e.g. view content and/or hilite mapping.

Specified by:
loadInternals in class NodeModel
Parameters:
internDir - The directory to read from.
exec - Used to report progress and to cancel the load process.
Throws:
IOException - If an error occurs during reading from this dir.
See Also:
NodeModel.saveInternals(File,ExecutionMonitor)

saveInternals

protected void saveInternals(File internDir,
                             ExecutionMonitor exec)
                      throws IOException
Save internals of the derived NodeModel. This method is only called if the Node is executed. Write all your internal structures into the given file directory which are necessary to recreate this model when the workflow is loaded, e.g. view content and/or hilite mapping.

Specified by:
saveInternals in class NodeModel
Parameters:
internDir - The directory to write into.
exec - Used to report progress and to cancel the save process.
Throws:
IOException - If an error occurs during writing to this dir.
See Also:
NodeModel.loadInternals(File,ExecutionMonitor)

getScorerCount

int[][] getScorerCount()
Returns:
the confusion matrix as int 2-D array

getRocCurve

BitSet getRocCurve()
Returns a bit set with data for the ROC curve. A set bit means a correctly classified example; an unset bit means a wrongly classified example. The number of interesting bits is BitSet.length() - 1, i.e. the last set bit must be ignored because it is just the end marker.

Returns:
a bit set
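
The end marker is needed because BitSet.length() reports the index of the highest set bit plus one, so trailing unset bits (wrongly classified examples at the end) would otherwise be lost. The sketch below shows one way such a bit set could be written and read back; the class RocBitsSketch and its methods are hypothetical, and the order in which the node stores the examples is not covered here.

```java
import java.util.Arrays;
import java.util.BitSet;

// Hypothetical sketch of the end-marker encoding described above.
class RocBitsSketch {

    /** Stores one bit per example, followed by an always-set end marker. */
    static BitSet encode(boolean[] correct) {
        BitSet bits = new BitSet(correct.length + 1);
        for (int i = 0; i < correct.length; i++) {
            bits.set(i, correct[i]);
        }
        bits.set(correct.length); // end marker
        return bits;
    }

    /** Reads the examples back, ignoring the end marker. */
    static boolean[] decode(BitSet bits) {
        if (bits.isEmpty()) {
            return new boolean[0]; // no marker present, nothing to read
        }
        int n = bits.length() - 1; // number of interesting bits
        boolean[] correct = new boolean[n];
        for (int i = 0; i < n; i++) {
            correct[i] = bits.get(i);
        }
        return correct;
    }

    public static void main(String[] args) {
        BitSet bits = encode(new boolean[] {true, false, false});
        System.out.println(bits.length());                 // prints 4
        System.out.println(Arrays.toString(decode(bits))); // prints [true, false, false]
    }
}
```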

getValues

String[] getValues()
Returns:
the attribute names of the confusion matrix

getDataArray

public DataArray getDataArray(int index)
Provides the data that should be visualized. The index can be used, if a NodeModel has two inports and both data should be visualized. Then the index provides means to determine which DataArray should be returned.

Specified by:
getDataArray in interface DataProvider
Parameters:
index - if the data of more than one data table should be visualized.
Returns:
the data as a data array.

getFirstCompareColumn

public String getFirstCompareColumn()
Returns the first column to compare.

Returns:
the first column to compare

getSecondCompareColumn

public String getSecondCompareColumn()
Returns the second column to compare.

Returns:
the second column to compare

getOutHiLiteHandler

protected HiLiteHandler getOutHiLiteHandler(int outIndex)
Returns the HiLiteHandler for the given output index. This default implementation simply passes on the handler of input port 0 or generates a new one if this node has no inputs.

This method is intended to be overridden.

Overrides:
getOutHiLiteHandler in class NodeModel
Parameters:
outIndex - The output index.
Returns:
HiLiteHandler for the given output port.


Copyright, 2003 - 2010. All rights reserved.
University of Konstanz, Germany.
Chair for Bioinformatics and Information Mining, Prof. Dr. Michael R. Berthold.
You may not modify, publish, transmit, transfer or sell, reproduce, create derivative works from, distribute, perform, display, or in any way exploit any of the content, in whole or in part, except as otherwise expressly permitted in writing by the copyright owner or as specified in the license file distributed with this product.