org.knime.base.data.statistics
Class StatisticsTable

java.lang.Object
  extended by org.knime.base.data.statistics.StatisticsTable
All Implemented Interfaces:
Iterable<DataRow>, DataTable

Deprecated. Use Statistics2Table instead.

@Deprecated
public class StatisticsTable
extends Object
implements DataTable

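A wrapper table that computes statistics for each column of the wrapped table. The following statistics are available through the getters listed below: minimum, maximum, mean, sum, variance, standard deviation, and the number of missing values.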

Important: If you need all statistical values of a DataTable, consider calling calculateAllMoments(ExecutionMonitor) first; afterwards the individual getters return quickly.
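
The idea of computing every statistic up front can be illustrated without the KNIME classes. The following is a minimal stand-alone sketch, not the StatisticsTable implementation: it assumes a table is a double[][] in which Double.NaN stands for a missing cell, and the class name ColumnStatsSketch is hypothetical.

```java
import java.util.Arrays;

/** Hypothetical stand-alone sketch; not part of the KNIME API. */
final class ColumnStatsSketch {
    final double[] sum;
    final double[] mean;
    final double[] min;
    final double[] max;
    final int[] missing;

    /** Assumes rows[r][c] holds a cell value and Double.NaN marks a missing cell. */
    ColumnStatsSketch(double[][] rows, int nrCols) {
        sum = new double[nrCols];
        mean = new double[nrCols];
        min = new double[nrCols];
        max = new double[nrCols];
        missing = new int[nrCols];
        int[] valid = new int[nrCols];
        Arrays.fill(min, Double.POSITIVE_INFINITY);
        Arrays.fill(max, Double.NEGATIVE_INFINITY);
        // Single traversal: every statistic is updated while the row is at hand.
        for (double[] row : rows) {
            for (int c = 0; c < nrCols; c++) {
                double v = row[c];
                if (Double.isNaN(v)) {
                    missing[c]++;
                    continue;
                }
                valid[c]++;
                sum[c] += v;
                min[c] = Math.min(min[c], v);
                max[c] = Math.max(max[c], v);
            }
        }
        // Columns without any valid cell report NaN, mirroring the documented getters.
        for (int c = 0; c < nrCols; c++) {
            if (valid[c] == 0) {
                sum[c] = mean[c] = min[c] = max[c] = Double.NaN;
            } else {
                mean[c] = sum[c] / valid[c];
            }
        }
    }
}
```

Once the constructor has run, reading any field is a plain array access, which is the reason for doing the traversal first.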

Author:
Nicolas Cebron, University of Konstanz

Constructor Summary
protected StatisticsTable(DataTable table)
          Deprecated. To be used in derived classes that do additional calculations.
  StatisticsTable(DataTable table, ExecutionMonitor exec)
          Deprecated. Create new wrapper table from an existing one.
 
Method Summary
protected  void calculateAllMoments(double rowCount, ExecutionMonitor exec)
          Deprecated. Calculates all the statistical moments in one pass.
protected  void calculateAllMoments(ExecutionMonitor exec)
          Deprecated. Calculates all the statistical moments in one pass.
protected  void calculateMomentInSubClass(DataRow row)
          Deprecated. Derived classes may do additional calculations here.
 DataTableSpec getDataTableSpec()
          Deprecated. Produces a DataTableSpec for the statistics table which contains the range values calculated here.
 double[] getdoubleMax()
          Deprecated. Returns the maximum for all columns.
 double[] getdoubleMin()
          Deprecated. Returns the minimum for all columns.
 DataCell[] getMax()
          Deprecated. Returns the maximum for all columns.
 DataCell getMax(int colIdx)
          Deprecated. Returns the maximum for the desired column.
 double[] getMean()
          Deprecated. Returns the means for all columns.
 double getMean(int colIdx)
          Deprecated. Returns the mean for the desired column.
 DataCell[] getMin()
          Deprecated. Returns the minimum for all columns.
 DataCell getMin(int colIdx)
          Deprecated. Returns the minimum for the desired column.
 int getNrRows()
          Deprecated. Computes the number of rows of the data table.
 int[] getNumberMissingValues()
          Deprecated. Returns an array of the number of missing values for each dimension.
 int getNumberMissingValues(int colIdx)
          Deprecated. Returns the number of missing values for the given column index.
 double[] getStandardDeviation()
          Deprecated. Returns the standard deviation for all columns.
 double getStandardDeviation(int colIdx)
          Deprecated. Calculates the standard deviation for the desired column.
 double[] getSum()
          Deprecated. Returns the sum values for all columns.
 double getSum(int colIdx)
          Deprecated. Returns the sum for the desired column.
protected  DataTable getUnderlyingTable()
          Deprecated. Getter for the underlying table.
 double[] getVariance()
          Deprecated. Returns the variance for all columns.
 double getVariance(int colIdx)
          Deprecated. Returns the variance for the desired column.
 RowIterator iterator()
          Deprecated. Returns the row iterator of the original data table.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

StatisticsTable

protected StatisticsTable(DataTable table)
Deprecated. 
To be used in derived classes that do additional calculations. Derived classes must call calculateAllMoments once their own initialization is done.

Parameters:
table - To wrap.

StatisticsTable

public StatisticsTable(DataTable table,
                       ExecutionMonitor exec)
                throws CanceledExecutionException
Deprecated. 
Creates a new wrapper table from an existing one. This constructor calculates all values, which requires traversing the entire specified table twice. The user can cancel the operation if an execution monitor is passed.

Parameters:
table - table to be wrapped
exec - an object to check with if user canceled operation
Throws:
CanceledExecutionException - if user canceled
See Also:
DataTable.getDataTableSpec()
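
The cancellation contract described for this constructor can be sketched independently of KNIME. In the example below, CanceledSketchException is a hypothetical stand-in for CanceledExecutionException, and a BooleanSupplier plays the role of the execution monitor; neither is the real API.

```java
import java.util.List;
import java.util.function.BooleanSupplier;

/** Hypothetical checked exception standing in for CanceledExecutionException. */
class CanceledSketchException extends Exception {
    CanceledSketchException(String msg) {
        super(msg);
    }
}

/** Hypothetical sketch of a traversal that can be canceled between rows. */
final class CancelableTraversalSketch {
    /** Sums the values, asking the supplier before every row whether the user canceled. */
    static double sum(List<Double> rows, BooleanSupplier canceled)
            throws CanceledSketchException {
        double total = 0.0;
        for (double v : rows) {
            if (canceled.getAsBoolean()) {
                throw new CanceledSketchException("canceled by user");
            }
            total += v;
        }
        return total;
    }
}
```

Checking once per row keeps the delay between a cancel request and the exception short, at the cost of one extra call per row.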
Method Detail

getDataTableSpec

public DataTableSpec getDataTableSpec()
Deprecated. 
Produces a DataTableSpec for the statistics table which contains the range values calculated here.

Specified by:
getDataTableSpec in interface DataTable
Returns:
a table spec with the ranges set in its columns. If the spec of the underlying table already had ranges set, nothing changes.

iterator

public RowIterator iterator()
Deprecated. 
Returns the row iterator of the original data table, which returns the rows of the table one by one.

Specified by:
iterator in interface Iterable<DataRow>
Specified by:
iterator in interface DataTable
Returns:
row iterator
See Also:
DataRow
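
Because the iterator comes from the original table, the wrapper itself can be used in a for-each loop. A minimal sketch of that delegation, assuming a generic row type R in place of DataRow and a hypothetical class name:

```java
import java.util.Iterator;

/** Hypothetical wrapper that forwards iteration to the table it wraps. */
final class WrappingTableSketch<R> implements Iterable<R> {
    private final Iterable<R> underlying;

    WrappingTableSketch(Iterable<R> underlying) {
        this.underlying = underlying;
    }

    /** Hands out the iterator of the wrapped table unchanged. */
    @Override
    public Iterator<R> iterator() {
        return underlying.iterator();
    }
}
```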

getNrRows

public int getNrRows()
Deprecated. 
Computes the number of rows of the data table.

Returns:
number of rows

getUnderlyingTable

protected DataTable getUnderlyingTable()
Deprecated. 
Getter for the underlying table.

Returns:
Table as passed in constructor.

calculateAllMoments

protected void calculateAllMoments(ExecutionMonitor exec)
                            throws CanceledExecutionException
Deprecated. 
Calculates all the statistical moments in one pass. After this call, the statistical moments can be obtained quickly from all the other methods.

Parameters:
exec - object to check with if user canceled the operation
Throws:
CanceledExecutionException - if user canceled

calculateAllMoments

protected void calculateAllMoments(double rowCount,
                                   ExecutionMonitor exec)
                            throws CanceledExecutionException
Deprecated. 
Calculates all the statistical moments in one pass. After this call, the statistical moments can be obtained quickly from all the other methods.

Parameters:
rowCount - Row count of table for progress, may be NaN if unknown.
exec - object to check with if user canceled the operation
Throws:
CanceledExecutionException - if user canceled
IllegalArgumentException - if rowCount argument < 0
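
The rowCount contract (NaN allowed, negative values rejected) can be illustrated with a small helper. This is a sketch under assumptions: ProgressSketch and fraction are hypothetical names, and how the real method reports progress is not stated here.

```java
/** Hypothetical helper illustrating the rowCount contract. */
final class ProgressSketch {
    /**
     * Returns the fraction done in [0, 1], or NaN when the total row count is unknown.
     * Negative counts are rejected; NaN passes the check because NaN < 0 is false.
     */
    static double fraction(long rowsSeen, double rowCount) {
        if (rowCount < 0) {
            throw new IllegalArgumentException("rowCount must not be negative: " + rowCount);
        }
        if (Double.isNaN(rowCount) || rowCount == 0) {
            return Double.NaN; // unknown or empty table: no meaningful fraction
        }
        return Math.min(1.0, rowsSeen / rowCount);
    }
}
```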

calculateMomentInSubClass

protected void calculateMomentInSubClass(DataRow row)
Deprecated. 
Derived classes may do additional calculations here. This method is called from calculateAllMoments(ExecutionMonitor) with all of the rows.

Parameters:
row - For processing.
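
The per-row hook follows the template-method pattern: the base class owns the traversal and calls an overridable method for every row. A stand-alone sketch with hypothetical class and method names (rows are modeled as double[], not DataRow):

```java
import java.util.List;

/** Hypothetical base class: walks all rows once and offers a per-row hook. */
class MomentsBaseSketch {
    private double sum;

    /** Traverses the rows, updates the built-in statistic and calls the hook. */
    final void calculateAll(List<double[]> rows) {
        for (double[] row : rows) {
            for (double v : row) {
                sum += v;
            }
            onRow(row); // hook, called exactly once per row
        }
    }

    /** Does nothing by default; subclasses may override. */
    protected void onRow(double[] row) {
    }

    double getSum() {
        return sum;
    }
}

/** Hypothetical subclass that additionally tracks the widest row seen. */
class WidestRowSketch extends MomentsBaseSketch {
    private int widest;

    @Override
    protected void onRow(double[] row) {
        widest = Math.max(widest, row.length);
    }

    int getWidest() {
        return widest;
    }
}
```

The subclass gets its extra statistic from the same traversal, so the table is not read an additional time.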

getMean

public double getMean(int colIdx)
Deprecated. 
Returns the mean for the desired column. Throws an exception if the specified column is not compatible with DoubleValue. Returns Double.NaN if the specified column contains only missing cells or if the table is empty.

Parameters:
colIdx - the column index for which the mean is calculated
Returns:
mean value or Double.NaN

getMean

public double[] getMean()
Deprecated. 
Returns the means for all columns. An entry is Double.NaN if the column type is not DoubleValue.

Returns:
an array of mean values with an item for each column, which is Double.NaN if the column type is not DoubleValue

getSum

public double getSum(int colIdx)
Deprecated. 
Returns the sum for the desired column. Throws an exception if the specified column is not compatible with DoubleValue. Returns Double.NaN if the specified column contains only missing cells or if the table is empty.

Parameters:
colIdx - the column index for which the sum is calculated
Returns:
sum value or Double.NaN

getSum

public double[] getSum()
Deprecated. 
Returns the sum values for all columns. An entry is Double.NaN if the column type is not DoubleValue.

Returns:
an array of sum values with an item for each column, which is Double.NaN if the column type is not DoubleValue

getNumberMissingValues

public int[] getNumberMissingValues()
Deprecated. 
Returns an array of the number of missing values for each dimension.

Returns:
number of missing values for each dimension

getNumberMissingValues

public int getNumberMissingValues(int colIdx)
Deprecated. 
Returns the number of missing values for the given column index.

Parameters:
colIdx - column index to consider
Returns:
number of missing values in this column

getVariance

public double getVariance(int colIdx)
Deprecated. 
Returns the variance for the desired column. Throws an exception if the specified column is not compatible with DoubleValue. Returns Double.NaN if the specified column contains only missing cells or if the table is empty.

Parameters:
colIdx - the column index for which the variance is calculated
Returns:
variance or Double.NaN

getVariance

public double[] getVariance()
Deprecated. 
Returns the variance for all columns. Returns Double.NaN if the column type is not of type DoubleValue, if the entire column contains missing cells, or if the table is empty.

Returns:
variance values

getStandardDeviation

public double getStandardDeviation(int colIdx)
Deprecated. 
Calculates the standard deviation for the desired column. Throws an exception if the column type is not compatible with DoubleValue. Returns zero if the column contains only missing cells or the table is empty.

Parameters:
colIdx - the index of the column for which the standard deviation is to be calculated
Returns:
standard deviation, or zero if the column contains only missing values or the table is empty

getStandardDeviation

public double[] getStandardDeviation()
Deprecated. 
Returns the standard deviation for all columns. The returned array contains no valid value (i.e. Double.NaN) for columns that are not compatible with DoubleValue.

Returns:
standard deviation values
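
The single-column getters document different results for a column without valid cells: the variance is Double.NaN, while the standard deviation is zero. The sketch below reproduces only that convention. It assumes the sample variance (n - 1 denominator), returns 0 for a single valid value, and uses Double.NaN as the missing marker; none of these choices is confirmed by this page, and SpreadSketch is a hypothetical name.

```java
/** Hypothetical sketch of the documented empty-column conventions. */
final class SpreadSketch {
    /** Sample variance over the non-NaN values (assumed estimator); NaN if there are none. */
    static double variance(double[] column) {
        int n = 0;
        double sum = 0.0;
        for (double v : column) {
            if (!Double.isNaN(v)) {
                n++;
                sum += v;
            }
        }
        if (n == 0) {
            return Double.NaN; // only missing cells or empty table
        }
        if (n == 1) {
            return 0.0; // sketch-specific choice for a single valid value
        }
        double mean = sum / n;
        double squares = 0.0;
        for (double v : column) {
            if (!Double.isNaN(v)) {
                double d = v - mean;
                squares += d * d;
            }
        }
        return squares / (n - 1);
    }

    /** Square root of the variance; zero where the variance is undefined. */
    static double standardDeviation(double[] column) {
        double var = variance(column);
        return Double.isNaN(var) ? 0.0 : Math.sqrt(var);
    }
}
```

Callers that need to tell "no data" apart from "no spread" should therefore check the variance or the missing-value count, not the standard deviation.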

getMin

public DataCell getMin(int colIdx)
Deprecated. 
Returns the minimum for the desired column. Returns a missing cell, if the column contains only missing cells or if the table is empty.

Parameters:
colIdx - the index of the column for which the minimum is calculated
Returns:
minimum or a missing cell if the column contains only missing cells, or if the table is empty

getMin

public DataCell[] getMin()
Deprecated. 
Returns the minimum for all columns. Will be a missing cell for columns that only contain missing cells or for empty data tables.

Returns:
the minimum values

getdoubleMin

public double[] getdoubleMin()
Deprecated. 
Returns the minimum for all columns. Will be Double.NaN for columns that only contain missing cells or for empty data tables.

Returns:
the minimum values

getMax

public DataCell getMax(int colIdx)
Deprecated. 
Returns the maximum for the desired column. Returns a missing cell, if the column contains only missing cells or if the table is empty.

Parameters:
colIdx - the index of the column for which the maximum is calculated
Returns:
maximum or a missing cell if the column contains only missing cells, or if the table is empty

getMax

public DataCell[] getMax()
Deprecated. 
Returns the maximum for all columns. Will be a missing cell for columns that only contain missing cells or for empty data tables.

Returns:
the maximum values

getdoubleMax

public double[] getdoubleMax()
Deprecated. 
Returns the maximum for all columns. Will be Double.NaN for columns that only contain missing cells or for empty data tables.

Returns:
the maximum values


Copyright, 2003 - 2010. All rights reserved.
University of Konstanz, Germany.
Chair for Bioinformatics and Information Mining, Prof. Dr. Michael R. Berthold.
You may not modify, publish, transmit, transfer or sell, reproduce, create derivative works from, distribute, perform, display, or in any way exploit any of the content, in whole or in part, except as otherwise expressly permitted in writing by the copyright owner or as specified in the license file distributed with this product.