org.knime.base.node.util
Class DefaultDataArray

java.lang.Object
  extended by org.knime.base.node.util.DefaultDataArray
All Implemented Interfaces:
Iterable<DataRow>, DataArray, DataTable

public class DefaultDataArray
extends Object
implements DataArray

Locally stores a fixed number of rows and provides random access to them. It maintains the min and max value for each column (with respect to the row sample stored, not the entire data table). These values can be changed if better limits are known. It also provides the set of all values seen for each string column (i.e. all values appearing in the rows stored, not the entire data table). If the maximum number of possible values (2000) is exceeded, no possible values are available for that column.

Author:
Peter Ohl, University of Konstanz

Constructor Summary
DefaultDataArray(DataTable dTable, int firstRow, int numOfRows)
          Constructs a random access container holding a certain number of rows from the data table passed in.
DefaultDataArray(DataTable dTable, int firstRow, int numOfRows, ExecutionMonitor execMon)
          Same as the three-argument constructor, but allows user cancellation from a progress monitor while the container is filled.
 
Method Summary
 DataTableSpec getDataTableSpec()
          Get the table spec corresponding to the rows.
 int getFirstRowNumber()
          
 DataCell getMaxValue(int colIdx)
          
 DataCell getMinValue(int colIdx)
          
 DataRow getRow(int idx)
          Returns the row from the container with index idx.
 Set<DataCell> getValues(int colIdx)
          Returns a set of all different values seen in the specified column.
 RowIterator iterator()
          Returns a row iterator which returns each row one-by-one from the table.
 void setMaxValue(int colIdx, DataCell newMaxValue)
          Sets a new max value for the specified column.
 void setMinValue(int colIdx, DataCell newMinValue)
          Sets a new min value for the specified column.
 int size()
          
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

DefaultDataArray

public DefaultDataArray(DataTable dTable,
                        int firstRow,
                        int numOfRows)
Constructs a random access container holding a certain number of rows from the data table passed in. It stores the specified number of rows, starting from the row given in the "firstRow" parameter (where the first row of the table is number 1). The stored rows can later be accessed by index, always starting at index zero.

Parameters:
dTable - the data table to read the rows from
firstRow - the first row to store (must be greater than zero)
numOfRows - the number of rows to store (must be zero or more)
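
The row-window semantics described above can be sketched without KNIME on the classpath. This is a minimal stand-in, not the library's implementation: `RowWindowSketch` and `copyWindow` are hypothetical names, a plain `Iterable` plays the role of the `DataTable`, and the exception type used for invalid arguments is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in (not KNIME code) for the constructor's row window. */
final class RowWindowSketch {

    /**
     * Copies up to numOfRows rows, starting at the 1-based row firstRow.
     * Fewer rows are returned if the table ends early.
     */
    static <T> List<T> copyWindow(Iterable<T> table, int firstRow, int numOfRows) {
        if (firstRow < 1) {
            // Assumption: the real class may signal this differently.
            throw new IllegalArgumentException("firstRow must be greater than zero");
        }
        if (numOfRows < 0) {
            throw new IllegalArgumentException("numOfRows must be zero or more");
        }
        List<T> stored = new ArrayList<>();
        Iterator<T> it = table.iterator();
        // Skip the rows in front of the window (row numbers 1 .. firstRow - 1).
        for (int skipped = 1; skipped < firstRow && it.hasNext(); skipped++) {
            it.next();
        }
        // Copy rows until the window is full or the table runs out.
        while (stored.size() < numOfRows && it.hasNext()) {
            stored.add(it.next());
        }
        return stored;
    }

    public static void main(String[] args) {
        List<String> table = Arrays.asList("r1", "r2", "r3", "r4", "r5");
        List<String> window = copyWindow(table, 2, 3);
        System.out.println(window);        // rows 2..4 of the table
        System.out.println(window.get(0)); // index 0 holds table row 2
    }
}
```

Note that the window is addressed with a 1-based row number when it is created, but with a zero-based index once it has been stored.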

DefaultDataArray

public DefaultDataArray(DataTable dTable,
                        int firstRow,
                        int numOfRows,
                        ExecutionMonitor execMon)
                 throws CanceledExecutionException
Same as the three-argument constructor, but allows user cancellation from a progress monitor while the container is filled.

Parameters:
dTable - the data table to read the rows from
firstRow - the first row to store (must be greater than zero)
numOfRows - the number of rows to store (must be zero or more)
execMon - the object listening to our progress and providing cancel functionality
Throws:
CanceledExecutionException - if the construction was canceled
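
The cancellation pattern can be illustrated with a small fill loop that checks a cancel flag before storing each row. This is only a sketch under stated assumptions: `CancelableFillSketch`, `CanceledException`, and the `BooleanSupplier` flag are hypothetical stand-ins for the KNIME types, and how often the real constructor checks for cancellation is not specified in this documentation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BooleanSupplier;

/** Hypothetical stand-in (not KNIME code) for a cancelable fill loop. */
final class CancelableFillSketch {

    /** Stand-in for a checked "canceled" exception. */
    static final class CanceledException extends Exception {
        CanceledException(String message) {
            super(message);
        }
    }

    /** Copies rows one by one, checking for cancellation before each row. */
    static <T> List<T> fill(Iterable<T> table, BooleanSupplier cancelRequested)
            throws CanceledException {
        List<T> stored = new ArrayList<>();
        for (T row : table) {
            if (cancelRequested.getAsBoolean()) {
                throw new CanceledException("canceled after " + stored.size() + " rows");
            }
            stored.add(row);
        }
        return stored;
    }

    public static void main(String[] args) {
        List<String> table = Arrays.asList("r1", "r2", "r3");
        try {
            System.out.println(fill(table, () -> false)); // never canceled
            fill(table, () -> true);                       // canceled before the first row
        } catch (CanceledException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Because the exception is checked, callers are forced to decide what a partially filled container means for them; in this sketch the partial list is simply discarded.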
Method Detail

getRow

public DataRow getRow(int idx)
Returns the row from the container with index idx. Index starts at zero and must be less than the size of the container (which could be less than the number of rows requested at construction time as the table could be shorter than that). The original row number in the table can be reconstructed by adding the index to the result of the DataArray.getFirstRowNumber() method.

Specified by:
getRow in interface DataArray
Parameters:
idx - the index of the row to return (must be at least 0 and less than the size of the row container)
Returns:
the row from the container with index idx
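
The index rules for getRow reduce to two small pieces of arithmetic, sketched here in isolation. `RowIndexSketch` and its methods are hypothetical names, and the use of `IndexOutOfBoundsException` for an invalid index is an assumption, since this page does not say how the real method reacts.

```java
/** Hypothetical stand-in (not KNIME code) for the index arithmetic of getRow. */
final class RowIndexSketch {

    /** Rejects indexes outside 0 .. size - 1. */
    static void checkIndex(int idx, int size) {
        if (idx < 0 || idx >= size) {
            // Assumption: the real class may throw a different exception.
            throw new IndexOutOfBoundsException("idx " + idx + " not in [0, " + size + ")");
        }
    }

    /** Original (1-based) table row number of the row stored at idx. */
    static int originalRowNumber(int idx, int firstRowNumber) {
        return idx + firstRowNumber;
    }

    public static void main(String[] args) {
        int firstRowNumber = 11; // container was filled starting at table row 11
        int size = 5;            // rows actually stored
        checkIndex(4, size);     // last valid index
        System.out.println(originalRowNumber(0, firstRowNumber)); // 11
        System.out.println(originalRowNumber(4, firstRowNumber)); // 15
    }
}
```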

getValues

public Set<DataCell> getValues(int colIdx)
Returns a set of all different values seen in the specified column. Will always return null if colIdx doesn't specify a column of type StringCell (or derived from that). The set is in the order the values appeared in the rows read in. It contains only the values showing in these rows; the complete table may contain more values. The set doesn't contain "missing value" cells.

Specified by:
getValues in interface DataArray
Parameters:
colIdx - the index of the column to return the possible values for
Returns:
a list of possible values of the specified column in the order they appear in the rows read. The list includes only values seen in the rows stored in the container. Returns null for non-string columns.
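
How such a set of possible values might be collected is sketched below. This is not the KNIME implementation: `PossibleValuesSketch` and `collect` are hypothetical names, `null` list entries stand in for "missing value" cells, and returning `null` once the cap is exceeded is an assumption modeled on the class description's limit of 2000 possible values.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in (not KNIME code) for collecting possible values. */
final class PossibleValuesSketch {

    /**
     * Returns the distinct values in order of first appearance, skipping
     * null entries (which model "missing value" cells). Returns null once
     * more than maxValues distinct values have been seen.
     */
    static Set<String> collect(List<String> storedColumn, int maxValues) {
        Set<String> seen = new LinkedHashSet<>(); // keeps insertion order
        for (String value : storedColumn) {
            if (value == null) {
                continue; // missing cells are never reported
            }
            seen.add(value);
            if (seen.size() > maxValues) {
                return null; // too many distinct values: none are available
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        List<String> column = Arrays.asList("red", null, "blue", "red", "green");
        System.out.println(collect(column, 2000)); // [red, blue, green]
        System.out.println(collect(column, 2));    // null
    }
}
```

A `LinkedHashSet` is used so that the result is both duplicate-free and ordered by first appearance, which matches the behavior described for getValues.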

getMinValue

public DataCell getMinValue(int colIdx)

Specified by:
getMinValue in interface DataArray
Parameters:
colIdx - the index of the column to return the min value for
Returns:
the minimum value seen in the specified column in the rows read in (the entire table could contain a smaller value). Or the min value set with the corresponding setter method. Will return null if the number of rows actually stored is zero, or the column contains only missing cells.

getMaxValue

public DataCell getMaxValue(int colIdx)

Specified by:
getMaxValue in interface DataArray
Parameters:
colIdx - the index of the column to return the max value for
Returns:
the maximum value seen in the specified column in the rows read in (the entire table could contain a larger value). Or the max value set with the corresponding setter method. Will return null if the number of rows actually stored is zero, or the column contains only missing cells.

setMaxValue

public void setMaxValue(int colIdx,
                        DataCell newMaxValue)
Sets a new max value for the specified column.

Parameters:
colIdx - the index of the column to set the new max value for
newMaxValue - the new max value for the specified column. Must not be null and must fit the type of the column.

setMinValue

public void setMinValue(int colIdx,
                        DataCell newMinValue)
Sets a new min value for the specified column.

Parameters:
colIdx - the index of the column to set the new min value for. Must be at least zero and less than the number of columns.
newMinValue - the new min value for the specified column. Must not be null and must fit the type of the column.
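
The interplay between the computed bounds and the setters can be sketched for a single numeric column as follows. `ColumnBoundsSketch` is a hypothetical stand-in that uses `Double` instead of `DataCell`; `null` entries model missing cells, and the exception thrown for a `null` argument is an assumption.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

/** Hypothetical stand-in (not KNIME code) for per-column min/max tracking. */
final class ColumnBoundsSketch {
    private Double min; // stays null if no non-missing value is seen or set
    private Double max;

    /** Scans the stored sample; null entries model missing cells. */
    ColumnBoundsSketch(List<Double> storedColumn) {
        for (Double value : storedColumn) {
            if (value == null) {
                continue;
            }
            if (min == null || value < min) {
                min = value;
            }
            if (max == null || value > max) {
                max = value;
            }
        }
    }

    Double getMin() {
        return min;
    }

    Double getMax() {
        return max;
    }

    /** Overrides the computed minimum, e.g. with a known global limit. */
    void setMin(Double newMin) {
        min = Objects.requireNonNull(newMin, "newMin must not be null");
    }

    /** Overrides the computed maximum, e.g. with a known global limit. */
    void setMax(Double newMax) {
        max = Objects.requireNonNull(newMax, "newMax must not be null");
    }

    public static void main(String[] args) {
        ColumnBoundsSketch bounds =
                new ColumnBoundsSketch(Arrays.asList(3.0, null, 7.5, 1.5));
        System.out.println(bounds.getMin()); // 1.5
        System.out.println(bounds.getMax()); // 7.5
        bounds.setMax(100.0);                // a better limit is known
        System.out.println(bounds.getMax()); // 100.0
    }
}
```

A container built from an empty sample, or from a column holding only missing cells, reports `null` for both bounds until a setter is called.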

size

public int size()

Specified by:
size in interface DataArray
Returns:
the size of the container, i.e. the number of rows actually stored. Could be different from the number of rows requested, if the table is shorter than the sum of the first row and the number of rows specified to the constructor.
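
The relationship between the requested window and the resulting size can be written as a small formula. This is a sketch based on the 1-based firstRow convention of the constructor, not code taken from the class; `StoredSizeSketch` and `storedSize` are hypothetical names.

```java
/** Hypothetical stand-in (not KNIME code) for the size of the stored window. */
final class StoredSizeSketch {

    /**
     * Number of rows a window of numOfRows rows, starting at the 1-based
     * row firstRow, can hold for a table with tableRows rows.
     */
    static int storedSize(int tableRows, int firstRow, int numOfRows) {
        // Rows available from firstRow to the end of the table (never negative).
        int available = Math.max(0, tableRows - firstRow + 1);
        return Math.min(numOfRows, available);
    }

    public static void main(String[] args) {
        System.out.println(storedSize(100, 11, 20)); // 20: table is long enough
        System.out.println(storedSize(25, 11, 20));  // 15: table ends at row 25
        System.out.println(storedSize(5, 11, 20));   // 0: window starts past the end
    }
}
```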

getFirstRowNumber

public int getFirstRowNumber()

Specified by:
getFirstRowNumber in interface DataArray
Returns:
the number of the row with index 0 - i.e. the original row number in the underlying data table of any row with index i in the container can be reconstructed by i + getFirstRowNumber().

iterator

public RowIterator iterator()
Returns a row iterator which returns each row one-by-one from the table.

Specified by:
iterator in interface Iterable<DataRow>
Specified by:
iterator in interface DataArray
Specified by:
iterator in interface DataTable
Returns:
an iterator to traverse the container. Because the class implements Iterable<DataRow>, the iterator yields DataRow objects and no typecast is needed.
See Also:
DataRow

getDataTableSpec

public DataTableSpec getDataTableSpec()
Get the table spec corresponding to the rows. The domain information is ensured to be included, i.e. for all string compatible columns it contains the possible values and for all double compatible columns it contains lower and upper bounds.

Specified by:
getDataTableSpec in interface DataArray
Specified by:
getDataTableSpec in interface DataTable
Returns:
the table spec belonging to the rows stored.


Copyright, 2003 - 2010. All rights reserved.
University of Konstanz, Germany.
Chair for Bioinformatics and Information Mining, Prof. Dr. Michael R. Berthold.
You may not modify, publish, transmit, transfer or sell, reproduce, create derivative works from, distribute, perform, display, or in any way exploit any of the content, in whole or in part, except as otherwise expressly permitted in writing by the copyright owner or as specified in the license file distributed with this product.