org.knime.base.node.preproc.groupby
Class GroupByTable

java.lang.Object
  extended by org.knime.base.node.preproc.groupby.GroupByTable

public class GroupByTable
extends Object

A data table that groups a given input table by the given columns and calculates the aggregation values for the given aggregation columns. Call the getBufferedTable() method after instance creation to get the grouped table. If the enableHilite flag was set to true, call the getHiliteMapping() method to get the row key translation Map. Call the getSkippedGroupsByColName() method to get a Map with all skipped groups, or getSkippedGroupsMessage(int, int) for an appropriate warning message.
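
As a rough illustration of the grouping and aggregation described above, the following sketch uses plain Java collections instead of the KNIME API. The class GroupBySketch, the method sumByGroup and the String[] row layout are hypothetical stand-ins for a table with one group column and one aggregated column; this is not how GroupByTable is implemented.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustration only: a plain-Java stand-in, not the KNIME implementation. */
final class GroupBySketch {

    /**
     * Groups the rows by the value at groupIdx and sums the numeric value
     * at valueIdx. Groups are returned in order of first appearance.
     */
    static Map<String, Double> sumByGroup(List<String[]> rows, int groupIdx, int valueIdx) {
        Map<String, Double> result = new LinkedHashMap<>();
        for (String[] row : rows) {
            double value = Double.parseDouble(row[valueIdx]);
            // Add the value to the running sum of this row's group.
            result.merge(row[groupIdx], value, Double::sum);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
            new String[] {"A", "1.0"},
            new String[] {"B", "2.5"},
            new String[] {"A", "3.0"});
        System.out.println(sumByGroup(rows, 0, 1));
    }
}
```

In the real class the grouped result is obtained from getBufferedTable() after construction; the sketch only shows the idea of one output row per distinct group value.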

Author:
Tobias Koetter, University of Konstanz

Constructor Summary
GroupByTable(ExecutionContext exec, BufferedDataTable inDataTable, List<String> groupByCols, ColumnAggregator[] colAggregators, int maxUniqueValues, boolean sortInMemory, boolean enableHilite, ColumnNamePolicy colNamePolicy)
          Constructor for class GroupByTable.
GroupByTable(ExecutionContext exec, BufferedDataTable inDataTable, List<String> groupByCols, ColumnAggregator[] colAggregators, int maxUniqueValues, boolean sortInMemory, boolean enableHilite, ColumnNamePolicy colNamePolicy, boolean retainOrder)
          Constructor for class GroupByTable.
 
Method Summary
static void checkGroupCols(DataTableSpec spec, List<String> groupCols)
           
static DataTableSpec createGroupByTableSpec(DataTableSpec spec, List<String> groupColNames, ColumnAggregator[] columnAggregators, ColumnNamePolicy colNamePolicy)
           
 BufferedDataTable getBufferedTable()
           
 Map<RowKey,Set<RowKey>> getHiliteMapping()
          Returns the hilite translation Map, or null if the enableHilite flag in the constructor was set to false.
 Map<String,Collection<String>> getSkippedGroupsByColName()
          Returns a Map with all skipped groups.
 String getSkippedGroupsMessage(int maxGroups, int maxCols)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

GroupByTable

public GroupByTable(ExecutionContext exec,
                    BufferedDataTable inDataTable,
                    List<String> groupByCols,
                    ColumnAggregator[] colAggregators,
                    int maxUniqueValues,
                    boolean sortInMemory,
                    boolean enableHilite,
                    ColumnNamePolicy colNamePolicy)
             throws CanceledExecutionException
Constructor for class GroupByTable.

Parameters:
exec - the ExecutionContext
inDataTable - the table to aggregate
groupByCols - the names of all columns to group by
colAggregators - the aggregation columns with the aggregation method to use, in the order the columns should appear in the result table
maxUniqueValues - the maximum number of unique values
sortInMemory - true if the table should be sorted in memory
enableHilite - true if a row key map should be maintained to enable hiliting
colNamePolicy - the ColumnNamePolicy for the aggregation columns
Throws:
CanceledExecutionException - if the user has canceled the execution

GroupByTable

public GroupByTable(ExecutionContext exec,
                    BufferedDataTable inDataTable,
                    List<String> groupByCols,
                    ColumnAggregator[] colAggregators,
                    int maxUniqueValues,
                    boolean sortInMemory,
                    boolean enableHilite,
                    ColumnNamePolicy colNamePolicy,
                    boolean retainOrder)
             throws CanceledExecutionException
Constructor for class GroupByTable.

Parameters:
exec - the ExecutionContext
inDataTable - the table to aggregate
groupByCols - the names of all columns to group by
colAggregators - the aggregation columns with the aggregation method to use, in the order the columns should appear in the result table
maxUniqueValues - the maximum number of unique values
sortInMemory - true if the table should be sorted in memory
enableHilite - true if a row key map should be maintained to enable hiliting
colNamePolicy - the ColumnNamePolicy for the aggregation columns
retainOrder - true if the rows of the result table should be returned in the same order as in the input table
Throws:
CanceledExecutionException - if the user has canceled the execution
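
The effect of retainOrder can be pictured with plain Java collections. This is only a sketch under an assumption that this page does not confirm: without retainOrder the groups come out in sorted order of the group value, while with retainOrder they follow the order of first appearance in the input. RetainOrderSketch and groupOrder are hypothetical names and not part of the KNIME API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Illustration only: a plain-Java stand-in, not the KNIME implementation. */
final class RetainOrderSketch {

    /** Returns the distinct group values, either in input order or sorted. */
    static List<String> groupOrder(List<String> groupValues, boolean retainOrder) {
        Set<String> distinct;
        if (retainOrder) {
            // LinkedHashSet keeps the order in which values were first seen.
            distinct = new LinkedHashSet<>(groupValues);
        } else {
            // TreeSet orders the values by their natural (sorted) order.
            distinct = new TreeSet<>(groupValues);
        }
        return new ArrayList<>(distinct);
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("b", "a", "b", "c");
        System.out.println(groupOrder(values, true));
        System.out.println(groupOrder(values, false));
    }
}
```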
Method Detail

getBufferedTable

public BufferedDataTable getBufferedTable()
Returns:
the aggregated BufferedDataTable

getHiliteMapping

public Map<RowKey,Set<RowKey>> getHiliteMapping()
Returns the hilite translation Map, or null if the enableHilite flag in the constructor was set to false. The key of the Map is the row key of the new group row, and the corresponding value is the Set of all original row keys that belong to this group.

Returns:
the hilite translation Map, or null if the enableHilite flag in the constructor was set to false
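
The shape of the hilite translation Map can be sketched as follows. The example uses String in place of RowKey and invents the new row keys ("Row0", "Row1", ...); HiliteMappingSketch and buildMapping are hypothetical names, so treat this as an illustration of the key-to-set structure rather than the actual implementation.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Illustration only: Strings stand in for RowKey; not the KNIME implementation. */
final class HiliteMappingSketch {

    /**
     * Takes a map from original row key to group value and returns a map
     * from the (hypothetical) new group row key to all original row keys.
     */
    static Map<String, Set<String>> buildMapping(Map<String, String> rowKeyToGroup) {
        Map<String, String> groupToNewKey = new HashMap<>();
        Map<String, Set<String>> mapping = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : rowKeyToGroup.entrySet()) {
            String group = entry.getValue();
            String newKey = groupToNewKey.get(group);
            if (newKey == null) {
                // First row of this group: assign the next new row key.
                newKey = "Row" + groupToNewKey.size();
                groupToNewKey.put(group, newKey);
            }
            mapping.computeIfAbsent(newKey, k -> new LinkedHashSet<>()).add(entry.getKey());
        }
        return mapping;
    }

    public static void main(String[] args) {
        Map<String, String> input = new LinkedHashMap<>();
        input.put("r1", "A");
        input.put("r2", "B");
        input.put("r3", "A");
        System.out.println(buildMapping(input));
    }
}
```

Callers of the real method should check for null first, since no mapping is maintained when hiliting is disabled.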

getSkippedGroupsByColName

public Map<String,Collection<String>> getSkippedGroupsByColName()
Returns a Map with all skipped groups. The key of the Map is the name of the column and the value is a Collection with all skipped groups.

Returns:
a Map with all skipped groups
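
The structure of the returned Map (column name to skipped groups) can be illustrated with a small sketch. It assumes, without confirmation from this page, that a group is skipped for a column when the number of unique values in that group exceeds maxUniqueValues. SkippedGroupsSketch, findSkipped and the two-element row layout {group, value} are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustration only: a plain-Java stand-in, not the KNIME implementation. */
final class SkippedGroupsSketch {

    /**
     * Each row is {group, value} for the column colName. Returns a map from
     * column name to the groups whose unique value count exceeds the limit
     * (assumed skip rule).
     */
    static Map<String, Collection<String>> findSkipped(
            String colName, List<String[]> rows, int maxUniqueValues) {
        Map<String, Set<String>> uniquePerGroup = new LinkedHashMap<>();
        for (String[] row : rows) {
            uniquePerGroup.computeIfAbsent(row[0], k -> new HashSet<>()).add(row[1]);
        }
        Map<String, Collection<String>> skipped = new LinkedHashMap<>();
        for (Map.Entry<String, Set<String>> entry : uniquePerGroup.entrySet()) {
            if (entry.getValue().size() > maxUniqueValues) {
                skipped.computeIfAbsent(colName, k -> new ArrayList<>()).add(entry.getKey());
            }
        }
        return skipped;
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
            new String[] {"A", "x"}, new String[] {"A", "y"},
            new String[] {"A", "z"}, new String[] {"B", "x"});
        System.out.println(findSkipped("col", rows, 2));
    }
}
```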

getSkippedGroupsMessage

public String getSkippedGroupsMessage(int maxGroups,
                                      int maxCols)
Parameters:
maxGroups - the maximum number of skipped groups to display
maxCols - the maximum number of columns to display per group
Returns:
String message with the skipped groups per column, or null if no groups were skipped
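
A truncated warning message of this kind could be assembled roughly as shown below. The message wording, the "..." truncation marker and the way the two limits are applied (at most maxCols columns, at most maxGroups groups per column) are assumptions made for the sketch; the actual format produced by getSkippedGroupsMessage is not documented on this page. SkippedMessageSketch and buildMessage are hypothetical names.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustration only: hypothetical message format, not the KNIME implementation. */
final class SkippedMessageSketch {

    /** Returns null if nothing was skipped, otherwise a truncated summary. */
    static String buildMessage(Map<String, Collection<String>> skipped,
            int maxGroups, int maxCols) {
        if (skipped == null || skipped.isEmpty()) {
            return null;
        }
        StringBuilder buf = new StringBuilder("Skipped group(s) per column: ");
        int colCount = 0;
        for (Map.Entry<String, Collection<String>> entry : skipped.entrySet()) {
            if (colCount == maxCols) {
                // Column limit reached: mark the truncation and stop.
                buf.append(colCount > 0 ? "; ..." : "...");
                break;
            }
            if (colCount > 0) {
                buf.append("; ");
            }
            buf.append(entry.getKey()).append("=[");
            int groupCount = 0;
            for (String group : entry.getValue()) {
                if (groupCount == maxGroups) {
                    // Group limit reached for this column.
                    buf.append(groupCount > 0 ? ", ..." : "...");
                    break;
                }
                if (groupCount > 0) {
                    buf.append(", ");
                }
                buf.append(group);
                groupCount++;
            }
            buf.append("]");
            colCount++;
        }
        return buf.toString();
    }

    public static void main(String[] args) {
        Map<String, Collection<String>> skipped = new LinkedHashMap<>();
        skipped.put("c1", Arrays.asList("g1", "g2", "g3"));
        skipped.put("c2", Arrays.asList("g4"));
        System.out.println(buildMessage(skipped, 2, 1));
    }
}
```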

createGroupByTableSpec

public static final DataTableSpec createGroupByTableSpec(DataTableSpec spec,
                                                         List<String> groupColNames,
                                                         ColumnAggregator[] columnAggregators,
                                                         ColumnNamePolicy colNamePolicy)
Parameters:
spec - the original DataTableSpec
groupColNames - the names of all columns to group by
columnAggregators - the aggregation columns with the aggregation method to use, in the order the columns should appear in the result table
colNamePolicy - the ColumnNamePolicy for the aggregation columns
Returns:
the result DataTableSpec
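
To make the role of the column name policy more concrete, the sketch below derives result column names from group columns and aggregators. It assumes that the group columns come first, followed by the aggregation columns in the given order. The NamePolicy enum and its three naming schemes are hypothetical and only loosely modelled on the idea of a ColumnNamePolicy; the real policy values and name formats may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustration only: hypothetical naming policies, not the KNIME implementation. */
final class GroupBySpecSketch {

    /** Hypothetical stand-in for ColumnNamePolicy. */
    enum NamePolicy { KEEP_ORIGINAL, METHOD_THEN_COLUMN, COLUMN_THEN_METHOD }

    /** Builds the result name of one aggregation column. */
    static String aggregationColumnName(String column, String method, NamePolicy policy) {
        switch (policy) {
            case METHOD_THEN_COLUMN:
                return method + "(" + column + ")";
            case COLUMN_THEN_METHOD:
                return column + " (" + method + ")";
            default:
                return column;
        }
    }

    /**
     * Each aggregator is {columnName, methodName}. Group columns are listed
     * first (assumed), then the aggregation columns in the given order.
     */
    static List<String> resultColumnNames(List<String> groupCols,
            List<String[]> aggregators, NamePolicy policy) {
        List<String> names = new ArrayList<>(groupCols);
        for (String[] aggregator : aggregators) {
            names.add(aggregationColumnName(aggregator[0], aggregator[1], policy));
        }
        return names;
    }

    public static void main(String[] args) {
        List<String[]> aggregators = Arrays.asList(
            new String[] {"price", "Sum"}, new String[] {"qty", "Mean"});
        System.out.println(resultColumnNames(
            Arrays.asList("country"), aggregators, NamePolicy.METHOD_THEN_COLUMN));
    }
}
```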

checkGroupCols

public static void checkGroupCols(DataTableSpec spec,
                                  List<String> groupCols)
                           throws IllegalArgumentException
Parameters:
spec - the DataTableSpec to check
groupCols - the group by column name List
Throws:
IllegalArgumentException - if one of the group by columns does not exist in the given DataTableSpec
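
The validation described above amounts to a membership check of each group column name against the spec. A minimal sketch, using a Set of column names in place of a DataTableSpec, might look like this; CheckGroupColsSketch and the exception message text are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Illustration only: a Set of names stands in for DataTableSpec. */
final class CheckGroupColsSketch {

    /** Throws IllegalArgumentException if a group column is not in the spec. */
    static void checkGroupCols(Set<String> specColumnNames, List<String> groupCols) {
        for (String col : groupCols) {
            if (!specColumnNames.contains(col)) {
                throw new IllegalArgumentException(
                    "Group column '" + col + "' not found in spec");
            }
        }
    }

    public static void main(String[] args) {
        Set<String> spec = new HashSet<>(Arrays.asList("country", "price"));
        checkGroupCols(spec, Arrays.asList("country"));
        System.out.println("all group columns found");
    }
}
```

Because the real method is static and needs only a spec, a check like this can run at configuration time, before any data is available.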


Copyright, 2003 - 2010. All rights reserved.
University of Konstanz, Germany.
Chair for Bioinformatics and Information Mining, Prof. Dr. Michael R. Berthold.
You may not modify, publish, transmit, transfer or sell, reproduce, create derivative works from, distribute, perform, display, or in any way exploit any of the content, in whole or in part, except as otherwise expressly permitted in writing by the copyright owner or as specified in the license file distributed with this product.