org.knime.base.node.mine.mds.mdsprojection
Class MDSProjectionManager

java.lang.Object
  extended by org.knime.base.node.mine.mds.mdsprojection.MDSProjectionManager

public class MDSProjectionManager
extends Object

The MDSProjectionManager handles the MDS projection algorithm. As with the MDSManager, a lower dimensional representation is computed for each row of the input data. The difference is that the points of the input data are not adjusted relative to one another but relative to a set of fixed data points and their corresponding lower dimensional representations. The rearrangement is an iterative process that runs for the specified number of epochs. The learning rate, which specifies the step size, is reduced after each epoch so that the process converges by the end.

Author:
Kilian Thiel, University of Konstanz

Field Summary
static int DEFAULT_SEED
          The default random seed.
protected  int m_dimension
          The dimension of the target space.
protected  DistanceManager m_distMan
          The distance manager to use.
protected  double m_epochs
          The number of epochs to train.
protected  DistanceManager m_euclideanDistMan
          The Euclidean distance manager used in the target space.
protected  ExecutionContext m_exec
          The execution context used to show progress information and enable cancellation.
protected  double m_finalLearningRate
          The final learning rate.
protected  DataTable m_fixedDataPoints
          The input data table storing the fixed data points.
protected  Hashtable<RowKey,DataPoint> m_fixedPoints
          A hashtable holding row keys of fixed points and related points of the target space.
protected  DataTable m_inData
          The input data table.
protected  double m_initialLearningrate
          The initial learning rate.
protected  boolean m_isInit
          Flag, indicating if data points in target space have been initialized (if true) or not (if false).
protected  double m_learningrate
          The learning rate.
protected  double m_minDistThreshold
          Threshold of minimum distance.
protected  Hashtable<RowKey,DataPoint> m_points
          A hashtable holding keys of input rows and related points of the target space.
protected  boolean m_projectOnly
          Flag, indicating if data points have to be projected only according to the fixed points (if true) or adjusted according to the other (not fixed) points as well (if false).
protected  Set<DataPoint> m_unmodifiablePoints
          The set of unmodifiable data points.
static int MAX_SEED
          The maximum random seed.
static int MIN_SEED
          The minimum random seed.
 
Constructor Summary
MDSProjectionManager(int dimension, String distance, boolean fuzzy, BufferedDataTable inData, BufferedDataTable fixedDataPoints, int[] fixedDataMdsIndices, ExecutionContext exec)
          Creates a new instance of MDSProjectionManager with the given dimension, distance metric, fuzzy flag, in data and fixed data to use.
 
Method Summary
protected  void adjustDataPoint(DataPoint p1, DataPoint p2, DataRow r1, DataRow r2)
          Adjusts the low dimensional mapping of the first data point according to the second data point and its mapping.
protected  void adjustLearningRate(int epoch)
          Adjusts learning rate according to the given epoch.
protected  double disparityTransformation(double distance)
          Computes the disparity value for the given distance value.
protected  void doEpoch(int epoch, ExecutionMonitor exec)
          Computes one epoch of the iterative MDS.
 Hashtable<RowKey,DataPoint> getDataPoints()
           
 int getDimension()
           
 double getMinDistanceThreshold()
           
 boolean getProjectOnly()
           
 void init(long seed)
          Initializes the lower dimensional data points randomly.
protected  void preprocFixedDataPoints(int[] fixedDataMdsIndices)
          Initializes for each of the fixed data points a point in the target space.
 void reset()
          Clears the Hashtable containing the high and the corresponding low dimensional data points.
 void setMinDistanceThreshold(double minDistThreshold)
           
 void setProjectOnly(boolean projectOnly)
           
 void train(int epochs, double learningrate)
          Does the training by adjusting the lower dimensional data points according to their distances and the distances of the original data.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

DEFAULT_SEED

public static final int DEFAULT_SEED
The default random seed.

See Also:
Constant Field Values

MIN_SEED

public static final int MIN_SEED
The minimum random seed.

See Also:
Constant Field Values

MAX_SEED

public static final int MAX_SEED
The maximum random seed.

See Also:
Constant Field Values

m_minDistThreshold

protected double m_minDistThreshold
Threshold of minimum distance.


m_unmodifiablePoints

protected Set<DataPoint> m_unmodifiablePoints
The set of unmodifiable data points.


m_dimension

protected int m_dimension
The dimension of the target space.


m_distMan

protected DistanceManager m_distMan
The distance manager to use.


m_euclideanDistMan

protected DistanceManager m_euclideanDistMan
The Euclidean distance manager used in the target space.


m_inData

protected DataTable m_inData
The input data table.


m_points

protected Hashtable<RowKey,DataPoint> m_points
A hashtable holding keys of input rows and related points of the target space.


m_fixedDataPoints

protected DataTable m_fixedDataPoints
The input data table storing the fixed data points.


m_fixedPoints

protected Hashtable<RowKey,DataPoint> m_fixedPoints
A hashtable holding row keys of fixed points and related points of the target space.


m_learningrate

protected double m_learningrate
The learning rate.


m_initialLearningrate

protected double m_initialLearningrate
The initial learning rate.


m_epochs

protected double m_epochs
The number of epochs to train.


m_finalLearningRate

protected double m_finalLearningRate
The final learning rate.


m_isInit

protected boolean m_isInit
Flag, indicating if data points in target space have been initialized (if true) or not (if false).


m_exec

protected ExecutionContext m_exec
The execution context used to show progress information and enable cancellation.


m_projectOnly

protected boolean m_projectOnly
Flag, indicating if data points have to be projected only according to the fixed points (if true) or adjusted according to the other (not fixed) points as well (if false).

Constructor Detail

MDSProjectionManager

public MDSProjectionManager(int dimension,
                            String distance,
                            boolean fuzzy,
                            BufferedDataTable inData,
                            BufferedDataTable fixedDataPoints,
                            int[] fixedDataMdsIndices,
                            ExecutionContext exec)
                     throws IllegalArgumentException,
                            CanceledExecutionException
Creates a new instance of MDSProjectionManager with the given dimension, distance metric, fuzzy flag, in data and fixed data to use. An IllegalArgumentException is thrown if the dimension is less than or equal to zero, if fixedDataPoints is null, if the low dimension of the fixed data is not equal to the specified dimension, or if the high dimension of the fixed data is not equal to the dimension of the input data. The fixed data is used to project the input data. First the in data is placed with respect to the fixed data, and then it is moved by means of MDS.

Parameters:
dimension - The output MDS dimension
distance - The distance metric to use.
fuzzy - true if the in data is fuzzy valued data.
inData - The in data to use.
exec - The ExecutionContext to monitor the progress.
fixedDataPoints - The fixed data points onto which the in data is projected.
fixedDataMdsIndices - Array containing the indices of the fixed MDS data points in the fixedDataPoints data table.
Throws:
IllegalArgumentException - If the specified dimension is less than or equal to zero, or if the dimensions of the in data and the fixed data are incompatible.
CanceledExecutionException - If execution was canceled by the user.
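The argument checks described for the constructor can be sketched as follows. This is only an illustration and not the KNIME implementation: the class ConstructorCheckSketch, the method check and the use of plain arrays in place of BufferedDataTable are hypothetical. It also assumes that the fixed data table holds the original columns plus the MDS columns named by fixedMdsIndices, so that its high dimension is the column count minus the number of MDS indices.

```java
/** Hypothetical stand-in for the constructor's argument checks (not KNIME code). */
final class ConstructorCheckSketch {

    /**
     * Throws IllegalArgumentException for the conditions listed in the
     * constructor description. Tables are modelled as plain arrays.
     */
    static void check(int dimension, int inputColumns,
                      double[][] fixedRows, int[] fixedMdsIndices) {
        if (dimension <= 0) {
            throw new IllegalArgumentException("dimension must be greater than zero");
        }
        if (fixedRows == null) {
            throw new IllegalArgumentException("fixed data must not be null");
        }
        // Low dimension of the fixed data: number of MDS columns.
        if (fixedMdsIndices == null || fixedMdsIndices.length != dimension) {
            throw new IllegalArgumentException("fixed MDS columns do not match dimension");
        }
        // Assumption: the remaining columns form the high dimensional part.
        int fixedColumns = fixedRows.length == 0 ? 0 : fixedRows[0].length;
        int fixedHighDimension = fixedColumns - fixedMdsIndices.length;
        if (fixedHighDimension != inputColumns) {
            throw new IllegalArgumentException("fixed data and in data dimensions differ");
        }
    }

    public static void main(String[] args) {
        double[][] fixed = {{1.0, 2.0, 3.0, 0.1, 0.2}};
        check(2, 3, fixed, new int[] {3, 4});
        System.out.println("arguments accepted");
        try {
            check(0, 3, fixed, new int[] {3, 4});
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```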
Method Detail

preprocFixedDataPoints

protected void preprocFixedDataPoints(int[] fixedDataMdsIndices)
                               throws CanceledExecutionException
Initializes a point in the target space for each of the fixed data points. The given array of indices specifies which columns of the data table containing the fixed points are to be considered (with respect to the non-fixed points).

Parameters:
fixedDataMdsIndices - The indices specifying the columns of the data table containing the fixed data points, to consider.
Throws:
CanceledExecutionException - If the process is canceled.

init

public void init(long seed)
          throws CanceledExecutionException
Initializes the lower dimensional data points randomly.

Parameters:
seed - The random seed to use.
Throws:
CanceledExecutionException - If execution was canceled by the user.
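Seeded random initialization can be pictured with the following minimal sketch. It is not the KNIME code: RandomInitSketch is a hypothetical class, row keys are plain strings, points are double arrays, and drawing each coordinate uniformly from [0, 1) is an assumption, since the description above only says that the points are initialized randomly from the given seed.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

/** Hypothetical sketch of seeded random initialization (not KNIME code). */
final class RandomInitSketch {

    /** Creates one random low dimensional point per row key. */
    static Map<String, double[]> init(List<String> rowKeys, int dimension, long seed) {
        Random random = new Random(seed);
        Map<String, double[]> points = new LinkedHashMap<>();
        for (String key : rowKeys) {
            double[] point = new double[dimension];
            for (int d = 0; d < dimension; d++) {
                // Assumption: coordinates are uniform in [0, 1).
                point[d] = random.nextDouble();
            }
            points.put(key, point);
        }
        return points;
    }

    public static void main(String[] args) {
        Map<String, double[]> points = init(Arrays.asList("Row0", "Row1"), 2, 42L);
        points.forEach((key, p) -> System.out.println(key + " -> " + Arrays.toString(p)));
    }
}
```

Because the generator is created from the seed, calling the sketch twice with the same seed and the same row keys yields the same starting layout.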

train

public void train(int epochs,
                  double learningrate)
           throws CanceledExecutionException
Does the training by adjusting the lower dimensional data points according to their distances and the distances of the original data.

Parameters:
epochs - The number of epochs to train.
learningrate - The learning rate, specifying the step size of the adjustment.
Throws:
CanceledExecutionException - If execution was canceled by the user.
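The overall training loop can be illustrated with a toy example that projects a single point against three fixed points. Everything in it is an assumption made for illustration: the class TrainSketch, the method project, the Euclidean distance in both spaces, the exponential decay between an initial and a final rate, and the form of the update rule. It is not the KNIME implementation.

```java
import java.util.Arrays;

/** Hypothetical end-to-end toy for projection training (not KNIME code). */
final class TrainSketch {

    static double dist(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += (a[i] - b[i]) * (a[i] - b[i]);
        }
        return Math.sqrt(sum);
    }

    /** Moves a copy of startLow so its distances to fixedLow mimic the high dimensional ones. */
    static double[] project(double[] high, double[] startLow,
                            double[][] fixedHigh, double[][] fixedLow,
                            int epochs, double initialRate, double finalRate) {
        double[] low = startLow.clone();
        for (int epoch = 0; epoch < epochs; epoch++) {
            // Assumed schedule: exponential decay from initialRate towards finalRate.
            double rate = initialRate * Math.pow(finalRate / initialRate, (double) epoch / epochs);
            for (int i = 0; i < fixedHigh.length; i++) {
                double target = dist(high, fixedHigh[i]);
                double current = Math.max(dist(low, fixedLow[i]), 1e-9);
                double factor = rate * (target - current) / current;
                for (int d = 0; d < low.length; d++) {
                    // Only the projected point moves; the fixed point stays put.
                    low[d] += factor * (low[d] - fixedLow[i][d]);
                }
            }
        }
        return low;
    }

    public static void main(String[] args) {
        double[][] fixedHigh = {{0, 0, 0}, {1, 0, 0}, {0, 1, 0}};
        double[][] fixedLow = {{0, 0}, {1, 0}, {0, 1}};
        double[] result = project(new double[] {1, 1, 0}, new double[] {0.8, 0.6},
                fixedHigh, fixedLow, 100, 0.5, 0.05);
        System.out.println(Arrays.toString(result));
    }
}
```

In this toy setup the high dimensional points lie in a plane, so an exact two dimensional layout exists and the projected point should end up close to (1, 1).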

doEpoch

protected void doEpoch(int epoch,
                       ExecutionMonitor exec)
                throws CanceledExecutionException
Computes one epoch of the iterative MDS. In one epoch, all points are adjusted according to all fixed points and, if projectOnly is set to false, according to all other points as well.

Parameters:
epoch - The current epoch.
exec - The execution monitor to show the progress and enable canceling.
Throws:
CanceledExecutionException - If the process was canceled.
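The effect of projectOnly on the work done in one epoch can be sketched by listing which adjustments take place. The class EpochPairsSketch and its method are hypothetical, points are represented by string keys only, and the order (fixed points first, then the other points) is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of the pairs visited in one epoch (not KNIME code). */
final class EpochPairsSketch {

    /** Returns entries of the form "adjusted <- reference" for a single epoch. */
    static List<String> adjustmentsInEpoch(List<String> points, List<String> fixedPoints,
                                           boolean projectOnly) {
        List<String> steps = new ArrayList<>();
        for (String p : points) {
            // Every point is adjusted according to every fixed point.
            for (String f : fixedPoints) {
                steps.add(p + " <- " + f);
            }
            // Unless projecting only, it is also adjusted according to the other points.
            if (!projectOnly) {
                for (String q : points) {
                    if (!q.equals(p)) {
                        steps.add(p + " <- " + q);
                    }
                }
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        List<String> points = Arrays.asList("a", "b");
        List<String> fixed = Arrays.asList("F1", "F2", "F3");
        System.out.println(adjustmentsInEpoch(points, fixed, true));
        System.out.println(adjustmentsInEpoch(points, fixed, false));
    }
}
```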

adjustDataPoint

protected void adjustDataPoint(DataPoint p1,
                               DataPoint p2,
                               DataRow r1,
                               DataRow r2)
Adjusts the low dimensional mapping of the first data point according to the second data point and its mapping.

Parameters:
p1 - The mapping of the first data point in the target space.
p2 - The mapping of the second data point in the target space.
r1 - The first data point in the original space.
r2 - The second data point in the original space.
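A single adjustment step might look like the sketch below. The update rule is a common stress-reducing MDS step and is an assumption, not the confirmed KNIME code. The class AdjustSketch is hypothetical, points are plain double arrays, the disparity is taken to be the original distance itself, and clamping the current distance with minDist is only a guess at how a minimum distance threshold could be used.

```java
import java.util.Arrays;

/** Hypothetical sketch of one pairwise adjustment (not KNIME code). */
final class AdjustSketch {

    static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += (a[i] - b[i]) * (a[i] - b[i]);
        }
        return Math.sqrt(sum);
    }

    /**
     * Moves low1 in place so that its distance to low2 gets closer to the
     * distance between high1 and high2. low2 is left unchanged.
     */
    static void adjust(double[] low1, double[] low2, double[] high1, double[] high2,
                       double learningRate, double minDist) {
        double target = euclidean(high1, high2);   // assumed identity disparity
        double current = Math.max(euclidean(low1, low2), minDist);
        double factor = learningRate * (target - current) / current;
        for (int i = 0; i < low1.length; i++) {
            // Negative factor pulls low1 towards low2, positive pushes it away.
            low1[i] += factor * (low1[i] - low2[i]);
        }
    }

    public static void main(String[] args) {
        double[] low1 = {0.0, 0.0};
        double[] low2 = {4.0, 0.0};
        adjust(low1, low2, new double[] {0, 0, 0}, new double[] {2, 0, 0}, 0.5, 1e-9);
        System.out.println(Arrays.toString(low1));
    }
}
```

In the example the low dimensional distance (4) is larger than the original distance (2), so the first point is pulled towards the second while the second stays where it is.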

disparityTransformation

protected double disparityTransformation(double distance)
Computes the disparity value for the given distance value.

Parameters:
distance - The distance value to compute the disparity value for.
Returns:
The disparity value according to the given distance value.

adjustLearningRate

protected void adjustLearningRate(int epoch)
Adjusts learning rate according to the given epoch. The learning rate is decreased over time.

Parameters:
epoch - The epoch for which the learning rate is computed. The higher the given epoch (relative to the maximum number of epochs), the more the learning rate is decreased.
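One possible decreasing schedule is an exponential decay from the initial to the final learning rate. The documentation above only states that the rate decreases with the epoch, so the formula, the class LearningRateSketch and the method rateAt are assumptions made for illustration.

```java
/** Hypothetical learning rate schedule (not KNIME code). */
final class LearningRateSketch {

    /**
     * Exponential decay: returns initialRate at epoch 0 and finalRate
     * when epoch equals epochs.
     */
    static double rateAt(int epoch, int epochs, double initialRate, double finalRate) {
        if (epochs <= 0 || initialRate <= 0.0 || finalRate <= 0.0) {
            throw new IllegalArgumentException("epochs and rates must be positive");
        }
        double progress = (double) epoch / epochs;
        return initialRate * Math.pow(finalRate / initialRate, progress);
    }

    public static void main(String[] args) {
        for (int epoch = 0; epoch <= 50; epoch += 10) {
            System.out.printf("epoch %d: %.5f%n", epoch, rateAt(epoch, 50, 1.0, 0.001));
        }
    }
}
```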

getDataPoints

public Hashtable<RowKey,DataPoint> getDataPoints()
Returns:
a Hashtable containing the RowKeys as keys and the corresponding lower dimensional DataPoints as values.

reset

public void reset()
Clears the Hashtable containing the high and the corresponding low dimensional data points.


getDimension

public int getDimension()
Returns:
The dimension of the low dimensional data points.

getProjectOnly

public boolean getProjectOnly()
Returns:
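true if data points are projected only according to the fixed points, false if they are also adjusted according to the other (not fixed) points.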

setProjectOnly

public void setProjectOnly(boolean projectOnly)
Parameters:
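projectOnly - true to project data points only according to the fixed points, false to also adjust them according to the other (not fixed) points.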

getMinDistanceThreshold

public double getMinDistanceThreshold()
Returns:
the threshold of minimum distance.

setMinDistanceThreshold

public void setMinDistanceThreshold(double minDistThreshold)
Parameters:
minDistThreshold - the threshold of minimum distance to set.


Copyright, 2003 - 2010. All rights reserved.
University of Konstanz, Germany.
Chair for Bioinformatics and Information Mining, Prof. Dr. Michael R. Berthold.
You may not modify, publish, transmit, transfer or sell, reproduce, create derivative works from, distribute, perform, display, or in any way exploit any of the content, in whole or in part, except as otherwise expressly permitted in writing by the copyright owner or as specified in the license file distributed with this product.