java.lang.Object
  org.knime.core.util.DuplicateChecker
public class DuplicateChecker
This class checks for duplicates in an (almost) arbitrary number of strings,
for example to verify that row keys are unique. The check is done in two
stages. First, new keys are added to a set; if the set already contains a
key, an exception is thrown. If the set grows beyond the maximum chunk size,
it is written to disk and cleared. Second, when checkForDuplicates() is
called after all keys have been added, all created chunks are processed and
sorted by a merge-sort-like algorithm. If any duplicate keys are detected
during this process, an exception is thrown.
Note: This implementation is not thread-safe; it is meant to be used by a single thread only.
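The two stages described above can be sketched as follows. This is a minimal, self-contained illustration, not the KNIME implementation: the class name ChunkedDuplicateSketch, the use of temporary text files, and the IllegalStateException (standing in for DuplicateKeyException) are all assumptions made for the example. It also assumes keys contain no line breaks, and it opens every chunk file at once instead of honoring a stream limit.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;
import java.util.TreeSet;

/** Hypothetical stand-in for illustration only; not the KNIME class. */
public class ChunkedDuplicateSketch {
    /** Next unread key of one chunk file, plus the reader it came from. */
    private static final class Head {
        final String key;
        final BufferedReader reader;
        Head(String key, BufferedReader reader) {
            this.key = key;
            this.reader = reader;
        }
    }

    private final int maxChunkSize;
    private final TreeSet<String> chunk = new TreeSet<>();
    private final List<Path> chunkFiles = new ArrayList<>();

    public ChunkedDuplicateSketch(int maxChunkSize) {
        this.maxChunkSize = maxChunkSize;
    }

    /** Stage 1: detect duplicates within the current in-memory chunk. */
    public void addKey(String key) throws IOException {
        if (!chunk.add(key)) {
            throw new IllegalStateException("Duplicate key: " + key);
        }
        if (chunk.size() >= maxChunkSize) {
            writeChunk();
        }
    }

    /** Writes the current chunk as one sorted file, one key per line. */
    private void writeChunk() throws IOException {
        if (chunk.isEmpty()) {
            return;
        }
        Path file = Files.createTempFile("dupcheck", ".txt");
        file.toFile().deleteOnExit();
        // A TreeSet iterates in sorted order, so the file is already sorted.
        Files.write(file, chunk, StandardCharsets.UTF_8);
        chunkFiles.add(file);
        chunk.clear();
    }

    /** Stage 2: merge all sorted chunks; equal neighbors are duplicates. */
    public void checkForDuplicates() throws IOException {
        writeChunk();
        List<BufferedReader> readers = new ArrayList<>();
        PriorityQueue<Head> heap =
            new PriorityQueue<>((a, b) -> a.key.compareTo(b.key));
        try {
            for (Path file : chunkFiles) {
                BufferedReader r = Files.newBufferedReader(file, StandardCharsets.UTF_8);
                readers.add(r);
                String first = r.readLine();
                if (first != null) {
                    heap.add(new Head(first, r));
                }
            }
            String previous = null;
            while (!heap.isEmpty()) {
                Head head = heap.poll();
                if (head.key.equals(previous)) {
                    throw new IllegalStateException("Duplicate key: " + head.key);
                }
                previous = head.key;
                String next = head.reader.readLine();
                if (next != null) {
                    heap.add(new Head(next, head.reader));
                }
            }
        } finally {
            for (BufferedReader r : readers) {
                r.close();
            }
        }
    }

    /** Drops all keys and deletes the chunk files written so far. */
    public void clear() throws IOException {
        chunk.clear();
        for (Path file : chunkFiles) {
            Files.deleteIfExists(file);
        }
        chunkFiles.clear();
    }
}
```

Because each chunk is written in sorted order, the merge only has to compare each key with the one emitted just before it, so duplicates that landed in different chunks still end up next to each other.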
Field Summary | |
---|---|
static int | MAX_CHUNK_SIZE - The default chunk size. |
static int | MAX_STREAMS - The default number of streams open during merging. |
Constructor Summary | |
---|---|
DuplicateChecker() | Creates a new duplicate checker with default parameters. |
DuplicateChecker(int maxChunkSize, int maxStreams) | Creates a new duplicate checker. |
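The maxStreams parameter limits how many streams are open during merging. One way such a limit could be honored is to merge in passes, combining at most maxStreams sorted runs at a time until a single run remains. The sketch below shows that idea on in-memory lists instead of files. The class BoundedMergeSketch and its method names are hypothetical, and whether DuplicateChecker merges this way is an assumption, not something this page states. Each input run is assumed to be sorted and free of duplicates.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of a merge with a bounded fan-in. */
public class BoundedMergeSketch {

    /** Merges sorted runs in passes, reading at most maxStreams runs at once. */
    public static List<String> mergeAll(List<List<String>> runs, int maxStreams) {
        if (maxStreams < 2) {
            throw new IllegalArgumentException("maxStreams must be at least 2");
        }
        if (runs.isEmpty()) {
            return new ArrayList<>();
        }
        List<List<String>> current = new ArrayList<>(runs);
        while (current.size() > 1) {
            List<List<String>> next = new ArrayList<>();
            for (int i = 0; i < current.size(); i += maxStreams) {
                int end = Math.min(i + maxStreams, current.size());
                next.add(mergeGroup(current.subList(i, end)));
            }
            current = next; // every pass leaves fewer, longer runs
        }
        return current.get(0);
    }

    /** Plain k-way merge of one group; equal neighbors signal a duplicate. */
    private static List<String> mergeGroup(List<List<String>> group) {
        int[] pos = new int[group.size()];
        List<String> out = new ArrayList<>();
        while (true) {
            int best = -1;
            for (int i = 0; i < group.size(); i++) {
                if (pos[i] < group.get(i).size()
                        && (best < 0 || group.get(i).get(pos[i])
                                .compareTo(group.get(best).get(pos[best])) < 0)) {
                    best = i;
                }
            }
            if (best < 0) {
                return out; // all runs in this group are exhausted
            }
            String key = group.get(best).get(pos[best]++);
            if (!out.isEmpty() && out.get(out.size() - 1).equals(key)) {
                throw new IllegalStateException("Duplicate key: " + key);
            }
            out.add(key);
        }
    }
}
```

A smaller fan-in keeps fewer streams open at once but needs more passes over the data, which is the trade-off a parameter like maxStreams exposes.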
Method Summary | |
---|---|
void | addKey(String s) - Adds a new key to the duplicate checker. |
void | checkForDuplicates() - Checks for duplicates in all added keys. |
void | clear() - Clears the checker, i.e. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
public static final int MAX_CHUNK_SIZE
public static final int MAX_STREAMS
Constructor Detail |
---|
public DuplicateChecker()
public DuplicateChecker(int maxChunkSize, int maxStreams)
Parameters:
maxChunkSize - the size of each chunk, i.e. the maximum number of
elements kept in memory
maxStreams - the maximum number of streams that are kept open during
the merge process
Method Detail |
---|
public void addKey(String s) throws DuplicateKeyException, IOException
Parameters:
s - the key
Throws:
DuplicateKeyException - if a duplicate within the current chunk has
been detected
IOException - if an I/O error occurs while writing the chunk to
disk

public void checkForDuplicates() throws DuplicateKeyException, IOException
Throws:
DuplicateKeyException - if a duplicate key has been detected
IOException - if an I/O error occurs

public void clear()