org.fencedb.util
Class StringTokenizer

java.lang.Object
  extended by org.fencedb.util.StringTokenizer
All Implemented Interfaces:
java.util.Enumeration, java.util.Iterator

public class StringTokenizer
extends java.lang.Object
implements java.util.Enumeration, java.util.Iterator

The string tokenizer class allows an application to break a string into tokens. More information about this class is available from ostermiller.org.

The tokenization method is much simpler than the one used by the StreamTokenizer class. The StringTokenizer methods do not distinguish among identifiers, numbers, and quoted strings, nor do they recognize and skip comments.

The set of delimiters (the characters that separate tokens) may be specified either at creation time or on a per-token basis.

There are two kinds of delimiters: token delimiters and nontoken delimiters. A token is either one token delimiter character, or a maximal sequence of consecutive characters that are not delimiters.

A StringTokenizer object internally maintains a current position within the string to be tokenized. Some operations advance this current position past the characters processed.

The implementation is not thread safe; if a StringTokenizer object is intended to be used in multiple threads, an appropriate wrapper must be provided.
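
One possible shape for such a wrapper is sketched below. It guards a tokenizer with a single lock and merges the "has more" check and the fetch into one call, because checking and fetching in two separate calls could interleave across threads. The sketch wraps java.util.StringTokenizer only so that it is self-contained; the class name SynchronizedTokens and the method poll() are hypothetical and not part of this package.

```java
import java.util.StringTokenizer;

// Hypothetical wrapper, shown for illustration only.
final class SynchronizedTokens {
    private final StringTokenizer tokens;

    SynchronizedTokens(String text, String delims) {
        this.tokens = new StringTokenizer(text, delims);
    }

    // Check and fetch in one synchronized step; returns null when exhausted.
    synchronized String poll() {
        return tokens.hasMoreTokens() ? tokens.nextToken() : null;
    }

    public static void main(String[] args) throws InterruptedException {
        SynchronizedTokens shared = new SynchronizedTokens("a b c d e f", " ");
        Runnable worker = () -> {
            String t;
            while ((t = shared.poll()) != null) {
                System.out.println(Thread.currentThread().getName() + " got " + t);
            }
        };
        Thread t1 = new Thread(worker, "worker-1");
        Thread t2 = new Thread(worker, "worker-2");
        t1.start();
        t2.start();
        t1.join();
        t2.join();
    }
}
```

Each token is handed to exactly one thread; which thread receives which token is not deterministic.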

The following is one example of the use of the tokenizer. It also demonstrates the usefulness of having both token and nontoken delimiters in one StringTokenizer.

The code:

String s = "  (   aaa \t  * (b+c1 ))";
StringTokenizer st = new StringTokenizer(s, " \t\n\r\f", "()+*");
while (st.hasMoreTokens()) {
    System.out.println(st.nextToken());
}

prints the following output:

(
aaa
*
(
b
+
c1
)
)
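
For comparison, a similar result can be approximated with the JDK's java.util.StringTokenizer by asking it to return every delimiter as a token and then discarding the ones that should not be tokens. This is only a sketch of the idea behind the two delimiter kinds, not how this class is implemented; the class name ReturnDelimsDemo and the helper tokenize() are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical helper, shown for illustration only.
final class ReturnDelimsDemo {
    static List<String> tokenize(String s, String nontokenDelims, String tokenDelims) {
        // returnDelims = true: each delimiter comes back as a one-character token.
        StringTokenizer st = new StringTokenizer(s, nontokenDelims + tokenDelims, true);
        List<String> out = new ArrayList<>();
        while (st.hasMoreTokens()) {
            String t = st.nextToken();
            // Drop delimiters that are not meant to be tokens.
            if (t.length() == 1 && nontokenDelims.indexOf(t.charAt(0)) >= 0) {
                continue;
            }
            out.add(t);
        }
        return out;
    }

    public static void main(String[] args) {
        for (String t : tokenize("  (   aaa \t  * (b+c1 ))", " \t\n\r\f", "()+*")) {
            System.out.println(t);
        }
    }
}
```

The filtering step is exactly the work that a separate nontoken delimiter set makes unnecessary.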

Compatibility with java.util.StringTokenizer

In the original version of java.util.StringTokenizer, the method nextToken() left the current position after the returned token, and the method hasMoreTokens() moved (as a side effect) the current position before the beginning of the next token. Thus, the code:

String s = "x=a,b,c";
java.util.StringTokenizer st = new java.util.StringTokenizer(s,"=");
System.out.println(st.nextToken());
while (st.hasMoreTokens()) {
    System.out.println(st.nextToken(","));
}

prints the following output:

x
a
b
c

The Java SDK 1.3 implementation removed the undesired side effect of the hasMoreTokens() method: it no longer advances the current position. However, after this change the output of the above code became:

x
=a
b
c

and there was no good way to produce a second token without "=".
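
The behaviour can be checked against a current JDK with a small self-contained program. The class name DelimiterSwitchDemo and the helper run() are hypothetical; only java.util.StringTokenizer is used.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class, shown for illustration only.
final class DelimiterSwitchDemo {
    static List<String> run() {
        List<String> out = new ArrayList<>();
        StringTokenizer st = new StringTokenizer("x=a,b,c", "=");
        out.add(st.nextToken());         // "x"; the position now rests on '='
        while (st.hasMoreTokens()) {
            // After the switch to ",", the '=' is an ordinary character.
            out.add(st.nextToken(","));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(run());       // [x, =a, b, c]
    }
}
```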

To solve the problem, this implementation introduces a new method, skipDelimiters(). To produce the original output, the above code should be modified as follows:

String s = "x=a,b,c";
StringTokenizer st = new StringTokenizer(s,"=");
System.out.println(st.nextToken());
st.skipDelimiters();
while (st.hasMoreTokens()) {
    System.out.println(st.nextToken(","));
}
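
The effect of the extra call can be illustrated with a deliberately simplified stand-in. MiniTokenizer below is hypothetical: it supports nontoken delimiters only and is not the implementation documented here. The boolean it returns from skipDelimiters() (whether any text remains) is an assumption, since the return value is not described on this page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical, simplified stand-in; nontoken delimiters only.
final class MiniTokenizer {
    private final String text;
    private String delims;
    private int position = 0;

    MiniTokenizer(String text, String delims) {
        this.text = text;
        this.delims = delims;
    }

    private boolean isDelim(int index) {
        return delims.indexOf(text.charAt(index)) >= 0;
    }

    // Advances past delimiters of the CURRENT set.
    // Assumption: returns true if text remains after skipping.
    boolean skipDelimiters() {
        while (position < text.length() && isDelim(position)) {
            position++;
        }
        return position < text.length();
    }

    // No side effect: scans ahead with a local index only.
    boolean hasMoreTokens() {
        int p = position;
        while (p < text.length() && isDelim(p)) {
            p++;
        }
        return p < text.length();
    }

    String nextToken() {
        if (!skipDelimiters()) {
            throw new NoSuchElementException();
        }
        int start = position;
        while (position < text.length() && !isDelim(position)) {
            position++;
        }
        return text.substring(start, position);
    }

    // Switches the delimiter set first, then reads the next token.
    String nextToken(String newDelims) {
        this.delims = newDelims;
        return nextToken();
    }

    static List<String> run(boolean skipBeforeSwitch) {
        List<String> out = new ArrayList<>();
        MiniTokenizer st = new MiniTokenizer("x=a,b,c", "=");
        out.add(st.nextToken());
        if (skipBeforeSwitch) {
            st.skipDelimiters();   // consume '=' while it is still a delimiter
        }
        while (st.hasMoreTokens()) {
            out.add(st.nextToken(","));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // [x, =a, b, c]
        System.out.println(run(true));  // [x, a, b, c]
    }
}
```

The point of the sketch is ordering: the old delimiter must be skipped before the delimiter set changes, because afterwards it is no longer recognised as a delimiter.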


Field Summary
protected  int delimsChangedPosition
          Indicates at which position the delimiters last changed.
protected  boolean emptyReturned
          One of two variables used to maintain state through the tokenizing process.
protected  char maxDelimChar
          Stores the value of the delimiter character with the highest value.
protected  java.lang.String nontokenDelims
          The set of nontoken delimiters.
protected  int position
          One of two variables used to maintain state through the tokenizing process.
protected  boolean returnEmptyTokens
          Whether empty tokens should be returned.
protected  int strLength
          The length of the text.
protected  java.lang.String text
          The string to be tokenized.
protected  int tokenCount
          A cache of the token count.
protected  java.lang.String tokenDelims
          The set of token delimiters.
 
Constructor Summary
StringTokenizer(java.lang.String text)
          Constructs a string tokenizer for the specified string.
StringTokenizer(java.lang.String text, java.lang.String nontokenDelims)
          Constructs a string tokenizer for the specified string.
StringTokenizer(java.lang.String text, java.lang.String delims, boolean delimsAreTokens)
          Constructs a string tokenizer for the specified string.
StringTokenizer(java.lang.String text, java.lang.String nontokenDelims, java.lang.String tokenDelims)
          Constructs a string tokenizer for the specified string.
StringTokenizer(java.lang.String text, java.lang.String nontokenDelims, java.lang.String tokenDelims, boolean returnEmptyTokens)
          Constructs a string tokenizer for the specified string.
 
Method Summary
private  boolean advancePosition()
          Advances the state of the tokenizer to the next token or delimiter.
 int countTokens()
          Calculates the number of times that this tokenizer's nextToken method can be called before it generates an exception.
 int countTokens(java.lang.String delims)
          Calculates the number of times that this tokenizer's nextToken method can be called before it generates an exception using the given set of (nontoken) delimiters.
 int countTokens(java.lang.String delims, boolean delimsAreTokens)
          Calculates the number of times that this tokenizer's nextToken method can be called before it generates an exception using the given set of delimiters.
 int countTokens(java.lang.String nontokenDelims, java.lang.String tokenDelims)
          Calculates the number of times that this tokenizer's nextToken method can be called before it generates an exception using the given set of delimiters.
 int countTokens(java.lang.String nontokenDelims, java.lang.String tokenDelims, boolean returnEmptyTokens)
          Calculates the number of times that this tokenizer's nextToken method can be called before it generates an exception using the given set of delimiters.
 int getCurrentPosition()
          Get the index of the character immediately following the end of the last token.
 boolean hasMoreElements()
          Returns the same value as the hasMoreTokens() method.
 boolean hasMoreTokens()
          Tests if there are more tokens available from this tokenizer's string.
 boolean hasNext()
          Returns the same value as the hasMoreTokens() method.
private  int indexOfNextDelimiter(int start)
          Similar to String.indexOf(String, int), but looks for any character from the delimiter string rather than the entire string.
 java.lang.Object next()
          Returns the same value as the nextToken() method, except that its declared return value is Object rather than String.
 java.lang.Object nextElement()
          Returns the same value as the nextToken() method, except that its declared return value is Object rather than String.
 java.lang.String nextToken()
          Returns the next token from this string tokenizer.
 java.lang.String nextToken(java.lang.String nontokenDelims)
          Returns the next token in this string tokenizer's string.
 java.lang.String nextToken(java.lang.String delims, boolean delimsAreTokens)
          Returns the next token in this string tokenizer's string.
 java.lang.String nextToken(java.lang.String nontokenDelims, java.lang.String tokenDelims)
          Returns the next token in this string tokenizer's string.
 java.lang.String nextToken(java.lang.String nontokenDelims, java.lang.String tokenDelims, boolean returnEmptyTokens)
          Returns the next token in this string tokenizer's string.
 java.lang.String peek()
          Returns the same value as nextToken() but does not alter the internal state of the Tokenizer.
 void remove()
          This implementation always throws UnsupportedOperationException.
 java.lang.String restOfText()
          Retrieves the rest of the text as a single token.
 void setDelimiters(java.lang.String delims)
          Set the delimiters used to this set of (nontoken) delimiters.
 void setDelimiters(java.lang.String delims, boolean delimsAreTokens)
          Set the delimiters used to this set of delimiters.
 void setDelimiters(java.lang.String nontokenDelims, java.lang.String tokenDelims)
          Set the delimiters used to this set of delimiters.
 void setDelimiters(java.lang.String nontokenDelims, java.lang.String tokenDelims, boolean returnEmptyTokens)
          Set the delimiters used to this set of delimiters.
private  void setDelims(java.lang.String nontokenDelims, java.lang.String tokenDelims)
          Set the delimiters for this StringTokenizer.
 void setReturnEmptyTokens(boolean returnEmptyTokens)
          Set whether empty tokens should be returned from this point in the tokenizing process onward.
 void setText(java.lang.String text)
          Set the text to be tokenized in this StringTokenizer.
 boolean skipDelimiters()
          Advances the current position so it is before the next token.
 java.lang.String[] toArray()
          Retrieve all of the remaining tokens in a String array.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

text

protected java.lang.String text
The string to be tokenized. The code relies on this to never be null.


strLength

protected int strLength
The length of the text. Cached for performance. This should be set whenever the string we are working with is changed.


nontokenDelims

protected java.lang.String nontokenDelims
The set of nontoken delimiters.


tokenDelims

protected java.lang.String tokenDelims
The set of token delimiters.


position

protected int position
One of two variables used to maintain state through the tokenizing process.

Represents the position at which we should start looking for the next token (the position of the character immediately following the end of the last token, or 0 to start), or -1 if the entire string has been examined.


emptyReturned

protected boolean emptyReturned
One of two variables used to maintain state through the tokenizing process.

true if and only if it is found that an empty token should be returned, or if an empty token was the last thing returned.

If returnEmptyTokens is false, then this variable will always be false.


maxDelimChar

protected char maxDelimChar
Stores the value of the delimiter character with the highest value. It is used to optimize the detection of delimiter characters. In the common case, the int values of the delimiters are lower than those of most characters in the string (a comma or a space is lower than any letter, for example). Given this, we can easily check whether a character is not a delimiter by comparing it to the maximum delimiter. If it is greater than the maximum delimiter, then it is not a delimiter; otherwise we have to do a more in-depth analysis (i.e., search the delimiter string). As a result, in the common case the running time of the algorithm does not depend on the length of the delimiter string.
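
The check described above can be sketched as follows. The class DelimCheck and both helper methods are hypothetical and take the delimiter string as an explicit parameter, whereas the documented class keeps it in fields.

```java
// Hypothetical helpers, shown for illustration only.
final class DelimCheck {
    // Largest char value in the delimiter set; computed once per set.
    static char maxDelimChar(String delims) {
        char max = 0;
        for (int i = 0; i < delims.length(); i++) {
            if (delims.charAt(i) > max) {
                max = delims.charAt(i);
            }
        }
        return max;
    }

    static boolean isDelimiter(char c, String delims, char max) {
        if (c > max) {
            return false;               // fast path: one comparison
        }
        return delims.indexOf(c) >= 0;  // slow path: search the delimiter string
    }

    public static void main(String[] args) {
        String delims = " ,;";
        char max = maxDelimChar(delims);                   // ';'
        System.out.println(isDelimiter('a', delims, max)); // false, via the fast path
        System.out.println(isDelimiter(',', delims, max)); // true
        System.out.println(isDelimiter('!', delims, max)); // false, via the slow path
    }
}
```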


returnEmptyTokens

protected boolean returnEmptyTokens
Whether empty tokens should be returned, i.e., whether "" should be returned when the text starts with a delimiter, has two delimiters next to each other, or ends with a delimiter.


delimsChangedPosition

protected int delimsChangedPosition
Indicates at which position the delimiters last changed. This will affect how empty tokens are returned. Any time that delimiters are changed, the string will be treated as if it is being parsed from position zero, i.e., empty tokens are possible at the very beginning.


tokenCount

protected int tokenCount
A cache of the token count. This variable should be -1 if the tokens have not yet been counted. It should be greater than or equal to zero if the tokens have been counted.

Constructor Detail

StringTokenizer

public StringTokenizer(java.lang.String text,
                       java.lang.String nontokenDelims,
                       java.lang.String tokenDelims)
Constructs a string tokenizer for the specified string. Both token and nontoken delimiters are specified.

The current position is set at the beginning of the string.


StringTokenizer

public StringTokenizer(java.lang.String text,
                       java.lang.String nontokenDelims,
                       java.lang.String tokenDelims,
                       boolean returnEmptyTokens)
Constructs a string tokenizer for the specified string. Both token and nontoken delimiters are specified and whether or not empty tokens are returned is specified.

Empty tokens are tokens that are between consecutive delimiters.

This is the primary constructor (i.e., all other constructors are defined in terms of it).

The current position is set at the beginning of the string.


StringTokenizer

public StringTokenizer(java.lang.String text,
                       java.lang.String delims,
                       boolean delimsAreTokens)
Constructs a string tokenizer for the specified string. Either token or nontoken delimiters are specified.

Is equivalent to:

  • If the third parameter is false -- StringTokenizer(text, delims, null)
  • If the third parameter is true -- StringTokenizer(text, null, delims)


StringTokenizer

public StringTokenizer(java.lang.String text,
                       java.lang.String nontokenDelims)
Constructs a string tokenizer for the specified string. The characters in the nontokenDelims argument are the delimiters for separating tokens. Delimiter characters themselves will not be treated as tokens.

Is equivalent to StringTokenizer(text, nontokenDelims, null).


StringTokenizer

public StringTokenizer(java.lang.String text)
Constructs a string tokenizer for the specified string. The tokenizer uses " \t\n\r\f" as the set of nontoken delimiters, and an empty set of token delimiters.

Is equivalent to StringTokenizer(text, " \t\n\r\f", null);

Method Detail

setText

public void setText(java.lang.String text)
Set the text to be tokenized in this StringTokenizer.

This is useful for StringTokenizer re-use, so that a new string tokenizer does not have to be created for each string you want to tokenize.

The string will be tokenized from the beginning of the string.


setDelims

private void setDelims(java.lang.String nontokenDelims,
                       java.lang.String tokenDelims)
Set the delimiters for this StringTokenizer. The position must be initialized before this method is used. (setText does this and it is called from the constructor)


hasMoreTokens

public boolean hasMoreTokens()
Tests if there are more tokens available from this tokenizer's string. If this method returns true, then a subsequent call to nextToken with no argument will successfully return a token.

The current position is not changed.


nextToken

public java.lang.String nextToken()
Returns the next token from this string tokenizer.

The current position is set after the token returned.


skipDelimiters

public boolean skipDelimiters()
Advances the current position so it is before the next token.

This method skips nontoken delimiters but does not skip token delimiters.

This method is useful when switching to a new set of delimiters (see the second example in the class comment).


countTokens

public int countTokens()
Calculates the number of times that this tokenizer's nextToken method can be called before it generates an exception. The current position is not advanced.


setDelimiters

public void setDelimiters(java.lang.String delims)
Set the delimiters used to this set of (nontoken) delimiters.


setDelimiters

public void setDelimiters(java.lang.String delims,
                          boolean delimsAreTokens)
Set the delimiters used to this set of delimiters.


setDelimiters

public void setDelimiters(java.lang.String nontokenDelims,
                          java.lang.String tokenDelims)
Set the delimiters used to this set of delimiters.


setDelimiters

public void setDelimiters(java.lang.String nontokenDelims,
                          java.lang.String tokenDelims,
                          boolean returnEmptyTokens)
Set the delimiters used to this set of delimiters.


countTokens

public int countTokens(java.lang.String delims)
Calculates the number of times that this tokenizer's nextToken method can be called before it generates an exception using the given set of (nontoken) delimiters. The delimiters given will be used for future calls to nextToken() unless new delimiters are given. The current position is not advanced.


countTokens

public int countTokens(java.lang.String delims,
                       boolean delimsAreTokens)
Calculates the number of times that this tokenizer's nextToken method can be called before it generates an exception using the given set of delimiters. The delimiters given will be used for future calls to nextToken() unless new delimiters are given. The current position is not advanced.


countTokens

public int countTokens(java.lang.String nontokenDelims,
                       java.lang.String tokenDelims)
Calculates the number of times that this tokenizer's nextToken method can be called before it generates an exception using the given set of delimiters. The delimiters given will be used for future calls to nextToken() unless new delimiters are given. The current position is not advanced.


countTokens

public int countTokens(java.lang.String nontokenDelims,
                       java.lang.String tokenDelims,
                       boolean returnEmptyTokens)
Calculates the number of times that this tokenizer's nextToken method can be called before it generates an exception using the given set of delimiters. The delimiters given will be used for future calls to nextToken() unless new delimiters are given. The current position is not advanced.


advancePosition

private boolean advancePosition()
Advances the state of the tokenizer to the next token or delimiter. This method only modifies the class variables position, and emptyReturned. The type of token that should be emitted can be deduced by examining the changes to these two variables. If there are no more tokens, the state of these variables does not change at all.


nextToken

public java.lang.String nextToken(java.lang.String nontokenDelims,
                                  java.lang.String tokenDelims)
Returns the next token in this string tokenizer's string.

First, the sets of token and nontoken delimiters are changed to be the tokenDelims and nontokenDelims, respectively. Then the next token (with respect to new delimiters) in the string after the current position is returned.

The current position is set after the token returned.

The new delimiter sets remain in use after this call.


nextToken

public java.lang.String nextToken(java.lang.String nontokenDelims,
                                  java.lang.String tokenDelims,
                                  boolean returnEmptyTokens)
Returns the next token in this string tokenizer's string.

First, the sets of token and nontoken delimiters are changed to be the tokenDelims and nontokenDelims, respectively; and whether or not to return empty tokens is set. Then the next token (with respect to new delimiters) in the string after the current position is returned.

The current position is set after the token returned.

The new delimiter sets remain in use after this call, and empty tokens continue to be returned (or not) in the future as they are in this call.


nextToken

public java.lang.String nextToken(java.lang.String delims,
                                  boolean delimsAreTokens)
Returns the next token in this string tokenizer's string.

Is equivalent to:

  • If the second parameter is false -- nextToken(delims, null)
  • If the second parameter is true -- nextToken(null, delims)


nextToken

public java.lang.String nextToken(java.lang.String nontokenDelims)
Returns the next token in this string tokenizer's string.

Is equivalent to nextToken(nontokenDelims, null).


indexOfNextDelimiter

private int indexOfNextDelimiter(int start)
Similar to String.indexOf(String, int), but looks for any character from the delimiter string rather than the entire string.
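
The difference from a substring search can be sketched with a standalone helper. The name indexOfAny and its explicit text and delims parameters are hypothetical; the private method documented here takes only a start index and presumably reads the text and delimiters from the object's fields.

```java
// Hypothetical helper, shown for illustration only.
final class IndexOfAny {
    // Index of the first character at or after start that occurs in delims, or -1.
    static int indexOfAny(String text, String delims, int start) {
        for (int i = Math.max(start, 0); i < text.length(); i++) {
            if (delims.indexOf(text.charAt(i)) >= 0) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(indexOfAny("a,b;c", ",;", 0)); // 1: the ',' matches
        System.out.println(indexOfAny("a,b;c", ",;", 2)); // 3: the ';' matches
        System.out.println("a,b;c".indexOf(",;", 0));     // -1: no ",;" substring
    }
}
```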


hasMoreElements

public boolean hasMoreElements()
Returns the same value as the hasMoreTokens() method. It exists so that this class can implement the Enumeration interface.

Specified by:
hasMoreElements in interface java.util.Enumeration

nextElement

public java.lang.Object nextElement()
Returns the same value as the nextToken() method, except that its declared return value is Object rather than String. It exists so that this class can implement the Enumeration interface.

Specified by:
nextElement in interface java.util.Enumeration
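
Implementing Enumeration means a tokenizer can be handed to any API that consumes one. The sketch below uses java.util.StringTokenizer, which also implements Enumeration, so that it runs without this package; that the class documented here can be substituted in the same way is an assumption. The class name EnumerationDemo and the helper collect() are hypothetical.

```java
import java.util.Collections;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class, shown for illustration only.
final class EnumerationDemo {
    static List<Object> collect(String text) {
        // Collections.list drains any Enumeration into an ArrayList.
        return Collections.list(new StringTokenizer(text));
    }

    public static void main(String[] args) {
        System.out.println(collect("a b c")); // [a, b, c]
    }
}
```

The elements are typed as Object, which is why nextElement() is declared to return Object rather than String.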

hasNext

public boolean hasNext()
Returns the same value as the hasMoreTokens() method. It exists so that this class can implement the Iterator interface.

Specified by:
hasNext in interface java.util.Iterator

next

public java.lang.Object next()
Returns the same value as the nextToken() method, except that its declared return value is Object rather than String. It exists so that this class can implement the Iterator interface.

Specified by:
next in interface java.util.Iterator

remove

public void remove()
This implementation always throws UnsupportedOperationException. It exists so that this class can implement the Iterator interface.

Specified by:
remove in interface java.util.Iterator

setReturnEmptyTokens

public void setReturnEmptyTokens(boolean returnEmptyTokens)
Set whether empty tokens should be returned from this point in the tokenizing process onward.

Empty tokens occur when two delimiters are next to each other or a delimiter occurs at the beginning or end of a string. If empty tokens are set to be returned, and a comma is the nontoken delimiter, the following table shows how many tokens are in each string.

String          Number of tokens
"one,two"       2 - normal case with no empty tokens.
"one,,three"    3 including the empty token in the middle.
"one,"          2 including the empty token at the end.
",two"          2 including the empty token at the beginning.
","             2 including the empty tokens at the beginning and the end.
""              1 - all strings will have at least one token if empty tokens are returned.
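
The same counts can be reproduced with the JDK alone by using String.split with a negative limit, which keeps trailing empty strings. This is a cross-check of the counts, not a description of how this class computes them; the class name EmptyTokenCounts and the helper count() are hypothetical.

```java
// Hypothetical demo class, shown for illustration only.
final class EmptyTokenCounts {
    static int count(String s) {
        // A negative limit keeps trailing empty strings in the result.
        return s.split(",", -1).length;
    }

    public static void main(String[] args) {
        String[] inputs = { "one,two", "one,,three", "one,", ",two", ",", "" };
        for (String s : inputs) {
            System.out.println("\"" + s + "\" -> " + count(s));
        }
    }
}
```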


getCurrentPosition

public int getCurrentPosition()
Get the index of the character immediately following the end of the last token. This is the position at which this tokenizer will begin looking for the next token when a nextToken() method is invoked.


toArray

public java.lang.String[] toArray()
Retrieve all of the remaining tokens in a String array. This method uses the options that are currently set for the tokenizer and will advance the state of the tokenizer such that hasMoreTokens() will return false.
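
With java.util.StringTokenizer, which has no such method, the equivalent has to be written by hand. The sketch below shows the contract described above (only the remaining tokens are collected, and the tokenizer ends up exhausted); the class name DrainDemo and the helper drain() are hypothetical.

```java
import java.util.StringTokenizer;

// Hypothetical helper, shown for illustration only.
final class DrainDemo {
    static String[] drain(StringTokenizer st) {
        // countTokens() reports how many tokens remain from the current position.
        String[] out = new String[st.countTokens()];
        for (int i = 0; i < out.length; i++) {
            out[i] = st.nextToken();
        }
        return out;
    }

    public static void main(String[] args) {
        StringTokenizer st = new StringTokenizer("a b c d");
        st.nextToken();                                  // consume "a" first
        System.out.println(String.join(",", drain(st))); // b,c,d
        System.out.println(st.hasMoreTokens());          // false
    }
}
```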


restOfText

public java.lang.String restOfText()
Retrieves the rest of the text as a single token. After calling this method hasMoreTokens() will always return false.


peek

public java.lang.String peek()
Returns the same value as nextToken() but does not alter the internal state of the Tokenizer. Subsequent calls to peek() or a call to nextToken() will return the same token again.
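
Where a tokenizer or iterator offers no peek(), the same contract can be layered on top with a one-element lookahead buffer. This is a different technique from the one described above, since the wrapper does advance the underlying source and hides that behind the buffer. The class name PeekingTokens is hypothetical.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical wrapper, shown for illustration only.
final class PeekingTokens implements Iterator<String> {
    private final Iterator<String> source;
    private String buffered;
    private boolean hasBuffered;

    PeekingTokens(Iterator<String> source) {
        this.source = source;
    }

    // Returns the next token without consuming it; repeated calls return the same token.
    String peek() {
        if (!hasBuffered) {
            buffered = source.next(); // throws NoSuchElementException when exhausted
            hasBuffered = true;
        }
        return buffered;
    }

    @Override
    public boolean hasNext() {
        return hasBuffered || source.hasNext();
    }

    @Override
    public String next() {
        String token = peek();
        hasBuffered = false;
        buffered = null;
        return token;
    }

    public static void main(String[] args) {
        PeekingTokens tokens = new PeekingTokens(Arrays.asList("a", "b").iterator());
        System.out.println(tokens.peek()); // a
        System.out.println(tokens.peek()); // a
        System.out.println(tokens.next()); // a
        System.out.println(tokens.next()); // b
    }
}
```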