string tokenization
1. String Tokenization
Overview
StringTokenizer class
Some StringTokenizer methods
StringTokenizer examples
2. StringTokenizer class
A token is a portion of a string that is separated from another
portion of that string by one or more chosen characters (called
delimiters).
Example: Assuming that a white-space character (i.e., blank, '\n'
(new line), '\t' (tab), or '\r' (carriage return)) is a delimiter, the
string "I like KFUPM very much" has the tokens "I", "like",
"KFUPM", "very", and "much".
The StringTokenizer class contained in the java.util package can be
used to break a string into separate tokens. This is particularly useful
in those situations in which we want to read and process one token
at a time; the BufferedReader class does not have a method to read
one token at a time.
The StringTokenizer constructors are:
StringTokenizer(String str)
    Uses white-space characters as delimiters. The delimiters are not returned.
StringTokenizer(String str, String delimiters)
    delimiters is a string that specifies the delimiters. The delimiters are not returned.
StringTokenizer(String str, String delimiters, boolean delimAsToken)
    If delimAsToken is true, then each delimiter is also returned as a token;
    otherwise delimiters are not returned.
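The three constructors can be compared side by side. The following is a minimal sketch, not part of the original slides; the class name ConstructorDemo, the helper drain, and the sample strings are chosen here for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class ConstructorDemo
{
    // Hypothetical helper: collects every remaining token into a list.
    static List<String> drain(StringTokenizer tokenizer)
    {
        List<String> tokens = new ArrayList<>();
        while (tokenizer.hasMoreTokens())
            tokens.add(tokenizer.nextToken());
        return tokens;
    }

    public static void main(String[] args)
    {
        // One argument: white space separates the tokens.
        System.out.println(drain(new StringTokenizer("I like KFUPM")));
        // prints [I, like, KFUPM]

        // Two arguments: '+' and '-' separate the tokens and are discarded.
        System.out.println(drain(new StringTokenizer("10+20-30", "+-")));
        // prints [10, 20, 30]

        // Three arguments with true: each delimiter comes back as its own token.
        System.out.println(drain(new StringTokenizer("10+20-30", "+-", true)));
        // prints [10, +, 20, -, 30]
    }
}
```

The third form is handy when the delimiters carry meaning, for example the operators of a simple arithmetic expression.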
3. Some StringTokenizer methods
Some StringTokenizer methods are:
int countTokens( )
    Using the current set of delimiters, returns the number of tokens left.
boolean hasMoreTokens( )
    Returns true if one or more tokens remain in the string; otherwise returns false.
String nextToken( ) throws NoSuchElementException
    Returns the next token as a string. Throws an exception if there are no more tokens.
String nextToken(String newDelimiters) throws NoSuchElementException
    Sets the delimiters to newDelimiters, then returns the next token as a string.
    The new delimiters stay in effect for later calls. Throws an exception if
    there are no more tokens.
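As an illustration of these methods, the sketch below switches delimiters in the middle of a string and then asks how many tokens are left. The class MethodsDemo, its method names, and the record format "name:grade grade ..." are assumptions made for this example, not part of the original slides.

```java
import java.util.NoSuchElementException;
import java.util.StringTokenizer;

public class MethodsDemo
{
    // Hypothetical helper: reads a name, the first grade, and the number of
    // grades still unread from a record such as "Ali:85 90 75".
    static String summarize(String record)
    {
        StringTokenizer tokenizer = new StringTokenizer(record, ":");
        String name = tokenizer.nextToken();
        // Change the delimiters. ':' is kept in the new set because the
        // tokenizer is still positioned at the ':' that ended the name.
        String firstGrade = tokenizer.nextToken(": ");
        int remaining = tokenizer.countTokens();
        return name + " " + firstGrade + " " + remaining;
    }

    // Shows that nextToken() throws once no tokens are left.
    static boolean throwsWhenExhausted()
    {
        StringTokenizer tokenizer = new StringTokenizer("");
        try
        {
            tokenizer.nextToken();
            return false;
        }
        catch (NoSuchElementException e)
        {
            return true;
        }
    }

    public static void main(String[] args)
    {
        System.out.println(summarize("Ali:85 90 75"));   // prints Ali 85 2
        System.out.println(throwsWhenExhausted());       // prints true
    }
}
```

Checking hasMoreTokens( ) before each call to nextToken( ) avoids the exception altogether.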
To break a string into tokens, a loop having one of the following
forms may be used:
StringTokenizer tokenizer = new StringTokenizer(stringName);
while(tokenizer.hasMoreTokens( ))
{
String token = tokenizer.nextToken( );
// process the token
. . .
}
4. StringTokenizer examples
StringTokenizer tokenizer = new StringTokenizer (stringName);
int tokenCount = tokenizer.countTokens( );
for(int k = 1; k <= tokenCount; k++)
{
String token = tokenizer.nextToken( );
// process token
. . .
}
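Note that the for-loop form saves countTokens( ) in a variable before the loop starts. The sketch below, with the hypothetical class name CountLoopDemo, suggests why: the count shrinks as tokens are consumed, so testing it afresh in the loop condition stops the loop early.

```java
import java.util.StringTokenizer;

public class CountLoopDemo
{
    // Faulty variant: countTokens() is re-evaluated on every iteration,
    // while k keeps growing, so the loop ends before all tokens are read.
    static int visitedWithLiveCount(String text)
    {
        StringTokenizer tokenizer = new StringTokenizer(text);
        int visited = 0;
        for (int k = 1; k <= tokenizer.countTokens(); k++)
        {
            tokenizer.nextToken();
            visited++;
        }
        return visited;
    }

    // Correct variant: the count is taken once, before any token is consumed.
    static int visitedWithSavedCount(String text)
    {
        StringTokenizer tokenizer = new StringTokenizer(text);
        int tokenCount = tokenizer.countTokens();
        int visited = 0;
        for (int k = 1; k <= tokenCount; k++)
        {
            tokenizer.nextToken();
            visited++;
        }
        return visited;
    }

    public static void main(String[] args)
    {
        String text = "We like KFUPM very much";
        System.out.println(visitedWithLiveCount(text));    // prints 3
        System.out.println(visitedWithSavedCount(text));   // prints 5
    }
}
```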
Example1:
import java.util.StringTokenizer;
public class Tokenizer1
{
public static void main(String[ ] args)
{
StringTokenizer wordFinder = new StringTokenizer
("We like KFUPM very much");
while( wordFinder.hasMoreTokens( ) )
System.out.println( wordFinder.nextToken( ) );
}
}
5. StringTokenizer examples (Cont’d)
Example2: The following program reads grades from the keyboard and
finds their average. The grades are read in one line.
import java.io.*;
import java.util.StringTokenizer;
public class Tokenizer5
{
public static void main(String[ ] args) throws IOException
{
BufferedReader stdin = new BufferedReader(new
InputStreamReader(System.in));
System.out.println("Enter grades in one line:");
String inputLine = stdin.readLine( );
StringTokenizer tokenizer = new StringTokenizer(inputLine);
int count = 0;
float grade, sum = 0.0F;
try {
while( tokenizer.hasMoreTokens( ) ) {
grade = Float.parseFloat( tokenizer.nextToken( ) );
if(grade >= 0 && grade <= 100)
{
sum += grade;
count++;
}
}
if(count > 0)
System.out.println("\nThe average = "+ sum / count);
else
System.out.println("No valid grades entered");
}
6. StringTokenizer examples (Cont’d)
catch(NumberFormatException e)
{
System.err.println("Error - an invalid float value read");
}
}
}
Example3: Given that a text file grades.txt contains ids and quiz grades
of students:
980000 50.0 30.0 40.0
975348 50.0 35.0
960035 80.0 70.0 60.0 75.0
950000 20.0 40.0
996245 65.0 70.0 80.0 60.0 45.0
987645 50.0 60.0
the program on the next slide will display the id, number of quizzes
taken, and average of each student:
7. StringTokenizer examples (Cont’d)
import java.io.*;
import java.util.StringTokenizer;
public class Tokenizer6
{
public static void main(String[ ] args) throws IOException
{
BufferedReader inputStream = new BufferedReader(new
FileReader("grades.txt"));
StringTokenizer tokenizer;
String inputLine, id;
int count;
float sum;
System.out.println("ID# Number of Quizzes Average\n");
while((inputLine = inputStream.readLine( )) != null)
{
tokenizer = new StringTokenizer(inputLine);
id = tokenizer.nextToken( );
count = tokenizer.countTokens( );
sum = 0.0F;
while( tokenizer.hasMoreTokens( ) )
sum += Float.parseFloat( tokenizer.nextToken( ) );
System.out.println(id + " " + count + " "
+ sum / count);
}
}
}
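Example3 assumes that every line of grades.txt holds an id followed by at least one grade. On a blank line nextToken( ) would throw NoSuchElementException, and for an id with no grades the average would print as NaN. One possible way to guard the per-line processing is sketched below; the class GradeLine, the method summarize, and the "n/a" placeholder are assumptions made for this sketch.

```java
import java.util.StringTokenizer;

public class GradeLine
{
    // Hypothetical helper: returns "id count average" for one input line,
    // or null when the line contains no tokens at all.
    static String summarize(String inputLine)
    {
        StringTokenizer tokenizer = new StringTokenizer(inputLine);
        if (!tokenizer.hasMoreTokens())
            return null;                    // blank line: nothing to report

        String id = tokenizer.nextToken();
        int count = tokenizer.countTokens();
        if (count == 0)
            return id + " 0 n/a";           // no quizzes: skip the division

        float sum = 0.0F;
        while (tokenizer.hasMoreTokens())
            sum += Float.parseFloat(tokenizer.nextToken());
        return id + " " + count + " " + sum / count;
    }

    public static void main(String[] args)
    {
        System.out.println(summarize("980000 50.0 30.0 40.0"));  // prints 980000 3 40.0
        System.out.println(summarize("975348 50.0 35.0"));       // prints 975348 2 42.5
        System.out.println(summarize("950000"));                 // prints 950000 0 n/a
    }
}
```

In the file-reading loop of Example3, each line would be passed to summarize and the result printed only when it is not null.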