Performance Issues (StringTokenizer)

29 November 2007

StringTokenizer is a useful class that developers often reach for when parsing text. It works well and produces the required results.

StringTokenizer st = new StringTokenizer(s,",");


StringTokenizer is flexible and can handle a whole set of delimiter characters at once. This makes it powerful but a little slow. What if we have only a single-character delimiter? Should we still use StringTokenizer? If we want to make life easy, the answer is yes. But if we are concerned about performance, the answer is no.
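To make the "set of delimiters" point concrete, here is a minimal sketch. The class name DelimiterSetDemo, the helper tokens() and the sample input are made up for illustration; only StringTokenizer and its hasMoreTokens()/nextToken() methods come from the JDK. Every character in the second constructor argument is treated as a separator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class, not part of the original post.
final class DelimiterSetDemo {

    // Collects all tokens of text, treating each character of delims as a separator.
    static List<String> tokens(String text, String delims) {
        List<String> out = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(text, delims);
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        // Comma, semicolon and space all act as delimiters here.
        System.out.println(tokens("Australia,Germany;Austria France", ",; "));
        // prints [Australia, Germany, Austria, France]
    }
}
```

This flexibility is what the tokenizer pays for on every character it inspects.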

Using StringTokenizer to get tokens:

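		// needs: import java.util.StringTokenizer;
		String s = "Australia, Germany, Austria, France";
		StringTokenizer st = new StringTokenizer(s, ",");

		// nextToken() never returns null; it throws NoSuchElementException
		// once the tokens run out, so let hasMoreTokens() end the loop
		while (st.hasMoreTokens())
		{
			String sub = st.nextToken();
			// code goes here
			System.out.println(sub);
		}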

Our own code to get tokens:

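		String s = "Australia, Germany, Austria, France";
		String sub = null;
		int i = 0;                  // start of the current token
		int j = s.indexOf(',');     // position of the next delimiter
		while (j >= 0)
		{
			sub = s.substring(i, j);
			// code goes here
			i = j + 1;              // skip past the delimiter
			j = s.indexOf(',', i);
		}
		sub = s.substring(i);       // last token, after the final delimiter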

How do we find the end of a token? We simply make a single call to the indexOf method and get the end position of the token. The key point is that we know our delimiter is exactly one character long, so we coded accordingly. StringTokenizer, on the other hand, accepts a whole set of delimiter characters, so it calls indexOf() for every character in the string in order to decide whether that character belongs to the set of delimiters.
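The hand-written loop can be wrapped in a small helper so the parsing stays out of the calling code. The following is only a sketch: the class SingleCharSplitter and its split() method are hypothetical names, not anything from the JDK. Note that, unlike StringTokenizer, this version keeps empty tokens between adjacent delimiters, and it does not trim the space that follows each comma in the sample string.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, assuming the delimiter is always a single character.
final class SingleCharSplitter {

    // Splits s on every occurrence of delimiter, using one indexOf call per token.
    static List<String> split(String s, char delimiter) {
        List<String> tokens = new ArrayList<>();
        int start = 0;                      // start of the current token
        int end = s.indexOf(delimiter);     // position of the next delimiter
        while (end >= 0) {
            tokens.add(s.substring(start, end));
            start = end + 1;                // skip past the delimiter
            end = s.indexOf(delimiter, start);
        }
        tokens.add(s.substring(start));     // last token, after the final delimiter
        return tokens;
    }

    public static void main(String[] args) {
        for (String token : split("Australia, Germany, Austria, France", ',')) {
            // Tokens after the first keep their leading space; trim() if needed.
            System.out.println(token.trim());
        }
    }
}
```

Whether this is actually faster than StringTokenizer on a given JVM and input is something to measure rather than assume.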

So the rule is: flexible functionality is powerful but slow, whereas inflexible functionality is efficient.

2 Responses to “Performance Issues (StringTokenizer)”

  1. Ivan Says:

    even if you’re concerned with performance, this logic should be encapsulated because it looks ugly when it is intermixed with business logic. when you use the tokenizer at least readability is much better. in 99% of situations i would choose readability.

  2. António Pereira Says:

    The split method of the String class does it in only one line of code:

    String[] tokens = s.split(",");
