Sunday, October 5, 2008

StringBuilder is NOT always faster than string concatenation

While reading through Effective Java, 2nd Edition (I'll publish a positive review of it when I'm finished), I noticed one piece of advice that is correct, but that could mislead some readers into doing something inadvisable.

The advice is "Item 51: Beware the performance of string concatenation". This clearly points out the key to the problem, which is that if you end up creating lots of strings in a loop, it will be slow. This is as opposed to simply appending existing strings to an existing StringBuilder object, which will be much more efficient. I'm fine with all that.
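To make the case Item 51 warns about concrete, here is a minimal sketch of the loop pattern next to its StringBuilder equivalent. The class and method names (LoopConcat, joinWithPlus, joinWithBuilder) are made up for illustration and are not taken from the book.

```java
class LoopConcat
{
    // Repeated += in a loop: every pass builds a brand new String
    // that copies everything accumulated so far.
    static String joinWithPlus(String[] values)
    {
        String result = "";
        for (String value : values)
        {
            result += value + ":";
        }
        return result;
    }

    // One StringBuilder reused across the whole loop: each pass only appends.
    static String joinWithBuilder(String[] values)
    {
        StringBuilder builder = new StringBuilder();
        for (String value : values)
        {
            builder.append(value).append(':');
        }
        return builder.toString();
    }

    public static void main(String[] args)
    {
        String[] values = { "abc", "def", "ghi" };
        System.out.println(joinWithPlus(values));
        System.out.println(joinWithBuilder(values));
    }
}
```

Both methods return the same string. The difference only starts to matter as the number of iterations grows, because the first version re-copies the accumulated text on every pass.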

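My problem with this section is that it doesn't clearly point out that ordinary string concatenation actually uses StringBuilder under the covers. That's right. When you write a concatenation expression with "+", the compiler emits code that creates a StringBuilder, appends each piece, and calls toString(). A single concatenation expression therefore does the same work as the hand-written StringBuilder version.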

If you're not convinced, we can look at two very simple Java classes, and we'll use a nice bytecode visualization plugin for Eclipse called "Bytecode Outline", which will show you that they truly are doing the same thing.

The two sample classes are these:
package stringbuilder;
public class UsesPlus
{
    public static void main(String[] args)
    {
        String[] values = { "abc", "def", "ghi" };
        System.out.println("string[" +
                           values[0] + ":" +
                           values[1] + ":" +
                           values[2] + "]");
    }
}

And:
package stringbuilder;
public class UsesStringBuilder
{
    public static void main(String[] args)
    {
        String[] values = { "abc", "def", "ghi" };
        System.out.println((new StringBuilder("string[")).
                           append(values[0]).append(":").
                           append(values[1]).append(":").
                           append(values[2]).append("]").
                           toString());
    }
}

After installing the Bytecode Outline plugin, the best way to visualize the difference is to select both classes in the Package Explorer, then right-click and select "Compare With" and then "Each Other Bytecode". This really just generates the bytecode for both and then gives you an ordinary text comparison view to compare them. For now, I'll just present a meaningful excerpt of the bytecode for each. Both excerpts start where the "string[" constant is loaded; the remaining "append" calls, up to the "toString()" call, follow the same pattern.

Here's the relevant bytecode for "UsesPlus.java" (both samples show long lines wrapped with "\" for viewability):
    LDC "string["
    INVOKESPECIAL java/lang/StringBuilder.\
        <init>(Ljava/lang/String;)V
   L2 (22)
    LINENUMBER 10 L2
    ALOAD 1
    ICONST_0
    AALOAD
    INVOKEVIRTUAL java/lang/StringBuilder.\
        append(Ljava/lang/String;)Ljava/lang/StringBuilder;
    LDC ":"
    INVOKEVIRTUAL java/lang/StringBuilder.\
        append(Ljava/lang/String;)Ljava/lang/StringBuilder;

Here's the similar block for "UsesStringBuilder.java":
    LDC "string["
    INVOKESPECIAL java/lang/StringBuilder.\
        <init>(Ljava/lang/String;)V
    ALOAD 1
    ICONST_0
    AALOAD
    INVOKEVIRTUAL java/lang/StringBuilder.\
        append(Ljava/lang/String;)Ljava/lang/StringBuilder;
    LDC ":"
    INVOKEVIRTUAL java/lang/StringBuilder.\
        append(Ljava/lang/String;)Ljava/lang/StringBuilder;


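Notice that they are almost completely the same. The only difference is the extra label and LINENUMBER entry in the first listing, which is debugging information rather than an executed instruction.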
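The same translation also explains why the loop case from Item 51 really is slow. The sketch below places a "+=" loop next to a hand-written approximation of what a StringBuilder-based compiler produces for it. The expansion is an assumption for illustration (the exact constructor and append calls vary between compilers), and the class and method names are hypothetical.

```java
class LoopExpansion
{
    // What the source looks like.
    static String withPlus(String[] values)
    {
        String result = "";
        for (String value : values)
        {
            result += value;
        }
        return result;
    }

    // Roughly what the compiler turns it into: a fresh StringBuilder,
    // a full copy of result, and a new String on every pass.
    static String expandedByHand(String[] values)
    {
        String result = "";
        for (String value : values)
        {
            result = new StringBuilder().append(result).append(value).toString();
        }
        return result;
    }

    public static void main(String[] args)
    {
        String[] values = { "abc", "def", "ghi" };
        System.out.println(withPlus(values).equals(expandedByHand(values)));
    }
}
```

In the loop, the cost comes from creating a new builder and copying the accumulated text each time around, which is exactly what hoisting a single StringBuilder out of the loop avoids. In a single expression there is only one builder either way, so there is nothing to hoist.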

The moral here is, don't apply good advice in the wrong places.
