Thursday, November 6, 2008

Book Review: Clean Code: A Handbook of Agile Software Craftsmanship

I recently noticed (and quickly read) another book from the prolific writer Robert C. Martin, "Uncle Bob" to people who know him in the industry. The book is titled Clean Code: A Handbook of Agile Software Craftsmanship. Although "Agile" appears in the title, it's not really a book about Agile software development, though it certainly describes practices you would use while applying Agile techniques. It's really much more about software craftsmanship and refactoring. I'll tell you a little more about the book in this review. Note that all of the examples are in Java, but very little content in the book references anything past "core Java", except perhaps some references to collection classes and concurrency (although the basics of concurrency are not really Java-specific), so a C# developer would still get a lot of benefit from this book.

Note that although I really liked the theme and content of the book, I don't necessarily agree with everything he advises. You shouldn't either, but where you do disagree, make sure it's for the right reasons. You should never follow rules and guidelines if you don't know why you're following them, or what benefit they provide. If you read the book with that attitude, you will appreciate it for what it is: a very careful and thoughtful analysis of how to write clean code.

What I'm reviewing is the online Safari version of the book, published August 1st, 2008.

Although RCM has the sole author credit, several chapters say right at the beginning "by ...", with names like Tim Ottinger, Michael Feathers, James Grenning, Jeff Langr, Dr. Kevin Dean Wampler, and Brett L. Schuchert, so I imagine this is more of a collaborative work, although it does not detract from the theme or style of the book.

The book has 17 chapters of content. I usually don't consider appendices part of the content of the book, but this book has a very good appendix titled "Concurrency II", which is sort of a "sequel" to the earlier "Concurrency" chapter in the book. I'll talk more about this appendix a little later.

The first chapter of the book is titled "Clean Code", which really sets the theme for the entire book (it's in the title, of course). Whether it's talking about good naming, formatting, coupling/cohesion, or testing, the object is to produce clean code, and to reap the benefits of such behavior. This chapter reviews the basic principles of what "clean code" means without getting into code yet.

The next chapter, "Meaningful Names" (by Tim Ottinger), covers one of the most important intangible skills required for effective software development, being capable enough in the English language (and typing, I suppose) to define names that effectively convey the true meaning and intent of a symbol, without requiring extensive comments to explain the meaning of the name (more about that anti-pattern later). This is a fundamental skill required for writing clean code.

Following this is the chapter called "Functions", which covers function structure and high-level design goals, like "Do one thing" and "One level of abstraction", along with issues that the function interface presents to its clients that can add to or detract from the quality of the product.
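
As a rough sketch of my own (the payroll scenario and names are hypothetical, not the book's), "do one thing" at "one level of abstraction" might look like this:
public class PayrollReport
{
    // Does several things, at different levels of abstraction.
    public void printPayMixed(double hours, double rate)
    {
        if ((hours < 0) || (rate < 0))
            throw new IllegalArgumentException("negative input");
        double pay = hours * rate;
        System.out.println("Pay: " + pay);
    }

    // Each function below does one thing, at one level of abstraction.
    public void printPay(double hours, double rate)
    {
        validate(hours, rate);
        report(calculatePay(hours, rate));
    }

    private void validate(double hours, double rate)
    {
        if ((hours < 0) || (rate < 0))
            throw new IllegalArgumentException("negative input");
    }

    private double calculatePay(double hours, double rate)
    {
        return (hours * rate);
    }

    private void report(double pay)
    {
        System.out.println("Pay: " + pay);
    }
}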

The next chapter, titled "Comments", presents a theme that I've felt very strongly about for a long time, in that bad comments are much worse than no comments, and good understandable code with no comments at all is a good thing. The chapter explores various aspects of this.
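
A contrived example of my own of what that means in practice (the Order class here is hypothetical):
public class Order
{
    private double total;
    private boolean priority;

    // Comment-dependent version: the comment explains what the test means,
    // and will silently go stale if the rule ever changes.
    public boolean check()
    {
        // free shipping for priority orders over fifty dollars
        return (priority && (total > 50.0));
    }

    // Self-explanatory version: the name says it, so no comment is needed.
    public boolean qualifiesForFreeShipping()
    {
        return (priority && (total > 50.0));
    }
}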

After this is the chapter titled "Formatting", which emphasizes that formatting is very important, but that the actual specification of how code is formatted matters less than consistent formatting within a development group. Fortunately, modern tool and IDE support makes it easier to set up and follow agreed-upon formatting guidelines. This chapter also talks a bit about average file length, calling this "vertical formatting".

The next chapter on "Objects and Data Structures" starts with points about proper interface abstraction. It then makes an interesting point about the difference between objects and data structures, in that the latter exposes data, but no functions, and the former exposes functions, but no data. In addition, it points out that between procedural code (using data structures) and object-oriented code, some changes are easier in procedural code, and others are easier in object-oriented code. There is also discussion of the Law of Demeter, which helps to reduce coupling between modules.

The "Exception Handling" chapter after this, by Michael Feathers, gives good advice on defining and using exceptions, and how to handle situations with "null" that can help reduce boilerplate special-case handlers.

The following chapter, "Boundaries" by James Grenning, refers to things beyond the boundaries of our code and application, mostly third-party packages that we integrate our code with. The most important point it makes is to insulate the rest of your code from fragility in that boundary by defining facades and interfaces that allow for changes past the boundaries to not adversely impact the rest of the application.

Next, the "Unit Tests" chapter gets into lots of details about how to write clean tests, but the underlying point (emphasized in the conclusion) is that the code in your tests is just as important as the code being tested, and that clean and understandable tests need to be there as part of the complete package.

After this is the "Classes" chapter, with (as opposed to "by") Jeff Langr, which brings up issues in class organization and structure that help to make the resulting classes easier to understand and more amenable to changes.

Next is the "Systems" chapter, by Dr. Kevin Dean Wampler. This chapter talks about some architecture issues above classes and functions. For instance, it talks about using dependency injection or lazy initialization to separate the setup of a system from its execution. Next, we get into separation of concerns and using proxies and aspect-orientation to implement cross-cutting concerns the right way.

Following this is the "Emergence" chapter, by Jeff Langr. This covers Kent Beck's four rules of "Simple Design", which the author believes contributes to the "emergence" of good design. When you read the information on the first principle, "Runs All the Tests", it really seems like it would be better named "Is Fully Testable". Whatever it's called, the section is clear on the importance of this. The next three principles, named "No Duplication", "Expressive", and "Minimal Classes and Methods" are introduced in the book by summarizing that all of these are achieved through the practice of careful refactoring, which may be an even more important lesson than the principles themselves.

The next chapter, titled "Concurrency", by Brett L. Schuchert, is one of the longest chapters in the book, and deservedly so. Code implementing concurrent algorithms can be extremely difficult to understand and maintain if principles of clean coding are not observed. Even considering the length of this chapter, it is actually only the first of two chapters on concurrency. The second part is one of the appendices, as opposed to being part of the regular content. Perhaps the authors felt it was getting too far out of scope, I don't know. The most important advice presented in this chapter is to apply the "Single Responsibility Principle", particularly when you're tempted to write code that implements both concurrency mechanics and business logic. This is where the "java.util.concurrent" package provides a lot of advantages, as it encapsulates the details of concurrency, allowing you to plug in POJOs that concern themselves only with business logic. This chapter also mentions an intriguing tool from IBM called "ConTest" that takes an aggressive approach to testing concurrent code, introspecting and instrumenting it to accentuate concurrency vulnerabilities. This is the first I've heard of this tool, but if and when I need to test concurrent code, I will definitely make use of it.
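
Going back to the point about plugging business-logic POJOs into java.util.concurrent, here is a small sketch of my own of that separation, where the POJO knows nothing about threads and the ExecutorService knows nothing about invoices (the invoice scenario is hypothetical):
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ConcurrencySeparationExample
{
    // Pure business logic; easy to test with no threads involved.
    static class InvoiceProcessor implements Runnable
    {
        private final int invoiceId;

        InvoiceProcessor(int invoiceId)
        {
            this.invoiceId = invoiceId;
        }

        public void run()
        {
            System.out.println("Processing invoice " + invoiceId);
        }
    }

    public static void main(String[] args) throws InterruptedException
    {
        // All of the concurrency policy (pool size, scheduling) lives here.
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int id = 1; id <= 10; id++)
            pool.execute(new InvoiceProcessor(id));
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
    }
}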

The next chapter, "Successive Refinement" is the first of three chapters where we get to see much of the advice that came before being put to use. These chapters present several "deep dives" into refactoring exercises with specific code samples (mostly taken from real code bases, some written by the author himself). The refactoring steps are very clear and detailed, although it sometimes even becomes hard to follow the detail. This would be a great situation to execute the same steps yourself with the original code base, not just to follow the details, but to help absorb the techniques into your common practices. As is normal in refactoring, some of the later steps were to rework code written in previous steps. Some of the examples mention the Clover code coverage tool, which the author uses to analyze the code coverage of his tests. The second of these three chapters even examines refactoring of JUnit tests, which emphasizes the fact that tests are just as important as the production code.

The last regular content chapter of the book, "Smells and Heuristics", is essentially a dictionary of the numerous principles referenced throughout the book. It is divided into sections titled "Comments", "Environment", "Functions", "General", "Java", "Names", and "Tests" (they appear in alphabetical order in the book as well). Each principle in each section is numbered, and references to these principles throughout the book are abbreviated like "N4", meaning the fourth principle in the Names section. Although each reference to these principles explains why the principle applies, citing examples where applicable, the dictionary entries also have examples of their own, further supporting the advice.

As mentioned previously, Appendix A is titled "Concurrency II", by Brett L. Schuchert (again), and it explores more issues with concurrency. Curiously, comparing the content of the two, this one covers some of the same concepts, but in more detail and with more examples than the earlier chapter. The appendix is perhaps as long as or longer than the first chapter on concurrency. It also shows examples using generated bytecode, which helps to illustrate some of the issues a little better.

All in all, a concise (it is only 464 pages, after all) book on refactoring and the principles needed to achieve clean code. It's definitely a book you should read if this is important to you. Just read it with an open mind and you will learn the things you didn't know, strengthen the principles you were already familiar with, and perhaps come to better appreciate the principles you follow that conflict with the author's advice (few, most likely).

Wednesday, October 22, 2008

Book Review: Web Service Contract Design & Versioning for SOA

My wife will see any movie starring Sean Connery, no questions asked, no review required. I, on the other hand, will read anything Thomas Erl publishes, although I think I will at least limit that promise to the domain I've seen his work in, which is service-oriented architecture and web services. In that domain, the books he has published are unique in their ability to sharply focus on the important and relevant details that I need to know about. He has continued his outstanding work in the new book Web Service Contract Design & Versioning for SOA, published by Prentice-Hall in October of 2008.

This book is different in one way from the other Thomas Erl books I've read. Instead of just Thomas Erl in the credits, there are actually 9 different authors, with Thomas Erl being the "lead" author (I would assume). I suppose this is somewhat understandable, as this is also the largest book in this series that I've read (718 pages of non-appendix content), so I won't apologize for the length of this review. Considering the range of topics covered, I give them credit for limiting the scope of the book to what they did, letting them focus strictly on issues with the design of the web service contract. For instance, if the scope of the book had been extended to give a good treatment of WS-Security, it would have expanded into areas that would detract from the main point of the book. As in the other purely SOA and web-service focused books in the series, this book carefully avoids talking about any implementations of these specifications (those are planned for later books), to keep the focus on the design of the contract.

The book covers contract design and versioning, divided into three main content sections, although the third section, on versioning, is small compared to the two sections on contract design, which are split between "fundamental" and "advanced" contract design.

The book uses a fictional case study throughout to help us focus on making it "real". These case study sidebars are sometimes quite detailed, covering design decisions and options like those we'll probably have to make in our own services.

The first section, "Fundamental Service Contract Design", starts with a foundation seen elsewhere in the "Principles" (SOA - Principles of Service Design) and "Concepts" (SOA - Concepts, Technology, and Design) books in the series. Most important here are the "design principles" (Abstraction, Loose Coupling, etc.) which contribute towards the specific goals and benefits of "service-oriented computing".

This section continues with a chapter that breaks down the various pieces of a WSDL and explains their purposes and relationships, from a high level (no syntax yet). This chapter also introduces the fact that both WSDL 1.1 (what virtually all of us use now) and WSDL 2.0 (anyone out there?) will be covered in parallel in the book. Each section that talks about an issue specific to one version will do the same for the other version. In most cases, this is done clearly to prevent confusion. This chapter also introduces WS-Policy and WS-I, in general.

This is followed by a concise chapter on namespaces and prefixes, and the various issues these present in this domain. It goes into detail about why a namespace is a URI, and not a URL, but also covers the tradeoffs of treating it like a URL.

The next chapter gets into the syntax of XML Schema, going into depth with the details of defining types and messages. I liked that some features of XML Schema are de-emphasized, or not even mentioned, if they tend to be problematic in real implementations. I learned about how the "elementFormDefault" attribute really works (whether local elements are assumed to be in the target namespace). I was misinformed about this previously. This chapter has a long case study sidebar using the ideas presented in the chapter.

The next two chapters begin the fundamentals of WSDL design, down to the syntax level. The two chapters are divided by "Abstract" and "Concrete" portions of the contract. Besides the syntax, these chapters cover several common conventions and recommendations for naming and structure, from the WS-I specification, and just from common sense. These chapters focus on WSDL 1.1, but NOTE boxes are shown for places where WSDL 2.0 differs from 1.1.

The short chapter that follows focuses directly on WSDL 2.0 and what new features it provides over WSDL 1.1, and what new options it presents.

The next chapter begins detailed discussion of the first of the two WS-* specifications focused on in this book, the WS-Policy specification. It hints at the ability of the WS-Policy framework to extend the behaviors and semantics of many aspects of the contract, and to do this in a reusable way.

The next chapter, the last chapter in the first section, opens up the SOAP envelope, covering the various parts, with mentions of variances between SOAP 1.1 and SOAP 1.2. This chapter introduces the notion of how SOAP messages are represented while bouncing around the network between SOAP intermediaries, how information about the "role" of various nodes in the network is described, and which roles can or will process particular parts of the envelope.

Now begins the second section, titled "Advanced Service Contract Design". This section, with eight chapters, covers advanced design aspects of XML Schema and WSDL, along with WS-Policy and WS-Addressing. Each of these four topics gets two chapters. The first two chapters cover issues with flexibility and reusability, covering wildcards, "generic vs. specific" elements, include/import, key/keyref, and some other topics. The pair concludes with the usual case study examination of applying these ideas.

The next two chapters, on advanced WSDL design, focus initially on issues of modularity and extension, then cover designing asynchronous operations utilizing WS-Addressing and WS-Policy, and WS-BPEL extensions to WSDL. They then cover some miscellaneous topics, mostly around message dispatch challenges. Note that one section discusses an example using both the "addressing wsdl" and "addressing metadata" specifications at the same time, which, according to people I've consulted, does not work, as the two conflict with each other. The chapter continues with an interesting section on binding services to HTTP without SOAP. The case study at the end of the chapter expands on this idea.

The following two chapters fully explore WS-Policy and what you can do with it. WS-Policy is a small language, but these chapters show how very complex functionality can be built on top of it, perhaps to the point where it becomes hard to understand. A case study example using WS-Security is quite complex. The explanation of the difference between "wsp:Ignorable" and "wsp:Optional" is made clear with a simple example. The second of the two chapters focuses mostly on building custom policy assertions. This demonstrates a lot of the subtle power of WS-Policy, and how its use may expand as specifications and implementations advance over time.

The last two chapters of this section fully explore WS-Addressing. The simple use case of replacing the SOAPAction HTTP header with the "wsa:Action" SOAP header is barely the start of what you can do with it. It presents so many unusual possibilities, like allowing WSDLs and messages to specify asynchronous messages and dynamic routing paths, along with additional metadata for each "endpoint reference" to get carried along to each endpoint.

The last section of the book, with four chapters, covers the range of ideas in service contract versioning. It goes through a "fundamentals" chapter that defines terms and basic issues, then a pair of chapters on WSDL and XML Schema versioning, and then an "Advanced" chapter (miscellaneous issues) to end the book. Throughout this section, the authors assume that an enterprise uses a consistent versioning policy, though not necessarily the same one as other enterprises, so they classify the strategies as "strict", "flexible", and "loose", and the treatment of each issue explores how those strategies apply.

Overall, I was pleased with the content and level of detail in the book. Reading it motivated me to build some sample code in my primary application server, which led me down some very interesting paths and eventual discoveries. Although this led me to discover one minor point where the book content was disputable (mixing two different addressing specs), it led me to appreciate even more how complex the landscape is, which is probably why Thomas Erl saw the need to share the writing load for this latest volume in his series.

Thursday, October 16, 2008

Java class to test XPath queries in XML documents

Something we have to do more and more these days is run XPath queries on XML documents, either interactively or programmatically. Sometimes we need to test the queries we develop. Eclipse has several plugins in various states of quality that can do this for you. If none of them work well enough for you, you'll need a different approach.
I'll show here a simple Java class that uses the javax.xml.xpath.XPath class and the Apache Commons CLI package, in barely 100 lines of code, that you can wrap with a shell script to easily run XPath queries against arbitrary documents, specifying arbitrary namespaces.

Command-line usage

The "usage" string for the shell-script wrapper, "xpathfind", is the following:
xpathfind -p filepath {-x xpath}+ {-n pfx:ns}*

This means that you supply a single file path to a document, along with one or more XPath strings, and zero or more prefix and namespace pairs.

Example

Assuming we have the following "web.xml" file:
<web-app xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://java.sun.com/xml/ns/javaee" xmlns:web="http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd" xsi:schemaLocation="http://java.sun.com/xml/ns/javaee http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd" id="WebApp_ID" version="2.5">
    <display-name>mdkcarousel</display-name>
    <welcome-file-list>
        <welcome-file>index.html</welcome-file>
        <welcome-file>index.htm</welcome-file>
        <welcome-file>index.jsp</welcome-file>
        <welcome-file>default.html</welcome-file>
        <welcome-file>default.htm</welcome-file>
        <welcome-file>default.jsp</welcome-file>
    </welcome-file-list>
    <filter id="abc">
        <filter-name>No Caching Filter</filter-name>
        <filter-class>carousel.NoCachingFilter</filter-class>
    </filter>
    <filter-mapping>
        <filter-name>No Caching Filter</filter-name>
        <url-pattern>/*</url-pattern>
    </filter-mapping>
</web-app>

The following command line and resulting output shows some examples of what it can do:
% xpathfind -p etc/web.xml -n ":http://java.sun.com/xml/ns/javaee" -x "//:welcome-file[following-sibling::*/text()='index.htm']" -x "//:filter-name/text()" -x "//:filter/@id"
result = "index.html".
result = "No Caching Filter".
result = "abc".
This example demonstrates a common roadblock people run into when writing XPath strings: how to deal with the default namespace. The idea is simply to register the namespace with a blank prefix value. In the actual XPath string, you then have to write the colon with no prefix, as opposed to leaving out the prefix entirely.

Now let's see how we do this.

The Code

The XPathFind class I show here uses classes in package javax.xml.xpath for XPath processing, and package org.apache.commons.cli to process command-line arguments. The associated MapNamespaceContext class implements the javax.xml.namespace.NamespaceContext interface, which is used by the javax.xml.xpath.XPath class to specify prefix and namespace pairs to use during the processing of XPath queries. I'll list both of these classes without package or import statements for brevity. In practice, you should not use the default package.
Also, the "xpathfind" shell script is a simple wrapper around "java" to call the class and pass the command-line arguments. This version is customized for Cygwin on Windows. A version for "plain" Unix can easily be developed from this, and a Windows batch file is very straightforward to build (except for the annoyance of limited command-line arguments).

XPathFind.java

public class XPathFind
{
    public static void main(String[] args)
    {
        Option pathOption = new Option("p", "path", true, "Path to XML file");
        pathOption.setRequired(true);

        Option xpathOption = new Option("x", "xpath", true, "XPath expression to search for");
        xpathOption.setRequired(true);
        xpathOption.setArgs(Option.UNLIMITED_VALUES);

        Option namespaceOption = new Option("n", "namespace", true, "prefix:namespace to use in xpath");
        namespaceOption.setArgs(Option.UNLIMITED_VALUES);

        Options options = new Options();
        options.addOption(pathOption);
        options.addOption(xpathOption);
        options.addOption(namespaceOption);

        CommandLineParser parser = new PosixParser();

        try
        {
            CommandLine line = parser.parse(options, args);

            String filePath = line.getOptionValue("p");
            String[] xpaths = line.getOptionValues("x");
            String[] namespaces = line.getOptionValues("n");

            File file = new File(filePath);
            if (!file.exists() || !file.canRead() || !file.isFile())
            {
                System.out.println("File \"" + filePath + "\" is not a readable file.");
                usage(options);
                System.exit(1);
            }

            go(filePath, xpaths, namespaces);
        }
        catch (MissingOptionException ex)
        {
            usage(options);
            System.exit(1);
        }
        catch (ParseException ex)
        {
            ex.printStackTrace();
            System.exit(1);
        }
        catch (Exception ex)
        {
            ex.printStackTrace();
        }
    }

    private static void go(String filePath, String[] xpaths, String[] namespaces)
        throws FileNotFoundException, ParserConfigurationException, IOException, SAXException
    {
        // Parse the document with a namespace-aware DOM parser.
        DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
        domFactory.setNamespaceAware(true); // never forget this!
        DocumentBuilder builder = domFactory.newDocumentBuilder();
        Document doc = builder.parse(filePath);

        XPathFactory xpathFactory = XPathFactory.newInstance();
        XPath xpath = xpathFactory.newXPath();
        if ((namespaces != null) && (namespaces.length > 0))
            xpath.setNamespaceContext(new MapNamespaceContext(namespaces));

        // Compile and evaluate each XPath expression against the document.
        for (String xpathStr: xpaths)
        {
            try
            {
                XPathExpression xpathExpr = xpath.compile(xpathStr);

                String result = xpathExpr.evaluate(doc);
                System.out.println("result = \"" + result + "\".");
            }
            catch (XPathExpressionException ex)
            {
                System.out.println("Xpath \"" + xpathStr + "\" was invalid: " +
                                   ex.getCause().getMessage());
            }
        }
    }

    private static void usage(Options options)
    {
        HelpFormatter formatter = new HelpFormatter();
        formatter.printHelp("xpathfind -p filepath {-x xpath}+ {-n pfx:ns}*", options);
        System.out.println("Can specify one or more \"-x\" options, and zero or more \"-n\" options.");
    }
}

MapNamespaceContext.java

public class MapNamespaceContext implements NamespaceContext
{
    private Map<String, String> uriMap = new HashMap<String, String>();
    private Map<String, String> prefixMap = new HashMap<String, String>();

    public MapNamespaceContext(Map<String, String> uriMap)
    {
        this.uriMap = uriMap;

        for (String key: uriMap.keySet())
            prefixMap.put(uriMap.get(key), key);
    }

    public MapNamespaceContext(String[] colonPairs)
    {
        uriMap = new HashMap<String, String>();

        // Each "pfx:namespace" pair is split at the first colon.
        for (String colonPair: colonPairs)
        {
            int colonIndex = colonPair.indexOf(':');
            uriMap.put(colonPair.substring(0, colonIndex).trim(),
                       colonPair.substring(colonIndex + 1));
        }

        for (String key: uriMap.keySet())
            prefixMap.put(uriMap.get(key), key);
    }

    public String getNamespaceURI(String prefix)
    { return (uriMap.get(prefix)); }

    public String getPrefix(String namespaceURI)
    { return (prefixMap.get(namespaceURI)); }

    public Iterator getPrefixes(String namespaceURI)
    { return (null); } // not needed by this tool
}

xpathfind

#! /bin/bash
export JAVA_HOME=$(cygpath -u "$XPATHFIND_JAVA_HOME")
export PATH="$JAVA_HOME/bin:$PATH"

java -classpath "$XPATHFIND_HOME/lib/commons-cli-1.1.jar;$XPATHFIND_HOME/build/classes" xpathfind.XPathFind "$@"

Wednesday, October 8, 2008

Reformat generated getters/setters with Eclipse Monkey (Aptana Scripting)

I write Java code with Eclipse every day, and I find it to be an indispensable tool. However, I still find situations where I drop back to Emacs for things that Eclipse is just not flexible enough for. For instance, although the code formatting options in Eclipse are very thorough, there is at least one situation where I prefer to override what it generates for me.

Before I go any further, I'd like to point out that it doesn't matter if you agree with the formatting preference I'm going to describe. I'm not saying that to be difficult, I'm just saying that the facilities I want to describe to you will allow you to implement your own formatting preferences, or your own custom scripting, whatever you want. My formatting preference is just an example of something you could do.

What I'm going to describe is part of the Aptana Studio plugin for Eclipse. The facility is called "Eclipse Monkey", although the Eclipse project for this appears to be obsolete, and this is now totally under the Aptana domain. It's now generically called "Aptana Scripting". You can read about this (still with the "Eclipse Monkey" name) here.

The tool is inspired by the Mozilla Greasemonkey facility. It allows you to write and execute JavaScript in Eclipse that operates on the "object model" in Eclipse. The Eclipse Monkey site has examples, and when you install the Aptana plugin, you'll see lots of examples you can view in the "Scripts" view. The bad news is that the documentation for this tool leaves a great deal to be desired. You can do some useful things just by copying code from examples, but there's no obvious place to get information on the APIs you're using in that code. The transition from "Eclipse Monkey" to "Aptana Scripting" happened recently, so hopefully Aptana will make more of this information available.

The example I'm going to show you will reformat the generated code for the "Generate Getters/Setters" operation to reflect my preference. I prefer to format getters/setters in a very compact form, with each method on a single line, although my general bracing and spacing preferences are very different from this.

After installing the Aptana plugin, create a simple Java project. Create a folder at the root of the project called "scripts". In that folder, place the following script, calling it "formatGettersSetters.js" (or whatever you want to call it):
/*
 * Reformat the result of "Generate Getters/Setters" so each method is on a single
 * line. For instance, if the generated code was this:
 *
 *   public String getStuff()
 *   {
 *       return stuff;
 *   }
 *   public void setStuff(String stuff)
 *   {
 *       this.stuff = stuff;
 *   }
 *
 * The script will produce this:
 *
 *   public String getStuff() { return stuff; }
 *   public void setStuff(String stuff) { this.stuff = stuff; }
 *
 * Menu: Java > Format Getters/Setters
 * Kudos: David Karr
 * Key:M3+INSERT
 * License: EPL 1.0
 * DOM: http://download.eclipse.org/technology/dash/update/org.eclipse.eclipsemonkey.lang.javascript
 */

function main()
{
    var sourceEditor = editors.activeEditor;
    // make sure we have an editor
    if (sourceEditor === undefined)
    {
        valid = false;
        showError("No active editor");
    }
    else
    {
        var range = sourceEditor.selectionRange;
        var offset = range.startingOffset;
        var deleteLength = range.endingOffset - range.startingOffset;
        var source = sourceEditor.source;

        var selection = source.substring(range.startingOffset, range.endingOffset);
        // Find spaces between right paren and left brace
        var regex = /\)[ \r\t\f\n]*{/gm;
        selection = selection.replace(regex, ') {');
        // Find spaces between left brace and first non-space
        regex = /{[ \r\t\f\n]*/gm;
        selection = selection.replace(regex, '{ ');
        // Find spaces between last non-space and right-brace
        regex = /[ \r\t\f\n]*}/gm;
        selection = selection.replace(regex, ' }');
        sourceEditor.applyEdit(offset, deleteLength, selection);
        sourceEditor.selectAndReveal(offset, selection.length);
    }
}
Now, create a new Java class in your project. In the class, define three (for example) instance variables. Then, execute the "Generate Getters and Setters..." option from the Eclipse menu, however you like to do it. I use the keyboard shortcut, which is "Alt-Shift-s, r". Then do "Select All" (Alt-a) and click OK.

At this point, the generated getters/setters have been inserted into the editor, and the inserted region is selected. Without changing the selection, from the menubar select "Scripts", then "Java", then "Format Getters/Setters". This will replace the generated region with a reformatted version of it, with each method on a single line.

Again, this is just an example of the kinds of things you can do with this tool. As Aptana does more work on refining the tool and producing more documentation, this tool will become even more useful. For example, one facility that is available, but which I need more information about, is the ability to have scripts execute automatically when certain events occur, without having to manually execute them. It's possible that this reformatting script could be executed automatically after completion of the "Generate Getters/Setters" operation.

Tuesday, October 7, 2008

Book Review: Effective Java (2nd Edition)

This edition of Effective Java was released over 4 months ago, which means by now there are likely 8 zillion reviews of it. Nevertheless, I'd like to present my thoughts on this book.

As a very experienced Java programmer, I was somewhat reluctant to pick up this book, as I thought there wasn't much more I could learn from a book like this. I was wrong. I'd say most of the advised items were not new to me, but it's always good to see a concise reminder of base principles and "Effective Java", as it were. However, there were also numerous items that were very much new to me, and were very enlightening. In the "I've got to find something wrong with it" department, I found one piece of advice which, although completely correct, could lead some people astray because of some details it left out (http://davidmichaelkarr.blogspot.com/2008/10/stringbuilder-is-not-always-faster-than.html).

The book is divided into eleven chapters by subject area, like "Creating and Destroying Objects", "Methods Common to All Objects", "Generics", "Concurrency", and "Serialization". This edition is well revised from the first edition (published in 2001). It not only added new chapters relevant to features new in Java 5, it revised items where the advice clearly needed to change, and it even removed some items that had become obsolete.

Many of the items are only a single page, but many go into great detail. Item 8, "Obey the general contract when overriding equals", is 12 pages long and covers everything you need to know about correctly implementing the "equals" method for logical equality. This item only barely mentions the "hashCode" method, usually discussed at the same time, as that is covered thoroughly in the very next item.
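
The general shape of the pattern that item describes looks roughly like the following sketch (my own paraphrase, not the book's listing), with hashCode kept consistent with equals, as the next item requires:
public final class PhoneNumber
{
    private final int areaCode;
    private final int number;

    public PhoneNumber(int areaCode, int number)
    {
        this.areaCode = areaCode;
        this.number = number;
    }

    @Override
    public boolean equals(Object o)
    {
        if (o == this)
            return true;             // reflexive
        if (!(o instanceof PhoneNumber))
            return false;            // also handles null
        PhoneNumber other = (PhoneNumber) o;
        return ((other.areaCode == areaCode) && (other.number == number));
    }

    @Override
    public int hashCode()
    {
        // Combine the same significant fields used in equals().
        int result = 17;
        result = (31 * result) + areaCode;
        result = (31 * result) + number;
        return result;
    }
}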

The most surprising content (to me) was in the last chapter, on Serialization. The effort required to ensure secure, reliable, and maintainable serialization was mostly lost on me before reading this. This chapter not only went into great detail on how to resolve most serialization issues, but it also showed many examples of nefarious code that could be used to compromise serialization code that was not secure.

In short, I would recommend this book to any intermediate to advanced Java programmer. You are certain to gain a better appreciation for the details of producing better and more effective code.

Sunday, October 5, 2008

StringBuilder is NOT always faster than string concatenation

While reading through Effective Java, 2nd Edition (which I will publish a positive book review for when I'm finished), I noticed one piece of advice which is correct, but which I realized can deceive some people into doing something inadvisable.

The advice is "Item 51: Beware the performance of string concatenation". This clearly points out the key to the problem, which is that if you end up creating lots of strings in a loop, it will be slow. This is as opposed to simply appending existing strings to an existing StringBuilder object, which will be much more efficient. I'm fine with all that.

My problem with this section is that it doesn't clearly point out that ordinary string concatenation within a single expression actually uses StringBuilder under the covers. That's right. It works exactly the same way as StringBuilder, because that's how the compiler implements it.

If you're not convinced, we can look at two very simple Java classes, and we'll use a nice bytecode visualization plugin for Eclipse called "Bytecode Outline", which will show you that they truly are doing the same thing.

The two sample classes are this:
package stringbuilder;

public class UsesPlus
{
    public static void main(String[] args)
    {
        String[] values = { "abc", "def", "ghi" };
        System.out.println("string[" +
                           values[0] + ":" +
                           values[1] + ":" +
                           values[2] + "]");
    }
}

And:
package stringbuilder;

public class UsesStringBuilder
{
    public static void main(String[] args)
    {
        String[] values = { "abc", "def", "ghi" };
        System.out.println((new StringBuilder("string[")).
                           append(values[0]).append(":").
                           append(values[1]).append(":").
                           append(values[2]).append("]").
                           toString());
    }
}

After installing the Bytecode Outline plugin, the best way to visualize the difference is to select both classes in the Package Explorer, then right-click and select "Compare With" and then "Each Other Bytecode". This really just generates the bytecode for both and then gives you an ordinary text comparison view to compare them. For now, I'll just present for you a meaningful excerpt of the bytecode for each. Both of these samples start with loading the "string[" string and continuing to the "toString()" call.

Here's the relevant bytecode for "UsesPlus.java" (both samples show long lines wrapped with "\" for viewability):
    LDC "string["
INVOKESPECIAL java/lang/StringBuilder.\
(Ljava/lang/String;)V
L2 (22)
LINENUMBER 10 L2
ALOAD 1
ICONST_0
AALOAD
INVOKEVIRTUAL java/lang/StringBuilder.\
append(Ljava/lang/String;)Ljava/lang/StringBuilder;
LDC ":"
INVOKEVIRTUAL java/lang/StringBuilder.\
append(Ljava/lang/String;)Ljava/lang/StringBuilder;

Here's the similar block for "UsesStringBuilder.java":
    LDC "string["
INVOKESPECIAL java/lang/StringBuilder.\
(Ljava/lang/String;)V
ALOAD 1
ICONST_0
AALOAD
INVOKEVIRTUAL java/lang/StringBuilder.\
append(Ljava/lang/String;)Ljava/lang/StringBuilder;
LDC ":"
INVOKEVIRTUAL java/lang/StringBuilder.\
append(Ljava/lang/String;)Ljava/lang/StringBuilder;


Notice that they are almost completely the same.

The moral here is, don't apply good advice in the wrong places.

Thursday, October 2, 2008

Make WLST scripts more flexible with Python's Getopt

My impression is that the audience of people who are learning and using the WebLogic Scripting Tool is not, for the most part, made up of experienced Python programmers (I'm not one, either). That's unfortunate, because Python (and Jython) have a lot of capabilities that can enhance your WLST scripts.

A natural "first script" in WLST probably hardcodes all the values that you might pass as command-line parameters in an ordinary shell script or Java class. Doing that limits the flexibility of your script, forcing you to modify it each time you need to change it's behavior slightly.

The solution is to use the Python getopt module, which works similarly to the getopts feature available in Unix shell scripts. It also adds "long options", which the Perl getopt module provides.

For instance, the following is the beginning of a sample script that might be used to create a JMS module in a domain. Instead of hardcoding the parameters into the script, they're expected to be passed in on the command line. If any errors occur while gathering the command-line arguments, the "usage()" method is called, and the script exits.

import sys
import os
from java.lang import System

import getopt

def usage():
    print "Usage:"
    print "createJMSModule [-n] -u user -c credential -h host -p port -s serverName -m moduleName [-d subDeploymentName] -j jmsServerName"

try:
    opts, args = getopt.getopt(sys.argv[1:], "nu:c:h:p:s:m:d:j:",
                               ["user=", "credential=", "host=", "port=",
                                "targetServerName=", "moduleName=",
                                "subDeploymentName=", "jmsServerName="])
except getopt.GetoptError, err:
    print str(err)
    usage()
    sys.exit(2)

reallyDoIt = true
user = ''
credential = ''
host = ''
port = ''
targetServerName = ''
moduleName = ''
subDeploymentName = 'DeployToJMSServer'
jmsServerName = ''

for opt, arg in opts:
    if opt == "-n":
        reallyDoIt = false
    elif opt == "-u":
        user = arg
    elif opt == "-c":
        credential = arg
    elif opt == "-h":
        host = arg
    elif opt == "-p":
        port = arg
    elif opt == "-s":
        targetServerName = arg
    elif opt == "-m":
        moduleName = arg
    elif opt == "-d":
        subDeploymentName = arg
    elif opt == "-j":
        jmsServerName = arg

if user == "":
    print "Missing \"-u user\" parameter."
    usage()
    sys.exit(2)
elif credential == "":
    print "Missing \"-c credential\" parameter."
    usage()
    sys.exit(2)
elif host == "":
    print "Missing \"-h host\" parameter."
    usage()
    sys.exit(2)
elif port == "":
    print "Missing \"-p port\" parameter."
    usage()
    sys.exit(2)
elif targetServerName == "":
    print "Missing \"-s serverName\" parameter."
    usage()
    sys.exit(2)
elif moduleName == "":
    print "Missing \"-m moduleName\" parameter."
    usage()
    sys.exit(2)
elif jmsServerName == "":
    print "Missing \"-j jmsServerName\" parameter."
    usage()
    sys.exit(2)

print "Got all the required parameters"


Calling WLST scripts from the shell requires actually calling the "wlst" application, and passing your script, and the additional command line parameters, as arguments to "wlst". For instance, this is the skeleton of a Bash script file (use Cygwin on Windows) that could be used to call this script. I assume that WEBLOGIC_HOME points to your WebLogic installation (like "c:/bea/weblogic92") and WLST_SCRIPTS_HOME points to your repository of scripts for execution.

#! /bin/bash
$WEBLOGIC_HOME/common/bin/wlst.cmd $WLST_SCRIPTS_HOME/src/samplegetopt.py "$@"

This would produce output like this (output in versions besides 9.2.2 might vary slightly):

% ./samplegetopt -n -u weblogic -c password -h somehost.somenet.com -p 8001 -s joe -m blow -j jmsserver

CLASSPATH=C:\bea\...

PATH=C:\bea\...

Your environment has been set.

CLASSPATH=C:\bea\...

Initializing WebLogic Scripting Tool (WLST) ...

Welcome to WebLogic Server Administration Scripting Shell

Type help() for help on available commands

Got all the required parameters

Saturday, September 27, 2008

Jarsearch: Bash script to search for class name substrings in archive files

How often have you had to figure out which of numerous jars (or zips) in numerous directories has a particular class (or other kind of file), so you can copy it, reference it, or open it for inspection?

I long ago tired of struggling through this, so I wrote a simple Bash (runs on Unix or Cygwin) script that facilitates this, and is easy to integrate into an "xargs" pipe.

The script is called "jarsearch". It takes at least two parameters and two optional flags. The first expected parameter is a search string. The second expected parameter is a path to an archive file that "jar" can understand. Additional parameters are additional file paths. The optional flags are "-l" (just list the archives the string was found in) and "-i" (ignore case in comparisons).

Here are some examples of its usage:
% jarsearch Delivery *.jar
File dsn.jar:
3506 Wed Oct 17 11:06:06 PDT 2007 com/sun/mail/dsn/DeliveryStatus.class

% jarsearch SAXParse apache-ant-1.7.0/lib/*.jar
File apache-ant-1.7.0/lib/xercesImpl.jar:
45 Wed Sep 13 22:13:54 PDT 2006 META-INF/services/javax.xml.parsers.SAXParserFactory
2701 Wed Sep 13 22:16:12 PDT 2006 org/apache/xerces/jaxp/SAXParserFactoryImpl.class
5760 Wed Sep 13 22:16:12 PDT 2006 org/apache/xerces/jaxp/SAXParserImpl$JAXPSAXParser.class
7748 Wed Sep 13 22:16:12 PDT 2006 org/apache/xerces/jaxp/SAXParserImpl.class
564 Wed Sep 13 22:16:06 PDT 2006 org/apache/xerces/parsers/AbstractSAXParser$1.class
564 Wed Sep 13 22:16:06 PDT 2006 org/apache/xerces/parsers/AbstractSAXParser$2.class
2500 Wed Sep 13 22:16:06 PDT 2006 org/apache/xerces/parsers/AbstractSAXParser$AttributesProxy.class
1009 Wed Sep 13 22:16:06 PDT 2006 org/apache/xerces/parsers/AbstractSAXParser$LocatorProxy.class
18424 Wed Sep 13 22:16:06 PDT 2006 org/apache/xerces/parsers/AbstractSAXParser.class
1710 Wed Sep 13 22:16:06 PDT 2006 org/apache/xerces/parsers/SAXParser.class
File apache-ant-1.7.0/lib/xml-apis.jar:
3896 Sat Feb 25 11:28:32 PST 2006 javax/xml/parsers/SAXParser.class
2483 Sat Feb 25 11:28:32 PST 2006 javax/xml/parsers/SAXParserFactory.class
1287 Sat Feb 25 11:28:32 PST 2006 org/xml/sax/SAXParseException.class

% find apache-* castor-* -name "*.jar" | xargs -n100 jarsearch javax.xml.parsers.SAXParser.class
File apache-ant-1.7.0/lib/xml-apis.jar:
3896 Sat Feb 25 11:28:32 PST 2006 javax/xml/parsers/SAXParser.class
File castor-0.9.6/lib/xerces-J_1.4.0.jar:
3518 Tue May 22 17:21:10 PDT 2001 javax/xml/parsers/SAXParser.class
File castor-0.9.7/lib/xerces-J_1.4.0.jar:
3518 Tue May 22 17:21:10 PDT 2001 javax/xml/parsers/SAXParser.class

And following this is the entire "jarsearch" script. Put this in your $HOME/bin and set the execute bit ("chmod +x $HOME/bin/jarsearch").
#! /bin/bash
listflag=0
ignorecase=0
while getopts li opt
do
    case $opt in
        l) listflag=1;;
        i) ignorecase=1;;
    esac
done
shift `expr $OPTIND - 1`
sstring="$1"
shift

casestr=""
if [ "$ignorecase" == 1 ]; then
    casestr="-i"
fi

# (copied from "ant")
# OS specific support. $var _must_ be set to either true or false.
cygwin=false;
darwin=false;
case "`uname`" in
    CYGWIN*) cygwin=true ;;
    Darwin*) darwin=true
             if [ -z "$JAVA_HOME" ] ; then
                 JAVA_HOME=/System/Library/Frameworks/JavaVM.framework/Home
             fi
             ;;
esac

tmpfile=${TMPDIR:-/tmp}/js.$$
trap '/bin/rm -f $tmpfile; exit 1' 1 2 3 15
for fn in "$@"; do
    if $cygwin; then
        fn="$(cygpath -m $fn)"
    fi
    jar tvf "$(echo $fn)" | 2>&1 grep $casestr "$sstring" > $tmpfile
    if [ -s $tmpfile ]; then
        if [ "$listflag" == 1 ]; then
            echo $fn
        else
            echo File $fn\:
            cat $tmpfile
        fi
    fi
done
/bin/rm -f $tmpfile