Saturday, September 7, 2013

Adding syntax highlighting to Javadocs

Oftentimes, when writing Javadocs, it helps to include source code samples along with the documentation. Typically, this is achieved by wrapping the source code in a <pre> tag, which renders the code in a monospaced font when viewed in a browser.

/**
 * <p>Represents a fruit.</p>
 * <pre>
 * //create a new fruit
 * Fruit fruit = new Fruit("banana");
 *
 * //copy an existing fruit
 * Fruit copy = new Fruit(fruit);
 * </pre>
 * @author John Doe
 */
public class Fruit{
  ...
}

But there are tools out there that can add syntax highlighting to source code on a web page. Javadocs are web pages. Why can't these syntax highlighting tools be applied to Javadocs as well?

In this blog post, I am going to show you how to add syntax highlighting to the Javadocs of a Maven-enabled Java project. I will be using the popular JavaScript-based SyntaxHighlighter library for the syntax highlighting.

1. Download SyntaxHighlighter

First, download SyntaxHighlighter.

2. Create a CSS file

SyntaxHighlighter makes use of CSS styling to perform the code coloring. Luckily, the Javadoc tool allows you to specify a CSS file to customize the look and feel of your Javadoc webpage. So, we will need to create a CSS file that contains the styling that SyntaxHighlighter requires.

Navigate to the SyntaxHighlighter files that you downloaded in the previous step. In the "styles" directory, locate the "shCore.css" file and one of the "shTheme" files (such as "shThemeDefault.css"; the SyntaxHighlighter website lists all the available themes). Combine these two files into a single file and give it a name of your choosing. Save this file somewhere within your project folder. The location doesn't matter, since Javadoc will end up copying the file when the Javadocs are generated. A good place is the "src/main/javadoc" folder, as this is the standard Maven location for all Javadoc-related resources.

3. Configure your POM file

Next, we will need to add some configuration settings to the project POM. In the configuration section of the "maven-javadoc-plugin" plugin, add the following: (1) the location of the CSS file that was created in the previous step, (2) <script> tags for the SyntaxHighlighter JavaScript files, and (3) JavaScript code to configure and initialize SyntaxHighlighter.

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-javadoc-plugin</artifactId>
  <version>2.8.1</version>
  <configuration>

    <!-- (1) CSS file location -->
    <stylesheetfile>src/main/javadoc/syntax-highlighter.css</stylesheetfile>

    <!-- (2) SyntaxHighlighter Javascript files -->
    <top><![CDATA[
      <script src="http://alexgorbatchev.com/pub/sh/current/scripts/shCore.js" type="text/javascript"></script>
      <script src="http://alexgorbatchev.com/pub/sh/current/scripts/shBrushJava.js" type="text/javascript"></script>
    ]]></top>

    <!--
    (3) SyntaxHighlighter configuration and initialization
    See: http://alexgorbatchev.com/SyntaxHighlighter/manual/configuration/
    -->
    <footer><![CDATA[
      <script type="text/javascript">
        SyntaxHighlighter.defaults["auto-links"] = false;
        SyntaxHighlighter.defaults["tab-size"] = 2;
        SyntaxHighlighter.all();
      </script>
    ]]></footer>

  </configuration>
</plugin>

A list of available SyntaxHighlighter configuration settings can be found on the SyntaxHighlighter homepage.

4. Modify the Javadocs

Each <pre> tag in the Javadocs must be given a class="brush:java" attribute. This signals to SyntaxHighlighter that the text content should be treated as Java source code.

/**
 * <p>Represents a fruit.</p>
 * <pre class="brush:java">
 * //create a new fruit
 * Fruit fruit = new Fruit("banana");
 *
 * //copy an existing fruit
 * Fruit copy = new Fruit(fruit);
 * </pre>
 * @author John Doe
 */
public class Fruit{
  ...
}

5. Generate the Javadocs

Instruct Maven to generate the Javadocs for the project by running the following command:

mvn javadoc:javadoc

And that's it! You should be good to go.

Wednesday, July 17, 2013

ACM Webcast on Parallel Programming

I just finished listening to an ACM webcast entitled "Changing How Programmers Think about Parallel Programming". It was presented by William Gropp from the University of Illinois at Urbana-Champaign and it was very informative!

He discussed two different types of parallel programming. Coarse-grained parallelism is where you divide a task up into chunks and each process performs the same sequence of operations on its assigned chunk. For example, say you have to mail some letters and have multiple people at your disposal to help you. With coarse-grained parallelism, each person would be given some number of letters to mail and would be responsible for all of the tasks involved in mailing each letter, such as folding the paper, placing it in the envelope, and applying a stamp.

Fine-grained parallelism, however, is where you divide the task up by operation and each process is assigned to a specific operation. Using the letter example, this would mean that one person would be responsible for folding the paper, another for placing the paper in the envelope, and so on.
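To make the letter analogy a bit more concrete, here is a minimal sketch of the coarse-grained approach in Java. The LetterMailer class and its "folding" and "stamping" steps are hypothetical stand-ins that I made up for illustration (they were not part of the webcast). Each worker thread receives a chunk of letters and performs every step on its own chunk.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

//hypothetical class, for illustration only
class LetterMailer {
  //performs every step needed to mail a single letter
  static String mail(String letter) {
    String folded = "folded(" + letter + ")";
    String enveloped = "envelope(" + folded + ")";
    return "stamped(" + enveloped + ")";
  }

  //coarse-grained: each worker gets a chunk of letters and does all the steps on it
  static List<String> mailAll(List<String> letters, int workers) {
    ExecutorService pool = Executors.newFixedThreadPool(workers);
    try {
      int chunkSize = (letters.size() + workers - 1) / workers;
      List<Future<List<String>>> pending = new ArrayList<>();
      for (int start = 0; start < letters.size(); start += chunkSize) {
        int end = Math.min(start + chunkSize, letters.size());
        List<String> chunk = letters.subList(start, end);
        pending.add(pool.submit(() -> {
          List<String> done = new ArrayList<>();
          for (String letter : chunk) {
            done.add(mail(letter));
          }
          return done;
        }));
      }

      //collect the results in the original order
      List<String> mailed = new ArrayList<>();
      for (Future<List<String>> future : pending) {
        mailed.addAll(future.get());
      }
      return mailed;
    } catch (InterruptedException | ExecutionException e) {
      throw new IllegalStateException("A worker failed", e);
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    System.out.println(mailAll(Arrays.asList("bill", "invoice", "postcard"), 2));
  }
}
```

A fine-grained version would instead give each thread a single step (folding, enveloping, stamping) and pass the letters from one thread to the next.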

He discussed the fact that processes must often share data with each other. One technique for doing this is for each process to copy some of the data that is assigned to other processes before execution begins. He also mentioned that parallel programs are harder to debug than traditional, single-threaded programs, which is one reason why programming in a parallel fashion is so difficult!

Thursday, June 20, 2013

Working with Timezones in Java

Timezones are confusing to say the least. Here are four things to keep in mind when working with timezones in Java.

1. Timezones are relative to UTC

UTC, which stands for "Coordinated Universal Time", is always the same no matter where on the planet you are. The term "GMT" is often used as well. While GMT is pretty much identical to UTC, it's not exactly the same. UTC is scientifically defined, while GMT is not.

A timezone is an offset from UTC. For example, my timezone is currently 4 hours behind UTC. However, a timezone's offset may change depending on the time of year. For example, my timezone is 4 hours behind UTC for part of the year (daylight saving time) and 5 hours behind UTC for the rest of the year (standard time). This is why it's preferable to represent timezones using their IDs (such as "America/New_York", described in more detail below) instead of their offsets (such as "-0400"). An offset can change depending on the time of year, but a timezone ID encapsulates these offset variations.
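Here is a small sketch that shows the offset of a single timezone ID changing over the course of a year. The OffsetDemo class and its offsetHours() method are names I made up for this example. It relies only on the standard TimeZone.getOffset(long) method.

```java
import java.util.Calendar;
import java.util.TimeZone;

//hypothetical class, for illustration only
class OffsetDemo {
  //returns the zone's offset from UTC, in hours, at noon UTC on the given date (month is 1-12)
  static double offsetHours(String zoneId, int year, int month, int day) {
    Calendar cal = Calendar.getInstance(TimeZone.getTimeZone("UTC"));
    cal.clear();
    cal.set(year, month - 1, day, 12, 0, 0);
    int offsetMillis = TimeZone.getTimeZone(zoneId).getOffset(cal.getTimeInMillis());
    return offsetMillis / 3600000.0;
  }

  public static void main(String[] args) {
    System.out.println(offsetHours("America/New_York", 2013, 1, 15)); //prints: -5.0
    System.out.println(offsetHours("America/New_York", 2013, 7, 15)); //prints: -4.0
  }
}
```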

Fun fact: You might have noticed that the letters in the acronym "UTC" don't match up with "Coordinated Universal Time". "UTC" actually arose as a compromise between the English version "CUT" (Coordinated Universal Time), and the French version "TUC" (Temps Universel Coordonné).

2. Date objects use UTC

Whenever you call the toString() method on a java.util.Date object (for example, by printing the object to the console), it will generate a string that looks something like this:

Thu Jun 20 13:23:52 EDT 2013

This might lead you to believe that this is the exact timestamp that the object is holding. Not necessarily. Internally, a Date object stores its timestamp as the number of milliseconds since January 1, 1970, 00:00:00 UTC, with no timezone attached. When you call toString(), it converts that UTC timestamp to your JVM's default timezone.

3. Timezones are clothing

Think of timezones as just different sets of clothing a timestamp can wear. It can put on a sweater, a jacket, or a tuxedo, but underneath it all, it's still a UTC timestamp. To format a Date object using a timezone of your choosing, call the setTimeZone() method on the DateFormat class.

DateFormat formatter = new SimpleDateFormat("HH:mm Z");
TimeZone timezone = TimeZone.getTimeZone("Europe/Madrid");
formatter.setTimeZone(timezone);
System.out.println(formatter.format(new Date()));
//prints: 19:50 +0200

In the example above, I'm printing the current time in Madrid, Spain (and surrounding areas). The "+0200" part describes the timezone's offset from UTC at that moment in time. It shows that the timezone is 2 hours and 0 minutes ahead of UTC. To give another example of an offset, "-0430" would mean that the timezone is 4 hours and 30 minutes behind UTC.
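To see that the underlying timestamp stays the same no matter which timezone it "wears", here is a minimal sketch that formats one Date object in two different timezones. The SameInstantDemo class is a name I made up for this example. I also pass Locale.US to the formatter so that the output doesn't depend on the JVM's default locale.

```java
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

//hypothetical class, for illustration only
class SameInstantDemo {
  //formats the given timestamp as it would appear on a clock in the given timezone
  static String format(Date date, String zoneId) {
    DateFormat formatter = new SimpleDateFormat("yyyy-MM-dd HH:mm", Locale.US);
    formatter.setTimeZone(TimeZone.getTimeZone(zoneId));
    return formatter.format(date);
  }

  public static void main(String[] args) {
    Date epoch = new Date(0L); //zero milliseconds since the epoch
    System.out.println(format(epoch, "UTC"));              //prints: 1970-01-01 00:00
    System.out.println(format(epoch, "America/New_York")); //prints: 1969-12-31 19:00
    System.out.println(epoch.getTime());                   //prints: 0
  }
}
```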

You might be wondering where a list of these timezone identifier strings can be found. A source that I like to use is from the PHP user manual. But the official listing can be found in what's known as the TZ database.

4. The TimeZone class is quirky

Note that if an unrecognized timezone ID is passed into the TimeZone.getTimeZone() method, the method will return an object representing the "GMT" timezone. It would be less confusing if it just returned null, but who am I to complain.

TimeZone timezone = TimeZone.getTimeZone("Bogus/Timezone");
if ("GMT".equals(timezone.getID())){
  //timezone not found!
} else {
  //timezone found
}
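Another option is to check the ID against the list that TimeZone.getAvailableIDs() returns before doing the lookup. The sketch below wraps this in a hypothetical TimeZoneLookup.find() helper (my own name, not part of the JDK) that returns null for unknown IDs. One caveat: custom offset IDs such as "GMT+02:00" are accepted by getTimeZone() but, as far as I know, do not appear in the available IDs list, so this helper would treat them as unknown.

```java
import java.util.Arrays;
import java.util.TimeZone;

//hypothetical helper class, for illustration only
class TimeZoneLookup {
  //returns the timezone with the given ID, or null if the JVM does not list that ID
  static TimeZone find(String id) {
    boolean known = Arrays.asList(TimeZone.getAvailableIDs()).contains(id);
    return known ? TimeZone.getTimeZone(id) : null;
  }

  public static void main(String[] args) {
    System.out.println(find("Europe/Madrid") != null);  //prints: true
    System.out.println(find("Bogus/Timezone") != null); //prints: false
  }
}
```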

Thursday, April 18, 2013

Blobs and JDBC

A blob is a database data type for storing raw, binary data. It stands for "binary large object". In this blog post, I'm going to show you how to use this data type to insert and retrieve a photo using JDBC.

To insert the photo, start by creating an InputStream object to the photo you want to insert. For example, if the photo resides in a file, create a FileInputStream object.

File file = new File("photo.jpg");
InputStream in = new FileInputStream(file);

Then, create a PreparedStatement object for your INSERT statement. The PreparedStatement should contain a parameter for where the binary data should go, just as if you were inserting "normal" data, like a string or an integer.

Connection conn = ...
PreparedStatement stmt = conn.prepareStatement("INSERT INTO test (photo) VALUES (?)");

To set the binary data, pass the InputStream object into the setBlob() method, and then execute the statement.

stmt.setBlob(1, in);
stmt.execute();

To retrieve blob data from the database, call the getBlob() method on the ResultSet object that is returned from the SELECT statement. This will return a Blob object. Then, invoke the Blob.getBinaryStream() method to get an InputStream to the binary data.

Connection conn = ...
PreparedStatement stmt = conn.prepareStatement("SELECT photo FROM test");
ResultSet rs = stmt.executeQuery();
while (rs.next()) {
  Blob blob = rs.getBlob(1);
  InputStream in = blob.getBinaryStream();
  ...
}
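Once you have the InputStream, you will usually want to copy the data somewhere, such as into a byte array. Here is a minimal sketch of one way to do that. The BlobStreams class and its readFully() method are names I made up for this example, and a ByteArrayInputStream stands in for the stream that Blob.getBinaryStream() would return.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

//hypothetical helper class, for illustration only
class BlobStreams {
  //reads the stream to the end, closes it, and returns its contents
  static byte[] readFully(InputStream in) {
    try (InputStream stream = in; ByteArrayOutputStream out = new ByteArrayOutputStream()) {
      byte[] buffer = new byte[8192];
      int read;
      while ((read = stream.read(buffer)) != -1) {
        out.write(buffer, 0, read);
      }
      return out.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    //stand-in for blob.getBinaryStream()
    InputStream in = new ByteArrayInputStream(new byte[] { 1, 2, 3 });
    System.out.println(readFully(in).length); //prints: 3
  }
}
```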

Tuesday, April 16, 2013

4 Ways to Initialize a List in Java

Creating a List and populating it with a set of elements is a common programming task in Java. In this blog post, I'm going to describe four ways to do this.

1. Collections.emptyList()

This method will return a List object that is empty. This is a convenient, shorthand alternative to explicitly instantiating a new List object. However, this list is immutable, which means that you cannot add any elements to it.

List<String> list = Collections.emptyList();
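Here is a quick sketch of what "immutable" means in practice. The EmptyListDemo class and its rejectsAdd() method are names I made up for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

//hypothetical class, for illustration only
class EmptyListDemo {
  //returns true if the list refuses to accept a new element
  static boolean rejectsAdd(List<String> list) {
    try {
      list.add("one");
      return false;
    } catch (UnsupportedOperationException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(rejectsAdd(Collections.<String> emptyList())); //prints: true
    System.out.println(rejectsAdd(new ArrayList<String>()));          //prints: false
  }
}
```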

2. Arrays.asList()

This method takes an array and converts it to a List object. What makes this method special is the fact that the argument to this method is a vararg. This means that you can pass as many elements into the method as you like. The syntax is very compact because all of the elements can fit on one line.

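But note that the list that is created has a fixed size, so, just as with Collections.emptyList(), you cannot add or remove elements to/from it. Unlike Collections.emptyList(), however, you can still replace an existing element by calling set().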

List<String> list = Arrays.asList("one", "two", "three");
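Here is a short sketch of what the list returned by Arrays.asList() does and does not allow. The AsListDemo class and its canAdd() method are names I made up for this example. If you need a list that can grow, one common approach is to copy the elements into a new ArrayList.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

//hypothetical class, for illustration only
class AsListDemo {
  //tries to add an element, returning false if the list does not allow it
  static boolean canAdd(List<String> list) {
    try {
      list.add("four");
      return true;
    } catch (UnsupportedOperationException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    List<String> fixed = Arrays.asList("one", "two", "three");
    fixed.set(0, "uno");               //replacing an element works
    System.out.println(fixed);         //prints: [uno, two, three]
    System.out.println(canAdd(fixed)); //prints: false

    //copy the elements into an ArrayList to get a list that can grow
    List<String> growable = new ArrayList<String>(fixed);
    System.out.println(canAdd(growable)); //prints: true
    System.out.println(growable);         //prints: [uno, two, three, four]
  }
}
```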

3. Anonymous child class

A somewhat trickier way of creating a list is to define your list as an anonymous child class. The elements are added to the list by calling the add() method within the class's instance initializer block (notice the double braces).

List<String> list = new ArrayList<String>(){{add("one"); add("two"); add("three");}};
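One consequence of this technique is that the resulting object is not a plain ArrayList, but an instance of a new, unnamed subclass of it. The sketch below demonstrates this (the DoubleBraceDemo class is a name I made up for this example).

```java
import java.util.ArrayList;
import java.util.List;

//hypothetical class, for illustration only
class DoubleBraceDemo {
  static List<String> create() {
    //outer braces: anonymous subclass body; inner braces: instance initializer block
    return new ArrayList<String>() {{
      add("one");
      add("two");
      add("three");
    }};
  }

  public static void main(String[] args) {
    List<String> list = create();
    System.out.println(list);                                            //prints: [one, two, three]
    System.out.println(list.getClass() == ArrayList.class);              //prints: false
    System.out.println(list.getClass().getSuperclass() == ArrayList.class); //prints: true
  }
}
```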

4. JUST CALL add(), EINSTEIN!

Of course, the traditional way to create a list is to instantiate it and then call the add() method for each element. But where's the fun in that?

List<String> list = new ArrayList<String>();
list.add("one");
list.add("two");
list.add("three");

Monday, April 15, 2013

5 Unique Features of Stackoverflow Chat

Stackoverflow, a technical question and answer site, has a web-based chat room system that allows you to talk with other programmers in real time. In this blog post, I'm going to describe five unique features of the Stackoverflow Chat system.

1. Mentions.

If you want to direct a message to a particular user, type a @ character, followed by the user's name. This will cause a "ping" sound to play in the user's browser, alerting them to the fact that they were mentioned in a chat message. If the user's name has spaces in it, simply leave out the spaces.

You can also reply to a specific message. To do this, hover your mouse over the message you want to reply to and click the "reply" icon on the right. Hovering your mouse over a reply will highlight the message that it was a reply to. The user who posted the original message will receive a "ping".

2. Editing and deleting messages.

After sending a message, you may notice a spelling mistake or typo in your message. Stackoverflow chat allows you to edit and delete messages that are less than 2 minutes old. Hover your mouse over the message you want to edit or delete, and click the drop down arrow icon on the left.

This will open up a menu, which allows you to edit or delete your message.

3. Text formatting.

You can also format the text of your message for added emphasis. The following syntaxes are supported:

*italic*
**bold**
`code (monospaced font)`
---strikeout---

Note that, if you want to include a multi-line code sample, hold down the Shift key and then press Enter to insert a line break in your chat message (just pressing Enter will send the message). Then, you can click on the "fixed font" button to change the font of your entire message to monospace. The "fixed font" button will not appear unless your chat message has multiple lines.

4. Starred messages.

If someone posts a message that you really like, you can "star" it. Starred messages appear on the right-hand side of the window. The more stars a message has, the longer it will stay pinned to this location. To star a message, hover your mouse over the chat message you want to star, and click the "star" icon on the right.

5. Oneboxing.

If you paste a URL to certain websites into a chat message, then the chat system will create a nicely formatted "widget" containing the content of that webpage. For example, pasting the URL to a Stackoverflow question will display various information about the question, such as the number of upvotes, the tags, and the question itself. This is called "oneboxing".

Supported websites include the Stack Exchange family of websites, Wikipedia, and YouTube.

For more information about Stackoverflow Chat, see the FAQ page.

Sunday, April 14, 2013

Javadoc and the @see tag

Javadoc, as you know, is a tool for documenting your source code in the Java language. It consists of specially-formatted comments that describe the classes, methods, and fields of your Java program. IDEs leverage Javadoc comments to help inform developers of how various APIs function (for example, hovering your mouse over a method in Eclipse will display that method's Javadoc). You can also run the "javadoc" command, which comes packaged with the JDK, to generate an HTML webpage containing the Javadoc comments of your entire code base.

Javadoc syntax defines a collection of tags. Tags allow you to describe specific aspects of your code, such as the return value of a method or the author of a class. One of these tags is @see, which is like a "see also" reference. In this blog post, I'm going to describe the three ways to use @see.

1) Referring to a class or method. One way to use @see is to refer the user to another class or method within your code base. The class you want to reference must either be imported in the Java code or be written out using its fully-qualified name. Then, simply put the class name after the tag.

import com.example.RefClass;
/**
 * Description of this class.
 * @see RefClass
 */
public class MyClass{}

To refer to a method, put a #, followed by the method name, after the class name. If the method is overloaded, then be sure to include the parameter types as well (enclosed in parentheses) to remove any ambiguity as to which method you are referring to. Javadoc gurus will recognize this as the same syntax that's used with the @link tag.

import com.example.RefClass;
/**
 * Description of this class.
 * @see RefClass#execute(String, int)
 */
public class MyClass{}

2) Referring to a website. Another way of using @see is to refer to a website. For this, you must use the HTML <a> tag.

/**
 * Description of this class.
 * @see <a href="http://example.com">Library website</a>
 */
public class MyClass{}

3) Plain text. You can also just put plain text within the @see tag. However, it must be surrounded with double quotes in order to prevent the Javadoc parser from treating it like a class name.

/**
 * Description of this class.
 * @see "Library Documentation"
 */
public class MyClass{}
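A single Javadoc comment can contain more than one @see tag, so all three forms can be used together. Here is a minimal sketch that does this, and that also uses the inline {@link} tag for comparison. The SeeTagDemo class is a name I made up for this example, and it points at JDK classes so that it compiles on its own. Note that the comments must start with /** (two asterisks) in order for the Javadoc tool to pick them up.

```java
import java.util.Collections;
import java.util.List;

/**
 * Hypothetical utility class, used only to illustrate the tags.
 * @see List
 * @see Collections#emptyList()
 * @see <a href="http://example.com">Library website</a>
 * @see "Library Documentation"
 */
class SeeTagDemo {
  /**
   * Returns an empty list by calling {@link Collections#emptyList()}.
   * @return an immutable, empty list
   */
  static List<String> empty() {
    return Collections.emptyList();
  }

  public static void main(String[] args) {
    System.out.println(empty().size()); //prints: 0
  }
}
```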

Saturday, February 9, 2013

Hard-coded XML in Unit Test code

Everyone knows that unit tests should be as self-contained as possible. Everything that is needed to run the unit test should be contained within the unit test class itself. It should rely as little as possible on other resources, such as files or database connections. This makes them easier to maintain, and makes them less susceptible to the failures of these external systems.

So what happens when your unit test needs some large block of text, like an XML document, to run? You might be tempted to put it in a separate file, but this goes against the self-containment principle described above. You might consider including it as a hard-coded string, but this isn't the best approach either. Java doesn't have a multi-line string syntax like many other languages do, so doing this will strip away all the formatting that makes the XML document human-readable.

String xml = "<library><wifi>true</wifi><book><title>The Hunger Games</title><author>Suzanne Collins</author></book></library>";

By contrast, this same XML document could be defined in PHP as a multi-line string.

$xml = <<<XML
<library>
  <wifi>true</wifi>
  <book>
    <title>The Hunger Games</title>
    <author>Suzanne Collins</author>
  </book>
</library>
XML;

As you can see, the XML document in the PHP code is much more readable than the XML document in the Java code.

Despite Java not supporting multi-line strings, there is still a way that they can be mimicked. The string can be split up into multiple substrings that can be arranged however you want. These substrings are then concatenated together to form the final string.

//@formatter:off
String xml =
"<library>" +
  "<wifi>true</wifi>" +
  "<book>" +
    "<title>The Hunger Games</title>" +
    "<author>Suzanne Collins</author>" +
  "</book>" +
"</library>";
//@formatter:on

If you use the code-formatting functionality provided by an IDE, you must remember to instruct the IDE not to format this block of code. Eclipse uses a @formatter:off/on pair of comments to accomplish this (the setting for which must be manually enabled in the code formatting preferences).
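To show how such a string might be used, here is a minimal sketch that parses the hard-coded XML using the DOM parser that ships with the JDK. The HardCodedXmlDemo class and its firstTitle() method are names I made up for this example. In a real project, the check in main() would live inside a unit test method.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

//hypothetical class, for illustration only
class HardCodedXmlDemo {
  //@formatter:off
  static final String XML =
  "<library>" +
    "<wifi>true</wifi>" +
    "<book>" +
      "<title>The Hunger Games</title>" +
      "<author>Suzanne Collins</author>" +
    "</book>" +
  "</library>";
  //@formatter:on

  //parses the XML and returns the text of the first <title> element
  static String firstTitle(String xml) {
    try {
      Document doc = DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)));
      return doc.getElementsByTagName("title").item(0).getTextContent();
    } catch (Exception e) {
      throw new IllegalArgumentException("Could not parse XML", e);
    }
  }

  public static void main(String[] args) {
    System.out.println(firstTitle(XML)); //prints: The Hunger Games
  }
}
```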

Sunday, January 6, 2013

Method chaining

Jsoup is a great Java library for parsing HTML pages. Its API is elegant and easy to use. One feature I love is the way it allows a webpage to be parsed from a website URL (as opposed to providing the HTML page from a String or Reader object). It uses a technique called method chaining to build and send the HTTP request that will retrieve the webpage.

You start out by passing the URL into the Jsoup.connect(String) method. This method returns a Connection object, but you're not supposed to assign this object to a variable, as is done typically. That's not how method chaining works. Instead, you continue calling methods one after another without assigning their return values to anything. You can call as many or as few of these methods as you want. They allow you to customize the HTTP request. They all return a reference to the same Connection object (i.e. return this;), which is what allows you to call the methods in a chain-like fashion. The methods allow you to specify things like cookies and the connection timeout.

In addition to using method chaining, Jsoup.connect(String) also uses a sort of factory pattern, since its purpose is to construct an HTTP request, send it, and then parse the returned HTML page into a DOM. So, there has to be a method that terminates the chain and returns the object we want it to build. In Jsoup's case, the termination method elegantly serves an additional purpose: specifying the HTTP method (which all HTTP requests must have).
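To illustrate the mechanics, here is a minimal sketch of a class that supports method chaining. The RequestBuilder class is entirely hypothetical. It is not Jsoup's code, and it doesn't send a real HTTP request. It only shows the two ingredients described above: configuration methods that end with "return this;", and a terminating method that returns the final result.

```java
import java.util.LinkedHashMap;
import java.util.Map;

//hypothetical class, for illustration only (not part of Jsoup)
class RequestBuilder {
  private final String url;
  private final Map<String, String> cookies = new LinkedHashMap<String, String>();
  private int timeoutMillis = 3000; //arbitrary default chosen for this example

  private RequestBuilder(String url) {
    this.url = url;
  }

  //starts the chain
  static RequestBuilder connect(String url) {
    return new RequestBuilder(url);
  }

  //each configuration method returns the same object so that calls can be chained
  RequestBuilder cookie(String name, String value) {
    cookies.put(name, value);
    return this;
  }

  RequestBuilder timeout(int millis) {
    timeoutMillis = millis;
    return this;
  }

  //terminates the chain; a real implementation would send the request here
  String get() {
    return "GET " + url + " cookies=" + cookies + " timeout=" + timeoutMillis;
  }

  public static void main(String[] args) {
    String summary = RequestBuilder.connect("http://example.com")
                                   .cookie("session", "abc")
                                   .timeout(60000)
                                   .get();
    System.out.println(summary); //prints: GET http://example.com cookies={session=abc} timeout=60000
  }
}
```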

The example below parses the HTML page of google.com. It assigns some cookies to the request, specifies a connection timeout of 60 seconds, and then sends the request using the GET method.

Map<String, String> cookies = ...
Document doc = Jsoup.connect("http://www.google.com")
                    .cookies(cookies)
                    .timeout(60000)
                    .get();

Inspired by Jsoup, I've done something similar with my ez-vcard project (I will be releasing these changes in the next version of the library). To parse a vCard, you no longer have to use the relatively cumbersome VCardReader class. You can now use a method chaining API, which calls VCardReader behind the scenes. It reduces the amount of boilerplate code, making the code easier to read and understand.

//using VCardReader
File vCardFile = ...
Reader reader = new FileReader(vCardFile);
VCardReader vcr = new VCardReader(reader);
VCard vcard = vcr.readNext();
reader.close();

//using method chaining
File vCardFile = ...
VCard vcard = Ezvcard.parse(vCardFile).first();

Similarly, method chaining can be used to write a vCard as well.

//using VCardWriter
VCard vcard = ...
File vCardFile = ...
Writer writer = new FileWriter(vCardFile);
VCardWriter vcw = new VCardWriter(writer, VCardVersion.V3_0);
vcw.write(vcard);
writer.close();

//using method chaining
VCard vcard = ...
File vCardFile = ...
Ezvcard.write(vcard).version(VCardVersion.V3_0).go(vCardFile);

Method chaining can make your code a lot easier to read and understand. Have you used method chaining before? Leave a comment below.