Posts Tagged ‘java’

Intro to Scala for Java Developers – slides

Monday, August 17th, 2009

Thought I’d post the slides of a talk I gave at work on Scala. We’re primarily a Java shop, and every week we do either a code review or a tech-related presentation.

Our domain at work is analyzing residential energy data, so the examples herein are tailored to that:

  • Read or Meter Read – Some amount of energy used over a period, e.g. “100 kWh in the month of June”
  • Service Point – metadata about an electric meter (the “point” at which “service” is available).

I also omitted a code demo where I refactored part of our codebase into Scala to show the difference (trust me, it was awesome!).

Why maven drives me absolutely batty

Wednesday, May 13th, 2009

Although my maven bitching has been mostly snarky, I have come to truly believe it is the wrong tool for a growing enterprise and, like centralized version control, will lead to a situation where tools dictate process (and design).

But, what is maven actually good at?

  • Maven is great for getting started – you don’t have to author an ant file (or copy one from an existing project)
  • Maven is great for enforcing a standard project structure – if you always use maven, your projects always look the same

This is about where it ends for me; everything else maven does (managing dependencies, automating processes, etc.) is done much better and much more quickly by other technology. It’s pretty amazing that someone can make a tool worse than ant, but maven is surely it.

Dependency management is not a build step

Maven is the equivalent of doing a sudo gem update every time you call rake, or doing a sudo yum update before running make. That’s just insane. While automated dependency management is a key feature of a sophisticated development process, this is a separate process from developing my application.

Maven’s configuration is incredibly verbose

It requires 36 lines of human-readable XML to have my webapp run during integration tests. Thirty-six! It requires six lines just to state a dependency. Examining a maven file and trying to figure out where you are in its insane hierarchy is quite difficult. It’s been pretty well-established outside the Java community that XML is a horrible configuration file format; formats like YAML have a higher signal-to-noise ratio, and using (gasp) actual scripting language code can be even more compact (and readable and maintainable).

The jars you use are at the mercy of Maven

If you want to use a third-party library, and maven doesn’t provide it (or doesn’t provide the version you need), you have to set up your own maven repo. You then have to include that repo in your pom file, or in every single developer’s local maven settings. If you secure your repo? More XML configuration (and, until the most recent version, you had to have your password in cleartext…in a version 2 application). The fallout here is that you will tend to stick with the versions available publicly, and we see how well that worked out for Debian.

Modifying default behavior is very difficult

Since maven is essentially a very, very high-level abstraction, you are at the mercy of the plugin developers as to what you can do. For example, it is not possible to run your integration tests through Cobertura. The plugin developers didn’t provide this and there’s no way to do it without some major hacking of your test code organization and pom file. This is bending your process to fit a tool’s shortcoming. This is a limitation designed into maven. This is fundamentally different from “opinionated software” like Rails; Rails doesn’t punish you so harshly for wanting to tweak things; maven makes it very difficult (or impossible). There was no thought given in Maven’s design to using non-default behavior.

Extending Maven requires learning a plugin API

While you can throw random Ant code into maven, the only way to create re-usable functionality is to learn a complex plugin API. Granted, this isn’t complex like J2EE is complex, but for scripting a build, it’s rather ludicrous.

Maven is hard to understand

I would be willing to bet that every one of my gripes is addressed through some crazy incantation. But that’s not good enough. The combined experience of the 7 developers at my company is about 70 years and not one of us can explain maven’s phases, identify the available targets, or successfully add new functionality to a pom without at least an hour on the net and maven’s documentation.

A great example is the release plugin. All five developers here that have used it go through the same cycle of having no idea what it’s doing, having it fail with a baffling error message, starting over and finally figuring out the one environment tweak that makes it work. At the end of this journey each one (myself included) has realized all this is a HUGE wrapper around scp and a few svn commands. Running two commands to do a source code tag and artifact copy shouldn’t be this difficult.

Maven’s command line output is a straight-up lie

[INFO] ------------------------------------------------------------------------
[ERROR] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Compilation failure

“Compilation failure”, by its own definition, is a failure and therefore an error (not an informational message). Further, most build failures do not exit with a nonzero status. This makes maven completely unscriptable.

Maven doesn’t solve the problems of make

Ant’s whole reason for being is “tabs are evil”, and that tells you something. While maven’s description of itself is a complete fabrication, it at least has its heart in the right place. However, it STILL fails to solve make’s shortcomings with respect to Java:

  • Maven doesn’t recompile the java classes that are truly out-of-date
  • Maven recompiles java classes that are not out-of-date
  • Maven doesn’t allow for sophisticated behavior through scripting
  • Maven replaces arcane magic symbols with arcane magic nested XML (i.e. pom files aren’t more readable than a Makefile)

Maven is slow

My test/debug cycle is around a minute. It should be 5 seconds (and it shouldn’t require an IDE).

Conclusion

Apache’s Ivy + Ant is probably a better environment than maven for getting things done; a bit of up-front work is required, but it’s not an ongoing cost, and maintenance is much simpler and more straightforward. Tools like Buildr and Raven seem promising, but it might be like discussing the best braking system for a horse-drawn carriage; utterly futile and irrelevant.

Why underscores might be better than camel case

Wednesday, December 10th, 2008

So, the “Ruby way” is to use underscores to delimit most identifiers, e.g. “add_months_to_date”, as opposed to the Java camel-case way of “addMonthsToDate”. This was initially something that irked me about Ruby, mostly because typing an underscore is kind of a pain (shift with the left hand and pinky with the other).

Now that I’ve started working, I’ve been reading a lot of code and realizing that code is more often read than written. Ultimately, camel case is just a lot harder to read (especially if you create meaningful method names like myself and my co-workers seem to do).

It’s pretty hard to defend:

Date calculatePersonDataUsageHistoryStartDate() {}

as more readable than:

def calculate_person_data_usage_history_start_date()
end

The underscores are like spaces, making the identifier a lot more readable. Of course, both are more readable than:

// Calculates the start date of the
// person's data usage history
time_t prsn_dt_uhst_st_dt(){}

This would never fly with Java (and, honestly, looks a bit weird), but I’m no longer gonna curse the Ruby convention.

EMMA and TestNG for Simple Java Code Coverage

Sunday, December 7th, 2008

Although RCov makes code coverage in Ruby dead simple, I wasn’t sure how easy this would be to achieve with Java. The first free tool I found is called EMMA and it was surprisingly easy to set up, especially since the documentation isn’t geared toward getting coverage during tests (but getting it during execution).

EMMA works by instrumenting the classfiles to analyze coverage. Although it can do just-in-time instrumentation, that didn’t seem to work for recording coverage via TestNG. The offline instrumentation makes it pretty easy to use with anything. Basically, you want your Ant file to:

  1. Compile your code
  2. Use EMMA to instrument your classes to a different directory
  3. Run your tests, using the instrumented classes first in your classpath and passing a few system properties to your running code
  4. Run EMMA’s report generator on the output

At first, I was getting some runtime errors because interfaces are not instrumented (and don’t show up in the location you tell EMMA to put them). The solution is to put both your instrumented classes directory and your regular, non-instrumented classes directory in the classpath, making sure the instrumented ones are first.

Here’s my test.xml I’m using in my fork of ImportScrubber that shows it all working together. All in all, it only took about 15 minutes to set up and debug. Of course, now, the tests that came with ImportScrubber provide almost no coverage, but that’s another story….

Interview Rubric really needed?

Friday, October 24th, 2008

It’s been a few weeks of job hunting and I haven’t once had to make a decision that my post on interviewing would’ve helped resolve. The fact is, you get a feel almost immediately for a place, and the reasons you say “yes” or “no” have more to do with who you’ll be working with than the number of monitors you get. Usually, a place that’s using ancient COTS products has put up red flags far earlier in the process. And the jobs I get excited about have nothing to do with their use of Git.

Some lowlights:

  • Left waiting for so long, I had to walk out of the interview in order to meet an appointment. The receptionist explained “Well, they have a lot of work to get done!” Good luck with getting it done.
  • A guy giving a tech interview (for a J2EE position) couldn’t understand how a Swing front-end to an EJB/JPA backend could possibly be called J2EE.
  • Being asked a logic question and, as soon as I used the right word while thinking out loud (in this case “tree”), was cut off and we moved to the next question. After 10 minutes of this, he walked out without telling me anything about the company or job.
  • Being offered a job after having no technical questions asked of me. Gee, who else is working there?
  • Interviewing in a place where every single aspect of the work environment was crappier than the crappiest house I’ve ever lived in. If I can afford a fresh coat of paint every few years, shouldn’t some “global enterprise solution consulting firm” be able to swing it?
  • A Java development shop using….Visual Source Safe. I think cp Foo.java Foo.java.bak might be better.

Of course, there’s been some legitimate highlights as well:

  • Being asked some challenging questions about concurrency and data structures. You may not ever have to implement a linked list, but anyone should know how, and when I’m asked, it’s a definite plus that the people on the other end know what they are doing.
  • Being asked to write code. So far, exactly two positions have asked me to write code in the interview. Thank god for them, or my faith would be shaken; seriously. I’m always very nervous when I’m applying for a job where I have to write code and no one seems to need any proof that I can do it. It makes me wonder who else is working there.
  • Solid explanations of the business or project. I used to think this was a no-brainer, but more often than not, I come away from a second interview with NO IDEA what I’d be doing if I took the job.

Criteria are only useful when you have to narrow down a lot of choices. I’d love to have so many great opportunities that I could just pick the one where I can develop on a Mac, or the one with the nicest office (all other things being equal). Sadly, that is not the case around here. It seems very few of my colleagues are being too particular, and I can’t help wondering what effect it might have on, well, the world if clueless developers were not as employable as it seems they are.

Things I’d like to leave behind

Tuesday, October 14th, 2008
  • Subversion – Git is so much better in so many ways (it goes without saying that CVS should be allowed to die)
  • JUnit – TestNG does all that JUnit does and more; what does JUnit even have to offer these days?
  • Java 1.4 – Java without generics is just so much pain
  • Java Web Frameworks – As far as I can tell, none of them adequately address one of the fundamental problems in web development, which is to simplify the creation of the UI.
  • Ant – Ant has always been the world’s worst build automation language; who creates a build tool without variables, loops and conditionals? We can do better (and Maven doesn’t appear to be it)
  • Checked exceptions – java.lang.Exception is possibly the worst class in the Java library; it should never be caught nor thrown [1]
  • Misuses of XML – XML has a purpose, and it’s not as a programming language or configuration file format. Any notion that XML is anything other than a binary format is misguided


[1] API developers should be allowed to declare that they throw Exception, so that subclasses can throw whatever they want with impunity; however, Throwable is the preferred thing to catch for catch-alls, and the entire exception mechanism in Java is woefully broken.

Interviewing the Interviewer: A Rubric

Tuesday, October 7th, 2008

Sad to say, my time at Gliffy is at an end (:sniff:), so I’m heading back into the job pool. I was lucky to get some time in at Gliffy, because, living in Washington, DC, my opportunities for sexy cutting-edge jobs are about zilch. Instead, I’m facing a huge market of “Senior JBoss Portal Maintenance Architect” type jobs.

I guess I should feel lucky that there’s lots of positions out there, but I really don’t want to be the cog in a huge machine. Gliffy has shielded me from the Horror That Can Be Consulting, so I need to keep my perspective in such trying times. So, calling on my experiences before Gliffy, I’ve made this handy rubric to make sure I explore all facets of a potential position.

I don’t expect anyone to get all positives and no negatives (I’ve certainly never been anywhere that perfect), but it’s always good to know. For example, if I have to put up with PVCS, I better be sitting on an Aeron chair and be using a normalized database. Further, this is obviously in addition to standard questions regarding what the project is about (i.e. is it interesting) and what the people are like. A whole lot of this can be forgiven by being part of a great team or working on a really cool product.

Each question is followed by the answers that earn points for and points against:

  • How do you fare on the Joel Test?
      Points for: High score; Good explanations for missing items
      Points against: Low score; Never heard of it
  • Describe your development process
      Points for: Structured; Easily described
      Points against: Overly draconian; Lack of; Not easily described
  • What kind of computer will I be using to develop?
      Points for: Two monitors; Mac; Linux; Administrator access
      Points against: Vague answer; Small monitor; Windows; Locked-down
  • Do you block certain sites or applications on your network?
      Points for: Open network
      Points against: Closed network
  • What is the physical environment like?
      Points for: Good chairs; Private office; Natural light; Reasonable temperature
      Points against: Old, crappy; Bullpen style
  • Am I required or encouraged to use Windows?
      Points for: No, few devs use Windows
      Points against: Yes
  • What collaboration tools do you use?
      Points for: Wikis; IM; Bug tracking; Sensible PM
      Points against: Email word documents; MS-project; SharePoint; Other proprietary crap (e.g. eRoom, Documentum); No tools
  • What are some of your HR policies?
      Points for: Few, if any; Loose dress code; Flexible schedule
      Points against: Draconian; Dress code; Core hours
  • Can I see the code I will be working with?
      Points for: Letting me see it; Meaningful javadocs/API documentation; Structured, consistent style; Sensible class names and file organization; Sane build process
      Points against: Not letting me see it; Mix of styles; No javadocs; Empty javadocs; Convoluted file organization; Broken build file
  • How do you do testing?
      Points for: Have testers; Do unit tests; Maintain tests; TDD; Bug tracker
      Points against: Ad hoc; Lip service to test-first; No unit tests
  • What is your approach to configuration management?
      Points for: Having an approach; Git; Database migrations; Know the versions of 3rd party software/have a baseline configuration; Organized
      Points against: No approach; CVS; Perforce, ClearCase, other closed crud; Shared drives; Can’t describe versions/configuration
  • Can I see the database schema I’ll be working with?
      Points for: Letting me see it; Normalized; Sane names (no TBL_* bullshit); Synthetic numeric keys; Referential integrity; Documented!; Versioned!
      Points against: Not letting me see it; Unnormalized; Dumb names; Incorrect types; String-based keys; No constraints; Undocumented

What else am I missing?

Are you emailing yourself your log errors? You should be.

Friday, September 26th, 2008

Time and time again, users complain about an application crashing on them or otherwise not working. They don’t provide you any info and it’s hard to repeat. You check out the log, but there are thousands (or millions) of entries and you have no clue where their error occurred. Worse, if you are deploying a RIA, the log may be on their computer and not available.

On my last project we experienced this scenario so much that we instituted two things:

  • All messages logged with Level.ERROR in log4j would be emailed to us
  • All exceptions caught on the client would be packaged and sent back to the server and logged at Level.ERROR (thus emailing them to us)
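
To make the two rules concrete, here is a minimal sketch of the idea in plain Java. It is not the log4j setup itself: it swaps in java.util.logging so that it runs without extra jars, with Level.SEVERE standing in for log4j's Level.ERROR. The names ErrorMailHandler, ClientErrors, and outbox are made up for illustration, and the handler only collects the text that a real one would mail.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

class ErrorMailDemo {
    /** Logs one info message and one error; only the error should reach the outbox. */
    static ErrorMailHandler run() {
        Logger log = Logger.getLogger("errormail.demo");
        log.setUseParentHandlers(false); // keep the demo quiet on the console
        ErrorMailHandler handler = new ErrorMailHandler();
        log.addHandler(handler);
        log.info("user opened a diagram");
        log.log(Level.SEVERE, "client reported a failure",
                new IllegalStateException("missing config"));
        log.removeHandler(handler);
        return handler;
    }

    public static void main(String[] args) {
        System.out.println(run().outbox.size() + " message(s) would be mailed");
    }
}

/** Hypothetical stand-in for a mail appender: collects what would be sent. */
class ErrorMailHandler extends Handler {
    final List<String> outbox = new ArrayList<>();

    ErrorMailHandler() {
        setLevel(Level.SEVERE); // threshold: error-level records only
    }

    @Override
    public void publish(LogRecord record) {
        if (!isLoggable(record)) {
            return; // below the threshold, so nothing is mailed
        }
        StringBuilder body = new StringBuilder(record.getMessage());
        if (record.getThrown() != null) {
            body.append('\n').append(ClientErrors.pack(record.getThrown()));
        }
        outbox.add(body.toString()); // a real handler would send an email here
    }

    @Override
    public void flush() {}

    @Override
    public void close() {}
}

class ClientErrors {
    /** Flattens an exception into text that a client could post back to the server. */
    static String pack(Throwable t) {
        StringWriter out = new StringWriter();
        t.printStackTrace(new PrintWriter(out, true));
        return out.toString();
    }
}
```

The point of the threshold is that the mailbox only ever sees problems someone should act on; everything else stays in the ordinary log.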

After the initial deluge of emails, we found a lot of bugs. I mean a lot of bugs. The annoying, intermittent kind that are hard to reproduce. Further, by judicious use of logging, we discovered a lot of mis-configured environments and other problems without having to get users to mail us their logs.

At Gliffy, they are doing the same thing. Right now, we’re testing a bunch of new features and the stage instance just sent me a bunch of emails, all indicating configuration problems, which is the exact kind of thing that can be hard to track down.

Setting it up using log4j is dead simple:

log4j.appender.mail=org.apache.log4j.net.SMTPAuthenticateAppender
log4j.appender.mail.SMTPHost=@SMTP_HOST@
log4j.appender.mail.UserName=@SMTP_USER@
log4j.appender.mail.Password=@SMTP_PASS@
log4j.appender.mail.Authenticate=true
log4j.appender.mail.From=errors@gliffy.com
log4j.appender.mail.To=@SMTP_LOGGER_FAILURE@
log4j.appender.mail.Subject=Errors from @SMTP_DESC@
log4j.appender.mail.BufferSize=1
log4j.appender.mail.Threshold=ERROR
log4j.appender.mail.LocationInfo=true
log4j.appender.mail.layout=org.apache.log4j.PatternLayout
log4j.appender.mail.layout.ConversionPattern=%d %p%n%t%n%c:%M:%L%n---%n%m%n---%n%n

In my previous job I even created a customized layout to format the emails in such a way that our code was highlighted and GMail didn’t compress things into threads.

If you aren’t doing this, you should be. Now.

Better open-source hosting: SourceForge is looking weak

Wednesday, September 17th, 2008

I currently host my Vim Javadoc doclet on SourceForge and every time I have to deal with it, it’s just a monumental pain. The documentation is insanely long and detailed, the website looks horribly out of date and cruddy, and when compared to stuff like GitHub and Lighthouse, it’s almost embarrassing how difficult it is to deal with and how bad the UI is (despite my best efforts, it still insists that the featured download is the vimdoc samples and not the doclet itself. WTF?).

I’m already hosting the code in GitHub and just moved my tickets to Lighthouse. The only thing left is where to host binary downloads and static assets. For RESTUnit, I’m using Google Code, which is pretty easy to deal with (about a zillion times simpler and easier than SourceForge); however, it has no facility for hosting arbitrary HTML. Currently, I’m just using my website for static assets.

While I do like the new Web-2.0 way of doing things (one site like GitHub really focusing on source, another like Lighthouse just does ticketing, etc. and they integrate via web services), I’m not sure where the best place is to host downloads and static assets. I would need programmatic access and some liberal download/diskspace quotas for sure. It would also be nice to be able to connect to other services, for example generate a changelog based on commits or tickets closed since the last release.

Test REST Services

Friday, September 12th, 2008

In my reply to a post on Tim Bray’s blog about using RSpec for testing REST services, I briefly described a project I’m working on, based on the work I’ve been doing at Gliffy, which is a testing framework for REST services called, unsurprisingly, RestUNIT.

For Gliffy’s REST-based integration API, I needed a way to test it, and hand-coding test cases using HTTPClient was just not going to cut it. Further, requests to Gliffy’s API require signing (similar to how Flickr does it), and our API was going to support multiple ways of specifying the representation type as well as tunneling over POST.
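
For readers who haven't seen Flickr-style signing, here is a rough sketch of that general scheme: sort the parameters by name, append each name and value to the shared secret, and take the MD5 of the result. This illustrates the Flickr approach only; Gliffy's actual algorithm isn't described here, and the key, secret, and parameter names below are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

class SigningDemo {
    public static void main(String[] args) {
        Map<String, String> params = new TreeMap<>();
        params.put("apiKey", "hypothetical-key"); // made-up parameter names
        params.put("action", "delete");
        System.out.println(Signer.sign("hypothetical-secret", params));
    }
}

class Signer {
    /** Flickr-style signature: MD5 of the secret followed by name/value pairs sorted by name. */
    static String sign(String secret, Map<String, String> params) {
        StringBuilder base = new StringBuilder(secret);
        // TreeMap iterates in key order, so the caller's insertion order is irrelevant
        for (Map.Entry<String, String> e : new TreeMap<>(params).entrySet()) {
            base.append(e.getKey()).append(e.getValue());
        }
        return md5Hex(base.toString());
    }

    /** Lower-case hex MD5 of the UTF-8 bytes of the text. */
    static String md5Hex(String text) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(text.getBytes(StandardCharsets.UTF_8))) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 support is mandatory for Java platform implementations
            throw new IllegalStateException("MD5 unavailable", e);
        }
    }
}
```

Because the signature depends on every parameter, any derived test that changes the request has to be re-signed, which is a good reason to let a tool do it.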

So, it occurred to me that there was a lazier way of doing this testing. All I really needed to specify was the relative URL, parameters, headers, method, and expected response. Someone else could do the signing and re-run the tests with the various options (such as specifying the MIME Type via the Accept: header, and then again via a file extension in the URL).
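
As a sketch of what "specify once, re-run with options" could look like, the class below holds just the facts listed above and expands one test into two: one that asks for the representation via the Accept header and one that uses a file extension. RestSpec, its fields, and the /documents/42 URL are illustrative assumptions, not the format Gliffy's tests actually use.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RestSpecDemo {
    public static void main(String[] args) {
        // Hypothetical URL; the real API paths are not given in the post
        RestSpec base = new RestSpec("GET", "/documents/42", 200);
        for (RestSpec variant : RestSpec.representationVariants(base, "application/xml", "xml")) {
            System.out.println(variant.method + " " + variant.url + " " + variant.headers);
        }
    }
}

/** One test case: method, relative URL, parameters, headers, expected status. */
class RestSpec {
    final String method;
    final String url;
    final Map<String, String> params = new LinkedHashMap<>();
    final Map<String, String> headers = new LinkedHashMap<>();
    final int expectedStatus;

    RestSpec(String method, String url, int expectedStatus) {
        this.method = method;
        this.url = url;
        this.expectedStatus = expectedStatus;
    }

    /** Copies this spec, swapping in a different relative URL. */
    RestSpec withUrl(String newUrl) {
        RestSpec copy = new RestSpec(method, newUrl, expectedStatus);
        copy.params.putAll(params);
        copy.headers.putAll(headers);
        return copy;
    }

    /** Expands one spec into two: MIME type via Accept header, then via file extension. */
    static List<RestSpec> representationVariants(RestSpec base, String mimeType, String extension) {
        RestSpec viaHeader = base.withUrl(base.url);
        viaHeader.headers.put("Accept", mimeType);
        RestSpec viaExtension = base.withUrl(base.url + "." + extension);
        List<RestSpec> variants = new ArrayList<>();
        variants.add(viaHeader);
        variants.add(viaExtension);
        return variants;
    }
}
```

The hand-written spec stays tiny, and every new option the API supports becomes one more expansion function rather than another copy of every test.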

I ended up creating a bunch of text files with this information. I then used a Ruby script to generate two things: an XML file that could be deserialized into a Java object useful for testing, and a PHP script to test our PHP client API.

The Ruby script would also do things like calculate the signature (the test text files contained the API and secret keys a Gliffy user would have to use the API) and generate some derivative tests (e.g. one using a DELETE, and another tunneling that over POST). The testing engine could generate some additional derivative tests (e.g. GET requests should respond to conditional GETs if the server sent back an ETag or Last-Modified header). All this then runs as a TestNG test.
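
Here is a small sketch of two such derivations. It assumes that tunneling is signalled with an X-HTTP-Method-Override header, which is a common convention but not necessarily what Gliffy's API does, and it uses the standard If-None-Match header for the conditional GET. Request, Derive, and the URL are hypothetical names for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class DerivedTestDemo {
    public static void main(String[] args) {
        Request delete = new Request("DELETE", "/documents/42"); // hypothetical URL
        System.out.println(Derive.tunnelOverPost(delete));
        System.out.println(Derive.conditionalGet(new Request("GET", "/documents/42"), "\"abc123\""));
    }
}

class Request {
    final String method;
    final String url;
    final Map<String, String> headers = new LinkedHashMap<>();
    int expectedStatus = 200;

    Request(String method, String url) {
        this.method = method;
        this.url = url;
    }

    @Override
    public String toString() {
        return method + " " + url + " " + headers + " -> " + expectedStatus;
    }
}

class Derive {
    /** Same request sent as a POST; the override header is an assumed convention. */
    static Request tunnelOverPost(Request original) {
        Request tunneled = new Request("POST", original.url);
        tunneled.headers.putAll(original.headers);
        tunneled.headers.put("X-HTTP-Method-Override", original.method);
        tunneled.expectedStatus = original.expectedStatus;
        return tunneled;
    }

    /** Replays a GET with the ETag the server returned; an unchanged resource should give 304. */
    static Request conditionalGet(Request original, String etag) {
        Request conditional = new Request("GET", original.url);
        conditional.headers.putAll(original.headers);
        conditional.headers.put("If-None-Match", etag);
        conditional.expectedStatus = 304;
        return conditional;
    }
}
```

A Last-Modified variant would follow the same shape, using If-Modified-Since instead of If-None-Match.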

The whole thing works well, but is pretty hackish. So, RestUNIT was created as a fresh codebase to create a more stable and useful testing engine. My hope is to specify tests as YAML or some other human-readable markup, instead of XML (which is essentially binary for any real-sized data) and to allow for more sophisticated means of comparing results, deriving tests, and running outside a container (all the Gliffy tests require a specific data set and run in-container).

The test specification format should then be usable to generate tests in any other language (like I did with PHP). I’m working on this slowly in my spare time and trying to keep the code clean and the architecture extensible, but not overly complex.