Monday, May 24, 2010

Jar Dependencies

There is a small debate in the software community about whether or not it makes sense to check jar dependencies into the code repository. Keeping them in the repository is the more traditional approach to managing dependencies and builds, but build tools like Maven store the jar files outside of the code repository in a jar repository. This approach has the advantage of only ever needing one copy of any particular dependency on your machine. The two alternative methods have me wondering what it looks like when one needs to go back and build a previous version of a project. If the jars are stored in the repository, then it's clear what jars are included in the build. If the build relies on a pom file, is it as clear? Are they the same jar files? It seems like they should be as long as it's a release, but it also seems like snapshots would be impossible to rebuild.

Friday, May 21, 2010

Code Performance

It seems to me that some developers respond to performance problems with a shrug of the shoulders. The suggestion these developers make is that the slowness is just a price the users have to pay for the functionality. It is true that solving performance problems can be a tedious task without any obvious solutions. For starters, the code works, so why muck with it? There are two very good reasons: the problems usually are not that hard to fix, and everyone thinks you're a genius when you do fix them.

The hardest part of fixing performance problems is identifying them. This is where profiling and heap analysis tools come into play. The rule of thumb is that 5% of the code will be 95% of the problem. I'm sure people are making up the percentages, but the point is simply that fixing one or two small pieces of code will make the whole application faster. Finding the cause is job one. That seems like a simple statement, but more often than not developers will be quick to throw out solutions first: "Let's get new servers", "We need to make it multi-threaded", "The network is slow", etc.

Plenty of open source profiling and heap analysis tools exist to help track down bottlenecks. Even simply dumping timestamps to the console can go a long way. Many areas of the code are also natural places where a slowdown could happen. These include communication across the network, loops, object creation, and database access. When you find these areas, try to avoid using gimmicks that will likely only moderately improve performance while making the code harder to read.
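As a rough illustration of the timestamp idea, here is a minimal sketch in Java. The `Timed` class, its `time` helper, and the `sumOfSquares` workload are all made-up names for this example, not part of any library. It only shows the shape of the technique: wrap the suspect call, print the elapsed time, and keep the result.

```java
import java.util.function.Supplier;

// Hypothetical helper for illustration; none of these names come from a real library.
final class Timed {
    private Timed() {}

    // Runs the task, prints how long it took, and returns the task's result.
    static <T> T time(String label, Supplier<T> task) {
        long start = System.nanoTime();
        try {
            return task.get();
        } finally {
            long elapsedMicros = (System.nanoTime() - start) / 1_000;
            System.out.println(label + " took " + elapsedMicros + " us");
        }
    }

    // Stand-in workload so the sketch has something to measure.
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += (long) i * i;
        }
        return total;
    }

    public static void main(String[] args) {
        long result = time("sumOfSquares", () -> sumOfSquares(1_000_000));
        System.out.println("result = " + result);
    }
}
```

A single measurement like this is noisy, especially before the JIT has warmed up, so treat it as a way to spot the obviously slow call rather than as a benchmark.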

Instead, try one of the following top 7 ways to speed up your code.
  1. Increase memory
  2. Add caching
  3. Write efficient SQL with joins and indexed columns
  4. Do more in each step
    1. Optimize loops (attributes, unrolling)
    2. Send data in chunks across the wire, to the database, etc.
  5. Choose the right collection
  6. Add more threads, with care
  7. Reduce object creation
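
To make item 2 a little more concrete, here is a minimal caching sketch in Java. The `CachedLookup` class and the "customer-" loader are hypothetical, invented for this example. It assumes the slow call is a pure lookup by key, never returns null, and that an unbounded in-memory map is acceptable. A real cache would likely need eviction and invalidation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical cache wrapper for illustration; the names are assumptions.
final class CachedLookup<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader;
    private final AtomicInteger loads = new AtomicInteger();

    CachedLookup(Function<K, V> loader) {
        this.loader = loader;
    }

    // Returns the cached value, calling the slow loader only on a miss.
    V get(K key) {
        return cache.computeIfAbsent(key, k -> {
            loads.incrementAndGet();
            return loader.apply(k);
        });
    }

    // Number of times the underlying loader actually ran.
    int loadCount() {
        return loads.get();
    }

    public static void main(String[] args) {
        // The lambda stands in for a slow database or network call.
        CachedLookup<Integer, String> names =
                new CachedLookup<>(id -> "customer-" + id);
        names.get(42);
        names.get(42); // served from the cache
        names.get(7);
        System.out.println("loader calls: " + names.loadCount()); // prints "loader calls: 2"
    }
}
```

The load counter is only there to show that the second request for the same key never reaches the slow path.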

Tuesday, May 18, 2010

Bad week for my stock picks

Bad week on earnings! GME has dropped after industry numbers for April looked bad. I don't see why this is a big deal, and I was thinking about switching my SPU position back to GME before GME reports earnings on Thursday, but SPU has baffled me today. They reported a Q1 blowout with plenty of cash on hand, and they made it clear that they have an S-1 outstanding only in case they find suitable acquisitions to sustain their growth rate. So why are they down 5 to 6%? Meanwhile DAAT reported a loss and blamed it on timing, but that seems like a poor excuse.