XmlData: The results are in

After three test runs of the XmlData sensor, I have gotten the sensor up and running. The settings for the three runs were:

  1. The default JVM heap size, which is 64ish MB.
  2. An increase of JVM heap size to 1024 MB with a manual invocation of System.gc() after each send to the sensorbase and after each file.
  3. An increase of JVM heap size to 1024 MB.

After I fixed the problem of loading the hashmap with duplicate entries, the sensor failed to finish sending data due to an Out of Memory Exception. After some investigation, I found that the sensor was failing when the largest file was unmarshalled. The largest file was 10 MB in size.

First Test Run with no increase in heap:


So I decided to increase the heap size, and the sensor now completes. Great success! At 1:13, you can see the largest spike on the heap usage graph. That is the time when the 10 MB file was unmarshalled. With the increase in heap, the sensor had no problems.



For fun, I ran one last invocation of the sensor without the manual System.gc() calls. It turns out that the sensor allocates around 26 MB more to send the same amount of data. It is interesting to note that at the end of the sensor process, the data is very small, but the sensor still uses the same amount of allocated memory. In the 2nd invocation, the heap usage graph shows that it only uses a minimal amount of memory.


I will need to document in the user guide to increase the heap size if sending large files. It might be a good idea to tell users to increase their heap to a large number. If we come across the case where a user has an abnormally large file, we may need to go the route that Philip suggested of writing a custom SAX parser. That will remove the need to load the entire document into memory.
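To make the SAX idea a bit more concrete, here is a minimal sketch of event-based parsing with the JDK's built-in SAX API. It is not the sensor's code: the class name StreamingEntryCounter and the element names "Entries" and "Entry" are made up for illustration, and a real parser would build and send sensor data in the callback instead of just counting. The point is only that the handler sees one element at a time, so the whole document never has to sit in memory.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical example class; not part of the XmlData sensor. */
public class StreamingEntryCounter {

  /**
   * Counts the elements with the given name by streaming through the XML.
   * SAX delivers one event per element, so no document tree is built.
   */
  public static int countElements(InputStream xml, final String elementName) {
    final int[] count = {0};
    try {
      SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
      parser.parse(xml, new DefaultHandler() {
        @Override
        public void startElement(String uri, String localName, String qName,
            Attributes attributes) {
          // A real sensor would read the attributes here and send the entry.
          if (qName.equals(elementName)) {
            count[0]++;
          }
        }
      });
    }
    catch (Exception e) {
      // Wrapped to keep the sketch short; real code would handle these separately.
      throw new IllegalStateException("Could not parse XML stream", e);
    }
    return count[0];
  }

  public static void main(String[] args) {
    // "Entries" and "Entry" are assumed element names, used only for this demo.
    String sample = "<Entries><Entry tool=\"a\"/><Entry tool=\"b\"/></Entries>";
    InputStream in = new ByteArrayInputStream(sample.getBytes(StandardCharsets.UTF_8));
    System.out.println("Entries seen: " + countElements(in, "Entry"));
  }
}
```

The trade-off is that the handler has to track any state it needs across events itself, which is more work than unmarshalling but keeps memory use roughly independent of file size.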

The sensor is almost functional. I noticed that version 7 data includes pMap attributes, which I probably will need to separate. After that's completed, I can write some documentation and release! Hopefully I can send out a review request at the next milestone.

Monitoring Java Processes with JConsole

I am currently working on fixing some Out of Memory (OOM) exceptions that are being thrown by the Hackystat XmlData sensor. Philip suggested in an email that a good first step would be to start looking at heap usage with JConsole. JConsole is actually a decent tool. It gives you graphs of heap usage over time, spawned threads, created classes, and a nice little overview. The tool even lets you invoke the garbage collector on the monitored process.

The Overview of the monitored process.



The Heap Monitoring Tab


Tonight I was trying to reproduce the OOM exception, which, according to the exception log, was thrown when a 10MB file was unmarshalled. I tried to reproduce the error by only sending data from the 10MB file, but the exception was not thrown. After consulting the screenshots I took of the previous night's test run, I found out that the OOM exception was thrown after an hour of sending data. I also noticed that the heap usage hovered around the high 50MB to low 60MB range.

My hypothesis is that the memory usage was almost at a maximum when the large 10MB file was accessed. So if I can keep the heap usage leveled off, maybe the exception will not get thrown. To test my hypothesis, I pressed the "Perform GC" button a couple of times while sending data. I noticed that the heap usage would stay in the mid to high 50MB range. That led me to insert some 'System.gc()' calls after each file and after sending a batch of data. The test run is currently running and I do notice that the heap usage is about 10MB lower on average.
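For anyone who wants the same numbers without JConsole open, here is a rough sketch of logging heap usage around a System.gc() call with the standard Runtime API. The class name HeapLogger and the loop over file names are made up for illustration, and the actual sending step is left as a comment. Keep in mind that System.gc() is only a request to the JVM, so the "after" number is not guaranteed to drop.

```java
/** Hypothetical helper; not part of the XmlData sensor. */
public class HeapLogger {

  private static final long MB = 1024L * 1024L;

  /** Returns the heap currently in use, in MB (allocated heap minus its free part). */
  public static long usedHeapMb() {
    Runtime runtime = Runtime.getRuntime();
    return (runtime.totalMemory() - runtime.freeMemory()) / MB;
  }

  public static void main(String[] args) {
    // Each argument stands in for one data file the sensor would process.
    for (String fileName : args) {
      // The sensor would unmarshal and send the file here (omitted in this sketch).
      long before = usedHeapMb();
      System.gc(); // A hint only; the JVM may ignore it.
      long after = usedHeapMb();
      System.out.println(fileName + ": heap used " + before + " MB before gc, "
          + after + " MB after");
    }
  }
}
```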

Wow, it looks like you get a blog with updates before the blog is posted! The exception was just thrown again, but the heap was around 49MB when it encountered the 10MB file. I'm beginning to think that the solution may be to increase the JVM heap from the command line. I did a test invocation with a 40 MB data file and the OOM exception was thrown immediately. If the sensor sends all of the data with 1024MB of heap allocated, we may just have to inform users to increase the heap size.
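The heap ceiling is set with the JVM's -Xmx option, for example -Xmx1024m for 1024MB. A quick way to confirm the setting actually took effect is to print what the running JVM reports as its maximum. This is just a sketch; the class name MaxHeapCheck is made up, and the reported value is usually a little below the number passed to -Xmx.

```java
/** Hypothetical helper; not part of the XmlData sensor. */
public class MaxHeapCheck {

  /** Returns the maximum heap the JVM will try to use, in MB. */
  public static long maxHeapMb() {
    return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
  }

  public static void main(String[] args) {
    System.out.println("Max heap: " + maxHeapMb() + " MB");
  }
}
```

Running it as "java -Xmx1024m MaxHeapCheck" and again with no option shows the difference between the raised limit and the default.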

Requesting the user to increase the heap size in this case may be the correct solution. Previously, increasing the heap did not work because I implemented the sensor wrong. I need to write a blog entitled, "It's usually your fault." We'll just have to wait and see how the current test run goes.

Hackystat 8 Supercharged!

As I was in the shower 5 minutes ago, I had a thought. (I think I should move my workstation into the shower because I channel the most ideas there).

The Hackystat 8 architecture will work. The reason I believe it will work is a combination of old and new things:

  1. Disciplined Developers
  2. Great Hackers
  3. The new motto: "Convention over Configuration".
  4. The sweet new architecture known as Version 8.

Disciplined Developers
Pavel found a great article on what it takes to be a disciplined developer. Without discipline I don't think you will have the ability to produce quality software in a timely manner. If you look up discipline in the dictionary, you'll come to this page.

Great Hackers
If there is one thing I miss about working in CSDL, it's reading other people's code. Working on Hackystat 8 has reminded me what it is like to read beautiful, elegant code. A motto I have come to accept is to "strive to be the worst in the group". The only way to improve is to always have something to learn from your peers. The Hackystat team is constantly providing me with that opportunity ;)

Convention over Configuration
The motto "Convention over Configuration" is a large improvement over the Hackystat 7 motto, which was, "Config it, then let's document!" (There wasn't really a motto for Version 7, so I decided to make one up for entertainment value). In Hackystat 7, you had to go through so much application and dependency configuration via properties files. That was a huge headache if you started developing on another machine.

In Hackystat 8, things are already starting to look very different. Gone are the days of endless configuration! We now have small little modules, each with its own library that can be used by other modules. Want to start the server up to receive data? Configuration files? No way! Hit the command line (java -jar sensorbase.jar). I'm liking it so far. Small little pieces, small little hacks, large contributions to the Hackystat project.



Sweet New Architecture
I'm not going to pretend I can explain the new architecture in detail. Check out Philip's Version 8 Design Doc. Sweet!


As Aaron has said, Hackystat is fun. I can't wait till the shirts come out. I'll take 10.