Archive for the ‘Development’ Category

a weekend of craft, theatre, and technical meltdowns

Sunday, July 1st, 2012

this weekend turned out to be a rather odd mix of side-projects and technical chaos.  and just to preface it — this is not a boastful blog entry.  everything i did in the technical realm was either (a) a simple fix or (b) being helpful — nothing to brag about.  it’s the circumstances that make it something i’d like to put down on record :)

so, Friday night i was stitching together the last parts of my Burning Man coat.  it’s made of fur, and ridiculous by design.  i’m adding some needed collar reinforcement, when suddenly i start getting Prowl notifications.  my health checks are failing.  “ah, crap, not again,” says the guy who’s used to running a totally-non-critical app platform in the AWS cloud, “i’ll get to it after i’ve finished sewing buffalo teeth into the collar.”  so i did.  my instance’s CPU appeared to be spiked — i could hit it with ssh, but the connection would time out.  a reboot signal resolved the issue (after an overnight wait).  and it was thus that i fell victim, like so many others, to Amazon’s ThunderCloudPocalypse 2012.  and the secret bonus was that one of my EBS volumes was stuck in attaching state.  “ah, crap, not again,” says the guy who’s gonna lose some data (because he has backup scripts for Capistrano but no automation for them yet), and i’m forced to spin up a new volume from a month-old snapshot.  no worries – it wasn’t my MySQL / MongoDB volume, just the one for my blog & wiki & logs.  i got that up and running on Saturday in-between rehearsing scenes for The Princess Bride (coming to The Dark Room in August 2012 !!)

then i was immediately off to rehearsal for my Dinner Detective show that night.  yeah, it was one of those kind of Saturdays.  so, i was sitting there waiting for my cue, when at about 5pm PDT, failure txts suddenly start raining down from work.  and from multiple servers that have no reason to have load problems.  i log into our Engineering channel via the HipChat iPhone app, and our DevOps hero is already on the case.  ElasticSearch has pegged the CPU on its server, and JIRA & Confluence are going nuts as well.  something’s suddenly up with our Java-based services.  i ask him to check on Jenkins, and sure enough, it’s pegged too.  and no one’s pushed anything to build.  he goes off to reboot services and experiment, and i go off to check Twitter to see if we’re the only ones experiencing it.  sudden JVM failures distributed across independent servers?  that’s unlikely.  he guesses it’s a problem with date calculation, and he was absolutely right.  hello leap-second, the one added at midnight GMT July 1st 2012.  i RT:d a few good informative posts to get the word out — what else can i do, i’m at rehearsal and on my phone! — and then let DevOps know.  we’re able to bring down search for a while, and it turns out rebooting the servers solves the problem (even without disabling ntpd, as other folks recommended).  so, disaster averted thanks to Nagios alerts, a bit of heroic effort, and our architect’s choice of a heavily Ruby-based platform stack

again, as i prefaced; nothing impressive.  no Rockstar Ninja moves.  no brilliant deductions or deep insightful inspections.  neither lives nor fortunes were saved.  and i got to wake up on Sunday, do laundry, pay my bills, and go out dancing to Silent Frisco for the later hours of the afternoon.  but it was fun to have been caught up in two different reminders of how fragile our amazing modern software is, and how the simplest unexpected things — storms in Virginia, and Earth’s pesky, not-quite-constant rotation — can have such sudden, pervasive, quake-like impacts on it

delayedCallback(function(){ … }, delay);

Thursday, March 22nd, 2012

hola, Amigos! it’s been a long time since I rapped at ya!

so, i’ve been doing plenty, i’m just not chatty about it. i built a Ruby migration framework using bundler, pry, spreadsheet, sequel and mp3info to build a JSON document version of my SEB Broadcast database. next up is some node.js to serve it up, then some RequireJS, mustache (?) & jQuery goodness to spiff up the SEB site

but in the meanwhile, i wrote this little gem at work:

// returns a function that will invoke the callback 'delay' ms after it is last called
//   eg. invoke the callback 500ms after the last time a key is pressed
// based on http://stackoverflow.com/questions/2219924/idiomatic-jquery-delayed-event-only-after-a-short-pause-in-typing-e-g-timew
//   but fixes the following:
//   (a) returns a unique timer scope on every call
//   (b) propagates context and arguments from the last call to the returned closure
//   and adds .cancel() and .fire() for additional callback control
// then of course, you could use underscore's _.debounce
//   it doesn't have the .cancel and .fire, but you may not care :)
function delayedCallback(callback, delay) {
    return (function(callback, delay) {                          // (2) receives the values in scope
        var timer = 0;                                           // (3a) a scoped timer
        var context, args;                                       // (3b) scoped copies from the last invocation of the returned closure
        var cb = function() { callback.apply(context, args); };  // (3c) called with proper context + arguments
        var dcb = function() {                                   // (4) this closure is what gets returned from .delayedCallback
            context = this;
            args = arguments;
            window.clearTimeout(timer);
            timer = window.setTimeout(cb, delay);                // (5) only fires after this many ms of not re-invoking
        };
        dcb.cancel = function() { window.clearTimeout(timer); }; // (6a) you can cancel the delayed firing
        dcb.fire  = function() { dcb.cancel(); cb(); };          // (6b) or force it to fire immediately (works even if .fire is detached)
        return dcb;
    })(callback, delay);                                         // (1) capture these values in scope
}

yes, i know. so it turns out that i didn’t know about underscore’s _.debounce() when i wrote it. eh. so much for DRY :)

still — i’m glad i thought it through. to me, this implementation captures the most powerful aspects of ECMAScript itself:

  • scope-capturing closures
  • specifiable function context
  • freestyle properties on Object instances
  • single-threading (look ma, no synchronize { ... } !)
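here's a rough usage sketch that touches each of those bullets. it carries a compact copy of the function, adapted to the global setTimeout / clearTimeout (my assumption, so it also runs outside a browser). the `field` object, its `onKey` handler and the `searches` array are made up purely for illustration

```javascript
// Compact copy of delayedCallback, adapted to use the global timer functions
// (an assumption, so that the sketch is not tied to `window`).
function delayedCallback(callback, delay) {
  let timer = null;                       // scoped timer, unique per call
  let context, args;                      // copies from the last invocation
  const cb = function () { callback.apply(context, args); };
  const dcb = function () {
    context = this;                       // specifiable function context
    args = arguments;
    clearTimeout(timer);
    timer = setTimeout(cb, delay);
  };
  dcb.cancel = function () { clearTimeout(timer); };   // freestyle properties
  dcb.fire = function () { dcb.cancel(); cb(); };
  return dcb;
}

// Hypothetical usage: only the last term typed gets recorded.
const searches = [];
const field = { name: 'search-box' };
field.onKey = delayedCallback(function (term) {
  searches.push(this.name + ':' + term);
}, 500);

field.onKey('w');
field.onKey('wo');
field.onKey('wookie');   // each call restarts the 500ms timer
field.onKey.fire();      // skip the wait: runs once, with the last context + arguments

// searches is now ['search-box:wookie']
```

since nothing else can run between `clearTimeout` and `setTimeout`, there is no locking anywhere in there.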

anyway. bla dee blah. this post also gave me the incentive to start embedding gists in my blog. nice helper widget, dflydev !

peace out

Upgrading your Rails Development Mac to Snow Leopard

Saturday, February 6th, 2010

Oh, there is such joy in the process of upgrading to Mac OS X Snow Leopard for us developer folks. Me, I chose the in-place upgrade path … my co-worker, who chose the from-scratch path, was deprived of some of these pleasures. Then again, he had to reconstruct everything from scratch, so he had his own barrel of monkeys to contend with.

Here are all the bumps I ran into, pretty much in reverse order as I tried (unsuccessfully) to do the minimum amount of work possible :) I had to go through pretty much this entire process twice — once on my work Macbook Pro, once on my identical home version — so I figured I might as well document all of this crap in the hope that it may reduce the shock and awe of future migrators. Of course you may run into a mess of fun issues not described here … And pardon me if the instructions aren’t perfect, because I’ve tried to boil a lot of this down to the it-finally-worked steps extracted from the frenzied back-and-forth of my real-time upgrade experience :)

Backups

Yes, this is just common-sense developer stuff (as are a number of the other obvious things that I call out in this post).

You’ll probably want to do a full mysqldump before upgrading. You can dump your port installed and gem list --local listings up front as well, or wait ’til you get to those respective steps below.

MacPorts

If you also chose the MacPorts library system, you’ll need to re-install it from scratch. You’ll need X11 from the Snow Leopard OS X install disks, and you’ll need to download the latest version of Xcode. Follow the migration steps as outlined on their Wiki; it does the trick.

Save off your port installed list as a reference. Now, your MacPorts install will be completely toast, so that command won’t work until you re-install. No problem though — all of your packages will still be listed even after you upgrade.

The port clean step in the Wiki will crap out in the http* range, but that’s fine … you can probably skip that step anyway. Re-install your core packages and you’re good to go. I suggest installing readline if you haven’t, because it’s very useful in irb or any Ruby console.

MySQL

It was not necessary for me to build MySQL from source. Instead, I just installed the x86_64 version of MySQL — the latest was mysql-5.1.42-osx10.6-x86_64.dmg at the time of this writing.

If this is a 5.x version upgrade for you as well, the install will just re-symlink /usr/local/mysql, so your old data will still be in your previous install dir.

I didn’t make mysqldumps before I did the upgrade (handpalm) so I had to copy over my data/ files raw and hope that the new version would make friends with them. Initially I had problems with InnoDB. It wasn’t showing up under show engines;, and when I tried to manually install the plug-in — per this bug report, which explains the whole thing — it would fail on the ‘Plugin initialization function’. Turns out you need to do two things when you bring over raw data:

  • Whack your /var/log/mysql/*binary* / binary log files in order to get past mysqld startup errors.
  • Whack your ib_logfile* files too. Once you do that, there’s a good chance MySQL will regen them in recovery mode. Me, I had no choice (except rolling back with Time Machine). Miracle of miracles … it works!

Don’t try this at home kids. Make your backups. Note: here’s the correct link to the manual page on InnoDB tablespace fun.

x86_64 ARCHFLAGS

Snow Leopard is a lot more native 64-bit than previous OS X versions, and when you do your manual builds & makes, you may want to set the following environment variable:

export ARCHFLAGS="-Os -arch x86_64 -fno-common"

You’ll see a set of similar (though mixed) recommendations in the blogs I reference below; this particular flagset worked for me.

Ruby

I built Ruby 1.9 at work, and 1.8.7 on my personal machine. Either path is fine, just pick up the latest source of your choosing. Chris Cruft’s blog post goes into some of the details I’m describing here as well. Basically, the README boils down to:

autoconf
./configure --with-readline-dir=/usr/local
make clean && make
make install

Though there’s no reason in the world that you’d want to — it has been superseded — do not install ruby 1.9.1p243. If you do, you’ll never get the mysql gem to work. Or wait, or was it mongrel? Well, it was one or the other … just trust me, it’s bad.

Gem

I re-built gem from source from scratch as well, just to be sure. Save off your gem env and gem list --local as a reference. And before you start installing gems, you’ll also probably want to make sure you’re fully up-to-date with gem update --system, though that’s probably redundant.

Uninstall and re-install all of your gems; if some won’t uninstall even though they’re listed, it may be an install path issue. Use gem list -d GEMNAME to find where your gem was installed, and then use gem uninstall -i DIRNAME GEMNAME to finish the job.

With the ARCHFLAGS in place, the vast majority of the native source gem builds will go smoothly, but there are some notable exceptions …

The mysql Gem

Uh huh, this is the one gem that gets me every time. And again, you won’t have needed to have built MySQL from source.

For starters, you may want to glance over this very useful iCoreTech blog post to see if it works for you. But if you run into a lot of issues like I did, you may need to do it in two steps:

Fetch and Install the Gem

At the time of this writing, either mysql gem version 2.7 or 2.8.1 will do the trick.

gem fetch mysql --version VERSION
gem install mysql -- --with-mysql-dir=/usr/local --with-mysql-config=/usr/local/mysql/bin/mysql_config
gem unpack mysql

Sadly, it may fail, either during the build or when you try to test it. I was able to successfully run the (included?) test.rb at my workplace, but as simple as that sounds, I swear I don’t remember how I did it ! The second time, at home, I only found the problems retroactively when I tried to get my Rails projects to boot. If you do find and run the test.rb, you’ll need to make sure that the standard MySQL test database exists.

Both times, one of the big blockers that I — and many other people — ran into was:

NameError: uninitialized constant MysqlCompat::MysqlRes

If so, try this:

Manually Re-build the Binary

Go into ext/mysql_api, make sure your ARCHFLAGS are exported as described above, and …

ruby extconf.rb --with-mysql-config=/usr/local/mysql/bin/mysql_config
make clean && make
make install

Hopefully your newly built & installed binaries will resolve the issue.

Mongrel

It took me a little effort to build mongrel on Ruby 1.9 with the x86_64 architecture. My memory is a little hazy — since my 1.8.7 build at home worked perfectly through standard gem install — but buried deep in this Is It Ruby contributory blog post are probably all the answers you’ll need.

Under Ruby 1.9, I did have to modify the source, which (to paraphrase) involved some global code replacements:

  • RSTRING(foo)->len with RSTRING_LEN(foo)
  • RSTRING(foo)->ptr with RSTRING_PTR(foo)
  • change the one-line case ... when statements from Java-esque : delimiters to then’s.

And then the re-build:

ruby extconf.rb install mongrel
make
make install
cd ../..
ruby setup.rb
gem build mongrel.gemspec
gem install mongrel.gem

Conclusion

And that’s as far as I had to go. Whew! I certainly hope that my post has been of some assistance to you (and with a minimum of unintended mis-direction). Of course, I learned everything that I reiterated above by searching the ‘net and plugging away. And there’s plenty of other folks who’ve gone down this insane path as well. Good luck, brave soul!

What Would a Wookie Do?

Saturday, September 12th, 2009

Yes, I had to ask myself that question a lot recently, at least from the perspective of how he would use the Twitter service. As much as this is a theoretical question, I believe I came up with some answers, and they are made manifest in the @cr_wookie Personality Engine.

Base-line Aesthetics

The first step in bringing a Wookie to life was to establish a basic phonetic dialect. So I came up with a set of candidate ‘words’ along the lines of: auhrn, rghrn, gahrn, hraur, urau, ehruh, and nrauh. Hey, they sounded good. There are about 60 of them total, composed of the letters A E G H N R and U. After having watched The Empire Strikes Back, where Chewbacca seems to get most of his good lines, I expanded into a few W words as well (wahr seems to work particularly well). Many folks postulate that he was capable of speaking Os and Ks, yet I myself do not subscribe to that opinion.

I then used MadderLib to construct simple word generators which allowed the phonemes to stretch appropriately for different word lengths. Long vowels, double Rs and Hs, whatever looked good. And with a little compositing logic, I came up with some sentence patterns that were quite fun to read out loud:

rauhr nruuuhh raghr uhr rghrrnnauhrnaauuuuuurhhh
nrauuh euu gauuhrr ruhr
urhn ehrraaah rhhreuhraahhrrrrnn gaurrh
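The stretching idea can be sketched in a few lines of JavaScript. This is not MadderLib and not the Wookie's actual generator. It's a minimal illustration that assumes vowels, Rs and Hs are the only stretchable letters, and the function name `stretchWord` is made up for the example.

```javascript
// Letters that may be repeated when stretching (an assumption based on
// "long vowels, double Rs and Hs").
const STRETCHABLE = /[aeurh]/;

// Hypothetical helper: lengthen `word` to `targetLength` by doubling
// randomly chosen stretchable letters in place.
function stretchWord(word, targetLength) {
  const chars = Array.from(word);
  const canStretch = chars.some((c) => STRETCHABLE.test(c));
  while (canStretch && chars.length < targetLength) {
    // Collect the positions of every stretchable letter, then pick one.
    const positions = [];
    chars.forEach((c, i) => { if (STRETCHABLE.test(c)) positions.push(i); });
    const i = positions[Math.floor(Math.random() * positions.length)];
    chars.splice(i, 0, chars[i]);   // duplicate that letter next to itself
  }
  return chars.join('');
}

const stretched = stretchWord('wahr', 8);   // e.g. something like 'waahhrrr'
```

Because letters are only ever doubled next to themselves, collapsing the repeats always gives back the base word, so the dialect stays recognizable at any length.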

Those are rather plain-looking though. In order to make the Wookie’s statements seem more like txt argot, some variety was needed. Punctuation was obvious, both terminating and delimiting (commas, semicolons). Plus there’s the proper use of txt idioms, the LOLs and WTFs that are so popular with the youngsters these days (AFAIK). Throw in a nice little collection of emoticons, and behold; the Wookie has charm:

LOLZ! uhrrn euueuuhhaur, eruh hraur nrauururhnehrrraaah auhrn harn aurreruhaaarruuuhh rraaghrrr ^_^
hraur rghneruh nuh waahhr???
rhhhnnghn uhrrrnn ehrah. euu urau ehuuurrr urrn aurheuuuh haarn uhrrrn k?… haauurrr nruharuhuhhrrh hruaaaauuh ghrn rghn nrruuhh

Well, yeah, they still look sorta flat. Real people quote and capitalize portions of what they type, and there are other non-verbal components to the average sentence. So, the Wookie was taught to inject numbers, times, abbreviations, and even Star Wars calendar years into his sentences:

gahrn rghhr ghrrnehur rghhrrn hruauh raghrehurauuuuuuhrn gahrn?? hrraaauu: hrrraaauu rauhr ;) _ehruuh_ ‘Rahr Ehrrraaaah’
raauh hrau Ehrahurrrrh *rauhr* gruh
auhrneuuhr. harnuhrrnuhrrr uhr – raauhneuuu 1:30 gahrn raauuughrrr *nuuh* aahhrruuuunn uhr. wuurh harn rhr?

Once the Wookie was at this point, he could talk for quite some time and produce diverse aesthetic results. Reading them out loud is a hoot! Thus was born the first Personality Engine bot (the Wookie is comprised of four of them). But he still wasn’t really tweeting until he could follow some of the core Twitter memes.

Twitter Memes

Hash tags were the first obvious choice, since they were easy to fake. To this day, the Wookie can simply prepend a # to any word or composite that he speaks. But to make this feature really zing, I added support for Twitter’s trending searches. This allowed him to use real-world relevant tags, injecting them into his sentences, or appending them to his tweet (as is common convention). It turns out that one of the joys of a nonsense grammar is that anything which isn’t nonsense magically becomes the ‘meaning’ of the sentence:

Harn ehureruheuuu & ahrn rrghhhn! AAAHHRRNNAUHRN EHUUR! #itsnotgonnawork
WAHR NUUUH RAUHRR!!! euu ‘aauurr nraur’ haaauurrurau! #fact
GRRUUUH! uhr hraheuuuuhhr aauurh #ChargersSuck rrhnn aur! gauhr aaurh haurerruuuhh! !! #aurh

Of course, no Twitter user can resist posting shortened URLs. They’ve been a cornerstone to the explosive growth of the service, maybe because there’s just so much interesting fast-moving crap out there on teh Internets. The Wookie follows several aggregation services — Digg, Technorati — and a smattering of other popular blogs — TMZ, The Onion Daily, LOLcats — etc. He pulls out links to recent content and shortens them with the bit.ly API. Again, since the Wookie is totally faking it, the results just cannot be accounted for. The best he can hope for is that the emotional texture of his tweet sometimes supports the referenced source:

nrauh rrhn hrauuur ehrruhauhrnurrh. nrraauuuhh IMHO. hraur grraauhurhnauhrn rauh http://bit.ly/mjOD7
Ghrn euuhrr? haarneruuuhh urrn aruh rghrnn aaur, uhr hruh urrr :) http://bit.ly/1a9j1K

And no tweeter lives in a vacuum; their posts are replete with the user handles of friends, comrades and mentors.
The Wookie wasn’t about to make up handles, so his likely choices were his followees and followers.
Rather than take the name-dropping approach here — more on that later — the Wookie chooses to occasionally reference his most recent followers:

OMG! _rrrhnnneuh_ euh: urr wuuurrh rghnurrnh urhn hruhn @sleepbotzz rhagn ghrn rrhn waaahhr hruauhehuraghrrrrrnnn rhagn harrnaurh

After these features were implemented, the Wookie’s posts started to look almost real-ish. And whenever he tweets on his own, that is his range of capabilities. But he’s still not a real member of the Twitter community until he could play some other tricks. Thus began a completely separate effort; how to translate English into Wookie.

Mocking

Did I say ‘translate’? What I meant to say was ‘mock’. After all, what can you really do with a nonsense grammar except make it look like it has meaning?

So, the Wookie was taught to mock existing sentences into his own dialect. He simply matches the initial letter (vowel / not) and preserves the word’s length and non-alpha characters (for contractions and the like). Special mappings were also added to deal with short words (the dialect only generates words 4-letters and above). And within a given tweet, he re-uses the same fake word for each instance of the real one. It’s an obscure feature, but it makes a helluva difference in some specific cases.

The totally awesome part of the effort is identifying the words that don’t get mocked. There was no way I wanted to deal with semantic grammar detection, since tweets are often wildly non-grammatical. So as per usual, the Wookie fakes it. It mainly comes down to a weak analysis of quoting and capitalization patterns. He also keeps hash tags, links, handles, many acronyms, and argot — to the best of his ability.

And just for fun, he also recognizes a rather large lexicon of terms from the Star Wars universe. Well, except for the term ‘Star Wars’ that is. He doesn’t know what that means.

It took a lot of experimenting to get it right, and he still makes mistakes, but he’s getting smarter all the time. One of the interesting things I learned during development was how staccato the English language is, as compared to the long smooth yawls of Wookie. Reading back a mocked sentence out loud is a sublime experience.

You may ask, how can this awesome power best be used to serve the Good?

Re-Tweeting

Darn right the Wookie re-tweets! He simply selects a few users that he follows, derives their recent tweets, and mocks one of them up. There are some users — @darthvader for instance — which he will always re-tweet if the user has posted anything fresh. Otherwise it’s a simple random selection, after avoiding repeats (there’s extensive repeat-avoiding code all throughout the Wookie implementation). There is the slight hint of name-dropping here, since he tends to follow a lot of popular accounts, but that’s just the nature of this beast.

RT @warrenellis Rauuh’r @neilhimself ehr #neilfail ar hruun rraaauuuuhrrr? Au. #warrenfail. http://bit.ly/16IiUE
RT @KurzweilAINews: First Close Look At Stimulated Brain: Aghhrrrrnn gaauuuhhrrr ar hrrauur aauuuhrrn u gahrrn … http://bit.ly/17dg34
RT @cnnbrk Hruh. Urrnn nrrauuuhh Ted Kennedy ur “rrrhhrrr ghr rauhn hru rau wahr; wahr au Democratic Party; … http://bit.ly/3FmLyq
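The selection logic described above might look something like this in JavaScript. It's a hedged sketch, not the Wookie's code: the tweet shape `{ id, user, text }`, the `alreadyMocked` set and the function name `pickRetweet` are hypothetical, and the placeholder texts are not real tweets.

```javascript
// Hypothetical shapes: each tweet is { id, user, text }; `alreadyMocked` is a
// Set of tweet ids that have been re-tweeted before.
function pickRetweet(tweets, alwaysUsers, alreadyMocked, random = Math.random) {
  const fresh = tweets.filter((t) => !alreadyMocked.has(t.id));     // repeat avoidance
  const favorite = fresh.find((t) => alwaysUsers.includes(t.user));
  if (favorite) return favorite;                                     // always-re-tweet users win
  if (fresh.length === 0) return null;                               // nothing new to mock
  return fresh[Math.floor(random() * fresh.length)];                 // otherwise simple random selection
}

const recent = [
  { id: 1, user: 'warrenellis', text: '(placeholder)' },
  { id: 2, user: 'darthvader', text: '(placeholder)' },
  { id: 3, user: 'cnnbrk', text: '(placeholder)' },
];

// Tweet 1 was already mocked, and @darthvader has something fresh, so he wins.
const choice = pickRetweet(recent, ['darthvader'], new Set([1]));
// choice.id is 2
```

Passing the random source in as a parameter is just a convenience for testing; the default is plain `Math.random`.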

It turns out that injecting the re-tweet ‘header’ will often push longer ones past the 140ch barrier. He will attempt to preserve as much of the original tweet as possible, focusing on trailing URLs and hash tags. And if the tweet is short enough, he posts a shortened link to the original post, primarily to show off his mad skills. He is much inclined towards tweets which have a good blend of mockable and preservable words, again, to show off his mad skills.
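One way to sketch that trimming in JavaScript, assuming the trailing URLs and hash tags are kept intact and the middle of the tweet is cut at a word boundary with an ellipsis. The function name `fitRetweet` and the exact cutting strategy are illustrative guesses, not the Wookie's real rules.

```javascript
// Hypothetical helper: fit "header body" into `limit` characters, keeping any
// trailing hash tags / URLs and trimming whole words from the middle.
function fitRetweet(header, body, limit = 140) {
  const full = `${header} ${body}`;
  if (full.length <= limit) return full;

  // Peel the trailing URLs and hash tags off the end of the body.
  const words = body.split(' ');
  const tail = [];
  while (words.length && /^(#|https?:\/\/)/.test(words[words.length - 1])) {
    tail.unshift(words.pop());
  }

  // Everything that must survive: header, ellipsis, preserved tail.
  const fixed = [header, '…', ...tail].join(' ');
  const room = limit - fixed.length - 1;   // minus the space before the kept words

  // Keep as many leading words as fit in the remaining room.
  let kept = '';
  for (const word of words) {
    const next = kept ? `${kept} ${word}` : word;
    if (next.length > room) break;
    kept = next;
  }
  return [header, kept, '…', ...tail].filter((part) => part !== '').join(' ');
}

const longBody = 'word '.repeat(40) + '#tag http://bit.ly/xyz';
const fitted = fitRetweet('RT @someone', longBody);
```

If the header and the preserved tail alone already exceed the limit, this sketch still returns them untrimmed; a real bot would need a fallback for that case.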

This became the second Personality Engine bot. Yet still, re-tweeting is a one-way street, and interaction is the real key to user engagement.

Playing Well With Others

The third Personality Engine bot was born of the need to perpetuate the following cycle of fun. On a regular basis, the Wookie will search for references to relevant words — wookiee, kashyyyk, etc. — and will respond to the user with a generated tweet. This is much less invasive and cruel than auto-following, a botting practice which I find to be quite gauche. I can only imagine the surprise on these users’ faces:

@amynicole21 WTF! aruh nrauuhehruhaaauuurr rraaahhhrr nruuh urrrrn ghrn wurh: euu haarrn nuuh grruhuhrnuhhrrn erruuuuhh :)
@DZ1641 gauuuhrr??
@vfigueroa1 rghrurr waahr! rrhneuuuhr urr nruuh hrauh – *ehrraaah* nrauh ^_^ nraur hruun rrrghhnn

However, before he goes searching, he first looks at his recent mention history, specifically at tweets starting with @cr_wookie. If one is found, he will mock and publicly respond to it, linking back to the original post when length permits. So if you talk to the Wookie, there’s a reasonable chance that he’ll republish you. To minimize abuse of this feature, he doesn’t follow quite the same word preservation rules as he does for follower re-tweeting. But he’ll keep Star Wars words, and that opens up a vast realm of potential amusement.

. @kindadodgy Nurh U hraaauuuu rghn Wookie rauuhrr wau’r hruh nrh ghrn hrun ‘rhngn ruhrn uhr wurh rn nuuh gh, ahrn au nrruuuuhh gauurh.
> @adamlampert Ar. U hru nrh raaghrr gh HR’N! Rghn ruuhr au! http://bit.ly/591qW
.@Lillput Nrh, rhag’n rauuhr nurh ghr haurr au a rghhnn ur a rghn.

Greeting New Followers

The fourth and final Personality Engine bot is the greeter. When you follow him, he’ll DM you. Short and to the point.

Summary

Whew! All in all, the project required about 6 weeks of spare time. My only hope is that much hilarity will ensue from these efforts.

If you want to read a bit more about the Wookie — and who wouldn’t, right? — you can check out his Wiki page.

Snake ‘n’ Bacon in The DDOS Caper!

Friday, August 7th, 2009

ah, come in! we’re so glad you’ve come Snake ‘n’ Bacon!
i’m crisp delicious bacon
sssss

glad you asked. it seems there’s a group of hackers, and we want you to go in under-cover
i go great on a sandwich
sssss


When Twitter came back online yesterday afternoon after their networking attacks, I got a torrent of @cr_snake_bacon tweets. Wasn’t sure why, but it seemed suspicious. Twitter’s API had flopped around for most of the day, so the logs were full of Exceptions and … oops! … re-connect attempts!

Of course I’d built the bots to re-tweet on an Exception. They’re all configured to wait 60 seconds, then try again. But of course until I fixed the configuration over night, they did exactly what a bot would do, and conspicuously so.
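For what it's worth, a common way to keep a retry loop from looking like an attack is to cap the number of attempts and stretch the wait between them. The JavaScript below is a sketch of that general technique only. It is not the configuration change I actually made, and the names `backoffDelays` and `retryWithBackoff` (and the 10-minute cap) are hypothetical.

```javascript
// Hypothetical helper: the wait before each retry, doubling from baseMs up to capMs.
function backoffDelays(baseMs, maxRetries, capMs) {
  const delays = [];
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    delays.push(Math.min(baseMs * 2 ** attempt, capMs));
  }
  return delays;
}

// Hypothetical wrapper: `action` is any function returning a Promise
// (for example, a post to a remote API). It is tried once right away,
// then once more after each delay, and then it gives up.
async function retryWithBackoff(action, delays) {
  let lastError;
  for (const delay of [0, ...delays]) {
    if (delay > 0) await new Promise((resolve) => setTimeout(resolve, delay));
    try {
      return await action();
    } catch (err) {
      lastError = err;   // remember the failure and move on to the next wait
    }
  }
  throw lastError;       // give up instead of hammering the service forever
}

const delays = backoffDelays(60000, 5, 600000);
// delays: 60s, 120s, 240s, 480s, then capped at 600s
```

With a bounded list of delays, a long outage ends in a handful of failed attempts per bot rather than a steady once-a-minute knock on the door.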

The service attacks on Twitter continued through today, and I’m sure that the birdy techs are furiously building black ice fortresses in Scala even now. Again, I saw a burst this afternoon from all of my bots. Pokey the Penguin, Conet Project, and Chewbacca all had several things to say, all at once. Obviously I had fucked something else up, so I hurriedly checked the logs. And nope … actually, my change had worked … Twitter had just un-blocked my IP.

*whew*

I’m not exactly sure how many bots are out there … here’s a nice wiki being kept of them. But I can imagine I’m not the only one who made that try-again coding mis-calculation. What’s sweet is that it’s un-done now, and my toys can continue prattling on.

Thanks, guys. Sorry we looked like vicious automata for a while there. Glad to be back.

random problem in your cloud-hosted app? try a new instance!

Tuesday, May 26th, 2009

chalk this one up under ‘Time Sunk’.

my Ambience for the Masses app is a Spring / Hibernate / JSP stack, with a couple of other sweet components. i run it on an AWS instance. it’s been purring along just wonderfully for months now. then, about ten days ago, it just stopped working

normally Java apps don’t die without throwing some sort of Exception. but that’s just what was happening. so i stripped out various components — thank you, Dependency Injection pattern! — and found that it would sometimes die instantly (if i was lucky) but usually it took a couple of hours. i don’t have a lot of free time to track down random intermittent bullshit like this, so it took me about a week to boil it down

it was somewhere in the Current Listener Map — my Shoutcast-listener-tracking geo-positioning statistics-gathering data sculpture back-end engine. i hear that the kids call them things mash-ups. the geo-lookup APIs were the most delicate part, and it seemed to work with them omitted. that red herring aside, it turned out to be the IP address resolution

that block was just a couple of lines until i’d added all the logging and desperate Exception handling. the app launches a lot of threads, so tracking down the issue was annoying … but eventually there it was in the traces. the stack just terminated when the app tried to getHostAddress (not during getByName though … must be a lazy-loading thing)

so i nearly had it all tracked down to that. then Tomcat inexplicably became unable to find basic JARs in /usr/share/java — i was using Fedora 8’s RPM version vs. raw Apache, and it’s organized real funny-like

so i threw up my hands and started up a fresh instance of my webapp AWS image. i’d rebooted the existing one, and that hadn’t helped at all. of course, the issue magically disappeared on the new instance. did anyone see that coming from a distance? ya probably did. cuz it’s ironic. and it’s the title line of the damn blog post

the hostname resolution is surely a low-level OS thing. both Linux JavaSE 6 and IcedTea 7 just shat the bed when they got to that point, unlikely unless they both leveraged the same lib call. something must have gone wonk in the virtualization, and apparently a key part of the solution was running it inside of a different farm. i wasted a helluva lot of time to find that out

lesson !! if weird inexplicable freaky-ass things start happening to your cloud-hosted app, load it up on a new VM sooner rather than later. i’d taken a late-stage backup of the failed instance and assumed it would be corrupted with the Mystery Bug (read as: a waste of time to attempt). but oh no, it worked just great :) . next time it’ll be a cinch to just bundle the instance up, image it, and use it to launch a new one

and ultimately … it wasn’t a bug in my code !!!