<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>CantRemembrances &#187; ops</title>
	<atom:link href="http://blog.cantremember.com/tag/ops/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.cantremember.com</link>
	<description>Memes of a technical vein discovered during CantRemember.com implementation</description>
	<lastBuildDate>Tue, 16 Feb 2010 06:36:02 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.1</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>random problem in your cloud-hosted app?  try a new instance!</title>
		<link>http://blog.cantremember.com/random-problem-in-your-cloud-hosted-app/</link>
		<comments>http://blog.cantremember.com/random-problem-in-your-cloud-hosted-app/#comments</comments>
		<pubDate>Wed, 27 May 2009 06:02:00 +0000</pubDate>
		<dc:creator>dfoley</dc:creator>
				<category><![CDATA[Development]]></category>
		<category><![CDATA[Learning]]></category>
		<category><![CDATA[cloud]]></category>
		<category><![CDATA[ops]]></category>

		<guid isPermaLink="false">http://blog.cantremember.com/?p=152</guid>
		<description><![CDATA[chalk this one up under &#8216;Time Sunk&#8217;.
my Ambience for the Masses app is a Spring / Hibernate / JSP stack, with a couple of other sweet components.  i run it on an AWS instance.  it&#8217;s been purring along just wonderfully for months now.  then, about ten days ago, it just stopped [...]]]></description>
			<content:encoded><![CDATA[<p>chalk this one up under &#8216;Time Sunk&#8217;.</p>
<p>my <a href="http://sleepbot.com/seb">Ambience for the Masses</a> app is a Spring / Hibernate / JSP stack, with a couple of other sweet components.  i run it on an <a href="http://en.wikipedia.org/wiki/Amazon_Elastic_Compute_Cloud">AWS</a> instance.  it&#8217;s been purring along just wonderfully for months now.  then, about ten days ago, it just <em>stopped working</em></p>
<p>normally Java apps don&#8217;t die without throwing some sort of Exception.  but that&#8217;s just what was happening.  so i stripped out various components &#8212; thank you, <a href="http://www.martinfowler.com/articles/injection.html">Dependency Injection</a> pattern! &#8212; and found that it would sometimes die instantly (if i was lucky) but usually it took a couple of hours.  i don&#8217;t have a lot of free time to track down random intermittent bullshit like this, so it took me about a week to boil it down</p>
<p>it was somewhere in the <a href="http://sleepbot.com/ambience/broadcast/map.html">Current Listener Map</a> &#8212; my Shoutcast-listener-tracking geo-positioning statistics-gathering data sculpture back-end engine.  i hear that the kids call them things <em>mash-ups</em>.  the geo-lookup APIs were the most delicate part, and it seemed to work with them omitted.  that red herring aside, it turned out to be the IP address resolution</p>
<pre><code>try {
	if (getLog().isTraceEnabled())
		getLog().trace("lookup : " + hostname);

	// oh, lookup!
	<strong>InetAddress ipAddress = InetAddress.getByName(hostname);</strong>

	//	NOTE: this is where it went bad on the AWS image
	//		*sigh*

	<strong>return ipAddress.getHostAddress();</strong>
}
<strong>catch (UnknownHostException e) { /* unresolvable hostname : nothing to report */ }</strong>
catch (Exception e) {
	getLog().warn("failed to lookup " + hostname, e);
}</code></pre>
<p>that block was just a couple of lines until i&#8217;d added all the logging and desperate Exception handling.  the app launches a lot of threads, so tracking down the issue was annoying &#8230; but eventually there it was in the traces.  the stack just terminated when the app tried to <code>getHostAddress</code> (not during <code>getByName</code> though &#8230; must be a lazy-loading thing)</p>
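<p>a rough sketch of how a lookup like that could be hardened, under stated assumptions rather than as the code from the app.  <code>catch (Exception e)</code> never sees an <code>Error</code>, so catching <code>Throwable</code> gives a dying thread one more chance to leave a trace.  <code>HostLookup</code> and <code>resolveOrNull</code> are hypothetical names, and <code>System.err</code> stands in for the real logger.  if a native call takes the whole process down, nothing here will help</p>

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// hypothetical helper for illustration; not part of the original app
class HostLookup {

    // returns the textual IP address for the hostname, or null if the lookup fails
    static String resolveOrNull(String hostname) {
        try {
            InetAddress ipAddress = InetAddress.getByName(hostname);
            return ipAddress.getHostAddress();
        } catch (UnknownHostException e) {
            // expected failure : the name simply does not resolve
            System.err.println("unknown host : " + hostname);
        } catch (Throwable t) {
            // Throwable rather than Exception, so an Error is reported too
            System.err.println("failed to lookup " + hostname + " : " + t);
        }
        return null;
    }

    public static void main(String[] args) {
        // a literal IP address is only validated, never sent to the resolver
        System.out.println(resolveOrNull("127.0.0.1"));
    }
}
```

<p>a literal address like <code>127.0.0.1</code> makes a handy sanity check, since it gets parsed without ever touching the resolver.  a real hostname exercises the OS-level lookup, which is the part that misbehaved here</p>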
<p>so i nearly had it all tracked down to that.  then Tomcat <em>inexplicably</em> became unable to find basic JARs in <code>/usr/share/java</code> &#8212; i was using Fedora 8&#8217;s RPM version vs. raw Apache, and it&#8217;s organized real funny-like</p>
<p>so i threw up my hands and started up a fresh instance of my webapp AWS image.  i&#8217;d rebooted the existing one, and that hadn&#8217;t helped at all.  of course, the issue <strong>magically disappeared</strong> on the new instance.  did anyone see that coming from a distance?  ya probably did.  cuz it&#8217;s ironic.  and it&#8217;s the title line of the damn blog post</p>
<p>the hostname resolution is surely a low-level OS thing.  both Linux JavaSE 6 and IcedTea 7 just shat the bed when they got to that point, which is unlikely unless they both leverage the same lib call.  something must have gone wonk in the virtualization, and apparently a key part of the solution was running it inside of a different farm.  i wasted a helluva lot of time to find that out</p>
<p><strong>lesson !!</strong>  if weird inexplicable freaky-ass things start happening to your cloud-hosted app, load it up on a new VM <em>sooner rather than later</em>.  i&#8217;d taken a late-stage backup of the failed instance and assumed it would be corrupted with the Mystery Bug (read as: a waste of time to attempt).  but oh no, it worked just great :) .  next time it&#8217;ll be a cinch to just bundle the instance up, image it, and use it to launch a new one</p>
<p>and ultimately &#8230; <em>it wasn&#8217;t a bug in my code !!!</em></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.cantremember.com/random-problem-in-your-cloud-hosted-app/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
