Archive for the ‘Development’ Category

What Would a Wookie Do?

Saturday, September 12th, 2009

Yes, I had to ask myself that question a lot recently, at least from the perspective of how he would use the Twitter service. As much as this is a theoretical question, I believe I came up with some answers, and they are made manifest in the @cr_wookie Personality Engine.

Base-line Aesthetics

The first step in bringing a Wookie to life was to establish a basic phonetic dialect. So I came up with a set of candidate ‘words’ along the lines of: auhrn, rghrn, gahrn, hraur, urau, ehruh, and nrauh. Hey, they sounded good. There are about 60 of them in total, composed of the letters A E G H N R and U. After having watched The Empire Strikes Back, where Chewbacca seems to get most of his good lines, I expanded into a few W words as well (wahr seems to work particularly well). Many folks postulate that he was capable of speaking Os and Ks, yet I myself do not subscribe to that opinion.

I then used MadderLib to construct simple word generators which allowed the phonemes to stretch appropriately for different word lengths. Long vowels, double Rs and Hs, whatever looked good. And with a little compositing logic, I came up with some sentence patterns that were quite fun to read out loud:

rauhr nruuuhh raghr uhr rghrrnnauhrnaauuuuuurhhh
nrauuh euu gauuhrr ruhr
urhn ehrraaah rhhreuhraahhrrrrnn gaurrh
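
A minimal sketch of the stretching idea in plain Ruby. It does not use MadderLib's actual API; the base words come from the examples above, and the names `stretch`, `wookie_sentence`, and `max_repeat` are assumptions made for illustration.

```ruby
# Hypothetical base words, borrowed from the examples in the post.
BASE_WORDS = %w[auhrn rghrn gahrn hraur urau ehruh nrauh].freeze

# Letters that are allowed to repeat: the dialect's vowels, plus R and H.
STRETCHY = /[aeurh]/

# Stretch a base word by repeating each stretchy letter 1..max_repeat times.
# `rng` is injected so that a run can be reproduced with a fixed seed.
def stretch(word, rng: Random.new, max_repeat: 3)
  word.gsub(STRETCHY) { |ch| ch * rng.rand(1..max_repeat) }
end

# Build a sentence out of `count` stretched words.
def wookie_sentence(count, rng: Random.new)
  Array.new(count) { stretch(BASE_WORDS.sample(random: rng), rng: rng) }.join(' ')
end

puts wookie_sentence(5, rng: Random.new(42))
```

Passing a seeded `Random` keeps the output repeatable, which is handy when tuning what "looks good".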

Those are rather plain-looking though. In order to make the Wookie’s statements seem more like txt argot, some variety was needed. Punctuation was obvious, both terminating and delimiting (commas, semicolons). Plus there’s the proper use of txt idioms, the LOLs and WTFs that are so popular with the youngsters these days (AFAIK). Throw in a nice little collection of emoticons, and behold; the Wookie has charm:

LOLZ! uhrrn euueuuhhaur, eruh hraur nrauururhnehrrraaah auhrn harn aurreruhaaarruuuhh rraaghrrr ^_^
hraur rghneruh nuh waahhr???
rhhhnnghn uhrrrnn ehrah. euu urau ehuuurrr urrn aurheuuuh haarn uhrrrn k?… haauurrr nruharuhuhhrrh hruaaaauuh ghrn rghn nrruuhh

Well, yeah, they still look sorta flat. Real people quote and capitalize portions of what they type, and there are other non-verbal components to the average sentence. So, the Wookie was taught to inject numbers, times, abbreviations, and even Star Wars calendar years into his sentences:

gahrn rghhr ghrrnehur rghhrrn hruauh raghrehurauuuuuuhrn gahrn?? hrraaauu: hrrraaauu rauhr ;) _ehruuh_ ‘Rahr Ehrrraaaah’
raauh hrau Ehrahurrrrh *rauhr* gruh
auhrneuuhr. harnuhrrnuhrrr uhr – raauhneuuu 1:30 gahrn raauuughrrr *nuuh* aahhrruuuunn uhr. wuurh harn rhr?

Once the Wookie was at this point, he could talk for quite some time and produce diverse aesthetic results. Reading them out loud is a hoot! Thus was born the first Personality Engine bot (the Wookie is comprised of four of them). But he still wasn’t really tweeting until he could follow some of the core Twitter memes.

Twitter Memes

Hash tags were the first obvious choice, since they were easy to fake. To this day, the Wookie can simply prepend a # to any word or composite that he speaks. But to make this feature really zing, I added support for Twitter’s trending searches. This allowed him to use real-world relevant tags, injecting them into his sentences, or appending them to his tweet (as is common convention). It turns out that one of the joys of a nonsense grammar is that anything which isn’t nonsense magically becomes the ‘meaning’ of the sentence:

Harn ehureruheuuu & ahrn rrghhhn! AAAHHRRNNAUHRN EHUUR! #itsnotgonnawork
WAHR NUUUH RAUHRR!!! euu ‘aauurr nraur’ haaauurrurau! #fact
GRRUUUH! uhr hraheuuuuhhr aauurh #ChargersSuck rrhnn aur! gauhr aaurh haurerruuuhh! !! #aurh

Of course, no Twitter user can resist posting shortened URLs. They’ve been a cornerstone of the explosive growth of the service, maybe because there’s just so much interesting fast-moving crap out there on teh Internets. The Wookie follows several aggregation services — Digg, Technorati — and a smattering of other popular blogs — TMZ, The Onion Daily, LOLcats — etc. He pulls out links to recent content and shortens them with the bit.ly API. Again, since the Wookie is totally faking it, the results just cannot be accounted for. The best he can hope for is that the emotional texture of his tweet sometimes supports the referenced source:

nrauh rrhn hrauuur ehrruhauhrnurrh. nrraauuuhh IMHO. hraur grraauhurhnauhrn rauh http://bit.ly/mjOD7
Ghrn euuhrr? haarneruuuhh urrn aruh rghrnn aaur, uhr hruh urrr :) http://bit.ly/1a9j1K

And no tweeter lives in a vacuum; their posts are replete with the user handles of friends, comrades and mentors.
The Wookie wasn’t about to make up handles, so his likely choices were his followees and followers.
Rather than take the name-dropping approach here — more on that later — the Wookie chooses to occasionally reference his most recent followers:

OMG! _rrrhnnneuh_ euh: urr wuuurrh rghnurrnh urhn hruhn @sleepbotzz rhagn ghrn rrhn waaahhr hruauhehuraghrrrrrnnn rhagn harrnaurh

After these features were implemented, the Wookie’s posts started to look almost real-ish. And whenever he tweets on his own, that is his range of capabilities. But he still wasn’t a real member of the Twitter community until he could play some other tricks. Thus began a completely separate effort: how to translate English into Wookie.

Mocking

Did I say ‘translate’? What I meant to say was ‘mock’. After all, what can you really do with a nonsense grammar except make it look like it has meaning?

So, the Wookie was taught to mock existing sentences into his own dialect. He simply matches the initial letter (vowel / not) and preserves the word’s length and non-alpha characters (for contractions and the like). Special mappings were also added to deal with short words (the dialect only generates words 4-letters and above). And within a given tweet, he re-uses the same fake word for each instance of the real one. It’s an obscure feature, but it makes a helluva difference in some specific cases.
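
The mocking rules can be sketched roughly as follows. This is an illustration, not the Wookie's real code: the letter pools and the names `mock_word` and `mock_tweet` are assumptions, and the special mappings for short words are left out.

```ruby
VOWELS     = %w[a e u].freeze
CONSONANTS = %w[g h n r].freeze

# Mock one word: keep its length, its non-alpha characters, its capitals,
# and whether it starts with a vowel; fill the rest with dialect letters.
def mock_word(word, rng)
  first = true
  word.gsub(/[A-Za-z]/) do |ch|
    pool =
      if first
        first = false
        ch.match?(/[aeiou]/i) ? VOWELS : CONSONANTS
      else
        VOWELS + CONSONANTS
      end
    letter = pool.sample(random: rng)
    ch.match?(/[A-Z]/) ? letter.upcase : letter
  end
end

# Mock a whole tweet, reusing the same fake word for each repeated real word.
def mock_tweet(text, rng: Random.new)
  memo = {}
  text.split(' ').map { |w| memo[w] ||= mock_word(w, rng) }.join(' ')
end

puts mock_tweet("the cat and the hat aren't", rng: Random.new(7))
```

The `memo` hash is what makes both instances of "the" come out identical within one tweet.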

The totally awesome part of the effort is identifying the words that don’t get mocked. There was no way I wanted to deal with semantic grammar detection, since tweets are often wildly non-grammatical. So as per usual, the Wookie fakes it. It mainly comes down to a weak analysis of quoting and capitalization patterns. He also keeps hash tags, links, handles, many acronyms, and argot — to the best of his ability.
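
One way to fake that kind of filter is a short list of patterns checked per token. The patterns below are guesses at the sort of rules described, not the Wookie's real ones, and `keep?` is a hypothetical name.

```ruby
# Tokens matching any of these survive mocking untouched.
KEEP_PATTERNS = [
  /\A#\w+/,            # hash tags
  /\A@\w+/,            # user handles
  %r{\Ahttps?://\S+},  # links
  /\A[A-Z]{2,}\W*\z/   # all-caps acronyms and argot, such as LOL or WTF!
].freeze

def keep?(token)
  KEEP_PATTERNS.any? { |pattern| token.match?(pattern) }
end

tweet = "OMG! check this #fact from @darthvader http://bit.ly/mjOD7"
puts tweet.split(' ').select { |token| keep?(token) }.join(' ')
```

Anything that fails `keep?` would be handed to the mocking step instead.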

And just for fun, he also recognizes a rather large lexicon of terms from the Star Wars universe. Well, except for the term ‘Star Wars’ that is. He doesn’t know what that means.

It took a lot of experimenting to get it right, and he still makes mistakes, but he’s getting smarter all the time. One of the interesting things I learned during development was how staccato the English language is, as compared to the long smooth yawls of Wookie. Reading back a mocked sentence out loud is a sublime experience.

You may ask, how can this awesome power best be used to serve the Good?

Re-Tweeting

Darn right the Wookie re-tweets! He simply selects a few users that he follows, derives their recent tweets, and mocks one of them up. There are some users — @darthvader for instance — which he will always re-tweet if the user has posted anything fresh. Otherwise it’s a simple random selection, after avoiding repeats (there’s extensive repeat-avoiding code all throughout the Wookie implementation). There is the slight hint of name-dropping here, since he tends to follow a lot of popular accounts, but that’s just the nature of this beast.
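
A rough sketch of that selection logic, assuming tweets arrive as hashes with `:user` and `:id` keys. The data shape and the name `pick_retweet` are assumptions for illustration only.

```ruby
require 'set'

# Pick the next tweet to mock. `always` lists the users who get re-tweeted
# whenever they have something fresh; `seen` holds ids already re-tweeted.
def pick_retweet(tweets, always:, seen:, rng: Random.new)
  fresh    = tweets.reject { |t| seen.include?(t[:id]) }
  priority = fresh.select { |t| always.include?(t[:user]) }
  choice   = (priority.empty? ? fresh : priority).sample(random: rng)
  seen << choice[:id] if choice
  choice
end

tweets = [
  { user: 'warrenellis', id: 1 },
  { user: 'darthvader',  id: 2 },
  { user: 'cnnbrk',      id: 3 }
]
seen = Set.new
p pick_retweet(tweets, always: ['darthvader'], seen: seen)
```

The first call always picks the @darthvader tweet; later calls fall back to a random choice among whatever has not been used yet, and return `nil` once nothing fresh is left.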

RT @warrenellis Rauuh’r @neilhimself ehr #neilfail ar hruun rraaauuuuhrrr? Au. #warrenfail. http://bit.ly/16IiUE
RT @KurzweilAINews: First Close Look At Stimulated Brain: Aghhrrrrnn gaauuuhhrrr ar hrrauur aauuuhrrn u gahrrn … http://bit.ly/17dg34
RT @cnnbrk Hruh. Urrnn nrrauuuhh Ted Kennedy ur “rrrhhrrr ghr rauhn hru rau wahr; wahr au Democratic Party; … http://bit.ly/3FmLyq

It turns out that injecting the re-tweet ‘header’ will often push longer ones past the 140ch barrier. He will attempt to preserve as much of the original tweet as possible, focusing on trailing URLs and hash tags. And if the tweet is short enough, he posts a shortened link to the original post, primarily to show off his mad skills. He is much inclined towards tweets which have a good blend of mockable and preservable words, again, to show off his mad skills.
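
The trimming can be sketched like this: peel the trailing URLs and hash tags off the end, then drop words from the right of what remains until the whole thing fits. The name `fit_retweet` and the `...` marker are assumptions, and the sketch gives up (returning an over-long string) if the header and tail alone exceed the limit.

```ruby
# A trailing token worth preserving: a hash tag or a link.
TAIL_TOKEN = %r{\A(?:#\w+|https?://\S+)\z}

def fit_retweet(user, body, limit: 140)
  words = body.split(' ')
  tail = []
  # Move trailing hash tags and URLs into `tail`, keeping their order.
  tail.unshift(words.pop) while !words.empty? && words.last.match?(TAIL_TOKEN)

  kept = words.dup
  loop do
    marker = kept.size < words.size ? ['...'] : []
    text = (["RT @#{user}"] + kept + marker + tail).join(' ')
    return text if text.length <= limit || kept.empty?
    kept.pop
  end
end

body = (['word'] * 40).join(' ') + ' #fact http://bit.ly/mjOD7'
puts fit_retweet('cnnbrk', body)
```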

This became the second Personality Engine bot. Yet still, re-tweeting is a one-way street, and interaction is the real key to user engagement.

Playing Well With Others

The third Personality Engine bot was born of the need to perpetuate the following cycle of fun. On a regular basis, the Wookie will search for references to relevant words — wookiee, kashyyyk, etc. — and will respond to the user with a generated tweet. This is much less invasive and cruel than auto-following, a botting practice which I find to be quite gauche. I can only imagine the surprise on these users’ faces:

@amynicole21 WTF! aruh nrauuhehruhaaauuurr rraaahhhrr nruuh urrrrn ghrn wurh: euu haarrn nuuh grruhuhrnuhhrrn erruuuuhh :)
@DZ1641 gauuuhrr??
@vfigueroa1 rghrurr waahr! rrhneuuuhr urr nruuh hrauh – *ehrraaah* nrauh ^_^ nraur hruun rrrghhnn

However, before he goes searching, he first looks at his recent mention history, specifically at tweets starting with @cr_wookie. If one is found, he will mock and publicly respond to it, linking back to the original post when length permits. So if you talk to the Wookie, there’s a reasonable chance that he’ll republish you. To minimize abuse of this feature, he doesn’t follow quite the same word preservation rules as he does for follower re-tweeting. But he’ll keep Star Wars words, and that opens up a vast realm of potential amusement.

. @kindadodgy Nurh U hraaauuuu rghn Wookie rauuhrr wau’r hruh nrh ghrn hrun ‘rhngn ruhrn uhr wurh rn nuuh gh, ahrn au nrruuuuhh gauurh.
> @adamlampert Ar. U hru nrh raaghrr gh HR’N! Rghn ruuhr au! http://bit.ly/591qW
.@Lillput Nrh, rhag’n rauuhr nurh ghr haurr au a rghhnn ur a rghn.

Greeting New Followers

The fourth and final Personality Engine bot is the greeter. When you follow him, he’ll DM you. Short and to the point.

Summary

Whew! All in all, the project required about 6 weeks of spare time. My only hope is that much hilarity will ensue from these efforts.

If you want to read a bit more about the Wookie — and who wouldn’t, right? — you can check out his Wiki page.

Snake ‘n’ Bacon in The DDOS Caper!

Friday, August 7th, 2009

ah, come in! we’re so glad you’ve come Snake ‘n’ Bacon!
i’m crisp delicious bacon
sssss

glad you asked. it seems there’s a group of hackers, and we want you to go in under-cover
i go great on a sandwich
sssss


When Twitter came back online yesterday afternoon after their networking attacks, I got a torrent of @cr_snake_bacon tweets. Wasn’t sure why, but it seemed suspicious. Twitter’s API had flopped around for most of the day, so the logs were full of Exceptions and … oops! … re-connect attempts!

Of course I’d built the bots to re-tweet on an Exception. They’re all configured to wait 60 seconds, then try again. But of course until I fixed the configuration over night, they did exactly what a bot would doconspicuous
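
For what it's worth, a common way to keep a try-again loop from looking like an attack is to cap the number of attempts and lengthen the wait each time. The sketch below is that general pattern, not the actual fix made to these bots; `with_backoff`, `max_attempts`, `base_delay`, and `sleeper` are all hypothetical names.

```ruby
# Retry a block a limited number of times, doubling the wait after each
# failure. `sleeper` is injectable so the logic can be exercised without
# really waiting; by default it just calls Kernel#sleep.
def with_backoff(max_attempts: 5, base_delay: 60, sleeper: ->(secs) { sleep(secs) })
  attempt = 0
  begin
    attempt += 1
    yield attempt
  rescue StandardError
    raise if attempt >= max_attempts          # give up and surface the error
    sleeper.call(base_delay * (2**(attempt - 1)))
    retry
  end
end

# Example: fails twice, then succeeds; waits are recorded instead of slept.
waits = []
result = with_backoff(sleeper: ->(secs) { waits << secs }) do |n|
  raise IOError, 'service unavailable' if n < 3
  :posted
end
p result, waits
```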

The service attacks on Twitter continued through today, and I’m sure that the birdy techs are furiously building black ice fortresses in Scala even now. Again, I saw a burst this afternoon from all of my bots. Pokey the Penguin, Conet Project, and Chewbacca all had several things to say, all at once. Obviously I had fucked something else up, so I hurriedly checked the logs. And nope … actually, my change had worked … Twitter had just un-blocked my IP.

*whew*

I’m not exactly sure how many bots are out there … here’s a nice wiki being kept of them. But I can imagine I’m not the only one who made that try-again coding mis-calculation. What’s sweet is that it’s un-done now, and my toys can continue prattling on.

Thanks, guys. Sorry we looked like a vicious autonoma for a while there. Glad to be back.

random problem in your cloud-hosted app? try a new instance!

Tuesday, May 26th, 2009

chalk this one up under ‘Time Sunk’.

my Ambience for the Masses app is a Spring / Hibernate / JSP stack, with a couple of other sweet components. i run it on an AWS instance. it’s been purring along just wonderfully for months now. then, about ten days ago, it just stopped working

normally Java apps don’t die without throwing some sort of Exception. but that’s just what was happening. so i stripped out various components — thank you, Dependency Injection pattern! — and found that it would sometimes die instantly (if i was lucky) but usually it took a couple of hours. i don’t have a lot of free time to track down random intermittent bullshit like this, so it took me about a week to boil it down

it was somewhere in the Current Listener Map — my Shoutcast-listener-tracking geo-positioning statistics-gathering data sculpture back-end engine. i hear that the kids call them things mash-ups. the geo-lookup APIs were the most delicate part, and it seemed to work with them omitted. that red herring aside, it turned out to be the IP address resolution

that block was just a couple of lines until i’d added all the logging and desperate Exception handling. the app launches a lot of threads, so tracking down the issue was annoying … but eventually there it was in the traces. the stack just terminated when the app tried to getHostAddress (not during getByName though … must be a lazy-loading thing)

so i nearly had it all tracked down to that. then Tomcat inexplicably became unable to find basic JARs in /usr/share/java — i was using Fedora 8’s RPM version vs. raw Apache, and it’s organized real funny-like

so i threw up my hands and started up a fresh instance of my webapp AWS image. i’d rebooted the existing one, and that hadn’t helped at all. of course, the issue magically disappeared on the new instance. did anyone see that coming from a distance? ya probably did. cuz it’s ironic. and it’s the title line of the damn blog post

the hostname resolution is surely a low-level OS thing. both Linux JavaSE 6 and IcedTea 7 just shat the bed when they got to that point, unlikely unless they both leveraged the same lib call. something must have gone wonk in the virtualization, and apparently a key part of the solution was running it inside of a different farm. i wasted a helluva lot of time to find that out

lesson !! if weird inexplicable freaky-ass things start happening to your cloud-hosted app, load it up on a new VM earlier than later. i’d taken a late-stage backup of the failed instance and assumed it would be corrupted with the Mystery Bug (read as: a waste of time to attempt). but oh no, it worked just great :) . next time it’ll be a cinch to just bundle the instance up, image it, and use it to launch a new one

and ultimately … it wasn’t a bug in my code !!!

being too rapid on the things that matter

Thursday, May 7th, 2009

it took me a while to come up with the title for this post. and it’s an Opinion Piece, not Technical … so you’ll see why …

i’m working for a new company now, and they’re rocking it for RoR apps on the iPhone. sounds like a good place to be. one of the many reasons why this position works for me is because these guys are all about GTD and getting it out there. lean ‘n’ mean

whereas i’ve become very used to a holistic detail-oriented, wizened test-backed process. great for Enterprise, but not so much for the reckless streets of Startup 3.0 . so i’m in a learning process. i’ve turned around some good stuff quickly, and it’s very satisfying

but i’ve screwed the pooch twice since i’ve been there. it’s totally a judgement call thing — i’m shooting too fast from the hip, and don’t feel like i really grasp the balance here …

first project i worked on was related to account management. they wanted a quick turn-around, i gave it a shot, had the whole thing backed with solid testing, and ready for on-time deployment with a smile. and in trying to keep track of all the new system permutations — i’d been there 2 weeks or so — i forgot one basic thing, and forgot to test for another. a nice little Perfect Storm. one emergency 1am database rollback later, we had a load of pissed customers and a helluva lot of explaining to do

so, then this past week, i went in to fix a minor rounding issue bug. those can be touchy. the right way to do it is with BigDecimal. yep, i’ve done that in Java too with BigDecimal. overall, it’s somewhat ponderous, detail-oriented, and can easily be polluted with Floats and the like. so i’d taken a shortcut, realizing that the low-level C impl was doing String conversion without the rounding issue. so i took the low-hanging fruit:

total.to_s.to_i

awesome !!1!. well, that is until you get into the 100-of-trillions area, otherwise shown as 1.0e+14. guess what happens when you parse that into a Fixnum? no database rollback this time, but Da Boss had to spend days sorting out the visceral impact of ridiculous sums of bogus exploit money pouring into our RPG
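
a small sketch of why that bites, and of the BigDecimal route. it works on the string form directly, since the magnitude at which Float#to_s switches to exponent notation depends on the Ruby version. the method names are made up for illustration.

```ruby
require 'bigdecimal'

# String#to_i reads leading digits and stops at the first character that
# is not part of an integer, so "1.0e+14" parses as 1.
def shortcut_to_i(text)
  text.to_i
end

# BigDecimal understands exponent notation, and #round with no argument
# returns an Integer (rounding half up by default).
def careful_to_i(text)
  BigDecimal(text).round
end

p shortcut_to_i('1234.56')   # 1234
p shortcut_to_i('1.0e+14')   # 1
p careful_to_i('1234.56')    # 1235
p careful_to_i('1.0e+14')    # 100000000000000
```

a test case with an exponent in it would have caught the shortcut before the customers did.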

security, privacy and account management. payment calculations. not the sort of things to take shortcuts on. yet, if you’re embracing a culture that wants it done quickly and with minimum impact, it’s a risk you might be willing to take. it’s not like i didn’t have test scripts … i just forgot to head into scientific notation territory. just like i forgot to check for the implication of null password acceptance ( long story there, special account cases, etc. )

i’m putting these things up here for my fellow developers to laugh at.   “I mean, c’mon. All that’s totally obvious stuff.”   “I’d never miss that, that’s sophomore shit.”   good, get it out of your system, laughing boy

but believe me, when you’re on the other end of it, and had been in the middle of it and all full of all the other things that you needed to keep track of at that time, heh, well, that’s when you’ll really need to keep yerself laughing :)

Dynamic Tunneling for your Facebook App

Sunday, April 19th, 2009

When I was launching Forgiveness, my first attempt at a Facebook app via RoR, I ran into one of the typical developer quandaries. How the heck do I actually make local revs to the app and proxy them through the FB back-end while still keeping the app running live for my millions of satisfied customers?

I sought advice from Steve Enzer, and his answer was ‘create a separate FB app id’. Yep, you can do that. I’m sure it works just fine. However, I wanted to come up with a single-app solution, you know, just because.

My app backbone was based upon Facebooker, a great gem for just such purposes. And assuming that you’re using that gem, you’re provided with:

rake facebooker:tunnel:start

Which starts up the traditional server tunnel to your local dev box; Facebook’s proxy won’t know the difference. I’ve also set up my canvas page URLs and the like with a fixed IP — according to FB’s best practices, because they don’t want to deal with all the DNS resolution stuff — and it listens on port 80 with a dedicated context. So, I’ve got my facebooker.yml set up for port 3100:

tunnel:
  public_host: cantremember.com
  public_port: 3100
  local_port: 3100

And here’s my nginx config:

Here I omit the upstream cluster configs, and I’ll let the comments speak for themselves … I always try to keep track of red herrings at least at some level, and there were several I bumped into while trying out this solution. The best approach that I found was to simply set a cookie with a naming convention that nginx could parse (yes, there is a regex performance penalty here). The development context is routed locally, which is where the tunnel is waiting to pick up and run with it.

Setting the cookie, well that’s another trick. Though not much of one. Of course your app is completely masked behind the apps.facebook.com domain, so trying to set a cookie to your internal domain (eg. IP) through your browser isn’t so easy, and it’s repetitive manual work. So, just bake it into a dedicated action in your app:

Then use /:controller/tunnel/1 to engage, and /:controller/tunnel/0 to disengage. A little bit of auth logic, and you’re good to go.
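
A minimal sketch of the core of such an action, written as plain Ruby so it runs on its own. The cookie name `dev_tunnel` is hypothetical (it only has to match whatever the nginx rule tests for), as is the method name `toggle_tunnel`.

```ruby
# Hypothetical cookie name; it must agree with the nginx cookie check.
TUNNEL_COOKIE = 'dev_tunnel'

# Engage the tunnel for flag "1", disengage it for anything else.
# `cookies` is anything hash-like, such as the cookie jar in a Rails action.
def toggle_tunnel(cookies, flag)
  if flag.to_s == '1'
    cookies[TUNNEL_COOKIE] = '1'
  else
    cookies.delete(TUNNEL_COOKIE)
  end
  cookies
end

jar = {}
p toggle_tunnel(jar, '1')
p toggle_tunnel(jar, '0')
```

Inside a controller, a `tunnel` action would call something like `toggle_tunnel(cookies, params[:id])` after the auth check, so that the trailing 1 or 0 in the URL drives the switch.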

finding your iPhone’s UDID after you’ve made it inoperable

Friday, March 20th, 2009

okay, this is what happens when you rush forward into things …

i ponied up cash for the Apple Developer License a while back. and lo & behold, developers can download the DMG for the 3.0 Beta firmware update from the iPhone Dev Center! so i went and installed iPhone OS 3.0 Beta tonight … i wanted to check out the cut’n’paste capabilities, etc. downloaded and extracted the firmware update package, started iTunes, held down [Ctrl+Option] when clicking ‘Check for Update’ to bring up the file selector dialog, and installed the 3.0 Beta IPSW

great! hooray! it’s the usual slow process. however, once the device restarts, it goes into pre-activation mode … and iTunes rejects it as not being a registered development device. open up the Device Management Portal, and it tells you about Locating a Unique Device ID … “The 40 hex character string in the Identifier field is your device’s UDID.” i believe they refer to it as an ICCID in other places

alright, where can one find this magical string? well, it’s shown when your device is connected to iTunes … but of course, that’s the problem! it already won’t accept my device = meta-problem. so, then i read further … “Please DO NOT install the iPhone OS before registering device UDIDs, as installation on non-registered devices will render them inoperable.”

so i figured that i’m toast. i can easily find the Serial Number and IMEI, but not the UDID, at least according to Apple’s instructions

making a long story short … look at System Profiler (if you haven’t already) … under USB :: USB High Speed Bus :: iPhone :: Serial Number. register that as a developer device, and you’re good to go as soon as you re-boot the phone & iTunes. there’s a much more elaborate description of the process using Xcode, but it doesn’t seem to be mandatory cause i sure ain’t done that
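
if you'd rather not click through the GUI, the same value can be fished out of a text dump of the USB section. this is a sketch only: the report layout below (headings ending in a colon, followed by indented "Key: value" lines) is an assumed approximation, the sample serial is a placeholder, and `iphone_serial` is a made-up name.

```ruby
# Return the serial number listed under the "iPhone:" heading, or nil.
def iphone_serial(report)
  in_iphone = false
  report.each_line do |line|
    stripped = line.strip
    if stripped.end_with?(':')
      # A new device heading; only the iPhone's block is of interest.
      in_iphone = (stripped == 'iPhone:')
    elsif in_iphone && (m = stripped.match(/\ASerial Number:\s*(\h+)\z/))
      return m[1]
    end
  end
  nil
end

# Placeholder report text, shaped like the assumed layout.
sample = <<~REPORT
  USB High-Speed Bus:

      Keyboard:
        Serial Number: 0042

      iPhone:
        Product ID: 0x1292
        Serial Number: 0123456789abcdef0123456789abcdef01234567
REPORT

puts iphone_serial(sample)
```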

and yes … cut’n’paste is really nice!