Posts Tagged ‘twitter’

What Would a Wookie Do?

Saturday, September 12th, 2009

Yes, I had to ask myself that question a lot recently, at least from the perspective of how he would use the Twitter service. As much as this is a theoretical question, I believe I came up with some answers, and they are made manifest in the @cr_wookie Personality Engine.

Base-line Aesthetics

The first step in bringing a Wookie to life was to establish a basic phonetic dialect. So I came up with a set of candidate ‘words’ along the lines of auhrn, rghrn, gahrn, hraur, urau, ehruh, and nrauh. Hey, they sounded good. There are about 60 of them in total, composed of the letters A E G H N R and U. After having watched The Empire Strikes Back, where Chewbacca seems to get most of his good lines, I expanded into a few W words as well (wahr seems to work particularly well). Many folks postulate that he was capable of speaking Os and Ks, yet I myself do not subscribe to that opinion.

I then used MadderLib to construct simple word generators which allowed the phonemes to stretch appropriately for different word lengths. Long vowels, double Rs and Hs, whatever looked good. And with a little compositing logic, I came up with some sentence patterns that were quite fun to read out loud:

rauhr nruuuhh raghr uhr rghrrnnauhrnaauuuuuurhhh
nrauuh euu gauuhrr ruhr
urhn ehrraaah rhhreuhraahhrrrrnn gaurrh
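The stretching idea can be sketched in plain Ruby. This is not MadderLib's API and not the Wookie's actual generator; the `stretch` and `sentence` helpers, and the choice of which letters are allowed to repeat, are assumptions made purely for illustration, using a few of the base words listed above.

```ruby
# Hypothetical sketch: stretch a base "word" to a target length by
# doubling letters that look good when repeated (vowels, R, H, N).
STRETCHY = %w[a e u r h n].freeze

def stretch(word, target_length, rng = Random.new)
  chars = word.chars
  while chars.length < target_length
    # Positions whose letter is allowed to repeat.
    candidates = chars.each_index.select { |i| STRETCHY.include?(chars[i]) }
    break if candidates.empty?
    i = candidates[rng.rand(candidates.length)]
    chars.insert(i, chars[i]) # duplicate that letter in place
  end
  chars.join
end

# Compose a "sentence" from randomly chosen, randomly stretched words.
def sentence(words, count, rng = Random.new)
  Array.new(count) do
    base = words[rng.rand(words.length)]
    stretch(base, base.length + rng.rand(4), rng)
  end.join(' ')
end

BASE_WORDS = %w[auhrn rghrn gahrn hraur urau ehruh nrauh].freeze
puts sentence(BASE_WORDS, 5, Random.new(42))
```

Passing a seeded `Random` keeps the output repeatable, which is handy when tuning what "looks good".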

Those are rather plain-looking though. In order to make the Wookie’s statements seem more like txt argot, some variety was needed. Punctuation was obvious, both terminating and delimiting (commas, semicolons). Plus there’s the proper use of txt idioms, the LOLs and WTFs that are so popular with the youngsters these days (AFAIK). Throw in a nice little collection of emoticons, and behold, the Wookie has charm:

LOLZ! uhrrn euueuuhhaur, eruh hraur nrauururhnehrrraaah auhrn harn aurreruhaaarruuuhh rraaghrrr ^_^
hraur rghneruh nuh waahhr???
rhhhnnghn uhrrrnn ehrah. euu urau ehuuurrr urrn aurheuuuh haarn uhrrrn k?… haauurrr nruharuhuhhrrh hruaaaauuh ghrn rghn nrruuhh

Well, yeah, they still look sorta flat. Real people quote and capitalize portions of what they type, and there are other non-verbal components to the average sentence. So, the Wookie was taught to inject numbers, times, abbreviations, and even Star Wars calendar years into his sentences:

gahrn rghhr ghrrnehur rghhrrn hruauh raghrehurauuuuuuhrn gahrn?? hrraaauu: hrrraaauu rauhr ;) _ehruuh_ ‘Rahr Ehrrraaaah’
raauh hrau Ehrahurrrrh *rauhr* gruh
auhrneuuhr. harnuhrrnuhrrr uhr – raauhneuuu 1:30 gahrn raauuughrrr *nuuh* aahhrruuuunn uhr. wuurh harn rhr?

Once the Wookie was at this point, he could talk for quite some time and produce diverse aesthetic results. Reading them out loud is a hoot! Thus was born the first Personality Engine bot (the Wookie is made up of four of them). But he still wasn’t really tweeting until he could follow some of the core Twitter memes.

Twitter Memes

Hash tags were the first obvious choice, since they were easy to fake. To this day, the Wookie can simply prepend a # to any word or composite that he speaks. But to make this feature really zing, I added support for Twitter’s trending searches. This allowed him to use real-world relevant tags, injecting them into his sentences, or appending them to his tweet (as is common convention). It turns out that one of the joys of a nonsense grammar is that anything which isn’t nonsense magically becomes the ‘meaning’ of the sentence:

Harn ehureruheuuu & ahrn rrghhhn! AAAHHRRNNAUHRN EHUUR! #itsnotgonnawork
WAHR NUUUH RAUHRR!!! euu ‘aauurr nraur’ haaauurrurau! #fact
GRRUUUH! uhr hraheuuuuhhr aauurh #ChargersSuck rrhnn aur! gauhr aaurh haurerruuuhh! !! #aurh

Of course, no Twitter user can resist posting shortened URLs. They’ve been a cornerstone of the explosive growth of the service, maybe because there’s just so much interesting fast-moving crap out there on teh Internets. The Wookie follows several aggregation services (Digg, Technorati) and a smattering of other popular blogs (TMZ, The Onion Daily, LOLcats, etc.). He pulls out links to recent content and shortens them with the API. Again, since the Wookie is totally faking it, the results just cannot be accounted for. The best he can hope for is that the emotional texture of his tweet sometimes supports the referenced source:

nrauh rrhn hrauuur ehrruhauhrnurrh. nrraauuuhh IMHO. hraur grraauhurhnauhrn rauh
Ghrn euuhrr? haarneruuuhh urrn aruh rghrnn aaur, uhr hruh urrr :)

And no tweeter lives in a vacuum; their posts are replete with the user handles of friends, comrades and mentors.
The Wookie wasn’t about to make up handles, so his likely choices were his followees and followers.
Rather than take the name-dropping approach here — more on that later — the Wookie chooses to occasionally reference his most recent followers:

OMG! _rrrhnnneuh_ euh: urr wuuurrh rghnurrnh urhn hruhn @sleepbotzz rhagn ghrn rrhn waaahhr hruauhehuraghrrrrrnnn rhagn harrnaurh

After these features were implemented, the Wookie’s posts started to look almost real-ish. And whenever he tweets on his own, that is his range of capabilities. But he wasn’t a real member of the Twitter community until he could play some other tricks. Thus began a completely separate effort: how to translate English into Wookie.


Did I say ‘translate’? What I meant to say was ‘mock’. After all, what can you really do with a nonsense grammar except make it look like it has meaning?

So, the Wookie was taught to mock existing sentences into his own dialect. He simply matches the initial letter (vowel / not) and preserves the word’s length and non-alpha characters (for contractions and the like). Special mappings were also added to deal with short words (the dialect only generates words of 4 letters and above). And within a given tweet, he re-uses the same fake word for each instance of the real one. It’s an obscure feature, but it makes a helluva difference in some specific cases.
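Those rules are concrete enough to sketch. The following is only an approximation under stated assumptions, not the Wookie's real code: the word pools, the `SHORT_WORDS` mappings, and the helper names are all made up, and the rule for what gets preserved is reduced to "handles, hash tags and links".

```ruby
# Hypothetical sketch of the mocking rules: match vowel/consonant start,
# preserve length and non-alpha characters, re-use fakes within a tweet.
VOWEL_WORDS     = %w[auhrn urau ehruh aurh uhrrn].freeze
CONSONANT_WORDS = %w[rghrn gahrn hraur nrauh wahr].freeze
SHORT_WORDS     = { 1 => 'u', 2 => 'ur', 3 => 'ghr' }.freeze # invented mappings

def fake_letters(word, rng)
  return SHORT_WORDS[word.length] if word.length < 4
  pool = word.match?(/\A[aeiou]/i) ? VOWEL_WORDS : CONSONANT_WORDS
  base = pool[rng.rand(pool.length)]
  # Repeat, then trim, so the letter count matches the original word.
  (base * (word.length / base.length + 1))[0, word.length]
end

def mock_word(token, cache, rng)
  letters = token.scan(/[A-Za-z]/).join
  return token if letters.empty?
  fake = cache[letters.downcase] ||= fake_letters(letters, rng)
  queue = fake.chars
  # Swap letters one for one; punctuation and digits stay where they were.
  out = token.gsub(/[A-Za-z]/) { queue.shift }
  token.match?(/\A[A-Z]/) ? out.capitalize : out
end

def mock_tweet(text, rng = Random.new)
  cache = {} # same real word => same fake word within one tweet
  text.split(' ').map { |token|
    # Handles, hash tags and links are kept as-is.
    token.match?(/\A(?:[@#]|https?:)/) ? token : mock_word(token, cache, rng)
  }.join(' ')
end

puts mock_tweet("I can't believe @darthvader said that, can't you? #fact")
```

The per-tweet `cache` is what implements the "same fake word for each instance" behaviour: both occurrences of "can't" come out identical, apostrophe included.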

The totally awesome part of the effort is identifying the words that don’t get mocked. There was no way I wanted to deal with semantic grammar detection, since tweets are often wildly non-grammatical. So as per usual, the Wookie fakes it. It mainly comes down to a weak analysis of quoting and capitalization patterns. He also keeps hash tags, links, handles, many acronyms, and argot, to the best of his ability.

And just for fun, he also recognizes a rather large lexicon of terms from the Star Wars universe. Well, except for the term ‘Star Wars’ that is. He doesn’t know what that means.

It took a lot of experimenting to get it right, and he still makes mistakes, but he’s getting smarter all the time. One of the interesting things I learned during development was how staccato the English language is, as compared to the long smooth yawls of Wookie. Reading back a mocked sentence out loud is a sublime experience.

You may ask, how can this awesome power best be used to serve the Good?


Darn right the Wookie re-tweets! He simply selects a few users that he follows, fetches their recent tweets, and mocks one of them up. There are some users (@darthvader, for instance) whom he will always re-tweet if the user has posted anything fresh. Otherwise it’s a simple random selection, after avoiding repeats (there’s extensive repeat-avoiding code all throughout the Wookie implementation). There is the slight hint of name-dropping here, since he tends to follow a lot of popular accounts, but that’s just the nature of this beast.

RT @warrenellis Rauuh’r @neilhimself ehr #neilfail ar hruun rraaauuuuhrrr? Au. #warrenfail.
RT @KurzweilAINews: First Close Look At Stimulated Brain: Aghhrrrrnn gaauuuhhrrr ar hrrauur aauuuhrrn u gahrrn …
RT @cnnbrk Hruh. Urrnn nrrauuuhh Ted Kennedy ur “rrrhhrrr ghr rauhn hru rau wahr; wahr au Democratic Party; …

It turns out that injecting the re-tweet ‘header’ will often push longer tweets past the 140-character barrier. He will attempt to preserve as much of the original tweet as possible, focusing on trailing URLs and hash tags. And if the tweet is short enough, he posts a shortened link to the original post, primarily to show off his mad skills. He is much inclined towards tweets which have a good blend of mockable and preservable words, again, to show off his mad skills.
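One way to fit a re-tweet under the limit while keeping the trailing links and hash tags might look like the sketch below. The `fit_retweet` helper and its trimming policy are assumptions for illustration; the Wookie's actual rules are not shown in this post.

```ruby
LIMIT = 140

# Hypothetical sketch: build "RT @user body", trimming words from the end
# of the body but keeping any trailing links and hash tags intact.
# It does not handle the case where the header plus tail alone is too long.
def fit_retweet(user, body, limit = LIMIT)
  header = "RT @#{user} "
  words = body.split(' ')
  tail = []
  # Peel trailing URLs and hash tags off the end; those are preserved.
  tail.unshift(words.pop) while words.any? && words.last =~ /\A(?:#|https?:)/
  tail_text = tail.empty? ? '' : " #{tail.join(' ')}"
  full = header + words.join(' ') + tail_text
  return full if full.length <= limit

  ellipsis = ' ...'
  room = limit - header.length - tail_text.length - ellipsis.length
  kept = []
  words.each do |w|
    break if (kept + [w]).join(' ').length > room
    kept << w
  end
  header + kept.join(' ') + ellipsis + tail_text
end

long = (['rauhr'] * 40).join(' ') + ' http://example.com/x #fact'
puts fit_retweet('cnnbrk', long)
```

Cutting on word boundaries keeps the mocked text readable, at the cost of sometimes leaving a few characters unused.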

This became the second Personality Engine bot. Yet still, re-tweeting is a one-way street, and interaction is the real key to user engagement.

Playing Well With Others

The third Personality Engine bot was born of the need to perpetuate the following cycle of fun. On a regular basis, the Wookie will search for references to relevant words (wookiee, kashyyyk, etc.) and will respond to the user with a generated tweet. This is much less invasive and cruel than auto-following, a botting practice which I find to be quite gauche. I can only imagine the surprise on these users’ faces:

@amynicole21 WTF! aruh nrauuhehruhaaauuurr rraaahhhrr nruuh urrrrn ghrn wurh: euu haarrn nuuh grruhuhrnuhhrrn erruuuuhh :)
@DZ1641 gauuuhrr??
@vfigueroa1 rghrurr waahr! rrhneuuuhr urr nruuh hrauh – *ehrraaah* nrauh ^_^ nraur hruun rrrghhnn

However, before he goes searching, he first looks at his recent mention history, specifically at tweets starting with @cr_wookie. If one is found, he will mock and publicly respond to it, linking back to the original post when length permits. So if you talk to the Wookie, there’s a reasonable chance that he’ll republish you. To minimize abuse of this feature, he doesn’t follow quite the same word preservation rules as he does for follower re-tweeting. But he’ll keep Star Wars words, and that opens up a vast realm of potential amusement.

. @kindadodgy Nurh U hraaauuuu rghn Wookie rauuhrr wau’r hruh nrh ghrn hrun ‘rhngn ruhrn uhr wurh rn nuuh gh, ahrn au nrruuuuhh gauurh.
> @adamlampert Ar. U hru nrh raaghrr gh HR’N! Rghn ruuhr au!
.@Lillput Nrh, rhag’n rauuhr nurh ghr haurr au a rghhnn ur a rghn.

Greeting New Followers

The fourth and final Personality Engine bot is the greeter. When you follow him, he’ll DM you. Short and to the point.


Whew! All in all, the project required about 6 weeks of spare time. My only hope is that much hilarity will ensue from these efforts.

If you want to read a bit more about the Wookie — and who wouldn’t, right? — you can check out his Wiki page.

Snake ‘n’ Bacon in The DDOS Caper!

Friday, August 7th, 2009

ah, come in! we’re so glad you’ve come Snake ‘n’ Bacon!
i’m crisp delicious bacon

glad you asked. it seems there’s a group of hackers, and we want you to go in under-cover
i go great on a sandwich

When Twitter came back online yesterday afternoon after their networking attacks, I got a torrent of @cr_snake_bacon tweets. Wasn’t sure why, but it seemed suspicious. Twitter’s API had flopped around for most of the day, so the logs were full of Exceptions and … oops! … re-connect attempts!

Of course I’d built the bots to re-tweet on an Exception. They’re all configured to wait 60 seconds, then try again. But of course, until I fixed the configuration overnight, they did exactly what a bot would do.
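For comparison, a bounded retry with a growing delay is one common way to keep a bot from hammering a struggling service. This is a generic sketch, not the configuration change described here; `with_retries` and its parameters are hypothetical names, and the sleep is injectable so the example does not actually wait.

```ruby
# Hypothetical sketch: retry a block a limited number of times, doubling
# the wait between attempts. `sleeper` is injectable for testing.
def with_retries(max_attempts: 3, base_delay: 60, sleeper: ->(s) { sleep(s) })
  attempt = 0
  begin
    attempt += 1
    yield attempt
  rescue StandardError
    raise if attempt >= max_attempts # give up rather than retry forever
    sleeper.call(base_delay * (2**(attempt - 1)))
    retry
  end
end

# Demo with a fake sleeper, so nothing really waits.
with_retries(max_attempts: 3, sleeper: ->(s) { puts "would wait #{s}s" }) do |attempt|
  raise IOError, 'API unavailable' if attempt < 2
  puts "posted on attempt #{attempt}"
end
```

The important part is the cap: once `max_attempts` is reached the error propagates, so a day-long outage produces a handful of attempts instead of a torrent.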

The service attacks on Twitter continued through today, and I’m sure that the birdy techs are furiously building black ice fortresses in Scala even now. Again, I saw a burst this afternoon from all of my bots. Pokey the Penguin, Conet Project, and Chewbacca all had several things to say, all at once. Obviously I had fucked something else up, so I hurriedly checked the logs. And nope … actually, my change had worked … Twitter had just un-blocked my IP.


I’m not exactly sure how many bots are out there … here’s a nice wiki being kept of them. But I can imagine I’m not the only one who made that try-again coding mis-calculation. What’s sweet is that it’s un-done now, and my toys can continue prattling on.

Thanks, guys. Sorry we looked like a vicious automaton for a while there. Glad to be back.

When Broken Toys Impact your Friends

Friday, February 6th, 2009

This sure was an interesting morning! I woke up to find that I’d unintentionally sent direct messages to all of the followers on my personal Twitter account. And I’d sent them out at 1am PST, which means that anyone who (a) uses SMS capabilities, and (b) has some text message notification sound set up would have been rudely interrupted in the middle of the night.

Fortunately, I haven’t lost any followers (yet). But this was a perfect case of how mixing business with pleasure can have unintended consequences.

What Have I Learned

Or rather, what have I re-learned?

Soft-disable features in Production at Launch Time
My Twitter engines are built with both an :enable_tweet and :enable_greeting config setting. In the git repo, they’re both true. When I did my local testing, I’d disabled them correctly. When I launched in Production, I neglected to make the quick-and-dirty changes; after all, everything worked great. And once started, my scripts correctly responded to the no-initial-state condition, and greeted everybody.
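A "safe by default" arrangement could be sketched as follows. Only the `:enable_tweet` and `:enable_greeting` setting names come from the post; the Hash-based config, the helper names, and the seeding behaviour are assumptions about how one might avoid greeting everybody on a first run.

```ruby
# Hypothetical sketch: outbound features stay off unless explicitly enabled.
DEFAULTS = { enable_tweet: false, enable_greeting: false }.freeze

def load_config(overrides = {})
  DEFAULTS.merge(overrides)
end

# On a first run there is no stored follower list. Treat that as "seed
# only": remember who is there now, and greet only later arrivals.
def followers_to_greet(config, known, current)
  return [] unless config[:enable_greeting]
  return [] if known.nil? # no initial state: greet nobody
  current - known
end

production = load_config(enable_greeting: true)
p followers_to_greet(production, nil, %w[alice bob])        # first launch
p followers_to_greet(production, %w[alice], %w[alice bob])  # later run
```

With defaults like these, forgetting a launch-time change fails quiet rather than loud.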

Launch preparation is critical, even for little projects. The start-up mentality is to move fast and lean, but there’s such a thing as too fast, and probably as too lean too. Gradual uptake migration is a wise strategy even for the ‘little things’.

Mock and Integration Testing Only Gets You So Far
I used rspec to mock out the full capabilities of the engine. Found some real-world issues, resolved them. I also wrote some core integration tests, ran them locally. Immediate failures. I had mocked documented features that didn’t actually exist. Fixed, re-mocked, re-tested, fixed again, etc.

Another great reminder that you can only mock something you trust, and how can you trust something you haven’t actually run under integration conditions to start with! Re-tested integration, and everything passed with flying colors. Sure, the features worked great now! And when I launched them, they did exactly what I asked.

So, as if we haven’t heard it enough times, be careful what you ask for!

All of this is familiar to anyone who has made a mistake in the software industry. It’s not like I haven’t successfully executed dozens of critical launches in the past, and most with virtually no issues at all. But what’s interesting is what happens when these mistakes happen in a public forum, and whom you expose them to — say, your friends :)

And who can say when two ounces of caution is more deserving than one … without the benefit of hindsight.

Just ask anyone who has a stringent backup policy how much time & effort they invest to avoid an event that may never actually happen. That stringency usually comes from that one unforgettable experience, and from there is born an extra layer of caution, and an additional time-sink (eg. mock & integration testing).

Heh. KISS. So, what exactly is simple? DRY. Isn’t that supposed to be a time-saver? Well, it depends on what you’re not trying to repeat. Strange how these cuddly and liberating acronyms can have more than one interpretation.

Experience taints everything.

Twitter4R shifts RSpec onto my Front Burner

Thursday, February 5th, 2009

As usual, my day did not pan out as expected. But, also as usual, I learned a lot!

Coming Up To Speed on Rspec

So, learning rspec has been on my list for a while. I finally got around to it. Nice framework. I am familiar with EasyMock, and am aware of JMock. I never had an opportunity to get into Mockito, but I’d give it a glance next time I put on my Java hat.

There is some decent documentation on it out there. I found these links particularly helpful:

During my ramping-up, I took the usual meta-approach of creating a test suite — and yaaay, that’s what rspec is meant for! — which I then used to test out its own range of capabilities. The Modules in the rdoc which I’ve found to provide the most value are:

  • Spec::Expectations::ObjectExpectations for conditionals (eg. should & should_not)
  • Spec::Matchers for expectations (eg. equal(value), be_a(class), respond_to(method_sym), raise_error)
  • Spec::Mocks::Methods for mock method definition (eg. should_receive(method_sym))
  • Spec::Mocks::MessageExpectation for mock behaviour (eg. with(*args), once, exactly(n).times, any_number_of_times)
  • Spec::Mocks::ArgumentConstraints for mock arguments (eg. an_instance_of(class), anything)
  • Spec::Mocks::BaseExpectation for mock responses (eg. and_return(value), and_yield(&block))

I won’t go into the deep details, but here are some examples of conditionals and expectations that I paraphrased into my meta-test:

And here’s a little bit of silly mocking:

These are just ways I thought of to exercise the width and breadth of the library. Very nice. I hope that these are useful examples for people new to this gem.

I recommend the other references from above for filling in the missing details. Once you have a context / specify or a describe / it specification set up, you’ll be good to go. There’s much more to the library — Stories, for instance — but that’s for another day.

An Informative Walk through Twitter4R

No, I didn’t really want to spend time learning rspec — I mean heck, I’m busy — but I had a personal need to expand the Twitter4R gem. Specifically, I wanted to add on some Twitter search features, and I was very impressed with how this library has been built. Contribution-wise, the final step that Susan Potter recommends is to craft up some rspec tests.

Of course, mock testing is only as good as the framework you’re built upon. The assumption is that Net::HTTP is going to do its job, so mock it up and you can even test your Twitter features offline. When I built Bitly4R (given that name, my thinking has clearly been influenced), I did everything as full-on functional tests. It was easy; the service has both shorten and expand commands, so I could reverse-test real values without having any fixed expectations.

However, Twitter is live and user-generated, so who knows what you’ll find. Mocking covers that for you. And of course not having to hit the service itself shortens testing time dramatically.
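The offline idea can be shown without any mocking framework at all. The sketch below uses hand-rolled fakes rather than rspec mocks, and none of it is Twitter4R code: `SketchSearch`, `FakeHTTP`, `FakeResponse`, and the `/search.json` path are hypothetical. The fakes only mimic the small surface of Net::HTTP that the helper touches (`get`, `code`, `body`).

```ruby
require 'json'
require 'uri'

# Hypothetical search helper: the HTTP layer is passed in, so a test can
# hand it a fake object and never touch the network.
class SketchSearch
  def initialize(http)
    @http = http
  end

  def search(query)
    response = @http.get("/search.json?q=#{URI.encode_www_form_component(query)}")
    raise "HTTP #{response.code}" unless response.code == '200'
    JSON.parse(response.body).fetch('results').map { |r| r['text'] }
  end
end

# Minimal fakes: just enough surface for the helper above.
FakeResponse = Struct.new(:code, :body)

class FakeHTTP
  attr_reader :paths

  def initialize(response)
    @response = response
    @paths = []
  end

  def get(path)
    @paths << path # record the call so a test can assert on it
    @response
  end
end

canned = FakeResponse.new('200', '{"results":[{"text":"wahr"},{"text":"nuuh"}]}')
p SketchSearch.new(FakeHTTP.new(canned)).search('wookiee')
```

The canned response is only as trustworthy as your knowledge of the real service, which is exactly the caveat about mocking something you have not run under integration conditions.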

Here’s one of my tests, again foreshortened:

Plus a little mocking of Net::HTTPResponse:

Nothing like having some good inspiration to get you thinking about maintainability. I’m not a strong adherent to any of the TDD or BDD denominations, but testability itself is close to my heart. Just ask the folks who’ll be looking at the micro-assertion Module that I wrote for my Facebook Puzzle submissions (which work locally under test conditions but fail miserably once thrown over their wall). Then again, I won’t need to write that (and test that) from scratch again.

So, back to talking glowingly about the Twitter4R architecture. Yes, I used that word. Yet regardless of that term’s heavyweightedness, it’s simply the result of carefully thinking out the formalization of a complex design. And once a non-author delves into extending an existing implementation, that impl’s demeanor becomes readily clear :)

Some of the interesting things I saw:

  • An inner bless-ish strategy, to inject the Twitter::Client into instantiated result classes. I’ve seen blessing as an OO-Perl-ism, for self-ication, and that metaphor carries over very nicely to this context.
  • Generous use of blocks / lambdas / closures in methods, for contextual iteration (eg. iterate through each Twitter::Message in a timeline response). Unnecessary, but an excellent convenience for a language that offers that optional capability in a very disposable fashion (from the caller’s perspective).
  • Retroactive sub-object population after construction. Twitter4R relies upon the json gem, which deals primarily in Hashes. Post-injection, the library iterates through Arrays of Hashes and transforms them into suitable Twitter::User containers, etc. A great place to put such logic in the construction chain, and it doesn’t take long to get really tired of Hash hierarchies.
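The three patterns in that list can be illustrated together in a few lines. These classes are stand-ins invented for this sketch and are not Twitter4R's real classes or method signatures; only the general shape (client injection, optional block iteration, Hash-to-object conversion) follows the description.

```ruby
require 'json'

# Hypothetical stand-ins; not Twitter4R's actual classes.
class SketchUser
  attr_reader :screen_name
  attr_accessor :client # "blessed" in after construction

  def initialize(attrs)
    @screen_name = attrs['screen_name']
  end
end

class SketchMessage
  attr_reader :text, :user
  attr_accessor :client

  def initialize(attrs)
    @text = attrs['text']
    # Retroactive population: the nested Hash becomes a proper object.
    @user = SketchUser.new(attrs['user'] || {})
  end
end

class SketchClient
  # Parse a JSON timeline, inject self into every result object, and
  # yield each message if the caller supplied a block.
  def timeline(json)
    messages = JSON.parse(json).map do |attrs|
      msg = SketchMessage.new(attrs)
      msg.client = msg.user.client = self
      msg
    end
    messages.each { |m| yield m } if block_given?
    messages
  end
end

json = '[{"text":"wahr","user":{"screen_name":"cr_wookie"}}]'
SketchClient.new.timeline(json) { |m| puts "#{m.user.screen_name}: #{m.text}" }
```

Because the block is optional, a caller who just wants the Array ignores it, which is the "disposable" convenience mentioned in the list.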

Good stuff, and learning within a Ruby-centric mindset was invaluable for me. We all have to start somewhere, eh.

The Acute Long-Term Pain of Staticness

There was one issue that I ran into; static configuration. During my years of using the Spring Framework, my IOC-addled brain started thinking of everything in terms of instances — Factory Beans, POJOs, Observers & Listeners, Chain-of-Responsibility wrappers. Static configuration is a common metaphor, and in this case, there’s a static Twitter::Config instance. Convenient and centralized. Makes perfect sense.

I mean, the fact that it was a configurable library at all was awesome. I was able to easily set up a Twitter::Client to reference the search service. However, of course, as soon as I did that, I whacked the ability for clients to talk to the regular service in the process. Oops!

On GPs, I refused to modify the original code. And I wanted to make sure that my superficial tweaks to the library would be thread-safe, since temporarily swapping out the global Twitter::Config in mid-operation would be an issue. Using Mutex.synchronize seemed like the perfect choice. After finding that the same thread can’t lock a Mutex instance twice (grr! that’s a great trick if you can work it), I overrode the one method that cared the most about @@config:
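The original override is not reproduced here. As a rough illustration of the general technique only, the sketch below swaps a class-level configuration under a lock and restores it afterwards. `SketchLib`, `with_config`, and the `.example` hosts are hypothetical, and it uses Ruby's standard `Monitor`, which is re-entrant, in place of a plain `Mutex`; that is a substitution, not necessarily what was done in this case.

```ruby
require 'monitor'

# Hypothetical stand-in for a library with one class-level configuration.
class SketchLib
  @@config = { host: 'main.example' }
  def self.config; @@config; end
  def self.config=(c); @@config = c; end
  def self.request(path); "#{@@config[:host]}#{path}"; end
end

# Monitor is re-entrant: the same thread may nest with_config calls.
CONFIG_LOCK = Monitor.new

def with_config(temp)
  CONFIG_LOCK.synchronize do
    saved = SketchLib.config
    SketchLib.config = saved.merge(temp)
    begin
      yield
    ensure
      SketchLib.config = saved # always restore the global
    end
  end
end

puts with_config(host: 'search.example') { SketchLib.request('/q') }
puts SketchLib.request('/status')

# A plain Mutex refuses to be locked twice by the same thread.
m = Mutex.new
begin
  m.synchronize { m.synchronize {} }
rescue ThreadError => e
  puts "Mutex is not re-entrant: #{e.message}"
end
```

Note the limit of this approach: it only protects callers that go through `with_config`. Any thread that reads the global configuration directly while another thread holds the lock will still see the temporary value.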

It works like a charm. Please, everyone just line up to tell me how I could have done it better. And I don’t mean quicker or cheaper, I mean better. Believe me, I would not have sunk the time into this end-around approach, if not for the fact that:

  1. I don’t want to maintain a local gem code modification (even though my impl is closely coupled to the gem impl already)
  2. I intend to follow that practice, so every opportunity to pull an end-around is a Valuable Learning Experience.

So, now my local Twitter4R has search capability gracefully latched onto it (and implemented much in the flavor of the library itself). I have a mass of rspec examples to work off in the future.

Now, I haven’t spent a great amount of time testing the thread-safeness — no one in their right mind wants to do that — but my sundry Twitter::Client instances play nicely together in between searches and normal status operations.

And I had something useful to blog about!