Sentiment analysis of digitized content (tweets, email, blog posts, etc.) is hard. Sarcasm makes it even harder. Consider how many sarcastic comments are made in our online communications each day: "I love being delayed at the airport." "I can't stand it when everything is going my way." Analyzing text like that has got to throw even the best sentiment analysis engines for a loop, and the false positives start flying.
If you're sarcastic, like me, you've learned to keep your sarcasm to a minimum when you're writing because the context just isn't there for your reader, much less a machine, to understand the subtle shifts in tone or where you're coming from.
I'm looking forward to sentiment deduction getting better, but I'd like to see how the logic evolves to understand age-old sarcasm.
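To make the false positive concrete, here is a minimal sketch of a word-counting scorer. The word weights and the function names are made up for illustration; real engines are far more sophisticated, but any approach that mostly adds up word polarity stumbles the same way.

```python
import re

# Hypothetical word weights, invented for this illustration only.
WEIGHTS = {"love": 3, "delayed": -1, "hate": -3}


def score(text):
    """Sum the weights of every known word in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(WEIGHTS.get(word, 0) for word in words)


def label(text):
    """Map the numeric score to a coarse sentiment label."""
    total = score(text)
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"


# The sarcastic complaint comes out "positive" because "love" outweighs "delayed".
sarcastic = "I love being delayed at the airport."
print(score(sarcastic), label(sarcastic))
```

Nothing in the word list tells the scorer that nobody actually loves an airport delay; that context lives outside the sentence.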
Wednesday, July 8, 2009
Sarcasm and Sentiment Analysis
Thursday, July 2, 2009
Speed Date with Google App Engine
I assumed that, in the months since the announcement of Google App Engine, its glaring HTTP client deficiencies would have been resolved. Nope.
Any modern platform needs a robust HTTP client (timeout controls, full method support, custom headers, compression support, authentication support, and redirect handling). Unfortunately, GAE's urlfetch client (which the standard Python HTTP clients all funnel down to) doesn't let you tweak various headers (including Referer). Nor can you customize the connection timeouts. Both of these tweaks are tools of the modern-day web services programming trade. Consequently, I have to cast GAE to the Tinkertoy pile with the rest of today's high-level web apps. Sadly, a quick look at the app repository bears this out.
On the other hand, take a look at what's been built on Amazon's AWS. Goes to show what you can do when you have an open platform.
That said, GAE does show promise for hosting simple user-facing web applications, or offline data crunching/hosting apps (à la google.com) with quick user-facing response times and little reliance on the outside web.
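For reference, this is the kind of control I mean, sketched with Python's standard urllib.request module running outside App Engine. The URLs, the header values, and the helper names are placeholders of mine, and this is not GAE's urlfetch API.

```python
import urllib.request


def build_request(url, referer):
    """Create a GET request that carries a custom Referer header."""
    return urllib.request.Request(
        url,
        headers={"Referer": referer, "Accept-Encoding": "gzip"},
    )


def fetch(url, referer, timeout_seconds=5.0):
    """Send the request, giving up if the server is slower than the timeout."""
    request = build_request(url, referer)
    with urllib.request.urlopen(request, timeout=timeout_seconds) as response:
        return response.status, response.read()


# Building the request needs no network access, so it is safe to inspect here.
request = build_request("http://example.com/api", "http://example.com/")
print(request.get_method(), request.get_header("Referer"))
```

Calling fetch() is what actually touches the network; the timeout argument is the knob I was missing.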
Friday, June 26, 2009
TechStars Boulder 2009: Half-Time Report
The other night I attended the first "pitch practice" for this year's set of TechStars Boulder companies. Watching the evolution is always fascinating. So far, my first impressions are holding strong. Those who I suspected would struggle, are struggling. Those who I expected would be knocking it out of the park are doing so.
Comparing and contrasting the 2009 crew to the 2008 crew, I find that this year's companies are, on the whole, more mature than last year's. More companies this year have their products further down the path toward where I think they will ultimately end up. It's been nice to work with teams that are more crystallized in their thinking and implementations. Of course, there's always the crew that bounces back and forth for a while until they hone things to the point they can walk down a straight line.
For those doing user-facing products, the focus on the concepts that will "hook" a user is much better than last year. Too many priorities tend to doom a team, and recognition of this is coming fast. That said, there's a big difference between knowing which features to drop and actually dropping them; it can be hard to let go.
For those doing more infrastructure-intensive products, the technical skills brought to bear, and the understanding of the issues at hand, are much more advanced than last year. The infrastructure plays have a special place in my heart, so it's been fun to work with folks doing more backend stuff.
Of course, there's a star burning hotter than the others. This team has taken a problem that billions of dollars have been thrown at, to little avail, and turned it upside down. As a result, they have a phenomenal product that is going to do things for an industry that has been begging for it for decades. Brilliant, and totally cool. I can't say who it is, but it will be apparent when the season's over.
Some technical patterns/themes that pervade almost every team this year:
- Polling. Mashing APIs together is the norm now, and polling is the overused access paradigm. Conveniently, my company Gnip (http://gnip.com) is trying to make this easier.
- Queuing. Polling's ugly sibling. More teams are challenged with queuing needs in their application, which bumps complexity up a notch. The simplest advice is best here. Queuing Theory 101: if the average inbound rate of items is greater than the rate at which the system can digest them, you're screwed; rethink the model.
- Data Storage. "How am I going to store all that data in an access-efficient manner?" The inbred offspring of Polling and Queuing, data storage challenges are real for a few of the companies. For the others, the age-old simple relational DB model will fit the bill.
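The Queuing Theory 101 rule can be seen in a few lines. This is a deliberately crude sketch with fixed rates per time step and no randomness; the function name and the numbers are mine.

```python
def backlog_over_time(arrival_rate, service_rate, steps):
    """Track queue depth when items arrive and are digested at fixed rates."""
    backlog = 0
    history = []
    for _ in range(steps):
        # Items arrive, the system digests what it can, and depth never goes below zero.
        backlog = max(0, backlog + arrival_rate - service_rate)
        history.append(backlog)
    return history


# Inbound faster than the system can digest: the backlog grows without bound.
print(backlog_over_time(arrival_rate=12, service_rate=10, steps=5))  # [2, 4, 6, 8, 10]

# Inbound slower than the system can digest: the queue stays empty.
print(backlog_over_time(arrival_rate=8, service_rate=10, steps=5))   # [0, 0, 0, 0, 0]
```

A bigger queue only delays the first case; it does not change the slope.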
One thing that will never cease to amaze me is the energy, passion, and commitment that radiates from the teams. Amazing.
Boulder is lucky to have this program, and I'm lucky to be a part of it.
Thursday, June 18, 2009
Faithfully Breaking Rules
Spending a week on a 15,000-person island (Martha's Vineyard) with family has made me think about breaking the rules in more ways than one. Reading the local paper this morning reminded me of how important it can be to break the rules. One of the bakeries in Oak Bluffs opens its back alley door at 10:30pm every night to sell doughnuts as they're coming off the line, all night until 7am. I'm sure they're breaking numerous zoning and health code rules in the process, but needless to say, with a population of this size, everyone loves it and no one cares; no harm, no foul.
The "family" aspect of this vacation has me bending/breaking, and enforcing, numerous parenting rules as well. Ice cream every day? No problem. Licorice before breakfast? Sure.
Reading about President Obama's Finance industry reworking got me thinking about "bigger" rules that affect our everyday lives, indirectly and directly. That turned me to one of my favorite, and brutally simple, rules that we, internationally and cross-culturally, effectively never break: "stay on your side of the road when driving."
Think about it. Every day millions of people drive two-thousand-pound chunks of metal at high speeds in opposite directions, with nothing more than a couple of feet between them as they pass each other. There is some base rule that taps into our mortality that truly prevents us from breaking this rule. We have faith that complete strangers will adhere to the rule as well. We hand our lives over to other drivers every day. I always like coming back to that one as it's an interesting exercise regarding faith in others.
Photo by: William C. Beall of Washington Daily News
Monday, June 1, 2009
"Mommamacations" & Perfect Software
My 6.5-year-old son and I built a Lego Mindstorms vehicle yesterday. After constructing it, we wrote the software for it. After watching version 1.0 of our software run for about 5 seconds, we noticed a bug, so we iterated, fixed the bug, and ran v2 of the software. After about 30 seconds we noticed another issue with the number of degrees the vehicle was turning when it confronted an obstacle. We tuned the software to increase the angle to 90 degrees, compiled, pushed code to the vehicle, and ran it.
This version, v3, of the software ran for a while. It ran at home, at his grandparents' house, and again this morning. It ran well, for a long time. However, a few minutes ago we found yet another refinement we could make to the turning angle to make it get out of a jam even faster, and I said "aha, I found another modification we can make!" My son replied, "let's make all of the mommamacations [sic] this time." He wanted to write the software once, without bugs, perfectly.
I went on to explain how it takes time to understand how software is going to work in the real-world and how you can't account for all of the variables and scenarios up front. As a result, you build, test, and refine; you iterate. You can't write it once and have it work perfectly forever.
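The thing we kept tuning boils down to one number. Here is a rough sketch in Python, not the actual Mindstorms program we wrote; the function name and the heading arithmetic are invented for illustration.

```python
def next_heading(heading_degrees, obstacle_ahead, turn_degrees=90):
    """Return the new compass heading after reacting to an obstacle, if any."""
    if obstacle_ahead:
        # Turn by the tunable angle and wrap around at a full circle.
        return (heading_degrees + turn_degrees) % 360
    return heading_degrees


# Each iteration of the software changed only turn_degrees.
print(next_heading(0, obstacle_ahead=True))                     # 90
print(next_heading(0, obstacle_ahead=True, turn_degrees=120))   # 120
```

Which angle gets the vehicle out of a jam fastest is something we only learned by watching it run.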
He didn't fully grok it, but it's starting to sink in. It was a neat interaction with my boy around what my world is all about. Ha! My daughter just yelled out "am I doing ballet today?" Gotta run.
Sunday, May 31, 2009
Google I/O: My Impressions
Photo from Matthias Schicker's post.
I attended last week's Google I/O 2009 conference in San Francisco. Here's what struck me.
HTML 5
Some friends and I debated what the punchline of the show was over dinner one night, but for me, it was HTML 5; the browser. The entire introduction sequence was about JavaScript execution, rendering speeds, and HTML 5 standards. The five-minute automated intro demo before the keynote demonstrated web browser functionality, literally, using a web browser for everything. This theme was downright entertaining. HTML 5 is the distillation of everything we've wanted/needed in the browser over the past decade. What's particularly funny about HTML 5, and Google's all-hands-on-deck push for its implementation across the board, is that, without exception, all the relevant portions were knocked together (in prototype form at a minimum) a decade ago between Netscape/Mozilla, IE, and Opera. The entire two days felt like a browser resurgence.
What I liked about this was that Google had the gumption, and obviously the money, to make something old, new again. If you ever spent time working on one of the major browsers, you too see the world through HTTP/URL/JavaScript lenses. Those technologies unlock everything. It was cool and fun to be part of a conference dedicated to these concepts.
Wave
I left about a third of the way through the Wave introduction. Again, 10-year-old communication/message-threading concepts being demonstrated in front of a technical audience of four thousand. My initial reaction was, yawn. I've always loved the notion of treating messaging as more centralized (in a logical sense), both from a backend protocol/storage standpoint and from a UI perspective. Naturally flowing between an asynchronous email conversation and a synchronous IM conversation will be a beautiful thing when we finally get there. However, it felt awkward having one of Google's top three themes be Wave.
Architecturally, Wave appears to be able to get us there; however, standing up Wave providers en masse will take a long time (consider how long it took for SMTP/POP/IMAP to proliferate).
I'm particularly excited about Wave's leverage of XMPP (with extensions) as the connection/protocol model, though; feels very fitting. Furthermore, Cisco's acquisition of Jabber last year is feeling like a sweet decision right about now. Imagine Cisco's XMPP routers hardened for Wave Providers; nice dovetail.
Android
Google handed out four thousand Android/HTC mobiles in hopes of spurring Android development. I've gone so far as to pull down the SDK and do some dev "how-to" reading, but I've gotten distracted and have moved on. There are three fatal flaws with Android and the HTC device.
- The soft-keyboard is too small which makes it very hard to type. This is purely a function of the device/form-factor which can/will change over time.
- No iTunes/iPod. There's a media player, but my world is painted in iTunes (for better or worse) and it's already a syncing nightmare, so I'm not about to add another framework into the mix. My "phone" and music/video are on one device (iPhone) and I can never go back to multiple devices. If anything, the iPhone has replaced my laptop as well on an occasional business trip.
- The browser is all but useless. This shocked me, but the UI metaphors on the iPhone Safari browser (some of which I'm sure Apple has patented) are so well done that anything less on such a small form factor is a huge step backward.
Joseph Smarr on "The Social Web"
Smarr's always good to watch/hear. He understands the high level yet always has his hands dirty with the actual hands-on implementation. He underscored how much things have changed with respect to OpenID and OAuth adoption over the past 12 months. Very true, and great to see. He mentioned Gnip, and Plaxo's integration points with it, which was much appreciated; thanks Joseph!
"Spelly"
One of the tracks was about "Spelly", the server-side spell checker used in the Wave demos. What was so cool about this spell checker is that it was backed not by a dictionary, but by the statistical probability (language independent) of a word being spelled correctly based on its position relative to the surrounding words. For example, "Let's met tomorrow" slammed against the corpus of indexed web documents illustrates that the vast majority of the time words starting with 'm', between "Let's" and "tomorrow", are spelled "meet", not "met." So cool!
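Here is a toy version of the idea as I understood it, not Google's implementation. The six-sentence corpus stands in for the indexed web, and the function names and the "same first letter" candidate rule are my own simplifications.

```python
from collections import Counter

# A stand-in for "the corpus of indexed web documents" (made up for this sketch).
CORPUS = [
    "let's meet tomorrow",
    "let's meet tomorrow at noon",
    "let's meet tomorrow morning",
    "can we meet tomorrow",
    "let's met tomorrow",
    "we met yesterday",
]


def build_trigram_counts(sentences):
    """Count how often each (previous, word, next) triple appears."""
    counts = Counter()
    for sentence in sentences:
        words = sentence.lower().split()
        for prev, word, nxt in zip(words, words[1:], words[2:]):
            counts[(prev, word, nxt)] += 1
    return counts


def suggest(counts, prev, word, nxt):
    """Pick the most frequent word seen between prev and nxt with the same first letter."""
    candidates = {
        middle: n
        for (p, middle, q), n in counts.items()
        if p == prev and q == nxt and middle[0] == word[0]
    }
    if not candidates:
        return word  # No evidence either way, so leave the word alone.
    return max(candidates, key=candidates.get)


counts = build_trigram_counts(CORPUS)
print(suggest(counts, "let's", "met", "tomorrow"))
```

Note that "met" is a perfectly valid dictionary word, which is exactly why a dictionary-backed checker would wave it through while the surrounding words give it away.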
App Engine + Java
Google's hamstrung Java to about the same degree they did Python in App Engine. If you're a high-level Java hacker you might have fun, otherwise this was a solid miss (at least for now).
Sunday, May 17, 2009
Boulder & California

While playing with my daughter this morning she pulled out the keychain pictured here. I noticed that the "Boulder" sticker was sitting on top of another. I peeled it back and found "California" underneath. Growing up in Boulder, spending four years in Silicon Valley, then moving back to Boulder to help build and grow our software/technology sector, caused me to view the picture through several lenses.
- California is passé, and products/companies/people are re-branding themselves as Boulder, which is trendy.
- The keychain manufacturer decided to put California stickers on the keychains as they came off the line, then localize the keychains on demand and in smaller batches when needed, in order to save cost.
- Some of Boulder's entrepreneurship is really California underneath.
- Some of California's entrepreneurship is really Boulder underneath.
- The Californication of Boulder is real.
- The migration of Californians to Boulder continues.
- Micah - I am Boulder, Hear me Roar
- Andrew Hyde - http://andrewhyde.net/
- many others that I don't have time to track down, given my kids are unattended in the front yard doing a lemonade stand.
