Sunday, May 13, 2012

Life Spans: A Career in Connections

All of my programming life has been spent playing with sockets; client and server. The guy who introduced me to the networking stack was Lou Montulli, and he said something to me as an impressionable kid that has driven everything I've done ever since: "the URL is the only thing that matters. If you are writing code with that in mind, you will win." I took that statement to heart, and doing so has yielded a priceless life experience so far. There's simplicity in that statement, as well as wisdom, truth, honor, religion, direction, revolution, evolution, complexity, and mantra.

It all started with the Lynx text-browser networking library; an originally serial thing. That library was sucked into Mosaic/Netscape and parallelized in order to accommodate fancy things like images and JavaScript. From there, we struck a performance balance between the desktop's ability to parallelize socket I/O schedules and the rendering engine's ability to render. Connection speeds started to improve (dial-up to broadband), and TCP stacks standardized; relative network-stack homogeneity set in: Cisco, Apache.

Scaling both sides of the socket (client and server) turned into a game of threads, multi-proc, asynchronous/event-driven programming and figuring out how to get as many balls in the air as possible, and how to get them back on the table in an organized manner. The entire system moved to this challenge, and tremendous innovation occurred at the hardware and software level. In the mid-2000s I spent my time trying to optimize AOL's publishing infrastructure to yield thousands of short-lived parallel connections per server in a real-world publishing environment.

Then something unpredictable happened. Social media thrust massive amounts of dynamic, user-generated content (UGC) into my world at Gnip. Everything I'd been working for turned around 180 degrees. Instead of optimizing for lots (end-consumer browsers) of short-lived HTTP transactions in a web browser or on an HTTP server, we needed to focus on few (relatively speaking, in the world of "enterprise") long-lived HTTP transactions. Dozens of customers per NIC, instead of thousands. Stateful transactions instead of stateless. At Gnip we need connections to last weeks on end (and longer), instead of sub-second. Yet, we and our customers want everything to just look and feel like the URL we've all come to know and love.
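To make the contrast concrete, here is a minimal sketch of the long-lived-connection pattern: rather than one request/response per socket, the client holds a single stream open and consumes activities indefinitely, reconnecting with backoff when the stream breaks. This is illustrative only, not Gnip's actual client; the `connect` and `handle` callables are hypothetical stand-ins for a real chunked HTTP response and a real consumer.

```python
import time

def stream_activities(connect, handle, max_reconnects=3):
    """Consume a long-lived stream, reconnecting on failure.

    connect() returns an iterator of activities (or raises ConnectionError);
    handle(activity) processes each one. Returns the count of activities handled.
    """
    handled = 0
    backoff = 1
    attempts = 0
    while attempts <= max_reconnects:
        try:
            for activity in connect():
                handle(activity)
                handled += 1
                backoff = 1  # a healthy stream resets the backoff
            break  # stream ended cleanly
        except ConnectionError:
            attempts += 1
            time.sleep(backoff)
            backoff = min(backoff * 2, 30)  # exponential backoff, capped
    return handled
```

The key inversion from the short-lived world is that connection failure is the exceptional path rather than the end of every transaction, so the reconnect loop, not the request, becomes the unit of design.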

The result has been a massive engineering exercise that continues today; Gnip. The public Internet has been plumbed out as a system that supports bursty, short-lived HTTP transactions. Yet, Gnip users want a sustained HTTP transaction that can handle 10x volume spikes sans issue. From highly elastic public cloud hosting solutions to dedicated hosting providers, we've spent a lot of energy ensuring the network path has the clarity to support these needs.

One "feature" that I miss about my old Internet is statelessness. Stateful connections cannot break. If they do, bad things happen, and contingencies have to fill the gaps. Traditional queuing theory is the monkey on every Gnip engineer's back. Buffering, backfill, replay, index pointers, sustained throughput volumes, cascading, fanout, MTUs, packet loss, and processing latency are our domain. I view our system as an aqueduct delivering uninterrupted streams of public consciousness to those who need it. The challenges inherent in building something like this are writing great chapters in the lives of everyone on the team.
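The backfill/replay idea can be sketched in a few lines. This is a toy model under an assumed design, not Gnip's implementation: activities get a monotonic sequence number as they arrive, a bounded buffer keeps the most recent ones, and a consumer whose connection broke asks to replay from the last pointer it processed. Anything older than the buffer is a gap that contingencies must fill.

```python
from collections import deque

class ReplayBuffer:
    """Toy replay buffer: retain the last `capacity` activities, each tagged
    with a monotonic sequence number, so a reconnecting consumer can backfill
    from its last-processed pointer."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buf = deque()   # (seq, activity) pairs, oldest first
        self.next_seq = 0

    def append(self, activity):
        seq = self.next_seq
        self.next_seq += 1
        self.buf.append((seq, activity))
        if len(self.buf) > self.capacity:
            self.buf.popleft()  # oldest data falls off; it can no longer be replayed
        return seq

    def replay_from(self, pointer):
        """Return (seq, activity) pairs with seq >= pointer still in the buffer."""
        return [(s, a) for s, a in self.buf if s >= pointer]
```

The design choice that matters is the bound: an unbounded buffer trades memory for a guarantee no real system can keep under sustained 10x spikes, so the buffer's depth is effectively a promise about how long a broken connection can stay broken before data is lost.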

I find it curious that while the URL is still the center of my universe, what's behind it has changed dramatically over the course of my career. Something so simple, yet so complex.
