Teams, Rookies & Vets

Jud Valeski

22 Nov 2008 • 1 min read

A week ago, almost to the minute, the following message was generated by an internal Gnip monitoring server and sent to the person on call.

PROBLEM alert - production-head2/production-head2-gnip is CRITICAL

It was the start of a very long, non-stop few days at Gnip.

Much of the rapport on your team gets defined in moments like this. Your team's ability to solve hard, live problems is thrust into the foreground. About five hours into the ordeal, my appreciation for having focused very hard on bringing software veterans into Gnip was peaking. The problem was being sliced and diced, and the collective experience of everyone on the team was winnowing things down quickly. "It can't be that!" "It must be this!" "I think we should focus here."

Step 1: we isolated the symptoms. Exactly what was going on!?!
Step 2: we checked configurations/environments
Step 3: we identified potential code inefficiencies
Step 4: we verified probabilities
Step 5: we placed a bet on what we thought the problem was, and wrote code to address it
Step 6: we watched our hard work pay off; production issue resolved; it was the right bet

Knowing which bet to place comes from experience. The only problem with experience at a startup is that it can be expensive. Like so much in life, you get what you pay for. Had Gnip been tilted toward relatively inexperienced, inexpensive, junior team members, what turned out to be a production blip could have been a true nightmare for the company.

Glad that problem is behind us, and that we all have a nice new chunk of experience to put into the bag of tricks for future use.
