Monday, August 17, 2009

Dirty Data, Python & Gnip

I've been tinkering with Python (2.5) over the past few months, both in Google App Engine and in free-running apps/processes. My initial free-running apps ingested "structured" content from a variety of web APIs and would crash roughly every 12 hours, requiring a restart. I subsequently took a "short-lived" process approach to managing these apps: cron would check that they were still running and restart them if they'd died.
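For illustration, the cron-driven check could look something like the sketch below. It is not the script I actually ran: it assumes Python 3 on a POSIX system, and it assumes the app's PID is stored in a PID file. The function names, the PID-file convention, and the paths in the trailing comment are all hypothetical.

```python
import os
import subprocess


def is_running(pid):
    """Return True if a process with this PID exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence; nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def read_pid(pid_file):
    """Return the PID stored in pid_file, or None if it is missing or malformed."""
    try:
        with open(pid_file) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def ensure_running(pid_file, command):
    """Start command unless the recorded PID is alive. Return True if it was (re)started."""
    pid = read_pid(pid_file)
    if pid is not None and is_running(pid):
        return False
    proc = subprocess.Popen(command)
    with open(pid_file, "w") as f:
        f.write(str(proc.pid))
    return True


# Hypothetical usage from a script that cron invokes every few minutes:
# ensure_running("/tmp/ingest.pid", ["python", "/path/to/ingest.py"])
```

One caveat with this approach: PIDs get reused, so a stale PID file can occasionally point at an unrelated live process and mask a crash.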

More recently I built an app that digested data from a known clean API (Gnip). Digesting data from Gnip ensures consistency in data format and structure. As a result, the Python app has been running for several days without issue. Now, of course, the earlier apps crashed because of bugs in code I wrote, but bulletproofing against broken/dirty/poorly-encoded/inconsistently-encoded data coming from random web APIs is a pain. Covering every case in modern apps takes a lot of energy. For those apps I opted for the "bounce it" strategy rather than debugging the issue (a major time sink due to variability and inconsistency; any engineer's worst nightmare).
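To give a flavor of what that bulletproofing involves, here is a minimal sketch of defensive decoding and parsing. It assumes Python 3 (not the 2.5 I was using) and JSON payloads arriving as raw bytes. The function names and the fallback encoding list are hypothetical choices for the example, and real feeds would need more cases than this.

```python
import json


def decode_bytes(raw, encodings=("utf-8", "cp1252")):
    """Decode raw bytes by trying each candidate encoding in order."""
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue
    # Last resort: keep going, substituting U+FFFD for undecodable bytes.
    return raw.decode("utf-8", errors="replace")


def parse_record(raw):
    """Return the payload as a dict, or None if it is not a usable JSON object."""
    try:
        record = json.loads(decode_bytes(raw))
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return None
    return record if isinstance(record, dict) else None
```

The point is that a bad record gets skipped (the caller sees None) instead of taking the whole process down, which is exactly the work a clean upstream source saves you from doing.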

The new application has instilled faith in Python as a choice for long-lived app processes, and reinforced how important clean input data is.

