Wednesday, December 26, 2018

Machine Play


The only really cool software things that have come along in the past several years have been a rise of the machines in the form of Machine Learning, and serverless/microservice/Kubernetes clouds. The former is a complete framework game changer for humans, and the latter just a really cool evolution of a concept that's been lingering since the dawn of computing.

Amazon's foray into the computer vision market was DeepLens a while ago. I thought it was odd that they combined both the software and hardware (camera; note, the camera is only suitable for indoor use) ends of the product. Why in the heck would you include a camera with a product like this!?! Several months later the Goog released access to their genericized ML backend; Google Cloud Vision in the back of AutoML was cast unto thee.

Well, after using Google's Cloud Vision and a 3rd party outdoor webcam to cobble together a system that SMSes me when one of the deck furniture covers gets blown off in the wind (project breakdown below), I now see exactly why Amazon bundled the two. I only wish they'd provide an outdoor-grade camera version of the offering. It turns out, unsurprisingly, that the pics you feed the backend are such an integral part of the training/prediction process that tightly coupling the two is very important (well done, Amazon Product Managers who realized this early). I believe Amazon also does onboard model prediction, which sounds cool, but I'm not getting the necessity of this feature. I think the Goog got it right by just offloading the evaluation bit to the network (their modus operandi, obviously) via URL image data retrieval. Sure, there are applications for on-board execution, but they seem more specialized than most use cases will require.

The Project

I cover the outdoor furniture on my rooftop deck with covers to protect it from the elements (Colorado is pretty tough on the weather front). It's often really windy up there and the covers regularly get blown off. The problem is that I don't get topside as often as I'd like, and the furniture could be left uncovered for a while, exposing it to the damaging sun. So, I wanted to get a notification when a cover was blown off so I could go re-cover the furniture. Obviously traditional image parsing/recognition solutions would be horribly unreliable and hard to use for this, so, enter ML. I wanted an outdoor webcam to take pics of the deck and have Google Cloud Vision determine with high accuracy (90%+) whether or not a cover had been blown off, and text me if so.
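
The decision at the heart of that idea can be sketched in a few lines of Python. This is a minimal sketch and not the actual script: the label name `uncovered`, the 0.90 score threshold (borrowed loosely from the 90%+ accuracy goal), and the `predict` and `notify` callables are all assumptions standing in for the real Cloud Vision and Twilio calls.

```python
from typing import Callable, Mapping

# Assumptions for illustration only: the label name and the threshold
# are hypothetical, not taken from the real model or script.
UNCOVERED_LABEL = "uncovered"
THRESHOLD = 0.90


def cover_blown_off(scores: Mapping[str, float], threshold: float = THRESHOLD) -> bool:
    """Return True when the 'uncovered' label scores at or above the threshold."""
    return scores.get(UNCOVERED_LABEL, 0.0) >= threshold


def check_deck(
    image_url: str,
    predict: Callable[[str], Mapping[str, float]],
    notify: Callable[[str], None],
    threshold: float = THRESHOLD,
) -> bool:
    """Score one image and send a notification only if a cover looks blown off.

    `predict` stands in for the prediction backend and `notify` for the
    SMS sender; both are injected so the logic can be exercised offline.
    """
    scores = predict(image_url)
    if cover_blown_off(scores, threshold):
        score = scores[UNCOVERED_LABEL]
        notify(f"Deck cover blown off (score {score:.2f}): {image_url}")
        return True
    return False
```

Passing the backend and the SMS sender in as plain functions keeps the threshold logic testable without any network access or credentials.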

The Pieces

  • Input
    • Amcrest outdoor IP Camera - mounted outside and aimed at the deck furniture. The camera hardware for this is great, but the firmware/on-board HTTP server/app is awful and stuck in the late 1990s. If anyone has experience with a better outdoor IP webcam (no, Nest doesn't natively work), please let me know.
  • Processing Nodes
  • Software
    • my app/driver - Python script (code is here) that cron runs every ten minutes on the droplet.
    • FTP server - vsftpd. The webcam's firmware design is as old as dirt and only talks FTP for snapshot images.
    • bash script that cron runs every ten minutes for copying/renaming latest image capture to the HTTP server so the main app can access it.
    • Google Cloud Vision - used to predict whether or not an image of my deck furniture has any of the covers blown off of it.
  • Output
    • Twilio API - used to send me an SMS message when a cover has been blown off.
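
The staging step in the list above (copying/renaming the latest capture so the main app can fetch it over HTTP) is done with a bash script in my setup. For illustration, here is roughly the same idea as a Python sketch. The function name, the `*.jpg` pattern, and the directory layout are assumptions, not what the real script uses.

```python
import shutil
from pathlib import Path
from typing import Optional


def stage_latest_snapshot(upload_dir: Path, published: Path) -> Optional[Path]:
    """Copy the newest .jpg under upload_dir to a fixed, published path.

    Returns the source file that was staged, or None if there is nothing
    to stage. The copy goes to a temporary name first and is then renamed,
    so a reader never sees a half-written image (the rename is atomic when
    both paths are on the same filesystem).
    """
    candidates = [p for p in upload_dir.rglob("*.jpg") if p.is_file()]
    if not candidates:
        return None
    newest = max(candidates, key=lambda p: p.stat().st_mtime)
    tmp = published.with_name(published.name + ".tmp")
    shutil.copy2(newest, tmp)
    tmp.replace(published)
    return newest
```

Publishing under one stable filename means the prediction step can always be pointed at the same URL, whatever the camera named the upload.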


The process of building a model on someone else's engine is unbelievably simple. If you can imagine it... the computer can model it and predict it.

Labeling image data is a major pain and very time consuming. While model prediction/execution is super fast after you've trained it, the labeling process required to train is horribly cumbersome. Looks like someone's entered the market to start doing the hard work for us; I'll give CloudFactory a try next time I need to build a model (which is pretty soon actually given that my cover configuration has changed already).

We are going to accelerate from zero to sixty very quickly with ML-backed image apps. I imagine app providers offering integration solutions with existing webcam setups that allow consumers to easily train a model for whatever visual scenario they want to be notified about (cat is out of food, plant is lacking water, garage door is open, bike is unlocked, on and on and on). Of course, you can apply all of this logic to audio as well. The future is going to be cool!

What Could Be Better

  • I should collapse the file/FTP server and the app server onto either the WAN based Droplet, or the LAN based Pi server.
  • The webcam. While the hardware is great, the software on the camera only supports SMB/FTP for snapshot storage. If the camera supported snapshot via HTTP I could forgo this interim image staging framework entirely. There might be joy in this forum post... I'll need to dig in and see.
  • I need to format the SMS message to be Apple Watch form-factor friendly.
  • I need to reap/cull images after some duration.
  • As far as I can tell, Google Cloud Vision data models can't be augmented *after* they've been trained. I'd like to add revised image data without having to rebuild/retrain the entire model. This seems like a pretty big gap. All of the image ML prediction scenarios I can think of are going to trend toward wanting to augment/add new image data over time without having to maintain the original seed model data.
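
For the reap/cull item above, something along these lines would probably do. This is a sketch under assumptions: the function name, the flat directory of `*.jpg` files, and the age-by-mtime rule are mine for illustration and aren't in the current code.

```python
import time
from pathlib import Path
from typing import List, Optional


def cull_old_images(
    image_dir: Path, max_age_days: float, now: Optional[float] = None
) -> List[Path]:
    """Delete .jpg files in image_dir last modified more than max_age_days ago.

    `now` is a Unix timestamp; it defaults to the current time and is only
    a parameter so the cutoff can be pinned in tests. Returns the removed paths.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    removed = []
    # Materialize the listing first so files aren't deleted mid-iteration.
    for path in sorted(image_dir.glob("*.jpg")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed
```

Run from the same cron schedule as the rest, a call like `cull_old_images(Path("/some/snapshot/dir"), max_age_days=7)` would keep roughly a week of captures (the path here is a placeholder).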

Friday, December 21, 2018

Taking Charge Of My Attention

Over the past few weeks I’ve experimented with leaving my Phone at home when I head out for the day. The release of iOS Screen Time shocked me. Having a look at my raw usage data around how much I was using my devices/apps pushed me into trying some big changes like leaving my Phone behind.

No surprise, I haven’t really missed my phone. What made the shift possible was that my Watch lets me do the communication-in-a-pinch and payment stuff I need during the day when I’m not near an iPad or laptop.

That said, I wish I had my phone with me when I want to take a picture of something, and when I want to use a home automation app. That’s pretty much it though. If I can keep this up I’m going to look into a point-n-shoot camera I can tote around in lieu of the Phone.

As this experiment evolved, my awareness (experiential as well as stats from Screen Time) surrounding Notifications heightened. Possibly more poisonous to society than screen time itself is the interrupt-driven life we lead thanks to Notifications.

I vividly remember when Apple released Notifications on iOS. I was enamored and immediately foresaw a future wherein asynchronous, rich notifications would allow deep-linking into our apps. Well, we’re pretty much there and it’s a nightmare come true (just go look at your Settings->Screen Time->Notifications). Remember when you realized you were a Pavlovian dog hitting “get mail” every time your mail client would ding at you about new mail? Transfer that behavior to dozens of apps on your mobile device. Dopamine drip. Drip. Drip.

I’m slowly shutting Notifications off completely (Watch, iOS, and OSX) in most of my apps as I realize that I really don’t need to know, save for just a few cases, when some app has something to say. When I want to know something, I’ll go check it out on my own; on my time, not someone else’s who is simply trying to “drive engagement.” Needless to say the meaning of “breaking news” left us long ago, so I don’t need those notifications.

One elusive app has been the Phone app, which rings with spam all day long. I installed an app called Hiya, which does a great job blocking the nonsense.

Unfortunately, neither iOS nor OSX supports a system-wide way to disable Notifications. You can kind of hack around it with extended “Do Not Disturb” schedules, but you wind up doing damage in other areas that way.

In the communications app category (iMessage... email) I’ve realized there’s a missing level of Notification behavior that I’d like to see. Something like “Response Notifications.” As A User, I Want To know when someone has responded to communications I have initiated, In Order To receive notifications I care most about. If I initiate an exchange, I want to be notified when others respond. If someone initiates an exchange with me, I’ll get to it on my time.
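
Boiled down, the rule I'm describing is tiny. Here it is as a Python sketch. The `Message` shape and its field names are made up for illustration; no real messaging API exposes exactly this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """Hypothetical message record: who started the thread, who sent this one."""
    thread_initiator: str
    sender: str


def should_notify(message: Message, me: str) -> bool:
    """Notify only for replies from others in exchanges that I started."""
    return message.thread_initiator == me and message.sender != me
```

Anything in a thread someone else started stays quiet until I go looking for it.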

I can hear the people that have been saying “do you really need to check your mail or text that person as you walk down the street” for a decade now, ringing in my ear. Guess what, I don’t.

Digital life is messy.