Sorting my socks with deep learning — Part 1

Can I arrange my sock drawer using machine learning? An adventure in pairing my socks: a small project using transfer learning to build a custom sock image classification model, then training and deploying it on AWS DeepLens hardware to identify my socks.

From sock classification madness to neatly paired

The Idea

A deep learning model saving me from household chores

This project is also a good chance to test (or justify?) my new AWS DeepLens — a “deep-learning enabled video camera”. The DeepLens is a $300 hardware device with a 4MP video camera, an Intel Atom processor and 8GB of RAM, and it runs Ubuntu. Plenty of hardware to help sort my socks.

A DeepLens in sock-classification action

This project and all the files are available here.

Image Capture

One sock at a time gets a photo

Taking photos of socks is just as exciting as it sounds. Luckily, I found around 30 photos of each sock were sufficient to accurately identify each sock variety, so it didn’t take too long to capture the images.

Each sock has around 30 different photos

To save time, I made sure each sock filled the camera frame. This meant I didn’t need to add bounding boxes to the images, as each frame contained nothing but a single sock.

Image Preparation

The exciting task of image preparation

While you can train a model on a folder full of images, it’s preferable to optimise the image format so that training is less I/O intensive. I’ll prepare my dataset of sock images in ImageRecord format.

To prepare my sock dataset in ImageRecord format, I first need to generate a .lst file: a text file listing each image’s classification label and filename. I can use the Apache MXNet im2rec.py Python script to help with these steps:

python3 -m venv myenv
source myenv/bin/activate
pip install mxnet opencv-python
curl --output im2rec.py https://raw.githubusercontent.com/apache/incubator-mxnet/master/tools/im2rec.py

I chose to split the photos into 80% training images and 20% for model validation. My data had 8 classes (i.e., 8 different sock varieties), each with around 30 images. To randomly allocate the images into training and validation sets, I run:

python im2rec.py --list --train-ratio 0.8 --recursive ./sock-images_rec sock-images/

After execution, you’ll find the generated files sock-images_rec_train.lst and sock-images_rec_val.lst. These text files list the images and their classifications for the training and validation sets:

wc -l *.lst
544 sock-images_rec_train.lst
137 sock-images_rec_val.lst
681 total
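
Each line in a .lst file uses im2rec’s tab-separated format: an integer index, the numeric class label, and the image path relative to the image root. For example (these filenames are illustrative, not my actual files):

0	3	blue-stripe/IMG_1001.jpg
1	0	black-plain/IMG_1002.jpg
2	7	red-spot/IMG_1003.jpg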

Now I want to actually crunch my sock images into ImageRecord files, resizing each image down to 512x512 pixels:

python im2rec.py --resize 512 --center-crop --num-thread 4 ./sock-images_rec ./sock-images/

This gives you four more files (sock-images_rec_train.idx, sock-images_rec_train.rec, sock-images_rec_val.idx, sock-images_rec_val.rec). Now I’ve got ImageRecord files for both training and validation. I’ll copy these up to an S3 bucket so I can run the training in the cloud:

aws s3 cp . s3://deeplens-sagemaker-socksort --exclude "*" --include "*.idx"  --include "*.rec"  --include "*.lst" --recursive
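
As a quick sanity check, you can read a record back out of the packed files with MXNet’s recordio reader before relying on them for training. A minimal sketch (index 0 simply grabs the first image):

import mxnet as mx

# Open the packed training set via its index file
record = mx.recordio.MXIndexedRecordIO(
    'sock-images_rec_train.idx', 'sock-images_rec_train.rec', 'r')

# Unpack the first record into its header (label) and decoded image
header, img = mx.recordio.unpack_img(record.read_idx(0))
print(header.label, img.shape)  # class label and a (512, 512, 3) pixel array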

Model Training

Now we want to train an image classification model that can classify sock images. We’ll use transfer learning to leverage the knowledge gained from similar classification tasks to bootstrap my sock classification model. I can run this model training in a hosted Jupyter notebook (AWS SageMaker) in Python.

The basic steps are:

  • Import the sock dataset in RecordIO format.
  • Build an image classification model — see sock-classification.ipynb. All going well, after running this notebook you should have a model file called ${S3_BUCKET}/ic-transfer-learning/output/image-classification-${DATE}/output/model.tar.gz
  • Deploy a temporary classifier to test the inference function, that is, spin up the model with a temporary endpoint (see the sketch after this list).
  • Test that a few demonstration images are correctly classified.
  • Convince myself this is time well spent.
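
As a rough sketch of what the notebook does, the SageMaker built-in image classification algorithm handles the transfer learning via its use_pretrained_model hyperparameter. The S3 prefixes, instance types and hyperparameter values below are illustrative; the real ones live in sock-classification.ipynb:

import sagemaker
from sagemaker import get_execution_role
from sagemaker.amazon.amazon_estimator import get_image_uri

role = get_execution_role()
sess = sagemaker.Session()

# Container image for SageMaker's built-in image classification algorithm
training_image = get_image_uri(sess.boto_region_name, 'image-classification')

ic = sagemaker.estimator.Estimator(
    training_image, role,
    train_instance_count=1,
    train_instance_type='ml.p2.xlarge',
    output_path='s3://deeplens-sagemaker-socksort/ic-transfer-learning/output',
    sagemaker_session=sess)

ic.set_hyperparameters(
    use_pretrained_model=1,    # transfer learning: start from pretrained weights
    image_shape='3,512,512',   # channels, height, width
    num_classes=8,             # the 8 sock varieties
    num_training_samples=544,  # lines in sock-images_rec_train.lst
    epochs=10,
    learning_rate=0.001)

train_data = sagemaker.session.s3_input(
    's3://deeplens-sagemaker-socksort/train/',
    content_type='application/x-recordio')
val_data = sagemaker.session.s3_input(
    's3://deeplens-sagemaker-socksort/validation/',
    content_type='application/x-recordio')

ic.fit({'train': train_data, 'validation': val_data})

# Spin up a temporary endpoint, sanity-check one image, then tear it down
predictor = ic.deploy(initial_instance_count=1, instance_type='ml.m4.xlarge')
predictor.content_type = 'application/x-image'
with open('test-sock.jpg', 'rb') as f:
    print(predictor.predict(f.read()))  # probabilities for the 8 classes
predictor.delete_endpoint()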

A more extensive explanation is available here.

Deeplens MXNet Model & Lambda Function

To create this local run-time project, we need to build, publish and deploy a Sock Sort AWS DeepLens Inference Lambda Function. This is essentially a zip file containing my Python code (which pulls images from the camera and calls the classification model), supporting Python libraries (such as MQTT messaging libraries) and some basic configuration settings.

Steps to build sock_deeplens_inference_function.zip are detailed here, as are the steps required to load our previously created object classification model onto my DeepLens hardware.

Publish Sock Sort AWS DeepLens Inference Lambda Function

DeepLens with model and local Lambda sending predictions to browser and MQTT

Now we have a trained model and a Python inference Lambda function deployed locally to my DeepLens hardware. Powering up the device runs every image captured by the camera through the classification model. If a likely match is found, the details are overlaid on the image. I can review a live camera feed within a web browser, with a text overlay displaying the sock classification. I’m also pushing each classification prediction onto an MQTT topic (more about that in part 2).
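
For a feel of what that Lambda does, here’s a minimal sketch of a DeepLens inference loop using the standard awscam and mo device APIs. The topic name, confidence threshold and label names are illustrative; the real code is in the project files:

import awscam            # DeepLens camera and on-device inference API
import cv2
import greengrasssdk
import mo                # DeepLens model optimizer

client = greengrasssdk.client('iot-data')
labels = ['sock-variety-%d' % i for i in range(8)]  # hypothetical label names

# Optimize the trained MXNet model for the on-board inference engine
error, model_path = mo.optimize('image-classification', 512, 512)
model = awscam.Model(model_path, {'GPU': 1})

while True:
    ret, frame = awscam.getLastFrame()
    if not ret:
        continue
    # Match the 512x512 input size the model was trained with
    resized = cv2.resize(frame, (512, 512))
    inference = model.doInference(resized)
    results = model.parseResult('classification', inference)['classification']
    top = max(results, key=lambda r: r['prob'])
    if top['prob'] > 0.5:  # only publish likely matches
        client.publish(topic='sockfound', payload=labels[top['label']])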

Socks — identified with model

To be continued …

Have a look at part 2 — Kafka stream processing: sorting socks with ksqlDB.

Socks — paired!

Project Files

Day job: data streaming & system architecture. Night gig: IoT and random project hacking