Collecting streaming data is easy, but understanding it is much harder! With months of 🐱 weight data captured, I discovered sensor data can be very messy ⚡. Let me share some of the real-world data problems I encountered, and how I solved my stream processing & cat dining challenges with ksqlDB.

Snowy the cat — an expert in streaming data

You may have seen me tweeting my cat's weight & dining habits with a Raspberry Pi. This small project captures the weight of both my cat and her food bowl and places these measurements into a Kafka topic about every second.

These weight measurements are pretty messy…
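One common way to tame noisy scale readings is a rolling median, which ignores short spikes. The snippet below is only a minimal Python sketch of that idea, not the ksqlDB approach the article goes on to describe; the function name `rolling_median`, the window size and the sample readings are all hypothetical.

```python
from collections import deque
from statistics import median


def rolling_median(readings, window=5):
    """Yield the median of the most recent `window` readings (hypothetical helper)."""
    recent = deque(maxlen=window)  # old readings drop off automatically
    for value in readings:
        recent.append(value)
        yield median(recent)


# Hypothetical cat-scale readings in grams, with one obvious spike (9000)
raw = [5100, 5105, 9000, 5098, 5102, 5101]
smoothed = list(rolling_median(raw, window=3))
print(smoothed)
```

With a window of three, the single spike never becomes the median, so it does not appear in the smoothed series.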


Our cat Snowy 😸 has been enjoying her meals over winter. I wanted to start collecting data on her eating habits, and analyse her weight over time. This data is collected with food and cat weight measurements, alongside photos taken by two cameras. Data is collected and images processed locally using a Raspberry Pi. At the risk of body-shaming our cat, updates are posted regularly to Twitter #snowydata.

Snowy enjoying her food while being measured

Project Overview

There are two scales: one for weighing the cat, and an independent one for her food. As Snowy steps onto the cat scale, an initial measurement is taken for her current…


It’s really exciting to have a new option for streaming Oracle data into Kafka. With Confluent releasing their “Oracle CDC Source Premium Connector” there’s a new way to capture data that has been added to, updated, or deleted from Oracle RDBMS with Change Data Capture (CDC). Think of this as a low touch way to stream both the Oracle data and schema changes into Kafka and Schema Registry.

I put together some examples so you can play with this entire pipeline in Docker. Docker is an easy way to run an end-to-end pipeline on your machine. …


If you read 1 Cat, 3 Clouds — Where is Snowy sleeping (Part 1) you’ll know I’m trying to build a real-time location tracker for my cat.

Following on from that initial blog (which described the hardware and networking), this “Part 2” article describes data processing with Amazon Web Services. I’ll be using native AWS services for data ingest, data processing and mapping visualisation to build a low cost pet tracking device to show Snowy’s location in real-time. Later articles will cover similar approaches in both Microsoft Azure & Google Cloud.

AWS IoT, DynamoDB and SageMaker

Today I’m describing my data build in AWS, however…


Our cat Snowy is an outdoor cat — but we are beginning to wonder where she travels once she’s left the house. Let me share how I built a low cost pet tracking device to show Snowy’s location in real-time.

Plus, what’s it like to build the same IoT data project in AWS, Azure & Google Cloud in 2021? This is article 1 in a 4-part series. This first blog post describes the project and details the hardware and networking components. The next three articles describe the experience of building the data pipelines with each major cloud.

Where is Snowy?

You may already…


Landing gear down and strobe lights active. How to add physical buttons, switches and dials to control computer actions with a $10 Raspberry Pi Zero.

Why would you want to do this? Well, you might want to add some additional buttons or switches to your favorite game. Or maybe you can streamline your video production by switching cameras with a foot pedal. For me, I wanted to add tactile controls to Microsoft Flight Simulator 2020: switches for the landing gear, beacons and lights.

Project code at — https://github.com/saubury/pi-peripheral/

Animated GIF showing box in usage

In short, Pi Peripheral can be used to control close to any…


Monitor good posture — with machine learning. Ensure proper office ergonomics with TensorFlow, a webcam and OpenCV Python real-time computer vision libraries. Code available at https://github.com/saubury/posture-watch

Real-time posture monitoring — as demonstrated with a teddy bear

If you sit behind a desk for hours at a time, you’re possibly going to slump. This may result in neck and back pain. This can be avoided by teaching a machine to recognise “good” posture from “bad” posture. To encourage better posture, instant feedback is given with a quick sound every time poor office ergonomics are observed. Model training and monitoring are handled locally on device.

Live Posture Classification — How does this work?

Machine learning packages available for Python 3


Cocktails based on your mood created by a Raspberry Pi bartender. Drinks are selected based on your emotion and multilingual voice prompts let you know when your drink is available.

Cocktails based on your mood created by a Raspberry Pi bartender

How to build a fully automated home-built cocktail maker. Code available at https://github.com/saubury/cocktail-pi 🍸

Pumps

I used 4 peristaltic pumps to provide a “food safe” way to pump the liquids from the drink bottles. These pumps provide a steady rate of flow of liquids when powered. …
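Because a pump with a steady flow rate dispenses a volume proportional to how long it is powered, pour sizes can be turned into run times with a simple division. The snippet below is a minimal sketch of that arithmetic only; the flow rate, the recipe and the helper name `pump_seconds` are assumptions for illustration and are not taken from the cocktail-pi code.

```python
def pump_seconds(volume_ml, flow_ml_per_s):
    """Return how long to power a pump to dispense `volume_ml` (hypothetical helper)."""
    if flow_ml_per_s <= 0:
        raise ValueError("flow rate must be positive")
    return volume_ml / flow_ml_per_s


# Assumed flow rate; a real pump would need to be measured and calibrated
FLOW_ML_PER_S = 1.5

# Hypothetical recipe: ingredient -> volume in millilitres
recipe = {"gin": 45, "tonic": 90}

# Run time in seconds for each pump
timings = {name: pump_seconds(ml, FLOW_ML_PER_S) for name, ml in recipe.items()}
print(timings)
```

In practice each pump would be timed against a measuring cup first, since tubing length and liquid viscosity can change the effective flow rate.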


Kafka serialisation schemes: playing with AVRO, Protobuf and JSON Schema in Confluent Streaming Platform. The code for these examples is available at https://github.com/saubury/kafka-serialization

Apache Avro has been the default Kafka serialisation mechanism for a long time. Confluent just updated their Kafka streaming platform with additional support for serialising data with Protocol buffers (or protobuf) and JSON Schema serialisation.

Kafka with AVRO vs. Kafka with Protobuf vs. Kafka with JSON Schema

Protobuf is especially cool, and offers up some neat opportunities beyond what was possible in Avro. The inclusion of Protobuf and JSON Schema applies to the producer and consumer libraries, Schema Registry, Kafka Connect and ksqlDB, along with Control Center. …


Sorting socks with a streaming solution? Pair socks with ksqlDB, Kafka and Kafka Connect.

Socks — sorted!

ksqlDB is a pretty cool event streaming platform. A few commands allow you to build a detailed real-time stream processing application. Capture events and perform continuous transformations on Kafka with a simplified SQL dialect. In addition, ksqlDB allows you to capture events from an external system using Kafka Connect.

In my previous blog classifying socks with deep learning I described deploying an object recognition model on AWS DeepLens hardware for identifying socks. …

Simon Aubury

Day job: data streaming & system architecture. Night gig: IoT and random project hacking
