The transition from my Delphi/MS SQL/IIS/ASP.Net based home automation system to Raspberry Pi/Node.JS/MQTT/[…] is progressing very well; just a few drivers are left to translate (Visonic Powermax, X10, Fritz!Box and Zigbee).
However, I don’t work with a road-map, milestones or anything else that would only limit what I do in my next couple of hours of spare time – I just do whatever is the most fun at that moment… that’s what a hobby is all about, and with Home Automation there’s always a lot of exciting stuff to work on! This week the term ‘storage’ kept popping up in my mind, so those drivers will just have to wait a bit longer until I’m ready for them – I’m not a driver mass-production facility.
So last week was ‘storage week’: time to have a look at some other components that will need to be replaced, such as the DB engine and the hardware underneath it. With MQTT and its retain feature I’ve always got all the current device values available, but I hadn’t really thought about how to handle all the historical data that has been collected since September 2006. Here are some statistics about my current MS SQL historical data table:
- Number of historical items: 121
- Data space: 792 MB
- Index space: 700 MB
- Row count: 1051068
That may sound like a lot, but it isn’t. Well, for a domotica system, maybe it is 😉
MySQL, Redis and LevelDB crossed my mind; using MySQL would probably be the easiest transition, especially since I’ve always avoided DBMS specific features which would only result in a self-made vendor lock-in. But there’s no challenge in that, let’s try something else for a change and see how it works out. Redis is an in-memory key-value store and I had no idea if it would be able to handle the amount of data on a Raspberry Pi, so the first DB engine I tried was LevelDB.
Last week I wrote a Node.JS script that did exactly the same thing with LevelDB as what my ‘old’ system did with MS SQL server: storing all the data (device values) I wanted to keep. Exactly the same way, including parameters like storage interval and repeat interval, all based on the type of information concerned. For example, total power and gas usage is stored with just 1 value per hour but kept forever, while current power usage is stored at each change but with a ‘TTL’ (Time To Live) of 30 days.
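The storage rules above could be sketched roughly like this in Node.JS – a minimal sketch with my own function names, not the actual code:

```javascript
// Rough sketch (my own names, not the author's code) of the storage
// rules described above: a per-type storage interval and an optional
// TTL after which old samples may be purged.

// Store a new sample only when the configured interval has passed
// since the previous stored one (e.g. 1 hour for gas/power totals).
function shouldStore(lastStoredMs, nowMs, intervalMs) {
  return lastStoredMs === null || nowMs - lastStoredMs >= intervalMs;
}

// LevelDB has no built-in TTL, so expiry can be emulated by keeping
// the write time with each sample and sweeping expired keys later.
function isExpired(writtenMs, nowMs, ttlMs) {
  return ttlMs !== null && nowMs - writtenMs > ttlMs;
}
```

A value that changes on every sample (like current power usage) would pass `shouldStore` every time; totals would use a one-hour interval and a `ttlMs` of `null`.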
But after a week I saw that the LevelDB database wasn’t growing fast enough to have a really good look at how the DB engine would perform in, say, 5 years from now, so I decided to speed things up a bit and copy 7 years of gas usage data from MS SQL to LevelDB. Doing that, I could do some close-to-real-life testing and, for instance, see how long it would take to retrieve the information and build charts similar to these with it. Of course, not with the .Net charting software anymore… another thing I have to look at in the near future.
I took a spare Raspberry Pi, an old 1GB USB stick, mounted the stick on the RPi, created the script, and let it run without any idea of how long this would take.
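A minimal sketch of what the core of such a copy script might look like, assuming the rows have already been read from MS SQL; the row shape, key layout and chunking are my assumptions, not the author's actual code. Grouping puts into chunks is what LevelDB's batched writes are made for:

```javascript
// Hypothetical helpers for the SQL-to-LevelDB copy script.
// A real version would hand each chunk to db.batch(ops, callback)
// on a LevelDB instance (e.g. via the 'level' npm package).

// Turn one SQL row into a LevelDB 'put' operation. Keys sort
// lexicographically in LevelDB, so a deviceId~ISO-timestamp layout
// (an assumption, not the author's scheme) gives range scans per device.
function rowToOp(row) {
  return {
    type: 'put',
    key: row.deviceId + '~' + row.timestamp.toISOString(),
    value: JSON.stringify(row.value)
  };
}

// Split the full row set into chunks so each batched write
// stays a reasonable size.
function chunk(rows, size) {
  const chunks = [];
  for (let i = 0; i < rows.length; i += size) {
    chunks.push(rows.slice(i, i + size));
  }
  return chunks;
}
```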
Ooh wait, the script has already done its job: 45455 rows in 72 seconds, that’s faster than I anticipated actually… OK, I admit, I really underestimated the capabilities of the RPi dramatically here!
A quick calculation showed me that I should be able to ‘pump’ all 1.050.000 SQL rows into my LevelDB in about half an hour – let’s try it!
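That quick calculation is easy to reproduce: extrapolate the measured rate of 45455 rows in 72 seconds to the full table:

```javascript
// Extrapolate the measured insert rate to the full table.
const rowsPerSecond = 45455 / 72;                 // roughly 631 rows/s
const totalRows = 1051068;
const etaMinutes = totalRows / rowsPerSecond / 60;
console.log(etaMinutes.toFixed(1) + ' minutes');  // just under 28 minutes
```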
Just over 30 minutes were needed to transfer more than 1 million rows to a Raspberry Pi with an old USB stick attached to it, out of the box… no overclocking, fancy storage, optimization, nothing of that kind – I didn’t expect this, to be honest.
BTW, the resulting size in terms of allocated disk space for the LevelDB database is less than 13 MB. From 1.4 GB down to 13 MB – another huge surprise, although LevelDB compressing its blocks (with Snappy, by default) and the absence of the SQL index overhead probably account for most of the difference:
I wonder what the performance will be when you do this with some more powerful boards like a Cubietruck, or a quad-core Cortex-A9 board like the Wandboard Quad with a typical SSD attached as storage? Can’t wait to find out… but let’s not waste too much energy/money on unneeded performance – that’s why I’m going to try the Cubietruck first, as soon as it’s available here in NL.
But this is just half of the work of course, because now I can do the tests I need to do – see how, and how fast, I can create charts with all this data.
Less than 48 hours ago Pieter asked me if I had already worked on AngularJS-based web pages with data from LevelDB – I replied that it could take some time before I’d reach that point, but now I’m not so sure anymore, because everything has changed in the last 48 hours. I think I’ve got enough data to give it a try soon – very soon!