LevelDB and a Raspberry Pi, will it suffice?

The transition from my Delphi/MS SQL/IIS/ASP.Net-based home automation system to Raspberry Pi/Node.JS/MQTT/[…] is progressing very well; just a few drivers are left to translate (Visonic Powermax, X10, Fritz!Box and Zigbee).

However, I don’t work with a road map, milestones or anything else that would only limit what I’ll be doing in my next couple of hours of spare time – I just do whatever is the most fun thing to do at that moment… that’s what a hobby is all about, and with Home Automation there’s always a lot of exciting stuff to work on! This week the term ‘storage’ kept popping up in my mind, so those drivers just have to wait a bit longer until I’m ready for them – I’m not a driver mass-production facility.

So last week was ‘storage week’: time to have a look at some other components that will need to be replaced, the DB engine for instance and the hardware underneath. With MQTT and its retain feature I’ve always got all the current device values available, but I hadn’t really thought about how to handle all the historical data that has been collected since September 2006. Here are some statistics about my current MS SQL historical data table:

  • Number of historical items: 121
  • Data space: 792 MB
  • Index space: 700 MB
  • Row count: 1,051,068

That may sound like a lot, but it isn’t. Well, for a domotica system, maybe it is 😉

MySQL, Redis and LevelDB crossed my mind; using MySQL would probably be the easiest transition, especially since I’ve always avoided DBMS-specific features, which would only result in a self-made vendor lock-in. But there’s no challenge in that, so let’s try something else for a change and see how it works out. Redis is an in-memory key-value store and I had no idea whether it would be able to handle this amount of data on a Raspberry Pi, so the first DB engine I tried was LevelDB.

Last week I wrote a Node.JS script that did exactly the same thing with LevelDB as what my ‘old’ system did with MS SQL server: storing all the data (device values) I wanted to keep. Exactly the same way, including parameters like storage interval and repeat interval, all based on the type of information concerned. For example, total power and gas usage are stored with just 1 value per hour but kept forever, while current power usage is stored on each change but with a ‘TTL’ (Time To Live) of 30 days.
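
To give an idea of what that boils down to: below is a simplified sketch, not my actual driver code, assuming the ‘level’ package as the Node.JS binding for LevelDB. Keys are built as <item>~<timestamp> so they sort chronologically per item, and since LevelDB has no built-in TTL, expiry is emulated with a periodic sweep (there are also ready-made modules like level-ttl for this):

    // Simplified sketch – assumes the 'level' package (levelup + leveldown).
    var level = require('level');
    var db = level('./history');

    // Key scheme: <item>~<ISO timestamp>; keys sort chronologically per item.
    function store(item, value, cb) {
      db.put(item + '~' + new Date().toISOString(), String(value), cb);
    }

    // LevelDB has no built-in TTL, so expiry here is a periodic sweep that
    // deletes all keys of an item older than the cut-off (e.g. 30 days).
    function expire(item, days) {
      var cutoff = new Date(Date.now() - days * 86400000).toISOString();
      var ops = [];
      db.createReadStream({ gte: item + '~', lt: item + '~' + cutoff, values: false })
        .on('data', function (key) { ops.push({ type: 'del', key: key }); })
        .on('end', function () {
          db.batch(ops, function (err) { if (err) console.error(err); });
        });
    }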

But after a week I saw that the LevelDB database wasn’t growing fast enough to get a really good look at how the DB engine would perform in, say, 5 years from now, so I decided to speed things up a bit and copy 7 years of gas usage data from MS SQL to LevelDB. That way, I could do some close-to-real-life testing and, for instance, see how long it would take to retrieve the information and build charts similar to these with it. Of course, not with the .Net charting software anymore… another thing I have to look at in the near future.

I took a spare Raspberry Pi, an old 1GB USB stick, mounted the stick on the RPi, created the script, and let it run without any idea of how long this would take.

Ooh wait, the script has already done its job: 45455 rows in 72 seconds – that’s faster than I anticipated, actually… OK, I admit, I dramatically underestimated the capabilities of the RPi here!

A quick calculation showed me that I should be able to ‘pump’ all 1,050,000 SQL rows into my LevelDB in about half an hour – let’s try it!
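
The script itself is nothing special: read rows on one side, batched put operations on the other. A rough sketch of the LevelDB half – not the actual script; here I assume the MS SQL rows were first exported to a JSON-lines file (‘gasusage.jsonl’ is a made-up name):

    // Hypothetical migration sketch: one exported MS SQL row per line,
    // e.g. { "item": "gasusage", "ts": "2006-09-01T00:00", "value": 1234.5 }
    var fs = require('fs');
    var readline = require('readline');
    var level = require('level');

    var db = level('./history');
    var ops = [];

    var rl = readline.createInterface({ input: fs.createReadStream('gasusage.jsonl') });
    rl.on('line', function (line) {
      var row = JSON.parse(line);
      ops.push({ type: 'put', key: row.item + '~' + row.ts, value: String(row.value) });
      if (ops.length >= 1000) {           // write in chunks to keep memory usage flat
        db.batch(ops.splice(0), function (err) { if (err) throw err; });
      }
    });
    rl.on('close', function () {
      db.batch(ops, function (err) {      // flush the remainder
        if (err) throw err;
        console.log('done');
      });
    });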

SQL to LevelDB

Just over 30 minutes were needed to transfer more than 1 million rows to a Raspberry Pi with an old USB stick attached to it, out of the box… no overclocking, fancy storage, optimization, nothing of that kind – I didn’t expect this, to be honest.

BTW, the resulting size in terms of allocated disk space for the LevelDB database is <13 MB. From almost 1.5 GB (data + index) down to 13 MB – another huge surprise:

LevelDB disk space

I wonder what the performance will be when you do this with some more powerful boards, like a Cubietruck or a quad-core Cortex-A9 board like the Wandboard Quad, with just a typical SSD attached as storage? Can’t wait to find out… but let’s not waste too much energy/money on unneeded performance – so that’s why I’m first going to try the Cubietruck as soon as it’s available here in NL.

But this is just half of the work of course, because now I can do the tests I need to do – see how, and how fast, I can create charts with all this data.
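
Thanks to the key scheme, fetching the data for a chart should be a single range scan over the keys of one item; something like this (again just a sketch on top of the ‘level’ package):

    var level = require('level');
    var db = level('./history');

    // Fetch all stored values of one item between two timestamps ('to' is
    // exclusive), relying on the lexicographic order of <item>~<timestamp> keys.
    function chartData(item, from, to, cb) {
      var points = [];
      db.createReadStream({ gte: item + '~' + from, lt: item + '~' + to })
        .on('data', function (d) {
          points.push({ t: d.key.split('~')[1], v: parseFloat(d.value) });
        })
        .on('error', cb)
        .on('end', function () { cb(null, points); });
    }

    // e.g. 7 years of gas usage readings:
    chartData('gasusage', '2006-09-01', '2013-12-01', function (err, points) {
      if (err) throw err;
      console.log(points.length + ' data points');
    });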

Less than 48 hours ago Pieter asked me if I had already worked with AngularJS-based web pages with data from LevelDB – I replied that it could take some time before I’d reach that point, but now I’m not so sure anymore, because everything has changed in the last 48 hours. I think I’ve got enough data to give it a try soon – very soon!


4 Responses to LevelDB and a Raspberry Pi, will it suffice?

  1. Auke says:

    I’m new to the whole domotica business. Recently our grid operator installed a smart meter and I have a Raspberry Pi in search of a new project. After an afternoon of tinkering, the Raspberry is now reading values from the smart meter. But how to store them? This LevelDB sounds quite interesting. What is, in your opinion, the best way to store the values from the smart meter? As separate key-value pairs for gas and electricity? Or as values for both of them, with the timestamp as key? Or something different?

    • Hi Auke,
      In my opinion it would be best to store each value with a separate key (e.g. key=powerusage~201312011700, value=123.45). That’s how I do it, but there are more ways of doing it of course. This way it’s very easy to ‘query’ LevelDB and get a result back that works very well for creating charts. Another way could be key=smartmeter~201312011700, value={powerusage: 123.45, gas: 33.5, powerprod: 3.15, etc.}. I have a LevelDB with >1M keys in it right now and I’m going to do some tests with it to see how fast LevelDB really is; that might interest you too, I guess, so stay tuned 🙂
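
      In code, the two variants would look something like this (just a sketch, assuming the Node.JS ‘level’ package):

          var level = require('level');
          var db = level('./smartmeter');

          function logErr(err) { if (err) console.error(err); }

          // Variant 1: a separate key per measurement type
          db.put('powerusage~201312011700', '123.45', logErr);
          db.put('gas~201312011700', '33.5', logErr);

          // Variant 2: one key per timestamp, all readings in one JSON value
          db.put('smartmeter~201312011700',
                 JSON.stringify({ powerusage: 123.45, gas: 33.5, powerprod: 3.15 }),
                 logErr);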

  2. Michael MD says:

    Did you encounter any problems with the USB stick after a while?

    I had a Pi also using a LevelDB database on a USB stick, and though it worked well at first and was quite fast, the database became corrupt after a few weeks. (It was used for a pump.io instance – a federated social network – so it possibly used the database somewhat more heavily. It was pretty fast though!)

    I’m guessing most databases are optimised for normal magnetic hard drives and flash storage is a bit different.

    I’m thinking some kind of append-only storage, with a separate script to rebuild the database occasionally to remove deleted items if needed, might be a good fit for flash storage (less rewriting).

    • Robert Hekkers says:

      The setup I used didn’t run for that long. I switched to a Cubietruck, attached an SSD to it, moved the OS to the SSD and put MySQL on it as well. It’s been running flawlessly since Feb 2014.
