So, I’ve got me a Cubietruck with a 60GB SSD attached to it. On the SSD there’s a LevelDB database with historical data for 121 device values. The total number of keys stored in this database is about 1,050,000, and it takes about 8 minutes to fill this database from scratch with an MS SQL Server table as ‘source’.
That means I can put (write) more than 2,000 entries into the database per second. That will do. Just kidding – that’s more than I’ll ever need, of course. But what about the get (read) performance? Being able to read only 2,000 values per second from a database is not that much, so I hoped the read performance would be even better. A small test showed me that LevelDB could produce about 4,500 values per second. Would it be possible to improve this? A lot, please?
I decided to do some testing, so that I wouldn’t end up with a user interface full of charts that take ages to load. I wrote a small script that queries the database to retrieve the complete history for each device value in the database. The result set, retrieved with LevelUP’s createReadStream() function, can be anywhere from 4 to ~50,000 values in size. By counting the number of returned values and the time needed to get them (and push them to an array for further processing), I got an indication of how fast LevelDB really is – based on a real database (not just a bunch of “123456abcdef” keys…) and on result sets ranging from ‘very small’ to ‘very large’. As in real life 😉
And why not do the same test with MySQL and see how that one performs as well? With a MySQL installation ‘out of the box’ (so no performance-enhancing tricks) I created a database with a single table in it. It contains the same data as the LevelDB version: device value ID, timestamp and value (and some more). And an index of course:
CREATE TABLE `data` (
  `TIME` datetime NOT NULL,
  `LGDEVICEID` varchar(25) NOT NULL,
  `START` decimal(16,6) DEFAULT '0.000000',
  `VALUE` decimal(16,6) DEFAULT '0.000000',
  `COUNT` int(11) DEFAULT '1',
  UNIQUE KEY `1` (`LGDEVICEID`,`TIME`),
  KEY `time` (`TIME`,`LGDEVICEID`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
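The benchmark queries against this table aren’t shown here; per device they would presumably look something like the following (the WHERE/ORDER BY clause is my assumption, chosen so MySQL can answer it from the UNIQUE KEY on (`LGDEVICEID`,`TIME`)):

```sql
SELECT `TIME`, `VALUE`
  FROM `data`
 WHERE `LGDEVICEID` = 'some-device-id'
 ORDER BY `TIME`;
```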
Filling this table with the same data as the LevelDB database took a bit longer: 56 minutes vs. 04:19 for LevelDB… I hope that isn’t indicative of the read performance 😉
Well, here are the numbers:
The numbers show the average number of rows per second each database can deliver for the given result-set size range. An example for LevelDB: 100 createReadStream() calls returned a total of 62,606 values in 13,122 milliseconds, i.e. an average of 4,771 values/second. For MySQL, the same number of query() calls returned those 62,606 values in 5,996 milliseconds, i.e. an average of 10,441 values/second.
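The averages are just total values divided by total elapsed seconds, e.g. for the LevelDB figures quoted above:

```javascript
// Average throughput: total values / (total milliseconds / 1000).
const totalValues = 62606;  // values returned by 100 createReadStream() calls
const totalMs = 13122;      // total elapsed time in milliseconds
const perSecond = Math.round(totalValues / (totalMs / 1000));
console.log(perSecond + ' values/second');  // → "4771 values/second"
```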
The size ranges I tested with may seem a bit strange, but that’s because I simply didn’t have any historical data for a single device between 5k…10k values – the gap between historical data that’s stored temporarily and historical data that’s stored forever.
The numbers show a clear winner: MySQL. I guess that using a ‘traditional’ RDBMS for my historical data is not such a bad idea after all…
This is not what I expected, actually. All those hyper-new database engines must have some benefits – because what else would make them so popular lately? Well, I didn’t find them… at least not with my hardware, my historical data characteristics and my priorities. And since every second counts, I’m going for MySQL!