So, I’ve got a Cubietruck with a 60 GB SSD attached to it. On the SSD there’s a LevelDB database holding historical data for 121 device values. The database contains about 1,050,000 keys in total, and filling it from scratch, with an MS SQL Server table as ‘source’, takes about 8 minutes.
That means I can put (write) more than 2,000 entries per second into the database. That will do. Just kidding – that’s more than I’ll ever need, of course… but what about the get (read) performance? Being able to read 2,000 values per second from a database isn’t that much, so I hoped the read performance would be even better. A small test showed me that LevelDB could produce about 4,500 values per second. Would it be possible to improve on that? A lot, please?
I decided to do some testing, so that I wouldn’t end up with a user interface full of charts that take ages to load. I wrote a small script that queries the database and retrieves all the history for each device value in it. The result set, retrieved with LevelUP’s createReadStream() function, can be anywhere from 4 to ~50,000 values in size. By counting the number of returned values and the time needed to get them (and push them onto an array for further processing), I could get an indication of how fast LevelDB really is – based on a real database (not just a bunch of “123456abcdef” keys…) and on result sets ranging from ‘very small’ to ‘very large’. As in real life.
And why not run the same test against MySQL and see how that one performs as well? With an ‘out of the box’ MySQL installation (so no performance-enhancing tricks) I created a database with a single table in it. It contains the same data as the LevelDB version: device value ID, timestamp and value (and some more fields). And an index, of course:
CREATE TABLE `data` (
  `TIME` datetime NOT NULL,
  `LGDEVICEID` varchar(25) NOT NULL,
  `START` decimal(16,6) DEFAULT '0.000000',
  `VALUE` decimal(16,6) DEFAULT '0.000000',
  `COUNT` int(11) DEFAULT '1',
  UNIQUE KEY `1` (`LGDEVICEID`,`TIME`),
  KEY `time` (`TIME`,`LGDEVICEID`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
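For reference, the per-device query the read test would run against this table could look something like the following (the device value ID is just a placeholder):

```sql
SELECT `TIME`, `VALUE`
FROM `data`
WHERE `LGDEVICEID` = 'device-123'  -- hypothetical device value ID
ORDER BY `TIME`;
```

Thanks to the UNIQUE KEY on (`LGDEVICEID`, `TIME`), MySQL can probably answer such a query with a single index range scan, already sorted by time.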
Filling this table with the same data as the LevelDB database took quite a bit longer: 56 minutes vs. 04:19 for LevelDB… I hoped this wasn’t indicative of the read performance.
Well, here are the numbers:
The numbers show the average number of rows per second each database delivered for each result-set size range. An example for LevelDB: 100 createReadStream() calls returned a total of 62,606 values in 13,122 milliseconds, i.e. an average of 4,771 values/second. For MySQL, the same number of query() calls returned those 62,606 values in 5,996 milliseconds, i.e. an average of 10,441 values/second.
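The values-per-second figures follow directly from the totals, as a quick back-of-the-envelope check shows (totals and durations taken from the example above):

```javascript
'use strict';

// Convert a value count and an elapsed time in milliseconds
// into a rounded values-per-second throughput figure.
const perSecond = (values, ms) => Math.round((values / ms) * 1000);

const totalValues = 62606; // values returned by 100 calls

console.log('LevelDB:', perSecond(totalValues, 13122), 'values/second');
console.log('MySQL:  ', perSecond(totalValues, 5996), 'values/second');
```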
The size ranges I tested with may seem a bit strange, but that’s simply because I didn’t have any historical data for a single device in the 5k…10k range – the gap between historical data that’s stored temporarily and historical data that’s stored forever.
The numbers show a clear winner: MySQL. I guess using a ‘traditional’ RDBMS for my historical data isn’t such a bad idea after all…
This is not what I expected, actually. All those hyper-new database engines must have some benefits – because what else would make them so popular lately? Well, I didn’t find them… at least not with my hardware, my historical data characteristics and my priorities… and since every second counts, I’m going for MySQL!