Daemonizing drivers, Python, MQTT and web scraping

I’ve been doing some excursions during the last 12 months to figure out what would be the best way to replace my Windows-based Home Automation system. It has become rather large through the years and I want to get rid of that single point of failure, being that Windows executable that does it all. A few weeks ago I tried some things with a Raspberry Pi (RPi) and Python and was actually very much surprised by the speed with which I could build something from scratch in a matter of hours.

FEZ Panda II, Simplecortex are both nice and may be able to do the job just as well, but the RPi is just awesome compared to those 2 for  reasons I think I don’t have to explain.

Yesterday afternoon was the first time (after a long, long week full of all kinds of problems) that I could sit down behind my PC again and spend some time on things I like to do. That afternoon I tried to get a Python script to run as a daemon. Running a Python script from the command-line is not very useful for me (only for testing & debugging) cause what I really want is that the RPi starts all the tasks I want it to run on it automatically at boot time – just power-up the RPi and the rest goes automagically, no user interaction should be needed. So I decided to combine 3 subjects into this small experiment: a Python MQTT client, daemonizing Python scripts and web scraping; lets see how far I can get.

I started with a brand new RPi, SD card and new image written to it. Did the usual things like setting the hostname, time zone etcetera. After that I installed some Python related tools (e.g. installing modules) to make life easier. An MQTT client written in Python wasn’t very hard to find – Mosquitto (the OS MQTT server that I’m using since September last year) has one, so I installed that one too. I searched for info on how to create a daemon and found the python-daemon library. The last things I needed were some tools to make the process of web scraping somewhat more comfortable: regular expressions and requests. I found some examples on how to use all these libraries/tools mentioned above and started coding.

Well, is this something you can really call coding? Reading and understanding the examples I found, copying code snippets, adding some extra lines, deleting others – it feels more like blending actually 😉

I use web scraping to show various kinds of information on the User interfaces in our house – things like the amount and total length of traffic jams, a 2-hour rain forecast (BBQ!) for our specific location and stuff like that, so it looked like a good idea to have a look if I could make one of those scrapers run in a Python-based daemon on the RPi.

3 hours later I was finished, with this code as a result:

import logging
import time
import mosquitto
import requests
import re
from daemon import runner

class App():

    def __init__(self):
        self.stdin_path = '/dev/null'
        self.stdout_path = '/dev/tty'
        self.stderr_path = '/dev/tty'
        self.pidfile_path =  '/var/run/webfetcherd.pid'
        self.pidfile_timeout = 5

    def on_connect(self, mosq, obj, rc):
        logger.info("on_connect:"+str(rc))

    def on_message(self, mosq, obj, msg):
        logger.info(msg.topic+" "+msg.payload)

    def de_html(self, html):
        pattern = re.compile("<.*?>|&nbsp;|&amp;",re.DOTALL|re.M)
        return pattern.sub("",html)

    def run(self):
        while True:
            self.mqttc = mosquitto.Mosquitto("webfetcher")
            self.mqttc.on_message = self.on_message
            self.mqttc.on_connect = self.on_connect
            self.mqttc.connect("192.168.10.40", 1883, 60)
            rc = 0
            prvtime = 0.0
            while rc == 0:
                rc = self.mqttc.loop()
                if (time.time()-prvtime) > 60:
                    prvtime = time.time()
                    rq = requests.get('http://m.fileindex.nl/files.js')
                    rex = re.compile('"(.*)"')
                    m = rex.search(rq.text)
                    if m:
                        res = self.de_html(m.group())
                        logger.info('Match: %s', res)
                        self.mqttc.publish("test/trafficjams",str(res))
                    else:
                        logger.info('NO match: %s', rq.text)

logger = logging.getLogger("Webfetcher")
logger.setLevel(logging.INFO)
formatter = logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(messa
handler = logging.FileHandler("./webfetcher.log")
handler.setFormatter(formatter)
logger.addHandler(handler)
app = App()
daemon_runner = runner.DaemonRunner(app)
daemon_runner.daemon_context.files_preserve=[handler.stream]
daemon_runner.do_action()

Starting the daemon
Done! Ok, not completely – I didn’t make an init script yet and I don’t really like the way the MQTT client is called (line 36) and there are some other things that can probably be done better, especially when there’s I/O involved with hardware connected to the RPi; but for this goal and as result for a first alpha version this is OK I guess..

On to alpha2!

Raspberry Pi, P1 smart meter packets and Python

Yesterday evening I was looking for some fun thing to do; getting more familiar with the Raspberry Pi sounded nice. But what could I do? I needed something to work on, not just some fooling around with no end result.. and it should include working with Python, cause I’ve wanted to learn more about Python for quite some time but never got to it..

But when I landed on the website of Gé Janssen, I knew what I was going to do the next couple of hours: let’s see how easy it is to get the data of my smart meter into a MySQL database on the RPi!

RPi with USB Flash drive connected to an Arduino

RPi with USB Flash drive connected to an Arduino

Since I didn’t want to disrupt my own smart meter readings, I used a logfile that contains the ‘raw’ smart meter data of almost a week:

/ISk52ME382-1003
0-0:96.1.1(4B414C37303035313039)
1-0:1.8.1(00180.724*kWh)
1-0:1.8.2(00001.416*kWh)
1-0:2.8.1(00000.000*kWh)
1-0:2.8.2(00000.000*kWh)
0-0:96.14.0(0001)
1-0:1.7.0(0000.42*kW)
1-0:2.7.0(0000.00*kW)
0-0:17.0.0(0999.00*kW)
0-0:96.3.10(1)
0-0:96.13.1()
0-0:96.13.0()
0-1:24.1.0(3)
0-1:96.1.0(32383031313030313231)
0-1:24.3.0(120911010000)(00)(60)(1)(0-1:24.2.1)(m3)
(00007.646)
0-1:24.4.0(1)
!

Although it’s still not the real thing, with over a 50000 packets like the one above, I should be able to see if it’s working 😉

Now how do I get that data to the Raspberry Pi, in a way that’s as close to reality as possible? The log file that contains the P1 packets is on my NAS somewhere, but I wanted to feed the data through one of the USB ports of the RPi. Well, I can use an Arduino and a USB BUB for that. A VB.Net console app, Arduino sketch, USB BUB and some wires should get me going. First the VB.Net app, which feeds the Arduino:

Imports System.IO
Imports System.IO.Ports

Imports Microsoft.VisualBasic

Module Module1
  Private COMPort As SerialPort

  Sub InitPort()
    COMPort = New SerialPort
    COMPort.PortName = "COM16"
    COMPort.BaudRate = 9600
    COMPort.DataBits = 8
    COMPort.Parity = Parity.None
    COMPort.StopBits = 1
    COMPort.Open()
    Console.WriteLine("COMPort open.")
  End Sub

  Sub Main()
    InitPort()
    Dim reader As StreamReader
    Dim LineInput As String = ""
    Dim Counter As Integer = 0

    reader = My.Computer.FileSystem.OpenTextFileReader("U:P1data.txt")
    Console.WriteLine("File opened.")

    While Not reader.EndOfStream
      LineInput = reader.ReadLine()
      Console.WriteLine(LineInput)
      COMPort.Write(LineInput & Chr(13) & Chr(10))
      If LineInput.Equals("!") Then
        System.Threading.Thread.Sleep(5000)
      End If
    End While
  End Sub
End Module

And an Arduino sketch:

#include <SoftwareSerial.h>
// rx, tx
SoftwareSerial mySerial(2, 3);

void setup()
{
  Serial.begin(9600);
  Serial.println("SM Feeder");
  // set the data rate for the SoftwareSerial port
  mySerial.begin(9600);
}

void loop() // run over and over
{
  if (Serial.available()){
    char c = Serial.read();
    mySerial.print(c);
  }
}

Ok, now lets see if the smart meter data is arriving @ the RPi USB port:

slim@raspberrypi:/root $ cu -l /dev/ttyUSB0 -s 9600 --parity=none 
Connected.
/ISk52ME382-1003
0-0:96.1.1(4B414C37303035313039)
1-0:1.8.1(00180.912*kWh)
1-0:1.8.2(00001.416*kWh)
1-0:2.8.1(00000.000*kWh)
1-0:2.8.2(00000.000*kWh)
0-0:96.14.0(0001)
1-0:1.7.0(0000.48*kW)
1-0:2.7.0(0000.00*kW)
0-0:17.0.0(0999.00*kW)
0-0:96.3.10(1)
0-0:96.13.1()
0-0:96.13.0()
0-1:24.1.0(3)
0-1:96.1.0(32383031313030313231)
0-1:24.3.0(120911020000)(00)(60)(1)(0-1:24.2.1)(m3)
(00007.654)
0-1:24.4.0(1)
!

Cool. But we’re not there yet… now I need a Python script to read the smart meter data,  filter out the important stuff and put it in a MySQL table.

First I had to install some additional packages: python-dev, python-pip, python-mysqldb, pyserial and mysql-server. I installed apache2 and php5 as well, cause I had a feeling this wasn’t going to end with a Python script or a MySQL table being filled…

Although this Python script is ‘just’  about 80 lines, this was the part that consumed most of the time (and one of the reasons I started with all this):

#!/usr/bin/python
# -*- coding: utf-8 -*-

import MySQLdb as mdb
import sys
import serial
con = None
try:
    con = mdb.connect('localhost', 'p1writer', 'thisissecret', 'p1data')
except mdb.Error, e:

    print "Error %d: %s" % (e.args[0], e.args[1])
    sys.exit(1)

finally:
    if con:
         print("MySQL OK, proceeding...")

ser = serial.Serial()
ser.baudrate = 9600
ser.bytesize=serial.EIGHTBITS
ser.parity=serial.PARITY_NONE
ser.port="/dev/ttyUSB0"
ser.stopbits=serial.STOPBITS_ONE
ser.xonxoff=0
ser.rtscts=0

try:
    ser.open()
except:
    sys.exit ("Error opening port")

# parse function

type=''
value=''
dot=0

def p1parse(c):
  global type
  global value
  global dot

  if c == ":":
    type=''
    value=''
  elif c == "(":
    if type == '':
      type=value
    value=''
    dot=0
  elif c == ".":
    if type <> '':
      value = value + c
    dot=1
  elif c == ")":
    if dot and type <> '':
      return 1
  else:
    if c in ['0', '1','2','3','4','5','6','7','8','9']:
      value = value + c
  return 0

cur = con.cursor()
while 1:
    try:
        c = ser.read(1)
    except:
        sys.exit ("Error reading port")
    if not c:
        break
    if p1parse(c):
        with con:
            cur.execute("INSERT INTO p1values(InfoType,Value) VALUES(%s, %s)", (type, value))

The p1parse() routine was borrowed from Jean-Claude Wippler as posted on his daily weblog here.

After some trial and error, debugging, tweaking mySQL to use the USB Flash drive (to keep the database separated from the OS) as database storage, I got it working this evening:

Smart meter data in MySQL database

Not a bad result after spending only a couple of hours… this really makes me curious whether the rest of what I (we) have in mind will go just as smooth…