Page Created: 7/30/2014   Last Modified: 9/9/2018   Last Generated: 1/11/2019
(This page is an extremely rough draft and is full of all kinds of errors. I will try to improve the documentation over time if I release future versions.)
OswaldBot.py is a home automation robot, a Python-based XMPP bot with speech, audio stream player control, power outage detection, hourly chimes, laser tripwire, and integration with the ScratchedInTime comment server. It pre-computes speech files to minimize CPU processing and improve speech response.
You could think of it like a little HAL 9000 since it is so bug-ridden you never know what it might do...
OswaldBot was designed around my particular hardware.
- Raspberry Pi
- Android-based smartphone running Xabber
- Stereo speakers with L and R channels placed in different rooms
- C-Media USB audio device (used as mixer)
- Behringer USB DAC audio device
- CyberPower UPS
- OswaldCluster (optional)
Required Files and Directories
config.py - Configuration file (must be in same directory as OswaldBot.py)
generatespeech.sh - Pre-generate the computationally intensive Festival speech files
oz.py - Main subroutines (must be in same directory as OswaldBot.py)
ScratchedInTimePlugin - Optional plugin to interface with ScratchedInTime.
- mpd (optional)
- curl - For use with ScratchedInTime (optional)
I have been experimenting with the British Raspberry Pi ever since it was released in the United States. It's a wonderful device, and its low cost ($35) means you can take risks with it and won't lose a lot of money if you damage it and have to replace it. It gives me the same excitement as working on the old Commodore 64, because of its low cost, versatility, and minimalistic, defined hardware.
My first Pi was a 1st revision, Model B with 256 MB RAM. I chose Arch ARM as the OS, as it is low on resources and fast. The Pi has a 32-bit, single 700 MHz ARMv6 core, a fairly slow processor for 2013, but it has a high performance-per-watt ratio.
This means that if you put enough Pis together, they won't cost you very much, they won't use much power, and they may have enough speed to do what you need. And you get some isolation, redundancy, and spare parts as a bonus.
The idea to incorporate composite video out, GPIO, and USB power was great, since it also keeps the external component cost low and allows a lot of flexibility.
The XMPP bot
I turned my first Pi into an XMPP (Jabber) instant messaging server running Matthew Wild's Prosody, and used Python 2 to create a "bot" which received messages from my smartphone and then sent logic signals out the GPIO. Being skilled in electronics, I connected a few bipolar transistors and relays from a local electronics store called Gateway Electronics to the GPIO to control various equipment.
hi - Ping Oswald to see if it hears you.
chimeon, chimeoff - Turn hourly chime on or off
front, back, both - Set speaker channels to front, back, or both
loud, soft - Set speaker volume level
soundcard - Change to the next soundcard
say [message] - Text-to-speech message on speaker
priority [message] - Text-to-speech a loud, priority message on all speakers
repeat - Replay the last "say" or "priority" message
stream [url] - Play an audio stream
audio [url] - Play an audio file
pause - Pause audio
stop - Stop audio
laseron - Activates pulsed laser tripwire (if it passes alignment check)
laseraim - Turns on continuous laser beam for aiming the mirror alignment
laseroff - Turn off laser
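The command list above maps naturally onto a small dispatch table. Here is a minimal sketch of that idea in Python 3 (the bot itself was written in Python 2). It is not the actual OswaldBot code: the Dispatcher class, parse_command, and the stand-in handlers are all hypothetical names made up for illustration, and the XMPP and GPIO parts are left out entirely.

```python
# Hypothetical sketch: none of these names are taken from OswaldBot.py.

def parse_command(body):
    """Split a chat message into (command, argument)."""
    parts = body.strip().split(None, 1)
    if not parts:
        return "", ""
    command = parts[0].lower()
    argument = parts[1] if len(parts) > 1 else ""
    return command, argument


class Dispatcher:
    """Maps command words to handler functions that return a reply string."""

    def __init__(self):
        self.handlers = {}

    def register(self, name, handler):
        self.handlers[name] = handler

    def dispatch(self, body):
        command, argument = parse_command(body)
        handler = self.handlers.get(command)
        if handler is None:
            return "Unknown command: " + command
        return handler(argument)


# Stand-in handlers; real ones would drive the speakers or the GPIO.
state = {"chime": True, "last_message": ""}


def say(argument):
    state["last_message"] = argument
    return "Saying: " + argument


def chime_off(_argument):
    state["chime"] = False
    return "Chime off"


bot = Dispatcher()
bot.register("hi", lambda _argument: "Hello")
bot.register("say", say)
bot.register("chimeoff", chime_off)
bot.register("repeat", lambda _argument: say(state["last_message"]))
```

With this wiring, bot.dispatch("say dinner is ready") returns a reply string that an XMPP message handler could send back to the phone, and a later "repeat" replays the stored message.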
I also thought it would be really neat to have it speak in each room of the house, reminiscent of my old phone system, so I used Edinburgh's Festival for text to speech. I looked into wireless Bluetooth speakers, but they were too expensive. Then I noticed that I had RJ11 phone jacks in each room but wasn't using most of them, since I have wireless telephones. Hmm...
So I disassembled 2 pairs of old computer speakers and removed the power amplifier from one of them. I was surprised to find that one of them had a fairly powerful car audio amplifier IC inside. I connected the amp to the Pi's audio out and soldered on an RJ11 jack to its output, using one pair of wires for each stereo channel (L/R). My idea was to use only mono output for the audio, and control the volumes of 2 of the 4 speakers independently, by adjusting the audio balance.
I then replaced the speaker cables on each of the 4 speakers with 4-conductor CAT-3 cable and RJ11 connectors using only two conductors for each speaker.
I disconnected my phone wiring at the demarcation point and connected it to the amplifier and voila, sound in each room. The amplifier was just powerful enough to power two pairs of speakers in parallel.
However, I wasn't so impressed with the audio. There was a bug in the Rev. 1 Pi's audio which caused a popping noise whenever ALSA sent a sound, the sound was 11 bits instead of the standard 16, and there was no way to adjust the left and right channel volumes independently.
This worked okay until one day I knocked over a mug which fell and crushed the Pi. But luckily, replacing it was cheap, and the Raspberry Pi Foundation had increased the memory to 512 MB and fixed the popping audio problem.
However, the Pi's built-in audio cannot control the left/right channel volumes independently, so I connected a Behringer USB DAC device and a cheap C-Media based device which I use as a mixer (since the DAC has no hardware mixer). The setup is fairly complex and buggy since the USB port is being overtaxed, and the Wolfson audio card was not available for the Raspberry Pi at the time I built it.
Festival, however, is very CPU intensive and creates delays of several seconds on the Pi, which prevents instant feedback. So I created a Bash speech generator script called generatespeech.sh which uses Festival's text2wave to pre-generate the speech and save it to WAV files. This moves the problem out of the time domain (CPU) and into the space domain (flash drive), which is within the Pi's capabilities, since the speech files don't consume that much space.
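To illustrate the time-for-space trade, here is a rough Python 3 sketch of how a bot could look up a pre-generated file before falling back to slow synthesis. The "speech" directory, the SHA-1 based file naming, and both function names are assumptions made for this example; generatespeech.sh may name its files quite differently.

```python
import hashlib
import os

SPEECH_DIR = "speech"  # assumed directory holding pre-generated WAV files


def speech_path(text, speech_dir=SPEECH_DIR):
    """Map a phrase to a stable WAV filename (hypothetical naming scheme)."""
    # Normalize case and whitespace so "Hello  World" and "hello world" match.
    normalized = " ".join(text.lower().split())
    digest = hashlib.sha1(normalized.encode("utf-8")).hexdigest()
    return os.path.join(speech_dir, digest + ".wav")


def find_cached_speech(text, speech_dir=SPEECH_DIR):
    """Return the cached WAV path, or None if the phrase must be synthesized."""
    path = speech_path(text, speech_dir)
    return path if os.path.isfile(path) else None
```

A caller would play the returned file right away when one exists, and only run Festival (and accept the delay) when find_cached_speech returns None.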
Detecting power outages
The Oswald bot uses NUT (Network UPS Tools) to read the status of my CyberPower UPS, to detect when the system loses power, and send status updates over XMPP.
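As a hedged example of reading UPS status from NUT, the sketch below parses the "key: value" lines that NUT's upsc client prints and checks the ups.status flags for OB (on battery). Whether OswaldBot calls upsc or talks to NUT some other way is an assumption on my part, and the UPS name "myups@localhost" is only a placeholder.

```python
import subprocess


def parse_upsc(output):
    """Parse 'key: value' lines, as printed by NUT's upsc client, into a dict."""
    values = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            values[key.strip()] = value.strip()
    return values


def on_battery(values):
    """True when the ups.status flags include OB (on battery)."""
    return "OB" in values.get("ups.status", "").split()


def read_ups(ups_name="myups@localhost"):
    """Query a UPS through upsc; the UPS name here is a placeholder."""
    output = subprocess.check_output(["upsc", ups_name])
    return parse_upsc(output.decode("utf-8", "replace"))
```

A polling loop could call read_ups() every few seconds and send an XMPP message whenever on_battery() changes from False to True or back.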
Audio stream player control
The system can control the mpg123 audio stream player to play audio streams over the same speakers that it uses to speak. It pauses the streams when speaking. If you use mpd on the same server, it will try to pause and resume those streams as well.
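The pause-while-speaking behavior can be sketched with a Python context manager. This is only an illustration: FakePlayer is a made-up stand-in for whatever actually controls mpg123 or mpd, and streams_paused is a hypothetical helper, not a function from oz.py.

```python
from contextlib import contextmanager


class FakePlayer:
    """Stand-in for a player such as mpg123 or mpd; it only tracks state."""

    def __init__(self, name, playing=False):
        self.name = name
        self.playing = playing

    def pause(self):
        self.playing = False

    def resume(self):
        self.playing = True


@contextmanager
def streams_paused(players):
    """Pause whichever players are active, then resume only those afterwards."""
    active = [p for p in players if p.playing]
    for player in active:
        player.pause()
    try:
        yield active
    finally:
        # Runs even if speaking fails, so a stream is never left paused.
        for player in active:
            player.resume()
```

The speech playback would go inside a "with streams_paused(players):" block. Remembering which players were active matters, because a player that was already stopped should not start up just because the bot finished talking.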
The Oswald bot can control and read the status of the OswaldLaser I constructed, sending status over XMPP.
ScratchedInTime Integration
Page Created: 7/23/2014   Last Modified: 3/11/2016   Last Generated: 1/11/2019
ScratchedInTimePlugin.py - Needs to be in the same folder as OswaldBot.
ScratchedInTime doesn't need a control server, but I added a plugin called ScratchedInTimePlugin so that OswaldBot can send certain commands to it. This lets me send text commands over XMPP from a smartphone running Xabber to do things like send blogs, view the most recent spam, view private comments, mark certain comments as spam, resurrect spam as valid comments, rebuild comments pages, and prime the bogofilter database.
Since I didn't add a user moderation system, spam is only controlled algorithmically unless I intervene. I could build a moderation system, but that would just add complexity, which I am keeping to a minimum. But a manual method only works if you put upper limits on the number of comments per page. Otherwise the amount of spam would eventually overwhelm a single person.
OswaldBot is not directly accessible by the Comments server for security reasons. To receive output from ScratchedInTime (comments.pl), it checks Memcached running on the Cache server. It is a little tricky using the same Memcached server with both Perl and Python, since the software implementations are different, which can cause problems in Memcached if you're not careful.
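One common way to sidestep this kind of Perl/Python mismatch is to store only plain byte strings in a neutral format such as JSON, rather than letting each client library apply its own native serialization to complex values. The sketch below shows that idea; it is an assumption about a workable approach, not a description of what OswaldBot or comments.pl actually do. FakeCache is a dict-backed stand-in so the example runs without a Memcached server, and put_json and take_json are hypothetical helper names.

```python
import json


class FakeCache:
    """Dict-backed stand-in for a Memcached client with get/set/delete."""

    def __init__(self):
        self.store = {}

    def set(self, key, value):
        self.store[key] = value

    def get(self, key):
        return self.store.get(key)

    def delete(self, key):
        self.store.pop(key, None)


def put_json(cache, key, obj):
    """Store obj as UTF-8 JSON bytes, which either language can decode."""
    cache.set(key, json.dumps(obj).encode("utf-8"))


def take_json(cache, key):
    """Fetch and delete a JSON value; returns None when the key is absent."""
    raw = cache.get(key)
    if raw is None:
        return None
    cache.delete(key)
    return json.loads(raw.decode("utf-8"))
```

With a real client in place of FakeCache, the Perl side would write JSON text under an agreed key and the Python side would read and then delete it, so neither side ever has to decode the other's native format.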
set [command] - Sends a free form POST command to the comments server.
showmespam - Shows the latest spam flagged, along with the "resurrection" number.
showmeprivate - Shows the latest private comments (contact form), appends them to a file, and deletes them from Memcached.
The POST commands normally have to be prefixed by the word "set", except for "showmespam" and "showmeprivate" which were added for convenience.
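The prefix rule can be expressed as a small function. This is a sketch under assumptions: to_post_command is a hypothetical name, and I am assuming that the two shortcut words are forwarded unchanged, which may not match what ScratchedInTimePlugin.py really sends.

```python
# Commands that may be typed without the "set" prefix.
SHORTCUTS = {"showmespam", "showmeprivate"}


def to_post_command(message):
    """Return the command to POST, or None if the message is not for the
    comments server."""
    text = message.strip()
    if not text:
        return None
    first, _, rest = text.partition(" ")
    if first.lower() == "set":
        # "set" with nothing after it is not a usable command.
        return rest.strip() or None
    if first.lower() in SHORTCUTS:
        return text
    return None
```

Anything that returns None here would fall through to the bot's other commands instead of being sent to the comments server.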
Warning, this project is experimental and not recommended for real data or production. Do not use this software (and/or schematic, if applicable) unless you read and understand the code/schematic and know what it is doing! I made it solely for myself and am only releasing the source code in the hope that it gives people insight into the program structure and is useful in some way. It might not be suitable for you, and I am not responsible for the correctness of the information and do not warrant it in any way. Hopefully you will create a much better system and not use this one.
I run this software because it makes my life simpler and gives me philosophical insights into the world. I can tinker with the system when I need to. It probably won't make your life simpler, because it's not a robust, self-contained package. It's an interrelating system, so there are a lot of pieces that have to be running in just the right way or it will crash or error out.
There are all kinds of bugs in it, but I work around them until I later find time to fix them. Sometimes I never fix them but move on to new projects. When I build things for myself, I create structures that are beautiful to me, but I rarely perfect the details. I tend to build proof-of-concept prototypes, and when I prove that they work and are useful to me, I put them into operation to make my life simpler and show me new things about the world.
I purposely choose not to add complexity to the software but to keep the complexity openly exposed in the system. I don't like closed, monolithic systems; I like smaller sets of things that inter-operate. Even a Rube Goldberg machine is easy to understand since the complexities are within plain view.
Minimalism in computing is hard to explain; you walk a fine line between not adding enough and adding too much, but there is a "zone", a small window where the human mind has enough grasp of the unique situation it is in to make a difference to human understanding. When I find these zones, I feel I must act on them, which is one of my motivating factors for taking on any personal project.
Here is an analogy: you can sit on a mountaintop and see how the tiny people below build their cities, but never meet them. You can meet the people close-up in their cities, but not see the significance of what they are building. But there is a middle ground where you can sort of see what they are doing and are close enough to them to see the importance of their journey.
The individual mind is a lens, but, like a single telescope looking at the night sky, we can either see stars that are close or stars that are much farther away, but we can't see all stars at the same time. We have to pick our stars.
I like to think of it like this:
Here is the source code