Archive for the ‘tech’ Category

Better Puppet Module Development through Testing – CasitConf 2013

I spoke about testing puppet modules at this year’s Cascadia IT Conference in Seattle. The conference was great; I met a lot of great people and received a lot of feedback on my presentation.

Download the slides

Quickly validate puppet manifests in a git repo.

I have been bitten on more than one occasion by a forgotten curly brace or missing comma in a puppet manifest. So, I wrote a little git post-commit script that validates all manifests in a repo. I have tested it on a repo with ~60 modules and ~400 manifests, and it adds very little latency to committing (less than 1s on my system with the email disabled).

I still like to use continuous integration via Travis or Jenkins to run smoke tests, spec tests, and lint. This post-commit script just provides some quick feedback for syntax errors.
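A minimal sketch of the idea in Python, not the actual hook: it walks a directory for .pp files and hands each one to a validator command, which defaults to puppet parser validate. The function names and the injectable command parameter are assumptions made for the example.

```python
import os
import subprocess


def find_manifests(root):
    """Return every *.pp file under root, sorted for stable output."""
    manifests = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip git's own metadata directory.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            if name.endswith(".pp"):
                manifests.append(os.path.join(dirpath, name))
    return sorted(manifests)


def validate(manifests, command=("puppet", "parser", "validate")):
    """Run the validator on each manifest; return the paths that failed."""
    failed = []
    for path in manifests:
        # A non-zero exit status is treated as a syntax error.
        status = subprocess.call(list(command) + [path])
        if status != 0:
            failed.append(path)
    return failed
```

Calling validate(find_manifests(".")) from the top of a repo returns the list of manifests that did not parse. Keeping the command injectable makes it easy to swap in a different checker or to test the loop without puppet installed.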

My first puppet forge contribution: redis_server

Today I contributed my very first puppet module back to the community! thomasvandoren/redis is a puppet module to install, configure, and manage redis. It is by no means finished, but it is working. I didn’t see any other redis modules in puppet forge, so I thought I would share this one.

I also, for the first time, configured the repo to build in travis, which is an awesome free CI system for open source repos on github. I have about half of the test script running in travis right now. I’ll have to reflect on travis in another post; suffice it to say it is an awesome service!

Update 2012-05-18: the module has been renamed to the much simpler redis.

Fixing Master Boot Record on Windows 7

I recently removed the linux partition from my desktop; it made more sense to use a single windows partition and run linux in a vm.

Removing the partition is not a big deal. I just used the Windows Disk Management tool to remove the linux partition and expand the windows partition accordingly. I tried to fix the master boot record before rebooting, since GRUB was no longer required. Upon restart, however, I was met with a vague partition error followed by the grub restore prompt.

To remedy this problem so that windows would boot:

  • Boot from Windows 7 install disk.
  • Try Startup Repair (though it probably won’t work) in ‘Repair your computer’ > ‘System Recovery Options’.
  • Reboot – if it still doesn’t work, return to ‘System Recovery Options’ menu.
  • Open command prompt.
  • bootrec.exe /fixmbr
  • Reboot – it should boot directly to Windows 7.


Monitoring RabbitMQ Queues with Zabbix

Recently I set up some monitoring on a RabbitMQ server using Zabbix. This process is by no means difficult, but I thought it was worth sharing.

I was looking for a solution that did not require additional plugins or packages, but would perform well. Some useful tools for monitoring include: the Management Plugin for RabbitMQ, which works well but provides more info than I needed; the SNMP Statistics Plugin, which looks promising; and the method below.

This assumes a zabbix server and agent(s) are set up, and basic knowledge of zabbix.

Zabbix User Parameters

These user parameters pull all of the queue and exchange information out of rabbitmqctl for a particular queue and exchange.

I created a new file, /etc/zabbix/zabbix.conf.d/rabbitmq-server-stats.conf, which looked like the one below. It assumes rabbitmqctl is at /usr/sbin/rabbitmqctl.

After making the changes, bounce (restart) the zabbix-agent service on the rabbitmq server box.
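To illustrate the kind of extraction a user parameter like this performs, here is a rough Python sketch (not the original configuration). It assumes the text comes from rabbitmqctl list_queues name messages, with one queue per line, the name and message count separated by a tab, and possibly some banner lines around them. The function name parse_queue_depths is made up for the example.

```python
def parse_queue_depths(output):
    """Map queue name -> message count from tab-separated rabbitmqctl output."""
    depths = {}
    for line in output.splitlines():
        fields = line.split("\t")
        # Banner or header lines do not have a numeric second column; skip them.
        if len(fields) != 2 or not fields[1].strip().isdigit():
            continue
        depths[fields[0]] = int(fields[1])
    return depths
```

A zabbix item for some-queue would then report parse_queue_depths(text)["some-queue"].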

Sudo Permissions

The parameters won’t work until the zabbix group is granted passwordless sudo access. I chose to add a new file at /etc/sudoers.d/rabbitmqserverstats.

I added the following line to the end of /etc/sudoers:

/etc/sudoers.d/rabbitmqserverstats contains:

And with that, Zabbix should be able to monitor the some-queue and some-exchange statistics.

Update: Posted to RabbitMQ Server Stats template on

Jenkins Windows Slave with Git

There are some articles out there about setting up Jenkins slaves on Windows. This is one more, with a bunch of information about configuring Git. The documentation for setting up Git to work well with Jenkins is surprisingly sparse and the process is extremely frustrating (in my experience). Hopefully this will help!

* This assumes some master Jenkins service is set up with the Git plugin and has network access to the Windows box that is becoming a slave.

Setup Jenkins Slave

  • Go to the Jenkins master.
  • Select ‘Manage Jenkins’ > ‘Manage Nodes’.
  • Select ‘New Node’.
  • Give it a name that identifies the computer, select ‘Dumb Slave’, and click ‘Ok’. Some example settings:
    • Name: win7-thomas
    • # of executors: 4 (one executor per cpu is not a bad ratio)
    • Remote FS Root: c:\Jenkins\Slaves
    • Labels: windows blackberry (this box is good for building the blackberry projects and is a windows box)
    • Usage: Utilize this slave as much as possible
    • Launch method: Launch slave agents via Java Web Start
    • Availability: Keep this slave on-line as much as possible
  • After saving the new node, open it from the ‘nodes’ screen, and select ‘Launch’.
  • Save the slave-agent.jnlp in a decent folder (like c:\Jenkins).
  • Open the slave-agent.jnlp. Double clicking worked for me, but you might need to use something like:
    • javaws http://jenkins-hostname/computer/win7-thomas/slave-agent.jnlp
    • Or, one of the other suggestions Jenkins shows.
  • This should pop up a window that says ‘Connected’. Go to ‘File’ > ‘Install as Windows Service’.
  • Once you have completed this install, you should see ‘Jenkins Slave’ among the running services.
  • It might make sense to change the user that runs the service to something other than the SYSTEM user. Once changed, you will need to Stop and Restart the service.
  • Reboot. Make sure that ‘Jenkins Slave’ is started automatically at startup.
  • When you create a new project, make sure that its labels indicate this slave’s name or labels.

Setup Git

It is advisable to run the Jenkins Slave service as a pre-defined user, as opposed to the SYSTEM user. However, if the Jenkins Slave service is running as the SYSTEM user, the following will help emulate the environment that Jenkins will use when building.
  • To run commands as the SYSTEM user, you can use psexec.exe from SysInternals.
    • From an Administrator cmd.exe prompt, psexec -i -s cmd.exe will open a new shell as the SYSTEM user.

General Advice when Setting Up Git

  • Define a HOME env var equal to %USERPROFILE%.
  • Create passphrase-less rsa keys and put them in %HOME%/.ssh. These keys should be set up on whatever server hosts the Git repos. In GitHub, for example, you would need to add the keys to your account.
  • Do an initial ssh git@github.com to add GitHub to the known_hosts.
  • Get rid of any GIT_SSH env vars if using the default ssh client for auth (as opposed to plink.exe, etc). GIT_SSH=c:\…\plink.exe may exist if you have previously used putty/pageant/TortoiseGit/etc to access Git repos.
  • ssh git@github.com (or wherever your repo is) is very useful for debugging. One to three -v flags (i.e. ssh -vv git@github.com) may be added to help debug the connection process.
  • Set the %HOME%/.ssh/config to specify which authentication to use:
    User git
    PreferredAuthentications publickey
  • If you see the following error message and your files do have the correct perms (0600), then you are suffering from a bug in the msysgit ssh executable. Unix permissions (0644) don’t map to NTFS ACLs. Msys just fakes the behavior of chmod, but it can’t fake a chmod to a restrictive enough permissions set. Steps to fix are below.
Permissions 0644 for '/path/to/key' are too open.
It is recommended that your private key files are NOT accessible by others.
This private key will be ignored.
bad permissions: ignore key: /path/to/key
  • Assuming cygwin is installed at c:\cygwin and msysgit is installed at c:\progra~1\Git, this will replace the ssh executable in msysgit with the one from cygwin, which recognizes file perms:
@rem From an Administrator cmd.exe
@rem This works for 32bit Windows. Adjust accordingly for 64bit.
@rem ren takes a bare file name as its second argument, not a full path.
ren "C:\Program Files\Git\bin\ssh.exe" ssh.bak.exe
copy "C:\cygwin\bin\ssh.exe" "C:\Program Files\Git\bin\ssh.exe"
copy "C:\cygwin\bin\cyg*.dll" "C:\Program Files\Git\bin\"
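The environment advice above can be checked with a short script run as the same user as the Jenkins Slave service. This is only a sketch: check_git_env and its warning messages are invented for the example, and it covers just the HOME, .ssh, and GIT_SSH points.

```python
import os


def check_git_env(env):
    """Return a list of warnings for the HOME / GIT_SSH advice."""
    warnings = []
    home = env.get("HOME")
    if not home:
        warnings.append("HOME is not set; define it as %USERPROFILE%")
    else:
        ssh_dir = os.path.join(home, ".ssh")
        if not os.path.isdir(ssh_dir):
            warnings.append("no .ssh directory under HOME: " + ssh_dir)
    if env.get("GIT_SSH"):
        # A leftover plink.exe setting overrides the default ssh client.
        warnings.append("GIT_SSH is set to " + env["GIT_SSH"])
    return warnings
```

Running check_git_env(os.environ) from the shell opened by psexec shows what the SYSTEM user actually sees, which is often different from your own login.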

Appreciated feedback from George Reilly

Update: Git section posted on Cozi Tech blog!

Five python tips

I have been using python at work for the last year, and had fooled around with it for five years prior to that. I am by no means an encyclopedia of python knowledge, but here are a few pointers that were not obvious to me when I was getting started.

1. Use contextlib.closing to avoid leaking file pointers.
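A minimal sketch, assuming an object that has a close() method but is not itself a context manager. LegacyResource is a hypothetical stand-in for such an object; in Python 3, regular file objects already work with the with statement directly.

```python
import io
from contextlib import closing


class LegacyResource:
    """Hypothetical resource with close() but no __enter__/__exit__."""

    def __init__(self, text):
        self._buffer = io.StringIO(text)
        self.closed = False

    def read(self):
        return self._buffer.read()

    def close(self):
        self._buffer.close()
        self.closed = True


def read_all(resource):
    # closing() calls resource.close() on exit, even if read() raises.
    with closing(resource) as handle:
        return handle.read()
```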

2. Use OptionParser (< 2.7) or ArgumentParser (>= 2.7) for easy access to command line options and arguments.
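A small argparse sketch. The option names (--verbose, --retries) and the positional paths argument are made up for illustration.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="Example command line tool.")
    # One or more positional arguments.
    parser.add_argument("paths", nargs="+", help="files to process")
    # A boolean flag that defaults to False.
    parser.add_argument("-v", "--verbose", action="store_true")
    # A typed option with a default value.
    parser.add_argument("--retries", type=int, default=3)
    return parser


# parse_args() reads sys.argv by default; passing a list is handy for testing.
args = build_parser().parse_args(["--verbose", "site.pp"])
```

After this runs, args.verbose, args.paths, and args.retries hold the parsed values, and -h prints a generated usage message.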

3. Use pickle to write/read data structures to/from file for easy persistent storage.
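A minimal round-trip sketch; save and load are names chosen for the example. Note that unpickling can execute arbitrary code, so this is only appropriate for files you wrote yourself or otherwise trust.

```python
import pickle


def save(obj, path):
    # Pickle files must be opened in binary mode.
    with open(path, "wb") as handle:
        pickle.dump(obj, handle)


def load(path):
    # Only unpickle files you trust.
    with open(path, "rb") as handle:
        return pickle.load(handle)
```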

4. Use subprocess to make shell calls. This is especially useful when writing utility scripts that need to make bash calls (for example)!
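A short sketch using subprocess.check_output. The helper names run and succeeds are invented here, and the example runs the Python interpreter itself only so that it works on any platform.

```python
import subprocess
import sys


def run(cmd):
    """Run cmd (a list of arguments) and return its stdout as text."""
    # Passing a list avoids shell quoting problems; no shell is involved.
    return subprocess.check_output(cmd, universal_newlines=True)


def succeeds(cmd):
    """Return True if cmd exits with status 0."""
    try:
        run(cmd)
    except subprocess.CalledProcessError:
        # check_output raises this for any non-zero exit status.
        return False
    return True


greeting = run([sys.executable, "-c", "print('hello')"])
```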

5. Simplify unit testing with mock objects. Be careful about using the built-in methods on Mock objects, though. A spelling error will get invoked as a ‘mock’ call to the object instead of throwing an AttributeError.
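One way to guard against misspelled attributes is to give the mock a spec. The sketch below uses unittest.mock from the Python 3 standard library (the same API that used to ship as the third-party mock package); Mailer and typo_is_caught are hypothetical names for the example.

```python
from unittest.mock import Mock


class Mailer:
    """Hypothetical collaborator to be mocked in a test."""

    def send(self, message):
        raise NotImplementedError


def typo_is_caught(mailer):
    """Call a misspelled method and report whether it raised AttributeError."""
    try:
        mailer.sned("hello")  # deliberate typo for "send"
    except AttributeError:
        return True
    return False


loose = Mock()               # accepts any attribute, typos included
strict = Mock(spec=Mailer)   # only attributes that exist on Mailer
```

Here typo_is_caught(loose) is False because the plain Mock happily creates a sned attribute, while typo_is_caught(strict) is True.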

There are tons of other useful modules and third party libraries out there. Feel free to suggest them in the comments!

WikiGraph – Visualize Wikipedia Connections

WikiGraph helps users visualize connections among Wikipedia articles. It uses links within articles to decide how articles relate to each other. It is a Flash application with a PHP data services api, and a MySQL database. The database contains about 8.3 million articles with 800 million links. Seven UW CSE students (including me) worked on WikiGraph as part of the Winter 2011 Software Engineering class.

Some of the Challenges

The 24 most significant relationships for an article are shown in a graph. The strength of a relationship is determined by link relationships and the length of the article (longer articles are more likely to be real articles as opposed to long pages of links). Mutual relationships are considered the strongest (article a links to article b and vice versa) followed by outbound relations and then inbound relations.
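The ordering described above can be sketched in Python (the real implementation was PHP and MySQL). How link type and article length were actually combined is not spelled out here, so using length only as a tie-breaker within each tier is an assumption, as are the function and parameter names.

```python
def rank_relationships(outbound, inbound, lengths, limit=24):
    """Return up to `limit` related article titles, strongest first."""
    outbound, inbound = set(outbound), set(inbound)

    def tier(title):
        if title in outbound and title in inbound:
            return 0  # mutual links are strongest
        if title in outbound:
            return 1  # outbound only
        return 2      # inbound only

    candidates = outbound | inbound
    # Sort by tier, then longer articles first, then title for a stable order.
    ordered = sorted(candidates, key=lambda t: (tier(t), -lengths.get(t, 0), t))
    return ordered[:limit]
```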

It can take a long time to determine the 24 most significant relationships for some articles. For example, the number 0 article has hundreds of thousands of inbound links (articles that link to the number 0 article). It takes over 30 seconds to pull all those connections from the database and then sort them in order of significance.

This seemed like too much time to have users wait, so we implemented a caching system. If a graph is not already in the cache, the api quickly returns 24 connections that are not necessarily the strongest, but are a mix of inbound and outbound links. A background job is then initiated to cache the graph for subsequent requests.
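Roughly, the lazy caching flow looks like the following Python sketch. It is an in-memory illustration with a background thread, whereas the real system used a cache table in MySQL; GraphCache, quick_lookup, and full_lookup are hypothetical names.

```python
import threading


class GraphCache:
    def __init__(self, quick_lookup, full_lookup):
        self._quick = quick_lookup   # fast, not necessarily the strongest links
        self._full = full_lookup     # slow, fully ranked
        self._cache = {}
        self._pending = {}
        self._lock = threading.Lock()

    def get(self, article):
        with self._lock:
            if article in self._cache:
                return self._cache[article]
            # Start at most one background job per article.
            if article not in self._pending:
                job = threading.Thread(target=self._fill, args=(article,))
                self._pending[article] = job
                job.start()
        return self._quick(article)

    def _fill(self, article):
        graph = self._full(article)
        with self._lock:
            self._cache[article] = graph
            self._pending.pop(article, None)

    def wait(self, article):
        """Block until any background job for article has finished."""
        with self._lock:
            job = self._pending.get(article)
        if job is not None:
            job.join()
```

The first get() for an article returns the quick result and kicks off the slow job; later calls return the cached graph once the job has finished.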

In principle, all of the graphs could be cached up front, but limitations with our MySQL server prevented this, so caching is done lazily. Some graphs with too many links cause the database queries to time out (there are too many results). If we had more time to work on the project, we would consider ways to cache all of the graphs initially so that every request would return the most significant connections quickly.

We used Google Code to develop the source code, track issues, and create documentation in the wiki.

The Flash Client

To develop the Flash application we used Flex 4 with ActionScript 3. None of our team members had ever developed a Flash application prior to January and we did not have access to Flash Builder. Flex was incredibly easy to learn for our team and it is open source! Everyone had significant Java and Object Oriented experience, and we found lots of documentation and tutorials which made it easy to get started.

We were also able to use FlexUnit 4 to run automated unit tests on our Flex/ActionScript source code. This proved to be easy to do in a Windows environment, but far more difficult on our Hudson build server which used Fedora 13. I will write a separate blog post about getting FlexUnit, Hudson, and Fedora to work together harmoniously.

The PHP Data Services API

We implemented a ReSTful architecture to access the pertinent WikiGraph data. This allows for future clients (maybe a JavaScript/HTML5 client) to use our api. It is a read only api: users do not have any modification rights, although a user action can initiate a cache operation, which updates the cache table.

There is a pretty simple set of functional PHP scripts that implement all of the user actions. An OO paradigm is used to access the database.

We used PHPUnit to run automated unit tests. It was super easy and provided options for JUnit-like output which Hudson could easily chart.

The MySQL Database

Our database consumes quite a bit of space, so we opted to use the Amazon Web Services Relational Database Service (RDS). Our class was able to provide grants to use AWS. We were very pleased with the AWS experience. It allowed us to adjust how much space and memory we used for the database, which was useful when we wanted to perform administrative actions on the db. We could upgrade to a high memory, fast CPU instance while the admin operations took place, and then downgrade to a cheaper low memory, standard CPU instance for normal use.

I doubt that AWS needs any further commendations, but I highly recommend it.
