Fog, in case you haven’t heard of it, is a fantastic
cloud computing library written in Ruby. It provides a unified interface
to several popular cloud computing platforms (including Amazon,
Rackspace, Linode, and others), making it easy to interact with them
from Ruby. It currently supports four types of cloud services: storage,
compute, DNS, and CDN. Fog has become very popular lately, and serves as
the backbone for Chef’s cloud computing functionality, which is how I
first became aware of it.
I recently used Fog to write a backup script in Ruby to automatically
send encrypted database backups from a database server running at
Rackspace to Amazon’s S3 storage service. Here’s how I did it.
My script runs as the second step in a process. The first step is a
shell script that calls pg_dump to dump a PostgreSQL database and then
encrypts the resulting dump using GnuPG, dropping the encrypted files in
a backup directory on the database server.
My Fog-based script’s job is to make sure that all of the files in the
backup directory get moved to S3.
Fogsync (my script) looks at all of the files in that directory and
makes sure that they all exist in a bucket on S3. If they don’t, it
copies them up there. Additionally, it deletes old backups from S3. For
this customer, we keep backups for 14 days, so all backups older than
that get deleted.
Let’s look at how it works:
fog = Fog::Storage.new(
  :provider              => 'AWS',
  :aws_access_key_id     => MY_ACCESS_KEY,
  :aws_secret_access_key => MY_SECRET
)

directory = fog.directories.get("MY_DIRECTORY")
files = Dir["/var/backup/*.gpg"]

for file in files do
  name = File.basename(file)
  # Only upload the file if it isn't already in the bucket
  if directory.files.head(name).nil?
    directory.files.create(:key => name, :body => open(file))
  end
end
Here’s what this snippet does:
* Creates a connection to AWS. The syntax is basically the same for
connecting to all of the cloud platforms; only the parameter names
differ.
* Uses ‘head’ to check whether each file already exists and, optionally,
to get some metadata about it (size, modification date, etc.). Think of
this as the cloud equivalent of the Unix stat command. You don’t want to
use the ‘get’ command, as that returns the whole file, which would take
a very long time if the files are large (cough *voice of experience*
cough).
* Creates the file in the given directory (“bucket” in S3 terms) if it
doesn’t exist already.
If you’ve used S3, you’ll notice that Fog uses slightly different terms
for things than S3 does. Because Fog works across a number of different
storage providers, it uses more general terms. This might be confusing
at first if you’re familiar with a specific provider’s nomenclature, but
the tradeoff is that if you want to move from one provider to another,
the only thing you have to change is the code that sets up the
connection (the call to Fog::Storage.new() in this example).
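To illustrate, here is a hypothetical sketch of what that connection change might look like for Rackspace Cloud Files; the credential parameter names follow Fog’s provider-specific options, and MY_USERNAME and MY_API_KEY are placeholders:

```
fog = Fog::Storage.new(
  :provider           => 'Rackspace',
  :rackspace_username => MY_USERNAME,
  :rackspace_api_key  => MY_API_KEY
)
```

The rest of the script, with its head/create/destroy calls, would stay exactly the same.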
oldest = Date.today - 14

directory = fog.directories.get("MY_DIRECTORY")
files = directory.files
files.each do |f|
  file_date = Date.parse(f.last_modified.to_s)
  if file_date < oldest
    f.destroy
  end
end
This is fairly straightforward as well. Get all the files in the
directory and check their age, deleting the ones that are older than we
want to keep.
So that, in a nutshell, is how to use Fog. This is a simplified example,
of course: in my production code the parameters are all pulled from
configuration files, and the script emails a report of what it did, in
addition to having a lot more error handling.
If you do any scripting with cloud computing, you owe it to yourself to
check out Fog.
Generating semi-realistic test data for an application can be a pain. If
the data already exists, as in the case of an upgrade to an existing
system, you can generally create data based on the existing database.
But what if you need a large sample of data for a brand new system? If
you have simple data requirements, there are some Ruby gems that can
help you out. Faker is one such gem,
which lets you generate realistic names, addresses and phone numbers.
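For instance, here is roughly what Faker usage looks like (the method names come from the Faker gem’s documented API; the example outputs are illustrative, since values vary run to run):

```
require 'faker'

Faker::Name.name                 # e.g. "Christophe Bartell"
Faker::Address.street_address    # e.g. "282 Kevin Brook"
Faker::PhoneNumber.phone_number  # e.g. "1-800-555-0199"
```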
But what do you do for things that are a little less typical? Things
like scores, ratings, ages, dates, etc. I needed to do this recently for
a prototype I built of a system to generate letters. Here’s the Rake
task I ended up with:
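The original task itself didn’t survive in this copy, so what follows is a hypothetical reconstruction of its shape based on the description below; the model name (Letter) and the field names are my assumptions, not the actual schema:

```
# lib/tasks/sample_data.rake -- hypothetical reconstruction
namespace :db do
  desc "Populate the database with semi-realistic sample data"
  task :populate => :environment do
    # Start from a clean slate
    Letter.delete_all

    # Acceptable ranges for each field, expanded into arrays
    volumes        = (8000..100000).to_a
    response_rates = (1..95).to_a
    variances      = (-10..10).to_a
    clients        = ["Acme Corp", "Globex", "Initech"]

    1000.times do
      Letter.create!(
        :client        => clients.rand,
        :volume        => volumes.rand,
        :response_rate => response_rates.rand,
        :variance      => variances.rand
      )
    end
  end
end
```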
This script adds 1000 records to my database that are representative of
what real production data would look like. The quantity of data is
obviously easily adjusted up or down as needed.
This is just a standard rake task that you can drop inside lib/tasks.
Most of this is fairly standard Ruby code and not very interesting, but
let’s look closer at what makes this work.
The first portion of the script does some setup work, deleting existing
data. Then it sets up a series of arrays for the values that will be
used for individual fields. For example, the volumes variable:
volumes = (8000..100000).to_a
This creates an array of integers containing every number between 8000
and 100000. Response rates and variances are set up similarly, as are
the client names.
In the loop that generates the actual data, we then call the
rand() function on these arrays to select a value from our
range. This function isn’t a standard part of the Ruby Array class; it’s
actually added to the class by Rails’ ActiveSupport core extensions.
Using this method makes it very easy to generate test data within
predefined acceptable ranges.
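In current Ruby you can get the same effect without any extension, since Array#sample is built in from 1.9 onward; a minimal sketch:

```ruby
# Expand a range of acceptable values into an array, then pick one at random.
volumes = (8000..100000).to_a
value   = volumes.sample

# The picked value is always inside the predefined range.
raise "out of range" unless (8000..100000).include?(value)
```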
For another take on this topic, see the EdgeCase blog.
I’ve been a bit heads-down lately, working on a super-secret project in
Ruby. More on that in the near future, but in the meantime I wanted to
share a few things that I’ve started using.
When I started my new project, I wanted to try one of the new testing
frameworks for Ruby. The problem is there are a number to choose from.
What to do…
I settled on Shoulda. I
wish I could tell you that this was a rigorous process, that I evaluated
each framework carefully, learning about each one’s strengths and
weaknesses. I did not; I cheated. You see, a while back, Josh Susser did
just that thing. He called it The Great Test Framework
He settled on Shoulda, so that’s what I went with.
Shoulda is developed by Tammer Saleh of ThoughtBot, a shop with a number
of other really nice projects.
Shoulda’s tagline is “Making Tests Easy on the Fingers and Eyes”, and it
lives up to that goal. It has a very nice syntax for developing tests,
including a complete set of macros for testing controllers and models.
It’s a joy to use. Here’s what it looks like (both samples taken from
the Shoulda README):
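The sample itself is missing from this copy, so here is a sketch in the README’s style; the User model and its assertions are illustrative assumptions:

```
class UserTest < Test::Unit::TestCase
  context "a User instance" do
    setup do
      @user = User.new(:first_name => "Fred", :last_name => "Flintstone")
    end

    should "return its full name" do
      assert_equal "Fred Flintstone", @user.full_name
    end
  end
end
```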
Here’s a sample of the ActiveRecord macros in action:
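Again a sketch rather than the original sample, assuming a typical Post model; the macro names follow Shoulda’s documented model macros, though exact names varied across versions:

```
class PostTest < Test::Unit::TestCase
  should_belong_to :user
  should_have_many :comments
  should_validate_presence_of :title, :body
end
```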
So what’s the big deal? Well, it’s easier to read for one. Instead of
horrendous method names like test_should_do_this_but_not_that, you
get to write English: should “do this but not that”. The macros in
Shoulda also let you test your models and controllers easily.
Pivotal Tracker is an Agile project management tool, developed by the
folks at Pivotal Labs. It lets you create
projects and track releases, stories, and defects. The beauty of Tracker
is its all-on-one-screen user interface. It lets you see everything at a
glance, and even provides keyboard shortcuts for common tasks. I’m not
alone in my admiration of Tracker; it seems to be extremely popular
among the Rails consulting shops (Hashrocket, among others).
While Tracker is powerful enough to be used for large multi-developer
projects, it also happens to be perfect for managing your side
projects. Enter the features you want, organize them into releases, and
just click start to begin the first one. Click finish when you’re done,
and move on to the next one. Easy peasy. Did I mention it’s free?
Be sure to check out the
screencast, which gives a
nice overview of the application.
John Nunemaker is a prolific Ruby and Rails developer, as witnessed by a
quick glance at his Github page. One of
his most recent projects is HTTParty, which makes it dead-simple to
consume REST apis using Ruby. Here’s what it looks like:
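The example code didn’t make it into this copy, so here is a sketch along the lines of HTTParty’s own README; the Twitter endpoint is illustrative:

```
require 'httparty'

class Twitter
  include HTTParty
  base_uri 'twitter.com'

  def self.public_timeline
    get('/statuses/public_timeline.json')
  end
end

# The response comes back already parsed into Ruby hashes and arrays,
# so you can index straight into it:
Twitter.public_timeline.first['text']
```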
HTTParty automatically detects whether the response is JSON or XML and
parses it appropriately. It really doesn’t get much easier than that.
There’s also a nice command-line app bundled with the gem that lets you
call RESTful web services easily from the command-line, with a few more
bells and whistles than curl.
Sinatra is a great, compact web framework similar in concept to Why the
Lucky Stiff’s Camping framework. It makes it trivial to create a web
application in just a few lines of code. It was originally written by
Blake Mizerany to allow for creating lightweight web services, but has
since become quite popular as a web framework to use when Rails might be
overkill.
It’s easy to create simple test applications for libraries, but also
robust enough to create full-blown websites with. Check out the Sinatra
website and the Sinatra
book for more details.
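A complete Sinatra application really can be just a few lines; this hello-world follows the style of Sinatra’s own documentation:

```
require 'sinatra'

# GET / responds with a plain-text greeting
get '/' do
  'Hello, world!'
end
```

Run it with ruby app.rb and Sinatra serves it on port 4567 by default.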
What tools have you discovered lately?
Peter Cooper (whom I interviewed) has just announced SwitchPipe, which
aims to make deploying and hosting Rails (and other frameworks, such as
Django) applications easy. I haven’t spent much time with SwitchPipe
yet, but if it lives up to Peter’s claims it will dramatically simplify
hosting. From the site:
SwitchPipe is a proof of concept “Web application server” developed in
Ruby. More accurately, it’s a Web application process manager and
request dispatcher / proxy. Backend HTTP-speaking applications (Web
applications) do not run directly within SwitchPipe, but are loaded into
their own processes, making SwitchPipe language- and framework-agnostic.
SwitchPipe takes control of, and manages, the backend application
processes, including loading and proxying to multiple instances of each
application in a round-robin style configuration. As an administrator,
you can define the maximum number of backend processes to run for each
app, along with other settings, so that you do not exceed preferred
resource limits. SwitchPipe quickly removes processes that “break” or
otherwise outlive their welcome. For example, you can let SwitchPipe
kill any backend processes that have not been accessed for, say, 20
seconds. This makes hosting many Rails applications, for example, a
quick and non-memory-demanding process, ideal for shared hosting.
SwitchPipe’s goal is to be:
* super easy to configure
* the easiest way to deploy multiple HTTP-talking backend applications
* painless in terms of management; no hand-holding of different
applications is needed
* a permanent daemon that can handle configuration changes in backend
apps “on the fly”
* a reliable solution on Linux and OS X (and anything POSIX compliant)
What’s interesting to note is that this originated with Peter’s widely
read post on why such a thing was needed. Unlike a lot of other people
who have
complained loudly about the state of Rails on shared hosting
environments, Peter put his time and talents towards creating a solution
which he then released within 3 weeks. This is
definitely something we need more of.
So what are your thoughts? Is this the solution we’ve been waiting for?
Initial performance numbers would seem to indicate that Ruby 1.9 (due by
Christmas) will be lots faster.
If you spend a lot of time in IRB (most of us probably do), it’s worth
taking the time to learn how to customize it. This is a good start.
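As a taste, here are a couple of common ~/.irbrc tweaks; the option keys are standard IRB configuration settings:

```ruby
# ~/.irbrc
require 'irb'                  # make IRB.conf available
require 'irb/completion'       # enable tab completion

IRB.conf[:SAVE_HISTORY] = 1000 # persist command history across sessions
IRB.conf[:AUTO_INDENT]  = true # auto-indent inside blocks and methods
```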
Nice clean library to generate fake data. The home page says it’s a port
of Perl’s Data::Faker library, which I’d never even heard of.