Fog, in case you haven’t heard of it, is a fantastic
cloud computing library written in Ruby. It provides a unified interface
to several popular cloud computing platforms (including Amazon,
Rackspace, Linode, and others), making it easy to interact with them
from Ruby. It currently supports four types of cloud services: storage,
compute, DNS, and CDN. Fog has become very popular lately, and serves as
the backbone for Chef’s cloud computing functionality, which is how I
first became aware of it.
I recently used Fog to write a backup script in Ruby to automatically
send encrypted database backups from a database server running at
Rackspace to Amazon’s S3 storage service. Here’s how I did it.
Overview
My script runs as the second step in a process. The first step is a
shell script that calls pg_dump to dump a PostgreSQL database and then
encrypts the file using GnuPG, dropping it in a backup directory on
the database server.
My Fog-based script’s job is to make sure that all of the files in the
backup directory get moved to S3.
Writing Files
Fogsync (my script) looks at all of the files in that directory and
makes sure that they all exist in a bucket on S3. If they don’t, it
copies them up there. Additionally, it deletes old backups from S3. For
this customer, we keep backups for 14 days, so all backups older than
that get deleted.
Let’s look at how it works:
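require 'fog'

# Connect to S3; MY_ACCESS_KEY and MY_SECRET hold your AWS credentials.
fog = Fog::Storage.new(
  :provider              => 'AWS',
  :aws_access_key_id     => MY_ACCESS_KEY,
  :aws_secret_access_key => MY_SECRET
)
# Look up the bucket that holds the backups.
directory = fog.directories.get(MY_DIRECTORY)
Dir["/var/backup/*.gpg"].each do |file|
  name = File.basename(file)
  # head fetches only metadata, so this check is cheap even for large files.
  unless directory.files.head(name)
    File.open(file) do |io|
      directory.files.create(:key => name, :body => io)
    end
  end
end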
Here’s what this snippet does:
- Creates a connection to AWS. The syntax is basically the same for
  connecting to all of the cloud platforms; only the parameter names
  change.
- Uses ‘head’ to check if the file exists and, optionally, get some
  metadata about it (size, modify date, etc.). Think of this as the cloud
  equivalent of the Unix stat command. You don’t want to use the ‘get’
  command, as that will return the whole file, which would take a very
  long time if the files are large (*cough* voice of experience *cough*).
- Creates the file in the given directory (“bucket” in S3 terms) if it
  doesn’t exist already.
If you’ve used S3, you’ll notice that Fog uses slightly different terms
for things than S3 does. Because Fog works across a number of different
storage providers, it uses more general terms. This might be
confusing at first if you’re familiar with a specific provider’s
nomenclature, but the tradeoff is that if you want to move from one
provider to another, the only thing you have to change is the code that
sets up the connection (the call to Fog::Storage.new() in this example).
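To make the provider-swap point concrete, here is a minimal sketch of keeping the connection settings in one place. The `storage_options` helper and the environment variable names are my own inventions for illustration; the Rackspace parameter names are the ones I believe Fog expects, so check them against the Fog documentation for your version.

```ruby
# Hypothetical helper: returns the options hash for Fog::Storage.new.
# Only this hash differs between providers; credentials come from
# made-up environment variable names.
def storage_options(provider, env = ENV)
  case provider
  when 'AWS'
    { :provider              => 'AWS',
      :aws_access_key_id     => env['AWS_ACCESS_KEY_ID'],
      :aws_secret_access_key => env['AWS_SECRET_ACCESS_KEY'] }
  when 'Rackspace'
    { :provider           => 'Rackspace',
      :rackspace_username => env['RACKSPACE_USERNAME'],
      :rackspace_api_key  => env['RACKSPACE_API_KEY'] }
  else
    raise ArgumentError, "unsupported provider: #{provider}"
  end
end

# With the fog gem loaded, the rest of the script would stay the same:
#   fog = Fog::Storage.new(storage_options('Rackspace'))
#   directory = fog.directories.get(MY_DIRECTORY)
```

The directory and file calls that follow the connection do not need to know which provider is behind it.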
Deleting Files
require 'date'

oldest = Date.today - 14  # cutoff date: we keep 14 days of backups
directory = fog.directories.get(MY_DIRECTORY)
directory.files.each do |f|
  file_date = Date.parse(f.last_modified.to_s)
  # Delete anything last modified before the cutoff.
  f.destroy if file_date < oldest
end
This is fairly straightforward as well. Get all the files in the
directory and check their age, deleting the ones that are older than we
want to keep.
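The age check is the part most worth testing on its own, since a mistake there deletes the wrong backups. Here is a small sketch that pulls it into a function; `expired?` and `RETENTION_DAYS` are names I made up for this example, and it assumes the timestamp prints in a form `Date.parse` understands (as `Time` and `Date` values do).

```ruby
require 'date'

RETENTION_DAYS = 14 # hypothetical constant; matches the 14-day policy above

# True when a file last modified at `last_modified` is older than the
# retention window. `today` is a parameter so the check can be tested
# with fixed dates.
def expired?(last_modified, today = Date.today, keep_days = RETENTION_DAYS)
  Date.parse(last_modified.to_s) < (today - keep_days)
end
```

Inside the loop this would be used as `f.destroy if expired?(f.last_modified)`. A backup that is exactly 14 days old is kept; only strictly older ones are removed.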
So that, in a nutshell, is how to use Fog. This is a simplified example,
of course; in my production code the parameters are all pulled from
configuration files, and the script emails a report of what it did, in
addition to having a lot more error handling.
If you do any scripting with cloud computing, you owe it to yourself to
check out Fog.
Generating semi-realistic test data for an application can be a pain. If
the data already exists, as in the case of an upgrade to an existing
system, you can generally create data based on the existing database.
But what if you need a large sample of data for a brand new system? If
you have simple data requirements, there are some Ruby gems that can
help you out. Faker is one such gem,
which lets you generate realistic names, addresses and phone numbers.
But what do you do for things that are a little less typical? Things
like scores, ratings, ages, dates, etc. I needed to do this recently for
a prototype I built of a system to generate letters. Here’s the Rake
task I ended up with:
This script adds 1000 records to my database that are representative of
what real production data would look like. The quantity of data is
obviously easily adjusted up or down as needed.
This is just a standard rake task that you can drop inside lib/tasks.
Most of this is fairly standard Ruby code and not very interesting, but
let’s look closer at what makes this work.
The first portion of the script does some setup work, deleting existing
data. Then it sets up a series of arrays for the values that will be
used for individual fields. For example, the volumes variable:
volumes = (8000..100000).to_a
This creates an array of integers containing every number between 8000
and 100000. Response rates and variances are set up similarly, as are
the client names.
In the loop that generates the actual data, we then call the rand()
function on these arrays to select a random value from each range. This
function isn’t a standard part of the Ruby Array class; it’s actually
added to the class by ActiveSupport.
Using this method makes it very easy to generate test data within
predefined acceptable ranges.
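Here is a minimal sketch of the approach, not the Rake task itself. It builds plain hashes instead of database records so it runs without Rails, and it uses Ruby’s built-in `Array#sample` (available since 1.9) in place of ActiveSupport’s `rand`. The `response_rates` and `clients` values and the `build_records` helper are made up for illustration.

```ruby
volumes        = (8000..100000).to_a
response_rates = (1..20).to_a                  # hypothetical range, in percent
clients        = ['Acme', 'Globex', 'Initech'] # hypothetical client names

# Build `count` records, picking each field at random from its array
# of acceptable values.
def build_records(count, volumes, response_rates, clients)
  Array.new(count) do
    { :client        => clients.sample,
      :volume        => volumes.sample,
      :response_rate => response_rates.sample }
  end
end

records = build_records(1000, volumes, response_rates, clients)
```

In a real task, each hash would be passed to a model’s `create` call instead of being collected in an array.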
For another take on this topic, see the EdgeCase blog.
I’ve been a bit heads-down lately, working on a super-secret project in
Ruby. More on that in the near future, but in the meantime I wanted to
share a few things that I’ve started using.
Shoulda
When I started my new project, I wanted to try one of the new testing
frameworks for Ruby. The problem is there are a number to choose from.
What to do…
I settled on Shoulda. I
wish I could tell you that this was a rigorous process, that I evaluated
each framework carefully, learning about each one’s strengths and
weaknesses. I did not; I cheated. You see, a while back, Josh Susser did
just that. He called it The Great Test Framework
Dance-off.
He settled on Shoulda, so that’s what I went with.
Shoulda is developed by Tammer Saleh of ThoughtBot, who have a number of
other really nice projects.
Shoulda’s tagline is “Making Tests Easy on the Fingers and Eyes”, and it
lives up to that goal. It has a very nice syntax for developing tests,
including a complete set of macros for testing controllers and models.
It’s a joy to use. Here’s what it looks like (both samples taken from
the Shoulda README):
Nice, right?
Here’s a sample of the ActiveRecord macros in action:
Beautiful.
So what’s the big deal? Well, it’s easier to read for one. Instead of
horrendous method names like test_should_do_this_but_not_that, you
get to write English: should “do this but not that”. The macros in
Shoulda also let you test your models and controllers easily.
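If you’re curious how a `should "..."` line can stand in for a method name, here is a toy sketch of the general idea. This is not Shoulda’s implementation; `ShouldMacro` and `PostTest` are names I made up, and the class is a plain Ruby class rather than a real test case so the example runs on its own.

```ruby
# Toy stand-in for a `should` macro: it turns an English description
# plus a block into a method whose name contains the description.
module ShouldMacro
  def should(description, &block)
    define_method("test: should #{description}", &block)
  end
end

class PostTest
  extend ShouldMacro

  should "do this but not that" do
    1 + 1 == 2
  end
end
```

The description ends up in the method name, so a test runner that reports method names would print readable English.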
Pivotal Tracker
Pivotal Tracker is an Agile project management tool, developed by the
folks at Pivotal Labs. It lets you create
projects, track releases, stories, and defects. The beauty of Tracker is
its all-on-one-screen user interface. It lets you see everything at a
glance, and even provides keyboard shortcuts for common tasks. I’m not
alone in my admiration of Tracker; it seems to be extremely popular
among the Rails consulting shops (Hashrocket,
for one).
While Tracker is powerful enough to be used for large multi-developer
projects, it also happens to be perfect for managing your side
projects. Enter the features you want, organize them into releases, and
just click start to begin the first one. Click finish when you’re done,
and move on to the next one. Easy peasy. Did I mention it’s free?
Be sure to check out the
screencast, which gives a
nice overview of the application.
HTTParty
John Nunemaker is a prolific Ruby and Rails developer, as witnessed by a
quick glance at his GitHub page. One of
his most recent projects is HTTParty, which makes it dead-simple to
consume REST APIs using Ruby. Here’s what it looks like:
HTTParty automatically detects whether the response is JSON or XML and
parses it appropriately. It really doesn’t get much easier than that.
There’s also a nice command-line app bundled with the gem that lets you
call RESTful web services easily from the command line, with a few more
bells and whistles than curl.
Sinatra
Sinatra is a great, compact web framework similar in concept to Why the
Lucky Stiff’s Camping framework. It makes it trivial to create a web
application in just a few lines of code. It was originally written
to allow for creating lightweight web services, but has since become
quite popular as a web framework to use when Rails might be overkill.
It makes it easy to create simple test applications for libraries, but is
also robust enough to build full-blown websites with. Check out the Sinatra
website and the Sinatra
book for more details.
What tools have you discovered lately?
Peter Cooper (whom I interviewed recently) has just announced
SwitchPipe, which aims
to make deploying and hosting Rails (and other frameworks, such as
Django) applications easy. From the site:
Introduction / Overview\
SwitchPipe is a proof of concept "Web application server" developed in
Ruby. More accurately, it's a Web application process manager and
request dispatcher / proxy. Backend HTTP-speaking applications (Web
applications) do not run directly within SwitchPipe, but are loaded into
their own processes making SwitchPipe language and framework agnostic.\
SwitchPipe takes control of, and manages, the backend application
processes, including loading and proxying to multiple instances of each
application in a round-robin style configuration. As an administrator,
you can define the maximum number of backend processes to run for each
app, along with other settings so that you do not exceeded preferred
resource limits. SwitchPipe quickly removes processes that "break" or
otherwise outlive their welcome. For example, you can let SwitchPipe
kill any backend processes that have not been accessed for, say, 20
seconds. This makes hosting many multiple Rails applications, for
example, a quick and non-memory demanding process, ideal for shared
hosting environments.\
...\
SwitchPipe's goal is to be:
* super easy to configure
* the easiest way to deploy multiple HTTP-talking backend
applications
* painless in terms of management; no hand-holding of different
applications is needed
* a permanent daemon that can handle configuration changes in backend
apps “on the fly”
* a reliable solution on Linux and OS/X (and anything POSIX
compatible, ideally)
I haven't spent much time with SwitchPipe yet, but if it lives up to
Peter's claims this will dramatically simplify hosting
Rails/Django/Camping/whatever applications.\
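To make the round-robin and idle-timeout ideas from that description concrete, here is a toy sketch in Ruby. It is not SwitchPipe's code and does not start or kill any processes; `BackendPool`, the port numbers, and the method names are all made up for illustration.

```ruby
# Toy model of a dispatcher's bookkeeping: rotate through backend
# ports and remember when each one was last handed a request.
class BackendPool
  def initialize(ports, idle_timeout = 20)
    @ports = ports
    @idle_timeout = idle_timeout # seconds a backend may sit unused
    @last_used = {}
    @next = 0
  end

  # Pick the next backend in rotation and record when it was used.
  def checkout(now = Time.now)
    port = @ports[@next % @ports.size]
    @next += 1
    @last_used[port] = now
    port
  end

  # Ports that have been used before but not within the idle timeout;
  # a real manager would shut these backends down.
  def idle(now = Time.now)
    @last_used.select { |_port, used_at| now - used_at > @idle_timeout }.keys
  end
end
```

A real process manager would also have to spawn the backends, proxy the HTTP traffic, and handle crashes, which is where the hard work is.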
What's interesting to note is that this originated with Peter's
[widely read article](http://www.rubyinside.com/no-true-mod_ruby-is-damaging-rubys-viability-on-the-web-693.html)
on why such a thing was needed. Unlike a lot of other people who have
complained loudly about the state of Rails on shared hosting
environments, Peter put his time and talents towards creating a solution
which he then released within 3 weeks. This is
definitely something we need more of.\
So what are your thoughts? Is this the solution we've been waiting for?
Initial performance numbers would seem to indicate that Ruby 1.9 (due by
Christmas) will be lots faster.
If you spend a lot of time in IRB (most of us probably do), it’s worth
taking the time to learn how to customize it. This is a good start.
Nice clean library to generate fake data. The home page says it’s a port
of Perl’s Data::Faker library, which I’d never even heard of.
A plugin to do OpenID authentication in Rails, in
a RESTful way.
Competition is good. Merb and the like provide
that competition to Rails. This article runs through an alternative to
the Rails stack. It’s always good to keep an eye on what else is out
there.
OK, this is a bonus link. Not at all Rails related, but relevant to you
if you’re reading this. Rands nails the
Nerd. I mean, really nails it.
A new book from O’Reilly on troubleshooting Ruby (and Rails) apps. From
the overview:
This short cut introduces key system diagnostic tools to Ruby
developers creating and deploying web applications. When programmers
develop a Ruby application they commonly experience complex problems
which require some understanding of the underlying operating system to
be solved. Difficult to diagnose, these problems can make the
difference between a project’s failure or success. This short cut
demonstrates how to leverage system tools available on Mac OS X,
Linux, Solaris, BSD or any other Unix flavor. You will learn how to
leverage the raw power of tools such as lsof, strace or gdb to resolve
problems that are difficult to diagnose with the standard Ruby
development tools. You will also find concrete examples that
illustrate how these tools solve real-life problems in Ruby
development. This expertise will prove especially relevant during the
deployment phase of your application. In this way, should your
production Mongrel cluster freeze and stop serving HTTP requests, it
will not take you 2 days to figure out why!
A nice, if a bit short, article on some of the changes that are coming
in Rails 2.0. This is focused on what you will need to change in your
application.
This is a beginner tutorial, specific to using NetBeans 6.0. I’ve not
played much with the Rails support in NetBeans, but it looks impressive
so far.
Jeremy Kemper recently committed a request
profiler
to Rails. It lets you make a request to a URL repeatedly, and then see
an HTML or text report of where your code is spending its time. This
looks very handy.
A walkthrough of building an app with Rails, which includes feature
definition, using Piston to manage
plugins, and Restful Authentication. Nice.
The Halloween Edition
One of the first tutorials I’ve seen that focuses on Rails 2.0.
This would seem to make deploying a Rails app on Amazon’s
EC2
very simple:
EC2 on Rails is an Ubuntu Linux server image for Amazon’s EC2
hosting service that’s ready to run a standard Ruby on Rails
application with little or no customization. It’s a Ruby on Rails
virtual appliance. If you have an EC2 account and a public keypair
you’re five minutes away from deploying your Rails app.
A collection of Rails links
This is a nice step-by-step article on integrating
PayPal with your Rails application, using
ActiveMerchant.
I’ve only skimmed over the new features in the upcoming 2.0 release of
Rails, but this looks like one of the nicest features. This is a good
explanation of how it works and why it’s useful.
A bugfix release of Mongrel is out. Looks like 1.1 is due soon, and it
looks interesting:
“Mongrel 1.1 is coming real soon now with JRuby support and a few
other things.”
Being a bit of an Emacs junky, I’m not sure how I missed this. Looks
mature, and very functional, and almost TextMate-like. The link has a
nice Flash video of Emacs on Rails in action.
Sitepoint’s book “Build Your Own Ruby on Rails Web Applications” is now
free, at least for the next month. I’ve only skimmed it, but it looks
like a decent introduction, and the price is certainly right.