I often use Rakefiles like traditional Makefiles: I specify the immediate dependencies of each task and Rake executes all of them. If a dependency is a file or directory that already exists, the task that creates it is skipped. A contrived Rakefile example might look like:

file 'sample' do |t|
  puts 'Creating sample directory'
  Dir.mkdir(t.name)
end

file 'sample/population.txt' => ['sample'] do |t|
  puts 'Creating sample population file...'
  # Perhaps download a dataset? Let's just create the file
  File.write(t.name, "---> Very important data <---\n")
end

task :process_population => ['sample/population.txt'] do
  puts 'Check out our data!'
  # Do some processing... whatever you need to...
  puts File.read('sample/population.txt')
end

The first time you run it you’ll see the following output:

$ rake process_population
Creating sample directory
Creating sample population file...
Check out our data!
---> Very important data <---

Subsequent runs skip the creation steps, since the directory and file are already present:

$ rake process_population
Check out our data!
---> Very important data <---
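
The skipping behaviour can be observed directly by asking a file task whether it is needed. This is a minimal sketch, assuming the `rake` gem is installed; `needed_before_and_after` is a hypothetical helper written for this illustration, and it uses a temporary directory rather than the `sample` directory above.

```ruby
require 'rake'
require 'tmpdir'

# Hypothetical helper for this demo: define a file task for `path`, then
# report whether Rake considers it needed before and after invoking it.
def needed_before_and_after(path)
  # Rake::FileTask.define_task is what the `file` DSL method calls internally.
  task = Rake::FileTask.define_task(path) do |t|
    File.write(t.name, "---> Very important data <---\n")
  end

  before = task.needed? ? true : false  # the file does not exist yet
  task.invoke                           # runs the block, creating the file
  after = task.needed? ? true : false   # the file exists, nothing is newer

  [before, after]
end

Dir.mktmpdir do |dir|
  before, after = needed_before_and_after(File.join(dir, 'population.txt'))
  puts "needed before invoke: #{before}"
  puts "needed after invoke:  #{after}"
end
```

The task should report that it is needed before the file exists and not needed once it does, which is why the second run above prints only the processing output.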

This is fine for files with static contents, but what if you need additional information to generate the file? With a normal Rake task you can provide bracketed arguments to pass in additional information like so:

task :args_example, :word do |t, args|
  puts "The word is: #{args.word}"
end

You’d use it like so:

$ rake args_example[data]
The word is: data
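
If the bracketed argument is left off, `args.word` is simply `nil`. As a small sketch, a fallback can be supplied with `with_defaults`; the task name `args_with_default` is a hypothetical one made up for this illustration.

```ruby
require 'rake'

# Hypothetical variant of the task above with a fallback value.
task :args_with_default, [:word] do |t, args|
  args.with_defaults(word: 'data')  # applies only when no argument was given
  puts "The word is: #{args.word}"
end
```

Running `rake args_with_default` should then behave as if `data` had been passed, while `rake args_with_default[other]` uses the value given.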

That information isn’t made available to the tasks it depends on, though, so we need to broaden our scope a little. Rake also accepts arguments as key/value pairs on the command line and exposes them to every task through environment variables, a solution that felt obvious once I found it. Another contrived example, this time with a file dependency:

file 'passed_state' do |t|
  puts 'Creating state file'
  File.write(t.name, ENV['state'])
end

task :read_state => ['passed_state'] do
  puts File.read('passed_state')
end

$ rake read_state state=something
Creating state file
something

State has been transferred! There is a gotcha, though: you have to handle expiration of the data yourself. Pass in a different value for state and you’ll see the problem:

$ rake read_state state=notsomething
something

Rake won’t recreate the file until it’s removed, so the old value sticks around; removing it is something you’ll need to handle on your own.
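
One way to handle that expiration, as a minimal sketch rather than a built-in Rake feature: delete the state file while the Rakefile loads whenever the value passed on the command line differs from what is stored. `expire_state_file` is a hypothetical helper introduced here; the tasks are the same as above.

```ruby
require 'rake'

# Hypothetical helper (not part of Rake): remove the state file when the
# value passed on the command line differs from the stored one, so the
# file task below sees it as missing and runs again.
# Returns true when the file was removed, false otherwise.
def expire_state_file(path, new_value)
  return false if new_value.nil?           # no state passed; keep the file
  return false unless File.exist?(path)    # nothing to expire
  return false if File.read(path) == new_value

  File.delete(path)
  true
end

# Runs while the Rakefile loads, before any task is invoked.
expire_state_file('passed_state', ENV['state'])

file 'passed_state' do |t|
  puts 'Creating state file'
  File.write(t.name, ENV['state'])
end

task :read_state => ['passed_state'] do
  puts File.read('passed_state')
end
```

With this in place, `rake read_state state=notsomething` should recreate the file with the new value, while running `rake read_state` without a `state` pair leaves an existing file untouched.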