EzRor blog

Upload videos with Paperclip, generate thumbnails, and send to S3 using delayed job - PART 1

Paperclip is a very popular Rails attachment plugin. It's agile, well maintained and easy to set up. There are numerous tutorials out there (and I've read them all) that deal with Paperclip, video thumbnails, Amazon S3 and delayed job, but it took me a bit of time to get them all working nicely together.

I'm hoping this three-part post will save you the time and trouble it took me to get them all working together. Instead of just giving you the code, we'll build a simple uploading app together. I'll also put the code up on GitHub at the end.

The Plan

The first part deals mostly with installing the various plugins, gems and software involved, and with getting Paperclip and delayed job installed and configured. The second part will deal with thumbnail generation and Amazon S3. The third and final part will deal with installing it on a production server. A lot of tutorials don't cover the production server, and I've run into numerous troubles getting it up and running, so hopefully this will help you.

I consulted numerous blogs and discussion posts out there to get this working. All credits go to the original authors: thewebfellas, aaronvb, scottmotte, traz and countless others.

Required Gems/plugins/software

aws-s3 or right_aws - Ruby library to interact with the Amazon Simple Storage Service API. Either library is fine, I just personally prefer aws-s3.

delayed_job (collectiveidea version) - Library extracted from Shopify for processing jobs in the background. I never could get starling/workling working for me. We'll be using collectiveidea's fork, since it's the most up to date one. There are 346 forks of delayed job on GitHub, so do feel free to play around.

nifty-generators - Ryan Bates' excellent scaffold generator (OPTIONAL). You can use Rails' built-in scaffold generator, but do check out this gem. I think you'll like it.

Rails 2.3.5 - 3.0 version to come soon.

thoughtbot-paperclip - Excellent file management library by the good folks at thoughtbot. It's actively maintained and the people involved with it and around it are very helpful.

FFMPEG - We'll be using this to create thumbnails of our videos. It's amazing how much this software can do, actually.

Getting Up and Running

$ rails paperclip_aws_delayed
$ cd paperclip_aws_delayed

Let's use Ryan Bates' nifty_scaffold (part of nifty_generators) to create a simple Video scaffold with title and processing fields. Note the processing field; you'll see why it's used later.

$ script/generate nifty_scaffold Video title:string processing:boolean --skip-migration

I usually like to create one master migration file in development and add everything by hand, hence the skip migration part. It gets boring switching through 10 different migration files.

$ script/generate migration InitialSchema

#db/migrate/(..).rb
class InitialSchema < ActiveRecord::Migration
  def self.up
    create_table :videos do |t|
      t.string :title
      t.boolean :processing
      t.timestamps
    end
  end

  def self.down
    drop_table :videos
  end
end

Let's not migrate the database just yet. Paperclip requires certain fields in the database to store attachment information. First we need to add Paperclip's has_attached_file to the Video model. We'll call the attachment source, but you can call it anything you want (attachments, video, photo, picture, etc.)

Here's the link to rdocs. I personally have the gem server running when in development for rdocs.

To start your gem server, run gem server and point your web browser to localhost:8808. If you don't see the Paperclip rdocs, run "gem install paperclip" without any flags. The --no-ri and --no-rdoc flags skip installing the respective documentation files. Installing them slows down the installation a little bit, but I feel it's worth it.

Install Paperclip

Just add the gem to your environment.rb file and run (sudo) rake gems:install

#config/environment.rb
config.gem "thoughtbot-paperclip"

$ sudo rake gems:install

Now we have to specify which model has the attachment.

#models/video.rb
has_attached_file :source

Paperclip requires certain fields to work, which is why we didn't migrate the database before.

#db/migrate/(..).rb
def self.up
  create_table :videos do |t|
    t.string :title
    t.string :source_content_type
    t.string :source_file_name
    t.integer :source_file_size
    t.datetime :source_updated_at
    t.boolean :processing
    t.timestamps
  end
end
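
If you call the attachment something other than source, the column names change with it. Here's a minimal plain-Ruby sketch of the naming pattern used in the migration above. Note that paperclip_columns and PAPERCLIP_SUFFIXES are names I made up for illustration; they are not part of Paperclip.

```ruby
# Hypothetical helper, not part of Paperclip: lists the columns that an
# attachment with the given name expects, following the pattern above.
PAPERCLIP_SUFFIXES = {
  "content_type" => :string,
  "file_name"    => :string,
  "file_size"    => :integer,
  "updated_at"   => :datetime
}

def paperclip_columns(attachment_name)
  PAPERCLIP_SUFFIXES.map do |suffix, type|
    ["#{attachment_name}_#{suffix}", type]
  end
end

# Print migration-style lines for an attachment called :source
paperclip_columns(:source).each do |column, type|
  puts "t.#{type} :#{column}"
end
```

For :source this prints the same four attachment columns as the migration, so it's a quick way to double check the names if you pick a different attachment name.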

Now we need to change the video/_form to add an upload field and change form_for to accept multipart. The multipart option sets the enctype to multipart/form-data, which is what lets the form upload files.

<% form_for @video, :html => { :multipart => true } do |f| %>
  <%= f.file_field :source %>
<% end %>

At this point, if you fire up your server and try to upload, nothing will be saved, yet you won't see any errors either. However, if you look at the logs, you'll see a warning along the lines of "Can't mass-assign these protected attributes: source".

nifty-generators was nice enough to prevent mass assignment in our application by adding the line attr_accessible. Add :source next to it and you'll see your new upload form works.

#models/video.rb
attr_accessible :title, :processing, :source
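
If the whitelist idea is new to you, here's a rough plain-Ruby sketch of what's going on. TinyVideo is a made-up class for illustration and is not how ActiveRecord implements it; the point is only that keys missing from the list are quietly dropped, which is why the upload failed without an error.

```ruby
# Plain-Ruby stand-in for the whitelist idea behind attr_accessible.
# TinyVideo and its methods are hypothetical; this is not ActiveRecord.
class TinyVideo
  ACCESSIBLE = [:title, :processing, :source]

  attr_reader :attributes, :rejected

  def initialize(params)
    @attributes = {}
    @rejected   = []
    params.each do |key, value|
      if ACCESSIBLE.include?(key.to_sym)
        @attributes[key.to_sym] = value
      else
        # Not on the whitelist: dropped instead of raising an error
        @rejected << key.to_sym
      end
    end
  end
end

video = TinyVideo.new(:title => "Demo", :source => "clip.flv", :admin => true)
puts video.attributes.inspect
puts video.rejected.inspect
```

Take :source out of ACCESSIBLE and it ends up in rejected, just like our first upload attempt.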

Install FLV player

There are numerous free FLV players out there, so feel free to choose whichever one you like. Here are a couple of suggestions: flv player, jw player, osflv

Install Delayed Job

Say you're sending the user's uploaded videos to Amazon S3, but you don't want the user to wait. You can use delayed_job to push the job to the background, so the application doesn't hang until it finishes the job.

First we'll need to install it. There are numerous forks out there, but collectiveidea's is by far the best.

$ script/plugin install git://github.com/collectiveidea/delayed_job.git

I'm assuming you have git installed on your system; otherwise the above command won't work. Either install git (see here for instructions) or manually download the plugin and put it in your vendor/plugins folder.

delayed_job comes with a generator which creates a delayed_jobs table to store the jobs. Instead of running it, let's just copy and paste the following code into the initial schema migration.

#db/migrate/(..).rb
create_table :delayed_jobs, :force => true do |table|
  table.integer  :priority, :default => 0       
  table.integer  :attempts, :default => 0      
  table.text     :handler                      
  table.text     :last_error                   
  table.datetime :run_at                       
  table.datetime :locked_at                    
  table.datetime :failed_at                    
  table.string   :locked_by                  
  table.timestamps
end

delayed_job also comes with two important rake tasks: one to start delayed job and one to delete unprocessed ones.

$ rake jobs:work
$ rake jobs:clear

Configure delayed_job

Now that we've installed delayed job, let's actually put it to use. Once the user uploads a video, we want to push the thumbnail generation job and the upload to S3 job to the background, so the user doesn't have to wait for the application to finish those jobs. Instead he/she can keep browsing while those jobs process in the background.

Let's start with thumbnail generation first. Paperclip comes with a before_source_post_process callback (the name is built from the attachment name). If the callback returns false, Paperclip stops the processing.

#models/video.rb
before_source_post_process do |video|
  if video.source_changed?
    false
  end
end

Notice the source_changed? method above. It checks whether any of the source's attributes have changed, and if they have, the callback stops Paperclip from post-processing during the save so we can do it via delayed job instead. This is mostly for when the user updates the source, i.e. uploads a new attachment or the same one with different attributes.

#models/video.rb
def source_changed?
  self.source_file_size_changed? ||
  self.source_file_name_changed? ||
  self.source_content_type_changed? ||
  self.source_updated_at_changed?
end
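
One detail worth spelling out: when nothing has changed, the block above returns nil, not true. Here's a small plain-Ruby sketch of the veto behaviour, assuming (as I understand it) that only an explicit false stops the processing. FakeAttachment is a made-up stand-in and not Paperclip's Attachment class.

```ruby
# FakeAttachment is a hypothetical stand-in, not Paperclip's Attachment class.
class FakeAttachment
  attr_reader :processed

  def initialize(&before_post_process)
    @before_post_process = before_post_process
    @processed = false
  end

  # Pretend a file was assigned: run the callback, then maybe process.
  def assign(changed)
    verdict = @before_post_process ? @before_post_process.call(changed) : nil
    # Assumption: only an explicit false vetoes processing; nil lets it through.
    @processed = (verdict != false)
  end
end

# Same shape as the model callback: false when changed, nil otherwise
veto_when_changed = lambda { |changed| false if changed }

changed_upload = FakeAttachment.new(&veto_when_changed)
changed_upload.assign(true)
puts changed_upload.processed   # false: processing was skipped

untouched = FakeAttachment.new(&veto_when_changed)
untouched.assign(false)
puts untouched.processed        # true: the callback returned nil
```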

Paperclip also comes with another method to process the styles that we stopped in the lines before. It's a simple one-line method and we don't have to concern ourselves too much with it, except to know it works.

I always like to try out stuff in the console first to make sure it works. Why don't we switch over to console and try it.

$ script/console
>Video.last.source.reprocess!
=> true

So let's write a method that does that.

#models/video.rb
def regenerate_styles!
  logger.info "Processing thumbnails"
  self.source.reprocess!
end

$ script/console
>Video.last.regenerate_styles!
=> true

Now comes delayed job. So far we've stopped Paperclip from processing and created a method to tell it to process the thumbnails manually. We're going to queue the work using Delayed::Job.enqueue, passing in the video's id so each job knows which video to work on.

#models/video.rb
after_save do |video|
  if video.source_changed?
    Delayed::Job.enqueue(ThumbnailJob.new(video.id))
    Delayed::Job.enqueue(SendToAmazonJob.new(video.id))
  end
end

What we're doing above is using the after_save callback to enqueue two delayed jobs if the source has changed. This only runs when the source, i.e. the attachment, changes, and not when other attributes of the parent model (Video), such as the title, are updated.
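
To convince myself of that last point, here's a tiny plain-Ruby sketch of the same decision. TrackedVideo is a made-up class with hand-rolled change tracking, not ActiveRecord, and the enqueued array just stands in for the delayed_jobs table.

```ruby
# TrackedVideo is a hypothetical plain-Ruby model, not ActiveRecord.
class TrackedVideo
  SOURCE_FIELDS = [:source_file_name, :source_content_type,
                   :source_file_size, :source_updated_at]

  attr_reader :enqueued

  def initialize(attrs = {})
    @saved    = attrs.dup   # what is "in the database"
    @current  = attrs.dup   # what is in memory
    @enqueued = []          # stands in for the delayed_jobs table
  end

  def update(attrs)
    @current = @current.merge(attrs)
    self
  end

  def source_changed?
    SOURCE_FIELDS.any? { |field| @current[field] != @saved[field] }
  end

  def save
    # Same check as the after_save block: only attachment changes queue jobs.
    @enqueued.push(:thumbnail_job, :send_to_amazon_job) if source_changed?
    @saved = @current.dup
    true
  end
end

video = TrackedVideo.new(:title => "Demo", :source_file_name => "a.flv")
video.update(:title => "Renamed").save
puts video.enqueued.inspect   # [] because only the title changed

video.update(:source_file_name => "b.flv").save
puts video.enqueued.inspect   # [:thumbnail_job, :send_to_amazon_job]
```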

So far we've referenced two delayed job classes, ThumbnailJob and SendToAmazonJob. Now let's create two files under lib/, named after the corresponding classes.

$ gedit lib/send_to_amazon_job.rb
$ gedit lib/thumbnail_job.rb

Create Delayed Job classes

#lib/send_to_amazon_job.rb
class SendToAmazonJob < Struct.new(:video_id)
  def perform
    Video.find(video_id).async_send_to_s3
  end
end

As you can see we're passing in video_id and calling the method async_send_to_s3 on it. We'll write async_send_to_s3 method in the next part.

#lib/thumbnail_job.rb
class ThumbnailJob < Struct.new(:video_id)
  def perform
    Video.find(video_id).regenerate_styles!
  end
end

And here we're calling regenerate_styles!. We could call source.reprocess! directly, but we don't have access to logger here, and I like to keep track of what's going on; it helps with debugging later on.
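
If the Struct trick looks odd, here's a plain-Ruby sketch of the pattern both job classes follow: a job is just an object that carries its arguments and responds to perform, and a worker calls perform later. FakeQueue and DemoThumbnailJob are names I made up; this is an in-memory toy, not how delayed_job stores or runs jobs.

```ruby
# In-memory stand-ins; FakeQueue and DemoThumbnailJob are hypothetical names.
class FakeQueue
  def initialize
    @jobs = []
  end

  # Accept anything that responds to perform, and run it later
  def enqueue(job)
    raise ArgumentError, "job must respond to perform" unless job.respond_to?(:perform)
    @jobs << job
  end

  # What a worker does: take jobs off the queue one by one and perform them
  def work_off
    results = []
    results << @jobs.shift.perform until @jobs.empty?
    results
  end
end

# Struct.new(:video_id) gives us the constructor and the video_id reader
class DemoThumbnailJob < Struct.new(:video_id)
  def perform
    "thumbnails for video #{video_id}"
  end
end

queue = FakeQueue.new
queue.enqueue(DemoThumbnailJob.new(42))   # returns right away, nothing runs yet
puts queue.work_off.inspect               # ["thumbnails for video 42"]
```

Passing only the id keeps the job small; the real classes above look the video up again with Video.find when the job actually runs.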

I think this is a good place to stop. We've made good progress: created a sample app, installed Paperclip and delayed job, and configured them. Next up: installing ffmpeg, getting thumbnail generation working with delayed job, and finally getting Amazon S3 working with delayed job as well.

Comments/concerns?

Send an email to help@ezror.com. We love hearing from you.

What is EzRor?

EzRor is a simple Rails deployment script that takes care of all your deployment needs. Simply run the script, enter a few details and see your app deployed in less than 30 minutes.

About this post:

This post deals with Paperclip video uploads, generating thumbnails from those videos, sending them to Amazon S3 via delayed job and finally deploying it all in a production environment.

Published: October 12th 2010 10:10pm