Recently, I came across a feature request which required me to query on a field in versions table provided by papertrail gem in ruby. You can read more about it here.

It was an old project and someone had provisioned the table to store a column called object_changes and object as string data type in database (postgres) and querying that was a real headache.

I decided to fix the schema and migrate to jsonb to store those changes. Now again there was a problem. There were ~1M+ records in the table, and changing all the column types in a single transaction was very risky.

So I came up with this approach with sidekiq.

I created a new migration for adding new columns:

class ConvertVersionYamlToJson < ActiveRecord::Migration
  def change
    rename_column :versions, :object, :old_object
    rename_column :versions, :object_changes, :old_object_changes

    add_column :versions, :object, :jsonb
    add_column :versions, :object_changes, :jsonb
  end
end

and the data migration part I moved to sidekiq, like:

class ConvertVersionYamlToJson < ActiveRecord::Migration
  def change
    rename_column :versions, :object, :old_object
    rename_column :versions, :object_changes, :old_object_changes

    add_column :versions, :object, :jsonb
    add_column :versions, :object_changes, :jsonb

    # Note: This worker will remove the columns old* from this table after it's done,
    # and no error has occured.
    MigrateVersionsYamlToJsonbWorker.perform_async
  end
end

The worker looked like this:

# frozen_string_literal: true

class MigrateVersionsYamlToJsonbWorker
  include Sidekiq::Worker

  def perform
    ## update object
    PaperTrail::Version.where.not(old_object: nil).find_each do |version|
      update_object(version)
      update_object_changes(version)
    end

    if PaperTrail::Version.where.not(old_object: nil).count.zero?
      ActiveRecord::Migration.remove_column :versions, :old_object
    end

    ## update object changes
    PaperTrail::Version.where.not(old_object_changes: nil).find_each do |version|
      puts version.id
      update_object_changes(version)
    end

    if PaperTrail::Version.where.not(old_object_changes: nil).count.zero?
      ActiveRecord::Migration.remove_column :versions, :old_object_changes
    end
  end

  def update_object(version)
    version.old_object &&
      version.update_columns(old_object: nil, object: YAML.load(version.old_object))
  end

  def update_object_changes(version)
    version.old_object_changes &&
      version.update_columns(old_object_changes: nil,
                             object_changes: YAML.load(version.old_object_changes))
  rescue StandardError => e
    nil
  end
end

This was going to be a time taking process (but async), and in that I had provisioned to remove the extra column that I had created whenever it got finished.

Will write up a separate post for I provisioned the query with jsonb data type.