Recently, I came across a feature request which required me to query on a field in versions
table provided by papertrail
gem in ruby. You can read more about it here.
It was an old project and someone had provisioned the table to store a column called object_changes
and object
as string
data type in database (postgres) and querying that was a real headache.
I decided to fix the schema
and migrate to jsonb
to store those changes. Now again there was a problem. There were ~1M+
records in the table, and changing all the column types in a single transaction was very risky.
So I came up with this approach with sidekiq.
I created a new migration for adding new columns:
class ConvertVersionYamlToJson < ActiveRecord::Migration
def change
rename_column :versions, :object, :old_object
rename_column :versions, :object_changes, :old_object_changes
add_column :versions, :object, :jsonb
add_column :versions, :object_changes, :jsonb
end
end
and the data migration part I moved to sidekiq, like:
class ConvertVersionYamlToJson < ActiveRecord::Migration
def change
rename_column :versions, :object, :old_object
rename_column :versions, :object_changes, :old_object_changes
add_column :versions, :object, :jsonb
add_column :versions, :object_changes, :jsonb
# Note: This worker will remove the columns old* from this table after it's done,
# and no error has occured.
MigrateVersionsYamlToJsonbWorker.perform_async
end
end
The worker
looked like this:
# frozen_string_literal: true
class MigrateVersionsYamlToJsonbWorker
include Sidekiq::Worker
def perform
## update object
PaperTrail::Version.where.not(old_object: nil).find_each do |version|
update_object(version)
update_object_changes(version)
end
if PaperTrail::Version.where.not(old_object: nil).count.zero?
ActiveRecord::Migration.remove_column :versions, :old_object
end
## update object changes
PaperTrail::Version.where.not(old_object_changes: nil).find_each do |version|
puts version.id
update_object_changes(version)
end
if PaperTrail::Version.where.not(old_object_changes: nil).count.zero?
ActiveRecord::Migration.remove_column :versions, :old_object_changes
end
end
def update_object(version)
version.old_object &&
version.update_columns(old_object: nil, object: YAML.load(version.old_object))
end
def update_object_changes(version)
version.old_object_changes &&
version.update_columns(old_object_changes: nil,
object_changes: YAML.load(version.old_object_changes))
rescue StandardError => e
nil
end
end
This was going to be a time taking process (but async), and in that I had provisioned to remove the extra column that I had created whenever it got finished.
Will write up a separate post for I provisioned the query with jsonb
data type.