Bam, Kicking AASM Up A Notch

Here at Blue Apron Engineering, we are big fans of the Acts As State Machine (AASM) gem. We chose AASM because of its wide adoption and the continued support it gets from the gem’s author and the community. It currently powers much of the core functionality in our consumer-facing “.com” stack and our Warehouse Management Software (WMS) stack.

The project solves many of our problems, but the more we leveraged AASM, the more we found that its existing patterns for pessimistic locking, extensibility, and code repetition couldn’t always meet our needs. Originally we monkey patched our changes in, but we felt these additions would make a great contribution back to the AASM community.

In this blog post, we’ll discuss these patterns in detail and how we improved upon them. We’ll review an example workflow from the Inventory Control system in our WMS and how AASM initially behaved in our codebase. We’ll then discuss the problems we ran into and how we changed things to meet our needs with the help of the AASM community.

An Example Workflow

Below is an example Inventory Movement workflow from our bespoke WMS app. In this state machine, our graph of vertices (states) and edges (event transition paths) represents all of the business logic we need for an employee in our warehouse to move a container from location A to location B.

module Wms
  class SimpleInventoryMovement < ActiveRecord::Base
    aasm column: 'status' do
      state :unstarted, initial: true
      state :started
      state :finished
      state :canceled

      event :unstart do
        transitions from: :started, to: :unstarted, after: :perform_unstart
      end
      
      event :pickup do
        transitions from: :unstarted, to: :started, guard: :can_pickup?, after: :perform_pickup
      end

      event :dropoff do
        transitions from: :started, to: :finished, guard: :can_dropoff?, after: :perform_dropoff
      end

      event :cancel do
        transitions from: %i(unstarted started), to: :canceled, guard: :can_cancel?, after: :perform_cancel
      end
    end
  end
end

To give a real world example, we have an Employee, our canine mascot Panda. He wants to move a pallet of chicken thighs from the receiving dock to a refrigerated location.

Panda arrives at the receiving dock, scans the container and “starts” the workflow. Ideally he can take the pallet all the way to the designated refrigerator and “drop it off”, thus “finishing” the workflow.
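
The happy path Panda follows can be traced with a plain-Ruby sketch of the same graph. The TRANSITIONS table and the fire helper below are hypothetical names used only for illustration; they are not AASM's API, and guards and callbacks are left out.

```ruby
# Plain-Ruby sketch of the workflow graph. TRANSITIONS and fire are
# hypothetical helpers for illustration, not part of AASM.
TRANSITIONS = {
  unstart: { from: %i[started], to: :unstarted },
  pickup:  { from: %i[unstarted], to: :started },
  dropoff: { from: %i[started], to: :finished },
  cancel:  { from: %i[unstarted started], to: :canceled }
}.freeze

# Returns the next state, or raises if the event is not allowed from `state`.
def fire(state, event)
  edge = TRANSITIONS.fetch(event)
  unless edge[:from].include?(state)
    raise ArgumentError, "cannot #{event} from #{state}"
  end
  edge[:to]
end

state = :unstarted
state = fire(state, :pickup)   # Panda scans the pallet at the receiving dock
state = fire(state, :dropoff)  # Panda drops it off at the refrigerator
puts state                     # prints "finished"
```

Once the movement is finished, no edge leads out of that state, so a later cancel is rejected.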

Pessimistic Locking

A pessimistic lock in a relational database protects data in multi-user scenarios, where one user may be updating a row while others read or write it at the same time. The lock holder is guaranteed to read the latest committed version of the row, and no other transaction can change that row before the holder’s own write commits. In PostgreSQL, a row-level lock taken with SELECT ... FOR UPDATE blocks any other transaction from running an UPDATE, DELETE, or SELECT ... FOR UPDATE against the locked row until the lock is released.

Originally, AASM did not implement pessimistic locking for state machines persisted with ActiveRecord. Locking becomes necessary, however, in situations where guard clauses and data writes collide. To return to our example workflow, let’s suppose that our guard for can_pickup? involves checking that an existing employee has not already claimed the task and that it is unstarted.

def can_pickup?
  # The state machine persists its state in the 'status' column.
  owning_employee.blank? && status == 'unstarted'
end

Without a database level pessimistic lock, this guard is prone to race conditions. Two different processes can each load up the model, pass the condition, and try to move our chicken thighs!
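
The interleaving can be reproduced without a database. The sketch below is a toy simulation: Row, MOVEMENTS, load_row, and save_row are hypothetical stand-ins for an ActiveRecord model and its table, and the two "processes" are just two in-memory copies of the same row.

```ruby
# Toy simulation of the race. Row, MOVEMENTS, load_row and save_row are
# hypothetical stand-ins for an ActiveRecord model and its table.
Row = Struct.new(:status, :owning_employee) do
  def can_pickup?
    owning_employee.nil? && status == 'unstarted'
  end
end

MOVEMENTS = { 1 => Row.new('unstarted', nil) }

def load_row(id)
  MOVEMENTS.fetch(id).dup  # each process works on its own in-memory copy
end

def save_row(id, row)
  MOVEMENTS[id] = row.dup
end

# Both processes load the row before either one writes.
panda = load_row(1)
rival = load_row(1)

# Both guards pass, because each copy still looks unclaimed.
GUARD_RESULTS = [panda.can_pickup?, rival.can_pickup?]

panda.status, panda.owning_employee = 'started', 'Panda'
save_row(1, panda)
rival.status, rival.owning_employee = 'started', 'Rival'
save_row(1, rival)  # silently overwrites Panda's claim

puts GUARD_RESULTS.inspect          # prints "[true, true]"
puts MOVEMENTS[1].owning_employee   # prints "Rival"
```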

Engine Yard and the AASM community recommended locking around the workflow using standard ActiveRecord Pessimistic Locking patterns at the controller level. In our case, our controller locked the pickup! method.

# POST /inventory_movement/pickup/:id
def pickup
  movement = Wms::SimpleInventoryMovement.find(params[:id]) 
  movement.with_lock do
    movement.pickup!
  end
  # yadda, yadda, yadda
end

This solves our race condition. The first process loads the model and then locks its row in the database. It can then execute pickup! knowing that the guard is evaluated with exclusive access to the model’s data. The second process blocks on its own SELECT ... FOR UPDATE until the first process is done. When it resumes, with_lock reloads the record, so the second process sees that another employee has already claimed the workflow, and the guard fails.
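
The same behavior can be approximated in plain Ruby. In this sketch a Mutex plays the role of the row lock, and MOVEMENT_ROW and locked_pickup are hypothetical names; the real with_lock opens a transaction and issues SELECT ... FOR UPDATE instead.

```ruby
# Toy model of the locked version. MOVEMENT_ROW, ROW_LOCK and locked_pickup
# are hypothetical; the Mutex stands in for the database row lock.
MOVEMENT_ROW = { status: 'unstarted', owning_employee: nil }
ROW_LOCK = Mutex.new

def locked_pickup(employee)
  ROW_LOCK.synchronize do
    # Like with_lock, read the row only after the lock is held.
    row = MOVEMENT_ROW.dup
    return false unless row[:owning_employee].nil? && row[:status] == 'unstarted'

    MOVEMENT_ROW[:status] = 'started'
    MOVEMENT_ROW[:owning_employee] = employee
    true
  end
end

# Two competing "processes", modeled as threads.
RESULTS = %w[Panda Rival].map { |name| Thread.new { locked_pickup(name) } }.map(&:value)

puts RESULTS.count(true)  # prints "1": exactly one pickup succeeds
```

Which of the two employees wins depends on scheduling, but the guard can never pass for both.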

Unfortunately, we quickly noticed that locking outside of AASM was unsustainable. If we had corresponding controller actions for cancel, report_inventory_problem, and dropoff, each would need to wrap its use of AASM in with_lock. And what if we had a scheduled job that auto-canceled inventory movements that were taking too long? The scheduled job would need to lock workflows too.

We started down this path of locking outside of our models, but in the end we could not easily ensure that our state machines were always locked. First, the repetition made our code look bad. More importantly, we could easily miss a spot during development and code review.

Alternatively, we could either lock inside each of our guard clauses or introduce pessimistic locking into AASM itself. We chose the latter path. With a lot of help from the community, we changed the AASM DSL to support a locking declaration that leverages ActiveRecord’s locking features underneath. ActiveRecord locks must occur within a transaction, and AASM already supported transactions, so adding a lock declaration was a simple extension. This tidied up our code and automatically ensures that our guard clauses run within an exclusive context.

To return to our example, our state machine now looks like this:

module Wms
  class SimpleInventoryMovement < ActiveRecord::Base 
    aasm column: 'status', requires_lock: true do
      # yadda yadda
    end
  end
end

Now all of our HTTP controllers, scheduled jobs, and other Plain Old Ruby Objects can simply invoke pickup! without worrying about locking.
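
To illustrate why declaring the lock once is enough, here is a minimal sketch of a DSL in which every generated event method takes the lock itself. TinyMachine, Movement, and the lock_count counter are hypothetical and do not reflect how AASM implements requires_lock internally; with_lock here is a Mutex-based stand-in for ActiveRecord's method of the same name.

```ruby
# Hypothetical mini-DSL (not AASM's implementation): when requires_lock is
# set, every generated event method acquires the lock on its own.
module TinyMachine
  def machine(requires_lock: false)
    @requires_lock = requires_lock
    yield
  end

  def event(name, from:, to:)
    locked = @requires_lock
    define_method("#{name}!") do
      run = lambda do
        raise "cannot #{name} from #{status}" unless status == from
        self.status = to
      end
      locked ? with_lock(&run) : run.call
    end
  end
end

class Movement
  extend TinyMachine
  attr_accessor :status
  attr_reader :lock_count

  def initialize
    @status = :unstarted
    @lock = Mutex.new
    @lock_count = 0
  end

  # Stand-in for ActiveRecord's with_lock; counts how often it is taken.
  def with_lock
    @lock.synchronize do
      @lock_count += 1
      yield
    end
  end

  machine requires_lock: true do
    event :pickup,  from: :unstarted, to: :started
    event :dropoff, from: :started,   to: :finished
  end
end

movement = Movement.new
movement.pickup!          # no with_lock at the call site
movement.dropoff!
puts movement.status      # prints "finished"
puts movement.lock_count  # prints "2": one lock per event
```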

Extending AASM

One of our first projects to extend AASM involved audit trails for our inventory workflows. The audit trails record every event change so we can piece together why our container of chicken thighs ended up in the donation / good-will location instead of a refrigerator location.

A quick Google Search returned AASM History. We didn’t use the gem but the code provided early guidance on how we could decorate AASM to provide our own utility helpers for our state machines. Our end result looked very similar: we performed a class_eval, re-opened the class, and coded our extensions right into AASM.

##
# Monkey patch extra methods for AASM state machines.
# These are instance methods because the aasm block is evaluated
# against an instance of AASM::Base.
AASM::Base.class_eval do
  def has_history
    # do some logging
  end

  def has_even_more_awesomeness
    # do some awesomeness
  end
end

In most languages, to extend a base class you use inheritance or a structural design pattern (e.g. decoration or composition). In Ruby, however, it’s very convenient to skip these practices and re-open classes to extend functionality.

Unfortunately, there’s a reason why the Ruby community ridicules class_eval as “class_evil”. I’m 99% sure that if I answered a StackOverflow question on how to extend ActiveRecord::Base with class_eval, it would receive zero upvotes. In the end, just as with ActiveRecord::Base, monkey patching AASM::Base felt like a bad idea.

Rails 5 will introduce a new concept called ApplicationRecord that solves this very problem for ActiveRecord::Base. Using this concept as inspiration, we worked with the AASM community to offer the same idea: the ability to subclass AASM::Base and then use that subclass in our state machines.

The above example now becomes the following:

module Wms
  class OurAASMBase < AASM::Base
    # Instance methods, so they can be called inside the aasm block.
    def has_history
      # do some logging
    end

    def has_even_more_awesomeness
      # do some awesomeness
    end
  end
end

And our model becomes:

module Wms
  class SimpleInventoryMovement < ActiveRecord::Base
    aasm with_klass: Wms::OurAASMBase, column: 'status' do
      has_history
      has_even_more_awesomeness

      # yadda yadda
    end
  end
end
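
The idea behind with_klass can be sketched in plain Ruby. MiniBase, OurMiniBase, and MiniDsl below are hypothetical classes written for illustration, not AASM's real ones; the sketch assumes the DSL block is evaluated against an instance of the chosen class, which is why the extensions are instance methods.

```ruby
# Hypothetical mini-DSL showing the idea behind with_klass. MiniBase and
# MiniDsl are illustrative, not AASM's real classes.
class MiniBase
  attr_reader :features

  def initialize
    @features = []
  end
end

class OurMiniBase < MiniBase
  # Extensions live in a subclass instead of a monkey patch.
  def has_history
    @features << :history
  end
end

module MiniDsl
  def machine(with_klass: MiniBase, &block)
    @machine = with_klass.new
    @machine.instance_eval(&block)  # DSL calls resolve against the chosen class
    @machine
  end
end

class AuditedModel
  extend MiniDsl
  MACHINE = machine(with_klass: OurMiniBase) { has_history }
end

puts AuditedModel::MACHINE.features.inspect     # prints "[:history]"
puts MiniBase.method_defined?(:has_history)     # prints "false"
```

The base class stays untouched, so models that do not opt in to the subclass never see the extra helpers.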

More Global Callbacks For DRYness

To properly encapsulate our code, we wanted to prevent SQL errors from leaking up to the controller level. Callers of Inventory Movements should not need to know that we use ActiveRecord under the hood, which is exactly what that type of error would expose. With this in mind, we wrapped all of our inventory movement errors in a MovementError.
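
The post does not show Wms::MovementError itself. A minimal sketch, assuming it is a StandardError subclass whose keyword arguments match the way it is constructed in this post, could look like this (the attribute readers and the IOError stand-in are assumptions):

```ruby
module Wms
  # Sketch of a wrapper error. The keyword arguments mirror how the post
  # constructs it; everything else here is an assumption.
  class MovementError < StandardError
    attr_reader :original_exception, :model

    def initialize(message = nil, original_exception: nil, model: nil)
      super(message)
      @original_exception = original_exception
      @model = model
    end
  end
end

begin
  begin
    raise IOError, 'connection reset'  # stand-in for a low-level SQL error
  rescue IOError => e
    raise Wms::MovementError.new(e.message, original_exception: e, model: :some_movement)
  end
rescue Wms::MovementError => wrapped
  puts wrapped.message                    # prints "connection reset"
  puts wrapped.original_exception.class   # prints "IOError"
end
```

Callers only need to rescue Wms::MovementError, while the original exception stays available for logging.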

Our state machine looked like this:

module Wms
  class SimpleInventoryMovement < ActiveRecord::Base
    aasm column: 'status' do
      event :unstart do
        transitions from: :started, to: :unstarted, after: :perform_unstart
        error { |e| raise Wms::MovementError.new(e.message, original_exception: e, model: self) }
      end
 
      event :pickup do
        transitions from: :unstarted, to: :started, guard: :can_pickup?, after: :perform_pickup
        error { |e| raise Wms::MovementError.new(e.message, original_exception: e, model: self) }
      end
 
      # yadda yadda yadda
    end
  end
end

This accomplished our goal of wrapping exceptions, but it resulted in a lot of repetition in our code, and there was no way to truly avoid that repetition with the AASM DSL as it stood. As an intermediate solution, we extracted a raise_a_movement_error method, which helped code clarity a little. But the real answer lay in improving AASM itself.

Again, the AASM community inspired us with this pull request to introduce a global after_all_transitions callback. If we could register one global callback for an AASM model, could we add more?

The answer was yes. after_all_transitions introduced a per-model registry for global callbacks. Building on this work, we could use the very same registry to create more global callbacks that mirror the event-level ones, e.g. before_all_events for before and error_on_all_events for error.

This allowed us to completely DRY up our error handling: the error handler is declared once and never repeated. We can now rewrite our model.

module Wms
  class SimpleInventoryMovement < ActiveRecord::Base
    aasm column: 'status' do
      error_on_all_events { |e| raise Wms::MovementError.new(e.message, original_exception: e, model: self) }

      event :unstart do
        transitions from: :started, to: :unstarted, after: :perform_unstart
      end
      # yadda yadda yadda
    end
  end
end
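
The registry idea can be sketched in plain Ruby. MiniMachine and WrappedError are hypothetical and much simpler than AASM's real internals; the sketch only shows how one globally registered handler can serve every event.

```ruby
# Hypothetical mini state machine (not AASM's code) with one global error
# handler shared by every event.
class MiniMachine
  def initialize
    @global_callbacks = Hash.new { |hash, key| hash[key] = [] }
    @events = {}
  end

  # Register a handler that runs when any event raises.
  def error_on_all_events(&handler)
    @global_callbacks[:error_on_all_events] << handler
  end

  def event(name, &body)
    @events[name] = body
  end

  def fire(name)
    @events.fetch(name).call
  rescue StandardError => e
    handlers = @global_callbacks[:error_on_all_events]
    raise if handlers.empty?  # no global handler: re-raise the original error
    handlers.each { |handler| handler.call(e) }
  end
end

class WrappedError < StandardError; end

machine = MiniMachine.new
machine.error_on_all_events { |e| raise WrappedError, "movement failed: #{e.message}" }
machine.event(:pickup)  { raise IOError, 'db down' }
machine.event(:dropoff) { :finished }

begin
  machine.fire(:pickup)
rescue WrappedError => e
  puts e.message            # prints "movement failed: db down"
end
puts machine.fire(:dropoff) # prints "finished"
```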

Conclusion

Both AASM and its community are terrific. We have placed incredibly important functionality into this project for both our WMS and Consumer applications here at Blue Apron. We hope that our contributions have helped, and we hope they help you see why AASM is an open source leader for state machine representation in Ruby.

As of this post, global callbacks have been released in AASM 4.7. Pessimistic locking and AASM::Base extensions have been released in AASM 4.9.

Special Thanks: I’d like to thank my colleagues Kevin Bongart and Shaun Butler for helping me understand AASM and for providing feedback on my pull requests. I’d like to thank Ray Chan for implementing the Pessimistic Locking on WMS and ensuring this delivered as promised.
