polymorphic associations

Table of Contents

composite primary keys

why?

Scenary

we have two tables with schema like this:

  1. uploaded_project_listing_orders:
Field Key
id PRI
listingorder  
project_id  
property_number  
  1. ProjectListings
Field Key
Id PRI
ProjectID MUL
PropertyNumber MUL
ListingOrder  
   

we want to use like this UploadedProjectListingOrder.first.ProjectListings

how do we declare the association?

How

like this?

has_many :project_listings, class_name: Legacy::ProjectListings,
          foreign_key: [:ProjectID,:PropertyNumber], primary_key:[:project_id, :property_number]

rails not support composite primary keys. with the gem [[https://github.com/composite-primary-keys/composite_primary_keys][composite_primary_keys]], you can.

polymorphic associations

why ?

we have several type of listings stored in different table

  1. Residential
Field Key
PropertyNumber PRI
*  
*  
  1. Land
Field Key
PropertyNumber PRI
*  
*  

and

  1. ProjectListings
Field Key
Id PRI
ProjectID MUL
PropertyNumber MUL
ListingOrder  

we want to find out all relative listings via a ProjectListings instance, but we need to find it in different tables.

How

Rails way

sql? we currently did it like this:

pl =  joins('left join Residential on ProjectListings.PropertyNumber = Residential.PropertyNumber')
   .joins('left join Land on ProjectListings.PropertyNumber = Land.PropertyNumber')
   .by_project(project).active
   .where('COALESCE(Residential.Status, Land.Status) in (?)', statuses)
   .limit(page_size)
   .offset(offset)

this is equvalent to raw sql

we want listings but we got ProjectListing instances rails way

ProjectListing.first.listing

implementation

  1. create a migration
    change_table :ProjectListings do |t|
        t.references :listing, index: true
    end
    

    this will add listing\id(integer), listing\type(string) to your ProjectListings table. we have pre defined primary key(PropertyNumber) so we can use existing one explictly

    change_table :ProjectListings do |t|
        t.string :listing_type, index: true
    end
    

    give information we need to find which table the listing is and the primary key in that 'listing' table

  2. declare the associations in ProjectListing model
    belongs_to :listing, polymorphic: true, primary_key: :PropertyNumber, foreign_key: :PropertyNumber
    

    in listings model Residential

    has_one :project_listing, class_name: Legacy::ProjectListings, as: :listing, foreign_key: :PropertyNumber
    

    same in listings model the Land

    has_one :project_listing, class_name: Legacy::ProjectListings, as: :listing, foreign_key: :PropertyNumber
    
  3. then you can do this
    ProjectListings.first.listing
    Residential.first.project_listing
    

N+1 queries & eager loading

well, it's convinient again but the performance is bad though let's take a simpler example, if we want to look up all project\listings belongs to one project, how could we do?

Project.first.project_listings

look at the sql logs:

SELECT `ProjectListings`.* FROM `ProjectListings` WHERE `ProjectListings`.`ProjectID` = 600000001 AND (Deleted IS NULL) ORDER BY ListingOrder
SELECT `Residential`.* FROM `Residential` WHERE `Residential`.`PropertyNumber` = 100000002 LIMIT 1
SELECT `Residential`.* FROM `Residential` WHERE `Residential`.`PropertyNumber` = 100000003 LIMIT 1
SELECT `Residential`.* FROM `Residential` WHERE `Residential`.`PropertyNumber` = 100000004 LIMIT 1
SELECT `Residential`.* FROM `Residential` WHERE `Residential`.`PropertyNumber` = 100000005 LIMIT 1
SELECT `Residential`.* FROM `Residential` WHERE `Residential`.`PropertyNumber` = 100000009 LIMIT 1
SELECT `Residential`.* FROM `Residential` WHERE `Residential`.`PropertyNumber` = 100000008 LIMIT 1
SELECT `Residential`.* FROM `Residential` WHERE `Residential`.`PropertyNumber` = 100000007 LIMIT 1
SELECT `Residential`.* FROM `Residential` WHERE `Residential`.`PropertyNumber` = 100000006 LIMIT 1
SELECT `Residential`.* FROM `Residential` WHERE `Residential`.`PropertyNumber` = 100000010 LIMIT 1
SELECT `Residential`.* FROM `Residential` WHERE `Residential`.`PropertyNumber` = 100000011 LIMIT 1
SELECT `Residential`.* FROM `Residential` WHERE `Residential`.`PropertyNumber` = 100000012 LIMIT 1
SELECT `Residential`.* FROM `Residential` WHERE `Residential`.`PropertyNumber` = 100000013 LIMIT 1

think about if we have thousands of project listings, if we are in a more complex associations reality usally worse, we have more than 5 million of sql queries for just 1 thousands listings we can not afford performance like this.

How?

eager loading… for this example, it's simple, prefine the association properly in models

Projects.includes(:project_listings).first.project_listings

only two sql generated

expanded

  1. #preload
  2. #eager\load
  3. #includes

activerecord-import

why?

we mention about efficiency queries, how about inserting bulk of data efficiency?

  1. raw sql?
  2. model.save?

use activerecord-import ?

  1. build in activerecord import not efficiently
  2. you have to save each model individually
    10.times do |i|
      Book.create! :name => "book #{i}"
    end
    

    compare with

    books = []
    10.times do |i|
      books << Book.new(:name => "book #{i}")
    end
    Book.import books
    
  3. import via raw columns and arrays of values
  4. import model objects in a batch
  5. duplicate key updates (requires mysql)

activerecord-import on duplicated key update