Skip to content

Instantly share code, notes, and snippets.

@zsy715
Last active January 8, 2020 17:23
Show Gist options
  • Save zsy715/e48ea7f10ba42af85f0a148dd50efc65 to your computer and use it in GitHub Desktop.
Save zsy715/e48ea7f10ba42af85f0a148dd50efc65 to your computer and use it in GitHub Desktop.

PIM/Products:

  • the page shows Product Attribute View which is mainly from a view pim_product_attributes_view. The view does a bunch of table join. The drawback of it is there is no way for us to utilize index to speed up querying. The view does provide very details of product attribute but I don’t think in PIM/Products page, we utilize most of them. The attributes we currently care about are vendor name, product name, vendor sku, magento id, category department, category class, category subclass, unique_id and status. These attributes are what we list on the page and we performance search function against them as well.
  • One potential solution is manually join necessary tables so that we can optimize the query by using indexes instead of reading entries from the view. Also, we can eager load tables to avoid N+1 query which is not a major issue regarding to the performance.
  • some tests against sql with necessary tables joined together.

origin sql:

SELECT vendor_name, name, cat_department, cat_class, cat_subclass, magento_id, vendor_sku, unique_id, status FROM `pim_product_attributes_view` ORDER BY pim_product_attributes_view.vendor_name asc LIMIT 10 OFFSET 0;

optimized sql:

SELECT
   `pv`.`unique_id` AS `vendor_unique_id`,
   `pv`.`name` AS `vendor_name`,
   `pp`.`unique_id` AS `unique_id`,
   `pp`.`name` AS `name`,
   `pp`.`vendor_sku` AS `vendor_sku`,
   `pp`.`is_inventoriable` AS `is_inventoriable`,
   `pp`.`magento_id` AS `magento_id`,
   `pc_department`.`name` AS `cat_department`,
   `pc_category`.`name` AS `cat_class`,
   `pc_class`.`name` AS `cat_subclass`,
   `hpe_status`.`value` AS `status`
from `pim_products` `pp` 
join `pim_vendors` `pv` on `pp`.`vendor_id` = `pv`.`id` 
join `pim_entities` `pe` on `pe`.`foreign_entity_id` = `pp`.`id` and `pe`.`entity_type_id` = 1 
left join `pim_categories` `pc_class` on `pp`.`category_id` = `pc_class`.`id` 
left join `pim_categories` `pc_category` on `pc_class`.`parent_id` = `pc_category`.`id` 
left join `pim_categories` `pc_department` on `pc_category`.`parent_id` = `pc_department`.`id` 
left join `hydra_string_pim_entities` `hpe_status` on `hpe_status`.`entity_id` = `pe`.`id` and `hpe_status`.`hydra_attribute_id` = 10
order by `pv`.`name` asc limit 10 offset 0;

In production, the run time of 1st sql is 4500+ms on average. The run time of 2nd sql which queries over joined tables is less than 900 ms on average. So we could save 3.5 sec through this optimization.

  • Another problematic query is
SELECT COUNT(*) FROM `pim_product_attributes_view`

This query’s run time is 4000+ms in Sequel Pro.

If I directly run count(*) on similar join query, the run time drops to less than 700ms. So we would benefit from optimization for another 3 sec.

select count(*)
from `pim_products` `pp` 
join `pim_vendors` `pv` on `pp`.`vendor_id` = `pv`.`id` 
join `pim_entities` `pe` on `pe`.`foreign_entity_id` = `pp`.`id`
left join `pim_categories` `pc_class` on `pp`.`category_id` = `pc_class`.`id` 
left join `pim_categories` `pc_category` on `pc_class`.`parent_id` = `pc_category`.`id` 
left join `pim_categories` `pc_department` on `pc_category`.`parent_id` = `pc_department`.`id` 
left join `hydra_string_pim_entities` `hpe_status` on `hpe_status`.`entity_id` = `pe`.`id` where `hpe_status`.`hydra_attribute_id` = 10 and `pe`.`entity_type_id` = 1;

PIM/Samples:

samples page has a different story. PIM loads the page twice with two different ways. The 1st time PIM fetches all samples(6000+), then finds vendor for each sample. It requires to run select query against pim_ventors table 6000+ times which slows down page load.The page would be overridden by 2nd load with some optimized query which only takes less than 1s. In my test, I just simply remove 1st load(https://github.com/birchbox/tools-server/blob/01434281ba999653d855ea872369def267db7124/app/controllers/pim/samples_controller.rb#L7 and https://github.com/birchbox/tools-server/blob/01434281ba999653d855ea872369def267db7124/app/views/pim/samples/index.html.haml#L17-L31), the page loads fairly quick in dev.

PIM/assortments

this seems fine to me. The page loads in 3s. I can look into it later.

PIM/components

this seems fine to me. The page loads in 3s.

PIM/vendors

It's a similar issue as samples. The page loads data twice. 1st load takes for a while and will be overridden by 2nd load.

PIM/birchboxes

It's a similar issue as samples.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment