Magento developers who wants refresh their Magento theory, perhaps for the sake of a job interview
Anyone who are interested in learning Magento
EAV (Entity-Attribute-Value) Model
When Varien first built Magento back in 2008, they developed this EAV concept – a distributed attribute model describes products by attributes and attribute values, and the goal was to create a scalable database system suitable for eCommerce.
One of the most obvious approach of EAV is it spreads all the attributes and values into different tables, which allows product attributes can be modified easily without having to alter the whole product entity. In other words, product attributes in the EAV model are in a normalised form, which makes querying and altering attributes a breeze.
If done differently, let’s say all the product attributes data are stored in one gigantic table, to make one simple attribute update would be so resource demanding – forget about bulk updating product attributes of hundreds and thousands in a few seconds, event if it’s possible, it would have to take days!
However, this brilliant EAV concept also created a deadly weakness for Magento – it’s a nightmare to query against product entities, since all the product attributes are distributed into different tables, to get the full product data, we will have to make a long database query to join many tables. In other words Magento is born with a performance problem especially when the store grows in sizes – it’s very very slow!
If you’re interested in Magento speed, have a look at this article where we discussed why Magento gets so slow and possible ways to improve it. Otherwise let’s get on the second part of this article – How does Magento address its speed issue?
Magento Flat Tables
We have to agree the EAV serves well for Magento backend, but for the storefront EAV is a performance killer and it compromises online shopping experience. Magento addresses the EAV speed issue by introducing Flat Tables – tables aggregate all product attributes relate to each product entity, which can be fetched in just one database query. Those flat tables are read-only, and they are basically a duplication of existing product data.
Why Indexing is Needed?
Magento has various indexes, the one we’re referring in this topic is called Flat Index which includes Products and Categories. Since product entities are fetched from the flat tables (once flat table is enabled in Magento configuration), after adding new products in Magento we need to execute Magento indexing which basically rebuild the flat tables.
Many also call the flat tables Magento indexes which is what store front uses to retrieve product data. By default, Magento reindexes automatically after a product entity is added or updated which is not the most efficient way of keeping Magento indexes up to date, therefore a Cron job is introduced to executed indexing in a predefined time interval.
Like many things in life, pros and cons coexist so does this brilliant EAV concept. Although Magento came up with the idea of Indexing to speed things up, the sacrifice is however that Magento can be very resource demanding. From our own experience – in the process of nurturing a Magento store from a few hundreds of products to over 50k SKUs, we had to upgrade our client’s server multiple times and adjust PHP settings to accommodate the extra time and resource needed for Magento to complete its reindexing.
In today’s cloud hosting environment where every execution of a command costs money, the idea of completely wipe out a flat table whenever a new product is added is quite non-sense.
To many, a business is a lifetime commitment. It's easy to start one yet difficult to make it successful. Attitude, skills, experiences and dedication help hone the craft along the way, but it's often the great vision and resilience to remain focused wins the game. Read more about me here