Magento’s addAttributeToFilter doesn’t work: three solutions

Say you need to filter a Magento product collection to only include products with a certain attribute value. You could normally just use the built-in addAttributeToFilter function. However, if you’ve got flat tables turned on, you may run into a problem where addAttributeToFilter doesn’t do anything.

Let’s take a look at why.

If you don’t care why, skip ahead to how to fix it.

Let’s take a look at the guts of the addAttributeToFilter function in Mage_Catalog_Model_Resource_Product_Collection.


public function addAttributeToFilter($attribute, $condition = null, $joinType = 'inner')
{
if ($this->isEnabledFlat()) {
if ($attribute instanceof Mage_Eav_Model_Entity_Attribute_Abstract) {
$attribute = $attribute->getAttributeCode();
}

if (is_array($attribute)) {
$sqlArr = array();
foreach ($attribute as $condition) {
$sqlArr[] = $this->_getAttributeConditionSql($condition['attribute'], $condition, $joinType);
}
$conditionSql = '('.join(') OR (', $sqlArr).')';
$this->getSelect()->where($conditionSql);
return $this;
}

if (!isset($this->_selectAttributes[$attribute])) {
$this->addAttributeToSelect($attribute);
}

if (isset($this->_selectAttributes[$attribute])) {
$this->getSelect()->where($this->_getConditionSql('e.' . $attribute, $condition));
}

return $this;
}

//Other code, which isn't relevant to this example
}

First, we check if the attribute is an object and grab its code if so. Then, if the attribute is an array, we build the JOIN and WHERE clause based on that. Next, if the attribute isn’t already stored in the object’s _selectAttributes property, we use addAttributeToSelect to get it, and this is the root cause of the issue.

addAttributeToSelect uses $this->getEntity()->getAttributeForSelect to collect all the model’s attributes. The first thing that function does is check if the attribute is in the flat table and return null if it’s not found. Which means that back in addAttributeToFilter, $this->_selectAttributes[$attribute] never gets set, which means no WHERE clause for you.

Let’s look at some solutions.

The easy way

Used in product listing setting
The first and most obvious way to resolve this is to add the attribute to the flat table. Go to the attribute manager under Frontend Properties and check Used in Product Listing. If you know you’re only going to have a handful of attributes, that might be all you need. But if you’re dealing with, say, a mid-to-large size retailer with hundreds of attributes, there are some storage and performance implications to consider. Depending on the type of data, there are hard limits to how many columns you can have in a table. And you’ll probably end up with a whole lot of columns full of mostly null values, which seems like bad design. Let’s see some alternatives.

The (kinda) hard way

Who needs addAttributeToFilter? We can can do the join ourselves!

$attribute_code = 'YOUR_ATTRIBUTE_CODE';
$value = 'VALUE_FOR_FILTERING';
$collection = Mage::getResourceModel('catalog/product_collection');
$attribute = Mage::getModel('eav/entity_attribute')->loadByCode('catalog_product',$attribute_code);
$table = $attribute->getBackendTable();

$collection->joinField(
$attribute_code,
$table,
'value',
'entity_id = entity_id',
array( 'value' => $value, 'attribute_id' => $attribute->getId() ),
'inner'
);

First we set the collection in a variable, load the attribute model, and get its table name. (Since product attributes are EAV, they’re stored in different tables depending on their type, for example catalog_product_entity_varchar.) Then we use joinField to do an inner join. The first two arguments are the attribute code and the table name we found before. Next is the column in the joined table that we want to select, which will be “value” for the value of the attribute. Then we have the “bind” condition, which provides the primary and foreign key to build the ON portion of the JOIN. “entity_id = entity_id” means to match up the product ids in both tables. (Somewhat confusingly, the product id in product entity attribute tables is also called entity_id. I just typed that sentence and I still had to read it twice to understand it. Welcome to the madness that is EAV.)

The fifth argument is where the filtering happens. It adds more conditions to the ON predicate. We want only the rows with a specific combination of attribute_id and value. Last, we specify an inner join, which eliminates the need for a WHERE clause.

From a few minutes of Googling my understanding is that JOIN…ON and WHERE result in the same query plan, so there shouldn’t be any performance impact. (If I’m wrong, please shoot me an email.)

But all that seems a little excessive, so let’s move on to…

The best way?

The last way is my favorite. I pretty much gave it away in the explanation of the problem. Remember how I said that if the $attribute passed to addAttributeToFilter was an array, we used that to build the JOIN and WHERE clause?

Just do this:

$collection->addAttributeToFilter( array( array( 'attribute' => 'YOUR_ATTRIBUTE_CODE', 'eq' => 'VALUE_FOR_FILTERING' ) ) );

Note that it’s expecting an array of associative arrays, and the inner ones have to have the key “attribute” defined, otherwise you’ll get an Illegal string offset exception.

That’s all! Happy filtering.

References

Leave a Reply

Your email address will not be published. Required fields are marked *