Safely Querying CCK Data Types

I recently came upon an article by Matt Johnson talking about the dangers of querying CCK tables directly. Due to shifting data within the CCK schema depending on a number of factors, directly querying tables can be hazardous if not done correctly. It's easy to forget this until you've done a site where you've written dozens of queries directly against CCK tables, only to later change fields on your content types and have them break. Let's take the example of building a site which catalogs artist and album information. I might start by creating an artist content type, with a field on it which defines the record label the artist is current signed to. Upon doing this the table structure would look like this:

So if after I've developed for a while and written a handful of queries which depend on the "label" field to work correctly I decide it's time to add a content type for albums, this type also needs a label field as artists might switch labels between albums. So after creating my artist content type and sharing the field, our table structure looks like this:

The catch when querying these tables directly is that even if you already have data in your fields and change the field definitions, the CCK will change the schema and move the data around as necessary, making direct queries against these fields hazardous. In his article Matt recommends using views as an abstraction layer for queries by instantiating the view objects and executing the query, here's a simplified example:

 
<?php
$view = views_get_view($view_name);
$view->execute();
if ($view->result) {
...
}
?>
 

Let's take a look at the pros and cons of this approach:

Pros:

  • Queries which are more stable, since views handles any shifting within the CCK table schema automatically.
  • Ability to take advantage of the recently committed views caching patch written by Jeff Eaton and Earl Miles. This can offer performance gains if you're not already using some form of external caching.

Cons:

  • Needing to write a different view for every query on the CCK schema you're making.
  • The need to export those views to code.

Although the views method works, and has some advantages, I find it a little cumbersome to create a view for every custom query I'm running on a site. An alternate method I've been using for some time is to find the table information directly through the CCK API and run the queries based on the information it provides. Heres an example:

 
<?php
 
// Get an array of field settings for the record label field.
$field_info = content_fields('field_label');
 
// Get the database table and column(s) information for the field.
$db_info = content_database_info($field_info);
 
// Get all nid's which have a label field value which matches the $record_label query parameter.
$nids = db_query("SELECT n.nid FROM {node} n INNER JOIN {". $db_info['table'] ."} fd ON fd.vid = n.vid and fd.nid = n.nid WHERE fd.". $db_info['columns']['value']['column'] ." = '%s'", $record_label);
 
?>
 

I much prefer this method as it maintains all the advantages of the views method in terms of stability, yet avoids the need to clutter the views interface, and maintains portability without the need to export the views you define to code.

Filed under: 

3 Comments

Someone correct me if I'm wrong, but since vid is a primary key, there's no need to join on it and nid, just vid will be sufficient.

dalin: nope. you're right, in the given example joining by the nid field is not required - I do it by habit. Thanks for pointing out an area this can be simplified.

This is fantastic work. I've had dozens of annoying views sitting around for this purpose. This makes them all vanish.

Thank you!

Add a comment