<About />

My name is Kevin. I am a web professional living in Massachusetts. I build websites mostly using Drupal and jQuery. I use Vim even when I don't need to. When I'm not on the computer, I'm usually hanging with my wife, Melissa.

The danger of "q=$1"; or, why you shouldn't ignore Drupal 404 errors.

Drupal makes your life very easy, but that's no excuse for being lazy. While profiling scripts today, I ran into something that, for whatever reason, had never clicked before. I was running Xdebug and KCachegrind on some complex pages to get an idea of where I could optimize. Xdebug has a built-in profiler that writes files digestible by a tool like KCachegrind. With profiling on, you get a profile dump file for every PHP script you run. You can, in turn, open that file with KCachegrind to get some helpful analysis.

Profiling is a great way to expose those problems that I often ignore...

So, I ran a few scripts and noticed that there were a whole bunch of profile dumps - way more than the number of Drupal pages I profiled. After a quick inspection, it was apparent that these extra files were coming from 404 "Page Not Found" errors generated by Drupal.

The errors were coming from an image referenced in my CSS that wasn't found. Normally, an Apache-generated 404 would be a trivial issue. However, since Drupal's rewrite rule sends every request that does not match an existing file or directory to index.php as "q=$1", Drupal does a full bootstrap for every one of those 404 pages. So, in my case, I was doing an extra full Drupal page load for every page request.
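To make the mechanism concrete, here is a small Python model of that rewrite decision. It is only a sketch: the rule quoted in the docstring is roughly what the stock Drupal .htaccess of this era contains (check your own copy), the function name `route_request` is made up for illustration, and query strings and directory indexes are ignored.

```python
from pathlib import Path


def route_request(docroot: Path, uri: str) -> str:
    """Model the decision made by a rewrite block along these lines:

        RewriteCond %{REQUEST_FILENAME} !-f
        RewriteCond %{REQUEST_FILENAME} !-d
        RewriteRule ^(.*)$ index.php?q=$1 [L,QSA]
    """
    rel = uri.lstrip("/")
    target = docroot / rel
    # The two RewriteConds: skip the rewrite if the path is a real file or directory.
    if target.is_file() or target.is_dir():
        return f"static:{rel}"  # Apache serves it directly, PHP is never started
    # Everything else, including a missing image, is handed to Drupal.
    return f"index.php?q={rel}"
```

A stylesheet that exists on disk comes back as `static:...`, while a missing background image under the same theme directory comes back as `index.php?q=...`, which means a full Drupal bootstrap just to produce a 404.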

So, the moral of the story is: Do not ignore 404 errors being issued by Drupal. Watch those logs like a hawk! With a little bit of cleanup, you can drastically reduce the load on your servers.

Happy day!

Handy way of finding those 404s...

I've written a simple bash script to find the most common 404s in your error_log. It uses a combination of gawk, grep, and uniq.
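The bash script itself is not reproduced here. As a rough illustration of the same idea, the Python sketch below counts the most frequently requested paths that returned a 404. It is not the script mentioned above: it assumes an Apache access log in the common or combined format (where the status code follows the quoted request line) rather than error_log, and the name `top_404s` is hypothetical.

```python
import re
from collections import Counter

# Matches the quoted request line and the status code that follows it, e.g.
#   "GET /sites/default/files/bg.png HTTP/1.1" 404 2326
LINE_RE = re.compile(r'"[A-Z]+ (?P<path>\S+) [^"]*" (?P<status>\d{3})\b')


def top_404s(lines, limit=10):
    """Return (path, count) pairs for the most common 404s, most frequent first."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match and match.group("status") == "404":
            counts[match.group("path")] += 1
    return counts.most_common(limit)
```

Since the function accepts any iterable of lines, an open log file can be passed in directly. A path that shows up hundreds of times is usually a single broken reference in a theme or stylesheet.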

Dangerous

Thomas - you'll need to be careful with that rule. It sounds like it would break ImageCache, which relies on RewriteRules being enabled for the files folder.

An alternative moral

This is an important point, and you've made it well, but an alternative moral might be: "stock configurations are only good for stock servers; and your server almost certainly isn't stock!" :-)

Rather than poring over the watchdog logs for 404 errors, I find it easier to add RewriteConds that exclude sites/*/files/ and other static directories from the rewriting. It might not catch everything (a module or theme referencing an image, CSS, or JS file that doesn't exist, for instance), but it does the job pretty well.

Cheers,

Thomas
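The trade-off between this suggestion and the ImageCache caveat in the "Dangerous" reply can be sketched in a few lines of Python. This is a model, not a tested configuration: the pattern approximates a hypothetical condition such as `RewriteCond %{REQUEST_URI} !^/sites/[^/]+/files/`, and `reaches_drupal` is an illustrative name.

```python
import re

# Hypothetical exclusion for sites/*/files/, as described in the comment above.
FILES_DIR = re.compile(r"^/sites/[^/]+/files/")


def reaches_drupal(uri: str, file_exists: bool, exclude_files_dir: bool) -> bool:
    """Return True if the request would be rewritten to index.php?q=..."""
    if file_exists:
        return False  # existing files are served by Apache either way
    if exclude_files_dir and FILES_DIR.search(uri):
        return False  # excluded: Apache answers with its own cheap 404
    return True  # missing and not excluded: Drupal bootstraps


# A missing image under files/ no longer costs a Drupal bootstrap...
print(reaches_drupal("/sites/default/files/bg.png", False, True))
# ...but neither does any other missing path under files/, including one
# that a module expects Drupal to handle on first request.
print(reaches_drupal("/sites/default/files/imagecache/thumb/a.jpg", False, True))
```

Both calls print `False`. The first is the saving Thomas describes. The second is the concern raised about ImageCache: a request under the files folder that depends on the rewrite never reaches Drupal once that folder is excluded.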
