CREATE A CUSTOM 404 ERROR PAGE USING PHP AND GOOGLE’S 404 WIDGET

A custom 404 error page can be invaluable in retaining traffic that would otherwise go to your competitors.

In some cases it may be broken links on your site that cause a 404 error, but with so many link checking programs available (many for free) this can be easily remedied.

However, a good proportion of your traffic probably comes from search engines. If you are a well indexed site and you are redesigning your website you could be in for a nightmare situation, where many of your indexed pages become defunct.

If this happens and a user tries to load a page under your domain and receives the default “Page cannot be displayed” 404 error, it is highly unlikely that they will try to reach your home page by editing the URL. Far more likely, they will click the browser’s back button and search again.

Here I suggest three actions that can really help, especially if you are a well indexed site undergoing a large scale restructure:

  1. Create a custom 404 PHP page
  2. Include Google’s new 404 error widget
  3. Query your data using Microsoft Access

The PHP script below collects the necessary variables: the full address the user is trying to reach, where the user has come from, the browser they are using and the date they tried to access the file. We could send ourselves an email whenever a user receives a 404 error, but for larger sites where a large structural change has occurred this could generate a lot of emails! Remember that a 404 error is also generated whenever a missing graphic (.jpg, .gif etc.), CSS file, .js file and so on is referenced.

For this reason I suggest you upload a writable text file, e.g. notfound.txt, to your server and append the details of each 404 error to it. If we separate the error information with commas, we can then import this comma delimited file into an Access database and analyse the data further.

For sites receiving several thousand errors this is really useful, as you can then run a “Find Duplicates” query to see the most problematic pages.

The PHP code used to collect the errors in a text file:

<?php
//User's IP address
$ip = getenv("REMOTE_ADDR");

//Page or file the user attempted to load
$req_uri = getenv("REQUEST_URI");

//Name defined in the Apache configuration
$server_name = getenv("SERVER_NAME");

//Where the user came from
$http_refer = getenv("HTTP_REFERER");

//Info on the user's browser and operating system; also tells us if it's a spider accessing the page
$http_agent = getenv("HTTP_USER_AGENT");

//The date the file was accessed
$todays_date = date("D M j Y g:i:s a T");

//Put it all together in a comma separated string
//Commas inside a value (user agents often contain them) are swapped for spaces so they cannot shift the columns
//The trailing newline puts each error on its own line
$fields = array($ip, $server_name, $req_uri, $http_refer, $http_agent, $todays_date);
$txt = implode(",", str_replace(",", " ", $fields)) . "\n";

//Define the text file we are going to write to
$file = "notfound.txt";

//Open the file in 'a' mode to append to it ('w' would overwrite)
$action = fopen($file, 'a');

//Write the data to the file
fwrite($action, $txt);

//Clean up
fclose($action);
?>

*This code should be placed on a page designed the same as the rest of your site, including your standard navigation, so that users can easily reach the working areas of your site.
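
User agent strings and referrers frequently contain commas, and a comma inside a value would otherwise be read as a column break on import. One possible variation, sketched here, lets PHP's built-in fputcsv() quote such values so they are preserved intact. The function name log_404_error() and its parameters are illustrative assumptions, not part of the original script.

```php
<?php
// A minimal sketch, not a drop-in replacement: log_404_error() is a
// hypothetical helper name chosen for this example.
function log_404_error(string $file, array $server, string $date): bool
{
    // Same six columns as the logging script; missing values become ''.
    $row = array(
        $server['REMOTE_ADDR'] ?? '',
        $server['SERVER_NAME'] ?? '',
        $server['REQUEST_URI'] ?? '',
        $server['HTTP_REFERER'] ?? '',
        $server['HTTP_USER_AGENT'] ?? '',
        $date,
    );

    // 'a' appends; the file is created if it does not exist yet.
    $handle = fopen($file, 'a');
    if ($handle === false) {
        return false;
    }

    // Lock so two simultaneous 404s cannot interleave their lines.
    flock($handle, LOCK_EX);
    // fputcsv() wraps any value containing a comma in double quotes
    // and ends the line with a newline.
    $written = fputcsv($handle, $row, ',', '"', '\\');
    flock($handle, LOCK_UN);
    fclose($handle);

    return $written !== false;
}

// On the 404 page itself the call could look like this:
// log_404_error('notfound.txt', $_SERVER, date('D M j Y g:i:s a T'));
```

Access's text import wizard lets you choose the double quote as the text qualifier, so quoted values should arrive in a single column.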

Here are some brilliant examples

Displaying your custom 404 page using .htaccess:

A .htaccess file is a configuration file which can be used to override some of the Apache server's default settings, one of which is the page shown for a 404 error.
You can use more than one .htaccess file, as each .htaccess file applies to requests for files in the directory it is placed in.

All subdirectories that do not have an .htaccess file of their own are also affected. It is important you name the file correctly using all lowercase characters i.e. .htaccess

The format of the line in the file is:

ErrorDocument ERRORCODE URL

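It is best to use a local path to your 404 page (one beginning with /) rather than a full (absolute) URL. With a full URL, Apache redirects the browser to that address, so the visitor no longer receives the 404 status code.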

For example, if your .php file is named notfound.php and it is uploaded to a directory named news, you would enter the following line in your .htaccess file:

ErrorDocument 404 /news/notfound.php

Google’s new 404 widget

The Google 404 widget is a simple piece of code that Google provides through its Webmaster Tools. If you haven’t yet signed up for a free Google Account… what are you waiting for?

Depending on your site, Google will return suggestions or a search box which is automatically filled with the name of the page the user was looking for. This can be useful if you have sensible page naming conventions, and it means the user is more likely to find the page they were looking for.

The code is simply placed between your body tags and looks like the following:

<script type="text/javascript">
var GOOG_FIXURL_LANG = 'en';
var GOOG_FIXURL_SITE = 'http://www.YOUR-WEBSITE.co.uk/';
</script>
<script type="text/javascript" src="http://linkhelp.clients.google.com/tbproxy/lh/wm/fixurl.js"></script>

It is worth noting that this is still in development by Google, so the suggestions should improve over time.

Querying the information in the 404 error file using Microsoft Access.

First of all you must download the text file that you have been writing the errors to. It is sensible to do this at least once a week, otherwise the file could become very large and take up your hosting space.

Query Wizard

Now open up Microsoft Access and import the error text file, making sure you have the comma selected as the delimiter.
Follow the options through in the import wizard until you have your error table in Access.

Now go to your query tab and select New Query. You will then be able to go through the steps of a “Find Duplicates” wizard.

This will show you which pages are causing the most errors.
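
For a quick look without Access, a similar count can be approximated in PHP. The sketch below assumes the log keeps the six-column layout written by the logging script, with the requested page in the third column; count_404_pages() is a hypothetical helper name used only for illustration.

```php
<?php
// A rough sketch of what the "Find Duplicates" step reports.
// count_404_pages() is a hypothetical name used for illustration.
function count_404_pages(string $file): array
{
    $counts = array();
    $handle = fopen($file, 'r');
    if ($handle === false) {
        return $counts;
    }

    while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        // Skip blank or incomplete lines; column 3 (index 2) is the
        // requested page in the assumed layout.
        if (!isset($row[2])) {
            continue;
        }
        $uri = $row[2];
        $counts[$uri] = ($counts[$uri] ?? 0) + 1;
    }
    fclose($handle);

    // Most frequently requested missing pages first.
    arsort($counts);
    return $counts;
}

// Example: print the ten most problematic pages.
// foreach (array_slice(count_404_pages('notfound.txt'), 0, 10, true) as $uri => $hits) {
//     echo $hits . "\t" . $uri . "\n";
// }
```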

Prioritise fixing the most problematic pages and overwrite your 404 text file on your server with a blank file (of the same name) ready to start collecting data again.
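
If you would rather keep the old entries than discard them, the reset can be scripted. The following is only a sketch: archive_404_log() is a made-up helper name, and the suffixed archive filename is an assumption about how you might want to store old logs.

```php
<?php
// Sketch only: archive_404_log() is a hypothetical helper.
// It copies the current log to a suffixed file, then empties the
// original so the 404 page keeps appending to the same filename.
function archive_404_log(string $file, string $stamp): ?string
{
    // PHP caches file sizes; clear the cache to read the current one.
    clearstatcache();
    if (!is_file($file) || filesize($file) === 0) {
        return null; // nothing to archive
    }

    $archive = $file . '.' . $stamp; // e.g. notfound.txt.2024-01-31
    if (!copy($file, $archive)) {
        return null;
    }

    // Writing an empty string truncates the log to zero bytes.
    file_put_contents($file, '', LOCK_EX);
    return $archive;
}

// Example call:
// archive_404_log('notfound.txt', date('Y-m-d'));
```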

You will soon start to see your 404 error file shrink in size, whilst also being safe in the knowledge you are still capturing all those visitors that would have otherwise looked elsewhere!
