The robots.txt file

When it comes to SEO, most people understand that a Web site must have content, “search engine friendly” site architecture/HTML, and metadata such as title tags, image alt attributes and so on.

However, some Web sites disregard the robots.txt file entirely. When optimizing a Web site, don’t overlook the power of this little text file.

What is a Robots.txt File?

Simply put, if you go to www.domain.com/robots.txt, you should see a list of directories of the Web site that the site owner is asking the search engines to “skip” (or “disallow”). However, if you’re not careful when editing a robots.txt file, you could add rules that really hurt your business.

There is plenty of information about the robots.txt file available at the Web Robots Pages, including the proper usage of the disallow feature, and blocking “bad bots” from indexing your Web site.

The general rule of thumb is to make sure a robots.txt file exists at the root of your domain (e.g., www.domain.com/robots.txt). To exclude all robots from indexing part of your Web site, your robots.txt file would look something like this:

User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/

The above syntax would tell all robots not to index the /cgi-bin/, the /tmp/, and the /junk/ directories on your Web site.
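To make the matching rule concrete, here is a minimal PHP sketch of how a crawler might apply those Disallow lines. The function name is_path_disallowed() is hypothetical, and the logic is deliberately simplified: it only handles the "User-agent: *" group and plain path prefixes, with no wildcards.

```php
<?php
// Hypothetical helper: returns true when $path is blocked for all robots ("User-agent: *").
// Simplified sketch: plain prefix rules only, no wildcards and no Allow lines.
function is_path_disallowed($robots_txt, $path)
{
    $applies = false; // true while we are inside a "User-agent: *" group
    foreach (preg_split('/\r\n|\r|\n/', $robots_txt) as $line) {
        $line = trim($line);
        if ($line === '' || strpos($line, ':') === false) {
            continue; // skip blank lines and anything that is not "field: value"
        }
        list($field, $value) = explode(':', $line, 2);
        $field = strtolower(trim($field));
        $value = trim($value);

        if ($field === 'user-agent') {
            $applies = ($value === '*');
        } elseif ($field === 'disallow' && $applies && $value !== '') {
            // A rule matches when the path starts with the rule's value
            if (strpos($path, $value) === 0) {
                return true;
            }
        }
    }
    return false;
}

$robots_txt = "User-agent: *\nDisallow: /cgi-bin/\nDisallow: /tmp/\nDisallow: /junk/\n";

var_dump(is_path_disallowed($robots_txt, '/tmp/cache.html'));      // bool(true)
var_dump(is_path_disallowed($robots_txt, '/products/index.html')); // bool(false)
```

Real crawlers follow more detailed rules than this, so treat the sketch as an illustration of the prefix idea rather than a replacement for a proper testing tool.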

There are situations where a mistake in the robots.txt file can cause problems for your site optimisation. For instance, if you include “User-agent: *” followed by “Disallow: /” in your robots.txt file, you are telling the search engines not to crawl any part of the Web site, leaving you with no search presence, which is not what you want.

Another point to watch out for: if you are tempted to modify your robots.txt file to disallow old legacy pages and directories, you should really use a 301 permanent redirect instead, so that the value of the old Web pages passes to the new Web pages.
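As a rough illustration, a 301 redirect can be issued from PHP with the header() function. The URL map and the find_redirect() helper below are hypothetical examples, and many sites would configure redirects in the web server instead.

```php
<?php
// Hypothetical map of retired URLs to their replacements (example paths only).
$redirects = array(
    '/old-products.html' => '/products/',
    '/legacy/about.htm'  => '/about/',
);

// Returns the new location for a retired path, or null when there is none.
function find_redirect(array $redirects, $request_path)
{
    return isset($redirects[$request_path]) ? $redirects[$request_path] : null;
}

$path = isset($_SERVER['REQUEST_URI'])
    ? parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH)
    : '/old-products.html'; // fallback so the sketch also runs from the command line

$target = find_redirect($redirects, $path);
if ($target !== null && PHP_SAPI !== 'cli') {
    // The third argument sets the HTTP status code to 301 (Moved Permanently)
    header('Location: ' . $target, true, 301);
    exit;
}
```

The redirect has to be sent before any other output, which is why the check sits at the very top of the script.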

Robots.txt Dos and Don’ts

There are many good reasons to stop the search engines from indexing certain directories on a Web site and allowing others for SEO purposes.

Here’s what you should do with robots.txt:

* Take a look at all of the directories in your Web site. Most likely, there are directories that you’d want to disallow the search engines from indexing, including directories like /cgi-bin/, /wp-admin/, /cart/, /scripts/, and others that might include sensitive data.
* Stop the search engines from indexing certain directories of your site that might include duplicate content. For example, some Web sites have “print versions” of Web pages and articles that allow visitors to print them easily. You should only allow the search engines to index one version of your content.
* Make sure that nothing stops the search engines from indexing the main content of your Web site.
* Look for certain files on your site that you might want to disallow the search engines from indexing, such as certain scripts, or files that might contain e-mail addresses, phone numbers, or other sensitive data.

Here’s what you should not do with robots.txt:

* Don’t use comments in your robots.txt file.
* Don’t list all your files in the robots.txt file. Listing the files allows people to find files that you don’t want them to find.
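* The original robots.txt standard has no “Allow” command, and everything is allowed unless you disallow it, so there’s normally no need to add one to the robots.txt file. (Some crawlers, including Googlebot, do understand Allow as an extension.)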

By taking a good look at your Web site’s robots.txt file and making sure that the syntax is set up correctly, you’ll avoid search engine ranking problems. By stopping the search engines from indexing duplicate content on your Web site, you can potentially overcome duplicate content issues that might hurt your search engine rankings.

Test a robots.txt file

Google provides a facility as part of its Webmaster Tools system that lets you test a robots.txt file.

Test a site’s robots.txt file:

On the Webmaster Tools Home page, click the site you want.
Under Health, click Blocked URLs.
If it’s not already selected, click the Test robots.txt tab.
Copy the content of your robots.txt file, and paste it into the first box.
In the URLs box, list the site to test against.
In the User-agents list, select the user-agents you want.

Any changes you make in this tool will not be saved. To save any changes, you’ll need to copy the contents and paste them into your robots.txt file.


Single and double quotes in PHP

There is a difference in the way that PHP handles single and double quote marks when using the echo statement.

For example:

$var = 'test';

The statements echo('$var'); and echo("$var"); will generate different results.

echo "\$var is equal to $var";

will display

$var is equal to test

While:

echo '\$var is equal to $var';

will display

\$var is equal to $var

In the case of the single quotes, the variable name is displayed as is, and so is the backslash.
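The difference can be checked with a short script. This is only a sketch that repeats the example above and adds concatenation as a third option; the variable names $double and $single are just for illustration.

```php
<?php
$var = 'test';

$double = "\$var is equal to $var"; // the escape and the variable are both interpreted
$single = '\$var is equal to $var'; // everything is kept literally

echo $double . "\n"; // $var is equal to test
echo $single . "\n"; // \$var is equal to $var

// Concatenation gives the same result as interpolation
echo '$var is equal to ' . $var . "\n"; // $var is equal to test
```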

Using filter_var to validate an email address in PHP 5.2.0 onwards

PHP 5.2.0 onwards has the filter_var function which can be used to validate many different inputs.

To validate an email address :

<?php
// Validate an email address in PHP 5.2.0 onwards

$email_address = "me@example.com";
if (filter_var($email_address, FILTER_VALIDATE_EMAIL)) {
    // The email address is valid
} else {
    // The email address is not valid
}
?>
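The same check can be applied to a list of inputs. The sample addresses below are made up for illustration; the sketch relies only on the fact that filter_var() returns the filtered value on success and false on failure.

```php
<?php
// Hypothetical sample input; none of these addresses come from a real system.
$candidates = array('me@example.com', 'not-an-email', 'user@@example.com');

$valid = array();
foreach ($candidates as $candidate) {
    // filter_var() returns the filtered value on success and false on failure
    if (filter_var($candidate, FILTER_VALIDATE_EMAIL) !== false) {
        $valid[] = $candidate;
    }
}

print_r($valid); // only me@example.com passes validation
```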

More on try / catch in PHP 5

A try / catch block is meant to catch exceptions. An exception is an object that code throws to signal a problem it cannot handle itself, such as an attempt to divide by zero, and it can be caught.

An error, on the other hand, is not usually recoverable. An example of an error would be forgetting to place a ; at the end of a line or not enclosing a string in quote marks.

In the case of divide by zero, if the exception is thrown inside a try / catch block, program execution will continue because you have caught the exception.

Each try must have at least one corresponding catch block.  You can have multiple catch blocks to catch different classes of exceptions.

When an exception is thrown, the code following the statement will not be executed and PHP will then attempt to find the first matching catch block.

The general form of a try / catch block is :

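try
{
    $a = 1;
    $b = 0;
    if ($b == 0) {
        // PHP 5 only raises a warning on division by zero, so throw explicitly
        throw new Exception('Division by zero');
    }
    $c = $a / $b;
}
catch (Exception $e)
{
    echo($e->getMessage());
}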

The methods of the Exception class are:

getMessage();        // message of exception
getCode();           // code of exception
getFile();           // source filename
getLine();           // source line
getTrace();          // an array of the backtrace()
getPrevious();       // previous exception
getTraceAsString();  // formatted string of trace
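A short sketch of a few of these methods in use follows. The messages and codes are invented for the example, and getPrevious() needs PHP 5.3 or later.

```php
<?php
try {
    try {
        throw new Exception('Low-level failure', 10);
    } catch (Exception $inner) {
        // The third constructor argument links the original exception (PHP 5.3+)
        throw new Exception('High-level failure', 20, $inner);
    }
} catch (Exception $e) {
    echo $e->getMessage() . "\n";                              // High-level failure
    echo $e->getCode() . "\n";                                 // 20
    echo basename($e->getFile()) . ':' . $e->getLine() . "\n"; // file and line of the throw
    echo $e->getPrevious()->getMessage() . "\n";               // Low-level failure
}
```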

You may extend the Exception class to create your own custom exceptions and then use multiple catch blocks to catch the different classes of exception, as shown in the following code:

<?php

// Extending the exception class
// Note: find_widget(), t() and watchdog() are assumed to be defined elsewhere
// (t() and watchdog() look like Drupal helper functions).

class WidgetNotFoundException extends Exception {}

function use_widget($widget_name) {
    $widget = find_widget($widget_name);

    if (!$widget) {
        throw new WidgetNotFoundException(t('Widget %widget not found.', array('%widget' => $widget_name)));
    }
}

// The try / catch block

try {
    $widget = 'thingie';
    $result = use_widget($widget);

    // Continue processing the $result.
    // If an exception is thrown by use_widget(), this code never gets called.
}
catch (WidgetNotFoundException $e) {
    // Error handling specific to the absence of a widget.
}
catch (Exception $e) {
    // Generic exception handling if something else gets thrown.
    watchdog('widget', $e->getMessage(), WATCHDOG_ERROR);
}

?>
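The listing above depends on helper functions defined elsewhere, so here is a self-contained sketch of the same idea. The lookup logic (only a widget called 'gadget' exists) and the try_widget() wrapper are hypothetical stand-ins, not part of the original code.

```php
<?php
class WidgetNotFoundException extends Exception {}

// Hypothetical stand-in for a real lookup: only 'gadget' exists.
function use_widget($widget_name)
{
    if ($widget_name === '') {
        throw new InvalidArgumentException('Widget name must not be empty.');
    }
    if ($widget_name !== 'gadget') {
        throw new WidgetNotFoundException("Widget $widget_name not found.");
    }
    return "Using $widget_name";
}

function try_widget($widget_name)
{
    try {
        return use_widget($widget_name);
    } catch (WidgetNotFoundException $e) {
        // The most specific catch block is listed first
        return 'Not found: ' . $e->getMessage();
    } catch (Exception $e) {
        // Anything else that extends Exception ends up here
        return 'Other problem: ' . $e->getMessage();
    }
}

echo try_widget('gadget') . "\n";  // Using gadget
echo try_widget('thingie') . "\n"; // Not found: Widget thingie not found.
echo try_widget('') . "\n";        // Other problem: Widget name must not be empty.
```

The order of the catch blocks matters: PHP uses the first one that matches, so the generic Exception block has to come last.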

Simple PHP 5 error handling

<?php
// Create a function that throws an exception
function checkNum($number)
{
    if ($number > 1)
    {
        throw new Exception("Value must be 1 or below");
    }
    return true;
}

// Trigger the exception in a "try" block
try
{
    checkNum(2);
    // If the exception is thrown, this text will not be shown
    echo 'If you see this, the number is 1 or below';
}
// Catch the exception
catch (Exception $e)
{
    echo 'Message: ' . $e->getMessage();
}
?>

Examples of using PDO objects in PHP

<?php

// Example of fetching data from a database using PDO objects

# using the shortcut ->query() method here since there are no variable
# values in the select statement.

$table = array(); // rows collected from the query

try {

    $dbhost     = "localhost";
    $dbname     = "users";
    $dbusername = "root";
    $dbpass     = "";

    // Connect to the database
    $dbh = new PDO("mysql:host=" . $dbhost . ";dbname=" . $dbname, $dbusername, $dbpass);

    // Throw a PDOException on SQL errors so the catch block below handles them
    $dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

    // The SQL query
    $sql = "SELECT * FROM users";

    // Statement handle
    $sth = $dbh->query($sql);

    # setting the fetch mode
    $sth->setFetchMode(PDO::FETCH_ASSOC);

    echo("--------------------------------------------<br/>");
    echo("An example of a while loop<br/>");
    while ($row = $sth->fetch()) {
        echo($row["first_name"] . "<br/>");
        $table[] = $row;
    }

    $dbh = null;

} catch (PDOException $e) {
    print "Error!: " . $e->getMessage() . "<br/>";
    die();
}

echo("<br/><br/>");

echo("--------------------------------------------<br/>");
echo("An example of looping around an array<br/>");

if ($table) {    // Check if there are any rows to be displayed
    // Retrieve each element of the array
    foreach ($table as $d_row) {
        echo($d_row["first_name"] . " " . $d_row["last_name"] . "<br/>");
    }
}

echo("--------------------------------------------<br/>");
echo("An example of printing one element from the array<br/>");
if ($table) {
    echo($table[0]["first_name"]);
}

?>

<?php

// Example of fetching data from a database using PDO objects

// This uses a prepared statement with a named placeholder

$table = array(); // rows collected from the query

try {

    $dbhost     = "localhost";
    $dbname     = "users";
    $dbusername = "root";
    $dbpass     = "";

    $first_name = "%paul%";

    // Connect to the database
    $dbh = new PDO("mysql:host=" . $dbhost . ";dbname=" . $dbname, $dbusername, $dbpass);

    // Throw a PDOException on SQL errors so the catch block below handles them
    $dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

    // The SQL query using a named placeholder
    $sql = "SELECT * FROM users WHERE first_name LIKE :first_name";

    // Statement handle
    $sth = $dbh->prepare($sql);

    $sth->execute(array(":first_name" => $first_name));

    $sth->setFetchMode(PDO::FETCH_ASSOC);

    echo("<br/><br/>");
    echo("--------------------------------------------<br/>");
    echo("An example of printing values from a select statement with parameters<br/>");

    while ($row = $sth->fetch()) {
        echo($row["first_name"] . "<br/>");
        $table[] = $row;
    }

    $dbh = null;

} catch (PDOException $e) {
    print "Error!: " . $e->getMessage() . "<br/>";
    die();
}

?>

Some regular expression matches

Regular Expression       Will match…

foo                      The string “foo”
^foo                     “foo” at the start of a string
foo$                     “foo” at the end of a string
^foo$                    “foo” when it is the entire string
[abc]                    a, b, or c
[a-z]                    Any lowercase letter
[^A-Z]                   Any character that is not an uppercase letter
(gif|jpg)                Either “gif” or “jpg”
[a-z]+                   One or more lowercase letters
[0-9\.\-]                Any digit, dot, or minus sign
^[a-zA-Z0-9_]{1,}$       Any word of at least one letter, number or _
([wx])([yz])             wy, wz, xy, or xz
[^A-Za-z0-9]             Any symbol (not a number or a letter)
([A-Z]{3}|[0-9]{4})      Three uppercase letters or four digits
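A few of these patterns can be tried out with preg_match(). The sample subjects are made up for illustration, and the last pattern has ^ and $ added (an assumption on my part) so that the whole subject has to match.

```php
<?php
// Each entry: a pattern from the table, a sample subject, and whether it should match.
$checks = array(
    array('/^foo/',                  'foobar',   true),
    array('/foo$/',                  'foobar',   false),
    array('/^foo$/',                 'foo',      true),
    array('/[^A-Z]/',                'ABC',      false),
    array('/(gif|jpg)/',             'logo.jpg', true),
    array('/^[a-zA-Z0-9_]{1,}$/',    'user_42',  true),
    array('/^([A-Z]{3}|[0-9]{4})$/', '2024',     true), // anchors added for a whole-string match
);

$results = array();
foreach ($checks as $check) {
    list($pattern, $subject, $expected) = $check;
    $matched = (preg_match($pattern, $subject) === 1);
    $results[] = ($matched === $expected); // true when the outcome is as expected
    printf("%-26s %-10s %s\n", $pattern, $subject, $matched ? 'match' : 'no match');
}
```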

Possible way of dealing with inserting quote marks into a database

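This is another possible way of dealing with quote marks when inserting data into a database. It applies to older PHP versions only: magic quotes were removed in PHP 5.4, and get_magic_quotes_gpc() itself was removed in PHP 8.0.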

if (!get_magic_quotes_gpc()) {
    $item_name = addslashes($_POST['txtItem_Name']);
}
else
{
    $item_name = $_POST['txtItem_Name'];
}
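On PHP versions where that function no longer exists, a prepared statement is one alternative for handling quote marks. The sketch below assumes the pdo_sqlite extension is available and uses an in-memory database with a hypothetical items table, purely so that it runs on its own.

```php
<?php
// Assumption: the pdo_sqlite extension is available; an in-memory database
// keeps the sketch self-contained.
$dbh = new PDO('sqlite::memory:');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$dbh->exec('CREATE TABLE items (item_name TEXT)');

// Hypothetical form value containing quote marks
$item_name = "O'Reilly \"Pocket\" Guide";

// The placeholder keeps the value separate from the SQL text,
// so no manual escaping is needed
$sth = $dbh->prepare('INSERT INTO items (item_name) VALUES (:item_name)');
$sth->execute(array(':item_name' => $item_name));

$stored = $dbh->query('SELECT item_name FROM items')->fetchColumn();
echo $stored . "\n"; // O'Reilly "Pocket" Guide
```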
