Mod_Rewrite URLs for Search Engines

by: bs0d

Mod_Rewrite is a module for the Apache web server that lets you manipulate URLs. Mod_Rewrite can be quite powerful, and is known as a "voodoo" tool and "the Swiss Army knife of URL manipulation." In this tutorial, I'm going to cover using Mod_Rewrite and regex to make your URLs search engine friendly, which can increase hits to your site from search engines (like Google).

Should I? or Shouldn't I?

Search engines will not crawl deep into query strings. I'm saying "deep" because a spider will crawl a query string a bit, but rewriting your URLs is a good decision for other reasons as well. Take a look:
  1. Google can get meaning from your URLs, and they can be searched accordingly.
    Say you used a query string to refer to an article on your site, like ?id=45. Well, id=45 does not tell Google anything! If you changed that to /study-on-science/ then Google can read that, and if someone searches for "science" or "study", it is more likely to show up.
  2. It is easy to read!
    By appearance alone, it looks more professional and pleasing to the eye than a bunch of confusing numbers and characters. Your visitors will be able to recall an article more easily by its URL than by memorizing query strings. Also, many popular sites today use this method.
  3. More hits to your site.
    Doing this is likely to increase hits to your website. If your site is new, it's important to let everyone know that it's out there. For the reasons above, the hits to your site should increase, as well as its rank.
Everyone has their own opinion on this, but I feel it is important and can help your site out. It is also quite easy once you get the hang of it, so why not try? If you use query strings in your URLs and do not rewrite them, you're only holding your site back from hits it deserves for relevant searches.

So if you're reading this tutorial, you've probably found that search engines do not like query strings and deal much better with text-based directories. So if we make your URLs look like this:

www.mysite.com/tutorials/category/id/title/page.php

Then the search engines will be more than happy to make that information available in searches.

Getting Started

First, let's assume that the URLs on your site currently look a bit like this:

www.mysite.com/tutorials/view.php?id=tutorial_id&page=page_requested

Our goal and final product is to make that same file accessible by a prettier and friendlier URL like this:

www.mysite.com/tutorials/category/id/title/page.php

OK, to eliminate problems that may occur when you begin the rewriting process, let's make sure that everything is set up right, so that if something goes wrong we know it has to be the code producing errors and not anything else.

The mod_rewrite module must be loaded.

*Note: If you're running your own webserver, you can check by opening the httpd.conf file. On line 182 (Apache 1.3x; the line number may differ in your copy), make sure the LoadModule line for rewrite_module is there and that it's not commented out (does not have # in front of it). If it does, remove the #. Now go to line 226. It should be the AddModule line for mod_rewrite.c. If it has a # in front, remove it. Now the module will be loaded when you start Apache. If you made any changes, restart Apache so they take effect.
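As a rough sketch, the two httpd.conf lines on an Apache 1.3 install look something like the following. The module path (modules/mod_rewrite.so here) is an assumption and varies between builds and platforms, so compare it with the other LoadModule lines in your own file.

```apache
# httpd.conf (Apache 1.3x); the .so path below is an assumed example
LoadModule rewrite_module modules/mod_rewrite.so
AddModule mod_rewrite.c
```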

If you're not running your own webserver (most likely), then you will probably not have to bother with any of that, because it is probably already set up for you by the server administrator. If it still won't work and you feel your code is correct, you can try inserting this:

Options +FollowSymLinks

as the first line of your .htaccess file.

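The magic happens, so to speak, in your .htaccess file. You can put the rewrite rules in the .htaccess of your root folder or in the folder you want the changes to apply to. Keep in mind that in a .htaccess file the pattern is matched against the path relative to that folder; the rules in this tutorial assume the .htaccess sits in the /tutorials/ folder.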

The Code

Open up your .htaccess file. To begin using mod_rewrite, your first line of code will be:

RewriteEngine On

Simple enough, right? Next we're going to tell mod_rewrite where to work (base directory) with this statement:

RewriteBase /tutorials/

The next line of code will be your actual rewrite rule, which contains regex. What is regex? "A regular expression, which is a way of describing a pattern to match in text. The directive definition will specify what the regex is matching against." - Apache

See also:
Apache.org

When I started learning about mod_rewrite, this is the part I first had trouble with, because I didn't find it explained clearly anywhere, so here I am going to describe each part as best I can.

You will use these expressions in the code to rewrite your URLs. Here they are and what they do:

^ - Start of line, to match from the beginning.
$ - End of line, match to the end.
. (dot) - Match any single character.
* - Match the preceding element zero or more times.
[ ] - Match any one character within the brackets.
[^ ] - Match any one character except what's listed in the brackets.
+ - Match the preceding element one or more times.
\ - Escape the following character (for example, \. matches a literal dot).
.* - Match any number of characters (including none).
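To make the list concrete, here are a few patterns as they might appear in RewriteRule lines. The file names (about.php, item.php, index.php) are made up for illustration and are not part of this tutorial's site.

```apache
# ^ and $ anchor both ends, so this matches only the exact path "about"
RewriteRule ^about$ about.php

# [0-9]+ matches one or more digits; the parentheses capture them as $1
RewriteRule ^item/([0-9]+)$ item.php?id=$1

# \. matches a literal dot, so this matches "index.html" but not "indexXhtml"
RewriteRule ^index\.html$ index.php
```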

OK, that should clear things up a bit. At the end of this tutorial, I will provide links to some references and guides you can use to continue learning about regex, etc.
At the end of a rule or condition, you can add flags in square brackets that control how it is evaluated, see below:

[NC] - No case (not case sensitive).
[OR] - Or next condition (used on RewriteCond lines: declare one, then put [OR] and declare another).
[L] - Last rule.
[F] - Force URL to forbidden (HTTP 403: Forbidden).
[R] - Force an external redirect (HTTP 302: Moved Temporarily, by default).
-d - Is a directory (this one is a RewriteCond test rather than a flag).

You can combine these. If you want to say no case for a condition and specify an or, you would just put [NC,OR] at the end. A ! does not go inside the flag brackets, but putting one in front of a pattern or a RewriteCond test turns it around: !-d means "not a directory" and !-l means "not a symbolic link", just like using ! in other programming.
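A small sketch of how flags and negation combine. The host names and the old/ and new/ paths are placeholders chosen for the example, not part of this tutorial's site.

```apache
# Either host condition may be true ([OR]); both ignore case ([NC])
RewriteCond %{HTTP_HOST} ^mysite\.com$ [NC,OR]
RewriteCond %{HTTP_HOST} ^www\.mysite\.com$ [NC]
# ...and the request must NOT be an existing directory (! negates -d)
RewriteCond %{REQUEST_FILENAME} !-d
# Redirect externally (302 by default) and stop processing further rules
RewriteRule ^old/(.*)$ new/$1 [R,L]
```

Conditions without [OR] are joined with an implicit "and", so this reads as (first host or second host) and not a directory.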

Now that we have turned the rewrite engine on, and declared the base directory, we can begin setting up our rewrite rules.
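Going by the breakdown that follows, the first rule would look roughly like this. It is a reconstruction from the description, and the trailing $ (so that only paths ending in a slash match) is an assumption.

```apache
# /tutorials/<type>/ is served by categories.php?type=<type>
RewriteRule ^(.*)/$ categories.php?type=$1 [NC]
```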




Let's break down our rule. You see ^(.*)/ followed by categories.php?type=$1 and finally [NC]. ^(.*)/ is a wildcard: whatever the parentheses match is captured, and we refer back to it with $1. This rule is saying that the first directory after /tutorials/ will be passed to categories.php as the value of type. For example, if you link a visitor to:

www.yoursite.com/tutorials/php/

it will display the results as if you had sent them to:

www.yoursite.com/tutorials/categories.php?type=php

See? Now, if php was not a type in your categories.php file, then it would bring up the error you set for when an unknown type is requested.
The rule so far handles one directory. Now we need to set up a rule that will take care of the rest, because the old URL was set to:
www.yoursite.com/tutorials/view.php?id=X&page=X (where X = the id and page of the tutorial in the database)

Instead of the URL looking like that, we're going to change it to this (for example):
www.yoursite.com/tutorials/php/1/simple_php_tutorial/1.php

The first directory after /tutorials/ is the category, followed by the tutorial id #, then the name of the tutorial and finally the page number. You can separate the words in category types and tutorial names with whatever you wish, but if you decide to use a space, the browser will show it as %20. So the underscore (_) and dash (-) seem to be the most popular separators.

Here is the rule for the change:
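Reconstructed from the breakdown that follows, the rule would look roughly like this. The dot is escaped here (\.) so that it matches only a literal dot; that detail is an assumption rather than something the text spells out.

```apache
# /tutorials/<category>/<id>/<title>/<page>.php -> view.php?id=<id>&page=<page>
# $1 (category) and $3 (title) are matched but not passed on
RewriteRule ^(.*)/(.*)/(.*)/(.*)\.php$ view.php?id=$2&page=$4 [NC]
```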



Here is the rule broken down. ^(.*)/ starts the rule with our first wildcard directory after /tutorials/, followed by two others. The final one is a wildcard too, but it's not a directory. Notice that the pattern ends with .php$, so we're saying the final wildcard is going to be a PHP file. (Strictly, an unescaped dot matches any character; write \.php$ to match only a literal dot.) The [NC] at the end indicates no case, not case sensitive.

Now after that, we tell the rule where to get the data for the wildcard directories and file. Recall the original link:

www.yoursite.com/tutorials/view.php?id=X&page=X (where X = the id and page of the tutorial in the database)

So, in our rule we put: view.php?id=$2&page=$4 [NC]. We told it to access the view.php file. The value matched by the 2nd wildcard ($2) becomes id, and the value matched by the 4th wildcard ($4) becomes page in the view.php query string. The 1st and 3rd wildcards (category and title) are not passed to view.php; they are only there to make the URL readable.

When someone now visits: www.yoursite.com/tutorials/php/1/simple_php_tutorial/1.php
it will be the same as: www.yoursite.com/tutorials/view.php?id=1&page=1

This is your final product code:
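Putting the pieces together, the finished .htaccess would look something like this. The two rule lines are reconstructed from the breakdowns in this tutorial, so treat the exact patterns (the trailing $ and the escaped dot in particular) as assumptions, along with the file living in the /tutorials/ folder.

```apache
# .htaccess, assumed to live in the /tutorials/ folder
RewriteEngine On
RewriteBase /tutorials/

# /tutorials/<type>/ -> categories.php?type=<type>
RewriteRule ^(.*)/$ categories.php?type=$1 [NC]

# /tutorials/<category>/<id>/<title>/<page>.php -> view.php?id=<id>&page=<page>
RewriteRule ^(.*)/(.*)/(.*)/(.*)\.php$ view.php?id=$2&page=$4 [NC]
```

Neither pattern matches the rewritten targets (categories.php and view.php contain no slash), so the rules do not loop back on themselves.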


So, where do the tutorial names come from?

I got the names of my tutorials straight from the tutorials table in my site's database, and the same goes for the category. But the names from the database will have spaces, and the browser interprets those as "%20", so you will want to use the str_replace() function to change them to "spaces_like_this". Or use the PHP function urlencode(), which encodes spaces as "+". For example, str_replace(' ', '_', $title) turns "simple php tutorial" into "simple_php_tutorial" (here $title is a placeholder for whatever variable holds the name).

The only thing left to do is to change the links in your files. Inside the URL, you can easily echo the type, title and id right into the link (from the database). Remember though, you probably don't want the URL to have %20 for every space in the title or type, so keep urlencode() and str_replace() in mind, like in the example above.

Conclusion & Links


We have covered only the basics of using mod_rewrite, but this is a good start. A good next step after this tutorial would be to learn about rewrite conditions. I am still learning about them myself. Perhaps once I perfect the process there will be another tutorial similar to this one. I hope that you find this tutorial useful. I know there are lots of tutorials on this topic out there today, so I am honored that you read through it, and I hope you gained some knowledge from it. Below are some more resources for definitions, rules and syntax, in case you still need more examples to clear some things up.

-bs0d | www.allsyntax.com


"AllSyntax.com" Copyright © 2002-2018; All rights lefted, all lefts righted.