Posts Tagged ‘SEO’

May 15th, 2008

What I learned about SEO from Celebrity Jeopardy!

I was having a conversation with my best friend a few days ago, and we got on the subject of how we prefer blog URLs to be rendered.

I fall on the side of lowercase letters and hyphens splitting the words:

http://www.somesite.com/2008/05/my-url-preference-is-like-this/

He falls on the side of title case lettering and no hyphens splitting the words:

http://www.somesite.com/2008/05/HisURLPreferenceIsLikeThis/

He has his reasons and I have mine; I just think mine are more valid. Sorry, Al, that is my opinion. I am going to lay out why I think mine are more valid, with an example from Celebrity Jeopardy! For those of you who aren’t familiar with this famous skit:

Celebrity Jeopardy! was a recurring sketch on Saturday Night Live. It parodies the Celebrity Jeopardy! edition of the television game show Jeopardy! where celebrities compete and the game’s level of difficulty is significantly reduced. Thirteen sketches have been aired to date, two per season from 1996 to 2002, and one in 2005.

Before I get to my commentary, let’s first watch this excerpt from Saturday Night Live’s Celebrity Jeopardy!:

I really tried to find a good example of one of the famous Sean Connery mess-ups on a legal video sharing site, but none of them had anything usable. The skit I was really looking for was the famous “An Album Cover”, which Sean Connery reads as “Anal Bum Cover”.

As you can see, Norm MacDonald, playing the character of Burt Reynolds, transforms the category in Celebrity Jeopardy! on purpose, for comedic effect. In my analogy Google is going to be the Burt Reynolds of your search. However, instead of finding the wrong words on purpose, it does so because it is a dumb machine that does what it is asked, even if the results are not contextually accurate.

[Image: An Album Cover Google Example]

Notice in the image above that, in the highlighted words, Google finds both “An Album Cover” and “Anal Bum Cover”. This is because Google understands that the words you are looking for don’t always appear in the same order and spacing as the exact phrase you typed. This is something SEO experts have known for a long time, and they try to control it so that their content shows in the top spot for the keywords they designed in to the page.
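To make the ambiguity concrete, here is a small Python sketch. It is only a toy: the vocabulary is one I made up for this example, and it says nothing about how Google actually tokenizes a URL. It simply lists every way a run-together slug can be split in to known words, next to the single reading a hyphenated slug allows.

```python
def segmentations(text, vocabulary):
    """Return every way to split `text` into words found in `vocabulary`."""
    if not text:
        return [[]]
    results = []
    for end in range(1, len(text) + 1):
        word = text[:end]
        if word in vocabulary:
            for rest in segmentations(text[end:], vocabulary):
                results.append([word] + rest)
    return results


# Toy vocabulary, chosen only for this illustration.
VOCABULARY = {"an", "anal", "album", "bum", "cover"}


def readings(slug):
    """List the possible word sequences a reader (or crawler) could see."""
    if "-" in slug:
        # Hyphens already carry the word boundaries, so there is one reading.
        return [slug.lower().split("-")]
    # Without separators the boundaries have to be guessed.
    return segmentations(slug.lower(), VOCABULARY)
```

With this vocabulary, `readings("AnAlbumCover")` comes back with two candidate readings, while `readings("an-album-cover")` comes back with exactly one.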

If you don’t control your URL, which carries some of the highest-ranking keywords on your site, you could end up decreasing the effectiveness of your keywords, much like an almost-duplicate keyword penalty. Granted, I don’t know if such a penalty exists, but when you are dealing with SEO it never hurts to be as careful and precise as possible.

So again I ask: now that you know how a URL can be misconstrued, which URL would you rather have?

http://www.somesite.com/2008/05/an-album-cover/
http://www.somesite.com/2008/05/AnAlbumCover/

So this is my way of saying: be careful what your URL spells out. You may get search rankings that you did not intend, or you may offend a person who reads the URL wrong. Either way, it is always good to control your environment, within reasonable means, to make sure the message is received as you intended.

Note: There are other factors in play that yielded the search results above. However, one thing you will notice is that none of the URLs were falsely highlighted; that is because they used a non-whitespace character to break up the words.

Posted in SEO

February 10th, 2008

How to use the .NET URL Rewriter and Reverse Proxy to run WordPress on IIS

First off, I would like to say that many of my readers are very intelligent: they picked up on a single sentence in my last post, about my new design, that mentioned Coder Journal switching from Linux to Windows.

I also moved hosts, from GoDaddy’s shared Linux hosting to GoDaddy’s virtual dedicated hosting on Windows. This proved difficult since URL Rewriting isn’t currently built in to IIS 6.0 like it is in Apache. I will talk a little about this setup in a later post.

Switching from Linux to Windows wasn’t the part that really intrigued many of them; it happens every day, so why would it? It was the fact that I was able to get the same level of URL Rewriting out of IIS 6.0 as I got out of Apache’s mod_rewrite, and still make WordPress look and function like it was running on Apache.

So to get started, I just want to say that I know there are other solutions out there for hosting WordPress on IIS with the exact same outcome as what I am going to present below. I did it this way for the following reasons:

  1. I am a .NET guy and I love developing software that is popular on other platforms on .NET just to see if it can be done.
  2. I also believe in Eating One’s Own Dog Food, and the URL Rewriter and Reverse Proxy that I am presenting below, and that is used in Coder Journal, is my own creation.

What This Post Covers

This post is meant to provide insight in to a technology, Reverse Proxy, that many developers are unaware of. It is demonstrated through my blog and how it works with WordPress on IIS 6.0. Some of the basics will be covered, such as the workings of a URL Rewriter and a Reverse Proxy. This post will not cover how to code a URL Rewriter or Reverse Proxy in C#. The reader should also have a basic understanding of how RegEx, HTTP, and URL Rewriters work.

The Problem

IIS 6.0 and previous versions lack any standardized, built-in URL Rewriting process, so developers have to take nice visitor- and SEO-friendly URLs like this:

http://www.coderjournal.com/2008/02/10/sample-post/

And turn them into ugly, IIS 6.0-compatible URLs like the two below, which may or may not be SEO friendly. Neither is as visitor friendly as the one above.

http://www.coderjournal.com/?p=123
http://www.coderjournal.com/index.php/2008/02/10/sample-post/

My Solution Used On Coder Journal

The solution I chose was influenced by a number of factors, a couple of which will change for the better when IIS 7.0 is released. The factors are:

  • I need to run PHP for WordPress.
  • I need to run FastCGI for IIS 6.0 to get the best performance out of PHP.
  • .NET and PHP run separately from each other, so I cannot use a .NET URL Rewriter to control which PHP file is chosen to run. (This changes in IIS 7.0 with Integrated Pipelines)
  • I need to pass all requests to www.coderjournal.com through .NET, which has a performance cost when serving static files such as images and text files. (This changes in IIS 7.0 with Integrated Pipelines)
  • I need to keep the URLs friendly for visitors and SEO.

Because of the factors listed above, I needed two web servers to host www.coderjournal.com, which I will talk about later on in this article. One is the public interface to www.coderjournal.com, which I will call frontend. The other, which I will call backend, is a standard WordPress web server that only handles the ugly URLs listed above and is not public. The picture will demonstrate the structure better than I can explain.

[Image: Coder Journal Web Structure]

As you can see from the picture above, all requests to WordPress are handled by the frontend server for this blog. This all happens through a technique known as Reverse Proxy.

A reverse proxy dispatches in-bound network traffic to a set of servers, presenting a single interface to the caller. For example, a reverse proxy could be used for load balancing a cluster of web servers. In contrast, a forward proxy acts as a proxy for out-bound traffic. For example, an ISP may use a proxy to forward HTTP traffic from its clients to external web servers on the internet; it may also cache the results to improve performance.

So, without going in to a deep explanation of how I accomplished the reverse proxy: for every request that comes in to the frontend server and meets certain criteria, I make another HTTP web request to the backend server and then write its response back to the original frontend request.
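The request/response relay described above can be sketched in a few lines. This is a minimal illustration in Python using only the standard library, not my actual .NET component; the backend address, the port numbers, and the names `backend_url`, `ReverseProxyHandler`, and `serve` are all assumptions made up for the example. It only handles GET, copies just the content type, and lets `urlopen` follow any backend redirects on its own.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.error import HTTPError
from urllib.request import urlopen

# Hypothetical address of the private backend server.
BACKEND = "http://127.0.0.1:8081"


def backend_url(path, backend=BACKEND):
    """Map a path requested on the frontend to the same path on the backend."""
    return backend.rstrip("/") + path


class ReverseProxyHandler(BaseHTTPRequestHandler):
    """Frontend handler that relays GET requests to the backend."""

    def do_GET(self):
        try:
            # Make a second HTTP request on behalf of the visitor.
            with urlopen(backend_url(self.path)) as response:
                status = response.status
                content_type = response.headers.get("Content-Type", "text/html")
                body = response.read()
        except HTTPError as error:
            # Relay backend errors (404, 500, ...) instead of hiding them.
            status = error.code
            content_type = error.headers.get("Content-Type", "text/html")
            body = error.read()
        # Write the backend's answer back to the original request.
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8080):
    """Run the frontend; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), ReverseProxyHandler).serve_forever()
```

Calling `serve()` would start the frontend on port 8080. The visitor only ever talks to the frontend and never sees the backend address.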

Step 1 - Setting Up .NET to Process All Requests

Set up your frontend server to process everything through the .NET framework.

  1. Open IIS, right-click on the website, and select Properties.
  2. Click the Configuration button under the Application Settings section.
  3. Click the Insert… button to create a new wildcard mapping.
  4. Set the executable textbox to the aspnet_isapi.dll file location.
    For .NET 2.0, 3.0, and 3.5: C:\Windows\Microsoft.NET\Framework\v2.0.50727\aspnet_isapi.dll
  5. Make sure the checkbox Verify that file exists is not checked.
  6. Press OK to confirm and close all the windows.

Step 2 - Install PHP/WordPress

Just follow this article on IIS.NET for installing PHP/WordPress on IIS 6.0. You may also want to install FastCGI; I recommend it, but it is optional.

Step 3 - Setting Up the URL Rewriter and Reverse Proxy Rules

The criteria for the requests are put inside the URL Rewriter Rules files. But before the proxy request is made, I must check to make sure the file being requested doesn’t already exist on the frontend server. If it does exist on the frontend server I don’t want to make a reverse proxy request. The following is the code used to do that.

# any file that exists just return it
RewriteCond %{REQUEST_FILENAME} -f
RewriteRule ^(.*) $1 [L]

Then after I check to make sure the file doesn’t exist on the frontend server I make the request to the backend using the following rules.

# proxy all connections through to the backend server
RewriteRule ^(/[0-9]{4}/.*) http://backend/index.php$1 [P]
RewriteRule ^(/tags/.*) http://backend/index.php$1 [NC,P]
RewriteRule ^(/topics/.*) http://backend/index.php$1 [NC,P]
RewriteRule ^(/author/.*) http://backend/index.php$1 [NC,P]
RewriteRule ^(/comments/feed/.*) http://backend/index.php$1 [NC,P]
RewriteRule ^(/page/.*) http://backend/index.php$1 [NC,P]
RewriteRule ^(.*) http://backend$1 [P]
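To show how the two rule sets above fit together, here is a rough Python rendering of the same decision. It is a sketch under a few assumptions: rules are tried top to bottom and the first match wins, `[NC]` means a case-insensitive match, and `file_exists` is a hypothetical callback standing in for the `-f` check against the frontend's disk. The function name `rewrite` is also made up for this example.

```python
import re

BACKEND = "http://backend"

# (pattern, case-insensitive?) pairs for the pretty URLs that WordPress
# expects to receive behind index.php, in the same order as the rules above.
PRETTY_RULES = [
    (r"^(/[0-9]{4}/.*)", False),
    (r"^(/tags/.*)", True),
    (r"^(/topics/.*)", True),
    (r"^(/author/.*)", True),
    (r"^(/comments/feed/.*)", True),
    (r"^(/page/.*)", True),
]


def rewrite(path, file_exists):
    """Return the backend URL to proxy to, or None to serve the local file."""
    if file_exists(path):
        # Any file that exists on the frontend is returned as-is.
        return None
    for pattern, ignore_case in PRETTY_RULES:
        match = re.match(pattern, path, re.IGNORECASE if ignore_case else 0)
        if match:
            return BACKEND + "/index.php" + match.group(1)
    # Catch-all: everything else is proxied to the backend unchanged.
    return BACKEND + path
```

For example, a request for `/2008/02/10/sample-post/` that does not exist on disk is proxied to `http://backend/index.php/2008/02/10/sample-post/`, while a static image that does exist on the frontend is never proxied at all.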

Conclusions

To get the exact same setup as I have, you will need the following software, which is all free for download:

As always, if you have any questions about the setup or the performance, please post them below in the comments and I will answer them and/or update the post as needed.

Happy Coding.

Posted in ASP.NET, C#, How To, SEO

January 26th, 2008

Coder Journal’s New Year Make Over

New Theme

My first major change was the development of my own theme. My old theme was clunky, and overall I didn’t like the feel that it gave to my reader base. I became greatly discouraged looking for a new theme, as most are more a testament to art than to readability and functionality. So I decided to create my own with a very simple layout.

Optimization of Load Time

One of the things I hated about my other blog was the fact that I didn’t have control over how the HTML and thus JavaScript was laid out. Especially the JavaScript because I had duplication where I didn’t need it. The script that Technorati gives you is hardly optimized for load time because of duplication of a supporting script file.

<a class="tr-linkcount" href="http://technorati.com/search/{your URL here}">View blog reactions</a>
<script src="http://technorati.com/linkcount" type="text/javascript"></script>

If you notice, the 2nd line in the script above never changes. To optimize this, I only included the 1st line in my post text and put the 2nd line once at the bottom of the page with the rest of my JavaScript.

The next thing I did was optimize my load time using YSlow. See Jeff Atwood’s description.

  1. Added Expires headers to all my static content for 10 years from the day it is downloaded.
  2. Enabled Gzip compression for all my static content.
  3. Put all CSS at the top of the page.
  4. Reduced DNS lookups by downloading images from LinkedIn, Technorati, and others and hosting them locally.
  5. Moved all JavaScript to the bottom of the page.
  6. Removed duplicate Technorati scripts.

I also moved hosts, from GoDaddy’s shared Linux hosting to GoDaddy’s virtual dedicated hosting on Windows. This proved difficult since URL Rewriting isn’t currently built in to IIS 6.0 like it is in Apache. I will talk a little about this setup in a later post.

SEO and SEM

I did a decent amount of SEO and SEM work to get my blog up to snuff. I took the following steps when redesigning the HTML for ease of indexing by search engines and Google’s Media Bot (used for giving relevant results in AdSense):

  1. I downloaded the MySQL file from the database and normalized all the URLs to the one you see above.
  2. Google AdSense only allows you to have 3 AdUnits per page, and the placement of the AdUnits counts. For instance, I had to redo my theme so that the content comes before the sidebar in the HTML; that way the AdUnit in the article is placed first and receives the highest-quality ad.
  3. I reduced my categories to a handful of manageable ones, and migrated the rest to the new Tags feature.
  4. Use H1, H2, and H3 tags sparingly. They should be a way to document the internal structure of your HTML page (i.e. logical sections). My logic is as follows:
    1. H1 is used for the blog title.
    2. H2 is used for the article title.
    3. H3 is used for sections of the article.
  5. I started paying attention to the Post Slug, which is very important and should abide by the following rules:
    1. Between 3 and 7 keywords
    2. No common English words such as (if, about, when, my, etc.)
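The slug rules above are easy to automate. The following Python sketch is one way to do it; the stop-word list is a short made-up sample rather than a complete one, and `post_slug` is a hypothetical helper, not something WordPress provides.

```python
import re

# Short sample of common English words; a real list would be longer.
STOP_WORDS = {
    "a", "an", "and", "about", "for", "from", "i", "if", "in",
    "my", "of", "on", "the", "to", "what", "when",
}
MAX_KEYWORDS = 7


def post_slug(title):
    """Turn a post title into a lowercase, hyphenated keyword slug."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    keywords = [word for word in words if word not in STOP_WORDS]
    # Keep at most MAX_KEYWORDS keywords, joined by hyphens.
    return "-".join(keywords[:MAX_KEYWORDS])
```

Fed the title "What I learned about SEO from Celebrity Jeopardy!", this produces `learned-seo-celebrity-jeopardy`.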

So that was how I spent my holiday creating a new design for my blog. If you have any suggestions, I am all ears about how I can improve my blog for the better.

Posted in How To, Personal, SEO

April 24th, 2007

World Of HTTP/1.1 Status Codes

In a follow up to my previous post on Proper URL Construction, I am going to dive more deeply in to the Status Codes that control the redirects that were talked about in my previous article.

Most developers are familiar with the HTTP/1.0 Status Codes, which have recently been popularized by the SEO guys. We have all heard that you should use 301 Moved Permanently instead of 302 Moved Temporarily. What many of the SEO guys won’t tell you, because they don’t know any better, is that they are using the RFC 1945 HTTP/1.0 Standard that was released in May 1996; that is right, it is about 12 years old. The newest HTTP/1.1 Standard, RFC 2616, was released in June 1999 and made some pretty drastic changes to the 3xx Redirect Status Codes. The goal of this post is to inform and familiarize developers with the HTTP/1.1 Standard, specifically the 3xx Redirect Status Code changes. This can have a drastic effect on how you handle requests on your website and optimize your site for search engines.

History

In the mid-to-late 1990s, 302 Moved Temporarily was the most popular redirect code, but also an example of industrial practice contradicting the standard. The HTTP/1.0 specification (RFC 1945) required the client to perform a temporary redirect (the original describing phrase was “Moved Temporarily”), but popular browsers implemented it as though it were a 303 See Other.

Note from 302 Found: RFC 1945 and RFC 2068 specify that the client is not allowed to change the method on the redirected request. However, most existing user agent implementations treat 302 as if it were a 303 response, performing a GET on the Location field-value regardless of the original request method. The status codes 303 and 307 have been added for servers that wish to make unambiguously clear which kind of reaction is expected of the client.

Therefore, HTTP/1.1 added status codes 303 and 307 to disambiguate between the two behaviors. However, the majority of Web applications and frameworks still use the 302 status code as if it were the 303.
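The difference between the three codes is easiest to see as a tiny function. This Python sketch only restates the behavior described above; the name `method_after_redirect` and the `strict` switch are made up for the illustration, and it deliberately covers just 302, 303, and 307.

```python
def method_after_redirect(status, method, strict=True):
    """Return the HTTP method a client uses when following a redirect.

    strict=True follows the letter of the specification; strict=False
    models what most browsers historically did with a 302.
    """
    if status == 303:
        # See Other: the follow-up request is a GET.
        return "GET"
    if status == 302 and not strict:
        # Common browser behavior: treat a 302 like a 303.
        return "GET"
    if status in (302, 307):
        # The method is not allowed to change.
        return method
    raise ValueError("this sketch only covers 302, 303, and 307")
```

So a POST answered with a 302 stays a POST on paper but becomes a GET in most browsers, which is exactly the ambiguity that 303 and 307 were added to remove.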

Proper Use of HTTP Redirects

The next part will be a guide of the conditions that should be met in order to use the specific redirect.

301 Moved Permanently

  • The URL (or page) is going to permanently reside in a different location.
  • The domain should always be displayed a certain way (e.g. this domain is always displayed as coderjournal.com, so any traffic to www.coderjournal.com gets a 301 redirect to coderjournal.com).
  • This should be used for most static redirects that are not generated programmatically.
  • *NEW* This status was mostly designed to be used with GET and HEAD requests.

303 See Other

  • This is going to be the most common type of redirect that you want to use when you are programmatically changing where the user is located in your site during a POST back.
  • Any time you want to redirect a user to another URL after a POST from a form has occurred (i.e. The visitor to your site registers with your site and after they are done registering you want to direct them back to the home page, this is when you would use a 303 redirect).
  • *NEW* This status was designed to be used with POST requests specifically, so it should not be used for GET or HEAD requests.

307 Temporary Redirect

  • Anytime that you want to put up a temporary page (i.e. your site is under construction and you want all traffic temporarily redirected to a static HTML page).
  • *NEW* This should be used when you want to redirect a GET request to different location each time the URL is requested.
  • *NEW* This should not be used with POST requests, because of this statement in the specification:

    the user agent MUST NOT automatically redirect the request unless it can be confirmed by the user, since this might change the conditions under which the request was issued.

302 Found

  • Use this for any condition not met above.
  • This should be used sparingly because there is a search engine penalty if it is used too much, because some spammers used an exploit called Page Hijacking.
  • This is sort of the antithesis to 404 Not Found and should be used in a similar way. So if you have a page that is referenced but no longer exists, but you do not want to return a 404 and just redirect the user to a random (not static as defined in a 301) site you would use a 302 redirect. (note this argument is very weak and there is very little reason in a HTTP/1.1 world to use a 302 redirect)
  • *NEW* This status should be used during GET requests for any semi-static URLs that may change in the future, but don’t change with each and every request. A good example of this on Coder Journal is my Essential Software Every Developer Needs, which I publish annually and which is located at http://www.coderjournal.com/essential-software/. It changes, but only once a year, so it is semi-static in terms of the internet.
  • *NEW* The 302 Found falls right between 301 Moved Permanently and 307 Temporary Redirect in terms of how permanent the URL is for GET requests.
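The guidelines above can be boiled down to a small lookup. The Python sketch below encodes my rules of thumb from this post, not the text of RFC 2616, and the function name `choose_redirect` and the three permanence labels are made up for the example.

```python
# How long the target of a GET/HEAD redirect is expected to stay put.
GET_REDIRECTS = {
    "permanent": 301,    # the page has moved for good
    "semi-static": 302,  # changes occasionally, e.g. once a year
    "temporary": 307,    # may differ on every request
}


def choose_redirect(method, permanence="permanent"):
    """Pick a 3xx status code following the guidelines in this post."""
    method = method.upper()
    if method == "POST":
        # After a form POST, send the visitor on with a See Other.
        return 303
    if method in ("GET", "HEAD"):
        if permanence not in GET_REDIRECTS:
            raise ValueError("unknown permanence: " + permanence)
        return GET_REDIRECTS[permanence]
    raise ValueError("no guideline for method: " + method)
```

For instance, redirecting a visitor to the home page after a registration form POST gives `choose_redirect("POST")`, which is 303.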

An example of an HTTP Redirect Response looks something like the following, which was taken from my own site when somebody queries www.coderjournal.com:

HTTP/1.1 301 Moved Permanently
Date: Tue, 24 Apr 2007 18:12:55 GMT
Server: Apache
Location: http://www.coderjournal.com/
Keep-Alive: timeout=15, max=99
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/html; charset=iso-8859-1

If you would like to learn more about how to perform these redirects, that I have talked about above, in your favorite language please read this article from Steven Hargrove.

Update (2008-5-20): I have updated my understanding of the different types of redirects that developers may want to use. See above for my new understandings.

Posted in How To, Programming, SEO