Archive for the ‘SEO’ Category

June 7th, 2009

Managed Fusion URL Rewriter & Reverse Proxy - Release 3.0

I am happy to announce the 3.0 release of the Managed Fusion URL Rewriter & Reverse Proxy. Since my previous release in February I have been working hard on a significant rewrite of the core, which to be honest really needed refactoring if I hoped to extend the rewriter in some interesting ways in the future.

Download: Binary Release
View: Source Code
Discuss: Forum
Issues: Report

Release Notes

If you would like to find out more about the past releases please visit us at http://www.managedfusion.com/products/url-rewriter/release-notes.aspx

Version 3.0

  • Breaking Change: Configuration in the web.config has been reorganized.
  • Major rewrite to the URL Rewriter to provide better performance and more reliable logging.
  • Major update to the proxy handler. It is much faster and provides an exact duplication of headers from the proxied server.
  • Fixed many issues with the chunked encoding, so you are now able to proxy web based services, such as SVN.
  • Full rewrite of the rule, condition, and flag handling system to provide better performance and more flexibility for developers.
  • More extensibility points have been created for developers interested in creating their own handlers for rules, conditions, and flags.
  • More extensive testing of internal mechanics of the rewriter.
  • Added thread safety to the Apache rule set refresh.
  • Added initial support for the Microsoft UrlRewriter IIS 7 module. This will provide a starting point for extension of the Microsoft configuration to support proxying and other more advanced Apache features.

Featured at PDC 2008

Posted in News, SEO

February 1st, 2009

Managed Fusion URL Rewriter & Reverse Proxy Release 2.5

Download: Binary Release
Download: Source Code

Release Notes

If you would like to find out more about the past releases please visit us at http://www.managedfusion.com/products/url-rewriter/release-notes.aspx

Version 2.5

    • Major update to the proxy handler. It is now much faster and provides an exact duplication of headers from the proxied server.
    • Added full support for $N and %N in conditions and rules.
    • Added contexts for condition, rule, and ruleset to make transfer of common data easier for implementations of the API.
    • Added split between async and sync proxy handler. This can now be controlled through the web.config using useAsyncProxy.
    • Fixed issue with transfer-encoding: chunked

      Posted in News, SEO

      December 14th, 2008

      Introducing the ASP.NET MVC (Part 1) - The Model-View-Controller Pattern

      About a month and a half ago I announced that I am writing a book, and I was really overwhelmed by the amount of support that I received from this announcement.  Both Al and I are really looking forward to the day when this book ships, and we start receiving real feedback on all our hard work.  However, both of us would like to start receiving feedback as soon as possible, so…

      In an effort to write the book and keep blogging, I decided to write/blog the last chapter, Chapter 2.  I am doing this so I can receive feedback on this chapter as early as possible, because this chapter, in my opinion, is probably the most critical of the book: it defines the context around ASP.NET MVC and how it differs from ASP.NET Web Forms, as well as giving a historical perspective of the MVC pattern.

      In the next several posts we will cover the following parts of Chapter 2 from the book:

      by Nick Berardi


      The Model-View-Controller Pattern

      The Model-View-Controller architectural pattern has been around since 1978 and was first described by Trygve Reenskaug while working on a programming language called Smalltalk at Xerox PARC.  The pattern was first described in his now famous note on the subject, titled Models-Views-Controllers, dated December 1979, and has been popping its head up in many different ways and forms since the original note was written.  Reenskaug maintains a page that explains MVC in his own words (http://heim.ifi.uio.no/~trygver/themes/mvc/mvc-index.html), and contains his publications on the subject; it is well worth the read and is only two pages long.

      The MVC pattern has been implemented in almost every programming language that is in use today, including ColdFusion, Java, JavaScript, Perl, PHP, Python, Ruby, Smalltalk, XML, and of course .NET.  In fact, in November 2002 the W3C, the main international standards body for the World Wide Web, voted to make the MVC pattern part of their XForms specification, which will be integrated directly into the XHTML 2.0 standard.

      Reenskaug explains on his site that “The essential purpose of MVC is to bridge the gap between the human user’s mental model and the digital model that exists in the computer,” as illustrated in Figure 2-1.

      Figure 2-1


      He goes on to explain that “The ideal MVC solution supports the user illusion of seeing and manipulating the domain information directly.  The structure is useful if the user needs to see the same model element simultaneously in different contexts and/or from different viewpoints.”  This is important because it puts the emphasis not on the application, but on how the user perceives the data.  In other words, the controller and view are only a means to the end of allowing the user to visualize the model.

      Reenskaug defines the Model-View-Controller in the following way.

      • Model: Represents knowledge.  A model can be in the simplest case a single object in your application, or in a complex case a combination of objects.  It should represent the world as seen by the developer for the application that is being developed, in other words your database or domain.
      • View: Visual representation of the Model.  It should highlight certain aspects of the model while minimizing the others where possible.  According to Reenskaug it should act as a presentation filter.  What he describes as a presentation filter is the notion of a contract created between the Model and the View that will provide the parts of the model requested for the presentation by the View.
      • Controller: A controller provides a link between the user and the system.  It provides the user with actions that can be taken against the Model, which in other words creates a set of inputs that can be acted upon and represented to the user in one or more ways through a View.

      Bringing MVC Down To Earth

      The concepts and ideas behind MVC were honestly a little abstract for me when I was first getting started.  It took me a while to understand how the Model, View, and Controller were supposed to work together to create an application.  Unfortunately at the time I didn’t have a great example that clearly defined the lines between the different parts of the Model, View, and Controller, so I had to learn the hard way.  Lucky for us, Jeff Atwood, of codinghorror.com fame, provided an example that really struck a chord with me.  Figure 2-2 is a visual representation of his example.

      Figure 2-2


      This example almost perfectly represents MVC in a way that any web developer with only basic knowledge of HTML and CSS can understand.

      • Model: The HTML is the “skeleton” or definition of the data to be displayed to the user.
      • View: The CSS is the “skin” that gives the HTML a visual presentation.  The CSS can be swapped out to view the original content in a different manner, without altering the underlying Model.  They are relatively, but not completely, independent of each other.
      • Controller: The browser is responsible for combining the CSS and HTML, into a final representation that is rendered out to the screen in the form of pixels.  It gathers input from users, but it is restricted to the input defined by the HTML in the form of input, select, textarea, and button DOM objects.

      I find this to be an awesome acknowledgement to the success of the Model-View-Controller, because the browser is a natural interface for a computer user that wants to visualize the World Wide Web.  It successfully maps the Mental Model, from Figure 2-1, that a designer envisioned as an interface for the user to the Computer Model, which a developer coded for use on the World Wide Web.  So I hope this helped you visualize MVC in a way that helps you break out and understand the concepts behind the Model, View, and Controller.  If you would like to read Jeff’s full article it is available at http://www.codinghorror.com/blog/archives/001112.html.

      For the purpose of this book we are going to define MVC as the following:

      • Model: The classes which are used to store and manipulate the state of the database, through our domain objects, combined with some business logic.
      • View: The user interface parts, coded in HTML, necessary for rendering the Model to the user.  It may also render the Model as XML or JSON if needed programmatically by JavaScript.
      • Controller: The application layer that will accept the input and save that information to the database through our Model.  It will also contain a small amount of business logic necessary for controlling and validating the inputs.  The controller will also decide which view to render, the HTML, XML, or JSON depending on the form that was requested by the browser.

      The above definition of MVC for our application, The Beer House, is an almost exact representation of MVC as defined by the ASP.NET MVC team.
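To make that definition a little more concrete, here is a minimal sketch of the three roles. It is written in Python rather than ASP.NET MVC purely to keep it short and self-contained, and every name in it (Article, html_view, json_view, ArticleController) is hypothetical rather than taken from The Beer House. A plain dictionary stands in for the database.

```python
import json
from dataclasses import dataclass, asdict
from html import escape


@dataclass
class Article:
    """Model: holds state plus a small amount of business logic."""
    title: str
    body: str

    def is_valid(self):
        return bool(self.title.strip())


def html_view(article):
    """View: renders the model as HTML for a browser."""
    return "<h1>%s</h1><p>%s</p>" % (escape(article.title), escape(article.body))


def json_view(article):
    """View: renders the same model as JSON for JavaScript callers."""
    return json.dumps(asdict(article), sort_keys=True)


class ArticleController:
    """Controller: accepts input, saves it through the model, picks a view."""

    def __init__(self):
        self.store = {}  # stands in for the database

    def save(self, article_id, title, body):
        article = Article(title, body)
        if not article.is_valid():
            raise ValueError("title is required")
        self.store[article_id] = article

    def show(self, article_id, requested_format="html"):
        # The controller decides which view to render for the request.
        view = json_view if requested_format == "json" else html_view
        return view(self.store[article_id])
```

Notice that the views never touch the store and the model never produces markup, which is the separation the definition above is after.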

      This post is licensed under a different license than the rest of my site. Copyright © Wiley Publishing Inc 2009

      Posted in ASP.NET, C#, News, Personal, Review, SEO

      December 9th, 2008

      Creating an extension module for .NET URL Rewriter and Reverse Proxy

      Wow that is a long title. Recently I have been looking for quick posts that I can put out each day to keep my blog relevant and also so I don’t feel like I am slacking off too much. Today I want to post about a little known feature in my .NET URL Rewriter and Reverse Proxy (aka. Managed Fusion URL Rewriter) that I have developed in my spare time, mostly out of necessity for this blog and other projects I have worked on.  Here is a quick run through of what it does.

      Managed Fusion URL Rewriter is a powerful URL manipulation engine based on the Apache mod_rewrite extension. It is designed from the ground up to bring all the features of Apache mod_rewrite to IIS 6.0 and IIS 7.0. Managed Fusion Url Rewriter works with ASP.NET on Microsoft’s Internet Information Server (IIS) 6.0 and Mono XSP Server and is fully supported, for all languages, in IIS 7.0, including ASP.NET and PHP. Managed Fusion Url Rewriter gives you the freedom to go beyond the standard URL schemes and develop your own scheme.

      But one feature that I added that is not part of the official Apache mod_rewrite documentation is the ability to add custom modules to extend the use of the URL rewriter in non-traditional ways.  One great example of this was born out of wanting to clean up the SEO mess I created in the early days of this blog.  I had to support the following different types of URL patterns:

      1. http://www.coderjournal.com/?p=23
      2. http://www.coderjournal.com/2008/03/14/some-post.html
      3. http://www.coderjournal.com/2008/03/14/some-post

      to transform them into the URL pattern that I finally settled on today:

      • http://www.coderjournal.com/2008/03/some-post

      In the above list #2 and #3 were pretty easy to transform using the following rules:

      RewriteRule ^(/[0-9]{4}/.*)\.html$    $1/ [NC,R=301]
      RewriteRule ^(/[0-9]{4}/[0-9]{1,2}/)[0-9]{1,2}/(.*)$    $1$2 [R=301]

      These were easy because they contained all of the elements that make up my current URL.  As you can imagine, problems arose when I had to support links that used #1’s syntax, which contains zero elements that I can use to create my current URL.  I am a programmer who believes that each part of a system should gracefully handle the domain it was designed to support, and in this case a URL rewriter should be able to handle any scenario that has to do with URL rewriting.  So I added support that allows developers to naturally extend the URL rewriter to accomplish any type of URL rewriting task they can think of.
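If you want to see what the two rules for #2 and #3 do without setting up a server, here is a rough sketch that replays them with Python's re module. This is not how the rewriter itself is implemented. It assumes the dot before html is escaped and that each [R=301] redirect simply comes back in as a new request. The function names are made up for the example.

```python
import re

# Each entry mirrors one RewriteRule: (pattern, substitution, case-insensitive?).
# Python writes backreferences as \1 where mod_rewrite writes $1.
RULES = [
    (r"^(/[0-9]{4}/.*)\.html$", r"\1/", True),                      # [NC,R=301]
    (r"^(/[0-9]{4}/[0-9]{1,2}/)[0-9]{1,2}/(.*)$", r"\1\2", False),  # [R=301]
]


def next_redirect(path):
    """Return the redirect target of the first matching rule, or None."""
    for pattern, substitution, ignore_case in RULES:
        flags = re.IGNORECASE if ignore_case else 0
        match = re.match(pattern, path, flags)
        if match:
            return match.expand(substitution)
    return None


def follow_redirects(path, limit=10):
    """Follow redirects the way a browser would until no rule matches."""
    for _ in range(limit):
        target = next_redirect(path)
        if target is None:
            return path
        path = target
    return path


print(follow_redirects("/2008/03/14/some-post.html"))
print(follow_redirects("/2008/03/14/some-post"))
```

The first rule only strips the extension, so a .html URL takes two redirects to reach its final form, while a URL in the #1 style matches neither rule and falls through untouched.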

      Setting Up the URL Rewriter Rules

      In my case I needed to run the following SQL query every time I saw a URL that matched #1.

      select concat('http://www.coderjournal.com/',year(post_date),'/',month(post_date),'/',post_name,'/') from wp_posts where ID = $1;

      This query looks up a post by its ID in the WordPress database table that contains all the posts and returns the absolute URL of the post, which is the URL that should be displayed.  To do this I created a new directive for the mod_rewrite syntax called RewriteModule.  I also had to extend the RewriteRule and RewriteCond directives to support these new module extensions.  The RewriteModule, RewriteRule, and RewriteCond are defined by the following syntax:

      RewriteModule <Reference Name> <Namespace>,<Assembly>
      RewriteRule[([<Left Module>],[<Right Module>])] <Pattern> <Substitution>
      RewriteCond[([<Left Module>],[<Right Module>])] <Test String> <Condition Pattern>

      The parts in square brackets above are optional when creating the rule.  In my case for this blog the rewriter directives looked like the following:

      RewriteModule PostQueryString CoderJournal.Rewriter.Rules.PostQueryStringRuleAction, CoderJournal.Rewriter.Rules
      RewriteRule(,PostQueryString)   ^/\?p=([0-9]+)$    "select guid from wp_posts where ID = $1;" [R=301]

      The important part of the syntax is the (,PostQueryString) suffix on the RewriteRule directive.  It indicates the custom module processor that should be used for the rule, and it refers back to the class defined in the RewriteModule directive.

      Creating the Module

      I have to warn you that I am not going to demonstrate and show all the properties and methods on the interface that are important for creating a custom module, but I am going to show you the actual meat of the module that is involved in the lookup of the URL from the database.

      public Uri Execute(int logLevel, string logCategory, HttpContext context,
                         Pattern pattern, Uri url, string[] conditionValues,
                         IDictionary<string, string> flags)
      {
      	string inputUrl = url.GetComponents(UriComponents.PathAndQuery, UriFormat.UriEscaped);
      	string sqlCommand = pattern.Replace(inputUrl, Text, conditionValues);
      	string substituedUrl = String.Empty;
      
      	using (MySqlConnection connection = new MySqlConnection(Properties.Settings.Default.DatabaseConnection)) {
      		using (MySqlCommand command = connection.CreateCommand()) {
      			command.CommandText = sqlCommand;
      			command.CommandType = CommandType.Text;
      
      			try {
      				connection.Open();
      				substituedUrl = command.ExecuteScalar() as string;
      			} finally {
      				connection.Close();
      			}
      		}
      	}
      
      	return new Uri(url, substituedUrl);
      }

      It may not be clear right away what is going on, but on line 6 I am substituting the value captured by the regular expression (^/\?p=([0-9]+)$) into the SQL query (from above) to produce a query that will be run against the database. So if the following URL came in to my server:

      http://www.coderjournal.com/?p=372

      It would produce a SQL query that looked like this:

      select concat('http://www.coderjournal.com/',year(post_date),'/',month(post_date),'/',post_name,'/') from wp_posts where ID = 372;

      Notice that the ID, 372, shows up in both the URL and the query. It is the only part of the URL that I need in order to query the database and find the actual path of the post.

      Now that we have the query we can execute it on the database, using lines 9 through 21, and create the resulting URL on line 23. The resulting URL is then passed back through the URL rewriter, and processed using the flags defined. In my case [R=301] indicates that I want to do a 301 Permanent Redirect on the URL, which tells the browser and search engines alike that they need to update their URL for this page.
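For readers who want to try the lookup idea outside of .NET, here is a small sketch in Python. It is a stand-in under stated assumptions, not the module's code. It uses an in-memory SQLite table with only the three wp_posts columns the query needs, it builds the URL in Python instead of with concat, and the function name lookup_permalink is hypothetical. It also binds the captured ID as a query parameter rather than substituting it into the SQL text. The regular expression already limits the ID to digits, but binding is the safer habit.

```python
import re
import sqlite3

POST_ID_PATTERN = re.compile(r"^/\?p=([0-9]+)$")


def lookup_permalink(connection, path_and_query):
    """Return the permalink for a legacy /?p=ID URL, or None if there is none."""
    match = POST_ID_PATTERN.match(path_and_query)
    if match is None:
        return None
    # Bind the captured ID as a parameter instead of pasting it into the SQL.
    row = connection.execute(
        "SELECT post_date, post_name FROM wp_posts WHERE ID = ?",
        (int(match.group(1)),),
    ).fetchone()
    if row is None:
        return None
    post_date, post_name = row
    year, month, _rest = post_date.split("-", 2)
    return "http://www.coderjournal.com/%s/%d/%s/" % (year, int(month), post_name)


# Hypothetical stand-in for the WordPress table, holding a single post.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE wp_posts (ID INTEGER PRIMARY KEY, post_date TEXT, post_name TEXT)"
)
connection.execute(
    "INSERT INTO wp_posts VALUES (372, '2008-12-09', "
    "'creating-extension-module-net-url-rewriter-reverse-proxy')"
)
print(lookup_permalink(connection, "/?p=372"))
```

A caller would answer the request with a 301 redirect to the returned URL, and fall through to the normal handling when the function returns None.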

      You can test out the above conditions by using the following URL’s that all redirect back to this page:

      1. http://www.coderjournal.com/?p=372
      2. http://www.coderjournal.com/2008/12/9/creating-extension-module-net-url-rewriter-reverse-proxy.html
      3. http://www.coderjournal.com/2008/12/9/creating-extension-module-net-url-rewriter-reverse-proxy/

      The code as always is available on my SVN server at Google Code.

      I hope this comes in handy to some of you developers that have to support legacy URL’s in your own product or a project that you are working on. As always if you have any questions or need anything clarified please feel free to contact me or leave a comment below.

      Posted in C#, How To, SEO, SQL

      November 9th, 2008

      Managed Fusion URL Rewriter & Reverse Proxy Release 2.2

      Download: Binary Release
      Download: Source Code

      Release Notes

      If you would like to find out more about the past releases please visit us at http://www.managedfusion.com/products/url-rewriter/release-notes.aspx

      Version 2.2

      • Added support for RewriteCond backreferences: These are backreferences of the form %N (1 <= N <= 9), which provide access to the grouped parts (again, in parentheses) of the pattern, from the last matched RewriteCond in the current set of conditions.
      • Updated the logging output to be more readable.
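To illustrate what %N and $N backreferences mean, here is a rough sketch in Python of how a substitution string could be expanded. It is only an illustration of the idea and not the rewriter's actual implementation. The function name, the host pattern, and the /sites/ path are all invented for the example.

```python
import re


def expand_backreferences(substitution, rule_match, cond_match):
    """Expand $N from the rule match and %N from the condition match."""
    def replace(token):
        source = rule_match if token.group(1) == "$" else cond_match
        index = int(token.group(2))
        # Groups that do not exist or did not take part expand to nothing.
        value = source.group(index) if index <= source.re.groups else None
        return value or ""
    return re.sub(r"([$%])([0-9])", replace, substitution)


# Roughly: RewriteCond %{HTTP_HOST} ^(www\.)?([a-z]+)\.example\.com$
#          RewriteRule ^/(.*)$ /sites/%2/$1
cond_match = re.match(r"^(www\.)?([a-z]+)\.example\.com$", "blog.example.com")
rule_match = re.match(r"^/(.*)$", "/about")
print(expand_backreferences("/sites/%2/$1", rule_match, cond_match))
```

Here %2 comes from the second group of the last matched condition and $1 from the first group of the rule pattern.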

      Posted in News, SEO

      June 12th, 2008

      Turn Google App Engine into your own Personal Content Delivery Network (CDN)

      As anybody who has run a growing website or blog knows, response time is going to get worse the more users you have visiting your site. The users come from all angles: RSS feeds, homepage visits, search engine visits, people stealing the static files that you host, and pretty much anything else that can be served over HTTP. The solution to this problem is to offload your static content onto a Content Delivery Network or CDN. CDN providers cost a lot of money though, so it is not something that we mere mortals with one server can afford.

      But thanks to Google anyone can now run their own CDN for free on Google’s servers. Lucky for you and me, Google has made the process really painless and you can even have the CDN under your own domain name. In my case static.coderjournal.com.

      What Is A Content Delivery Network?

      According to Wikipedia:

      A content delivery network or content distribution network (CDN) is a system of computers networked together across the Internet that cooperate transparently to deliver content most often for the purpose of improving performance, scalability, and cost efficiency, to end users. The first web content based CDNs were Speedera, Sandpiper, Mirror Image and Skycache, followed by Akamai and Digital Island.

      Basically it is a network of computers around the world that serves your content to the end user closest to one of those many servers around the world. This method of delivery cuts down on server overload, DNS hops, and delivery time.

      When sites like Microsoft, Yahoo, Google, or Amazon deliver content they use Content Delivery Networks (CDNs) to host most of their content, especially static files such as images, stylesheets, downloads and anything else you can think of. The reason they do this is to reduce load on their application servers, which serve dynamic content such as PHP or ASP.NET pages.

      What Is Google App Engine?

      So you may ask what is Google App Engine:

      Google App Engine lets you run your web applications on Google’s infrastructure. App Engine applications are easy to build, easy to maintain, and easy to scale as your traffic and data storage needs grow. With App Engine, there are no servers to maintain: You just upload your application, and it’s ready to serve your users.

      You can serve your app using a free domain name on the appspot.com domain, or use Google Apps to serve it from your own domain. You can share your application with the world, or limit access to members of your organization.

      App Engine costs nothing to get started. Sign up for a free account, and you can develop and publish your application for the world to see, at no charge and with no obligation. A free account can use up to 500MB of persistent storage and enough CPU and bandwidth for about 5 million page views a month.

      Google has also announced a very very affordable price plan that any mere mortal can afford. They are not ready to start charging people yet, but here are the details:

      • $0.10 - $0.12 per CPU core-hour
      • $0.15 - $0.18 per GB-month of storage
      • $0.11 - $0.13 per GB outgoing bandwidth
      • $0.09 - $0.11 per GB incoming bandwidth

      How do I set up my own CDN using Google App Engine?

      To use Google App Engine you need to do a couple of things that ready your computer to publish your static content to Google. Please take note that my setup is for Windows, but you can easily modify the process for any other OS.

      Setup

      1. You need to download and install Python on your computer. You may already have it if you are using a Unix environment (i.e. Linux or Mac OS X). If you need to download it or would just like to check to see if it is up to date, please visit http://www.python.org/download/ and download the correct version for your operating system.
      2. Install Python to c:\Program Files\ (all my scripts that I have designed to make the publishing to Google are going to be using this path).
      3. You will also need the Google App Engine SDK, which is available at http://code.google.com/appengine/downloads.html. Download the version that is for your OS. Note that the SDK will check for the Python install, so make sure you install Python before the SDK.
      4. Sign up for Google App Engine at http://appengine.google.com/, you will need a valid Google account. I suggest you sign up for a Google Apps account and use that as your Google account. Why I suggest this will become apparent later on.
      5. Once you are done with the setup process you need to create an application. Click “Create an Application” and give your application a name (called the “application identifier”). This name must be unique across all Google App Engine applications. For example I set my application identifier to “coderjournal”. Click through to the next part of the application. If this is your first time registering an application you need to specify your cell phone number and confirm your account with an SMS code that Google sends you.

      Publish To Your CDN

      1. Download my publishing files, hosted on my CDN, at http://static.coderjournal.com/downloads/coderjournal-cdn.zip
      2. Create a directory on your computer specifically for your CDN files. My directory is c:\websites\static.coderjournal.com. Fill this directory with all the static files you want hosted on your CDN: your css, downloads, flash, images, javascripts, videos, and anything else you want hosted.
      3. Unzip the files I provided to you in step 1 into the directory you created in step 2.
      4. Next we need to edit the YAML configuration file. Open the app.yaml file in your favorite text editor and change application: coderjournal to application: {your application identifier}.
      5. Next go down and edit your static directories, in mine I have css, downloads, flash, images, and js. You can create your own by just modifying the ones I put in the file.
      6. If you installed the Google App Engine SDK in the default directory and Python in c:\Program Files\ then skip to step 7. Otherwise, change the paths in publish-cdn-coderjournal.bat to your actual paths. This is also required if you are using the x64 version of Windows, because the Google App Engine SDK installs in c:\Program Files (x86)\.
      7. Now double click on publish-cdn-coderjournal.bat and a command window will display. Fill in the Google account and password that you used to sign up for the Google App Engine account, and your content will start to publish.
      8. You now have your own private CDN that can be accessed at http://application-identifier.appspot.com.

      Using Your Own Domain (Optional)

      1. If you created your own Google Apps account as suggested in Setup step 4, you can create your own custom domain for your CDN. If you didn’t, don’t worry, just create one and follow the steps below.
      2. Go to the dashboard of your Google Apps and click “Add more services”.
      3. Under other services you will see Google App Engine and a place to enter your application identifier. Enter your application identifier and click “Add It Now”.
      4. It will take you to the next page where you enter in the domain you want for your CDN, I suggest something simple like static.yoursite.com.
      5. Then you just need to follow the steps for adding a CNAME to your DNS and you are ready to go with your custom domain.

      IdeaPipe Logo

      How do I use my own CDN?

      Well this is the cool part! You just use the absolute path to your files. For example if you wanted to host the image to your right you would just use the following in your HTML:

      <img src="http://static.coderjournal.com/images/ideapipe-logo.png" />

      Potential Gotcha: The files hosted statically are currently case-sensitive. I have reported this issue to Google, and hopefully they will correct it soon. http://code.google.com/p/googleappengine/issues/detail?id=466

      It is really that simple. Now comes the cool part that I need your help with, and proof that this is really a true CDN. I would like to see how many different IP Addresses my CDN points to. So far I was able to find the following IP addresses:

      1. 72.14.207.121
      2. 64.233.179.121
      3. 66.249.91.121

      That point to:

      static.coderjournal.com

      To see what IP Address you get on your local machine just pull up the command prompt and type:

      ping static.coderjournal.com

      Please report your findings in the comments below. I am sure everybody would love to see how big Google’s CDN really is.
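If you would rather script the check than read ping output, here is a small sketch in Python that lists the IPv4 addresses a hostname resolves to. The function name is my own invention, and the resolver parameter exists only so the function can be tried with a stand-in lookup when you are offline. By default it uses the standard socket.getaddrinfo call.

```python
import socket


def resolve_ipv4_addresses(hostname, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IPv4 addresses a hostname resolves to."""
    results = resolver(hostname, 80, socket.AF_INET, socket.SOCK_STREAM)
    # Each result is (family, type, proto, canonname, (address, port)).
    return sorted({sockaddr[0] for _family, _type, _proto, _name, sockaddr in results})
```

Calling resolve_ipv4_addresses("static.coderjournal.com") on a machine with network access returns whatever addresses your local DNS hands back, which may differ from the ones listed above depending on where you are.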

      Posted in ASP.NET, How To, JavaScript, Programming, SEO

      May 15th, 2008

      What I learned about SEO from Celebrity Jeopardy!

      I was having a conversation with my best friend a few days ago and we got on the subject of our preferences for how URL’s are rendered for blogs.

      I fall on the side of lowercase letters and hyphens splitting the words:

      http://www.somesite.com/2008/05/my-url-preference-is-like-this/

      He falls on the side of title case lettering and no hyphens splitting the words:

      http://www.somesite.com/2008/05/HisURLPreferenceIsLikeThis/

      He has his reasons and I have mine; I just think mine are more valid. Sorry Al, that is my opinion. I am going to lay out why I think mine are more valid, with an example from Celebrity Jeopardy!. For those of you who aren’t familiar with this famous skit:

      Celebrity Jeopardy! was a recurring sketch on Saturday Night Live. It parodies the Celebrity Jeopardy! edition of the television game show Jeopardy! where celebrities compete and the game’s level of difficulty is significantly reduced. Thirteen sketches have been aired to date, two per season from 1996 to 2002, and one in 2005.

      Before I get to my commentary let’s first watch this excerpt from Saturday Night Live’s Celebrity Jeopardy!:

      I really tried to find a good example of one of the famous Sean Connery mess ups on a legal video sharing site, but none of them had anything usable. The skit I was really looking for was the famous “An Album Cover” where Sean Connery pronounces it as “Anal Bum Cover”.

      As you can see, Norm MacDonald, playing the character of Burt Reynolds, transforms the category in Celebrity Jeopardy! on purpose for comedy reasons. In my analogy Google is going to be the Burt Reynolds of your search. However, instead of finding the wrong words on purpose, it is going to do it because it is a dumb machine that does what it is asked even if the results are not contextually accurate.

      An Album Cover Google Example

      Notice in the image above, in the highlighted words, Google finds both “An Album Cover” and “Anal Bum Cover”. This is because Google understands that the words you may be looking for don’t always fall in the same order and spacing as the exact phrase you are looking for. This is something that SEO experts have known for a long time and try to control so that their content shows in the top spot for the keywords they designed into the page.

      If you don’t control your URL, which is one of the highest ranking keywords on your site, you could end up decreasing the effectiveness of your keywords, as an almost duplicate keyword penalty. Granted I don’t know if something like this exists as a penalty, but when you are dealing with SEO it never hurts to be as careful and precise as possible.

      So again I ask, now knowing how a URL can be misconstrued, which URL would you rather have?

      http://www.somesite.com/2008/05/an-album-cover/
      http://www.somesite.com/2008/05/AnAlbumCover/

      So this is my way of saying be careful what your URL spells out. You may get unintended search rankings that you do not want, or you may offend a person who reads the URL wrong. Either way it is always good to control your environment within reasonable means to make sure the message is received as you intended it.
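If you generate your URLs from post titles, the lowercase and hyphen style I prefer takes only a few lines to produce. Here is a minimal sketch in Python. The function name slugify is my own choice, and the rule it applies (keep runs of ASCII letters and digits, drop everything else) is an assumption that will not suit titles in every language.

```python
import re


def slugify(title):
    """Lowercase the title and join its words with hyphens."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)


print(slugify("An Album Cover"))
print(slugify("My URL Preference Is Like This!"))
```

Because every word boundary becomes a hyphen, the word breaks are spelled out in the URL instead of being left for a search engine or a reader to guess.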

      Note: There are other factors in play that yielded the search results above.  However, one thing that you will notice is that none of the URLs were falsely highlighted. That is because they used a non-whitespace character to break up the words.

      Posted in SEO

      February 10th, 2008

      How to use the .NET URL Rewriter and Reverse Proxy to run WordPress on IIS

      First off I would like to say that many of my readers are very intelligent. They picked up on a single sentence in my last post, about my new design, that mentioned Coder Journal switching from Linux to Windows.

      I also moved hosts from GoDaddy’s shared Linux hosting to GoDaddy’s virtual dedicated hosting on Windows. This proved difficult since URL Rewriting isn’t currently built in to IIS 6.0 like it is in Apache. I will talk a little about this setup in a later post.

      Switching from Linux to Windows wasn’t the part that really intrigued many of them. It happens every day, so why would it? It was the fact that I was able to get the same level of URL Rewriting out of IIS 6.0 as I was out of Apache’s mod_rewrite and still make WordPress look and function like it was running on Apache.

      So to get started I just want to say that I know there are other solutions out there to get WordPress hosted on IIS with the exact same outcome as what I am going to present below. I did it this way for the following reasons:

      1. I am a .NET guy and I love developing software that is popular on other platforms on .NET just to see if it can be done.
      2. I also believe in Eating One’s Own Dog Food, and the URL Rewriter and Reverse Proxy that I am presenting below, and that is used in Coder Journal, is my own creation.

      What This Post Covers

      This post is meant to provide insight into a technology, Reverse Proxy, that many developers are unaware of. It is demonstrated through my blog and how it works with WordPress on IIS 6.0. Some of the basics will be covered, such as the workings of a URL Rewriter and Reverse Proxy. This post will not cover how to code a URL Rewriter or Reverse Proxy in C#. The reader should also have a basic understanding of how RegEx, HTTP, and URL Rewriters work.

      The Problem

      IIS 6.0 and previous versions lack any standardized, built-in URL Rewriting process, so developers have to take nice visitor and SEO friendly URLs like this:

      http://www.coderjournal.com/2008/02/10/sample-post/

      And turn them into ugly IIS 6.0 compatible URLs, which may or may not be SEO friendly. Neither of the URLs below is as visitor friendly as the one above.

      http://www.coderjournal.com/?p=123
      http://www.coderjournal.com/index.php/2008/02/10/sample-post/

      My Solution Used On Coder Journal

      The solution I chose was influenced by a number of factors, a couple of which will change for the better when IIS 7.0 is released. The factors are:

      • I need to run PHP for WordPress.
      • I need to run FastCGI for IIS 6.0 to get the best performance out of PHP.
      • .NET and PHP run separately from each other, so I cannot use a .NET URL Rewriter to control which PHP file is chosen to run. (This changes in IIS 7.0 with Integrated Pipelines)
      • I need to pass all requests to www.coderjournal.com through .NET, which costs some performance when serving static files such as images and text files. (This changes in IIS 7.0 with Integrated Pipelines)
      • I need to keep the URL’s friendly for visitors and SEO.

      Because of the factors listed above, I needed two web servers to host www.coderjournal.com, which I will talk about later on in this article. The first is the public interface to www.coderjournal.com, which I will call frontend. The second is the WordPress web server, which I will call backend. It is not public and only handles standard WordPress with the ugly URLs listed above. The picture will demonstrate the structure better than I can explain.

      [Image: Coder Journal Web Structure]

      As you can see, from the above picture, all requests to WordPress are handled by the frontend server for this blog. This all happens through a technique known as Reverse Proxy.

      A reverse proxy dispatches in-bound network traffic to a set of servers, presenting a single interface to the caller. For example, a reverse proxy could be used for load balancing a cluster of web servers. In contrast, a forward proxy acts as a proxy for out-bound traffic. For example, an ISP may use a proxy to forward HTTP traffic from its clients to external web servers on the internet; it may also cache the results to improve performance.

      Without going into a deep explanation of how I accomplished the reverse proxy, here is the basic idea: for every request that comes in to the frontend server and meets certain criteria, I make another HTTP web request to the backend server and then write its response back to the original frontend server request.

      Step 1 - Setting Up .NET to Process All Requests

      Set up your frontend server to process everything through the .NET framework.

      1. Open IIS and right-click on the website and select Properties.
      2. Click the Configuration button under the Application Settings section.
      3. Click the Insert… button to create a new wildcard mapping.
      4. Set the executable textbox to the aspnet_isapi.dll file location.
        For .NET 2.0, 3.0, and 3.5: C:\Windows\Microsoft.NET\Framework\v2.0.50727\aspnet_isapi.dll
      5. Make sure the checkbox Verify that file exists is not checked.
      6. Press OK to confirm and close all the windows.

      Step 2 - Install PHP/WordPress

      Just follow this article on IIS.NET for installing PHP/WordPress on IIS 6.0. You may also want to install FastCGI, I recommend this, but it is optional.

      Step 3 - Setting Up the URL Rewriter and Reverse Proxy Rules

      The criteria for the requests are put inside the URL Rewriter Rules files. But before the proxy request is made, I must check to make sure the file being requested doesn’t already exist on the frontend server. If it does exist on the frontend server I don’t want to make a reverse proxy request. The following is the code used to do that.

      # any file that exists just return it
      RewriteCond %{REQUEST_FILENAME} -f
      RewriteRule ^(.*) $1 [L]

      Then after I check to make sure the file doesn’t exist on the frontend server I make the request to the backend using the following rules.

      # proxy all connections through to the backend server
      RewriteRule ^(/[0-9]{4}/.*) http://backend/index.php$1 [P]
      RewriteRule ^(/tags/.*) http://backend/index.php$1 [NC,P]
      RewriteRule ^(/topics/.*) http://backend/index.php$1 [NC,P]
      RewriteRule ^(/author/.*) http://backend/index.php$1 [NC,P]
      RewriteRule ^(/comments/feed/.*) http://backend/index.php$1 [NC,P]
      RewriteRule ^(/page/.*) http://backend/index.php$1 [NC,P]
      RewriteRule ^(.*) http://backend$1 [P]
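
      To make the rule order easier to follow, here is a minimal C# sketch of how the two rule groups above resolve a request path. It is not the rewriter's own code. The class name ProxyRuleSketch, the Resolve method, and the fileExists callback (standing in for the -f file test) are all hypothetical, and the sketch assumes the first matching proxy rule wins, as it does with the [P] flag in Apache's mod_rewrite.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical illustration only; this is not part of the URL Rewriter itself.
public static class ProxyRuleSketch
{
    // One rule: a pattern to test against the path and the proxy target to build.
    private sealed class Rule
    {
        public readonly Regex Pattern;
        public readonly string Target;

        public Rule(string pattern, bool ignoreCase, string target)
        {
            Pattern = new Regex(pattern, ignoreCase ? RegexOptions.IgnoreCase : RegexOptions.None);
            Target = target;
        }
    }

    // Same patterns, same order, as the rules listed above (NC = ignore case).
    private static readonly Rule[] Rules =
    {
        new Rule(@"^(/[0-9]{4}/.*)", false, "http://backend/index.php$1"),
        new Rule(@"^(/tags/.*)", true, "http://backend/index.php$1"),
        new Rule(@"^(/topics/.*)", true, "http://backend/index.php$1"),
        new Rule(@"^(/author/.*)", true, "http://backend/index.php$1"),
        new Rule(@"^(/comments/feed/.*)", true, "http://backend/index.php$1"),
        new Rule(@"^(/page/.*)", true, "http://backend/index.php$1"),
        new Rule(@"^(.*)", false, "http://backend$1"),
    };

    // Returns the backend URL to proxy to, or null when the frontend
    // should serve the file itself because it exists locally.
    public static string Resolve(string path, Func<string, bool> fileExists)
    {
        if (fileExists(path))
        {
            return null; // mirrors the "-f" condition with the [L] rule
        }

        foreach (Rule rule in Rules)
        {
            Match match = rule.Pattern.Match(path);
            if (match.Success)
            {
                // Expands $1 in the target with the captured path.
                return match.Result(rule.Target);
            }
        }

        return null;
    }
}
```

      With no local file present, ProxyRuleSketch.Resolve("/2008/02/10/sample-post/", p => false) yields http://backend/index.php/2008/02/10/sample-post/, while a path that matches none of the specific rules, such as /wp-content/style.css, falls through to the last rule and becomes http://backend/wp-content/style.css.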

      Conclusions

      To get the exact same setup as I have, you will need the following software, which is all free for download:

      As always, if you have any questions about the setup or the performance, please post them below in the comments and I will answer them and/or update the post as needed.

      Happy Coding.

      Posted in ASP.NET, C#, How To, SEO

      January 26th, 2008

      Coder Journal’s New Year Make Over

      New Theme

      My first major change was the development of my own theme. My old theme was clunky and overall I didn’t like the feel that it gave to my reader base. I became greatly discouraged looking for a new theme as most are more of a testament to art and less on readability and functionality. So I decided to create my own that had a very simple layout.

      Optimization of Load Time

      One of the things I hated about my other blog was the fact that I didn’t have control over how the HTML and thus JavaScript was laid out. Especially the JavaScript because I had duplication where I didn’t need it. The script that Technorati gives you is hardly optimized for load time because of duplication of a supporting script file.

      <a class="tr-linkcount" href="http://technorati.com/search/{your URL here}">View blog reactions</a>
      <script src="http://technorati.com/linkcount" type="text/javascript"></script>

      Notice that the 2nd line in the script above never changes. To optimize this, I only included the 1st line in my post text, and the 2nd line is at the bottom of the page with the rest of my JavaScript.

      The next thing I did was optimize my load time using YSlow. See Jeff Atwood’s description.

      1. Added Expires headers to all my static content for 10 years from the day it is downloaded.
      2. Enabled Gzip compression for all my static content.
      3. Put all CSS at the top of the page.
      4. Reduced DNS lookups by downloading images from LinkedIn, Technorati, and others and hosting them locally.
      5. Moved all JavaScript to the bottom of the page.
      6. Removed duplicate Technorati scripts.
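
      As a small illustration of the first item, the Expires value for a file downloaded at a given moment can be computed like this. This is only a sketch: the class and method names are hypothetical, and how the header actually gets attached to static files depends on the server configuration, which is not shown here.

```csharp
using System;

// Hypothetical helper; it only formats the header value.
public static class FarFutureExpiresSketch
{
    // Returns an Expires header value ten years after the download time.
    // The input is assumed to already be in UTC.
    public static string ExpiresHeader(DateTime downloadedUtc)
    {
        // "R" is the RFC 1123 date format that HTTP headers use.
        return downloadedUtc.AddYears(10).ToString("R");
    }
}
```

      For a download on January 26th, 2008 at midnight UTC, this returns "Fri, 26 Jan 2018 00:00:00 GMT".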

      I also moved hosts from GoDaddy’s shared Linux hosting to GoDaddy’s virtual dedicated hosting on Windows. This proved difficult since URL Rewriting isn’t currently built in to IIS 6.0 like it is in Apache. I will talk a little about this setup in a later post.

      SEO and SEM

      I did a decent amount of SEO and SEM work to get my blog up to snuff. I took the following steps when redesigning the HTML for ease of indexing by Search Engines and Google’s Media Bot (used for giving relevant results in AdSense).

      1. I downloaded the MySQL file from the database and normalized all the URLs to the one you see above.
      2. Google AdSense only allows you to have 3 AdUnits per page, and the placement of the AdUnits counts. For instance, I had to redo my theme so the content comes before the sidebar in the HTML, so that the AdUnit in the articles is placed first and receives the highest quality Ad.
      3. I reduced my categories to a handful of manageable ones, and migrated the rest to the new Tags feature.
      4. Use H1, H2, and H3 tags sparingly. They should be a way to document the internal structure of your HTML page (i.e. logical sections). My logic is as follows:
        1. H1 is used for the blog title.
        2. H2 is used for the article title.
        3. H3 is used for sections of the article.
      5. I started using the Post Slug, which is very important and should abide by the following rules:
        1. Between 3 and 7 keywords
        2. No common English words (such as if, about, when, my, etc.)
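
      The two slug rules can be sketched in C# as follows. This is an illustration under stated assumptions rather than anything WordPress does for you: the class name, the ToSlug method, and the short list of common words are all hypothetical, and the sketch only enforces the upper limit of 7 keywords, since a title with too few keywords cannot be fixed automatically.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for building a post slug from a title.
public static class PostSlugSketch
{
    // Illustrative stop-word list only; extend it to suit your own content.
    private static readonly HashSet<string> CommonWords = new HashSet<string>(StringComparer.OrdinalIgnoreCase)
    {
        "a", "an", "the", "if", "about", "when", "my", "of", "to", "and", "on", "in", "for", "is"
    };

    private const int MaxKeywords = 7;

    public static string ToSlug(string title)
    {
        // Drop apostrophes so "Journal's" becomes one word rather than two.
        string cleaned = title.Replace("'", "").Replace("\u2019", "").ToLowerInvariant();

        var keywords = new List<string>();
        foreach (Match word in Regex.Matches(cleaned, "[a-z0-9]+"))
        {
            if (CommonWords.Contains(word.Value))
            {
                continue; // skip common English words
            }

            keywords.Add(word.Value);
            if (keywords.Count == MaxKeywords)
            {
                break; // keep the slug to at most 7 keywords
            }
        }

        // Hyphens keep the words visibly separate in the URL.
        return string.Join("-", keywords.ToArray());
    }
}
```

      For example, PostSlugSketch.ToSlug("How to Use the .NET URL Rewriter on IIS") returns "how-use-net-url-rewriter-iis".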

      So that was how I spent my holiday creating a new design for my blog. If you have any suggestions, I am all ears about how I can improve my blog for the better.

      Posted in How To, Personal, SEO

      December 7th, 2007

      SEO and C# Extension Methods

      I previously talked about the importance of using the correct kind of redirect to optimize your website for search engines in an article titled World Of HTTP/1.1 Status Codes. I recently decided to create a C# utility class to help me in this endeavor and to extend the far from complete HttpResponse.Redirect method. I am using a new C# 3.0 language feature called Extension Methods. Basically, an extension method allows you to add methods to types that you don’t have the ability to modify, in my case the HttpResponse class.

      I have created the following code to give me better control over my redirects in the HttpResponse class.

      using System.Net;
      using System.Web;
      
      // Extension methods must be declared in a static class; the class name is arbitrary.
      public static class HttpResponseExtensions
      {
      	// Ends the response with the requested redirect status (301, 302, 303, or 307).
      	// Any other value falls back to 302. 304 Not Modified is deliberately not
      	// handled here: it is not a redirect and must not carry a response body.
      	public static void Redirect(this HttpResponse response, int type, string url)
      	{
      		response.Clear();
      
      		switch (type)
      		{
      			case 301:
      				response.StatusCode = (int)HttpStatusCode.MovedPermanently;
      				response.StatusDescription = "Moved Permanently";
      				break;
      
      			case 302:
      				response.StatusCode = (int)HttpStatusCode.Found;
      				response.StatusDescription = "Found";
      				break;
      
      			case 303:
      				response.StatusCode = (int)HttpStatusCode.SeeOther;
      				response.StatusDescription = "See Other";
      				break;
      
      			case 307:
      				response.StatusCode = (int)HttpStatusCode.TemporaryRedirect;
      				response.StatusDescription = "Temporary Redirect";
      				break;
      
      			default:
      				goto case 302;
      		}
      
      		// Sets the Location header that tells the client where to go.
      		response.RedirectLocation = url;
      
      		// Fallback body for clients that do not follow the Location header.
      		response.ContentType = "text/html";
      		response.Write("<html><head><title>Object Moved</title></head><body>");
      		response.Write("<h2>Object moved to <a href=\"" + HttpUtility.HtmlAttributeEncode(url) + "\">here</a>.</h2>");
      		response.Write("</body></html>");
      
      		response.End();
      	}
      }
      

      So now in your code you don’t have to jump through hoops to change the StatusCode, StatusDescription, RedirectLocation, and ContentType just so you can respond with a 301 Redirect instead of a 302 Redirect (the default for HttpResponse.Redirect and the most dangerous of the redirects from an SEO point of view). All that you need is access to the Response property from your Page or Context and you are good to go.

      Response.Redirect(301, "http://www.coderjournal.com");

      So that is all you need to do to give yourself better control over your redirects in .NET. You can use C# 3.0 Extension Methods in the same way on any type that you need to add a custom method to.
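
      As a sketch of the same technique applied to a different type, the following adds a method to the framework's HttpStatusCode enum, which you also cannot modify. The class name and the IsRedirect method are hypothetical and exist only for this example.

```csharp
using System.Net;

// Hypothetical example of an extension method on a type you do not own.
public static class HttpStatusCodeExtensions
{
    // True for the status codes that send the client to a different URL.
    public static bool IsRedirect(this HttpStatusCode status)
    {
        switch (status)
        {
            case HttpStatusCode.MovedPermanently:   // 301
            case HttpStatusCode.Found:              // 302
            case HttpStatusCode.SeeOther:           // 303
            case HttpStatusCode.TemporaryRedirect:  // 307
                return true;

            default:
                return false;
        }
    }
}
```

      The call reads as if the method had always been part of the type: HttpStatusCode.MovedPermanently.IsRedirect() is true, while HttpStatusCode.NotModified.IsRedirect() is false.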

      Posted in C#, Programming, SEO