Over at REST-discuss there’s an interesting discussion going on about how best to construct your URLs. Bill Venners of Artima posted a very interesting note on how he uses redirects to canonicalize all URLs, including fixing the order of query parameters and removing parameters that carry their default values. His motivation for doing this, which I think is brilliant, is to help search engines identify two URLs as pointing to the same resource. That is, if http://example.com/foo?a=1&b=2 is his canonical URL, the following examples (with c having the default value of 3) would all get a 301 response pointing to the canonical URL above:
Getting a search engine to figure out that all of these are the same thing would add up the four page rankings and likely rank the final page higher. Now, we don’t know how search engines do their rankings (or canonicalizations, for that matter), but I think it’s a fair guess that this method helps.
In a response, Roy Fielding suggests that he should use a canonical URL that doesn’t use query parameters at all, which would probably help caches do a better job. I think that makes sense as well, so these suggestions will now form the embryo of my URL c14n best practice:
- Don’t use query parameters in the URL of the final resource (a search result page is a different matter, but the actual final page should have a clean URL).
- Where you need to accept query parameters (e.g. when a user selects a resource through an HTML form), redirect to the canonical URL.
- When you really need a page identified by its query parameters (e.g. a search result page), redirect so that the ordering and presence of query parameters are canonicalized.
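
To make the last point concrete, here is a minimal sketch in Python of how the canonicalization could be computed. It is my own illustration, not Bill Venners' actual implementation: the function names `canonical_url` and `redirect_target` and the `DEFAULTS` table are hypothetical, and sorting the parameters alphabetically is just one possible choice of canonical order.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical table of default values; mirrors the example above,
# where c defaults to 3.
DEFAULTS = {"c": "3"}


def canonical_url(url, defaults=DEFAULTS):
    """Return url with default-valued parameters dropped and the rest sorted."""
    parts = urlsplit(url)
    params = parse_qsl(parts.query, keep_blank_values=True)
    # Drop any parameter whose value equals its declared default.
    kept = [(key, value) for key, value in params if defaults.get(key) != value]
    # Assumption: alphabetical order by name (then value) is the canonical order.
    kept.sort()
    return urlunsplit(parts._replace(query=urlencode(kept)))


def redirect_target(url, defaults=DEFAULTS):
    """Return the URL to send a 301 to, or None if url is already canonical."""
    canonical = canonical_url(url, defaults)
    return None if canonical == url else canonical
```

A request handler would call `redirect_target` on the incoming URL and answer with a 301 whenever it returns something other than `None`; a request for the canonical URL itself is served directly, so there is no redirect loop.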