Wired magazine has a feature article which gives about as much detail as outsiders can expect on the core of Google’s business, its search algorithm. I was surprised to see that philosopher Ludwig Wittgenstein was an influence. Hundreds of different pieces of information (or “signals”) are used to rank the results, and some of these are contextual to the user: for example, geographical information is used to prioritise results from near your location.

One of the signals which is increasing in importance is page speed: the time it takes the page to load and render. Hence it’s worth reading up on Google’s performance optimisation tips.

Intute advent calendar blog

This December, Intute is once again running an “Advent Calendar” on its blog, with the theme of user-created content. It started on Tuesday with a post about the independent film Born of Hope, set in Tolkien’s Middle-earth. My own post, “Voluntary work for an obscure educational charity”, discusses contributing academic material to Wikipedia. Paul Meehan’s post today discusses augmenting a human-maintained web catalogue with Google Custom Search Engine. There’s more to come through the month on Web 2.0 and community themes, and as usual the Intute blog has that bit more depth than the rest.

driven Google custom search

This is an account of how and why I wrote a Google custom search engine to search sites that I had bookmarked on

I’ve liked the Google custom search engine since I first played with it shortly after it came out. If you don’t know about Google CSE, it allows an individual or group to create a search form that performs a full-text search using the Google search engine, but limited to sites which they choose. This search form, and the results page, can be embedded in any website. I think it is the obvious way to build a cross-search across all the centres in an organization like the HE Academy (this was one of my first custom search engines). Better, for teaching and learning you can set up a reading list of recommended sites for a course and let students do a full Google search that prioritizes those sites (for a sort of generic variation on this see Tony Hirst’s Open Educational Resources search). Better still, let the students as a group decide which sites they want on their course reading list.
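As a rough illustration of the “limited to sites which they choose” idea, here is a minimal Python sketch. It does not use the CSE service itself; it only approximates the restriction by appending Google's ordinary `site:` operator to a query. The function name `restricted_query` and the example host names are hypothetical.

```python
from typing import List


def restricted_query(terms: str, sites: List[str]) -> str:
    """Build a query string limited to the given sites via the site: operator."""
    if not sites:
        # No site list: fall back to an unrestricted search.
        return terms
    # Join the chosen sites so a result from any one of them qualifies.
    site_filter = " OR ".join("site:" + s for s in sites)
    return f"{terms} ({site_filter})"


# Hypothetical reading list for a course.
reading_list = ["a.example", "b.example"]
print(restricted_query("inflation", reading_list))
```

A real custom search engine stores the site list on Google's side and can also boost preferred sites rather than exclude everything else, which a plain `site:` filter cannot do.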

Setting Canonical Domain with Apache

An experiment in search engine optimization:
My work site, (or if you’re intimate) is also known by three other domain names, because of past re-branding. My problem? How to tell search engines that these are the exact same site, so they know that an external link to, say, is to count as a link to (and boost my site’s Google ranking, goddamit!). Establishing a canonical domain name like this should also help consistency of brand (i.e. helping the user know what site they are on and what to call it).

For a long time I had a <base href="…"> tag in my home page to set the canonical domain. This is dumb. It only ensures that a user sees the domain once they’ve come to the home page and then clicked a link. A check shows that, and still exist in the Google index as separate sites. A serious fix requires a few lines of Apache config:

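RewriteEngine on
# Match any request that arrives on one of the old domain names
RewriteCond %{HTTP_HOST} (ltsn|heacademy)
# Permanent (301) redirect to the same path on the canonical domain.
# canonical-domain.example is a placeholder: substitute the site's real host name.
RewriteRule (.*) http://canonical-domain.example/$1 [R=301,L]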
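
To make the logic of those rules concrete, here is a minimal Python sketch of the same decision: if the requested host name contains one of the old brand names, send a permanent redirect to the same path on the canonical host. The host `www.example.org` and the function name `canonical_redirect` are assumptions for illustration, not the real site or any Apache API.

```python
import re
from typing import Optional

# Hypothetical canonical host; stands in for the site's real domain name.
CANONICAL_HOST = "www.example.org"

# Same pattern as the RewriteCond: any host name containing "ltsn" or "heacademy".
OLD_HOST_PATTERN = re.compile(r"(ltsn|heacademy)")


def canonical_redirect(host: str, path: str) -> Optional[str]:
    """Return the URL to 301-redirect to, or None if no redirect is needed."""
    if OLD_HOST_PATTERN.search(host):
        # Keep the requested path so deep links land on the matching page.
        return "http://" + CANONICAL_HOST + "/" + path.lstrip("/")
    return None


print(canonical_redirect("economics.ltsn.example", "/about/"))
print(canonical_redirect("www.example.org", "/about/"))
```

Because the redirect is permanent (status 301), search engines can treat links to the old names as links to the canonical one, which is the point of the exercise.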

