Microsoft pursuit of Google revealed

Microsoft October 31st, 2003

I was in meetings all morning, so I missed this report earlier, but Dave just alerted me to the Microsoft and Google news.

“Microsoft approached Google, the internet search engine, two months ago to discuss a partnership or even a merger it emerged today.”

“Google showed little interest in overtures from the company that dominates the market for operating systems.”
(via The Guardian)

This is certainly interesting, yet given MSFT’s track record in this respect, the news is not surprising. My guess is that the initial rejection by Google spurred MSFT’s recent MSN Search push.

Doh! The cruller is no more!

General October 30th, 2003

Donuts?

“Walk into any local Dunkin’ Donuts and you can purchase a caramel-swirl latte or sourdough bagel, a pumpkin muffin or powdered Munchkin. You can get a jelly stick, chocolate stick, or chocolate-coconut stick, pastries that are shaped somewhat like conventional crullers and contain roughly the same lip-smacking number of empty calories.”

“But you cannot get a cruller anymore… ” (more here)

Your Mom is so versatile you can now have her in RSS

RSS October 30th, 2003

Yes, indeed, it’s true: yourmom.com now has her very own set of RSS feeds!

I’m sure you’ll sleep better with this wonderful news. In fact, there’s more! Your mom even comes in Atom 0.1 format, thanks to FeedCreator class v1.3.

Enjoy :-)

JavaScript: Search word hit-highlighting

Open Source October 29th, 2003

I found searchhi, a slick JavaScript library by Stuart Langridge that highlights keywords in your documents when the referring link to your page comes from a search engine such as Google:

“searchhi JavaScript library is a way of automatically highlighting words on a page when that page was reached by a search engine. In essence, if you search, for example, Google for some words, and then follow a link from the search results to a searchhi enabled page, the words you searched for will be highlighted on that page.”

I was actually thinking of using PHP to do this, but Stuart’s JS code seems to be a better alternative given that the performance hit is on the client-side.
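For comparison, here is a minimal sketch of what the server-side PHP approach might look like. This is not Stuart's code: the function names and the "searchword" class are hypothetical, and it assumes the search engine passes the query in a "q" parameter of the referrer, as Google does.

```php
<?php
// Hypothetical helper: pull the search words out of a referrer URL.
// Assumes the query lives in a "q" parameter (true for Google).
function search_terms_from_referrer(string $referrer): array
{
    $query = parse_url($referrer, PHP_URL_QUERY);
    if (!is_string($query)) {
        return [];
    }
    parse_str($query, $params);
    if (empty($params['q']) || !is_string($params['q'])) {
        return [];
    }
    // Split the decoded query on whitespace, dropping empty pieces.
    return preg_split('/\s+/', trim($params['q']), -1, PREG_SPLIT_NO_EMPTY);
}

// Hypothetical helper: wrap each whole-word match in a span.
// Works on plain text only; run over raw HTML it could match
// inside tags or attributes.
function highlight_terms(string $text, array $terms): string
{
    if ($terms === []) {
        return $text;
    }
    $quoted = array_map(function ($t) {
        return preg_quote($t, '/');
    }, $terms);
    $pattern = '/\b(' . implode('|', $quoted) . ')\b/i';
    return preg_replace($pattern, '<span class="searchword">$1</span>', $text);
}
```

The trade-off is the one noted above: this work would run on the server for every search-referred page view, whereas the JavaScript version pushes it to the visitor's browser.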

Google eyes book search

Search October 29th, 2003

In light of Amazon’s recent book search service, this report on CNet about Google in talks with publishers to provide a similar service seems a little strange.

“Google is in talks with several publishers to build a service that would allow Web surfers to search the full text of books online” (via CNet)

IMHO, the service does make sense for Amazon as a way to drive more consumers to book purchases, and I suppose it could also turn into another revenue source for Google, certainly as a research tool for business and academia. But is the market big enough to support the effort?

Movable Type Blog Migration

Blogs October 28th, 2003

Over the last week, usually in the mid-to-late evenings after Catherine falls asleep, I have been slowly migrating my B2-based blog to Movable Type.

I must say that for the most part the process has been fairly straightforward. The MT system installed smoothly, and customizing the core MT templates to fit my old B2 template, while time-consuming, was rather easy; the templates are extremely flexible.

However, during the migration process I had some interesting obstacles. In particular, I wanted to seamlessly maintain the entire URL-space of my old B2 blog with my new MT blog. My initial thinking was that with a little data-scrubbing and massaging I could export the MySQL table data from B2 and import the data into the MT table-space.

In addition, if I could retain the same entry/post IDs between the old and new system, I could easily redirect links via an Apache mod_rewrite regular expression mapping.
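To make the idea concrete, here is a rough sketch of the ID-preserving mapping, written in PHP rather than as a mod_rewrite rule. It assumes the old B2 links carry the post ID as a "p" query parameter; the function name and the zero-padded "/archives/000613.html" target layout are hypothetical, chosen only to show the pattern.

```php
<?php
// Hypothetical helper: turn an old B2 link into a new archive path,
// assuming the new system kept the same numeric entry ID.
function rewrite_b2_url(string $oldUrl): ?string
{
    // Old links are assumed to carry the post ID as "?p=613".
    if (!preg_match('/[?&]p=(\d+)/', $oldUrl, $m)) {
        return null; // not a post link, nothing to rewrite
    }
    // The zero-padded archive path below is an assumed layout.
    return sprintf('/archives/%06d.html', (int) $m[1]);
}
```

A single regular expression can only do this when the ID survives the migration unchanged, which is why retaining the entry/post IDs mattered.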

After some initial head-scratching, this idea turned out to be a bit more complex than I had thought, given that I wanted to include comments as well. Plus, I wasn’t sure if retaining the entry/post IDs would break MT.

I did some quick searches via Google and the great MT Support forums and found Bill Grady’s excellent B2 Export script for MT, which allowed me to dump all of my B2 posts and comments into MT’s import format. This format enabled me to easily import my old post data into MT.

The problem, however, was that (as far as I can tell) MT’s import format does not allow for the specification of entry/post IDs, which prevented me from using a simple Apache mod_rewrite regular expression to map the URL-space.

Oh well, back to the drawing-board…

After further research, I found some interesting solutions that use MT archive templates to create global redirects in Apache’s .htaccess or httpd.conf formats.

Unfortunately, these solutions used the entry_id as the key field in the mapping, which caused problems for me because my old B2 blog had post IDs that were inconsistent with MT. Plus, for some reason the post IDs in B2 were out of order.

I thought I could use the post date as my key field, but for some reason I found a number of inconsistencies between the post dates in the two data sets. Very odd.

Instead I used the entry title as my key field; this required me to ensure that the entry titles in the old and new data sets were precisely the same and did not contain any duplicates. This way I could use the entry titles to map old post IDs from B2 to the new URL space in Movable Type.
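A quick way to check for duplicate titles might look like the following PHP sketch. The function name is hypothetical, and it assumes the titles have already been pulled into a plain array of strings.

```php
<?php
// Hypothetical helper: return every title that appears more than once.
function duplicate_titles(array $titles): array
{
    // Count how often each title occurs.
    $counts = array_count_values($titles);
    $dupes = array_keys(array_filter($counts, function ($n) {
        return $n > 1;
    }));
    // PHP turns numeric-looking keys such as "404" into integers,
    // so cast the results back to strings.
    return array_map('strval', $dupes);
}
```

An empty result means the titles are safe to use as a key; anything else needs renaming in both data sets first.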

Once the titles were synchronized, I created an MT template to export my newly imported MT entries in a CSV format that I could manipulate in Excel. I used the following MT Archive template:


<$MTEntryID$>,<$MTEntryDate format="%m/%d/%Y %H:%M"$>,<$MTEntryTitle$>,<$MTEntryLink$>

I then exported my B2 post data into a CSV file and sorted it by title in Excel; I opened the newly exported MT data in Excel and sorted it by title as well. I now had two matching sets of data, each with unique entry/posting IDs. The next step was to construct the redirect mapping between the old post ID and the entry’s new URL.
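The title matching I did by hand in Excel could also be scripted. The PHP sketch below is one way to do it, not what I actually ran: the function name is hypothetical, the B2 column order (post ID, then title) is an assumption, and the MT rows follow the column order of the template above.

```php
<?php
// Hypothetical helper: map old B2 post IDs to new MT URLs by title.
// $b2Rows: list of [post_id, title] rows (column order assumed).
// $mtRows: list of [entry_id, date, title, url] rows, as in the MT template.
function build_redirect_map(array $b2Rows, array $mtRows): array
{
    // Index the MT URLs by title so each B2 row needs one lookup.
    $urlByTitle = [];
    foreach ($mtRows as $row) {
        $urlByTitle[$row[2]] = $row[3];
    }

    $map = [];
    foreach ($b2Rows as $row) {
        list($postId, $title) = $row;
        if (isset($urlByTitle[$title])) {
            $map[$postId] = $urlByTitle[$title];
        }
        // Rows with no matching title are skipped and need manual review.
    }
    return $map;
}
```

The rows could come from reading each CSV file with fgetcsv. Note that the MT template above does not quote its fields, so a title containing a comma would need special handling before parsing.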

Ultimately, I used a bit of PHP to do the redirecting. I did this by constructing an associative array using the post ID from B2 as the index, with the MT entry URL as the value. I also utilized the ‘array_key_exists’ PHP function to determine if the old post ID was found in the array.

Here’s a snippet of code:

$entry_array = array(
    "613" => "http://www.hatch.org/blog/2002/06/17/404.php",
    "576" => "http://www.hatch.org/blog/2002/04/18/1000_ultrapersonal_computer.php",
);

// entry lookup: $p holds the old B2 post ID
$url = null;
if ($p && array_key_exists($p, $entry_array)) {
    $url = $entry_array[$p];
}

// redirect only when the old post ID was found
if ($url !== null) {
    header("Location: " . $url);
    exit;
}

Worked like a charm!

I wish I could use this or a similar technique to redirect my old RSS feed to my feed’s new location, but that’s a topic for another day…

Amazon’s Full Text Search

Search October 23rd, 2003

Amazon has now extended its search to include the full text of over 120,000 books. It will even do hit-highlighting on the actual pages that match. Slick!

Office 2003 and the Google Web Service API

Microsoft October 21st, 2003

This article about integrating Google into the research pane of Office is from a few months ago, but I think it is relevant given that today is the official launch of Office 2003.

Hybrid Application Manager

Open Source October 17th, 2003

AppRocket seems to be a hybrid application manager with “intelligent search” that appears to be based on LaunchBar for Mac OS X. (link via Les Orchard)

“AppRocket uses a very special search algorithm to zip through thousands of items and only show you that which is most relevant.”

One note, however: AppRocket requires .NET 1.1.

Is Comment Spam Cost Effective?

Blogs October 16th, 2003

I’m getting my fair share of comment spam like many other bloggers, but I can’t imagine that the cost/time ratio is actually worth it.

I think Sam Ruby sums it up best:

“65 minutes to create. Carefully crafted to appear to be on topic. 10 seconds to wipe out.”

LOL! Dumb asses!