Google MD5 Hash Search Engine

I came across an interesting combination of blackhat SEO and "knowledge belongs to the people" hacker attitude. It’s about storing unique MD5 hashes in the titles of numerous pages spidered by Google. You may call it an implementation of a hash search engine using Google.

Unlike other implementations, the aim here is to get Google to store the word and associated hash. We do this by putting them into the title where it will always be stored by Google’s spider. Dynamically generating them means they’re only there when Google’s spider wants them.

If I read it right, they present different content to humans vs. search engines; isn’t this a cloaking blackhat SEO technique?
Anyway, it’s a nice PoC of the ubiquity of Google search, but I still think that GData’s free online MD5 cracker kicks ass with its 168,678,430 unique entries.

 




5 Responses to “Google MD5 Hash Search Engine”

  • 1
    Ben
    June 23rd, 2007 21:01

    It’s a rather interesting concept: leaving storage and indexing up to Google, instead of having to handle it all yourself. So all you do is generate the hashes as Google requests the page. A few things I’m not sure about:
    1) How far Google is willing to crawl: a search for “site:www.nth-dimension.org.uk” reveals some 7K results for the domain name.
    2) Poking around in the URL, it looks like they only have about 2400 words (change the startline value).
    3) There’s one that is more methodical: http://reverse.me.uk/, but a search for that domain reveals a measly 49 results… so somehow Google will stop early if it all looks the same? I’m not sure.
    4) I’ve tried to make my own, just for the fun of it. My blog is searchable by Google, so hopefully there’ll be more than 49 results… we’ll see. Here’s the hash generator: http://darwin.servehttp.com/cgi-bin/hash.pl

  • 2
    Tim Brown
    June 24th, 2007 05:18

    No, not at all: everyone sees the same content. The point here is that Google doesn’t always cache the body of the web page, but will always cache the title. Hence, when we present results for any given word, we present them in both areas. The idea is that should we ever feel like closing the PoC, Google will continue to remember :).

  • 3
    photoshop
    October 16th, 2007 21:00

    But how will you benefit from storing them in Google?

  • 4
    Dragos Lungu
    October 17th, 2007 11:41

    @photoshop
    It helps a lot because even if the site goes offline, Google’s cache will still have the hash.

  • 5
    Onur Safak
    July 14th, 2008 16:25

    “But how will you benefit from storing them in Google?”

    “It helps a lot because even if the site goes offline, Google’s cache will still have the hash.”

    Actually, the benefit is the storage and the ability to search hundreds of gigabytes, even terabytes of result data from the index.

    And I don’t like the idea of disturbing one of the best internet services, one with a great team behind it.

