<?xml version="1.0" encoding="UTF-8"?><!-- generator="wordpress/2.2.2" -->
<rss version="2.0" 
	xmlns:content="http://purl.org/rss/1.0/modules/content/">
<channel>
	<title>Comments on: Google MD5 Hash Search Engine</title>
	<link>http://www.dragoslungu.com/2007/06/22/google-md5-hash-search-engine/</link>
	<description>Security Tools and Tips</description>
	<pubDate>Wed, 20 Aug 2008 14:31:37 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.2.2</generator>

	<item>
		<title>By: Onur Safak</title>
		<link>http://www.dragoslungu.com/2007/06/22/google-md5-hash-search-engine/#comment-58303</link>
		<author>Onur Safak</author>
		<pubDate>Mon, 14 Jul 2008 23:25:24 +0000</pubDate>
		<guid>http://www.dragoslungu.com/2007/06/22/google-md5-hash-search-engine/#comment-58303</guid>
		<description>but how will you benefit from storing them into Google

It helps a lot because even if the site goes offline, Google's cache will still have the hash

Actually, the benefit is the storage and the ability to search hundreds of gigabytes, even terabytes, of result data from the index.

And I don't like the idea of disturbing one of the best internet services, with a great team behind it.</description>
		<content:encoded><![CDATA[<p>but how will you benefit from storing them into Google</p>
<p>It helps a lot because even if the site goes offline, Google&#8217;s cache will still have the hash</p>
<p>Actually, the benefit is the storage and the ability to search hundreds of gigabytes, even terabytes, of result data from the index.</p>
<p>And I don&#8217;t like the idea of disturbing one of the best internet services, with a great team behind it.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dragos Lungu</title>
		<link>http://www.dragoslungu.com/2007/06/22/google-md5-hash-search-engine/#comment-5495</link>
		<author>Dragos Lungu</author>
		<pubDate>Wed, 17 Oct 2007 18:41:53 +0000</pubDate>
		<guid>http://www.dragoslungu.com/2007/06/22/google-md5-hash-search-engine/#comment-5495</guid>
		<description>@photoshop 
It helps a lot because even if the site goes offline, Google's cache will still have the hash.</description>
		<content:encoded><![CDATA[<p>@photoshop<br />
It helps a lot because even if the site goes offline, Google&#8217;s cache will still have the hash.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: photoshop</title>
		<link>http://www.dragoslungu.com/2007/06/22/google-md5-hash-search-engine/#comment-5466</link>
		<author>photoshop</author>
		<pubDate>Wed, 17 Oct 2007 04:00:19 +0000</pubDate>
		<guid>http://www.dragoslungu.com/2007/06/22/google-md5-hash-search-engine/#comment-5466</guid>
		<description>But how will you benefit from storing them in Google?</description>
		<content:encoded><![CDATA[<p>But how will you benefit from storing them in Google?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Tim Brown</title>
		<link>http://www.dragoslungu.com/2007/06/22/google-md5-hash-search-engine/#comment-292</link>
		<author>Tim Brown</author>
		<pubDate>Sun, 24 Jun 2007 12:18:14 +0000</pubDate>
		<guid>http://www.dragoslungu.com/2007/06/22/google-md5-hash-search-engine/#comment-292</guid>
		<description>No, not at all; everyone sees the same content. The point here is that Google doesn't always cache the body of the web page, but will always cache the title. Hence, when we present results for any given word, we present them in both areas. The idea is that, should we ever feel like closing the PoC, Google will continue to remember :).</description>
		<content:encoded><![CDATA[<p>No, not at all; everyone sees the same content. The point here is that Google doesn&#8217;t always cache the body of the web page, but will always cache the title. Hence, when we present results for any given word, we present them in both areas. The idea is that, should we ever feel like closing the PoC, Google will continue to remember :).</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ben</title>
		<link>http://www.dragoslungu.com/2007/06/22/google-md5-hash-search-engine/#comment-290</link>
		<author>Ben</author>
		<pubDate>Sun, 24 Jun 2007 04:01:35 +0000</pubDate>
		<guid>http://www.dragoslungu.com/2007/06/22/google-md5-hash-search-engine/#comment-290</guid>
		<description>It's a rather interesting concept: leaving storage and indexing up to Google, instead of having to handle it all yourself. So all you do is generate the hashes as Google requests the page. The only problem is that there are a few things I don't know:
1) How far Google is willing to crawl: a search for "site:www.nth-dimension.org.uk" reveals some 7K results for the domain name.
2) Poking around in the URL, it looks like they only have about 2400 words (change the startline value).
3) There's one that is more methodical: http://reverse.me.uk/, but a search for that domain reveals a measly 49 results... so somehow Google will stop early if it all looks the same? I'm not sure.
4) I've tried to make my own, just for the fun of it. My blog is searchable by Google, so hopefully there'll be more than 49 results... we'll see? Here's the hash generator: http://darwin.servehttp.com/cgi-bin/hash.pl</description>
		<content:encoded><![CDATA[<p>It&#8217;s a rather interesting concept: leaving storage and indexing up to Google, instead of having to handle it all yourself. So all you do is generate the hashes as Google requests the page. The only problem is that there are a few things I don&#8217;t know:<br />
1) How far Google is willing to crawl: a search for &#8220;site:www.nth-dimension.org.uk&#8221; reveals some 7K results for the domain name.<br />
2) Poking around in the URL, it looks like they only have about 2400 words (change the startline value).<br />
3) There&#8217;s one that is more methodical: <a href="http://reverse.me.uk/" rel="nofollow">http://reverse.me.uk/</a>, but a search for that domain reveals a measly 49 results&#8230; so somehow Google will stop early if it all looks the same? I&#8217;m not sure.<br />
4) I&#8217;ve tried to make my own, just for the fun of it. My blog is searchable by Google, so hopefully there&#8217;ll be more than 49 results&#8230; we&#8217;ll see? Here&#8217;s the hash generator: <a href="http://darwin.servehttp.com/cgi-bin/hash.pl" rel="nofollow">http://darwin.servehttp.com/cgi-bin/hash.pl</a></p>
]]></content:encoded>
	</item>
</channel>
</rss>
