This is a joyously small script that calculates the number of
pages Google has spidered for a specific domain. Nothing fancy: it just extracts
the figure that Google gives at the top of the results page. As with other scripts,
the markup on Google's side may change, which would make this one invalid.
I tend to use this in a daily batch file to check on my new domains and see how
Google is coming along with spidering their pages.
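For the daily batch use, a minimal sketch of how the query URL could be built for several domains in one run. The helper name `site_query_url` and the domain list are hypothetical and not part of the original script; only the URL pattern is taken from it.

```php
<?php
// Hypothetical helper: builds the same "site:" query URL that the script uses.
// rawurlencode turns the colon into %3A and leaves the dots in the domain alone.
function site_query_url($domain) {
    return "http://www.google.com/search?sourceid=navclient&ie=UTF-8&q="
         . rawurlencode("site:" . $domain);
}

// Placeholder domains, for illustration only.
$domains = array("example.com", "example.org");

foreach ($domains as $domain) {
    echo $domain . " -> " . site_query_url($domain) . "\n";
}
```

Each URL could then be fetched and scanned for the page count, one domain per pass.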
<?php
// $searchurl is assumed to be set before this point (e.g. by the calling batch script).
$count = 0;
if (!empty($searchurl)) {
    $searchurl = str_replace("%2E", ".", $searchurl);
    $filename = "http://www.google.com/search?sourceid=navclient&"
              . "ie=UTF-8&q=site%3A$searchurl";
    $file = fopen($filename, "r");
    if (!$file) {
        echo "<p>Unable to open remote file $filename.</p>\n";
    } else {
        // Read the page in chunks until the result-count marker turns up.
        while (!feof($file)) {
            $var = fgets($file, 1024);
            // preg_match with the "i" flag replaces eregi, which was removed in PHP 7.
            if (preg_match('/of about <b>(.*)<\/b> from/i', $var, $out)) {
                $count = strtolower(strip_tags($out[1]));
                break;
            }
        }
        fclose($file);
    }
    if ($count) {
        $result = "The site $searchurl has $count pages listed";
    } else {
        $result = "The site $searchurl is not spidered";
    }
}
?>
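The extraction step can also be pulled out into a small function, which makes it testable without hitting Google. The sketch below is an assumption-laden variant, not the original method: the name `extract_page_count` is hypothetical, the sample line is made up, and it returns an integer with the thousands separators stripped rather than the raw string.

```php
<?php
// Hypothetical helper: pulls the page count out of one line of result HTML.
// Returns the count as an integer, or 0 when the marker text is not found.
function extract_page_count($line) {
    if (preg_match('/of about <b>(.*?)<\/b> from/i', $line, $out)) {
        // Drop any tags, then drop every non-digit character (such as commas).
        $digits = preg_replace('/\D/', '', strip_tags($out[1]));
        return (int) $digits;
    }
    return 0;
}

// Made-up sample line, for illustration only.
$sample = 'Results <b>1</b> - <b>10</b> of about <b>1,230</b> from <b>example.com</b>';
echo extract_page_count($sample) . "\n";
```

Returning an integer makes it easier to compare counts from one day's run to the next. If Google's markup changes, only the pattern inside this function would need updating.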