Replay search on GosuGamers doesn’t work properly? No problem

Sometimes programming skills come in really handy. Just now, for example, I tried searching GosuGamers for Warcraft 3 replays featuring a specific player. The search returned no matches, which is kind of weird because I had already found one replay of that player via Google.

The solution? Write a script that scrapes all the replay listing pages and checks whether the player's nickname appears on each one.

Update: Come to think of it, maybe I was too clever for my own good. The Google query site:gosugamers.net/warcraft gamlasonn -gosubet is a viable alternative 🙂

The complete script can be seen below. It's fairly short and should be pretty self-explanatory: it runs a case-insensitive regular expression for the nickname against each page, and if a match is found it prints the page URL so I can go and check it manually.


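<?php

// Allow up to 6 minutes of run time; the loop fetches 76 pages in sequence.
set_time_limit(360);

$start = 0;    // offset of the first listing page
$end = 7500;   // offset of the last listing page
$matches = array();

// Step through the listing pages, advancing the start offset by 100 each time.
for ($i = $start; $i <= $end; $i += 100) {
    $url = 'http://www.gosugamers.net/warcraft/replays.php?&start=' . $i;

    echo 'Processing ' . $url . '...<br />';

    $ch = curl_init();
    $timeout = 3; // connection timeout in seconds
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    $html = curl_exec($ch); // false on failure
    curl_close($ch);

    // Case-insensitive search for the nickname anywhere in the page source.
    if ($html && preg_match('#gamlasonn#i', $html)) {
        $matches[] = $url;
    }
}

echo "<br /><br /><b>Results:</b><br />";
echo implode('<br />', $matches);

?>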
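
If you want to search for a different player, the matching step can be pulled out into a small function. This is only a sketch: page_mentions_player is a name I made up for illustration, and it is not part of the script above. The one real difference is preg_quote(), which escapes any regex metacharacters in the nickname so that names containing characters like "." or "+" are matched literally.

```php
<?php

// Hypothetical helper (not part of the original script): returns true when
// $nickname occurs anywhere in $html, ignoring case.
function page_mentions_player(string $html, string $nickname): bool
{
    // preg_quote() escapes regex metacharacters in the nickname; passing '#'
    // makes sure the delimiter character is escaped as well.
    $pattern = '#' . preg_quote($nickname, '#') . '#i';

    // preg_match() returns 1 on a match, 0 on no match, false on error.
    return preg_match($pattern, $html) === 1;
}

var_dump(page_mentions_player('<td>GamlaSonn vs. Foo</td>', 'gamlasonn')); // bool(true)
var_dump(page_mentions_player('<td>Foo vs. Bar</td>', 'gamlasonn'));       // bool(false)
```

In the loop, the condition would then read if ($html && page_mentions_player($html, 'gamlasonn')).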

Note that website owners often frown upon web scraping: they would rather you visit their site directly, among other reasons. Take care not to overdo it; it might also go against their ToS.
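
One way to go easier on the site is to pause between requests and identify the script in the User-Agent header. The following is a rough sketch under my own assumptions: build_page_urls and fetch_politely are hypothetical names, and the two-second delay, the ten-second total timeout and the User-Agent string are arbitrary placeholders rather than anything GosuGamers asks for.

```php
<?php

// Hypothetical helper: builds the listing-page URLs for offsets
// $start, $start + $step, ... up to $end.
function build_page_urls(int $start, int $end, int $step = 100): array
{
    $urls = array();
    foreach (range($start, $end, $step) as $offset) {
        $urls[] = 'http://www.gosugamers.net/warcraft/replays.php?&start=' . $offset;
    }
    return $urls;
}

// Hypothetical helper: fetches one page, then sleeps so that consecutive
// calls are spaced out. Returns the page body, or false on failure.
function fetch_politely(string $url, int $delaySeconds = 2)
{
    $ch = curl_init();
    curl_setopt_array($ch, array(
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_CONNECTTIMEOUT => 3,   // seconds to wait for the connection
        CURLOPT_TIMEOUT        => 10,  // seconds allowed for the whole request (assumed value)
        // Placeholder User-Agent; replace with something that identifies you.
        CURLOPT_USERAGENT      => 'replay-search-script (contact: you@example.com)',
    ));
    $html = curl_exec($ch);

    sleep($delaySeconds); // pause before the caller issues the next request
    return $html;
}
```

The main loop would then become foreach (build_page_urls(0, 7500) as $url) { $html = fetch_politely($url); ... }, with the nickname check unchanged.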
