How to get information removed from Google’s cache

Google's instructions on how to do this are contradictory. Here is a method I used successfully.

* * *

Scenario

This task arose while handling a request to hide a page on indymedia.org.uk.

The story described part of a campaign against the deportation of an asylum seeker who was subsequently granted refugee status. Nothing in the story breached editorial guidelines, but the subject of the report has a distinctive name and didn’t want his immigration history to remain public. We were contacted by friends of his from a local No Borders group who had supported him in his campaign to remain in the UK, saying that he wanted to ‘put his past behind him.’

We had already removed personal contact details from the story, removed an attached file containing contact details, and later hidden the story itself. A month later, though, we were asked why the story still appeared when the subject googled his own name. The subject’s friends had already been advised to get the content removed from Google’s cache, but they couldn’t work out how to do it.

It seems that Google provides at least three different ways to manipulate cached content:

  1. a process for webmasters, which requires a validation process that might compromise the privacy of the administrator
  2. a process for ‘data subjects’ (people about whom information is stored)
  3. a faster process for people who may become victims of identity theft

These instructions cover option 2.

Before you begin

It only makes sense to ask Google to refresh their cache if the content at a particular URL has changed since the last time it was crawled. So, make sure that the version of the page in their cache is different from the page as accessed directly.
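The comparison described above can be done by eye, but a small script makes it less error-prone. What follows is only a minimal sketch in Python, assuming you have already saved both the cached copy and the live copy of the page from your browser (so the script itself makes no network requests). The function names are my own illustrations, not part of any Google tool.

```python
import difflib
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the visible text of a page, skipping <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def visible_text(html):
    """Return the page text with all runs of whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return re.sub(r"\s+", " ", " ".join(parser.chunks)).strip()


def worth_requesting_refresh(cached_html, live_html):
    """True if the cached copy's text differs from the live copy's text."""
    return visible_text(cached_html) != visible_text(live_html)


def removed_words(cached_html, live_html):
    """Words the diff marks as present in the cached copy but not the live one."""
    cached = visible_text(cached_html).split()
    live = visible_text(live_html).split()
    return [d[2:] for d in difflib.ndiff(cached, live) if d.startswith("- ")]
```

To use it, read the two saved files into strings and pass them to worth_requesting_refresh. If it returns False, the cache already matches the live page and a refresh request would achieve nothing.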

Google stores all kinds of information about people, so for your own privacy you might like to:

  • perform all of the following through the Tor network
  • create a disposable email account
  • ensure your browser has no Google cookies stored
  • temporarily change your browser’s user agent string (e.g. in Firefox you can use the Tamper Data or Switch User Agent add-ons)

Create a disposable Google account

  1. Start at www.google.com/webmasters/tools/removals
  2. Using the link at the bottom right of the page, create an account
  3. Follow the steps to create an account using your disposable email address
  4. Clear the ‘stay signed in’ and ‘enable web history’ tick-boxes
  5. Verify your ‘identity’ using the link sent to your disposable email address

Fill in the forms

The verification link should take you back to the webpage removal request page. If it doesn’t, you can initiate a new request here.

  1. Choose the first option in the list, “Information or image that appears in the Google search results.”
  2. Click ‘next’
  3. The second page asks whether the page has been modified or removed. Choose the former if personal information has been deleted from a page that still exists; choose the latter if the story has been ‘hidden’ (either unpublished or meta-tagged ‘noindex’).
  4. Click ‘next’

Modified pages

If personal information has been removed from a story, the next page asks for the search terms that would lead someone to that story. For example, if people would find the cached story by searching for ‘myunusualname’, and ‘myunusualname’ (and/or other personal information) has since been removed from it, enter the URL of the modified story in the upper box and the search term (‘myunusualname’) in the lower box.
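Before submitting a search term, it may be worth confirming that the term really has gone from the live page but is still in the cached copy. A rough sketch of that check, again working on saved copies of the two pages; term_status is a made-up helper name, and the search is a plain case-insensitive match over the raw HTML, so it also catches the term in titles, attributes and meta tags:

```python
def term_status(term, cached_html, live_html):
    """Report whether a search term is ready to be submitted for removal."""
    needle = term.casefold()
    in_cached = needle in cached_html.casefold()
    in_live = needle in live_html.casefold()
    if in_live:
        # The term is still served, so a cache refresh would just re-cache it.
        return "still live: edit the page first"
    if in_cached:
        return "ready: term is cached but gone from the live page"
    return "nothing to do: term is in neither copy"
```

Only the ‘ready’ result corresponds to the situation this section describes.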

Removed pages

If a story has been ‘hidden’, its HTML output should include a meta tag that excludes it from indices, so all we need to do is ask Google to honour that by refreshing their cache and index. In this case, only the URL is needed. I repeated this process for both the http:// and https:// versions of the page, but I’m not sure whether that’s necessary.
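If you are unsure whether a hidden story really carries the exclusion tag, you can check a saved copy of its HTML. This is an illustrative sketch only: it assumes the tag is a standard robots meta tag (name ‘robots’ or ‘googlebot’, with ‘noindex’ or its shorthand ‘none’ in the content attribute), which may not be how every site hides pages. The second helper simply lists the http:// and https:// forms of a URL so that both can be submitted; the example.org address in the usage note is a placeholder.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").strip().lower() in ("robots", "googlebot"):
            for directive in attr.get("content", "").split(","):
                self.directives.add(directive.strip().lower())


def has_noindex(html):
    """True if the page asks crawlers not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    # "none" is shorthand for "noindex, nofollow".
    return bool(parser.directives & {"noindex", "none"})


def scheme_variants(url):
    """Return the http:// and https:// forms of the same URL."""
    parts = urlsplit(url)
    return [parts._replace(scheme=s).geturl() for s in ("http", "https")]
```

For example, scheme_variants("https://example.org/story/123.html") gives both addresses to paste into the form, one request each.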

Finish

Log out and clear all Google cookies, form data and saved passwords from your browser. Google claim that these requests take 3-5 days to process, so don’t expect instant results.

   

This is brilliant. Not just because it tells me how to do something I didn’t even know was possible, but also because it’s almost a template for how to write a procedure doc. Thanks.