
How to find secret web site pages and content

June 17, 2008

Web site owners often keep pages out of search results using directives in robots.txt. Robots.txt is a plain-text file located in the root directory of a site. It controls which pages a robot may crawl and index, i.e. you can ask search engine robots not to spider a particular page or piece of content. With a 'Disallow' line you can keep any URL of your blog from reaching search engines. Note that this only keeps pages away from well-behaved crawlers: the robots.txt file itself, and the pages it lists, remain publicly readable.
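To make the idea concrete, here is a minimal sketch of how a well-behaved robot might interpret a Disallow line, using Python's standard `urllib.robotparser` module. The sample file contents, the `example.com` URLs, and the helper name `is_allowed` are made up for illustration and do not come from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for this example.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /drafts/secret-page.html
"""


def is_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text lets user_agent crawl url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines directly, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(is_allowed(SAMPLE_ROBOTS_TXT, "http://example.com/index.html"))
    print(is_allowed(SAMPLE_ROBOTS_TXT, "http://example.com/private/notes.html"))
```

The check is purely advisory: nothing stops a person (or a badly behaved robot) from requesting the disallowed URL directly.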

We will use the robots.txt file to see the hidden web site pages and content.

Step 1 – Go to Google and type this in the search box

"robots.txt" "disallow:" filetype:txt

Hit Enter and you will be presented with loads of results for robots.txt files that contain a Disallow directive.
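If you would rather open the search from a script, the query from Step 1 can be URL-encoded as in the sketch below. It assumes Google's usual `https://www.google.com/search?q=...` address format; the function name `google_query_url` is hypothetical.

```python
from urllib.parse import urlencode

# The exact search string from Step 1.
QUERY = '"robots.txt" "disallow:" filetype:txt'


def google_query_url(query: str) -> str:
    """Build a search URL, assuming Google's common ?q= query parameter."""
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    # Paste the printed URL into a browser's address bar.
    print(google_query_url(QUERY))
```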

[Screenshot: Google search results for the query]

Step 2 – From the thousands of results, choose any web site. For example, I will open Microsoft's robots.txt file, which is on the first page of results (highlighted). Once opened, the robots.txt file looks like this:

[Screenshot: Microsoft's robots.txt file]

These are the pages and content which Microsoft doesn't want search engine spiders to index. Now copy any path that follows the word Disallow:

For example, we will copy this line:

/communities/blogs/PortalResults.mspx
 

Remember to copy the slash which is at the beginning of the line.
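Picking out the Disallow lines by hand gets tedious for long files. The sketch below shows one way to list them from robots.txt text you have already saved. The sample text is invented, apart from the one Microsoft path quoted above, and `disallowed_paths` is a hypothetical helper name.

```python
from typing import List

# Invented sample; only the PortalResults.mspx path comes from the article.
SAMPLE = """\
# comment lines are ignored
User-agent: *
Disallow: /communities/blogs/PortalResults.mspx
Disallow: /example/hidden/   # trailing comments are ignored too
Disallow:
Allow: /example/public/
"""


def disallowed_paths(robots_txt: str) -> List[str]:
    """Return every non-empty path that follows a Disallow: field."""
    paths = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "disallow":
            value = value.strip()
            if value:  # an empty Disallow: blocks nothing, so skip it
                paths.append(value)
    return paths


if __name__ == "__main__":
    for path in disallowed_paths(SAMPLE):
        print(path)
```

Each printed path keeps its leading slash, which is what Step 3 needs.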

 

Step 3 – In your browser's address bar, type the main web site URL followed by the line you copied in Step 2.

[Screenshot: browser address bar]

After combining the main web site URL and the line, hit Enter (see the screenshot).

Main URL – http://www.microsoft.com

Line – /communities/blogs/PortalResults.mspx

Combination – http://www.microsoft.com/communities/blogs/PortalResults.mspx
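The same combination can be done in code. This is a small sketch using `urljoin` from Python's standard library; the function name `build_url` is hypothetical.

```python
from urllib.parse import urljoin


def build_url(main_url: str, disallowed_path: str) -> str:
    """Join the site's main URL with a path copied from robots.txt."""
    # Because the path starts with a slash, it is resolved from the site root.
    return urljoin(main_url, disallowed_path)


if __name__ == "__main__":
    print(build_url("http://www.microsoft.com",
                    "/communities/blogs/PortalResults.mspx"))
    # prints http://www.microsoft.com/communities/blogs/PortalResults.mspx
```

This is why the leading slash matters: it tells the browser (and `urljoin`) to start from the top of the site rather than from whatever page you happen to be on.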

[Screenshot: the resulting Microsoft page]

This is the page Microsoft had hidden from the search engines!

This was just an example. You can easily find more interesting web pages and other secret content the same way. Go ahead and try!



{ 9 comments… read them below or add one }

shaiksha February 8, 2009 at 3:30 am

yes this is great trick to view the hidden site in search engine….can you explain me how this process is going on in other websites ……if i want to know the hide information of some any other website then what i can do…..tell me…..


Neel January 19, 2009 at 8:49 am

cool stuff! lots to learn in this SEO world,, : (


Nitesh August 21, 2008 at 4:32 pm

WICKED!!


Diane June 19, 2008 at 12:57 pm

Sensitive information shouldn’t be on the internet! Companies have a responsibility to keep their data safe. There’s a new story about data being lost or stolen every week!


Jakub June 19, 2008 at 9:32 am

Hey this is cool feature, but I think that google should prevent it. Sometimes it may contain sensitive information!

But anyway, great post and I will be careful about it on my pages.


Avenues June 18, 2008 at 7:41 am

WOW.. this is so cool.. easy to find the hidden files from search engines..


web design company June 17, 2008 at 5:49 pm

A great way to find articles and web site web pages which are hidden from search engine…


stratosg June 17, 2008 at 5:45 pm

funny result using your tip on google. http://www.google.gr/linux never seen that page :) nice tip


Steven Finch June 17, 2008 at 4:44 pm

Great Post.

crenk.com

