All my spiders were taking all the content from a website in a single visit, starting from the beginning.

It seems that the idea of remembering which URLs “produce” links with content is not such a bad one.
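A minimal sketch of what such remembering could look like, assuming the spider can already tell which of the links found on a page lead to content. The class name `HubMemory` and its methods are hypothetical, made up for illustration, not taken from any of my spiders.

```python
from collections import defaultdict


class HubMemory:
    """Remembers how many content links each visited URL has produced.

    Hypothetical helper for illustration only.
    """

    def __init__(self):
        # Maps a source URL to the total number of content links found on it.
        self.produced = defaultdict(int)

    def record(self, source_url, content_links):
        """Store the outcome of one visit to source_url."""
        self.produced[source_url] += len(content_links)

    def best_sources(self, limit=10):
        """Return up to `limit` URLs that produced content, most productive first."""
        ranked = sorted(self.produced.items(), key=lambda item: item[1], reverse=True)
        return [url for url, count in ranked[:limit] if count > 0]


memory = HubMemory()
memory.record("http://example.com/index", ["/post/1", "/post/2"])
memory.record("http://example.com/about", [])
print(memory.best_sources())
```

On the next visit the spider could start from `best_sources()` instead of from the beginning.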

Here is what I found for diri.bg, a local Bulgarian search engine.

I see that diri.bg hasn’t removed this page from their index:

show_categories.php

even though I have no links to this page. Check the result here

Oops. Google does it the same way: here

Then how do I get rid of old pages without leaving “bad” links on the internet?

I will try putting show_categories.php in robots.txt to see what happens with this page.
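A rough sketch of the rule I have in mind, checked with Python’s standard `urllib.robotparser`. It assumes that show_categories.php sits at the site root, and `example.com` is only a placeholder domain. The check shows how a well-behaved crawler would read the rule; whether the search engine then drops the page from its results is a separate question.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content; the path presumes the page is at the site root.
ROBOTS_TXT = """\
User-agent: *
Disallow: /show_categories.php
"""


def is_allowed(robots_txt, url, user_agent="*"):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# example.com is a placeholder, not the real site.
print(is_allowed(ROBOTS_TXT, "http://example.com/show_categories.php"))
print(is_allowed(ROBOTS_TXT, "http://example.com/index.php"))
```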