[Organizers] Query: Tools to search for stale links (was Re: is NEFFA LinkFest still active)
Seth Seeger via Organizers
organizers at lists.sharedweight.net
Wed Apr 26 15:07:51 PDT 2017
Someone with a little Linux knowledge could use “wget” or “linkchecker”.
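For anyone curious what such tools do under the hood, here is a rough Python sketch, not a replacement for either of them. It pulls the links out of a page, fetches each one, and sorts the result into the mechanical symptoms Jim lists below (404, 403, server not found, redirect to a home page). The names LinkCollector, classify and check, and the User-Agent string, are made up for illustration. It assumes only the Python 3 standard library.

```python
import urllib.error
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect absolute http(s) targets of <a href="..."> tags (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Resolve relative links against the page they came from.
            url = urljoin(self.base_url, href)
            if urlparse(url).scheme in ("http", "https"):
                self.links.append(url)


def classify(original_url, status=None, final_url=None, error=None):
    """Map one fetch result onto the symptoms listed in the quoted message."""
    if error is not None:
        return "server not found or unreachable"
    if status == 404:
        return "not found (404)"
    if status == 403:
        return "forbidden (403)"
    if status is not None and status >= 400:
        return "HTTP error %d" % status
    if final_url and final_url != original_url:
        old, new = urlparse(original_url), urlparse(final_url)
        # A deep link that lands on "/" was probably lost in a site migration.
        if old.path not in ("", "/") and new.path in ("", "/"):
            return "redirected to a home page (check by hand)"
        return "redirected (check by hand)"
    return "ok"


def check(url, timeout=10):
    """Fetch one URL over the network and classify the outcome."""
    request = urllib.request.Request(url, headers={"User-Agent": "stale-link-sketch"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return classify(url, status=response.status, final_url=response.geturl())
    except urllib.error.HTTPError as exc:   # 4xx/5xx responses
        return classify(url, status=exc.code)
    except (urllib.error.URLError, OSError) as exc:   # DNS failure, timeout, refused
        return classify(url, error=exc)
```

To use it, feed a page's HTML to LinkCollector and call check() on each entry in its links list. A sketch like this only catches the mechanical failures. Parked domains, pages taken over by new owners, and pages that simply stopped being updated all return a normal 200 response, so they still need a human to look at them.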
Seth
> On Apr 26, 2017, at 3:55 PM, James Saxe via Organizers <organizers at lists.sharedweight.net> wrote:
>
> I have a question for any of you who are involved with website
> administration (e.g., for your local dance organization):
>
> Does anyone know of good tools that will automatically go
> through a site searching for stale links?
>
> I've been prompted to make this query in part by the recent
> messages about the NEFFA LinkFest, which is apparently no longer
> maintained because it got to be too much work relative to the
> perceived benefit (particularly relative to perceived benefit
> for the volunteer maintainer(s)). However, I've noticed that
> many local dance groups' websites include lists of external
> links, and that while these lists are typically much smaller than
> the LinkFest (containing perhaps a dozen to a hundred links vs.
> almost 3000 in the LinkFest), they also often are not assiduously
> maintained and so often include at least a few links that no
> longer point to the expected content.
>
> I recognize that the automatic detection of stale links may not
> be a trivial problem, since there can be a variety of different
> symptoms. I've just gone looking through lists of links on
> several different sites. Among the things I've found are:
>
> * Links that fail with Error 404 (Not Found)
>
> * Links that fail with Error 403 (Forbidden: You don't have
> permission to access ...)
>
> * Links that fail because the browser can't find the server.
>
> * Links that get to the site that apparently used to contain
> the desired page but now show a message like "We're sorry,
> but we were unable to locate the page you requested."
> (Perhaps attempts to follow these links actually produce a
> 404 code even if the text displayed to the user doesn't
> include the string "404".)
>
> * Links that go to sites offering to sell you the (expired)
> domain name that included the target page.
>
> * Links to pages that appear to have been taken over by new
> owners and no longer display the original content. Among
> other things, I've found pages full of text in Chinese
> or Cyrillic characters; pages that used to have dance
> information and seem to have been taken over by real estate
> agents or financial service organizations; and pages that
> say "WARNING!! THIS SITE CONTAINS ADULT MATERIALS ..."
> (I did not click on the "Continue" button after the warning
> to see whether it got me to traditional-dance-related
> content).
>
> * Links to pages that admit to being no longer maintained.
>
> * Links to pages that in turn offer a (possibly good) link
> to a new page with the desired content.
>
> * Links that go to what seem to be the original target
> pages, but that also seem not to have been updated in
> several years. (For example, a site might refer to
> "upcoming" events that are now several years in the past.
> One might then wonder whether there's a new, actively
> maintained, site somewhere with similar, but current,
> information.)
>
> If someone creates a link that points to a recent blog entry,
> someone who follows the link several years later might get to
> the now-most-recent page of the blog and only be able to reach
> the originally referenced content by clicking something like
> "older posts" a large--and unknown--number of times. Similarly,
> links to online newspaper or magazine content might now point
> to pages that are evidently full of newer content. The old
> content might or might not be available somewhere, but there
> may be no visible advice about how to find it.
>
> When people migrate their websites, it seems to be common that,
> if links into the old site get redirected at all, they end up
> redirecting to some place like the home page of the new site
> instead of to the exact page (if one exists) corresponding to the
> old target of the link.
>
> Despite these comments about how links can go stale in various
> ways that aren't immediately obvious, it seems to me that stale
> link detection would be a useful feature for a wide variety of
> website administrators--and not just for dance organizers. So
> perhaps someone somewhere has put some effort into doing a good
> job of it. I'd be interested if anyone knows of examples.
>
> Thanks.
>
> --Jim
>
> _______________________________________________
> Organizers mailing list
> Organizers at lists.sharedweight.net
> http://lists.sharedweight.net/listinfo.cgi/organizers-sharedweight.net