Archiving Cited Web Pages:

When you cite a Web page in your article -- or in a court opinion -- you run the risk that the page will be gone by the time some reader of yours wants to check it. The obvious solution is to archive the page, either on some site that you've committed to keep up yourself, or on some site run by an organization devoted to this purpose. Likewise, publishers (such as law reviews) may do the same for articles that they publish. And I know there are organizations that do try to do this, at least in theory.

My question relates to the practice: Have any of you done this systematically for your own works? Have you used archiving services, and what has been your experience with them? Are there unexpected technical issues one needs to worry about when doing this? I'd love to know, and I imagine that so would many others.

Izzy (mail):
I use the ScrapBook extension for the Firefox browser.

It works well.
2.3.2007 6:22pm
Guest J (mail):
What are the copyright implications? What if the copyright owner does not want to grant you permission to archive the page?
2.3.2007 6:41pm
Chuck Jackson (mail):
I do a print to PDF. This leaves a label showing the web page and date.
2.3.2007 6:56pm
Wintermute (mail) (www):
When I can, I favor news sources that leave their content up forever at the original link. WaPo over NYT, for example, last I looked.
2.3.2007 7:05pm
Ditto on ScrapBook; it's free but very powerful. It allows you to save not only the page, but also the pages linked to that page. You can organize clips into folders and make notations directly on the pages you've saved.

See Robert Ambrogi's review here.
2.3.2007 7:20pm
Viscus (mail) (www):
Michael Dorf has a blatantly superior post that covers this topic here.

I think the tool he mentions is called WebCite.
2.3.2007 8:30pm
Rich B. (mail):
Whenever I cite to a web page in a court filing, I always print out and file the page. If I am ever "challenged" based on a subsequent change, I can pull out the print-out.

It never occurred to me to "archive" the page electronically.
2.3.2007 9:22pm
Eli Rabett (www):
My practice, although not for legal reasons, is to archive the sites on a CD. Since most of what I need is in Acrobat format, this is sufficient. I suppose at some point I'll have to go to thumb drives, but then the issue of labeling the drive will become a killer.
2.3.2007 10:49pm
Michael M. (www):
To Guest J's question, why would there be copyright implications? You aren't making the archived web page public, you're just saving it. It shouldn't be any different, from a copyright perspective, from saving a newspaper or magazine article.
2.4.2007 7:23am
Wouldn't the Internet Archive be useful for this sort of thing? Not only does it provide a backup in case the page disappears, but it also provides a version history in case it changes.

2.4.2007 7:55am
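The Internet Archive suggestion above can be made concrete with a minimal sketch. It assumes the Wayback Machine's usual address patterns (a 14-digit timestamp followed by the original URL, and an "available" lookup endpoint) as I understand them; the function names are hypothetical, and the code only builds the addresses without contacting any server.

```python
from datetime import datetime
from urllib.parse import urlencode


def wayback_snapshot_url(page_url: str, cited_on: datetime) -> str:
    """Build a Wayback Machine address for the capture nearest to cited_on.

    Assumption: the archive resolves a timestamped address to the closest
    capture it holds; it may hold no capture of the page at all.
    """
    timestamp = cited_on.strftime("%Y%m%d%H%M%S")  # e.g. 20070204075500
    return f"https://web.archive.org/web/{timestamp}/{page_url}"


def wayback_availability_query(page_url: str, cited_on: datetime) -> str:
    """Build the lookup address that asks whether a capture exists near a date."""
    query = urlencode({"url": page_url, "timestamp": cited_on.strftime("%Y%m%d")})
    return f"https://archive.org/wayback/available?{query}"
```

Putting the timestamped address next to the original link in a footnote gives readers a fallback if the page later moves or changes, though only if the archive actually captured it around that date.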
Guest J (mail):
Michael M:

I assumed (perhaps wrongly) that some of the proposed archiving solutions would be publicly readable, serving not just as an archive, but also as a mirror. If not, I suppose my question is moot.
2.4.2007 11:40am
Truth Seeker:
Guest J + Michael M:

I think there's a copyright problem if you just make a copy of something for yourself. For instance, if you just burn a copy of your friend's music CD. Perhaps a printed copy of a web page for yourself would be under fair use, but I think making a mirror site has been held to be a violation.
2.4.2007 1:05pm
Bill Poser (mail) (www):

Eugene Volokh's post poses a question; Michael Dorf's post proposes an answer. How can one be superior to the other?

Bill Poser
2.4.2007 1:41pm
Wait. How does this even count as proof?

I can very, very easily alter whatever file I'm saving. I don't see how it's acceptable in court.
2.4.2007 4:20pm
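One partial answer to the alteration worry above is to record a cryptographic fingerprint of the saved copy at the time of saving. The sketch below is an illustration only, not a statement about what any court accepts: the function names and the record layout are hypothetical, and a fingerprint helps only if it is lodged somewhere the saver cannot quietly change it later (a filing, or a third party).

```python
import hashlib
from datetime import datetime, timezone
from typing import Optional


def fingerprint(content: bytes) -> str:
    """Return the SHA-256 digest of the saved page as a hex string."""
    return hashlib.sha256(content).hexdigest()


def archive_record(url: str, content: bytes,
                   saved_at: Optional[datetime] = None) -> dict:
    """Describe a saved copy: where it came from, when, and its fingerprint."""
    saved_at = saved_at or datetime.now(timezone.utc)
    return {
        "url": url,
        "saved_at": saved_at.isoformat(),
        "sha256": fingerprint(content),
    }


def is_unaltered(record: dict, content: bytes) -> bool:
    """True if content still matches the fingerprint stored in the record."""
    return fingerprint(content) == record["sha256"]
```

A matching fingerprint shows only that the file has not changed since the record was made; it says nothing about whether the copy was faithful to the live page in the first place.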
Eric Crampton (mail):
I use Furl for this purpose - it's pretty handy. It saves any website as it appeared at the time of saving. Any subscription sites are saved such that you can view the saved page, but others cannot; the site's been around for at least 3 years and is reasonably popular, so I'd be surprised if it hadn't survived a copyright challenge.
2.4.2007 4:32pm
Viscus (mail) (www):
Bill Poser,

You are adopting an unreasonable interpretation of what I said. I did not say what Dorf's post is superior to, and I certainly did not say it was superior to Volokh's question.

I stand by my suggestion that the post is blatantly superior. To what?... I refuse to say.
2.4.2007 5:01pm
Visitor Again:
I stand by my suggestion that the post is blatantly superior. To what?... I refuse to say.

Then your use of the comparative "blatantly superior" is meaningless. Motion to strike granted.
2.4.2007 9:49pm
Viscus (mail) (www):
Visitor Again,

My refusal to say why I have used a particular term does not prove that my use of the term "blatantly superior" is meaningless. It only proves that I have not established that it is not meaningless. It could, in fact, be quite meaningful, depending on the unknown reasons that I am using that particular comparative.

The burden is on you to prove that my use of the term "blatantly superior" is in fact meaningless.
2.4.2007 11:43pm
Visitor Again:
Your use of the comparative "blatantly superior" is meaningless to your readers in the absence of your specification of what you are comparing. Your use of those words may not be meaningless to you, since you know what you are comparing, but it is to the rest of us. Since the entire purpose of your message was to communicate to us, your readers--I assume you do not write messages on the VC to communicate with yourself--it is the readers' perspective, not yours, that controls whether your use of these words was meaningless or not. Your use of these words was meaningless.
2.5.2007 7:48am
Guest P:
Guest J hits it. There is a definite privacy violation in saving another's material against his or her wishes.
2.5.2007 11:06am
KeithK (mail):
Guest J hits it. There is a definite privacy violation in saving another's material against his or her wishes.

Can someone actually claim a right to privacy for something that is published on a publicly accessible web site? Intellectual property rights maybe, but not privacy.
2.5.2007 1:51pm
Bill Poser (mail) (www):

Your post may not be meaningless in the strict sense, but insofar as the likely interpretation is ridiculous, no other interpretation suggests itself to any of your readers, and you refuse to provide one yourself, the very best that can be said about your comment is that it conveys no information whatever and is therefore a waste of time. I don't care if you waste your time, but I'd prefer that you not waste mine. If you have nothing to say, go play by yourself.
2.5.2007 4:52pm
guest from TX:
Saying something is a "superior solution" in normal usage implies that it is superior to other solutions. Any reference to the better sticklers would confirm this. I never read the first response as saying that the Dorf post was superior to this one (even if that is what was actually meant), but rather that one should check it out because, after all, it is superior.

Why all the anger?
2.6.2007 10:17am
James of England:
I imagine that people are irritated with Viscus's defense of poor linguistics. Bill Poser is very nearly completely wrong in his first sentence. Viscus's phrase is not completely meaningless. However, since all that it tells us is that in his opinion there is something inferior to Dorf's post, i.e., that Dorf's post is not the worst in existence, it is effectively indistinguishable from being meaningless.
2.6.2007 11:21am
James of England:
Having said that Viscus was being kind of a jerk, I'd like to note that the chief substance of the initial post, the link to Dorf, was useful and interesting to me.
2.6.2007 11:27am