Usenet

Collection of Usenet resources, including a copy of old rough notes entitled “Saving Usenet”

RIP

Usenet’s “home” at Duke shut down on 20 May 2010 [Article in Register UK]

Sascha Segan’s obit, PC Magazine Jul 31, 2008

Google Groups

Greatest hits and netnews classics

Google’s 20-year timeline links to some “famous” posts, and is drawn in ASCII art to boot.

Guides

Usenet Info Center Launch Pad (browse the hierarchy, with descriptions)

Using Usenet Effectively by Patrick Salsbury

How To Write a Good Newsgroup Message by Brian Edmonds [more]

Usenet FAQs and netiquette files

Usenet chapter in Zen and the Art of the Internet

Saving Usenet

Aug 2001 update: Google Corporation’s on the hunt for old archives:
 http://groups.google.com/googlegroups/archive_hunt.html

the need

usenet has existed since 1979. it is a decentralized, world-around electronic messaging system, and a literary record of our time. it is the greatest archive of human discussion. some of the best writing of the past 20 years exists only on usenet. (of course, so does some of the worst, but that’s got to be a given.) proper reconstruction, archival, and access to usenet are the key to an internet-age literary renaissance [Ed. or not?]

dejanews.com was a proprietary archive of usenet, from march 1995 to the present. in late 1999 it became “deja.com”, with an increasingly hard-to-use interface. then google.com bought it in feb 2001 and took it offline until they “fixed” it. it’s great now (spring 2001): the interface is better than deja’s, it’s faster, and they’ve got the entire deja archive up … but you can’t view plaintext article sources (as you could with deja) [more comments]

[11 dec 2001 great google update: 20 years, 700 million articles, and a sampling of memorable moments]

We need a public archive, out of the corporate sector. This should be done by the Library of Congress, but that does not seem likely, considering the frightening outlook on etext expressed by the Librarian of Congress. Perhaps ideally, “no one” will archive it — the net (usenet/web/ftp/etc) will be archived via some decentralized p2p scheme.

public archives do not exist

Phil Greenspun mentioned once that his moderated Web-based LUSENET system cost him something like a million-odd dollars to maintain — I wonder what the estimated dollar cost of distributed Usenet is, in world-around servers, disks and other equipment; CPU usage; power consumption, etc. … hundreds of millions, perhaps much more? (Any guesses?)

Two things are necessary: creating a complete-as-possible archive of old Usenet; and building a complete, non-corporate archival system for netnews. This problem has grown more urgent: as time goes on it will be harder to obtain old backups or reliable archives, many of which are stored on rapidly degrading media or have been forgotten. Many have already been destroyed, and some of the people who kept them have died.

how much disk space will we need?

year    space
----    -----
1979
1980
1981
1982
1983    400MB
1984
1985
1986
1987
1988
1989
1990
1991    1GB
1992    4GB (343,945,617 words)
1993
1994    (word frequency list)
1995
1996
1997    500GB+ (300,000,000 articles)
1998
1999
2000
2001    1.8TB (5GB/day+?)

Usenet daily traffic thru Simtel
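
As a rough consistency check on the 2001 row, the daily rate can be converted to a yearly total. This is only back-of-the-envelope arithmetic in Python, assuming a flat 5GB/day for the whole year and 1TB = 1000GB; the variable names are illustrative.

```python
GB_PER_DAY = 5        # figure from the 2001 row; assumed constant all year
DAYS_PER_YEAR = 365

gb_per_year = GB_PER_DAY * DAYS_PER_YEAR
tb_per_year = gb_per_year / 1000  # decimal terabytes (an assumption)

print(f"{gb_per_year} GB/year, about {tb_per_year:.1f} TB/year")
# prints "1825 GB/year, about 1.8 TB/year"
```

So the 1.8TB figure and the 5GB/day figure agree with each other, at least if the daily rate held steady.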

the plan

At least a partial reconstruction is possible. Many people kept archives of single groups; still more kept archives of selected posts. Multiple verification and other means of determining authenticity of the data will be absolutely essential — with only one source, it may be possible for parties to inject fictitious articles, threads, etc. into the archive and rewrite history, so to speak.
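
The multiple-verification idea can be sketched in code. What follows is a minimal Python sketch, not a finished design: it assumes each contributed archive can be read as a list of raw article texts, that articles are keyed by their Message-ID header, and that two copies "agree" when their bodies hash the same after whitespace normalization. The names `fingerprint`, `cross_verify` and `min_sources` are hypothetical.

```python
import hashlib
from collections import defaultdict
from email import message_from_string


def fingerprint(raw_article):
    """Return (message_id, body_hash), or None if there is no Message-ID."""
    text = raw_article.replace("\r\n", "\n")
    message_id = message_from_string(text).get("Message-ID")
    if message_id is None:
        return None
    # Everything after the first blank line is treated as the body.
    body = text.partition("\n\n")[2]
    # Strip trailing whitespace so transport differences are not mismatches.
    normalized = "\n".join(line.rstrip() for line in body.splitlines()).strip()
    digest = hashlib.sha256(normalized.encode("utf-8", "replace")).hexdigest()
    return message_id.strip(), digest


def cross_verify(archives, min_sources=2):
    """archives maps a source name to a list of raw article strings.

    Returns a dict of message_id -> "verified", "single-source" or "conflict".
    """
    seen = defaultdict(lambda: defaultdict(set))  # id -> digest -> sources
    for source, articles in archives.items():
        for raw in articles:
            fp = fingerprint(raw)
            if fp is not None:
                message_id, digest = fp
                seen[message_id][digest].add(source)

    status = {}
    for message_id, digests in seen.items():
        if len(digests) > 1:
            # Same Message-ID, different bodies: someone's copy was altered.
            status[message_id] = "conflict"
        elif len(next(iter(digests.values()))) >= min_sources:
            status[message_id] = "verified"
        else:
            status[message_id] = "single-source"
    return status
```

An article marked "single-source" is not necessarily fake; it is just unconfirmed, and a human would have to decide whether to admit it. Conflicts are the interesting case, since they point at either corruption or tampering.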

First, we need a temporary repository site for building whatever reconstruction we can. The good part about archiving old news (before Deja, so anything before 17 Mar 1995 is fair game) is that there is a finite amount of it. And disk space is always getting cheaper.

We can probably reconstruct a pre-Deja archive in under 300GB. For everything up to 2000, we’ll need a terabyte of disk space. And we’ll need some means of making a backup of this data.

Possibilities: archive.org? leo.org? universities? gnu.org?

New open source distributed Usenet archive project: Delà News [Wired News on the project] [Slashdot thread]

Discuss this project: [delanews forum] [dsl.org forum (experimental)]


John Foust is very interested in this. In a Dec 1999 email about archive.org, he wrote: “The archive appears to be down right now, but it has a simple archive of a few years’ worth of early Usenet news. I think there’s a gap of at least ten years between its archive and Deja News. Henry Spencer supposedly archived everything into the early 90s, but did he save the tapes and shift them to new media in time?”

Foust thinks that members of the Classic Computer Collector mailing list and the PUPS archive would be very helpful in reading old magtape-based archives.

Sep 1999 thread: T. Gryn, Richard E. Hawkins

Story about archive.org and Internet archives in general

List of known Usenet archives

personal archives

Google “author profile”:

http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&safe=off&q=author:email address+

dejaview, a Perl command-line tool that retrieves all the articles from a Deja News search into one HTML file.

Article by Garry Wiegand expressing interest in an archive

Slashdot thread: Is Usenet Dying? (a decline in posters, but an increase in noise and in binary posts)

Thread: “Old Usenet archives, Deja, and Internet history” [deja part 1] [deja part 2]

Thread: “The Future of Usenet” [deja]

Thread: “Good Deja News Replacement” [deja]

John Stevenson is interested in Usenet archives.

Interview with Henry Spencer

archive of usenet history material

Related project: “Can we make an Open Source IMDb Replacement?” (LUSENET discussion)

(Design Science License copyleft can be useful for data repositories like these)

Fun analysis: Usenet is about 4% ‘dirty’ (research project: how ‘dirty’ is corporate media?)

First published on February 18th, 2009 at 6:38 pm (EST) and last modified on June 9th, 2010 at 3:14 pm (EST).