Collection of Usenet resources, including copy of old rough notes entitled “Saving Usenet”
Usenet’s “home” at Duke shut down on 20 May 2010 [Article in Register UK]
Sascha Segan’s obit, PC Magazine Jul 31, 2008
Greatest hits and netnews classics
Google’s 20-year timeline links to some “famous” posts, and is drawn in ASCII art to boot.
Usenet Info Center Launch Pad (browse the hierarchy, with descriptions)
Usenet chapter in Zen and the Art of the Internet
Aug 2001 update: Google Corporation’s on the hunt for old archives:
usenet has existed since 1979. it is a decentralized, world-around electronic messaging system, and a literary record of our time. it is the greatest archive of human discussion. some of the best writing of the past 20 years exists only on usenet. (of course, so does some of the worst, but that’s got to be a given.) proper reconstruction, archival, and access to usenet is the key to an internet-age literary renaissance [Ed. or not?]
dejanews.com was a proprietary archive of usenet, covering march 1995 onward. in late 1999 it became “deja.com”, with an increasingly hard-to-use interface. google.com bought it in feb 2001 and took it offline until they “fixed” it. it’s great now (spring 2001): a better interface than deja’s, faster, and the entire deja archive is up … but you can’t view plaintext article sources (as you could with deja) [more comments]
[11 dec 2001 great google update: 20 years, 700 million articles, and a sampling of memorable moments…]
We need a public archive, out of the corporate sector. This should be done by the Library of Congress, but that does not seem likely, considering the frightening outlook on etext expressed by the Librarian of Congress. Perhaps ideally, “no one” will archive it — the net (usenet/web/ftp/etc) will be archived via some decentralized p2p scheme.
public archives do not exist
Phil Greenspun mentioned once that his moderated Web-based LUSENET system cost him something like a million-odd dollars to maintain — I wonder what the estimated dollar cost of distributed Usenet is, in world-around servers, disks and other equipment; CPU usage; power consumption, etc. … hundreds of millions, perhaps much more? (Any guesses?)
Two things are necessary: creating as complete an archive of old Usenet as possible, and building a complete, non-corporate archival system for netnews. Recently this problem has increased in urgency: as time goes on it will be harder to obtain old backups or reliable archives, many of which are stored on rapidly degrading media or are forgotten. Many have been destroyed, and some of the people involved have died.
how much disk space will we need?
year  space
----  -----
1979
1980
1981
1982
1983  400MB
1984
1985
1986
1987
1988
1989
1990
1991  1GB
1992  4GB (343,945,617 words)
1993
1994  (word frequency list)
1995
1996
1997  500GB+ (300,000,000 articles)
1998
1999
2000
2001  1.8TB (5GB/day+?)
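The two rows with extra detail can be sanity-checked with a little arithmetic. This is only a rough sketch: the function names are made up for illustration, and it assumes decimal units (1GB = 10^9 bytes, 1TB = 1000GB), which the table does not specify.

```python
def yearly_volume_gb(gb_per_day, days=365):
    """Yearly feed volume in GB, given a daily rate in GB."""
    return gb_per_day * days


def bytes_per_article(total_gb, articles):
    """Average article size in bytes, assuming 1GB = 10**9 bytes."""
    return total_gb * 1e9 / articles


# 2001 row: "1.8TB (5GB/day+?)"
gb_2001 = yearly_volume_gb(5)
print(f"{gb_2001} GB/year, about {gb_2001 / 1000:.1f} TB/year")

# 1997 row: "500GB+ (300,000,000 articles)"
print(f"about {bytes_per_article(500, 300_000_000):.0f} bytes per article")
```

At 5GB/day the yearly total comes to 1825GB, which matches the 1.8TB figure, and the 1997 row works out to roughly 1.7KB per article on average.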
At least a partial reconstruction is possible. Many people kept archives of single groups; still more kept archives of selected posts. Multiple verification and other means of determining authenticity of the data will be absolutely essential — with only one source, it may be possible for parties to inject fictitious articles, threads, etc. into the archive and rewrite history, so to speak.
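One way to picture the multiple-verification idea is to compare copies of the same article from independent sources. The sketch below is only an illustration, not a worked-out design: it assumes each source can be loaded as a mapping from Message-ID to article body, and the names `body_digest`, `cross_check`, and `min_agree` are hypothetical.

```python
import hashlib
from collections import Counter, defaultdict


def body_digest(body):
    """Hash an article body after light whitespace normalization."""
    # Normalize line endings and trailing whitespace so that harmless
    # transport differences do not count as disagreement.
    lines = [line.rstrip() for line in body.replace("\r\n", "\n").split("\n")]
    normalized = "\n".join(lines).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def cross_check(sources, min_agree=2):
    """Label each Message-ID 'verified', 'unverified', or 'conflict'.

    sources: dict mapping a source name to a dict of {message_id: body}.
    """
    digests = defaultdict(Counter)
    for articles in sources.values():
        for message_id, body in articles.items():
            digests[message_id][body_digest(body)] += 1

    report = {}
    for message_id, counts in digests.items():
        if len(counts) > 1:
            # Sources disagree about the content: needs human review.
            report[message_id] = "conflict"
        elif counts.most_common(1)[0][1] >= min_agree:
            # Enough independent copies agree.
            report[message_id] = "verified"
        else:
            # Only one source holds this article; it cannot be confirmed.
            report[message_id] = "unverified"
    return report


if __name__ == "__main__":
    sources = {
        "tape_a": {"<1@x>": "hello\r\nworld  \n", "<2@x>": "only here"},
        "tape_b": {"<1@x>": "hello\nworld", "<3@x>": "original"},
        "tape_c": {"<3@x>": "altered"},
    }
    for message_id, status in sorted(cross_check(sources).items()):
        print(message_id, status)
```

Articles held by a single source are not rejected here, only flagged, since for much of old Usenet one copy may be all that survives.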
First, we need a temporary repository site for building whatever reconstruction we can. The good part about archiving old news (before Deja, so anything before 17 Mar 1995 is fair game) is that there is a finite amount of it. And disk space is always getting cheaper.
We can probably reconstruct a pre-Deja archive in under 300GB. For everything up to 2000, we’ll need a terabyte of disk space. And we’ll need some means of making a backup of this data.
Possibilities: archive.org? leo.org? universities? gnu.org?
John Foust is very interested in this. In a Dec 1999 email about archive.org, he wrote: “The archive appears to be down right now, but it has a simple archive of a few years’ worth of early Usenet news. I think there’s a gap of at least ten years between its archive and Deja News. Henry Spencer supposedly archived everything into the early 90s, but did he save the tapes and shift them to new media in time?”
List of known Usenet archives
- nooz is a relatively new usenet archive.
- Another newish one is the mailgate Usenet archive.
- Norman Yarvin’s hand-picked, classified, and searchable collection of Usenet archives.
- archive.org, believed to have a lot saved; free access, by appointment only
- USENET OldNews Archive (May 1981-82; based on Spencer’s files)
- Cameron Laird’s old news archive list, ca. 1994
- ISC control message archive (alt.config articles, large)
- comp.unix.aix archive
- leo.org archives (some moderated groups)
- Nice unreasonable.org archives
- ftp’able comp.text.sgml archives
- ftp’able “Oldnews” archive.
- Famous “Linux is obsolete” thread from 1992
- “The Breeders Rule!” — I’m amused by this 1993 (pre-Deja) Usenet article…
- … and more. Google search on ‘usenet archives’
Google “author profile”:
- Paul Sand: http://granite.unh.edu/pasnews/ngindex.html
dejaview, a Perl command-line tool that retrieves all the articles from a Deja News (dn) search into one html file.
Slashdot thread: Is Usenet Dying? (decline in posters, but an increase in noise and posts (binaries))
Thread: “The Future of Usenet” [deja]
Thread: “Good Deja News Replacement” [deja]
Related project: “Can we make an Open Source IMDb Replacement?” (LUSENET discussion)
(Design Science License copyleft can be useful for data repositories like these)
Fun analysis: Usenet is about 4% ‘dirty’ (research project: how ‘dirty’ is corporate media?)