Usenet

Collection of Usenet resources, including a copy of old rough notes entitled “Saving Usenet”

RIP

Usenet’s “home” at Duke shut down on 20 May 2010 [Arti­cle in Reg­is­ter UK]

Sascha Segan’s obit, PC Mag­a­zine Jul 31, 2008

Google Groups

Greatest hits and netnews classics

Google’s 20-year time­line links to some “famous” posts, and is drawn in ASCII art to boot.

Guides

Usenet Info Cen­ter Launch Pad (browse the hier­ar­chy, with descrip­tions)

Using Usenet Effectively by Patrick Salsbury

How To Write a Good Newsgroup Message by Brian Edmonds [more]

Usenet FAQs and neti­quette files

Usenet chap­ter in Zen and the Art of the Inter­net

Saving Usenet

Aug 2001 update: Google Cor­po­ra­tion’s on the hunt for old archives:
http://groups.google.com/googlegroups/archive_hunt.html

the need

usenet has existed since 1979. it is a decentralized electronic world-around messaging system, and a literary record of our time. it is the greatest archive of human discussion. some of the best writing of the past 20 years exists only on usenet. (of course, so does some of the worst, but that’s got to be a given.) proper reconstruction, archival, and access to usenet is the key to an internet-age literary renaissance [Ed. or not?]

dejanews.com was a proprietary archive of usenet, holding articles from march 1995 to the present. in late 1999 it became “deja.com”, with an increasingly hard-to-use interface. google.com bought it in feb 2001 and took it offline until they “fixed” it. it’s great now (spring 2001): the interface is better than deja’s, it’s faster, and they’ve got the entire deja archive up … but you can’t view plaintext article sources (as you could with deja) [more comments]

[11 dec 2001 great google update: 20 years, 700 mil­lion arti­cles, and a sam­pling of mem­o­rable moments]

We need a pub­lic archive, out of the cor­po­rate sector. This should be done by the Library of Con­gress, but that does not seem like­ly, con­sid­er­ing the fright­en­ing out­look on etext expressed by the Librar­i­an of Con­gress. Per­haps ide­al­ly, “no one” will archive it — the net (usenet/web/ftp/etc) will be archived via some decen­tral­ized p2p scheme.

pub­lic archives do not exist

Phil Green­spun men­tioned once that his mod­er­at­ed Web-based LUSENET sys­tem cost him some­thing like a mil­lion-odd dol­lars to main­tain — I won­der what the esti­mat­ed dol­lar cost of dis­trib­uted Usenet is, in world-around servers, disks and oth­er equip­ment; CPU usage; pow­er con­sump­tion, etc. … hun­dreds of mil­lions, per­haps much more? (Any guess­es?)

Two things are necessary: creating a complete-as-possible archive of old Usenet; and building a complete, non-corporate archival system for netnews. Recently this problem has increased in urgency; as time goes on it will be harder to obtain old backups or reliable archives, many of which are stored on rapidly-degrading media or are forgotten. Many have been destroyed, and some of the people who kept them have died.

how much disk space will we need?

year		space
----		-----
1979
1980
1981
1982
1983		400MB
1984
1985
1986
1987
1988
1989
1990
1991		1GB
1992		4GB (343,945,617 words)
1993
1994		(word frequency list)
1995
1996
1997		500GB+ (300,000,000 articles)
1998
1999
2000
2001		1.8TB (5GB/day+?)

Usenet dai­ly traf­fic thru Sim­tel
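As a rough sanity check on the 2001 row of the table above, the daily figure can be multiplied out to a yearly one. This is only a sketch: it assumes a steady feed of 5GB/day all year and decimal units (1TB = 1000GB), neither of which the notes state, and the function name is made up for illustration.

```python
# Back-of-the-envelope check: does ~5 GB/day add up to ~1.8 TB/year?
# Assumptions (not from the notes): constant daily volume, 1 TB = 1000 GB.

def yearly_volume_gb(gb_per_day, days=365):
    """Total volume in GB for a steady daily feed over `days` days."""
    return gb_per_day * days

feed_2001_gb = yearly_volume_gb(5)
print(f"{feed_2001_gb} GB/year, about {feed_2001_gb / 1000:.1f} TB")
```

Under those assumptions the two numbers in the 2001 row agree with each other; a feed that grows during the year would push the total higher.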

the plan

At least a par­tial recon­struc­tion is pos­si­ble. Many peo­ple kept archives of sin­gle groups; still more kept archives of select­ed posts. Mul­ti­ple ver­i­fi­ca­tion and oth­er means of deter­min­ing authen­tic­i­ty of the data will be absolute­ly essen­tial — with only one source, it may be pos­si­ble for par­ties to inject fic­ti­tious arti­cles, threads, etc. into the archive and rewrite history, so to speak.
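One possible way to approach the multiple-verification idea is to accept an article only when independent sources hold matching copies of it. The sketch below is an illustration, not a worked-out design: it assumes articles are keyed by Message-ID, that copies are compared by a SHA-256 digest of lightly normalized text, and that two agreeing sources are enough. The function names, the source names, and the threshold are all hypothetical.

```python
import hashlib
from collections import Counter, defaultdict

def digest(article_text):
    """Hash an article after normalizing line endings and trailing whitespace,
    so that trivial transport differences do not count as disagreement."""
    lines = article_text.replace("\r\n", "\n").split("\n")
    normalized = "\n".join(line.rstrip() for line in lines).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def corroborated(sources, min_sources=2):
    """sources maps a source name to {message_id: article_text}.
    Returns {message_id: article_text} for articles whose most common
    version appears in at least `min_sources` different sources."""
    votes = defaultdict(Counter)   # message_id -> digest -> number of sources
    texts = {}                     # (message_id, digest) -> one copy of the text
    for articles in sources.values():
        for message_id, text in articles.items():
            d = digest(text)
            votes[message_id][d] += 1
            texts[(message_id, d)] = text
    accepted = {}
    for message_id, counter in votes.items():
        d, count = counter.most_common(1)[0]
        if count >= min_sources:
            accepted[message_id] = texts[(message_id, d)]
    return accepted

# Hypothetical example data: three private archives, one with an altered copy.
example = {
    "tape_a": {"<1@site>": "hello world\r\n"},
    "tape_b": {"<1@site>": "hello world\n", "<2@site>": "seen only once"},
    "tape_c": {"<1@site>": "rewritten history"},
}
print(sorted(corroborated(example)))
```

An article seen in only one source is not rejected as false here, only left unconfirmed; a real archive would probably want to keep such articles and flag them rather than drop them.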

First, we need a tem­po­rary repository site for build­ing what­ev­er recon­struc­tion we can. The good part about archiv­ing old news (before Deja, so any­thing before 17 Mar 1995 is game) is that there is a finite amount of it. And disk space is always get­ting cheap­er.

We can prob­a­bly recon­struct a pre-Deja archive in under 300GB. For every­thing up to 2000, we’ll need a ter­abyte of disk space. And we’ll need some means of mak­ing a back­up of this data.

Pos­si­bil­i­ties: archive.org? leo.org? uni­ver­si­ties? gnu.org?

New Open source dis­trib­uted USENET archive project: Delà News [Wired News on the project] [Slash­dot thread]

Discuss this project: [delanews forum] [dsl.org forum (experimental)]


John Foust is very interested in this. In a Dec 1999 email about archive.org, he wrote: “The archive appears to be down right now, but it has a simple archive of a few years’ worth of early Usenet news. I think there’s a gap of at least ten years between its archive and Deja News. Henry Spencer supposedly archived everything into the early 90s, but did he save the tapes and shift them to new media in time?”

Foust thinks that mem­bers of the Clas­sic Com­put­er Col­lector mail­ing list and the PUPS archive would be very help­ful in read­ing old mag­tape-based archives.

Sep 1999 thread: T. Gryn, Richard E. Hawkins

Story about archive.org and Inter­net archives in gen­er­al

List of known Usenet archives

personal archives

Google “author pro­file”:

http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&safe=off&q=author:email address+
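The “email address” part of that URL is a placeholder, and a real address needs percent-encoding before it goes into a query string. A minimal sketch of building such a URL follows; the query parameters are copied from the pattern above, the function name is hypothetical, and whether Google still answers URLs of this shape is not something these notes can confirm.

```python
from urllib.parse import urlencode

def author_profile_url(email):
    """Build a Google Groups author-search URL following the pattern above.
    urlencode percent-encodes ':' and '@' in the query value."""
    params = {
        "hl": "en",
        "lr": "",
        "ie": "UTF-8",
        "safe": "off",
        "q": f"author:{email}",
    }
    return "http://groups.google.com/groups?" + urlencode(params)

# user@example.com is a stand-in address, not a real poster.
print(author_profile_url("user@example.com"))
```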

dejaview, a Perl command-line tool that retrieves all the articles from a Deja News search into one HTML file.

Arti­cle by Gar­ry Wie­gand express­ing inter­est in an archive

Slash­dot thread: Is Usenet Dying? (decline in posters, but an increase in noise and posts (bina­ries))

Thread: “Old Usenet archives, Deja, and Inter­net history” [deja part 1] [deja part 2]

Thread: “The Future of Usenet” [deja]

Thread: “Good Deja News Replace­ment” [deja]

John Steven­son is inter­est­ed in Usenet archives.

Interview with Henry Spencer

Archive of Usenet history material

Relat­ed project: “Can we make an Open Source IMDb Replace­ment?” (LUSENET dis­cus­sion)

(Design Sci­ence License copy­left can be use­ful for data repositories like these)

Fun analy­sis: Usenet is about 4% ‘dirty’ (research project: how ‘dirty’ is cor­po­rate media?)

First published on February 18th, 2009 at 6:38 pm (EST) and last modified on June 9th, 2010 at 3:14 pm (EST).

