Usenet Newsgroups Part II – What is on that CD and Why Your Hard Drive isn’t a Database

I left off last week sharing the fact that I discovered some Usenet Newsgroup archives on CD-ROM in a box in the garage, which kicked off a reverie about Usenet and the good old days.

My plan for the next post is to actually explore the story arc of Usenet Newsgroups and where they stand today.  But this middle post… because you always put your weakest point in the middle of your presentation I guess… is to look deeper into the CD-ROM I found and what was really on it and what that said about the time.

Net News – 1992 series, number 25

Just to remind you where we left off, I did point out that the CD-ROM, which covered just 10 days of Usenet Newsgroup traffic in the pre-Eternal September time frame and did not include any of the binaries groups, was pretty full.

600 MB of Usenet text

That image is, however, a bit misleading… or, rather, it hides a problem that the original Usenet architecture faced.  What do we see when we get properties within the disk, and specifically what is in that SPOOL directory?

Look at that file count

Almost 186K files that are 340MB in size but which take up 531MB of space on the CD-ROM itself.

I knew exactly what was going on the moment I saw that.  That pulled me back in my career, back to my brief tenure at Jasmine Technologies, a job I got because of my BBS, a time when I started off thinking hard drives were easy and left believing SCSI was a nether dimension where the generally accepted rules of the universe did not apply.

Drive Safely was our motto

In the world of storage, hard drives vary in capacity, but there is generally a default sector size that dictates the minimum amount of space a file can take up on a drive.  An operating system can also only address so many sectors on a single storage device, a limit that is analogous to, but not the same as, the RAM limits different operating systems face based on the size of their memory addressing.

Basically, if you have a big device, and in 1992 a 600MB CD-ROM was pretty damn big, and have a limit on the number of sectors you can address, then you just make the sectors bigger.  So the CD-ROM has a sector size of 2,048 bytes.

If you have a bunch of really small files… if you were doing something like using your file system as a database structure and just storing every message as a file… and many of those messages were only a few hundred bytes but all took up a minimum of 2,048 bytes on the disk because of the sector size… well, you get files that take up a lot more space than their cumulative size might otherwise indicate.
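To put some arithmetic behind that, here is a minimal Python sketch of the rounding involved.  It assumes every file is simply rounded up to a whole number of 2,048 byte sectors and ignores directory and other file system overhead; the function names and the sample sizes are made up for illustration.

```python
SECTOR_SIZE = 2048  # bytes per CD-ROM data sector, as stated above


def allocated_size(file_size, sector_size=SECTOR_SIZE):
    """Return the space a file occupies once rounded up to whole sectors."""
    sectors = (file_size + sector_size - 1) // sector_size  # integer ceiling
    return sectors * sector_size


def slack_report(file_sizes):
    """Return (logical_bytes, allocated_bytes, wasted_bytes) for a list of sizes."""
    logical = sum(file_sizes)
    allocated = sum(allocated_size(size) for size in file_sizes)
    return logical, allocated, allocated - logical


# Hypothetical sample: three short messages and one longer one.
sizes = [300, 700, 1900, 5000]
logical, allocated, wasted = slack_report(sizes)
print(f"logical={logical} allocated={allocated} wasted={wasted}")
# prints: logical=7900 allocated=12288 wasted=4388
```

Run against the numbers from the properties dialog, the gap between 340MB and 531MB spread over almost 186K files works out to roughly 1KB of slack per file, or about half a sector.  That is about what you would expect if message sizes are spread fairly evenly.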

But this was how it was done back in 1979 when they put all of this together.  And not just for Usenet.  Early email server storage was the same way, with every email being a single file on the drive in a directory that indicated the user name of the owner/recipient.  Or something like that.  I used to have the SRI Guide to internet configurations from 1992, but I am pretty sure that got tossed. (Also, SRI will come up again in a later post.)  But there is still some history out there if you are interested.

It was okay at the time.  The internet was small.  There was a limited number of users.  And the UNIX machines driving all of this were among the most capable around and were largely funded directly or indirectly by government research grants.

ARPAnet in 1973

Anyway, it seemed like a good idea at the time.  Later, it had to get fixed and the committee driving all of this came up with a method that did not litter your drive with hundreds of thousands of tiny files taking up more space than they should.  But that came later after it was practically a crisis and hosting Usenet was a durability test.

When you look into the SPOOL directory you find directories for every top level group the server subscribed to (82 in the case of this CD), then under that the sub groups.

Inside the SPOOL directory of alt.games

Inside of those directories… such as XTREK, one of the early online games… there are files that correspond to each message for that ten day period, numbered to preserve ordering, and a thread file that maintains the relationships between some of the messages where one message was a reply to another.

Not much action in alt.games.xtrek
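If you want to poke at that layout programmatically, something like the following Python sketch would do it.  It assumes the directory names are simply the upper-cased parts of the group name, as with XTREK above, and that every article file has an all-digit name.  Both are guesses based on what this one CD shows (longer group names may well be truncated on disk), and the helper names are hypothetical.

```python
from pathlib import Path


def group_to_dir(spool_root, group):
    """Map a group name like 'alt.games.xtrek' to SPOOL/ALT/GAMES/XTREK.

    Assumption: one upper-case directory per dotted part of the group name.
    """
    return Path(spool_root).joinpath(*group.upper().split("."))


def list_articles(group_dir):
    """Return the numbered article files in a group directory, in numeric order.

    Files without an all-digit name (the thread file, for instance) are skipped.
    """
    numbered = [p for p in Path(group_dir).iterdir()
                if p.is_file() and p.name.isdigit()]
    # Sort by number, not by string, so 999 comes before 4046.
    return sorted(numbered, key=lambda p: int(p.name))


print(group_to_dir("SPOOL", "alt.games.xtrek").as_posix())
```

Sorting on the integer value matters, since a plain alphabetical sort would put message 4046 ahead of message 999.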

And the files are all pretty small.

Message 4046 stats

That message itself was a reply to somebody asking how to set up Netrek and being told to go read the FAQ in rec.games.netrek.

Path: sparky!uunet!decwrl!claris!apple!amdahl!fadden
From: fadden@uts.amdahl.com (Andy McFadden)
Newsgroups: alt.games.xtrek
Subject: Re: startup
Message-ID: <beyQ037Tb6o700@amdahl.uts.amdahl.com>
Date: 30 Oct 92 19:41:13 GMT
References: <1992Oct29.221630.17927@msuinfo.cl.msu.edu>
Organization: Amdahl Corporation, Sunnyvale CA
Lines: 13

In article <1992Oct29.221630.17927@msuinfo.cl.msu.edu> remeika@anchovy.cps.msu.edu (Joseph D Remeika) writes:
>
> I don't suppose anyone can post some easy startup
>
>instructions for xtrek ???

(1) read rec.games.netrek instead of alt.games.xtrek
(2) read the FAQ that is posted there (and presumably cross-posted to
news.answers).

-- 
fadden@uts.amdahl.com (Andy McFadden)
[ Above opinions are mine, Amdahl has nothing to do with them, etc, etc. ]

Such was often the state of Usenet back in the day.  You will see that the text includes everything the UUCP services needed to handle and categorize the message, including the “bang qualified” path, which lists the servers you need to go through to get to the person in question.  Before about 1987 email addresses were formatted the same way, until the structure to support the still current user@domain format was adopted.  You need domain lookup for that, which is a whole story in and of itself.
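For the curious, pulling those headers apart takes very little code.  This is a minimal Python sketch, not a full news parser.  It assumes the header block ends at the first blank line and that a line starting with whitespace continues the previous header; the function names are my own, and the sample is trimmed from the message above.

```python
def parse_headers(article_text):
    """Parse the header block (everything before the first blank line)."""
    headers = {}
    last_name = None
    for line in article_text.splitlines():
        if not line.strip():
            break  # a blank line ends the headers
        if line[0] in " \t" and last_name:
            # Continuation line: append it to the previous header's value.
            headers[last_name] += " " + line.strip()
            continue
        name, _, value = line.partition(":")
        last_name = name.strip()
        headers[last_name] = value.strip()
    return headers


def path_hops(headers):
    """Split the bang path into its individual hops."""
    path = headers.get("Path", "")
    return path.split("!") if path else []


SAMPLE = "\n".join([
    "Path: sparky!uunet!decwrl!claris!apple!amdahl!fadden",
    "From: fadden@uts.amdahl.com (Andy McFadden)",
    "Newsgroups: alt.games.xtrek",
    "Subject: Re: startup",
    "Date: 30 Oct 92 19:41:13 GMT",
    "",
    "(1) read rec.games.netrek instead of alt.games.xtrek",
])

headers = parse_headers(SAMPLE)
print(headers["Subject"])
print(path_hops(headers))
```

Each site that relayed an article added its own name to the front of the Path, so reading the hops left to right walks from the machine that stored the message back toward the person who posted it.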

All of which means that, without a reader that can organize things from the SPOOL directory, reading these messages is a bit of a pain in the ass.  You kind of have to just hunt at random, or drag everything over onto your hard drive and grep for things. (The CD-ROM is way too slow to be tolerable in the third decade of the 21st century.)
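If grep is not handy, the same sort of hunt can be sketched in a few lines of Python.  This assumes the SPOOL tree has already been copied to a local directory (the "spool_copy" name in the usage comment is hypothetical) and that the text can be read as Latin-1, which is a guess at the encoding chosen because it never fails to decode.

```python
from pathlib import Path


def search_spool(root, needle):
    """Yield (path, line_number, line) for each line containing needle.

    The match ignores case.  Every regular file under root is searched.
    """
    needle = needle.lower()
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        # Assumption: Latin-1 is close enough for old 8-bit message text.
        with path.open(encoding="latin-1") as handle:
            for number, line in enumerate(handle, start=1):
                if needle in line.lower():
                    yield path, number, line.rstrip("\r\n")


# Example usage against a local copy of the CD's SPOOL directory:
# for path, number, line in search_spool("spool_copy", "ringworld"):
#     print(f"{path}:{number}: {line}")
```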

I did find a few interesting items.  I liked this one about Dyson spheres and Ringworld

Path: sparky!uunet!van-bc!mdavcr!ewm
From: ewm@mdavcr.mda.ca (Eric W. Mitchell)
Newsgroups: alt.startrek.creative
Subject: Re: Dyson's Spheres
Message-ID: <3828@jeff.mdavcr.mda.ca>
Date: 27 Oct 92 02:25:42 GMT
References: <1c66fcINNlv7@usenet.INS.CWRU.Edu>
Organization: MacDonald Dettwiler, 13800 Commerce Parkway, Richmond, BC, Canada V6V 2J3
Lines: 94

ag154@cleveland.Freenet.Edu (Jeremy C. Radwan) writes:
: 
: In a previous article, m363@quads.uchicago.edu (Marco Temaner) says:
: 
: >
: >Please email any and all factual information about Dyson's Spheres
: >(within reason). Thanks.
: >
: >Marco
: >

<stuff deleted>
: 
: Now, if you contained a SUN inside a sphere, the inside would get VERY hot, 
: thus making it very improbable that plants and people could live in it.
: 
: Hey, I know, it's Star Trek...


I agree that a Dyson's sphere has many problems, not the least of which
is that it requires artificial gravity on the inside so everything stays
in place. You can't spin it for gravity because it would stretch out
flat, and the gravity would decrease to zero at the poles anyway.

A much more practical idea than the Dyson's Sphere is Larry Niven's...

Ringworld 
=========

With the mass of Jupiter (or so) to work with, you build a ring 
around the sun about 1,000,000 kilometers wide and the diameter 
of Earth's orbit.

Put a wall about 1,000 kilometers high on the "top" and "bottom" inside
edges. Spin it for gravity. Pour an atmosphere into the inside.
Terraform with dirt and stuff. Depending on how thick the ring is,
you can make big features in the ground by actually building the
relief into the ring.

Profile View:
                                            1000 km (wall)
                                                > <
                                                 _
                                                  |  ^
  *    <----- Earth's Orbit ------->              |  1,000,000 km (ring width)
             (ring diameter)                     _|  v


 Sun                                          Ringworld




The tensile strength of the metal required (Niven calls it "Scrith")
is absolutely incredible due to the force caused by the spin. The ring
is also unstable, so you would need rockets attached at various points
on the rim to maintain stability.

Night and day can be created by connecting a series of large opaque
squares with cables in a ring between the Ringworld and the sun. The
shadow of the square on a section of Ringworld makes it night. Niven used
completely opaque squares, so the transition from day to night was
virtually instantaneous. Personally, I would use a grating or something
at the edges which gradually blocks out the light as it rotates above
you until it is dark.

Even at night, things are quite bright, however, because all the other
illuminated regions on the Ringworld are visible (you are on the inside,
remember?). Even during the day, these would be very prominent.
(The dwellers on the Ringworld called this the "Arch"). 
You could control how many regions were illuminated (make them further
apart, for instance) by adjusting the size of the squares and the
relative angular speed of the shadow ring and the Ringworld.

The surface area of this monster is millions of times that of the Earth
(I think it was 4 million X or something - I can't remember exactly). If
you peeled the earth and dropped it on the Ringworld then looked away,
you would never be able to find it again. More than enough room, huh?

For more information, and a very entertaining read, I suggest you
buy the novel "Ringworld" and the sequel "The Ringworld Engineers".


Enjoy,


Eric
-- 
==========================================================================
# Eric Mitchell | "'Cuz it's free!!!" #
# MacDonald Dettwiler | #
# Ph: (604) 278-XXXX | - L.A. looter #
# Fax: (604) 278-XXXX |-----------------------------------
# Email: ewm@mda.ca | Standard disclaimers apply. #
==========================================================================

Also, what an age we lived in back then, where people put all their contact info into a message that they then cast out into the wilds of Usenet.  I have X’d out some of the phone digits, but holy moly, Eric there thought that was a good idea back in late 92.

I too used to post with a .sig file that did not include my phone number, but had my full real name and was from my work domain, something I later regretted.  Nothing like an employer using Google to check you out and finding you arguing about video games or what not on the internet.  Wilhelm Arcturus, as you may have guessed, is my online pseudonym, used in part because my Usenet history came up in an interview once… not in a bad way… and I decided I would disassociate my real self from video games.  Since then, most of those Usenet references have fallen way down or completely off of search engine radar and I am in what I hope will be my last position before I retire, so I am less concerned these days.  But still, I am sure it wasn’t a good look back in the early 2000s when I wanted another job and couldn’t get in the door anywhere… but we’ll get to that story later.

The CD-ROM itself does have some other directories.

Here we go

It even includes directories for readers capable of supporting the format for Mac, UNIX, and MS-DOS.  Sort of.  Mac is there, and if you have a DEC MIPS system or a SUN SPARC system you’re covered.  The PC directory promises a reader some day soon.  No luck for me there.

The CD case booklet describes what is on the CD.

You’re going to have to click to read it

Since I am meandering to an end here, I might as well include the other aspect shots of the CD case to give the full effect.

There is the back of the case.  I did not take the case apart to get the doc out, so you can see in the plastic reflection the outline of the hole in the phone stand I use to take pictures of docs.

What it promises on the back of the case

And it wouldn’t be software of any sort if there wasn’t a disclaimer inside that you could only read once you had paid for and opened up the case.

As always, it ain’t our fault

Not a lot of story to this post.  But if it sparks any interest in you, I copied the files off of the CD-ROM into a directory on my drive and zipped them up into an archive of a little over 200MB, which you will find at this link.

There are no warranties, express or implied if you click that link and download that file.  You do so at your own risk.

Next time, some tales of Usenet, its rise, fall, using it today, how it foreshadowed our current internet, and some of those whose fame was anchored in it.


1 thought on “Usenet Newsgroups Part II – What is on that CD and Why Your Hard Drive isn’t a Database”

  1. PCRedbeard

    I’ve never been interviewed –and definitely have not had anybody track down my online activity and admitted as much– but I can now appreciate your position.

    A coworker at the software company I worked at in the late 90s was fired for griping about some of the graphical decisions my company made on USENET. His points were valid, and he didn’t use a company email or even reference his employer, but people put two and two together and discovered it was him.

    For me, my USENET activity was kind of limited to a few newsgroups, and in those I was more of a lurker and didn’t get involved with discussions unless I felt confident in my knowledge of the subject. Still, since we didn’t have an online presence at home during a lot of those years –why pay for it when my company provides it?– I used my work email a lot.

    I can still search based on that and find the occasional nugget that remains, but I have to really hunt for it nowadays.

    As for the hard drive/CD-ROM drive issue, I ran into this exact problem with the ISO for Windows XP back in the day.

    Because a previous employer was a Microsoft something-or-other, we were entitled to install Windows XP on a home PC without paying for it. Given the cost of Windows at the time, it was a nice boon.

    So I downloaded the ISO, burned it onto a CD-ROM, and used it as the installation disk for a PC I was building for my family at the time. All was well and good, until I started getting some gigantic errors during boot time and I couldn’t figure out what was going on. The file that it said was “missing” was there, so I had no idea what weird Windows shenanigans were going on. It took a lot of debugging, but I finally figured out that the file was supposed to be something like 300-ish bytes but was 2048 on the ISO. Copying over the file with the correct size from another XP installation fixed the issue.

    Of course, that also meant that I had a busted installation disk because of one little problem with the ISO.
