Innumeracy
Okay, we can't stand it any more.
We've all laughed at articles about innumeracy, secure
in the knowledge that we're in a profession where we
learned to count, do ``big-O'' calculations, and make back-of-the-envelope estimates. Yes, we are computer
professionals.
Why is it, then, that some of the articles we've seen
about IPv6, the new addressing scheme for the Internet, make
statements about addressing power that seem, well, like the
authors can't tell the difference between a googol and a
googolplex?
One recent report we read claims that IPv6 will let us
individually address every proton on earth. A still-more-recent
book boldly states that the number of possible IPv6 addresses
will be larger than the number of molecules in the universe.
Really?
We could go searching the net for statistics, but a
pencil and used envelope-back are enough to sanity-check these
statements. (By the way, we'll leave their authors anonymous.
They're friends of ours.)
Let's attack the first claim. We'll begin by estimating
the volume of the earth.
A kilometer was originally defined to be 1/10,000 of the
distance from the equator to the North Pole. Even though it's
now defined in terms of a meter, which is, in turn, defined in
terms of light wavelengths, the original definition is close
enough for our purposes. Taking the circumference of the earth
to be around 4x10**4 kilometers, we can calculate the volume from
high-school geometry. Since the circumference is 2{pi}r, the radius
is about (2/{pi})x10**4 km., or (2/{pi})x10**9 cm. The volume of
a sphere is (4/3){pi}r**3, so the earth's volume is about
(32/(3{pi}**2))x10**27 cubic centimeters. The constants nearly
cancel, and leave us with an estimate of about 10**27 cubic
centimeters.
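For readers who would rather let Perl hold the pencil, here is a minimal sketch of the same arithmetic. The function name sphere_volume_cc is ours, invented for illustration, and the only input is the 4x10**4 km circumference used above.

```perl
use strict;

my $pi = 4 * atan2(1, 1);

# Returns the volume, in cubic centimeters, of a sphere whose
# circumference is given in kilometers.
sub sphere_volume_cc {
    my ($circumference_km) = @_;
    my $radius_cm = $circumference_km / (2 * $pi) * 1e5;   # 1 km = 10**5 cm
    return (4 / 3) * $pi * $radius_cm**3;
}

printf "earth's volume: about %.1e cc\n", sphere_volume_cc(4e4);
```

The constant in front works out to 32/(3{pi}**2), or roughly 1.08, which is why we feel comfortable rounding to 10**27.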
How much does something that big weigh? Well, since 1cc
of water weighs about a gram, if the earth were all water it
would weigh 10**27 grams. The earth's water floats on the
surface, so that's a safe underestimate. We suggest that readers
who are curious about how far off we are surf the net to find a
more precise estimate. (See, for example, the fine ``Nine
Planets Tour,'' at http://www.seds.org/billa/tnp.)
Now, how many protons would there be in that much mass?
For this, we need high-school chemistry. Well, we know that one
mole of Hydrogen weighs a gram. Electrons are pretty light, so
to a first approximation, we can say that 6.02x10**23 (Avogadro's
number) protons weigh a gram. Of course, something at the other
end of the periodic table, like Uranium, will have a lot of its
mass in neutrons, but even Uranium is about a third protons.
Since protons and neutrons have about the same mass, a gram of the
most abundant elements, like Iron and Oxygen, which are about
half protons, would still have about 3x10**23 protons, which we
can use as a safe lower bound.
Combining this with our earlier estimate of the weight
of the earth, we have a lower bound of 10**27 g x 3x10**23
protons/g = 3x10**50 protons.
Now, how much addressing power does IPv6 give us?
Let's go back a minute to see what problem IPv6 is
solving. Currently, every IP address has four bytes. These
bytes are the four numbers you see, separated by periods, if you
look in your host table, or when you run a networking application
like ping or telnet. My machine, for example, is 161.33.16.21,
and ftp.uu.net is 192.48.96.9.
Since this is only 32 bits of address space, an IP
network can never have more than 2**32 addresses. (Just like
there can never be more than 10**7 seven-digit phone numbers.)
Basically, the net is growing fast enough that we're going to run
out of addresses. The problem's worse than that, because of the
way addresses are assigned -- people and corporations are
actually assigned big blocks of addresses, which eats up the
address space even faster -- but even if that weren't true, most
folks agree that 32 bits just isn't going to be enough. The web
is growing too fast.
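To make the ``four bytes, 32 bits'' point concrete, here is a small sketch that turns a dotted-quad address into the single 32-bit number it stands for. The helper name quad_to_int is our own; pack and unpack are standard Perl built-ins.

```perl
use strict;

# Packs the four decimal bytes of a dotted quad into one
# 32-bit unsigned integer (network byte order).
sub quad_to_int {
    my ($quad) = @_;
    return unpack("N", pack("C4", split(/\./, $quad)));
}

print "161.33.16.21 is address number ", quad_to_int("161.33.16.21"),
      " out of ", 2**32, "\n";
```

Every address is just one of 2**32 possible numbers, which is where the hard ceiling comes from.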
The new addressing scheme, IPv6, will put overload off a
bit by making addresses four times as large -- 128 bits.
Okay, IPv6 will let us address a lot more things.
(For an analogy, imagine adding an area code to phone numbers
that's three times as big as the phone number.) But could it let
us assign separate IP addresses to every proton on earth?
Nope. 2**10 is about 10**3, so 2**128 is about 10**39,
which means we fall short by a factor of at least 3x10**11. A
mistake of this magnitude is roughly comparable to confusing your
net worth with that of Bill Gates.
The difference is ... well ... astronomical.
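If you don't trust our envelope, a few lines of Perl will check the comparison. This is only a sketch: it uses floating point, which is plenty for an order-of-magnitude estimate, and the proton count is our lower bound of 10**27 grams times 3x10**23 protons per gram. The helper name log10_of is ours.

```perl
use strict;

# Base-10 logarithm, built from Perl's natural log.
sub log10_of {
    my ($x) = @_;
    return log($x) / log(10);
}

my $addresses = 2**128;        # number of possible 128-bit addresses
my $protons   = 1e27 * 3e23;   # lower bound: grams x protons per gram

printf "IPv6 addresses: about 10**%.1f\n", log10_of($addresses);
printf "protons:        about 10**%.1f\n", log10_of($protons);
printf "short by a factor of about %.0e\n", $protons / $addresses;
```

The exact value of 2**128 is a bit under 10**39 (closer to 10**38.5), so rounding up to 10**39 only flatters IPv6. The shortfall is still at least 3x10**11.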
How about the number of molecules in the universe? Most
molecules in the universe are hydrogen atoms, which means that we
could just divide the number of protons on the earth by the
weight of the earth, and then multiply by the weight of the
universe. Unfortunately, we don't yet have a good, back-of-the-envelope estimate for the weight of the universe. If you have a
good suggestion, please send us email at <jeff@rd.qms.com>
or <jsh@rd.qms.com>.
If you don't have one, but are as amused by back-of-the-envelope calculations as we are, see Jon Bentley's March, 1984,
CACM column.
Counting Electoral Votes
How do you feel about the results of the recent
presidential election? The Jeffreys are split: ``Is too!'' ``Is
not!''
Despite the fact that the other Jeff is wrong, we both
had a good time speculating about which states were going to go
to which candidate. Unfortunately, it sometimes got a little
hard to track. ``Okay, now if Colorado goes for Dole, but
Arizona goes for Clinton ....'' What we really needed was a
spreadsheet.
A garden-variety spreadsheet would have worked, but the
spreadsheet interface seemed ugly. What we really wanted was
something that would show us a map, with outlines of the states,
and then let us click on the individual states to assign them to
candidates. We wanted a 2-dimensional, visual spreadsheet that
would color the states, so we could see which candidates had
which ones, and that would keep track of the running electoral
totals for each candidate at the same time.
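The bookkeeping behind those running totals is small. Here is a hedged sketch of how it might look. The hash names, the function tally, and the sample assignments are all made up for illustration, and this is not code from our script.

```perl
use strict;

# Hypothetical data: electoral votes per state, and which
# candidate each state is currently assigned to.
my %elec     = (CO => 8, AZ => 8);
my %assigned = (CO => 'Dole', AZ => 'Clinton');

# Sums electoral votes per candidate over the assigned states.
sub tally {
    my ($elec, $assigned) = @_;
    my %total;
    foreach my $state (keys %$assigned) {
        $total{ $assigned->{$state} } += $elec->{$state};
    }
    return %total;
}

my %total = tally(\%elec, \%assigned);
foreach my $candidate (sort keys %total) {
    print "$candidate: $total{$candidate}\n";
}
```

Reassigning a state is a one-line change to the assignment hash, after which the totals are simply recomputed.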
We'll go through how we built it both because it's fun,
and because it lets us illustrate how to build a CGI application.
Our spreadsheet interface will be a web page.
This month, we'll build a simple, radio-button form that
assigns a state to a candidate.
Next month, we'll expand on that and build a map.
CPAN
Have you heard a lot about the virtues of ``code
reuse''? So have we. Have you seen much of it? We haven't
either. At least, until recently.
Recently, we sat in on a C++ course, which began with
the admission that code reuse was one of the most oversold
virtues of the object-oriented approach. Despite its allure,
code reuse has been largely confined to the stdio.h
model: libraries that are guaranteed to be distributed with
whatever language you are using will be widely used, and treated
as opaque, black-boxes with well-defined interfaces. Everything
else gets built from scratch each time.
In the Perl world, that's changed. A quick visit to
http://www.perl.com/cpan/
will introduce you to the Comprehensive Perl Archive Network -- a
vast and growing array of reusable Perl modules (classes) that
are becoming de-facto building blocks for Perl programmers all
over the world.
To give you a feeling for the convenience and ease-of-use these modules offer, we recently built a Netscape-based news
reader in an afternoon, out of an NNTP module and a CGI module.
It's hard for us to imagine doing that in either raw Perl, or in
any other language.
All of these modules are user-contributed, and the whole
of CPAN is a volunteer effort. Despite -- well, no, probably
because of -- this, CPAN is growing at an astonishing
rate.
CGI.pm
For our application, we'll use the module
CGI.pm, contributed by Lincoln Stein, which lets us
write CGI applications with no muss, and no fuss.
Our script, shown in Figure 1, will create and manage a
form that looks like the one in Figure 2.
We begin with relatively normal code:

    #!/usr/local/bin/perl -w
    require 5.003;
    use strict;    # Perl's equivalent of "lint"

These lines are good ways to start any Perl application.
The opening, ``shebang'' line invokes our perl interpreter with
the all-important -w flag, which warns us of a wild
array of programming errors. There is never a good reason to
start your perl programs without this flag.
The next line states which version of perl we expect to
be running. We have 5.003 installed, so we just require it,
guarding against the possibility that a user will stumble over an
older revision lurking somewhere in their path.
The last of these lines is not necessary, but we like to
use it anyway. The pragma use strict warns us about
very nit-picky problems: undeclared functions, incompletely
scoped variable names, and other things of that ilk. Raw perl,
like raw C, is very easy to write but gives you a lot of rope to
hang yourself with. The strict pragma, like C's
lint program, warns about things that could get you into
trouble. We like to use it, even though it makes us do some work
that we wouldn't otherwise have to do, because we don't always
really know as much about what we're doing as we'd like to
pretend.
The next few lines are the equivalents of C #include directives. These lines:

    # pull in modules we need
    use CGI qw(:all use_named_parameters);

    # now some defined constants
    BEGIN {
        # full names, abbrevs, electoral votes of states
        require "states.pl";
        # list of all candidates
        require "candidates.pl";
    }

are, in effect, Perl for

    #include "CGI.pm"
    #include "states.pl"
    #include "candidates.pl"

Why two different syntaxes, use and
require? Basically, because CGI.pm is a full-blown module, while the other two are just files full of defined
constants.
The syntax use Module LIST lets us specify a
list of symbols that we can use from a class without a package
qualifier. The statement use CGI qw(:all
use_named_parameters); lets us use nearly every function in
CGI.pm without qualification. Instead of having to say
CGI::hr, to get a horizontal rule, we can say just
hr.
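The same import mechanism works for any module, so here is a minimal sketch using List::Util as a stand-in. That choice is ours, made only because the module ships with current Perls; it has nothing to do with the election script.

```perl
use strict;
use List::Util qw(sum);    # import only sum() into our namespace

my @numbers = (3, 9, 4);

# sum() was named in the import list, so no qualifier is needed.
print "total:   ", sum(@numbers), "\n";

# max() was not imported, so we have to spell out the package.
print "largest: ", List::Util::max(@numbers), "\n";
```

Importing everything, as we do with :all, buys brevity at the cost of a more crowded namespace. For a short script we think that is a fair trade.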
We enclose the other two require statements in
a BEGIN { ... } block to get them included at compile
time. (Perl programs are processed in two steps. First, the
programs are checked for syntax and compiled into a sort of byte-code. Then the ``compiled'' programs are run.) We didn't really
need to do this to get the program to work correctly -- the
constants aren't needed until run-time -- but without this, our
use strict; pragma makes the compiler complain bitterly
about undeclared variables, and blocks further compilation.
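To see the two steps in action, here is a small sketch, separate from our script. It records the order in which an ordinary statement and a BEGIN block actually run. The array name @order is ours, and use vars declares it so that use strict stays quiet.

```perl
use strict;
use vars qw(@order);    # declare a package variable so strict accepts it

# An ordinary statement: executed in the second step, at run time.
push @order, "run time";

# A BEGIN block: executed during the first step, as soon as the
# compiler has finished parsing it, even though it appears later.
BEGIN { push @order, "compile time" }

print join(", then ", @order), "\n";
```

Although the BEGIN block comes second in the file, its entry lands first in the list.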
Skipping down a few lines, let's go directly to the statement
my $state = param('state') || 'CO';
When you give a web browser, like Netscape, a URL to visit, the http server on the target machine can tell if the location is a CGI executable, instead of a simple text file. If so, it executes the program, passing in a variety of information, and parses and interprets the output. The information is typically passed in with a peculiar syntax, which the application must parse. For example, if you use AltaVista to search for the nine planets tour, it might generate a location that looks like this:

    http://altavista.digital.com/cgi-bin/query?pg=q&what=web&fmt=.&q=%2B%22nine+planets+tour%22

This means that you're invoking an application called query, with an array of named arguments, concatenated into a single string passed in as the environment variable QUERY_STRING. The application must parse that string, pg=q&what=web&fmt=.&q=%2B%22nine+planets+tour%22, and then use the resulting information.
CGI.pm takes a lot of the work out of this for you. The
call
my $state = param('state') || 'CO';
looks through both standard input and the environment,
parses what it sees, finds the value of the parameter
state, and puts it into the scalar variable
$state. (If the variable is unset or empty, we use `CO'
as a default.)
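To appreciate what that one call saves you, here is a rough, hand-rolled sketch of the parsing. It is not how CGI.pm does it internally. The function names url_decode and parse_query are ours, and the sketch ignores standard input, repeated parameters, and other cases the real module handles.

```perl
use strict;

# Decodes one URL-encoded token: '+' means space, %XX is a hex byte.
sub url_decode {
    my ($text) = @_;
    $text =~ tr/+/ /;
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $text;
}

# Splits a query string into a hash of name => value pairs.
sub parse_query {
    my ($query) = @_;
    my %param;
    foreach my $pair (split /&/, $query) {
        my ($name, $value) = split /=/, $pair, 2;
        $value = '' unless defined $value;
        $param{ url_decode($name) } = url_decode($value);
    }
    return %param;
}

my %param = parse_query('pg=q&what=web&fmt=.&q=%2B%22nine+planets+tour%22');
my $state = $param{state} || 'CO';    # same defaulting idiom as our script

print "q     = $param{q}\n";
print "state = $state\n";
```

The query string here has no state parameter, so $state falls back to `CO', just as it does in our script.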
Whew.
The rest of the program is a single print statement. Each line after the print calls a function from CGI.pm that generates a string containing the proper HTML for that piece of the form. Here's what they do:
header,
Okay, we lied. This doesn't actually generate HTML, it generates the code that tells the server how to interpret the output of the CGI program.
start_html('States'), ... end_html,
These generate all the little HTML tags that a web page needs for starting and ending; things like this: <HTML><HEAD><TITLE>States</TITLE></HEAD><BODY>. (We won't show you more raw HTML. We're using CGI.pm so we don't have to.)
h1("Select a candidate to cast $state_names{$state}'s votes for:"),
This emits a first-level header, translating the 'state' parameter into a human-readable state name along the way.
startform('GET'), ... endform,
These bracket a form. Forms are HTML's way of packaging user input up into something that you can pass to a CGI application. The result routinely looks as ugly as the query we showed you above. Luckily, you're using CGI.pm and won't have to look at it.
radio_group( name=>'candidate', 'values'=>[@candidates], default=>'-', linebreak=>'true', ),
These calls emit code for the radio buttons on the form. All the functions in CGI.pm accept self-identifying parameters. We're taking advantage of them here, both because it makes the code easier to read, and because it saves us from having to remember what order to put the arguments in. We read the list of candidates from candidates.pl, so that we don't have to change the application for every election. The linebreak parameter puts each candidate on a different line (we think that looks nicer), and the default parameter says ``don't make any candidate a default choice.'' Even if he is the one people should be voting for.
submit(name=>"Cast vote"),
This emits code for the submit button. Every form needs at least one. We could have more than one, in which case we could identify which one the user had pressed by its name. By default, the name and the button label are the same, but CGI.pm will let you do almost any odd thing you want. Perl programmers tend to lack a prescriptive mind-set.
hidden(-name=>'state', value=>$state), # remember chosen state
This one's a little subtle. What happens when you press
the submit button we created? Because we didn't specify an
action in the call to startform, the server invokes the
default action: it re-invokes this same CGI script with the new
arguments, taken from the filled-out form. (In this case, our
form is ``filled out'' with button pushes.)
But wait. Re-invoking the script with new arguments
means that we've lost any information that we had when we started
up this form the first time. If we want to remember what state
we're working on, we have to save the information somewhere. We
can either tuck the information away into a file somewhere, then
read the file again on restart, or we can actually put the
information into the filled-out form. We don't, however, want to
put the information anywhere that the user has to see, or that
the user could accidentally change by typing in the wrong spot.
The function call hidden emits code for an invisible,
but filled-out field whose value is transmitted to the
application on submission of the form. Here's where we take the
state information we were passed, and pass it on in turn.
We'd like to expand on this a bit more, but the column's
already too long, so let's defer further discussion to next time,
when we'll tie this to a map.
For now, happy trails.
    #!/usr/local/bin/perl -w
    # $Id: ns.cgi,v 1.1 1997/02/14 18:42:52 jeff Exp $
    require 5.003;
    use strict;    # Perl's equivalent of "lint"

    # pull in modules we need
    use CGI qw(:all use_named_parameters);

    # now some defined constants
    BEGIN {
        # full names, abbrevs, electoral votes of states
        require "states.pl";
        # list of all candidates
        require "candidates.pl";
    }

    if (@ARGV) {
        use FileHandle;
        my $params = shift;
        my $fh = new FileHandle $params
            or die "Couldn't open $params: $!";
        $CGI::Q = new CGI($fh);
    }

    use_named_parameters(1);

    my $state = param('state') || 'CO';

    print
        header,
        start_html('States'),
        h1("Select a candidate to cast $state_names{$state}'s votes for:"),
        startform('GET'),
        radio_group(
            name=>'candidate',
            'values'=>[@candidates],
            default=>'-',
            linebreak=>'true',
        ),
        submit(name=>"Cast vote"),
        # remember chosen state
        hidden(-name=>'state', value=>$state),
        endform,
        hr,
        "$state_names{$state} has $elec{$state} electoral votes\n",
        end_html,
        "\n";