Nailing Down the Jell-O:
Three Years of Running a Specialized Social Science Reference Site
Craig McKie
Department of Sociology and Anthropology
Carleton University
Fredericton: October 6, 1997
I am here today to talk about my
experiences in inventing, developing, administering and (when
it got too big) migrating a social science reference site over
the past three years. I am characterizing the process as akin
to nailing Jell-O to a wall since a permanent state of flux is
the rule of the day, at least in my experience. Links go dead
without notice; ISPs change operating systems and versions of
the web server software without notice; they change the location
of web logs without notice; and important new links and sites
appear most often without notice. The net effect is not the familiar
thought of Chairman Gates "where would you like to go
today?" but rather "where is where you
would like to go today?"
History & Experience
In the fall of 1994, I handed out
on paper a list of ten useful resource web sites in my graduate
seminar in research design and analysis. I had been putting off
the major day of reckoning when I was going to learn how to set
up a gopher site. Almost as an afterthought to the distribution
of the piece of paper, I abandoned the gopher site idea and undertook
to set up a small web site in my account area which they could
then use to easily access the ten sites. I got on the phone to
likely advisers in the university community and they told me how
to set up a public files area in my UNIX account and how to set
the permissions for the files so that others could use them. I
really at that stage had no idea what I was letting myself in
for. I told my class about it in the next class period and urged
the students to try it out. Reaction was good.
The following Friday evening, the
baseball season having ended prematurely, I decided to spruce
the site up a bit. I visited a nice looking site and copied the
code in toto. This allowed me to deduce how one embedded
images and set backgrounds and chose colours and so on. Then I
hauled out Corel Draw and experimented with design. I chose lurid
colours which people later on loved to complain about (generally
they had 256k video cards and 16 unsatisfactory colours to work
with in their displays). The dark purple colours drove such visitors
wild since they couldn't see anything except swirls of purple
and black. They did not hesitate to inform me of this discomfort
which indirectly let me know that people out there were using
the site which I had dubbed the "Universal Codex for the
Social Sciences" as, I thought at the time, a not very subtle
attempt at humour, given the meagre number of sites which was then
included. I also began a parallel site completely unenhanced with
colourful distractions for the video card disadvantaged.
When the list reached twenty or so
uncategorized links, I decided to submit it to the Yahoo
site. At that time it barely had 2000 links listed. I had found
a reference to it on an underground Japanese web page, two or
three weeks after it started up. In due course I got back a personal
message from the proprietors then operating Yahoo out of a dorm
room at Stanford or Berkeley. To my everlasting regret, this message
has been lost. I would now rank it with my signed artwork for
Byte Magazine covers from 1979 in collectibles and sentimental
value. Today, my research resources site, the direct successor
to the original page, is the most consulted link for the social
sciences at Yahoo. But a lot of water has gone over the dam since
those early days barely three years ago. I have had to learn on
the fly, as it were, and this afternoon I would like to share some
of my insights with you here.
The Present
Since the early days, the Research
Resources page has been through many changes, most notably the
introduction of 18 subsections. Its total size has grown to 130k
of links and the next version will add about 30k more as it will
be greatly expanded and augmented. The original site is fixed
in content but I am still getting 1000 hits a day on the colourful
and the plain vanilla versions I left in place there. The mirror
site at UNESCO in Paris is heavily used and the main site in Toronto
has experienced heavy traffic as well though I do not have log
analysis software at either of the latter two sites to understand
the nature of the traffic.
The traffic does come from all over
the world and transcends the intended user group, the social science
research community. I regularly get emailed requests for assistance,
only some of which I can find the time to respond to. Still, the
origins of these requests are intriguing. I particularly remember
one from the research department of the BBC World Service
in London, for instance.
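
For those who do have access to their server logs, a first look at
where the traffic comes from need not wait for a log analysis package.
What follows is only a minimal sketch, not the method used at any of
my sites. It assumes the server writes the NCSA Common Log Format and
resolves visitor addresses to host names; the function name
hits_by_tld and the sample host names are hypothetical.

```python
import re
from collections import Counter

# Assumption: one request per line in NCSA Common Log Format, e.g.
# host ident user [time] "request" status size
CLF = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)'
)


def hits_by_tld(lines):
    """Count requests per top-level domain of the visiting host."""
    counts = Counter()
    for line in lines:
        match = CLF.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        host = match.group("host")
        last = host.rsplit(".", 1)[-1]
        # Bare IP addresses and dotless names carry no domain information.
        if "." not in host or last.isdigit():
            counts["unresolved"] += 1
        else:
            counts[last.lower()] += 1
    return counts


if __name__ == "__main__":
    sample = [
        'reader.example.uk - - [06/Oct/1997:10:00:00 -0400] "GET /index.html HTTP/1.0" 200 1043',
        '192.0.2.7 - - [06/Oct/1997:10:00:05 -0400] "GET / HTTP/1.0" 200 512',
    ]
    for tld, hits in hits_by_tld(sample).most_common():
        print(tld, hits)
```

A tally by top-level domain is crude, but it is enough to show whether
the readership is as international as the incoming email suggests.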
Insights:
1. Enlist your users
The research resources page is a
place where stranger-to-stranger information exchanges are mediated.
When present, the strangers can be asked if they have anything
to contribute to enrich the marketplace. Many do so.
2. Validation sequences
I have learned never to trust the
address of an interesting looking link which falls into my hands.
I test out every one of them for an initial editorial look-see.
Then I test it again when it is entered on the revised page but
before it is seen by the public. In addition, as often as I can
manage, I run the page through the Dr. HTML site, which checks
for broken links, down servers and other misfortunes. It is just
invaluable. It works only for pages below a certain size limit,
which was a powerful reason for me to split up the main file into
manageable chunks.
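
The kind of check described above can be approximated locally. The
following is a minimal sketch, not a description of how the Dr. HTML
service works: it assumes the page is plain HTML with absolute http
links, and the names LinkCollector, extract_links, probe and
find_dead_links are all hypothetical. It uses only the Python standard
library.

```python
from html.parser import HTMLParser
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen


class LinkCollector(HTMLParser):
    """Collect the absolute http(s) targets of <a href="..."> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value.startswith(("http://", "https://")):
                self.links.append(value)


def extract_links(html):
    collector = LinkCollector()
    collector.feed(html)
    return collector.links


def probe(url, timeout=10):
    """Return the HTTP status for url, or None if the server is unreachable."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        return err.code  # the server answered, but with an error status
    except (URLError, OSError):
        return None  # DNS failure, refused connection, timeout and so on


def find_dead_links(html, probe=probe):
    """List (url, status) pairs for links that fail or return an error."""
    dead = []
    for url in extract_links(html):
        status = probe(url)
        if status is None or status >= 400:
            dead.append((url, status))
    return dead
```

Calling find_dead_links on the text of a page before it goes public
covers the second of the two checks; the probe argument is there so
that a different test (or a canned one) can be substituted. Some
servers refuse HEAD requests, so a link reported here still deserves a
look by hand before it is dropped.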
3. Take care with institutional facilities and identities
I have always been concerned about
this issue. A successful web site generates traffic for my university's
server. If you are not sensitive to the traffic costs you incur,
the administrators can be difficult. I had a no-advertising rule
in anticipation of a negative reaction from my university hosts.
When it became feasible and necessary
to migrate and accept commercial sponsorship, I argued that I
could not do this on university facilities and that the sponsorship
message should be low-key and non-intrusive. I hope that I have
succeeded in striking the right balance. I did, for instance, have
to argue the size and placement of the ads down to tasteful dimensions.
Having said this, however, I do host
some specific pages of research resources for immigration topics
and refugee resources which were paid for by a government department
(though their sponsorship is not acknowledged on the pages themselves).
4. Spin-offs
The creation and development of my
research page led to my writing two "how-to" books, and
the second, published this summer by McGraw-Hill Ryerson [Using
the Web for Social Research], in turn led me to accept commercial
sponsorship for redevelopment of the graphics content on a commercial
ISP, where it now resides at an undisclosed location
in downtown Toronto. The practical difficulty for me is that
the files are now way off campus and I have to do the standard
UNIX file transfers after revisions take place. This has greatly
slowed the process. I understand that if your server is using
Windows NT, then you can use FrontPage to manage your links effectively,
but commercial ISPs often use a standard UNIX setup and in any
case I did not have a say in the choice of a service provider.
I am much impressed with FrontPage and what it can do, and I think
my site would be better if I could use it, but I can't. It is, however,
something to keep in mind.
Other major sites have had spin-offs
as well. SOSIG in England for instance published a user guide
pamphlet and gives it away. They also run training sessions for
which they charge. There is such a pressing necessity for training
in the use of the Web tools that I think this avenue is very promising.
And of course there is the Web course for credit concept, with which
I do not yet feel at ease, though perhaps some of you do. I
am more than willing to try it out, and a colleague in social work
at Carleton is now giving a course in this fashion.
For myself, the generation of new
"how-to" books is an obvious way to go. With the recent
advent in Canada of Data Liberation, there is for instance a crying
need for something called "How to use Statistics Canada data".
The data is now available in the academic community free of charge
to the user (but not of course to the institutions); the problem
is that very few students and faculty are aware of its strengths and
limitations, and many may lack the skills to analyze it with statistical
packages such as SAS or SPSS.
5. Copyright
Litigation is starting over the presumed "right" to post active
links on your page to copyrighted material (such as, for instance,
Associated Press wire copy). I am aware of one case in Scotland and
one in the southwestern United States where web site operators have
been sued because their sites contained such links. In addition there
is the "Radikal" case in Germany, where a prominent Socialist
legislator was charged with a criminal offense for having a link to
the Dutch anarchist webzine Radikal. I took the step of putting the
same link on my research resources page as a gesture of sympathy for
the "felon". I understand the charges were subsequently dropped on a
technicality, but the principle of these lawsuits and charges is
alarming. I will continue to operate as if active links are public
domain material, but I am not altogether sure whether this principle
would be upheld in a court of law. Recent discussions of a new
international convention on Internet practice (discussed notably by
Ira Magaziner on behalf of the U.S. government) add to my unease. The
current regime of 'lawlessness' is not likely to persist, so we
collectively had better make our views known on this matter. I am
assuming almost everyone associated with a high volume web site is in
favour of a wide open regime of exchange, but perhaps I am wrong on
this.
The copyright issue has an evil twin,
the use of strong encryption and its legal status, but that issue is
too big, and its ramifications too important and consequential,
to discuss here in any detail. Suffice it to say that I am in favour
of the unrestricted use of strong encryption by anyone who wants it (by
which I mean encryption which cannot be broken by the security
agencies of the state). It now exists, but its use is being actively
opposed by most Western governments, who purport to fear its use
by crooks, terrorists and pornographers.
Conclusions: What have I learned
Let me conclude with some general
observations about the web site management and development process.
Some of these follow from my previous remarks but some are more
general yet:
1. It's a lot of work. It is the farthest
possible thing from the fire-and-forget smart munition. It takes
constant feeding and alteration and you are prodded to do these
by the clients.
2. It's habit-forming in the sense
that it becomes part of the routine. I used to operate on a weekly
cycle of changes culminating as many things do in a big change
of character on Friday afternoons. I now have abandoned this approach
in favour of the big re-write at much more lengthy intervals.
Big rewrites can be scheduled and done with more concentrated
attention. This is an admission that the activity has become one
of major importance (though not yet in the tenure/promotion process).
3. It attracts attention. This is
both good and bad (see above points). It definitely increases
the amount of incoming email as strangers approach you for help.
4. The role involved is almost exactly
the same as that of the traditional editor. You are called upon
to challenge the work of others, and if it withstands the challenge,
then you accept it. It is worth saying that some resources do
not meet my arbitrary unaccountable standards. My own personal
judgment is thus by default being inflicted on an unsuspecting
web world very much in the mode of the old print editor.
5. Logging and analysis thereof is
a continuing problem. I don't feel I know enough about the user
community and their interests. My interests are therefore paramount.
I did try one experiment with on-line questionnaires (it is feasible)
and I did try some matching of responses to the log but I can't
say that I learned very much since the response rate was about
2%. That's great if you are in the junk mail business but for
survey purposes it is unusable.
6. Finally, I must conclude that
the activity and the site itself are worthwhile and valid. Evidently
some needs are being met. Though I did not set out to do this,
it is evident to me now that were the page not to exist, somebody
would have had to invent it. I think of it as if it were the Amsterdam
flower market. All manner of wondrous blooms are on display and
sold in box lots in that place but none of them grew there and
few of them expire there. They are shipped off somewhere else
in the world to carry out the illuminatory role in their brief
lives, much like the links on my page.