
Protecting Our Innocents.

L. Allison and R. Baxter,

Department of Computer Science,
Monash University.

Technical Report 95/224
2 June 1995



[Figure: cartoon of F14, supermodel and T2 coming out of the Internet]

Introduction.

The traditional media's discovery of the internet has brought with it controversial headlines such as `Electronic Sink of Depravity' [1] and `Coping with Nasties on the Net' [2]. Although sensationalist, the articles reflect a genuine problem. The internet comprises various digital media that subsume many of the distinct roles of traditional media. So we have kiddies' cartoons and "adult" magazines all accessed in the same way. With traditional media, there are all kinds of social (and legal) mechanisms which, although imperfect, control access and content. The internet is a new media paradigm and the corresponding social (and legal) mechanisms have yet to be developed.

The knee-jerk response to the problem is to propose censorship of the internet. However, when the Electronic Telegraph (ET) [3] recently conducted an online poll on censorship of the internet, the votes were approximately 9:1 against. It must be pointed out that the ET question did not allow qualified responses and that the sample was not representative of society as a whole. Various censorship proposals are discussed later in the section Traditional Controls. The use of internet media does not exempt an author from social obligations to be polite and considerate to the views of others.

The problem with the internet is the uniform method of access to digital forms of traditional media. The digital representation used by internet media causes problems for technical fixes for controlling access; we discuss this further in the section Technological Fixes. This paper proposes a simple social mechanism to address the problem. The mechanism would allow control of access to sensitive material, particularly access by children in the care of adults. The mechanism for the control of access is entirely optional. It does not prohibit or censor authors in any way at all. (Any existing censorship laws still apply, rightly or wrongly.) It could be tailored to account for the varied sensibilities of different societies and groups. Naturally, it would not be perfect and would not satisfy everyone but it may be close to the limit of what is feasible.

The proposal is that authors and information providers optionally classify the content of their own material in objective and machine-readable terms. As far as is possible the terms must be objective, e.g. "sexual", and not subjective, e.g. "depraved". It is also proposed that material be optionally signed using a secure digital signature. These measures would allow browser programs to filter material on criteria to be decided by individual users and in particular by adults responsible for children. Hence we have a mechanism for allowing control of access to sensitive material. The varied sensibilities of different groups can be accommodated.

It is argued that most authors would want to classify their material correctly. After all, many documents in internet media currently have their contents labelled. It is in the authors' interest to be socially responsible and to clearly label potentially offensive material. This maintains the goodwill of readers. However the means of labelling is haphazard, varying between different internet media. The proposal is for a standard for the optional labelling of the contents of internet documents by their authors. One exception to the suggestion that authors will want to classify their material correctly is any group involved in illegal activities, but it is in their interest to limit access to their sensitive material anyway (e.g. alleged child pornography mailing lists). The problem of illegal activities exists independently of the internet media. They can be dealt with by the law in the usual ways although new enforcement difficulties do arise with internet media. Some other individuals will deliberately misclassify material in the hope of causing shock and they can only be `badlisted'. The strong sense of "netiquette" evident in the electronic newsgroups suggests that such individuals might also be controlled by (cyber-)social pressure to some extent.

A significant issue for the proposal is the subjectivity of individual classification. For example, there will be well-meaning authors classifying material within the social mores of their own culture and just not understanding possible implications for other cultures. (The offensiveness of different degrees of nudity, for instance, varies dramatically between Japan, Sweden and Iran.) It is for this reason that the categories should be very broad. Nothing is to be gained in making fine distinctions. Another difficulty lies with intolerant authors who do not care for cultural implications (e.g. nationalistic extremists or religious fundamentalists) and who do not care about offending others. These problems can be mitigated by recording, as part of the classification, who classified the material. This is discussed in the section Some Practicalities. How much you believe a classification depends on the credibility and perceived sensitivity of the classifier.

The effect of the proposal is to allow individuals to set their browsers to filter the material they are to access. Conservative readers can choose to avoid documents falling outside their target classes. More liberal readers might access material with no classification despite the possibility of potentially offensive contents.
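
The behaviour described above can be sketched in a few lines. This is only an illustration under our own assumptions: the names `ReaderPolicy` and `may_display` and the topic labels are hypothetical, and no particular browser or label format is implied.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class ReaderPolicy:
    """Filter settings chosen by a reader, or by an adult on behalf of a child."""
    blocked_topics: FrozenSet[str]   # topics this reader does not want to see
    allow_unclassified: bool         # whether unlabelled material is shown


def may_display(topics: Optional[FrozenSet[str]], policy: ReaderPolicy) -> bool:
    """Return True if a document carrying these topic labels may be shown.

    `topics` is None when the author supplied no classification at all,
    and an empty set when the author declared nothing sensitive.
    """
    if topics is None:
        return policy.allow_unclassified
    return not (topics & policy.blocked_topics)


# Illustrative settings only; the topic names are assumptions.
conservative = ReaderPolicy(frozenset({"sexual", "violence"}), allow_unclassified=False)
liberal = ReaderPolicy(frozenset(), allow_unclassified=True)
```

With these settings the conservative reader sees only material that has been classified and that avoids the blocked topics, while the liberal reader also sees unclassified material and accepts the risk that goes with it.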

The Internet.

The `internet' is a much misused and misunderstood term. In physical terms the number of computers connected to the internet is estimated to be several million and rising. Personal computers connect through a `gateway' computer usually belonging to a commercial bureau, a large company, a government organization or a university. The computers can communicate with each other which enables programs to be written so that people can use them to exchange information with each other or with computer databases. The information takes many forms: messages, text, pictures, sound, movies and so on. The internet supports its own forms of media which are discussed below. They share some properties of more traditional information media and have special properties of their own.

Electronic mail (email) approximates person to person letters, memoranda, notes and even phone calls. Sound and pictures are sometimes sent in addition to text. Email is mainly for private communication. It is relatively easy to forge email or to intercept it, but cryptography can be used, and sometimes is, to make email completely secure. There are even public-key encryption systems that can be used to create unforgeable electronic `signatures'.

Electronic mailing lists are rather like club newsletters. Readers have to contract-in or subscribe to a list. It is a more public medium than person to person email, although the topics are often specialised and the circulation is usually small. Some lists are moderated - they have an editor.

Electronic news (enews) is a broadcast, free-to-internet medium. It has some properties of radio or television, particularly talk-back radio or television, in that the destination is indiscriminate. Some newsgroups are moderated. There is a strong sense of `netiquette' that controls behaviour on even unmoderated newsgroups although it may not correspond to general society's view of good behaviour. For example, anyone who broadcasts irrelevant or exploitative messages is likely to be told off very severely by the readers of a newsgroup. A crude form of classification already exists in enews, e.g. it is quite clear what the `alt.sex' newsgroups are about.

File transfer protocol (ftp) started as an internet archival and retrieval medium, somewhat analogous to libraries. Files can be retrieved from distant computers using a traditional text-based interface.

The addition of an easy-to-use graphical interface to a generalization of ftp gave rise to "the web". The world-wide web (www) can be used to "publish" material that would traditionally appear in journals, magazines, posters, books, television and even on film.

The various internet media are often used in a connected way. For example, a picture may be placed on a computer for ftp access and be announced in one or more newsgroups. In a sense, the www includes the other media as it can make use of links to them. Its point-and-click interface is very easy to use. The next developments are likely to be increased levels of interactivity and more sophisticated graphics, using systems such as Hot Java [4] and VRML [5].

The most significant new properties of the internet media are the diversity of information sources and their ability to reach almost anywhere in the world. Authors range from major corporations such as IBM and Disney to school children.

The information provided on the internet, particularly through the www, ranges across train time-tables, university lecture notes, books, art exhibits, film promotions, the wisdom and ravings of individuals and, yes, pornographic pictures. What is truly remarkable is that this one medium covers all of this and more. All this information is easily accessible. It is as easy to read Playboy as the Magna Carta in the privacy of your own home or office. This has come to pass in a very short time, literally a couple of years in the case of the www. As such, there are few mechanisms in place to control access to information on the internet in a social or legal sense.

Sensitive Material.

People and societies have varied sensibilities. One person's smut is depraved pornography to another and light reading to a third. It is proposed that (quite short) lists of sensitive topics and of possible attitudes to topics be defined. It is proposed that internet material be optionally classified by its authors and providers in a standard machine-readable way. The classifications could either be queried as a part of communication protocols or included in the headers of transmitted items. This would allow internet gateways and browser programs to accept or reject material on various criteria. We believe that most authors would want to classify their material correctly. For example, Playboy wants to be known, and widely known, as a provider of nude pictures. It is very much in its interest to be honest with the market. Even the bootleg ftp sites for such images want to be known, perhaps to be notorious.
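
To make the "headers of transmitted items" option concrete, the following sketch reads a classification from a mail-style header block. The header name `X-Content-Topics` and its comma-separated syntax are our own assumptions for illustration; this paper does not define a wire format. The parsing uses Python's standard `email` module.

```python
from email import message_from_string
from typing import FrozenSet, Optional

# Hypothetical header name; no such header is defined by the proposal.
HEADER = "X-Content-Topics"


def topics_from_headers(raw_item: str) -> Optional[FrozenSet[str]]:
    """Extract the author's topic labels from an item's header block.

    Returns None if the header is absent (the item is unclassified),
    otherwise the set of comma-separated labels, which may be empty.
    """
    value = message_from_string(raw_item).get(HEADER)
    if value is None:
        return None
    return frozenset(t.strip().lower() for t in value.split(",") if t.strip())


# Example item with an assumed header line followed by a blank line and a body.
item = "From: author@example.org\nX-Content-Topics: Sexual, drugs\n\nBody text.\n"
```

A gateway or browser could call such a function on each incoming item and compare the result with the reader's settings. Distinguishing "no header" from "header present but empty" lets a reader treat unclassified material differently from material declared innocuous.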

Lists of possible topics and attitudes are given below. It is not the purpose of this paper to argue for particular items to be included; that would require a long and lively debate. Perhaps the items should be organized hierarchically. However, the lists should be simple and objective and, as far as possible, not subject to interpretation. Note that broad categories such as `16+' are not sufficient as they do not allow for the views of different societies. The following is an incomplete list of `sensitive topics' for illustrative purposes. The items are based on a detailed world-wide study on censorship, the File Room [6]. The list may need to be shortened as a concession to the requirement to keep the classification task as simple as possible.

A given document can have various `attitudes' towards a topic. For example, one document might advocate legalisation of drug taking. A second might advise on sterilising needles if one must inject drugs. A third might deplore drug taking. A fourth might have nothing to do with drug taking. It is important that it be possible to specify some sort of attitude of an article towards a topic in a classification. For example, a medical research paper might discuss the consequences of drug abuse. A parent might judge this to be suitable for an older child to read without supervision but not for a younger one.
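
One way to picture topic and attitude working together is as pairs, with a reader's settings listing which attitudes are tolerated for each restricted topic. The sketch below is ours; the attitude names ("advocates", "discusses", "deplores") and the function name are illustrative assumptions, not a proposed vocabulary.

```python
from typing import Dict, FrozenSet, Set, Tuple

# A label is a (topic, attitude) pair; both vocabularies are assumed here.
Label = Tuple[str, str]


def attitudes_acceptable(labels: Set[Label], allowed: Dict[str, FrozenSet[str]]) -> bool:
    """True if every labelled topic is either unrestricted for this reader
    or carries an attitude the reader has allowed for that topic."""
    for topic, attitude in labels:
        if topic in allowed and attitude not in allowed[topic]:
            return False
    return True


# Settings a parent might choose for two children of different ages.
older_child = {"drugs": frozenset({"deplores", "discusses"})}
younger_child = {"drugs": frozenset()}

research_paper = {("drugs", "discusses")}
advocacy_piece = {("drugs", "advocates")}
```

Under these settings the medical research paper is shown to the older child but not to the younger one, and the advocacy piece is shown to neither.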

Some Practicalities.

This section is phrased principally in terms of www and ftp material; the principles could also be applied to enews and maybe even to email.

A major obstacle to the proposed classification scheme is the laziness of authors. Would they be bothered to classify their material? To reduce the effort of classifying many individual items, particularly in the case of ftp and www, classifications could be attached to directories and inherited by subdirectories and documents. Major corporations would probably be highly motivated to classify material carefully - they are sensitive to public opinion. Of course most material is innocuous and not even worth classifying - except to ensure maximum readership.
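
Inheritance from directories could work by looking up the nearest labelled ancestor of a document. The following is a minimal sketch under that assumption; the paths, the topic name and the function name are hypothetical.

```python
from pathlib import PurePosixPath
from typing import Dict, FrozenSet, Optional


def effective_classification(
    path: str, declared: Dict[str, FrozenSet[str]]
) -> Optional[FrozenSet[str]]:
    """Return the classification of `path` itself or, failing that, of its
    nearest labelled ancestor directory. None means nothing on the way up
    to the root has been classified."""
    node = PurePosixPath(path)
    for candidate in (node, *node.parents):
        labels = declared.get(str(candidate))
        if labels is not None:
            return labels
    return None


# An archive that labels two directories and nothing else.
declared = {
    "/pub": frozenset(),                       # declared innocuous
    "/pub/art/nudes": frozenset({"nudity"}),   # overrides the parent
}
```

Here `/pub/art/nudes/a.gif` inherits the label of its own directory, `/pub/art/landscape.gif` inherits the innocuous label of `/pub`, and anything outside `/pub` remains unclassified. The author labels two directories rather than every file.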

Many variations on the proposed scheme are possible and it is not the purpose of this paper to argue for a particular instance of it, but it might be useful to indicate optionally the `strength' of a classification, e.g. `strong' or `weak'.

This could be used to provide sensible default classifications. A strong classification should be carefully checked by the information provider. A weak classification would indicate intended content but might not have been checked for each sub-document. This is particularly relevant to bulletin boards where the computer owner and the author are different. For example, a bulletin board might carry the weak classification that it was for non-political material but an individual author might transgress this. An individual author could choose to provide his or her own classification for an item and both classifications could be attached to the item. Similar situations apply with regard to universities and students, and with corporations and employees not acting in an official role. A university might give a strong classification that its official online documents are not sensitive, that its research findings are intended not to be sensitive and might not give any default classification at all to unofficial material.
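
The interplay of strong and weak classifications, and of several classifications attached to one item, might look as follows. This is a sketch of one cautious reading rule among many possible ones; the class, field and function names are our own assumptions.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class Classification:
    classifier: str          # who made the claim
    topics: FrozenSet[str]   # sensitive topics declared present
    strong: bool             # True: checked item by item; False: intended content only


def accept_item(
    attached: Iterable[Classification],
    blocked: FrozenSet[str],
    require_strong: bool,
) -> bool:
    """Cautious rule: reject if any attached classification declares a
    blocked topic; reject unclassified items; optionally also demand at
    least one strong classification."""
    attached = list(attached)
    if any(c.topics & blocked for c in attached):
        return False
    if require_strong:
        return any(c.strong for c in attached)
    return bool(attached)


# A bulletin board's weak default and an individual author's own label.
board = Classification("board-operator", frozenset(), strong=False)
author = Classification("author", frozenset({"political"}), strong=True)
```

A reader blocking political material would accept an item carrying only the board's weak default unless strong classifications are required, and would reject the item once the author's own classification is attached alongside it.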

It is also proposed that items be optionally signed with a secure digital signature that can be traced to a real person, company or organization. There are good reasons why some items should not be signed, for example political items written under a repressive regime. However a classification that carries a signature also carries more weight. A cautious reader might limit access to items classified and signed by large organizations - they are not necessarily more trustworthy than individuals but they are certainly more suable for false advertising, say.
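
The point of signing is that the signature binds the classification to the item, so that neither can be altered without detection. The sketch below illustrates this with textbook RSA on a deliberately tiny key. It is a toy under stated assumptions: the key is built from two small known primes, there is no padding, and it must not be mistaken for a secure scheme or for any particular signature standard. All names are hypothetical.

```python
import hashlib

# Toy key built from two known Mersenne primes. Far too small to be secure.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q                              # public modulus
E = 65537                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (needs Python 3.8+)


def digest(document: bytes, classification: str) -> int:
    """Hash the label together with the document, reduced into the key's range."""
    h = hashlib.sha256(classification.encode() + b"\n" + document).digest()
    return int.from_bytes(h, "big") % N


def sign(document: bytes, classification: str) -> int:
    """Author side: sign the (label, document) pair with the private exponent."""
    return pow(digest(document, classification), D, N)


def verify(document: bytes, classification: str, signature: int) -> bool:
    """Reader side: check the signature using only the public values E and N."""
    return pow(signature, E, N) == digest(document, classification)
```

A browser that knows the public key of an organization it trusts could show only those items whose label and content both verify against that key. Changing either the label or the document after signing makes verification fail.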

Traditional Controls.

We now consider how traditional controls may or may not apply to internet media.

Traditional controls on letters and telephone calls, dealing with issues such as obscenity and sexual harassment, have the same validity when applied to electronic mail. Notice that there have traditionally been few ways to avoid receiving offensive material in a letter or a phone call. More recently, the `caller id' facility allows controlled access by the receiver of phone calls.

The traditional restrictions on access to pornography and other sensitive materials (e.g. drugs, violence) found in books, magazines and videos are applied at the point of sale. (Note that the most serious obstacle to schoolkids getting dirty magazines, although they do from time to time, is not any law but the stern gaze of the cashier at the shop counter.) This control is not applicable to the world wide web, where pornographic materials are available. (There are also sales of pornographic materials, but vendors then take steps to verify the age of their customers.) One reaction to this inapplicability has been the proposed banning of pornography (among other things) on the internet in some countries (e.g. Senator Exon's Communications Decency Act of 1995 in the U.S. [7]).

Senator Exon has said:

"My amendment would simply apply the same laws that protect against obscene, indecent or harassing telephone calls to computers. Is that an infringement on free speech? I want to make the information superhighway safe to travel for children and families." Senator Exon, NY Times, Letter to the Editor, 31/3/95.

As we have already seen, the internet media are far broader than traditional wire media, such as voice telephone calls. Pornography is sometimes legal in the traditional media, such as video and magazines, and so it is inconsistent to ban the internet equivalents. Senator Exon has identified a real problem, that of limiting children's access to pornographic materials. Our proposal addresses this problem by allowing readers to make informed decisions regarding access of internet material.

There are also technological limitations affecting the enforcement of any proposed censorship laws. It must be recognised that it is impossible in principle to monitor all material being transmitted on the internet. Any good Computer Science graduate can create a completely secure encryption system for concealment purposes. The material can even be disguised, for example hidden "inside" a perfectly innocuous picture. This is another reason why any proposal requiring monitoring by internet carriers or bearers is infeasible. This should not be of major concern to most people most of the time. An illegal document only becomes a potential problem to the average reader when it becomes notorious at which point it can be badlisted and its authors pursued by the law.
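
To see why disguised material defeats monitoring, consider the simplest form of hiding data "inside" a picture: overwriting the least significant bit of successive pixel values. The sketch below is ours and works on a plain byte string standing in for raw pixel data; it is not tied to any image format.

```python
def embed(cover: bytes, secret: bytes) -> bytes:
    """Hide `secret` in the lowest bit of successive cover bytes."""
    bits = [(byte >> i) & 1 for byte in secret for i in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover too small for the secret")
    out = bytearray(cover)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit   # replace only the lowest bit
    return bytes(out)


def extract(stego: bytes, length: int) -> bytes:
    """Recover `length` hidden bytes from the low bits of `stego`."""
    out = bytearray()
    for k in range(length):
        value = 0
        for b in stego[8 * k: 8 * k + 8]:
            value = (value << 1) | (b & 1)
        out.append(value)
    return bytes(out)


cover = bytes(range(256))          # stand-in for raw pixel values
stego = embed(cover, b"hello")
```

No byte of the cover changes by more than one unit, so the carrier picture looks the same to the eye, and a carrier inspecting traffic has no practical way to tell that anything is hidden.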

The traditional controls on the broadcast media (television, radio, ham radio, film) are also unlikely to be effectively applied to internet media. With internet media, anyone can broadcast. Traditionally, broadcast media involved licensing arrangements for a few broadcasters. Some media are controlled by regulations requiring broadcasters to monitor and/or classify the materials they transmit. Breaches of regulation can be penalised by the non-renewal of the license if the broadcaster is shown to be `unfit'. Films are classified by an external Censorship Board (whose name differs from country to country).

It is possible in principle to license internet broadcasters, as ham radio operators are now licensed. However the population of internet broadcasters is much larger than that of ham radio users. Also considering the difficulties with international boundaries, a licensing system faces many obvious practical hurdles. (Although ham radio licenses are coordinated through international agreement.) It is then not surprising that we have not seen such a proposal seriously mooted.

While it is impractical to license internet broadcasters, we find that the traditional approach of having broadcasters classifying their own material is analogous to our present proposal. Internet broadcasters are subject to social pressure to conform to standards, rather than regulatory pressure.

The V-Chip proposal [8] in the U.S. is that television broadcasters include a signal that signifies whether "violence" is being broadcast. Parents can then set a switch on their televisions to prevent the reception of violent broadcasts and so protect their children. Presumably each television broadcaster must classify their material according to guidelines on violence. This is analogous to our proposal for the internet, except that we would make the classification by broadcasters optional. We would then rely on viewer pressure to ensure broadcasters would classify their broadcast material.

Materials (films, books) currently censored are illegal, whether digitised or not, whether available on the internet or not. However enforcement in the internet media is much more difficult, or even perhaps infeasible. Anyone in the internet world is a potential broadcaster or publisher. Arguably this is a good thing. Provided publishers of illegal material are driven underground, they are not a mass problem.

Technological Fixes.

In this section we discuss some technological fixes to allow readers to limit access to internet materials. Some proposals based on analogies with various traditional media were described in the previous section. Here we discuss proposals specifically tailored to the internet.

The most common existing means of restricting access to internet materials is to have computer administrators make the decision to censor material being distributed by their computers. The usual reason is that the material is not within the purposes of the owners of the computers. Another common reason is the possibility of children accessing the computers. This is the censorship by carriers proposal.

A good example concerns various ftp sites (e.g. CICA on-line archive [9]) having the following login announcement:

"As is a standard practice with most anonymous-ftp sites (that are supported with funds from an independent host institution), certain programs, files, or software are outside the general scope of archive operations, and therefore will NOT be accepted. These files include (but are not limited to) any software: containing political, religious, racist, ethnic, or other prejudiced or discriminatory messages therein; that is adult or sexually-explicit in nature; otherwise considered unsuitable by the site moderator."

Our proposal, once implemented, could be used by carriers to help automate the implementation of such policies for their ftp sites.

Another example concerns internet providers to schools. Policies vary. Sometimes access is limited to sites known to be suitable for children. Sometimes access is limited by excluding sites known to be unsuitable for children. Sometimes no restrictions on access are made. Once again, our proposal would assist in the automated implementation of restricted access to internet materials in schools.
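
The three school policies just described differ only in how a list of sites is interpreted. A minimal sketch, with hypothetical names and host names, is:

```python
from enum import Enum
from typing import FrozenSet


class Mode(Enum):
    ALLOW_LISTED_ONLY = 1   # only sites known to be suitable
    BLOCK_LISTED = 2        # everything except sites known to be unsuitable
    UNRESTRICTED = 3        # no restrictions


def site_permitted(host: str, mode: Mode, listed: FrozenSet[str]) -> bool:
    """Decide whether a request to `host` is allowed under the given policy."""
    host = host.lower()
    if mode is Mode.ALLOW_LISTED_ONLY:
        return host in listed
    if mode is Mode.BLOCK_LISTED:
        return host not in listed
    return True
```

In each restricted mode somebody has to compile and maintain the list by hand. With author-supplied classifications the same decision could be made per document rather than per site, and without a manually curated list.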

There is a dark side to censorship by carriers. Some internet carriers censor material without the explicit permission of their internet readers, often out of concern for legal liability, or because of complaints from a subset of their users. In these cases, the carrier judges what materials are available for access, instead of the individual internet reader. Carl Kadie argues that carriers (internet sites, system administrators) have analogous responsibilities to those of public libraries and public librarians to not censor material. The American Library Association has a "bill of rights" [10] designed to protect librarians from being pressured into censoring the materials they hold. Kadie believes the same bill of rights should apply to internet carriers.

Censorship imposed by carriers against their readers' wishes would immediately undermine our present proposal. This is because it would cause authors of internet materials to misclassify materials in an effort to evade carrier censorship and so reach internet readers.

An alternative to the censorship by carriers proposal is the technological fix. This involves the design of intelligent software to filter information, say, into the home. There is a rush to develop information filtering software and get it to market. This is a common solution as perceived by the media:

"Like other dilemmas and unanswered questions of the digital age, traditional approaches simply won't work. We're going to have to accept less intrusive, probably more exotic solutions, like providing intelligent software filters to those who want a version of Internet Lite. Before these sorts of tools arrive - if they ever do get here - the First Amendment may experience its toughest test to date." Newsweek [11]

Reliable, intelligent software filters are technologically infeasible. Although one can envisage crude filters for text, based on keywords, general filters for other digital materials are infeasible. For example, the required sophistication for image processing software to identify offensive digitised video images is not likely to be available in the foreseeable future. The same applies to audio. Intelligent software filters are at best a partial solution. The coming multimedia age will make them an increasingly less complete solution.
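
The crudeness of keyword filtering is easy to demonstrate. The two filters below are our own illustration, with a one-word keyword list chosen only to show the failure modes.

```python
import re
from typing import Iterable


def substring_filter(text: str, keywords: Iterable[str]) -> bool:
    """Flag text containing any keyword as a substring (over-blocks)."""
    lowered = text.lower()
    return any(k in lowered for k in keywords)


def word_filter(text: str, keywords: Iterable[str]) -> bool:
    """Flag text containing any keyword as a whole word (still context-blind)."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return any(k in words for k in keywords)


keywords = ["sex"]
```

The substring filter flags "Essex train time-table", an innocent document. The whole-word filter avoids that mistake but flags "Sex education: lecture notes" regardless of context, and passes "adults only, see pictures", where the sensitive content is in the images and not in the words. Neither filter can say anything about a picture or a sound file.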

The infeasibility of a general software filter for internet media without content classifications has not stopped commercial software vendors developing and marketing such products. For example, consider the press release for the SurfWatch [12] internet filter:

"SurfWatch is a breakthrough software product which helps you deal with the flood of sexual material on the Internet. By allowing you to be responsible for blocking what is being received at any individual computer, children and others have less chance of accidentally or deliberately being exposed to unwanted material. SurfWatch is the first major advance in providing a technical solution to a difficult issue created by the explosion of technology. SurfWatch strives to preserve Internet freedom by letting individuals choose what they see."

The SurfWatch vendor intends to provide monthly updates to cope with the fast changing internet. The software removes access to more than a thousand newsgroups. Presumably web links have to be vetted individually. Undoubtedly the software meets a market need, but its approach is inevitably ad hoc. Our proposal would allow products such as this one to do more accurate, complete and automated filtering of internet materials.

Yet another technological fix is for parents and guardians to have a separate "proxy server" for their children's web browser. The parents then need to actively select sites their proxy server can access. This is a labor intensive solution. Of course, commercial internet providers can offer to provide these services for parents, as is already happening. Commercial providers, such as America Online, allow parents to control what internet relay chat (irc) sessions are available to their children. We believe these approaches will be difficult to implement without the voluntary "classification of materials" provided by the present proposal. The widespread implementation of the proposal would assist parents (or those acting in loco parentis) in automatically, or at least semi-automatically, vetting internet materials for their children.

If our proposal (or similar) is not implemented, we are left with the status quo. Many potentially offensive internet materials are already clearly marked as such. For example, enews groups can be vetted according to their names and published purposes. Ftp sites can be vetted according to their indexes and readme files. Web sites can be vetted according to their index.html files. Video and audio materials can be vetted according to their source and/or their filenames. These are all de facto classification schemes that already exist.

As a specific example, consider the editorial policy on offensive content of the enews group rec.humor.funny:

"This newsgroup sometimes contains material some consider offensive, and material that may not be suitable for some minors. As such, all redistributors should make sure that nobody reads the group other than by personally requesting it, and all redistributors must take whatever precautions they feel are necessary with regards to newsgroup access by minors.

The warnings and keywords placed on jokes are for the reader's convenience only. The editor makes no assurances that such warnings are complete or correct. It is up to each reader to decide whether to be guided by these keywords and/or warnings."

The editorial policy makes it clear that some jokes may be offensive and that sensitive readers should avoid reading the newsgroup altogether. Others should take note of the keywords before reading a particular joke. Individual newsreaders can be optionally set to "kill" jokes containing keywords indicating that they are offensive to a particular internet reader. rec.humor.funny is just one small corner of the net. Our proposal is to extend such a "keyword" policy to the wider internet community.
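
The "kill" behaviour of a newsreader can be sketched as follows, assuming each article carries a comma-separated `Keywords` header line. The sample keywords and the function name are illustrative assumptions; real newsreaders have their own kill-file formats, which are not reproduced here.

```python
from email import message_from_string
from typing import FrozenSet, List


def surviving_articles(articles: List[str], kill: FrozenSet[str]) -> List[str]:
    """Return the subjects of articles whose Keywords header shares no
    entry with the reader's kill list. Articles without keywords survive."""
    kept = []
    for raw in articles:
        msg = message_from_string(raw)
        keywords = {k.strip().lower() for k in (msg.get("Keywords") or "").split(",")}
        if not (keywords & kill):
            kept.append(msg.get("Subject", ""))
    return kept


# Three made-up articles: two with keywords, one without.
articles = [
    "Subject: Lightbulb joke\nKeywords: chuckle\n\n...",
    "Subject: Bar joke\nKeywords: smirk, sexual\n\n...",
    "Subject: No keywords\n\n...",
]
```

A reader who kills on "sexual" is shown the first and third articles only. The third one illustrates the editor's caveat: an article with missing or incomplete keywords passes straight through.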

Case Studies.

We now describe some hypothetical scenarios to illustrate how our scheme would apply if it were widely implemented on the internet. They demonstrate some of its uses, and also how it is robust to misuse by potential internet censors.

Your Home

Your home PC has an internet connection with a web browser. You allow your young children to use this wonderful resource (saves having to buy Encyclopaedia Britannica). However you carefully set the filter on the browser to disallow obscene, sexually explicit, and violent documents to be viewed. You decide to allow only classified documents. After all, this is hardly limiting for your children as all the reputable information providers classify their documents.

In this case our proposal aids parents in supervising the internet materials their children can access.

University I

First year Computer Science students are required to access internet media, including the web and enews. The University's Computer Centre sets the software defaults to restrict access to only classified, course-related documents. However, students can alter this default at their discretion.
(R.B. witnessed an unfortunate classroom scene where students were by default automatically subscribed to all newsgroups. The first ones to be displayed were in the middle of the alt.sex hierarchy. These newsgroups have potentially offensive names, just to begin with.)

This is an example of our proposal assisting novice internet readers in accessing internet materials relevant to their specific purposes. New internet readers can then knowingly extend their access to potentially offensive materials.

University II

A postgraduate student, as part of postgraduate work, is required to search the internet for documents relating to her or his thesis. The student has strong religious and cultural beliefs regarding blasphemy and obscenities etc. Documents containing these matters are not relevant to the thesis topic. The student can set the web browser to view only classified documents not containing the topics offending his or her beliefs.

This is an example of our proposal allowing internet readers of diverse cultures and religions to access internet materials with the potential for accidental offense greatly reduced.

Researchers

A strident moral rights lobbyist sets the web browser to search for all the obscene, sexually explicit, violent materials available. The lobbyist is collecting evidence for a campaign against that sink of depravity, the internet.

Although our proposal is not aimed at aiding the search for internet materials of specific content, it certainly can be used in this way. This side-effect may be of use to some.

Political Activists

The government has passed legislation forbidding student union fees being used to fund `political activity'. This includes the discussion of government policy in student club newsletters funded by student amenities' fees. These student club newsletters are published on the internet, via the web, enews and electronic mailing lists. The University Administration is required by the government to enforce the ban and instructs the Computer Centre to monitor internet materials emanating from the student union for any documents classified as political. It is in the students' interests to misclassify the content of their newsletters as being "non-political". (The legislation in this scenario is based on actual laws in Victoria, Australia, in 1995.)

Censorship removes the incentive for internet authors to accurately label their materials. So carrier censorship would render our proposal useless. However at least our proposal cannot be used to aid censorship. This property of the proposal is a good one.

Conclusions.

Most authors using electronic media do not produce material that is any "worse" than that available from newsagents, video shops, or mail-order sources. What is new is that all types of material are equally, and easily, accessible in an electronic market-place that ignores most physical boundaries. There are no computer programs to automatically and reliably classify material; only people can do it. We have argued that authors should be encouraged (not forced) to classify their own material in a standard machine-readable form, and should be encouraged to sign it, and that this would provide a practical way to filter material coming into the school or home from the internet. This would not limit freedom of speech in any way; it would give freedom of choice to readers, and to parents of young readers in particular. Some authors will try to circumvent any system and they can only be badlisted or pursued by the law in the case of illegal activities.

References.

[1] S. Winchester. An electronic sink of depravity.
The Electronic Telegraph (http://www.telegraph.co.uk) 3 Feb 1995.

[2] R. Ashcroft. Coping with nasties on the net.
The Age, Melbourne, p30, Tues 2 May 1995.

[3] Electronic Telegraph (http://www.telegraph.co.uk).

[4] Sun. Hot Java (http://java.sun.com/), 1995.

[5] VRML forum (http://vrml.wired.com/) and technical forum (http://www.eit.com/vrml/).

[6] The File Room (http://128.248.123.200/FileRoom/documents/TofCont.html).

[7] Liberties (http://galaxy.einet.net/galaxy/Community/Liberties.html).

[8] C. M. Kadie. Sex, Censorship and the Internet, (http://www.eff.org/CAF/cafuiuc.html, including the V-Chip) 1995.

[9] CICA on-line archive (gopher://ftp.cica.indiana.edu/).

[10] American Library Association's Library Bill of Rights (ftp://ftp.eff.org/pub/CAF/library/).

[11] Newsweek (http://www.phantom.com/~slowdog/press/newsweek.html) 27 Feb 1995.

[12] SurfWatch (http://www.surfwatch.com/) May 1995.


This technical report can be found at http://www.csse.monash.edu.au/publications/1995/tr-cs95-224/1995.224.html.

Copyright © 1995 L. Allison and R. Baxter