META-Tag: How to use it for structured indexing


Proposal to the html-wg


March 18th 1996


Version: 18th March 1996, Author: Heinrich C. Kuhn


From:      Self <sapientia> 
To:       html-wg@w3.org
Subject:  META-Tag: How to use it for structured indexing
Date:     Mon, 18 Mar 1996 16:01:16

Sorry for commenting so late on a draft from late
December 1995: I wanted to discuss the things I propose
here with some experts in the narrower field of
professional indexing and classification before bringing
them to your attention. This discussion took place at
the 20th annual meeting of the "Gesellschaft fuer
Klassifikation" (Society for Classification) on March 7th
1996. I then made my proposal known to the German
librarians' community via the listserv INETBIB, and, not
having received any protests from there either, here is
what I want to propose to you
(BTW: I am trying to be as concise as possible here. If you
are interested in a *more loquacious* version of
this proposal, one exists at
<http://www.gwdg.de/~hkuhn1/wwwcat/mtprop01.html> ):

   Davide Musella made several proposals in his draft on
the META-Tag of December 1995 (available e.g. at
<http://jargo.itim.mi.cnr.it/documentazione/draft-musella-html-metatag-02.txt>),
which I think are certainly useful and possibly necessary.
There are, however, some points where I feel that some
additions might be appropriate.
   Especially when it comes to *keywords*, but also
where information on the *author* and on the *abstract* of
a document is concerned, the Musella proposal apparently
permits only
-  "author's keywords"
-  one author per document, without information about
   her or his institutional affiliation
-  one abstract (implicitly expecting this abstract
   to be in English?)

   However, our experience with printed documents tells us
that when there are many documents, indexing by mere
"author's keywords" is not sufficient for retrieval. This
is why decimal classifications (e.g. the Dewey Decimal
Classification or the UDC) were invented, and why
professional databases use ordered thesauri such as *MeSH*,
*BiosisBioCodes* and the like.
   As the number of scholarly relevant documents on the web
is rising, there is a rising need for such "qualified"
indexing of web documents as well. The META-tag seems to be
a good instrument for this. However, defining all
possible "properties" for classification, in the way Musella
has done for 9 "properties" from "keywords" to "public", for
all types of classification presently in use in the
different disciplines and scholarly communities would be a
task that would be *very* difficult to accomplish, and
would, furthermore, tie the HTML-spec to developments in
classification that are not inherently connected to the
HTML-spec.
   So it seems more appropriate to suggest that the
specification for the META-tag should be designed in a way
that permits flexible and structured indexing without having
to fix every possible detail in the specification. This
might be done by permitting a certain "CatchWord" in the
META-Tag to have the name of a named anchor in the same
document as its content:

<META NAME="IndexingInfo" 
CONTENT="#WhereTheIndexingInformationIs">

The content of the named anchor 
"WhereTheIndexingInformationIs" could then be the indexing
information like e.g.:

<DDC>123, 456</DDC>
<MeSH>MeshTerm_1, MeshTerm_2</MeSH>

and so on for whatever type of indexing is
preferred by the author of the document and her or his
scholarly community.
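
As a rough sketch of how an indexing robot might read this
first model: the Python below is only an illustration under
stated assumptions, not part of the proposal. It assumes that
the indexing elements sit inside an <A NAME="..."> element
whose name matches the META content, and that terms are
comma-separated. The class and function names
(IndexingInfoParser, extract_indexing_info) and the sample
document are hypothetical.

```python
from html.parser import HTMLParser


class IndexingInfoParser(HTMLParser):
    """Collect indexing terms from the named anchor a META tag points to."""

    def __init__(self):
        super().__init__()
        self.target = None      # anchor name taken from the META tag
        self.in_anchor = False  # True while inside the target anchor
        self.scheme = None      # indexing element currently open, e.g. "ddc"
        self.terms = {}         # scheme name -> list of terms

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, not values.
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "indexinginfo":
            self.target = (attrs.get("content") or "").lstrip("#")
        elif tag == "a" and self.target and attrs.get("name") == self.target:
            self.in_anchor = True
        elif self.in_anchor:
            # Any element inside the anchor is treated as an indexing scheme.
            self.scheme = tag
            self.terms.setdefault(tag, [])

    def handle_endtag(self, tag):
        if tag == "a":
            self.in_anchor = False
        elif tag == self.scheme:
            self.scheme = None

    def handle_data(self, data):
        if self.in_anchor and self.scheme:
            # Assumption: terms within one element are comma-separated.
            self.terms[self.scheme] += [
                t.strip() for t in data.split(",") if t.strip()
            ]


def extract_indexing_info(html_text):
    """Return a dict mapping each scheme (lowercased tag) to its terms."""
    parser = IndexingInfoParser()
    parser.feed(html_text)
    parser.close()
    return parser.terms


# Hypothetical sample document following the first model.
SAMPLE = """<HTML><HEAD>
<META NAME="IndexingInfo" CONTENT="#WhereTheIndexingInformationIs">
</HEAD><BODY>
<P>Text of the document.</P>
<A NAME="WhereTheIndexingInformationIs">
<DDC>123, 456</DDC>
<MeSH>MeshTerm_1, MeshTerm_2</MeSH>
</A>
</BODY></HTML>"""

if __name__ == "__main__":
    print(extract_indexing_info(SAMPLE))
```

Because the parser keys the result on whatever element names
it finds inside the anchor, a new classification scheme would
need no change to the reader, which is the flexibility this
model aims at.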

Or it might be possible to introduce a trigger of a boolean
type like

<META NAME="Index" CONTENT="Yes">
accompanied by an index-tag in the document, e.g.:

<index>
<DDC>123, 456</DDC>
<MeSH>MeshTerm_1, MeshTerm_2</MeSH>
</index>
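
A reader for this second model might look roughly like the
following Python sketch. It is an illustration only and rests
on assumptions not fixed by the proposal: that a robot ignores
the <index> section unless the META trigger has the content
"Yes", and that terms are comma-separated. The names
IndexSectionParser and read_index_section and the sample
document are hypothetical.

```python
from html.parser import HTMLParser


class IndexSectionParser(HTMLParser):
    """Collect terms from an <index> section, gated by a META trigger."""

    def __init__(self):
        super().__init__()
        self.enabled = False   # set by <META NAME="Index" CONTENT="Yes">
        self.in_index = False  # True while inside <index> ... </index>
        self.scheme = None     # indexing element currently open
        self.terms = {}        # scheme name -> list of terms

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "index":
            content = (attrs.get("content") or "").strip().lower()
            self.enabled = content == "yes"
        elif tag == "index":
            self.in_index = True
        elif self.in_index:
            self.scheme = tag
            self.terms.setdefault(tag, [])

    def handle_endtag(self, tag):
        if tag == "index":
            self.in_index = False
        elif tag == self.scheme:
            self.scheme = None

    def handle_data(self, data):
        if self.in_index and self.scheme:
            # Assumption: terms within one element are comma-separated.
            self.terms[self.scheme] += [
                t.strip() for t in data.split(",") if t.strip()
            ]


def read_index_section(html_text):
    """Return the indexing terms, or an empty dict if the trigger is absent."""
    parser = IndexSectionParser()
    parser.feed(html_text)
    parser.close()
    return parser.terms if parser.enabled else {}


# Hypothetical sample document following the second model.
SAMPLE_WITH_TRIGGER = """<HTML><HEAD>
<META NAME="Index" CONTENT="Yes">
</HEAD><BODY>
<index>
<DDC>123, 456</DDC>
<MeSH>MeshTerm_1, MeshTerm_2</MeSH>
</index>
</BODY></HTML>"""

if __name__ == "__main__":
    print(read_index_section(SAMPLE_WITH_TRIGGER))
```

The trigger lets a robot decide from the document head alone
whether indexing information is claimed to be present; in
this sketch a document without the "Yes" trigger yields an
empty result even if it contains an <index> section.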

Both models, working with sections in the document that are
dedicated to structured indexing information, might also
provide for information about the institutional affiliation
of authors, authors' roles in preparing a document,
languages of abstracts, and other things. I can
go into further detail at more or less any moment you
wish. Some further detail can already be
found in the above-mentioned
<http://www.gwdg.de/~hkuhn1/wwwcat/mtprop01.html>.

   Do you share my view that it might be worthwhile to
make an addition to the draft on the META-tag that permits
such structured use of classificatory information and
other indexing information?

((Or do you consider this to be one of the fields on which 
the discussion should be regarded as closed for the rest of
the life of this WG?))

   Many thanks in advance for your answers!

Heinrich C. Kuhn