Date: 18 February 1997, Author: Heinrich C. Kuhn
[Here you will find a link to information on Project NAUDAEUS in German]

General information

Project NAUDAEUS is intended for the cooperative, more or less scholarly indexing of information found on the internet, and for the retrieval of information about such information. Its implementation uses a relational database (RDBMS). Project NAUDAEUS combines indexing and abstracting by humans with indexing by robots, aiming to combine the advantages of both approaches.

Most of the information on this project is in German only. Sorry!

Some information is available, however, from emails I wrote in English to explain the project to people interested in it. I'll try to assemble the various bits under the following headers:



B.)   Situation and problems:
      In various places we have various
      internetographies, clearinghouses and the
      like for various subjects. Entries are often
      duplicated across these indexes. Some of
      these indexes cover electronic material
      only, and some cover printed material
      as well. The cataloguers find "normal
      cataloguing" unsuited to at least
      part of the electronic material. Part of
      the material catalogued is pertinent to
      more than one subject, and has to be
      indexed in more than one way, each more or
      less specific to certain disciplines only.
      Cataloguers suffer from extra work due to
      lack of cooperation when indexing "non 
      standard" materials. Customers and 
      librarians complain that almost anything 
      they need is indexed somewhere in the
      various catalogues, internetographies and
      the like, but that it becomes less and less
      likely that they will be able to find out
      where and how information about the item they 
      seek might be retrieved. The "big" search
      engines are not enough help, for lack
      of really specific indexing and of
      up-to-dateness. Maintaining the structures
      of the internetographies becomes more and
      more cumbersome. Metadata contained in
      the electronic documents is scarce, and
      often not sufficient to retrieve the items
      sought. Indexing by the authors themselves
      is often not a viable solution.


We have various subject guides on various
subjects. Users have complained that the guides
have developed in a way that makes them
difficult to handle, that it takes too long
to find out what is listed in which of these
internetographies, etc. Besides, quite a number
of resources have been indexed by several of
our librarians and other specialists in several
guides from several points of view. Consensus
has been reached that any sort of "universal
classification" won't be acceptable for
specialist customers' needs (we have some 80
institutes doing basic research in more than
80 [in most cases rather "esoteric"] fields in
STM and the humanities).


Our site has got several hierarchical entry-points
and "internetographies" and other stuff on various
subjects. Worked fine ... . At the start at least.
But now, as the number of 1000 entries has been 
passed long ago, user-feedback says "there's almost
anything that we are interested in indexed at your
site, but it takes soooo much time to find it";
and my own experience says, user-feedback is right.
And new entries are added to our indices every week
... . No, the situation and user-friendliness do *not*
improve ... . Just adding a "go for keywords you might
imagine and hope that the author of the text or the
indexer might have shared your imagination" search
engine would not help (at least not in the long run).

A.)   About our institution:
      The Max Planck Society (non-profit, mostly
      publicly funded) has some 80
      institutes doing basic research in various
      science and humanities disciplines. This
      research is supported by some 80 libraries
      and other units for gathering and providing
      information, most of which are part of
      the institutes.

The project managers are Diann Rusch-Feja, Peter Scherber, Ernst von Biron, and Heinrich C. Kuhn:

C.)   Project steering committee:
      2 persons combining librarianship, active
        research and electronic data processing
      1 person combining electronic data processing,
        documentation and active research
      1 electronic data processing person

D.)   Solution:
      Combination of intellectual and automatic
      indexing according to a very detailed and
      very flexible set of categories. Use of
      a relational database for indexing, retrieval
      and automatic generation of several types
      of documents. WWW-based interface both for
      indexing and retrieval. Cooperative indexing.
      Indexing can (but does not have to) make use
      of several indexing schemes at the same time.
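
The idea of indexing one resource under several schemes at the same time can be illustrated with a small relational layout. This is only a sketch under stated assumptions, not the project's actual design: the table names (resource, scheme, index_entry), the column names, the example URL and the scheme names are all hypothetical, and SQLite is used merely because it ships with Python.

```python
import sqlite3

# Hypothetical schema: all table and column names are illustrative
# assumptions, not the actual NAUDAEUS categories.
SCHEMA = """
CREATE TABLE resource (
    id    INTEGER PRIMARY KEY,
    url   TEXT NOT NULL UNIQUE,      -- each resource is stored only once
    title TEXT NOT NULL
);
CREATE TABLE scheme (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE        -- one row per indexing scheme
);
CREATE TABLE index_entry (
    resource_id INTEGER NOT NULL REFERENCES resource(id),
    scheme_id   INTEGER NOT NULL REFERENCES scheme(id),
    code        TEXT NOT NULL,       -- category code within the scheme
    indexer     TEXT NOT NULL,       -- who contributed this entry
    PRIMARY KEY (resource_id, scheme_id, code, indexer)
);
"""


def build_demo_db():
    """Create an in-memory database with one resource indexed under two schemes."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute(
        "INSERT INTO resource VALUES (1, 'http://example.org/a', 'Example resource')"
    )
    conn.executemany(
        "INSERT INTO scheme VALUES (?, ?)",
        [(1, "history-of-science"), (2, "physics")],
    )
    conn.executemany(
        "INSERT INTO index_entry VALUES (?, ?, ?, ?)",
        [
            (1, 1, "renaissance", "indexer-a"),
            (1, 2, "optics", "indexer-b"),
        ],
    )
    return conn


def schemes_for(conn, url):
    """Return (scheme name, code) pairs under which the given URL is indexed."""
    rows = conn.execute(
        """SELECT s.name, e.code
           FROM index_entry e
           JOIN resource r ON r.id = e.resource_id
           JOIN scheme s ON s.id = e.scheme_id
           WHERE r.url = ?
           ORDER BY s.name, e.code""",
        (url,),
    )
    return rows.fetchall()
```

In such a layout two indexers working in different disciplines add rows to index_entry for the same resource row, so the resource itself is never entered twice.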


> 1. When did the project start?
Late autumn 1996.

> 2. How many records have you done?
0 in the "solution database". Several thousand
in clearinghouses, internetographies etc. that
are to be merged into the "solution database".

> 3. Are you using any specific software?
An RDBMS plus some things that we'll have to program
ourselves or will have to have programmed.

> 4. Does the project concentrate on one type of materials only?
It will start with various types of electronic
material, but the structure is such that it can
incorporate library catalogues and "classical"
bibliographic database entries as well.

> 5. Are there any written descriptions or documentation, on the web
> or in print?

There is a description of the set of categories
used for indexing. Descriptions are - alas -
in German only. I can send you a copy if you are


So we decided to develop a framework for
sophisticated "cataloguing" of internet resources
that is open to general and specialized indexing
of all types, that permits "cooperative indexing"
and the use of automatically generated, author-generated
and indexer-generated information to describe one
and the same resource. It also makes it possible to
build a common database for both electronic and
printed resources. And, of course, it makes it
possible to include full texts.
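
How information from robots, authors and indexers might sit side by side in the description of one resource can be sketched as follows. This is an illustration only: the source labels ("robot", "author", "indexer"), the field names and the Statement type are assumptions made for the example, not the categories actually used in the project.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical labels for where a piece of descriptive information came from.
SOURCES = ("robot", "author", "indexer")


@dataclass(frozen=True)
class Statement:
    """One piece of descriptive information about one resource."""
    resource: str  # e.g. the URL of the resource
    source: str    # one of SOURCES
    field: str     # e.g. "title" or "keyword"
    value: str


def combined_record(statements, resource):
    """Collect all statements about one resource, keeping their provenance.

    Returns a dict mapping each field to a list of (source, value) pairs,
    so that nothing contributed by any source is overwritten.
    """
    record = defaultdict(list)
    for s in statements:
        if s.source not in SOURCES:
            raise ValueError(f"unknown source: {s.source!r}")
        if s.resource == resource:
            record[s.field].append((s.source, s.value))
    return dict(record)
```

Keeping the source with every value lets a retrieval interface decide later whether to trust, rank or display robot-generated and human-generated descriptions differently.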

The final solution is intended to replace our
present internetographies with documents created
"on the fly".


So we decided to go for a combination of several things:
- transforming most of our stuff from entries in more or
  less flat files to entries in a database
- hierarchies built on the fly according to search-entries
  by users
- a search-engine with quite a number of interfaces,
  hand-tailored to the various needs of various groups
  of users
- the possibility to combine searches in hierarchies with
  searches via keywords and classification codes
- and some add-ons ... .
We hope to have the new design up and working sometime this
summer ... .
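
What "hierarchies built on the fly" combined with keyword searches could look like is sketched below. This is a minimal illustration under assumptions, not the planned implementation: the function name build_hierarchy, the entry fields ("title", "discipline", "type", "keywords") and the sample values are all hypothetical.

```python
def build_hierarchy(entries, facets, keyword=None):
    """Group entries into a nested dict that follows the requested facet order.

    entries: list of dicts with a "title", a set of "keywords", and one
             value per facet name (hypothetical field names).
    facets:  non-empty list of field names; their order decides the
             levels of the hierarchy.
    keyword: if given, only entries carrying this keyword are included.
    """
    if not facets:
        raise ValueError("at least one facet is required")
    tree = {}
    for entry in entries:
        if keyword is not None and keyword not in entry["keywords"]:
            continue
        node = tree
        # Walk (and create as needed) one level per facet except the last.
        for facet in facets[:-1]:
            node = node.setdefault(entry[facet], {})
        # The last facet holds the list of matching titles.
        node.setdefault(entry[facets[-1]], []).append(entry["title"])
    return tree


ENTRIES = [
    {"title": "A", "discipline": "physics", "type": "journal",
     "keywords": {"optics"}},
    {"title": "B", "discipline": "physics", "type": "preprint",
     "keywords": {"optics", "lasers"}},
    {"title": "C", "discipline": "history", "type": "journal",
     "keywords": {"renaissance"}},
]
```

The same entries yield a discipline-first hierarchy with build_hierarchy(ENTRIES, ["discipline", "type"]) and a type-first hierarchy restricted to one keyword with build_hierarchy(ENTRIES, ["type", "discipline"], keyword="optics"), so no fixed hierarchy has to be maintained by hand.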

