Connecting classifications in the digital world


See the OAI Website at http://www.openarchives.org 
But such a task is not trivial. As for subject indexing, different classifications, thesauri, otherwise structured terminologies, or even ontologies that cover the same area can still show strong linguistic (not resolvable by mere translation), structural and semantic disagreements, in spite of any effort at harmonization. Dramatic disagreements appear in passing from the specialized world of discipline-oriented classifications to the general classifications widely used in public, school or even general academic libraries, such as the Dewey Decimal Classification, the Universal Decimal Classification or the Library of Congress Classification. Misinterpretations occur easily when the same words are used in different contexts or for different purposes. Moreover, even within one and the same classification, differences and inconsistencies are normal practice, either among different applications or inside the same application. One would expect good interconnections among classifications to be the basis of good retrieval across classifications, but this seems not to be commonly the case.
Actually, it is still worthwhile, and not only for educational purposes, to work out well-defined connections between classifications and similar systems, which ensure that the objects each of them refers to are identified unambiguously by means of a suitable representation language. With the knowledge representation languages currently being designed and implemented in computer applications, this task is becoming feasible.

Subject classifications in Mathematics, Computing, Physics
Mathematics Subject Classification (MSC)
The classification covers all branches of pure and applied mathematics, including probability and statistics, numerical analysis and computing, mathematical physics and economics, systems theory and control, information and communication theory. 
The MathSci database
The paper version consists of the journals Mathematical Reviews (MR), published since 1940, and Current Mathematical Publications (CMP). MSC, compiled since 1959 (by AMS alone until the early 1970s), was very unstable in its first years. So, for the part which appeared in print from 1940 to 1972, the MathSci database was given new classification data, which are stable over relatively long periods (1940-1958, 1959-1972) and therefore more suitable for database search than the frequently varying codes of the print version. Starting with 1973, the database is indexed with the same classification codes that appear in the print version. The 1995 and 2000 versions are available in hypertextual presentation
The Zentralblatt MATH database
The paper version consists of the journal Zentralblatt für Mathematik und ihre Grenzgebiete / Mathematics Abstracts (ZM/MA), issued since 1931, formerly by the Deutsche Akademie der Wissenschaften zu Berlin, and published by Springer-Verlag. The database is indexed with the 1991 and 2000 MSC versions; some superseded classification codes from preceding versions are also present. Math Doc Cell issues a multilingual (French, English, Italian) Web presentation
The evolving structure of MSC
From 1959 to 1985 the MathSci version of MSC counts 60 major sections; 61 from 1986 to 1999 and 63 since 2000. Until 1972 the classification was issued in two levels; an intermediate level became available in 1973 and is being progressively exploited as MSC increases in detail and grows in size. Starting with 1436 numbers in 1959, MSC counted 4895 numbers in 1999 and counts 5590 since 2000. A substantial and ever-growing apparatus of cross-references helps in understanding connections between different branches of mathematics.
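The three-level structure can be made concrete with a small sketch. It assumes the common five-character form of an MSC code (two digits for the major section, one capital letter for the intermediate level, two digits for the finest level, as in 68P20); the function name `msc_path` is hypothetical and other code shapes are deliberately not handled.

```python
import re

# Assumed shape of a full MSC code: two digits, one capital letter, two digits.
MSC_CODE = re.compile(r"^(\d{2})([A-Z])(\d{2})$")


def msc_path(code):
    """Return (major section, intermediate level, full code) for an MSC code."""
    match = MSC_CODE.match(code)
    if match is None:
        raise ValueError("not a five-character MSC code: %r" % code)
    section, letter, _ = match.groups()
    # The intermediate level is the section followed by its letter.
    return (section, section + letter, code)


if __name__ == "__main__":
    print(msc_path("68P20"))
```

Such a path is what a hierarchical display needs in order to place a code under its major section and intermediate heading.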
The EULER project
The main objective of EULER was the realization of a "one-stop shop" for research on mathematics information resources such as books, preprints, Web pages, abstracts, collections of articles and reviews, periodicals, technical reports and theses. The result is a Web meta-interface for parallel simultaneous queries to a heterogeneous collection of databases. See the EULER site: http://www.emis.de/projects/EULER/
Let us look at other classifications in the field of Mathematics:
The Referativnyj zhurnal: Matematika classification scheme
An English translation is provided by the AMS site in textual form, at the address: http://www.ams.org/mathweb/Classif/RZhClassification.html 
Zentralblatt für Didaktik der Mathematik Classification Scheme (ZDM)
The paper version of the database is Zentralblatt für Didaktik der Mathematik. A Web presentation of the ZDM classification is available at:

In the field of Computing we start with:
ACM Computing Classification System (CCS)
Moreover, it is adopted by the bibliographic database CompuScience, produced by Fachinformationszentrum (FIZ) Karlsruhe, Department of Mathematics & Computer Science Berlin, which contains references from CR since 1976, from GCL since 1977 and from Section 68 Computer Science of MSC in ZM/MA. ACM's first classification system for the computing field was published in 1964. Then, in 1982, the ACM published an entirely new system. New versions based on the 1982 system followed, in 1983, 1987, 1991, and 1998. Web presentations of the 1964, 1991 and 1998 versions are available
Moving into the field of Physics we find:
Physics and Astronomy Classification Scheme (PACS)
Revised editions of PACS are published biennially, or as necessary, by AIP. PACS contains 10 broad categories subdivided into 66 major topics 
INSPEC Classification
INSPEC was formed in 1967, based on the Science Abstracts service, which has been provided by the Institution of Electrical Engineers (UK) since 1898. Still today Physics Abstracts, Electrical & Electronics Abstracts and Computer & Control Abstracts together form the Science Abstracts series of journals, which is the paper version of the INSPEC database. 
Now we are at the general subject classifications; we start with:
Dewey Decimal Classification
The Dewey Decimal Classification is published in two editions, full and abridged. The Classification is kept up to date through electronic versions: Dewey for Windows, a CD-ROM product that is updated annually; and WebDewey in CORC, a Web-based product that is updated quarterly. The DDC is published by Forest Press, a division of OCLC Online Computer Library Center, Inc. DDC is widely used all over the world, not only for book shelving in
libraries, especially in public, school and general academic ones, but
also for subject indexing and browsing in general online document retrieval
tools, such as bibliographic databases (including the national bibliographies
of sixty countries), online library catalogues (including WorldCat, the
OCLC Online Union Catalog), digital libraries, Web search engines.
The classification is developed and maintained at the US national bibliographic agency, the Library of Congress.
The print version of Edition 21 is composed of nine major parts in four volumes as follows:
Tables, together with the very structure of the hierarchy in some areas of the classification, make up an effective approximation to facet analysis.
Relocations and Reductions; Comparative and Equivalence Tables; Reused Numbers.
Volumes 2 and 3:

The CARMEN project

Universal Decimal Classification (UDC)
Until recently responsibility for the scheme belonged to the FID (Federation Internationale de Documentation); this responsibility was passed to a consortium of publishers (the UDC Consortium) in 1992. The scheme consists of 60,000 classes (divisions and subdivisions) as well as a number of auxiliary tables. 
Library of Congress Classification
LCC is an enumerative system built on 21 major classes, each class being
given an arbitrary capital letter from A to Z, with 5 exceptions: I, O,
Displaying subject classifications: our achievements
We are especially exploiting a presentation mode (double view) that allows moving back and forth between parallel views of the same or similar structures along links inside or between the structures; this proves very useful in our setting. Such hypertexts are produced mainly by a pool of standard C programs, which operate only on sequential ASCII files and are aimed at the analysis and transformation of specific texts and at the generation of groups of syntactically simple but highly connected, JavaScript-enriched HTML pages (Hvolumes).
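The step from a sequential ASCII file to connected HTML can be illustrated with a minimal sketch. It is written in Python rather than C for brevity and is not the authors' program. The input layout (one entry per line, a code followed by its description) and the function names are assumptions made only for this example.

```python
import html
import re

# Assumed shape of a classification code inside a description, e.g. "11A41".
CODE = re.compile(r"\b\d{2}[A-Z]\d{2}\b")


def parse_entries(text):
    """Read 'CODE description' lines into a dict, keeping file order."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        code, _, description = line.partition(" ")
        entries[code] = description
    return entries


def render(entries):
    """Emit one anchored paragraph per entry, linking codes that are defined."""
    def link(match):
        code = match.group(0)
        if code in entries:
            return '<a href="#{0}">{0}</a>'.format(code)
        return code  # unknown codes stay as plain text

    rows = []
    for code, description in entries.items():
        body = CODE.sub(link, html.escape(description))
        rows.append('<p id="{0}"><b>{0}</b> {1}</p>'.format(code, body))
    return "\n".join(rows)


if __name__ == "__main__":
    sample = "11A41 Primes\n11N05 Distribution of primes [See also 11A41]\n"
    print(render(parse_entries(sample)))
```

Every code mentioned in a description becomes a link to the paragraph that defines it, which is the kind of dense internal connection the text describes; a double view would load two such pages side by side.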
The Scientific Classifications Page
http://www.math.unipd.it/~biblio/math/eng.htm. Besides hypertextual presentations of subject classifications, the page collects some Hvolumes presenting KWIC (KeyWord In Context) lists extracted from the descriptions of one or more combined classifications. Descriptions are circularly permuted on significant words, i.e. words not in a stopword list; the very long list of resulting strings is displayed on the right, subdivided into smaller, manageable lists, which can be accessed through an index appearing in the left frame. This redundant but properly paginated presentation allows rapid exploration of lexical similarities among descriptions, to obtain suggestions about possible affinities of content. The Scientific Classifications Page includes:
The Mathematics Classification Page http://www.math.unipd.it/~biblio/math/engmsc.htm
From a sequential ASCII file containing the whole MSC2000, two Hvolumes were obtained, respectively
http://www.math.unipd.it/~biblio/math/mainb/mhbmain.htm
http://www.math.unipd.it/~biblio/math/doppiaeng/mhdmain.htm
http://www.math.unipd.it/~biblio/math/italiana/mhimain.htm
http://www.math.unipd.it/~biblio/math/it+eng/mhlmain.htm
MSC2000d Hvolume, simple frame presentation, including changes from MSC 1991: http://www.math.unipd.it/~biblio/math/complexc/mhcmain.htm
MSC2000w Hvolume, simple frame presentation, with guide pages linking to subject-specific pages of relevant Websites: http://www.math.unipd.it/~biblio/math/travel/mhwmain.htm
Mathematics Subject Classification MSC and Dewey Decimal Classification DDC http://www.math.unipd.it/~biblio/math/engddc.htm
http://www.math.unipd.it/~biblio/msccdd/index.html
the proposed revision of the 510 DDC section; MSC2000; the sections E-N of the ZDM classification, encoded as 97E-97N in the MSC style, to produce the KWIC list Hvolume MSC2000 + ZDM E-N: http://www.math.unipd.it/~biblio/kwic/msccdd/index.html
KWIC (KeyWords In Context) lists for Scientific Subject Classification Descriptions http://www.math.unipd.it/~biblio/math/engkwic.htm.
http://www.math.unipd.it/~biblio/kwic/msc/
http://www.math.unipd.it/~biblio/kwic/pacs/
http://www.math.unipd.it/~biblio/kwic/acm/
PACS 2001 classification schemes http://www.math.unipd.it/~biblio/kwic/mscpacs/
ACM Computing Classification System (1998) http://www.math.unipd.it/~biblio/kwic/mscacm/
Furthermore, some improvements obtainable by discrimination of homonyms, synonyms and secondary terms shall be investigated.
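The construction of a KWIC list, circular permutation of each description on its significant words followed by sorting and subdivision, can be sketched as follows. This is an illustration in Python, not the C programs mentioned earlier; the stopword list, the "/" wrap marker and the function names are assumptions made for the example.

```python
# Tiny illustrative stopword list; the list actually used is not given in the text.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}


def kwic_lines(descriptions, stopwords=STOPWORDS):
    """Return sorted (rotated description, code) pairs, one per significant word."""
    lines = []
    for code, text in descriptions.items():
        words = text.split()
        for i, word in enumerate(words):
            if word.lower().strip(".,;:()") in stopwords:
                continue  # stopwords never become the leading keyword
            # Circular permutation: the keyword comes first, "/" marks the
            # point where the original description wrapped around.
            rotated = " ".join(words[i:] + ["/"] + words[:i])
            lines.append((rotated, code))
    return sorted(lines, key=lambda pair: (pair[0].lower(), pair[1]))


def index_by_initial(lines):
    """Split the long list into smaller lists keyed by initial letter."""
    index = {}
    for rotated, code in lines:
        index.setdefault(rotated[0].upper(), []).append((rotated, code))
    return index


if __name__ == "__main__":
    sample = {"03": "Theory of sets", "11": "Number theory"}
    for rotated, code in kwic_lines(sample):
        print(code, rotated)
```

When descriptions from two classifications are fed in together, entries that share a keyword end up adjacent in the sorted list, which is how lexical similarities across schemes become visible.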
Conclusions
Actually, the same documents mostly come to be represented, in different bibliographic utilities or catalogues, with indexing data from different systems. While general library OPACs rely on DDC and national lists of subject headings, specialized bibliographic databases each rely on their own discipline-specific classification or thesaurus. It would suffice to put these data together for matching records to create the bridge. In this way, browsing inside one subject indexing system can be integrated either with direct access to document metadata (or possibly documents), or with passage to another subject indexing system for further navigation. Suitable metadata for identifying versions of subject indexing systems would be required for effective navigation tracking, but a metadata format for such objects has yet to be defined. Defining a metadata format for subject classifications and their versions, in the framework of metadata formats for documents, is now a pressing issue. While backing such developments, our realizations in subject classification display are intended to demonstrate possibilities for library OPACs to integrate their functionalities with discipline-specific environments for document search and retrieval.
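The idea of building the bridge from matching records can be sketched in a few lines. The sketch assumes that records in an OPAC and in a specialized database can be matched on a shared identifier, and that each record carries a list of classification codes; the function names and the sample records are invented for illustration only.

```python
from collections import Counter


def build_bridge(opac_records, database_records):
    """Count how often a code from one system co-occurs with a code from the other.

    Both arguments map a shared record identifier to a list of codes.
    """
    pairs = Counter()
    for record_id, source_codes in opac_records.items():
        # Only records present in both sources contribute to the bridge.
        target_codes = database_records.get(record_id, ())
        for source in source_codes:
            for target in target_codes:
                pairs[(source, target)] += 1
    return pairs


def targets(bridge, source_code):
    """Codes of the other system linked to source_code, most frequent first."""
    hits = [(count, target)
            for (source, target), count in bridge.items()
            if source == source_code]
    return [target for count, target in sorted(hits, key=lambda h: (-h[0], h[1]))]


if __name__ == "__main__":
    # Invented sample data: DDC numbers on the OPAC side, MSC codes on the other.
    opac = {"doc1": ["512.7"], "doc2": ["512.7"], "doc3": ["516.35"]}
    database = {"doc1": ["11A41"], "doc2": ["11A41", "11N05"], "doc3": ["14H52"]}
    print(targets(build_bridge(opac, database), "512.7"))
```

The co-occurrence counts give a ranking of candidate passages from one system to the other; they suggest links for navigation and are not a substitute for an intellectually curated concordance.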
In the near future, the keywords that will index a cooperative effort
