GRETIL - Göttingen Register of Electronic Texts in Indian Languages

Experimental Site ... UNDER CONSTRUCTION!


When e-texts are downloaded in encodings that go beyond the basic 128 ASCII characters, the non-ASCII characters in the downloaded files are frequently distorted in one way or another. Usually this is due to divergent browser or computer configurations on the receiving end. For the same reason, the online screen display of such texts may vary considerably.
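
The effect can be reproduced with a small Python sketch. It is an illustration only, not part of GRETIL: the sample string and the choice of Latin-1 as the "wrong" configuration are assumptions. The same bytes are decoded once as UTF-8 and once as Latin-1; only the characters beyond ASCII are affected.

```python
# Sample text with diacritics (an illustrative assumption, not a GRETIL file).
text = "Parāśara-Smṛti"
raw = text.encode("utf-8")  # the bytes as they would travel over the network


def decode_as(data, encoding):
    """Decode bytes the way a receiver configured for `encoding` would."""
    return data.decode(encoding, errors="replace")


as_utf8 = decode_as(raw, "utf-8")      # receiver configured correctly
as_latin1 = decode_as(raw, "latin-1")  # receiver assumes a different encoding

# Pure ASCII text decodes identically under both configurations.
ascii_only = "Parasara-Smrti".encode("ascii")
ascii_is_stable = decode_as(ascii_only, "utf-8") == decode_as(ascii_only, "latin-1")

print(repr(as_utf8))
print(repr(as_latin1))
print(ascii_is_stable)
```

Each multi-byte character turns into several unrelated characters under the wrong assumption, while the ASCII-only string is unaffected.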

The planned introduction of two additional formats is intended to resolve these problems and to facilitate online retrieval from the GRETIL archives. In a way, the two formats point in opposite directions: one towards Unicode, the other towards plain ASCII.

Until now, the Parasara-Smrti, for example, has been available in

  • REE and CSX, GRETIL's two standard encodings.
In addition, it will be available in
  • Unicode HTML (UTF-8)
and
  • ATIL, the ASCII Transliteration for Indian Languages,
    devised for GRETIL on the basis of ASCII (ISO 646) characters only.
    The distinctive feature of this ASCII transliteration is that it covers the entire range of characters defined in CSX+.
For further information on these formats see the concordance and systematic list of encodings and transliteration systems (PDF).
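
The idea of an ASCII-only transliteration can be sketched in Python as follows. The mapping table below is made up for illustration and is NOT the ATIL table (for that, see the concordance mentioned above); the function name `to_ascii` is likewise hypothetical. The sketch shows the two properties that matter: the output contains ASCII characters only, and any character the table does not cover is reported rather than silently dropped.

```python
import unicodedata

# Hypothetical mapping for illustration only; this is NOT the actual ATIL table.
TOY_TABLE = {
    "ā": "aa", "ī": "ii", "ū": "uu",
    "ṛ": ".r", "ś": '"s', "ṣ": ".s",
    "ṭ": ".t", "ḍ": ".d", "ṇ": ".n",
    "ṃ": ".m", "ḥ": ".h",
}


def to_ascii(text):
    """Transliterate `text` to ASCII using TOY_TABLE; fail on uncovered characters."""
    # Normalise first so that each accented letter is a single code point.
    text = unicodedata.normalize("NFC", text)
    out = []
    for ch in text:
        if ch.isascii():
            out.append(ch)
        elif ch in TOY_TABLE:
            out.append(TOY_TABLE[ch])
        else:
            raise ValueError(f"character not covered by the table: {ch!r}")
    return "".join(out)


print(to_ascii("Parāśara-Smṛti"))
```

A scheme that claims full coverage of a character set would never reach the `ValueError` branch for any character in that set.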

Tamil texts will be processed in the same way, except that REE is not used for e-texts in Dravidian languages.

Thus, Tiruvalluvar's Tirukkural will be available in the following formats:

    CSX | UTF-8 | ATIL


"The Mahabharata Online"
Unicode HTML version with integrated reference system by Hans Ruelius

Hans Ruelius devised a remarkable SAS procedure that converts CSX+ files to Unicode HTML and automatically produces a database with an index verborum (from which a defined list of stopwords can be excluded). A simultaneous spell check produces an error list that can be used to correct and improve the CSX+ source file, from which the SAS procedure then automatically regenerates the HTML files:

  • Prototype of "The Mahabharata Online"
    (NOTE: To see the diacritics, please make sure (1) that your browser is set to UTF-8 and (2) that a font covering the required Unicode characters is installed on your computer.)
The "Mahabharata Online" was generated from the CSX+ files of the 18 books, automatically segmented into 1965 separate HTML files for the individual adhyayas on the basis of the chapter headings.
With the "Table of Contents" in the lower left frame you can navigate from book to book, and within the books from chapter to chapter.
The "Index" button in the upper left frame opens an alphabetical menu for the index verborum, from which you can call up every reference for a given word in the entire text. ...
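
The two steps described above, segmentation by chapter headings and an index verborum with stopwords excluded, can be sketched in Python. This is not the SAS procedure by Hans Ruelius, whose details are not given here. The heading pattern `adhyaya N`, the function names and the sample text are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical heading format; the real CSX+ files may mark chapters differently.
HEADING = re.compile(r"^adhyaya (\d+)$", re.MULTILINE)


def split_chapters(book_text):
    """Cut a book at each heading line and return {chapter_number: chapter_text}."""
    # With one capture group, split() yields [preamble, num1, body1, num2, body2, ...].
    parts = HEADING.split(book_text)
    return {int(num): body.strip() for num, body in zip(parts[1::2], parts[2::2])}


def build_index(chapters, stopwords=frozenset()):
    """Map every word (minus stopwords) to the sorted list of chapters containing it."""
    index = defaultdict(list)
    for number in sorted(chapters):
        for word in re.findall(r"\w+", chapters[number].lower()):
            if word not in stopwords and number not in index[word]:
                index[word].append(number)
    # Plain code-point sorting; adequate for this ASCII-only sample.
    return dict(sorted(index.items()))


# Made-up sample text, for illustration only.
sample_book = "adhyaya 1\ndharma ca artha\nadhyaya 2\nartha ca kama\n"
chapters = split_chapters(sample_book)
index = build_index(chapters, stopwords={"ca"})
print(index)
```

In a fuller version, each chapter text would be written to its own HTML file, and the index entries would link to those files.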


GRETIL home | Link to Indological Resources



Last update: 4.12.2002
Reinhold Grünendahl gruenen@mail.sub.uni-goettingen.de

© 2002 Niedersächsische Staats- und Universitätsbibliothek Göttingen