Making electronic publication easier, faster, and more powerful with Hydra, a drag-and-drop TEI publishing environment


Article Contents

Summary
The Need for Hydra
A Description of Hydra
A Demonstration of Hydra
Current Limitations and the Future of Hydra
Bibliography

Summary

Hydra is an experimental drag-and-drop electronic publishing environment for TEI-conformant XML texts. This article introduces the capabilities and limitations of the system.

The Need for Hydra

A number of tools can transform TEI-conformant XML into a form suitable for display on the web, but these tools often tax the skills of new users. Hydra was created so that authors of TEI-conformant XML can simply drop a file into a folder and immediately view the text in HTML or PDF format. The goal is to encourage the use of TEI-conformant XML as a standard by making it far easier to transform that XML into more readable formats.

A Description of Hydra

Hydra is a custom distribution of Cocoon, an XML publishing framework. Hydra builds on Cocoon’s flexible publishing environment and its separation of concerns between content, logic, and style. Cocoon implements this separation with components arranged in pipelines, where each component executes a particular function. In a typical example, a generator component reads and parses an XML file, a transformer component converts the XML markup into a different XML markup using XSLT, and a serializer component produces the resulting output. The base distribution of Cocoon includes a number of generators, transformers, and serializers that anyone can use in an application. Among them are generators that can read from native filesystems and XML, and serializers that can output PDF, RTF, SVG, and other formats. It is also possible to build custom components. Hydra uses one such custom-built transformer, Transcoder, written by Hugh Cayless. Transcoder converts Greek encoded in TLG Betacode into a number of other Greek encodings, such as SPIonic, Sgreek, and Unicode.
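The generator, transformer, and serializer stages described above can be illustrated with a minimal sketch. It is written in Python with the standard library only and is not Cocoon's API: the sample document, the `TAG_MAP` table, and all function names are hypothetical, and a plain Python function stands in for the XSLT step that Cocoon would actually use.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; a real TEI file would be far richer.
SAMPLE_TEI = """<TEI.2>
  <text>
    <body>
      <head>Sample Text</head>
      <p>First paragraph.</p>
      <p>Second paragraph.</p>
    </body>
  </text>
</TEI.2>"""

# Hypothetical mapping from TEI element names to HTML element names.
TAG_MAP = {"TEI.2": "html", "text": "body", "body": "div", "head": "h1", "p": "p"}


def generate(source: str) -> ET.Element:
    """Generator: read and parse the XML source into a tree."""
    return ET.fromstring(source)


def transform(node: ET.Element) -> ET.Element:
    """Transformer: rebuild the tree in a different markup.

    This recursive function stands in for an XSLT stylesheet.
    Unknown elements fall back to <span>.
    """
    out = ET.Element(TAG_MAP.get(node.tag, "span"))
    out.text = node.text
    out.tail = node.tail
    for child in node:
        out.append(transform(child))
    return out


def serialize(tree: ET.Element) -> str:
    """Serializer: turn the transformed tree into output text."""
    return ET.tostring(tree, encoding="unicode", method="html")


def run_pipeline(source: str) -> str:
    """Chain the three stages, as a pipeline would."""
    return serialize(transform(generate(source)))


if __name__ == "__main__":
    print(run_pipeline(SAMPLE_TEI))
```

Because each stage only depends on the output of the previous one, a stage can be swapped (for example, a different serializer) without touching the others, which is the point of the pipeline design.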

Hydra aggregates many open source projects in a single distribution. Two of the most important are Sebastian Rahtz’s XSL stylesheets and the Apache Project’s Formatting Objects Processor (FOP). Rahtz’s stylesheets transform TEI XML documents into HTML and into XSL Formatting Objects (XSL-FO), which FOP later uses to create PDFs. The stylesheets do not convert every TEI tag to XSL-FO or HTML, so Hydra may not display some texts properly. However, all of the texts in the base distribution of Hydra can be displayed in PDF and HTML, so they may provide a useful reference platform. FOP is a Java application that takes XSL-FO and transforms it into a number of output formats, including PDF, SVG, PostScript, and plain text. Hydra takes advantage of FOP’s PDF capabilities in particular.

All of the components in Cocoon are controlled by the sitemap, a file which resolves URLs and calls the appropriate generators, transformers, and serializers. The Hydra sitemap handles the logic and display concerns so that the user is free to focus only on the content of the electronic publication. The sitemap handles the logic by reading directories and files and then dynamically generating a listing of anthologies (collections of texts) and texts. The user can choose to display a text in a number of formats, including HTML, PDF, and the native XML format. Hydra can display Greek encoded in Betacode using the user’s choice of font. In addition, the Greek words can be linked to the Perseus morphological parser. Finally, each collection of texts may be displayed with a custom template by modifying the stylesheets included in the base distribution.
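The directory-driven listing described above might look roughly like the following sketch. It assumes, purely for illustration, that each anthology is a subdirectory holding `.xml` files and that each output format gets its own URL; the function names and URL shapes are hypothetical and are not taken from the Hydra sitemap.

```python
import tempfile
from pathlib import Path
from typing import Dict, List


def list_anthologies(root: Path) -> Dict[str, List[str]]:
    """Map each subdirectory (assumed to be an anthology) to its XML files."""
    listing = {}
    for folder in sorted(p for p in root.iterdir() if p.is_dir()):
        # Only .xml files count as texts; anything else is ignored.
        listing[folder.name] = sorted(f.name for f in folder.glob("*.xml"))
    return listing


def output_links(anthology: str, text: str) -> Dict[str, str]:
    """Build one hypothetical URL per output format for a text."""
    stem = Path(text).stem
    return {fmt: f"{anthology}/{stem}.{fmt}" for fmt in ("html", "pdf", "xml")}


if __name__ == "__main__":
    # Build a throwaway directory tree so the sketch runs anywhere.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "inscriptions").mkdir()
        (root / "inscriptions" / "text1.xml").write_text("<TEI.2/>")
        for anthology, texts in list_anthologies(root).items():
            for text in texts:
                print(anthology, text, output_links(anthology, text))
```

Under these assumptions, dropping a new file into a folder is enough for it to appear in the listing the next time the directory is read, with no configuration change.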

A Demonstration of Hydra

Vicus Unguentarius, a project dedicated to the study of the Roman epigraphic record pertaining to the scent industry, was initially implemented using Hydra. In a number of ways, this project demonstrates the range of customizations that Hydra allows. For example, the XSLT stylesheets have been modified with new colors and images, the front page has a navigation structure suited to the project, and the ability to choose a Greek font has been removed as unnecessary in this case. To make these customizations, the author of the project, Sandra Bolero-Imwinkelreid, in addition to writing her files in TEI-conformant XML, needed enough familiarity with XSLT to modify the parameters in the appropriate stylesheets.

Current Limitations and the Future of Hydra

One limitation of Hydra stems from its use of “off-the-shelf” XSL stylesheets freely distributed by the TEI Consortium, though these may be modified by anyone with an understanding of XSLT in order to support the requirements of particular documents. On the other hand, a major advantage of Hydra (as of Cocoon generally) lies in the modularity of its constituent elements. Whenever a new version of FOP, Transcoder, or the stylesheets appears, it can be readily downloaded and patched into the system.

 

Another limitation concerns the way Hydra displays texts. It needs to handle chunking better, that is, the process of breaking up a text held within one file. It should also provide a built-in capacity for searching through the texts, possibly by taking advantage of the Apache Project’s Lucene search engine. These improvements, along with more substantial documentation, would make Hydra a much more satisfactory electronic publication environment, possibly even one suited to some public uses.
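As a rough illustration of what chunking involves, the sketch below splits a single text into one chunk per top-level division. It is not part of Hydra; it assumes un-namespaced markup in which divisions are `div` elements directly under `body`, and the sample text and function name are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List

# Hypothetical sample text with two top-level divisions.
SAMPLE_TEXT = (
    "<text><body>"
    '<div n="1"><head>Part 1</head><p>First section.</p></div>'
    '<div n="2"><head>Part 2</head><p>Second section.</p></div>'
    "</body></text>"
)


def chunk_by_div(source: str) -> List[str]:
    """Return each top-level <div> of the body as its own XML string."""
    root = ET.fromstring(source)
    body = root.find("body")
    if body is None:
        return []
    chunks = []
    for div in body.findall("div"):
        # Drop whitespace that followed the element in the source file.
        div.tail = None
        chunks.append(ET.tostring(div, encoding="unicode"))
    return chunks


if __name__ == "__main__":
    for number, chunk in enumerate(chunk_by_div(SAMPLE_TEXT), start=1):
        print(number, chunk)
```

Each chunk could then be sent through the display pipeline separately, so that a long text is shown one division at a time rather than as a single page.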

Bibliography

Here are some links to projects that Hydra uses in its implementation:

Apache Cocoon

Apache FOP

Transcoder

Sebastian Rahtz’s XSL stylesheets

To refer to this article, please cite it as follows:

Michael Jones, “Making electronic publication easier, faster, and more powerful with Hydra, a drag-and-drop TEI publishing environment,” C. Blackwell, R. Scaife, edd., Classics@ volume 2: C. Dué & M. Ebbott, executive editors, The Center for Hellenic Studies of Harvard University, edition of April 3, 2004.