Notes on the XML Article Sample
This downloadable package consists of three basic components: (1) an article-in-progress of mine (to be published this winter by the Toyo Gakuen University Bulletin) in XML format, written according to the guidelines for XML encoding set forth by the Text Encoding Initiative (TEI); (2) a subdirectory containing the TEI DTD files; and (3) a subdirectory containing the related XSL stylesheets, which generate new HTML files from the original XML file.
Why write in XML?
As a greater volume of serious research and teaching materials begins to be published on the web, whether directly or as an afterthought, the question of how to make these materials available in the most systematic manner, so that they can readily be converted to other formats and used in other ways, becomes increasingly important.
Of course, all recent word processors can save a document as HTML, and the HTML editors contained in such Web browsers as Mozilla are gradually approaching the functionality of a word processor. So if one's purpose in HTML publishing is simply to convert an old article or two into HTML format, or to make a simple page for a small web site, the word-processor "save-as-HTML" solution can often be good enough. But if one wants to produce web materials on a steady basis, and wants these materials to remain available for ready updating or for conversion into other formats [1] with full control over the formatting, one will quickly find that simple HTML publication brings problems. Also, when it comes to encoding the information contained in the work, HTML alone cannot help, since the only thing it can encode is appearance; it cannot encode meaning. XML, when used in conjunction with XSL style sheets, provides a powerful way to merge content-based markup with the style of one's choice. And since all XML code is written in human-readable format, it is unusually accessible to authors of articles, books, and so forth who have no special training in computer programming.
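To make the contrast between encoding appearance and encoding meaning concrete, here is a minimal sketch in Python. It is not the XSL mechanism this package uses: Python's standard library has no XSLT processor, so a small recursive function stands in for a stylesheet. The sample sentence, the tag names `title` and `name`, and the two marker tables are hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical content-based markup: the tags say what each phrase is,
# not how it should look.
SOURCE = (
    "<p>See <title>Beginning XSLT</title> by "
    "<name>Jeni Tennison</name>.</p>"
)

# Two interchangeable "styles" (assumed for this sketch). Each maps a tag
# to the text placed before and after that element's content.
WEB_MARKERS = {
    "p": ("<p>", "</p>"),
    "title": ("<i>", "</i>"),
    "name": ("<b>", "</b>"),
}
PLAIN_MARKERS = {
    "title": ('"', '"'),
}


def render(elem, markers, esc):
    """Render an element and its children using one marker table.

    esc is applied to every piece of character data, so the HTML style
    can escape special characters while the plain style leaves them alone.
    """
    before, after = markers.get(elem.tag, ("", ""))
    parts = [before, esc(elem.text or "")]
    for child in elem:
        parts.append(render(child, markers, esc))
        # Text that follows a child element belongs to the parent.
        parts.append(esc(child.tail or ""))
    parts.append(after)
    return "".join(parts)


root = ET.fromstring(SOURCE)
html_out = render(root, WEB_MARKERS, escape)
text_out = render(root, PLAIN_MARKERS, str)

print(html_out)  # <p>See <i>Beginning XSLT</i> by <b>Jeni Tennison</b>.</p>
print(text_out)  # See "Beginning XSLT" by Jeni Tennison.
```

The source string is never edited to change its appearance; only the marker table is swapped. An XSL stylesheet plays the same role for a full XML document, with far more expressive matching rules.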
XML, used with XSL(T) stylesheets, provides the underlying technology for full-featured web publication. But XML itself provides only a set of general rules for the creation of markup tags. To publish literary manuscripts, research articles, and the like on the web in a systematic manner, a standard set of guidelines needs to be in place, and it is toward establishing such guidelines that the Text Encoding Initiative (TEI) has been working. Its aim is to provide a standard for the smooth interchange of digitized scholarly research information, and thus to overcome the impediments and information loss caused by proprietary software systems.
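The value of a shared tag set can be sketched as follows: when documents follow the same guidelines, a tool can find the title, author, or section headings at known paths without knowing anything else about the file. This example assumes the P4-era TEI element names (a `TEI.2` root with `teiHeader` and `text`); the notes above do not say which version of the guidelines the package's DTD files implement, and the sample document with its placeholder text is hypothetical rather than taken from the package.

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal document using P4-era TEI element names (an
# assumption; the article in the package is much larger).
TEI_SAMPLE = """\
<TEI.2>
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>Sample Article</title>
        <author>Charles Muller</author>
      </titleStmt>
      <publicationStmt><p>Placeholder publication statement.</p></publicationStmt>
      <sourceDesc><p>Placeholder source description.</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <div>
        <head>Why write in XML?</head>
        <p>Placeholder paragraph.</p>
      </div>
      <div>
        <head>Preparation</head>
        <p>Placeholder paragraph.</p>
      </div>
    </body>
  </text>
</TEI.2>
"""


def describe(xml_text):
    """Pull bibliographic data and section headings from fixed paths."""
    root = ET.fromstring(xml_text)
    return {
        "title": root.findtext("teiHeader/fileDesc/titleStmt/title"),
        "author": root.findtext("teiHeader/fileDesc/titleStmt/author"),
        "sections": [h.text for h in root.iterfind("text/body/div/head")],
    }


info = describe(TEI_SAMPLE)
print(info["title"])
print(info["sections"])
```

Note that this only checks that the text is well-formed XML and reads a few known paths. It does not validate the document against the TEI DTD, which requires a validating parser.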
A growing number of scholars are using TEI's application of XML both to digitize previously written works [2] and to write new works from the beginning. Doing so allows full freedom in the choice of future publication format, especially if there is any chance that a version of the text will eventually make it onto the web. And for those who manage full-fledged web-publication projects, this form of publication has no equal in terms of broad functionality and ease of use. [3]
Since there is ample material available both on the web and in book format explaining the structure and function of XML and XSL, I will not attempt to duplicate it here. Also, it has been my experience that most of the real learning of an application such as XML comes from working with the material in a "hands-on" manner. This is what this package attempts to offer. If you have just a few basic tools on your computer, along with a good book or two, you can begin to write your own works in XML almost immediately. You will just need a few things to get started.
Preparation
You will need:
A book is not strictly necessary to get started, but if you want to experiment seriously with these materials, you will probably want a good XML/XSL(T) book or two, for example The XML Bible by Elliotte Rusty Harold and Beginning XSLT by Jeni Tennison.
Download and Usage
Now, getting down to business:
Acknowledgments
The most critical aspects of the XSLT formatting, including the integration of the style sheets and the handling of footnotes and the table of contents, were either directly written by, or derived from examples provided by, Michael Beddow. Other sophisticated functions were applied through examples received from Jeni Tennison and Wendell Piez. However, since none of these people were directly involved in the production of the final style sheets, they should not be considered responsible for any poorly conceived code to be seen therein.
Charles Muller
Notes
1. Such as other arrangements of the HTML; conversion to e-book, PDF, or MS-Word; assimilation into databases; or incorporation into larger projects.
2. Every article and translation that I have done in the past four years has been written this way. One might, for example, download the XML files available for some of my online translations or digitized versions of previous print publications, place these in the same folder, set the links to the DTD and XSL files, and validate/output those files in the same way.
3. There are other formats, such as TeX, that are good for web publication, but TeX does not encode semantic information, and is not as readily learnable by non-programmers as XML.
4. Actually, most of the work on this project was done with Emacs, but I don't recommend it for beginners.