Our very first XSLT assignment is an Identity Transformation, a kind of transformation we have to do frequently in our projects when we need to make specific changes to our encoding. We want to make some small changes in our Georg Forster file to make better choices of TEI elements for some of our tags.
To begin, download the Georg Forster file from here: ForsterGeorgComplete.xml and open it in <oXygen/>. We don’t want to change much about this file, but we do want to alter its tagging just a little, and that is a good occasion to write an Identity Transformation XSLT, converting our XML to XML that is meant to be (for the most part) identical to the original.
Here are two changes we want to make to our XML file:
You may already be calculating how to do these tasks with a regular expression Find and Replace, and while we know you could do that, our purpose with this exercise is to make the changes using an XSLT transformation, and we hope you will learn some things about how XSLT works through this exercise!
To begin, open a new XSLT stylesheet in <oXygen/> and switch to the XSLT view. We will have some housekeeping to do as we get started.
Georg Forster’s A Voyage Round the World is coded in the TEI namespace, which means that your XSLT stylesheet must include an instruction at the top to specify that when it tries to match elements, it needs to match them in that TEI namespace. When you create a new XSLT document in <oXygen/> it won’t contain that instruction by default, so whenever you are working with TEI you need to add it (see the xpath-default-namespace attribute below). We also need to make sure that our XSLT processor understands it is outputting results to the TEI namespace, so we change one more line (see the xmlns attribute below). Our modified stylesheet template looks like the following:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xpath-default-namespace="http://www.tei-c.org/ns/1.0"
    xmlns="http://www.tei-c.org/ns/1.0"
    version="3.0">
</xsl:stylesheet>
Notice the version="3.0" in our stylesheet template above. On future assignments we will set the version to 2.0 for transforming to HTML, mostly because the older version is better tested for processing HTML output, but for an identity transformation of XML to XML, we like the efficient new code we can write in version 3.0. (You can see an older form here in the first template rule of our Identity transformation of Shakespeare’s sonnets, which you can download, save and open from here. That older first rule matches all nodes, elements and attributes throughout the document and simply copies them. It’s perfectly fine to use that older template rule in place of the one we show you below, but we like the simplicity of this new form even better!)
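Judging from the "on-no-match-shallow-copy" wording used in this exercise, the new XSLT 3.0 form is the xsl:mode declaration. A minimal sketch, assuming the stylesheet header shown earlier:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xpath-default-namespace="http://www.tei-c.org/ns/1.0"
    xmlns="http://www.tei-c.org/ns/1.0"
    version="3.0">

    <!-- XSLT 3.0: any node not matched by a template rule is shallow-copied,
         and processing continues with its attributes and children. -->
    <xsl:mode on-no-match="shallow-copy"/>

</xsl:stylesheet>
```

For comparison, the older rule mentioned in the parenthesis is probably close to the classic identity template sketched here (the exact rule in the sonnets stylesheet may differ):

```xml
<!-- Older identity template: match every attribute and node, copy it,
     then keep processing its attributes and children. -->
<xsl:template match="@* | node()">
    <xsl:copy>
        <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
</xsl:template>
```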
The xsl:mode statement with on-no-match="shallow-copy" is the opposite of the xsl:template match rules we have been showing you in our XSLT tutorial. It basically says: if I do not write a template rule to match an element, attribute, or comment node (really, any part of the document that I do not mention in a template match rule), XSLT should simply make a copy of that node and output it. Try running this and look at your output: it will look identical to the current XML document. Obviously we do not need to do this unless we want to make changes with template match rules! There is another way to copy, called "deep copy" in XSLT (xsl:copy-of), but we do not want to use it here. When you use "deep copy" in XSLT, you reproduce the full tree of nodes underneath a given element, so the understanding is that we would match on the root element only, and reproduce all the descendants of that one node just as they are. We like the "on-no-match-shallow-copy" approach because we do not necessarily want to copy every node just as it is in the original. We only want to copy a node if we have not written a template rule that will change it.
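To see how a template rule overrides the shallow copy, here is a small sketch. The change it makes (rewriting <hi rend="italic"> as <emph>) is a hypothetical placeholder chosen for illustration, not one of the changes this assignment asks for:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xpath-default-namespace="http://www.tei-c.org/ns/1.0"
    xmlns="http://www.tei-c.org/ns/1.0"
    version="3.0">

    <!-- Copy every node that no template rule below matches. -->
    <xsl:mode on-no-match="shallow-copy"/>

    <!-- Hypothetical change: rewrite <hi rend="italic"> as <emph>.
         The element and attribute names are placeholders for illustration. -->
    <xsl:template match="hi[@rend = 'italic']">
        <emph>
            <!-- Keep processing the contents so nested nodes are still copied
                 or changed by other rules. -->
            <xsl:apply-templates/>
        </emph>
    </xsl:template>

</xsl:stylesheet>
```

Everything except the matched elements passes through unchanged, which is why the output stays (for the most part) identical to the input.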
The last task uses an Attribute Value Template (AVT) to add a numbering attribute (such as @n or @number). An AVT offers a special way to extract or calculate information from our input XML to output in an attribute value (for example, this lets us come up with a count() of where the particular line we are processing sits in relation to all the preceding:: line elements ahead of it). You need to look at some examples of AVTs in order to write one yourself, so for this last task, go and look at the examples in Obdurodon’s Attribute Value Templates (AVT) tutorial. After reading the AVT tutorial, write two more template rules to add @n attributes that automatically number the <div> elements for Books, and the <div> elements for Chapters. (We would ask you to number the paragraphs, too, but we already did that!) (Hint: For help with teaching the computer how to count these properly, look at my example ID-transform stylesheet that adds line numbers to a series of sonnets, downloadable from here if you didn’t download it earlier from the Introduction to XSLT tutorial.) We will return to this later, since you will be working with AVTs in later XSLT exercises and almost certainly in your projects.
When you are finished, save your XSLT file and your XML output of the Georg Forster file, following our usual homework file naming conventions, and upload these to the appropriate place in Courseweb.
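As an illustration of the AVT counting idea described above, here is a minimal sketch for a sonnet-style file. It assumes each line of verse is a <line> element in no namespace. That structure is an assumption made for this example and may not match the actual sonnets file.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="3.0">

    <xsl:mode on-no-match="shallow-copy"/>

    <!-- Assumption: each line of verse is a <line> element in no namespace. -->
    <xsl:template match="line">
        <!-- The curly braces mark an AVT: the XPath expression inside is
             evaluated and its result becomes the attribute value.
             preceding::line counts all earlier lines in the whole document;
             preceding-sibling::line would restart the count within each parent. -->
        <line n="{count(preceding::line) + 1}">
            <!-- Only child nodes are processed here; any attributes on the
                 original <line> would need to be carried over separately. -->
            <xsl:apply-templates/>
        </line>
    </xsl:template>

</xsl:stylesheet>
```

The same pattern should carry over to numbering <div> elements once the match pattern and the counted axis are adjusted to the structure of the Forster file.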