Our first XSLT assignment is an Identity Transformation, a kind of transformation we have to do frequently in our projects when we need to make specific changes to our encoding. We want to make a small change to the encoding, and we want to add a Schematron association line to all of the Banksy collection files. We will work with the same collection of Banksy XML files we used on the XPath test.
Do a git pull on the DHClass-Hub, and find the Assignments/banksy_XML directory. This is a modified version of the Banksy project XML collection for use in this exam. Copy the banksy_XML directory into your own local space (*outside of DHClass-Hub*) to work on this assignment.
To begin, open any one of the Banksy XML files in <oXygen/> and look at the encoding. In the blinded_napoleon.xml file, take a look at the sourceDesc element:
<sourceDesc>
    <bibl>
        <title>Blinded Napoleon</title>
        <date when="2018"/>
        <medium type="spray_paint"/>
        <location lat="48.8566101" long="2.3514992">Paris, FR</location>
        <ref target="http://www.banksy.co.uk/"> Banksy's Personal Site</ref>
        <ref target="https://www.instagram.com/banksy/?hl=en"> Banksy's Personal Instagram</ref>
        <ref target="https://www.thisiscolossal.com/2018/06/banksy-in-paris/"> Article That Supplied Information for Markup</ref>
    </bibl>
</sourceDesc>
Here are two changes we want to make to our XML files:
1) The project team wants to revise the sourceDesc element, to get rid of the medium element and move the information from its @type attribute up to the title element. They want to change all the encoding to move the information from <medium type="??"> to <title medium="??">.....</title>. The team will need to be careful because there is a second title element inside the body element that they do not want to change.
2) We want to add a Schematron association line to every file in the collection.
You may already be calculating how to do these tasks with a regular expression Find and Replace, and by going through the collection one by one to add a Schematron association line, but these would be very tedious to do file by file in a collection. Writing XSLT makes it easier to process the entire collection at once, and the XSLT script you write to handle these changes also serves as documentation of a major project change you are implementing.
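To make the first change concrete, here is a sketch of how the sourceDesc in blinded_napoleon.xml might look after the transformation. It assumes the new attribute value is copied unchanged from medium/@type. The exact whitespace in your output may differ, and the Schematron association line from the second change sits at the top of the file, so it is not shown here.

```xml
<sourceDesc>
    <bibl>
        <!-- the medium element is gone; its @type value now lives on title -->
        <title medium="spray_paint">Blinded Napoleon</title>
        <date when="2018"/>
        <location lat="48.8566101" long="2.3514992">Paris, FR</location>
        <!-- the three ref elements are carried over unchanged -->
    </bibl>
</sourceDesc>
```

The title element inside the body element should come through exactly as it was in the input.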
To begin, open a new XSLT stylesheet in <oXygen/> and switch to the XSLT view. We will have some housekeeping to do as we get started:
Set the XSLT root element to process with version 3.0 by changing the version attribute as shown below:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="xs"
version="3.0">
</xsl:stylesheet>
Set up the Identity Transformation by adding the XSLT 3.0 statement <xsl:mode on-no-match="shallow-copy"/> as a child of the xsl:stylesheet root element. This statement only works because we set version="3.0" in our stylesheet template above. (If you are curious about older 2.0 approaches, you can see an old form here in the first template rule of our Identity transformation of Shakespeare’s sonnets, which you can download, save and open from here. That old first rule matches on all nodes, elements and attributes throughout the document and simply copies them. It’s perfectly fine to use that older template rule in place of the one we show you below, but we like the simplicity of this new form much better.)
This XSLT statement is the opposite of the xsl:template match rules we have been showing you in our XSLT tutorial. It basically says: if I do not write a template rule to match an element, attribute, or comment node, or really any part of the document, XSLT should simply make a copy of that node and output it. Try running this and look at your output: it will look identical to the current XML document. Obviously we do not need to do this unless we want to make changes with template match rules! There is another way to copy, called "deep copy" in XSLT, but we do not want to use it here. When you use "deep copy" in XSLT, you reproduce the full tree of descendants underneath each element, so we would have to match on the root element only, and reproduce all the descendants of that one node exactly as they are. We like the "on-no-match-shallow-copy" approach because we do not necessarily want to copy every node just as it is in the original. We only want to copy a node when we have not written a template rule that will change it.
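As a minimal sketch of the "on-no-match-shallow-copy" approach described above, a stylesheet containing nothing but the mode declaration should reproduce its input unchanged. The XML comment inside the block shows what we understand to be the older, roughly equivalent template rule; treat it as an illustration rather than the exact rule from the Shakespeare example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs"
    version="3.0">

    <!-- XSLT 3.0: when no template rule matches a node, copy that node
         and then keep processing whatever it contains. -->
    <xsl:mode on-no-match="shallow-copy"/>

    <!-- Older form of the same idea, for comparison only:
         <xsl:template match="@* | node()">
             <xsl:copy>
                 <xsl:apply-templates select="@* | node()"/>
             </xsl:copy>
         </xsl:template>
    -->

</xsl:stylesheet>
```

Any template rule you add below the mode declaration takes over for the nodes it matches, and everything else is still copied.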
Write a template rule that matches sourceDesc//title (that is, any title element with an ancestor sourceDesc element anywhere in the tree). You will output its contents using <xsl:apply-templates/>, and you will need to construct a newly changed version of the element around that apply-templates instruction.
In constructing the new <title> element, we need to add an attribute value to our sourceDesc//title element, and for this we will work with Attribute Value Templates (or AVT), a handy special format in XSLT. An AVT offers a special way to extract or calculate information from our input XML to output in an attribute value. You need to look at some examples of AVTs in order to write one yourself, so for this task, go and look at the examples in Obdurodon’s Attribute Value Templates (AVT) tutorial. Apply what you learn there to position a new attribute on the title element that pulls its information from the medium/@type in the source file, and output its contents.
Leave the medium element from the source file out of your output. How do you write XSLT to remove an element? Look this up in our XSLT tutorial and find out how to do this.
Add the Schematron association line. A processing instruction is one of the node types that a template rule can match (along with element(), attribute(), comment(), processing-instruction(), and more). You will want to preserve the existing processing instruction, and add a new one to follow it. We will get you started here:
<xsl:template match="processing-instruction()">
    <xsl:copy/>
    <!--ebb: The xsl:copy instruction above says to fully copy out the existing processing-instruction, which is the Relax NG schema line. Write your code to add the Schematron line after this point. -->
</xsl:template>
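The techniques in the steps above can be sketched together on a made-up document, so that you can see their shape without copying them directly. Everything specific here is an assumption for illustration only: the element names meta, heading and genre, the attribute names kind and genre, and the file name hypothetical.sch do not come from the Banksy files. The sketch also assumes an input with no namespace, where genre is a following sibling of heading.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="3.0">

    <!-- Copy every node that has no template rule of its own. -->
    <xsl:mode on-no-match="shallow-copy"/>

    <!-- Rebuild only the heading elements inside meta (hypothetical names).
         The curly braces mark an Attribute Value Template: the XPath inside
         them is evaluated and its result becomes the attribute value. -->
    <xsl:template match="meta//heading">
        <heading genre="{following-sibling::genre/@kind}">
            <xsl:apply-templates/>
        </heading>
    </xsl:template>

    <!-- An empty template rule: the matched element produces no output,
         which removes it from the result. -->
    <xsl:template match="genre"/>

    <!-- Copy each existing processing instruction, then construct a new one.
         The href value is a placeholder, not a real project file. -->
    <xsl:template match="processing-instruction()">
        <xsl:copy/>
        <xsl:processing-instruction name="xml-model">href="hypothetical.sch" type="application/xml" schematypens="http://purl.oclc.org/dsdl/schematron"</xsl:processing-instruction>
    </xsl:template>

</xsl:stylesheet>
```

Two design points are worth noticing. Building the element as a literal result element drops any attributes the original element carried, so those would need to be handled separately if there were any. And the processing-instruction rule fires once for every processing instruction in the input, so an input with more than one would get the new line after each of them.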
Run your transformation with the Run-to-End button. Eyeballing those results is not really enough because the Output window does not check for well-formedness or validation against a schema. Be sure to save those results, either by setting an output location in the appropriate place in the selection boxes, or by right-clicking on the output window and selecting Save as. Always, always open the saved output file in <oXygen/> and check to make sure that it checks out as valid and well-formed. Your new output should address all of the schema validation errors and return a green square.
When you are finished, save your XSLT file with the .xsl file extension, following our usual homework filenaming conventions. (You don’t need to alter the filenames of the XML files.) Submit 1) your XSLT file, 2) your input XML, and 3) your output XML to the appropriate place in Canvas.