Last modified: Wednesday, 28-Oct-2020 08:14:19 UTC. Maintained by: Elisa E. Beshero-Bondar (eeb4 at psu.edu). Powered by firebellies.

CDV: XSLT Exercise 1

Our first XSLT assignment is an Identity Transformation, a kind of transformation we frequently have to do in our projects when we need to make specific changes to our encoding. We want to make a small change to the encoding, and we want to add a Schematron association line to all of the Banksy collection files. We will work with the same collection of Banksy XML files we used on the XPath test.

Do a git pull on the DHClass-Hub, and find the Assignments/banksy_XML directory. This is a modified version of the Banksy project XML collection for use in this exam.
Copy the banksy_XML directory into your own local space (*outside of DHClass-Hub*) to work on this assignment.

To begin, open any one of the Banksy XML files in oXygen and look at the encoding. In the blinded_napoleon.xml file, take a look at the sourceDesc element:

     <sourceDesc>
          <bibl>
              <title>Blinded Napoleon</title>
              <date when="2018"/>
              <medium type="spray_paint"/>
              <location lat="48.8566101" long="2.3514992">Paris, FR</location>
              <ref target="http://www.banksy.co.uk/">
              Banksy's Personal Site</ref>
              <ref target="https://www.instagram.com/banksy/?hl=en">
              Banksy's Personal Instagram</ref>
              <ref target="https://www.thisiscolossal.com/2018/06/banksy-in-paris/">
              Article That Supplied Information for Markup</ref>
          </bibl>
      </sourceDesc>

Here are two changes we want to make to our XML file:

Let’s pretend that the Banksy team recently decided to change its XML encoding inside this sourceDesc element, to get rid of the medium element and move the information from its @type attribute up to the title element. They want to change all the encoding to move the information from <medium type="??"> to <title medium="??">.....</title>. The team will need to be careful because there is a second title element inside the body element that they do not want to change.
We want to add a Schematron association line to apply the Schematron you wrote for Schematron Exercise 1 to every file in the collection.
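
To make the first change concrete, here is a sketch of how the sourceDesc from blinded_napoleon.xml would be expected to look after the transformation. It assumes that every other child of bibl passes through unchanged and that indentation is tidied afterwards; your actual output may leave a blank line where the medium element used to be.

```xml
<sourceDesc>
    <bibl>
        <!-- The value of the old medium/@type now sits on the title as @medium -->
        <title medium="spray_paint">Blinded Napoleon</title>
        <date when="2018"/>
        <!-- The medium element itself is gone -->
        <location lat="48.8566101" long="2.3514992">Paris, FR</location>
        <ref target="http://www.banksy.co.uk/">
        Banksy's Personal Site</ref>
        <ref target="https://www.instagram.com/banksy/?hl=en">
        Banksy's Personal Instagram</ref>
        <ref target="https://www.thisiscolossal.com/2018/06/banksy-in-paris/">
        Article That Supplied Information for Markup</ref>
    </bibl>
</sourceDesc>
```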

You may already be thinking about how to do these tasks with a regular-expression Find and Replace, and by going through the collection one by one to add a schema association line, but that would be very tedious to do file by file. Writing XSLT lets us process the entire collection at once, and the XSLT script you write to handle these changes also serves as documentation of a major project change you are implementing.

To begin, open a new XSLT stylesheet in <oXygen/> and switch to the XSLT view. We will have some housekeeping to do as we get started:

Setting up the Stylesheet

Change the XSLT root element to process with version 3.0: set the version attribute to "3.0", as on the last line of the opening tag below:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs"
    version="3.0">
    
</xsl:stylesheet>

Writing the Identity Transformation!

We will give you your first template rule, to set this as an identity transformation. We’re going to use a new form for this in XSLT version 3.0, which is why we have set version="3.0" in our stylesheet template above. (If you are curious about older 2.0 approaches, you can see an old form here in the first template rule of our Identity transformation of Shakespeare’s sonnets, which you can download, save and open from here. That old first rule matches on all nodes, elements and attributes throughout the document and simply copies them. It’s perfectly fine to use that older template rule in place of the one we show you below, but we like the simplicity of this new form much better.)
<xsl:mode on-no-match="shallow-copy"/>

This XSLT statement works in the opposite way from the xsl:template match rules we have been showing you in our XSLT tutorial. It basically says: if I do not write a template rule to match an element, attribute, comment node, or really any other part of the document, XSLT should simply make a copy of that node and output it. Try running this and look at your output: it will look identical to the current XML document. Obviously we do not need to do this unless we want to make changes with template match rules! There is another way to copy, called "deep copy" in XSLT, but we do not want to use it here. A deep copy reproduces the full tree of descendants underneath an element, so we would match on the root element only and reproduce all the descendants of that one node exactly as they are. We like the on-no-match="shallow-copy" approach because we do not necessarily want to copy every node just as it is in the original. We only want to copy a node when we have not written a template rule that changes it.
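
For reference, here is a minimal sketch of the complete stylesheet at this stage. The commented-out rule is the conventional pre-3.0 identity template, included only for comparison; it is an assumption that it resembles the old form mentioned above, and the linked sonnets file may differ in detail.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs"
    version="3.0">

    <!-- XSLT 3.0: any node with no matching template rule is shallow-copied,
         and processing then continues with its attributes and children. -->
    <xsl:mode on-no-match="shallow-copy"/>

    <!-- Conventional older identity template (comparison only, not needed here):
    <xsl:template match="@* | node()">
        <xsl:copy>
            <xsl:apply-templates select="@* | node()"/>
        </xsl:copy>
    </xsl:template>
    -->

</xsl:stylesheet>
```

Run as it stands, this stylesheet should reproduce the input document; every template rule you add afterwards overrides the copying behavior for just the nodes it matches.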
We will begin by processing just one file until we are certain of the changes we want to make to the whole collection. Keep one Banksy file open, and test your XSLT on this file.
Next, we will simply write our template rules to match on the particular elements we wish to change. Keep in mind that we only wish to change the title element inside the sourceDesc element, and we do not want to alter any other title elements in the Banksy documents. Review our Introduction to XSLT to see how to write a template match on an XPath pattern (you want to match on a title element with an ancestor sourceDesc element anywhere in the tree). You will output its contents using <xsl:apply-templates/>, and you will need to construct a newly changed version of the element around that apply-templates instruction.
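
As an illustration of the pattern (not the assignment's answer), here is a sketch using hypothetical element names: a document in which heading elements appear both inside and outside chapter elements, where only the ones inside a chapter should change. The names chapter, heading, and the reviewed attribute are all invented for this example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="3.0">

    <xsl:mode on-no-match="shallow-copy"/>

    <!-- Matches only heading elements that have a chapter ancestor.
         Any other heading is left to the shallow-copy default. -->
    <xsl:template match="chapter//heading">
        <!-- Construct the changed element by hand around apply-templates.
             Note: attributes on the original heading are not carried over
             unless we re-create them here. -->
        <heading reviewed="yes">
            <!-- Process the child nodes (text and any nested elements) -->
            <xsl:apply-templates/>
        </heading>
    </xsl:template>

</xsl:stylesheet>
```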
To construct our new and improved <title> element, we need to add an attribute to our sourceDesc//title element, and for this we will work with Attribute Value Templates (or AVTs), a handy special format in XSLT. An AVT offers a special way to extract or calculate information from our input XML and output it in an attribute value. You need to look at some examples of AVTs in order to write one yourself, so for this task, go and look at the examples in Obdurodon’s Attribute Value Templates (AVT) tutorial. Apply what you learn there to position a new attribute on the title element that pulls its information from the medium/@type in the source file, and output its contents.
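
Here is a sketch of what an AVT looks like, again with hypothetical names: each chapter is assumed to contain a heading and a sibling status element carrying a code attribute, and we want that code to appear on the heading. The curly braces hold an XPath expression that is evaluated from the current context node (the matched heading).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="3.0">

    <xsl:mode on-no-match="shallow-copy"/>

    <!-- Hypothetical input:
         <chapter><heading>One</heading><status code="draft"/></chapter> -->
    <xsl:template match="chapter//heading">
        <!-- The AVT {../status/@code} steps up to the parent and reads
             the code attribute of the sibling status element. -->
        <heading status="{../status/@code}">
            <xsl:apply-templates/>
        </heading>
    </xsl:template>

</xsl:stylesheet>
```

Outside the curly braces, attribute text is output literally, so a value such as status="code-{../status/@code}" would mix fixed text with the computed part.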
Next we need to write a template rule that will suppress the medium element from the source file. How do you write XSLT to remove an element? Look this up in our XSLT tutorial and find out how to do this.
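
One common way to do this, sketched here with a hypothetical status element rather than the assignment's own, is a template rule that matches the element and deliberately outputs nothing. Check the tutorial to confirm this is the approach it recommends.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="3.0">

    <xsl:mode on-no-match="shallow-copy"/>

    <!-- An empty template rule: the status element is matched, so the
         shallow-copy default no longer applies to it, and because the rule
         has no content, nothing is written to the output in its place. -->
    <xsl:template match="status"/>

</xsl:stylesheet>
```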
Finally, we want to add the Schematron association line using XSLT. Look at the syntax you need by first associating a Schematron line with one of your XML files. Then, study this W3Schools example of how to add a processing instruction with XSLT. (This is a weird example showing how to associate a CSS file with an XML file, but the same logic applies, with slightly different syntax, to generate a processing instruction beneath the document node.) Write a template rule that adds a Schematron line. To write this rule, first understand that XSLT template rules can match on any kind of XML node (element(), attribute(), comment(), processing-instruction(), and more). You will want to preserve the existing processing instruction, and add a new one to follow it. We will get you started here:
```
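<xsl:template match="processing-instruction()">
    <!-- Preserve the existing processing instruction -->
    <xsl:copy/>
    <!-- Add your new processing instruction here -->
</xsl:template>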
```
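
For orientation, here is a sketch of the CSS-style case that the W3Schools example covers, adapted to sit in an identity transformation. The file name style.css is hypothetical, and this version matches on the document node rather than on an existing processing instruction. A Schematron association uses a different instruction name and different pseudo-attributes, which you can read off the line oXygen generates for you.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="3.0">

    <xsl:mode on-no-match="shallow-copy"/>

    <xsl:template match="/">
        <!-- The name attribute becomes the instruction's target; the text
             inside becomes its data. This writes:
             <?xml-stylesheet href="style.css" type="text/css"?> -->
        <xsl:processing-instruction name="xml-stylesheet">href="style.css" type="text/css"</xsl:processing-instruction>
        <!-- Continue processing the rest of the document -->
        <xsl:apply-templates/>
    </xsl:template>

</xsl:stylesheet>
```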
When you have made these changes to one Banksy XML file, pause there. We will continue together in class at our next meeting, where we will apply the transformation to the whole collection of XML files.
You will be looking at your results in the Output window as you write and test your template rules each time you press the blue Run-to-End button. Eyeballing those results is not really enough, because the Output window does not check for well-formedness or validate against a schema. Be sure to save those results, either by setting an output location in the appropriate place in the selection boxes, or by right-clicking on the output window and selecting Save as. Always, always open the saved output file in <oXygen/> and confirm that it is well-formed and valid. Your new output should address all of the schema validation errors and return a green square.

When you are finished, save your XSLT file with the .xsl file extension and following our usual homework filenaming conventions. (You don’t need to alter the filenames of the XML files.) Submit 1) your XSLT file, 2) your input XML, and 3) your output XML to the appropriate place in Canvas.