For this assignment, we will be working with a file from the Digital Mitford project that stores lists of names and information on people, places, organizations, and texts, among other kinds of named entities (really, anything that is named). We will work with a slightly modified version of the Digital Mitford Site Index, and we have prepared a starter XSLT file for you.
Run git pull to retrieve it. It is saved in Class-Examples >> XSLT >> DigitalMitford_SI.
Before you dive in to start writing XSLT, please read this assignment thoroughly, so you understand what you're doing and what you need to set up in the XSLT stylesheet. Our goal is to create a structured outline in HTML of all the information about organizations in the site index. We want to output that in HTML in the form of a list with nested lists inside: an outline of the categories of organization, with a nested sublist of organization names inside each category. One possible use of a webpage like this is as a list of links, so that each organization name might link to a page of information on that organization. We don’t have to generate those links now. For this assignment, we just want to learn how to use XSLT to transform XML into HTML and to generate the lists themselves by pulling the exact content we want out of our XML.
For the organization types or categories, we need to pull from the <head> element sitting at the top of each <listOrg> element in our TEI file. For the organization names, we reach in to find the individual <org> entries and their child <orgName> elements inside each <listOrg> element. Each <org> element contains one <orgName> that holds the best-known name of a particular organization. You may first want to experiment with XPath on the Site Index file to locate the <listOrg> elements and study the XML hierarchy of the lists. Let’s make the outer list an ordered (numbered) list in HTML, using the HTML <ol> element, and make the inner list an unordered (bulleted) list, using the HTML <ul> element.
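To make that hierarchy concrete, here is a simplified sketch of what one <listOrg> might look like. It is assembled only from the description above and two names from the sample output; the real Site Index entries very likely carry attributes and additional child elements that are not shown here, so treat it as an assumption to check against the actual file.

```xml
<!-- Simplified, hypothetical sketch of one listOrg in the Site Index.
     Attributes and any other children of org are deliberately omitted. -->
<listOrg xmlns="http://www.tei-c.org/ns/1.0">
   <!-- The category label we want in the outer (numbered) list -->
   <head>Archives Holding Mitford's Papers</head>
   <!-- One org per organization; its orgName goes in the inner (bulleted) list -->
   <org>
      <orgName>Berkshire Record Office</orgName>
   </org>
   <org>
      <orgName>British Library</orgName>
   </org>
</listOrg>
```

Against a structure like this, XPath expressions such as //listOrg, //listOrg/head, and //listOrg/org/orgName would be reasonable starting points for your experiments.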
Your lists in HTML should come out looking something like this, only yours will contain many more entries in each category, because your XML document contains some new material.
The underlying HTML, which we generated by running XSLT, should look like this:
<ol>
   <li>Archives Holding Mitford's Papers
      <ul>
         <li>Baylor University, Armstrong Browning Library</li>
         <li>Berkshire Record Office</li>
         <li>British Library</li>
         <li>Boston Public Library</li>
         <li>Cambridge University: Fitzwilliam Museum</li>
         <li>Duke University Rubenstein Library</li>
         <li>Eton College</li>
         <li>Florida State University Special Collections</li>
         <li>The Women's Library, Glasgow</li>
      </ul>
   </li>
   <li>Organizations Relevant to Mitford's World
      <ul>
         <li>Billiard Club</li>
         <li>House of Bourbon</li>
         <li>Cavaliers</li>
         <li>Court of Chancery</li>
         <li>Church of England</li>
         <li>the Cockney School</li>
         <li>Mr.and Mrs.Mitford</li>
         <li>the Moncks, family of John Berkeley Monck</li>
         <li>New Model Army</li>
         <li>Palmerite</li>
         <li>Parliament</li>
      </ul>
   </li>
   <li>Fictional Organizations Referenced by Mitford
      <ul>
         <li>Attendants &c.</li>
         <li>Citizens</li>
         <li>Guards</li>
         <li>Guards</li>
         <li>Ladies</li>
         <li>Nobles (in Julian)</li>
         <li>Nobles (in Rienzi)</li>
         <li>officers in Charles I</li>
         <li>Prelates</li>
      </ul>
   </li>
</ol>
In HTML ordered and unordered lists, the only elements permitted as direct children are list items, or <li> elements. We’ve nested them so that each list item in the outer numbered list contains a category type (designating what kind of organization), followed by an embedded <ul> that contains, in turn, a separate bulleted list naming each organization in that category.
If you’re feeling adventurous, once you obtain the output we're seeking, you may go on to build other HTML lists, working with other portions of the XML document, such as the <listBibl> or <listPerson> sections, which are formatted a little differently. The only required content of your homework, though, is the HTML outline of organization types and organization names.
Since the Digital Mitford Site Index is coded in the TEI namespace, we will need to make some edits to our XSLT 3.0 stylesheet so that it reads from a TEI document and outputs HTML 5 in XHTML syntax. If we don't make these changes, our XPath expressions will not match the TEI elements in the input file, and the output will not be in the correct HTML 5 format.
First, we add an @xpath-default-namespace attribute to the <xsl:stylesheet> element, set to resolve to the TEI namespace. (When you create a new XSLT document in <oXygen/> it won’t contain that instruction by default, so whenever you are working with TEI you need to add it.)
Second, we add an <xsl:output> element as the first child of our <xsl:stylesheet> element.
The code below also declares the XHTML namespace as the default namespace, so that the HTML elements we create are output in that namespace. So, our modified xsl:stylesheet and xsl:output elements look like this, and you should copy this into your stylesheet:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0"
    xmlns="http://www.w3.org/1999/xhtml"
    xpath-default-namespace="http://www.tei-c.org/ns/1.0">
    <xsl:output method="xhtml" html-version="5" omit-xml-declaration="yes"
        include-content-type="no" indent="yes"/>
</xsl:stylesheet>
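To see what @xpath-default-namespace is saving us from, here is a sketch of the alternative: binding the TEI namespace to a prefix and writing that prefix on every element name in our XPath. The prefix tei is an arbitrary choice on our part, and the single template rule is only there to illustrate the syntax; this is not the setup we are asking you to use.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Alternative sketch (an assumption for illustration, not the assignment's setup):
     the TEI namespace is bound to the prefix "tei" instead of being declared
     with xpath-default-namespace. -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0"
    xmlns="http://www.w3.org/1999/xhtml"
    xmlns:tei="http://www.tei-c.org/ns/1.0"
    exclude-result-prefixes="tei">
    <xsl:output method="xhtml" html-version="5" omit-xml-declaration="yes"
        include-content-type="no" indent="yes"/>
    <!-- Every TEI element name in @match and @select now needs the prefix.
         A plain match="listOrg" here would look for a listOrg element in no
         namespace and would never match the TEI elements. -->
    <xsl:template match="tei:listOrg">
        <li>
            <xsl:apply-templates select="tei:head"/>
        </li>
    </xsl:template>
</xsl:stylesheet>
```

With @xpath-default-namespace set as in our stylesheet, unprefixed names in XPath are read as TEI elements, so we can simply write listOrg and head.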
Our XSLT transformation (after all this housekeeping) has three template rules:
First, a template rule for the document node (<xsl:template match="/">), in which we create the basic HTML file structure: the <html> element, <head> and its contents, and <body>, that is, anything that appears just once in the HTML document (a one-to-one relationship with the root node). Inside the <body> element that we’re creating, we create our wrapper <ol> tags to set up the ordered list of organization types, and inside them we use <xsl:apply-templates> to select the <listOrg> elements (using an XPath expression as the value of the @select attribute).
Second, a template rule that matches the <listOrg>
elements (holding the lists of organizations), so it is invoked as a result of the preceding <xsl:apply-templates> instruction and fires once for each <listOrg> element in our Site Index. Inside that template rule we create a new list item (<li>) for the particular <listOrg> being processed, and inside the tags for that new list item we do two things. First, we apply templates to the <head> of the <listOrg>, which causes its category description to be output when we run the transformation. Second, we create wrapper <ul> tags for the nested list that will contain the names of the organizations within that category. Inside that new <ul> element, we use an <xsl:apply-templates> instruction to apply templates to (that is, to process) the <org> elements of that <listOrg>.
Third, a template rule that matches the <org>
elements, which make up the items in the list of organizations. It creates a list item (<li>) for the organization and just applies templates to the <orgName> element within each <org>. This rule will fire once for each <org> element inside the <listOrg>, and it will be called separately for the <org> elements within each <listOrg>, so that the orgs will be rendered properly in their respective lists.
We don’t need a template rule for the <head>
elements themselves because the built-in (default) template rule in XSLT for an element that doesn’t have an explicit, specified rule is just to apply templates to its children. The only child of each <head> element is a text node, and the built-in rule for text nodes is to output them literally. In other words, if you apply templates to <head> and you don’t have a template rule that matches that element, ultimately the transformation will just output the textual content of the head, that is, the title that you want.
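The three template rules described above can be sketched roughly as follows. This is one possible arrangement under stated assumptions, not the only correct answer: the <title> text is a placeholder we made up, and select="//listOrg" assumes the <listOrg> elements are not nested inside one another in the Site Index (check this with XPath before relying on it).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0"
    xmlns="http://www.w3.org/1999/xhtml"
    xpath-default-namespace="http://www.tei-c.org/ns/1.0">
    <xsl:output method="xhtml" html-version="5" omit-xml-declaration="yes"
        include-content-type="no" indent="yes"/>

    <!-- Rule 1: fires once, for the document node. Builds the HTML skeleton
         and the wrapper <ol>. The <head> written here is an HTML element in
         the XHTML namespace, not the TEI <head>. -->
    <xsl:template match="/">
        <html>
            <head>
                <!-- Placeholder title (hypothetical wording) -->
                <title>Organizations in the Site Index</title>
            </head>
            <body>
                <ol>
                    <!-- Assumes listOrg elements are not nested -->
                    <xsl:apply-templates select="//listOrg"/>
                </ol>
            </body>
        </html>
    </xsl:template>

    <!-- Rule 2: fires once per listOrg. Outputs the category label, then
         a nested <ul> for the organizations in that category. -->
    <xsl:template match="listOrg">
        <li>
            <!-- No rule matches the TEI head, so the built-in rules
                 output its text content. -->
            <xsl:apply-templates select="head"/>
            <ul>
                <xsl:apply-templates select="org"/>
            </ul>
        </li>
    </xsl:template>

    <!-- Rule 3: fires once per org inside the listOrg being processed. -->
    <xsl:template match="org">
        <li>
            <xsl:apply-templates select="orgName"/>
        </li>
    </xsl:template>
</xsl:stylesheet>
```

Notice that the unprefixed XPath names (listOrg, head, org, orgName) are read in the TEI namespace because of @xpath-default-namespace, while the literal elements we write out (html, ol, li, ul) land in the XHTML namespace because of the default namespace declaration.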
XSLT also has an <xsl:for-each> instruction that could be used to solve this problem. We are prohibiting its use for now; your solution must use <xsl:template> and <xsl:apply-templates> rules instead. There’s a Good Reason for this, which is to see how XSLT templates work. Later on we will discuss situations where you should use <xsl:for-each> because a template rule would not work so well.
Some org elements hold more than one orgName, and we expect the best-known orgName to be listed first in the list of names, so we recommend that you tidy up your list by selecting just the very first available orgName, that is, the first element child named orgName of the org elements you are processing. Alternatively, you may try applying the XPath string-join() function to output all the entries, but you will need to use xsl:value-of instead of xsl:apply-templates, because we need xsl:value-of to calculate the results of functions (which removes us from the XML tree). Either approach is fine with us, and either way you use the @select attribute to indicate what you would like to output.
If you get stuck, post your question to the #xslt
channel and/or to the textEncoding-Hub Issues board. You can’t just ask for the answer, though; you need to describe what you tried, what you expected, what you got, and what you think the problem is. We often find, just as we’re preparing to post our own queries to coding discussion boards, that having to write up a description of the problem helps us think it through and solve it ourselves. The technical term for this is rubber duck debugging. Beyond just being patient rubber ducks, though, we’re also encouraging you to discuss the homework on the discussion boards because that’s also helpful for the person who responds. Answering someone else’s inquiry and troubleshooting someone else’s problem often helps us clarify matters for ourselves!
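Returning to the two ways of handling an org with more than one orgName, here is a minimal sketch of how the rule for org might look in each case. Only the org rule is shown, so this would slot into your stylesheet alongside your other template rules. The ', ' separator passed to string-join() is our own assumption; choose whatever separator reads well in your list.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0"
    xmlns="http://www.w3.org/1999/xhtml"
    xpath-default-namespace="http://www.tei-c.org/ns/1.0">
    <xsl:template match="org">
        <li>
            <!-- Option A: the predicate [1] selects only the first orgName
                 child of the current org, and we process it with templates. -->
            <xsl:apply-templates select="orgName[1]"/>
            <!-- Option B (use INSTEAD of the line above, not in addition):
                 <xsl:value-of select="string-join(orgName, ', ')"/>
                 This joins the string values of all orgName children into one
                 string, using the (assumed) separator ', '. -->
        </li>
    </xsl:template>
</xsl:stylesheet>
```

Option A stays inside the XML tree, so any template rule you later write for orgName would still apply. Option B produces a plain string, so no further template rules fire on the names.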