The Digital Mitford Site Index stores lists of names and information on people, places, organizations, and texts, among other kinds of named entities referenced in files throughout the Digital Mitford project. For this assignment, we will work with a slightly modified version of the Digital Mitford Site Index, which you should download from here and open in <oXygen>. (You may see some errors being flagged on this file in oXygen from our project schema, and that is okay! This is an old project file that we have purposefully preserved for homework exercises, and you can process the code as long as it is well formed, as the schema validation doesn’t matter for our purposes.) Our goal is to create a structured outline in HTML of all the information about organizations in the site index. We want to output that in HTML in the form of a list with nested lists inside, representing an outline of first the categories of organization, and then inside each category, a new list of the organization names. This is something we actually need to do in the Mitford project: to process portions of the Site Index file to make it readable on the web as a list. One possible use of a webpage like this is as a list of links, so that each organization name might link to a page of information on each organization. We don't have to generate those links now. For this assignment, we just want to learn how to transform XSLT to HTML and to generate the lists themselves by pulling the right content out of our XML.
If you’re feeling adventurous, once you obtain the output we're seeking, you may go on to build other HTML lists, working with other portions of the XML document, such as the <listBibl>
or <listPerson>
sections, which are formatted a little differently. The only required content of your homework, though, is the HTML outline of organization types and organization names. For the organization types or categories, we need to pull from the <head> element sitting inside at the top of each <listOrg> elements in our TEI file. For the organization names, we reach in to find the individual entries for <org> and their child <orgName> elements inside each <listOrg> element. Each <org> element contains one <orgName> inside that holds the best-known name of a particular organization. You may first want to experiment with XPath on the Site Index file to locate the <listOrg> elements and study the XML hierarchy of the lists. Let’s make the outer list be ordered (or numbered) list in HTML, using the HTML <ol> element, and then make the inner list be an unordered (bulleted) list, using the HTML <ul> element.
Your lists in HTML should come out looking something like this, only yours will have a few more entries in each category, because your XML document contains some new material.
The underlying HTML, which we generated by running XSLT, should look like this:
<ol> <li>Archives Holding Mitford's Papers<ul> <li>Baylor University, Armstrong Browning Library</li> <li>Berkshire Record Office</li> <li>British Library</li> <li>Boston Public Library</li> <li>Cambridge University: Fitzwilliam Museum</li> <li>Duke University Rubenstein Library</li> <li>Eton College</li> <li>Florida State University Special Collections</li> <li>The Women's Library, Glasgow</li> <li>Houghton Library, Harvard</li> <li>Huntington Library</li> <li>University of Iowa Special Collections</li> <li>Massachusetts Historical Society</li> <li>New York Public Library</li> <li>Oxford University, Balliol College Archives</li> <li>Oxford University, Bodleian Library</li> <li>Reading Central Library The principal archive of Mary Russell Mitford's personal papers and related documents, holding approximately 1,000 manuscripts and a nearly comprehensive collection of her publications. </li> <li>John Ruskin Library, Lancaster</li> <li>The John Rylands Library</li> <li>National Library of Scotland, Manuscript Collections</li> <li>University of Texas, Ransom Center</li> <li>University of Reading Special Collections</li> <li>University of Virginia Special Collections</li> <li>Wellesley College, Margaret Clapp Library, Special Collections</li> <li>Wordsworth Trust</li> <li>Yale University, Beineke Library</li> </ul> </li> <li>Organizations Relevant to Mitford's World<ul> <li>Billiard Club</li> <li>House of Bourbon</li> <li>Cavaliers</li> <li>Court of Chancery</li> <li>Church of England</li> <li>the Cockney School</li> <li> Drover </li> <li>Eton College</li> <li>High Court of Justice</li> <li>House of Commons</li> <li>the Kembles</li> <li>House of Medici</li> <li> Mitford </li> <li>Mr.and Mrs.Mitford</li> <li>the Moncks, family of John Berkeley Monck </li> <li>New Model Army</li> <li>Palmerite</li> <li>Parliament</li> <li>Court of Pope Pius VII</li> <li>Prelacy</li> <li>the Presybterian faction</li> <li>Privy Council</li> <li>Richmond Coach or Stage</li> <li>Scriblerus Club</li> <li> Slade family </li> <li>Taylor and Hessey (publishers)</li> <li>Tory Party</li> <li>Twickenham Coach or Stage</li> <li> Valpy family </li> <li> Webb family </li> <li>Weylandite</li> </ul> </li> <li>Fictional Organizations Referenced by Mitford<ul> <li>Attendants &c.</li> <li>Citizens</li> <li>Guards</li> <li>Guards</li> <li>Ladies</li> <li>Nobles (in Julian)</li> <li>Nobles (in Rienzi)</li> <li>officers in Charles I </li> <li>Prelates</li> </ul> </li> </ol>
In HTML ordered and unordered lists, the only elements permitted inside are list items or <li> elements. We’ve nested them so that each list item in the outside numbered list contains a category type (designating what kind of organization), followed by an embedded <ul> that contains, in turn, a separated bulleted list series, listing the name of each organization in the list.
The Digital Mitford's Site Index file is coded in the TEI namespace, which means that your XSLT stylesheet (much as in the last assignment) requires an instruction at the top to specify that when it tries to match elements, it needs to match them in the TEI namespace. (When you create an new XSLT document in <oXygen/> it won’t contain that instruction by default, so whenever you are working with TEI you need to add it (See the text in blue below). To ensure that the output would be in the XHTML namespace, we
added a default namespace declaration (in purple
below). To output the required DOCTYPE declaration, we also created
<xsl:output>
element as the first child of our root
<xsl:stylesheet>
element (in green below), and we needed to include an attribute there to omit the default XML declaration because if we output it that XML line in our XHTML output, it will not produce valid HTML with the w3C and might produce quirky problems with rendering in various web browsers. So, our modified stylesheet template and xsl:output line is this, and you should copy this into your stylesheet:
<?xml version="1.0" encoding="UTF-8"?> <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0" xmlns="http://www.w3.org/1999/xhtml" xpath-default-namespace="http://www.tei-c.org/ns/1.0"> <xsl:output method="xhtml" encoding="utf-8" doctype-system="about:legacy-compat" omit-xml-declaration="yes"/> </xsl:stylesheet>
Our XSLT transformation (after all this housekeeping) has three template rules:
<xsl:template
match="/">
), in which we create the basic HTML file structure: the
<html>
element, <head>
and its contents,
and <body>
—anything that appears just once in the HTML document (one to one relationship with the root node). Inside the <body>
element that
we’re creating, we use <xsl:apply-templates>
and select the <listOrg>
elements
(using an XPath expression as the value of the @select
attribute). And we create our wrapper <ol>
tags to set up the ordered list of organization types.<listOrg>
elements (holding the lists of organizations), so it will be invoked as a result of the preceding <xsl:apply-templates>
instruction, and
will fire once for each <listOrg>
element in our Site Index. Inside that template rule we create a new list item
(<li>
) for the particular <listOrg>
being processed and inside the tags for
that new list item we do two things. First, we apply templates to the
<head>
for the <listOrg>
, which will cause its category description to
be output when we run the transformation. Second, we create wrapper <ul>
tags for the nested
list that will contain the names of the organizations within that category. Inside that new
<ul>
element, we use an
<xsl:apply-templates>
rule to apply templates to (that is, to
process) the <org>
elements of that <listOrg>
.<org>
elements, which make up the items in the list of organizations,
and that just applies
templates to the <orgName>
element within each <org>
.
This rule will fire once for each <org>
element inside the <listOrg>
, and it will be called separately for the <org>
elements within each <listOrg>
, so that the orgs will be rendered properly in their respective lists.We don’t need a template rule for the <head>
elements themselves
because the built-in (default) template rule in XSLT for an element that
doesn’t have an explicit, specified rule is just to apply templates to its children. The
only child of the <head>
elements is a text node, and the built-in
rule for text nodes is to output them literally. In other words, if you apply
templates to <head>
and you don’t have a template rule that matches
that element, ultimately the transformation will just output the textual content of the
head, that is, the title that you want.
<xsl:for-each>
instruction that could be used to solve this problem. We are
prohibiting its use for now; your solution must use
<xsl:template>
and <xsl:apply-templates>
rules instead. There’s a Good Reason for this, which we’ll explain later, when we
talk about situations where you should use
<xsl:for-each>
.orgName
be listed first in the list of names, so we recommend that you tidy up your list by selecting just the very first available orgName, that is, the first element child named orgName
of org
elements you are processing. Alternatively, you may try applying an XPath string-join()
function to output the entries, but you will need to use xsl:value-of
instead of xsl:apply-templates
because we need to use xsl:value-of
to calculate the results of functions (which removes us from the XML tree). Either approach is fine with us, and you would use the same @select
attribute to indicate what you would like to output.