Autumn 2019: Classes meet M W F 3 - 3:50 PM, 136 McKenna Hall.
Syllabus: Autumn 2019
- Hum 1030: (Digital Humanities) course registration number: 19247. This course fulfills an HM or Q2 General Education Requirement at Pitt-Greensburg.
- Socsci 1031: (Digital Studies) course registration number: 26509. This course fulfills an SS or Q2 General Education Requirement at Pitt-Greensburg.
- Either course number satisfies a core course requirement for Pitt-Greensburg’s Digital Studies Certificate.
Instructor Team
- Taught by: Prof. Elisa Beshero-Bondar; e-mail: ebb8 at pitt.edu; office hours either in our classroom or in FOB 204 on
Wednesday from 4 - 5:30pm, Thursday from 1:15-3pm, and by appointment.
- Professor Beshero-Bondar (Dr. B) directs Pitt-Greensburg’s Center for the Digital Text, and you can find most of her projects on http://newtfire.org. She and former student instructor Rebecca J. Parker co-authored “A GitHub Garage for a Digital Humanities Course,” published in New Directions for Computing Education (Springer Books, 2017), pp. 259-276, about our work with GitHub in this class. Her first XML project was the Digital Archives and Pacific Cultures project, on which she collaborated with a Pitt undergrad project team in a class like the one you are taking now.
- Assisted by returning students:
- Alyssa Argento; e-mail ama277 at pitt.edu; office hours: M W F: 4 - 5pm in FOB 131. Alyssa was part of the William Combe project in 2018, and led the Banksy project in 2019.
- Kiara DeVore; e-mail ksd32 at pitt.edu; office hours: T H: 12 - 2pm in FOB 131. Kiara led the Washington Papers project in 2018 and was part of the Ulysses project in 2019.
Course Projects
Explanatory Guides and Exercises: Complete List
Class Web Resources:
- Course Home Website: CDA.html Home of our syllabus and schedule.
- DHClass-Hub: https://github.com/ebeshero/DHClass-Hub Class GitHub Repository and Issues Board
- CourseWeb: https://courseweb.pitt.edu To submit homework assignments and exams and read private course announcements
- File Conventions for CourseWeb Assignments
- Server Access Instructions for Web Project Development on newtFire: [To Be Announced]
All the Tools You Need As We Begin:
Download and install the following software on your own personal computer(s) on or before the first day of class. These software tools are available in our campus computing labs, too.
- All students: <oXygen/>. The University of Pittsburgh has purchased a site license for this software, which is installed in the Pitt computer labs on multiple campuses, and it’s in use in courses here at Greensburg and at Oakland. The license also permits students enrolled in the course to install the software on their home computers (for course-related use only). When installing this on your own computers, you will need the license key, which we have posted on our course Announcements section of Courseweb.
- All students require a good means of secure file transfer (SFTP) for homework
assignments and projects (also available in the campus computer labs). There are
several good options available. We recommend you download and install on your own
computers one (or more) of the following, depending on your platform: (Feel free
to experiment with these and others!)
- Windows users: one of the following FTP clients (the functionality is similar):
- FileZilla (This is our favorite client because it behaves the same way across platforms.)
- WinSCP (This is one we used for a long time, since the 1990s, but we now use SSH and Filezilla more frequently.)
- SSH Secure Shell Client
- Mac users:
- FileZilla (This is our favorite client because it behaves the same way across platforms.)
- or Fetch (students may obtain free licenses at http://fetchsoftworks.com/fetch/free)
- Linux users: You probably don’t need to install anything, but look at how your system handles secure file transfer (SFTP). (FileZilla and other clients designed for Linux environments are also options.)
- Read the Course Description and this Syllabus page to see how this course works on a day-to-day basis.
- This course fulfills general education requirements in Q2, SS, and HM, and it fulfills a core requirement for the Digital Studies Certificate at Pitt-Greensburg. Think about where this course might fit in your academic career, and how you might apply the skills you learn here.
- No coding experience? Don’t worry! You are in very good company. We don’t expect any of you to have written a line of computer code before now. Past students in this course who never saw anything like markup or XML code have designed projects (like these) and even spoken about them at an undergraduate conference! You’ll help continue some of these projects we’ve started, and you’ll learn to build and create digital tools for yourself with skills we hope you will keep developing.
Coding and Digital Archives: Course Description
This course is all about working with computers and digital technology to build cultural resources on the public web. In taking this course you will gain experience with textual scholarship, editing, and digital production, and you will learn a variety of coding designed for systematic building and sharing of information resources.
This course is meant to be complementary with the Coding and Data Visualization course taught in spring semesters, but where the emphasis in that course is on analyzing data to produce informational graphics, this course concentrates on curating and preparing reading views of documents. Neither course is meant to be a prerequisite for the other: you may take either one as a beginner. Returning students (in either semester) serve as student instructor-mentors to beginning students for units and assignments they have already completed confidently.
Our class is one of the core courses of Pitt-Greensburg’s Digital Studies Certificate, and it satisfies a range of general education requirements in quantitative reasoning, behavioral sciences, and humanities. That is because this course is distinctively interdisciplinary in engaging formal and quantitative reasoning through computer coding in ways that matter to students in humanities and social sciences who are not training to be computer scientists. Students gain hands-on experience in this course with applying computer coding to represent and investigate cultural materials. As we design projects together, you will gain practical experience in editing and you will certainly fine-tune your precision in writing and thinking. You will also be learning in an openly collaborative environment (as professional coders learn and work) with an emphasis on building sustainable and freely accessible resources on the public web.
Students who complete this course will gain skills in practical hands-on programming, digital project management, and web development. Their digital projects will distinguish them as investigators and makers, able to wield computers creatively and effectively for human interests. Your success will require patience, dedication, and regular communication and interaction with us, working through assignments on a daily basis. Your success will not require perfection, but rather your regular efforts throughout the course and your documenting of problems when your coding doesn’t yield the results you want. Homework exercises are a back-and-forth, intensive dialogue between you and your instructors, and we plan to spend a great deal of time with you individually over these as we work together. Our guiding principle in developing assignments and working with you is that the best way for you to learn and succeed is through regular practice as you hone your skills. Our goal is not to make you expert programmers (as we are far from that ourselves). Instead, we want you to learn how to apply coding technologies for your own purposes, how to track down answers to questions, how to think your way algorithmically (step-by-step) through problems to find good solutions.
Learning to Code: Our Context
You do not need any background or experience at all with computer programming or web development to succeed in this course. We teach practical programming as a foundational skill (like reading, writing, and arithmetic) that all students should experience regardless of major or background.
You will be learning to code with practical goals in mind: to build a digital project
to investigate a research topic in the humanities, working with eXtensible Markup
Language (XML) and languages connected with it.
XML is a powerful tool for modelling texts that we can adapt creatively to our interests and
questions. You will learn how to work with regular expressions to match patterns in
plain text and up-convert
them to an XML document. You’ll learn how to write XPath expressions: a formal language
for searching and extracting information from XML code which serves as the basis for
transforming XML into many publishable forms. You’ll learn to
write XSLT: a programming “stylesheet” transforming language designed to convert XML
to publishable formats, as well as to
extract information and plot it in charts and graphs in Scalable Vector Graphics (SVG).
You will learn how to
design your own systematic coding methods to work on projects, and how to write your
own rules in schema languages (like Schematron and Relax-NG) to keep your projects
organized and prevent errors. You’ll gain experience with an international XML
language called TEI (after the Text Encoding Initiative) which serves as an
international standard for coding digital archives of cultural materials. Since one
of the best and most widely accessible ways to publish XML is on the worldwide web,
you’ll gain working experience with HTML code (a markup language that is a kind of
XML), styling HTML with Cascading Stylesheets (CSS), and adding dynamic features to
your website with JavaScript. You will also, all along, be working with Git and GitHub
for collaborating with the class and your project team, and gaining command line (or
shell) experience.
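As a rough illustration of the first two steps described above (matching patterns in plain text to up-convert it to XML, then querying the result with a path expression), here is a minimal sketch. It uses Python only for demonstration; the course itself works in <oXygen/> with XPath and XSLT. The sample text, the element names (journal, entry), and the attribute name (when) are all invented for this example, and Python’s built-in ElementTree library supports only a small subset of XPath.

```python
import re
import xml.etree.ElementTree as ET

# Invented plain-text input: one dated entry per line (not from a real source).
plain_text = """1850-03-01: Arrived at the harbor.
1850-03-04: Letters sent to the committee."""

# Two capture groups: an ISO-style date, then the rest of the line.
entry_pattern = re.compile(r"^(\d{4}-\d{2}-\d{2}): (.+)$", re.MULTILINE)


def up_convert(text):
    """Wrap every matched line in an <entry when="..."> inside a <journal> root."""
    root = ET.Element("journal")  # hypothetical element names
    for date, body in entry_pattern.findall(text):
        entry = ET.SubElement(root, "entry", when=date)
        entry.text = body
    return root


journal = up_convert(plain_text)

# Serialize the tree so the markup can be inspected or saved.
xml_string = ET.tostring(journal, encoding="unicode")

# Path expressions (ElementTree's limited XPath subset) pull information back out.
dates = [e.get("when") for e in journal.findall("./entry[@when]")]
first_entry = journal.find("./entry[@when='1850-03-01']").text

print(xml_string)
print(dates)
print(first_entry)
```

Building the elements through the library, rather than pasting angle brackets around the matched text, means that characters such as & or < in the source are escaped automatically when the tree is serialized.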
While we are using XML in an academic research context,
what you learn here is also important in the tech industry, where XML is the internal
format for many general applications (used for bank and hospital records, and the
basis of the entire Microsoft Office and LibreOffice software packages).
XML is important, too, for web developers, where the HTML (hypertext markup language)
of web pages to be viewed in browsers is often expressed as a form of XML (as we will
be applying it). Developers of relational databases widely appreciate XML, too, as
a universal data interchange format.
Learning the XML family of languages
in a humanities context may give you a strong foundation in practical computing experience
whatever your major, and we hope what you learn here will help you build a distinctive
portfolio of skills and projects.
Learning Objectives:
- Work with Texts as Artifacts—As Physical and Virtual Objects:
- Generate "digital surrogates": digitally represent facsimiles of rare manuscripts and other kinds of documents, and make their content digitally searchable.
- Reflect and write on the issues and problems with digital representation, as well as the capacity of the digital medium to enhance or add dimensions to a physical text.
- Learn and practice coding in eXtensible Markup Language (XML) and related coding technologies: to “mark up,” process, and extract information about the structure, physical condition, and cultural contexts of textual artifacts.
- Gain Experience with Information Retrieval, "Distant Reading," and Autotagging
Techniques:
- Write code to apply searching and data extraction methods through multiple kinds of pattern-matching algorithms, including forms of regular expression matching. Take conventional boolean searches and library database searches to new levels.
- Apply "mining" and "drilling" methods to interact with texts and visualizations differently than we could do "manually" or with unassisted eyes and brains.
- Learn how to "autotag" enormous texts or collections of texts, for practical results: to code the structure of enormous texts from a distance, in order to navigate them and make them accessible through distant reading.
- Reflect on the strengths and limitations of data processing and visualization.
- Gain Project Design and Editing Experience:
- Gain digital editing experience with proposing, designing, and contributing to a digital research project
- Gain experience with collaborating and sharing code using a version control system (GitHub) in a team repository
- Transform XML code into publishable web formats, to build or contribute to a project website.
- Design navigation elements, and build visual aids and models (such as timelines and tree diagrams) from texts: to generate charts and images from extracted data
- Gain experience with plotting digital graphs and charts
- Last but not least: Discover that you read and write with “new eyes,” with greater precision and agility, thanks to your adventures with digital projects!
Optional Textbook:
- Michael Kay, XSLT 2.0 and XPath 2.0: Programmer’s Reference, 4th edition (Wiley Publishing, 2008) ISBN-13: 978-0-470-19274-0. This is really the authoritative word on XSLT and XPath, written by a designer of the official W3C specifications of XSLT 2.0 that we’re using. We are learning from this book ourselves and consult it frequently! We’re not requiring that you buy it, but we recommend it as a powerful reference to have at your fingertips and for learning more on your own. There’s a Kindle edition available, but it is poorly designed for searching, so we (actually) prefer the hardcover print edition. If you’re going to purchase it, be sure you pick up the current edition (not the earlier ones).
Grading:
Homework Exercises (30%):
To keep up with this class, you must work on exercises regularly. Each day will involve some small assignment, to prepare you for the next class meeting and to help you build your course project. Students must complete on time at least 90% of all assigned homework exercises in order to pass the course. For students who complete the 90% requirement, the homework, taken all together, is worth 30% of your course grade.
Coding assignments: Coding exercises in this course are about your active learning, and not (as in other courses) a way of testing whether you have already learned something we covered in class or in an assigned reading. You may often need to look up how to do something that you don’t already know how to do. Often, there will be multiple ways of accomplishing the task and we are not simply looking for you to do things perfectly in just one way. We are instead looking for a record of your learning process as you take on a challenge. Documenting problems is key to learning, and sometimes just writing out what you are trying to do helps lead you to a solution! When we post solutions for homework assignments, part of your homework may be to write a comment reviewing what you missed and assessing what you needed to do for the correct answer. There may be times when you don’t get the result you want in the homework, and that is to be expected! In those cases you can still get full credit for the assignment if you’ve made a serious attempt and if you submit, along with your code, a description of what else you tried, what results you expected, what results you got, and what you think went wrong. Getting stuck is part of the learning process and the instructors will be happy to help unstick you as long as you’ve described your understanding of the problem and your attempts to resolve it on your own.
The instructors will read and evaluate all student homework, and will post an assessment
in the Courseweb Grade Center. Coding assignments are assessed as check plus, check, check minus, or redo.
Don’t think of these as grades, since they all receive full credit; they are feedback,
for learning purposes, about how well you engaged with the assignment. If you have
not engaged with the assignment adequately (whether that means solving the tasks or
discussing the coding obstacles you encountered and how you dealt with them), we will
ask you to meet with us to review the issues and then complete a followup (redo) task
in order to receive credit. For assignments with posted solutions, we invite you to
review the posted solution on GitHub and comment on it (we will show you how to do
this) to address something you learned from the solution or did in a different way.
For some assignments where we review posted solutions and line-comments together in
person or in class, we will write back to you with individual comments only if your
specific submission raises an issue that we don’t address elsewhere. If we don’t return
your assignment, that means that we have nothing to add to our posted solution, but
should you have any specific questions after you’ve read our posted solution, please
ask the instructors.
Issue posts: Throughout the course, we’ll assign discussion posts on our class GitHub site in which you will respond to online readings or evaluate web resources. Your post
should do more than summarize the article or site (which you could do just by skimming or reading
the first paragraph); it should demonstrate thoughtful reflection on
specific ideas and issues. When evaluating a web resource, don’t simply praise
or condemn it without going into details about why a key component is effective
or poorly designed. Good posts demonstrate care and reflection, and
you may choose to respond to the overarching ideas of a piece, or to selected
details of specific interest. These posts are scored as check plus, check, or check minus.
Participation: In Class and on the DHClass-Hub (15%):
Coding and programming in real life is a social activity, and professionals in
the real world aren’t “know-it-all” experts who work alone, but rather are tuned
into discussion boards and regularly ask and answer questions to stay sharp and to
learn from their community. In this class, we want you to work together and talk
to each other and your instructors as your community resource, so we have built
this into our course participation grade as a formal expectation. Beginning by week two, we’ll expect each student to post at least once per
week on our course GitHub repo, and we strongly encourage you
to do more than this minimum. Earn an A
in participation by asking questions, making suggestions, and sharing helpful
resources you’ve found. Help each other out by trying to answer questions on GitHub
(and
read the instructor posts too as we wade in to help). Your instructors will likely
be dominating the class time as we model concepts and methods, so the GitHub Issues
board gives the students a good space to form into a coding community to help each
other and reflect together. Also, if you have a question about an assignment, always think of our GitHub Issues board as your first resource to
check for helpful hints and to post your questions, because others may have the
same question and answers are best shared! Of course you may e-mail us, but we
really prefer you go to the discussion board first, and doing so is, after all, worth
course credit as your participation grade.
Tests (15%):
As scheduled throughout the course there will be several (probably about five or six) tests on the various kinds of coding we are learning in the course, and we will drop the lowest grade. All but the first test will be take-home and assigned over a weekend. They are open-book, open notes, but they must be completed individually and are designed to demonstrate that you have learned from the class material, coding assignments, and posted solutions. Tests may resemble homework assignments, but unlike homework exercises, these are given letter grades. These are given grades because they are evaluative and involve demonstrating what you have learned after we have finished a coding unit.
Project (40%):
Throughout the semester you will be working as part of a team on a course project. Early in the semester each student posts a proposal for a semester project to work with a text (or collection of texts) in the public domain and a set of research questions to explore in a coding project. Teams will form around a selection of these projects in mid-September and begin work: performing document analysis, developing and implementing a system of markup and project rules, marking up text following that system, and writing programs to conduct research and create a resource to share on a public website that you will develop together to represent your investigation and your conclusions. Each project team must meet regularly together and with a project mentor (one of the instructor team) outside of class for project planning and discussion. Together, the project components described below add up to 40% of your grade for the course.
Project Checkpoints: There will be a series of project checkpoints to complete by set due dates throughout
the semester. Each is worth 5% of the final course grade (a total of 20%) and receives a letter
grade on the following scale: exceeds target (A+), meets target (A), some progress (B), negligible progress (C), no progress (F).
Each checkpoint will expect you to complete a stage of serious work on the course
project with your project team. Project Checkpoints are met using the Issues and/or
Projects tabs on your project GitHub repository and by posting files on the project
website on newtFire.
The course project develops throughout the semester, but is fully assembled in the final weeks of the course and submitted in two places, both due in Finals Week: as code and documentation shared in your GitHub repository, and on your project website. Projects are evaluated as a team effort, but if unequal effort is observed, project members may receive different project grades accordingly. The Final Project grade is worth 20% of the course grade.
Grading Scale:
Grades for the course are calculated and posted on Courseweb, and follow this standard scale: A: 93-100%, A-: 90-92%, B+: 87-89%, B: 83-86%, B-: 80-82%, C+: 77-79%, C: 73-76%, C-: 70-72%, D+: 67-69%, D: 60-66%, F: 59% and below. In taking the course on an S / NC (pass-fail) basis, students must earn a C to receive Satisfactory credit. We give G grades (incomplete) at our discretion and only in conformity with the University Registrar policy: http://www.registrar.pitt.edu/grades.html.
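To see how the component weights and the grading scale fit together, here is a small worked sketch of the arithmetic in Python. It is an illustration only, not the instructors’ official calculation: the component names and the sample scores are made up, fractional percentages are compared against each band’s lower bound without rounding (an assumption, since the scale above lists whole numbers), and the separate pass requirements (90% homework completion and 90% attendance) are not modeled.

```python
# Component weights in percent, taken from the Grading section above.
# The dictionary keys are hypothetical labels chosen for this sketch.
WEIGHTS = {
    "homework": 30,
    "participation": 15,
    "tests": 15,
    "project_checkpoints": 20,
    "final_project": 20,
}

# Lower bound of each letter band from the Grading Scale; below 60 is an F.
SCALE = [
    (93, "A"), (90, "A-"), (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"), (67, "D+"), (60, "D"),
]


def course_percentage(scores):
    """Weighted average of component scores, each given on a 0-100 scale."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def letter_grade(percentage):
    """Return the first band whose lower bound the percentage reaches.

    Assumption: no rounding is applied before the comparison.
    """
    for lower_bound, letter in SCALE:
        if percentage >= lower_bound:
            return letter
    return "F"


# Made-up scores for one hypothetical student.
example_scores = {
    "homework": 100,
    "participation": 90,
    "tests": 80,
    "project_checkpoints": 95,
    "final_project": 85,
}

total = course_percentage(example_scores)
print(total, letter_grade(total))
```

With these sample scores the weighted sum is 30 + 13.5 + 12 + 19 + 17 = 91.5, which falls in the A- band.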
Course Policies:
Each day we are covering material that builds on earlier material and assignments, so your success depends upon regular attendance and completing each assignment on time.
Attendance:
We strictly require your attendance: repeated absences are a setback not only to you but to the entire class and the success of our projects. Students must attend at least 90% of our class meetings in order to pass the course (that is, a maximum of four absences for any reason). Late arrival, particularly a pattern of repeated late arrivals, may be counted as an absence at our discretion. To be counted as attending a class period, you must be present in the classroom from the beginning of class until class is dismissed. For your own sanity, do not miss two consecutive classes. If you are experiencing a genuine emergency or crisis, alert us and provide documentation. (We do not want you to attend class if you have a fever or flu-like symptoms!) If you are a Greensburg student and wish to be excused from two or more consecutive class meetings due to your own documented illness, please contact one of the following: the Office of the Vice President for Academic Affairs, Prof. Jackie Horrall (724-836-7482 or jhorrall at pitt.edu), our Campus Health Center (Pamela Reed, 724-836-9947 or pmr20 at pitt.edu), or the Director of Counseling, Gayle Pamerleau (gaylep at pitt.edu), who can then officially alert all of your professors.
Deadlines:
Your daily homework for this course is time-sensitive! Coding assignments, response papers, and other homework exercises must be uploaded to CourseWeb (or to Box, or the Sandbox server as specified), by the date and time indicated by the instructors. Homework assignments will be posted online to our class website and linked from our schedule, so students who miss class are nevertheless expected to consult the schedule and submit assignments on time. Because we post and share answers to homework exercises after submission deadlines, we will not accept late homework submissions. In order to pass the course, students must submit at least 90% of the regular homework assignments, and complete at least 90% of the work in each component of the course.
Exam Policy:
Similarly, because we will be posting answers or sharing them in class, we do not give make-up examinations or allow people to write exams after the solutions are posted. However, we will drop your lowest exam score for the class, so that you may miss one exam without penalty.
Classroom Courtesy:
Our class is fast paced, and requires that we all be making the best use we can of our in-person class sessions. Arriving late and leaving early disrupts the important collective mental activity of class. So does in-class texting and checking your cell phone. While class is in progress, talking disruptively, leaving the classroom, texting or using a cell phone or computer, reading a newspaper, or other distracting behavior will be actively discouraged, and may result in a deduction in your Participation grade. Please respect what we do in the classroom: attend class regularly, and come prepared to contribute your questions and ideas.
E-mail:
Each student is issued a University email address (username@pitt.edu) upon admission. This email address may be used by the University for official communication with students. Students are expected to read email sent to this account on a regular basis. Failure to read and react to University communications in a timely manner does not absolve the student from knowing and complying with the content of the communications. The University provides an email forwarding service that allows students to read their email via other service providers (e.g., Hotmail, AOL, Yahoo). Students who choose to forward their email from their pitt.edu address to another address do so at their own risk. If email is lost as a result of forwarding, it does not absolve the student from responding to official communications sent to their University email address. To forward email sent to your University account, go to http://accounts.pitt.edu, log into your account, click on Edit Forwarding Addresses, and follow the instructions on the page. Be sure to log out of your account when you have finished. (For the full Email Communication Policy, go to http://www.bc.pitt.edu/policies/policy/09/09-10-01.html.)
Academic Integrity
Source Citation and Plagiarism: One goal of our course is to reflect on how best to cite sources in digital contexts. We will consider how and why such citations differ from documenting printed texts. We will also consider the ease and frequency with which digital texts and graphics are plagiarized on the worldwide web, and discuss how the omission of source citations detracts from the authority of a digital information resource. We expect you to practice mindful source citation, and plagiarism on your part will have very serious consequences.
Representing the voice of another individual as your own voice constitutes plagiarism, however generous that person may be in “helping” you with an assignment. Turning in an assignment generated collectively under the name of a single individual is considered plagiarism. When instructed to collaborate on a project, project collaborators share collective authorship and should identify themselves directly as a team. To avoid plagiarism, cite your sources whenever you quote, paraphrase, or summarize material, or use digital images from any outside source (including websites, articles, books, course readings, Courseweb or GitHub postings, or someone else’s notes). When using the “copy” and “paste” features as you read and research, be sure to mark these passages carefully as unprocessed from their source, so that you know to process them later. Forgetting to do so not only produces sloppy work but (whether you intended it or not) results in a false representation. As long as you make a good faith and clear effort to cite your sources, you will not be faulted for plagiarism, but your work will be penalized if citations are inaccurate, unclear, or lack important information.
That said, the coding and digital development we do encourages collaboration, and for that reason we adopt our colleague David Birnbaum's Collaboration policy, which specifies that students identify collaborators in a comment on submitted assignments and take care on projects that all students contribute equally (and no student is contributing excessively more than what everyone else has done). When joining a group homework session, always work on the assignment by yourself first so you can be an equal participant, and write up the assignment by yourself after the session is over, taking care not to copy from the other students. While we want you to consult with each other, you are responsible for doing all your writing and coding by yourself, using your own words.
Plagiarism falsely represents another source’s words or ideas as your own, and, if you commit plagiarism in this course, you will receive a final course grade of F and, at Greensburg campus, be reported to the Vice President for Academic Affairs. At Greensburg campus, cheating on exams or exercises will also receive a final course grade of F and will be reported to the Vice President for Academic Affairs.
All Pitt students and instructors (from Pittsburgh and Greensburg) are required to observe the Dietrich School Academic Integrity guidelines, and violations of the Academic Integrity Code will be addressed according to those guidelines.
Disability Services:
If you have a disability for which you are or may be requesting an accommodation, please contact both your instructor and the Director of the Learning Resources Center, Dr. Lou Ann Sears, Room 240 Millstein Library Building (724) 836-7098 (voice) or los3 at pitt.edu (e-mail). The Learning Resources Center will verify your disability and help to determine reasonable accommodations for this course.
Resources
We gratefully acknowledge David Birnbaum’s Digital Humanities course as our starting point and resource for much of our development. Other useful resources include:
- eXam Center: a learning resource for quizzing yourself on coding that we’re learning in class. (We need to arrange for you to have individual accounts here to sign in and take the quizzes.)
- The Programming Historian (full collection of tutorials)
Projects that inspire us:
- Obdurodon: where we learned what we can teach, and where we’re still learning.
- Venice Time Machine: very ambitious, enormous project team of faculty and students to study and model a thousand years of Venice, digitizing "kilometers of archives."
- Map of Early Modern London
- Lord Byron and His Times: The very thoughtful stylistic design of this important project reproduces the style of nineteenth-century print and layout. The content makes many rare materials about Lord Byron’s social network searchable and connected to the web of linked open data.
- The Shelley-Godwin Archive: digitizes the manuscripts of Percy and Mary Shelley, and Mary Shelley’s parents, William Godwin and Mary Wollstonecraft—manuscripts often written in multiple hands. Provides an important study of the Frankenstein notebooks to demonstrate how much of a role Percy Shelley played in the writing of Frankenstein. The archive provides a good model of the use of TEI for manuscript encoding and of complex and multiple visualizations of manuscript texts.
- TokenX: a text visualization, analysis, and play tool
- A Tour Through the Visualization Zoo
- Clay Shirky on Love, Internet Style (9 minutes of Youtube inspiration: on what lasts, and why community matters in our digital worlds.)