Digital Humanities in History, Linguistics and Literature
History:
1.
Hidden Histories by UCL Centre for Digital Humanities
The application of computing to the Humanities is not new. It can be traced back to at least 1949, when Fr Roberto Busa began researching the creation of an index variorum of some 11 million words of medieval Latin in the works of St Thomas Aquinas and related authors. Notes and contributions towards a history of the computer in the humanities have appeared in recent years; however, our understanding of such developments remains incomplete and largely unwritten. This project gathers and makes available sources that enable investigation of the social, intellectual and cultural conditions that shaped the early take-up of computing in the Humanities.
DASI – Digital Archive for the study of pre-Islamic Arabian inscriptions is an ERC project (2011-2016), led by Prof. Avanzini of the University of Pisa, that aims to inventory and digitize the entire corpus of pre-Islamic Arabian inscriptions, in order to enhance historical and cultural knowledge of Ancient Arabia and strengthen the linguistic study of the texts.
About 9,000 Ancient South Arabian, Ancient North Arabian (University of Oxford) and Aramaic inscriptions (UMR 8167, CNRS-Paris) have been digitized through a hybrid database/XML system, developed by the Scuola Normale Superiore di Pisa, consisting of three main components: a relational database, a data-entry application and a front end.
The database stores not only metadata but also texts encoded in XML according to the EpiDoc standard; the data-entry application includes an editing module developed specifically for encoding pre-Islamic Arabian inscriptions.
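To give a flavour of what EpiDoc encoding looks like in practice, here is a minimal sketch: a tiny TEI/EpiDoc-style fragment parsed with Python's standard library. The inscription text, tokens and attributes are invented for illustration and are much simpler than real DASI records.

```python
# Minimal sketch of an EpiDoc-style TEI fragment and how its word tokens
# can be extracted programmatically. The sample content is invented for
# illustration; real DASI records are far richer.
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

sample = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <body>
      <div type="edition" xml:lang="xsa">
        <ab>
          <w>mlk</w> <w>sb</w> <gap reason="lost"/>
        </ab>
      </div>
    </body>
  </text>
</TEI>"""

root = ET.fromstring(sample)
# Collect the individual word tokens marked up with <w>; it is this kind
# of token-level markup that makes lexical searches and word lists possible.
words = [w.text for w in root.iter(f"{{{TEI_NS}}}w")]
print(words)
```

Because every word is an addressable XML element, building alphabetical word lists or filtering by language becomes a matter of simple tree queries rather than string matching.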
The front end [http://dasi.humnet.unipi.it] gives free access to nearly 7,700 inscriptions. Users can browse the content through different indexes (Corpora, Collections, Epigraphs, Objects and Sites) and filters on metadata. Moreover, thanks to the XML markup of textual features, users can perform complex searches on the texts through the Textual search and the alphabetical Lists of words (both for lexicon and onomastics).
DASI data are also available for harvesting. An OAI-PMH repository allows service providers to harvest DASI records, which have been mapped to several standards and data models: DC, EDM and EpiDoc.
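The harvesting workflow described above can be sketched as follows. An OAI-PMH client issues a ListRecords request and parses the Dublin Core records in the response. The endpoint path is a hypothetical example, and a canned response stands in for what a live call would return, so the sketch runs without network access.

```python
# Sketch of harvesting Dublin Core records over OAI-PMH.
# The endpoint URL is hypothetical; the canned response below stands in
# for what a real ListRecords call would return.
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

BASE = "http://dasi.humnet.unipi.it/oai"  # hypothetical endpoint path
params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
request_url = f"{BASE}?{urlencode(params)}"

OAI = "http://www.openarchives.org/OAI/2.0/"
DC = "http://purl.org/dc/elements/1.1/"

# Simplified stand-in for a real OAI-PMH ListRecords response.
canned = f"""<OAI-PMH xmlns="{OAI}">
  <ListRecords>
    <record>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="{DC}">
          <dc:title>Sample inscription</dc:title>
          <dc:language>xsa</dc:language>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

root = ET.fromstring(canned)
titles = [t.text for t in root.iter(f"{{{DC}}}title")]
print(request_url)
print(titles)
```

A real harvester would fetch `request_url` over HTTP and loop over resumption tokens; the parsing step, however, looks essentially like this for any OAI-PMH provider exposing `oai_dc`.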
Digital lexica of the South Arabian languages are presently under construction.
MAAYA by Idiap Research Institute, Switzerland-On-going
Archaeologists and epigraphists have made formidable progress over the past 100 years in deciphering the writings of the ancient Maya culture, yet a proportion of the hieroglyphic corpus remains open to interpretation. This effort can be accelerated through the use of multimedia analysis and information-management methods for organization, analysis, and visualization tasks.
MAAYA (Multimedia Analysis and Access for Documentation and Decipherment of Maya Epigraphy) integrates the work of humanities scholars and computer scientists to design computational tools that support the work of Maya scholars, advance the state of Maya epigraphy through the combination of expert knowledge and the use of these tools, and make these resources available to the scholar community at large.
Textal by UCL Centre for Digital Humanities-On-going
Textal is a text analysis app for iPhone and iPad. It allows you to create wordclouds from your favourite text, website, tweet stream, or document. Textal then allows you to interact with the wordcloud, to drill down further and explore the statistics and the relationships between words in the text. It has been designed to be an easy introduction to text-analysis, whilst providing useful functionality missing from previous implementations of wordclouds. Textal transforms wordclouds into useful tools for analysis, research, and play.
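The statistics behind a wordcloud of this kind can be sketched in a few lines: token frequencies determine word sizes, and simple co-occurrence counts hint at relationships between words. This is only an illustration of the general idea, not Textal's actual implementation.

```python
# Minimal sketch of wordcloud statistics: token frequencies (word sizes)
# and neighbouring-word pairs (relationships between words).
# Illustrative only; not Textal's actual implementation.
import re
from collections import Counter

text = "the cat sat on the mat and the cat slept"
tokens = re.findall(r"[a-z']+", text.lower())

# Frequency table: a wordcloud typically scales each word by its count.
freq = Counter(tokens)

# Adjacent-word pairs give a crude measure of which words co-occur.
pairs = Counter(zip(tokens, tokens[1:]))

print(freq.most_common(2))   # → [('the', 3), ('cat', 2)]
print(pairs[("the", "cat")])  # → 2
```

Real tools add stopword filtering, stemming and wider co-occurrence windows, but the underlying counts are of this form.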
@note is a collaborative annotation system developed under the auspices of Google's 2010 Digital Humanities Awards program. It is developed by the ILSA (Research Group on Implementation of Language-Driven Software and Applications) and LEETHI (Research Group on "Spanish & European Literatures: from Text to Hypertext") groups at Complutense University (Spain). The @note tool retrieves digitized works from Google Books and, in particular, from the free-access UCM-Google collection. It lets the reader add annotations to enrich texts for research and learning purposes such as reading activities, e-learning tasks, proposals for critical editions, etc. @note promotes the collaborative creation of free-text and semantic annotation schemas on literary works by communities of researchers, teachers and students, and the use of these schemas in a very flexible and adaptive model for the definition of annotation activities.
ALCIDE is a web-based platform designed to assist humanities scholars in analysing large amounts of data such as historical sources and literary works. The system combines advanced text processing techniques, intuitive visualisations and close/distant reading of corpora. In ALCIDE, it is possible to browse through the content of document collections and analyse them interactively along different dimensions, including the lexical, the semantic, the geographical and the temporal level.
Burckhardtsource by the Scuola Normale Superiore, Italy-On-going
Burckhardtsource.org is a semantic digital library created within the ERC project "The European correspondence to Jacob Burckhardt", coordinated by Prof. Maurizio Ghelardi (Scuola Normale Superiore, Pisa). The time span covered, 1842–1897, witnesses a period of significant cultural transformation. The texts also offer content-related information: a semantic annotation tool enables users to create semantically structured data. The combined use of the semantic annotator Pundit and the customised vocabulary Korbo offers an effective tool for data retrieval, representing different kinds of data (such as metadata and annotations) in several ways, according to how users want to relate them.
CATMA (Computer Aided Textual Markup and Analysis) is a free, open source markup and analysis tool from the University of Hamburg's Department of Languages, Literature and Media. It incorporates three interactive modules: a tagger enabling textual markup and markup editing, an analyzer incorporating a query language and predefined functions, and a query builder that allows users to construct queries from combinations of pre-defined questions while allowing for manual modification for more specific questions. It also interfaces with the Voyant toolset.
Corpus Coranicum by Berlin-Brandenburg Academy of Sciences and Humanities-On-going
The project Corpus Coranicum of the Berlin-Brandenburg Academy of Sciences and Humanities (2007-2024) is exploring the Qur'an from three different angles: (1) Textual History: databases of ancient manuscripts and variant readings give insights into its textual history, (2) Qur'an in History: a documentation of texts from the cultural and religious environment outlines Late Antique features in the text, (3) History of the Text: a literary-chronological commentary studies the Qur'an as a developing speech proclaimed during more than 20 years to a changing audience. The digital publication is also providing other digital research tools like its font ("Coranica") and a text concordance.
CRETA by University of Stuttgart, Germany-On-going
The "Center for Reflected Text Analytics" (CRETA) focuses on the development of technical tools and a general workflow methodology for text analysis within Digital Humanities. Of particular importance is the transparency of tools and traceability of results, such that they can be employed in a critically-reflected way.
Evaluation Metrics for Visual Analytics in Linguistics by Universität Konstanz, Germany
Within linguistics, the use of large data sets via a combination of rule-based and stochastic methods is now a standard part of the analysis of language structure. However, novel visual computation techniques have only just begun to be explored. The aim of the project "Evaluation Metrics for Visual Analytics in Linguistics" is to evaluate whether visual analytics represents a methodology that can yield improved results for linguistic research, and to establish metrics for the evaluation of visual analytics approaches by conducting linguistically motivated case studies on historical data. The project is part of the SFB-TRR 161 "Quantitative Methods for Visual Computing".
This project is centered around the richly annotated linguistic database of the Hebrew Bible created by the Eep Talstra Centre for Bible and Computer. In the SHEBANQ project a web interface was developed that enables users to run and save queries and to add them as annotations to the text. The richly annotated database can now be accessed by anyone who wants to consult the Hebrew text with its linguistic annotations, to see queries that researchers have run and added as annotations, or to run queries themselves, save the results, or create a diagram of the statistics.
VisArgue by University of Konstanz, Germany-On-going
A team of linguists, information scientists and political scientists addresses the question of when and why political negotiations are successful. In particular, we want to develop automatic tools that analyse political discourse, allowing conclusions to be drawn about its effectiveness. The motivation behind this is that public mega-projects repeatedly create conflicts between governments and the public sphere, and their realization has become an incalculable risk for political decision makers. We therefore want to investigate the factors that make political communication successful. This complex task can only be tackled using an innovative combination of methods from different areas of research: 1) deep and detailed linguistic processing of real mediation processes to generate an abstract representation of communication; 2) shallow, statistical analysis of text to detect common patterns in negotiations; and 3) the development and employment of visualization tools which identify patterns of communication at a glance.
Visual Analysis of Language Change and Use Patterns by University of Konstanz, Germany-On-going
The project "Visual Analysis of Language Change and Use Patterns" is an interdisciplinary project at the University of Konstanz between the Linguistics Department (Prof. Miriam Butt) and the Computer Science Department (Prof. Daniel Keim). The aim of the project is to analyze patterns of language change and use via the combination of linguistic analysis and novel visualization techniques, in order to uncover information about language change, genetic relationships between languages, and variations in language use across time.
The Great Parchment Book by UCL Centre for Digital Humanities-On-going
The Great Parchment Book is an early 17th century survey of the County of Londonderry. It is a manuscript that has been completely inaccessible to scholars for over 200 years, since it was heavily damaged in a fire at Guildhall in 1786. It is hoped that the development of new digital methodologies will allow the opening up of the obscured text and enable the production of usable 3-D digital images and a transcription of the complete manuscript. These techniques have never been tried on manuscripts before, and so, if successful, would provide exciting possibilities for other damaged parchment manuscripts in the City of London's collections and beyond.
CLiGS (Computational Literary Genre Stylistics) is a junior research group in which researchers from Literary Studies and Computer Science work together. We study, use, adapt and develop quantitative methods of text analysis to investigate French, Spanish and Spanish-American literary texts. Our main focus is on genre stylistics, that is, on studying textual differences between various literary genres and the stylistic development of genres over the course of literary history. We are part of the Department for Literary Computing at the University of Würzburg, Germany. We publish news about our activities on our project website at http://cligs.hypotheses.org and make code and text collections available at http://github.com/cligs.
Ossian Online by National University of Ireland-On-going
Ossian Online is a project to publish the various editions of the sequence of eighteenth-century works known collectively as the Ossian poems. Initially presented by Scottish writer James Macpherson as fragments of original manuscripts he had found on journeys around the Highlands of Scotland, the Ossian poems grew into a body of work that inspired readers, courted controversy, and profoundly influenced the literature of the Romantic period. The first phase of Ossian Online will see the publication of TEI-encoded texts of the major editions published between 1760 and 1773. Subsequent phases will present a new critical edition and launch a tool for collaborative annotation of the texts.
The Women Writers Project by Northeastern University, USA-On-going
The Women Writers Project is a long-term research project devoted to early modern women's writing and electronic text encoding. Our goal is to bring texts by pre-Victorian women writers out of the archive and make them accessible to a wide audience of teachers, students, scholars, and the general reader. We support research on women's writing, text encoding, and the role of electronic texts in teaching and scholarship.
Database of Early English Playbooks by University of Pennsylvania, USA-Completed
DEEP is an easy-to-use and highly customizable search engine covering every playbook produced in England, Scotland, and Ireland from the beginning of printing through 1660. It provides a wealth of information about the original playbooks, their title-pages, paratextual matter, advertising features, bibliographic details, and theatrical backgrounds.