Digital Humanities, Corpus and Language Technology / Humanidades Digitales, Corpus y Tecnología del Lenguaje : A look from diverse case studies / Una mirada desde diversos casos de estudio


Andrés Grajales Ramírez (ed)
Universidad de Antioquia
Jorge Molina Mejía (ed)
Universidad de Antioquia
Pablo Valdivia Martin (ed)
University of Groningen


Digital Humanities, Corpus and Language Technology: A look from diverse case studies is an outstanding collection of research contributions that explores the intersection of technology and the humanities. The authors provide a comprehensive overview of how these technologies can enhance research across various disciplines, from literature to history to anthropology. This book is a must-read for anyone interested in future research in the humanities. Digital Humanities, Corpus, and Language Technologies are rapidly growing fields that have the potential to revolutionize research across various disciplines.
New technologies have opened up new perspectives for research, allowing scientists to analyze data in previously impossible ways. The interdisciplinary approach and practical applications make this book an invaluable resource for researchers, students, and anyone interested in the intersection of technology and the humanities.

This publication is peer reviewed.

Typesetting: LINE UP boek en media bv: Mirjam Kroondijk
Cover design: Bas Ekkers

Published by University of Groningen Press
Broerstraat 4 9712 CP Groningen
In co-edition with Facultad de Comunicaciones y Filología, Universidad de Antioquia (Colombia)

Production support: LINE UP boek en media bv

The softcover version of this book (ISBN: 9789403430232) can be ordered via print on demand directly on the site of our partner Uitgeverij kleine Uil, and through all regular (internet) bookshops.

International shipping is possible via Amazon.

Unfortunately, the DOI links to individual chapters of this book do not function at this time. A newer version of our software will offer a landing page per chapter, but it is still in the testing phase. You can of course download the PDF version of each individual chapter, and the DOI link to the book’s landing page functions correctly. It will take some time to implement the new software, so please check back at a later date. Our sincere apologies for any inconvenience.

August 2023


  • Preface
    Pablo Valdivia Martin
  • Introduction/Introducción
    Andrés Grajales Ramírez, Jorge Molina Mejía, Pablo Valdivia Martin
  • Part I Digital Humanities
  • Understanding Outsider Art in the context of Digital Humanities
    Entender el Arte Outsider en el contexto de las Humanidades Digitales
    John Roberto, Brian Davis
  • La Biblioteca Virtual de la Filología Española (BVFE) y su acervo hispanoamericano
    The Biblioteca Virtual de la Filología Española (BVFE) and its Hispanic American heritage
    Jaime Peña Arce, María Ángeles García Aranda
  • De dos bases de datos relacionales a una base de datos XML. El proyecto COMREGLA
    From two relational databases to an XML one. Project COMREGLA
    Iván López Martín, Cristina Tur, Eveling Garzón Fontalvo, Alberto Pardal Padín, Berta González Saavedra, Guillermo Salas Jiménez, José Ignacio Hidalgo González
  • Análisis del epistolario del coronel Anselmo Pineda con Python: una mirada al proyecto coleccionista y al territorio desde las redes sociales y el aprendizaje automático
    Analysis of Colonel Anselmo Pineda’s epistolary with Python: a glance to the collecting project from the study of the territory and social networks
    Santiago Alejandro Ortiz Hernández
  • Part II Corpus construction
  • Desarrollo de un corpus de atlas lingüísticos
    Development of a corpus of linguistic atlases
    Carolina Julià Luna
  • The C-ORAL-BRASIL proposal for the treatment of multimodal corpora data: the BGEST corpus pilot project
    La propuesta del C-ORAL-BRASIL para el tratamiento de datos multimodales en corpus: el proyecto piloto del corpus BGEST
    Camila Barros, Heliana Mello
  • Las tecnologías del lenguaje y las lenguas indígenas mexicanas: constitución de un corpus paralelo amuzgo-español
    Human language technology and the indigenous languages in Mexico: the Amuzgo-Spanish parallel corpus
    Antonio Reyes Pérez, H Antonio García Zúñiga
  • Methodological bases: the construction of a corpus for the detection of deception and credibility assessment
    Bases metodológicas: la construcción de un corpus para la detección de mentiras y la evaluación de la credibilidad
    Pedro Eduardo Hernández Fuentes
  • Türkisch für Anfänger: propuesta de un corpus del alemán coloquial actual, ejemplificado a partir de las fórmulas rutinarias de saludo
    Türkisch für Anfänger: proposal of a corpus of modern colloquial German, exemplified from routine phrases for greetings
    Karen Lorena Baquero Castro
  • CLEC - Colombian Learner English Corpus: first learner corpus of written production in English online in Colombia
    CLEC - Corpus Colombiano de Aprendices de Inglés: primer corpus de producción escrita de aprendices de inglés en Colombia disponible en línea
    María Victoria Pardo Rodríguez, Antonio Jesús Tamayo Herrera
  • Part III Corpus analysis and Natural Language Processing
  • Pronunciation of consonant clusters in Spanish speakers based on the Czech read speech corpora
    La pronunciación de los grupos de consonantes en hispanohablantes basándose en el corpus oral leído checo
    Kateřina Pugachova, Jitka Veroňková
  • Relacionando los análisis cualitativo y cuantitativo. Una propuesta de modelo estadístico predictivo para completar la descripción compleja de los verbos cognitivos
    Relating qualitative and quantitative analysis. A predictive statistical model proposal to complete the complex description of cognitive verbs
    M. Amparo Soler Bonafont
  • Use of Bayesian networks for the analysis of corpus of local problems related to the Sustainable Development Goals
    Uso de redes Bayesianas para el análisis de corpus de problemas locales relacionados con los Objetivos de Desarrollo Sostenible
    Ernesto Llerena García, Manuel Caro Piñeres
  • Correlación entre la metáfora orientacional bueno es arriba / malo es abajo y polaridad positiva/negativa en verbos del español: un estudio con estadística de corpus
    Correlation between the orientational metaphor good is up / bad is down and positive/negative polarity in Spanish verbs: a study with corpus statistics
    Benjamín López Hidalgo, Irene Renau, Rogelio Nazar
  • UnderRL Tagger: a free software for Under-Resourced Languages POS tagging
    UnderRL Tagger: un software libre para etiquetar POS en Under-Resourced Languages
    Jorge Molina Mejía, José Luis Pemberty Tamayo



Author Biographies

Andrés Grajales Ramírez, Universidad de Antioquia

Andrés Felipe Grajales Ramírez is a Hispanic philologist from the University of Antioquia (Colombia) and holds a Master's degree in Cinematography from the University of Córdoba (Spain).
Passionate about language, linguistics and art, he has participated for more than five years in the research incubator group Corpus ex machina, specialized in natural language processing and corpus linguistics. Thus, his research and professional fields are Computational and Corpus Linguistics, Proofreading and the teaching of Spanish as a Foreign Language. He is also part of the multidisciplinary international academic research network Data Science, Culture & Social Change.

Jorge Molina Mejía, Universidad de Antioquia

Jorge Mauricio Molina Mejía is an associate professor of linguistics at the University of Antioquia, where he teaches computational linguistics and Spanish as a foreign language and coordinates the research group Corpus Ex Machina. He is also a member of the Committee of the Doctorate in Linguistics of the Faculty of Communications and Philology (University of Antioquia). His research fields are Computational Linguistics, Natural Language Processing and the teaching of Spanish as a Foreign Language. He has written articles, book chapters and books in these fields, notably the book "Lingüística computacional y de corpus: teorías, métodos y aplicaciones" (Editorial Universidad de Antioquia).

Pablo Valdivia Martin, University of Groningen

Pablo Valdivia Martin is Full Professor and Chair of European Culture and Literature (University of Groningen), Accredited Full Professor [Catedrático Universidad] of Arts and Humanities (ANECA, Spain), Associate in Applied Physics at Harvard Paulson School of Engineering and Applied Sciences (Harvard University), Academic Director of the Netherlands Research School for Literary Studies (OSL), Scientific Advisor of the Netherlands Institute of Advanced Studies in Social Sciences and Humanities and the Netherlands Royal Academy of Arts and Sciences (NIAS-KNAW), Coordinator of the Research Theme Group Data Science, Culture & Social Change at the Research Centre for the Study of Democratic Cultures and Politics (DemCP, RUG), Co-Editor of the Routledge Companions to Hispanic and Latin American Studies and Research Fellow of the ‘Corpus Ex Machina’ Research Group Incubator (UdeA).




September 10, 2023



Digital humanities, Corpus linguistics, New Technologies, Natural Language Processing, Linguistics

Details about this monograph

ISBN-13 (15)


Publication date (01)