880257: Research Skills: Text Analytics


Language of instruction: English
Teaching format: Seminar (no information available on class times)
Assessment: A small research project with a written report and assignments (no information available on exam dates)
Study load: 3 ECTS credits
Registration: Register via Blackboard before the start of classes
Blackboard information: not available in Blackboard


dr. J.A.A. Engelen

Y. Matusevych

Course objectives (available in English only)

Do liars use specific words and sentence structures more often than people who tell the truth? Can a computer judge the quality of a text and make suggestions for improvement? How can a publisher quickly check if a manuscript contains plagiarism?

These are just a few questions that can be answered using ‘text analytics’, an umbrella term for various processes for extracting high-quality information from text, mostly relying on modern computational techniques.

By the end of this course:

The student will be able to work with various applications of text analytics. These range from simple tools for searching text with regular expressions (e.g., to find all words that end in ‘-ish’) and calculating type-token ratios (e.g., to explore lexical richness) to more sophisticated techniques for document comparison (e.g., to determine the semantic similarity between two or more texts) and automated syntactic parsing (e.g., to determine whether ‘brief’ is an adjective or a verb in a sentence).

The student will understand and be able to reflect on the possibilities and limitations of these applications.

The student will be familiar with the properties of several relevant open-access text corpora and be able to use these corpora for answering simple research questions.

The student will be able to prepare (or ‘pre-process’) a text for a specific computational analysis.

The student will be able to write a text analytics report and present and visualize text analytics results.
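To give an impression of the simpler techniques named in these objectives, the following is a minimal sketch in Python using only the standard library. It is an illustration, not course material: the sample sentence, the function names `tokenize` and `type_token_ratio`, and the choice of Python are all assumptions, and the tools actually used in the course may differ.

```python
import re

def tokenize(text):
    """Pre-process a text: lowercase it and keep only runs of letters/apostrophes."""
    return re.findall(r"[a-z']+", text.lower())

def type_token_ratio(tokens):
    """Number of distinct words (types) divided by total words (tokens)."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0

# Hypothetical sample text, invented for this illustration.
sample = "The childish fish made a foolish wish. The fish was selfish."

tokens = tokenize(sample)

# Regular expression search: all distinct words that end in "-ish".
ish_words = sorted(set(re.findall(r"\b\w+ish\b", sample.lower())))

ttr = type_token_ratio(tokens)

print(ish_words)
print(round(ttr, 3))
```

Note that the type-token ratio depends strongly on text length, so comparing it across texts of very different sizes can be misleading. This is one example of the limitations the objectives refer to.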

Course content (available in English only)

The course consists of one 2-hour meeting per week for 7 weeks. The meetings are a mix of lectures and practicals, in which the student will gain hands-on experience with several analytical tools (including, but not limited to, those described above) and text corpora.
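As a rough impression of what a document-comparison exercise could look like, the sketch below computes cosine similarity between bag-of-words count vectors in plain Python. This is an assumption for illustration only: the course description does not specify which method or software is used, the function names and example sentences are invented, and word overlap is only a crude stand-in for semantic similarity.

```python
import math
import re
from collections import Counter

def bag_of_words(text):
    """Count how often each lowercase word occurs in the text."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors (0 = no shared words)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical example documents, invented for this illustration.
doc_a = "the cat sat on the mat"
doc_b = "the cat lay on the rug"
doc_c = "stock prices fell sharply today"

sim_ab = cosine_similarity(bag_of_words(doc_a), bag_of_words(doc_b))
sim_ac = cosine_similarity(bag_of_words(doc_a), bag_of_words(doc_c))

print(sim_ab, sim_ac)
```

Because the comparison is purely lexical, two texts that express the same idea in different words would score low. Techniques that aim at semantic similarity typically go beyond raw word counts.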

Additional information (available in English only)

Attendance at the meetings is mandatory and active participation is required. The grade will be based on several assignments during the course (e.g., a 5-minute presentation on a software survey) and a small individual research project. Depending on the number of students in the course, some of the assignments will be done in small groups.

Students must bring their own laptops to the meetings.

Required reading

  1. To be announced during the course.

Possibly of interest to

  • Bedrijfscommunicatie en Digitale Media (2015)
  • Communicatie-Design (2015)
  • Human aspects of Information Technology (2015)
  • Data Journalism (2015)
  • Communication and Information Sciences (2015)
  • Master KCW: Art, Media and Society (2015)
  • Master KCW: Global Communication (2015)
  • Language and Communication (research) (2014)