About

The Centre’s research focus is on individual variation in language use in the context of forensic author identification. Our aims are to develop the theoretical underpinnings of the notion of idiolect and to validate methods of authorship analysis for a variety of forensic tasks.

Forensic linguistic practice in cases of authorship identification is based on two assumptions: that every language user has a unique linguistic style, or 'idiolect', and that features characteristic of that style will recur with a relatively stable frequency (Coulthard, Grant and Kredens 2011: 536). Hundreds of style markers and a great variety of attribution techniques have been proposed over the years, with some recent studies reporting attribution success rates in the region of 95 per cent for the less complex closed-set tasks (e.g. Grieve 2007, Koppel et al. 2013, Wright 2017). In recent years, the development of powerful computing tools and the easy accessibility of large quantities of linguistic data online have sparked renewed interest in authorship analysis, and quantitative approaches currently seem to be the most promising. However, there are two problems with these approaches. Firstly, the relevant studies tend to use sociolinguistically and situationally homogeneous data, whereas forensically realistic identification methods need to be able to capture stylistic similarities between texts created in different contexts and for different purposes and audiences.

Secondly, the studies use non-transparent classification algorithms, whereas in legal and forensic settings identification models need to be explanatorily rich: the forensic linguist needs to be both certain of the validity of their findings and able to explain them to lay triers of fact. Our research will thus use sociolinguistically dynamic, cross-genre data, and in interpreting the findings we will be looking for ways to open the black box.

People

Directors, Academic and Research Staff

Projects

Individual variation across genres: English-language data
We are collecting and analyzing written and spoken data produced in a variety of contexts and modalities by 100 participants. Our analysis focuses on genre effects, with the aim of shedding light on whether features of individual idiolectal styles are consistent across contexts and modalities.
Individual variation across genres: Spanish-language data
This study is similar to the English idiolect project: we are interested in the influence of genre effects on the stability of individual idiolectal styles. The data, however, is in Spanish. This allows us to compare results between the two projects and observe whether there are any cross-linguistic similarities.
Individual variation across a lifetime: Up Series Project

Data for this project comes from the UK television series Up, which has for the past 56 years revisited the same 14 British individuals every seven years. Our aim is to study individuals’ language over their lifetimes, documenting which areas of language production remain stable and which are most subject to change.

Abuse and harassment in anti-abortion campaigns
Dr Tahmineh Tayebi, AIFL, and Dr Pam Lowe, Sociology and Policy, are investigating the abusive language directed at Stella Creasy, MP for Walthamstow, in an anti-abortion campaign on Twitter. Grounded in an interdisciplinary approach, this project uses corpus linguistics and in-depth socio-pragmatic analysis to find out how discourses of intimidation, abuse and harassment are created and justified. 
Motivations of self-styled 'paedophile-hunters'

Dr Emily Chiang is investigating the linguistic activities and motivations of 'paedophile-hunting' groups. Through an analysis of stance markers in in-group online chats, this project seeks to identify the topics and issues that present themselves as particularly salient to the group. This will support police in better understanding how such groups operate.