Invited Speakers

Producing Referring Expressions: Combining Empirical and Computational Approaches to Reference

Emiel Krahmer
Department of Communication and Information Science
Tilburg University, The Netherlands
Referring expressions (including descriptions such as "the man with the hat" and pronouns such as "he") play a central role in communication, and their production has been studied extensively from both an empirical, psycholinguistic perspective and a computational one. Psycholinguistic studies of reference production have yielded many interesting ideas (for instance concerning common ground and alignment), but the resulting theories tend to be somewhat sketchy and informal. Computational theories of reference production, on the other hand, often result in explicit and practically useful algorithms, which are nevertheless unsatisfactory as models of human linguistic behavior. In recent years there has been increasing interest in bridging the gap between these two approaches. In this talk I will describe a number of recent developments in this area, focussing on two topics in particular. The first concerns the production of referring expressions in interactive settings: I will present new experimental evidence showing that existing algorithms for the production of referring expressions fail to account for how references are produced in interaction. The second concerns overspecification, the phenomenon that referring expressions often include information that listeners do not strictly need for the purpose of identification. I will discuss recent findings on why speakers produce overspecified descriptions and on how listeners interpret them. Finally, I will discuss what a psychologically plausible computational model of reference production might look like.

This work was done within the NWO VICI project "Bridging the gap between psycholinguistics and computational linguistics: The case of referring expressions".

Click here to view the website of Emiel Krahmer.

Pronominal Resolution, Dependencies and the Architecture of the Linguistic Brain

Maria Mercedes Piñango
Department of Linguistics and Interdepartmental Neuroscience Program
Yale University, USA

The literature on Broca’s aphasia comprehension (where Broca’s aphasia is understood as the syndrome resulting from lesion to left inferior frontal cortex (BA 44/45/6 and underlying white matter)) reveals an interesting situation: on the one hand, it reports a robust pattern of impairment in the comprehension of pronouns (vs. reflexives) and in the comprehension of object relative clauses and agentive passives (vs. subject relative clauses and actives), thus suggesting that these two kinds of phenomena share common linguistic mechanisms; on the other, it does not offer a unified account of the patterns observed. That is, the independent generalizations proposed to account for each of these patterns are not necessarily compatible with each other. This situation can be explained in at least two ways: either binding phenomena and relative clauses rely on categorically distinct mechanisms (and it is therefore simple coincidence that both are affected by damage to the same brain region), or the generalizations proposed to capture the two contrasts are not formulated at the right level of resolution.

This talk is an exploration of the second possibility. To this end I bring together lesion and imaging-based evidence on pronoun comprehension and compare the patterns observed to those associated with Wh-dependencies. I focus on recent fMRI findings from our lab which show a distinction between the /phrase structure/ relations that syntactically support the dependencies (exclusively recruiting cortical regions BA44/6) and the /semantic relations/ involved in the dependencies (exclusively recruiting BA45/47). These results show that, just like pronominal interpretation, which has been viewed as the result of the interaction between coindexation (syntactic) and coreference (semantic) mechanisms, Wh-dependencies result from the interaction of phrase-structure building (syntactic) and argument interpretation (semantic) mechanisms. Crucially, the functional distinctions observed are found a) within the confines of Broca’s area and b) as the comprehension of the sentence unfolds, suggesting the activity not of isolated cortical regions but of cortical paths.

The discussion will center on the possible organizing principles of the linguistic architecture that these findings suggest, and of the cortical paths hypothesized to support it. It will also explore the notion that linguistic organization as we currently understand it may not be in a one-to-one correspondence with brain organization, and that the brain instead organizes language in ways that cut across traditional subcomponents of the system.

Click here to view the website of Maria Mercedes Piñango.

BART Tutorial

BART: Multilingual Anaphora Resolution System
Olga Uryupina* and Massimo Poesio** (*University of Trento, Italy; **University of Essex, UK)

BART is an open-source modular toolkit for anaphora resolution that supports state-of-the-art statistical approaches to the task and enables efficient feature engineering. It implements different models of anaphora resolution (mention-pair and entity-mention; best-first vs. ranking), has interfaces to different machine learners (MaxEnt, SVM, decision trees) and provides a large set of linguistically motivated features, along with the possibility to design new ones. BART was originally created and tested for English, but its flexible modular architecture makes it portable to other languages and domains.
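For readers unfamiliar with the terminology, the mention-pair model with best-first decoding mentioned above can be sketched in a few lines of Python. This is only an illustration of the general technique, not BART's code or API (BART itself is not used here): the names `toy_score`, `best_first` and `build_chains`, the string-matching scorer and the 0.5 threshold are all assumptions made for the example. In a real system the scorer would be a trained classifier over linguistically motivated features.

```python
from typing import Callable, List, Optional

Mention = str  # toy stand-in; a real system would carry spans and features


def toy_score(antecedent: Mention, anaphor: Mention) -> float:
    """Hypothetical pairwise scorer standing in for a trained classifier."""
    a, b = antecedent.lower(), anaphor.lower()
    if a == b:
        return 1.0  # exact string match
    if a.split()[-1] == b.split()[-1]:
        return 0.7  # same last word, used here as a crude head match
    return 0.1


def best_first(
    mentions: List[Mention],
    score: Callable[[Mention, Mention], float],
    threshold: float = 0.5,
) -> List[Optional[int]]:
    """Link each mention to its highest-scoring preceding mention.

    Returns, for every mention, the index of the chosen antecedent,
    or None when no candidate scores above the threshold.
    """
    links: List[Optional[int]] = []
    for j, anaphor in enumerate(mentions):
        best_i, best_s = None, threshold
        for i in range(j):
            s = score(mentions[i], anaphor)
            if s > best_s:  # strict comparison: earliest candidate wins ties
                best_i, best_s = i, s
        links.append(best_i)
    return links


def build_chains(links: List[Optional[int]]) -> List[List[int]]:
    """Turn antecedent links into coreference chains (lists of mention indices)."""
    chain_of = {}
    chains: List[List[int]] = []
    for j, i in enumerate(links):
        if i is None:
            chain_of[j] = len(chains)  # start a new chain
            chains.append([j])
        else:
            chain_of[j] = chain_of[i]  # join the antecedent's chain
            chains[chain_of[i]].append(j)
    return chains


mentions = ["Emma Smith", "the report", "Smith", "the report", "a hat"]
links = best_first(mentions, toy_score)
print(links)                # [None, None, 0, 1, None]
print(build_chains(links))  # [[0, 2], [1, 3], [4]]
```

A ranking model would differ in how candidates are compared during training, and an entity-mention model would score a mention against partially built chains rather than against single antecedents; neither is shown here.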

At the last CoNLL competition for Coreference Resolution in English, two groups submitted systems based on BART, both scoring among the top teams. BART has also shown reliable performance for German, English and Italian at SemEval-2010 Task 1 on Multilingual Coreference Resolution.

The aim of this tutorial is to make the anaphora resolution community more familiar with BART, providing information on the basics (installation, running as a "black box") and on more advanced usage (implementing your own features and models). We believe that this tutorial will appeal to linguists who are interested in anaphora resolution but do not wish to implement a state-of-the-art full-scale system from scratch, as well as to researchers developing new coreference resolution algorithms or using coreference in their applications.


1. Installing BART, making your first experiments
2. Running BART with different input/output formats. CoNLL format and scoring.
3. Running BART as a black box (for use in an external application)
4. Fine-tuning BART: different models, feature selection, machine learners
5. LanguagePlugin: porting BART to new languages
6. Developing new features or models.