Ruprecht-Karls-Universität Heidelberg

Research interests

-- updated Dec 2011 --

Put briefly, my interests are centered around the development of representations for the meaning of natural language words and phrases that can be acquired from corpora (or at least from language users without a linguistics degree) but are still able to account for various phenomena of language use and understanding (such as ambiguity, semantic relations, inference, and cognitive processing cost). The following paragraphs sketch the most important research directions I follow and link to the most recent (or most canonical) papers.

Distributional Modeling of Word Meaning and Semantic Relations

Historically, I started out by developing a framework for building vector spaces from dependency graphs for tasks such as synonymy detection and prediction of priming effects, following the intuition that these models can outperform pure bag-of-words models [PL07]. We then used these dependency-based models for the representation of selectional preferences. The resulting models can rival the performance of deep semantic models for modeling selectional preferences while being much easier to construct [PPE10]. I'm also interested in cross-lingual distributional models, and have investigated strategies to induce bilingual vector spaces from comparable corpora which can be used to translate semantic models (like the selectional preference models) into new languages. A surprisingly nice result is that this translation can profit not only from cross-lingual synonymy (=translation) but also from "looser" semantic relations [PP11].
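As a rough illustration of the general idea of a dependency-based vector space, here is a minimal sketch. It is not the framework from [PL07]: the toy triples, the choice of (relation, direction, neighbor) dimensions, raw counts as weights, and the names `TRIPLES`, `build_space`, and `cosine` are all assumptions made for the example.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical toy input: (head, relation, dependent) triples from a parser.
TRIPLES = [
    ("drink", "obj", "coffee"),
    ("drink", "obj", "tea"),
    ("sip", "obj", "coffee"),
    ("sip", "obj", "tea"),
    ("drive", "obj", "car"),
    ("park", "obj", "car"),
]

def build_space(triples):
    """Map each word to raw counts over (relation, direction, neighbor) dimensions."""
    space = defaultdict(Counter)
    for head, rel, dep in triples:
        space[dep][(rel, "head", head)] += 1  # the dependent sees its head
        space[head][(rel, "dep", dep)] += 1   # the head sees its dependent
    return space

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

space = build_space(TRIPLES)
sim_coffee_tea = cosine(space["coffee"], space["tea"])  # shared verbs as heads
sim_coffee_car = cosine(space["coffee"], space["car"])  # no shared dimensions
```

In this toy data, "coffee" and "tea" occur as objects of the same verbs and therefore come out as highly similar, while "coffee" and "car" share no dimensions. A bag-of-words model would ignore the relation labels that distinguish these dimensions.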

Phenomena in Lexical Semantics: Compositionality and Polysemy

An important challenge in distributional modeling is the construction of models that are able to model the meaning not only of individual words, but of whole predicate-argument combinations -- or, of words in context, which turns out to be a closely related question. We have proposed two models for word meaning in (local) context. The first one combines the word's context-independent lexical meaning with the expectations of its context for the position the word occupies [EP08]. The second one represents words as instance clouds and treats word combination as "activation" of one cloud by the other [EP10].
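The first of these ideas, combining a word's lexical meaning with the expectations of its context, can be sketched as follows. This is only an illustration under stated assumptions: the component-wise product as combination operation, the toy vectors, and the names `combine`, `catch`, and `ball_expects_verb` are hypothetical and not taken from [EP08].

```python
def combine(lexical, expectation):
    """Component-wise product over shared dimensions (an assumed combination operation)."""
    return {d: lexical[d] * expectation[d] for d in lexical if d in expectation}

# Hypothetical context-independent vector for the verb "catch".
catch = {"throw": 2.0, "disease": 3.0, "sport": 1.0}

# Hypothetical expectations that the object "ball" has for the verb governing it.
ball_expects_verb = {"throw": 1.0, "sport": 1.0, "kick": 0.5}

# Meaning of "catch" in the context "catch (a) ball".
catch_in_context = combine(catch, ball_expects_verb)
```

Under these assumptions, the "disease" dimension (as in "catch a cold") drops out of the contextualized vector because the object "ball" does not expect it, which is the kind of sense adjustment such a model aims for.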

Another lasting problem is polysemy (that is, systematic sense ambiguity), as opposed to homonymy (idiosyncratic sense ambiguity). We recently showed that the homonymy/polysemy distinction can be made fairly well on the basis of ontological information from WordNet and CoreLex. However, we also found that many words actually show both kinds of ambiguity for their different senses [UP11].

Psycholinguistics of Semantic Interpretation

My interest in meaning and ambiguity also extends to the psycholinguistic side. We have recently investigated the time course of metonymic sentence interpretation and have found effects from subject (actor) choice that are difficult to relate to a lexicon-based account and are more consistent with a primarily world knowledge-based interpretation process [ZS11].

Semantic Processing and Textual Entailment

I also work in the framework of "Textual Entailment", which tries to cast the semantic processing needs of NLP applications in terms of common sense entailment decisions at the surface level. For example, we have approached Machine Translation evaluation based on textual entailment features and have been able to predict human judgments of MT quality significantly better than surface-based methods can [PGCJM09].

A second strand that we have considered is the relation between Textual Entailment and discourse. We have shown that a substantial number of "difficult" entailments could be solved with discourse knowledge, in particular coreference and bridging [MDP10]. This suggests that discourse processing and entailment tasks should be better interleaved in future work.

Recent engineering work has also produced a result with practical impact: one of the currently (2011) best-performing Named Entity Recognizers for German [FP10].

Semantic Lexicons and Semantic Roles

Even though the paradigm of inducing lexical-semantic knowledge from corpora is our best shot at large-scale resource building, it has its own share of problems. There may be fundamental disagreement on what semantic dimensions to use for the description of meaning, be it with respect to frameworks for semantic role annotation [EEKP06] or general-purpose semantic verb classifications [CEPS08]. Also, the representation of knowledge across multiple layers of linguistic analysis (e.g., syntax and semantics) requires answers to questions about granularity, reliability, and integrated querying [BPSFH08].

Finally, my PhD thesis was concerned with a topic in the area of semantic roles, namely the cross-lingual projection of frame-semantic information. Starting from the available English resource (i.e., FrameNet), I used annotation projection in parallel corpora, based on graph-based tree alignment, to produce resources for other languages [PL09]. The models are fairly language-independent, but my evaluation concentrated on German and French. In the same context, I have also done some work on semantic role labeling, both practically [EP06] and theoretically [EP05], and investigated a semi-supervised approach to SRL that uses annotations for verbal predicates to label nominal predicates [PPS08].