Marco Marelli (Center for Mind-Brain Sciences, University of Trento, Italy)

When words are un-understandable: modeling the processing of morphologically complex words with compositional distributional semantics

Since the seminal LSA proposal, distributional semantics has provided efficient data-driven models of the human conceptual system, representing word meaning through vectors that record lexical co-occurrences in large text corpora. However, these approaches generate static descriptions of the semantic system and fall short of capturing the highly dynamic interactions that occur at the meaning level during language processing. The present proposal takes a first step toward capturing these dynamics by adapting distributional semantics to the combinatorial processing of morphologically complex words. From a functional perspective, affixes can be represented as matrices that map stems onto derived forms, and these matrices can be estimated from corpus data by means of machine learning techniques. As a consequence, derived-form meanings can be thought of as the result of a combinatorial procedure that transforms the stem vector on the basis of the affix matrix (e.g., the meaning of "nameless" can be obtained by multiplying the vector of "name" by the matrix of "-less"). This architecture accounts for the remarkable human capacity to generate new words that denote novel meanings, and it correctly predicts semantic intuitions about nonce derived forms (e.g., "quick+ify"). Moreover, the proposed compositional approach, once paired with a whole-word route, provides a new interpretative framework for semantic transparency effects, which are here explained in terms of the ease of the combinatorial procedure and the strength of the transformation brought about by the affix. Model-based predictions are in line with the effects of semantic transparency on explicit intuitions about existing words, on response times in lexical decision, and on morphological priming. Finally, the model can be extended to the processing of compound words through the estimation of a general functional structure. Preliminary analyses of this model's outputs are in line with reported effects of relational modulation in semantic-priming paradigms.
In conclusion, the present talk introduces a computational model to account for the semantic aspects of morphological combination. The model is data-driven, theoretically sound, and empirically supported, and it makes predictions that open new research avenues in the domain of semantic processing.
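
The affix-as-matrix idea described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that do not come from the talk: the affix matrix is fitted here by ordinary least squares (the abstract only mentions "machine learning techniques"), the vectors are random toy data standing in for corpus-derived ones, and the names estimate_affix_matrix, compose, and cosine are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_pairs = 10, 200

# Toy stand-ins for corpus-derived vectors (synthetic data, not from the talk).
# Each row of `stems` plays the role of a stem vector (e.g. "name"); the same
# row of `derived` plays the role of the attested derived form ("nameless").
true_affix = rng.normal(size=(dim, dim))
stems = rng.normal(size=(n_pairs, dim))
derived = stems @ true_affix + 0.01 * rng.normal(size=(n_pairs, dim))


def estimate_affix_matrix(stem_vecs, derived_vecs):
    """Fit a matrix W such that stem_vecs @ W approximates derived_vecs.

    Least squares is an assumption made for this sketch.
    """
    matrix, *_ = np.linalg.lstsq(stem_vecs, derived_vecs, rcond=None)
    return matrix


def compose(stem_vec, affix_matrix):
    """Derive a meaning vector by transforming the stem with the affix matrix."""
    return stem_vec @ affix_matrix


def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


affix_less = estimate_affix_matrix(stems, derived)

# A stem not seen during estimation, standing in for a nonce derivation.
new_stem = rng.normal(size=dim)
predicted = compose(new_stem, affix_less)
expected = new_stem @ true_affix
similarity = cosine(predicted, expected)
```

In this toy setting, similarity measures how closely the composed vector for an unseen stem matches the vector produced by the generating transformation. With real data, the comparison would instead be against corpus-observed vectors or human judgments.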