

Authors: Mohit Bansal, John DeNero, Dekang Lin
Abstract
We propose an unsupervised method for clustering the translations of a word, such that the translations in each cluster share a common semantic sense. Words are assigned to clusters based on their usage distribution in large monolingual and parallel corpora using the soft K-Means algorithm. In addition to describing our approach, we formalize the task of translation sense clustering and describe a procedure that leverages WordNet for evaluation. By comparing our induced clusters to reference clusters generated from WordNet, we demonstrate that our method effectively identifies sense-based translation clusters and benefits from both monolingual and parallel corpora. Finally, we describe a method for annotating clusters with usage examples.
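
To make the clustering step concrete, here is a minimal sketch of generic soft K-Means in Python. It is not the authors' implementation: the two-dimensional vectors, the labels w1 to w4, the stiffness parameter beta, and the choice of initial centers are all assumptions made for illustration, standing in for usage-distribution features that the abstract does not specify.

```python
import math

def soft_kmeans(points, centers, beta=5.0, iterations=20):
    """Generic soft K-Means: each point belongs to every cluster with a weight."""
    centers = [list(c) for c in centers]
    resp = []
    for _ in range(iterations):
        # Assignment step: responsibilities are a softmax over negative
        # squared distances, scaled by the (assumed) stiffness beta.
        resp = []
        for p in points:
            d2 = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            nearest = min(d2)  # subtract the minimum for numerical stability
            w = [math.exp(-beta * (d - nearest)) for d in d2]
            total = sum(w)
            resp.append([x / total for x in w])
        # Update step: each center becomes the responsibility-weighted mean.
        for k in range(len(centers)):
            mass = sum(r[k] for r in resp)
            if mass > 0:
                centers[k] = [
                    sum(r[k] * p[j] for r, p in zip(resp, points)) / mass
                    for j in range(len(points[0]))
                ]
    return centers, resp

# Invented toy feature vectors for four hypothetical translations.
toy_vectors = {
    "w1": [1.0, 0.0],
    "w2": [0.9, 0.1],
    "w3": [0.0, 1.0],
    "w4": [0.1, 0.9],
}
labels = list(toy_vectors)
points = [toy_vectors[name] for name in labels]

# Assumed initialization: seed the two centers with w1 and w3.
centers, resp = soft_kmeans(points, [points[0], points[2]])

# Report each item's most likely cluster alongside its soft weights.
for name, r in zip(labels, resp):
    print(name, r.index(max(r)), [round(x, 3) for x in r])
```

Because the assignments are soft, an item whose usage sits between two senses keeps non-trivial weight in both clusters rather than being forced into one, which is the property that distinguishes this from hard K-Means.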
Item Details
Update: last updated 06/13/2012, 06:29 PM

