START Conference Manager    

Semi-Supervised Semantic Tagging of Conversational Understanding using Markov Topic Regression

Asli Celikyilmaz, Dilek Hakkani-Tur, Gokhan Tur and Ruhi Sarikaya

The 51st Annual Meeting of the Association for Computational Linguistics (ACL 2013)
Sofia, Bulgaria, August 4-9, 2013


Finding concepts in natural language utterances is a challenging task, especially given the scarcity of labeled data for learning semantic ambiguity. Furthermore, data mismatch issues, which arise when the expected test (target) data does not exactly match the training data, aggravate this scarcity problem.

To deal with these issues, we describe an efficient semi-supervised learning (SSL) approach which has two components: (i) Markov Topic Regression is a new probabilistic model to cluster words into semantic tags (concepts). It can efficiently handle semantic ambiguity by extending standard topic models with two new features. First, it encodes word n-gram features from labeled source and unlabeled target data. Second, by going beyond a bag-of-words approach, it takes into account the inherent sequential nature of utterances to learn semantic classes based on context. (ii) Retrospective Learner is a new learning technique that adapts to the unlabeled target data. Our new SSL approach improves semantic tagging performance by 3% absolute over the baseline models, and also compares favorably on semi-supervised syntactic tagging.

START Conference Manager (V2.61.0 - Rev. 2792M)