START Conference Manager    

Leveraging Synthetic Discourse Data via Multi-task Learning for Implicit Discourse Relation Recognition

Man Lan, Yu Xu and Zhengyu Niu

The 51st Annual Meeting of the Association for Computational Linguistics (ACL 2013)
Sofia, Bulgaria, August 4-9, 2013


To overcome the shortage of labeled data for implicit discourse relation recognition, previous work has automatically generated training data by removing explicit discourse connectives from sentences and then built models on these synthetic implicit examples. However, a previous study \cite{Sporleder:08} showed that models trained on such synthetic data do not generalize well to natural (i.e., \emph{genuine}) implicit discourse data. In this work, we revisit this issue and present a multi-task learning system that can effectively use synthetic data for implicit discourse relation recognition. Results on PDTB data show that, under the multi-task learning framework, our models, which use the prediction of explicit discourse connectives as auxiliary learning tasks, achieve an average $F_1$ improvement of 5.86\% over baseline models.
