

Query: "dependency parsing" or "dependency parse"
Results 1-20 of 1966
Joint Inference for Heterogeneous Dependency Parsing
Abstract: This paper is concerned with the problem of heterogeneous dependency parsing. We present a novel joint inference scheme that leverages the consensus information between heterogeneous treebanks in the parsing phase. Unlike stacked learning methods (Nivre and McDonald, 2008; Martins et al., 2008), which process dependency parsing in a pipelined way (e.g., a second level uses the first-level outputs), our method coordinates multiple dependency parsing models to exchange consensus information. We conduct experiments on the Chinese Dependency Treebank (CDT) and the Penn Chinese Treebank (CTB); the results show that joint inference can bring significant improvements to all state-of-the-art dependency parsers.
Guangyou Zhou
Persian Dependency Parsing
Abstract: This paper investigates the impact of different morphological and lexical information on data-driven dependency parsing of Persian, a morphologically rich language. We explore two state-of-the-art parsers, namely MSTParser and MaltParser, on the recently released Persian dependency treebank and establish some baselines for dependency parsing performance. Three sets of issues are addressed in our experiments: effects of using gold and automatically derived features, finding the best features for the parser, and a suitable way to alleviate the data sparsity problem. The final accuracy is 87.91% and 88.37% labeled attachment scores for MaltParser and MSTParser, respectively.
Mojtaba Khallash, Ali Hadian, Behrouz Minaei-Bidgoli
Easy-First POS Tagging and Dependency Parsing with Beam Search
Abstract: In this paper, we combine easy-first dependency parsing and POS tagging algorithms with beam search and structured perceptron. We propose a simple variant of "early-update" to ensure valid update in the training process. The proposed solution can also be applied to combine beam search and structured perceptron with other systems that exhibit spurious ambiguity. On CTB, we achieve 94.01% tagging accuracy and 86.33% unlabeled attachment score with a relatively small beam width. On PTB, we also achieve state-of-the-art performance.
Ji Ma, Tong Xiao, Nan Yang
Ji Ma, Tong Xiao, Jingbo Zhu, Feiliang Ren
Transition-based Dependency Parsing Using Recursive Neural Networks
Abstract: In this work, we present a general compositional vector framework for transition-based dependency parsing. The ability to use transition-based algorithms allows for the application of vector composition to a large set of languages where only dependency treebanks are available, as well as handling linguistic phenomena such as non-projectivities which pose problems for previously proposed methods. We introduce the concept of a Transition Directed Acyclic Graph that allows us to apply Recursive Neural Networks for parsing with existing transition-based algorithms. Our framework captures semantic relatedness between phrases similarly to a constituency-based counterpart from the literature, for example predicting that "a financial crisis", "a cash crunch" and "a bear market" are semantically similar. Currently, a parser based on our framework is capable of achieving 86.25% in Unlabelled Attachment Score for a well-established dependency dataset using only word representations as input, falling less than 2 percentage points short of a previously proposed comparable feature-based model.
Pontus Stenetorp
Parsing Croatian and Serbian by Using Croatian Dependency Treebanks
Abstract: We investigate statistical dependency parsing of two closely related languages, Croatian and Serbian. As these two morphologically complex languages of relaxed word order are generally under-resourced--with the topic of dependency parsing still largely unaddressed, especially for Serbian--we make use of the two available dependency treebanks of Croatian to produce state-of-the-art parsing models for both languages. We observe parsing accuracy on four test sets from two domains. We give insight into overall parser performance for Croatian and Serbian, impact of preprocessing for lemmas and morphosyntactic tags and influence of selected morphosyntactic features on parsing accuracy.
Željko Agić, Danijela Merkler, Daša Berović
ISI-Kolkata at MTPIL-2012
Abstract: (no abstract)
Arjun Das, Arabinda Shee and Utpal Garain
Exploiting the Contribution of Morphological Information to Parsing: the BASQUE TEAM system in the SPRML'2013 Shared Task
Abstract: This paper presents a dependency parsing system, submitted as BASQUE TEAM to the SPMRL'2013 Shared Task, based on the analysis of each morphological feature of the languages. Once the specific relevance of each morphological feature is calculated, the system uses the most significant of them to create a series of analyzers using two freely available, state-of-the-art dependency parsers, MaltParser and Mate. Finally, the system combines the previously obtained parses using a voting approach.
Iakes Goenaga, Nerea Ezeiza, Koldo Gojenola
, Ramisetty Rajeswara Rao
A Statistical Approach to Prediction of Empty Categories in Hindi Dependency Treebank
Abstract: In this paper we use statistical dependency parsing techniques to detect NULL or Empty categories in Hindi sentences. We work on the Hindi dependency treebank released as part of the COLING MTPIL 2012 Workshop. Rule-based approaches have previously been employed to detect Empty heads for Hindi, but statistical learning for automatic prediction has not been explored. In this approach we introduce complex labels into the data to predict Empty categories in sentences. We also discuss the shortcomings and difficulties of this approach and evaluate its performance on different Empty categories.
Puneeth Kukkadapu, Prashanth Mannem
Two methods to incorporate local morphosyntactic features in Hindi dependency parsing
Abstract: In this paper we explore two strategies to incorporate local morphosyntactic features in Hindi dependency parsing. These features are obtained using a shallow parser. We first explore which information provided by the shallow parser is most beneficial and show that local morphosyntactic features in the form of chunk type, head/non-head information, chunk boundary information, distance to the end of the chunk and suffix concatenation are crucial in Hindi dependency parsing. We then investigate the best way to incorporate this information during dependency parsing. Further, we compare the results of various experiments based on various criteria and do some error analysis. All the experiments were done with two data-driven parsers, MaltParser and MSTParser, on a part of the multi-layered and multi-representational Hindi Treebank, which is under development. This paper is also the first attempt at complete sentence-level parsing for Hindi.
1 Introduction
The dependency parsing community has for a few years shown considerable interest in parsing morphologically rich languages with flexible word order. This is partly due to the increasing availability of dependency treebanks for such languages, but it is also motivated by the observation that the performance obtained for these languages has not been very high (Nivre et al., 2007a). Attempts at handling various non-configurational aspects in these languages have pointed towards shortcomings in traditional parsing methodologies (Tsarfaty and Sima'an, 2008; Eryigit et al., 2008; Seddah et al., 2009; Husain et al., 2009; Gadde et al., 2010). Among other things, it has been pointed out that the use of language-specific features may play a crucial role in improving the overall parsing performance.
Different languages tend to encode syntactically relevant information in different ways, and it has been hypothesized that the integration of morphological and syntactic information could be a key to better accuracy. However, it has also been noted that incorporating thes ...
(no authors)
A Cross-Task Flexible Transition Model for Arabic Tokenization, Affix
Abstract: This paper describes cross-task flexible transition models (CTF-TMs) and demonstrates their effectiveness for Arabic natural language processing (NLP). NLP pipelines often suffer from error propagation, as errors committed in lower-level tasks cascade through the remainder of the processing pipeline. By allowing a flexible order of operations across and within multiple NLP tasks, a CTF-TM can mitigate both cross-task and within-task error propagation. Our Arabic CTF-TM models tokenization, affix detection, affix labeling, part-of-speech tagging, and dependency parsing, achieving state-of-the-art results. We present the details of our general framework, our Arabic CTF-TM, and the setup and results of our experiments.
Stephen Tratz
Ensembling Various Dependency Parsers: Adopting Turbo Parser for Indian Languages
Abstract: In this paper, we describe our experiments on applying a combination of Malt, MST and Turbo Parsers for Hindi dependency parsing as part of a shared task at the MTPIL 2012 Workshop, COLING 2012. We explore the usage and adoption of the recently released Turbo Parser for parsing Indian languages. Various configurations of each parser are explored before combination in order to adjust them for two different settings of the data (with gold-standard and automatic part-of-speech tags). We achieved a best result of 96.50% unlabeled attachment score (UAS), 92.90% labeled accuracy (LA) and 91.49% labeled attachment score (LAS) using the voting method on data with gold POS tags. On data with automatic POS tags, we achieved a best result of 93.99% UAS, 90.04% LA and 87.84% LAS.
Puneeth Kukkadapu, Deepak Kumar Malladi and Aswarth Dara
From ranked words to dependency trees: two-stage unsupervised non-projective dependency parsing
Abstract: Usually unsupervised dependency parsing tries to optimize the probability of a corpus by modifying the dependency model that was presumably used to generate the corpus. In this article we explore a different view in which a dependency structure is, among other things, a partial order on the nodes in terms of centrality or saliency. Under this assumption we model the partial order directly and derive dependency trees from this order. The result is an approach to unsupervised dependency parsing that is very different from standard ones in that it requires no training data. Each sentence induces a model from which the parse is read off. Our approach is evaluated on data from 12 different languages. Two scenarios are considered: a scenario in which information about part-of-speech is available, and a scenario in which parsing relies only on word forms and distributional clusters. Our approach is competitive with the state of the art in both scenarios.
Anders Søgaard
Fourth-Order Dependency Parsing
Abstract: (no abstract)
Hai Zhao
Semi-supervised dependency parsing using generalized tri-training
Abstract: Martins et al. (2008) presented what to the best of our knowledge still ranks as the best overall result on the CoNLL-X Shared Task datasets. The paper shows how triads of stacked dependency parsers described in Martins et al. (2008) can label unlabeled data for each other in a way similar to co-training and produce end parsers that are significantly better than any of the stacked input parsers. We evaluate our system on five datasets from the CoNLL-X Shared Task and obtain 10-20% error reductions, including the best reported results on four of them. We compare our approach to other semi-supervised learning algorithms.
Anders Søgaard and Christian Rishøj
SPMRL'13 Shared Task System: The CADIM Arabic Dependency Parser
Abstract: We describe the submission from the Columbia Arabic & Dialect Modeling group (CADIM) for the Shared Task at the Fourth Workshop on Statistical Parsing of Morphologically Rich Languages (SPMRL'2013). We participate in the Arabic Dependency parsing task for predicted POS tags and features. Our system is based on Marton et al. (2013).
Yuval Marton, Nizar Habash, Owen Rambow, Sarah Alkuhlani
Zhenghua Li, Min Zhang, Wanxiang Che, Ting Liu

