

Query: CRF or CRFs or "conditional random field" or "conditional random fields"
Status: updated [Success]
1-20 of 2525: 12345...127
Logarithmic Opinion Pools for Conditional Random Fields
Abstract: Recent work on Conditional Random Fields (CRFs) has demonstrated the need for regularisation to counter the tendency of these models to overfit. The standard approach to regularising CRFs involves a prior distribution over the model parameters, typically requiring search over a hyperparameter space. In this paper we address the overfitting problem from a different perspective, by factoring the CRF distribution into a weighted product of individual "expert" CRF distributions. We call this model a logarithmic opinion pool (LOP) of CRFs (LOP-CRFs). We apply the LOP-CRF to two sequencing tasks. Our results show that unregularised expert CRFs combined under a LOP can outperform the unregularised standard CRF, and attain a performance level close to that of the regularised CRF. LOP-CRFs therefore provide a viable alternative to CRF regularisation without the need for hyperparameter search.
Andrew Smith Trevor Cohn Miles Osborne
Google Scholar CiteSeer X DBLP Database
Content Analysis and Indexing Linguistic processing
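The abstract's core construction, a weighted product of expert distributions renormalised into a single distribution, can be sketched over toy per-label distributions (a full LOP-CRF pools whole sequence models; the function name and numbers below are purely illustrative):

```python
import math

def logarithmic_opinion_pool(expert_dists, weights):
    """Combine expert distributions over the same label set via a
    weighted geometric mean (a sum in log-space), then renormalise."""
    labels = expert_dists[0].keys()
    log_pool = {y: sum(w * math.log(p[y]) for p, w in zip(expert_dists, weights))
                for y in labels}
    z = sum(math.exp(v) for v in log_pool.values())
    return {y: math.exp(v) / z for y, v in log_pool.items()}

# Two hypothetical "expert" distributions over three labels.
e1 = {"B": 0.7, "I": 0.2, "O": 0.1}
e2 = {"B": 0.5, "I": 0.3, "O": 0.2}
pooled = logarithmic_opinion_pool([e1, e2], weights=[0.5, 0.5])
```

With equal weights the pool is simply the normalised geometric mean of the experts, which damps predictions on which the experts disagree.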
Recognizing Biomedical Named Entities using Skip-chain Conditional Random Fields
Abstract: Linear-chain Conditional Random Fields (CRFs) have been applied to perform the Named Entity Recognition (NER) task in many biomedical text mining and information extraction systems. However, the linear-chain CRF cannot capture long-distance dependencies, which are very common in the biomedical literature. In this paper, we propose a novel study of capturing such long-distance dependencies by defining two principles for constructing skip edges for a skip-chain CRF: linking similar words and linking words having typed dependencies. The approach is applied to recognize gene/protein mentions in the literature. When tested on the BioCreAtIvE II Gene Mention dataset and the GENIA corpus, the approach contributes significant improvements over the linear-chain CRF. We also present an in-depth error analysis of inconsistent labeling and study the influence of the quality of skip edges on labeling performance.
Jingchen Liu Minlie Huang Xiaoyan Zhu
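One of the two edge-construction principles the abstract names, linking similar words, can be sketched as a scan for repeated tokens within a window (the other principle, typed dependencies, would need a parser; the window size and exact-match rule here are illustrative choices):

```python
def build_skip_edges(tokens, max_span=10):
    """Link pairs of identical tokens within a window; each returned
    pair (i, j) would become a skip edge in a skip-chain CRF."""
    edges = []
    for i in range(len(tokens)):
        # j starts at i + 2 so adjacent tokens, already linked by the
        # linear chain, are skipped.
        for j in range(i + 2, min(i + 1 + max_span, len(tokens))):
            if tokens[i].lower() == tokens[j].lower():
                edges.append((i, j))
    return edges

sent = "the p53 protein binds DNA and p53 also regulates transcription".split()
edges = build_skip_edges(sent)
```

In this toy sentence the two mentions of "p53" (positions 1 and 6) get a skip edge, letting the model share labeling evidence between them.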
Selecting Optimal Feature Template Subset for CRFs
Abstract: Conditional Random Fields (CRFs) are the state-of-the-art models for sequential labeling problems. A critical step is to select an optimal feature template subset before employing CRFs, which is a tedious task. To improve the efficiency of this step, we propose a new method that adopts the maximum entropy (ME) model and maximum entropy Markov models (MEMMs) instead of CRFs, considering the homology between ME, MEMMs, and CRFs. Moreover, empirical studies on the efficiency and effectiveness of the method are conducted in the field of Chinese text chunking, where its performance ranked first in task two of CIPS-ParsEval-2009.
Xingjun Xu Guanglu Sun Yi Guan Xishuang Dong Sheng Li
Effect of Non-linear Deep Architecture in Sequence Labeling
Abstract: If we compare the widely used Conditional Random Fields (CRFs) with the newly proposed "deep architecture" sequence models (Collobert et al., 2011), two things change: from a linear architecture to a non-linear one, and from discrete feature representations to distributional ones. It is unclear, however, what utility non-linearity offers in conventional feature-based models. In this study, we show the close connection between CRFs and "sequence model" neural nets, and present an empirical investigation comparing their performance on two sequence labeling tasks: Named Entity Recognition and Syntactic Chunking. Our results suggest that non-linear models are highly effective in low-dimensional distributional spaces. Somewhat surprisingly, we find that a non-linear architecture offers no benefits in a high-dimensional discrete feature space.
Mengqiu Wang Christopher D. Manning
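The architectural contrast the abstract studies, a linear scorer over discrete features versus a non-linear one, can be sketched with a single tanh hidden layer (the feature names and weights below are toy values, not the paper's setup):

```python
import math

def linear_score(feats, w):
    """Linear (CRF-style) scoring: a weighted sum of feature values."""
    return sum(w.get(f, 0.0) * v for f, v in feats.items())

def nonlinear_score(feats, w1, w2):
    """Non-linear scoring: one tanh hidden layer over the same
    features, the kind of architecture change the paper compares."""
    hidden = [math.tanh(sum(row.get(f, 0.0) * v for f, v in feats.items()))
              for row in w1]
    return sum(h * o for h, o in zip(hidden, w2))

# Toy feature vector and weights (illustrative values only).
feats = {"word=dog": 1.0, "is-capitalised": 0.0}
lin = linear_score(feats, {"word=dog": 2.0})
non = nonlinear_score(feats, [{"word=dog": 1.0}], [1.0])
```

The non-linear scorer can represent interactions between features that the linear one cannot, which is exactly the capacity whose usefulness the paper finds depends on the feature space.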
On Application of Conditional Random Field in Stemming of Bengali Natural Language Text
Abstract: While the stochastic route has been explored in solving the stemming problem, the Conditional Random Field (CRF), a conditional-probability-based statistical model, has not yet been applied. We applied CRFs to train a set of stemmers for Bengali natural language text. Care was taken to keep the design language-neutral so that the same approach can be applied to other languages. The experiments yielded more than 86% accuracy.
Sandipan Sarkar Sivaji Bandyopadhyay
Supervised Morphological Segmentation in a Low-Resource Learning Setting using Conditional Random Fields
Abstract: We discuss data-driven morphological segmentation, in which word forms are segmented into morphs, the surface forms of morphemes. Our focus is on a low-resource learning setting, in which only a small amount of annotated word forms is available for model training, while unannotated word forms are available in abundance. The current state-of-the-art methods 1) exploit both the annotated and unannotated data in a semi-supervised manner, and 2) learn morph lexicons and subsequently uncover segmentations by generating the most likely morph sequences. In contrast, we discuss 1) employing only the annotated data in a supervised manner, while entirely ignoring the unannotated data, and 2) directly learning to predict morph boundaries given their local sub-string contexts instead of learning morph lexicons. Specifically, we employ conditional random fields, a popular discriminative log-linear model, for segmentation. We present experiments on two data sets comprising five diverse languages. We show that the fully supervised boundary prediction approach outperforms the state-of-the-art semi-supervised morph lexicon approaches on all languages when using the same annotated data sets.
Teemu Ruokolainen Oskar Kohonen Sami Virpioja Mikko Kurimo
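The boundary-prediction view the abstract takes, classifying each intra-word position from its local sub-string context, can be sketched as a feature extractor (a hypothetical template for illustration, not the paper's exact feature set):

```python
def boundary_features(word, pos, n=3):
    """Character n-gram features around a candidate morph boundary
    at `pos` (i.e. between word[pos-1] and word[pos]). A sequence
    model such as a CRF would score B/non-B labels at each position
    from features like these."""
    feats = {}
    for k in range(1, n + 1):
        feats[f"L{k}"] = word[max(0, pos - k):pos]  # left context n-gram
        feats[f"R{k}"] = word[pos:pos + k]          # right context n-gram
    return feats

# Candidate boundary between "walk" and "ing".
f = boundary_features("walking", 4)
```

Each position in the word gets its own feature dictionary, so segmentation reduces to a per-position labeling problem rather than lexicon construction.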
Structure-Aware Review Mining and Summarization
Abstract: In this paper, we focus on object feature-based review summarization. (Note that "feature" has two meanings here: we use "object feature" for the target entity an opinion is expressed on, and "feature" for the input to machine learning methods.) Different from most previous work using linguistic rules or statistical methods, we formulate the review mining task as a joint structure tagging problem. We propose a new machine learning framework based on Conditional Random Fields (CRFs). It can employ rich features to jointly extract positive opinions, negative opinions and object features from review sentences. The linguistic structure can be naturally integrated into the model representation. Besides the linear-chain structure, we also investigate conjunction structure and syntactic tree structure in this framework. Through extensive experiments on movie review and product review data sets, we show that structure-aware models outperform many state-of-the-art approaches to review mining. With the rapid expansion of e-commerce, people are more likely to express their opinions and hands-on experiences on products or services they have purchased. These reviews are important for both business organizations and personal customers. Companies can decide on their strategies for marketing and product improvement. Customers can make better decisions when purchasing products or services. Unfortunately, reading through all customer reviews is difficult; especially for popular items, the number of reviews can run to hundreds or even thousands. Therefore, it is necessary to provide coherent and concise summaries for these reviews.
Fangtao Li, Chao Han, Minlie Huang, Xiaoyan Zhu, Ying-Ju Xia, Shu Zhang and Hao Yu
A Unified and Discriminative Model for Query Refinement
Abstract: This paper addresses the issue of query refinement, which involves reformulating ill-formed search queries in order to enhance the relevance of search results. Query refinement typically includes a number of tasks such as spelling error correction, word splitting, word merging, phrase segmentation, word stemming, and acronym expansion. In previous research, such tasks were addressed separately or through employing generative models. This paper proposes employing a unified and discriminative model for query refinement. Specifically, it proposes a Conditional Random Field (CRF) model suitable for the problem, referred to as Conditional Random Field for Query Refinement (CRF-QR). Given a sequence of query words, CRF-QR predicts a sequence of refined query words as well as the corresponding refinement operations. In that sense, CRF-QR differs greatly from conventional CRF models. Two types of CRF-QR models, namely a basic model and an extended model, are introduced. One merit of employing CRF-QR is that different refinement tasks can be performed simultaneously, and thus the accuracy of refinement can be enhanced. Furthermore, the advantages of discriminative models over generative models can be fully leveraged. Experimental results demonstrate that CRF-QR can significantly outperform baseline methods. Furthermore, when CRF-QR is used in web search, a significant improvement in relevance can be obtained.
Jiafeng Guo, Gu Xu, Hang Li, Xueqi Cheng
Scalable Gaussian Process Structured Prediction for Grid Factor Graph Applications
Abstract: Structured prediction is an important and well-studied problem with many applications across machine learning. GPstruct is a recently proposed structured prediction model that offers appealing properties such as being kernelised, non-parametric, and supporting Bayesian inference (Bratières et al., 2013). The model places a Gaussian process prior over energy functions which describe relationships between input variables and structured output variables. However, the memory demand of GPstruct is quadratic in the number of latent variables, and training runtime scales cubically. This prevents GPstruct from being applied to problems involving grid factor graphs, which are prevalent in computer vision and spatial statistics applications. Here we explore a scalable approach to learning GPstruct models based on ensemble learning, with weak learners (predictors) trained on subsets of the latent variables and bootstrap data, which can easily be distributed. We show experiments with 4M latent variables on image segmentation. Our method outperforms widely used conditional random field models trained with pseudo-likelihood. Moreover, in image segmentation problems it improves over recent state-of-the-art marginal optimisation methods in terms of predictive performance and uncertainty calibration. Finally, it generalises well on all training set sizes. Proceedings of the 31st International Conference on Machine Learning, Beijing, China, 2014. JMLR: W&CP volume 32.
Sebastien Bratieres Novi Quadrianto Sebastian Nowozin Zoubin Ghahramani
Cost-benefit Analysis of Two-Stage Conditional Random Fields based English-to-Chinese Machine Transliteration
Abstract: This work presents an English-to-Chinese (E2C) machine transliteration system based on two-stage conditional random fields (CRF) models, with accessor variety (AV) as an additional feature to approximate the local context of the source language. Experimental results show that the two-stage CRF method outperforms its one-stage counterpart, since the former costs less to encode more features and finer-grained labels than the latter.
Chan-Hung Kuo Shih-Hung Liu Mike Tian-Jian Jiang Cheng-Wei Lee Wen-Lian Hsu
Sentence Dependency Tagging in Online Question Answering Forums
Abstract: Online forums are becoming a popular resource in state-of-the-art question answering (QA) systems. Because of their nature as online communities, they contain more up-to-date knowledge than other sources. However, going through tedious and redundant posts to look for answers can be very time-consuming. Most prior work focused on extracting only question-answering sentences from user conversations. In this paper, we introduce the task of sentence dependency tagging. Finding the dependency structure can not only help find answers quickly but also allow users to trace back how an answer is concluded through user conversations. We use linear-chain conditional random fields (CRFs) for sentence type tagging, and a 2D CRF to label the dependency relations between sentences. Our experimental results show that our proposed approach performs well for sentence dependency tagging. This dependency information can benefit other tasks such as thread ranking and answer summarization in online forums.
Zhonghua Qu Yang Liu
Hidden-Unit Conditional Random Fields
Abstract: The paper explores a generalization of conditional random fields (CRFs) in which binary stochastic hidden units appear between the data and the labels. Hidden-unit CRFs are potentially more powerful than standard CRFs because they can represent nonlinear dependencies at each frame. The hidden units in these models also learn to discover latent distributed structure in the data that improves classification. We derive efficient algorithms for inference and learning in these models by observing that the hidden units are conditionally independent given the data and the labels. Finally, we show that hidden-unit CRFs perform well in experiments on a range of tasks, including optical character recognition, text classification, protein structure prediction, and part-of-speech tagging.
(no authors)
Learning a Two-Stage SVM/CRF Sequence Classifier
Abstract: Learning a sequence classifier means learning to predict a sequence of output tags based on a set of input data items. For example, recognizing that a handwritten word is "cat", based on three images of handwritten letters and on general knowledge of English letter combinations, is a sequence classification task. This paper describes a new two-stage approach to learning a sequence classifier that is (i) highly accurate, (ii) scalable, and (iii) easy to use in data mining applications. The two-stage approach combines support vector machines (SVMs) and conditional random fields (CRFs). It is (i) highly accurate because it benefits from the maximum-margin nature of SVMs and also from the ability of CRFs to model correlations between neighboring output tags. It is (ii) scalable because the input to each SVM is a small training set, and the input to the CRF has a small number of features, namely the SVM outputs. It is (iii) easy to use because it combines existing published software in a straightforward way. In detailed experiments on the task of recognizing handwritten words, we show that the two-stage approach is more accurate, or faster and more scalable, or both, than other leading methods for learning sequence classifiers, including max-margin Markov networks (M3Ns) and standard CRFs.
Guilherme Hoefel Charles Elkan
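Stage two of the pipeline the abstract describes, combining per-position classifier scores with transition scores, amounts to Viterbi decoding; a minimal sketch in which hand-written emission scores stand in for the SVM outputs (all numbers are toy values):

```python
def viterbi(emissions, transitions, labels):
    """Decode the best tag path from per-position label scores
    (stage-one classifier outputs in the two-stage scheme) plus
    pairwise transition scores, as a linear-chain model would."""
    V = [{y: emissions[0][y] for y in labels}]  # best score ending in y
    back = []                                   # backpointers per step
    for t in range(1, len(emissions)):
        scores, ptrs = {}, {}
        for y in labels:
            prev = max(labels, key=lambda yp: V[-1][yp] + transitions[(yp, y)])
            scores[y] = V[-1][prev] + transitions[(prev, y)] + emissions[t][y]
            ptrs[y] = prev
        V.append(scores)
        back.append(ptrs)
    best = max(labels, key=lambda y: V[-1][y])
    path = [best]
    for ptrs in reversed(back):
        path.append(ptrs[path[-1]])
    return path[::-1]

emissions = [{"A": 1.0, "B": 0.2}, {"A": 0.3, "B": 0.9}, {"A": 0.1, "B": 0.8}]
transitions = {("A", "A"): 0.1, ("A", "B"): 0.4, ("B", "A"): 0.0, ("B", "B"): 0.5}
path = viterbi(emissions, transitions, ["A", "B"])  # -> ["A", "B", "B"]
```

Because the CRF stage only sees a handful of score features per position, it stays cheap regardless of how rich the stage-one feature space is, which is the scalability argument the abstract makes.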
Cocktail Party Processing via Structured Prediction
Abstract: While human listeners excel at selectively attending to a conversation in a cocktail party, machine performance is still far inferior by comparison. We show that the cocktail party problem, or the speech separation problem, can be effectively approached via structured prediction. To account for temporal dynamics in speech, we employ conditional random fields (CRFs) to classify speech dominance within each time-frequency unit of a sound mixture. To capture the complex, nonlinear relationship between input and output, both state and transition feature functions in the CRFs are learned by deep neural networks. The formulation of the problem as classification allows us to directly optimize a measure that is well correlated with human speech intelligibility. The proposed system substantially outperforms existing ones in a variety of noise conditions.
Yuxuan Wang, DeLiang Wang
Forward-backward Machine Transliteration between English and Chinese Based on Combined CRFs
Abstract: The paper proposes a forward-backward transliteration system between English and Chinese for the shared task of NEWS2011. Combined recognizers based on Conditional Random Fields (CRFs) are applied to transliterating between the source and target languages. The huge number of features and the long training time are the motivations for decomposing the task into several recognizers. To prepare the training data, segmentation and alignment are carried out in terms of not only syllables and single Chinese characters, as was the case previously, but also phoneme strings and corresponding character strings. For transliterating from English to Chinese, our combined system achieved a top-1 accuracy of 0.312, compared with the best performance in NEWS2011, which was 0.348. For backward transliteration, our system achieved a top-1 accuracy of 0.167, which is better than the others in NEWS2011.
Ying Qin Guohua Chen
Institute of Computer Science and Technology, Peking University, China, 100871
Abstract: This paper describes our experiments on the cross-domain Chinese word segmentation task at the first CIPS-SIGHAN Joint Conference on Chinese Language Processing. Our system is based on the Conditional Random Fields (CRFs) model. Considering the particular properties of the out-of-domain data, we propose some novel steps to achieve improvements on this special task.
(no authors)
Evidence-Specific Structures for Rich Tractable CRFs
Abstract: We present a simple and effective approach to learning tractable conditional random fields with structure that depends on the evidence. Our approach retains the advantages of tractable discriminative models, namely efficient exact inference and arbitrarily accurate parameter learning in polynomial time. At the same time, our algorithm does not suffer the large expressive-power penalty inherent to fixed tractable structures. On real-life relational datasets, our approach matches or exceeds the state-of-the-art accuracy of dense models, and at the same time provides an order-of-magnitude speedup.
Anton Chechetka Carlos Guestrin

