News | About | Download | Usage | Questions | Mailing lists | Release history
December 9, 2015: The deterministic coreference resolution system is still supported in StanfordCoreNLP via the annotator dcoref. However, we have now added new, better-performing statistical and neural coreference systems written by Kevin Clark, which are invoked by default or explicitly using the annotator coref. See the CorefAnnotator documentation.
May 7, 2013: Recent improvements to the Stanford Deterministic Coreference Resolution System (Recasens et al., below) won the best short paper award at NAACL 2013.
June 30, 2011: This system was the top ranked system at the CoNLL-2011 shared task.
This system implements the multi-pass sieve coreference resolution (or anaphora resolution) system described in Lee et al. (CoNLL Shared Task 2011) and Raghunathan et al. (EMNLP 2010).
The scores obtained are higher than those reported in the EMNLP 2010 paper because of additional sieves and better rules (see Lee et al. 2011 for details). Mention detection is included in the package (see Usage for instructions).
The Computational Linguistics paper includes more details and additional experimental results.
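At a high level, the sieves are a sequence of deterministic rule sets applied from highest to lowest precision; each sieve may merge entire mention clusters produced by earlier passes, so later, lower-precision sieves can exploit the richer merged entities (the "entity-centric, precision-ranked" idea of the papers below). The following is only an illustrative sketch with hypothetical stand-in types (MentionCluster, Sieve, MultiPassSieve); it is not the actual edu.stanford.nlp.dcoref code.

import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in types for illustration only.
class MentionCluster {
  final List<String> mentions = new ArrayList<>();
  void absorb(MentionCluster other) {        // entity-centric: merge whole clusters, not single mentions
    mentions.addAll(other.mentions);
    other.mentions.clear();
  }
  boolean isEmpty() { return mentions.isEmpty(); }
}

interface Sieve {
  // True if this sieve's deterministic rules link the anaphoric cluster to the antecedent cluster.
  boolean corefer(MentionCluster antecedent, MentionCluster anaphor);
}

class MultiPassSieve {
  private final List<Sieve> sieves;          // ordered from highest to lowest precision

  MultiPassSieve(List<Sieve> sieves) { this.sieves = sieves; }

  void resolve(List<MentionCluster> clusters) {
    for (Sieve sieve : sieves) {                      // one pass per sieve
      for (int i = 0; i < clusters.size(); i++) {     // candidate anaphoric cluster
        for (int j = 0; j < i; j++) {                 // earlier clusters as candidate antecedents
          MentionCluster anaphor = clusters.get(i);
          MentionCluster antecedent = clusters.get(j);
          if (!anaphor.isEmpty() && !antecedent.isEmpty()
              && sieve.corefer(antecedent, anaphor)) {
            antecedent.absorb(anaphor);               // later sieves see the merged entity
            break;                                    // take the first antecedent this sieve licenses
          }
        }
      }
    }
  }
}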
The papers to cite for this system are as follows:
Marta Recasens, Marie-Catherine de Marneffe, and Christopher Potts.
The Life and Death of Discourse Entities: Identifying Singleton Mentions.
In Proceedings of NAACL 2013.
Heeyoung Lee, Angel Chang, Yves Peirsman, Nathanael Chambers, Mihai Surdeanu and Dan Jurafsky.
Deterministic coreference resolution based on entity-centric, precision-ranked rules.
Computational Linguistics 39(4), 2013.
Heeyoung Lee, Yves Peirsman, Angel Chang, Nathanael Chambers, Mihai Surdeanu, and Dan Jurafsky.
Stanford's Multi-Pass Sieve Coreference Resolution System at the CoNLL-2011 Shared Task.
In Proceedings of the CoNLL-2011 Shared Task, 2011.
Karthik Raghunathan, Heeyoung Lee, Sudarshan Rangarajan, Nathanael Chambers, Mihai Surdeanu, Dan Jurafsky, and Christopher Manning.
A Multi-Pass Sieve for Coreference Resolution.
In Proceedings of EMNLP 2010, Boston, USA, 2010.
The scores of the dcoref code in v3.6.0 (CoNLL 2011 shared task winner descendant) on the CoNLL 2011 Shared Task dev data set, measured on 2016/02/07 using the v4 scorer (used for the 2011 evaluation).
---------------------------------------------------------------------------------------------------------------
                 |       MUC      |     B cubed    |    CEAF (M)    |    CEAF (E)    |      BLANC     |
                 |  P    R    F1  |  P    R    F1  |  P    R    F1  |  P    R    F1  |  P    R    F1  | Avg F1
---------------------------------------------------------------------------------------------------------------
conllst2011 dev  | 62.1 59.3 60.7 | 74.2 67.7 70.8 | 59.4 59.4 59.4 | 46.1 48.9 47.5 | 79.6 72.4 75.4 |  59.56
---------------------------------------------------------------------------------------------------------------
* Automatic mention detection used. Avg F1 = (MUC + B cubed + CEAFE)/3.
The scores of the dcoref code in v3.6.0 (CoNLL 2011 shared task winner descendant) on the CoNLL 2011/2012 Shared Task dev data sets, measured on 2016/02/07 using the v8.01 scorer (current in 2016).
---------------------------------------------------------------------------------------------------------------
                 |       MUC      |     B cubed    |    CEAF (M)    |    CEAF (E)    |      BLANC     |
                 |  P    R    F1  |  P    R    F1  |  P    R    F1  |  P    R    F1  |  P    R    F1  | Avg F1
---------------------------------------------------------------------------------------------------------------
conllst2011 dev  | 62.1 59.3 60.7 | 56.2 48.6 52.1 | 58.0 57.5 57.8 | 48.9 53.5 51.1 | 54.1 47.2 50.1 |  54.62
conllst2012 dev  | 65.9 64.1 65.0 | 58.7 50.9 54.5 | 59.2 59.6 59.4 | 48.6 54.3 51.3 | 59.5 53.7 56.1 |  56.92
---------------------------------------------------------------------------------------------------------------
* Automatic mention detection used. Avg F1 = (MUC + B cubed + CEAFE)/3.
The coreference resolution system is integrated in the Stanford suite of NLP tools, StanfordCoreNLP. Please download the entire suite from this page.
Running coreference resolution on raw text
This software is now fully incorporated in StanfordCoreNLP, so all you have to do is add the dcoref annotator to the "annotators" property in StanfordCoreNLP. For example, add "dcoref" to the end of the list of text annotators:
annotators = tokenize, ssplit, pos, lemma, ner, parse, dcoref

The properties you can set for the dcoref system itself are the following:
dcoref.demonym                        // The path for a file that includes a list of demonyms
dcoref.animate                        // The list of animate/inanimate mentions (Ji and Lin, 2009)
dcoref.inanimate
dcoref.male                           // The list of male/neutral/female mentions (Bergsma and Lin, 2006)
dcoref.neutral                        // Neutral means a mention that is usually referred to by 'it'
dcoref.female
dcoref.plural                         // The list of plural/singular mentions (Bergsma and Lin, 2006)
dcoref.singular
                                      // The above 8 options do not have to be set; the default models in the StanfordCoreNLP package are used if unspecified.

dcoref.score = false                  // Score the output of the system
dcoref.postprocessing = false         // Do post processing
dcoref.maxdist = -1                   // Maximum sentence distance between two mentions for resolution (-1: no constraint on the distance)
dcoref.use.big.gender.number = false  // Load a big list of gender and number information
dcoref.replicate.conll = false        // Turn this on to replicate the CoNLL shared task results
                                      // If the above 5 options are omitted, the default values shown here are used.

sievePasses                           // Sieve passes - each class is defined in dcoref/sievepasses/
                                      // If omitted, the default sieves are used (recommended).

See StanfordCoreNLP for more details.
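For concreteness, here is a minimal end-to-end sketch of running the dcoref annotator from Java and reading out the coreference chains. The class name DcorefDemo and the sample sentence are our own; the annotator list mirrors the configuration above, and CorefChain / CorefCoreAnnotations.CorefChainAnnotation are the classes under edu.stanford.nlp.dcoref in the CoreNLP distribution.

import java.util.Map;
import java.util.Properties;

import edu.stanford.nlp.dcoref.CorefChain;
import edu.stanford.nlp.dcoref.CorefCoreAnnotations.CorefChainAnnotation;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;

public class DcorefDemo {
  public static void main(String[] args) {
    Properties props = new Properties();
    // dcoref requires the annotators listed before it, in this order
    props.setProperty("annotators", "tokenize, ssplit, pos, lemma, ner, parse, dcoref");
    // Optional dcoref.* settings from the list above can be set the same way, e.g.:
    // props.setProperty("dcoref.maxdist", "-1");
    StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

    Annotation document = new Annotation("Barack Obama was born in Hawaii. He is married to Michelle Obama.");
    pipeline.annotate(document);

    // Each CorefChain groups the mentions that the system decided refer to the same entity.
    Map<Integer, CorefChain> chains = document.get(CorefChainAnnotation.class);
    for (CorefChain chain : chains.values()) {
      System.out.println(chain);
    }
  }
}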
How to replicate the results in our CoNLL Shared Task 2011 paper
To replicate the results in the paper, run:

java -cp <jars_in_corenlp> -Xmx8g edu.stanford.nlp.dcoref.SieveCoreferenceSystem -props <properties file>

A sample properties file (coref.properties) is included in the dcoref package. The properties file includes the following:
# annotators needed for coreference resolution
annotators = pos, lemma, ner, parse

# Score the output of the system.
# Scores in the log file differ from the CoNLL scorer output because they are computed before post processing.
dcoref.score = true

# Do post processing
dcoref.postprocessing = true

# Maximum sentence distance between two mentions for resolution (-1: no constraint on the distance)
dcoref.maxdist = -1

# Load a big list of gender and number information
dcoref.use.big.gender.number = true
# Older CoreNLP versions loaded a huge text file; newer versions load a serialized map
# dcoref.big.gender.number = edu/stanford/nlp/models/dcoref/gender.data.gz
dcoref.big.gender.number = edu/stanford/nlp/models/dcoref/gender.map.ser.gz

# Turn this on to replicate the CoNLL shared task results
dcoref.replicate.conll = true

# Path for the official CoNLL 2011 scorer script; if omitted, no scoring is done
dcoref.conll.scorer = /PATH/FOR/SCORER

# Path for the log file for coref system evaluation
dcoref.logFile = /PATH/FOR/LOGS

# For scoring on other corpora, one of the following options can be set:
# dcoref.conll2011: path for the directory containing CoNLL shared task files
# dcoref.ace2004: path for the directory containing ACE2004 files
# dcoref.mucfile: path for the MUC file
dcoref.conll2011 = /PATH/FOR/CORPUS

This system can process ACE2004, MUC6, and CoNLL Shared Task 2011 corpora in their original formats. Examples from the corpora are given here:
CoNLLst 2011:
nw/wsj/00/wsj_0020 0 0 The DT (TOP_(S_(NP_* - - - - * * (ARG0* * * * (11
nw/wsj/00/wsj_0020 0 1 U.S. NNP *) - - - - (GPE) * *) * * * 11)
nw/wsj/00/wsj_0020 0 2 , , * - - - - * * * * * * -
nw/wsj/00/wsj_0020 0 3 claiming VBG (S_(VP_* claim 01 2 - * (V*) (ARGM-ADV* * * * -
MUC6:
... <s> By/IN proposing/VBG <COREF ID="13" TYPE="IDENT" REF="6" MIN="date"> a/DT meeting/NN date/NN</COREF> ,/, <COREF ID="14" TYPE="IDENT" REF="0"> <ORGANIZATION> Eastern/NNP</ORGANIZATION></COREF> moved/VBD one/CD step/NN closer/JJR toward/IN reopening/VBG current/JJ high-cost/JJ contract/NN agreements/NNS with/IN <COREF ID="15" TYPE="IDENT" REF="8" MIN="unions"><COREF ID="16" TYPE="IDENT" REF="14"> its/PRP$</COREF> unions/NNS</COREF> ./. </s> ...

ACE2004:
...
<document DOCID="20001115_AFP_ARB.0212.eng">
  <entity ID="20001115_AFP_ARB.0212.eng-E1" TYPE="ORG" SUBTYPE="Educational" CLASS="SPC">
    <entity_mention ID="1-47" TYPE="NAM" LDCTYPE="NAM">
      <extent>
        <charseq START="475" END="506">the Globalization Studies Center</charseq>
      </extent>
      <head>
        <charseq START="479" END="506">Globalization Studies Center</charseq>
      </head>
    </entity_mention>
...
If you have issues getting this to work, you may need to rename some documents: IDs in the /tc/ part of the data differ between the test set and the test key (e.g., ch_0005 from the test set is named ch_0049 in the test key). To apply the renaming to a file res, run:

sed -i s/ch_0001/ch_0009/g res
sed -i s/ch_0002/ch_0019/g res
sed -i s/ch_0003/ch_0029/g res
sed -i s/ch_0004/ch_0039/g res
sed -i s/ch_0005/ch_0049/g res
How to run Chinese Coreference
Support for Chinese coreference was added in CoreNLP version 3.5.2. It can be run from the Java API as follows:
String text = ...;
String[] args = new String[]{
    "-props", "edu/stanford/nlp/hcoref/properties/zh-dcoref-default.properties"
};

Annotation document = new Annotation(text);
Properties props = StringUtils.argsToProperties(args);
StanfordCoreNLP corenlp = new StanfordCoreNLP(props);
corenlp.annotate(document);

HybridCorefAnnotator hcoref = new HybridCorefAnnotator(props);
hcoref.annotate(document);

Map<Integer, CorefChain> corefChain = document.get(CorefChainAnnotation.class);
System.out.println(corefChain);
// Note that you have to replace the following properties file with your own.
// To do so, copy the following file, replace the # Evaluation section with
// your own paths, and refer to it in args.
String[] args = new String[]{
    "-props", "edu/stanford/nlp/hcoref/properties/zh-dcoref-conll.properties"
};
edu.stanford.nlp.hcoref.CorefSystem.main(args);
Questions, feedback, and bug reports/fixes can be sent to our mailing lists.
We have 3 mailing lists for the Stanford Coreference Resolution System, all of which are shared with other JavaNLP tools (with the exclusion of the parser). Each address is at @lists.stanford.edu:
java-nlp-user
This is the best list to post to in order to ask questions, make announcements, or for discussion among JavaNLP users. You have to subscribe to be able to use it. Join the list via this webpage or by emailing java-nlp-user-join@lists.stanford.edu. (Leave the subject and message body empty.) You can also look at the list archives.
java-nlp-announce
This list will be used only to announce new versions of Stanford JavaNLP tools, so it will be very low volume (expect 1-3 messages a year). Join the list via this webpage or by emailing java-nlp-announce-join@lists.stanford.edu. (Leave the subject and message body empty.)
java-nlp-support
This list goes only to the software maintainers. It's a good address for licensing questions, etc. For general use and support questions, you're better off joining and using java-nlp-user. You cannot join java-nlp-support, but you can mail questions to java-nlp-support@lists.stanford.edu.
Version 3.6.0 - February 7, 2016
The scores of the dcoref code in v3.6.0 (CoNLL 2011 shared task winner descendant) on the CoNLL 2011 Shared Task dev data set, measured on 2016/02/07 using the v4 scorer (used for the 2011 evaluation).
---------------------------------------------------------------------------------------------------------------
                 |       MUC      |     B cubed    |    CEAF (M)    |    CEAF (E)    |      BLANC     |
                 |  P    R    F1  |  P    R    F1  |  P    R    F1  |  P    R    F1  |  P    R    F1  | Avg F1
---------------------------------------------------------------------------------------------------------------
conllst2011 dev  | 62.1 59.3 60.7 | 74.2 67.7 70.8 | 59.4 59.4 59.4 | 46.1 48.9 47.5 | 79.6 72.4 75.4 |  59.56
---------------------------------------------------------------------------------------------------------------
* Automatic mention detection used. Avg F1 = (MUC + B cubed + CEAFE)/3.
The scores of the dcoref code in v3.6.0 (CoNLL 2011 shared task winner descendant) on the CoNLL 2011/2012 Shared Task dev data sets, measured on 2016/02/07 using the v8.01 scorer (current in 2016).
---------------------------------------------------------------------------------------------------------------
                 |       MUC      |     B cubed    |    CEAF (M)    |    CEAF (E)    |      BLANC     |
                 |  P    R    F1  |  P    R    F1  |  P    R    F1  |  P    R    F1  |  P    R    F1  | Avg F1
---------------------------------------------------------------------------------------------------------------
conllst2011 dev  | 62.1 59.3 60.7 | 56.2 48.6 52.1 | 58.0 57.5 57.8 | 48.9 53.5 51.1 | 54.1 47.2 50.1 |  54.62
conllst2012 dev  | 65.9 64.1 65.0 | 58.7 50.9 54.5 | 59.2 59.6 59.4 | 48.6 54.3 51.3 | 59.5 53.7 56.1 |  56.92
---------------------------------------------------------------------------------------------------------------
* Automatic mention detection used. Avg F1 = (MUC + B cubed + CEAFE)/3.
July 9, 2013
Singleton mention detection (Recasens et al. 2013) is integrated. Scores may differ slightly due to changes in the parser or NER.
---------------------------------------------------------------------------------------------------------------
                 |       MUC      |     B cubed    |    CEAF (M)    |    CEAF (E)    |      BLANC     |
                 |  P    R    F1  |  P    R    F1  |  P    R    F1  |  P    R    F1  |  P    R    F1  | Avg F1
---------------------------------------------------------------------------------------------------------------
conllst2011 dev  | 62.4 59.3 60.8 | 74.2 67.6 70.8 | 59.3 59.3 59.3 | 45.5 48.6 47.0 | 79.1 72.5 75.3 |  59.5
---------------------------------------------------------------------------------------------------------------
* Automatic mention detection used. Avg F1 = (MUC + B cubed + CEAFE)/3.
June 6, 2011
This release contains the code used for the CoNLL Shared Task 2011. Scores may differ slightly due to changes in the parser or NER.
---------------------------------------------------------------------------------------------------------------------------
                   | conllst |       MUC      |     B cubed    |    CEAF (M)    |    CEAF (E)    |      BLANC     |
                   |  track  |  P    R    F1  |  P    R    F1  |  P    R    F1  |  P    R    F1  |  P    R    F1  | Avg F1
---------------------------------------------------------------------------------------------------------------------------
conllst2011 dev    |  close  | 59.1 57.5 58.3 | 69.2 71.0 70.1 | 58.6 58.6 58.6 | 46.5 48.1 47.3 | 72.2 78.1 74.8 |  58.6
conllst2011 dev    |  open   | 60.1 59.5 59.8 | 69.5 71.9 70.7 | 59.0 59.0 59.0 | 46.5 47.1 46.8 | 73.8 78.6 76.0 |  59.1
conllst2011 test   |  close  | 57.5 61.8 59.6 | 68.2 68.4 68.3 | 56.4 56.4 56.4 | 47.8 43.4 45.5 | 76.2 70.6 73.0 |  57.8
conllst2011 test   |  open   | 59.3 62.8 61.0 | 69.0 68.9 68.9 | 56.7 56.7 56.7 | 46.8 43.3 45.0 | 76.6 71.9 74.0 |  58.3
---------------------------------------------------------------------------------------------------------------------------
* Automatic mention detection used. Avg F1 = (MUC + B cubed + CEAFE)/3.

-------------------------------------------------------------------
                |       MUC      |     B cubed    |    Pairwise
                |  P    R    F1  |  P    R    F1  |  P    R    F1
-------------------------------------------------------------------
ACE2004 dev     | 86.0 75.5 80.4 | 89.3 76.5 82.4 | 81.7 55.2 65.9
ACE2004 test    | 82.7 70.2 75.9 | 88.7 74.5 81.0 | 77.2 44.6 56.6
ACE2004 nwire   | 84.6 75.1 79.6 | 87.3 74.1 80.2 | 79.4 50.1 61.4
MUC6 test       | 90.6 69.1 78.4 | 90.6 63.1 74.4 | 89.7 57.0 69.7
-------------------------------------------------------------------
* Gold mentions are used.
August 26, 2010
This release is generally similar to the code used for EMNLP 2010, with one additional sieve: relaxed exact string match.
-------------------------------------------------------------------
                |       MUC      |     B cubed    |    Pairwise
                |  P    R    F1  |  P    R    F1  |  P    R    F1
-------------------------------------------------------------------
ACE2004 dev     | 84.1 73.9 78.7 | 88.3 74.2 80.7 | 80.0 51.0 62.3
ACE2004 test    | 80.5 72.4 76.2 | 85.4 75.9 80.4 | 68.7 47.9 56.4
ACE2004 nwire   | 83.8 72.8 77.9 | 87.5 72.1 79.0 | 79.3 47.6 59.5
MUC6 test       | 90.3 68.9 78.2 | 90.5 62.3 73.8 | 89.4 55.5 68.5
-------------------------------------------------------------------