Ryan McDonald
Contact
Google Research
76 Ninth Ave, New York, NY 10011
Website: http://www.ryanmcd.com

My Email

Research Interests
  • Parsing techniques and complexity for various linguistic formalisms
  • Information extraction
    • Opinion extraction and summarization
    • Biomedical IE (see BioTagger used in FABLE)
  • Machine learning for structured and large-scale NLP
  • Algorithms for natural language processing
  • Domain adaptation in NLP
  • Organizations of interest: ACL, SIGDAT, SIGNLL, SIGPARSE, NIPS
  • CV [PDF]

Software
Note: I try to answer emails about my software, particularly MSTParser. However, time constraints limit the number of requests I can handle.
  • Implementation of the parsers described in my ACL '05 and HLT-EMNLP '05 papers.
  • General online structured learning package.

  • Conditional Random Field Biomedical Entity Tagger (Genes, Variations and Malignancies). This tagger (or a variant of it) forms part of the core technology of the following resources:
    • FABLE biomedical literature search engine
    • PlasmoDB relevant literature search

  • In the past I have been known to contribute to MALLET,
    which is a general implementation of Conditional Random Fields
    and other learning algorithms tailored specifically to language.
    Andrew McCallum is the primary author and maintainer of the package,
    which is available at http://mallet.cs.umass.edu

Teaching
  • Generalized Linear Classifiers in NLP
    Fall 2009
    Guest lecture for Machine Learning at the GSLT
    Course materials: [HTML]

  • Generalized Linear Classifiers in NLP
    Fall 2007
    Guest lecture for Machine Learning at the GSLT
    Course materials: [HTML]

  • Data-Driven Dependency Parsing
    Summer 2007
    At ESSLLI 2007
    Course materials: [HTML]

  • Probabilistic Parsing
    Spring 2005
    Guest lectures for CIS 620 at the University of Pennsylvania
    Course materials: [PDF]

Thesis
  • Discriminative Training and Spanning Tree Algorithms for Dependency Parsing
    University of Pennsylvania, July 2006.
    Supervisor: Fernando Pereira
    Committee: Eugene Charniak (external), Aravind Joshi, Mark Liberman, and Mitch Marcus (chair)
    [PDF]

2010
  • Evaluation of Dependency Parsers on Unbounded Dependencies
    J. Nivre, L. Rimell, R. McDonald and C. Gómez Rodríguez
    International Conference on Computational Linguistics (COLING), 2010.
    [PDF]

  • Learning to Classify the Scope of Negation for Improved Sentiment Analysis
    I. Councill, R. McDonald and L. Velikovich
    Negation and Speculation in Natural Language Processing (NeSp-NLP), 2010
    [PDF]

  • Distributed Training Strategies for the Structured Perceptron
    R. McDonald, K. Hall and G. Mann
    North American Chapter of the Association for Computational Linguistics (NAACL), 2010.
    [PDF]

  • The Viability of Web-derived Polarity Lexicons
    L. Velikovich, S. Blair-Goldensohn, K. Hannan and R. McDonald
    North American Chapter of the Association for Computational Linguistics (NAACL), 2010.
    [PDF]

2009
  • Dependency Parsing
    S. Kübler, R. McDonald and J. Nivre
    Synthesis Lectures on Human Language Technologies, G. Hirst (ed.)
    Morgan & Claypool Publishers
    [e-print] [Amazon] [Google Books]

  • Efficient Large-Scale Distributed Training of Conditional Maximum Entropy Models
    G. Mann, R. McDonald, M. Mohri, N. Silberman, and D. Walker
    Neural Information Processing Systems (NIPS), 2009.
    [PDF]

  • Sentiment Summarization: Evaluating and Learning User Preferences
    K. Lerman, S. Blair-Goldensohn and R. McDonald
    European Chapter of the Association for Computational Linguistics (EACL), 2009.
    [PDF]

  • Contrastive Summarization: An Experiment with Consumer Reviews
    K. Lerman and R. McDonald
    North American Chapter of the Association for Computational Linguistics (NAACL), 2009.
    [PDF]

2008
  • A Joint Model of Text and Aspect Ratings for Sentiment Summarization
    I. Titov and R. McDonald
    Association for Computational Linguistics (ACL), 2008.
    [PDF]

  • Integrating Graph-based and Transition-based Dependency Parsers
    J. Nivre and R. McDonald
    Association for Computational Linguistics (ACL), 2008.
    [PDF]

  • Building a Sentiment Summarizer for Local Service Reviews
    S. Blair-Goldensohn, K. Hannan, R. McDonald, T. Neylon, G. Reis, and J. Reynar
    WWW Workshop on NLP in the Information Explosion Era (NLPIX), 2008.
    [PDF]

  • Modeling Online Reviews with Multi-Grain Topic Models
    I. Titov and R. McDonald
    International World Wide Web Conference (WWW), 2008.
    [PDF]

2007
  • The CoNLL 2007 Shared Task on Dependency Parsing
    J. Nivre, J. Hall, S. Kübler, R. McDonald, J. Nilsson, S. Riedel, and D. Yuret
    Empirical Methods in Natural Language Processing and Natural Language Learning (EMNLP-CoNLL), 2007.
    [PDF]

  • Characterizing the Errors of Data-Driven Dependency Parsing Models
    R. McDonald and J. Nivre
    Empirical Methods in Natural Language Processing and Natural Language Learning (EMNLP-CoNLL), 2007.
    [PDF]

  • On the Complexity of Non-Projective Data-Driven Dependency Parsing
    R. McDonald and G. Satta
    International Conference on Parsing Technologies (IWPT), 2007.
    [PDF]

  • Structured Models for Fine-to-Coarse Sentiment Analysis
    R. McDonald, K. Hannan, T. Neylon, M. Wells, and J. Reynar
    Association for Computational Linguistics (ACL), 2007
    [PDF]

  • A Study of Global Inference Algorithms in Multi-Document Summarization
    R. McDonald
    European Conference on Information Retrieval (ECIR), 2007
    [PDF]

2006
  • Automated recognition of malignancy mentions in biomedical literature
    Y. Jin, R. McDonald, K. Lerman, M. Mandel, S. Carroll, M. Liberman, F. Pereira, R. S. Winters and P. S. White
    BMC Bioinformatics 2006, 7:492
    [PDF]

  • Domain Adaptation with Structural Correspondence Learning
    J. Blitzer, R. McDonald and F. Pereira
    Empirical Methods in Natural Language Processing (EMNLP), 2006
    [PDF]

  • Multilingual Dependency Analysis with a Two-Stage Discriminative Parser
    R. McDonald, K. Lerman and F. Pereira
    Conference on Natural Language Learning (CoNLL), 2006
    [PDF]

  • An automated procedure to identify biomedical articles that contain cancer-associated gene variants
    Ryan McDonald, R. Scott Winters, Claire K. Ankuda, Joan A. Murphy, Amy E. Rogers, Fernando Pereira, Marc S. Greenblatt, Peter S. White
    Human Mutation, Volume 27 Issue 9, 2006
    [PDF]

  • Online Learning of Approximate Dependency Parsing Algorithms
    R. McDonald and F. Pereira
    European Chapter of the Association for Computational Linguistics (EACL), 2006
    [PDF]

  • Discriminative Sentence Compression with Soft Syntactic Constraints
    R. McDonald
    European Chapter of the Association for Computational Linguistics (EACL), 2006
    [PDF]

2005
  • Non-Projective Dependency Parsing using Spanning Tree Algorithms
    R. McDonald, F. Pereira, K. Ribarov and J. Hajič
    Human Language Technologies and Empirical Methods in Natural Language Processing (HLT-EMNLP), 2005
    [Best Student Paper Award]
    [PDF]

  • Flexible Text Segmentation with Structured Multilabel Classification
    R. McDonald, K. Crammer and F. Pereira
    Human Language Technologies and Empirical Methods in Natural Language Processing (HLT-EMNLP), 2005
    [PDF]

  • Simple Algorithms for Complex Relation Extraction with Applications to Biomedical IE
    R. McDonald, F. Pereira, S. Kulick, S. Winters, Y. Jin and P. White
    Association for Computational Linguistics (ACL), 2005
    [PDF]

  • Online Large-Margin Training of Dependency Parsers
    Ryan McDonald, Koby Crammer and Fernando Pereira
    Association for Computational Linguistics (ACL), 2005
    [PDF], or a more detailed and updated Tech Report

  • Identifying and Extracting Malignancy Types in Cancer Literature
    Y. Jin, R. McDonald, K. Lerman, M. Mandel, M. Liberman, F. Pereira, R.S. Winters and P.S. White
    Linking Literature, Information and Knowledge for Biology (BioLink), 2005
    [PDF]

  • Automatically annotating documents with normalized gene lists
    Jay Crim, Ryan McDonald and Fernando Pereira
    BMC Bioinformatics 2005, 6(Suppl 1):S13
    [PDF]

  • Identifying gene and protein mentions in text using conditional random fields
    Ryan McDonald and Fernando Pereira
    BMC Bioinformatics 2005, 6(Suppl 1):S6
    [PDF]

2004
  • An entity tagger for recognizing acquired genomic variations in cancer literature
    R. McDonald, R.S. Winters, M. Mandel, Y. Jin, P.S. White and F. Pereira
    Bioinformatics, 2004.
    [PDF]

  • Integrated Annotation for Biomedical Information Extraction
    S. Kulick, A. Bies, M. Liberman, M. Mandel, R. McDonald, M. Palmer, A. Schein, L. Ungar, S. Winters and P. White
    Linking Biological Literature, Ontologies and Databases (BioLink), 2004
    [PDF]

  • New Large Margin Algorithms for Structure Prediction
    Koby Crammer, Ryan McDonald and Fernando Pereira
    NIPS Workshop on Learning With Structured Outputs, 2004

  • Scalable Large Margin Online Learning Algorithms for Structured Classification
    Ryan McDonald, Koby Crammer and Fernando Pereira
    NIPS Workshop on Learning With Structured Outputs, 2004

Before 2004
  • Exploiting Sequent Structure in Membership Algorithms for the Lambek Calculus
    Ryan McDonald
    15th Annual European Summer School in Logic Language and Information (ESSLLI), 2003
    [PDF]

  • Flexible Web Document Analysis for Delivery to Narrow-Bandwidth Devices
    Gerald Penn, Jianying Hu, Hengbin Luo and Ryan McDonald
    International Conference on Document Analysis and Recognition (ICDAR), 2001
    [PDF]

Tech Reports, Non-Refereed Publications and Unpublished Material
  • Spanning Tree Methods for Discriminative Training of Dependency Parsers
    Ryan McDonald, Koby Crammer and Fernando Pereira
    UPenn CIS Technical Report: MS-CIS-05-11
    [PDF]

  • Extracting Relations from Unstructured Text
    Ryan McDonald
    My WPEII review paper for admission to PhD candidacy, 2004.
    UPenn CIS Technical Report: MS-CIS-05-06
    [PDF]

  • Automatically Annotating Documents with Normalized Gene Lists
    Jay Crim, Ryan McDonald and Fernando Pereira
    A critical assessment of text mining methods in molecular biology, BioCreative, 2004
    [PDF]

  • Identifying Gene and Protein Mentions in Text Using Conditional Random Fields
    Ryan McDonald and Fernando Pereira
    A critical assessment of text mining methods in molecular biology, BioCreative, 2004
    [PDF]

  • A Distributed Social MUD to Enhance Reliability and Scalability
    Ryan McDonald and Nick Montfort, 2003.
    Unpublished
    [PDF]