Introduction to Computational Linguistics

Linguistics 346
Winter 2001, Tu/Th 2:30-3:50
Northwestern University



Instructor: Prof. Chris Kennedy
Office: Linguistics Department, Room 12 (2016 Sheridan Rd.)
Phone: 491-8054
Email: kennedy@northwestern.edu
Office hours: Wednesday 10-12 (or by appointment)

Course description

This course is an introduction to computational linguistics, designed to familiarize students with the methods and goals of language processing technologies at both an applied and a theoretical level. The central goal of the course is to provide students with a solid understanding of the core computational questions that arise in linguistic analysis, and to develop a set of language processing skills and algorithms that will be useful both for developing applied linguistic technologies and for more theoretical linguistic research. Specific computational topics to be covered include regular expressions and finite state automata/transducers, part of speech tagging, context free grammars, parsing and complexity, and unification grammars and feature structures. We will examine these topics as they apply to the analysis of natural language at different levels: morphology, syntax, semantics and discourse.

Requirements

6 assignments (60%)
Take-home midterm (20%)
Final project (20%)

Students are encouraged to work together in learning the concepts and skills required to complete the assignments in this course; however, the assignments themselves must be done individually.

Text

Jurafsky, Daniel and James Martin, 2000, Speech and Language Processing, Prentice Hall, Upper Saddle River, N.J. (Available at Norris Bookstore.)

Online resources

Mini-corpus of examples of Verb Phrase Ellipsis (VPE) for evaluating the ellipsis resolution algorithm you will construct for the final project.

PDF versions of overheads from class

Links to useful and informative websites related to computational linguistics and natural language processing (to be updated throughout the course).

Syllabus

Week 1: NO CLASS ON THURSDAY, JANUARY 4

CK will be at the annual meeting of the Linguistic Society of America on the first day of class, but students should start doing the reading for week two.

Week 2: Introduction

Introduction, regular expressions, finite state automata

Reading: Chapters 1-2

Week 3: Morphology

Finite state transducers, morphological parsing

Reading: Chapter 3

Week 4: Syntax

Part of speech tagging, context free grammars

Reading: Chapters 8-9

Week 5: Syntax

Syntactic parsing with context free grammars

Reading: Chapter 10

Week 6: Syntax

Feature structures and unification grammars

Reading: Chapter 11

Week 7: Semantics

Representing (various aspects of) meaning, mapping syntax to semantics

Reading: Chapter 14

Week 8: Semantics

Semantic analysis, information extraction

Reading: Chapter 15

Week 9: Pragmatics

Reference resolution, coherence, rhetorical structure

Reading: Chapter 18

Week 10: Conclusion

Loose ends, looking ahead


