Natural Language Processing


Natural Language Processing (NLP) is one of the most important areas within Artificial Intelligence. It is deeply connected with Algorithms, Machine Learning, Programming Languages and Compiler Theory, and Automata and Formal Language Theory.

Course Description

Computers process massive quantities of information every day in the form of human language, yet machine understanding of human language remains one of the great challenges of computer science. How can advances in computing technology enable more intelligent processing of all this language data? Will computers ever be able to use this data to learn language as humans do? This course provides a systematic introduction to statistical models of human language, with particular attention to the structures of human language that inform them and the structured learning and inference algorithms that drive them. This is a lecture course, not a seminar course, but it aims to cover both fundamental and cutting-edge research issues.


Students are expected to be proficient in programming, basic algorithms and data structures (e.g., dynamic programming, graph traversal and shortest paths, hash tables and priority queues), discrete math, and basic probability theory.


  • Recommended but optional: Jurafsky and Martin, Speech and Language Processing (2nd ed.), Prentice Hall, 2008.

Learning Goals

  • Be able to write simple programs that understand natural language text by implementing classical NLP algorithms such as Viterbi and CKY

  • Be able to understand the mathematical theory of the noisy-channel model

  • Be able to understand the formal machinery for describing natural language, such as finite automata and context-free grammars

  • Be able to understand current NLP research
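
To give a sense of the kind of program the first goal refers to, here is a minimal sketch of the Viterbi algorithm for a hidden Markov model, written in Python. It is an illustration only, not course material: the function name `viterbi`, the two tags, and every probability in the toy tables are assumptions made up for this example.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (log-probability, state path) of the best HMM state sequence.

    Illustrative sketch: probabilities are given as nested dicts, and any
    missing transition or emission entry is treated as probability zero.
    """
    if not observations:
        return 0.0, []

    def log(p):
        # log(0) is mapped to -inf so that impossible paths never win.
        return math.log(p) if p > 0 else float("-inf")

    # best[t][s]: log-probability of the best path ending in state s at time t.
    # back[t][s]: the predecessor state on that best path.
    best = [{s: log(start_p.get(s, 0.0)) + log(emit_p[s].get(observations[0], 0.0))
             for s in states}]
    back = [{}]

    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            # Pick the predecessor that maximizes the score of reaching s.
            prev, score = max(
                ((p, best[t - 1][p] + log(trans_p[p].get(s, 0.0))) for p in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + log(emit_p[s].get(observations[t], 0.0))
            back[t][s] = prev

    # Follow the back-pointers from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return best[-1][last], path


# Toy two-tag model; all numbers below are made up for illustration.
STATES = ["N", "V"]
START_P = {"N": 0.8, "V": 0.2}
TRANS_P = {"N": {"N": 0.3, "V": 0.7}, "V": {"N": 0.6, "V": 0.4}}
EMIT_P = {"N": {"dogs": 0.6, "bark": 0.1}, "V": {"dogs": 0.1, "bark": 0.7}}

log_prob, tags = viterbi(["dogs", "bark"], STATES, START_P, TRANS_P, EMIT_P)
print(tags, math.exp(log_prob))
```

The sketch works in log space so that products of many small probabilities do not underflow on longer inputs; the dynamic-programming table and back-pointers are the same idea that CKY applies to spans instead of positions.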


  • Homework: 10+15+10+13 = 48%.

    • programming exercises in Python + pen-n-paper exercises

    • late penalty: you can submit two (2) HWs late (by 48 hours each).

  • Quiz: 7%

  • Final Project: 5 (proposal) + 5 (talk) + 15 (report) = 25%. Done individually or in pairs.

  • Exercises: 5+5 = 10%. Graded on completeness, not correctness.

  • Class Participation: 10%

    • asking/answering questions in class; helping peers on HWs (5%)

    • catching/fixing bugs in slides/exams/hw & other suggestions (2%)

    • reward for submitting fewer than 2 HWs late (3%)