Natural Language Processing
About this course

Natural Language Processing (NLP) is the subfield of Artificial Intelligence that deals with tasks involving human languages (e.g. English, Swedish, Xhosa). NLP includes question answering, sentiment analysis, summarization, and translation. Recently, great excitement has been generated by the part of NLP known as Large Language Models (LLMs), such as ChatGPT. This course is an introduction to NLP with a focus on the parts of the field needed to understand Large Language Models. We will discuss and implement various algorithms needed to create an LLM. These may include tokenization, stemming and lemmatization, word embeddings, basic neural networks, transformers, and attention modules. We will discuss ways to use LLMs, explore why LLMs perform well on some tasks and not others, and reflect on whether LLMs can be harmful.

Syllabus

Link to Draft Syllabus

This is a draft syllabus. The final syllabus will be available here a few days prior to the course’s first start date.

Prerequisites

One year of computer science at the university level; a course in data structures or a course in algorithms; and knowledge of a programming language (e.g. Python, JavaScript, Java, C++, or MATLAB).

Faculty

John Rager


Prof. Rager earned his Ph.D. from Northwestern University. He is a full professor at Amherst College in Amherst, Massachusetts, where he has taught since 1988. He has always been interested in languages, both human and computer. His dissertation was in the field of symbolic natural language processing; since then his research has shifted to (among other things) natural language processing using machine learning. He has also worked on applying Artificial Intelligence to teaching English to speakers of other languages, work motivated by the difficulties faced by English teachers in Moldova, where he was a Fulbright Scholar during the 2003-04 academic year. His teaching has often touched on language. For example, he has taught a seminar for first-year students called “Natural and Unnatural Languages.” The material in that course included “traditional” natural language processing as done in artificial intelligence, but also a discussion of rhetorical devices in Shakespeare, a reading of parts of Finnegans Wake, and a discussion of language evolution. He has also taught a course on Digital Textual Analysis, which discussed the computer science (e.g. topic modeling, Naive Bayes classification) used in digital humanities papers. That course included both Computer Science and Humanities students, who worked together in groups on projects.
