Word Prediction with N-Grams Model Using Python
The “AI611µ Word Prediction with N-Grams Model using Python” micro-course covers word prediction with artificial intelligence techniques. More precisely, it shows how to use an N-gram model to predict words from a corpus, and it builds a small example in Python.
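To make the idea concrete, here is a minimal sketch of bigram-based word prediction (an N-gram model with N = 2) using only the Python standard library. It is not taken from the course material: the toy corpus and the function names `train_bigram_model`, `bigram_probability` and `predict_next` are assumptions made for this illustration, and the probabilities are plain relative frequencies without any smoothing.

```python
from collections import Counter, defaultdict


def train_bigram_model(tokens):
    """Count, for each word, how often every other word directly follows it."""
    followers = defaultdict(Counter)
    for previous, current in zip(tokens, tokens[1:]):
        followers[previous][current] += 1
    return followers


def bigram_probability(model, previous, word):
    """Estimate P(word | previous) as a relative frequency (no smoothing)."""
    counts = model.get(previous)
    if not counts:
        return 0.0
    return counts[word] / sum(counts.values())


def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if `word` is unseen."""
    counts = model.get(word)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Toy corpus invented for this illustration only (hypothetical data).
corpus = "i like green tea . i like green apples . i drink green tea ."
tokens = corpus.split()

model = train_bigram_model(tokens)

print(predict_next(model, "green"))  # tea
print(predict_next(model, "i"))      # like
```

In this toy corpus, "green" is followed twice by "tea" and once by "apples", so the sketch predicts "tea" and estimates P(tea | green) as 2/3. A realistic model would be trained on a much larger corpus and would typically need smoothing to handle unseen word pairs.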
I taught this micro-course once, in 2020, at the ECAM Brussels Engineering School (ECAM) as part of the “I4110 – Artificial intelligence” course. The course is taught in French, but all the material is available in both English and French.
Documents
- General information about the micro-course
- Competency Based Assessment
- Grid of skills to acquire
Theory
- Session 1: Word Prediction Problem and N-Grams Model
- Session 2: N-Grams Model Training and Model Evaluation
Practice
- Quiz 1: N-Grams model
- Quiz 2: Bigram model training
- Quiz 3: N-Grams model training and evaluation
- Coding 1: Corpus statistics
- Coding 2: Training a bigram model
- Mission 1: N-Grams model applications
- Mission 2: Bigram model training with nltk
- Project 1: Simple word prediction application
Resources
This section gathers the resources that were used to create this micro-course. They can also be used to learn more about N-grams.
Reference books
- Daniel Jurafsky and James H. Martin (2008). Speech and Language Processing (Second Edition). Pearson. (ISBN: 978-0-13-504196-3)
Online resources
- Online NGram Analyzer to perform simple statistics on texts.
- Google Ngram Viewer with data collected from Google Books.
- Official website of the Natural Language Toolkit (NLTK) Python module.