This course is an introduction to Natural Language Processing (with a mainly deep-learning approach). The lab sessions use PyTorch (a Python module).
- Guillaume Pitel's course: the slides are on the drive!
- Deadline for the project: May the 15th
- For the project and the reading: team formation and topic selection are due Thursday the 6th of February
- The project and reading lists are available, see the drive!
- The course starts on 16/01
The resources / drive
See this drive for the slides and the lab session material.
The course starts on the 16th of January 2020. Sessions are scheduled on Thursdays, starting at 8:30 in the morning.
16-jan, course: NLP, overview and the main tasks
- For the linguistic part you can refer to https://faculty.washington.edu/ebender/100things-sem_prag.html
- For the NLP basics: https://web.stanford.edu/~jurafsky/slp3/
23-jan, course: Text classification
30-jan, lab session
Two notebooks for two parts (see the drive)
- pytorch 101
- text classification
Further work: text classification with convolution
6-feb, course: sequence models
- n-gram language models
- recurrent and LSTM networks
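As a rough illustration of the first topic, here is a minimal sketch of a bigram (2-gram) language model estimated by maximum likelihood. It is not taken from the course material: the function names `train_bigram_model` and `bigram_probability`, the `<s>` / `</s>` boundary markers and the toy corpus are all assumptions chosen for the example, and no smoothing is applied.

```python
from collections import Counter, defaultdict


def train_bigram_model(sentences):
    """Count bigrams over tokenized sentences.

    Each sentence is padded with hypothetical <s> and </s> boundary markers.
    Returns (context_counts, bigram_counts).
    """
    context_counts = Counter()
    bigram_counts = defaultdict(Counter)
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        for prev, curr in zip(padded, padded[1:]):
            context_counts[prev] += 1
            bigram_counts[prev][curr] += 1
    return context_counts, bigram_counts


def bigram_probability(prev, curr, context_counts, bigram_counts):
    """Maximum-likelihood estimate of P(curr | prev), without smoothing.

    Returns 0.0 when the context word was never seen in training.
    """
    if context_counts[prev] == 0:
        return 0.0
    return bigram_counts[prev][curr] / context_counts[prev]


# Toy corpus, invented for illustration only.
corpus = [
    ["the", "cat", "sleeps"],
    ["the", "dog", "sleeps"],
    ["the", "cat", "eats"],
]
context_counts, bigram_counts = train_bigram_model(corpus)
p_cat_given_the = bigram_probability("the", "cat", context_counts, bigram_counts)
```

Unseen bigrams get probability zero with this estimator, which is one motivation for smoothing and for the neural sequence models listed above.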
Then for the first part
- 13-feb: Postponed
- 27-feb, course: readings
The second part:
- 03-mar, course: syntax, by B. Crabbé, at 13:30
- 05-mar, 12-mar: Large-scale NLP: the business perspective, G. Pitel
The evaluation is in two parts. For both, first form your team (typically 3 students).
The goal is to read an article and give a presentation on it (on 27-feb). A list will be available soon, but you can also propose an article (I must agree beforehand). Select one article per team, then read and analyse it in order to give a clear and concise presentation. Some questions you may use to guide your reading (among others):
- Did you like the paper? Did you find it interesting? Be honest!
- What are the most important things you learned from the paper? Why are they important?
- Do the lessons learned generalize beyond the specific task? Do they contribute towards building an important system or application?
- Is the experimental setup satisfying? Any experiments missing? Any obvious or important baseline missing?
- Is the problem/approach well motivated?
- Are you convinced by the results? Why?
- Is the writing clear? Is the paper well structured?
The important dates are:
- Form your team and select the paper before 6-feb
- Presentation: 27-feb (10 minutes per team)
A list will be available soon, but you can also propose one (I must agree beforehand).
- Team and project registration: before 6-feb
- Deliverable for 13-feb: 2 pages (pdf only) to describe the data, the task and your plan
- Deliverable for 27-feb: a github/gitlab repository
- Final deliverable: a report in pdf and the code via the git repository
- Final deadline: 15th of May