Chapter 1. Tokenizing Text and WordNet Basics

In this chapter, we will cover:

  • Tokenizing text into sentences
  • Tokenizing sentences into words
  • Tokenizing sentences using regular expressions
  • Filtering stopwords in a tokenized sentence
  • Looking up synsets for a word in WordNet
  • Looking up lemmas and synonyms in WordNet
  • Calculating WordNet synset similarity
  • Discovering word collocations