Understand the Various Steps in Natural Language Processing (NLP) in Under 5 Minutes

Natural Language Processing, or NLP for short, is broadly defined as the automatic manipulation of natural language, such as speech and text, by software.

The study of Natural Language Processing has been around for more than 4 decades.

In this blog post, you will discover the main steps in NLP.

Tokenization:

The process of breaking a string into smaller units, called tokens, is known as tokenization.

Example:

My Name is Aman

After breaking this into tokens, we get:

'My' , 'Name' , 'is' , 'Aman'
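
The split shown above can be sketched with a regular expression. This is a minimal illustration using only the Python standard library, not how any particular NLP library tokenizes; the function name tokenize is an assumption made for this example.

```python
import re

def tokenize(text):
    """Split text into word tokens and single punctuation tokens."""
    # \w+ matches a run of letters, digits, or underscores (a word);
    # [^\w\s] matches one character that is neither a word character
    # nor whitespace (a punctuation mark).
    return re.findall(r"\w+|[^\w\s]", text)

tokens = tokenize("My Name is Aman")
print(tokens)  # ['My', 'Name', 'is', 'Aman']
```

Treating punctuation as separate tokens is a design choice of this sketch; real tokenizers also deal with contractions, URLs, and similar cases.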

Stemming:

Normalizing words into their base or root form is called stemming.

Example: All of the words below are treated as the same word:

Affections, Affects, Affected, Affecting

All of the above tokens are converted to their root form, which is:

Affect

Stemming simply tries to strip the common prefixes and suffixes from a word.
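
To make the idea of stripping endings concrete, here is a toy suffix stripper. It is a rough sketch and not the Porter stemmer or any other published algorithm; the suffix list and the name naive_stem are assumptions chosen so that the example words above work.

```python
# Hypothetical suffix list for this example, checked in the order given.
SUFFIXES = ("ions", "ing", "ed", "s")

def naive_stem(word):
    """Strip the first matching suffix, keeping at least 3 characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

words = ["Affections", "Affects", "Affected", "Affecting"]
stems = [naive_stem(w) for w in words]
print(stems)  # ['affect', 'affect', 'affect', 'affect']
```

Because the rules are purely mechanical, a stemmer like this can produce strings that are not real words, which is one reason lemmatization exists.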

Lemmatization:

Lemmatization takes the morphological analysis of a word into account and maps the word to its dictionary form, the lemma.

Example:

A lemmatizer should map gone, going, and went to go.
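
Irregular forms such as went cannot be reached by stripping suffixes, so a lemmatizer needs vocabulary knowledge. The sketch below fakes that knowledge with a tiny hand-written lookup table; the table LEMMAS and the function lemmatize are hypothetical, and real lemmatizers rely on a full dictionary plus morphological rules.

```python
# Hypothetical lookup table: inflected form -> lemma.
LEMMAS = {
    "gone": "go",
    "going": "go",
    "went": "go",
}

def lemmatize(word):
    """Return the lemma if the word is known, else the lowercased word."""
    word = word.lower()
    return LEMMAS.get(word, word)

lemmas = [lemmatize(w) for w in ["gone", "going", "went"]]
print(lemmas)  # ['go', 'go', 'go']
```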

POS Tagging (Part-of-Speech Tagging):

Here each word is mapped to its part of speech.

Example:

The	Dog	Killed	the	Bat
DT	NN	VBD	DT	NN

In this example, DT marks a determiner, NN a singular noun, and VBD a past-tense verb (Penn Treebank tags).
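
The word-to-tag mapping in the example can be sketched as a dictionary lookup. This is only an illustration: the lexicon LEXICON and the function pos_tag are assumptions for this example, and real taggers use trained statistical models that look at context rather than a fixed table.

```python
# Hypothetical mini-lexicon covering only the example sentence.
LEXICON = {
    "the": "DT",      # determiner
    "dog": "NN",      # singular noun
    "bat": "NN",      # singular noun
    "killed": "VBD",  # past-tense verb
}

def pos_tag(tokens):
    """Tag each token by lookup; unknown words default to NN."""
    return [(tok, LEXICON.get(tok.lower(), "NN")) for tok in tokens]

tagged = pos_tag(["The", "Dog", "Killed", "the", "Bat"])
print(tagged)
# [('The', 'DT'), ('Dog', 'NN'), ('Killed', 'VBD'), ('the', 'DT'), ('Bat', 'NN')]
```

Defaulting unknown words to NN is a simple baseline choice in this sketch, since nouns are the most common open word class.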

Named Entity Recognition:

It is used to identify named entities in text, such as movies, organisations, locations, people, and so on.

Example:

Google's CEO Sundar Pichai introduced the new Pixel Phone in New York

After named entity recognition, the sentence is labelled as follows:

Google's	CEO	Sundar Pichai	introduced	the	new	Pixel	Phone	in	New York
Organisation		Person				Object			Location
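
The labelling above can be imitated with a gazetteer, that is, a fixed list of known names. This is a deliberately simple stand-in: production NER systems use trained models, and the GAZETTEER table and find_entities function here are hypothetical names made up for this sketch.

```python
# Hypothetical gazetteer: known entity name -> entity label.
GAZETTEER = {
    "Google": "Organisation",
    "Sundar Pichai": "Person",
    "Pixel": "Object",
    "New York": "Location",
}

def find_entities(text):
    """Return (name, label) pairs for known names, in order of appearance."""
    found = []
    for name, label in GAZETTEER.items():
        start = text.find(name)  # -1 when the name does not occur
        if start != -1:
            found.append((start, name, label))
    return [(name, label) for _, name, label in sorted(found)]

sentence = "Google's CEO Sundar Pichai introduced the new Pixel Phone in New York"
entities = find_entities(sentence)
print(entities)
# [('Google', 'Organisation'), ('Sundar Pichai', 'Person'),
#  ('Pixel', 'Object'), ('New York', 'Location')]
```

A lookup like this cannot recognise names it has never seen, which is the main reason learned models are preferred in practice.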

Chunking:

Chunking means picking up individual pieces of information, such as tagged words, and grouping them into bigger pieces, such as noun phrases.
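
As a rough sketch of grouping, the function below merges a determiner followed by a noun into a single noun-phrase chunk labelled NP. The function name chunk_noun_phrases and the one-rule grammar are assumptions for illustration; real chunkers use richer grammars or trained models.

```python
def chunk_noun_phrases(tagged):
    """Group (word, tag) pairs into chunks; DT + NN becomes one NP chunk."""
    chunks = []
    i = 0
    while i < len(tagged):
        word, tag = tagged[i]
        next_is_noun = i + 1 < len(tagged) and tagged[i + 1][1] == "NN"
        if tag == "DT" and next_is_noun:
            # Determiner plus noun: merge both words into one noun phrase.
            chunks.append(("NP", [word, tagged[i + 1][0]]))
            i += 2
        elif tag == "NN":
            # A bare noun is a noun phrase on its own.
            chunks.append(("NP", [word]))
            i += 1
        else:
            # Anything else stays as a single-word chunk under its own tag.
            chunks.append((tag, [word]))
            i += 1
    return chunks

tagged_sentence = [("The", "DT"), ("Dog", "NN"), ("Killed", "VBD"),
                   ("the", "DT"), ("Bat", "NN")]
chunks = chunk_noun_phrases(tagged_sentence)
print(chunks)
# [('NP', ['The', 'Dog']), ('VBD', ['Killed']), ('NP', ['the', 'Bat'])]
```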