Natural Language Processing - Lemmatizing

Stemming and lemmatizing go hand in hand. Both processes reduce a word to a base form, but in different ways. Stemming cuts off the last part of the word, which may or may not leave a real word. Lemmatizing aims for a more meaningful result: it removes the inflectional part and returns the vocabulary (dictionary) form of the word. Let's understand this with a simple example.

from nltk.stem import PorterStemmer, WordNetLemmatizer

# lemmatizing verbs
words_verbs = ["run", "ran", "running", "gave", "took", "shot"]

print("*************Stemming verbs********************")
for w in words_verbs:
    # Stemming the words
    print(PorterStemmer().stem(w))

print("*************Lemmatizing verbs********************")
for w in words_verbs:
    # lemmatize the words
    print(WordNetLemmatizer().lemmatize...
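To make the contrast between cutting off a suffix and looking up a vocabulary form concrete, here is a minimal sketch in plain Python. It is only an illustration: `toy_stem`, `toy_lemmatize`, `SUFFIXES` and `TOY_LEMMAS` are hypothetical names made up for this example, the suffix list is far simpler than the Porter algorithm, and the small lookup table stands in for a real dictionary such as WordNet.

```python
# Toy illustration only: NOT the Porter algorithm and NOT WordNet.
# All names below are hypothetical and exist just for this sketch.

SUFFIXES = ("ing", "ed", "s")

# A tiny hand-written table standing in for a real vocabulary lookup.
TOY_LEMMAS = {
    "ran": "run",
    "running": "run",
    "gave": "give",
    "took": "take",
    "shot": "shoot",
}


def toy_stem(word):
    """Chop a known suffix off the end of the word, if enough is left over."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def toy_lemmatize(word):
    """Look the word up in the table; return it unchanged when it is unknown."""
    return TOY_LEMMAS.get(word, word)


words_verbs = ["run", "ran", "running", "gave", "took", "shot"]
for w in words_verbs:
    print(w, toy_stem(w), toy_lemmatize(w))
```

The suffix chopper leaves irregular forms such as "took" untouched and turns "running" into the non-word "runn", while the lookup maps both to a vocabulary word. That is the trade-off in miniature: stemming needs no dictionary but can produce non-words, and lemmatizing depends on a vocabulary.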
Natural Language Processing - NER

Named entities are specific references to something, such as a particular person, place or organization. As part of its text-processing tools, NLTK provides named entity recognition, which recognizes certain types of entities. Those types are as follows:

NE Type        Examples
ORGANIZATION   Georgia-Pacific Corp., WHO
PERSON         Eddy Bonte, President Obama
LOCATION       Murray River, Mount Everest
DATE           June, 2008-06-29
TIME           two fifty a m, 1:30 p.m.
MONEY          175 million Canadian Dollars, GBP 10.40
PERCENT        twenty pct, 18.75 %
FACILITY       Washington Monument, Stonehenge
GPE            South East Asia, Midlothian

Source: http://www.nltk.org/book/ch07.html

Simple example on NER:

import nltk
from nltk.tokenize import word_tokenize, sent_tokenize

para = "America is a country. John is a name."
sent = sent_tokenize(para)
for s in sent:
    word = word_tokenize(s)
    tag = nltk.pos_tag(word)
    n...
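NLTK's chunker, `nltk.ne_chunk`, returns its result as an `nltk.Tree` in which each recognized entity is a subtree labelled with its NE type. The sketch below shows one way such a tree could be walked to collect `(type, text)` pairs. To keep it runnable without downloading any tagger or chunker models, the tree is built by hand to resemble chunker output; the sample tree, its tags and the helper name `extract_entities` are assumptions for illustration, not real model output.

```python
from nltk.tree import Tree


def extract_entities(chunked):
    """Collect (entity_type, text) pairs from the top-level subtrees of a chunk tree."""
    entities = []
    for node in chunked:
        # Plain (word, tag) tuples are ordinary tokens; Tree nodes are entities.
        if isinstance(node, Tree):
            text = " ".join(token for token, _tag in node.leaves())
            entities.append((node.label(), text))
    return entities


# Hand-built sample (an assumption, not real chunker output).
sample = Tree(
    "S",
    [
        Tree("GPE", [("America", "NNP")]),
        ("is", "VBZ"),
        ("a", "DT"),
        ("country", "NN"),
        (".", "."),
    ],
)

print(extract_entities(sample))
```

With a real pipeline, the hand-built `sample` would be replaced by the tree produced from the tagged tokens, which requires the relevant NLTK data packages to be installed first.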