Natural Language Processing
Stop words are the words which we ignore due to the fact that they do not generate any specific meaning to the sentence. Words like the, is, at etc. can be removed to extract the meaning of the sentence more easily. So NLTK has introduced us a stop words filter we can easily use. Let's see how it works.
from nltk.corpus import stopwords from nltk.tokenize import word_tokenize sent = "As you can see this is the blog of myself which is written by Anjula" w = word_tokenize(sent) # set English stop words stop_words = set(stopwords.words('english')) # list of standard stop words in English print(stop_words) # making empty arrays to store stop words and others stop_words_in_sent = [] non_stop_words = [] # Loop through to get the stop words for x in w: if x not in stop_words: non_stop_words.append(x) else: stop_words_in_sent.append(x) # print result print(non_stop_words) print(stop_words_in_sent)
The code is simple as that the output will be as follows: