This article shows how to filter stop words out of English text with NLTK, and is meant as a reference for developers working on similar problems.
```python
"""
Created on Sun Nov 13 09:14:13 2016
@author: daxiong
"""
from nltk.corpus import stopwords
from nltk.tokenize import sent_tokenize, word_tokenize

# English stop words; set() removes duplicate entries
# (the NLTK "stopwords" corpus and "punkt" tokenizer data must be downloaded first)
list_stopWords = list(set(stopwords.words('english')))

example_text = "Five score years ago, a great American, in whose symbolic shadow we stand today, signed the Emancipation Proclamation. This momentous decree came as a great beacon light of hope to millions of Negro slaves who had been seared in the flames of withering injustice. It came as a joyous daybreak to end the long night of bad captivity."

# Split the text into sentences
list_sentences = sent_tokenize(example_text)

# Split the text into word tokens (punctuation marks become separate tokens)
list_words = word_tokenize(example_text)

# Drop every token that appears in the stop word list (case-sensitive comparison)
filtered_words = [w for w in list_words if w not in list_stopWords]
```
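One thing to keep in mind with the listing above: the entries in NLTK's English stop word list are lowercase, and the comparison is case-sensitive, so capitalized tokens such as "This" and "It" are not filtered, and punctuation tokens are kept as well. The following is a minimal sketch of a case-insensitive variant, not the author's method. To stay runnable without NLTK data it makes two assumptions: `DEMO_STOP_WORDS` is a tiny hand-picked set standing in for `stopwords.words('english')`, and `simple_tokenize` is a hypothetical regex stand-in for `word_tokenize` that simply drops punctuation.

```python
import re

# Hypothetical mini stop list for illustration only;
# NLTK's English list is much longer.
DEMO_STOP_WORDS = {"a", "as", "in", "it", "of", "the", "to", "this", "we"}


def simple_tokenize(text):
    """Regex stand-in for word_tokenize: keeps alphabetic words
    (with an optional apostrophe part) and drops punctuation."""
    return re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)


def filter_stop_words(text, stop_words=DEMO_STOP_WORDS):
    """Return the tokens whose lowercase form is not in stop_words."""
    return [w for w in simple_tokenize(text) if w.lower() not in stop_words]


sample = "It came as a joyous daybreak to end the long night of bad captivity."
filtered = filter_stop_words(sample)
print(filtered)
```

With NLTK available, the same idea would presumably amount to building the set once with `set(stopwords.words('english'))` and comparing `w.lower()` against it; a set also makes each membership test faster than scanning a list.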