python - CountVectorizer ignores upper case


Why does CountVectorizer ignore words in upper case?

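from sklearn.feature_extraction.text import CountVectorizer

cv = CountVectorizer(stop_words=None, analyzer='word', token_pattern='.*', max_features=None)
text = ['this', 'is', 'a', 'Test', '!']
fcv = cv.fit_transform(text)  # fit on the documents, not the built-in `list`
fcv = [cv.vocabulary_.get(t) for t in text]  # look up each token's vocabulary index
print(fcv)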

returns

[5, 3, 2, None, 1]

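This is caused by lowercase being set to True by default in CountVectorizer: every token is lowercased before it goes into the vocabulary, so a lookup with the original capitalized spelling returns None. Add lowercase=False to keep the original casing.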

cv = CountVectorizer(stop_words=None, analyzer='word', token_pattern='.*', max_features=None, lowercase=False)
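
To see the effect side by side, here is a minimal sketch, assuming a reasonably recent scikit-learn. The variable names (cv_default, cv_cased, preprocess) are made up for illustration and are not from the original post. It also shows an alternative that may suit some cases: keep the default lowercasing and pass each lookup key through the vectorizer's own preprocessor, obtained with build_preprocessor().

```python
from sklearn.feature_extraction.text import CountVectorizer

text = ['this', 'is', 'a', 'Test', '!']

# Default behaviour: lowercase=True, so 'Test' is stored as 'test'.
cv_default = CountVectorizer(analyzer='word', token_pattern='.*')
cv_default.fit(text)
default_lookup = [cv_default.vocabulary_.get(t) for t in text]

# Alternative to lowercase=False: normalise each lookup key with the
# same preprocessing step the vectorizer applied while fitting.
preprocess = cv_default.build_preprocessor()
normalised_lookup = [cv_default.vocabulary_.get(preprocess(t)) for t in text]

# With lowercase=False the original casing is kept in the vocabulary.
cv_cased = CountVectorizer(analyzer='word', token_pattern='.*', lowercase=False)
cv_cased.fit(text)
cased_lookup = [cv_cased.vocabulary_.get(t) for t in text]

print(default_lookup)     # the entry for 'Test' is None
print(normalised_lookup)  # every entry is an index
print(cased_lookup)       # every entry is an index
```

Which option fits depends on whether 'Test' and 'test' should count as the same feature: lowercase=False treats them as different, while normalising the lookup key keeps them merged.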
