NLP: Feature Extraction
Text feature extraction
API
- feature_extraction.text.CountVectorizer
- feature_extraction.text.HashingVectorizer
- feature_extraction.text.TfidfTransformer
- feature_extraction.text.TfidfVectorizer
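A minimal sketch of how the four classes listed above might be used together, assuming scikit-learn is installed. The three-sentence corpus and the choice of `n_features=16` are made up for illustration and are not from the original notes.

```python
from sklearn.feature_extraction.text import (
    CountVectorizer,
    HashingVectorizer,
    TfidfTransformer,
    TfidfVectorizer,
)

# Hypothetical toy corpus, used only for illustration.
corpus = ["the cat sat", "the dog sat", "the cat ran fast"]

# CountVectorizer: learn a vocabulary and build a document-term count matrix.
count_vec = CountVectorizer()
counts = count_vec.fit_transform(corpus)   # SciPy sparse matrix
vocab = sorted(count_vec.vocabulary_)      # learned terms, in alphabetical order

# TfidfTransformer: reweight an existing count matrix with tf-idf.
tfidf_from_counts = TfidfTransformer().fit_transform(counts)

# TfidfVectorizer: the two steps above in a single estimator.
tfidf_direct = TfidfVectorizer().fit_transform(corpus)

# HashingVectorizer: no stored vocabulary; terms are hashed into a fixed
# number of columns (16 is an arbitrary, deliberately small choice here).
hashed = HashingVectorizer(n_features=16).transform(corpus)

print(vocab)
print(counts.shape, tfidf_direct.shape, hashed.shape)
```

With default settings, `TfidfVectorizer` should give the same matrix as `CountVectorizer` followed by `TfidfTransformer`, so the choice between them is mostly about whether the raw counts are needed separately.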
1. Bag of Words
Tools provided by scikit-learn
- tokenizing
- counting
- normalizing
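The three steps above can be sketched in plain Python to show what each one does. This is an illustrative sketch, not scikit-learn's implementation: the helper names `tokenize` and `bag_of_words`, the two-character token rule, and the use of L2 normalization are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Tokenizing: lowercase the text and keep word tokens of 2+ characters."""
    return re.findall(r"\b\w\w+\b", text.lower())


def bag_of_words(docs):
    """Return (vocabulary, rows); each row is an L2-normalized count vector."""
    tokenized = [tokenize(doc) for doc in docs]
    # One column per distinct token, in alphabetical order.
    vocabulary = sorted({tok for toks in tokenized for tok in toks})
    rows = []
    for toks in tokenized:
        # Counting: occurrences of each vocabulary term in this document.
        counts = Counter(toks)
        vec = [counts[term] for term in vocabulary]
        # Normalizing: scale the vector to unit length (assumed L2 norm).
        norm = math.sqrt(sum(v * v for v in vec))
        rows.append([v / norm if norm else 0.0 for v in vec])
    return vocabulary, rows


# Hypothetical example documents.
docs = ["The cat sat on the mat", "The dog sat"]
vocabulary, rows = bag_of_words(docs)
print(vocabulary)
print(rows)
```

Word order is discarded: each document is reduced to a vector of per-term weights, which is why the representation is called a "bag" of words.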
2. Sparsity
test