NLP
Natural language processing (NLP) is a set of techniques for the automated generation, manipulation, and analysis of human language.
Applications
Processing large amounts of text;
Indexing and searching large texts;
Speech understanding;
Information retrieval;
Automatic summarization;
Human-Computer Interaction;
Common Tasks
1 Stemming (word reduction)
Stemming is the process of reducing inflected words to their word stem, base, or root form (for example, mapping the different tensed forms of a word back to one base form).
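As a rough illustration of the idea, here is a minimal sketch of a suffix-stripping stemmer. The function `toy_stem` and its suffix list are hypothetical names made up for this example; this is not the Porter algorithm or any library's implementation, and real stemmers apply many more rules.

```python
# Toy suffix-stripping stemmer (illustrative only; NOT the Porter algorithm).
# The suffix list is an assumption chosen for this example.
SUFFIXES = ("ing", "ed", "es", "s")  # tried in this order


def toy_stem(word):
    """Strip one common inflectional suffix, keeping a stem of at least 3 letters."""
    w = word.lower()
    for suffix in SUFFIXES:
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            return w[: -len(suffix)]
    return w  # no rule applied: the word is returned unchanged


words = ["jumps", "jumped", "jumping", "jump"]
stems = [toy_stem(w) for w in words]
print(stems)  # ['jump', 'jump', 'jump', 'jump']
```

The length check keeps short words such as "red" from being cut down to "r"; a production stemmer handles such cases with more careful conditions.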
Part-of-speech tagging labels each word with its grammatical category, for example:
The      ball  is    red.
article  noun  verb  adjective
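The word-by-word labels above can be reproduced with a simple dictionary lookup. This is only a sketch: `LEXICON` and `tag` are hypothetical names, and the hand-written lexicon is an assumption for the example. Practical taggers usually learn tags from annotated corpora and use context to resolve ambiguous words.

```python
# Hypothetical hand-written lexicon mapping lowercase words to a tag.
LEXICON = {
    "the": "article",
    "ball": "noun",
    "boy": "noun",
    "is": "verb",
    "went": "verb",
    "red": "adjective",
}


def tag(sentence):
    """Return (word, tag) pairs; words missing from the lexicon get 'unknown'."""
    words = sentence.rstrip(".").split()
    return [(w, LEXICON.get(w.lower(), "unknown")) for w in words]


tagged = tag("The ball is red.")
print(tagged)
# [('The', 'article'), ('ball', 'noun'), ('is', 'verb'), ('red', 'adjective')]
```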
Parsing groups the tagged words into phrases, for example:
The boy            went home.
NP (noun phrase)   VP (verb phrase)
article noun       verb noun
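One way to hold the phrase structure above in code is a nested tuple, with each node written as (label, children...). The sketch below is an assumption about representation, not the output of a real parser: the tree is written by hand, the helper `leaves` is a hypothetical name, and the tag given to "home" is assumed to be noun.

```python
# Hand-written parse tree: (label, child, child, ...); a leaf is (tag, word).
tree = (
    "S",
    ("NP", ("article", "The"), ("noun", "boy")),
    ("VP", ("verb", "went"), ("noun", "home")),  # tag for "home" is an assumption
)


def leaves(node):
    """Collect the words under a node, left to right."""
    label, *children = node
    if len(children) == 1 and isinstance(children[0], str):
        return [children[0]]  # leaf node: (tag, word)
    words = []
    for child in children:
        words.extend(leaves(child))
    return words


# Map each top-level phrase label to the words it covers.
phrases = {child[0]: " ".join(leaves(child)) for child in tree[1:]}
print(phrases)  # {'NP': 'The boy', 'VP': 'went home'}
```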
TF-IDF example:
TF (term frequency) for cat is 3/100 = 0.03;
IDF (inverse document frequency) is log10(10,000,000/1,000) = 4;
TF-IDF: 0.03 * 4 = 0.12.
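The arithmetic can be checked with a few lines of Python. This sketch assumes the common definitions TF = term count / total words in the document and IDF = log10(total documents / documents containing the term). The function names are hypothetical, and the corpus size of 10,000,000 documents is the value for which the base-10 logarithm gives an IDF of 4.

```python
import math


def tf(term_count, total_terms):
    """Term frequency: share of the document's words that are the term."""
    return term_count / total_terms


def idf(total_docs, docs_with_term):
    """Inverse document frequency with a base-10 logarithm."""
    return math.log10(total_docs / docs_with_term)


tf_cat = tf(3, 100)                # "cat" appears 3 times in a 100-word document
idf_cat = idf(10_000_000, 1_000)   # assumed corpus: 10,000,000 docs, 1,000 contain "cat"
tfidf_cat = tf_cat * idf_cat

print(round(tf_cat, 2), round(idf_cat, 2), round(tfidf_cat, 2))
```

Note that other logarithm bases and smoothed IDF variants are also in use, so absolute TF-IDF values are only comparable when computed with the same definition.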