{"id":2547,"date":"2026-02-15T18:44:33","date_gmt":"2026-02-15T10:44:33","guid":{"rendered":"https:\/\/ai.ziyuanzz.online\/?page_id=2547"},"modified":"2026-03-28T09:28:28","modified_gmt":"2026-03-28T01:28:28","slug":"nltk","status":"publish","type":"page","link":"https:\/\/ai.ziyuanzz.online\/index.php\/nltk\/","title":{"rendered":"NLTK"},"content":{"rendered":"<div class=\"wp-block-image\">\n<figure class=\"alignleft size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"600\" height=\"450\" src=\"https:\/\/ai.ziyuanzz.online\/wp-content\/uploads\/2026\/02\/1-49.jpeg\" alt=\"\" class=\"wp-image-2548\" style=\"width:230px\" srcset=\"https:\/\/ai.ziyuanzz.online\/wp-content\/uploads\/2026\/02\/1-49.jpeg 600w, https:\/\/ai.ziyuanzz.online\/wp-content\/uploads\/2026\/02\/1-49-300x225.jpeg 300w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><\/figure>\n<\/div>\n\n\n<h2 class=\"wp-block-heading\">NLTK<\/h2>\n\n\n\n<p>Python natural language processing toolkit<\/p>\n\n\n\n<div class=\"wp-block-buttons is-layout-flex wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button\"><a class=\"wp-block-button__link wp-element-button\" href=\"https:\/\/www.nltk.org\/?utm_source=ai-bot.cn\" target=\"_blank\" rel=\"nofollow noopener\">Visit the official website ><\/a><\/div>\n<\/div>\n\n\n\n<div style=\"height:100px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"2400\" height=\"240\" src=\"http:\/\/ai.ziyuanzz.online\/wp-content\/uploads\/2026\/01\/d-design-banner-0906-1.png\" alt=\"\" class=\"wp-image-34\" srcset=\"https:\/\/ai.ziyuanzz.online\/wp-content\/uploads\/2026\/01\/d-design-banner-0906-1.png 2400w, https:\/\/ai.ziyuanzz.online\/wp-content\/uploads\/2026\/01\/d-design-banner-0906-1-300x30.png 300w, https:\/\/ai.ziyuanzz.online\/wp-content\/uploads\/2026\/01\/d-design-banner-0906-1-1024x102.png 1024w, 
https:\/\/ai.ziyuanzz.online\/wp-content\/uploads\/2026\/01\/d-design-banner-0906-1-768x77.png 768w, https:\/\/ai.ziyuanzz.online\/wp-content\/uploads\/2026\/01\/d-design-banner-0906-1-1536x154.png 1536w, https:\/\/ai.ziyuanzz.online\/wp-content\/uploads\/2026\/01\/d-design-banner-0906-1-2048x205.png 2048w\" sizes=\"auto, (max-width: 2400px) 100vw, 2400px\" \/><\/figure>\n\n\n\n<div style=\"height:40px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h2 class=\"wp-block-heading\">What Is NLTK<\/h2>\n\n\n\n<p>NLTK (Natural Language Toolkit) is an open-source suite of Python modules, datasets, and tutorials built for natural language processing (NLP). It provides tools for tokenization, part-of-speech tagging, syntactic parsing, named entity recognition, and more. NLTK also bundles a large collection of corpora and lexical resources, such as WordNet, for linguistic research and development. At the time of writing, NLTK supports Python 3.7, 3.8, 3.9, 3.10, or 3.11; the supported range changes between releases. It suits users from beginners to professionals and is widely used in academic research, commercial applications, and education. With thorough documentation and an active community, it is a practical tool for learning and practicing NLP.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><a class=\"js\" href=\"https:\/\/ai-bot.cn\/wp-content\/uploads\/2025\/09\/NLTK-website.png\" target=\"_blank\" rel=\"nofollow noopener\"><img decoding=\"async\" src=\"https:\/\/ai-bot.cn\/wp-content\/uploads\/2025\/09\/NLTK-website.png\" alt=\"NLTK\" 
class=\"wp-image-61837\"\/><\/a><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">Key Features of NLTK<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong class=\"\">Tokenization<\/strong>: Splits text into words or sentences for downstream processing.<\/li>\n\n\n\n<li><strong>Part-of-Speech Tagging<\/strong>: Labels each word in a text with its part of speech, such as noun, verb, or adjective.<\/li>\n\n\n\n<li><strong>Named Entity Recognition (NER)<\/strong>: Identifies named entities in text, such as people, places, and organizations.<\/li>\n\n\n\n<li><strong>Stemming<\/strong>: Strips affixes to reduce words to a base form (the stem) so that variants can be handled uniformly.<\/li>\n\n\n\n<li><strong>Lemmatization<\/strong>: Reduces words to their dictionary form (the lemma), which handles vocabulary more accurately than stemming.<\/li>\n\n\n\n<li><strong>Parsing<\/strong>: Builds syntax trees to analyze the grammatical structure of sentences.<\/li>\n\n\n\n<li><strong class=\"\">Corpus Access<\/strong>: Provides many corpora, such as the Brown Corpus and the Penn 
Treebank, for research and development.<\/li>\n\n\n\n<li><strong>Classifiers<\/strong>: Provides several classifiers, such as naive Bayes and decision tree classifiers, for text classification tasks.<\/li>\n\n\n\n<li><strong>Feature Extraction<\/strong>: Extracts features from text for training machine learning models.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">How to Use NLTK<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Install NLTK<\/strong>: Run the following command in a terminal or command prompt:<\/li>\n<\/ul>\n\n\n\n<pre class=\"wp-block-preformatted\">pip install nltk<\/pre>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Verify the installation<\/strong>: Run the following code in a Python environment:<\/li>\n<\/ul>\n\n\n\n<pre class=\"wp-block-preformatted\">import nltk\nprint(nltk.__version__)<\/pre>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Download the required data packages<\/strong>: Run the following code to download the basic data packages:<\/li>\n<\/ul>\n\n\n\n<pre class=\"wp-block-preformatted\">import nltk\nnltk.download('punkt')  # tokenizer models (older NLTK releases)\nnltk.download('punkt_tab')  # tokenizer models (NLTK 3.9 and later)\nnltk.download('averaged_perceptron_tagger')  # POS tagger (older NLTK releases)\nnltk.download('averaged_perceptron_tagger_eng')  # POS tagger (NLTK 3.9 and later)<\/pre>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Basic usage<\/strong>:\n<ul class=\"wp-block-list\">\n<li><strong>Tokenization<\/strong>:<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<pre class=\"wp-block-preformatted\">from nltk.tokenize import word_tokenize\n\ntext = \"NLTK is a powerful library for natural language processing.\"\nwords = word_tokenize(text)  # split the sentence into word and punctuation tokens\nprint(\"Tokens:\", words)<\/pre>\n\n\n\n<ul class=\"wp-block-list\">\n<li>\n<ul 
class=\"wp-block-list\">\n<li><strong>Part-of-speech tagging<\/strong>:<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<pre class=\"wp-block-preformatted\">from nltk import pos_tag\n\n# Reuses the `words` list from the tokenization example above\ntagged_words = pos_tag(words)\nprint(\"POS tags:\", tagged_words)<\/pre>\n\n\n\n<h2 class=\"wp-block-heading\">Use Cases for NLTK<\/h2>\n\n\n\n<ul 
class=\"wp-block-list\">\n<li><strong>Text classification<\/strong>: Use NLTK's classifiers to categorize text, for example in spam detection or document classification.<\/li>\n\n\n\n<li><strong>Sentiment analysis<\/strong>: Determine whether a text is positive, negative, or neutral; commonly used in social media monitoring and market research.<\/li>\n\n\n\n<li><strong>Machine translation<\/strong>: Support translation between languages using language models and syntactic parsing.<\/li>\n\n\n\n<li><strong>Question answering<\/strong>: Use NLTK's NLP capabilities to build systems that interpret and answer questions.<\/li>\n\n\n\n<li><strong>Text summarization<\/strong>: Extract the key information from a text to produce a concise summary for quick reading.<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>NLTK Python natural language processing toolkit What Is NLTK NLTK (Natural Language Tool 
[&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-2547","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/ai.ziyuanzz.online\/index.php\/wp-json\/wp\/v2\/pages\/2547","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/ai.ziyuanzz.online\/index.php\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/ai.ziyuanzz.online\/index.php\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/ai.ziyuanzz.online\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/ai.ziyuanzz.online\/index.php\/wp-json\/wp\/v2\/comments?post=2547"}],"version-history":[{"count":0,"href":"https:\/\/ai.ziyuanzz.online\/index.php\/wp-json\/wp\/v2\/pages\/2547\/revisions"}],"wp:attachment":[{"href":"https:\/\/ai.ziyuanzz.online\/index.php\/wp-json\/wp\/v2\/media?parent=2547"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}