NLTK(Natural Language Toolkit)
A띠
2024. 9. 1. 20:46
NLTK (Natural Language Toolkit) is a Python package for natural language processing and document analysis.
https://www.nltk.org/install.html
* Tokenizing: splitting text into tokens, the smallest meaningful units of language. It is a preprocessing step.
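To make the idea concrete before bringing in NLTK, here is a minimal sketch using only the standard library. The function name `simple_tokenize` and the sample sentence are made up for illustration; the regular expression is, to my understanding, the same pattern that NLTK's `WordPunctTokenizer` is built on.

```python
import re


def simple_tokenize(text):
    """Split text into runs of word characters and runs of punctuation.

    Hypothetical helper for illustration only; it is not part of NLTK.
    """
    # \w+      matches a run of letters, digits, or underscores
    # [^\w\s]+ matches a run of characters that are neither word chars nor whitespace
    return re.findall(r"\w+|[^\w\s]+", text)


tokens = simple_tokenize("Hello, world! It's 2024.")
print(tokens)
# ['Hello', ',', 'world', '!', 'It', "'", 's', '2024', '.']
```

Whitespace is discarded and each punctuation run becomes its own token, which is why the apostrophe in "It's" ends up alone. Real tokenizers differ mainly in how they treat cases like this.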
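# Install NLTK (the leading "!" runs a shell command from a Jupyter/Colab cell)
!pip install nltk

import nltk
# Three tokenizers to compare: two rule-based classes and the word_tokenize helper
from nltk.tokenize import TreebankWordTokenizer, WordPunctTokenizer, word_tokenize

# word_tokenize needs the Punkt sentence tokenizer data.
# Newer NLTK releases (3.9 and later, to my knowledge) look for 'punkt_tab' instead,
# so download that resource if word_tokenize raises a LookupError.
nltk.download('punkt')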
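The imports above can be tried out roughly as follows. This is a small sketch rather than the post's own example: the sample sentence is my own, and only the two tokenizer classes that need no downloaded data are called.

```python
from nltk.tokenize import TreebankWordTokenizer, WordPunctTokenizer

# Sample sentence chosen for illustration; it is not from the original post.
text = "Don't hesitate to ask questions."

# WordPunctTokenizer splits on every run of punctuation,
# so the apostrophe becomes a token of its own.
word_punct_tokens = WordPunctTokenizer().tokenize(text)
print(word_punct_tokens)
# ['Don', "'", 't', 'hesitate', 'to', 'ask', 'questions', '.']

# TreebankWordTokenizer follows Penn Treebank conventions
# and keeps the contraction "n't" together.
treebank_tokens = TreebankWordTokenizer().tokenize(text)
print(treebank_tokens)
# ['Do', "n't", 'hesitate', 'to', 'ask', 'questions', '.']
```

`word_tokenize(text)` takes the same string, but it first splits the text into sentences with the Punkt data downloaded above, so it fails with a `LookupError` if that resource is missing. Its word-level output generally follows Treebank-style conventions.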