tag Package

Note

The first call to each class method may take some time (under 1 minute) while dictionaries are loaded. Subsequent calls should be faster.
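
One way to observe the loading cost described in the note is to time the same call twice. The following is a minimal sketch in Python 3; `timed_call` is a hypothetical helper name, not part of KoNLPy.

```python
import time


def timed_call(func, *args, **kwargs):
    """Call func(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed
```

With a tagger instance such as `kkma = Kkma()`, calling `timed_call(kkma.nouns, text)` twice with the same text should show a larger elapsed time for the first call, assuming the dictionaries are loaded lazily as the note suggests.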

Hannanum Class

Warning

Hannanum() is not supported on Windows 7 [#7].

class konlpy.tag._hannanum.Hannanum(jvmpath=None)

Wrapper for JHannanum.

JHannanum is a morphological analyzer and POS tagger written in Java, developed by the Semantic Web Research Center (SWRC) at KAIST since 1999.

from konlpy.tag import Hannanum

hannanum = Hannanum()
# print() with a single argument works on both Python 2 and Python 3
print(hannanum.analyze(u'롯데마트의 흑마늘 양념 치킨이 논란이 되고 있다.'))
print(hannanum.nouns(u'다람쥐 헌 쳇바퀴에 타고파'))
print(hannanum.pos(u'웃으면 더 행복합니다!'))
print(hannanum.morphs(u'웃으면 더 행복합니다!'))

Parameters: jvmpath – The path of the JVM passed to init_jvm().

analyze(phrase)

Phrase analyzer.

This analyzer returns various morphological candidates for each token. It consists of two parts: 1) Dictionary search (chart), 2) Unclassified term segmentation.
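
Because analyze() returns several candidates per token, a quick way to gauge ambiguity is to count them. The sketch below assumes the result is a list with one entry per token, each entry being a list of candidate analyses, and each candidate a list of (morpheme, tag) pairs. That shape is an assumption, and the sample data is hand-written with placeholder tags rather than real analyzer output.

```python
def candidates_per_token(analysis):
    """Return the number of candidate analyses for each token."""
    return [len(token_candidates) for token_candidates in analysis]


# Hand-written sample in the assumed shape; "A" and "B" are placeholder tags.
sample = [
    [[("흑마늘", "A")], [("흑마늘", "B")]],  # token 1: two candidates
    [[("양념", "A")]],                      # token 2: one candidate
]
print(candidates_per_token(sample))  # prints [2, 1]
```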

morphs(phrase)

Parse phrase to morphemes.

nouns(phrase)

Noun extractor.
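
A common use of the noun extractor is a simple frequency count over several texts. The following sketch assumes only that the tagger object exposes nouns(phrase) and that it returns a list of strings; `noun_frequencies` is a hypothetical helper, not part of KoNLPy.

```python
from collections import Counter


def noun_frequencies(tagger, texts):
    """Count how often each extracted noun occurs across texts."""
    counts = Counter()
    for text in texts:
        # tagger.nouns() is assumed to return a list of noun strings
        counts.update(tagger.nouns(text))
    return counts
```

For example, `noun_frequencies(Hannanum(), documents).most_common(10)` would list the ten most frequent nouns. Any of the tagger classes on this page that provides nouns() could be passed in the same way.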

pos(phrase, ntags=9, flatten=True)

POS tagger.

This tagger is HMM based, and calculates the probability of tags.

Parameters: ntags – The number of tags. It can be either 9 or 22.
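
Since ntags accepts only 9 or 22, it can help to check the value before calling the tagger. This is a minimal sketch; `pos_with_ntags` is a hypothetical wrapper, and how the library itself reacts to other values is not covered here.

```python
ALLOWED_NTAGS = (9, 22)


def pos_with_ntags(tagger, phrase, ntags=9):
    """Call tagger.pos() after checking that ntags is 9 or 22."""
    if ntags not in ALLOWED_NTAGS:
        raise ValueError("ntags must be 9 or 22, got %r" % (ntags,))
    return tagger.pos(phrase, ntags=ntags)
```

Usage would look like `pos_with_ntags(Hannanum(), u'웃으면 더 행복합니다!', ntags=22)`.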

Kkma Class

class konlpy.tag._kkma.Kkma(jvmpath=None)

Wrapper for Kkma.

Kkma is a morphological analyzer and natural language processing system written in Java, developed by the Intelligent Data Systems (IDS) Laboratory at SNU.

from konlpy.tag import Kkma

kkma = Kkma()
# print() with a single argument works on both Python 2 and Python 3
print(kkma.sentences(u'저는 대학생이구요. 소프트웨어 관련학과 입니다.'))
print(kkma.nouns(u'대학에서 DB, 통계학, 이산수학 등을 배웠지만...'))
print(kkma.morphs(u'자주 사용을 안하다보니 모두 까먹은 상태입니다.'))
print(kkma.pos(u'어쩌면 좋죠?'))

Parameters: jvmpath – The path of the JVM passed to init_jvm().

morphs(phrase)

Parse phrase to morphemes.

nouns(phrase)

Noun extractor.

pos(phrase, flatten=True)

POS tagger.
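
The flatten parameter is not described on this page. The sketch below assumes that flatten=False returns one list of (morpheme, tag) pairs per space-separated word (eojeol), and that flatten=True returns the same pairs as a single flat list. The data is hand-written with a placeholder tag, not real tagger output.

```python
from itertools import chain


def flatten_eojeols(grouped):
    """Merge per-word lists of (morpheme, tag) pairs into one flat list."""
    return list(chain.from_iterable(grouped))


# Assumed shape of a flatten=False result; "TAG" is a placeholder.
grouped = [
    [("어쩌면", "TAG")],
    [("좋", "TAG"), ("죠", "TAG"), ("?", "TAG")],
]
print(flatten_eojeols(grouped))
```

Keeping the grouped form is useful when later steps need to know which morphemes came from the same word.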

sentences(phrase)

Sentence detection.
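
Sentence detection can be combined with tagging to process a longer text one sentence at a time. This sketch assumes that sentences() returns a list of sentence strings and that the same object also provides pos(); `pos_by_sentence` is a hypothetical helper.

```python
def pos_by_sentence(tagger, text):
    """Split text into sentences and tag each sentence separately.

    Returns a list of (sentence, tagged_morphemes) pairs.
    """
    return [(sentence, tagger.pos(sentence))
            for sentence in tagger.sentences(text)]
```

With `kkma = Kkma()`, `pos_by_sentence(kkma, text)` would yield one entry per detected sentence.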

Komoran Class

Warning

Komoran() is not supported on Python 2 + Mac OS [#40].

class konlpy.tag._komoran.Komoran(jvmpath=None, dicpath=None)

Wrapper for KOMORAN.

KOMORAN is a relatively new open source Korean morphological analyzer written in Java, developed by Shineware since 2013.

from konlpy.tag import Komoran

komoran = Komoran()
# print() with a single argument works on both Python 2 and Python 3
print(komoran.pos(u'우왕 코모란도 오픈소스가 되었어요'))

Parameters:
  • jvmpath – The path of the JVM passed to init_jvm().
  • dicpath – The path of dictionary files. The KOMORAN system dictionary is loaded by default.
nouns(phrase)

Noun extractor.

pos(phrase, flatten=True)

POS tagger.

Mecab Class

Warning

Mecab() is not supported on Windows 7.

class konlpy.tag._mecab.Mecab(dicpath='/usr/local/lib/mecab/dic/mecab-ko-dic')

Wrapper for MeCab-ko morphological analyzer.

MeCab, originally a Japanese morphological analyzer and POS tagger developed by the Graduate School of Informatics at Kyoto University, was adapted to the Korean language as MeCab-ko by the Eunjeon Project.

In order to use MeCab-ko within KoNLPy, follow the directions in optional-installations.

from konlpy.tag import Mecab
# MeCab installation needed

mecab = Mecab()
# print() with a single argument works on both Python 2 and Python 3
print(mecab.pos(u'자연주의 쇼핑몰은 어떤 곳인가?'))
print(mecab.morphs(u'영등포구청역에 있는 맛집 좀 알려주세요.'))
print(mecab.nouns(u'우리나라에는 무릎 치료를 잘하는 정형외과가 없는가!'))

Parameters: dicpath – The path of the MeCab-ko dictionary.

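
When the dictionary is not installed at the default location, it may be convenient to look for it before constructing the tagger. The following is a rough sketch: `first_existing_dir` is a hypothetical helper, and any candidate path other than the default shown in the class signature would have to be supplied by the reader.

```python
import os

# Default taken from the Mecab class signature on this page.
DEFAULT_DICPATH = "/usr/local/lib/mecab/dic/mecab-ko-dic"


def first_existing_dir(candidates):
    """Return the first candidate that is an existing directory, else None."""
    for path in candidates:
        if os.path.isdir(path):
            return path
    return None
```

The result could then be passed on as `Mecab(dicpath=path)` when it is not None, with a clearer error message raised otherwise.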
morphs(phrase)

Parse phrase to morphemes.

nouns(phrase)

Noun extractor.

pos(phrase, flatten=True)

POS tagger.

See also

Korean POS tags comparison chart

Compare POS tags between several Korean analytic projects. (In Korean)