Tools Used in Natural Language Processing

Morphological Analysis

MeCab

  • https://taku910.github.io/mecab/
  • Designed as a general-purpose tool that does not depend on any particular language, dictionary, or corpus
  • Operates at very high speed
  • Outputs the analysis with the lowest total cost, where the cost combines how likely each word is to occur (generation cost) and how readily adjacent parts of speech connect (connection cost)
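
The cost minimization described above can be illustrated with a small Viterbi search over a word lattice. This is only a minimal sketch, not MeCab's implementation: the dictionary, the parts of speech, every cost value, and the names DICTIONARY, CONNECTION, DEFAULT_CONNECTION, and viterbi are hypothetical and chosen for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical toy dictionary: surface form -> [(part of speech, generation cost)].
DICTIONARY: Dict[str, List[Tuple[str, int]]] = {
    "くるま": [("noun", 3)],
    "くる": [("verb", 2)],
    "まで": [("particle", 1)],
    "で": [("particle", 1)],
    "まつ": [("verb", 2), ("noun", 5)],
}

# Hypothetical connection costs between adjacent parts of speech.
# BOS and EOS mark the beginning and end of the sentence.
CONNECTION: Dict[Tuple[str, str], int] = {
    ("BOS", "noun"): 1, ("BOS", "verb"): 2,
    ("noun", "particle"): 0, ("verb", "particle"): 2,
    ("particle", "verb"): 0, ("particle", "noun"): 1,
    ("noun", "EOS"): 0, ("verb", "EOS"): 0,
}
DEFAULT_CONNECTION = 4  # assumed cost for any pair not listed above


def viterbi(text: str) -> Tuple[int, List[Tuple[str, str]]]:
    """Return (total cost, [(surface, part of speech), ...]) of the cheapest path."""
    # best[i] maps the part of speech of the last word ending at position i
    # to the cheapest (cost, path) that reaches that state.
    best: List[Dict[str, Tuple[int, List[Tuple[str, str]]]]] = [
        {} for _ in range(len(text) + 1)
    ]
    best[0]["BOS"] = (0, [])
    for start in range(len(text)):
        if not best[start]:
            continue  # no word sequence ends exactly here
        for end in range(start + 1, len(text) + 1):
            surface = text[start:end]
            for pos, generation_cost in DICTIONARY.get(surface, []):
                for prev_pos, (prev_cost, prev_path) in best[start].items():
                    connection_cost = CONNECTION.get((prev_pos, pos), DEFAULT_CONNECTION)
                    cost = prev_cost + connection_cost + generation_cost
                    if pos not in best[end] or cost < best[end][pos][0]:
                        best[end][pos] = (cost, prev_path + [(surface, pos)])
    # Close every complete path with the connection to EOS and keep the cheapest.
    finals = [
        (cost + CONNECTION.get((pos, "EOS"), DEFAULT_CONNECTION), path)
        for pos, (cost, path) in best[len(text)].items()
    ]
    if not finals:
        raise ValueError("text cannot be segmented with this dictionary")
    return min(finals, key=lambda item: item[0])


cost, path = viterbi("くるまでまつ")
print(cost, path)
```

With these made-up costs, the reading くるま / で / まつ (total cost 7) beats くる / まで / まつ (total cost 9), because the noun-to-particle connection is cheaper than the verb-to-particle one.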

JUMAN

JUMAN++

Sudachi

Janome

  • https://mocobeta.github.io/janome/
  • A morphological analysis library written in pure Python that ships with a built-in dictionary
  • Has few dependencies but is still under development
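
A minimal usage sketch, assuming Janome has been installed (for example with pip install janome). The helper name tokenize_with_janome is hypothetical. Tokenizer, tokenize, and the token attributes surface, part_of_speech, and base_form come from Janome's documented API.

```python
from typing import List, Tuple


def tokenize_with_janome(text: str) -> List[Tuple[str, str, str]]:
    """Return (surface, part of speech, base form) for each token in text."""
    # Imported here so the sketch can be loaded even where Janome is not installed.
    from janome.tokenizer import Tokenizer

    tokenizer = Tokenizer()  # uses the dictionary bundled with Janome
    return [
        (token.surface, token.part_of_speech, token.base_form)
        for token in tokenizer.tokenize(text)
    ]
```

Calling tokenize_with_janome("すもももももももものうち") should yield one tuple per morpheme. Building a Tokenizer loads the dictionary, so a real program would create it once and reuse it.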

Dependency Parsing

CaboCha

Libraries

GiNZA

  • https://megagonlabs.github.io/ginza/
  • Released on April 2, 2019
  • Provides morphological analysis, dependency parsing, and word dependency structure analysis
  • Built on the spaCy framework and uses SudachiPy internally for tokenization
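
Because GiNZA is built on spaCy, it can be driven through the usual spaCy pipeline API. The following is a minimal sketch, assuming the ginza package and a model named ja_ginza are installed. The model name may differ between GiNZA versions, and the helper name parse_with_ginza is hypothetical.

```python
from typing import List, Tuple


def parse_with_ginza(text: str) -> List[Tuple[int, str, str, str, str, int]]:
    """Return (index, surface, lemma, POS, dependency label, head index) per token."""
    # Imported here so the sketch can be loaded even where spaCy is not installed.
    import spacy

    nlp = spacy.load("ja_ginza")  # assumed model name; requires the model package
    doc = nlp(text)
    return [
        (token.i, token.text, token.lemma_, token.pos_, token.dep_, token.head.i)
        for token in doc
    ]
```

Each tuple pairs a token with the index of its syntactic head, which is enough to reconstruct the dependency tree. Loading the model is slow, so a real program would load it once and reuse the nlp object.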