
How to create a dictionary for Spell Checker

IdeaSpellChecker is powered by the Jazzy spell checker (http://jazzy.sourceforge.net), which can import ASpell (GNU spell checker, http://aspell.net/) dictionaries. ASpell offers many dictionaries (ftp://ftp.gnu.org/gnu/aspell/dict/0index.html); this guide shows how to convert the Russian dictionary into an IdeaSpellChecker plugin.

Install ASpell

First you need to install ASpell. There are different ways to do that: you may download the sources and compile them yourself, or get binaries. I followed this guide http://docs.moodle.org/en/Configuring_aspell_on_Mac_OS_X and installed it on Mac OS X without any problems.
With MacPorts available, open a terminal and type the following to install the Russian dictionary:

dubik$ sudo port install aspell-dict-ru

You should see something similar to this:

Last login: Sat Nov 24 12:21:57 on ttyp1
Welcome to Darwin!
sergiy-duboviks-computer:~ dubik$ sudo port install aspell-dict-ru
---> Fetching aspell-dict-ru
---> Attempting to fetch aspell6-ru-0.99f7-1.tar.bz2 from http://ftp.gnu.org/gnu/aspell/dict/ru
---> Verifying checksum(s) for aspell-dict-ru
---> Extracting aspell-dict-ru
---> Configuring aspell-dict-ru
---> Building aspell-dict-ru with target all
---> Staging aspell-dict-ru into destroot
---> Installing aspell-dict-ru 0.99f7_0
---> Activating aspell-dict-ru 0.99f7_0
---> Cleaning aspell-dict-ru
sergiy-duboviks-computer:~ dubik$

Add Dictionary to ASpell

Now you need to dump the installed dictionary into a word list file, that is, a file that can be parsed by Jazzy (see the SpellDictionaryASpell class and its documentation).

To list the available dictionaries, type in a terminal:

dubik$ aspell help
.....
Available Dictionaries:
Dictionaries can be selected directly via the "-d" or "master"
option. They can also be selected indirectly via the "lang",
"variety", and "size" options.

de
de_AT
de_CH
de_DE
en
en-variant_0
en-variant_1
en-variant_2
en-w_accents
en-wo_accents
en_CA
en_CA-w_accents
en_CA-wo_accents
en_GB
en_GB-ise
en_GB-ise-w_accents
en_GB-ise-wo_accents
en_GB-ize
en_GB-ize-w_accents
en_GB-ize-wo_accents
en_GB-w_accents
en_GB-wo_accents
en_US
en_US-w_accents
en_US-wo_accents
ru
ru-ye
ru-yeyo
ru-yo
.....

We will use the ru dictionary. Now let's continue with dumping (I created the /Users/dubik/Tools/ASpellDict directory for the output):

dubik$ aspell --master=ru dump master > russian.0
dubik$ ls -l
total 3376
-rw-r--r-- 1 dubik dubik 1725925 Nov 10 19:51 russian.0

Dictionary Plugin

Open IntelliJ IDEA and create a new plugin project. The module name is spellchecker-dict-russian. You also need the spellchecker plugin sources; check them out from https://idea-spellchecker.googlecode.com/svn/trunk. Make spellchecker-dict-russian depend on spellchecker.

Here is what plugin.xml should look like:

plugin.xml

Create the package org.intellij.spellchecker and create the SpellCheckerRussianDictionary class there.

SpellCheckerRussianDictionary.java

As you can see, the dictionary loads the word list file from the classpath at /dict/russian.0. So create a /dict package and copy russian.0 into it from ../Tools/ASpellDict.
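
For readers who want to check that the copied file is actually visible on the classpath, here is a minimal sketch of reading a word list resource line by line. It is not the plugin's SpellCheckerRussianDictionary class and does not use any Jazzy API; the class name WordListLoader and both method names are hypothetical, and the charset is left as a parameter because the encoding of your dump depends on how ASpell was run.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
class WordListLoader {

    // Reads one word per line, trimming whitespace and skipping blank lines.
    static List<String> readWords(Reader source) {
        List<String> words = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String word = line.trim();
                if (!word.isEmpty()) {
                    words.add(word);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return words;
    }

    // Opens a classpath resource such as "/dict/russian.0" (leading slash =
    // absolute path). The charset is an assumption the caller must supply.
    static List<String> readWordsFromClasspath(String resourcePath, Charset charset) {
        InputStream in = WordListLoader.class.getResourceAsStream(resourcePath);
        if (in == null) {
            throw new IllegalArgumentException("Not on classpath: " + resourcePath);
        }
        return readWords(new InputStreamReader(in, charset));
    }

    public static void main(String[] args) {
        // Small in-memory demo so the sketch runs without any resource file.
        System.out.println(readWords(new StringReader("alpha\n\n beta \n"))); // prints [alpha, beta]
    }
}
```

If your dump happens to be UTF-8, a call such as readWordsFromClasspath("/dict/russian.0", StandardCharsets.UTF_8) should return the words; an IllegalArgumentException means the file did not end up in the packaged /dict directory.
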
For some reason unknown to me, ASpell adds non-text characters to the end of each word in the word file, so I had to remove them using a regular expression. You might want to check the content of your dictionary file as well. Now you can compile and package your dictionary; it's ready.
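
The cleanup with a regular expression mentioned above could look roughly like the following sketch. The exact characters ASpell appends are not specified here, so the rule is an assumption: everything from the first character that is not a Unicode letter, apostrophe, or hyphen is dropped. The class name WordListCleaner and its methods are hypothetical; adjust the pattern after inspecting your own russian.0.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
class WordListCleaner {

    // Assumed rule: the word ends at the first character that is not a
    // Unicode letter (\p{L} covers Cyrillic), an apostrophe, or a hyphen.
    private static final Pattern TRAILING_JUNK = Pattern.compile("[^\\p{L}'-].*$");

    // Returns the line with surrounding whitespace and trailing junk removed.
    static String cleanWord(String line) {
        return TRAILING_JUNK.matcher(line.trim()).replaceFirst("");
    }

    // Cleans every line and drops entries that end up empty.
    static List<String> cleanAll(List<String> lines) {
        List<String> result = new ArrayList<>();
        for (String line : lines) {
            String word = cleanWord(line);
            if (!word.isEmpty()) {
                result.add(word);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList("word/AB", "semi-formal", "  ", "don't\u0001");
        System.out.println(cleanAll(raw)); // prints [word, semi-formal, don't]
    }
}
```

Running the cleaned list back through a quick visual check is still worthwhile, since an overly strict pattern would silently truncate legitimate words.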

2 Comments

  1. Why does a new dictionary require a new plugin? Although there's some advantage for the hosting & installation through the plugin manager, it seems a bit like overkill.

    Why not just load dictionary-files e.g. from %IDEA_OPTIONS%/dictionaries?

    1. As you said, it's easy to install a dictionary from the plugin manager. Why do you think it's overkill?

      We are not loading from %IDEA_OPTIONS%/dictionaries because we would need to instruct users how to download them, and where to find %IDEA_OPTIONS%. It's not hard to wrap dictionaries.