WWWJDIC - a feature-rich WWW-based Japanese Dictionary

The WWWJDIC WWW-based Japanese dictionary[2] is an evolving multi-feature dictionary service based on free and public dictionary files. It is widely used in Japanese-language education and has a number of functions designed specifically to aid language learners. The main server is at Monash University in Australia, and there are five mirror sites: two in Europe, two in North America and one in Japan. Usage currently runs at several hundred thousand accesses per day.

The dictionary service has been developed in an attempt to give expression to concepts of "tomorrow's dictionary"[1] by providing a wide range of configurable features and options. These go well beyond the common commercial dictionary services, which are based on access to copies of published bilingual dictionaries.[5,6,7]

The dictionary files used by the server are:

  1. the JMdict/EDICT Japanese-English dictionary[3], which has about 140,000 entries;
  2. the ENAMDICT dictionary of named entities, which has over 700,000 entries;
  3. the KANJIDIC kanji (Chinese character) dictionary[4], which has detailed information on over 12,000 characters;
  4. a collection of glossary files in fields such as life sciences, law, engineering, Buddhism and business.

Entries in the dictionaries can be accessed by the Japanese headwords (either the kanji form or the reading/pronunciation) or by words in the glosses (Fig. i). The kanji dictionary can be accessed via a variety of methods, including the traditional radical/stroke-count and four-corner techniques, the character pronunciations, the character meanings and various dictionary indices. A multi-component index based on the visual elements in the characters is particularly effective and popular. An external handwriting interface can also be used. The dictionaries are integrated so that a user who has found a particular character can display word entries containing that character, and a user who has selected a word can examine the details of its constituent characters.

elex1fig1.gif
Fig. i: Example of dictionary word display
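
The multi-component kanji lookup described above can be pictured as a set intersection over an inverted index. The following is a minimal sketch of that idea only, not WWWJDIC's actual implementation; the component decompositions, the names KANJI_COMPONENTS, build_inverted_index and lookup, and the five sample kanji are all assumptions made for illustration.

```python
# Toy data: each kanji maps to the set of visual components it contains.
# These decompositions are illustrative and are NOT taken from WWWJDIC's files.
KANJI_COMPONENTS = {
    "明": {"日", "月"},
    "時": {"日", "土", "寸"},
    "寺": {"土", "寸"},
    "朝": {"十", "日", "月"},
    "林": {"木"},
}


def build_inverted_index(components_by_kanji):
    """Map each component to the set of kanji that contain it."""
    index = {}
    for kanji, components in components_by_kanji.items():
        for component in components:
            index.setdefault(component, set()).add(kanji)
    return index


def lookup(index, selected):
    """Return the kanji that contain every selected component."""
    selected = list(selected)
    if not selected:
        return set()
    # Start from the first component's kanji and narrow with each further one.
    result = set(index.get(selected[0], set()))
    for component in selected[1:]:
        result &= index.get(component, set())
    return result


INDEX = build_inverted_index(KANJI_COMPONENTS)
candidates = lookup(INDEX, ["日", "月"])  # kanji containing both components
```

Each additional component the user selects can only shrink the candidate set, which is why this style of lookup narrows quickly even when the user does not know the traditional radical.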

One function of the service commonly used by translators is a text-glossing capability in which Japanese text is segmented and matched with dictionary entries (Fig. ii). The segmentation and matching use a combination of most of the dictionary files, and allow inflected forms of verbs and adjectives to be aligned with their dictionary forms.

elex1fig2.gif
Fig. ii: Example of text-glossing function
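
To make the text-glossing idea concrete, the sketch below segments a sentence by greedy longest match and maps inflected forms back to dictionary forms. It is a simplified illustration under stated assumptions: the source does not describe the actual segmentation algorithm, so greedy longest match is a stand-in, and LEXICON, DEINFLECT_RULES, dictionary_forms and gloss are hypothetical names with toy data rather than real EDICT content.

```python
# Tiny illustrative lexicon: dictionary form -> gloss (not real EDICT data).
LEXICON = {
    "私": "I; me",
    "は": "(topic marker)",
    "本": "book",
    "を": "(object marker)",
    "読む": "to read",
    "食べる": "to eat",
}

# A few hypothetical de-inflection rules: (inflected ending, dictionary ending).
DEINFLECT_RULES = [
    ("みました", "む"),  # polite past of a godan verb ending in -mu
    ("んだ", "む"),      # plain past of a godan verb ending in -mu
    ("ました", "る"),    # polite past of an ichidan verb
    ("た", "る"),        # plain past of an ichidan verb
]


def dictionary_forms(surface):
    """Yield the surface string itself, then any de-inflected candidates."""
    yield surface
    for inflected, base in DEINFLECT_RULES:
        if surface.endswith(inflected):
            yield surface[: len(surface) - len(inflected)] + base


def gloss(text, max_len=8):
    """Greedy longest-match segmentation.

    Returns a list of (surface, dictionary form, gloss) triples; characters
    that match nothing are emitted singly with None in the last two slots.
    """
    out = []
    pos = 0
    while pos < len(text):
        for length in range(min(max_len, len(text) - pos), 0, -1):
            surface = text[pos:pos + length]
            form = next((f for f in dictionary_forms(surface) if f in LEXICON), None)
            if form is not None:
                out.append((surface, form, LEXICON[form]))
                pos += length
                break
        else:
            out.append((text[pos], None, None))
            pos += 1
    return out


segments = gloss("私は本を読みました")
```

Here the inflected form 読みました is aligned with the dictionary form 読む, which is the behaviour the text describes; a production system would need a far larger rule set and lexicon, and some way of resolving ambiguous segmentations.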

Aspects of WWWJDIC's service which are of particular interest in computer-assisted language learning (CALL) are:

  1. the option of displaying a table of conjugations for any of the verbs or verbal nouns in the dictionary (approx. 17,000 entries) (Fig. iii);
  2. animated stroke-order diagrams for the 2,000 most common kanji (Fig. iv);
  3. links at the entry level to the Tanaka Corpus of 150,000 Japanese-English sentence pairs (Fig. v), which can also be searched independently;
  4. sound clips of the Japanese pronunciation of almost all EDICT entries.

elex1fig3.gif
Fig. iii: Example of verb conjugation table

elex1fig4.gif
Fig. iv: Example of animated stroke order display

elex1fig5.gif
Fig. v: Example sentences linked to the バス停 entry
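
As an illustration of how a conjugation table of the kind shown in Fig. iii might be generated from a dictionary form, the sketch below applies the standard regular patterns for ichidan and godan verbs. It is not WWWJDIC's code: the function name conjugate, the GODAN table layout and the choice of five forms are assumptions, and irregular verbs (for example する, 来る, and the te-form of 行く) are deliberately not handled.

```python
# Regular godan patterns, keyed by the verb's final kana:
# (i-row kana, a-row kana, te-form ending, plain-past ending)
GODAN = {
    "う": ("い", "わ", "って", "った"),
    "く": ("き", "か", "いて", "いた"),
    "ぐ": ("ぎ", "が", "いで", "いだ"),
    "す": ("し", "さ", "して", "した"),
    "つ": ("ち", "た", "って", "った"),
    "ぬ": ("に", "な", "んで", "んだ"),
    "ぶ": ("び", "ば", "んで", "んだ"),
    "む": ("み", "ま", "んで", "んだ"),
    "る": ("り", "ら", "って", "った"),
}


def conjugate(verb, verb_class):
    """Return a small conjugation table for a regular verb.

    verb_class must be "ichidan" or "godan"; the caller supplies it because
    the class cannot be derived reliably from the spelling alone.
    """
    stem, last = verb[:-1], verb[-1]
    if verb_class == "ichidan":
        if last != "る":
            raise ValueError("ichidan verbs end in る")
        return {
            "plain": verb,
            "polite": stem + "ます",
            "negative": stem + "ない",
            "te-form": stem + "て",
            "past": stem + "た",
        }
    if verb_class == "godan":
        if last not in GODAN:
            raise ValueError("unsupported godan ending: " + last)
        i_row, a_row, te, past = GODAN[last]
        return {
            "plain": verb,
            "polite": stem + i_row + "ます",
            "negative": stem + a_row + "ない",
            "te-form": stem + te,
            "past": stem + past,
        }
    raise ValueError("unknown verb class: " + verb_class)


table = conjugate("読む", "godan")
```

In a dictionary such as EDICT the verb class is recorded with each entry as a part-of-speech tag, which is what makes table generation of this kind feasible for thousands of entries.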

Other features of the service are:

  1. a configurable interface enabling users to structure the display and enable or disable options to suit their needs;
  2. multi-lingual operation: at present the main operating pages are available in either English or Japanese, other languages can be added by extending the catalogue files, and a French interface is in preparation;
  3. a restricted interface tailored for use with Japanese mobile telephones;
  4. links from each entry to a range of online dictionaries, search engines, Japanese Wikipedia entries, the Japanese WordNet, etc.;
  5. an edit interface enabling users to provide suggestions and amendments to dictionary entries or to propose new entries;
  6. an API enabling access from software and servers.

Although most of the dictionary files used are Japanese-English, the service also includes the major WaDokuJT Japanese-German dictionary and smaller Japanese-French, Japanese-Spanish, Japanese-Swedish, Japanese-Hungarian and Japanese-Dutch files.

References

  1. B.T.S. Atkins, Bilingual Dictionaries - Past, Present and Future, in "Lexicography and Natural Language Processing - A Festschrift in Honour of B.T.S. Atkins", Euralex, 2002.

  2. James Breen, A WWW Japanese Dictionary, in "Language Teaching at the Crossroads", JSC Working Paper No. 13, Monash Asia Institute, Monash University Press, 2003. http://www.wwwjdic.net/

  3. James Breen, JMdict: a Japanese-Multilingual Dictionary, COLING Multilingual Linguistic Resources Workshop, Geneva, 2004.

  4. James Breen, Multiple Indexing in an Electronic Kanji Dictionary, COLING Enhancing and Using Electronic Dictionaries Workshop, Geneva, 2004.

  5. Kenkyusha Ltd., Kenkyusha Online Dictionary. http://kod.kenkyusha.co.jp/service/

  6. NTT Resonant Inc., Goo Jisho. http://dictionary.goo.ne.jp/

  7. Yahoo Japan Corporation, Yahoo Jisho. http://dic.yahoo.co.jp/