CC100
A frequency list built from the CC100 corpus of text from the Japanese internet. Formal words appear more common in this list. It contains about 160k words.
JPDB
A frequency dictionary made from JPDB, a site that has analyzed many light novels, visual novels, anime, and J-dramas. It contains about 183k words.
Frequencies for hiragana versions of kanji dictionary entries will be marked by ㋕. For example, if you search 成る, you will see frequencies for both なる and 成る.
Novels
A frequency list made from over 10,000 novels. It contains about 270k words.
BCCWJ
A frequency list created using data from the Balanced Corpus of Contemporary Written Japanese (BCCWJ). It contains about 536k words.
Aozora Bunko
A frequency dictionary created using data from Aozora Bunko. This dictionary does not cover words that contain kana, but it covers many rare 熟語 not covered by other frequency dictionaries, such as 睽乖. It contains about 120k words.
Innocent Ranked
A frequency list based on data from 5000+ novels. It contains about 285k words.
JPDB Kanji
Kanji frequency data from JPDB. It contains about 4k kanji.
Innocent Ranked Kanji
Kanji frequency based on data from 5000+ novels. It contains about 6k kanji.
Wikipedia Kanji
Kanji frequency based on Wikipedia pages. It contains about 20k kanji.
Aozora Bunko Kanji
Kanji frequency created using data from Aozora Bunko. It contains about 8k kanji.
What is a common word? (This does not apply to kanji frequency)
Very common: 1-10,000
Common: 10,001-20,000
Fairly common: 20,001-30,000
Kind of uncommon: 30,001-40,000
Uncommon: 40,001-50,000
Rare: 50,001-80,000
Natives-probably-don't-know-it-level: 80,001+
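The bands above amount to a simple lookup on a word's rank. A minimal sketch in Python, assuming the rank is a 1-based integer; the name `commonness` is hypothetical and is not part of this page's code. Rank 80,000 is treated as "Rare", and anything above it falls into the last band.

```python
import bisect

# Inclusive upper bound of each band (hypothetical helper, not this page's code).
_UPPER_BOUNDS = [10_000, 20_000, 30_000, 40_000, 50_000, 80_000]
_LABELS = [
    "Very common",
    "Common",
    "Fairly common",
    "Kind of uncommon",
    "Uncommon",
    "Rare",
    "Natives-probably-don't-know-it-level",
]


def commonness(rank: int) -> str:
    """Map a word's frequency rank to the band names listed above."""
    if rank < 1:
        raise ValueError("rank must be a positive integer")
    # bisect_left finds the first band whose upper bound is >= rank;
    # ranks above the last bound fall through to the final label.
    return _LABELS[bisect.bisect_left(_UPPER_BOUNDS, rank)]
```

For example, `commonness(10_000)` gives "Very common" and `commonness(10_001)` gives "Common". As noted in the heading, these bands are for word ranks, not kanji ranks.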
Why do some words/kanji have multiple frequencies within the same frequency list?
This usually happens as a result of some parsing oddities in the frequency list source. Almost always, you will want to be looking at the highest frequency for each list.
What do the カタカナ and ひらがな buttons do?
They convert between hiragana and katakana.
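This kind of conversion is simple because, in Unicode, each katakana from ァ to ヶ sits exactly 0x60 code points after its hiragana counterpart (ぁ to ゖ). A minimal sketch in Python, assuming that fixed offset is all that is needed; the function names are hypothetical and this is not necessarily how the buttons on this page are implemented.

```python
KANA_OFFSET = 0x60  # distance between ぁ (U+3041) and ァ (U+30A1)


def to_katakana(text: str) -> str:
    """Shift hiragana ぁ..ゖ up to katakana; leave everything else alone."""
    return "".join(
        chr(ord(ch) + KANA_OFFSET) if "ぁ" <= ch <= "ゖ" else ch
        for ch in text
    )


def to_hiragana(text: str) -> str:
    """Shift katakana ァ..ヶ down to hiragana; leave everything else alone."""
    return "".join(
        chr(ord(ch) - KANA_OFFSET) if "ァ" <= ch <= "ヶ" else ch
        for ch in text
    )
```

Kanji and the long vowel mark ー fall outside both ranges, so they pass through unchanged: `to_katakana("成る")` gives `成ル` and `to_hiragana("コーヒー")` gives `こーひー`.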
Why isn't romaji or kanji conversion available?
Kanji conversion is much harder because some kanji have multiple kana readings and some kana spellings correspond to multiple kanji.
Romaji conversion is easier than kanji conversion but not as straightforward as converting between hiragana and katakana. It may be added in the future.
Offline Use
This page can be downloaded for offline use in most browsers by pressing "Ctrl + S". The individual assets for this page are available at the GitHub repo.
Yomichan Dictionaries
These frequency lists are available as Yomichan dictionaries: JPDB, CC100, Novels, BCCWJ, Innocent Ranked, Aozora Bunko, JPDB Kanji, Innocent Ranked Kanji, Wikipedia Kanji, Aozora Bunko Kanji.