Word Frequency Counter
Count the frequency of each word in your text. Find the top N most common words, analyze bigrams, exclude stopwords, measure lexical density, and check the distribution against Zipf's Law.
How to Use This Calculator
- Paste your text into the Input Text field.
- Set Show Top N Words to control how many results you see (default 20).
- The calculator shows the most frequent words with their counts, unique word count, and lexical density.
- Use the No Stopwords tab to remove common words (the, a, is) and see only content words.
- Use the Bigrams tab to find common two-word phrases.
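The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual implementation; the tokenization rule (lowercase, keep letters and apostrophes) is an assumption, since the exact rules are not specified.

```python
import re
from collections import Counter

def word_frequencies(text: str) -> Counter:
    """Lowercase the text, strip punctuation, and count each token."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)

# Show Top N Words (here N = 3)
freqs = word_frequencies("The cat sat on the mat. The cat was fat.")
print(freqs.most_common(3))  # "the" leads with 3, "cat" follows with 2
```

`Counter.most_common(n)` gives exactly the "Top N Words" view; ties below the cutoff are ordered by first occurrence.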
Formula
Word Frequency: Count occurrences of each unique token after lowercasing and removing punctuation.
Lexical Density % = (Unique Words ÷ Total Words) × 100
Zipf Check = Frequency of rank-1 word ÷ Frequency of rank-2 word (expect ≈ 2.0 for natural text)
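Both formulas translate directly into code. The sketch below uses a simplified tokenizer (an assumption; the calculator's exact rules are not specified) and runs on the example sentence from the next section.

```python
import re
from collections import Counter

def lexical_stats(text: str) -> tuple[float, float]:
    """Return (lexical density %, Zipf check ratio) for a text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    density = len(counts) / len(tokens) * 100        # unique ÷ total × 100
    top_two = [c for _, c in counts.most_common(2)]
    zipf = top_two[0] / top_two[1]                   # rank-1 freq ÷ rank-2 freq
    return density, zipf

density, zipf = lexical_stats("The cat sat on the mat. The cat was fat.")
print(round(density), round(zipf, 1))  # 70 1.5
```

Note that on a text this short the Zipf ratio (1.5) falls below the ≈ 2.0 expected for natural text; the law only emerges over large corpora.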
Example
"The cat sat on the mat. The cat was fat." → Top words: the (3), cat (2), sat (1), on (1), mat (1), was (1), fat (1). Unique: 7, Total: 10, Lexical Density: 70%.
Frequently Asked Questions
- What is the difference between word count and word frequency? Word count tells you the total number of words in a text. Word frequency tells you how often each individual word appears. For example, in "the cat sat on the mat", the total word count is 6, but the word "the" has a frequency of 2.
- What is lexical density? Lexical density is the proportion of unique words (lexical items) to the total number of words, expressed as a percentage. Higher lexical density (above 60%) indicates varied vocabulary. Academic writing tends to have a lexical density around 40–55%, while conversational speech is typically lower, at 35–45%.
- What is Zipf's Law? Zipf's Law states that in any natural language corpus, the frequency of a word is inversely proportional to its rank. The most common word appears roughly twice as often as the second most common, three times as often as the third, and so on. This remarkably consistent pattern was described by linguist George Zipf in his 1949 book Human Behavior and the Principle of Least Effort.
- What is a bigram? A bigram is a sequence of two consecutive words. Bigram analysis reveals common phrases and collocations (words that frequently appear together). For example, a bigram analysis of news articles might show "prime minister", "climate change", or "stock market" as high-frequency pairs.
- What are stopwords? Stopwords are common function words that carry little lexical meaning — articles (a, the), prepositions (in, on, at), conjunctions (and, but, or), and pronouns (I, you, he). They are often removed before word frequency analysis to surface the meaningful content words.
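The bigram and stopword ideas from the FAQ combine naturally in one sketch. The stopword set below is a small illustrative subset, not the calculator's actual list, and the tokenizer is the same simplifying assumption as above.

```python
import re
from collections import Counter

# Illustrative subset only; real stopword lists run to 100+ entries.
STOPWORDS = {"the", "a", "an", "is", "was", "on", "in", "at", "and", "but", "or"}

def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z']+", text.lower())

def bigrams(text: str) -> Counter:
    """Count every pair of consecutive words."""
    tokens = tokenize(text)
    return Counter(zip(tokens, tokens[1:]))

def content_words(text: str) -> Counter:
    """Count words with stopwords removed (the 'No Stopwords' view)."""
    return Counter(t for t in tokenize(text) if t not in STOPWORDS)

text = "The cat sat on the mat. The cat was fat."
print(bigrams(text).most_common(2))  # ('the', 'cat') is the top bigram, count 2
print(content_words(text))           # only cat, sat, mat, fat remain
```

With stopwords removed, only the content words survive, which is why the "No Stopwords" view surfaces a text's actual subject matter.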
Sources & References
- Human Behavior and the Principle of Least Effort (Zipf, 1949) — George Kingsley Zipf / Harvard University Press
- Stanford NLP Group — Word Frequency Analysis — Stanford NLP
- NLTK — Natural Language Toolkit Documentation — NLTK Project
- Google Books Ngram Viewer — Google
- Brown Corpus — Standard Reference Corpus — NLTK / Brown University