Word Frequency Counter
Rank words by frequency with stop-word filtering and CSV export.
Written by Golam Rabbani, Founder & Lead Engineer
How to use this word frequency counter
- Paste your text into the "Your text" field.
- Pick a sort order — frequency (most common first), alphabetical, or by word length.
- Optionally raise the minimum word length and toggle "Exclude common English stop words" to drop words such as "the", "and", and "of".
- Press Analyze to render the table. Each row shows the word, its count, and its share of the tokens that remain after filtering.
- Press Copy CSV to copy the full table as CSV, or Reset to start over.
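The three sort orders in the steps above can be thought of as comparators over (word, count) rows. A minimal TypeScript sketch follows; `sortRows` and `SortOrder` are hypothetical names, and the tie-breaking rules and the longest-first direction for word length are assumptions, since the page does not specify them.

```typescript
type Row = [string, number]; // [word, count]
type SortOrder = "frequency" | "alphabetical" | "length";

// Returns a sorted copy; the input array is left untouched.
function sortRows(rows: Row[], order: SortOrder): Row[] {
  const sorted = rows.slice();
  switch (order) {
    case "frequency":
      // Most common first; ties broken alphabetically (assumed).
      sorted.sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]));
      break;
    case "alphabetical":
      sorted.sort((a, b) => a[0].localeCompare(b[0]));
      break;
    case "length":
      // Longest word first (assumed direction); ties broken alphabetically.
      sorted.sort((a, b) => b[0].length - a[0].length || a[0].localeCompare(b[0]));
      break;
  }
  return sorted;
}
```

Keeping each rule as a separate comparator means a new sort order only needs one more `case`.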
About this word frequency counter
The frequency counter tokenizes your text by extracting every run of letters, digits, or apostrophes, so contractions like `don't` stay intact. Each token is lowercased if the case-insensitive option is on, dropped if it is shorter than the minimum length, and, when the stop-word filter is on, dropped if it appears in a bundled list of ~80 common English stop words. Remaining tokens are tallied in a `Map<string, number>`, then sorted by your chosen rule. The resulting table renders up to 200 rows at a time and exports the full list as CSV.
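As a rough illustration of the pipeline described above, the TypeScript sketch below tokenizes, filters, and tallies text. It is not the tool's actual source: the `countWords` function, the `CountOptions` shape, the shortened `STOP_WORDS` set, and the ASCII-only token pattern are all assumptions made for the example.

```typescript
// Hypothetical subset of the bundled stop-word list (the real list has ~80 entries).
const STOP_WORDS = new Set<string>([
  "a", "an", "the", "of", "in", "on", "and", "but", "is", "was",
]);

interface CountOptions {
  caseInsensitive: boolean;
  minLength: number;
  excludeStopWords: boolean;
}

function countWords(text: string, opts: CountOptions): Map<string, number> {
  // A token is any run of letters, digits, or apostrophes.
  // Assumption: ASCII letters only; handling of accented letters is not specified.
  const tokens: string[] = text.match(/[A-Za-z0-9']+/g) ?? [];
  const counts = new Map<string, number>();

  for (const raw of tokens) {
    const token = opts.caseInsensitive ? raw.toLowerCase() : raw;
    if (token.length < opts.minLength) continue;
    // Assumption: stop words match regardless of capitalization.
    if (opts.excludeStopWords && STOP_WORDS.has(token.toLowerCase())) continue;
    counts.set(token, (counts.get(token) ?? 0) + 1);
  }
  return counts;
}
```

Because a `Map` iterates in insertion order, converting it to an array and applying a stable sort keeps tied words in order of first appearance.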
A concrete example. Paste:
`The rain in Spain falls mainly on the plain. The plain stays the same.`
With minimum length 1, case-insensitive matching, and no stop-word filter, you get 14 tokens in total. Sorted by frequency, the top rows are: `the (4, 28.57%)`, `plain (2, 14.29%)`, `in (1, 7.14%)`, `spain (1, 7.14%)`, `rain (1, 7.14%)`, and so on. Enable the stop-word filter and `the`, `in`, and `on` drop out, leaving 8 tokens, with `plain` the most common at a 25% share (2 of 8) of the filtered corpus.
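The percentages above follow from dividing each count by the number of surviving tokens. The sketch below reproduces that arithmetic; `withShares` and the hard-coded `exampleCounts` array are illustrative names, and rounding to two decimals with `toFixed` is an assumption about how the tool formats shares.

```typescript
// Counts from the example sentence (case insensitive, minimum length 1),
// listed here in order of frequency, then first appearance.
const exampleCounts: [string, number][] = [
  ["the", 4], ["plain", 2], ["rain", 1], ["in", 1], ["spain", 1],
  ["falls", 1], ["mainly", 1], ["on", 1], ["stays", 1], ["same", 1],
];

interface ShareRow {
  word: string;
  count: number;
  share: string;
}

// Share = count / total surviving tokens, formatted as a percentage.
function withShares(rows: [string, number][]): ShareRow[] {
  const total = rows.reduce((sum, row) => sum + row[1], 0);
  return rows.map(([word, count]) => ({
    word,
    count,
    share: total === 0 ? "0.00%" : ((count / total) * 100).toFixed(2) + "%",
  }));
}

// The three stop words that occur in the example sentence.
const stopWordsInExample = ["the", "in", "on"];
const filteredCounts = exampleCounts.filter(
  ([word]) => stopWordsInExample.indexOf(word) === -1,
);

console.log(withShares(exampleCounts)[0]);  // the: 4 of 14 tokens, "28.57%"
console.log(withShares(filteredCounts)[0]); // plain: 2 of 8 tokens, "25.00%"
```

Note that the denominator changes from 14 to 8 once the filter is on, which is why the share for `plain` rises from 14.29% to 25%.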
This is useful for SEO content auditing, spotting repetitive word choices in essays, summarizing customer-feedback datasets, and seeding word clouds. Everything runs locally in your browser.
FAQ
- What counts as one word?
- A word is any uninterrupted run of letters, digits, or apostrophes — so "don't", "iPhone", and "2024" each count as a single word. Punctuation, spaces, and other symbols act as separators.
- How does the case-insensitive option work?
- When enabled, every token is lowercased before counting, so "The" and "the" merge into one row. When disabled, each capitalization variant is counted separately.
- What words are in the stop-word list?
- The bundled list contains about 80 of the most common English function words — articles, prepositions, conjunctions, pronouns, and auxiliary verbs (a, an, the, of, in, on, and, but, is, was, …). The list is fixed and is not customizable.
- Why is there a 200-row display limit?
- Rendering thousands of table rows at once hurts scroll performance and is rarely useful. The top 200 rows cover the most informative entries for nearly all real-world documents. Use Copy CSV to get the complete list.
- Are the percentages calculated before or after filtering?
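- After filtering. The denominator is the number of tokens that survived your minimum length and stop-word settings, so the percentages sum to ~100% across the full list. If the table is capped at 200 rows, the displayed rows alone can sum to less; the CSV export contains every row.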
- Is my text uploaded anywhere?
- No. Tokenization and counting run entirely in your browser. Your text is never sent to a server, logged, or stored.
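To illustrate the display-limit and CSV answers above, here is a small TypeScript sketch of how a table might be capped at 200 rows while the export keeps every row. The names `visibleRows`, `toCsv`, and `csvField`, along with the `word,count,share` header, are assumptions for the example and not the tool's documented output format.

```typescript
type Row = [string, number]; // [word, count]

const DISPLAY_LIMIT = 200; // row cap stated in the FAQ

// Only the first DISPLAY_LIMIT rows are rendered in the table.
function visibleRows(rows: Row[]): Row[] {
  return rows.slice(0, DISPLAY_LIMIT);
}

// Quote a field if it contains a comma, double quote, or line break.
// Tokens made of letters, digits, and apostrophes never need this,
// so the quoting here is purely defensive.
function csvField(value: string): string {
  return /[",\r\n]/.test(value) ? '"' + value.replace(/"/g, '""') + '"' : value;
}

// The export covers every row, not just the visible ones.
function toCsv(rows: Row[]): string {
  const total = rows.reduce((sum, row) => sum + row[1], 0);
  const lines = ["word,count,share"]; // header names are assumed
  for (const [word, count] of rows) {
    const share = total === 0 ? "0.00" : ((count / total) * 100).toFixed(2);
    lines.push([csvField(word), String(count), share].join(","));
  }
  return lines.join("\n");
}
```

Since shares are computed against the full filtered total, the rows in the CSV sum to roughly 100%, while a truncated table can sum to less.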