Remove Punctuation

Strip punctuation characters from any text, with options to keep apostrophes and hyphens.

Runs locally · Instant · Private

What This Tool Does

Paste any text and the tool removes punctuation characters: commas, periods, exclamation marks, colons, semicolons, quotation marks, brackets, em dashes, and more. Two options give you finer control. Keep apostrophes leaves contractions like "don't" and "it's" readable, and Keep hyphens between words leaves compound adjectives like "well-known" or "state-of-the-art" intact. The tool suits text preparation for NLP pipelines, word frequency analysis, or any task where punctuation is noise.

How to Use

  1. Paste or type your text into the Input panel.
  2. Choose "All punctuation" to remove everything, or "Keep apostrophes" to preserve contractions.
  3. Toggle "Keep hyphens between words" if you want compound words to stay hyphenated.
  4. Copy or download the cleaned result from the Output panel.

Frequently Asked Questions

Which punctuation characters does this tool remove?

The tool targets the most common punctuation: periods, commas, exclamation marks, question marks, semicolons, colons, single and double quotation marks (both straight and curly), parentheses, square brackets, curly braces, ellipses, em dashes, and en dashes. The "Keep apostrophes" mode skips the apostrophe character so that contractions like "can't" and possessives like "John's" are preserved.
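As a rough illustration of the two modes, here is a minimal Python sketch. The character set is assembled from the list above and the names PUNCTUATION, remove_punctuation, and keep_apostrophes are hypothetical; the tool's actual implementation and exact character set may differ. The sketch treats only the straight apostrophe (') as an apostrophe, which is an assumption.

```python
# Hypothetical character set built from the FAQ answer; not the tool's real source.
PUNCTUATION = ".,!?;:\"'()[]{}\u2026\u2014\u2013\u201C\u201D\u2018\u2019"


def remove_punctuation(text: str, keep_apostrophes: bool = False) -> str:
    """Return text with every character in PUNCTUATION removed."""
    drop = set(PUNCTUATION)
    if keep_apostrophes:
        # Assumption: only the straight apostrophe is preserved in this mode.
        drop.discard("'")
    return "".join(ch for ch in text if ch not in drop)


print(remove_punctuation("Don't stop, John's here!"))
print(remove_punctuation("Don't stop, John's here!", keep_apostrophes=True))
```

With keep_apostrophes left off, "Don't" collapses to "Dont"; with it on, the contraction and the possessive survive while the comma and exclamation mark are still removed.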

Why would I want to keep hyphens between words?

Hyphens serve two distinct roles in English text. A hyphen connecting two words (like "well-known" or "free-range") is part of the word's meaning, and removing it would leave two separate tokens that look unrelated. A hyphen standing on its own between spaces is an informal substitute for a dash. The "Keep hyphens between words" option detects word-internal hyphens and leaves them alone while still removing the standalone ones.
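One plausible way to tell the two kinds of hyphen apart is a regular expression with lookarounds. The sketch below assumes a hyphen counts as word-internal only when it has a word character on both sides; the pattern and the function name strip_standalone_hyphens are illustrative, not the tool's documented rule.

```python
import re

# Assumed rule: match a hyphen that is NOT preceded by a word character,
# or NOT followed by one. Hyphens with word characters on both sides are kept.
STANDALONE_HYPHEN = re.compile(r"(?<!\w)-|-(?!\w)")


def strip_standalone_hyphens(text: str) -> str:
    """Remove hyphens used as dashes; keep hyphens inside compound words."""
    return STANDALONE_HYPHEN.sub("", text)


print(strip_standalone_hyphens("a state-of-the-art model - really"))
```

The sketch does not collapse the extra space left behind where a standalone hyphen was removed; a real pipeline would probably normalise whitespace as a separate step.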

Is this useful for machine learning or NLP tasks?

Yes. A common preprocessing step before tokenisation, bag-of-words analysis, or training a simple language model is to strip punctuation so that "word" and "word." are treated as the same token. Paste your corpus into the Input panel and copy the cleaned version directly into your pipeline. Because everything runs in your browser, you can process sensitive or proprietary datasets without uploading them to any server.
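To see why this matters for word counts, the following Python sketch compares token frequencies with and without punctuation. It uses the standard library's string.punctuation (ASCII only) as a stand-in for the tool's cleaning step, and the function name word_counts is hypothetical.

```python
import string
from collections import Counter


def word_counts(text: str, strip_punctuation: bool) -> Counter:
    """Count lowercase whitespace-separated tokens, optionally without punctuation."""
    if strip_punctuation:
        # string.punctuation covers ASCII punctuation only; a stand-in for the tool.
        text = text.translate(str.maketrans("", "", string.punctuation))
    return Counter(text.lower().split())


sample = "A word. Another word, and one more word"
raw = word_counts(sample, strip_punctuation=False)
clean = word_counts(sample, strip_punctuation=True)

print(raw["word"], raw["word."], raw["word,"])
print(clean["word"])
```

In the raw counts, "word", "word." and "word," are three separate tokens with one occurrence each; after stripping punctuation they merge into a single token counted three times.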

Does it handle Unicode punctuation like curly quotes and em dashes?

Yes. The tool removes both ASCII punctuation and common Unicode punctuation marks, including left and right double quotation marks (\u201C\u201D), left and right single quotation marks (\u2018\u2019), horizontal ellipsis (\u2026), em dash (\u2014), and en dash (\u2013). If you copied text from a word processor or website, these Unicode variants will be caught just like their ASCII equivalents.
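The code points named above can be handled with a simple translation table. This is a minimal sketch covering only the seven characters listed in this answer; the names UNICODE_PUNCTUATION and strip_unicode_punctuation are hypothetical, and the tool itself may cover more characters.

```python
# Only the code points listed in the FAQ answer; the tool's real set may be larger.
UNICODE_PUNCTUATION = {
    "\u201C": "left double quotation mark",
    "\u201D": "right double quotation mark",
    "\u2018": "left single quotation mark",
    "\u2019": "right single quotation mark",
    "\u2026": "horizontal ellipsis",
    "\u2014": "em dash",
    "\u2013": "en dash",
}

# str.translate deletes any character whose code point maps to None.
_DELETE_TABLE = {ord(ch): None for ch in UNICODE_PUNCTUATION}


def strip_unicode_punctuation(text: str) -> str:
    """Delete the Unicode punctuation marks listed in UNICODE_PUNCTUATION."""
    return text.translate(_DELETE_TABLE)


print(strip_unicode_punctuation("\u201CHi\u201D \u2013 it\u2019s late\u2026"))
```

A broader alternative, if you want every Unicode punctuation mark rather than a fixed list, is to drop any character whose unicodedata.category starts with "P".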
