comrak 0.45.0-rc.1

A 100% CommonMark-compatible GitHub Flavored Markdown parser and formatter
# [Comrak](https://comrak.ee/)

[![Build status](https://github.com/kivikakk/comrak/actions/workflows/rust.yml/badge.svg)](https://github.com/kivikakk/comrak/actions/workflows/rust.yml)
[![CommonMark: 652/652](https://img.shields.io/badge/commonmark-652%2F652-brightgreen.svg)](https://github.com/commonmark/commonmark-spec/blob/9103e341a973013013bb1a80e13567007c5cef6f/spec.txt)
[![GFM: 670/670](https://img.shields.io/badge/gfm-670%2F670-brightgreen.svg)](https://github.com/kivikakk/cmark-gfm/blob/2f13eeedfe9906c72a1843b03552550af7bee29a/test/spec.txt)
[![crates.io version](https://img.shields.io/crates/v/comrak.svg)](https://crates.io/crates/comrak)
[![docs.rs](https://docs.rs/comrak/badge.svg)](https://docs.rs/comrak)

[Comrak](https://comrak.ee/) is a [CommonMark](https://commonmark.org/) and [GitHub Flavored Markdown](https://github.github.com/gfm/) compatible parser and renderer, written in Rust.

Compliant with [CommonMark 0.31.2](https://spec.commonmark.org/0.31.2/) by default.

## Installation

Specify it as a requirement in `Cargo.toml`:

``` toml
[dependencies]
comrak = "0.45.0-rc"
```

Comrak's library supports Rust <span class="msrv">1.65</span>+.

### CLI

- Anywhere with a Rust toolchain:
  - `cargo install comrak`
  - <code>[cargo binstall](https://github.com/cargo-bins/cargo-binstall) comrak</code>
- Many Unix distributions:
  - `pacman -S comrak`
  - `brew install comrak`
  - `dnf install comrak`
  - `nix run nixpkgs#comrak`

You can also find builds I've published in [GitHub Releases](https://github.com/kivikakk/comrak/releases), but they're limited to machines I have access to at the time of making them! [webinstall.dev](https://webinstall.dev/comrak/) offers `curl | shell`-style installation of the latest of these for your OS, including Windows.

## Usage

<details>

<summary>Click to expand the CLI <code>--help</code> output.

``` console
$ comrak --help
```

</summary>

```
A 100% CommonMark-compatible GitHub Flavored Markdown parser and formatter

Usage: comrak [OPTIONS] [FILE]...

Arguments:
  [FILE]...
          CommonMark file(s) to parse; or standard input if none passed

Options:
  -c, --config-file <PATH>
          Path to config file containing command-line arguments, or 'none'
          
          [default: /home/runner/.config/comrak/config]

  -i, --inplace
          To perform an in-place formatting

      --hardbreaks
          Treat newlines as hard line breaks

      --smart
          Use smart punctuation

      --github-pre-lang
          Use GitHub-style <pre lang> for code blocks

      --full-info-string
          Enable full info strings for code blocks

      --gfm
          Enable GitHub-flavored markdown extensions: strikethrough, tagfilter, table, autolink, and
          tasklist. Also enables --github-pre-lang and --gfm-quirks

      --gfm-quirks
          Enables GFM-style quirks in output HTML, such as not nesting <strong> tags, which
          otherwise breaks CommonMark compatibility

      --relaxed-tasklist-character
          Enable relaxing which character is allowed in a tasklists

      --relaxed-autolinks
          Enable relaxing of autolink parsing, allow links to be recognized when in brackets and
          allow all url schemes

      --tasklist-classes
          Output classes on tasklist elements so that they can be styled with CSS

      --default-info-string <INFO>
          Default value for fenced code block's info strings if none is given

      --unsafe
          Allow raw HTML and dangerous URLs

      --gemojis
          Translate gemojis into UTF-8 characters

      --escape
          Escape raw HTML instead of clobbering it

      --escaped-char-spans
          Wrap escaped characters in span tags

  -e, --extension <EXTENSION>
          Specify extension name(s) to use
          
          Multiple extensions can be delimited with ",", e.g. --extension strikethrough,table
          
          [possible values: strikethrough, tagfilter, table, autolink, tasklist, superscript,
          footnotes, inline-footnotes, description-lists, multiline-block-quotes, math-dollars,
          math-code, wikilinks-title-after-pipe, wikilinks-title-before-pipe, underline, subscript,
          spoiler, greentext, alerts, cjk-friendly-emphasis]

  -t, --to <FORMAT>
          Specify output format
          
          [default: html]
          [possible values: html, xml, commonmark]

  -o, --output <FILE>
          Write output to FILE instead of stdout

      --width <WIDTH>
          Specify wrap width (0 = nowrap)
          
          [default: 0]

      --header-ids <PREFIX>
          Use the Comrak header IDs extension, with the given ID prefix

      --front-matter-delimiter <DELIMITER>
          Ignore front-matter that starts and ends with the given string

      --syntax-highlighting <THEME>
          Syntax highlighting for codefence blocks. Choose a theme or 'none' for disabling
          
          [default: base16-ocean.dark]

      --list-style <LIST_STYLE>
          Specify bullet character for lists (-, +, *) in CommonMark output
          
          [default: dash]
          [possible values: dash, plus, star]

      --sourcepos
          Include source position attribute in HTML and XML output

      --ignore-setext
          Ignore setext headers

      --ignore-empty-links
          Ignore empty links

      --experimental-minimize-commonmark
          Minimize escapes in CommonMark output using a trial-and-error algorithm

  -h, --help
          Print help information (use `-h` for a summary)

  -V, --version
          Print version information

By default, Comrak will attempt to read command-line options from a config file specified by
--config-file. This behaviour can be disabled by passing --config-file none. It is not an error if
the file does not exist.
```

</details>

And there's a Rust interface. You can use `comrak::markdown_to_html` directly:

``` rust
use comrak::{markdown_to_html, Options};
assert_eq!(
    markdown_to_html("¡Olá, **世界**!", &Options::default()),
    "<p>¡Olá, <strong>世界</strong>!</p>\n"
);
```

Or you can parse the input into an AST yourself, manipulate it, and then use your desired formatter:

``` rust
use comrak::nodes::NodeValue;
use comrak::{format_html, parse_document, Arena, Options};

fn replace_text(document: &str, orig_string: &str, replacement: &str) -> String {
    // The returned nodes are created in the supplied Arena, and are bound by its lifetime.
    let arena = Arena::new();

    // Parse the document into a root `AstNode`
    let root = parse_document(&arena, document, &Options::default());

    // Iterate over all the descendants of root.
    for node in root.descendants() {
        if let NodeValue::Text(ref mut text) = node.data.borrow_mut().value {
            // If the node is a text node, perform the string replacement.
            *text = text.to_mut().replace(orig_string, replacement).into()
        }
    }

    let mut html = String::new();
    format_html(root, &Options::default(), &mut html).unwrap();

    html
}

fn main() {
    let doc = "Hello, pretty world!\n\n1. Do you like [pretty](#) paintings?\n2. Or *pretty* music?\n";
    let orig = "pretty";
    let repl = "beautiful";
    let html = replace_text(doc, orig, repl);

    println!("{}", html);
    // Output:
    //
    // <p>Hello, beautiful world!</p>
    // <ol>
    // <li>Do you like <a href="#">beautiful</a> paintings?</li>
    // <li>Or <em>beautiful</em> music?</li>
    // </ol>
}
```

For a slightly more real-world example, see how I [generate my GitHub user README](https://github.com/kivikakk/kivikakk) from a base document with embedded YAML, which itself has embedded Markdown, or
[check out some of Comrak's dependents on crates.io](https://crates.io/crates/comrak/reverse_dependencies) or [on GitHub](https://github.com/kivikakk/comrak/network/dependents).

## Security

As with [`cmark`](https://github.com/commonmark/cmark) and [`cmark-gfm`](https://github.com/github/cmark-gfm#security),
Comrak will scrub raw HTML and potentially dangerous links. This change was introduced in Comrak 0.4.0 in support of a
safe-by-default posture, and later adopted by our contemporaries. :)

To allow these, use the `unsafe_` option (or `--unsafe` with the command line program). If doing so, we recommend the
use of a sanitisation library like [`ammonia`](https://github.com/notriddle/ammonia) configured specific to your needs.

## Extensions

Comrak supports the five extensions to CommonMark defined in the [GitHub Flavored Markdown
Spec](https://github.github.com/gfm/):

- [Tables](https://github.github.com/gfm/#tables-extension-)
- [Task list items](https://github.github.com/gfm/#task-list-items-extension-)
- [Strikethrough](https://github.github.com/gfm/#strikethrough-extension-)
- [Autolinks](https://github.github.com/gfm/#autolinks-extension-)
- [Disallowed Raw HTML](https://github.github.com/gfm/#disallowed-raw-html-extension-)

Comrak additionally supports its own extensions, which are yet to be specced out (PRs welcome!):

- Superscript
- Header IDs
- Footnotes
- Inline footnotes
- Description lists
- Front matter
- Multi-line blockquotes
- Math
- Emoji shortcodes
- Wikilinks
- Underline
- Spoiler text
- "Greentext"
- [CJK friendly emphasis](https://github.com/tats-u/markdown-cjk-friendly)

By default none are enabled; they are individually enabled with each parse by setting the appropriate values in the
[`options::Extension` struct](https://docs.rs/comrak/latest/comrak/options/struct.Extension.html).

## Custom formatting

The default HTML formatter can be partially specialised, to
allow customising the output for certain node types without
having to reimplement a whole formatter.  See the docs for
[`comrak::create_formatter`](https://docs.rs/comrak/latest/comrak/macro.create_formatter.html)
for details.

## Plugins

### Fenced code block syntax highlighting

You can provide your own syntax highlighting engine.

Create an implementation of the `SyntaxHighlighterAdapter` trait, and then provide an instance of such adapter to
`Plugins.render.codefence_syntax_highlighter`. For formatting a Markdown document with plugins, use the
`markdown_to_html_with_plugins` function, which accepts your plugins object as a parameter.

See the `syntax_highlighter.rs` and `syntect.rs` examples for more details.

#### Syntect

[`syntect`](https://github.com/trishume/syntect) is a syntax highlighting library for Rust. By default, `comrak` offers
a plugin for it. In order to utilize it, create an instance of `plugins::syntect::SyntectAdapter` and use it in your
`Plugins` option.

## Related projects

Comrak's original design goal was to model the upstream
[`cmark-gfm`](https://github.com/github/cmark-gfm) as closely as possible in
terms of code structure. Many years have passed since its inception, and the codebases
have since grown considerably apart. It does remain the case, though, that there
are bugs in `cmark-gfm` that are likely in Comrak too, as a result.

Over the years, we have increasingly opted to fix such bugs, rather than
maintain upstream compatibility at all costs.  `cmark-gfm` no longer appears to
be under active maintenance, but Comrak is a living and growing project.

This library offers an AST backed by
[`typed_arena`](https://github.com/thomcc/rust-typed-arena), with extensive
use of `RefCell` in the core node type to provide mutable access with
parent/sibling/child pointers.  This can produce non-idiomatic-looking code,
though in practice it has proven very usable.

For whatever reason, Comrak may not meet your requirements. Here are some
projects and resources to also consider:

- [Raph Levien](https://github.com/raphlinus)'s [`pulldown-cmark`](https://github.com/google/pulldown-cmark). It's
  very fast, uses a novel parsing algorithm, and doesn't construct an AST (but you can use it to make one if you
  want). `cargo doc` uses this, as do many other projects in the ecosystem.
- [markdown-rs](https://github.com/wooorm/markdown-rs) looks really promising.
- [markdown-it](https://github.com/markdown-it-rust/markdown-it) is a port of JavaScript's [markdown-it.js](https://github.com/markdown-it/markdown-it).
- [babelmark](https://babelmark.github.io/) lets you compare many implementations at once, including the above.
- Know of another library? Please open a PR to add it!

### Bindings

- [Commonmarker](https://github.com/gjtorikian/commonmarker): Ruby bindings for this library built with Magnus/rb-sys.
  Available on RubyGems as [`commonmarker`](https://rubygems.org/gems/commonmarker).
- [MDEx](https://github.com/leandrocp/mdex): Elixir bindings for this library built with Rustler.
  Available on Hex as [`mdex`](https://hex.pm/packages/mdex).
- [comrak](https://github.com/lmmx/comrak): Python bindings for this library built with PyO3.
  Available on PyPI as [`comrak`](https://pypi.org/project/comrak), benchmarked at 15-60x faster than pure Python alternatives.
- [comrak-wasm](https://github.com/nberlette/comrak-wasm): TypeScript bindings for this library, built with WebAssembly.
  Available on JSR as [`@nick/comrak`](https://jsr.io/@nick/comrak).

### Users

Comrak is used in a few Rust-y places, and more beyond:

- [crates.io](https://crates.io), [docs.rs](https://docs.rs) and [lib.rs](https://lib.rs) use Comrak to render README Markdown faithfully.
- [GitLab](https://gitlab.com) uses Comrak to render Markdown documents, issues, comments, and more.
- [Deno](https://deno.com) uses Comrak to render documentation in [`deno_doc`](https://github.com/denoland/deno_doc).
- [Reddit](https://reddit.com)'s new-style site uses a Comrak fork[^reddit].
- [Lockbook](https://lockbook.net/) is a Markdown-based secure notebook with native apps. It looks really neat!!
- [many](https://github.com/kivikakk/comrak/network/dependents) [more!](https://crates.io/crates/comrak/reverse_dependencies)

I'd be really happy to add your site or app here, just open a PR or issue. :)

[^reddit]: And they contributed some really nice changes, years back. Then management went and sold their user base and all their credibility. What can you do.

## Benchmarking

We offer some tools to perform stdin-to-stdout benchmarking of Comrak with its contemporaries.  In this respect, Comrak is not and will not be the fastest: some alternatives do not construct an AST in this scenario.

You'll need to [install hyperfine](https://github.com/sharkdp/hyperfine#installation), and CMake if you want to compare against `cmark-gfm`.

If you want to just run the benchmark for the `comrak` binary itself, run:

``` bash
make bench-comrak
```

This will build Comrak in release mode and run the benchmark on it. You will see the time measurements as reported by hyperfine in the console.

The `Makefile` also provides a way to run benchmarks for `comrak` in its current state (with your changes), the `comrak` main branch, [`cmark-gfm`](https://github.com/github/cmark-gfm), [`pulldown-cmark`](https://github.com/raphlinus/pulldown-cmark) and [`markdown-it.rs`](https://github.com/rlidwka/markdown-it.rs). You'll need CMake, and you'll need to ensure [submodules are prepared](https://stackoverflow.com/a/10168693/499609).

``` bash
make bench-all
```

This will build and run benchmarks across all, and report the time taken by each as well as relative time.

<!-- XXX: The following isn't really true at the moment, due to https://github.com/kivikakk/comrak/issues/339 -->

<!-- Apart from this, CI is also setup for running benchmarks when a pull request is first opened. It will add a comment with the results on the pull request in a tabular format comparing the 5 versions. After that you can manually trigger this CI by commenting `/run-bench` on the PR, this will update the existing comment with new results. Note benchmarks won't be automatically run on each push. -->

## Contributing

Contributions are **highly encouraged**; if you'd like to assist, consider checking out the [`good first issue` label](https://github.com/kivikakk/comrak/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22)! I'm happy to help provide direction and guidance throughout, even if (especially if!) you're new to Rust or open source.

Where possible I practice [Optimistic Merging](http://hintjens.com/blog:106) as described by Peter Hintjens. Please keep the [code of conduct](CODE_OF_CONDUCT.md) in mind too.

Thank you to Comrak's many contributors for PRs and issues opened!

### Code Contributors

[![Small chart showing Comrak contributors.](https://opencollective.com/comrak/contributors.svg?width=890&button=false)](https://github.com/kivikakk/comrak/graphs/contributors)

### Financial Contributors

Become a financial contributor and help sustain Comrak's development.

- [GitHub Sponsors](https://github.com/sponsors/kivikakk)

## Contact

Asherah Connor <ashe kivikakk ee\>

## Legal

Copyright (c) 2017–2025, Comrak contributors. Licensed under
the [2-Clause BSD License](https://opensource.org/licenses/BSD-2-Clause).

`cmark` itself is copyright (c) 2014, John MacFarlane.

See [COPYING](COPYING) for all the details.