markdown-rs - CommonMark compliant markdown parser in Rust with ASTs and extensions

	Commit message	Author	Files	Lines
2022-09-19	Add support for parsing MDX ESM, expressions	Titus Wormer	1	-0/+8
	This commit adds support for hooks that let a user integrate another parser with `micromark-rs`, to parse ESM and expressions according to a particular grammar (such as a programming language, typically JavaScript). For an example integrating with SWC, see `tests/test_utils/mod.rs`. The integration occurs with two functions passed in `options`: `mdx_expression_parse` and `mdx_esm_parse`. They can signal back to micromark whether they are successful, whether there is an error at the end (in which case micromark will try to parse more), or whether there is a syntax error (in which case micromark will crash).
2022-09-14	Fix to prefer flow over definitions, setext headings	Titus Wormer	1	-24/+9
	An undocumented part of CommonMark is how to deal with things in definition labels or definition titles (which both can span multiple lines). Can flow (or containers?) interrupt them? They can according to the `cmark` reference parser, so this was implemented here. This adds a new `Content` content type, which houses zero or more definitions, and then zero-or-one paragraphs. Content can be followed by a setext heading underline, which either turns into a setext heading when the content ends in a paragraph, or turns into the start of the following paragraph when it is followed by content that starts with a paragraph, or turns into a stray paragraph.
2022-09-09	Add mdx expression (flow, text)	Titus Wormer	1	-1/+23

2022-09-08	Add support for mdx jsx (flow)	Titus Wormer	1	-2/+17

2022-08-31	Add support for GFM tables	Titus Wormer	1	-23/+18

2022-08-26	Add support for math (flow)	Titus Wormer	1	-12/+14

2022-08-16	Add support for frontmatter	Titus Wormer	1	-12/+12

2022-08-15	Refactor to move `content` to `construct`	Titus Wormer	1	-0/+0

2022-08-12	Refactor to improve docs of each function	Titus Wormer	1	-35/+70

2022-08-11	Refactor attempts to remove unneeded state name	Titus Wormer	1	-56/+64

2022-08-11	Refactor to move some code to `event.rs`	Titus Wormer	1	-55/+55

2022-08-11	Refactor to move some code to `state.rs`	Titus Wormer	1	-50/+51

2022-08-10	Add `State::Retry`	Titus Wormer	1	-2/+2

2022-08-10	Rename `State::Fn` to `State::Next`	Titus Wormer	1	-32/+32

2022-08-09	Add peeking to unindented flow lines	Titus Wormer	1	-11/+47

2022-08-09	Add support for passing `ok`, `nok` as separate states to attempts	Titus Wormer	1	-29/+68

2022-08-09	Rewrite algorithm to not pass around boxed functions	Titus Wormer	1	-27/+30
	* Pass state names from an enum around instead of boxed functions * Refactor to simplify attempts a lot * Use a subtokenizer for the `document` content type
2022-07-29	Refactor to work on bytes (`u8`)	Titus Wormer	1	-2/+2

2022-07-28	Refactor to work on `char`s	Titus Wormer	1	-7/+7
	Previously, a custom char implementation was used. This was easier to work with, as sometimes “virtual” characters are injected, or characters are ignored. This replaces that with working on actual `char`s, in the hope of working on `u8`s in the future. This simplifies the state machine somewhat, as only `\n` is fed, regardless of whether it was a CRLF, CR, or LF. It also feeds `' '` instead of virtual spaces. The BOM, if present, is now available as a `ByteOrderMark` event.
2022-07-25	Refactor to not pass codes around	Titus Wormer	1	-14/+14

2022-07-25	Remove no longer needed field in `State::Ok`	Titus Wormer	1	-4/+4

2022-07-22	Refactor to remove unneeded tuples in every state	Titus Wormer	1	-12/+12

2022-07-22	Refactor to pass ints instead of vecs around	Titus Wormer	1	-6/+6

2022-07-07	Refactor to move token types to `token`	Titus Wormer	1	-5/+6

2022-07-07	Add basic support for block quotes	Titus Wormer	1	-47/+2

2022-07-05	Refactor code style	Titus Wormer	1	-2/+2

2022-07-04	Add support for unicode punctuation	Titus Wormer	1	-1/+1

2022-07-04	Update list of todos	Titus Wormer	1	-1/+1

2022-07-01	Make paragraphs really fast	Titus Wormer	1	-22/+24
	The approach that `micromark-js` takes is as follows: to parse a paragraph, check whether each line starts with something else. If it does, exit, otherwise continue. That is slow, because our actual flow parser does similar things: the work was being done twice. To fix this, this commit introduces parsing each line of a paragraph separately and, when done with flow, combining adjacent paragraphs. This same mechanism is reused for setext headings. Additionally, this commit adds support for interrupting things (or not). E.g., HTML (flow, complete) cannot interrupt paragraphs. Definitions cannot interrupt paragraphs, and cannot be interrupted either, but they can follow each other.
2022-06-29	Add support for sharing identifiers, references before definitions	Titus Wormer	1	-5/+7

2022-06-24	Add link, images (resource)	Titus Wormer	1	-6/+14
	This is still some messy code that needs cleaning up, but it adds support for links and images, of the resource kind (`[a](b)`). References (`[a][b]`) are parsed and will soon be supported, but need matching. * Fix bug to pad percent-encoded bytes when normalizing urls * Fix bug with escapes counting as balancing in destination * Add `space_or_tab_one_line_ending`, to parse whitespace including up to one line ending (but not a blank line) * Add `ParserState` to share codes, definitions, etc
2022-06-22	Add support for normalizing identifiers	Titus Wormer	1	-1/+24

2022-06-22	Refactor to improve tokenizer, add docs	Titus Wormer	1	-8/+10

2022-06-20	Add some more enabled tests	Titus Wormer	1	-1/+0

2022-06-20	Add improved whitespace handling	Titus Wormer	1	-34/+11
	* add several helpers for parsing between x and y `space_or_tab`s * use those helpers in a bunch of places * move initial indent parsing to flow constructs themselves
2022-06-20	Add paragraph	Titus Wormer	1	-130/+9

2022-06-20	Remove unneeded `content` content type	Titus Wormer	1	-27/+27

2022-06-17	Add support for definitions	Titus Wormer	1	-9/+15
	* Add definitions * Add partials for label, destination, title * Add `go`, to attempt something, and do something else on `ok`
2022-06-16	Add heading (setext)	Titus Wormer	1	-11/+8

2022-06-14	Reorganize to split util	Titus Wormer	1	-2/+2

2022-06-10	Add text content type	Titus Wormer	1	-5/+9
	* Add character reference and character escapes in text * Add recursive subtokenization
2022-06-10	Add proper support for subtokenization	Titus Wormer	1	-19/+26
	- Add “content” content type - Add paragraph - Add skips - Add linked tokens
2022-06-09	Add basic subtokenization, string content in fenced code	Titus Wormer	1	-17/+6

2022-06-09	Refactor to pass more slices around	Titus Wormer	1	-1/+1

2022-06-09	Add support for indented lines in paragraphs	Titus Wormer	1	-23/+20

2022-06-09	Add basic support for interrupting content	Titus Wormer	1	-69/+95
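
The 2022-07-28 entry says the state machine is fed only `\n`, whether the input had a CRLF, CR, or LF. As a rough illustration of that normalization only, here is a minimal Rust sketch. The function name `normalize_line_endings` is hypothetical and is not taken from the crate. The sketch builds a new `String`, whereas the tokenizer presumably feeds characters to its state machine directly. BOM handling and virtual spaces are left out.

```rust
/// Hypothetical helper (not the crate's API): collapse every CRLF, CR,
/// or LF in `input` into a single `\n`.
fn normalize_line_endings(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars().peekable();

    while let Some(c) = chars.next() {
        match c {
            '\r' => {
                // A CR directly followed by LF is one line ending (CRLF),
                // so swallow the LF before emitting a single `\n`.
                if chars.peek() == Some(&'\n') {
                    chars.next();
                }
                out.push('\n');
            }
            // LF and all other characters pass through unchanged.
            _ => out.push(c),
        }
    }

    out
}
```

With this in place, a caller sees the same sequence of characters for `"a\r\nb"`, `"a\rb"`, and `"a\nb"`, which is the property the commit message describes.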