| Commit message | Author | Files | Lines |
|
This will probably catch some confusing bugs, such as ad1b3e6.
|
An undocumented part of CommonMark is how to deal with things in definition
labels or definition titles (which both can span multiple lines).
Can flow (or containers?) interrupt them?
They can according to the `cmark` reference parser, so this was implemented here.
This adds a new `Content` content type, which houses zero or more definitions,
and then zero-or-one paragraphs.
Content can be followed by a setext heading underline.
That underline turns into a setext heading when the content ends in a
paragraph, turns into the start of the following paragraph when it is followed
by content that starts with a paragraph, and otherwise becomes a stray paragraph.
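The three outcomes for a setext heading underline can be sketched as follows. This is only an illustration of the rule described above: the `Content` struct, the `Underline` enum, and `resolve_underline` are hypothetical names, not the types used in the actual implementation.

```rust
/// Hypothetical model of the `Content` content type:
/// zero or more definitions, then zero or one paragraph.
#[derive(Debug, Default)]
struct Content {
    definitions: Vec<String>,
    paragraph: Option<String>,
}

/// What a setext heading underline after some content turns into.
#[derive(Debug, PartialEq, Eq)]
enum Underline {
    /// The content ended in a paragraph: the pair is a setext heading.
    Heading,
    /// The following content starts with a paragraph: the underline text
    /// becomes the first line of that paragraph.
    StartOfNextParagraph,
    /// Neither applies: the underline text is a paragraph of its own.
    StrayParagraph,
}

/// Decide what the underline becomes, given the content before it and
/// (optionally) the content after it.
fn resolve_underline(previous: &Content, next: Option<&Content>) -> Underline {
    if previous.paragraph.is_some() {
        Underline::Heading
    } else if next.map_or(false, |c| c.definitions.is_empty() && c.paragraph.is_some()) {
        Underline::StartOfNextParagraph
    } else {
        Underline::StrayParagraph
    }
}
```

Modelling the paragraph as an `Option` keeps the "zero-or-one" constraint in the type rather than in runtime checks.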
|
* Pass state names from an enum around instead of boxed functions
* Refactor to simplify attempts a lot
* Use a subtokenizer for the `document` content type
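As a rough illustration of the first point, a state machine can return the *name* of the next state and let a central function map names to functions, instead of returning a boxed closure. The sketch below is a minimal, hypothetical example (the `StateName` variants, `call`, and `run` are made up for illustration and accept only a run of `a` bytes); it is not the project's actual tokenizer.

```rust
/// Hypothetical state names; a real tokenizer would have many more.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum StateName {
    Start,
    Inside,
}

/// What a state function reports back to the driver.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum State {
    /// Continue with the named state (no `Box<dyn Fn>` needed).
    Next(StateName),
    /// The construct matched.
    Ok,
    /// The construct did not match.
    Nok,
}

/// Central dispatch: map a state name to the function implementing it.
fn call(name: StateName, byte: Option<u8>) -> State {
    match name {
        StateName::Start => start(byte),
        StateName::Inside => inside(byte),
    }
}

/// At the start, require an `a`.
fn start(byte: Option<u8>) -> State {
    match byte {
        Some(b'a') => State::Next(StateName::Inside),
        _ => State::Nok,
    }
}

/// Inside, accept more `a`s until the end of input (`None`).
fn inside(byte: Option<u8>) -> State {
    match byte {
        Some(b'a') => State::Next(StateName::Inside),
        None => State::Ok,
        Some(_) => State::Nok,
    }
}

/// Feed every byte, followed by an end-of-input marker, through the machine.
fn run(input: &[u8]) -> State {
    let mut name = StateName::Start;
    for byte in input.iter().copied().map(Some).chain(std::iter::once(None)) {
        match call(name, byte) {
            State::Next(next) => name = next,
            done => return done,
        }
    }
    State::Nok
}
```

Because `StateName` is `Copy` and comparable, states can be stored, compared, and logged cheaply, which boxed functions do not allow.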
|
Previously, a custom char implementation was used.
This was easier to work with, as sometimes “virtual” characters are injected,
or characters are ignored.
This replaces that with working on actual `char`s, in the hope of working on
`u8`s in the future.
This simplifies the state machine somewhat, as only `\n` is fed, regardless of
whether it was a CRLF, CR, or LF.
It also feeds `' '` instead of virtual spaces.
The BOM, if present, is now available as a `ByteOrderMark` event.
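The line ending and BOM handling described above might look roughly like this. It is a simplified sketch: the `feed` function and its return shape are assumptions for illustration, the BOM is reported as a boolean rather than as a `ByteOrderMark` event, and tab expansion into spaces is left out.

```rust
/// Normalize input before it reaches the state machine (hypothetical helper).
///
/// Returns whether a byte order mark was present, and the characters to feed:
/// every CRLF, CR, or LF is fed as a single `'\n'`.
fn feed(input: &str) -> (bool, Vec<char>) {
    // A BOM can only occur at the very start of the document.
    let (bom, rest) = match input.strip_prefix('\u{FEFF}') {
        Some(rest) => (true, rest),
        None => (false, input),
    };

    let mut out = Vec::with_capacity(rest.len());
    let mut chars = rest.chars().peekable();

    while let Some(c) = chars.next() {
        if c == '\r' {
            // CRLF: swallow the LF so the pair counts as one line ending.
            if chars.peek() == Some(&'\n') {
                chars.next();
            }
            out.push('\n');
        } else {
            // LF and all other characters pass through unchanged.
            out.push(c);
        }
    }

    (bom, out)
}
```

With this in front of the state machine, constructs only ever need a `'\n'` arm instead of separate CRLF, CR, and LF arms.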
|
This commit introduces trimming initial and final whitespace around the
whole string or text, or around line endings inside that string or text.
* Add `register_resolver_before`, to run resolvers earlier than others,
used for labels
* Add resolver to merge `data` events, which are the most frequent token
and often occur adjacently.
In `micromark-js` this sped up parsing a lot
* Fix a bug where a virtual space was not seen as an okay event
* Refactor to enable all turned off whitespace tests
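The idea behind the `data` merging resolver can be sketched as below. This assumes a simplified, hypothetical `Token` with a kind and byte offsets; the actual project works on enter/exit events, so treat this as an illustration of the technique only.

```rust
/// Hypothetical token kinds; only `Data` is merged.
#[derive(Clone, Debug, PartialEq, Eq)]
enum Kind {
    Data,
    Other,
}

/// Hypothetical token: a kind plus a half-open byte range `start..end`.
#[derive(Clone, Debug, PartialEq, Eq)]
struct Token {
    kind: Kind,
    start: usize,
    end: usize,
}

/// Merge runs of directly adjacent `Data` tokens into one token each.
fn merge_data(tokens: Vec<Token>) -> Vec<Token> {
    let mut out: Vec<Token> = Vec::with_capacity(tokens.len());

    for token in tokens {
        if let Some(last) = out.last_mut() {
            // Only merge when both are data and the ranges touch.
            if last.kind == Kind::Data && token.kind == Kind::Data && last.end == token.start {
                last.end = token.end;
                continue;
            }
        }
        out.push(token);
    }

    out
}
```

Fewer tokens means less work for every later pass, which is presumably where the speedup seen in `micromark-js` came from.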
|
This is still some messy code that needs cleaning up, but it adds support for
links and images, of the resource kind (`[a](b)`).
References (`[a][b]`) are parsed and will soon be supported, but need matching.
* Fix bug to pad percent-encoded bytes when normalizing urls
* Fix bug with escapes counting as balancing in destination
* Add `space_or_tab_one_line_ending`, to parse whitespace including up to
one line ending (but not a blank line)
* Add `ParserState` to share codes, definitions, etc
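The padding fix for percent-encoded bytes is likely of the following kind: a byte below `0x10` must be written with two hex digits (`%0A`, not `%A`). The function below is a minimal sketch under that assumption; `percent_encode` and its deliberately small set of characters left as-is are hypothetical, not the project's actual URL normalization.

```rust
/// Percent-encode every byte outside a small safe set (hypothetical helper).
fn percent_encode(input: &str) -> String {
    // Assumption: a simplified safe set, chosen for illustration only.
    const SAFE: &[u8] = b"-._~/:?#&=";
    let mut out = String::with_capacity(input.len());

    for byte in input.bytes() {
        if byte.is_ascii_alphanumeric() || SAFE.contains(&byte) {
            out.push(byte as char);
        } else {
            // `{:02X}` pads to two uppercase hex digits, so `0x0A` becomes
            // `%0A`; an unpadded `{:X}` would produce the invalid `%A`.
            out.push_str(&format!("%{:02X}", byte));
        }
    }

    out
}
```

Working on bytes rather than `char`s means non-ASCII characters are encoded as their UTF-8 byte sequence, one `%XX` triplet per byte.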
|
* Fix a couple of forgotten cases of line ending handling in html (text)
* Fix missing initial case for html (text) not having a `<` 😬
* Add line ending handling to `text` construct
|
* Add all states for html (text)
* Fix to link paragraph tokens together
* Add note about uncovered bug where linking paragraph tokens together
doesn’t work 😅
|
* Add character reference and character escapes in text
* Add recursive subtokenization
|
- Add “content” content type
- Add paragraph
- Add skips
- Add linked tokens
|