| Commit message (Collapse) | Author | Age | Files | Lines |
* Pass state names from an enum around instead of boxed functions
* Refactor to simplify attempts a lot
* Use a subtokenizer for the `document` content type
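
A minimal sketch of what passing state names instead of boxed functions can look like. The `StateName` variants, the `call` dispatcher, and the toy grammar (one or more `a`s) are hypothetical and are not taken from the repository:

```rust
/// Hypothetical state names; the enum in the repository is much larger.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum StateName {
    Start,
    Inside,
}

/// Result of running one state function on one character.
#[derive(Debug, PartialEq, Eq)]
enum State {
    /// Continue in the named state (a plain `Copy` value, no `Box<dyn Fn>`).
    Next(StateName),
    /// Done, successfully.
    Ok,
    /// Done, unsuccessfully.
    Nok,
}

fn start(c: Option<char>) -> State {
    match c {
        Some('a') => State::Next(StateName::Inside),
        _ => State::Nok,
    }
}

fn inside(c: Option<char>) -> State {
    match c {
        Some('a') => State::Next(StateName::Inside),
        None => State::Ok,
        Some(_) => State::Nok,
    }
}

/// The single place where a name is mapped to its function.
fn call(name: StateName, c: Option<char>) -> State {
    match name {
        StateName::Start => start(c),
        StateName::Inside => inside(c),
    }
}

/// Feed `input` through the machine; accepts one or more `a`s.
fn run(input: &str) -> bool {
    let mut name = StateName::Start;
    let mut chars = input.chars();
    loop {
        match call(name, chars.next()) {
            State::Next(next) => name = next,
            State::Ok => return true,
            State::Nok => return false,
        }
    }
}
```

Because a name is a small `Copy` value, it can be stored, compared, and printed without allocating, which a boxed closure does not allow.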
|
Previously, a custom char implementation was used.
This was easier to work with, as sometimes “virtual” characters are injected,
or characters are ignored.
This commit replaces that with working on actual `char`s,
in the hope of eventually working on `u8`s.
This simplifies the state machine somewhat, as only `\n` is fed, regardless of
whether it was a CRLF, CR, or LF.
It also feeds `' '` instead of virtual spaces.
The BOM, if present, is now available as a `ByteOrderMark` event.
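
A rough sketch of the normalization described above, assuming it can be modelled as a separate pass over the input. The `preprocess` function and its return shape are hypothetical; in the commit, the BOM is exposed as a `ByteOrderMark` event rather than a boolean, and the replacement of virtual spaces with `' '` is not shown here:

```rust
/// Hypothetical helper: reports whether a BOM was present and returns the
/// characters that would be fed to the state machine, with CRLF, CR, and LF
/// all turned into a single `\n`.
fn preprocess(input: &str) -> (bool, Vec<char>) {
    // Split off a leading byte order mark (U+FEFF), if there is one.
    let (bom, rest) = match input.strip_prefix('\u{FEFF}') {
        Some(rest) => (true, rest),
        None => (false, input),
    };

    let mut out = Vec::with_capacity(rest.len());
    let mut chars = rest.chars().peekable();

    while let Some(c) = chars.next() {
        if c == '\r' {
            // For CRLF, swallow the LF so only one `\n` is fed.
            if chars.peek() == Some(&'\n') {
                chars.next();
            }
            out.push('\n');
        } else {
            out.push(c);
        }
    }

    (bom, out)
}
```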
|
This commit introduces trimming initial and final whitespace around the
whole string or text, or around line endings inside that string or text.
* Add `register_resolver_before`, to run resolvers earlier than others,
used for labels
* Add resolver to merge `data` events, which are the most frequent tokens
and can occur adjacently.
In `micromark-js`, this sped up parsing a lot
* Fix a bug where a virtual space was not seen as an okay event
* Refactor to enable all turned off whitespace tests
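
The idea behind merging adjacent `data` events can be sketched as follows. This is a simplification under stated assumptions: the real resolver works on enter/exit event pairs, whereas the hypothetical `Span` type here folds each pair into one value with byte offsets:

```rust
/// Hypothetical token kinds; only `Data` matters for this sketch.
#[derive(Clone, Debug, PartialEq, Eq)]
enum Kind {
    Data,
    Other,
}

/// Hypothetical stand-in for an enter/exit event pair.
#[derive(Clone, Debug, PartialEq, Eq)]
struct Span {
    kind: Kind,
    start: usize,
    end: usize,
}

/// Merge runs of `Data` spans that touch each other into a single span.
fn merge_adjacent_data(spans: &[Span]) -> Vec<Span> {
    let mut out: Vec<Span> = Vec::with_capacity(spans.len());

    for span in spans {
        if let Some(last) = out.last_mut() {
            let adjacent = last.kind == Kind::Data
                && span.kind == Kind::Data
                && last.end == span.start;
            if adjacent {
                // Extend the previous span instead of pushing a new one.
                last.end = span.end;
                continue;
            }
        }
        out.push(span.clone());
    }

    out
}
```

Fewer events means less work for every later pass over the event list, which is presumably where the speedup comes from.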
|
This is still some messy code that needs cleaning up, but it adds support for
links and images, of the resource kind (`[a](b)`).
References (`[a][b]`) are parsed and will soon be supported, but need matching.
* Fix bug to pad percent-encoded bytes when normalizing URLs
* Fix bug with escapes counting as balancing in destination
* Add `space_or_tab_one_line_ending`, to parse whitespace including up to
one line ending (but not a blank line)
* Add `ParserState` to share codes, definitions, etc
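
To illustrate the padding fix: a byte below `0x10` has to be written with two hex digits (`%0A`, not `%A`). The sketch below is not the repository's implementation; the `percent_encode` name and the set of bytes left untouched are assumptions, and `%` is passed through unchanged as a simplification:

```rust
/// Hypothetical, simplified URL normalizer: percent-encodes every byte that
/// is not ASCII alphanumeric and not in a small allow-list.
fn percent_encode(value: &str) -> String {
    // Assumed allow-list; the real rules are more involved.
    const KEEP: &[u8] = b"-._~/:?#@!$&'()*+,;=%";

    let mut out = String::with_capacity(value.len());

    for byte in value.bytes() {
        if byte.is_ascii_alphanumeric() || KEEP.contains(&byte) {
            out.push(byte as char);
        } else {
            // `{:02X}` pads to two uppercase hex digits, so 0x0A becomes
            // `0A`; a plain `{:X}` would yield `A`.
            out.push_str(&format!("%{:02X}", byte));
        }
    }

    out
}
```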
|
* Fix a couple of places where line ending handling was forgotten in html (text)
* Fix missing initial case for html (text) not having a `<` 😬
* Add line ending handling to `text` construct
|
* Add all states for html (text)
* Fix to link paragraph tokens together
* Add note about uncovered bug where linking paragraph tokens together
doesn’t work 😅
|
* Add character references and character escapes in text
* Add recursive subtokenization
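
For reference, a character escape in CommonMark is a backslash followed by an ASCII punctuation character. The sketch below shows that rule in isolation as a string transform; the `unescape` function is hypothetical, and the commit itself implements escapes as a tokenizer construct, not as a post-processing pass:

```rust
/// Hypothetical helper: resolves character escapes in `input`.
/// A backslash before ASCII punctuation is dropped; any other backslash
/// is kept literally.
fn unescape(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars().peekable();

    while let Some(c) = chars.next() {
        if c == '\\' {
            if let Some(&next) = chars.peek() {
                if next.is_ascii_punctuation() {
                    // Keep the escaped character, drop the backslash.
                    out.push(next);
                    chars.next();
                    continue;
                }
            }
        }
        out.push(c);
    }

    out
}
```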
|