path: root/src/tokenizer.rs
Date | Commit message | Author | Files | Lines
2022-07-05 | Refactor code style | Titus Wormer | 1 | -78/+0
2022-07-04 | Add support for unicode punctuation | Titus Wormer | 1 | -1/+1
2022-07-04 | Update list of todos | Titus Wormer | 1 | -2/+0
2022-07-04 | Add support for attention (emphasis, strong) | Titus Wormer | 1 | -0/+9
2022-07-01 | Make paragraphs really fast | Titus Wormer | 1 | -0/+3
The approach that `micromark-js` takes is as follows: to parse a paragraph, check whether each line starts with something else. If it does, exit; otherwise, continue. That is slow, because our actual flow parser does similar things: the work was being done twice. To fix this, this commit introduces parsing each line of a paragraph separately and, when done with flow, combining adjacent paragraphs. This same mechanism is reused for setext headings.

Additionally, this commit adds support for interrupting things (or not). E.g., HTML (flow, complete) cannot interrupt paragraphs. Definitions cannot interrupt paragraphs, and cannot be interrupted either, but they can follow each other.
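The "parse each line separately, then combine adjacent paragraphs" step can be illustrated with a minimal sketch. The `Block` type and `merge_adjacent_paragraphs` function are hypothetical names invented for this illustration; the real tokenizer works on events rather than owned blocks.

```rust
/// Hypothetical, simplified flow-level block (not the crate's real event type).
#[derive(Debug, Clone, PartialEq)]
enum Block {
    /// One or more lines of paragraph text.
    Paragraph(Vec<String>),
    /// Any other flow construct (heading, thematic break, ...).
    Other(String),
}

/// Merge runs of adjacent paragraphs into a single paragraph,
/// as a post-processing pass after every line was parsed on its own.
fn merge_adjacent_paragraphs(blocks: Vec<Block>) -> Vec<Block> {
    let mut merged: Vec<Block> = Vec::with_capacity(blocks.len());
    for block in blocks {
        match block {
            Block::Paragraph(lines) => match merged.last_mut() {
                // The previous block is a paragraph too: extend it.
                Some(Block::Paragraph(previous)) => previous.extend(lines),
                // Otherwise this line starts a new paragraph.
                _ => merged.push(Block::Paragraph(lines)),
            },
            other => merged.push(other),
        }
    }
    merged
}
```

Under these assumptions, each line is inspected once by the flow parser, and the merge is a single linear pass over the result.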
2022-06-30 | Add support for trimming whitespace around string, text | Titus Wormer | 1 | -1/+8
This commit introduces trimming initial and final whitespace around the whole string or text, or around line endings inside that string or text.

* Add `register_resolver_before`, to run resolvers earlier than others, used for labels
* Add resolver to merge `data` events, which are the most frequent token that occurs, and can happen adjacently. In `micromark-js` this sped up parsing a lot
* Fix a bug where a virtual space was not seen as an okay event
* Refactor to enable all turned off whitespace tests
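As a rough illustration of the trimming described here, the sketch below strips spaces and tabs at the start and end of the whole text and on both sides of every line ending. The function name is hypothetical, and it assumes `\n` line endings only; the actual resolver operates on events and also has to deal with constructs such as trailing hard breaks.

```rust
/// Trim spaces and tabs around the whole text and around each `\n` inside it.
/// Simplified sketch: works on a plain string and only knows `\n` line endings.
fn trim_around_line_endings(text: &str) -> String {
    text.split('\n')
        // Each piece is one line; trimming both ends of every line removes
        // whitespace before and after each line ending, and at both ends
        // of the whole text.
        .map(|line| line.trim_matches(|c: char| c == ' ' || c == '\t'))
        .collect::<Vec<_>>()
        .join("\n")
}
```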
2022-06-30 | Add docs on resolver, clean feed | Titus Wormer | 1 | -77/+97
2022-06-30 | Refactor to reorder token types | Titus Wormer | 1 | -303/+303
2022-06-30 | Add docs to image, link, and other media tokens | Titus Wormer | 1 | -16/+434
2022-06-29 | Refactor to externalize handlers of compiler | Titus Wormer | 1 | -1/+1
2022-06-28 | Fix jumps in `edit_map` | Titus Wormer | 1 | -33/+16
* Use resolve more often (e.g., heading (atx, setext))
* Fix to link whole phrasing (e.g., one big chunk of text in heading (atx, setext), titles, labels)
* Replace `ChunkText`, `ChunkString`, with `event.content_type: Option<ContentType>`
* Refactor to externalize `edit_map` from `label`
2022-06-24 | Add link, images (resource) | Titus Wormer | 1 | -5/+106
This is still some messy code that needs cleaning up, but it adds support for links and images, of the resource kind (`[a](b)`). References (`[a][b]`) are parsed and will soon be supported, but need matching.

* Fix bug to pad percent-encoded bytes when normalizing urls
* Fix bug with escapes counting as balancing in destination
* Add `space_or_tab_one_line_ending`, to parse whitespace including up to one line ending (but not a blank line)
* Add `ParserState` to share codes, definitions, etc
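The padding fix mentioned in the first bullet concerns bytes below `0x10`, which must be written with two hex digits (`%09`, not `%9`). A minimal sketch of that idea follows; the function names and the set of characters left unencoded are assumptions made for this example, not the crate's actual URL normalizer.

```rust
/// Percent-encode one byte, always padded to two uppercase hex digits.
fn percent_encode_byte(byte: u8) -> String {
    format!("%{:02X}", byte)
}

/// Hypothetical, simplified URL normalizer: keeps ASCII alphanumerics and a
/// fixed set of URL punctuation as-is and percent-encodes every other byte.
fn percent_encode(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for &byte in input.as_bytes() {
        if byte.is_ascii_alphanumeric() || b"-._~/:?#[]@!$&'()*+,;=%".contains(&byte) {
            out.push(byte as char);
        } else {
            out.push_str(&percent_encode_byte(byte));
        }
    }
    out
}
```

Without the `02` width in the format string, a tab would be emitted as `%9`, which is not a valid percent-encoded triplet.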
2022-06-22 | Refactor some unneeded assignments | Titus Wormer | 1 | -3/+1
2022-06-22 | Add `attempt_opt` to tokenizer | Titus Wormer | 1 | -35/+30
2022-06-22 | Rename `Whitespace` token to `SpaceOrTab` | Titus Wormer | 1 | -15/+15
2022-06-22 | Refactor to improve tokenizer, add docs | Titus Wormer | 1 | -144/+58
2022-06-22 | Add docs for token types | Titus Wormer | 1 | -6/+1102
2022-06-21 | Add support for passing token types to destination, label, title | Titus Wormer | 1 | -1/+1
2022-06-21 | Refactor to improve a bunch of states | Titus Wormer | 1 | -0/+1
* Improve passing stuff around
* Add traits to enums for markers and such
* Fix “life time” stuff I didn’t understand
2022-06-21 | Update todo list | Titus Wormer | 1 | -16/+23
2022-06-21 | Add support for inferring line ending, configurable | Titus Wormer | 1 | -2/+4
* Rename `CompileOptions` to `Options`
* Add support for an optional default line ending style
* Add support for inferring the used line ending style
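One way to read the last two bullets: use the first line ending found in the input, fall back to the configured default if there is none, and fall back to `\n` otherwise. The sketch below assumes that precedence; the `LineEnding` enum and `infer_line_ending` function are illustrative names, not necessarily the crate's API.

```rust
/// Hypothetical line ending styles.
#[derive(Debug, Clone, Copy, PartialEq)]
enum LineEnding {
    CarriageReturnLineFeed,
    CarriageReturn,
    LineFeed,
}

/// Infer the line ending style from the first line ending in `input`.
/// Assumed fallback order: the optional `default`, then `\n`.
fn infer_line_ending(input: &str, default: Option<LineEnding>) -> LineEnding {
    let bytes = input.as_bytes();
    for (index, &byte) in bytes.iter().enumerate() {
        match byte {
            // `\r` directly followed by `\n` is one CRLF line ending.
            b'\r' if bytes.get(index + 1) == Some(&b'\n') => {
                return LineEnding::CarriageReturnLineFeed
            }
            b'\r' => return LineEnding::CarriageReturn,
            b'\n' => return LineEnding::LineFeed,
            _ => {}
        }
    }
    default.unwrap_or(LineEnding::LineFeed)
}
```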
2022-06-20 | Add support for BOM | Titus Wormer | 1 | -0/+10
2022-06-20 | Fix bug with tabs | Titus Wormer | 1 | -1/+6
2022-06-20 | Add improved whitespace handling | Titus Wormer | 1 | -27/+5
* add several helpers for parsing between x and y `space_or_tab`s
* use those helpers in a bunch of places
* move initial indent parsing to flow constructs themselves
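A helper that parses "between x and y" spaces or tabs could look roughly like the sketch below. The name and signature are assumptions for illustration, and each tab counts as a single character here rather than being expanded to a tab stop.

```rust
/// Consume between `min` and `max` leading spaces or tabs from `input`.
/// Returns the number of bytes consumed, or `None` if fewer than `min`
/// are present. Never consumes more than `max`, even if more follow.
fn space_or_tab_min_max(input: &str, min: usize, max: usize) -> Option<usize> {
    let count = input
        .bytes()
        .take(max)
        .take_while(|&byte| byte == b' ' || byte == b'\t')
        .count();
    if count >= min {
        Some(count)
    } else {
        None
    }
}
```

With a helper of this shape, a construct that allows up to three spaces of indent can call it with `min = 0` and `max = 3`, and a construct that requires whitespace can pass `min = 1`.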
2022-06-20 | Remove unneeded `content` content type | Titus Wormer | 1 | -2/+0
2022-06-17 | Add support for definitions | Titus Wormer | 1 | -15/+67
* Add definitions
* Add partials for label, destination, title
* Add `go`, to attempt something, and do something else on `ok`
2022-06-16 | Add heading (setext) | Titus Wormer | 1 | -4/+7
2022-06-16 | Refactor to reorder thing alphabetically | Titus Wormer | 1 | -21/+7
2022-06-16 | Add support for hard break (trailing) | Titus Wormer | 1 | -2/+11
2022-06-16 | Add support for hard break escape | Titus Wormer | 1 | -3/+11
2022-06-15 | Add code (text) | Titus Wormer | 1 | -5/+22
2022-06-14 | Reorganize to split util | Titus Wormer | 1 | -0/+2
2022-06-13 | Add basic html (text) | Titus Wormer | 1 | -2/+32
* Add all states for html (text)
* Fix to link paragraph tokens together
* Add note about uncovered bug where linking paragraph tokens together doesn’t work 😅
2022-06-13 | Add autolinks | Titus Wormer | 1 | -0/+5
2022-06-10 | Add text content type | Titus Wormer | 1 | -1/+1
* Add character reference and character escapes in text
* Add recursive subtokenization
2022-06-10 | Add proper support for subtokenization | Titus Wormer | 1 | -3/+37
- Add “content” content type
- Add paragraph
- Add skips
- Add linked tokens
2022-06-09 | Add basic subtokenization, string content in fenced code | Titus Wormer | 1 | -7/+8
2022-06-09 | Refactor to pass more slices around | Titus Wormer | 1 | -5/+5
2022-06-09 | Add basic support for interrupting content | Titus Wormer | 1 | -1/+57