path: root/src/subtokenize.rs
Commit message | Author | Age | Files | Lines
* Refactor to improve entering | Titus Wormer | 2022-08-12 | 1 | -4/+4
* Refactor to improve docs of each function | Titus Wormer | 2022-08-12 | 1 | -17/+19
* Refactor to move `space_or_tab_eol` to own file | Titus Wormer | 2022-08-11 | 1 | -2/+2
* Refactor to move some code to `event.rs` | Titus Wormer | 2022-08-11 | 1 | -11/+12
* Refactor to move some code to `state.rs` | Titus Wormer | 2022-08-11 | 1 | -3/+4
* Refactor internal docs, code style of tokenizer | Titus Wormer | 2022-08-11 | 1 | -5/+1
* Add improved container exit injection | Titus Wormer | 2022-08-11 | 1 | -5/+8
* Rename `State::Fn` to `State::Next` | Titus Wormer | 2022-08-10 | 1 | -2/+2
* Refactor to share some code | Titus Wormer | 2022-08-09 | 1 | -56/+86
* Rewrite algorithm to not pass around boxed functions | Titus Wormer | 2022-08-09 | 1 | -7/+6
    * Pass state names from an enum around instead of boxed functions
    * Refactor to simplify attempts a lot
    * Use a subtokenizer for the the `document` content type
* Refactor to use `debug_assert` | Titus Wormer | 2022-07-28 | 1 | -7/+7
* Refactor to drastically improve perf around whitespace | Titus Wormer | 2022-07-26 | 1 | -6/+8
* Refactor to simplify tokenizer | Titus Wormer | 2022-07-26 | 1 | -6/+3
* Refactor to remove need for cloning codes | Titus Wormer | 2022-07-25 | 1 | -10/+4
* Improve performance w/ a single feed loop | Titus Wormer | 2022-07-25 | 1 | -2/+6
* Refactor to remove unneeded tuples in every states | Titus Wormer | 2022-07-22 | 1 | -13/+9
* Refactor to pass ints instead of vecs around | Titus Wormer | 2022-07-22 | 1 | -4/+6
* Refactor to move `index` field to `point` | Titus Wormer | 2022-07-21 | 1 | -5/+5
* Refactor to move some event fields to `link` | Titus Wormer | 2022-07-21 | 1 | -35/+36
* Refactor to share edit map | Titus Wormer | 2022-07-20 | 1 | -3/+3
* Refactor to use less vecs for events | Titus Wormer | 2022-07-20 | 1 | -2/+4
* Refactor to remove cloning in `edit_map` | Titus Wormer | 2022-07-19 | 1 | -2/+2
* Use `edit_map` in `subtokenize` | Titus Wormer | 2022-07-19 | 1 | -67/+40
* Remove an unneeded `HashMap` | Titus Wormer | 2022-07-19 | 1 | -1/+1
* Fix annoying bug around virtual spaces in containers | Titus Wormer | 2022-07-15 | 1 | -1/+1
* Add support for `Flow` content type | Titus Wormer | 2022-07-07 | 1 | -2/+4
* Refactor to do some to dos | Titus Wormer | 2022-07-05 | 1 | -3/+2
* Add support for unicode punctuation | Titus Wormer | 2022-07-04 | 1 | -1/+1
* Update list of todos | Titus Wormer | 2022-07-04 | 1 | -2/+0
* Fix jumps in `edit_map` | Titus Wormer | 2022-06-28 | 1 | -101/+99
    * Use resolve more often (e.g., heading (atx, setext))
    * Fix to link whole phrasing (e.g., one big chunk of text in heading (atx, setext), titles, labels)
    * Replace `ChunkText`, `ChunkString`, with `event.content_type: Option<ContentType>`
    * Refactor to externalize `edit_map` from `label`
* Add link, images (resource) | Titus Wormer | 2022-06-24 | 1 | -12/+26
    This is still some messy code that needs cleaning up, but it adds support for links and images, of the resource kind (`[a](b)`). References (`[a][b]`) are parsed and will soon be supported, but need matching.
    * Fix bug to pad percent-encoded bytes when normalizing urls
    * Fix bug with escapes counting as balancing in destination
    * Add `space_or_tab_one_line_ending`, to parse whitespace including up to one line ending (but not a blank line)
    * Add `ParserState` to share codes, definitions, etc
* Refactor some unneeded assignments | Titus Wormer | 2022-06-22 | 1 | -2/+1
* Add docs for token types | Titus Wormer | 2022-06-22 | 1 | -1/+3
* Add docs for `subtokenize` | Titus Wormer | 2022-06-21 | 1 | -2/+51
* Update todo list | Titus Wormer | 2022-06-21 | 1 | -8/+1
* Add support for BOM | Titus Wormer | 2022-06-20 | 1 | -0/+4
* Remove unneeded `content` content type | Titus Wormer | 2022-06-20 | 1 | -6/+3
* Fix support for deep subtokenization | Titus Wormer | 2022-06-14 | 1 | -9/+19
    * Fix a couple of forgotten line ending handling in html (text)
    * Fix missing initial case for html (text) not having a `<` 😬
    * Add line ending handling to `text` construct
* Reorganize to split util | Titus Wormer | 2022-06-14 | 1 | -6/+4
* Add docs for html (text) | Titus Wormer | 2022-06-14 | 1 | -0/+1
* Add basic html (text) | Titus Wormer | 2022-06-13 | 1 | -3/+9
    * Add all states for html (text)
    * Fix to link paragraph tokens together
    * Add note about uncovered bug where linking paragraph tokens together doesn’t work 😅
* Add text content type | Titus Wormer | 2022-06-10 | 1 | -4/+10
    * Add character reference and character escapes in text
    * Add recursive subtokenization
* Add proper support for subtokenization | Titus Wormer | 2022-06-10 | 1 | -50/+116
    - Add “content” content type
    - Add paragraph
    - Add skips
    - Add linked tokens
* Add basic subtokenization, string content in fenced code | Titus Wormer | 2022-06-09 | 1 | -0/+67