path: root/src/subtokenize.rs
Commit message | Author | Age | Files | Lines

* Add support for BOM | Titus Wormer | 2022-06-20 | 1 | -0/+4

* Remove unneeded `content` content type | Titus Wormer | 2022-06-20 | 1 | -6/+3

* Fix support for deep subtokenization | Titus Wormer | 2022-06-14 | 1 | -9/+19
    * Fix a couple of forgotten line ending handling in html (text)
    * Fix missing initial case for html (text) not having a `<` 😬
    * Add line ending handling to `text` construct

* Reorganize to split util | Titus Wormer | 2022-06-14 | 1 | -6/+4

* Add docs for html (text) | Titus Wormer | 2022-06-14 | 1 | -0/+1

* Add basic html (text) | Titus Wormer | 2022-06-13 | 1 | -3/+9
    * Add all states for html (text)
    * Fix to link paragraph tokens together
    * Add note about uncovered bug where linking paragraph tokens together doesn’t work 😅

* Add text content type | Titus Wormer | 2022-06-10 | 1 | -4/+10
    * Add character reference and character escapes in text
    * Add recursive subtokenization

* Add proper support for subtokenization | Titus Wormer | 2022-06-10 | 1 | -50/+116
    - Add “content” content type
    - Add paragraph
    - Add skips
    - Add linked tokens

* Add basic subtokenization, string content in fenced code | Titus Wormer | 2022-06-09 | 1 | -0/+67