path: root/src/subtokenize.rs
Date | Commit message | Author | Files | Lines
2022-06-21 | Update todo list | Titus Wormer | 1 | -8/+1
2022-06-20 | Add support for BOM | Titus Wormer | 1 | -0/+4
2022-06-20 | Remove unneeded `content` content type | Titus Wormer | 1 | -6/+3
2022-06-14 | Fix support for deep subtokenization | Titus Wormer | 1 | -9/+19
    * Fix a couple of forgotten line ending handling in html (text)
    * Fix missing initial case for html (text) not having a `<` 😬
    * Add line ending handling to `text` construct
2022-06-14 | Reorganize to split util | Titus Wormer | 1 | -6/+4
2022-06-14 | Add docs for html (text) | Titus Wormer | 1 | -0/+1
2022-06-13 | Add basic html (text) | Titus Wormer | 1 | -3/+9
    * Add all states for html (text)
    * Fix to link paragraph tokens together
    * Add note about uncovered bug where linking paragraph tokens together doesn’t work 😅
2022-06-10 | Add text content type | Titus Wormer | 1 | -4/+10
    * Add character reference and character escapes in text
    * Add recursive subtokenization
2022-06-10 | Add proper support for subtokenization | Titus Wormer | 1 | -50/+116
    - Add “content” content type
    - Add paragraph
    - Add skips
    - Add linked tokens
2022-06-09 | Add basic subtokenization, string content in fenced code | Titus Wormer | 1 | -0/+67