From 0450e7c2b12bd3ef53e0cffb60a3dd860325b478 Mon Sep 17 00:00:00 2001
From: Titus Wormer
Date: Mon, 4 Jul 2022 15:21:11 +0200
Subject: Add support for unicode punctuation

---
 readme.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

(limited to 'readme.md')

diff --git a/readme.md b/readme.md
index ef943b9..6c3ecf3 100644
--- a/readme.md
+++ b/readme.md
@@ -125,6 +125,7 @@ cargo doc --document-private-items
 #### Refactor
 
 - [ ] (1) Use `edit_map` in `subtokenize` (needs to support links in edits)
+- [ ] (1) Use rust to crawl unicode
 
 #### Parse
 
@@ -151,7 +152,6 @@ cargo doc --document-private-items
 #### Misc
 
 - [ ] (1) use `char::REPLACEMENT_CHARACTER`?
-- [ ] (3) Unicode punctuation
 - [ ] (3) `nostd`
 - [ ] (3) Check subtokenizer unraveling is ok
 - [ ] (3) Remove splicing and cloning in subtokenizer
@@ -275,3 +275,4 @@ important.
 things interrupt them each line
 - [x] (3) Add support for interrupting (or not)
 - [x] (5) attention
+- [x] (3) Unicode punctuation
-- 
cgit