Topic
#javascript
2 pieces
Your word counter thinks a Japanese paragraph is one word
Count words with `text.split(/\s+/)` and a whole Japanese or Chinese paragraph comes back as one word, because those languages are written with no spaces between words. Reading-time estimates then read "1 min" and minimum-length gates reject valid answers. The fix is to count CJK characters separately, or to segment with `Intl.Segmenter` at word granularity.
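A minimal sketch of both fixes, assuming a runtime that ships `Intl.Segmenter` with full ICU data (recent Node.js and current browsers). The helper names `countWords` and `countCjkChars`, the default locale, and the sample sentence are illustrative, not taken from the article. The exact word count for Japanese depends on the runtime's ICU dictionary, so treat it as an estimate.

```javascript
// Hypothetical helper: count word-like segments with Intl.Segmenter.
// isWordLike is false for whitespace and punctuation segments, so those are skipped.
function countWords(text, locale = "ja") {
  const segmenter = new Intl.Segmenter(locale, { granularity: "word" });
  let count = 0;
  for (const segment of segmenter.segment(text)) {
    if (segment.isWordLike) count += 1;
  }
  return count;
}

// Hypothetical fallback: count Han, Hiragana, and Katakana characters directly.
// The script set is an assumption; extend it for the writing systems you need.
function countCjkChars(text) {
  const matches = text.match(/[\p{Script=Han}\p{Script=Hiragana}\p{Script=Katakana}]/gu);
  return matches ? matches.length : 0;
}

const paragraph = "今日は天気がいいですね";

console.log(paragraph.split(/\s+/).length); // 1: the naive count
console.log(countWords(paragraph));         // several words; exact number varies by ICU version
console.log(countCjkChars(paragraph));      // 11
```

For reading-time estimates, the character count is often enough, since CJK reading speed is usually quoted in characters per minute rather than words per minute.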
The substring that cuts a character in half
Slice a string that holds an emoji or a rare kanji and you can split one character into two, because JavaScript indexes strings by UTF-16 code unit, not by character. `[...str]` and `Array.from(str)` iterate by code point, which fixes surrogate pairs but still tears ZWJ emoji and combining marks apart; the only walk that respects every visible character is `Intl.Segmenter` at grapheme granularity.
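The three levels can be compared side by side. This is a sketch that assumes `Intl.Segmenter` is available. `sliceGraphemes` is a hypothetical helper, and the sample strings are written with escapes so the invisible joiners survive copy and paste.

```javascript
// Hypothetical helper: slice by visible character (grapheme cluster), not by code unit.
function sliceGraphemes(str, start, end, locale = "en") {
  const segmenter = new Intl.Segmenter(locale, { granularity: "grapheme" });
  const graphemes = Array.from(segmenter.segment(str), (s) => s.segment);
  return graphemes.slice(start, end).join("");
}

// Rare kanji U+20BB7 followed by two common kanji.
const name = "\u{20BB7}野家";
console.log(name.slice(0, 1) === "\uD842"); // true: a lone high surrogate, half a character
console.log([...name][0] === "\u{20BB7}");  // true: code-point iteration keeps the pair together

// Family emoji: man + ZWJ + woman + ZWJ + girl.
const family = "\u{1F468}\u200D\u{1F469}\u200D\u{1F467}";
console.log(family.length);                         // 8 UTF-16 code units
console.log([...family].length);                    // 5 code points, so spreading tears it apart
console.log(sliceGraphemes(family, 0, 1) === family); // true: one grapheme

// "e" plus a combining acute accent.
const accented = "e\u0301";
console.log([...accented].length);                       // 2 code points
console.log(sliceGraphemes(accented, 0, 1) === accented); // true: one grapheme
```

Building the full grapheme array on every call is fine for short strings; for long text, it may be worth iterating the segments lazily and stopping at `end`.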