Unicode Internals of Perl 6


20 minutes



This talk is for those who are curious about the implementation details of Unicode in Perl 6. I go over some of the recent improvements to both case-sensitive and case-insensitive string search, as well as the implementation of the Unicode Collation Algorithm. I discuss the implementation issues involved, and the talk also explains:

* How are strings stored inside MoarVM?
* When and how are strings normalized, and what difficulties does this present for the MoarVM codebase?
* MoarVM recently gained support for case-insensitive regex matching using foldcase rather than lowercase semantics. Foldcasing a string can change its length. I discuss how this was solved.
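
To make the last point concrete, here is a small illustration of why foldcase and lowercase give different results and why folding can change a string's length. It is written in Python rather than Perl 6 and uses Python's built-in `str.casefold()`; it is only a sketch of the general Unicode behaviour, not MoarVM's implementation, and the helper names `lower_equal` and `fold_equal` are made up for this example.

```python
# Illustration only: Python's str.lower() vs str.casefold().
# This is not MoarVM code; the helper names are hypothetical.

def lower_equal(a: str, b: str) -> bool:
    """Case-insensitive comparison using lowercase semantics."""
    return a.lower() == b.lower()


def fold_equal(a: str, b: str) -> bool:
    """Case-insensitive comparison using full case folding."""
    return a.casefold() == b.casefold()


# U+00DF (sharp s) and U+FB01 (fi ligature) both fold to two code points.
samples = ["Stra\u00dfe", "\ufb01"]

for s in samples:
    folded = s.casefold()
    # The folded string can be longer than the original, so offsets into
    # the folded text do not map one-to-one onto the original text.
    print(repr(s), len(s), "->", repr(folded), len(folded))

print(lower_equal("Stra\u00dfe", "STRASSE"))
print(fold_equal("Stra\u00dfe", "STRASSE"))
```

Because the folded text can be longer than the original, a matcher that folds before comparing has to keep track of how positions in the folded text correspond to positions in the original string.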