generated from mwc/lab_encoding
Finished checkpoint 3, hypothesis on utf8
@@ -116,4 +116,6 @@ I was a bit hesitant about this one, because my math brain tells me that there i
Make a hypothesis about how this could work.
There must be some sort of pattern to the way characters are encoded in the first place. It must have something to do with how many zeros are next to each other. Maybe there's a rule that a byte can't contain more than 4 zeros in a row? Then when the decoder sees more than that in a byte, it stops and decodes everything up to that point?
I decided to look this up after I hypothesized because I was struggling to find a pattern, and the actual way it works is super cool! UTF-8 uses the leading bits of the first byte to signal how many bytes long the character is, and then the decoder reads exactly that many!
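
To make that concrete, here is a minimal Python sketch of the length rule, not how any real decoder is written. The function names `utf8_sequence_length` and `split_characters` are made up for illustration, and the sketch assumes the input is already valid UTF-8 (it does not check the continuation bytes itself).

```python
def utf8_sequence_length(first_byte: int) -> int:
    """Return how many bytes the character starting with first_byte uses."""
    if first_byte >> 7 == 0b0:       # 0xxxxxxx: 1 byte (plain ASCII)
        return 1
    if first_byte >> 5 == 0b110:     # 110xxxxx: 2 bytes
        return 2
    if first_byte >> 4 == 0b1110:    # 1110xxxx: 3 bytes
        return 3
    if first_byte >> 3 == 0b11110:   # 11110xxx: 4 bytes
        return 4
    # 10xxxxxx is a continuation byte, so it can never start a character.
    raise ValueError(f"not a valid UTF-8 leading byte: {first_byte:#04x}")


def split_characters(data: bytes) -> list:
    """Split UTF-8 bytes into characters using only the leading-byte rule."""
    chars = []
    i = 0
    while i < len(data):
        n = utf8_sequence_length(data[i])
        # Hypothetical shortcut: let Python decode the n-byte slice for us.
        chars.append(data[i:i + n].decode("utf-8"))
        i += n
    return chars
```

For example, `split_characters("aé".encode("utf-8"))` walks over three bytes but produces two characters, because the first byte of `é` starts with `110` and so claims the byte after it.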