A web-based collaborative LaTeX editor
Find a file
James Allen a3847d21d5 Replace UTF-16 surrogate characters with 'replacement character'
In Javascript, characters are 16-bits wide. It does not understand surrogates as characters.

From Wikipedia (http://en.wikipedia.org/wiki/Plane_(Unicode)#Basic_Multilingual_Plane):
"The High Surrogates (U+D800–U+DBFF) and Low Surrogate (U+DC00–U+DFFF) codes are reserved
for encoding non-BMP characters in UTF-16 by using a pair of 16-bit codes: one High Surrogate
and one Low Surrogate. A single surrogate code point will never be assigned a character.""

The main offender seems to be \uD835 as a stand alone character, which would be the first
16-bit character of a blackboard bold character (http://www.fileformat.info/info/unicode/char/1d400/index.htm).
Something must be going on client side that is screwing up the encoding and splitting the
two 16-bit characters so that \uD835 is standalone.
2015-06-12 10:14:35 +01:00
services/document-updater Replace UTF-16 surrogate characters with 'replacement character' 2015-06-12 10:14:35 +01:00