In Javascript, characters are 16-bits wide. It does not understand surrogates as characters.
From Wikipedia (http://en.wikipedia.org/wiki/Plane_(Unicode)#Basic_Multilingual_Plane):
"The High Surrogates (U+D800–U+DBFF) and Low Surrogate (U+DC00–U+DFFF) codes are reserved
for encoding non-BMP characters in UTF-16 by using a pair of 16-bit codes: one High Surrogate
and one Low Surrogate. A single surrogate code point will never be assigned a character.""
The main offender seems to be \uD835 as a stand alone character, which would be the first
16-bit character of a blackboard bold character (http://www.fileformat.info/info/unicode/char/1d400/index.htm).
Something must be going on client side that is screwing up the encoding and splitting the
two 16-bit characters so that \uD835 is standalone.
The version number is only used by the doc updater and so can be cleanly
encapsulated in a collection that only the doc updater knows about. The version
is already stored under docOps, so continue to tore it there.