Zero-width space

The zero-width space (ZWSP; rendered: ​; HTML entity: ​ or ​) is a non-printing character used in computerized typesetting to indicate word boundaries to text-processing systems for scripts that do not use explicit spacing, or after characters not followed by a visible space after which there may be a line break.

Purpose

The zero-width space marks a potential line break without hyphenation. Its semantics and HTML implementation are similar to the soft hyphen, but soft hyphens display a hyphen character at the point where the line is broken.

The zero-width space can be used to mark word breaks in languages without visible space between words, such as Thai, Myanmar, Khmer, and Japanese.[1][2]

Unlike fixed-width spaces, in justified text that increases spacing between letters, characters adjacent to the zero-width space are spaced as if it was not present.[2]

Example

To show the effect of the zero-width space in text, the following words have been separated with zero-width spaces:

Lorem​Ipsum​Dolor​Sit​Amet​Consectetur​Adipiscing​Elit​Sed​Do​Eiusmod​Tempor​Incididunt​Ut​Labore​Et​Dolore​Magna​Aliqua​Ut​Enim​Ad​Minim​Veniam​Quis​Nostrud​Exercitation​Ullamco​Laboris​Nisi​Ut​Aliquip​Ex​Ea​Commodo​Consequat​Duis​Aute​Irure​Dolor​In​Reprehenderit​In​Voluptate​Velit​Esse​Cillum​Dolore​Eu​Fugiat​Nulla​Pariatur​Excepteur​Sint​Occaecat​Cupidatat​Non​Proident​Sunt​In​Culpa​Qui​Officia​Deserunt​Mollit​Anim​Id​Est​Laborum

And the following words have not been separated with these spaces:

LoremIpsumDolorSitAmetConsecteturAdipiscingElitSedDoEiusmodTemporIncididuntUtLaboreEtDoloreMagnaAliquaUtEnimAdMinimVeniamQuisNostrudExercitationUllamcoLaborisNisiUtAliquipExEaCommodoConsequatDuisAuteIrureDolorInReprehenderitInVoluptateVelitEsseCillumDoloreEuFugiatNullaPariaturExcepteurSintOccaecatCupidatatNonProidentSuntInCulpaQuiOfficiaDeseruntMollitAnimIdEstLaborum

The first text only breaks at word boundaries, while the second text will not be broken at all. Resizing the browser window will re-break the text accordingly.

Usage

HTML

In HTML pages, the HTML element <wbr> functions as a zero-width space. In Internet Explorer 6, the zero-width space was not supported in some fonts.[3]

Prohibition in domain names

ICANN rules prohibit domain names from containing non-displayed characters, including the zero-width space, and most browsers prohibit their use within domain names because they can be used to create a homograph attack, where a malicious URL is visually indistinguishable from a legitimate one.[4][5]

Encoding

The zero-width space character is encoded in Unicode as U+200B ZERO WIDTH SPACE,[6] and input in HTML as &ZeroWidthSpace;, &#8203; or &#x200B;. Contrary to what their names suggest, the character entities &NegativeThickSpace;, &NegativeMediumSpace;, &NegativeThinSpace;, and &NegativeVeryThinSpace; also refer to the zero-width space.[7]

The TeX representation is \hskip0pt; the LaTeX representation is \hspace{0pt};[8] and the groff representation is \:.[9]

See also

References

Citations

Sources