The zero-width space (ZWSP; HTML entity: &#8203; or &ZeroWidthSpace;) is a non-printing character used in computerized typesetting to indicate word boundaries to text-processing systems in scripts that do not use explicit spacing, or to mark a permissible line break after a character that is not followed by a visible space.
Purpose
The zero-width space marks a potential line break without hyphenation. Its semantics and HTML implementation are similar to those of the soft hyphen, except that a soft hyphen displays a hyphen character at the point where the line is broken.
The zero-width space can be used to mark word breaks in languages without visible space between words, such as Thai, Myanmar, Khmer, and Japanese.[1][2]
Unlike a fixed-width space, the zero-width space does not affect letter spacing: in justified text that increases the spacing between letters, the characters adjacent to the zero-width space are spaced as if it were not present.[2]
Example
To show the effect of the zero-width space in text, the following words have been separated with zero-width spaces:
LoremIpsumDolorSitAmetConsecteturAdipiscingElitSedDoEiusmodTemporIncididuntUtLaboreEtDoloreMagnaAliquaUtEnimAdMinimVeniamQuisNostrudExercitationUllamcoLaborisNisiUtAliquipExEaCommodoConsequatDuisAuteIrureDolorInReprehenderitInVoluptateVelitEsseCillumDoloreEuFugiatNullaPariaturExcepteurSintOccaecatCupidatatNonProidentSuntInCulpaQuiOfficiaDeseruntMollitAnimIdEstLaborum
And the following words have not been separated with these spaces:
LoremIpsumDolorSitAmetConsecteturAdipiscingElitSedDoEiusmodTemporIncididuntUtLaboreEtDoloreMagnaAliquaUtEnimAdMinimVeniamQuisNostrudExercitationUllamcoLaborisNisiUtAliquipExEaCommodoConsequatDuisAuteIrureDolorInReprehenderitInVoluptateVelitEsseCillumDoloreEuFugiatNullaPariaturExcepteurSintOccaecatCupidatatNonProidentSuntInCulpaQuiOfficiaDeseruntMollitAnimIdEstLaborum
The first text breaks only at word boundaries, while the second text is not broken at all. Resizing the browser window re-breaks the text accordingly.
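The behaviour described in this example can be imitated outside a browser. The following Python sketch is only an illustration: the function name wrap_at_zwsp and the width of 12 characters are assumptions made for this example, and the greedy wrapping rule is a simplification rather than the line-breaking algorithm a browser actually uses.

```python
ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE


def wrap_at_zwsp(text: str, width: int) -> list[str]:
    """Greedily wrap text, breaking only where a zero-width space occurs.

    Hypothetical helper for illustration; a chunk longer than `width`
    is left unbroken, as in the second text of the example.
    """
    lines: list[str] = []
    current = ""
    for chunk in text.split(ZWSP):
        # Start a new line when adding the next chunk would exceed the width.
        if current and len(current) + len(chunk) > width:
            lines.append(current)
            current = chunk
        else:
            current += chunk
    if current:
        lines.append(current)
    return lines


words = ["Lorem", "Ipsum", "Dolor", "Sit", "Amet"]
with_breaks = ZWSP.join(words)   # a break opportunity between every word
without_breaks = "".join(words)  # no break opportunity at all

print(wrap_at_zwsp(with_breaks, 12))     # ['LoremIpsum', 'DolorSitAmet']
print(wrap_at_zwsp(without_breaks, 12))  # ['LoremIpsumDolorSitAmet']
```

Both strings look identical when printed, because the zero-width space has no visible glyph; only the wrapping result differs.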
Usage
HTML
In HTML pages, the HTML element <wbr> functions as a zero-width space. In Internet Explorer 6, the zero-width space was not supported in some fonts.[3]
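Because the two mark the same kind of break opportunity, text containing zero-width spaces can be converted to markup that uses <wbr> instead, for example as a fallback where font support is in doubt. The sketch below is one possible approach; the helper name zwsp_to_wbr is hypothetical and not part of any library.

```python
import html

ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE


def zwsp_to_wbr(text: str) -> str:
    """Escape text for HTML and turn each zero-width space into <wbr>.

    Hypothetical helper for illustration only.
    """
    # Escape first, so that the <wbr> tags added afterwards are not escaped.
    return html.escape(text).replace(ZWSP, "<wbr>")


print(zwsp_to_wbr("a/\u200bb & c"))  # a/<wbr>b &amp; c
```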
Prohibition in domain names
ICANN rules prohibit domain names from containing non-displayed characters, including the zero-width space, and most browsers prohibit their use within domain names because they can be used to create a homograph attack, where a malicious URL is visually indistinguishable from a legitimate one.[4][5]
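A check for such characters can be sketched in a few lines of Python. This is not the rule that ICANN or any browser applies; it assumes, for illustration, that "non-displayed" can be approximated by the Unicode general category Cf (format characters), which includes the zero-width space. The function name find_invisible is hypothetical.

```python
import unicodedata


def find_invisible(hostname: str) -> list[tuple[int, str]]:
    """Return (index, code point) for each format character in hostname.

    Assumption: Unicode category "Cf" is used as a stand-in for
    "non-displayed character"; real registries apply stricter rules.
    """
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(hostname)
        if unicodedata.category(ch) == "Cf"
    ]


print(find_invisible("example.com"))        # []
print(find_invisible("exam\u200bple.com"))  # [(4, 'U+200B')]
```

The two hostnames print identically in most fonts, which is what makes the second one usable in a homograph attack.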
Encoding
The zero-width space character is encoded in Unicode as U+200B ZERO WIDTH SPACE,[6] and input in HTML as &#8203;, &#x200B; or &ZeroWidthSpace;. Contrary to what their names suggest, the character entities &NegativeVeryThinSpace;, &NegativeThinSpace;, &NegativeMediumSpace;, and &NegativeThickSpace; also refer to the zero-width space.[7]
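The equivalence of these character references can be checked with Python's standard html module, which decodes the HTML named and numeric character references. The sketch below assumes the entity names from the HTML named character reference list (ZeroWidthSpace and the four Negative...Space names); it also shows the Unicode name and the UTF-8 byte sequence of the character.

```python
import html
import unicodedata

ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE

# Numeric and named references assumed to decode to U+200B.
references = [
    "&#8203;",
    "&#x200B;",
    "&ZeroWidthSpace;",
    "&NegativeVeryThinSpace;",
    "&NegativeThinSpace;",
    "&NegativeMediumSpace;",
    "&NegativeThickSpace;",
]

decoded = {ref: html.unescape(ref) for ref in references}
for ref, ch in decoded.items():
    print(ref, "->", f"U+{ord(ch):04X}")

print(unicodedata.name(ZWSP))  # ZERO WIDTH SPACE
print(ZWSP.encode("utf-8"))    # b'\xe2\x80\x8b'
```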
The TeX representation is \hskip0pt; the LaTeX representation is \hspace{0pt};[8] and the groff representation is \:.[9]
See also
- Hair space
- Whitespace character – including a table comparing various space-like characters
- Word divider
- Word wrapping
- Word joiner (U+2060), as well as zero-width no-break space (U+FEFF)
- Zero-width joiner (U+200D)
- Zero-width non-joiner (U+200C)
References
Citations
Sources
- Mair, Victor H.; Liu, Yongquan (1991), Characters and computers, IOS Press