Copyright | (c) The University of Glasgow, 2008-2011 |
---|---|
License | see libraries/base/LICENSE |
Maintainer | libraries@haskell.org |
Stability | internal |
Portability | non-portable |
Safe Haskell | Trustworthy |
Types for specifying how text encoding/decoding fails
- data CodingFailureMode
- codingFailureModeSuffix :: CodingFailureMode -> String
- isSurrogate :: Char -> Bool
- recoverDecode :: CodingFailureMode -> Buffer Word8 -> Buffer Char -> IO (Buffer Word8, Buffer Char)
- recoverEncode :: CodingFailureMode -> Buffer Char -> Buffer Word8 -> IO (Buffer Char, Buffer Word8)
Documentation
data CodingFailureMode
The CodingFailureMode is used to construct TextEncodings, and specifies how they handle illegal sequences.
ErrorOnCodingFailure | Throw an error when an illegal sequence is encountered |
IgnoreCodingFailure | Attempt to ignore and recover if an illegal sequence is encountered |
TransliterateCodingFailure | Replace with the closest visual match upon an illegal sequence |
RoundtripFailure | Use the private-use escape mechanism to attempt to allow illegal sequences to be roundtripped. |
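As a minimal sketch of how these modes can be selected in practice: `codingFailureModeSuffix` yields the suffix that `mkTextEncoding` (from `System.IO`) accepts after an encoding name. The helper `utf8With` is hypothetical and introduced only for illustration, and "UTF-8" is just an example encoding name.

```haskell
import GHC.IO.Encoding.Failure
  ( CodingFailureMode (..)
  , codingFailureModeSuffix
  )
import System.IO (TextEncoding, mkTextEncoding)

-- All four failure modes, in the order they are documented above.
allModes :: [CodingFailureMode]
allModes =
  [ ErrorOnCodingFailure
  , IgnoreCodingFailure
  , TransliterateCodingFailure
  , RoundtripFailure
  ]

-- Hypothetical helper (not part of base): build a UTF-8 TextEncoding
-- that handles illegal sequences according to the given mode.
utf8With :: CodingFailureMode -> IO TextEncoding
utf8With cfm = mkTextEncoding ("UTF-8" ++ codingFailureModeSuffix cfm)

main :: IO ()
main = do
  -- Show each mode next to the suffix it maps to.
  mapM_ (\m -> putStrLn (show m ++ " -> " ++ show (codingFailureModeSuffix m))) allModes
  -- Construct an encoding that round-trips undecodable bytes.
  enc <- utf8With RoundtripFailure
  print enc
```

The resulting `TextEncoding` can then be passed to `hSetEncoding` on a handle, so the choice of failure mode is made once, at the point where the encoding is constructed.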
isSurrogate :: Char -> Bool
Some characters are actually surrogate codepoints defined for use in UTF-16. We need to signal an invalid character if we detect them when encoding a sequence of Chars into Word8s because they won't give valid Unicode.
We may also need to signal an invalid character if we detect them when encoding a sequence of Chars into Word8s because the RoundtripFailure mode creates these to round-trip bytes through our internal UTF-16 encoding.
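The check described above can be illustrated with a short sketch. The helper `firstSurrogate` is hypothetical (it is not part of this module); it scans a `String` for the first `Char` that an encoder would have to treat as invalid. The sample code points 0xD800 and 0xDC80 are simply values inside the UTF-16 surrogate range, chosen for illustration.

```haskell
import Data.Char (chr)
import Data.List (find)
import GHC.IO.Encoding.Failure (isSurrogate)

-- Hypothetical helper (not part of base): return the first surrogate
-- code point in a String, if there is one.
firstSurrogate :: String -> Maybe Char
firstSurrogate = find isSurrogate

main :: IO ()
main = do
  -- An ordinary character is not a surrogate.
  print (isSurrogate 'a')
  -- 0xD800 lies in the high-surrogate range.
  print (isSurrogate (chr 0xD800))
  -- 0xDC80 lies in the low-surrogate range.
  print (isSurrogate (chr 0xDC80))
  -- A String containing a surrogate is flagged; a clean one is not.
  print (firstSurrogate ['o', 'k', chr 0xDC80])
  print (firstSurrogate "ok")
```

A Haskell `Char` can hold any code point up to 0x10FFFF, including surrogates, which is why an explicit check like this is needed before encoding.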