wp_check_invalid_utf8
wp_check_invalid_utf8 ( $text, $strip = false )
Parameters:
- (string) text String which is expected to be encoded as UTF-8 unless `blog_charset` is another encoding.
- (bool) strip Optional. Whether to replace invalid sequences of bytes with the Unicode replacement character (U+FFFD `�`). Default `false` returns an empty string for invalid UTF-8 inputs.
Returns:
- (string) The checked text.
Defined at:
Change Log:
- Introduced in WordPress: 2.8.0
- Deprecated in WordPress: —
Description
Checks for invalid UTF-8 in a string.
Note! This function only performs its work if the `blog_charset` is set
to UTF-8. For all other values it returns the input text unchanged.
Note! Unless requested, this returns an empty string if the input contains
any sequences of invalid UTF-8. To replace invalid byte sequences, pass
`true` as the optional `$strip` parameter.
Consider using {@see} instead which does not depend on
the value of `blog_charset`.
Example:
// The `blog_charset` is `latin1`, so this returns the input unchanged.
$every_possible_input === wp_check_invalid_utf8( $every_possible_input );
// When the `blog_charset` is UTF-8, valid strings come through unchanged.
'test' === wp_check_invalid_utf8( 'test' );
$invalid = "the byte \xC0 is never allowed in a UTF-8 string.";
// Invalid strings are rejected outright.
'' === wp_check_invalid_utf8( $invalid );
// “Stripping” invalid sequences produces the replacement character instead.
"the byte \u{FFFD} is never allowed in a UTF-8 string." === wp_check_invalid_utf8( $invalid, true );
'the byte � is never allowed in a UTF-8 string.' === wp_check_invalid_utf8( $invalid, true );
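
The three outcomes described above (pass-through for non-UTF-8 charsets, empty string for invalid input, replacement on request) can be approximated outside WordPress with the mbstring extension. The following is a minimal sketch, not the WordPress implementation. The function name `sketch_check_invalid_utf8`, the `$charset` parameter standing in for the `blog_charset` option, and the list of accepted charset spellings are all assumptions made for illustration. It requires PHP 7.2 or later for `mb_scrub()`.

```php
<?php
/**
 * Hypothetical stand-in for the documented contract; this is NOT WordPress code.
 * $charset plays the role of the `blog_charset` option (an assumption of this sketch).
 */
function sketch_check_invalid_utf8( string $text, bool $strip = false, string $charset = 'UTF-8' ): string {
	// Any other site charset: return the input untouched.
	// The accepted spellings below are an assumption of this sketch.
	if ( ! in_array( strtoupper( $charset ), array( 'UTF-8', 'UTF8' ), true ) ) {
		return $text;
	}

	// Valid UTF-8 (including the empty string) comes back unchanged.
	if ( mb_check_encoding( $text, 'UTF-8' ) ) {
		return $text;
	}

	// Invalid input is rejected outright unless replacement was requested.
	if ( ! $strip ) {
		return '';
	}

	// Ask mbstring to substitute U+FFFD for invalid bytes,
	// then restore the previous global substitute setting.
	$previous = mb_substitute_character();
	mb_substitute_character( 0xFFFD );
	$scrubbed = mb_scrub( $text, 'UTF-8' );
	mb_substitute_character( $previous );

	return $scrubbed;
}
```

Under these assumptions, `sketch_check_invalid_utf8( $invalid, false, 'latin1' )` returns `$invalid` as given, `sketch_check_invalid_utf8( $invalid )` returns an empty string, and passing `true` as the second argument yields a string that passes `mb_check_encoding()`. How many replacement characters mbstring emits for a longer run of invalid bytes may differ from what WordPress produces, so the sketch should not be used to predict exact output.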