🇷🇺 Русским гражданам
В Украине сейчас идет война. Силами РФ наносятся удары по гражданской инфраструктуре в [Харькове][1], [Киеве][2], [Чернигове][3], [Сумах][4], [Ирпене][5] и десятках других городов. Гибнут люди - и гражданское население, и военные, в том числе российские призывники, которых бросили воевать. Чтобы лишить собственный народ доступа к информации, правительство РФ запретило называть войну войной, закрыло независимые СМИ и принимает сейчас ряд диктаторских законов. Эти законы призваны заткнуть рот всем, кто против войны. За обычный призыв к миру сейчас можно получить несколько лет тюрьмы.
Не молчите! Молчание - знак вашего согласия с политикой российского правительства. Вы можете сделать выбор НЕ МОЛЧАТЬ.
🇺🇸 To people of Russia
There is a war in Ukraine right now. The forces of the Russian Federation are attacking civilian infrastructure in [Kharkiv][1], [Kyiv][2], [Chernihiv][3], [Sumy][4], [Irpin][5] and dozens of other cities. People are dying – both civilians and military servicemen, including Russian conscripts who were thrown into the fighting. In order to deprive its own people of access to information, the government of the Russian Federation has forbidden calling a war a war, shut down independent media and is passing a number of dictatorial laws. These laws are meant to silence all those who are against war. You can be jailed for multiple years for simply calling for peace. Do not be silent! Silence is a sign that you accept the Russian government's policy. You can choose NOT TO BE SILENT.
[1]: https://cloudfront-us-east-2.images.arcpublishing.com/reuters/P7K2MSZDGFMIJPDD7CI2GIROJI.jpg "Kharkiv under attack"
[2]: https://gdb.voanews.com/01bd0000-0aff-0242-fad0-08d9fc92c5b3_cx0_cy5_cw0_w1023_r1_s.jpg "Kyiv under attack"
[3]: https://ichef.bbci.co.uk/news/976/cpsprodpb/163DD/production/_123510119_hi074310744.jpg "Chernihiv under attack"
[4]: https://www.youtube.com/watch?v=8K-bkqKKf2A "Sumy under attack"
[5]: https://cloudfront-us-east-2.images.arcpublishing.com/reuters/K4MTMLEHTRKGFK3GSKAT4GR3NE.jpg "Irpin under attack"
- immutable
Methods |
public __construct() __construct() |
public static access(string $str, int $pos, string $encoding = 'UTF-8') : string Return the character at the specified position: $str[1] like functionality. EXAMPLE: UTF8::access('fòô', 1); // 'ò'
|
public static add_bom_to_string(string $str) : string Prepends the UTF-8 BOM character to the string and returns the whole string. INFO: If a BOM already exists, the input string is returned unchanged. EXAMPLE: UTF8::add_bom_to_string('fòô'); // "\xEF\xBB\xBF" . 'fòô'
|
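The BOM handling described for `add_bom_to_string()` can be sketched with plain PHP string functions. This is only an illustration of the documented behavior, not the library's implementation; `add_bom_sketch` is a hypothetical name.

```php
<?php
// Hypothetical helper (not part of the library): prepend the UTF-8 BOM once.
function add_bom_sketch(string $str): string
{
    $bom = "\xEF\xBB\xBF"; // the three bytes of the UTF-8 Byte Order Mark

    // If the string already starts with the BOM, return it unchanged.
    if (strncmp($str, $bom, 3) === 0) {
        return $str;
    }

    return $bom . $str;
}

var_dump(bin2hex(add_bom_sketch('a'))); // hex dump of the BOM followed by 'a'
```

Calling the helper twice on the same string adds the BOM only once, which mirrors the INFO note above.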
public static array_change_key_case(array $array, int $case = CASE_LOWER, string $encoding = 'UTF-8') : array Changes the case of all keys in an array.
|
public static between(string $str, string $start, string $end, int $offset = 0, string $encoding = 'UTF-8') : string Returns the substring between $start and $end, if found, or an empty string. An optional offset may be supplied from which to begin the search for the start string.
|
public static binary_to_str( $bin) : string Convert binary into a string. INFO: opposite to UTF8::str_to_binary() EXAMPLE: UTF8::binary_to_str('11110000100111111001100010000011'); // '😃'
|
public static bom() : string Returns the UTF-8 Byte Order Mark Character. INFO: take a look at UTF8::$bom for e.g. UTF-16 and UTF-32 BOM values EXAMPLE: UTF8::bom(); // "\xEF\xBB\xBF"
|
public static callback( $callback, string $str) : array
|
public static char_at(string $str, int $index, string $encoding = 'UTF-8') : string Returns the character at $index, with indexes starting at 0.
|
public static chars(string $str) : array Returns an array consisting of the characters in the string.
|
public static checkForSupport() This method will auto-detect your server environment for UTF-8 support.
|
public static chr( $code_point, string $encoding = 'UTF-8') Generates a UTF-8 encoded character from the given code point. INFO: opposite to UTF8::ord() EXAMPLE: UTF8::chr(0x2603); // '☃'
|
public static chr_map( $callback, string $str) : array Applies callback to all characters of a string. EXAMPLE: UTF8::chr_map([UTF8::class, 'strtolower'], 'Κόσμε'); // ['κ','ό', 'σ', 'μ', 'ε']
|
public static chr_size_list(string $str) : array Generates an array of byte length of each character of a Unicode string. 1 byte => U+0000 - U+007F 2 byte => U+0080 - U+07FF 3 byte => U+0800 - U+FFFF 4 byte => U+10000 - U+10FFFF EXAMPLE: UTF8::chr_size_list('中文空白-test'); // [3, 3, 3, 3, 1, 1, 1, 1, 1]
|
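The byte-width table for `chr_size_list()` can be reproduced with native PHP. The sketch below is an assumption about one way to do it (splitting on code points with the PCRE `/u` modifier), not the library's code; `chr_size_list_sketch` is a hypothetical name.

```php
<?php
// Hypothetical helper (not part of the library): byte length of each UTF-8 character.
function chr_size_list_sketch(string $str): array
{
    // With the /u modifier, the empty pattern splits between code points, not bytes.
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        return []; // the input was not valid UTF-8
    }

    // strlen() counts bytes, so it yields the encoded width of each character.
    return array_map('strlen', $chars);
}

print_r(chr_size_list_sketch('中文空白-test')); // 3, 3, 3, 3, 1, 1, 1, 1, 1
```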
public static chr_to_decimal(string $char) : int Get a decimal code representation of a specific character. INFO: opposite to UTF8::decimal_to_chr() EXAMPLE: UTF8::chr_to_decimal('§'); // 0xa7
|
public static chr_to_hex( $char, string $prefix = 'U+') : string Get hexadecimal code point (U+xxxx) of a UTF-8 encoded character. EXAMPLE: UTF8::chr_to_hex('§'); // U+00a7
|
public static chunk_split(string $str, int $chunk_length = 76, string $end = "\r\n") : string Splits a string into smaller chunks and multiple lines, using the specified line ending character. EXAMPLE: UTF8::chunk_split('ABC-ÖÄÜ-中文空白-κόσμε', 3); // "ABC\r\n-ÖÄ\r\nÜ-中\r\n文空白\r\n-κό\r\nσμε"
|
public static clean(string $str, bool $remove_bom = false, bool $normalize_whitespace = false, bool $normalize_msword = false, bool $keep_non_breaking_space = false, bool $replace_diamond_question_mark = false, bool $remove_invisible_characters = true, bool $remove_invisible_characters_url_encoded = false) : string Accepts a string and removes all non-UTF-8 characters from it + extras if needed. EXAMPLE: UTF8::clean("\xEF\xBB\xBF„Abcdef\xc2\xa0\x20…” — 😃 - Düsseldorf", true, true); // '„Abcdef …” — 😃 - Düsseldorf'
|
public static cleanup( $str) : string Clean-up a string and show only printable UTF-8 chars at the end + fix UTF-8 encoding. EXAMPLE: UTF8::cleanup("\xEF\xBB\xBF„Abcdef\xc2\xa0\x20…” — 😃 - Düsseldorf"); // '„Abcdef …” — 😃 - Düsseldorf'
|
public static codepoints( $arg, bool $use_u_style = false) : array Accepts a string or an array of chars and returns an array of Unicode code points. INFO: opposite to UTF8::string() EXAMPLE: UTF8::codepoints('κöñ'); // array(954, 246, 241) // ... OR ... UTF8::codepoints('κöñ', true); // array('U+03ba', 'U+00f6', 'U+00f1')
|
public static collapse_whitespace(string $str) : string Trims the string and replaces consecutive whitespace characters with a single space. This includes tabs and newline characters, as well as multibyte whitespace such as the thin space and ideographic space.
|
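A minimal sketch of the whitespace collapsing described for `collapse_whitespace()`, assuming a PCRE character class is sufficient: `\s` covers ASCII whitespace and `\p{Z}` adds Unicode separators such as the thin space (U+2009) and the ideographic space (U+3000). `collapse_whitespace_sketch` is a hypothetical name and may not match the library's exact character set.

```php
<?php
// Hypothetical helper (not part of the library): trim and collapse whitespace runs.
function collapse_whitespace_sketch(string $str): string
{
    // Replace every run of ASCII or Unicode whitespace with one ASCII space.
    $collapsed = preg_replace('/[\s\p{Z}]+/u', ' ', $str);

    // preg_replace() returns null on invalid UTF-8; fall back to the input then.
    // After collapsing, only single ASCII spaces can remain at the edges.
    return trim($collapsed ?? $str, ' ');
}

var_dump(collapse_whitespace_sketch("  foo\t\n bar\u{2009}\u{3000}baz "));
```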
public static count_chars(string $str, bool $clean_utf8 = false, bool $try_to_use_mb_functions = true) : array Returns count of characters used in a string. EXAMPLE: UTF8::count_chars('κaκbκc'); // array('κ' => 3, 'a' => 1, 'b' => 1, 'c' => 1)
|
public static css_identifier(string $str = '', array $filter = [' ' => '-', '/' => '-', '[' => '', ']' => ''], bool $strip_tags = false, bool $strtolower = true) : string Create a valid CSS identifier for e.g. "class"- or "id"-attributes. EXAMPLE: UTF8::css_identifier('123foo/bar!!!'); // _23foo-bar Copied from https://github.com/drupal/core/blob/8.8.x/lib/Drupal/Component/Utility/Html.php#L95
|
public static css_stripe_media_queries(string $str) : string Remove css media-queries.
|
public static ctype_loaded() : bool Checks whether ctype is available on the server.
|
public static decimal_to_chr( $int) : string Converts an int value into a UTF-8 character. INFO: opposite to UTF8::chr_to_decimal() EXAMPLE: UTF8::decimal_to_chr(931); // 'Σ'
|
public static decode_mimeheader( $str, string $encoding = 'UTF-8') Decodes a MIME header field
|
public static emoji_decode(string $str, bool $use_reversible_string_mappings = false) : string Decodes a string which was encoded by "UTF8::emoji_encode()". INFO: opposite to UTF8::emoji_encode() EXAMPLE: UTF8::emoji_decode('foo CHARACTER_OGRE', false); // 'foo 👹' // UTF8::emoji_decode('foo -PORTABLE_UTF8-308095726-627590803-8FTU_ELBATROP-', true); // 'foo 👹'
|
public static emoji_encode(string $str, bool $use_reversible_string_mappings = false) : string Encode a string with emoji chars into a non-emoji string. INFO: opposite to UTF8::emoji_decode() EXAMPLE: UTF8::emoji_encode('foo 👹', false); // 'foo CHARACTER_OGRE' // UTF8::emoji_encode('foo 👹', true); // 'foo -PORTABLE_UTF8-308095726-627590803-8FTU_ELBATROP-'
|
public static emoji_from_country_code(string $country_code_iso_3166_1) : string Convert any two-letter country code (ISO 3166-1) to the corresponding Emoji.
|
public static encode(string $to_encoding, string $str, bool $auto_detect_the_from_encoding = true, string $from_encoding = '') : string Encode a string with a new charset-encoding. INFO: This function will also try to fix broken / double encoding, so you can also call it on a UTF-8 string without messing up the string. EXAMPLE: UTF8::encode('ISO-8859-1', '-ABC-中文空白-'); // '-ABC-????-' // UTF8::encode('UTF-8', '-ABC-中文空白-'); // '-ABC-中文空白-' // UTF8::encode('HTML', '-ABC-中文空白-'); // '-ABC-&#20013;&#25991;&#31354;&#30333;-' // UTF8::encode('BASE64', '-ABC-中文空白-'); // 'LUFCQy3kuK3mlofnqbrnmb0t'
|
public static encode_mimeheader(string $str, string $from_charset = 'UTF-8', string $to_charset = 'UTF-8', string $transfer_encoding = 'Q', string $linefeed = "\r\n", int $indent = 76)
|
public static extract_text(string $str, string $search = '', ?int $length = NULL, string $replacer_for_skipped_text = '…', string $encoding = 'UTF-8') : string Create an extract from a sentence, so if the search-string was found, it tries to center in the output.
|
public static file_get_contents(string $filename, bool $use_include_path = false, $context = NULL, ?int $offset = NULL, ?int $max_length = NULL, int $timeout = 10, bool $convert_to_utf8 = true, string $from_encoding = '') Reads entire file into a string. EXAMPLE: UTF8::file_get_contents('utf16le.txt'); // ... WARNING: Do not use UTF-8 Option ($convert_to_utf8) for binary files (e.g.: images) !!!
|
public static file_has_bom(string $file_path) : bool Checks if a file starts with BOM (Byte Order Mark) character. EXAMPLE: UTF8::file_has_bom('utf8_with_bom.txt'); // true
|
public static filter( $var, int $normalization_form = Normalizer::NFC, string $leading_combining = '◌') Normalizes to UTF-8 NFC, converting from WINDOWS-1252 when needed. EXAMPLE: UTF8::filter(array("\xE9", 'à', 'a')); // array('é', 'à', 'a')
|
public static filter_input(int $type, string $variable_name, int $filter = FILTER_DEFAULT, $options = NULL) "filter_input()"-wrapper that normalizes to UTF-8 NFC, converting from WINDOWS-1252 when needed. Gets a specific external variable by name and optionally filters it. EXAMPLE: // $_GET['foo'] = 'bar'; UTF8::filter_input(INPUT_GET, 'foo', FILTER_UNSAFE_RAW); // 'bar'
|
public static filter_input_array(int $type, $definition = NULL, bool $add_empty = true) "filter_input_array()"-wrapper that normalizes to UTF-8 NFC, converting from WINDOWS-1252 when needed. Gets external variables and optionally filters them. EXAMPLE: // $_GET['foo'] = 'bar'; UTF8::filter_input_array(INPUT_GET, array('foo' => FILTER_UNSAFE_RAW)); // array('foo' => 'bar')
|
public static filter_var( $variable, int $filter = FILTER_DEFAULT, $options = 0) "filter_var()"-wrapper that normalizes to UTF-8 NFC, converting from WINDOWS-1252 when needed. Filters a variable with a specified filter. EXAMPLE: UTF8::filter_var('-ABC-中文空白-', FILTER_VALIDATE_URL); // false
|
public static filter_var_array(array $data, $definition = 0, bool $add_empty = true) "filter_var_array()"-wrapper that normalizes to UTF-8 NFC, converting from WINDOWS-1252 when needed. Gets multiple variables and optionally filters them. EXAMPLE: $filters = [ 'name' => ['filter' => FILTER_CALLBACK, 'options' => [UTF8::class, 'ucwords']], 'age' => ['filter' => FILTER_VALIDATE_INT, 'options' => ['min_range' => 1, 'max_range' => 120]], 'email' => FILTER_VALIDATE_EMAIL, ]; $data = [ 'name' => 'κόσμε', 'age' => '18', 'email' => 'foo@bar.de' ]; UTF8::filter_var_array($data, $filters, true); // ['name' => 'Κόσμε', 'age' => 18, 'email' => 'foo@bar.de']
|
public static finfo_loaded() : bool Checks whether finfo is available on the server.
|
public static first_char(string $str, int $n = 1, string $encoding = 'UTF-8') : string Returns the first $n characters of the string.
|
public static fits_inside(string $str, int $box_size) : bool Check if the number of Unicode characters isn't greater than the specified integer. EXAMPLE: UTF8::fits_inside('κόσμε', 6); // true
|
public static fix_simple_utf8(string $str) : string Try to fix simple broken UTF-8 strings. INFO: Take a look at "UTF8::fix_utf8()" if you need a more advanced fix for broken UTF-8 strings. EXAMPLE: UTF8::fix_simple_utf8('DÃ¼sseldorf'); // 'Düsseldorf' If you received a UTF-8 string that was converted from Windows-1252 as if it were ISO-8859-1 (ignoring Windows-1252 chars from 80 to 9F), use this function to fix it. See: http://en.wikipedia.org/wiki/Windows-1252
|
public static fix_utf8( $str) Fix a double (or multiple) encoded UTF-8 string. EXAMPLE: UTF8::fix_utf8('FÃÂ©dÃÂ©ration'); // 'Fédération'
|
public static get_file_type(string $str, array $fallback = ['ext' => NULL, 'mime' => 'application/octet-stream', 'type' => NULL]) : array Warning: this method only works for some file-types (png, jpg) if you need more supported types, please use e.g. "finfo"
|
public static get_random_string(int $length, string $possible_chars = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789', string $encoding = 'UTF-8') : string
|
public static get_unique_string( $extra_entropy = '', bool $use_md5 = true) : string
|
public static getCharDirection(string $char) : string Get the text direction ('LTR' or 'RTL') of a specific character. EXAMPLE: UTF8::getCharDirection('ا'); // 'RTL'
|
public static getSupportInfo(?string $key = NULL) Check for php-support.
|
public static getUrlParamFromArray(string $param, array $data) Get data from an array via array like string. EXAMPLE: $array['foo'][123] = 'lall'; UTF8::getUrlParamFromArray('foo[123]', $array); // 'lall'
|
public static has_lowercase(string $str) : bool Returns true if the string contains a lower case char, false otherwise.
|
public static has_uppercase(string $str) : bool Returns true if the string contains an upper case char, false otherwise.
|
public static has_whitespace(string $str) : bool Returns true if the string contains whitespace, false otherwise.
|
public static hex_to_chr(string $hexdec) Converts a hexadecimal value into a UTF-8 character. INFO: opposite to UTF8::chr_to_hex() EXAMPLE: UTF8::hex_to_chr('U+00a7'); // '§'
|
public static hex_to_int( $hexdec) Converts hexadecimal U+xxxx code point representation to integer. INFO: opposite to UTF8::int_to_hex() EXAMPLE: UTF8::hex_to_int('U+00f1'); // 241
|
public static html_encode(string $str, bool $keep_ascii_chars = false, string $encoding = 'UTF-8') : string Converts a UTF-8 string to a series of HTML numbered entities. INFO: opposite to UTF8::html_decode() EXAMPLE: UTF8::html_encode('中文空白'); // '&#20013;&#25991;&#31354;&#30333;'
|
public static html_entity_decode(string $str, ?int $flags = NULL, string $encoding = 'UTF-8') : string UTF-8 version of html_entity_decode(). We do not use html_entity_decode() by itself because, while it is not technically correct to leave out the semicolon at the end of an entity, most browsers will still interpret the entity correctly. html_entity_decode() does not convert entities without semicolons, so we are left with our own little solution here. Convert all HTML entities to their applicable characters. INFO: opposite to UTF8::html_encode() EXAMPLE: UTF8::html_entity_decode('&#20013;&#25991;&#31354;&#30333;'); // '中文空白'
|
public static html_escape(string $str, string $encoding = 'UTF-8') : string Create an HTML-escaped version of the string via "UTF8::htmlspecialchars()".
|
public static html_stripe_empty_tags(string $str) : string Remove empty HTML tags, e.g.: `<tag></tag>`
|
public static htmlentities(string $str, int $flags = ENT_COMPAT, string $encoding = 'UTF-8', bool $double_encode = true) : string Convert all applicable characters to HTML entities: UTF-8 version of htmlentities(). EXAMPLE: UTF8::htmlentities('<白-öäü>'); // '&lt;&#30333;-&ouml;&auml;&uuml;&gt;'
|
public static htmlspecialchars(string $str, int $flags = ENT_COMPAT, string $encoding = 'UTF-8', bool $double_encode = true) : string Convert only special characters to HTML entities: UTF-8 version of htmlspecialchars(). INFO: Take a look at "UTF8::htmlentities()" EXAMPLE: UTF8::htmlspecialchars('<白-öäü>'); // '&lt;白-öäü&gt;'
|
public static iconv_loaded() : bool Checks whether iconv is available on the server.
|
public static int_to_hex(int $int, string $prefix = 'U+') : string Converts Integer to hexadecimal U+xxxx code point representation. INFO: opposite to UTF8::hex_to_int() EXAMPLE: UTF8::int_to_hex(241); // 'U+00f1'
|
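The round trip between `int_to_hex()` and `hex_to_int()` can be illustrated with `sprintf()` and `hexdec()`. This is a sketch under the assumption that the `U+xxxx` form is zero-padded to at least four lowercase hex digits, as in the examples above; the `*_sketch` names are hypothetical and not part of the library.

```php
<?php
// Hypothetical helper: integer code point -> 'U+xxxx' notation.
function int_to_hex_sketch(int $int, string $prefix = 'U+'): string
{
    // Pad to at least four lowercase hex digits, e.g. 241 -> '00f1'.
    return $prefix . sprintf('%04x', $int);
}

// Hypothetical helper: 'U+xxxx' notation -> integer code point.
function hex_to_int_sketch(string $hex): int
{
    // Strip an optional, case-insensitive 'U+' prefix before parsing.
    $digits = preg_replace('/^U\+/i', '', $hex);

    return (int) hexdec($digits);
}

var_dump(int_to_hex_sketch(241));      // string 'U+00f1'
var_dump(hex_to_int_sketch('U+00f1')); // int 241
```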
public static intl_loaded() : bool Checks whether intl is available on the server.
|
public static intlChar_loaded() : bool Checks whether intl-char is available on the server.
|
public static is_alpha(string $str) : bool Returns true if the string contains only alphabetic chars, false otherwise.
|
public static is_alphanumeric(string $str) : bool Returns true if the string contains only alphabetic and numeric chars, false otherwise.
|
public static is_ascii(string $str) : bool Checks if a string is 7 bit ASCII. EXAMPLE: UTF8::is_ascii('白'); // false
|
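The 7-bit ASCII check can be expressed as a single byte-range test. The sketch below is one possible approach using a byte-oriented regular expression, not necessarily what the library does; `is_ascii_sketch` is a hypothetical name.

```php
<?php
// Hypothetical helper (not part of the library): true if every byte is 7-bit ASCII.
function is_ascii_sketch(string $str): bool
{
    // No /u modifier: the pattern inspects raw bytes. Any byte above 0x7F
    // means the string contains a non-ASCII (multi-byte or invalid) sequence.
    return preg_match('/[^\x00-\x7F]/', $str) === 0;
}

var_dump(is_ascii_sketch('白'));   // false
var_dump(is_ascii_sketch('test')); // true
```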
public static is_base64( $str, bool $empty_string_is_valid = false) : bool Returns true if the string is base64 encoded, false otherwise. EXAMPLE: UTF8::is_base64('4KSu4KWL4KSo4KS/4KSa'); // true
|
public static is_binary( $input, bool $strict = false) : bool Check if the input is binary (this looks like a hack). EXAMPLE: UTF8::is_binary(01); // true
|
public static is_binary_file( $file) : bool Check if the file is binary. EXAMPLE: UTF8::is_binary_file('./utf32.txt'); // true
|
public static is_blank(string $str) : bool Returns true if the string contains only whitespace chars, false otherwise.
|
public static is_bom( $str) : bool Checks if the given string is equal to any "Byte Order Mark". WARNING: Use "UTF8::string_has_bom()" if you will check BOM in a string. EXAMPLE: UTF8::is_bom("\xef\xbb\xbf"); // true
|
public static is_empty( $str) : bool Determine whether the string is considered to be empty. A variable is considered empty if it does not exist or if its value equals FALSE. empty() does not generate a warning if the variable does not exist.
|
public static is_hexadecimal(string $str) : bool Returns true if the string contains only hexadecimal chars, false otherwise.
|
public static is_html(string $str) : bool Check if the string contains any HTML tags. EXAMPLE: UTF8::is_html('<b>lall</b>'); // true
|
public static is_json(string $str, bool $only_array_or_object_results_are_valid = true) : bool Try to check if "$str" is a JSON-string. EXAMPLE: UTF8::is_json('{"array":[1,"¥","ä"]}'); // true
|
public static is_lowercase(string $str) : bool Returns true if the string contains only lower case chars, false otherwise.
|
public static is_printable(string $str, bool $ignore_control_characters = false) : bool Returns true if the string contains only printable (non-invisible) chars, false otherwise.
|
public static is_punctuation(string $str) : bool Returns true if the string contains only punctuation chars, false otherwise.
|
public static is_serialized(string $str) : bool Returns true if the string is serialized, false otherwise.
|
public static is_uppercase(string $str) : bool Returns true if the string contains only upper case chars, false otherwise.
|
public static is_url(string $url, bool $disallow_localhost = false) : bool Check if $url is a valid URL.
|
public static is_utf16( $str, bool $check_if_string_is_binary = true) Check if the string is UTF-16. EXAMPLE: UTF8::is_utf16(file_get_contents('utf-16-le.txt')); // 1 // UTF8::is_utf16(file_get_contents('utf-16-be.txt')); // 2 // UTF8::is_utf16(file_get_contents('utf-8.txt')); // false
|
public static is_utf32( $str, bool $check_if_string_is_binary = true) Check if the string is UTF-32. EXAMPLE: UTF8::is_utf32(file_get_contents('utf-32-le.txt')); // 1 // UTF8::is_utf32(file_get_contents('utf-32-be.txt')); // 2 // UTF8::is_utf32(file_get_contents('utf-8.txt')); // false
|
public static is_utf8( $str, bool $strict = false) : bool Checks whether the passed input contains only byte sequences that appear valid UTF-8. EXAMPLE: UTF8::is_utf8(['Iñtërnâtiônàlizætiøn', 'foo']); // true // UTF8::is_utf8(["Iñtërnâtiônàlizætiøn\xA0\xA1", 'bar']); // false
|
public static json_decode(string $json, bool $assoc = false, int $depth = 512, int $options = 0) (PHP 5 >= 5.2.0, PECL json >= 1.2.0) Decodes a JSON string EXAMPLE: UTF8::json_decode('[1,"¥","ä"]'); // array(1, '¥', 'ä')
|
public static json_encode( $value, int $options = 0, int $depth = 512) (PHP 5 >= 5.2.0, PECL json >= 1.2.0) Returns the JSON representation of a value. EXAMPLE: UTF8::json_encode(array(1, '¥', 'ä')); // '[1,"¥","ä"]'
|
public static json_loaded() : bool Checks whether JSON is available on the server.
|
public static lcfirst(string $str, string $encoding = 'UTF-8', bool $clean_utf8 = false, ?string $lang = NULL, bool $try_to_keep_the_string_length = false) : string Makes string's first char lowercase. EXAMPLE: UTF8::lcfirst('ÑTËRNÂTIÔNÀLIZÆTIØN'); // ñTËRNÂTIÔNÀLIZÆTIØN
|
public static lcwords(string $str, array $exceptions = [], string $char_list = '', string $encoding = 'UTF-8', bool $clean_utf8 = false, ?string $lang = NULL, bool $try_to_keep_the_string_length = false) : string Lowercase for all words in the string.
|
public static levenshtein(string $str1, string $str2, int $insertionCost = 1, int $replacementCost = 1, int $deletionCost = 1) : int Calculate Levenshtein distance between two strings. For better performance, in a real application with a single input string matched against many strings from a database, you will probably want to pre-encode the input only once and use \levenshtein().
|
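To show why a UTF-8 aware distance differs from PHP's byte-oriented `\levenshtein()`, here is a minimal sketch that runs the classic dynamic-programming algorithm over code points. It assumes unit costs for all three operations (the custom cost parameters above are omitted), and `utf8_levenshtein_sketch` is a hypothetical name rather than the library's implementation.

```php
<?php
// Hypothetical helper (not part of the library): Levenshtein distance per code point.
function utf8_levenshtein_sketch(string $a, string $b): int
{
    // Split both strings into arrays of UTF-8 characters.
    $s = preg_split('//u', $a, -1, PREG_SPLIT_NO_EMPTY) ?: [];
    $t = preg_split('//u', $b, -1, PREG_SPLIT_NO_EMPTY) ?: [];
    $n = count($t);

    // $prev[$j] holds the distance between the processed prefix of $s
    // and the first $j characters of $t.
    $prev = range(0, $n);
    foreach ($s as $i => $sc) {
        $curr = [$i + 1];
        for ($j = 0; $j < $n; $j++) {
            $cost = ($sc === $t[$j]) ? 0 : 1;
            $curr[] = min(
                $prev[$j + 1] + 1, // deletion
                $curr[$j] + 1,     // insertion
                $prev[$j] + $cost  // replacement (or match)
            );
        }
        $prev = $curr;
    }

    return $prev[$n];
}

var_dump(utf8_levenshtein_sketch('äbc', 'abc')); // 1 (one character replaced)
var_dump(levenshtein('äbc', 'abc'));             // 2 ('ä' is two bytes in UTF-8)
```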
public static ltrim(string $str = '', ?string $chars = NULL) : string Strip whitespace or other characters from the beginning of a UTF-8 string. EXAMPLE: UTF8::ltrim(' 中文空白 '); // '中文空白 '
|
public static max( $arg) Returns the UTF-8 character with the maximum code point in the given data. EXAMPLE: UTF8::max('abc-äöü-中文空白'); // '空'
|
public static max_chr_width(string $str) : int Calculates and returns the maximum number of bytes taken by any UTF-8 encoded character in the given string. EXAMPLE: UTF8::max_chr_width('Intërnâtiônàlizætiøn'); // 2
|
public static mbstring_loaded() : bool Checks whether mbstring is available on the server.
|
public static min( $arg) Returns the UTF-8 character with the minimum code point in the given data. EXAMPLE: UTF8::min('abc-äöü-中文空白'); // '-'
|
public static normalize_encoding( $encoding, $fallback = '') Normalize the encoding-"name" input. EXAMPLE: UTF8::normalize_encoding('UTF8'); // 'UTF-8'
|
public static normalize_line_ending(string $str, $replacer = "\n") : string Standardize line endings to Unix style.
|
public static normalize_msword(string $str) : string Normalize some MS Word special characters. EXAMPLE: UTF8::normalize_msword('„Abcdef…”'); // '"Abcdef..."'
|
public static normalize_whitespace(string $str, bool $keep_non_breaking_space = false, bool $keep_bidi_unicode_controls = false, bool $normalize_control_characters = false) : string Normalize the whitespace. EXAMPLE: UTF8::normalize_whitespace("abc-\xc2\xa0-öäü-\xe2\x80\xaf-\xE2\x80\xAC", true); // "abc-\xc2\xa0-öäü- -"
|
public static ord( $chr, string $encoding = 'UTF-8') : int Calculates Unicode code point of the given UTF-8 encoded character. INFO: opposite to UTF8::chr() EXAMPLE: UTF8::ord('☃'); // 0x2603
|
public static parse_str(string $str, $result, bool $clean_utf8 = false) : bool Parses the string into an array (into the second parameter). WARNING: Unlike "parse_str()", this method does not (re-)place variables in the current scope if the second parameter is not set! EXAMPLE: UTF8::parse_str('Iñtërnâtiônéàlizætiøn=測試&arr[]=foo+測試&arr[]=ການທົດສອບ', $array); echo $array['Iñtërnâtiônéàlizætiøn']; // '測試'
|
public static pcre_utf8_support() : bool Checks whether the /u pattern modifier, which enables Unicode support in PCRE, is available.
|
public static range( $var1, $var2, bool $use_ctype = true, string $encoding = 'UTF-8', $step = 1) : array Create an array containing a range of UTF-8 characters. EXAMPLE: UTF8::range('κ', 'ζ'); // array('κ', 'ι', 'θ', 'η', 'ζ',)
|
public static rawurldecode(string $str, bool $multi_decode = true) : string Multi decode HTML entity + fix urlencoded-win1252-chars. EXAMPLE: UTF8::rawurldecode('tes%20öäü%20\u00edtest+test'); // 'tes öäü ítest+test' e.g: 'test+test' => 'test+test' 'D&#252;sseldorf' => 'Düsseldorf' 'D%FCsseldorf' => 'Düsseldorf' 'D&#xFC;sseldorf' => 'Düsseldorf' 'D%26%23xFC%3Bsseldorf' => 'Düsseldorf' 'DÃ¼sseldorf' => 'Düsseldorf' 'D%C3%BCsseldorf' => 'Düsseldorf' 'D%C3%83%C2%BCsseldorf' => 'Düsseldorf' 'D%25C3%2583%25C2%25BCsseldorf' => 'Düsseldorf'
|
public static regex_replace(string $str, string $pattern, string $replacement, string $options = '', string $delimiter = '/') : string Replaces all occurrences of $pattern in $str by $replacement.
|
public static remove_bom(string $str) : string Remove the BOM from UTF-8 / UTF-16 / UTF-32 strings. EXAMPLE: UTF8::remove_bom("\xEF\xBB\xBFΜπορώ να"); // 'Μπορώ να'
|
public static remove_duplicates(string $str, $what = ' ') : string Removes duplicate occurrences of a string in another string. EXAMPLE: UTF8::remove_duplicates('öäü-κόσμεκόσμε-äöü', 'κόσμε'); // 'öäü-κόσμε-äöü'
|
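The behavior documented for `remove_duplicates()` can be approximated with a single regular expression that collapses consecutive repeats of `$what`. This is a sketch under that assumption; `remove_duplicates_sketch` is a hypothetical name, and the library may treat edge cases differently.

```php
<?php
// Hypothetical helper (not part of the library): collapse consecutive repeats of $what.
function remove_duplicates_sketch(string $str, string $what = ' '): string
{
    if ($what === '') {
        return $str; // nothing to collapse
    }

    // preg_quote() escapes regex metacharacters in $what; '$1' re-inserts one copy.
    $pattern = '/(' . preg_quote($what, '/') . ')+/u';

    // preg_replace() returns null on invalid UTF-8; fall back to the input then.
    return preg_replace($pattern, '$1', $str) ?? $str;
}

var_dump(remove_duplicates_sketch('öäü-κόσμεκόσμε-äöü', 'κόσμε')); // 'öäü-κόσμε-äöü'
```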
public static remove_html(string $str, string $allowable_tags = '') : string Remove html via "strip_tags()" from the string.
|
public static remove_html_breaks(string $str, string $replacement = '') : string Remove all breaks [`<br>` | \r\n | \r | \n | ...] from the string.
|
public static remove_ileft(string $str, string $substring, string $encoding = 'UTF-8') : string Returns a new string with the prefix $substring removed, if present and case-insensitive.
|
public static remove_invisible_characters(string $str, bool $url_encoded = false, string $replacement = '', bool $keep_basic_control_characters = true) : string Remove invisible characters from a string. This prevents sandwiching null characters between ASCII characters, like Java\0script. EXAMPLE: UTF8::remove_invisible_characters("κόσ\0με"); // 'κόσμε' Copied from https://github.com/bcit-ci/CodeIgniter/blob/develop/system/core/Common.php
|
public static remove_iright(string $str, string $substring, string $encoding = 'UTF-8') : string Returns a new string with the suffix $substring removed, if present and case-insensitive.
|
public static remove_left(string $str, string $substring, string $encoding = 'UTF-8') : string Returns a new string with the prefix $substring removed, if present.
|
public static remove_right(string $str, string $substring, string $encoding = 'UTF-8') : string Returns a new string with the suffix $substring removed, if present.
|
public static replace(string $str, string $search, string $replacement, bool $case_sensitive = true) : string Replaces all occurrences of $search in $str by $replacement.
|
public static replace_all(string $str, array $search, $replacement, bool $case_sensitive = true) : string Replaces all occurrences of $search in $str by $replacement.
|
public static replace_diamond_question_mark(string $str, string $replacement_char = '', bool $process_invalid_utf8_chars = true) : string Replace the diamond question mark (�) and invalid-UTF8 chars with the replacement. EXAMPLE: UTF8::replace_diamond_question_mark('中文空白�', ''); // '中文空白'
|
public static rtrim(string $str = '', ?string $chars = NULL) : string Strip whitespace or other characters from the end of a UTF-8 string. EXAMPLE: UTF8::rtrim('-ABC-中文空白- '); // '-ABC-中文空白-'
|
public static showSupport(bool $useEcho = true) WARNING: Print native UTF-8 support (libs) by default, e.g. for debugging.
|
public static single_chr_html_encode(string $char, bool $keep_ascii_chars = false, string $encoding = 'UTF-8') : string Converts a UTF-8 character to an HTML numbered entity like "&#123;". EXAMPLE: UTF8::single_chr_html_encode('κ'); // '&#954;'
|
public static spaces_to_tabs(string $str, int $tab_length = 4) : string
|
public static str_camelize(string $str, string $encoding = 'UTF-8', bool $clean_utf8 = false, ?string $lang = NULL, bool $try_to_keep_the_string_length = false) : string Returns a camelCase version of the string. Trims surrounding spaces, capitalizes letters following digits, spaces, dashes and underscores, and removes spaces, dashes, as well as underscores.
|
public static str_capitalize_name(string $str) : string Returns the string with the first letter of each word capitalized, except for when the word is a name which shouldn't be capitalized.
|
public static str_contains(string $haystack, string $needle, bool $case_sensitive = true) : bool Returns true if the string contains $needle, false otherwise. By default the comparison is case-sensitive, but can be made insensitive by setting $case_sensitive to false.
|
public static str_contains_all(string $haystack, array $needles, bool $case_sensitive = true) : bool Returns true if the string contains all $needles, false otherwise. By default, the comparison is case-sensitive, but can be made insensitive by setting $case_sensitive to false.
|
public static str_contains_any(string $haystack, array $needles, bool $case_sensitive = true) : bool Returns true if the string contains any $needles, false otherwise. By default the comparison is case-sensitive, but can be made insensitive by setting $case_sensitive to false.
|
public static str_dasherize(string $str, string $encoding = 'UTF-8') : string Returns a lowercase and trimmed string separated by dashes. Dashes are inserted before uppercase characters (with the exception of the first character of the string), and in place of spaces as well as underscores.
|
public static str_delimit(string $str, string $delimiter, string $encoding = 'UTF-8', bool $clean_utf8 = false, ?string $lang = NULL, bool $try_to_keep_the_string_length = false) : string Returns a lowercase and trimmed string separated by the given delimiter. Delimiters are inserted before uppercase characters (with the exception of the first character of the string), and in place of spaces, dashes, and underscores. Alpha delimiters are not converted to lowercase. EXAMPLE: UTF8::str_delimit('test case', '#'); // 'test#case' UTF8::str_delimit('test -case', ''); // 'testcase'
|
public static str_detect_encoding( $str) Optimized "mb_detect_encoding()"-function -> with support for UTF-16 and UTF-32. EXAMPLE: UTF8::str_detect_encoding('中文空白'); // 'UTF-8' UTF8::str_detect_encoding('Abc'); // 'ASCII'
|
public static str_ends_with(string $haystack, string $needle) : bool Check if the string ends with the given substring. EXAMPLE: UTF8::str_ends_with('BeginMiddleΚόσμε', 'Κόσμε'); // true UTF8::str_ends_with('BeginMiddleΚόσμε', 'κόσμε'); // false
|
public static str_ends_with_any(string $str, array $substrings) : bool Returns true if the string ends with any of $substrings, false otherwise.
|
public static str_ensure_left(string $str, string $substring) : string Ensures that the string begins with $substring. If it doesn't, it's prepended.
|
public static str_ensure_right(string $str, string $substring) : string Ensures that the string ends with $substring. If it doesn't, it's appended.
|
public static str_humanize( $str) : string Capitalizes the first word of the string, replaces underscores with spaces, and strips '_id'.
|
public static str_iends_with(string $haystack, string $needle) : bool Check if the string ends with the given substring, case-insensitive. EXAMPLE: UTF8::str_iends_with('BeginMiddleΚόσμε', 'Κόσμε'); // true UTF8::str_iends_with('BeginMiddleΚόσμε', 'κόσμε'); // true
|
public static str_iends_with_any(string $str, array $substrings) : bool Returns true if the string ends with any of $substrings, false otherwise.
|
public static str_insert(string $str, string $substring, int $index, string $encoding = 'UTF-8') : string Inserts $substring into the string at the $index provided.
|
public static str_ireplace( $search, $replacement, $subject, $count = NULL) Case-insensitive and UTF-8 safe version of str_replace. EXAMPLE: UTF8::str_ireplace('lIzÆ', 'lise', 'Iñtërnâtiônàlizætiøn'); // 'Iñtërnâtiônàlisetiøn'
|
public static str_ireplace_beginning(string $str, string $search, string $replacement) : string Replaces $search at the beginning of the string with $replacement, case-insensitive.
|
public static str_ireplace_ending(string $str, string $search, string $replacement) : string Replaces $search at the end of the string with $replacement, case-insensitive.
|
public static str_istarts_with(string $haystack, string $needle) : bool Check if the string starts with the given substring, case-insensitive. EXAMPLE: UTF8::str_istarts_with('ΚόσμεMiddleEnd', 'Κόσμε'); // true UTF8::str_istarts_with('ΚόσμεMiddleEnd', 'κόσμε'); // true
|
public static str_istarts_with_any(string $str, array $substrings) : bool Returns true if the string begins with any of $substrings (case-insensitive), false otherwise.
|
public static str_isubstr_after_first_separator(string $str, string $separator, string $encoding = 'UTF-8') : string Gets the substring after the first occurrence of a separator, case-insensitive.
|
public static str_isubstr_after_last_separator(string $str, string $separator, string $encoding = 'UTF-8') : string Gets the substring after the last occurrence of a separator, case-insensitive.
|
public static str_isubstr_before_first_separator(string $str, string $separator, string $encoding = 'UTF-8') : string Gets the substring before the first occurrence of a separator, case-insensitive.
|
public static str_isubstr_before_last_separator(string $str, string $separator, string $encoding = 'UTF-8') : string Gets the substring before the last occurrence of a separator, case-insensitive.
|
public static str_isubstr_first(string $str, string $needle, bool $before_needle = false, string $encoding = 'UTF-8') : string Gets the substring after (or before via "$before_needle") the first occurrence of the "$needle", case-insensitive.
|
public static str_isubstr_last(string $str, string $needle, bool $before_needle = false, string $encoding = 'UTF-8') : string Gets the substring after (or before via "$before_needle") the last occurrence of the "$needle", case-insensitive.
|
public static str_last_char(string $str, int $n = 1, string $encoding = 'UTF-8') : string Returns the last $n characters of the string.
|
public static str_limit(string $str, int $length = 100, string $str_add_on = '…', string $encoding = 'UTF-8') : string Limit the number of characters in a string.
|
public static str_limit_after_word(string $str, int $length = 100, string $str_add_on = '…', string $encoding = 'UTF-8') : string Limit the number of characters in a string, but also after the next word. EXAMPLE: UTF8::str_limit_after_word('fòô bàř fòô', 8, ''); // 'fòô bàř'
|
public static str_limit_in_byte(string $str, int $length = 100, string $str_add_on = '...', string $encoding = 'UTF-8') : string Limit the length of a string, measured in bytes.
|
public static str_longest_common_prefix(string $str1, string $str2, string $encoding = 'UTF-8') : string Returns the longest common prefix between the $str1 and $str2.
|
public static str_longest_common_substring(string $str1, string $str2, string $encoding = 'UTF-8') : string Returns the longest common substring between the $str1 and $str2. In the case of ties, it returns that which occurs first.
|
public static str_longest_common_suffix(string $str1, string $str2, string $encoding = 'UTF-8') : string Returns the longest common suffix between the $str1 and $str2.
|
public static str_matches_pattern(string $str, string $pattern) : bool Returns true if $str matches the supplied pattern, false otherwise.
|
public static str_obfuscate(string $str, float $percent = 0.5, string $obfuscateChar = '*', array $keepChars = []) : string Convert a string into an obfuscated string. EXAMPLE: UTF8::str_obfuscate('lars@moelleken.org', 0.5, '*', ['@', '.']); // e.g. "l**@m**lleke*.r"
|
public static str_offset_exists(string $str, int $offset, string $encoding = 'UTF-8') : bool Returns whether or not a character exists at an index. Offsets may be negative to count from the last character in the string. Implements part of the ArrayAccess interface.
|
public static str_offset_get(string $str, int $index, string $encoding = 'UTF-8') : string Returns the character at the given index. Offsets may be negative to count from the last character in the string. Implements part of the ArrayAccess interface, and throws an OutOfBoundsException if the index does not exist.
|
public static str_pad(string $str, int $pad_length, string $pad_string = ' ', $pad_type = STR_PAD_RIGHT, string $encoding = 'UTF-8') : string Pad a UTF-8 string to a given length with another string. EXAMPLE: UTF8::str_pad('中文空白', 10, '_', STR_PAD_BOTH); // '___中文空白___'
|
public static str_pad_both(string $str, int $length, string $pad_str = ' ', string $encoding = 'UTF-8') : string Returns a new string of a given length such that both sides of the string are padded. Alias for "UTF8::str_pad()" with a $pad_type of 'both'.
|
public static str_pad_left(string $str, int $length, string $pad_str = ' ', string $encoding = 'UTF-8') : string Returns a new string of a given length such that the beginning of the string is padded. Alias for "UTF8::str_pad()" with a $pad_type of 'left'.
|
public static str_pad_right(string $str, int $length, string $pad_str = ' ', string $encoding = 'UTF-8') : string Returns a new string of a given length such that the end of the string is padded. Alias for "UTF8::str_pad()" with a $pad_type of 'right'.
|
public static str_repeat(string $str, int $multiplier) : string Repeat a string. EXAMPLE: UTF8::str_repeat("°~\xf0\x90\x28\xbc", 2); // '°~ð(¼°~ð(¼'
|
public static str_replace( $search, $replace, $subject, ?int $count = NULL) INFO: This is only a wrapper for "str_replace()" -> the original function is already UTF-8 safe. Replace all occurrences of the search string with the replacement string.
|
public static str_replace_beginning(string $str, string $search, string $replacement) : string Replaces $search from the beginning of string with $replacement.
|
public static str_replace_ending(string $str, string $search, string $replacement) : string Replaces $search from the ending of string with $replacement.
|
public static str_replace_first(string $search, string $replace, string $subject) : string Replace the first "$search"-term with the "$replace"-term.
|
public static str_replace_last(string $search, string $replace, string $subject) : string Replace the last "$search"-term with the "$replace"-term.
|
public static str_shuffle(string $str, string $encoding = 'UTF-8') : string Shuffles all the characters in the string. INFO: uses random algorithm which is weak for cryptography purposes EXAMPLE: UTF8::str_shuffle('fòô bàř fòô'); // 'àòôřb ffòô '
|
public static str_slice(string $str, int $start, ?int $end = NULL, string $encoding = 'UTF-8') Returns the substring beginning at $start, and up to, but not including the index specified by $end. If $end is omitted, the function extracts the remaining string. If $end is negative, it is computed from the end of the string.
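
The `$start`/`$end` semantics of `str_slice()` differ from the offset/length pair used by `substr()`, so a small sketch may help. It assumes the `mbstring` extension; `slice_sketch()` is a hypothetical helper, and its handling of a negative `$start` and of an empty range is an assumption that the description above does not spell out.

```php
<?php
// Hypothetical helper: slice by character indices [$start, $end).
function slice_sketch(string $str, int $start, ?int $end = null): string
{
    $length = mb_strlen($str, 'UTF-8');
    if ($start < 0) {
        // Assumption: a negative start also counts from the end.
        $start = max(0, $length + $start);
    }
    if ($end === null) {
        $end = $length;            // omitted: take the rest of the string
    } elseif ($end < 0) {
        $end = $length + $end;     // negative: computed from the end
    }
    if ($end <= $start) {
        return '';                 // assumption: empty range yields ''
    }
    // Convert the exclusive end index into a length for mb_substr().
    return mb_substr($str, $start, $end - $start, 'UTF-8');
}

echo slice_sketch('中文空白', 1, 3), "\n";
echo slice_sketch('中文空白', 0, -1), "\n";
```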
|
public static str_snakeize(string $str, string $encoding = 'UTF-8') : string Convert a string to e.g.: "snake_case"
|
public static str_sort(string $str, bool $unique = false, bool $desc = false) : string Sort all characters according to code points. EXAMPLE: UTF8::str_sort(' -ABC-中文空白- '); // ' ---ABC中文白空'
|
public static str_split( $str, int $length = 1, bool $clean_utf8 = false, bool $try_to_use_mb_functions = true) : array Convert a string to an array of unicode characters. EXAMPLE: UTF8::str_split('中文空白'); // array('中', '文', '空', '白')
|
public static str_split_array(array $input, int $length = 1, bool $clean_utf8 = false, bool $try_to_use_mb_functions = true) : array Convert each string of an array to an array of Unicode character chunks. EXAMPLE: UTF8::str_split_array(['中文空白', 'test'], 2); // [['中文', '空白'], ['te', 'st']]
|
public static str_split_pattern(string $str, string $pattern, int $limit = -1) : array Splits the string with the provided regular expression, returning an array of strings. An optional integer $limit will truncate the results.
|
public static str_starts_with(string $haystack, string $needle) : bool Check if the string starts with the given substring. EXAMPLE: UTF8::str_starts_with('ΚόσμεMiddleEnd', 'Κόσμε'); // true UTF8::str_starts_with('ΚόσμεMiddleEnd', 'κόσμε'); // false
|
public static str_starts_with_any(string $str, array $substrings) : bool Returns true if the string begins with any of $substrings, false otherwise.
|
public static str_substr_after_first_separator(string $str, string $separator, string $encoding = 'UTF-8') : string Gets the substring after the first occurrence of a separator.
|
public static str_substr_after_last_separator(string $str, string $separator, string $encoding = 'UTF-8') : string Gets the substring after the last occurrence of a separator.
|
public static str_substr_before_first_separator(string $str, string $separator, string $encoding = 'UTF-8') : string Gets the substring before the first occurrence of a separator.
|
public static str_substr_before_last_separator(string $str, string $separator, string $encoding = 'UTF-8') : string Gets the substring before the last occurrence of a separator.
|
public static str_substr_first(string $str, string $needle, bool $before_needle = false, string $encoding = 'UTF-8') : string Gets the substring after (or before via "$before_needle") the first occurrence of the "$needle".
|
public static str_substr_last(string $str, string $needle, bool $before_needle = false, string $encoding = 'UTF-8') : string Gets the substring after (or before via "$before_needle") the last occurrence of the "$needle".
|
public static str_surround(string $str, string $substring) : string Surrounds $str with the given substring.
|
public static str_titleize(string $str, ?array $ignore = NULL, string $encoding = 'UTF-8', bool $clean_utf8 = false, ?string $lang = NULL, bool $try_to_keep_the_string_length = false, bool $use_trim_first = true, ?string $word_define_chars = NULL) : string Returns a trimmed string with the first letter of each word capitalized. Also accepts an array, $ignore, allowing you to list words not to be capitalized.
|
public static str_titleize_for_humans(string $str, array $ignore = [], string $encoding = 'UTF-8') : string Returns a trimmed string in proper title case. Also accepts an array, $ignore, allowing you to list words not to be capitalized. Adapted from John Gruber's script.
|
public static str_to_binary(string $str) Get a binary representation of a specific string. EXAMPLE: UTF8::str_to_binary('😃'); // '11110000100111111001100010000011'
|
public static str_to_lines(string $str, bool $remove_empty_values = false, ?int $remove_short_values = NULL) : array
|
public static str_to_words(string $str, string $char_list = '', bool $remove_empty_values = false, ?int $remove_short_values = NULL) : array Convert a string into an array of words. EXAMPLE: UTF8::str_to_words('中文空白 oöäü#s', '#'); // array('', '中文空白', ' ', 'oöäü#s', '')
|
public static str_truncate(string $str, int $length, string $substring = '', string $encoding = 'UTF-8') : string Truncates the string to a given length. If $substring is provided, and truncating occurs, the string is further truncated so that the substring may be appended without exceeding the desired length.
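
The interaction between `$length` and `$substring` in `str_truncate()` is easiest to see in code. The following is a minimal sketch assuming the `mbstring` extension; `truncate_sketch()` is a hypothetical name, and the behaviour when `$substring` is longer than `$length` is an assumption.

```php
<?php
// Hypothetical helper: truncate to $length characters, reserving room for $substring.
function truncate_sketch(string $str, int $length, string $substring = ''): string
{
    if (mb_strlen($str, 'UTF-8') <= $length) {
        return $str; // nothing to truncate, so $substring is not appended
    }
    // Keep fewer characters so that text + $substring stays within $length.
    $keep = max(0, $length - mb_strlen($substring, 'UTF-8'));
    return mb_substr($str, 0, $keep, 'UTF-8') . $substring;
}

echo truncate_sketch('fòô bàř fòô', 8, '…'), "\n";
```

Unlike `str_truncate_safe()`, this sketch may cut in the middle of a word.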
|
public static str_truncate_safe(string $str, int $length, string $substring = '', string $encoding = 'UTF-8', bool $ignore_do_not_split_words_for_one_word = false) : string Truncates the string to a given length, while ensuring that it does not split words. If $substring is provided, and truncating occurs, the string is further truncated so that the substring may be appended without exceeding the desired length.
|
public static str_underscored(string $str) : string Returns a lowercase and trimmed string separated by underscores. Underscores are inserted before uppercase characters (with the exception of the first character of the string), and in place of spaces as well as dashes.
|
public static str_upper_camelize(string $str, string $encoding = 'UTF-8', bool $clean_utf8 = false, ?string $lang = NULL, bool $try_to_keep_the_string_length = false) : string Returns an UpperCamelCase version of the supplied string. It trims surrounding spaces, capitalizes letters following digits, spaces, dashes and underscores, and removes spaces, dashes, underscores.
|
public static str_word_count(string $str, int $format = 0, string $char_list = '') Get the number of words in a specific string. EXAMPLES: Format 0 returns only the word count (int): UTF8::str_word_count('中文空白 öäü abc#c'); // 4 UTF8::str_word_count('中文空白 öäü abc#c', 0, '#'); // 3 Format 1 returns the words (array): UTF8::str_word_count('中文空白 öäü abc#c', 1); // array('中文空白', 'öäü', 'abc', 'c') UTF8::str_word_count('中文空白 öäü abc#c', 1, '#'); // array('中文空白', 'öäü', 'abc#c') Format 2 returns the words keyed by their offset (array): UTF8::str_word_count('中文空白 öäü abc#c', 2); // array(0 => '中文空白', 5 => 'öäü', 9 => 'abc', 13 => 'c') UTF8::str_word_count('中文空白 öäü abc#c', 2, '#'); // array(0 => '中文空白', 5 => 'öäü', 9 => 'abc#c')
|
public static strcasecmp(string $str1, string $str2, string $encoding = 'UTF-8') : int Case-insensitive string comparison. INFO: Case-insensitive version of UTF8::strcmp() EXAMPLE: UTF8::strcasecmp("iñtërnâtiôn\nàlizætiøn", "Iñtërnâtiôn\nàlizætiøn"); // 0
|
public static strcmp(string $str1, string $str2) : int Case-sensitive string comparison. EXAMPLE: UTF8::strcmp("iñtërnâtiôn\nàlizætiøn", "iñtërnâtiôn\nàlizætiøn"); // 0
|
public static strcspn(string $str, string $char_list, int $offset = 0, ?int $length = NULL, string $encoding = 'UTF-8') : int Find length of initial segment not matching mask.
|
public static string( $intOrHex) : string Create a UTF-8 string from code points. INFO: opposite to UTF8::codepoints() EXAMPLE: UTF8::string(array(246, 228, 252)); // 'öäü'
|
public static string_has_bom(string $str) : bool Checks if string starts with "BOM" (Byte Order Mark Character) character. EXAMPLE: UTF8::string_has_bom("\xef\xbb\xbf foobar"); // true
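
A BOM check of the kind `string_has_bom()` performs can be approximated by comparing the start of the string against the known byte order marks. This sketch only covers the raw UTF-8, UTF-16 and UTF-32 marks; `has_bom_sketch()` is a hypothetical helper and may not match every case the library recognises.

```php
<?php
// Hypothetical helper: true if $str starts with a UTF-8/16/32 byte order mark.
function has_bom_sketch(string $str): bool
{
    $boms = [
        "\xEF\xBB\xBF",     // UTF-8
        "\x00\x00\xFE\xFF", // UTF-32 (big endian)
        "\xFF\xFE\x00\x00", // UTF-32 (little endian)
        "\xFE\xFF",         // UTF-16 (big endian)
        "\xFF\xFE",         // UTF-16 (little endian)
    ];
    foreach ($boms as $bom) {
        // strncmp() is binary safe, so NUL bytes in the BOM are compared correctly.
        if (strncmp($str, $bom, strlen($bom)) === 0) {
            return true;
        }
    }
    return false;
}

var_dump(has_bom_sketch("\xef\xbb\xbf foobar"));
var_dump(has_bom_sketch('foobar'));
```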
|
public static strip_tags(string $str, ?string $allowable_tags = NULL, bool $clean_utf8 = false) : string Strip HTML and PHP tags from a string + clean invalid UTF-8. EXAMPLE: UTF8::strip_tags("κόσμε\xa0\xa1"); // 'κόσμε'
|
public static strip_whitespace(string $str) : string Strip all whitespace characters. This includes tabs and newline characters, as well as multibyte whitespace such as the thin space and ideographic space. EXAMPLE: UTF8::strip_whitespace(' Ο συγγραφέας '); // 'Οσυγγραφέας'
|
public static stripos(string $haystack, string $needle, int $offset = 0, string $encoding = 'UTF-8', bool $clean_utf8 = false) Find the position of the first occurrence of a substring in a string, case-insensitive. INFO: use UTF8::stripos_in_byte() for the position in bytes EXAMPLE: UTF8::stripos('aσσb', 'ΣΣ'); // 1 (σσ == ΣΣ)
|
public static stripos_in_byte(string $haystack, string $needle, int $offset = 0) Find the position of the first occurrence of a substring in a string, case-insensitive.
|
public static stristr(string $haystack, string $needle, bool $before_needle = false, string $encoding = 'UTF-8', bool $clean_utf8 = false) Returns all of haystack starting from and including the first occurrence of needle to the end, case-insensitive. EXAMPLE: $str = 'iñtërnâtiônàlizætiøn'; $search = 'NÂT'; UTF8::stristr($str, $search); // 'nâtiônàlizætiøn' UTF8::stristr($str, $search, true); // 'iñtër'
|
public static strlen(string $str, string $encoding = 'UTF-8', bool $clean_utf8 = false) Get the string length in characters, not the byte-length! INFO: use UTF8::strwidth() for the display width EXAMPLE: UTF8::strlen("Iñtërnâtiôn\xE9àlizætiøn"); // 20
|
public static strlen_in_byte(string $str) : int Get string length in byte.
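
The difference between character length, byte length and display width can be shown with PHP's native functions. This is a sketch assuming the `mbstring` extension and a UTF-8 encoded source file; it does not call the library itself.

```php
<?php
$str = 'Iñtërnâtiônàlizætiøn';

// Number of characters (what UTF8::strlen() reports for valid UTF-8).
$chars = mb_strlen($str, 'UTF-8');

// Number of bytes: each of the seven accented letters takes two bytes in UTF-8.
$bytes = strlen($str);

// Display width: full-width CJK characters occupy two columns each.
$width = mb_strwidth('中文空白', 'UTF-8');

printf("chars=%d bytes=%d width=%d\n", $chars, $bytes, $width);
```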
|
public static strnatcasecmp(string $str1, string $str2, string $encoding = 'UTF-8') : int Case-insensitive string comparisons using a "natural order" algorithm. INFO: natural order version of UTF8::strcasecmp() EXAMPLES: UTF8::strnatcasecmp('2', '10Hello WORLD 中文空白!'); // -1 UTF8::strcasecmp('2Hello world 中文空白!', '10Hello WORLD 中文空白!'); // 1 UTF8::strnatcasecmp('10Hello world 中文空白!', '2Hello WORLD 中文空白!'); // 1 UTF8::strcasecmp('10Hello world 中文空白!', '2Hello WORLD 中文空白!'); // -1
|
public static strnatcmp(string $str1, string $str2) : int String comparisons using a "natural order" algorithm INFO: natural order version of UTF8::strcmp() EXAMPLES: UTF8::strnatcmp('2Hello world 中文空白!', '10Hello WORLD 中文空白!'); // -1 UTF8::strcmp('2Hello world 中文空白!', '10Hello WORLD 中文空白!'); // 1 UTF8::strnatcmp('10Hello world 中文空白!', '2Hello WORLD 中文空白!'); // 1 UTF8::strcmp('10Hello world 中文空白!', '2Hello WORLD 中文空白!'); // -1
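
To see why "natural order" matters, compare PHP's native `strnatcmp()` with `strcmp()` on strings that start with numbers. The sketch uses only built-in functions and ASCII input; the file names are made up for illustration.

```php
<?php
$a = '2Hello world';
$b = '10Hello world';

// Natural order compares the leading numbers as numbers: 2 < 10.
$natural = strnatcmp($a, $b) < 0;

// Plain comparison works byte by byte: '2' sorts after '1'.
$plain = strcmp($a, $b) > 0;

// Typical use: sorting numbered names the way a person would.
$files = ['img12.png', 'img10.png', 'img2.png', 'img1.png'];
usort($files, 'strnatcmp');

var_dump($natural, $plain);
print_r($files);
```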
|
public static strncasecmp(string $str1, string $str2, int $len, string $encoding = 'UTF-8') : int Case-insensitive string comparison of the first n characters. EXAMPLE: UTF8::strncasecmp("iñtërnâtiôn\nàlizætiøn321", "iñtërnâtiôn\nàlizætiøn123", 5); // 0
|
public static strncmp(string $str1, string $str2, int $len, string $encoding = 'UTF-8') : int String comparison of the first n characters. EXAMPLE: UTF8::strncmp("Iñtërnâtiôn\nàlizætiøn321", "Iñtërnâtiôn\nàlizætiøn123", 5); // 0
|
public static strpbrk(string $haystack, string $char_list) Search a string for any of a set of characters. EXAMPLE: UTF8::strpbrk('-中文空白-', '白'); // '白-'
|
public static strpos(string $haystack, $needle, int $offset = 0, string $encoding = 'UTF-8', bool $clean_utf8 = false) Find the position of the first occurrence of a substring in a string. INFO: use UTF8::strpos_in_byte() for the position in bytes EXAMPLE: UTF8::strpos('ABC-ÖÄÜ-中文空白-中文空白', '中'); // 8
|
public static strpos_in_byte(string $haystack, string $needle, int $offset = 0) Find the position of the first occurrence of a substring in a string.
|
public static strrchr(string $haystack, string $needle, bool $before_needle = false, string $encoding = 'UTF-8', bool $clean_utf8 = false) Find the last occurrence of a character in a string within another. EXAMPLE: UTF8::strrchr('κόσμεκόσμε-äöü', 'κόσμε'); // 'κόσμε-äöü'
|
public static strrev(string $str, string $encoding = 'UTF-8') : string Reverses characters order in the string. EXAMPLE: UTF8::strrev('κ-öäü'); // 'üäö-κ'
|
public static strrichr(string $haystack, string $needle, bool $before_needle = false, string $encoding = 'UTF-8', bool $clean_utf8 = false) Find the last occurrence of a character in a string within another, case-insensitive. EXAMPLE: UTF8::strrichr('Aκόσμεκόσμε-äöü', 'aκόσμε'); // 'Aκόσμεκόσμε-äöü'
|
public static strripos(string $haystack, $needle, int $offset = 0, string $encoding = 'UTF-8', bool $clean_utf8 = false) Find the position of the last occurrence of a substring in a string, case-insensitive. EXAMPLE: UTF8::strripos('ABC-ÖÄÜ-中文空白-中文空白', '中'); // 13
|
public static strripos_in_byte(string $haystack, string $needle, int $offset = 0) Finds position of last occurrence of a string within another, case-insensitive.
|
public static strrpos(string $haystack, $needle, int $offset = 0, string $encoding = 'UTF-8', bool $clean_utf8 = false) Find the position of the last occurrence of a substring in a string. EXAMPLE: UTF8::strrpos('ABC-ÖÄÜ-中文空白-中文空白', '中'); // 13
|
public static strrpos_in_byte(string $haystack, string $needle, int $offset = 0) Find the position of the last occurrence of a substring in a string.
|
public static strspn(string $str, string $mask, int $offset = 0, ?int $length = NULL, string $encoding = 'UTF-8') Finds the length of the initial segment of a string consisting entirely of characters contained within a given mask. EXAMPLE: UTF8::strspn('iñtërnâtiônàlizætiøn', 'itñ'); // 3
|
public static strstr(string $haystack, string $needle, bool $before_needle = false, string $encoding = 'UTF-8', bool $clean_utf8 = false) Returns part of haystack string from the first occurrence of needle to the end of haystack. EXAMPLE: $str = 'iñtërnâtiônàlizætiøn'; $search = 'nât'; UTF8::strstr($str, $search); // 'nâtiônàlizætiøn' UTF8::strstr($str, $search, true); // 'iñtër'
|
public static strstr_in_byte(string $haystack, string $needle, bool $before_needle = false) Finds first occurrence of a string within another.
|
public static strtocasefold(string $str, bool $full = true, bool $clean_utf8 = false, string $encoding = 'UTF-8', ?string $lang = NULL, bool $lower = true) : string Unicode transformation for case-less matching. EXAMPLE: UTF8::strtocasefold('ǰ◌̱'); // 'ǰ◌̱'
|
public static strtolower( $str, string $encoding = 'UTF-8', bool $clean_utf8 = false, ?string $lang = NULL, bool $try_to_keep_the_string_length = false) : string Make a string lowercase. EXAMPLE: UTF8::strtolower('DÉJÀ Σσς Iıİi'); // 'déjà σσς iıii'
|
public static strtoupper( $str, string $encoding = 'UTF-8', bool $clean_utf8 = false, ?string $lang = NULL, bool $try_to_keep_the_string_length = false) : string Make a string uppercase. EXAMPLE: UTF8::strtoupper('Déjà Σσς Iıİi'); // 'DÉJÀ ΣΣΣ IIİI'
|
public static strtr(string $str, $from, $to = '') : string Translate characters or replace sub-strings. EXAMPLE: $array = [ 'Hello' => '○●◎', '中文空白' => 'earth', ]; UTF8::strtr('Hello 中文空白', $array); // '○●◎ earth'
|
public static strwidth(string $str, string $encoding = 'UTF-8', bool $clean_utf8 = false) : int Return the width of a string. INFO: use UTF8::strlen_in_byte() for the byte-length EXAMPLE: UTF8::strwidth("Iñtërnâtiôn\xE9àlizætiøn"); // 21
|
public static substr(string $str, int $offset = 0, ?int $length = NULL, string $encoding = 'UTF-8', bool $clean_utf8 = false) Get part of a string. EXAMPLE: UTF8::substr('中文空白', 1, 2); // '文空'
|
public static substr_compare(string $str1, string $str2, int $offset = 0, ?int $length = NULL, bool $case_insensitivity = false, string $encoding = 'UTF-8') : int Binary-safe comparison of two strings from an offset, up to a length of characters. EXAMPLE: UTF8::substr_compare("○●◎\r", '●◎', 0, 2); // -1 UTF8::substr_compare("○●◎\r", '◎●', 1, 2); // 1 UTF8::substr_compare("○●◎\r", '●◎', 1, 2); // 0
|
public static substr_count(string $haystack, string $needle, int $offset = 0, ?int $length = NULL, string $encoding = 'UTF-8', bool $clean_utf8 = false) Count the number of substring occurrences. EXAMPLE: UTF8::substr_count('中文空白', '文空', 1, 2); // 1
|
public static substr_count_in_byte(string $haystack, string $needle, int $offset = 0, ?int $length = NULL) Count the number of substring occurrences.
|
public static substr_count_simple(string $str, string $substring, bool $case_sensitive = true, string $encoding = 'UTF-8') : int Returns the number of occurrences of $substring in the given string. By default, the comparison is case-sensitive, but can be made insensitive by setting $case_sensitive to false.
|
public static substr_ileft(string $haystack, string $needle) : string Removes a prefix ($needle) from the beginning of the string ($haystack), case-insensitive. EXAMPLE: UTF8::substr_ileft('ΚόσμεMiddleEnd', 'Κόσμε'); // 'MiddleEnd' UTF8::substr_ileft('ΚόσμεMiddleEnd', 'κόσμε'); // 'MiddleEnd'
|
public static substr_in_byte(string $str, int $offset = 0, ?int $length = NULL) Get part of a string, processed in bytes.
|
public static substr_iright(string $haystack, string $needle) : string Removes a suffix ($needle) from the end of the string ($haystack), case-insensitive. EXAMPLE: UTF8::substr_iright('BeginMiddleΚόσμε', 'Κόσμε'); // 'BeginMiddle' UTF8::substr_iright('BeginMiddleΚόσμε', 'κόσμε'); // 'BeginMiddle'
|
public static substr_left(string $haystack, string $needle) : string Removes a prefix ($needle) from the beginning of the string ($haystack). EXAMPLE: UTF8::substr_left('ΚόσμεMiddleEnd', 'Κόσμε'); // 'MiddleEnd' UTF8::substr_left('ΚόσμεMiddleEnd', 'κόσμε'); // 'ΚόσμεMiddleEnd'
|
public static substr_replace( $str, $replacement, $offset, $length = NULL, string $encoding = 'UTF-8') Replace text within a portion of a string. EXAMPLE: UTF8::substr_replace(array('Iñtërnâtiônàlizætiøn', 'foo'), 'æ', 1); // array('Iæñtërnâtiônàlizætiøn', 'fæoo')
|
public static substr_right(string $haystack, string $needle, string $encoding = 'UTF-8') : string Removes a suffix ($needle) from the end of the string ($haystack). EXAMPLE: UTF8::substr_right('BeginMiddleΚόσμε', 'Κόσμε'); // 'BeginMiddle' UTF8::substr_right('BeginMiddleΚόσμε', 'κόσμε'); // 'BeginMiddleΚόσμε'
|
public static swapCase(string $str, string $encoding = 'UTF-8', bool $clean_utf8 = false) : string Returns a case swapped version of the string. EXAMPLE: UTF8::swapCase('déJÀ σσς iıII'); // 'DÉjà ΣΣΣ IIii'
|
public static symfony_polyfill_used() : bool Checks whether symfony-polyfills are used.
|
public static tabs_to_spaces(string $str, int $tab_length = 4) : string
|
public static titlecase(string $str, string $encoding = 'UTF-8', bool $clean_utf8 = false, ?string $lang = NULL, bool $try_to_keep_the_string_length = false) : string Converts the first character of each word in the string to uppercase and all other chars to lowercase.
|
public static to_ascii(string $str, string $unknown = '?', bool $strict = false) : string Convert a string into ASCII. EXAMPLE: UTF8::to_ascii('déjà σσς iıii'); // 'deja sss iiii'
|
public static to_boolean( $str) : bool
|
public static to_filename(string $str, bool $use_transliterate = false, string $fallback_char = '-') : string Convert given string to safe filename (and keep string case).
|
public static to_int(string $str) Returns the given string as an integer, or null if the string isn't numeric.
|
public static to_iso8859( $str) Convert a string into "ISO-8859"-encoding (Latin-1). EXAMPLE: UTF8::to_utf8(UTF8::to_iso8859(' -ABC-中文空白- ')); // ' -ABC-????- '
|
public static to_string( $input) Returns the given input as a string, or null if the input isn't int|float|string and does not implement the "__toString()" method.
|
public static to_utf8( $str, bool $decode_html_entity_to_utf8 = false) This function leaves UTF-8 characters alone, while converting almost all non-UTF8 to UTF8. EXAMPLE: UTF8::to_utf8(["cat"]); // array('cat')
|
public static to_utf8_string(string $str, bool $decode_html_entity_to_utf8 = false) : string This function leaves UTF-8 characters alone, while converting almost all non-UTF8 to UTF8. EXAMPLE: UTF8::to_utf8_string("cat"); // 'cat'
|
public static trim(string $str = '', ?string $chars = NULL) : string Strip whitespace or other characters from the beginning and end of a UTF-8 string. INFO: This is slower than "trim()". The original function could only be used if the string and chars contain nothing but 7-bit ASCII, but checking for ASCII costs more time than it would save. EXAMPLE: UTF8::trim(' -ABC-中文空白- '); // '-ABC-中文空白-'
|
public static ucfirst(string $str, string $encoding = 'UTF-8', bool $clean_utf8 = false, ?string $lang = NULL, bool $try_to_keep_the_string_length = false) : string Makes string's first char uppercase. EXAMPLE: UTF8::ucfirst('ñtërnâtiônàlizætiøn foo'); // 'Ñtërnâtiônàlizætiøn foo'
|
public static ucwords(string $str, array $exceptions = [], string $char_list = '', string $encoding = 'UTF-8', bool $clean_utf8 = false) : string Uppercase for all words in the string. EXAMPLE: UTF8::ucwords('iñt ërn âTi ônà liz æti øn'); // 'Iñt Ërn ÂTi Ônà Liz Æti Øn'
|
public static urldecode(string $str, bool $multi_decode = true) : string Multi decode HTML entity + fix urlencoded-win1252-chars. EXAMPLE: UTF8::urldecode('tes%20öäü%20ítest+test'); // 'tes öäü ítest test' e.g.: 'test+test' => 'test test' 'Düsseldorf' => 'Düsseldorf' 'D%FCsseldorf' => 'Düsseldorf' 'D&#xFC;sseldorf' => 'Düsseldorf' 'D%26%23xFC%3Bsseldorf' => 'Düsseldorf' 'DÃ¼sseldorf' => 'Düsseldorf' 'D%C3%BCsseldorf' => 'Düsseldorf' 'D%C3%83%C2%BCsseldorf' => 'Düsseldorf' 'D%25C3%2583%25C2%25BCsseldorf' => 'Düsseldorf'
|
public static utf8_decode(string $str, bool $keep_utf8_chars = false) : string Decodes a UTF-8 string to ISO-8859-1. EXAMPLE: UTF8::encode('UTF-8', UTF8::utf8_decode('-ABC-中文空白-')); // '-ABC-????-'
|
public static utf8_encode(string $str) : string Encodes an ISO-8859-1 string to UTF-8. EXAMPLE: UTF8::utf8_decode(UTF8::utf8_encode('-ABC-中文空白-')); // '-ABC-中文空白-'
|
public static whitespace_table() : array Returns an array with all utf8 whitespace characters.
|
public static words_limit(string $str, int $limit = 100, string $str_add_on = '…') : string Limit the number of words in a string. EXAMPLE: UTF8::words_limit('fòô bàř fòô', 2, ''); // 'fòô bàř'
|
public static wordwrap(string $str, int $width = 75, string $break = "\n", bool $cut = false) : string Wraps a string to a given number of characters. EXAMPLE: UTF8::wordwrap('Iñtërnâtiônàlizætiøn', 2, '<br>', true); // 'Iñ<br>të<br>rn<br>ât<br>iô<br>nà<br>li<br>zæ<br>ti<br>øn'
|
public static wordwrap_per_line(string $str, int $width = 75, string $break = "\n", bool $cut = false, bool $add_final_break = true, ?string $delimiter = NULL) : string Line-wraps the string after $width characters, but splits the string by "$delimiter" first, so that each line is wrapped separately.
|
public static ws() : array Returns an array of Unicode White Space characters.
|
Properties |
private static $BOM = ["\xEF\xBB\xBF" => 3, 'ï»¿' => 6, "\x00\x00\xFE\xFF" => 4, "\x00\x00þÿ" => 6, "\xFF\xFE\x00\x00" => 4, "ÿþ\x00\x00" => 6, "\xFE\xFF" => 2, 'þÿ' => 4, "\xFF\xFE" => 2, 'ÿþ' => 4] |
private static $BROKEN_UTF8_FIX = ['â„¢' => '™', 'Æ’' => 'ƒ', 'Ã…' => 'Å', 'Å’' => 'Œ', 'ÃŒ' => 'Ì', 'ÃŽ' => 'Î', 'Ã’' => 'Ò', 'â€' => '”', 'Ëœ' => '˜', 'Å¡' => 'š', 'Å“' => 'œ', 'Ãœ' => 'Ü', 'Â¥' => '¥', 'Ã¥' => 'å', …] Broken (double-encoded) sequence => intended UTF-8 character; only entries with printable keys are listed.
|
private static $CHR = ["\x00", "\x01", "\x02", …, ' ', '!', '"', '#', …, '}', '~', "\x7F", "\x80", …, "\xFF"] One single-byte string per byte value, in ascending order; non-printable and non-ASCII bytes are shown as escapes.
|
private static $COMMON_CASE_FOLD = ['upper' => ['µ', 'ſ', 'ͅ', 'ς', 'ẞ', 'ϐ', 'ϑ', 'ϕ', 'ϖ', 'ϰ', 'ϱ', 'ϵ', 'ẛ', 'ι'], 'lower' => ['μ', 's', 'ι', 'σ', 'ß', 'β', 'θ', 'φ', 'π', 'κ', 'ρ', 'ε', 'ṡ', 'ι']]
|
private static $EMOJI = NULL
|
private static $EMOJI_KEYS_CACHE = NULL
|
private static $EMOJI_KEYS_REVERSIBLE_CACHE = NULL
|
private static $EMOJI_VALUES_CACHE = NULL
|
private static $ENCODINGS = NULL
|
private static $INTL_TRANSLITERATOR_LIST = NULL
|
private static $ORD = NULL
|
private static $SUPPORT = []
|
private static $WHITESPACE = [' ', 9 => ' ', 10 => '
', 11 => '', 13 => '
', 32 => ' ', 160 => ' ', 5760 => ' ', 6158 => '', 8192 => ' ', 8193 => ' ', 8194 => ' ', 8195 => ' ', 8196 => ' ', 8197 => ' ', 8198 => ' ', 8199 => ' ', 8200 => ' ', 8201 => ' ', 8202 => ' ', 8232 => '
', 8233 => '
', 8239 => ' ', 8287 => ' ', 65440 => 'ᅠ', 12288 => ' '] Numeric code point => UTF-8 Character
|
private static $WHITESPACE_TABLE = ['SPACE' => ' ', 'NO-BREAK SPACE' => ' ', 'OGHAM SPACE MARK' => ' ', 'EN QUAD' => ' ', 'EM QUAD' => ' ', 'EN SPACE' => ' ', 'EM SPACE' => ' ', 'THREE-PER-EM SPACE' => ' ', 'FOUR-PER-EM SPACE' => ' ', 'SIX-PER-EM SPACE' => ' ', 'FIGURE SPACE' => ' ', 'PUNCTUATION SPACE' => ' ', 'THIN SPACE' => ' ', 'HAIR SPACE' => ' ', 'LINE SEPARATOR' => '
', 'PARAGRAPH SEPARATOR' => '
', 'ZERO WIDTH SPACE' => '', 'NARROW NO-BREAK SPACE' => ' ', 'MEDIUM MATHEMATICAL SPACE' => ' ', 'IDEOGRAPHIC SPACE' => ' ', 'HALFWIDTH HANGUL FILLER' => 'ᅠ']
|
private static $WIN1252_TO_UTF8 = NULL
|
Methods |
private static fixStrCaseHelper(string $str, bool $use_lowercase = false, bool $use_full_case_fold = false)
|
private static getData(string $file) get data from "/data/*.php"
|
private static initEmojiData()
|
private static is_utf8_string(string $str, bool $strict = false) Checks whether the passed string contains only byte sequences that are valid UTF-8 characters. EXAMPLE: UTF8::is_utf8_string('Iñtërnâtiônàlizætiøn'); // true // UTF8::is_utf8_string("Iñtërnâtiônàlizætiøn\xA0\xA1"); // false
|
private static mbstring_overloaded() : bool Checks whether mbstring "overloaded" is active on the server.
|
private static reduce_string_array(array $strings, bool $remove_empty_values, ?int $remove_short_values = NULL)
|
private static rxClass(string $s, string $class = '') rxClass
|
private static str_capitalize_name_helper(string $names, string $delimiter, string $encoding = 'UTF-8') Personal names such as "Marcus Aurelius" are sometimes typed incorrectly using lowercase ("marcus aurelius").
|
private static strtonatfold(string $str) Generic case-sensitive transformation for collation matching.
|
private static to_utf8_convert_helper( $input)
|
private static urldecode_unicode_helper(string $str)
|
Methods |
public static access(string $str, int $pos, string $encoding = 'UTF-8') : string Return the character at the specified position: $str[1] like functionality. EXAMPLE: UTF8::access('fòô', 1); // 'ò'
|
public static add_bom_to_string(string $str) : string Prepends UTF-8 BOM character to the string and returns the whole string. INFO: If a BOM is already present, the input string is returned unchanged. EXAMPLE: UTF8::add_bom_to_string('fòô'); // "\xEF\xBB\xBF" . 'fòô'
|
public static array_change_key_case(array $array, int $case = CASE_LOWER, string $encoding = 'UTF-8') : array Changes the case of all keys in an array.
|
public static between(string $str, string $start, string $end, int $offset = 0, string $encoding = 'UTF-8') : string Returns the substring between $start and $end, if found, or an empty string. An optional offset may be supplied from which to begin the search for the start string.
|
public static binary_to_str( $bin) : string Convert binary into a string. INFO: opposite to UTF8::str_to_binary() EXAMPLE: UTF8::binary_to_str('11110000100111111001100010000011'); // '😃'
|
public static bom() : string Returns the UTF-8 Byte Order Mark Character. INFO: take a look at UTF8::$bom for e.g. UTF-16 and UTF-32 BOM values EXAMPLE: UTF8::bom(); // "\xEF\xBB\xBF"
|
public static callback( $callback, string $str) : array
|
public static char_at(string $str, int $index, string $encoding = 'UTF-8') : string Returns the character at $index, with indexes starting at 0.
|
public static chars(string $str) : array Returns an array consisting of the characters in the string.
|
public static checkForSupport() This method will auto-detect your server environment for UTF-8 support.
|
public static chr( $code_point, string $encoding = 'UTF-8') Generates a UTF-8 encoded character from the given code point. INFO: opposite to UTF8::ord() EXAMPLE: UTF8::chr(0x2603); // '☃'
|
public static chr_map( $callback, string $str) : array Applies callback to all characters of a string. EXAMPLE: UTF8::chr_map([UTF8::class, 'strtolower'], 'Κόσμε'); // ['κ','ό', 'σ', 'μ', 'ε']
|
public static chr_size_list(string $str) : array Generates an array of byte length of each character of a Unicode string. 1 byte => U+0000 - U+007F 2 byte => U+0080 - U+07FF 3 byte => U+0800 - U+FFFF 4 byte => U+10000 - U+10FFFF EXAMPLE: UTF8::chr_size_list('中文空白-test'); // [3, 3, 3, 3, 1, 1, 1, 1, 1]
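
The byte-length table above can be checked with plain mbstring calls. This is a minimal sketch, not the library's implementation; `byte_sizes()` is a hypothetical helper name, and `mb_str_split()` assumes PHP 7.4 or newer.

```php
<?php
// Hypothetical helper: byte length of every character in a UTF-8 string.
function byte_sizes(string $str): array
{
    // mb_str_split() splits on characters; strlen() then counts raw bytes.
    return array_map('strlen', mb_str_split($str, 1, 'UTF-8'));
}

// Four CJK characters (3 bytes each) followed by five ASCII characters.
print_r(byte_sizes('中文空白-test')); // values: 3, 3, 3, 3, 1, 1, 1, 1, 1
```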
|
public static chr_to_decimal(string $char) : int Get a decimal code representation of a specific character. INFO: opposite to UTF8::decimal_to_chr() EXAMPLE: UTF8::chr_to_decimal('§'); // 0xa7
|
public static chr_to_hex( $char, string $prefix = 'U+') : string Get hexadecimal code point (U+xxxx) of a UTF-8 encoded character. EXAMPLE: UTF8::chr_to_hex('§'); // U+00a7
|
public static chunk_split(string $str, int $chunk_length = 76, string $end = '
') : string Splits a string into smaller chunks and multiple lines, using the specified line ending character. EXAMPLE: UTF8::chunk_split('ABC-ÖÄÜ-中文空白-κόσμε', 3); // "ABC\r\n-ÖÄ\r\nÜ-中\r\n文空白\r\n-κό\r\nσμε"
|
public static clean(string $str, bool $remove_bom = false, bool $normalize_whitespace = false, bool $normalize_msword = false, bool $keep_non_breaking_space = false, bool $replace_diamond_question_mark = false, bool $remove_invisible_characters = true, bool $remove_invisible_characters_url_encoded = false) : string Accepts a string and removes all non-UTF-8 characters from it + extras if needed. EXAMPLE: UTF8::clean("\xEF\xBB\xBF„Abcdef\xc2\xa0\x20…” — 😃 - Düsseldorf", true, true); // '„Abcdef …” — 😃 - Düsseldorf'
|
public static cleanup( $str) : string Clean-up a string and show only printable UTF-8 chars at the end + fix UTF-8 encoding. EXAMPLE: UTF8::cleanup("\xEF\xBB\xBF„Abcdef\xc2\xa0\x20…” — 😃 - Düsseldorf"); // '„Abcdef …” — 😃 - Düsseldorf'
|
public static codepoints( $arg, bool $use_u_style = false) : array Accepts a string or an array of chars and returns an array of Unicode code points. INFO: opposite to UTF8::string() EXAMPLE: UTF8::codepoints('κöñ'); // array(954, 246, 241) // ... OR ... UTF8::codepoints('κöñ', true); // array('U+03ba', 'U+00f6', 'U+00f1')
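
To illustrate the two output styles, here is a rough equivalent built on `mb_ord()` (PHP 7.2+) and `mb_str_split()` (PHP 7.4+). The function name `code_points()` is an assumption made for this sketch and is not part of the class.

```php
<?php
// Hypothetical helper: list the Unicode code points of a UTF-8 string.
function code_points(string $str, bool $uStyle = false): array
{
    $points = array_map(
        static fn (string $ch): int => mb_ord($ch, 'UTF-8'),
        mb_str_split($str, 1, 'UTF-8')
    );

    if (!$uStyle) {
        return $points;
    }

    // Format each code point as lowercase hex, zero-padded to four digits.
    return array_map(
        static fn (int $cp): string => sprintf('U+%04x', $cp),
        $points
    );
}

print_r(code_points('κöñ'));       // 954, 246, 241
print_r(code_points('κöñ', true)); // U+03ba, U+00f6, U+00f1
```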
|
public static collapse_whitespace(string $str) : string Trims the string and replaces consecutive whitespace characters with a single space. This includes tabs and newline characters, as well as multibyte whitespace such as the thin space and ideographic space.
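
The behaviour described above can be approximated with a single Unicode-aware regular expression. This is only a sketch under the assumption that `\s` plus the `\p{Z}` separator class is a good enough definition of whitespace; `collapse_ws()` is a hypothetical name.

```php
<?php
// Hypothetical helper: collapse runs of (Unicode) whitespace to one space.
function collapse_ws(string $str): string
{
    // \s covers tabs and newlines; \p{Z} adds Unicode separators such as
    // THIN SPACE (U+2009) and IDEOGRAPHIC SPACE (U+3000).
    $collapsed = preg_replace('/[\s\p{Z}]+/u', ' ', $str);

    // preg_replace() returns null on invalid UTF-8; fall back to the input.
    return trim($collapsed ?? $str, ' ');
}

echo collapse_ws("  foo\t\n bar\u{2009}\u{3000}baz "); // foo bar baz
```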
|
public static count_chars(string $str, bool $clean_utf8 = false, bool $try_to_use_mb_functions = true) : array Returns count of characters used in a string. EXAMPLE: UTF8::count_chars('κaκbκc'); // array('κ' => 3, 'a' => 1, 'b' => 1, 'c' => 1)
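
A comparable per-character count can be obtained from built-in functions alone. The helper name `char_counts()` is an assumption for illustration; note that digit characters would become integer keys in the result.

```php
<?php
// Hypothetical helper: count how often each character occurs.
function char_counts(string $str): array
{
    // Split into characters, then let array_count_values() tally them.
    return array_count_values(mb_str_split($str, 1, 'UTF-8'));
}

print_r(char_counts('κaκbκc')); // κ => 3, a => 1, b => 1, c => 1
```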
|
public static css_identifier(string $str = '', array $filter = [' ' => '-', '/' => '-', '[' => '', ']' => ''], bool $strip_tags = false, bool $strtolower = true) : string Create a valid CSS identifier for e.g. "class"- or "id"-attributes. EXAMPLE: UTF8::css_identifier('123foo/bar!!!'); // _23foo-bar Copied from https://github.com/drupal/core/blob/8.8.x/lib/Drupal/Component/Utility/Html.php#L95
|
public static css_stripe_media_queries(string $str) : string Remove css media-queries.
|
public static ctype_loaded() : bool Checks whether ctype is available on the server.
|
public static decimal_to_chr( $int) : string Converts an int value into a UTF-8 character. INFO: opposite to UTF8::chr_to_decimal() EXAMPLE: UTF8::decimal_to_chr(931); // 'Σ'
|
public static decode_mimeheader( $str, string $encoding = 'UTF-8') Decodes a MIME header field
|
public static emoji_decode(string $str, bool $use_reversible_string_mappings = false) : string Decodes a string which was encoded by "UTF8::emoji_encode()". INFO: opposite to UTF8::emoji_encode() EXAMPLE: UTF8::emoji_decode('foo CHARACTER_OGRE', false); // 'foo 👹' // UTF8::emoji_decode('foo -PORTABLE_UTF8-308095726-627590803-8FTU_ELBATROP-', true); // 'foo 👹'
|
public static emoji_encode(string $str, bool $use_reversible_string_mappings = false) : string Encode a string with emoji chars into a non-emoji string. INFO: opposite to UTF8::emoji_decode() EXAMPLE: UTF8::emoji_encode('foo 👹', false); // 'foo CHARACTER_OGRE' // UTF8::emoji_encode('foo 👹', true); // 'foo -PORTABLE_UTF8-308095726-627590803-8FTU_ELBATROP-'
|
public static emoji_from_country_code(string $country_code_iso_3166_1) : string Convert any two-letter country code (ISO 3166-1) to the corresponding Emoji.
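
Flag emoji are encoded as two Unicode regional indicator symbols, one per letter of the country code. The sketch below shows that mapping with `mb_chr()` (PHP 7.2+); `flag_emoji()` is a hypothetical name, and the library's own validation rules may differ.

```php
<?php
// Hypothetical helper: build a flag emoji from a two-letter country code.
function flag_emoji(string $countryCode): string
{
    $code = strtoupper($countryCode);
    if (!preg_match('/^[A-Z]{2}$/', $code)) {
        throw new InvalidArgumentException('Expected a two-letter country code.');
    }

    $flag = '';
    foreach (str_split($code) as $letter) {
        // U+1F1E6 is REGIONAL INDICATOR SYMBOL LETTER A; B..Z follow it.
        $flag .= mb_chr(0x1F1E6 + ord($letter) - ord('A'), 'UTF-8');
    }

    return $flag;
}

echo flag_emoji('de'); // 🇩🇪
```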
|
public static encode(string $to_encoding, string $str, bool $auto_detect_the_from_encoding = true, string $from_encoding = '') : string Encode a string with a new charset-encoding. INFO: This function also tries to fix broken / double encoding, so you can safely call it on a string that is already UTF-8 without corrupting it. EXAMPLE: UTF8::encode('ISO-8859-1', '-ABC-中文空白-'); // '-ABC-????-' // UTF8::encode('UTF-8', '-ABC-中文空白-'); // '-ABC-中文空白-' // UTF8::encode('HTML', '-ABC-中文空白-'); // '-ABC-&#20013;&#25991;&#31354;&#30333;-' // UTF8::encode('BASE64', '-ABC-中文空白-'); // 'LUFCQy3kuK3mlofnqbrnmb0t'
|
public static encode_mimeheader(string $str, string $from_charset = 'UTF-8', string $to_charset = 'UTF-8', string $transfer_encoding = 'Q', string $linefeed = '
', int $indent = 76)
|
public static extract_text(string $str, string $search = '', ?int $length = NULL, string $replacer_for_skipped_text = '…', string $encoding = 'UTF-8') : string Create an extract from a sentence, so if the search-string was found, it tries to center in the output.
|
public static file_get_contents(string $filename, bool $use_include_path = false, $context = NULL, ?int $offset = NULL, ?int $max_length = NULL, int $timeout = 10, bool $convert_to_utf8 = true, string $from_encoding = '') Reads entire file into a string. EXAMPLE: UTF8::file_get_contents('utf16le.txt'); // ... WARNING: Do not use UTF-8 Option ($convert_to_utf8) for binary files (e.g.: images) !!!
|
public static file_has_bom(string $file_path) : bool Checks if a file starts with BOM (Byte Order Mark) character. EXAMPLE: UTF8::file_has_bom('utf8_with_bom.txt'); // true
|
public static filter( $var, int $normalization_form = Normalizer::NFC, string $leading_combining = '◌') Normalizes to UTF-8 NFC, converting from WINDOWS-1252 when needed. EXAMPLE: UTF8::filter(array("\xE9", 'à', 'a')); // array('é', 'à', 'a')
|
public static filter_input(int $type, string $variable_name, int $filter = FILTER_DEFAULT, $options = NULL) "filter_input()"-wrapper that normalizes to UTF-8 NFC, converting from WINDOWS-1252 when needed. Gets a specific external variable by name and optionally filters it. EXAMPLE: // _GET['foo'] = 'bar'; UTF8::filter_input(INPUT_GET, 'foo', FILTER_UNSAFE_RAW); // 'bar'
|
public static filter_input_array(int $type, $definition = NULL, bool $add_empty = true) "filter_input_array()"-wrapper that normalizes to UTF-8 NFC, converting from WINDOWS-1252 when needed. Gets external variables and optionally filters them. EXAMPLE: // _GET['foo'] = 'bar'; UTF8::filter_input_array(INPUT_GET, array('foo' => 'FILTER_UNSAFE_RAW')); // array('bar')
|
public static filter_var( $variable, int $filter = FILTER_DEFAULT, $options = 0) "filter_var()"-wrapper that normalizes to UTF-8 NFC, converting from WINDOWS-1252 when needed. Filters a variable with a specified filter. EXAMPLE: UTF8::filter_var('-ABC-中文空白-', FILTER_VALIDATE_URL); // false
|
public static filter_var_array(array $data, $definition = 0, bool $add_empty = true) "filter_var_array()"-wrapper that normalizes to UTF-8 NFC, converting from WINDOWS-1252 when needed. Gets multiple variables and optionally filters them. EXAMPLE: $filters = [ 'name' => ['filter' => FILTER_CALLBACK, 'options' => [UTF8::class, 'ucwords']], 'age' => ['filter' => FILTER_VALIDATE_INT, 'options' => ['min_range' => 1, 'max_range' => 120]], 'email' => FILTER_VALIDATE_EMAIL, ]; $data = [ 'name' => 'κόσμε', 'age' => '18', 'email' => 'foo@bar.de' ]; UTF8::filter_var_array($data, $filters, true); // ['name' => 'Κόσμε', 'age' => 18, 'email' => 'foo@bar.de']
|
public static finfo_loaded() : bool Checks whether finfo is available on the server.
|
public static first_char(string $str, int $n = 1, string $encoding = 'UTF-8') : string Returns the first $n characters of the string.
|
public static fits_inside(string $str, int $box_size) : bool Check if the number of Unicode characters isn't greater than the specified integer. EXAMPLE: UTF8::fits_inside('κόσμε', 6); // true (5 characters fit into a box of 6)
|
public static fix_simple_utf8(string $str) : string Try to fix simple broken UTF-8 strings. INFO: Take a look at "UTF8::fix_utf8()" if you need a more advanced fix for broken UTF-8 strings. EXAMPLE: UTF8::fix_simple_utf8('DÃ¼sseldorf'); // 'Düsseldorf' If you received a UTF-8 string that was converted from Windows-1252 as if it were ISO-8859-1 (ignoring the Windows-1252 chars from 80 to 9F), use this function to fix it. See: http://en.wikipedia.org/wiki/Windows-1252
|
public static fix_utf8( $str) Fix a double (or multiple) encoded UTF8 string. EXAMPLE: UTF8::fix_utf8('FÃ©dÃ©ration'); // 'Fédération'
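
The kind of damage this method targets can be reproduced and undone with `mb_convert_encoding()`. This is a sketch of the underlying idea only, not the library's lookup-table approach, and it deliberately ignores the Windows-1252 bytes 80 to 9F that a plain ISO-8859-1 round trip does not cover.

```php
<?php
$original = "D\u{FC}sseldorf"; // 'Düsseldorf' as valid UTF-8

// Simulate the damage: treat the UTF-8 bytes as ISO-8859-1 and re-encode.
// The two bytes of 'ü' (C3 BC) become the two characters 'Ã' and '¼'.
$broken = mb_convert_encoding($original, 'UTF-8', 'ISO-8859-1');

// Undo it: map each character back to a single ISO-8859-1 byte, which
// restores the original UTF-8 byte sequence.
$repaired = mb_convert_encoding($broken, 'ISO-8859-1', 'UTF-8');

var_dump($broken === "D\u{C3}\u{BC}sseldorf"); // bool(true)
var_dump($repaired === $original);             // bool(true)
```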
|
private static fixStrCaseHelper(string $str, bool $use_lowercase = false, bool $use_full_case_fold = false)
|
public static get_file_type(string $str, array $fallback = ['ext' => NULL, 'mime' => 'application/octet-stream', 'type' => NULL]) : array Warning: this method only works for some file types (png, jpg). If you need more supported types, please use e.g. "finfo".
|
public static get_random_string(int $length, string $possible_chars = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789', string $encoding = 'UTF-8') : string
|
public static get_unique_string( $extra_entropy = '', bool $use_md5 = true) : string
|
public static getCharDirection(string $char) : string Get the text direction of a specific character. EXAMPLE: UTF8::getCharDirection('ا'); // 'RTL'
|
private static getData(string $file) get data from "/data/*.php"
|
public static getSupportInfo(?string $key = NULL) Check for php-support.
|
public static getUrlParamFromArray(string $param, array $data) Get data from an array via an array-like string. EXAMPLE: $array['foo'][123] = 'lall'; UTF8::getUrlParamFromArray('foo[123]', $array); // 'lall'
|
public static has_lowercase(string $str) : bool Returns true if the string contains a lower case char, false otherwise.
|
public static has_uppercase(string $str) : bool Returns true if the string contains an upper case char, false otherwise.
|
public static has_whitespace(string $str) : bool Returns true if the string contains whitespace, false otherwise.
|
public static hex_to_chr(string $hexdec) Converts a hexadecimal value into a UTF-8 character. INFO: opposite to UTF8::chr_to_hex() EXAMPLE: UTF8::hex_to_chr('U+00a7'); // '§'
|
public static hex_to_int( $hexdec) Converts hexadecimal U+xxxx code point representation to integer. INFO: opposite to UTF8::int_to_hex() EXAMPLE: UTF8::hex_to_int('U+00f1'); // 241
|
public static html_encode(string $str, bool $keep_ascii_chars = false, string $encoding = 'UTF-8') : string Converts a UTF-8 string to a series of HTML numbered entities. INFO: opposite to UTF8::html_decode() EXAMPLE: UTF8::html_encode('中文空白'); // '&#20013;&#25991;&#31354;&#30333;'
|
public static html_entity_decode(string $str, ?int $flags = NULL, string $encoding = 'UTF-8') : string UTF-8 version of html_entity_decode(). The reason we are not using html_entity_decode() by itself is that, while it is not technically correct to leave out the semicolon at the end of an entity, most browsers will still interpret the entity correctly. html_entity_decode() does not convert entities without semicolons, so we are left with our own little solution here. Bummer. Convert all HTML entities to their applicable characters. INFO: opposite to UTF8::html_encode() EXAMPLE: UTF8::html_entity_decode('&#20013;&#25991;&#31354;&#30333;'); // '中文空白'
|
public static html_escape(string $str, string $encoding = 'UTF-8') : string Creates an HTML-escaped version of the string via "UTF8::htmlspecialchars()".
|
public static html_stripe_empty_tags(string $str) : string Remove empty html-tag. e.g.:
|
public static htmlentities(string $str, int $flags = ENT_COMPAT, string $encoding = 'UTF-8', bool $double_encode = true) : string Convert all applicable characters to HTML entities: UTF-8 version of htmlentities(). EXAMPLE: UTF8::htmlentities('<白-öäü>'); // '&lt;&#30333;-&ouml;&auml;&uuml;&gt;'
|
public static htmlspecialchars(string $str, int $flags = ENT_COMPAT, string $encoding = 'UTF-8', bool $double_encode = true) : string Convert only special characters to HTML entities: UTF-8 version of htmlspecialchars() INFO: Take a look at "UTF8::htmlentities()" EXAMPLE: UTF8::htmlspecialchars('<白-öäü>'); // '&lt;白-öäü&gt;'
|
public static iconv_loaded() : bool Checks whether iconv is available on the server.
|
private static initEmojiData()
|
public static int_to_hex(int $int, string $prefix = 'U+') : string Converts Integer to hexadecimal U+xxxx code point representation. INFO: opposite to UTF8::hex_to_int() EXAMPLE: UTF8::int_to_hex(241); // 'U+00f1'
|
public static intl_loaded() : bool Checks whether intl is available on the server.
|
public static intlChar_loaded() : bool Checks whether intl-char is available on the server.
|
public static is_alpha(string $str) : bool Returns true if the string contains only alphabetic chars, false otherwise.
|
public static is_alphanumeric(string $str) : bool Returns true if the string contains only alphabetic and numeric chars, false otherwise.
|
public static is_ascii(string $str) : bool Checks if a string is 7 bit ASCII. EXAMPLE: UTF8::is_ascii('白'); // false
|
public static is_base64( $str, bool $empty_string_is_valid = false) : bool Returns true if the string is base64 encoded, false otherwise. EXAMPLE: UTF8::is_base64('4KSu4KWL4KSo4KS/4KSa'); // true
|
public static is_binary( $input, bool $strict = false) : bool Check if the input is binary (this looks like a hack). EXAMPLE: UTF8::is_binary(01); // true
|
public static is_binary_file( $file) : bool Check if the file is binary. EXAMPLE: UTF8::is_binary_file('./utf32.txt'); // true
|
public static is_blank(string $str) : bool Returns true if the string contains only whitespace chars, false otherwise.
|
public static is_bom( $str) : bool Checks if the given string is equal to any "Byte Order Mark". WARNING: Use "UTF8::string_has_bom()" if you want to check for a BOM inside a string. EXAMPLE: UTF8::is_bom("\xef\xbb\xbf"); // true
|
public static is_empty( $str) : bool Determine whether the string is considered to be empty. A variable is considered empty if it does not exist or if its value equals FALSE. empty() does not generate a warning if the variable does not exist.
|
public static is_hexadecimal(string $str) : bool Returns true if the string contains only hexadecimal chars, false otherwise.
|
public static is_html(string $str) : bool Check if the string contains any HTML tags. EXAMPLE: UTF8::is_html('<b>lall</b>'); // true
|
public static is_json(string $str, bool $only_array_or_object_results_are_valid = true) : bool Try to check if "$str" is a JSON-string. EXAMPLE: UTF8::is_json('{"array":[1,"¥","ä"]}'); // true
|
public static is_lowercase(string $str) : bool Returns true if the string contains only lower case chars, false otherwise.
|
public static is_printable(string $str, bool $ignore_control_characters = false) : bool Returns true if the string contains only printable (non-invisible) chars, false otherwise.
|
public static is_punctuation(string $str) : bool Returns true if the string contains only punctuation chars, false otherwise.
|
public static is_serialized(string $str) : bool Returns true if the string is serialized, false otherwise.
|
public static is_uppercase(string $str) : bool Returns true if the string contains only upper case chars, false otherwise.
|
public static is_url(string $url, bool $disallow_localhost = false) : bool Check if $url is a valid URL.
|
public static is_utf16( $str, bool $check_if_string_is_binary = true) Check if the string is UTF-16. EXAMPLE: UTF8::is_utf16(file_get_contents('utf-16-le.txt')); // 1 // UTF8::is_utf16(file_get_contents('utf-16-be.txt')); // 2 // UTF8::is_utf16(file_get_contents('utf-8.txt')); // false
|
public static is_utf32( $str, bool $check_if_string_is_binary = true) Check if the string is UTF-32. EXAMPLE: UTF8::is_utf32(file_get_contents('utf-32-le.txt')); // 1 // UTF8::is_utf32(file_get_contents('utf-32-be.txt')); // 2 // UTF8::is_utf32(file_get_contents('utf-8.txt')); // false
|
public static is_utf8( $str, bool $strict = false) : bool Checks whether the passed input contains only byte sequences that appear valid UTF-8. EXAMPLE: UTF8::is_utf8(['Iñtërnâtiônàlizætiøn', 'foo']); // true // UTF8::is_utf8(["Iñtërnâtiônàlizætiøn\xA0\xA1", 'bar']); // false
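
For a quick validity check without the library, `mb_check_encoding()` gives a similar answer. The wrapper below is a hypothetical sketch that accepts a string or a flat array of strings; it does not reproduce the `$strict` option.

```php
<?php
// Hypothetical helper: true if every given string is well-formed UTF-8.
function looks_like_utf8($input): bool
{
    // A plain string is wrapped into a one-element array by the cast.
    foreach ((array) $input as $value) {
        if (!mb_check_encoding((string) $value, 'UTF-8')) {
            return false;
        }
    }

    return true;
}

var_dump(looks_like_utf8(['Iñtërnâtiônàlizætiøn', 'foo']));          // bool(true)
// \xA0 and \xA1 are continuation bytes without a lead byte.
var_dump(looks_like_utf8(["Iñtërnâtiônàlizætiøn\xA0\xA1", 'bar'])); // bool(false)
```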
|
private static is_utf8_string(string $str, bool $strict = false) Checks whether the passed string contains only byte sequences that are valid UTF-8 characters. EXAMPLE: UTF8::is_utf8_string('Iñtërnâtiônàlizætiøn'); // true // UTF8::is_utf8_string("Iñtërnâtiônàlizætiøn\xA0\xA1"); // false
|
public static json_decode(string $json, bool $assoc = false, int $depth = 512, int $options = 0) (PHP 5 >= 5.2.0, PECL json >= 1.2.0) Decodes a JSON string EXAMPLE: UTF8::json_decode('[1,"¥","ä"]'); // array(1, '¥', 'ä')
|
public static json_encode( $value, int $options = 0, int $depth = 512) (PHP 5 >= 5.2.0, PECL json >= 1.2.0) Returns the JSON representation of a value. EXAMPLE: UTF8::json_encode(array(1, '¥', 'ä')); // '[1,"¥","ä"]'
|
public static json_loaded() : bool Checks whether JSON is available on the server.
|
public static lcfirst(string $str, string $encoding = 'UTF-8', bool $clean_utf8 = false, ?string $lang = NULL, bool $try_to_keep_the_string_length = false) : string Makes string's first char lowercase. EXAMPLE: UTF8::lcfirst('ÑTËRNÂTIÔNÀLIZÆTIØN'); // ñTËRNÂTIÔNÀLIZÆTIØN
|
public static lcwords(string $str, array $exceptions = [], string $char_list = '', string $encoding = 'UTF-8', bool $clean_utf8 = false, ?string $lang = NULL, bool $try_to_keep_the_string_length = false) : string Lowercase for all words in the string.
|
public static levenshtein(string $str1, string $str2, int $insertionCost = 1, int $replacementCost = 1, int $deletionCost = 1) : int Calculate Levenshtein distance between two strings. For better performance, in a real application with a single input string matched against many strings from a database, you will probably want to pre-encode the input only once and use \levenshtein().
|
public static ltrim(string $str = '', ?string $chars = NULL) : string Strip whitespace or other characters from the beginning of a UTF-8 string. EXAMPLE: UTF8::ltrim(' 中文空白 '); // '中文空白 '
|
public static max( $arg) Returns the UTF-8 character with the maximum code point in the given data. EXAMPLE: UTF8::max('abc-äöü-中文空白'); // '空'
|
public static max_chr_width(string $str) : int Calculates and returns the maximum number of bytes taken by any UTF-8 encoded character in the given string. EXAMPLE: UTF8::max_chr_width('Intërnâtiônàlizætiøn'); // 2
|
public static mbstring_loaded() : bool Checks whether mbstring is available on the server.
|
private static mbstring_overloaded() : bool Checks whether mbstring "overloaded" is active on the server.
|
public static min( $arg) Returns the UTF-8 character with the minimum code point in the given data. EXAMPLE: UTF8::min('abc-äöü-中文空白'); // '-'
|
public static normalize_encoding( $encoding, $fallback = '') Normalize the encoding-"name" input. EXAMPLE: UTF8::normalize_encoding('UTF8'); // 'UTF-8'
|
public static normalize_line_ending(string $str, $replacer = '
') : string Standardize line ending to unix-like.
|
public static normalize_msword(string $str) : string Normalize some MS Word special characters. EXAMPLE: UTF8::normalize_msword('„Abcdef…”'); // '"Abcdef..."'
|
public static normalize_whitespace(string $str, bool $keep_non_breaking_space = false, bool $keep_bidi_unicode_controls = false, bool $normalize_control_characters = false) : string Normalize the whitespace. EXAMPLE: UTF8::normalize_whitespace("abc-\xc2\xa0-öäü-\xe2\x80\xaf-\xE2\x80\xAC", true); // "abc-\xc2\xa0-öäü- -"
|
public static ord( $chr, string $encoding = 'UTF-8') : int Calculates Unicode code point of the given UTF-8 encoded character. INFO: opposite to UTF8::chr() EXAMPLE: UTF8::ord('☃'); // 0x2603
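
The same round trip between a character and its code point is available through the built-in `mb_chr()` and `mb_ord()` functions (PHP 7.2+). This sketch only illustrates the relationship and makes no claim about how the library implements it.

```php
<?php
// Code point -> character.
$snowman = mb_chr(0x2603, 'UTF-8');
var_dump($snowman === '☃'); // bool(true)

// Character -> code point.
var_dump(mb_ord('☃', 'UTF-8') === 0x2603); // bool(true)

// Print a code point in the usual U+XXXX notation.
printf("U+%04X\n", mb_ord('§', 'UTF-8')); // U+00A7
```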
|
public static parse_str(string $str, $result, bool $clean_utf8 = false) : bool Parses the string into an array (into the second parameter). WARNING: Unlike "parse_str()", this method does not (re-)place variables in the current scope, if the second parameter is not set! EXAMPLE: UTF8::parse_str('Iñtërnâtiônéàlizætiøn=測試&arr[]=foo+測試&arr[]=ການທົດສອບ', $array); echo $array['Iñtërnâtiônéàlizætiøn']; // '測試'
|
public static pcre_utf8_support() : bool Checks whether the "u" pattern modifier, which enables Unicode support in PCRE, is available.
|
public static range( $var1, $var2, bool $use_ctype = true, string $encoding = 'UTF-8', $step = 1) : array Create an array containing a range of UTF-8 characters. EXAMPLE: UTF8::range('κ', 'ζ'); // array('κ', 'ι', 'θ', 'η', 'ζ',)
|
public static rawurldecode(string $str, bool $multi_decode = true) : string Multi decode HTML entity + fix urlencoded-win1252-chars. EXAMPLE: UTF8::rawurldecode('tes%20öäü%20ítest+test'); // 'tes öäü ítest+test' e.g: 'test+test' => 'test+test' 'D&#252;sseldorf' => 'Düsseldorf' 'D%FCsseldorf' => 'Düsseldorf' 'D&#xFC;sseldorf' => 'Düsseldorf' 'D%26%23xFC%3Bsseldorf' => 'Düsseldorf' 'DÃ¼sseldorf' => 'Düsseldorf' 'D%C3%BCsseldorf' => 'Düsseldorf' 'D%C3%83%C2%BCsseldorf' => 'Düsseldorf' 'D%25C3%2583%25C2%25BCsseldorf' => 'Düsseldorf'
|
private static reduce_string_array(array $strings, bool $remove_empty_values, ?int $remove_short_values = NULL)
|
public static regex_replace(string $str, string $pattern, string $replacement, string $options = '', string $delimiter = '/') : string Replaces all occurrences of $pattern in $str by $replacement.
|
public static remove_bom(string $str) : string Remove the BOM from UTF-8 / UTF-16 / UTF-32 strings. EXAMPLE: UTF8::remove_bom("\xEF\xBB\xBFΜπορώ να"); // 'Μπορώ να'
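
For the UTF-8 case alone, stripping a BOM comes down to comparing the first three bytes. The sketch below uses a hypothetical `strip_utf8_bom()` helper and does not handle the UTF-16 or UTF-32 marks that the method above also covers.

```php
<?php
// The UTF-8 byte order mark is the byte sequence EF BB BF.
const UTF8_BOM = "\xEF\xBB\xBF";

// Hypothetical helper: remove a leading UTF-8 BOM if one is present.
function strip_utf8_bom(string $str): string
{
    if (strncmp($str, UTF8_BOM, 3) === 0) {
        // The cast guards against substr() returning false on PHP < 8.
        return (string) substr($str, 3);
    }

    return $str;
}

echo strip_utf8_bom("\xEF\xBB\xBFΜπορώ να"); // Μπορώ να
```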
|
public static remove_duplicates(string $str, $what = ' ') : string Removes duplicate occurrences of a string in another string. EXAMPLE: UTF8::remove_duplicates('öäü-κόσμεκόσμε-äöü', 'κόσμε'); // 'öäü-κόσμε-äöü'
|
public static remove_html(string $str, string $allowable_tags = '') : string Remove html via "strip_tags()" from the string.
|
public static remove_html_breaks(string $str, string $replacement = '') : string Remove all breaks [<br> | \r\n | \r | \n | ...] from the string.
|
public static remove_ileft(string $str, string $substring, string $encoding = 'UTF-8') : string Returns a new string with the prefix $substring removed, if present and case-insensitive.
|
public static remove_invisible_characters(string $str, bool $url_encoded = false, string $replacement = '', bool $keep_basic_control_characters = true) : string Remove invisible characters from a string. e.g.: This prevents sandwiching null characters between ascii characters, like Java\0script. EXAMPLE: UTF8::remove_invisible_characters("κόσ\0με"); // 'κόσμε' Copied from https://github.com/bcit-ci/CodeIgniter/blob/develop/system/core/Common.php
|
public static remove_iright(string $str, string $substring, string $encoding = 'UTF-8') : string Returns a new string with the suffix $substring removed, if present and case-insensitive.
|
public static remove_left(string $str, string $substring, string $encoding = 'UTF-8') : string Returns a new string with the prefix $substring removed, if present.
|
public static remove_right(string $str, string $substring, string $encoding = 'UTF-8') : string Returns a new string with the suffix $substring removed, if present.
|
public static replace(string $str, string $search, string $replacement, bool $case_sensitive = true) : string Replaces all occurrences of $search in $str by $replacement.
|
public static replace_all(string $str, array $search, $replacement, bool $case_sensitive = true) : string Replaces all occurrences of $search in $str by $replacement.
|