public final class Utf8 extends Object
The variant of UTF-8 implemented by this class is the restricted definition of UTF-8 introduced in Unicode 3.1. One implication of this is that it rejects "non-shortest form" byte sequences, even though the JDK decoder may accept them.
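To make "non-shortest form" concrete, here is a minimal sketch for the two-byte case only. The class `ShortestFormSketch` and the method `isWellFormedTwoByte` are hypothetical names introduced for illustration; they are not part of `Utf8`. The sketch relies on the fact that lead bytes 0xC0 and 0xC1 could only encode code points below U+0080, which must be written as a single byte.

```java
// Illustrative sketch only; these names are hypothetical and not part of Utf8.
final class ShortestFormSketch {

    /**
     * Returns true if the two bytes form a well-formed, shortest-form
     * two-byte UTF-8 sequence (U+0080 through U+07FF).
     */
    public static boolean isWellFormedTwoByte(byte b1, byte b2) {
        int lead = b1 & 0xFF;
        int trail = b2 & 0xFF;
        // Lead bytes 0xC0 and 0xC1 would encode U+0000..U+007F in two bytes,
        // which is a non-shortest ("overlong") form and must be rejected.
        boolean leadOk = lead >= 0xC2 && lead <= 0xDF;
        // Every continuation byte has the bit pattern 10xxxxxx.
        boolean trailOk = (trail & 0xC0) == 0x80;
        return leadOk && trailOk;
    }

    public static void main(String[] args) {
        // 0xC3 0xA9 is the shortest-form encoding of U+00E9.
        System.out.println(isWellFormedTwoByte((byte) 0xC3, (byte) 0xA9));
        // 0xC0 0x80 is an overlong encoding of U+0000.
        System.out.println(isWellFormedTwoByte((byte) 0xC0, (byte) 0x80));
    }
}
```

A full validator would apply the analogous range checks to three-byte and four-byte sequences as well.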
| Constructor and Description |
|---|
| Utf8() |
| Modifier and Type | Method and Description |
|---|---|
| static int | encodedLength(CharSequence sequence) Returns the number of bytes in the UTF-8-encoded form of sequence. |
| private static int | encodedLengthGeneral(CharSequence sequence, int start) |
| private static String | unpairedSurrogateMsg(int i) |
public static int encodedLength(CharSequence sequence)
Returns the number of bytes in the UTF-8-encoded form of sequence. For a string, this method is equivalent to string.getBytes(UTF_8).length, but is more efficient in both time and space.
Throws: IllegalArgumentException - if sequence contains ill-formed UTF-16 (unpaired surrogates)
private static int encodedLengthGeneral(CharSequence sequence, int start)
private static String unpairedSurrogateMsg(int i)
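The documented contract of encodedLength can be illustrated with a simplified stand-in. The sketch below is not this class's implementation: `EncodedLengthSketch` and its `encodedLength` method are hypothetical, written only to show how a UTF-8 byte count can be derived from UTF-16 code units without allocating a byte array, and how an unpaired surrogate leads to an IllegalArgumentException.

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch only; a hypothetical stand-in, not the Utf8 class itself.
final class EncodedLengthSketch {

    /** Counts the UTF-8 bytes needed for the given UTF-16 sequence. */
    public static int encodedLength(CharSequence sequence) {
        int bytes = 0;
        for (int i = 0; i < sequence.length(); i++) {
            char c = sequence.charAt(i);
            if (c < 0x80) {
                bytes += 1; // ASCII
            } else if (c < 0x800) {
                bytes += 2; // U+0080..U+07FF
            } else if (Character.isHighSurrogate(c)) {
                // A high surrogate must be followed by a low surrogate.
                if (i + 1 >= sequence.length()
                        || !Character.isLowSurrogate(sequence.charAt(i + 1))) {
                    throw new IllegalArgumentException("Unpaired surrogate at index " + i);
                }
                bytes += 4; // supplementary code point
                i++;        // skip the low surrogate just consumed
            } else if (Character.isLowSurrogate(c)) {
                // A low surrogate with no preceding high surrogate is ill-formed.
                throw new IllegalArgumentException("Unpaired surrogate at index " + i);
            } else {
                bytes += 3; // remaining BMP characters
            }
        }
        return bytes;
    }

    public static void main(String[] args) {
        String s = "h\u00e9llo \u20ac \uD83D\uDE00";
        // Both lines print the same count for well-formed input.
        System.out.println(encodedLength(s));
        System.out.println(s.getBytes(StandardCharsets.UTF_8).length);
    }
}
```

The counting loop never materializes the encoded bytes, which is where the time and space savings over string.getBytes(UTF_8).length would come from in an approach like this.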
Copyright © 2022 ScalAgent D.T.. All rights reserved.