@Stringable @InterfaceAudience.Public @InterfaceStability.Stable public class Text extends BinaryComparable implements WritableComparable<BinaryComparable>
In addition, it provides methods for string traversal without converting the byte array to a string.
It also includes utilities for serializing/deserializing a string, encoding/decoding a string, checking whether a byte array contains valid UTF-8, and calculating the length of an encoded string.
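The last utility mentioned above, calculating the length of an encoded string, can be pictured with a small JDK-only sketch. The class name `Utf8LengthSketch` and its method are hypothetical and are not the implementation used by `Text`; the sketch only shows that the UTF-8 byte count can be derived by walking code points, without building the encoded byte array.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Hadoop: counts UTF-8 bytes by code point.
final class Utf8LengthSketch {
    /**
     * Returns the number of UTF-8 bytes needed for s, assuming s is
     * well-formed UTF-16. An unpaired surrogate is counted as 3 bytes here,
     * which may differ from what a real encoder does with malformed input.
     */
    static int utf8Length(String s) {
        int bytes = 0;
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (cp < 0x80) {
                bytes += 1;          // ASCII
            } else if (cp < 0x800) {
                bytes += 2;          // two-byte sequence
            } else if (cp < 0x10000) {
                bytes += 3;          // rest of the Basic Multilingual Plane
            } else {
                bytes += 4;          // supplementary planes
            }
            i += Character.charCount(cp);
        }
        return bytes;
    }

    public static void main(String[] args) {
        String s = "h\u00e9llo \u20ac \uD83D\uDE00";
        System.out.println("chars      = " + s.length());
        System.out.println("utf8Length = " + utf8Length(s));
        System.out.println("encoded    = " + s.getBytes(StandardCharsets.UTF_8).length);
    }
}
```

For well-formed input the computed value matches the length of the array produced by `String.getBytes(StandardCharsets.UTF_8)`, while the `char` count is generally smaller for non-ASCII text.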
| Modifier and Type | Field and Description | 
|---|---|
| static int | DEFAULT_MAX_LEN | 
| Constructor | Description |
|---|---|
| Text() | |
| Text(byte[] utf8) | Construct from a byte array. |
| Text(String string) | Construct from a string. |
| Text(Text utf8) | Construct from another text. |
| Modifier and Type | Method | Description |
|---|---|---|
| void | append(byte[] utf8, int start, int len) | Append a range of bytes to the end of the given text. |
| static int | bytesToCodePoint(ByteBuffer bytes) | Returns the next code point at the current position in the buffer. |
| int | charAt(int position) | Returns the Unicode Scalar Value (32-bit integer value) for the character at position. |
| void | clear() | Clear the string to empty. |
| byte[] | copyBytes() | Get a copy of the bytes that is exactly the length of the data. |
| static String | decode(byte[] utf8) | Converts the provided byte array to a String using the UTF-8 encoding. |
| static String | decode(byte[] utf8, int start, int length) | |
| static String | decode(byte[] utf8, int start, int length, boolean replace) | Converts the provided byte array to a String using the UTF-8 encoding. |
| static ByteBuffer | encode(String string) | Converts the provided String to bytes using the UTF-8 encoding. |
| static ByteBuffer | encode(String string, boolean replace) | Converts the provided String to bytes using the UTF-8 encoding. |
| boolean | equals(Object o) | Returns true iff o is a Text with the same contents. |
| int | find(String what) | |
| int | find(String what, int start) | Finds any occurrence of what in the backing buffer, starting at position start. |
| byte[] | getBytes() | Returns the raw bytes; however, only data up to getLength() is valid. |
| int | getLength() | Returns the number of bytes in the byte array. |
| int | hashCode() | Return a hash of the bytes returned from getBytes(). |
| void | readFields(DataInput in) | Deserialize. |
| void | readFields(DataInput in, int maxLength) | |
| static String | readString(DataInput in) | Read a UTF8 encoded string from in. |
| static String | readString(DataInput in, int maxLength) | Read a UTF8 encoded string with a maximum size. |
| void | readWithKnownLength(DataInput in, int len) | Read a Text object whose length is already known. |
| void | set(byte[] utf8) | Set to a utf8 byte array. |
| void | set(byte[] utf8, int start, int len) | Set the Text to range of bytes. |
| void | set(String string) | Set to contain the contents of a string. |
| void | set(Text other) | Copy a text. |
| static void | skip(DataInput in) | Skips over one Text in the input. |
| String | toString() | Convert text back to string. |
| static int | utf8Length(String string) | For the given string, returns the number of UTF-8 bytes required to encode the string. |
| static void | validateUTF8(byte[] utf8) | Check if a byte array contains valid utf-8. |
| static void | validateUTF8(byte[] utf8, int start, int len) | Check to see if a byte array is valid utf-8. |
| void | write(DataOutput out) | Serialize: write this object to out. The length uses zero-compressed encoding. |
| void | write(DataOutput out, int maxLength) | |
| static int | writeString(DataOutput out, String s) | Write a UTF8 encoded string to out. |
| static int | writeString(DataOutput out, String s, int maxLength) | Write a UTF8 encoded string with a maximum size to out. |
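The writeString/readString pairs in the table follow a length-prefixed pattern: a length, then that many UTF-8 bytes, with an optional upper bound on the length. The following is a minimal JDK-only sketch of that pattern under stated assumptions. The class `LengthPrefixedStringSketch` is hypothetical, it uses a fixed 4-byte length prefix rather than the zero-compressed encoding that `Text` uses, and the choice to signal an over-limit length with an `IOException` is an assumption of this sketch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Hadoop: length-prefixed UTF-8 strings.
final class LengthPrefixedStringSketch {
    /** Writes a 4-byte length followed by the UTF-8 bytes; returns the payload size. */
    static int writeString(DataOutput out, String s, int maxLength) throws IOException {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        if (utf8.length > maxLength) {
            throw new IOException("string of " + utf8.length + " bytes exceeds limit " + maxLength);
        }
        out.writeInt(utf8.length);   // simplification: fixed-width length prefix
        out.write(utf8);
        return utf8.length;
    }

    /** Reads the length prefix, checks it against maxLength, then reads the bytes. */
    static String readString(DataInput in, int maxLength) throws IOException {
        int length = in.readInt();
        if (length < 0 || length > maxLength) {
            throw new IOException("declared length " + length + " is outside 0.." + maxLength);
        }
        byte[] utf8 = new byte[length];
        in.readFully(utf8);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    /** Serializes s into memory and reads it back. */
    static String roundTrip(String s, int maxLength) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            writeString(new DataOutputStream(buffer), s, maxLength);
            DataInput in = new DataInputStream(new ByteArrayInputStream(buffer.toByteArray()));
            return readString(in, maxLength);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns true if writing s with the given limit is refused. */
    static boolean rejects(String s, int maxLength) {
        try {
            writeString(new DataOutputStream(new ByteArrayOutputStream()), s, maxLength);
            return false;
        } catch (IOException e) {
            return true;
        }
    }
}
```

Checking the declared length before allocating the array is the point of the bound on the read side: a corrupt or hostile stream cannot force a very large allocation.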
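Methods inherited from class BinaryComparable: compareTo, compareTo
Methods inherited from class Object: clone, finalize, getClass, notify, notifyAll, wait, wait, wait
Methods inherited from interface Comparable: compareTo

public static final int DEFAULT_MAX_LEN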

public Text()

public Text(String string)
Construct from a string.

public Text(Text utf8)
Construct from another text.

public Text(byte[] utf8)
Construct from a byte array.
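
public byte[] copyBytes()
Get a copy of the bytes that is exactly the length of the data. See getBytes() for faster access to the underlying array.

public byte[] getBytes()
Returns the raw bytes; however, only data up to getLength() is valid. Please use copyBytes() if you need the returned array to be precisely the length of the data.
Specified by:
getBytes in class BinaryComparable

public int getLength()
Returns the number of bytes in the byte array.
Specified by:
getLength in class BinaryComparable

public int charAt(int position)
Returns the Unicode Scalar Value (32-bit integer value) for the character at position. Note that this method avoids using the converter or doing String instantiation.

public int find(String what)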
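Reading a character without String instantiation, as charAt is described above, amounts to decoding one UTF-8 sequence directly from the byte array. The sketch below shows that idea with JDK-only code. The class `CodePointAtByteSketch` is hypothetical, the `-1` return for an unusable position is an assumption of this sketch, and it does not reject overlong encodings, so it is not a full validator.

```java
// Hypothetical helper, not part of Hadoop: decode one code point in place.
final class CodePointAtByteSketch {
    /**
     * Decodes the code point whose first byte is at the given byte position.
     * Only utf8[0..length) is considered valid data. Returns -1 if the
     * position is out of range, points into the middle of a sequence, or the
     * sequence is truncated.
     */
    static int codePointAt(byte[] utf8, int length, int position) {
        if (position < 0 || position >= length) {
            return -1;
        }
        int lead = utf8[position] & 0xFF;
        if (lead < 0x80) {
            return lead;                       // single-byte (ASCII)
        }
        int extra;                             // number of continuation bytes
        int cp;                                // payload bits collected so far
        if ((lead & 0xE0) == 0xC0) {
            extra = 1; cp = lead & 0x1F;
        } else if ((lead & 0xF0) == 0xE0) {
            extra = 2; cp = lead & 0x0F;
        } else if ((lead & 0xF8) == 0xF0) {
            extra = 3; cp = lead & 0x07;
        } else {
            return -1;                         // continuation byte or invalid lead
        }
        if (position + extra >= length) {
            return -1;                         // sequence runs past the valid data
        }
        for (int i = 1; i <= extra; i++) {
            int b = utf8[position + i] & 0xFF;
            if ((b & 0xC0) != 0x80) {
                return -1;                     // not a continuation byte
            }
            cp = (cp << 6) | (b & 0x3F);
        }
        return cp;
    }
}
```

Note that the position is a byte offset, so in the bytes of `"a€b"` the euro sign starts at offset 1 and the final `b` sits at offset 4.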

public int find(String what, int start)
Finds any occurrence of what in the backing buffer, starting at position start. The starting position is measured in bytes and the return value is in terms of byte position in the buffer. The backing buffer is not converted to a string for this operation.

public void set(String string)
Set to contain the contents of a string.
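Because find works on byte positions, its result can differ from `String.indexOf` whenever the text before the match contains multi-byte characters. The following JDK-only sketch illustrates a byte-level search of that kind. The class `ByteFindSketch` is hypothetical and uses a plain nested-loop search; it is not the algorithm used by `Text`.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Hadoop: search UTF-8 bytes for a string.
final class ByteFindSketch {
    /**
     * Searches buffer[0..length) for the UTF-8 encoding of what, starting at
     * byte offset start. Returns the byte offset of the first match, or -1.
     */
    static int find(byte[] buffer, int length, String what, int start) {
        if (start < 0) {
            return -1;
        }
        byte[] needle = what.getBytes(StandardCharsets.UTF_8);
        for (int i = start; i + needle.length <= length; i++) {
            int j = 0;
            while (j < needle.length && buffer[i + j] == needle[j]) {
                j++;
            }
            if (j == needle.length) {
                return i;            // every needle byte matched at offset i
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String text = "\u20acuro \u20ac";
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        System.out.println("char index: " + text.indexOf("uro"));
        System.out.println("byte index: " + find(utf8, utf8.length, "uro", 0));
    }
}
```

A byte-wise comparison is safe for well-formed UTF-8 because a lead byte never looks like a continuation byte, so a valid needle cannot match starting in the middle of a character.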

public void set(byte[] utf8)
Set to a utf8 byte array.

public void set(Text other)
Copy a text.
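The earlier remark that only data up to getLength() is valid in the array returned by getBytes() is easiest to see with a reusable backing array. The sketch below is a simplified, JDK-only model of that behavior. The class `ReusableBytesSketch` is hypothetical, and its growth policy (allocate exactly the requested size) is an assumption chosen for brevity, not a description of how `Text` resizes.

```java
import java.util.Arrays;

// Hypothetical model, not part of Hadoop: a reusable byte buffer with a
// separate "valid length", mirroring the getBytes()/getLength() contract.
final class ReusableBytesSketch {
    private byte[] bytes = new byte[0];
    private int length = 0;

    /** Copies len bytes from src, reusing the backing array when it is large enough. */
    void set(byte[] src, int start, int len) {
        if (bytes.length < len) {
            bytes = new byte[len];   // assumption: grow to exactly the needed size
        }
        System.arraycopy(src, start, bytes, 0, len);
        length = len;
    }

    /** Returns the backing array itself; only the first getLength() bytes are valid. */
    byte[] getBytes() {
        return bytes;
    }

    int getLength() {
        return length;
    }

    /** Returns a fresh array holding exactly the valid bytes. */
    byte[] copyBytes() {
        return Arrays.copyOf(bytes, length);
    }

    /** Empties the content but keeps the backing array for reuse. */
    void clear() {
        length = 0;
    }
}
```

After setting an 11-byte value and then a 2-byte value on the same instance, `getBytes()` in this model still returns an 11-byte array with stale data past offset 2, while `copyBytes()` returns exactly 2 bytes.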

public void set(byte[] utf8, int start, int len)
Set the Text to range of bytes.
Parameters:
utf8 - the data to copy from
start - the first position of the new string
len - the number of bytes of the new string

public void append(byte[] utf8, int start, int len)
Append a range of bytes to the end of the given text.
Parameters:
utf8 - the data to copy from
start - the first position to append from utf8
len - the number of bytes to append

public void clear()
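Clear the string to empty. Note: For performance reasons, this call does not clear the underlying byte array that is retrievable via getBytes(). In order to free the byte-array memory, call set(byte[]) with an empty byte array (for example, new byte[0]).

public String toString()
Convert text back to string.
Overrides:
toString in class Object
See Also:
Object.toString()

public void readFields(DataInput in) throws IOException
Deserialize.
Specified by:
readFields in interface Writable
Parameters:
in - DataInput to deserialize this object from.
Throws:
IOException

public void readFields(DataInput in, int maxLength) throws IOException
Throws:
IOException

public static void skip(DataInput in) throws IOException
Skips over one Text in the input.
Throws:
IOException

public void readWithKnownLength(DataInput in, int len) throws IOException
Read a Text object whose length is already known.
Throws:
IOException

public void write(DataOutput out) throws IOException
Serialize: write this object to out. The length uses zero-compressed encoding.
Specified by:
write in interface Writable
Parameters:
out - DataOutput to serialize this object into.
Throws:
IOException
See Also:
Writable.write(DataOutput)

public void write(DataOutput out, int maxLength) throws IOException
Throws:
IOException

public boolean equals(Object o)
Returns true iff o is a Text with the same contents.
Overrides:
equals in class BinaryComparable

public int hashCode()
Description copied from class: BinaryComparable
Return a hash of the bytes returned from getBytes().
Overrides:
hashCode in class BinaryComparable
See Also:
WritableComparator.hashBytes(byte[],int)

public static String decode(byte[] utf8) throws CharacterCodingException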
Converts the provided byte array to a String using the UTF-8 encoding.
Throws:
CharacterCodingException

public static String decode(byte[] utf8, int start, int length) throws CharacterCodingException
Throws:
CharacterCodingException

public static String decode(byte[] utf8, int start, int length, boolean replace) throws CharacterCodingException
Converts the provided byte array to a String using the UTF-8 encoding. If replace is true, then malformed input is replaced with the substitution character, which is U+FFFD. Otherwise the method throws a MalformedInputException.
Throws:
CharacterCodingException

public static ByteBuffer encode(String string) throws CharacterCodingException
Converts the provided String to bytes using the UTF-8 encoding.
Throws:
CharacterCodingException

public static ByteBuffer encode(String string, boolean replace) throws CharacterCodingException
Converts the provided String to bytes using the UTF-8 encoding. If replace is true, then malformed input is replaced with the substitution character, which is U+FFFD. Otherwise the method throws a MalformedInputException.
Throws:
CharacterCodingException

public static String readString(DataInput in) throws IOException
Read a UTF8 encoded string from in.
Throws:
IOException

public static String readString(DataInput in, int maxLength) throws IOException
Read a UTF8 encoded string with a maximum size.
Throws:
IOException

public static int writeString(DataOutput out, String s) throws IOException
Write a UTF8 encoded string to out.
Throws:
IOException

public static int writeString(DataOutput out, String s, int maxLength) throws IOException
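Write a UTF8 encoded string with a maximum size to out.
Throws:
IOException

public static void validateUTF8(byte[] utf8) throws MalformedInputException
Check if a byte array contains valid utf-8.
Parameters:
utf8 - byte array
Throws:
MalformedInputException - if the byte array contains invalid utf-8

public static void validateUTF8(byte[] utf8, int start, int len) throws MalformedInputException
Check to see if a byte array is valid utf-8.
Parameters:
utf8 - the array of bytes
start - the offset of the first byte in the array
len - the length of the byte sequence
Throws:
MalformedInputException - if the byte array contains invalid bytes

public static int bytesToCodePoint(ByteBuffer bytes)
Returns the next code point at the current position in the buffer.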
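The replace flag on decode and the validateUTF8 checks described above correspond to two standard ways of handling malformed bytes: substitute U+FFFD, or report an error. The sketch below reproduces both with the JDK's `CharsetDecoder`. The class `Utf8DecodeSketch` and its method names are hypothetical, and this is an illustration of the behavior rather than the code `Text` runs internally.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Hadoop: strict versus replacing decode.
final class Utf8DecodeSketch {
    /**
     * Decodes utf8[start..start+length). With replace == true, malformed input
     * becomes U+FFFD; otherwise a CharacterCodingException is thrown.
     */
    static String decode(byte[] utf8, int start, int length, boolean replace)
            throws CharacterCodingException {
        CodingErrorAction action = replace ? CodingErrorAction.REPLACE : CodingErrorAction.REPORT;
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(action)
                .onUnmappableCharacter(action);
        return decoder.decode(ByteBuffer.wrap(utf8, start, length)).toString();
    }

    /** Validation expressed as a strict decode whose result is discarded. */
    static boolean isValidUtf8(byte[] utf8) {
        try {
            decode(utf8, 0, utf8.length, false);
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    /** Convenience wrapper: decode the whole array, replacing malformed input. */
    static String decodeLenient(byte[] utf8) {
        try {
            return decode(utf8, 0, utf8.length, true);
        } catch (CharacterCodingException e) {
            // Not expected with REPLACE; kept so callers need no checked handling.
            throw new IllegalStateException(e);
        }
    }
}
```

Strict mode suits data that must round-trip unchanged, while replacement suits display or logging paths where a lossy but readable result is acceptable.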
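
public static int utf8Length(String string)
For the given string, returns the number of UTF-8 bytes required to encode the string.
Parameters:
string - text to encode

Copyright © 2022 Apache Software Foundation. All rights reserved.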