Ustr

com.textuality
Class Ustr

java.lang.Object
  |
  +--com.textuality.Ustr

All Implemented Interfaces:: java.lang.Comparable, java.io.Serializable

public class Ustr
extends java.lang.Object
implements java.lang.Comparable, java.io.Serializable

Ustr - rhymes with Wooster. Implements a string, with three design goals:

Correct implementation of Unicode semantics.
Support for as many of java's String and StringBuffer methods as is reasonable.
Support for the familiar null-terminated-string primitives of the C programming language: strcpy() and friends.

Note that in the context of a Ustr, "index" always means how many Unicode characters you are into the Ustr's text, while "offset" always means how many bytes you are into its UTF-8 encoded form.

Similarly, "char" and "String" always refer to the Java constructs, while "character" always means a Unicode character, always identified by a Java "int".
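The index/offset distinction can be illustrated without Ustr itself. The following is a minimal standalone sketch using only the JDK; the class and method names (IndexVsOffset, offsetOfIndex, sequenceLength, nullTerminated) are hypothetical and not part of com.textuality. It assumes well-formed, null-terminated UTF-8 and walks the bytes to find the offset at which a given character index starts.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Standalone sketch; does not use com.textuality.Ustr.
class IndexVsOffset {
    // Byte offset at which the character with the given index starts.
    // Assumes well-formed, null-terminated UTF-8.
    static int offsetOfIndex(byte[] utf8, int index) {
        int offset = 0;
        for (int i = 0; i < index; i++) {
            if (utf8[offset] == 0) {
                throw new IndexOutOfBoundsException("index " + index);
            }
            offset += sequenceLength(utf8[offset]);
        }
        return offset;
    }

    // Number of bytes in the UTF-8 sequence that starts with this lead byte.
    static int sequenceLength(byte lead) {
        int b = lead & 0xFF;
        if (b < 0x80) return 1;
        if (b < 0xE0) return 2;
        if (b < 0xF0) return 3;
        return 4;
    }

    // UTF-8 bytes of a String followed by a single 0 byte.
    static byte[] nullTerminated(String s) {
        byte[] raw = s.getBytes(StandardCharsets.UTF_8);
        return Arrays.copyOf(raw, raw.length + 1);
    }

    public static void main(String[] args) {
        // Five characters: 'a', U+00E9, U+20AC, U+1F600, 'z'
        byte[] text = nullTerminated("a\u00e9\u20ac\ud83d\ude00z");
        for (int i = 0; i <= 4; i++) {
            System.out.println("index " + i + " -> offset " + offsetOfIndex(text, i));
        }
    }
}
```

The five characters occupy 1, 2, 3, 4 and 1 bytes respectively, so indices 0 through 4 map to offsets 0, 1, 3, 6 and 10.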

For any method that copies characters and might overrun a buffer, a "safe" version is provided, whose name starts with an extra s, e.g. sstrcpy and sstrcat. These versions always arrange that the copied string does not overrun the provided buffer, which will be properly null-terminated.
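As a rough illustration of what "safe" means here, the sketch below copies a null-terminated byte sequence into a destination that may be too small, truncating as needed and always writing a terminator. It is a standalone approximation, not Ustr's code: the name safeCopy is hypothetical, and truncating at a raw byte boundary (which can split a multi-byte character) is an assumption of this sketch, since the documentation does not say how Ustr treats that case.

```java
import java.nio.charset.StandardCharsets;

// Standalone sketch; does not use com.textuality.Ustr.
class SafeCopy {
    // Copies bytes from 'from' up to its null terminator, but never writes
    // past the end of 'to'. The result is always null-terminated.
    // Assumption of this sketch: truncation happens at a byte boundary.
    static byte[] safeCopy(byte[] to, byte[] from) {
        if (to.length == 0) {
            return to; // no room even for a terminator
        }
        int i = 0;
        while (i < to.length - 1 && i < from.length && from[i] != 0) {
            to[i] = from[i];
            i++;
        }
        to[i] = 0;
        return to;
    }

    public static void main(String[] args) {
        byte[] source = "hello\0".getBytes(StandardCharsets.US_ASCII);
        byte[] small = safeCopy(new byte[4], source);
        // Three bytes of text fit, followed by the terminator.
        System.out.println(new String(small, 0, 3, StandardCharsets.US_ASCII));
    }
}
```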

See Also:: Serialized Form

Field Summary

int base
          Where in the array s the string starts.

int offset
          To keep track of a single character position within the string; this is used by the nextChar and appendChar methods.

byte[] s
          A byte array containing the string, always in UTF-8 form.

Constructor Summary

Ustr()
          Creates an empty Ustr with no buffer

Ustr(byte[] bytes)
          Wraps a Ustr around a buffer.

Ustr(byte[] bytes, int start)
          Wraps a Ustr around a position in a buffer.

Ustr(char[] chars)
          Makes a Ustr from a char[] array.

Ustr(int length)
          Creates an empty Ustr, with a null termination at the front.

Ustr(int[] ints)
          Makes a Ustr from an int[] array.

Ustr(int space, java.lang.Object o)
          Makes a Ustr from an object, based on its toString(), leaving room for growth.

Ustr(java.lang.Object o)
          Makes a Ustr from an object, based on its toString().

Method Summary

void appendChar(int c)
          Append one Unicode character to a Ustr.

static int appendChar(int c, byte[] s, int offset)
          Writes one Unicode character into a UTF-8 encoded byte array at a given offset, and null-terminates it.

int charAt(int at)
          Find the Unicode character at some index in a Ustr.

int compareTo(java.lang.Object other)
          Supports the Comparable interface.

com.textuality.Ustr concat(java.lang.String str)
          Append a String to the end of this.

com.textuality.Ustr concat(com.textuality.Ustr us)
          Append a Ustr to the end of this.

boolean endsWith(java.lang.String suffix)
          Test if this Ustr ends with specified suffix (a String).

boolean endsWith(com.textuality.Ustr suffix)
          Test if this Ustr ends with the specified suffix (a Ustr).

boolean equals(java.lang.Object anObject)
          Compares this Ustr to another object.

byte[] getBytes()
          Convert this Ustr into bytes according to the platform's default character encoding, storing the result in a new byte array.

byte[] getBytes(java.lang.String enc)
          Convert this Ustr into bytes according to the specified character encoding, storing the result into a new byte array.

void getChars(int srcBegin, int srcEnd, char[] dst, int dstBegin)
          Copies Unicode characters from this Ustr into the destination char array.

static void getChars(java.lang.String str, int srcBegin, int srcEnd, char[] dst, int dstBegin)
          Copies Unicode characters from this String into the destination char array.

int hashCode()
          Returns a hashcode for this Ustr.

int indexOf(int ch)
          Returns the first index within this Ustr of the specified Unicode character.

int indexOf(int ch, int start)
          Returns the first index within this Ustr of the specified character, starting at the specified index.

int indexOf(com.textuality.Ustr us)
          Returns the index within this Ustr of the first occurrence of the specified other Ustr, or -1.

int indexOf(com.textuality.Ustr us, int start)
          Returns the index within this Ustr of the first occurrence of the specified other Ustr starting at the given index, or -1.

void init()
          Empty a Ustr by setting its first byte to 0.

com.textuality.Ustr intern()
          Returns a canonical version of the Ustr, which should be treated as read-only.

int lastIndexOf(int ch)
          Returns the index within this Ustr of the last occurrence of the specified Unicode character.

int lastIndexOf(int ch, int stop)
          Returns the index within this Ustr of the last occurrence of the specified Unicode character before the specified stop index.

int lastIndexOf(com.textuality.Ustr us)
          Finds the last substring match.

int lastIndexOf(com.textuality.Ustr us, int stop)
          Finds the last substring match before the given index.

int length()
          Length of a Ustr in Unicode characters (not bytes).

static int length(byte[] b)
          Number of Unicode characters stored in a byte array.

static int length(byte[] b, int offset)
          Number of Unicode characters stored starting at some offset in a byte array.

static int length(java.lang.String str)
          Number of Unicode characters stored in a Java string.

int nextChar()
          Retrieve one Unicode character from a Ustr and advance the working offset.

static void nextChar(byte[] b, int[] answer)
          Retrieve one Unicode character from some offset in byte buffer.

void prepareAppend()
          Set up for appendChar.

void prepareNext()
          Set up for nextChar().

com.textuality.Ustr replace(int oldChar, int newChar)
          Returns a new Ustr with all instances of one Unicode character replaced by another.

byte[] sstrcat(byte[] to, byte[] from)
          Safely append one null-terminated byte array to another.

static byte[] sstrcat(byte[] to, int tbase, byte[] from, int fbase)
          Safely append one null-terminated byte array to another with control over offsets.

com.textuality.Ustr sstrcat(com.textuality.Ustr from)
          Safely append one Ustr to another.

static byte[] sstrcpy(byte[] to, byte[] from)
          Safely copy a null-terminated byte array.

static byte[] sstrcpy(byte[] to, int tbase, byte[] from, int fbase)
          Safely copy null-terminated byte arrays with control over offsets.

com.textuality.Ustr sstrcpy(com.textuality.Ustr from)
          Safely copy in the contents of another Ustr.

boolean startsWith(com.textuality.Ustr us)
          Tests if other Ustr is prefix of this.

boolean startsWith(com.textuality.Ustr us, int start)
          Tests if other Ustr is prefix at given index.

static byte[] strcat(byte[] to, byte[] from)
          Copy one null-terminated byte array to the end of another.

static byte[] strcat(byte[] to, int tbase, byte[] from, int fbase)
          Copy one null-terminated array to the end of another, with starting offsets for each

com.textuality.Ustr strcat(com.textuality.Ustr other)
          Append the contents of another Ustr to the end of this one

static int strchr(byte[] b, int c)
          Find the offset where a Unicode character starts in a null-terminated UTF-encoded byte array.

com.textuality.Ustr strchr(int c)
          Locate a Unicode character in a Ustr.

static int strcmp(byte[] s1, byte[] s2)
          Compare two null-terminated byte arrays.

static int strcmp(byte[] s1, int s1base, byte[] s2, int s2base)
          Compare sections of two null-terminated byte arrays.

int strcmp(java.lang.Object o)
          Compare a Ustr to an object's String representation.

int strcmp(com.textuality.Ustr other)
          Compare two Ustrs.

com.textuality.Ustr strcpy(byte[] from)
          Copy in the contents of a null-terminated byte array.

static byte[] strcpy(byte[] to, byte[] from)
          Copy a null-terminated byte array.

com.textuality.Ustr strcpy(byte[] from, int boffset)
          Copy in the contents at some offset in a null-terminated byte array.

static byte[] strcpy(byte[] to, int tbase, byte[] from, int fbase)
          Copy null-terminated byte arrays with control over offsets.

static byte[] strcpy(byte[] b, int offset, java.lang.String s)
          Load a null-terminated UTF-8 encoding of a String into a byte array.

static byte[] strcpy(byte[] b, java.lang.String s)
          Load a null-terminated UTF-8 encoding of a String into a byte array at the front.

com.textuality.Ustr strcpy(java.lang.Object o)
          Copy in the String representation of an Object.

com.textuality.Ustr strcpy(com.textuality.Ustr from)
          Copy in the contents of another Ustr.

int strlen()
          The length in bytes of a Ustr's UTF representation.

static int strlen(byte[] b)
          The length in bytes of a null-terminated byte array

static int strlen(byte[] b, int base)
          The length in bytes of a null-terminated sequence starting at some offset in a byte array.

static int strrchr(byte[] b, int c)
          Find the offset of the last appearance of a Unicode character in a null-terminated UTF-encoded byte array.

com.textuality.Ustr strrchr(int c)
          Locate the last occurrence of a Unicode character in a Ustr.

static int strstr(byte[] big, byte[] little)
          Locate a substring in a byte array.

com.textuality.Ustr strstr(com.textuality.Ustr little)
          Locate a substring in a string.

com.textuality.Ustr substring(int start)
          Makes a new substring of a Ustr given a start index.

com.textuality.Ustr substring(int start, int end)
          Makes a new substring of a Ustr identified by start and end indices.

char[] toCharArray()
          Converts Ustr to a char array.

java.lang.String toString()
          Generates a Java String representing the Ustr.

Methods inherited from class java.lang.Object

clone, finalize, getClass, notify, notifyAll, wait, wait, wait

Field Detail

s

public byte[] s

A byte array containing the string, always in UTF-8 form. All Ustr operations count on null-termination. The byte array may be much bigger than the contained string.

base

public int base

Where in the array s the string starts. You can have lots of different Ustrs co-existing in a single byte array.
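The idea of several strings sharing one byte array can be sketched as follows. This is a standalone illustration using only the JDK; the names SharedBuffer, strlenAt and read are hypothetical stand-ins, with the second argument playing the role of the base field.

```java
import java.nio.charset.StandardCharsets;

// Standalone sketch; does not use com.textuality.Ustr.
class SharedBuffer {
    // Length in bytes of the null-terminated sequence starting at 'base'.
    static int strlenAt(byte[] b, int base) {
        int n = 0;
        while (b[base + n] != 0) {
            n++;
        }
        return n;
    }

    // Decodes the null-terminated UTF-8 string that starts at 'base'.
    static String read(byte[] b, int base) {
        return new String(b, base, strlenAt(b, base), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Two null-terminated strings packed into a single array.
        byte[] buffer = "one\0two\0".getBytes(StandardCharsets.UTF_8);
        System.out.println(read(buffer, 0)); // string starting at base 0
        System.out.println(read(buffer, 4)); // string starting at base 4
    }
}
```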

offset

public int offset

To keep track of a single character position within the string; this is used by the nextChar and appendChar methods.

Constructor Detail

Ustr

public Ustr()

Creates an empty Ustr with no buffer

Ustr

public Ustr(int length)

Creates an empty Ustr, with a null termination at the front.
Parameters:: length - length of the buffer, in bytes

Ustr

public Ustr(byte[] bytes)

Wraps a Ustr around a buffer. Does not do null termination, so you can pass in a buffer already containing a string.
Parameters:: bytes - the buffer

Ustr

public Ustr(byte[] bytes,
            int start)

Wraps a Ustr around a position in a buffer. Does not do null termination, so you can pass in a buffer already containing a string.
Parameters:: bytes - the buffer; start - where in the buffer the strings starts

Ustr

public Ustr(char[] chars)

Makes a Ustr from a char[] array. The Ustr is null-terminated, but no space is allocated beyond what's needed.
Parameters:: chars - the char array

Ustr

public Ustr(int[] ints)

Makes a Ustr from an int[] array. Each integer is the value of a Unicode character.
Parameters:: ints - the int array

Ustr

public Ustr(java.lang.Object o)

Makes a Ustr from an object, based on its toString(). Most commonly used with a String argument. The Ustr is null-terminated, but no space is allocated beyond what's needed.
Parameters:: o - the Object

Ustr

public Ustr(int space,
            java.lang.Object o)

Makes a Ustr from an object, based on its toString(), leaving room for growth. Most commonly used with a String argument. The Ustr is null-terminated.
Parameters:: space - How large a buffer to allocate; o - The object

Method Detail

init

public void init()

Empty a Ustr by setting its first byte to 0.

Returns:: the Ustr

compareTo

public int compareTo(java.lang.Object other)

Supports the Comparable interface. The ordering is that of native Unicode code points and probably not culturally appropriate anywhere.

Specified by:: compareTo in interface java.lang.Comparable

Parameters:: other - the object compared
Returns:: -1, 0, or 1 as you'd expect.

toString

public java.lang.String toString()

Generates a Java String representing the Ustr.

Overrides:: toString in class java.lang.Object

Returns:: the String.

length

public int length()

Length of a Ustr in Unicode characters (not bytes).

Returns:: the number of Unicode characters.

length

public static int length(byte[] b,
                         int offset)

Number of Unicode characters stored starting at some offset in a byte array. Assumes UTF-8 encoding and null termination.

Parameters:: b - the byte array; offset - where to start counting
Returns:: the number of Unicode characters.

length

public static int length(byte[] b)

Number of Unicode characters stored in a byte array. Assumes UTF-8 encoding and null termination.

Parameters:: b - the byte array
Returns:: the number of Unicode characters.

length

public static int length(java.lang.String str)

Number of Unicode characters stored in a Java string. If s is a String, s.length() and Ustr.length(s) will be the same except when s contains non-BMP characters.

Parameters:: str - the string
Returns:: the number of Unicode characters
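The difference between char count and character count can be seen with the JDK alone. The sketch below is not Ustr's implementation; it uses String.codePointCount as a stand-in for the behavior described above, and the class and method names are hypothetical.

```java
// Standalone sketch; does not use com.textuality.Ustr.
class CodePointLength {
    // Number of Unicode characters (code points) in a Java String.
    static int unicodeLength(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String bmpOnly = "Ustr";
        String withClef = "x\uD834\uDD1E"; // 'x' followed by U+1D11E, a non-BMP character
        System.out.println(bmpOnly.length() + " chars, " + unicodeLength(bmpOnly) + " characters");
        System.out.println(withClef.length() + " chars, " + unicodeLength(withClef) + " characters");
    }
}
```

For the BMP-only string the two counts agree; for the second string the surrogate pair counts as two chars but one character.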

prepareAppend

public void prepareAppend()

Set up for appendChar. Points the offset field at the buffer's null terminator.

appendChar

public void appendChar(int c)

Append one Unicode character to a Ustr. Assumes that the offset points to the null-termination, where the character ought to go, updates that field and applies another null termination. You could change the value of offset and start "appending" into the middle of a Ustr if that's what you wanted. This generates the UTF-8 bytes from the input characters.

If the character is less than 128, one byte of buffer is used. If less than 0x800, two bytes. If less than 2**16, three bytes. If no greater than 0x10ffff, four bytes. If greater than 0x10ffff, or negative, you get an exception.

Parameters:: c - the character to be appended.

appendChar

public static int appendChar(int c,
                             byte[] s,
                             int offset)

Writes one Unicode character into a UTF-8 encoded byte array at a given offset, and null-terminates it.

Parameters:: c - the Unicode character; s - the array; offset - the offset to write at
Returns:: the offset of the null byte after the encoded character
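The byte-count rule for appendChar follows from the UTF-8 encoding itself. Here is a minimal standalone encoder in the same spirit: it writes one character at an offset, null-terminates, and returns the offset of the terminator. The name encode and the use of IllegalArgumentException are assumptions of this sketch; the documentation does not say which exception Ustr throws. The sketch also does not reject surrogate values.

```java
// Standalone sketch; does not use com.textuality.Ustr.
class Utf8Append {
    // Writes the UTF-8 form of code point c into s at offset, then a 0 byte.
    // Returns the offset of that 0 byte.
    static int encode(int c, byte[] s, int offset) {
        if (c < 0 || c > 0x10FFFF) {
            throw new IllegalArgumentException("not a Unicode character: " + c);
        }
        if (c < 0x80) {                 // one byte
            s[offset++] = (byte) c;
        } else if (c < 0x800) {         // two bytes
            s[offset++] = (byte) (0xC0 | (c >> 6));
            s[offset++] = (byte) (0x80 | (c & 0x3F));
        } else if (c < 0x10000) {       // three bytes
            s[offset++] = (byte) (0xE0 | (c >> 12));
            s[offset++] = (byte) (0x80 | ((c >> 6) & 0x3F));
            s[offset++] = (byte) (0x80 | (c & 0x3F));
        } else {                        // four bytes
            s[offset++] = (byte) (0xF0 | (c >> 18));
            s[offset++] = (byte) (0x80 | ((c >> 12) & 0x3F));
            s[offset++] = (byte) (0x80 | ((c >> 6) & 0x3F));
            s[offset++] = (byte) (0x80 | (c & 0x3F));
        }
        s[offset] = 0;
        return offset;
    }

    public static void main(String[] args) {
        byte[] buf = new byte[16];
        int end = encode('A', buf, 0);   // one byte
        end = encode(0x20AC, buf, end);  // three bytes, overwriting the old terminator
        System.out.println("terminator at offset " + end);
    }
}
```

Passing the returned offset back in as the next write position gives the append loop that the offset field supports.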

nextChar

public static void nextChar(byte[] b,
                            int[] answer)

Retrieve one Unicode character from some offset in byte buffer. Advances the offset to make it useful as an iterator.

Parameters:: b - the byte buffer; answer - a two-integer array. The first integer is the offset to start reading the Unicode character, and is updated to point at the next Unicode character. The second integer is used to return the character. If Java supported multiple return values from a function, this wouldn't be necessary.
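The two-integer "answer" convention can be sketched as a small decoder. This is a standalone illustration, not Ustr's code: the names Utf8Next and next are hypothetical, and the sketch assumes well-formed UTF-8 and performs no validation.

```java
// Standalone sketch; does not use com.textuality.Ustr.
class Utf8Next {
    // answer[0]: offset to read from, updated to the offset of the next character.
    // answer[1]: receives the decoded Unicode character.
    static void next(byte[] b, int[] answer) {
        int offset = answer[0];
        int lead = b[offset++] & 0xFF;
        int c;
        int extra; // number of continuation bytes that follow the lead byte
        if (lead < 0x80)      { c = lead;        extra = 0; }
        else if (lead < 0xE0) { c = lead & 0x1F; extra = 1; }
        else if (lead < 0xF0) { c = lead & 0x0F; extra = 2; }
        else                  { c = lead & 0x07; extra = 3; }
        for (int i = 0; i < extra; i++) {
            c = (c << 6) | (b[offset++] & 0x3F);
        }
        answer[0] = offset;
        answer[1] = c;
    }

    public static void main(String[] args) {
        // U+00E9 (C3 A9), U+20AC (E2 82 AC), then the null terminator.
        byte[] text = { (byte) 0xC3, (byte) 0xA9, (byte) 0xE2, (byte) 0x82, (byte) 0xAC, 0 };
        int[] answer = { 0, 0 };
        next(text, answer);
        while (answer[1] != 0) { // a decoded 0 marks the end of the string
            System.out.println("U+" + Integer.toHexString(answer[1]).toUpperCase());
            next(text, answer);
        }
    }
}
```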

prepareNext

public void prepareNext()

Set up for nextChar(). Points the offset field at the start of the buffer.

nextChar

public int nextChar()

Retrieve one Unicode character from a Ustr and advance the working offset. Assumes the working offset is sanely located.

Returns:: the Unicode character, 0 signaling the end of the string

strlen

public int strlen()

The length in bytes of a Ustr's UTF representation. Assumes null-termination.

Returns:: the number of bytes

strlen

public static int strlen(byte[] b)

The length in bytes of a null-terminated byte array

Parameters:: b - the array
Returns:: the number of bytes

strlen

public static int strlen(byte[] b,
                         int base)

The length in bytes of a null-terminated sequence starting at some offset in a byte array.

Parameters:: b - the byte array; base - the byte offset to start counting at
Returns:: the number of bytes

strcpy

public static byte[] strcpy(byte[] to,
                            byte[] from)

Copy a null-terminated byte array.

Parameters:: to - destination array; from - source array
Returns:: the destination array

strcpy

public static byte[] strcpy(byte[] to,
                            int tbase,
                            byte[] from,
                            int fbase)

Copy null-terminated byte arrays with control over offsets.

Parameters:: to - destination array; tbase - starting offset in destination array; from - source array; fbase - starting offset in source array
Returns:: the destination array

strcpy

public com.textuality.Ustr strcpy(com.textuality.Ustr from)

Copy in the contents of another Ustr. Does not change the offset.

Parameters:: from - source Ustr
Returns:: this Ustr

strcpy

public com.textuality.Ustr strcpy(java.lang.Object o)

Copy in the String representation of an Object. Does not change the offset.

Parameters:: o - the source object
Returns:: this Ustr

strcpy

public com.textuality.Ustr strcpy(byte[] from)

Copy in the contents of a null-terminated byte array. Does not change the offset.

Parameters:: from - the byte array
Returns:: this Ustr

strcpy

public com.textuality.Ustr strcpy(byte[] from,
                                  int boffset)

Copy in the contents at some offset in a null-terminated byte array. Does not change the offset.

Parameters:: from - the source byte array; boffset - where to start copying in the source array
Returns:: this Ustr

strcpy

public static byte[] strcpy(byte[] b,
                            java.lang.String s)

Load a null-terminated UTF-8 encoding of a String into a byte array at the front.

Parameters:: b - the byte array; s - the String
Returns:: the byte array

strcpy

public static byte[] strcpy(byte[] b,
                            int offset,
                            java.lang.String s)

Load a null-terminated UTF-8 encoding of a String into a byte array.

Parameters:: b - the byte array; offset - where in the byte array to load; s - the String
Returns:: the byte array

sstrcat

public com.textuality.Ustr sstrcat(com.textuality.Ustr from)

Safely append one Ustr to another.

Parameters:: from - the Ustr to be appended
Returns:: this

sstrcat

public byte[] sstrcat(byte[] to,
                      byte[] from)

Safely append one null-terminated byte array to another. Destination buffer will not be overrun.

Parameters:: to - dest array; from - source array
Returns:: dest array

sstrcat

public static byte[] sstrcat(byte[] to,
                             int tbase,
                             byte[] from,
                             int fbase)

Safely append one null-terminated byte array to another with control over offsets. Destination buffer will not be overrun.

Parameters:: to - dest array; tbase - base of dest array; from - source array; fbase - base of source array
Returns:: to

sstrcpy

public static byte[] sstrcpy(byte[] to,
                             int tbase,
                             byte[] from,
                             int fbase)

Safely copy null-terminated byte arrays with control over offsets. Destination buffer will not be overrun.

Parameters:: to - destination array; tbase - starting offset in destination array; from - source array; fbase - starting offset in source array
Returns:: the destination array

sstrcpy

public static byte[] sstrcpy(byte[] to,
                             byte[] from)

Safely copy a null-terminated byte array. The destination buffer will not be overrun.

Parameters:: to - destination array; from - source array
Returns:: the destination array

sstrcpy

public com.textuality.Ustr sstrcpy(com.textuality.Ustr from)

Safely copy in the contents of another Ustr. Does not change the offset. The destination buffer will not be overrun.

Parameters:: from - source Ustr
Returns:: this Ustr

strcat

public static byte[] strcat(byte[] to,
                            int tbase,
                            byte[] from,
                            int fbase)

Copy one null-terminated array to the end of another, with starting offsets for each

Parameters:: to - destination array; tbase - base position of destination; from - source array; fbase - base position of source
Returns:: destination

strcat

public static byte[] strcat(byte[] to,
                            byte[] from)

Copy one null-terminated byte array to the end of another.

Parameters:: to - destination array; from - source array
Returns:: the destination array

strcat

public com.textuality.Ustr strcat(com.textuality.Ustr other)

Append the contents of another Ustr to the end of this one

Parameters:: other - the other Ustr
Returns:: this Ustr

strcmp

public static int strcmp(byte[] s1,
                         byte[] s2)

Compare two null-terminated byte arrays. The ordering is that of native Unicode code points and probably not culturally appropriate anywhere.

Parameters:: s1 - first byte array; s2 - second byte array
Returns:: a negative number, zero, or a positive number depending on whether s1 is lexically less than, equal to, or greater than s2.
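One way to obtain code point ordering is to compare UTF-8 bytes as unsigned values, since UTF-8 preserves code point order under that comparison. Whether Ustr does exactly this is an assumption; the sketch below is standalone, with hypothetical names, and also shows that Java's String.compareTo (which compares UTF-16 chars) can disagree for non-BMP characters.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Standalone sketch; does not use com.textuality.Ustr.
class CodePointOrder {
    // Compares two null-terminated UTF-8 byte arrays, treating bytes as unsigned.
    static int compare(byte[] s1, byte[] s2) {
        int i = 0;
        while (true) {
            int a = s1[i] & 0xFF;
            int b = s2[i] & 0xFF;
            if (a != b) return a - b;
            if (a == 0) return 0; // both terminators reached together
            i++;
        }
    }

    // UTF-8 bytes of a String followed by a single 0 byte.
    static byte[] nullTerminated(String s) {
        byte[] raw = s.getBytes(StandardCharsets.UTF_8);
        return Arrays.copyOf(raw, raw.length + 1);
    }

    public static void main(String[] args) {
        String bmp = "\uFF5E";          // U+FF5E
        String nonBmp = "\uD834\uDD1E"; // U+1D11E, stored as a surrogate pair
        System.out.println("UTF-8 byte order: " + compare(nullTerminated(bmp), nullTerminated(nonBmp)));
        System.out.println("String.compareTo: " + bmp.compareTo(nonBmp));
    }
}
```

The byte comparison reports U+FF5E as smaller than U+1D11E, while String.compareTo reports the opposite because the high surrogate 0xD834 is numerically below 0xFF5E.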

strcmp

public static int strcmp(byte[] s1,
                         int s1base,
                         byte[] s2,
                         int s2base)

Compare sections of two null-terminated byte arrays. The ordering is that of native Unicode code points and probably not culturally appropriate anywhere.

Parameters:: s1 - first byte array; s1base - byte offset in first array to start comparing; s2 - second byte array; s2base - byte offset in second array to start comparing
Returns:: a negative number, zero, or a positive number depending on whether s1 is lexically less than, equal to, or greater than s2.

strcmp

public int strcmp(com.textuality.Ustr other)

Compare two Ustrs. The ordering is that of native Unicode code points and probably not culturally appropriate anywhere.

Parameters:: other - the other Ustr
Returns:: a negative number, zero, or a positive number depending on whether the other is lexically less than, equal to, or greater than this.

strcmp

public int strcmp(java.lang.Object o)

Compare a Ustr to an object's String representation. The ordering is that of native Unicode code points and probably not culturally appropriate anywhere.

Returns:: a negative number, zero, or a positive number depending on whether the other is lexically less than, equal to, or greater than this.

strchr

public com.textuality.Ustr strchr(int c)

Locate a Unicode character in a Ustr. Returns null if not found; if the character is zero, finds the offset of the null termination.

Parameters:: c - the character, as an integer
Returns:: a Ustr with the same buffer, starting at the matching character

strchr

public static int strchr(byte[] b,
                         int c)

Find the offset where a Unicode character starts in a null-terminated UTF-encoded byte array. Returns -1 if not found; if the character is zero, finds the offset of the null termination.

Parameters:: b - UTF-encoded null-terminated byte array; c - the character, as an integer
Returns:: the offset in the string, or -1

strrchr

public com.textuality.Ustr strrchr(int c)

Locate the last occurrence of a Unicode character in a Ustr. If found, returns a Ustr built around the same buffer as this, with the base set to the matching location. If not found, returns null.

Parameters:: c - the character, as an integer
Returns:: a Ustr with the base set to the match, or null

strrchr

public static int strrchr(byte[] b,
                          int c)

Find the offset of the last appearance of a Unicode character in a null-terminated UTF-encoded byte array. Returns -1 if not found.

Parameters:: b - the byte array; c - the character, as an integer
Returns:: the offset where the last occurrence of c starts, or -1

strstr

public com.textuality.Ustr strstr(com.textuality.Ustr little)

Locate a substring in a string. Returns a Ustr built around the same buffer, but starting at the matching position, or null if no match is found.

Parameters:: little - the substring to be located
Returns:: matching Ustr, or null

strstr

public static int strstr(byte[] big,
                         byte[] little)

Locate a substring in a byte array. Returns the offset of the substring if it matches, otherwise -1.

Parameters:: big - the array to search in; little - the array to search for
Returns:: the offset of the match, or -1

charAt

public int charAt(int at)
           throws java.lang.IndexOutOfBoundsException

Find the Unicode character at some index in a Ustr. Throws an IndexOutOfBoundsException if the index is out of range.

Parameters:: at - the index
Returns:: the Unicode character, as an integer
Throws:: java.lang.IndexOutOfBoundsException

concat

public com.textuality.Ustr concat(java.lang.String str)

Append a String to the end of this.

Parameters:: str - the string
Returns:: a new Ustr which contains the concatenation

concat

public com.textuality.Ustr concat(com.textuality.Ustr us)

Append a Ustr to the end of this.

Parameters:: us - the Ustr to append
Returns:: a new Ustr

endsWith

public boolean endsWith(com.textuality.Ustr suffix)

Test if this Ustr ends with the specified suffix (a Ustr).

Parameters:: suffix - the possible suffix.
Returns:: true or false.

endsWith

public boolean endsWith(java.lang.String suffix)

Test if this Ustr ends with specified suffix (a String).

Parameters:: suffix - the possible suffix
Returns:: true or false

equals

public boolean equals(java.lang.Object anObject)

Compares this Ustr to another object.

Overrides:: equals in class java.lang.Object

Parameters:: anObject - the other object
Returns:: true or false

getBytes

public byte[] getBytes()

Convert this Ustr into bytes according to the platform's default character encoding, storing the result in a new byte array.

Returns:: a new byte array

getBytes

public byte[] getBytes(java.lang.String enc)
                throws java.io.UnsupportedEncodingException

Convert this Ustr into bytes according to the specified character encoding, storing the result into a new byte array.

Parameters:: enc - the encoding to use in generating bytes
Returns:: the new byte array
Throws:: java.io.UnsupportedEncodingException

getChars

public static void getChars(java.lang.String str,
                            int srcBegin,
                            int srcEnd,
                            char[] dst,
                            int dstBegin)

Copies Unicode characters from this String into the destination char array. Note that if the String contains UTF-16 surrogate pairs, each pair counts as a single character.

Parameters:: str - the string; srcBegin - where to start copying; srcEnd - index after last char to copy; dst - start of destination array; dstBegin - where in the destination array to start copying

getChars

public void getChars(int srcBegin,
                     int srcEnd,
                     char[] dst,
                     int dstBegin)

Copies Unicode characters from this Ustr into the destination char array. We can't just dispatch to the String implementation because we do Unicode characters, while it does UTF-16 code units.

Parameters:: srcBegin - where to start copying; srcEnd - index after last char to copy; dst - start of destination array; dstBegin - where in the destination array to start copying

hashCode

public int hashCode()

Returns a hashcode for this Ustr. The algorithm is that documented for String, only that documentation says 'int' arithmetic, which is clearly wrong, but this produces the same result as String's hashCode() for the strings "1" and "2", and thus by induction must be correct.

Overrides:: hashCode in class java.lang.Object

Returns:: an integer hashcode

indexOf

public int indexOf(int ch)

Returns the first index within this Ustr of the specified Unicode character.

Parameters:: ch - the character
Returns:: index (usable by charAt) in the string of the char, or -1

indexOf

public int indexOf(int ch,
                   int start)

Returns the first index within this Ustr of the specified character, starting at the specified index.

Parameters:: ch - the character; start - where to start looking
Returns:: index (usable by charAt) in the string of the char, or -1

indexOf

public int indexOf(com.textuality.Ustr us)

Returns the index within this Ustr of the first occurrence of the specified other Ustr, or -1.

Parameters:: us - the other Ustr
Returns:: the index of the match, or -1

indexOf

public int indexOf(com.textuality.Ustr us,
                   int start)

Returns the index within this Ustr of the first occurrence of the specified other Ustr starting at the given index, or -1.

Parameters:: us - the other Ustr; start - the index to start looking
Returns:: the index of the match, or -1

intern

public com.textuality.Ustr intern()

Returns a canonical version of the Ustr, which should be treated as read-only. Differs from the intern function of String in that it never returns the input string; if a new hashtable entry is required, it makes a new Ustr and returns that. If a programmer updates the contents of a Ustr returned from intern(), grave disorder will ensue.

Returns:: the canonical version of the Ustr.
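The copy-on-insert behavior described for intern() can be approximated with a content-keyed map. This is only a sketch of the idea under stated assumptions: InternPool, its POOL map, and the use of ByteBuffer as a content-based key are choices made for this illustration, not details of Ustr's implementation.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

// Standalone sketch; does not use com.textuality.Ustr.
class InternPool {
    // ByteBuffer hashes and compares by content, so it can serve as a key.
    private static final Map<ByteBuffer, byte[]> POOL = new HashMap<>();

    // Returns the canonical array for this content. The caller's own array is
    // never stored or returned: a private copy is made when a new entry is needed.
    static byte[] intern(byte[] text) {
        byte[] copy = text.clone();
        byte[] existing = POOL.putIfAbsent(ByteBuffer.wrap(copy), copy);
        return existing != null ? existing : copy;
    }

    public static void main(String[] args) {
        byte[] first = { 'h', 'i', 0 };
        byte[] second = { 'h', 'i', 0 };
        byte[] a = intern(first);
        byte[] b = intern(second);
        System.out.println("same canonical array: " + (a == b));
        System.out.println("input returned: " + (a == first));
    }
}
```

Because the map key wraps the very array that is handed back, modifying a returned array changes the key's hash after insertion and corrupts later lookups, which is one concrete reading of the "grave disorder" warning.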

lastIndexOf

public int lastIndexOf(int ch)

Returns the index within this Ustr of the last occurrence of the specified Unicode character.

Parameters:: ch - the character
Returns:: the last index of the character, or -1

lastIndexOf

public int lastIndexOf(int ch,
                       int stop)

Returns the index within this Ustr of the last occurrence of the specified Unicode character before the specified stop index.

Parameters:: ch - the character; stop - last index to consider
Returns:: the last index of the character, or -1

lastIndexOf

public int lastIndexOf(com.textuality.Ustr us)

Finds the last substring match.

Parameters:: us - the substring to search for
Returns:: the match index, or -1

lastIndexOf

public int lastIndexOf(com.textuality.Ustr us,
                       int stop)

Finds the last substring match before the given index.

Parameters:: us - the substring to search for; stop - where to stop searching
Returns:: the match index, or -1

replace

public com.textuality.Ustr replace(int oldChar,
                                   int newChar)

Returns a new Ustr with all instances of one Unicode character replaced by another.

Parameters:: oldChar - the Unicode character to be replaced; newChar - the Unicode character to replace it with
Returns:: the new Ustr

startsWith

public boolean startsWith(com.textuality.Ustr us)

Tests if other Ustr is prefix of this.

Parameters:: us - the other Ustr
Returns:: true/false

startsWith

public boolean startsWith(com.textuality.Ustr us,
                          int start)

Tests if other Ustr is prefix at given index.

Parameters:: us - the other Ustr; start - where to test
Returns:: true/false

substring

public com.textuality.Ustr substring(int start)

Makes a new substring of a Ustr given a start index.

Parameters:: start - index of start of substr
Returns:: new Ustr containing substr

substring

public com.textuality.Ustr substring(int start,
                                     int end)

Makes a new substring of a Ustr identified by start and end indices.

Parameters:: start - index of start of substr; end - index of end of substr
Returns:: new Ustr containing substr

toCharArray

public char[] toCharArray()

Converts Ustr to a char array.

Returns:: the new char array

Field Summary
`int`	`base` Where in the array `s` the string starts.
`int`	`offset` To keep track of a single character position within the string; this is used by the `nextChar` and `appendChar` methods.
`byte[]`	`s` A byte array containing the string, always in UTF-8 form.

Constructor Summary
`Ustr()` Creates an empty Ustr with no buffer.
`Ustr(byte[] bytes)` Wraps a Ustr around a buffer.
`Ustr(byte[] bytes, int start)` Wraps a Ustr around a position in a buffer.
`Ustr(char[] chars)` Makes a Ustr from a char[] array.
`Ustr(int length)` Creates an empty Ustr, with a null termination at the front.
`Ustr(int[] ints)` Makes a Ustr from an int[] array.
`Ustr(int space, java.lang.Object o)` Makes a Ustr from an object, based on its `toString()`, leaving room for growth.
`Ustr(java.lang.Object o)` Makes a Ustr from an object, based on its `toString()`.

Method Summary
`void`	`appendChar(int c)` Append one Unicode character to a Ustr.
`static int`	`appendChar(int c, byte[] s, int offset)` Writes one Unicode character into a UTF-8 encoded byte array at a given offset, and null-terminates it.
`int`	`charAt(int at)` Find the Unicode character at some index in a Ustr.
`int`	`compareTo(java.lang.Object other)` Supports the `Comparable` interface.
`com.textuality.Ustr`	`concat(java.lang.String str)` Append a String to the end of this.
`com.textuality.Ustr`	`concat(com.textuality.Ustr us)` Append a Ustr to the end of this.
`boolean`	`endsWith(java.lang.String suffix)` Test if this Ustr ends with specified suffix (a String).
`boolean`	`endsWith(com.textuality.Ustr suffix)` Test if this Ustr ends with the specified suffix (a Ustr).
`boolean`	`equals(java.lang.Object anObject)` Compares this Ustr to another object.
`byte[]`	`getBytes()` Convert this Ustr into bytes according to the platform's default character encoding, storing the result in a new byte array.
`byte[]`	`getBytes(java.lang.String enc)` Convert this Ustr into bytes according to the specified character encoding, storing the result into a new byte array.
`void`	`getChars(int srcBegin, int srcEnd, char[] dst, int dstBegin)` Copies Unicode characters from this Ustr into the destination char array.
`static void`	`getChars(java.lang.String str, int srcBegin, int srcEnd, char[] dst, int dstBegin)` Copies Unicode characters from the given String into the destination char array.
`int`	`hashCode()` Returns a hashcode for this Ustr.
`int`	`indexOf(int ch)` Returns the first index within this Ustr of the specified Unicode character.
`int`	`indexOf(int ch, int start)` Returns the first index within this Ustr of the specified character, starting at the specified index.
`int`	`indexOf(com.textuality.Ustr us)` Returns the index within this Ustr of the first occurrence of the specified other Ustr, or -1.
`int`	`indexOf(com.textuality.Ustr us, int start)` Returns the index within this Ustr of the first occurrence of the specified other Ustr starting at the given offset, or -1.
`void`	`init()` Empty a Ustr by setting its first byte to 0.
`com.textuality.Ustr`	`intern()` Returns a canonical version of the Ustr, which should be treated as read-only.
`int`	`lastIndexOf(int ch)` Returns the index within this Ustr of the last occurrence of the specified Unicode character.
`int`	`lastIndexOf(int ch, int stop)` Returns the index within this Ustr of the last occurrence of the specified Unicode character before the specified stop index.
`int`	`lastIndexOf(com.textuality.Ustr us)` Finds the last substring match.
`int`	`lastIndexOf(com.textuality.Ustr us, int stop)` Finds the last substring match before the given index.
`int`	`length()` Length of a Ustr in Unicode characters (not bytes).
`static int`	`length(byte[] b)` Number of Unicode characters stored in a byte array.
`static int`	`length(byte[] b, int offset)` Number of Unicode characters stored starting at some offset in a byte array.
`static int`	`length(java.lang.String str)` Number of Unicode characters stored in a Java string.
`int`	`nextChar()` Retrieve one Unicode character from a Ustr and advance the working offset.
`static void`	`nextChar(byte[] b, int[] answer)` Retrieve one Unicode character from some offset in byte buffer.
`void`	`prepareAppend()` Set up for `appendChar`.
`void`	`prepareNext()` Set up for `nextChar()`.
`com.textuality.Ustr`	`replace(int oldChar, int newChar)` Returns a new Ustr with all instances of one Unicode character replaced by another.
`byte[]`	`sstrcat(byte[] to, byte[] from)` Safely append one null-terminated byte array to another.
`static byte[]`	`sstrcat(byte[] to, int tbase, byte[] from, int fbase)` Safely append one null-terminated byte array to another with control over offsets.
`com.textuality.Ustr`	`sstrcat(com.textuality.Ustr from)` Safely append one Ustr to another.
`static byte[]`	`sstrcpy(byte[] to, byte[] from)` Safely copy a null-terminated byte array.
`static byte[]`	`sstrcpy(byte[] to, int tbase, byte[] from, int fbase)` Safely copy null-terminated byte arrays with control over offsets.
`com.textuality.Ustr`	`sstrcpy(com.textuality.Ustr from)` Safely copy in the contents of another Ustr.
`boolean`	`startsWith(com.textuality.Ustr us)` Tests if other Ustr is prefix of this.
`boolean`	`startsWith(com.textuality.Ustr us, int start)` Tests if other Ustr is prefix at given index.
`static byte[]`	`strcat(byte[] to, byte[] from)` Copy one null-terminated byte array to the end of another.
`static byte[]`	`strcat(byte[] to, int tbase, byte[] from, int fbase)` Copy one null-terminated array to the end of another, with starting offsets for each.
`com.textuality.Ustr`	`strcat(com.textuality.Ustr other)` Append the contents of another Ustr to the end of this one.
`static int`	`strchr(byte[] b, int c)` Find the offset where a Unicode character starts in a null-terminated UTF-8-encoded byte array.
`com.textuality.Ustr`	`strchr(int c)` Locate a Unicode character in a Ustr.
`static int`	`strcmp(byte[] s1, byte[] s2)` Compare two null-terminated byte arrays.
`static int`	`strcmp(byte[] s1, int s1base, byte[] s2, int s2base)` Compare sections of two null-terminated byte arrays.
`int`	`strcmp(java.lang.Object o)` Compare a Ustr to an object's String representation.
`int`	`strcmp(com.textuality.Ustr other)` Compare two Ustrs.
`com.textuality.Ustr`	`strcpy(byte[] from)` Copy in the contents of a null-terminated byte array.
`static byte[]`	`strcpy(byte[] to, byte[] from)` Copy a null-terminated byte array.
`com.textuality.Ustr`	`strcpy(byte[] from, int boffset)` Copy in the contents at some offset in a null-terminated byte array.
`static byte[]`	`strcpy(byte[] to, int tbase, byte[] from, int fbase)` Copy null-terminated byte arrays with control over offsets.
`static byte[]`	`strcpy(byte[] b, int offset, java.lang.String s)` Load a null-terminated UTF-8 encoding of a String into a byte array.
`static byte[]`	`strcpy(byte[] b, java.lang.String s)` Load a null-terminated UTF-8 encoding of a String into a byte array at the front.
`com.textuality.Ustr`	`strcpy(java.lang.Object o)` Copy in the String representation of an Object.
`com.textuality.Ustr`	`strcpy(com.textuality.Ustr from)` Copy in the contents of another Ustr.
`int`	`strlen()` The length in bytes of a Ustr's UTF-8 representation.
`static int`	`strlen(byte[] b)` The length in bytes of a null-terminated byte array.
`static int`	`strlen(byte[] b, int base)` The length in bytes of a null-terminated sequence starting at some offset in a byte array.
`static int`	`strrchr(byte[] b, int c)` Find the index of the last appearance of a Unicode character in a null-terminated UTF-8-encoded byte array.
`com.textuality.Ustr`	`strrchr(int c)` Locate the last occurrence of a Unicode character in a Ustr.
`static int`	`strstr(byte[] big, byte[] little)` Locate a substring in a byte array.
`com.textuality.Ustr`	`strstr(com.textuality.Ustr little)` Locate a substring in a string.
`com.textuality.Ustr`	`substring(int start)` Makes a new substring of a Ustr given a start index.
`com.textuality.Ustr`	`substring(int start, int end)` Makes a new substring of a Ustr identified by start and end indices.
`char[]`	`toCharArray()` Converts the Ustr to a char array.
`java.lang.String`	`toString()` Generates a Java String representing the Ustr.
