java.lang.Object | +--com.textuality.Ustr
Ustr - rhymes with Wooster. Implements a string, with three design goals:
Note that in the context of a Ustr, "index" always means how many Unicode characters you are into the Ustr's text, while "offset" always means how many bytes you are into its UTF-8 encoded form.
Similarly, "char" and "String" always refer to the Java constructs, while "character" always means a Unicode character, always identified by a Java "int".
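To make the index/offset distinction concrete, here is a minimal sketch in plain Java. It does not use Ustr itself; the class `IndexVsOffset` and its methods are hypothetical names chosen for illustration. It walks a null-terminated UTF-8 byte array and converts a character index into a byte offset.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class IndexVsOffset {
    /** Encodes a String as UTF-8 and appends a single 0 byte as terminator. */
    static byte[] nullTerminated(String s) {
        byte[] raw = s.getBytes(StandardCharsets.UTF_8);
        return Arrays.copyOf(raw, raw.length + 1); // copyOf pads with 0
    }

    /**
     * Returns the byte offset at which the character with the given index
     * starts, or -1 if the text has no character at that index.
     */
    static int offsetOfIndex(byte[] utf8, int index) {
        int offset = 0;
        int seen = 0;
        while (offset < utf8.length && utf8[offset] != 0) {
            if (seen == index) {
                return offset;
            }
            offset++; // step over the lead byte
            // step over continuation bytes, which all look like 10xxxxxx
            while (offset < utf8.length && (utf8[offset] & 0xC0) == 0x80) {
                offset++;
            }
            seen++;
        }
        return -1;
    }

    public static void main(String[] args) {
        // 'a' (1 byte), e-acute (2 bytes), euro sign (3 bytes), 'z' (1 byte)
        byte[] text = nullTerminated("a\u00E9\u20ACz");
        for (int index = 0; index < 4; index++) {
            System.out.println("index " + index + " -> offset " + offsetOfIndex(text, index));
        }
    }
}
```

For this input, indices 0, 1, 2 and 3 correspond to offsets 0, 1, 3 and 6, which is why the two terms cannot be used interchangeably.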
For any method that copies characters and might overrun a buffer, a "safe" version is provided, starting with an extra s, e.g. sstrcpy and sstrcat. These versions always arrange that the copied string does not overrun the provided buffer, which will be properly null-terminated.
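The behaviour promised for the "safe" variants can be sketched as follows. This is not the Ustr source; `SafeCopySketch.safeCopy` is a hypothetical helper that only assumes what the paragraph above states: the copy never runs past the destination and the result is always null-terminated.

```java
final class SafeCopySketch {
    /**
     * Copies the null-terminated sequence in 'from' into 'to', truncating
     * if necessary so that the data plus its terminator fit in 'to'.
     */
    static byte[] safeCopy(byte[] to, byte[] from) {
        if (to.length == 0) {
            return to; // no room even for the terminator
        }
        int i = 0;
        // Reserve the last slot of 'to' for the terminating 0 byte.
        while (i < to.length - 1 && i < from.length && from[i] != 0) {
            to[i] = from[i];
            i++;
        }
        to[i] = 0;
        return to;
    }

    public static void main(String[] args) {
        byte[] small = new byte[4];
        safeCopy(small, new byte[] {'h', 'e', 'l', 'l', 'o', 0});
        // Only "hel" fits, followed by the terminator.
        System.out.println(new String(small, 0, 3, java.nio.charset.StandardCharsets.UTF_8));
    }
}
```

One limitation of this sketch is that truncation is done per byte, so a multi-byte character at the cut point could be split; whether the real sstrcpy guards against that is not stated here.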
Field Summary | |
int |
base
Where in the array s the string starts. |
int |
offset
To keep track of a single character position within the string; this is used by the nextChar and appendChar
methods. |
byte[] |
s
A byte array containing the string, always in UTF-8 form. |
Constructor Summary | |
Ustr()
Creates an empty Ustr with no buffer. |
|
Ustr(byte[] bytes)
Wraps a Ustr around a buffer. |
|
Ustr(byte[] bytes,
int start)
Wraps a Ustr around a position in a buffer. |
|
Ustr(char[] chars)
Makes a Ustr from a char[] array. |
|
Ustr(int length)
Creates an empty Ustr, with a null termination at the front. |
|
Ustr(int[] ints)
Makes a Ustr from an int[] array. |
|
Ustr(int space,
java.lang.Object o)
Makes a Ustr from an object, based on its toString() ,
leaving room for growth. |
|
Ustr(java.lang.Object o)
Makes a Ustr from an object, based on its toString() . |
Method Summary | |
void |
appendChar(int c)
Append one Unicode character to a Ustr. |
static int |
appendChar(int c,
byte[] s,
int offset)
Writes one Unicode character into a UTF-8 encoded byte array at a given offset, and null-terminates it. |
int |
charAt(int at)
Find the Unicode character at some index in a Ustr. |
int |
compareTo(java.lang.Object other)
Supports the Comparable interface. |
com.textuality.Ustr |
concat(java.lang.String str)
Append a String to the end of this. |
com.textuality.Ustr |
concat(com.textuality.Ustr us)
Append a Ustr to the end of this. |
boolean |
endsWith(java.lang.String suffix)
Test if this Ustr ends with specified suffix (a String). |
boolean |
endsWith(com.textuality.Ustr suffix)
Test if this Ustr ends with the specified suffix (a Ustr). |
boolean |
equals(java.lang.Object anObject)
Compares this Ustr to another object. |
byte[] |
getBytes()
Convert this Ustr into bytes according to the platform's default character encoding, storing the result in a new byte array. |
byte[] |
getBytes(java.lang.String enc)
Convert this Ustr into bytes according to the specified character encoding, storing the result into a new byte array. |
void |
getChars(int srcBegin,
int srcEnd,
char[] dst,
int dstBegin)
Copies Unicode characters from this Ustr into the destination char array. |
static void |
getChars(java.lang.String str,
int srcBegin,
int srcEnd,
char[] dst,
int dstBegin)
Copies Unicode characters from a String into the destination char array. |
int |
hashCode()
Returns a hashcode for this Ustr. |
int |
indexOf(int ch)
Returns the first index within this Ustr of the specified Unicode character. |
int |
indexOf(int ch,
int start)
Returns the first index within this Ustr of the specified character, starting at the specified index. |
int |
indexOf(com.textuality.Ustr us)
Returns the index within this Ustr of the first occurrence of the specified other Ustr, or -1. |
int |
indexOf(com.textuality.Ustr us,
int start)
Returns the index within this Ustr of the first occurrence of the specified other Ustr starting at the given index, or -1. |
void |
init()
Empty a Ustr by setting its first byte to 0. |
com.textuality.Ustr |
intern()
Returns a canonical version of the Ustr, which should be treated as read-only. |
int |
lastIndexOf(int ch)
Returns the index within this Ustr of the last occurrence of the specified Unicode character. |
int |
lastIndexOf(int ch,
int stop)
Returns the index within this Ustr of the last occurrence of the specified Unicode character before the specified stop index. |
int |
lastIndexOf(com.textuality.Ustr us)
Finds the last substring match. |
int |
lastIndexOf(com.textuality.Ustr us,
int stop)
Finds the last substring match before the given index. |
int |
length()
Length of a Ustr in Unicode characters (not bytes). |
static int |
length(byte[] b)
Number of Unicode characters stored in a byte array. |
static int |
length(byte[] b,
int offset)
Number of Unicode characters stored starting at some offset in a byte array. |
static int |
length(java.lang.String str)
Number of Unicode characters stored in a Java string. |
int |
nextChar()
Retrieve one Unicode character from a Ustr and advance the working offset. |
static void |
nextChar(byte[] b,
int[] answer)
Retrieve one Unicode character from some offset in byte buffer. |
void |
prepareAppend()
Set up for appendChar . |
void |
prepareNext()
Set up for nextChar() . |
com.textuality.Ustr |
replace(int oldChar,
int newChar)
Returns a new Ustr with all instances of one Unicode character replaced by another. |
byte[] |
sstrcat(byte[] to,
byte[] from)
Safely append one null-terminated byte array to another. |
static byte[] |
sstrcat(byte[] to,
int tbase,
byte[] from,
int fbase)
Safely append one null-terminated byte array to another with control over offsets. |
com.textuality.Ustr |
sstrcat(com.textuality.Ustr from)
Safely append one Ustr to another. |
static byte[] |
sstrcpy(byte[] to,
byte[] from)
Safely copy a null-terminated byte array. |
static byte[] |
sstrcpy(byte[] to,
int tbase,
byte[] from,
int fbase)
Safely copy null-terminated byte arrays with control over offsets. |
com.textuality.Ustr |
sstrcpy(com.textuality.Ustr from)
Safely copy in the contents of another Ustr. |
boolean |
startsWith(com.textuality.Ustr us)
Tests if other Ustr is prefix of this. |
boolean |
startsWith(com.textuality.Ustr us,
int start)
Tests if other Ustr is prefix at given index. |
static byte[] |
strcat(byte[] to,
byte[] from)
Copy one null-terminated byte array to the end of another. |
static byte[] |
strcat(byte[] to,
int tbase,
byte[] from,
int fbase)
Copy one null-terminated array to the end of another, with starting offsets for each. |
com.textuality.Ustr |
strcat(com.textuality.Ustr other)
Append the contents of another Ustr to the end of this one. |
static int |
strchr(byte[] b,
int c)
Find the offset where a Unicode character starts in a null-terminated UTF-encoded byte array. |
com.textuality.Ustr |
strchr(int c)
Locate a Unicode character in a Ustr. |
static int |
strcmp(byte[] s1,
byte[] s2)
Compare two null-terminated byte arrays. |
static int |
strcmp(byte[] s1,
int s1base,
byte[] s2,
int s2base)
Compare sections of two null-terminated byte arrays. |
int |
strcmp(java.lang.Object o)
Compare a Ustr to an object's String representation. |
int |
strcmp(com.textuality.Ustr other)
Compare two Ustrs. |
com.textuality.Ustr |
strcpy(byte[] from)
Copy in the contents of a null-terminated byte array. |
static byte[] |
strcpy(byte[] to,
byte[] from)
Copy a null-terminated byte array. |
com.textuality.Ustr |
strcpy(byte[] from,
int boffset)
Copy in the contents at some offset in a null-terminated byte array. |
static byte[] |
strcpy(byte[] to,
int tbase,
byte[] from,
int fbase)
Copy null-terminated byte arrays with control over offsets. |
static byte[] |
strcpy(byte[] b,
int offset,
java.lang.String s)
Load a null-terminated UTF-8 encoding of a String into a byte array. |
static byte[] |
strcpy(byte[] b,
java.lang.String s)
Load a null-terminated UTF-8 encoding of a String into a byte array at the front. |
com.textuality.Ustr |
strcpy(java.lang.Object o)
Copy in the String representation of an Object. |
com.textuality.Ustr |
strcpy(com.textuality.Ustr from)
Copy in the contents of another Ustr. |
int |
strlen()
The length in bytes of a Ustr's UTF representation. |
static int |
strlen(byte[] b)
The length in bytes of a null-terminated byte array. |
static int |
strlen(byte[] b,
int base)
The length in bytes of a null-terminated sequence starting at some offset in a byte array. |
static int |
strrchr(byte[] b,
int c)
Find the index of the last appearance of a Unicode character in a null-terminated UTF-encoded byte array. |
com.textuality.Ustr |
strrchr(int c)
Locate the last occurrence of a Unicode character in a Ustr. |
static int |
strstr(byte[] big,
byte[] little)
Locate a substring in a byte array. |
com.textuality.Ustr |
strstr(com.textuality.Ustr little)
Locate a substring in a string. |
com.textuality.Ustr |
substring(int start)
Makes a new substring of a Ustr given a start index. |
com.textuality.Ustr |
substring(int start,
int end)
Makes a new substring of a Ustr identified by start and end indices. |
char[] |
toCharArray()
Converts Ustr to a char array. |
java.lang.String |
toString()
Generates a Java String representing the Ustr. |
Methods inherited from class java.lang.Object |
clone, finalize, getClass, notify, notifyAll, wait, wait, wait |
Field Detail |
public byte[] s
public int base
Where in the array s the string starts. You can have lots of different Ustrs co-existing in a single byte array.
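As an illustration of several strings sharing one byte array, here is a minimal sketch in plain Java. It does not use Ustr; `SharedBufferSketch` and its helpers are hypothetical, and the only assumption is that each string is a null-terminated UTF-8 run identified by the offset where it starts.

```java
import java.nio.charset.StandardCharsets;

final class SharedBufferSketch {
    /** Length in bytes of the null-terminated sequence starting at base. */
    static int byteLength(byte[] b, int base) {
        int i = base;
        while (i < b.length && b[i] != 0) {
            i++;
        }
        return i - base;
    }

    /** Decodes the null-terminated UTF-8 sequence starting at base. */
    static String decode(byte[] b, int base) {
        return new String(b, base, byteLength(b, base), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Three strings packed back to back, each followed by a 0 byte.
        byte[] buf = "one\0two\0three\0".getBytes(StandardCharsets.UTF_8);
        int[] bases = {0, 4, 8};
        for (int base : bases) {
            System.out.println("base " + base + ": " + decode(buf, base)
                    + " (" + byteLength(buf, base) + " bytes)");
        }
    }
}
```

Each base plays the role that the base field plays for a Ustr wrapped around a position in a larger buffer.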
public int offset
To keep track of a single character position within the string; this is used by the nextChar and appendChar methods.
Constructor Detail |
public Ustr()
public Ustr(int length)
length - length of the buffer, in bytes
public Ustr(byte[] bytes)
bytes - the buffer
public Ustr(byte[] bytes, int start)
bytes - the buffer
start - where in the buffer the string starts
public Ustr(char[] chars)
chars - the char array
public Ustr(int[] ints)
ints - the int array
public Ustr(java.lang.Object o)
Makes a Ustr from an object, based on its toString(). Most commonly used with a String argument. The Ustr is null-terminated, but no space is allocated beyond what's needed.
o - the Object
public Ustr(int space, java.lang.Object o)
Makes a Ustr from an object, based on its toString(), leaving room for growth. Most commonly used with a String argument. The Ustr is null-terminated.
space - how large a buffer to allocate
o - the object
Method Detail |
public void init()
public int compareTo(java.lang.Object other)
Supports the Comparable interface. The ordering is that of native Unicode code points and probably not culturally appropriate anywhere.
compareTo in interface java.lang.Comparable
other - the object compared
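Code point ordering differs from the ordering of java.lang.String.compareTo, which compares UTF-16 code units. The sketch below shows the difference in plain Java; `CodePointOrderSketch.compareUtf8` is a hypothetical helper, not the Ustr implementation. It relies on the known property that comparing UTF-8 bytes as unsigned values yields code point order.

```java
import java.nio.charset.StandardCharsets;

final class CodePointOrderSketch {
    /** Compares two strings by their UTF-8 bytes, treated as unsigned. */
    static int compareUtf8(String a, String b) {
        byte[] x = a.getBytes(StandardCharsets.UTF_8);
        byte[] y = b.getBytes(StandardCharsets.UTF_8);
        int n = Math.min(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int d = (x[i] & 0xFF) - (y[i] & 0xFF);
            if (d != 0) {
                return d;
            }
        }
        return x.length - y.length;
    }

    public static void main(String[] args) {
        String bmp = "\uFF5E";          // U+FF5E, inside the BMP
        String astral = "\uD83D\uDE00"; // U+1F600, outside the BMP
        // UTF-16 code unit order puts the surrogate pair first.
        System.out.println("String.compareTo > 0: " + (bmp.compareTo(astral) > 0));
        // Code point order puts U+FF5E first.
        System.out.println("compareUtf8 < 0: " + (compareUtf8(bmp, astral) < 0));
    }
}
```

Both lines print true, so a list sorted by code points can differ from the same list sorted as Java Strings once non-BMP characters are involved.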
public java.lang.String toString()
toString in class java.lang.Object
public int length()
public static int length(byte[] b, int offset)
b - the byte array
offset - where to start counting
public static int length(byte[] b)
b - the byte array
public static int length(java.lang.String str)
Number of Unicode characters stored in a Java string. If s is a String, s.length() and Ustr.length(s) will be the same except when s contains non-BMP characters.
str - the string
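The non-BMP case mentioned above arises because Java stores such characters as two chars (a surrogate pair). A minimal sketch of counting Unicode characters in a String, using only the standard library, might look like this; `CodePointCountSketch.countCharacters` is a hypothetical name and is not claimed to match how Ustr.length(String) is written.

```java
final class CodePointCountSketch {
    /** Counts Unicode characters, treating each surrogate pair as one. */
    static int countCharacters(String s) {
        int count = 0;
        for (int i = 0; i < s.length(); i++) {
            // A low surrogate that follows a high surrogate completes a pair
            // that was already counted, so skip it.
            if (i > 0 && Character.isLowSurrogate(s.charAt(i))
                    && Character.isHighSurrogate(s.charAt(i - 1))) {
                continue;
            }
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String s = "a\uD83D\uDE00b"; // 'a', U+1F600, 'b'
        System.out.println("String.length(): " + s.length());
        System.out.println("characters:      " + countCharacters(s));
    }
}
```

For this input String.length() is 4 while the character count is 3. The standard method String.codePointCount(0, s.length()) gives the same count.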
public void prepareAppend()
Set up for appendChar. Points the offset field at the buffer's null terminator.
public void appendChar(int c)
Append one Unicode character to a Ustr. Assumes that offset points to the null-termination, where the character ought to go, updates that field and applies another null termination. You could change the value of offset and start "appending" into the middle of a Ustr if that's what you wanted. This generates the UTF-8 bytes from the input characters.
If the character is less than 128, one byte of buffer is used. If less than 0x800, two bytes. If less than 0x10000 (2**16), three bytes. If no greater than 0x10ffff, four bytes. If greater than 0x10ffff, or negative, you get an exception.
c - the character to be appended.
public static int appendChar(int c, byte[] s, int offset)
c - the Unicode character
s - the array
offset - the offset to write at
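The byte-count rules for appending a character follow standard UTF-8 encoding. The sketch below shows one way to write a code point into a byte array and null-terminate it. It is not the Ustr source: the name `Utf8AppendSketch.append`, the choice of IllegalArgumentException, and the decision to return the offset of the new terminator are all assumptions made for illustration.

```java
final class Utf8AppendSketch {
    /**
     * Writes code point c into s at offset as UTF-8, writes a 0 byte after
     * it, and returns the offset of that 0 byte. The caller must ensure
     * that s has room for up to four bytes plus the terminator.
     */
    static int append(int c, byte[] s, int offset) {
        if (c < 0 || c > 0x10FFFF) {
            throw new IllegalArgumentException("not a Unicode code point: " + c);
        }
        if (c < 0x80) {                 // one byte: 0xxxxxxx
            s[offset++] = (byte) c;
        } else if (c < 0x800) {         // two bytes: 110xxxxx 10xxxxxx
            s[offset++] = (byte) (0xC0 | (c >> 6));
            s[offset++] = (byte) (0x80 | (c & 0x3F));
        } else if (c < 0x10000) {       // three bytes: 1110xxxx 10xxxxxx 10xxxxxx
            s[offset++] = (byte) (0xE0 | (c >> 12));
            s[offset++] = (byte) (0x80 | ((c >> 6) & 0x3F));
            s[offset++] = (byte) (0x80 | (c & 0x3F));
        } else {                        // four bytes: 11110xxx and three 10xxxxxx
            s[offset++] = (byte) (0xF0 | (c >> 18));
            s[offset++] = (byte) (0x80 | ((c >> 12) & 0x3F));
            s[offset++] = (byte) (0x80 | ((c >> 6) & 0x3F));
            s[offset++] = (byte) (0x80 | (c & 0x3F));
        }
        s[offset] = 0; // re-apply the null termination
        return offset;
    }

    public static void main(String[] args) {
        byte[] buf = new byte[16];
        int offset = 0;
        offset = append('a', buf, offset);     // 1 byte
        offset = append(0x20AC, buf, offset);  // 3 bytes (euro sign)
        offset = append(0x1F600, buf, offset); // 4 bytes (outside the BMP)
        System.out.println(offset + " bytes used");
    }
}
```

Returning the terminator's offset lets the caller feed it straight back in as the next write position, which mirrors how the offset field is described for the instance method.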
public static void nextChar(byte[] b, int[] answer)
b - the byte buffer
answer - a two-integer array. The first integer is the offset to start reading the Unicode character, and is updated to point at the next Unicode character. The second integer is used to return the character. If Java supported multiple return values from a function, this wouldn't be necessary.
public void prepareNext()
Set up for nextChar(). Points the offset field at the start of the buffer.
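The two-integer calling convention described for the static nextChar can be illustrated with a small UTF-8 decoder. This is a sketch under stated assumptions, not the Ustr source: `Utf8NextSketch.next` is a hypothetical name, the input is assumed to be well-formed UTF-8, and continuation bytes are not validated.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class Utf8NextSketch {
    /**
     * Reads one character from b. On entry answer[0] is the byte offset to
     * read at; on exit answer[0] is the offset of the next character and
     * answer[1] holds the decoded code point.
     */
    static void next(byte[] b, int[] answer) {
        int offset = answer[0];
        int first = b[offset++] & 0xFF;
        int c;
        int extra; // number of continuation bytes that follow
        if (first < 0x80) {
            c = first;
            extra = 0;
        } else if ((first & 0xE0) == 0xC0) {
            c = first & 0x1F;
            extra = 1;
        } else if ((first & 0xF0) == 0xE0) {
            c = first & 0x0F;
            extra = 2;
        } else if ((first & 0xF8) == 0xF0) {
            c = first & 0x07;
            extra = 3;
        } else {
            throw new IllegalArgumentException("invalid lead byte at offset " + (offset - 1));
        }
        for (int i = 0; i < extra; i++) {
            c = (c << 6) | (b[offset++] & 0x3F); // take the low six bits
        }
        answer[0] = offset;
        answer[1] = c;
    }

    public static void main(String[] args) {
        byte[] raw = "a\u20AC\uD83D\uDE00".getBytes(StandardCharsets.UTF_8);
        byte[] b = Arrays.copyOf(raw, raw.length + 1); // add the 0 terminator
        int[] answer = {0, 0};
        next(b, answer);
        while (answer[1] != 0) { // a decoded 0 marks the end of the string
            System.out.println("U+" + Integer.toHexString(answer[1]).toUpperCase()
                    + ", next offset " + answer[0]);
            next(b, answer);
        }
    }
}
```

Because the same array carries both the position and the result, a caller can loop without allocating anything per character.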
public int nextChar()
public int strlen()
public static int strlen(byte[] b)
b - the array
public static int strlen(byte[] b, int base)
b - the byte array
base - the byte offset to start counting at
public static byte[] strcpy(byte[] to, byte[] from)
to - destination array
from - source array
public static byte[] strcpy(byte[] to, int tbase, byte[] from, int fbase)
to - destination array
tbase - starting offset in destination array
from - source array
fbase - starting offset in source array
public com.textuality.Ustr strcpy(com.textuality.Ustr from)
from - source Ustr
public com.textuality.Ustr strcpy(java.lang.Object o)
o - the source object
public com.textuality.Ustr strcpy(byte[] from)
from - the byte array
public com.textuality.Ustr strcpy(byte[] from, int boffset)
from - the source byte array
boffset - where to start copying in the source array
public static byte[] strcpy(byte[] b, java.lang.String s)
b - the byte array
s - the String
public static byte[] strcpy(byte[] b, int offset, java.lang.String s)
b - the byte array
offset - where in the byte array to load
s - the String
public com.textuality.Ustr sstrcat(com.textuality.Ustr from)
from - the Ustr to be appended
public byte[] sstrcat(byte[] to, byte[] from)
to - dest array
from - source array
public static byte[] sstrcat(byte[] to, int tbase, byte[] from, int fbase)
to - dest array
tbase - base of dest array
from - source array
fbase - base of source array
public static byte[] sstrcpy(byte[] to, int tbase, byte[] from, int fbase)
to - destination array
tbase - starting offset in destination array
from - source array
fbase - starting offset in source array
public static byte[] sstrcpy(byte[] to, byte[] from)
to - destination array
from - source array
public com.textuality.Ustr sstrcpy(com.textuality.Ustr from)
from - source Ustr
public static byte[] strcat(byte[] to, int tbase, byte[] from, int fbase)
to - destination array
from - source array
fbase - base pos of source
public static byte[] strcat(byte[] to, byte[] from)
to - destination array
from - source array
public com.textuality.Ustr strcat(com.textuality.Ustr other)
other - the other Ustr
public static int strcmp(byte[] s1, byte[] s2)
s1 - first byte array
s2 - second byte array
public static int strcmp(byte[] s1, int s1base, byte[] s2, int s2base)
s1 - first byte array
s1base - byte offset in first array to start comparing
s2 - second byte array
s2base - byte offset in second array to start comparing
public int strcmp(com.textuality.Ustr other)
other - the other Ustr
public int strcmp(java.lang.Object o)
public com.textuality.Ustr strchr(int c)
c - the character, as an integer
public static int strchr(byte[] b, int c)
b - UTF-encoded null-terminated byte array
public com.textuality.Ustr strrchr(int c)
c - the character, as an integer
public static int strrchr(byte[] b, int c)
b - the byte array
c - the integer
public com.textuality.Ustr strstr(com.textuality.Ustr little)
little - the substring to be located
public static int strstr(byte[] big, byte[] little)
big - the array to search in
little - the array to search for
public int charAt(int at) throws java.lang.IndexOutOfBoundsException
at - the index
java.lang.IndexOutOfBoundsException
public com.textuality.Ustr concat(java.lang.String str)
str - the string
public com.textuality.Ustr concat(com.textuality.Ustr us)
us - the Ustr to append
public boolean endsWith(com.textuality.Ustr suffix)
suffix - the possible suffix
public boolean endsWith(java.lang.String suffix)
suffix - the possible suffix
public boolean equals(java.lang.Object anObject)
equals in class java.lang.Object
anObject - the other object
public byte[] getBytes()
public byte[] getBytes(java.lang.String enc) throws java.io.UnsupportedEncodingException
enc - the encoding to use in generating bytes
java.io.UnsupportedEncodingException
public static void getChars(java.lang.String str, int srcBegin, int srcEnd, char[] dst, int dstBegin)
str - the string
srcBegin - where to start copying
srcEnd - index after last char to copy
dst - start of destination array
dstBegin - where in the destination array to start copying
public void getChars(int srcBegin, int srcEnd, char[] dst, int dstBegin)
srcBegin - where to start copying
srcEnd - index after last char to copy
dst - start of destination array
dstBegin - where in the destination array to start copying
public int hashCode()
hashCode in class java.lang.Object
public int indexOf(int ch)
ch - the character
public int indexOf(int ch, int start)
ch - the character
start - where to start looking
public int indexOf(com.textuality.Ustr us)
us - the other Ustr
public int indexOf(com.textuality.Ustr us, int start)
us - the other Ustr
start - the index to start looking
public com.textuality.Ustr intern()
public int lastIndexOf(int ch)
ch - the character
public int lastIndexOf(int ch, int stop)
ch - the character
stop - last index to consider
public int lastIndexOf(com.textuality.Ustr us)
us - the substring to search for
public int lastIndexOf(com.textuality.Ustr us, int stop)
us - the substring to search for
stop - where to stop searching
public com.textuality.Ustr replace(int oldChar, int newChar)
oldChar - the Unicode character to be replaced
newChar - the Unicode character to replace it with
public boolean startsWith(com.textuality.Ustr us)
us - the other Ustr
public boolean startsWith(com.textuality.Ustr us, int start)
us - the other Ustr
start - where to test
public com.textuality.Ustr substring(int start)
start - index of start of substr
public com.textuality.Ustr substring(int start, int end)
start - index of start of substr
end - index of end of substr
public char[] toCharArray()