Oracle® Database Globalization Development Kit Java API Reference
11g Release 1 (11.1)

Book Part Number B28299-01


oracle.i18n.lcsd
Class LCSDetectionReader

java.lang.Object
  extended by java.io.Reader
      extended by oracle.i18n.lcsd.LCSDetectionReader

All Implemented Interfaces:
Closeable, Readable
Direct Known Subclasses:
LCSDetectionHTMLReader

public class LCSDetectionReader
extends Reader

The LCSDetectionReader class is the language and character set detection (LCSD) reader. It transparently detects the character set of the input and converts the data to Unicode.

The most common usage is to read text data through the Reader interface, as follows:

 InputStream in = file.getInputStream();
 Reader rdr = new LCSDetectionReader(in);
 char[] cbuf = new char[1024];
 for (int len = -1; (len = rdr.read(cbuf)) != -1;)
 {
   // do something with cbuf
   ...
 }
 

The detection occurs only once by sampling the first chunk of data.
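As a minimal sketch of the read loop above, the helper below drains any java.io.Reader into a String. The class name ReadLoopSketch and the method readAll are hypothetical, and a StringReader stands in for the source so that the example runs without the GDK library. Because LCSDetectionReader extends Reader, an instance of it could presumably be passed to the same helper.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class ReadLoopSketch {
    // Drains a Reader into a String with the read(char[]) loop.
    // Hypothetical helper, not part of the GDK API.
    static String readAll(Reader rdr) throws IOException {
        StringBuilder out = new StringBuilder();
        char[] cbuf = new char[1024];
        for (int len; (len = rdr.read(cbuf)) != -1; ) {
            // Only the first len characters of cbuf are valid on each pass.
            out.append(cbuf, 0, len);
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        // StringReader is a stand-in; any Reader subclass can be used here.
        try (Reader rdr = new StringReader("hello, world")) {
            System.out.println(readAll(rdr));
        }
    }
}
```

Appending only the first len characters matters because the final read usually fills just part of the buffer.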

Since:
10.2

Field Summary
protected static int DEFAULT_SAMPLING_SIZE
          Default sampling byte length for language and character set detection.

 

Fields inherited from class java.io.Reader
lock

 

Constructor Summary
LCSDetectionReader(InputStream in)
          Constructs the LCSD Reader instance with the character set determined by sampling initial data.
LCSDetectionReader(InputStream in, int size)
          Constructs the LCSD Reader instance with the character set determined by sampling initial data.
LCSDetectionReader(Reader reader)
          Constructs the LCSD Reader instance over the input stream reader.
LCSDetectionReader(String profile, InputStream in)
          Constructs the LCSD Reader instance with the character set determined by sampling initial data.
LCSDetectionReader(String profile, InputStream in, int size)
          Constructs the LCSD Reader instance with the character set determined by sampling initial data.
LCSDetectionReader(String profile, Reader reader)
          Constructs the LCSD Reader instance over the reader.

 

Method Summary
 void close()
          Closes the stream.
 LCSDResultSet getResult()
          Returns the result set of LCSD.
 void mark(int readAheadLimit)
          Marks the present position in the stream.
 boolean markSupported()
          Tells whether this stream supports the mark() operation.
 int read()
          Reads a single character.
 int read(char[] cbuf)
          Reads characters into an array.
 int read(char[] cbuf, int offset, int length)
          Reads characters into a portion of an array.
 boolean ready()
          Tells whether this stream is ready to be read.
 void reset()
          Resets the stream.

 

Methods inherited from class java.io.Reader
read, skip

 

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

 

Field Detail

DEFAULT_SAMPLING_SIZE

protected static final int DEFAULT_SAMPLING_SIZE
Default sampling byte length for language and character set detection.
See Also:
Constant Field Values

Constructor Detail

LCSDetectionReader

public LCSDetectionReader(InputStream in)
                   throws IOException,
                          UTFDataFormatException
Constructs the LCSD Reader instance with the character set determined by sampling initial data.
Parameters:
in - the InputStream object including the text data
Throws:
IOException - if any I/O error occurs
UTFDataFormatException - if an invalid UTF-8 data sequence is detected. Note this occurs only if the source is UTF-8 data.

LCSDetectionReader

public LCSDetectionReader(InputStream in,
                          int size)
                   throws IOException,
                          UTFDataFormatException
Constructs the LCSD Reader instance with the character set determined by sampling initial data.
Parameters:
in - the InputStream object including the text data
size - the sampling size
Throws:
IOException - if any I/O error occurs
UTFDataFormatException - if an invalid UTF-8 data sequence is detected. Note this occurs only if the source is in UTF-8 encoding

LCSDetectionReader

public LCSDetectionReader(String profile,
                          InputStream in)
                   throws IOException,
                          UTFDataFormatException
Constructs the LCSD Reader instance with the character set determined by sampling initial data.
Parameters:
profile - the LCSD profile name. null is the default.
in - the InputStream object including the text data
Throws:
IOException - if any I/O error occurs
UTFDataFormatException - if an invalid UTF-8 data sequence is detected. Note this occurs only if the source is in UTF-8 encoding

LCSDetectionReader

public LCSDetectionReader(String profile,
                          InputStream in,
                          int size)
                   throws IOException,
                          UTFDataFormatException
Constructs the LCSD Reader instance with the character set determined by sampling initial data.
Parameters:
profile - the LCSD profile name. null is the default
in - the InputStream object including the text data
size - the sampling size
Throws:
IOException - if any I/O error occurs
UTFDataFormatException - if an invalid UTF-8 data sequence is detected. Note this occurs only if the source is in UTF-8 encoding

LCSDetectionReader

public LCSDetectionReader(Reader reader)
                   throws IOException
Constructs the LCSD Reader instance over the input stream reader.

This constructor is used to detect the language from the reader object. The character set is always UTF-16.

Parameters:
reader - the InputStreamReader object
Throws:
IOException - if any I/O error occurs

LCSDetectionReader

public LCSDetectionReader(String profile,
                          Reader reader)
                   throws IOException
Constructs the LCSD Reader instance over the reader.

This constructor is used to detect the language from the reader object. The character set is always UTF-16.

Parameters:
profile - the LCSD profile name. null is the default
reader - the reader including the text data
Throws:
IOException - if any I/O error occurs

Method Detail

getResult

public LCSDResultSet getResult()
                        throws IOException,
                               UTFDataFormatException
Returns the result set of LCSD.

Call this method if your application needs the detected language. The detected character set is applied implicitly during conversion; call this method if you also need its name.

Returns:
the result set of LCSD
Throws:
IOException - if any I/O error occurs
UTFDataFormatException - if an invalid UTF-8 data sequence is detected. Note this occurs only if the source is in UTF-8 encoding

read

public int read()
         throws IOException,
                UTFDataFormatException
Reads a single character.
Overrides:
read in class Reader
Returns:
the character read, or -1 if the end of the stream has been reached
Throws:
IOException - if an I/O error occurs
UTFDataFormatException - if any invalid UTF-8 data sequence is detected. Note this occurs only if the source is UTF-8 data.
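Because java.io.UTFDataFormatException is a subclass of IOException, a caller that wants to tell malformed UTF-8 apart from other I/O failures has to catch it first. The sketch below illustrates that ordering. The class DecodeErrorSketch, the method describe, and the failing() stand-in reader are all hypothetical and only simulate the error condition; they are not part of the GDK API.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UTFDataFormatException;

class DecodeErrorSketch {
    // Reads one character and reports what happened.
    // The more specific exception must be caught before IOException.
    static String describe(Reader rdr) {
        try {
            int c = rdr.read();
            return c == -1 ? "end of stream" : String.format("read U+%04X", c);
        } catch (UTFDataFormatException e) {
            return "malformed UTF-8";
        } catch (IOException e) {
            return "I/O failure";
        }
    }

    // Hypothetical stand-in that fails the way a malformed UTF-8 source might.
    static Reader failing() {
        return new Reader() {
            @Override
            public int read(char[] cbuf, int off, int len) throws IOException {
                throw new UTFDataFormatException("simulated invalid UTF-8 sequence");
            }

            @Override
            public void close() {
                // Nothing to release in this stand-in.
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(describe(new StringReader("A")));
        System.out.println(describe(failing()));
    }
}
```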

read

public int read(char[] cbuf,
                int offset,
                int length)
         throws IOException,
                UTFDataFormatException
Reads characters into a portion of an array.
Specified by:
read in class Reader
Parameters:
cbuf - destination buffer
offset - offset at which to start storing characters
length - maximum number of characters to read
Returns:
the number of characters read, or -1 if the end of the stream has been reached
Throws:
IOException - if an I/O error occurs
UTFDataFormatException - if any invalid UTF-8 data sequence is detected. Note this occurs only if the source is UTF-8 data.
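A single call to read(cbuf, offset, length) may return fewer characters than requested, so callers that need a full buffer typically loop. The following is a sketch of that pattern against the generic java.io.Reader contract. The names ReadFullySketch and readFully are hypothetical, and a StringReader is used as a stand-in source.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class ReadFullySketch {
    // Fills cbuf as far as possible and returns the number of characters
    // stored, which is less than cbuf.length only if the stream ended first.
    static int readFully(Reader rdr, char[] cbuf) throws IOException {
        int total = 0;
        while (total < cbuf.length) {
            // Store the next chunk right after the characters already read.
            int n = rdr.read(cbuf, total, cbuf.length - total);
            if (n == -1) {
                break; // end of stream
            }
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        char[] buf = new char[8];
        int n = readFully(new StringReader("abcde"), buf);
        System.out.println(new String(buf, 0, n));
    }
}
```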

close

public void close()
           throws IOException
Closes the stream.
Specified by:
close in interface Closeable
Specified by:
close in class Reader
Throws:
IOException - if an I/O error occurs

ready

public boolean ready()
              throws IOException
Tells whether this stream is ready to be read.
Overrides:
ready in class Reader
Returns:
true if the next read() call is guaranteed not to block for input; false otherwise
Throws:
IOException - if an I/O error occurs

markSupported

public boolean markSupported()
Tells whether this stream supports the mark() operation.
Overrides:
markSupported in class Reader
Returns:
true if this stream supports the mark() operation

mark

public void mark(int readAheadLimit)
          throws IOException
Marks the present position in the stream.
Overrides:
mark in class Reader
Parameters:
readAheadLimit - limit on the number of characters that may be read while still preserving the mark. After reading this many characters, attempting to reset the stream may fail.
Throws:
IOException - if the stream does not support the mark() operation, or if some other I/O error occurs

reset

public void reset()
           throws IOException
Resets the stream.
Overrides:
reset in class Reader
Throws:
IOException - if the stream has not been marked, or if the mark has been invalidated, or if the stream does not support the reset() operation, or if some other I/O error occurs
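The mark(), reset(), and markSupported() methods are commonly combined to peek at the start of a stream without consuming it. The sketch below shows that pattern against the generic java.io.Reader contract. The names PeekSketch and peek are hypothetical, and a StringReader (which does support mark) is the stand-in. This page does not state whether LCSDetectionReader returns true from markSupported(), so the guard is kept.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class PeekSketch {
    // Returns up to n leading characters and then rewinds the reader,
    // so the next read starts from the same position again.
    static String peek(Reader rdr, int n) throws IOException {
        if (!rdr.markSupported()) {
            throw new IOException("mark() is not supported by this reader");
        }
        rdr.mark(n); // the mark stays valid while at most n characters are read
        char[] buf = new char[n];
        int total = 0;
        while (total < n) {
            int r = rdr.read(buf, total, n - total);
            if (r == -1) {
                break; // fewer than n characters available
            }
            total += r;
        }
        rdr.reset(); // rewind to the marked position
        return new String(buf, 0, total);
    }

    public static void main(String[] args) throws IOException {
        Reader rdr = new StringReader("abcdef");
        System.out.println(peek(rdr, 3));       // the first three characters
        System.out.println((char) rdr.read());  // reading resumes at the start
    }
}
```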

read

public int read(char[] cbuf)
         throws IOException,
                UTFDataFormatException
Reads characters into an array.
Overrides:
read in class Reader
Parameters:
cbuf - destination buffer
Returns:
the number of characters read, or -1 if the end of the stream has been reached
Throws:
IOException - if an I/O error occurs
UTFDataFormatException - if any invalid UTF-8 data sequence is detected. Note this occurs only if the source is UTF-8 data.

Copyright © 2003, 2007, Oracle. All rights reserved.