org.apache.xerces.readers
Class UTF8Reader

java.lang.Object
  extended by org.apache.xerces.readers.XMLEntityReader
      extended by org.apache.xerces.readers.UTF8Reader
All Implemented Interfaces:
XMLEntityHandler.EntityReader

final class UTF8Reader
extends XMLEntityReader

This is the primary reader used for UTF-8 encoded byte streams.

This reader processes requests from the scanners against the underlying UTF-8 byte stream, avoiding up-front transcoding where possible. When the StringPool handle interfaces are used, the data from the stream is added to the string pool and evaluated lazily, only when it is asked for.
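
The lazy-evaluation idea described above can be illustrated with a small sketch. It is not the Xerces implementation: the class name LazyUtf8Slice and its methods are hypothetical, and the sketch only assumes that a scanned range can be remembered as an offset and length into the raw bytes and transcoded on first request.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of lazy transcoding; not part of Xerces.
final class LazyUtf8Slice {
    private final byte[] data;  // shared UTF-8 buffer, never copied
    private final int offset;   // start of the slice within data
    private final int length;   // number of bytes in the slice
    private String decoded;     // cached result, filled on first use

    LazyUtf8Slice(byte[] data, int offset, int length) {
        this.data = data;
        this.offset = offset;
        this.length = length;
    }

    // True once the bytes have been transcoded to a String.
    boolean isDecoded() {
        return decoded != null;
    }

    // Transcode on the first request only, then reuse the cached String.
    String value() {
        if (decoded == null) {
            decoded = new String(data, offset, length, StandardCharsets.UTF_8);
        }
        return decoded;
    }
}
```

A caller that never asks for value() never pays for the conversion from bytes to characters, which is the saving the description refers to.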

We use the SymbolCache to match expected names (element types in end tags) and walk the data structures of that class directly.

There is a significant amount of hand-inlining and some blatant violation of good object-oriented programming rules, ignoring boundaries of modularity, etc., in the name of good performance.

There are also places where code of this kind frequently crashes the Sun Java just-in-time (JIT) compiler, and the code here has been carefully "crafted" to avoid those problems.

Version:
$Id: UTF8Reader.java,v 1.3 2000/10/07 18:06:55 markd Exp $

Field Summary
private static char[] cdata_string
           
private  boolean fCallClearPreviousChunk
           
private  boolean fCalledCharPropInit
           
protected  int fCarriageReturnCounter
           
protected  int fCharacterCounter
           
private  char[] fCharacters
           
private  org.apache.xerces.utils.StringPool.CharArrayRange fCharArrayRange
           
protected  XMLEntityHandler.CharDataHandler fCharDataHandler
           
private  int fCharDataLength
           
private  org.apache.xerces.utils.UTF8DataChunk fCurrentChunk
           
private  int fCurrentIndex
           
protected  int fCurrentOffset
           
protected  XMLEntityHandler fEntityHandler
           
protected  org.apache.xerces.framework.XMLErrorReporter fErrorReporter
           
static byte[] fgAsciiAttValueChar
           
static byte[] fgAsciiEntityValueChar
           
protected  boolean fInCDSect
           
private  java.io.InputStream fInputStream
           
private  int fLength
           
protected  int fLinefeedCounter
           
private  int fMostRecentByte
           
private  byte[] fMostRecentData
           
protected  boolean fSendCharDataAsCharArray
           
private  org.apache.xerces.utils.StringPool fStringPool
           
private static boolean USE_OUT_OF_LINE_LOAD_NEXT_BYTE
           
private static boolean USE_TRY_CATCH_FOR_LOAD_NEXT_BYTE
           
 
Constructor Summary
UTF8Reader(XMLEntityHandler entityHandler, org.apache.xerces.framework.XMLErrorReporter errorReporter, boolean sendCharDataAsCharArray, java.io.InputStream dataStream, org.apache.xerces.utils.StringPool stringPool)
           
 
Method Summary
 int addString(int offset, int length)
          Add a string to the StringPool from the characters scanned using this reader as described by offset and length.
 int addSymbol(int offset, int length)
          Add a symbol to the StringPool from the characters scanned using this reader as described by offset and length.
private  int addSymbol(int offset, int length, int hashcode)
           
 void append(XMLEntityHandler.CharBuffer charBuffer, int offset, int length)
          Append the characters processed by this reader associated with offset and length to the CharBuffer.
private  void appendCharData(int ch)
           
private  boolean atEOF(int offset)
           
 XMLEntityHandler.EntityReader changeReaders()
          This method is called by the reader subclasses at the end of input.
private  void characters(int offset, int endOffset)
           
private  int copyAsciiCharData()
           
private  boolean copyMultiByteCharData(int b0)
           
 int currentOffset()
          Return the current offset within this reader.
private  int fillCurrentChunk()
           
 int getColumnNumber()
          Return the column number of the current position within the document that we are processing.
 boolean getInCDSect()
          This method is provided for scanner implementations.
 int getLineNumber()
          Return the line number of the current position within the document that we are processing.
private  int getMultiByteSymbolChar(int b0)
           
protected  void init(XMLEntityHandler entityHandler, org.apache.xerces.framework.XMLErrorReporter errorReporter, boolean sendCharDataAsCharArray, int lineNumber, int columnNumber)
           
private  int loadNextByte()
           
 boolean lookingAtChar(char ch, boolean skipPastChar)
          Test whether the current character is the character ch.
 boolean lookingAtSpace(boolean skipPastChar)
          Test that the current character is a whitespace character.
 boolean lookingAtValidChar(boolean skipPastChar)
          Test that the current character is valid.
private  int recognizeMarkup(int b0, org.apache.xerces.utils.QName element)
           
private  int recognizeReference(int ch)
           
 int scanAttValue(char qchar, boolean asSymbol)
          Scan an attribute value.
 int scanCharRef(boolean hex)
          Scan a character reference.
 int scanContent(org.apache.xerces.utils.QName element)
          Skip through the input while we are looking at character data.
 int scanEntityValue(int qchar, boolean createString)
          Scan an entity value.
 boolean scanExpectedName(char fastcheck, org.apache.xerces.utils.StringPool.CharArrayRange expectedName)
          Scan the name that is expected at the current position in the document.
 int scanInvalidChar()
          Scan an invalid character.
private  int scanMatchingName(int ch, int b0, int fastcheck)
           
 int scanName(char fastcheck)
          Add a sequence of characters that match the XML definition of a Name to the StringPool.
 void scanQName(char fastcheck, org.apache.xerces.utils.QName qname)
          Add a sequence of characters that match the XML Namespaces definition of a QName to the StringPool.
 int scanStringLiteral()
          Scan a string literal.
 void setInCDSect(boolean inCDSect)
          This method is provided for scanner implementations.
private  int skipAsciiCharData()
           
private  boolean skipMultiByteCharData(int b0)
           
 void skipPastName(char fastcheck)
          Skip past a sequence of characters that match the XML definition of a Name.
 void skipPastNmtoken(char fastcheck)
          Skip past a sequence of characters that match the XML definition of an Nmtoken.
 void skipPastSpaces()
          Skip past whitespace characters starting at the current position.
protected  boolean skippedMultiByteCharWithFlag(int b0, int flag)
           
 boolean skippedString(char[] s)
          Skip past a sequence of characters that matches the specified character array.
 void skipToChar(char ch)
          Advance through the input data up to the next ch character.
private  void slowAppendCharData(int ch)
           
private  int slowLoadNextByte()
           
private  void whitespace(int offset, int endOffset)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

USE_OUT_OF_LINE_LOAD_NEXT_BYTE

private static final boolean USE_OUT_OF_LINE_LOAD_NEXT_BYTE
See Also:
Constant Field Values

USE_TRY_CATCH_FOR_LOAD_NEXT_BYTE

private static final boolean USE_TRY_CATCH_FOR_LOAD_NEXT_BYTE
See Also:
Constant Field Values

fgAsciiAttValueChar

public static final byte[] fgAsciiAttValueChar

fgAsciiEntityValueChar

public static final byte[] fgAsciiEntityValueChar

fCharacters

private char[] fCharacters

fCharDataLength

private int fCharDataLength

cdata_string

private static final char[] cdata_string

fCharArrayRange

private org.apache.xerces.utils.StringPool.CharArrayRange fCharArrayRange

fInputStream

private java.io.InputStream fInputStream

fStringPool

private org.apache.xerces.utils.StringPool fStringPool

fCurrentChunk

private org.apache.xerces.utils.UTF8DataChunk fCurrentChunk

fCurrentIndex

private int fCurrentIndex

fMostRecentData

private byte[] fMostRecentData

fMostRecentByte

private int fMostRecentByte

fLength

private int fLength

fCalledCharPropInit

private boolean fCalledCharPropInit

fCallClearPreviousChunk

private boolean fCallClearPreviousChunk

fEntityHandler

protected XMLEntityHandler fEntityHandler

fErrorReporter

protected org.apache.xerces.framework.XMLErrorReporter fErrorReporter

fSendCharDataAsCharArray

protected boolean fSendCharDataAsCharArray

fCharDataHandler

protected XMLEntityHandler.CharDataHandler fCharDataHandler

fInCDSect

protected boolean fInCDSect

fCarriageReturnCounter

protected int fCarriageReturnCounter

fLinefeedCounter

protected int fLinefeedCounter

fCharacterCounter

protected int fCharacterCounter

fCurrentOffset

protected int fCurrentOffset

Constructor Detail

UTF8Reader

public UTF8Reader(XMLEntityHandler entityHandler,
                  org.apache.xerces.framework.XMLErrorReporter errorReporter,
                  boolean sendCharDataAsCharArray,
                  java.io.InputStream dataStream,
                  org.apache.xerces.utils.StringPool stringPool)
           throws java.lang.Exception

Method Detail

addString

public int addString(int offset,
                     int length)
Description copied from interface: XMLEntityHandler.EntityReader
Add a string to the StringPool from the characters scanned using this reader as described by offset and length.


addSymbol

public int addSymbol(int offset,
                     int length)
Description copied from interface: XMLEntityHandler.EntityReader
Add a symbol to the StringPool from the characters scanned using this reader as described by offset and length.


addSymbol

private int addSymbol(int offset,
                      int length,
                      int hashcode)

append

public void append(XMLEntityHandler.CharBuffer charBuffer,
                   int offset,
                   int length)
Description copied from interface: XMLEntityHandler.EntityReader
Append the characters processed by this reader associated with offset and length to the CharBuffer.


slowLoadNextByte

private int slowLoadNextByte()
                      throws java.lang.Exception

loadNextByte

private int loadNextByte()
                  throws java.lang.Exception

atEOF

private boolean atEOF(int offset)

changeReaders

public XMLEntityHandler.EntityReader changeReaders()
                                            throws java.lang.Exception
Description copied from class: XMLEntityReader
This method is called by the reader subclasses at the end of input.

Overrides:
changeReaders in class XMLEntityReader

lookingAtChar

public boolean lookingAtChar(char ch,
                             boolean skipPastChar)
                      throws java.lang.Exception
Description copied from interface: XMLEntityHandler.EntityReader
Test whether the current character is the character ch.


lookingAtValidChar

public boolean lookingAtValidChar(boolean skipPastChar)
                           throws java.lang.Exception
Description copied from interface: XMLEntityHandler.EntityReader
Test that the current character is valid.


lookingAtSpace

public boolean lookingAtSpace(boolean skipPastChar)
                       throws java.lang.Exception
Description copied from interface: XMLEntityHandler.EntityReader
Test that the current character is a whitespace character.


skipToChar

public void skipToChar(char ch)
                throws java.lang.Exception
Description copied from interface: XMLEntityHandler.EntityReader
Advance through the input data up to the next ch character.


skipPastSpaces

public void skipPastSpaces()
                    throws java.lang.Exception
Description copied from interface: XMLEntityHandler.EntityReader
Skip past whitespace characters starting at the current position.
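
As a rough illustration of whitespace skipping on raw UTF-8 bytes, the sketch below uses the four whitespace characters of the XML 1.0 production S (space, tab, line feed, carriage return). The class WhitespaceSkipper and its array-plus-offset signature are assumptions made for the example; the real method takes no arguments and works on the reader's internal state.

```java
// Hypothetical helper; not part of Xerces.
final class WhitespaceSkipper {
    // XML 1.0 whitespace (production S): space, tab, line feed, carriage return.
    static boolean isXmlSpace(int b) {
        return b == 0x20 || b == 0x09 || b == 0x0A || b == 0x0D;
    }

    // Returns the index of the first non-whitespace byte at or after offset,
    // or data.length if only whitespace remains.
    static int skipPastSpaces(byte[] data, int offset) {
        while (offset < data.length && isXmlSpace(data[offset])) {
            offset++;
        }
        return offset;
    }
}
```

All four whitespace characters are single-byte in UTF-8, so no decoding is needed for this check.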


skippedMultiByteCharWithFlag

protected boolean skippedMultiByteCharWithFlag(int b0,
                                               int flag)
                                        throws java.lang.Exception

skipPastName

public void skipPastName(char fastcheck)
                  throws java.lang.Exception
Description copied from interface: XMLEntityHandler.EntityReader
Skip past a sequence of characters that match the XML definition of a Name.


skipPastNmtoken

public void skipPastNmtoken(char fastcheck)
                     throws java.lang.Exception
Description copied from interface: XMLEntityHandler.EntityReader
Skip past a sequence of characters that match the XML definition of an Nmtoken.


skippedString

public boolean skippedString(char[] s)
                      throws java.lang.Exception
Description copied from interface: XMLEntityHandler.EntityReader
Skip past a sequence of characters that matches the specified character array.


scanInvalidChar

public int scanInvalidChar()
                    throws java.lang.Exception
Description copied from interface: XMLEntityHandler.EntityReader
Scan an invalid character.


scanCharRef

public int scanCharRef(boolean hex)
                throws java.lang.Exception
Description copied from interface: XMLEntityHandler.EntityReader
Scan a character reference.
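
A minimal sketch of scanning a character reference follows. It assumes the scan starts just past "&#" (decimal) or "&#x" (hexadecimal) and runs to the terminating ';'. The class CharRefSketch, the array-plus-offset signature, and the use of -1 as a failure value are assumptions for illustration; this page does not describe the return convention of the real method.

```java
// Hypothetical helper; not part of Xerces.
final class CharRefSketch {
    // Parses the digits of a character reference beginning at offset and
    // ending at ';'. Returns the code point, or -1 if the reference is
    // empty, unterminated, contains a bad digit, or exceeds U+10FFFF.
    static int scanCharRef(byte[] data, int offset, boolean hex) {
        int radix = hex ? 16 : 10;
        int value = 0;
        int digits = 0;
        while (offset < data.length && data[offset] != ';') {
            int d = Character.digit((char) (data[offset] & 0xFF), radix);
            if (d < 0) {
                return -1; // not a digit in this radix
            }
            value = value * radix + d;
            if (value > 0x10FFFF) {
                return -1; // beyond the Unicode range
            }
            digits++;
            offset++;
        }
        if (digits == 0 || offset >= data.length) {
            return -1; // no digits, or no closing ';'
        }
        return value;
    }
}
```

The sketch checks only the numeric range. A conforming parser must also verify that the resulting code point is a legal XML character.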


scanStringLiteral

public int scanStringLiteral()
                      throws java.lang.Exception
Description copied from interface: XMLEntityHandler.EntityReader
Scan a string literal.


scanAttValue

public int scanAttValue(char qchar,
                        boolean asSymbol)
                 throws java.lang.Exception
Description copied from interface: XMLEntityHandler.EntityReader
Scan an attribute value.


scanEntityValue

public int scanEntityValue(int qchar,
                           boolean createString)
                    throws java.lang.Exception
Description copied from interface: XMLEntityHandler.EntityReader
Scan an entity value.


scanExpectedName

public boolean scanExpectedName(char fastcheck,
                                org.apache.xerces.utils.StringPool.CharArrayRange expectedName)
                         throws java.lang.Exception
Description copied from interface: XMLEntityHandler.EntityReader
Scan the name that is expected at the current position in the document. This method is invoked when we are scanning the element type in an end tag that must match the element type in the corresponding start tag.
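
The end-tag check can be pictured as a direct comparison between the expected name and the raw bytes, with no transcoding when the name is ASCII. The sketch below is an illustration under that assumption. ExpectedNameSketch and its methods are hypothetical, and it simply reports failure for non-ASCII names, where a real reader would need a slower path.

```java
// Hypothetical helper; not part of Xerces.
final class ExpectedNameSketch {
    // ASCII subset of the XML name characters.
    static boolean isAsciiNameChar(int b) {
        return (b >= 'a' && b <= 'z') || (b >= 'A' && b <= 'Z')
                || (b >= '0' && b <= '9')
                || b == '_' || b == '-' || b == '.' || b == ':';
    }

    // Returns the offset just past the name if the bytes at offset spell the
    // expected ASCII name and the name ends there; otherwise returns -1.
    static int matchExpected(byte[] data, int offset, char[] expected) {
        int end = offset + expected.length;
        if (end > data.length) {
            return -1;
        }
        for (int i = 0; i < expected.length; i++) {
            char c = expected[i];
            if (c >= 0x80 || data[offset + i] != (byte) c) {
                return -1; // non-ASCII expected name or mismatch
            }
        }
        if (end < data.length) {
            int next = data[end] & 0xFF;
            if (next >= 0x80 || isAsciiNameChar(next)) {
                return -1; // the name in the document is longer than expected
            }
        }
        return end;
    }
}
```

The trailing check matters: without it, an expected name "it" would wrongly match the end tag of an element named "item".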


scanQName

public void scanQName(char fastcheck,
                      org.apache.xerces.utils.QName qname)
               throws java.lang.Exception
Description copied from interface: XMLEntityHandler.EntityReader
Add a sequence of characters that match the XML Namespaces definition of a QName to the StringPool. If we find a QName at the current position we will add it to the StringPool and will return the string pool handle of that QName to the caller.
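
For orientation, a QName in the XML Namespaces sense is either an unprefixed name or a prefix and a local part separated by a single colon. The sketch below splits an already scanned raw name on that basis. It uses plain strings rather than string pool handles, and the class QNameSketch, its field names, and the null result for a misplaced colon are all assumptions for illustration rather than the behavior of org.apache.xerces.utils.QName.

```java
// Hypothetical string-based stand-in for a qualified name; not the Xerces QName class.
final class QNameSketch {
    final String prefix;    // "" when the name has no prefix
    final String localpart; // part after the colon, or the whole name
    final String rawname;   // the name exactly as written

    private QNameSketch(String prefix, String localpart, String rawname) {
        this.prefix = prefix;
        this.localpart = localpart;
        this.rawname = rawname;
    }

    // Splits a non-empty raw name at its single colon. Returns null when the
    // colon placement cannot form a QName (leading, trailing, or repeated colon).
    static QNameSketch split(String rawname) {
        int colon = rawname.indexOf(':');
        if (colon < 0) {
            return new QNameSketch("", rawname, rawname);
        }
        boolean misplaced = colon == 0
                || colon == rawname.length() - 1
                || rawname.indexOf(':', colon + 1) >= 0;
        if (misplaced) {
            return null;
        }
        return new QNameSketch(rawname.substring(0, colon),
                rawname.substring(colon + 1), rawname);
    }
}
```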


getMultiByteSymbolChar

private int getMultiByteSymbolChar(int b0)
                            throws java.lang.Exception

scanName

public int scanName(char fastcheck)
             throws java.lang.Exception
Description copied from interface: XMLEntityHandler.EntityReader
Add a sequence of characters that match the XML definition of a Name to the StringPool. If we find a name at the current position we will add it to the StringPool as a symbol and will return the string pool handle for that symbol to the caller.


scanMatchingName

private int scanMatchingName(int ch,
                             int b0,
                             int fastcheck)
                      throws java.lang.Exception

recognizeMarkup

private int recognizeMarkup(int b0,
                            org.apache.xerces.utils.QName element)
                     throws java.lang.Exception

recognizeReference

private int recognizeReference(int ch)
                        throws java.lang.Exception

scanContent

public int scanContent(org.apache.xerces.utils.QName element)
                throws java.lang.Exception
Description copied from interface: XMLEntityHandler.EntityReader
Skip through the input while we are looking at character data.


copyMultiByteCharData

private boolean copyMultiByteCharData(int b0)
                               throws java.lang.Exception

skipMultiByteCharData

private boolean skipMultiByteCharData(int b0)
                               throws java.lang.Exception

copyAsciiCharData

private int copyAsciiCharData()
                       throws java.lang.Exception

skipAsciiCharData

private int skipAsciiCharData()
                       throws java.lang.Exception

appendCharData

private void appendCharData(int ch)
                     throws java.lang.Exception

slowAppendCharData

private void slowAppendCharData(int ch)
                         throws java.lang.Exception

characters

private void characters(int offset,
                        int endOffset)
                 throws java.lang.Exception

whitespace

private void whitespace(int offset,
                        int endOffset)
                 throws java.lang.Exception

fillCurrentChunk

private int fillCurrentChunk()
                      throws java.lang.Exception

init

protected void init(XMLEntityHandler entityHandler,
                    org.apache.xerces.framework.XMLErrorReporter errorReporter,
                    boolean sendCharDataAsCharArray,
                    int lineNumber,
                    int columnNumber)

currentOffset

public int currentOffset()
Return the current offset within this reader.

Specified by:
currentOffset in interface XMLEntityHandler.EntityReader

getLineNumber

public int getLineNumber()
Return the line number of the current position within the document that we are processing.

Specified by:
getLineNumber in interface XMLEntityHandler.EntityReader

getColumnNumber

public int getColumnNumber()
Return the column number of the current position within the document that we are processing.

Specified by:
getColumnNumber in interface XMLEntityHandler.EntityReader
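
One possible way to track line and column positions over a UTF-8 byte stream is sketched below. It is not how this class does it: the fields fCarriageReturnCounter, fLinefeedCounter and fCharacterCounter suggest a different bookkeeping scheme that this page does not describe. The class PositionTracker is hypothetical. It treats CR, LF and CR LF each as one line break and counts one column per UTF-8 lead byte.

```java
// Hypothetical position tracker; not part of Xerces.
final class PositionTracker {
    private int line = 1;      // 1-based line number
    private int column = 1;    // 1-based column of the next character
    private boolean lastWasCR; // true if the previous byte was a carriage return

    // Feed one byte of input, given as a value in the range 0..255.
    void consume(int b) {
        if (b == '\r') {
            line++;
            column = 1;
            lastWasCR = true;
        } else if (b == '\n') {
            if (!lastWasCR) {
                line++; // a bare LF; the LF of a CR LF pair was already counted
            }
            column = 1;
            lastWasCR = false;
        } else {
            if ((b & 0xC0) != 0x80) {
                column++; // skip continuation bytes so one character is one column
            }
            lastWasCR = false;
        }
    }

    int getLineNumber() {
        return line;
    }

    int getColumnNumber() {
        return column;
    }
}
```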

setInCDSect

public void setInCDSect(boolean inCDSect)
This method is provided for scanner implementations.

Specified by:
setInCDSect in interface XMLEntityHandler.EntityReader

getInCDSect

public boolean getInCDSect()
This method is provided for scanner implementations.

Specified by:
getInCDSect in interface XMLEntityHandler.EntityReader