Package org.jsoup.parser
Class Tokeniser
java.lang.Object
org.jsoup.parser.Tokeniser
Reads the input stream into tokens.
-
Field Summary
Fields
Modifier and Type, Field
(package private) final Token.Character charPending
private int charStartPos
private final int[] codepointHolder
(package private) final Token.Comment commentPending
(package private) final TokenData dataBuffer
(package private) final Token.Doctype doctypePending
private Token emitPending
(package private) final Token.EndTag endPending
private final ParseErrorList errors
private boolean isEmitPending
private String lastStartCloseSeq
private String lastStartTag
private int markupStartPos
private final int[] multipointHolder
private static final char[] notCharRefCharsSorted
private final CharacterReader reader
(package private) static final char replacementChar
(package private) final Token.StartTag startPending
private TokeniserState state
(package private) final Document.OutputSettings.Syntax syntax
(package private) Token.Tag tagPending
(package private) static final int[] win1252Extensions
(package private) static final int win1252ExtensionsStart
(package private) final Token.XmlDecl xmlDeclPending
-
Constructor Summary
Constructors
(package private) Tokeniser(TreeBuilder treeBuilder)
-
Method Summary
Modifier and Type, Method, Description
(package private) void advanceTransition(TokeniserState newState)
(package private) String appropriateEndTagName()
(package private) String appropriateEndTagSeq()
    Returns the closer sequence </lastStart
private void characterReferenceError(String message, Object... args)
(package private) int[] consumeCharacterReference(Character additionalAllowedCharacter, boolean inAttribute)
    Tries to consume a character reference, and returns: null if nothing, int[1], or int[2].
(package private) void createBogusCommentPending()
(package private) void createCommentPending()
(package private) void createDoctypePending()
(package private) Token.Tag createTagPending(boolean start)
(package private) void createTempBuffer()
(package private) Token.XmlDecl createXmlDeclPending(boolean isDeclaration)
(package private) void emit(char c)
(package private) void emit(int[] codepoints)
(package private) void emit(…)
(package private) void emit(…)
(package private) void emitCommentPending()
(package private) void emitDoctypePending()
(package private) void emitTagPending()
(package private) void eofError(TokeniserState state)
(package private) void error(…)
(package private) void error(…)
(package private) void error(TokeniserState state)
(package private) boolean isAppropriateEndTagToken()
(package private) Token read()
(package private) void transition(TokeniserState newState)
(package private) String unescapeEntities(boolean inAttribute)
    Utility method to consume reader and unescape entities found within.
-
Field Details
-
replacementChar
static final char replacementChar
-
notCharRefCharsSorted
private static final char[] notCharRefCharsSorted -
win1252ExtensionsStart
static final int win1252ExtensionsStart
-
win1252Extensions
static final int[] win1252Extensions -
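The names win1252ExtensionsStart and win1252Extensions suggest a lookup table indexed from a start offset. The HTML specification remaps numeric character references in the range 0x80 to 0x9F to the characters that Windows-1252 places there, and the following is a minimal sketch of that lookup. The table values follow the specification; the start value of 0x80, the class name Win1252RemapSketch, and the method remap are assumptions for illustration and are not taken from this page.

```java
final class Win1252RemapSketch {
    // Assumed first code point of the remapped range (hypothetical stand-in for win1252ExtensionsStart).
    static final int START = 0x80;

    // Replacement code points for 0x80..0x9F (hypothetical stand-in for win1252Extensions).
    static final int[] EXTENSIONS = {
        0x20AC, 0x0081, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
        0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0x008D, 0x017D, 0x008F,
        0x0090, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
        0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0x009D, 0x017E, 0x0178
    };

    // Returns the remapped code point for values inside the table's range,
    // and the input unchanged for everything else.
    static int remap(int codepoint) {
        if (codepoint >= START && codepoint < START + EXTENSIONS.length) {
            return EXTENSIONS[codepoint - START];
        }
        return codepoint;
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(remap(0x80))); // prints "20ac" (euro sign)
        System.out.println(Integer.toHexString(remap(0x41))); // prints "41" (unchanged)
    }
}
```

Under these assumptions, a reference such as `&#128;` would decode to the euro sign rather than a C1 control character.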
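reader
private final CharacterReader reader -
errors
private final ParseErrorList errors -
state
private TokeniserState state -
emitPending
private Token emitPending -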
isEmitPending
private boolean isEmitPending -
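dataBuffer
final TokenData dataBuffer -
syntax
final Document.OutputSettings.Syntax syntax -
startPending
final Token.StartTag startPending -
endPending
final Token.EndTag endPending -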
tagPending
Token.Tag tagPending -
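charPending
final Token.Character charPending -
doctypePending
final Token.Doctype doctypePending -
commentPending
final Token.Comment commentPending -
xmlDeclPending
final Token.XmlDecl xmlDeclPending -
lastStartTag
private String lastStartTag -
lastStartCloseSeq
private String lastStartCloseSeq -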
markupStartPos
private int markupStartPos -
charStartPos
private int charStartPos -
codepointHolder
private final int[] codepointHolder -
multipointHolder
private final int[] multipointHolder
-
-
Constructor Details
-
Tokeniser
Tokeniser(TreeBuilder treeBuilder)
-
-
Method Details
-
read
Token read() -
emit
-
emit
-
emit
void emit(char c) -
emit
void emit(int[] codepoints) -
transition
void transition(TokeniserState newState) -
advanceTransition
void advanceTransition(TokeniserState newState) -
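consumeCharacterReference
int[] consumeCharacterReference(Character additionalAllowedCharacter, boolean inAttribute)
Tries to consume a character reference. Returns null if nothing was consumed, otherwise an int[1] or int[2] holding the decoded code point(s). -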
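The three possible results (null, int[1], int[2]) mean a caller has to handle the "nothing consumed" case separately from the one- and two-code-point cases. The following is a minimal sketch of such handling, assuming that a null result leaves the literal '&' in the output. The class CharRefResultSketch and its append method are hypothetical and not part of jsoup.

```java
final class CharRefResultSketch {
    // Appends the outcome of a character-reference lookup to the output.
    // Assumption: null means no reference was consumed, so the '&' is kept as text.
    static void append(StringBuilder out, int[] codepoints) {
        if (codepoints == null) {
            out.append('&');
        } else {
            // Works for both int[1] and int[2]: append each code point in order.
            for (int cp : codepoints) {
                out.appendCodePoint(cp);
            }
        }
    }

    public static void main(String[] args) {
        StringBuilder out = new StringBuilder();
        append(out, null);                        // no reference found
        append(out, new int[] {0x3C});            // one code point, e.g. for &lt;
        append(out, new int[] {0x2242, 0x0338});  // two code points, e.g. for &NotEqualTilde;
        System.out.println(out.length());         // prints 4
    }
}
```

A two-element result is needed because some named references expand to a base character followed by a combining mark.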
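createTagPending
Token.Tag createTagPending(boolean start) -
createXmlDeclPending
Token.XmlDecl createXmlDeclPending(boolean isDeclaration) -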
emitTagPending
void emitTagPending() -
createCommentPending
void createCommentPending() -
emitCommentPending
void emitCommentPending() -
createBogusCommentPending
void createBogusCommentPending() -
createDoctypePending
void createDoctypePending() -
emitDoctypePending
void emitDoctypePending() -
createTempBuffer
void createTempBuffer() -
isAppropriateEndTagToken
boolean isAppropriateEndTagToken() -
appropriateEndTagName
String appropriateEndTagName() -
appropriateEndTagSeq
String appropriateEndTagSeq()
Returns the closer sequence </lastStart -
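A closer sequence of the form "</" plus the last start tag's name is what a tokeniser would scan for to find the end of raw-text content such as a script body. The following is a minimal sketch of building and searching for that sequence, assuming a case-insensitive match. The class EndTagSeqSketch and both methods are hypothetical illustrations, not the jsoup implementation.

```java
final class EndTagSeqSketch {
    // Builds the closer sequence for a given start tag name, e.g. "script" -> "</script".
    static String closerSequence(String lastStartTagName) {
        return "</" + lastStartTagName;
    }

    // Returns the index of the first case-insensitive occurrence of the closer
    // sequence in the input, or -1 if it does not occur.
    static int indexOfCloser(String input, String lastStartTagName) {
        String seq = closerSequence(lastStartTagName);
        for (int i = 0; i + seq.length() <= input.length(); i++) {
            if (input.regionMatches(true, i, seq, 0, seq.length())) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // The '<' inside the script body is skipped; only "</SCRIPT" matches.
        System.out.println(indexOfCloser("if (a < b) {}</SCRIPT>", "script")); // prints 13
    }
}
```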
error
-
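eofError
void eofError(TokeniserState state) -
characterReferenceError
private void characterReferenceError(String message, Object... args) -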
error
-
error
-
unescapeEntities
String unescapeEntities(boolean inAttribute)
Utility method to consume the reader and unescape any entities found within.
Parameters:
inAttribute - if the text to be unescaped is in an attribute
Returns:
unescaped string from reader
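To make the idea of unescaping concrete, the following is a simplified, self-contained sketch that decodes a handful of named entities and numeric references in a string. It is not the jsoup implementation: the class UnescapeSketch, its four-entry entity table, and the requirement that every reference end in ';' are assumptions, and the attribute-specific behaviour selected by inAttribute is left out.

```java
import java.util.Map;

final class UnescapeSketch {
    // A deliberately tiny named-entity table; HTML defines many more.
    static final Map<String, String> NAMED = Map.of(
        "amp", "&", "lt", "<", "gt", ">", "quot", "\"");

    // Replaces recognised "&name;", "&#NN;" and "&#xHH;" references; leaves anything else as-is.
    static String unescape(String input) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            int semi = (c == '&') ? input.indexOf(';', i + 1) : -1;
            String decoded = (semi > i + 1) ? decode(input.substring(i + 1, semi)) : null;
            if (decoded != null) {
                out.append(decoded);
                i = semi + 1;   // skip past the whole reference
            } else {
                out.append(c);  // ordinary character, or an '&' that starts no known reference
                i++;
            }
        }
        return out.toString();
    }

    // Decodes the text between '&' and ';', or returns null if it is not recognised.
    static String decode(String body) {
        if (body.startsWith("#")) {
            try {
                boolean hex = body.length() > 1
                    && (body.charAt(1) == 'x' || body.charAt(1) == 'X');
                int cp = hex ? Integer.parseInt(body.substring(2), 16)
                             : Integer.parseInt(body.substring(1), 10);
                return Character.isValidCodePoint(cp)
                    ? new String(Character.toChars(cp)) : null;
            } catch (NumberFormatException e) {
                return null;
            }
        }
        return NAMED.get(body);
    }

    public static void main(String[] args) {
        System.out.println(unescape("a &lt; b &amp;&amp; c &#65; &#x42; &bogus; &"));
        // prints: a < b && c A B &bogus; &
    }
}
```

Unrecognised references are passed through unchanged, so input that merely contains a stray '&' is not corrupted.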
-