NativeTextExtractor (NETVIBES Exalead CloudView Clients SDK)

java.lang.Object
- com.exalead.indexing.analysis.v10.DocumentProcessor
- - com.exalead.indexing.analysis.v10.NativeTextExtractor

All Implemented Interfaces:

com.exalead.util.Checkable, java.io.Serializable
```
public class NativeTextExtractor
extends DocumentProcessor
implements com.exalead.util.Checkable, java.io.Serializable
```
Extraction is performed for the following data types:
- text/plain for text files.
- text/html for HTML files.
- application/x-exalead-document for the CloudView 4.6 document format (com.exalead.document).
- application/x-exalead-ndoc for the CloudView 5 internal document format (binary).
- application/x-exalead-ndoc-v10+xml for the CloudView internal document format (XML).
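The list above can be turned into a simple pre-check before handing a document to the extractor. The following is a minimal sketch only: `NativeMimeTypes` and `isSupported` are hypothetical names that are not part of the CloudView SDK, and stripping Content-Type parameters such as `charset` is an assumption about how callers would pass the value.

```java
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper, not part of the CloudView SDK. */
final class NativeMimeTypes {
    // MIME types copied from the list of data types documented for NativeTextExtractor.
    static final Set<String> SUPPORTED = Set.of(
        "text/plain",
        "text/html",
        "application/x-exalead-document",
        "application/x-exalead-ndoc",
        "application/x-exalead-ndoc-v10+xml");

    /** Returns true if the Content-Type value names one of the listed types. */
    static boolean isSupported(String contentType) {
        if (contentType == null) {
            return false;
        }
        // Assumption: drop parameters such as "; charset=UTF-8" and ignore case.
        int semi = contentType.indexOf(';');
        String base = (semi >= 0 ? contentType.substring(0, semi) : contentType)
            .trim()
            .toLowerCase(Locale.ROOT);
        return SUPPORTED.contains(base);
    }

    public static void main(String[] args) {
        System.out.println(isSupported("text/html; charset=UTF-8")); // true
        System.out.println(isSupported("application/pdf"));          // false
    }
}
```

A check like this only mirrors the documented list; the extractor itself remains the authority on what it accepts.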
See Also:

Serialized Form

Nested Class Summary
- Nested classes/interfaces inherited from class com.exalead.indexing.analysis.v10.DocumentProcessor
  DocumentProcessor.FromDataModel, DocumentProcessor.Transformer<T>

Field Summary

Fields
Modifier and Type	Field and Description
`boolean`	`annotateHTML`
`static boolean`	`DEFAULT_ANNOTATE_H_T_M_L`
`static boolean`	`DEFAULT_DISABLE_AUTOMATIC_H_T_M_L_D_T_D_FIX`
`static boolean`	`DEFAULT_EXTRACT_H_T_M_L_FORMS`
`static boolean`	`DEFAULT_EXTRACT_H_T_M_L_STYLES`
`static boolean`	`DEFAULT_EXTRACT_H_T_M_L_TABLES`
`static boolean`	`DEFAULT_EXTRACT_JS`
`static int`	`DEFAULT_MAX_H_T_M_L_ANNOTATION_DEPTH`
`static boolean`	`DEFAULT_SKIP_INVISIBLE_H_T_M_L_TEXT`
`boolean`	`disableAutomaticHTMLDTDFix`
`boolean`	`extractHTMLForms`
`boolean`	`extractHTMLStyles`
`boolean`	`extractHTMLTables`
`boolean`	`extractJs`
`int`	`maxHTMLAnnotationDepth`
`boolean`	`skipInvisibleHTMLText`

Fields inherited from class com.exalead.indexing.analysis.v10.DocumentProcessor
acceptCondition, dataModelClass, dataModelProperty, dataModelState, DEFAULT_DISABLED, disabled, fromDataModel, name

Constructor Summary

Constructors
Constructor and Description
`NativeTextExtractor()`
`NativeTextExtractor(NativeTextExtractor o)` Copy constructor

Method Summary

Modifier and Type	Method and Description
`<T> T`	`accept(DocumentProcessor.Transformer<T> transformer, T[] t)`
`void`	`check(boolean deep, java.lang.String errorContext)` Checks this NativeTextExtractor.
`static NativeTextExtractor`	`fromString(java.lang.String s)` Creates a NativeTextExtractor from its string representation.
`int`	`getMaxHTMLAnnotationDepth()` Prevents new annotations from being created deeper than the HTML nesting level maxHTMLAnnotationDepth.
`boolean`	`isAnnotateHTML()` Adds style annotations to DocumentChunks (for HTML files only), such as html:p for DocumentChunks generated from <p>.
`boolean`	`isDisableAutomaticHTMLDTDFix()` Disables automatic DTD fix on HTML documents.
`boolean`	`isExtractHTMLForms()` Adds annotations on forms and select elements.
`boolean`	`isExtractHTMLStyles()` Adds annotations on style attributes.
`boolean`	`isExtractHTMLTables()` Adds annotations on table, tr, td, th.
`boolean`	`isExtractJs()` Tries to parse JavaScript and then extract links.
`boolean`	`isSkipInvisibleHTMLText()` Skips the invisible text.
`NativeTextExtractor`	`makeCopy()` Creates and returns a deep copy of this NativeTextExtractor.
`static NativeTextExtractor`	`readFrom(java.io.InputStream is)` Reads a NativeTextExtractor from an XML fragment.
`void`	`setAnnotateHTML(boolean annotateHTML)` Adds style annotations to DocumentChunks (for HTML files only), such as html:p for DocumentChunks generated from <p>.
`void`	`setDisableAutomaticHTMLDTDFix(boolean disableAutomaticHTMLDTDFix)` Disables automatic DTD fix on HTML documents.
`void`	`setExtractHTMLForms(boolean extractHTMLForms)` Adds annotations on forms and select elements.
`void`	`setExtractHTMLStyles(boolean extractHTMLStyles)` Adds annotations on style attributes.
`void`	`setExtractHTMLTables(boolean extractHTMLTables)` Adds annotations on table, tr, td, th.
`void`	`setExtractJs(boolean extractJs)` Tries to parse JavaScript and then extract links.
`void`	`setMaxHTMLAnnotationDepth(int maxHTMLAnnotationDepth)` Prevents new annotations from being created deeper than the HTML nesting level maxHTMLAnnotationDepth.
`void`	`setSkipInvisibleHTMLText(boolean skipInvisibleHTMLText)` Skips the invisible text.
`java.lang.String`	`toString()` String representation of this NativeTextExtractor.
`NativeTextExtractor`	`withAcceptCondition(AcceptCondition acceptCondition)`
`NativeTextExtractor`	`withAnnotateHTML(boolean annotateHTML)`
`NativeTextExtractor`	`withAnnotateHTML(java.lang.Boolean annotateHTML)`
`NativeTextExtractor`	`withDataModelClass(java.lang.String dataModelClass)`
`NativeTextExtractor`	`withDataModelProperty(java.lang.String dataModelProperty)`
`NativeTextExtractor`	`withDataModelState(java.lang.String dataModelState)`
`NativeTextExtractor`	`withDisableAutomaticHTMLDTDFix(boolean disableAutomaticHTMLDTDFix)`
`NativeTextExtractor`	`withDisableAutomaticHTMLDTDFix(java.lang.Boolean disableAutomaticHTMLDTDFix)`
`NativeTextExtractor`	`withDisabled(boolean disabled)`
`NativeTextExtractor`	`withDisabled(java.lang.Boolean disabled)`
`NativeTextExtractor`	`withExtractHTMLForms(boolean extractHTMLForms)`
`NativeTextExtractor`	`withExtractHTMLForms(java.lang.Boolean extractHTMLForms)`
`NativeTextExtractor`	`withExtractHTMLStyles(boolean extractHTMLStyles)`
`NativeTextExtractor`	`withExtractHTMLStyles(java.lang.Boolean extractHTMLStyles)`
`NativeTextExtractor`	`withExtractHTMLTables(boolean extractHTMLTables)`
`NativeTextExtractor`	`withExtractHTMLTables(java.lang.Boolean extractHTMLTables)`
`NativeTextExtractor`	`withExtractJs(boolean extractJs)`
`NativeTextExtractor`	`withExtractJs(java.lang.Boolean extractJs)`
`NativeTextExtractor`	`withFromDataModel(DocumentProcessor fromDataModel)`
`NativeTextExtractor`	`withMaxHTMLAnnotationDepth(int maxHTMLAnnotationDepth)`
`NativeTextExtractor`	`withMaxHTMLAnnotationDepth(java.lang.Integer maxHTMLAnnotationDepth)`
`NativeTextExtractor`	`withName(java.lang.String name)`
`NativeTextExtractor`	`withSkipInvisibleHTMLText(boolean skipInvisibleHTMLText)`
`NativeTextExtractor`	`withSkipInvisibleHTMLText(java.lang.Boolean skipInvisibleHTMLText)`
`void`	`writeTo(java.io.OutputStream os)` Writes this NativeTextExtractor as an XML fragment.

Methods inherited from class com.exalead.indexing.analysis.v10.DocumentProcessor
getAcceptCondition, getDataModelClass, getDataModelProperty, getDataModelState, getFromDataModel, getName, isDisabled, setAcceptCondition, setDataModelClass, setDataModelProperty, setDataModelState, setDisabled, setFromDataModel, setName

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait

- Field Detail
  - annotateHTML
```
public boolean annotateHTML
```
  - DEFAULT_ANNOTATE_H_T_M_L
```
public static final boolean DEFAULT_ANNOTATE_H_T_M_L
```
    See Also:
    
    Constant Field Values
  - skipInvisibleHTMLText
```
public boolean skipInvisibleHTMLText
```
  - DEFAULT_SKIP_INVISIBLE_H_T_M_L_TEXT
```
public static final boolean DEFAULT_SKIP_INVISIBLE_H_T_M_L_TEXT
```
    See Also:
    
    Constant Field Values
  - extractJs
```
public boolean extractJs
```
  - DEFAULT_EXTRACT_JS
```
public static final boolean DEFAULT_EXTRACT_JS
```
    See Also:
    
    Constant Field Values
  - extractHTMLTables
```
public boolean extractHTMLTables
```
  - DEFAULT_EXTRACT_H_T_M_L_TABLES
```
public static final boolean DEFAULT_EXTRACT_H_T_M_L_TABLES
```
    See Also:
    
    Constant Field Values
  - extractHTMLStyles
```
public boolean extractHTMLStyles
```
  - DEFAULT_EXTRACT_H_T_M_L_STYLES
```
public static final boolean DEFAULT_EXTRACT_H_T_M_L_STYLES
```
    See Also:
    
    Constant Field Values
  - extractHTMLForms
```
public boolean extractHTMLForms
```
  - DEFAULT_EXTRACT_H_T_M_L_FORMS
```
public static final boolean DEFAULT_EXTRACT_H_T_M_L_FORMS
```
    See Also:
    
    Constant Field Values
  - maxHTMLAnnotationDepth
```
public int maxHTMLAnnotationDepth
```
  - DEFAULT_MAX_H_T_M_L_ANNOTATION_DEPTH
```
public static final int DEFAULT_MAX_H_T_M_L_ANNOTATION_DEPTH
```
    See Also:
    
    Constant Field Values
  - disableAutomaticHTMLDTDFix
```
public boolean disableAutomaticHTMLDTDFix
```
  - DEFAULT_DISABLE_AUTOMATIC_H_T_M_L_D_T_D_FIX
```
public static final boolean DEFAULT_DISABLE_AUTOMATIC_H_T_M_L_D_T_D_FIX
```
    See Also:
    
    Constant Field Values
- Constructor Detail
  - NativeTextExtractor
```
public NativeTextExtractor()
```
  - NativeTextExtractor
```
public NativeTextExtractor(NativeTextExtractor o)
```
    Copy constructor
- Method Detail
  - withAcceptCondition
```
public NativeTextExtractor withAcceptCondition(AcceptCondition acceptCondition)
```
    Overrides:
    
    withAcceptCondition in class DocumentProcessor
  - withName
```
public NativeTextExtractor withName(java.lang.String name)
```
    Overrides:
    
    withName in class DocumentProcessor
  - withDataModelState
```
public NativeTextExtractor withDataModelState(java.lang.String dataModelState)
```
    Overrides:
    
    withDataModelState in class DocumentProcessor
  - withFromDataModel
```
public NativeTextExtractor withFromDataModel(DocumentProcessor fromDataModel)
```
  - withDataModelClass
```
public NativeTextExtractor withDataModelClass(java.lang.String dataModelClass)
```
    Overrides:
    
    withDataModelClass in class DocumentProcessor
  - withDataModelProperty
```
public NativeTextExtractor withDataModelProperty(java.lang.String dataModelProperty)
```
    Overrides:
    
    withDataModelProperty in class DocumentProcessor
  - withDisabled
```
public NativeTextExtractor withDisabled(boolean disabled)
```
    Overrides:
    
    withDisabled in class DocumentProcessor
  - withDisabled
```
public NativeTextExtractor withDisabled(java.lang.Boolean disabled)
```
    Overrides:
    
    withDisabled in class DocumentProcessor
  - setAnnotateHTML
```
public void setAnnotateHTML(boolean annotateHTML)
```
    Adds style annotations to DocumentChunks (for HTML files only):
    - html:p for DocumentChunks generated from <p>
    - html:row for DocumentChunks generated from <tr>
    - html:column for DocumentChunks generated from <td> or <th>
    - html:table for DocumentChunks generated from <table>
    - html:h1 for DocumentChunks generated from <h1>
    - html:h2 for DocumentChunks generated from <h2>
    - html:h3 for DocumentChunks generated from <h3>
    - html:h4 for DocumentChunks generated from <h4>
    - html:h5 for DocumentChunks generated from <h5>
    - html:h6 for DocumentChunks generated from <h6>
    - html:link for DocumentChunks generated from <a>, <iframe> or <frame>, further annotated by:
      - html:link:rel if the link has a "rel" attribute
      - html:link:name if the link has a "name" attribute
    - html:list for DocumentChunks generated from <ul>, <ol> or <dl>
    - html:item for DocumentChunks generated from <li>
    - html:bold for DocumentChunks generated from <b> or <strong>
    - html:italic for DocumentChunks generated from <i> or <em>
    - html:underline for DocumentChunks generated from <u>
    - html:strike for DocumentChunks generated from <s> or <strike>
    - html:pre for DocumentChunks generated from <pre>
    - html:invisible for DocumentChunks containing invisible text (display: none, white on white)
    - html:class for DocumentChunks taken in a CSS class
    - html:id for DocumentChunks taken in a CSS id
    - html:img:src for DocumentChunks created from a <img>
    It also creates specific HTML DocumentChunks with the following contexts:
    - html:lang when parsing a <html> containing the "lang" attribute
    - html:xml:lang when parsing a <html> containing the "xml:lang" attribute
    - html:title when parsing a <title>
    - html:title:other when parsing a second <title>
    - html:base:href when parsing a <base>
    - html:link when parsing a <link> containing the "src" attribute, annotated by:
      - html:link:rel if the link has a "rel" attribute
      - html:link:type if the link has a "type" attribute
    - html:http-equiv:NAME when parsing a http-equiv meta
    - html:meta:NAME when parsing a meta named "NAME"
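    The tag-to-annotation pairs listed above can be summarized as a lookup table. This is a minimal sketch for readers who need to interpret the annotations downstream: `HtmlAnnotationNames` and `forTag` are hypothetical names, not SDK classes, and the table covers only the element-based annotations, not html:invisible, html:class, html:id or the context chunks.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Hypothetical lookup table, not part of the CloudView SDK. */
final class HtmlAnnotationNames {
    // Element name -> annotation name, transcribed from the documented list.
    private static final Map<String, String> BY_TAG = Map.ofEntries(
        Map.entry("p", "html:p"),
        Map.entry("tr", "html:row"),
        Map.entry("td", "html:column"),
        Map.entry("th", "html:column"),
        Map.entry("table", "html:table"),
        Map.entry("a", "html:link"),
        Map.entry("iframe", "html:link"),
        Map.entry("frame", "html:link"),
        Map.entry("ul", "html:list"),
        Map.entry("ol", "html:list"),
        Map.entry("dl", "html:list"),
        Map.entry("li", "html:item"),
        Map.entry("b", "html:bold"),
        Map.entry("strong", "html:bold"),
        Map.entry("i", "html:italic"),
        Map.entry("em", "html:italic"),
        Map.entry("u", "html:underline"),
        Map.entry("s", "html:strike"),
        Map.entry("strike", "html:strike"),
        Map.entry("pre", "html:pre"),
        Map.entry("img", "html:img:src"));

    /** Returns the documented annotation for an element name, if any. */
    static Optional<String> forTag(String tag) {
        String t = tag.toLowerCase(Locale.ROOT);
        // <h1>..<h6> map to html:h1..html:h6.
        if (t.matches("h[1-6]")) {
            return Optional.of("html:" + t);
        }
        return Optional.ofNullable(BY_TAG.get(t));
    }

    public static void main(String[] args) {
        System.out.println(forTag("TD"));  // Optional[html:column]
        System.out.println(forTag("div")); // Optional.empty
    }
}
```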
  - isAnnotateHTML
```
public boolean isAnnotateHTML()
```
    Returns whether style annotations are added to DocumentChunks (for HTML files only). When enabled, the annotations are:
    - html:p for DocumentChunks generated from <p>
    - html:row for DocumentChunks generated from <tr>
    - html:column for DocumentChunks generated from <td> or <th>
    - html:table for DocumentChunks generated from <table>
    - html:h1 for DocumentChunks generated from <h1>
    - html:h2 for DocumentChunks generated from <h2>
    - html:h3 for DocumentChunks generated from <h3>
    - html:h4 for DocumentChunks generated from <h4>
    - html:h5 for DocumentChunks generated from <h5>
    - html:h6 for DocumentChunks generated from <h6>
    - html:link for DocumentChunks generated from <a>, <iframe> or <frame>, further annotated by:
      - html:link:rel if the link has a "rel" attribute
      - html:link:name if the link has a "name" attribute
    - html:list for DocumentChunks generated from <ul>, <ol> or <dl>
    - html:item for DocumentChunks generated from <li>
    - html:bold for DocumentChunks generated from <b> or <strong>
    - html:italic for DocumentChunks generated from <i> or <em>
    - html:underline for DocumentChunks generated from <u>
    - html:strike for DocumentChunks generated from <s> or <strike>
    - html:pre for DocumentChunks generated from <pre>
    - html:invisible for DocumentChunks containing invisible text (display: none, white on white)
    - html:class for DocumentChunks taken in a CSS class
    - html:id for DocumentChunks taken in a CSS id
    - html:img:src for DocumentChunks created from a <img>
    It also creates specific HTML DocumentChunks with the following contexts:
    - html:lang when parsing a <html> containing the "lang" attribute
    - html:xml:lang when parsing a <html> containing the "xml:lang" attribute
    - html:title when parsing a <title>
    - html:title:other when parsing a second <title>
    - html:base:href when parsing a <base>
    - html:link when parsing a <link> containing the "src" attribute, annotated by:
      - html:link:rel if the link has a "rel" attribute
      - html:link:type if the link has a "type" attribute
    - html:http-equiv:NAME when parsing a http-equiv meta
    - html:meta:NAME when parsing a meta named "NAME"
  - withAnnotateHTML
```
public NativeTextExtractor withAnnotateHTML(boolean annotateHTML)
```
  - withAnnotateHTML
```
public NativeTextExtractor withAnnotateHTML(java.lang.Boolean annotateHTML)
```
  - setSkipInvisibleHTMLText
```
public void setSkipInvisibleHTMLText(boolean skipInvisibleHTMLText)
```
    Skips invisible text, for example white text on a white background (for HTML files only).
  - isSkipInvisibleHTMLText
```
public boolean isSkipInvisibleHTMLText()
```
    Skips invisible text, for example white text on a white background (for HTML files only).
  - withSkipInvisibleHTMLText
```
public NativeTextExtractor withSkipInvisibleHTMLText(boolean skipInvisibleHTMLText)
```
  - withSkipInvisibleHTMLText
```
public NativeTextExtractor withSkipInvisibleHTMLText(java.lang.Boolean skipInvisibleHTMLText)
```
  - setExtractJs
```
public void setExtractJs(boolean extractJs)
```
    Tries to parse JavaScript and then extract links.
  - isExtractJs
```
public boolean isExtractJs()
```
    Tries to parse JavaScript and then extract links.
  - withExtractJs
```
public NativeTextExtractor withExtractJs(boolean extractJs)
```
  - withExtractJs
```
public NativeTextExtractor withExtractJs(java.lang.Boolean extractJs)
```
  - setExtractHTMLTables
```
public void setExtractHTMLTables(boolean extractHTMLTables)
```
    Adds annotations on table, tr, td, th.
  - isExtractHTMLTables
```
public boolean isExtractHTMLTables()
```
    Adds annotations on table, tr, td, th.
  - withExtractHTMLTables
```
public NativeTextExtractor withExtractHTMLTables(boolean extractHTMLTables)
```
  - withExtractHTMLTables
```
public NativeTextExtractor withExtractHTMLTables(java.lang.Boolean extractHTMLTables)
```
  - setExtractHTMLStyles
```
public void setExtractHTMLStyles(boolean extractHTMLStyles)
```
    Adds annotations on style attributes.
  - isExtractHTMLStyles
```
public boolean isExtractHTMLStyles()
```
    Adds annotations on style attributes.
  - withExtractHTMLStyles
```
public NativeTextExtractor withExtractHTMLStyles(boolean extractHTMLStyles)
```
  - withExtractHTMLStyles
```
public NativeTextExtractor withExtractHTMLStyles(java.lang.Boolean extractHTMLStyles)
```
  - setExtractHTMLForms
```
public void setExtractHTMLForms(boolean extractHTMLForms)
```
    Adds annotations on forms and select elements.
  - isExtractHTMLForms
```
public boolean isExtractHTMLForms()
```
    Adds annotations on forms and select elements.
  - withExtractHTMLForms
```
public NativeTextExtractor withExtractHTMLForms(boolean extractHTMLForms)
```
  - withExtractHTMLForms
```
public NativeTextExtractor withExtractHTMLForms(java.lang.Boolean extractHTMLForms)
```
  - setMaxHTMLAnnotationDepth
```
public void setMaxHTMLAnnotationDepth(int maxHTMLAnnotationDepth)
```
    Prevents new annotations from being created deeper than the HTML nesting level maxHTMLAnnotationDepth.
  - getMaxHTMLAnnotationDepth
```
public int getMaxHTMLAnnotationDepth()
```
    Prevents new annotations from being created deeper than the HTML nesting level maxHTMLAnnotationDepth.
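    To make the idea of a depth cutoff concrete, here is a minimal sketch of how such a limit could behave. It is an illustration only and not the SDK's algorithm: `DepthLimitSketch` and `annotatedTags` are hypothetical names, and both the level numbering (outermost element is level 1) and the inclusive cutoff are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of a depth cutoff; not the SDK's implementation. */
final class DepthLimitSketch {
    /**
     * Walks a flat list of tag events ("<p>" opens, "</p>" closes) and returns
     * the opening tags whose nesting level is at most maxDepth.
     * Assumption: the outermost element is at level 1 and the limit is inclusive.
     */
    static List<String> annotatedTags(List<String> events, int maxDepth) {
        List<String> annotated = new ArrayList<>();
        int depth = 0;
        for (String event : events) {
            if (event.startsWith("</")) {
                // Closing tag: leave the current level, never going below zero.
                depth = Math.max(0, depth - 1);
            } else {
                // Opening tag: enter a new level and annotate only within the limit.
                depth++;
                if (depth <= maxDepth) {
                    annotated.add(event);
                }
            }
        }
        return annotated;
    }

    public static void main(String[] args) {
        List<String> events = List.of(
            "<table>", "<tr>", "<td>", "<b>", "</b>", "</td>", "</tr>", "</table>");
        System.out.println(annotatedTags(events, 2)); // [<table>, <tr>]
    }
}
```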
  - withMaxHTMLAnnotationDepth
```
public NativeTextExtractor withMaxHTMLAnnotationDepth(int maxHTMLAnnotationDepth)
```
  - withMaxHTMLAnnotationDepth
```
public NativeTextExtractor withMaxHTMLAnnotationDepth(java.lang.Integer maxHTMLAnnotationDepth)
```
  - setDisableAutomaticHTMLDTDFix
```
public void setDisableAutomaticHTMLDTDFix(boolean disableAutomaticHTMLDTDFix)
```
    Disables automatic DTD fix on HTML documents.
  - isDisableAutomaticHTMLDTDFix
```
public boolean isDisableAutomaticHTMLDTDFix()
```
    Disables automatic DTD fix on HTML documents.
  - withDisableAutomaticHTMLDTDFix
```
public NativeTextExtractor withDisableAutomaticHTMLDTDFix(boolean disableAutomaticHTMLDTDFix)
```
  - withDisableAutomaticHTMLDTDFix
```
public NativeTextExtractor withDisableAutomaticHTMLDTDFix(java.lang.Boolean disableAutomaticHTMLDTDFix)
```
  - makeCopy
```
public NativeTextExtractor makeCopy()
```
    Creates and returns a deep copy of this NativeTextExtractor.
    
    Overrides:
    
    makeCopy in class DocumentProcessor
  - readFrom
```
public static NativeTextExtractor readFrom(java.io.InputStream is)
                                    throws javax.xml.bind.JAXBException
```
    Reads a NativeTextExtractor from an XML fragment.
    
    Throws:
    
    javax.xml.bind.JAXBException
  - writeTo
```
public void writeTo(java.io.OutputStream os)
             throws javax.xml.bind.JAXBException,
                    java.io.IOException
```
    Writes this NativeTextExtractor as an XML fragment.
    
    Overrides:
    
    writeTo in class DocumentProcessor
    
    Throws:
    
    javax.xml.bind.JAXBException
    
    java.io.IOException
  - fromString
```
public static NativeTextExtractor fromString(java.lang.String s)
                                      throws javax.xml.bind.JAXBException,
                                             java.io.UnsupportedEncodingException
```
    Creates a NativeTextExtractor from its string representation.
    
    Throws:
    
    javax.xml.bind.JAXBException
    
    java.io.UnsupportedEncodingException
  - toString
```
public java.lang.String toString()
```
    String representation of this NativeTextExtractor.
    
    Overrides:
    
    toString in class DocumentProcessor
  - check
```
public void check(boolean deep,
                  java.lang.String errorContext)
           throws com.exalead.util.TypedException
```
    Checks this NativeTextExtractor.
    
    Specified by:
    
    check in interface com.exalead.util.Checkable
    
    Overrides:
    
    check in class DocumentProcessor
    
    Throws:
    
    com.exalead.util.TypedException
  - accept
```
public <T> T accept(DocumentProcessor.Transformer<T> transformer,
                    T[] t)
             throws com.exalead.util.TypedException
```
    Specified by:
    
    accept in class DocumentProcessor
    
    Throws:
    
    com.exalead.util.TypedException
