Name | Type | Default value | Description |
---|---|---|---|
preProcessorClassId | string | | Custom PreProcessor. Called at the end of the preprocess pipe. |
fetcherClassId | string | | Custom Fetcher. |
processorClassId | string | | Custom Processor. Called at the end of the process pipe. Catches all MIME types. |
htmlProcessorClassId | string | | Custom HTML Processor. Called at the end of the HTML process pipe. Catches only HTML documents. |
linksFilterClassId | string | | Custom LinksFilter. Called at the end of the links filter list. Can decide whether to crawl an outgoing link. |
postProcessorClassId | string | | Custom PostProcessor. Called at the end of the postprocess pipe. |
crawlerTemplate | string | | Alternatively, specify the URL of an XML file describing the whole crawler. |
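
The "called at the end of the list" behaviour described for `linksFilterClassId` can be pictured with a small sketch. This is not the crawler's actual API: `LinksFilter`, `same_host_only`, `no_pdf`, and `should_crawl` are hypothetical names, and the assumption that every filter in the list must accept a link for it to be crawled is ours, not something the table states.

```python
from typing import Callable, List, Optional
from urllib.parse import urlparse

# Hypothetical type: a links filter receives an outgoing URL and
# returns True if the link may be crawled.
LinksFilter = Callable[[str], bool]


def same_host_only(host: str) -> LinksFilter:
    """Stand-in for a built-in filter: accept only links on the given host."""
    def accept(url: str) -> bool:
        return urlparse(url).hostname == host
    return accept


def no_pdf(url: str) -> bool:
    """Stand-in for a custom filter: reject links whose path ends in .pdf."""
    return not urlparse(url).path.lower().endswith(".pdf")


def should_crawl(url: str,
                 built_in: List[LinksFilter],
                 custom: Optional[LinksFilter] = None) -> bool:
    """Run the built-in filters first, then the custom filter last.

    Assumption: a link is crawled only if every filter accepts it.
    """
    chain = list(built_in)
    if custom is not None:
        chain.append(custom)  # the custom filter sits at the end of the list
    return all(link_filter(url) for link_filter in chain)
```

In this sketch the custom filter only sees links that the earlier filters have already accepted, which is one plausible reading of "called at the end of the links filter list".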