If you do not find answers to your questions here, please ask your question on GitHub and it may find its way here.
KeepOnlyTagger will help you
produce just the fields you want. The latter is probably the one you want
in most cases. The following shows how to eliminate all fields from a document
except the document reference, keywords, and description fields:
<importer>
  <postParseHandlers>
    <tagger class="com.norconex.importer.handler.tagger.impl.KeepOnlyTagger">
      <fields>document.reference, keywords, description</fields>
    </tagger>
  </postParseHandlers>
</importer>
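To make the effect of the configuration above concrete, here is a minimal Java sketch of the "keep only" idea using plain `java.util` collections. It does not use the Norconex Importer API: the class `KeepOnlySketch`, the method `keepOnly`, and the sample metadata values are hypothetical names made up for illustration, and the real tagger may differ in details such as how field names are matched.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; this is NOT the Norconex KeepOnlyTagger class.
public class KeepOnlySketch {

    // Returns a copy of the metadata that holds only the fields named in
    // fieldsToKeep. The original map is left untouched.
    public static Map<String, List<String>> keepOnly(
            Map<String, List<String>> metadata, Set<String> fieldsToKeep) {
        Map<String, List<String>> kept = new LinkedHashMap<>(metadata);
        // retainAll on the key set removes every entry whose key is not listed.
        kept.keySet().retainAll(fieldsToKeep);
        return kept;
    }

    public static void main(String[] args) {
        // Sample metadata (assumed values) as it might look after parsing.
        Map<String, List<String>> metadata = new LinkedHashMap<>();
        metadata.put("document.reference", Arrays.asList("http://example.com/page.html"));
        metadata.put("title", Arrays.asList("Sample page"));
        metadata.put("keywords", Arrays.asList("crawler", "importer"));
        metadata.put("description", Arrays.asList("A sample description."));
        metadata.put("Content-Type", Arrays.asList("text/html"));

        Set<String> keep = new LinkedHashSet<>(
                Arrays.asList("document.reference", "keywords", "description"));

        // Only the three listed fields remain; title and Content-Type are dropped.
        System.out.println(keepOnly(metadata, keep).keySet());
    }
}
```

Listing the fields to keep, rather than the fields to drop, means any unexpected field a document happens to carry is removed without further configuration.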
Norconex Collectors need a database to store key reference information about each collected document (URL, path, etc.). Four implementations are offered out-of-the-box: MVStore, MapDB, MongoDB, and JDBC (Derby or H2). Prior to version 2.5.0 of both the HTTP and Filesystem collectors, MapDB was the default implementation. As of version 2.5.0 of these collectors, MVStore is the default. Using the default implementation does not require explicit configuration. The following will help you decide which one is right for you: