public class CompositeDocumentFactory extends AbstractDocumentFactory
Factories can be composed. A composite factory passes the input stream given to getDocument(InputStream, Reference2ObjectMap) to each of the underlying factories in turn, after calling InputStream.reset().
Document sequences using composite factories must pass to getDocument(InputStream, Reference2ObjectMap) a MultipleInputStream that can be reset enough times.
Note that in general composite factories support only sequential access to field content (although skipping items is allowed).
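The reset-and-reparse behaviour described above can be pictured with a minimal, self-contained sketch that uses only java.io. It is not the library's implementation: the Parser interface and the parseWithAll method are hypothetical stand-ins for the underlying factories and the composite, and the sketch assumes the stream is reset before each underlying factory reads it.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class ResetSketch {
    /** Hypothetical stand-in for an underlying factory: reads a stream and returns some content. */
    interface Parser {
        String parse(InputStream in) throws IOException;
    }

    /** Passes the same stream to each parser in turn, resetting it before every pass. */
    static List<String> parseWithAll(InputStream in, List<Parser> parsers) throws IOException {
        List<String> results = new ArrayList<>();
        for (Parser p : parsers) {
            in.reset(); // rewind so that every parser sees the content from the start
            results.add(p.parse(in));
        }
        return results;
    }

    public static void main(String[] args) throws IOException {
        // A ByteArrayInputStream can be reset any number of times (assumption: it plays the
        // role of a stream "that can be reset enough times").
        InputStream in = new ByteArrayInputStream("hello world".getBytes(StandardCharsets.UTF_8));
        Parser upper = s -> new String(s.readAllBytes(), StandardCharsets.UTF_8).toUpperCase(Locale.ROOT);
        Parser length = s -> Integer.toString(s.readAllBytes().length);
        System.out.println(parseWithAll(in, List.of(upper, length))); // [HELLO WORLD, 11]
    }
}
```

Each parser consumes the whole stream, which is why the stream must tolerate one reset per underlying factory.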
| Modifier and Type | Class and Description |
|---|---|
| protected class | CompositeDocumentFactory.CompositeDocument: A document obtained by composition of documents of underlying factories. |

Inherited nested types: DocumentFactory.FieldType

| Modifier | Constructor and Description |
|---|---|
| protected | CompositeDocumentFactory(DocumentFactory[] documentFactory, String[] fieldName): Creates a new composite document factory using the factories in a given array. |
| Modifier and Type | Method and Description |
|---|---|
| CompositeDocumentFactory | copy() |
| int | fieldIndex(String fieldName): Returns the index of a field, given its symbolic name. |
| String | fieldName(int field): Returns the symbolic name of a field. |
| DocumentFactory.FieldType | fieldType(int field): Returns the type of a field. |
| Document | getDocument(InputStream rawContent, Reference2ObjectMap<Enum<?>,Object> metadata): Returns the document obtained by parsing the given byte stream. |
| static DocumentFactory | getFactory(DocumentFactory... documentFactory): Returns a document factory composing the given document factories. |
| static DocumentFactory | getFactory(DocumentFactory[] documentFactory, String[] fieldName): Returns a document factory composing the given document factories. |
| int | numberOfFields(): Returns the number of fields present in the documents produced by this factory. |
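The field-related methods summarized above (numberOfFields(), fieldName(int), fieldIndex(String)) and the optional renaming accepted by getFactory can be pictured with a small stand-alone sketch. It does not use the library: the Fields interface and the of, compose and fieldIndex helpers are hypothetical, and the sketch assumes that a composite simply concatenates the fields of its components in order and that a non-null array of names replaces all field names. Returning -1 for an unknown name is likewise a choice made for this sketch only.

```java
import java.util.ArrayList;
import java.util.List;

public class CompositeFieldsSketch {
    /** Hypothetical, simplified stand-in for the field-related part of a document factory. */
    interface Fields {
        int numberOfFields();
        String fieldName(int field);
    }

    /** Builds a simple Fields instance from a list of names. */
    static Fields of(String... names) {
        return new Fields() {
            public int numberOfFields() { return names.length; }
            public String fieldName(int field) { return names[field]; }
        };
    }

    /** Concatenates the fields of the components; newNames (possibly null) renames all of them. */
    static Fields compose(Fields[] components, String[] newNames) {
        List<String> all = new ArrayList<>();
        for (Fields f : components)
            for (int i = 0; i < f.numberOfFields(); i++) all.add(f.fieldName(i));
        if (newNames == null) return of(all.toArray(new String[0]));
        if (newNames.length != all.size())
            throw new IllegalArgumentException("Expected " + all.size() + " field names");
        return of(newNames);
    }

    /** Linear search for a field by name; returns -1 if absent (a choice made for this sketch). */
    static int fieldIndex(Fields f, String name) {
        for (int i = 0; i < f.numberOfFields(); i++)
            if (f.fieldName(i).equals(name)) return i;
        return -1;
    }

    public static void main(String[] args) {
        Fields html = of("title", "text");
        Fields mail = of("subject");
        Fields plain = compose(new Fields[] { html, mail }, null);
        Fields renamed = compose(new Fields[] { html, mail }, new String[] { "t", "body", "s" });
        System.out.println(plain.numberOfFields());      // 3
        System.out.println(plain.fieldName(2));          // subject
        System.out.println(fieldIndex(renamed, "body")); // 1
    }
}
```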
Inherited methods: ensureFieldIndex, toString

protected CompositeDocumentFactory(DocumentFactory[] documentFactory, String[] fieldName)
documentFactory - an array of document factories that will be composed.
fieldName - an array of names for the resulting fields, or null.

public CompositeDocumentFactory copy()

public static DocumentFactory getFactory(DocumentFactory[] documentFactory, String[] fieldName)
By passing an optional array of field names, it is possible to rename the fields of the composing factories.
documentFactory - an array of document factories that will be composed.
fieldName - an array of names for the resulting fields, or null.

public static DocumentFactory getFactory(DocumentFactory... documentFactory)
documentFactory - document factories that will be composed.

public int numberOfFields()
Specified by: DocumentFactory

public String fieldName(int field)
Specified by: DocumentFactory
field - the index of a field (between 0 inclusive and DocumentFactory.numberOfFields() exclusive).
Returns: the symbolic name of the field-th field.

public int fieldIndex(String fieldName)
Specified by: DocumentFactory
fieldName - the name of a field of this factory.
Returns: the index of the field named fieldName.

public DocumentFactory.FieldType fieldType(int field)
Specified by: DocumentFactory
The possible types are defined in DocumentFactory.FieldType.
field - the index of a field (between 0 inclusive and DocumentFactory.numberOfFields() exclusive).
Returns: the type of the field-th field.

public Document getDocument(InputStream rawContent, Reference2ObjectMap<Enum<?>,Object> metadata) throws IOException
Specified by: DocumentFactory
The metadata parameter compensates for the lack of a simple keyword-based parameter-passing system in Java. This method might take several different types of “suggestions” that have been gathered by the collection: typically, the document title, a URI representing the document, its MIME type, its encoding and so on. Some of this information might be set by default (as happens, for instance, in a PropertyBasedDocumentFactory).
Implementations of this method must consult the metadata provided by the collection, possibly complete them with default factory metadata, and proceed to the document construction.
rawContent - the raw content from which the document should be extracted; it must not be closed, as resource management is a responsibility of the DocumentCollection.
metadata - a map from enums (e.g., keys taken in PropertyBasedDocumentFactory) to various kinds of objects.
Throws: IOException
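
The rule that an implementation consults the collection-provided metadata and completes it with factory defaults can be sketched as follows. This is only an illustration under stated assumptions: the Key enum and the resolve method are hypothetical, and java.util.IdentityHashMap is used as a plain-JDK stand-in for Reference2ObjectMap, since both compare keys by reference.

```java
import java.util.IdentityHashMap;
import java.util.Map;

public class MetadataSketch {
    /** Hypothetical metadata keys; the real keys are defined by the library. */
    enum Key { TITLE, ENCODING }

    /** Returns the collection-provided value for key if present, and the factory default otherwise. */
    static Object resolve(Map<Enum<?>, Object> provided, Map<Enum<?>, Object> defaults, Enum<?> key) {
        Object value = provided.get(key);
        return value != null ? value : defaults.get(key);
    }

    public static void main(String[] args) {
        // Metadata passed by the collection for one specific document.
        Map<Enum<?>, Object> provided = new IdentityHashMap<>();
        provided.put(Key.TITLE, "Some title");

        // Defaults held by the factory (assumption: a factory keeps such a map).
        Map<Enum<?>, Object> defaults = new IdentityHashMap<>();
        defaults.put(Key.TITLE, "Untitled");
        defaults.put(Key.ENCODING, "UTF-8");

        System.out.println(resolve(provided, defaults, Key.TITLE));    // Some title
        System.out.println(resolve(provided, defaults, Key.ENCODING)); // UTF-8
    }
}
```

Keeping the per-document map separate from the defaults means a suggestion from the collection always takes precedence, while missing entries still resolve to a usable value.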