java.lang.Object
|
+--fr.gouv.culture.sdx.utils.SdxObjectImpl
|
+--fr.gouv.culture.sdx.utils.database.DatabaseBacked
|
+--fr.gouv.culture.sdx.documentbase.AbstractDocumentBase
|
+--fr.gouv.culture.sdx.documentbase.SDXDocumentBase
|
+--fr.gouv.culture.sdx.documentbase.LuceneDocumentBase
A document base within an SDX application.
A document base is a central object in SDX development. It is where documents are searched and retrieved, and therefore also where they are added (indexed), deleted, or updated. A search cannot target a unit smaller than a document base; to exclude parts of a document base, use query constructions or, possibly, filters.
A document base has a structure, which is essentially a list of fields. An application may have many document bases, and these document bases may have different structures. As always, indexable documents (XML, HTML, or the like) with different structures can be indexed within a single document base.
Most applications will have only one document base, but in some cases more than one is useful. For example, when different kinds of documents are never searched at the same time, separating them into different document bases would speed up searching and indexing.
A document base uses an indexer to index documents. It uses repositories to store the documents, both indexable ones and attached ones (i.e. non-indexable documents that logically depend on the indexable documents, such as images). An application can get a searcher to perform searches within this document base, possibly together with other document bases.
To work properly, a document base must be instantiated in the following sequence: 1) creation, 2) setting the logger (optional, but recommended for error messages), 3) configuration, 4) initialization.
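The required call order can be illustrated with a small sketch. The class below is a hypothetical stand-in written for this illustration, not the SDX class itself: it uses plain strings where the real methods take an Avalon `Logger` and `Configuration`, and the check inside `init()` is an assumption about how a wrong order might be detected.

```java
import java.util.ArrayList;
import java.util.List;

class LifecycleSketch {

    /** Hypothetical stand-in; it only models the call order, not SDX behavior. */
    static class StandInDocumentBase {
        private final List<String> calls = new ArrayList<>();
        private boolean configured = false;

        // Step 2 (optional but recommended): attach a logger before configuring.
        void enableLogging(String loggerName) {
            calls.add("enableLogging");
        }

        // Step 3: read the configuration options.
        void configure(String configuration) {
            configured = true;
            calls.add("configure");
        }

        // Step 4: initialization is only valid once configuration is done.
        void init() {
            if (!configured) {
                throw new IllegalStateException("configure() must be called before init()");
            }
            calls.add("init");
        }

        List<String> calls() {
            return calls;
        }
    }

    /** Runs the four steps in order and returns the recorded method calls. */
    static List<String> bringUp() {
        StandInDocumentBase base = new StandInDocumentBase(); // Step 1: creation
        base.enableLogging("sdx.documentbase");               // placeholder logger name
        base.configure("placeholder configuration");          // placeholder configuration
        base.init();
        return base.calls();
    }

    public static void main(String[] args) {
        System.out.println(bringUp());
    }
}
```

With the real class, the same sequence would presumably be `new LuceneDocumentBase()`, then `enableLogging(Logger)`, `configure(Configuration)`, and `init()`, as listed in this page's method summary.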
See Also:
SdxObjectImpl.enableLogging(org.apache.avalon.framework.logger.Logger),
configure(org.apache.avalon.framework.configuration.Configuration),
init()
| Field Summary | |
protected org.apache.avalon.framework.configuration.Configuration |
configuration
The configuration object to be used for this document base. |
static java.lang.String |
DBELEM_ATTRIBUTE_REMOTE_ACCESS
The implied attribute stating whether this document base is to be exposed to remote access or not. |
static java.lang.String |
ELEMENT_NAME_FIELD_LIST
The element used to define the indexation field list. |
static java.lang.String |
ELEMENT_NAME_LUCENE_SDX_INTERNAL_FIELDS
The element used to define system fields in sdx.xconf. |
static java.lang.String |
FIELDS_DEFINITION
String representation for a key in the Properties object: the fields definition object. |
protected FieldsDefinition |
fieldsDef
The (Lucene) fields that are to be handled by the index. |
protected LuceneIndex |
luceneSearchIndex
The index for this document base: a Lucene one. |
protected java.lang.String |
SEARCH_INDEX_DIRECTORY_NAME
The directory name for the index that stores documents' indexation. |
| Fields inherited from class fr.gouv.culture.sdx.documentbase.SDXDocumentBase |
_documentAdditionStatus, DOC_ADD_STATUS_ADDED, DOC_ADD_STATUS_FAILURE, DOC_ADD_STATUS_IGNORED, DOC_ADD_STATUS_REFRESHED, DOC_ADD_STATUS_REPLACED, DOC_URL, DOCUMENTBASE_DIR_PATH, keepOriginalDocuments, SDX_DATABASE_FORMAT, SDX_DATABASE_VERSION, SDX_DATABASE_VERSION_2_3, SDX_DATE, SDX_DATE_MILLISECONDS, SDX_ISO8601_DATE, SDX_USER |
| Fields inherited from class fr.gouv.culture.sdx.utils.database.DatabaseBacked |
_manager, ATTRIBUTE_ID, CLASS_NAME_SUFFIX, database, DATABASE_DIR_NAME, databaseConf, dbLocation, dbPath, DEFAULT_DATABASE_TYPE, ELEMENT_NAME_DATABASE, id, PACKAGE_QUALNAME, props |
| Fields inherited from class fr.gouv.culture.sdx.utils.SdxObjectImpl |
encoding, logger |
| Fields inherited from interface fr.gouv.culture.sdx.documentbase.DocumentBase |
ATTRIBUTE_ID, ATTRIBUTE_TYPE, CLASS_NAME_SUFFIX, ELEMENT_NAME_DOCUMENT_BASE, ELEMENT_NAME_DOCUMENT_BASES, PACKAGE_QUALNAME |
| Constructor Summary | |
LuceneDocumentBase()
Creates the document base. |
|
| Method Summary | |
protected void |
addToSearchIndex(java.lang.Object indexationDoc,
boolean batchIndex)
Writes a document to the search index |
protected void |
compactSearchIndex()
|
void |
configure(org.apache.avalon.framework.configuration.Configuration configuration)
Sets the configuration options for this document base. |
protected void |
configureDocumentBase(org.apache.avalon.framework.configuration.Configuration configuration)
|
protected void |
configureFieldList(org.apache.avalon.framework.configuration.Configuration configuration)
|
protected void |
configureOAIHarvester(org.apache.avalon.framework.configuration.Configuration configuration)
|
protected void |
configureOAIRepository(org.apache.avalon.framework.configuration.Configuration configuration)
|
protected void |
configureSearchIndex()
|
java.util.Date |
creationDate()
|
void |
delete(Document[] docs,
org.xml.sax.ContentHandler handler)
Overrides the parent method only to add Lucene index optimization. |
protected void |
deleteFromSearchIndex(java.lang.String docId)
|
Index |
getIndex()
Gets the Index object for indexing and searching. |
protected java.lang.Object |
getIndexationDocument(IndexableDocument doc,
java.lang.String storeDocId,
java.lang.String repoId,
IndexParameters params)
|
void |
init()
Initializes the document base. |
java.util.Date |
lastModificationDate()
|
protected IndexParameters |
setBaseParameters(IndexParameters params)
Sets the default pipeline parameters and ensures that the params have a pipeline. |
protected void |
setSearchIndexParameters(LuceneIndexParameters params)
Sets the search index parameters for indexation performance |
| Methods inherited from class fr.gouv.culture.sdx.documentbase.SDXDocumentBase |
add, configureIdGenerator, configureOAIComponents, configureRepositories, delete, deleteIndexableDocumentComponents, deleteRelationsToMastersFromDatabase, getDocument, getDocument, getDocument, getDocument, getOwners, getRelated, getRepositoryConfigurationList, getRepositoryForDocument, getRepositoryForStorage, handleParameters, index, index, index, loadBaseConfiguration, rollbackIndexation |
| Methods inherited from class fr.gouv.culture.sdx.documentbase.AbstractDocumentBase |
addOaiDeletedRecord, configurePipeline, contextualize, createEntityForDocMetaData, delete, deletePhysicalDocument, getDefaultRepository, getId, getIndexationPipeline, getMimeType, getOAIHarvester, getOAIRepository, getPooledRepositoryConnection, getRepository, isDefault, optimizeDatabase, optimizeRepositories, releasePooledRepositoryConnections, removeOaiDeletedRecord, setId, toSAX |
| Methods inherited from class fr.gouv.culture.sdx.utils.database.DatabaseBacked |
compose, getDatabase, setProperties |
| Methods inherited from class fr.gouv.culture.sdx.utils.SdxObjectImpl |
enableLogging, getChildLogger, setEncoding |
| Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Methods inherited from interface fr.gouv.culture.sdx.documentbase.DocumentBase |
setProperties |
| Methods inherited from interface org.apache.avalon.framework.logger.LogEnabled |
enableLogging |
| Methods inherited from interface org.apache.avalon.framework.component.Composable |
compose |
| Field Detail |
protected org.apache.avalon.framework.configuration.Configuration configuration
protected LuceneIndex luceneSearchIndex
protected FieldsDefinition fieldsDef
public static final java.lang.String FIELDS_DEFINITION
protected final java.lang.String SEARCH_INDEX_DIRECTORY_NAME
public static final java.lang.String DBELEM_ATTRIBUTE_REMOTE_ACCESS
public static final java.lang.String ELEMENT_NAME_FIELD_LIST
public static final java.lang.String ELEMENT_NAME_LUCENE_SDX_INTERNAL_FIELDS
| Constructor Detail |
public LuceneDocumentBase()
See Also:
SdxObjectImpl.enableLogging(org.apache.avalon.framework.logger.Logger),
configure(org.apache.avalon.framework.configuration.Configuration),
init()
| Method Detail |
public void configure(org.apache.avalon.framework.configuration.Configuration configuration)
throws org.apache.avalon.framework.configuration.ConfigurationException
Specified by:
configure in interface org.apache.avalon.framework.configuration.Configurable
Overrides:
configure in class SDXDocumentBase
Parameters:
configuration - The configuration object from which to build a document base.
Sample configuration entry:
<sdx:documentBase sdx:id = "myDocumentBaseName" sdx:type = "lucene">
<sdx:fieldList xml:lang = "fr-FR" sdx:variant = "" sdx:analyzerConf = "" sdx:analyzerClass = "">
<sdx:field code = "fieldName" type = "word" xml:lang = "fr-FR" sdx:analyzerClass = "" sdx:analyzerConf = ""/>
<sdx:field code = "fieldName2" type = "field" xml:lang = "fr-FR" brief = "true"/>
<sdx:field code = "fieldName3" type = "date" xml:lang = "fr-FR"/>
<sdx:field code = "fieldName4" type = "unindexed" xml:lang = "fr-FR"/>
</sdx:fieldList>
<sdx:index>
<sdx:pipeline sdx:id = "sdxIndexationPipeline">
<sdx:transformation src = "path to stylesheet, can be absolute or relative to the directory containing this file" sdx:id = "step2" sdx:type = "xslt"/>
<sdx:transformation src = "path to stylesheet, can be absolute or relative to the directory containing this file" sdx:id = "step3" sdx:type = "xslt" keep = "true"/>
</sdx:pipeline>
</sdx:index>
<sdx:repositories>
<sdx:repository baseDirectory = "blah4" depth = "3" extent = "100" sdx:type = "FS" sdx:default = "true" sdx:id = "blah4"/>
<sdx:repository ref = "blah2"/>
</sdx:repositories>
</sdx:documentBase>
Throws:
org.apache.avalon.framework.configuration.ConfigurationException
We should link to this in the future when we have better documentation capabilities.
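As an illustration of the field list structure in the sample entry, the following sketch reads the `sdx:field` elements with the JDK's DOM parser. It is not how SDX itself processes the configuration (SDX receives an Avalon `Configuration` object); the class name, the trimmed copy of the sample, and the `code:type` output format are assumptions made for this example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class FieldListSketch {

    // Trimmed copy of the sample entry: only the field list is kept.
    static final String SAMPLE =
          "<sdx:documentBase sdx:id=\"myDocumentBaseName\" sdx:type=\"lucene\">"
        + "<sdx:fieldList xml:lang=\"fr-FR\">"
        + "<sdx:field code=\"fieldName\" type=\"word\" xml:lang=\"fr-FR\"/>"
        + "<sdx:field code=\"fieldName2\" type=\"field\" xml:lang=\"fr-FR\" brief=\"true\"/>"
        + "<sdx:field code=\"fieldName3\" type=\"date\" xml:lang=\"fr-FR\"/>"
        + "<sdx:field code=\"fieldName4\" type=\"unindexed\" xml:lang=\"fr-FR\"/>"
        + "</sdx:fieldList>"
        + "</sdx:documentBase>";

    /** Returns one "code:type" entry per sdx:field element, in document order. */
    static List<String> readFields(String xml) {
        try {
            // The factory is left in its default, non-namespace-aware mode, so
            // "sdx:field" is matched as a literal tag name and the sample's
            // undeclared "sdx" prefix does not cause a parse error.
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList fields = doc.getElementsByTagName("sdx:field");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < fields.getLength(); i++) {
                Element field = (Element) fields.item(i);
                result.add(field.getAttribute("code") + ":" + field.getAttribute("type"));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the sample configuration", e);
        }
    }

    public static void main(String[] args) {
        for (String entry : readFields(SAMPLE)) {
            System.out.println(entry);
        }
    }
}
```

Each entry pairs a field's `code` with its `type` (`word`, `field`, `date`, or `unindexed` in the sample), which is the information a fields definition would need at minimum.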
protected void configureDocumentBase(org.apache.avalon.framework.configuration.Configuration configuration)
throws org.apache.avalon.framework.configuration.ConfigurationException
Overrides:
configureDocumentBase in class SDXDocumentBase
Throws:
org.apache.avalon.framework.configuration.ConfigurationException
protected void configureFieldList(org.apache.avalon.framework.configuration.Configuration configuration)
throws org.apache.avalon.framework.configuration.ConfigurationException
Throws:
org.apache.avalon.framework.configuration.ConfigurationException
protected void configureSearchIndex()
throws org.apache.avalon.framework.configuration.ConfigurationException
Throws:
org.apache.avalon.framework.configuration.ConfigurationException
protected void configureOAIRepository(org.apache.avalon.framework.configuration.Configuration configuration)
throws org.apache.avalon.framework.configuration.ConfigurationException
Overrides:
configureOAIRepository in class SDXDocumentBase
Throws:
org.apache.avalon.framework.configuration.ConfigurationException
protected void configureOAIHarvester(org.apache.avalon.framework.configuration.Configuration configuration)
throws org.apache.avalon.framework.configuration.ConfigurationException
Overrides:
configureOAIHarvester in class SDXDocumentBase
Throws:
org.apache.avalon.framework.configuration.ConfigurationException
public void delete(Document[] docs,
org.xml.sax.ContentHandler handler)
throws SDXException,
org.xml.sax.SAXException,
org.apache.cocoon.ProcessingException
Specified by:
delete in interface DocumentBase
Overrides:
delete in class SDXDocumentBase
Parameters:
docs - The documents to delete.
handler - A content handler to feed with information.
Throws:
SDXException
org.xml.sax.SAXException
org.apache.cocoon.ProcessingException
protected IndexParameters setBaseParameters(IndexParameters params)
Overrides:
setBaseParameters in class SDXDocumentBase
Parameters:
params - The params object provided by the user at indexation time
public Index getIndex()
protected void setSearchIndexParameters(LuceneIndexParameters params)
Parameters:
params - The Lucene-specific parameters to use
protected void addToSearchIndex(java.lang.Object indexationDoc,
boolean batchIndex)
throws SDXException
Overrides:
addToSearchIndex in class SDXDocumentBase
Parameters:
indexationDoc - The Document to add
batchIndex -
Throws:
SDXException
protected void deleteFromSearchIndex(java.lang.String docId)
throws SDXException
Overrides:
deleteFromSearchIndex in class SDXDocumentBase
Throws:
SDXException
protected void compactSearchIndex()
throws SDXException
Overrides:
compactSearchIndex in class SDXDocumentBase
Throws:
SDXException
protected java.lang.Object getIndexationDocument(IndexableDocument doc,
java.lang.String storeDocId,
java.lang.String repoId,
IndexParameters params)
throws SDXException
Overrides:
getIndexationDocument in class SDXDocumentBase
Throws:
SDXException
public java.util.Date lastModificationDate()
public java.util.Date creationDate()
public void init()
throws SDXException
Description copied from interface: DocumentBase
This method must be called after the logger has been set and the configuration done.
Specified by:
init in interface DocumentBase
Overrides:
init in class SDXDocumentBase
Throws:
SDXException