java.lang.Object
fr.gouv.culture.sdx.utils.AbstractSdxObject
fr.gouv.culture.sdx.utils.database.DatabaseBacked
fr.gouv.culture.sdx.documentbase.AbstractDocumentBase
fr.gouv.culture.sdx.documentbase.SDXDocumentBase
fr.gouv.culture.sdx.documentbase.LuceneDocumentBase
public class LuceneDocumentBase
| Nested Class Summary |
|---|
| Nested classes/interfaces inherited from interface fr.gouv.culture.sdx.documentbase.SDXDocumentBaseTarget |
|---|
SDXDocumentBaseTarget.ConfigurationNode |
| Nested classes/interfaces inherited from interface fr.gouv.culture.sdx.documentbase.DocumentBase |
|---|
DocumentBase.ConfigurationNode |
| Field Summary | |
|---|---|
protected FieldList |
_fieldList
The (Lucene) fields that are to be handled by the index. |
protected java.util.HashMap |
_xmlFieldList
The list of fields with an XML type |
static java.lang.String |
DBELEM_ATTRIBUTE_REMOTE_ACCESS
The implied attribute stating whether this document base is to be exposed to remote access or not. |
static java.lang.String |
ELEMENT_NAME_LUCENE_SDX_INTERNAL_FIELDS
The element used to define system fields in sdx.xconf. |
protected java.lang.String |
INDEX_DIR_CURRENT
Directory names for indexes |
protected java.lang.String |
INDEX_DIR_MAIN
|
protected long |
lastDocCount
Number of documents indexed since the last split |
protected LuceneIndex |
luceneActiveIndex
The active index for this document base |
protected LuceneIndex |
luceneCurrentIndex
The temporary index for this document base |
protected java.util.Vector |
luceneSearchIndexList
The sub-indexes for this document base (first entry is the activeIndex) |
protected java.lang.String |
SEARCH_INDEX_DIRECTORY_NAME
The directory name for the index that stores documents' indexation. |
protected int |
subIndexCount
Number of subindexes |
| Fields inherited from class fr.gouv.culture.sdx.utils.database.DatabaseBacked |
|---|
_database, CLASS_NAME_SUFFIX, DATABASE_DIR_NAME, databaseConf, dbLocation, dbPath, DEFAULT_DATABASE_TYPE |
| Fields inherited from class fr.gouv.culture.sdx.utils.AbstractSdxObject |
|---|
_context, _description, _encoding, _id, _locale, _logger, _manager, _xmlizable_objects, _xmlLang, isToSaxInitialized |
| Fields inherited from interface fr.gouv.culture.sdx.documentbase.DocumentBase |
|---|
CLASS_NAME_SUFFIX, PACKAGE_QUALNAME |
| Fields inherited from interface fr.gouv.culture.sdx.utils.Encodable |
|---|
DEFAULT_ENCODING |
| Fields inherited from interface fr.gouv.culture.sdx.utils.save.Saveable |
|---|
ALL_SAVE_ATTRIB, PATH_ATTRIB, SAVE_DIRECTORY_PARAM |
| Constructor Summary | |
|---|---|
LuceneDocumentBase()
Creates the document base. |
|
| Method Summary | |
|---|---|
protected void |
addSubIndex()
Adds a split sub-index and updates the configuration afterwards |
protected void |
addSubIndex(LuceneIndex index)
Adds a split sub-index and updates the configuration afterwards |
protected void |
addToSearchIndex(java.lang.Object indexationDoc,
boolean batchIndex)
Writes a document to the search index |
void |
backup(SaveParameters save_config)
Saves the DocumentBase data objects |
protected void |
backupIndexes(SaveParameters save_config)
Saves the index files |
protected void |
backupTimeStamp(SaveParameters save_config)
Saves the timestamp files |
protected void |
compactSearchIndex()
|
void |
configure(org.apache.avalon.framework.configuration.Configuration configuration)
Sets the configuration options for this document base. |
protected void |
configureDocumentBase(org.apache.avalon.framework.configuration.Configuration configuration)
Configures the Lucene document base |
protected void |
configureFieldList(org.apache.avalon.framework.configuration.Configuration configuration)
Configures the fields list |
protected void |
configureOAIHarvester(org.apache.avalon.framework.configuration.Configuration configuration)
Configures the OAI harvester of this Lucene document base. |
protected void |
configureOAIRepositories(org.apache.avalon.framework.configuration.Configuration configuration)
Configures one or more OAI repositories. |
protected void |
configureOAIRepository(org.apache.avalon.framework.configuration.Configuration configuration)
Configures an OAIRepository based on the configuration element <oai-repository> |
protected void |
configureSearchIndex()
Configures Lucene search index |
OAIRepository |
createOAIRepository()
Creates the default OAIRepository for the documentbase, using the older configuration |
OAIRepository |
createOAIRepository(org.apache.avalon.framework.configuration.Configuration configuration)
Creates the OAIRepository for the document base, based on a configuration that must start with an <oai-repository> element |
OAIRepository |
createOAIRepository(java.lang.String repoId)
Creates an OAIRepository for the documentbase, using the older configuration |
java.util.Date |
creationDate()
Returns the creation date of the Lucene search index. |
void |
delete(Document[] docs,
org.xml.sax.ContentHandler handler)
Deletes documents from this base. |
protected void |
deleteFromSearchIndex(java.lang.String docId)
|
int |
docCount()
Returns the number of documents in all Lucene sub indexes. |
protected java.lang.String |
getFormatedSubIndexId(int subIndexNumber)
Gets the formatted sub-index number (used for directory names) |
Index |
getIndex()
Gets the Index object for indexing and searching. |
protected java.lang.Object |
getIndexationDocument(IndexableDocument doc,
java.lang.String storeDocId,
java.lang.String repoId,
IndexParameters params)
|
org.apache.lucene.index.IndexReader |
getIndexReader()
Returns the Lucene index reader for all the indexes of this document base. |
protected long |
getIndexSize(LuceneIndex index)
Returns the index size |
LuceneIndex |
getLuceneIndex()
|
org.apache.lucene.search.Searcher |
getSearcher()
Returns the Lucene index searcher for all the indexes of this document base. |
java.util.HashMap |
getXMLFieldList()
Returns the list of XML type fields |
void |
index(IndexableDocument[] docs,
Repository repository,
IndexParameters params,
org.xml.sax.ContentHandler handler)
Adds one or more indexable documents to the Lucene search index. |
void |
indexModified()
Updates the last-modification timestamp file |
void |
init()
Initializes the document base. |
protected void |
initializeVectorizedIndex()
Initializes the index vector by searching for all sub-indexes in its directory. NB: working as intended. |
protected boolean |
initToSax()
Initializes the LinkedHashMap _xmlizable_objects with the objects in order to describe them in XML |
protected void |
initVolatileObjectsToSax()
Initializes the LinkedHashMap _xmlizable_volatile_objects with the objects in order to describe them in XML. |
java.util.Date |
lastModificationDate()
Returns the last modification date of the Lucene search index. |
void |
mergeBatch()
Deprecated. This method is deprecated since SDX v. 2.3. Use mergeCurrentBatch() instead. |
void |
mergeCurrentBatch()
Merges a batch of documents (in memory) into the physical index on the file system and optimizes that index if necessary (depending on the autoOptimize attribute of the current document base). |
void |
optimize()
Optimizes the indexes, repositories, and system databases |
void |
reloadFieldList(java.lang.String appConfString)
Reloads the fieldList of an application |
protected void |
removeSubIndex()
Removes a split sub-index and updates the configuration afterwards. Currently unused, since there is no plan to remove sub-indexes; kept as a reminder for future functionality. |
protected void |
renewKeyIndex()
Refreshes data for the main and current index |
void |
replaceFieldList(FieldList fieldList)
Replaces the current fieldList by the new one |
void |
restore(SaveParameters save_config)
Restores the DocumentBase data objects |
protected void |
restoreIndexes(SaveParameters save_config)
Restores the index files |
protected void |
restoreTimeStamp(SaveParameters save_config)
Restores the timestamp files |
protected IndexParameters |
setBaseParameters(IndexParameters params)
Sets the default pipeline parameters and ensures the params have a pipeline |
protected void |
setSearchIndexParameters(LuceneIndexParameters params)
Sets the search index parameters for indexation performance |
boolean |
splitCheck(boolean currentIndex)
Tests the splitting conditions; returns true when they are reached. |
void |
splitIndex(boolean currentIndex)
Splits the current large index into two smaller ones |
| Methods inherited from class fr.gouv.culture.sdx.utils.database.DatabaseBacked |
|---|
configure, getClassNameSuffix, getDatabase |
| Methods inherited from class fr.gouv.culture.sdx.utils.AbstractSdxObject |
|---|
configureDescription, contextualize, enableLogging, getBaseAttributes, getContext, getDescription, getEncoding, getId, getLocale, getLog, getServiceManager, getXmlLang, service, setDescription, setEncoding, setId, setLocale, setUpSdxObject, setUpSdxObject, setXmlLang, toSAX, verifyConfigurationResources |
| Methods inherited from class java.lang.Object |
|---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Methods inherited from interface fr.gouv.culture.sdx.utils.SdxObject |
|---|
getLog |
| Methods inherited from interface org.apache.avalon.framework.logger.LogEnabled |
|---|
enableLogging |
| Methods inherited from interface org.apache.avalon.framework.context.Contextualizable |
|---|
contextualize |
| Methods inherited from interface org.apache.avalon.framework.service.Serviceable |
|---|
service |
| Methods inherited from interface fr.gouv.culture.sdx.utils.Identifiable |
|---|
getId, setId |
| Methods inherited from interface fr.gouv.culture.sdx.utils.Describable |
|---|
getDescription, setDescription |
| Methods inherited from interface fr.gouv.culture.sdx.utils.Encodable |
|---|
getEncoding, setEncoding |
| Methods inherited from interface fr.gouv.culture.sdx.utils.Localizable |
|---|
getLocale, getXmlLang, setLocale, setXmlLang |
| Methods inherited from interface org.apache.excalibur.xml.sax.XMLizable |
|---|
toSAX |
| Methods inherited from interface fr.gouv.culture.sdx.search.Searchable |
|---|
getId |
| Field Detail |
|---|
protected java.util.Vector luceneSearchIndexList
protected LuceneIndex luceneActiveIndex
protected LuceneIndex luceneCurrentIndex
protected FieldList _fieldList
protected java.util.HashMap _xmlFieldList
protected int subIndexCount
protected long lastDocCount
protected final java.lang.String INDEX_DIR_CURRENT
protected final java.lang.String INDEX_DIR_MAIN
protected final java.lang.String SEARCH_INDEX_DIRECTORY_NAME
public static final java.lang.String DBELEM_ATTRIBUTE_REMOTE_ACCESS
public static final java.lang.String ELEMENT_NAME_LUCENE_SDX_INTERNAL_FIELDS
| Constructor Detail |
|---|
public LuceneDocumentBase()
AbstractSdxObject.enableLogging(org.apache.avalon.framework.logger.Logger),
configure(org.apache.avalon.framework.configuration.Configuration),
init()
| Method Detail |
|---|
public void configure(org.apache.avalon.framework.configuration.Configuration configuration)
throws org.apache.avalon.framework.configuration.ConfigurationException
configure in interface org.apache.avalon.framework.configuration.Configurable
configure in class SDXDocumentBase
configuration - The configuration object from which to build a document base.
Sample configuration entry:
<sdx:documentBase sdx:id = "myDocumentBaseName" sdx:type = "lucene">
<sdx:fieldList xml:lang = "fr-FR" sdx:variant = "" sdx:analyzerConf = "" sdx:analyzerClass = "">
<sdx:field code = "fieldName" type = "word" xml:lang = "fr-FR" sdx:analyzerClass = "" sdx:analyzerConf = ""/>
<sdx:field code = "fieldName2" type = "field" xml:lang = "fr-FR" brief = "true"/>
<sdx:field code = "fieldName3" type = "date" xml:lang = "fr-FR"/>
<sdx:field code = "fieldName4" type = "unindexed" xml:lang = "fr-FR"/>
</sdx:fieldList>
<sdx:index>
<sdx:pipeline sdx:id = "sdxIndexationPipeline">
<sdx:transformation src = "path to stylesheet, can be absolute or relative to the directory containing this file" sdx:id = "step2" sdx:type = "xslt"/>
<sdx:transformation src = "path to stylesheet, can be absolute or relative to the directory containing this file" sdx:id = "step3" sdx:type = "xslt" keep = "true"/>
</sdx:pipeline>
</sdx:index>
<sdx:repositories>
<sdx:repository baseDirectory = "blah4" depth = "3" extent = "100" sdx:type = "FS" sdx:default = "true" sdx:id = "blah4"/>
<sdx:repository ref = "blah2"/>
</sdx:repositories>
</sdx:documentBase>
org.apache.avalon.framework.configuration.ConfigurationException
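To make the structure of the sample entry above more concrete, here is a minimal sketch that reads the field codes and types from a trimmed copy of it, using only the JDK's DOM parser. It does not use SDX or Avalon and is not how configure() itself processes the configuration. The class name FieldListReader and the namespace URI bound to the sdx prefix are assumptions made for the example; the sample entry does not show a namespace declaration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of SDX.
class FieldListReader {

    // Trimmed copy of the sample entry. The xmlns:sdx URI is an assumption:
    // a namespace-aware parser needs some declaration for the prefix.
    static final String SAMPLE =
        "<sdx:documentBase xmlns:sdx='http://www.culture.gouv.fr/ns/sdx/sdx'"
        + " sdx:id='myDocumentBaseName' sdx:type='lucene'>"
        + "<sdx:fieldList xml:lang='fr-FR'>"
        + "<sdx:field code='fieldName' type='word' xml:lang='fr-FR'/>"
        + "<sdx:field code='fieldName2' type='field' xml:lang='fr-FR' brief='true'/>"
        + "<sdx:field code='fieldName3' type='date' xml:lang='fr-FR'/>"
        + "<sdx:field code='fieldName4' type='unindexed' xml:lang='fr-FR'/>"
        + "</sdx:fieldList>"
        + "</sdx:documentBase>";

    /** Returns a map of field code to field type, in document order. */
    static Map<String, String> readFields(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        Map<String, String> fields = new LinkedHashMap<>();
        // "*" matches any namespace, so the lookup does not depend on the assumed URI.
        NodeList nodes = doc.getElementsByTagNameNS("*", "field");
        for (int i = 0; i < nodes.getLength(); i++) {
            Element field = (Element) nodes.item(i);
            fields.put(field.getAttribute("code"), field.getAttribute("type"));
        }
        return fields;
    }

    public static void main(String[] args) throws Exception {
        // Prints {fieldName=word, fieldName2=field, fieldName3=date, fieldName4=unindexed}
        System.out.println(readFields(SAMPLE));
    }
}
```

The four type values (word, field, date, unindexed) are taken from the sample entry; whether other types exist is not stated here.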
protected void configureDocumentBase(org.apache.avalon.framework.configuration.Configuration configuration)
throws org.apache.avalon.framework.configuration.ConfigurationException
configureDocumentBase in class SDXDocumentBase
configuration - Configuration
org.apache.avalon.framework.configuration.ConfigurationException
protected void configureFieldList(org.apache.avalon.framework.configuration.Configuration configuration)
throws org.apache.avalon.framework.configuration.ConfigurationException
configuration -
org.apache.avalon.framework.configuration.ConfigurationException
public void reloadFieldList(java.lang.String appConfString)
throws SDXException
appConfString - The path of the configuration file which contains the new fieldList (e.g., file:///myFiles/application.xconf, cocoon://myApplication/conf/application.xconf)
SDXException
public void replaceFieldList(FieldList fieldList)
throws org.apache.avalon.framework.configuration.ConfigurationException
fieldList - The new fieldList which replaces the old one
org.apache.avalon.framework.configuration.ConfigurationException
protected void configureSearchIndex()
throws org.apache.avalon.framework.configuration.ConfigurationException
org.apache.avalon.framework.configuration.ConfigurationException
public OAIRepository createOAIRepository(java.lang.String repoId)
createOAIRepository in class AbstractDocumentBase
repoId - String The id of the repository to create
public OAIRepository createOAIRepository()
createOAIRepository in interface DocumentBase
createOAIRepository in class AbstractDocumentBase
createOAIRepository(String)
public OAIRepository createOAIRepository(org.apache.avalon.framework.configuration.Configuration configuration)
configuration - The configuration
protected void configureOAIRepositories(org.apache.avalon.framework.configuration.Configuration configuration)
throws org.apache.avalon.framework.configuration.ConfigurationException
configureOAIRepositories in class SDXDocumentBase
configuration -
org.apache.avalon.framework.configuration.ConfigurationException
protected void configureOAIRepository(org.apache.avalon.framework.configuration.Configuration configuration)
throws org.apache.avalon.framework.configuration.ConfigurationException
configureOAIRepository in class SDXDocumentBase
configuration - The configuration
org.apache.avalon.framework.configuration.ConfigurationException
SDXDocumentBase.configureOAIRepository(org.apache.avalon.framework.configuration.Configuration)
protected void configureOAIHarvester(org.apache.avalon.framework.configuration.Configuration configuration)
throws org.apache.avalon.framework.configuration.ConfigurationException
configureOAIHarvester in class SDXDocumentBase
org.apache.avalon.framework.configuration.ConfigurationException
public void index(IndexableDocument[] docs,
Repository repository,
IndexParameters params,
org.xml.sax.ContentHandler handler)
throws SDXException,
org.xml.sax.SAXException,
org.apache.cocoon.ProcessingException
After adding the document to the search index, this method recycles the Lucene searcher if :
index in interface DocumentBase
index in class SDXDocumentBase
docs - The documents to add.
repository - The repository where to store the documents. If null is passed, the default repository will be used.
params - The parameters for this adding action.
handler - A content handler where to send information about the process (may be null)
TODO : what kind of "informations" ? -pb
SDXException
org.xml.sax.SAXException
org.apache.cocoon.ProcessingException
SDXDocumentBase.index(fr.gouv.culture.sdx.document.IndexableDocument[], fr.gouv.culture.sdx.repository.Repository, fr.gouv.culture.sdx.documentbase.IndexParameters, org.xml.sax.ContentHandler)
public void delete(Document[] docs,
org.xml.sax.ContentHandler handler)
throws SDXException,
org.xml.sax.SAXException,
org.apache.cocoon.ProcessingException
Deletes one or more documents from this LuceneDocumentBase and recycles the Lucene searcher if only one document is deleted or if the LuceneDocumentBase is not set to autoOptimize.
delete in interface DocumentBase
delete in class SDXDocumentBase
docs - The documents to delete.
handler - A content handler to feed with information.
SDXException
org.xml.sax.SAXException
org.apache.cocoon.ProcessingException
AbstractDocumentBase.delete(Document, ContentHandler)
protected IndexParameters setBaseParameters(IndexParameters params)
setBaseParameters in class SDXDocumentBase
params - The params object provided by the user at indexation time
public java.util.HashMap getXMLFieldList()
SDXDocumentBase
getXMLFieldList in class SDXDocumentBase
public Index getIndex()
public LuceneIndex getLuceneIndex()
protected void setSearchIndexParameters(LuceneIndexParameters params)
params - The Lucene-specific params to use
protected void addToSearchIndex(java.lang.Object indexationDoc,
boolean batchIndex)
throws SDXException
addToSearchIndex in class SDXDocumentBase
indexationDoc - The Document to add
batchIndex -
SDXException
protected void deleteFromSearchIndex(java.lang.String docId)
throws SDXException
deleteFromSearchIndex in class SDXDocumentBase
SDXException
protected void compactSearchIndex()
throws SDXException
compactSearchIndex in class SDXDocumentBase
SDXException
protected java.lang.Object getIndexationDocument(IndexableDocument doc,
java.lang.String storeDocId,
java.lang.String repoId,
IndexParameters params)
throws SDXException
getIndexationDocument in class SDXDocumentBase
SDXException
public java.util.Date lastModificationDate()
public java.util.Date creationDate()
public void init()
throws SDXException
DocumentBase
This method must be called after super.getLog() has been set and the configuration is done.
init in interface DocumentBase
init in class SDXDocumentBase
SDXException
protected boolean initToSax()
AbstractSdxObject
initToSax in class SDXDocumentBase
protected void initVolatileObjectsToSax()
Some objects need to be refreshed each time toSAX is called.
initVolatileObjectsToSax in class SDXDocumentBase
public void optimize()
optimize in interface DocumentBase
optimize in class SDXDocumentBase
public void mergeCurrentBatch()
Merges a batch of documents (in memory) into the physical index on the
file system and optimizes that index if necessary (depending on the
autoOptimize attribute of the current document base).
mergeCurrentBatch in class SDXDocumentBase
public void indexModified()
indexModified in class SDXDocumentBase
public void splitIndex(boolean currentIndex)
throws java.io.IOException,
SDXException
Splits the current large index into two smaller ones
splitIndex in class SDXDocumentBase
IOException, - SDXException
java.io.IOException
SDXException
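splitIndex is meant to be called once splitCheck reports that the splitting conditions are reached. The sketch below illustrates that check-then-split order on a toy model. It is not the SDX implementation: SplitSketch, its fields, and the document-count threshold are all hypothetical, the boolean currentIndex parameter is left out, and the real splitting conditions are not documented on this page.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a document base; only the names splitCheck,
// splitIndex and docCount are borrowed from the class documented here.
class SplitSketch {

    // Assumed splitting condition: a maximum document count per index.
    static final long MAX_DOCS_PER_INDEX = 4;

    // Document counts of the sub-indexes created by earlier splits.
    final List<Long> subIndexDocCounts = new ArrayList<>();

    // Document count of the index currently receiving documents.
    long activeDocCount = 0;

    /** Returns true when the (assumed) splitting condition is reached. */
    boolean splitCheck() {
        return activeDocCount >= MAX_DOCS_PER_INDEX;
    }

    /** Splits the active index into two smaller ones of roughly equal size. */
    void splitIndex() {
        long half = activeDocCount / 2;
        subIndexDocCounts.add(half); // one half becomes a new sub-index
        activeDocCount -= half;      // the rest stays in the active index
    }

    /** Adds one document, then applies the check-then-split order. */
    void add() {
        activeDocCount++;
        if (splitCheck()) {
            splitIndex();
        }
    }

    /** Total number of documents across the active index and all sub-indexes. */
    long docCount() {
        long total = activeDocCount;
        for (long count : subIndexDocCounts) {
            total += count;
        }
        return total;
    }

    public static void main(String[] args) {
        SplitSketch base = new SplitSketch();
        for (int i = 0; i < 6; i++) {
            base.add();
        }
        System.out.println(base.docCount() + " documents, "
            + base.subIndexDocCounts.size() + " sub-indexes");
    }
}
```

Keeping the check separate from the split lets a caller decide when a potentially expensive split is acceptable, for instance after a batch rather than after every document.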
protected void initializeVectorizedIndex()
throws org.apache.avalon.framework.configuration.ConfigurationException
Initializes the index vector by searching for all sub-indexes in its directory.
NB: working as intended.
org.apache.avalon.framework.configuration.ConfigurationException
protected void addSubIndex()
throws org.apache.avalon.framework.configuration.ConfigurationException
SDXException - If it's impossible to configure or initialize the sub-index to add.
org.apache.avalon.framework.configuration.ConfigurationException
protected void removeSubIndex()
public boolean splitCheck(boolean currentIndex)
throws SDXException
Returns true when the splitting conditions are reached. If so, it should be followed by a splitIndex() call. Controls order:
splitCheck in class SDXDocumentBase
currentIndex - boolean indicating whether the test concerns the current
index (true) or the active one (false)
true when the splitting conditions are reached,
false otherwise.
SDXException
protected long getIndexSize(LuceneIndex index)
index - LuceneIndex
public org.apache.lucene.search.Searcher getSearcher()
throws SDXException
Returns the index searcher for all the indexes of this document base.
SDXException - If it's not possible to build the MultiSearcher.
ParallelMultiSearcher
public org.apache.lucene.index.IndexReader getIndexReader()
throws SDXException
Returns the index reader for all the indexes of this document base.
SDXException - If it's not possible to build the MultiReader.
MultiReader
protected java.lang.String getFormatedSubIndexId(int subIndexNumber)
subIndexNumber - int representing the number of the sub-index
protected void addSubIndex(LuceneIndex index)
throws SDXException
index - LuceneIndex
SDXException - If it's not possible to configure and initialize the sub-index.
protected void renewKeyIndex()
throws SDXException
SDXException - If it's impossible to free resources or to
initialize the Lucene index.
public void backup(SaveParameters save_config)
throws SDXException
backup in interface Saveable
backup in class SDXDocumentBase
save_config - SaveParameters
SDXException
Saveable.backup(fr.gouv.culture.sdx.utils.save.SaveParameters)
protected void backupIndexes(SaveParameters save_config)
throws SDXException
backupIndexes in class SDXDocumentBase
SDXException
protected void backupTimeStamp(SaveParameters save_config)
throws SDXException
backupTimeStamp in class SDXDocumentBase
SDXException
public void restore(SaveParameters save_config)
throws SDXException
restore in interface Saveable
restore in class SDXDocumentBase
SDXException
Saveable.restore(fr.gouv.culture.sdx.utils.save.SaveParameters)
protected void restoreIndexes(SaveParameters save_config)
throws SDXException
restoreIndexes in class SDXDocumentBase
SDXException
protected void restoreTimeStamp(SaveParameters save_config)
throws SDXException
restoreTimeStamp in class SDXDocumentBase
SDXException
public int docCount()
public void mergeBatch()
throws SDXException
mergeBatch in class SDXDocumentBase
SDXException