
Chapter 6. Manual index changes

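Hibernate Search detects the changes Hibernate Core makes to the database and updates the index automatically (unless the EventListeners are disabled). It also supports updating the index manually for cases that automatic indexing does not cover. For example, when data is restored into the database from a backup, the index has to be built by hand.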

6.1. Adding instances to the index

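FullTextSession.index(T entity) adds or updates a specific entity instance in the index directly. If the entity was already indexed, its index entry is updated. The index changes only when the transaction is committed.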

Example 6.1. Indexing an entity via FullTextSession.index(T entity)

FullTextSession fullTextSession = Search.getFullTextSession(session);
Transaction tx = fullTextSession.beginTransaction();
Object customer = fullTextSession.load( Customer.class, 8 );
fullTextSession.index(customer);
tx.commit(); //index only updated at commit time

To add all instances of a type, or of all indexed types, the recommended approach is a MassIndexer; see Section 6.3.2, “Using a MassIndexer” for more details.

6.2. Deleting instances from the index

Entities can be removed from the index through the API. This operation is called purging, and it is also done through FullTextSession.

Example 6.2. Purging a specific instance of an entity from the index

FullTextSession fullTextSession = Search.getFullTextSession(session);
Transaction tx = fullTextSession.beginTransaction();
for (Customer customer : customers) {
    fullTextSession.purge( Customer.class, customer.getId() );
}
tx.commit(); //index is updated at commit time

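Purging removes the entity with the given id from the index; it does not affect the database.

To remove all instances of a type from the index, use the purgeAll method.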

Example 6.3. Purging all instances of an entity from the index

FullTextSession fullTextSession = Search.getFullTextSession(session);
Transaction tx = fullTextSession.beginTransaction();
fullTextSession.purgeAll( Customer.class );
//optionally optimize the index
//fullTextSession.getSearchFactory().optimize( Customer.class );
tx.commit(); //index changes are applied at commit time

As with FullTextSession.index(T entity), purge and purgeAll are explicit indexing operations, so a registered EntityIndexingInterceptor is not applied to them. See Section 4.5, “Conditional indexing: to index or not based on entity state”.

Note

FullTextEntityManager also provides the index, purge and purgeAll methods.

Note

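All manual indexing methods (index, purge and purgeAll) affect only the index, not the database. They are nevertheless transactional: the requested work is applied only when the transaction is committed or flushToIndexes is called.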

6.3. Rebuilding the whole index

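If the mapping between entities and the index changes, for example when a new searchable field is added, the index has to be rebuilt. The same applies when new data is imported into the database. There are two ways to rebuild the index:

Call FullTextSession.index() on all entities while calling FullTextSession.flushToIndexes() periodically.
Use a MassIndexer.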

6.3.1. Using flushToIndexes()

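This strategy removes the existing index with FullTextSession.purgeAll() and then adds all entities back with FullTextSession.index(). It has memory and efficiency constraints. Index operations wait in a queue until they are applied, so when indexing a large amount of data you risk running out of memory unless you empty the queue periodically with flushToIndexes(). Each call to flushToIndexes(), or a commit(), applies the queued changes to the index, and changes that have been applied cannot be rolled back.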

Example 6.4. Index rebuilding using index() and flushToIndexes()

fullTextSession.setFlushMode(FlushMode.MANUAL);
fullTextSession.setCacheMode(CacheMode.IGNORE);
transaction = fullTextSession.beginTransaction();
//Scrollable results will avoid loading too many objects in memory
ScrollableResults results = fullTextSession.createCriteria( Email.class )
    .setFetchSize(BATCH_SIZE)
    .scroll( ScrollMode.FORWARD_ONLY );
int index = 0;
while( results.next() ) {
    index++;
    fullTextSession.index( results.get(0) ); //index each element
    if (index % BATCH_SIZE == 0) {
        fullTextSession.flushToIndexes(); //apply changes to indexes
        fullTextSession.clear(); //free memory since the queue is processed
    }
}
transaction.commit();

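To avoid running out of memory, bound the work with BATCH_SIZE, which the example uses both for setFetchSize() and for the flush interval. A larger BATCH_SIZE fetches objects from the database faster but needs more memory.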
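The memory argument behind Example 6.4 can be illustrated without Hibernate. The following is a minimal plain-Java sketch, not Hibernate Search code: BatchingQueue is a hypothetical stand-in for the queue of pending index work, its index() method only mimics the queuing behaviour of fullTextSession.index(), and flush() mimics flushToIndexes() followed by clear(). The batch size of 100 and the 1050 items are arbitrary values chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a pending-work queue; hypothetical, not a Hibernate Search class. */
class BatchingQueue {
    private final int batchSize;
    private final List<Object> pending = new ArrayList<>();
    private int peakPending = 0;
    private int flushCount = 0;

    BatchingQueue(int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
    }

    /** Mimics fullTextSession.index(entity): the work is only queued. */
    void index(Object entity) {
        pending.add(entity);
        peakPending = Math.max(peakPending, pending.size());
        if (pending.size() == batchSize) {
            flush(); // same role as the "index % BATCH_SIZE == 0" check
        }
    }

    /** Mimics flushToIndexes() plus clear(): apply the queue, then release it. */
    void flush() {
        if (!pending.isEmpty()) {
            pending.clear();
            flushCount++;
        }
    }

    int peakPending() { return peakPending; }

    int flushCount() { return flushCount; }

    public static void main(String[] args) {
        BatchingQueue queue = new BatchingQueue(100);
        for (int i = 0; i < 1050; i++) {
            queue.index("email-" + i);
        }
        queue.flush(); // plays the role of the final commit()
        System.out.println("peak queued items: " + queue.peakPending());
        System.out.println("flushes: " + queue.flushCount());
    }
}
```

In this model the queue never holds more than one batch, whatever the total number of items. That is the property the periodic flushToIndexes() call is meant to provide.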

6.3.2. Using a MassIndexer

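Hibernate Search's MassIndexer rebuilds the index using several parallel threads; you can restrict it to selected entity types or have it reindex everything. This approach gives the best performance but requires putting the application into maintenance mode: querying the index while a MassIndexer is busy is not recommended.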

Example 6.5. Index rebuilding using a MassIndexer

fullTextSession.createIndexer().startAndWait();

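This rebuilds the index: the existing index is deleted and all entities are reloaded from the database and indexed again. The API is easy to use, but some additional configuration is recommended to speed up the process.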

Warning

While the MassIndexer is running, the index should not be queried; queries made during this time will likely be missing results.

Example 6.6. Using a tuned MassIndexer

fullTextSession
    .createIndexer( User.class )
    .batchSizeToLoadObjects( 25 )
    .cacheMode( CacheMode.NORMAL )
    .threadsToLoadObjects( 5 )
    .idFetchSize( 150 )
    .threadsForSubsequentFetching( 20 )
    .progressMonitor( monitor ) //a MassIndexerProgressMonitor implementation
    .startAndWait();

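This rebuilds the index of all User instances. It creates 5 threads that load User instances from the database in batches of 25 objects per query, and 20 threads that then load the associated collections of each User. For the related settings see Table 3.3, “Execution configuration”.

Leaving cacheMode at CacheMode.IGNORE (the default) is recommended when rebuilding the index, because the cache is usually just extra overhead during reindexing. For some data, enabling the cache can improve performance, for example enum-like data.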

Tip

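The most efficient number of threads depends on your overall architecture, database design and even the data itself. A profiler can help you find the best number: all internal thread groups have meaningful names so that they are easy to identify with most tools.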

Note

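The MassIndexer is designed for speed and is unaware of transactions, so there is no need to call begin() or commit(). Letting users run queries while it is working is not recommended: results will likely be missing, and the queries add to the system load.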

Other settings that affect indexing performance and memory consumption:

hibernate.search.[default|<indexname>].exclusive_index_use
hibernate.search.[default|<indexname>].indexwriter.max_buffered_docs
hibernate.search.[default|<indexname>].indexwriter.max_merge_docs
hibernate.search.[default|<indexname>].indexwriter.merge_factor
hibernate.search.[default|<indexname>].indexwriter.merge_min_size
hibernate.search.[default|<indexname>].indexwriter.merge_max_size
hibernate.search.[default|<indexname>].indexwriter.merge_max_optimize_size
hibernate.search.[default|<indexname>].indexwriter.merge_calibrate_by_deletes
hibernate.search.[default|<indexname>].indexwriter.ram_buffer_size
hibernate.search.[default|<indexname>].indexwriter.term_index_interval

Previous versions also supported max_field_length, but Lucene no longer supports it; a LimitTokenCountAnalyzer can be used to achieve the same effect.

All .indexwriter parameters are defined by Lucene; Hibernate Search merely passes them through. For details see Section 3.6, “Tuning Lucene indexing performance”.
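As a rough illustration of how the property names listed above are formed, the sketch below builds a java.util.Properties object for a given index name. The property keys come from the list; the IndexTuningExample class, the bulkIndexingSettings helper and all three values are assumptions made for this example and are not recommended settings. How the properties reach Hibernate (a properties file, persistence.xml or programmatic configuration) is outside the scope of the sketch.

```java
import java.util.Properties;

/** Hypothetical helper; only the property key names are taken from the text. */
class IndexTuningExample {

    /**
     * Builds settings for one index. Pass "default" to target the
     * hibernate.search.default.* keys, or a concrete index name.
     */
    static Properties bulkIndexingSettings(String indexName) {
        String prefix = "hibernate.search." + indexName + ".";
        Properties settings = new Properties();
        // Placeholder values, chosen only to show the key format.
        settings.setProperty(prefix + "exclusive_index_use", "true");
        settings.setProperty(prefix + "indexwriter.ram_buffer_size", "64");
        settings.setProperty(prefix + "indexwriter.merge_factor", "20");
        return settings;
    }

    public static void main(String[] args) {
        bulkIndexingSettings("default")
            .forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

Suitable values depend on the data and the hardware, so they are best determined by measurement; Section 3.6 describes what each parameter does.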

The MassIndexer scrolls forward only through the primary keys to be loaded, but MySQL's JDBC driver loads all values into memory. To avoid this, set idFetchSize to Integer.MIN_VALUE.

Last updated: 2017-04-03 18:52:02
