5. Querying

The second most important capability of Hibernate Search is the ability to execute Lucene queries and retrieve entities managed by a Hibernate session.

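Preparing and executing a query consists of the following steps:

Create a FullTextSession
Create a Lucene query, either via the Hibernate Search query DSL (recommended) or via the Lucene query API
Wrap the Lucene query in an org.hibernate.Query
Execute the search by calling, for example, list() or scroll()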

To access the querying facilities, use a FullTextSession, which is obtained by passing in a regular Hibernate Session.

Example 5.1. Creating a FullTextSession

Session session = sessionFactory.openSession();

...

FullTextSession fullTextSession = Search.getFullTextSession(session);

Once you have a FullTextSession, you have two options to build the full-text query: the Hibernate Search query DSL or the native Lucene query API.

Using the query DSL:

final QueryBuilder b = fullTextSession.getSearchFactory()
    .buildQueryBuilder().forEntity( Myth.class ).get();

org.apache.lucene.search.Query luceneQuery =
    b.keyword()
        .onField("history").boostedTo(3)
        .matching("storm")
        .createQuery();

org.hibernate.Query fullTextQuery = fullTextSession.createFullTextQuery( luceneQuery );
List result = fullTextQuery.list(); //return a list of managed objects

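Alternatively, you can write the Lucene query with the Lucene query parser or the Lucene programmatic API. The following example uses the query parser.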

Example 5.2. Creating a Lucene query via the QueryParser

SearchFactory searchFactory = fullTextSession.getSearchFactory();
org.apache.lucene.queryParser.QueryParser parser =
    new QueryParser("title", searchFactory.getAnalyzer(Myth.class) );
//declared outside the try block so it is still in scope when the query is wrapped
org.apache.lucene.search.Query luceneQuery;
try {
    luceneQuery = parser.parse( "history:storm^3" );
}
catch (ParseException e) {
    //handle parsing failure; rethrown here so luceneQuery is never used unassigned
    throw new IllegalArgumentException("Unable to parse the query", e);
}

org.hibernate.Query fullTextQuery = fullTextSession.createFullTextQuery(luceneQuery);
List result = fullTextQuery.list(); //return a list of managed objects

Note

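The Hibernate Search query is built on top of the Lucene query and is exposed as an org.hibernate.Query, which means you stay in the same paradigm as the other Hibernate query facilities (HQL, Native or Criteria). The regular list(), uniqueResult(), iterate() and scroll() methods can be used.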

You can also use the JPA API to create and execute the query:

Example 5.3. Creating a Search query using the JPA API

EntityManager em = entityManagerFactory.createEntityManager();

FullTextEntityManager fullTextEntityManager = org.hibernate.search.jpa.Search.getFullTextEntityManager(em);

...

final QueryBuilder b = fullTextEntityManager.getSearchFactory().buildQueryBuilder().forEntity(Myth.class).get();

org.apache.lucene.search.Query luceneQuery =  b.keyword().onField("history").boostedTo(3).matching("storm").createQuery();
javax.persistence.Query fullTextQuery = fullTextEntityManager.createFullTextQuery( luceneQuery );

List result = fullTextQuery.getResultList();//return a list of managed objects

Note

The following examples use the Hibernate APIs, but they can easily be rewritten with the JPA API.

5.1. Building queries

5.1.1. Building a Lucene query using the Lucene API

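Using the Lucene API, you have several options: the query parser (fine for simple queries) or the Lucene programmatic API (for more complex use cases).

Building Lucene queries is out of scope for this documentation. Please refer to the Lucene documentation.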

5.1.2. Building a Lucene query with the Hibernate Search query DSL

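Writing full-text queries with the Lucene programmatic API is quite tedious.

The Hibernate Search query DSL is a so-called fluent API. It has a few key characteristics:

method names are concise and meaningful
unnecessary options are left out
it often uses the method chaining pattern
it is easy to use and easy to read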

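Let's see how to use the API. First create a query builder bound to a given indexed type. The QueryBuilder knows which analyzer to use and which field bridge to apply.

You can also override the analyzer used for a given field or fields. This is rarely needed and should be avoided unless you know what you are doing.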

QueryBuilder mythQB = searchFactory.buildQueryBuilder().forEntity(Myth.class).overridesForField("history","stem_analyzer_definition").get();

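When using the query builder, keep in mind that the end result is always a Lucene query. For this reason you can easily mix queries generated via Lucene's query parser or the Lucene programmatic API with queries created by the Hibernate Search DSL, in case the DSL is missing some feature.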

5.1.2.1. Keyword queries

Let's start by searching for a specific word:

Query luceneQuery = mythQB.keyword().onField("history").matching("storm").createQuery();

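keyword() means that you are looking for one specific word, onField() specifies which Lucene field to look in, and matching() gives the word to search for.

The value storm is passed through the field bridge of the history field.
The field bridge value is then passed to the analyzer used to index the history field, so the query applies the same term transformation as indexing did.

Let's see how to search a property that is not of type String: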

@Entity
@Indexed
public class Myth {
    @Field(analyze = Analyze.NO)
    @DateBridge(resolution = Resolution.YEAR)
    public Date getCreationDate() { return creationDate; }
    public void setCreationDate(Date creationDate) { this.creationDate = creationDate; }
    private Date creationDate;
    ...
}

Date birthdate = ...;
Query luceneQuery = mythQB.keyword().onField("creationDate").matching(birthdate).createQuery();

Note

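In plain Lucene, you would have to convert the Date object to its String representation yourself. Hibernate Search does it for you.

This conversion works for any object, not just Date, provided that the field bridge has an objectToString method.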

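The next example is a bit more advanced and uses ngram analyzers. These analyzers index sequences of n consecutive characters of each word, which helps recover from user typos. For example, the 3-grams of the word hibernate are hib, ibe, ber, ern, rna, nat, ate.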

@AnalyzerDef(name = "ngram",
  tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
  filters = {
    @TokenFilterDef(factory = StandardFilterFactory.class),
    @TokenFilterDef(factory = LowerCaseFilterFactory.class),
    @TokenFilterDef(factory = StopFilterFactory.class),
    @TokenFilterDef(factory = NGramFilterFactory.class,
      params = {
        @Parameter(name = "minGramSize", value = "3"),
        @Parameter(name = "maxGramSize", value = "3") } )
  }
)
@Entity
@Indexed
public class Myth {
    //name is a String, so no @DateBridge is needed here
    @Field(analyzer = @Analyzer(definition = "ngram"))
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    private String name;
    ...
}

Query luceneQuery = mythQB.keyword().onField("name").matching("Sisiphus").createQuery();

In the example above, the search keyword Sisiphus is first lower-cased and then split into 3-grams: sis, isi, sip, iph, phu, hus. Each of these n-grams becomes part of the query.
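To make the n-gram splitting concrete, here is a minimal sketch in plain Java. It does not use Lucene's NGramFilterFactory: NGramDemo and ngrams are hypothetical names introduced only for illustration, and the tokenizer and stop-word steps of the analyzer are left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Hibernate Search or Lucene.
final class NGramDemo {

    /** Lower-cases the text and returns every run of `size` consecutive characters. */
    static List<String> ngrams(String text, int size) {
        String lower = text.toLowerCase(Locale.ROOT);
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + size <= lower.length(); i++) {
            grams.add(lower.substring(i, i + size));
        }
        return grams;
    }

    public static void main(String[] args) {
        System.out.println(ngrams("Sisiphus", 3));  // prints [sis, isi, sip, iph, phu, hus]
        System.out.println(ngrams("hibernate", 3)); // prints [hib, ibe, ber, ern, rna, nat, ate]
    }
}
```

Because a misspelled word still shares most of its 3-grams with the correct spelling, a query built from these fragments can still match the indexed term.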

Note

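If you do not want a field to use its field bridge or analyzer, call ignoreFieldBridge() or ignoreAnalyzer().

To search for multiple possible words in the same field, put them all in the matching clause: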

//search document with storm or lightning in their history
Query luceneQuery =
    mythQB.keyword().onField("history").matching("storm lightning").createQuery();

To search for the same word in several fields, use onFields():

Query luceneQuery = mythQB.keyword().onFields("history","description","name").matching("storm").createQuery();

Fields can be given different weights. Here only the name field is boosted to 5, using andField():

Query luceneQuery = mythQB.keyword().onField("history").andField("name").boostedTo(5).andField("description").matching("storm").createQuery();

5.1.2.2. Fuzzy queries

To execute a fuzzy query, start like a keyword query and add the fuzzy() flag.

Query luceneQuery = mythQB.keyword().fuzzy().withThreshold(.8f).withPrefixLength(1).onField("history").matching("starm").createQuery();

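threshold is the limit above which two terms are considered to match. It is a decimal between 0 and 1 and defaults to 0.5.

prefixLength is the length of the prefix ignored by the fuzziness. It defaults to 0, but a non-zero value is recommended for indexes that contain a huge number of distinct terms.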
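As a rough illustration of what threshold and prefixLength act on, the sketch below computes the edit (Levenshtein) distance between two terms in plain Java. FuzzyDemo and its methods are hypothetical names, and the similarity normalisation used here (1 minus the distance divided by the longer length) is an assumption chosen for simplicity; it may differ from the formula Lucene applies internally.

```java
// Hypothetical helper, not part of Hibernate Search or Lucene.
final class FuzzyDemo {

    /** Classic dynamic-programming Levenshtein distance using two rows. */
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Assumed normalisation to a 0..1 score; Lucene's own formula may differ. */
    static double similarity(String a, String b) {
        int longest = Math.max(a.length(), b.length());
        return longest == 0 ? 1.0 : 1.0 - (double) distance(a, b) / longest;
    }

    /** True when both terms start with the same prefixLength characters. */
    static boolean sharesPrefix(String a, String b, int prefixLength) {
        return a.regionMatches(0, b, 0, prefixLength);
    }

    public static void main(String[] args) {
        System.out.println(distance("starm", "storm")); // prints 1
        System.out.println(similarity("starm", "storm"));
        System.out.println(sharesPrefix("starm", "storm", 1));
    }
}
```

Requiring a shared prefix narrows the set of candidate terms before any distance is computed, which is why a non-zero prefixLength helps on indexes with many distinct terms.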

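5.1.2.3. Wildcard queries

You can also execute wildcard queries, where only part of the word is known. ? represents a single character and * represents any sequence of characters. Note that for performance reasons a query should not begin with a wildcard.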

Query luceneQuery = mythQB.keyword().wildcard().onField("history").matching("sto*").createQuery();
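The meaning of the two wildcard characters can be sketched in plain Java by translating a wildcard expression into a regular expression. This only illustrates the matching rules; it is not how Lucene evaluates wildcard queries, and WildcardDemo is a hypothetical name.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hibernate Search or Lucene.
final class WildcardDemo {

    /** Translates ? to "any single character" and * to "any character sequence". */
    static Pattern toPattern(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '?') {
                regex.append('.');
            } else if (c == '*') {
                regex.append(".*");
            } else {
                // Quote everything else so it is matched literally.
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString());
    }

    static boolean matches(String wildcard, String term) {
        return toPattern(wildcard).matcher(term).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("sto*", "storm"));  // prints true
        System.out.println(matches("sto*", "star"));   // prints false
        System.out.println(matches("st?rm", "storm")); // prints true
    }
}
```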

5.1.2.4. Phrase queries

Phrase queries search for exact or approximate sentences. Use phrase() to build one:

Query luceneQuery = mythQB.phrase().onField("history").sentence("Thou shalt not kill").createQuery();

You can search for approximate sentences by adding a slop factor, which allows other words to appear within the sentence.

Query luceneQuery = mythQB.phrase().withSlop(3).onField("history").sentence("Thou kill").createQuery();

5.1.2.5. Range queries

A range query searches for values between two given boundaries, or above or below a given boundary. It works on numbers, dates, strings and so on. For example:

//look for 0 <= starred < 3
Query luceneQuery = mythQB.range().onField("starred").from(0).to(3).excludeLimit().createQuery();

//look for myths strictly BC
Date beforeChrist = ...;
Query luceneQuery = mythQB.range().onField("creationDate").below(beforeChrist).excludeLimit().createQuery();
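The effect of excludeLimit() on a boundary can be shown with a small plain-Java predicate. RangeDemo and inRange are hypothetical names; the sketch only mirrors the "0 <= starred < 3" comment above and does not call the query DSL.

```java
// Hypothetical helper, not part of Hibernate Search or Lucene.
final class RangeDemo {

    /**
     * Returns true when value lies between from and to.
     * The lower bound is always inclusive; the upper bound is
     * exclusive only when excludeUpperLimit is true.
     */
    static boolean inRange(int value, int from, int to, boolean excludeUpperLimit) {
        boolean aboveLower = value >= from;
        boolean belowUpper = excludeUpperLimit ? value < to : value <= to;
        return aboveLower && belowUpper;
    }

    public static void main(String[] args) {
        System.out.println(inRange(3, 0, 3, true));  // prints false: 3 itself is excluded
        System.out.println(inRange(3, 0, 3, false)); // prints true: 3 is included
    }
}
```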

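5.1.2.6. Combining queries

Finally, queries can be combined to create more complex queries. The following operators are available:

SHOULD: the query should contain the matching elements of the subquery.
MUST: the query must contain the matching elements of the subquery.
MUST NOT: the query must not contain the matching elements of the subquery.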

//look for popular modern myths that are not urban
Date twentiethCentury = ...;
Query luceneQuery = mythQB
    .bool()
      .must( mythQB.keyword().onField("description").matching("urban").createQuery() )
        .not()
      .must( mythQB.range().onField("starred").above(4).createQuery() )
      .must( mythQB.range().onField("creationDate").above(twentiethCentury).createQuery() )
    .createQuery();

//look for popular myths that are preferably urban
Query luceneQuery = mythQB
    .bool()
      .should( mythQB.keyword().onField("description").matching("urban").createQuery() )
      .must( mythQB.range().onField("starred").above(4).createQuery() )
    .createQuery();

//look for all myths except religious ones
Query luceneQuery = mythQB
    .all()
      .except( mythQB.keyword().onField("description_stem").matching("religion").createQuery() )
    .createQuery();

5.1.2.7. Query options

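boostedTo (on query type and on field): boosts the whole query or the specific field by the given factor.
withConstantScore (on query): all results matching the query get a constant score equal to the boost.
filteredBy (on query): filters the query results using a filter.
ignoreAnalyzer (on field): ignores the analyzer when processing this field.
ignoreFieldBridge (on field): ignores the field bridge when processing this field.

Here is an example that uses some of these options: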

Query luceneQuery = mythQB
    .bool()
      .should( mythQB.keyword().onField("description").matching("urban").createQuery() )
      .should( mythQB
        .keyword()
        .onField("name")
          .boostedTo(3)
          .ignoreAnalyzer()
        .matching("urban").createQuery() )
      .must( mythQB
        .range()
          .boostedTo(5).withConstantScore()
        .onField("starred").above(4).createQuery() )
    .createQuery();

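5.1.3. Building a Hibernate Search query

So far we have only covered how to create a Lucene query, which is just the first step. Let's now see how to build the Hibernate Search query from the Lucene query.

5.1.3.1. Generality

Once the Lucene query is built, it needs to be wrapped into a Hibernate query. Unless specified otherwise, the query is executed against all indexed entities and may return any type of indexed class.

For performance reasons, it is advisable to restrict the returned entity types.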

Example 5.4. Wrapping a Lucene query into a Hibernate Query

FullTextSession fullTextSession = Search.getFullTextSession( session );

org.hibernate.Query fullTextQuery = fullTextSession.createFullTextQuery( luceneQuery );

Example 5.5. Filtering the search result by entity type

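fullTextQuery = fullTextSession.createFullTextQuery( luceneQuery, Customer.class );

// or

fullTextQuery = fullTextSession.createFullTextQuery( luceneQuery, Item.class, Actor.class );

In Example 5.5, the first query returns only matching Customers, and the second returns matching Actors and Items. The type restriction is polymorphic: if two subclasses Salesman and Customer extend the base class Person, you can specify just Person.class to filter on the result types.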

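5.1.3.2. Pagination

For performance reasons, it is recommended to restrict the number of objects returned per query. Navigating from one page to another is in fact a very common use case. Pagination is defined exactly the way it is in a plain HQL or Criteria query.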

Example 5.6. Defining pagination for a search query

org.hibernate.Query fullTextQuery = fullTextSession.createFullTextQuery( luceneQuery, Customer.class );

fullTextQuery.setFirstResult(15); //start from the 15th element

fullTextQuery.setMaxResults(10); //return 10 elements

Tip

You can still get the total number of matching elements, regardless of the pagination, via fullTextQuery.getResultSize().
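The arithmetic behind page navigation can be sketched in plain Java. PageDemo and its methods are hypothetical helpers; the only assumptions are that pages are numbered from 1 in the user interface and that the offset passed to setFirstResult() is zero-based, as it is in Hibernate.

```java
// Hypothetical helper, not part of Hibernate or Hibernate Search.
final class PageDemo {

    /** Zero-based offset of the first element shown on a 1-based page. */
    static int firstResult(int pageNumber, int pageSize) {
        if (pageNumber < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNumber and pageSize must be >= 1");
        }
        return (pageNumber - 1) * pageSize;
    }

    /** Number of pages needed to display totalResults elements. */
    static int pageCount(int totalResults, int pageSize) {
        if (totalResults < 0 || pageSize < 1) {
            throw new IllegalArgumentException("invalid totalResults or pageSize");
        }
        return (totalResults + pageSize - 1) / pageSize;
    }

    public static void main(String[] args) {
        System.out.println(firstResult(3, 10)); // prints 20
        System.out.println(pageCount(3245, 10)); // prints 325
    }
}
```

In an application, the value returned by firstResult would be passed to setFirstResult(), pageSize to setMaxResults(), and the total fed into pageCount would come from getResultSize().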

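5.1.3.3. Sorting

Apache Lucene provides a flexible and powerful way to sort results. To sort by something other than the default relevance order, set a Lucene Sort object on the query.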

Example 5.7. Specifying a Lucene Sort in order to sort the results

org.hibernate.search.FullTextQuery query = s.createFullTextQuery( luceneQuery, Book.class );

org.apache.lucene.search.Sort sort = new Sort(new SortField("title", SortField.STRING));
query.setSort(sort);
List results = query.list();

Tip

Fields used for sorting must not be tokenized.

5.1.3.4. Fetching strategy

Example 5.8. Specifying FetchMode on a query

Criteria criteria = s.createCriteria( Book.class ).setFetchMode( "authors", FetchMode.JOIN );

s.createFullTextQuery( luceneQuery ).setCriteriaQuery( criteria );

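The example above returns all Books matching the luceneQuery. The authors collection is loaded in the same query using an outer join.

Important

Only the fetch mode can be adjusted on the criteria query. Refrain from applying any other restriction.

Important

setCriteriaQuery cannot be used if more than one entity type is expected to be returned.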

5.1.3.5. Projection

Sometimes the full entity is not needed, only a few of its properties. Hibernate Search lets you return just that subset of properties.

Example 5.9. Using projection instead of returning the full domain object

org.hibernate.search.FullTextQuery query = s.createFullTextQuery( luceneQuery, Book.class );
query.setProjection( "id", "summary", "body", "mainAuthor.name" );
List results = query.list();
//each result is an Object[] holding the projected values in the order requested
Object[] firstResult = (Object[]) results.get(0);
Integer id = (Integer) firstResult[0];
String summary = (String) firstResult[1];
String body = (String) firstResult[2];
String authorName = (String) firstResult[3];

5.1.3.6. Customizing object initialization strategies

You can configure whether Hibernate Search looks entities up in the second-level cache before loading them from the database:

Example 5.11. Check the second-level cache before using a query

FullTextQuery query = session.createFullTextQuery(luceneQuery, User.class);
query.initializeObjectWith(
    ObjectLookupMethod.SECOND_LEVEL_CACHE,
    DatabaseRetrievalMethod.QUERY
);

ObjectLookupMethod.PERSISTENCE_CONTEXT: useful if most of the matching entities are already in the persistence context (ie loaded in the Session or EntityManager)
ObjectLookupMethod.SECOND_LEVEL_CACHE: check first the persistence context and then the second-level cache.

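5.1.3.7. Limiting the time of a query

You can limit the time a query takes in Hibernate Search in two ways:

raise an exception when the time limit is reached
limit the number of results retrieved when the time limit is reached (EXPERIMENTAL)

The two approaches are not compatible with each other.

5.2. Retrieving the results

Once the Hibernate Search query is built, executing it is no different from executing an HQL or Criteria query. The usual operations are available: list(), uniqueResult(), iterate(), scroll().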

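5.2.1. Performance considerations

If you expect a reasonable number of results (for example when using pagination) and expect to work on all of them, list() or uniqueResult() are recommended.

list() works best if the entity batch-size is set up properly. Note that Hibernate Search has to process all matching Lucene elements (within the pagination window) when using list(), uniqueResult() and iterate().

If you wish to minimize Lucene document loading, scroll() is more appropriate. Don't forget to close the ScrollableResults object when you are done.

Important

Pagination is preferred over scrolling.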

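5.2.2. Result size

It is sometimes useful to know the total number of matching documents:

for the "1-10 of about 888,000,000" feature provided by search engines such as Google
to implement pagination navigation
to implement a multi step search engine (adding approximation if the restricted query returns no or not enough results)

Retrieving all the matching Lucene documents would of course be costly.

Hibernate Search lets you retrieve the total number of matching documents regardless of the pagination parameters. Even more interestingly, you can retrieve the number of matching elements without triggering a single object load.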

Example 5.16. Determining the result size of a query

org.hibernate.search.FullTextQuery query = s.createFullTextQuery( luceneQuery, Book.class );
//return the number of matching books without loading a single one
assert 3245 == query.getResultSize();

org.hibernate.search.FullTextQuery query = s.createFullTextQuery( luceneQuery, Book.class );
query.setMaxResults(10);
List results = query.list();
//return the total number of matching books regardless of pagination
assert 3245 == query.getResultSize();

Note

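Like Google, the number of results is only an approximation if the index is not yet fully up to date.

5.2.3. ResultTransformer

As seen in Section 5.1.3.5, “Projection”, projection results are returned as Object arrays.

This data structure is not always what the application needs. In that case, a ResultTransformer can build the desired structure from the results: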

Example 5.17. Using ResultTransformer in conjunction with projections

org.hibernate.search.FullTextQuery query =     s.createFullTextQuery( luceneQuery, Book.class );

query.setProjection( "title", "mainAuthor.name" );
query.setResultTransformer( 
    new StaticAliasToBeanResultTransformer( 
        BookView.class, 
        "title", 
        "author" ) 
);
List<BookView> results = (List<BookView>) query.list();

for(BookView view : results) {
    log.info( "Book: " + view.getTitle() + ", " + view.getAuthor() );
}

In the example above, the two projected fields title and mainAuthor.name are wrapped by the ResultTransformer into BookView(title, author) instances.
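Conceptually, the transformer turns each projected Object[] row into a typed object. The sketch below shows that mapping by hand in plain Java; it is an illustration only, not the implementation of StaticAliasToBeanResultTransformer. ProjectionDemo, toViews and the shape of BookView are assumptions, and the row values are made-up sample data.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not part of Hibernate Search.
final class ProjectionDemo {

    /** Assumed shape of the BookView class used in the example above. */
    static final class BookView {
        private final String title;
        private final String author;

        BookView(String title, String author) {
            this.title = title;
            this.author = author;
        }

        String getTitle() { return title; }
        String getAuthor() { return author; }
    }

    /** Maps rows laid out as { title, mainAuthor.name } to BookView objects. */
    static List<BookView> toViews(List<Object[]> rows) {
        List<BookView> views = new ArrayList<>();
        for (Object[] row : rows) {
            views.add(new BookView((String) row[0], (String) row[1]));
        }
        return views;
    }

    public static void main(String[] args) {
        List<Object[]> rows = new ArrayList<>();
        rows.add(new Object[] { "Sample title", "Sample author" }); // made-up sample row
        for (BookView view : toViews(rows)) {
            System.out.println("Book: " + view.getTitle() + ", " + view.getAuthor());
        }
    }
}
```

The order of the values in each row follows the order of the names passed to setProjection(), which is why the mapping can rely on fixed array positions.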

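5.2.4. Understanding results

Sometimes a query does not return what you expect, for example an empty result or confusing results. Luke is a useful tool for investigating such cases. Hibernate Search also gives you access to the Lucene Explanation object for a given result. There are two ways to obtain it:

Use the fullTextQuery.explain(int) method
Use projection

The first approach takes a document id as a parameter and returns the Explanation object. The document id can be retrieved using projection and the FullTextQuery.DOCUMENT_ID constant.

Warning

The document id is not the same thing as the entity id. Do not confuse the two.

The second approach projects the Explanation object using the FullTextQuery.EXPLANATION constant.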

Example 5.18. Retrieving the Lucene Explanation object using projection

FullTextQuery ftQuery = s.createFullTextQuery( luceneQuery, Dvd.class )
        .setProjection(
              FullTextQuery.DOCUMENT_ID,
              FullTextQuery.EXPLANATION,
              FullTextQuery.THIS );
@SuppressWarnings("unchecked") List<Object[]> results = ftQuery.list();
for (Object[] result : results) {
    Explanation e = (Explanation) result[1];
    display( e.toString() );
}

Be careful: building the explanation object is expensive, roughly as expensive as running the Lucene query again. Only request it when you actually need it.

5.3. Filters

Apache Lucene allows query results to be filtered with a filter, including custom filter implementations. Some example use cases:

security
temporal data (eg. view only last month's data)
population filter (eg. search limited to a given category)
and many more

Hibernate Search filters are similar to Hibernate filters:

Example 5.19. Enabling fulltext filters for a given query

fullTextQuery = s.createFullTextQuery( query, Driver.class );

fullTextQuery.enableFullTextFilter("bestDriver");

fullTextQuery.enableFullTextFilter("security").setParameter( "login", "andre" );

fullTextQuery.list(); //returns only best drivers where andre has credentials

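The example above enables two filters on top of the query.

Filters are declared with the @FullTextFilterDef annotation, which can be placed on any @Indexed entity.

The filter class has to provide the actual filter implementation.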

Example 5.20. Defining and implementing a Filter

@Entity
@Indexed
@FullTextFilterDefs( {
    @FullTextFilterDef(name = "bestDriver", impl = BestDriversFilter.class), 
    @FullTextFilterDef(name = "security", impl = SecurityFilterFactory.class) 
})
public class Driver { ... }

public class BestDriversFilter extends org.apache.lucene.search.Filter {

    public DocIdSet getDocIdSet(IndexReader reader) throws IOException {
        OpenBitSet bitSet = new OpenBitSet( reader.maxDoc() );
        TermDocs termDocs = reader.termDocs( new Term( "score", "5" ) );
        while ( termDocs.next() ) {
            bitSet.set( termDocs.doc() );
        }
        return bitSet;
    }
}
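The idea behind BestDriversFilter is a bit set with one bit per document, switched on for every document that may appear in the results. The sketch below reproduces that idea with java.util.BitSet instead of Lucene's OpenBitSet. BitSetFilterDemo and the scoreByDocId array are hypothetical stand-ins for the index reader, introduced only for illustration.

```java
import java.util.BitSet;

// Hypothetical illustration, not part of Hibernate Search or Lucene.
final class BitSetFilterDemo {

    /**
     * Returns a bit set in which bit n is set when the document with id n
     * has the value "5" in its score field. scoreByDocId[n] stands in for
     * the indexed score value of document n.
     */
    static BitSet bestDrivers(String[] scoreByDocId) {
        BitSet bits = new BitSet(scoreByDocId.length);
        for (int docId = 0; docId < scoreByDocId.length; docId++) {
            if ("5".equals(scoreByDocId[docId])) {
                bits.set(docId);
            }
        }
        return bits;
    }

    public static void main(String[] args) {
        BitSet allowed = bestDrivers(new String[] { "5", "3", "5", "1" });
        System.out.println(allowed); // prints {0, 2}
    }
}
```

Only documents whose bit is set remain in the result, whatever the full-text query itself matched.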

Last updated: 2017-04-03 18:52:02
