
Unique search results from ElasticSearch

illug • 4 years ago • 344 views

I am still fairly new to ElasticSearch and am not sure whether what I want is possible.

I can run a query like this:

GET entity/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "searchField": "searchValue" } }
      ]
    }
  },
  "aggs": {
    "uniq_Id": {
      "terms": { "field": "Id", "size": 500 }
    }
  }
}

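This returns the top search hits along with the terms aggregation buckets. Ideally, though, I would like the search results to contain only one document (perhaps the top one, it does not really matter which) for each unique Id defined in the terms aggregation.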

Opster ES Ninja - Kamal • 4 years ago

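You can use a Terms Aggregation together with a Top Hits Aggregation to get the result you want.

To return just one document per bucket, set size to 1 in the Top Hits Aggregation.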

Based on your query, I have put together a sample mapping, documents, an aggregation query, and the response for your reference.

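Mapping (note that this uses a custom mapping type, mydocs; mapping types were deprecated in Elasticsearch 7.0 and removed in 8.0):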

PUT mysampleindex
{
  "mappings": {
    "mydocs": {
      "properties": {
        "searchField":{
          "type": "text"
        },
        "Id": {
          "type": "keyword"
        }
      }
    }
  }
}

Sample documents:

POST mysampleindex/mydocs/1
{
  "searchField": "elasticsearch",
  "Id": "1000"
}

POST mysampleindex/mydocs/2
{
  "searchField": "elasticsearch is awesome",
  "Id": "1000"
}

POST mysampleindex/mydocs/3
{
  "searchField": "elasticsearch is awesome",
  "Id": "1001"
}

POST mysampleindex/mydocs/4
{
  "searchField": "elasticsearch is pretty cool",
  "Id": "1001"
}

POST mysampleindex/mydocs/5
{
  "searchField": "elasticsearch is pretty cool",
  "Id": "1002"
}

Query (note the nested top_hits aggregation with size set to 1):

POST mysampleindex/_search
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "searchField": "elasticsearch"
          }
        }
      ]
    }
  },
  "aggs": {
    "myUniqueIds": {
      "terms": {
        "field": "Id",
        "size": 10
      },
      "aggs": {
        "myDocs": {
          "top_hits": {
            "size": 1
          }
        }
      }
    }
  }
}
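
For anyone issuing this request from application code rather than the console, here is a minimal Python sketch that assembles the same request body as a plain dict. The helper name build_unique_id_query and its parameters are hypothetical and not part of any Elasticsearch client; the resulting dict would be passed as the search body by whichever client you use.

```python
import json


def build_unique_id_query(search_field, search_value, id_field, max_ids=10):
    """Build a search body that returns one top hit per unique id_field value.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    return {
        "size": 0,  # suppress top-level hits; results come from the aggregation
        "query": {
            "bool": {
                "must": [{"match": {search_field: search_value}}]
            }
        },
        "aggs": {
            "myUniqueIds": {
                # one bucket per distinct value of id_field, up to max_ids buckets
                "terms": {"field": id_field, "size": max_ids},
                "aggs": {
                    # keep only the single best-scoring document in each bucket
                    "myDocs": {"top_hits": {"size": 1}}
                },
            }
        },
    }


if __name__ == "__main__":
    body = build_unique_id_query("searchField", "elasticsearch", "Id")
    print(json.dumps(body, indent=2))
```

The terms size (max_ids here) caps how many unique Ids come back, so it should be at least as large as the number of distinct Ids you expect to see.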

Sample response:

{
  "took": 7,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 5,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "myUniqueIds": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "1000",
          "doc_count": 2,
          "myDocs": {
            "hits": {
              "total": 2,
              "max_score": 0.2876821,
              "hits": [
                {
                  "_index": "mysampleindex",
                  "_type": "mydocs",
                  "_id": "1",
                  "_score": 0.2876821,
                  "_source": {
                    "searchField": "elasticsearch",
                    "Id": "1000"
                  }
                }
              ]
            }
          }
        },
        {
          "key": "1001",
          "doc_count": 2,
          "myDocs": {
            "hits": {
              "total": 2,
              "max_score": 0.25316024,
              "hits": [
                {
                  "_index": "mysampleindex",
                  "_type": "mydocs",
                  "_id": "3",
                  "_score": 0.25316024,
                  "_source": {
                    "searchField": "elasticsearch is awesome",
                    "Id": "1001"
                  }
                }
              ]
            }
          }
        },
        {
          "key": "1002",
          "doc_count": 1,
          "myDocs": {
            "hits": {
              "total": 1,
              "max_score": 0.2876821,
              "hits": [
                {
                  "_index": "mysampleindex",
                  "_type": "mydocs",
                  "_id": "5",
                  "_score": 0.2876821,
                  "_source": {
                    "searchField": "elasticsearch is pretty cool",
                    "Id": "1002"
                  }
                }
              ]
            }
          }
        }
      ]
    }
  }
}

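Note that the query above does not return any top-level hits from the bool query (size is 0); the search results you are looking for come back inside the Top Hits Aggregation.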
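
Since the documents arrive nested inside the aggregation buckets, client code has to dig them out. The following is a minimal Python sketch of that step, assuming the response has already been parsed into a dict. The helper name one_doc_per_id is hypothetical, and SAMPLE_RESPONSE is a trimmed copy of the sample response above containing only the fields the helper reads.

```python
def one_doc_per_id(response, terms_agg="myUniqueIds", top_hits_agg="myDocs"):
    """Return {bucket key: _source of its top hit} from a parsed search response.

    Hypothetical helper: the aggregation names default to those used in the query above.
    """
    result = {}
    for bucket in response["aggregations"][terms_agg]["buckets"]:
        hits = bucket[top_hits_agg]["hits"]["hits"]
        if hits:  # top_hits with size 1 yields at most one hit per bucket
            result[bucket["key"]] = hits[0]["_source"]
    return result


def _bucket(key, doc_id, text):
    """Build one trimmed bucket in the shape of the sample response."""
    source = {"searchField": text, "Id": key}
    return {"key": key, "myDocs": {"hits": {"hits": [{"_id": doc_id, "_source": source}]}}}


# Trimmed copy of the sample response (scores and counts omitted).
SAMPLE_RESPONSE = {
    "aggregations": {
        "myUniqueIds": {
            "buckets": [
                _bucket("1000", "1", "elasticsearch"),
                _bucket("1001", "3", "elasticsearch is awesome"),
                _bucket("1002", "5", "elasticsearch is pretty cool"),
            ]
        }
    }
}

if __name__ == "__main__":
    for key, source in one_doc_per_id(SAMPLE_RESPONSE).items():
        print(key, source)
```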

Hope this helps!