Elasticsearch: filter document IDs by the count of documents sharing the same field value

Iwavenice • 4 years ago • 434 views

I need to return the IDs of documents whose myField value is repeated [range] times.

I have the following query:

_search?filter_path=hits.total,hits.hits._id


"query": {
    some filters that return matching document ids
 },
"post_filter": {
    "bool": {
        "should": [
            {
                "range": {
                    "myField": {
                        "lte": 3,
                        "gte": 2
                    }
                }
            }
        ],
        "minimum_should_match": 1
    }
}

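But this filters on a range of myField values, not on the number of documents that share a myField value. How can I modify the post_filter to get the desired result?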

Article URL: http://www.python88.com/topic/56922
 
jaspreet chahal    4 years ago

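post_filter:

The post_filter is applied to the search hits at the very end of a search request, after aggregations have already been calculated.

You cannot filter documents in a query or post_filter by the number of times a field value occurs, and you cannot use aggregation results to filter the documents in the same query. There are two options:

1. Get the terms that occur the given number of times, then search for documents containing those terms (2 calls to Elasticsearch).

2. Use top_hits.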
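Option 1 is not spelled out in the answer. A minimal sketch of it, under stated assumptions, could look like the following. The helper names (`build_count_query`, `keys_from_response`, `build_ids_query`) and the aggregation name `by_value` are hypothetical, and the functions only build and read plain dictionaries. Sending the bodies (for example with the official Python client's `search` method) and adding the original query filters to both calls are left out.

```python
from typing import Any, Dict, List

AGG_NAME = "by_value"  # hypothetical aggregation name, not from the original post


def build_count_query(field: str, min_count: int, max_count: int,
                      max_terms: int = 1000) -> Dict[str, Any]:
    """Body for call 1: keep only terms whose doc_count lies in [min_count, max_count]."""
    script = f"params.count >= {int(min_count)} && params.count <= {int(max_count)}"
    return {
        "size": 0,  # no hits needed, only the aggregation buckets
        "aggs": {
            AGG_NAME: {
                "terms": {"field": field, "size": max_terms},
                "aggs": {
                    "count_filter": {
                        "bucket_selector": {
                            "buckets_path": {"count": "_count"},
                            "script": script,
                        }
                    }
                },
            }
        },
    }


def keys_from_response(response: Dict[str, Any]) -> List[Any]:
    """Read the surviving term values out of the response to call 1."""
    buckets = response["aggregations"][AGG_NAME]["buckets"]
    return [bucket["key"] for bucket in buckets]


def build_ids_query(field: str, keys: List[Any], size: int = 100) -> Dict[str, Any]:
    """Body for call 2: fetch documents whose field value is one of the keys."""
    return {
        "size": size,
        "_source": False,  # only the _id of each hit is of interest
        "query": {"terms": {field: keys}},
    }
```

The second body would typically be sent with `filter_path=hits.hits._id`, as in the question, so that only the IDs come back. For a large number of matching documents, `size` would need to be raised or the results paged.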

Mapping:

{
  "index48" : {
    "mappings" : {
      "properties" : {
        "name" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        }
      }
    }
  }
}

Query:

{
  "size": 0,
  "aggs": {
    "NAME": {
      "terms": {
        "field": "name.keyword",
        "size": 10
      },
      "aggs": {
        "documents": {
          "top_hits": {
            "size": 10
          }
        },
        "bucket_count": {
          "bucket_selector": {
            "buckets_path": {
              "path": "_count"
            },
            "script": "params.path >= 1 && params.path <= 3"
          }
        }
      }
    }
  }
}

Result:

"aggregations" : {
    "NAME" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "John",
          "doc_count" : 3,
          "documents" : {
            "hits" : {
              "total" : {
                "value" : 3,
                "relation" : "eq"
              },
              "max_score" : 1.0,
              "hits" : [
                {
                  "_index" : "index48",
                  "_type" : "_doc",
                  "_id" : "ZRPC3HABF9RqmGpImxj_",
                  "_score" : 1.0,
                  "_source" : {
                    "name" : "John"
                  }
                },
                {
                  "_index" : "index48",
                  "_type" : "_doc",
                  "_id" : "ZhPC3HABF9RqmGpIpBh4",
                  "_score" : 1.0,
                  "_source" : {
                    "name" : "John"
                  }
                },
                {
                  "_index" : "index48",
                  "_type" : "_doc",
                  "_id" : "ZxPC3HABF9RqmGpIqRhj",
                  "_score" : 1.0,
                  "_source" : {
                    "name" : "John"
                  }
                }
              ]
            }
          }
        },
        {
          "key" : "Doe",
          "doc_count" : 2,
          "documents" : {
            "hits" : {
              "total" : {
                "value" : 2,
                "relation" : "eq"
              },
              "max_score" : 1.0,
              "hits" : [
                {
                  "_index" : "index48",
                  "_type" : "_doc",
                  "_id" : "aBPC3HABF9RqmGpIyhhU",
                  "_score" : 1.0,
                  "_source" : {
                    "name" : "Doe"
                  }
                },
                {
                  "_index" : "index48",
                  "_type" : "_doc",
                  "_id" : "aRPC3HABF9RqmGpIzhh7",
                  "_score" : 1.0,
                  "_source" : {
                    "name" : "Doe"
                  }
                }
              ]
            }
          }
        }
      ]
    }
  }
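
To get from this response to the list of document IDs the question asks for, the `_id` of every top_hits entry has to be collected per bucket. A small sketch of that step, assuming the response has been parsed into a Python dictionary; the function name `ids_by_key` is hypothetical, and `sample_response` is a trimmed copy of the result above. Note that top_hits returns at most `size` documents per bucket (10 in the query above), so larger buckets would be truncated.

```python
from typing import Any, Dict, List


def ids_by_key(response: Dict[str, Any],
               agg_name: str = "NAME",
               hits_name: str = "documents") -> Dict[str, List[str]]:
    """Map each bucket key to the _id values of its top_hits documents."""
    result: Dict[str, List[str]] = {}
    for bucket in response["aggregations"][agg_name]["buckets"]:
        hits = bucket[hits_name]["hits"]["hits"]
        result[bucket["key"]] = [hit["_id"] for hit in hits]
    return result


# Trimmed copy of the response shown above (only the fields the function reads).
sample_response = {
    "aggregations": {
        "NAME": {
            "buckets": [
                {
                    "key": "John",
                    "doc_count": 3,
                    "documents": {"hits": {"hits": [
                        {"_id": "ZRPC3HABF9RqmGpImxj_"},
                        {"_id": "ZhPC3HABF9RqmGpIpBh4"},
                        {"_id": "ZxPC3HABF9RqmGpIqRhj"},
                    ]}},
                },
                {
                    "key": "Doe",
                    "doc_count": 2,
                    "documents": {"hits": {"hits": [
                        {"_id": "aBPC3HABF9RqmGpIyhhU"},
                        {"_id": "aRPC3HABF9RqmGpIzhh7"},
                    ]}},
                },
            ]
        }
    }
}

if __name__ == "__main__":
    for key, ids in ids_by_key(sample_response).items():
        print(key, ids)
```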