Aggregating and paginating Elasticsearch (Es) queries in Python with the elasticsearch_dsl library

1 year ago (2024-09-04) · Category: Learning · 1193 views

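I use Es for big-data analysis and needed the document count of each category after aggregation, but the aggregation produced more than 10,000 buckets. After searching for quite a while, here is a summary of what worked.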

1. First, import the required libraries

from elasticsearch_dsl import connections, Search, Q, A

Q builds query conditions; A builds aggregations.

2. Create the Es client connection

client = connections.create_connection(hosts=settings.ES_HOST, timeout=settings.ES_TIMEOUT)
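
The `settings` object used above is not defined in this post; it presumably comes from the project's own configuration (for example a Django settings module). A minimal stand-in might look like the sketch below, where both values are placeholder assumptions rather than anything from the original setup:

```python
from types import SimpleNamespace

# Hypothetical stand-in for the project's settings module.
# Both values are placeholders, not taken from the original post.
settings = SimpleNamespace(
    ES_HOST=["http://localhost:9200"],  # assumed: a local single-node cluster
    ES_TIMEOUT=30,                      # assumed: request timeout in seconds
)

print(settings.ES_HOST, settings.ES_TIMEOUT)
```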

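3. Define a function dedicated to the Es query; the pagination step reuses it

def fetch_case_num(case_class, after_key=None):
    # `class` is a reserved word in Python, so the parameter is named case_class
    s = Search(using=client, index='t_case_party')

    q = Q('range', time={'gte': '2024-09-03'})
    # The Es field is still named "class"; pass it via a dict to avoid the keyword
    q = q & Q('match', **{'class': case_class})
    # Exclude documents whose type_id field is missing
    q = q & Q('exists', field='type_id')
    res = s.query(q)

    # The first request has no after_key; later requests resume after the last key seen
    composite_params = {'size': 1000}
    if after_key is not None:
        composite_params['after'] = after_key
    composite_agg = A('composite', sources=[
        {'term': A('terms', field='type_id.keyword')}
    ], **composite_params)

    res.aggs.bucket('gender_terms', composite_agg)

    # Execute the query and return the response
    return res.execute()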
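
To see what this function sends to Es, it can help to write the request body out by hand. The sketch below builds, with plain dictionaries, roughly the body that the Q and A objects above should serialize to. `build_request_body` is a hypothetical helper for illustration only and is not part of elasticsearch_dsl; the arguments in the final call are placeholder values.

```python
import json

def build_request_body(case_class, after_key=None, page_size=1000):
    """Hand-written approximation of the body built by fetch_case_num."""
    composite = {
        "size": page_size,
        "sources": [
            {"term": {"terms": {"field": "type_id.keyword"}}},
        ],
    }
    # "after" is only sent when resuming from a previous page
    if after_key is not None:
        composite["after"] = after_key

    return {
        "query": {
            "bool": {
                "must": [
                    {"range": {"time": {"gte": "2024-09-03"}}},
                    {"match": {"class": case_class}},
                    {"exists": {"field": "type_id"}},
                ]
            }
        },
        "aggs": {"gender_terms": {"composite": composite}},
    }

# "example" and "abc" are placeholder values
print(json.dumps(build_request_body("example", {"term": "abc"}), indent=2))
```

The only difference between the first request and every later one is the "after" entry inside the composite aggregation.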

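4. Call the function above to run the query and get the buckets

# Query the first page; case_class holds the value to match against the class field
response = fetch_case_num(case_class)

# Collect the buckets from the response in a plain list
buckets = list(response.aggregations.gender_terms.buckets)

5. Loop while there is a next page, running the query function again and appending to the buckets

# after_key is only present while the aggregation may still have more buckets
agg = response.aggregations.gender_terms
after_key = agg.after_key if 'after_key' in agg else None

# Loop until every page has been fetched
while after_key:
    print(after_key)
    response = fetch_case_num(case_class, after_key)
    if response:
        agg = response.aggregations.gender_terms
        buckets.extend(agg.buckets)
        after_key = agg.after_key if 'after_key' in agg else None
    else:
        break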
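
The same loop can also be wrapped in a generator so that callers never handle after_key directly. This is only a sketch: `iter_composite_buckets` and `fake_fetch` are hypothetical names, and the in-memory `fake_fetch` stands in for a real Es call, returning plain dicts shaped like a composite aggregation result so the paging logic can be run without a cluster.

```python
def iter_composite_buckets(fetch_page):
    """Yield every bucket, following after_key until it disappears.

    fetch_page(after_key) must return a dict with a "buckets" list and,
    while more pages may exist, an "after_key" entry.
    """
    after_key = None
    while True:
        page = fetch_page(after_key)
        yield from page["buckets"]
        after_key = page.get("after_key")
        if not after_key:
            break


# Illustrative in-memory data; not real query results
ALL_KEYS = ["a", "b", "c", "d", "e"]

def fake_fetch(after_key, page_size=2):
    """Mimic composite paging: return the keys that sort after after_key."""
    start = 0 if after_key is None else ALL_KEYS.index(after_key["term"]) + 1
    chunk = ALL_KEYS[start:start + page_size]
    page = {"buckets": [{"key": {"term": k}, "doc_count": 1} for k in chunk]}
    if chunk:
        # after_key is the key of the last bucket on this page
        page["after_key"] = {"term": chunk[-1]}
    return page

keys = [b["key"]["term"] for b in iter_composite_buckets(fake_fetch)]
print(keys)
```

With real data, fetch_page would be a small adapter around fetch_case_num that returns the aggregation's buckets and, when present, its after_key.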

6. Process the data

results = []
for bucket in buckets:
    # bucket.key holds one value per composite source; 'term' is the source name used above
    result = {
        'cus_id': bucket.key['term'],
        'val_c': bucket.doc_count,
    }
    results.append(result)
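
The same transformation can be written as a list comprehension, and sorting by count is then one more line. The sample buckets below are made-up plain dicts standing in for the response objects, so the snippet runs without Es; with real buckets the fields are read as bucket.key['term'] and bucket.doc_count.

```python
# Made-up sample buckets; real ones come from the aggregation response
sample_buckets = [
    {"key": {"term": "t1"}, "doc_count": 3},
    {"key": {"term": "t2"}, "doc_count": 7},
    {"key": {"term": "t3"}, "doc_count": 5},
]

results = [
    {"cus_id": b["key"]["term"], "val_c": b["doc_count"]}
    for b in sample_buckets
]

# Largest categories first
results.sort(key=lambda r: r["val_c"], reverse=True)
print(results[0])
```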

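That's it. The key is after_key: check whether the response contains one and continue the query from that after_key; once a request returns no more data, the last page has been reached.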

Permalink: https://forstyle.cc/zblog/post/40.html

Author: 星光
