Elasticsearch: getting the top downloaded files

I have documents like the following:

{"_index":"my_index","_type":"type","_id":"1","_version":1,"found":true,"_source":{"url":"/video/test/1.mp4","status":"200","time":1507003456}}
{"_index":"my_index","_type":"type","_id":"2","_version":1,"found":true,"_source":{"url":"/video/test/1.mp4","status":"404","time":1507003457}}
{"_index":"my_index","_type":"type","_id":"3","_version":1,"found":true,"_source":{"url":"/video/test/2.mp4","status":"200","time":44324234234}}
{"_index":"my_index","_type":"type","_id":"4","_version":1,"found":true,"_source":{"url":"/video/test/2.mp4","status":"200","time":65464564543}}
{"_index":"my_index","_type":"type","_id":"5","_version":1,"found":true,"_source":{"url":"/video/test/3.mp4","status":"200","time":42343244}}
{"_index":"my_index","_type":"type","_id":"2","_version":1,"found":true,"_source":{"url":"/video/test/1.mp4","status":"200","time":8675456456}}
...

I want to count how many times each url appears, by status, to get the top downloaded files. I am using this query:

{ "aggs": { "group_by_Url": { "terms": { "field": "url" } } } }

and the result that comes back looks like this:

array(1) {
  ["group_by_Url"]=>
  array(3) {
    ["doc_count_error_upper_bound"]=> int(0)
    ["sum_other_doc_count"]=> int(0)
    ["buckets"]=>
    array(5) {
      [0]=> array(2) { ["key"]=> string(3) "mp4" ["doc_count"]=> int(10) }
      [1]=> array(2) { ["key"]=> string(4) "test" ["doc_count"]=> int(10) }
      [2]=> array(2) { ["key"]=> string(5) "video" ["doc_count"]=> int(10) }
      [3]=> array(2) { ["key"]=> string(1) "1" ["doc_count"]=> int(6) }
      [4]=> array(2) { ["key"]=> string(1) "2" ["doc_count"]=> int(4) }
    }
  }
}

How can I count by the full text of the url, e.g. "/video/test/1.mp4"? With the query above, the text in the url gets split into separate pieces. Please help.
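
For context, the buckets in the result ("mp4", "test", "video", "1", "2") are consistent with the url field being indexed as analyzed full text, so the terms aggregation counts individual tokens instead of whole urls. Here is a rough Python sketch of that effect. The regex is only an approximation of what an analyzer does, not Elasticsearch's real tokenizer, and `approx_tokens` is a hypothetical helper made up for illustration:

```python
import re
from collections import Counter


def approx_tokens(text):
    """Rough stand-in for an analyzed field: lowercase, then split on non-alphanumerics."""
    return re.findall(r"[a-z0-9]+", text.lower())


# A few urls taken from the sample documents in the question.
urls = ["/video/test/1.mp4", "/video/test/1.mp4", "/video/test/2.mp4"]

# On an analyzed field, buckets are keyed by token; each document counts
# once per distinct token it contains.
token_buckets = Counter(tok for url in urls for tok in set(approx_tokens(url)))

# On a non-analyzed field, the whole string is the only term.
whole_buckets = Counter(urls)

print(approx_tokens("/video/test/1.mp4"))  # ['video', 'test', '1', 'mp4']
print(token_buckets.most_common())
print(whole_buckets.most_common())
```

The second counter is the shape of result being asked for: one bucket per complete url.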

Nguyễn Trường Sa @nguyentruongsa19920311


1 ANSWER

Nhan Tran

Answered Oct 13, 2017 4:34 AM

Change the index mapping so that the "url" field uses the "keyword" type ("type": "keyword").

Or, on Elasticsearch versions before 5.0 (where the field type is "string" rather than "text"), keep the existing type and set "index": "not_analyzed" on the field.
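
As a minimal sketch of the keyword option, the request bodies could look roughly like the ones built below. This only constructs and prints JSON and does not talk to a cluster. The helper names `keyword_mapping` and `top_urls_query` are hypothetical. The mapping is nested under the "type" mapping type because the sample documents show "_type": "type"; on versions that no longer use mapping types, that level would be dropped. Filtering on status "200" is an assumption about what "by status" should mean here:

```python
import json


def keyword_mapping(doc_type="type", field="url"):
    """Index-creation body mapping `field` as keyword, so it is stored as one un-analyzed term."""
    return {"mappings": {doc_type: {"properties": {field: {"type": "keyword"}}}}}


def top_urls_query(status="200", size=10):
    """Search body: keep only documents with the given status, then bucket by whole url."""
    return {
        "size": 0,  # return no hits, only the aggregation result
        "query": {"term": {"status": status}},  # assumption: only successful downloads count
        "aggs": {"group_by_Url": {"terms": {"field": "url", "size": size}}},
    }


print(json.dumps(keyword_mapping(), indent=2))
print(json.dumps(top_urls_query(), indent=2))
```

Note that the mapping of an existing field cannot be changed in place, so this would typically mean creating a new index with the mapping above and reindexing the documents into it.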

Nhan Tran @trongnhan.tran93
