Elasticsearch提供了基于JSON的完整查询DSL(特定于域的语言)来定义查询。将查询DSL视为查询的AST(抽象语法树),它由两种子句组成:
- 叶子查询子句:
叶查询子句中寻找一个特定的值在某一特定领域,如 match,term或 range查询。这些查询可以自己使用。- 复合查询子句
复合查询子句包装其他叶查询或复合查询,并用于以逻辑方式组合多个查询(例如 bool或dis_max查询),或更改其行为(例如 constant_score查询)。
查询子句的行为会有所不同,具体取决于它们是在 查询上下文中还是在过滤器上下文中使用。
导入官方测试数据
1,下载json文件
下载地址:官方下载
或者
https://gitee.com/xlh_blog/common_content/blob/master/es%E6%B5%8B%E8%AF%95%E6%95%B0%E6%8D%AE.json
2,导入elasticsearch
在kibana输入
POST bank/account/_bulk
上面json文件的数据
seach检索文档
ES支持两种基本方式检索;
- 通过REST request uri 发送搜索参数 (uri +检索参数);
- 通过REST request body 来发送它们(uri+请求体);
API:
https://www.elastic.co/guide/en/elasticsearch/reference/7.x/getting-started-search.html
1,请求参数方式检索
GET bank/_search?q=*&sort=account_number:asc
说明:
q=* # 查询所有
sort # 排序字段
asc #升序
检索bank下所有信息,包括type和docs
GET bank/_search
2,uri+请求体进行检索
GET /bank/_search
{
"query": { "match_all": {} },
"sort": [
{ "account_number": "asc" },
{ "balance":"desc"}
]
}
DSL
Elasticsearch提供了一个可以执行查询的Json风格的DSL(domain-specific language领域特定语言)。这个被称为Query DSL,该查询语言非常全面
基本语法
如果针对于某个字段,那么它的结构如下:
{
QUERY_NAME:{ # 使用的功能
FIELD_NAME:{ # 功能参数
ARGUMENT:VALUE,
ARGUMENT:VALUE,...
}
}
}
示例:
GET bank/_search
{
"query": { # 查询的字段
"match_all": {}
},
"from": 0, # 从第几条文档开始查
"size": 5,
"_source":["balance"],
"sort": [
{
"account_number": { # 返回结果按哪个列排序
"order": "desc" # 降序
}
}
]
}
#_source为要返回的字段
from,size分页查询:
GET bank/_search
{
"query": {
"match_all": {}
},
"from": 0,
"size": 5
}
查询结果:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1000,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "bank",
"_type" : "accout",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"account_number" : 1,
"balance" : 39225,
"firstname" : "Amber",
"lastname" : "Duke",
"age" : 32,
"gender" : "M",
"address" : "880 Holmes Lane",
"employer" : "Pyrami",
"email" : "amberduke@pyrami.com",
"city" : "Brogan",
"state" : "IL"
}
},
{
"_index" : "bank",
"_type" : "accout",
"_id" : "6",
"_score" : 1.0,
"_source" : {
"account_number" : 6,
"balance" : 5686,
"firstname" : "Hattie",
"lastname" : "Bond",
"age" : 36,
"gender" : "M",
"address" : "671 Bristol Street",
"employer" : "Netagy",
"email" : "hattiebond@netagy.com",
"city" : "Dante",
"state" : "TN"
}
},
{
"_index" : "bank",
"_type" : "accout",
"_id" : "13",
"_score" : 1.0,
"_source" : {
"account_number" : 13,
"balance" : 32838,
"firstname" : "Nanette",
"lastname" : "Bates",
"age" : 28,
"gender" : "F",
"address" : "789 Madison Street",
"employer" : "Quility",
"email" : "nanettebates@quility.com",
"city" : "Nogal",
"state" : "VA"
}
},
{
"_index" : "bank",
"_type" : "accout",
"_id" : "18",
"_score" : 1.0,
"_source" : {
"account_number" : 18,
"balance" : 4180,
"firstname" : "Dale",
"lastname" : "Adams",
"age" : 33,
"gender" : "M",
"address" : "467 Hutchinson Court",
"employer" : "Boink",
"email" : "daleadams@boink.com",
"city" : "Orick",
"state" : "MD"
}
},
{
"_index" : "bank",
"_type" : "accout",
"_id" : "20",
"_score" : 1.0,
"_source" : {
"account_number" : 20,
"balance" : 16418,
"firstname" : "Elinor",
"lastname" : "Ratliff",
"age" : 36,
"gender" : "M",
"address" : "282 Kings Place",
"employer" : "Scentric",
"email" : "elinorratliff@scentric.com",
"city" : "Ribera",
"state" : "WA"
}
}
]
}
}
query/match匹配查询:
如果是非字符串,会进行精确匹配。如果是字符串,会进行全文检索
1,基本类型(非字符串),精确匹配
查询account_number为20的数据
GET bank/_search
{
"query": {
"match": {
"account_number": "20"
}
}
}
查询结果:
{
"took" : 13,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,#有几条结果
"relation" : "eq"
},
"max_score" : 1.0,#最大得分
"hits" : [
{
"_index" : "bank",
"_type" : "accout",
"_id" : "20",
"_score" : 1.0,#得分
"_source" : {
"account_number" : 20,
"balance" : 16418,
"firstname" : "Elinor",
"lastname" : "Ratliff",
"age" : 36,
"gender" : "M",
"address" : "282 Kings Place",
"employer" : "Scentric",
"email" : "elinorratliff@scentric.com",
"city" : "Ribera",
"state" : "WA"
}
}
]
}
}
2,字符串,全文检索
查询address为mill lane的数据
GET bank/_search
{
"query": {
"match": {
"address": "mill lane"
}
}
}
全文检索,最终会按照评分进行排序,会对检索条件进行分词匹配。
下面结果按照条件mill lane进行分词匹配,mill,lane,mill lane都会匹配
查询结果为:
{
"took" : 18,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 19,
"relation" : "eq"
},
"max_score" : 9.507477,
"hits" : [
{
"_index" : "bank",
"_type" : "accout",
"_id" : "136",
"_score" : 9.507477,
"_source" : {
"account_number" : 136,
"balance" : 45801,
"firstname" : "Winnie",
"lastname" : "Holland",
"age" : 38,
"gender" : "M",
"address" : "198 Mill Lane",
"employer" : "Neteria",
"email" : "winnieholland@neteria.com",
"city" : "Urie",
"state" : "IL"
}
},
{
"_index" : "bank",
"_type" : "accout",
"_id" : "970",
"_score" : 5.4032025,
"_source" : {
"account_number" : 970,
"balance" : 19648,
"firstname" : "Forbes",
"lastname" : "Wallace",
"age" : 28,
"gender" : "M",
"address" : "990 Mill Road",
"employer" : "Pheast",
"email" : "forbeswallace@pheast.com",
"city" : "Lopezo",
"state" : "AK"
}
},
{
"_index" : "bank",
"_type" : "accout",
"_id" : "345",
"_score" : 5.4032025,
"_source" : {
"account_number" : 345,
"balance" : 9812,
"firstname" : "Parker",
"lastname" : "Hines",
"age" : 38,
"gender" : "M",
"address" : "715 Mill Avenue",
"employer" : "Baluba",
"email" : "parkerhines@baluba.com",
"city" : "Blackgum",
"state" : "KY"
}
},
{
"_index" : "bank",
"_type" : "accout",
"_id" : "472",
"_score" : 5.4032025,
"_source" : {
"account_number" : 472,
"balance" : 25571,
"firstname" : "Lee",
"lastname" : "Long",
"age" : 32,
"gender" : "F",
"address" : "288 Mill Street",
"employer" : "Comverges",
"email" : "leelong@comverges.com",
"city" : "Movico",
"state" : "MT"
}
},
{
"_index" : "bank",
"_type" : "accout",
"_id" : "1",
"_score" : 4.1042743,
"_source" : {
"account_number" : 1,
"balance" : 39225,
"firstname" : "Amber",
"lastname" : "Duke",
"age" : 32,
"gender" : "M",
"address" : "880 Holmes Lane",
"employer" : "Pyrami",
"email" : "amberduke@pyrami.com",
"city" : "Brogan",
"state" : "IL"
}
},
{
"_index" : "bank",
"_type" : "accout",
"_id" : "70",
"_score" : 4.1042743,
"_source" : {
"account_number" : 70,
"balance" : 38172,
"firstname" : "Deidre",
"lastname" : "Thompson",
"age" : 33,
"gender" : "F",
"address" : "685 School Lane",
"employer" : "Netplode",
"email" : "deidrethompson@netplode.com",
"city" : "Chestnut",
"state" : "GA"
}
},
{
"_index" : "bank",
"_type" : "accout",
"_id" : "556",
"_score" : 4.1042743,
"_source" : {
"account_number" : 556,
"balance" : 36420,
"firstname" : "Collier",
"lastname" : "Odonnell",
"age" : 35,
"gender" : "M",
"address" : "591 Nolans Lane",
"employer" : "Sultraxin",
"email" : "collierodonnell@sultraxin.com",
"city" : "Fulford",
"state" : "MD"
}
},
{
"_index" : "bank",
"_type" : "accout",
"_id" : "568",
"_score" : 4.1042743,
"_source" : {
"account_number" : 568,
"balance" : 36628,
"firstname" : "Lesa",
"lastname" : "Maynard",
"age" : 29,
"gender" : "F",
"address" : "295 Whitty Lane",
"employer" : "Coash",
"email" : "lesamaynard@coash.com",
"city" : "Broadlands",
"state" : "VT"
}
},
{
"_index" : "bank",
"_type" : "accout",
"_id" : "715",
"_score" : 4.1042743,
"_source" : {
"account_number" : 715,
"balance" : 23734,
"firstname" : "Tammi",
"lastname" : "Hodge",
"age" : 24,
"gender" : "M",
"address" : "865 Church Lane",
"employer" : "Netur",
"email" : "tammihodge@netur.com",
"city" : "Lacomb",
"state" : "KS"
}
},
{
"_index" : "bank",
"_type" : "accout",
"_id" : "449",
"_score" : 4.1042743,
"_source" : {
"account_number" : 449,
"balance" : 41950,
"firstname" : "Barnett",
"lastname" : "Cantrell",
"age" : 39,
"gender" : "F",
"address" : "945 Bedell Lane",
"employer" : "Zentility",
"email" : "barnettcantrell@zentility.com",
"city" : "Swartzville",
"state" : "ND"
}
}
]
}
}
query/match_phrase [不拆分匹配]
将需要匹配的值当成一整个单词(不分词)进行检索
- match_phrase:不拆分字符串进行检索
- 字段.keyword:必须全匹配上才检索成功
查处address中包含mill lane的所有记录,并给出相关性得分
GET bank/_search
{
"query": {
"match_phrase": {
"address": "mill lane" # 就是说不要匹配只有mill或只有lane的,要匹配mill lane一整个子串
}
}
}
查询结果为:
{
"took" : 8,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 9.507477,
"hits" : [
{
"_index" : "bank",
"_type" : "accout",
"_id" : "136",
"_score" : 9.507477,
"_source" : {
"account_number" : 136,
"balance" : 45801,
"firstname" : "Winnie",
"lastname" : "Holland",
"age" : 38,
"gender" : "M",
"address" : "198 Mill Lane",
"employer" : "Neteria",
"email" : "winnieholland@neteria.com",
"city" : "Urie",
"state" : "IL"
}
}
]
}
}
使用match的keyword
GET bank/_search
{
"query": {
"match": {
"address.keyword": "mill" # 字段后面加上 .keyword
}
}
}
查询结果:
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 0,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
}
}
修改匹配条件为"198 Mill Lane"
GET bank/_search
{
"query": {
"match": {
"address.keyword": "198 Mill Lane"
}
}
}
查询结果为:
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 6.5032897,
"hits" : [
{
"_index" : "bank",
"_type" : "accout",
"_id" : "136",
"_score" : 6.5032897,
"_source" : {
"account_number" : 136,
"balance" : 45801,
"firstname" : "Winnie",
"lastname" : "Holland",
"age" : 38,
"gender" : "M",
"address" : "198 Mill Lane",
"employer" : "Neteria",
"email" : "winnieholland@neteria.com",
"city" : "Urie",
"state" : "IL"
}
}
]
}
}
文本字段的匹配,使用keyword,匹配的条件就是要显示字段的全部值,要进行精确匹配的。
match_phrase是做短语匹配,只要文本中包含匹配条件,就能匹配到。
query/multi_math【多字段匹配】
state或者address中包含mill,并且在查询过程中,会对于查询条件进行分词。
GET bank/_search
{
"query": {
"multi_match": { # 前面的match仅指定了一个字段。
"query": "mill",
"fields": [ # state和address有mill子串 不要求都有
"state",
"address"
]
}
}
}
查询结果:
{
"took" : 4,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 4,
"relation" : "eq"
},
"max_score" : 5.4032025,
"hits" : [
{
"_index" : "bank",
"_type" : "accout",
"_id" : "970",
"_score" : 5.4032025,
"_source" : {
"account_number" : 970,
"balance" : 19648,
"firstname" : "Forbes",
"lastname" : "Wallace",
"age" : 28,
"gender" : "M",
"address" : "990 Mill Road",
"employer" : "Pheast",
"email" : "forbeswallace@pheast.com",
"city" : "Lopezo",
"state" : "AK"
}
},
{
"_index" : "bank",
"_type" : "accout",
"_id" : "136",
"_score" : 5.4032025,
"_source" : {
"account_number" : 136,
"balance" : 45801,
"firstname" : "Winnie",
"lastname" : "Holland",
"age" : 38,
"gender" : "M",
"address" : "198 Mill Lane",
"employer" : "Neteria",
"email" : "winnieholland@neteria.com",
"city" : "Urie",
"state" : "IL"
}
},
{
"_index" : "bank",
"_type" : "accout",
"_id" : "345",
"_score" : 5.4032025,
"_source" : {
"account_number" : 345,
"balance" : 9812,
"firstname" : "Parker",
"lastname" : "Hines",
"age" : 38,
"gender" : "M",
"address" : "715 Mill Avenue",
"employer" : "Baluba",
"email" : "parkerhines@baluba.com",
"city" : "Blackgum",
"state" : "KY"
}
},
{
"_index" : "bank",
"_type" : "accout",
"_id" : "472",
"_score" : 5.4032025,
"_source" : {
"account_number" : 472,
"balance" : 25571,
"firstname" : "Lee",
"lastname" : "Long",
"age" : 32,
"gender" : "F",
"address" : "288 Mill Street",
"employer" : "Comverges",
"email" : "leelong@comverges.com",
"city" : "Movico",
"state" : "MT"
}
}
]
}
}
query/bool/must复合查询
复合语句可以合并,任何其他查询语句,包括符合语句。这也就意味着,复合语句之间可以互相嵌套,可以表达非常复杂的逻辑。
- must:必须达到must所列举的所有条件
- must_not:必须不匹配must_not所列举的所有条件。
- should:应该满足should所列举的条件。满足条件最好,不满足也可以,满足得分更高
例:
GET bank/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"gender": "M"
}
},
{
"match": {
"address": "mill"
}
}
],
"must_not": [
{
"match": {
"age": "18"
}
}
],
"should": [
{
"match": {
"lastname": "Wallace"
}
}
]
}
}
}
查询结果:
{
"took" : 11,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 12.585751,
"hits" : [
{
"_index" : "bank",
"_type" : "accout",
"_id" : "970",
"_score" : 12.585751,
"_source" : {
"account_number" : 970,
"balance" : 19648,
"firstname" : "Forbes",
"lastname" : "Wallace",
"age" : 28,
"gender" : "M",
"address" : "990 Mill Road",
"employer" : "Pheast",
"email" : "forbeswallace@pheast.com",
"city" : "Lopezo",
"state" : "AK"
}
},
{
"_index" : "bank",
"_type" : "accout",
"_id" : "136",
"_score" : 6.0824604,
"_source" : {
"account_number" : 136,
"balance" : 45801,
"firstname" : "Winnie",
"lastname" : "Holland",
"age" : 38,
"gender" : "M",
"address" : "198 Mill Lane",
"employer" : "Neteria",
"email" : "winnieholland@neteria.com",
"city" : "Urie",
"state" : "IL"
}
},
{
"_index" : "bank",
"_type" : "accout",
"_id" : "345",
"_score" : 6.0824604,
"_source" : {
"account_number" : 345,
"balance" : 9812,
"firstname" : "Parker",
"lastname" : "Hines",
"age" : 38,
"gender" : "M",
"address" : "715 Mill Avenue",
"employer" : "Baluba",
"email" : "parkerhines@baluba.com",
"city" : "Blackgum",
"state" : "KY"
}
}
]
}
}
query/filter【结果过滤】
- must 贡献得分
- should 贡献得分
- must_not 不贡献得分
- filter 不贡献得分
上面的must和should影响相关性得分,而must_not仅仅是一个filter ,不贡献得分
must改为filter就使must不贡献得分
如果只有filter条件的话,我们会发现得分都是0
一个key多个值可以用terms
并不是所有的查询都需要产生分数,特别是哪些仅用于filtering过滤的文档。为了不计算分数,elasticsearch会自动检查场景并且优化查询的执行。
查询所有匹配address=mill的文档,然后再根据10000<=balance<=20000进行过滤查询结果:
GET bank/_search
{
"query": {
"bool": {
"must": [
{"match": {
"address": "mill"
}}
],
"filter": {
"range": {
"balance": {
"gte": 10000,
"lte": 20000
}
}
}
}
}
}
查询结果为:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 5.4032025,
"hits" : [
{
"_index" : "bank",
"_type" : "accout",
"_id" : "970",
"_score" : 5.4032025,
"_source" : {
"account_number" : 970,
"balance" : 19648,
"firstname" : "Forbes",
"lastname" : "Wallace",
"age" : 28,
"gender" : "M",
"address" : "990 Mill Road",
"employer" : "Pheast",
"email" : "forbeswallace@pheast.com",
"city" : "Lopezo",
"state" : "AK"
}
}
]
}
}
query/term
和match一样。匹配某个属性的值。
- 全文检索字段用match,
- 其他非text字段匹配用term。
不要使用term来进行文本字段查询
es默认存储text值时用分词分析,所以要搜索text值,使用match
字段.keyword:要一一匹配到
match_phrase:子串包含即可
使用term匹配文本,会匹配不到
GET bank/_search
{
"query": {
"term": {
"address": "mill Road"
}
}
}
查询结果:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 0,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
}
}
而更换为match匹配时,能够匹配到32个文档
全文检索字段用match,其他非text字段匹配用term
aggs(聚合)
聚合提供了从数据中分组和提取数据的能力。最简单的聚合方法大致等于SQL Group by和SQL聚合函数。
在elasticsearch中,执行搜索返回this(命中结果),并且同时返回聚合结果,把以响应中的所有hits(命中结果)分隔开的能力。这是非常强大且有效的,你可以执行查询和多个聚合,并且在一次使用中得到各自的(任何一个的)返回结果,使用一次简洁和简化的API啦避免网络往返。
语法:
"aggs":{ # 聚合
"aggs_name":{ # 这次聚合的名字,方便展示在结果集中
"AGG_TYPE":{} # 聚合的类型(avg,term,terms...)
}
}
例:搜索address中包含mill的所有人的年龄分布以及平均年龄,但不显示这些人的详情
# 分别为包含mill、,平均年龄、
GET bank/_search
{
"query": { # 查询出包含mill的
"match": {
"address": "Mill"
}
},
"aggs": { #基于查询聚合
"ageAgg": { # 聚合的名字,随便起
"terms": { # 看值的可能性分布
"field": "age",
"size": 10
}
},
"ageAvg": {
"avg": { # 看age值的平均
"field": "age"
}
},
"balanceAvg": {
"avg": { # 看balance的平均
"field": "balance"
}
}
},
"size": 0 # 结果显示条数为0,不看详情
}
查询结果:
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 4, // 命中4条
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"ageAgg" : { // 第一个聚合的结果
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : 38, # age为38的有2条
"doc_count" : 2
},
{
"key" : 28,
"doc_count" : 1
},
{
"key" : 32,
"doc_count" : 1
}
]
},
"ageAvg" : { // 第二个聚合的结果
"value" : 34.0 # balance字段的平均值是34
},
"balanceAvg" : {
"value" : 25208.0
}
}
}
aggs/aggName子聚合
查出所有年龄分布,并且这些年龄段中M的平均薪资和F的平均薪资以及这个年龄段的总体平均薪资
GET bank/_search
{
"query": {
"match_all": {}
},
"aggs": {
"ageAgg": {
"terms": {
"field": "age",
"size": 100
},
"aggs": {
"genderAgg":{
"terms": {
"field": "gender.keyword",
"size": 10
},
"aggs": {
"avgAgg": {
"avg": {
"field": "balance"
}
}
}
},
"ageBalace": {
"avg": {
"field": "balance"
}
}
}
}
},
"size": 0
}
查询结果为: