Elasticsearch Basic Queries
嘉美伯爵   September 26, 2019, 12:46   Database   ElasticSearch   85

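Elasticsearch exposes a RESTful API (GET: read; POST: search and update; PUT: create; DELETE: delete). Unlike a relational database schema, an ES mapping cannot be modified for an existing field once it is set: new fields can be added, but changing an existing field means creating a new index and reindexing.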
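
The console-style snippets in this post (a verb and a path such as `PUT lagou/job/1`, followed by a JSON body) map directly onto HTTP requests. Here is a minimal sketch of that mapping using only Python's standard library. The node address `http://localhost:9200` and the helper name `es_request` are assumptions made for illustration, and the requests are only built, not sent.

```python
import json
import urllib.request

ES_HOST = "http://localhost:9200"  # assumed address of a local node


def es_request(method, path, body=None):
    """Build (but do not send) the HTTP request for a console-style snippet."""
    data = None
    headers = {}
    if body is not None:
        data = json.dumps(body, ensure_ascii=False).encode("utf-8")
        headers["Content-Type"] = "application/json"
    url = f"{ES_HOST}/{path.lstrip('/')}"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


# PUT creates, GET reads, DELETE removes; POST is used for updates and searches.
create = es_request("PUT", "lagou/job/1", {"title": "Python"})
read = es_request("GET", "lagou/job/1")
delete = es_request("DELETE", "lagou/job/1")
# To actually send one against a running node: urllib.request.urlopen(create)
```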

Differences

Adding a field to an index

suggest

  • Add the suggest field
# _mapping/modelresult
{
    "properties":{
        "suggest":{
            "type":"completion",
            "analyzer":"ik"
        }
    }
}
  • Insert suggest values

  • Insert statement
# essay/modelresult/essay.essay.24/_update
{
  "doc": {
    "suggest": {
      "input":[
            "模型",
            "数据",
            "mysql",
            "数据库",
            "关系",
            "操作"
        ],
      "weight":10
     }
   }
}
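
Once documents carry `suggest` values, they can be looked up with the completion suggester. The sketch below only builds the request body. It assumes ES 5.x or later, where suggesters are sent to `_search` with a `prefix` key (older releases used a separate `_suggest` endpoint). The suggestion name `essay_suggest` and the helper function are hypothetical.

```python
import json


def completion_suggest_body(prefix, field="suggest", size=5, name="essay_suggest"):
    """Request body for a completion-suggester lookup on `field`."""
    return {
        "suggest": {
            name: {
                "prefix": prefix,
                "completion": {"field": field, "size": size},
            }
        }
    }


body = completion_suggest_body("数")
# Would be sent as: POST essay/_search with this JSON body
print(json.dumps(body, ensure_ascii=False, indent=2))
```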

Fuzzy queries

  • fuzzy
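# fuzziness: maximum edit distance (insertions, deletions, substitutions, swaps of adjacent characters), e.g. linux ~ liunx
# prefix_length: number of leading characters that must match exactly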
{
    "query":{
        "fuzzy":{
            "title":{
                "value":"python",
                "fuzziness":1,
                "prefix_length":3
            }
        }
    },
    "_source":[
        "title"
    ]
}
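
To make `fuzziness` and `prefix_length` concrete, here is a small pure-Python sketch of the matching rule. It is not how Elasticsearch implements fuzzy search internally; it only reproduces the acceptance criterion. It assumes an edit distance in which swapping two adjacent characters counts as a single edit, and `fuzzy_match` is an illustrative name.

```python
def edit_distance(a, b):
    """Edit distance where insert, delete, substitute and adjacent swap each cost 1."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # adjacent swap
    return d[-1][-1]


def fuzzy_match(query, term, fuzziness=1, prefix_length=0):
    """True if `term` would be accepted for `query` under the two parameters."""
    if query[:prefix_length] != term[:prefix_length]:
        return False  # the protected prefix must be identical
    return edit_distance(query, term) <= fuzziness


print(fuzzy_match("linux", "liunx"))                     # one adjacent swap
print(fuzzy_match("python", "pyhton", prefix_length=3))  # prefixes "pyt" and "pyh" differ
```

A larger `prefix_length` shrinks the set of candidate terms, which is why it is commonly raised to keep fuzzy queries cheap.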

  • Insert a document with an explicit id
PUT lagou/job/1
{
  "title":"Python",
  "address":"北京beijing",
  "company":{
    "name":"baidu"
  },
  "date":"2018-4-16"
}
  • Insert without an id (ES generates one automatically)
POST lagou/job/
{
  "title":"Python",
  "address":"beijing",
  "company":{
    "name":"alibaba"
  },
  "date":"2019-08-26"
}

# Get by id
GET lagou/job/1

# Return only the specified fields
GET lagou/job/1?_source=address,date

Update

POST lagou/job/1/_update
{
  "doc": {
    "title":"Java"
  }
}

Delete

DELETE lagou/job/1

Batch retrieval (multi-get)

GET lagou/job/_mget
{
  "ids":[1]
}

Bulk create

POST _bulk
{"index":{"_index":"lagou","_type":"job","_id":"2"}}
{"title":"Vue开发","address":"北京beijing","company":{ "name":"baidu"},"date":"2018-4-16"}
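
The `_bulk` body is newline-delimited JSON: one action line followed by one source line per document, and the body must end with a newline. Below is a sketch of generating such a body from a dictionary of documents. `build_bulk_body` is an illustrative helper, and it keeps the `_type` key used throughout this post, which later Elasticsearch releases no longer accept.

```python
import json


def build_bulk_body(index, doc_type, docs):
    """Serialize {id: source} pairs into the NDJSON body expected by _bulk."""
    lines = []
    for doc_id, source in docs.items():
        action = {"index": {"_index": index, "_type": doc_type, "_id": str(doc_id)}}
        lines.append(json.dumps(action, ensure_ascii=False))
        lines.append(json.dumps(source, ensure_ascii=False))
    # The bulk API requires the final line to be terminated by a newline.
    return "\n".join(lines) + "\n"


docs = {
    2: {"title": "Vue开发", "address": "北京beijing"},
    3: {"title": "Python", "address": "beijing"},
}
body = build_bulk_body("lagou", "job", docs)
print(body)
```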

Mappings

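Create the mapping explicitly. If no mapping exists, ES still infers one from the types of the values it sees (dynamic mapping), but an inferred mapping uses the default analyzer, so advanced features such as a custom tokenizer are not available.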
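
As a rough illustration of what dynamic mapping does, the sketch below guesses a field type from a JSON value. It is a deliberate simplification of the default rules in ES 5.x and later (strings become `text`, integers `long`, floating-point numbers `float`, nested objects `object`). The date check only recognises `yyyy-MM-dd`, which is narrower than the real date detection, and `guess_type` is an illustrative name.

```python
import re

DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$")  # simplified stand-in for date detection


def guess_type(value):
    """Return the field type that dynamic mapping would roughly pick for `value`."""
    if isinstance(value, bool):  # check bool before int: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "float"
    if isinstance(value, dict):
        return "object"
    if isinstance(value, str):
        return "date" if DATE_RE.match(value) else "text"
    raise TypeError(f"unsupported value: {value!r}")


doc = {"title": "Python", "comments": 1828, "date": "2019-08-26", "company": {"name": "baidu"}}
print({field: guess_type(value) for field, value in doc.items()})
```

Here `title` comes out as plain `text`, analyzed with the default analyzer, which is why an explicit mapping is needed to have it analyzed with `ik_max_word`.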

Create a mapping

PUT lagou
{
  "mappings":{
    "job":{
      "properties": {
        "title":{
          "type": "text",
          "store": true, 
          "analyzer": "ik_max_word"
        },
        "address":{
          "store": true, 
          "type": "keyword"
        },
        "company":{
          "properties": {
            "name":{
              "type":"text"
            }
          }
        },
        "comments":{
          "type": "integer"
        },
        "date":{
          "type": "date",
          "format": "year_month_day"
        }
      }
    }
  }
}

POST lagou/job/
{
  "title":"此Java语言具中文名佳沃,是意大利的自行车品牌。品牌产品主要有来山地车、儿童自行车、城市车、折叠车python有功能强大和简单易用两个",
  "address":"dgdgd撒旦大苏打海淀区颐和园路5号",
  "company":{
    "name":"简称“北大”,由中华人民共和国教育部直属"
  },
  "comments":1828,
  "date":"2019-08-26"
}

ES query parameters

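We use IK for Chinese word segmentation. Before ES 5.0 the plugin provided a single analyzer named ik; from 5.0 on it is split into ik_max_word (finest-grained segmentation) and ik_smart (coarsest-grained segmentation). We generally use ik_max_word.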

  • match
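# keyword fields are not analyzed: the query must equal the whole stored value
# text fields are analyzed with the analyzer set in the mapping, then matched token by token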
GET lagou/_search
{
  "query": {
    "match": {
      "address": "海淀区颐和园路5号"
    }
  }
}
  • term
# term: the value is not analyzed, so matching is case-sensitive
# terms: takes an array of values and matches any of them
GET lagou/_search
{
  "query": {
    "terms": {
      "title": ["python","山地车"]
    }
  },
  "from": 0,
  "size": 1
}
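
The difference between `match` and `term` comes down to whether the query text is analyzed before it is looked up in the inverted index. Here is a toy sketch of that idea. The lowercase-and-split `analyze` function is only a stand-in for a real analyzer such as `ik_max_word`, and all names are illustrative.

```python
from collections import defaultdict


def analyze(text):
    """Toy analyzer: lowercase, then split on whitespace."""
    return text.lower().split()


def build_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in analyze(text):
            index[token].add(doc_id)
    return index


def term_query(index, value):
    """term: look the value up exactly as given, with no analysis."""
    return index.get(value, set())


def match_query(index, text):
    """match: analyze the query first, then OR the per-token lookups together."""
    hits = set()
    for token in analyze(text):
        hits |= index.get(token, set())
    return hits


index = build_index({1: "Python developer", 2: "Java developer"})
print(term_query(index, "Python"))   # no hit: only the lowercased token was indexed
print(match_query(index, "Python"))  # finds document 1
```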
  • match_all
GET lagou/_search
{
  "query": {
    "match_all": {}
  }
}
  • match_phrase
# The query is analyzed first; every resulting token must then be found, adjacent and in the same order
GET lagou/_search
{
  "query": {
    "match_phrase": {
      "title": {
        "query": "Java语言"
      }
    }
  }
}
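
A phrase match can be pictured as a position check on top of the token lookup: all query tokens must appear consecutively and in order. The sketch below models that with the default slop of 0. The whitespace analyzer is a stand-in for a real one, and `phrase_match` is an illustrative name.

```python
def analyze(text):
    """Toy analyzer: lowercase, then split on whitespace."""
    return text.lower().split()


def phrase_match(field_text, query_text):
    """True if the analyzed query occurs as a contiguous run in the analyzed field."""
    tokens = analyze(field_text)
    phrase = analyze(query_text)
    if not phrase:
        return False
    width = len(phrase)
    return any(tokens[i:i + width] == phrase for i in range(len(tokens) - width + 1))


title = "the Java language is easy to use"
print(phrase_match(title, "Java language"))  # tokens adjacent and in order
print(phrase_match(title, "language Java"))  # same tokens, wrong order
```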
  • multi_match
# Search several fields at once; ^3 weights matches in title three times as heavily
GET lagou/_search
{
  "query": {
    "multi_match": {
      "query": "python",
      "fields": ["title^3", "address"]
    }
  }
}
  • stored_fields
# Return only the listed fields; each must be mapped with store: true
GET lagou/_search
{
  "stored_fields": ["title", "address"],
  "query": {
    "match": {
      "title": "python"
    }
  }
}
  • sort
# sort: order the results by a field
GET lagou/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "comments": {
        "order": "desc"
      }
    }
  ]
}
  • range
# range: works on numeric and date fields
GET lagou/_search
{
  "query": {
    "range": {
      "comments": {
        "gte": 10,
        "lte": 40,
        "boost": 1
      }
    }
  }
}
  • wildcard
# * matches zero or more characters; ? matches exactly one character
GET lagou/_search
{
  "query": {
    "wildcard": {
      "title": {
        "value": "pyth*n"
      }
    }
  }
}
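
A wildcard pattern is compared against individual indexed terms, not against the original field text. The sketch below shows the matching rule for `*` and `?` by translating the pattern into a regular expression. It ignores backslash escaping, and the function names are illustrative.

```python
import re


def wildcard_to_regex(pattern):
    """Translate * (any run of characters) and ? (exactly one character) to a regex."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def wildcard_match(pattern, term):
    """True if the whole term matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(term) is not None


print(wildcard_match("pyth*n", "python"))
print(wildcard_match("pyth?n", "pythoon"))
```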
  • bulk
POST lagou/testjob/_bulk
{"index":{"_id": 1}}
{"salary":10, "title":"python"}
{"index":{"_id": 2}}
{"salary":20, "title":"java"}
{"index":{"_id": 3}}
{"salary":30, "title":"golang"}
{"index":{"_id": 4}}
{"salary":40, "title":"ruby"}

Testing the IK analyzer


  • ik (ES 5.0 and later)
# ik_smart is used here; swap in ik_max_word for the finest-grained split
GET _analyze
{
  "analyzer": "ik_smart",
  "text": "python区分大小"
}
  • ik (ES before 5.0)

  • _analyze
# Specify the analyzer
{"analyzer":"ik","text":"python现场"}

# Response
{
    "tokens":[
        {
            "token":"python",
            "start_offset":0,
            "end_offset":6,
            "type":"ENGLISH",
            "position":0
        },
        {
            "token":"现场",
            "start_offset":6,
            "end_offset":8,
            "type":"CN_WORD",
            "position":1
        }
    ]
}
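
When checking how a string is segmented, it is often easier to reduce the `_analyze` response to a flat list. Below is a small sketch that parses the response shown above and confirms that the offsets point back into the input text. `tokens_from_analyze` is an illustrative helper, and the response is pasted in as a string rather than fetched from a node.

```python
import json

response_text = """
{
    "tokens": [
        {"token": "python", "start_offset": 0, "end_offset": 6, "type": "ENGLISH", "position": 0},
        {"token": "现场", "start_offset": 6, "end_offset": 8, "type": "CN_WORD", "position": 1}
    ]
}
"""


def tokens_from_analyze(raw):
    """Return (token, start_offset, end_offset) tuples ordered by position."""
    tokens = json.loads(raw)["tokens"]
    ordered = sorted(tokens, key=lambda t: t["position"])
    return [(t["token"], t["start_offset"], t["end_offset"]) for t in ordered]


text = "python现场"
for token, start, end in tokens_from_analyze(response_text):
    # For this input, slicing the original text by the offsets recovers the token.
    print(token, text[start:end] == token)
```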
