ElasticSearch7

# Go to the bin directory
elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.16.2/elasticsearch-analysis-ik-7.16.2.zip

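Install as a Windows service

# Go to the bin directory
# Note: a JDK must be installed and JAVA_HOME set (ES_JAVA_HOME in 7.x, where JAVA_HOME is deprecated), otherwise the command fails
elasticsearch-service.bat install

# Remove the service
elasticsearch-service.bat remove

Start the Elasticsearch service

elasticsearch-service.bat start
Alternatively, start it from the Services tab in Task Manager.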

Kibana - data visualization

Download https://artifacts.elastic.co/downloads/kibana/kibana-7.16.2-windows-x86_64.zip
Unzip it and run kibana.bat in the bin directory to start Kibana.
Open http://localhost:5601 to reach the Kibana user interface.

Logstash - data collection

Download https://artifacts.elastic.co/downloads/logstash/logstash-7.16.2-windows-x86_64.zip
Unzip it and run, in the bin directory,
Visit

Docker

https://www.elastic.co/guide/en/elasticsearch/reference/current/docker.html

See the "Elasticsearch-Docker" document.

Elasticsearch

Pull the Elasticsearch 7.16.2 Docker image:
docker pull elasticsearch:7.16.2
Raise the limit on virtual memory map areas; if it is too low, Elasticsearch fails to start:
sysctl -w vm.max_map_count=262144
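
Before starting the container, it can help to confirm that the kernel setting took effect. The following is a minimal sketch, assuming a Linux host that exposes the value at /proc/sys/vm/max_map_count; the helper names `is_sufficient` and `read_max_map_count` are illustrative and not part of any Elasticsearch tooling.

```python
from pathlib import Path

# Minimum value used in the sysctl command above.
REQUIRED_MAX_MAP_COUNT = 262144


def is_sufficient(raw_value: str, required: int = REQUIRED_MAX_MAP_COUNT) -> bool:
    """Return True if the text read from the kernel setting is >= required."""
    return int(raw_value.strip()) >= required


def read_max_map_count(path: str = "/proc/sys/vm/max_map_count") -> str:
    """Read the current vm.max_map_count value as text (Linux only)."""
    return Path(path).read_text()
```

On a Linux host, `is_sufficient(read_max_map_count())` reports whether the current value is high enough.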

Start the container with docker run

Ports 9200 and 9300 must be open in the firewall.

docker run --name es7.16.2 --restart always -d -p 9200:9200 -p 9300:9300 \
-e "discovery.type=single-node" \
-e "cluster.name=elasticsearch" \
-v /mydata/elasticsearch/plugins:/usr/share/elasticsearch/plugins \
-v /mydata/elasticsearch/data:/usr/share/elasticsearch/data \
elasticsearch:7.16.2

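Parameter	Example value	Notes
-d		Run detached, so logs are not printed to the current terminal. Optional.
--name	es	Container name. Choose your own, or omit it.
--restart	always	Restart the container automatically whenever it exits. Optional.
-p	host port:container port	Maps a container port to a host port. Adjust the host port as needed.
-v	host path:container path	Mounts a host path into the container so that ES data is kept on the host and survives container restarts and shutdowns.
-e		Configuration settings passed into the container. Set as needed.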

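On startup, Elasticsearch may report that it has no permission to access /usr/share/elasticsearch/data. Change the permissions of the mounted host directory, then start the container again.

chmod 777 /mydata/elasticsearch/data/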

Rename the ES container to elasticsearch

docker rename es7.16.2 elasticsearch

Visiting http://localhost:9200 returns the version information.
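
The same check can be scripted. The sketch below assumes the root endpoint answers with a JSON document whose `version.number` field holds the release number; `extract_version`, `fetch_version` and the abridged `SAMPLE_BODY` are illustrative names and data, not output captured from a real node.

```python
import json
import urllib.request


def extract_version(body: str) -> str:
    """Pull the version number out of the JSON returned by the root endpoint."""
    info = json.loads(body)
    return info["version"]["number"]


def fetch_version(base_url: str = "http://localhost:9200") -> str:
    """Request the root endpoint and return the reported version number."""
    with urllib.request.urlopen(base_url, timeout=5) as response:
        return extract_version(response.read().decode("utf-8"))


# Abridged, hand-written sample of the kind of document the endpoint returns.
SAMPLE_BODY = (
    '{"name": "node-1", "cluster_name": "elasticsearch", '
    '"version": {"number": "7.16.2"}, "tagline": "You Know, for Search"}'
)
```

Calling `fetch_version()` against a running container should return the same version as the image tag.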

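Kibana

Port 5601 must be open in the firewall.

Pull the Kibana image

docker pull kibana:7.16.2

Start Kibana and link it to Elasticsearch (the name before the colon in --link must be the current name of the ES container)

docker run --name kibana -d --restart always --link es7.16.2:elasticsearch -p 5601:5601 kibana:7.16.2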

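Parameter	Example value	Notes
-d		Run detached, so logs are not printed to the current terminal. Optional.
--link	ES container name or ID:elasticsearch	Links Kibana to ES. Put the name of your own ES container before the colon.
--name	kibana	Container name. Choose your own, or omit it.
--restart	always	Restart the container automatically whenever it exits. Optional.
-p	host port:container port	Maps a container port to a host port. Adjust the host port as needed.

Open this address to test: http://localhost:5601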

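Chinese localization

docker exec -it kibana /bin/bash
vi /usr/share/kibana/config/kibana.yml

Append the following line at the end

# Note: there must be a space after the colon
i18n.locale: zh-CN

Save the file, then restart the container

docker restart kibana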

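Configure the username and password Kibana uses to authenticate to Elasticsearch

# Go to the config directory of the Kibana installation
cd /usr/local/kibana-7.2.0-linux-x86_64/config

# Edit the configuration file
vim kibana.yml

# Add these settings
elasticsearch.username: "elastic"
elasticsearch.password: "123456"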

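cerebro - visualization tool

Cerebro is one of the most useful interfaces for viewing shard allocation and for performing common index operations through a graphical UI. It is fully open source and lets you protect the web interface with a username and password or with LDAP authentication. Cerebro is a partial rewrite of an earlier plugin; it runs as a standalone tool on its own application server, built on the Scala-based Play framework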

Installation
docker pull lmenezes/cerebro:0.9.4

Start

Create a new file with any name (env-ldap in this example)
and write the following content to it

# Set it to ldap to activate ldap authorization
AUTH_TYPE=basic
BASIC_AUTH_USER=admin
BASIC_AUTH_PWD=admin

# Group membership settings (optional)

# If left unset LDAP_BASE_DN will be used
# LDAP_GROUP_BASE_DN=OU=users,DC=example,DC=com

# Attribute that represent the user, for example uid or mail
# LDAP_USER_ATTR=mail

# If left unset LDAP_USER_TEMPLATE will be used
# LDAP_USER_ATTR_TEMPLATE=%s

# Filter that tests membership of the group. If this property is empty then there is no group membership check
# AD example => memberOf=CN=mygroup,ou=ouofthegroup,DC=domain,DC=com
# OpenLDAP example => CN=mygroup
# LDAP_GROUP=memberOf=memberOf=CN=mygroup,ou=ouofthegroup,DC=domain,DC=com

docker run --name cerebro --restart always -d -p 9000:9000 --env-file env-ldap lmenezes/cerebro:0.9.4

Visit http://localhost:9000

Logstash

Installation
docker pull logstash:7.16.2

Start

docker run --rm -it -v ~/pipeline/:/usr/share/logstash/pipeline/ logstash:7.16.2

Configuration

https://www.cnblogs.com/hanyouchun/p/5163183.html

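File location: the config directory under the ES root directory, which holds elasticsearch.yml and the logging configuration

The main configuration file is elasticsearch.yml; the logging configuration file is log4j2.properties (logging.yml in versions before 5.0)

Basic configuration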

config/elasticsearch.yml

cluster.name: bjtcrjes
network.host: 0.0.0.0
network.bind_host: 0.0.0.0
network.publish_host: 192.168.1.33
http.port: 9200

path.data: C:\elasticsearch-6.2.1\data
path.logs: C:\elasticsearch-6.2.1\logs
http.cors.enabled: true
http.cors.allow-origin: "*"
transport.tcp.port: 9300
transport.tcp.compress: true

Enable Elasticsearch security features and set user passwords

config/elasticsearch.yml

# Enable Elasticsearch security features
xpack.security.enabled: true
xpack.license.self_generated.type: basic
xpack.security.transport.ssl.enabled: true

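With the auto parameter, randomly generated passwords are printed to the console; you can change them later as needed

./bin/elasticsearch-setup-passwords auto

To use your own passwords, run the command with the interactive parameter instead of auto. This mode walks you through setting the password of every built-in user

C:\elasticsearch-7.16.2\bin>elasticsearch-setup-passwords interactive
# Enter and confirm the password for each account, as follows: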
"warning: usage of JAVA_HOME is deprecated, use ES_JAVA_HOME"
Future versions of Elasticsearch will require Java 11; your Java version from [C:\Program Files\Java\jdk1.8.0_191\jre] does not meet this requirement. Consider switching to a distribution of Elasticsearch with a bundled JDK. If you are already using a distribution with a bundled JDK, ensure the JAVA_HOME environment variable is not set.
Initiating the setup of passwords for reserved users elastic,apm_system,kibana,kibana_system,logstash_system,beats_system,remote_monitoring_user.
You will be prompted to enter passwords as the process progresses.
Please confirm that you would like to continue [y/N]y

Enter password for [elastic]:
Reenter password for [elastic]:
Enter password for [apm_system]:
Reenter password for [apm_system]:
Enter password for [kibana_system]:
Reenter password for [kibana_system]:
Enter password for [logstash_system]:
Reenter password for [logstash_system]:
Enter password for [beats_system]:
Reenter password for [beats_system]:
Enter password for [remote_monitoring_user]:
Reenter password for [remote_monitoring_user]:
Changed password for user [apm_system]
Changed password for user [kibana_system]
Changed password for user [kibana]
Changed password for user [logstash_system]
Changed password for user [beats_system]
Changed password for user [remote_monitoring_user]
Changed password for user [elastic]

Open `http://localhost:9200` in a browser; with security enabled, it now asks for a username and password
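
Once security is enabled, scripted requests need credentials as well. The sketch below builds an HTTP Basic authentication header with the Python standard library; the helper name `build_authenticated_request` is illustrative, and the user name and password are the example values used earlier in this document.

```python
import base64
import urllib.request


def build_authenticated_request(url: str, user: str, password: str) -> urllib.request.Request:
    """Return a request carrying an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Basic {token}")
    return request


# Example values only; replace them with the password you actually set.
request = build_authenticated_request("http://localhost:9200", "elastic", "123456")
```

Passing `request` to `urllib.request.urlopen` sends the authenticated call to a running node.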

Plugins

List the plugins installed in the current cluster

http://localhost:9200/_cat/plugins?v
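
The `?v` flag adds a header row, which makes the plain-text response easy to parse. The sketch below splits such a table on whitespace; it assumes no cell contains spaces, and both `parse_cat_table` and the sample text are illustrative rather than captured from a real cluster.

```python
from typing import Dict, List


def parse_cat_table(text: str) -> List[Dict[str, str]]:
    """Turn a whitespace-aligned _cat table with a header row into dicts."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    headers = lines[0].split()
    # Pair each row's cells with the column names from the header row.
    return [dict(zip(headers, line.split())) for line in lines[1:]]


# Hand-written sample in the shape of a _cat/plugins?v response.
SAMPLE_TABLE = """name   component   version
node-1 analysis-ik 7.16.2
"""
```

For example, `[row["component"] for row in parse_cat_table(SAMPLE_TABLE)]` lists the installed plugin names.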

Install a plugin

# Install the Chinese analyzer IKAnalyzer, then restart
docker exec -it elasticsearch /bin/bash

# Run this command inside the container
elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.16.2/elasticsearch-analysis-ik-7.16.2.zip

docker restart elasticsearch

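Configure IKAnalyzer extension and stopword dictionaries

Extension dictionary: a set of proper nouns or special terms that the analyzer keeps whole instead of splitting

Stopwords: meaningless words that do not need to be stored or searched; they are skipped during analysis

Configuration file location: config/IKAnalyzer.cfg.xml

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
    <comment>IK Analyzer extension configuration</comment>
    <!-- Configure your own extension dictionaries here -->
    <entry key="ext_dict">custom/mydict.dic;custom/single_word_low_freq.dic</entry>
    <!-- Configure your own extension stopword dictionaries here -->
    <entry key="ext_stopwords">custom/ext_stopword.dic</entry>
    <!-- Configure a remote extension dictionary here -->
    <entry key="remote_ext_dict">http://xxx.com/xxx.dic</entry>
    <!-- Configure a remote extension stopword dictionary here -->
    <entry key="remote_ext_stopwords">http://xxx.com/xxx.dic</entry>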
</properties>
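
To check which dictionary files a configuration refers to, the file can be read with the Python standard library. This is a minimal sketch that assumes several dictionaries in one entry are separated by semicolons, as in the ext_dict entry above; `read_ik_entries` and the shortened `SAMPLE_CONFIG` are illustrative.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def read_ik_entries(xml_bytes: bytes) -> Dict[str, List[str]]:
    """Map each <entry key="..."> to the list of dictionary paths it names."""
    root = ET.fromstring(xml_bytes)
    entries: Dict[str, List[str]] = {}
    for entry in root.findall("entry"):
        text = (entry.text or "").strip()
        # Several dictionaries in one entry are separated by semicolons.
        entries[entry.get("key")] = [p.strip() for p in text.split(";") if p.strip()]
    return entries


# Shortened copy of the configuration shown above.
SAMPLE_CONFIG = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
    <entry key="ext_dict">custom/mydict.dic;custom/single_word_low_freq.dic</entry>
    <entry key="ext_stopwords">custom/ext_stopword.dic</entry>
</properties>
"""
```

`read_ik_entries(SAMPLE_CONFIG)["ext_dict"]` then yields the two extension dictionary paths.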

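Exporting and importing indices and data

Introduction

Note when using Elasticdump: if you install it directly with npm install elasticdump -g, Node.js v10.0.0 or later is required; otherwise the command fails

Elasticdump works by sending an input to an output. Its standard invocation is

elasticdump --input SOURCE --output DESTINATION [OPTIONS]

--input SOURCE reads from the data source SOURCE.
--output DESTINATION writes the data to the destination DESTINATION.
SOURCE and DESTINATION can each be an Elasticsearch URL or a file. An Elasticsearch URL such as http://127.0.0.1/index means that index data is imported directly into, or exported directly from, the ES instance at http://127.0.0.1.
[OPTIONS] are the operation options. The most common are type and limit; the others are not covered here.

type is the kind of ES data to export or import. Elasticdump supports the following types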

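type	Description
mapping	Index mapping structure
data	Index data (documents)
settings	Default index settings
analyzer	Analyzers
template	Template structure
alias	Index aliases

limit is the number of objects moved per batch from SOURCE to DESTINATION. The default is 100 and it can be changed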
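
When elasticdump is driven from a script, assembling the arguments in one place keeps the type and limit options consistent. The sketch below only builds the argument list and validates the type against the table above; `build_elasticdump_args` is an illustrative helper, not part of elasticdump.

```python
from typing import List

# The six types listed in the table above.
SUPPORTED_TYPES = {"mapping", "data", "settings", "analyzer", "template", "alias"}


def build_elasticdump_args(source: str, destination: str,
                           dump_type: str = "data", limit: int = 100) -> List[str]:
    """Build the argument list for one elasticdump invocation."""
    if dump_type not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported type: {dump_type}")
    if limit <= 0:
        raise ValueError("limit must be positive")
    return [
        "elasticdump",
        f"--input={source}",
        f"--output={destination}",
        f"--type={dump_type}",
        f"--limit={limit}",
    ]
```

The resulting list can be handed to `subprocess.run(args, check=True)` on a machine where elasticdump is installed.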

Installing Elasticdump

1. Installing Elasticdump online depends on Node.js, so first install Node.js v10.0.0 or later.

[root@zhu opt]# wget https://nodejs.org/dist/v12.18.3/node-v12.18.3-linux-x64.tar.xz
[root@zhu opt]# tar xvf  node-v12.18.3-linux-x64.tar.xz -C /usr/local/
[root@zhu opt]# mv /usr/local/node-v12.18.3-linux-x64 /usr/local/nodejs
[root@zhu opt]# echo 'export NODEJS_HOME=/usr/local/nodejs' >> /etc/profile
[root@zhu opt]# echo 'export PATH=$PATH:$NODEJS_HOME/bin' >> /etc/profile
[root@zhu opt]# echo 'export NODEJS_PATH=$NODEJS_HOME/lib/node_modules' >> /etc/profile
[root@zhu opt]# source /etc/profile
[root@zhu opt]# ln -s /usr/local/nodejs/bin/node /usr/local/bin/node
[root@zhu opt]# ln -s /usr/local/nodejs/bin/npm /usr/local/bin/npm
[root@zhu opt]# npm -v
6.14.6
[root@zhu opt]# node -v
v12.18.3

2. Install elasticdump with npm

[root@zhu opt]# npm install elasticdump -g

After a successful installation, change to

[root@zhu opt]# cd /usr/local/nodejs/lib/node_modules/elasticdump/bin

There are two commands: elasticdump backs up a single index, and multielasticdump can back up multiple indices in parallel:

[root@zhu bin]# ll
total 20
-rwxr-xr-x. 1 1001 1001  4026 Apr  9 14:38 elasticdump
-rwxr-xr-x. 1 1001 1001 14598 Oct 26  1985 multielasticdump

Export

Back up and restore a single index with elasticdump
- Export the mapping of the index test_event:

[root@zhu opt]# elasticdump --input=http://127.0.0.1:9200/test_event --output=/opt/test_event_mapping.json --type=mapping

Check the current directory; the mapping has been backed up as a JSON file:

[root@zhu opt]# ll
total 14368
-rw-r--r--. 1 root root     6200 Apr  9 11:30 ucas_hisevenr_mapping.json

The mapping can also be imported directly into another ES cluster:

[root@zhu opt]# elasticdump --input=http://127.0.0.1:9200/test_event --output=http://127.0.0.2:9200/test_event --type=mapping

- Export the data of the index test_event:

[root@zhu opt]# elasticdump --input=http://127.0.0.1:9200/test_event --output=/opt/data.json --type=data

In the same way, the data can be imported directly into another ES cluster:

[root@zhu opt]# elasticdump --input=http://127.0.0.1:9200/test_event --output=http://127.0.0.2:9200/test_event --type=data

Import

- Restore the mapping:

[root@zhu opt]# elasticdump --input=/opt/test_event_mapping.json --output http://127.0.0.1:9200/ --type=mapping

- Restore the data:

[root@zhu opt]# elasticdump --input=/opt/data.json --output=http://127.0.0.1:9200/test_event --type=data

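Bulk export and import

# Back up all ES indices, with all their types, to the es_backup folder
multielasticdump --direction=dump --match='^.*$' --input=http://127.0.0.1:9200 --output=/tmp/es_backup
# Back up only the ES indices whose names end with "-index" (regular-expression match). Only index data is backed up; all other types are ignored.
# Note: the analyzer and alias types are ignored by default
multielasticdump --direction=dump --match='^.*-index$' --input=http://127.0.0.1:9200 --ignoreType='mapping,settings,template' --output=/tmp/es_backup

Restore multiple indices with multielasticdump:

multielasticdump --direction=load --input=/tmp/es_backup --output=http://127.0.0.1:9200

According to the elasticdump page on npm, multielasticdump differs from elasticdump mainly in how --direction and --ignoreType are set

For a backup, --direction=dump is the default; --input must be the base URL of the Elasticsearch server (for example http://localhost:9200) and --output must be a directory. A data, mapping, and analyzer file is created for each matching index
For a restore, to load files dumped by multielasticdump, set --direction to load; --input must be the directory of a multielasticdump dump and --output must be an Elasticsearch server URL
--match filters which indices are dumped or loaded (regular expression)
--ignoreType excludes types from the dump or load. Six options are supported: data,mapping,analyzer,alias,settings,template. Several types can be given, separated by commas. --interval controls the interval between spawning the dump or load of each new index
--includeType includes types in the dump or load. The same six options are supported: data,mapping,analyzer,alias,settings,template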
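
Before running a bulk dump, it is worth checking which index names a --match pattern would select. The sketch below uses Python's `re` module as a stand-in; multielasticdump evaluates the pattern as a JavaScript regular expression, so this is only a reasonable preview for simple patterns like the ones above. `select_indices` and the index names are illustrative.

```python
import re
from typing import Iterable, List


def select_indices(names: Iterable[str], pattern: str) -> List[str]:
    """Return the index names that the regular expression matches."""
    regex = re.compile(pattern)
    return [name for name in names if regex.search(name)]


# Hypothetical index names used only for this preview.
candidates = ["logs-index", "users", "index-old", "a-index"]
matched = select_indices(candidates, r"^.*-index$")
```

Here `matched` contains only the names ending in "-index", while the pattern `^.*$` would select every name.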

Kibana

https://www.cnblogs.com/cjsblog/p/9476813.html

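Overview

Kibana is an open-source analytics and visualization platform designed to work with Elasticsearch

You use Kibana to search, view, and interact with data stored in Elasticsearch indices

You can easily perform advanced data analysis and visualize the data in a variety of charts, tables, and maps

Kibana makes large volumes of data easy to understand. Its simple, browser-based interface lets you quickly create and share dynamic dashboards that show changes to Elasticsearch queries in real time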

Configuration

https://www.elastic.co/guide/en/kibana/current/settings.html

Configuration file: config/kibana.yml

`elasticsearch.hosts:`	The URLs of the Elasticsearch instances to use for all your queries (named `elasticsearch.url` before 7.0). All nodes listed here must be on the same cluster. Default: `[ "http://localhost:9200" ]`. To enable SSL/TLS for outbound connections to Elasticsearch, use the `https` protocol in this setting.
`server.name:`	A human-readable display name that identifies this Kibana instance. Default: `"your-hostname"`
`server.port:`	Kibana is served by a back end server. This setting specifies the port to use. Default: `5601`

The remaining entries are Elasticsearch settings (config/elasticsearch.yml):

cluster.name:	docker-cluster (cluster name)
network.host:	127.0.0.1 (local access only)
path.data:	D:\elasticsearch-6.2.1\data
path.logs:	D:\elasticsearch-6.2.1\logs
http.cors.enabled:	true
http.cors.allow-origin:	"*"
transport.tcp.port:	9300
transport.tcp.compress:	true
http.port:	9200

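elasticsearch-head browser extension (for Chrome, 360, and similar browsers)

Supports creating, viewing, and deleting indices, searching data, and more
The compound query tab supports creating index mappings

Storing and querying geographic data

Geo_Point

geo_point is the Elasticsearch field type for storing geographic coordinates, that is, latitude and longitude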

{
  "mappings": {
    "properties": {
      "Cancelled": {
        "type": "boolean"
      },
      "FlightDelayMin": {
        "type": "integer"
      },
      "AvgTicketPrice": {
        "type": "float"
      },
      "timestamp": {
        "type": "date"
      },
      "FlightNum": {
        "type": "keyword"
      },
      "DestLocation": {
        "type": "geo_point"
      },
      "OriginLocation": {
        "type": "geo_point"
      }
    }
  }
}

Storage

Source: "Elasticsearch地理信息存储及查询之Geo_Point", Zhihu (zhihu.com)

String

{
  "id": 10001,
  "location": "25.345,127.453"  // latitude,longitude
}

Object

{
  "id": 10001,
  "location": {
    "lon": 127.453,
    "lat": 25.345
  }
}

Array (the order is longitude, latitude)

{
  "id": 10001,
  "location": [
    127.453,
    25.345
  ]
}
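
The three formats order the coordinates differently: the string puts latitude first, while the array puts longitude first. The sketch below normalizes all three to a (latitude, longitude) pair so the difference is explicit. It covers only the formats shown here, and `to_lat_lon` is an illustrative helper, not an Elasticsearch client function.

```python
from typing import Any, Tuple


def to_lat_lon(value: Any) -> Tuple[float, float]:
    """Normalize a geo_point value to a (latitude, longitude) tuple."""
    if isinstance(value, str):
        # String form: "lat,lon"
        lat_text, lon_text = value.split(",")
        return float(lat_text), float(lon_text)
    if isinstance(value, dict):
        # Object form: {"lat": ..., "lon": ...}
        return float(value["lat"]), float(value["lon"])
    if isinstance(value, (list, tuple)) and len(value) == 2:
        # Array form: [lon, lat], longitude first
        lon, lat = value
        return float(lat), float(lon)
    raise ValueError(f"unsupported geo_point value: {value!r}")
```

All three sample documents above normalize to the same point, latitude 25.345 and longitude 127.453.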

Problems

Spring Data Elasticsearch and the matching Elasticsearch versions

https://docs.spring.io/spring-data/elasticsearch/docs/current/reference/html/#preface.requirements

Spring Data Release Train	Spring Data Elasticsearch	Elasticsearch	Spring Framework	Spring Boot
2021.0 (Pascal)[1]	4.2.x[1]	7.12.0	5.3.x[1]	2.4.x[1]
2020.0 (Ockham)	4.1.x	7.9.3	5.3.2	2.4.x
Neumann	4.0.x	7.6.2	5.2.12	2.3.x
Moore	3.2.x	6.8.12	5.2.12	2.2.x
Lovelace[2]	3.1.x[2]	6.2.2	5.1.19	2.1.x
Kay[2]	3.0.x[2]	5.5.0	5.0.13	2.0.x
Ingalls[2]	2.1.x[2]	2.4.0	4.3.25	1.5.x

Elasticsearch health check failed

Disable the Actuator health check for Elasticsearch

management:
  health:
    elasticsearch:
      enabled: false
