ElasticSearch: A Hands-On JD.com Mall Search Project


Tech stack:

SpringBoot 2.5.6

ElasticSearch 7.8.0

Vue

1. Project setup

1. Create a new Spring Boot project

2. Import dependencies

2.1 Make the Elasticsearch version managed by Spring Boot match the locally installed version

<properties>
<java.version>1.8</java.version>
<elasticsearch.version>7.8.0</elasticsearch.version>
</properties>

2.2 Import the dependencies

<dependencies>
<!-- jsoup: HTML page parsing -->
<dependency>
<groupId>org.jsoup</groupId>
<artifactId>jsoup</artifactId>
<version>1.10.2</version>
</dependency>
<!-- fastjson -->
<dependency>
<groupId>com.alibaba</groupId>
<artifactId>fastjson</artifactId>
<version>1.2.70</version>
</dependency>
<!-- ElasticSearch -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>
<!-- thymeleaf -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-thymeleaf</artifactId>
</dependency>
<!-- web -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>
<!-- devtools: hot reload -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-devtools</artifactId>
<scope>runtime</scope>
<optional>true</optional>
</dependency>
<!-- configuration processor: generates metadata for custom configuration properties -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-configuration-processor</artifactId>
<optional>true</optional>
</dependency>
<!-- lombok (requires the IDE plugin) -->
<dependency>
<groupId>org.projectlombok</groupId>
<artifactId>lombok</artifactId>
<optional>true</optional>
</dependency>
<!-- test -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-test</artifactId>
<scope>test</scope>
</dependency>
</dependencies>

3. Write the configuration file

# Change the port to avoid conflicts
server.port=9090
# Disable the Thymeleaf cache
spring.thymeleaf.cache=false

4. Import the static resources

Link: pan.baidu.com/s/10EF40UUK…
Extraction code: ot8j

5. Test access to the static page

@GetMapping({"/","/index"})
public String test(){
return "index";
}

Request: http://localhost:9090/

Project setup is complete!

2. Scraping the data

1. Request search.jd.com/Search?keyw… to load the search results page.

Inspect the page: the product list is the element with id J_goodsList.

Each li tag inside it holds the data of one product.

2. Parse the page and extract the data

Write a utility class that parses the page and extracts the product information (simple version):

package com.cheng.utils;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;

/**
 * @Author wpcheng
 * @Create 2021-11-10-9:46
 */
public class HTMLParseUtil {
    public static void main(String[] args) throws IOException {
        // 1. Build the request URL
        String url = "https://search.jd.com/Search?keyword=java";
        // 2. Fetch and parse the page. Jsoup returns a Document whose API mirrors the browser DOM methods used in JavaScript
        Document document = Jsoup.parse(new URL(url), 30000);
        // Get the "J_goodsList" element
        Element element = document.getElementById("J_goodsList");
        // Get all li tags inside "J_goodsList"
        Elements elements = element.getElementsByTag("li");
        // Iterate over the li tags; each el holds the data of one product
        for (Element el : elements) {

            // Image-heavy sites usually lazy-load their images, so the image URL is stored in "data-lazy-img"
            String img = el.getElementsByTag("img").eq(0).attr("data-lazy-img"); // product image URL
            String price = el.getElementsByClass("p-price").eq(0).text(); // product price
            String title = el.getElementsByClass("p-name").eq(0).text(); // product title

            System.out.println("=====================================");
            System.out.println(img);
            System.out.println(price);
            System.out.println(title);
        }
    }
}

Run it: the data is fetched successfully.
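The lazy-loading remark in the parser can be made a little more defensive. The sketch below is not part of the original project: `LazyImageUrl` is a hypothetical helper, it works on a plain attribute map so it runs without Jsoup, and it assumes that the image URL may be protocol-relative (starting with `//`), which is worth checking against the actual page.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not from the article: chooses a usable image URL
// from the attributes of an img tag.
final class LazyImageUrl {

    /**
     * Prefers the lazy-load attribute, falls back to src,
     * and returns "" when neither attribute is set.
     */
    static String resolve(Map<String, String> attrs) {
        String url = attrs.getOrDefault("data-lazy-img", "");
        if (url.isEmpty()) {
            url = attrs.getOrDefault("src", "");
        }
        // Assumption: protocol-relative URLs ("//host/path") need a scheme
        // before they can be used outside a browser.
        return url.startsWith("//") ? "https:" + url : url;
    }

    public static void main(String[] args) {
        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put("data-lazy-img", "//img.example.com/a.jpg"); // made-up sample value
        System.out.println(resolve(attrs));
    }
}
```

In the parser, the same idea would mean reading both `data-lazy-img` and `src` from the first img element and passing them through a helper like this one.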


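Write the entity class for the product information:

package com.cheng.pojo;

import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;

import java.io.Serializable;

@Data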
@NoArgsConstructor
@AllArgsConstructor
public class Content implements Serializable {
private static final long serialVersionUID = -8049497962627482693L;
private String name;
private String img;
private String price;

}

Wrap the parsing logic of HTMLParseUtil in a method (full version):

package com.cheng.utils;

import com.cheng.pojo.Content;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

public class HTMLParseUtil {
    public static void main(String[] args) throws Exception {
        // Test
        new HTMLParseUtil().parseJD("刘同").forEach(System.out::println);

    }

    public List<Content> parseJD(String keyword) throws Exception {

        String url = "https://search.jd.com/Search?keyword=" + keyword;
        // Fetch and parse the page. Jsoup returns a Document whose API mirrors the browser DOM methods used in JavaScript
        Document document = Jsoup.parse(new URL(url), 30000);
        // Get the "J_goodsList" element
        Element element = document.getElementById("J_goodsList");
        // Get all li tags inside "J_goodsList"
        Elements elements = element.getElementsByTag("li");
        // Iterate over the li tags; each el holds the data of one product
        // The list collects the content of every li
        List<Content> contents = new ArrayList<>();
        for (Element el : elements) {
            // Image-heavy sites usually lazy-load their images, so the image URL is stored in "data-lazy-img"
            String img = el.getElementsByTag("img").eq(0).attr("data-lazy-img"); // product image URL
            String price = el.getElementsByClass("p-price").eq(0).text(); // product price
            String title = el.getElementsByClass("p-name").eq(0).text(); // product title

            // Argument order must follow the field order of Content: name, img, price
            Content content = new Content(title, img, price);
            contents.add(content);
        }

        return contents;
    }
}

Run the utility class as a test:

The scraping test succeeds!
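The parser builds the search URL by appending the keyword directly. For keywords that contain spaces or non-ASCII characters, such as the one used in the test, it is usually safer to percent-encode the keyword explicitly instead of relying on whatever the HTTP layer does with it. A minimal sketch, assuming a hypothetical helper named `JdSearchUrl` that is not part of the original project:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, not from the article: builds the JD search URL
// with the keyword encoded as UTF-8.
final class JdSearchUrl {
    private static final String BASE = "https://search.jd.com/Search?keyword=";

    static String build(String keyword) {
        try {
            // URLEncoder percent-encodes non-ASCII bytes and turns spaces into '+'
            return BASE + URLEncoder.encode(keyword, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // "\u5218\u540c" is the keyword used in the article's test run
        System.out.println(build("\u5218\u540c"));
    }
}
```

The two-argument `URLEncoder.encode(String, String)` overload is used because the project targets Java 1.8.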

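3. Writing the business logic

1. Write the configuration class

import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ESHighClientConfig {

    // Registers a RestHighLevelClient bean that talks to the local Elasticsearch node
    @Bean
    public RestHighLevelClient restHighLevelClient() {
        return new RestHighLevelClient(
                RestClient.builder(new HttpHost("127.0.0.1", 9200, "http")));
    }

}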
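Since the client version pinned in pom.xml has to match the running node, it can help to compare it with what the node itself reports. Elasticsearch returns its version under `version.number` in the JSON served at the root endpoint (here http://127.0.0.1:9200/). The sketch below only parses such a response body; `EsVersionCheck` is a hypothetical helper, and the JSON in `main` is a shortened sample rather than real output.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not from the article: compares the version reported
// by the node with the version pinned in pom.xml.
final class EsVersionCheck {
    // Matches the "number" field of the root-endpoint response
    private static final Pattern NUMBER =
            Pattern.compile("\"number\"\\s*:\\s*\"([^\"]+)\"");

    /** Returns the version string found in the JSON, or null if it is absent. */
    static String extractVersion(String rootJson) {
        Matcher m = NUMBER.matcher(rootJson);
        return m.find() ? m.group(1) : null;
    }

    /** True when the reported version equals the pinned one. */
    static boolean matches(String rootJson, String pinnedVersion) {
        return pinnedVersion.equals(extractVersion(rootJson));
    }

    public static void main(String[] args) {
        // Shortened sample body; paste the real response of the root endpoint here
        String sample = "{\"version\":{\"number\":\"7.8.0\"}}";
        System.out.println(matches(sample, "7.8.0"));
    }
}
```

A regular expression is used only to keep the sketch dependency-free; in the project itself the body could just as well be parsed with fastjson.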

2. Write the service layer

package com.cheng.service;

import com.alibaba.fastjson.JSON;
import com.cheng.pojo.Content;
import com.cheng.utils.HTMLParseUtil;
import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.unit.TimeValue;
import org.elasticsearch.common.xcontent.XContentType;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.index.query.TermQueryBuilder;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.TimeUnit;


@Service
public class ContentService {

    @Autowired
    RestHighLevelClient restHighLevelClient;

    // Scrape the data and put it into the index
    public boolean parseContent(String keyword) throws Exception {
        // Parse the page with the custom utility class; parseJD is an instance method, so an instance is created first
        List<Content> contents = new HTMLParseUtil().parseJD(keyword);

        // Bulk-index the parsed data into ES
        BulkRequest bulkRequest = new BulkRequest();
        bulkRequest.timeout("2m");
        for (Content content : contents) {
            bulkRequest.add(
                    new IndexRequest("jd_goods")
                            .source(JSON.toJSONString(content), XContentType.JSON));
        }
        BulkResponse bulkResponse = restHighLevelClient.bulk(bulkRequest, RequestOptions.DEFAULT);

        return !bulkResponse.hasFailures(); // true means no document failed
    }


    // Search the indexed documents
    public List<Map<String,Object>> search(String keyword, int pageIndex, int pageSize) throws IOException {

        if (pageIndex < 0) {
            pageIndex = 0;
        }

        // Build the search request against the index
        SearchRequest request = new SearchRequest("jd_goods");

        // Build the search conditions
        SearchSourceBuilder sourceBuilder = SearchSourceBuilder.searchSource();
        // Term query: the keyword is compared with the indexed terms as-is, without analysis
        TermQueryBuilder termQuery = QueryBuilders.termQuery("name", keyword);
        // Put the term query into the search conditions
        sourceBuilder.query(termQuery).timeout(new TimeValue(60, TimeUnit.SECONDS));

        // Paging: from() is the offset of the first hit, so pageIndex is used as a raw offset here
        sourceBuilder.from(pageIndex);
        sourceBuilder.size(pageSize);

        // Put the search conditions into the request
        request.source(sourceBuilder);

        // Execute the search request
        SearchResponse response = restHighLevelClient.search(request, RequestOptions.DEFAULT);

        List<Map<String,Object>> list = new ArrayList<>();

        for (SearchHit documentFields : response.getHits().getHits()) {
            // Read each product as a map
            Map<String, Object> sourceAsMap = documentFields.getSourceAsMap();
            // Add it to the result list
            list.add(sourceAsMap);
        }
        return list;
    }

}
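One thing to keep in mind with the term query: it is not analyzed, whereas a text field indexed with the default standard analyzer stores lowercased tokens. The article does not create an explicit mapping, so the sketch below assumes that jd_goods uses the default dynamic mapping. Under that assumption a keyword such as "Java" would find nothing while "java" does. `TermKeyword` is a hypothetical helper, not part of the original project.

```java
import java.util.Locale;

// Hypothetical helper, not from the article: normalizes user input
// before it is passed to a term query on an analyzed text field.
final class TermKeyword {

    /** Trims the keyword and lowercases it; null becomes "". */
    static String normalize(String keyword) {
        if (keyword == null) {
            return "";
        }
        // Locale.ROOT keeps the result independent of the server's default locale
        return keyword.trim().toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(normalize("  Java "));
    }
}
```

This only helps for keywords that correspond to a single token. The standard analyzer splits Chinese text into single characters, so a multi-character Chinese keyword is unlikely to match as one term; an analyzed query such as `QueryBuilders.matchQuery` may be a better fit for that case.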

3. Write the controller layer

package com.cheng.controller;

import com.cheng.service.ContentService;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.ResponseBody;
import org.springframework.web.bind.annotation.RestController;

import java.io.IOException;
import java.util.List;
import java.util.Map;

// @Controller rather than @RestController, so that returning "index" resolves to the Thymeleaf template;
// the endpoints that return data are marked with @ResponseBody instead
@Controller
public class TextController {

    @Autowired
    private ContentService contentService;


    @GetMapping({"/","/index"})
    public String test() {
        return "index";
    }


    // Scrape JD for the keyword and index the results
    @ResponseBody
    @GetMapping("/parse/{keyword}")
    public Boolean parse(@PathVariable("keyword") String keyword) throws Exception {
        return contentService.parseContent(keyword);
    }


    // Search the indexed products with paging
    @ResponseBody
    @GetMapping("/search/{keyword}/{pageIndex}/{pageSize}")
    public List<Map<String, Object>> search(@PathVariable("keyword") String keyword,
                                            @PathVariable("pageIndex") Integer pageIndex,
                                            @PathVariable("pageSize") Integer pageSize) throws IOException {
        return contentService.search(keyword, pageIndex, pageSize);
    }

}

4. Testing

Request http://localhost:9090/parse/java to add documents: the scraped data is now stored in ES.

Request http://localhost:9090/search/java/1/20 to search for products containing the keyword "java", with paging: the query succeeds!
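Note that the service passes pageIndex straight to `from()`, which Elasticsearch interprets as the offset of the first hit rather than a page number, so the request above skips one document instead of one page. If page-based paging is wanted, the offset can be derived as in the sketch below. `PageOffsets` is a hypothetical helper, and treating pageIndex as zero-based is an assumption.

```java
// Hypothetical helper, not from the article: converts a page index
// into the document offset expected by SearchSourceBuilder.from().
final class PageOffsets {

    /** Converts a zero-based page index into a document offset. */
    static int offset(int pageIndex, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        int safeIndex = Math.max(pageIndex, 0);         // same clamping as the service
        return Math.multiplyExact(safeIndex, pageSize); // throws on int overflow
    }

    public static void main(String[] args) {
        System.out.println(offset(1, 20)); // offset of the second page of 20 hits
    }
}
```

In the service this would mean calling `sourceBuilder.from(PageOffsets.offset(pageIndex, pageSize))`. By default Elasticsearch rejects requests where from plus size exceeds 10000 (the `index.max_result_window` setting), so deep paging needs a different approach.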


4. Front-end and back-end interaction

1. Add the Vue and axios dependencies

2. Include the scripts in the HTML file

<script th:src="@{/js/vue.min.js}"></script>
<script th:src="@{/js/axios.min.js}"></script>

3. The index.html with the Vue bindings added

<!DOCTYPE html>
<html xmlns:th="http://www.thymeleaf.org">
<head>
    <meta charset="utf-8"/>
    <title>狂神说Java - ES JD.com clone project</title>
    <link rel="stylesheet" th:href="@{/css/style.css}"/>
    <script th:src="@{/js/jquery.min.js}"></script>
</head>
<body class="pg">
<div class="page">
    <div id="app" class=" mallist tmall- page-not-market ">
        <!-- Header search -->
        <div id="header" class=" header-list-app">
            <div class="headerLayout">
                <div class="headerCon ">
                    <!-- Logo-->
                    <h1 id="mallLogo">
                        <img th:src="@{/images/jdlogo.png}" alt="">
                    </h1>
                    <div class="header-extra">
                        <!-- Search -->
                        <div id="mallSearch" class="mall-search">
                            <form name="searchTop" class="mallSearch-form clearfix">
                                <fieldset>
                                    <legend>Tmall search</legend>
                                    <div class="mallSearch-input clearfix">
                                        <div class="s-combobox" id="s-combobox-685">
                                            <div class="s-combobox-input-wrap">
                                                <input v-model="keyword" type="text" autocomplete="off" id="mq"
                                                       class="s-combobox-input" aria-haspopup="true">
                                            </div>
                                        </div>
                                        <button type="submit" @click.prevent="searchKey" id="searchbtn">Search</button>
                                    </div>
                                </fieldset>
                            </form>
                            <ul class="relKeyTop">
                                <li><a>狂神说Java</a></li>
                                <li><a>狂神说前端</a></li>
                                <li><a>狂神说Linux</a></li>
                                <li><a>狂神说大数据</a></li>
                                <li><a>狂神聊理财</a></li>
                            </ul>
                        </div>
                    </div>
                </div>
            </div>
        </div>
        <!-- Product detail page -->
        <div id="content">
            <div class="main">
                <!-- Brand categories -->
                <form class="navAttrsForm">
                    <div class="attrs j_NavAttrs" style="display:block">
                        <div class="brandAttr j_nav_brand">
                            <div class="j_Brand attr">
                                <div class="attrKey">
                                    Brand
                                </div>
                                <div class="attrValues">
                                    <ul class="av-collapse row-2">
                                        <li><a href="#"> 狂神说 </a></li>
                                        <li><a href="#"> Java </a></li>
                                    </ul>
                                </div>
                            </div>
                        </div>
                    </div>
                </form>
                <!-- Sort rules -->
                <div class="filter clearfix">
                    <a class="fSort fSort-cur">Overall<i class="f-ico-arrow-d"></i></a>
                    <a class="fSort">Popularity<i class="f-ico-arrow-d"></i></a>
                    <a class="fSort">New<i class="f-ico-arrow-d"></i></a>
                    <a class="fSort">Sales<i class="f-ico-arrow-d"></i></a>
                    <a class="fSort">Price<i class="f-ico-triangle-mt"></i><i class="f-ico-triangle-mb"></i></a>
                </div>
                <!-- Product details -->
                <div class="view grid-nosku" >
                    <div class="product" v-for="result in results">
                        <div class="product-iWrap">
                            <!-- Product cover image -->
                            <div class="productImg-wrap">
                                <a class="productImg">
                                    <img :src="result.img">
                                </a>
                            </div>
                            <!-- Price -->
                            <p class="productPrice">
                                <em v-text="result.price"></em>
                            </p>
                            <!-- Title -->
                            <p class="productTitle">
                                <a v-html="result.name"></a>
                            </p>
                            <!-- Shop name -->
                            <div class="productShop">
                                <span>Shop: 狂神说Java </span>
                            </div>
                            <!-- Sales information -->
                            <p class="productStatus">
                                <span>Monthly sales <em>999</em></span>
                                <span>Reviews <a>3</a></span>
                            </p>
                        </div>
                    </div>
                </div>
            </div>
        </div>
    </div>
</div>
<script src="https://cdn.jsdelivr.net/npm/vue@2/dist/vue.js"></script>
<script th:src="@{/js/axios.min.js}"></script>
<script>
    new Vue({
        el:"#app",
        data:{
            keyword: '', // the search keyword
            results:[]   // the results returned by the back end
        },
        methods:{
            searchKey(){
                var keyword = this.keyword;
                console.log(keyword);
                // Calls the highlight search endpoint: first page offset 0, 20 results
                axios.get('h_search/'+keyword+'/0/20').then(response=>{
                    console.log(response.data);
                    this.results=response.data;
                })
            }
        }
    });
</script>
</body>
</html>

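5. Keyword highlighting

Add a keyword-highlighting method to ContentService. The page requests h_search/{keyword}/{pageIndex}/{pageSize}, so the controller needs a mapping for that path which calls this method.

How it works: the highlighted field value replaces the original field value.
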
// Additional imports needed in ContentService:
// import org.elasticsearch.common.text.Text;
// import org.elasticsearch.search.SearchHits;
// import org.elasticsearch.search.fetch.subphase.highlight.HighlightBuilder;
// import org.elasticsearch.search.fetch.subphase.highlight.HighlightField;
public List<Map<String, Object>> highlightSearch(String keyword, Integer pageIndex, Integer pageSize) throws IOException {
    // Build the search request against the index
    SearchRequest searchRequest = new SearchRequest("jd_goods");
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    // Term query
    TermQueryBuilder termQueryBuilder = QueryBuilders.termQuery("name", keyword);
    searchSourceBuilder.timeout(new TimeValue(60, TimeUnit.SECONDS));
    // Add the query
    searchSourceBuilder.query(termQueryBuilder);
    // Paging
    searchSourceBuilder.from(pageIndex);
    searchSourceBuilder.size(pageSize);
    // Keyword highlighting: wrap each match in the "name" field in a red span
    HighlightBuilder highlightBuilder = new HighlightBuilder();
    highlightBuilder.field("name");
    highlightBuilder.preTags("<span style='color:red'>");
    highlightBuilder.postTags("</span>");
    searchSourceBuilder.highlighter(highlightBuilder);
    // Put the search conditions into the request
    searchRequest.source(searchSourceBuilder);
    // Execute the search request
    SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
    // Parse the results
    SearchHits hits = searchResponse.getHits();
    List<Map<String, Object>> results = new ArrayList<>();
    for (SearchHit documentFields : hits.getHits()) {
        // The highlighted value will overwrite the original value in this map
        Map<String, Object> sourceAsMap = documentFields.getSourceAsMap();
        // Highlighted fields of this hit
        Map<String, HighlightField> highlightFields = documentFields.getHighlightFields();
        HighlightField name = highlightFields.get("name");
        // Replace the original value when a highlight exists
        if (name != null) {
            Text[] fragments = name.fragments();
            // StringBuilder avoids repeated string concatenation
            StringBuilder newName = new StringBuilder();
            for (Text text : fragments) {
                newName.append(text);
            }
            sourceAsMap.put("name", newName.toString());
        }
        results.add(sourceAsMap);
    }
    return results;
}
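The overwrite step can be looked at in isolation, without a running cluster. The sketch below mirrors the idea with plain collections: `HighlightMerge` is a hypothetical helper, highlight fragments are modeled as lists of strings rather than Elasticsearch `Text` objects, and the sample data in `main` is made up.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not from the article: replaces source fields
// with their highlighted versions.
final class HighlightMerge {

    /**
     * Returns a copy of source in which every field that has highlight
     * fragments is replaced by the joined fragments; other fields are kept.
     */
    static Map<String, Object> merge(Map<String, Object> source,
                                     Map<String, List<String>> highlights) {
        Map<String, Object> merged = new HashMap<>(source);
        for (Map.Entry<String, List<String>> entry : highlights.entrySet()) {
            List<String> fragments = entry.getValue();
            if (fragments != null && !fragments.isEmpty()) {
                merged.put(entry.getKey(), String.join("", fragments));
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Object> source = new HashMap<>();
        source.put("name", "java book"); // made-up sample document
        source.put("price", "59.00");
        Map<String, List<String>> highlights = new HashMap<>();
        highlights.put("name", Arrays.asList("<span style='color:red'>java</span> book"));
        System.out.println(merge(source, highlights).get("name"));
    }
}
```

Copying the source map instead of mutating it is a design choice of this sketch; the method above writes into the map returned by `getSourceAsMap()` directly, which works as well.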

Render the highlight in index.html (v-html outputs the returned span tags as HTML instead of plain text):

<!-- Title -->
<p class="productTitle">
<a v-html="result.name"></a>
</p>

6. Final result

[Screenshots of the final result]

Reposted from: Juejin
