
Building a Facial Recognition System with Elasticsearch and Python: Elastic Stack in Practice

Published: 2021-05-19 00:00


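Author: Liu Xiaoguo (刘晓国)

Have you ever tried to search for objects in images? Elasticsearch can help you store, analyze, and search for objects in images or video.

In this article, we show how to build a facial recognition system with Python: how to detect and encode facial information, and how to find matches with a search.

We will follow the code at https://github.com/liu-xiao-guo/face_detection_elasticsearch. You can download it to your local machine: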

$ pwd
/Users/liuxg/python/face_detection
$ tree -L 2
├── README.md
├── getVectorFromPicture.py
├── images
│ ├── shay.png
│ ├── simon.png
│ ├── steven.png
│ └── uri.png
├── images_to_be_recognized
│ └── facial-recognition-blog-elastic-founders-match.png
└── recognizeFaces.py

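The code above contains two Python files:

getVectorFromPicture.py: processes the images in the images directory and imports their face encodings into Elasticsearch
recognizeFaces.py: recognizes the faces in the image files in the images_to_be_recognized directory

Basics

Facial recognition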

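Facial recognition is the process of identifying a person from their facial features, for example to implement an authentication mechanism such as unlocking a smartphone. It captures, analyzes, and compares patterns based on the details of a person's face. The process can be divided into three steps:

1. Face detection: locate the faces in a digital image
2. Face data encoding: convert the facial features into a numeric representation
3. Face matching: search for and compare facial features

The example walks through each of these steps.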

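128-dimensional vectors

Facial features can be converted into a set of numbers so that they can be stored and analyzed. With the library used in this article, each face becomes a vector of 128 floating-point values.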

Vector data type

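Elasticsearch provides the dense_vector data type to store dense vectors of float values. A vector may have at most 2048 dimensions, which is more than enough to store a facial feature representation.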
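As a rough illustration of what ends up in such a field, the sketch below builds a document with a name and a 128-value vector. The helper name build_face_doc and its checks are assumptions made for this example; they are not part of the article's code.

```python
from numbers import Real

EXPECTED_DIMS = 128  # size of the face encodings used in this article


def build_face_doc(name, encoding, dims=EXPECTED_DIMS):
    """Return a dict with a face name and its vector.

    Hypothetical helper: checks that the encoding has the expected
    number of numeric components before it is sent to Elasticsearch.
    """
    values = list(encoding)
    if len(values) != dims:
        raise ValueError(f"expected {dims} dimensions, got {len(values)}")
    if not all(isinstance(v, Real) and not isinstance(v, bool) for v in values):
        raise TypeError("every component must be a number")
    return {"face_name": name, "face_encoding": [float(v) for v in values]}
```

Checking the length on the client side gives a clearer error than waiting for Elasticsearch to reject a vector whose size does not match the mapping.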

Now let's put these concepts into practice.

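Prerequisites

To detect faces and encode their information, you need the following:

Python: this example uses Python 3
Elasticsearch cluster: you can start a cluster for free with Alibaba Cloud Elasticsearch (阿里云 Elasticsearch). In this article I use a local deployment of Elasticsearch and Kibana.
Face recognition library: face_recognition, a simple facial recognition library for Python.
Python Elasticsearch client: the official Python client for Elasticsearch. Client documentation: https://elasticsearch-py.readthedocs.io/en/v7.10.1/

Python tutorial: https://elasticstack.blog.csdn.net/article/details/111573923

Python download: https://www.python.org/downloads/

Note: the instructions below were tested on Ubuntu 20.04 LTS and Ubuntu 18.04 LTS. Depending on your operating system, some changes may be needed. Although the installation steps target Ubuntu, the same sequence works on macOS, with some commands differing.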

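Installing Python and the Python libraries

Ubuntu 20.04 and other Debian-based Linux distributions ship with Python 3 preinstalled.

If that is not the case on your system, download and install Python from https://www.python.org/downloads/

To make sure your installed packages are up to date, run: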

sudo apt update 
sudo apt upgrade

Confirm that the Python version is 3.x:

python3 -V

Or:

python --version

Install pip3 to manage Python libraries:

sudo apt install -y python3-pip

Install cmake, which is required by the face_recognition library:

pip3 install CMake

Add the cmake bin folder to $PATH:

export PATH=$CMake_bin_folder:$PATH

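In my tests this step was not necessary. If typing cmake in a terminal runs the command, you can skip the export above.

Finally, before writing the main scripts, install the following libraries: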

pip3 install dlib 
pip3 install numpy 
pip3 install face_recognition 
pip3 install elasticsearch
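To confirm that the installation worked before going further, a small check like the following can help. It is only a sketch: the function name missing_modules is made up for this example, and it only tests whether Python can locate each module, not whether the module works.

```python
import importlib.util

# Import names of the libraries installed above.
REQUIRED_MODULES = ("dlib", "numpy", "face_recognition", "elasticsearch")


def missing_modules(names=REQUIRED_MODULES):
    """Return the names that the import system cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing libraries:", ", ".join(missing))
    else:
        print("All required libraries were found.")
```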

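Detecting and encoding facial information from images

With the face_recognition library, we can detect faces in an image and convert the facial features into 128-dimensional vectors.

To do this, we create a file called getVectorFromPicture.py: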

getVectorFromPicture.py

import os
from pathlib import Path

import face_recognition
from elasticsearch import Elasticsearch

# Connect to Elasticsearch; change host and port if yours differ
es = Elasticsearch([{'host': 'localhost', 'port': 9200}])

cwd = os.getcwd()
print("cwd: " + cwd)

# Get the images directory
rootdir = cwd + "/images"
print("rootdir: " + rootdir)

for subdir, dirs, files in os.walk(rootdir):
    for file in files:
        file_path = os.path.join(subdir, file)
        print(file_path)
        image = face_recognition.load_image_file(file_path)
        # detect the faces in the image
        face_locations = face_recognition.face_locations(image)
        # compute the 128-dimension encoding of each detected face
        face_encodings = face_recognition.face_encodings(image, face_locations)
        for face_encoding in face_encodings:
            # the file name without its extension is used as the person's name
            name = Path(file_path).stem
            face_encoding = face_encoding.tolist()
            print("Face found == ", face_encoding)
            print("name: " + name)
            # format a dictionary to be indexed
            e = {
                "face_name": name,
                "face_encoding": face_encoding
            }
            res = es.index(index='faces', doc_type='_doc', body=e)

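Note first that you need to change the Elasticsearch address in the code above if your Elasticsearch is not running at localhost:9200. The code itself is simple: it scans every file in the images subdirectory of the current directory and encodes each face it finds. We use the Python client API to import the data into Elasticsearch. Our images folder contains four files.

Before importing the data, create an index called faces in Kibana: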
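The script sends one index request per face, which is fine for four images. For a larger image folder, one option is to batch the documents with the client's bulk helper. The sketch below is an alternative under that assumption, not the article's method; build_bulk_actions and index_faces are names invented for this example.

```python
def build_bulk_actions(named_encodings, index="faces"):
    """Yield one bulk action per (name, encoding) pair.

    `named_encodings` is any iterable of (name, sequence of floats).
    """
    for name, encoding in named_encodings:
        yield {
            "_index": index,
            "_source": {
                "face_name": name,
                "face_encoding": list(encoding),
            },
        }


def index_faces(es, named_encodings, index="faces"):
    """Send all actions through elasticsearch.helpers.bulk."""
    # Imported here so the action builder can be used without the client installed.
    from elasticsearch import helpers

    return helpers.bulk(es, build_bulk_actions(named_encodings, index))
```

With the script above, the pairs would be collected inside the loop, for example as (name, face_encoding.tolist()), and passed to index_faces once at the end.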

PUT faces
{
  "mappings": {
    "properties": {
      "face_name": {
        "type": "keyword"
      },
      "face_encoding": {
        "type": "dense_vector",
        "dims": 128
      }
    }
  }
}
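If you prefer to create the index from Python instead of Kibana, something along these lines should work with a 7.x client such as the one linked earlier. Treat it as a sketch: the function names are made up for this example, and the body keyword follows the 7.x client style.

```python
def faces_mapping(dims=128):
    """Return the mapping from the Kibana request above as a dict."""
    return {
        "mappings": {
            "properties": {
                "face_name": {"type": "keyword"},
                "face_encoding": {"type": "dense_vector", "dims": dims},
            }
        }
    }


def create_faces_index(es, index="faces"):
    """Create the index unless it already exists (7.x-style client calls)."""
    if es.indices.exists(index=index):
        return False
    es.indices.create(index=index, body=faces_mapping())
    return True
```

Returning False when the index already exists lets the import script be re-run without failing on an existing index.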

Let's run getVectorFromPicture.py to get the facial feature representations of the images of the Elastic founders.

python3 getVectorFromPicture.py

The facial feature representations are now stored in Elasticsearch.

We can see four documents in Elasticsearch:

GET faces/_count

{
  "count" : 4,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  }
}

We can also view the documents in the faces index:

GET faces/_search

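Matching faces

Suppose we have indexed four documents in Elasticsearch, each containing the facial representation of one of the Elastic founders. We can now use other images of the founders to match against them.

To do this, we create a file called recognizeFaces.py.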

recognizeFaces.py

import os

import face_recognition
from elasticsearch import Elasticsearch

es = Elasticsearch([{'host': 'localhost', 'port': 9200}])

cwd = os.getcwd()
# Get the directory of images to be recognized
rootdir = cwd + "/images_to_be_recognized"

for subdir, dirs, files in os.walk(rootdir):
    for file in files:
        file_path = os.path.join(subdir, file)
        print(file_path)
        image = face_recognition.load_image_file(file_path)
        # detect the faces in the image
        face_locations = face_recognition.face_locations(image)
        # compute the 128-dimension encoding of each detected face
        face_encodings = face_recognition.face_encodings(image, face_locations)
        # search Elasticsearch for the closest match to each face
        i = 0
        for face_encoding in face_encodings:
            i += 1
            print("Face", i)
            response = es.search(
                index="faces",
                body={
                    "size": 1,
                    "_source": "face_name",
                    "query": {
                        "script_score": {
                            "query": {
                                "match_all": {}
                            },
                            "script": {
                                "source": "cosineSimilarity(params.query_vector, 'face_encoding')",
                                "params": {
                                    "query_vector": face_encoding.tolist()
                                }
                            }
                        }
                    }
                }
            )
            for hit in response['hits']['hits']:
                print("score: {}".format(hit['_score']))
                # a hit counts as a match only if it scores above 0.92
                if float(hit['_score']) > 0.92:
                    print("== This face match with ", hit['_source']['face_name'], ",the score is", hit['_score'])
                else:
                    print("== Unknown face")

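This script is also simple. It reads the image files to be recognized from the images_to_be_recognized directory and identifies the faces in them. We use the cosineSimilarity function to compute the cosine similarity between the given query vector and the document vectors stored in Elasticsearch.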

# search Elasticsearch for the closest match to each face
i = 0
for face_encoding in face_encodings:
    i += 1
    print("Face", i)
    response = es.search(
        index="faces",
        body={
            "size": 1,
            "_source": "face_name",
            "query": {
                "script_score": {
                    "query": {
                        "match_all": {}
                    },
                    "script": {
                        "source": "cosineSimilarity(params.query_vector, 'face_encoding')",
                        "params": {
                            "query_vector": face_encoding.tolist()
                        }
                    }
                }
            }
        }
    )
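To make the score easier to interpret, here is what cosine similarity computes, written in plain Python. This is only an illustration of the formula; it is not how Elasticsearch implements cosineSimilarity internally, and the function name is chosen for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

The result lies between -1 and 1. Vectors pointing in the same direction score 1, so a higher score means the two face encodings are more alike.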

Any hit that does not score above 0.92 is treated as an unknown face:

for hit in response['hits']['hits']:
    print("score: {}".format(hit['_score']))
    # a hit counts as a match only if it scores above 0.92
    if float(hit['_score']) > 0.92:
        print("== This face match with ", hit['_source']['face_name'], ",the score is", hit['_score'])
    else:
        print("== Unknown face")
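The same decision can be wrapped in a small function, which makes the threshold easy to tune and test without a running cluster. This is a sketch: best_match is a hypothetical helper, and it only assumes the response has the hits.hits structure used in the script.

```python
def best_match(response, threshold=0.92):
    """Return (name, score) for the first hit above `threshold`, else None.

    `response` is the dict returned by the search call; only the
    _score and _source.face_name of each hit are read.
    """
    for hit in response["hits"]["hits"]:
        score = float(hit["_score"])
        if score > threshold:
            return hit["_source"]["face_name"], score
    return None
```

A caller could then print the match or "Unknown face" depending on whether the result is None.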

Run the script above with python3 recognizeFaces.py. It reports every face whose match score is above 0.92.

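Advanced search

Facial recognition and search can be combined for advanced use cases. You can use Elasticsearch to build more complex queries, such as geo queries, bool queries (query-dsl-bool-query), and search aggregations.

For example, the following query applies a cosineSimilarity search to documents within a 200 km radius of a specific location. The query vector is shortened here for readability; a real query needs all 128 values.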

GET /_search
{
  "query": {
    "script_score": {
      "query": {
        "bool": {
          "must": {
            "match_all": {}
          },
          "filter": {
            "geo_distance": {
              "distance": "200km",
              "pin.location": {
                "lat": 40,
                "lon": -70
              }
            }
          }
        }
      },
      "script": {
        "source": "cosineSimilarity(params.query_vector, 'face_encoding')",
        "params": {
          "query_vector": [
            -0.14664565,
            0.07806452,
            0.03944433,
            -0.03167224,
            -0.13942884
          ]
        }
      }
    }
  }
}

Combining cosineSimilarity with other Elasticsearch queries lets you implement a wide range of more complex use cases.
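From Python, a query of this shape can be assembled as a dictionary and passed as the search body. The sketch below assumes the field names used in this article (face_encoding and pin.location); the function name geo_face_query and its parameters are invented for this example.

```python
def geo_face_query(query_vector, lat, lon, distance="200km",
                   vector_field="face_encoding", geo_field="pin.location"):
    """Build a script_score query restricted by a geo_distance filter."""
    return {
        "query": {
            "script_score": {
                "query": {
                    "bool": {
                        "must": {"match_all": {}},
                        "filter": {
                            "geo_distance": {
                                "distance": distance,
                                geo_field: {"lat": lat, "lon": lon},
                            }
                        },
                    }
                },
                "script": {
                    "source": f"cosineSimilarity(params.query_vector, '{vector_field}')",
                    "params": {"query_vector": list(query_vector)},
                },
            }
        }
    }
```

The returned dictionary could be passed as the body argument of es.search, with query_vector set to the full 128-value encoding of the face being looked up.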

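Conclusion

Facial recognition is relevant to many use cases, and you may already be using it in your daily life. The concepts described above generalize to detecting any kind of object in images or video, so you can extend them to a very wide range of applications.

Reference: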

https://www.elastic.co/blog/how-to-build-a-facial-recognition-system-using-elasticsearch-and-python
This article is a repost. Original link: https://developer.aliyun.com/article/784165