Implementing an Image Search Engine with OpenCV

A brief introduction to OpenCV.

OpenCV was designed for computational efficiency and with a strong focus on real-time applications. Written in optimized C/C++, the library can take advantage of multi-core processing. Enabled with OpenCL, it can take advantage of the hardware acceleration of the underlying heterogeneous compute platform. Adopted all around the world, OpenCV has more than 47 thousand people of user community and estimated number of downloads exceeding 9 million. Usage ranges from interactive art, to mines inspection, stitching maps on the web or through advanced robotics.

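In short, OpenCV (Open Source Computer Vision Library) is computationally efficient and able to handle real-time tasks. The library is written in optimized C/C++ and can take full advantage of multi-core processing and hardware acceleration. It has a large technical community and more than 9 million downloads, and it is used very widely, for example in human-computer interaction, resource inspection, and map stitching.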

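0. Implementing an image search engine with Python + OpenCV

After seeing that Google and Baidu had launched image search engines, I read up on the subject to understand how image search algorithms work, drawing in part on the complete guide to building an image search engine with Python and OpenCV. I then decided to implement a simple image search engine myself, which would also let me find pictures on my Mac faster. Why implement it with OpenCV + Python?

First, OpenCV is an open-source computer vision library that is widely used in computer vision, image processing, and pattern recognition. Its interfaces are safe and easy to use, and its cross-platform support is quite good, which makes it a rare find among image and vision processing libraries.
Second, Python's syntax is easier to use, close to natural language, and extremely flexible. Its computational efficiency is not high, but for rapid development it far outpaces C++ and other languages, and bringing in psyco can optimize loops in Python code, narrowing the computational gap with C/C++ to some extent. Image processing also involves a great deal of matrix computation; using numpy for matrix operations reduces the programming clutter, so more attention can go to the matching logic rather than to computational details.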

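1. How image search works

An image search algorithm can be broken down into roughly the following steps:

Extract image features. Options include SIFT, fingerprint functions, hash functions, and the bundling features algorithm. As noted in a discussion on Zhihu, one can also design the algorithm around patterns specific to a particular image collection to improve matching accuracy. For example, if all images are known to differ significantly in color space or composition in their central region, the analysis of the central region can be strengthened to extract features more efficiently.
Store the image features. The features are usually quantized into data kept in an index table on external storage. A search then only scans the features in the index table and looks up similar images from the best match downward. Images of different sizes and resolutions can be handled by downsampling or normalization.
Match by similarity. If feature vectors are stored, compare the weighted squared distance between them. If hash codes are stored, compare the Hamming distance. After this first pass, the best image set can be filtered further.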
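
The two comparisons named in the matching step can be sketched in plain Python. This is only an illustration, not the code used later in this post: the function names `weightedSquaredDistance` and `hammingDistance`, the weights, and the sample values are all assumptions made for the example.

```python
def weightedSquaredDistance(a, b, weights):
    # Sum of weighted squared differences between two equal-length feature vectors.
    return sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights))

def hammingDistance(hashA, hashB):
    # Number of bit positions in which two integer hash codes differ.
    return bin(hashA ^ hashB).count("1")

query = [0.2, 0.5, 0.3]
candidate = [0.1, 0.5, 0.6]
weights = [1.0, 1.0, 2.0]  # hypothetical per-component weights

print(weightedSquaredDistance(query, candidate, weights))
print(hammingDistance(0b10110010, 0b10011010))  # 2
```

In both cases a smaller value means a closer match, which is the convention the rest of this post follows.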

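2. Image search engine algorithm and framework design

Basic steps

Extract image features with a color-space feature extractor and a composition (structure) feature extractor.
The index-building driver generates the feature index tables for the image library to be searched.
The search engine driver executes the search command, generates the features of the query image, and passes them to the search matcher.
The search matching kernel performs the matching and returns the top limit best-matching images.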

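Required modules

numpy. A powerful tool for scientific computing and matrix operations.
cv2. The Python bindings for OpenCV.
re. The regular-expression module. Parses the composition and color feature sets stored in the csv files.
csv. Reads csv files efficiently.
glob. Collects file paths in a folder by wildcard pattern.
argparse. Defines the command-line arguments.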

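Wrapper classes and driver programs

Color-space feature extractor ColorDescriptor.

Class member bins. Holds the bin allocation for the hue, saturation, and value histograms in HSV color space. Too many bins can make the program slow and the matching overly strict; too few bins make the matching imprecise and fail to characterize the image.
Member function getHistogram(self, image, mask, isCenter). Builds the color feature histogram of the image. image is the image to process, mask is the mask of the region to process, and isCenter indicates whether the region is the image center so that its color feature vector can be weighted. The weight is 5.0. The histogram is computed with OpenCV's calcHist() and normalized with normalize().
Member function describe(self, image). Converts the image from BGR to HSV color space (note that OpenCV reads images as BGR, not RGB). Builds masks for the top-left, top-right, bottom-left, bottom-right, and center regions. The center mask is elliptical. This separates the center from the edges so that getHistogram() can weight the color features of different regions.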
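
To make the role of bins more concrete, the sketch below maps one HSV pixel to its histogram cell and counts the resulting feature length. It is an illustration only: the helper `binIndex` is hypothetical, the channel ranges mirror the ones passed to calcHist() in the class below, and the tuple (8, 12, 3) is the value the index driver later in this post uses.

```python
def binIndex(pixel, bins, upperBounds=(180, 256, 256)):
    # Map one (h, s, v) pixel to the (hBin, sBin, vBin) cell it falls into,
    # assuming each channel range [0, upper) is split into equal-width bins.
    return tuple(int(value * count // upper)
                 for value, count, upper in zip(pixel, bins, upperBounds))

bins = (8, 12, 3)
cellsPerRegion = bins[0] * bins[1] * bins[2]  # cells in one 3-D histogram
regions = 5                                   # four corners plus the center ellipse

print(binIndex((90, 128, 255), bins))         # (4, 6, 2)
print(cellsPerRegion * regions)               # 1440
```

Doubling every entry of bins would multiply the feature length by eight, which is the efficiency cost the description above warns about.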

class ColorDescriptor:
    __slots__ = ["bins"]
    def __init__(self, bins):
        self.bins = bins
    def getHistogram(self, image, mask, isCenter):
        # get histogram
        imageHistogram = cv2.calcHist([image], [0, 1, 2], mask, self.bins, [0, 180, 0, 256, 0, 256])
        # normalize
        imageHistogram = cv2.normalize(imageHistogram, imageHistogram).flatten()
        if isCenter:
            weight = 5.0
            # scale every bin of the center histogram in one vectorized step
            imageHistogram *= weight
        return imageHistogram
    def describe(self, image):
        image = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
        features = []
        # get dimension and center
        height, width = image.shape[0], image.shape[1]
        centerX, centerY = int(width * 0.5), int(height * 0.5)
        # initialize mask dimension
        segments = [(0, centerX, 0, centerY), (0, centerX, centerY, height), (centerX, width, 0, centerY), (centerX, width, centerY, height)]
        # initialize center part
        # integer division: cv2.ellipse expects integer axis lengths
        axesX, axesY = int(width * 0.75) // 2, int(height * 0.75) // 2
        ellipseMask = numpy.zeros([height, width], dtype="uint8")
        cv2.ellipse(ellipseMask, (centerX, centerY), (axesX, axesY), 0, 0, 360, 255, -1)
        # initialize corner part
        for startX, endX, startY, endY in segments:
            cornerMask = numpy.zeros([height, width], dtype="uint8")
            cv2.rectangle(cornerMask, (startX, startY), (endX, endY), 255, -1)
            cornerMask = cv2.subtract(cornerMask, ellipseMask)
            # get histogram of corner part
            imageHistogram = self.getHistogram(image, cornerMask, False)
            features.append(imageHistogram)
        # get histogram of center part
        imageHistogram = self.getHistogram(image, ellipseMask, True)
        features.append(imageHistogram)
        # return
        return features
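
The region split performed by describe() can be approximated without OpenCV. The sketch below classifies a single pixel as belonging to the center ellipse or to one of the four corners, using the same center point and axis lengths as the class above. `regionOf` is a hypothetical helper written for this example, and the pixel boundaries may differ slightly from the masks that cv2.ellipse() and cv2.rectangle() actually draw.

```python
def regionOf(x, y, width, height):
    # Return "center" if (x, y) lies inside the ellipse, else the corner quadrant.
    centerX, centerY = width // 2, height // 2
    axisX, axisY = int(width * 0.75) // 2, int(height * 0.75) // 2
    inside = (((x - centerX) / float(axisX)) ** 2
              + ((y - centerY) / float(axisY)) ** 2) <= 1.0
    if inside:
        return "center"
    vertical = "top" if y < centerY else "bottom"
    horizontal = "left" if x < centerX else "right"
    return vertical + "-" + horizontal

for point in [(200, 150), (10, 10), (390, 10), (10, 290), (390, 290)]:
    print(point, regionOf(point[0], point[1], 400, 300))
```

Each pixel contributes to exactly one of the five histograms in this simplified view, and only the center histogram is multiplied by the weight.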

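Composition (structure) feature extractor StructureDescriptor.

Class member dimension. Every image is normalized (downsampled) to the size given by dimension. Only then can the images be matched uniformly and their structure features generated.
Member function describe(self, image). Resizes the image to dimension and returns the resulting matrix for further processing in the search engine core. The conversion from BGR to HSV color space is present only as a commented-out line in the code, so the matrix stays in BGR (note that OpenCV reads images as BGR, not RGB).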

class StructureDescriptor:
    __slots__ = ["dimension"]
    def __init__(self, dimension):
        self.dimension = dimension
    def describe(self, image):
        image = cv2.resize(image, self.dimension, interpolation=cv2.INTER_CUBIC)
        # image = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
        return image
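
Resizing every image to the same dimension is what makes two structure matrices comparable element by element. The class above does this with cv2.resize() and bicubic interpolation. The sketch below swaps in plain block averaging on a small grayscale grid purely to show the idea; `downsample` is a hypothetical helper, and it assumes the source size is an exact multiple of the target size.

```python
def downsample(gray, outWidth, outHeight):
    # Average each block of source pixels into one output pixel.
    height, width = len(gray), len(gray[0])
    blockH, blockW = height // outHeight, width // outWidth
    result = []
    for row in range(outHeight):
        line = []
        for col in range(outWidth):
            block = [gray[y][x]
                     for y in range(row * blockH, (row + 1) * blockH)
                     for x in range(col * blockW, (col + 1) * blockW)]
            line.append(sum(block) / float(len(block)))
        result.append(line)
    return result

image = [[0, 0, 8, 8],
         [0, 0, 8, 8],
         [4, 4, 2, 6],
         [4, 4, 6, 2]]
print(downsample(image, 2, 2))  # [[0.0, 8.0], [4.0, 4.0]]
```

Whatever the original resolution, the output always has outWidth by outHeight values, so any two images can be compared position by position.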

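Search matching kernel Searcher.

Class members colorIndexPath and structureIndexPath. Hold the paths of the color feature index table and the structure feature index table.
Member function solveColorDistance(self, features, queryFeatures, eps = 1e-5). Computes the distance between the features and queryFeatures vectors; the code uses a chi-squared distance rather than a plain L2 norm. eps avoids division by zero.
Member function solveStructureDistance(self, structures, queryStructures, eps = 1e-5). Computes the same chi-squared-style distance between the structure vectors. eps avoids division by zero. The result is rescaled (divided by normalizeRatio) so that the color and structure distances stay in a comparable proportion and neither dominates.
Member function searchByColor(self, queryFeatures). Reads the index table with the csv module's reader and parses the values with re.split(). The dictionary searchResults stores the distance between the query image and each library image: the key is the library image name imageName, the value is distance.
Member function transformRawQuery(self, rawQueryStructures). Converts the raw query image matrix into the feature-vector form used for matching.
Member function searchByStructure(self, rawQueryStructures). Analogous to searchByColor.
Member function search(self, queryFeatures, rawQueryStructures, limit = 3). Sums the results of searchByColor and searchByStructure into a total score; a lower score means a smaller combined distance and a better match. Returns the top limit best-matching images.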

class Searcher:
    __slots__ = ["colorIndexPath", "structureIndexPath"]
    def __init__(self, colorIndexPath, structureIndexPath):
        self.colorIndexPath, self.structureIndexPath = colorIndexPath, structureIndexPath
    def solveColorDistance(self, features, queryFeatures, eps = 1e-5):
        distance = 0.5 * numpy.sum([((a - b) ** 2) / (a + b + eps) for a, b in zip(features, queryFeatures)])
        return distance
    def solveStructureDistance(self, structures, queryStructures, eps = 1e-5):
        distance = 0
        normalizeRatio = 5e3
        for index in range(len(queryStructures)):
            for subIndex in range(len(queryStructures[index])):
                a = structures[index][subIndex]
                b = queryStructures[index][subIndex]
                distance += (a - b) ** 2 / (a + b + eps)
        return distance / normalizeRatio
    def searchByColor(self, queryFeatures):
        searchResults = {}
        with open(self.colorIndexPath) as indexFile:
            reader = csv.reader(indexFile)
            for line in reader:
                features = []
                for feature in line[1:]:
                    feature = feature.replace("[", "").replace("]", "")
                    # split on whitespace and drop the empty strings left by padding
                    feature = [float(eachValue) for eachValue in re.split(r"\s+", feature) if eachValue != ""]
                    features.append(feature)
                distance = self.solveColorDistance(features, queryFeatures)
                searchResults[line[0]] = distance
        # print "feature", sorted(searchResults.iteritems(), key = lambda item: item[1], reverse = False)
        return searchResults
    def transformRawQuery(self, rawQueryStructures):
        queryStructures = []
        for substructure in rawQueryStructures:
            structure = []
            for line in substructure:
                for tripleColor in line:
                    structure.append(float(tripleColor))
            queryStructures.append(structure)
        return queryStructures
    def searchByStructure(self, rawQueryStructures):
        searchResults = {}
        queryStructures = self.transformRawQuery(rawQueryStructures)
        with open(self.structureIndexPath) as indexFile:
            reader = csv.reader(indexFile)
            for line in reader:
                structures = []
                for structure in line[1:]:
                    structure = structure.replace("[", "").replace("]", "")
                    # split on whitespace and drop the empty strings left by padding
                    structure = [float(eachValue) for eachValue in re.split(r"\s+", structure) if eachValue != ""]
                    structures.append(structure)
                distance = self.solveStructureDistance(structures, queryStructures)
                searchResults[line[0]] = distance
        # print "structure", sorted(searchResults.iteritems(), key = lambda item: item[1], reverse = False)
        return searchResults
    def search(self, queryFeatures, rawQueryStructures, limit = 3):
        featureResults = self.searchByColor(queryFeatures)
        structureResults = self.searchByStructure(rawQueryStructures)
        results = {}
        for key, value in featureResults.items():
            results[key] = value + structureResults[key]
        results = sorted(results.items(), key = lambda item: item[1], reverse = False)
        return results[ : limit]
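
The distance used by solveColorDistance() and the final ranking in search() can be tried on toy data. The sketch below is a simplified restatement for flat vectors that does not need NumPy; `chiSquared`, `rank`, and the sample scores are assumptions made for illustration and are not part of the Searcher class.

```python
def chiSquared(a, b, eps=1e-5):
    # 0.5 * sum((a_i - b_i)^2 / (a_i + b_i + eps)); eps avoids division by zero.
    return 0.5 * sum((x - y) ** 2 / (x + y + eps) for x, y in zip(a, b))

def rank(colorScores, structureScores, limit=3):
    # Add the two distances per image and return the `limit` smallest totals.
    totals = {name: colorScores[name] + structureScores[name] for name in colorScores}
    return sorted(totals.items(), key=lambda item: item[1])[:limit]

print(chiSquared([0.5, 0.5, 0.0], [0.5, 0.5, 0.0]))  # 0.0

# hypothetical per-image distances
colorScores = {"a.jpg": 0.0, "b.jpg": 12.0, "c.jpg": 7.5, "d.jpg": 30.0}
structureScores = {"a.jpg": 0.0, "b.jpg": 5.0, "c.jpg": 15.5, "d.jpg": 1.0}
print(rank(colorScores, structureScores))
```

An image compared with itself scores 0, which is why an exact copy of the query always ranks first.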

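Index-building driver index.py.
Imports color_descriptor and structure_descriptor, which parse the library images into color feature vectors and structure feature vectors.
Sets up the command-line arguments with argparse: the image library path, the color feature index path, and the structure feature index path.
Collects the image paths in the library with glob.
Generates the index text and writes it to csv files.
The driver can be started from the command line as follows.

python index.py --dataset dataset --colorindex color_index.csv --structureindex structure_index.csv

dataset is the image library path, color_index.csv is the color feature index path, and structure_index.csv is the structure feature index path.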

import color_descriptor
import structure_descriptor
import glob
import argparse
import cv2

searchArgParser = argparse.ArgumentParser()
searchArgParser.add_argument("-d", "--dataset", required = True, help = "Path to the directory that contains the images to be indexed")
searchArgParser.add_argument("-c", "--colorindex", required = True, help = "Path to where the computed color index will be stored")
searchArgParser.add_argument("-s", "--structureindex", required = True, help = "Path to where the computed structure index will be stored")
arguments = vars(searchArgParser.parse_args())

idealBins = (8, 12, 3)
colorDescriptor = color_descriptor.ColorDescriptor(idealBins)

output = open(arguments["colorindex"], "w")

for imagePath in glob.glob(arguments["dataset"] + "/*.jpg"):
    imageName = imagePath[imagePath.rfind("/") + 1 : ]
    image = cv2.imread(imagePath)
    features = colorDescriptor.describe(image)
    # write features to file
    features = [str(feature).replace("\n", "") for feature in features]
    output.write("%s,%s\n" % (imageName, ",".join(features)))
# close index file
output.close()

idealDimension = (16, 16)
structureDescriptor = structure_descriptor.StructureDescriptor(idealDimension)

output = open(arguments["structureindex"], "w")

# use the --dataset argument here too, so both indexes cover the same images
for imagePath in glob.glob(arguments["dataset"] + "/*.jpg"):
    imageName = imagePath[imagePath.rfind("/") + 1 : ]
    image = cv2.imread(imagePath)
    structures = structureDescriptor.describe(image)
    # write structures to file
    structures = [str(structure).replace("\n", "") for structure in structures]
    output.write("%s,%s\n" % (imageName, ",".join(structures)))
# close index file
output.close()
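
Each row written by index.py has the form `name,[v v v ...],[v v v ...]`: the image name followed by one bracketed, whitespace-separated vector per column. The sketch below reads one such row back with the csv module. The row content and the helper name `parseRow` are made up for the example; real rows hold far more values.

```python
import csv
import io
import re

def parseRow(row):
    # First column is the image name; each remaining column is one bracketed
    # vector whose values are separated by whitespace.
    name, vectors = row[0], []
    for column in row[1:]:
        text = column.replace("[", "").replace("]", "")
        vectors.append([float(v) for v in re.split(r"\s+", text) if v != ""])
    return name, vectors

sample = "fish.jpg,[0.   0.25 0.5 ],[1.  0.  0.75]\n"  # hypothetical index line
for row in csv.reader(io.StringIO(sample)):
    print(parseRow(row))
```

Because the vectors contain no commas, the csv reader splits a row cleanly into the name and one string per vector.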

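Search engine driver searchEngine.py.
Imports color_descriptor and structure_descriptor, which parse the query image into a color feature vector and a structure feature vector.
Sets up the command-line arguments with argparse: the image library path, the color feature index path, the structure feature index path, and the query image path.
Runs the search against the two index tables and displays the best-matching images.
The driver can be started from the command line as follows.

python searchEngine.py -c color_index.csv -s structure_index.csv -r dataset -q query/pyramid.jpg

dataset is the image library path, color_index.csv is the color feature index path, structure_index.csv is the structure feature index path, and query/pyramid.jpg is the query image path.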

import color_descriptor
import structure_descriptor
import searcher
import argparse
import cv2

searchArgParser = argparse.ArgumentParser()
searchArgParser.add_argument("-c", "--colorindex", required = True, help = "Path to where the computed color index will be stored")
searchArgParser.add_argument("-s", "--structureindex", required = True, help = "Path to where the computed structure index will be stored")
searchArgParser.add_argument("-q", "--query", required = True, help = "Path to the query image")
searchArgParser.add_argument("-r", "--resultpath", required = True, help = "Path to the result path")
searchArguments = vars(searchArgParser.parse_args())

idealBins = (8, 12, 3)
idealDimension = (16, 16)

colorDescriptor = color_descriptor.ColorDescriptor(idealBins)
structureDescriptor = structure_descriptor.StructureDescriptor(idealDimension)
queryImage = cv2.imread(searchArguments["query"])
colorIndexPath = searchArguments["colorindex"]
structureIndexPath = searchArguments["structureindex"]
resultPath = searchArguments["resultpath"]

queryFeatures = colorDescriptor.describe(queryImage)
queryStructures = structureDescriptor.describe(queryImage)

imageSearcher = searcher.Searcher(colorIndexPath, structureIndexPath)
searchResults = imageSearcher.search(queryFeatures, queryStructures)

for imageName, score in searchResults:
    queryResult = cv2.imread(resultPath + "/" + imageName)
    cv2.imshow("Result Score: " + str(int(score)) + " (lower is better)", queryResult)
    cv2.waitKey(0)

cv2.imshow("Query", queryImage)
cv2.waitKey(0)

3. Search engine test

Query: fish.jpg

Results, top three images (lower score is better): Score 0, Score 17, Score 21

Query: forest.jpg

Results, top three images (lower score is better): Score 0, Score 33, Score 33

Query: trip.jpg

Results, top three images (lower score is better): Score 0, Score 23, Score 24

Query: zebra.jpg

Results, top three images (lower score is better): Score 0, Score 23, Score 25

Summary: the search always finds the identical image (the original itself, with score 0), and the other returned images are broadly consistent with the query. The test succeeded.

4. Python source code

`color_descriptor.py`

import cv2
import numpy

class ColorDescriptor:
    __slots__ = ["bins"]
    def __init__(self, bins):
        self.bins = bins
    def getHistogram(self, image, mask, isCenter):
        # get histogram
        imageHistogram = cv2.calcHist([image], [0, 1, 2], mask, self.bins, [0, 180, 0, 256, 0, 256])
        # normalize
        imageHistogram = cv2.normalize(imageHistogram, imageHistogram).flatten()
        if isCenter:
            weight = 5.0
            # scale every bin of the center histogram in one vectorized step
            imageHistogram *= weight
        return imageHistogram
    def describe(self, image):
        image = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
        features = []
        # get dimension and center
        height, width = image.shape[0], image.shape[1]
        centerX, centerY = int(width * 0.5), int(height * 0.5)
        # initialize mask dimension
        segments = [(0, centerX, 0, centerY), (0, centerX, centerY, height), (centerX, width, 0, centerY), (centerX, width, centerY, height)]
        # initialize center part
        # integer division: cv2.ellipse expects integer axis lengths
        axesX, axesY = int(width * 0.75) // 2, int(height * 0.75) // 2
        ellipseMask = numpy.zeros([height, width], dtype="uint8")
        cv2.ellipse(ellipseMask, (centerX, centerY), (axesX, axesY), 0, 0, 360, 255, -1)
        # initialize corner part
        for startX, endX, startY, endY in segments:
            cornerMask = numpy.zeros([height, width], dtype="uint8")
            cv2.rectangle(cornerMask, (startX, startY), (endX, endY), 255, -1)
            cornerMask = cv2.subtract(cornerMask, ellipseMask)
            # get histogram of corner part
            imageHistogram = self.getHistogram(image, cornerMask, False)
            features.append(imageHistogram)
        # get histogram of center part
        imageHistogram = self.getHistogram(image, ellipseMask, True)
        features.append(imageHistogram)
        # return
        return features

`structure_descriptor.py`

import cv2

class StructureDescriptor:
    __slots__ = ["dimension"]
    def __init__(self, dimension):
        self.dimension = dimension
    def describe(self, image):
        image = cv2.resize(image, self.dimension, interpolation=cv2.INTER_CUBIC)
        # image = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
        return image

`searcher.py`

import numpy
import csv
import re

class Searcher:
    __slots__ = ["colorIndexPath", "structureIndexPath"]
    def __init__(self, colorIndexPath, structureIndexPath):
        self.colorIndexPath, self.structureIndexPath = colorIndexPath, structureIndexPath
    def solveColorDistance(self, features, queryFeatures, eps = 1e-5):
        distance = 0.5 * numpy.sum([((a - b) ** 2) / (a + b + eps) for a, b in zip(features, queryFeatures)])
        return distance
    def solveStructureDistance(self, structures, queryStructures, eps = 1e-5):
        distance = 0
        normalizeRatio = 5e3
        for index in range(len(queryStructures)):
            for subIndex in range(len(queryStructures[index])):
                a = structures[index][subIndex]
                b = queryStructures[index][subIndex]
                distance += (a - b) ** 2 / (a + b + eps)
        return distance / normalizeRatio
    def searchByColor(self, queryFeatures):
        searchResults = {}
        with open(self.colorIndexPath) as indexFile:
            reader = csv.reader(indexFile)
            for line in reader:
                features = []
                for feature in line[1:]:
                    feature = feature.replace("[", "").replace("]", "")
                    # split on whitespace and drop the empty strings left by padding
                    feature = [float(eachValue) for eachValue in re.split(r"\s+", feature) if eachValue != ""]
                    features.append(feature)
                distance = self.solveColorDistance(features, queryFeatures)
                searchResults[line[0]] = distance
        # print "feature", sorted(searchResults.iteritems(), key = lambda item: item[1], reverse = False)
        return searchResults
    def transformRawQuery(self, rawQueryStructures):
        queryStructures = []
        for substructure in rawQueryStructures:
            structure = []
            for line in substructure:
                for tripleColor in line:
                    structure.append(float(tripleColor))
            queryStructures.append(structure)
        return queryStructures
    def searchByStructure(self, rawQueryStructures):
        searchResults = {}
        queryStructures = self.transformRawQuery(rawQueryStructures)
        with open(self.structureIndexPath) as indexFile:
            reader = csv.reader(indexFile)
            for line in reader:
                structures = []
                for structure in line[1:]:
                    structure = structure.replace("[", "").replace("]", "")
                    # split on whitespace and drop the empty strings left by padding
                    structure = [float(eachValue) for eachValue in re.split(r"\s+", structure) if eachValue != ""]
                    structures.append(structure)
                distance = self.solveStructureDistance(structures, queryStructures)
                searchResults[line[0]] = distance
        # print "structure", sorted(searchResults.iteritems(), key = lambda item: item[1], reverse = False)
        return searchResults
    def search(self, queryFeatures, rawQueryStructures, limit = 3):
        featureResults = self.searchByColor(queryFeatures)
        structureResults = self.searchByStructure(rawQueryStructures)
        results = {}
        for key, value in featureResults.items():
            results[key] = value + structureResults[key]
        results = sorted(results.items(), key = lambda item: item[1], reverse = False)
        return results[ : limit]

`index.py`

import color_descriptor
import structure_descriptor
import glob
import argparse
import cv2

searchArgParser = argparse.ArgumentParser()
searchArgParser.add_argument("-d", "--dataset", required = True, help = "Path to the directory that contains the images to be indexed")
searchArgParser.add_argument("-c", "--colorindex", required = True, help = "Path to where the computed color index will be stored")
searchArgParser.add_argument("-s", "--structureindex", required = True, help = "Path to where the computed structure index will be stored")
arguments = vars(searchArgParser.parse_args())

idealBins = (8, 12, 3)
colorDescriptor = color_descriptor.ColorDescriptor(idealBins)

output = open(arguments["colorindex"], "w")

for imagePath in glob.glob(arguments["dataset"] + "/*.jpg"):
    imageName = imagePath[imagePath.rfind("/") + 1 : ]
    image = cv2.imread(imagePath)
    features = colorDescriptor.describe(image)
    # write features to file
    features = [str(feature).replace("\n", "") for feature in features]
    output.write("%s,%s\n" % (imageName, ",".join(features)))
# close index file
output.close()

idealDimension = (16, 16)
structureDescriptor = structure_descriptor.StructureDescriptor(idealDimension)

output = open(arguments["structureindex"], "w")

# use the --dataset argument here too, so both indexes cover the same images
for imagePath in glob.glob(arguments["dataset"] + "/*.jpg"):
    imageName = imagePath[imagePath.rfind("/") + 1 : ]
    image = cv2.imread(imagePath)
    structures = structureDescriptor.describe(image)
    # write structures to file
    structures = [str(structure).replace("\n", "") for structure in structures]
    output.write("%s,%s\n" % (imageName, ",".join(structures)))
# close index file
output.close()

`searchEngine.py`

import color_descriptor
import structure_descriptor
import searcher
import argparse
import cv2

searchArgParser = argparse.ArgumentParser()
searchArgParser.add_argument("-c", "--colorindex", required = True, help = "Path to where the computed color index will be stored")
searchArgParser.add_argument("-s", "--structureindex", required = True, help = "Path to where the computed structure index will be stored")
searchArgParser.add_argument("-q", "--query", required = True, help = "Path to the query image")
searchArgParser.add_argument("-r", "--resultpath", required = True, help = "Path to the result path")
searchArguments = vars(searchArgParser.parse_args())

idealBins = (8, 12, 3)
idealDimension = (16, 16)

colorDescriptor = color_descriptor.ColorDescriptor(idealBins)
structureDescriptor = structure_descriptor.StructureDescriptor(idealDimension)
queryImage = cv2.imread(searchArguments["query"])
colorIndexPath = searchArguments["colorindex"]
structureIndexPath = searchArguments["structureindex"]
resultPath = searchArguments["resultpath"]

queryFeatures = colorDescriptor.describe(queryImage)
queryStructures = structureDescriptor.describe(queryImage)

imageSearcher = searcher.Searcher(colorIndexPath, structureIndexPath)
searchResults = imageSearcher.search(queryFeatures, queryStructures)

for imageName, score in searchResults:
    queryResult = cv2.imread(resultPath + "/" + imageName)
    cv2.imshow("Result Score: " + str(int(score)) + " (lower is better)", queryResult)
    cv2.waitKey(0)

cv2.imshow("Query", queryImage)
cv2.waitKey(0)

`searchEngineTest.py`

import cv2
import glob
import csv
import re
import numpy
import structure_descriptor

idealDimension = (16, 16)
structureDescriptor = structure_descriptor.StructureDescriptor(idealDimension)

testImage = cv2.imread("query/forest.jpg")
rawQueryStructures = structureDescriptor.describe(testImage)

# index
output = open("structureIndex.csv", "w")

for imagePath in glob.glob("dataset" + "/*.jpg"):
    imageName = imagePath[imagePath.rfind("/") + 1 : ]
    image = cv2.imread(imagePath)
    structures = structureDescriptor.describe(image)
    # write structures to file
    structures = [str(structure).replace("\n", "") for structure in structures]
    output.write("%s,%s\n" % (imageName, ",".join(structures)))
# close index file
output.close()

# searcher

def solveStructureDistance(structures, queryStructures, eps = 1e-5):
    # chi-squared-style distance between two lists of row vectors
    distance = 0
    for index in range(len(queryStructures)):
        for subIndex in range(len(queryStructures[index])):
            a = structures[index][subIndex]
            b = queryStructures[index][subIndex]
            distance += (a - b) ** 2 / (a + b + eps)
    return distance / 5e3

# flatten each row of the query matrix into a list of floats
queryStructures = []
for substructure in rawQueryStructures:
    structure = []
    for line in substructure:
        for tripleColor in line:
            structure.append(float(tripleColor))
    queryStructures.append(structure)
searchResults = {}
with open("structureIndex.csv") as indexFile:
    reader = csv.reader(indexFile)
    for line in reader:
        structures = []
        for structure in line[1:]:
            structure = structure.replace("[", "").replace("]", "")
            # split on whitespace and drop the empty strings left by padding
            structure = [float(eachValue) for eachValue in re.split(r"\s+", structure) if eachValue != ""]
            print(len(structure))
            structures.append(structure)
        distance = solveStructureDistance(structures, queryStructures)
        searchResults[line[0]] = distance
searchResults = sorted(searchResults.items(), key=lambda item: item[1], reverse=False)

print(searchResults)

Last edited: 2017.12.03 05:43:27
