Tools to help you dive into Computer Vision
Computer Vision is one of the fastest-growing fields in the industry and is gaining a lot of attention as it is gradually integrated into real-life applications, from social networks and mobile apps to self-driving cars.
While there are still open research problems in the field, many open source tools are available for developing research or industrial applications. Some areas, such as image processing, have very mature and stable libraries such as OpenCV and BoofCV. Other areas, such as tracking and video stabilization, still leave room for progress.
Here are the most popular libraries and tools used in the Computer Vision community nowadays:
- Image Processing
Image processing is about applying mathematical operations to images or videos.
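To make "mathematical operations" concrete, here is a minimal sketch in plain Python that treats a grayscale image as a list of rows and applies two classic operations: a 3x3 mean (box) blur and a binary threshold. The function names `box_blur` and `threshold` and the tiny test image are made up for illustration; the libraries below offer far faster and more complete versions of the same ideas.

```python
def box_blur(image):
    """Return a 3x3 mean-filtered copy of a grayscale image (list of rows)."""
    height, width = len(image), len(image[0])
    blurred = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total, count = 0, 0
            # Visit the 3x3 neighbourhood, skipping positions outside the image.
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < height and 0 <= nx < width:
                        total += image[ny][nx]
                        count += 1
            blurred[y][x] = total // count  # integer mean of the neighbours
    return blurred


def threshold(image, cutoff):
    """Map every pixel to 255 if it is >= cutoff, otherwise to 0."""
    return [[255 if pixel >= cutoff else 0 for pixel in row] for row in image]


# Hypothetical 4x4 image: a bright square on a dark background.
image = [
    [0, 0, 0, 0],
    [0, 255, 255, 0],
    [0, 255, 255, 0],
    [0, 0, 0, 0],
]

blurred = box_blur(image)
binary = threshold(blurred, 100)
```

Blurring first and thresholding afterwards is a common pattern: the blur suppresses isolated noisy pixels so that the threshold produces cleaner regions.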
OpenCV: The most popular and best-documented library for general-purpose image processing. It is released under a BSD license, so it is free for both academic and commercial use. It has C++, C, Python and Java interfaces and supports Windows, Linux, Mac OS, iOS and Android.
BoofCV: An open source Java library for real-time computer vision and robotics applications, written from scratch for ease of use and high performance. Its functionality covers a wide range of subjects, including optimized low-level image processing routines, camera calibration, feature detection/tracking, structure-from-motion, and recognition. It is released under an Apache 2.0 license for both academic and commercial use.
NASA Vision Workbench: A general-purpose image processing and computer vision library developed by the Autonomous Systems and Robotics (ASR) Area in the Intelligent Systems Division at the NASA Ames Research Center. The Vision Workbench (VW) has been publicly released under the terms of the NASA Open Source Software Agreement and is implemented in C++.
SimpleCV: An open source framework for building computer vision applications. It gives you access to several high-powered computer vision libraries such as OpenCV, without having to first learn about bit depths, file formats, color spaces, buffer management, eigenvalues, or matrix versus bitmap storage.
- OCR
Optical Character Recognition (OCR) is about converting images of text into machine-readable text.
Tesseract: Released under the Apache 2.0 license, Tesseract has Unicode (UTF-8) support and can recognize more than 100 languages out of the box. It can also be trained to recognize other languages.
- Machine Learning Tools
Machine learning is about analyzing data and deriving insights from it based on applied algorithms.
DLib: A modern C++ toolkit containing machine learning algorithms and tools for creating complex software in C++ to solve real-world problems.
SciPy: An open source Python library for scientific and technical computing. SciPy contains modules for optimization, linear algebra, integration, interpolation, special functions, FFT, signal and image processing, ODE solvers and other tasks common in science and engineering.
- Deep Learning Tools
Deep learning is a branch of machine learning whose algorithms are applied in successive layers, where each layer uses the output of the previous one as its input.
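The "successive layers" idea can be sketched in a few lines of plain Python. This is only an illustration of the data flow, not how any of the frameworks below are implemented: the helper names (`dense`, `relu`, `forward`) and the weights are invented for the example, and nothing here is trained.

```python
def dense(inputs, weights, biases):
    """One fully connected layer; `weights` holds one row per output unit."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def relu(values):
    """Element-wise rectified linear activation."""
    return [max(0.0, v) for v in values]


def forward(inputs, layers):
    """Feed the input through every layer; each layer consumes the previous output."""
    activation = inputs
    for weights, biases in layers:
        activation = relu(dense(activation, weights, biases))
    return activation


# Hypothetical hand-picked weights: 2 inputs -> 2 hidden units -> 1 output.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    ([[1.0, 1.0]], [-1.0]),
]

output = forward([2.0, 1.0], layers)
```

In a real framework the weights would be learned from data and the arithmetic would run on optimized array kernels, but the layer-after-layer structure is the same.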
TensorFlow: A library for numerical computation using data flow graphs, originally developed by researchers and engineers on the Google Brain team within Google’s Machine Intelligence research organization for conducting machine learning and deep neural network research.
Theano: A Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently.
Caffe: A deep learning framework made with expression, speed, and modularity in mind. It is developed by the Berkeley Vision and Learning Center (BVLC) and by community contributors, and is released under the BSD 2-Clause license.
Torch: A scientific computing framework with wide support for machine learning algorithms that puts GPUs first. It is easy to use and efficient, thanks to an easy and fast scripting language, LuaJIT, and an underlying C/CUDA implementation.
Keras: A high-level neural networks library, written in Python and capable of running on top of either TensorFlow or Theano. It was developed with a focus on enabling fast experimentation.
- Segmentation
Segmentation is about partitioning a digital image into multiple segments to help analyze and identify the objects in the image.
SLIC Superpixels: An algorithm that clusters pixels in the combined five-dimensional color and image plane space to efficiently generate compact, nearly uniform superpixels.
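The "five-dimensional" space combines three color values with the two pixel coordinates. The sketch below shows the kind of distance SLIC uses to compare a pixel with a cluster center, assuming CIELAB color values and the usual weighting by a grid interval and a compactness factor. The function and parameter names are made up for illustration, and the sketch leaves out the rest of the algorithm (center initialization, the limited search window, and the iterative updates).

```python
import math


def slic_distance(pixel, center, grid_interval, compactness):
    """Distance between two (l, a, b, x, y) points in the combined space.

    grid_interval: approximate spacing between cluster centers, in pixels.
    compactness:   larger values favour spatial closeness over color similarity.
    """
    l1, a1, b1, x1, y1 = pixel
    l2, a2, b2, x2, y2 = center
    color_dist = math.sqrt((l1 - l2) ** 2 + (a1 - a2) ** 2 + (b1 - b2) ** 2)
    spatial_dist = math.sqrt((x1 - x2) ** 2 + (y1 - y2) ** 2)
    # Normalize the spatial term by the grid interval, then weight it.
    return math.sqrt(
        color_dist ** 2 + (spatial_dist / grid_interval) ** 2 * compactness ** 2
    )


def nearest_center(pixel, centers, grid_interval, compactness):
    """Index of the cluster center closest to the pixel in the 5-D space."""
    return min(
        range(len(centers)),
        key=lambda i: slic_distance(pixel, centers[i], grid_interval, compactness),
    )


# Hypothetical data: one pixel and two candidate centers.
pixel = (50.0, 0.0, 0.0, 3.0, 4.0)
centers = [(50.0, 0.0, 0.0, 0.0, 0.0), (55.0, 0.0, 0.0, 3.0, 4.0)]

compact_choice = nearest_center(pixel, centers, grid_interval=5, compactness=10)
color_choice = nearest_center(pixel, centers, grid_interval=5, compactness=1)
```

With a high compactness the pixel joins the center at the same location despite the color difference; with a low compactness it joins the center with the identical color even though it is farther away.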
- Multi-View Geometry
MVG is about understanding the real world given several images of the same scene.
OpenMVG: A library for computer-vision scientists, especially targeted at the Multiple View Geometry community, released under the Mozilla Public License Version 2.0.
- Visual Odometry
Visual Odometry is about determining the position and orientation of a camera (or the vehicle carrying it) by analyzing the images it captures.
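The core bookkeeping behind odometry is chaining frame-to-frame motion estimates into a running pose. The sketch below shows that step only, in a simplified planar form (x, y, heading) rather than the full 6 DOF case; the motion values are hypothetical numbers standing in for what a real system would estimate from consecutive images, and the function names are invented for the example.

```python
import math


def compose(pose, motion):
    """Apply a motion expressed in the camera's own frame to a world pose.

    pose:   (x, y, theta) in world coordinates, theta in radians.
    motion: (dx, dy, dtheta) measured relative to the current pose.
    """
    x, y, theta = pose
    dx, dy, dtheta = motion
    # Rotate the local displacement into the world frame, then add it.
    new_x = x + dx * math.cos(theta) - dy * math.sin(theta)
    new_y = y + dx * math.sin(theta) + dy * math.cos(theta)
    return (new_x, new_y, theta + dtheta)


def integrate(motions, start=(0.0, 0.0, 0.0)):
    """Chain per-frame motions into the list of poses visited."""
    pose = start
    trajectory = [pose]
    for motion in motions:
        pose = compose(pose, motion)
        trajectory.append(pose)
    return trajectory


# Hypothetical estimates: move 1 unit forward while turning left 90 degrees,
# then move 1 unit forward along the new heading.
motions = [(1.0, 0.0, math.pi / 2), (1.0, 0.0, 0.0)]
trajectory = integrate(motions)
final_x, final_y, final_theta = trajectory[-1]
```

Because every pose is built on the previous one, small errors in each motion estimate accumulate over time, which is why drift is a central concern in odometry.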
LIBVISO: A very fast cross-platform (Linux, Windows) C++ library with MATLAB wrappers for computing the 6 DOF motion of a moving mono/stereo camera.
- Scene Reconstruction
Scene Reconstruction is about computing a 3D model of a scene.
VisualSFM: A GUI application for 3D reconstruction using structure from motion (SFM), written in C++.
MeshLab: An open source system for processing and editing 3D triangular meshes. It provides a set of tools for editing, cleaning, healing, inspecting, rendering, texturing and converting meshes. It offers features for processing raw data produced by 3D digitization tools/devices and for preparing models for 3D printing.
Bundler: A structure-from-motion (SfM) system for unordered image collections (for instance, images from the Internet), written in C and C++ and distributed under the GNU General Public License.
- Video Tracking
Video Tracking is the process of locating one or more moving objects in a camera video stream.
OpenTL: A general-purpose library for markerless tracking that provides a user-friendly high-level application programming interface (API) for the widest variety of methods and applications. It is implemented in C++ and provides multi-threading and GPU-accelerated capabilities for real-time efficiency.
- Video Stabilization
Video stabilization is the process of removing undesirable shakes and jitters from a video stream.
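One common way to think about this is to smooth the camera's path and then shift each frame by the difference between the smoothed path and the shaky one. The sketch below shows that idea for a single axis using a moving average; it is a simplified illustration with invented names and data, not a description of how any particular library works.

```python
def moving_average(values, radius):
    """Smooth a sequence by averaging each value with its neighbours.

    The window shrinks at the ends of the sequence instead of padding.
    """
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - radius)
        hi = min(len(values), i + radius + 1)
        window = values[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


def stabilizing_offsets(camera_path, radius):
    """Per-frame shift that moves the shaky path onto the smoothed path."""
    smooth = moving_average(camera_path, radius)
    return [s - p for s, p in zip(smooth, camera_path)]


# Hypothetical horizontal camera position per frame, in pixels.
camera_path = [0.0, 4.0, 2.0, 6.0, 4.0]

smooth_path = moving_average(camera_path, radius=1)
offsets = stabilizing_offsets(camera_path, radius=1)
```

A larger radius removes more jitter but also flattens intentional camera moves, so the window size is a trade-off between smoothness and responsiveness.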
Vid.stab: A video stabilization library that can be plugged into FFmpeg and Transcode. It is developed under the GPL.