Training a Classifier with OpenCV
I. Introduction
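This object detection method was first proposed by Paul Viola [Viola01] and later improved by Rainer Lienhart [Lienhart02]. The basic procedure is as follows: first, a classifier is trained on the Haar-like features of a set of samples (a few hundred sample images), which yields a cascade of boosted classifiers.
The word "cascade" means that the final classifier consists of several simpler classifiers applied in sequence. During detection, each candidate window passes through the stages one after another, so most candidate regions are rejected in the first few stages; a region that passes every stage is reported as a target region.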
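The early-rejection behaviour of a cascade can be sketched in a few lines of Python. This is only an illustration of the idea, not OpenCV's implementation: the function name cascade_accepts and the two toy stages are hypothetical, and real stages are boosted classifiers over Haar-like features rather than simple thresholds.

```python
from typing import Callable, Sequence

# A stage is modelled as a function that returns True when the window passes.
Stage = Callable[[Sequence[float]], bool]


def cascade_accepts(window: Sequence[float], stages: Sequence[Stage]) -> bool:
    """Return True only if the window passes every stage, in order."""
    for stage in stages:
        if not stage(window):
            return False  # rejected early; later stages are never evaluated
    return True


# Hypothetical toy stages: each thresholds a simple statistic of the window.
toy_stages = [
    lambda w: sum(w) / len(w) > 0.2,  # first stage: mean brightness
    lambda w: max(w) - min(w) > 0.3,  # second stage: contrast
]

print(cascade_accepts([0.1, 0.9, 0.5, 0.4], toy_stages))  # passes both stages
print(cascade_accepts([0.0, 0.1, 0.0, 0.1], toy_stages))  # fails the first stage
```

Because most windows fail an early stage, the average cost per window stays close to the cost of the first one or two stages.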
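Once trained, the classifier can be applied to a region of interest in an input image (of the same size as the training samples). It outputs 1 if the region shows the target object (for example, a car or a face) and 0 otherwise. To search a whole image, the search window is moved across the image and every position is checked. To find objects of different sizes, the classifier is designed so that it can be resized, which is more efficient than resizing the image being searched. Finding an object of unknown size therefore usually requires scanning the image several times with search windows at different scales.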
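The multi-scale scan described above can be sketched as follows. The helper names window_sizes and window_positions are hypothetical, and the growth factor of 1.2 and the step of one pixel are assumptions chosen for illustration; the actual scanning loop inside OpenCV may differ in its details.

```python
def window_sizes(base_w, base_h, img_w, img_h, scale_factor=1.2):
    """List search-window sizes, growing by scale_factor until they no longer fit."""
    sizes = []
    scale = 1.0
    while True:
        w = int(round(base_w * scale))
        h = int(round(base_h * scale))
        if w > img_w or h > img_h:
            break
        sizes.append((w, h))
        scale *= scale_factor
    return sizes


def window_positions(win_w, win_h, img_w, img_h, step=1):
    """Yield the top-left corner of every window position inside the image."""
    for y in range(0, img_h - win_h + 1, step):
        for x in range(0, img_w - win_w + 1, step):
            yield x, y


# Every (size, position) pair would be handed to the classifier in turn.
for w, h in window_sizes(20, 20, 40, 40):
    count = sum(1 for _ in window_positions(w, h, 40, 40))
    print(f"{w}x{h}: {count} positions")
```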
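Four boosting techniques are currently supported for this kind of classifier: Discrete AdaBoost, Real AdaBoost, Gentle AdaBoost, and LogitBoost.
"Boosted" means that each stage of the cascade is itself built from basic classifiers using one of these boosting algorithms (weighted voting).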
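Based on the above, object detection consists of three steps:
1. Creating the samples
2. Training the classifier
3. Detecting objects with the trained classifier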
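II. Sample Creation
Training samples are divided into positive and negative samples. Positive samples show the object to be detected (for example, faces or cars); negative samples are arbitrary other images. All sample images are normalized to the same size (for example, 20x20).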
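Negative samples
Negative samples can be taken from arbitrary images, as long as those images do not contain the target object. They are listed in a background description file: a text file in which each line contains the file name of one negative sample image (relative to the directory of the description file). This file must be created manually.
Example of a negative description file:
Assume the following directory structure: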
/img
img1.jpg
img2.jpg
bg.txt
The background description file bg.txt then contains:
img/img1.jpg
img/img2.jpg
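The utilities do not generate the description file themselves, but for a large image folder it can be written with a short script instead of by hand. The following is a minimal sketch under stated assumptions: the function name write_background_file and the list of accepted extensions are hypothetical, and paths are written relative to the directory of the description file, as required above.

```python
import os


def write_background_file(description_path, image_dir,
                          extensions=(".jpg", ".png", ".bmp")):
    """Write one relative image path per line and return the lines written."""
    base = os.path.dirname(os.path.abspath(description_path))
    lines = []
    for name in sorted(os.listdir(image_dir)):
        if name.lower().endswith(extensions):
            rel = os.path.relpath(os.path.join(image_dir, name), base)
            # Use forward slashes so the file matches the example layout.
            lines.append(rel.replace(os.sep, "/"))
    with open(description_path, "w") as f:
        f.write("\n".join(lines) + ("\n" if lines else ""))
    return lines
```

With the directory structure above, write_background_file("bg.txt", "img") would produce the two lines shown for bg.txt.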
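Positive samples
Positive samples are created by the createsamples utility. Its source code ships with OpenCV, and the executable is included in the bin directory.
Positive samples can be created from a single object image or from a collection of previously marked-up images.
Command line arguments of createsamples: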
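-vec <vec_file_name>
Name of the output file containing the generated positive samples.
-img <image_file_name>
Source object image (for example, a company logo).
-bg <background_file_name>
Background description file.
-num <number_of_samples>
Number of positive samples to generate; in the example below it equals the number of positive images.
-bgcolor <background_color>
Background color (grayscale images are currently assumed). The background color denotes the transparent color. Because of compression artifacts, a color tolerance can be specified with -bgthresh: all pixels between bgcolor-bgthresh and bgcolor+bgthresh are regarded as transparent.
-bgthresh <background_color_threshold>
-inv
If specified, the colors are inverted.
-randinv
If specified, the colors are inverted randomly.
-maxidev <max_intensity_deviation>
Maximum intensity deviation of the foreground sample pixels.
-maxxangle <max_x_rotation_angle>
-maxyangle <max_y_rotation_angle>
-maxzangle <max_z_rotation_angle>
Maximum rotation angles, in radians.
-show
If specified, each sample is displayed. Pressing Esc turns this off: samples are no longer shown while creation continues. A useful debugging option.
-w <sample_width>
Width of the output samples, in pixels.
-h <sample_height>
Height of the output samples, in pixels.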
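Note: positive samples can also be obtained from a collection of previously marked-up images. The collection is described by a text file similar to the background description file. Each line corresponds to one image: the first element is the image file name, the second is the number of object instances, and these are followed by the bounding rectangle of each instance (x, y, width, height).
An example of sample creation:
Suppose we want to detect faces and have five positive sample images, img1.bmp through img5.bmp, and two background images, bg1.bmp and bg2.bmp, in the following directory structure: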
positive
img1.bmp
……
img5.bmp
negative
bg1.bmp
bg2.bmp
info.dat
bg.txt
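The positive sample description file info.dat contains:
positive/img1.bmp 1 0 0 24 28
……
positive/img5.bmp 1 0 0 24 28
The image img1.bmp contains a single object instance with the bounding rectangle (0, 0, 24, 28).
Note: to create positive samples from an image collection, use the -info argument instead of -img.
-info <collection_file_name>
Description file of the marked-up image collection.
The background (negative sample) description file contains:
negative/bg1.bmp
negative/bg2.bmp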
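We create the positive samples with a batch file, run.bat, with the following content:
cd e:\face\bin
CreateSamples -vec e:\face\a.vec ^
-info e:\face\info.dat ^
-bg e:\face\bg.txt ^
-num 5 ^
-show ^
-w 24 ^
-h 28
Here the directory e:\face\bin contains the createsamples executable, and the generated positive sample file a.vec is written to e:\face.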
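III. Training the Classifier
After the samples have been created, the next step is to train the classifier. This is done by the haartraining utility. Its source code ships with OpenCV, and the executable is in the bin directory of the OpenCV installation.
Command line arguments of haartraining: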
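-data <dir_name>
Directory in which the trained classifier is stored.
-vec <vec_file_name>
File name of the positive sample file (created by the createsamples utility or by any other means).
-bg <background_file_name>
Background description file.
-npos <number_of_positive_samples>
-nneg <number_of_negative_samples>
Number of positive/negative samples used to train each classifier stage. Reasonable values are npos = 7000 and nneg = 3000.
-nstages <number_of_stages>
Number of stages to train.
-nsplits <number_of_splits>
Determines the weak classifier used in the stage classifiers. If 1, a simple stump classifier is used; if 2 or more, a CART classifier with number_of_splits internal (split) nodes is used.
-mem <memory_in_MB>
Memory, in MB, available for precalculation. The more memory is available, the faster training runs.
-sym (default)
-nonsym
Specifies whether the object class being trained has vertical symmetry. Vertical symmetry speeds up training; frontal faces, for example, are vertically symmetric.
-minhitrate <min_hit_rate>
Minimum desired hit rate for each stage classifier. The overall hit rate can be estimated as min_hit_rate^number_of_stages.
-maxfalsealarm <max_false_alarm_rate>
Maximum desired false alarm rate for each stage classifier. The overall false alarm rate can be estimated as max_false_alarm_rate^number_of_stages.
-weighttrimming <weight_trimming>
Specifies whether weight trimming is used and how much. A reasonable choice is 0.9.
-eqw
-mode <BASIC (default) | CORE | ALL>
Selects the set of Haar features used in training. BASIC uses only upright features, while ALL uses the full set of upright and 45-degree rotated features.
-w <sample_width>
-h <sample_height>
Size of the training samples, in pixels. Must be exactly the same as the size used when the samples were created.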
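An example of training a classifier:
Continuing the example above, training is run from a batch file, run2.bat:
cd e:\face\bin
haartraining -data e:\face\data ^
-vec e:\face\a.vec ^
-bg e:\face\bg.txt ^
-npos 5 ^
-nneg 2 ^
-w 24 ^
-h 28
When training finishes, several subdirectories are created under the data directory; these hold the trained classifier.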
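Note: some versions of OpenCV can convert the classifiers in these directories directly into an XML file. In practice, however, the haartraining program seemed never to terminate and produced no XML file. I later found a program called haarconv on the OpenCV Yahoo forum and used it to convert the classifier to XML. The cause is still to be investigated.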
The original English text follows:
Rapid Object Detection With A Cascade of Boosted Classifiers Based on Haar-like Features
Introduction
This document describes how to train and use a cascade of boosted classifiers for rapid object detection. A large set of over-complete haar-like features provide the basis for the simple individual classifiers. Examples of object detection tasks are face, eye and nose detection, as well as logo detection.
The sample detection task in this document is logo detection, since logo detection does not require collecting a large set of registered and carefully marked object samples. Instead, we assume that a very large set of object examples can be derived from one prototype image (createsamples utility, see below).
A detailed description of the training/evaluation algorithm can be found in [1] and [2].
Samples Creation
For training, a set of training samples must be collected. There are two sample types: negative samples and positive samples. Negative samples correspond to non-object images. Positive samples correspond to object images.
Negative Samples
Negative samples are taken from arbitrary images. These images must not contain object representations. Negative samples are passed in through a background description file: a text file in which each line contains the filename (relative to the directory of the description file) of a negative sample image. This file must be created manually. Note that negative samples and negative sample images are also called background samples or background sample images; the terms are used interchangeably in this document.
Example of negative description file:
Directory structure:
/img
img1.jpg
img2.jpg
bg.txt
File bg.txt:
img/img1.jpg
img/img2.jpg
Positive Samples
Positive samples are created by the createsamples utility. They may be created from a single object image or from a collection of previously marked-up images.
The single object image may, for instance, contain a company logo. A large set of positive samples is then created from the given object image by randomly rotating it, changing the logo color, and placing the logo on an arbitrary background.
The amount and range of randomness can be controlled by command line arguments.
Command line arguments:
-vec <vec_file_name>
name of the output file containing the positive samples for training
-img <image_file_name>
source object image (e.g., a company logo)
-bg <background_file_name>
background description file; contains a list of images into which randomly distorted versions of the object are pasted for positive sample generation
-num <number_of_samples>
number of positive samples to generate
-bgcolor <background_color>
background color (currently grayscale images are assumed); the background color denotes the transparent color. Since there might be compression artifacts, the amount of color tolerance can be specified by -bgthresh. All pixels between bgcolor-bgthresh and bgcolor+bgthresh are regarded as transparent.
-bgthresh <background_color_threshold>
-inv
if specified, the colors will be inverted
-randinv
if specified, the colors will be inverted randomly
-maxidev <max_intensity_deviation>
maximal intensity deviation of foreground samples pixels
-maxxangle <max_x_rotation_angle>,
-maxyangle <max_y_rotation_angle>,
-maxzangle <max_z_rotation_angle>
maximum rotation angles in radians
-show
if specified, each sample will be shown. Pressing 'Esc' will continue creation process without samples showing. Useful debugging option.
-w <sample_width>
width (in pixels) of the output samples
-h <sample_height>
height (in pixels) of the output samples
The following procedure is used to create a sample object instance:
The source image is rotated randomly around all three axes. The chosen angles are limited by -max?angle. Next, pixels with intensities in the range [bg_color-bg_color_threshold; bg_color+bg_color_threshold] are regarded as transparent. White noise is added to the intensities of the foreground. If the -inv key is specified, the foreground pixel intensities are inverted. If the -randinv key is specified, it is decided at random for each sample whether inversion is applied. Finally, the obtained image is placed onto an arbitrary background from the background description file, resized to the pixel size specified by -w and -h, and stored in the file specified by the -vec command line parameter.
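The transparency, noise, inversion, and pasting steps of this procedure can be illustrated with a small pure-Python sketch on grayscale images stored as nested lists. It is not the createsamples source code: the function composite_sample is hypothetical, rotation and resizing are left out, and uniform noise in the range ±maxidev is an assumption about how the intensity deviation is applied.

```python
import random


def composite_sample(obj, background, bgcolor, bgthresh,
                     maxidev=0, inv=False, rng=None):
    """Paste obj onto background; both are equal-sized 2D lists of 0..255 values."""
    rng = rng or random.Random()
    out = []
    for obj_row, bg_row in zip(obj, background):
        row = []
        for fg, bg in zip(obj_row, bg_row):
            # Pixels close to bgcolor are transparent: the background shows through.
            if bgcolor - bgthresh <= fg <= bgcolor + bgthresh:
                row.append(bg)
                continue
            # Assumed noise model: uniform deviation of at most maxidev.
            fg = fg + rng.randint(-maxidev, maxidev)
            if inv:
                fg = 255 - fg  # invert foreground intensities
            row.append(min(255, max(0, fg)))  # clamp to the valid range
        out.append(row)
    return out


logo = [[0, 200], [100, 3]]        # values 0 and 3 fall inside bgcolor +/- bgthresh
backdrop = [[9, 9], [9, 9]]
print(composite_sample(logo, backdrop, bgcolor=0, bgthresh=5, inv=True))
```

For -randinv, a caller would choose inv at random for each sample, for example with rng.random() < 0.5.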
Positive samples may also be obtained from a collection of previously marked-up images. This collection is described by a text file similar to the background description file. Each line of this file corresponds to one image in the collection. The first element of the line is the image file name. It is followed by the number of object instances. The numbers that follow are the coordinates of the bounding rectangles (x, y, width, height).
Example of description file:
Directory structure:
/img
img1.jpg
img2.jpg
info.dat
File info.dat:
img/img1.jpg 1 140 100 45 45
img/img2.jpg 2 100 200 50 50 50 30 25 25
Image img1.jpg contains single object instance with bounding rectangle (140, 100, 45, 45). Image img2.jpg contains two object instances.
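The line format of the description file can be made concrete with a small parser. This is a sketch for illustration only: parse_info_line is a hypothetical helper, and it assumes file names without spaces, since fields are split on whitespace.

```python
def parse_info_line(line):
    """Split one description line into (filename, [(x, y, width, height), ...])."""
    parts = line.split()
    filename, count = parts[0], int(parts[1])
    numbers = [int(v) for v in parts[2:]]
    if len(numbers) != 4 * count:
        raise ValueError(
            f"expected {4 * count} coordinates for {count} objects, got {len(numbers)}"
        )
    boxes = [tuple(numbers[i:i + 4]) for i in range(0, len(numbers), 4)]
    return filename, boxes


print(parse_info_line("img/img1.jpg 1 140 100 45 45"))
print(parse_info_line("img/img2.jpg 2 100 200 50 50 50 30 25 25"))
```

The count check catches the most common mistake in hand-written files: an instance count that does not match the number of rectangles on the line.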
In order to create positive samples from such a collection, the -info argument should be specified instead of -img:
-info <collection_file_name>
description file of the marked-up image collection
The scheme of sample creation in this case is as follows. The object instances are taken from the images. Then they are resized to the sample size and stored in the output file. No distortion is applied, so the only arguments that have an effect are -w, -h, -show and -num.
The createsamples utility may also be used to examine the samples stored in a positive samples file. To do this, only the -vec, -w and -h parameters should be specified.
Note that for training, it does not matter how positive samples files are generated. So the createsamples utility is only one way to collect/create a vector file of positive samples.
Training
The next step after sample creation is training the classifier. It is performed by the haartraining utility.
Command line arguments:
-data <dir_name>
directory name in which the trained classifier is stored
-vec <vec_file_name>
file name of positive sample file (created by the createsamples utility or by any other means)
-bg <background_file_name>
background description file
-npos <number_of_positive_samples>,
-nneg <number_of_negative_samples>
number of positive/negative samples used in training of each classifier stage. Reasonable values are npos = 7000 and nneg = 3000.
-nstages <number_of_stages>
number of stages to be trained
-nsplits <number_of_splits>
determines the weak classifier used in stage classifiers. If 1, then a simple stump classifier is used; if 2 or more, then a CART classifier with number_of_splits internal (split) nodes is used
-mem <memory_in_MB>
available memory in MB for precalculation. The more memory you have, the faster the training process
-sym (default),
-nonsym
specifies whether the object class under training has vertical symmetry or not. Vertical symmetry speeds up the training process. For instance, frontal faces show vertical symmetry
-minhitrate <min_hit_rate>
minimal desired hit rate for each stage classifier. Overall hit rate may be estimated as (min_hit_rate^number_of_stages)
-maxfalsealarm <max_false_alarm_rate>
maximal desired false alarm rate for each stage classifier. Overall false alarm rate may be estimated as (max_false_alarm_rate^number_of_stages)
-weighttrimming <weight_trimming>
specifies whether and how much weight trimming should be used. A decent choice is 0.90.
-eqw
-mode <BASIC (default) | CORE | ALL>
selects the type of haar features set used in training. BASIC uses only upright features, while ALL uses the full set of upright and 45 degree rotated features. See [1] for more details.
-w <sample_width>,
-h <sample_height>
size of training samples (in pixels). Must have exactly the same values as used during training samples creation (utility createsamples)
Note: to take advantage of multiple processors, a compiler that supports the OpenMP 1.0 standard should be used.
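The estimates given for minhitrate and maxfalsealarm are easy to evaluate before committing to a long training run. The sketch below simply applies the two formulas from the argument list; the per-stage values 0.995 and 0.5 and the 20 stages are illustrative assumptions, not values prescribed by this document.

```python
def overall_rates(min_hit_rate, max_false_alarm_rate, number_of_stages):
    """Estimate the cascade's overall hit rate and false alarm rate."""
    overall_hit = min_hit_rate ** number_of_stages
    overall_false_alarm = max_false_alarm_rate ** number_of_stages
    return overall_hit, overall_false_alarm


hit, false_alarm = overall_rates(0.995, 0.5, 20)
print(f"overall hit rate ~ {hit:.4f}")
print(f"overall false alarm rate ~ {false_alarm:.2e}")
```

Under these assumptions, each stage may let through half of the negatives it sees while the cascade as a whole rejects almost all of them, which is why a demanding per-stage hit rate matters far more than a demanding per-stage false alarm rate.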
Application
The OpenCV cvHaarDetectObjects() function (see in particular the haarFaceDetect demo) is used for detection.
Test Samples
In order to evaluate the performance of a trained classifier, a collection of marked-up images is needed. When such a collection is not available, test samples may be created from a single object image by the createsamples utility. The scheme of test sample creation in this case is similar to training sample creation, since each test sample is a background image into which a randomly distorted and randomly scaled instance of the object picture is pasted at a random position.
If both the -img and -info arguments are specified, test samples are created by the createsamples utility. The sample image is arbitrarily distorted as described above, then placed at a random location on a background image and stored. The corresponding description line is added to the file specified by the -info argument.
The -w and -h keys determine the minimal size of the placed object picture.
The test image file name format is as follows:
imageOrderNumber_x_y_width_height.jpg, where x, y, width and height are the coordinates of placed object bounding rectangle.
Note that you should use a background images set different from the background image set used during training.
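Because the bounding rectangle is encoded in the test image file name, the ground truth can be recovered from the name alone. The helper below is a hypothetical sketch that assumes the five fields are integers separated by underscores, as in the format given above; the example file name is made up for illustration.

```python
import os


def parse_test_image_name(path):
    """Return (order_number, (x, y, width, height)) from a test image file name."""
    stem, _ext = os.path.splitext(os.path.basename(path))
    order, x, y, width, height = (int(v) for v in stem.split("_"))
    return order, (x, y, width, height)


print(parse_test_image_name("0001_0061_0026_0050_0050.jpg"))
```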
Performance Evaluation
In order to evaluate the performance of the classifier, the performance utility may be used. It takes a collection of marked-up images, applies the classifier, and outputs the performance, i.e. the number of found objects, the number of missed objects, the number of false alarms, and other information.
Command line arguments:
-data <dir_name>
directory name in which the trained classifier is stored
-info <collection_file_name>
file with test samples description
-maxSizeDiff <max_size_difference>,
-maxPosDiff <max_position_difference>
determine the criterion of reference and detected rectangles coincidence. Default values are 1.5 and 0.3 respectively.
-sf <scale_factor>
detection parameter. Default value is 1.2.
-w <sample_width>,
-h <sample_height>
Size of training samples (in pixels). Must have exactly the same values as used during training (utility haartraining)
References
[1] Rainer Lienhart and Jochen Maydt. An Extended Set of Haar-like Features for Rapid Object Detection. Submitted to ICIP2002.
[2] Alexander Kuranov, Rainer Lienhart, and Vadim Pisarevsky. An Empirical Analysis of Boosting Algorithms for Rapid Objects With an Extended Set of Haar-like Features. Intel Technical Report MRL-TR-July02-01, 2002.
Last updated: 2017-04-03 05:39:58