Face Recognition

Content

          1. Abstract
          2. Database Creation
          3. Recognition
          4. Reduced Database Creation

Abstract

          This article covers basic recognition of 2D patterns in a 2D spatial-domain array, with face patterns as the special case. We consider detection and recognition of patterns in an image containing an upright, frontal face. Detection and recognition should be the same process. I have a four-thread Intel i5 processor and use three threads to run the program. It could be much more powerful and faster with more threads, but this is what I have.

          What is a pattern?

          As you can see in the picture above, we have the face as an object, and on it the patterns needed for recognition. There are 7 patterns, each with its number placed in the middle of the pattern. The object (face) is 52x52 RGB pix. Each pattern is 26x26 RGB pix and consists of a 26x26 pix R, G and B subpattern. 26x26 = 676 pix, or 338 frequencies, for each subpattern, so we get 3x338 = 1014 frequencies for each pattern, which is enough for the statistics.

Left eye pattern

Spanned R, G or B subpatterns of the left eye

          Here is an example of face pattern no. 1 and how its subpatterns are spanned in the buffer to scan. The middle of the buffer to scan holds the reversed subpattern, where reversed means flipped horizontally (mirrored). The buffer to scan is 26x78 pix. The scan window is 26x26 pix; it starts at the left edge and shifts 52 steps to the right end. At the end of each subpattern scan we get the sum of 338 frequencies in the frequency buffer. This buffer is normalized for the logarithm, the logarithm is taken, and the result is normalized again in preparation for use in PNN(). The frequency buffer holds the frequencies of the 3 (RGB) subpatterns, so it has 3x338 = 1014 frequencies. During database creation it is stored in the database.
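          The layout of the buffer to scan can be sketched as follows. This is only an illustration, not the code from the appendix: the names Subpattern, SpanBuffer, spanSubpattern and windowAt are hypothetical, a single 8-bit channel is assumed, and it is assumed that the left and right thirds of the buffer hold the unmodified subpattern. The frequency calculation itself is not shown.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t kSide = 26;             // a subpattern is 26x26 pix
constexpr std::size_t kSpanWidth = 3 * kSide; // the buffer to scan is 26x78 pix

using Subpattern = std::array<std::array<unsigned char, kSide>, kSide>;
using SpanBuffer = std::array<std::array<unsigned char, kSpanWidth>, kSide>;

// Build [original | mirrored | original] side by side (outer thirds assumed).
SpanBuffer spanSubpattern(const Subpattern& p) {
    SpanBuffer s{};
    for (std::size_t r = 0; r < kSide; ++r) {
        for (std::size_t c = 0; c < kSide; ++c) {
            s[r][c] = p[r][c];                     // left third: original
            s[r][kSide + c] = p[r][kSide - 1 - c]; // middle third: mirrored
            s[r][2 * kSide + c] = p[r][c];         // right third: original
        }
    }
    return s;
}

// Copy the 26x26 scan window that starts at column `shift` (valid: 0..52).
Subpattern windowAt(const SpanBuffer& s, std::size_t shift) {
    Subpattern w{};
    for (std::size_t r = 0; r < kSide; ++r)
        for (std::size_t c = 0; c < kSide; ++c)
            w[r][c] = s[r][shift + c];
    return w;
}
```

          With this layout the window at shift 0 and the window at shift 52 both equal the original subpattern, and the window at shift 26 is its mirror image.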

Database Creation

          DB creation starts with collecting face images grabbed from the camera. 300 images of the same object (face), 768x768 pix each, are captured from the camera and saved to disk (in 96x96 pix format). The images are then loaded one by one and reduced to 52x52 pix. Spatial-domain eye patterns are taken from the reduced image and loaded into the Class Objects (7) to get the frequency patterns for the recognition DBs.

What are Class Objects, and how do they work?

                Class Objects are program code objects that process patterns (in parallel). Each Class Object processes one recognition pattern. The main function is GetFreq(), which is called from the main program. A Class Object returns the X buffer with the normalized frequencies of the three subpatterns (RGB) (3x338 = 1014 frequencies) for the databases, or returns a probability in the recognition process.

          The first and second normalization are performed on the X buffer (frequency buffer) as in 1D Object Recognition (N = 1014), and the logarithm is
               X[i] = log(X[i] + 1)

          After the second normalization the sum of all 1014 elements of the frequency buffer is 1. This should be useful for the probabilistic function in the recognition process.
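          The three steps can be sketched as below. The first normalization is defined in the 1D Object Recognition article and not here, so dividing by the largest element is only an assumption made for this sketch. The function name normalizeForPnn is hypothetical, and the input is assumed to be non-negative.

```cpp
#include <algorithm>
#include <cmath>
#include <numeric>
#include <vector>

// Normalize, take log(x + 1), then normalize again so the elements sum to 1.
void normalizeForPnn(std::vector<double>& x) {
    if (x.empty()) return;
    const double maxVal = *std::max_element(x.begin(), x.end());
    if (maxVal <= 0.0) return;                 // nothing to scale

    for (double& v : x) v /= maxVal;           // first normalization (assumed form)
    for (double& v : x) v = std::log(v + 1.0); // X[i] = log(X[i] + 1)

    const double sum = std::accumulate(x.begin(), x.end(), 0.0);
    for (double& v : x) v /= sum;              // second normalization: sum becomes 1
}
```

          Because every buffer then sums to 1, the absolute difference between two buffers is bounded, which keeps the input of PNN() in a known range.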

Recognition

          Detection and recognition are in fact the same process. This subroutine initialises the data and creates two threads: one for image grabbing and data display, and the other for the loops. Detection and recognition happen in the GetFaceRegion() function.

          GetFaceRegion() function

                    The ideal solution would be to scan the whole 768x768 pix image for a face, which can appear anywhere on the screen at any size, from 768x768 pix (full screen) down to 52x52 pix. That would take a long time on an i5 processor. So I resize the 768x768 pix image to 80x80 pix and scan it at all sizes and positions, from 80x80 pix down to 52x52 pix (the minimum size), decreasing the size by 2 pix per step (80, 78, 76, ..., 52). This gives a tolerance of about half a meter forward and back from the camera (relative to the position where the zoom is fixed) for positioning the head. The process can be sped up significantly by starting with S = 76 and using S -= 4 in the loop. The whole recognition process takes 20 seconds. With one processor thread for each Class Object (7), the whole process should take less than five seconds.
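                    The scan over sizes and positions can be sketched as a simple enumeration of square windows. This is an illustration under assumptions, not the GetFaceRegion() code: the Window struct and enumerateWindows are hypothetical names, and a position step of one pixel is assumed because the text does not state it.

```cpp
#include <vector>

struct Window {
    int x;    // left column of the window in the 80x80 image
    int y;    // top row of the window
    int size; // side length S of the square window
};

// List every SxS window for S = startSize, startSize - sizeStep, ..., >= minSize.
std::vector<Window> enumerateWindows(int imageSize, int startSize,
                                     int minSize, int sizeStep) {
    std::vector<Window> windows;
    if (sizeStep <= 0) return windows;           // avoid an endless loop
    for (int s = startSize; s >= minSize; s -= sizeStep)
        for (int y = 0; y + s <= imageSize; ++y)     // one-pixel step (assumed)
            for (int x = 0; x + s <= imageSize; ++x)
                windows.push_back({x, y, s});
    return windows;
}
```

                    Under the one-pixel assumption, enumerateWindows(80, 80, 52, 2) yields 4495 windows, while enumerateWindows(80, 76, 52, 4) yields 2471, which illustrates why the coarser loop is faster.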

          PNN() - probabilistic discriminant function

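          void PNN() {

              int i, j;
              double sum;

              // compare the unknown pattern X with every database pattern R[i]
              for (i = 0; i < brUz; i++) {
                  sum = 0;
                  // sum of absolute differences over the 1014 frequencies
                  for (j = 0; j < 1014; j++)
                      sum += fabs(X[j] - R[i][j]);
                  // map the difference to a similarity between 0 and 1
                  largest[i] = exp(-pow(sum, 10) / 2.0E-7);
              }
          }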

          This function is part of every Class Object and calculates the similarity of a pattern to each database pattern. X is the frequency buffer of the unknown pattern, and R[i] is a database frequency pattern of the same type. brUz is the total number of objects (faces) in the database. largest is the buffer of probabilities calculated across the whole database for a specific pattern. It is used later to calculate the probability for an object (face): the sum of the pattern probabilities divided by the number of patterns (7). The function has the form y = exp(-pow(x, m) / n), where m controls the slope of the function and n squeezes the function toward zero.
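          To get a feeling for the shape of this function, it can be pulled out into a small stand-alone sketch. The names l1Distance and similarity are hypothetical; the defaults m = 10 and n = 2.0E-7 are the values used in PNN() above.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Sum of absolute differences between two frequency buffers.
double l1Distance(const std::vector<double>& x, const std::vector<double>& r) {
    double sum = 0.0;
    for (std::size_t j = 0; j < x.size() && j < r.size(); ++j)
        sum += std::fabs(x[j] - r[j]);
    return sum;
}

// y = exp(-pow(x, m) / n): 1 for identical patterns, falling toward 0.
double similarity(double distance, double m = 10.0, double n = 2.0e-7) {
    return std::exp(-std::pow(distance, m) / n);
}
```

          With these constants the result stays close to 1 for small differences (about 0.97 at a difference of 0.15), drops through 0.5 at a difference of roughly 0.2, and is practically 0 at 0.3. The high exponent makes the function behave almost like a threshold.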

          object probability calculation

          // *** recognition objects creation and initialization

for (i = 0; i < brObjects; i++) {
    REC[i] = new GetFreq_REC();      // REC[i] = pointer to recognition object program structure
    REC[i]->brUz = brUz;
    REC[i]->sWitch = 1;
    REC[i]->largest = largX[i];
    REC[i]->R = R_REC[i];
}

          On initialization, the Class Object pointer largest points to the largX[i] buffer. So the largX[i] buffer collects the probabilities of one specific pattern across all objects (faces). To calculate an object's probability, the probabilities of all its patterns are summed and divided by the number of patterns (7). To find the maximum probability, this is done for all objects (faces).

// *** calculate probabilities

for (i = 0; i < brUz; i++) {
    pom = 0;
    for (j = 0; j < brObjects; j++)
        pom += largX[j][i];
    pom /= brObjects;

    if (pom > largest) {
        largest = pom;
        poz = i;
    }
}

          brUz is the total number of faces in the database, and brObjects is the number of patterns per face. Here largest is the probability of the object (face), which holds the maximum probability when the loop finishes, and poz is the DB number or position of the most likely image in the DB.

Reduced Database Creation

          Face images grabbed from the camera in quick succession for the database (300) can be very similar, and some of them may contain very similar frequency patterns. To avoid such cases, this subroutine orders the patterns by their mutual probability differences and then selects 30 frequency patterns for the reduced database. This DB is used for pattern recognition instead of the 300-pattern database.

          The G array (300x300) holds the mutual probability differences between DB elements. The Distances() function orders the elements according to their mutual differences, starting from the first element. The next selected element is the one with the smallest difference from the first one, the one after that is the one with the smallest difference from all elements already selected, and so on. This order is stored in the variable put[]. After this process, 30 elements for the DB are selected from put[], starting at position 5 in jumps of 10.
          The 30 selected images of the selected member are then renumbered and copied to the MEMBERS directory. The recompile recognition database subroutine is then called to create 30 recognition frequency patterns for the database.

          The recognition frequency patterns are saved to the DAT_DB_REC directory. These frequency patterns are loaded in the recognition process.
          Add New Member procedure:

                1. click RecognitionDB --> AddNewRecognitionDBMember;
                2. click Tools --> DBMembersSelection;
                3. click RecognitionDB --> RecompileRecognitionDatabase;

Appendix

Program Codes:

FaceRec9_pdf

FaceRec9_cpp

GetFreq_REC_h_pdf

GetFreq_REC_h_cpp

GetFreq_REC_cpp_pdf

GetFreq_REC_cpp_cpp

Sources:

FFTW

OpenCV

Visual Studio

Dialogs: