Facial Recognition using AWS Rekognition

Rekognition Overview

This overview section was copied from the AWS Rekognition site.

The Rekognition API service provides identification of objects, people, text, scenes, activities, and inappropriate content. Developers can quickly build a searchable content library to optimize media workflows, enrich recommendation engines by extracting text in images, or integrate secondary authentication into existing applications to enhance end-user security.

Below are some example use cases of Rekognition.

  • Analyze facial attributes
  • Easily identify objects, scenes & activities
  • Detect & extract text in images
  • Moderate content at scale
  • Automatically recognize celebrities
  • Compare & match faces
  • Capture the path of people in a scene (person tracking)

 

Common use cases for using Amazon Rekognition include the following:

  • Searchable image and video libraries – Amazon Rekognition makes images and stored videos searchable so you can discover objects and scenes that appear within them.
  • Face-based user verification – Amazon Rekognition enables your applications to confirm user identities by comparing their live image with a reference image.
  • Sentiment and demographic analysis – Amazon Rekognition detects emotions such as happy, sad, or surprised, and demographic information such as gender from facial images. Rekognition can analyze images and send the emotion and demographic attributes to Amazon Redshift for periodic reporting on trends, such as across store locations and similar scenarios.
  • Facial recognition – With Amazon Rekognition, you can search images, stored videos, and streaming videos for faces that match those stored in a container known as a face collection. A face collection is an index of faces that you own and manage. Identifying people based on their faces requires two major steps in Amazon Rekognition:
    • Index the faces.
    • Search the faces.
  • Unsafe Content Detection – Amazon Rekognition can detect explicit and suggestive adult content in images and in videos. Developers can use the returned metadata to filter inappropriate content based on their business needs. Beyond flagging an image based on the presence of adult content, the API also returns a hierarchical list of labels with confidence scores. These labels indicate specific categories of adult content, thus allowing granular filtering and management of large volumes of user-generated content (UGC). Examples include social and dating sites, photo-sharing platforms, blogs and forums, apps for children, e-commerce sites, and entertainment and online advertising services.
  • Celebrity recognition – Amazon Rekognition can recognize celebrities within supplied images and in videos. Rekognition can recognize thousands of celebrities across a number of categories, such as politics, sports, business, entertainment, and media.
  • Text detection – Amazon Rekognition Text in Image allows you to recognize and extract textual content from images. Text in Image supports most fonts including highly stylized ones. It detects text and numbers in different orientations such as those commonly found in banners and posters. In image sharing and social media applications, you can use it to enable visual search based on an index of images that contain the same keywords. In media and entertainment applications, you can catalogue videos based on relevant text on screen, such as ads, news, sport scores, and captions. Finally, in public safety applications, you can identify vehicles based on license plate numbers from images taken by street cameras.

 

Image Analysis

When working specifically with images, Rekognition can be used for:

  • Label detection
  • Face detection and comparison
  • Celebrity recognition
  • Image moderation
  • Text in image detection

Image analysis requires the file to be in JPG or PNG format. Images can be passed directly to Rekognition (via the API) or referenced from S3. When passing images directly through the API, the image bytes must be base64-encoded; the SDKs handle this encoding for us. When referencing images from S3 no encoding is needed, since Rekognition reads the object directly. Note that the S3 bucket must be in the same region as the Rekognition collection. There are SDKs to help with interacting with the API for Java, JavaScript, Python, PHP, and .NET. It is faster to process images directly through the SDK/API rather than uploading them to S3 and referencing them from there.
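
For S3-hosted images, the SDK's Image object can reference the S3 object directly instead of carrying bytes. Below is a minimal sketch using the .NET SDK; the bucket and key names are illustrative.

using Amazon.Rekognition.Model;

// Reference an image already stored in S3; the bucket must be in the
// same region as the Rekognition collection. Names are illustrative.
Image s3Image = new Image()
{
    S3Object = new S3Object()
    {
        Bucket = "my-images-bucket",
        Name = "photos/face1.jpg"
    }
};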

 

Rekognition for Facial Analysis

Amazon Rekognition can detect faces in images and videos. Faces are added to a collection by indexing them (using IndexFaces). Once indexed, the collection can be searched for matching faces (SearchFaces); two images can also be compared directly without a collection (CompareFaces).

There are two primary applications of machine learning that analyze images containing faces: face detection and face recognition.

A face detection system is designed to answer the question: is there a face in this picture? A face detection system determines the presence, location, scale, and (possibly) orientation of any face present in a still image or video frame. This system is designed to detect the presence of faces regardless of attributes such as gender, age, and facial hair.

A face recognition system is designed to answer the question: does the face in an image match the face in another image? A face recognition system takes an image of a face and makes a prediction about whether the face matches other faces in a provided database. Face recognition systems are designed to compare and predict potential matches of faces regardless of their expression, facial hair, and age.

Confidence scores are a critical component of face detection and recognition systems. These systems make predictions of whether a face exists in an image or matches a face in another image, with a corresponding level of confidence in the prediction. Users of these systems should consider the confidence score/similarity threshold provided by the system when designing their application and making decisions based on the output of the system. For example, in a photo application used to identify similar looking family members, if the confidence threshold is set at 80%, then the application will return matches when predictions reach an 80% confidence level, but will not return matches below that level.

Rekognition can also return bounding boxes to show where in the image each face was detected (DetectFaces). These are returned as coordinate values expressed as ratios of the overall image width and height.

Image orientation is noted in the image’s Exchangeable image file (Exif) metadata. To use bounding boxes correctly we need to know the orientation. Rekognition will estimate this, but the image orientation should be checked before processing for best results.
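
Since the coordinates are ratios, converting them to pixels is a simple multiplication by the image dimensions. A minimal sketch (the method name is mine):

// A face's BoundingBox values are ratios of the overall image size,
// so multiply by the actual dimensions to get pixel coordinates.
public static void PrintPixelBox(BoundingBox box, int imageWidth, int imageHeight)
{
    int left = (int)(box.Left * imageWidth);
    int top = (int)(box.Top * imageHeight);
    int width = (int)(box.Width * imageWidth);
    int height = (int)(box.Height * imageHeight);
    Console.WriteLine("Face at (" + left + "," + top + ") size " + width + "x" + height);
}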

Some recommended tips when working with facial images:

  • Pitch (face up or down) should be less than 30 degrees; yaw should be less than 45 degrees; roll can be any angle
  • Both eyes open
  • Full face in view, in image with no obstructions
  • Resolution should be greater than 50×50 pixels and up to 1920×1080
  • Color images
  • Neutral facial expression, with mouth closed
  • Good composition, lighting, contrast, etc.
  • When indexing, use images with different pitches and yaws (within range stated above)

 

Collections

We can create collections that store images of different faces, or multiple images of the same face. This allows us to look up faces in the collection or to confirm a given face is the same person based on the images in the collection.

Collections can be created using the AWS CLI. Below is a list of the Rekognition CLI commands.

compare-faces                            | create-collection
create-stream-processor                  | delete-collection
delete-faces                             | delete-stream-processor
describe-collection                      | describe-stream-processor
detect-faces                             | detect-labels
detect-moderation-labels                 | detect-text
get-celebrity-info                       | get-celebrity-recognition
get-content-moderation                   | get-face-detection
get-face-search                          | get-label-detection
get-person-tracking                      | index-faces
list-collections                         | list-faces
list-stream-processors                   | recognize-celebrities
search-faces                             | search-faces-by-image
start-celebrity-recognition              | start-content-moderation
start-face-detection                     | start-face-search
start-label-detection                    | start-person-tracking
start-stream-processor                   | stop-stream-processor
help
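
For example, creating a new collection looks like this (the collection ID is illustrative):

[solidfish@macbook]$ aws rekognition create-collection --collection-id family_collection --region=us-west-2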

Note that when a new collection is created it is given a Model Version number. This number indicates the version of the deep learning models AWS uses to run the Rekognition service. These models are continuously improved by AWS, so the version numbers keep incrementing. You can see the versions when listing the collections.

[solidfish@macbook]$ aws rekognition list-collections --region=us-west-2
COLLECTIONIDS family_collection FACEMODELVERSIONS 4.0
COLLECTIONIDS facerekogtest1collection FACEMODELVERSIONS 4.0
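
To check a specific collection's model version directly, the describe-collection command can be used as well:

[solidfish@macbook]$ aws rekognition describe-collection --collection-id family_collection --region=us-west-2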

Note that faces indexed under different model versions are not compatible with each other. Therefore it may be necessary to re-index older collections into newer ones when new model versions are released. More information about this can be found here:

https://docs.aws.amazon.com/rekognition/latest/dg/face-detection-model.html

 

Demo – Facial Recognition

AWS provides a demo (linked below under References) using a serverless architecture to process images through Rekognition. It leverages S3, Lambda, and DynamoDB for metadata storage.

In this demo I create a console app that directly connects to Rekognition (SDK + API). I use Rekognition to demo the following use cases:

  • Detect faces in an image
  • Compare two images for similar faces
  • Look up an identity based on a photo
  • Look up similar faces in a collection

The demo is written in .NET Core 2.1 and the full source is available at the following GitHub repository:

https://github.com/johnlee/facialrekognition

 

Each use case uses a different AWS API but the same image type, which is the Amazon.Rekognition.Model.Image object. This object can be created from a local JPG/PNG file or can be a reference to an image file in S3. Note that when going through S3 there may be added latency, since Rekognition has to fetch the image over the network. In the main Program.cs we can see this object generated in the GetRekogImage function.

public static Image GetRekogImage(string filename)
{
    Image rekogImage = new Image();
    try
    {
        // Read the image file into memory; the SDK takes the raw bytes
        // and handles base64 encoding for the API call.
        byte[] data = File.ReadAllBytes(Path.Combine(Environment.CurrentDirectory, filename));
        rekogImage.Bytes = new MemoryStream(data);
    }
    catch (Exception)
    {
        Console.WriteLine("Failed to load source image: " + filename);
        Environment.Exit(-1);
    }
    return rekogImage;
}

 

Each use case is written in its own file. For detecting faces, we use DetectFacesRequest. Pretty straightforward. In this demo I only count the number of faces found, but the response also contains coordinates for where each face occurs in the image. We could use these to draw a new image with the bounding boxes discussed above. An example of that implementation can be found at the link below under References.

private static async Task<DetectFacesResponse> IdentifyFaces(Image image)
{
    AmazonRekognitionClient rekognitionClient = new AmazonRekognitionClient(Amazon.RegionEndpoint.USWest2);

    DetectFacesRequest request = new DetectFacesRequest();
    request.Image = image;
    return await rekognitionClient.DetectFacesAsync(request);
}
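
A sketch of consuming the response (the property names are from the .NET SDK; the calling code is illustrative):

DetectFacesResponse response = await IdentifyFaces(image);
Console.WriteLine("Faces detected: " + response.FaceDetails.Count);
foreach (FaceDetail face in response.FaceDetails)
{
    // Each FaceDetail includes a confidence score and a ratio-based bounding box.
    Console.WriteLine("Confidence: " + face.Confidence);
}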

 

For comparing two images, we use the CompareFacesRequest API. Note that it takes a similarity threshold, which sets the minimum score required for a match to be returned. I’ve got it set to zero in this demo just to see what those scores are. I found it interesting that the score goes higher when the faces are of similar age, gender, color, and ethnicity.

private static async Task<CompareFacesResponse> Compare(Image image1, Image image2)
{
    AmazonRekognitionClient rekognitionClient = new AmazonRekognitionClient(Amazon.RegionEndpoint.USWest2);
    float similarityThreshold = 0F; // set to 0 to see all probability scores

    CompareFacesRequest compareFacesRequest = new CompareFacesRequest()
    {
        SourceImage = image1,
        TargetImage = image2,
        SimilarityThreshold = similarityThreshold
    };

    return await rekognitionClient.CompareFacesAsync(compareFacesRequest);
}
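
A sketch of reading the similarity scores out of the response (names from the .NET SDK):

CompareFacesResponse response = await Compare(image1, image2);
foreach (CompareFacesMatch match in response.FaceMatches)
{
    // Similarity is a percentage (0-100); with the threshold at zero,
    // every candidate match comes back along with its score.
    Console.WriteLine("Similarity: " + match.Similarity);
}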

The lookup identity use case can be applied when we are given an image and want to find the identity of the person by comparing that photo against a collection of photos. In order to compare against a collection, we must first create the collection, then index faces into it. I’ve got a separate console app for doing this, which can be found on GitHub:

https://github.com/johnlee/rekognitionindex
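
That app essentially calls the IndexFaces API. A minimal sketch of what indexing looks like, assuming a Rekognition Image built as above (the collection ID and external image ID are illustrative):

AmazonRekognitionClient rekognitionClient = new AmazonRekognitionClient(Amazon.RegionEndpoint.USWest2);

IndexFacesRequest indexRequest = new IndexFacesRequest()
{
    CollectionId = "family_collection", // must already exist
    Image = image,
    // Optional label returned with search results to identify the person
    ExternalImageId = "person1_photo1"
};

IndexFacesResponse indexResponse = await rekognitionClient.IndexFacesAsync(indexRequest);
Console.WriteLine("Faces indexed: " + indexResponse.FaceRecords.Count);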

With an indexed collection we can use the SearchFacesByImageRequest API to initiate the lookup. The response contains a list of FaceMatch objects, each giving information about an indexed face that matched. Sample code is shown below.

AmazonRekognitionClient rekognitionClient = new AmazonRekognitionClient(Amazon.RegionEndpoint.USWest2);
float similarityThreshold = 0F; // set to 0 to see all probability scores
int maxResults = 100;

SearchFacesByImageRequest request = new SearchFacesByImageRequest()
{
    CollectionId = collectionId,
    Image = image,
    FaceMatchThreshold = similarityThreshold,
    MaxFaces = maxResults
};

return await rekognitionClient.SearchFacesByImageAsync(request);

Note that the lookup request has values for the threshold and max results. The threshold value is similar to the one used in CompareFaces. By setting it to zero I’m returning all faces in the collection to see their probability scores.
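
A sketch of reading the matches out of the response (property names from the .NET SDK):

SearchFacesByImageResponse response = await rekognitionClient.SearchFacesByImageAsync(request);
foreach (FaceMatch match in response.FaceMatches)
{
    // Each match carries the indexed face's metadata, including any
    // ExternalImageId supplied when the face was indexed.
    Console.WriteLine("FaceId: " + match.Face.FaceId
        + ", ExternalImageId: " + match.Face.ExternalImageId
        + ", Similarity: " + match.Similarity);
}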

 

The last use case is looking up similar faces in a collection. This is very similar to the previous use case, except that we use SearchFacesRequest. This request requires a FaceId that already exists in the collection, and then searches that collection for similar faces. This use case can be applied in situations where we might be capturing several images of people, like people walking down a sidewalk throughout the day. We then select a face in one of those photos and search the whole collection for where else that face has come up. Sample code for submitting the request is shown below.

AmazonRekognitionClient rekognitionClient = new AmazonRekognitionClient(Amazon.RegionEndpoint.USWest2);
float similarityThreshold = 0F; // set to 0 to see all probability scores
int maxResults = 100;

SearchFacesRequest request = new SearchFacesRequest()
{
    CollectionId = collectionId,
    FaceId = faceId,
    FaceMatchThreshold = similarityThreshold,
    MaxFaces = maxResults
};

return await rekognitionClient.SearchFacesAsync(request);

 

References

AWS Rekognition
https://aws.amazon.com/rekognition/

Developer Guide
https://docs.aws.amazon.com/rekognition/latest/dg/what-is.html

Build your own Face Recognition
https://aws.amazon.com/blogs/machine-learning/build-your-own-face-recognition-service-using-amazon-rekognition/

Face Comparison
https://docs.aws.amazon.com/rekognition/latest/dg/faces-comparefaces.html

Face Lookup / Search
https://docs.aws.amazon.com/rekognition/latest/dg/collections.html

.Net Examples
https://github.com/awsdocs/amazon-rekognition-developer-guide/tree/master/code_examples/dotnet_examples

 

 
