Friday, 14 July 2017

Amazon Lex Now Supports Telephony Audio (8 kHz) for Increased Speech Recognition Accuracy

To increase the accuracy of speech recognition for conversations over the phone, Amazon Lex now supports telephony audio (8 kHz). You can now employ the same deep learning technology as Amazon Alexa to converse with your applications and fulfill the most common requests. Amazon Lex maintains context and dynamically manages the dialogue, adjusting responses based on the conversation.

Amazon Lex integrates with Amazon Connect, a cloud-based contact center service that scales to meet your needs, so you can deploy chatbots to handle first-level customer support. With the Amazon Lex integration to Amazon Connect, you can solve many customer problems without involving a human operator. When necessary, Amazon Lex can transfer a customer support call to an agent, with full context.

Amazon Lex, with 8 kHz telephony audio, is available in the US East (N. Virginia) AWS Region. For more information on using Amazon Lex with your Amazon Connect contact center, see the Amazon Lex chatbots page for Amazon Connect. For details on using the runtime API to communicate with Amazon Lex, see the related Amazon Lex documentation.
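As a rough illustration of sending telephony audio through the runtime API, the following Python sketch wraps the PostContent operation with an 8 kHz PCM content type. The helper name send_telephony_audio and its parameters are hypothetical, and the client is assumed to be a boto3 'lex-runtime' client (or any object with a compatible post_content method).

```python
# Content type for 8 kHz, 16-bit, mono, little-endian PCM telephony audio.
TELEPHONY_CONTENT_TYPE = (
    "audio/lpcm; sample-rate=8000; sample-size-bits=16; "
    "channel-count=1; is-big-endian=false"
)


def send_telephony_audio(client, audio_bytes, bot_name, bot_alias, user_id):
    """Send one utterance of 8 kHz PCM audio and return (dialogState, message).

    `client` is assumed to behave like boto3.client('lex-runtime').
    """
    response = client.post_content(
        botName=bot_name,
        botAlias=bot_alias,
        userId=user_id,
        contentType=TELEPHONY_CONTENT_TYPE,
        accept="text/plain; charset=utf-8",  # ask for a text reply instead of audio
        inputStream=audio_bytes,
    )
    return response.get("dialogState"), response.get("message")
```

With real credentials, the client would typically be created with boto3.client('lex-runtime', region_name='us-east-1'), and audio_bytes would hold the raw PCM samples captured from the call.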

Powered by WPeMatico

The post Amazon Lex Now Supports Telephony Audio (8 kHz) for Increased Speech Recognition Accuracy appeared first on Artificial Intelligence Solutions.

Wednesday, 12 July 2017

Voice-Enabled Mobile Bot Drives Auto Industry Innovation with Real-Time Trade-in Values for Vehicles

The Kelley Blue Book Bot allows users to get the real-time Kelley Blue Book® Trade-In Value for vehicles using natural language. Users can interact with the chatbot in both voice and text. A simple question like, “Kelley Blue Book, can you tell me the trade-in value for my 2012 Honda Civic?” is all that is needed to get expert car advice from an industry-leading automotive company. The bot is built using Amazon Lex and AWS Mobile Hub. Once the conversation has started, Amazon Lex captures user input and manages the dialogue about the vehicle until all information is received. Amazon Lex then calls the Kelley Blue Book API to retrieve the current Kelley Blue Book Trade-In Value based on the user’s location. AWS Mobile Hub enables easy integration of the bot into your mobile app.

In this post, we explain how we built the Kelley Blue Book Bot. We then walk you through building your own bot with Amazon Lex. We will also describe how you can embed the bot into a fully-functional iOS or Android mobile app using AWS Mobile Hub.

The basics

Before you dive into this post, we recommend that you review the basics of building a conversational bot in Amazon Lex: How It Works in the Amazon Lex Developer Guide. It’s worth noting that Amazon Lex uses the same technology that powers Amazon Alexa and can process both speech and text input. The bot that you create understands both input types and you can incorporate one or both ways to interact. 

The design

The premise of a chatbot is that you interact with a bot naturally, using voice or text. You ask the bot questions (or make demands if you’d like), get answers, and complete sophisticated tasks. The Kelley Blue Book Bot is no different. The automotive experts at Kelley Blue Book have built an extensive vehicle database along with their own APIs to retrieve information from the database. Amazon Lex and AWS Mobile Hub enable a simplified user experience by making information from this database available via a conversational interface. During the conversation, Amazon Lex maintains the context by keeping track of the intent, the questions and user responses. Amazon Lex is a fully managed service so you don’t have to worry about designing or managing an infrastructure. The bot can be made available to web, mobile, and enterprise customers.

In action

The interaction begins when the mobile user asks the Kelley Blue Book Bot for the trade-in value for a vehicle. The client captures this input and sends it to Amazon Lex. Amazon Lex translates the voice into text, captures slot values, and validates the values using AWS Lambda. Amazon Lex manages the dialogue, dynamically adjusting the responses until it has all the information that it needs, and then sends the request to a Lambda function for fulfillment. The Lambda function then queries a Kelley Blue Book API and responds to Amazon Lex with real-time vehicle data. The real magic here is the innovative way that a user interacts with an existing API using just voice.

The details: intents, utterances, and slots, oh my!

The Kelley Blue Book Bot has one intent (VehicleMarketValue), which retrieves vehicle market value. It contains utterances such as “Get Trade-In value for my {VehicleYear}, {VehicleMake}, and {VehicleModel}” to identify the user intent. To fulfill the business logic, an intent needs information, or “slots”. For example, to retrieve the market value, the VehicleMarketValue intent requires slots such as VehicleYear, VehicleMake, and VehicleModel. If a user replies with a slot value that includes additional words, such as “87,000 miles”, Amazon Lex can still understand the intended slot value (VehicleMileage:87000). The Kelley Blue Book Bot uses predefined slot types to capture certain user information. For example, AMAZON.NUMBER is a built-in slot type that is used for the {VehicleYear} and {VehicleMileage} slots. Amazon Lex provides an easy-to-use console to guide you through creating your own bot. Alternatively, you can build and connect to bots programmatically by using the SDKs.
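To make the relationship between an intent, its sample utterances, and its slots more concrete, here is a minimal Python sketch that assembles an intent definition as a plain dictionary, in the shape that the Amazon Lex model-building API expects for PutIntent. The helper names, prompts, and the custom slot type names VehicleMakes and VehicleModels are assumptions for illustration only; they are not taken from the Kelley Blue Book Bot.

```python
def build_slot(name, slot_type, prompt, priority, custom=False):
    """Build one required slot entry for an intent definition."""
    entry = {
        "name": name,
        "slotConstraint": "Required",
        "slotType": slot_type,
        "priority": priority,
        "valueElicitationPrompt": {
            "maxAttempts": 2,
            "messages": [{"contentType": "PlainText", "content": prompt}],
        },
    }
    if custom:
        # Custom slot types are versioned; built-in AMAZON.* types are not.
        entry["slotTypeVersion"] = "$LATEST"
    return entry


def build_intent_definition(name="VehicleMarketValue"):
    """Assemble a dictionary describing the intent (hypothetical values)."""
    return {
        "name": name,
        "sampleUtterances": [
            "What is the value of my {VehicleYear} {VehicleMake} {VehicleModel}",
        ],
        "slots": [
            build_slot("VehicleYear", "AMAZON.NUMBER", "What year is the vehicle?", 1),
            build_slot("VehicleMake", "VehicleMakes", "What make is it?", 2, custom=True),
            build_slot("VehicleModel", "VehicleModels", "What model is it?", 3, custom=True),
        ],
        # Return the collected slots to the client instead of calling Lambda.
        "fulfillmentActivity": {"type": "ReturnIntent"},
    }
```

Assuming the custom slot types already exist, a dictionary like this could be passed to boto3.client('lex-models').put_intent(**build_intent_definition()). The console performs the equivalent steps for you.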

Analyze and improve

Amazon Lex provides analytics so that you can view how your customers are interacting with the bot and make necessary improvements over time.  In the console, on the Monitoring tab, you can track the number of utterances (for speech and text), the number of utterances that were not recognized (also known as missed utterances), and the request latency for your bot. The Utterances section provides details on detected and missed utterances. Simply choose the missed utterances to view the inputs that were not recognized by your bot. You can add these utterances to the intent.

Let’s see what happens under the hood

The following graphic shows the voice interaction between a mobile user and the Kelley Blue Book Bot:

The following graphic shows how the Kelley Blue Book Bot reports the current Kelley Blue Book Trade-In Value after the user provides all required vehicle information:

When you integrate the Kelley Blue Book Bot with a mobile application, the following happens:

Monitor your bot

You can view the utterance detail for the bot in the Monitoring tab. The ‘Utterances’ section provides detail on detected and missed utterances. Simply click on the missed utterances to view the inputs that were not recognized by your bot. You can also add these utterances back to the Intent. By doing so you can improve the bot over time.

If you have an existing API and want to learn how to build your own voice-enabled mobile bots using Amazon Lex and Mobile Hub, then keep reading!

Build your own chatbot!

To build a mobile bot with your own AWS resources and backend API, you do the following:

  1. Create a Lambda function to handle validation and fulfillment
  2. Create a custom bot with intents and slots using the Amazon Lex console
  3. Integrate your Amazon Lex bot into a fully functional, secure, and scalable mobile application using Mobile Hub

The Amazon Lex console guides you through creating your own bot. You can also programmatically build and connect to bots using APIs.

Step 1. Create a Lambda validation and fulfillment function

To validate user input and fulfill the user’s request, you need to create a Lambda function. You use this function when you create your Amazon Lex bot in Step 2. The function runs in the Node.js runtime environment.

  1. Sign in to the AWS Lambda console.
  2. Choose the US East (N. Virginia) Region (us-east-1). Currently, Amazon Lex is available only in this AWS Region.
  3. Choose Create a Lambda function.
  4. On the Select blueprint page, choose Blank function.

The Lambda function uses custom code that you copy and paste into the code editor in a later step.

  5. On the Configure triggers page, choose Next.
  6. On the Configure function page, type the name of the function, MobileChatbotHandler, and for the runtime, choose Node.js 6.10.
  7. In the Lambda function code section, choose Edit code inline, and then copy the following Lambda Node.js code (GitHub) and paste it in the editor window:
/*
  * Copyright 2017 Amazon.com, Inc. and its affiliates. All Rights Reserved.
  *
  * Licensed under the MIT License. See the LICENSE accompanying this file
  * for the specific language governing permissions and limitations under
  * the License.
  */
 
/**
* This sample demonstrates an implementation of the Amazon Lex Code Hook Interface
* for the 'VehicleValue' intent for the 'MyMobileChatBot' Amazon Lex ChatBot as described in this blog:
* http://ift.tt/2lAb0EG
*/
 
'use strict';
 
// --------------- Main handler -----------------------
// Route the incoming request based on intent.
// The JSON body of the request is provided in the event slot.
exports.handler = (event, context, callback) => {
    try {
        // By default, treat the user request as coming from the America/New_York time zone.
        process.env.TZ = 'America/New_York';
        // Template literals need backticks for ${...} to be interpolated.
        console.log(`event.bot.name=${event.bot.name}`);
        dispatch(event, (response) => callback(null, response));
    } catch (err) {
        callback(err);
    }
};
 
/**
* Called when the user specifies an intent for this skill.
*/
function dispatch(intentRequest, callback) {
    console.log(`dispatch userId=${intentRequest.userId}, intentName=${intentRequest.currentIntent.name}`);
 
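    const intentName = intentRequest.currentIntent.name;
 
    // Dispatch to your skill's intent handlers
    if (intentName === 'VehicleValue') {
        return vehicleValue(intentRequest, callback);
    }
    // Unknown intents surface as an error through the handler's try/catch.
    throw new Error(`Intent with name ${intentName} not supported`);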
}
 
//
/* ----------- Functions that control the bot's behavior ---------------
 * This function performs dialog management and fulfillment for the bot
 */
function vehicleValue(intentRequest, callback) {
    //for this example, we'll explore just the first three vehicle details; year, make, and model
    const slots = intentRequest.currentIntent.slots;
    const outputSessionAttributes = intentRequest.sessionAttributes || {};
    const carYear = (slots.VehicleYear ? slots.VehicleYear : null);
    const carMake = (slots.VehicleMake ? slots.VehicleMake : null);
    const carModel = (slots.VehicleModel ? slots.VehicleModel : null);
    const source = intentRequest.invocationSource;
 
    if (source === 'DialogCodeHook') {
        // Perform basic validation on the supplied input slots. Use the elicitSlot dialog action to re-prompt for the first violation detected.
        const validationResult = validateVehicleData(carYear, carMake, carModel);
 
        // If any slots are invalid, re-elicit for their value
        if (!validationResult.isValid) {
            slots[`${validationResult.violatedSlot}`] = null;
            callback(elicitSlot(intentRequest.sessionAttributes, intentRequest.currentIntent.name, slots, validationResult.violatedSlot, validationResult.message));
            return;
        }
        callback(delegate(outputSessionAttributes, intentRequest.currentIntent.slots));
        return;
    }
 
    // This is called when the Amazon Lex invocationSource = FulfillmentCodeHook
    // If the intent is configured to invoke a Lambda function as a fulfillment code hook, Amazon Lex sets the invocationSource to this value only after it has all the slot data to fulfill the intent.
    // In a real bot, this would likely involve a call to a backend service.
    callback(close(intentRequest.sessionAttributes, 'Fulfilled',
        { contentType: 'PlainText', content: `Your ${carYear} ${carMake} ${carModel} vehicle has been validated and ready for trade-in.` }));
}
 
function validateVehicleData(carYear, carMake, carModel) {
 
    if (carYear) {
        if (!isValidCarYear(carYear)) {
            return buildValidationResult(false, 'VehicleYear', `We do not have any vehicles in our inventory for the year ${carYear}. Please try a year newer than 1991 and not a date in the future.`);
        }
    }
 
    if (carMake) {
        if (!isValidCarMake(carMake)) {
            return buildValidationResult(false, 'VehicleMake', `We do not have a ${carMake} vehicle make in our inventory, can you provide a different vehicle make such as Ford, Honda, Chevrolet, or Dodge?`);
        }
    }
 
    if (carModel) {
        if (!isValidCarModel(carModel)) {
            return buildValidationResult(false, 'VehicleModel', `We do not have a ${carModel} vehicle model in our inventory matching a ${carYear} ${carMake}, can you provide a different vehicle model such as Explorer, Civic, Malibu, or Dakota?`);
        }
    }
 
    return buildValidationResult(true, null, null);
}
 
//-------------Helper validation functions--------------
 
// Make sure the vehicle year falls within the date range of used vehicles
// Valid dates: 1992 -> CurrentYear
function isValidCarYear(carYear) {
    var isValid = false;
    if (1991 < carYear && carYear <= new Date().getFullYear()) {
        isValid = true;
    }
    return isValid;
}
 
function isValidCarMake(carMake) {
    const vehicleMakes = ['ford', 'honda', 'chevrolet', 'dodge'];
    console.log('[' + carMake + '] matches known vehicle makes? ' + (vehicleMakes.indexOf(carMake.toLowerCase()) > -1));
    return (vehicleMakes.indexOf(carMake.toLowerCase()) > -1);
}
 
function isValidCarModel(carModel) {
    const vehicleModels = ['explorer', 'civic', 'malibu', 'dakota'];
    console.log('[' + carModel + '] matches known vehicle model? ' + (vehicleModels.indexOf(carModel.toLowerCase()) > -1));
    return (vehicleModels.indexOf(carModel.toLowerCase()) > -1);
}
 
function buildValidationResult(isValid, violatedSlot, messageContent) {
    if (messageContent === null) {
        return {
            isValid,
            violatedSlot,
        };
    }
    return {
        isValid,
        violatedSlot,
        message: { contentType: 'PlainText', content: messageContent },
    };
}
 
// --------------- Helpers to build responses which match the structure of the necessary dialog actions -----------------------
function elicitSlot(sessionAttributes, intentName, slots, slotToElicit, message) {
    return {
        sessionAttributes,
        dialogAction: {
            type: 'ElicitSlot',
            intentName,
            slots,
            slotToElicit,
            message,
        },
    };
}
 
function confirmIntent(sessionAttributes, intentName, slots, message) {
    return {
        sessionAttributes,
        dialogAction: {
            type: 'ConfirmIntent',
            intentName,
            slots,
            message,
        },
    };
}
 
function close(sessionAttributes, fulfillmentState, message) {
    return {
        sessionAttributes,
        dialogAction: {
            type: 'Close',
            fulfillmentState,
            message,
        },
    };
}
 
function delegate(sessionAttributes, slots) {
    return {
        sessionAttributes,
        dialogAction: {
            type: 'Delegate',
            slots,
        },
    };
}

The code is also available on GitHub.

  8. In the Lambda function handler and role section, choose Choose a new role from template(s), then type a role name. Leave all other fields as default. For more information about this IAM execution role, see the AWS Lambda documentation.
  9. Choose Next.
  10. On the Review page, choose Create function.

You now have a Lambda Node.js function that can execute your bot’s business logic and fulfillment tasks.
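Before wiring the function to a bot, it can help to invoke it with a sample event from the Lambda console's test feature. The following Python sketch builds such an event in the shape that Amazon Lex sends to a code hook. The helper name build_test_event and the placeholder user ID are assumptions for illustration.

```python
import json


def build_test_event(invocation_source, year=None, make=None, model=None):
    """Build a sample Amazon Lex code hook event for the VehicleValue intent.

    invocation_source is 'DialogCodeHook' (validation) or
    'FulfillmentCodeHook' (fulfillment). Unfilled slots are None (JSON null).
    """
    return {
        "messageVersion": "1.0",
        "invocationSource": invocation_source,
        "userId": "test-user",  # placeholder value
        "sessionAttributes": {},
        "bot": {"name": "MyMobileChatbot", "alias": "$LATEST", "version": "$LATEST"},
        "outputDialogMode": "Text",
        "currentIntent": {
            "name": "VehicleValue",
            "slots": {
                "VehicleYear": year,
                "VehicleMake": make,
                "VehicleModel": model,
            },
            "confirmationStatus": "None",
        },
    }


# Print JSON that can be pasted into the Lambda console as a test event.
print(json.dumps(build_test_event("DialogCodeHook", year="1985"), indent=2))
```

Reading the handler above, an event with a year of 1985 should fail the year check and produce an ElicitSlot response for VehicleYear, while a valid year, make, and model should produce a Delegate response.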

Step 2. Create a voice-enabled bot with the Amazon Lex console

In the Amazon Lex console, create a custom bot with the following settings, as shown in the following screenshot:

  • Bot name: MyMobileChatbot
  • Output voice: Joanna
  • Session timeout: 10 min 

The output voice specified above is used by the bot for text to speech. For your Amazon Lex bot, you configure the session timeout. Amazon Lex maintains the context information of each in-progress conversation for the duration of the session.

Note: Amazon Lex assumes service-linked roles to call AWS services on behalf of your bots and bot channels.

Create intent

Select the blue Create Intent button, enter VehicleValue, and choose Add.

Create slot types

For a generic vehicle market value interaction, add the following slot types, or use these as an example to build your own intents for your business workflow. 

Add them on the Amazon Lex Slots page, as follows. AMAZON.NUMBER is a built-in slot type that is used for the {VehicleYear} slot. You add the prompts (utterances) later.

Create utterances

In the Amazon Lex console, under Sample utterances, add the following utterances. You don’t have to provide an exhaustive list of all possible combinations. Providing a few representative utterances allows the Amazon Lex machine learning system to understand the range of possible user inputs. To create utterances for your own custom bot, use these utterances as an example.

Associate AWS Lambda function for validation and fulfillment

Next, associate the Lambda function that you created earlier with the intent.

Under Lambda initialization and validation, select the Lambda function that you created in Step 1.

Under Fulfillment, select the same AWS Lambda function. This sample uses one function for both validation and fulfillment.

That completes the setup.

When you’re ready to test your bot, choose Save, and then choose Build in the upper-right corner.
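Besides the test window in the console, a built bot can also be exercised from a script through the runtime API. The following Python sketch wraps the PostText operation. The helper name ask_bot and its default values are hypothetical, and the client is assumed to be a boto3 'lex-runtime' client (or any object with a compatible post_text method).

```python
def ask_bot(client, text, user_id="test-user",
            bot_name="MyMobileChatbot", bot_alias="$LATEST"):
    """Send one line of text to the bot and summarize the reply.

    `client` is assumed to behave like boto3.client('lex-runtime').
    """
    response = client.post_text(
        botName=bot_name,
        botAlias=bot_alias,
        userId=user_id,  # Amazon Lex keeps conversation context per user ID
        inputText=text,
    )
    return {
        "dialogState": response.get("dialogState"),
        "slotToElicit": response.get("slotToElicit"),
        "message": response.get("message"),
    }
```

Calling ask_bot repeatedly with the same user_id continues one conversation, so you can answer each prompt in turn until the dialog state indicates that the intent has been fulfilled.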

Step 3. Create a Mobile Hub project and enable the Conversation Bots feature card

Mobile Hub generates fully functional iOS or Android mobile sample project code with an embedded Amazon Lex bot. The mobile app generated by Mobile Hub is securely configured with access control to your AWS resources using Amazon Cognito Identity and uses the Amazon Lex mobile SDK to acquire speech and text inputs from the app and send them to Amazon Lex for natural language processing.

Log in to the AWS Mobile Hub console.

Create a new project.

  1. Choose the Conversational Bots feature card.
  2. Choose Import a bot. Check MyMobileChatBot, and then choose
  3. In the left panel of your project, choose Integrate.
  4. Choose Swift or Objective-C for iOS, or choose Android.
  5. Choose Download a customized example Mobile App project.
  6. Download the zipped project code, and open it in Xcode for iOS, or in Android Studio for Android.

What have we done so far?

When you imported the Amazon Lex bot into your project, Mobile Hub performed two critical steps for you. First, Mobile Hub created an Amazon Cognito identity pool for the project and added permissions that allow authenticated and unauthenticated users of your app to interact securely with the Amazon Lex bot. The identity pool also makes it easy to connect to other AWS services, such as Amazon DynamoDB, from your mobile app.

The following IAM policy statement grants permission to both authenticated and unauthenticated users of your application through Amazon Cognito. Learn more about AWS Identity and Access Management (IAM) roles created here.

 

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "lex:postContent"
      ],
      "Resource": [
        "*"
      ]
    }
  ]
}

Second, Mobile Hub generated a fully functional mobile sample application using the latest Amazon Lex SDK. With this application, your users can immediately interact with the bot. Even permission to use the microphone is provided.

Note: To keep this example simple, the application uses Amazon Cognito unauthenticated users. You can use the application for experimenting and developing prototypes, but in production applications we recommend that you grant unauthenticated users read-only permissions. For information on Amazon Cognito identity pools, see the Amazon Cognito Developer Guide.

You now have a working Amazon Lex bot running on a native iOS or Android mobile device!

That was easy!

Building a chatbot isn’t that difficult. We hope you can find a way to interact with your API using Amazon Lex and AWS Mobile Hub. We can’t wait to see what you build!


Additional Reading

Take your skills to the next level. Learn how to use Amazon Lex and other AWS services to build a voice enabled tracking application that you can access from a web browser or Android application.


About the Authors

As a Product Manager on the Amazon Lex team, Harshal Pimpalkhute spends his time trying to get machines to engage (nicely) with humans.

 

 

 

Dennis Hills is a Mobile Developer Advocate for Amazon Web Services (AWS). He has published more than a half dozen mobile apps and is a regular AWS blogger to the dev community. He is passionate about mobile, automation, AI, and serverless cloud computing.

 

 

 

The post Voice-Enabled Mobile Bot Drives Auto Industry Innovation with Real-Time Trade-in Values for Vehicles appeared first on Artificial Intelligence Solutions.

Thursday, 6 July 2017

Find Distinct People in a Video with Amazon Rekognition

Amazon Rekognition makes it easy to detect, search for, and compare faces in images to find matches. In this post, we show how to use Amazon Rekognition to find distinct people in a video and identify the frames that they appear in. You could use face detection in videos, for example, to identify actors in a movie, find relatives and friends in a personal video library, or track people in video surveillance.

First, we explain how the serverless solution finds distinct people in a video. Then, we explain how to implement the solution in your AWS account with AWS CloudFormation and to test it with a sample video.

How it works

The following diagram shows how this solution works:

Amazon Rekognition currently supports image analysis only. Therefore, we need to extract frames of the input video into images. To create these video thumbnails, we use Amazon Elastic Transcoder, a service that makes it easy to convert media files in the cloud without managing the underlying infrastructure.

This is what happens in greater detail:

  1. You upload a video file into an S3 bucket.
  2. Amazon S3 invokes the first of the two AWS Lambda functions to create a new job in Amazon Elastic Transcoder (the code for this follows this list).
  3. The Elastic Transcoder job creates video thumbnails in .png format for every second of input video and uploads them into the S3 bucket. (It also creates a transcoded video, which we don’t use for this post.)
  4. When the job completes, Elastic Transcoder sends a notification to an SNS topic and Amazon Simple Notification Service (Amazon SNS) invokes another Lambda function.
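import os
import urllib.parse
from datetime import datetime

import boto3

# Retrieve the key for the S3 object that caused this function to be triggered
# (Python 3: unquote_plus lives in urllib.parse and takes a str)
key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'])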
filename = key.split('/')[-1]

# Create a new transcoding job. Files created by Elastic Transcoder start with 'elastictranscoder/[filename]/[timestamp]_'
timestamp = datetime.utcnow().strftime('%Y-%m-%d_%H-%M-%S')

client = boto3.client('elastictranscoder')
response = client.create_job(
  PipelineId=os.environ['PipelineId'],
  Input={'Key': key},
  OutputKeyPrefix='elastictranscoder/{}/{}_'.format(filename, timestamp),
  Output={
    'Key': 'transcoded-video.mp4',
    'ThumbnailPattern': 'thumbnail-{count}',
    'PresetId': os.environ['PresetId']
  }
)
  5. The second Lambda function creates a new collection in Amazon Rekognition. A collection is a container for the faces that Amazon Rekognition detects in images when you call the IndexFaces operation. Note that the image bytes don’t persist in Amazon Rekognition. Instead, Amazon Rekognition extracts and stores facial features in the collection. The function then retrieves the list of thumbnail objects that Elastic Transcoder created for that video in the S3 bucket and does the following:
    1. Calls the IndexFaces operation for each thumbnail. The solution uses concurrent threads to increase the throughput of requests to Amazon Rekognition and to reduce the time needed to complete the operation. In the end, the collection contains one entry for every face detected across all of the thumbnails.
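import os
from queue import Queue
from threading import Thread

import boto3

# Note: rekognition, s3, sns_msg, and CONCURRENT_THREADS are defined elsewhere in the function.

# Create a new collection. I use the job ID for the name of the collection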
collectionId = sns_msg['jobId']
rekognition.create_collection(CollectionId=collectionId)

# Retrieve the list of thumbnail objects in the S3 bucket
thumbnailKeys = []
prefix = sns_msg['outputKeyPrefix']
prefix += sns_msg['outputs'][0]['thumbnailPattern'].replace('{count}', '')

paginator = s3.get_paginator('list_objects')
response_iterator = paginator.paginate(
  Bucket=os.environ['Bucket'],
  Prefix=prefix
)
for page in response_iterator:
  # A page has no 'Contents' key when no objects match the prefix
  thumbnailKeys += [i['Key'] for i in page.get('Contents', [])]

# Call the IndexFaces operation for each thumbnail
faces = {}
indexFacesQueue = Queue()

def index_faces_worker():
  rekognition = boto3.client('rekognition', region_name=os.environ['AWS_REGION'])

  while True:
    key = indexFacesQueue.get()
    
    try:
      # The code expects keys that end with a five-digit frame number followed by '.png'.
      # Compute it before calling IndexFaces, which uses it as the external image ID.
      frameNumber = int(key[:-4][-5:])

      response = rekognition.index_faces(
        CollectionId=collectionId,
        Image={'S3Object': {
          'Bucket': os.environ['Bucket'],
          'Name': key
        }},
        ExternalImageId=str(frameNumber)
      )

      # Store information about returned faces in a local variable
      for face in response['FaceRecords']:
        faceId = face['Face']['FaceId']
        faces[faceId] = {
          'FrameNumber': frameNumber,
          'BoundingBox': face['Face']['BoundingBox']
        }

    # Put the key back in the queue if the IndexFaces operation failed
    except Exception:
      indexFacesQueue.put(key)

    indexFacesQueue.task_done()

# Start CONCURRENT_THREADS threads
for i in range(CONCURRENT_THREADS):
  t = Thread(target=index_faces_worker)
  t.daemon = True
  t.start()

# Wait for all thumbnail objects to be processed
for key in thumbnailKeys:
  indexFacesQueue.put(key)
indexFacesQueue.join()
    2. For each face stored in the collection, calls the SearchFaces operation to search for similar faces with a match confidence higher than 97%. The following code shows how this works:
searchFacesQueue = Queue()

def search_faces_worker():
  rekognition = boto3.client('rekognition', region_name=os.environ['AWS_REGION'])
  
  while True:
    faceId = searchFacesQueue.get()

    try:
      response = rekognition.search_faces(
        CollectionId=collectionId,
        FaceId=faceId,
        FaceMatchThreshold=97,
        MaxFaces=256
      )
      matchingFaces = [i['Face']['FaceId'] for i in response['FaceMatches']]

      # Delete the face from the local variable 'faces' if it has no matching faces
      if len(matchingFaces) > 0:
        faces[faceId]['MatchingFaces'] = matchingFaces
      else:
        del faces[faceId]

    # Put the face ID back in the queue if the SearchFaces operation failed
    except Exception:
      searchFacesQueue.put(faceId)

    searchFacesQueue.task_done()

for i in range(CONCURRENT_THREADS):
  t = Thread(target=search_faces_worker)
  t.daemon = True
  t.start()

for faceId in list(faces):
  searchFacesQueue.put(faceId)
searchFacesQueue.join()

    3. Finds faces in the collection that match each detected face. It starts from the first face that appears in the video and associates that face with a person ID (PersonId in the code) of 1. Then, it recursively propagates the person ID to the matching faces. In other words, if faceA matches faceB and faceB matches faceC, the function decides that faceA, faceB, and faceC correspond to the same person and assigns them all the same person ID. To avoid false positives, the Lambda function propagates the person ID from faceA to faceB only if at least two faces that match faceB also match faceA. When person ID 1 has fully propagated, the function assigns a person ID of 2 to the next face appearing in the video that has no person ID. It continues this process until all of the faces have a person ID. The following code shows how this works:
# Sort the list of faces in the order of which they appear in the video
def getKey(item):
  return item[1]
facesFrameNumber = {k: v['FrameNumber'] for k, v in faces.items()}
faceIdsSorted = [i[0] for i in sorted(facesFrameNumber.items(), key=getKey)]

# Identify unique people and detect the frames in which they appear
def propagate_person_id(faceId):
  for matchingId in faces[faceId]['MatchingFaces']:
    if 'PersonId' not in faces[matchingId]:

      # Count the faces that match matchingId and also match faceId
      numberMatchingLoops = 0
      for matchingId2 in faces[matchingId]['MatchingFaces']:
        if faceId in faces[matchingId2]['MatchingFaces']:
          numberMatchingLoops += 1

      # Propagate only with at least two such faces, to avoid false positives
      if numberMatchingLoops >= 2:
        personId = faces[faceId]['PersonId']
        faces[matchingId]['PersonId'] = personId
        propagate_person_id(matchingId)

personId = 0
for faceId in faceIdsSorted:
  if 'PersonId' not in faces[faceId]:
    personId += 1
    faces[faceId]['PersonId'] = personId
    propagate_person_id(faceId)
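
To see the propagation rule on a small example, the following self-contained Python sketch applies the same logic to hand-written toy data instead of Amazon Rekognition results. The function name assign_person_ids, the face IDs, and the match lists are all made up for illustration, and the sketch assumes that a face never appears in its own match list.

```python
def assign_person_ids(faces, order, min_shared=2):
    """Assign person IDs to faces using the two-shared-matches rule.

    faces: dict mapping a face ID to {'MatchingFaces': [face IDs]}.
    order: face IDs sorted by the frame in which they first appear.
    """
    person_ids = {}

    def propagate(face_id):
        for matching_id in faces[face_id]["MatchingFaces"]:
            if matching_id in person_ids:
                continue
            # Count the faces matching matching_id that also match face_id.
            shared = sum(
                1
                for other in faces[matching_id]["MatchingFaces"]
                if face_id in faces[other]["MatchingFaces"]
            )
            if shared >= min_shared:
                person_ids[matching_id] = person_ids[face_id]
                propagate(matching_id)

    next_id = 0
    for face_id in order:
        if face_id not in person_ids:
            next_id += 1
            person_ids[face_id] = next_id
            propagate(face_id)
    return person_ids


# Toy data: a, b, c, and d all match each other; e and f match only each other.
toy_faces = {
    "a": {"MatchingFaces": ["b", "c", "d"]},
    "b": {"MatchingFaces": ["a", "c", "d"]},
    "c": {"MatchingFaces": ["a", "b", "d"]},
    "d": {"MatchingFaces": ["a", "b", "c"]},
    "e": {"MatchingFaces": ["f"]},
    "f": {"MatchingFaces": ["e"]},
}
print(assign_person_ids(toy_faces, ["a", "b", "c", "d", "e", "f"]))
# → {'a': 1, 'b': 1, 'c': 1, 'd': 1, 'e': 2, 'f': 3}
```

The four mutually matching faces end up as one person, while the isolated pair e and f is not merged because no other face corroborates the match. This is the conservative behavior that the two-shared-matches rule is meant to give.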

In our solution, we arbitrarily chose to return people that appear in at least five consecutive frames. The Lambda function creates a JSON file with the following structure and uploads it to the S3 bucket:

{
  "People": [
    {
      "Frames": [
        {
          "FrameNumber": number,
          "FrameTimePosition": "HH:MM:SS",
          "BoundingBox": { 
            "Height": number,
            "Left": number,
            "Top": number,
            "Width": number
          }
        },
        ...
      ]
    },
    ...
  ]
}

It also creates and uploads a visual representation to the S3 bucket. You will see an example in the next section. Finally, the Lambda function deletes the collection from Amazon Rekognition.
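
The two post-processing ideas described above, the five-consecutive-frames filter and the FrameTimePosition field, can be sketched in a few lines of Python. This is an illustration under stated assumptions and not the solution's actual code: the function names are hypothetical, and the sketch assumes one thumbnail per second with thumbnail 1 taken at 00:00:00.

```python
def frame_time_position(frame_number, interval_seconds=1):
    """Convert a 1-based frame number to an 'HH:MM:SS' position.

    Assumes one thumbnail every interval_seconds, starting at 0 seconds.
    """
    seconds = (frame_number - 1) * interval_seconds
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    return "{:02d}:{:02d}:{:02d}".format(hours, minutes, secs)


def longest_consecutive_run(frame_numbers):
    """Return the length of the longest run of consecutive frame numbers."""
    longest = current = 0
    previous = None
    for number in sorted(set(frame_numbers)):
        if previous is not None and number == previous + 1:
            current += 1
        else:
            current = 1
        longest = max(longest, current)
        previous = number
    return longest


def appears_long_enough(frame_numbers, min_consecutive=5):
    """Keep a person only if they appear in enough consecutive frames."""
    return longest_consecutive_run(frame_numbers) >= min_consecutive
```

For example, a person seen in frames 3 through 7 and again in frame 20 passes the filter, while a person seen only in frames 10 through 13 does not.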

Implementing and testing the solution

To implement and test the solution in your AWS account, you will use AWS CloudFormation to provision the required resources in the US East (N. Virginia) Region.

CloudFormation creates the following resources:

  • An S3 bucket that stores input videos, video thumbnails, and the files created with this solution.
  • An SNS topic where Elastic Transcoder publishes an event when a job completes.
  • An IAM role that grants Elastic Transcoder the required permissions to access Amazon S3 and Amazon SNS.
  • A pipeline and a preset in Elastic Transcoder. The pipeline is a queue for Elastic Transcoder jobs that defines how input and output files are stored in Amazon S3 and which notifications to send. The preset specifies settings, including thumbnail settings, for transcoding media files.
  • An IAM role that grants Lambda the required permissions to access Amazon S3 and Amazon Rekognition.
  • A Lambda function that Amazon S3 invokes when a new video is uploaded into the S3 bucket.
  • A second Lambda function that Amazon SNS invokes when an Elastic Transcoder job completes. This Lambda function processes the video thumbnails to find distinct people.

Some of the resources that AWS CloudFormation creates are custom resources. Therefore, AWS CloudFormation creates the related Lambda functions and IAM roles for Lambda beforehand.

To deploy and test the solution

  1. Choose Create stack to create an AWS CloudFormation stack. Then, follow the on-screen instructions.
    After creating these resources, AWS CloudFormation creates a copy of the video Democratizing LoRaWAN and IoT with The Things Network and stores it in the S3 bucket. This saves you from manually copying the video to test the solution. This triggers the solution. It can take up to 10 minutes after you start creating the stack for the solution to process the video.
  2. After the video has been processed, in the AWS CloudFormation console, choose Outputs and note the name of the S3 bucket.
  3. Open the Amazon S3 console to browse the objects in this S3 bucket. You should see a new folder called output, which contains two files: the JSON document and the visual representation of each face in .png format, as follows:


    The solution has detected seven people in the video. For each person, the visual representation shows four randomly selected views of that person’s face and red vertical lines that indicate where that person appears in a frame.

  4. You can now clean up the resources by deleting the AWS CloudFormation stack. AWS CloudFormation does not delete the S3 bucket because it contains objects. You need to delete the S3 bucket manually.

Conclusion

In this post, we’ve shown how to use Amazon Rekognition, Amazon Elastic Transcoder, AWS Lambda, and Amazon S3 to identify people who appear in a video and to detect the frames in which they appear.

You can adapt this solution to your own requirements. For example, you could return additional attributes for the people that the solution finds, like an estimated age range or their name if they are famous individuals or celebrities.

If you have comments, submit them in the Comments section. If you have questions, start a new thread on the Amazon Rekognition forum.

 


Next Steps

Take your knowledge to the next level. Learn how to classify a large number of images with Amazon Rekognition and AWS Batch.


About the Authors

Nicolas Malaval is a Consultant for AWS Professional Services. He lives in Paris and works with our enterprise customers, helping them adopt cloud technology and innovate with AWS.

 

 

 

Rudy Krol is a Solution Architect for Amazon Web Services. He gained experience in software development before joining AWS. He is now specialized in serverless and IoT, helping our customers in France embrace the latest technologies on their innovative projects.

 

 

 

 

The post Find Distinct People in a Video with Amazon Rekognition appeared first on Artificial Intelligence Solutions.
