Thursday, 29 June 2017

AWS Partners with Mapillary to Support the Large-Scale Scene Understanding Challenge at CVPR 2017


On July 26, 2017, Mapillary, Princeton University and others are hosting the Large-Scale Scene Understanding (LSUN) Challenge in conjunction with CVPR, the premier computer vision conference, in Honolulu, Hawaii. The LSUN Challenge and an associated workshop will bring computer vision researchers and practitioners together to solve problems with large-scale scene classification, scene segmentation, saliency prediction, and RGB-D detection.

Street-level image recognition is the foundation of next-generation applications, such as autonomous vehicles, delivery drones, and smart city projects. The lack of large-scale datasets with dense annotations is one of the biggest obstacles to making these applications a reality. In the LSUN Challenge, the world’s computer vision experts will use the new Mapillary Vistas dataset to help push the state of the art forward.

Training, validation, and test data are already available. To register for access to the dataset, see the Mapillary Vistas Dataset site. The Challenge submission deadline is July 9, 2017.

We look forward to seeing the results and how the community leverages this new and exciting dataset! As part of the challenge, AWS will award $20,000 in prizes in the form of AWS credits. For more information about the challenge and the associated workshop, see the LSUN Challenge site.

Questions? See the semantic segmentation challenge homepage for more details.

Follow us on Twitter: #MapillaryVistas, #AmazonAI, #CVPR2017

 

 

 

 

Powered by WPeMatico

The post AWS Partners with Mapillary to Support the Large-Scale Scene Understanding Challenge at CVPR 2017 appeared first on Artificial Intelligence Solutions.


Tuesday, 27 June 2017

In the Research Spotlight: Zornitsa Kozareva


As AWS continues to support the Artificial Intelligence (AI) community with contributions to Apache MXNet and the release of Amazon Lex, Amazon Polly, and Amazon Rekognition managed services, we are also expanding our team of AI experts, who have one primary mission: To lower the barrier to AI for all AWS developers, making AI more accessible and easy to use. As Swami Sivasubramanian, VP of Machine Learning at AWS, succinctly stated, “We want to democratize AI.”

In our Research Spotlight series, I spend some time with these AI team members for in-depth conversations about their experiences and get a peek into what they’re working on at AWS.


Dr. Zornitsa Kozareva joined AWS in June 2016 as the Manager of Applied Science for Deep Learning, focusing on natural language processing (NLP) and dialog applications. Zornitsa is a recipient of the John Atanasoff Award, which was given to her by the President of the Republic of Bulgaria in 2016 for her contributions and impact in science, education, and industry; the Yahoo! Labs Excellence Award in 2014; and the RANLP Young Researcher Award in 2011. You can read more about Dr. Kozareva on her website, or visit Google Scholar to find her 80 papers and 1464 citations.

Getting into the field of natural language processing

Zornitsa’s interest in the field of natural language processing dates back to 2003, when she was doing her undergraduate studies in computer science in her native Bulgaria. In her third year of undergrad, she applied to the Leonardo Da Vinci Program, which is funded by the European Commission. She was selected to conduct research on multilingual information retrieval at the New University of Lisbon, Portugal. “This was a really great experience. I learned how to build a search engine; how to innovate, write, and publish scientific papers; and, most importantly, how to share my findings with the rest of the research community. For an undergrad such as myself, this opened my eyes to a brand new horizon.”

From that moment, Zornitsa says that she was “mesmerized by machine learning and its ability to solve natural language problems. I became super passionate about the field and I decided that I wanted to pursue a PhD in NLP.”

In 2004, Zornitsa went to Spain for graduate studies, where she worked on “a wide spectrum of topics, including information extraction, semantics, and question answering. This is how my career in NLP started.”

While working toward her PhD, Zornitsa had the opportunity to do a full-year internship. “I picked the Information Sciences Institute, located in Los Angeles, because I wanted to work with world-renowned leaders in the NLP field, such as Dr. Eduard Hovy. For a year, I worked with Dr. Hovy and Dr. Ellen Riloff conducting research on knowledge extraction. It was a great learning experience, and I also received valuable career advice. Right after I graduated, I decided that I wanted to come back to the US and continue to enhance my scientific career.”

In 2009, she became a Research Scientist at the Information Sciences Institute (ISI). At ISI, she spearheaded multimillion-dollar research grants funded by the Defense Advanced Research Projects Agency (DARPA) and Intelligence Advanced Research Projects Activity (IARPA). The research focused on topics such as machine reading, which aims at teaching machines to read and understand text just like humans do; information extraction from unstructured documents on the Web; metaphor interpretation; and sentiment analysis.

Zornitsa compared working on grants to running a mini start-up. “You have to be good at pitching your idea, in order to get it funded. Then you need to figure out which types of people should be hired. Next, you have to meet the milestones that the funding entities expect. And, at the same time, continue to innovate and publish stellar research.” Zornitsa learned how to perfect and balance these skills from Dr. Eduard Hovy, Dr. Jerry Hobbs, and Dr. Kevin Knight. “We were all in the same institute and we worked together. I was very fortunate to have met them, worked with them, and learned from them. That is something for which I am very grateful.” In 2011, Dr. Kozareva joined the University of Southern California as a Research Assistant Professor.

Why did you move out of academia?

“Academia had been great for me. I learned a lot about writing grants; raising funding; operating, delivering, and publishing the research; and teaching. But I had a couple of pursuits. First, I wanted to learn how to build systems that can handle billions of data points. This is something that you can’t really do in academia–you don’t have that much data. You work on much smaller datasets. I was interested in how to build systems that work at a large scale. Second, in research, people often say, ‘Oh that’s an easy problem–that’s a solved problem.’ But when you move to industry, you see that it is much more challenging to make those systems work. Moving to industry allows me to solve ‘What does it take to make research come to life?’ And, most importantly, it allows me to see how the technologies I build could impact the lives of other people. I joined Yahoo in 2014.”

At Yahoo, she worked on mobile search and product ads. “There were a lot of NLP challenges. For instance: How do you automatically detect that a particular query has a shopping intent? How do you extract the semantic information from the query so you can display the relevant information? I worked with various teams to understand the scope of this work and outline the tasks that needed to be executed.” It quickly became clear to senior leadership that Zornitsa had a knack for guiding people. She could certainly drive the technical aspects, but she could also organize people “to deliver results–which is very important to the business.” She started and grew a new group that focused on query understanding for mobile search and ads.

Why did you leave Yahoo?

“I believe that people should continue to evolve. When we built NLP applications (for Yahoo), we built them for a specific user segment. I was curious about how you build these systems for an even bigger customer segment. How do you build them for big corporations that might not have expertise in NLP and machine learning, but still want to solve these problems? AWS is a pioneer in the Cloud. AWS has this vision of doing the heavy lifting for customers so that they can focus on building what they are good at and the products they want. I decided that this was something I wanted to be part of, and that it was time for me to embark on a new adventure.” 

On Joining AWS

Alex Smola reached out to Zornitsa to lead NLP efforts at AWS. He was familiar with her work, and needed someone with research and practical experience in building NLP systems. She thought it was a good fit, and she liked the charter that Alex described. “I have the role of defining what the products should look like, thinking about the customers–understanding who would be using this product; thinking about how you build everything end-to-end–the whole stack from the scientific knowledge to what these machine learning systems should look like; evaluating them at scale; and caring about the quality of what you are producing.”

“We live in the era of artificial intelligence, where the goal is to build systems with humanlike capabilities. We see a lot of progress in self-driving cars and the Internet of Things, but at the core are the conversational assistants that enable us humans to communicate with machines. Until now, developers couldn’t build conversational systems, because they needed to understand a lot about NLP and speech recognition. You had to worry about scalability. How do you test what you built? How do you integrate it? One of the amazing services that we built here at AWS is Amazon Lex, which allows you to build conversational interfaces for your apps using voice and text. And this service is super easy to use–developers don’t have to worry about the infrastructure or the machine learning and NLP components. That is something that I’m passionate about, and I’m proud that we have built it. Now any developer, with or without machine learning or speech expertise, can build these kinds of applications.”

How do you like it at AWS?

“The mission we are on–building things that are used to help people who might not have the necessary expertise–is awesome. And that’s the future.”

When she’s not working on NLP challenges at AWS, Zornitsa enjoys playing beach volleyball and traveling to new places.


About the Author

Victoria Kouyoumjian is a Sr. Product Marketing Manager for the AWS AI portfolio of services, which includes Amazon Lex, Amazon Polly, and Amazon Rekognition, as well as the AWS marketing initiatives with Apache MXNet. She lives in Southern California on an avocado farm and can’t wait until AI can clone her.

 

 

 

 

 

 

 

 


The post In the Research Spotlight: Zornitsa Kozareva appeared first on Artificial Intelligence Solutions.


Saturday, 24 June 2017

Build a Real-time Object Classification System with Apache MXNet on Raspberry Pi


In the past five years, deep neural networks have solved many computationally difficult problems, particularly in the field of computer vision. Because deep networks require a lot of computational power to train, often using tens of GPUs, many people assume that you can run them only on powerful cloud servers. In fact, after a deep network model has been trained, it needs relatively few computational resources to run predictions. This means that you can deploy a model on lower-powered edge (non-cloud) devices and run it without relying on an internet connection.

Enter Apache MXNet, Amazon’s open source deep learning engine of choice. In addition to effectively handling multi-GPU training and deployment of complex models, MXNet produces very lightweight neural network model representations. You can deploy these representations on devices with limited memory and compute power. This makes MXNet perfect for running deep learning models on devices like the popular $35 Raspberry Pi computer.

In this post, we walk through creating a computer vision system using MXNet for the Raspberry Pi. We also show how to use AWS IoT to connect to the AWS Cloud. This allows you to use the Cloud to manage a lightweight convolutional neural network running real-time object recognition on the Pi.

Prerequisites

To follow this post, you need a Raspberry Pi 3 Model B device running Jessie or a later version of the Raspbian operating system, the Raspberry Pi Camera Module v2, and an AWS account.

Setting up the Raspberry Pi

First, you set up the Pi with the camera module to turn it into a video camera, and then install MXNet. This allows you to start running deep network-based analysis on everything that the Pi “sees.”

Set up your Pi with the Camera Module and connect the device to the Internet, either through the Ethernet port or with WiFi. Then, open the terminal and type the following commands to install the Python dependencies for this post:

sudo apt-get update
sudo apt-get install python-pip python-opencv python-scipy python-picamera

Build MXNet for the Pi with the corresponding Python bindings by following the instructions for Devices. For this tutorial, you won’t need to build MXNet with OpenCV.

Verify that the build succeeded by opening a Python 2.7 Read-Eval-Print-Loop (REPL) environment on your Pi’s terminal and typing the following:

python
>>> import mxnet as mx
>>> mx.__version__

Running predictions locally

To run predictions on images captured by the Pi camera, you need to fetch a pretrained deep network model from the MXNet Model Zoo. Create a Python file in the Pi’s home directory, name it load_model.py, and write a class that downloads ImageNet-trained models from the Model Zoo and loads them into MXNet on the Pi:

# load_model.py  
import mxnet as mx
import numpy as np
import picamera
import cv2, os, urllib2, argparse, time
from collections import namedtuple
Batch = namedtuple('Batch', ['data'])


class ImagenetModel(object):

    """
    Loads a pre-trained model locally or from an external URL and returns an MXNet graph that is ready for prediction
    """
    def __init__(self, synset_path, network_prefix, params_url=None, symbol_url=None, synset_url=None, context=mx.cpu(), label_names=['prob_label'], input_shapes=[('data', (1,3,224,224))]):

        # Download the symbol set and network if URLs are provided
        if params_url is not None:
            print "fetching params from "+params_url
            fetched_file = urllib2.urlopen(params_url)
            with open(network_prefix+"-0000.params",'wb') as output:
                output.write(fetched_file.read())

        if symbol_url is not None:
            print "fetching symbols from "+symbol_url
            fetched_file = urllib2.urlopen(symbol_url)
            with open(network_prefix+"-symbol.json",'wb') as output:
                output.write(fetched_file.read())

        if synset_url is not None:
            print "fetching synset from "+synset_url
            fetched_file = urllib2.urlopen(synset_url)
            with open(synset_path,'wb') as output:
                output.write(fetched_file.read())

        # Load the symbols for the networks
        with open(synset_path, 'r') as f:
            self.synsets = [l.rstrip() for l in f]

        # Load the network parameters from default epoch 0
        sym, arg_params, aux_params = mx.model.load_checkpoint(network_prefix, 0)

        # Load the network into an MXNet module and bind the corresponding parameters
        self.mod = mx.mod.Module(symbol=sym, label_names=label_names, context=context)
        self.mod.bind(for_training=False, data_shapes= input_shapes)
        self.mod.set_params(arg_params, aux_params)
        self.camera = None

    """
    Takes in an image, reshapes it, and runs it through the loaded MXNet graph for inference returning the N top labels from the softmax
    """
    def predict_from_file(self, filename, reshape=(224, 224), N=5):

        topN = []

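        # Read the image; cv2.imread returns None if the file can't be read
        img = cv2.imread(filename)
        if img is None:
            return topN

        # Convert OpenCV's BGR channel order to RGB
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

        # Resize image to fit network input
        img = cv2.resize(img, reshape)

        # Reorder axes from (height, width, channel) to (channel, height, width),
        # then add a leading batch axis
        img = np.swapaxes(img, 0, 2)
        img = np.swapaxes(img, 1, 2)
        img = img[np.newaxis, :]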

        # Run forward on the image
        self.mod.forward(Batch([mx.nd.array(img)]))
        prob = self.mod.get_outputs()[0].asnumpy()
        prob = np.squeeze(prob)

        # Extract the top N predictions from the softmax output
        a = np.argsort(prob)[::-1]
        for i in a[0:N]:
            print('probability=%f, class=%s' %(prob[i], self.synsets[i]))
            topN.append((prob[i], self.synsets[i]))
        return topN

    """
    Captures an image from the PiCamera, then sends it for prediction
    """
    def predict_from_cam(self, capfile='cap.jpg', reshape=(224, 224), N=5):
        if self.camera is None:
            self.camera = picamera.PiCamera()

        # Show quick preview of what's being captured
        self.camera.start_preview()
        time.sleep(3)
        self.camera.capture(capfile)
        self.camera.stop_preview()

        # Pass the caller's settings through instead of silently using the defaults
        return self.predict_from_file(capfile, reshape=reshape, N=N)


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="pull and load a pre-trained ImageNet model to classify one image")
    parser.add_argument('--img', type=str, default='cam', help='input image for classification, if this is cam it captures from the PiCamera')
    parser.add_argument('--prefix', type=str, default='squeezenet_v1.1', help='the prefix of the pre-trained model')
    parser.add_argument('--label-name', type=str, default='prob_label', help='the name of the last layer in the loaded network (usually softmax_label)')
    parser.add_argument('--synset', type=str, default='synset.txt', help='the path of the synset for the model')
    parser.add_argument('--params-url', type=str, default=None, help='the (optional) url to pull the network parameter file from')
    parser.add_argument('--symbol-url', type=str, default=None, help='the (optional) url to pull the network symbol JSON from')
    parser.add_argument('--synset-url', type=str, default=None, help='the (optional) url to pull the synset file from')
    args = parser.parse_args()
    mod = ImagenetModel(args.synset, args.prefix, label_names=[args.label_name], params_url=args.params_url, symbol_url=args.symbol_url, synset_url=args.synset_url)
    print "predicting on "+args.img
    if args.img == "cam":
        print mod.predict_from_cam()
    else:
        print mod.predict_from_file(args.img)
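
The two array manipulations in predict_from_file can be hard to picture, so here is a minimal sketch of them in plain Python, with no MXNet, NumPy, or OpenCV required. The helper names hwc_to_chw and top_n are made up for this illustration and are not part of load_model.py. The first mirrors what the two swapaxes calls do to the image layout, and the second mirrors the argsort-based selection of the N most probable labels.

```python
def hwc_to_chw(image):
    """Convert image[row][col][channel] into result[channel][row][col]."""
    channels = len(image[0][0])
    return [[[pixel[c] for pixel in row] for row in image] for c in range(channels)]


def top_n(probabilities, labels, n=5):
    """Return the n most probable (probability, label) pairs, highest first."""
    ranked = sorted(range(len(probabilities)),
                    key=lambda i: probabilities[i], reverse=True)
    return [(probabilities[i], labels[i]) for i in ranked[:n]]


# A tiny 2x2 RGB image: each pixel is (red, green, blue)
image = [[(255, 0, 0), (0, 255, 0)],
         [(0, 0, 255), (9, 8, 7)]]

planes = hwc_to_chw(image)   # three 2x2 planes: red, green, blue
batch = [planes]             # leading batch axis, like img[np.newaxis, :]
print(planes[0])             # red plane: [[255, 0], [0, 9]]

print(top_n([0.1, 0.7, 0.2], ["dog", "cat", "hare"], n=2))
# [(0.7, 'cat'), (0.2, 'hare')]
```

The network in the listing expects a batch of shape (1, 3, 224, 224), which is why the image is resized to 224x224 before the axes are reordered.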

 

To download the lightweight, but highly accurate, ImageNet-trained SqueezeNet V1.1 model and run it on an image of a cat, run the following command in the Pi’s home directory:

wget http://ift.tt/2tDPhPF -O cat.jpg
python load_model.py --img 'cat.jpg' --prefix 'squeezenet_v1.1' --synset 'synset.txt' --params-url 'http://ift.tt/2t459hx' --symbol-url 'http://ift.tt/2tDyMDk' --synset-url 'http://ift.tt/2t4atRR'

The output should include cat as one of the top labels, and look similar to this:

[(0.57816696, 'n02123045 tabby, tabby cat'), (0.19830757, 'n02124075 Egyptian cat'), (0.16912524, 'n02325366 wood rabbit, cottontail, cottontail rabbit'), (0.020817872, 'n02123159 tiger cat'), (0.020065691, 'n02326432 hare')]
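
Each entry in that list pairs a probability with a synset line, which is a WordNet ID followed by comma-separated names. If you want to act on the predictions rather than just print them, something like the following sketch could work. The function names and the 0.1 threshold are assumptions made for illustration; they are not part of load_model.py.

```python
def split_synset_label(label):
    """Split 'n02123045 tabby, tabby cat' into ('n02123045', ['tabby', 'tabby cat'])."""
    wnid, _, names = label.partition(" ")
    return wnid, [name.strip() for name in names.split(",")]


def confident_predictions(predictions, threshold=0.1):
    """Keep predictions at or above the threshold and parse their labels."""
    results = []
    for probability, label in predictions:
        if probability >= threshold:
            wnid, names = split_synset_label(label)
            results.append({"wnid": wnid, "names": names, "probability": probability})
    return results


# The sample output shown above
sample = [(0.57816696, 'n02123045 tabby, tabby cat'),
          (0.19830757, 'n02124075 Egyptian cat'),
          (0.16912524, 'n02325366 wood rabbit, cottontail, cottontail rabbit'),
          (0.020817872, 'n02123159 tiger cat'),
          (0.020065691, 'n02326432 hare')]

for item in confident_predictions(sample):
    print("%s %.2f %s" % (item["wnid"], item["probability"], item["names"][0]))
```

With the sample above, only the first three entries pass the 0.1 threshold.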

To run the pretrained model on an image captured with the Raspberry Pi camera, point the camera at an object that you want to classify and run the following command in the Pi’s home directory:

python load_model.py --img 'cam' --prefix 'squeezenet_v1.1' --synset 'synset.txt'

You will see a quick preview of the image captured by the camera. Then the model runs and returns suggested labels for the object.

Connecting to AWS IoT

Running a model locally on the Pi is a great first step. But to reliably centralize and store predictions and remotely update the model, you need to connect the Pi to the AWS Cloud. To do this, set up AWS IoT on the Pi.

In the AWS IoT console, use the AWS IoT Connect wizard. For platform, choose Linux/OSX. For SDK type, choose Python, and then choose Next.

Register your device with the name “MyRaspberryPi.”

Choose Next Step and download the connection kit in connect_device_package.zip to your Pi. When you unzip connect_device_package.zip and extract its contents into your Pi’s home directory, you see the files that you need to securely connect your device to AWS:

  • myraspberrypi.cert.pem
  • myraspberrypi.private.key
  • myraspberrypi.public.key
  • start.sh

To set up a secure connection between your device and the AWS Cloud, follow the steps on the next screen to run the start.sh script on the Pi. This script downloads the Symantec Root-CA certificate onto your Pi and installs the AWS IoT SDK, which lets you easily interact with AWS IoT from Python. The script also confirms that the Pi is talking to AWS IoT.

Now you can use AWS IoT to create a service on the Pi that runs near-real-time object recognition and constantly pushes results to the AWS Cloud. It also provides a mechanism to seamlessly update the model running on the Pi.

In your home directory, create a new file called iot_service.py, and add the following code to it:

# iot_service.py         
import AWSIoTPythonSDK
from AWSIoTPythonSDK.MQTTLib import AWSIoTMQTTClient
import sys
import logging
import time
import getopt
import json
import load_model

# Custom MQTT message callback
def customCallback(client, userdata, message):
    # Rebind the module-level model so that the prediction loop picks up the change
    global global_model

    print("Received a new message: ")
    print(message.payload)
    print("from topic: ")
    print(message.topic)
    print("--------------\n\n")

    if message.topic == "sdk/test/load":
        # Download the new model files, then load them
        args = json.loads(message.payload)
        new_model = load_model.ImagenetModel(args['synset'], args['prefix'], label_names=[args['label_name']], params_url=args['params_url'], symbol_url=args['symbol_url'])
        global_model = new_model
    elif message.topic == "sdk/test/switch":
        # Switch to a model whose files are already on the Pi
        args = json.loads(message.payload)
        new_model = load_model.ImagenetModel(args['synset'], args['prefix'], label_names=[args['label_name']])
        global_model = new_model

# Usage
usageInfo = """Usage:
 
Use certificate based mutual authentication:
python iot_service.py -e ENDPOINT -r ROOT_CA_FILE -c CERT_FILE -k PRIVATE_KEY_FILE
 
Use MQTT over WebSocket:
python iot_service.py -e ENDPOINT -r ROOT_CA_FILE -w
 
Type "python iot_service.py -h" for available options.
"""

# Help info
helpInfo = """-e, --endpoint
    Your AWS IoT custom endpoint
-r, --rootCA
    Root CA file path
-c, --cert
    Certificate file path
-k, --key
    Private key file path
-w, --websocket
    Use MQTT over WebSocket
-h, --help
    Help information
"""
 
# Read in command-line parameters
useWebsocket = False
host = ""
rootCAPath = ""
certificatePath = ""
privateKeyPath = ""
try:
    opts, args = getopt.getopt(sys.argv[1:], "hwe:k:c:r:", ["help", "endpoint=", "key=","cert=","rootCA=", "websocket"])
    if len(opts) == 0:
        raise getopt.GetoptError("No input parameters!")
    for opt, arg in opts:
        if opt in ("-h", "--help"):
            print(helpInfo)
            exit(0)
        if opt in ("-e", "--endpoint"):
            host = arg
        if opt in ("-r", "--rootCA"):
            rootCAPath = arg
        if opt in ("-c", "--cert"):
            certificatePath = arg
        if opt in ("-k", "--key"):
            privateKeyPath = arg
        if opt in ("-w", "--websocket"):
            useWebsocket = True
except getopt.GetoptError:
    print(usageInfo)
    exit(1)

# Missing configuration notification
missingConfiguration = False
if not host:
    print("Missing '-e' or '--endpoint'")
    missingConfiguration = True
if not rootCAPath:
    print("Missing '-r' or '--rootCA'")
    missingConfiguration = True
if not useWebsocket:
    if not certificatePath:
        print("Missing '-c' or '--cert'")
        missingConfiguration = True
    if not privateKeyPath:
        print("Missing '-k' or '--key'")
        missingConfiguration = True
if missingConfiguration:
    exit(2)


# Configure logging
logger = logging.getLogger("AWSIoTPythonSDK.core")
logger.setLevel(logging.DEBUG)
streamHandler = logging.StreamHandler()
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
streamHandler.setFormatter(formatter)
logger.addHandler(streamHandler)


# Init AWSIoTMQTTClient for publish/subscribe communication with the server
myAWSIoTMQTTClient = None
if useWebsocket:
    myAWSIoTMQTTClient = AWSIoTMQTTClient("basicPubSub", useWebsocket=True)
    myAWSIoTMQTTClient.configureEndpoint(host, 443)
    myAWSIoTMQTTClient.configureCredentials(rootCAPath)
else:
    myAWSIoTMQTTClient = AWSIoTMQTTClient("basicPubSub")
    myAWSIoTMQTTClient.configureEndpoint(host, 8883)
    myAWSIoTMQTTClient.configureCredentials(rootCAPath, privateKeyPath, certificatePath)


# AWSIoTMQTTClient connection configuration
myAWSIoTMQTTClient.configureAutoReconnectBackoffTime(1, 32, 20)
myAWSIoTMQTTClient.configureOfflinePublishQueueing(-1)  # Infinite offline Publish queueing
myAWSIoTMQTTClient.configureDrainingFrequency(2)  # Draining: 2 Hz
myAWSIoTMQTTClient.configureConnectDisconnectTimeout(10)  # 10 sec
myAWSIoTMQTTClient.configureMQTTOperationTimeout(5)  # 5 sec


# Connect and subscribe to AWS IoT
myAWSIoTMQTTClient.connect()
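myAWSIoTMQTTClient.subscribe("sdk/test/load", 1, customCallback)
# The callback also handles this topic, so subscribe to it as well
myAWSIoTMQTTClient.subscribe("sdk/test/switch", 1, customCallback)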
time.sleep(2)


# Tell the server we are alive
myAWSIoTMQTTClient.publish("sdk/test/monitor", "New Message: Starting IoT Server", 0)

global_model = load_model.ImagenetModel('synset.txt', 'squeezenet_v1.1')

while True:
    if global_model is not None:
        predictions = global_model.predict_from_cam()
        print predictions
        myAWSIoTMQTTClient.publish("sdk/test/monitor", "New Prediction: "+str(predictions), 0)


Now run this file by entering the following command in the Pi’s home directory:

python iot_service.py -e my-device-endpoint.amazonaws.com -r root-CA.crt -c myraspberrypi.cert.pem -k myraspberrypi.private.key

In the AWS IoT console, choose Test, and subscribe to the sdk/test/monitor topic.

  

To see the predictions streaming into AWS in real time, on the Test page, choose the name of the new topic. If the network connection slows or drops, the AWS IoT SDK queues outgoing messages on the Pi (the script enables offline publish queueing) and sends them after it reconnects, so the prediction log stays up to date.
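
The script calls configureOfflinePublishQueueing(-1), which asks the SDK to buffer publish requests while the device is offline. To make the idea concrete, here is a simplified sketch of that behavior. It is not the SDK's implementation: the QueueingPublisher class, its drop-oldest policy, and the send callable are all assumptions made for illustration.

```python
from collections import deque


class QueueingPublisher(object):
    """Buffers messages while offline and sends them, oldest first, on reconnect."""

    def __init__(self, send, max_queue_size=-1):
        self._send = send                      # callable(topic, payload) doing the real publish
        self._max_queue_size = max_queue_size  # -1: unbounded, 0: queueing disabled
        self._pending = deque()
        self.online = True

    def publish(self, topic, payload):
        """Send now if online; otherwise queue. Returns True if sent immediately."""
        if self.online:
            self._send(topic, payload)
            return True
        if self._max_queue_size == 0:
            return False                       # queueing disabled: the message is dropped
        if 0 < self._max_queue_size <= len(self._pending):
            self._pending.popleft()            # queue full: drop the oldest message
        self._pending.append((topic, payload))
        return False

    def reconnect(self):
        """Mark the link as up and drain everything that was queued."""
        self.online = True
        while self._pending:
            self._send(*self._pending.popleft())


sent = []
publisher = QueueingPublisher(lambda topic, payload: sent.append((topic, payload)))
publisher.publish("sdk/test/monitor", "prediction 1")
publisher.online = False                       # simulate a dropped connection
publisher.publish("sdk/test/monitor", "prediction 2")
publisher.publish("sdk/test/monitor", "prediction 3")
print(len(sent))                               # 1: two messages are waiting
publisher.reconnect()
print(len(sent))                               # 3: the queue has drained in order
```

An unbounded queue on a device with limited memory can grow without limit during a long outage, so a bounded queue may be the safer choice for a long-running service.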

To send commands to the Pi to update the MXNet model it’s running, you can publish to MQTT topics. For example, to update the SqueezeNet model running on the Pi to a larger, but more accurate, ResNet model, in the MQTT client in the Publish section, send the following JSON to the sdk/test/load topic:

{
"synset": "synset.txt",
"prefix": "resnet-18",
"label_name": "softmax_label",
"params_url": "http://ift.tt/2s6kskI",
"symbol_url": "http://ift.tt/2sBqGN7"
}

This is what it looks like in the MQTT client:

The Pi downloads the new network symbol and parameter files from the Model Zoo, loads them for prediction, and continues running with the new model. You don’t need to download a new synset. The two models that you’re using have been trained for the ImageNet task, so the set of objects that you’re classifying remains the same.
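
Because the MQTT callback and the prediction loop may run on different threads, it can help to validate the command and swap the model behind a lock. The following is a minimal sketch of that pattern, not the post's implementation. ModelHolder and parse_load_command are hypothetical names, and plain strings stand in for the ImagenetModel objects.

```python
import json
import threading

REQUIRED_KEYS = ("synset", "prefix", "label_name", "params_url", "symbol_url")


def parse_load_command(payload):
    """Decode a load command and check that every expected key is present."""
    if isinstance(payload, bytes):
        payload = payload.decode("utf-8")
    args = json.loads(payload)
    if not isinstance(args, dict):
        raise ValueError("load command must be a JSON object")
    missing = [key for key in REQUIRED_KEYS if key not in args]
    if missing:
        raise ValueError("load command is missing: " + ", ".join(missing))
    return args


class ModelHolder(object):
    """Holds the active model so that it can be replaced from another thread."""

    def __init__(self, model=None):
        self._lock = threading.Lock()
        self._model = model

    def get(self):
        with self._lock:
            return self._model

    def swap(self, build_model):
        # Build first (slow: download and load). If this raises,
        # the old model stays active and keeps serving predictions.
        new_model = build_model()
        with self._lock:
            self._model = new_model


holder = ModelHolder("squeezenet_v1.1")        # a string stands in for the real model
command = json.dumps({"synset": "synset.txt", "prefix": "resnet-18",
                      "label_name": "softmax_label",
                      "params_url": "http://ift.tt/2s6kskI",
                      "symbol_url": "http://ift.tt/2sBqGN7"})
args = parse_load_command(command)
holder.swap(lambda: args["prefix"])            # real code would build an ImagenetModel here
print(holder.get())                            # resnet-18
```

In the service, the callback would call holder.swap and the while loop would call holder.get() once per iteration, so a failed download never leaves the loop without a model.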

Next steps

By running MXNet for predictions on the Raspberry Pi and connecting it to the AWS Cloud with AWS IoT, you have created a near state-of-the-art computer vision system. Your system doesn’t rely on a constant high-bandwidth connection to stream video or expensive GPU servers in the cloud to process that video. In fact, by using AWS and MXNet on the Pi, you can easily create a much more reliable and cost-effective smart camera system. With this approach, you enjoy most of the benefits of cloud-based model monitoring and management. But you reduce the price from a monthly cost of hundreds of dollars (for server and data transfer costs) to a one-time hardware cost of around $60 (the cost of the Pi and the camera module).
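
To put rough numbers on that comparison, here is a small sketch. The $60 hardware figure comes from the paragraph above. The $200 monthly figure is only an assumed stand-in for “hundreds of dollars”, and the calculation ignores AWS IoT messaging charges, so treat the output as illustrative.

```python
EDGE_HARDWARE_COST = 60.0       # one-time cost of the Pi and camera module (from the post)
ASSUMED_CLOUD_MONTHLY = 200.0   # hypothetical monthly server and data transfer cost


def cost_after(months, monthly_cost=0.0, upfront_cost=0.0):
    """Cumulative cost in dollars after the given number of months."""
    return upfront_cost + monthly_cost * months


for months in (1, 6, 12):
    cloud = cost_after(months, monthly_cost=ASSUMED_CLOUD_MONTHLY)
    edge = cost_after(months, upfront_cost=EDGE_HARDWARE_COST)
    print("after %2d months: cloud $%.0f, edge $%.0f" % (months, cloud, edge))
```

Under these assumptions, the cloud-only setup costs more than the edge hardware within the first month.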

This smart camera system is just the tip of the iceberg. You can start to iterate on it, connecting it to production services in the AWS Cloud, building out multi-device coordination through AWS IoT, and using methods like transfer learning to adapt the pretrained models to specific computer vision tasks.

 


Additional Reading

Learn how to use Amazon Rekognition to build an end-to-end serverless photo recognition system.


About the Author

Aran Khanna is a Software Development Engineer with AWS Deep Learning. He works as the technical lead for MXNet on IoT, edge, and mobile devices, allowing customers to put intelligence everywhere by enabling them to deploy and manage efficient deep networks across a broad set of low-powered devices. In his spare time, you can find him writing about digital privacy, building new features for his smart home, or skiing in Lake Tahoe.

 


The post Build a Real-time Object Classification System with Apache MXNet on Raspberry Pi appeared first on Artificial Intelligence Solutions.


“Greetings, visitor!” — Engage Your Web Users with Amazon Lex


All was well with the world last night. You went to bed thinking about convincing your manager to add some time in the next sprint for much-needed improvements to the recommendation engine for shoppers on your website. The machine learning models are out of date and people are complaining, but no one is looking past the one-off tickets that stream in every day. You wake up to the usual flurry of email.

But what’s this? You learn that the Chief Marketing Officer is at an industry conference where she’s heard the buzz about conversational experiences. She just tried out some chatbots, and now she wants one for the site. She wants to connect with shoppers one-on-one to offer them a personalized experience. That’s a fun technology problem. As long as the management team hires someone to help with the look and feel, you can focus on the fun part of putting the chatbot together.

In this post, we show how easy it is to create a chatbot and a personalized web experience for your customers using Amazon Lex and other AWS services.

What do you need to prove?

Personalized experience covers a lot of ground, but you have ideas. You could create a virtual shopping assistant that can answer questions about products; check colors, styles, and pricing; offer product recommendations; bring up relevant deals; remember shopping preferences; look up ratings and reviews–and, of course, you’ll need to look up the most useful and recent reviews first–or wait … maybe even talk about what the Twitterverse thinks. But you have to nail basic stuff like “Do you have this in red?”, “Where can I get it?”, and “What’s the return policy?”

Basically, you need to prove:

  1. That you can build a bot quickly (check, you have Amazon Lex for that)
  2. That you can integrate your bot with the site (and later on, you might use AWS Lambda to connect to other apps)
  3. That it’s easy to monitor the bot and update it (you’re not really sure about this one)

For starters, you decide to keep it simple. You decide to build an example bot using Amazon Lex, wire it up to static HTML, connect it to a stub service, and see what it takes to update the bot. This is going to be fun!

Build an Amazon Lex bot

The specific bot isn’t important. You just want to make sure that you can put together a web experience that integrates with a service on the backend. You can start with the Amazon Lex BookTrip example. It takes a couple of minutes, but when you’re done, you’re ready to test the “Return parameters to client” (no code hooks yet) version of the bot. San Francisco for two nights, anyone?

Next, you follow the instructions to use a blueprint to create a Lambda function (BookTripCodeHook) that will serve as the code hook for initialization, data validation, and fulfillment activities. You use the Test events from the Sample event template list to confirm that the code works as expected and that you don’t have any setup or permissions issues.

Now, you incorporate the Lambda function into the Amazon Lex bot. You follow the instructions to associate the new function as the Initialization and data validation code hook and the Fulfillment code hook for both the BookCar and BookHotel intents.

You specify a Goodbye message so you’ll know for sure when the bot completes successfully as you test.

You build the bot and retest it. This time around, the Lambda function provides the room rate based on the location, the room options, and the number of nights. You tweak the code to make sure that it’s easy to integrate with an API. For example, you could integrate with a weather data source so that the user can ask about the weather in the chosen city.
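As a rough illustration of what such a fulfillment hook does, the sketch below prices a stay from the city, the room type, and the number of nights, and then returns a response that closes the conversation. The nightly rates, the room multipliers, and the slot names (Location, RoomType, Nights) are assumptions made for this example, not the blueprint's actual code. The response follows the shape Amazon Lex expects back from a Lambda code hook.

```javascript
// Hypothetical nightly base rates per city (illustrative values only).
const BASE_RATES = { 'san francisco': 200, 'new york': 180, 'chicago': 120 };
// Hypothetical multipliers per room type.
const ROOM_MULTIPLIERS = { queen: 1.0, king: 1.2, deluxe: 1.5 };

// Price a stay; unknown cities or room types fall back to defaults.
function quoteHotel(location, roomType, nights) {
  const base = BASE_RATES[String(location).toLowerCase()] || 100;
  const multiplier = ROOM_MULTIPLIERS[String(roomType).toLowerCase()] || 1.0;
  return Math.round(base * multiplier * Number(nights));
}

// Build a "Close" response in the shape Amazon Lex expects from a code hook.
function close(sessionAttributes, content) {
  return {
    sessionAttributes: sessionAttributes,
    dialogAction: {
      type: 'Close',
      fulfillmentState: 'Fulfilled',
      message: { contentType: 'PlainText', content: content }
    }
  };
}

// Fulfillment step for a BookHotel-style intent event.
function fulfillBookHotel(event) {
  const slots = event.currentIntent.slots;
  const price = quoteHotel(slots.Location, slots.RoomType, slots.Nights);
  return close(
    event.sessionAttributes || {},
    'Your ' + slots.Nights + '-night stay in ' + slots.Location +
      ' comes to $' + price + '.'
  );
}
```

In a Lambda function, the handler would pass the incoming event to fulfillBookHotel and hand the result to the callback. Swapping quoteHotel for a call to a real pricing or weather API is the kind of integration described above.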

Set up Amazon Cognito

You’re ready to push this out to a static website, but you want to ensure it’s not left wide open. You know Amazon Cognito will let you manage permissions and users for mobile and web apps, so you start with an Amazon Cognito federated identity pool.

From the Amazon Cognito console, you choose Manage new identity pool, and then choose Create new identity pool. You provide a pool name (botpool), choose Enable access to unauthenticated identities, and then choose Create Pool.

To create the pool and the associated AWS Identity and Access Management (IAM) roles, you choose Allow. Then, you record the IAM role names so you can modify them.

Finally, you get the pool ID that you need for the JavaScript you will use to integrate the bot.

You modify the IAM roles to allow access to Amazon Lex. From the IAM console, you find the roles and change each of them to attach the AmazonLexRunBotsOnly and AmazonPollyReadOnlyAccess policies.

Test your chatbot on the web

You quickly put together an HTML file that you can use to test your bot. The pool ID is used here to establish an IAM session.

 





The page carries the title “Amazon Lex for JavaScript - Sample Application (BookTrip)”, the heading “Amazon Lex - BookTrip”, and a short invitation: “This little chatbot shows how easy it is to incorporate Amazon Lex into your web pages. Try it out.”
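A minimal sketch of the script side of such a page, assuming the AWS SDK for JavaScript (v2) is loaded in the browser. The helper name buildPostTextParams, the bot name and alias, the region, and the pool ID shown in the comments are placeholders for illustration. Only the helper runs as written; the SDK calls are shown as comments because they need a browser session and real credentials.

```javascript
// Build the request for the Amazon Lex runtime PostText operation.
// The bot name and alias below are placeholders for your own bot's values.
function buildPostTextParams(text, userId, sessionAttributes) {
  const inputText = String(text).trim();
  if (inputText.length === 0) {
    throw new Error('Nothing to send: the message is empty.');
  }
  return {
    botName: 'BookTrip',   // assumed bot name
    botAlias: '$LATEST',   // assumed alias
    userId: userId,
    inputText: inputText,
    sessionAttributes: sessionAttributes || {}
  };
}

// In the page, the params would be used roughly like this (not executed here):
//
//   AWS.config.region = 'us-east-1'; // assumed region
//   AWS.config.credentials = new AWS.CognitoIdentityCredentials({
//     IdentityPoolId: 'us-east-1:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx' // placeholder
//   });
//   var lexruntime = new AWS.LexRuntime();
//   lexruntime.postText(buildPostTextParams('book a hotel', 'visitor-1'),
//     function (err, data) {
//       // On success, data.message holds the bot's reply and
//       // data.sessionAttributes should be sent back on the next turn.
//     });
```

Keeping the request construction in a small function makes it easy to check the input before anything is sent, and to carry the session attributes from one turn to the next.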

You upload the file so that you can host it on Amazon S3 as a static web site to test your chatbot on the web.

Now that’s a productive morning! You send a quick note to the team and head out for a well-deserved break.

Monitoring and feedback

By the time you get back, a few people have already tried out the bot. You check the Amazon Lex console for metrics.

You notice that some people have been saying “hotel for 2 nights” and Amazon Lex isn’t catching that, so you add a new utterance and rebuild the bot. You realize that you could use the Model API to do this programmatically, but the console will do for now. You’ve met your goal and can now demo your solution.
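The programmatic route might look roughly like the following sketch. It assumes you have fetched the intent with the model-building GetIntent operation and will send the result back with PutIntent. The helper name withUtterance is made up for this example, and the SDK calls appear only as comments.

```javascript
// Fields that GetIntent returns but PutIntent does not accept.
const READ_ONLY_FIELDS = ['createdDate', 'lastUpdatedDate', 'version'];

// Return PutIntent params with one extra sample utterance.
// `intent` is an object shaped like a GetIntent response for $LATEST.
function withUtterance(intent, utterance) {
  const params = {};
  Object.keys(intent).forEach(function (key) {
    if (READ_ONLY_FIELDS.indexOf(key) === -1) {
      params[key] = intent[key];
    }
  });
  const existing = intent.sampleUtterances || [];
  // Avoid adding a duplicate utterance.
  params.sampleUtterances = existing.indexOf(utterance) === -1
    ? existing.concat([utterance])
    : existing.slice();
  return params;
}

// With the AWS SDK for JavaScript (v2), usage would be roughly:
//
//   var models = new AWS.LexModelBuildingService();
//   models.getIntent({ name: 'BookHotel', version: '$LATEST' }, function (err, intent) {
//     if (err) { return console.error(err); }
//     models.putIntent(withUtterance(intent, 'hotel for {Nights} nights'), function (err2) {
//       // The bot still has to be rebuilt before the change takes effect.
//     });
//   });
```

The checksum returned by GetIntent is deliberately left in the params, because updating the $LATEST version of an existing intent requires it. The slot placeholder {Nights} is an assumption about how the intent names that slot.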

Conclusion

Amazon Lex makes it easy to create functioning bots in minutes. Using services like Amazon Cognito and Amazon S3, you can quickly integrate a chatbot into a web experience, but there is so much more to do. How can you tell when there are a hundred users of the new bot? Could you wire it up to the web analytics? Could you fire analytics events when the visitor gets to a certain step in the interaction?

 


Additional Reading

Learn how to integrate your Amazon Lex bot with any messaging service.


About the Author

As a Solutions Architect, Niranjan Hira is often found near a white board helping our customers assemble the right building blocks to address their business challenges. In his spare time, he breaks things to see if he can put them back together.

 

 

 


The post “Greetings, visitor!” — Engage Your Web Users with Amazon Lex appeared first on Artificial Intelligence Solutions.


In the Research Spotlight: Hassan Sawaf


As AWS continues to support the Artificial Intelligence (AI) community with contributions to Apache MXNet and the release of Amazon Lex, Amazon Polly, and Amazon Rekognition managed services, we are also expanding our team of AI experts, who have one primary mission: To lower the barrier to AI for all AWS developers, making AI more accessible and easy to use. As Swami Sivasubramanian, VP of Machine Learning at AWS, succinctly stated, “We want to democratize AI.”

In our Research Spotlight series, I spend some time with these AI team members for in-depth conversations about their experiences and get a peek into what they’re working on at AWS.


Hassan Sawaf has been with Amazon since September 2016. This January, he joined AWS as Director of Applied Science and Artificial Intelligence.

Hassan has worked in the automatic speech recognition, computer vision, natural language understanding, and machine translation fields for 20+ years. In 1999, he cofounded AIXPLAIN AG, a company focusing on speech recognition and machine translation. His partners included Franz Josef Och, who eventually started the Google Translate team; Stephan Kanthak, now Group Manager with Nuance Communications; and Stefan Ortmanns, today Senior Vice President, Mobile Engineering and Professional Services with Nuance Communications. Hassan also spent time at SAIC as Chief Scientist for Human Language Technology, where he worked on multilingual spoken dialogue systems. Coincidentally, his peer from Raytheon BBN Technologies was Rohit Prasad, who is now VP and Head Scientist for Amazon Alexa.

How did you get started?

“I started working in development on information systems in airports, believe it or not. Between airlines and airports, and from airport-to-airport, the communication used to be via Telex messages, using something similar to “shorthand” information about the plane. These messages included information such as Who has boarded the plane? What’s the cargo? How is the baggage distributed on the plane? How much fuel does it have? What kinds of passengers (first class, business class), etc. This kind of information was sent from airline to airport before the plane landed. But by the 1990’s, flight travel had grown exponentially. And it used to be that humans had to read this information and translate that into actions in the airport. So, we built the technology that could do this fully automatically, so that manual human intervention was no longer needed. People no longer needed to sit there reading Telex messages and typing ahead on the computer. We converted this such that the process was completely done by machine. This was my first project in natural language understanding.

“After that, I started doing speech recognition and machine translation in combination, so that people with different languages could communicate over the phone with each other. Again, in the mid-90’s, this was very complicated – it still is! But more so at that time because hardware was not available and machine learning was just getting ready to be utilized. So, we developed a system, out of the University of Aachen in research, and started a company in 1999, taking with me some of the best research scientists and students, to commercialize a product for speech translation, which we launched in 2002. One of the co-founders was Franz Och, who started and led the Google Translate team.”

In 2010, Hassan started at SAIC as Chief Scientist for a DARPA project doing dialogue systems – specifically working on speech translation projects, and projects that perfect communications with robots, such that these bots receive instructions and respond with inquiries to learn and perfect their actions.

After SAIC, Hassan joined eBay and established several AI teams, starting with a team that implemented machine translation – specifically for increasing cross-border trade revenue. Hassan later also managed computer vision, user behavior modeling, natural language understanding and dialogue modeling. While leading the AI team behind the eBay Chat Bot, he was instrumental in expanding the idea of “chatbot conversations” to include images.

Why did you join AWS?

Hassan explained that although eBay’s scope is large, it’s primarily focused on commerce.

“I was hired by Swami Sivasubramanian, VP of Machine Learning at AWS, to develop technology around human language for higher level services – e.g. working on the science behind Amazon Lex. eBay was very interesting for me, with a large scope, but at AWS, the scope is bigger, as it covers not just commerce – but everything: Building technologies that are available for anyone to use for any use case they have; Enabling our developers to come up with new ideas that they might have to utilize the technology, instead of building the tech from scratch again and again which is expensive and slows down the advancement of products and solutions. Customers can focus on their business ideas and their special competencies, while AWS takes care of the core capabilities. Developers can take advantage of this to come up with these new and innovative solutions. That’s very exciting for me –  I love new ideas. Specifically, I like to help new entrepreneurs start something, and AWS is exactly in that space.”

You can find Hassan in Palo Alto, CA, working on his passions in human language, machine translation, and computer vision, and the science behind Amazon Lex and other Amazon and AWS AI services. In his free time, Hassan enjoys hiking and learning to play the guitar. You might also see him out in Monterey, CA, on the track racing sports cars!


About the Author

Victoria Kouyoumjian is a Sr. Product Marketing Manager for the AWS AI portfolio of services, which includes Amazon Lex, Amazon Polly, and Amazon Rekognition, as well as the AWS marketing initiatives with Apache MXNet. She lives in Southern California on an avocado farm and can’t wait until AI can clone her.

 

 

 

 

 

 

 

 


Using Amazon Rekognition to Identify Persons of Interest for Law Enforcement


This is a guest post by Chris Adzima, a Senior Information Systems Analyst for the Washington County Sheriff’s Office. 

In law enforcement, it is extremely important to identify persons of interest quickly. In most cases, this is accomplished by showing a picture of the person to multiple law enforcement officers in hopes that someone knows the person. In Washington County, Oregon, there are nearly 20,000 different bookings (when a person is processed into the jail) every year. As time passes, officers’ memories of individual bookings fade. Also, in most cases, investigations move very quickly. Waiting for an officer to come on duty to identify a picture might mean missing the opportunity to solve the case.

In this post, I discuss our decision to use AWS for facial recognition. I walk through setting up web and mobile applications using AWS, demonstrating how easy it is even for someone who is new to AWS. I then show how we used Amazon Rekognition to build a powerful tool for solving crimes.

The following diagram shows the system architecture:

Setup

When we were presented with the problem of quickly identifying persons of interest, we thought it seemed like something we could automate instead of resorting to the usual manual processes. We wanted to be able to not only get responses back to the officers within seconds, but also to ensure that officers’ memory wasn’t going to be a limiting factor.

This is where we turned to AWS and Amazon Rekognition. We had not used AWS, but we had read a release announcement about Amazon Rekognition a few days prior to being approached about fixing the identification process. We thought this would be a great product to test.

Setup was fairly straightforward. In the Washington County jail management system (JMS), we have an archive of mugshots going back to 2001. We needed to get the mugshots (all 300,000 of them) into Amazon S3. Then we needed to index them all in Amazon Rekognition, which took about 3 days.

Our JMS allows us to tag the shots with the following information: front view or side view, scars, marks, or tattoos. We only wanted the front view, so we used those tags to get a list of just those.

Uploading to S3 was easy. At first, we simply created the bucket and manually used the web interface to upload approximately 1,000 images at a time. While this took a while, it didn’t take a lot of our time because we could set it and forget it.

Implementation (here be code)

Later, we used a script to upload the images. We used PHP to move the files from our JMS servers and process them onto the web server we are using for AWS. On the server, we use the following code to place the images in S3:

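// Load the AWS SDK for PHP (assumes it was installed with Composer).
require 'vendor/autoload.php';

$sharedConfig = [
    'region' => 'us-west-2',
    'version' => 'latest'
];

$sdk = new Aws\Sdk($sharedConfig);
$client = $sdk->createS3();

// Store the uploaded file in the bucket under the name supplied in the request.
$result = $client->putObject(array(
    'Bucket' => 'BUCKETNAME',
    'Key'    => $_REQUEST['name'],
    'Body'   => $_REQUEST['fileData']
));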

S3 makes it simple to create the files in the system.

After the 300,000 images were uploaded into Amazon S3, we then needed to index all of the images.  In hindsight, we realized that it would have been easier to index them in the same script that I used to upload them to S3. This would have eliminated the need to validate which images had already been indexed.

To index the faces, we simply looped through every image in the bucket:

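// Rekognition client, created here so that the loop below is self-contained.
$rekog = new Aws\Rekognition\RekognitionClient($sharedConfig);

// Fetch the next page of up to 50 objects, starting after the saved marker.
$iterator = $client->listObjects(array(
    'Bucket'  => 'wcso-let-faces',
    'MaxKeys' => 50,
    'Marker'  => $previousMarker // This marker was saved in a database. It allowed me to know where in the list I was during indexing.
));

foreach ($iterator['Contents'] as $object) {
    $result = $rekog->indexFaces([
        'CollectionId' => 'COLLECTIONID', // REQUIRED
        'ExternalImageId' => $object['Key'], // back reference to the S3 object
        'Image' => [
            'S3Object' => [
                'Bucket' => 'BUCKETNAME',
                'Name' => $object['Key'],
            ],
        ],
    ]);
}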

Again, despite having no experience with AWS, we found this extremely easy, and it worked very well. You do need to set the ExternalImageId property so that you can tell which image Amazon Rekognition is returning when you do a face search. Without it, you have no back reference to the S3 object.
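To make the back reference concrete, here is a small sketch in JavaScript of the same idea: the S3 key is reused as the ExternalImageId, and the last key of each page becomes the marker to store for the next run. The function names and the indexFace callback are hypothetical stand-ins. A real SDK call would be asynchronous, so this synchronous version only illustrates the bookkeeping.

```javascript
// Build the IndexFaces request for one S3 object. Using the object key as
// ExternalImageId is what lets a later search result point back to the image.
function buildIndexFacesParams(collectionId, bucket, key) {
  return {
    CollectionId: collectionId,
    ExternalImageId: key,
    Image: { S3Object: { Bucket: bucket, Name: key } }
  };
}

// Index one page of keys and return the marker to save for the next run
// (the last key processed, or null when the page is empty).
// indexFace is a hypothetical stand-in for the Rekognition call.
function indexPage(keys, collectionId, bucket, indexFace) {
  let lastKey = null;
  keys.forEach(function (key) {
    indexFace(buildIndexFacesParams(collectionId, bucket, key));
    lastKey = key;
  });
  return lastKey;
}
```

Saving the returned marker after every page means an interrupted run can resume from the last completed page instead of starting over.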

After all of the images were indexed, we worked on a quick front end that would let me search the collection for matches when we got a new image. A simple form to a PHP script provided that front end.

// `files` is assumed to be the FileList from the form's file input.
var formData = new FormData();
for (var i = 0; i < files.length; i++) {
    var file = files[i];
    // Skip anything that is not an image.
    if (!file.type.match('image.*')) {
        continue;
    }
    formData.append('photos[]', file, file.name);
}

var xhttp = new XMLHttpRequest();
xhttp.onreadystatechange = function() {
    if (this.readyState == 4 && this.status == 200) {
        // Parse the JSON array of matches and render each one.
        var obj = JSON.parse(this.responseText);
        for (var j = 0; j < obj.length; j++) {
            getFaceResults(obj[j]);
        }
    }
};

xhttp.open("POST", "getSearchResults.php", true);
xhttp.send(formData);

When an image was submitted through the form, we searched using a simple script:

$sharedConfig = [
    'region' => 'us-west-2',
    'version' => 'latest'
];

$sdk = new Aws\Sdk($sharedConfig);
$client = $sdk->createS3();
$rekog = new Aws\Rekognition\RekognitionClient($sharedConfig);

// Read the first uploaded photo from the form submission.
$fileData = file_get_contents($_FILES["photos"]["tmp_name"][0]);

$result = $rekog->searchFacesByImage([
    'CollectionId' => 'COLLECTIONNAME', // REQUIRED
    'Image' => [ // REQUIRED
        'Bytes' => $fileData,
    ],
    'MaxFaces' => 5
]);

// Collect [ExternalImageId, Similarity] pairs for the front end.
$results = [];
foreach ($result['FaceMatches'] as $v) {
    $results[] = array($v["Face"]["ExternalImageId"], $v["Similarity"]);
}

echo json_encode($results);

With the results in the $results[] array, I used the ExternalImageId and was able to display the S3 image:

$result = $client->getObject(array(
        'Bucket' => 'wcso-let-faces',
        'Key'    => $imgKey
    ));
    
echo $result['Body'];

We also used the ExternalImageId to query our database for information about the booking. We accomplished this with a simple AJAX call to the web service we set up on our JMS server.

function getFaceResults(obj) {
    var json = JSON.stringify(obj);
    var resDiv = document.getElementById("resultDiv");

    // Show each match in its own borderless, non-scrolling iframe.
    var iframe = document.createElement('iframe');
    iframe.width = "90%";
    iframe.height = "230px";
    iframe.style.border = "0";
    iframe.style.overflow = "hidden";
    // Encode the JSON so that it is safe to place in a URL.
    iframe.src = "http://PATHTOAPPLICATION" + encodeURIComponent(json);

    resDiv.appendChild(iframe);
}
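Before rendering, it can help to order the matches and drop weak ones. The sketch below takes an object shaped like a SearchFacesByImage response and returns the candidates sorted by similarity. The function name rankMatches and the default cutoff of 80 are assumptions for this example, not part of our application.

```javascript
// Turn a SearchFacesByImage-style response into a ranked list of candidates.
// minSimilarity is a percentage (0-100); 80 is an illustrative default.
function rankMatches(response, minSimilarity) {
  const threshold = typeof minSimilarity === 'number' ? minSimilarity : 80;
  return (response.FaceMatches || [])
    .map(function (m) {
      return { imageKey: m.Face.ExternalImageId, similarity: m.Similarity };
    })
    .filter(function (m) { return m.similarity >= threshold; })
    .sort(function (a, b) { return b.similarity - a.similarity; });
}
```

Each imageKey is the ExternalImageId set at indexing time, so it can be used both to fetch the mugshot from S3 and to look up the booking.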

After setting up, we were ready to test the tool. The best way to test would be to run surveillance and other images of known suspects from solved cases and evaluate the accuracy of the results. But we didn’t want to taint the results because we already knew who the suspects were. So we had a detective send 20 random pictures of individuals whose identity he knew, but we didn’t. We ran all of them through the system, and reviewed the results to find the face that we thought matched the best. We sent the results to the detective to see how it went. 75% of the results accurately identified the person in the photo.

After testing was done, we wanted to put the power of the application into the hands of the officers. We did that by creating a mobile application.

Again, we created a simple UI for capturing an image and then processing it with Amazon Rekognition. The code for searching faces is fairly straightforward:

let Rekognition = AWSRekognition.default();

let searchFaceRequest = AWSRekognitionSearchFacesByImageRequest();
searchFaceRequest?.collectionId = "COLLECTIONID";
searchFaceRequest?.maxFaces = 5;
// FaceMatchThreshold is a percentage (0-100), so 85 means 85% similarity.
searchFaceRequest?.faceMatchThreshold = 85;
searchFaceRequest?.image = AWSRekognitionImage();

// Resize the captured picture before sending it (resizeImage is the app's own helper, not shown).
let newSize = CGSize(width: 1500, height: 1500);
let newImage = resizeImage(image: self.samplePicture.image!, targetSize: newSize);
searchFaceRequest?.image?.bytes = UIImagePNGRepresentation(newImage);

Rekognition.searchFaces(byImage: searchFaceRequest!).continue({ (task) -> AnyObject! in
    if let faces = task.result?.faceMatches {
        for face in faces {
            let externalID = face.face?.externalImageId!;
            let similarity = face.similarity!;
            // Use the externalID and similarity to call a web service that will get the information
        }
    } else {
        // Do something when there are no results
    }
    return nil;
});

After we made both the mobile application and the web application available, we started seeing results. For example, we caught a suspect based on an image taken with a camera at a self-checkout kiosk at a big box hardware store.

Early in 2017, an unknown suspect visited a hardware store, filled a basket with expensive items, and scanned them at the self-checkout kiosk. Before finishing the checkout process, the suspect picked up the merchandise and walked out of the store. The checkout kiosk’s camera captured a great shot of him.

Typically, this would initiate a manual process where we show the image to multiple law enforcement officers and hope that someone recognizes the suspect.

This time, we ran the image through our facial recognition system and got four hits with more than 80% similarity according to Amazon Rekognition. We noticed that one of the men looked very familiar to us. We gave his name to the detective in charge of the investigation. The detective did a quick search of Facebook and found a picture of him. In that picture, we noticed many facial similarities. The best part? He was wearing the same hoodie as the man captured on camera who was suspected of the theft.

In another example, a surveillance camera captured the image of a man using a credit card that was later reported as stolen. Because of the low resolution and high angle of the image, it was difficult to determine who it was just by looking at the image. When we ran it through Amazon Rekognition, we received a result that was greater than a 95% match.

In a final example, we were searching for a person of interest who was posting photos on Facebook under a pseudonym. Due to some of the posts he was authoring, a local law enforcement agency needed to identify and speak with him. His profile picture showed him lying on a bed covered in dollar bills. We used this image to search our mugshots and found a close to 100% match. We identified the individual. Officers were able to discuss the post with him and ensure that he and the public were safe.

Conclusion

In this post, I showed how to use Amazon Rekognition for facial recognition. I also showed how easy it is to design and implement a facial recognition system in AWS. Amazon Rekognition has become a powerful tool for identifying suspects for my agency. As the service improves, I hope to make it the standard for facial recognition in law enforcement.


Activity Tracking with a Voice-Enabled Bot on AWS
