Building Serverless AI-powered Applications

Friday, November 24, 2017

This post is a recap of a session given by Adrian Hornsby at Devoxx 2017.
The session is available online on

1. Serverless Computing

Serverless computing allows you to build and run applications and services without thinking about servers. Serverless applications don't require you to provision, scale, and manage any servers. You can build them for virtually any type of application or backend service, and everything required to run and scale your application with high availability is handled for you.

Building serverless applications means that your developers can focus on their core product instead of worrying about managing and operating servers or runtimes, either in the cloud or on-premises. This reduced overhead lets developers reclaim time and energy that they can spend on developing great products that scale and are reliable.

Serverless applications provide four main benefits:

No server management
There is no need to provision or maintain any servers. There is no software or runtime to install, maintain, or administer.

Flexible scaling
Your application scales automatically, or you adjust its capacity by changing units of consumption (e.g. throughput, memory) rather than the number of individual servers.

High availability
Serverless applications have built-in availability and fault tolerance. You don't need to architect for these capabilities since the services running the application provide them by default.

No idle capacity
You don't have to pay for idle capacity. There is no need to pre- or over-provision capacity for things like compute and storage. For example, there is no charge when your code is not running.
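
To make "no server management" a bit more concrete, here is a minimal sketch of the kind of function you would hand to AWS Lambda. Lambda calls a Python handler with an `event` and a `context`; everything else here (the `name` field and the greeting) is made up for illustration, and the response shape assumes the function sits behind an API Gateway proxy integration.

```python
import json


def handler(event, context):
    """Entry point invoked once per event; there is no server code to write."""
    # `name` is a hypothetical field of the incoming event, used only for illustration.
    name = event.get("name", "world")
    # statusCode/body is the response shape an API Gateway proxy integration expects.
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }
```

You only deploy the function; provisioning, scaling, and availability are the platform's job, and you are billed only while `handler` runs.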


2. Amazon Polly

Amazon Polly is a cloud service that converts text into lifelike speech. You can use Amazon Polly to develop applications that increase engagement and accessibility. Amazon Polly supports multiple languages and includes a variety of lifelike voices, so you can build speech-enabled applications that work in multiple locations and use the ideal voice for your customers. With Amazon Polly, you only pay for the text you synthesize. You can also cache and replay Amazon Polly’s generated speech at no additional cost.

Common use cases for Amazon Polly include, but are not limited to, mobile applications such as newsreaders, games, e-learning platforms, accessibility applications for visually impaired people, and the rapidly growing Internet of Things (IoT) segment.

Some of the benefits of using Amazon Polly include:

  • High quality – Amazon Polly uses best-in-class Text-to-Speech (TTS) technology to synthesize natural speech with high pronunciation accuracy (including abbreviations, acronym expansions, date/time interpretations, and homograph disambiguation).
  • Low latency – Amazon Polly ensures fast response times, which make it a viable option for low-latency use cases such as dialog systems.
  • Support for a large portfolio of languages and voices – Amazon Polly supports dozens of voices and multiple languages, offering male and female voice options for most languages.
  • Cost-effective – Amazon Polly's pay-per-use model means there are no setup costs. You can start small and scale up as your application grows.
  • Cloud-based solution – On-device Text-to-Speech solutions require significant computing resources, notably CPU power, RAM, and disk space. These can result in higher development costs and higher power consumption on devices such as tablets and smartphones. In contrast, Text-to-Speech conversion done in the cloud dramatically reduces local resource requirements. This enables support of all the available languages and voices at the best possible quality. Moreover, speech improvements are instantly available to all end users and do not require additional updates for devices.

Polly API Example

The Polly API is easy to use.
This example uses the AWS CLI (how to install AWS CLI).

# Synthesize the sentence with the Joanna voice and write the audio to johanna.mp3
aws polly synthesize-speech \
  --text "It was nice to live such a wonderful live show" \
  --output-format mp3 \
  --voice-id Joanna \
  --text-type text \
  johanna.mp3
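
The same call can be made from code. Below is a minimal sketch in Python, assuming the boto3 SDK and configured AWS credentials. `synthesize_speech` is the boto3 counterpart of the CLI command; the `synthesize` helper and its `polly` parameter are illustrative names, with the client passed in so the function can be tried without touching AWS.

```python
def synthesize(text, voice_id="Joanna", polly=None):
    """Return the MP3 bytes Amazon Polly produces for `text`."""
    if polly is None:
        # Imported lazily so the sketch loads even where the SDK is not installed.
        import boto3
        polly = boto3.client("polly")
    response = polly.synthesize_speech(
        Text=text,
        OutputFormat="mp3",
        VoiceId=voice_id,
        TextType="text",
    )
    # AudioStream is a file-like object holding the synthesized audio.
    return response["AudioStream"].read()


# Real use (needs boto3 and AWS credentials):
#   with open("johanna.mp3", "wb") as mp3_file:
#       mp3_file.write(synthesize("It was nice to live such a wonderful live show"))
```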

There is more

An in-depth example developed by Adrian Hornsby, in which an RSS feed is connected to a Lambda function that converts the text to speech with Polly and stores the MP3s in an S3 bucket, can be found here:
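
The linked project is not reproduced here. As an independent, minimal sketch of the same idea, the code below parses an RSS 2.0 feed, sends each item to Polly, and stores the audio, assuming the MP3s are written to an Amazon S3 bucket. The function names and the `episodes/` key layout are hypothetical; `synthesize_speech` and `put_object` are the boto3 calls for Polly and S3.

```python
import xml.etree.ElementTree as ET


def rss_items(rss_xml):
    """Yield (title, description) pairs from an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    for item in root.iter("item"):
        yield (
            item.findtext("title", default=""),
            item.findtext("description", default=""),
        )


def feed_to_mp3s(rss_xml, bucket, polly, s3, voice_id="Joanna"):
    """Synthesize every feed item with Polly and store each MP3 in S3."""
    keys = []
    for index, (title, description) in enumerate(rss_items(rss_xml)):
        speech = polly.synthesize_speech(
            Text=f"{title}. {description}",
            OutputFormat="mp3",
            VoiceId=voice_id,
        )
        key = f"episodes/{index:03d}.mp3"  # hypothetical key layout
        s3.put_object(
            Bucket=bucket,
            Key=key,
            Body=speech["AudioStream"].read(),
            ContentType="audio/mpeg",
        )
        keys.append(key)
    return keys
```

Inside a Lambda function you would create the clients once with `boto3.client("polly")` and `boto3.client("s3")` and pass them in; long articles would also need to be split, since Polly limits the amount of text per request.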

3. Amazon Rekognition

Amazon Rekognition is a service that makes it easy to add image analysis to your applications. With Rekognition, you can detect objects, scenes, and faces; recognize celebrities; and identify inappropriate content in images. You can also search and compare faces. Rekognition’s API enables you to quickly add sophisticated deep learning-based visual search and image classification to your applications.


Amazon Rekognition is based on the same proven, highly scalable, deep learning technology developed by Amazon’s computer vision scientists to analyze billions of images daily for Prime Photos. Amazon Rekognition uses deep neural network models to detect and label thousands of objects and scenes in your images, and Amazon is continually adding new labels and facial recognition features to the service.

Rekognition’s API lets you easily build powerful visual search and discovery into your applications. With Amazon Rekognition, you only pay for the images you analyze and the face metadata you store. There are no minimum fees and there are no upfront commitments.

Rekognition API Example

The Rekognition API is easy to use.
Both examples use the AWS CLI (how to install AWS CLI) and an image stored in your S3 bucket.

Use Rekognition to detect labels

aws rekognition detect-labels \
  --image '{"S3Object":{"Bucket":"adhorn-reko","Name":"horse.jpg"}}'
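
For use inside an application, here is a minimal Python sketch of the same label detection, assuming boto3 and configured credentials. `detect_labels` with `Image` and `MinConfidence` is the boto3 call; the wrapper function, its 80% default threshold, and the injectable `rekognition` client are illustrative choices.

```python
def detect_labels(bucket, name, min_confidence=80.0, rekognition=None):
    """Return (label, confidence) pairs for an image stored in S3."""
    if rekognition is None:
        # Imported lazily so the sketch loads even where the SDK is not installed.
        import boto3
        rekognition = boto3.client("rekognition")
    response = rekognition.detect_labels(
        Image={"S3Object": {"Bucket": bucket, "Name": name}},
        MinConfidence=min_confidence,  # drop labels the service is less sure about
    )
    return [(label["Name"], label["Confidence"]) for label in response["Labels"]]


# Real use (needs boto3 and AWS credentials):
#   for label, confidence in detect_labels("adhorn-reko", "horse.jpg"):
#       print(f"{label}: {confidence:.1f}%")
```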

Use Rekognition to detect faces

aws rekognition detect-faces \
  --image '{"S3Object":{"Bucket":"adhorn-reko","Name":"horse.jpg"}}' \
  --attributes "ALL"
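
With `ALL` attributes the response is fairly large, so it helps to boil it down. The sketch below is one possible way to do that in Python. It relies on the `FaceDetails`, `AgeRange`, `Smile`, and `Emotions` fields of the Rekognition response; the summary format and function names are illustrative assumptions.

```python
def summarize_faces(response):
    """Reduce a detect-faces response to a few fields per face."""
    summaries = []
    for face in response.get("FaceDetails", []):
        emotions = face.get("Emotions", [])
        # Pick the emotion the service is most confident about, if any.
        top = max(emotions, key=lambda emotion: emotion["Confidence"], default=None)
        age = face.get("AgeRange")
        summaries.append({
            "age_range": (age["Low"], age["High"]) if age else None,
            "smiling": face.get("Smile", {}).get("Value"),
            "top_emotion": top["Type"] if top else None,
        })
    return summaries


def detect_faces(bucket, name, rekognition=None):
    """Call Rekognition for an image in S3 and summarize the faces found."""
    if rekognition is None:
        # Imported lazily so the sketch loads even where the SDK is not installed.
        import boto3
        rekognition = boto3.client("rekognition")
    response = rekognition.detect_faces(
        Image={"S3Object": {"Bucket": bucket, "Name": name}},
        Attributes=["ALL"],
    )
    return summarize_faces(response)
```

An image without any faces simply yields an empty list.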

There is more

A more in-depth example developed by Adrian Hornsby can be found on GitHub as well:

This example processes photos uploaded to Amazon S3 and extracts metadata from the image such as geolocation, size/format, time, etc. It then uses image recognition to tag objects in the photo. In parallel, it also produces a thumbnail of the photo.
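
The linked project is not reproduced here. As a rough, independent sketch of the flow just described, the Python code below reads the bucket and key out of an S3 event notification and runs tagging and thumbnail creation side by side. The `Records[].s3` layout is the standard S3 event format; `process_photo`, `tag_objects`, and `make_thumbnail` are hypothetical names, and a thread pool stands in for whatever orchestration the real example uses.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import unquote_plus


def uploaded_objects(event):
    """Yield (bucket, key) for each object in an S3 event notification."""
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys arrive URL-encoded in S3 event notifications.
        yield s3["bucket"]["name"], unquote_plus(s3["object"]["key"])


def process_photo(bucket, key, tag_objects, make_thumbnail):
    """Run tagging and thumbnailing in parallel and collect both results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        tags = pool.submit(tag_objects, bucket, key)
        thumbnail = pool.submit(make_thumbnail, bucket, key)
        return {"key": key, "tags": tags.result(), "thumbnail": thumbnail.result()}
```

A Lambda handler would loop over `uploaded_objects(event)` and call `process_photo` with a Rekognition-backed `tag_objects` and an image-resizing `make_thumbnail`.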

4. Rise of AI

Cloud computing and AI are a match made in heaven. Ten years ago, training a neural network at home was almost impossible, but now you can just boot up a thousand AWS instances and start training (Amazon will love you for this ;)).

Ten years ago there was also very little data available to train your network, and neural networks need enormous amounts of data to learn. Cloud computing also offers cheap storage for all that data.

Many companies are getting started with AI because storage and computing in the cloud are now very cheap.

Neural Network AMIs

Amazon offers several AMIs for deep learning in the AWS Marketplace: