As the complexity of modern software development and maintenance grows, so does the cost, in both time and money. It comes as no surprise, then, that technologies keep evolving toward hiding complexity behind abstractions for ease of operation. This has always been a core principle of good software development practice, and the same applies to infrastructure.
For infrastructure, this evolution can be visualized as follows:
As an example of this evolution, consider the complexity of shipping an application on bare-metal servers: it was no easy task, and required whole dedicated teams. With VMs, things got a bit simpler. Running multiple environments on a single host made hardware utilization more efficient, and simplified both the management of existing servers and the deployment of new ones. But this approach still required all the configuration steps.
With the advent of containers, we got the opportunity to use multiple environments on a single OS, and further reduced the number of steps required to ship the application code. Now, the environment could be built and shipped within minutes by creating a container, adding the application code to it and deploying.
While containers seemed to be the pinnacle of evolution at the time, cloud providers kept looking for ways to reduce the complexity of managing infrastructure even further.
Enter Function as a Service (FaaS): a revolutionary way to remove infrastructure management from the scope of software development entirely, and one that is now being widely adopted for business applications in the cloud.
FaaS is a category of cloud computing services that provides a platform that enables you to develop, run, and manage application functionalities without the complexity of building and maintaining the infrastructure one normally associates with developing and launching an app. Following a FaaS model is one way of achieving a “serverless” architecture, and is typically used when building microservices applications.
Moving to FaaS was a natural step forward for many container users. According to Datadog's “The State of Serverless” research, about 80% of the companies running containers in AWS have adopted a FaaS approach with Lambda.
AWS Lambda is a serverless compute service that runs your code in response to events and automatically manages the underlying compute resources for you. You can use AWS Lambda to extend other AWS services with custom logic, or create your own back-end services that operate at AWS scale, performance, and security. AWS Lambda can automatically run code in response to multiple events, such as HTTP requests via Amazon API Gateway, modifications to objects in Amazon S3 buckets, table updates in Amazon DynamoDB, and state transitions in AWS Step Functions. 
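To make this concrete, here is a minimal sketch of what a Lambda handler for an API Gateway HTTP request looks like. The event and result shapes below are simplified versions of the standard API Gateway proxy integration payload, reduced to the fields the example uses:

```typescript
// Simplified API Gateway proxy event: the real payload has many more fields.
interface ApiGatewayEvent {
  path: string;
  queryStringParameters: Record<string, string> | null;
}

interface ApiGatewayResult {
  statusCode: number;
  body: string;
}

// The handler is the single entry point Lambda invokes for each event.
export const handler = async (event: ApiGatewayEvent): Promise<ApiGatewayResult> => {
  const name = event.queryStringParameters?.name ?? "world";
  return {
    statusCode: 200,
    body: JSON.stringify({ message: `Hello, ${name}!` }),
  };
};
```

Everything else (provisioning, scaling, routing the HTTP request to this function) is handled by the platform.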
Of course, there are still physical hardware and virtualization layers even in "serverless" environments, but they are hidden from application engineers. One of the main advantages of the serverless approach is that there is no need to explicitly maintain any servers or containers.
As we can see, FaaS minimizes the steps required to get an application up and running. This results in faster time-to-market and a smaller infrastructure team, which translates directly into savings for the business.
However, FaaS is not a magical silver bullet and comes with its own unique set of problems. In this post we will share our practical experience with FaaS architectures, as well as common pitfalls and solutions following an example of a real-world application based on AWS Lambda.
Our client, a world-leading manufacturer of electronic components, engaged Grid Dynamics to develop and launch a new solution that would help them to keep track of component lifecycles, demands, delivery timelines, and calculate associated costs and risks in the form of reports and data visualizations.
The client required a short time-to-market, iterating in short development cycles with continuous feedback and the ability to make changes rapidly. The expected load was not known beforehand, as it would depend on how many of their clients opted to use the app. On one hand, the client wanted the app to scale rapidly to meet possible demand; on the other, they wanted to optimize infrastructure costs for maximum cost-efficiency.
Upon careful consideration of possible options, including AWS Lambda and more traditional containers and servers, we decided that FaaS is the perfect solution as it comprehensively meets key client requirements:
Lambdas integrate seamlessly with the rest of the cloud infrastructure, allowing us to easily use the set of cloud tools together. For example, to trigger a Lambda with an SNS notification or when a message arrives in an SQS queue, it is enough to go to the function, click “Add trigger”, and add one of many AWS services as the event source. The AWS SDK, which allows working with other AWS services such as AWS Secrets Manager or S3, is available in Lambdas out of the box.
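Once such a trigger is wired up, the function just receives the event as JSON. As a sketch, an SQS-triggered handler only needs to unpack the record bodies; the types below are simplified versions of the real SQS event shape, and the `orderId` field is a hypothetical message payload:

```typescript
// Simplified SQS event: each record's body is the raw message text.
interface SqsRecord {
  body: string;
}

interface SqsEvent {
  Records: SqsRecord[];
}

// Unpack every message delivered in this batch and return the order IDs.
export const handler = async (event: SqsEvent): Promise<string[]> => {
  return event.Records.map((record) => JSON.parse(record.body).orderId as string);
};
```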
For more insights into our experience with serverless environments, read this case study: How to create a serverless real-time analytics platform: a case study
The application that we would like to describe is a set of independent API services. Each service consists of 3-4 functions that serve the endpoints corresponding to the task it performs.
A typical service architecture looks like this:
However, when it comes to FaaS, we need to take into account some important considerations.
Java and C# have notoriously long cold start times in serverless environments due to the need to start a runtime virtual machine (the JVM or CLR) before the handler can run. For a customer-facing serverless application, this can be a deal breaker.
Most of the work that the backend does is to take data from the database and send it to the UI. This is an I/O intensive workload and NodeJS, with its asynchronous nature, is known for performing very well in such tasks.
For functions written in Java or Python that make multiple synchronous calls to external systems, especially over HTTP, multithreading can be a boon: instead of waiting for a response, we can spawn a thread and execute a callback when it finishes. NodeJS, with its event-loop-based concurrency, gives us the same benefit without introducing the complexity of multithreading. This adds to the popularity of NodeJS for serverless implementations.
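A small sketch of that benefit, using timers to simulate network latency: several calls are started at once on the event loop, so the total wall time is roughly that of the slowest call, not the sum of all of them.

```typescript
// Simulated I/O call: resolves with its id after delayMs milliseconds.
const fetchFromService = (id: number, delayMs: number): Promise<number> =>
  new Promise((resolve) => setTimeout(() => resolve(id), delayMs));

export async function loadAll(): Promise<number[]> {
  // All three "requests" start immediately; Promise.all awaits them together,
  // so this completes in ~50ms rather than ~150ms.
  return Promise.all([
    fetchFromService(1, 50),
    fetchFromService(2, 50),
    fetchFromService(3, 50),
  ]);
}
```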
It is also important to take into account the technical skills of the development team and the IT organization as a whole when choosing technologies. NodeJS + TypeScript is a popular, mature, and widely adopted stack that fits well into the majority of IT organizations and makes it easy to grow the implementation team by hiring from the market.
On our mission to optimize app performance, we noticed that some invocations had a delay of up to 2 seconds. Speed is especially crucial for authorizer Lambdas that are triggered before each request, and a 2-second delay is simply unacceptable.
The X-Ray segments showed that this initial delay was caused by the “Initialization” subsegment: the well-known cold start issue with Lambdas.
There are a few possible reasons for cold starts:
The cold starts caused by scaling our customer-facing API routes were a problem, so we started improving things on our side.
The first step was reducing initialization latency. One of the steps Lambda performs during initialization is downloading the function code, so we cleaned up all dependencies, reduced the amount of code in our shared library, and minified the code with webpack. That brought the initialization time to under 1 second. But we still wanted to eliminate as much of this time as we could, avoiding cold starts where possible.
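A minimal webpack configuration along these lines might look as follows. The entry path and output names are assumptions to adjust for your project; the key ideas are production mode (which enables minification), targeting Node, and excluding the AWS SDK from the bundle because the Lambda runtime already provides it:

```typescript
// webpack.config.ts -- sketch of a bundle-shrinking config for a Lambda.
import type { Configuration } from "webpack";

const config: Configuration = {
  mode: "production",            // enables minification via terser
  target: "node",                // build for the Node.js Lambda runtime
  entry: "./src/handler.ts",     // assumed entry point
  externals: { "aws-sdk": "commonjs aws-sdk" }, // provided by the runtime, keep it out of the bundle
  output: {
    filename: "handler.js",
    libraryTarget: "commonjs2",  // Lambda expects a CommonJS export
  },
};

export default config;
```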
One common strategy for dealing with cold starts is a scheduled warm-up: simply calling a function from time to time to keep the environment alive. However, this does not help with cold starts caused by scaling, so we opted for another approach and set up provisioned concurrency.
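For reference, the scheduled warm-up pattern usually amounts to a rule (e.g. EventBridge) invoking the function with a marker payload, and the handler returning early so the ping never reaches the business logic. The `warmup` field below is our own convention for the sketch, not part of any AWS event shape:

```typescript
// Event that may either be a warm-up ping or a real request.
interface WarmableEvent {
  warmup?: boolean;
  body?: string;
}

export const handler = async (event: WarmableEvent): Promise<string> => {
  if (event.warmup) {
    // Keep the environment alive without doing any real work.
    return "warmed";
  }
  return `processed: ${event.body ?? ""}`;
};
```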
While this guarantees having warm environments, this approach has some caveats. First, you are billed for the time you have provisioned concurrency enabled and this increases costs. Second, as we mentioned earlier, Lambdas have limited concurrency per region. When you set provisioned concurrency on a function, concurrency available for other functions decreases and this might cause throttling. On the other hand, provisioned concurrency prevents throttling on the function where it is set.
As we can see, it is a great tool that allows for predictable latencies but the amount of provisioned concurrency for a function needs to be carefully measured depending on your traffic patterns.
When a Lambda is called before the environment is reaped, it reuses the existing container and just triggers the handler. This is a warm start, which means that all objects initialized outside the handler are preserved. An early step in our optimization process was therefore to move the creation of database connections and GraphQL server initialization outside the handler function so that they could be reused.
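The pattern is simple: anything created at module scope runs once per cold start and survives warm invocations. In this sketch, `createDbConnection` is a placeholder for real client setup (a connection pool, a GraphQL server, etc.):

```typescript
// Counts how many times expensive setup actually runs.
let connectionsCreated = 0;

// Placeholder for real setup work, e.g. opening a database connection pool.
const createDbConnection = () => {
  connectionsCreated++;
  return {
    query: async (sql: string): Promise<string> => `result of ${sql}`,
  };
};

// Module scope: executed once per cold start, reused on every warm invocation.
const connection = createDbConnection();

export const handler = async (): Promise<string> => {
  // The handler itself only uses the already-initialized connection.
  return connection.query("SELECT 1");
};
```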
Caching is one of the most effective ways to improve system performance. When it comes to APIs built on Lambdas, we deal with caching on several layers.
When using API Gateway, its built-in caching functionality can be leveraged to improve performance. While this is as simple as ticking a checkbox in API Gateway settings and waiting a few minutes while it spins up a cache instance, this method has some important restrictions that made it unsuitable for some of our endpoints. For example, the maximum size of a cacheable response is 1,048,576 bytes (1 MB), and the maximum TTL is 3600 seconds.
This led us to set up a dedicated read-through cache for some endpoints, based on an ElastiCache Redis cluster. This allowed us to improve performance and reduce the cost of long-running requests, while storing data that does not change often for as long as we needed. We also implemented an endpoint to flush the cache on demand, either manually or as part of our release process.
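The read-through logic itself is straightforward: check the cache first, fall back to the slow source on a miss, and populate the cache for subsequent reads. In this sketch a `Map` stands in for Redis, and `loadFromDb` is a placeholder for the real long-running query:

```typescript
// In-memory stand-in for the Redis cluster.
const cache = new Map<string, string>();

// Tracks how many times the slow source is actually hit.
let dbHits = 0;

// Placeholder for the real long-running database query.
async function loadFromDb(key: string): Promise<string> {
  dbHits++;
  return `value-for-${key}`;
}

export async function getWithCache(key: string): Promise<string> {
  const cached = cache.get(key);
  if (cached !== undefined) {
    return cached;                  // cache hit: skip the slow source entirely
  }
  const value = await loadFromDb(key); // cache miss: go to the source
  cache.set(key, value);               // populate for subsequent reads
  return value;
}
```

The on-demand flush endpoint mentioned above would simply clear the relevant keys (e.g. `cache.clear()` here, or a key-pattern delete against Redis).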
While we achieved an enormous advantage in terms of development velocity and scalability, we had to deal with a set of serverless-specific problems. Here is what we learned and what can help you avoid common pitfalls.
Unlike with typical service-oriented backend applications, in a serverless environment the size of the code base matters. So, work with the source code of your serverless applications thoughtfully and mind the impact of dependencies:
In the world of VMs and containers, the larger your instance, the faster it works and the more it costs. In the world of FaaS, the correlation between processing power and cost is less direct: you pay for execution time, and a Lambda with more CPU and memory finishes faster. The only knob you can turn is memory (from 128 MB to 10 GB), and with more memory comes more CPU power.
Julian Wood, in his excellent Optimizing AWS Lambda Performance and Cost for Your Serverless Applications talk at AWS Online Tech Talks, gives the example of calculating all prime numbers up to 1 million, 1000 times, on Lambdas of different sizes. The most drastic result looks like this:
As you can see, for only $0.00001 extra per invocation, we get the code running 8 times faster! But how can you work out which Lambda size gives the best duration/cost trade-off for your code? There is an open-source tool called AWS Lambda Power Tuning. Under the hood, it uses an AWS Step Functions state machine to run multiple invocations of your Lambda with different memory allocations and record the results.
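The arithmetic behind this trade-off is simple: Lambda bills compute per GB-second, so if doubling the memory halves the duration, the compute cost stays roughly the same while latency drops. A sketch of that cost model (the rate below is the published x86 price per GB-second at the time of writing; check current pricing and the per-request charge, which is omitted here):

```typescript
// Assumed x86 Lambda compute rate in USD per GB-second; verify against current pricing.
const PRICE_PER_GB_SECOND = 0.0000166667;

// Compute cost of a single invocation, ignoring the flat per-request charge.
export function invocationCost(memoryMb: number, durationMs: number): number {
  const gbSeconds = (memoryMb / 1024) * (durationMs / 1000);
  return gbSeconds * PRICE_PER_GB_SECOND;
}
```

For example, 2048 MB for 500 ms and 1024 MB for 1000 ms both consume 1 GB-second and cost the same, but the first invocation returns twice as fast.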
Today, FaaS is a mature technology that has been adopted by some of the largest companies in the world, such as Netflix, the BBC, and Coca-Cola.
Among companies that run infrastructure on AWS, adoption of Lambda doubled from 2018 to 2020, reaching about 50%. In enterprise environments, this number reaches an impressive 75%.
That shows that businesses see enormous value in serverless development and even the traditionally conservative large companies are eager to benefit from it.
On the other hand, adopting a new approach comes with a new set of challenges to solve. To get the most out of it, you should have access to a team of highly-skilled cloud professionals that know the ins and outs of serverless development.
In this article we showed you some of the challenges associated with serverless environments, as well as the steps we took to improve system performance:
We achieved an enormous improvement in development velocity and scalability and our client will soon see the advantages of cost reductions and speed thanks to their optimized serverless environment.
Get in touch with us to start a discussion about how your business can adopt a serverless approach.