Using Amazon CloudWatch Internet Monitor for a Better Gaming Experience

This blog post shows how gaming customers can use the health metrics in Amazon CloudWatch Internet Monitor to monitor game performance more easily. It gives an overview of an example online-gaming application architecture, covers common issues and challenges in monitoring performance and availability for gamers, and describes how you can use Amazon CloudWatch Internet Monitor to identify and address latency and availability issues for gaming applications. This post assumes that you have already read Introducing CloudWatch Internet Monitor.

Many gaming companies build on Amazon Web Services (AWS), drawing on a broad and deep range of hardware and software services. Multiplayer online games require fast, stable internet connections to ensure smooth, seamless gameplay, so internet speed and connectivity are a significant priority for gamers. Not all gamers understand latency, but keeping it low is one of a game developer’s highest priorities for ensuring a smooth experience for players. Amazon CloudWatch Internet Monitor can help you quickly pinpoint where users are experiencing latency issues globally and give you insights into improving performance for your AWS internet-facing applications.

The architecture of gaming workloads

When multiplayer game servers run in the cloud, game sessions are hosted on remote servers and connected to end devices, such as computers, tablets, or smartphones. Because video and audio are streamed to local devices, gamers can play interactively with others without needing high-end gaming hardware. The following diagram shows an example gaming application architecture and typical AWS components.

Figure 1. Example architecture for gaming applications in AWS

  • The game client is the player’s device, such as a computer, tablet, or handheld gaming device, used to access the game portal and play games.
  • Frontend servers include the game servers, which host the game itself, and the platform services, which provide features like leaderboards, matchmaking, chat services, inventory management, and analytics.
  • Backend servers provide game database and analytics services (either regional or centralized), maintain game state, store analytics, and keep game servers up to date.

Running game servers in the cloud can provide a scalable, high-performance platform for players to enjoy games from anywhere in the world. However, given that the internet is a distributed environment with users competing for resources, intermittent issues may occur. The following section runs through common internet issues.

Internet issues for game studios

The following are some common internet issues that gamers who use cloud gaming can encounter:

  • Latency and lag: Latency is colloquially known by players as “ping time” or “round-trip time”. This is the time it takes for data to travel from a player’s device to the game server and back. High latency can cause delays and lag in the game, which affects the player’s ability to play because what they see falls out of sync with the game state.
  • Packet loss: Packet loss occurs when some of the data sent between a gamer’s device and the game server is lost. This can lead to glitches in the game and even crashes, which is frustrating for players because they miss what has happened.
  • Bandwidth: Bandwidth is the amount of data that can be transferred over an internet connection per second. High-quality games require more bandwidth to run smoothly, and a slow internet connection can result in choppy gameplay and longer load times.
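
To make the first two metrics concrete, here is a minimal Python sketch that summarizes round-trip latency and packet loss from a series of probe results. The function name `summarize_probes`, the convention that `None` marks a lost probe, and the sample values are all assumptions made for illustration; none of them come from Internet Monitor itself.

```python
from statistics import mean
from typing import Optional, Sequence


def summarize_probes(rtts_ms: Sequence[Optional[float]]) -> dict:
    """Summarize probe results; None marks a probe that received no reply."""
    if not rtts_ms:
        raise ValueError("need at least one probe result")
    replies = [r for r in rtts_ms if r is not None]
    lost = len(rtts_ms) - len(replies)
    return {
        "sent": len(rtts_ms),
        "lost": lost,
        # Packet loss as a percentage of probes sent.
        "loss_pct": 100.0 * lost / len(rtts_ms),
        # Latency statistics cover only probes that came back.
        "avg_rtt_ms": mean(replies) if replies else None,
        "max_rtt_ms": max(replies) if replies else None,
    }


# Hypothetical sample: five probes, one lost, one latency spike.
samples = [32.0, 35.0, None, 41.0, 120.0]
print(summarize_probes(samples))
```

A single spike, like the 120 ms reply in the sample, raises the average noticeably, which is why it can help to look at both the average and the worst case when judging how a connection feels to a player.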

Challenges with monitoring online games

The quality of a player’s experience is closely tied to the performance of their internet connection. When connection quality degrades, identifying the cause isn’t easy, because several factors can be responsible and it can be hard to track down which one is the actual problem. Here are some of the challenges:

  • Internet connectivity depends on many different providers
  • Lack of visibility when an Internet Service Provider (ISP) has network issues
  • Difficulty gathering data about ISP performance to drive improvements

The following diagram illustrates how complex troubleshooting internet issues can be. In the diagram, players connect over the internet to game servers located in one AWS Region. ISPs typically have multiple points of presence (POPs) for accessing and connecting to the internet, and an outage can occur at a POP or at another access point. Game developers need testing, monitoring, and troubleshooting to detect these kinds of internet problems.

Figure 2. Clients and ISP networks accessing the AWS cloud

It is essential to monitor traffic into AWS to understand how your internet-facing game performs. However, collecting and tracking internet traffic data can be difficult and expensive, and network capture tools can be intrusive and create overhead on players’ machines. You can avoid these problems by using Amazon CloudWatch Internet Monitor for your game.

Addressing customer needs

With Amazon CloudWatch Internet Monitor, you can monitor internet problems across multiple geographic locations and networks, identified by Autonomous System Numbers (ASNs) and typically ISPs, without writing a single line of code or putting network capture tools on your players’ machines. When an issue occurs, Internet Monitor can help you visualize its impact and pinpoint the affected locations and internet service providers. You can see a global view of traffic patterns and health events, and dig deeper into the details of an event based on its location. You can also learn about actions that you can take to improve the network experience for players in the future, such as rerouting through different ISPs or using other AWS Regions or services.

Customer Use Case

In this walkthrough, we demonstrate how CloudWatch Internet Monitor can help monitor game performance. In this use case, a game studio has launched a global game in one AWS Region. The studio is concerned about providing a smooth gaming experience for all players and wants to respond quickly if it encounters any network issues. The studio would like to see all of its game’s traffic, so it initially monitors 100% of the internet traffic. The studio focuses on improving performance across the top 100 cities by pinpointing areas experiencing issues and adjusting traffic accordingly. To identify which players have a bad experience, the studio uses time to first byte (TTFB) as its key metric. TTFB measures how long it takes to transfer the first byte to the game client. The studio would also like to optimize future game launches by deploying in the location where the most gamers will have optimal TTFB.

Step 1: Create a monitor by following the steps outlined in Introducing CloudWatch Internet Monitor, and select 100% of traffic to monitor, as shown in the following image.

CloudWatch Internet Monitor
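
If you would rather script this step than use the console, the following is a minimal sketch using the AWS SDK for Python (boto3). It assumes boto3 is installed and AWS credentials are configured. The helper names, the monitor name, and the resource ARN in the usage comment are hypothetical placeholders, not values from this walkthrough.

```python
def build_monitor_request(monitor_name, resource_arns, traffic_pct=100):
    """Build keyword arguments for the Internet Monitor CreateMonitor call."""
    if not 1 <= traffic_pct <= 100:
        raise ValueError("traffic_pct must be between 1 and 100")
    return {
        "MonitorName": monitor_name,
        # ARNs of the resources to monitor, for example a VPC or a
        # CloudFront distribution.
        "Resources": list(resource_arns),
        # 100 matches the studio's choice to monitor all traffic.
        "TrafficPercentageToMonitor": traffic_pct,
    }


def create_monitor(request, region_name):
    """Send the request; needs boto3 and valid AWS credentials."""
    import boto3  # imported here so the builder works without boto3

    client = boto3.client("internetmonitor", region_name=region_name)
    return client.create_monitor(**request)


# Usage (hypothetical names and ARN, not executed here):
# request = build_monitor_request(
#     "game-monitor",
#     ["arn:aws:ec2:us-east-1:111122223333:vpc/vpc-0123456789abcdef0"],
# )
# create_monitor(request, region_name="us-east-1")
```

Keeping the request builder separate from the API call makes it easy to review the settings before anything is created in your account.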

Step 2: To see more than the traffic optimization suggestions shown on the dashboard, use the following CloudWatch Logs Insights query.

  • Go to CloudWatch → Logs Insights
  • Select the log group “/aws/internet-monitor/<monitorname>/byCity”

  • Select an appropriate time-period, and use the following query.

fields @timestamp,
clientLocation.city as @city,
clientLocation.subdivision as @subdivision,
clientLocation.country as @country,
`trafficInsights.timeToFirstByte.currentExperience.serviceName` as @serviceNameField,
concat(@serviceNameField, ' (', `serviceLocation`, ')') as @currentExperienceField,
concat(`trafficInsights.timeToFirstByte.ec2.serviceName`, ' (', `trafficInsights.timeToFirstByte.ec2.serviceLocation`, ')') as @ec2Field,
`trafficInsights.timeToFirstByte.cloudfront.serviceName` as @cloudfrontField,
concat(`clientLocation.networkName`, ' (AS', `clientLocation.asn`, ')') as @networkName
| filter ispresent(`trafficInsights.timeToFirstByte.currentExperience.value`)
| stats avg(`trafficInsights.timeToFirstByte.currentExperience.value`) as @averageTTFB,
avg(`trafficInsights.timeToFirstByte.ec2.value`) as @ec2TTFB,
avg(`trafficInsights.timeToFirstByte.cloudfront.value`) as @cloudfrontTTFB,
sum(bytesIn + bytesOut) as @totalBytes,
latest(@ec2Field) as @ec2,
latest(@currentExperienceField) as @currentExperience,
latest(@cloudfrontField) as @cloudfront,
count(*) by @networkName, @city, @subdivision, @country
| display @city, @subdivision, @country, @networkName, @totalBytes, @currentExperience, @averageTTFB, @ec2, @ec2TTFB, @cloudfront, @cloudfrontTTFB
| sort @averageTTFB desc | limit 100

The query returns the 100 city and network combinations with the highest average TTFB, sorted from highest to lowest. You can now focus on improving those areas first.
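
The query from Step 2 can also be run outside the console. The following is a minimal sketch using boto3, assuming boto3 is installed and credentials are configured. The helper names `run_query` and `rows_to_dicts` are hypothetical, and the query text and log group name are passed in by the caller rather than repeated here.

```python
import time


def rows_to_dicts(results):
    """Convert Logs Insights rows (lists of field/value cells) into dicts."""
    return [{cell["field"]: cell["value"] for cell in row} for row in results]


def run_query(log_group, query, start_epoch, end_epoch, region_name,
              poll_seconds=1.0):
    """Start a Logs Insights query and wait for it to finish."""
    import boto3  # imported here so rows_to_dicts works without boto3

    logs = boto3.client("logs", region_name=region_name)
    query_id = logs.start_query(
        logGroupName=log_group,
        startTime=start_epoch,  # seconds since the Unix epoch
        endTime=end_epoch,
        queryString=query,
    )["queryId"]

    # Poll until the query reaches a terminal status.
    while True:
        response = logs.get_query_results(queryId=query_id)
        if response["status"] in ("Complete", "Failed", "Cancelled", "Timeout"):
            break
        time.sleep(poll_seconds)

    if response["status"] != "Complete":
        raise RuntimeError(f"query ended with status {response['status']}")
    return rows_to_dicts(response["results"])
```

Logs Insights returns every value as a string, so convert fields such as `@averageTTFB` to numbers before doing arithmetic on them.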

Step 3: To determine where to launch your next server and improve performance for the greatest number of players, use CloudWatch Internet Monitor Traffic Insights. Sample Traffic Insights are shown in the image below. Here you can filter by a client location or network, and select different options, such as Amazon Elastic Compute Cloud (Amazon EC2) or Amazon CloudFront, to compare the predicted average TTFB with the current TTFB. Viewing the options and recommendations can give you a head start on planning new setups for your application to improve performance.

The traffic optimization suggestions show that London, England, United Kingdom has the largest amount of total traffic with the highest TTFB, and that launching the next server in the eu-west-2 (London) Region would improve TTFB for those players.
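
One way to reason about a result like this is to weigh the possible TTFB improvement against the amount of traffic it affects. The following sketch ranks rows shaped like the output of the Step 2 query. The scoring formula (bytes multiplied by the TTFB reduction) is an assumption made for illustration and is not how Traffic Insights computes its suggestions; the sample rows are invented.

```python
def rank_by_potential_gain(rows):
    """Rank locations by traffic-weighted potential TTFB reduction."""
    ranked = []
    for row in rows:
        current = float(row["@averageTTFB"])
        predicted = float(row["@ec2TTFB"])
        total_bytes = float(row["@totalBytes"])
        # A negative difference means the alternative would be slower,
        # so treat it as no gain.
        gain_ms = max(current - predicted, 0.0)
        ranked.append({
            "city": row["@city"],
            "gain_ms": gain_ms,
            "score": gain_ms * total_bytes,
        })
    return sorted(ranked, key=lambda r: r["score"], reverse=True)


# Hypothetical rows; values are strings, as Logs Insights returns them.
rows = [
    {"@city": "Paris", "@averageTTFB": "90", "@ec2TTFB": "80", "@totalBytes": "8000"},
    {"@city": "London", "@averageTTFB": "180", "@ec2TTFB": "60", "@totalBytes": "5000"},
    {"@city": "Dublin", "@averageTTFB": "50", "@ec2TTFB": "70", "@totalBytes": "1000"},
]
for entry in rank_by_potential_gain(rows):
    print(entry["city"], entry["gain_ms"], entry["score"])
```

With these invented numbers London ranks first, because a large reduction applies to a large share of traffic, while Dublin ranks last, because the alternative would not be faster than the current setup.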

Conclusion

In this blog post, we showed how you can improve your players’ experience with Amazon CloudWatch Internet Monitor. Using the connectivity data that AWS captures from its global networking footprint, game studios can identify internet issues impacting players and use data and recommendations to improve connectivity by shifting traffic through different AWS Regions or services. To learn more about Internet Monitor, visit the documentation.

Blog authors: David Fowler & Prashanth Nalubandhu
Author: David Fowler