
Change management - Streaming Media Lens

Change management

In a traditional streaming media environment, major changes are performed infrequently, during maintenance windows, using manual processes. This approach is high-risk because the people, processes, and tools involved in updating infrastructure are rarely exercised, which can lead to drift in documentation and in institutional knowledge of the workload.

We encourage you to automate deployments, testing, and rollbacks using services like AWS CloudFormation. Automation lets teams make small, frequent changes and helps ensure that the infrastructure state is represented in code and managed in version control, and that your change management processes are well tested. To make troubleshooting easier, we also recommend a consistent naming convention for the components that make up the workflow. For instance, name each component with identifiers for asset, service Region, and AZ.
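A naming convention like the one described above can be enforced in code rather than by memory. The following is a minimal sketch, not a prescribed scheme: the function name `component_name`, the underscore separator, the field order, and the example values are all assumptions made for illustration, and it treats service and Region as separate fields.

```python
import re

# Assumption: each field may contain only lowercase letters, digits, and hyphens.
_ALLOWED = re.compile(r"^[a-z0-9-]+$")


def component_name(asset: str, service: str, region: str, az: str) -> str:
    """Build a component name from asset, service, Region, and AZ identifiers.

    Hypothetical convention: fields are lowercased and joined with underscores,
    so hyphenated values such as a Region code stay intact inside one field.
    """
    cleaned = [field.strip().lower() for field in (asset, service, region, az)]
    for value in cleaned:
        if not _ALLOWED.match(value):
            raise ValueError(f"invalid name field: {value!r}")
    return "_".join(cleaned)


if __name__ == "__main__":
    # Example values are illustrative only.
    print(component_name("channel7", "MediaLive", "us-east-1", "us-east-1a"))
```

Generating names through one function, and calling it from your deployment templates or scripts, keeps every component searchable by the same fields when you troubleshoot.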

SM_REL2: How does your streaming media workload adapt to viewer demand?
SM_RBP4 – Use a CDN and plan capacity with your providers
SM_RBP5 – Design your origin service to automatically scale to meet viewer demand

The relationship between client requests, delivery caching, and origin scaling is the most important area to examine when scaling your streaming media workload. Ingest and processing components are scaled out in advance or run in batch before content is made available to viewers, so viewer demand has no impact on these components.

A Content Delivery Network (CDN) is necessary to scale infrastructure for streaming media. CDNs provide multiple benefits by reducing the load on backend origination services, improving end-user performance, and lowering cost. By caching requests at the edge and within the CDN network, viewers are served directly from caches nearest them and requests to your origin server are dramatically reduced.

When choosing a CDN, consider its network infrastructure presence in the geographic areas where your viewers are located. While most CDNs offer coverage for viewers in the US and Europe, platforms with large numbers of viewers in Asia, Africa, or Latin America should pay particular attention to the points of presence and network capacity of their CDN in those regions, and should even consider a multi-CDN approach.

To achieve petabyte delivery scale, improve performance, or respond to intermittent delivery network issues, you might consider a multi-CDN delivery strategy. With multiple CDNs, you award or weight traffic to specific distribution networks based on current performance for a specific user or geographic region, with the goal of providing the best viewing experience. Before you take this approach, consider the following trade-offs when compared to a single CDN approach:

  • Increased Origin Load — With multiple CDNs, you will have more caches to populate with content. This will result in a lower Cache-Hit-Ratio and increase the load to origination services. Some of this load can be offset through an origin shield component.

  • Increased Cost — Many CDNs offer tiered pricing based on utilization. By using multiple CDNs, you might not have access to lower pricing tiers.

  • Operational Overhead — Deployment, testing, and the operation of multiple CDNs adds operational overhead.

  • Lack of Feature Parity — Implementation might be hindered by the lack of feature parity across CDNs. This situation could introduce new requirements for your infrastructure and even reduce performance.
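The origin load trade-off in the list above can be made concrete with simple arithmetic: requests that miss the CDN cache are the ones that reach the origin. The sketch below is an illustration under stated assumptions. The function names, the request rates, and the cache hit ratios are hypothetical, not measured values.

```python
from typing import Dict, Tuple


def origin_requests_per_second(viewer_rps: float, cache_hit_ratio: float) -> float:
    """Requests per second that miss the CDN cache and reach the origin."""
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    return viewer_rps * (1.0 - cache_hit_ratio)


def multi_cdn_origin_load(cdn_traffic: Dict[str, Tuple[float, float]]) -> float:
    """Sum origin load across CDNs.

    cdn_traffic maps a CDN label to (viewer requests per second, cache hit ratio).
    """
    return sum(
        origin_requests_per_second(rps, hit_ratio)
        for rps, hit_ratio in cdn_traffic.values()
    )


if __name__ == "__main__":
    # Illustrative numbers: one CDN serving all traffic at a 99% hit ratio,
    # versus two CDNs splitting the traffic, each with a lower 98% hit ratio.
    single = multi_cdn_origin_load({"cdn-a": (100_000, 0.99)})
    dual = multi_cdn_origin_load({"cdn-a": (50_000, 0.98), "cdn-b": (50_000, 0.98)})
    print(f"single CDN origin load: {single:.0f} requests/s")
    print(f"two CDN origin load:    {dual:.0f} requests/s")
```

With these assumed figures, a one-point drop in cache hit ratio roughly doubles origin traffic (about 1,000 versus about 2,000 requests per second), which is why small hit ratio changes matter at scale.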

One approach to multi-CDN uses DNS resolution to apply a weighted round-robin distribution across CDNs. This is common for organizations looking to distribute load to meet capacity requirements, to meet cost commitments, or to minimize the blast radius of CDN outages. You can accomplish this with Amazon Route 53 by defining multiple record sets that use a “Weighted” routing policy, each with the desired weight.
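As a rough illustration of the weighted record sets described above, the sketch below builds a change batch in the shape that the Route 53 `ChangeResourceRecordSets` API accepts. The helper name `weighted_cname_changes`, the host names, the weights, and the TTL are assumptions. The code only constructs the data structure and does not call AWS.

```python
from typing import Any, Dict


def weighted_cname_changes(
    record_name: str, cdn_weights: Dict[str, int], ttl: int = 60
) -> Dict[str, Any]:
    """Build one weighted CNAME record set per CDN host name.

    cdn_weights maps a CDN host name to its weight. Route 53 weights are
    integers from 0 to 255; a weight of 0 stops routing to that record.
    """
    changes = []
    for cdn_host, weight in cdn_weights.items():
        if not 0 <= weight <= 255:
            raise ValueError(f"weight for {cdn_host} must be between 0 and 255")
        changes.append(
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": record_name,
                    "Type": "CNAME",
                    # SetIdentifier distinguishes records that share a name and type.
                    "SetIdentifier": cdn_host,
                    "Weight": weight,
                    "TTL": ttl,
                    "ResourceRecords": [{"Value": cdn_host}],
                },
            }
        )
    return {"Comment": "Weighted multi-CDN distribution", "Changes": changes}


if __name__ == "__main__":
    # Hypothetical host names: send about 70% of resolutions to CDN A, 30% to CDN B.
    batch = weighted_cname_changes(
        "video.example.com.",
        {"cdn-a.example.net": 70, "cdn-b.example.net": 30},
    )
    for change in batch["Changes"]:
        record = change["ResourceRecordSet"]
        print(record["SetIdentifier"], record["Weight"])
    # With boto3, a batch like this could be applied with:
    #   boto3.client("route53").change_resource_record_sets(
    #       HostedZoneId="<your hosted zone ID>", ChangeBatch=batch)
```

Each record receives traffic in proportion to its weight divided by the sum of all weights for that name, so adjusting the split is a matter of changing the integers and reapplying the batch.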

A DNS-based configuration is easy to implement and can be changed periodically to reflect network conditions, but DNS changes don’t propagate quickly and some systems do not honor TTL values. For live events, where client playback buffers are small and network degradation can cause immediate impact, we recommend a dynamic HTTP-based approach that specifies a CDN during initialization of a playback session.

Adaptive Bitrate (ABR) protocols function by serving a manifest with pointers to media segments representing the elementary streams of audio and video data. With HTTP-based evaluation and routing, client content requests are served playback manifests that reference objects hosted by one or more CDNs. By serving customized manifests, you can direct specific devices, geographies, or ISPs to a prescribed CDN. You can maintain CDN redundancy by providing alternative content URIs within the manifest, weighting them, and implementing failover logic within the player.
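The manifest-based approach can be sketched for an HLS-style media playlist, where lines starting with `#` are tags and other non-blank lines are segment URIs. This is a simplified illustration rather than a production manifest manipulator: the function names, CDN base URLs, weights, and sample playlist are assumptions, and URIs embedded in tag attributes (for example encryption keys) are not handled.

```python
import random
from typing import Dict, Optional


def choose_cdn(weights: Dict[str, float], rng: Optional[random.Random] = None) -> str:
    """Pick one CDN base URL at session start, in proportion to its weight."""
    rng = rng or random.Random()
    hosts = list(weights)
    return rng.choices(hosts, weights=[weights[h] for h in hosts], k=1)[0]


def rewrite_manifest(manifest: str, cdn_base_url: str) -> str:
    """Prefix relative segment URIs with the chosen CDN base URL."""
    base = cdn_base_url.rstrip("/")
    rewritten = []
    for line in manifest.splitlines():
        stripped = line.strip()
        is_uri = bool(stripped) and not stripped.startswith("#")
        if is_uri and "://" not in stripped:
            # Relative segment URI: point it at the selected CDN.
            rewritten.append(f"{base}/{stripped}")
        else:
            # Tags, blank lines, and already-absolute URIs pass through unchanged.
            rewritten.append(line)
    return "\n".join(rewritten) + "\n"


SAMPLE_PLAYLIST = """#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:6
#EXTINF:6.0,
live/seg_001.ts
#EXTINF:6.0,
live/seg_002.ts
"""

if __name__ == "__main__":
    # Hypothetical CDN base URLs and weights.
    cdn = choose_cdn({"https://cdn-a.example.net": 0.8, "https://cdn-b.example.net": 0.2})
    print(rewrite_manifest(SAMPLE_PLAYLIST, cdn))
```

Because the CDN is chosen when the manifest is served, the weights can come from live performance data per device, geography, or ISP, and a change takes effect on the next manifest request instead of waiting for DNS caches to expire.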

Regardless of approach, using multiple CDNs will increase load on origin services because each CDN will have its own caching network and requests to satisfy. To minimize the number of requests hitting origin services directly, you should optimize the content TTL, enable each layer of caching available, and use an origin shield service that can collapse requests from multiple CDNs into a single request to your origin layer.
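To illustrate how a shield layer reduces origin requests, the following in-memory sketch caches each object for a TTL so that repeated requests from several CDNs result in one origin fetch. It is a toy model under stated assumptions: the class name `OriginShield`, the fetch callback, the TTL value, and the injectable clock are hypothetical, and it does not coalesce concurrent in-flight requests the way a managed origin shield service typically would.

```python
import time
from typing import Callable, Dict, Tuple


class OriginShield:
    """Toy shield cache placed between several CDNs and one origin."""

    def __init__(
        self,
        fetch_from_origin: Callable[[str], bytes],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: Dict[str, Tuple[float, bytes]] = {}
        self.origin_requests = 0  # how many requests actually reached the origin

    def get(self, path: str) -> bytes:
        now = self._clock()
        cached = self._cache.get(path)
        if cached is not None and cached[0] > now:
            return cached[1]  # still fresh: served from the shield
        body = self._fetch(path)
        self.origin_requests += 1
        self._cache[path] = (now + self._ttl, body)
        return body


if __name__ == "__main__":
    shield = OriginShield(lambda path: f"bytes of {path}".encode(), ttl_seconds=6.0)
    # Three CDNs (hypothetical labels) each request the same segment on a cache miss.
    for cdn in ("cdn-a", "cdn-b", "cdn-c"):
        shield.get("/live/seg_001.ts")
    print("origin requests:", shield.origin_requests)
```

The TTL passed to the shield plays the same role as the content TTL discussed above: a longer TTL means fewer origin fetches, at the cost of serving an object for longer before it is refreshed.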

As client requests come through the CDN, the origin layer must elastically respond to meet viewership demand. Implement your origin service within Auto Scaling groups, across multiple Availability Zones, and behind Application Load Balancers to ensure high availability. To determine the additional demand back to the origin, refer to the Performance Pillar in this paper to estimate load and inform scaling needs. 

For best practices in change management, refer to the Reliability Pillar whitepaper.