There is an ongoing paradigm shift in how organizations address security as they move their workloads from traditional data centers to AWS.
Traditional IT security has relied heavily on a strong network segmentation approach as the main security strategy to secure and protect an organization’s resources.
Typically, the security governance structure of an organization has been siloed into distinct teams, each responsible for a different aspect of security: a firewall team, a router/switching team, a server team and the application team. Three of those four teams exist because different physical devices (servers, firewalls, switches, etc.) are managed by different groups, while the application team assumes everything has been properly configured to meet security requirements.
This has led to unnecessary manual processes to ensure adequate coordination between the teams during deployments and change requests, all of which lead to long delays and are prone to human error.
As organizations move workloads from traditional data centers to AWS, the organization’s IT and security professionals have had to completely change the way they address security. AWS has virtual constructs that are analogous to the traditional firewalls, routers, switches, servers, etc.; however, they are not separate physical devices. They are all constructs created, configured and terminated through RESTful APIs.
This potentially allows one team to control the firewall rules, the routing, the server configuration and the mode through which applications can be deployed via the AWS APIs. And since everything is controlled via the APIs, automation can be used to remove manual processes, thus speeding all deployments and reducing human error.
Traditional Network Security and Segregation
It’s been common to illustrate traditional network security as a strong castle, with large, secure walls, and possibly a large moat full of water, all to keep the resources inside (application, data, etc.) safe from the outside world (the Internet). The only way to enter or leave this protected environment is through the front gates, which are closely guarded to prevent unwanted people from entering and protected resources from leaving it.
This approach relies heavily on a strong perimeter, with tight controls at the entrance gates to successfully protect internal resources. The inside of the castle is further segmented into isolated areas, with additional internal walls and moats, to further segment and protect the important resources.
This analogy is not far from how traditional network security has been implemented for the past 20-plus years. There are redundant external firewalls (the castle’s external walls and moat) which only allow defined traffic to ingress and egress the environment from the untrusted networks (the Internet) via specific firewall rules that map to business requirements. The environments are further segregated by deploying internal firewalls, or by relying on ACLs and internal routers to further protect sensitive resources (e.g. application servers, database servers, etc.).
The logical network diagram below illustrates these approaches. It is important to note that the approach relies on physical appliances and devices to obtain the required level of isolation and segregation.
AWS Network Security and Segregation
The rest of this blog post will focus on the AWS security constructs that enable organizations to create the necessary security segregations. These constructs are not defined manually by manipulating physical devices that are controlled by different silos, but via AWS APIs. This allows organizations to collapse silos based on device ownership and facilitates needed automation to remove manual processes, thus speeding all deployments and reducing human error.
Software Defined Networking
Before diving into AWS network security, it’s important to talk about Software Defined Networking (SDN). In simple terms, SDN is an architectural approach that manages networks through an abstraction that decouples the system that makes decisions about where traffic is sent (the control plane) from the underlying systems that forward traffic to the selected destination (the data plane). This decoupling enables the network control to become directly programmable and the underlying infrastructure to be abstracted for applications and network services.
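To make the control plane and data plane split concrete, here is a minimal sketch in Python. It is a toy model, not how AWS implements its network: the class names, the route targets ("local", "internet-gateway") and the CIDR ranges are all assumptions chosen for illustration.

```python
import ipaddress


class ControlPlane:
    """Decides where traffic should go; holds the desired routes."""

    def __init__(self):
        self.routes = {}  # network -> next hop

    def set_route(self, cidr, next_hop):
        self.routes[ipaddress.ip_network(cidr)] = next_hop

    def publish(self):
        # Hand the data plane a snapshot, most specific prefix first.
        return sorted(self.routes.items(),
                      key=lambda item: item[0].prefixlen,
                      reverse=True)


class DataPlane:
    """Forwards packets using whatever table it was last given."""

    def __init__(self):
        self.table = []

    def install(self, table):
        self.table = list(table)

    def forward(self, destination):
        address = ipaddress.ip_address(destination)
        for network, next_hop in self.table:
            if address in network:
                return next_hop
        return "drop"  # no route matched


# Hypothetical routes, for illustration only.
control = ControlPlane()
control.set_route("10.0.0.0/16", "local")
control.set_route("0.0.0.0/0", "internet-gateway")

data = DataPlane()
data.install(control.publish())
```

The data plane never decides anything on its own; changing network behavior means calling the control plane and publishing a new table, which is what makes the network programmable.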
Having separate VPCs in the same account does not create a separate top-level administrative control boundary for each VPC; thus, the security benefit is reduced (the same account controls both VPCs and their resources).
There are operational reasons to have separate VPCs in the same AWS account. For example, application workloads can run in a “workload” VPC that is peered to a “shared resources” VPC that supports them (e.g. Active Directory).
After creating a VPC, the next logical construct is the subnet. Subnets in AWS are sub-networks within a VPC and are analogous to the subnets that are defined in traditional VLANs via 802.1Q tagging. One can add one or more subnets in each availability zone (AZ); however, each subnet must reside exclusively within one AZ and cannot span AZs. Three types of subnets are commonly used in AWS:
1. Public – external subnets whose servers have public IP addresses and can be reached from the Internet. They are analogous to traditional DMZ networks.
2. Private – internal subnets whose servers have only private IP addresses and are not accessible from the Internet. They are able to access the Internet via NAT.
3. Protected – internal subnets whose resources have only private IP addresses and are not accessible from the Internet. They are NOT able to access the Internet.
It is important to ensure each type of subnet is created in multiple AZs to achieve resiliency. Thus, if an environment has both public-facing servers and internal servers, at least two public subnets (in separate AZs) and two private subnets (in separate AZs) need to be created to support them in a high availability configuration.
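The subnet layout described above can be sketched with Python's standard `ipaddress` module. This is only a planning aid under stated assumptions: the VPC range `10.0.0.0/16`, the /24 subnet size and the AZ names are example values, and `plan_subnets` is a hypothetical helper rather than an AWS API.

```python
import ipaddress


def plan_subnets(vpc_cidr, tiers, zones, new_prefix=24):
    """Assign one subnet per (tier, AZ) pair from consecutive blocks of the VPC range."""
    blocks = ipaddress.ip_network(vpc_cidr).subnets(new_prefix=new_prefix)
    plan = {}
    for tier in tiers:
        for zone in zones:
            # Each subnet lives in exactly one AZ, so every tier needs one per AZ.
            plan[(tier, zone)] = str(next(blocks))
    return plan


# Example values only; substitute your own VPC range and AZs.
plan = plan_subnets("10.0.0.0/16",
                    ["public", "private"],
                    ["us-east-1a", "us-east-1b"])

for (tier, zone), cidr in plan.items():
    print(f"{tier:8} {zone}  {cidr}")
```

With two tiers and two AZs the plan contains four subnets, matching the minimum described above for a highly available public and private layout.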
Everything up to this point has focused on laying out the necessary network components in a secure and highly available manner, and so far the AWS components have corresponded very closely to the traditional network approaches. Our discussion of security groups will deviate from this correlation.
As previously discussed, an in-line appliance has been used for all network-based firewall capabilities in traditional network topologies. AWS security groups are not confined to a single security appliance; they span the entire VPC and are virtual constructs that tie into each hypervisor and network component of the AWS physical infrastructure layer.
A security group is a virtual stateful firewall for servers (EC2 instances) that enables the control of inbound and outbound traffic. When a security group is created to “protect” a group of servers in a VPC, the control plane creates the necessary configurations to ensure that the security group’s policy is enforced on all of the instances, regardless of the AZ, subnet or hypervisor in which they reside.
It is important to note that security groups deny all inbound traffic by default, so explicit rules are required to permit any inbound traffic. Outbound traffic is open by default, but it too can be restricted.
Given that security groups create “containers for servers” with specific security profiles, security groups can be used as an isolation mechanism. Thus, instead of creating separate subnets for web servers, application servers and database servers, the security group can be leveraged to create the necessary isolation between “tiers”, and drastically simplify the required infrastructure. The diagram below depicts the isolation achieved by security groups.
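As a rough illustration of tier isolation with security groups, the sketch below models groups whose inbound rules name either a CIDR range or another group as the allowed source. It is a simplified in-memory model, not the AWS API: the `SecurityGroup` class, the `inbound_allowed` helper and the ports 8080 and 5432 are assumptions made for the example.

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass
class SecurityGroup:
    name: str
    # Each inbound rule is (port, source); source is a CIDR string
    # or another SecurityGroup.
    inbound: list = field(default_factory=list)

    def allow(self, port, source):
        self.inbound.append((port, source))


def inbound_allowed(target, port, source_group=None, source_ip=None):
    """Return True if any inbound rule on `target` matches; otherwise deny."""
    for rule_port, source in target.inbound:
        if rule_port != port:
            continue
        if isinstance(source, SecurityGroup):
            if source is source_group:
                return True
        elif (source_ip is not None and
              ipaddress.ip_address(source_ip) in ipaddress.ip_network(source)):
            return True
    return False


web = SecurityGroup("web")
app = SecurityGroup("app")
db = SecurityGroup("db")

web.allow(443, "0.0.0.0/0")  # anyone may reach the web tier over HTTPS
app.allow(8080, web)         # only members of the web group may reach the app tier
db.allow(5432, app)          # only members of the app group may reach the database tier
```

In this model the web tier cannot reach the database directly (`inbound_allowed(db, 5432, source_group=web)` is false) even if all three tiers share a subnet, which is the isolation the paragraph above describes.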
Network Access Control Lists (NACLs)
AWS also offers network ACLs with rules similar to your security groups. NACLs act as a firewall for associated subnets, controlling both inbound and outbound traffic at the subnet level. The following summarizes the basic differences between security groups and network ACLs:
Security group:
- Operates at the instance level (first layer of defense)
- Supports allow rules only
- Is stateful: return traffic is automatically allowed, regardless of any rules
- All rules are evaluated before deciding whether to allow traffic
- Applies to an instance only if someone specifies the security group when launching the instance, or associates the security group with the instance later on
Network ACL:
- Operates at the subnet level
- Supports allow rules and deny rules
- Is stateless: return traffic must be explicitly allowed by rules
- Rules are processed in number order when deciding whether to allow traffic
- Automatically applies to all instances in the subnets it’s associated with (backup layer of defense, so you don’t have to rely on someone specifying the security group)
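The difference in rule evaluation listed above can be sketched as two small functions. This is a simplified model that matches on destination port only; the rule tuples and function names are assumptions for illustration, not AWS data structures.

```python
def security_group_allows(allow_rules, port):
    """Security group: allow rules only; traffic passes if ANY rule matches."""
    return any(low <= port <= high for low, high in allow_rules)


def nacl_allows(numbered_rules, port):
    """Network ACL: rules are checked in ascending number order; first match decides."""
    for _number, low, high, action in sorted(numbered_rules):
        if low <= port <= high:
            return action == "allow"
    return False  # nothing matched: fall through to the implicit deny


# Hypothetical rule sets.
sg_rules = [(443, 443), (22, 22)]
nacl_rules = [
    (200, 0, 65535, "allow"),  # broad allow, but evaluated second
    (100, 22, 22, "deny"),     # lower number, so it wins for port 22
]
```

Here `nacl_allows(nacl_rules, 22)` is false because rule 100 is evaluated before rule 200, while a security group has no ordering and no deny rules to reason about.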
As a general best practice, Rackspace advises customers to use security groups as their primary method of segmenting and securing workloads within AWS. While NACLs are typically more familiar to networking engineers, they often introduce complexity into AWS architectures.
Security groups provide more granular control, are stateful (and therefore more intelligent in allowing appropriate traffic) and apply at the instance level. When NACLs are used alongside security groups, all traffic must also be considered in a stateless context (specifying inbound and outbound ports, including any ephemeral ports used by a given application), and these rules are applied at the subnet level. As a result, the “blast radius”, or potential for impact when a NACL is incorrect or changed, is significantly higher, without providing any tangible benefit over the use of a security group.
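To show why stateless rules require thinking about ephemeral ports, here is a minimal sketch comparing a stateless and a stateful check of a single request and its reply. The ephemeral range 1024-65535 is an assumed, deliberately broad range (the real range depends on the client operating system), and the function names are hypothetical.

```python
EPHEMERAL = (1024, 65535)  # assumed range; the real range depends on the client OS


def port_allowed(rules, port):
    return any(low <= port <= high for low, high in rules)


def stateless_round_trip(inbound, outbound, server_port, client_port):
    """NACL-style: the request and the reply are each checked against their own rules."""
    return port_allowed(inbound, server_port) and port_allowed(outbound, client_port)


def stateful_round_trip(inbound, server_port):
    """Security-group-style: once the request is admitted, the reply is allowed."""
    return port_allowed(inbound, server_port)


inbound_rules = [(443, 443)]  # HTTPS to the server
```

With only the inbound rule in place, `stateful_round_trip(inbound_rules, 443)` succeeds, but `stateless_round_trip(inbound_rules, [], 443, 50000)` fails until an outbound rule covering the ephemeral range is added, for example `[EPHEMERAL]`.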
Rackspace and AWS recommend avoiding NACLs due to potential conflicts with security groups and performance degradation.
AWS has created a rich set of services, all of which are controlled via the control plane, which allows organizations to create secure and segmented networks that align with the needed security posture. Given that all the network constructs are accessible via the AWS APIs, the need for siloed teams is no longer relevant.
Previous governance policies and processes that relied on silos must change to meet the demand for a cohesive, multi-discipline team that can do all the work. Because everything is accomplished via APIs, automation and auditing enable fast provisioning while adhering to the new governance policies and processes.
The following AWS network constructs allow for the needed segmentation:
Security groups should be leveraged as the principal method for isolating and compartmentalizing workloads. Avoid creating separate subnets or separate VPCs (with peering) solely to isolate workloads, as was traditionally done.
Subnets should be used as a segmentation approach for workloads that have different routing requirements (e.g. Public vs. Private vs. Protected Subnets).
VPCs should be used as a segmentation approach (assuming VPC-Peering) when workloads that are controlled by separate business units in separate AWS accounts are dependent on each other and need the network connectivity without traversing the public network.
Additionally, there are operational reasons to have separate VPCs with peering in the same AWS account (e.g. application workloads in one VPC peered to another VPC to access centralized resources).