Cloud Architecture

As an extension of the CMS Common Enterprise Infrastructure environment, the cloud infrastructure must conform to the same guidelines and policies as the other CMS operating environments. Cloud architectures at CMS are built on:

  • Cloud infrastructure components
  • Cloud multi-zone architecture
  • Cloud-suitable applications

The infrastructure components serve as the architecture’s “building blocks.” The cloud multi-zone architecture outlines the “city plan” for constructing with those building blocks. As in a city, shared services are defined for clouds to provide capabilities above and beyond those of traditional data centers. Application software can leverage the cloud architecture to provide business services that meet cloud computing expectations of scalability, elasticity, and self-service. The following subtopics address these architectural aspects.

Cloud Infrastructure Components

Some Cloud Service Providers (CSPs) offer all three cloud service models: Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS). The delineation between IaaS and PaaS is often blurred, depending on the CSP’s service offering as described in its cloud service catalog. More common are CSPs that coordinate with third-party providers or system integrators to supply and operate SaaS and/or PaaS independent of the IaaS offerings.

For purposes of discussion within this chapter and as defined by NIST, IaaS represents the core computing resources of CPU, memory, and virtual machine hypervisor instances; storage capacity; LAN throughput and Internet connectivity; and the software for managing these resources. Additional IaaS services often include backup storage for off-line / off-site storage, infrastructure and hypervisor security services, additional logging and reporting services, load balancing services, relational database and unstructured database storage services, and cloud administrator account management services to allow monitoring and managing of IaaS services. Other services such as database administration or web hosting typically fall under the PaaS category.

Role of Virtual Machines in Cloud

The core infrastructure component of the IaaS cloud is a virtual machine (VM). A VM is a set of resources, provided by one or more physical devices, configured to provide x compute cores, y memory, z disk storage, and t network interfaces. The available values of x, y, z, and t vary by provider. The VM can run any general-purpose operating system (OS) certified by the IaaS provider. In addition, the VM may also support special-purpose components such as routers, firewalls, Graphics Processing Units (GPUs), and Extensible Markup Language (XML) appliances, which use a manufacturer-hardened OS, for virtual security appliance roles.

Not all flavors of OS are supported by the specific hypervisor(s) in use at a CSP. Most CSPs offer a catalog of the OS images their infrastructure supports. In addition, CMS may provide additional security-hardened images as needed, especially if virtual security appliances are necessary.
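The relationship between a VM's (x, y, z, t) sizing, the CSP's catalog of supported OS images, and any additional CMS-hardened images can be sketched as a simple request check. This is a minimal illustration only: the size names, image names, and the `validate_request` function are hypothetical and do not correspond to any particular CSP's catalog or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VmSize:
    """One catalog entry holding the (x, y, z, t) values described in the text."""
    cores: int       # x compute cores
    memory_gb: int   # y memory
    disk_gb: int     # z disk storage
    nics: int        # t network interfaces


# Hypothetical catalog contents; real values vary by provider.
SIZES = {
    "small": VmSize(cores=2, memory_gb=4, disk_gb=50, nics=1),
    "large": VmSize(cores=8, memory_gb=32, disk_gb=200, nics=2),
}
CSP_IMAGES = {"linux-base", "windows-base"}     # images the CSP's hypervisor supports
CMS_HARDENED_IMAGES = {"cms-hardened-linux"}    # additional images supplied by CMS


def validate_request(size_name: str, image: str) -> VmSize:
    """Return the requested size, or raise if the size or OS image is not offered."""
    if size_name not in SIZES:
        raise ValueError(f"unknown VM size: {size_name!r}")
    if image not in CSP_IMAGES | CMS_HARDENED_IMAGES:
        raise ValueError(f"OS image not in catalog: {image!r}")
    return SIZES[size_name]
```

A request such as `validate_request("large", "cms-hardened-linux")` succeeds under these assumed catalogs, while a request for an image outside both catalogs is rejected before any provisioning is attempted.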

Whether a platform is IaaS or PaaS, the VM is the core technology that facilitates on-demand provisioning. The primary difference from the cloud user’s standpoint is that IaaS users provision and manage VMs directly, whereas such details are typically hidden from PaaS users.

Types of Virtual Machines

IaaS clouds use two kinds of VMs: application and infrastructure. This intentionally mimics the physical machine configurations typically found in data centers.

Application VMs operate application software such as:

  • Database servers (both relational and NoSQL)
  • Web servers
  • Java 2 Platform Enterprise Edition (J2EE) application servers
  • Commercial Off-the-Shelf (COTS) applications

Infrastructure VMs provide virtual equivalents of physical data center infrastructure components, including:

  • Virtual firewalls and switches
  • Virtual caching proxy servers
  • Virtual load balancers
  • Virtual network intrusion detection system devices
  • Virtual XML acceleration processors
  • Virtual machine management servers
  • Virtual Lightweight Directory Access Protocol (LDAP) servers
  • Virtual Domain Name Services (DNS) servers

The IaaS cloud configuration combines both kinds of VMs into a multi-zone architecture that meets the security and privacy requirements mandated for CMS systems.

In some situations, the IaaS cloud vendor may offer physical instances of some of the devices listed. There are valid reasons to choose physical devices over virtual devices if the choice is available. For example, using a physical firewall rather than a virtual appliance may improve performance or reduce exposure to security vulnerabilities. Because the cloud infrastructure market changes rapidly, it is important to assess these options against CMS security and business requirements.

In other instances, the IaaS cloud vendor may offer these capabilities as services, without specifying virtual machines. For example, Amazon Web Services Security Groups and Elastic Load Balancers are not defined as virtual machines.

Services covered by a CMS ATO (above and beyond FedRAMP) may be used in lieu of virtual machines in the CMS Multi-Zone Architecture. Satisfying the CMS ARS is a higher bar than the FedRAMP controls. In other words, it is not sufficient to satisfy FedRAMP to meet CMS’s requirements.

Hosting CMS Government-Furnished Equipment at a CSP

Depending on the CSP, it may be possible to place CMS GFE into a cloud environment. Typically, the CSP would cage off this equipment and use it only for very specific purposes. For cost reasons, this is generally prohibited, as reinforced by Business Rule CI-4.

There are, however, several legitimate considerations for hosting GFE in a cloud environment. The following considerations are the most common:

  • Non-virtualized hardware may be more suitable in some cases than virtual hardware. This is true for applications that may leverage specific hardware capabilities or that are extremely performance intensive.
  • Some applications may not be suitable for a virtualized environment.
  • Specialized, non-general-purpose hardware may be required by CMS in the cloud environment.
  • Some data must be stored on GFE rather than third-party storage hardware.
  • Greater security may be realized with specialized appliances than general-purpose VMs.

Use of GFE constrains available cloud capabilities, such as bursting, elasticity, and managed services. By hosting GFE in the cloud, CMS loses many of the benefits of cloud computing.

Cloud Multi-Zone Architecture

The CMS TRA defines multiple zones in the architecture, including the Presentation Zone, Application Zone, Data Zone, Transport Zone, and Management Zone. In a traditional hosted managed services environment, these zones would be physically separated, with redundant paths and multiple (layered) security devices along the paths.

The cloud environment must be adapted to fit the CMS Multi-Zone Architecture. One benefit of cloud architectures is their standardized structure, which employs common equipment, network interconnects, and management servers. Each change introduced to that standardized structure results in higher cost and management overhead for CMS. The tradeoff is the need to provide sufficient detection and preventive security controls to meet CMS requirements.

In an IaaS / PaaS cloud environment, the devices that comprise the various zones (including servers and storage) are usually virtualized instances. The network itself is often a combination of services as well as virtualized and physical devices making extensive use of VLANs.

There are, of course, certain risks to be considered in cloud environments, including:

  • High-performance applications may experience performance degradation due to virtualization overhead.
  • Maintaining compliance with licensing contracts can be challenging due to relative ease of provisioning, elasticity, and bursting. These features may impact licensing costs, and not all software licenses are cloud compatible.
  • Not all software is equally suitable for deployment in a cloud environment.
  • For PaaS, the platform is typically upgraded for all customers simultaneously with no option to stay on back-level software.
  • The hypervisor introduces some additional security risk because it underlies the cloud’s private infrastructure; a compromised hypervisor exposes every VM it hosts.
  • Ease of creation of virtual machines may lead to VM sprawl. Although VMs are efficient users of hardware, a VM requires the same amount of technical labor as a physical machine for services above the IaaS layer. Thus, costs can and will grow with elasticity.
  • For PaaS, the platform is typically proprietary. Consequently, applications written for this platform will not be portable to other platforms (PaaS or otherwise). There is a very real danger of vendor lock-in when dealing with PaaS.
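The licensing risk listed above can be made concrete with a small check. This is a sketch under stated assumptions: it supposes a per-instance license model and an autoscaling ceiling, neither of which is specified in this chapter, and both function names are hypothetical.

```python
def max_unlicensed_instances(licensed: int, autoscale_max: int) -> int:
    """Instances that could run unlicensed if the group bursts to its ceiling.

    Assumes one license is consumed per running instance (an assumption;
    real license terms may count cores, sockets, or users instead).
    """
    return max(0, autoscale_max - licensed)


def license_status(running: int, licensed: int) -> str:
    """Classify the current deployment against the number of licenses held."""
    if running > licensed:
        return "over-deployed"
    if running == licensed:
        return "at-limit"
    return "ok"
```

For example, with 10 licenses and an autoscaling ceiling of 16 instances, a burst could leave up to 6 instances unlicensed, which is the kind of exposure that elasticity introduces without any deliberate procurement decision.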

Resource Aggregation – Resource Clusters and Resource Pools

Cloud computing involves two levels of resource aggregation: resource clusters and resource pools. A resource cluster is a group of connected servers that work together so that their aggregate resources, such as CPU processing and memory, can be treated as though they belonged to a single computer. For example, a resource cluster might have 30 CPUs and 120 GB of RAM. In a community cloud, a CSP builds resource clusters and assigns them to the community; each resource cluster is available only to members of that community.

A resource pool is a set of available CPU and memory resources, and generally consists of multiple “like” resources, i.e., of the same type. In a community cloud, once the resource cluster(s) are created, resource pools are defined and allocated to members of the community. Resource pools may be organized in a hierarchy in which resource pools are subdivided into subordinate resource pools. For example, a resource pool could be defined as “Administrative” and have sub-pools of “HR” and “Payroll.” At CMS, the rule is to use one resource pool per business application.
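The cluster and pool hierarchy described above can be sketched as a small tree in which each pool carves its allocation out of its parent. The example reuses the 30-CPU, 120 GB cluster and the “Administrative,” “HR,” and “Payroll” names from the text; the individual pool sizes, the `ResourcePool` class, and the rule that child pools may not exceed the parent's capacity are assumptions made for illustration (some platforms permit overcommitment).

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ResourcePool:
    """A pool of CPU and memory; the top-level pool stands in for a resource cluster."""
    name: str
    cpus: int
    memory_gb: int
    children: List["ResourcePool"] = field(default_factory=list)

    def allocated(self) -> Tuple[int, int]:
        """Total (CPUs, memory in GB) already handed to direct child pools."""
        return (
            sum(c.cpus for c in self.children),
            sum(c.memory_gb for c in self.children),
        )

    def add_child(self, child: "ResourcePool") -> "ResourcePool":
        """Attach a sub-pool, refusing it if it would exceed this pool's capacity."""
        used_cpus, used_mem = self.allocated()
        if used_cpus + child.cpus > self.cpus or used_mem + child.memory_gb > self.memory_gb:
            raise ValueError(f"{child.name!r} does not fit in {self.name!r}")
        self.children.append(child)
        return child


# Cluster sized as in the text; pool sizes below are hypothetical.
cluster = ResourcePool("production-cluster", cpus=30, memory_gb=120)
admin = cluster.add_child(ResourcePool("Administrative", cpus=12, memory_gb=48))
admin.add_child(ResourcePool("HR", cpus=4, memory_gb=16))
admin.add_child(ResourcePool("Payroll", cpus=8, memory_gb=32))
```

In this sketch the “Administrative” pool is fully subdivided, so any further sub-pool request against it is rejected, while the cluster itself still has capacity for pools belonging to other business applications.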

CMS Reference Architecture

The figure Virtualized Multi-Zone Cloud Infrastructure depicts a conceptual CSP cloud implementation of the CMS reference architecture and illustrates several key concepts for deploying systems on a cloud infrastructure. To provide a defense-in-depth posture, the firewall and network devices managing Internet communications must be implemented as physical devices. If they were instead implemented as VMs on the same shared host as the systems they protect, a single exploitation of the hypervisor would compromise the entire infrastructure.

The enclosing boxes in the figure represent three different physical sets of resources, presented clockwise: the CMS-provided Equipment Cage (at 11 o’clock), the Production Zone resource cluster (at 1 o’clock), and the Management Zone resource cluster (at 8 o’clock). The single physical firewall at the center of the diagram and the virtual switches are generally part of the CSP IaaS. The addition of virtual firewalls between processing zones meets CMS security requirements regardless of whether CMS manages only the firewall rules or the entire firewall instance.
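As a rough illustration of what firewalls between processing zones enforce, the sketch below treats the Presentation, Application, and Data Zones as a chain and permits a connection only from one zone to the next zone inward. This adjacency rule is an assumption made for the example, not a statement of the actual CMS firewall policy; the Management and Transport Zones are omitted, and `is_flow_allowed` is a hypothetical helper.

```python
# Ordered from the outside in; a firewall sits between each pair of neighbors.
ZONE_CHAIN = ["internet", "presentation", "application", "data"]


def is_flow_allowed(src: str, dst: str) -> bool:
    """Allow a connection only from a zone to the next zone inward.

    Raises ValueError if either zone name is not in ZONE_CHAIN.
    """
    src_index = ZONE_CHAIN.index(src)
    dst_index = ZONE_CHAIN.index(dst)
    return dst_index - src_index == 1
```

Under this assumed rule, a Presentation Zone server may open a connection to the Application Zone, but neither the Internet nor the Presentation Zone can reach the Data Zone directly.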

For technologies that require shared responsibilities, CMS and the CSP must document the demarcations (e.g., firewall configurations managed by the CSP and firewall rules managed by CMS). Separate resource clusters must be defined for production, non-production, and Management Zones. Resource pools within a resource cluster are defined for each business owner or application.

Figure: Virtualized Multi-Zone Cloud Infrastructure

The figure depicts a virtualized multi-zone cloud architecture. The Internet is connected through a physical firewall to three zones. One zone contains CMS-provided equipment. The next is the Management Zone, which is used to host management and security tools (as described in the Technical Reference Architecture foundation document). The final zone is the Production Zone, which consists of virtual machines. One set of virtual machines forms the Presentation Zone and is separated from the Application and Data Zones by a virtual firewall. The next set of virtual machines forms the Application Zone, which sits between that virtual firewall and a second virtual firewall. Finally, the Data Zone consists of a set of virtual machines behind the second virtual firewall.

The cloud environment offers additional architectural flexibility. Some CSPs provide a service through which customers can host agency-provided equipment at the CSP in a caged environment. CMS may choose to implement COTS or Government-Off-The-Shelf (GOTS) products, additional security monitoring equipment, or specialized equipment that handles CMS’s most sensitive data, e.g., Protected Health Information or Federal Tax Information (FTI). Business owners should carefully consider costs when using the CSP’s customer hosting services. Either CMS or the CSP may be responsible for monitoring and managing this equipment.

Note: The diagram Virtualized Multi-Zone Cloud Infrastructure greatly simplifies the details of the Internet connection, especially the Trusted Internet Connection (TIC) required by FedRAMP. The CMS ARS provides the details for wide area networking.

Cloud Services Architecture

A cloud can be represented functionally as the combination of many managed services. The following services are needed in a CMS cloud. It is unlikely that any one commercial cloud offering would contain all such services; however, teams deploying applications to the cloud, particularly those with a DevOps focus, need to ensure that these capabilities are available.

Here are cloud services organized from the perspective of the primary user:

Customer View
  • Self-Service portal
  • Cloud Service Provider Help Desk services
System Programmer View
  • CSP Application Programming Interface (to communicate with the CSP services and cloud management infrastructure)
Systems View
  • Virtualized machine resources
  • Virtualized storage resources
  • VM configuration management (different from Software Configuration Management)
  • Resource and Capacity Management services
  • Application Performance Management services
  • Storage Management services
  • Virtual Machine Image library
  • License-tracking services
  • Run-book automation services
Security View
  • Security services (including firewalls, audit logging, security monitoring, and other such services)
  • Remote Log-in services
Network Operations View
  • Backup and Archival services
  • Virtual Network services
    • VLAN Management services
    • Internet Protocol Address Management services
    • Dynamic Host Configuration Protocol (DHCP) / DNS services
    • Virtual Firewall services
    • Authenticated Time Servers
  • Application Performance Monitoring services
Cloud Service Provider View
  • Billing and Payment services
  • Hypervisor infrastructure

Data centers commonly use many of these tools today. The highly automated nature of cloud environments now makes their use mandatory.
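Because no single commercial offering is likely to supply every service listed above, a simple gap check against the required services can help when evaluating a CSP. The sketch below covers only a few of the views for brevity; the `REQUIRED_SERVICES` subset, the sample offering, and the `missing_services` function are hypothetical and are not drawn from any real CSP catalog.

```python
from typing import Dict, List, Set

# A partial, illustrative subset of the services listed above, grouped by view.
REQUIRED_SERVICES: Dict[str, Set[str]] = {
    "Customer": {"self-service portal", "help desk"},
    "System Programmer": {"csp api"},
    "Systems": {"vm image library", "license tracking", "run-book automation"},
    "Security": {"security services", "remote log-in"},
}


def missing_services(required: Dict[str, Set[str]], offered: Set[str]) -> Dict[str, List[str]]:
    """Return, per view, the required services that the offering does not provide."""
    gaps: Dict[str, List[str]] = {}
    for view, services in required.items():
        lacking = sorted(services - offered)
        if lacking:
            gaps[view] = lacking
    return gaps


# Hypothetical offering used only to exercise the check.
sample_offering = {
    "self-service portal", "help desk", "csp api",
    "vm image library", "security services", "remote log-in",
}
gaps = missing_services(REQUIRED_SERVICES, sample_offering)
```

With the sample offering shown, the only gaps fall under the Systems view (license tracking and run-book automation), which are the capabilities the application team would then need to source elsewhere.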