Business Intelligence Environment
CMS has successfully implemented the Agency’s enterprise-wide BI Environment that provides a consolidated, secure gateway to the wealth of CMS data where users can leverage a suite of standard COTS BI tools. These BI software tools are made available for use by CMS offices and centers, analysts, developers, project managers, external researchers, law enforcement, partners, government contractors, and the public.
While CMS supports all of these tools, CMS prefers that project teams consider these factors:
- Use case
- Cost at scale
- Ability to reuse developed capabilities for the given data source(s)
These tools have been integrated with various data repositories and frameworks. Some are for use by the enterprise and external entities, while others are specific to particular Centers or projects. The latter are called out because their capabilities can be replicated for other use cases.
The analytic and BI tools implemented and supported at CMS include the following COTS products (comparisons below):
-
Esri ArcGIS Enterprise provides an AWS Cloud based enterprise solution that enables geospatial mapping and analysis. ArcGIS is used by the Center for Program Integrity (CPI) to enable geographic mapping and analyses to help meet its business objective of finding and reducing fraud, waste, and abuse in Medicare and Medicaid. It uses geocoded data from the Integrated Data Repository Cloud (IDRC) Provider and Beneficiary data.
-
AWS Services – AWS provides Query services like Amazon Athena and Redshift Spectrum, data visualization tools like Amazon QuickSight, data warehouses like Amazon Redshift, and sophisticated data processing frameworks like Amazon EMR (Elastic MapReduce). Each of these services addresses different needs and use cases, and are used by multiple centers at CMS. The table below compares them and provides guidance to help project teams choose one or more services based on their requirements.
-
SAP Business Objects (SAP Business Intelligence Platform) supports the Center for Program Integrity (CPI), for example, using the CMS Integrated Data Repository Cloud (IDRC) and provides ad hoc and OLAP reporting capabilities.
-
IBM Cognos Analytics supports the Part D and Drug Data Processing System (DDPS),Medical Appeal System (MAS), Unified Case Management (UCM), and Eligibility Appeals Case Management Solution (EACMS) (among others) using the Teradata data repository and provides ad hoc and online analytical processing (OLAP) reporting capabilities.
-
Informatica PowerCenter and Data Engineering Integration (DEI) are data integration tools used to integrate data warehouses, data marts, and operational data stores (ODS) with the capability to extract, transform, and load (ETL) data from virtually any business data source in CMS. PowerCenter is used for ETL with DB2, Teradata, Oracle databases, and flat files.
-
MicroStrategy MicroStrategy is a data analytics and business intelligence reporting tool that can generate interactive dashboards, high-end formatting in reports, scorecards, and several other features relating to the generation, sorting, and automated distribution of reports that provide insights into past, present, and future business trends. It supports Part B Analytics Reports (PBAR), for example, using the Teradata data repository and provides ad hoc and OLAP reporting capabilities.
-
Microsoft Power BI is a unified cloud-based business intelligence and analytics service that provides a full overview of the application's most critical data. By connecting to all the application data sources, Power BI simplifies data evaluation and sharing with scalable dashboards, interactive reports, embedded visuals, etc. while maintaining data accuracy, consistency, and security.
-
Python, R, and Scala are programming languages that are very helpful for data scientists since they offer powerful libraries to support data science use cases. The BI tools team provisions CMS Python AWS WorkSpaces with connectivity to IDRC , ACO-OS (DB2), and Informatica server for teams to use. Users can utilize these virtual desktops to run Python scripts and big data computations without storing data on their local machine (see comparisons).
-
SAS Enterprise Business Intelligence (EBI) combines the strengths of SAS Analytics and SAS Data Management and provides business users with a powerful tool for making better decisions. It also boosts data consistency and streamlines the administration. SAS offers a unified platform for teams to prepare the data, analyze it visually, and build, operationalize, and manage data science, and AI/ML models in an augmented design experience. SAS is a statistical software suite for data management, advanced analytics, multivariate analysis, business intelligence, criminal investigation, and predictive analytics. CPI and HHS OIG use SAS for accessing IDRC and IDRC (Snowflake) to perform analytical reports to identify Fraud, Waste, and Abuse for CMS.
-
Tableau provides all types of users with intuitive business intelligence (BI) tools to enhance data discovery and understanding. With simple drag-and-drop features, a user can easily access and analyze key data, create innovative reports and visualizations, and share critical insights across the agency. Tableau is currently used by a lot of centers including CMCS/DSG/DIS, CMMI, OA, OIT/ISPG, OIT/IUSG, and OSPR.
Business Intelligence Integrations and Platforms
These services provide integration with data sources for business intelligence and data analytics tools.
-
Databricks is a unified set of tools for building, deploying, sharing, and maintaining enterprise-grade data solutions at scale. The Databricks Lakehouse Platform integrates with cloud storage and security in CMS cloud and manages and deploys cloud infrastructure on the project’s behalf. Databricks is currently used by CMCS for its DataConnect. It is also available as Software as a Service (SaaS).
-
Snowflake is a cloud-native data warehouse platform that enables data storage, processing, and analytic solutions that are faster, easier to use, and massively scalable. Snowflake is currently used by OIT/EADG for the IDR Cloud program as well as EDM.
-
Enterprise Data Mesh (EDM) LaunchPad is an AWS-based Platform as a Service (PaaS) solution designed to facilitate the rapid setup of proof of concepts (POCs) and pilot projects. It offers flexibility in terms of supported architectures and leverages various AWS services to provide a comprehensive platform for application development and testing. Also, it is a pre-ATO’ed production environment for consumers who do not have a Cloud presence and want to explore cloud capabilities. The EDM Launchpad is designed to accelerate analytics and visualization endeavors for the CMS data community. CMS team can leverage AWS services such as Amazon Redshift for high-performance data warehousing, Amazon QuickSight for intuitive and interactive data visualization, and Amazon S3 for scalable object storage. Data integration and ETL workloads can be facilitated by AWS Glue and AWS Databrew, ensuring seamless data preparation and transformation. The environment also incorporates IAM policies to enforce secure access controls and compliance with CMS data governance policies. The Launchpad’s architecture can handle large volumes of data and support complex analytical workflows, ensuring scalability and performance. EDL Launchpad consumers can process large datasets to analyze and visualize the data using AI/ML models with the Python programming language.
-
EDM Workspace is a service offering provided to individual data analysts and scientists within CMS with a versatile and tailored virtual desktop environment equipped with the necessary tools for data analysis, wrangling, and visualization. The technology stack and architectural choices can be customized to ensure a secure, efficient, and personalized analytical workspace for each user. The Workspace offering is designed as an all-inclusive AWS Windows or Linux workspace tailored to cater to the unique preferences and requirements of individual users. The architecture features AWS WorkSpaces, providing a secure, cloud-based virtual desktop environment.
- These virtual desktops are integrated with Active Directory and Okta SSO for seamless authentication and authorization with CMS EUA and IDM user accounts.
- The CMS EDM team ensures that connectivity is established to CMS Business Intelligence tools, as needed, to ensure data scientists and data analysts can generate reports utilizing the metadata and data accessed through the EDM.
Background
CMS collects and maintains a vast amount of healthcare data. With the successful implementation of the CMS Integrated Data Strategy, the strategic, tactical, and operational importance of this data is now solidified as a critical enterprise asset. The CMS IDR Cloud has become the primary, authoritative enterprise data asset through the consolidation and modernization of various CMS data warehouses, data marts, and applications.
CMS has implemented an enterprise-wide BI Environment that provides the front-end query, analytics, and reporting solutions that make data more accessible. These solutions also empower users to make important decisions more efficiently and with greater confidence. The CMS IDR Cloud and BI Environments are major components of the CMS BI Reference Architecture, providing shared access to consolidated, reliable data and information across the CMS enterprise as well as with other government agencies and external business partners. The CMS BI Reference Architecture provides more detail.
Objectives
The CMS user base needs access to an integrated, layered CMS BI Reference Architecture that meets the following objectives:
-
Integrates authoritative data from diverse internal and external sources into a primary repository for access and sharing by common user communities
-
Adheres to the current CMS Minimum Security Requirements (CMSR) at the Moderate security level, as published in the CMS ARS , and requirements of the privacy and data use agreement (DUA) processes
-
Ensures that source data collected in the common CMS data repositories meets Agency standards for data quality and consistency
-
Develops self-service methods for easy, quick, and secure access to BI data by providing users with tools for reporting and analyzing data, whether that data resides within the IDR Cloud or in applications and data stores outside the IDR Cloud
-
Enables use of metadata and common enterprise-wide semantics to fully employ the self-service model, giving users access to information about the data they use
-
Provides an environment with the capacity to analyze integrated authoritative data in a secure CMS BI Environment
CMS BI Production Environment
The current CMS BI Production Environment consists of a multi-zone architecture that aligns with the CMS TRA. The CMS BI Production Environment supports the Presentation, Application, and Data Zones and the specific access and protection requirements for meeting the needs of users in each zone. CMS Business Intelligence Production Environment (As-Is) illustrates the BI tools and processes that form the framework of the CMS BI Production Environment.

The BI components are implemented in the CMS BI Production Environment in accordance with the CMS TRA Multi-Zone Architecture.
Non-Compliance with CMS TRA Multi-Zone Architecture
Although the BI tools implemented at CMS are enterprise-class BI software suites, the product manufacturers did not develop them to meet the unique demands and design requirements of the CMS infrastructure. None of the software suites, as implemented at CMS, are fully compliant with one aspect of the CMS TRA—i.e., data transport between the Application Zone and the Data Zone.
The CMS TRA requires that, “When an Application Zone application requires information or actions from the Data Zone, the application makes the request and processes the response using a ‘redundant messaging facility’.”
Instead of messaging, the BI software in use at CMS separates business processing and data processing by using ODBC or Java database connectivity (JDBC) as standard database access methods.
Only the manufacturers of the CMS BI software can alter this aspect of their products. The current versions of the implemented CMS BI tools have no accepted methods for mitigating this conflict with the CMS TRA Multi-Zone Architecture.
CMS Enterprise Portal Framework
In the current CMS BI Production Environment, each BI tool supports its own portal enabling application users to access the BI contents—the metadata, database data, queries, and reports—created for each specific BI application. Because information in the CMS BI production environment is distributed across many BI applications, information sharing and collaboration among CMS BI applications is quite limited.
CMS has implemented an enterprise-wide secure gateway, known as the CMS Enterprise Portal, that enables all BI software tools to access, manipulate, analyze, and share data. This portal provides a one-stop site where users can access and analyze data, helping them make important decisions more efficiently and with greater confidence. The BI Portal Framework leverages CMS’s Portal platform and architecture, providing aggregated reports and visualizations of data retrieved from various BI tools interacting with the CMS Enterprise Data Warehouse, which comprises any CMS data warehouse or data mart built for decision support.
CMS Enterprise Portal Framework illustrates the CMS Enterprise Portal Framework.

The CMS Enterprise Portal has the following additional features:
- Aligns with the CMS Enterprise Portal Strategy
- Integrates with the CMS Access Manager and CMS Enterprise Lightweight Directory Access Protocol
- Migrates the existing BI Portal from Solaris 10 to Z/Linux Operating System
- Implements single sign-on
The initial release of the CMS Enterprise Portal supports:
- Parts A, B, D, and Enrollment Dashboards
- Part B Analytics Reports
- National Level Repository (NLR), Health Information Technology for Economic and Clinical Health Act (HITECH) Electronic Health Record incentive program]
- Affordable Care Act (ACA) Program Dashboard
- MicroStrategy Web
- Teradata ViewPoint
Enterprise and other Data Repository Frameworks
These provide access to source data repositories housed in the cloud, and have been extensively integrated with enterprise BI tools. Depending on use case, they can be used as appropriate, or their designs can be reused for similar capabilities.
-
Enterprise Data Mesh (EDM) is a type of data platform architecture that embraces the ubiquity of data in the enterprise by leveraging a domain-oriented, self-serve design. The EDM creates a layer of connectivity that abstracts away the complexities of connecting, managing, and supporting access to data. At its core, it is used to stitch together data held across multiple data silos. The premise of a data mesh is that it is used to connect distributed data across different locations and organizations. A data mesh ensures that data is available, easily discoverable, secure, and interoperable with the applications that need access to it.
-
The Integrated Data Repository Cloud (IDRC) is a high-volume data warehouse integrating Medicare claims—Parts A, B, C, D, and Durable Medical Equipment (DME) with beneficiary and provider data sources, as well as such ancillary data as contract information and risk scores. This robust, integrated data supports much needed analytics across CMS, as well as external agencies such as DOJ, FBI, OIG, etc.
It consists of a data lake, ETL components, and a Snowflake data warehouse. Access to the IDR Cloud includes SAS EBI and Snowflake user interfaces, as well as the Data-as-a-Service (DaaS) API, or through EDL Snowflake.
-
CMCS DataConnect is an all-in-one analytics platform for the Center for Medicaid & CHIP Services (CMCS). DataConnect is built on Databricks and AWS QuickSight dashboards. It provides read-only access to an expanding set of enterprise datasets, integrated with tools. Existing data insights include Dashboards and Data Spotlights.
-
CMS Master Data Management (MDM) is a CMS enterprise shared service application with a focus on eliminating redundancy, inconsistency and fragmentation of CMS data and increasing efficiencies. MDM provides a single point of access to a singular, synchronized, comprehensive, and ID-resolved authoritative source of Beneficiary, Provider, Organization, Program and Relationship data for use within CMS and by external organizations and agencies.
Business Intelligence Tools Comparisons
Below are comparisons of analytics and BI tools in use at CMS