Data Validation Post Cloud Migration: Netezza to GCP – BigQuery

With its high-performing hardware and database query engine, IBM’s Netezza is an enterprise-level data warehouse that provides easy access to unified data from any location. At the same time, this data warehouse requires companies to invest significantly in on-premises hardware, maintenance, and licensing. 

In the age of cloud migration, IBM Netezza has a host of scalability challenges:

  • Limited data volume per source
  • Data archiving
  • Inefficient node management
  • Limited data storage and processing power

To leverage cloud capabilities, many enterprises are modernizing their data warehouses, including IBM Netezza. Among the leading cloud service providers, Google Cloud offers Google BigQuery, a serverless data warehouse designed to scale to petabytes of data. BigQuery can process millions of data records within a few seconds because the underlying Google Cloud infrastructure splits the work of each SQL query across thousands of cloud-hosted servers that run in parallel.

Why enterprises are choosing GCP over Netezza

More enterprises are choosing GCP-BigQuery over IBM Netezza for numerous reasons, including:

  • Technical constraints
  • Batch processing performance
  • Scalability problems
  • Implementation costs

Unlike Netezza, GCP-BigQuery does not require companies to manage their own infrastructure or even to employ a database administrator. Because the BigQuery engine runs on Google’s infrastructure, it can easily run multiple queries in parallel on multiple servers.

Here are two core features that differentiate GCP-BigQuery from Netezza:

  1. Column-based data storage
    In BigQuery, data is stored by column rather than by row. Values within a column share a type and often repeat, which improves the data compression ratio, and a query reads only the columns it references, which improves throughput.
  2. Tree architecture
    BigQuery uses a tree architecture: a query is dispatched down a tree of servers, and the partial results are aggregated back up across multiple machines within a few seconds.
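
To make the tree idea concrete, here is a toy sketch in Python. It is not BigQuery's implementation; the function name `tree_aggregate`, the `fanout` parameter, and the in-memory "shards" are all assumptions made for illustration. Each leaf sums its own slice of a column, and intermediate levels combine partial results until a single value reaches the root.

```python
from typing import List


def tree_aggregate(shards: List[List[int]], fanout: int = 2) -> int:
    """Sum a column that is split across shards, combining results level by level.

    Hypothetical illustration only: real systems run each step on separate machines.
    """
    if fanout < 2:
        raise ValueError("fanout must be at least 2")
    if not shards:
        return 0

    # Leaf level: every shard computes a partial sum over its own values.
    partials = [sum(shard) for shard in shards]

    # Intermediate levels: each parent combines up to `fanout` child results.
    while len(partials) > 1:
        partials = [
            sum(partials[i:i + fanout])
            for i in range(0, len(partials), fanout)
        ]

    # Root: a single combined result.
    return partials[0]


if __name__ == "__main__":
    column_shards = [[1, 2], [3], [4, 5, 6]]
    print(tree_aggregate(column_shards))
```

Because each level only combines a handful of partial results, adding more leaves increases the depth of the tree slowly, which is the intuition behind spreading one query over many machines.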

By leveraging our cloud migration capabilities, Onix has successfully migrated many customers from IBM Netezza to Google Cloud. Before retiring legacy systems like Netezza, organizations need to perform data validation in the post-migration phase. Efficient data validation is integral to the success of cloud migration. 

How Pelican automates data validation during migration from Netezza to GCP-BigQuery

For customers moving data from IBM Netezza to GCP, data security and implementation delays are the leading concerns when performing data validation. This is where the Pelican data validation tool can make a difference.

With its automation capability, Pelican enables companies to validate their data in parallel with the data migration process. For post-migration data validation, Pelican saves over 60% of the time that manual validation requires.

Here are some of Pelican’s features that simplify data migration from Netezza to the Google Cloud Platform:

  • Automated data validation and reconciliation
  • No coding required and no data movement between systems
  • Real-time intelligent data comparison at the cell level
  • Parallel data validation during data migration
  • Real-time detailed reports on data errors and discrepancies
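
To show what a cell-level comparison means in practice, here is a minimal Python sketch. It is not how Pelican works internally (this article does not describe that); the function `compare_cells`, the `Row` type, and the sample rows are hypothetical. The sketch matches rows by a key column and reports every cell whose value differs, along with rows that exist on only one side.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]
Diff = Tuple[Any, str, Any, Any]  # (key, column, source value, target value)


def compare_cells(source: List[Row], target: List[Row], key: str) -> List[Diff]:
    """Compare two small tables cell by cell, matching rows on `key`."""
    src = {row[key]: row for row in source}
    tgt = {row[key]: row for row in target}
    diffs: List[Diff] = []

    for k in sorted(src.keys() | tgt.keys()):
        s, t = src.get(k), tgt.get(k)
        if s is None or t is None:
            # The row exists on one side only.
            diffs.append((
                k,
                "<row>",
                "missing" if s is None else "present",
                "missing" if t is None else "present",
            ))
            continue
        for column in sorted(s.keys() | t.keys()):
            if s.get(column) != t.get(column):
                diffs.append((k, column, s.get(column), t.get(column)))
    return diffs


if __name__ == "__main__":
    netezza_rows = [{"id": 1, "amt": 10}, {"id": 2, "amt": 20}]
    bigquery_rows = [{"id": 1, "amt": 10}, {"id": 2, "amt": 25}, {"id": 3, "amt": 30}]
    for diff in compare_cells(netezza_rows, bigquery_rows, key="id"):
        print(diff)
```

A production tool would also need to handle type differences between the two systems (for example, numeric precision or timestamp formats), which this sketch ignores.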

A real-world application of Pelican data validation

Here’s a real-world application of Pelican for a healthcare company migrating their data warehouse from IBM Netezza to GCP:

The company wanted to complete its warehouse migration to the cloud before the end-of-life of its Netezza system. Initially, the company planned to manually validate the data, but decided against it for the following reasons:

  • The entire validation process would be time-consuming and expensive
  • The company planned to hire 25 BigQuery engineers to perform the manual validation. However, they would only be able to run limited instances of data validation
  • In the limited time available, manual validation would cover only a sample of the data instead of the entire database

With the Pelican validation tool, the company could quickly validate and reconcile petabyte-level data at the cell level. In addition, they could deploy the tool across heterogeneous systems to accelerate validation during the transfer to the cloud, without moving the data.
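
One common way to validate without moving data is to compute a compact fingerprint of a table on each system and compare only the fingerprints. The Python sketch below illustrates that general technique; it is an assumption for illustration, not a description of Pelican's method, and the function name `table_fingerprint` is hypothetical. It hashes every row, adds the hashes together so that row order does not matter, and includes the row count.

```python
import hashlib
from typing import Iterable, Sequence

_MODULUS = 1 << 256  # keep the running total within 256 bits


def table_fingerprint(rows: Iterable[Sequence[object]]) -> str:
    """Return an order-independent fingerprint of a table.

    Each row is hashed with SHA-256 and the digests are summed modulo 2**256,
    so the same rows in any order produce the same fingerprint.
    """
    total = 0
    count = 0
    for row in rows:
        # Join cell representations with a separator unlikely to occur in data.
        encoded = "\x1f".join(repr(value) for value in row).encode("utf-8")
        digest = hashlib.sha256(encoded).digest()
        total = (total + int.from_bytes(digest, "big")) % _MODULUS
        count += 1
    return f"{count}:{total:064x}"


if __name__ == "__main__":
    source = [(1, "a", 10), (2, "b", 20)]
    target = [(2, "b", 20), (1, "a", 10)]  # same rows, different order
    print(table_fingerprint(source) == table_fingerprint(target))
```

Only the short fingerprint strings need to leave each system. When they differ, a finer-grained comparison is still required to locate the affected rows and cells.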

Summary

As more organizations plan to migrate from IBM Netezza to GCP and BigQuery, an automated data validation tool can streamline this process while saving valuable time and effort. With our Pelican tool, you can achieve 100% accuracy in data validation, including validation of the entire dataset at the cell level.

As an integral part of our Datametica Birds product suite, Pelican is designed to accelerate the cloud migration process to GCP. Here’s a case study of a leading U.S.-based auto insurance company migrating from Netezza to GCP.

We can help you streamline your cloud migration to GCP. To learn more, contact us now.

Reference links:

https://cloud.google.com/bigquery/docs/migration/netezza
