What Is Netezza / PureData for Analytics

Hey kids! Do you want to run structured relational queries against a massively parallel database with custom FPGA hardware!? Sure you do, so come with me on a magical adventure! I have a more detailed description of the Netezza architecture on a site where I also offer my services as a Netezza consultant. I’d like to seed the database section of this website with an article, and this is simply the easiest and most interesting topic to start with, since I’ve already been writing about it. It certainly has a relevant place within the database and analytics landscape.

You can read my description at the link above for more detail on the architecture. To sum it up here, though: the system is a whole lot of hardware thrown in parallel at the problem of handling the large data volumes typically found in data warehouses. It is designed to perform very quickly and to do so fairly easily, provided DBAs and developers learn and apply best practices for building data models and data warehouses, and take advantage of the technology Netezza provides.

The platform is great for more advanced analytics because, in reality, much of the prep work for these things is good old-fashioned grunt work, such as cleaning and preparing data. On a large, massively parallel machine with specialized hardware, huge tasks like these become a breeze. It also lets you offload much of the work that may be done slowly elsewhere. For example, rather than bringing data from the warehouse into SAS to prep and then model it, Netezza (NZ) could prep the data inside itself, do the aggregate or more advanced computational work, and hand SAS only what is necessary to take the next step. That significantly reduces network and disk input/output (I/O) operations.
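To make the "prep it in the database" idea concrete, here is a minimal sketch of pushing an aggregation down to the database instead of pulling every row to the client. It uses Python's built-in `sqlite3` purely as a stand-in so it runs anywhere; the `sales` table, its columns, and the sample rows are made up for illustration, and on a real Netezza box you would send the same kind of SQL over your usual ODBC/JDBC connection.

```python
import sqlite3

# sqlite3 is only a stand-in so this sketch is runnable; it is not Netezza.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")  # hypothetical table
sample_rows = [
    ("east", 10.0), ("east", 15.0),
    ("west", 7.5), ("west", 2.5),
    ("north", 4.0),
]
conn.executemany("INSERT INTO sales VALUES (?, ?)", sample_rows)

# Approach 1: pull every row out of the database and aggregate on the client.
pulled = conn.execute("SELECT region, amount FROM sales").fetchall()
client_totals = {}
for region, amount in pulled:
    client_totals[region] = client_totals.get(region, 0.0) + amount

# Approach 2: let the database aggregate and return one row per region.
pushed = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region"
).fetchall()
db_totals = dict(pushed)

# Same answer either way, but far fewer rows leave the database in approach 2.
rows_moved_client_side = len(pulled)
rows_moved_pushed_down = len(pushed)
print(rows_moved_client_side, rows_moved_pushed_down)
```

With five toy rows the difference is trivial, but the same pattern applied to billions of warehouse rows is where the network and disk I/O savings come from.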

The above is a 2-rack model with 480 active hard drives, 224 FPGA cores, and 280 CPU cores. It has 96 TB of raw capacity, and all data on Netezza is compressed; compression actually makes the system faster, because less data has to come off disk and the FPGAs decompress it in nearly real time. Actual data capacity is about 3-4x the raw number, although the workload you ask of the system will likely have exhausted its performance capacity by the time you’d actually want to store that much data on it.
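As a quick back-of-the-envelope check on those numbers, the sketch below multiplies the raw capacity by the 3-4x compression range quoted above. The constant and function names are just illustrative, and real compression ratios depend heavily on the data.

```python
RAW_CAPACITY_TB = 96        # raw capacity of the 2-rack model, from the text
COMPRESSION_RANGE = (3, 4)  # "about 3-4x", from the text


def effective_capacity_tb(raw_tb, ratio):
    """Estimate usable data capacity for a given compression ratio."""
    return raw_tb * ratio


low_tb, high_tb = (
    effective_capacity_tb(RAW_CAPACITY_TB, ratio) for ratio in COMPRESSION_RANGE
)
print(f"Roughly {low_tb}-{high_tb} TB of user data")
```

So the rough estimate for this configuration lands somewhere between 288 and 384 TB of user data.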

Not only that, but it has some more advanced analytic capabilities built in as well. It has a geospatial package for working with locations and geometries, and its analytic toolkit offers a lot of math functions, with algorithms such as linear regression and k-means clustering available right in the database. Other languages and packages can be used on the box too; custom functions can be written in many languages, including Java, C++, Python, and R, among others. So all in all, it’s an “old” database in that it’s a traditional, structured relational database, but it can certainly still kick it with today’s big data buzzwords and tech. Much of the data you’d find interesting is likely already on it, plus it has custom hardware, massive scan speeds, and the flexibility to do a lot of analytic work right on the machine.
