本文主要是介绍redshift and MPP,希望对大家解决编程问题提供一定的参考价值,需要的开发者们随着小编来一起学习吧!
MPP database
Massive Parallel Processing (MPP) database is a type of database that scales horizontally. MPP dbs adopted share-nothing architecture in that every “node” will maintain its own CPU, storage, etc. A query will be processed by multiple nodes in parallel and the results will be combined. In the early days, Teradata was the dominant vendor of MPP databases. Each node is a “database-like” program called AMP. Later on there are more MPP dbs. Most notable ones are Greenplum and Redshift. Both are based on PostgreSQL as basic nodes but both changed postgreSQL to columnar DB, whereas the regular postgreSQL is a row-based database. Another famous MPP and columnar database is Vertica, which originated from C-store.
Redshift
Redshift is Amazon’s version of MPP database and data warehouse (BI) that based on PostgreSQL 8.0.2. Since Redshift keeps the same interface as PostgreSQL, it is easy for customers to migrate their existing workload from PostgreSQL to Redshift.
There are several types of nodes: leader nodes, computer nodes. A computer nodes has dedicated CPU, disk resources and the resources are divided into node slices. The rows are distributed to node slices based on a distribution key. Then the leader node will distribute the work to node slices.
https://docs.aws.amazon.com/redshift/latest/dg/c_internal_arch_system_operation.html
这篇关于redshift and MPP的文章就介绍到这儿,希望我们推荐的文章对编程师们有所帮助!