0% found this document useful (0 votes)

385 views15 pages

Green Plum

The document summarizes the new features in Greenplum Database version 3.2.0.0, which is a major release that introduces several new features including Greenplum MapReduce, in-database compression, programmable parallel analytics using R and custom functions, enhanced database monitoring, and performance enhancements. It provides details on each new feature and references sections in the documentation for further information.

Uploaded by

sarferazul_haque

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

385 views15 pages

Green Plum

Uploaded by

sarferazul_haque

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 15

October 3, 2008

Welcome to Greenplum Database 3.2.0.0

Greenplum Database is a massively parallel processing (MPP) database server that

leverages PostgreSQL open-source technology. Greenplum Database 3.2 is a major
release which introduces several new features, performance and stability
enhancements, and internal changes to the system catalogs. Please refer to the
following sections for more information about this release:
• New Features in Greenplum Database 3.2
• Changed Features in Greenplum Database 3.2
• Resolved Issues in Greenplum Database 3.2
• Known Issues in Greenplum Database 3.2
• Upgrading to Greenplum Database 3.2 from 3.1
• Installing Greenplum Database 3.2 (New Users)
• Greenplum Database 3.2 Documentation

New Features in Greenplum Database 3.2

• Greenplum MapReduce
• In-Database Compression
• Programmable Parallel Analytics
• Enhanced Database Monitoring
• Append-Only Tables
• Enhanced Table Partitioning
• PostgreSQL 8.3 Regular Expression Functions
• Memory Overcommit Protection
• Performance Enhancement for Single Row INSERT Operations
• Performance Enhancement for DISTINCT Aggregate Operations
• Performance Enhancement for Hash Aggregate Operations
• Enhanced Interconnect Scalability
Greenplum MapReduce
MapReduce is a programming model originally developed by Google for processing
and analyzing large data sets on an array of commodity servers. Greenplum
MapReduce allows programmers who are familiar with the MapReduce paradigm to
write map and reduce functions and submit them to the Greenplum Database parallel
data flow engine for processing. The Greenplum Database system takes care of the
details of distributing the input data, executing the program across a set of machines,
handling machine failures, and managing the required inter-machine communication.

New Features in Greenplum Database 3.2 1

Greenplum Database 3.2.0.0 Release Notes

In order for Greenplum to be able to process the Map and Reduce functions written by
a user, the functions need to be defined in a specially formatted Greenplum
MapReduce document, which is then passed to the Greenplum MapReduce client
program, gpmapreduce, for execution by the Greenplum Database parallel engine.
Supporting Documentation for Greenplum MapReduce

For more information on Greenplum MapReduce, see the following sections of the
Greenplum Database 3.2 Administrator Guide (GPAdminGuide.pdf):
• Chapter 27, Using Greenplum MapReduce
• Appendix B, Management Utility Reference, “gpmapreduce”
• Appendix F, Greenplum MapReduce Specification
In-Database Compression
Greenplum Database 3.2 introduces in-database compression using gzip compression
technology. This allows customers to increase performance and reduce the space
required to store data. Customers can expect to see a 3-10x space reduction with
increased effective I/O performance to match (provided that your segment systems
have the available CPU power to compress and uncompress the data).
In-database compression is available on tables that utilize the append-only storage
model. See Append-Only Tables.

Programmable Parallel Analytics

Greenplum Database 3.2 introduces a new level of parallel analysis capabilities for
mathematicians and statisticians. For the first time customers can use the R statistical
language (see PL/R Support), or build custom functions using our linear algebra and
machine learning primitives (see Advanced Analytic Functions), and run them in
parallel directly against data in the database.
PL/R Support
PL/R is a PostgreSQL language extension that allows you to write user-defined
functions and aggregates in the R statistical computing language. More information on
R is available on the R Project web site: http://www.r-project.org. More information
on PL/R is available at: http://www.joeconway.com/plr.
Because R and PL/R are available under the terms of the GNU General Public
License, they are packaged separately from the 3.2 Greenplum Database installation.
Greenplum has compiled and packaged platform-specific packages of R and PL/R
(available for download on Greenplum Network). In order to use PL/R functions, R
and PL/R must be installed on every host in your Greenplum Database array.
This release includes built-in support for PL/R in the CREATE LANGUAGE command.
Once you have installed the required R and PL/R libraries on you Greenplum hosts,
you can use CREATE LANGUAGE to add PL/R support to your database. For example:
CREATE LANGUAGE plr;

New Features in Greenplum Database 3.2 2

Greenplum Database 3.2.0.0 Release Notes

Advanced Analytic Functions

Greenplum Database 3.2 introduces the following new built-in functions for advanced
statistics and analysis. These functions are executed in parallel within Greenplum
Database:
Table A.1 New Advanced Analytic Functions in 3.2

Function Name Description

matrix_add(array[], array[]) Adds two two-dimensional matrices. The matrices must be conformable.

matrix_multiply(array[], array[]) Multiplies two two-dimensional arrays. The matrices must be conformable.

matrix_multiply(array[], expr) Multiplies a two-dimensional array and a scalar numeric value.

matrix_transpose(array[]) Transposes a two-dimensional array.

pinv(array[]) Calculates the Moore-Penrose pseudoinverse of a matrix.

unnest(int[]) Transforms a one dimensional array into rows. Returns a set of

anyelement, a polymorphic psuedotype in PostgreSQL.
sum(array[]) Performs matrix summation. Can take as input a two-dimensional array that
is treated as a matrix.

pivot_sum(label[], label, expr) A pivot aggregation using sum to resolve duplicate entries.

mregr_coef(expr, array[]) The functions mregr_coef and mregr_r2 are both multiple linear
regression aggregates. mregr_coef calculates the regression coefficients.

mregr_r2(expr, array[]) The functions mregr_coef and mregr_r2 are both multiple linear
regression aggregates. mregr_r2 calculates the r-squared error value.

nb_classify(text[], bigint, Classify rows using a naive Bayes classifier. This function is used with a
bigint[], bigint[]) baseline of training data to predict the classification of new rows.

Supporting Documentation for Advanced Analytic Functions

For more information on these functions, see the following section of the Greenplum
Database 3.2 Administrator Guide (GPAdminGuide.pdf):
• Chapter 16, Querying Data, “Advanced Analytic Functions”
Enhanced Database Monitoring
Greenplum Database 3.2 introduces a new infrastructure for monitoring database
performance and usage. This new infrastructure enables the gathering and storage of
comprehensive metrics about system utilization and query performance.
This functionality is currently disabled, and will be enabled in an upcoming 3.2
maintenance release. This maintenance release will fully activate the Greenplum
Performance Monitor functionality and provide the data collection, SQL storage, and
graphical user interface components of the infrastructure.

Enhanced Table Partitioning

This underlying partitioning infrastructure is the same in 3.2 as in previous Greenplum
Database releases. Partitioned tables use a combination of table inheritance and CHECK
constraints. Table inheritance creates a persistent relationship between a parent table
and its child table partitions, so that all of the schema information from the parent

New Features in Greenplum Database 3.2 3

Greenplum Database 3.2.0.0 Release Notes

table propagates down to the child table partitions. CHECK constraints both limit the
data a child table partition can contain, and are used to eliminate partitions at query
runtime that are not relevant to a given query predicate.
What has changed about table partitioning in 3.2, is the way partitioned tables are
created and managed. In previous releases, tables were partitioned by running the
gpcreatepart management utility or via manual SQL operations. Also, it was
necessary to use rewrite rules to route data inserted through the parent table to the
correct child table. Rewrite rules did not apply to COPY commands, only INSERT, and
significantly slowed load operations when loading through the parent table.
In 3.2, partitioned tables are now created with one simple CREATE TABLE command.
New syntax has been added to CREATE TABLE to support the creation of multi-level
range and list partition designs. Similarly, ALTER TABLE now supports new syntax for
managing existing partition designs. With the new 3.2 partitioning, rewrite rules are
no longer needed to route incoming data to the correct partition. The partition design
and rules are now stored in system catalog tables, which significantly improves
runtime rule evaluation and load performance.
The old partitioning utilities have been deprecated in 3.2. gpcreatepart is no longer
distributed with the Greenplum Database 3.2 installation. gpaddpart and
gpdeletepart are available in 3.2, but can only be used to maintain old 3.1 style
partitions. For users who have existing 3.1 style partitioned tables and wish to upgrade
their partition designs to 3.2, Greenplum will provide an upgrade utility to assist with
the migration. Contact Greenplum Customer Support if you are interested in this
partition upgrade utility.
Supporting Documentation for Enahanced Table Partitioning

For more information on enhanced table partitioning in 3.2, see the following sections
of the Greenplum Database 3.2 Administrator Guide (GPAdminGuide.pdf):
• Chapter 14, Defining Database Objects, “Partitioning Large Tables”
• Appendix A, SQL Command Reference, “CREATE TABLE”
• Appendix A, SQL Command Reference, “ALTER TABLE”
Append-Only Tables
Append-only tables are a new table storage model for large tables in 3.2. Traditional
heap table storage tables favor OLTP-type workloads where data is often modified
after it is initially loaded. UPDATE and DELETE operations require table-level locking
and row-level versioning information to be stored in order to ensure that database
transactions are processed reliably.
In most data warehouse workloads, however, most of the data resides in denormalized
fact tables, which are typically the largest tables in the system. Fact tables are typically
appended to in batches and then accessed by read-only queries. Data is seldom
changed after it is written. These append-only characteristics allow a range of storage
and performance optimizations to be applied.
Moving large fact tables to an append-only storage model eliminates the storage
overhead of the per-row update visibility information (about 20 bytes per row is
saved). This allows for a leaner and easier-to-optimize page structure. This page
structure also allows for in-database compression (using gzip).

New Features in Greenplum Database 3.2 4

Greenplum Database 3.2.0.0 Release Notes

Append-only tables offer a more efficient storage format than traditional heap storage
tables, but lack certain features available with heap tables:
• Cannot UPDATE rows in an append-only table
• Cannot DELETE rows from an append-only table
• Cannot ALTER TABLE...ADD COLUMN to an append-only table
• Cannot add indexes to an append-only table
The storage model of a table is declared at CREATE TABLE time. The storage model of
a table cannot be changed on an existing table. The WITH clause of the CREATE TABLE
command is used to declare the storage options of the table. If not declared, the table
will be created as a regular heap-storage table. For example, to create an append-only
table with compression level of 5:
CREATE TABLE foo (a int, b text) WITH (appendonly=true,
compresslevel=5);
Supporting Documentation for Append-Only Tables

For more information on append-only tables, see the following sections of the
Greenplum Database 3.2 Administrator Guide (GPAdminGuide.pdf):
• Chapter 14, Defining Database Objects, “Choosing the Table Storage Type”
• Appendix A, SQL Command Reference, “CREATE TABLE”
PostgreSQL 8.3 Regular Expression Functions
Greenplum Database is based on PostgreSQL 8.2.9, but the following regular
expression functions from PostgreSQL 8.3 have been added to Greenplum Database
3.2:
regexp_matches(<string>, <pattern> [, <flags> ])
regexp_split_to_table(<string>, <pattern> [, <flags> ])
regexp_split_to_array(<string>, <pattern> [, <flags> ])
For more information on these functions, see the PostgreSQL 8.3 documentation at:
http://www.postgresql.org/docs/8.3/static/functions-matching.html

Memory Overcommit Protection

On some OS platforms, a segment host can become unresponsive when it runs out of
memory. To protect against this condition occurring, Greenplum 3.2 introduces a new
server configuration parameter:
gp_vmem_protect_limit
This parameter sets the amount of memory (in MBs) that all postgres processes of a
segment instance can consume. To prevent over allocation of memory, set to:
(physical_memory/segments) + (swap_space/segments/2)
For example, on a segment host with 16GB physical memory, 16GB swap space, and
4 segment instances the calculation would be:
(16/4) + (16/4/2) = 6GB
6 * 1024 = 6144MB

New Features in Greenplum Database 3.2 5

Greenplum Database 3.2.0.0 Release Notes

If a query causes this limit to be exceeded, memory will not be allocated and the query
will fail. Note that this is a local parameter and must be set for every segment instance.
This parameter replaces the now deprecated gp_process_memory_cutoff parameter
which was harder to calculate. It set a per-process memory limit rather than a
per-segment limit.

Performance Enhancement for Single Row INSERT Operations

In Greenplum Database 3.2, INSERT rows are now distributed to segments in the same
fashion as a COPY command. Inserted rows are dispatched directly to their target
segment instance without requiring a motion operation. This significantly improves
performance of single-row-insert statements anywhere from 100-300 percent.

Performance Enhancement for DISTINCT Aggregate Operations

In previous releases, it was often beneficial to rewrite queries involving
DISTINCT-qualified aggregate functions, for example COUNT(DISTINCT x), in such a
way as to eliminate duplicate values of the distinct attributes before an aggregation.
In 3.2, optimizer improvements render such rewrites unnecessary. In fact, the query
optimizer’s internal transformation will now usually produce a plan that is superior to
what can be achieved by rewriting the SQL.

Performance Enhancement for Hash Aggregate Operations

This release includes enhancements to the query executor for improved performance
of hash aggregation (HashAgg) operations. In cases where the query planner under
estimates the cost of a HashAgg operation, the executor will now attempt to reduce the
size of data that spills to disk during execution, thereby improving performance.
Aggregation of numeric data has also been improved. Users should see a 10-20%
performance enhancement on applicable queries involving hash aggregation.

Enhanced Interconnect Scalability

The interconnect is the networking layer of Greenplum Database. When a user
connects to a database and issues a query, processes are created on each of the
segments to handle the work of that query. The interconnect refers to the inter-process
communication between the segments, as well as the network infrastructure on which
this communication relies.
In previous releases, the interconnect used TCP (Transmission Control Protocol) to
send messages over the network. With TCP, Greenplum had a scalability limit of 1000
segment instances. To remove this limit, Greenplum has introduced a reliable
UDP-based transmission protocol that is optimized for data flow processing.

Changed Features in Greenplum Database 3.2

• Management Utility Changes
• SQL Command Changes
• Server Configuration Parameter Changes
• System Catalog Changes

Changed Features in Greenplum Database 3.2 6

Greenplum Database 3.2.0.0 Release Notes

Management Utility Changes

The following management utilities have new or changed command-line options in
3.2:
Table A.2 Management Utility Changes in 3.2

Utility New / Changed Options Deprecated Options

gpaddmirrors -i mirror_config_file (file format changed) -c ssh | rsync (rsync always used in 3.2)
-m datadir_config_file -v (changed to --version)
-o output_sample_mirror_config_file
--version

gpaddpart Old partitioning utilities are deprecated in 3.2.

This utility can still be used to maintain 3.1.x.x
style partitioned tables, but is not compatible with
3.2 partitioned tables.

gpchecknet New utility for testing network performance

gpcheckos -u ssh_user
-p ssh_port
-c custom_config_file

gpcheckperf -r {n|N} (network performance test removed -

see gpchecknet)

gpcreatepart Deprecated and removed from 3.2. To create and

maintain a new partitioned table in 3.2, use the
CREATE TABLE and ALTER TABLE commands.

gpcrondump --column-inserts
-E encoding
--inserts
--no-owner
--no-privileges
--oids
-u backup_directory
--use-set-session-authorization
-w dbid
-y report_file

gpdeletepart Old partitioning utilities are deprecated in 3.2.

This utility can still be used to maintain 3.1.x.x
style partitioned tables, but is not compatible with
3.2 partitioned tables.

gpdetective New diagnostic and log capture utility

gpfdist Now supports bzip2 compressed files in -q quote_character

addition to gzip -h header
-x escape_character
(Data formatting options removed from gpfdist
command-line. gpfdist now looks to the external
table definition FORMAT clause for instructions on
how to parse the data for both TEXT and CSV
formatted files)

gpinitstandby -i (do not start standby sync agent)

Changed Features in Greenplum Database 3.2 7

Greenplum Database 3.2.0.0 Release Notes

Table A.2 Management Utility Changes in 3.2

Utility New / Changed Options Deprecated Options

gpmapreduce New client utility for submitting Greenplum

MapReduce jobs

gpmigrator New 3.1.x.x to 3.2 upgrade utility

gprecoverseg -i input_config -c ssh | rsync (new smart sync added in 3.2)

-o output_config -v (changed to --version)
-S seg_dbid
-s primary_datadir,mirror_datadir

gpscp -u ssh_user
-p ssh_port

gpssh -u ssh_user
-p ssh_port

gpssh-exkeys -u ssh_user
-p ssh_port

gpstart -z (force start)

gpstate -Q (quick status)

SQL Command Changes

The following SQL commands have new or changed syntax in 3.2.
Table A.3 Changed SQL Commands in 3.2

Command Description of Change

CREATE TABLE New PARTITION BY clause for creating partitioned tables

New append-only storage options added to the WITH clause

SELECT Aggregate expressions now allow a FILTER clause:

aggregate_name (expression [ , ... ] ) [FILTER (WHERE condition)]

Server Configuration Parameter Changes

• New Parameters
• Parameters with Changed Defaults
• Deprecated Parameters

Changed Features in Greenplum Database 3.2 8

Greenplum Database 3.2.0.0 Release Notes

New Parameters

Table A.4 New Server Configuration Parameters in 3.2

Parameter Description

gp_enable_agg_distinct Enables or disables two-phase aggregation to compute a single distinct-qualified

aggregate. This applies only to subqueries that include a single distinct-qualified
aggregate function.

gp_enable_agg_distinct_pruning Enables or disables three-phase aggregation and join to compute

distinct-qualified aggregates. This applies only to subqueries that include one or
more distinct-qualified aggregate functions.

gp_enable_fast_sri When set to on, the query planner plans single row inserts so that they are sent
directly to the correct segment instance (no motion operation required). This
significantly improves performance of single-row-insert statements.

gp_enable_multiphase_agg Enables or disables the query planner’s use of two or three-stage parallel
aggregation plans. This approach applies to any subquery with aggregation. If
gp_enable_multiphase_agg is off, then gp_enable_agg_distinct and
gp_enable_agg_distinct_pruning are disabled.

gp_external_grant_privileges (Added in 3.1.1.4) Enables or disables non-superusers to issue a CREATE

EXTERNAL [WEB] TABLE command in cases where the LOCATION clause
specifies http or gpfdist.

gp_hashagg_compress_spill_files When a hash aggregation operation spills to disk during query processing,
specifies the compression algorithm to use on the spill files. If using zlib, it must
be in your $PATH on all segments.

gp_max_local_distributed_cache (Added in 3.1.1.5) Sets the number of local to distributed transactions to cache.
Higher settings may improve performance.

gp_safefswritesize Specifies a minimum size for safe write operations to append-only tables in a
non-mature file system. When a number of bytes greater than zero is specified,
the append-only writer adds padding data up to that number in order to prevent
data corruption due to file system errors. Each non-mature file system has a
known safe write size that must be specified here when using Greenplum
Database with that type of file system. This is commonly set to a multiple of the
extent size of the file system; for example, Linux ext3 is 4096 bytes, so a value
of 32768 is commonly used.

gp_snapshotadd_timeout (hidden parameter introduced in 3.1.1.9) Under very heavy workload, concurrent
sessions can sometimes collide waiting for a snapshot of the database at the
segment level. When this occurs, the segment instance may fail with a message
in its log such as:
'SharedSnapshotAdd: found existing entry for our session-id'
As a work around to this issue, increase this timeout (in seconds) at the
segment. To increase the timeout, you must set this parameter in the
postgresql.conf of all segment instances.

gp_vmem_protect_limit Sets the amount of memory (in number of MBs) that all postgres processes of a
segment instance can consume. To prevent over allocation of memory, set to:
(physical_memory/segments) + (swap_space/segments/2)

log_dispatch_stats When set to on, logs information about the dispatch of each statement.

max_appendonly_tables Sets the maximum number of append-only relations that can be written to or
loaded concurrently. Append-only table partitions and subpartitions are
considered as unique tables against this limit. Increasing the limit will allocate
more shared memory at server start.

Changed Features in Greenplum Database 3.2 9

Greenplum Database 3.2.0.0 Release Notes

Parameters with Changed Defaults

The following parameters have changed default values from 3.1.1.x:
• enable_mergejoin=off (was on in 3.1.1.x)
• enable_nestloop=off (was on in 3.1.1.x)
• log_line_prefix=|%m|%u|%d|%p|%I|%X|:- (%X now logs distributed
transaction ID, local transaction ID)
• gp_enable_agg_distinct=on (was off in 3.1.1.x)
• gp_selectivity_damping_for_scans=on (hidden parameter - was off in
3.1.1.x)
• gp_selectivity_damping_for_joins=off (hidden parameter - was on in
3.1.1.x)
• wal_send_client_timeout=30s (hidden parameter - was a half second in
3.1.1.x)
Deprecated Parameters
The following parameters are removed in 3.2. If these are referenced in your
postgresql.conf files, they must be removed.
• gp_propagate_broadcast_cost
• gp_process_memory_cutoff (replaced by gp_vmem_protect_limit)

System Catalog Changes

The following new system catalog tables and views have been added or modified in
3.2.
Table A.5 New System Catalogs in 3.2

Catalog Description

gp_distributed_log New system view for distributed transaction management

gp_master_mirroring New system table to track the state of synchronization between the primary master
and the standby master

gp_transaction_log New system view for distributed transaction management

pg_class Changes made to support append-only tables: relkind, relstorage,

relaosegxid, relaosegrelid columns

pg_partition New system table for table partitioning

pg_partition_columns New system view for table partitioning

pg_partition_rule New system table for table partitioning

pg_partition_templates New system view for table partitioning

pg_partitions New system view for table partitioning

Resolved Issues in Greenplum Database 3.2

This section lists the customer reported issues that are now resolved in Greenplum
Database 3.2. The resolved issues are divided into the following categories:

Resolved Issues in Greenplum Database 3.2 10

Greenplum Database 3.2.0.0 Release Notes

• Resolved Query Dispatch and Execution Issues

• Resolved Data Loading Issues
• Resolved Management Utility Issues
• Resolved Backup and Restore Issues
• Resolved Transaction and Concurrency Issues
Resolved Query Dispatch and Execution Issues

Table A.6 Resolved Issues in 3.2 - Query Dispatch and Execution

Issue Number Description

2875 Queries of Temporary Tables Fail After Automatic Segment Failover

In 3.1.1.3 and later, certain queries involving temporary tables would fail when a segment failure
occurred mid-stream. After automatic failover to the mirror, users would sometimes encounter the
following errors:
ERROR: Interconnect error QD lost contact with QE...
ERROR: freeGangsForPortal: Bad primary writer gang...

2963 Error Using the pg_size_pretty Function

Calculating the database size using the PostgreSQL function pg_size_pretty resulted in
catalog-related errors.

3022 Queries Using Date Functions Return Unexpected Results

In 3.1.1.4 and later, certain queries involving table joins on the results of the alldates() or
current_date() functions were not planned correctly. This could result in unexpected result sets
or the following error:
QD unable to connect...

3295 Failed Query with Window Function and ORDER BY clause

In 3.1.1.x releases, certain queries using window functions (such as rank) and an ORDER BY
clause where the ordering attribute was the output column of the window function would fail with
the following error:
Unexpected internal error: QD process received signal SIGSEGV

3511 Failed Query with Redundant DISTINCT-qualified Aggregates

In 3.1.1.x releases, certain queries with redundant DISTINCT-qualified aggregate clauses would
fail with the following error:
Unexpected internal error: QD process received signal SIGSEGV

Resolved Data Loading Issues

Table A.7 Resolved Issues in 3.2 - Data Loading

Issue Number Description

2462 External WEB Tables do not Return an Error if EXECUTE Command Fails
In 3.1.x.x releases, external web tables that were defined with the EXECUTE clause did not return
an error when the command ran by the external table failed at runtime.

Resolved Issues in Greenplum Database 3.2 11

Greenplum Database 3.2.0.0 Release Notes

Table A.7 Resolved Issues in 3.2 - Data Loading

Issue Number Description

2984 COPY Loads Errant Characters When Client and Server Encodings are Different
In 3.1.1.3 and later, COPY would insert errant characters at the beginning of strings when
converting between certain client and server encodings (such as UTF8 to EUC_JP).

3281 Failed gpfdist Process Under Heavy Load

In releases prior to 3.1.1.6, multiple concurrent accesses to an external table that used the
gpfdist protocol would sometimes cause the gpfdist file server process to fail with the
following errors:
ERROR: connection failed while gpfdist://...
WARN: port_getn: Invalid argument...

Resolved Management Utility Issues

Table A.8 Resolved Issues in 3.2 - Management Utilities

Issue Number Description

2651 gpstop -i Results in Core Dump on the Master

Using the gpstop utility with the -i option for immediate shutdown resulted in a core dump on the
master and SEGV errors. Error handling in the utility has been changed to resolve this issue.
Note that gpstop -i is not the recommended way to cleanly shutdown a Greenplum system.

2793 Not Able to Add Partitions in Middle of Range

The 3.1.x.x utilities gpcreatepart and gpaddpart were unable to add child partitions except at
the end of a range of tables. These utilities are deprecated in 3.2. Enhanced table partitioning in
this release resolves this issue.

Resolved Backup and Restore Issues

Table A.9 Resolved Issues in 3.2 - Backup and Restore

Issue Number Description

3645 gp_dump Locks Schemas not Involved in Dump Operation

In 3.1.1.x releases, using gp_dump to dump a single schema would lock all other schemas in the
database as well. gp_dump has been modified in 3.2 to obtain exclusive locks on just the specific
tables or schemas selected for backup.

Resolved Transaction and Concurrency Issues

Table A.10 Resolved Issues in 3.2 - Transactions and Concurrency

Issue Number Description

2593 Error dropping partitioned tables

In 3.1.x x releases, dropping a table with a large number of partitions would sometimes result in
the following error:
two-phase state file for transaction is corrupt...
This was due to a limit on the size of the file that saves the prepare state for a transaction. The
limit has been modified in this release to resolve this issue.

3164 Too many updates/deletes within transaction error in serializable transactions

In 3.1.1.4 and later, certain serializable transactions containing function calls or TRUNCATE
commands would fail with the error:
ERROR: Too many updates/deletes within transaction...

Resolved Issues in Greenplum Database 3.2 12

Greenplum Database 3.2.0.0 Release Notes

Table A.10 Resolved Issues in 3.2 - Transactions and Concurrency

Issue Number Description

3266 SEGV during rollback of transaction

In prior 3.1.x.x releases, when rolling back a query with pg_cancel_backend, the system
encountered a SEGV error and the postmaster process failed. This issue was caused by certain
server configuration parameters not being propagated correctly to the segments and has been
fixed in this release.

3726 Snapshot Collision Timeout Causes Failed Segment

In prior 3.1.1.x releases, under very heavy workload concurrent sessions had the potential to
collide waiting for a snapshot of the database at the segment level. Should this occur, the
segment instance would fail with a message in its log such as:
SharedSnapshotAdd: found existing entry for our session-id
As a work around to this issue, a new server configuration parameter has been added,
gp_snapshotadd_timeout which allows you to increase the timeout (in seconds) at the segment.
To increase the timeout, you must set this parameter in the postgresql.conf of all segment
instances and restart your system.

Known Issues in Greenplum Database 3.2

This section lists the customer reported issues that remain unresolved in Greenplum
Database 3.2. Work arounds are provided where applicable.
Table A.11 Known Issues in 3.2

Issue Number Description

1589 PostgreSQL Statistics Views and Functions do not Work as Expected in Greenplum
Database
PostgreSQL has a number of views (pg_stat_* , pg_statio_*) for showing usage statistics. All of
these views only report on the usage of the master (system catalogs), not the usage of user data
on the segments. Many of the PostgreSQL statistics functions have the same problem. For
example, pg_stat_get_tuples_inserted() shows only those inserts into the master (usually 0), not
the number inserted into the table in question.

2115 Query Planner not Choosing Bitmap Indexes

The Greenplum query planner can overestimate the cost of using bitmap index operations. To
favor the use of bitmap indexes, users can temporarily experiment with the following server
configuration parameters before a query run:
gp_segments_for_planner=4 (or less)
enable_hashjoin=off

2516, 796 ORDER BY Clause Ignored in Views

If you create a view with an ORDER BY clause, the ORDER BY clause is ignored when you do a
SELECT from the view.

2535 ALTER TABLE...ALTER COLUMN Disabled for Indexed Columns

Changing the datatype for indexed columns is not allowed. If you need to alter the data type of an
indexed column, Greenplum recommends dropping the index, altering the column datatype, and
then recreating the index.

2553 First Query of a Session Always Requires a Reconnect in readonly Mode

When running in readonly fault tolerance mode and a segment is down, the first query of a
session will always attempt to reconnect when a segment is down. This is because failed
segments are not marked as invalid in the gp_configuration table when running in readonly mode.

Known Issues in Greenplum Database 3.2 13

Greenplum Database 3.2.0.0 Release Notes

Table A.11 Known Issues in 3.2

Issue Number Description

2980 Primary Keys not Enforced on Partitioned Tables

To be able to enforce a primary key or unique constraint, the primary or unique key columns must
start with the partitioning key column.

3125, 3213 Transaction Within a Function Not Recognized as a Sub-Transaction

When a function containing multiple transaction blocks is run and an error occurs in one
transaction block, the entire function exits with the errors:
ERROR: The distributed transaction 'Prepare' broadcast failed to one or more
segments
ERROR: current transaction is aborted, commands ignored until end of
transaction block

3530 Concurrent Vacuum Operations Cause Bitmap Index Errors

In 3.1.1.x releases, multiple concurrent VACUUM operations on tables with a bitmap index can
sometimes cause the following error:
ERROR: could not open relation...
Users who encounter this error must run a REINDEX command on tables with bitmap indexes after
any inserts, updates or loads in order to update the bitmap index.

3592 Confusing Error Message When in Read-only Fault Operational Mode

When a segment failure is detected, the system always issues the following message when it
fails over to the mirror:
WARNING: Greenplum Database detected segment failure(s), system is reconnected
in read-write mode
This error message is confusing because it says the system is in read-write mode even when it is
actually in read-only mode.

3824 Using gp_dump From a 3.2 Installation Fails When Run on 3.1.x.x Databases
Using the 3.2 version of the gp_dump utility on a 3.1.x.x installation fails with the following error:
ERROR: relation "pg_partition_rule" does not exist
As a work around you can use your current 3.1 version of gp_dump against your 3.1 system and
use the 3.2 version of gp_restore to restore your data into a 3.2 system.

3837 gpinitsystem -p Option Does Propagate Parameter Changes to Segments

The gpinitsystem utility has a -p option, which allows you to specify a file containing server
configuration parameter settings to include in the postgresql.conf files of all segments upon
system initialization. This option is currently not passing settings to the segments as expected. As
a work around, you can edit $GPHOME/share/postgresql/postgresql.conf.sample prior to
initialization, copy it to all segment host installations using gpscp, and then run gpinitsystem.

Upgrading to Greenplum Database 3.2 from 3.1

For users currently running Greenplum Database 3.1, there are two available upgrade
paths:
• In-Place Upgrade to 3.2 — Greenplum provides a utility to upgrade an existing
Greenplum Database 3.1.x.x installation to 3.2. This utility makes the required
changes to the system catalogs in place without requiring users to dump and
restore their databases. The upgrade utility is packaged separately from the
Greenplum Database 3.2 distribution and will be generally available within 30
days of the release of 3.2.

Upgrading to Greenplum Database 3.2 from 3.1 14

Greenplum Database 3.2.0.0 Release Notes

• Upgrade via Backup/Restore — For systems that have sufficient disk capacity
to accommodate two simultaneous versions, an upgrade using dump and restore is
another possible upgrade path. This involves taking a backup of your 3.1.x.x
databases and global objects (using gp_dump or gpcrondump), initializing a new
3.2 Greenplum Database system (using gpinitsystem), and then restoring your
databases and global objects into your new 3.2 Greenplum Database system
(using gp_restore or gpdbrestore). See the Greenplum Database
Administrator Guide for more information on the initialization and backup/restore
procedures.

Installing Greenplum Database 3.2 (New Users)

These are the high-level steps to install and initialize a new Greenplum Database
system. For detailed instructions, please see the installation chapter of the Greenplum
Database 3.2 Administrator Guide.
1. Run the installer on the Greenplum Database master host.

2. As root, set the OS tuning parameters for your platform on all Greenplum hosts.

3. Allocate a gpadmin user to own and run your installation. This user must exist on
all Greenplum hosts.
4. Source the greenplum_path.sh file in your gpadmin user profile (.bashrc).
This sets the environment variables needed by Greenplum Database.

5. Create your data directory locations on all Greenplum hosts.

6. Use the gpssh-exkeys utility to exchange SSH keys between all hosts in your
Greenplum array. Note that for a single host demo configuration you still must
exchange ssh keys between the current host and itself.
7. (multi-host configuration only) Use the gpscp and gpssh utilities to copy and
install the Greenplum Database software on all segment hosts.

8. Use the gpinitsystem utility to initialize and start your Greenplum Database
system. This utility requires a configuration file. For example:
gpinitsystem -c gp_init_config
A sample gp_init_config configuration file can be found in
$GPHOME/docs/cli_help/gp_init_config_example. Edit this file to reflect
your desired Greenplum Database array configuration.

Greenplum Database 3.2 Documentation

The following Greenplum Database documentation is available in the $GPHOME/docs
directory of your Greenplum installation, or you can go to http://gpn.greenplum.com
to download the latest documentation:
GPAdminGuide.pdf - Greenplum Database 3.2 Administrator Guide
In 3.2, the previous User, Administrator, Performance and Overview Guides are now
combined into one book, the Greenplum Database 3.2 Administrator Guide.

Installing Greenplum Database 3.2 (New Users) 15

GP Getting Started
No ratings yet
GP Getting Started
56 pages
Data Warehousing With Greenplum
0% (1)
Data Warehousing With Greenplum
95 pages
Data Warehousing With Greenplum 2e
No ratings yet
Data Warehousing With Greenplum 2e
121 pages
Greenplum Database Overview
No ratings yet
Greenplum Database Overview
31 pages
BMC Atrium CMDB 2.1 Common Data Model
100% (1)
BMC Atrium CMDB 2.1 Common Data Model
2 pages
Matrix Perturbation Theory
No ratings yet
Matrix Perturbation Theory
18 pages
Powervault tl2000 tl4000 Usersguide3en-Us PDF
No ratings yet
Powervault tl2000 tl4000 Usersguide3en-Us PDF
288 pages
9 Backups
No ratings yet
9 Backups
72 pages
VSE+BackupExec Protecting Microsoft Applications 2019 06
No ratings yet
VSE+BackupExec Protecting Microsoft Applications 2019 06
14 pages
Knowledge Graph Attention Network For Recommendation
No ratings yet
Knowledge Graph Attention Network For Recommendation
9 pages
SAN Storage & OS Commands Guide
No ratings yet
SAN Storage & OS Commands Guide
11 pages
2 - 2 - 9 - 5 - NVMe - Hpe SSD Job Aid Vsan 340571 001 2019
No ratings yet
2 - 2 - 9 - 5 - NVMe - Hpe SSD Job Aid Vsan 340571 001 2019
2 pages
Dell SC Series Storage Synchronous Replication and Live Volume
No ratings yet
Dell SC Series Storage Synchronous Replication and Live Volume
96 pages
Delphix White Paper Database Virtualization and Instant Cloning
No ratings yet
Delphix White Paper Database Virtualization and Instant Cloning
11 pages
PowerVault ML6000
No ratings yet
PowerVault ML6000
250 pages
Anatomy of Linux Journaling File Systems: Journaling Today and Tomorrow
No ratings yet
Anatomy of Linux Journaling File Systems: Journaling Today and Tomorrow
9 pages
PowerStore - A Rare NVMe NVRAM Drive Firmware Issue May Occur After Multiple Power Loss Events - Dell Singapore
No ratings yet
PowerStore - A Rare NVMe NVRAM Drive Firmware Issue May Occur After Multiple Power Loss Events - Dell Singapore
2 pages
Dell Powerstore 3 0 Spec Sheet
No ratings yet
Dell Powerstore 3 0 Spec Sheet
9 pages
10778A ENU Companion PDF
No ratings yet
10778A ENU Companion PDF
152 pages
Data Domanin Command Guide
No ratings yet
Data Domanin Command Guide
448 pages
Dell Compellent
No ratings yet
Dell Compellent
21 pages
VSE+BackupExec Virtualization 2019 06
No ratings yet
VSE+BackupExec Virtualization 2019 06
30 pages
Veritas Backup Exec Update
No ratings yet
Veritas Backup Exec Update
4 pages
Data Dictionary: Metadata Repository
No ratings yet
Data Dictionary: Metadata Repository
7 pages
Avamar Data Domain Integration Guide
No ratings yet
Avamar Data Domain Integration Guide
92 pages
Veritas Backup Exec 9.0 For Windows Servers Administartor's Guide
No ratings yet
Veritas Backup Exec 9.0 For Windows Servers Administartor's Guide
1,296 pages
Backup Exec 16: Performance Guide
No ratings yet
Backup Exec 16: Performance Guide
47 pages
h16169 Ecs and Data Domain Cloud Tier Architecture Guide
No ratings yet
h16169 Ecs and Data Domain Cloud Tier Architecture Guide
41 pages
Management Console Command Line Interface (MCCLI) Guide
No ratings yet
Management Console Command Line Interface (MCCLI) Guide
170 pages
Appendix A-Agent For Linux
No ratings yet
Appendix A-Agent For Linux
23 pages
Flex Software Admin Guide 45x
No ratings yet
Flex Software Admin Guide 45x
264 pages
Scalar I3 Tape Library Datasheet (DS00509A)
No ratings yet
Scalar I3 Tape Library Datasheet (DS00509A)
2 pages
Dell Compellent Best Practices With VMware VSphere 5.x
No ratings yet
Dell Compellent Best Practices With VMware VSphere 5.x
84 pages
Hybrid Port Knocking
100% (1)
Hybrid Port Knocking
16 pages
0108 Lectura Recomendada #6 - NVMe, SAS, and SATA
No ratings yet
0108 Lectura Recomendada #6 - NVMe, SAS, and SATA
4 pages
H19048 Tanzugreenplum On Powerflex Ra Final1 PDF
No ratings yet
H19048 Tanzugreenplum On Powerflex Ra Final1 PDF
45 pages
Configuring Dell PowerEdge Servers For Low Latency 12132010 Final
No ratings yet
Configuring Dell PowerEdge Servers For Low Latency 12132010 Final
7 pages
DELL SC Storage Administration and Advanced Management
No ratings yet
DELL SC Storage Administration and Advanced Management
2 pages
VCLBook
No ratings yet
VCLBook
462 pages
PowerMax+Family+Administration+ +Remote+Replication 1
No ratings yet
PowerMax+Family+Administration+ +Remote+Replication 1
124 pages
Hpe Proliant Gen11
No ratings yet
Hpe Proliant Gen11
20 pages
dd3300 Competitive Landscape
No ratings yet
dd3300 Competitive Landscape
40 pages
Ecs Ex Series
No ratings yet
Ecs Ex Series
112 pages
Red Hat Linux 2.1 Virtual Memory Guide
No ratings yet
Red Hat Linux 2.1 Virtual Memory Guide
10 pages
Log Shipping
No ratings yet
Log Shipping
5 pages
Veritas Partner Force Program Guide: Reseller Partners
100% (1)
Veritas Partner Force Program Guide: Reseller Partners
36 pages
Adam Implementation Guide v1.0
No ratings yet
Adam Implementation Guide v1.0
81 pages
Data Domain Command Reference Guide 6.0
100% (1)
Data Domain Command Reference Guide 6.0
464 pages
Knowledge Transfer Data Redaction On EnterpriseDB
No ratings yet
Knowledge Transfer Data Redaction On EnterpriseDB
19 pages
Oracle Database 11g Real Application Testing
No ratings yet
Oracle Database 11g Real Application Testing
34 pages
Secure Connect Gateway - Virtual Edition: Troubleshooting Guide
No ratings yet
Secure Connect Gateway - Virtual Edition: Troubleshooting Guide
68 pages
Greenplum Database 42 Release-Notes
No ratings yet
Greenplum Database 42 Release-Notes
23 pages
Greenplum SQL for Data Engineers
No ratings yet
Greenplum SQL for Data Engineers
258 pages
Greenplum DBA Online Course
No ratings yet
Greenplum DBA Online Course
5 pages
Green Plum Advanced Analytics
No ratings yet
Green Plum Advanced Analytics
65 pages
Dbms Paper Work Mehedi
No ratings yet
Dbms Paper Work Mehedi
5 pages
Greenplum Architecture Administration and Implementation
No ratings yet
Greenplum Architecture Administration and Implementation
3 pages
Greenplum: MAD Analytics in Practice: Mmds June 16, 2010
No ratings yet
Greenplum: MAD Analytics in Practice: Mmds June 16, 2010
41 pages
Books That Teach Kids To Write
100% (9)
Books That Teach Kids To Write
200 pages
1logical Data Model - Bank
No ratings yet
1logical Data Model - Bank
25 pages
TM1 Budget Tool: Logging On Guide
No ratings yet
TM1 Budget Tool: Logging On Guide
7 pages
Oracle Data Integrator Overview
100% (3)
Oracle Data Integrator Overview
27 pages
An Overview of Oracle Financials
No ratings yet
An Overview of Oracle Financials
7 pages
Oracle Data Integrator Curriculum
No ratings yet
Oracle Data Integrator Curriculum
5 pages
Oracle Financial Tables
No ratings yet
Oracle Financial Tables
9 pages
Oracle HRMS Absence Solutions
80% (5)
Oracle HRMS Absence Solutions
18 pages
Flat File Source Module
No ratings yet
Flat File Source Module
8 pages
Gude To Grant Writing
100% (1)
Gude To Grant Writing
26 pages
Reference Guide: IBM Cognos TM1
No ratings yet
Reference Guide: IBM Cognos TM1
336 pages
Statistical Analysis With R - A Quick Start
100% (1)
Statistical Analysis With R - A Quick Start
47 pages
RFP Template PDF
No ratings yet
RFP Template PDF
19 pages
B305 Fastexport
0% (1)
B305 Fastexport
26 pages
Sample Prob For Fluid Mech
No ratings yet
Sample Prob For Fluid Mech
1 page
Sheet 2 Prob Power
100% (1)
Sheet 2 Prob Power
2 pages
Thermal Mapping SOP for QA Teams
No ratings yet
Thermal Mapping SOP for QA Teams
18 pages
NPN Epitaxial Kse800/801/803 Silicon Darlington Transistor
No ratings yet
NPN Epitaxial Kse800/801/803 Silicon Darlington Transistor
3 pages
CBSE NET Electronic Science Paper 3 Jan 2017
No ratings yet
CBSE NET Electronic Science Paper 3 Jan 2017
24 pages
Fire Alarm As Existing As Revised
No ratings yet
Fire Alarm As Existing As Revised
4 pages
8 Z 010202 D
No ratings yet
8 Z 010202 D
10 pages
Final Solution Manual
No ratings yet
Final Solution Manual
21 pages
Trimble r6 Gnss Datasheet
No ratings yet
Trimble r6 Gnss Datasheet
2 pages
F-MP-04-52 (C) Equipment Installation Qualification Report
100% (1)
F-MP-04-52 (C) Equipment Installation Qualification Report
6 pages
Space Policy-Sun Tzu's Art of War Applicability To Space-Unclassified
100% (3)
Space Policy-Sun Tzu's Art of War Applicability To Space-Unclassified
99 pages
4H Catalogue
No ratings yet
4H Catalogue
29 pages
Case 721b Single Lever Remote Control Valve - 109508a1
No ratings yet
Case 721b Single Lever Remote Control Valve - 109508a1
4 pages
Overview of Saturne As MDC - 2009-08 - en Final
No ratings yet
Overview of Saturne As MDC - 2009-08 - en Final
22 pages
Dynamic Programming Examples
No ratings yet
Dynamic Programming Examples
18 pages
Sensors 20 04174 v2
No ratings yet
Sensors 20 04174 v2
22 pages
CAT MD6750 Parts Book
50% (2)
CAT MD6750 Parts Book
720 pages
Farmlift 633 Stage IV
100% (2)
Farmlift 633 Stage IV
3,191 pages
Comparative Analysis of Crashworthiness of Empty and Foam-Filled Thin-Walled Tubes PDF
No ratings yet
Comparative Analysis of Crashworthiness of Empty and Foam-Filled Thin-Walled Tubes PDF
7 pages
Inquisitor Vehicle Combat Guide
No ratings yet
Inquisitor Vehicle Combat Guide
8 pages
1-Main-AdSense Simplified PDF
No ratings yet
1-Main-AdSense Simplified PDF
52 pages
Identifying The Factors That Contributed To The Ueberlingen Midair Collision
100% (1)
Identifying The Factors That Contributed To The Ueberlingen Midair Collision
4 pages
Servo (N)
100% (1)
Servo (N)
15 pages
Data and Information Attributes
No ratings yet
Data and Information Attributes
17 pages
The Minimum and Maximum Temperature Refer To The Environmental and Operational Limits A Piece of Equipment Can Handle While Still Functioning Safely and Effectively
No ratings yet
The Minimum and Maximum Temperature Refer To The Environmental and Operational Limits A Piece of Equipment Can Handle While Still Functioning Safely and Effectively
3 pages
MX4141 - MX5141 Service Manual
100% (6)
MX4141 - MX5141 Service Manual
543 pages
Safety Note
No ratings yet
Safety Note
4 pages
Basic Complaint Letter To A Tour Company
No ratings yet
Basic Complaint Letter To A Tour Company
3 pages
OEM X-Ray Solutions Catalog
No ratings yet
OEM X-Ray Solutions Catalog
74 pages
Teac LEDV2282FHD Service Manual
No ratings yet
Teac LEDV2282FHD Service Manual
46 pages