Greenplum – Architecture and SQL (Genius Series Book 19)

$199.99

Title: Greenplum – Architecture and SQL (Genius Series Book 19)

Format: e-Book

ISBN: 9781940540337

Publisher: Coffing Publishing

Pub. Date: November 2015

Description

Greenplum is the first open-source data warehouse and is now one of the world’s most powerful and widely-used systems. This incredible MPP data warehouse is designed for on-premises systems and the cloud. This book details the Greenplum Data Warehouse’s architecture and the available SQL commands. This book is perfect for anyone who designs, administers, or queries Greenplum. The book educates readers on how to create tables and indexes, how the data is distributed, and how the system processes the data. Plus, it is followed up with over 700 pages of SQL examples and explanations. The Authors Tera-Tom Coffing, who has written over 75 successful books on Data Warehousing, and Leona Coffing, Chief Financial Officer (CFO) of Coffing Data Warehousing, bring a combined 40 years of experience in data warehouse knowledge to create this must-have book.

Topics:

• What is Parallel Processing?
• The Architecture of A Greenplum Data Warehouse
• Greenplum Table Structures
• Best Practices for Choosing a Distribution Key
• Creating a Bitmap Index
• CREATE Statistics
• Major Sort vs. Minor Sorts
• The Like Command Wildcards are Percent and Underscore
• Distinct, Group By, and TOP
• There are Five Aggregates
• Inner and Outer Joins
• Date and Time Functions
• Format Functions
• Analytics and Window Functions
• Temporary Tables
• Subqueries
• Strings
• Interrogating the Data
• Views
• Set Operators
• Creating Tables
• Data Manipulation Language (DML)
• Analyze
• Statistical Aggregate Functions

Greenplum Architecture and SQL Course Outline

Chapter 1 – Introduction to the Greenplum Architecture

What is Parallel Processing?
The Basics of a Single Computer
Data in Memory is fast as Lightning
Parallel Processing Of Data
Symmetric Multi-Processing (SMP) Server
Commodity Hardware Servers are configured for Greenplum
Commodity Hardware Allows For One Segment per CPU
The Master Host
The Segment’s Responsibilities
The Host’s Plan is Either All Segments or a Single Segment
A Table has Columns and Rows
Greenplum has Linear Scalability
The Architecture of A Greenplum Data Warehouse
Nexus is Now Available for Greenplum

Chapter 2 – Greenplum Table Structures

The Concepts of Greenplum Tables
Tables are Either Distributed by Hash or Random
A Hash Distributed Table has A Distribution Key
Picking A Distribution Key That Is Not Very Unique
Random Distribution Uses a Round Robin Technique
Tables Will Be Distributed Among All Segments
The Default For Distribution Chooses the First Column
Table are Either a Heap or Append-Only
Tables are Stored in Either Row or Columnar Format
Creating a Column Oriented Table
Comparing Normal Table vs. Columnar Tables
Columnar can move just One Column Block Into Memory
Segments on Distributions are aligned to Rebuild a Row
Columnar Tables Store Each Column in Separate Blocks
Visualize the Data – Rows vs. Columns
Table Rows are Either Sorted or Unsorted
Creating a Clustered Index in Order to Physically Sort Rows
Physically Ordered Tables Are Faster on Certain Queries
Another Way to Create a Clustered Table
Creating a B-Tree Index and then Running Analyze
Creating a Bitmap Index
Why Create a Bitmap Index?
Tables Can Be Partitioned
A Table Partitioned By Range (Per Month)
A Visual of a Partitioned Table by Range (Month)
Tables Can Be Partitioned by Day
Visualize a Partitioned Table by Day
Creating a Partitioned Table Using a List
Creating a Multi-Level Partitioned Table
Changing a Table to a Partitioned Table
Not Null Constraints
Unique Constraints
Unique Constraints That Fail
Primary Key Constraints
A Primary Key Automatically Creates a Unique Index
Check Constraints
Creating an Automatic Number Called a Sequence
Multiple INSERT example using a Sequence

Chapter 3 – Hashing and Data Distribution

Distribution Keys Hashed on Unique Values Spread Evenly
Distribution Keys with Non-Unique Values Spread Unevenly
Best Practices for Choosing a Distribution Key
The Hash Map Determines which Segment owns the Row
The Hash Map Determines which Node will own the Row
The Hash Map Determines which Node will own the Row
The Hash Map Determines which Node will own the Row
Hash Map Determines which Node will own the Row
A Review of the Hashing Process
Non-Unique Distribution Keys have Skewed Data
Non-Unique Distribution Keys have Skewed Data

Chapter 4 – The Technical Details

Greenplum Limitations
Every Segment has the Exact Same Tables
Tables are Distributed across All Segments
The Table Header and the Data Rows are Stored Separately
Segments Store Rows inside a Data Block Called a Page
To Read a Data Block a Node Moves the Block into Memory
A Full Table Scan Means All Nodes Must Read All Rows
Rows are Organized inside a Page
Moving Data Blocks is Like Checking In Luggage
As Row-Based Tables Get Bigger, the Page Splits
Data Pages are Processed One at a Time per Unit
Creating a Table that is a Heap
Heap Page
Creating a Table that has a Clustered Index
Clustered Index Page
The Row Offset Array is the Guidance System for Every Row
The Row Offset Array Provides Two Search Options (1 of 2)
The Row Offset Array Provides Two Search Options (2 of 2)
The Row Offset Array Helps With Inserts
B-Trees
The Building of a B-Tree for a Clustered Index (1 of 3)
The Building of a B-Tree for a Clustered Index (2 of 3)
The Building of a B-Tree for a Clustered Index (3 of 3)
When Do I Create a Clustered Index?
When Do I Create a Non Clustered Index?
B-Tree for Non Clustered Index on a Clustered Table (1 of 2)
B-Tree for Non Clustered Index on a Clustered Table (2 of 2)
Adding a Non Clustered Index To A
B-Tree for Non Clustered Index on a Heap Table (1 of 2)
B-Tree for Non Clustered Index on a Heap Table (2 of 2)

Chapter 5 – Physical Database Design

The Four Stages of Modeling for Greenplum
The Logical Model
The Logical Model can be loaded inside Nexus
First, Second and Third Normal Form
Quiz – Choose that Normalization Technique
Answer to Quiz – Choose that Normalization Technique
Quiz – What Normalization is it now?
Answer to Quiz – What Normalization is it now?
The Employee_Table and Department_Table can be joined
The Employee_Table and Department_Table Join SQL
The Extended Logical Model Template
User Access is of Great Importance
User Access in Layman’s Terms
User Access for Joins in Layman’s Terms
The Nexus Shows Users the Table’s Distribution Key
Data Demographics Tell Us if the Column is Worthy
Data Demographics – Distinct Rows
Data Demographics – Distinct Rows Query
Data Demographics – Max Rows Null
Data Demographics – Max Rows Null Query
Data Demographics – Max Rows Per Value
Data Demographics – Max Rows Per Value
Data Demographics – Typical Rows Per Value
Typical Rows Per Value Query For Greenplum Systems
SQL to Get the Average Rows Per Value for a Column (Mean)
Data Demographics – Change Rating
Factors When Choosing Greenplum Indexes
Distribution Key Data Demographics Candidate Guidelines
Distribution key Access Considerations
Answer -Three Important distribution key Considerations
Step 1 is to Pick All Potential Distribution Key Columns
Step 1 is to Pick All Potential Distribution Key Columns
Step 2 is to Pick All Potential Secondary Indexes
Answer to 2nd Step to Picking Potential Secondary Indexes
Choose the Distribution Key and Secondary Indexes
3rd Step is to picking your Indexes
Our Index Picks

Chapter 6 – Denormalization

Denormalization
Derived Data
Repeating Groups
Pre-Joining Tables
Storing Summary Data with a Trigger
Summary Tables or Data Marts the Old Way
Horizontal Partitioning the Old Way
Horizontal Partitioning the New Way
Vertical Partitioning the Old Way
Columnar Tables Are the New Vertical Partitioning

Chapter 7 – Nexus

Nexus is Available on the Cloud
Nexus Queries Every Major System
How to Use Nexus
Why is Nexus Special? Visualization and Automatic SQL
Why is Nexus Special? Cross-System Joins
Why is Nexus Special? The Amazing Hub System
Why is Nexus Special? Save Answer Sets as Tables
Why is Nexus Special? Automated Data Movement
Why is Nexus Special? Nexus makes the Servers Talk Directly
What Makes Nexus Special? The Garden of Analysis
The Garden of Analysis Grouping Sets Tab
The Garden of Analysis – Grouping Sets Answer Sets
The Garden of Analysis – Join Tab (1 of 4)
The Garden of Analysis – Join Tab (2 of 4)
The Garden of Analysis – Join Tab (3 of 4)
The Garden of Analysis – Join Tab (4 of 4)
The Garden of Analysis – Charts/Graphs Tab (1 of 4)
The Garden of Analysis – Charts/Graphs Tab (2 of 4)
The Garden of Analysis – Charts/Graphs Tab (3 of 4)
The Garden of Analysis – Charts/Graphs Tab (4 of 4)
The Garden of Analysis – Dynamic Charts Tab (1 of 4)
The Garden of Analysis – Dynamic Charts Tab (2 of 4)
The Garden of Analysis – Dynamic Charts Tab (3 of 4)
The Garden of Analysis – Dynamic Charts Tab (4 of 4)
The Garden of Analysis – Dashboard Tab (1 of 5)
The Garden of Analysis – Dynamic Charts Tab (2 of 5)
The Garden of Analysis – Dynamic Charts Tab (3 of 5)
The Garden of Analysis – Dynamic Charts Tab (4 of 5)
The Garden of Analysis – Dynamic Charts Tab (5 of 5)
Getting to the Super Join Builder
The Super Join Builder is the First Entry in the Menu
The Super Join Builder Shows Tables Visually
Using the Add Join Button
What to Do When No Tables are Joinable?
Drag a Joinable Object into the Super Join Builder
You Will See the Add Custom Join Window
Defining the Join Columns
Your Tables Will Appear Together
Select the Columns You Want on the Report
Check out the SQL Tab to See the SQL that has been built
SQL Tab
Hit Execute to get the Report inside the Super Join Builder
The Report is delivered inside the Super Join Builder
Let’s Join Two Tables Again (1 of 6)
Let’s Join Two Tables Again (2 of 6)
Let’s Join Two Tables Again (3 of 6)
Let’s Join Two Tables Again (4 of 6)
Let’s Join Two Tables Again (5 of 6)
Let’s Join Two Tables Again (6 of 6)
The Tabs of the Super Join Builder Philosophy – One Query
The Tabs of the Super Join Builder – Objects Tab
The Tabs of the Super Join Builder – Columns Tab)
The Tabs of the Super Join Builder – Sorting Tab
The Tabs of the Super Join Builder – Joins Tab
The Tabs of the Super Join Builder – SQL Tab
The Tabs of the Super Join Builder – Metadata Tab
The Tabs of the Super Join Builder – Analytics Tab
The Tabs of the SJB – Analytics Tab – OLAP Screen
Getting a Simple CSUM in the Analytics Tab – OLAP
Getting a Simple CSUM – The SQL Automatically Generated
The Answer Set of the CSUM
Getting all of the OLAP functions in the Analytics Tab
A Five Table Join Using the Menu
The First Table is placed in the Super Join Builder
Using the Add Join Cascading Menu
All Five Tables Are In the Super Join Builder
A Five Table Join Two Steps (Cube)
Choose Cube with Columns from the Left Top of the Table
All Tables are Cubed (Joined Together Instantly)
Choose Cube and then Choose Your Columns
Create Cube – Tables Are Joined Without Columns Selected
Create Cube – Select the Columns You Want on the Report
How to join Greenplum, Oracle and SQL Server Tables
The Greenplum Table is now in the Super Join Builder
Drag the Joining Oracle Table to the Super Join Builder
Defining the Join Columns
Choose the Columns You Want on Your Report
Let’s Add a SQL Server Table to our Teradata and Oracle Join
Defining the Join Columns
All Three Tables are now in the Super Join Builder
Change the Hub and Run the Join on Oracle
Change the Hub and Run the Join on SQL Server
Simply Amazing – Change the Hub to the Garden of Analysis
Have the Answer Set Saved Automatically to Any System
Saving the Answer Set to an Oracle or SQL Server System
Saving the Answer Set to a Greenplum System
Saving the Answer Set to a Teradata System

Chapter 8 – The Basics of SQL

Introduction
SELECT * (All Columns) in a Table
Fully Qualifying a Database, Schema and Table
SELECT Specific Columns in a Table
Commas in the Front or Back?
Place your Commas in front for better Debugging Capabilities
Sort the Data with the ORDER BY Keyword
ORDER BY Defaults to Ascending
Use the Name or the Number in your ORDER BY Statement
Two Examples of ORDER BY using Different Techniques
Changing the ORDER BY to Descending Order
NULL Values sort First in Ascending Mode (Default)
NULL Values sort Last in Descending Mode (DESC)
Major Sort vs. Minor Sorts
Multiple Sort Keys using Names vs. Numbers
Sorts are Alphabetical, NOT Logical
Using A CASE Statement to Sort Logically
How to ALIAS a Column Name
A Missing Comma can by Mistake become an Alias
Comments using Double Dashes are Single Line Comments
Comments for Multi-Lines
Comments for Multi-Lines as Double Dashes Per Line
A Great Technique for Comments to Look for SQL Errors

Chapter 9 – The WHERE Clause

The WHERE Clause limits Returning Rows
Double Quoted Aliases are for Reserved Words and Spaces
Character Data needs Single Quotes in the WHERE Clause
Character Data needs Single Quotes, but Numbers Don’t
Comparisons against a Null Value
NULL means UNKNOWN DATA so Equal (=) won’t Work
Use IS NULL or IS NOT NULL when dealing with NULLs
NULL is UNKNOWN DATA so NOT Equal won’t Work
Use IS NULL or IS NOT NULL when dealing with NULLs
Using Greater Than or Equal To (>=)
AND in the WHERE Clause
Troubleshooting AND
OR in the WHERE Clause
Troubleshooting Or
Troubleshooting Character Data
Using Different Columns in an AND Statement
Quiz – How many rows will return?
Answer to Quiz – How many rows will return?
What is the Order of Precedence?
Using Parentheses to change the Order of Precedence
Using an IN List in place of OR
The IN List is an Excellent Technique
IN List vs. OR brings the same Results
The IN List Can Use Character Data
Using a NOT IN List
Null Values in a NOT IN List Bring Back No Rows
A Technique for Handling Nulls with a NOT IN List
BETWEEN is Inclusive
NOT BETWEEN is Also Inclusive
LIKE uses Wildcards Percent ‘%’ and Underscore ‘_’
LIKE command Underscore is Wildcard for one Character
The ilike Command
LIKE Command Works Differently on Char Vs Varchar
Troubleshooting LIKE Command on Character Data
Introducing the TRIM Command
Introducing the RTRIM Command
Quiz – What Data is Left Justified and what is Right?
Numbers are Right Justified and Character Data is Left
Answer – What Data is Left Justified and what is Right?
An example of Data with Left and Right Justification
A Visual of CHARACTER Data vs. VARCHAR Data
Use the TRIM command to remove spaces on CHAR Data
Escape Character in the LIKE Command changes Wildcards
Escape Characters Turn off Wildcards in the LIKE Command
Quiz – Turn off that Wildcard
ANSWER – To Find that Wildcard
Introducing the RTRIM Command
Quiz – What Data is Left Justified and What is Right?
Numbers are Right Justified and Character Data is Left
Answer – What Data is Left Justified and what is Right?
An example of Data with Left and Right Justification
A Visual of CHARACTER Data vs. VARCHAR Data
RTRIM command Removes Trailing spaces on CHAR Data
Using Like with an AND Clause to Find Multiple Letters
Using Like with an OR Clause to Find Either Letters

Chapter 10 – Distinct vs. Group By

The Distinct Command
Distinct vs. GROUP BY
Quiz – How many rows come back from the Distinct?
Answer – How many rows come back from the Distinct?

Chapter 11 – Aggregation

Quiz – You calculate the Answer Set in your own Mind
Answer – You calculate the Answer Set in your own Mind
Quiz – You calculate the Answer Set in your own Mind
Answer – You calculate the Answer Set in your own Mind
The 3 Rules of Aggregation
There are Five Aggregates
Quiz – How many rows come back?
Answer – How many rows come back?
Troubleshooting Aggregates
GROUP BY when Aggregates and Normal Columns Mix
GROUP BY delivers one row per Group
GROUP BY Dept_No or GROUP BY 1 the same thing
Limiting Rows and Improving Performance with WHERE
WHERE Clause in Aggregation limits unneeded Calculations
Keyword HAVING tests Aggregates after they are totaled
Aggregates Return Null on Empty Tables
Keyword HAVING is like an Extra WHERE Clause for Totals
Keyword HAVING tests Aggregates after they are totaled
Getting the Average Values per Column
Three types of Advanced Grouping
Group By Grouping Sets
Group By Rollup
GROUP BY Rollup Result Set
GROUP BY Cube
GROUP BY CUBE Result Set
GROUP BY CUBE Result Set
Quiz – GROUP BY GROUPING SETS Challenge
Answer To Quiz – GROUP BY GROUPING SETS Challenge

Chapter 12 – Join Functions

Greenplum Join Quiz
Greenplum Join Quiz Answer
Redistribution
Big Table Small Table Join Strategy
Duplication of the Smaller Table across All-Distributions
If the Join Condition is the Distribution Key no Movement
Matching Rows That Are On The Same Node Naturally
What if the Join Condition Columns are Not distribution keyes
Strategy 1 of 4 – The Merge Join
Quiz – Redistribute the Employees by their Dept_No
Quiz – Employees’ Dept_No landed on segment with Matches
Quiz – Redistribute the Orders to the Proper segment
Answer to Redistribute the Employees by their Dept_No Quiz
Strategy 2 of 4 – The Hash Join
Strategy 3 of 4 – The Nested Join
Strategy 4 of 4 – The Product Join
A Two-Table Join Using Traditional Syntax
A two-table join using Non-ANSI Syntax with Table Alias
You Can Fully Qualify All Columns
A two-table join using ANSI Syntax
Both Queries have the same Results and Performance
Quiz – Can You Finish the Join Syntax?
Answer to Quiz – Can You Finish the Join Syntax?
Quiz – Can You Find the Error?
Answer to Quiz – Can You Find the Error?
Super Quiz – Can You Find the Difficult Error?
Answer to Super Quiz – Can You Find the Difficult Error?
Quiz – Which rows from both tables won’t return?
Answer to Quiz – Which rows from both tables won’t return?
LEFT OUTER JOIN
LEFT OUTER JOIN Results
RIGHT OUTER JOIN
RIGHT OUTER JOIN Example and Results
FULL OUTER JOIN
FULL OUTER JOIN Results
Which Tables are the Left and which Tables are Right?
Answer – Which Tables are the Left and Which are the Right?
INNER JOIN with Additional AND Clause
ANSI INNER JOIN with Additional AND Clause
ANSI INNER JOIN with Additional WHERE Clause
OUTER JOIN with Additional WHERE Clause
OUTER JOIN with Additional AND Clause
OUTER JOIN with Additional AND Clause Results
Quiz – Why is this considered an INNER JOIN?
Evaluation Order for Outer Queries
The DREADED Product Join
The DREADED Product Join Results
The Horrifying Cartesian Product Join
The ANSI Cartesian Join will ERROR
Quiz – Do these Joins Return the Same Answer Set?
Answer – Do these Joins Return the Same Answer Set?
The CROSS JOIN
The CROSS JOIN Answer Set
The Self Join
The Self Join with ANSI Syntax
Quiz – Will both queries bring back the same Answer Set?
Answer – Will both queries bring back the same Answer Set?
Quiz – Will both queries bring back the same Answer Set?
Answer – Will both queries bring back the same Answer Set?
How would you Join these two tables?
An Associative Table is a Bridge that Joins Two Tables
Quiz – Can you write the 3-Table Join?
Answer to Quiz – Can you Write the 3-Table Join?
Quiz – Can you write the 3-Table Join to ANSI Syntax?
Answer – Can you Write the 3-Table Join to ANSI Syntax?
Quiz – Can you Place the ON Clauses at the End?
Answer – Can you Place the ON Clauses at the End?
The 5-Table Join – Logical Insurance Model
Quiz – Write a Five Table Join Using ANSI Syntax
Answer – Write a Five Table Join Using ANSI Syntax
Quiz – Write a Five Table Join Using Non-ANSI Syntax
Answer – Write a Five Table Join Using Non-ANSI Syntax
Quiz –Re-Write this putting the ON clauses at the END
Answer –Re-Write this putting the ON clauses at the END

Chapter 13 – Date Functions

Current_Date
Current_Date and Current_Time
Current_Date and Current_Timestamp
The Many Different Ways to Look at a Timestamp
Current_Time vs. LocalTime with Precision
Local_Time and Local_Timestamp With Precision
Now () and Timeofday () Functions
Adding A Week to a Date
Add or Subtract Days from a date
Formatting Dates and Dollar Amounts
The EXTRACT Command
EXTRACT from DATES and TIME
EXTRACT Command on the Century
EXTRACT Command for the Decade, DOW and DOY
EXTRACT Microseconds, Milliseconds and Millennium
EXTRACT of the Month on Aggregate Queries
Date_part Command
Date_Trunc Command with Time
Date_Trunc Command with Dates
The AGE Command
AGE Challenge
AGE Challenge Results
Epoch
Using Intervals
More Interval Examples
Interval Arithmetic Results
A Complex Time Interval example using CAST
The OVERLAPS Command
An OVERLAPS example that Returns No Rows
The OVERLAPS Command using TIME
Using both CAST and CONVERT in Literal Values
A Better Technique for YEAR, MONTH, and DAY Functions

Chapter 14 – Conversions and Formatting

Postgres Conversion Functions
Postgres Conversion Function Templates
Postgres Conversion Function Templates Continued
To_Char command Examples
Formatting A Date with To_Char
Formatting A Date With To_Char Continued
To_Number
To_Number Examples
To_Date
To_Timestamp
Numeric Manipulation Functions
Finding the Cube Root
Ceiling Gets the Smallest Integer Not Smaller Than X
Floor Finds the Largest Integer Not Greater Than X
The Round Function and Precision

Chapter 15 – Sub-query Functions

An IN List is much like a Subquery
An IN List Never has Duplicates – Just like a Subquery
An IN List Ignores Duplicates
The Subquery
The Three Steps of How a Basic Subquery Works
These are Equivalent Queries
The Final Answer Set from the Subquery
Quiz- Answer the Difficult Question
Answer to Quiz- Answer the Difficult Question
Should you use a Subquery of a Join?
Quiz- Write the Subquery
Answer to Quiz- Write the Subquery
Quiz- Write the More Difficult Subquery
Answer to Quiz- Write the More Difficult Subquery
Quiz – Write the Extreme Subquery
Answer to Quiz – Write the Extreme Subquery
Quiz- Write the Subquery with an Aggregate
Answer to Quiz- Write the Subquery with an Aggregate
Quiz- Write the Correlated Subquery
Answer to Quiz- Write the Correlated Subquery
The Basics of a Correlated Subquery
The Top Query always runs first in a Correlated Subquery
Correlated Subquery Example vs. a Join with a Derived Table
Quiz- A Second Chance to Write a Correlated Subquery
Answer – A Second Chance to Write a Correlated Subquery
Quiz- A Third Chance to Write a Correlated Subquery
Answer – A Third Chance to Write a Correlated Subquery
Quiz- Last Chance To Write a Correlated Subquery
Answer – Last Chance to Write a Correlated Subquery
Quiz – Write the Extreme Correlated Subquery
Answer To Quiz – Write the Extreme Correlated Subquery
Quiz- Write the NOT Subquery
Answer to Quiz- Write the NOT Subquery
Quiz- Write the Subquery using a WHERE Clause
Answer – Write the Subquery using a WHERE Clause
Quiz- Write the Subquery with Two Parameters
Answer to Quiz- Write the Subquery with Two Parameters
How the Double Parameter Subquery Works
More on how the Double Parameter Subquery Works
Quiz – Write the Triple Subquery
Answer to Quiz – Write the Triple Subquery
Quiz – How many rows return on a NOT IN with a NULL?
Answer – How many rows return on a NOT IN with a NULL?
How to handle a NOT IN with Potential NULL Values
IN is equivalent to =ANY
Using a Correlated Exists
How a Correlated Exists matches up
The Correlated NOT Exists
Quiz – How many rows come back from this NOT Exists?
Answer – How many rows come back from this NOT Exists?

Chapter 16 – Analytics

CSUM
CSUM – The Sort Explained
CSUM – Rows Unbounded Preceding Explained
CSUM – Making Sense of the Data
CSUM – Making Even More Sense of the Data
CSUM – The Major and Minor Sort Key(s)
The ANSI CSUM – Getting a Sequential Number
Troubleshooting The ANSI OLAP on a GROUP BY
Reset with a PARTITION BY Statement
PARTITION BY only Resets a Single OLAP not ALL of them
Moving SUM
ANSI Moving Window is Current Row and Preceding n Rows
How ANSI Moving SUM Handles the Sort
Quiz – How is that Total Calculated?
Answer to Quiz – How is that Total Calculated?
Moving SUM every 3-rows Vs a Continuous Average
Partition By Resets an ANSI OLAP
Both the Greenplum Moving Average and ANSI Version
Moving Average
The Moving Window is Current Row and Preceding
How Moving Average Handles the Sort
Quiz – How is that Total Calculated?
Answer to Quiz – How is that Total Calculated?
Quiz – How is that 4th Row Calculated?
Answer to Quiz – How is that 4th Row Calculated?
Moving Average every 3-rows Vs a Continuous Average
Partition By Resets an ANSI OLAP
Moving Difference using ANSI Syntax with Partition By
RANK Defaults to Ascending Order
Getting RANK to Sort in DESC Order
RANK OVER and PARTITION BY
RANK and DENSE RANK
PERCENT_RANK OVER
PERCENT_RANK OVER with 14 rows in Calculation
PERCENT_RANK OVER with 21 rows in Calculation
Quiz – What Causes the Product_ID to Reset?
Answer to Quiz – What Cause the Product_ID to Reset?
COUNT OVER for a Sequential Number
Troubleshooting COUNT OVER
Quiz – What caused the COUNT OVER to Reset?
Answer to Quiz – What caused the COUNT OVER to Reset?
The MAX OVER Command
MAX OVER with PARTITION BY Reset
Troubleshooting MAX OVER
The MIN OVER Command
Troubleshooting MIN OVER
Finding a Value of a Column in the Next Row with MIN
Quiz – Fill in the Blank
Answer – Fill in the Blank
The Row_Number Command
Using a Derived Table and Row_Number
Quiz – How did the Row_Number Reset?
Answer – How did the Row_Number Reset?
Ordered Analytics OVER
CURRENT ROW AND UNBOUNDED FOLLOWING
Different Windowing Options
The CSUM for Each Product_Id and the Next Start Date
How Ntile Works
Ntile
Ntile Continued
Ntile Percentile
Another Ntile example
Using Tertiles (Partitions of Four)
NTILE
NTILE Using a Value of 10
NTILE With a Partition
Using FIRST_VALUE
FIRST_VALUE
FIRST_VALUE after Sorting by the Highest Value
FIRST_VALUE with Partitioning
Using LAST_VALUE
LAST_VALUE
Using LEAD
Using LEAD With and Offset of 2
LEAD
LEAD With Partitioning
Using LAG
Using LAG with an Offset of 2
LAG
LAG with Partitioning
CUME_DIST
CUME_DIST with a Partition
SUM (SUM(n))

Chapter 17 – Temporary Tables

There are Two Types of Temporary Tables
CREATING A Derived Table
Naming the Derived Table
Aliasing the Column Names in the Derived Table
Multiple Ways to Alias the Columns in a Derived Table
CREATING a Derived Table using the WITH Command
The Same Derived Query shown Three Different Ways
Most Derived Tables Are Used To Join To Other Tables
The Three Components of a Derived Table
Visualize This Derived Table
A Derived Table and CAST Statements
A Derived example Using the WITH Syntax
Quiz – Answer the Questions
Answer to Quiz – Answer the Questions
Clever Tricks on Aliasing Columns in a Derived Table
An example of Two Derived Tables in a Single Query
MULTIPLE Derived Tables using the WITH Command
Finding the First Occurrence
Finding the Last Occurrence
Three Steps to Creating a Temporary Table
Three Versions of Creating a Temporary Table
ON COMMIT PRESERVE ROWS is the Greenplum Default
ON COMMIT DELETE ROWS
How to Use the ON COMMIT DELETE ROWS Option
ON COMMIT DROP
How to Use the ON COMMIT DROP Option
Create Table AS
Creating a Temporary Table Using a CTAS that Joins Multiple Tables
Create Table LIKE
Creating a Clustered Index on a Temporary Table

Chapter 18 – Character Strings

The LENGTH Command Counts Characters
The LENGTH Command – Spaces can Count too
The LENGTH Command Doesn’t Count Trailing Spaces
UPPER and LOWER Commands
Using the LOWER Command
A LOWER Command Example
Using the UPPER Command
An UPPER Command Example
Non-Letters are Unaffected by UPPER and LOWER
The CHARACTERS Command Counts Characters
The CHARACTERS Command and Character Data
CHARACTER_LENGTH and OCTET_LENGTH
The TRIM Command trims both Leading and Trailing Spaces
Trim Combined with the CHARACTERS Command
How to TRIM only the Trailing Spaces
REGEXP_REPLACE
Concatenation
A Visual of the TRIM Command Using Concatenation
Trim and Trailing is Case Sensitive
How to TRIM Trailing Letters
The SUBSTRING Command
SUBSTRING and SUBSTR are equal, but use different syntax
How SUBSTRING Works with NO ENDING POSITION
Using SUBSTRING to move backwards
How SUBSTRING Works with a Starting Position of -1
How SUBSTRING Works with an Ending Position of 0
An example using SUBSTRING, TRIM and CHAR Together
The POSITION Command finds a Letters Position
Concatenation
Concatenation and SUBSTRING
Four Concatenations Together
Troubleshooting Concatenation

Chapter 19 – Interrogating the Data

Quiz – What would the Answer be?
Answer to Quiz – What would the Answer be?
The NULLIF Command
Quiz – Fill in the Answers for the NULLIF Command
Answer– Fill in the Answers for the NULLIF Command
The COALESCE Command – Fill In the Answers
The COALESCE Answer Set
COALESCE is Equivalent to This CASE Statement
The COALESCE Command
The COALESCE Answer Set
The COALESCE Quiz
Answer – The COALESCE Quiz
The Basics of CAST (Convert and Store)
Some Great CAST (Convert and Store) Examples
Some Great CAST (Convert and Store) Examples
Some Great CAST (Convert and Store) example
Quiz – The Basics of the CASE Statements
Answer to Quiz – The Basics of the CASE Statements
Using an ELSE in the Case Statement
Using an ELSE as a Safety Net
Rules for a Valued Case Statement
Rules for a Searched Case Statement
Valued Case Vs. A Searched Case
Quiz – Valued Case Statement
Answer – Valued Case Statement
Quiz – Searched Case Statement
Answer – Searched Case Statement
The CASE Challenge
The CASE Challenge Answer
Combining Searched Case and Valued Case
A Trick for getting a Horizontal Case
Nested Case

Chapter 20 – Set Operators Functions

Rules of Set Operators
Rules of Set Operators
INTERSECT Explained Logically
INTERSECT Explained Logically
UNION Explained Logically
UNION Explained Logically
UNION ALL Explained Logically
UNION ALL Explained Logically
EXCEPT Explained Logically
EXCEPT Explained Logically
An Equal Amount of Columns in both SELECT List
Columns in the SELECT list should be from the same Domain
The Top Query handles all Aliases
The Bottom Query does the ORDER BY (a Number)
Great Trick: Place your Set Operator in a Derived Table
UNION Vs UNION ALL
Using UNION ALL and Literals
A Great example of how EXCEPT works
Quiz – Build that Query
Answer To Quiz – Build that Query
USING Multiple SET Operators in a Single Request
Changing the Order of Precedence with Parentheses
Using UNION ALL for speed in Merging Data Sets

Chapter 21 – View Functions

The Fundamentals of Views
Creating a Simple View to Restrict Sensitive Columns
Creating a Simple View to Restrict Rows
Basic Rules for Views
Exception to the ORDER BY Rule inside a View
Views sometimes CREATED for Formatting
Creating a View to Join Tables Together
Another Way to Alias Columns in a View CREATE
The Standard Way Most Aliasing is done
What Happens When Both Aliasing Options Are Present
Resolving Aliasing Problems in a View CREATE
Answer to Resolving Aliasing Problems in a View CREATE
Aggregates on View Aggregates
Altering A Table
Altering a Table after a View has been Created
A View that Errors after an ALTER

Chapter 22 – Table Create and Data Types

Greenplum Has Only Two Distribution Policies
Creating a Table with a Single Column Distribution Key
The Default Table Storage is a Heap
Creating a Table With a Multi-Column Distribution Key
Creating a Table with Random Distribution
Creating a Table with No Distribution Key
Guidelines for Partitioning a Table
Creating a Partitioned Table Using a Range
A Visual of One Year of Data with Range Partitioning
Creating a Partitioned Table Using a Range Per Day
A Visual of One Year of Data with Range per Day
Creating a Partitioned Table Using a List
Creating a Multi-Level Partitioned Table
Changing a Table to a Partitioned Table
Not Null Constraints
Unique Constraints
Primary Key Constraints
Check Constraints
Append Only Tables
Storage is Either Row, Column, or a Combination of Both
Column-Orientated Tables
CREATE INDEX Syntax
CREATE INDEX Syntax
Create Table LIKE
Greenplum Data Types
Greenplum Data Types Continued
Greenplum Data Types Continued
Greenplum Data Types Continued
Greenplum Data Types Continued

Chapter 23 – Data Manipulation Language (DML)

INSERT Syntax # 1
INSERT example with Syntax 1
INSERT Syntax # 2
INSERT example with Syntax 2
INSERT example with Syntax 3
INSERT/SELECT Command
INSERT/SELECT example using All Columns (*)
INSERT/SELECT example with Less Columns
The UPDATE Command Basic Syntax
Two UPDATE Examples
Subquery UPDATE Command Syntax
Example of Subquery UPDATE Command
Join UPDATE Command Syntax
Example of an UPDATE Join Command
Fast UPDATE
The DELETE Command Basic Syntax
DELETE and TRUNCATE Examples
To DELETE or to TRUNCATE
Subquery and Join DELETE Command Syntax
Example of Subquery DELETE Command

Chapter 24 – ANALYZE and VACUUM

ANALYZE
ANALYZE Options
What Columns Should You Analyze?
Why Analyze?
VACUUM
VACUUM Options

Chapter 25 – Greenplum Explain

How to See an EXPLAIN Plan
The Eight Rules to Reading an EXPLAIN Plan
Interpreting Keywords in an EXPLAIN Plan
Interpreting an EXPLAIN Plan
A Single Segment Retrieve – The Fastest Query
EXPLAIN With an ORDER BY Statement
EXPLAIN ANALYZE
EXPLAIN With a Range Query on a Table Partitioned By Day
EXPLAIN That Uses a B-Tree Index Scan
EXPLAIN That Uses a Bitmap Scan
EXPLAIN With a Simple Subquery
EXPLAIN With a Columnar Query
EXPLAIN With a Clustered Index
The Most Important Concept for Joins is the Distribution Key
EXPLAIN With Join that has to Move Data
EXPLAIN With Join that has to Move Data
Changing the Join Query Changes the EXPLAIN Plan
Analyzing the Tables Structures For a 3-Table Join
An EXPLAIN For a 3-Table Join
Explain of a Derived Table vs. a Correlated Subquery
Explain of the Correlated Subquery
Explain of the Derived Table

Chapter 26 – Statistical Aggregate Functions

The Stats Table
The STDDEV_POP Function
A STDDEV_POP Example
The STDDEV_SAMP Function
A STDDEV_SAMP Example
The VAR_POP Function
A VAR_POP Example
The VAR_SAMP Function
A VAR_SAMP Example
The VARIANCE Function
A VARIANCE Example
The CORR Function
A CORR Example
Another CORR Example so you can Compare
The COVAR_POP Function
A COVAR_POP Example
Another COVAR_POP Example so you can Compare
The COVAR_SAMP Function
A COVAR_SAMP Example
Another COVAR_SAMP Example so you can Compare
The REGR_INTERCEPT Function
A REGR_INTERCEPT Example
Another REGR_INTERCEPT Example so you can Compare
The REGR_SLOPE Function
A REGR_SLOPE Example
Another REGR_SLOPE Example so you can Compare
The REGR_AVGX Function
A REGR_AVGX Example
Another REGR_AVGX Example so you can Compare
The REGR_AVGY Function
A REGR_AVGY Example
Another COVAR_POP Example so you can Compare
The REGR_COUNT Function
A REGR_COUNT Example
The REGR_R2 Function
A REGR_R2 Example
The REGR_SXX Function
A REGR_SXX Example
The REGR_SXY Function
A REGR_SXY Example
The REGR_SYY Function
A REGR_SYY Example
Using GROUP BY