Translate

HASH Join and a NESTED Loop Join


What is the difference between a HASH join and a NESTED loop join?

The major difference between a hash join and a nested loops join is the use of a full-table scan with the hash join.

Hash joins - In a hash join, the Oracle database does a full-scan of the driving table. In certain cases, the hash join will execute faster than a nested loop join, but the hash join uses more RAM resources.

Nested loops join - The nested loops table join is one of the original table join plans and it remains the most common. In a nested loop join, we have two tables a driving table and a secondary table.  The rows are usually accessed from a driving table index range scan, and the driving table result set is then nested within a probe of the second table, normally using an index range scan method.

Some queries will perform faster with NESTED LOOPS joins, some with HASH joins. Let’s take an example with the help of below two tables:

SQL> CREATE TABLE Mytab1(COL_1 NUMBER);
Table created.
SQL> CREATE TABLE Mytab2(COL_1 NUMBER);
Table created.


Let’s Try to do below exercise without inserting any data into above tables.

SQL> SELECT * From Mytab1 Natural Join Mytab2;

Execution Plan
----------------------------------------------------------
Plan hash value: 3193831130
-----------------------------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |        |     1 |    26 |     5  (20)| 00:00:01 |
|*  1 |  HASH JOIN         |        |     1 |    26 |     5  (20)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| MYTAB1 |     1 |    13 |     2   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| MYTAB2 |     1 |    13 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Let’s try it with ORDERED and USE_NL hints:

SQL> SELECT /*+ ORDERED USE_NL(Mytab1) */ * From Mytab1 Natural Join Mytab2;

Execution Plan
----------------------------------------------------------
Plan hash value: 3193831130
-----------------------------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |        |     1 |    26 |     5  (20)| 00:00:01 |
|*  1 |  HASH JOIN         |        |     1 |    26 |     5  (20)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| MYTAB1 |     1 |    13 |     2   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| MYTAB2 |     1 |    13 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Let’s use only USE_NL hint and see the Plan: This will join the tables with NESTED LOOPS.

SQL> SELECT /*+ USE_NL(Mytab1) */ * From Mytab1 Natural Join Mytab2;

Execution Plan
----------------------------------------------------------
Plan hash value: 1448479866
-----------------------------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |        |     1 |    26 |     4   (0)| 00:00:01 |
|   1 |  NESTED LOOPS      |        |     1 |    26 |     4   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| MYTAB2 |     1 |    13 |     2   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS FULL| MYTAB1 |     1 |    13 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Let’s try USE_NL with second table and see the plan:

SQL> SELECT /*+ ORDERED USE_NL(Mytab2) */ * From Mytab1 Natural Join Mytab2;

Execution Plan
----------------------------------------------------------
Plan hash value: 1297376410
-----------------------------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |        |     1 |    26 |     4   (0)| 00:00:01 |
|   1 |  NESTED LOOPS      |        |     1 |    26 |     4   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| MYTAB1 |     1 |    13 |     2   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS FULL| MYTAB2 |     1 |    13 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------

IF you see when used USE_NL hint with second table along with ORDERED hint, it has used NESTED LOOPS which is when used with First table was result in HASH Join.

Let’s try USE_NL Hint without ORDERED Hint:

SQL> SELECT /*+ USE_NL(Mytab2) */ * From Mytab1 Natural Join Mytab2;

Execution Plan
----------------------------------------------------------
Plan hash value: 1297376410
-----------------------------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |        |     1 |    26 |     4   (0)| 00:00:01 |
|   1 |  NESTED LOOPS      |        |     1 |    26 |     4   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| MYTAB1 |     1 |    13 |     2   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS FULL| MYTAB2 |     1 |    13 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------

As you can see, it has used NESTED LOOPS.

Let’s Try inserting data to the above tables and do the same exercise and see what happens.

SQL> BEGIN
  2  FOR i in 1 .. 1000 LOOP
  3  INSERT INTO MYTAB1 Values(i);
  4  END LOOP;
  5  FOR i in 1 .. 1000000 LOOP
  6  INSERT INTO MYTAB2 Values(i);
  7  END LOOP;
  8  END;
  9  /

PL/SQL procedure successfully completed.

SQL> COMMIT;

Commit complete.

SQL> SELECT COUNT(*) From MYTAB1;

  COUNT(*)
----------
      1000
SQL> SELECT COUNT(*) From MYTAB2;
  COUNT(*)
----------
   1000000

Without Hint: This will result in same HASH join as earlier result with Empty tables.

SQL> SELECT  * From Mytab1 Natural Join Mytab2;

Execution Plan
----------------------------------------------------------
Plan hash value: 3193831130
-----------------------------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |        |     1 |    26 |     5  (20)| 00:00:01 |
|*  1 |  HASH JOIN         |        |     1 |    26 |     5  (20)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| MYTAB1 |     1 |    13 |     2   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| MYTAB2 |     1 |    13 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------

ORDERD and USE_NL with First Table: This will result in same HASH join as earlier result with Empty tables.

SQL> SELECT /*+ ORDERED USE_NL(Mytab1) */ * From Mytab1 Natural Join Mytab2;

Execution Plan
----------------------------------------------------------
Plan hash value: 3193831130
-----------------------------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |        |     1 |    26 |     5  (20)| 00:00:01 |
|*  1 |  HASH JOIN         |        |     1 |    26 |     5  (20)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| MYTAB1 |     1 |    13 |     2   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| MYTAB2 |     1 |    13 |     2   (0)| 00:00:01 |


Only USE_NL with First Table: This results in NESTED LOOPS when used without data and HASH JOIN when we inserted data into both the tables.



Get involved and leave your Comments in the Box Below. The more people get involved, the more we all benefit. So, leave your thoughts before you leave the page.

No comments:

Post a Comment