Next generation strategy and solutions: December 2010

Tuesday, December 7, 2010

Difference Between In and Exists - Part 2

SQL> select * from smalltable1 where data1 in (select data5 from largetable);

100 rows selected.

SQL> select * from smalltable1 st where exists (select 1 from largetable lt where st.data1=lt.data5);

100 rows selected.

SQL> select * from largetable lt where data5 in (select data1 from smalltable1 st);

100 rows selected.

SQL> select * from largetable lt where exists (select 1 from smalltable1 st where st.data1=lt.data5);

100 rows selected.

Inference

From the preceding example, it can be inferred that usage of IN and EXISTS provided us with the same results for both Small table to large table and large table to small table.

Scenario 3: Large table to Large table

Now, the last scenario is to compare large table with another large table and observe the results. Let us create another table largetable2 and compare the results.

SQL> select * from largetable where data5 in (select data5 from largetable2);

100000 rows selected.

SQL> select * from largetable lt where exists (select 1 from largetable2 lt2 where lt.data5=lt2.data5);

100000 rows selected.

To Conclude:

In all the three scenarios, except for the negligible difference in statistics, it can be observed that both IN and EXISTS behave equally.

Difference Between In and Exists

Database Version: Oracle 10g
Operating System: Windows XP

I was actually under the impression that there is actually a good difference between IN and EXISTS. But to my surprise, I found something interesting and further digged into the details and found some good facts. I had a wrong conception, which got cleared after the experiment. I actually wanted to share the same with you.

Till Oracle 8i, IN and Exists were processed in different way by Oracle. However from Oracle 9i, predominantly in Oracle 10g they are much processed on the same way, thanks to CBO intelligent optimizer.

For the illustration purpose I have created four tables, two small and two big as follows. Now I will compare small to small, small to big, big to big using both IN and EXISTS and see what are the results.

Scenario 1: Small table to Small table

Table

create table smalltable1 (data1 number(3), data2 varchar2(10), data3 varchar2(10), data4 varchar2(10));

Procedure

create or replace procedure smalltableproc is

counter number(3);

begin

counter := 1;

while (counter <= 100)

loop

insert into smalltable1 values (counter,'aa' || counter,'bb' || counter, 'cc' || counter);

counter := counter + 1;

end loop;

commit;

end;

Index

create index smalltable1data1idx on smalltable1(data1);

Procedure Execution

exec smalltableproc;

Now, I will create the second small table from the first small table as follows:

create table smalltable2 as select * from smalltable1;

create index smalltabled2ata1idx on smalltable2(data1);

Before comparing the results, in general I used to update the statistics for both indexes and tables. Even though, creating the indexes will actually build the statistics, I always make it a point to do this.

SQL> exec DBMS_STATS.gather_table_stats('INXQA','SMALLTABLE1');

PL/SQL procedure successfully completed.

SQL> exec DBMS_STATS.gather_table_stats('INXQA','SMALLTABLE2');

PL/SQL procedure successfully completed.

Now let us see the results between the usage of IN and EXISTS

SQL> select * from smalltable1 where data1 in (select data1 from smalltable2);

100 rows selected.

SQL> select * from smalltable1 ss where exists (select 1 from smalltable2 st where ss.data1=st.data1);

100 rows selected.

Inference

It can be inferred from the above example that both IN and EXISTS provided us with the same results for the smaller tables.

Now, let us take the next comparison of Small table to Big Table and Vice versa.

Scenario 2: Small table to Large table and Large Table to Small Table

Table

create table largetable (data5 number(6), data6 varchar2(10), data7 varchar2(10), data8 varchar2(10));

Procedure

create or replace procedure largetableproc is

counter number(6);

begin

counter := 1;

while (counter <= 100000)

loop

insert into largetable values (counter,'aa' || counter,'bb' || counter, 'cc' || counter);

counter := counter + 1;

end loop;

commit;

end;

Index

create index largetabledata5idx on largetable(data5);

Procedure Execution

SQL> exec largetableproc

PL/SQL procedure successfully completed.

Statistics

SQL> exec DBMS_STATS.gather_table_stats('HARI','LARGETABLE');

PL/SQL procedure successfully completed.