The most intuitive way to compare two tables and report the number of rows they have in common is an inner join:
select 'Investment' as TableName, count(*) as RowCount
from Investment_A a, Investment_B b
where a.col1 = b.col1
  AND a.col2 = b.col2
  AND a.col3 = b.col3
  AND a.col4 = b.col4
This returns the correct answer but is very slow. Is there a better way? Of course!
select 'Investment' as TableName, count(*) as RowCount
from (
  select 1 as num
  FROM (
    select * from Investment_A
    UNION ALL
    select * from Investment_B
  ) tmp
  GROUP BY col1, col2, col3, col4
  HAVING COUNT(*) > 1
) tmp2
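By pushing the comparison into the GROUP BY, we let the DBMS engine do the matching with its grouping machinery instead of a four-column join, which is often far more efficient. Note that the count agrees with the join only when neither table contains duplicate rows across these columns, because a row repeated inside a single table also satisfies HAVING COUNT(*) > 1. There are two drawbacks: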
- The SQL is harder to read.
- Far more temporary storage is used for the GROUP BY. There is a real risk of running out of temporary storage if the tables are large.
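
To see the two queries agree on a small case, here is a minimal sketch that runs both against an in-memory SQLite database through Python's sqlite3 module. The sample rows and the column types are made up for illustration, and SQLite is an assumption; the original queries may target a different DBMS. The last step adds a duplicate row to one table to show the case where the GROUP BY count drifts away from the join count.

```python
import sqlite3

# In-memory database with two tiny tables (hypothetical data and column types).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
for table in ("Investment_A", "Investment_B"):
    cur.execute(
        f"CREATE TABLE {table} (col1 INTEGER, col2 TEXT, col3 TEXT, col4 REAL)"
    )

rows_a = [(1, "bond", "US", 100.0), (2, "stock", "UK", 250.0), (3, "fund", "DE", 75.0)]
rows_b = [(1, "bond", "US", 100.0), (2, "stock", "UK", 999.0), (3, "fund", "DE", 75.0)]
cur.executemany("INSERT INTO Investment_A VALUES (?, ?, ?, ?)", rows_a)
cur.executemany("INSERT INTO Investment_B VALUES (?, ?, ?, ?)", rows_b)

# The join-based comparison from the text.
JOIN_SQL = """
SELECT COUNT(*)
FROM Investment_A a, Investment_B b
WHERE a.col1 = b.col1 AND a.col2 = b.col2
  AND a.col3 = b.col3 AND a.col4 = b.col4
"""

# The UNION ALL + GROUP BY comparison from the text.
GROUP_SQL = """
SELECT COUNT(*)
FROM (
    SELECT 1 AS num
    FROM (
        SELECT * FROM Investment_A
        UNION ALL
        SELECT * FROM Investment_B
    ) tmp
    GROUP BY col1, col2, col3, col4
    HAVING COUNT(*) > 1
) tmp2
"""

def count(sql):
    """Run a single-value COUNT query and return the integer result."""
    return cur.execute(sql).fetchone()[0]

# Rows 1 and 3 are identical in both tables; row 2 differs in col4.
join_count = count(JOIN_SQL)
group_count = count(GROUP_SQL)
print(join_count, group_count)  # 2 2

# A duplicate inside Investment_A alone now forms a group of size 2,
# so the GROUP BY version counts it while the join does not.
cur.execute("INSERT INTO Investment_A VALUES (?, ?, ?, ?)", (2, "stock", "UK", 250.0))
join_after_dup = count(JOIN_SQL)
group_after_dup = count(GROUP_SQL)
print(join_after_dup, group_after_dup)  # 2 3
```

If duplicates within a table are possible, it is worth de-duplicating each side before the UNION ALL, or checking the result against the join on a sample first.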