Search notes:

Oracle: Aggregate function CHECKSUM

The aggregate and analytic function checksum calculates a number from a set of expressions and is intended to detect data changes in a table: if the number changes, the data has changed (or was possibly even tampered with).
checksum is available with 21c.

Simple demonstration

The following simple snippets try to demonstate checksum.
First, we need a table and fill it with some data:
create table tq84_tab (
    id  integer    generated always as identity primary key,
    grp char(1)    not null,
    val number(5,2),
    txt varchar2(10)
);

begin
    insert into tq84_tab (grp, val, txt) values ('A',   1.11, 'abc');
    insert into tq84_tab (grp, val, txt) values ('A',   null, 'def');
    insert into tq84_tab (grp, val, txt) values ('A',   2.22, 'ghi');

    insert into tq84_tab (grp, val, txt) values ('B',  10.10, 'jkl');
    insert into tq84_tab (grp, val, txt) values ('B',   1.01, 'mno');

    insert into tq84_tab (grp, val, txt) values ('C', 444.44,  null);
    insert into tq84_tab (grp, val, txt) values ('C', 555.55, 'pqr');
    commit;
end;
/
Then, we create a view to calculate the checksums. Because we use the group by clause, we get a checksum for each different value of grp.
The expression on which the checksum is calculated is the concatenation of the remaining columns, i. e. all columns except grp (id || '~' || val || '~' || txt):
create or replace view tq84_checksum_v as
select
   grp,
   checksum(id || '~' || val || '~' || txt) checksum_
from
   tq84_tab
group by
   grp
;
Selecting from the view gives the following result:
select * from tq84_checksum_v;
--
-- G  CHECKSUM_
-- - ----------
-- A     388079
-- B     955475
-- C     110315
We safe the calculated checksums in tq84_checksum_safe:
create table tq84_checksum_safe as select * from tq84_checksum_v;
At a later time, some data is altered:
update tq84_tab set val = 22.2 where txt = 'ghi';
commit;
Because we have to original checksums, we're able to at least determine in which group the data was changed:
select
   nvl(v.grp, s.grp) grp,
   case when nvl(v.checksum_, -1) <> nvl(s.checksum_, -1) then 'different' end diff
from
   tq84_checksum_v      v                  full outer join
   tq84_checksum_safe   s on v.grp = s.grp
where
   nvl(v.checksum_, -1) <> nvl(s.checksum_, -1)
;
--
-- G DIFF     
-- - ---------
-- A different
Cleaning up:
drop   view  tq84_checksum_v;
drop   table tq84_checksum_safe;
drop   table tq84_tab;

Achieving independence of order of rows

The returned value of checksum is independent of the order of the input rows. It is very likely that this is achieved by xor-ing the input values. The downside of this approach is that the checksum value of three repeated values is equal to the checksum of that value and the checksum value of two repeated values is zero, as demonsrated with the following statements:
create table tq84_checksum_test (val number);

select checksum(val) from tq84_checksum_test;
--
-- CHECKSUM(VAL)
-- -------------
--

insert into tq84_checksum_test values (1);
select checksum(val) from tq84_checksum_test;
--
-- CHECKSUM(VAL)
-- -------------
--        863352

insert into tq84_checksum_test values (1);
select checksum(val) from tq84_checksum_test;
--
-- CHECKSUM(VAL)
-- -------------
--             0

insert into tq84_checksum_test values (1);
select checksum(val) from tq84_checksum_test;
--
-- CHECKSUM(VAL)
-- -------------
--        863352

insert into tq84_checksum_test values (1);
select checksum(val) from tq84_checksum_test;
--
-- CHECKSUM(VAL)
-- -------------
--             0

insert into tq84_checksum_test values (1);
select checksum(val) from tq84_checksum_test;
--
-- CHECKSUM(VAL)
-- -------------
--        863352

drop   table tq84_checksum_test;

See also

The packages

Index