Search notes:

Oracle: Reading CSV files with external tables

This example tries to demonstrate how CSV files can be read with external tables
First, in the filesystem where Oracle is running, we create a directory, …
# su - oracle
$ mkdir /opt/oracle/ext 
… and make sure that only oracle can see its contents …
$ chmod 700 /opt/oracle/ext
… and create in this directory CSV with the data we want to import into the database:
$ cat <<CSV > /opt/oracle/ext/file.csv
1,one,first
2,two,second
3,three,third
4,four,fourth
CSV
Now, in SQL, a user with sufficient permissions create a directory object and grants read (at least) and write to the user that wants the external table:
connect / as sysdba
create directory ext_dir as '/opt/oracle/ext';
grant read on directory ext_dir to rene;
Create the external table:
connect rene/rene

create table ext_table_csv (
   i   number,
   n   varchar2(20),
   m   varchar2(20)
)
organization external (
   type              oracle_loader
   default directory ext_dir
   access parameters (
      records delimited  by newline
      nologfile                         -- Do not write log file
      nobadfile                         -- bad file when selecting from the external table.
      fields  terminated by ','
      missing field values are null
   )
   location ('file.csv')
)
-- reject limit 0 /* 0 is the default anyway */
;
select * from ext_table_csv;
Cleaning up:
drop directory ext_dir;
drop table ext_table_csv;

Reading CSV files while they are being written

The purpose of the following bash script is to constantly write to a CSV file without closing it.
I had hoped that Oracle would recognize that there is another process writing the file and would wait until the writing process is finished - but that was unfortunately not the case.
#!/bin/bash

# Open the file for writing via file descriptor 3 (so that it remains opened)
exec 3>'/opt/oracle/ext/file.csv'

while true; do

   i=$((RANDOM % 100))
   n=$(date +%s)
   m=$(head /dev/urandom | tr -dc 'A-Za-z0-9' | head -c 8) # random string of 8 characters

   echo "$i,$n,$m" >&3

   sleep 1
done

Setting up a scheduler job to import CSV data

create table imp_table_csv (
   i   number,
   n   varchar2(20),
   m   varchar2(20)
);

create table imp_log (
   dt   date,
   msg  varchar2(123)
);


create or replace package csv_importer authid definer as
   procedure import_file;
end csv_importer;
/

create or replace package body csv_importer as

   procedure log(msg varchar2) is
      pragma autonomous_transaction;
   begin
      insert into imp_log values (sysdate, msg);
      commit;
   end log;

   procedure import_file is
   -- TODO: Catch
   --       ORA-29913: error in executing ODCIEXTTABLEFETCH callout
   --       ORA-30653: reject limit reached   

      exs   boolean;
      len   number;
      bls   binary_integer;
      
   begin
   
      utl_file.fgetattr(
         location    => 'EXT_DIR',
         filename    => 'file.csv',
         fexists     =>  exs,
         file_length =>  len,
         block_size  =>  bls
      );
      
      if not exs then
      -- file does not exist
         return;
      end if;
      
      delete imp_table_csv;
      
      insert   into imp_table_csv
      select * from ext_table_csv;
      
   --
   -- Remove external file after importing it:
   --
      utl_file.fremove('EXT_DIR', 'file.csv');
      
      commit;
      log('file.csv was imported');
      
   exception when others then
      log('Trying to impoart file.csv raised ' || sqlerrm);
   end import_file;
   
end csv_importer;
/
begin
   dbms_scheduler.create_job (
      job_name             => 'imp_job',
      job_type             => 'stored_procedure',
      job_action           => 'csv_importer.import_file',
      repeat_interval      => 'freq=minutely'
   );
end;
/

begin
   dbms_scheduler.enable('imp_job');
end;
/
Query some job related characteristics:
select
   job.enabled,
   job.state,
   job.last_start_date,
   job.next_run_date,
  (job.next_run_date - systimestamp) time_until_next_run,
   job.last_run_duration,
   job.run_count,
   job.logging_level,
   job.repeat_interval,
   job.start_date,
   job.auto_drop,
   job.*
from
   user_scheduler_jobs job
where
   job.job_name = 'IMP_JOB';
Other statements:
exec csv_importer.import_file;
select * from imp_table_csv;
select * from imp_log order by dt desc;

Error messages in Windows

In Windows, I have received the following error:
ORA-29913: error in executing ODCIEXTTABLEOPEN callout
ORA-29400: data cartridge error
KUP-04040: file file.csv in EXT_DIR not found
I fixed this by giving the Oracle Service (process) the required privileges on the file (See KUP-04040).
After fixing this error, I had a KUP-04027: file name check failed error which I was able to fix by granting the necessary privileges on $home/ext (See KUP-04027)

See also

Reading Excel sheets with external tables.
Reading fixed field length files.

Index

Fatal error: Uncaught PDOException: SQLSTATE[HY000]: General error: 8 attempt to write a readonly database in /home/httpd/vhosts/renenyffenegger.ch/php/web-request-database.php:78 Stack trace: #0 /home/httpd/vhosts/renenyffenegger.ch/php/web-request-database.php(78): PDOStatement->execute(Array) #1 /home/httpd/vhosts/renenyffenegger.ch/php/web-request-database.php(30): insert_webrequest_('/notes/developm...', 1759612202, '216.73.216.149', 'Mozilla/5.0 App...', NULL) #2 /home/httpd/vhosts/renenyffenegger.ch/httpsdocs/notes/development/databases/Oracle/objects/tables/external/CSV(246): insert_webrequest() #3 {main} thrown in /home/httpd/vhosts/renenyffenegger.ch/php/web-request-database.php on line 78