This week I have been evaluating options to perform a virus scan on files stored as BLOBs in an Oracle database. This was primarily an academic exercise as all systems that permit documents to be uploaded should perform a virus scan as part of the upload process.
However it is conceivable that security-conscious system owners may want to periodically scan their uploaded documents to ensure they are still virus-free:
- To counter the lag between a new virus being released, and the virus scanner’s definitions being updated to detect it.
- To counter the threat of an undetected attacker replacing a safe file with an infected one. If this is served to the public then there could be serious damage to the organisation’s reputation.
For the purpose of this exercise the ClamAV engine is used for detection. One approach is to copy the files on the filesystem and get ClamAV to scan them there. However to avoid the (temporary) storage overhead, I decided to try connecting directly to the daemon and streaming the files using UTL_TCP.
First create a table to hold the results of the database scan.
CREATE TABLE clamav_test ( owner VARCHAR2(30) NOT NULL , table_name VARCHAR2(30) NOT NULL , column_name VARCHAR2(30) NOT NULL , urid UROWID NOT NULL , blob_size INTEGER NOT NULL , result VARCHAR2(4000) NOT NULL , CONSTRAINT clamav_test_pk PRIMARY KEY (owner, table_name, column_name, urid) );
We store the details of the column where the data exists, especially the ROWID so that it’s easy to find rows if any infections are discovered. The BLOB_SIZE is useful if errors are returned from ClamAV; you may need to increase its stream size limit.
The following script loops through every BLOB in the database. Unless you are paranoid you will want to restrict this to only the columns that contain uploaded data!
DECLARE c_blob SYS_REFCURSOR; l_rowid UROWID; l_blob BLOB; l_result clamav_test.result%TYPE; BEGIN dbms_application_info.set_module('ClamAV Test Script', NULL); FOR cols IN ( SELECT dtc.owner , dtc.table_name , dtc.column_name FROM dba_tab_columns dtc JOIN dba_tables dt ON dt.owner = dtc.owner AND dt.table_name = dtc.table_name WHERE dtc.data_type = 'BLOB' AND dt.secondary = 'N' ORDER BY dtc.owner , dtc.table_name , dtc.column_name ) LOOP dbms_application_info.set_action(cols.owner||'.'||cols.table_name||'.'||cols.column_name); dbms_output.put_line(cols.owner||'.'||cols.table_name||'.'||cols.column_name); OPEN c_blob FOR 'SELECT ROWID '|| ' , '||cols.column_name||' '|| 'FROM '||cols.owner||'.'||cols.table_name||' '|| 'WHERE '||cols.column_name||' IS NOT NULL'||' '|| ' AND ROWID NOT IN '|| ' ( SELECT urid '|| ' FROM clamav_test '|| ' WHERE owner = '''||cols.owner||''' '|| ' AND table_name = '''||cols.table_name||''' '|| ' AND column_name = '''||cols.column_name||''' '|| ' )'; LOOP FETCH c_blob INTO l_rowid , l_blob; EXIT WHEN c_blob%NOTFOUND; dbms_application_info.set_client_info(l_rowid||': '||dbms_lob.getlength(l_blob)); DECLARE l_conn utl_tcp.connection; l_buf RAW(2000); l_amount INTEGER; l_offset INTEGER := 1; l_bytes PLS_INTEGER; BEGIN l_conn := utl_tcp.open_connection ( remote_host => 'localhost' , remote_port => 3310 , newline => chr(10) ); l_bytes := utl_tcp.write_line(l_conn, 'nINSTREAM'); BEGIN LOOP l_amount := 2000; dbms_lob.read(l_blob, l_amount, l_offset, l_buf); l_offset := l_offset + l_amount; l_bytes := utl_tcp.write_raw(l_conn, utl_raw.concat(utl_raw.cast_from_binary_integer(l_amount), l_buf)); END LOOP; EXCEPTION WHEN NO_DATA_FOUND THEN l_bytes := utl_tcp.write_raw(l_conn, utl_raw.cast_from_binary_integer(0)); END; l_bytes := utl_tcp.read_line(l_conn, l_result, TRUE); utl_tcp.close_connection(l_conn); EXCEPTION WHEN OTHERS THEN l_result := SQLERRM; dbms_output.put_line('format_error_stack: '||dbms_utility.format_error_stack); dbms_output.put_line('format_error_backtrace: '||dbms_utility.format_error_backtrace); utl_tcp.close_connection(l_conn); END; INSERT INTO clamav_test VALUES (cols.owner, cols.table_name, cols.column_name, l_rowid, dbms_lob.getlength(l_blob), l_result); COMMIT; END LOOP; CLOSE c_blob; END LOOP; END; /
The script loops through all BLOB columns (ignoring secondary tables for domain indexes) and for each one opens a cursor to loop through the data. For each row it makes a connection to clamd (in this case running on port 3310 on the database server), streams the data in the correct format, and reads back the result. The details are then stored in the log table, which makes the script easily resumable. If you need to re-run for a particular column then just delete the rows from the table before running the script again.
The material in this article was most recently tested against Oracle 188.8.131.52.6 on 64-bit Linux.