Friday, June 26, 2009

Converting RTF to PDF in an Oracle Database

There are a number of commercial products out there which converts RTF to PDF, but there is also a free alternative called Ted. Actually, calling this a converter is a gross understatement, it is a fully fledged RTF editor which also can convert RTF to PDF.

I will describe how to install Ted, and how to use it with RTF-documents stored in the database. This combined with generating dynamic RTF-documents can give a rich document solution for your Oracle Apex applications.

This is a Linux/Unix specific solution. So if you are stuck with some other operating system, go do something about it ;-)

Downloading and Installing Ted
Ted runs on Unix/Linux, and you can download the software here. Ted is released under GNU Public License, and restrictions may apply.

The easiest way of installing Ted is to use RPM package file.
  • Download RPM package file to /tmp
  • Install RPM-package as user root (or system user with sufficient privileges)
    rpm -ivh .rpm
If you get an error stating that packages are missing, do not despair, simply download and install the package indicated (or get it from your installation CD). The installation is pretty much the same as described in the bullets above.

Preparing os for RTF conversion
Download Ted the rtf2pdf.sh script which converts rtf-files to pdf-files.
  • Create an os-directory:
    mkdir /u01/app/ted
  • Copy the Ted conversion script to the directory
    cp /u01/stage/ted/rtf2pdf.sh /u01/app/ted/.
  • Grant execute privileges on the script
    chmod +x rtf2pdf.sh
I have done the above operations with user oracle of group oinstall, and in this demonstration user oracle will be used to execute os-operations. This is probably not the best production solution, but will serve for my demonstration purposes.

Creating directory in the Oracle database
Defining a directory in Oracle is quite easy:
  • Connect as user SYS
  • Create directory:
    create directory RTF2PDF as '/u01/app/ted';
  • Grant privileges to RTF2PFDF
    grant all on directory RTF2PDF to <schema_user>
How to execute os-commands from the Oracle database
There are basicly three alternatives; dbms_scheduler, Java and C. The three alternatives and use of dbms_scheduler is described by Steven Feuerstein here. He also has a good description of the Java and C approaches.

dbms_scheduler is the superior alternative to me, it's easy to use, and as opposed to Java, is compatible with Oracle XE (even though there are some issues with this release). The only beef with dbms_scheduler are the implicit commits, I have not investigated when or why any further as I have barely scratched the surface of dbms_scheduler. I'll get around to it some rainy day.

What actually made me revisit dbms_scheduler was this recent blog post by Tobias Arnold, following feeds like orna.info and apexblogs.info has proven invaluable for a monkey like me :-)

Dr. Tim Hall has an excellent site called oracle-base.com with loads of examples and articles on the Oracle database. My dbms_scheduler code and setup is based on his article Scheduler Enhancements in Oracle Database 11g Release 1.

Preparing for use of dbms_scheduler
Before your schema user can take advantage of dbms_scheduler and execute host commands, there are a few steps that must be done:
  • Connect as user sys
  • Grant create job to schema user:
    grant create job to <schema_user>;
  • Grant the ability to execute host commands/external jobs to schema user:
    grant create external job to <schema_user>
  • Create credential (this is new in 11g, and saves a lot of work):
begin
-- basic credential.
dbms_scheduler.create_credential(
credential_name => 'LOCALORACLE',
username => '',
password => '');
end;
  • Grant credential to schema user:
    grant execute on LOCALORACLE to <schema_user>
The os-user specified must have execution privileges on rtf2pdf.sh, and r+w on the directory where the script resides for this example.

PL/SQL code to invoke conversion
I have written a sample package for converting a rtf-document to a pdf-document. Sadly, I can't use Oracle's hosted environment on apex.oracle.com to showcase the functionality. I do not have the privileges to perform the required steps described above, and rightly so. Feel free to download and test it yourself. I have uploaded the package to my sample application at apex.oracle.com, you can download the sample code here.

The code is for demonstration purposes only, so my usual disclaimer apply: I take no responsibility what so ever for what might happen when you use it and there is no warranty of any kind expressed or implied. Phew... I still hope for world domination when you compile the package though ;-)

The package contains procedures/functions for file handling and invoking rtf2pdf.sh, but can easily be adapted for a more generic use.

Converting RTF to PDF
Lets say you have a table called RTF_DOCUMENTS that looks like this:
create table rtf_document
(
rtf_document_id number not null,
file_name varchar2 (255) not null,
rtf_file clob,
pdf_file blob,
constraint rtf_document_pk primary key (rtf_document_id)
);

And you want to create a procedure to convert the contents of column RTF_FILE to a pdf-document in column PDF_DOCUMENT, you can use the following procedure:
create or replace procedure convert_rtf_doc (
p_rtf_document_id in rtf_document.rtf_document_id%type
, p_dir_name in varchar2 default 'RTF2PDF'
) is
cursor c_doc
is
select rtdo.rtf_document_id
, rtdo.file_name
, rtdo.rtf_file
, rtdo.pdf_file
from rtf_document rtdo
where rtdo.rtf_document_id = p_rtf_document_id;
l_pdf_file blob;
begin
for r_doc in c_doc
loop
l_pdf_file := rtf2pdf.rtf2pdf(p_rtf_file_name => r_doc.file_name
,p_rtf_file => r_doc.rtf_file
,p_dir_name => p_dir_name);
update rtf_document rtdo
set rtdo.pdf_file = l_pdf_file
where rtdo.rtf_document_id = p_rtf_document_id;
end loop;
end;
/

Beware of the implicit commit by dbms_scheduler! A select for update in c_doc cursor would result in "fetch out of sequence"-error.

Alternatives to Ted
There are always alternatives, this is not different. There are some Open Source projects out there, and this guy (that's sexist, but I still assume cfSearching is a he) has done some research on the topic. His angle was different from mine though, and he went the ooxml-way, which is not a stupid idea.

Open Office could be an alternative, and has a more active community surrounding it. The CLI surrounding the OO-libraries in server mode will probably do the job.

In conclusion
In all honesty, I have not yet investigated to which extent the rendering of pdf-files is correct based on the more complex elements of rtf. This, of course, I should have done first. For me the journey is the most fun when (as in this case) there are no expectations attached to the end solution.

Tightening the PL/SQl-code referenced here for production goes without saying!

Sunday, June 21, 2009

Generating Table Collection API for Oracle Apex

...or "The Art of Laziness". The lazy programmer will always strive to get more for less, which is a good thing.

Why?
After looking at the collection API's i Oracle Apex, I could see their usefulness, but also get an idea on how labor intensive the operations were (for the programmer, not the database). I quickly got bored of the c001, c002, etc. coding, so for some time now I have been working on an API generator for creating and manipulating collections based on tables.

What is generated?
You get two packages and a view.

  • The table API-package: This is very similar to the API packages generated by Apex, but with some changes to the cursor fetching table rows (all now share a global cursor) and the updates in my version does not attempt to update primary keys. Quite frankly, I could get by with the API's from Apex, but I have a bit of control freakishness about me, and it puts me in a better position to further customize the API when needed. I already have some ideas fro the latter part.

  • The collection API package: This is the package that wraps both the Apex API's, and the table operations. Here you will find create, drop, ins/upd/del collection procedures, all with parameters that match the table, both in name and data type. You also get an apply-procedure to handle the actual dml operations against the actual database table. The package supports "overloading" of collections, so you can have more then one collection based on the same table at the same time. It also handles MD5 if you want it to

  • The view: The view wraps the htmld_collections-view, and casts data types and column names from c### varchar2 format to their actual column names and data types from the database table it was founded on.
Statements for calling the collection API is also generated and included with the view-script.

Installation
Just download the generator package, and compile it into desired schema. Schema must have access to the following views: all_tab_columns, all_cons_columns and all_constraints. I deliberately left out any supporting tables (which certainly could have helped in some areas) to simplify things. This means that if you want to modify some parameters, you will have to do it in the package.

You can download the generator package here.

Generating code
Well, generating the code is quite easy. It is a "fire-and-forget" kind of operation; the packages and view will be generated in a PL/SQL-collection and can be queried and executed at will.

The following section shows how to generate code for an EMP-table located in schema NMS:

begin
gen_coll_p.gen_coll_package (p_owner => 'NMS',
p_table_name => 'EMP',
p_shortname => 'EMP',
p_gen_type => 'ALL');
end;
/

p_gen_type indicates what part(s) of code to generate, the available values are:
  • TAB_API_SPEC for table API package specification
  • TAB_API_BODY for table API package body
  • COLL_SPEC for table/collection API package specification
  • COLL_BODY for table/collection API package body
  • COLL_VIEW for table/collection wrapper view
  • ALL for all of the above
Viewing the generated code:

select text
from table (gen_coll_p.get_code)
order by line

Compiling generated code:
This part I have put NO amount of work in, after selecting the generated code, you are on your own :-) Well, it's not really that hard to select the code in SQL Developer or TOAD using the query above and copy/paste the result into a new window and choose "Run Script". Or lacking any "sophisticated" tools, you can even revert to spooling and running the result in SQL*Plus . I am sure you will find a way :-)

Using the collection API
Te following examples shows how to use the API from an Oracle Apex application. The API in these examples are generated on the EMP table, and p_shortname parameter during generation is set to EMP as well.

Creating a new collection:

apex_emp_coll_p.create_coll( p_coll_name => 'EMP_COLL'
, p_include_md5 => 'Y'
, p_where => 'where upper(ename) like ''S%''');

Viewing the contents of the collection:

select seq_id,
empno,
ename,
job,
mgr,
hiredate,
sal,
comm,
deptno
from apex_emp_coll_v
where apex_emp_coll_v.collection_name = 'EMP_COLL'

And please feel free to replace 'EMP_COLL' with a bind variable :-)

Inserting a collection member:

apex_emp_coll_p.ins(p_coll_name => 'EMP_COLL'
,p_empno => :p_empno
,p_ename => :p_ename
,p_job => :p_job
,p_mgr => :p_mgr
,p_hiredate => :p_hiredate
,p_sal => :p_sal
,p_comm => :p_comm
,p_deptno => :p_deptno);

Updating a collection member:

apex_emp_coll_p.upd(p_coll_name => 'EMP_COLL'
,p_seq_id => :p_seq_id
,p_empno => :p_empno
,p_ename => :p_ename
,p_job => :p_job
,p_mgr => :p_mgr
,p_hiredate => :p_hiredate
,p_sal => :p_sal
,p_comm => :p_comm
,p_deptno => :p_deptno);

Deleting a collection member:

apex_emp_coll_p.del(p_coll_name => 'EMP_COLL'
,p_seq_id => :p_seq_id);

Propagating changes to the database table:

apex_emp_coll_p.apply_changes(p_coll_name => 'EMP_COLL',
p_refresh => 'Y');
See it in action
I have made some sample pages to showcase the generator and using the generated code. See http://apex.oracle.com/pls/otn/f?p=28990:8 for the code generator, and http://apex.oracle.com/pls/otn/f?p=28990:4 for viewing the generated code in action.

Some notes on the generator
It did not turn out as smooth as I had hoped. I would really have liked to include an "instead of"-trigger to the view, but this cannot be done without granting rights on WWV_FLOW_COLLECTIONS which I would rather not do. Even though the "instead of"-trigger uses the documented Apex API's, it does not change the "ORA-01031: insufficient privileges" from cropping up. This pretty much rules out the automatic DML process, which is a shame.

When viewing the code in the generator itself, please remember that it has been evolving over time (and in my spare time too :-)), and the code reflects that fact.

I have some plans for features to include in the future, but as of now I do not know what and in what order they will be implemented, so I will let it lie.

Please leave a note if you find this useful or have any comments/questions, etc. :-)