Monday, September 2, 2019

12c SOA BPEL - Process Large Files Using Oracle File Adapter Chunked Read Option


Introduction: SOA 12c adds a new ChunkedRead operation to the JCA File Adapter. Before 12c, users had to create a SynchRead operation and then edit the JCA file by hand to achieve a "chunked read". This post explains how to process a large file in chunks using the SOA File Adapter.

Advantage: Chunking large files reduces the amount of data that is loaded in memory and makes efficient use of the translator resources.

In this example we read a CSV file containing 12 records (a small stand-in for a large file).

Step 1: Create a file adapter
Drag & drop a File Adapter onto the External References swimlane of the SOA composite, then follow the wizard to complete the configuration.




Ensure that you choose the "Chunked Read" operation and define a chunk size. The chunk size is the number of records read from the file in each iteration. For instance, if the file has 12 records and the chunk size is 3, the adapter reads the file in 4 chunks.
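For reference, the wizard records this choice in the reference's .jca file. The following is a minimal sketch of what that file may look like, not a copy of the generated output. The directory /tmp/input and the file name records.csv are hypothetical placeholders, and the exact attributes the wizard emits can differ between versions. The parts that matter are the ChunkedInteractionSpec class and the ChunkSize property.

```xml
<adapter-config name="fileReference" adapter="file"
                xmlns="http://platform.integration.oracle/blocks/adapter/fw/metadata">
  <connection-factory location="eis/FileAdapter"/>
  <endpoint-interaction portType="ChunkedRead_ptt" operation="ChunkedRead">
    <interaction-spec className="oracle.tip.adapter.file.outbound.ChunkedInteractionSpec">
      <!-- Hypothetical directory and file name; replace with your own -->
      <property name="PhysicalDirectory" value="/tmp/input"/>
      <property name="FileName" value="records.csv"/>
      <!-- Number of records returned by each invocation -->
      <property name="ChunkSize" value="3"/>
    </interaction-spec>
  </endpoint-interaction>
</adapter-config>
```

Both PhysicalDirectory and FileName can be overridden at runtime through the jca.file.Directory and jca.file.FileName properties described in Step 2.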

Step 2: Implement the BPEL section
To process the file in chunks, the invoke activity that calls the File Adapter must be placed inside a while loop. On each iteration, the File Adapter uses the header property values passed to it to determine where to start reading.

At a minimum, the following JCA adapter properties are involved:

jca.file.FileName : Send/Receive file name. This property overrides the adapter configuration. Very handy property to set / get dynamic file names
jca.file.Directory : Send/Receive directory location. This property overrides the adapter configuration
jca.file.LineNumber : Set/Get line number from which the file adapter must start processing the native file
jca.file.ColumnNumber : Set/Get column number from which the file adapter must start processing the native file
jca.file.IsEOF : File adapter returns this property to indicate whether end-of-file has been reached or not
Apart from the above, there are 3 other properties that help with error management and exception handling:
jca.file.IsMessageRejected : Returned by the file adapter if a message is rejected (non-conformance to the schema/not well formed)
jca.file.RejectionReason : Returned by the file adapter in conjunction with the above property. Reason for the message rejection
jca.file.NoDataFound : Returned by the file adapter if no data is found to be read

In the BPEL "Invoke" activity, only the jca.file.FileName and jca.file.Directory properties are available to choose from in the Properties tab. The other properties have to be configured manually in the BPEL source.

First, let's create a bunch of BPEL variables to hold these properties. For simplicity, just create all variables with a simple XSD string type.


Variables created:
    <variable name="lineNumber" type="xsd:string"/>
    <variable name="columnNumber" type="xsd:string"/>
    <variable name="isEOF" type="xsd:string"/>
    <variable name="returnLineNumber" type="xsd:string"/>
    <variable name="returnColumnNumber" type="xsd:string"/>
    <variable name="returnIsEOF" type="xsd:string"/>
    <variable name="returnIsMessageRejected" type="xsd:string"/>
    <variable name="returnRejectionReason" type="xsd:string"/>
    <variable name="returnNoDataFound" type="xsd:string"/>

Drag & drop an assign activity before the while loop to initialize the variables for the first read (the first chunk). The first chunk always starts at line 1, column 1.

1 -> lineNumber
1 -> columnNumber
'false' -> isEOF
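In BPEL 2.0 source, this initialization could look roughly like the sketch below. The activity name Assign_initChunkState is made up for illustration; the variable names are the ones declared above, and the values are string literals because the variables are typed xsd:string.

```xml
<assign name="Assign_initChunkState">
  <!-- Start reading at the first line of the file -->
  <copy>
    <from>'1'</from>
    <to>$lineNumber</to>
  </copy>
  <!-- Start reading at the first column -->
  <copy>
    <from>'1'</from>
    <to>$columnNumber</to>
  </copy>
  <!-- End-of-file has not been reached yet -->
  <copy>
    <from>'false'</from>
    <to>$isEOF</to>
  </copy>
</assign>
```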





Invoke properties:
<invoke name="Invoke_fileReference" partnerLink="fileReference" portType="ns1:ChunkedRead_ptt"
        operation="ChunkedRead" inputVariable="Invoke_fileReference_ChunkedRead_InputVariable"
        outputVariable="Invoke_fileReference_ChunkedRead_OutputVariable" bpelx:invokeAsDetail="no">
  <bpelx:toProperties>
    <bpelx:toProperty name="jca.file.LineNumber" variable="lineNumber"/>
    <bpelx:toProperty name="jca.file.ColumnNumber" variable="columnNumber"/>
  </bpelx:toProperties>
  <bpelx:fromProperties>
    <bpelx:fromProperty name="jca.file.LineNumber" variable="returnLineNumber"/>
    <bpelx:fromProperty name="jca.file.ColumnNumber" variable="returnColumnNumber"/>
    <bpelx:fromProperty name="jca.file.IsEOF" variable="returnIsEOF"/>
    <bpelx:fromProperty name="jca.file.IsMessageRejected" variable="returnIsMessageRejected"/>
    <bpelx:fromProperty name="jca.file.RejectionReason" variable="returnRejectionReason"/>
    <bpelx:fromProperty name="jca.file.NoDataFound" variable="returnNoDataFound"/>
    <bpelx:fromProperty name="jca.file.FileName" variable="csvfile"/>
  </bpelx:fromProperties>
</invoke>
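The rejection properties returned by the invoke can be checked right after it. The following is only a sketch under a few assumptions: the adapter reports rejection as the string 'true', returnIsMessageRejected has been initialized (for example to 'false' before the loop) so that reading it cannot fail when the adapter returns nothing, and the fault name client:chunkRejected is a hypothetical QName whose prefix must be bound to a namespace in your process.

```xml
<if name="If_chunkRejected">
  <!-- Assumes the adapter returns the string 'true' for a rejected chunk -->
  <condition>$returnIsMessageRejected = 'true'</condition>
  <sequence>
    <!-- returnRejectionReason holds the adapter's explanation; log or report it here -->
    <!-- Hypothetical fault name; replace with a fault defined for your process -->
    <throw name="Throw_chunkRejected" faultName="client:chunkRejected"/>
  </sequence>
</if>
```

Whether to stop the whole read or skip the rejected chunk and continue is a design choice; throwing a fault is simply the most conservative option.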

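Within the while loop, after the invoke, drag & drop another assign activity that copies the values returned by the File Adapter into the input variables for the next iteration (source -> target).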
returnIsEOF -> isEOF
returnLineNumber -> lineNumber
returnColumnNumber -> columnNumber
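Putting the pieces together, the loop might look roughly like the following BPEL 2.0 sketch. The activity names While_notEOF and Assign_nextChunk are made up for illustration; the variables and the invoke are the ones shown above, with the invoke trimmed to the properties that drive the loop.

```xml
<while name="While_notEOF">
  <!-- Keep reading until the adapter reports end-of-file -->
  <condition>$isEOF = 'false'</condition>
  <sequence>
    <invoke name="Invoke_fileReference" partnerLink="fileReference"
            portType="ns1:ChunkedRead_ptt" operation="ChunkedRead"
            inputVariable="Invoke_fileReference_ChunkedRead_InputVariable"
            outputVariable="Invoke_fileReference_ChunkedRead_OutputVariable">
      <!-- Tell the adapter where this chunk starts -->
      <bpelx:toProperties>
        <bpelx:toProperty name="jca.file.LineNumber" variable="lineNumber"/>
        <bpelx:toProperty name="jca.file.ColumnNumber" variable="columnNumber"/>
      </bpelx:toProperties>
      <!-- Capture where the next chunk starts and whether the file is finished -->
      <bpelx:fromProperties>
        <bpelx:fromProperty name="jca.file.LineNumber" variable="returnLineNumber"/>
        <bpelx:fromProperty name="jca.file.ColumnNumber" variable="returnColumnNumber"/>
        <bpelx:fromProperty name="jca.file.IsEOF" variable="returnIsEOF"/>
      </bpelx:fromProperties>
    </invoke>

    <!-- Process the records of the current chunk (the output variable) here -->

    <assign name="Assign_nextChunk">
      <copy>
        <from>$returnIsEOF</from>
        <to>$isEOF</to>
      </copy>
      <copy>
        <from>$returnLineNumber</from>
        <to>$lineNumber</to>
      </copy>
      <copy>
        <from>$returnColumnNumber</from>
        <to>$columnNumber</to>
      </copy>
    </assign>
  </sequence>
</while>
```

The loop condition compares against the string 'false' because isEOF was declared as xsd:string and initialized to 'false' before the loop.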


Now deploy the composite and test it from the EM console.

The file is read in 4 chunks, because it contains 12 lines and the chunk size is 3.

Known limitation:
With chunked read, the file is not deleted after it has been read.
This is because the file can only be deleted outside the chunked read loop, once the last chunk has been processed.

The File Adapter provides a feature to delete a file from a location.
Since we already have the file name and file location, we can easily create an adapter to delete the file after the chunked read completes.
Create a simple file adapter with the sync read option.
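The remaining configuration is not spelled out here. One commonly documented way to turn such a sync read reference into a delete operation is to edit its .jca file so that it uses the adapter's file I/O interaction spec with a DELETE type. The sketch below is an assumption based on that approach, not the exact file from this project: the reference name FileDelete, the directory /tmp/input and the file name records.csv are hypothetical placeholders, and the property names should be checked against the File Adapter documentation for your version.

```xml
<adapter-config name="FileDelete" adapter="file"
                xmlns="http://platform.integration.oracle/blocks/adapter/fw/metadata">
  <connection-factory location="eis/FileAdapter"/>
  <endpoint-interaction portType="FileDelete_ptt" operation="FileDelete">
    <interaction-spec className="oracle.tip.adapter.file.outbound.FileIoInteractionSpec">
      <!-- Hypothetical directory and file name of the file that was chunk-read -->
      <property name="TargetPhysicalDirectory" value="/tmp/input"/>
      <property name="TargetFileName" value="records.csv"/>
      <!-- DELETE removes the target file instead of reading it -->
      <property name="Type" value="DELETE"/>
    </interaction-spec>
  </endpoint-interaction>
</adapter-config>
```

The invoke for this reference belongs after the while loop, so the file is removed only once the last chunk has been read. The file name captured in the csvfile variable can be supplied at runtime through adapter header properties on that invoke; consult the adapter documentation for the exact property names.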

7 comments:

  1. Hi Srinanda, I was trying to do the same. Did you try a chunk size such as 5 for a 12-record file, where the last chunk contains only 2 records? Does this scenario work for you?
    If so, please let me know whether any modifications were needed to accomplish that.

    Replies
    1. Have you followed the same steps mentioned in the blog? It should work.

  2. Hi,

    I have a scenario where my transaction starts with the same file I am trying to chunk read. So I have created a file adapter to read the file, but I am not reading the content of the file.

    Once the transaction starts, I have another file adapter which reads the file in chunks.

    So my question is: if my file is big and it takes time to chunk read it, will the poller poll the same file again?

    Hope I was able to express my case clearly.

    Replies
    1. Hi Kaushik,

      If we use chunking, then once the loop has finished reading all the chunks, we have to delete the file explicitly after the loop using the file adapter delete option. This is one drawback of using it.

