Friday, July 20, 2012

INSERT ALL master/detail data from XML

This is something I actually made for a quiz on PL/SQL Challenge, but I think the technique could be useful for others as well :-)

The idea is you may have some master/detail data (in this case orders and orderlines) for which you get XML with such data that needs to be inserted into two relational tables. Many might be tempted to parse the XML client side or in PL/SQL, loop through the data, and then insert the orders and lines row by row (or perhaps bulk insert from arrays.)

But it can be done in a single statement if you combine XMLTABLE function to read the XML relationally and INSERT ALL statement to do a multi-table insert.

Let's create a couple of tables for orders and orderlines:

SQL> create table orders
  2  (
  3   order_id    integer primary key
  4   , customer    varchar2(20)
  5   , state     varchar2(20)
  6  )
  7  /

Table created.

SQL> create table orderlines
  2  (
  3   order_id    integer references orders (order_id)
  4   , line_no     integer
  5   , item     varchar2(20)
  6   , quantity    number
  7  )
  8  /

Table created.

We will receive xml files from external source with order data.

In production that will probably be loaded via CLOB or BFILE or other method. For test we setup a simple VARCHAR2 bind variable containing the XML.

SQL> variable received_xml varchar2(4000)
SQL> 
SQL> begin
  2   :received_xml :=
  3  '<orders>
  4    <order id="100">
  5    <customer>Smith</customer>
  6    <state>OHIO</state>
  7    <lines>
  8      <line no="1">
  9        <item>Mouse</item>
 10        <qty>3</qty>
 11      </line>
 12      <line no="2">
 13        <item>Monitor</item>
 14        <qty>2</qty>
 15      </line>
 16    </lines>
 17    </order>
 18    <order id="102">
 19    <customer>Jackson</customer>
 20    <state>TEXAS</state>
 21    <lines>
 22      <line no="2">
 23        <item>Monitor</item>
 24        <qty>1</qty>
 25      </line>
 26      <line no="4">
 27        <item>Keyboard</item>
 28        <qty>2</qty>
 29      </line>
 30      <line no="6">
 31        <item>RJ-45 Cable</item>
 32        <qty>6</qty>
 33      </line>
 34    </lines>
 35    </order>
 36  </orders>
 37  ';
 38  end;
 39  /

PL/SQL procedure successfully completed.

XMLTABLE can be used to read XML data as relational data. For example:

SQL> select   xmlorders.order_id
  2       , xmlorders.customer
  3       , xmlorders.state
  4    from xmltable(
  5     '/Orders/Order'
  6     passing xmltype(:received_xml)
  7     columns order_id integer      path '@Id'
  8           , customer varchar2(20) path 'Customer'
  9           , state   varchar2(20) path 'State'
 10         ) xmlorders
 11  /

ORDER_ID CUSTOMER STATE                                                         
-------- -------- -----                                                         
     100 Smith    OHIO                                                          
     102 Jackson  TEXAS                                                         

'/Orders/Order' is the XQuery path that tells Oracle we want all Order child items of the Orders root item. PASSING is the actual XML data, where XMLTYPE is a constructor to construct an XMLTYPE object from the VARCHAR2 bind-variable. COLUMNS then defines the relational columns and PATH shows where in the XML the data for that column can be found. Note that @Id denotes that Id is an attribute of the Order item, while Customer and State are child items.

XMLTABLE can of course also find the order lines at the lower levels of the XML. For example:

SQL> select   xmllines.line_no
  2       , xmllines.item
  3       , xmllines.quantity
  4    from xmltable(
  5     '/Orders/Order/Lines/Line'
  6     passing xmltype(:received_xml)
  7     columns line_no  integer      path '@No'
  8           , item   varchar2(20) path 'Item'
  9           , quantity number       path 'Qty'
 10         ) xmllines
 11  /

 LINE_NO ITEM        QUANTITY                                                   
-------- ----------- --------                                                   
       1 Mouse              3                                                   
       2 Monitor            2                                                   
       2 Monitor            1                                                   
       4 Keyboard           2                                                   
       6 RJ-45 Cable        6                                                   

The problem with the above XMLTABLE expression is, that it does not identify the order id for each line, as that id is only available at the higher level of XML. But it is possible to use XMLTABLE twice in order to solve that:

SQL> select   xmlorders.order_id
  2       , xmllines.line_no
  3       , xmllines.item
  4       , xmllines.quantity
  5    from xmltable(
  6     '/Orders/Order'
  7     passing xmltype(:received_xml)
  8     columns order_id integer      path '@Id'
  9           , lines   xmltype      path 'Lines'
 10         ) xmlorders
 11       , xmltable(
 12     '/Lines/Line'
 13     passing xmlorders.lines
 14     columns line_no  integer      path '@No'
 15           , item   varchar2(20) path 'Item'
 16           , quantity number       path 'Qty'
 17         ) xmllines
 18  /

ORDER_ID  LINE_NO ITEM        QUANTITY                                          
-------- -------- ----------- --------                                          
     100        1 Mouse              3                                          
     100        2 Monitor            2                                          
     102        2 Monitor            1                                          
     102        4 Keyboard           2                                          
     102        6 RJ-45 Cable        6                                          

The first XMLTABLE gets the order_id and then also a column lines of XMLTYPE that will contain the XML of each orders Lines item. That in turn is passed to the second XMLTABLE using PASSING, and so that retrieves the columns of each line for each order.

Inserting the data into the relational tables can then be done in a two-pass operation, first the orders, then the lines:

SQL> insert into orders
  2  select   xmlorders.order_id
  3       , xmlorders.customer
  4       , xmlorders.state
  5    from xmltable(
  6     '/Orders/Order'
  7     passing xmltype(:received_xml)
  8     columns order_id integer      path '@Id'
  9           , customer varchar2(20) path 'Customer'
 10           , state   varchar2(20) path 'State'
 11         ) xmlorders
 12  /

2 rows created.

SQL> insert into orderlines
  2  select   xmlorders.order_id
  3       , xmllines.line_no
  4       , xmllines.item
  5       , xmllines.quantity
  6    from xmltable(
  7     '/Orders/Order'
  8     passing xmltype(:received_xml)
  9     columns order_id integer      path '@Id'
 10           , lines   xmltype      path 'Lines'
 11         ) xmlorders
 12       , xmltable(
 13     '/Lines/Line'
 14     passing xmlorders.lines
 15     columns line_no  integer      path '@No'
 16           , item   varchar2(20) path 'Item'
 17           , quantity number       path 'Qty'
 18         ) xmllines
 19  /

5 rows created.

Or it can be done in one call using the INSERT ALL syntax for inserting into two tables simultaneously.

SQL> insert all
  2  when seq = 1
  3  then
  4     into orders
  5   values (order_id
  6         , customer
  7         , state)
  8  when 1 = 1
  9  then
 10     into orderlines
 11   values (order_id
 12         , line_no
 13         , item
 14         , quantity)
 15  select   xmlorders.order_id
 16       , xmlorders.customer
 17       , xmlorders.state
 18       , xmllines.line_no
 19       , xmllines.item
 20       , xmllines.quantity
 21       , xmllines.seq
 22    from xmltable(
 23     '/Orders/Order'
 24     passing xmltype(:received_xml)
 25     columns order_id integer      path '@Id'
 26           , customer varchar2(20) path 'Customer'
 27           , state   varchar2(20) path 'State'
 28           , lines   xmltype      path 'Lines'
 29         ) xmlorders
 30       , xmltable(
 31     '/Lines/Line'
 32     passing xmlorders.lines
 33     columns line_no  integer      path '@No'
 34           , item   varchar2(20) path 'Item'
 35           , quantity number       path 'Qty'
 36           , seq   for ordinality
 37         ) xmllines
 38  /

7 rows created.

See how the data now are nicely inserted in the tables :-)

SQL> select   order_id
  2       , customer
  3       , state
  4    from orders
  5  order by order_id
  6  /

ORDER_ID CUSTOMER STATE                                                         
-------- -------- -----                                                         
     100 Smith    OHIO                                                          
     102 Jackson  TEXAS                                                         

SQL> select   order_id
  2       , line_no
  3       , item
  4       , quantity
  5    from orderlines
  6  order by order_id, line_no
  7  /

ORDER_ID  LINE_NO ITEM        QUANTITY                                          
-------- -------- ----------- --------                                          
     100        1 Mouse              3                                          
     100        2 Monitor            2                                          
     102        2 Monitor            1                                          
     102        4 Keyboard           2                                          
     102        6 RJ-45 Cable        6                                          

But what if our xml data was simpler and did not contain any id information?

SQL> begin
  2   :received_xml :=
  3  '<Orders>
  4    <Order>
  5    <Customer>Smith</Customer>
  6    <State>OHIO</State>
  7    <Lines>
  8      <Line>
  9        <Item>Mouse</Item>
 10        <Qty>3</Qty>
 11      </Line>
 12      <Line>
 13        <Item>Monitor</Item>
 14        <Qty>2</Qty>
 15      </Line>
 16    </Lines>
 17    </Order>
 18    <Order>
 19    <Customer>Jackson</Customer>
 20    <State>TEXAS</State>
 21    <Lines>
 22      <Line>
 23        <Item>Monitor</Item>
 24        <Qty>1</Qty>
 25      </Line>
 26      <Line>
 27        <Item>Keyboard</Item>
 28        <Qty>2</Qty>
 29      </Line>
 30      <Line>
 31        <Item>RJ-45 Cable</Item>
 32        <Qty>6</Qty>
 33      </Line>
 34    </Lines>
 35    </Order>
 36  </Orders>
 37  ';
 38  end;
 39  /

PL/SQL procedure successfully completed.

These data has no order_id and no line_no:

SQL> select   xmlorders.customer
  2       , xmlorders.state
  3       , xmlorders.orderseq
  4       , xmllines.item
  5       , xmllines.quantity
  6       , xmllines.lineseq
  7    from xmltable(
  8     '/Orders/Order'
  9     passing xmltype(:received_xml)
 10     columns customer varchar2(20) path 'Customer'
 11           , state   varchar2(20) path 'State'
 12           , lines   xmltype      path 'Lines'
 13           , orderseq for ordinality
 14         ) xmlorders
 15       , xmltable(
 16     '/Lines/Line'
 17     passing xmlorders.lines
 18     columns item   varchar2(20) path 'Item'
 19           , quantity number       path 'Qty'
 20           , lineseq  for ordinality
 21         ) xmllines
 22  /

CUSTOMER STATE ORDERSEQ ITEM        QUANTITY  LINESEQ                           
-------- ----- -------- ----------- -------- --------                           
Smith    OHIO         1 Mouse              3        1                           
Smith    OHIO         1 Monitor            2        2                           
Jackson  TEXAS        2 Monitor            1        1                           
Jackson  TEXAS        2 Keyboard           2        2                           
Jackson  TEXAS        2 RJ-45 Cable        6        3                           

For the line_no we can use the ordinality column lineseq - no problem. For the order_id we can also have ordinality available to us, but what happens when the next batch of orders arrive?

We need a sequence for these orders.


SQL> create sequence order_no_seq
  2  /

Sequence created.

Now I would like to do this for the query to be used in INSERT ALL:

SQL> select   case xmllines.lineseq
  2     when 1 then order_no_seq.nextval
  3     else      order_no_seq.currval
  4         end order_id
  5       , xmlorders.customer
  6       , xmlorders.state
  7       , xmllines.lineseq line_no
  8       , xmllines.item
  9       , xmllines.quantity
 10    from xmltable(
 11     '/Orders/Order'
 12     passing xmltype(:received_xml)
 13     columns customer varchar2(20) path 'Customer'
 14           , state   varchar2(20) path 'State'
 15           , lines   xmltype      path 'Lines'
 16         ) xmlorders
 17       , xmltable(
 18     '/Lines/Line'
 19     passing xmlorders.lines
 20     columns item   varchar2(20) path 'Item'
 21           , quantity number       path 'Qty'
 22           , lineseq  for ordinality
 23         ) xmllines
 24  /

ORDER_ID CUSTOMER STATE  LINE_NO ITEM        QUANTITY                           
-------- -------- ----- -------- ----------- --------                           
       1 Smith    OHIO         1 Mouse              3                           
       2 Smith    OHIO         2 Monitor            2                           
       3 Jackson  TEXAS        1 Monitor            1                           
       4 Jackson  TEXAS        2 Keyboard           2                           
       5 Jackson  TEXAS        3 RJ-45 Cable        6                           

But because of the way sequences work (search asktom.oracle.com for explanation) that statement assigned an individual order_id to each orderline :-( We can work around that by retrieving our sequence via function calls:

SQL> create package order_api
  2  as
  3   function order_no_seq_nextval return number;
  4   function order_no_seq_currval return number;
  5  end order_api;
  6  /

Package created.

SQL> create package body order_api
  2  as
  3   function order_no_seq_nextval return number
  4   is
  5   begin
  6      return order_no_seq.nextval;
  7   end order_no_seq_nextval;
  8  
  9   function order_no_seq_currval return number
 10   is
 11   begin
 12      return order_no_seq.currval;
 13   end order_no_seq_currval;
 14  end order_api;
 15  /

Package body created.

And then we make a little change in the query:

SQL> select   case xmllines.lineseq
  2     when 1 then order_api.order_no_seq_nextval
  3     else      order_api.order_no_seq_currval
  4         end order_id
  5       , xmlorders.customer
  6       , xmlorders.state
  7       , xmllines.lineseq line_no
  8       , xmllines.item
  9       , xmllines.quantity
 10    from xmltable(
 11     '/Orders/Order'
 12     passing xmltype(:received_xml)
 13     columns customer varchar2(20) path 'Customer'
 14           , state   varchar2(20) path 'State'
 15           , lines   xmltype      path 'Lines'
 16         ) xmlorders
 17       , xmltable(
 18     '/Lines/Line'
 19     passing xmlorders.lines
 20     columns item   varchar2(20) path 'Item'
 21           , quantity number       path 'Qty'
 22           , lineseq  for ordinality
 23         ) xmllines
 24  /

ORDER_ID CUSTOMER STATE  LINE_NO ITEM        QUANTITY                           
-------- -------- ----- -------- ----------- --------                           
       6 Smith    OHIO         1 Mouse              3                           
       6 Smith    OHIO         2 Monitor            2                           
       7 Jackson  TEXAS        1 Monitor            1                           
       7 Jackson  TEXAS        2 Keyboard           2                           
       7 Jackson  TEXAS        3 RJ-45 Cable        6                           

This time we get the desired order_id numbering. And so we can do our insert again:

SQL> insert all
  2  when line_no = 1
  3  then
  4     into orders
  5   values (order_id
  6         , customer
  7         , state)
  8  when 1 = 1
  9  then
 10     into orderlines
 11   values (order_id
 12         , line_no
 13         , item
 14         , quantity)
 15  select   *
 16    from (
 17   select  case xmllines.lineseq
 18        when 1 then order_api.order_no_seq_nextval
 19        else order_api.order_no_seq_currval
 20     end order_id
 21          , xmlorders.customer
 22          , xmlorders.state
 23          , xmllines.lineseq line_no
 24          , xmllines.item
 25          , xmllines.quantity
 26       from xmltable(
 27        '/Orders/Order'
 28        passing xmltype(:received_xml)
 29        columns customer varchar2(20) path 'Customer'
 30       , state    varchar2(20) path 'State'
 31       , lines    xmltype   path 'Lines'
 32     ) xmlorders
 33          , xmltable(
 34        '/Lines/Line'
 35        passing xmlorders.lines
 36        columns item     varchar2(20) path 'Item'
 37       , quantity number   path 'Qty'
 38       , lineseq  for ordinality
 39     ) xmllines
 40      where rownum >= 1
 41  )
 42  /

7 rows created.

Notice the query from before has been put in an inline view with a "where rownum >= 1" clause. If I do not have that, then the insert fails because of referential integrity. The reason is that the optimizer would rewrite and take my "case" expression and use that instead of "order_id" in both parts of the INSERT ALL. Thus the insert into orders would call nextval and the first insert into orderlines would ALSO call nextval and then the subsequent inserts into orderlines would call currval. To work around that I put the query into an inline view and put a "where rownum >= 1" which forces Oracle to evaluate the case expression for order_id BEFORE doing the INSERT ALL.

So now we also have nice data that has been assigned order id:

SQL> select   order_id
  2       , customer
  3       , state
  4    from orders
  5  order by order_id
  6  /

ORDER_ID CUSTOMER STATE                                                         
-------- -------- -----                                                         
       8 Smith    OHIO                                                          
       9 Jackson  TEXAS                                                         

SQL> select   order_id
  2       , line_no
  3       , item
  4       , quantity
  5    from orderlines
  6  order by order_id, line_no
  7  /

ORDER_ID  LINE_NO ITEM        QUANTITY                                          
-------- -------- ----------- --------                                          
       8        1 Mouse              3                                          
       8        2 Monitor            2                                          
       9        1 Monitor            1                                          
       9        2 Keyboard           2                                          
       9        3 RJ-45 Cable        6                                          


Rather than the simple bind variable, we could have passed xml many ways:

We might have this in a procedure having a CLOB parameter:

   ...
   passing xmltype( p_in_clob )
   ...

Or we might have the xml in a file on the server in a folder for which we have created a directory object XML_IN_DIR:

   ...
   passing xmltype( BFILENAME('XML_IN_DIR', p_in_filename) )
   ...

Or the xml may be the return output of an http call to a webserver:

   ...
   passing httpuritype('http://a.server.com/getxml?id=123').getxml()
   ...

(Note in this last example there is no xmltype constructor as getxml() member method returns the datatype xmltype rather than simply text.)

The possibilities are endless :-)

You can download this demo script here.

2 comments:

  1. HI Kim,

    Very Nice and most useful post...I am doing the same work now...But am facing performance problem on Oracle 10g..

    Could you please help me to resolve my problem..Its urgent...
    MY XML is"


    -
    -
    -
    -
    PHIL
    POLLAK
    1591 Miller Rd
    Lilburn
    GA
    30047
    Sod Sales Direct
    null@cybersource.com
    US
    6785757676

    -
    -
    1339
    11
    2016
    Visa


    -
    -
    P
    1
    2015.81
    0.00
    default


    -
    -
    0
    DCARDREFUSED
    DECLINED

    -




    -
    3982617706980176056193
    fdiglobal
    2015.81
    USD
    0.00
    Z
    Z
    51
    2015.81
    USD
    474165

    -
    145 4/23/14


    -
    -
    PHIL
    POLLAK
    1591 Miller Rd
    Lilburn
    GA
    30047
    Polak Enterprises Inc.
    null@cybersource.com
    US
    6785757676

    -
    -
    1339
    11
    2016
    Visa


    -
    -
    P
    1
    2015.81
    0.00
    default


    -
    -
    0
    DCARDREFUSED
    DECLINED

    -




    -
    3982619956700176056193
    fdiglobal
    2015.81
    USD
    0.00
    Z
    Z
    51
    2015.81
    USD
    474165

    -
    145 4/23/14


    -
    -
    PHIL
    POLAK
    1591 Miller Rd
    Lilburn
    GA
    30047
    Polak Enterprises Inc.
    null@cybersource.com
    US
    6785757676

    -
    -
    1339
    11
    2016
    Visa


    -
    -
    P
    1
    2015.81
    0.00
    default


    -
    -
    1
    SOK
    Request was processed successfully.

    -
    1
    SOK
    Request was processed successfully.


    -
    3982625069050176056193
    fdiglobal
    2015.81
    USD
    0.00
    O
    517408
    Z
    Z
    M
    00
    2015.81
    USD
    474165

    -
    145 4/23/14



    "
    and my coding...

    ReplyDelete
    Replies
    1. Hi, Chithambaram

      Well, pasting XML into a Blogger comment is not very useful, as you can see for yourself - all the XML tags are missing ;-) And your coding is also missing - presumably the comment has become too large for Blogger. Sorry about that - Blogger is not really optimal for "support" type comments :-(

      As to "performance problem" - that is a wide question. What kind of performance problem?

      If I can I'll try to help if there's a bit more detail. Rather than pasting into Blogger comment, can you upload XML and code to some file sharing site? Then you can post the link in a comment along with a description of the performance problem.

      (I can't guarantee the speed of the reply, though, if it's really really urgent you might have to pay some consultant somewhere for fast action ;-)

      Regards
      Kim

      Delete