Author |
Message
|
adubya |
Posted: Sat Jul 09, 2016 6:27 am Post subject: DFDL reparsing |
|
|
Partisan
Joined: 25 Aug 2011 Posts: 377 Location: GU12, UK
|
Hi All,
I'm struggling to understand the behavior of a message flow which is using the DFDL parser to process an EDIFACT message (turning it into an XML representation).
There are a number of compute nodes (JCNs) in the processing chain and when investigating a performance issue I ran a user debug trace and found that the DFDL parser was being invoked at the start of each JCN when it accessed the message root. It's essentially performing a DFDL reparse in each JCN.
I created a simple test flow using ESQL compute nodes which just copy InputRoot to OutputRoot (and access the same element in the DFDL message tree) and the same thing happens, a DFDL reparse in each compute node.
If I change the "mode" of the first ESQL Compute node to "LocalEnvironment" then no DFDL reparse occurs in the following compute node. So it looks like if IIB believes that the logical message structure could have been altered in the previous node (even it it wasn't) then it performs a DFDL reparse.
This is IIB10.0.0.2 but occurs in 8.0.0.5 also.
Any ideas why the reparse is occurring ?
A workaround is to store the message tree in the Environment tree and then reference that in subsequent compute nodes, that works and doesn't cause a reparse (as the InputRoot message body isn't being accessed) but seems like it shouldn't be required.
Thanks. _________________ Independent Middleware Consultant
andy@knownentity.com |
|
Back to top |
|
 |
timber |
Posted: Sun Jul 10, 2016 12:09 pm Post subject: |
|
|
 Grand Master
Joined: 25 Aug 2015 Posts: 1292
|
I agree; that doesn't sound like correct behaviour to me. If you are 100% sure that the JCNs are not modifying the message tree then you should open a PMR. |
|
Back to top |
|
 |
adubya |
Posted: Mon Jul 11, 2016 12:11 am Post subject: |
|
|
Partisan
Joined: 25 Aug 2011 Posts: 377 Location: GU12, UK
|
timber wrote: |
I agree; that doesn't sound like correct behaviour to me. If you are 100% sure that the JCNs are not modifying the message tree then you should open a PMR. |
We've found that if we use the JCN "filter" pattern i.e. just propagate the entire message assembly (but still access the message body) then no-reparse occurs in the downstream JCN.
If we use the JCN "modify" pattern, but don't actually change the message tree then a reparse occurs in the downstream JCN.
This is consistent with the ESQL compute node situation whereas an ESQL node which just does SET OutputRoot = InputRoot; (and also reads elements in the input message body ) will cause a reparse in the next compute node which attempts to access the message body. A compute node with mode set to "Local Environment" i.e. broker knows the message tree won't change, will not cause a reparse in the following node. _________________ Independent Middleware Consultant
andy@knownentity.com |
|
Back to top |
|
 |
mqjeff |
Posted: Mon Jul 11, 2016 4:07 am Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
Broker uses a Just In Time parser.
It parses what you need, when you need it.
If you do not access any part of the message body tree before you get to a given node, then it won't have been parsed.
Unless you tell the input node to parse immediately. _________________ chmod -R ugo-wx / |
|
Back to top |
|
 |
timber |
Posted: Mon Jul 11, 2016 4:41 am Post subject: |
|
|
 Grand Master
Joined: 25 Aug 2015 Posts: 1292
|
Quote: |
This is consistent with the ESQL compute node situation whereas an ESQL node which just does SET OutputRoot = InputRoot; (and also reads elements in the input message body ) will cause a reparse in the next compute node which attempts to access the message body. |
Hmmm. I'm pretty sure that the MRM parser would not re-parse unless the message tree was explicitly written to. But I'm prepared to be persuaded. Anybody care to offer experimental proof one way or the other? |
|
Back to top |
|
 |
adubya |
Posted: Mon Jul 11, 2016 4:47 am Post subject: |
|
|
Partisan
Joined: 25 Aug 2011 Posts: 377 Location: GU12, UK
|
mqjeff wrote: |
Broker uses a Just In Time parser.
It parses what you need, when you need it.
If you do not access any part of the message body tree before you get to a given node, then it won't have been parsed.
Unless you tell the input node to parse immediately. |
Hi Jeff,
Understand that but in this case broker doesn't seem to make use of the message tree which has been constructed in previous nodes and performs a re-parse. _________________ Independent Middleware Consultant
andy@knownentity.com |
|
Back to top |
|
 |
adubya |
Posted: Mon Jul 11, 2016 4:50 am Post subject: |
|
|
Partisan
Joined: 25 Aug 2011 Posts: 377 Location: GU12, UK
|
timber wrote: |
Hmmm. I'm pretty sure that the MRM parser would not re-parse unless the message tree was explicitly written to. But I'm prepared to be persuaded. Anybody care to offer experimental proof one way or the other? |
Sorry I should have been clearer, my ESQL parallel was also with the DFDL domain i.e. we're seeing the same re-parse behavior in both JCN and ESQL compute nodes for DFDL. _________________ Independent Middleware Consultant
andy@knownentity.com
Last edited by adubya on Mon Jul 11, 2016 4:51 am; edited 1 time in total |
|
Back to top |
|
 |
mqjeff |
Posted: Mon Jul 11, 2016 4:51 am Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
adubya wrote: |
mqjeff wrote: |
Broker uses a Just In Time parser.
It parses what you need, when you need it.
If you do not access any part of the message body tree before you get to a given node, then it won't have been parsed.
Unless you tell the input node to parse immediately. |
Hi Jeff,
Understand that but in this case broker doesn't seem to make use of the message tree which has been constructed in previous nodes and performs a re-parse. |
How do you know it's a "re-parse" and not just a parse?
Again, if you haven't done anything to the Body tree, then it won't be parsed unless you set or do something to cause it do be parsed. _________________ chmod -R ugo-wx / |
|
Back to top |
|
 |
adubya |
Posted: Mon Jul 11, 2016 4:59 am Post subject: |
|
|
Partisan
Joined: 25 Aug 2011 Posts: 377 Location: GU12, UK
|
mqjeff wrote: |
How do you know it's a "re-parse" and not just a parse?
Again, if you haven't done anything to the Body tree, then it won't be parsed unless you set or do something to cause it do be parsed. |
If I have two ESQL compute nodes linked together, both containing the following ESQL
Code: |
SET OutputRoot = InputRoot;
DECLARE tmp CHAR;
SET tmp = InputRoot.DFDL.msg.element;
RETURN TRUE; |
Then when I run a user trace I see the exact same DFDL parser trace occurring twice, once in the first compute node and then again in the following compute node.
If I change the first compute node to remove the "SET OutputRoot = InputRoot" and set the node mode to "Local Environment" for that first compute node and rerun the trace then I see the DFDL parse kick in for the first compute node only. The second compute node accesses the message tree without what I call a re-parse. This is the behavior I would expect for the first scenario too i.e. the parser will only kick in when required and if a section of the message has been parsed and the corresponding message tree created then that will be used for subsequent access to the same element in the message. _________________ Independent Middleware Consultant
andy@knownentity.com
Last edited by adubya on Mon Jul 11, 2016 5:03 am; edited 1 time in total |
|
Back to top |
|
 |
mqjeff |
Posted: Mon Jul 11, 2016 5:02 am Post subject: |
|
|
Grand Master
Joined: 25 Jun 2008 Posts: 17447
|
Ok.
So you pass the unparsed tree downstream with the set outputroot=inputroot.
You then access the input root, which causes a parse.
But you've already copied the unparsed input root.
So when you get to the next node, you have an unparsed input root still.
If you put the set output root *after* the access of the input tree, you shouldn't see a second parse. _________________ chmod -R ugo-wx / |
|
Back to top |
|
 |
adubya |
Posted: Mon Jul 11, 2016 5:04 am Post subject: |
|
|
Partisan
Joined: 25 Aug 2011 Posts: 377 Location: GU12, UK
|
OK, it may well be as simple as that
Will give it a go!
Thanks. _________________ Independent Middleware Consultant
andy@knownentity.com |
|
Back to top |
|
 |
adubya |
Posted: Mon Jul 11, 2016 10:25 am Post subject: |
|
|
Partisan
Joined: 25 Aug 2011 Posts: 377 Location: GU12, UK
|
Tried adding the input message copy to output message at the end of the compute node and it makes no difference.
Two compute nodes in sequence, both containing the following code.
Code: |
CREATE FUNCTION Main() RETURNS BOOLEAN
BEGIN
DECLARE inRef REFERENCE TO InputRoot;
DECLARE dummy CHAR;
SET dummy = inRef.DFDL.simpleCSV.record[>].field1;
CALL CopyEntireMessage();
RETURN TRUE;
END;
|
Trace file output
Code: |
2016-07-11 19:19:14.140655 772 UserTrace BIP2537I: Node 'FileReadAndParse.Compute': Executing statement ''BEGIN ... END;'' at ('.FileReadAndParse_Compute.Main', '2.2').
2016-07-11 19:19:14.140671 772 UserTrace BIP2537I: Node 'FileReadAndParse.Compute': Executing statement ''DECLARE inRef REFERENCE TO InputRoot;'' at ('.FileReadAndParse_Compute.Main', '4.3').
2016-07-11 19:19:14.140681 772 UserTrace BIP2537I: Node 'FileReadAndParse.Compute': Executing statement ''DECLARE dummy CHARACTER;'' at ('.FileReadAndParse_Compute.Main', '5.3').
2016-07-11 19:19:14.140688 772 UserTrace BIP2537I: Node 'FileReadAndParse.Compute': Executing statement ''SET dummy = inRef.DFDL.simpleCSV.record.field1;'' at ('.FileReadAndParse_Compute.Main', '7.3').
2016-07-11 19:19:14.140842 772 UserTrace BIP5841I: ''Offset: 0. Parsing will start from root element 'simpleCSV'.''
2016-07-11 19:19:14.140862 772 UserTrace BIP5841I: ''The default value of '%LF;' was assigned to variable 'outputNewLine' in namespace 'http://www.ogf.org/dfdl/dfdl-1.0/'. ''
2016-07-11 19:19:14.140883 772 UserTrace BIP5841I: ''Offset: 0. Starting to process element 'simpleCSV'.''
2016-07-11 19:19:14.140903 772 UserTrace BIP5841I: ''Offset: 0. Starting to process element 'record'.''
2016-07-11 19:19:14.140915 772 UserTrace BIP5841I: ''Offset: 0. Starting to process element 'field1'.''
2016-07-11 19:19:14.141053 772 UserTrace BIP5841I: ''Offset: 0. Found delimited value '1' for element 'field1'. The delimiter was ','. ''
2016-07-11 19:19:14.141133 772 UserTrace BIP5841I: ''Offset: 1. The simple content region of element 'field1' was converted to logical value '1'. ''
2016-07-11 19:19:14.141147 772 UserTrace BIP5841I: ''Offset: 1. Finished processing element 'field1'.''
2016-07-11 19:19:14.141158 772 UserTrace BIP2539I: Node '': Evaluating expression ''inRef.DFDL.simpleCSV.record.field1'' at ('.FileReadAndParse_Compute.Main', '7.15'). This resolved to ''inRef.DFDL.simpleCSV.record.field1''. The result was ''1''.
2016-07-11 19:19:14.141171 772 UserTrace BIP2537I: Node 'FileReadAndParse.Compute': Executing statement ''CALL CopyEntireMessage();'' at ('.FileReadAndParse_Compute.Main', '9.3').
2016-07-11 19:19:14.141176 772 UserTrace BIP2538I: Node 'FileReadAndParse.Compute': Evaluating expression ''CopyEntireMessage()'' at ('.FileReadAndParse_Compute.Main', '9.8').
2016-07-11 19:19:14.141182 772 UserTrace BIP2537I: Node 'FileReadAndParse.Compute': Executing statement ''BEGIN ... END;'' at ('.FileReadAndParse_Compute.CopyEntireMessage', '1.39').
2016-07-11 19:19:14.141190 772 UserTrace BIP2537I: Node 'FileReadAndParse.Compute': Executing statement ''SET OutputRoot = InputRoot;'' at ('.FileReadAndParse_Compute.CopyEntireMessage', '2.3').
2016-07-11 19:19:14.141201 772 UserTrace BIP2539I: Node '': Evaluating expression ''InputRoot'' at ('.FileReadAndParse_Compute.CopyEntireMessage', '2.20'). This resolved to ''InputRoot''. The result was ''ROW... Root Element Type=16777216 NameSpace='' Name='Root' Value=NULL''.
2016-07-11 19:19:14.141209 772 UserTrace BIP2568I: Node 'FileReadAndParse.Compute': Copying sub-tree from ''InputRoot'' to ''OutputRoot''.
2016-07-11 19:19:14.142099 772 UserTrace BIP2537I: Node 'FileReadAndParse.Compute': Executing statement ''RETURN TRUE;'' at ('.FileReadAndParse_Compute.Main', '11.3').
2016-07-11 19:19:14.142121 772 UserTrace BIP4015I: Message propagated to the 'out' terminal of node 'FileReadAndParse.Compute' with the following message trees: ''.
2016-07-11 19:19:14.142163 772 UserTrace BIP2537I: Node 'FileReadAndParse.Compute1': Executing statement ''BEGIN ... END;'' at ('.FileReadAndParse_Compute.Main', '2.2').
2016-07-11 19:19:14.142171 772 UserTrace BIP2537I: Node 'FileReadAndParse.Compute1': Executing statement ''DECLARE inRef REFERENCE TO InputRoot;'' at ('.FileReadAndParse_Compute.Main', '4.3').
2016-07-11 19:19:14.142178 772 UserTrace BIP2537I: Node 'FileReadAndParse.Compute1': Executing statement ''DECLARE dummy CHARACTER;'' at ('.FileReadAndParse_Compute.Main', '5.3').
2016-07-11 19:19:14.142186 772 UserTrace BIP2537I: Node 'FileReadAndParse.Compute1': Executing statement ''SET dummy = inRef.DFDL.simpleCSV.record.field1;'' at ('.FileReadAndParse_Compute.Main', '7.3').
2016-07-11 19:19:14.142252 772 UserTrace BIP5841I: ''Offset: 0. Parsing will start from root element 'simpleCSV'.''
2016-07-11 19:19:14.142264 772 UserTrace BIP5841I: ''The default value of '%LF;' was assigned to variable 'outputNewLine' in namespace 'http://www.ogf.org/dfdl/dfdl-1.0/'. ''
2016-07-11 19:19:14.142278 772 UserTrace BIP5841I: ''Offset: 0. Starting to process element 'simpleCSV'.''
2016-07-11 19:19:14.142290 772 UserTrace BIP5841I: ''Offset: 0. Starting to process element 'record'.''
2016-07-11 19:19:14.142300 772 UserTrace BIP5841I: ''Offset: 0. Starting to process element 'field1'.''
2016-07-11 19:19:14.142425 772 UserTrace BIP5841I: ''Offset: 0. Found delimited value '1' for element 'field1'. The delimiter was ','. ''
2016-07-11 19:19:14.142496 772 UserTrace BIP5841I: ''Offset: 1. The simple content region of element 'field1' was converted to logical value '1'. ''
2016-07-11 19:19:14.142508 772 UserTrace BIP5841I: ''Offset: 1. Finished processing element 'field1'.''
2016-07-11 19:19:14.142561 772 UserTrace BIP2539I: Node '': Evaluating expression ''inRef.DFDL.simpleCSV.record.field1'' at ('.FileReadAndParse_Compute.Main', '7.15'). This resolved to ''inRef.DFDL.simpleCSV.record.field1''. The result was ''1''.
2016-07-11 19:19:14.142574 772 UserTrace BIP2537I: Node 'FileReadAndParse.Compute1': Executing statement ''CALL CopyEntireMessage();'' at ('.FileReadAndParse_Compute.Main', '9.3').
2016-07-11 19:19:14.142579 772 UserTrace BIP2538I: Node 'FileReadAndParse.Compute1': Evaluating expression ''CopyEntireMessage()'' at ('.FileReadAndParse_Compute.Main', '9.8').
2016-07-11 19:19:14.142584 772 UserTrace BIP2537I: Node 'FileReadAndParse.Compute1': Executing statement ''BEGIN ... END;'' at ('.FileReadAndParse_Compute.CopyEntireMessage', '1.39').
2016-07-11 19:19:14.142590 772 UserTrace BIP2537I: Node 'FileReadAndParse.Compute1': Executing statement ''SET OutputRoot = InputRoot;'' at ('.FileReadAndParse_Compute.CopyEntireMessage', '2.3').
2016-07-11 19:19:14.142599 772 UserTrace BIP2539I: Node '': Evaluating expression ''InputRoot'' at ('.FileReadAndParse_Compute.CopyEntireMessage', '2.20'). This resolved to ''InputRoot''. The result was ''ROW... Root Element Type=16777216 NameSpace='' Name='Root' Value=NULL''.
2016-07-11 19:19:14.142605 772 UserTrace BIP2568I: Node 'FileReadAndParse.Compute1': Copying sub-tree from ''InputRoot'' to ''OutputRoot''.
2016-07-11 19:19:14.143467 772 UserTrace BIP2537I: Node 'FileReadAndParse.Compute1': Executing statement ''RETURN TRUE;'' at ('.FileReadAndParse_Compute.Main', '11.3').
2016-07-11 19:19:14.143490 772 UserTrace BIP4015I: Message propagated to the 'out' terminal of node 'FileReadAndParse.Compute1' with the following message trees: ''.
|
So the parser is being invoked in both compute nodes, despite the first node having accessed the requested element successfully (and so the logical message tree must have been created to include that element). _________________ Independent Middleware Consultant
andy@knownentity.com |
|
Back to top |
|
 |
|