GARNOLD5551212
Posted: Tue Jul 30, 2013 7:06 am    Post subject: DFDL Graphical Mapping multi Records to same XML subsegment
Novice
Joined: 23 Jul 2013    Posts: 13

I have a large file of records as input, defined with DFDL, and I want to convert it to a single XML output document. What I need help understanding is how to roll up several records into one sub-loop of the XML schema. I am mapping the key to the Message Assembly->LocalEnvironment->Variables section, so I know when I hit the point where I need to start a new higher-level section, but I don't know how to set the cardinality so that I stay at the same level and don't close out my tags when writing to the file.
Example:
Input.
[Group 1][IDField=1][DATA-1]
[Group 1][IDField=1][DATA-2]
[Group 1][IDField=1][DATA-3]
[Group 1][IDField=2][DATA-1]
[Group 1][IDField=2][DATA-2]
Desired Output
<Group ID=11>
<ID Value=1>
<Data Value=1/>
<Data Value=2/>
<Data Value=3/>
</ID>
<ID Value=2>
<Data Value=1/>
<Data Value=2/>
</ID>
</Group>
I'm coming from a TIBCO background, went through the one-week IBM training, and have been creating flows with the IBM graphical maps for about 8 weeks. I was able to solve this problem for one flow by mapping just the ID section and appending it to a file, then creating a map that wrote a hard-coded XML prolog/declaration and opening <Group> tag, appended the sub-section file, and finally appended a hard-coded closing </Group> string.
In this instance I need to change values in the opening <Group> tag's attributes, and keep a pointer into my output so that I don't close out the segments until I detect a change in the ID key. I hope that is clear. This seems like it would be a common pattern, and I'm hoping it is as simple as mapping to the Message Assembly, the way you would to create multiple XML documents out of one, but this is the opposite.

mqjeff
Posted: Tue Jul 30, 2013 7:27 am    Post subject:
Grand Master
Joined: 25 Jun 2008    Posts: 17447

Create a for loop over the IDField structure.
In the submap that loop calls, create a for loop over the Data fields.

GARNOLD5551212
Posted: Tue Jul 30, 2013 10:10 am    Post subject:
Novice
Joined: 23 Jul 2013    Posts: 13

You cannot put a For Each at the Data level, because on the left side of the map the DFDL Data element is [1..1]. Each record in the file is processed one at a time by the flow that reads the file, so I only have the current record to interrogate, plus the last IDField processed, to give me something to create a break on. Since this is all done in the context of one flow, how can I direct the output to close the current Data loop and ID, and start a new ID, when I find a new value in IDField?
One record per line = ([Group 1][IDField=1][DATA-1])
The right side XMLNSC Data is defined as [1..*].
The file is so large that I cannot read the entire file into the tree, do a For Each on the ID as the primary input, and then loop through the matching Data as supplementary mapped data.
So the question is: when reading the file one record at a time, how do I combine records under the matching ID level on the right-side XML map?

dogorsy
Posted: Tue Jul 30, 2013 10:30 am    Post subject:
Knight
Joined: 13 Mar 2013    Posts: 553    Location: Home Office

Are you trying to do it using a Mapping node? If so, that is the wrong approach. Use a Compute node.

GARNOLD5551212
Posted: Tue Jul 30, 2013 11:23 am    Post subject:
Novice
Joined: 23 Jul 2013    Posts: 13

That was what I was trying to avoid. I was pretty sure it could be coded in ESQL or a Java Compute node, but I wanted to stay with a purely graphical mapping if possible.

mqjeff
Posted: Tue Jul 30, 2013 11:40 am    Post subject:
Grand Master
Joined: 25 Jun 2008    Posts: 17447

If each record is a new invocation of the flow, you need to store the "current" group id in the global cache, or stick it in a database or similar where you can read it back, so you can check whether the next record belongs to the same group or not.
You can't access the global cache from a Mapping node without writing ESQL or a Java Compute node.
This is really a collection pattern. You should consider using a Collector node to assemble the records that belong to each group and then output each group at once.
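
As a sketch of the database option only (the one-row table GROUP_STATE(LAST_ID) and the DFDL element names Record and IDField are made up for illustration; the Compute node's Compute mode would need to include LocalEnvironment):

Code:
CREATE COMPUTE MODULE CheckGroupBreak_Compute
    CREATE FUNCTION Main() RETURNS BOOLEAN
    BEGIN
        DECLARE thisId CHARACTER InputRoot.DFDL.Record.IDField;
        -- Read back whatever group id the previous invocation stored
        DECLARE lastId CHARACTER;
        SET lastId = THE(SELECT ITEM S.LAST_ID FROM Database.GROUP_STATE AS S);

        IF lastId IS NULL THEN
            INSERT INTO Database.GROUP_STATE (LAST_ID) VALUES (thisId);
        ELSEIF thisId <> lastId THEN
            UPDATE Database.GROUP_STATE AS S SET LAST_ID = thisId;
            -- Flag the break so downstream nodes can close the previous group first
            SET OutputLocalEnvironment.Variables.NewGroup = TRUE;
        END IF;

        SET OutputRoot = InputRoot;
        RETURN TRUE;
    END;
END MODULE;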

dogorsy
Posted: Tue Jul 30, 2013 9:44 pm    Post subject:
Knight
Joined: 13 Mar 2013    Posts: 553    Location: Home Office

mqjeff wrote:
If each record is a new invocation of the flow, you need to store the "current" group id in the global cache, or stick it in a database or similar where you can read it back, so you can check whether the next record belongs to the same group or not.
You can't access the global cache from a Mapping node without writing ESQL or a Java Compute node.
This is really a collection pattern. You should consider using a Collector node to assemble the records that belong to each group and then output each group at once.

Agree, but why use a Collector node when the whole file can be read (rather than a record at a time)? Then use a Compute node to loop through the records and create the XML output. While looping, the consumed input records can be deleted to free up memory. But you may be right if several output records are required (i.e. one per group).
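
A rough ESQL sketch of that whole-file approach, assuming a message root named File with repeating Record elements carrying GroupId, IDField and Data fields (names made up for illustration); the input is copied to Environment so that each record can be deleted once it has been consumed:

Code:
CREATE COMPUTE MODULE BuildGroups_Compute
    CREATE FUNCTION Main() RETURNS BOOLEAN
    BEGIN
        -- Work on a modifiable copy so consumed records can be deleted as we go
        SET Environment.Variables.File = InputRoot.DFDL.File;

        DECLARE rec    REFERENCE TO Environment.Variables.File.Record[1];
        DECLARE done   REFERENCE TO rec;
        DECLARE outId  REFERENCE TO OutputRoot;
        DECLARE lastId CHARACTER '';

        SET OutputRoot.XMLNSC.Group.(XMLNSC.Attribute)ID = rec.GroupId;

        WHILE LASTMOVE(rec) DO
            IF rec.IDField <> lastId THEN
                -- This record starts a new group: open a new <ID> element
                CREATE LASTCHILD OF OutputRoot.XMLNSC.Group AS outId NAME 'ID';
                SET outId.(XMLNSC.Attribute)Value = rec.IDField;
                SET lastId = rec.IDField;
            END IF;
            CREATE LASTCHILD OF outId NAME 'Data';
            SET outId.Data[<].(XMLNSC.Attribute)Value = rec.Data;

            -- Delete the consumed record before moving on, to cap memory use
            MOVE done TO rec;
            MOVE rec NEXTSIBLING REPEAT TYPE NAME;
            DELETE FIELD done;
        END WHILE;
        RETURN TRUE;
    END;
END MODULE;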

kimbert
Posted: Wed Jul 31, 2013 12:58 am    Post subject:
Jedi Council
Joined: 29 Jul 2003    Posts: 5542    Location: Southampton

Quote:
why use a collector node when the whole file can be read (rather than a record at a time).

Because the file might get very large?
If it were me, I would write a few lines of ESQL to do this. The current group id could be stored in a SHARED ROW variable. It's not graphical mapping, but it's not many lines of code either.
_________________
Before you criticize someone, walk a mile in their shoes. That way you're a mile away, and you have their shoes too.
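
A minimal ESQL sketch of that idea for a record-at-a-time flow, with assumed element names (Record, IDField); a scalar SHARED variable is enough to hold just the group id, and the next reply explains where the ROW comes in:

Code:
CREATE COMPUTE MODULE TrackGroup_Compute
    -- Module-level SHARED variables keep their value across invocations of the flow
    -- (per execution group; wrap access in BEGIN ATOMIC ... END if additional instances are used)
    DECLARE lastGroupId SHARED CHARACTER;

    CREATE FUNCTION Main() RETURNS BOOLEAN
    BEGIN
        DECLARE thisId CHARACTER InputRoot.DFDL.Record.IDField;
        IF lastGroupId IS NULL OR thisId <> lastGroupId THEN
            -- New group detected: flag it for the rest of the flow
            SET OutputLocalEnvironment.Variables.NewGroup = TRUE;
            SET lastGroupId = thisId;
        END IF;
        SET OutputRoot = InputRoot;
        RETURN TRUE;
    END;
END MODULE;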

dogorsy
Posted: Wed Jul 31, 2013 1:31 am    Post subject:
Knight
Joined: 13 Mar 2013    Posts: 553    Location: Home Office

kimbert wrote:
Quote:
why use a collector node when the whole file can be read (rather than a record at a time).

Because the file might get very large?
If it were me, I would write a few lines of ESQL to do this. The current group id could be stored in a SHARED ROW variable. It's not graphical mapping, but it's not many lines of code either.

Yes, agree. But it is not only the current group id that needs to be stored in a shared variable; the output XML needs to be as well, until the group is complete. (I know that is what you meant, Tim; that's why you said ROW, but just clarifying.)
Having said that, if it is known that the file size will not be large, then life would be a lot easier by reading the whole file.
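
Building on the sketch above, the group's content can be collected in a SHARED ROW and only propagated when the id changes. Element names are still assumptions, and the final group would still have to be flushed at end of file (for example off the FileInput node's End of Data terminal):

Code:
CREATE COMPUTE MODULE AccumulateGroup_Compute
    DECLARE lastGroupId SHARED CHARACTER;
    DECLARE pending     SHARED ROW;

    CREATE FUNCTION Main() RETURNS BOOLEAN
    BEGIN
        DECLARE thisId CHARACTER InputRoot.DFDL.Record.IDField;
        DECLARE i INTEGER 1;

        IF lastGroupId IS NOT NULL AND thisId <> lastGroupId THEN
            -- Group boundary: turn everything collected so far into one <ID> element
            SET OutputRoot.Properties = InputRoot.Properties;
            SET OutputRoot.XMLNSC.ID.(XMLNSC.Attribute)Value = lastGroupId;
            WHILE i <= CARDINALITY(pending.Items.Item[]) DO
                SET OutputRoot.XMLNSC.ID.Data[i].(XMLNSC.Attribute)Value = pending.Items.Item[i];
                SET i = i + 1;
            END WHILE;
            PROPAGATE TO TERMINAL 'out';
            DELETE FIELD pending.Items;
        END IF;

        SET lastGroupId = thisId;
        SET pending.Items.Item[CARDINALITY(pending.Items.Item[]) + 1] = InputRoot.DFDL.Record.Data;

        -- Nothing is emitted until a boundary is seen, so suppress the automatic propagate
        RETURN FALSE;
    END;
END MODULE;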

Vitor
Posted: Wed Jul 31, 2013 4:45 am    Post subject:
Grand High Poobah
Joined: 11 Nov 2005    Posts: 26093    Location: Texas, USA

dogorsy wrote:
Having said that, if it is known that the file size will not be large, then life would be a lot easier by reading the whole file.

I have a great suspicion of things which are "known". In a lot of instances what is "known" is not in fact true a few years later (a file which is "never" more than 10MB grows to 100MB a year later as the business evolves), or is not in fact known at all.
Case in point: one of my customers developed a flow which read the whole file for exactly the reason given here; they needed to group records & it was "known" that the file in question never contained more than 30 or so business accounts, with never more than 50 transactions per account per day. They got through 3 months of QA & business testing & went live with much fanfare, then blew out my broker processing a file the day after. Investigation quickly revealed the broker had run out of memory trying to swallow their file whole; the file was in fact huge & one account in it had 7,500 transactions. When the business area was queried, and reminded that they'd been certain about the 50-transaction limit, they replied:
"Oh yes, never more than 50. Apart from 5 or 6 accounts we have to handle manually because they have thousands of transactions. No, the people you were speaking to probably didn't know about them because they always go through the exception process. We're really glad your new system is in place because those accounts are a real pain to deal with."
_________________
Honesty is the best policy.
Insanity is the best defence.

dogorsy
Posted: Wed Jul 31, 2013 7:32 am    Post subject:
Knight
Joined: 13 Mar 2013    Posts: 553    Location: Home Office

Vitor wrote:
dogorsy wrote:
Having said that, if it is known that the file size will not be large, then life would be a lot easier by reading the whole file.

I have a great suspicion of things which are "known". [...]
Nice one!

mqjeff
Posted: Wed Jul 31, 2013 7:35 am    Post subject:
Grand Master
Joined: 25 Jun 2008    Posts: 17447

dogorsy wrote:
Vitor wrote:
I have a great suspicion of things which are "known". [...]

Nice one!
It's not actually nice. It happens *all of the time* out in the real world, rather than sitting in a lab somewhere in the south of Blighty.
"Requirements? Yes, we should have some of those. We need you to go live in two weeks, so keep building the software."

dogorsy
Posted: Wed Jul 31, 2013 7:45 am    Post subject:
Knight
Joined: 13 Mar 2013    Posts: 553    Location: Home Office

mqjeff wrote:
[...]
It's not actually nice. It happens *all of the time* out in the real world, rather than sitting in a lab somewhere in the south of Blighty.
"Requirements? Yes, we should have some of those. We need you to go live in two weeks, so keep building the software."
Sorry, I was being sarcastic.

Vitor
Posted: Wed Jul 31, 2013 8:00 am    Post subject:
Grand High Poobah
Joined: 11 Nov 2005    Posts: 26093    Location: Texas, USA

dogorsy wrote:
Sorry, I was being sarcastic.

We need a better emoticon for that. A lot of us could make use of it.
_________________
Honesty is the best policy.
Insanity is the best defence.

GARNOLD5551212
Posted: Thu Oct 17, 2013 12:18 pm    Post subject:
Novice
Joined: 23 Jul 2013    Posts: 13

Just to close the loop: my solution for this large-file processing was a mix of a Java Compute node and maps to temporary files. I mapped the ID field from each record, as it was read, to LocalEnvironment->Variables.
I then used a Java Compute node to store the current ID in a class static variable and detect when it changed. That let me use a Route node to decide whether I needed to close the current sub-group and append it to my final output file. This let me group the IDs and only write about 10-15 lines of Java. The flow used a total of 20 out-of-the-box nodes and one Java Compute node. Memory use has stayed very low.
Thanks to all who commented.