Message Parsing in v5.0 |
Author |
Message
|
Scorpio_16 |
Posted: Thu Apr 02, 2009 8:33 am Post subject: Message Parsing in v5.0 |
|
|
Apprentice
Joined: 15 Dec 2008 Posts: 25
|
This is the issue with message parsing.
I have an input message in the format below:
"first name","last name","streetname,city","company name",gggg,tttt,hhhhh
I have defined a message type (DoubleQuotes, described below) that takes the data between double quotes as one element, and assigned this type to the elements first name, last name, (streetname,city) and company name.
The remaining elements are just declared as string type.
DoubleQuotes
Data Element Separation: All Elements Delimited
Group Indicator: "
Group Terminator: "
Delimiter: ,
But the problem is that when there is a comma in any of the first 4 elements, the parsing fails, because the delimiter for the DoubleQuotes message type is a comma.
Could someone help me out or suggest how this can be resolved?
The DoubleQuotes message type should ignore any comma that appears inside double quotes.
Thanks |
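The problem described above can be reproduced outside the broker. The following is a minimal Python sketch, not a TDS model: it only illustrates why a plain comma delimiter breaks the third field, and what a quote-aware split is expected to return. The variable names are assumptions made for the illustration.

```python
import csv

line = '"first name","last name","streetname,city","company name",gggg,tttt,hhhhh'

# Naive splitting treats the comma inside "streetname,city" as a delimiter.
naive_fields = line.split(",")

# A quote-aware reader keeps the quoted comma as data and drops the quotes.
quoted_fields = next(csv.reader([line]))

print(len(naive_fields))   # 8
print(quoted_fields[2])    # streetname,city
```

The naive split yields 8 pieces instead of the 7 fields the record actually contains, which is the same failure the DoubleQuotes type runs into.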
elvis_gn |
Posted: Thu Apr 02, 2009 9:07 am Post subject: Re: Message Parsing in v5.0 |
|
|
Padawan
Joined: 08 Oct 2004 Posts: 1905 Location: Dubai
|
Hi Scorpio_16,
Scorpio_16 wrote: |
"first name","last name","streetname,city","company name",gggg,tttt,hhhhh |
You could do it in many ways; which one to choose depends mainly on how you want to classify your fields, or whether you don't want to classify them and simply want them in one list.
I would personally do it like this:
Main Complex Type: TotalMessage
Delimiter: ,
Complex Type: Name
Group Indicator: "
Group delimiter: ","
Group Terminator: "
Inner Children: FirstName, LastName
Complex Type: Address
Group Indicator: "
Group Delimiter: ,
Group Terminator: "
Inner Children: Street, City
Complex Type: Company
Group Indicator: "
Group Terminator: "
Inner children: CompanyName
The rest of the fields just need simple elements, I guess.
This is just a suggestion; you might need something simpler, which is also possible.
Regards. |
Scorpio_16 |
Posted: Thu Apr 02, 2009 11:49 am Post subject: |
|
|
Apprentice
Joined: 15 Dec 2008 Posts: 25
|
Thanks for the reply, elvis.
According to the logic you described, each field is separated out and assigned to its own element.
But the requirement is:
1st element should be First Name
2nd - Last name
3rd - Address,city
4th - Company name
5th - gggg
---
---
If you look at the sample input that I gave, each of the elements is in double quotes, and (Address, city) are together in one pair of double quotes, separated by a comma inside the quotes. |
kimbert |
Posted: Thu Apr 02, 2009 12:34 pm Post subject: |
|
|
Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
This has cropped up before. It is possible, but you need some advanced TDS trickery to do it.
- v6.1 TDS has built-in support for this CSV-style escaping mechanism.
- The v6.0 Samples Gallery shows how to do it using standard TDS.
- In v5 you need help from someone who knows the recipe
I'll post again when I've looked up the solution. |
kimbert |
Posted: Fri Apr 03, 2009 2:30 am Post subject: |
|
|
Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
More information required:
- Are the quotes *always* included around *every* field, or are they only there if the field contains a comma?
- A quoted field can contain a comma. Can it also contain a quote character, escaped by itself? ( CSV allows this ).
Example:
Code: |
field1,"He said ""this is quoted""",field3 |
which parses as
Code: |
field1
He said "this is quoted"
field3 |
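For comparison, the same doubled-quote escaping can be checked with Python's standard `csv` module, which follows this CSV convention by default. This is only a sketch to confirm the expected field values; it says nothing about how the broker's TDS parser behaves.

```python
import csv

raw = 'field1,"He said ""this is quoted""",field3'

# csv.reader treats "" inside a quoted field as one literal quote character.
fields = next(csv.reader([raw]))

for value in fields:
    print(value)
```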
Scorpio_16 |
Posted: Fri Apr 03, 2009 2:53 am Post subject: |
|
|
Apprentice
Joined: 15 Dec 2008 Posts: 25
|
Hi Kimbert,
Thanks for the reply.
The quotes are not included for every field; they may be present for only a few elements, as shown below:
"first name","lastname","streetname,city","companyname",gggg,tttt,hhhhh
A comma is the delimiter for this record, but a comma can also occur inside double quotes as part of the data, and this should not be treated as a delimiter.
The above record should be parsed as:
1st - First Name
2nd - Last name
3rd - streetname,city
4th - Companyname
5th - gggg
---
---
Thanks again |
kimbert |
Posted: Fri Apr 03, 2009 3:28 am Post subject: |
|
|
Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
and the answer to my second question is...? |
Scorpio_16 |
Posted: Fri Apr 03, 2009 3:51 am Post subject: |
|
|
Apprentice
Joined: 15 Dec 2008 Posts: 25
|
Hi Kimbert
A quoted field can contain a comma.
It does not contain quote characters escaped by themselves.
Thanks |
kimbert |
Posted: Fri Apr 03, 2009 4:57 am Post subject: |
|
|
Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
OK - here's my suggestion.
- Fields are separated by commas
- The last field on a line is terminated by <CR><LF>
- The value of each field might or might not start with ".
- If a field value starts with " then it will end with "
So I think every field on the line has to be modelled like this:
Code: |
Group name="field1"
complexType composition="choice", DES="UseDataPattern"
Group GI=""" GT=""" dataPattern=""[^"]*" DES=AllElementsDelimited, delimiter=""
element ref="field1"
Group GI="" GT="" dataPattern="[^,\r\n]*" DES=AllElementsDelimited, delimiter=""
element ref="field1"
|
I have only modelled field1. You would need to do this for every field.
Note that you need *two* references to each field - one with quotes and one without. So I have assumed that you will make those element declarations global, and reference them from the choice.
If you don't mind having the quotes included in the data, you could use this model which is simpler:
Code: |
group composition="sequence", DES="UseDataPattern"
element name="field1" dataPattern="("[^"]*") | ([^,\r\n]*) "
|
In this case, you just wrap field1 in a sequence ( and create a similar single-member group for all the other fields ).
This is not tested, so it probably will not work first time. |
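The simpler model above (quotes kept in the data) amounts to matching each field against one alternation: a quoted run, or a run containing no comma, CR or LF. Below is a rough Python sketch of that idea, assuming the alternation is written without the embedded spaces. The function name `split_keep_quotes` is hypothetical, and this is an illustration of the pattern logic only, not of how the TDS parser applies data patterns.

```python
import re

# Either a quoted run (quotes kept in the value) or a run with no comma/CR/LF.
FIELD = re.compile(r'"[^"]*"|[^,\r\n]*')

def split_keep_quotes(line):
    """Split one record on commas, leaving the quotes in quoted values."""
    fields, pos = [], 0
    while True:
        m = FIELD.match(line, pos)   # always matches; may match an empty field
        fields.append(m.group(0))
        pos = m.end()
        if pos < len(line) and line[pos] == ",":
            pos += 1                 # consume the delimiter, read the next field
        else:
            break                    # end of line, CR/LF, or unexpected data
    return fields

print(split_keep_quotes('"first name","streetname,city",gggg'))
```

The quoted alternative is listed first so that a field starting with " is consumed up to its closing quote before the unquoted pattern gets a chance to stop at the embedded comma.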
Scorpio_16 |
Posted: Mon Apr 06, 2009 5:02 am Post subject: |
|
|
Apprentice
Joined: 15 Dec 2008 Posts: 25
|
Thanks for the inputs, Kimbert.
I got this working finally. Thanks again.
Is it possible to define two types for a single element so that the message set picks the right one based on incoming data?
One more quick question for you.
For example, see the two records below:
-> aaa,rdews,"12,123,99.00",xxxx,zzzzz
-> aaa,rdews,150.88,xxxx,zzzzz
In the above two records, the third element can come inside double quotes, or just as a simple element as shown in the 2nd record.
If the element is in double quotes, it contains a comma as part of the data.
In this case the parsing is a bit difficult, as we don't know when it is going to have double quotes and when not. Could you please give me any inputs on this?
Thanks again. |
kimbert |
Posted: Mon Apr 06, 2009 5:21 am Post subject: |
|
|
Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
Quote: |
Is it possible to define two types for a single element |
In XML Schema ( which is the data model for message definitions ) an element must have exactly one type. But I don't think that answers your question.
Quote: |
In the above two records, the third element can come inside double quotes, or just as a simple element as shown in the 2nd record.
If the element is in double quotes, it contains a comma as part of the data.
In this case the parsing is a bit difficult, as we don't know when it is going to have double quotes and when not. Could you please give me any inputs on this? |
What does your model look like? If you used my suggested model ( the first one ), it will cope whether or not the field has quotes. |
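The choice in the first model (a quoted branch with " as both group indicator and terminator, and an unquoted branch that stops at a comma or end of line) can be mimicked in Python to check the expected result for the two records in question. This is a sketch under that reading of the model; `parse_record` is a hypothetical helper, and the sample message assumes <CR><LF> record terminators as stated earlier in the thread.

```python
import re

QUOTED = re.compile(r'"([^"]*)"')     # branch 1: value wrapped in double quotes
UNQUOTED = re.compile(r'[^,\r\n]*')   # branch 2: plain value, no comma/CR/LF

def parse_record(line):
    """Parse one comma-delimited record, trying the quoted branch first."""
    fields, pos = [], 0
    while True:
        m = QUOTED.match(line, pos)
        if m:
            fields.append(m.group(1))       # quotes stripped from the value
        else:
            m = UNQUOTED.match(line, pos)   # always matches, possibly empty
            fields.append(m.group(0))
        pos = m.end()
        if pos < len(line) and line[pos] == ",":
            pos += 1
        else:
            return fields

message = 'aaa,rdews,"12,123,99.00",xxxx,zzzzz\r\naaa,rdews,150.88,xxxx,zzzzz\r\n'
records = [parse_record(r) for r in message.split("\r\n") if r]

for record in records:
    print(record[2])
```

Both records come out with five fields, and the third field is the same kind of value whether or not it arrived in quotes.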
Scorpio_16 |
Posted: Mon Apr 06, 2009 5:57 am Post subject: |
|
|
Apprentice
Joined: 15 Dec 2008 Posts: 25
|
Here is the one that I have used:
Global Group: DoubleQuoteComma
Composition: Choice
DES: Use Data Pattern
GI: "
GT: "
Complex Type: DoubleQuoteCommaType
Group Reference: DoubleQuoteComma
DES: Use Data Pattern
GI: "
GT: "
Added an element DATA to the global group with the below properties:
DataPattern: [^"]*
I have assigned the above complex type to the element that I have created in the message type.
Since I know that this element always comes inside double quotes, I was able to define the message set for this.
There is one more format where we do not know when the double quotes can occur. If there are double quotes around any element, then that element's data contains a comma.
aaa,rdews,"12,123,99.00",xxxx,zzzzz
aaa,rdews,150.88,xxxx,zzzzz (in this case my message set throws an exception, as the GI and GT are " for the 3rd element)
Note that comma is also a delimiter here.
Could you please suggest an idea for this? I have tried the configuration you suggested, but I couldn't get it working.
Thanks |
Scorpio_16 |
Posted: Mon Apr 06, 2009 7:51 am Post subject: |
|
|
Apprentice
Joined: 15 Dec 2008 Posts: 25
|
Hi Kimbert,
Could you explain this to me, please? Sorry, I am a bit confused about what to define for which component (group/type/element).
Thank You.
Code:
Group name="field1"
complexType composition="choice", DES="UseDataPattern"
Group GI=""" GT=""" dataPattern=""[^"]*" DES=AllElementsDelimited, delimiter=""
element ref="field1"
Group GI="" GT="" dataPattern="[^,\r\n]*" DES=AllElementsDelimited, delimiter=""
element ref="field1" |
kimbert |
Posted: Mon Apr 06, 2009 11:06 am Post subject: |
|
|
Jedi Council
Joined: 29 Jul 2003 Posts: 5542 Location: Southampton
|
Define field1, field2 as global elements. Then use a model which looks like this:
Code: |
Element name="message"
complexType composition="sequence" DES="Use Data Pattern"
Group dataPattern="[^\r]*", GroupTerminator="<CR><LF>", composition="choice", DES="UseDataPattern"
Group GI=""" GT=""" dataPattern=""[^"]*" DES=AllElementsDelimited, delimiter=""
element ref="field1"
Group GI="" GT="" dataPattern="[^,\r\n]*" DES=AllElementsDelimited, delimiter=""
element ref="field1"
Group dataPattern="[^\r]*", GroupTerminator="<CR><LF>", composition="choice", DES="UseDataPattern"
Group GI=""" GT=""" dataPattern=""[^"]*" DES=AllElementsDelimited, delimiter=""
element ref="field2"
Group GI="" GT="" dataPattern="[^,\r\n]*" DES=AllElementsDelimited, delimiter=""
element ref="field2"
Group ... |
For fields which never have quotes, you can use a simpler model, but be careful how you mix the two styles. btw, this is not tested, so you should expect to get errors. If you post again, please take a user trace and supply the full text of any error messages. |
Scorpio_16 |
Posted: Mon Apr 06, 2009 12:33 pm Post subject: |
|
|
Apprentice
Joined: 15 Dec 2008 Posts: 25
|
Hi Kimbert
Thanks for the reply. But the format in which you posted the code is really confusing to understand. I couldn't work out which property belongs to which element/type/group.
Could you please send each element/type/group and its properties in paragraph format?
Appreciate your help.
Cheers! |