santhoshramesh
Posted: Wed Feb 26, 2014 11:30 pm Post subject: Maximum output file size from FileOutputNode |
Novice
Joined: 26 Feb 2014 Posts: 17
Hi All,
I am using MB V8.0 on Windows XP. A message flow collects data from a table in an Oracle database and builds a .csv file. The file is written record by record into a transit directory, and once the finish file arrives, the output file is moved from transit to the actual output location.
On Windows XP I have observed that once the file size reaches around 44 MB, the message flow stops processing the file and generates the error "Double Faulted during Abend Processing in Function ImbAbend::logSystemErrorAndTerminateProcessInternal(), Terminating Process" in the event viewer. If the same message flow is deployed on a UNIX server, it generates an output file of more than 80 MB. So is there a limit on the generated output file size that depends on the OS, or does some setting need to be modified? If so, please provide the command to modify the setting.
Esa
Posted: Wed Feb 26, 2014 11:45 pm Post subject:
Grand Master
Joined: 22 May 2008 Posts: 1387 Location: Finland
I doubt very much this has anything to do with "maximum file size".
There is something wrong in your flow design. If I understand correctly, your flow reads a large number of records from a database, generates a message from each of them and propagates to File Output.
This kind of looping is merciless when it comes to small programming errors, the same ones that most message broker developers make every day and that normal non-looping flows tolerate very well.
Your flow allocates memory or a parser for each little message it propagates to File Output and fails to release it. After it has allocated all available memory, the execution group abends, hence your error message.
The UNIX server has a larger JVM heap, so it survives your flow a bit longer. You can increase the JVM heap, but that is not the correct solution to the problem.
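For reference, the EG's JVM heap can be inspected and raised with the standard mqsireportproperties / mqsichangeproperties commands. A sketch, with placeholder broker and execution group names (the value is in bytes, and the execution group has to be restarted before the change takes effect):
Code:
mqsireportproperties MYBROKER -e MYEG -o ComIbmJVMManager -r
mqsichangeproperties MYBROKER -e MYEG -o ComIbmJVMManager -n jvmMaxHeapSize -v 536870912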
santhoshramesh
Posted: Thu Feb 27, 2014 12:36 am Post subject:
Novice
Joined: 26 Feb 2014 Posts: 17
The purpose of the flow is to archive data from a table in .csv format, so the flow reads all the old data and generates the .csv file. I tried two different ways of writing the data to file: i) writing all the records into a single file; ii) splitting the records by month inside the flow and generating a different file for each month. In both cases, once the single file size exceeds 44 MB, or the sum of all the file sizes generated by the second design exceeds 44 MB, the broker generates the error.
Esa
Posted: Thu Feb 27, 2014 12:57 am Post subject:
Grand Master
Joined: 22 May 2008 Posts: 1387 Location: Finland
Thank you for confirming that my theory is correct.
Are you constructing one single OutputRoot containing all the records from the database, or do you create an output message for every record (or a limited number of records) and run File Output in append mode?
ESQL or Java?
Tibor
Posted: Thu Feb 27, 2014 2:21 am Post subject:
Grand Master
Joined: 20 May 2001 Posts: 1033 Location: Hungary
Esa
Posted: Thu Feb 27, 2014 3:02 am Post subject:
Grand Master
Joined: 22 May 2008 Posts: 1387 Location: Finland
Yes, but this setting doesn't limit the size of the actual file. In practice it sets a global limit for the number of bytes you can append to a file by one propagation to a File Output node.
It doesn't limit the number of times you can append to one file.
The maximum file size depends on the operating system. It's often something like 2 GB.
santhoshramesh
Posted: Thu Feb 27, 2014 3:26 am Post subject:
Novice
Joined: 26 Feb 2014 Posts: 17
I am using ESQL code. I tried both ways: 1) writing all the records into one single output file, and 2) splitting the records by month and generating one output file per month. In the first method, when the output file size reaches 43 MB, the broker stops processing and throws the error. In the second method, even though I create multiple files in one flow, the broker stops processing when the sum of all the file sizes reaches 43 MB. My understanding, based on this issue, is that on Windows with default settings a message flow can generate at most 43 MB in one processing run.
Esa
Posted: Thu Feb 27, 2014 3:45 am Post subject:
Grand Master
Joined: 22 May 2008 Posts: 1387 Location: Finland
The EG runs out of memory because of the way you do it.
Don't propagate the whole result set. Use a REFERENCE variable to navigate the result set, propagate (with DELETE DEFAULT), and create a new OutputRoot after every 50 records, for example; see the sketch below.
You should be able to write everything into one file. And you will get remarkably better performance compared to your current implementation, too.
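A minimal sketch of that pattern, with illustrative names (it assumes the result set has been copied under Environment.Variables.Result and the output message is modelled, e.g. by a DFDL CSV schema):
Code:
-- Walk the result set with a reference instead of indexed access
DECLARE rowRef REFERENCE TO Environment.Variables.Result.Row[1];
DECLARE batch INTEGER 0;
WHILE LASTMOVE(rowRef) DO
    SET batch = batch + 1;
    -- Build one modelled record per row, no BLOB concatenation
    SET OutputRoot.DFDL.CSV.record[batch] = rowRef;
    IF batch = 50 THEN
        -- DELETE DEFAULT clears OutputRoot after the propagation,
        -- releasing the trees and parsers held by this batch
        PROPAGATE TO TERMINAL 'out' DELETE DEFAULT;
        -- Start a fresh output message for the next batch
        SET OutputRoot.Properties = InputRoot.Properties;
        SET batch = 0;
    END IF;
    MOVE rowRef NEXTSIBLING REPEAT TYPE NAME;
END WHILE;
IF batch > 0 THEN
    PROPAGATE TO TERMINAL 'out' DELETE DEFAULT;
END IF;
RETURN FALSE;
With the File Output node appending records, every propagation adds its batch to the same file, so the file can grow arbitrarily large while the flow only ever holds one small batch in memory.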
By the way, you do use a DFDL schema or a Message Set for the output CSV, don't you?
Because code like this:
Code:
SET OutputRoot.BLOB.BLOB = OutputRoot.BLOB.BLOB || ',' || somevalue;
for creating large messages is very bad for performance and may almost double the overall memory allocated by the flow.
EDIT: replaced something rude with "very bad for performance..."
Last edited by Esa on Thu Feb 27, 2014 6:24 am; edited 1 time in total
Tibor
Posted: Thu Feb 27, 2014 4:57 am Post subject:
Grand Master
Joined: 20 May 2001 Posts: 1033 Location: Hungary
Esa
Posted: Thu Feb 27, 2014 10:24 pm Post subject:
Grand Master
Joined: 22 May 2008 Posts: 1387 Location: Finland
santhoshramesh wrote:
The file is written record by record into a transit directory, and once the finish file arrives, the output file is moved from transit to the actual output location.
Sorry, I didn't read your post carefully enough. You have already answered my previous questions in your first post.
Can you share with us the code where you split the records and propagate to File Output?
Typically problems like this are caused by the way people reuse or don't reuse OutputRoot within the loop.
But there may also be something in the way you handle the result set. Large message handling best practices apply to large result sets, too. Even if you propagated the messages correctly, a huge result set can exhaust your EG if you don't take precautions; one such precaution is sketched below.
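A hypothetical sketch, again assuming the rows sit under Environment.Variables.Result: delete each row as soon as it has been consumed, so the Environment tree shrinks as the file grows instead of holding the full result set until the end.
Code:
DECLARE rowRef REFERENCE TO Environment.Variables.Result.Row[1];
WHILE LASTMOVE(rowRef) DO
    -- Keep a handle on the row we are about to consume
    DECLARE doneRef REFERENCE TO rowRef;
    -- ... copy doneRef into the output tree and propagate as needed ...
    MOVE rowRef NEXTSIBLING REPEAT TYPE NAME;
    -- Free the consumed row before processing the next one
    DELETE FIELD doneRef;
END WHILE;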
And please inform us if you have already solved your problem.
santhoshramesh
Posted: Thu Feb 27, 2014 10:49 pm Post subject:
Novice
Joined: 26 Feb 2014 Posts: 17
I will share the code here on Sunday... Actually, the same code is able to generate a file of more than 80 MB on UNIX. I am facing this problem on my local system, which runs Windows...
Gralgrathor
Posted: Fri Feb 28, 2014 1:00 am Post subject:
Master
Joined: 23 Jul 2009 Posts: 297
80 MB is still nothing. The typical size of a nightly SAS batch produced by some of the processes in my current environment is around 100 MB. A well-coded flow should easily be able to produce files up to the OS file size limit; I've certainly never run into any practical limits yet.
_________________
A measure of wheat for a penny, and three measures of barley for a penny; and see thou hurt not the oil and the wine.
santhoshramesh
Posted: Fri Feb 28, 2014 1:06 am Post subject:
Novice
Joined: 26 Feb 2014 Posts: 17
Yes, that is correct... I just quoted 80 MB as an example for the comparison: the same flow generated only 43 MB on Windows and 80 MB on UNIX. The flow extracted data from the same table with the same amount of data on both operating systems. On Windows it generated only 43 MB and then started throwing the broker exception in the event viewer, but on UNIX the same flow extracted all the data from the table, producing 82 MB in total. As you mentioned the OS file size limit: how can I identify the OS file size limit for Windows and UNIX? Has anybody tried to generate a file of more than 43 MB with File Output on Windows without facing any issue?
Gralgrathor
Posted: Fri Feb 28, 2014 1:25 am Post subject:
Master
Joined: 23 Jul 2009 Posts: 297
santhoshramesh wrote:
how can I identify the OS file size limit
Don't worry about the OS file size limit. You're nowhere near it in any case.
Your problem isn't files: it's memory.
Have you run comparisons between UNIX and Windows to see how much memory your EG consumes during runs with identical data? Can you tell us how large your EG becomes before it starts to fail?
Probably the way you coded your flow makes it allocate memory and never release it back to the heap. So after a while your heap floods, and your EG starts choking. The difference between the two platforms likely has something to do with the maximum heap size specified for the execution group.
_________________
A measure of wheat for a penny, and three measures of barley for a penny; and see thou hurt not the oil and the wine.