Dave Ziegler (Centurion; joined 15 Apr 2014; posts: 118)
Posted: Wed Jun 11, 2014 8:45 am Post subject: File to SQL Server, IIB V9, and Performance Concerns
I have read through a handful of posts out here discussing database and file input node performance, but they seem to reference older versions of IIB. I'm using V9, and before I get too far off course I was hoping someone could point me in the right direction.
The setup: a 5-10 GB text file, millions of records, which need to get from file to SQL Server 2008. The destination is one table with ~50 columns. I am using DFDL and parsing on demand.
I would imagine a database or compute node directly after this would single-thread my inserts, which would be very slow. (I read this in another thread: "If your SQL is written well, you should be able to achieve the same latency I do: INTERVAL '0.009147' SECOND per row. This means you can insert 100 rows per second.")
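(Checking the arithmetic on that quote: 1 / 0.009147 seconds is roughly 109 rows per second, so call it 100. At that rate a file with 5 million records would take over 12 hours to load, which is exactly what I'm worried about.)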
Should I be looking at dumping my (transformed) messages to a file and bulk importing them into SQL using SSIS, or are there some tricks I can consider in IIB?
I have not tried anything at this point; again, I'm just looking for suggestions before I head down the wrong path. How have the experts handled this scenario?

Vitor (Grand High Poobah; joined 11 Nov 2005; posts: 26093; location: Texas, USA)
Posted: Wed Jun 11, 2014 9:34 am Post subject: Re: File to SQL Server, IIB V9, and Performance Concerns
Dave Ziegler wrote:
Should I be looking at dumping my (transformed) messages to a file and bulk importing them into SQL using SSIS, or are there some tricks I can consider in IIB?
You could.
You could also ensure that the "single-threaded" flow has its commit count set so that you don't thrash the database's I/O, which will (IMHO) give you a better performance improvement than running multi-threaded.
You could get multi-threaded by dumping the transformed messages into a WMQ queue (and committing them) and having 1-n instances of a second flow loading them into the database. By putting this queue into a WMQ cluster you can then scale it across 1-n brokers until your database server wheezes and falls over (it is only Windoze SQL Server, after all...).
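If it helps, the second flow doesn't need to be anything clever: an MQInput node (with the commit count set) feeding a compute node that does the insert. A minimal sketch, with the table, columns and message model invented for illustration:
Code:
CREATE COMPUTE MODULE LoadFlow_Insert
	CREATE FUNCTION Main() RETURNS BOOLEAN
	BEGIN
		-- One record per MQ message; the MQInput node's commit count
		-- decides how many of these inserts share a unit of work.
		-- The compute node's Data Source property names the ODBC
		-- definition for the target database.
		DECLARE rec REFERENCE TO InputRoot.DFDL.Record;
		INSERT INTO Database.TARGET_TABLE (COL_A, COL_B)
			VALUES (rec.ColA, rec.ColB);
		RETURN FALSE;
	END;
END MODULE;

Commit count and additional instances are both deploy-time properties on the flow, so you can tune them in the BAR file without touching the flow itself.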
We have taken this one step further with a 135 GB file we need to process, where flow A reads each record as a BLOB and does nothing but dump it to a clustered queue. Flow B reads the message, performs some fairly complex transformations, and dumps it into a second, equally clustered, queue. Flow C reads the second queue and does an insert or an update as needed.
Flows B & C run with multiple instances (50 instances each? I could look but I'm too lazy) across 3 brokers in a cluster, and C points to an Oracle 11something database.
YMMV, and this is WMB v7, not v9, currently. Migration is a process.
_________________
Honesty is the best policy.
Insanity is the best defence.

Dave Ziegler (Centurion; joined 15 Apr 2014; posts: 118)
Posted: Wed Jun 11, 2014 3:24 pm
Thank you for the reply. I'm trying the file -> MQ -> DB route first, and tinkering with instances and commit count as suggested to see how that affects things.
Lots of options, and lots of combinations of options; it's difficult to find the sweet spot without experimentation.

mb01mqbrkrs (Apprentice; joined 18 Nov 2011; posts: 48)
Posted: Wed Jun 11, 2014 11:19 pm Post subject: Re: File to SQL Server, IIB V9, and Performance Concerns
Dave Ziegler wrote:
The setup: a 5-10 GB text file, millions of records, which need to get from file to SQL Server 2008. The destination is one table with ~50 columns. I am using DFDL and parsing on demand.

Is this a regular job, or just a one-off? Sounds like you need an ETL tool (excuse me while I take cover from the flak I'm guessing I'm about to get).

mqjeff (Grand Master; joined 25 Jun 2008; posts: 17447)
Posted: Thu Jun 12, 2014 5:39 am Post subject: Re: File to SQL Server, IIB V9, and Performance Concerns
mb01mqbrkrs wrote:
Dave Ziegler wrote:
The setup: a 5-10 GB text file, millions of records, which need to get from file to SQL Server 2008. The destination is one table with ~50 columns. I am using DFDL and parsing on demand.

Is this a regular job, or just a one-off? Sounds like you need an ETL tool (excuse me while I take cover from the flak I'm guessing I'm about to get).
An ETL tool is an excellent choice, even if only to segment the file into something that can be handled more reasonably with good performance in Broker.
Broker is *not* an ETL tool, and it does not perform in the same ways.

Dave Ziegler (Centurion; joined 15 Apr 2014; posts: 118)
Posted: Thu Jun 12, 2014 7:24 am Post subject: Re: File to SQL Server, IIB V9, and Performance Concerns
Vitor wrote:
You could also ensure that the "single-threaded" flow has its commit count set so that you don't thrash the database's I/O, which will (IMHO) give you a better performance improvement than running multi-threaded.
I did end up putting MQ in the middle here, and from what I can tell you are spot on. I've been experimenting with commit count and additional instances, and upping the commit count really makes a difference (in my particular use case anyway). Thanks for the suggestions!
mqjeff wrote:
An ETL tool is an excellent choice, even if only to segment the file into something that can be handled more reasonably with good performance in Broker.
Broker is *not* an ETL tool, and it does not perform in the same ways.
Understood! I've been doing a lot of reading and asking a lot of questions on the topic. One of your peers out here suggested that if I have a requirement to enrich and transform this data before inserting it into the database, then IIB could be the right tool for the job. Since that is exactly my scenario, I'm going to give it a shot.
mb01mqbrkrs wrote:
Is this a regular job, or just a one-off?
It will be a number of daily jobs: several large files coming in from third parties that need to be loaded into an ODS.
Thanks for the discussion, everyone. I'll continue with my experiments!

Dave Ziegler (Centurion; joined 15 Apr 2014; posts: 118)
Posted: Thu Jun 12, 2014 7:30 am
With my current process, setting additional instances to 0 and commit count to 1000 gets me 470K rows loaded in about 5 minutes. Additional instances at 9 and commit count at 1000 gets me 470K rows loaded in just under a minute. So apparently I need a combination of things to squeeze out the desired performance.
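(That works out to roughly 1,500 rows per second single-threaded versus roughly 8,000 rows per second with the extra instances.)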
This is all on my laptop and not our dev server with multi-instance QM running...

mqjeff (Grand Master; joined 25 Jun 2008; posts: 17447)
Posted: Thu Jun 12, 2014 7:36 am Post subject: Re: File to SQL Server, IIB V9, and Performance Concerns
Dave Ziegler wrote:
mqjeff wrote:
An ETL tool is an excellent choice, even if only to segment the file into something that can be handled more reasonably with good performance in Broker.
Broker is *not* an ETL tool, and it does not perform in the same ways.

Understood! I've been doing a lot of reading and asking a lot of questions on the topic. One of your peers out here suggested that if I have a requirement to enrich and transform this data before inserting it into the database, then IIB could be the right tool for the job. Since that is exactly my scenario, I'm going to give it a shot.
Yeah, again, Broker is great as part of an ETL scenario, but you probably want a real ETL tool to deal with giving work to Broker in adequate and appropriate chunks.
Otherwise, yes, lots of different factors affect how a given broker flow performs.
There are some old comments from lancelotlinc about doing database inserts using a propagate loop that forwards to a compute node that does a single insert and then returns false.
There are complicating factors if you need to explicitly control the size of the batch commit, and you may want to use JDBC to do bulk inserts.
And then you need to make sure you don't have message affinity issues.
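From memory, the shape of that pattern is something like this; the DFDL model and field names here are invented, so treat it as a sketch of the idea rather than lancelotlinc's actual code:
Code:
CREATE COMPUTE MODULE SplitRecords
	CREATE FUNCTION Main() RETURNS BOOLEAN
	BEGIN
		-- Propagate each repeating record as its own message; the next
		-- compute node does a single INSERT per record and also returns FALSE.
		SET OutputRoot.MQMD = InputRoot.MQMD;
		DECLARE rec REFERENCE TO InputRoot.DFDL.Batch.Record[1];
		WHILE LASTMOVE(rec) DO
			SET OutputRoot.DFDL.Record = rec;
			PROPAGATE DELETE NONE; -- DELETE NONE keeps OutputRoot for the next pass
			MOVE rec NEXTSIBLING REPEAT TYPE NAME;
		END WHILE;
		RETURN FALSE; -- the "returns false" part: no extra propagation at the end
	END;
END MODULE;

The complications start when you need to control the batch commit size yourself, or when you move to JDBC for true bulk inserts.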

McueMart (Chevalier; joined 29 Nov 2011; posts: 490; location: UK...somewhere)
Posted: Thu Jun 12, 2014 7:42 am
Quote:
Additional instances at 9 and commit count at 1000 gets me 470K rows loaded in just under a minute.
8k tps reading/parsing file records and inserting into a table? Not half bad!
I'm guessing your laptop has an SSD.

Dave Ziegler (Centurion; joined 15 Apr 2014; posts: 118)
Posted: Thu Jun 12, 2014 7:44 am
McueMart wrote:
I'm guessing your laptop has an SSD.

You guessed correctly.

fjb_saper (Grand High Poobah; joined 18 Nov 2003; posts: 20756; location: LI, NY)
Posted: Thu Jun 12, 2014 12:22 pm
Try running with a commit count of 2000 and between 3 and 5 additional instances. See if that works faster or slower...
_________________
MQ & Broker admin

Dave Ziegler (Centurion; joined 15 Apr 2014; posts: 118)
Posted: Fri Jun 13, 2014 1:20 pm
fjb_saper wrote:
Try running with a commit count of 2000 and between 3 and 5 additional instances. See if that works faster or slower...

I tried 3 and 2000, and 3 and 5000, and a few other combos. 9 and 1000 seems to run the fastest, but I'll keep trying things!

fjb_saper (Grand High Poobah; joined 18 Nov 2003; posts: 20756; location: LI, NY)
Posted: Fri Jun 13, 2014 1:59 pm
Dave Ziegler wrote:
fjb_saper wrote:
Try running with a commit count of 2000 and between 3 and 5 additional instances. See if that works faster or slower...

I tried 3 and 2000, and 3 and 5000, and a few other combos. 9 and 1000 seems to run the fastest, but I'll keep trying things!

If your particular sweet spot is at 9 and 1000, that's what it is... Bear in mind that the qmgr is allowed a max of 10,000 uncommitted messages by default; upping that limit might change your sweet spot some...
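That limit is the queue manager's MAXUMSGS attribute; you can check and raise it in runmqsc:
Code:
DISPLAY QMGR MAXUMSGS
ALTER QMGR MAXUMSGS(100000)

_________________
MQ & Broker admin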