this post was submitted on 06 Sep 2024
14 points (81.8% liked)


Omg it's sooo daammmn slooow, it takes around 30 seconds to bulk-insert 15000 rows

Disabling indices doesn't help. The database recovery model is SIMPLE. My table is 50 columns wide, and from what I understand the main reason is the stupid limit of 2100 parameters per query in the ODBC driver. I am using the .NET SqlBulkCopy. I only open the connection + transaction once per ~15000 inserts.

I have 50 million rows to insert, it takes literally days, please send help, I can fucking write with pen and paper faster than the damned Microsoft driver inserts rows.
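For a sense of scale, here is a rough back-of-the-envelope sketch using only the figures quoted in the post (2100 parameters, 50 columns, 15000 rows per 30 seconds, 50 million rows total). The function names are made up for illustration, and the parameter cap only matters if the rows really do travel as parameterized INSERT statements, which is the poster's assumption rather than something confirmed here.

```python
import math

# All constants below are the figures quoted in the post.
PARAM_LIMIT = 2100        # parameters allowed per query
COLUMNS = 50              # table width
BATCH_ROWS = 15_000       # rows per batch
BATCH_SECONDS = 30        # observed time per batch
TOTAL_ROWS = 50_000_000   # rows still to insert


def rows_per_statement(param_limit, columns):
    """Rows that fit in one parameterized INSERT (one parameter per cell)."""
    return param_limit // columns


def statements_needed(rows, param_limit, columns):
    """Parameterized statements required to send `rows` rows."""
    return math.ceil(rows / rows_per_statement(param_limit, columns))


def estimated_hours(total_rows, batch_rows, batch_seconds):
    """Total load time in hours if the observed rate stays constant."""
    rows_per_second = batch_rows / batch_seconds
    return total_rows / rows_per_second / 3600


if __name__ == "__main__":
    print("rows per statement:", rows_per_statement(PARAM_LIMIT, COLUMNS))
    print("statements per batch:", statements_needed(BATCH_ROWS, PARAM_LIMIT, COLUMNS))
    print("hours for full load:", round(estimated_hours(TOTAL_ROWS, BATCH_ROWS, BATCH_SECONDS), 1))
```

At the quoted rate of 500 rows per second, the full load works out to a bit over a day of pure insert time, so anything that raises the rows-per-round-trip count should pay off directly.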

aMockTie@beehaw.org 2 points 2 months ago

Been a little while since I worked on ODBC stuff, but I have a couple of thoughts:

  • Would it be possible to use something like a table function on the DB side to simplify the query from the ODBC side?

  • I could be misremembering, but I feel like looping through individual inserts with an open connection was faster than trying to submit data in bulk when inserting that much data in one shot. Might be worth doing a benchmark in a test DB and table to confirm.

I know I was able to insert more than 50M rows in a matter of single-digit hours, but unfortunately I don't have access to that codebase anymore to double-check the specifics.

deegeese@sopuli.xyz 1 points 2 months ago

Looping single inserts over an open connection is far, far slower than a bulk insert because every row is another transaction.

The only thing it’s faster than is opening and closing a connection for each row.