this post was submitted on 06 Sep 2024
14 points (81.8% liked)


Omg it's sooo daammmn slooow, it takes around 30 seconds to bulk-insert 15000 rows

Disabling indices doesn't help. The database recovery model is set to SIMPLE. My table is 50 columns wide, and from what I understand the main reason is the stupid limit of 2100 parameters per query in the ODBC driver. I am using the .NET SqlBulkCopy. I only open the connection + transaction once per ~15000 inserts.

I have 50 million rows to insert, it takes literally days, please send help, I can fucking write with a pen and paper faster than the damned Microsoft driver inserts rows

[–] RagingHungryPanda@lemm.ee 7 points 2 months ago* (last edited 2 months ago) (1 children)

I've done a lot of this kind of work, and no, that is not normal.

A few things: First, SQL Server has tools for migrating data that are pretty fast. SqlBulkCopy can use some of these. Check whether the built-in DB tools are better for this.

SqlBulkCopy can handle way more than 15,000 records.

Why are you wrapping a data dump in a transaction? That will slow things down for sure.

You generally shouldn't be running queries so huge that you're nearing the parameter limit.
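To put rough numbers on the parameter limit: the sketch below uses only the figures from the original post (2100 parameters per statement, 50 columns, 15000 rows per batch, 50 million rows total). The function names are made up for illustration. It only applies if the rows really go through parameterized INSERT statements; as far as I know, SqlBulkCopy streams rows over the bulk-load path and does not bind one parameter per value, so treat this as a way to sanity-check the theory, not as a description of what the driver does.

```python
PARAM_LIMIT = 2100  # per-statement parameter limit quoted in the original post


def rows_per_statement(columns, param_limit=PARAM_LIMIT):
    """How many full rows fit in one parameterized INSERT."""
    return param_limit // columns


def statements_needed(total_rows, columns, param_limit=PARAM_LIMIT):
    """How many parameterized INSERT statements are needed for total_rows."""
    per_statement = rows_per_statement(columns, param_limit)
    if per_statement == 0:
        raise ValueError("a single row already exceeds the parameter limit")
    # Integer ceiling division, avoids float rounding on large row counts.
    return (total_rows + per_statement - 1) // per_statement


if __name__ == "__main__":
    print(rows_per_statement(50))             # 42
    print(statements_needed(15_000, 50))      # 358
    print(statements_needed(50_000_000, 50))  # 1190477
```

So with 50 columns, a parameterized insert tops out at 42 rows per statement, which means hundreds of statements for every 15000-row batch. If that is what is actually being sent, the statement count alone is a plausible suspect.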

Can you share the code?

[–] kSPvhmTOlwvMd7Y7E@programming.dev 1 points 2 months ago (1 children)

I timed the transaction and the opening of the connection; it takes maybe 100 milliseconds, which absolutely doesn't explain the abysmal performance

The transaction is needed because 2 tables are touched; I don't want to deal with partially inserted data
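For anyone following along, the "two tables, all or nothing" requirement can be sketched like this. It uses Python's built-in sqlite3 as a stand-in, not SQL Server or SqlBulkCopy, and the table names `parent` and `child` are hypothetical. It only illustrates why the transaction is there: if the second insert fails, the first one is rolled back too.

```python
import sqlite3


def make_db():
    """Create an in-memory database with two hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE child (id INTEGER PRIMARY KEY, parent_id INTEGER NOT NULL)"
    )
    return conn


def insert_both(conn, parents, children):
    """Insert into both tables in one transaction; return True on success."""
    try:
        # The connection context manager commits if the block succeeds
        # and rolls back if any statement raises.
        with conn:
            conn.executemany("INSERT INTO parent (id, name) VALUES (?, ?)", parents)
            conn.executemany(
                "INSERT INTO child (id, parent_id) VALUES (?, ?)", children
            )
        return True
    except sqlite3.Error:
        return False


def count(conn, table):
    """Row count for one of the two known tables."""
    if table not in ("parent", "child"):
        raise ValueError("unknown table")
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


if __name__ == "__main__":
    conn = make_db()
    # The duplicate child id makes the second insert fail, undoing the first.
    ok = insert_both(conn, [(1, "a"), (2, "b")], [(1, 1), (1, 2)])
    print(ok, count(conn, "parent"), count(conn, "child"))
```

On the .NET side, SqlBulkCopy has a constructor that takes an existing connection and transaction, so both bulk copies can share one transaction in the same way.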

Cannot share the code, but it's Python calling .NET through "clr", and using SqlBulkCopy

What are you suggesting, that I shouldn't be using that? It's either a prepared query with thousands of parameters, or a plain text string with the parameter values inside (which, admittedly, I didn't try; might be faster lol)

[–] RagingHungryPanda@lemm.ee 3 points 2 months ago* (last edited 2 months ago)

One thing to know about transactions is that they track the data and then write it, so it's not the opening that slows things down. I have a question though: what is your source data? Do you have a big CSV or something? Can you do a DB-to-DB transfer instead? There's another tool called the BCP utility.

Edit: SQL Server/SSMS have tools for doing migrations and batch imports
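If the source does turn out to be a big delimited file, a bcp import could look roughly like the sketch below. It only builds the command line and does not run it. The table, file, server, and database names are placeholders, and it assumes Windows authentication (`-T`) and a plain delimited file with no quoted fields; adjust to whatever your setup actually needs.

```python
import shlex


def build_bcp_import(table, data_file, server, database,
                     batch_size=50_000, field_terminator=","):
    """Build an argument list for a bcp character-mode import."""
    return [
        "bcp", table, "in", data_file,
        "-S", server,              # server to connect to
        "-d", database,            # target database
        "-T",                      # trusted connection (Windows authentication)
        "-c",                      # character data mode
        f"-t{field_terminator}",   # field terminator
        "-b", str(batch_size),     # rows committed per batch
        "-h", "TABLOCK",           # table-lock hint for the load
    ]


if __name__ == "__main__":
    # All names below are hypothetical placeholders.
    cmd = build_bcp_import("dbo.MyTable", "rows.csv", "localhost", "MyDb")
    print(shlex.join(cmd))
    # To actually run it: subprocess.run(cmd, check=True)
```

Each batch is committed on its own here, so this does not give the "both tables or nothing" guarantee by itself. One common way around that is to load into staging tables first and then move the rows across inside a single transaction.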