January 25th, 2023
While building a remittance checker I ran into trouble with the huge number of incoming records that needed to be stored and processed.
A single SQL query/insert was taking something like 100-200ms, and a single record import might involve several of them.
For a file of 1M+ rows that was unacceptable, so I started looking for speed improvements.
Below is the function I settled on; it took my processing time for a large file from 3000+ minutes down to 12.
<?php

function createBatchInsertStatement($table, $columns, $rows)
{
    // Flatten the 2-dimensional row data into a single array of values,
    // in the same order the placeholders will appear in the statement.
    $processed_data = array();
    foreach ($rows as $data) {
        foreach ($data as $val) {
            $processed_data[] = $val;
        }
    }

    // Build one "(?,?,...,?)" group per row: str_pad('', (count($columns)*2)-1, '?,')
    // yields "?,?,?" for three columns, and array_fill/implode repeats the group once per row.
    $placeholders = implode(',', array_fill(0, count($rows), '(' . str_pad('', (count($columns) * 2) - 1, '?,') . ')'));

    $sql = "INSERT INTO {$table} (" . implode(', ', $columns) . ") VALUES " . $placeholders . ";";

    return [$sql, $processed_data];
}

?>
The function takes in 3 parameters: $table, $columns, and $rows.
$table is the table name
$columns is a 1-dimensional array of column names, exactly as they appear in the database
$rows is a 2-dimensional array of rows to insert
The function then returns a two-element array containing the statement in $response[0] and the flattened (1-dimensional) value array in $response[1].
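To make the return value concrete, here is a quick sketch (the table name, columns, and values are hypothetical, just for illustration) of the statement and flattened array the function produces for a two-column, two-row input:
<?php
// Quick illustration (hypothetical table/columns) of the returned pair.
$response = createBatchInsertStatement(
    "payments",                       // $table
    array("invoice_id", "amount"),    // $columns
    array(                            // $rows
        array("A-1", "10.00"),
        array("A-2", "12.50"),
    )
);

// $response[0]:
//   INSERT INTO payments (invoice_id, amount) VALUES (?,?),(?,?);
// $response[1]:
//   ["A-1", "10.00", "A-2", "12.50"]
?>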
An example usage would look like this:
<?php

$table = "charges";
$columns = array("order_id", "charge_type", "amount");

// Your batched data. Limit batches to 50-100 records at a time for stability;
// array_chunk can handle splitting a main data array into chunks for you.
$rows = array(
    array("10001", "Principal", "32.66"),
    array("10002", "Tax", "2.21"),
    array("10003", "GiftWrapping", "0.48"),
);

$response = createBatchInsertStatement($table, $columns, $rows);

// Here $db is a PDO connection object.
$db->beginTransaction();
$orders = $db->prepare($response[0]);
$orders->execute($response[1]);
$db->commit();

?>
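Since the example mentions array_chunk, here is a rough sketch of how the batching might be wired up for a full import. It assumes the same $db, $table, and $columns as above, plus a hypothetical $all_rows array holding every row parsed from the file:
<?php
// Rough sketch: batch a large import with array_chunk, assuming $db, $table,
// and $columns from the example above, and a hypothetical $all_rows array
// containing the entire parsed file.
$db->beginTransaction();

foreach (array_chunk($all_rows, 100) as $batch) {
    $response = createBatchInsertStatement($table, $columns, $batch);
    $stmt = $db->prepare($response[0]);
    $stmt->execute($response[1]);
}

$db->commit();
?>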
That’s it! Use this function to build your queries wherever you find yourself batch-inserting large amounts of data.