Hello!
I ran across #196 when first investigating batch inserts, but the performance difference between Model.insert_all and INSERT with FORMAT JSONEachRow looks significant (at least locally for me).
With simple, small records (fewer than 10 String/UInt8/Float64 fields), 20k inserts take ~325ms with FORMAT JSONEachRow, while INSERT INTO {} VALUES () (the stock insert_all behavior) takes almost 5x as long (~1550ms).
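To make the comparison concrete, here is a rough sketch of the two payload shapes being compared. The helper names and the `events` table are hypothetical, and the quoting in `values_statement` is deliberately naive; it is not the adapter's actual quoting logic, just an illustration of what a VALUES statement carries versus a JSONEachRow body.

```ruby
require "json"

# JSONEachRow: one JSON object per line, sent as the request body.
def json_each_row_payload(rows)
  rows.map { |row| JSON.generate(row) }.join("\n")
end

# VALUES: a single SQL statement with every value quoted inline.
# Naive quoting for illustration only (assumption, not the adapter's code).
def values_statement(table, rows)
  columns = rows.first.keys
  tuples = rows.map do |row|
    values = columns.map do |col|
      value = row[col]
      value.is_a?(String) ? "'#{value.gsub("'") { "\\'" }}'" : value.to_s
    end
    "(#{values.join(', ')})"
  end
  "INSERT INTO #{table} (#{columns.join(', ')}) VALUES #{tuples.join(', ')}"
end
```

With JSONEachRow the statement itself stays tiny and the rows travel as data, whereas the VALUES form embeds every row in the SQL text.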
I was able to hack this into my codebase as a POC by defining a new method on my abstract base class and digging into the connection internals, but I would love to understand whether there's a desire to support this in a more official capacity.
Example:
class ClickhouseRecord < ActiveRecord::Base
  self.abstract_class = true
  establish_connection :clickhouse

  def self.insert_all_json_each_row(rows)
    return if rows.blank?

    conn = connection.instance_variable_get(:@connection)
    connection_config = connection.instance_variable_get(:@connection_config).merge({
      query: "INSERT INTO #{table_name} FORMAT JSONEachRow"
    }).to_param
    body = rows.map(&:to_json).join("\n")
    res = conn.post("/?#{connection_config}", body)
    connection.send(
      :process_response,
      res,
      ActiveRecord::ConnectionAdapters::Clickhouse::SchemaStatements::DEFAULT_RESPONSE_FORMAT
    )
  end
end
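For anyone who wants to try the same request without reaching into the adapter's instance variables, the POC above boils down to a POST against ClickHouse's HTTP interface with the statement in the `query` parameter and the rows in the body. Below is a minimal sketch using only the Ruby standard library; `build_json_each_row_request`, the base URL, and the table name are assumptions for illustration, and it skips the database/auth settings that the adapter's connection config would normally supply.

```ruby
require "json"
require "net/http"
require "uri"

# Builds (but does not send) the POST request for a JSONEachRow insert.
# Hypothetical helper: the name and arguments are not part of the adapter.
def build_json_each_row_request(base_url, table, rows)
  uri = URI(base_url)
  # The statement goes in the query string; the rows go in the body.
  uri.query = URI.encode_www_form(query: "INSERT INTO #{table} FORMAT JSONEachRow")

  request = Net::HTTP::Post.new(uri)
  # One JSON object per line, matching the JSONEachRow input format.
  request.body = rows.map { |row| JSON.generate(row) }.join("\n")
  request
end

# Sending it would look roughly like this (needs a reachable server):
#   uri = URI("http://localhost:8123/")
#   request = build_json_each_row_request(uri.to_s, "events", rows)
#   Net::HTTP.start(uri.hostname, uri.port) { |http| http.request(request) }
```

Keeping request construction separate from sending makes it easy to inspect the path and body before pointing it at a real server.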