Support for INSERT INTO {table} FORMAT JSONEachRow? #212

@joshuaclayton

Hello!

I came across #196 when first investigating batch inserts, but the performance difference between Model.insert_all and an INSERT using FORMAT JSONEachRow looks significant (at least locally for me).

With simple, small records (fewer than 10 String/UInt8/Float64 fields), inserting 20k rows takes ~325ms with FORMAT JSONEachRow, while INSERT INTO {} VALUES () (the stock insert_all behavior) takes almost 5x as long (~1550ms).

I was able to hack this into my codebase as a proof of concept by defining a new method on my abstract base class and digging into connection internals, but I would love to understand whether there's a desire to support this in a more official capacity.

Example:

class ClickhouseRecord < ActiveRecord::Base
  self.abstract_class = true
  establish_connection :clickhouse

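  # Inserts rows (an array of hashes) using ClickHouse's JSONEachRow input format.
  def self.insert_all_json_each_row(rows)
    return if rows.blank?

    # POC only: this reaches into adapter internals (@connection and
    # @connection_config), which are not part of the adapter's public API.
    conn = connection.instance_variable_get(:@connection)
    connection_config = connection.instance_variable_get(:@connection_config).merge(
      query: "INSERT INTO #{table_name} FORMAT JSONEachRow"
    ).to_param

    # JSONEachRow expects one JSON object per line.
    body = rows.map(&:to_json).join("\n")

    res = conn.post("/?#{connection_config}", body)
    connection.send(
      :process_response,
      res,
      ActiveRecord::ConnectionAdapters::Clickhouse::SchemaStatements::DEFAULT_RESPONSE_FORMAT
    )
  end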
end
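
To make the request shape concrete, here is a minimal standalone sketch of the two pieces the method above assembles: the newline-delimited JSON body and the URL query string. It uses only the Ruby standard library rather than ActiveSupport, and the helper names (`json_each_row_body`, `insert_query_string`), the `events` table, and the `database` parameter are made up for illustration; they are not part of the adapter.

```ruby
require "json"
require "uri"

# Hypothetical helper: serialize each row hash as one JSON object per line,
# which is the shape the JSONEachRow input format expects.
def json_each_row_body(rows)
  rows.map { |row| JSON.generate(row) }.join("\n")
end

# Hypothetical helper: build the query string for the HTTP request, with the
# INSERT statement passed as the `query` parameter. `params` stands in for
# whatever connection settings would otherwise be sent (an assumption).
def insert_query_string(table_name, params = {})
  URI.encode_www_form(
    params.merge(query: "INSERT INTO #{table_name} FORMAT JSONEachRow")
  )
end

rows = [
  { id: 1, name: "a", score: 1.5 },
  { id: 2, name: "b", score: 2.0 }
]

puts json_each_row_body(rows)
puts insert_query_string("events", { database: "default" })
```

The whole batch travels as a single request body, with no SQL VALUES list to build. That may account for some of the timing difference reported above, though this sketch does not measure it.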
