ClickHouse Connect ChangeLog

WARNING -- Python 3.8 EOL

Python 3.8 was EOL on 2024-10-07. It is no longer tested, and versions after 2025-04-07 will not include Python 3.8 wheel distributions. As of version 0.8.15, wheels are not built for Python 3.8 AARCH64 versions due to missing dependencies in the build chain.

WARNING -- JSON Incompatibility between versions 22.8 and 22.10

The internal serialization format for experimental JSON was updated in ClickHouse version 24.10. clickhouse-connect will set the compatibility level on a global basis based on the last client created, so multiple clients using the library with mixed versions 22.8/22.9 and 22.10 and later versions will break. If you need JSON support for mixed versions you must use different Python interpreters for each version.

WARNING -- Impending Breaking Change - Server Settings in DSN

When creating a DBAPI Connection method using the Connection constructor or a SQLAlchemy DSN, the library currently converts any unrecognized keyword argument/query parameter to a ClickHouse server setting. Starting in the next minor release (0.9.0), unrecognized arguments/keywords for these methods of creating a DBAPI connection will raise an exception instead of being passed as ClickHouse server settings. This is in conjunction with some refactoring in Client construction. The supported method of passing ClickHouse server settings is to prefix such arguments/query parameters withch_.

UNRELEASED

Breaking Changes

Remove the legacy executor-based async client. The AsyncClient(client=...) constructor pattern, executor_threads, and executor parameters are no longer supported. Use clickhouse_connect.get_async_client() (or create_async_client()) which creates a native aiohttp-based async client directly. The pool_mgr parameter is also rejected on the async path. aiohttp remains an optional dependency, installed via pip install clickhouse-connect[async].
The internal AiohttpAsyncClient class has been renamed to AsyncClient and the module clickhouse_connect.driver.aiohttp_client has been removed. Import AsyncClient from clickhouse_connect.driver as before.
Removed the deprecated utc_tz_aware parameter entirely. Use tz_mode instead: "naive_utc" (default, was False), "aware" (was True), or "schema" (unchanged). Closes #654, #665
Removed the deprecated apply_server_timezone parameter entirely. Use tz_source instead: "auto" (default), "server" (was True), or "local" (was False).
Dropped pandas 1.x support. Minimum pandas version is now 2.0. Users with pandas < 2.0 will get a NotSupportedError at import time. Non-pandas usage is unaffected. Closes #661
Removed the preserve_pandas_datetime_resolution common setting. Datetime columns now always return their natural resolution, e.g. datetime64[s] for DateTime, datetime64[ms] for DateTime64(3), instead of coercing everything to datetime64[ns]. Closes #662
Dropped Python 3.9 support. The minimum supported Python version is now 3.10. 0.15.x is the last series supporting Python 3.9.

Bug Fixes

Fix async streaming race condition that caused unhandled InvalidStateError exceptions on early stream termination. When breaking out of an async stream early, shutdown() scheduled a set_result callback for pending futures via call_soon_threadsafe, but Task.cancel() could cancel the future before the callback ran. The done-check is now deferred into the callback itself so it sees the actual future state at execution time.
SQLAlchemy: Wrap raw SQL strings in text() in ChClickHouseDialect.get_schema_names() and get_table_names(), so Inspector.get_schema_names() and get_table_names() work on SQLAlchemy 2.x instead of raising ObjectNotExecutableError.

Development

Replaced pylint with Ruff for linting and formatting. Double quotes are now the standard quote style. Bulk formatting commits are listed in .git-blame-ignore-revs. CI lint job no longer requires building C extensions or installing project dependencies, significantly reducing lint check time.

Improvements

Lazy loading of optional dependencies (numpy, pandas, pyarrow, polars) now applies to the async client as well, matching the pattern established in 0.15.0 for the sync client.
Clearer error message when attempting to use the async client without aiohttp installed.
The generic_args parameter is now properly parsed on the async client creation path, matching the sync client behavior.
Pandas 3.x compatibility. Removed deprecated copy=False parameter from Series(), concat(), and astype() calls. Updated datetime insert path to use vectorized numpy conversion instead of element-by-element nanosecond arithmetic.

0.15.1, 2026-03-30

Bug Fixes

Use timezone from parameter type hint instead of server_tz when formatting tz-aware datetimes in {param:Type} bind expressions. Previously, bind_query always converted datetimes to the server timezone, ignoring explicit timezone declarations in type hints like DateTime64(6, 'UTC'). This caused incorrect query results when server_tz differed from the hint timezone. Handles LowCardinality, Nullable, and container type wrappers. Fixes #697

0.15.0, 2026-03-26

Improvements

SQLAlchemy: Comprehensive ClickHouse JOIN support via the new ch_join() helper. All strictness modifiers (ALL, ANY, SEMI, ANTI, ASOF), the GLOBAL distribution modifier, and explicit CROSS JOIN are now available. Use with select_from() to generate ClickHouse-specific join syntax like GLOBAL ALL LEFT OUTER JOIN. Closes #635
SQLAlchemy: array_join() now supports multiple columns for parallel array expansion. Pass a list of columns and a matching list of aliases to generate ARRAY JOIN col1 AS a, col2 AS b, col3 AS c. Single-column usage is unchanged. Closes #633
SQLAlchemy: ch_join() now supports USING syntax via the new using parameter. Pass a list of column name strings to generate USING (col1, col2) instead of ON. This is important for FULL OUTER JOIN where USING merges the join column correctly while ON produces default values (0, '') for unmatched sides. Closes #636
SQLAlchemy: Add missing Replicated table engine variants: ReplicatedReplacingMergeTree, ReplicatedCollapsingMergeTree, ReplicatedVersionedCollapsingMergeTree, and ReplicatedGraphiteMergeTree. Closes #687
Lazy imports for optional dependencies (numpy, pandas, pyarrow, polars). If installed, these heavy libraries are no longer imported at import clickhouse_connect time. They are only imported when features that need them are actually used. The C/Numpy optimization bridge is also deferred. This speeds up bare import time of clickhouse-connect about 4X in environments where all four are installed. Closes #589

Other

Remove py.typed marker file. The package does not have comprehensive type annotations, so the PEP 561 marker was causing false type errors for mypy/pyright users. Closes #691

Bug Fixes

SQLAlchemy: Fix .final() and .sample() silently overwriting each other when chained. Both methods now store modifiers as custom attributes on the Select instance and render them during compilation, replacing the previous with_hint() approach that only allowed one hint per table. Chaining in either order (e.g. select(t).final().sample(0.1)) correctly produces FROM t FINAL SAMPLE 0.1. Also fixes rendering for aliased tables (FROM t AS u FINAL) and supports explicit table targeting in joins. Fixes #658
SQLAlchemy: Fix sqlalchemy.values() to generate ClickHouse's VALUES table function syntax. The compiler now emits VALUES('col1 Type1, col2 Type2', ...) with the column structure as the first argument, instead of the standard SQL form that places column names after the alias. Generic SQLAlchemy types are mapped to ClickHouse equivalents (e.g. Integer to Int32, String to String). Also handles CTE usage by wrapping in SELECT * FROM VALUES(...). Fixes #681
SQLAlchemy: Fix GraphiteMergeTree and ReplicatedGraphiteMergeTree to properly single-quote the config_section argument as ClickHouse requires.

0.14.1, 2026-03-11

Bug Fixes

Fix JSON and Dynamic column read paths to properly decode shared variant data instead of returning raw binary with discriminator byte prefixes. Shared data values, used when paths exceed max_dynamic_paths or types exceed max_dynamic_types are now decoded from ClickHouse's binary variant encoding. Scalar types like integers, floats, strings, booleans, and nulls as well as nested objects are now fully decoded. Compound types like Array, Tuple, Map, DateTime, Date, Decimal, and UUID are not yet decoded and will be returned as raw bytes. Fixes #599, #615, and #674
SQLAlchemy: Fixed empty ORM/DBAPI SELECT results so cursor.description is still populated when ClickHouse Native format returns no data blocks. This restores correct handling for empty result sets, including parameterized and limited queries. Closes #675
Restore the default Cython runtime path so compiled driverc modules are used again unless CLICKHOUSE_CONNECT_USE_C=0 is set. Fix C/Python parity issues in streaming exception handling, FixedString string reads, nullable array helpers, and numpy conversion helpers, and expand CI and unit parity coverage to keep the optimized and pure-Python paths in sync. Addresses #676
Simplify pivot in the Cython data conversion module to use tuple(zip(*...)) instead of a manual tuple-building loop which matches the pure-Python implementation and provides significant insert speedup.

0.14.0, 2026-03-09

Breaking Changes

Renamed apply_server_timezone parameter to tz_source across Client and HttpClient. The new tz_source parameter accepts string values: "auto" (default, was None), "server" (was True or "always"), and "local" (was False). The old apply_server_timezone parameter is still accepted but emits a DeprecationWarning and will be removed in 1.0. Passing both tz_source and apply_server_timezone raises ProgrammingError. The "always" value (which had no distinct runtime behavior from True) maps to "server".
Renamed utc_tz_aware parameter to tz_mode across Client, QueryContext, and all query methods. The new tz_mode parameter accepts string values: "naive_utc" (default, was False), "aware" (was True), and "schema" (unchanged). The old utc_tz_aware parameter is still accepted but emits a DeprecationWarning and will be removed in 1.0. Passing both tz_mode and utc_tz_aware raises ProgrammingError. Closes #654
Removed the deprecated Object('json') type. This was the legacy experimental JSON type that has been superseded by the new JSON type in ClickHouse. Closes #556

Deprecations

Pandas 1.x support is now deprecated and will be removed in v1.0.0. A DeprecationWarning is emitted at import time for pandas 1.x users.

Improvements

Added support for the SAMPLE clause in SQLAlchemy statements. Note: Due to a SQLAlchemy limitation, only one hint (SAMPLE or FINAL) can be applied per table; chaining both will silently ignore one. For now, this change enables use of sample(), but chaining with final() is not yet supported. Closes #634
Experimental: Added Python 3.14 free-threading (cp314t) wheel builds for all platforms. The full test suite currently (as of 2 MAR, 2026) passes under free-threaded Python, but is not added to the CI test matrix at this time nor has it been otherwise tested to any degree. Free-threading support should be considered experimental with no guarantees of correctness at this time. Closes #573

0.13.0, 2026-02-26

Improvements

BREAKING CHANGE: Implement native write path for Variant data type with type-aware dispatching. Previously, all values inserted into a Variant column were stringified and sent to the server, which would store them in the String member if present, or attempt server-side conversion otherwise. Values are now serialized using their native ClickHouse types client-side (e.g. inserting 100 into Variant(Int64, String) stores Int64(100) instead of String("100")). Key changes:
- Values that don't match any variant member now raise DataError instead of being stringified and delegated to the server.
- A typed_variant(value, 'TypeName') helper is provided for cases where automatic dispatch cannot resolve the target type, such as when multiple variant members map to the same Python type (e.g. Array(UInt32) vs Array(String)).
Added utc_tz_aware="schema" mode which returns timezone-aware datetimes only when the server's column schema explicitly defines a timezone (e.g. DateTime('UTC')), and naive datetimes for bare DateTime columns. This matches the ClickHouse schema definition exactly. Not yet supported for Arrow-based query methods. Closes #645
Add type annotations to public API methods in Client, AsyncClient, HttpClient, and QueryResult. Ref #567

Bug Fixes

Fix dict_add parameter typed as builtin any instead of typing.Any.
Recognize UPDATE as a command so lightweight updates work correctly via client.query() and SQLAlchemy.
SQLAlchemy: GROUP BY now renders label aliases instead of full expressions which avoids circular reference errors when an alias shadows a source column name in ClickHouse.

0.12.0rc1, 2026-02-11

Implement a native async client. Closes #141

0.11.0, 2026-02-10

Python 3.9 Deprecation

Python 3.9 EOL'd Oct 2025. Support for Python 3.9 is now softly deprecated and has been removed from our CI test matrix but distribution wheels will continue to be built until the 1.0 release or until the builds naturally fail, whichever comes first. A DeprecationWarning will now be displayed when initializing the client on Python 3.9. Users should plan to upgrade to Python 3.10+ as 3.9 compatibility may break unexpectedly in future updates.

Bug Fixes

Fix issue where settings matching server defaults were not stored on client during initialization. Explicitly setting a default value is now respected (e.g., to prevent ClickHouse from auto-enabling optimizations). Closes #638
Raise OperationalError when ResponseSource hits network failure before any data is received. Previously, empty result would be returned. Closes #620
Fix issue with DROP table in client temp table test.
Fixed a bug where InsertContext state was not reset on insert failure, leading to reuse errors when data was passed separately.
Fixed UTC-equivalent timezone recognition issue where servers returning Etc/UCT, GMT, or other UTC-equivalent timezone names caused inconsistent behavior with utc_tz_aware=False. DateTime columns with explicit UTC timezones now correctly return naive datetimes when utc_tz_aware=False regardless of the specific UTC-equivalent timezone name returned by the server. Closes #629
Fixed percent sign (%) double encoding in SQLAlchemy string literals when using text() queries with formatDateTime and similar functions. The cursor now correctly unescapes %% back to % for non-parameterized queries. Closes #297

Improvements

Add support for mid-stream exceptions. Closes #626
Add support for QBit data type. Closes #570
Add the ability to create table from PyArrow objects. Addresses #588
Always generate query_id from the client side as a UUID4 if it is not explicitly set. Closes #596
Extend support for creating tables from PyArrow objects to include datetime/timestamp types. Closes #605
Add pre-commit hooks for auto-linting contributions at commit time. Addresses #607
Add missing type annotation. Closes #567
Auto-enable cancel_http_readonly_queries_on_client_close setting for HTTP clients to ensure SELECT queries are cancelled on the server when the client disconnects. Closes #641

0.10.0, 2025-11-14

Bug Fixes

Fixed DST fallback bug in DateTime and DateTime64 types caused by passing potentially ambiguous times to pd.DateTimeIndex constructor. Closes #585
Fixed issue with JSON key dot escaping. Closes #571

Improvements

Added SQLAlchemy core API support for ARRAY JOIN and FINAL modifier. Closes #579
Added Python 3.14 support (non-free-threaded build only; free-threaded builds are not yet supported). Closes #574
Added utc_tz_aware parameter to client and query methods to opt in to returning timezone-aware UTC objects for DateTime/DateTime64 columns. Default behavior remains the same and returns tz naive objects for backward compatibility. Note: this parameter will likely be removed and only return tz-aware dts in some future release. Closes #566
Added executor parameter to AsyncClient constructor to allow passing a custom executor for async operations. This allows users to control the concurrency and thread pool used by the async client.

0.9.2, 2025-09-25

Updated python_requires to drop Python 3.8 and advertise support for 3.9–3.13
Allow passing role as a field in the settings keyword argument to set a role for a specific query

0.9.1, 2025-09-16

Fix problem with typing that forced numpy to be installed.

0.9.0, 2025-09-16

Breaking Changes

WARNING: BREAKING CHANGE — Removed support for sqlalchemy 1.3 which reached its EOL in 2021. The minimum required version is now 1.4.40.
WARNING: BREAKING CHANGE — Behavior for reading from IPv6 columns has changed:
- With read_format='native', the client will always return ipaddress.IPv6Address objects, even for IPv4-mapped addresses (e.g., "::ffff:192.168.1.1"). Previously, the client returned ipaddress.IPv4Address objects for these cases. This change enforces type consistency and avoids surprising implicit conversions. If your application requires IPv4 objects, you can explicitly convert using the ipv4_mapped attribute of IPv6Address.
- With read_format='string', the client will always return IPv6 string representations, e.g., "::ffff:192.168.1.1" instead of "192.168.1.1", for the same reasons as above. If you require only the IPv4 string, you can parse or truncate this in your application code.
- Closes #493

Major Features

Added support for SQLAlchemy 2.x. The minimum required version is 1.4.40. Closes #263
Added Polars support for Arrow-based query and insert methods (query_df_arrow, query_df_arrow_stream, insert_df_arrow). This initial implementation provides basic dataframe conversion through the Arrow format, similar to how we support the pyarrow-backed pandas dataframes. Closes #111 and #542
Added support for querying/inserting pyarrow-backed DataFrames:
- query_df_arrow(): returns a pandas DataFrame with PyArrow dtype backend. Note that Arrow data types are preserved without additional conversions.
- query_df_arrow_stream(): Streaming version of query_df_arrow() for processing large result sets.
- insert_df_arrow(): Optimized insertion method for pandas DataFrames with PyArrow backend, which should provide better performance than standard insert_df().
Added Time and Time64 type support. Closes #509
Support for both pandas 1.x and 2.x.
Added support for Nullable(JSON) types
Added support for BFloat16 types

Improvements

Add support for lightweight DELETE in sqlalchemy. Closes #382
Added support for SELECT/JOIN operations via SQLAlchemy's core API (table operations and explicit statements--not ORM sessions-based queries)
Added client connection option rename_response_column (default None) that allows the user to define how response columns are automatically renamed according to a predefined scheme. Helpful for stripping alias prefixes, etc. in potentially complex queries. Closes #228
Add third-party library identifiers (name/version) in the User-Agent, e.g. pandas/2.2.5. Users can opt out by changing the common setting send_integration_tags to False.
Added support for form encoding query parameters when using HTTP interface. This addresses #342. Query parameters can now be sent as form-encoded data in the request body by setting form_encode_query_params=True when creating the client. This is particularly useful for queries with large parameter payloads that might exceed URL length limits.
Added support for special interval types. Closes #391
Added new common setting option "preserve_pandas_datetime_resolution" (default is False) allowing pandas 2.x users to opt into (when set to True) using the additional pandas 2.x datetime64/timedelta64 resolutions of "s", "ms", "us". If set to False or using pandas 1.x, all datetime64/timedelta64 resolutions will be coerced to "ns". (See here for more info). Closes #165 and #531
Tightens up type consistency of date-like objects when using query_df
When writing to an IPv6 column type, the client will "promote" IPv4 addresses to IPv4-mapped IPv6 addresses to prevent write errors. Closes #498
Changed AsyncClient.settings typing to Optional[Dict[str, Any]] to accept None inputs.
Added more robust error handling and tests. Closes #508
Replace the use of deprecated datetime.utcfromtimestamp

Bug Fixes

Fixed an AttributeError on http.client when importing clickhouse_connect under certain circumstances
Fixes problem with df inserts of Time and Time64 types. Closes #524

0.8.18, 2025-06-24

Improvements

Added a standalone test file (tests/unit_tests/test_driver/test_cursor.py) for testing cursor behavior

Bug Fixes

Fix SQLAlchemy execution error by using text() function by @lakako in #491
Test fixes for main by @genzgd in #497
Ensure types are returned even if there are no rows by @orian in #500
Added a standalone test file (tests/unit_tests/test_driver/test_cursor.py) for testing cursor behavior
Fix some issues with cursor behavior by @joe-clickhouse in #506
- Reset cursor location after performing an execute.
- Fix behavior of fetchall to only return rows from the current cursor location.
- Fixes logic of fetchmany to respect size parameter.

0.8.17, 2025-04-10

Improvements

The parameter transport_settings has been added to the Client query and insert methods. For the HTTP client (currently the only option), this dictionary of string is directly translated into additional HTTP headers at a query level. This can be used to provide additional proxy directives or other extra 'non-ClickHouse' information that is passed via headers. Thanks to Paweł Szczur of PostHog for the original PR!
There was previously no way to add a path to the ClickHouse server host in cases where the ClickHouse server was behind a proxy that used path based routing (such as https://big_proxy:8080/clickhouse). The new proxy_path get_client` argument can now be used to set that path. Closes #486

Bug Fix

Version 0.8.16 introduced a bug where changing a Client setting value and then changing that setting value back to the original server value would fail to restore the original setting. This has been fixed. Closes #487

0.8.16, 2025-03-28

Bug Fixes

Don't send a setting value if the setting is already correct according to the system.settings table. Closes #469
Ensure that the http user_agent header is in ascii. Note this could lead to an incorrectly encoded os_user if the os_user is not an Ascii string. Closes #484
Fix "cannot access local variable" exception where the http client encounters an unexpected streaming error. Also log that unexpected streaming error to assist debugging. Closes #483
Check that arrow/pandas is installed when calling query_df and query_arrow and raise a more meaningful exception if the required library is absent. Closes #477

Improvements

Some typing hints have been corrected. Thanks to Avery Fischer for the PR!
The docker based tests have been fixed to work with security improvements in recent ClickHouse releases
Query string cleanup is now (in theory) microseconds faster. Thanks to Sviatoslav Bobryshev for the optimization

0.8.15, 2025-01-25

Bug Fix

The async client was not shutting down its associated executor thread pool, result in a memory leak if multiple async clients were created. Closes #424. Note that the close function for the async client is now async to cleanly close down the pool. The recommended way to use an async client is now within an AsyncContext. See the associated PR for details. Thanks to ClickHouse core developer @pufit for the fix!

0.8.14, 2025-01-13

Bug Fix

Fix an edge case where a Pandas dataframe that contains only Int64 (or smaller) values would cause an exception when inserting into a ClickHouse "big int" table of U/Int128/256. Closes #452

0.8.13, 2025-01-07

Bug Fix

Fix missing default for new access_token parameter. Thanks to Lukas Thaler for the PR.

0.8.12, 2025-01-06

Improvement

Added support for JWT authentication (ClickHouse Cloud feature). It can be set via the access_token client configuration option for both sync and async clients. The token can also be updated via the set_access_token method in the existing client instance. NB: do not mix access token and username/password credentials in the configuration; the client will throw an error if both are set.

0.8.11, 2024-12-21

Improvement

Support of ISO8601 strings for inserting values to columns with DateTime64 type was added. If the driver detects that the inserted data for a DateTime64 is a string, it will attempt to parse an ISO-8601 datetime from that string. Other string formats are not currently supported. Thanks to Nikita Reznikov for the PR!

Bug Fix

Correctly handled native format column prefixes and insert names for Variant/Dynamic/JSON. Fixes #441 and likely some other issues with experimental types Variant,Dynamic, and JSON.

0.8.10, 2024-12-14

Bug Fixes

The experimental JSON type would break in some circumstances with ClickHouse server version 24.10 and later. This has been fixed. The fix is incompatible with ClickHouse version 24.8 and 24.9 however, so see the above WARNING about mixing JSON types
Experimental JSON types within a Tuple was broken. This has been fixed; however, the fix fails on ClickHouse server versions 24.8 and 24.9. If you need Tuple(JSON) support, you must use ClickHouse server version 24.10 or later. Closes #436.

0.8.9, 2024-12-02

Bug Fix

Roll back some timezone changes that caused incorrect usage of "local time" objects for some ClickHouse queries. Note that has deprecated "naive" timestamps; however converting everything to timezone aware objects (with the UTC timezone as appropriate) causes some numpy and possibly Pandas side effects. Eventually naive datetime object support will be deprecated/eliminated, but it will take some time to ensure no breaking changes. Fixes #433

0.8.8, 2024-11-27

Improvement

Handle low level HTTP errors as "Stream Complete". This provides better compatibility with the most recent ClickHouse version when the HTTP stream is abruptly closed after a server error.

0.8.7, 2024-11-21

Improvement

Added basic support for ClickHouse geometric types Ring, Polygon, MultiPolygon, LineString, and MultiLineString. Closes #427

Bug Fix

Settings/parameters from one Client will no longer leak into later client instantiations. Fixes #426

0.8.6, 2024-11-01

Bug Fixes

Correctly stream unchunked HTTP responses. Fixes #417.
Don't use wait_end_of_query for any streaming requests. Fixes #416

0.8.5, 2024-10-24

Bug fix

Inserts into a Nullable integer/float column could throw an exception if the first value was None and the column required conversion to the numeric type (such as Python str to float). This has been fixed. Note that "mixed" Python types in an insert data set will still throw an exception (i.e., Python strings and ints should not be combined into the same column for insert. Closes #414

0.8.4, 2024-10-23

Improvement

Python 3.13 is now included in CI tests and 3.13 wheels are built for distribution. Note that PyArrow is not yet available for Python 3.13.

Bug fixes

ClickHouse errors are now detected and throw an exception even if the HTTP status code returned by ClickHouse is a 200. This can happen when there is a long-running query (such as a large INSERT INTO ... SELECT FROM ...) and send_progress_in_http_headers is enabled to keep the HTTP connection open.
Pandas NA (which is equivalent to Float NaN for Float values) is now inserted as NULL into Nullable(Float*) columns. Closes #412

0.8.3, 2024-10-07

Improvement

Add an optional executor_threads argument to the get_async_client method. This controls the number of concurrent threads that each AsyncClient has available for queries. Defaults to "number of CPU cores plus four". Closes #407

0.8.2, 2024-10-04

Bug Fix

Ensure lz4 compression does not exit on an empty block. May fix #403.

Improvement

Compress Arrow inserts (using pyarrow compression) if compression is set to lz4 or zstd. Closes #267.

0.8.1, 2024-09-29

Bug Fix

Fixed an edge case where the HTTP buffer could theoretically return empty blocks.

0.8.0, 2024-09-26

Experimental Feature - "New" JSON/Dynamic/Variant DataTypes

Usage Notes

JSON data can be inserted as either a Python dictionary or a JSON string containing a JSON object {}. Other forms of JSON data are not supported
Valid formats for the JSON type are 'native', which returns a Python dictionary, or 'string', which returns a JSON string
Any value can be inserted into a Variant column, and ClickHouse will try to correctly determine the correct Variant Type for the value, based on its String representation.
More complete documentation for the new types will be provided in the future.

Known limitations:

Each of these types must be enabled in the ClickHouse settings before using. The "new" JSON type is available started with the 24.8 release
Returned JSON objects will only return the max_dynamic_paths number of elements (which defaults to 1024). This will be fixed in a future release.
Inserts into Dynamic columns will always be the String representation of the Python value. This will be fixed in a future release.
The implementation for the new types has not been optimized in C code, so performance may be somewhat slower than for simpler, established data types.

This is the first time that a new clickhouse_connect features has been labeled "experimental", but these new datatypes are complex and still experimental in ClickHouse server. Current test coverage for these types is also quite limited. Please don't hesitate to report issues with the new types.

Bug Fixes

When operating ClickHouse Server in strict TLS mode, HTTPS connections require a client certificate even if that certificate is not used for authentication. A new client parameter tls_mode='strict' can be used in this situation where username/password authentication is being used with client certificates. Other valid values for the new tls_mode setting are 'proxy' when TLS termination occurs at a proxy, and 'mutual' to specify mutual TLS authentication is used by the ClickHouse server. If tls_mode is not set, and a client certificate and key are provided, mutual is assumed.
The server timezone was not being used for parameter binding if parameters were sent as a list instead of a dictionary. This should fully fix the reopened #377.
String port numbers (such as from environmental variables) are now correctly interpreted to determine the correct interface/protocol. Fixes #395
Insert commands with a SELECT FROM ... LIMIT 0 will no longer raise an exception. Closes #389.

Improvements

Some low level errors for problems with Native format inserts and queries now include the relevant column name in the error message. Thanks to Angus Holder for the PR!
There is a new intermediate buffer for HTTP streaming/chunked queries. The buffer will store raw data from the HTTP request until it is actually requested in a stream. This allows some lag between reading the data from ClickHouse and processing the same data. Previously, if processing the data stream fell 30 seconds behind the ClickHouse HTTP writes to the stream, the ClickHouse server would close the connection, aborting the query and stream processing. This will now be mitigated by storing the data stream in the new intermediate buffer. By default, this buffer is set to 10 megabytes, but for slow processing of large queries where memory is not an issue, the buffer size can be increasing using the new common setting http_buffer_size. This is a fix in some cases of #399, but note that slow processing of large queries will still cause connection and processing failures if the data cannot be buffered.

It is now possible to correctly bind DateTime64 type parameters when calling Client query methods through one of two approaches:

Wrap the Python datetime.datetime value in the new DT64Param class, e.g.

  query = 'SELECT {p1:DateTime64(3)}'  # Server side binding with dictionary
  parameters={'p1': DT64Param(dt_value)}

  query = 'SELECT %s as string, toDateTime64(%s,6) as dateTime' # Client side binding with list 
  parameters=['a string', DT64Param(datetime.now())]

If using a dictionary of parameter values, append the string _64 to the parameter name

  query = 'SELECT {p1:DateTime64(3)}, {a1:Array(DateTime(3))}'  # Server side binding with dictionary

  parameters={'p1_64': dt_value, 'a1_64': [dt_value1, dt_value2]}

This closes #396, see also the similar issue #212

0.7.19, 2024-08-23

Bug Fix

Insertion of large strings was triggering an exception. This has been fixed.

0.7.18, 2024-07-30

Bug Fix

In some cases retrieving the os_user as part of the client data in the HTTP User-Agent header could throw an exception. This has been fixed (os_user will not be sent in those cases). Closes #380.

0.7.17, 2024-07-24

Bug Fix

The client server_tz was not being correctly set if the server timezone was not UTC. This should close #377

Improvement

The os user can now be sent as part of the User-Agent HTTP header. To disable this functionality for privacy reasons, set the new common/global setting send_os_user to False. Closes #371.

0.7.16, 2024-07-08

Improvement

Added the AsyncClient wrapper which is intended for asyncio environment usage. AsyncClient has the same methods with the same parameters as the standard Client, but they are coroutines when applicable. Internally, these methods from the Client that perform I/O operations are wrapped in a run_in_executor call. See also the updated run_async example.

0.7.15, 2024-07-01

Bug Fix

If the ClickHouse server was behind an https proxy that required mutual TLS authentication, the client would incorrectly attempt to use ClickHouse mutual TLS instead and authentication would fail. It should now be possible to authenticate correctly in this situation by settings the verify parameter to proxy. This should close #370

0.7.14, 2024-06-24

Bug Fix

Fix insert of UUID strings including dashes. Closes #368

0.7.13, 2024-06-24

Bug Fixes

Set required minimum version for optional tzlocal dependency. Thanks to drew-talon for reporting the issue and submitting the fix. Closes #360.
Extended the effect of the show_clickhouse_errors client setting to exclude showing hostname and port for errors when that setting is False. Thanks to Andy for the PR!

Improvement

Add the ability to bind arbitrary, "heredoc" data (including binary data) into the query, as described here. To use this functionality, use a single heredoc tag, such as $my_tag$ , in the query, and add that tag and the associated data into the query method parameters argument. For some examples, see the test_embedded_binary test in test_client.py. Closes #363.

0.7.12, 2024-06-04

Bug Fix

When using query_df with a FixedString column with a read format of 'string' (and the default query_df setting use_extended_dtypes=True), the resulting column in the dataframe will now be correctly set to the (extended) String dtype. Fixes #356

0.7.11, 2024-05-26

Improvement

Python or Pandas float value to ClickHouse Decimal now correctly rounds Float values for more accurate conversions. Thanks to Frederik Eychenié for the investigation and PR!

0.7.10, 2024-05-22

Bug Fix

Clean up pandas series concatenation issue

0.7.9, 2024-05-21

Bug Fixes

query_df would raise a deprecation warning with recent Pandas version if there were empty blocks. This should be fixed. #349
avoid a warning in timezone handling using the tzlocal library. Thanks to Tanner for the fix

Improvement

The new client keyword argument show_clickhouse_errors controls whether the full ClickHouse error (including possibly sensitive information) is displayed when there is an error in ClickHouse processing. It defaults to True. If False, the simple string 'The ClickHouse server returned an error.' will be displayed. Closes #344.
Updated to Cython 3.0.10

Improvement

0.7.8, 2024-04-14

Breaking Change

The default behavior of applying the client timezone if the GMT offset of the client matched the GMT offset of the server for the current time has been changed. The new default is to always apply the server timezone unless the optional apply_server_timezone get_client parameter is explicitly set to False. The previous behavior could cause confusing results where datetime values would be rendered in a Daylight Savings Time/Summer Time zone when DST was not active, and vice versa.

0.7.7, 2024-04-03

Bug Fix

Fixed client side binding for complex types containing floats or integers that was broken in version 0.7.5. Closes #335.

Improvement

Added a raw_stream method to the Client the returns an io.Base. Use this instead of the raw_query method with the (now removed) optional stream keyword boolean. Thanks to Martijn Thé for the PR that highlighted the somewhat messy public API.

0.7.6, 2024-04-01

Bug Fix

Fixed issue with SQLAlchemy Point type. Closes #332.

0.7.5, 2024-03-28

Bug Fixes

Fixed client side binding for Python format strings using %d (int) and %f (float) format patterns. Closes #327
Allows empty data argument in the initializer of ExternalFile / ExternalData objects. Thanks to martijnthe for the PR!

0.7.4, 2024-03-24

Improvement

Added the new client method query_arrow_stream for streaming PyArrow queries from ClickHouse. Big thanks to NotSimone for the feature and tests! Closes #155.

0.7.3, 2024-03-14

Improvement

Add summary field to Cursor object to retrieve the result of 'X-Clickhouse-Summary' header. Thanks to elchyn-cheliabiyeu for the PR!

0.7.2, 2024-03-07

Bug Fixes

Inserts into columns with multibyte UTF-8 names were broken. This has been fixed. #312
If the result of applying the precedence of timezones to a column results in an explicit UTC timezone, the datetime object returned should now be timezone naive. This should make the behavior consistent with the documentation. Closes #308 (except for a documentation update)
Extraneous semicolons are automatically removed from the end of queries. Addresses the most basic behavior in #310.

Performance Improvement

Pandas DataFrame returned from the client query_df method should be constructed somewhat faster in cases where the data returned in ClickHouse is in many small blocks. Note that performance gains in this use case are somewhat limited because of the memory and copying cost of building a large DataFrame from many smaller ClickHouse Native block structures, so such performance problems should normally be addressed at the query or ClickHouse data storage level (by for example, reducing the number of partitions and/or shards referenced by the query). This may partially address #307.

0.7.1, 2024-02-28

Bug Fixes

Changed type hint of the query parameter in Client query* methods to Optional[str] to work correctly with type analyzers. This also highlights that using a query_context instead of a query in these methods is supported (and preferred for repeated queries). Thanks to Avery Fischer for the PR!
Fixed sending a full table name to the insert_file tools function. Closes #305

0.7.0, 2024-01-22

Breaking Change

Python 3.7 builds are no longer part of the wheels deployed to PyPI

Bug Fix

Due to a change in default ClickHouse settings, inserts with "named" Tuple types no longer worked with ClickHouse version 24.1 and later. This has been fixed.

Improvements

Some types of security and other proxies require additional query parameters on any call to ClickHouse server behind such a proxy. Because the HTTPClient makes certain initialization queries to ClickHouse before any query parameters are set, it was difficult or impossible to create a Client successfully. You can now modify the HTTPClient class level properties params and valid_transport_settings before calling get_client so that such "special" query parameters will be included even on initialization queries. Thanks to Aleksey Astafiev for highlighting the problem and contributing a PR.
In some cases the user make want to disable urllib3 timeout settings connect_timeout and send_receive_timeout by setting them to none. The same PR from Aleksey Astafiev now allows setting to values to None
Update to Cython 3.0.8

0.6.23, 2023-12-15

Bug Fix

Add missing Nothing SQLAlchemy datatype, which fixes some edge case Superset queries. Thanks to elchyn-cheliabiyeu for the PR!

Improvement

Avoid concatenation of empty dataframes during query_df due to Pandas future warning. Thanks to Dylan Modesitt for the PR!

0.6.22, 2023-12-01

Improvements

Fix typo in log message for bad inserts. Thanks to Stas for the fix.
Allow non ClickHouse Cloud tests to run on community Pull Requests
Update to Cython 3.0.6

Bug Fix

ATTACH queries were not be correctly processed as "commands". Thanks to Aleksei Palshin for the PR!

0.6.21, 2023-11-23

Improvements

Added support for Point type. Closes #151. Thanks to Dhruvit Maniya for the PR!
Upgraded to Cython 3.0.5
Change exception handling in C API to stop spamming stderr

0.6.20, 2023-11-09

Bug Fix

Fixed an issue where client side binding of datetimes with timezones would produce the incorrect time string if timezones differed between the client and ClickHouse server. Closes #268

0.6.19, 2023-11-07

Bug Fixes

In some circumstances it was possible to insert a None value into a non-Nullable String column. As this could mask invalid input data, any attempt to insert None into a non-Nullable String or LowCardinality(String) will now throw a DataError
Reading a named Tuple column where the Tuple element names contained spaces would fail. In particular this would cause expected failures reading the experimental JSON column type with spaces in the keys. This has been fixed. Closes #265. Note that handling spaces in column types is tricky and fragile in several respects, so the best approach remains to use simple column names without spaces.

0.6.18, 2023-10-25

Bug Fixes

Reduce the estimated insert block size from 16-32MB to 1-2MB for large inserts. The large data transfers could cause "write timeout" errors in the Python code or "empty query" responses from ClickHouse over HTTPS connections. Should fix #258
Ensure that the internal client _progress_interval is positive even if a very small send_receive_timeout value is specified. Closes #259. Note that a very short send_receive_timeout is not recommended.

0.6.17, 2023-10-21

Bug Fix

Fix "negative" Date32 (before 1970-01-01) values for numpy and Pandas queries. Closes #254

0.6.16, 2023-10-18

Bug Fix

Remove bad private import to fix C Linkage. Closes #252

0.6.15, 2023-10-16

Improvement

Added Python 3.12 wheels and CI tests. Note that PyArrow is not yet available for 3.12, but should be soon. See apache/arrow#37880
The main clickhouse-connect.get_client method now displays type hints and ignores non-keyword arguments. Thanks to Avery Fischer for the usability improvement!
Log messages regarding C optimization availability and JSON library selection have been change from INFO to DEBUG. Closes #249

0.6.14, 2023-09-22

Bug Fixes

Fixed insert error when inserting a zero length string into a FixedString column. Closes #244
Removed unnecessary validate_entrypoints import from top level package init that was breaking Python 3.7. Note that Python 3.7 is EOL and will no longer be supported as of January 1, 2024.

0.6.13, 2023-09-20

Bug Fix

Fixed an issue with the automatic retry of "connection reset errors". This should prevent exceptions when the ClickHouse server closes a Keep Alive connection while a new request is in flight.

Improvement

Improved support for typing tools by adding a py.typed file. Thanks to Avery Fischer for the contribution.

0.6.12, 2023-08-30

Bug Fix

Nested empty Maps would return an IndexError when queried. #239. Thanks to Ashton Hudson for the report and the fix

0.6.11, 2023-08-30

Bug fixes

Inserts using Pandas 2.1 would fail due to a removed method in the Pandas library. There is now a workaround/fix for this. Closes #234
Inserts into a FixedString column that were not the expected size could cause corrupt insert blocks and mysterious errors from the ClickHouse server. Validation has been added so that more meaningful error messages are generated if a fixed string value is an invalid size. A reminder that strings which are "too short" for a FixedString column will be padded with 0 bytes, while strings that are "too long" will generate an exception during the insert.

0.6.10, 2023-08-27

Improvement

Add support and tests for the Object(Nullable('json')) type, which is sometimes detected by schema inference.

0.6.9, 2023-08-21

Improvements

Logging and exception handling for failed insert transformations has been reworked. If an exception is thrown when attempting to convert Python, Pandas, or Numpy data into ClickHouse Native format, the column name and type will be logged, as well as a stack trace of actual exception (note this may be in the C/Cython code, so the exception data may still be difficult to interpret). This partially addresses #229. Unfortunately determining data errors on a row level in addition to the column level is not practical in most cases without seriously impacting performance.
Version information has been moved from a top level VERSION to a Python __version__ file in the package. This removes the Python 3.7 dependency on importlib_metadata.
Cython .pyx, and .pxd files are now included in the PyPI source distribution to improve compatibility with 3rd party build tools.

0.6.8, 2023-07-18

Bug Fix

Fixed client raw_insert method when a compression method specified. #223

Improvement

Add compression parameter to the clickhouse tools.insert_file method. '.gz' and '.gzip' extensions are automatically recognized.

0.6.7, 2023-07-18

Bug Fixes

Fixed an issue for older versions of ClickHouse where the server would send an initial block of 0 rows for larger queries. This would break some queries with LowCardinality columns. Closes #221
Fixed thecompression alias for the compress client setting in SQLAlchemy/Superset DSN urls.

Improvements

Upgraded to Cython 3.0.0 final release!
Reversed the internal variable names of keys and indexes for low cardinality columns to be consistent with the ClickHouse server nomenclature.

0.6.6, 2023-07-07

Bug Fix

Inserting into an Enum column from a Pandas DataFrame with integer values only inserted 0 values. This is fixed. #219

0.6.5, 2023-07-06

Bug Fixes

The Client min_version method now ignores unrecognized "text" elements. This could cause issues for unofficial ClickHouse releases. Thanks to Diego Nieto for the fix!
In most cases insert query is now sent as part of the POST body instead of as a query parameter. This fixes #213. Note that this does not happen for direct file inserts using the driver.tools module, since these rely on an unmodified buffered input stream to efficiently upload files. In that case the actual insert query will still be passed as a query parameter.
All datetime objects returned from a query will now be timezone aware. This fixes #210. There remains one exception to this -- if the calculated timezone and the local timezone are both UTC, then naive timezones will be used to improve performance in such "all UTC" environments.
Inserting Python dictionaries into a ClickHouse "named" Tuple column now works correctly. Fixes #215. Note that using dictionaries for inserts will be noticeably slower than inserting the equivalent Python tuple value (with elements in the correct order)

Improvements

Client error messages used to be cut off at 240 characters to avoid creating huge log files. This value is now configurable using the common.max_error_size setting. Use 0 for this setting to get the full ClickHouse error message. In addition, the default has been changed to 1024 to capture more SQL errors without needing to modify the global setting value. Thanks to Ramlah Aziz for the update!
All Client insert methods now return a simple QuerySummary object, which includes properties written_rows, written_bytes, and query_id calculated from ClickHouse HTTP response headers. A QuerySummary object is also returned from the Client command method if the command does not return other data. Closes #216
Version determination no longer indirectly depends on the setuptools pkg_resources package. This also avoids some indirect dependency problems. Thanks to cwegener for the PR!

0.6.4, 2023-06-22

Bug Fixes

Quote database name when retrieving tables via SQLAlchemy. Fixes the Superset issue apache/superset#24372 for recent versions of Superset using clickhouse-connect
Don't rely on the ClickHouse currentDatabase() function to set an explicit database parameter. This should not change functionality when no database is specified in Client creation since ClickHouse will use the user's default database in that situation regardless. Fixes #207

0.6.3, 2023-06-16

Bug Fix

Inserts into decimal columns first convert the source value to a Python Decimal to work around floating point rounding issues. Fixes #203
DateTime64 values were broken for dates before 01-01-1970. This is fixed. #204

0.6.2, 2023-06-10

Improvements

Cython version upgraded to 3.0.0b3
Inserts for string columns are now C optimized (approximately 2x faster)

Bug Fix

Very long running queries could break because ClickHouse returned too many progress headers. Thanks to Ivan for the fix

0.6.1, 2023-06-06

Improvements

Minor documentation clean up regarding Superset compatibility

0.6.0, 2023-06-05

Bug Fixes

Use uuid4 instead of uuid1 for generating client level session_ids, as well as use a new urllib3 PoolManager when multiprocessing mode is detected. This should fix #194. Thanks to Guillaume Matheron for filing the issue and digging into details. The underlying problem is that the Python uuid1() is not guaranteed to be unique in a forked multiprocessing environment.
Change log warning to debug message if numpy is not available for C bindings. This check is harmless if numpy is not installed and should not have produced a warning. Fixes #195

Improvements

Cython version upgraded to 3.0.0b2
The block size (number of rows) for chunked/streaming inserts is now dynamically determined based on sample of the insert data. This allows more efficient streaming of large inserts and significantly improves insert performance in some circumstances.
Pivoting row based data to native columns for inserts has been optimized in C. This improves insert performance for large inserts of row oriented data.

0.5.25, 2023-05-23

Bug Fix

The client will now validate that the client_protocol_version query parameter is actually received and used by the ClickHouse server before assuming that data returned confirms to the expected protocol version. This fixes an incompatibility with the current versions of CHProxy (and possibly other proxies that restrict the query parameters passed to the ClickHouse Server). Note that other features that require the use of query parameters (such as server side bound query parameters) may also fail because of this behavior in CHProxy. Fixes #191

0.5.24, 2023-05-11

Bug Fixes

The client command method now accepts ClickHouse "external data." Closes #186
Arrays of Python date and datetime objects are now correctly formatted when use as server side parameters. Fixes #188
Fixed inserts of SimpleAggregateFunction columns with a LowCardinality type parameter. #187

0.5.23, 2023-05-03

Bug Fixes

SQLAlchemy table reflection threw an exception for SimpleAggregateFunction columns. This has been fixed. #180
The client no longer logs an invalid warning for query types that did not return a timezone header. #181
Querying SimpleAggregateFunction columns with a LowCardinality type parameter was broken. This has been fixed. #182
The query_arrow method now correctly accepts the external_data parameter. #183
The query_arrow method has been fixed for read only queries/settings. #184

New Feature

A common setting max_connection_age has been added, which will ensure that HTTP connections are not reused forever (this can help with certain load balancing issues. It defaults to 10 minutes

0.5.22, 2023-04-27

Bug Fix

There was a critical issue when using zstd compression (the default) with urllib3 version 2.0+. This has been fixed.

0.5.21, 2023-04-26

Bug Fix

Logging "Unexpected Http Driver Exception" only as WARNING instead of ERROR. Use the raised OperationalError if you depend on this. Thanks to Alexandro Sandre for the fix.
The wait_end_of_query setting is no longer automatically sent with inserts. This caused unnecessary buffering on the ClickHouse server file system, especially in the case of many small inserts. It can still be added using the settings dictionary of the client *insert methods if needed for some reason.
The query setting use_na_values has been renamed to use_extended_dtypes and now applies to all extended/special Pandas dtypes (except the Pandas Timestamp type). Set this to False to limit the dtypes returned in Pandas dataframes to the "basic" numpy types. (Note that this will force the use of numpy object arrays for most "nullable types") This should allow creating "basic" dataframes for greater compatibility. Closes #172.

0.5.20, 2023-04-06

Bug Fixes

Fix Pandas dataframe inserts where the Dataframe index does not match the data values (after, for example, creating a new DataFrame from a subset of the original.) #167 Thanks to Georgi Peev for the report and suggested fix, and his continued stress testing of Pandas functionality.
Compression and other control settings were not properly sent with the request if the corresponding setting was not enabled on the server. Many thanks to Alexander Khmelevskiy for the extended investigation and subsequent fix. #157

0.5.19, 2023-04-05

Bug Fixes

Fix quoting and escaping of array literals in server parameters. See #159. Big thanks to Joachim Jablon for the report and the fix.
Pandas and numpy Date values were incorrect for values after 2050. This has been fixed. #164
Fixed server side parameter binding of the NULL value for Nullable types
Added support for no_proxy/NO_PROXY environment variable. Also added support for lower case http_proxy and https_proxy variables. Note that lower case versions have precedence over upper case versions. Fixes #163

0.5.18, 2023-03-30

Performance Improvement

The server timezone will not be applied (and Python datetime types will be timezone naive) if the client and server timezones match and the get_client apply_server_timezone parameter is True (the default). This improves performance where client and server have the same (non-UTC) timezone. To override this behavior and always apply a server timezone to the result, use apply_server_timezone='always'. This should fix #157

0.5.17, 2023-03-26

Timezone Improvements

The client query_df and query_df_stream methods now accept query_tz and column_tzs parameters like other query* methods.
A new boolean parameter apply_server_timezone has been added to the main get_client method. Setting this parameter to True (the default) will apply the server timezone (if not UTC) to values returned by the client query* methods. The previous behavior would always return timezone naive, UTC based Python and Pandas datetime objects for ClickHouse DateTime and DateTime64 columns without a defined timezone. To revert to the previous behavior, set the apply_server_timezone parameter to False. Closes #152
The timezone logic applied to query results has been simplified and now uses the following order of precedence:
- Use the column timezone for the column if it is specified using the column_tzs parameter
- Use the column timezone for the column if specified in the ClickHouse column definition (only works for ClickHouse versions 23.2 and later)
- Use the query timezone for the query if it is set using the query_tz parameter
- Use the "response" timezone for the query as read from the X-ClickHouse-Timezone header if different from the server timezone. This closes #138.
- Use the ClickHouse server timezone (if the client parameter apply_server_timezone is True)
Note if the detected timezone according to the above precedence is UTC, clickhouse-connect will always return a naive datetime object with no timezone information

New Feature

ClickHouse external data is now support for all client query methods. To send external data, construct a driver.external.ExternalData object and send it as the external_data parameter in the appropriate query method. See the ClickHouse documentation for additional details. There are also examples in the test file . Closes #98

0.5.16, 2023-03-15

Bug Fix

Creating a client would fail if for some reason the user did not have access to the system.settings table. Thanks to Filipp Balakin for the fix.

Improvements

String columns now accept values of bytes-like objects (bytes/bytearray/etc.) for inserts (as with other inserts, all values for the inserted column should be the same types, either a bytes-like object or str). A corresponding bytes read format has been enabled for String columns as well. Thanks to Tim Nooran for opening the issue and providing unit tests. #148
Cython version upgraded to 3.0.0b1

0.5.15, 2023-03-10

Bug Fix

Remove unnecessary addition of the client database to the table name for inserts. Fixes #145

Improvement

The driver should now work for older versions of ClickHouse back to 19.16. Note that older versions are not officially tested or supported (like the main ClickHouse database, we officially support the last three monthly ClickHouse releases and the last two LTS ClickHouse releases). For versions prior to 19.17, you may want change the new readonly clickhouse_connect.common setting to '1' to allow sending ClickHouse settings with individual queries (if the user has write permissions). Thanks to Aleksey Astafiev for this contribution and for updating the tests to run with these legacy versions!

0.5.14, 2023-03-02

Bug Fix

Remove direct pandas import that caused an unrecoverable error when pandas was not installed. #139

0.5.13, 2023-02-27

Improvements

By default, reading Pandas Dataframes with query_df and query_df_stream now sets a new QueryContext property of use_pandas_na to True. When use_pandas_na is True, clickhouse_connect will attempt to use Pandas "missing" values, such as pandas.NaT and pandas.NA, for ClickHouse NULLs (in Nullable columns only), and use the associated extended Pandas dtype. Closes #132
There are new low level optimizations for reading some Nullable columns, and writing Pandas dataframes

Bug Fixes

Timezone information from ClickHouse DateTime columns with a timezone was lost. There was a workaround implemented for this issue in v0.5.8 that allowed assigned timezones to the query or columns on the client side. ClickHouse now support sending this timezone data with the column, but only in server versions 23.2 and later. If such a version is detected, clickhouse-connect will return timezone aware DateTime values without a workaround. Fixes #120
For certain queries, an incorrect, non-zero "zero value" would be returned for queries where use_none was set to False. All NULL values are now properly converted.
Timezone data was lost when a DateTime64 column with a timezone was converted to a Pandas DataFrame. This has been fixed. #136
send_progress headers were not being correctly requested, which could result in unexpected timeouts for long-running queries. This has been fixed.

0.5.12, 2023-02-16

Improvement

A new keyword parameter server_host_name is now recognized by the clickhouse_connect.get_client method. This identifies the "real" ClickHouse server hostname that should be used for HTTPS/TLS certificate validation, in cases where access to the server is through an ssh tunnel or other proxy with a different hostname. For examples of how to use the new parameter, see the updated file https://github.com/ClickHouse/clickhouse-connect/blob/main/examples/ssh_tunnels.py.

Bug fix

The database element of a DSN was not recognized when present in the dsn parameter of clickhouse_connect.get_client. This has been fixed.

0.5.11, 2023-02-15

Bug Fix

Referencing the QueryResult named_results property after other properties such as row_count would incorrectly raise a StreamClosedError. Thanks to Stas for the fix.

Improvement

A better error message is returned when trying to read a "non-standard" DateTime64 column function for a numpy array or Pandas DataFrame. "non-standard" means a DateTime64 precision not conforming to seconds, milliseconds, microseconds, or nanoseconds (0, 3, 6, or 9 respectively). These DateTime64 types are not supported for numpy or Pandas because there is no corresponding standard numpy datetime64 type and conversion would be unacceptably slow (supported numpy types are datetime64[s], datetime64[ms], datetime64[us], and datetime64[ns]). A workaround is to cast the DateTime64 type to a supported type, i.e. SELECT toDateTime64(col_name, 3) for a millisecond column.
The base configuration required for a urllib PoolManager has been broken out into its own help method, clickhouse_connect.driver.http_util.get_pool_manager_options. This makes it simpler to configure a SOCKSProxyManager as in the new example file https://github.com/ClickHouse/clickhouse-connect/blob/main/examples/ssh_tunnels.py

0.5.10, 2023-02-13

Improvement

Reading Nullable(String) columns has been optimized and should be approximately 2x faster. (This does yet not include LowCardinality(Nullable(String)) columns.)
Extraction of ClickHouse error messages included in the HTTP Response has been improved

Bug Fixes

When reading native Python integer columns, the use_none=False query parameter would not be respected, and ClickHouse NULLS would be returned as None instead of 0. use_none=False should now work correctly for Nullable(Int) columns
Starting with release 0.5.0, HTTP Connection pools were not always cleanly closed on exit. This has been fixed.

0.5.9, 2023-02-11

Bug Fixes

Large query results using zstd compression incorrectly buffered all incoming data at the start of the query, consuming an excessive amount of memory. This has been fixed. #122 Big thanks to Denny Crane for his detailed investigation of the problem. Note that this affected large queries using the default compress=True client setting, as ClickHouse would prefer zstd compression in those cases.
Fixed an issue where a small query_limit would break client initialization due to an incomplete read of the system.settings table. #123

Improvement

Stream error handling has been improved so exceptions thrown while consuming a stream should be correctly propagated. This includes unexpected stream closures by the ClickHouse server. Errors inserted into the HTTP response by ClickHouse during a query should also be reported as part of a StreamFailureError

0.5.8, 2023-02-10

Bug Fix

Return empty dataframe instead of empty list when no records returned from query_df method Fixes #118

Default parameter change

The client query_limit now defaults to 0 (unlimited rows returned), since the previous default of 5000 was unintuitive and led to confusion when limited results were returned.

New Feature

Allow client side control of datetime.datetime timezones for query results. The client query methods for native Python results now accept two new parameters: query_tz is the timezone to be assigned for any DateTime or DateTime64 objects in the results, while timezones can be set per column using the column_tzs dictionary of column names to timezones. See the test file for simple examples. This is a workaround for #120 and the underlying ClickHouse issue ClickHouse/ClickHouse#40397 Note that this issue only affects DateTime columns, not DateTime64, although the query context parameters will override the returned DateTime64 timezone as well.

0.5.7, 2023-02-01

Bug Fix

Http proxies did not work after removing the requests library. #114. This should be fixed. Note that socks proxies are still not supported directly, but can be added by creating a correctly configured urllib3 SOCKSProxyManager and using it as the pool_mgr argument to teh clickhouse_connect.create_client method.

0.5.6, 2023-02-01

Bug Fix

Dataframe inserts would incorrectly modify null-like elements of the inserted dataframe. #112. This should be fixed

0.5.5, 2023-02-01

Bug Fix

Queries of LowCardinality columns using pandas or numpy query methods would result in an exception. #108 This has been fixed.

0.5.4, 2023-01-31

New Features

Several streaming query methods have been added to the core ClickHouse Connect client. Each of these methods returns a StreamContext object, which must be used as a Python with Context to stream data (this ensures the underlying streaming response is properly closed/consumed.) For simple examples, see the basic tests.
- query_column_block_stream -- returns a generator of blocks in column oriented (Native) format. Fastest method for retrieving data in native Python format
- query_row_block_stream -- returns a generator of blocks in row oriented format. Used for processing data in a "batch" of rows at time while limiting memory usage
- query_rows_stream -- returns a convenience generator to process rows one at a time (data is still loaded in ClickHouse blocks to preserve memory)
- query_np_stream -- returns a generator where each ClickHouse data block is transformed into a Numpy array
- query_df_stream -- returns a generator where each ClickHouse data block is transformed into a Pandas Dataframe
The client_name is now reported in a standardized way to ClickHouse (as the http_user_agent). For better tracking of your Python application, use the new product_name common setting or set client_name get_client parameter to identify your product as <your-product-name>/<product-version>.

Performance Improvements

C/Cython optimizations for transforming ClickHouse data to Python types have been improved, and additional datatypes have been optimized in Cython. The performance increase over the previous 0.5.x version is approximately 10% for "normal" read queries.
Transformation of Numpy arrays and Pandas Dataframes has been completely rewritten to avoid an intermediate conversion to Python types. As a result, querying in Numpy format, and especially Pandas format, has been significantly improved -- from 2x for small datasets to 5x or more for very large Pandas DataFrames (even without streaming). Queries including Numpy datetime64 or Pandas Timestamp objects have particularly benefited from the new implementation.

Bug Fixes

The default maxsize for concurrent HTTP connections to a single host was accidentally dropped in the 0.5.x release. It has been restored to 8 for better performance when using multiple client objects.
A single low level retry has been restored for HTTP connections on ConnectionReset or RemoteDisconnected exceptions. This should reduce connection errors related to ClickHouse closing expired KeepAlive connections.

Internal Changes

As noted above, streaming, contexts and exception handling have been tightened up to avoid leaving HTTP responses open when querying streams.
Previous versions used threading.local() variables to store context information during query processing. The architecture has been changed to pass the relevant Query or Insert Context to transformation methods instead of relying on thread local variables. This is significantly safer in an environment where multiple queries can conceivably be open at the same on the same thread (for example, if using async functions).
Per query formatting logic has moved from ClickHouseType to the QueryContext.
ClickHouseType methods have been renamed to remove outdated references to native format (everything is native now)
Upgraded Cython Build to 3.0.11alpha release

0.5.3, 2023-01-23

Bug Fix

Correctly return QueryResult object when created as a context using a with statement. This fixes examples and the preferred context syntax for processing query results. Thanks to John McCann Cunniff Jr

0.5.2, 2023-01-17

Bug fix

Fix issue where client database is set to None (this normally only happens when deleting the initial database)

0.5.1, 2023-01-16

Bug fix

Fix ping check in http client. Closes #96.

0.5.0, 2023-01-14

WARNING -- Breaking Change -- Removing get_client Arbitrary Keyword Arguments

The clickhouse_connect get_client method (which proxies the driver.Client constructor) previously accepted arbitrary keyword arguments that were interpreted as ClickHouse server settings sent with every request. To be consistent with other client methods, get_client now accepts an optional settings Dict[str, Any] argument that should be used instead to set ClickHouse server settings.

WARNING -- Breaking Change -- HttpClient argument http_adapter replaced with pool_mgr

The driver.HttpClient constructor previously accepted the optional keyword argument http_adapter, which could be used to pass a custom requests.adapter.HttpAdapter to the client. ClickHouse Connect no longer uses the requests library (see Dependency Changes below). Instead, the HttpClient constructor now accepts an optional pool_mgr keyword argument which can be used to set a custom urllib.poolmanager.PoolManager for the client. In most cases the default PoolManager is all that is needed, but multiple PoolManagers may be required for advanced server/proxy applications with many client instances.

Dependency Changes

ClickHouse Connect no longer requires the popular requests library. The requests library is built on urllib3, but ClickHouse Connect was utilizing very little of the added functionality. Requests also has very restricted access to the urllib3 streaming API, which made adding additional compression methods difficult. Accordingly, the project now interfaces to urllib3 directly. This should not change the public API (except as noted in the warning above), but the HttpClient internals have changed to use the lower level library.
ClickHouse Connect now requires the zstandard and lz4 binding libraries to support zstd and lz4 compression. ClickHouse itself uses these compression algorithms extensively and is optimized to work with them, so ClickHouse Connect now takes advantages of them when compression is desired.

New Features

The core client query method now supports streaming. The returned QueryResult object has new streaming methods:
- stream_column_blocks - returns a generator of smaller result sets matching the ClickHouse blocks returned by the native interface.
- stream_row_blocks - returns a generator of smaller result sets matching the ClickHouse blocks returned by the native interface, but "pivoted" to return data rows.
- stream_rows - returns a generator that returns a row of data with each iteration.
  These methods should be used within a with context to ensure the stream is properly closed when done. In addition, two new properties result_columns and result_rows have been added to QueryResult. Referencing either of these properties will consume the stream and return the full dataset. Note that these properties should be used instead of the ambiguous result_set, which returns the data oriented based on the column_oriented boolean property. With the addition of result_rows and result_columns the result_set property and the column_oriented property are unnecessary and may be removed in a future release.
More compression methods. As noted above, ClickHouse Connect now supports zstd and lz4 compression, as well as brotli (br), if the brotli library is installed. If the client compress method is set to True (the default), ClickHouse Connect will request compression from the ClickHouse server in the order lz4,zstd,br,gzip,deflate, and will compress inserts to ClickHouse using lz4. Otherwise, the client compress argument can be set to any of lz4, zstd, br, or gzip, and the specific compression method will be used for both queries and inserts. While gzip is available, it doesn't perform as well as the other options and should normally not be used.

Performance Improvements

More data conversions for query data have been ported to optimized C/Cython code. Rough benchmarks suggest that this improves query performance approximately 20% for standard data types.
Using the new streaming API to process data in blocks significantly improves performance for large datasets (largely because Python has to allocate significantly less memory and do much less internal data copying otherwise required to build and hold the full dataset). For datasets of a million rows or more, streaming can improve query performance 2x or more.

Bug Fixes

As mentioned, ClickHouse gzip performance is poor compared to lz4 and zstd. Using those compression methods by default avoids the major performance degradation seen in #89.
Passing SqlAlchemy query parameters to the driver.Client constructor was broken by changes in release 0.4.8. #94. This has been fixed.

0.4.8, 2023-01-02

New Features

Documentation has been expanded to cover recent updates.
File upload support. The new driver.tools module adds the function insert_file to simplify directly inserting data files into a table. See the test file for examples. This closes #41.
Added support for server side http query parameters For queries that contain bindings of the form {<name>:<datatype>}, the client will automatically convert the query* method parameters dictionary to the appropriate http query parameters. Closes #49.
The main clickhouse_connect.get_client command will now accept a standard Python dsn argument and extract host, port, user, password, and settings (query parameters) from the dsn. Note that values for other keyword parameters will take precedence over values extracted from the dsn.
The QueryResult object now contains convenience properties for the first_item, first_row, and row_count in the result.

0.4.7, 2022-12-05

Bug Fixes

JSON inserts with the ujson failed, this has been fixed. #84

New Features

The JSON/Object datatype now supports writes using JSON strings as well as Python native types

0.4.6, 2022-11-29

Bug Fixes

Fixed a major settings issue with connecting to a readonly database (introduced in v0.4.4)
Fix for broken database setup dialog with recent Superset versions using SQLAlchemy 1.4

0.4.5, 2022-11-24

Bug Fixes

Common settings were stored in an immutable named tuple and could not be changed. This is fixed.
Fixed issue where the query_arrow method would not use the client database

0.4.4, 2022-11-22

Bug Fixes

Ignore all "transport settings" when validating settings. This should fix #80 for older ClickHouse versions

0.4.3, 2022-11-22

New Features

The get_client method now accepts a http_adapter parameter to allow sharing a requests.HTTPAdapter (and its associated connection pool) across multiple clients.
The VERSION file is now included in every package installation. Closes #76

0.4.2, 2022-11-22

New Features

Global/common configuration options are now available in the clickhouse_connect.common module. The available settings are:
- autogenerate_session_id [bool] Whether to generate a UUID1 session id used for every client request. Defaults to True. Disabling this can facilitate client sharing and load balancing in some use cases.
- dict_parameter_format [str] Options are 'json' and 'map'. This controls whether parameterized queries convert a Python dictionary to JSON or ClickHouse Map syntax. Default to json for insert into Object('json') columns.
- invalid_setting_action [str] Options are 'send' and 'drop'. Client Connect normally validates and drops (with a warning any settings that aren't recognized by the Server or are readonly). Changing this setting to 'send' will include such settings with the request anyway -- which will normally result in an error being returned.
The clickhouse_connect.get_client method now accepts a settings dictionary argument for consistency with other client methods.

Bug Fixes

Fixed insert of Pandas Dataframes for Timestamp columns with timezones #77
Fixed exception when inserting a Pandas Dataframes with NaType values into ClickHouse Float column (see known issue)

Known Issue

When inserting Pandas DataFrame values into a ClickHouse Nullable(Float*) column, a Float NaN value will be converted to a ClickHouse NULL. This is a side effect of a Pandas issue where df.replace cannot distinguish between NaT and NaN values: pandas-dev/pandas#29024

0.4.1, 2022-11-14

Bug Fixes

Numpy array read and write compatibility has been refined and performance has been improved. This fixes #69
Pandas Timestamp objects are now correctly handled for all supported ClickHouse Date* types. This fixes #68
SQLAlchemy datatypes are now correctly mapped to the underlying ClickHouse type regardless of case. This fixes an issue with migrating Superset datasets and queries from clickhouse-sqlalchemy to clickhouse-connect. Thanks to Eugene Torap

0.4.0, 2022-11-07

New Features

The settings, table information, and insert progress used for client inserts has been centralized in a new reusable InsertContext object. Client insert methods can now accept such objects to simplify code and reduce overhead
Query results can now be returned in a column oriented format. This is useful to efficiently construct other objects (like Pandas dataframes) that use column storage internally
The transformation of Pandas data to Python types now bypasses Numpy. As a result compatibility for ClickHouse date, integer, and NULL types has been significantly improved

Bug Fixes

An insert using chunked transfer encode could fail in progress during serialization to ClickHouse native format. This would "hang" the request after throwing the exception, leading to ClickHouse reporting "concurrent session" errors. This has been fixed.
Pandas DataFrame inserts into tables with a "large" integer column would throw an exception. This has been fixed.
Pandas DataFrame inserts with NaT/NA/nan values would fail, even if inserted into Nullable column types. This has been fixed.

Known Issues

Numpy inserts into large integer columns are not supported. #69
Insert of Pandas timestamps with nanosecond precision will lose the nanosecond value. #68

0.3.8, 2022-11-03

Bug Fixes

Fix read compression typo

0.3.7, 2022-11-03

New Features

Insert performance and memory usage for large inserts has been significantly improved
- Insert blocks now use chunked transfer encoding (by sending a generator instead of a bytearray to the requests POST method)
- If the client is initialized with compress = True, gzip compression is now enabled for inserts
Pandas DataFrame inserts have been optimized by keep the data in columnar format during the entire insert process

Bug Fixes

Fix inserts for date and datetime columns from Pandas dataframes.
Fix serialization issues for Decimal128 and Decimal256 types

0.3.6, 2022-11-02

Bug Fixes

Update QueryContext.updated_copy method to preserve settings, parameters, etc. #65

0.3.5, 2022-10-28

New Features

Build Python 3.11 Wheels

0.3.4, 2022-10-26

Bug fixes

Correctly handle insert into JSON/Object('json') column via SQLAlchemy
Fix some incompatibilities with SQLAlchemy 1.4

0.3.3, 2022-10-21

Bug fix

Fix 'SHOW CREATE' issue. #61

0.3.2, 2022-10-20

Bug fix

"Queries" that do not return data results (like DDL and SET queries) are now automatically treated as commands. Closes #59

New Features

A UUID session_id is now generated by default if session_id is not specified in clickhouse_connect.get_client
Test infrastructure has been simplified and test configuration has moved from pytest options to environment files

0.3.1, 2022-10-19

Bug Fixes

UInt64 types were incorrectly returned as signed Python ints even outside of Superset. This has been fixed
Superset Engine Spec will now format (U)Int256 and (U)Int128 types as strings to avoid throwing a conversion exception

0.3.0, 2022-10-15

Breaking changes

The row_binary option for ClickHouse serialization has been removed. The performance is significantly lower than Native format and maintaining the option added complexity with no corresponding benefit

Bug Fixes

The Database Connection dialog was broken in the latest Superset development builds. This has been fixed
IPv6 Addresses fixed for default Superset configuration

0.2.10, 2022-09-28

Bug Fixes

Add single retry for HTTP RemoteDisconnected errors from the ClickHouse Server. This prevents exception spam when requests (in particular inserts) are sent at approximately the same time as the ClickHouse server closes a keep alive connection.

0.2.9, 2022-09-24

Bug Fixes

Fix incorrect validation errors in the Superset connection dialog

0.2.8, 2022-09-21

New Features

This release updates the build process to include binary wheels for the majority of platforms, include MacOS M1 and Linux Aarch64. This should also fix installation errors on lightweight platforms without build tools.
Builds are now included for Python 3.11

Known issues

Docker images built on MacOS directly from source do not correctly build the C extensions for Linux. However, installing the official wheels from PyPI should work correctly.

0.2.7, 2022-09-10

New Features

The HTTP client now raises an OperationalError instead of a DatabaseError when the HTTP status code is 429 (too many requests), 503 (service unavailable), or 504 (gateway timeout) to make it easier to determine if it is a retryable exception
Add query_retries client parameter (default 2) for "retryable" HTTP queries. Does not apply to "commands" like DDL or to inserts

0.2.6, 2022-09-08

Bug Fixes

Fixed an SQLAlchemy dialect issue with SQLAlchemy 1.4 that would cause problems in the most recent Superset version

0.2.5, 2022-08-30

Bug Fixes

Fixed an issue where DBAPI cursors returned an invalid description object for columns. This would cause 'property' object has no attribute 'startswith' errors for some SqlAlchemy and SuperSet queries.
Fixed an issue where datetime parameters would not be correctly rendered as ClickHouse compatible strings

New Features

The "parameters" object passed to client query methods can now be a sequence instead of a dictionary, for compatibility with query strings that contain simple format unnamed format directives, such as 'SELECT * FROM table WHERE value = %s'

0.2.4, 2022-08-19

Bug Fixes

The wait_end_of_query parameter/setting was incorrectly being stripped. This is fixed

0.2.3, 2022-08-14

Bug Fixes

Fix encoding insert of multibyte characters

New Features

Improve identifier handling/quoting for Clickhouse column, table, and database names
Add client arrow_insert method to directly insert a PyArrow Table insert ClickHouse using Arrow format

0.2.2, 2022-08-06

Bug Fixes

Fix issue when query_limit set to 0

0.2.1, 2022-08-04

Bug Fixes

Fix SQL comment problems in DBAPI cursor

0.2.0, 2022-08-04

New Features

Support (experimental) JSON/Object datatype. ClickHouse Connect will take advantage of the fast orjson library if available. Note that inserts for JSON columns require ClickHouse server version 22.6.1 or later
Standardize read format handling and allow specifying a return data format per column or per query.
Added convenience min_version method to client to see if the server is at least the requested level
Increase default HTTP timeout to 300 seconds to match ClickHouse server default

Bug Fixes

Fixed multiple issues with SQL comments that would cause some queries to fail
Fixed problem with SQLAlchemy literal binds that would cause an error in Superset filters
Fixed issue with parameterized queries
Named Tuples were not supported and would result in throwing an exception. This has been fixed.
The client query_arrow function would return incomplete results if the query result exceeded the ClickHouse max_block_size. This has been fixed. As part of the fix query_arrow method returns a PyArrow Table object. While this is a breaking change in the API it should be easy to work around.

0.1.6, 2022-07-06

New Features

Support Nested data types.

Bug Fixes

Fix issue with native reads of Nullable(LowCardinality) numeric and date types.
Empty inserts will now just log a debug message instead of throwing an IndexError.

FilesExpand file tree

CHANGELOG.md

Latest commit

History

CHANGELOG.md

File metadata and controls

ClickHouse Connect ChangeLog

WARNING -- Python 3.8 EOL

WARNING -- JSON Incompatibility between versions 22.8 and 22.10

WARNING -- Impending Breaking Change - Server Settings in DSN

UNRELEASED

Breaking Changes

Bug Fixes

Development

Improvements

0.15.1, 2026-03-30

Bug Fixes

0.15.0, 2026-03-26

Improvements

Other

Bug Fixes

0.14.1, 2026-03-11

Bug Fixes

0.14.0, 2026-03-09

Breaking Changes

Deprecations

Improvements

0.13.0, 2026-02-26

Improvements

Bug Fixes

0.12.0rc1, 2026-02-11

0.11.0, 2026-02-10

Python 3.9 Deprecation

Bug Fixes

Improvements

0.10.0, 2025-11-14

Bug Fixes

Improvements

0.9.2, 2025-09-25

0.9.1, 2025-09-16

0.9.0, 2025-09-16

Breaking Changes

Major Features

Improvements

Bug Fixes

0.8.18, 2025-06-24

Improvements

Bug Fixes

0.8.17, 2025-04-10

Improvements

Bug Fix

0.8.16, 2025-03-28

Bug Fixes

Improvements

0.8.15, 2025-01-25

Bug Fix

0.8.14, 2025-01-13

Bug Fix

0.8.13, 2025-01-07

Bug Fix

0.8.12, 2025-01-06

Improvement

0.8.11, 2024-12-21

Improvement

Bug Fix

0.8.10, 2024-12-14

Bug Fixes

0.8.9, 2024-12-02

Bug Fix

0.8.8, 2024-11-27

Improvement

0.8.7, 2024-11-21

Improvement

Bug Fix

0.8.6, 2024-11-01

Bug Fixes

0.8.5, 2024-10-24

Bug fix

0.8.4, 2024-10-23

Improvement