Fix memory leak in AsyncClient #457

Merged
genzgd merged 5 commits into ClickHouse:main from pufit:main
Jan 22, 2025

Conversation

@pufit
Member

@pufit pufit commented Jan 21, 2025

Summary

It's known that ThreadPoolExecutor doesn't release its memory until .shutdown() is called. Python's GC doesn't stop running thread pools, so code like

while True:
    client = await get_async_client(...)
    ...
    client.close()

leaks memory, since each new AsyncClient instance creates a new executor.
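
As a rough illustration of the effect (not the actual clickhouse-connect code), the sketch below uses a hypothetical `LeakyClient` that owns a `ThreadPoolExecutor` and whose `close()` never shuts it down. The class name, the `demo-leak` thread prefix, and the `live_workers` helper are assumptions made for the example. The clients are kept referenced so the result is deterministic: the idle worker threads stay alive until `shutdown(wait=True)` joins them.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

PREFIX = "demo-leak"  # hypothetical thread-name prefix, used only for counting


def live_workers() -> int:
    """Count live worker threads started by the demo executors."""
    return sum(1 for t in threading.enumerate() if t.name.startswith(PREFIX))


class LeakyClient:
    """Hypothetical stand-in for a client that owns its own executor."""

    def __init__(self) -> None:
        self.executor = ThreadPoolExecutor(max_workers=1, thread_name_prefix=PREFIX)
        # Run one task so the pool actually starts its worker thread.
        self.executor.submit(lambda: None).result()

    def close(self) -> None:
        # Deliberately does NOT call self.executor.shutdown().
        pass


clients = [LeakyClient() for _ in range(3)]
for client in clients:
    client.close()

# Each client still has an idle worker thread waiting for work.
before_shutdown = live_workers()

for client in clients:
    client.executor.shutdown(wait=True)  # signals the workers and joins them

after_shutdown = live_workers()
print(before_shutdown, after_shutdown)
```

Calling `shutdown` explicitly is what brings the worker count back down, which is why the fix ties it to closing the client.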

Closes #424

Checklist

Delete items not relevant to your PR:

  • Unit and integration tests covering the common scenarios were added
  • A human-readable description of the changes was provided to include in CHANGELOG
  • For significant changes, documentation in https://github.com/ClickHouse/clickhouse-docs was updated with further explanations or tutorials

Collaborator

@genzgd genzgd left a comment

Thanks for digging into this! My only question is whether this can be simpler by just shutting down the executor synchronously. It looks like you added the __aenter__ and __aexit__ methods to ensure this autocloses when using it as an AsyncContext?

        return self.client.min_version(version_str)

-    def close(self):
+    async def close(self):
Collaborator

Does this have to be async? Can we just call the executor.shutdown method synchronously?

@pufit
Member Author

pufit commented Jan 22, 2025

> Thanks for digging into this! My only question is whether this can be simpler by just shutting down the executor synchronously. It looks like you added the __aenter__ and __aexit__ methods to ensure this autocloses when using it as an AsyncContext?

This way is more Pythonic. Usually, you open and close clients inside the event loop. If you wait for the executor to shut down synchronously, it will block the loop. If you close the thread pool without wait=True, you can't guarantee a graceful shutdown of your application.

Here is an example of how it is usually done in other libraries: https://github.com/aio-libs/aiohttp/blob/master/examples/curl.py#L11
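
A minimal sketch of that pattern, assuming a hypothetical `DemoAsyncClient` (this is not the clickhouse-connect `AsyncClient`, and the `query` method is invented for the example): `close()` is a coroutine that runs the blocking `shutdown(wait=True)` in a helper thread via `asyncio.to_thread` (Python 3.9+), so the event loop stays responsive, and `__aenter__`/`__aexit__` make the client close itself when used with `async with`.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


class DemoAsyncClient:
    """Hypothetical client that runs blocking work in its own thread pool."""

    def __init__(self) -> None:
        self.executor = ThreadPoolExecutor(max_workers=2)

    async def query(self, value: int) -> int:
        # Stand-in for a blocking call, offloaded to the client's executor.
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(self.executor, lambda: value * 2)

    async def close(self) -> None:
        # shutdown(wait=True) blocks until the workers exit, so run it in
        # another thread instead of directly on the event loop.
        await asyncio.to_thread(self.executor.shutdown, wait=True)

    async def __aenter__(self) -> "DemoAsyncClient":
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        await self.close()


async def main():
    async with DemoAsyncClient() as client:
        result = await client.query(21)
    # The executor has been shut down by __aexit__ at this point.
    return result, client


result, client = asyncio.run(main())
print(result)
```

Without the context manager, the same cleanup would be an explicit `await client.close()`, ideally in a `finally` block.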

@genzgd genzgd merged commit eec6d2a into ClickHouse:main Jan 22, 2025
jovezhong pushed a commit to timeplus-io/timeplus-connect that referenced this pull request Apr 25, 2025
* Fix memory leak in AsyncClient

* Graceful shutdown

* fix test

* fix tests

* fix tests
jovezhong pushed a commit to timeplus-io/timeplus-connect that referenced this pull request Apr 28, 2025
* Gg/update test jwt (ClickHouse#458)

* update test jwt, ensure query params are final

* tweak test job

* Assume CI "secret" is full JWT

* Fix lint

* Fix memory leak in AsyncClient (ClickHouse#457)

* Fix memory leak in AsyncClient

* Graceful shutdown

* fix test

* fix tests

* fix tests

* Fix lint (ClickHouse#459)

* Exclude 3.8 Aarch64 builds (ClickHouse#460)

* Fix lint

* Exclude pypy 38 build

* Exclude all Python 3.8 builds

* Update changelog re Python 3.8 aarch64 wheels

* Gg/update test matrix (ClickHouse#464)

* Update some tests

* Fix lint

* Skip JSON buggy test

* Fix CI tests with default user (ClickHouse#465)

* Replace removal of ; in the loop line with rstrip (ClickHouse#472)

* Docker test fixes (ClickHouse#473)

* Don't prevent settings that don't change the value

* Add docker related config file

* Fix typo

* Update README.md doc link (ClickHouse#476)

* Gg/update test jwt (ClickHouse#458)

* update test jwt, ensure query params are final

* tweak test job

* Assume CI "secret" is full JWT

* Fix lint

* Fix memory leak in AsyncClient (ClickHouse#457)

* Fix memory leak in AsyncClient

* Graceful shutdown

* fix test

* fix tests

* fix tests

* Fix lint (ClickHouse#459)

* Correct typing of create_client(host, username) (ClickHouse#482)

The parameters `host` and `username` of `create_client` actually do
accept None values, as demonstrated by their default values being `None`
and the docstring explaining default behavior when not-set.

Correcting these types (by marking as Optional) allows users using `dsn`
or default behavior to not see type-checking errors.

* Release 0 8 16 (ClickHouse#485)

* Check for optional libraries in client methods

* Log unexpected http next chunk unexpected

* Log unexpected http next chunk unexpected

* Updates for 0.8.16 release

* Exclude 3.8 Aarch64 builds (ClickHouse#460)

* Fix lint

* Exclude pypy 38 build

* Exclude all Python 3.8 builds

* Update changelog re Python 3.8 aarch64 wheels

* Gg/update test matrix (ClickHouse#464)

* Update some tests

* Fix lint

* Skip JSON buggy test

* Updates for 0.8.17 release (ClickHouse#488)

* Updates for 0.8.17 release

* Update test matrix

* Try to punt on SSL issues

* Update TLS test certificates

* Fix CI tests with default user (ClickHouse#465)

* Add param extra_http_headers to query/command methods (ClickHouse#489)

* Add param extra_http_headers to query/command methods

* add test, fix dict copy

---------

Co-authored-by: Geoff Genz <geoff@clickhouse.com>

* Change http_headers to transport settings, add transport settings to async client and insert methods (ClickHouse#490)

* wrap sql with text() (ClickHouse#491)

* Replace removal of ; in the loop line with rstrip (ClickHouse#472)

* Docker test fixes (ClickHouse#473)

* Don't prevent settings that don't change the value

* Add docker related config file

* Fix typo

* Update test_dynamic.py for variant and json data types

* Update httpclient.py

update comments

* Update dialect.py

* Update client.py

* Update test_jwt_auth.py

* bring back dbapi, otherwise test fails

* diable test_transport_settings in test_client.py

* fix the JSON->json data type name issue

---------

Co-authored-by: Geoff Genz <geoff@clickhouse.com>
Co-authored-by: pufit <pufit@yandex.ru>
Co-authored-by: Sviatoslav Bobryshev <61021258+sbobryshev@users.noreply.github.com>
Co-authored-by: Avery Fischer (biggerfisch) <avery@averyjfischer.com>
Co-authored-by: Paweł Szczur <orian@users.noreply.github.com>
Co-authored-by: lakako <46197434+lakako@users.noreply.github.com>

Development

Successfully merging this pull request may close these issues.

Memory leaks AsyncClient

2 participants