SQLAlchemy: add support for JOIN USING syntax

**Is your feature request related to a problem? Please describe.**

The SQLAlchemy dialect always generates `ON` syntax for joins. There is no way to generate `USING (col)` syntax, which is needed for correct FULL OUTER JOIN results in ClickHouse when joining on same-named columns.

With `ON`, ClickHouse keeps columns from both tables separate. For unmatched rows, the missing side gets default values (0 for integers, empty string for strings) instead of the actual value. With `USING`, ClickHouse merges same-named columns into one, returning the correct value from whichever side has data.

This can leads to silently wrong data when using FULL OUTER JOIN via the SQLAlchemy dialect.

**Describe the solution you'd like**

A way to generate `USING` syntax through the dialect, for example:

```python
stmt = select(sales.c.product_id, sales.c.sold, returns.c.returned).join(
    returns, using=["product_id"], full=True
)
```

Generating:

```sql
SELECT ... FROM test_sales FULL OUTER JOIN test_returns USING (product_id)
```

**Describe alternatives you've considered**

Using `client.raw_query()` with raw SQL instead of the SQLAlchemy dialect.

**Additional context**

Reproduction script (v0.10.0):

```python
import clickhouse_connect
from sqlalchemy import create_engine, MetaData, Table, Column, select
from clickhouse_connect.cc_sqlalchemy.datatypes.sqltypes import UInt32 as ChUInt32

client = clickhouse_connect.get_client(host="localhost", port=8123, username="default", password="")

client.command("DROP TABLE IF EXISTS test_sales")
client.command("DROP TABLE IF EXISTS test_returns")
client.command("CREATE TABLE test_sales (product_id UInt32, sold UInt32) ENGINE = MergeTree() ORDER BY product_id")
client.command("CREATE TABLE test_returns (product_id UInt32, returned UInt32) ENGINE = MergeTree() ORDER BY product_id")
client.command("INSERT INTO test_sales VALUES (1, 10), (2, 20), (3, 30)")
client.command("INSERT INTO test_returns VALUES (2, 5), (3, 10), (4, 15)")

engine = create_engine("clickhousedb://default:@localhost:8123/default")
metadata = MetaData()
sales = Table("test_sales", metadata, Column("product_id", ChUInt32), Column("sold", ChUInt32))
returns = Table("test_returns", metadata, Column("product_id", ChUInt32), Column("returned", ChUInt32))

# SQLAlchemy generates ON - product_id=0 for unmatched row (wrong)
stmt = select(sales.c.product_id, sales.c.sold, returns.c.returned).join(
    returns, sales.c.product_id == returns.c.product_id, full=True
)
print(f"SQL: {stmt.compile(engine)}")
# FULL OUTER JOIN ... ON `test_sales`.`product_id` = `test_returns`.`product_id`
with engine.connect() as conn:
    print(conn.execute(stmt).fetchall())
# [(1, 10, 0), (2, 20, 5), (3, 30, 10), (0, 0, 15)]  <-- last row: product_id=0 (wrong)

# Raw SQL with USING - product_id=4 for unmatched row (correct)
result = client.query("SELECT product_id, sold, returned FROM test_sales FULL OUTER JOIN test_returns USING (product_id)")
print(result.result_rows)
# [(1, 10, 0), (2, 20, 5), (3, 30, 10), (4, 0, 15)]  <-- last row: product_id=4 (correct)

client.command("DROP TABLE test_sales")
client.command("DROP TABLE test_returns")
```


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

SQLAlchemy: add support for JOIN USING syntax #636

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

SQLAlchemy: add support for JOIN USING syntax #636

Description

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions