V2EX = way to explore
V2EX 是一个关于分享和探索的地方
现在注册
已注册用户请  登录
jackLoveDota
V2EX  ›  数据库

aws mysql 数据同步到 clickhouse cloud 的问题求助

  •  
  •   jackLoveDota · 44 天前 · 1068 次点击
    这是一个创建于 44 天前的主题,其中的信息可能已经有所发展或是发生改变。

    昨天看有 v 友推荐 seatunnel ,仔细看了一下官网文档,内网的 mysql 同步至 clickhouse 完全正常,但是配置成 clickhouse cloud 的 https 连接地址就连接失败,查看错误日志,一直是使用 http 协议,没办法切换到 https ,clickhouse 的端口是 8443 ,貌似也没办法切换到 https 的协议,请问有哪一项配置可以切换协议吗

    5 条回复    2024-12-07 11:44:19 +08:00
    jackLoveDota
        1
    jackLoveDota  
    OP
       44 天前
    ```
    env {
    execution.parallelism = 8
    job.mode = "STREAMING"
    # 检查点配置
    checkpoint.interval = 60000
    # 支持多次运行
    restart.strategy = "fixed-delay"
    restart.attempts = 3
    }
    source {
    # MySQL 源表配置
    MySQL-CDC {
    result_table_name = "mysql_source"
    server-id = 5400-5408
    hostname = "localhost"
    port = 3306
    username = "root"
    password = "root"
    database-name = "abc"
    # t_test 每天数据量 1000w ,按照天为纬度分表,这里合并到 clickhouse ,聚合成一张表,方便统计,是否合理?
    table-names = ["abc.t_test","abc.t_test_*"]
    base-url = "jdbc:mysql://localhost:3306/abc"
    # 需要全量+增量
    startup.mode = "INITIAL"
    driver = "com.mysql.cj.jdbc.Driver"
    # CDC 配置
    monitor.interval = 1000
    chunk.size = 32768
    exactly-once = true
    # 并行读取配置
    split.size = 50000
    split.even-distribution.factor.lower-bound = 0.05
    split.even-distribution.factor.upper-bound = 0.95
    # 并行读取优化
    connection.pool.size = 8 # 增加连接池大小
    fetch.size = 10000 # 增加每次获取的记录数

    # 启用批量模式
    enable.batch = true
    batch.size = 10000

    # 记录同步位点
    offset.storage = "filesystem"
    offset.storage.path = "/tmp/seatunnel/offset"
    }
    }
    transform {
    Sql {
    query = """
    SELECT
    id,
    CASE
    WHEN statistic_finished = 1 THEN 1
    ELSE 0
    END as statistic_finished,
    ip,
    created_at,
    updated_at
    FROM mysql_source
    """
    }
    }
    sink {
    Clickhouse {
    # ClickHouse Cloud 连接配置
    host = "xxx.clickhouse.cloud:8443"
    database = "local_test"
    table = "t_test"
    username = "default"
    password = "123456"
    # 表操作配置
    primary_key = "id"
    # 针对公网延迟优化的写入配置
    bulk_size = 5000 # 考虑网络延迟,稍微减小批量
    flush_interval = 5000 # 增加刷新间隔
    # 重试配置
    retry_codes = [429, 500, 503]
    max_retries = 5
    retry_interval = 10000
    # 超时配置
    connect_timeout = 60000
    socket_timeout = 300000
    # 连接池配置
    connection_pool {
    max_size = 16
    core_size = 8
    min_evictable_idle_time_millis = 300000
    }
    # 写入优化
    enable_partition = true
    partition_strategy = "balanced"
    # 压缩配置
    compression = true
    compression_type = "gzip"
    # 时区设置
    server_time_zone = "UTC"
    }
    }
    ```
    jackLoveDota
        2
    jackLoveDota  
    OP
       44 天前
    这是脱敏后的配置文件,有那么用过的能帮忙解决下,解决后留地址,打 100u 给大佬买咖啡,谢谢了
    liuymf
        3
    liuymf  
       44 天前
    配置加上 secure = true 试试?
    jackLoveDota
        4
    jackLoveDota  
    OP
       44 天前
    谢谢回答,还是不生效,我看了 connector-clickhouse 的源码,好像就没有配置项可以开启 https
    这是加上 secure=true 的启动结果
    Caused by: com.clickhouse.client.ClickHouseException: Connection reset, server ClickHouseNode [uri=http://xxx.aws.clickhouse.cloud:8443/local_test, options={server_time_zone=UTC}]@-1455942788
    at com.clickhouse.client.ClickHouseException.of(ClickHouseException.java:168)
    at com.clickhouse.client.AbstractClient.lambda$execute$0(AbstractClient.java:291)
    at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1768)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
    at java.base/java.lang.Thread.run(Thread.java:1583)
    Caused by: java.net.SocketException: Connection reset
    at java.base/sun.nio.ch.NioSocketImpl.implRead(NioSocketImpl.java:318)
    at java.base/sun.nio.ch.NioSocketImpl.read(NioSocketImpl.java:346)
    at java.base/sun.nio.ch.NioSocketImpl$1.read(NioSocketImpl.java:796)
    at java.base/java.net.Socket$SocketInputStream.read(Socket.java:1099)
    at java.base/java.io.BufferedInputStream.fill(BufferedInputStream.java:291)
    at java.base/java.io.BufferedInputStream.read1(BufferedInputStream.java:347)
    at java.base/java.io.BufferedInputStream.implRead(BufferedInputStream.java:420)
    at java.base/java.io.BufferedInputStream.read(BufferedInputStream.java:399)
    at java.base/sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:827)
    at java.base/sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:759)
    at java.base/sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:786)
    at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1690)
    at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1599)
    at java.base/java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:531)
    at com.clickhouse.client.http.HttpUrlConnectionImpl.checkResponse(HttpUrlConnectionImpl.java:162)
    at com.clickhouse.client.http.HttpUrlConnectionImpl.post(HttpUrlConnectionImpl.java:289)
    at com.clickhouse.client.http.ClickHouseHttpClient.send(ClickHouseHttpClient.java:123)
    at com.clickhouse.client.AbstractClient.sendAsync(AbstractClient.java:156)
    at com.clickhouse.client.AbstractClient.lambda$execute$0(AbstractClient.java:289)
    whirlp00l
        5
    whirlp00l  
       32 天前
    按照 https://seatunnel.apache.org/docs/2.3.8/connector-v2/sink/Clickhouse/#sink-options
    里面提到的 clickhouse.config ,其支持的是 jdbc client
    https://github.com/ClickHouse/clickhouse-java/blob/main/clickhouse-jdbc/README.md

    根据 clickhouse 官网给出的 jdbc 例子: https://clickhouse.com/docs/en/integrations/java/jdbc-driver#ssl-properties

    理论上应该是
    sink {
    Clickhouse {
    # ClickHouse Cloud 连接配置
    host = "xxx.clickhouse.cloud:8443"
    database = "local_test"
    table = "t_test"
    username = "default"
    password = "123456"
    ....
    clickhouse.config {
    ssl = "true"
    sslmode = "strict"
    sslrootcert = “path_to_root_cert”
    }
    }
    关于   ·   帮助文档   ·   博客   ·   API   ·   FAQ   ·   实用小工具   ·   5500 人在线   最高记录 6679   ·     Select Language
    创意工作者们的社区
    World is powered by solitude
    VERSION: 3.9.8.5 · 23ms · UTC 08:44 · PVG 16:44 · LAX 00:44 · JFK 03:44
    Developed with CodeLauncher
    ♥ Do have faith in what you're doing.