V2EX = way to explore
V2EX 是一个关于分享和探索的地方
现在注册
已注册用户请  登录
kyf0722
V2EX  ›  PostgreSQL

pg 的前辈们进来帮看看索引的问题

  •  
  •   kyf0722 · 2019-03-23 21:32:28 +08:00 · 4149 次点击
    这是一个创建于 2070 天前的主题,其中的信息可能已经有所发展或是发生改变。

    我现在有一个数据库, 表 notice 大概 100W row 左右的数据

    表结构如下:

    CREATE TABLE public.notice
    (
        id integer NOT NULL DEFAULT nextval('notice_id_seq'::regclass),
        title character varying(200) COLLATE pg_catalog."default" NOT NULL,
        tags integer[] DEFAULT '{}'::integer[],
        public_time timestamp without time zone,
        content text COLLATE pg_catalog."default",
        type integer NOT NULL,
        site character(20) COLLATE pg_catalog."default",
        url character varying(1024) COLLATE pg_catalog."default",
        area character(10) COLLATE pg_catalog."default",
        CONSTRAINT notice_pkey PRIMARY KEY (id)
    )
    WITH (
        OIDS = FALSE
    )
    TABLESPACE pg_default;
    

    索引如下:

    CREATE INDEX area_type_time_desc
        ON public.notice USING btree
        (area COLLATE pg_catalog."default" bpchar_pattern_ops, type, public_time DESC)
        TABLESPACE pg_default;
    

    如下查询语句:

    explain analyze select id, public_time, title, area from notice where area ~ '^110' and type=31  limit 10
    

    输出:

    "Limit  (cost=0.43..39.49 rows=10 width=122) (actual time=0.930..0.977 rows=10 loops=1)"
    "  ->  Index Scan using area_type_time_desc on notice  (cost=0.43..67845.25 rows=17369 width=122) (actual time=0.928..0.971 rows=10 loops=1)"
    "        Index Cond: ((area ~>=~ '110'::bpchar) AND (area ~<~ '111'::bpchar) AND (type = 31))"
    "        Filter: (area ~ '^110'::text)"
    "Planning Time: 0.532 ms"
    "Execution Time: 1.020 ms"
    

    可以看出索引是生效的!

    当加上排序:

    explain analyze select id, public_time, title, area from notice where area ~ '^110' and type=31 order by public_time desc limit 10
    

    输出:

    "Limit  (cost=51918.94..51918.97 rows=10 width=122) (actual time=214.522..214.530 rows=10 loops=1)"
    "  ->  Sort  (cost=51918.94..51962.36 rows=17369 width=122) (actual time=214.520..214.522 rows=10 loops=1)"
    "        Sort Key: public_time"
    "        Sort Method: top-N heapsort  Memory: 29kB"
    "        ->  Bitmap Heap Scan on notice  (cost=2012.57..51543.60 rows=17369 width=122) (actual time=30.013..205.214 rows=18635 loops=1)"
    "              Recheck Cond: (type = 31)"
    "              Filter: (area ~ '^110'::text)"
    "              Heap Blocks: exact=15625"
    "              ->  Bitmap Index Scan on area_type_time_desc  (cost=0.00..2008.23 rows=17362 width=0) (actual time=23.107..23.107 rows=18635 loops=1)"
    "                    Index Cond: ((area ~>=~ '110'::bpchar) AND (area ~<~ '111'::bpchar) AND (type = 31))"
    "Planning Time: 0.636 ms"
    "Execution Time: 214.591 ms"
    

    排序耗时非常长, 没有用到索引排序

    我尝试只建立一个 public_time desc 的索引, 查询不加任何条件只有一个 order by 同样排序要很耗时

    请前辈们指正一下我哪里用的用问题, 我查了好多文档, 没有找到解决方案, 特来求助

    9 条回复    2019-06-19 16:36:31 +08:00
    zbinlin
        1
    zbinlin  
       2019-03-23 23:05:10 +08:00
    调整下 work_mem 或 maintenance_work_mem,重新建索引试试。
    xiangyuecn
        2
    xiangyuecn  
       2019-03-23 23:38:37 +08:00
    一楼正解,调大排序内存,虽然不懂 pg。Sort Method: top-N heapsort Memory: 29kB,这句翻译过来不知道是不是字面意思,1.8 万数据这么点内存去排序,200 毫秒算是蛮快了。
    reus
        3
    reus  
       2019-03-24 00:06:22 +08:00
    work_mem 太小可能是原因,但 work_mem 小并没有导致排序慢,而是触发了 recheck cond,所以才慢

    Bitmap Index Scan 这一步,如果 work_mem 太小,它不会返回 row id 而是 page id,这样就需要 Bitmap Heap Scan 里 recheck 这一步,因为一个 page 可能有不符合条件的 row。

    排序并不慢,Bitmap Heap Scan 实际跑了 205.214 毫秒,加上排序是 214.520 ,占比不高。
    kyf0722
        4
    kyf0722  
    OP
       2019-03-24 00:13:07 +08:00
    @zbinlin

    pg 默认的 work_mem 是 4M maintenance_work_mem 是 64M
    我修改当前 session 的 work_mem 到 128M, maintenance_work_mem 到 512M
    然后 drop index, 再重新 create index

    同样上面带 order by 的查询语句执行结果
    ```
    "Limit (cost=51977.56..51977.58 rows=10 width=122) (actual time=126.678..126.685 rows=10 loops=1)"
    " -> Sort (cost=51977.56..52021.04 rows=17394 width=122) (actual time=126.676..126.678 rows=10 loops=1)"
    " Sort Key: public_time DESC"
    " Sort Method: top-N heapsort Memory: 29kB"
    " -> Bitmap Heap Scan on notice (cost=2017.69..51601.68 rows=17394 width=122) (actual time=35.036..117.511 rows=18635 loops=1)"
    " Recheck Cond: (type = 31)"
    " Filter: (area ~ '^110'::text)"
    " Heap Blocks: exact=15625"
    " -> Bitmap Index Scan on area_type_public_time_desc (cost=0.00..2013.34 rows=17386 width=0) (actual time=28.123..28.123 rows=18635 loops=1)"
    " Index Cond: ((area ~>=~ '110'::bpchar) AND (area ~<~ '111'::bpchar) AND (type = 31))"
    "Planning Time: 0.615 ms"
    "Execution Time: 126.819 ms"
    ```

    比刚刚确实快一些, 但是排序似乎还是和索引没有关系, 如果我把 order by 的 public_time 替换为 id, 可以在 2-5ms 内查出数据

    @xiangyuecn Sort Method: top-N heapsort Memory: 29kB 中的 29kB 应该是排序使用了 29K 的内存, 我试过如果数据量巨大, 内存不够的时候会显示 disk size, 也就是用到了磁盘
    kyf0722
        5
    kyf0722  
    OP
       2019-03-24 00:21:58 +08:00
    @reus 这位仁兄说的很有道理

    我索性直接把 work_mem 加到了 512M, 但是执行结果更加匪夷所思
    "Limit (cost=118704.89..118704.92 rows=10 width=134) (actual time=688.049..688.057 rows=10 loops=1)"
    " -> Sort (cost=118704.89..118816.86 rows=44786 width=134) (actual time=688.047..688.049 rows=10 loops=1)"
    " Sort Key: public_time DESC"
    " Sort Method: top-N heapsort Memory: 27kB"
    " -> Bitmap Heap Scan on notice (cost=1869.15..117737.08 rows=44786 width=134) (actual time=47.015..663.260 rows=44366 loops=1)"
    " Filter: ((area ~ '^110'::text) AND (type <> 31))"
    " Rows Removed by Filter: 18635"
    " Heap Blocks: exact=39610"
    " -> Bitmap Index Scan on area_type_public_time_desc (cost=0.00..1857.96 rows=62153 width=0) (actual time=28.356..28.356 rows=63001 loops=1)"
    " Index Cond: ((area ~>=~ '110'::bpchar) AND (area ~<~ '111'::bpchar))"
    "Planning Time: 0.578 ms"
    "Execution Time: 688.124 ms"

    无法理解了!!, 多次执行都差不多
    kyf0722
        6
    kyf0722  
    OP
       2019-03-24 00:25:52 +08:00
    @reus 抱歉, 错了, 请忽略上条回复. work_mem 加到 512M 之后, 的执行结果

    "Limit (cost=51977.56..51977.58 rows=10 width=134) (actual time=112.037..112.044 rows=10 loops=1)"
    " -> Sort (cost=51977.56..52021.04 rows=17394 width=134) (actual time=112.034..112.037 rows=10 loops=1)"
    " Sort Key: public_time DESC"
    " Sort Method: top-N heapsort Memory: 29kB"
    " -> Bitmap Heap Scan on notice (cost=2017.69..51601.68 rows=17394 width=134) (actual time=30.136..103.225 rows=18635 loops=1)"
    " Recheck Cond: (type = 31)"
    " Filter: (area ~ '^110'::text)"
    " Heap Blocks: exact=15625"
    " -> Bitmap Index Scan on area_type_public_time_desc (cost=0.00..2013.34 rows=17386 width=0) (actual time=22.173..22.173 rows=18635 loops=1)"
    " Index Cond: ((area ~>=~ '110'::bpchar) AND (area ~<~ '111'::bpchar) AND (type = 31))"
    "Planning Time: 0.571 ms"
    "Execution Time: 112.108 ms"

    还是触发了 Recheck
    reus
        7
    reus  
       2019-03-24 11:47:14 +08:00
    如果存储是 SSD,用 SET random_page_cost = 1; 看有没有改善。bitmap index scan 的目的是减少不必要的 page read,因为贵。但如果随机读 page 的成本低,planner 会倾向于直接读而不是用 bitmap index scan。

    还有个可能的原因是物理存储位置太分散,也会用 bitmap index scan,试下 vacuum full notice
    encro
        8
    encro  
       2019-06-19 16:33:58 +08:00
    这个组合索引顺序有点问题,先 type,再 area,在这个查询里,area 索引应该失效了。
    encro
        9
    encro  
       2019-06-19 16:36:31 +08:00
    area 因为是模糊查询,导致 public_time 也失效了。
    关于   ·   帮助文档   ·   博客   ·   API   ·   FAQ   ·   实用小工具   ·   2792 人在线   最高记录 6679   ·     Select Language
    创意工作者们的社区
    World is powered by solitude
    VERSION: 3.9.8.5 · 26ms · UTC 15:01 · PVG 23:01 · LAX 07:01 · JFK 10:01
    Developed with CodeLauncher
    ♥ Do have faith in what you're doing.