Clickhouse Thinking
列存储 https://clickhouse.com/docs/knowledgebase/key-value ch的列存储,每一列是独立的文件存储,select * from test where k = 1 limit 10的查询,如果k是order by的列,那么查找会很快速的定位到文件,然后文件内查找定位到行,然后就得到所在文件的行数,之后再去找其他所有列对应的文件,然后读取对应行的数据,组合起来,才能得到这一行的数据。 ...
列存储 https://clickhouse.com/docs/knowledgebase/key-value ch的列存储,每一列是独立的文件存储,select * from test where k = 1 limit 10的查询,如果k是order by的列,那么查找会很快速的定位到文件,然后文件内查找定位到行,然后就得到所在文件的行数,之后再去找其他所有列对应的文件,然后读取对应行的数据,组合起来,才能得到这一行的数据。 ...
确认projection生效: https://clickhouse.com/docs/knowledgebase/projection_example ReadFromPreparedSource (for_column2) 可以看出是走了projection clickhouse-23.12 :) explain indexes = 1 SELECT count() from table1_projections where column2 > 999 and column1 > 1; ┌─explain─────────────────────────────────────┐ │ Expression ((Projection + Before ORDER BY)) │ │ Aggregating │ │ Filter │ │ ReadFromPreparedSource (for_column2) │ └─────────────────────────────────────────────┘ Using projections to speedup queries in Clickhouse This is important as this means we have to add all columns to projection that are used in the query. This is different from row-based databases indexes, which referrence entire rows. But columnar database doesn’t even know what row is, that’s why projection != index. Projection is just some part of the original table data stored in a different form, which is better suitable for certain queries. ...
小朋友快要3岁了,有时候会说一些很有意思的话,这里做一些记录。 我好烦 //闹钟响了 Baby: 什么声音? Mom: 闹钟的声音 Baby:快把闹钟关掉,我好烦 //妈妈去关闹钟 无中生有 去取快递时 Baby: 是不是我的消防车到了 之前并没有答应过他要买消防车 拆快递时 Baby:这应该是我的消防车(很期待的表情) ...
Query cache https://clickhouse.com/docs/en/operations/query-cache https://clickhouse.com/blog/introduction-to-the-clickhouse-query-cache-and-design v23.1 开始支持Query cache, 即可以缓存查询结果,可以选择针对会话启用。 设置项: query_cache_min_query_runs 查询运行过多少次才会启用缓存 query_cache_min_query_duration 查询运行时间超过多少才会缓存 query_cache_ttl cache 过期时间 query_cache_max_entries 往缓存表中存储的缓存结果条数 use_query_cache 此查询是否使用缓存 enable_reads_from_query_cache 此查询是否先从缓存中读 enable_writes_to_query_cache 此查询的结果是否写入缓存 Materialized View 物化视图,看这篇 ...
Deduplication Strategies 笔记 去重操作,CH不是在插入阶段去重,而是插入后,速度会很快,影响: 任何时间,数据中可能会有重复数据 分区合并时,才会去重 查询语句需要适配重复数据 几种合并: ReplacingMergeTree: 相同排序键的会被合并 CollapsingMergeTree: 有一个字段标识取消,可以声明取消前面的数据,然后再插入新数据,多线程写的情况不适用。 VersionedCollapsingMergeTree: 带有版本号的去重,可以 Q: 为什么数据没有被去重? ...
《Everything you should know about materialized views.》 阅读笔记 原理 materialized views(以下简称MV)在insert的时候触发,并不会读源表,往一个表insert后,MV会从insert的数据中进行创建。 因此,源表甚至可以是个NULL Engine,即实际不写入数据,仅用来在inser时触发MV的动作。 ...
Hello 2024, goodby 2023. In 2024, I wish I’ll write more blog posts, at least one pre week. Last year, I finally got a basement to do some woodworking, and recently I bought a some wood and a lot of woodworking tools, every weekend I will go there and do some woodworking, I only finished a horse bench, three shelfs, wish I will finish one or two tables in 2024.
I got a strange problem today, a URL is supposed to return something in it’s body, but when use curl to get that url, it outputs nothing. curl 'http://localhost:9527/static/a.js' At first, I think it might be server problems, but when using wireshark to see the traffic, there is nothing wrong, http header followed by \r\n then http response body. I have also noticed that this happens when the response body is short. Testing with browser or python-requests is fine. So it’s not the server error, could it be curl’s problem? Let’s strace it. >> strace -vv curl 'http://localhost:9527/static/a.js' ...... recvfrom(5, "HTTP/1.1 200 OK\r\nServer: YNM3K-9"..., 102400, 0, NULL, NULL) = 443 newfstatat(1, "", {st_dev=makedev(0, 0x1d), st_ino=6, st_mode=S_IFCHR|0620, st_nlink=1, st_uid=1000, st_gid=5, st_blksize=1024, st_blocks=0, st_rdev=makedev(0x88, 0x3), st_atime=1703671640 /* 2023-12-27T18:07:20.893334033+0800 */, st_atime_nsec=893334033, st_mtime=1703671640 /* 2023-12-27T18:07:20.893334033+0800 */, st_mtime_nsec=893334033, st_ctime=1703662393 /* 2023-12-27T15:33:13.893334033+0800 */, st_ctime_nsec=893334033}, AT_EMPTY_PATH) = 0 brk(0x560907e02000) = 0x560907e02000 rt_sigaction(SIGPIPE, {sa_handler=SIG_IGN, sa_mask=[PIPE], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7f68b677f710}, NULL, 8) = 0 rt_sigaction(SIGPIPE, {sa_handler=SIG_IGN, sa_mask=[PIPE], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7f68b677f710}, NULL, 8) = 0 write(1, "a.js", 4a.js) = 4 rt_sigaction(SIGPIPE, NULL, {sa_handler=SIG_IGN, sa_mask=[PIPE], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7f68b677f710}, 8) = 0 ...... From strace output, I can see it write a.js to fd 1, which is stdout, so it get the response body, and write to stdout, maybe it’s the terminal? ...
I poweroff my laptop every night. So I have to open some app I use everyday when I started the laptop. I use a script by Justin Hileman, to open news tabs in terminal. Save this script as init.sh, run sh init.sh in the terminal, and it will do several things: run wg.sh, setup wireguard tunnal open new tab, start a tmux session open new tab, using mosh to ssh a server open new tab, open a txt file using vim open github in chrome start vs code show a popup message #!/bin/bash # # Open new Terminal tabs from the command line # # Author: Justin Hileman (http://justinhileman.com) # # Installation: # Add the following function to your `.bashrc` or `.bash_profile`, # or save it somewhere (e.g. `~/.tab.bash`) and source it in `.bashrc` # # Usage: # tab Opens the current directory in a new tab # tab [PATH] Open PATH in a new tab # tab [CMD] Open a new tab and execute CMD # tab [PATH] [CMD] ... You can prob'ly guess function tab () { local cmd="" local cdto="$PWD" local args="$@" if [ -d "$1" ]; then cdto=`cd "$1"; pwd` args="${@:2}" fi if [ -n "$args" ]; then cmd="; $args" fi osascript &>/dev/null <<EOF tell application "iTerm" tell current window set newTab to (create tab with default profile) tell newTab tell current session write text "cd \"$cdto\"$cmd" end tell end tell end tell end tell EOF } function alert(){ title="$1" msg="$2" osascript &>/dev/null <<EOF display notification "$msg" with title "$title" EOF } ~/wg.sh || exit tab "tmux new -s x" tab "mosh root@xxxx" tab "vi notes.txt" open -a "Google Chrome" https://github.com/ code alert "have a nice day"
最近开始用一台旧的XPS 13笔记本,之前被自动升级到了win 11,于是就开始了windows的体验,目前可以使用WSL2,里面安装了ubuntu 22.04, 然后发现了另外一个比较有意思的软件,就是winget, 这个之前一直没有用过,这次就体验了一下。 winget是一个windows的命令行的软件管理器,运行winget update, 它会自动探测系统中已安装的软件版本,然后和数据库中对比,之后就可以选择性的升级。命令行安装或者升级软件还是很方便的,除了部分软件会弹出一个权限确认的窗口外,基本不需要人工交互。这个解决了windows没有软件库的问题,也解决了升级问题。不是所有软件都自身带有自动升级和新版本检测的,比如kicad,不用winget的话,想更新就要自己去官网检查、下载、安装才可以。 ...