MonetDB multiplex-funnel

2024-03-11 03:59
文章标签 monetdb multiplex funnel

本文主要是介绍MonetDB multiplex-funnel,希望对大家解决编程问题提供一定的参考价值,需要的开发者们随着小编来一起学习吧!

本文介绍一下MonetDB的漏斗功能. 此功能也是需要基于MonetDB的discovery和remote database.

m-funnel这个功能有点像plproxy, 多了一个queue/funnel的限制, 并且没有plproxy 的路由算法, 一个纯粹的代理.
plproxy接受客户端请求, 客户端的请求是并行提交给数据节点.
而MonetDB multiplex-funnel则是并行的接收客户端请求, 但是请求最终是一个一个提交给后端的所有数据节点的. 
返回结果的话则是所有数据节点并行执行的结果返回, 如有4个后端monetdbd, 那么count(*)返回4条记录.
另外需要注意的是m-funnel还限制了单条SQL返回的结果集的大小, 默认是80K.
注意使用m-funnel会打乱事务特性, 例如一个客户端的事务请求, 事务的中间可能被其他客户端的SQL插入.
MonetDB multiplex-funnel - 德哥@Digoal - PostgreSQL research

MonetDB的multiplex-funnel举例 : 
使用monetdb创建m-funnel数据库
       -m patternWith the -m flag, instead of creating a database, a multiplex-funnel is created.  See section MULTIPLEX-FUNNEL in monetdbd(1).  The pattern argument is not fully the same as a pattern for connecting  or  dis-covery.   Each  parallel target for the multiplex-funnel is given as username+password@pattern sequence,separated by commas.  Here the pattern is an ordinary pattern as would  be  used  for  connecting  to  adatabase, and can hence also be just the name of a database.

注意pattern的格式,    username+password@pattern, 这里的pattern是数据库, 所以数据库需要在同一个广播域. 并且配置好tag(假设有重名的情况下)
MonetDB multiplex-funnel - 德哥@Digoal - PostgreSQL research
这里dbx是边界, 主机属于两个广播域, 所以forward必须=proxy, 因为从db0的主机无法直接连接dbx1和dbx2. 必须通过dbx所在的monetdbd进行代理.

使用举例 : 
配置hostname
150.sky-mobi.com
db-172-16-3-221

配置/etc/sysconfig/network
HOSTNAME=db-172-16-3-221

在所有monetdb节点配置DNS或/etc/hosts, 使解析正常.
本例配置/etc/hosts
172.16.3.221 db-172-16-3-221
172.16.3.150 150.sky-mobi.com

配置防火墙, 让monetdbd之间可以相互通讯
iptables -t filter -I INPUT -s 172.16.0.0/16 -j ACCEPT

创建monetdbd storage
monetdbd create /data01/mdb1

配置discovery
monetdbd set discovery=yes /data01/mdb1

可选, 配置forward = proxy (如果这是个边界m-funnel库的话, 默认就是proxy)
monetdbd set forward=proxy /data01/mdb1

启动monetdbd
monetdbd start /data01/mdb1

创建数据库, 解锁数据库
[root@150 postgresql-9.3.5]# monetdb create db1
created database in maintenance mode: db1
[root@150 postgresql-9.3.5]# monetdb create db2
created database in maintenance mode: db2
[root@150 postgresql-9.3.5]# monetdb release db1
taken database out of maintenance mode: db1
[root@150 postgresql-9.3.5]# monetdb release db2
taken database out of maintenance mode: db2[root@db-172-16-3-221 ~]# monetdb create dbx1
created database in maintenance mode: dbx1
[root@db-172-16-3-221 ~]# monetdb create dbx2
created database in maintenance mode: dbx2
[root@db-172-16-3-221 ~]# monetdb release dbx1
taken database out of maintenance mode: dbx1
[root@db-172-16-3-221 ~]# monetdb release dbx2
taken database out of maintenance mode: dbx2

配置shared, tag(可选, 仅仅当数据库在广播域有重名是需要使用)
[root@150 postgresql-9.3.5]# monetdb set shared=150 db1
[root@150 postgresql-9.3.5]# monetdb set shared=150 db2
[root@db-172-16-3-221 ~]# monetdb set shared=221 dbx1
[root@db-172-16-3-221 ~]# monetdb set shared=221 dbx2


发现
[root@db-172-16-3-221 ~]# monetdb discoverlocation
mapi:monetdb://150.sky-mobi.com:50000/db1/150
mapi:monetdb://150.sky-mobi.com:50000/db2/150
mapi:monetdb://db-172-16-3-221:50000/dbx1/221
mapi:monetdb://db-172-16-3-221:50000/dbx2/221[root@150 mdb1]# monetdb discoverlocation
mapi:monetdb://150.sky-mobi.com:50000/db1/150
mapi:monetdb://150.sky-mobi.com:50000/db2/150
mapi:monetdb://db-172-16-3-221:50000/dbx1/221
mapi:monetdb://db-172-16-3-221:50000/dbx2/221


创建第一个m-funnel, 这里假设dbx这个m-funnel是边界funnel.
首先测试一下库名是否正常.
[root@db-172-16-3-221 ~]# mclient -h 127.0.0.1 dbx1/221
Welcome to mclient, the MonetDB/SQL interactive terminal (Jan2014-SP3)
Database: MonetDB v11.17.21 (Jan2014-SP3), 'mapi:monetdb://db-172-16-3-221:50000/dbx1'
Type \q to quit, \? for a list of available commands
auto commit mode: on
sql>\q
[root@db-172-16-3-221 ~]# mclient -h 127.0.0.1 dbx2/221
Welcome to mclient, the MonetDB/SQL interactive terminal (Jan2014-SP3)
Database: MonetDB v11.17.21 (Jan2014-SP3), 'mapi:monetdb://db-172-16-3-221:50000/dbx2'
Type \q to quit, \? for a list of available commands
auto commit mode: on

使用这两个库名创建m-funnel, 注意用户密码我没改, 用默认的
[root@db-172-16-3-221 ~]# monetdb create -m monetdb+monetdb@dbx1/221,monetdb+monetdb@dbx2/221 dbx
created multiplex-funnel in maintenance mode: dbx
[root@db-172-16-3-221 ~]# monetdb release dbx
taken database out of maintenance mode: dbx

查看状态
[root@db-172-16-3-221 ~]# monetdb get all dbxname          prop     source           value
dbx              name      -        dbx
dbx              type      local    mfunnel
dbx              shared    default  yes
dbx              mfunnel   local    monetdb+monetdb@dbx1/221,monetdb+monetdb@dbx2/221

测试第一个m-funnel是否正常, 可以正常的像两个底层数据库发出SQL.
[root@db-172-16-3-221 ~]# mclient -h 127.0.0.1 dbx
Welcome to mclient, the MonetDB/SQL interactive terminal (Jan2014-SP3)
Database: MonetDB v11.17.21 (Jan2014-SP3), 'mapi:monetdb://db-172-16-3-221:50000/dbx2'
Type \q to quit, \? for a list of available commands
auto commit mode: on
sql>create table abc(id int);
operation successful (22.611ms)
sql>insert into abc values(1);
2 affected row (7.259ms)
sql>select * from abc;
+------+
| id   |
+======+
|    1 |
|    1 |
+------+
2 tuples (1.854ms)
sql>insert into abc values(2);
2 affected row (5.553ms)
sql>select * from abc;
+------+
| id   |
+======+
|    1 |
|    2 |
|    1 |
|    2 |
+------+
4 tuples (1.014ms)

已经正常了, 插入和查询都从两个节点正常返回并合并结果.
配置第一个m-funnel的tag(可选, 仅仅当广播域中这个m-funnel的名字有重名才需要)
[root@db-172-16-3-221 ~]# monetdb set shared=221 dbx
[root@db-172-16-3-221 ~]# monetdb get all dbxname          prop     source           value
dbx              name      -        dbx
dbx              type      local    mfunnel
dbx              shared    local    221
dbx              mfunnel   local    monetdb+monetdb@dbx1/221,monetdb+monetdb@dbx2/221


创建第二个m-funnel : 
先执行discover得到URI, 从而得知database name.(含shared设置的tag)
[root@150 postgresql-9.3.5]# monetdb discoverlocation
mapi:monetdb://150.sky-mobi.com:50000/db1/150
mapi:monetdb://150.sky-mobi.com:50000/db2/150
mapi:monetdb://db-172-16-3-221:50000/dbx/221
mapi:monetdb://db-172-16-3-221:50000/dbx1/221
mapi:monetdb://db-172-16-3-221:50000/dbx2/221

创建m-funnel,  注意用户密码我没改, 用默认的
这个m-funnel包含3个库, 其中2个是normal 库, 另一个是前面创建的funnel, 所以实际上就是4个normal库, 但是funnel里看到的是3个库.
[root@150 postgresql-9.3.5]# monetdb create -m monetdb+monetdb@db1/150,monetdb+monetdb@db2/150,monetdb+monetdb@dbx/221 db0
created multiplex-funnel in maintenance mode: db0
[root@150 postgresql-9.3.5]# monetdb release db0
taken database out of maintenance mode: db0

查看m-funnel的状态
[root@150 postgresql-9.3.5]# monetdb get all db0name          prop     source           value
db0              name      -        db0
db0              type      local    mfunnel
db0              shared    default  yes
db0              mfunnel   local    monetdb+monetdb@db1/150,monetdb+monetdb@db2/150,monetdb+monetdb@dbx/221

测试 :
[root@150 postgresql-9.3.5]# mclient db0
Welcome to mclient, the MonetDB/SQL interactive terminal (Jan2014-SP3)
Database: MonetDB v11.17.21 (Jan2014-SP3), 'mapi:monetdb://db-172-16-3-221:50000/dbx2'
Type \q to quit, \? for a list of available commands
auto commit mode: on
sql>create table abc(id int);  -- 前面测试的时候创建了abc表, 所以这里创建会报错.
node dbx/221 failed: node dbx1/221 failed: 42S01!CREATE TABLE: name 'abc' already in use

-- 特别注意, 虽然报错了, 但是实际上只是2个节点报错, 其他2个节点创建成功了.
-- 这个测试用了auto commit, 所以导致部分提交成功, 部分提交失败了.
-- 要全局一致的话, 需要用start transaction
-- 但是用start transaction的话, 会扰乱, 因为并行的事情. 后面有测试
sql>select * from abc;
+------+
| id   |
+======+
|    1 |
|    2 |
|    1 |
|    2 |
+------+
4 tuples (1.607ms)
sql>insert into abc values (3);  -- 从这个插入我们就能知道, 上面建表的语句在2个节点执行成果了.
4 affected row (5.192ms)
sql>select * from abc;
+------+
| id   |
+======+
|    3 |
|    3 |
|    1 |
|    2 |
|    3 |
|    1 |
|    2 |
|    3 |
+------+
8 tuples (1.459ms)


假设我们用了事务会怎么样呢?
可能出现部分提交失败, 部分提交成功或者全部提交成功的情况, 回滚的话则全部节点回滚.
所以鉴于此, 如果报错了, 就立刻回滚, 那么是全部回滚的.
sql>create table abc(id int);
node dbx/221 failed: node dbx1/221 failed: 42S01!CREATE TABLE: name 'abc' already in use
sql>select * from abc;
node dbx/221 failed: node dbx1/221 failed: 25005!current transaction is aborted (please ROLLBACK)
sql>commit;  -- 此时commit, 部分节点提交成功, 部分节点提交失败.
node dbx/221 failed: node dbx1/221 failed: 2D000!COMMIT: failed
sql>rollback;  -- 事务已经结束了, 所以不能再回滚.
node db1/150 failed: 2DM30!ROLLBACK: not allowed in auto commit mode
sql>select * from abc;
+------+
| id   |
+======+
|    1 |
|    2 |
|    1 |
|    2 |
+------+
4 tuples (1.676ms)
sql>insert into abc values (3);  -- 因为部分提交成功, 所以DB1和DB2创建abc表完成.
4 affected row (5.538ms)
sql>select * from abc;
+------+
| id   |
+======+
|    3 |
|    3 |
|    1 |
|    2 |
|    3 |
|    1 |
|    2 |
|    3 |
+------+
8 tuples (1.395ms)


全部回滚的测试 : 
sql>create table abc(id int);
node dbx/221 failed: node dbx1/221 failed: 42S01!CREATE TABLE: name 'abc' already in use
sql>rollback;
auto commit mode: on
sql>\dt
TABLE  sys.abc
TABLE  sys.abc

部分提交成功和失败的测试 : 
sql>create table abc(id int);
node dbx/221 failed: node dbx1/221 failed: 42S01!CREATE TABLE: name 'abc' already in use
sql>\dt
TABLE  sys.abc
TABLE  sys.abc
TABLE  sys.abc
TABLE  sys.abc

全部提交成功的测试 : 
sql>insert into abc values (10);
4 affected row (5.611ms)
sql>select * from abc where id=10;
+------+
| id   |
+======+
|   10 |
|   10 |
|   10 |
|   10 |
+------+
4 tuples (1.906ms)

如果需要全局事务一致的话, 必须使用start trasaction;

使用start transaction带来的困扰, 原因就是漏斗效应, 因为并行转换成了串行来执行, 所以中间会被其他SQL插入, 测试如下.
会话A
sql>start transaction;
auto commit mode: off会话B
sql>start transaction;  -- 因为在m-funnel 中已经启动了事务, 所以再次启动失败
node db1/150 failed: 25001!START TRANSACTION: cannot start a transaction within a transaction
sql>select * from abc;
node db1/150 failed: 25005!current transaction is aborted (please ROLLBACK)会话A
sql>select * from abc;  -- 回退
node db1/150 failed: 25005!current transaction is aborted (please ROLLBACK)-- 任何一个会话执行了rollback都可以, 就结束这个事务了.


[参考]
1. man monetdbd
MULTIPLEX-FUNNELSMonetdbd  implements multiplex-funnel capabilities.  As the name suggests two techniques are combined, the mul-tiplexer and the funnel.The funnel capability limits the access to the database to one client at a time.  That is, if multiple  clientsconnect  to  the  funnel, their queries will be serialized such that they are executed one after the other.  Aneffect of this approach is that clients no longer have an exclusive channel to the database, which  means  thatindividual queries from one client may have been interleaved with queries from others.  This most notably makesSQL transaction blocks unreliable with a funnel.  The funnel, hence, is meant to scale down a large  amount  ofclients that perform short-running (read-only) queries, as typically seen in web-based query loads.When a funnel is defined to use multiple databases, the funnel adds a multiplexer to its query channel.  A mul-tiplex-funnel sends each query to all of the defined databases.  This behavior can be quite confusing at first,but  proves  to be useful in typical sharding configurations, where in particular simple selection queries haveto be performed on each of the shards.  The multiplexer combines the answers from all defined databases in  onesingle  answer  that it sends back to the client.  However, this combining is without any smart logic, that is,the multiplexer does not evaluate the query it is running, but just combines all answers it receives  from  thedatabases.   This  results  in  e.g.  as many return tuples for a SELECT COUNT(*) query, as there are databasesdefined.Due to the two above mentioned characteristics, a multiplex-funnel has some limitations.  As mentioned  before,transactions over multiple queries are likely not to result in the desired behavior.  This is due to each queryto the funnel is required to be self-contained.  Further, since for  each  query,  the  results  from  multipleservers  have  to  be  combined  into one, that query must only return a single response, i.e.  multi-statementqueries are most likely causing the funnel to respond with an error, or return garbled results.  Last, the sizeof  each query is limited to currently about 80K.  While this size should be sufficient for most queries, it islikely not enough for e.g. COPY INTO statements.  Apart from the data transfer  implications,  such  statementsshould not be used with the funnel, as the results will be undefined due to the limited query buffer.  Applica-tions using the funnel should aim for short and single-statement queries that require no transactions.See the create command in the monetdb(1) man-page for details on how to setup a multiplex-funnel.

这篇关于MonetDB multiplex-funnel的文章就介绍到这儿,希望我们推荐的文章对编程师们有所帮助!



http://www.chinasem.cn/article/796568

相关文章

【单片机】An Introduction to the Digital Multiplex Protocol

https://www.electroschematics.com/an-introduction-to-the-digital-multiplex-protocol/ The Digital Multiplex (DMX) protocol is everywhere in lighting, so if you’re fiddling with lighting, it’s not goin

【论文阅读】DMGI:Unsupervised Attributed Multiplex Network Embedding

无监督属性化多路复用网络嵌入 摘要1 引言2 问题陈述3 Unsupervised Attributed Multiplex Network Embedding3.1 Deep Multiplex Graph Infomax: DMGI 4 实验 摘要 多路复用网络中的节点由多种类型的关系连接。然而,大多数现有的网络嵌入方法都假设节点之间只存在一种单一的关系。即使对于考虑网络多重

【论文阅读】Multiplex Graph Neural Network for Extractive Text Summarization

【论文阅读】Multiplex Graph Neural Network for Extractive Text Summarization 用于提取文本总结的多路复用图神经网络 Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing 摘要 提取文本摘要旨在从给定文档中提取最

Python IO multiplex

python--第十天总结(IO多路复用) 服务器端编程经常需要构造高性能的IO模型,常见的IO模型有四种: (1)同步阻塞IO(Blocking IO):即传统的IO模型。 (2)同步非阻塞IO(Non-blocking IO):默认创建的socket都是阻塞的,非阻塞IO要求socket被设置为NONBLOCK。注意这里所说的NIO并非Java的NIO(New IO)库。

【论文阅读 | AAAI2020】Unsupervised Attributed Multiplex Network Embedding

论文题目:Unsupervised Attributed Multiplex Network Embedding 论文地址:https://arxiv.org/abs/1911.06750 代码地址:https://github.com/pcy1302/DMGI 引用方式:@article{park2019unsupervised,title={Unsupervised Attrib

Representation Learning for Attributed Multiplex Heterogeneous Network 论文阅读笔记

摘要部分 网络嵌入(图嵌入)在真实世界中已经有了非常大规模的应用,然而现存的一些网络嵌入(图嵌入)相关的方法主要还是集中在同质网络的应用场景下,即节点和边的类型都是单一类型的情况下。但是真实世界网络中每个节点的类型都多种,每条边的类型也有多种,而且每一个节点都具有不同且繁多的属性。所以本论文提出了一种在Attributed Multiplex Heterogeneous Network中进行嵌入

GNN 2021(十四) HDMI: High-order Deep Multiplex Infomax,WWW

本文算是对之前的模型Deep Graph Infomax(DGI)的一个改进,DGI主要作用于自监督的节点分类,使用最大化局部节点嵌入和全局摘要之间的互信息的方式在许多下游任务(如节点分类)上显示出了很好的结果。然而,DGI有两个主要的限制。首先,DGI只考虑外部监督信号(即节点嵌入与全局总结之间的互信息)而忽略了内在信号(即节点嵌入与节点属性之间的相互依赖)。其次,现实网络中的节点通常由多条关

Multiplex 数据格式转化

遇到这种数据格式,参考CKD的代码可以将其转换为普通邻接矩阵

论文笔记:Multiplex Heterogeneous Graph Convolutional Network

Multiplex Heterogeneous Graph Convolutional Network 现有的工作忽略了多类型节点之间多重网络的关系异质性和节点嵌入元路径中关系的不同重要性 导致很难捕获到跨不同关系的异构结构信号 什么是多类型节点之间多重网络的关系异质性? 首先要知道什么是多重网络(multiplex network),在一个网络中,用户可能会对一个商品有多种交互,比如点击

Representation Learning for Atributed Multiplex Heterogeneous Network

Representation Learning for Atributed Multiplex Heterogeneous Network 本文中心思想问题背景核心思路edge embedding Heterogeneous Network) 本文中心思想 问题背景 异构图分这么多种种类:要注意不是只有节点类型不同的才叫异构图,根据不同的节点类型,和不同的边的连接关系(单