<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
    <id>https://tis.pub/blog</id>
    <title>Zero-Code, Lego-Style Data Pipeline Building Blog</title>
    <updated>2024-05-10T00:00:00.000Z</updated>
    <generator>https://github.com/jpmonette/feed</generator>
    <link rel="alternate" href="https://tis.pub/blog"/>
    <subtitle>Zero-Code, Lego-Style Data Pipeline Building Blog</subtitle>
    <icon>https://tis.pub/img/favicon.ico</icon>
    <entry>
        <title type="html"><![CDATA[Implementing the TIS Incremental Channel with the Dependency Inversion Principle]]></title>
        <id>dip-in-tis</id>
        <link href="https://tis.pub/blog/dip-in-tis"/>
        <updated>2024-05-10T00:00:00.000Z</updated>
        <summary type="html"><![CDATA[Preface]]></summary>
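        <content type="html"><![CDATA[<h2 class="anchor anchorWithStickyNavbar_LWe7" id="前言">Preface<a class="hash-link" href="#前言" title="Direct link to heading">​</a></h2><p>On the <strong>dependency inversion principle</strong>, <a href="https://baike.baidu.com/item/%E4%BE%9D%E8%B5%96%E5%8F%8D%E8%BD%AC%E5%8E%9F%E5%88%99/22718037?fr=ge_ala" target="_blank" rel="noopener noreferrer">Baidu Baike offers this description</a>:</p><blockquote><p>In object-oriented programming, the dependency inversion principle (DIP) is a specific form of decoupling (conventionally, dependencies are established at the high level, while concrete policy settings are applied in low-level modules) in which high-level modules do not depend on the implementation details of low-level modules. The dependency is inverted, so that low-level modules depend on abstractions defined by the needs of high-level modules.</p></blockquote><p>Spring Framework probably comes to mind right away, since introductions to it often mention the dependency inversion principle. The description above can be hard to follow, so put simply, the principle's purpose is to <strong>decouple the service provider from the service caller at the code level</strong>.</p>
<p>A common illustration is the arranged marriage of earlier times. When a man reached marriageable age, he only had to tell the matchmaker his circumstances and requirements. Finding a suitable partner was then left to the matchmaker, and the man only needed to show up for the wedding night. From his point of view, the marriage and he stand in a service provider and consumer relationship. Because the matchmaker is involved, he skips the tedious negotiations and focuses on his core business, the wedding night. The whole process becomes simple, efficient, and rather elegant.</p><p>The <strong>dependency inversion principle (DIP)</strong> is therefore a plain, general principle that can be applied at every stage of software design, not only within Spring Framework.</p><p>This article describes how the <strong>dependency inversion principle (DIP)</strong> was used to improve the design and implementation of the real-time incremental data channel in TIS.</p>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="实现实时增量数据管道需求">Requirements for a Real-Time Incremental Data Pipeline<a class="hash-link" href="#实现实时增量数据管道需求" title="Direct link to heading">​</a></h2><p>TIS provides an end-to-end, Flink-based real-time incremental data channel. Real-time stream synchronization tools built on Flink and <a href="https://github.com/apache/flink-cdc" target="_blank" rel="noopener noreferrer">Flink-CDC</a> are already available, and user feedback suggests they are quite convenient. So why use Flink-CDC through TIS?</p><p>That is a very good question. To answer it, we first need to understand from the user's perspective what users actually need, and then design and build a product with the best possible user experience around those needs.</p><p>In big-data stream computing, the core user requirements are:</p><ol><li>A control system with a traceable operation history, so that past operations can be rolled back easily.</li><li>No need to care about operator implementation details. Stream computing users are often data analysts unfamiliar with Flink, so the product experience needs to hide the underlying technical details.</li><li>Extensible endpoint types. The <a href="https://nightlies.apache.org/flink/flink-cdc-docs-release-3.0/docs/connectors/overview/" target="_blank" rel="noopener noreferrer">Connectors</a> supported as of Flink-CDC 3.0 cover only a limited number of Source endpoints based on incremental CDC listening and a small number of Sink endpoints, such as Doris and StarRocks. This falls far short of the endpoint types found in real production scenarios, so a convenient, higher-level way to extend Source and Sink endpoint types is needed.</li></ol><p>TIS was developed precisely to address these three shortcomings of using the Flink-CDC framework.</p>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="具体实现">Implementation<a class="hash-link" href="#具体实现" title="Direct link to heading">​</a></h2><p>The following explains point 2 above in detail. Configuring and triggering a Flink-CDC-based data pipeline involves these steps:</p><iframe width="100%" height="500px" src="https://www.processon.com/embed/66407f1712931152f2850d1b?cid=66407f1712931152f2850d1e"></iframe><p>In the <code>构建DataStreamSource</code> (build DataStreamSource) step, calling the API provided by Flink-CDC makes it easy to subscribe to incremental change messages from a source such as MySQL, as in the following code:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">public</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">static</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">void</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">main</span><span class="token punctuation" style="color:#393A34">(</span><span class="token class-name">String</span><span class="token punctuation" style="color:#393A34">[</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"> args</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">throws</span><span class="token plain"> </span><span class="token class-name">Exception</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span 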
class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token class-name">MySqlSource</span><span class="token generics punctuation" style="color:#393A34">&lt;</span><span class="token generics class-name">String</span><span class="token generics punctuation" style="color:#393A34">&gt;</span><span class="token plain"> mySqlSource </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token class-name">MySqlSource</span><span class="token punctuation" style="color:#393A34">.</span><span class="token generics punctuation" style="color:#393A34">&lt;</span><span class="token generics class-name">String</span><span class="token generics punctuation" style="color:#393A34">&gt;</span><span class="token function" style="color:#d73a49">builder</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">hostname</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">"yourHostname"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">port</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">yourPort</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token 
plain">        </span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">databaseList</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">"yourDatabaseName"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token comment" style="color:#999988;font-style:italic">// set captured database, If you need to synchronize the whole database, Please set tableList to ".*".</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">tableList</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">"yourDatabaseName.yourTableName"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token comment" style="color:#999988;font-style:italic">// set captured table</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">username</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">"yourUsername"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">password</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" 
style="color:#e3116c">"yourPassword"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">deserializer</span><span class="token punctuation" style="color:#393A34">(</span><span class="token keyword" style="color:#00009f">new</span><span class="token plain"> </span><span class="token class-name">JsonDebeziumDeserializationSchema</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token comment" style="color:#999988;font-style:italic">// converts SourceRecord to JSON String</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">build</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token class-name">StreamExecutionEnvironment</span><span class="token plain"> env </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token class-name">StreamExecutionEnvironment</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" 
style="color:#d73a49">getExecutionEnvironment</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic">// enable checkpoint</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    env</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">enableCheckpointing</span><span class="token punctuation" style="color:#393A34">(</span><span class="token number" style="color:#36acaa">3000</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    env</span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">      </span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">fromSource</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">mySqlSource</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token class-name">WatermarkStrategy</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">noWatermarks</span><span class="token punctuation" 
style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"MySQL Source"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">      </span><span class="token comment" style="color:#999988;font-style:italic">// set 4 parallel source tasks</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">      </span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">setParallelism</span><span class="token punctuation" style="color:#393A34">(</span><span class="token number" style="color:#36acaa">4</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">      </span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">print</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">setParallelism</span><span class="token punctuation" style="color:#393A34">(</span><span class="token number" style="color:#36acaa">1</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"> </span><span class="token comment" style="color:#999988;font-style:italic">// use parallelism 1 for sink to keep message ordering</span><span class="token plain"></span><br></span><span class="token-line" 
style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    env</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">execute</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">"Print MySQL Snapshot + Binlog"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token punctuation" style="color:#393A34">}</span><br></span></code></pre></div></div><p>In the stream-computing flow above, the <code>MySqlSource&lt;String&gt; mySqlSource</code> source that was created can be fed into various operators for computation.</p><p>The stream source can also be registered as a Flink table using SQL:</p><div class="language-sql codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-sql codeBlock_bY9V thin-scrollbar"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">CREATE</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">TABLE</span><span class="token plain"> 
mysql_source </span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">.</span><span class="token punctuation" style="color:#393A34">.</span><span class="token punctuation" style="color:#393A34">.</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">WITH</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token string" style="color:#e3116c">'connector'</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">'mysql-cdc'</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token string" style="color:#e3116c">'scan.startup.mode'</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">'earliest-offset'</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token comment" style="color:#999988;font-style:italic">-- Start from earliest offset</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token string" style="color:#e3116c">'scan.startup.mode'</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">'latest-offset'</span><span class="token 
punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token comment" style="color:#999988;font-style:italic">-- Start from latest offset</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token string" style="color:#e3116c">'scan.startup.mode'</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">'specific-offset'</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token comment" style="color:#999988;font-style:italic">-- Start from specific offset</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token string" style="color:#e3116c">'scan.startup.mode'</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">'timestamp'</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token comment" style="color:#999988;font-style:italic">-- Start from timestamp</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token string" style="color:#e3116c">'scan.startup.specific-offset.file'</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">'mysql-bin.000003'</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token comment" style="color:#999988;font-style:italic">-- Binlog filename under specific offset 
startup mode</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token string" style="color:#e3116c">'scan.startup.specific-offset.pos'</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">'4'</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token comment" style="color:#999988;font-style:italic">-- Binlog position under specific offset mode</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token string" style="color:#e3116c">'scan.startup.specific-offset.gtid-set'</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">'24DA167-0C0C-11E8-8442-00059A3C7B00:1-19'</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token comment" style="color:#999988;font-style:italic">-- GTID set under specific offset startup mode</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token string" style="color:#e3116c">'scan.startup.timestamp-millis'</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">'1667232000000'</span><span class="token plain"> </span><span class="token comment" style="color:#999988;font-style:italic">-- Timestamp under timestamp startup mode</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    
</span><span class="token punctuation" style="color:#393A34">.</span><span class="token punctuation" style="color:#393A34">.</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">)</span><br></span></code></pre></div></div><p>These are the standard demo examples provided by Flink-CDC.</p><p>Let us reconsider this with the dependency inversion principle in mind: does this construction flow violate the principle? It does. From the user's point of view, only the resulting <code>MySqlSource&lt;String&gt;</code> instance matters; how it is built is of no interest. The design should therefore decouple the construction of the <code>MySqlSource&lt;String&gt;</code> instance from its caller.</p>
<p>This is where TIS comes in. TIS acts as an instance container: it automatically creates the <code>MySqlSource&lt;String&gt;</code> instance from the Source parameters the user has configured and injects it into the execution flow at runtime.</p><ul><li>Configure the Source/Sink Connector<figure><a data-fancybox="true" href="/assets/images/source-sink-config-26a71b75918cfc0382e41c0e03db7cf7.png"><img src="/assets/images/source-sink-config-26a71b75918cfc0382e41c0e03db7cf7.png"></a></figure></li><li>Reference the SourceStream instance injected by TIS directly<figure><a data-fancybox="true" href="/assets/images/instance-by-tis-and-inject-850907bd03214cd7a11af8f07b5798fa.png"><img src="/assets/images/instance-by-tis-and-inject-850907bd03214cd7a11af8f07b5798fa.png"></a></figure></li><li>When the user chooses a Flink SQL script, simply reference the name of the already registered Table<figure><a data-fancybox="true" href="/assets/images/instance-by-tis-and-register-flink-table-978a89528ef9c95afcf4b1b0a7f85369.png"><img src="/assets/images/instance-by-tis-and-register-flink-table-978a89528ef9c95afcf4b1b0a7f85369.png"></a></figure></li></ul>
<p>The factories that provide the injected instances are:</p><ul><li>Creating the Stream Source: <a href="https://github.com/qlangtech/plugins/blob/master/tis-incr/tis-flink-cdc-mysql-plugin/src/main/java/com/qlangtech/tis/plugins/incr/flink/cdc/mysql/FlinkCDCMySQLSourceFactory.java" target="_blank" rel="noopener noreferrer">FlinkCDCMySQLSourceFactory.java</a></li><li>Converting the Stream Source into a Flink Table: <a href="https://github.com/qlangtech/plugins/blob/master/tis-incr/tis-flink-chunjun-mysql-plugin/src/main/java/com/qlangtech/tis/plugins/incr/flink/connector/table/MySQLDynamicTableFactory.java" target="_blank" rel="noopener noreferrer">MySQLDynamicTableFactory.java</a></li></ul><p>Both pieces of code work much like a Spring <a href="https://www.baeldung.com/spring-factorybean" target="_blank" rel="noopener noreferrer">FactoryBean</a>: they implement an extension factory interface defined by the container, the container initializes them at runtime, and the resulting instances are then injected into the objects that depend on them.</p>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="总结">Summary<a class="hash-link" href="#总结" title="Direct link to heading">​</a></h2><p>This article described how the dependency inversion principle was used to improve the real-time incremental channel in TIS, letting end users concentrate on the core stream-computing logic while TIS handles the tedious work of instance initialization.</p><p>TIS contains many similar optimizations, which will be covered in future blog posts.</p>]]></content>
        <author>
            <name>百岁</name>
            <uri>https://github.com/baisui1981</uri>
        </author>
    </entry>
    <entry>
        <title type="html"><![CDATA[Optimizing Data Reads from MongoDB Collections]]></title>
        <id>read-from-mongodb</id>
        <link href="https://tis.pub/blog/read-from-mongodb"/>
        <updated>2024-05-10T00:00:00.000Z</updated>
        <summary type="html"><![CDATA[Preface]]></summary>
        <content type="html"><![CDATA[<h2 class="anchor anchorWithStickyNavbar_LWe7" id="前言">Preface<a class="hash-link" href="#前言" title="Direct link to heading">​</a></h2><p>MongoDB is an open-source database system built on distributed file storage. It stores content as BSON, a JSON-like binary format. Its main traits are high performance, easy deployment, and ease of use; it supports transactions and comprehensive indexing, including geospatial, hashed, and full-text indexes. Another major advantage is that it suits very large data volumes: data is automatically partitioned and distributed across servers.</p><p>MongoDB is used in many production scenarios, and its BSON data structure allows the data schema to be extended dynamically at runtime.</p><p>TIS integrates MongoDB into its data integration solution. With TIS it is <strong>easy to read from or write to a MongoDB endpoint</strong>, which makes it straightforward to migrate MongoDB data, keep real-time disaster-recovery backups, and synchronize in real time to heterogeneous endpoints (such as Doris) for complex OLAP workloads.</p><p>In practice we found that a collection's schema cannot be read from MongoDB in advance. This article describes in detail how we optimized for that.</p>
<h2 class="anchor anchorWithStickyNavbar_LWe7" id="发现瓶颈">Identifying the Bottleneck<a class="hash-link" href="#发现瓶颈" title="Direct link to heading">​</a></h2><p>As a representative document database, MongoDB differs from a traditional relational database such as MySQL: a collection's schema can change at runtime rather than being fixed up front by a Create Table DDL statement as in MySQL. This makes data integration with MongoDB awkward, because the columns to read from a MongoDB collection have to be configured by hand. For example, to read MongoDB with Alibaba DataX, the user must write a DataX Reader job configuration:</p>
<div class="language-json codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockTitle_Ktv7">https://github.com/alibaba/DataX/blob/master/mongodbreader/doc/mongodbreader.md</div><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-json codeBlock_bY9V thin-scrollbar"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token property" style="color:#36acaa">"job"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token property" style="color:#36acaa">"content"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">      </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token property" style="color:#36acaa">"reader"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">          </span><span 
class="token property" style="color:#36acaa">"name"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"mongodbreader"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">          </span><span class="token property" style="color:#36acaa">"parameter"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token property" style="color:#36acaa">"address"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">[</span><span class="token string" style="color:#e3116c">"127.0.0.1:27017"</span><span class="token punctuation" style="color:#393A34">]</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token property" style="color:#36acaa">"dbName"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"tag_per_data"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token property" style="color:#36acaa">"collectionName"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token string" 
style="color:#e3116c">"tag_data12"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token property" style="color:#36acaa">"column"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">              </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token property" style="color:#36acaa">"name"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"unique_id"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token property" style="color:#36acaa">"type"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"string"</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">              </span><span class="token punctuation" style="color:#393A34">}</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">              </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" 
style="color:#393A34"><span class="token plain">                </span><span class="token property" style="color:#36acaa">"name"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"sid"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token property" style="color:#36acaa">"type"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"string"</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">              </span><span class="token punctuation" style="color:#393A34">}</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">              </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token property" style="color:#36acaa">"name"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"user_id"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token property" style="color:#36acaa">"type"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"string"</span><span class="token 
plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">              </span><span class="token punctuation" style="color:#393A34">}</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">              </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token property" style="color:#36acaa">"name"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"auction_id"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token property" style="color:#36acaa">"type"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"string"</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">              </span><span class="token punctuation" style="color:#393A34">}</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">              </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token property" style="color:#36acaa">"name"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span 
class="token string" style="color:#e3116c">"content_type"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token property" style="color:#36acaa">"type"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"string"</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">              </span><span class="token punctuation" style="color:#393A34">}</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">              </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token property" style="color:#36acaa">"name"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"pool_type"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token property" style="color:#36acaa">"type"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"string"</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">              </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span 
class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">          </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token punctuation" style="color:#393A34">}</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token property" style="color:#36acaa">"writer"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">          </span><span class="token property" style="color:#36acaa">"name"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"odpswriter"</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">      </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token 
plain">  </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">}</span><br></span></code></pre></div></div><p>In the configuration above, the <code>mongodbreader</code> reader requires every column of the corresponding collection to be enumerated. When a column holds a nested <code>BSON</code> document, it also has to be split into separate columns, which complicates things further. The configuration is simple in principle but still error-prone, especially when filling in the <code>type</code> attribute.</p><h2 class="anchor anchorWithStickyNavbar_LWe7" id="优化">Optimization<a class="hash-link" href="#优化" title="Direct link to heading">​</a></h2><h3 class="anchor anchorWithStickyNavbar_LWe7" id="思路">Approach<a class="hash-link" href="#思路" title="Direct link to heading">​</a></h3><p>The idea: can we use the MongoDB client driver to obtain the collection's column list by introspection, and then generate the <code>column</code> section of the DataX configuration file automatically with a template engine such as Velocity?</p><p>To read the collection's schema metadata through the MongoDB Client API, we can fetch one record from the collection and derive the schema by parsing it, as follows:</p><div class="language-javascript codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-javascript codeBlock_bY9V thin-scrollbar"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token keyword" style="color:#00009f">var</span><span class="token plain"> schemaObj </span><span class="token operator" style="color:#393A34">=</span><span class="token 
plain"> db</span><span class="token punctuation" style="color:#393A34">.</span><span class="token property-access">users</span><span class="token punctuation" style="color:#393A34">.</span><span class="token method function property-access" style="color:#d73a49">findOne</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><br></span></code></pre></div></div><p>Iterate over all the columns of the record:</p><div class="language-javascript codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-javascript codeBlock_bY9V thin-scrollbar"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token keyword" style="color:#00009f">function</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">printSchema</span><span class="token punctuation" style="color:#393A34">(</span><span class="token parameter">obj</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token 
keyword control-flow" style="color:#00009f">for</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token keyword" style="color:#00009f">var</span><span class="token plain"> key </span><span class="token keyword" style="color:#00009f">in</span><span class="token plain"> obj</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token function" style="color:#d73a49">print</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">key</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">typeof</span><span class="token plain"> obj</span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain">key</span><span class="token punctuation" style="color:#393A34">]</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token punctuation" style="color:#393A34">}</span><span class="token punctuation" style="color:#393A34">;</span><br></span></code></pre></div></div><p>This prints the schema, as shown below:<br><br>
<img loading="lazy" src="/assets/images/1_9aFUkXYSRE_6ezUxH7pw3Q-f28addc41b7cd5957f034c9babae066c.webp" width="206" height="279" class="img_ev3q"></p><p>That is pretty cool. User-defined collections, however, often contain nested sub-properties, and we would like to flatten these so that they can be imported into the downstream target.</p><p>We can improve the code above:</p><div class="language-javascript codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-javascript codeBlock_bY9V thin-scrollbar"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">function</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">printSchema</span><span class="token punctuation" style="color:#393A34">(</span><span class="token parameter">obj</span><span class="token parameter punctuation" style="color:#393A34">,</span><span class="token parameter"> indent</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword control-flow" style="color:#00009f">for</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token keyword" style="color:#00009f">var</span><span class="token plain"> key </span><span class="token keyword" style="color:#00009f">in</span><span class="token plain"> obj</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token function" style="color:#d73a49">print</span><span class="token punctuation" 
style="color:#393A34">(</span><span class="token plain">indent</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> key</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">typeof</span><span class="token plain"> obj</span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain">key</span><span class="token punctuation" style="color:#393A34">]</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token keyword control-flow" style="color:#00009f">if</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token keyword" style="color:#00009f">typeof</span><span class="token plain"> obj</span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain">key</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">==</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"object"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token function" style="color:#d73a49">printSchema</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">obj</span><span class="token punctuation" style="color:#393A34">[</span><span class="token 
plain">key</span><span class="token punctuation" style="color:#393A34">]</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> indent </span><span class="token operator" style="color:#393A34">+</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"\t"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token punctuation" style="color:#393A34">}</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token function" style="color:#d73a49">printSchema</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">schemaObj</span><span class="token punctuation" style="color:#393A34">,</span><span class="token string" style="color:#e3116c">""</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><br></span></code></pre></div></div><p>Running the code again prints:<br><br>
<img loading="lazy" alt="1_nWI-bWoJHWE1WU2Mgk8z-A.webp" src="/assets/images/1_nWI-bWoJHWE1WU2Mgk8z-A-5ae27380af9d81be2b4d3cd775d9c71c.webp" width="209" height="380" class="img_ev3q"></p><ul><li>Reference: <a href="https://medium.com/@ahsan.ayaz/how-to-find-schema-of-a-collection-in-mongodb-d9a91839d992" target="_blank" rel="noopener noreferrer">https://medium.com/@ahsan.ayaz/how-to-find-schema-of-a-collection-in-mongodb-d9a91839d992</a></li></ul><h3 class="anchor anchorWithStickyNavbar_LWe7" id="在tis中具体实现">Implementation in TIS<a class="hash-link" href="#在tis中具体实现" title="Direct link to heading">​</a></h3><p>TIS reads the schema of a MongoDB collection along the same lines, with a few extra steps:</p><ol><li>The number of records to sample is set in the console. Because records in a MongoDB collection do not necessarily share the same number or types of columns, TIS can read several records and merge their individual schemas into the final schema.</li><li>For every column whose type is <code>BsonType.DOCUMENT</code>, the nested sub-columns are joined to the parent column with a '.' and flattened into new columns, for example "user.name" and "user.age".</li><li>The resulting schema is displayed in the front end, where the user can fine-tune it in a form to confirm the final schema structure.</li></ol><p>The following is an excerpt from <a href="https://github.com/qlangtech/plugins/blob/b728488cc198661e6657cc174b4326495f9d5e6d/tis-datax/tis-datax-mongodb-plugin/src/main/java/com/qlangtech/tis/plugin/datax/mongo/MongoColumnMetaData.java#L69" target="_blank" rel="noopener noreferrer">MongoColumnMetaData.java</a> on GitHub:</p><div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockTitle_Ktv7">/tis-datax-mongodb-plugin/src/main/java/com/qlangtech/tis/plugin/datax/mongo/MongoColumnMetaData.java#L69</div><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">public</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">static</span><span class="token plain"> </span><span class="token keyword" 
style="color:#00009f">void</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">parseMongoDocTypes</span><span class="token punctuation" style="color:#393A34">(</span><span class="token keyword" style="color:#00009f">boolean</span><span class="token plain"> parseChildDoc</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token class-name">List</span><span class="token generics punctuation" style="color:#393A34">&lt;</span><span class="token generics class-name">String</span><span class="token generics punctuation" style="color:#393A34">&gt;</span><span class="token plain"> parentKeys </span><span class="token comment" style="color:#999988;font-style:italic">//</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token class-name">Map</span><span class="token generics punctuation" style="color:#393A34">&lt;</span><span class="token generics class-name">String</span><span class="token generics punctuation" style="color:#393A34">,</span><span class="token generics"> </span><span class="token generics class-name">MongoColumnMetaData</span><span class="token generics punctuation" style="color:#393A34">&gt;</span><span class="token plain"> colsSchema</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token class-name">BsonDocument</span><span class="token plain"> bdoc</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" 
style="color:#00009f">int</span><span class="token plain"> index </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token class-name">BsonValue</span><span class="token plain"> val</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token class-name">String</span><span class="token plain"> key</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token class-name">MongoColumnMetaData</span><span class="token plain"> colMeta</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token class-name">List</span><span class="token generics punctuation" style="color:#393A34">&lt;</span><span class="token generics class-name">String</span><span class="token generics punctuation" style="color:#393A34">&gt;</span><span class="token plain"> keys </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">null</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    
    </span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token class-name">Map</span><span class="token class-name punctuation" style="color:#393A34">.</span><span class="token class-name">Entry</span><span class="token generics punctuation" style="color:#393A34">&lt;</span><span class="token generics class-name">String</span><span class="token generics punctuation" style="color:#393A34">,</span><span class="token generics"> </span><span class="token generics class-name">BsonValue</span><span class="token generics punctuation" style="color:#393A34">&gt;</span><span class="token plain"> entry </span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> bdoc</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">entrySet</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            val </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> entry</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getValue</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            keys </span><span class="token operator" style="color:#393A34">=</span><span 
class="token plain"> </span><span class="token class-name">ListUtils</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">union</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">parentKeys</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token class-name">Collections</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">singletonList</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">entry</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getKey</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            key </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token class-name">String</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">join</span><span class="token punctuation" style="color:#393A34">(</span><span class="token class-name">MongoCMeta</span><span class="token punctuation" style="color:#393A34">.</span><span class="token constant" style="color:#36acaa">KEY_MONOG_NEST_PROP_SEPERATOR</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> keys</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" 
style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            colMeta </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> colsSchema</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">get</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">key</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">colMeta </span><span class="token operator" style="color:#393A34">==</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">null</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                colMeta </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">new</span><span class="token plain"> </span><span class="token class-name">MongoColumnMetaData</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">index</span><span class="token operator" style="color:#393A34">++</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> key</span><span class="token punctuation" style="color:#393A34">,</span><span class="token 
plain"> val</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getBsonType</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                        </span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">val</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getBsonType</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">==</span><span class="token plain"> </span><span class="token class-name">BsonType</span><span class="token punctuation" style="color:#393A34">.</span><span class="token constant" style="color:#36acaa">OBJECT_ID</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                colsSchema</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">put</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">key</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> colMeta</span><span class="token punctuation" 
style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">else</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">colMeta</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getMongoFieldType</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">!=</span><span class="token plain"> </span><span class="token class-name">BsonType</span><span class="token punctuation" style="color:#393A34">.</span><span class="token constant" style="color:#36acaa">STRING</span><span class="token plain"> </span><span class="token comment" style="color:#999988;font-style:italic">//</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                        </span><span class="token operator" style="color:#393A34">&amp;&amp;</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">!</span><span class="token plain">val</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">isNull</span><span class="token punctuation" 
style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">&amp;&amp;</span><span class="token plain"> colMeta</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getMongoFieldType</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">!=</span><span class="token plain"> val</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getBsonType</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    </span><span class="token comment" style="color:#999988;font-style:italic">// TODO: the type seen now differs from the type recorded earlier,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    </span><span class="token comment" style="color:#999988;font-style:italic">// so fall back to the String type directly</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    colMeta </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">new</span><span class="token plain"> </span><span class="token class-name">MongoColumnMetaData</span><span class="token punctuation" style="color:#393A34">(</span><span class="token 
plain">index</span><span class="token operator" style="color:#393A34">++</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> key</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token class-name">BsonType</span><span class="token punctuation" style="color:#393A34">.</span><span class="token constant" style="color:#36acaa">STRING</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    colsSchema</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">put</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">key</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> colMeta</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token operator" style="color:#393A34">!</span><span class="token plain">val</span><span class="token punctuation" 
style="color:#393A34">.</span><span class="token function" style="color:#d73a49">isNull</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">colMeta</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getMongoFieldType</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">==</span><span class="token plain"> </span><span class="token class-name">BsonType</span><span class="token punctuation" style="color:#393A34">.</span><span class="token constant" style="color:#36acaa">DOCUMENT</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">&amp;&amp;</span><span class="token plain"> val</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">isDocument</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    </span><span class="token function" 
style="color:#d73a49">parseMongoDocTypes</span><span class="token punctuation" style="color:#393A34">(</span><span class="token boolean" style="color:#36acaa">true</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> keys</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> parseChildDoc </span><span class="token operator" style="color:#393A34">?</span><span class="token plain"> colsSchema </span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> colMeta</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">docTypeFieldEnum</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> val</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">asDocument</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">colMeta</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getMongoFieldType</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> 
</span><span class="token operator" style="color:#393A34">==</span><span class="token plain"> </span><span class="token class-name">BsonType</span><span class="token punctuation" style="color:#393A34">.</span><span class="token constant" style="color:#36acaa">STRING</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                    colMeta</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">setMaxStrLength</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">val</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">asString</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getValue</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">length</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">                </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span 
class="token plain">                colMeta</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">incrContainValCount</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token punctuation" style="color:#393A34">}</span><br></span></code></pre></div></div><p>On the MongoDB Reader page, set <code>预读记录数</code> (the number of records to pre-read). TIS tries to read that many records from the Collection and infers the Collection's Schema from them. To make the final Schema as accurate as possible, set <code>预读记录数</code> to a somewhat larger value.</p><figure><a data-fancybox="true" href="/assets/images/set-count-of-ready-read-c4cdc482a417ac2785400ebb70000632.png"><img 
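src="/assets/images/set-count-of-ready-read-c4cdc482a417ac2785400ebb70000632.png"></a></figure><p>The Schema that TIS infers is shown on the following page, where users can fine-tune it, for example by deciding whether a column is needed or by changing a field's type.</p><figure><a data-fancybox="true" href="/assets/images/confirm-collection-schema-487e3cdc18d210f34c5d82d983153068.png"><img src="/assets/images/confirm-collection-schema-487e3cdc18d210f34c5d82d983153068.png"></a></figure><p>With these optimizations, users can finish defining a data synchronization pipeline without typing in the MongoDB collection Schema by hand, which minimizes the chance of errors and improves efficiency.</p><h2 class="anchor anchorWithStickyNavbar_LWe7" id="总结">Summary<a class="hash-link" href="#总结" title="Direct link to heading">​</a></h2><p>In data integration, a large share of endpoint types are Schemaless like MongoDB, for example Kafka, Redis, and file-based sources such as HDFS and FTP. Unlike MySQL and other sources with an explicit, predefined Schema, their Schema cannot be obtained by reading MetaData and then used to automate the setup.</p><p>TIS was created to be a highly foolproof DataOps data integration tool for front-line, non-technical staff. They know the business well, and while operating the tool they should not need to know which fields exist in MongoDB or what types those fields have.
The whole workflow takes only a few mouse clicks, and TIS generates the required configuration for the user automatically.</p><p>The same approach can be extended to reading from other Schemaless sources, such as Kafka, Redis, and file-based HDFS and FTP, which should greatly improve the efficiency of data integration work.</p>]]></content>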
        <author>
            <name>百岁</name>
            <uri>https://github.com/baisui1981</uri>
        </author>
        <category label="mongodb" term="mongodb"/>
    </entry>
    <entry>
        <title type="html"><![CDATA[Reusing Nested Plugins]]></title>
        <id>nest-plugin-reuse</id>
        <link href="https://tis.pub/blog/nest-plugin-reuse"/>
        <updated>2024-04-22T00:00:00.000Z</updated>
        <summary type="html"><![CDATA[Preface]]></summary>
        <content type="html"><![CDATA[<h2 class="anchor anchorWithStickyNavbar_LWe7" id="前言">Preface<a class="hash-link" href="#前言" title="Direct link to heading">​</a></h2><p>The main feature of TIS 4.0.0 is extending components that previously ran on a single node to a distributed cloud environment. In the class diagram below, three components depend on the <code>ServerPortExport</code> component:</p><ol><li>Kubernetes PowerJob Server</li><li>Kubernetes Flink Session</li><li>Kubernetes Flink Application</li></ol><p>When a K8S component (ReplicaSet) is deployed, the <code>ServerPortExport</code> component is responsible for exposing the target port in one of several ways (Ingress, LoadBalancer, NodePort).</p><iframe width="100%" height="500px" name="embed_dom" frameborder="0" src="https://www.processon.com/embed/6626299e51b84f0dfc92128c?cid=6626299e51b84f0dfc92128f"></iframe><h2 class="anchor anchorWithStickyNavbar_LWe7" id="遇到问题">The Problem<a class="hash-link" href="#遇到问题" title="Direct link to heading">​</a></h2><p><code>ServerPortExport</code> is aggregated into several different components, and at runtime its initial port value must differ depending on the aggregating class.</p><p>For example, the initial value is <code>7700</code> when it is aggregated into the <code>K8SDataXPowerJobServer</code> class, and <code>8081</code> when it is aggregated into <code>BasicFlinkK8SClusterCfg</code>. The most obvious approach is to create a separate <code>ServerPortExport</code> subclass for each aggregating class and set a different initial value in each,
but that produces a lot of redundant code, so it is not a good option.</p><h2 class="anchor anchorWithStickyNavbar_LWe7" id="解决办法">The Solution<a class="hash-link" href="#解决办法" title="Direct link to heading">​</a></h2><p>At runtime, set the value of the <code>ServerPortExport.serverPort</code> property dynamically, based on the Descriptor of the aggregating class.</p><iframe width="100%" height="500px" frameborder="0" src="https://www.processon.com/embed/662637256b32731aec0d8fe0?cid=662637256b32731aec0d8fe3"></iframe><p>The following work is required:</p><ol><li>Create the <code>DefaultExportPortProvider</code> interface, whose <code>get</code> method returns the corresponding default port</li><li>Have the Descriptors of <code>BasicFlinkK8SClusterCfg</code> and <code>K8SDataXPowerJobServer</code> each implement this interface</li><li>While a Descriptor is being serialized to JSON at runtime, bind the Descriptor instance to the currently running thread. This binding happens during JSON serialization, which is why a new class, <code>DescriptorsJSONResult</code>, is added</li><li>Register the <code>DescriptorsJSONResult</code> class with the JSON serializer registry<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockTitle_Ktv7">JsonUtil.java</div><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#393A34"><span class="token plain">   </span><span class="token class-name">ObjectWriter</span><span class="token plain"> descSerializer </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">new</span><span class="token plain"> </span><span class="token class-name">ObjectWriter</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span 
class="token annotation punctuation" style="color:#393A34">@Override</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">public</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">void</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">write</span><span class="token punctuation" style="color:#393A34">(</span><span class="token class-name">JSONWriter</span><span class="token plain"> jsonWriter</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token class-name">Object</span><span class="token plain"> object</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token class-name">Object</span><span class="token plain"> fieldName</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token class-name">Type</span><span class="token plain"> fieldType</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">long</span><span class="token plain"> features</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token class-name">DescriptorsJSONResult</span><span class="token plain"> value </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token class-name">DescriptorsJSONResult</span><span class="token punctuation" 
style="color:#393A34">)</span><span class="token plain"> object</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token class-name">Objects</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">requireNonNull</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">value</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"callable of "</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">+</span><span class="token plain"> fieldName </span><span class="token operator" style="color:#393A34">+</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">" can not be null"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">            jsonWriter</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">writeRaw</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">value</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">toJSONString</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span 
class="token plain">        </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token punctuation" style="color:#393A34">}</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    com</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">alibaba</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">fastjson2</span><span class="token punctuation" style="color:#393A34">.</span><span class="token constant" style="color:#36acaa">JSON</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">register</span><span class="token punctuation" style="color:#393A34">(</span><span class="token class-name">DescriptorsJSONResult</span><span class="token punctuation" style="color:#393A34">.</span><span class="token keyword" style="color:#00009f">class</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> descSerializer</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><br></span></code></pre></div></div></li><li>Set the default value of the <code>serverPort</code> property in the <code>ServerPortExport.json</code> descriptor file<div class="language-json codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockTitle_Ktv7">ServerPortExport.json</div><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-json codeBlock_bY9V thin-scrollbar"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">   </span><span class="token property" style="color:#36acaa">"serverPort"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">      </span><span class="token property" style="color:#36acaa">"help"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"Spring Boot setting: HTTP port, default 7700; changing it is not recommended"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">      </span><span class="token property" style="color:#36acaa">"dftVal"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"com.qlangtech.tis.plugin.datax.powerjob.ServerPortExport.dftExportPort():uncache_true"</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token 
plain">    </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token punctuation" style="color:#393A34">}</span><br></span></code></pre></div></div><div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockTitle_Ktv7">ServerPortExport.java</div><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token keyword" style="color:#00009f">public</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">static</span><span class="token plain"> </span><span class="token class-name">Integer</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">dftExportPort</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" 
style="color:#00009f">return</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">(</span><span class="token class-name">DefaultExportPortProvider</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">       </span><span class="token class-name">DescriptorsJSONResult</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">getRootDescInstance</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">get</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></span><span class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token punctuation" style="color:#393A34">}</span><br></span></code></pre></div></div></li></ol><h2 class="anchor anchorWithStickyNavbar_LWe7" id="总结">Summary<a class="hash-link" href="#总结" 
title="Direct link to heading">​</a></h2><p>With the steps above, the <code>serverPort</code> property of <code>ServerPortExport</code> is initialized to a different default value depending on the aggregating class.
The same technique can be applied wherever a similar need arises in TIS.</p>]]></content>
        <author>
            <name>百岁</name>
            <uri>https://github.com/baisui1981</uri>
        </author>
        <category label="plugin" term="plugin"/>
    </entry>
</feed>