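NETCONF in ODL supports periodic reconnection after a device goes offline unexpectedly. The relevant functionality is described below.
After a node is added successfully, a Communicator is created for the device; it handles the connection and communication logic between the controller and that device node.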
AbstractNetconfTopology.java
    protected NetconfConnectorDTO createDeviceCommunicator(final NodeId nodeId,
                                                           final NetconfNode node) {
        // setup default values since default value is not supported yet in mdsal
        // TODO remove this when mdsal starts supporting default values
        // Read the node configuration parameters
        final Long defaultRequestTimeoutMillis = node.getDefaultRequestTimeoutMillis() == null
                ? DEFAULT_REQUEST_TIMEOUT_MILLIS : node.getDefaultRequestTimeoutMillis();
        // Keepalive heartbeat interval, 120 s
        final Long keepaliveDelay = node.getKeepaliveDelay() == null
                ? DEFAULT_KEEPALIVE_DELAY : node.getKeepaliveDelay();
        final Boolean reconnectOnChangedSchema = node.isReconnectOnChangedSchema() == null
                ? DEFAULT_RECONNECT_ON_CHANGED_SCHEMA : node.isReconnectOnChangedSchema();

        IpAddress ipAddress = node.getHost().getIpAddress();
        InetSocketAddress address = new InetSocketAddress(ipAddress.getIpv4Address() != null
                ? ipAddress.getIpv4Address().getValue() : ipAddress.getIpv6Address().getValue(),
                node.getPort().getValue());

        RemoteDeviceId remoteDeviceId = new RemoteDeviceId(nodeId.getValue(), address);

        RemoteDeviceHandler<NetconfSessionPreferences> salFacade =
                createSalFacade(remoteDeviceId, node, domBroker, bindingAwareBroker);

        // Depends on the keepaliveDelay configured for the node: when it is 0,
        // the plain NetconfDeviceSalFacade is used, i.e. no keepalive heartbeat.
        if (keepaliveDelay > 0) {
            LOG.warn("Adding keepalive facade, for device {}", nodeId);
            salFacade = new KeepaliveSalFacade(remoteDeviceId, salFacade,
                    keepaliveExecutor.getExecutor(), keepaliveDelay, defaultRequestTimeoutMillis);
        }

        final NetconfDevice.SchemaResourcesDTO schemaResourcesDTO = setupSchemaCacheDTO(nodeId, node);

        final NetconfDevice device = new NetconfDevice(schemaResourcesDTO, remoteDeviceId, salFacade,
                processingExecutor.getExecutor(), reconnectOnChangedSchema);

        final Optional<NetconfSessionPreferences> userCapabilities = getUserCapabilities(node);

        NetconfDeviceCommunicator communicator = userCapabilities.isPresent()
                ? new NetconfDeviceCommunicator(remoteDeviceId, device,
                        new UserPreferences(userCapabilities.get(), node.getYangModuleCapabilities().isOverride()))
                : new NetconfDeviceCommunicator(remoteDeviceId, device);

        final NetconfConnectorDTO netconfConnectorDTO = new NetconfConnectorDTO(communicator, salFacade);
        salFacade.setListener(communicator);
        setCommunicator(nodeId, netconfConnectorDTO.getCommunicator());
        return netconfConnectorDTO;
    }
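The facade selection in this method is a plain decorator pattern, and it can be sketched in isolation. The snippet below is only an illustration: DeviceHandler, PlainFacade, KeepaliveFacade and select are hypothetical stand-ins, not ODL classes, and the 120 second default is taken from the comment in the code above.

```java
public class FacadeSelectionSketch {
    /** Hypothetical stand-in for ODL's RemoteDeviceHandler. */
    interface DeviceHandler {
        String name();
    }

    /** Stand-in for the facade without any keepalive. */
    static final class PlainFacade implements DeviceHandler {
        @Override
        public String name() {
            return "plain";
        }
    }

    /** Stand-in for the keepalive decorator that wraps another facade. */
    static final class KeepaliveFacade implements DeviceHandler {
        private final DeviceHandler delegate;
        private final long delaySeconds;

        KeepaliveFacade(DeviceHandler delegate, long delaySeconds) {
            this.delegate = delegate;
            this.delaySeconds = delaySeconds;
        }

        @Override
        public String name() {
            return "keepalive(" + delaySeconds + "s) -> " + delegate.name();
        }
    }

    // Assumed default, matching the "120 s" comment in the article's code.
    static final long DEFAULT_KEEPALIVE_DELAY = 120;

    /** A null setting falls back to the default; 0 disables the keepalive wrapper. */
    static DeviceHandler select(Long configuredDelay) {
        long delay = configuredDelay == null ? DEFAULT_KEEPALIVE_DELAY : configuredDelay;
        DeviceHandler facade = new PlainFacade();
        return delay > 0 ? new KeepaliveFacade(facade, delay) : facade;
    }

    public static void main(String[] args) {
        System.out.println(select(null).name()); // keepalive(120s) -> plain
        System.out.println(select(0L).name());   // plain
    }
}
```

Setting keepalive-delay to 0 in the node configuration therefore removes the heartbeat entirely rather than making it very frequent.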
    leaf connection-timeout-millis {
        description "Specifies timeout in milliseconds after which connection must be established.";
        type uint32;
        default 20000;
    }

    leaf default-request-timeout-millis {
        description "Timeout for blocking operations within transactions.";
        type uint32;
        default 60000;
    }

    leaf max-connection-attempts {
        description "Maximum number of connection retries. Non positive value or null is interpreted as infinity.";
        type uint32;
        default 0; // retry forever
    }

    leaf between-attempts-timeout-millis {
        description "Initial timeout in milliseconds to wait between connection attempts. Will be multiplied by sleep-factor with every additional attempt";
        type uint16;
        default 2000;
    }

    leaf sleep-factor {
        type decimal64 {
            fraction-digits 1;
        }
        default 1.5;
    }
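Once the session is created, channelActive in AbstractSessionNegotiator runs startNegotiation, which sends the Hello message; handleMessage in NetconfClientSessionNegotiator processes the Hello message returned by the device.
getSessionForHelloMessage changes the session state to ESTABLISHED.
connection-timeout-millis: the time allowed, once negotiation starts, for the session to move from OPEN_WAIT to ESTABLISHED. If the timer expires and the promise has neither completed nor been cancelled, negotiation fails and the channel is closed.
default-request-timeout-millis: used by invokeRpc of KeepaliveDOMRpcService in the KeepaliveSalFacade class; an RPC call that exceeds this timeout is cancelled.
maxConnectionAttempts, betweenAttemptsTimeoutMillis, sleepFactor: used by the reconnect logic to calculate when to reconnect.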
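The connection-timeout-millis rule (fail only if the promise is still pending when the timer fires) can be sketched with standard JDK classes. This is a minimal illustration under assumptions, not the ODL negotiator: the class, its methods and the FAILED state are hypothetical names, and the real code would also close the channel on failure.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class NegotiationTimeoutSketch {
    enum State { OPEN_WAIT, ESTABLISHED, FAILED }

    private final CompletableFuture<Void> promise = new CompletableFuture<>();
    private volatile State state = State.OPEN_WAIT;

    /** Arms the timer; timeoutMillis stands in for connection-timeout-millis. */
    void startNegotiation(ScheduledExecutorService timer, long timeoutMillis) {
        timer.schedule(() -> {
            // completeExceptionally returns false when the promise has already
            // completed or been cancelled, so a late timer does nothing.
            if (promise.completeExceptionally(new TimeoutException("negotiation timed out"))) {
                state = State.FAILED; // the real code would also close the channel here
            }
        }, timeoutMillis, TimeUnit.MILLISECONDS);
    }

    /** Called when the peer's Hello message arrives. */
    void onHello() {
        if (promise.complete(null)) {
            state = State.ESTABLISHED;
        }
    }

    State state() {
        return state;
    }

    public static void main(String[] args) throws InterruptedException {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();

        NegotiationTimeoutSketch answered = new NegotiationTimeoutSketch();
        answered.startNegotiation(timer, 200);
        answered.onHello(); // Hello arrives before the timeout

        NegotiationTimeoutSketch silent = new NegotiationTimeoutSketch();
        silent.startNegotiation(timer, 200); // no Hello ever arrives

        Thread.sleep(600);
        System.out.println("answered: " + answered.state());
        System.out.println("silent: " + silent.state());
        timer.shutdownNow();
    }
}
```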
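Keepalive heartbeat mechanism:
As the name suggests, it only applies once the node is already connected (for example, when the session is idle). See KeepaliveSalFacade.java.
sessionCreated(IoSession session): called when a new connection is created.
sessionOpened(IoSession session): called when a new connection is opened. It is called after sessionCreated.
sessionClosed(IoSession session): called when the connection is closed.
sessionIdle(IoSession session, IdleStatus status): called when the connection becomes idle.
exceptionCaught(IoSession session, Throwable cause): called when the I/O handler implementation throws an exception.
Notes:
The difference between sessionCreated and sessionOpened: sessionCreated is invoked by the I/O processing thread, while sessionOpened is invoked by another thread.
For performance reasons, therefore, avoid doing much work in sessionCreated.
For sessionIdle, idle time is disabled by default, which means sessionIdle is never called. It can be enabled with IoSessionConfig.setIdleTime(IdleStatus, int).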
KeepaliveSalFacade.java
    @Override
    public void onDeviceConnected(final SchemaContext remoteSchemaContext,
                                  final NetconfSessionPreferences netconfSessionPreferences,
                                  final DOMRpcService deviceRpc) {
        this.currentDeviceRpc = deviceRpc;
        final DOMRpcService deviceRpc1 = new KeepaliveDOMRpcService(deviceRpc, resetKeepaliveTask,
                defaultRequestTimeoutMillis, executor);
        salFacade.onDeviceConnected(remoteSchemaContext, netconfSessionPreferences, deviceRpc1);

        LOG.debug("{}: Netconf session initiated, starting keepalives", id);
        scheduleKeepalive();
    }
Once the connection succeeds, scheduleKeepalive is called to start the keepalive heartbeat.
    private void scheduleKeepalive() {
        Preconditions.checkState(currentDeviceRpc != null);
        LOG.trace("{}: Scheduling next keepalive in {} {}", id, keepaliveDelaySeconds, TimeUnit.SECONDS);
        currentKeepalive = executor.schedule(new Keepalive(currentKeepalive), keepaliveDelaySeconds, TimeUnit.SECONDS);
    }
Keepalive in KeepaliveSalFacade.java implements Runnable and FutureCallback. It invokes an RPC (get-config); in its callbacks, every outcome other than a successful response triggers a reconnect.
    @Override
    public void onSuccess(final DOMRpcResult result) {
        if (result != null && result.getResult() != null) {
            LOG.debug("{}: Keepalive RPC successful with response: {}", id, result.getResult());
            scheduleKeepalive();
        } else {
            LOG.warn("{} Keepalive RPC returned null with response: {}. Reconnecting netconf session", id, result);
            reconnect();
        }
    }

    @Override
    public void onFailure(@Nonnull final Throwable t) {
        LOG.warn("{}: Keepalive RPC failed. Reconnecting netconf session.", id, t);
        reconnect();
    }
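Besides the get-config request, other business RPCs also return data from the node and likewise prove that the node's session is alive. The success callback of invokeRpc in KeepaliveDOMRpcService therefore resets the keepalive timer, so that business RPC traffic reduces the keepalive heartbeat load.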
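The timer reset described here can be sketched with a ScheduledExecutorService: every reset cancels the pending probe and schedules a new one, so a steady stream of successful RPCs keeps the probe from ever firing. The class and method names below are hypothetical and the sketch uses milliseconds for a quick demo; it is not the KeepaliveSalFacade implementation.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ResettableKeepalive {
    private final ScheduledExecutorService executor;
    private final long delayMillis;
    private final Runnable probe;
    private ScheduledFuture<?> current;

    ResettableKeepalive(ScheduledExecutorService executor, long delayMillis, Runnable probe) {
        this.executor = executor;
        this.delayMillis = delayMillis;
        this.probe = probe;
    }

    /** Cancels the pending probe, if any, and schedules a fresh one. */
    synchronized void reset() {
        if (current != null) {
            current.cancel(false);
        }
        current = executor.schedule(probe, delayMillis, TimeUnit.MILLISECONDS);
    }

    /** Cancels the pending probe without rescheduling. */
    synchronized void stop() {
        if (current != null) {
            current.cancel(false);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger probes = new AtomicInteger();
        ResettableKeepalive keepalive = new ResettableKeepalive(executor, 500, probes::incrementAndGet);

        keepalive.reset(); // session established
        for (int i = 0; i < 3; i++) {
            Thread.sleep(20);
            keepalive.reset(); // a business RPC succeeded, push the probe back
        }
        System.out.println("probes while busy: " + probes.get());

        Thread.sleep(1000); // no traffic, so the probe is allowed to fire
        System.out.println("probes after idle: " + probes.get());
        executor.shutdownNow();
    }
}
```

In this sketch the probe does not reschedule itself; in the article's code that happens in the keepalive's onSuccess callback via scheduleKeepalive.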
    <node xmlns="urn:TBD:params:xml:ns:yang:network-topology">
      <node-id>testa</node-id>
      <host xmlns="urn:opendaylight:netconf-node-topology">10.42.94.233</host>
      <port xmlns="urn:opendaylight:netconf-node-topology">17830</port>
      <username xmlns="urn:opendaylight:netconf-node-topology">admin</username>
      <password xmlns="urn:opendaylight:netconf-node-topology">admin</password>
      <tcp-only xmlns="urn:opendaylight:netconf-node-topology">false</tcp-only>
      <keepalive-delay xmlns="urn:opendaylight:netconf-node-topology">0</keepalive-delay>
      <sleep-factor xmlns="urn:opendaylight:netconf-node-topology">1</sleep-factor>
      <reconnect-on-changed-schema xmlns="urn:opendaylight:netconf-node-topology">true</reconnect-on-changed-schema>
    </node>
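These settings are configured through node parameters; see the YANG file netconf-node-topology.yang for reference.
Reconnecting after a dropped link:
A dropped link implies that a connection had been established earlier. To establish a connection, Netconf first adds the device node (a write to the config datastore).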
ProtocolSessionPromise.java
    synchronized void connect() {
        final Object lock = this;

        try {
            final int timeout = this.strategy.getConnectTimeout();

            LOG.debug("Promise {} attempting connect for {}ms", lock, timeout);

            if (this.address.isUnresolved()) {
                this.address = new InetSocketAddress(this.address.getHostName(), this.address.getPort());
            }
            this.b.option(ChannelOption.CONNECT_TIMEOUT_MILLIS, timeout);
            final ChannelFuture connectFuture = this.b.connect(this.address);
            // Add listener that attempts reconnect by invoking this method again.
            connectFuture.addListener(new BootstrapConnectListener(lock));
            this.pending = connectFuture;
        } catch (final Exception e) {
            LOG.info("Failed to connect to {}", address, e);
            setFailure(e);
        }
    }
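timeout defaults to 2 seconds: if the connection is not established within 2 seconds, the attempt times out (the "semaphore timeout period" error); for later attempts the value is computed by the back-off strategy.
BootstrapConnectListener.java: the key part of this listener is its logic for a failed connection (the reconnect).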
    LOG.debug("Attempt to connect to {} failed", ProtocolSessionPromise.this.address, cf.cause());

    final Future<Void> rf = ProtocolSessionPromise.this.strategy.scheduleReconnect(cf.cause());
    rf.addListener(new ReconnectingStrategyListener());
    ProtocolSessionPromise.this.pending = rf;
If the connection attempt times out, the reconnect logic starts; the strategy used is TimedReconnectStrategy.java.
    leaf between-attempts-timeout-millis {
        description "Initial timeout in milliseconds to wait between connection attempts. Will be multiplied by sleep-factor with every additional attempt";
        config true;
        type uint16;
        default 2000;
    }
The wait between reconnect attempts follows a back-off algorithm (driven by sleep-factor).
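To make the back-off concrete, the snippet below computes the waits that the YANG description implies: start at between-attempts-timeout-millis and multiply by sleep-factor after every attempt. It is a sketch of that arithmetic only; the class and method names are hypothetical, and details such as caps or rounding in TimedReconnectStrategy are not modelled.

```java
import java.util.ArrayList;
import java.util.List;

public class BackoffSketch {
    /**
     * Returns the wait, in milliseconds, before each of the first
     * {@code attempts} reconnect attempts.
     * Assumption: the wait starts at initialMillis and is multiplied by
     * sleepFactor after every attempt, as the YANG description states.
     */
    static List<Long> delays(long initialMillis, double sleepFactor, int attempts) {
        List<Long> result = new ArrayList<>();
        double current = initialMillis;
        for (int i = 0; i < attempts; i++) {
            result.add(Math.round(current));
            current *= sleepFactor;
        }
        return result;
    }

    public static void main(String[] args) {
        // YANG defaults: 2000 ms initial wait, sleep-factor 1.5
        System.out.println(delays(2000, 1.5, 4)); // [2000, 3000, 4500, 6750]
        // sleep-factor 1 keeps the wait constant
        System.out.println(delays(2000, 1.0, 3)); // [2000, 2000, 2000]
    }
}
```

With sleep-factor set to 1 the waits never grow, so the device is retried at a fixed interval.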
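ReconnectingStrategyListener is simpler: once the future for the computed reconnect delay completes, it just connects.
The connect flow is then back where it started, forming a loop.
How does reconnection happen after an established connection drops?
When a device goes offline, a series of channelInactive callbacks fire; the flow reaches ClosedChannelHandler.channelInactive, which triggers connect on the ReconnectPromise.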
    @Override
    public void channelInactive(final ChannelHandlerContext ctx) throws Exception {
        // This is the ultimate channel inactive handler, not forwarding
        if (promise.isCancelled()) {
            return;
        }

        if (!promise.isInitialConnectFinished()) {
            LOG.debug("Connection to {} was dropped during negotiation, reattempting", promise.address);
        }

        LOG.debug("Reconnecting after connection to {} was dropped", promise.address);
        promise.connect();
    }
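The final log statement indicates that a reconnect is under way.
For the subsequent SSH connection:
After the reconnect is triggered, the flow enters AbstractChannelHandlerContext.java.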
    private void invokeConnect(SocketAddress remoteAddress, SocketAddress localAddress, ChannelPromise promise) {
        if (isAdded()) {
            try {
                ((ChannelOutboundHandler) handler()).connect(this, remoteAddress, localAddress, promise);
            } catch (Throwable t) {
                notifyOutboundHandlerException(t, promise);
            }
        } else {
            connect(remoteAddress, localAddress, promise);
        }
    }
Here the handler() method returns the following, in order, and connect is called on each:
DefaultChannelPipeline.java connect
NetconfHelloMessageToXMLEncoder
EOMFramingMechanismEncoder
AsyncSshHandler.java