pikesaku’s blog

個人的な勉強メモです。記載内容について一切の責任は持ちません。

OpenStackエラーログ調査

目的

OpenStackのエラーログを、こつこつ原因調査してみる。
 

環境

Centos7
OpenStafck Ocata
1NICだけのホスト
 

事象

/var/log/neutron/linuxbridge-agent.log
上記ログファイルに、以下のエラーログが2秒に1回出力される。
再起動しても解消されない。
 

2017-06-11 12:28:12.541 993 ERROR neutron.agent.linux.utils [req-b4a61cad-f978-4624-ac68-d2e3ad65f4d3 - - - - -] Exit code: 2; Stdin: ; Stdout: ; Stderr: RTNETLINK answers: File exists
2017-06-11 12:28:12.563 993 ERROR neutron.plugins.ml2.drivers.agent._common_agent [req-b4a61cad-f978-4624-ac68-d2e3ad65f4d3 - - - - -] Error in agent loop. Devices info: {'current': set(['tapc0b2f396-b4', 'tap90033646-4f', 'tapd2d00b90-58']), 'timestamps': {'tapd2d00b90-58': 7, 'tapc0b2f396-b4': 8, 'tap90033646-4f': 10}, 'removed': set([]), 'added': set(['tapc0b2f396-b4', 'tap90033646-4f', 'tapd2d00b90-58']), 'updated': set([])}
2017-06-11 12:28:12.563 993 ERROR neutron.plugins.ml2.drivers.agent._common_agent Traceback (most recent call last):
2017-06-11 12:28:12.563 993 ERROR neutron.plugins.ml2.drivers.agent._common_agent   File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/agent/_common_agent.py", line 453, in daemon_loop
2017-06-11 12:28:12.563 993 ERROR neutron.plugins.ml2.drivers.agent._common_agent     sync = self.process_network_devices(device_info)
2017-06-11 12:28:12.563 993 ERROR neutron.plugins.ml2.drivers.agent._common_agent   File "/usr/lib/python2.7/site-packages/osprofiler/profiler.py", line 153, in wrapper
2017-06-11 12:28:12.563 993 ERROR neutron.plugins.ml2.drivers.agent._common_agent     return f(*args, **kwargs)
2017-06-11 12:28:12.563 993 ERROR neutron.plugins.ml2.drivers.agent._common_agent   File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/agent/_common_agent.py", line 210, in process_network_devices
2017-06-11 12:28:12.563 993 ERROR neutron.plugins.ml2.drivers.agent._common_agent     resync_a = self.treat_devices_added_updated(devices_added_updated)
2017-06-11 12:28:12.563 993 ERROR neutron.plugins.ml2.drivers.agent._common_agent   File "/usr/lib/python2.7/site-packages/osprofiler/profiler.py", line 153, in wrapper
2017-06-11 12:28:12.563 993 ERROR neutron.plugins.ml2.drivers.agent._common_agent     return f(*args, **kwargs)
2017-06-11 12:28:12.563 993 ERROR neutron.plugins.ml2.drivers.agent._common_agent   File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/agent/_common_agent.py", line 227, in treat_devices_added_updated
2017-06-11 12:28:12.563 993 ERROR neutron.plugins.ml2.drivers.agent._common_agent     self._process_device_if_exists(device_details)
2017-06-11 12:28:12.563 993 ERROR neutron.plugins.ml2.drivers.agent._common_agent   File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/agent/_common_agent.py", line 254, in _process_device_if_exists
2017-06-11 12:28:12.563 993 ERROR neutron.plugins.ml2.drivers.agent._common_agent     device, device_details['device_owner'])
2017-06-11 12:28:12.563 993 ERROR neutron.plugins.ml2.drivers.agent._common_agent   File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py", line 502, in plug_interface
2017-06-11 12:28:12.563 993 ERROR neutron.plugins.ml2.drivers.agent._common_agent     tap_name, device_owner)
2017-06-11 12:28:12.563 993 ERROR neutron.plugins.ml2.drivers.agent._common_agent   File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py", line 451, in add_tap_interface
2017-06-11 12:28:12.563 993 ERROR neutron.plugins.ml2.drivers.agent._common_agent     return False
2017-06-11 12:28:12.563 993 ERROR neutron.plugins.ml2.drivers.agent._common_agent   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2017-06-11 12:28:12.563 993 ERROR neutron.plugins.ml2.drivers.agent._common_agent     self.force_reraise()
2017-06-11 12:28:12.563 993 ERROR neutron.plugins.ml2.drivers.agent._common_agent   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2017-06-11 12:28:12.563 993 ERROR neutron.plugins.ml2.drivers.agent._common_agent     six.reraise(self.type_, self.value, self.tb)
2017-06-11 12:28:12.563 993 ERROR neutron.plugins.ml2.drivers.agent._common_agent   File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py", line 443, in add_tap_interface
2017-06-11 12:28:12.563 993 ERROR neutron.plugins.ml2.drivers.agent._common_agent     tap_device_name, device_owner)
2017-06-11 12:28:12.563 993 ERROR neutron.plugins.ml2.drivers.agent._common_agent   File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py", line 474, in _add_tap_interface
2017-06-11 12:28:12.563 993 ERROR neutron.plugins.ml2.drivers.agent._common_agent     segmentation_id):
2017-06-11 12:28:12.563 993 ERROR neutron.plugins.ml2.drivers.agent._common_agent   File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py", line 427, in ensure_physical_in_bridge
2017-06-11 12:28:12.563 993 ERROR neutron.plugins.ml2.drivers.agent._common_agent     physical_interface)
2017-06-11 12:28:12.563 993 ERROR neutron.plugins.ml2.drivers.agent._common_agent   File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py", line 245, in ensure_flat_bridge
2017-06-11 12:28:12.563 993 ERROR neutron.plugins.ml2.drivers.agent._common_agent     gateway):
2017-06-11 12:28:12.563 993 ERROR neutron.plugins.ml2.drivers.agent._common_agent   File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py", line 386, in ensure_bridge
2017-06-11 12:28:12.563 993 ERROR neutron.plugins.ml2.drivers.agent._common_agent     self.update_interface_ip_details(bridge_name, interface, ips, gateway)
2017-06-11 12:28:12.563 993 ERROR neutron.plugins.ml2.drivers.agent._common_agent   File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py", line 325, in update_interface_ip_details
2017-06-11 12:28:12.563 993 ERROR neutron.plugins.ml2.drivers.agent._common_agent     dst_device.addr.add(cidr=ip['cidr'])
2017-06-11 12:28:12.563 993 ERROR neutron.plugins.ml2.drivers.agent._common_agent   File "/usr/lib/python2.7/site-packages/neutron/agent/linux/ip_lib.py", line 579, in add
2017-06-11 12:28:12.563 993 ERROR neutron.plugins.ml2.drivers.agent._common_agent     self._as_root([net.version], tuple(args))
2017-06-11 12:28:12.563 993 ERROR neutron.plugins.ml2.drivers.agent._common_agent   File "/usr/lib/python2.7/site-packages/neutron/agent/linux/ip_lib.py", line 363, in _as_root
2017-06-11 12:28:12.563 993 ERROR neutron.plugins.ml2.drivers.agent._common_agent     use_root_namespace=use_root_namespace)
2017-06-11 12:28:12.563 993 ERROR neutron.plugins.ml2.drivers.agent._common_agent   File "/usr/lib/python2.7/site-packages/neutron/agent/linux/ip_lib.py", line 99, in _as_root
2017-06-11 12:28:12.563 993 ERROR neutron.plugins.ml2.drivers.agent._common_agent     log_fail_as_error=self.log_fail_as_error)
2017-06-11 12:28:12.563 993 ERROR neutron.plugins.ml2.drivers.agent._common_agent   File "/usr/lib/python2.7/site-packages/neutron/agent/linux/ip_lib.py", line 108, in _execute
2017-06-11 12:28:12.563 993 ERROR neutron.plugins.ml2.drivers.agent._common_agent     log_fail_as_error=log_fail_as_error)
2017-06-11 12:28:12.563 993 ERROR neutron.plugins.ml2.drivers.agent._common_agent   File "/usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py", line 152, in execute
2017-06-11 12:28:12.563 993 ERROR neutron.plugins.ml2.drivers.agent._common_agent     "Stdout: %(stdout)s; "
2017-06-11 12:28:12.563 993 ERROR neutron.plugins.ml2.drivers.agent._common_agent ProcessExecutionError: Exit code: 2; Stdin: ; Stdout: ; Stderr: RTNETLINK answers: File exists
2017-06-11 12:28:12.563 993 ERROR neutron.plugins.ml2.drivers.agent._common_agent 
2017-06-11 12:28:12.563 993 ERROR neutron.plugins.ml2.drivers.agent._common_agent 

 

調査

愚直にいく。
Pythonトレースログの最初のスクリプトから追ってみる。
 
File "/usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py", line 152, in execute
この152行目付近は以下の通り。
 

118	def execute(cmd, process_input=None, addl_env=None,
119	            check_exit_code=True, return_stderr=False, log_fail_as_error=True,
120	            extra_ok_codes=None, run_as_root=False):
121	
122	    try:
123	        if process_input is not None:
124	            _process_input = encodeutils.to_utf8(process_input)
125	        else:
126	            _process_input = None
127	        if run_as_root and cfg.CONF.AGENT.root_helper_daemon:
128	            returncode, _stdout, _stderr = (
129	                execute_rootwrap_daemon(cmd, process_input, addl_env))
130	        else:
131	            obj, cmd = create_process(cmd, run_as_root=run_as_root,
132	                                      addl_env=addl_env)
133	            _stdout, _stderr = obj.communicate(_process_input)
134	            returncode = obj.returncode
135	            obj.stdin.close()
136	        _stdout = helpers.safe_decode_utf8(_stdout)
137	        _stderr = helpers.safe_decode_utf8(_stderr)
138	
139	        extra_ok_codes = extra_ok_codes or []
140	        if returncode and returncode not in extra_ok_codes:
141	            msg = _("Exit code: %(returncode)d; "
142	                    "Stdin: %(stdin)s; "
143	                    "Stdout: %(stdout)s; "
144	                    "Stderr: %(stderr)s") % {
145	                        'returncode': returncode,
146	                        'stdin': process_input or '',
147	                        'stdout': _stdout,
148	                        'stderr': _stderr}
149	
150	            if log_fail_as_error:
151	                LOG.error(msg)
152	            if check_exit_code:
153	                raise ProcessExecutionError(msg, returncode=returncode)
154	        else:
155	            LOG.debug("Exit code: %d", returncode)

 
raise ProcessExecutionError(msg, returncode=returncode)
ここで例外発生。関数executeの一部。コマンドとか実行してそう。
例外発生時のcmdの中身を確認してみる。
 
/etc/neutron/neutron.confに以下パラメタ設定

debug = True

 
utils.pyの例外発生直前(152行目)に以下コードを追加。
 

LOG.debug(' '.join(cmd) + " TEST HOGE DEBUG11: %d", 9999)

 
修正後、neutron関連サービスの再起動が必要。
 

# systemctl list-unit-files | egrep neutron | egrep enabled | awk '{print $1}' | xargs -i systemctl restart {}

 
デバッグログ確認。cmdのコマンドが特定。
 

2017-06-11 18:00:20.080 11363 DEBUG neutron.agent.linux.utils [req-f089e83d-afeb-443b-b472-2cc0dedc9d3a - - - - -] ip -4 addr add 192.168.0.100/24 scope global dev brq99c79baa-9a brd 192.168.0.255 TEST HOGE DEBUG11: 9999 execute /usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py:153

 
このコマンドは何をやるもの?試しにシェルから実行すると、、、
 

# ip -4 addr add 192.168.0.100/24 scope global dev brq99c79baa-9a brd 192.168.0.255
RTNETLINK answers: File exists

 
エラーログのメッセージと同じ。IPアドレスをbrq99c79baa-9aに設定しようとしてる。
192.168.0.100は、このサーバ自体のIPアドレス。brq99c79baa-9aには重複するので設定できないはず。
なぜ、このコマンドが呼び出されるのか?
 
どのプロセスが、繰り返しこの操作をしてくるのか?
 

# ps -elfH | egrep "/var/log/neutron/linuxbridge-agent.log"
0 S root     10371  1529  0  80   0 - 28162 pipe_w 19:56 pts/0    00:00:00         grep -E --color=auto /var/log/neutron/linuxbridge-agent.log
4 S neutron   1049     1  3  80   0 - 84542 ep_pol 19:31 ?        00:00:56   /usr/bin/python2 /usr/bin/neutron-linuxbridge-agent --config-file /usr/share/neutron/neutron-dist.conf --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/ml2/linuxbridge_agent.ini --config-dir /etc/neutron/conf.d/common --config-dir /etc/neutron/conf.d/neutron-linuxbridge-agent --log-file /var/log/neutron/linuxbridge-agent.log

 
結局、以下の操作とリンクしてエラーログが発生することが判明。
・外部ネットワークのサブネット作成し、DHCP有効
ルーターを作成しゲートウェイ設定
 
1NICの環境だからかも。