OpenCSW Bug Tracker


ID: 0004546
Category: [orca_web] regular use
Severity: major
Reproducibility: always
Date Submitted: 2010-09-08 16:59
Last Update: 2011-11-22 21:26
Reporter: GlenG
View Status: public
Assigned To: dam
Priority: normal
Resolution: open
Status: acknowledged
Summary 0004546: network interface with greater than 1 Gbit of bandwidth does not plot correctly
Description (This is a copy of my post to 'orca-users@orcaware.com')

I'm trying to add a plot for a network interface with more than 1 Gbit of bandwidth. I'm getting a plot, but the y axis tops out at 1000 Mb/s, and when the interface is receiving > 1000 Mb/s that variable is not plotted (the Max. value in the text portion does not report the larger values either). Here's my plot configuration. I copied a working 1 Gbit plot and changed data_max from 1000000000 to 2000000000; I also tried deleting data_max.
 
# Interface bits per second for > 1 Gbit interfaces.
plot {
title %g Interface Bits Per Second: $1
source orcallator
data 1024 * 8 * ((?:(?:aggr))\d+)InKB/s
data 1024 * 8 * $1OuKB/s
line_type area
line_type line1
legend Input
legend Output
y_legend Bits/s
data_min 0
data_max 2000000000
plot_width 800
href http://www.orcaware.com/orca/docs/orcallator.html#interface_bits_per_second
}
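
A likely reason that raising or removing data_max in the config alone does not change the plot: Orca appears to use data_min/data_max as the data-source minimum/maximum when it first creates an RRD file, and RRDtool stores any sample outside that range as unknown, so an RRD created while data_max was 1000000000 will keep discarding larger values regardless of the current config. A quick check (sketch only; the path below mirrors the file names listed later in this report and may differ on other hosts):

# Print the maximum recorded in an existing RRD; samples above it are stored as unknown.
rrdtool info /var/opt/csw/orca/rrd/orcallator/o_beaker/gauge_1024_X_8_X_aggr1InKB_per_s.rrd | grep '\.max'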
Additional Information I've uploaded an orcallator collection file containing data with bandwidth utilization > 1Gbit.
Tags: No tags attached.
Attached Files:
beaker.saved.orcallator-2010-09-02-000 (974,098 bytes) 2010-09-08 16:59
o_beaker_gauge_1024_X_8_X_aggr1InKB_per_s,__1024_X_8_X_aggr1OuKB_per_s-daily.png (29,060 bytes) 2010-09-08 17:01
B.o_beaker_gauge_1024_X_8_X_aggr1InKB_per_s,__1024_X_8_X_aggr1OuKB_per_s-daily.png (28,241 bytes) 2010-09-13 20:49

Relationships

Notes
(0008278)
GlenG (reporter)
2010-09-08 17:03

The png file is not from plotting the attached orcallator file.
(0008286)
GlenG (reporter)
2010-09-13 21:00

I got the plot to work by:

1. removing data_max from orcallator.cfg

# Interface bits per second for > 1 Gbit interfaces.
# data_max 2000000000
plot {
title %g Interface Bits Per Second: $1
source orcallator
data 1024 * 8 * ((?:(?:aggr))\d+)InKB/s
data 1024 * 8 * $1OuKB/s
line_type area
line_type line1
legend Input
legend Output
y_legend Bits/s
data_min 0
plot_width 800
href http://www.orcaware.com/orca/docs/orcallator.html#interface_bits_per_second
}

2. killing Orca master:

pkill orca

3. deleting rrd files:

ls -ltrh /var/opt/csw/orca/rrd/orcallator/o_beaker | grep aggr
-rw-r--r-- 1 root root 49K Sep 2 14:43 gauge_1024_X_18_X_aggr1OuKB_per_s.rrd
-rw-r--r-- 1 root root 49K Sep 2 14:43 gauge_1024_X_18_X_aggr1InKB_per_s.rrd
-rw-r--r-- 1 root root 49K Sep 13 11:23 gauge_1024_X_8_X_aggr1InKB_per_s.rrd
-rw-r--r-- 1 root root 49K Sep 13 11:23 gauge_1024_X_8_X_aggr1OuKB_per_s.rrd
-rw-r--r-- 1 root root 49K Sep 13 11:50 gauge_aggr1Coll_pct.rrd
-rw-r--r-- 1 root root 49K Sep 13 11:50 gauge_aggr1Defr_per_s.rrd
-rw-r--r-- 1 root root 49K Sep 13 11:50 gauge_aggr1IErr_per_s.rrd
-rw-r--r-- 1 root root 49K Sep 13 11:50 gauge_aggr1InDtSz_per_p.rrd
-rw-r--r-- 1 root root 49K Sep 13 11:50 gauge_aggr1InOvH_pct_per_p.rrd
-rw-r--r-- 1 root root 49K Sep 13 11:50 gauge_aggr1Ipkt_per_s.rrd
-rw-r--r-- 1 root root 49K Sep 13 11:50 gauge_aggr1NoCP_per_s.rrd
-rw-r--r-- 1 root root 49K Sep 13 11:50 gauge_aggr1OErr_per_s.rrd
-rw-r--r-- 1 root root 49K Sep 13 11:50 gauge_aggr1Opkt_per_s.rrd
-rw-r--r-- 1 root root 49K Sep 13 11:50 gauge_aggr1OuDtSz_per_p.rrd
-rw-r--r-- 1 root root 49K Sep 13 11:50 gauge_aggr1OuOvH_pct_per_p.rrd

cd /var/opt/csw/orca/rrd/orcallator/o_beaker
rm gauge_1024_X_18_X_aggr1OuKB_per_s.rrd gauge_1024_X_18_X_aggr1InKB_per_s.rrd gauge_1024_X_8_X_aggr1InKB_per_s.rrd gauge_1024_X_8_X_aggr1OuKB_per_s.rrd

4. restarting master

/opt/csw/bin/orca -d /opt/csw/etc/orcallator.cfg


I attached an "after" plot file.


If I should have done something else please let me know.
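
A possible alternative to steps 2-4 that would keep the accumulated history, assuming the only problem is the old 1 Gbit maximum baked into the RRDs (untested sketch; the data-source names are read from the files rather than guessed):

cd /var/opt/csw/orca/rrd/orcallator/o_beaker
for f in gauge_1024_X_8_X_aggr1InKB_per_s.rrd gauge_1024_X_8_X_aggr1OuKB_per_s.rrd; do
  # List each data source in the file, then lift its maximum (U = no upper bound).
  for ds in `rrdtool info "$f" | sed -n 's/^ds\[\(.*\)\]\.type.*/\1/p'`; do
    rrdtool tune "$f" --maximum "$ds":U
  done
done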
(0009428)
GlenG (reporter)
2011-11-21 16:23

This appears to be working for me. Did I fail to update this issue correctly?

Thanks,
GlenG
(PS: I am still having trouble(?) with the Orca master crashing after about 10 days. I believe this is the result of a memory leak. SMF restarts it, so I guess in some ways the trouble is minimal.)
(0009429)
dam (administrator)
2011-11-21 17:14

I happen to have a machine with nxge in the lab just now, which allows me to fix other issues while I am at it.
How high should aggr be? If you bundle up 4 gigabit interfaces it should be even higher. 10 GbE? Or 20 GbE for trunked 10 GbE interfaces?

Regarding the crash: I can't promise when I will have a reasonable amount of time to look into this. Nonetheless I am working on fixing all SE/orcallator issues in one go now (apart from the leak). IIRC the open items were splitting off orcallator to limit dependencies on server machines, and the new nxge interfaces. Any other issues you have for a new release?

Best regards

  -- Dago
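
For reference, rough data_max values if one did want a fixed ceiling, assuming the y axis stays in bits per second: a 4 x 1 GbE aggregation needs at least 4000000000, a single 10 GbE interface 10000000000, and a 2 x 10 GbE trunk 20000000000. Leaving data_max out avoids having to guess, which is where the discussion below ends up.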
(0009431)
GlenG (reporter)
2011-11-22 21:03

(sorry for the delay in replying - major application software upgrades over the weekend)

>>How high should aggr be?
As the number of NICs in an aggr can change dynamically, I think the best choice is to let the max float.

The things that I have run into:
1. the single-threadedness of the master means my graphs tend to update less frequently than I would like
2. a memory leak leading to an abend/dump (at this point /var is a little too small and it fills up until a cron task moves the dump elsewhere). Note: the suggestion from the orca list is that this is a Perl-on-Solaris 10 problem
3. dynamic change in the number of CPUs causes collection to fail (T5220 with LDoms)
4. restarting csworca puts the service into maintenance state, although clearing it allows startup. The console session from today follows:

ex=0 11:30:27 fozzie ~ gunselmg $sudo /usr/sbin/svcadm -v restart svc:/network/csworca:default
Action restart set for svc:/network/csworca:default.
ex=1 11:30:50 fozzie ~ gunselmg $sudo svcs -l svc:/network/csworca
fmri svc:/network/csworca:default
enabled true
state online
next_state offline
state_time Tue Nov 22 11:30:30 2011
logfile /var/svc/log/network-csworca:default.log
restarter svc:/system/svc/restarter:default
contract_id 908171
dependency require_all/none svc:/system/filesystem/local (online)
dependency require_all/none svc:/network/loopback (online)
ex=0 11:31:04 fozzie ~ gunselmg $sudo svcs -l svc:/network/csworca
fmri svc:/network/csworca:default
enabled true
state maintenance
next_state none
state_time Tue Nov 22 11:31:31 2011
logfile /var/svc/log/network-csworca:default.log
restarter svc:/system/svc/restarter:default
contract_id 908171
dependency require_all/none svc:/system/filesystem/local (online)
dependency require_all/none svc:/network/loopback (online)
ex=1 11:32:06 fozzie ~ gunselmg $sudo vi /var/svc/log/network-csworca:default.log
"/var/svc/log/network-csworca:default.log" 749 lines, 56006 characters
...
Version string '1.05 ' contains invalid data; ignoring: ' ' at /opt/csw/bin/orca line 66.
[ Nov 22 11:30:30 Stopping because service restarting. ]
[ Nov 22 11:30:30 Executing stop method ("/var/opt/csw/svc/method/svc-csworca stop") ]
/var/opt/csw/svc/method/svc-csworca: kill: no such process
[ Nov 22 11:30:30 Method "stop" exited with status 0 ]
[ Nov 22 11:31:30 Method or service exit timed out. Killing contract 908171 ]
[ Nov 22 11:31:31 Method or service exit timed out. Killing contract 908171 ]
:q
ex=2 11:32:22 fozzie ~ gunselmg $sudo /usr/sbin/svcadm -v clear svc:/network/csworca:default
Action maint_off set for svc:/network/csworca:default.
ex=0 11:32:29 fozzie ~ gunselmg $sudo svcs -l svc:/network/csworca
fmri svc:/network/csworca:default
enabled true
state online
next_state none
state_time Tue Nov 22 11:32:29 2011
logfile /var/svc/log/network-csworca:default.log
restarter svc:/system/svc/restarter:default
contract_id 955499
dependency require_all/none svc:/system/filesystem/local (online)
dependency require_all/none svc:/network/loopback (online)
ex=0 11:32:36 fozzie ~ gunselmg $
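
The log above suggests the stop method exits successfully without actually stopping the processes in the contract ("kill: no such process" points at a stale or wrong pid), so SMF waits out the exit timeout, kills the contract, and the instance lands in maintenance. A hedged diagnostic sketch (standard SMF commands; the 120-second value is only an example, and the real fix may be in the stop method script itself):

# Check the stop timeout currently configured for the service.
svcprop -p stop/timeout_seconds svc:/network/csworca:default
# Raise it if the orca master legitimately needs longer to shut down.
svccfg -s svc:/network/csworca setprop stop/timeout_seconds = count: 120
svcadm refresh svc:/network/csworca:default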
(0009432)
dam (administrator)
2011-11-22 21:26

Hi Glen,

> 1. the single-threadedness of the master means my graphs tend to update less frequently than I would like

This is not easy to overcome. While there are solutions like Parallel::ForkManager, it would probably mean restructuring large chunks of the code. As a workaround I usually partition the monitored machines and run multiple instances of orcaweb at the same time.
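
A minimal sketch of that partitioning workaround, assuming the configuration has been split by hand into per-group files (the file names below are illustrative, not shipped by the package), each master watching a different set of clients:

# One orca master per group of monitored machines, each with its own config.
/opt/csw/bin/orca -d /opt/csw/etc/orcallator-groupA.cfg
/opt/csw/bin/orca -d /opt/csw/etc/orcallator-groupB.cfg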

> 2. a memory leak leading to an abend/dump (at this point /var is a little too small and it fills up until a cron task moves the dump elsewhere). Note: the suggestion from the orca list is that this is a Perl-on-Solaris 10 problem

If you happen to have a core, it would be nice if you could link to it. Maybe I can get something out of it.

> 3. dynamic change in the number of CPUs causes collection to fail (T5220 with LDoms)

Why should this happen? Is this related to the code or a general restriction of the reconfiguration?

> 4. restarting csworca puts the service into maintenance state, although clearing it allows startup. The console session from today follows:

I thought this was fixed in 0004505?

