OpenCSW Bug Tracker


ID: 0004546
Category: [orca_web] regular use
Severity: major
Reproducibility: always
Date Submitted: 2010-09-08 16:59
Last Update: 2011-11-22 21:26
Reporter: GlenG
View Status: public
Assigned To: dam
Priority: normal
Resolution: open
Status: acknowledged
Summary 0004546: network interface with greater than 1 Gbit of bandwidth does not plot correctly
Description (This is a copy of my post to 'orca-users@orcaware.com')

I'm trying to add a plot for a network interface with more than 1 Gbit of bandwidth. I'm getting a plot, but the y axis tops out at 1000 Mb/s, and when the interface is receiving > 1000 Mb/s that variable is not plotted (the Max. value in the text portion does not report the larger values either). Here's my plot configuration. I copied a working 1 Gbit plot and changed data_max from 1000000000 to 2000000000; I also tried deleting data_max.
 
# Interface bits per second for > 1 Gbit interfaces.
plot {
title %g Interface Bits Per Second: $1
source orcallator
data 1024 * 8 * ((?:(?:aggr))\d+)InKB/s
data 1024 * 8 * $1OuKB/s
line_type area
line_type line1
legend Input
legend Output
y_legend Bits/s
data_min 0
data_max 2000000000
plot_width 800
href http://www.orcaware.com/orca/docs/orcallator.html#interface_bits_per_second
}
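
A likely reason that raising or removing data_max in the config alone does not change the plot: Orca appears to use data_min/data_max as the data-source minimum/maximum when it first creates an RRD file, and RRDtool stores any sample outside that range as unknown, so an RRD created while data_max was 1000000000 will keep discarding larger values regardless of the current config. A quick check (sketch only; the path below mirrors the file names listed later in this report and may differ on other hosts):

# Print the maximum recorded in an existing RRD; samples above it are stored as unknown.
rrdtool info /var/opt/csw/orca/rrd/orcallator/o_beaker/gauge_1024_X_8_X_aggr1InKB_per_s.rrd | grep '\.max'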
Additional Information I've uploaded an orcallator collection file containing data with bandwidth utilization > 1Gbit.
Tags: No tags attached.
Attached Files:
beaker.saved.orcallator-2010-09-02-000 (974,098 bytes) 2010-09-08 16:59
o_beaker_gauge_1024_X_8_X_aggr1InKB_per_s,__1024_X_8_X_aggr1OuKB_per_s-daily.png (29,060 bytes) 2010-09-08 17:01
B.o_beaker_gauge_1024_X_8_X_aggr1InKB_per_s,__1024_X_8_X_aggr1OuKB_per_s-daily.png (28,241 bytes) 2010-09-13 20:49

Relationships

Notes
(0008278)
GlenG (reporter)
2010-09-08 17:03

The png file is not from plotting the attached orcallator file.
(0008286)
GlenG (reporter)
2010-09-13 21:00

I got the plot to work by:

1. removing data_max from orcallator.cfg

# Interface bits per second for > 1 Gbit interfaces.
# data_max 2000000000
plot {
title %g Interface Bits Per Second: $1
source orcallator
data 1024 * 8 * ((?:(?:aggr))\d+)InKB/s
data 1024 * 8 * $1OuKB/s
line_type area
line_type line1
legend Input
legend Output
y_legend Bits/s
data_min 0
plot_width 800
href http://www.orcaware.com/orca/docs/orcallator.html#interface_bits_per_second
}

2. killing Orca master:

pkill orca

3. deleting rrd files:

ls -ltrh /var/opt/csw/orca/rrd/orcallator/o_beaker | grep aggr
-rw-r--r-- 1 root root 49K Sep 2 14:43 gauge_1024_X_18_X_aggr1OuKB_per_s.rrd
-rw-r--r-- 1 root root 49K Sep 2 14:43 gauge_1024_X_18_X_aggr1InKB_per_s.rrd
-rw-r--r-- 1 root root 49K Sep 13 11:23 gauge_1024_X_8_X_aggr1InKB_per_s.rrd
-rw-r--r-- 1 root root 49K Sep 13 11:23 gauge_1024_X_8_X_aggr1OuKB_per_s.rrd
-rw-r--r-- 1 root root 49K Sep 13 11:50 gauge_aggr1Coll_pct.rrd
-rw-r--r-- 1 root root 49K Sep 13 11:50 gauge_aggr1Defr_per_s.rrd
-rw-r--r-- 1 root root 49K Sep 13 11:50 gauge_aggr1IErr_per_s.rrd
-rw-r--r-- 1 root root 49K Sep 13 11:50 gauge_aggr1InDtSz_per_p.rrd
-rw-r--r-- 1 root root 49K Sep 13 11:50 gauge_aggr1InOvH_pct_per_p.rrd
-rw-r--r-- 1 root root 49K Sep 13 11:50 gauge_aggr1Ipkt_per_s.rrd
-rw-r--r-- 1 root root 49K Sep 13 11:50 gauge_aggr1NoCP_per_s.rrd
-rw-r--r-- 1 root root 49K Sep 13 11:50 gauge_aggr1OErr_per_s.rrd
-rw-r--r-- 1 root root 49K Sep 13 11:50 gauge_aggr1Opkt_per_s.rrd
-rw-r--r-- 1 root root 49K Sep 13 11:50 gauge_aggr1OuDtSz_per_p.rrd
-rw-r--r-- 1 root root 49K Sep 13 11:50 gauge_aggr1OuOvH_pct_per_p.rrd

cd /var/opt/csw/orca/rrd/orcallator/o_beaker
rm gauge_1024_X_18_X_aggr1OuKB_per_s.rrd gauge_1024_X_18_X_aggr1InKB_per_s.rrd gauge_1024_X_8_X_aggr1InKB_per_s.rrd gauge_1024_X_8_X_aggr1OuKB_per_s.rrd

4. restarting master

/opt/csw/bin/orca -d /opt/csw/etc/orcallator.cfg


I attached an "after" plot file.


If I should have done something else please let me know.
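
A possible alternative to steps 2-4 that would keep the accumulated history, assuming the only problem is the old 1 Gbit maximum baked into the RRDs (untested sketch; the data-source names are read from the files rather than guessed):

cd /var/opt/csw/orca/rrd/orcallator/o_beaker
for f in gauge_1024_X_8_X_aggr1InKB_per_s.rrd gauge_1024_X_8_X_aggr1OuKB_per_s.rrd; do
  # List each data source in the file, then lift its maximum (U = no upper bound).
  for ds in `rrdtool info "$f" | sed -n 's/^ds\[\(.*\)\]\.type.*/\1/p'`; do
    rrdtool tune "$f" --maximum "$ds":U
  done
done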
(0009428)
GlenG (reporter)
2011-11-21 16:23

This appears to be working for me. Did I fail to update this issue correctly?

Thanks,
GlenG
(PS: I am still having trouble(?) with the Orca master crashing after about 10 days. I believe this is the result of a memory leak. SMF restarts it, so I guess in some ways the trouble is minimal.)
(0009429)
dam (administrator)
2011-11-21 17:14

I happen to have a machine with nxge in the lab just now, which allows me to fix other issues while I am at it.
How high should aggr be? If you bundle up 4 gigabit interfaces it should be even higher. 10 GbE? Or 20 GbE for trunked 10 GbE interfaces?

Regarding the crash: I can't promise when I will have a reasonable amount of time to look into this. Nonetheless I am working on fixing all SE/orcallator issues in one go now (apart from the leak). IIRC the open items were splitting off orcallator to limit dependencies on server machines, and the new nxge interfaces. Any other issues you have for a new release?

Best regards

  -- Dago
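
For reference, rough data_max values if one did want a fixed ceiling, assuming the y axis stays in bits per second: a 4 x 1 GbE aggregation needs at least 4000000000, a single 10 GbE interface 10000000000, and a 2 x 10 GbE trunk 20000000000. Leaving data_max out avoids having to guess, which is where the discussion below ends up.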
(0009431)
GlenG (reporter)
2011-11-22 21:03

(sorry for the delay in replying - major application software upgrades over the weekend)

>>How high should aggr be?
As the number of NICs in an aggr can change dynamically, I think the best choice is to let the max float.

The things that I have run into:
1. the single-threadedness of the master means my graphs tend to update less frequently than I would like
2. a memory leak leading to an abend/dump (at this point /var is a little too small and it fills up until a cron task moves the dump elsewhere). Note: the suggestion from the orca list is that this is a Perl-on-Solaris 10 problem
3. dynamic change in the number of CPUs causes collection to fail (T5220 with LDoms)
4. restarting csworca puts the service into maintenance state, although clearing it allows startup. The console session from today follows:

ex=0 11:30:27 fozzie ~ gunselmg $sudo /usr/sbin/svcadm -v restart svc:/network/csworca:default
Action restart set for svc:/network/csworca:default.
ex=1 11:30:50 fozzie ~ gunselmg $sudo svcs -l svc:/network/csworca
fmri svc:/network/csworca:default
enabled true
state online
next_state offline
state_time Tue Nov 22 11:30:30 2011
logfile /var/svc/log/network-csworca:default.log
restarter svc:/system/svc/restarter:default
contract_id 908171
dependency require_all/none svc:/system/filesystem/local (online)
dependency require_all/none svc:/network/loopback (online)
ex=0 11:31:04 fozzie ~ gunselmg $sudo svcs -l svc:/network/csworca
fmri svc:/network/csworca:default
enabled true
state maintenance
next_state none
state_time Tue Nov 22 11:31:31 2011
logfile /var/svc/log/network-csworca:default.log
restarter svc:/system/svc/restarter:default
contract_id 908171
dependency require_all/none svc:/system/filesystem/local (online)
dependency require_all/none svc:/network/loopback (online)
ex=1 11:32:06 fozzie ~ gunselmg $sudo vi /var/svc/log/network-csworca:default.log
"/var/svc/log/network-csworca:default.log" 749 lines, 56006 characters
...
Version string '1.05 ' contains invalid data; ignoring: ' ' at /opt/csw/bin/orca line 66.
[ Nov 22 11:30:30 Stopping because service restarting. ]
[ Nov 22 11:30:30 Executing stop method ("/var/opt/csw/svc/method/svc-csworca stop") ]
/var/opt/csw/svc/method/svc-csworca: kill: no such process
[ Nov 22 11:30:30 Method "stop" exited with status 0 ]
[ Nov 22 11:31:30 Method or service exit timed out. Killing contract 908171 ]
[ Nov 22 11:31:31 Method or service exit timed out. Killing contract 908171 ]
:q
ex=2 11:32:22 fozzie ~ gunselmg $sudo /usr/sbin/svcadm -v clear svc:/network/csworca:default
Action maint_off set for svc:/network/csworca:default.
ex=0 11:32:29 fozzie ~ gunselmg $sudo svcs -l svc:/network/csworca
fmri svc:/network/csworca:default
enabled true
state online
next_state none
state_time Tue Nov 22 11:32:29 2011
logfile /var/svc/log/network-csworca:default.log
restarter svc:/system/svc/restarter:default
contract_id 955499
dependency require_all/none svc:/system/filesystem/local (online)
dependency require_all/none svc:/network/loopback (online)
ex=0 11:32:36 fozzie ~ gunselmg $
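
The log above suggests the stop method exits successfully without actually stopping the processes in the contract ("kill: no such process" points at a stale or wrong pid), so SMF waits out the exit timeout, kills the contract, and the instance lands in maintenance. A hedged diagnostic sketch (standard SMF commands; the 120-second value is only an example, and the real fix may be in the stop method script itself):

# Check the stop timeout currently configured for the service.
svcprop -p stop/timeout_seconds svc:/network/csworca:default
# Raise it if the orca master legitimately needs longer to shut down.
svccfg -s svc:/network/csworca setprop stop/timeout_seconds = count: 120
svcadm refresh svc:/network/csworca:default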
(0009432)
dam (administrator)
2011-11-22 21:26

Hi Glen,

> 1. the single-threadedness of the master means my graphs tend to update less frequently than I would like

This is not easy to overcome. While there are solutions like Parallel::ForkManager, it would probably mean restructuring large chunks of the code. As a workaround I usually partition the monitored machines and run multiple instances of orcaweb at the same time.
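
A minimal sketch of that partitioning workaround, assuming the configuration has been split by hand into per-group files (the file names below are illustrative, not shipped by the package), each master watching a different set of clients:

# One orca master per group of monitored machines, each with its own config.
/opt/csw/bin/orca -d /opt/csw/etc/orcallator-groupA.cfg
/opt/csw/bin/orca -d /opt/csw/etc/orcallator-groupB.cfg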

> 2. a memory leak leading to an abend/dump (at this point /var is a little too small and it fills up until a cron task moves the dump elsewhere). Note: the suggestion from the orca list is that this is a Perl-on-Solaris 10 problem

If you happen to have a core, it would be nice if you could link to it. Maybe I can get something out of it.

> 3. dynamic change in the number of CPUs causes collection to fail (T5220 with LDoms)

Why should this happen? Is this related to the code or a general restriction of the reconfiguration?

> 4. restarting csworca puts the service into maintenance state, although clearing it allows startup. The console session from today follows:

I thought this was fixed in 0004505?

