OpenCSW Bug Tracker


ID 0005163
Category [squid] regular use
Severity crash
Reproducibility have not tried
Date Submitted 2014-04-11 23:13
Last Update 2016-04-04 15:10
Reporter hudesd
View Status public
Assigned To dam
Priority normal
Resolution fixed
Status closed
Summary 0005163: squid 3.4.4 crashes on Solaris 10
Description I have been using Squid 3.1 for quite a while with no problems. I recently upgraded all my CSW packages and Squid 3.4.4 came with the upgrade, with no alternative offered: it is in stable/unstable/testing.
The problem is that it is NOT stable: it exits after a while.
It's running as a service (cswsquid) as per the package.
This is on a T2000, Solaris 10 148888-05, with 8GB RAM and about 600GB of available disk space.
I made no change to the configuration between 3.1 and 3.4. I have subsequently tried both aufs and my original ufs (diskd isn't available), to no avail.
I increased the size of the disk and memory caches, also to no avail.
Squid will run happily as long as users are only tunneling through it; once some caching gets going with regular HTTP, it exits.

I'm not finding any core dumps in /var/opt/csw/squid/cache or the 00 directory under that.
I can provide squid config files and log files.
Additional Information
Tags No tags attached.
Attached Files squid.conf (4,220 bytes) 2014-04-11 23:22

Notes
(0010799)
dam (administrator)
2014-04-12 03:21

I am using 3.4.4,REV=2014.03.14 with no problems so far. Please check
  /var/svc/log/network-cswsquid:default.log
for messages.

However, 3.4.4 is only in testing and unstable, whereas stable still has 3.1:

root@web [web]:/var/opt/csw/squid/cache > ls -l /export/mirror/opencsw-official/*/sparc/5.10/squid-*
-rw-r--r-- 3 web web 2619826 Mar 14 14:10 /export/mirror/opencsw-official/bratislava/sparc/5.10/squid-3.4.4,REV=2014.03.14-SunOS5.10-sparc-CSW.pkg.gz
lrwxrwxrwx 1 web web 65 Mar 15 03:16 /export/mirror/opencsw-official/dublin/sparc/5.10/squid-2.7,REV=2010.10.05_STABLE9-SunOS5.9-sparc-CSW.pkg.gz -> ../5.9/squid-2.7,REV=2010.10.05_STABLE9-SunOS5.9-sparc-CSW.pkg.gz
-rw-r--r-- 2 web web 2483888 Sep 25 2012 /export/mirror/opencsw-official/kiel/sparc/5.10/squid-3.1,REV=2012.06.15_20-SunOS5.10-sparc-CSW.pkg.gz
-rw-r--r-- 3 web web 714982 Oct 8 2009 /export/mirror/opencsw-official/legacy/sparc/5.10/squid-2.6,REV=2007.09.02_STABLE15-SunOS5.8-sparc-CSW.pkg.gz
-rw-r--r-- 2 web web 2483888 Sep 25 2012 /export/mirror/opencsw-official/stable/sparc/5.10/squid-3.1,REV=2012.06.15_20-SunOS5.10-sparc-CSW.pkg.gz
-rw-r--r-- 3 web web 2619826 Mar 14 14:10 /export/mirror/opencsw-official/testing/sparc/5.10/squid-3.4.4,REV=2014.03.14-SunOS5.10-sparc-CSW.pkg.gz
-rw-r--r-- 3 web web 2619826 Mar 14 14:10 /export/mirror/opencsw-official/unstable/sparc/5.10/squid-3.4.4,REV=2014.03.14-SunOS5.10-sparc-CSW.pkg.gz
(0010806)
hudesd (reporter)
2014-04-23 19:09

The service method script needs to be updated: it is using -D which is deprecated and slated to be removed.
(0010809)
hudesd (reporter)
2014-04-24 00:09

[ Apr 23 13:08:23 Leaving maintenance because clear requested. ]
[ Apr 23 13:08:23 Enabled. ]
[ Apr 23 13:08:23 Executing start method ("/var/opt/csw/svc/method/svc-cswsquid start") ]
starting squid server.
[ Apr 23 13:08:23 Method "start" exited with status 0 ]
2014/04/23 13:08:23| WARNING: -D command-line option is obsolete.
[ Apr 23 16:08:25 Stopping because process dumped core. ]
[ Apr 23 16:08:26 Executing stop method ("/var/opt/csw/svc/method/svc-cswsquid stop") ]
squid server is already down
[ Apr 23 16:08:26 Method "stop" exited with status 0 ]
[ Apr 23 16:09:27 Method or service exit timed out. Killing contract 8579326 ]
(0010815)
dam (administrator)
2014-05-02 15:52

I now have an idea of what goes wrong: when retrieving something via FTP, squid dumps core. Here is the stack trace:


pstack core.squid.8044

core 'core.squid.8044' of 8044: (squid-1) -D
 fe6c8e07 _lwp_kill (1, 6, feffe248, fe670ff1) + 7
 fe670ffd raise (6, 0, feffe298, fe6487ad) + 25
 fe6487cd abort (0, 1, 2b, 8647430, fe766c80, fe762000) + f5
 082a2b2d _Z5deathi (b, 0, feffe3e0, fe69e537, fdf72a40, fe762000) + 1cd
 fe6c4b05 __sighndlr (b, 0, feffe3e0, 82a2960) + 15
 fe6b7eae call_user_handler (b) + 2d2
 fe6b8346 sigacthandler (b, 0, feffe3e0) + ee
 --- called from signal handler with signal 11 (SIGSEGV) ---
 083434c5 _ZNK2Ip7Address4portEv (4, fe762000, feffe998, fe661667, 965adf0, fe762000) + 15
 081c7c01 ???????? (913fdac, 845f909, feffea08, fe6bd29c, 97bb4a0, fe762000)
 081c97bf ???????? (913bca0, 25, 913fda8, feffeabc, b9, 25)
 081c8bd9 _ZN12FtpStateData18handleControlReplyEv (913bca0, 94c2048, 832f25b, 94c2040, 94c2020) + 149
 081ce2a2 _ZN13CommCbMemFunTI12FtpStateData14CommIoCbParamsE6doDialEv (94c203c, 94c2020, feffeb28, 81cdc03, 94c2040, fe762000) + 32
 081ce013 _ZN9JobDialerI12FtpStateDataE4dialER9AsyncCall (94c203c, 94c2020, feffeb58, fe661c5a, fe763098, 84a4020) + 33
 081ce178 _ZN10AsyncCallTI13CommCbMemFunTI12FtpStateData14CommIoCbParamsEE4fireEv (94c2020, 84a137f, feff1b7f, 81a9daf, 88000000, 4056e1fc) + 18
 0832d42d _ZN9AsyncCall4makeEv (94c2020, feffecc4, e650d871, fe76930c, cf5d3200, 8) + 3bd
 083317b6 _ZN14AsyncCallQueue8fireNextEv (86b1a70, feffecdc, feffec08, 82a0d4a, 86086f0, 0) + 1f6
 08331ba0 _ZN14AsyncCallQueue4fireEv (86b1a70, feffecb0, 1, 842e5f9, 40, 402e0000) + 30
 081aadd4 _ZN9EventLoop7runOnceEv (feffecdc, 402e0000, 1, feffecb0, 0, feffecb4) + 104
 081aaf70 _ZN9EventLoop3runEv (feffecdc, feffecb4, 0, 0, 402e0000, 1) + 20
 0822b2d4 _Z9SquidMainiPPc (2, feffed60, 84a4020, feffed1c, feffed3c, fe7fa8bc) + 14b4
 08430c7d main (2, feffed60, feffed6c) + 1d
 0811f2e0 _start (2, feffee38, feffee42, 0, 86b60b0, feffee5d) + 80

dam@unstable10s [unstable10s]:/home/dam/tmp > cat yyy | /opt/SUNWspro/bin/c++filt
core 'core.squid.8044' of 8044: (squid-1) -D
 fe6c8e07 _lwp_kill (1, 6, feffe248, fe670ff1) + 7
 fe670ffd raise (6, 0, feffe298, fe6487ad) + 25
 fe6487cd abort (0, 1, 2b, 8647430, fe766c80, fe762000) + f5
 082a2b2d death(int) (b, 0, feffe3e0, fe69e537, fdf72a40, fe762000) + 1cd
 fe6c4b05 __sighndlr (b, 0, feffe3e0, 82a2960) + 15
 fe6b7eae call_user_handler (b) + 2d2
 fe6b8346 sigacthandler (b, 0, feffe3e0) + ee
 --- called from signal handler with signal 11 (SIGSEGV) ---
 083434c5 Ip::Address::port() const (4, fe762000, feffe998, fe661667, 965adf0, fe762000) + 15
 081c7c01 ???????? (913fdac, 845f909, feffea08, fe6bd29c, 97bb4a0, fe762000)
 081c97bf ???????? (913bca0, 25, 913fda8, feffeabc, b9, 25)
 081c8bd9 FtpStateData::handleControlReply() (913bca0, 94c2048, 832f25b, 94c2040, 94c2020) + 149
 081ce2a2 CommCbMemFunT<FtpStateData, CommIoCbParams>::doDial() (94c203c, 94c2020, feffeb28, 81cdc03, 94c2040, fe762000) + 32
 081ce013 JobDialer<FtpStateData>::dial(AsyncCall&) (94c203c, 94c2020, feffeb58, fe661c5a, fe763098, 84a4020) + 33
 081ce178 AsyncCallT<CommCbMemFunT<FtpStateData, CommIoCbParams> >::fire() (94c2020, 84a137f, feff1b7f, 81a9daf, 88000000, 4056e1fc) + 18
 0832d42d AsyncCall::make() (94c2020, feffecc4, e650d871, fe76930c, cf5d3200, 8) + 3bd
 083317b6 AsyncCallQueue::fireNext() (86b1a70, feffecdc, feffec08, 82a0d4a, 86086f0, 0) + 1f6
 08331ba0 AsyncCallQueue::fire() (86b1a70, feffecb0, 1, 842e5f9, 40, 402e0000) + 30
 081aadd4 EventLoop::runOnce() (feffecdc, 402e0000, 1, feffecb0, 0, feffecb4) + 104
 081aaf70 EventLoop::run() (feffecdc, feffecb4, 0, 0, 402e0000, 1) + 20
 0822b2d4 SquidMain(int, char**) (2, feffed60, 84a4020, feffed1c, feffed3c, fe7fa8bc) + 14b4
 08430c7d main (2, feffed60, feffed6c) + 1d
 0811f2e0 _start (2, feffee38, feffee42, 0, 86b60b0, feffee5d) + 80

Digging further.
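
For reading traces like the one above, the frame of interest is the first one listed after the "called from signal handler" line. The following is a minimal Python sketch of that lookup, not part of the original report. The function name `faulting_frame`, the regular expression, and the shortened sample are assumptions made for illustration, based only on the pstack layout quoted in this note.

```python
import re

# Marker text as it appears in the pstack output quoted in this note.
SIGNAL_MARKER = "called from signal handler"

# Assumed frame layout: hex address, symbol, then a parenthesised argument list.
FRAME_RE = re.compile(r"^\s*([0-9a-f]{8,16})\s+(.*?)\s+\(")


def faulting_frame(pstack_text):
    """Return (address, symbol) of the first frame after the signal marker, or None."""
    seen_marker = False
    for line in pstack_text.splitlines():
        if SIGNAL_MARKER in line:
            seen_marker = True
            continue
        if seen_marker:
            match = FRAME_RE.match(line)
            if match:
                return match.group(1), match.group(2)
    return None


# Shortened excerpt of the demangled trace from this note.
SAMPLE = """\
 fe6b8346 sigacthandler (b, 0, feffe3e0) + ee
 --- called from signal handler with signal 11 (SIGSEGV) ---
 083434c5 Ip::Address::port() const (4, fe762000, feffe998, fe661667, 965adf0, fe762000) + 15
 081c7c01 ???????? (913fdac, 845f909, feffea08, fe6bd29c, 97bb4a0, fe762000)
"""

if __name__ == "__main__":
    print(faulting_frame(SAMPLE))
```

The sketch only locates the frame; it says nothing about why the call faulted.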
(0010816)
dam (administrator)
2014-05-02 16:32

I am pretty confident this is the same bug as this one:
  http://bugs.squid-cache.org/show_bug.cgi?id=4004
(0010819)
hudesd (reporter)
2014-05-05 17:49

While it is interesting that you found a bug with FTP tunneling, I don't have much of that in my organization. What I do have a LOT of is HTTPS tunneling. The HP SAN equipment likes to "phone home" a LOT -- of the last 80 entries in access.log, 68 are CONNECT requests to trilogy2.3pardata.com and 141 of the last 180 requests are CONNECT (the other major CONNECT source is Oracle Ops Center).
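
A tally like the one above can be reproduced with a few lines of Python. This is a rough sketch, not the method actually used for the figures in this note. It assumes access.log is in Squid's default native format, where the request method is the sixth whitespace-separated field; the function name `method_counts` and the sample lines are hypothetical.

```python
from collections import Counter, deque


def method_counts(lines, last_n):
    """Count request methods in the last `last_n` non-empty log lines."""
    # deque with maxlen keeps only the tail, so a large log is not held in memory.
    tail = deque((line for line in lines if line.strip()), maxlen=last_n)
    counts = Counter()
    for line in tail:
        fields = line.split()
        if len(fields) > 5:  # assumed native format: method is field 6
            counts[fields[5]] += 1
    return counts


# Hypothetical sample lines in the assumed native log format.
SAMPLE_LINES = [
    "1398100000.123    200 10.0.0.5 TCP_MISS/200 4500 CONNECT example.com:443 - DIRECT/192.0.2.10 -",
    "1398100001.456     35 10.0.0.6 TCP_MISS/200 1200 GET http://example.org/ - DIRECT/192.0.2.11 text/html",
    "1398100002.789    180 10.0.0.5 TCP_MISS/200 3900 CONNECT example.com:443 - DIRECT/192.0.2.10 -",
]

if __name__ == "__main__":
    print(method_counts(SAMPLE_LINES, 80))
```

Passing an open log file as `lines` and 80 as `last_n` would give the per-method counts for the last 80 entries; a customised `logformat` would need a different field index.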
(0010838)
dam (administrator)
2014-05-23 09:41

I pushed 3.4.5 in the meantime, which still crashes from time to time. Interestingly, the crashes vanish completely when squidguard is not used (that is, when no URL filtering is used at all).
(0010839)
hudesd (reporter)
2014-05-23 15:30

I found a problem. The configuration file from 3.1 had been overwritten by the default 3.4 file. I had changed the memory cache to 1G but had not increased the disk cache from the default 100 16 256. I thought that meant 100*16*256, but I reread the docs and found it is 100M. So the memory cache presumably fills, but it is bigger than the disk cache, so it can't swap everything out. So it exits, silently.
When I tried something similar on squid 3.1 as delivered with Solaris 11.1, squid complained about the configuration and refused to start (appropriate behavior).
Changing the disk cache to 1024 16 256 resolved that issue, but squid 3.4 should complain just as 3.1 did, not gamely try to work and then fail.

I'll see about 3.4.5 on a test machine.
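
The mismatch described above can be checked before starting squid. The following Python sketch is an illustration only. It assumes the `cache_dir <type> <directory> <Mbytes> <L1> <L2>` layout and a `cache_mem <number> <unit>` line with an explicit KB, MB or GB unit; the function name `cache_sizes_mb` and the sample configuration are hypothetical. It only compares the two sizes and does not confirm that the mismatch is what makes squid exit.

```python
# Conversion factors to megabytes for the units this sketch understands.
UNIT_TO_MB = {"KB": 1 / 1024, "MB": 1, "GB": 1024}


def cache_sizes_mb(conf_text):
    """Return (cache_mem in MB or None, total cache_dir size in MB)."""
    mem_mb = None
    disk_mb = 0
    for raw in conf_text.splitlines():
        fields = raw.split("#", 1)[0].split()  # drop comments, then tokenise
        if not fields:
            continue
        if fields[0] == "cache_mem" and len(fields) >= 3:
            factor = UNIT_TO_MB.get(fields[2].upper())
            if factor is not None:
                mem_mb = float(fields[1]) * factor
        elif fields[0] == "cache_dir" and len(fields) >= 4:
            disk_mb += int(fields[3])  # fourth token is the size in MB
    return mem_mb, disk_mb


# Hypothetical configuration mirroring the values mentioned in this note.
SAMPLE_CONF = """\
cache_mem 1 GB
# default disk cache: 100 MB, 16 first-level and 256 second-level directories
cache_dir ufs /var/opt/csw/squid/cache 100 16 256
"""

if __name__ == "__main__":
    mem, disk = cache_sizes_mb(SAMPLE_CONF)
    if mem is not None and mem > disk:
        print("cache_mem (%.0f MB) exceeds total cache_dir size (%d MB)" % (mem, disk))
```

With the disk cache changed to 1024 16 256, the two sizes come out equal and the warning is no longer printed.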
(0010908)
maciej (reporter)
2014-09-12 19:07

FYI this bug blocks the integration of squid from unstable to testing.
(0010925)
dam (administrator)
2014-09-27 22:49

@Maciej: I think it is good that it blocks integration as this is probably an important issue. I'll keep an eye on all open issues, however this one seems hard to fix.
(0010996)
dam (administrator)
2014-12-11 11:19

Meanwhile I released 3.4.10 and adjusted some build flags as reported in another bug. Can you please retry and see if the error is still present?
(0011034)
dam (administrator)
2015-05-04 13:21

Meanwhile 3.5.4 has been released to experimental:
  http://buildfarm.opencsw.org/experimental.html#squid
Please give it a try.
(0011129)
dam (administrator)
2016-04-04 15:10

I am not sure if this is fixed in the latest release, but this is definitely an upstream issue, so I'll close the issue for now here.

