TCPSocket delay problem

Matz,

I am running into a problem where it takes about 10 seconds for Ruby to make a
connection using TCPSocket. I've looked into this, and found that it is
making a call to getaddrinfo(), and encountering a long delay because of an
address that fails to resolve. Turning off reverse lookup has no effect,
because it isn't a reverse lookup, it is a forward lookup.

Any suggestions on how to work around this?

···

--
Seth Kurtzberg
M. I. S. Corp.
480-661-1849
seth@cql.com

Hi,

···

In message “TCPSocket delay problem” on 03/03/04, Seth Kurtzberg seth@cql.com writes:

Any suggestions on how to work around this?

How about using a dotted-decimal address, e.g.

TCPSocket.open("192.168.0.1", port)

or consult your sysadmin to set up proper /etc/resolv.conf,
/etc/nsswitch.conf, etc. ?

						matz.

Hi,

Any suggestions on how to work around this?

How about using a dotted-decimal address, e.g.

TCPSocket.open("192.168.0.1", port)

The dotted address is not available. That’s what DNS lookups are for, to
translate the symbolic name to an IP address.

or consult your sysadmin to set up proper /etc/resolv.conf,
/etc/nsswitch.conf, etc. ?

Please read my questions before responding with that sort of answer. I’m not
an idiot.

My system is not set up incorrectly. The names resolve properly using the dig
command, or in programs other than Ruby. And, in fact, for most names, Ruby
has no problem.

The problem only occurs with names that do not resolve. These particular
names are returned as part of an http document; when used in an http
retrieval operation they result in an HTTP redirection.

Again there is no delay when used in a tool or in programs in other languages.
Most likely this difference occurs because these other tools and programs
don’t use the getaddrinfo() call.

By the way, I tried to use the replacement getaddrinfo() call that is provided
for systems that don’t have getaddrinfo(). However, (1) compilation fails
because a function signature differs from the one in netdb.h, and (2) when
I force the compilation, a link error occurs.

···

On Tuesday 04 March 2003 05:55 am, Yukihiro Matsumoto wrote:

In message “TCPSocket delay problem” on 03/03/04, Seth Kurtzberg seth@cql.com writes:

  					matz.


Seth Kurtzberg
M. I. S. Corp.
480-661-1849
seth@cql.com

The problem only occurs with names that do not resolve. These particular
names are returned as part of an http document; when used in an http
retrieval operation they result in an HTTP redirection.

Can you give an example for such names ?

Guy Decoux

chuckle

Please read my questions before responding with that sort of
answer. I’m not an idiot.

The dotted address is not available. That’s what DNS lookups are
for, to translate the symbolic name to an IP address.

If you want some respect, perhaps you should try showing some.

···

=====

Yahoo IM: michael_s_campbell



Hi,

or consult your sysadmin to set up proper /etc/resolv.conf,
/etc/nsswitch.conf, etc. ?

Please read my questions before responding with that sort of answer. I’m not
an idiot.

I’m sorry if you feel offended; that was not my intention. But if
getaddrinfo() on your system does not work as you expect, I’m not the
right person to report it to.

The only workaround I can think of is to use “resolv-replace”, which
is a pure Ruby resolver. It is not fast at all, but at least other
threads can run while it is resolving. Try putting

require "resolv-replace"

at the top of your script.

						matz.
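For readers who want to see what calling the pure-Ruby resolver directly might look like, here is a minimal sketch. It is not matz's code. It assumes the standard "resolv" library; the helper name ipv4_addresses and its dns: keyword are invented for illustration. It asks for A (IPv4) records only, so it never sends the AAAA (IPv6) query that later posts in this thread identify as the slow one.

```ruby
require "resolv"

# Hypothetical helper (not part of Ruby): look up only A (IPv4) records
# with the pure-Ruby resolver, so no AAAA (IPv6) query is sent.
# `dns` is any object that responds to #getresources the way Resolv::DNS does.
def ipv4_addresses(name, dns: Resolv::DNS.new)
  dns.getresources(name, Resolv::DNS::Resource::IN::A)
     .map { |record| record.address.to_s }
end
```

The strings it returns are dotted-decimal addresses, so one of them could be handed to TCPSocket.open in place of the host name.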
···

In message “Re: TCPSocket delay problem” on 03/03/05, Seth Kurtzberg seth@cql.com writes:

Is it possible that the hosts in question have IPv6 addresses, and the
delay is actually in (erroneously) trying to use IPv6 on a system
which has only a valid IPv4 interface? (I ask this because
getaddrinfo can return IPv6 records, whereas gethostbyname et al.
cannot.)

Have you taken a packet dump of this to see what your host is actually
doing?

Ethan

···

On Wed, 5 Mar 2003 01:32:43 +0900, Seth Kurtzberg wrote:

Again there is no delay when used in a tool or in programs in
other languages. Most likely this difference occurs because these
other tools and programs don’t use the getaddrinfo() call.


Happiness is a belt-fed weapon.

ad.doubleclick.net

···

On Tuesday 04 March 2003 09:43 am, ts wrote:

The problem only occurs with names that do not resolve. These particular
names are returned as part of an http document; when used in an http
retrieval operation they result in an HTTP redirection.

Can you give an example for such names ?

Guy Decoux

--
Seth Kurtzberg
M. I. S. Corp.
480-661-1849
seth@cql.com

i see the same problem as seth. when i run this :

ruby -r socket -e 'p (Time.now); p (TCPSocket.new %q(ad.doubleclick.net), 80); p (Time.now)'

it takes around 20 seconds to run. so i tried

strace !! > strace 2>&1

and here is where it got interesting. looking at the strace file i saw :

247 socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 3
248 connect(3, {sin_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("137.75.132.181")}}, 16) = 0
249 send(3, "\314Q\1\0\0\1\0\0\0\0\0\0\2ad\vdoubleclick\3net\0"..., 36, 0) = 36
250 time(NULL) = 1046809069

251 poll([{fd=3, events=POLLIN}], 1, 5000) = 0
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
THIS TIMED OUT!

252 close(3) = 0

so then tried running

strace dnsquery ad.doubleclick.net > strace 2>&1

and looking at its strace, which shows :

36 socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 3
37 connect(3, {sin_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("137.75.132.181")}}, 16) = 0
38 send(3, "s\332\1\0\0\1\0\0\0\0\0\0\2ad\vdoubleclick\3net\0"..., 36, 0) = 36
39 gettimeofday({1046809802, 554050}, NULL) = 0
40 rt_sigprocmask(SIG_SETMASK, NULL, , 8) = 0
41 select(4, [3], NULL, NULL, {5, 0}) = 1 (in [3], left {4, 950000})
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
THIS DID NOT TIME OUT!
42 rt_sigprocmask(SIG_SETMASK, , NULL, 8) = 0
43 recvfrom(3, "s\332\201\200\0\1\0\1\0\4\0\4\2ad\vdoubleclick\3net\0"..., 8192, 0, {sin_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("137.75.132.181")}}, [16]) = 206
44 close(3) = 0

so, ruby sends (via syscalls)

"\314Q\1\0\0\1\0\0\0\0\0\0\2ad\vdoubleclick\3net\0"

and then polls

while dnsquery sends (via syscalls)

"s\332\1\0\0\1\0\0\0\0\0\0\2ad\vdoubleclick\3net\0"

and then selects

the problem is that the poll from ruby times out?! i don’t know much about
dns, and so don’t understand the meaning of the queries, but perhaps one is
incorrect? if not, then this seems like it could only be a bug in poll?

hope this helps.

-a

···

On Wed, 5 Mar 2003, Yukihiro Matsumoto wrote:

The only workaround I can think of is to use “resolv-replace”, which
is a pure Ruby resolver. It is not fast at all, but at least other
threads can run while it is resolving. Try putting

require "resolv-replace"

Ara Howard
NOAA Forecast Systems Laboratory
Information and Technology Services
Data Systems Group
R/FST 325 Broadway
Boulder, CO 80305-3328
Email: ahoward@fsl.noaa.gov
Phone: 303-497-7238
Fax: 303-497-7259
====================================

ad.doubleclick.net

What is your system ?

Guy Decoux

Hi,

ad.doubleclick.net

Something weird for IPv6 at that site?

$ time ruby -rsocket -e 'p Socket.getaddrinfo("ad.doubleclick.net", "http")'
[["AF_INET", 80, "ad.us.doubleclick.net", "216.73.86.110", 2, 1, 6], ["AF_INET", 80, "ad.us.doubleclick.net", "216.73.86.110", 2, 2, 17]]

real 0m10.142s
user 0m0.030s
sys 0m0.020s

$ time ruby -rsocket -e 'p Socket.getaddrinfo("ad.doubleclick.net", "http", Socket::AF_INET)'
[["AF_INET", 80, "206.65.183.140", "206.65.183.140", 2, 1, 6], ["AF_INET", 80, "206.65.183.140", "206.65.183.140", 2, 2, 17]]

real 0m0.634s
user 0m0.040s
sys 0m0.020s

$ time ruby -rsocket -e 'p Socket.getaddrinfo("ad.doubleclick.net", "http", Socket::AF_INET6)'
-e:1:in `getaddrinfo': getaddrinfo: Temporary failure in name resolution (SocketError)
        from -e:1

real 0m10.085s
user 0m0.020s
sys 0m0.010s

···

At Wed, 5 Mar 2003 01:51:26 +0900, Seth Kurtzberg wrote:


Nobu Nakada

[Reformatted for clarity]

i see the same problem as seth. when i run this :

ruby -r socket -e 'p (Time.now);
p (TCPSocket.new %q(ad.doubleclick.net), 80); p (Time.now)'

249 send(3, "\314Q\1\0\0\1\0\0\0\0\0\0\2ad\vdoubleclick\3net\0", 36, 0) = 36
250 time(NULL) = 1046809069
251 poll([{fd=3, events=POLLIN}], 1, 5000) = 0

Ruby is making a request for Type AAAA (ipv6) record.

strace dnsquery ad.doubleclick.net > strace 2>&1

38 send(3, "s\332\1\0\0\1\0\0\0\0\0\0\2ad\vdoubleclick\3net\0", 36, 0) = 36
41 select(4, [3], NULL, NULL, {5, 0}) = 1 (in [3], left {4, 950000})

Dnsquery is making a request for Type A (ipv4) record.

Try this to see the problem:

dnsquery -t aaaa ad.doubleclick.net

which times out after 10 seconds. The DNS server for doubleclick.net
is refusing to answer AAAA queries which is just plain rude, but what
else is new for a company who makes its money by annoying the public?

Of course a better “solution” would be to ask getaddrinfo() to only
use the protocol types you are interested in (AF_INET). A previous
poster demonstrated how to do this in ruby. Unfortunately, there are
so many domains out there whose dns servers ignore AAAA requests that
using an ipv6-enabled web browser is a straight path to profanity.

-Jacob
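A note on why the two send() lines in the traces look the same apart from their first two bytes: each trace shows only the first 32 bytes of a 36-byte packet, and for this name the record type sits in bytes 33-34. The sketch below builds both queries by hand following the RFC 1035 message layout. The helper name build_dns_query is invented for illustration, and the query IDs are simply the first two bytes read off the traces ("s\332" is 0x73DA, "\314Q" is 0xCC51). It does not prove which type each traced packet carried, only that the trace output could not have shown it.

```ruby
# Hypothetical helper: build a one-question DNS query (RFC 1035 layout).
QTYPE_A    = 1   # IPv4 address record
QTYPE_AAAA = 28  # IPv6 address record

def build_dns_query(id, name, qtype)
  # Header: ID, flags (0x0100 = recursion desired), 1 question, 0 other records.
  header = [id, 0x0100, 1, 0, 0, 0].pack("n6")
  # QNAME: each label prefixed by its length, terminated by a zero byte.
  qname = name.split(".").map { |label| [label.bytesize].pack("C") + label.b }.join + "\0".b
  # QTYPE and QCLASS (1 = IN) follow the name.
  header + qname + [qtype, 1].pack("n2")
end

a_query    = build_dns_query(0x73DA, "ad.doubleclick.net", QTYPE_A)
aaaa_query = build_dns_query(0xCC51, "ad.doubleclick.net", QTYPE_AAAA)

puts a_query.bytesize                            # → 36, same length as in the traces
puts a_query.byteslice(32, 4).bytes.inspect      # → [0, 1, 0, 1]   (type A, class IN)
puts aaaa_query.byteslice(32, 4).bytes.inspect   # → [0, 28, 0, 1]  (type AAAA, class IN)
```

Everything between the ID and byte 32 is identical in the two packets, which matches what the traces show.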

···

ahoward ahoward@fsl.noaa.gov wrote:

jnews@epicsol.org Violations of McQ flagrantly “sponsored” by…


Ara,

That is extremely valuable information! Thanks much.

This definitely looks like a bug, because there is no sensible reason for poll
to block but select to not block.



Seth Kurtzberg
M. I. S. Corp.
480-661-1849
seth@cql.com

Quite likely. I’ll try disabling IPV6 support.

···

On Tuesday 04 March 2003 11:33 am, nobu.nokada@softhome.net wrote:

Hi,

At Wed, 5 Mar 2003 01:51:26 +0900, Seth Kurtzberg wrote:

ad.doubleclick.net

Something weird for IPv6 at that site?

$ time ruby -rsocket -e 'p Socket.getaddrinfo("ad.doubleclick.net", "http")'
[["AF_INET", 80, "ad.us.doubleclick.net", "216.73.86.110", 2, 1, 6], ["AF_INET", 80, "ad.us.doubleclick.net", "216.73.86.110", 2, 2, 17]]

real 0m10.142s
user 0m0.030s
sys 0m0.020s

$ time ruby -rsocket -e 'p Socket.getaddrinfo("ad.doubleclick.net", "http", Socket::AF_INET)'
[["AF_INET", 80, "206.65.183.140", "206.65.183.140", 2, 1, 6], ["AF_INET", 80, "206.65.183.140", "206.65.183.140", 2, 2, 17]]

real 0m0.634s
user 0m0.040s
sys 0m0.020s

$ time ruby -rsocket -e 'p Socket.getaddrinfo("ad.doubleclick.net", "http", Socket::AF_INET6)'
-e:1:in `getaddrinfo': getaddrinfo: Temporary failure in name resolution (SocketError)
        from -e:1

real 0m10.085s
user 0m0.020s
sys 0m0.010s


Seth Kurtzberg
M. I. S. Corp.
480-661-1849
seth@cql.com

[Reformatted for clarity]

i see the same problem as seth. when i run this :

ruby -r socket -e 'p (Time.now);
p (TCPSocket.new %q(ad.doubleclick.net), 80); p (Time.now)'

249 send(3, "\314Q\1\0\0\1\0\0\0\0\0\0\2ad\vdoubleclick\3net\0", 36, 0) = 36
250 time(NULL) = 1046809069
251 poll([{fd=3, events=POLLIN}], 1, 5000) = 0

Ruby is making a request for Type AAAA (ipv6) record.

strace dnsquery ad.doubleclick.net > strace 2>&1

38 send(3, "s\332\1\0\0\1\0\0\0\0\0\0\2ad\vdoubleclick\3net\0", 36, 0) = 36
41 select(4, [3], NULL, NULL, {5, 0}) = 1 (in [3], left {4, 950000})

Dnsquery is making a request for Type A (ipv4) record.

Try this to see the problem:

dnsquery -t aaaa ad.doubleclick.net

which times out after 10 seconds. The DNS server for doubleclick.net
is refusing to answer AAAA queries which is just plain rude, but what
else is new for a company who makes its money by annoying the public?

Of course a better “solution” would be to ask getaddrinfo() to only
use the protocol types you are interested in (AF_INET). A previous
poster demonstrated how to do this in ruby.

I somehow managed to miss that email with the information about restricting
protocol types. If you still have it, can you forward, or can the original
sender please resend? TIA.

Unfortunately, there are so many domains out there whose dns servers
ignore AAAA requests that using an ipv6-enabled web browser is a
straight path to profanity.

-Jacob

···

On Tuesday 04 March 2003 02:38 pm, Jacob News wrote:

ahoward ahoward@fsl.noaa.gov wrote:


Seth Kurtzberg
M. I. S. Corp.
480-661-1849
seth@cql.com

this is valuable information. thanks for the info!

perhaps you should forward your ‘better’ solution on to ruby-dev?

-a

···

On Tue, 4 Mar 2003, Jacob News wrote:

Ruby is making a request for Type AAAA (ipv6) record.

Dnsquery is making a request for Type A (ipv4) record.

Try this to see the problem:

dnsquery -t aaaa ad.doubleclick.net

which times out after 10 seconds. The DNS server for doubleclick.net
is refusing to answer AAAA queries which is just plain rude, but what
else is new for a company who makes its money by annoying the public?

Of course a better “solution” would be to ask getaddrinfo() to only
use the protocol types you are interested in (AF_INET). A previous
poster demonstrated how to do this in ruby. Unfortunately, there are
so many domains out there whose dns servers ignore AAAA requests that
using an ipv6-enabled web browser is a straight path to profanity.

Ara Howard
NOAA Forecast Systems Laboratory
Information and Technology Services
Data Systems Group
R/FST 325 Broadway
Boulder, CO 80305-3328
Email: ahoward@fsl.noaa.gov
Phone: 303-497-7238
Fax: 303-497-7259
====================================

Thanks for everyone’s help with my delay problem. The key insight was
provided by those who noticed that the problem is related to IPV6. I “fixed”
the problem (well, hacked around the problem) by forcing the family in calls
to getaddrinfo to be PF_INET. As someone pointed out there are a lot of
hosts on the 'net that have this deficiency, so although the ruby code is not
incorrect, it might be a good idea to change the default behavior, at least
for the moment.
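For anyone who would rather not touch Ruby's own socket code, a similar restriction can be made from a script. This is only a sketch under assumptions: it relies on the Addrinfo class, which is newer than the Ruby discussed in this thread, and the helper names ipv4_tcp_addrinfos and open_ipv4 are invented for illustration.

```ruby
require "socket"

# Hypothetical helper: resolve a host for TCP, asking getaddrinfo() for
# IPv4 (AF_INET) results only, so no AAAA lookup is needed.
def ipv4_tcp_addrinfos(host, port)
  Addrinfo.getaddrinfo(host, port, :INET, :STREAM)
end

# Hypothetical helper: connect to the first IPv4 address found.
# Passing the numeric address to TCPSocket.new avoids a second DNS query.
def open_ipv4(host, port)
  addrinfo = ipv4_tcp_addrinfos(host, port).first
  TCPSocket.new(addrinfo.ip_address, addrinfo.ip_port)
end
```

A call such as open_ipv4("ad.doubleclick.net", 80) would then stand in for TCPSocket.new("ad.doubleclick.net", 80). The cost is that hosts reachable only over IPv6 can no longer be reached this way.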


Seth Kurtzberg
M. I. S. Corp.
480-661-1849
seth@cql.com

FYI there's a very nice archive of this list. From www.ruby-lang.org it's
two clicks away: "mailing list" then "past mail archive site: ruby-talk"

The "Namazu" search doesn't seem to work, but the searches at the bottom
(subject and last 1000 mails) do.

Regards,

Brian.

···

On Wed, Mar 05, 2003 at 10:45:13AM +0900, Seth Kurtzberg wrote:

I somehow managed to miss that email with the information about restricting
protocol types. If you still have it, can you forward, or can the original
sender please resend? TIA.

In my system the following options to configure when building
Ruby fix it:
--enable-ipv6
--with-lookup-order-hack=INET

···

On Wed, Mar 05, 2003 at 01:03:02PM +0900, Seth Kurtzberg wrote:

Thanks for everyone’s help with my delay problem. The key insight was
provided by those who noticed that the problem is related to IPV6. I “fixed”
the problem (well, hacked around the problem) by forcing the family in calls
to getaddrinfo to be PF_INET. As someone pointed out there are a lot of
hosts on the 'net that have this deficiency, so although the ruby code is not
incorrect, it might be a good idea to change the default behavior, at least
for the moment.


Running Debian GNU/Linux Sid (unstable)
batsman dot geo at yahoo dot com

We apologize for the inconvenience, but we’d still like you to test out
this kernel.
-- Linus Torvalds, announcing another kernel patch