Who says those pages are doing redirects? What if the first web page
parses the query string attached to the URL, then uses JavaScript to
load a different page?
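For what it's worth, the link does carry its destination in the query string, so a script (on either side) could read it straight out of there. A minimal sketch, assuming the request path and Host header shown in the curl trace in this thread; the constant LINK and the helper name embedded_target are mine, not anything from the site:

```ruby
require "uri"

# Path and host as they appear in the curl trace in this thread.
LINK = "http://www.anrdoezrs.net/click-5329913-10569016" \
       "?url=http%3A%2F%2Fwww.fashion58.com%2Fitemdetail.asp%3Fmod%3DEH5BG213DFSK" \
       "&cjsku=F58-EH5BG213DFSK"

# Hypothetical helper: returns the decoded value of the "url" query
# parameter, or nil when the link has no query string or no such parameter.
def embedded_target(link)
  query = URI.parse(link).query
  return nil if query.nil?
  URI.decode_www_form(query).to_h["url"]
end

puts embedded_target(LINK)
```

Whether the page you end up on is really the one named in that parameter is a separate question; this only shows that the target is recoverable without following anything.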
Still, the op's question remains unanswered: why doesn't open-uri follow
all those redirects?
Looking at the curl -Lv output, there are cookies being set. Maybe a
server side script kicks you out if the requests for the redirect urls
do not include those cookies.
One option is to switch to Mechanize, which will automatically send any
cookies that were set in a response.
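If you'd rather not pull in a gem, the same idea can be approximated with the standard library. This is not Mechanize, just a rough sketch of what "send the cookies back on each hop" means; the helper names (remember_cookies, cookie_header, fetch_with_cookies) are made up for illustration, and cookie attributes such as Path, Domain and Expires are ignored:

```ruby
require "net/http"
require "uri"

# Fold the Set-Cookie headers of one response into a name => value hash.
def remember_cookies(jar, set_cookie_headers)
  Array(set_cookie_headers).each do |header|
    pair = header.to_s.split(";", 2).first.to_s.strip
    name, value = pair.split("=", 2)
    jar[name] = value.to_s unless name.nil? || name.empty?
  end
  jar
end

# Build the value of the Cookie request header from the jar.
def cookie_header(jar)
  jar.map { |name, value| "#{name}=#{value}" }.join("; ")
end

# Follow Location headers by hand, sending back every cookie seen so far.
def fetch_with_cookies(url, limit = 10)
  jar = {}
  uri = URI.parse(url)
  limit.times do
    request = Net::HTTP::Get.new(uri)
    request["Cookie"] = cookie_header(jar) unless jar.empty?
    response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
      http.request(request)
    end
    remember_cookies(jar, response.get_fields("Set-Cookie"))
    location = response["Location"]
    return response unless response.is_a?(Net::HTTPRedirection) && location
    uri = uri.merge(location) # also resolves relative locations
  end
  raise "too many redirects"
end

# Usage (needs network access):
#   response = fetch_with_cookies("http://www.example.com/")
#   puts response.code
```

Note that uri.merge raises URI::InvalidURIError if the server sends a Location that is not a valid URI, so this sketch does not get around that kind of failure.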
Still, the op's question remains unanswered: why doesn't open-uri follow
all those redirects?
it can:
According to the docs, open-uri follows redirects by default. So
according to the docs, open-uri can follow redirects, but the fact
remains it doesn't in this case. Why?
If the first page is redirecting with client-side JavaScript, perhaps you
should consider using PhantomJS through the phantomjs.rb gem or some
equivalent means.
To find out first how the redirection actually happens, I'd try a
browser plugin like Live HTTP Headers or good old Wireshark.
Marvan
···
On Thu, Sep 6, 2012 at 7:33 PM, Derek T. <lists@ruby-forum.com> wrote:
If that's the case, is there a way I can follow the link all the way
through so I can get what I want?
% ri OpenURI::OpenRead#open | grep -A8 :redirect:
:redirect:
Synopsis:
:redirect=>bool
:redirect=>false is used to disable HTTP redirects at all.
OpenURI::HTTPRedirect exception raised on redirection. It is true by default.
The true means redirections between http and ftp is permitted.
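To see where each hop points instead of letting open-uri chase it, the option above can be turned off and the exception inspected. A small sketch, assuming Ruby 2.5 or later for URI.open (on the 1.8 in this thread it would be plain open(url, :redirect => false)); first_redirect is a name I made up:

```ruby
require "open-uri"

# Hypothetical helper: returns the Location a URL redirects to as a string,
# or nil if the server answers without redirecting.
def first_redirect(url)
  URI.open(url, redirect: false) { }
  nil
rescue OpenURI::HTTPRedirect => e
  e.uri.to_s # the redirect target open-uri parsed from the response
end

# Usage (needs network access):
#   puts first_redirect("http://www.example.com/")
```

Calling it repeatedly on each returned URL walks the chain one step at a time, which makes it easier to spot the hop that goes wrong.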
···
On Sep 6, 2012, at 17:21 , 7stud -- <lists@ruby-forum.com> wrote:
Still, the op's question remains unanswered: why doesn't open-uri follow
all those redirects?
Still, the op's question remains unanswered: why doesn't open-uri follow
all those redirects?
it can:
According to the docs, open-uri follows redirects by default. So
according to the docs, open-uri can follow redirects, but the fact
remains it doesn't in this case. Why?
Because it redirects with invalid URIs:
Last login: Fri Sep 7 23:45:55 on ttys008
10000 % ruby -ropen-uri -e 'URI.parse(ARGV.shift).read' "OOPS! The offer you're looking for has expired.;
/System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/uri/common.rb:436:in `split': bad URI(is not URI?): OOPS! The offer you're looking for has expired.; (URI::InvalidURIError)
from /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/uri/common.rb:485:in `parse'
...
from -e:1
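One possible workaround, if you follow the redirects yourself, is to percent-encode the offending characters before parsing. The sketch below assumes the problem is something like an unescaped space in the Location value; the trace above does not show exactly what was invalid, so treat the sample path as hypothetical, along with the helper names escape_unsafe and resolve_location:

```ruby
require "uri"

# Percent-encode characters that may not appear literally in a URI:
# spaces, control and non-ASCII characters, and a few ASCII punctuation marks.
def escape_unsafe(location)
  location.gsub(/[^\x21-\x7E]|["<>\\^`{|}]/) do |ch|
    ch.bytes.map { |b| format("%%%02X", b) }.join
  end
end

# Resolve a Location header against the URL that produced it, retrying
# with the escaped form if the raw value is rejected.
def resolve_location(base, location)
  URI.join(base, location)
rescue URI::InvalidURIError
  URI.join(base, escape_unsafe(location))
end

# Hypothetical example values:
puts resolve_location("http://www.example.com/click", "/expired offer.html")
```

This only papers over the server's bad header; whether the escaped URL leads anywhere useful depends on the site.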
+1 for curl obviously ... sorry was in overkill mode ..
···
On Thu, Sep 6, 2012 at 9:30 PM, Brian Candler <lists@ruby-forum.com> wrote:
Reza Marvan Spagnolo wrote in post #1074957:
> To find out first how the redirection actually happens, I'd try a
> browser plugin like Live HTTP Headers or good old Wireshark.
>
> Marvan
Or good old curl.
$ curl -v
' OOPS! The offer you're looking for has expired.
'
* About to connect() to www.anrdoezrs.net port 80 (#0)
* Trying 89.207.18.129... connected
* Connected to www.anrdoezrs.net (89.207.18.129) port 80 (#0)
> GET /click-5329913-10569016?url=http%3A%2F%2Fwww.fashion58.com%2Fitemdetail.asp%3Fmod%3DEH5BG213DFSK&cjsku=F58-EH5BG213DFSK HTTP/1.1
> User-Agent: curl/7.21.4 (universal-apple-darwin11.0) libcurl/7.21.4 OpenSSL/0.9.8r zlib/1.2.5
> Host: www.anrdoezrs.net
> Accept: */*
>
< HTTP/1.1 302 Found
< Server: Resin/3.1.8
< P3P: policyref="http://www.anrdoezrs.net/w3c/p3p.xml", CP="ALL BUS LEG DSP COR ADM CUR DEV PSA OUR NAV INT"
< Cache-control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
< Pragma: no-cache
< Expires: Thu, 06 Sep 2012 19:27:18 GMT
< Location: