Please help with a scraping problem!

pcd

Developer
Jul 22, 2013
UK
Hi,

Greetings from the UK! :)

While investigating the streamcomplet sites I ran into the following problem. I would be grateful if any web expert could suggest a solution.

The source of 'http://streamcomplet.biz/3333-imitation-game.html' contains this iframe URL:

http://stream-the.net/gomovie.php?link=MnU0WVIyQzk3eC80dWcxSUd4VDE3V3A3dkloNTJqZEFyWXpoNzI5MkZ5bFArcjVpWnd5cGp5YW02VXVVMXlsd0JOTzRpRE9KT293dDhEdmxOa1FtM3NEWW5mN1NRRHh4VEJsN3YvQUpQTldvdmdGNTFLNXdRUFJ3VmxyWC9sOFFuNjE1ZVUvQzNRVzRRQXh1bjQ3NnZJRFB3L0dSSW9lY1QzZjRuKzdnVGdRPXwxNldYdVRUOTN4bnZNajJMYS9EeG9UYUh3WWVyZUt2YUdWd3k4a2hUTmIwPQ==

If I click on the link in the source page, I get a page which includes the video URLs:

Code:
jwplayer("player").setup({
		id: "player",
		image: "http://i.imgur.com/zU6vp3H.jpg",
		sources: [{"file":"http:\/\/217.20.157.206\/?sig=8cfba6df24938ac29f40c6300219eed2e2ed214e&ct=0&expires=1437051969153&clientType=0&id=43027270323&type=4","type":"mp4","label":"mobile"},{"file":"http:\/\/217.20.157.206\/?sig=4e30698563adf9d236cbce416edef62a7ba0ff25&ct=0&expires=1437051969153&clientType=0&id=43027270323&type=0","type":"mp4","label":"lowest"},{"file":"http:\/\/217.20.157.206\/?sig=3a4f69c4fa65a3130621cb7aa7da3f06843661dc&ct=0&expires=1437051969153&clientType=0&id=43027270323&type=1","type":"mp4","label":"low"},{"file":"http:\/\/217.20.157.206\/?sig=a4520a27f3ac1eab9556106ff6a140b36e09cc3f&ct=0&expires=1437051969153&clientType=0&id=43027270323&type=2","type":"mp4","label":"sd"},{"file":"http:\/\/217.20.157.206\/?sig=1ba46ceff89602f4320248c350e19a08b826c60d&ct=0&expires=1437051969153&clientType=0&id=43027270323&type=3","type":"mp4","label":"hd"}],
		skin: "stormtrooper",

But if I try to open the link with the usual urlopen methods (or open it directly in a PC browser), I get a Google page instead.

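If I use httplib2 with redirects disabled:

import httplib2

# url is the iframe URL shown above
h = httplib2.Http()
h.follow_redirects = False  # do not follow the 302, so the raw response can be inspected
(response, body) = h.request(url)

I get no body, but I do get this response: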

response = {'status': '302', 'x-powered-by': 'PHP/5.4.40', 'transfer-encoding': 'chunked', 'set-cookie': '__cfduid=ded15432da9964ded7419175a77eb41cc1436966909; expires=Thu, 14-Jul-16 13:28:29 GMT; path=/; domain=.stream-the.net; HttpOnly', 'server': 'cloudflare-nginx', 'connection': 'keep-alive', 'location': 'http://www.google.com', 'date': 'Wed, 15 Jul 2015 13:28:29 GMT', 'cf-ray': '2065cf8f36dd063a-LHR', 'content-type': 'text/html'}

How do I open this iframe link correctly, please?

Regards, pcd.
 

tknorris

New member
Feb 18, 2014
You are getting an HTTP 302 redirect back, with a Location header of http://www.google.com. That is why there is no body in the response, and why urlopen or a browser, which follow the redirect, end up on a Google page. To get the correct response, you have to send the Referer header that the iframe page expects. In this case, the referer it expects is "http://streamcomplet.biz/3333-imitation-game.html", the page that embeds the iframe. The referer might vary from page to page.
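
As a rough illustration of sending that header, here is a minimal sketch using only the Python 3 standard library. The function names build_request and fetch_iframe are made up for this example, and the timeout and the UTF-8 decoding are assumptions. I have not checked whether the site also looks at other headers or cookies.

```python
import urllib.request


def build_request(iframe_url, referer):
    """Return a Request for iframe_url that carries the given Referer header."""
    return urllib.request.Request(iframe_url, headers={"Referer": referer})


def fetch_iframe(iframe_url, referer, timeout=10):
    """Fetch the iframe page as text, presenting `referer` as the embedding page."""
    request = build_request(iframe_url, referer)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Assumption: the page is UTF-8; undecodable bytes are replaced, not fatal.
        return response.read().decode("utf-8", errors="replace")


# Example call (needs network access). The referer is the page that embeds the
# iframe; iframe_url is the full gomovie.php?link=... URL from that page's source.
#
#   html = fetch_iframe(iframe_url, "http://streamcomplet.biz/3333-imitation-game.html")
```

With httplib2, as in the question, the same header can be passed through the headers argument, e.g. h.request(url, headers={'Referer': page_url}), where page_url is the address of the embedding page.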
 

pcd

Developer
Jul 22, 2013
UK
Thank you very much! It worked! :)

BTW - your plugin SA*TS is working well in eni#ma2 boxes. I had to make a small change.

Best regards.