Wednesday, January 17, 2018

Ruby scraping and accessing a remote CSV

I am scraping a website with Capybara + Poltergeist and then trying to fetch a remote CSV over HTTPS. The scraping itself works once I have logged in to the site, but when I request the CSV with open(url), I get a 403 error. When I enter the same URL in a browser, the CSV downloads fine. Is some additional authentication needed? I assumed open(url) would just work, since the login through Capybara + Poltergeist has already happened.

The CSV URL: https://sample/csv?start_date=2018-01-15&end_date=2018-01-15

The Ruby code:

csv_url = build_url "https://sample/csv", start_date: '2018-01-15', end_date: '2018-01-15'
csv = open(csv_url)

Response:

[httplog] Sending: GET http://sample:443/csv?start_date=2018-01-15&end_date=2018-01-15
[httplog] Status: 403
[httplog] Response: (not available yet) => "403 Forbidden"
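
One possible explanation, offered as an assumption rather than a confirmed diagnosis: open(url) comes from open-uri and runs as a separate HTTP client, so it does not share the cookies of the Poltergeist browser session, and the server may be treating the request as unauthenticated. A minimal sketch of forwarding the session cookies is below. The helper name cookie_header is hypothetical, and the commented usage assumes that page.driver.cookies returns a hash of cookie names to cookie objects that respond to value, as Poltergeist documents.

```ruby
require 'open-uri'

# Hypothetical helper: build a Cookie request header from name => value pairs.
def cookie_header(cookies)
  cookies.map { |name, value| "#{name}=#{value}" }.join('; ')
end

# Assumed usage inside the logged-in Capybara session (not executed here):
#   jar = page.driver.cookies.map { |name, cookie| [name, cookie.value] }.to_h
#   csv = open(csv_url, 'Cookie' => cookie_header(jar)).read
# On Ruby 3.0 and later, Kernel#open no longer handles URLs; use URI.open instead.
```

If the 403 persists with the cookies attached, the server may be checking something else, such as the User-Agent header, which open-uri also accepts as a string-keyed option.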
