Unable to download pdf - 302 aws link not being returned

Hi, I am having an issue with trying to get the PDF downloaded. The code I am using is VBA in Excel.

All my other API calls to get basic company information and filing history seem to work.

The filing history document response shows me 2 links, the “self” and the “document link” which are below:
I use the following code to get the response:
baseUrl = “https://api.company-information.service.gov.uk
url = baseUrl & “/company/” & companyNumber & “/filing-history”

        Set xmlHttp = New MSXML2.XMLHTTP60
        xmlHttp.Open "GET", url, False
        xmlHttp.setRequestHeader "Authorization", "Basic " & Base64Encode(API_KEY & ":")
        xmlHttp.send

I then pickup the url from the response data. Partial response shown below as this is embedded in the filing history.
“links”:{“self”:“https://document-api.company-information.service.gov.uk/document/iEfD-fTMmUChdIs_8YviXdGNrHKdO4rDspJauabod-0",“document”:"https://document-api.company-information.service.gov.uk/document/iEfD-fTMmUChdIs_8YviXdGNrHKdO4rDspJauabod-0/content

As soon as I use the document url I get a “permission denied object” in the xmlHttp.send command and I assume this is becuase the document link does not exist but why do I not get the 302 redirect response which will redirect me to the Amazon web server. Why does adding “/content” cause this error ?

        pdfUrl = https://document-api.company-information.service.gov.uk/document/iEfD-fTMmUChdIs_8YviXdGNrHKdO4rDspJauabod-0/content
    
        Dim xmlHttp As MSXML2.XMLHTTP60

        Set xmlHttp = New MSXML2.XMLHTTP60
        xmlHttp.Open "GET", pdfUrl, False
        xmlHttp.setRequestHeader "Authorization", "Basic " & Base64Encode(API_KEY & ":")
        xmlHttp.setRequestHeader "Accept", "application/pdf"
        xmlHttp.send
        
        'Handle redirect to Amazon web server where the actual file is located
        If xmlHttp.status = 302 Then
            pdfUrl = xmlHttp.getResponseHeader("Location")
            Set xmlHttp = New MSXML2.XMLHTTP60
            xmlHttp.Open "GET", pdfUrl, False
            xmlHttp.send
        End If
        
        'check request was ok
        If xmlHttp.status = 200 Then
	'Do rest of code
    End If

I don’t use VBA much now but a quick Google seems to show that this component is following redirects.

Quoting the top answer (warning - this is from a long time ago e.g. 2010):

The answer is, No - MSXML2.XMLHTTP automatically follows redirects.

If you need to track and optionally not follow redirects, then you can use the WinHttp.WinHttpRequest object, which, like MSXML2.XMLHTTP is accessible to script.

There look to be other people querying this as you would expect. Google is your friend.

Although this isn’t a Companies House query as this is to do with VBA URL handling, for reference of others searching in the future it would help if you post your findings back here!

Good luck.

Interesting, and thank you for your quick reply. I was getting a 302 last week but deleted that code stupidly to try something else.

Do you have an idea then why the /content request fails.

    Set xmlHttp = New MSXML2.XMLHTTP60
    xmlHttp.Open "GET", pdfUrl, False
    xmlHttp.setRequestHeader "Authorization", "Basic " & Base64Encode(API_KEY & ":")
    xmlHttp.setRequestHeader "Accept", "application/pdf"
    xmlHttp.send

The send command above causes the “Permission to use object denied” but only when “/content” is in the url

You sort of answered that yourself:

Why would you need the redirect - why not just let the system follow it to get the content? (Assuming you were doing this “for reason”…)
The answer of course is “because that doesn’t work, because I get an error if I do” and the reason that’s the case is that the redirect takes you to a different site with its own authorization method.

You’re supplying a Companies House http basic authorization header - and it seems that the http client (the MSXML component) is like many of these sending that to Amazon. But a) you don’t want to send your credentials to another site and b) Amazon has its own methods, and if you supply this it will say “hmm… you’ve given me duff credentials I can’t understand AND you’ve given me the correct ones (which are in the URL you send) so … error!”

I wrote this in Python as well and that handles it much better and handles the AWS redirect without any problems. I will check the Python code and see if I can see if I have done something differently in that but I suspect the problem is the AWS redirect not working. I will try and use your suggestion to use the WinHttp.WinHttpRequest.

If you didn’t see this or similar already I’ve had a couple of goes at describing the process (which we follow).

I think difficulties (or not) here depend on the http client e.g. I think with curl you can set the flag for “follow links” and it will just work. Ultimately though it fails if you send Amazon the Companies House http header. Once you’ve got something that works you can play with this to check e.g. try sending that and then not, assuming the Amazon link is not “single use only” - I think it is time-limited).

I am not personally suggesting anything for VBS as I noted, that is just what Google turned up. I do not access Companies House data that way and it has been a long time since I did anything using http in VBS, if ever…

I get what your are saying now, the send in vba is sending authorization to aws which is why i most likely get the error. Thank you for your time today in responding.