@deskildsen
To explain what is happening, I will walk through the sequence highlighted earlier in this thread, with further explanation and code snippets where applicable.
The sequence
1. Call filing history.
2. Call the document API for the transaction you are interested in.
2.1 The document API returns a 302 redirect response with a Location: header pointing at the document.
3. Get the document by following the Location: URL. Each URL can be used only once, and only within a short time window before it expires; to fetch the document again you must repeat step 2.
Sequence Details
1. Call filing history
Search for the transaction you are interested in via the filing history endpoint. Given that you already have a document you are testing with, I am going to assume that this step needs no further explanation.
2. Call document api for the transaction you are interested in
The reason for performing this step is to retrieve the metadata related to the transaction. The metadata includes information such as the company number and the category but, more importantly, the content types that may be available. Currently the default is PDF (application/pdf), but future types will include, for example, XBRL/iXBRL (application/xhtml+xml) for accounts filings.
This is the step you performed using the example code in the post, and for which you provided the response.
2.1 Call the document API to retrieve the one time use URL to access the document.
This is done by requesting the “document” link returned by the request above. In the example response you provided, it is the URL ending in /content:
["links"]=> object(stdClass)#14 (2) { ["self"]=> string(95) "https://document-api.companieshouse.gov.uk/document/dSD6iyVGq4Nx5jnj0IOlqti5a7veemHTJREfRO0Gm6s" ["document"]=> string(103) "https://document-api.companieshouse.gov.uk/document/dSD6iyVGq4Nx5jnj0IOlqti5a7veemHTJREfRO0Gm6s/content" }
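As a minimal sketch of picking that link out of the metadata response, the helper below decodes the JSON and returns the "document" link if it is present. The function name getDocumentContentUrl is hypothetical, and the sample JSON contains only the two links shown in the dump above; a real metadata response carries more fields.

```php
<?php
// Hypothetical helper (not part of any Companies House library):
// returns the "document" link from a metadata JSON string, or null if absent.
function getDocumentContentUrl(string $metadataJson): ?string
{
    $metadata = json_decode($metadataJson);
    // json_decode returns null for invalid JSON; isset() guards missing keys.
    if ($metadata === null || !isset($metadata->links->document)) {
        return null;
    }
    $url = $metadata->links->document;
    return is_string($url) ? $url : null;
}

// Sample input reduced to the "links" object from the response above.
$sampleMetadata = '{"links":{'
    . '"self":"https://document-api.companieshouse.gov.uk/document/dSD6iyVGq4Nx5jnj0IOlqti5a7veemHTJREfRO0Gm6s",'
    . '"document":"https://document-api.companieshouse.gov.uk/document/dSD6iyVGq4Nx5jnj0IOlqti5a7veemHTJREfRO0Gm6s/content"'
    . '}}';

echo getDocumentContentUrl($sampleMetadata), PHP_EOL;
```

The returned URL is the one to request in step 2.1.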
This is the step that produces the NULL in your script. When this URL is requested, the API responds with a 302 redirect whose Location header is the one-time-use URL for the document. The code you are executing has been configured to follow these redirects automatically via:
> curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
so the code automatically jumps to the next step.
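If the redirect is not followed automatically and the response headers are returned with the body (CURLOPT_HEADER), the Location value can be read from the raw response text. The sketch below shows one way to do that; the function name extractLocationHeader and the example.com URL are made up for illustration, and reading CURLINFO_REDIRECT_URL via curl_getinfo is a simpler alternative when the curl handle is still available.

```php
<?php
// Hypothetical helper: find the Location header in a raw HTTP response
// (status line + headers + optional body, as returned with CURLOPT_HEADER).
function extractLocationHeader(string $rawResponse): ?string
{
    // The header block ends at the first blank line.
    $parts = preg_split("/\r?\n\r?\n/", $rawResponse, 2);
    foreach (preg_split("/\r?\n/", $parts[0]) as $line) {
        // Header names are case-insensitive, so compare without regard to case.
        if (stripos($line, 'Location:') === 0) {
            return trim(substr($line, strlen('Location:')));
        }
    }
    return null;
}

// Made-up 302 response for illustration only.
$sampleResponse = "HTTP/1.1 302 Found\r\n"
    . "Location: https://example.com/one-time-url\r\n"
    . "Content-Length: 0\r\n"
    . "\r\n";

echo extractLocationHeader($sampleResponse), PHP_EOL;
```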
3. Get the document using the URL from the location header returned from 2.1
Retrieve the document, a PDF image in this example, using the one-time-use URL.
Because the example code follows the redirect automatically, it has already fetched the image as binary data and stored it in $result5. The code then treats this as JSON, calls json_decode on it and dumps the result, which is the NULL output (json_decode returns NULL when its input cannot be decoded, and binary PDF data is not valid JSON). If you change the script above to echo $result5, you will see the “image” data output.
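To illustrate why the dump shows NULL, here is a small sketch that checks what kind of body came back before decoding it. The function name describeBody and the fake PDF bytes are invented for the example; the only assumption about the real document is that a PDF body starts with the usual "%PDF-" signature.

```php
<?php
// Hypothetical helper: classify a response body before trying to decode it.
function describeBody(string $body): string
{
    // PDF files begin with the signature "%PDF-".
    if (strncmp($body, '%PDF-', 5) === 0) {
        return 'pdf';
    }
    json_decode($body);
    // json_decode returns null on failure, so check the error code explicitly.
    if (json_last_error() === JSON_ERROR_NONE) {
        return 'json';
    }
    return 'unknown';
}

// Stand-in for the binary data held in $result5 (not a real PDF).
$fakePdf = "%PDF-1.4\n\x00\xFF binary data";

var_dump(json_decode($fakePdf)); // NULL, because the body is not JSON
echo describeBody($fakePdf), PHP_EOL; // pdf
```

Checking json_last_error() rather than only the return value matters because the JSON literal null also decodes to NULL without being an error.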
To explain further, the following snippet (written to run in a browser) performs steps 2.1 and 3.
<html>
<head>
<title>PHP Test</title>
</head>
<body>
<?php
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, 'https://document-api.companieshouse.gov.uk/document/dSD6iyVGq4Nx5jnj0IOlqti5a7veemHTJREfRO0Gm6s/content');
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
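curl_setopt($curl, CURLOPT_HEADER, true); // include the HTTP response headers in the output
curl_setopt($curl, CURLOPT_VERBOSE, true);
curl_setopt($curl, CURLOPT_USERPWD, "<<YOUR_API_KEY>>"); // API key sent as the basic-auth username
// CURLOPT_FOLLOWLOCATION is deliberately left off so the 302 is not followed automatically
// curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
$response = curl_exec($curl);
$redirect = curl_getinfo($curl, CURLINFO_REDIRECT_URL); // the one-time-use URL from the Location: header
curl_close($curl);
echo "<pre>" . htmlspecialchars($response) . "</pre>"; // show the raw 302 response
echo "<a href='" . htmlspecialchars($redirect, ENT_QUOTES) . "'>Click for image</a>";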
?>
</body>
</html>
As you can see, CURLOPT_FOLLOWLOCATION has been commented out, which means the redirect returned by the document API for the /content URL is not followed. We can then take the URL from the returned Location header and use it as a link to open the PDF in a browser.
I hope this provides some further clarification.
Thanks,
@mfairhurst