API returns a total_count of '0' when there are definitely more

Hey everyone, I’ve been trying to set up an application using the API but have run into some problems, and I’m hoping somebody can help. I am using Python to retrieve the data.

Basically, I have a txt file containing a list of company numbers that I want the API to search for. Here is my code:

input_file = open('company.txt', 'r')
for x in input_file:

    url = 'https://api.companieshouse.gov.uk/company/{}/filing-history'.format(x)

    r = requests.get(url, auth=('AUTH', ''))

    data = r.json()

    print(data)

This cycles through all the entries in the file and, for each line, constructs a new URL which is then sent to the API. The problem is that the API only really processes the first one on the list, even though I know for a fact that each line is being cycled through individually. This is what happens:

https://api.companieshouse.gov.uk/company/06495921
/filing-history
{'filing_history_status': 'filing-history-available', 'total_count': 0, 'items_per_page': 25, 'start_index': 0, 'items': []}

https://api.companieshouse.gov.uk/company/03778604/filing-history
{'filing_history_status': 'filing-history-available', 'total_count': 137,

As you can see, one of the URLs gets sent off to the API but returns 0 results, even though I know this is not true, while the other returns 137. It’s very odd because it doesn’t matter what company numbers I put into the text file, at least one of them returns blank.

I have tested each company individually and lots of results are returned; it’s only when I try to run them one after the other that I get this error.

Furthermore, if I run a script that makes two such queries one right after the other, the API returns the results exactly as expected. That code is:

import requests
import json

url = 'https://api.companieshouse.gov.uk/company/03778604/filing-history'

r = requests.get(url, auth=('AUTH', ''))

data = r.json()

print(data)

url = 'https://api.companieshouse.gov.uk/company/00235446/filing-history'

r = requests.get(url, auth=('AUTH', ''))

data = r.json()

print(data)

Do you think this is a problem on my side, or perhaps something elsewhere?

Any help is greatly appreciated

OK, I think I’ve figured out the cause. I hadn’t realised that one of the API requests had a newline between the company number and the ‘/filing-history’, which made the request invalid.

This is because when the code reads the text file, it also copies over the newline character ‘\n’ at the end of each line and adds that to the API request, hence the mix-up.
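A minimal sketch of what is happening, using an in-memory stand-in for company.txt (the `fake_file` name and its contents are just for illustration, and no request is actually sent):

```python
import io

# Stand-in for company.txt: two company numbers, one per line,
# with no newline after the last one (an assumption about the file).
fake_file = io.StringIO("06495921\n03778604")

urls = []
for x in fake_file:
    # x still carries the trailing '\n' for every line except the last
    url = 'https://api.companieshouse.gov.uk/company/{}/filing-history'.format(x)
    urls.append(url)
    # repr() makes the otherwise invisible newline show up as \n
    print(repr(x), repr(url))
```

The last line only comes through clean if the file has no newline after it, which may explain why one of the numbers still works.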

Looking for a solution online now, but it’s a very simple problem.

@carl_jabbour

My suspicion is that you have a carriage return/newline on the end of the company number in your company.txt file. You are then effectively calling the API with a company number that includes the CR or linefeed. As this company doesn’t exist, we should return a 404 “not found” status code, but are instead returning the message you referenced in your post, with a total count of 0.

I have tested using the code below with and without the x = x.replace line and have replicated what you are experiencing.

#!/usr/bin/env python

import requests

input_file = open('company.txt', 'r')
for x in input_file:
    x = x.replace("\n", "")
    url = 'https://api.companieshouse.gov.uk/company/{}/filing-history'.format(x)
    print(url)
    r = requests.get(url, auth=('<<auth_key>>', ''))

    data = r.json()

    print(data)
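As a variation on the code above, here is a rough sketch that uses str.strip() instead of replace, so that a Windows-style "\r\n" ending and stray spaces are removed as well, and blank lines are skipped. The names `clean_company_numbers`, `BASE` and `sample` are made up for illustration, and the sample text stands in for company.txt so nothing is sent to the API:

```python
import io

BASE = 'https://api.companieshouse.gov.uk/company/{}/filing-history'

def clean_company_numbers(lines):
    """Yield company numbers with surrounding whitespace removed, skipping blank lines."""
    for line in lines:
        number = line.strip()  # removes '\n', '\r\n', spaces and tabs
        if number:
            yield number

# Stand-in for company.txt with mixed line endings and an empty line.
sample = io.StringIO("06495921\r\n03778604\n\n00235446")

urls = [BASE.format(n) for n in clean_company_numbers(sample)]
for u in urls:
    print(u)
```

Each resulting URL can then be passed to requests.get as before. It may also be worth checking r.status_code before calling r.json(), so that a failed request is not mistaken for an empty result.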

Hope this helps

Thanks

@mfairhurst