I am able to search for a company using the first URL, "https://api.companieshouse.gov.uk/search/companies?q={}", but not using "https://api.company-information.service.gov.uk/search/companies?q={}". Could you please tell me what the difference between these two is? This Python code that connects to the CH API is hosted on PythonAnywhere.com. Do you think it has something to do with an IP address restriction?
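import base64
import requests

url = "https://api.companieshouse.gov.uk/search/companies?q={}"
#url = "https://api.company-information.service.gov.uk/search/companies?q={}"
api_key = "****"  # API key, sent as the Basic auth username
password = ""  # left empty, as in the curl -u MY_API_KEY_HERE: form
# b64encode takes bytes and returns bytes, hence .encode() before and .decode() after
auth_string = base64.b64encode(f"{api_key}:{password}".encode()).decode()
headers = {'Authorization': f'Basic {auth_string}'}
# query holds the search term and is defined elsewhere
response = requests.get(url.format(query), headers=headers)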
I believe api.company-information.service.gov.uk is recommended. They both currently work though.
What HTTP response code do you get (I believe this may be response.status_code in Python)? Any message? Both hosts work fine for me this morning (I've snipped some of the JSON response):
curl -u MY_API_KEY_HERE: "https://api.company-information.service.gov.uk/search/companies?q=natwest&items_per_page=3&start_index=0"
{"total_results":140,"items":[{ ... }],"start_index":0,"kind":"search#companies","items_per_page":3,"page_number":1}
curl -u MY_API_KEY_HERE: "https://api.companieshouse.gov.uk/search/companies?q=natwest&items_per_page=3&start_index=0"
{"total_results":140,"items_per_page":3,"page_number":1,"items":[{ ... }],"start_index":0,"kind":"search#companies"}
Is it possible for you to try this using curl from the environment where you're trying to run things? I recommend that as it's probably the simplest way to do this, and you can very easily turn on "verbose" mode and see exactly what is being sent, to where, and what is returned.
If both calls run from the same environment, then if one works the other should too. If you have registered the IP with Companies House it should be fine. You mentioned this is hosted somewhere (presumably not on your own server), so I guess it's possible that it is actually running on a cloud server and the IP the calls come from is switching between runs? If you try to call Companies House from an IP not registered to your API key it won't work.
Not that it should make a difference if you have the same search terms each time but what search term are you using?
Aside / probably unrelated: why do you need the .encode()).decode() part at the end of the string? (I'm not a Python coder but this seems odd.)
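On the .encode()).decode() question, a minimal sketch of what that line is doing, in case it helps. The function name basic_auth_header is made up for illustration; the only library call is base64.b64encode from the standard library, which takes bytes and returns bytes, so the string has to be encoded going in and decoded coming out.

```python
import base64


def basic_auth_header(api_key, password=""):
    """Build an HTTP Basic Authorization header (hypothetical helper name)."""
    # b64encode only accepts bytes, so the "key:password" string is encoded first.
    token = base64.b64encode(f"{api_key}:{password}".encode())
    # b64encode returns bytes; decode back to str so it can go into the header value.
    return {"Authorization": f"Basic {token.decode()}"}
```

With requests, passing auth=(api_key, "") to requests.get should produce the same header without the manual base64 step, which is the equivalent of curl -u MY_API_KEY_HERE: in the examples above.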
I am getting this response:
requests.exceptions.ProxyError: HTTPSConnectionPool(host='api.company-information.service.gov.uk', port=443): Max retries exceeded with url: /search/companies?q=CompanyNumber%7C07416642 (Caused by ProxyError('Cannot connect to proxy.', OSError('Tunnel connection failed: 403 Forbidden')))
Do you know what it means?
However, this works fine: https://api.companieshouse.gov.uk/search/companies?q={}. I am getting 200 as the response status code.
Not sure what’s happening but if you’re getting a proxy error that does suggest something else entirely. (You can search on this forum for 403 errors, not sure that will help you).
What is that URL query string though? q=CompanyNumber%7C07416642?
If you are searching for a company number you can just look up the number directly (and obtain the Company Profile):
curl -u MY_API_KEY_HERE: "https://api.company-information.service.gov.uk/company/07416642"
(this finds a company)
I’m not sure why you’d want to but if you did want to search using that number (note - this searches several fields as well as the company number e.g. the name field…):
curl -u MY_API_KEY_HERE: "https://api.company-information.service.gov.uk/search/companies?q=07416642"
This also happens to find this company (and nothing else, for this number…)
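For anyone following along in Python, here is a rough sketch of the two URL shapes above. The helper names company_profile_url and search_url are made up for illustration, and the items_per_page default is just borrowed from the earlier curl examples. It also shows that %7C in the original query is simply a URL-encoded "|" character.

```python
from urllib.parse import quote, unquote, urlencode

BASE = "https://api.company-information.service.gov.uk"


def company_profile_url(company_number):
    """URL for a direct lookup by company number (hypothetical helper)."""
    return f"{BASE}/company/{quote(company_number, safe='')}"


def search_url(query, items_per_page=3):
    """URL for a free-text company search (hypothetical helper)."""
    params = urlencode({"q": query, "items_per_page": items_per_page})
    return f"{BASE}/search/companies?{params}"


# The query string from the error message decodes to "CompanyNumber|07416642".
decoded = unquote("CompanyNumber%7C07416642")
```

So the original request was searching for the literal text "CompanyNumber|07416642" rather than for the number itself; company_profile_url("07416642") gives the direct lookup URL instead.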
Thanks voracityemail! I think I have already figured out why https://api.companieshouse.gov.uk is working and api.company-information.service.gov.uk is not. It is because someone has already registered the former on the pythonanywhere.com whitelist. I think I have to do the same if I want the latter to work on this site as well. Thank you for all the valuable insights that you've shared!
https://www.pythonanywhere.com/whitelist/
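In case it is useful to anyone hitting the same ProxyError, below is a rough sketch for checking which of the two hosts is reachable from a given environment. It uses only the standard library; the names fetch_status and probe_hosts are made up for illustration, and the search term is just the one from the curl examples. The assumption is that an HTTP answer from the API itself (even a 401, since no API key is sent) means the host was reached, while a proxy or network block shows up as an exception instead.

```python
import urllib.error
import urllib.request

HOSTS = [
    "api.companieshouse.gov.uk",
    "api.company-information.service.gov.uk",
]


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, including 4xx/5xx answers."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered with an error status (e.g. 401 without an API key),
        # which still tells us the host was reachable.
        return err.code


def probe_hosts(hosts, fetch=fetch_status):
    """Map each host to either 'HTTP <code>' or a description of the failure."""
    results = {}
    for host in hosts:
        url = f"https://{host}/search/companies?q=natwest"
        try:
            results[host] = f"HTTP {fetch(url)}"
        except Exception as err:
            # Proxy and connection failures are recorded here rather than raised.
            results[host] = f"{type(err).__name__}: {err}"
    return results
```

Calling probe_hosts(HOSTS) from the hosted environment and printing the result should show at a glance whether one host is being blocked before the request ever reaches Companies House.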