How do I use the API with Python? Can anyone help me with this? Thanks.
Just search the forum or google, unless you’ve got a specific question.
Getting access - solving some simple issues:
There are some wrappers out there:
There’s a good demo of what you can do:
Thanks, but I am facing an authorization issue.
HTTPError: 401 Client Error: Unauthorized for url: https://api.companieshouse.gov.uk/search/companies?access_token=rVSsISQmU2Xt4wR-xDfYV1ACbakI9IT8vRVSttxZ&q=dyson
Since you haven’t said what you’re actually doing / want to do I can only guess - but:
a) I think you need to read the authorisation documentation at this point. It looks like you’ve tried to authorize in an incorrect way.
b) Did you try reading through any of the posts listed in my previous post? I’d try copying the simplest of these e.g. just log on and run a single query to ensure you’re getting the authorization correct.
If you still wanted some more assistance after addressing the points above you would need to be a bit more specific. Rather than asking “how does this work?” or stating “I get an error x” try stating (simply) what you want to achieve, what you tried and the result.
Hi,
I am using the ‘chwrapper’ Python package to search companies.
import chwrapper
search_client = chwrapper.Search(access_token='rVSsISQmU2Xt4wR-xDfYV1ACbakI9IT8vRVSttxZ')
response = search_client.search_companies('dyson')
response.json()
I tried with both keys, but I still face this authorization issue:
HTTPError: 401 Client Error: Unauthorized for url: https://api.companieshouse.gov.uk/search/companies?access_token=rVSsISQmU2Xt4wR-xDfYV1ACbakI9IT8vRVSttxZ&q=dyson
But when I accidentally used another API key ('V0NtegKtBvgJlYZben-yVvWAl1ce5U9u7oG0FLGv') found in one of the questions on the forum, it worked. I don’t want to violate the rules.
Can you please help me with this? I am not able to extract data with my own API key. Thanks.
(Aside) I think we’re supposed to keep the keys secret - there should be no need to post them anyway.
I don’t use Python or the chwrapper - so apologies if I’ve mentioned something which has bugs in it.
A quick look shows you appear to be using the library as it recommends. I can’t do debugging for you but as a pointer the error seems to show that instead of - or in addition to - submitting the normal “q” parameter for companies search, you’re adding your API key as a parameter on the query string. As per the documentation this shouldn’t be passed as a query parameter but as part of the http header. Again you’ll need to debug this yourself but I did notice a couple of places where this could be being set:
In services/base.py, class Service, function get_session
The line:
session.params.update(access_token=access_token)
…seems a little odd, given that later in this function there seems to be something setting the authorisation parameter
session.auth = (access_token, "")
In the search.py, class Search function search_companies
…there is the possibility to send additional query parameters (kwargs):
def search_companies(self, term, **kwargs)
I have run this basic code.
import requests
import base64
key = 'xxxxxxxxxx'
#keyEncoded = base64.b64encode(key.encode('utf-8'))
test = requests.get('https://api.companieshouse.gov.uk/company/00000006', headers={'Authorization': key})
print(test)
But still, I get a 400 or 401 error.
Can you please help me with this?
Your code fragment suggests that you’re:
- Not using HTTP Basic authentication correctly, e.g. there’s no “Basic” in there…
- Your code doesn’t show this, but I think you may not be appending the colon after the API key.
See anywhere, e.g. Wikipedia on HTTP Basic authentication:
A request contains a header field of the form Authorization: Basic <credentials>, where credentials is the base64 encoding of id and password joined by a single colon (:).
Even though the CH documentation is fairly clear, this seems to cause disproportionate amounts of trouble. I suppose they could spell this out in terms of the actual HTTP headers or provide worked examples in different languages. It was enough for me…
I think Companies House did themselves no favours when they decided not to use both username and password in their API key system. I’m sure they had their reasons.
It may also be that http Basic is “too simple” e.g. not obviously tricky so people pay less attention.
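The header format quoted above can be built by hand in a few lines. This is a minimal sketch, not an official client: basic_auth_header is a hypothetical helper name and the key is a placeholder. It appends the colon to the key, base64-encodes the result once, and prefixes it with “Basic”.

```python
import base64


def basic_auth_header(api_key):
    """Build an HTTP Basic Authorization header for a key-only login.

    Hypothetical helper: the API key is the username and the password is empty.
    """
    # Basic credentials are "username:password"; with no password this is "key:".
    raw = f"{api_key}:".encode("utf-8")
    # Base64-encode exactly once and prefix with the scheme name.
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


headers = basic_auth_header("your-api-key")  # placeholder key
print(headers)
# With the requests library the same header is produced by passing
# auth=("your-api-key", "") to requests.get(), with no manual encoding.
```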
I tried with that too…
from requests.auth import HTTPBasicAuth
import base64
import requests
apiKey = "xxxxxxxxxxxxx:"
keyEncoded = base64.b64encode(bytes(apiKey, "utf-8"))
url1 = "https://api.companieshouse.gov.uk/company/00002065"
re = requests.get(url1,auth=HTTPBasicAuth(keyEncoded,''))
But I get a 400 response error.
Can you please help me with this? Thanks.
When you say “response 400 error” do you mean 401? If it was a 400 that means you’ve garbled your url / parameters somehow - although I can’t make CH give this response on this endpoint. However it points to the following line having an issue:
re = requests.get(url1,auth=HTTPBasicAuth(keyEncoded,''))
(Aside - I’ve listed typical error codes from the API towards the end of this post).
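A likely cause for that line, offered as an assumption since it has not been run against a real key: the HTTPBasicAuth helper in requests base64-encodes the username and password itself, so handing it a value that is already base64-encoded means the key is encoded twice. The sketch below imitates that step with the standard library only; basic_credentials is a hypothetical stand-in for what the helper does, and the key is a placeholder.

```python
import base64


def basic_credentials(username, password=""):
    """Imitate a Basic-auth helper: return base64("username:password")."""
    return base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")


api_key = "my-api-key"  # placeholder, not a real key

# Intended usage: hand the plain key to the helper and let it do the encoding.
correct = basic_credentials(api_key)

# Pattern from the post: encode "key:" first, then hand the result to the helper.
pre_encoded = base64.b64encode(f"{api_key}:".encode("utf-8")).decode("ascii")
double = basic_credentials(pre_encoded)

# Decoding shows what the server would see in each case.
print(base64.b64decode(correct))  # the key followed by a colon
print(base64.b64decode(double))   # base64 text followed by a colon, not the key
```

If that is what is happening, passing the plain key, e.g. auth=HTTPBasicAuth(api_key, ''), avoids the double encoding.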
Last post from me. I think you should start from basics to prove that your key / IP / network are working together first then go from there. I always start with something like curl (command line) to prove I can get a response at all, then start putting things into code.
Good overview in the following comment / thread (code example is for perl but it’s really simple):
There’s a checklist for debugging CH on my post on the same thread.
(This also links to two examples in Python).
I was going to ask if you’d checked you were making the requests from a registered IP but I think you should get 403 in this case - see:
If you are using localhost as the server you need to have registered this as above and there’s some advice below (end of the thread):
On your examples - I do note that in each example you’ve posted you’ve changed several things compared to the last example e.g. changing the key data AND the library you’re using which might make it difficult to fix the bug.
Errors / responses
These vary slightly depending on the endpoint but for the company profile endpoint you’re calling - which looks OK e.g. just tested that company myself - you should get one of the following:
200 - you should get the data
400 - bad request - unsure how you’d get this for this endpoint (I’ve just tried) but some others may give you this for incorrect requests.
401 - unauthorised - there’s a problem with your API key or the Authorisation header
403 - forbidden - you were trying to access CH from an IP / domain you hadn’t registered with CH, or you’ve been banned
404 - not found - no such company
429 - you exceeded the rate limit.
(Very rarely these days but occasional 5xx responses when there’s a server issue).
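The list above can be turned into a small lookup for logging what went wrong. A minimal sketch: explain_status is a hypothetical helper, and the messages simply restate the list above for the company profile endpoint.

```python
# Meanings taken from the list above (company profile endpoint).
STATUS_HINTS = {
    200: "OK: the data should be in the response body",
    400: "Bad request: check the URL and parameters",
    401: "Unauthorised: problem with the API key or the Authorization header",
    403: "Forbidden: request came from an unregistered IP / domain, or you have been banned",
    404: "Not found: no such company",
    429: "Rate limit exceeded",
}


def explain_status(status_code):
    """Return a short hint for an HTTP status code returned by the API."""
    if status_code in STATUS_HINTS:
        return STATUS_HINTS[status_code]
    if 500 <= status_code < 600:
        return "Server error: occasional issue on the server side"
    return "Unexpected status code"


print(explain_status(401))
```

With requests this would be called as explain_status(response.status_code).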
Hi,
Don’t know if you’ve managed to solve this yet. You’re missing the .json() at the end of your request; also, Authorisation shouldn’t have a capital. This should work:
re = requests.get(url1, auth=(key, '')).json()
Free code snippet of a Python class (Python 3) which can be used like an API wrapper(ish):
from collections import OrderedDict

from requests import Session
# pip install fuzzywuzzy
# pip install python-Levenshtein
from fuzzywuzzy import fuzz

# Placeholder: the post did not include normalise_map. It maps a substring
# to its replacement; fill it in to suit your data.
normalise_map = {}


def normalise(s):
    # Apply each replacement, then drop any "trading as" suffix.
    for value, replace_value in normalise_map.items():
        s = s.replace(value, replace_value)
    return s.split('t/as')[0].split('t/a')[0].split('t/ a')[0]


def fuzz_company_name(a, b, verbose=False, strict_verbose=False):
    if verbose and strict_verbose:
        raise NotImplementedError('You can only use verbose or strict_verbose, not both')
    if verbose:
        print('\t{} {} | {}'.format(fuzz.ratio(normalise(a).lower(), normalise(b).lower()), normalise(a), normalise(b)))
    if strict_verbose:
        # only verbose names which might be of interest, instead of all regardless of fuzz match ratio
        if fuzz.ratio(normalise(a).lower(), normalise(b).lower()) > 60:
            print('\t{} {} | {}'.format(fuzz.ratio(normalise(a).lower(), normalise(b).lower()), normalise(a), normalise(b)))
    return fuzz.ratio(normalise(a).lower(), normalise(b).lower())


class CompaniesHouse(object):
    def __init__(self):
        self.s = Session()
        # API key as the username, empty password (HTTP Basic authentication)
        self.s.auth = ('your-api-key-goes-here', '',)
        self.url = 'https://api.companieshouse.gov.uk/{}?{}'

    def companies(self, search_term):
        with self.s.get(self.url.format('search/companies', search_term)) as response:
            try:
                return response.json()
            except ValueError:  # response.json() raises a ValueError subclass when the body is not JSON
                # if you get a 429 error here then you are using the api too much
                import pdb;pdb.set_trace()

    def company_search(self, name: str, fuzz_ratio=89):
        items = None
        items = self.companies('q={}'.format(name)).get('items')
        if items and name:
            for item in items:
                if fuzz_company_name(item.get('title'), name) > fuzz_ratio:
                    return dict(OrderedDict(sorted(item.items(), key=lambda y: y[0])))
                else:
                    return name, item.get('title'), fuzz_company_name(item.get('title'), name),
Also just noticed that you are using the streaming keys to access the non-streaming API.
To resolve this you should request a REST API key from here:
Hello! I know it has been a little while since you posted this, but I thought I would try my luck and see if you could help, as I am having a similar issue. I am quite new to Python, so apologies in advance if I am missing something simple! I have read through a few of these threads and tried some different solutions but still can’t figure out what isn’t working. I am trying the below with a REST API key and get the error message: {'error': 'Invalid Authorization header', 'type': 'ch:service'}
from pprint import pprint
from requests.auth import HTTPBasicAuth
import base64
import requests
apiKey = 'MYKEY:'
keyEncoded = base64.b64encode(bytes(apiKey, 'utf-8'))
url1 = 'https://api.companieshouse.gov.uk/company/FC015679'
re = requests.get(url1,auth=HTTPBasicAuth(keyEncoded,'')).json()
pprint(re)
Did you manage to solve your issue? Any help would be much appreciated. Thanks in advance!
I have a question about how to connect to the CH Streaming API using Python and the requests library. I don’t think that the chwrapper package can help with access to the Streaming API.
I’m adding this to this thread because dean_mcginn mentioned that someone else was mixing up the streaming API with the REST one. Plus, the title of this post does not say that the original question was only about the REST API.
I removed my_api_key from the code because I want to keep my key safe. I did use the stream key and not the REST API one.
import requests
url = 'https://stream.companieshouse.gov.uk/insolvency-cases'
headers = {'my_api_key': '***********my_api_key**********'}
r = requests.get(url, headers=headers)
print(r.text)
This results in the following error:
{"error":"Empty Authorization header","type":"ch:service"}
What am I doing wrong? Any pointer in the right direction will help. Thanks in advance.
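The error text gives a hint: the request carries a header named my_api_key, so the Authorization header the server looks for is missing. A hedged sketch follows, assuming the streaming API accepts the stream key via HTTP Basic authentication in the same way as the REST API (key as username, empty password), and assuming events arrive as one JSON object per line with blank lines as keep-alives; both assumptions should be checked against the streaming documentation. build_stream_request, parse_events and follow are hypothetical names, and the sketch uses only the standard library.

```python
import base64
import json
import urllib.request

STREAM_URL = "https://stream.companieshouse.gov.uk/insolvency-cases"


def build_stream_request(url, stream_key):
    """Build a request whose Authorization header carries the stream key."""
    # Assumption: Basic auth with the key as username and an empty password.
    credentials = base64.b64encode(f"{stream_key}:".encode("ascii")).decode("ascii")
    return urllib.request.Request(url, headers={"Authorization": f"Basic {credentials}"})


def parse_events(lines):
    """Yield one decoded JSON object per non-empty line."""
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines are keep-alives, not events
        yield json.loads(line)


def follow(url, stream_key):
    """Open the stream and yield events as they arrive (needs network access)."""
    with urllib.request.urlopen(build_stream_request(url, stream_key)) as response:
        yield from parse_events(response)


# Usage (needs a real stream key and network access):
# for event in follow(STREAM_URL, "your-stream-key"):
#     print(event)
```

With requests, the equivalent call would be requests.get(url, auth=(stream_key, ''), stream=True), reading the body with response.iter_lines().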