Filing History html/XBRL Python Help

Hi everyone! I’m trying to access company financials and I’m having a difficult time with the content type of the filing. I’m looking at company ID 00197009. I was able to find the filing in the bulk data in HTML format; however, through the API I’m only getting the PDF version of it. Can anyone help me? My Python code is below.

import requests
import json
import requests.packages.urllib3
requests.packages.urllib3.disable_warnings()

url = "https://api.companieshouse.gov.uk/company/00197009/filing-history/MzE5Mzc0OTc3MGFkaXF6a2N4"

resp = requests.get(url, auth=('API_KEY', ''))

resp_json = resp.json()
print(resp_json["links"]["document_metadata"])

url_2 = "https://document-api.companieshouse.gov.uk/document/T53BLYf734zxeBWyvna131JtREqLsBgclFME-v6rxI8/content"

resp_2 = requests.get(url_2, auth=('API_KEY', ''), headers={"content-type": "application/html"})

print(resp_2.headers)
print(resp_2.url)

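I was able to fix this myself with the following code. I had to change the “content-type” header to “Accept”, and “html” to “xhtml+xml”. “Accept” is the request header that tells the server which format you want back, while “content-type” only describes a body you are sending.

import requests
import json
import requests.packages.urllib3
requests.packages.urllib3.disable_warnings()

# Filing history item for company 00197009
url = "https://api.companieshouse.gov.uk/company/00197009/filing-history/MzE5Mzc0OTc3MGFkaXF6a2N4"

# The API key is the username; the password is left blank
resp = requests.get(url, auth=('API_KEY', ''))

resp_json = resp.json()
print(resp_json["links"]["document_metadata"])

# Document metadata link from above, with /content appended
url_2 = "https://document-api.companieshouse.gov.uk/document/T53BLYf734zxeBWyvna131JtREqLsBgclFME-v6rxI8/content"

# Ask for the XHTML version instead of the default PDF
resp_2 = requests.get(url_2, auth=('API_KEY', ''), headers={"Accept": "application/xhtml+xml"})

print(resp_2.headers)
print(resp_2.url)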
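
For anyone reusing this, here is a minimal sketch of the same two steps wrapped into helpers, so the document URL comes from the metadata link rather than being pasted in by hand. The function names (content_url, available_formats, fetch_document) are made up for illustration. It assumes the document_metadata link is a full URL, as printed above, and that the metadata JSON lists the available formats under a "resources" key; check that against the response you actually get. The registered media type for XHTML is application/xhtml+xml, which is what the sketch sends in the Accept header.

```python
XHTML = "application/xhtml+xml"  # registered media type for XHTML


def content_url(metadata_url):
    """Return the /content endpoint for a document metadata URL."""
    return metadata_url.rstrip("/") + "/content"


def available_formats(metadata):
    """List the media types a document is offered in.

    Assumption: the metadata JSON has a "resources" object keyed by
    media type. Returns an empty list if the key is missing.
    """
    return sorted(metadata.get("resources", {}))


def fetch_document(session, metadata_url, api_key, accept=XHTML):
    """GET the document body in the requested format.

    `session` is any object with a requests-style .get(),
    for example requests.Session().
    """
    resp = session.get(
        content_url(metadata_url),
        auth=(api_key, ""),          # API key as username, blank password
        headers={"Accept": accept},  # Accept selects the response format
    )
    resp.raise_for_status()          # fail loudly on 4xx/5xx
    return resp
```

Usage would look something like: create session = requests.Session(), fetch the filing history item as before, pass resp_json["links"]["document_metadata"] and your key to fetch_document, and read resp.text. If a filing only exists as a PDF, available_formats on its metadata should show that before you request a format that is not there.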

I was facing the exact same problem. You’re a life saver.