Not selling anything… I’m just asking for help, guidance, support, approval…
I created a tool that lets users access Companies House data using natural language.
I already have the Python code that automates collecting company data, officers, people with significant control and accounts. It is all kept up to date in a self-managed MySQL database. This is probably what some of you are trying to automate yourselves… I have already done it, and built in the streaming APIs to keep the data updated.
Here is a link to the tool https://www.datadini.ai (it will only work on desktop).
I initially designed this for small businesses, startups and solo entrepreneurs to make data-driven decisions effortlessly and affordably. It's designed to be as simple as possible.
The data is updated in real time using the streaming APIs, so it is updated within minutes of the stream data being provided.
All XBRL files have been translated into database format. That’s over 4 million company accounts available as formattable text.
If you would like to test the prototype try typing this simple query into the chat “show me companies beginning with a”.
My ambition was to support more complex queries such as “show me companies with a turnover of over £100,000 and all directors aged over 70 years old”. However, it has been a struggle to get the AI to perform consistently as well as I expected.
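For anyone wondering what that second query has to turn into, here is a rough sketch of the SQL the AI would need to generate. The table and column names (`companies`, `officers`, `turnover`, `birth_year`, `birth_month`) are made up for illustration and are not the tool's real schema, and it uses Python's built-in `sqlite3` as a stand-in for MySQL so it runs anywhere. Ages are approximated from birth month and year, since that is all the public officer data provides. The tricky part is the word "all", which needs a `NOT EXISTS` rather than a plain join.

```python
import sqlite3
from datetime import date

# "All directors over N" = at least one director exists, and no director is N or younger.
QUERY = """
SELECT c.company_number, c.name
FROM companies c
WHERE c.turnover > :min_turnover
  AND EXISTS (
      SELECT 1 FROM officers o
      WHERE o.company_number = c.company_number AND o.role = 'director')
  AND NOT EXISTS (
      SELECT 1 FROM officers o
      WHERE o.company_number = c.company_number AND o.role = 'director'
        AND ((:ty * 12 + :tm) - (o.birth_year * 12 + o.birth_month)) / 12 <= :min_age)
ORDER BY c.company_number
"""


def build_demo_db():
    """Create a tiny in-memory database with hypothetical tables and rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE companies (company_number TEXT PRIMARY KEY, name TEXT, turnover INTEGER);
        CREATE TABLE officers (company_number TEXT, role TEXT,
                               birth_year INTEGER, birth_month INTEGER);
    """)
    conn.executemany("INSERT INTO companies VALUES (?, ?, ?)", [
        ("00000001", "Alpha Ltd", 250000),   # both directors over 70
        ("00000002", "Beta Ltd", 250000),    # one younger director
        ("00000003", "Gamma Ltd", 50000),    # turnover too low
        ("00000004", "Delta Ltd", 500000),   # no directors on record
    ])
    conn.executemany("INSERT INTO officers VALUES (?, ?, ?, ?)", [
        ("00000001", "director", 1940, 5),
        ("00000001", "director", 1945, 1),
        ("00000002", "director", 1940, 5),
        ("00000002", "director", 1980, 3),
        ("00000003", "director", 1940, 5),
    ])
    return conn


def find_companies(conn, min_turnover, min_age, today):
    """Return (company_number, name) rows matching the turnover and age rules."""
    params = {"min_turnover": min_turnover, "min_age": min_age,
              "ty": today.year, "tm": today.month}
    return conn.execute(QUERY, params).fetchall()


if __name__ == "__main__":
    print(find_companies(build_demo_db(), 100_000, 70, date(2024, 6, 1)))
```

Passing the thresholds as bound parameters rather than letting the model write them into the SQL string is one way to keep the generated queries a bit more predictable.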
It is slowly improving. However, after working on it single-handedly for nearly a year, I am running out of steam. It seems a shame to lose all this data, and all the tools created to keep it up to date.
In short, the process automates downloading the snapshot data, then transforming, normalising, cleansing and uploading it to a self-managed MySQL database. The streams run continuously, taking the 23-hour break to comply with the rules. The most complex part is identifying which data provided by the streams matches the snapshot data.
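To give a feel for the matching step, here is a minimal sketch, not the actual code behind the tool. It assumes each stream event is a dict with `resource_id`, `data`, and an `event` block holding `timepoint` and `type`, and that snapshot rows are held in a plain dict keyed by company number (a stand-in for the MySQL table). The helper names are hypothetical. The idea is to normalise the company number on both sides so the keys line up, and to ignore any event older than what is already stored.

```python
import re

# Optional one- or two-letter prefix (e.g. SC, NI) followed by digits.
_NUMBER_RE = re.compile(r"^([A-Z]{0,2})(\d+)$")


def normalise_company_number(raw):
    """Upper-case, trim and zero-pad a company number to 8 characters."""
    value = str(raw).strip().upper()
    match = _NUMBER_RE.match(value)
    if not match:
        return value  # unknown format: leave untouched rather than guess
    prefix, digits = match.groups()
    return prefix + digits.zfill(8 - len(prefix))


def apply_stream_event(store, event):
    """Merge one stream event into the snapshot store and report what happened."""
    data = event.get("data") or {}
    key = normalise_company_number(data.get("company_number") or event["resource_id"])
    timepoint = event["event"]["timepoint"]
    current = store.get(key)

    # Skip events that are not newer than what is already stored.
    if current is not None and current["timepoint"] >= timepoint:
        return "skipped"

    if event["event"]["type"] == "deleted":
        store.pop(key, None)
        return "deleted"

    # Start from the snapshot fields and overwrite with the streamed ones.
    merged = dict(current["data"]) if current else {}
    merged.update(data)
    store[key] = {"data": merged, "timepoint": timepoint}
    return "updated" if current else "inserted"


if __name__ == "__main__":
    store = {"00012345": {"data": {"company_name": "OLD LTD"}, "timepoint": 0}}
    event = {"resource_id": "12345",
             "data": {"company_number": "12345", "company_name": "NEW LTD"},
             "event": {"timepoint": 10, "type": "changed"}}
    print(apply_stream_event(store, event), store["00012345"]["data"])
```

Storing the last applied timepoint per record also makes it safe to replay part of a stream after a reconnect, because repeated events are simply skipped.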
I have all the company profiles, officers, people with significant control and all XBRL accounts files converted to mutable text.
To the best of my ability, I am abiding by Companies House’s terms of use because the data is kept up to date and accurate, and copyright statements display the source of the data.
I would like either help continuing with this, maybe funding to keep it going if anybody finds it interesting, or approval from Companies House to say what I am doing is permitted. Otherwise I will just shut it down and pivot to something else.
I spent a year of my life building this, while earning nothing. Now it is potentially all going to go to waste.
Let me know what you think.