Search company officers returns HTTP 416 when start_index is over 300

The following request to search company officers returns a response with HTTP status 416 (Range Not Satisfiable).

GET /search/officers?q=John+Smith&items_per_page=100&start_index=301 

The same request with start_index <= 300 works fine. This seems to have changed recently as we used to paginate further than this in the past (last used a few weeks ago).
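For anyone paginating in a loop, here is a minimal sketch of stopping cleanly when the limit is hit rather than failing. It assumes a hypothetical `fetch_page(start_index, items_per_page)` helper that wraps the GET request and returns a `(status, items)` tuple, and it assumes `start_index` counts from 0. Both the helper and the function name are illustrative, not part of the API.

```python
from typing import Callable, Iterator, List, Tuple

Page = Tuple[int, List[dict]]  # (HTTP status, items on that page)


def iter_results(
    fetch_page: Callable[[int, int], Page],
    items_per_page: int = 100,
) -> Iterator[dict]:
    """Yield results page by page, stopping at the first 416 or empty page.

    `fetch_page` is a hypothetical helper wrapping the search request; it is
    passed in so this sketch does not depend on a particular HTTP client.
    """
    start_index = 0  # assumption: the first record is at index 0
    while True:
        status, items = fetch_page(start_index, items_per_page)
        if status == 416:
            # The server refuses to page any further; treat it as the end.
            return
        if status != 200:
            raise RuntimeError(f"unexpected HTTP status {status}")
        if not items:
            return
        yield from items
        start_index += items_per_page
```

This only avoids the error; it does not retrieve anything beyond the limit.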

Is this a known issue? It causes problems for us because, when searching for a common name (like John Smith), we often need to paginate beyond 300 results to find the right one.

Yes, this is a known issue. It follows a restriction we deliberately introduced to prevent behaviour that was causing performance issues on our search service. Whilst you are not guilty of that behaviour, you have been inadvertently caught by the same restriction.

We are working on a better solution to limit that behaviour whilst allowing legitimate searches of the extent you describe.

Hi,

We have noticed using the search-companies API that the system now allows 400 records to be returned and issues a 416 error whenever the page containing the 401st record is requested.

Some further issues which need addressing:

  1. Your search algorithm seems incorrect, as it is “OR” based rather than “AND” based and there is no ability to search for exact phrases. This returns far more irrelevant records than an “AND” based or exact phrase search would. For example, “The Donaldson Trust” returns nearly 300,000 results containing all companies with “The” OR “Donaldson” OR “Trust” in the title.
  2. The record limit combined with the search algorithm results in users being unable to find their company. When users are asked to narrow their search they tend to type in more words, which only makes the result set larger and harder to reach through paging, and also adds an unnecessary overhead on your resources.
    Is it possible to increase the record limit for ourselves, or implement the above changes?
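In the meantime, one client-side workaround is to filter each returned page so that only results containing every query word are kept. A rough sketch follows, assuming each result is a dict with a `title` field (an assumption about the response shape); the function names are illustrative.

```python
import re
from typing import Iterable, List


def _words(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def filter_all_terms(items: Iterable[dict], query: str) -> List[dict]:
    """Keep only items whose title contains every word of the query."""
    wanted = set(_words(query))
    return [
        item for item in items
        # `title` is an assumed field name; a missing title matches nothing
        # unless the query itself is empty.
        if wanted <= set(_words(item.get("title") or ""))
    ]
```

This narrows what is shown to the user, but it only works on the pages that can actually be retrieved, so it does not get around the record limit.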

I’m facing the same issue…

This is a frustrating experience. Why isn’t this documented?

The single overriding aim of our search is to find the SINGLE company a customer is looking for, or at least get it into the first 10 matches. It is NOT to trawl through our complete company (or officer) names and numbers; there are other products for that.
All that said, the search is both an ‘OR’ and an ‘AND’ search. It searches indexes of the company name, company number, surname, forename, other names and postcode, with some analysis of other fields, such as dissolved dates, to help with relevance.
Looking at your example of ‘The Donaldson Trust’: that company does not exist, so there is no exact match. If you try, for example, ‘DONALDSON TRUSTEE LIMITED’, it is the first result because it is an exact match.
There is a bit more detail regarding our search facility on the following forum thread:-

I’ve come across this when searching for companies by postcode, e.g. HP2 7DN, which finds 487 results, of which I can only retrieve the first 400.

Given the ‘or’ issue and no ability to order the results, I can’t see any way to retrieve the ‘missing’ 87 or to split into two searches.

Can you assist, please?

I have just spotted that our service (https://check-payment-practices.service.gov.uk) is falling foul of this. We paginate based on the total result count returned in the JSON response from the search, but when the user tries to select a page above 15 they see an error because we get the 416 response.

I appreciate the reasons you give for this, and can do some work to limit the page numbers on our service, but this limit isn’t mentioned in the Search Companies API documentation, so the first I became aware of it was when our service was live in production.
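In case it helps others hitting the same thing, here is a sketch of capping the pager. The default of 400 is the figure reported earlier in this thread for search-companies; it is not a documented value, so treat it as a configurable assumption. The function name is illustrative.

```python
import math


def displayable_pages(total_results: int, items_per_page: int,
                      max_reachable: int = 400) -> int:
    """Return how many pages of results can safely be offered to the user.

    `max_reachable` is the assumed cap on retrievable records (400 was
    reported in this thread; it is not an official, documented value).
    """
    if items_per_page <= 0:
        raise ValueError("items_per_page must be positive")
    total = max(total_results, 0)
    if total <= max_reachable:
        # Everything is reachable, including a final partial page.
        return math.ceil(total / items_per_page)
    # A page that straddles the cap is reported to fail with 416,
    # so only whole pages below the cap are counted.
    return max_reachable // items_per_page
```

For example, `displayable_pages(487, 100)` gives 4, so the pager would stop at page 4 even though the response reports 487 results.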

Please could you add this to the documentation?