It has never been easier to build an OSINT tool that looks capable. With a handful of APIs, an AI coding assistant, and a clear use case, investigators and intelligence analysts can now assemble something in days that would previously have taken a developer months.
Search a subject, pull records from multiple sources, display them in a single interface, and add enough automation to make it feel like a real investigation product. That is exactly what we tested with this build.
We created a UK-focused entity investigation tool designed to bring together public records, social media signals, breach data, web results, and evidence capture in one workflow. No engineering team. No long development cycle. Just APIs, some iteration, and a bit of trial and error.
The result was impressive. It also reinforced the same lesson we have seen elsewhere: building a tool is no longer the hard part. Building one that structures public data to generate intelligence is.
The tool was built around a simple starting point: investigate a person or entity through a single interface.
A user enters a subject and the system pulls back data from multiple sources: public records, social media and online account signals, breach data, open web results, and captured evidence.
The aim was not just to display results side by side, but to begin structuring them into something closer to an investigation workflow.
That included resolving whether records from different sources referred to the same person, grading sources by reliability, attaching metadata to every result, and passing structured data into link analysis.
This made the build much more than a search interface. It started to behave like an investigation system.
To make that work, we combined a set of APIs and services that each played a different role.
Public Insights - Used as the public records layer for UK people and businesses. This provided structured access to datasets that help establish identity, location, business interests, and wider local connections.
OSINT Industries - Used for social media and online account discovery. This added behavioural and platform-level signals that public records alone would not provide.
DeHashed - Used for breach data. This added contact points, exposure indicators, and additional identifiers that can support further resolution work, while also needing careful handling because breach data should not be treated as fact.
Brave Search - Used for open web search. This brought in articles, mentions, and other contextual results that can help widen the picture around a subject.
Site-Shot - Used for screenshot capture. This helped preserve relevant pages and results visually, supporting traceability and later review.
WhatsMyName - Integrated as a fallback option for username checks. This provided broader support when looking at usernames or online handles that might not be captured by reverse searching emails and phone numbers.
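As a rough sketch of how a retrieval layer like this can hang together, the snippet below queries several sources concurrently and tags each record with its origin. The source adapters are stand-ins, not the real Public Insights or DeHashed clients, and the record shapes are invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# Hypothetical source adapters: each wraps one API and returns a list of
# plain-dict records. Real calls to Public Insights, DeHashed, Brave Search,
# etc. would sit behind these callables; here they are stand-ins.
def query_public_records(subject: str) -> list[dict]:
    return [{"name": subject, "record": "company filing"}]

def query_breach_data(subject: str) -> list[dict]:
    return [{"name": subject, "record": "breach entry"}]

SOURCES: dict[str, Callable[[str], list[dict]]] = {
    "public_insights": query_public_records,
    "dehashed": query_breach_data,
}

def search_subject(subject: str) -> list[dict]:
    """Query every source concurrently and tag each record with its origin."""
    results = []
    with ThreadPoolExecutor() as pool:
        futures = {pool.submit(fn, subject): name for name, fn in SOURCES.items()}
        for future, name in futures.items():
            for record in future.result():
                # Tagging the source here is what later makes grading and
                # traceability possible.
                results.append({**record, "source": name})
    return results
```

Keeping each source behind a uniform callable also means a new API (or a fallback like WhatsMyName) can be added without touching the rest of the pipeline.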
A lot of tools can query multiple sources. That part is increasingly straightforward. What made this build more interesting was the attempt to introduce structure between the raw outputs and the investigator.
Returned records were not simply displayed as isolated results. The tool attempted to assess whether records from different sources were likely to refer to the same person.
This is one of the hardest parts of building an OSINT tool that processes people data. Similar names, reused usernames, shared locations, and partial identifiers all create ambiguity. Even with scoring and matching logic, this still requires human review.
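A minimal sketch of the kind of scoring logic involved, with illustrative weights and thresholds rather than the build's actual matching rules:

```python
def match_score(a: dict, b: dict) -> float:
    """Crude pairwise score: shared identifiers add weight, nothing is decisive."""
    score = 0.0
    if a.get("email") and a.get("email") == b.get("email"):
        score += 0.5   # strong but not conclusive: emails get shared and reused
    if a.get("username") and a.get("username") == b.get("username"):
        score += 0.25  # reused handles are a real signal, but not unique
    if a.get("name") and a.get("name", "").lower() == b.get("name", "").lower():
        score += 0.15  # common names make this a weak signal on its own
    if a.get("postcode") and a.get("postcode") == b.get("postcode"):
        score += 0.1
    return score

def likely_same_person(a: dict, b: dict, threshold: float = 0.6) -> bool:
    # Anything near the threshold should be routed to human review,
    # not auto-merged.
    return match_score(a, b) >= threshold
```

The point of the threshold is not to automate the decision away: borderline scores mark exactly the records an investigator needs to look at.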
Not every source should be treated equally. A public filing, a breach record, a social profile, and a web article all carry different levels of reliability. The build aimed to reflect this by introducing intelligence grading so that stronger signals could be distinguished from weaker ones.
Rather than treating every returned record as a fact, the system represented outputs more like claims. A public filing asserts a directorship; a breach record suggests an email is linked to a subject; a web article mentions a name alongside a location. Each carries a different weight.
This moves the tool away from flat search results and closer to structured investigation logic.
Each result needed context. Where did it come from? When was it collected? What source produced it? What type of record was it? Without metadata, results are difficult to trust and difficult to explain.
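One way to represent graded claims with that context attached is sketched below. The grade table is illustrative (loosely in the style of source-reliability grading), not the scheme the build used.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Illustrative reliability grades per record type; real weighting is an
# analytical judgement, not a fixed lookup table.
SOURCE_GRADES = {
    "public_filing": "B",   # structured, attributable, usually reliable
    "social_profile": "C",  # self-reported, can be stale or fabricated
    "web_article": "C",     # contextual, needs corroboration
    "breach_record": "D",   # unverified; never treat as fact
}

@dataclass
class Claim:
    statement: str    # e.g. "subject is a director of Acme Ltd"
    source: str       # which API or dataset produced it
    record_type: str  # key into SOURCE_GRADES
    collected_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    @property
    def grade(self) -> str:
        # Unknown record types default to the weakest grade rather
        # than silently looking trustworthy.
        return SOURCE_GRADES.get(self.record_type, "E")
```

Because every claim carries its source, record type, and collection time, the four questions above (where, when, which source, what type) are answerable for any result on screen.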
Relevant data could be passed into a separate link analysis workflow. That meant people, companies, addresses, accounts, and relationships were not trapped inside a search interface. They could be explored visually as part of a wider investigation.
The overall cost to get it working in a meaningful way was around £300. That is important for two reasons.
First, it shows how low the barrier to entry has become. A few years ago, a build like this would have required much more time, more engineering effort, and a larger budget.
Second, it makes the build vs buy decision more interesting. When you can assemble something that looks useful for a relatively modest amount, it becomes much easier to see why teams are tempted to build internally.
At a glance, the tool felt powerful. It brought together multiple strands of information in one place. It reduced the need to switch between tools. It gave investigators a faster way to explore a subject and start identifying possible links, risks, and avenues for follow-up.
It also showed that AI-assisted building has changed the pace of prototyping dramatically. You can now get from concept to working interface far faster than most teams expect.
The API integration was the simple part of the build. The harder part was deciding how to structure the output so that it could support an actual investigation. A few examples stood out quickly.
Identity ambiguity - A common name can return multiple plausible records. Some belong together. Some do not. This is where entity resolution becomes critical, and where human review still matters.
Source weighting - Different sources make different kinds of claims. Some are more reliable than others. Without explicit grading, everything starts to look equally true.
Evidence and traceability - A result is only useful if you can understand where it came from and why it surfaced. This is where metadata and evidence capture become essential.
Conflicting information - One source may suggest one address, while another suggests something different. A good investigation tool should not flatten that tension. It should preserve it.
Moving from retrieval to reasoning - The build was very capable at retrieval. The real question was how far it could support reasoning.
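A small sketch of preserving that tension rather than flattening it: group claims by the attribute they assert and keep every version, with its source, side by side. The claim records below are hypothetical.

```python
from collections import defaultdict

def group_conflicts(claims: list[dict]) -> dict[str, list[dict]]:
    """Group claims by the attribute they assert, keeping every version.

    A flat search UI would pick one address and discard the rest; an
    investigation tool keeps both claims visible, with their sources,
    so the analyst can weigh them.
    """
    by_attribute: dict[str, list[dict]] = defaultdict(list)
    for claim in claims:
        by_attribute[claim["attribute"]].append(claim)
    return dict(by_attribute)

claims = [
    {"attribute": "address", "value": "12 High St", "source": "public_filing"},
    {"attribute": "address", "value": "3 Mill Lane", "source": "breach_record"},
    {"attribute": "employer", "value": "Acme Ltd", "source": "social_profile"},
]
```

Here the two address claims stay in tension: the filing and the breach record disagree, and that disagreement is itself information.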
That is the point where many internally built tools begin to struggle.
This project was not just about proving that a UK entity investigation tool can be built with APIs and AI. It was about showing the difference between aggregating data and structuring intelligence.
The first is increasingly accessible. The second still requires judgement. That judgement shows up in decisions about entity resolution, source weighting, evidence and traceability, and how conflicting information is surfaced.
This build makes the appeal of self-build obvious. You get flexibility. You can shape the workflow around your own use case. You can choose your own data sources. You can move quickly.
But it also makes the trade-offs clear. As soon as you move beyond retrieval, you run into questions around reliability, structure, explainability, and operational trust. Those are the areas where commercial platforms still provide value, especially for teams that need consistency, support, or compliance.
In practice, many teams will land somewhere in the middle. They will buy some capability, build around it, and integrate APIs and internal workflows where it makes sense.
You can now build a surprisingly capable UK entity investigation tool using APIs, AI, and a relatively modest budget. That is no longer the surprising part. The more important question is whether the tool is helping you retrieve data, or helping you run an investigation. That is the difference that matters.