Safety information in high-risk industries has a stale-data problem. Regulations change, site-specific guidance varies, and the people who actually need the answer (a supervisor on the floor, an EHS officer between site visits) often can't find it fast. Sentinel, our Safety Copilot, is built to solve that. It's a chatbot trained on occupational safety and health material that gives precise, jurisdiction-aware answers in seconds.

Worker safety isn't just a compliance checkbox. In construction, manufacturing, and transportation, it's directly tied to productivity, insurance cost, and workforce retention. The cost of a serious incident dwarfs the cost of preventing it.

What Safety Copilot does

Safety Copilot is the conversational front end to a body of safety knowledge that would otherwise live in PDFs and binders. The features that matter:

  1. Current OSH guidance. Up-to-date regulations and industry guidance, including jurisdiction-specific guidance. The screenshot below shows it answering a query about the Singapore WSH guide on video surveillance systems.

     Screenshot of safety-copilot showing a query about video analytics.

  2. Context-aware answers. Industry, region, and site context shape the response. The screenshot below shows it surfacing the differences between WSH (Singapore) and OSHA (USA) on fall protection requirements. Generic LLMs collapse these into one answer; Safety Copilot doesn't.

     Screenshot of safety-copilot showing a query about fall protection systems.

  3. Stays in its lane. The engine filters out off-topic queries instead of bluffing through them. The screenshot below shows it declining a question that isn't worker-safety-related, which is exactly what you want when a tool is used as a reference by safety professionals.

     Screenshot of safety-copilot showing refusal of irrelevant queries.

What it does for the safety team

Three concrete benefits when this is deployed properly:

  1. Faster, more accurate advice in the moment. When the question is "do I need a guardrail here", the answer arrives in seconds. That changes what supervisors are willing to ask, which changes how often safety considerations actually enter a decision.
  2. Better compliance posture. Regulations shift. Safety Copilot is updated against the current guidance, so the team isn't quoting last year's standard.
  3. Time freed up for the harder work. The safety officer's job has a long tail of routine queries that don't need a human. Off-loading those frees the team for incident investigation, risk assessment, and the work that actually moves the needle on outcomes.

How it fits into existing workflows

A few practical notes on deployment:

  1. Lives where the team already works. Slack, Teams, or an internal messaging tool. Not yet another portal.
  2. The model under the hood. Tuned for OSH content, with retrieval over the relevant regulation corpus. The combination is why it answers more precisely than a generic LLM on this material.
  3. Refreshed against current guidance. The knowledge base is maintained, so a question asked today gets an answer reflecting today's standards.
  4. Designed for non-experts. A line supervisor can use it without training. That's the whole point.
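
For readers who want a feel for how retrieval and scope filtering can work together, here is a minimal sketch. It is not Sentinel's implementation: the `Passage` type, the `retrieve` and `answer` functions, the jurisdiction labels, the `min_overlap` threshold, and the corpus text are all hypothetical placeholders, and the keyword-overlap scoring stands in for whatever retrieval method a real system would use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    jurisdiction: str  # illustrative labels such as "SG" or "US"
    source: str        # placeholder source name, not a real citation
    text: str          # placeholder text, not real regulation wording


# Hypothetical mini-corpus used only to exercise the functions below.
CORPUS = [
    Passage("SG", "WSH placeholder",
            "fall protection guardrail requirements for work at height"),
    Passage("US", "OSHA placeholder",
            "fall protection guardrail requirements for walking working surfaces"),
    Passage("SG", "WSH placeholder",
            "video surveillance systems for workplace monitoring"),
]


def tokenize(text):
    """Lowercase and split on whitespace; a stand-in for real text processing."""
    return set(text.lower().split())


def retrieve(query, jurisdiction, corpus=CORPUS, min_overlap=2):
    """Return passages for one jurisdiction, ranked by shared keywords.

    Passages sharing fewer than `min_overlap` words with the query are dropped,
    so an unrelated query yields an empty list.
    """
    query_tokens = tokenize(query)
    scored = []
    for passage in corpus:
        if passage.jurisdiction != jurisdiction:
            continue  # keep jurisdictions separate instead of blending them
        overlap = len(query_tokens & tokenize(passage.text))
        if overlap >= min_overlap:
            scored.append((overlap, passage))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [passage for _, passage in scored]


def answer(query, jurisdiction):
    """Decline when nothing relevant is retrieved; otherwise cite the top hit."""
    hits = retrieve(query, jurisdiction)
    if not hits:
        return "Out of scope: no matching safety guidance found."
    top = hits[0]
    return f"[{top.jurisdiction}] {top.source}: {top.text}"


if __name__ == "__main__":
    print(answer("guardrail fall protection", "SG"))
    print(answer("guardrail fall protection", "US"))
    print(answer("best pizza toppings", "SG"))
```

The point of the sketch is the shape, not the scoring: filtering by jurisdiction before ranking keeps the Singapore and US answers separate, and an empty result set turns into a refusal rather than a guess.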

How it performs on WorkerSafetyQAEval

We needed a way to evaluate whether the model was actually better than alternatives at worker-safety questions, so we built a benchmark called WorkerSafetyQAEval. It's in the same spirit as HumanEval for code or StaticAnalysisEval for vulnerability detection.

Safety Copilot scores highest on the benchmark among the systems we tested. It beats generic LLMs like ChatGPT and also beats another commercial worker-safety chatbot.

Leaderboard showing performance on the worker safety question & answer eval benchmark.
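
To make the idea of a question-and-answer benchmark concrete, here is a rough sketch of how a leaderboard like this could be computed. It does not describe how WorkerSafetyQAEval is actually scored: the exact-match scoring rule, the item format, the `accuracy` and `leaderboard` functions, and the two stub models are all assumptions made for illustration, and the items are placeholders rather than benchmark questions.

```python
def accuracy(model, items):
    """Fraction of items where the model's answer matches the expected label.

    Assumes exact-match scoring on short answers (case and surrounding
    whitespace ignored), which is an illustrative choice only.
    """
    if not items:
        raise ValueError("items must not be empty")
    correct = 0
    for item in items:
        predicted = model(item["question"]).strip().lower()
        if predicted == item["expected"].strip().lower():
            correct += 1
    return correct / len(items)


def leaderboard(models, items):
    """Return (name, score) pairs sorted from best to worst."""
    scores = [(name, accuracy(model, items)) for name, model in models.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Placeholder items; a real benchmark would hold actual safety questions.
ITEMS = [
    {"question": "placeholder question 1", "expected": "A"},
    {"question": "placeholder question 2", "expected": "B"},
    {"question": "placeholder question 3", "expected": "A"},
    {"question": "placeholder question 4", "expected": "C"},
]

# Stub models standing in for real systems under test.
ANSWER_KEY = {item["question"]: item["expected"] for item in ITEMS}


def always_a(question):
    return "A"


def lookup_model(question):
    return ANSWER_KEY.get(question, "")


if __name__ == "__main__":
    models = {"always_a": always_a, "lookup_model": lookup_model}
    for name, score in leaderboard(models, ITEMS):
        print(f"{name}: {score:.2f}")
```

Free-text safety answers usually need a more forgiving scorer than exact match, such as a rubric or a human grader, so treat the scoring function as the part most likely to differ in practice.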

Safety Copilot is the part of the stack that puts OSH knowledge in the hands of the people who need it, when they need it. Less time spent hunting through PDFs, more time spent acting on what the regulations say. That's the leverage we wanted to give safety teams, and the benchmark results say we're delivering it.

If you want to see it in action, try Sentinel on your own questions. It's the fastest way to evaluate whether it would help your team.