Responsibility Statement

We believe we are accountable for the intelligence we release. At dogAdvisor, our aim is to provide dog owners with tools that are genuinely safe, dependable, and aligned with their real-world needs. No AI is ever perfect, but it's our responsibility to make it as safe, reliable, and genuinely useful as possible. This statement summarises our position more formally.

By using Max and dogAdvisor, you agree to our Terms of Service.

Tech companies often treat safety as a constraint - something that slows innovation and complicates otherwise straightforward engineering decisions. We reject this framing entirely. At dogAdvisor, we believe accountability is not separate from innovation; it is a precondition for truly meaningful innovation. An AI system that provides incorrect health guidance is not innovative; it is dangerous. A feature that increases user engagement but degrades safety is not progress; it is negligence.

We believe that the only innovation worth pursuing is innovation that serves users well and serves them safely. When we evaluate whether to build a feature, "can we build this safely?" is not a secondary question - it is the first question we ask ourselves. When we assess a new Max model, "is this more capable?" matters far less than "is this safer and more reliable?"

This approach has real costs. We move slower than we could. We invest in safety infrastructure that users never see. We turn down features that would drive engagement but increase risk. We accept these costs because we recognise a fundamental truth: trust is built slowly and destroyed instantly.

Most AI companies do not commit to this level of safety infrastructure, testing rigour, and discipline. The ones that do are genuinely rare. We are proud of what we have built, and we believe our safety architecture represents meaningful advancement in responsible AI deployment. But at the end of the day, we are working with technology that we, and the entire AI industry, do not fully understand and cannot fully manage. AI systems are inherently probabilistic, generating responses based on pattern recognition across massive datasets. They can be remarkably helpful and often impressively accurate, but they can also produce outputs that are incorrect, misleading, or potentially dangerous. This limitation is not specific to dogAdvisor; it is fundamental to how AI systems work today. No provider, regardless of resources or safety measures, can guarantee perfect accuracy. Despite this, we will continue to do everything in our power to make Max as accurate and safe as we possibly can.

Most critically, our users cannot verify our internal safety processes. They cannot audit our testing methodology. They cannot observe our decision-making. They must trust that we do what we say we do. We take that trust with the utmost seriousness.

Dealing with Low Risk

Low Risk questions represent the bedrock of dogAdvisor: information that is stable, well-established, and does not involve health decision-making. This includes breed characteristics, training fundamentals, general pet care practices, socialisation guidance, and product information. When users ask about German Shepherd temperament, how to introduce a puppy to a household, what supplies are needed for a new dog, or how often dogs typically need grooming, they are engaging with Foundation-level capabilities on Max.

The risk profile here is minimal because the information is factual, widely agreed upon, and errors would not lead to immediate harm. If Max provides slightly different training advice than another source, users can evaluate approaches and choose what resonates with them. If breed information is incomplete, users can supplement from other resources. There is no time pressure, no emergency context, no health consequence.

Nevertheless, even Low Risk level content undergoes editorial review for accuracy and clarity. Any content that touches on health topics, even tangentially, receives fact-checking. We update content regularly to reflect evolving best practices in animal care. Our risk profile here is low, but it is not zero, and we treat it with appropriate care.

Dealing with Elevated Risk

Elevated Risk represents the threshold where Max moves from general information into territory where guidance could influence decisions affecting a dog's health and wellbeing. This classification encompasses symptom recognition, guidance on when veterinary care may be needed, information about health conditions and treatments, medication information presented factually rather than as recommendations, and health monitoring guidance. Medical Intelligence falls in this risk category.

The risk profile at this level is moderate but significant. Users consulting Max about health topics are often trying to decide whether to seek veterinary care. They may be managing costs, weighing convenience against concern, or seeking reassurance about symptoms they have noticed. The guidance Max provides at this level directly influences these decisions. If Max inappropriately minimises concerning symptoms, users may delay necessary care. If Max creates unwarranted alarm about normal variations, users face unnecessary anxiety and expense. Our challenge here is calibration.

To operate safely at the Elevated Risk level, Max maintains a conservative bias toward recommending veterinary consultation. When in doubt, escalate. When symptoms could indicate multiple possibilities, err toward professional evaluation. We test extensively across symptom presentations, breed variations, age differences, and ambiguous scenarios. We conduct red-team testing specifically targeting dangerous edge cases: queries designed to elicit inappropriate reassurance, attempts to extract diagnostic conclusions, scenarios where Max might fail to recognise escalation triggers. We monitor all Elevated Risk interactions with automated systems that flag deviations from our safety protocols. Our safety team reviews a sample of Elevated Risk conversations, looking for edge cases we had not anticipated, queries that revealed limitations in our guidance, and situations where Max's response could have been clearer or more appropriately conservative.
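
As an illustration of that conservative bias, the tie-breaking rule can be expressed as a short sketch. This is a simplified model in Python, not our production code, and the names are ours for exposition: when symptom analysis yields several plausible urgency levels, Max adopts the most urgent one, not the most likely one.

    from enum import Enum

    class Triage(Enum):
        GENERAL_GUIDANCE = 1     # informational answer is sufficient
        RECOMMEND_VET_VISIT = 2  # professional evaluation advised
        EMERGENCY = 3            # immediate veterinary care

    def resolve_triage(plausible: list[Triage]) -> Triage:
        # Conservative tie-break: among plausible assessments,
        # return the most urgent, never merely the most likely.
        return max(plausible, key=lambda level: level.value)

So a query consistent with both mild indigestion and early bloat resolves to the emergency-level response by construction, rather than by judgment in the moment.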

Dealing with Enhanced Risk

Enhanced Risk questions represent the peak of risk and responsibility in dogAdvisor's capabilities: the recognition of genuine emergencies requiring immediate veterinary intervention. At this safety level, Max operates in contexts where minutes matter, where incorrect guidance could be fatal, where false reassurance could prevent life-saving action. This classification encompasses emergency recognition across critical situations: bloat, seizures, severe trauma, poisoning, respiratory distress, heat stroke, collapse, uncontrolled bleeding, and other time-sensitive conditions where delay has serious consequences. Max's Emergency Guidance feature operates at the Enhanced Risk level, and it has helped save four dogs' lives.

The risk profile at this level is profound. Emergency situations create intense pressure. Owners are frightened, potentially panicked, desperate for guidance. They may provide incomplete or unclear information because they are focused on their dog's distress. They may be looking for permission to wait, to save money, to avoid an emergency vet visit if it is not truly necessary. The psychological dynamics are complex, and Max must navigate them with absolute clarity. False negatives at this level - failing to recognise a genuine emergency - can be fatal. If Max suggests monitoring a bloating dog, if it normalises seizure activity, if it does not convey appropriate urgency about poisoning, the delay costs critical time. But false positives also carry costs. If Max triggers unnecessary emergency visits for normal variations, users lose trust, they second-guess future guidance, and they may ignore actual emergencies because they have learned Max overstates risk.

Max is intentionally conservative in its reasoning; when uncertainty exists, it will classify towards the higher end of the safety spectrum to prevent under-recognising a genuine threat. We work to minimise false positives, but our framework is designed to err on the side of caution rather than risk deprioritising a true emergency. To operate safely at this classification, we maintain rigorous validation standards. Emergency recognition accuracy must be demonstrated at above ninety-five percent across our test scenario library before deployment. We test across breeds, ages, symptom variations, and ambiguous presentations. We conduct adversarial testing where we deliberately provide incomplete information, where we simulate users who are minimising symptoms, and where we test Max's resilience against reassurance-seeking behaviour.
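
That ninety-five percent figure functions as a hard deployment gate, not a target. A minimal sketch of the check, assuming a hypothetical harness that records one pass/fail result per scenario in the library:

    EMERGENCY_RECOGNITION_BAR = 0.95  # stated pre-deployment threshold

    def passes_emergency_gate(scenario_results: list[bool]) -> bool:
        # One entry per scenario in the emergency test library:
        # True if the candidate recognised the emergency.
        if not scenario_results:
            return False  # no evidence means no deployment
        accuracy = sum(scenario_results) / len(scenario_results)
        return accuracy > EMERGENCY_RECOGNITION_BAR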

The default response at this safety level is immediate escalation to emergency veterinary care. There is no waiting, no monitoring at home, no "see how it goes." The guidance directs owners to seek immediate veterinary attention. In situations where appropriate stabilisation measures exist - specific poisoning protocols, basic first aid for trauma - Max can provide that guidance, but always in the context of seeking immediate professional care, never as an alternative to it.

Every Enhanced Risk level interaction is documented and retained for safety analysis. We track patterns, we identify scenarios where guidance could have been clearer, we refine our understanding of how emergencies present in real-world queries versus clinical descriptions.

Pre-Deployment Commitment

We will never deploy any model, feature, capability, or update that operates at Elevated Risk or Enhanced Risk level without comprehensive safety testing and validation.

Before any new capability reaches users, it undergoes our full safety testing: a rigorous, time-intensive process designed to identify failures before users encounter them. We would rather delay deployment for months than release capabilities that have not been thoroughly validated.

The process begins with classification determination. What level of risk does this capability introduce? What decisions might users make based on this information? What happens if the system fails? This assessment determines which safety requirements apply and what testing depth is necessary. We err conservatively in classification.
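
In code terms, classification determination amounts to mapping each capability to a risk level, and each risk level to a minimum set of safeguards that must be completed before deployment. A simplified sketch; the safeguard names below are shorthand for the processes described in this statement, not literal system identifiers:

    from enum import Enum

    class RiskLevel(Enum):
        LOW = 1
        ELEVATED = 2
        ENHANCED = 3

    # Higher classifications inherit the safeguards below them
    # and add stricter requirements on top.
    REQUIRED_SAFEGUARDS = {
        RiskLevel.LOW: {"editorial_review", "fact_checking"},
        RiskLevel.ELEVATED: {"editorial_review", "fact_checking",
                             "scenario_testing", "red_team_testing",
                             "automated_monitoring"},
        RiskLevel.ENHANCED: {"editorial_review", "fact_checking",
                             "scenario_testing", "red_team_testing",
                             "automated_monitoring", "adversarial_testing",
                             "emergency_accuracy_gate", "full_logging"},
    }

    def may_deploy(level: RiskLevel, completed: set[str]) -> bool:
        # A capability ships only when every safeguard required
        # for its classification has been satisfied.
        return REQUIRED_SAFEGUARDS[level] <= completed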

We conduct extensive scenario testing across the diversity of queries users actually send. Common breeds and rare breeds, young dogs and senior dogs, specific symptoms and vague descriptions, single concerns and complex multi-symptom presentations, calm inquiries and panicked reports. We test queries from users who provide thorough context and queries from users who omit critical details. We test situations where users have already decided what is wrong and are seeking confirmation, where users are clearly hoping for reassurance that avoids expense, where users seem to distrust their own veterinarian.

For Enhanced Risk level capabilities, we conduct specialised emergency scenario testing. We simulate bloat presentations, seizure descriptions, poisoning reports, trauma scenarios, and other critical presentations. We test variations in how symptoms are described, because lay language varies dramatically. We test scenarios where users understate severity, where they mischaracterise symptoms, where they provide contradictory information. We believe emergency recognition must work despite imperfect input, because that is the reality of emergency situations.
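
One way to picture the breadth of this testing is as a cross-product over scenario dimensions. A toy sketch; the values below are illustrative samples, not our full libraries:

    import itertools

    emergencies = ["bloat", "seizure", "poisoning", "trauma"]
    breeds = ["great dane", "labrador", "chihuahua"]
    ages = ["puppy", "adult", "senior"]
    phrasings = ["clinical", "lay", "understated", "contradictory"]

    # Every combination becomes a distinct test case, so recognition
    # is exercised across presentation styles as well as conditions.
    test_cases = list(itertools.product(emergencies, breeds, ages, phrasings))
    print(len(test_cases))  # 4 * 3 * 3 * 4 = 144 cases from this sample alone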

Pause and Rollback

Safety testing before deployment addresses foreseeable risks. But AI systems are complex, real-world usage reveals scenarios that testing did not anticipate, and responsible operation requires the ability to respond when post-deployment monitoring identifies concerns.

If our monitoring systems detect safety deviations in production - responses that violate our protocols, failure to recognise emergency patterns, inappropriate guidance that reaches users - we can restrict access to affected capabilities within hours. Individual features can be disabled independently without affecting core functionality. Entire classification levels can be restricted if necessary. In extreme scenarios, we can take Max offline entirely while we investigate and remediate.
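
This works because availability is gated at three independent levels: per feature, per classification, and service-wide. A minimal sketch of that gating logic, with illustrative names:

    restricted_features: set[str] = set()  # individual capabilities
    restricted_levels: set[str] = set()    # whole classification tiers
    service_online: bool = True            # global switch

    def is_available(feature: str, risk_level: str) -> bool:
        # A capability is served only if no switch above it has been
        # thrown; each switch can be flipped independently, within hours.
        return (service_online
                and feature not in restricted_features
                and risk_level not in restricted_levels)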

We maintain previous stable versions of our system so that rollback is practically straightforward. If a new model shows degraded safety performance compared to the version it replaced, we can revert. Before adopting any new underlying model, we conduct comprehensive safety re-evaluation using our full testing protocol. The model must demonstrate equivalent or superior safety performance across all classification levels before deployment. We scale capabilities responsibly, which means we validate first and deploy second, not the reverse.
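
The promotion rule can be stated compactly: a candidate model must match or beat the incumbent at every classification level, or it does not ship and the incumbent stays in place. A sketch under that assumption, where the scores are hypothetical outputs of our full testing protocol:

    def safe_to_promote(candidate: dict[str, float],
                        incumbent: dict[str, float]) -> bool:
        # Safety scores keyed by classification level ("low",
        # "elevated", "enhanced"). Any regression at any level
        # blocks promotion and leaves rollback available.
        return all(candidate.get(level, 0.0) >= score
                   for level, score in incumbent.items())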

Monitoring and Learning

Deployment is not the end of our safety responsibility. Every interaction with Max is logged and monitored through automated systems and human review processes designed to identify safety deviations, emerging risks, and opportunities for improvement.

Our automated monitoring systems track every conversation for patterns that indicate potential safety concerns. Queries about emergency symptoms, references to concerning health changes, repeated questions about the same condition, language suggesting distress or urgency - these trigger elevated scrutiny. Not every flagged conversation represents a problem, but every potential safety concern receives investigation. We would rather review a hundred false positives than miss one genuine safety issue.
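
A stripped-down sketch of that first screening pass, assuming a simple pattern list; our production triggers are far broader and are not keyword matching alone:

    import re
    from collections import deque

    # Illustrative trigger patterns only.
    TRIGGER_PATTERNS = [r"\bseizure", r"\bbloat", r"\bpoison",
                        r"\bcollaps", r"\bnot breathing", r"\bbleeding"]

    high_priority_queue: deque[str] = deque()

    def screen_conversation(text: str) -> bool:
        # Flag for elevated scrutiny on any trigger match; flagged
        # conversations join the queue for human review.
        flagged = any(re.search(p, text, re.IGNORECASE)
                      for p in TRIGGER_PATTERNS)
        if flagged:
            high_priority_queue.append(text)
        return flagged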

When Max identifies potential emergencies, these conversations automatically enter our high-priority review queue. Did Max recognise the emergency appropriately? Was the guidance clear and unambiguous about urgency? Did the user appear to understand the recommendation? Were there any indications that additional clarification would have been helpful? This review is not fault-finding but learning: every emergency interaction teaches us something about how emergencies present in real conversations versus clinical scenarios.

Beyond automated monitoring, we maintain multiple channels for users and veterinary professionals to report concerns. If someone encounters guidance that seems incorrect, inappropriate, or potentially harmful, we want to know immediately. These reports trigger investigation protocols separate from routine monitoring. You can contact us through any of these channels to flag a conversation.

Transparency in Safety Architecture

We believe that meaningful accountability requires transparency, and that transparency in high-stakes AI systems must extend beyond marketing claims to actual safety practices and results.

We commit to publishing comprehensive safety reports documenting our testing procedures, monitoring results, any significant safety incidents encountered, improvements made in response, and lessons learned. These reports will be written for broad accessibility: technical enough that experts can evaluate our rigour, clear enough that non-experts can understand our practices. We believe users deserve to know not just that we take safety seriously but how we implement safety in practice.

Within these reports, we will share aggregate data on classification-level interaction volumes, common query types, emergency guidance provided, monitoring flags generated, and what proportion proved to represent genuine concerns versus false positives. We will discuss edge cases identified through monitoring, how our testing protocols evolved in response, and what capabilities or guidance areas required refinement.

Beyond reports, we maintain public documentation of our safety architecture, classification framework, and testing methodology. We explain how we think about risk, what tradeoffs we consider, why we have established specific boundaries. We provide context for our conservative approach in certain areas and our reasoning when we determine capabilities are safe to deploy.

We also commit to engaging with external stakeholders - veterinary professional organisations, AI safety researchers, animal welfare advocates, regulatory bodies - to solicit feedback on our safety practices and to contribute to the broader dialogue about responsible AI deployment in high-stakes consumer applications. We do not claim to have all the answers, and we believe our safety programme benefits from external expertise and scrutiny.

Conclusion

We recognise that this statement represents our understanding today, and that responsible practice requires continuous evolution. As we learn from operational experience, as AI technology advances, as veterinary medicine evolves, and as regulatory expectations develop, we will update our practices and this statement. This document will be reviewed by senior leadership to assess whether our commitments remain appropriate and adequate.

We are dogAdvisor. We build AI that saves lives and answers to the people and animals it serves.

This statement is made on 26 November 2025 by Deni Darenberg.

You can contact dogAdvisor's AI Engineering team at ai.safety@dogadvisor.dog.

You can contact Deni by connecting on LinkedIn at linkedin.com/in/deni-d