Under the hood - How does Cavell go from speech to a note?

23 January 2026

Cavell has been developed from the start as a generic AI engine that can be deployed across different care settings. General practitioners, home care nurses and specialists each work in their own way, with different workflows and different expectations of their records. Yet all these applications share the same technological core. Technically speaking, Cavell is a speech-to-text-to-code engine. The components that convert speech to text and subsequently convert that text into a coded report are generalized building blocks. This architectural choice was made deliberately: improvements to the Cavell engine automatically propagate across all applications, create economies of scale and allow new use cases to be supported quickly without having to start from scratch each time.

Anthony, co-founder of Cavell and Product Lead, recently took a look under the hood of the CareConnect AI Assistant in a Corilus podcast. In this article, we dive deeper into the blocks that make up the Cavell engine and show how those blocks are deployed differently per care setting. The building blocks stay the same, but their configuration is tailored to the reality of each care setting.
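The idea of one engine with a configuration per care setting can be sketched as follows. This is an illustration only: `SettingConfig`, its fields, the keys in `CONFIGS` and the helper `config_for` are hypothetical names chosen to mirror what this article describes, not Cavell's actual code. Switching diarization fully off for home care nurses is also an assumption made for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SettingConfig:
    """Hypothetical per-setting configuration for one shared engine."""
    input_mode: str      # how speech arrives: short voice note or full consultation
    microphone: str      # capture device described in the article
    diarization: bool    # whether speakers need to be told apart (assumed flag)
    report_format: str   # shape of the coded report


# Values paraphrase the article; keys and field names are assumptions.
CONFIGS = {
    "home_care_nurse": SettingConfig(
        "voice_note", "smartphone", False, "free_text_plus_parameters"),
    "general_practitioner": SettingConfig(
        "consultation", "external_usb", True, "soap"),
    "specialist": SettingConfig(
        "consultation", "external_usb", True, "specialty_template"),
}


def config_for(setting: str) -> SettingConfig:
    """Look up the configuration; the engine code itself stays the same."""
    try:
        return CONFIGS[setting]
    except KeyError:
        raise ValueError(f"unknown care setting: {setting!r}") from None
```

In a design like this, supporting a new care setting means adding one entry to the table rather than writing a new engine.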

Step 1: Capturing speech

Everything starts with capturing spoken information. How that information is delivered differs greatly per context. Home care nurses often work with short voice notes of twenty to thirty seconds, usually recorded after completing a home visit. In such a note, they dictate all relevant observations and actions. Because there is a single speaker who dictates deliberately, a smartphone microphone is perfectly sufficient.

Consultations are a different matter. In a consultation with a general practitioner, specialist or psychologist, crucial information is spoken not only by the care provider but also by the patient, and it is often spread throughout the entire conversation. To capture that information reliably, the audio capture needs to be broader and more consistent. That is why we recommend using an external microphone in those settings. In the podcast, Anthony provided more context about the importance of the external microphone.

To offer a good balance between audio quality, range and price, we had this external microphone custom-developed. The current microphone connects via USB to the computer and delivers audio of sufficient quality to correctly capture multiple speakers, without disrupting the workflow in practice.

Step 2: Transcription

The audio forms the input for the next step: transcription. In this phase, spoken language is converted to text via cloud-based processing, which is necessary to guarantee speed and scalability. During the podcast, Anthony explained why this step cannot happen locally on the care provider's computer or smartphone.

An important factor during consultations is speaker diarization: determining who said what. A patient who visits a general practitioner or specialist is often accompanied by a companion, and it is essential to distinguish what is said by the patient, by the companion and by the physician. This separation is crucial for the correct interpretation of the consultation. For home care nurses, where usually one person dictates, speaker diarization is much less relevant and the pipeline can be kept simpler.
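To make the role of diarization concrete, here is a minimal sketch of what can be done once each piece of the transcript carries a speaker label. The `Segment` class, the `group_by_speaker` helper and the example sentences are all invented for illustration; how Cavell represents diarized output is not described in this article.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Segment:
    """One diarized utterance (hypothetical structure)."""
    speaker: str   # label assigned by a diarization step
    start: float   # seconds from the start of the recording
    text: str


def group_by_speaker(segments):
    """Collect each speaker's utterances in chronological order."""
    grouped = defaultdict(list)
    for seg in sorted(segments, key=lambda s: s.start):
        grouped[seg.speaker].append(seg.text)
    return dict(grouped)


# Invented example data, deliberately out of order.
segments = [
    Segment("patient", 4.2, "The cough started last week."),
    Segment("physician", 0.0, "What brings you in today?"),
    Segment("companion", 9.8, "He also had a fever on Sunday."),
    Segment("patient", 14.1, "Only in the evening."),
]
by_speaker = group_by_speaker(segments)
```

With the utterances separated like this, a later step can treat the patient's own account differently from what a companion or the physician said.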

Step 3: From transcription to a coded report

The transcription is not the end point. In the third step, the text is converted into a coded report, tailored to the care setting and to the way the EHR expects information. Such a coded report consists of a combination of free text, diagnosis codes and structured, coded parameters.
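The three ingredients of a coded report named above (free text, diagnosis codes and structured parameters) could be modelled roughly like this. The class and field names are assumptions, the article does not name a coding system, so the diagnosis code is a placeholder string, and the parameter values are invented.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CodedParameter:
    """One structured measurement (hypothetical shape)."""
    name: str
    value: float
    unit: str


@dataclass
class CodedReport:
    """Free text plus codes plus parameters, as described in the article."""
    free_text: str
    diagnosis_codes: List[str] = field(default_factory=list)
    parameters: List[CodedParameter] = field(default_factory=list)

    def parameter(self, name: str) -> Optional[CodedParameter]:
        """Return the first parameter with this name, or None if absent."""
        for param in self.parameters:
            if param.name == name:
                return param
        return None


# Invented example; "CODE-PLACEHOLDER" stands in for a real diagnosis code.
report = CodedReport(
    free_text="Follow-up visit, patient feels well.",
    diagnosis_codes=["CODE-PLACEHOLDER"],
    parameters=[
        CodedParameter("weight", 82.5, "kg"),
        CodedParameter("systolic_blood_pressure", 135, "mmHg"),
    ],
)
```

Keeping the parameters as separate typed fields, rather than leaving them inside the free text, is what would allow an EHR to store each value in its own structured slot.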

For home care nurses, Cavell extracts, in addition to a limited free text field, approximately forty parameters that are specifically relevant for nursing observations and wound care.

For general practitioners, Cavell generates a report in SOAP format, with a clear separation between what the patient subjectively reports, what can be objectively observed or measured, the coded assessment and the plan for further follow-up. Here too, approximately forty parameters are automatically recognized and structured, ranging from blood pressure and weight to more specialized parameters, for example in the context of diabetes consultations.

For medical specialists, the specific format of the report is even more important. Each specialty has its own focus, terminology and report structure. That is why Cavell contains templates for more than twenty-five specialties and subspecialties, from endocrinology and cardiology to orthopedics and psychiatry.
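As a small illustration of the SOAP structure and of one structured parameter, consider the sketch below. It is not how Cavell works: the article describes AI models, whereas this example swaps in a simple regular expression to pull a blood pressure reading out of a sentence. `SoapNote`, `extract_blood_pressure` and the example text are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple

# Matches readings such as "140 over 90" or "135/85" (toy pattern, an assumption).
BP_PATTERN = re.compile(r"(\d{2,3})\s*(?:over|/)\s*(\d{2,3})")


def extract_blood_pressure(text: str) -> Optional[Tuple[int, int]]:
    """Return (systolic, diastolic) if a reading is found, else None."""
    match = BP_PATTERN.search(text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


@dataclass
class SoapNote:
    """The four SOAP sections described in the article."""
    subjective: str
    objective: str
    assessment: str
    plan: str

    def render(self) -> str:
        """Render the note as four labelled lines in S, O, A, P order."""
        sections = [
            ("S", self.subjective),
            ("O", self.objective),
            ("A", self.assessment),
            ("P", self.plan),
        ]
        return "\n".join(f"{label}: {text}" for label, text in sections)


# Invented example content.
note = SoapNote(
    subjective="Patient reports headaches for three days.",
    objective="Blood pressure 140 over 90.",
    assessment="(coded assessment goes here)",
    plan="Recheck blood pressure in two weeks.",
)
reading = extract_blood_pressure(note.objective)
```

A pattern like this breaks as soon as a speaker phrases the reading differently, which is one reason a fixed rule is only a stand-in here for the model-based extraction the article describes.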

To make Cavell fit as well as possible across all these care settings, our team of AI engineers built a set of collaborating AI models. These models ensure that coded reports are not only generated quickly, but are also accurate in content and relevant within each specific care context. During the podcast, Anthony provided more detail about how these AI models work in practice for general practitioners.

At the end of this process, the coded report is available in the electronic patient record.

In short, Cavell was built as one generic AI engine that adapts to the context in which it is deployed. Whether it concerns a short voice note from a home care nurse, a consultation with the general practitioner or a specialist report, Cavell always goes through the same fundamental steps: capturing speech, transcription and conversion to a coded report. What differs is the configuration of those steps, tailored to the workflow, content and requirements of the care setting. By working with reusable building blocks, we combine quality, speed and scalability, without sacrificing specificity. This makes Cavell broadly deployable in healthcare today, and at the same time ready to evolve with new use cases and care models.

Want to learn more about what is under the hood of Cavell? Feel free to listen to the full podcast via this link.

CONTACT

info@cavell.ai

Polaris Health BV
Lammerstraat 13
9000 Gent

BTW: BE 1008.259.372

LEGAL

ISO 27001

GDPR compliant

EU AI Act compliant

Zero data retention