"Your newsletter felt like someone holding my hand through an unfamiliar field. It had none of the jargon-heavy complexity I brace for in most tech newsletters—just clarity, warmth, and flow."
#2 | Data: Why Every Product Demands a Tradeoff
Published 3 months ago • 6 min read
Hi there!
Take any Mental HealthTech product company's 'How it Works' page and observe closely. You'll find that they all have a similar ask:
Websites: Kintsugi, Aiberry, Freed
The Pitch Tradeoff
When designers put together a 'How it works' page, their top intent is to illustrate, clearly and cleanly, the value you will receive from the product.
In order to receive this value, you must give something in return. All of this put together is the product pitch, the product's sales story to you. Every product pitch is, on a fundamental level, a tradeoff.
If a product promises something, it is also asking for something in return. That is not necessarily a bad thing. What matters is that you are able to identify this tradeoff early in the pitch, so that you can make an informed decision for yourself as opposed to being swept away or turned off.
Most product companies promise a variety of synthetic intelligence that will either:
Save you time
Save you money
Save you capacity/energy
The product does this using the power of computation and pattern-recognition. If you sit down and set your mind to do the same tasks, it will most certainly take up a behemoth amount of your resources.
And so the product promises to deliver long-term value: increasing your discoverability, decreasing your documentation effort, and improving outcomes for your clients.
In return, most product pitches ask for a sample of your effort.
A sample of your speech, your clients' speech, a sample of your session, your notes, your transcript, a sample of the video recording of your session… the list can go on.
Now, nine times out of ten, the reason these pitches ask for a sample of your effort is not that they want to replace you (a lot of fear-mongering on social media would have you think otherwise).
It is because they want to build a synthetic intelligence within a certain niche of expertise, and building that intelligence requires training data, lots and lots of it.
Humour me with an analogy.
Human intelligence develops through a complex interplay of genetic predispositions and environmental influences, starting from infancy and continuing throughout life. Early childhood experiences, particularly those involving learning, play, and positive social interaction, are crucial for building a strong foundation for intelligence.
Similarly, to develop synthetic intelligence, one must feed the ‘child’, i.e. the model, a variety of ‘experiences’ reinforced with labels that identify the right vs the wrong, the good vs the bad. Building on this, the model learns broader concepts and applies them to situations it has not seen before.
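To make the analogy concrete, here is a minimal sketch of what learning from labelled ‘experiences’ can look like. Everything in it is an assumption made for illustration: the example sentences, the two labels, and the word-counting approach, which is far simpler than what real products use.

```python
from collections import Counter

# Hypothetical labelled "experiences": (text, label) pairs invented for illustration.
labelled_examples = [
    ("i feel hopeless and tired", "distressed"),
    ("i cannot sleep and feel hopeless", "distressed"),
    ("i feel calm and rested today", "settled"),
    ("i slept well and feel calm", "settled"),
]


def train(examples):
    """Count how often each word appears under each label."""
    word_counts = {}
    for text, label in examples:
        word_counts.setdefault(label, Counter()).update(text.split())
    return word_counts


def predict(word_counts, text):
    """Pick the label whose training words overlap most with the new text."""
    scores = {
        label: sum(counts[word] for word in text.split())
        for label, counts in word_counts.items()
    }
    return max(scores, key=scores.get)


model = train(labelled_examples)
prediction = predict(model, "feel hopeless again")
print(prediction)  # prints "distressed"
```

With only four examples the ‘child’ knows very little, which is why product companies ask for lots and lots of data.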
When accepting the terms of a product pitch, the user accepts the terms of a document called Terms of Service (I've said 'terms' too many times and am reaching semantic satiation).
Terms of Service, or ToS, is usually written by the legal team of a product company. I myself have had a couple of these written, and updated them every time we launched a new feature or product.
The ToS can be found in fine print on the footer of the product’s website. You scroll all the way to the bottom and you should find it right there next to the Privacy Policy.
The ToS typically states that when users consent to giving their data, they continue to retain its ownership but allow the company to sub-license it for training purposes.
Understand for yourself the tradeoff of the pitch:
Go all the way to the bottom of the website
Find the Terms of Service document
Once open, use the keyboard shortcut ‘command+F’ (Ctrl+F on Windows) and type the word ‘Data’
Most ToS will state that once the data is handed over, the company has the right to sub-license it and use it to improve the intelligence of its product.
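If you are comfortable with a little code, the same search can be sketched in a few lines of Python. The ToS excerpt below is invented for illustration and is not taken from any real company; the function name is also my own.

```python
import re

# Hypothetical ToS excerpt, invented for illustration only.
tos_text = (
    "You retain ownership of your content. "
    "You grant the Company a licence to use your data to improve our services. "
    "The Company may sub-license this data to its partners. "
    "Payments are billed monthly."
)


def sentences_mentioning(text, keyword):
    """Return the sentences that contain the keyword as a whole word, ignoring case."""
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    pattern = re.compile(rf"\b{re.escape(keyword)}\b", re.IGNORECASE)
    return [s for s in sentences if pattern.search(s)]


data_sentences = sentences_mentioning(tos_text, "data")
for sentence in data_sentences:
    print("-", sentence)
```

This only surfaces the sentences worth reading closely. Understanding what they commit you to still takes a human eye.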
Liking the read so far? Perhaps a friend might like it too!
I hope to run TinT as a solely reader-supported newsletter. For that, I need you to be my points of light into the world. Help my outreach efforts: share this newsletter with therapists who will find it useful!
In therapy, a client’s story unfolds over time—shaped by their history, environment, behaviours, and reflections. You don’t draw conclusions from a single sentence; you look at patterns across sessions.
In machine learning, data comes together to make something similar to case histories. Datasets are collections of examples—past behaviours, inputs, outcomes—that the system uses to learn. Just like how therapy relies on what’s brought into the room week after week, AI models rely on the quality and diversity of data they’re trained on.
And just like therapy needs good boundaries and context, datasets need careful curation.
Data: The Technical Definition
Dataset: A dataset is a collection of data arranged in a particular format. It’s mainly used for research, data analysis, or projects in machine learning.
Database: A database, on the other hand, is a structured system for storing, managing, and retrieving data, often used for ongoing operations. Databases are typically larger and more complex than datasets. They are designed to store large amounts of information that can be accessed, managed, and updated efficiently.
The difference between them lies in their use: datasets are best for analysis tasks, while databases excel at handling ongoing, live data management.
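For the curious, here is a small sketch of that difference in Python. The session lengths and the appointments table are invented for illustration, and an in-memory SQLite database stands in for whatever system a real product might use.

```python
import sqlite3

# A dataset: a fixed collection of examples, read as a whole for analysis.
# These rows are invented for illustration.
dataset = [
    {"session": 1, "minutes": 50},
    {"session": 2, "minutes": 45},
    {"session": 3, "minutes": 55},
]
average_minutes = sum(row["minutes"] for row in dataset) / len(dataset)

# A database: a system that stores records and lets you add, update,
# and query them as day-to-day operations go on.
conn = sqlite3.connect(":memory:")  # in-memory database, nothing is written to disk
conn.execute("CREATE TABLE appointments (id INTEGER PRIMARY KEY, status TEXT)")
conn.execute("INSERT INTO appointments (status) VALUES ('booked')")
conn.execute("UPDATE appointments SET status = 'completed' WHERE id = 1")
status = conn.execute("SELECT status FROM appointments WHERE id = 1").fetchone()[0]
conn.close()

print(average_minutes, status)
```

The dataset is analysed once as a snapshot, while the database record changes from 'booked' to 'completed' as the work happens.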
Data in Mental HealthTech
Data in MhTech is a unit of information that can be gathered from the client-therapist, therapist-business, therapist-supervisor (the list can go on) relationships.
This unit of information is available in any medium that corresponds to our senses or sensations. If we can hear it, it’s audio data; if we can see it, it’s visual data; if we can write it, it’s text data.
Then there are combinations of data: biomarker data, audio-visual data, and so on.
Datasets gather a specific kind of data within a very niche context. For example:
The DAIC-WOZ dataset comprises voice and text samples from 189 interviewed participants, including healthy controls, along with their responses to the PHQ-8 depression questionnaire.
The PAIR dataset consists of brief interactions between counselors and clients portraying different levels of reflective listening skills.
Mental HealthTech is a nascent, albeit fast-evolving, field when it comes to datasets. Product companies start with existing datasets and then build on top of them. That is why their pitch comes with the caveat that they can sub-license the data they collect from you and use it to refine their models.
The Rough and Tumble of Data Gathering
As a therapist, this whole data conversation has probably set off more than a few alarms in the back of your mind.
Data gathering is extremely challenging in mental health. Period. Perhaps one of the key reasons why tech has reached mental health so late is the challenge of gathering data.
Let me present to you the perspective of someone who sets out to build datasets for MhTech. The first roadblock we'll encounter is most certainly:
Privacy & Ethical concerns: The consent process in data gathering must be airtight and transparent. Absolutely no loopholes, fine print, or fuzziness.
If we are able to make that happen, then,
Scarcity & Fragmentation: Unlike other medical fields, mental health data either does not exist, or is scattered across videos, notes, personal journals, wearable devices, and EMR.
Let’s assume we manage to bring all of it under one roof,
Subjectivity: A person’s emotional state, symptoms, or therapeutic progress is hard to quantify or label consistently.
Even if we manage to create some kind of structure and labelling for one person, then
Unstructured Multimodality at Scale: Mental health data comes in varied formats (notes, transcripts, voice recordings, body language) that are difficult to standardize and analyse for a large population.
Not to mention that whatever data you collect will hold the same prejudices as its sample set,
Bias in Collection: Training data reflects the biases of those who seek therapy (urban, educated, English-speaking), of those who give therapy, and of those who make the models.
Suppose we did manage to reach this far,
Longitudinal Complexity: The nature of mental health journeys is non-linear and spans weeks to years. Short-term data snapshots often miss critical context or patterns.
And finally, the most herculean challenge of all,
Lack of Objective Truth: Unlike blood tests or scans, there’s often no objective “truth” in mental health to validate against. Labels like “improved” or “distressed” may vary with the observer or with the day and time of observation.
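One way to see how much labels vary with the observer is to measure agreement between two people who label the same sessions. The sketch below uses Cohen's kappa, a standard agreement statistic that I am bringing in for illustration; the ten pairs of labels are invented.

```python
from collections import Counter

# Hypothetical labels from two observers for the same ten sessions, invented for illustration.
observer_a = ["improved", "improved", "distressed", "improved", "distressed",
              "improved", "distressed", "distressed", "improved", "improved"]
observer_b = ["improved", "distressed", "distressed", "improved", "improved",
              "improved", "distressed", "improved", "improved", "distressed"]


def cohens_kappa(labels_a, labels_b):
    """Agreement between two observers, corrected for agreement expected by chance.

    Assumes both lists have the same length and that chance agreement is below 1.
    """
    n = len(labels_a)
    # Share of sessions where both observers chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected if each observer labelled at random with their own label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    all_labels = set(labels_a) | set(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in all_labels) / (n * n)
    return (observed - expected) / (1 - expected)


kappa = cohens_kappa(observer_a, observer_b)
print(round(kappa, 2))  # prints 0.17
```

The two observers agree on six sessions out of ten, yet once chance agreement is accounted for the score is only about 0.17, where 1 would mean perfect agreement and 0 would mean no better than chance.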
So then should we just abandon the pursuit of data-gathering altogether? Unlikely.
My personal belief is that the art of inter-disciplinary collaboration is the key to making conscientious progress in Mental HealthTech.
Addendum: Precision Mental Health & Data-driven Therapy
Data-informed clinical decision making is a part of the larger trend of Clinical Decision Support Systems (CDSS).
CDSS are tools that use patient data and knowledge bases to help healthcare professionals make better clinical decisions.
These systems can provide alerts, reminders, guidelines, and other information to assist with diagnosis, treatment, and overall patient care. More on that another time.
That's all for today, see you next week!
Toward more technology-informed therapists!
Harshali
Founder, TinT
If you've stayed with me this far, you're surely a champion of tech literacy for therapists!
Tell me, is there anything you'd like this newsletter to cover ASAP? Simply reply to this email or hit the button below to respond via a 2 min form.