PTSD-in-the-wild dataset
Description
Post-traumatic stress disorder (PTSD) is a chronic and debilitating mental condition that develops in response to catastrophic life events, such as military combat, sexual assault, and natural disasters. PTSD is characterized by flashbacks of past traumatic events, intrusive thoughts, nightmares, hypervigilance, and sleep disturbance, all of which affect a person's life and lead to considerable social, occupational, and interpersonal dysfunction.
Welcome to the page of the Labeled PTSD-in-the-Wild dataset, a database of videos of people with or without PTSD. The dataset was collected to support the study and detection of PTSD using AI techniques. It contains 634 interview videos. The PTSD portion was built by downloading YouTube videos in which people who went through traumatic events and subsequently developed PTSD talk about their story. The non-PTSD portion consists of YouTube interviews with randomly chosen celebrities. The dataset is perfectly balanced between the two classes.
Composition
Apart from the labeling, other categories of information were collected. These are given below:
- Gender of the interviewee. Fig. 2b shows the gender distribution in the dataset across the labels With PTSD and Without PTSD.
- Degree of confidence: how confident the annotator is in the diagnosis.
- Type of trauma: The types of trauma in the dataset are War veteran, Sexual assault, Plane crash, Terrorist attack, Domestic abuse, Car accident, and Other. Fig. 2a shows how the types of trauma are distributed in the dataset.
Examples of video frames from the PTSD-in-the-wild dataset
Visual examples extracted from the dataset are shown here. The first two show people who developed PTSD after war and after a plane crash. The last two show individuals without PTSD.

Training, Validation, and Testing
Our dataset consists of 317 videos of subjects with PTSD and 317 videos of healthy control subjects with no PTSD symptoms.
Two benchmarks are proposed for evaluating and comparing new approaches to PTSD diagnosis and recognition. Accordingly, two partitions of the dataset into training, validation, and testing sets are provided with it:
Train/Validation/Test split: We randomly split the dataset into training, validation, and testing sets in proportions of 80%, 10%, and 10%, respectively.
3-fold cross validation: To further evaluate the effectiveness of our baseline models and of any new approach to video-based PTSD diagnosis, n-fold cross validation with n=3 is used to split the data into training and testing sets. The three folds are provided with the dataset.
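The two partitioning strategies above can be illustrated with a minimal Python sketch. It is not the procedure used to produce the official partitions: the video identifiers, the seed, the function names, and the rounding of split sizes are all assumptions made for illustration. For comparable results, use the split files distributed with the dataset.

```python
import random


def train_val_test_split(items, seed=0, fractions=(0.8, 0.1, 0.1)):
    """Randomly partition items into train, validation, and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_train = int(fractions[0] * n)
    n_val = int(fractions[1] * n)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test


def k_fold_splits(items, k=3, seed=0):
    """Yield (train, test) pairs; each item lands in exactly one test fold."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test


# Hypothetical identifiers: 317 videos per class, as stated above.
videos = [f"ptsd_{i:03d}" for i in range(317)]
videos += [f"control_{i:03d}" for i in range(317)]

train, val, test = train_val_test_split(videos)
folds = list(k_fold_splits(videos, k=3))
```

This sketch shuffles all videos together, so class balance within each subset is only approximate. A stratified variant would shuffle and split each class separately before merging.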
Citations
Sawadogo, M. A. L., Pala, F., Singh, G., Selmi, I., Puteaux, P., & Othmani, A. (2022). PTSD in the Wild: A Video Database for Studying Post-Traumatic Stress Disorder Recognition in Unconstrained Environments. arXiv preprint arXiv:2209.14085.
Download
To access the database, you need to fill in and sign the End User License Agreement (EULA). You can download the EULA from this link.
Send the signed EULA to alice.othmani@u-pec.fr
If you are granted access, you will receive a reply by email with a link to download the database.



