About MRSDB
Data Library
The prototype version of MRSDB contains mock data generated from real clinical and research datasets to demonstrate its functionality. The real-life reference library is a unique collection of MRS studies spanning traumatic brain injury, brain tumors, and psychiatric and metabolic disorders. It includes over 4,000 brain samples acquired with single-voxel spectroscopy and tens of thousands of tissue samples acquired with spectroscopic imaging. Using these datasets, promising ML applications have been identified, such as modeling brain tissue compositions from metabolic components, automating data preprocessing and the quantification of spectral signals, and modeling behavioral and cognitive scores from brain metabolites in former football players.

Standardized Data Format
One goal of this project is to extend the Brain Imaging Data Structure (BIDS) format to encompass MRS. Once achieved, all current and future data acquired from different sites and systems will be harmonized and standardized using the BIDS format. This will allow the incorporation of BIDS-compatible multimodal data formats into the database. One example is MR imaging, such as structural imaging (T1, T2, FLAIR), diffusion-weighted imaging, and functional MRI, which are often acquired alongside MR spectroscopy. Furthermore, a universal data format will encourage users to contribute their own data to improve and scale the database, thereby enabling the development of more complex and robust ML models over time.
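
To make the idea of a standardized layout concrete, the following is a minimal sketch of how a BIDS-style path for an MRS recording could be built. It follows the general BIDS pattern of `sub-<label>` and optional `ses-<label>` entities followed by a suffix. The function name `bids_mrs_path`, the `mrs` folder name, the `svs` suffix, and the `.nii.gz` extension are assumptions chosen for illustration and are not taken from the MRSDB project itself.

```python
import re
from pathlib import PurePosixPath

# BIDS labels are alphanumeric only (no underscores or hyphens).
_LABEL = re.compile(r"^[A-Za-z0-9]+$")


def bids_mrs_path(subject, session=None, suffix="svs",
                  datatype="mrs", extension=".nii.gz"):
    """Build a BIDS-style relative path for one MRS recording.

    The datatype folder, suffix, and extension defaults are illustrative
    assumptions, not a confirmed MRSDB convention.
    """
    for label in (subject, session):
        if label is not None and not _LABEL.match(label):
            raise ValueError(f"invalid BIDS label: {label!r}")

    entities = [f"sub-{subject}"]
    if session is not None:
        entities.append(f"ses-{session}")

    # Entities appear both as nested directories and in the filename.
    filename = "_".join(entities + [suffix]) + extension
    return PurePosixPath(*entities, datatype, filename)
```

For example, `bids_mrs_path("01", "pre")` yields `sub-01/ses-pre/mrs/sub-01_ses-pre_svs.nii.gz`, so data from different sites end up in the same predictable location regardless of the scanner that produced them.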

Data Processing, Spectral Quantification, and Brain Tissue Component Extraction
A fully automated, cloud-based back-end service built with Django, a Python-based web framework, will be implemented to quantify raw MRS data and obtain relative, pseudo-absolute, and absolute concentrations in mmol/L and mmol/kg. This service will interface with the LCModel command-line interface to compute these values from spectroscopy data, corresponding water reference data, and anatomical volumes. In addition, the volume fractions of gray matter, white matter, and cerebrospinal fluid at each acquired voxel location will be extracted from the incorporated anatomical images (T1w, T2w, FLAIR) during the data processing pipeline. These fractions will be added to the database metadata, providing additional dimensions for analysis.
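
The tissue-fraction step can be sketched in a few lines. The example below assumes that the anatomical image has already been segmented into gray matter, white matter, and cerebrospinal fluid probability maps, and that the MRS voxel has been resampled into a mask on the same grid, with all four volumes flattened to equal-length sequences. The function names and the simple CSF correction (dividing by one minus the CSF fraction) are illustrative assumptions; this is one common simplification and not necessarily the correction MRSDB will apply.

```python
def tissue_fractions(voxel_mask, gm, wm, csf):
    """Return GM/WM/CSF volume fractions inside an MRS voxel.

    All inputs are flattened volumes on the same grid: `voxel_mask`
    holds 0/1 values, the others hold tissue probabilities in [0, 1].
    """
    if not (len(voxel_mask) == len(gm) == len(wm) == len(csf)):
        raise ValueError("all volumes must have the same number of elements")

    totals = {"gm": 0.0, "wm": 0.0, "csf": 0.0}
    for inside, g, w, c in zip(voxel_mask, gm, wm, csf):
        if inside:
            totals["gm"] += g
            totals["wm"] += w
            totals["csf"] += c

    total = sum(totals.values())
    if total == 0:
        raise ValueError("mask is empty or contains no segmented tissue")
    # Normalize so the three fractions sum to 1.
    return {name: value / total for name, value in totals.items()}


def csf_corrected(concentration, csf_fraction):
    """Scale a concentration by 1 / (1 - f_csf), assuming metabolites
    are negligible in CSF (a simplifying assumption)."""
    if not 0.0 <= csf_fraction < 1.0:
        raise ValueError("csf_fraction must be in [0, 1)")
    return concentration / (1.0 - csf_fraction)
```

Storing the three fractions alongside each spectrum lets later analyses filter or adjust by tissue composition without rerunning the segmentation.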

Web Architecture and Resources
The user interface of the database will be a streamlined web application that facilitates efficient browsing, uploading, and downloading of MRS data. This platform will be built using ReactJS, a modular and scalable modern web development framework that will allow new features to be added iteratively to accommodate new sites and datasets. The data library will be stored in PostgreSQL, a secure and scalable open-source relational database. The Django back-end service will also handle communication between the database and the user interface, which will be secured behind a user-authentication system to prevent unauthorized access to study data and protected health information (PHI).
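
As a rough illustration of how scan metadata might be stored and queried relationally, the sketch below defines a single table and two helper functions. It uses Python's built-in `sqlite3` module with an in-memory database in place of PostgreSQL so that it runs without a server. The table name `mrs_scan`, its columns, and the helper names are hypothetical and do not describe the actual MRSDB schema.

```python
import sqlite3

# Hypothetical metadata table; column names are illustrative only.
SCHEMA = """
CREATE TABLE mrs_scan (
    id            INTEGER PRIMARY KEY,
    subject_label TEXT NOT NULL,
    technique     TEXT NOT NULL CHECK (technique IN ('svs', 'mrsi')),
    gm_fraction   REAL,
    wm_fraction   REAL,
    csf_fraction  REAL
)
"""


def open_demo_db():
    """Create an in-memory database (a stand-in for PostgreSQL)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    return conn


def add_scan(conn, subject_label, technique, gm, wm, csf):
    """Insert one scan record and return its row id."""
    cur = conn.execute(
        "INSERT INTO mrs_scan "
        "(subject_label, technique, gm_fraction, wm_fraction, csf_fraction) "
        "VALUES (?, ?, ?, ?, ?)",
        (subject_label, technique, gm, wm, csf),
    )
    conn.commit()
    return cur.lastrowid


def scans_with_min_gm(conn, min_gm):
    """Return subject labels whose voxel has at least `min_gm` gray matter."""
    rows = conn.execute(
        "SELECT subject_label FROM mrs_scan "
        "WHERE gm_fraction >= ? ORDER BY subject_label",
        (min_gm,),
    )
    return [row[0] for row in rows]
```

Parameterized queries (the `?` placeholders) keep user-supplied values out of the SQL text. In a Django service the same structure would more likely be expressed as a model class and queried through the ORM, with authentication enforced before any query is run.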