Objective

This project aims to identify and help address current issues with the clinical diagnosis and treatment of Bipolar Disorder (BD) patients through the application of bioinformatics tools. Machine learning and a new algorithm will be employed not only to gain more understanding of the underlying biology of BD through the analysis of risk loci and co-expression/co-regulation interactions, but also to create a model and genetic risk pathway to better predict an individual's association with BD.

Introduction

Bipolar Disorder is a heritable mood disorder characterised by recurring episodes of depression and mania. It affects 2% of the world's population, with an additional 2% affected by sub-threshold variants [1]. The World Health Organisation lists BD as one of the leading causes of disability-adjusted life years in young adults [2]. In addition to the other socio-economic challenges the disorder carries with it, patients are exposed to an increased mortality rate from suicide, at 7.8% in men and 4.9% in women [3]. Current methods for the clinical diagnosis and treatment of BD are inadequate [4], and this can have a significant impact on a patient's functioning and quality of life [5].
A number of interconnected reasons can explain these inadequacies:

1. Diagnosis is based on observation of patient behaviour with reference to criteria defined in manuals such as the ICD-10. Treatment decisions are based on clinician and patient preferences [5]. Psychiatric disorders are often highly heterogeneous in aetiology and symptomatic manifestation, and yet at no point is the biology that underpins a patient's own variant of the disorder examined.
2. Symptoms and genetic composition are shared with other psychiatric disorders, notably schizophrenia and major depressive disorder [6, 7, 8]. Misdiagnosis and delayed diagnosis of BD patients are common, at 60%, and it can take between 5 and 10 years for a patient to receive an accurate diagnosis [9].
3. There are currently no reliable and objective methods to predict which patients are likely to respond to which medication [5].

These challenges remain and are tightly coupled to shortfalls in our current understanding of the biological aetiology of BD.
Bioinformatics has assisted, and continues to assist, with the challenges in this domain by applying a variety of analytical tools to explain the causal relationship between genetic variation within a population and the observed phenotypic differences between its members. It provides the foundations for predicting genetic risk by identifying risk loci and explaining the genetic architecture of phenotypic traits.

Genome-Wide Association Studies (GWAS) are useful as a foundation exercise to explain the genetic architecture of a trait and have identified a number of risk loci (SNPs) for BD and other mood disorders. The Psychiatric Genomics Consortium conducted meta-analyses comprising over 20000 BD patients and 30000 controls and identified 21 such loci [4].

Table 1: Risk loci hits in BD from GWAS [10, 11]

Locus      Implicated gene(s)
2q11.2     LMAN2L
2q32.1     ZNF804A
3p22.2     TRANK1 (LBA1)
5p15.31    ADCY2
6q16.1     MIR2113, POU3F2 (OTF7)
6q25.2     SYNE1
7p22.3     MAD1L1
9p21.3     Intergenic
10q21.2    ANK3
11q14.1    TENM4 (ODZ4)
12p13.3    CACNA1C
12q13.1    DDN
17q12      ERBB2

The success of GWAS is subject to confounding factors such as sample size, disease heterogeneity and population stratification, and many more loci remain undiscovered.

DNA microarrays allow us to perform gene expression profiling and to implicate genes by measuring their levels of expression. Gene expression data is of particular interest to this project in examining BD because its format, a vector of real numbers, is amenable to the application of machine learning. Clustering algorithms and other multivariate pattern analyses have been helpful in understanding gene function and regulation, and can also identify subgroups within a population by identifying and separating sets of genes that play a similar role in a disease. Co-expressed genes in a cluster are likely to be involved in the same cellular processes, and shared expression patterns among them suggest co-regulation. Details of the transcriptional regulatory network can be inferred from such analyses.

Whilst analyses of gene expression data with the aid of machine learning tools can assist in understanding more about a disease, we can do much more with the data and even employ methods to model it for prediction.
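As an illustration of how co-expressed genes might be grouped, the following is a minimal sketch rather than the clustering method this project will use. It assumes that co-expression is measured by Pearson correlation between expression profiles and that a cluster is a connected component of the graph linking genes whose correlation meets a threshold. The gene names, expression values, the 0.9 threshold and the function names are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / sqrt(var_x * var_y)


def coexpression_clusters(profiles, threshold=0.9):
    """Group genes linked, directly or via other genes, by correlation >= threshold.

    Clusters are the connected components of the thresholded correlation graph.
    """
    genes = list(profiles)
    unvisited = set(genes)
    clusters = []
    for seed in genes:
        if seed not in unvisited:
            continue
        unvisited.discard(seed)
        cluster, frontier = [seed], [seed]
        while frontier:
            current = frontier.pop()
            linked = [
                g for g in genes
                if g in unvisited
                and pearson(profiles[current], profiles[g]) >= threshold
            ]
            for g in linked:
                unvisited.discard(g)
            cluster.extend(linked)
            frontier.extend(linked)
        clusters.append(sorted(cluster))
    return clusters


# Hypothetical toy data: four made-up genes measured across five samples.
toy_profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 4.0, 6.2, 8.1, 9.9],
    "geneC": [5.0, 4.1, 2.9, 2.0, 1.1],
    "geneD": [9.8, 8.0, 6.1, 4.2, 2.0],
}
clusters = coexpression_clusters(toy_profiles)
```

In this toy data geneA and geneB rise together across samples while geneC and geneD fall together, so the sketch separates them into two clusters. Real microarray data would need normalisation first and would usually call for an established clustering method.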
This project will take the resulting cluster data and organise it into a binary vector representing the presence or absence of upregulation of genes in each cluster. This vector can then be used as input for the HyperTraPS algorithm, a method for sampling paths on a hypercubic transition network, to create a model of the progression pathways to BD.
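The transformation into a binary vector could look something like the sketch below. It rests on assumptions not stated in the text: expression values are taken to be on a log2 scale, a cluster counts as upregulated when its mean log fold change against the healthy-control mean meets a threshold, and the threshold of 1.0 is arbitrary. The function name, gene names and values are hypothetical.

```python
from statistics import mean


def cluster_upregulation_vector(sample, control_means, clusters, threshold=1.0):
    """Return one 0/1 entry per cluster: 1 if the cluster is upregulated.

    sample:        gene -> expression in one individual (log2 scale assumed)
    control_means: gene -> mean expression across healthy controls
    clusters:      list of gene lists, one list per cluster
    threshold:     minimum mean log2 fold change counted as upregulated (assumed)
    """
    vector = []
    for genes in clusters:
        mean_log_fold_change = mean(sample[g] - control_means[g] for g in genes)
        vector.append(1 if mean_log_fold_change >= threshold else 0)
    return vector


# Hypothetical toy data: two clusters of made-up genes.
toy_clusters = [["geneA", "geneB"], ["geneC", "geneD"]]
toy_control_means = {"geneA": 5.0, "geneB": 6.0, "geneC": 7.0, "geneD": 4.0}
toy_sample = {"geneA": 6.5, "geneB": 7.5, "geneC": 7.1, "geneD": 3.9}

vector = cluster_upregulation_vector(toy_sample, toy_control_means, toy_clusters)
```

Here the first cluster sits 1.5 log2 units above the controls and is marked 1, while the second cluster is essentially unchanged and is marked 0. Each individual's vector then corresponds to one vertex of the hypercube that the path sampling operates on.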
Similar recent studies have set out to build a genetic risk model to aid in the diagnosis of BD [12].

Innovation and Impact

Providing ineffective therapies for patients has significant individual and societal costs, especially considering the high prevalence of BD [5]. Evidence-based medicine has helped to further the understanding of BD and its prognosis and to provide optimal treatments, particularly when accounting for heterogeneity [3], and machine learning tools are gaining traction in psychiatric research [13]. Additionally, similar studies have suggested that the current literature lacks the means to assess a patient's genetic risk of BD [12], so there is scope for additional contributions from bioinformaticians within this problem domain.

Preliminary Results

Do some basic clustering here to show whether

Methodology

1. Obtain microarray data for both sufferers of BD and healthy controls.
2. Normalise the data and apply clustering.
3. Transform the clustered data into a format suitable for the HyperTraPS algorithm, deciding upon the threshold levels of upregulation in clusters.
4. Apply the HyperTraPS algorithm.

Algorithm 1: Hypercubic Transition Path Sampling

1. Initialise a set of N_h trajectories at s.
2. For each trajectory i in the set of N_h:
   a. Compute the probability of making a move to a t-compatible next step (for the first step, all trajectories are at the same point and the probability for each is thus the same); record this probability as α'_i.
   b. If the current state is s, set α_i = α'_i; otherwise set α_i = α_i α'_i.
   c. Select one of the available t-compatible steps according to their relative weight. Update trajectory i by making this move.
3. If the current state is everywhere t, go to 4; otherwise go to 2.
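Steps 1 to 3 above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reference implementation of HyperTraPS. States are assumed to be binary tuples, each move acquires one feature (a 0 becomes a 1), and a move is t-compatible when it acquires a feature that is present in t. The `weight` function supplying positive transition weights is hypothetical, and returning the mean of the accumulated per-trajectory probabilities as the final estimate is an assumption, since the text does not specify the last step.

```python
import random


def hypertraps_path_probability(s, t, weight, n_trajectories=200, rng=None):
    """Estimate the probability of reaching state t from state s.

    s, t:   binary tuples of equal length; t must contain every feature in s
    weight: hypothetical function (state, feature_index) -> positive weight
    """
    rng = rng or random.Random(0)
    if any(a == 1 and b == 0 for a, b in zip(s, t)):
        raise ValueError("t must contain every feature already present in s")
    target = tuple(t)
    length = len(target)

    # Step 1: initialise all trajectories at s.
    states = [tuple(s) for _ in range(n_trajectories)]
    # Starting each accumulated probability at 1 lets step 2b be a plain product.
    alphas = [1.0] * n_trajectories

    # Step 3: every move acquires one feature, so all trajectories reach t together.
    while states[0] != target:
        for i, state in enumerate(states):  # Step 2
            available = [k for k in range(length) if state[k] == 0]
            compatible = [k for k in available if target[k] == 1]
            total = sum(weight(state, k) for k in available)
            compatible_weights = [weight(state, k) for k in compatible]
            # Step 2a: probability that the next move is t-compatible.
            alpha_step = sum(compatible_weights) / total
            # Step 2b: accumulate the probability along the trajectory.
            alphas[i] *= alpha_step
            # Step 2c: choose a t-compatible move according to relative weight.
            k = rng.choices(compatible, weights=compatible_weights)[0]
            states[i] = state[:k] + (1,) + state[k + 1:]

    # Assumed final step: average the accumulated probabilities.
    return sum(alphas) / n_trajectories


def uniform_weight(state, feature_index):
    """Hypothetical weight function: every available move is equally likely."""
    return 1.0


estimate = hypertraps_path_probability((0, 0, 0), (1, 1, 0), uniform_weight)
```

With uniform weights the first move is t-compatible with probability 2/3 and the second with probability 1/2, so the estimate is close to 1/3, which matches the two equally likely paths of probability 1/6 each. In practice the weights would be parameters to be inferred from the observed binary vectors.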
4.

Further Work

References

[1] K. Merikangas, "Lifetime and 12-month prevalence of bipolar spectrum disorder in the National Comorbidity Survey replication," Arch. Gen. Psychiatry, vol. 64, p. 543.
[2] C. Mathers, "Adjusting for dependent comorbidity in the calculation of healthy life expectancy," Popul. Health Metr., vol. 4, p. 4, 2006.
[3] D. L-Garzia, "The impact of machine learning techniques in the study of bipolar disorder: A systematic review," Neurosci. & Behav. Rev., vol. 90, pp. 538-554, 2007.
[4] P. J. Harrison, J. Geddes and E. M. Tunbridge, "The Emerging Neurobiology of Bipolar Disorder," Trends in Neurosci., vol. 41, pp. 984-994, 2018.
[5] J. DeQuevedo and L. Yatham, "Biomarkers in Mood Disorders. Are we there yet?" J. Affect. Disord.
[6] A. Forstner, "Identification of shared risk loci and pathways for bipolar disorder and schizophrenia," PLoS One, 2017.
[7] S. H. Lee, "Genetic relationship between five psychiatric disorders estimated from genome-wide SNPs," Nat. Genet., vol. 45, pp. 984-994, 2013.
[8] J. Soares, "Individualized Prediction of Euthymic Bipolar Disorder and Euthymic Major Depressive Disorder Patients Using Neurocognitive scores, Neuroimaging Data and Machine Learning," Biological Psychiatry, vol. 81, p. 274, 2017.
[9] F. Hoffmann, "L-type CaV1.2 channels: from in vitro findings to in vivo function," Physiol. Rev., vol. 94, pp. 303-326, 2014.
[10] L. P. Hou, "Genome-wide association study of 40000 individuals identifies two novel loci associated with bipolar disorder," Hum. Mol. Genet., vol. 25, pp. 3383-3394, 2016.