'Tumor Hypoxia Diagnosis' using a Deep CNN Learning Strategy: A Theranostic Pharmacogenomic Approach

Tumor hypoxia renders most anticancer drugs ineffective. However, owing to the lack of proper signaling in the hypoxic microenvironment, the condition cannot be detected in advance, which leads to unnecessary delay in diagnosis and treatment. The main objectives of this work are to identify the 'hypoxia prone SNPs', helping patients predict their likelihood of hypoxia formation, and to design and develop a machine that helps diagnose hypoxia from pathological images using deep learning with a 'convolutional neural network'. The genetic signatures corresponding to 'tumor hypoxia development' have been identified by a pharmacogenomic method comprising genomics, epigenomics, metagenomics and environmental genomics. All the common hypoxia related mutations have been included in the study. The formation of hypoxia has to be carefully identified and monitored during treatment to ensure that the right drug is being administered. In the present manuscript, a novel method of elucidating the condition from a simple pathological image using a 'deep convolutional network' is suggested. The accuracy of the suggested machine is found to be 92.8%, making it a potential device for predicting hypoxia mutations and thereby helping to monitor hypoxic conditions effectively. Thus, the hypoxia prone SNPs corresponding to common mutations have been identified. Patients having the hypoxia prone SNPs are advised to guard against hypoxia formation with the help of diagnostic tests using the machine. The machine helps to warn patients of the respective mutations from a simple pathological image of the tumor cells.


INTRODUCTION
Cancer is a leading cause of mortality worldwide every year (Fitzmaurice et al., 2015). The common characteristics of cancer include increased proliferation, decreased apoptotic pathway function, deregulated metabolism and depletion of the cellular oxygen content.
Although the treatment strategy for cancer involves multiple protocols such as chemotherapy, radiation therapy and surgical intervention, the failure rate remains unexpectedly high. One of the major reasons is the development, during prolonged cancer treatment, of a cellular condition known as hypoxia, which makes cancer cells highly resistant to most anti-cancer drugs. The condition is characterized by insufficient oxygen within the cancer cells, leading to a decrease in the cellular metabolic rate. This results in a different cellular environment in which most anti-cancer drugs fail to function, providing a natural 'drug resistance'. The 'hypoxia condition' can be considered a 'protective adaptation' by cancer cells, especially solid tumor cells, to increase anticancer drug resistance (Sriraman, Aryasomayajula & Torchilin, 2014). In most cases, this unfavorable condition triggers extensive metastasis and accelerated malignant progression. Hypoxia in sarcomas leads to distant metastasis, while hypoxia in cervical cancer results in local and regional spreading of the cancer. Another major challenge associated with hypoxia is the difficulty of its early detection, as the condition does not support proper signaling for recognizing the reactive oxygen species (ROS) used for the diagnosis of hypoxia (Fleet, 2006). This may lead the tumor cells to over-populate and promote metastasis excessively (Brown & Wilson, 2004), (Wilson & Hay, 2011). Even an effective drug delivery system may fail to reach the region of hypoxia because of poorly developed blood vessels, deregulated metabolism and increased drug resistance.
Though oxygen sensors such as the Eppendorf needle have been suggested for monitoring hypoxia, the technique is not widely accepted because of the operational difficulty of introducing individual needle sensors. Non-invasive analysis using indirect assays that study the hypoxia inducible factors (HIF), bio-reductive metabolism, etc. has been suggested to measure and monitor the condition. A few imaging techniques, such as blood oxygen level-dependent magnetic resonance imaging (BOLD-MRI) and phosphorescence imaging, have been introduced. The major disadvantage of BOLD-MRI is that it measures only deoxyhemoglobin concentrations. The drawbacks of the phosphorescence based imaging technique are the toxicity of the phosphorescence dye used in the analysis and the inability to assess deeper tissues. Moreover, the individual pharmacogenomic variations seen in the diagnostic fingerprints of patients demand a 'person-specific diagnostic system' incorporating attributes such as genomics, epigenomics, metagenomics, environmental genomics and drug genomics (HimaVyshnavi et al., 2017), (Iyer, Karthikeyan, Sanjay Kumar & Krishnan Namboori, 2017), (Iyer, Palayat, Shanmugam & Namboori, 2017). The early diagnosis of hypoxia associated with cancer is still a challenge.
The 'deep convolutional neural network (CNN)' based learning environment has been reported as a novel and efficient theranostic technique for obtaining biological functional information from cellular images (Rawat & Wang, 2017). The well-established 'TensorFlow Convolutional Neural Networks for CIFAR-10', as shown in ("TensorFlow Tutorial | Deep Learning Using TensorFlow | Edureka", 2018), has been widely used to address biological functionalities including molecular biology and genomic imprinting.
The CNN has been used in the analysis of histopathological images for a few biological conditions and found to be very efficient in retrieving diagnostic and prognostic information about the disease conditions (Khosravi, Kazemi, Imielinski, Elemento & Hajirasouliha, 2018), (Komura & Ishikawa, 2018), (Qu et al., 2018). The present work studies the possibility of using this technique to identify the 'hypoxia condition' associated with cancer treatment by incorporating a deep CNN learning environment and correlating its results with pharmacogenomic variants.
These SNPs are further characterized and classified as damaging, tolerated, benign, possibly damaging, probably damaging or deleterious using Sorting Intolerant from Tolerant (SIFT) and Polymorphism Phenotyping (PolyPhen) analysis (Ng, 2003), (Adzhubei, Jordan & Sunyaev, 2013). The probable metagenomic contribution to the formation of hypoxia is identified by comparing microbial genomes with the genes responsible for hypoxia using the Basic Local Alignment Search Tool (BLAST) of the National Center for Biotechnology Information (NCBI) (Ye, Ma, Madden & Ostell, 2013).
The environmental factors causing the mutations have been taken from the 'Comparative Toxicogenomics Database (CTD)' (Davis et al., 2016). The anti-tumor drugs were taken from DrugBank, and the coding SNPs most likely to have an impact on biological function were identified with the LS-SNP/PDB tool. Solid tumor related genes have been taken up for the analysis, and their epigenetic contributions towards variations have been studied (Bock, Walter, Paulsen & Lengauer, 2007), (Dworkin, Huang & Toland, 2009). DNA methylation is found to be the most prominent epigenetic mechanism responsible for causing variation in CpG dinucleotides (Thienpont et al., 2016). The attributes corresponding to methylation of DNA have been identified.
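The SIFT part of this classification can be illustrated with a minimal sketch. It assumes the conventional SIFT cutoff, under which a substitution with a score of 0.05 or below is predicted damaging and anything above is predicted tolerated. The SNP names and scores are hypothetical placeholders, not values from this study, and PolyPhen is left out because its class boundaries depend on the model version used.

```python
# Conventional SIFT cutoff (assumption: the study used the default threshold).
SIFT_CUTOFF = 0.05


def classify_sift(score, cutoff=SIFT_CUTOFF):
    """Map a SIFT score in [0, 1] to a 'damaging' or 'tolerated' call."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("SIFT scores lie between 0 and 1")
    return "damaging" if score <= cutoff else "tolerated"


# Hypothetical scores for illustration only.
example_scores = {"snp_a": 0.01, "snp_b": 0.05, "snp_c": 0.40}
calls = {name: classify_sift(score) for name, score in example_scores.items()}
```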

Deep learning Implementation
In total, 300 breast cancer sample images were included in the present analysis, of which 150 were hypoxia positive and the remaining 150 were hypoxia negative, to avoid class imbalance. Of the 300 'pathology images', 80% were used as the training set and the remaining 20% as the testing set. The labels are encoded for training, and the Python library TensorFlow is used to implement the deep CNN model ("The Human Protein Atlas", 2018).
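The label encoding and the 80/20 split described above can be sketched in plain Python. This is an illustration under assumptions: the text does not say whether the split was stratified, so the per-class (stratified) split here is our choice, and the class names, function names and random seed are hypothetical.

```python
import random


def encode_labels(names):
    """Map class names to integer codes in sorted order."""
    classes = sorted(set(names))
    mapping = {name: code for code, name in enumerate(classes)}
    return [mapping[name] for name in names], mapping


def stratified_split(labels, train_fraction=0.8, seed=0):
    """Return (train_idx, test_idx), splitting each class separately."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label][:]
        rng.shuffle(indices)
        cut = int(round(len(indices) * train_fraction))
        train_idx.extend(indices[:cut])
        test_idx.extend(indices[cut:])
    return sorted(train_idx), sorted(test_idx)


# 150 hypoxia positive and 150 hypoxia negative images, as in the text.
names = ["hypoxia_positive"] * 150 + ["hypoxia_negative"] * 150
labels, mapping = encode_labels(names)
train_idx, test_idx = stratified_split(labels)
```

With these counts the sketch yields 240 training and 60 testing images, each set keeping the balanced 1:1 class ratio.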

Algorithm and optimization of the conditions
The structure of the prediction model is depicted in Figure 1. The pathological images collected for this work have large pixel dimensions. To avoid a large number of weights in the initial hidden layers, the CNN is not fully connected; instead, each unit is attached to only a few regions of the preceding layer. The hyperparameters were tuned to avoid overfitting: the filter size was set to 7, the number of epochs to 50, and the number of layers was increased.
In the optimized model, there are 23 layers in total, consisting of alternating convolution, max-pooling and rectified linear unit (ReLU) activation layers along with two fully connected layers, where the outputs of the alternating individual layers are stacked together. The cross entropy is calculated, and 'adaptive moment estimation' (Adam) is used to optimize the network weights iteratively. The deep neural network has been trained for different numbers of epochs until the results converge and provide maximum prediction accuracy (Figure 2).
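The saving in weights from local connectivity can be made concrete with a short calculation. Only the filter size of 7 comes from the text; the 224 x 224 RGB input size and the 32 filters (or units) are assumptions chosen for illustration, and the function names are hypothetical.

```python
def conv_params(kernel, in_channels, out_channels):
    """Weights plus biases of a convolution layer with a square kernel."""
    return kernel * kernel * in_channels * out_channels + out_channels


def dense_params(height, width, in_channels, units):
    """Weights plus biases of a fully connected layer on the flattened image."""
    return height * width * in_channels * units + units


def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


# Assumed input: 224 x 224 pixels, 3 channels; 32 filters or units.
n_conv = conv_params(kernel=7, in_channels=3, out_channels=32)
n_dense = dense_params(height=224, width=224, in_channels=3, units=32)
out_side = conv_output_size(size=224, kernel=7)
```

Under these assumptions the 7 x 7 convolution needs 4,736 parameters, whereas a fully connected first layer of the same width would need 4,816,928.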
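The loss and the optimizer named above can be sketched without any deep learning library. The example below is not the authors' TensorFlow model: it assumes the two-class (binary) form of cross entropy, uses a hypothetical one-weight logistic model on toy data, and raises the learning rate to 0.1 so the effect is visible within 50 steps. The decay rates and epsilon are the commonly used Adam defaults.

```python
import math


def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Mean binary cross entropy between labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


def adam_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar weight; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad          # first moment estimate
    v = beta2 * v + (1 - beta2) * grad * grad   # second moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    return w - lr * m_hat / (math.sqrt(v_hat) + eps), m, v


def predict(w, xs):
    """Logistic model with a single weight (toy stand-in for the network)."""
    return [1.0 / (1.0 + math.exp(-w * x)) for x in xs]


# Toy, linearly separable data (hypothetical).
xs, ys = [-2.0, -1.0, 1.0, 2.0], [0, 0, 1, 1]
w, m, v = 0.0, 0.0, 0.0
initial_loss = binary_cross_entropy(ys, predict(w, xs))
for t in range(1, 51):  # 50 steps, echoing the 50 epochs in the text
    ps = predict(w, xs)
    grad = sum((p - y) * x for p, y, x in zip(ps, ys, xs)) / len(xs)
    w, m, v = adam_step(w, grad, m, v, t, lr=0.1)
final_loss = binary_cross_entropy(ys, predict(w, xs))
```

On this toy problem the loss falls from ln 2 at the start to a smaller value after 50 steps, which is the behavior the training curves are expected to show.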

RESULTS AND DISCUSSION
Through classification, the upregulation of hypoxia gene mutations has been identified. The hypoxia related SNPs have been identified from the pharmacogenomic analysis through online databases such as dbSNP (NCBI) and SNPnexus. The pharmacogenomic and deep learning analyses have been carried out in parallel, and a correlation has been set up between the results. Thus, the deep CNN is used for the image classification task to identify the specific mutations responsible for hypoxia.
Early detection of the development of hypoxia is made possible through the deep CNN model, while the proneness to hypoxia formation is assessed through the pharmacogenomic model (Namboori et al., 2011).
People with these SNPs in their respective genes are more prone to the mutations.

Epigenomics
Among the attributes corresponding to epigenetic variations, repetitive DNA, evolutionary history, transcriptome, epigenome and chromatin structure are found to contribute most towards the 'hypoxia microenvironment' (Figure 3). The SNPs in the 'methylation prone region', called methylation prone SNPs (MeSNPs), have been identified (Table 2). Persons with these SNPs are more inclined to epigenomic variations.

Metagenomics
The metagenomic analysis describes the influence of microbes living in the human body in supporting the mutations. The microbes whose genomes are similar to the hypoxia genes have been identified as the most influential in causing the mutations (Banerjee, Mishra & Dhas, 2015). The corresponding SNPs are included in Table 3, suggesting that people with these SNPs are more susceptible to metagenomic variations.

Environmental factors
Many epidemiological studies show that environmental factors also contribute towards mutations (Boffetta & Nyberg, 2003). Mutagens such as resveratrol, deferoxamine, quercetin, benzopyrenes, methylcholanthrene, beta-naphthoflavone, estradiol, dinoprostone, tetrachlorodibenzodioxin, celecoxib and indomethacin are found to be most influential in causing the mutations. The chemicals interacting with the genes and having toxic effect are included in the

Drug genomics
The drug genomics study correlates the genetic signatures of various proteins specific to hypoxia genes with the anti-cancer drugs that could be effective when prescribed after identification of the corresponding SNPs (Table 5).

Table 5. Hypoxia associated proteins and proneness to drug action.

Deep CNN
The predictive machine, trained for 50 epochs, attained convergence in 33 iterations with an accuracy of 92.8%. The parameters have been defined, and the machine is set to give high accuracy in detecting the hypoxic condition from a given pathological image (Figure 4).

CONCLUSION
The hypoxia-leading mutations in ABCC1, ABCB1, MTHFR, RFC1, HPRT1, CYP2B6, CYP2C8, CYP2C9, ADAM17, CYP2C19, CYP2D6, HIF2A, HIF1A, CYP1A1, CYP1A2, CASP1, AKR1C1, AKR1C2, PTGS2 and CASP6 have been included in the analysis. The genetic signatures corresponding to proneness to these mutations have been listed. Moreover, the epigenetic, metagenomic and environmental factors leading to hypoxic conditions, and their SNPs, have been computed. Early identification and continuous monitoring of hypoxia are essential in making cancer treatment effective. This can be achieved using 'deep CNN based image processing' of pathological images. The predictive model designed has been identified as a potential tool for identifying tumor hypoxia and the mutation behind it. The machine gives a predictive accuracy of 92.8%, suggesting that the tool is a useful device for tumor hypoxia prediction. The tool helps in incorporating the theranostic pharmacogenomic approach for the early detection and continuous monitoring of tumor hypoxia.
Deep CNN based support systems have been identified as potential theranostic devices in the modern diagnostic and prognostic scenario, especially in the pharmacogenomic approach, where a person-specific continuous monitoring system is highly valued. The strategy followed in this paper can help in making similar correlations for other diseases and conditions and in providing 'simple, effective and economic theranostic' devices. It may further be extended to address critical biological and medical conditions that are expensive and time consuming to detect at an early stage, providing a 'personalized' strategy. The individual mutations specific to the setting up of the tumor microenvironment can be identified.

Figure 1. The model structure of the TensorFlow convolutional neural network.

Figure 2.

Figure 4. The accuracy and loss (%) versus epochs, where the accuracy and loss tend to converge at epochs 33 and 37, respectively.

Table 3. Hypoxia genes and proneness to microbial influence.

Table 4. Genes and environmental factors influencing mutation.