
Williams

1
Discovering Valuable
Lung Cancer Biomarkers

Aaliyah Williams
Niles North High School























Table of Contents:
Title
Table of Contents
Acknowledgements
Purpose/Hypothesis/Variables
Review of Literature
Materials & Procedure
Results/Data Analysis
Conclusion
Reference List



















Acknowledgements:


I would like to thank my teacher, Ms. Camel, for her guidance throughout my project and her
critiques of my paper. I would also like to acknowledge Malik Yousef, who wrote the algorithm
used within my experiment.

































Purpose:

The purpose of this experiment is to identify valuable miRNA lung cancer biomarkers
that are present within the blood plasma. These biomarkers are identified through the use of a
learning algorithm, specifically a Support Vector Machine. MiRNA can often be found
circulating in the bloodstream. If miRNA biomarkers of lung cancer are present within the
blood plasma, then a simple blood test may be able to diagnose a patient with lung cancer. This
could reduce the use of invasive or costly tests, such as biopsies or imaging tests, thus reducing
expenses and allowing quicker diagnoses.

Hypothesis/Rationale:

If the miRNA expression levels of lung cancer patients are analyzed, then a set of miRNAs
will be identified that are present within blood plasma and may be able to diagnose lung
cancer. MiRNAs are a recent discovery, and although there is little research on their role, it
has been theorized that these small molecules play a large role in controlling the expression of
genes. If these small molecules are able to control the expression of genes, then they should be
useful in guiding further research and in creating new and innovative diagnostic measures.


Variables
Controlled Variables: Adenocarcinoma lung cancer tissues, miRNA, reads per million mapped
Control Variable: Healthy tissue miRNA reads per million mapped
Independent Variables: Lung cancer tissue miRNA reads per million mapped
Dependent Variables: miRNA biomarkers, presence in blood plasma








Potential Lung Cancer Biomarkers
Aaliyah Williams
Review of Literature
Lung cancer has the highest mortality rate of any cancer among both women and men. It is
also the most common cancer in the world. Lung cancer accounted for 28% of cancer deaths in
2012. The United States spends close to $10.3 billion on lung cancer treatment yearly.
According to the American Lung Association, the average lung cancer survival rate is 16.3%,
which is much lower than that of many other cancers (American Lung Association, 2013, pg. 1).
This high mortality rate could be decreased significantly if a test were created that is able
to accurately detect lung cancer at an early stage.
Cancer is a range of diseases that involve the uncontrolled growth of cells and can lead
to metastasis, the spread of those cells throughout the body. There are various lung cancers,
which are treated differently depending on their type. These types can be put into two groups,
non-small-cell carcinoma and small-cell carcinoma. The non-small-cell carcinoma group contains
squamous carcinoma, adenocarcinoma, and large-cell carcinoma, which are all closely related.
Small-cell carcinoma occurs mostly in people who smoke heavily and is the less common of the
two groups. The types of lung cancer can be told apart by examining a piece of the tumor
under a microscope (Williams, 1992, pg. 6-8).
Within each of these lung cancers, stages are used to describe the development of the
disease. In stage one, the cancer has not spread to the lymph nodes and the tumor is about 2
inches in diameter, so it is present only in the lung area. In stage two, the tumor might have
grown larger than two inches, or it may have spread to nearby structures; in some cases the
cancer may have reached the lymph nodes. In stage three, the tumor has grown to a large size
and has spread to organs near the lungs, and/or the cancer has reached lymph nodes that are not
near the lungs. In the last stage, stage four, the cancer has spread to areas of the body that
are far away from the lungs, or to the other lung (Mayo Clinic Staff, 2014, pg. 1). The 5-year
survival rate is 49% in stage IA, 30% in stage IIA, 31% in stage IIB, 14% in stage IIIA, 5% in
stage IIIB, and only 1% in stage IV. As the stage increases, the chance of survival generally
decreases, which emphasizes the importance of diagnosing lung cancer in its earlier stages so
that treatment can be received early on (American Cancer Society, 2013, pg. 1).
Diagnosing lung cancer in its earlier stages is beneficial, but it is a difficult task.
According to Lung Cancer Alliance, a biopsy is required to confirm that a person has lung
cancer. Before a biopsy, a chest X-ray is performed, which uses radiation to create an image of
the inside of the chest cavity. This is not the most effective method, since small tumors
cannot be detected. If a tumor is detected on the X-ray, a CT scan is then performed to show
more detail of the chest cavity.
(University of Florida Health, n.d., pg. 1). This CT scan photo from University of Florida
Health compares a patient without lung cancer to a patient with lung cancer. The tumor is
visible on the right side of the second photo (the right lung of the patient). To ultimately
confirm the presence of lung cancer a biopsy is performed, a procedure in which a piece of
tissue or fluid is taken from the tumor and examined, usually under a microscope. A biopsy is
usually required because a tumor is hard to identify, and a tumor often goes unnoticed in its
earlier stages because the doctor is unable to recognize that it is there (Lung Cancer
Alliance, pg. 1). The effort required to diagnose lung cancer is not ideal because the tests
are invasive and costly, and tumors are often declared benign when imaging tests are used. A
more efficient screening method needs to be created in order to diagnose cancer earlier and
more accurately.
Before a patient is able to receive these very invasive tests, they must present their
doctors with signs of lung cancer and their risk factors for it. One risk factor is age. About
two out of three lung cancers are diagnosed in people over age 65, and most patients are older
than 45. The average age at diagnosis is 71 (Cancer Treatment Centers of America, pg. 1).
Another risk factor, as with many other diseases, is a predisposition to lung cancer due to a
closely related family member having been diagnosed with it. Smoking and secondhand smoke can
also contribute to lung cancer, along with radon exposure. Radon is a colorless, odorless
radioactive gas that is found in some houses and is a leading cause of lung cancer (Cancer
Treatment Centers of America, pg. 1). Exposure to air pollutants or chemicals at a person's job
may also lead to lung cancer. A patient who has received radiation therapy to the chest area is
also at risk of lung cancer.
When a patient has lung cancer, they experience a variety of symptoms that usually
aren't clearly present until the tumor begins growing. According to Stanford Medicine Cancer
Institute, the symptoms of lung cancer are constant chest pain, shortness of breath,
hoarseness, bloody or rust-colored sputum, fever, tumor near the lungs, fatigue, loss of
appetite, loss of weight without effort, bone fracture, or headaches. There is no guarantee
that these symptoms will occur in all patients who have been diagnosed with lung cancer, and
each patient will experience different symptoms (Stanford Medicine, pg. 1). These risk factors
and symptoms cannot provide the patient with a definite diagnosis of lung cancer because they
are very similar to those of other diseases. A method that is able to use biomarkers to
identify lung cancer should be much more efficient.
Biomarkers are cellular, biochemical or molecular alterations that are measurable in
biological media such as human tissues, cells, or fluids (Mayeux, 2004, 182-188). Biomarkers
are able to aid in diagnosis, prognosis, and monitoring the response to therapy, and can even
test for the recurrence of a disease. Biomarkers are specific to the type of disease and can be
present in many forms. They can be discovered in blood, urine, saliva, brain, skin,
cerebrospinal fluid and many other media. These indicators are also able to aid in drug
development because specific biomarkers can be targeted directly (Mayeux, 2004, 182-188).
In other studies, different biomarkers have been considered due to their presence in
blood plasma. In a study at the University of Texas, researchers tested whether the plasma
proteins known as insulin-like growth factors (IGFs) could be considered as biomarkers. They
suspected that these proteins are associated with lung cancer and its risks. They obtained 10
mL of blood from recently diagnosed lung cancer patients and from control subjects. These blood
samples were then analyzed using an immunoassay kit that measured the amount of IGFs within the
blood plasma. The researchers then used statistical analysis to determine which of the measured
proteins, IGF-I, IGF-II, or IGFBP-3 (an IGF-binding protein), was most associated with lung
cancer. The amounts present within the blood plasma were also analyzed statistically to find a
correlation with lung cancer risk factors, such as BMI, family history, age etc. The results
showed that IGF-I had the strongest association with lung cancer and its risk. The researchers
concluded that the protein IGF-I should be considered as a blood protein biomarker. Many
researchers have looked into different physiological biomarkers that can be found in the blood
plasma because they allow a simple blood test to be performed. These blood proteins have
created great interest because, like miRNAs, their presence in the blood allows for easy
diagnosis of diseases such as lung cancer (Yu, Spitz, Mistry, Gu, Hong, Wu, 1999, 151-156).
MiRNAs were discovered only recently, in 1993, and only 940 human miRNAs have been
identified. Scientists previously believed miRNA only had a role within the cellular processes
of non-mammals, but their theory has been refuted since the first miRNA, known as lin-4, was
identified. Researchers have discovered that the lack of lin-4 leads to reiteration of cell
lineages, thus leading to disruptions in cellular development. According to the American
Psychological Society, "These findings reveal two aspects of miRNA's unique functionality: 1)
precise regulation of the timing of a cellular event via 2) synchronous inhibition of a cadre
of genes that are functionally interdependent, thus operating as an efficient molecular
switch" (Lu, Getz, 2005, 834-838).
MiRNAs are small, non-coding RNA molecules that act post-transcriptionally and are
involved in gene silencing. MiRNAs have attracted great interest from researchers because it is
believed that they play a large role in cell cycle regulation, apoptosis, angiogenesis,
inflammation, metastasis etc. The belief that these small molecules are able to impact numerous
cellular processes stems from miRNA often being found within the blood plasma. The presence
of miRNA within the blood plasma makes miRNA a desirable candidate for a biomarker,
especially because previous studies that compared the ability of mRNAs and miRNAs to classify
tumour tissues found that miRNAs classified poorly differentiated tumour samples more
accurately (Lu, Getz, 2005, 834-838).
After further testing of these small molecules through cloning, scientists revealed that
many of them have completely different transcriptional units. This theory was developed from
the observation that miRNA genes are derived from different genomes, so the genes they develop
from aren't necessarily related in any way. Although they escaped notice until relatively
recently, miRNAs comprise one of the more abundant classes of gene regulatory molecules in
multicellular organisms and likely influence the output of many protein-coding genes (Bartel,
2004, 281-297).
Approximately seven thousand human genes are regulated by miRNAs that are encoded
within the genome. The regulation requires the roughly 21-nucleotide miRNA to interact with the
target site of an mRNA by binding to the 3' UTR (untranslated region) of the mRNA. This
interaction results in gene silencing because protein synthesis is suppressed, thus stopping
protein production. MiRNA genes are transcribed by the enzyme RNA polymerase II, and the
resulting pri-miRNAs are processed by the Drosha enzyme. These actions result in the
development of a pre-miRNA of about 70 nucleotides. The pre-miRNA then travels to the
cytoplasm and is processed by the Dicer enzyme to form a 22-nucleotide miRNA. The microRNA is
then incorporated into a ribonucleoprotein particle to make the RNA-induced silencing complex,
RISC, which controls gene silencing. A visual of the miRNA regulation process is provided below
(Jackson, Standart, 2007, 13).

This photo is a visual representation of the miRNA regulation process and shows what happens to
the miRNA after RISC, as mentioned above, is created (Fazi, Nervi, 2008, 553). The photo shows
the mRNA targets that provide binding sites for these miRNAs. As depicted in the picture, the
miRNAs may bind to the 3' UTR of mRNAs, which then undergo translational repression,
degradation or deadenylation, or P-body localization (Jackson, Standart, 2007, 13). Due to the
involvement of miRNAs in gene expression, it can be theorized that these small molecules play a
large role in many physiological processes throughout the human body.
Lung cancer is a disease that affects many people in the United States and all around the
world. A noninvasive test that uses a biomarker, such as miRNA, and is able to accurately
produce an early diagnosis of lung cancer would save many lives, since treatment could be
received earlier, and it would aid in drug development and in predicting drug response. This
would increase a patient's chances of survival immensely and could decrease the large amount of
money the government spends on the treatment and diagnosis of lung cancer.


Materials
Dell Laptop
Mirandola Database
NCBI database
miRNA reads per million mapped of normal tissues
WEKA data mining tool
Support Vector Machine-Recursive Network Elimination (SVM-RNE)
TCGA Database portal
41 lung adenocarcinoma miRNA datasets
Explanation:
The Mirandola Database was used to obtain the miRNA reads per million mapped of healthy
tissues, for comparison with the miRNA reads per million mapped of lung cancer tissues obtained
from the TCGA database. The Support Vector Machine-Recursive Network Elimination (SVM-RNE) data
mining algorithm was used to narrow down a set of significant miRNAs. The WEKA data mining tool
was used to run the algorithm.
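
For readers unfamiliar with the unit, "reads per million mapped" is a normalized count: the reads assigned to one miRNA, scaled to a library of one million mapped reads. A minimal sketch of that usual definition is below. The function name and the example numbers are hypothetical and are not taken from the Mirandola or TCGA data.

```python
def reads_per_million(mirna_reads, total_mapped_reads):
    """Scale a raw read count to reads per million mapped reads (RPM)."""
    if total_mapped_reads <= 0:
        raise ValueError("total_mapped_reads must be positive")
    return mirna_reads / total_mapped_reads * 1_000_000


# Hypothetical sample: 50 reads for one miRNA out of 2,000,000 mapped reads.
example_rpm = reads_per_million(50, 2_000_000)
print(example_rpm)
```

Normalizing this way lets samples that were sequenced to different depths be compared on the same scale.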
Procedure:
1. Download the 41 lung adenocarcinoma miRNA datasets from the TCGA database.
2. Download WEKA and the SVM algorithm.
3. Load the datasets into Excel.
4. Download the SVM-RNE algorithm from http://web.macam.ac.il/~myousef.
5. Load the datasets obtained from the database into WEKA.
6. Open the algorithm in WEKA (it automatically performs the following tasks):
a. Use a statistical t-test to identify the most differentially expressed miRNAs.
b. Calculate a score for each miRNA.
c. Remove the lowest 10% of scores.
d. Take the remaining miRNAs and pool them.
e. Repeat until only 4 miRNAs remain.
7. Look through the miRNA database to determine which of the miRNAs identified as
significant are found in blood plasma.
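
The elimination loop in step 6 can be sketched roughly as follows. This is only an illustration of the "score, drop the lowest 10%, repeat until 4 remain" idea, not the SVM-RNE code that was actually run: the real algorithm derives its scores from a trained Support Vector Machine, whereas this sketch assumes a plain Welch t-statistic as a stand-in score. All function names and the data layout (a dictionary from miRNA name to a list of reads-per-million values) are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(cancer_values, healthy_values):
    """Absolute Welch t-statistic between two groups of expression values."""
    standard_error = math.sqrt(
        variance(cancer_values) / len(cancer_values)
        + variance(healthy_values) / len(healthy_values)
    )
    if standard_error == 0:
        return 0.0
    return abs(mean(cancer_values) - mean(healthy_values)) / standard_error


def recursive_elimination(cancer, healthy, keep=4, drop_fraction=0.10):
    """Repeatedly drop the lowest-scoring miRNAs until `keep` remain.

    cancer, healthy: dicts mapping miRNA name -> list of values (assumed layout).
    """
    remaining = list(cancer)
    while len(remaining) > keep:
        # Stand-in score; SVM-RNE itself would rescore using an SVM here.
        scores = {name: welch_t(cancer[name], healthy[name]) for name in remaining}
        ranked = sorted(remaining, key=lambda name: scores[name], reverse=True)
        # Drop the lowest 10%, at least one, but never go below `keep`.
        n_drop = max(1, int(len(ranked) * drop_fraction))
        n_drop = min(n_drop, len(ranked) - keep)
        remaining = ranked[: len(ranked) - n_drop]
    return remaining
```

Because the stand-in score does not change between rounds, this sketch ends up keeping the four highest-scoring miRNAs; in a method that retrains a classifier each round, the ranking can shift as features are removed.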








Results:

Dataset summary of lung cancer tissues
Number of miRNA instances (lung cancer tissues): 132,746
Number of lung cancer tissues: 41
Mean reads per million miRNA mapped (lung cancer patients): 1,820
There were 132,746 instances of miRNAs used within this experiment and 41 lung cancer tissues
used to extract miRNAs, and the mean reads per million in the data was approx. 1820.

Comparison Between Lung Cancer and Healthy Tissues: miRNA Reads Per Million Mapped

This graph shows the significant difference between the average miRNA reads per million
mapped in lung cancer tissues and the average miRNA reads per million mapped in healthy
tissues.


Data Table of Reads Per Million Mapped
MiRNA           Average reads per million mapped    Average reads per million mapped
                (lung cancer tissues)               (healthy tissues)
hsa-let-7i      1.4x10^7                            6.5x10^3
hsa-mir-3168    9.1x10^6                            44.5
hsa-mir-21      7.06x10^6                           2.36x10^4
hsa-mir-22      1.95x10^6                           3.7x10^3
This data table consists of the average miRNA reads per million mapped of the lung cancer
tissues analyzed in this experiment and the average miRNA expression in healthy tissues.






Data Analysis:
The following miRNAs were identified through computational analysis: hsa-let-7i, hsa-mir-3168,
hsa-mir-22, and hsa-mir-21. After SVM-RNE analysis, the most significant miRNA identified was
hsa-mir-21. The miRNA hsa-let-7i has approximately 6.58e3 reads per million mapped in normal
tissue, but approx. 1.4e7 reads per million mapped in a lung cancer patient. The miRNA
hsa-mir-21 has approx. 2.36e4 reads per million mapped in normal tissue, but approx. 7.0e6
reads per million mapped in a lung cancer patient. The miRNA hsa-mir-3168 has 44.5 reads per
million mapped in normal tissue, but approx. 9.1e6 reads per million mapped in lung cancer
patients. The miRNA hsa-mir-22 has approx. 1.9e6 reads per million mapped in lung cancer
patients, but approx. 3.72e3 in normal tissue. For every one of these miRNAs, the reads per
million mapped in lung cancer tissue exceeded the value for its normal counterpart by far more
than a thousand reads per million mapped. The miRNAs identified all had reads per million
mapped higher than the average of the lung cancer dataset, which was 1820 reads per million
mapped. After searching through the miRNA database for miRNAs found within the blood plasma,
every miRNA that was identified as significant was found to have been reported in blood plasma
in other scientific studies.
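
One way to make the size of these differences concrete is to express each comparison as a fold change (the lung cancer average divided by the healthy average). The short sketch below uses the averages from the data table in this paper; the variable and function names are hypothetical, and the fold change and log2 fold change are an added illustration, not part of the original SVM-RNE analysis.

```python
import math

# (lung cancer average, healthy average) in reads per million mapped,
# copied from the data table in this paper.
averages = {
    "hsa-let-7i": (1.4e7, 6.5e3),
    "hsa-mir-3168": (9.1e6, 44.5),
    "hsa-mir-21": (7.06e6, 2.36e4),
    "hsa-mir-22": (1.95e6, 3.7e3),
}
DATASET_MEAN = 1820  # mean reads per million mapped of the lung cancer dataset


def fold_change(cancer_avg, healthy_avg):
    """Ratio of the lung cancer average to the healthy average."""
    return cancer_avg / healthy_avg


summary = {}
for name, (cancer_avg, healthy_avg) in averages.items():
    ratio = fold_change(cancer_avg, healthy_avg)
    summary[name] = {
        "fold_change": ratio,
        "log2_fold_change": math.log2(ratio),
        "above_dataset_mean": cancer_avg > DATASET_MEAN,
    }

# Print the miRNAs from the largest to the smallest fold change.
for name in sorted(summary, key=lambda n: summary[n]["fold_change"], reverse=True):
    row = summary[name]
    print(f"{name}: {row['fold_change']:.1f}x (log2 = {row['log2_fold_change']:.2f})")
```

A ratio is used here rather than a simple difference because the healthy averages span several orders of magnitude, so the same absolute gap can mean very different relative changes.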
This experiment has several limitations. Analyzing more datasets over a longer period of
time would increase the accuracy of the miRNA reads per million found for lung cancer tumor
tissues. Another limitation is the use of in silico analysis, that is, analyzing data using a
computer program. In vivo testing will be needed in order to truly conclude that these miRNAs
are biomarkers of lung cancer.












Conclusion:
In this experiment, the miRNA expression levels of lung cancer patients with
adenocarcinoma were analyzed. Through the use of the SVM-RNE algorithm in WEKA, a set of
upregulated biomarkers was singled out. After SVM-RNE was performed, it was concluded that
hsa-mir-21, hsa-let-7i, hsa-mir-22, and hsa-mir-3168 were highly upregulated compared to other
miRNAs within the dataset. The reads per million mapped of these four miRNAs in lung cancer
tissues were then compared to those in healthy tissues, and all four showed a large difference.
These miRNAs were also searched for in a database to see whether they are present within the
blood plasma, and all four have been identified as being present in blood plasma. It was
hypothesized that if the expression levels of lung adenocarcinoma patients were analyzed,
significant miRNAs would be found and they would be present in the blood plasma. Since the
data showed a large difference between the reads per million mapped of the identified miRNAs in
lung cancer tissues and in their healthy tissue counterparts, it can be concluded that the
hypothesis was supported. This large difference shows that these miRNAs have abnormal
regulation in lung cancer patients, so they may be potential biomarker candidates. The
hypothesis is also supported because the miRNAs identified have all been reported in blood
plasma, making them excellent candidates to be useful biomarkers. The analysis used in this
experiment does not fully establish that these miRNAs can be used as biomarkers of lung cancer,
but it shows that they should be considered as candidates for more research on their ability to
act as biomarkers for lung cancer.
The potential biomarker candidates identified within my experiment have detailed
annotation that has been built up through the years from researchers' studies. The potential
biomarker hsa-let-7i has been observed in previous experiments to have a role in cellular
proliferation, and is specifically targeted by the A549 gene. This gene serves as a miRNA
target, and hsa-let-7i itself is located on chromosome 12. Another potential biomarker
identified was hsa-mir-3168, whose target sites are located on various genes. One in particular
is ARID1A, a protein-coding gene located on chromosome 1 that has been shown in other studies
to be one of many genes contributing to the development of cancer if mutated. Also, hsa-mir-21
has been observed to target the tumor suppression gene TPM1 (Tropomyosin), on chromosome 17.
The miRNA hsa-mir-22 targets the genes CDK6, SIRT1, and Sp1, which are all involved in the
senescence program, which plays a role in tumor suppression. This information supports my
hypothesis because the genes and proteins that these miRNAs are associated with in scientific
studies are related to lung cancer, or to cancer in general.
The discovery of miRNAs that are able to act as biomarkers of lung cancer is greatly
desired because it would reduce the invasive tests that a patient typically has to undergo in
order to be diagnosed with lung cancer. The use of biomarkers that are present in the blood,
such as miRNAs, would allow a simple blood test to be performed to properly diagnose a patient
with lung cancer. It would also allow more drugs to be created that specifically target the
miRNAs that are proven to be a significant part of lung cancer development, and these miRNAs
may be able to predict a patient's drug response. The results of this study do not accurately
represent the reads per million mapped of miRNAs in lung cancer patients because there were
only 41 datasets containing lung adenocarcinoma miRNA information. A simple blood test would
reduce the large sums of money that the U.S. and countries all around the world invest in lung
cancer testing and research. Further experimentation on this topic can include physically
making a blood test that is able to identify the miRNAs that show a strong association with
cancer, so that the test can use the expression levels of miRNA to determine whether a person
has lung cancer.














References

Bartel, D. (2004). MicroRNAs: Genomics, Biogenesis, Mechanism, and Function. Cell
Press, 116(2): 281-297. Retrieved February 10, 2014 from
http://www.sciencedirect.com/science/article/pii/S0092867404000455.
Fazi, F., Nervi, C. (2008). MicroRNA regulation [Diagram]. Cardiovascular
Research, 79(4): 553. Oxford University Press, n.d.
Jackson, R., Standart, N. (2007). How do MicroRNAs Regulate Gene Expression? Science
Direct. Retrieved February 10, 2014 from
http://www.gene-quantification.eu/jackson-review-microrna-2007.pdf.
Lu, J., Getz, G. (2005). MicroRNA Expression Profiles Classify Lung Cancer.
Nature: International Weekly Journal of Science, (435): 834-838. Retrieved February
10, 2014 from http://www.nature.com/nature/journal/v435/n7043/abs/nature03702.html.
American Lung Association. (2013). Lung Cancer Fact Sheet. Retrieved January 2, 2014 from
http://www.lung.org/lung-disease/lung-cancer/resources/facts-figures/lung-cancer-fact-sheet.html
American Cancer Society. (2013). Non-Small Lung Cancer Survival Rates By Stage. Retrieved
January 2, 2014, from
http://www.cancer.org/cancer/lungcancer-non-smallcell/detailedguide/non-small-cell-lung-cancer-survival-rates
Mayo Clinic Staff. (2014, Jan 2). Diseases and Conditions: Lung Cancer. Mayo Clinic. Retrieved
January 2, 2014, from
http://www.mayoclinic.org/diseases-conditions/lung-cancer/basics/tests-diagnosis/CON-20025531
Lung Cancer Alliance. (n.d.). Diagnosing Lung Cancer. Retrieved January 2, 2014, from
http://www.lungcanceralliance.org/get-information/what-is-lung-cancer/diagnosing-lung-cancer.html
Cancer Treatment Centers of America. (n.d.). Lung Cancer Risk Factors. Retrieved January 2,
2014, from http://www.cancercenter.com/lung-cancer/risk-factors/
Stanford Medicine. (n.d.). Symptoms of Lung Cancer. Retrieved January 2, 2014, from
http://cancer.stanford.edu/lungcancer/lung/symptoms.html
Mayeux, R. (2004). Biomarkers: Potential Uses and Limitations. The American Society for
Experimental NeuroTherapeutics, 1(2): 182-188. Retrieved January 2, 2014 from
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC534923/.
University of Florida Health. (n.d.). [Normal Chest vs. Chest with Right Lung Cancer]. Retrieved
January 2, 2014, from http://ufhealthjax.org/cancer/images/ct-lungs-large.jpg
Williams, C. (1992). Lung Cancer: The Facts. Oxford, New York: Oxford University
Press.
Yu, H., Spitz, M., Mistry, J., Gu, J., Hong, W., Wu, X. (1999). Plasma Levels of Insulin-Like
Growth Factor-I and Lung Cancer Risk: a Case-Control Analysis. Journal of the National Cancer
Institute, 91(2), 151-156. doi: 10.1093/jnci/91.2.151. Retrieved January 2, 2014 from
http://jnci.oxfordjournals.org/content/91/2/151.full
