Day 1 – Session 2: Building a Consortium of Cohorts

>>Eric Boerwinkle:
Good, thank you, Terry. So the picture for those
of you who have not been there is the very large
Texas Medical Center. So since the word cohort
has been used about 7 bajillion [phonetic sp]
times, I thought maybe we should pause and define
what a cohort is. First it’s a sample of
individuals and it’s a sample of individuals
who share some characteristic. They’ve been identified
through a place or they’ve been identified
through an experience such as the V.A. I think it’s also
important to remember that it has some
time frame to it. They were ascertained
at a point in time and they were also followed up over
time by definition. And the typical scenario
is that they’re measured early or monitored in
our case — we’re quite enthusiastic about the
use of mobile monitors. And then they’re followed
up over time for later outcomes. And that would be the
typical cohort study. There’s a lot of interest
in whether a cohort needs to be representative. It does not. You don’t need to have a
representative cohort, or you don’t need to
have a cohort that is representative of
the United States. We just need to have a
cohort that we know from whence they came so
we can generalize the results to the
population. So I think when we
use the word cohort repeatedly it’s important
to keep these definitions in mind. I’ve been very fortunate
and honored to work in multiple NHLBI-funded
cohorts and formed a consortium known as
the CHARGE Consortium. It has the goal to
identify genes or genomic motifs influencing
common chronic disease, particularly those
related to cardiovascular disease, heart, lung and
blood diseases, and also aging. The cohorts are not all ancient,
a word that was used earlier. Some of them are very
newly established, such as the Study of Latinos. There is the ARIC study,
a large study, a multi-ethnic study, or
bi-ethnic study in that case. And the well-known
Framingham Heart Study, just to name a few. The advantage of these
cohorts is that individuals were measured for hundreds if not
thousands of variables, and they’re measured
for these variables over time. They’re not
genetic studies. They’re studies usually
of the outcomes. And so they have both
environmental and other exposures. And so really it’s an
ideal platform to examine gene-environment
interaction. The CHARGE Consortium
now exceeds 100,000 individuals. There are about 75,000
individuals who were very deeply genotyped and/or
sequenced with very deep phenotypic information. All of these data
including genetic data and environmental data
and phenotypic and outcomes data are
available via dbGaP, and I do want to thank the
support of NHLBI and NIA for maintaining these
cohorts as valuable resources for the
scientific community. In the CHARGE Consortium
the working groups are usually led by young
investigators, and I think that’s a word that we
haven’t used enough so far. These types of
resources are great training opportunities
for the next generation of biomedical
researchers. Just to give you a taste
of the kind of data, and by the way, at the bottom
of this slide are dbGaP accession numbers for the
data I will be talking about. So as of, I don’t know
about today, but as of, you know, contemporary,
we have whole exome sequencing on
approximately 11,500 individuals. Whole genome sequencing
on 5,700 individuals. The CHARGE Consortium
has a very close collaboration with the
Exome Sequencing Project. And when you combine the
data we exceed then 23 plus 11 5,000
individuals. We have a very strong
analytic infrastructure that I’ll talk about
in a little bit. So it really provides
an outstanding analysis platform for
investigational sciences. One example of
the work that this group has taken on, in
collaboration with Richard Gibbs and Alanna
Morrison, whose pictures are on this slide, is
really to try to tackle the analysis of whole
genome sequence. Francis, in his
introduction today, mentioned that an idea
would be to have whole genome sequence data on
all of these individuals. For many, it’s an
intractable data framework and we’re
trying to make this seemingly intractable
data framework accessible by providing annotation
and analytical tools that would be available to
the scientific community. This is one of two what
I’ll label as important slides. I think it’s really
important to not think about the U.S. research cohort as simply
a collection, or, to use the words we’ve used many times today, a stitched-together
consortium of cohorts. I really think we need
to step back and ask how this collection of maybe
existing cohort studies, bringing in clinical
samples and also, as we identify needs, new sampling
opportunities, can be much
greater than the sum of the parts. As long as we
have just heterogeneous groups of individuals
and we’re not paying attention to synergies, I think
we will fall short of our existing goals. So I think it’s very
important that we think not about a simple
consortium of cohorts but rather think about how we
can create this community and have truly a
national or U.S. research cohort. This is the other slide
I’m labeling important. There are really
enhanced opportunities if we can pull this off
and make a true U.S. research cohort. The first is, we’ve
talked many times today that the U.S. research cohort needs
to be participatory. They’re not just research
subjects but rather they’re fully integrated
into the cohort research enterprise. We’re approaching
that already today. The cohort
studies have good community engagement,
good outreach. But I think we
can do better. I also think there’s
great enhanced phenotyping opportunities
if we can bring together existing sample resources
and create a U.S. cohort, such as the use
of mobile devices, and also linking in real-time
these participants into the electronic medical
record and research platform. One of the things that
hasn’t received a lot of attention today, and I’m
a bit surprised, is that this is also a great platform
and infrastructure for clinical trials, very rapid and ongoing
clinical trials. Since you’ll have a
deeply phenotyped group, you’ll be able to
very quickly ascertain potential trial
participants. This may be
heresy to say. Maybe it’s the second
time I’ve used that word. It really is a platform
for a national healthcare exchange or a national
electronic medical record. So as people move around,
we’ll be able to move around and their
electronic medical record will move with them. Also, the ultimate goal
is to make enhanced community resources. I’m going to spend a
little time because it’s something I’m very
excited about: this virtuous cycle
between health care and research. As of right now, most of
us are in this phase of generating data and
doing discovery, and that discovery phase,
you know, hopefully moves into translation. I think if we can
generate and pull together this U.S. research cohort, we can
basically build a cycle in which the
translational information again feeds back into
data in real time and it creates more discovery
opportunities and it again makes more
translational opportunities. That really I think is
the promise of having this cohort in place for
both the clinical and the research community. I’m going to give you
a couple of examples. There’s been a desire,
you know, to hear what we are going to do
with this thing. Here are a couple of very
simple examples. We’ve annotated, in all
of the exomes in the CHARGE Consortium, loss-of-function
variants. And so we’re able to look
at either heterozygous or homozygous loss-of-function
variants and then look at how those
phenotypes are related to common diseases. Here are the positions
of mutations in the HAL gene. This is blood
histidine levels. You don’t have to
be a North Carolina statistician to see that
people who have a mutation in the HAL gene have
extremely high histidine levels. Histidine is a
powerful antioxidant. And those individuals
then are protected from small vessel disease. This model has been used
in both research and in translation: in collaboration with
Helen Hobbs and Jonathan Cohen, PCSK9, which is something of
the poster child; in collaboration with
Sekar Kathiresan, also
APOC3 and NPC1L1; and then more recently
the product for Lp-PLA2. I also think it’s
important for us to realize that we need
to engage the individual labs. In the early days of the
genome project there was a lot of fear about how
this very large mega project could absorb
resources away from the individual labs. And what we see today
is they really provide resources for these
individual or smaller laboratories. So here’s an example, in collaboration with Jim
Lupski, of really trying to identify all the genic
models that span rare Mendelian disease and
common diseases. Having a case series of
individuals with Charcot-Marie-Tooth
syndrome, we can identify in a
subset of the individuals those that have point mutations,
but then there are the remaining individuals for whom we
don’t have a molecular diagnosis. We then compare those
individuals with, in this case, the individuals in
the CHARGE Consortium. And you can see there’s
an accumulation of variation in
the CMT genes. So by having a very large
national cohort, it’s an outstanding comparison
opportunity for many, many other studies. So again I think the two
major messages that I’m trying to get across
are these. First, I think it’s
important to think about
a national unified cohort or U.S. research cohort, not just simply a
stitching together of existing cohorts. And second, if we can
pull this off, I think we’re all going to
benefit from this virtuous cycle of really
discovery, translation and further
data generation. So thank you very much. [applause]

>>Female Speaker:
Okay, in the gray.

>>Male Speaker:
Tom in [unintelligible]?

>>Thomas Insel:
Tom Insel from NIMH. At the beginning, Eric,
you said that this was a way of getting at gene-environment
issues. So in CHARGE, what
are the environmental measures and how do
you address that?

>>Eric Boerwinkle:
So
we have several gene-environment
working groups: alcohol consumption,
diet, exercise. Many of the studies
have particulates, air particulates, measured. And so we have many,
many opportunities. That’s just actually
a small subset.

>>Louis Staudt:
Hi,
Lou Staudt, NCI. One thing that hasn’t been
talked about too much is what I would consider
sort of the special sauce of a prospective cohort:
the ability to get biological samples
over time from the same individual. So you would be able
to see, you know, the phenotype before a
disease occurred. That is especially important for
circulating tumor DNA in the case of cancer. It could be important
in metabolomics, or for immune signals of
incipient autoimmune disease, and we haven’t
mentioned that. I think it could dovetail
nicely with the mobile monitoring, such that you
get some signal, just like you would in a
doctor’s office, that something’s wrong. And that could be the
trigger to go get a blood specimen.

>>Eric Boerwinkle:
I
couldn’t agree with you more.
So all of these studies
do have longitudinal collection,
so we have longitudinal biomarkers that can
be used both as outcomes and exposures, as
you mentioned. I’m a big fan of using
metabolomic data for both gene discovery
and biomarkers. I also think it’s going
to be interesting, and I don’t know
what the V.A. is doing, you know, if we’re trying to
link this with existing healthcare, to ask what we can do
as a group to think about using the doctor’s office
as a place where blood is drawn and how that could
ever be moved into a biorepository. The logistics of that
would be extremely complicated in the
general practice setting.

>>Female Speaker:
Okay, we have one last clarifying comment, but
just before we take that, if I could have Dave and
Eric and Mike go ahead and sit up at
the front table. We will have a panel
discussion now. And then, Rod, go ahead.

>>Roderic Pettigrew:
Roderic Pettigrew, NIBIB, and I wanted to respond
to Tom Insel’s question about environmental
exposures and what could be assayed. Tom, virtually anything
for which there exists a ligand, be it a chemical,
bacterium, virus particle, or toxicant,
can be assayed. The platforms to do that,
the technology to do that, exist.
