Precision Health Seminar Series, January 2020


– Welcome to our Precision
Health Seminar Series. The first of the decade here. I’m Scott Roberts, Faculty member in the School of Public Health and part of the Precision Health
Education and Training Workgroup. And before we jump into
our fantastic speaker, I just wanted to say a few words about our Precision Health
Program for those of you who don’t know. It’s really a three-part
mission for the program. The first is to try to
build infrastructure to enable interdisciplinary
research with tools and resources that are
available to researchers across campus, including an
exciting new Analytics platform. So, that’s one element. Two is encouraging research with funding and educational events. In fact, we have a grants program that’s already disbursed
over six million dollars to researchers here at the University. And then finally, the big goal is really to implement findings from
Precision Health Research into real life Health Care settings. So, another element to
the program that’s new is a new certificate
program for those of you who are graduate students, you might be interested in learning more. You can sign up at the
tables outside to learn more about our new certificate program. And also, if you’re not a
member, you can sign up outside as well and get on our listservs for future events, like today. Also, just a little bit of housekeeping, next month our seminar speaker
will be on February 11th, it’ll be Jennifer Roberts, no relation, from the University of Maryland. She’ll be discussing
Public Health Outcomes and the Effects of the Built Environment. So, we hope you’ll come back for that. So, today we’re really excited
to welcome Dr. Amy McGuire from Baylor College of Medicine, and she’s gonna be
speaking, as you see here, about Genetic Privacy and
Investigative Genetic Genealogy, which has become a really hot topic, and I think illustrates this
tension that we’re seeing, even beyond genetics between data privacy and the common good in public safety. So, those kinds of tensions,
if you hear about Apple, you know, trying to unlock their phones. I feel like that brought our attention. It’s also illustrated in genetics work. And so, she’s gonna talk to you today about how genetic databases
are becoming of interest to law enforcement to
help solve cold cases and other criminal justice issues. But that, of course, raises some questions about civil liberties as well. So, just a little bit about Amy, she’s the Leon Jaworski
Professor of Biomedical Ethics and she directs the
Center for Medical Ethics and Health Policy at
Baylor College of Medicine. And perhaps, not surprisingly, she researches Ethical
and Policy issues relating to emerging technologies with an emphasis on Genome Research in Precision Medicine. I could be here all day reciting her CV but just a few highlights,
she’s published over 200 peer-reviewed articles,
including some high-profile pieces in Science, Genome, and the New
England Journal of Medicine, among others, and has led
numerous NIH funded projects. And she’s also been an advisor
to some major initiatives in the field, including the eMERGE Consortium, which some of you may know
about, as well as serving on the National Advisory Council
for Human Genome Research. So, without further ado,
Amy, thanks for joining us and the stage is yours. – Thanks, Scott. (audience applauding) Thank you guys for coming
out this afternoon. Can you hear me? (microphone rustling)
Is that okay? It’s great to be back here, my
sister was an Undergrad here and my brother was in
Business School here. So, I haven’t been in many
years but it’s really nice to be back on campus
and to be with you all. So, from 1974 to 1986 there was one individual
in the State of California who was responsible for more
than 13 murders, 50 rapes, and hundreds of armed robberies. Now, they knew that this
was all one individual because he left his DNA at several of the crime scenes and he was
engaging in this crime spree for more than a decade. Ravaging the entire State of California, people were terrified
to be in their homes. And he was a particularly
violent criminal. So, he had a very specific
M.O. that he used across all of his crimes, where
he would typically show up in an individuals home
while they were sleeping, he’d go into the bedroom
and often there was the husband and wife in
bed, and he’d wake them up by shining a very, very bright flashlight from above his head in their
eyes, and he would direct them to get on their hands and
knees and he would tie them up. He’d throw shoe laces at them and say, first he’d have the
wife tie the husband up, and then he’d tie the wife up. And then he’d have the husband
get on his hands and knees and he would stack plates
from the kitchen on his back. So, if he moved, the plates
would fall and they would break and he would know. And he’d take the wife into
another room in the house and he would brutally rape her. And then he would leave her and
he would go into the kitchen and he’d have a beer or make
a sandwich, watch some TV, hang out for about an hour,
go back and rape her again. And then in some cases, he
would go back and kill both, the husband and the wife, and
other cases he left the home without killing them. And police spent this decade
looking for this individual and trying to identify who he might be. He became known, sort of nationally, as the Golden State Killer, because all of his crimes
were committed in California. And there were several
people who had seen him because they were, you know,
remained alive after the crime and so they had many
sketches of this individual of what he might look like. And this was one of the
sketches that came out of what this individual looked like. So, he was a Caucasian
male, in his 20s or 30s, and they could not figure
out who this guy was. And I mentioned that he
left his DNA at several of the crime scenes, so they had the DNA from multiple crime scenes. And they tried to identify
him based on his DNA. So, how do we typically identify criminals based on their DNA? We use a National Forensic Database, but this National Forensic
Database is populated with DNA specimens from
individuals who have been convicted of crimes, or in some states, individuals who have
been accused of crimes. So, there’s a couple of
problems with this DNA database. One is that, if you haven’t been convicted or accused of a crime then
you’re not in the DNA database. And also, the DNA database, CODIS, which is the National Forensic Database, reflects, sort of, the
racial biases that exist in our current criminal justice system. So, it may be very difficult
to find Caucasian males in the database. So, he seemed
to stop committing these crimes after 1986, and
nobody really understood why. But they really seemed to stop, and they continued to
search for who the suspect in these crimes might be,
for decades, with no success. And in 2017, one of the FBI agents who
had been working on the case had this idea. He said, we’ve got all
this DNA from this guy, he’s not in our forensic databases, I wonder if we could find
him in other DNA databases that are out there. Right, so by now, there’s
a bunch of DNA databases that are out there and there
are several DNA databases that collect genetic genealogy
information from individuals and the purpose of them is
to do familial matching. So, they said, what if we
could upload this guy’s DNA to one of those databases
and try to find one of his relatives and then
find him through his relative. They were able to successfully do that, and on April 24th of 2018, they apprehended this gentleman
who is Joseph DeAngelo as the suspect in the
Golden State Killer case. So, this made huge news
and raised a whole bunch of questions about how law
enforcement is gonna be able to use our DNA when it’s not
in forensic DNA databases to solve cases like this. Since April 2018, several
hundred other cold cases have been solved using this technique. So, I just want to talk for a minute about what they actually did in this case in order to solve this crime. So, I don’t know if
all of you can see this but I delineated here in green
what they would normally do in any case, and in yellow what they did, very specifically related to investigative genetic genealogy. So, they got the crime
scene DNA from the suspect, they then exhausted all of
their investigative leads. So, I mentioned, that
they put out tip lines, thousands of people called in,
they had all these sketches that were out on the
news and in newspapers, and people were trying to, you know, identify people who
look like the sketches, they followed, you know, eye witnesses, they did everything that
you would normally do in any investigative case. Then what they did for
investigative genetic genealogy, is they took the crime scene DNA, and they created a SNP profile
from the crime scene DNA, and they uploaded it to a
genetic genealogy database called GEDmatch. Now, GEDmatch is a
database where you can take your DNA data that you
generate from other companies, like say you get 23andMe testing or testing through ... You can
upload your SNP profile from those databases into GEDmatch and GEDmatch will then
link you to other people in GEDmatch who you might be related to. This is a huge industry now. Like, there are millions
and millions of people who are engaging in genetic genealogy. I think the last numbers
were that GEDmatch has about one and a half million consumers in their database, Ancestry
has about 15 million people in their database. So, this is a huge industry. So, they uploaded this fake profile. They said, hi, I’m John
Doe, here’s my profile, link me to people who
I might be related to. And then through the
website they got linked to several people that
John Doe shared significant enough amounts of DNA
with that they suggested a genetic relationship, and they identified about a third cousin. Then the hard work comes
in because they take that third cousin and they say, okay, who is this person
related to, and they start to build out a family
tree, and they build out that family tree using all the public data that’s out there, which is
actually tremendous amounts of data on all of us. Right, so they say, okay,
how can we build up this tree to figure out their common ancestor, and then build down the tree to figure out who all of the third cousins might be, and then let’s look at each
of those third cousins, who was in the right
place at the right time, has the right demographic,
and might be our suspect? And that’s how they try
to identify who the person of interest might be. In this particular case,
they narrowed it down to Joseph DeAngelo, he was
a retired Police Officer, living in California, at the right time. He was married and had
several grown children. They followed him around
once they figured out that he might be their person of interest. They waited for him to discard something with his DNA on it. In this case he threw out a used tissue and they collected that tissue. They took the DNA from the tissue and they matched it to the crime scene DNA from the Golden State Killer case file, and it was a perfect match. Okay, so does everybody
understand how it works? Okay. So, that’s investigative
genetic genealogy. So, what does this mean? Was it totally random and
lucky that they were able to find Mr. DeAngelo using
investigative genetic genealogy? Well, we have some
colleagues who looked at, let’s look at how many people are in these genetic genealogy databases and let’s infer how easy it
would be to identify a relative, that’s a third cousin or closer, in terms of your genetic relationship and see what that means. And they inferred data from
the MyHeritage database, which has about a million people in it. And what they found out was
that based on the number of people in these databases right now, about 60% of Americans,
of Caucasian descent, so of European descent,
could be, you could identify their third cousin or closer. Okay, and they suspected
in the next five years, the way that the industry is growing, 99% of white Americans will
be able to be identified through a relative match using genetic genealogy. Now, I say a Caucasian individual, because most of these databases, quite opposite to the forensic
DNA databases that we have, are populated by individuals
of European descent. And so, the likelihood
of identifying somebody of European descent is
much, much higher using these genetic genealogy databases. So, in May of 2018, right after the Golden State Killer
case, kind of, hit the media and everybody was super
excited about this, we published a blog about this and we asked at the end of our
blog, just very informally, how many people would be okay
with law enforcement accessing and using their DNA data
to identify criminals in cold cases. So, let me just get a sense of the room. How many of you would feel
comfortable having your DNA used in that way to identify
suspects in crimes. Okay. So, about half of you. Tell me if you’re not comfortable with it? Just to get a sense of. Okay, so it’s not that none of you don’t have an opinion, good, okay. So about half of you
are comfortable with it. In our little informal
survey we found more people were comfortable with it,
about 83% of the 278 people who answered our poll said,
yeah, I’d be fine with that. It seemed like there was some, sort of, acceptance of this as a way to protect public safety.
same time, a lot of articles that were coming out that
raised concerns about this and the concerns really
seemed to focus on issues of privacy. And so, there were articles like this one, where they were showing
that individuals were, they were using this
technology to identify suspects in less serious crimes,
like assault crimes, and there were concerns
about the slippery slope. Like, is the government going
to be accessing DNA databases and using it to solve all kinds of things that we might not think they
should be using DNA for. And so, there were a lot
of concerns raised by this. So to, kind of, understand
and situate the issue from a societal perspective, I just kind of want to
identify two trends that seem to be going on at the same time here. So, on the one hand, we
have a significant increase in the amount of data that’s
being generated today. And this includes genetic
data that’s being generated, both through health care and research, as well as through self discovery, like direct-to-consumer genetic
testing, genetic genealogy, and other aspects like that. We also have a lot of other types of data that are being generated on all of us, that when linked to our
genetic information, tells you a lot of information. Right, I mean, you would be,
I’ve been working a little bit with the FBI now, based on this project, you’d be shocked at what’s
out there on all of us in the public domain that
we don’t even think about that’s being shared on a daily basis. So, we have this sort of
massive amount of both, genetic and non-genetic information, being shared about all of
us in a very public way, increasingly, and at the same time, we’re all kinda concerned
about our privacy, and we’re still concerned
about our privacy. And I’ve been working on privacy
for many, many, many years and sometimes when I talk
about privacy people say, they’re just getting old, like
people these days don’t care that much about their privacy,
look at the young people out there they’re posting
everything on social media, you know, they’re taking pictures on Snapchat 50 times a day,
like, they just don’t care about their privacy. So, this was actually
something that we got a little bit interested in as,
kind of, a tangential issue, which is, is that true? Like, are we worrying less
and less about our privacy with the new generation
of people coming up and sort of younger millennials? So, we did a survey using Mechanical Turk, which is an online platform
in 2016 of 1310 individuals and we analyzed the results
based on generation, just to see if there are
generational differences in how people felt about their privacy. We asked individuals
in the survey whether, how concerned they were about the privacy of their health information,
and how concerned they were about the privacy
of their online information. And you can see here that
overall 68% of individuals were concerned or very
concerned about the privacy of their health information
and 69% were concerned or very concerned about the privacy of their online information. And none of this differed based
on their age or generation, except for when we split
out our millennials into older millennials
and younger millennials, the split point seemed
to be at 28 years of age, and those who were under 28
were a little bit less concerned about the privacy of
their health information. My hypothesis is that they
haven’t really encountered the health system that much
by that age, many of them are on their parents’
health insurance policies, so they may not know that
they ought to be concerned, (audience laughing)
or they just may not have experience, but
that’s just a hypothesis. What was really interesting here is, we also asked a bunch of
questions about people’s use of social media and how
often they were online. And there was absolutely no relationship between people’s use of social media, what they posted on social
media, and how often they were online with their
concerns about privacy. So what does this tell us? This tells us that
people are posting tons, and tons, and tons of private information and they’re super
concerned about the privacy of the information that they’re posting. So, the conclusion that I have is that it’s really, as in everything that we do, it’s all about trade-offs. Right? You have to make this sort of internal, whether it’s conscious or
unconscious calculation, at any point in time, about
whether the benefits to you or to society or to others, you know, outweigh the potential
risks that you’re taking. So, when I shop online
and I enter my, you know, credit card number, I’m acutely aware, maybe sometimes it’s subconsciously aware, that I’m giving out private
information about myself and it could be used against me. My identity could be stolen,
my credit card number could be stolen, and it’s
really convenient for me to shop online and not
have to get out of my bed to go to Target, right? And so, I make that trade
off and I decide it’s worth the risk that I’m taking to my privacy to be able to use Amazon
instead of getting out of bed and going to the store, right? This is the same kind of risk
calculation that individuals are making when they’re sharing
their genetic information and other types of information as well. So, we’ve done a bunch of
studies about these types of trade-offs in the
medical and research space. We know that individuals
are really willing to take privacy risks
when the potential benefit is to their own health. If they are doing something
that they think will improve their own health, they’re
willing to risk their privacy, particularly if they’re sick, right? They’re also, most people are willing to take some privacy
risks to advance research. We found that people who have an illness, even if they know that
the research won’t benefit them personally, if it
will benefit other people who might be going through
the same thing as them in the future, so future
patients, they’re willing to risk some of their
privacy and share their data in order to provide that
benefit to future patients. The big question is whether
people are willing to take the same types of risks,
their informational risks, to their privacy when the
benefit is more to society as a whole, in terms of the public safety and
being able to, you know, follow through on criminal justice. So, to solve crimes and
get violent criminals off the streets. So, we were very interested
in this question. We haven’t actually done
a trade-off study yet between privacy and this
particular consideration, but we did do a very
brief, a very, kinda sorta, a quick and dirty survey
using MTurk again, shortly after the Golden
State Killer case came out, 1587 individuals, it was an online survey, this is really, really busy
but what you really need to focus on here are two questions. One, we asked, should
law enforcement be able to search genealogical websites
that match DNA to relatives? And what’s important
here, that you see is, 91% of people in our survey said, yes, they should be able to do it
if it’s for a violent crime. 89%, the gray line, said, yes,
they should be able to do it if it’s for a crime against children. The yellow line, which is 91%
said, yes they should be able to do it if it’s to
identify a missing person. But only 46% said, yes they
should be able to use this for non-violent crime. So, there seems to be a
distinction that people are making that they don’t want to go
down that slippery slope of using genetic genealogy
databases to identify individuals who might be a suspect
in a non-violent crime. We also asked a question
about whether law enforcement should be able to search
cell phone records or to identify location
information from cell phones, using, without a warrant,
in order to solve a crime. And we asked this because
this particular question was pending before the Supreme
Court when we did the survey, and we were very curious
sort of how that would relate to what they said about the
use of genetic databases. So, very similar findings,
85, 82, and 78% said yeah, you should be able to
search cell phone records and cell tower records for violent crimes, crimes against children,
and missing persons, but we’re not so sure,
only 50% of us think you should be able to do
it for non-violent crimes. So, a couple of months later,
the Supreme Court issued their opinion in this particular case, which is Carpenter
versus the United States. And what this case was asking, was whether law enforcement
could access cell tower record, location record, without a warrant in order to identify
individuals in a crime. So, imagine this situation, right? You are on the corner of
West Street and 3rd Street, which I don’t even know
if that exists anywhere but, okay, you’re on the
corner of West Street and 3rd Street, and there is an ATM machine and one of those, you know, armored trucks pulls up
to get the money out, somebody pulls up behind
him, gets out of the car, robs him at gunpoint, steals
all the money, and gets away. Right, there’s a cell tower record, a cell tower close to that
area and it records all of the cell phones that
were in that neighborhood 30 minutes prior to the
crime being committed and 10 minutes after the
crime being committed, and the police want to
know who was in that area at that corner during
that time period, right? So, if they can get that information, they’re kind of going
on a fishing expedition. They may get 5,000 phone records, right? And say, okay, there were
5,000 people in that area. Well, let’s say now, three
months later, across town, the same crime gets committed,
and they want to get the records from that cell tower, and then they want to cross examine, who was in both places
at the time of the crime and they might narrow it
down to five or 10 people, who then they can go on and
use as investigative leads and try to identify who’s
the criminal, right? So, that’s what they were asking to do, and this was challenged,
saying that’s a violation of the Fourth Amendment, which
protects against unreasonable search and seizure without probable cause and a warrant, right, you
have to have probable cause that you’re looking
for a particular person and get a warrant to search records. And the argument that was
made by law enforcement was,
well, this isn’t unreasonable
search and seizure, we don’t need a warrant because there’s a well recognized exception
to the Fourth Amendment, which says that if you voluntarily share your private information
with a third-party, it’s no longer considered private. And they were saying that
all of us voluntarily share our location information
with our cell provider, with AT&T, or Verizon, or whoever. When we sign our contract
and get a cell phone, we’re voluntarily sharing
our location information at all times with our cell provider. And that was being challenged in court. What the Supreme Court found was that, no, no, that doesn’t
qualify, we’re not gonna say that you’re voluntarily sharing
your location information when you get a cell phone,
because that’s not really what we meant, but they said, this finding is super narrow, and this doesn’t mean that
we’re at all chipping away, in any other sphere, at the
third-party doctrine. So, that still holds. So, there’s a big open question
now of whether DNA data is similar to cell tower location data, and whether you are voluntarily
sharing that information with a genetic genealogy database, which presumably, if
you’re uploading your data to that database it’s
a more active sharing, whether that would invoke
the third-party doctrine, and that law enforcement could
then access that information to do their search without a warrant. Make sense? Okay, perfect. All right, so how can
we protect individuals? Some people have said, law
enforcement should not be able to do this, they should not be able to access third-party,
they should not be able to access genetic genealogy
websites to solve crimes. One state has tried to
legislate against it. In January of 2019, there
was a bill that was introduced in Maryland. Maryland and DC are the only two jurisdictions that don’t allow law enforcement to use the national forensic
DNA database, CODIS, to do familial matches. So, within CODIS they have
a limited amount of DNA in those databases, you can’t
do extensive familial matching but you can identify
first degree relatives. So, parent, child, sibling. And in most states,
you’re allowed to do that when you’re investigating a crime to identify who the relative might be, even if you don’t identify
your exact suspect. But in Maryland and DC you can’t do that. And Maryland tried to introduce
a bill that would say, in this state, law
enforcement is not allowed to do familial matching using
genetic genealogy databases. That bill did not pass,
but it is indicative of what some states might do to try to legislate against this. So, one thing I want to,
kind of, pause for a second and ask is, when we think about privacy and genetic genealogy
databases and the use of them for investigation, whose
privacy are we worried about? So, like, let me just ask you guys. Whose privacy are we worried about? – [Man] Our own. – Hmm? – [Man] Our own. – Your own, okay but, let’s think about the Joseph DeAngelo case, right? Whose privacy would we
be concerned about in his case? Huh? – [Woman] His children? – So, maybe the relatives, his children, people that might be matched, right? Anybody else’s privacy? – [Man] A false positive. – A false positive. Really good, so we might be worried that we identify somebody,
and we go down that lead and it’s not the right person. Right, yeah. So, there’s several different players, and I kind of want to walk us
through each person’s privacy, and I’ll get to the, we
can talk a little bit more about the false lead as well. So there’s the customer,
right, this is Sally, she’s the one who’s like super interested in where did my family
come from and I’m gonna do genetic genealogy, and
I’m gonna upload my DNA to GEDmatch, which is the
database that was used for DeAngelo’s case, right? So, are we concerned
about Sally’s privacy? Little bit? Okay, so we just talked about
the third-party doctrine, how many of you think she
has voluntarily shared her information with a
third-party in such a way that it should be able
to be accessible for use by law enforcement? Okay, so some of you would say, no, she doesn’t have a
legitimate privacy interest in her DNA data once it’s
been uploaded to GEDmatch. How is her privacy actually protected? Typically it’s through the
Terms of Service of the company. So, it’s a contractual agreement. So, whatever the company
says they’re gonna do to protect your privacy,
they’re supposed to do. And if they don’t do that,
then there’s the potential for them to get a slap on
the wrist by the FTC, the Federal Trade Commission, and they could potentially
get fined as well. But there aren’t really
strong protections in place if they violate their
own Terms of Service. All right, the next
person we might be worried about their privacy is DeAngelo, right? But I think most people
would probably agree and we generally agree in society, that if you commit a crime
and you leave your DNA at the crime scene, then
you forfeit any rights that you have to the privacy of that DNA. And so, we might not be
as concerned about him. So what you guys pointed out
is what about all the people in between them. Right, so we identify Sally,
who’s the third cousin of DeAngelo, but then we got
to build the tree, right? So, what about all those
people on the tree, are we worried about the
privacy of all of those people on the tree? Right? So, I think one way to think about this is to actually get a
really clear understanding of what is it that the police are seeing, and what is it that they’re doing. Because I think there’s a lot
of misconceptions around this. So, the first thing is, a lot of people say, well,
I don’t want them looking at the DNA of all of
those people on the tree. Well, they don’t, at all. What they do, what they see,
is what any customer sees, which is basically this, it
says you have a match, right, you have three matches. You have a match with match
number one on chromosome three, you guys share 17.6 and
132.1 centimorgans of DNA. 17.6 means almost nothing,
all of us share centimorgans of DNA because we’re all
somewhat related way, way, way, way back when, so they
would probably throw that out, 132, they might say okay
that seems a little bit like, it might could be a
relative, and they say, what does that mean in
terms of the degree relative that it would be, and should
we pursue this or not? You know, the rest of
them, matches two and three, have very low numbers of
centimorgans that are shared, and so they may not
pursue those leads at all. And then they say, okay, let’s
hone in on match number one and let’s see if there’s, you know, who match number one matches to, and on what chromosome,
and how many centimorgans they share, and can we
somehow build a family tree out of this. Okay, so that’s all they’re
seeing on these customers. Okay? They don’t get names
and they don’t get DNA. They do get names in the
sense that they look, this is also very confusing,
but they get the username that you share, when you upload your data. Now, what’s confusing about
that is some people upload data for their multiple family members or they may have a community
organizer who uploads all the data for the community and uses their own email address and
name to upload those data. So, you have to kind of figure out like, we’ve gone through several cases together with law enforcement, and they said, well, we found this person, they
had this email address, but we’re pretty sure that’s
his mother and not him, and so we can deduce that this is him. And it’s really like,
tricky investigative work. The second question is, the
second objection that people have is, oh my gosh,
they’re gonna be testing the DNA, they’re gonna be
following all these leads, and following people around,
and collecting their DNA, and testing the DNA of tons,
and tons, and tons of people, and that’s violating
their rights to privacy. Right? So, this is, I’m trying
to get real data on this. This is anecdotal confidential
data from a couple of cases. But it gives you a general sense. So, when you do investigative
genetic genealogy, right, these were for cases where they did investigative genetic genealogy. What do you think was the
maximum number of people that they had to collect
DNA from in order to solve any one of the crimes? Anybody want to guess? – [Woman] Six. – Six. – [Man] 45. – 45. Huh? – One.
– One. Okay. The maximum number was 10, using investigative
genetic genealogy where they actually collected
DNA and did testing, oftentimes it’s just one,
or it might be one or two. When they do regular
investigations, right, on the same exact cases, before they used
investigative genetic genealogy, what was the maximum number
of people they had to test their DNA, prior to using the investigative genetic genealogy? Now, none of these
actually solved the crime. In this case, the maximum
10, the crime was solved. Okay, so what was the
maximum that they use prior to using investigative genetic genealogy, what’s the maximum number of people that they tested their DNA? – [Man] 44. – 44? (laughing) You’re the one who said 45,
okay, we’re hedging our bets. What else, anything else? Huh? A few hundred. More than 600. So, what’s really surprising here, is that I don’t think people realize what law enforcement does
to try to solve crimes and how many people they investigate, and you would never know
if you’re investigated if they do it well and
if they do it right. And the same thing with
investigative genetic genealogy. If they go down the wrong
path and they’re following best practices and they do it right, that person should never be contacted, and should never know that
they’re even the subject of investigation, unless
they get stuck and need help from somebody to help build out the tree, and then they might
ask them for permission to collect their DNA and
help build out the tree. So, the problem is
that there are many, many, many law enforcement agencies
and not everybody does this well or does this right. So, what are the current
companies doing with regard to investigative genetic genealogy? So, this, kind of, shook up the direct-to-consumer
genetic testing industry. And when the Golden State
Killer case came out, everybody was like, oh
my god, what do we do? Like, should law enforcement
be able to access this, we are stewards of our,
you know, customers’ data, we didn’t warn them that this
could be used in this way, should we try to prevent it,
how do we try to prevent it, should we work with law enforcement, we want to help protect the public safety, these are horrible
people who have committed really violent crimes in most cases, and we want to bring them to justice, and we want to protect others. So, they have taken very, very
different approaches to this. 23andMe and Ancestry, which are direct-to-consumer
genetic testing companies, have both said we will not
work with law enforcement, we will not let them access
your DNA data without a valid warrant. And so, they have said, no,
to law enforcement access to their customers data. FamilyTreeDNA, which is housed in Houston, decided, yes we want to
work with law enforcement, we think it’s a good thing. So they changed their Terms
of Service and they allow now an opt out for their customers
to say, I don’t want you to include my DNA in the
database that law enforcement can access, if they choose to opt out. I think for their European customers because of some of the laws in Europe now, they have to opt in,
but for U.S. customers, they can opt out. They’ve had, I don’t want to quote this, because it’s not exactly right, but somewhere between 10 and 20%, I think of their customers,
have opted out of this but a fairly low percentage of them. They also work really
closely with law enforcement, they have to register as law enforcement, declare themselves to be law enforcement, they go into a particular
part of the database, and they’re very transparent
about how this works. And they’ve also partnered
with individuals, like, you guys remember Elizabeth Smart. Who was, I think she was kidnapped, and I think she was murdered, right? No, she was found! She was found, yes, that’s right. And her father is a huge
advocate for solving crimes and he’s actually gone out and done like, public service
announcements where he said, please send your DNA to
FamilyTreeDNA in order to help solve crime. So, this might be a
motivating factor for people who aren’t really interested
in their genetic genealogy, but they want to help
solve crimes in this way. GEDmatch, which was the
other company that was used, it was fully open. They had an opt out, they
changed their Terms of Service to an opt out, they then violated
their own Terms of Service in some way because they said, we would only allow law
enforcement to access these data for violent crimes, then they
allowed them to access it for a non-violent crime,
there was a huge outcry, and they ended up switching
to an opt in system, which was very, very
detrimental to law enforcement because they had 1.4 million
people in their database, once they switched to
opt in, it’s growing, and growing, and growing but
the last number, I heard, was roughly around 200,000
people had opted in, which makes it really
difficult for law enforcement to identify people, when
you only have that pool of individuals, and it creates, it makes it more work, they
have to investigate more people, and they have to look to other databases and resources as well. So, one of the big
issues that has come up, is that law enforcement
has been doing this without a warrant, under this idea that the third-party doctrine applies, and that they can go in
and search these databases that are available. This has been tested recently, and there was an investigator
in, a prosecutor in Florida, who said, I am going to
try to get a warrant to see if I can access all of the individuals in the GEDmatch database, all 1.4 million, even those who have not opted
into law enforcement access. And he was able to
successfully get a warrant to do that search, and was able to search GEDmatch’s entire database. So, that warrant has not yet
been challenged in court. I think a lot of people
are very interested to see how specific that warrant was and whether, I would presume that the argument
is, for getting a warrant, is that based on the research, you can currently identify
60% of familial matches for Caucasian male or
for Caucasian individuals in the United States, right? We soon think we can identify 99%. Is that mathematical probability
sufficient probable cause if you know that you are
looking for a Caucasian suspect, to be able to access these
databases and do this search? I think that’s a very open and
controversial legal question. In the meantime, what we’ve
done to try to self regulate in this space is that the
Department of Justice has come out with an interim policy, basically trying to set best practices of
what you can and can’t do. This doesn’t apply to all
law enforcement agencies, but what it does is it
sets out some parameters. It says that law enforcement
should only be using this technique for violent
crimes or to identify missing persons, and that it should only
be used as a technique of last resort. So, you
have to go through all of your other investigative work before you can try to do
investigative genetic genealogy. So, some people think
that’s a really good thing because they want to
put some limits on the use of this, others say, not so good. Because what if you have an
active kidnapping, for example, you want to act fast to try to
find out who that person is. And if this is the best way to do it, maybe we should be using
this as a technique of first resort. They also limit how you
can use the information that you’re getting from this. So, they said, even if you get DNA data from your search, you
can’t use it to go back and try to identify
genetic predispositions to certain diseases or anything like that, and you have to get rid of the information as soon as the case is closed. They have limitations on
how you can use the data, and they have specific
recommendations around transparency. So, you can’t create a fake profile and pretend you’re not
law enforcement anymore, according to their best
practice standards. You have to actually
say, I’m law enforcement, and this is how I’m gonna
use this information. But there really is no
way to regulate people from creating fake profiles, there isn’t, currently, and that’s a big issue. And that raises tremendous
security issues, not just for the law enforcement
agent who’s gonna go out there and create a fake
profile, and dupe the public, or the consumers, but for
other individuals as well. So, there have been several articles that have been published that
talk about the security risks of these DNA databases. I think what you have
to understand is like, let’s look at GEDmatch for
example, GEDmatch was created by two retired businessmen who
were part of FamilyTreeDNA, and they said, you know,
it’d be really cool if we had other tools to try
to connect more freely with our relatives. So, they set up this database as a hobby. It’s a nonprofit database,
they make no money from it, they got 1.4 million people in it, they have no background in
informatics, or security, or data analytics. And so the database naturally
has a lot of limitations to it, that wasn’t the purpose
for them setting it up. And so these academics
went in and said, okay, what’s the vulnerability
of these databases. And they said, it’s pretty, pretty high, and individuals can
basically fake profiles, and they can do a whole
bunch of things with that. They can misdirect investigations
by planting, sort of, genetic profiles into the database, there can be con artists
who defraud victims by creating fake profiles. And then there’s a real
concern about, sort of, national security and political operatives blackmailing opponents. Think about the implications
of this for, you know, anybody who’s undercover,
or CIA agents, I mean, your DNA is left everywhere
you go and if we can identify, in a couple years, 99% of
individuals of European descent in the United States based
on familial matching using these databases that
creates a lot of risk. So, what have we done to
try to combat this risk? One thing that we’ve
done is GEDmatch said, this is just way too much. They were the main target
of a lot of this criticism and they recently got acquired by Verogen. Verogen is a spin off from Illumina, and they are dedicated
completely to working with law enforcement to develop tools for investigative genetics. And so, they’ve acquired GEDmatch, they promise to keep the
same Terms of Service, you have to opt in, but
there’s a lot of questions about what this does. This is now a for profit
company as opposed to a not for profit company. And there’s big questions
about whether this will become the trend in doing this. The other thing that’s
happened, is a lot of people in the genetic genealogy
community have said, well, we should just have
a forensic DNA database that people opt into, like GEDmatch. And so the first one has come
out, it’s called DNASolves, you can opt in to upload your DNA to this, knowing that it will be used specifically for the purposes of solving crimes. One of the problems with this
is that this database has to become, first of all,
there’s several different groups that are starting to
generate these databases, the more competition you
have, the more spread out people are across databases,
and you really have to have the size of a GEDmatch database
like 1.4 million people, right, for it to be
useful to law enforcement. The other thing is that this is a, some of these are for profit, others are being conceptualized
as not for profit. I think there’s questions
about whether we should be profiting off of
this endeavor to protect the public safety, and how that works. Okay, any questions about that? Before I. I have a couple of other
things I want to talk about. But any burning questions? Yeah. – [Woman] How do those
solving crime websites protect security? Like, you were talking about
when the security (mumbles). – So, okay, great. So, how do these websites
protect security? So, Verogen, for example, is a much bigger for profit company that has the resources to create security protections. So, one of the major security protections would be like having digital
signatures or encrypting data, you know, to try to prevent false uploads. DNASolves would be set up so nobody breaks into the CODIS database. It’s highly, highly secure because we know how to secure data at
the government level. So, you would hope that
the people who are setting up these specific forensic DNA databases are better equipped to set them up with higher security standards in place. Yeah. Okay. Okay, we’ll take one more
question, two more questions, and then I’ll. Go ahead. – [Man] I’m just wondering,
what kind of oversight was that? – Yeah, so the question was, what enforcement mechanisms are there for making sure law
enforcement follows the rules? And the answer is none, right now, and that’s a major, major, major problem. So we’re, this is a, you know, this is a, we’re about a year and a half out from the Golden State Killer, like, opening this giant field up. The first step was to
try to set best standards of what should the rules even be, which is what the Department
of Justice has tried to do and others are still trying to do. But I think we still have a
lot of work to figure out, okay, how can you actually, not only, what are the enforcement,
but how do you actually know when somebody is
not following the rules, because it’s really, there’s
no sort of accountability in terms of knowing when
somebody is doing something that they shouldn’t be doing. And so, many of these cases, in fact, one of the limitations in
the Department of Justice is, this is, this generates
only an investigative lead, you’re never arresting somebody just based on investigative genetic genealogy, you’re using it to identify
where your leads are, and then you’re following those leads. And so this usually
doesn’t get even admitted into a court of law during
the prosecution, right, ’cause it’s irrelevant. You just found what lead you’re following and then how you actually identify that person was your suspect,
is what gets submitted into the court of law. And so we don’t even know
sometimes when cases are using this technique in order to identify leads. Erin, did you have a question? – [Erin] When a company
acquires another company, do the privacy practices transfer over or do they start from scratch? – Yeah, so really interesting question. What happens when a company
acquires another company, do the privacy rules transfer or not? They should, in some cases
transfer, but not always. And that’s an FTC question
of whether you still have to uphold the privacy of your consumers. What Verogen did, is they basically, every consumer of GEDmatch
had to go in and then re-agree to have their data go into Verogen. So, not always, but that’s
a very open question. Yeah, okay. So, one thing I just want
to address real quickly before we finish up is, there’s a big question I
think that people worry about in, maybe this room, and
in the rooms that I sit in, of, okay, this is fine law
enforcement is accessing these genetic genealogy databases. This is where people have
voluntarily shared information, the purpose is to link to relatives that they’re genetically related to, that makes a whole lot of sense. But Gosh, are they
gonna start coming after our research databases and
our clinical databases? And first, I want to just kind
of put you at ease and say, I have not heard anything from
any law enforcement agencies that they would find that useful or that they would pursue
that in the coming, you know, years, but you never know. So, what protections are out there? Well, with regards to health information, the way that we typically
protect health information, the strongest protection
we have is through the Health Insurance Portability
and Accountability Act, or HIPAA, but I do want to point out that there are important
exceptions to HIPAA for law enforcement access. So, HIPAA covered entities
can disclose protected health information to law enforcement
without any consent or authorization from
the individual to prevent or lessen a serious and
imminent threat to the health and safety of the
individual or the public, which seems kinda like this, right? To comply with a court order,
or warrant, or subpoena, which they started, they
did it in Florida, right? And then for the purposes
of identifying or locating a suspect, fugitive, material
witness, or missing person, but then they say, it has
to be limited to demographic and health information about that person. There’s other exceptions
as well that might apply but these are the main ones. So, that might not help
us very much with regard to health information. There are also some state laws that provide additional protections for confidentiality and
privacy of health information, so we’d have to look state
to state to see if any of those would get us out of this bind. The thing that I think can
reassure us a little bit, is that law enforcement
would have to do a whole lot more work if they access genetic databases that were held as part
of your health records, because it’s not, we don’t
typically do familial matching within the health system
and that’s what they want. They want to know who
the family matches are. And so they’d have to
actually build out that system and do their own familial matching, which seems like quite a bit of work. Interestingly, research data is probably the most well protected
because we have certificates of confidentiality and under
the 21st Century Cures Act, we actually have heightened
protection under certificates of confidentiality for research data, where you automatically get a
certificate of confidentiality for NIH funded research data,
and you can apply for it for other types of data,
and it actually allows you to refuse to give over research data. In fact, you can’t give
over research data, even with a subpoena, without
the individual’s consent. And so this is probably the
most heightened protection we have against law
enforcement access to data. If you have questions about certificates, there’s a website here
there’s a really great kiosk on the NIH website around the certificates of confidentiality. So, with all of this,
some people have argued, you know what, there’s so
much information out there, it’s really useful to the public safety, maybe it’s time that we have a universal forensic database,
and everybody just gets in it in the United States. This is obviously a hugely
controversial proposal. I don’t actually see it happening from a political standpoint, anytime soon, but it’s certainly something that I think we can think about. It gets rid of the need for
familial matching all together, because if everybody’s in the database you can just do direct
matching for criminals. I think we’re starting to make
some headway in this space in October of this, of this past year, we had a really fascinating
conference at Banbury Center that I helped organize, and it was the first time
we really got the companies, we got law enforcement, all
levels of law enforcement, we got consumers, people in the
genetic genealogy community, all together for three days
to talk about these issues. And I was telling Scott and
others, the other night, that I think the biggest
win in this conference, is by the end of the three
days, everybody was speaking to each other, which
oftentimes for those of us, who do multidisciplinary research, that’s the best thing you can hope for. And we’re still working on
the outputs of that meeting but there were several things
that we really focused on, including, we need to
have quality standards. Everybody’s gotta be
doing, who’s doing this, has to be doing it well
because the problems come when they don’t know what they’re doing and they contact individuals
who might be a sixth cousin, who actually mean nothing,
and then they really feel like their privacy was violated. We need to have better security
and accountability in place. So, going back to the
question in the back, of like, there’s got to be enforcement
mechanisms to make sure that people are not breaking the rules. And we need more transparency
and consent around this. And those are kind of the three
areas that I think we need to develop policy. So I will end there and see if any of you have any questions,
I think we have five minutes if anybody has anything they want to ask. (audience applauding) – Well, thanks, Amy. And I’ve got the mic
here for people who might have questions. There’s a, in the back there. – Thank you, it’s fascinating. So, I’m just thinking extrapolating. So, there’s a lot of inequity
in some of the systems that are involved here, in
health care, law enforcement, government, like all those things. And I think that there
is out in the populations of people that are in those databases and those who are probably
intentionally not in them. So, I guess from your perspective, what would be needed in order to, kind of, ensure equity in this,
like, arena, given that some of the supporting
organizations and structures that are involved are
inequitable right now? – Yep. So, ironically, I actually think that investigative genetic
genealogy helps to offset some of the inequities in the
ways that I talked about, in some way, because it
does address, sort of, the racial biases that exists
in our forensic DNA databases by sort of oversampling for
individuals of European descent, or Caucasian individuals. So, in some ways that helps a little bit. I agree with you 100%. Certainly a universal DNA
database would help, right? And it’s a self perpetuating
problem, like particularly in the criminal justice system, right, so you have more people in the database who are African American, you’re more likely to identify
people who are suspect of crimes and solve the
crimes where there was an African American perpetrator. Then you have more people in the database who are African American
and if we could, kind of, break that cycle it might
help us a little bit with the issue of the
criminal justice system. Although, I am not going to
pretend that that will solve all of our biases within the
criminal justice system by any stretch of the imagination. But it’s a really difficult problem, and I think it’s not
something we can ignore. You’re right, there’s inherent,
you know, inequities in all of these systems. And we want to make sure that
we develop policies that, if they can’t address them,
at least don’t exacerbate ’em. Yeah. Yeah.
– Hi. – Hey. – I was just curious,
given the possibility of people creating fake profiles
with their genetic data, as you discussed, if given
GEDmatch’s new acquisition, if there’s any efforts
regarding like, KYC, Know Your Customer of having to like, upload, you know, your
passport or your, you know, driver’s license. That’s like very common
in financial services for these sorts of– – Yes, I think there’s a
lot of discussion of that among the companies themselves. – I’m not endorsing that,
that sounds pretty creepy but just–
– Well, I think in terms of, like, there’s been a lot
of discussion around, like, digital signatures and
encryption and things like that that yes, you should have to verify that the information you’re
getting is from, you know, where they say it’s from
and that it’s a valid source and that sort of thing about it. – Yeah.
– Thank you. – Mm-hm. Other Questions? – [Scott] Other questions? – So, if newborn
screening is ever expanded to include genetic information, would that end up going into some sort of, universal forensic database? – So, I have not heard
any record, you know, proposals that it does either
include genomic screening or go into a forensic database. So, I mean, I think it’s
very open I, you know, those of you who are familiar
with newborn screening here in Michigan, we have a similar, you know, Michigan and Texas are kind of bounded by our class action lawsuits against our State Newborn Screening
programs for using those data without explicit
consent for research purposes. So, I think we’d have to
overcome all of those hurdles of how does it get used. Again, the value to law enforcement, and I can’t emphasize this enough, is the familial matching piece of it. It’s not the actual DNA itself. Now, of course, you can get
to the familial matching if you have the DNA but most
people in law enforcement don’t have that skill set and
don’t have the, you know, I mean, that’s a lot to do. So, they would have to
create sort of a, you know, shadow database that does
the familial matching because that’s what they’re looking for. So, I don’t know, it’s an open question. I think when people talk
about the slippery slope of data privacy and
law enforcement access, that’s one of the things they worry about. Yep. Yeah, so the question was, you
know, could law enforcement, we have bad actors in law
enforcement, we know that, I mean we’ve gotten more and
more publicity around that. What if you have a bad
actor in law enforcement who uses this to like
plant DNA at a crime scene and then is not the actual
suspect and frame somebody? So, I think that’s a problem,
regardless of whether you use genetic genealogy or
other types of forensic data to solve crimes, and it’s
a huge problem, obviously, and that person, you know,
we should do everything in our power to identify people like that and prosecute them to the
full extent of the law, I would think. I don’t think we can prevent
bad actors, you know, without, I mean, it’s a bigger issue
of screening police officers and those sorts of things. Interestingly, you know,
just like Joseph DeAngelo, several of the suspects who
have been identified using this technology had
been ex-police officers or current police officers. There’s a case in Texas where it was a current police officer
who was pulling women over on the side of the road
and raping them at night, under the auspices
of giving them tickets. So, you know, oftentimes, police
officers might not be found out in other ways because
there’s you know, sort of, the Brotherhood-Sisterhood of the force and they have protections in certain ways and this might be a way to get at them. So, I don’t have a good
answer for that, other than, you know, I think that’s a bigger problem that we need to try to address, yeah. Yeah.
– Well I think we’re at the top of the hour. I think Amy can stay a
little bit longer though, if people have burning questions that want to come up afterwards. But let’s thank her for a fantastic job. (audience applauding) – Can I just, I just
wanna say one more thing, ’cause I almost forgot. So, I will tell you that the first case using this technique to exonerate somebody, which was through the Innocence
Project, was reported, so that’s also one way and
that just is being used. – [Scott] Okay, well thanks and thanks to everybody for coming. And, again, February
11th is our next seminar. (audience laughing and chattering)
