
Verified by Psychology Today

Genetics

DNA Donors Should Be Aware of Privacy Risks

Don’t rely on promises of anonymity in DNA-based research.

Would you make your DNA public for the benefit of science? Even the embarrassing bits?

If you say no, not entirely, you’re in good company: James Watson, co-discoverer of the structure of DNA, released his entire genome to researchers, except for one particular gene. That one he didn’t want anyone to know about, because it might indicate whether he was likely to develop early-onset Alzheimer’s. In fact, he didn’t want to know himself, since there was nothing he could do about it.

But what if no researcher could tell that the genes they were looking at were yours? That would surely be different. That’s the idea behind anonymized, crowdsourced genomic research, which has become a significant trend: if thousands of people with a particular disease—Parkinson’s, for example—all contribute samples of their DNA, then scientists can analyze the pooled data and perhaps isolate some important common information.

But there is always a delicate balance to be drawn between data and privacy. What volunteers may be happy to tell scientific researchers—for instance, that they have early symptoms of a disease that could become debilitating—they may not want to tell their employers. It’s all too easy to imagine a boss (illegally but understandably) wanting to get rid of potential liabilities ahead of time.

The same goes for landlords and schools and bankers. The 2008 Genetic Information Nondiscrimination Act (GINA) protects against abuses by health insurers and employers, but not in housing, education, or mortgage lending, to name but a few areas. That’s why some states, such as California and Massachusetts, are looking at expanding those protections.

Your family might want to have a say, too. Since you share genes with your closest relatives, what you find out about your genes has a direct bearing on the likelihood that your siblings or your children might have the same tendencies. You might want to know if you have a higher risk for breast cancer, say, but they might not. Indeed, some cultures are quite clear that your shared genetic heritage belongs to your extended family as a whole: “It’s not yours to give.” That’s why Louise Erdrich refused to have her DNA tested on one of Henry Louis Gates’ TV programs.

And, of course, the predictions are notoriously vague when applied to individuals. If you have, say, a 10% greater probability of developing some disease for which there is no prevention and possibly no cure, what good does that do you? Not much. But there may be a chance that a large, combined study that looks at DNA from thousands of people might be able to tease out some associations that could be valuable.

Perhaps one allele—that is, one version of a gene—correlates with a good response to a particular medication. Eventually you might want your doctor to know whether you have that allele or another variant. If there were a pill or an injection that would ward off Alzheimer’s for people with one particular allele, then it would be worth knowing if you should take it; Watson would surely agree.

(Right now there is no such treatment, and it’s not even on the horizon; this is just a thought experiment. But even if you got the shot, would you want everyone else to know? What if you had to take it every day? Would that affect your decision about going public?)

Back to the present: For many reasons, anonymity is a cornerstone of crowdsourced genomic research. When you give your DNA for analysis, it’s supposed to be “anonymized,” that is, separated from any possibility of connecting it with your identity in the future. In particular, the databases circulated in the research community do not hold the DNA sequence itself but evidence of how the DNA is expressed as RNA. By focusing on the effects of the genes rather than the genes themselves, it was thought, individual donors could be insulated from the research itself, so that a snoop could not work backwards from the available data to the original person.

That turns out to give a false sense of security. A recently published article in Nature Genetics essentially says that keeping aggregated DNA data anonymous is impossible. The full article is subscription-only, but Science Insider has an accessible report:

Eric Schadt and colleagues...have developed a technique for generating a personal SNP profile, or a DNA "bar code," for an individual based on their gene expression results. This means that, in principle, if someone had a DNA sample from a participant in a study stored in GEO, they could devise a SNP barcode, match it to a GEO [Gene Expression Omnibus database] sample, and look at that participant's biological data.

This is not the first paper to raise questions about the anonymity of DNA data. In 2008, David Craig et al. published a paper in PLoS Genetics that demonstrated "the ability to accurately and robustly determine whether individuals are in a complex genomic DNA mixture." As a result, the National Institutes of Health in the US and the Wellcome Trust in the UK restricted public access to some genetic data.

Academic discussion has continued, for instance with papers in the International Journal of Epidemiology late last year [abstract] and, in February 2012, in the Journal of Medical Ethics [abstract]:

How anonymous is ‘anonymous’? Some suggestions towards a coherent universal coding system for genetic samples

This issue is becoming critical, particularly when crowdsourced research is performed by private businesses. Genentech and 23andMe have formed a partnership to investigate breast-cancer patients who may have benefited from Avastin, after the FDA withdrew its approval of the drug for breast cancer last year over concerns about safety and efficacy. The 23andMe/Genentech study has been criticized by some, including Karuna Jaggar, the executive director of Breast Cancer Action in San Francisco, partly because of concerns that private companies will hold the patients' DNA data. To which the response is:

Genentech officials said the samples are de-identified and protected under the same scientific guidelines and federal laws as other studies.

Well, yes. That’s the problem, not the solution.

Meanwhile, there is a growing, related debate over privacy in another realm, namely the online world. In the UK, the government plans to increase monitoring of the Internet, in particular email and social media. Indeed, the inventor of the web spoke out strongly against this, in April:

Sir Tim Berners-Lee, who serves as an adviser to the government on how to make public data more accessible, says the extension of the state's surveillance powers would be a "destruction of human rights" and would make a huge amount of highly intimate information vulnerable to theft or release by corrupt officials. In an interview with the Guardian, Berners-Lee said: "The amount of control you have over somebody if you can monitor internet activity is amazing."

He was strongly supported by Sergey Brin, co-founder of Google, who sees the threat coming partly from "walled gardens" such as Facebook and Apple, and particularly from "governments increasingly trying to control access and communication by their citizens." In response to criticism, he reiterated that "government filtering of political dissent" is the biggest threat to Internet freedom.

Brin's wife runs 23andMe. Not only is the company working with Genentech on genomic analysis, it is extremely active in Parkinson’s research. (Brin’s mother has the disease, and he is at risk for it.) Indeed, this kind of research seems to be the heart of the company’s business plan. Back in 2008, tech analyst David Hamilton wrote that individual consumer testing was not the point:

Instead, their business generally depends on amassing a giant anonymized database of customer genetic information that can be mined for research studies by academic researchers or drug companies.

Brin is concerned about the government and about competitors, but presumably thinks that companies in which he has a hand will, as Google famously claims, not be evil. Their intentions may be good, but can they control access to their data?

Meanwhile, researchers keep trying to mine DNA for personality traits, as well as identity. "Can You Predict a Monkey’s Social Status by Looking at Its Genes?" asks an article in Scientific American. We at Genetic Crossroads (and many researchers, including some who strongly disagree with us) regularly disparage efforts to discover the "gene for" some obnoxious or desirable trait. But people keep looking, and they may yet find something — or, worse, politicians may implement some kind of discrimination based on a false finding.

It's quite possible that the public really does not care about the privacy issues involved in DNA databases. Dan Vorhaus, an attorney who is expert in these matters, seems to have accepted that privacy is a relic of a bygone age, since "the idea that we can promise a complete separation from data and identity is now largely discredited." He concludes that people who take part in studies should be warned, and "free to assume that risk if they wish."

Vorhaus may know the risks; most people don't. And until very recently there was said to be no risk at all. This is a topic that demands debate, and strong regulations—which should cover private as well as public institutions. Isn't it time to update the Genetic Information Nondiscrimination Act?

And until then, we in the general public need to keep ourselves informed and aware. Caveat donor!

More from Pete Shanks