Dr. John Novembre is a Professor in the Departments of Human Genetics and of Ecology and Evolution at the University of Chicago. John received a MacArthur Fellowship in 2015 for his pioneering work as a computational biologist. John’s work on uncovering fine-scale population structure and variation has been essential to understanding human genetic variation and diversity. As an example of the impact of his work, the figure from Novembre et al., Nature 2008 (shown below) has become iconic and is a standard part of population genetics textbooks. We were delighted to chat with John about a range of issues: his beginnings in this field, his work in population genetics, the topic of race in genetics, and his recent efforts at addressing some of the problems surrounding the topic.
Q: We are interested in learning a bit about how you got to where you are today, at the University of Chicago. Could you also tell us a little about your early interests?
A: As a kid growing up, I was exposed to different languages and different places. I was born in Paraguay and my mother is from Uruguay, and I grew to enjoy thinking about the world broadly. Biology in particular became a passion when I was in high school. At the same time, I was also a bit of a computer nerd, for instance building a little lunar lander type game with my friends. It wasn’t until a good way through undergrad that I realized I could take my passion for biology and combine it with my hobby of programming – that path wasn’t as obvious at the time as it is now. Around that time I became fascinated with the protein folding problem and worked in a lab that was doing site-directed mutagenesis to understand the impact of amino acid changes on protein structure. That got me thinking about molecular variation and the fate of new mutations. I heard some talks on human population genetics and became all the more fascinated, as it’s a field that combines molecular thinking with population history, language, and geography. So it’s sort of this perfect storm of overlap in the Venn diagram of what I found interesting. I was very excited then to go to Berkeley and work with Monty Slatkin to learn theoretical population genetics. In my post-doc I worked with Matthew Stephens at the University of Chicago and got more exposed to mainstream human genetics. My first tenure track job was at UCLA and now I’m back at the University of Chicago.
Q: Tell us a little about your work during and after your PhD. Did you work in different model systems between your PhD and now?
A: I did. For my PhD, the aim was more general methodologies, but I did do a little bit of human genetics. I worked on codon bias with a molecular evolution emphasis when I first arrived at Berkeley. Then I shifted to work on dispersal and trying to infer how far individuals move based on the distribution of rare variants. That work was motivated by ecological studies in non-model organisms including plants. Finally, I worked on methods for analyzing the geographic spread of an advantageous allele and applied that to a case study in humans.
Q: Did you find this shift in model organisms or scientific questioning difficult?
A: I think at different points of my career I’d answer this question differently. As a computational biologist, I think it can seem pretty simple to switch model organisms because in some sense you are just changing the file names. Your files still have rows of genotypes – you’re not learning how to rear or capture a new organism. But over the years I’ve focused more and more on human variation. Sticking to one model system allows for a more nuanced understanding and there are huge advantages to staying in one system. That said, you still have to read the papers on the other systems. You then realize what is specific to your study system and is not a universal generality of nature, because scientists can tend to over-generalize from their own study systems.
Q: What was your favorite fact about canids that you learned when you researched them?
A: We did this study analyzing some of the first whole genome sequencing of canids. We looked at a Croatian, an Israeli, and a Chinese wolf, and also a dingo (Australian wild dog), a basenji (African dog), and a boxer. We had some rough ideas of possible ways the data might play out, for instance, where the dingo and the Chinese wolf might cluster together, the basenji and the Israeli wolf would cluster together, and the boxer and Croatian wolf would cluster together, which would indicate multiple origins of these geographical lineages. Another idea was that the dogs would be a clade descended from one branch of the wolf phylogeny. Instead, we saw that the dogs all clustered together and the wolves all clustered together. We had to posit that there was a wolf lineage that was an outgroup to all of the dogs that went extinct, that we’re missing that lineage, and that current wolves are descended from some sister group of wolves to the ones that brought us dogs. This was a really big surprise, and it happens all the time in research: you expect one thing and get another.
Q: Let’s now switch gears a bit and go back to human genetics. We hear about human genetics in the news almost on a daily basis. The use of human genetic data has been prolific, and sometimes for nefarious reasons. As students studying human genetics, we wonder: do you think the current system of education is doing enough to educate us on the background of human population genetics? For example, human population genetics has ties to the eugenics movements; are some of the problems with these historical approaches covered sufficiently in current curricula?
A: Population genetics has its founders and roots connected to researchers who were thinking about eugenics. This cannot easily be overlooked or missed by anyone who studies in the field deeply. I remember having the experience in my PhD where I was following a paper trail, going from paper to paper, and all of a sudden, I was in the Annals of Eugenics 1937 reading a paper that had the essential math to do my work. That eugenics legacy is there and I think it is important to teach, especially as population genetics continues to become more mainstream.
Population genetics isn’t a niche field anymore. It’s a field that has a lot of relevance to all parts of biology. More people are being exposed to population genetics and need to learn this background. So, we have been trying to stress this more in our graduate curriculum here at the University of Chicago.
First, we are integrating these topics into two sessions of the Responsible Conduct of Research (RCR) courses that are part of NIH requirements for training grants and fellowships. In the first session, taught by Joe Thornton, we talk about the history of eugenics in the US and genetic engineering (all the way to CRISPR). An important part that’s covered is how even progressive voices in the US, who we would generally associate with more socially progressive policies, were actually pro-eugenics at one point. This sort of thinking was not simply isolated to Nazi Germany but was also here in the US. It’s also a reminder that as our current moment grapples with technologies like CRISPR, there can be slippery slopes and surprising turns. The second session we’ve added is on Genetics, Race, and Discrimination; I teach this section. I present an overview of what we know about human genetic variation, how it relates to concepts of race, and how it undermines concepts of race. Then we shift the conversation to discrimination, not just in terms of racial discrimination but also discrimination based on genotypes. Here I introduce the Genetic Information Nondiscrimination Act (GINA) and have us think about how we may have a future where, say, schools might decide whether or not to admit individuals based on their genotypes.
Second, in our Human Genetic Variation and Disease course, which is a mandatory course in the Human Genetics program, we have a session on genetics and race, and in particular discuss a set of readings from public-facing discussions in the media in the last couple of years. Third, we have our departmental journal clubs where the themes are student-driven, and we encourage students to choose genetics and society issues as themes. This has allowed us to have sessions on genetic discrimination and data ownership, and we hope this will happen on a regular basis. Fourth, we started a genetics and race reading group this past summer, beginning with the book Superior by Angela Saini. Finally, we are trying to encourage our seminar series to host more “genetics and society” speakers, such as humanities scholars who think about genetics. These are incremental steps, but I think we are at a qualitatively different place from where we were five years ago, and we are learning as we go.
Q: How different is teaching these topics from teaching mathematical, computational biology?
A: It’s very different and challenging. My colleagues and I talk about how difficult it is to even get up in front of students to try to teach it. One thing we do is to set the tone at the beginning of the class by emphasizing that the topics that will be covered are difficult and that there will also likely be actual, physical discomfort when discussing these issues. But we also encourage the students to try to push through and have these discussions. It will make us better thinkers and speakers on these topics. In these discussions we emphasize being forgiving of each other, acknowledging that we will slip up on how things come out or we might misunderstand something someone else has said. We encourage the students to accept that everyone is coming at it with good intentions and try to see and acknowledge the contributions that someone is trying to make. We try to create a space where students can open up and talk about these issues. We have some material with content that we use as a base, and then we jump from there to discussions with questions that promote dialogue. There isn’t a lot of pedagogical material to go from and there is a big need for resources for education in this area. We like Brian Donovan who is an education scholar who has had training in human genetics and thinks about ways to teach genetics in a way that helps reduce racialized thinking of human diversity.
Q: You’re also involved in an effort to revamp and fix GINA (Genetic Information Nondiscrimination Act). Can you tell us about your work in this area?
A: GINA is a federal law passed in 2008 that prevents genetic information from being used for health insurance or employment decisions. You can think of civil rights in the US as expanding through time and GINA is part of that history. The definition of “genetic data” in GINA is really savvy; it is defined as not just the results of a genetic test you take, but also family history data. The bad news is that the coverage of GINA only extends to health insurance and employment. There are many other areas it doesn’t cover, like long-term care insurance, disability insurance, life insurance, education, and mortgage lending. And now we’re in an era of ever-increasing GWAS (genome-wide association studies), which look into all kinds of traits, including behavioral traits such as educational attainment and risk taking. These studies may be noisy and not port well outside the study populations, but that may not matter to mortgage lenders or insurance companies who are working with thousands of individuals. Even if the studies are only slightly informative about these traits, that could help the companies. So how do we deal with this? Well, we could expand the domain of GINA the way California did, passing a law called CalGINA, which covers genetic discrimination in education and lending but still does not include long-term care and disability insurance. The legal scholars have varying opinions on what the best course of action should be, and it’s been fascinating talking to them.
Q: How did you get involved in working on GINA?
A: In my presentations I had started to include a blurb about the importance of GINA and how it should be protected and expanded in some way. Then, in October of 2018, I attended a meeting that was organized to talk about the growth of GWAS and polygenic scores. While talking with some other attendees, I came to wonder who was advocating around these issues and what the current state of the field was. I happened to be on sabbatical at the time, so I read a lot about the topic and spoke with subject matter experts, eventually getting involved with the Cyberlaw Clinic at Harvard Law School. Working with one of the clinical faculty there and two fellows, we developed a report that analyzed the potential for genetic discrimination in the education and housing spheres and the different actions that could be taken around it.
Q: You’re the only tenured professor I’ve heard of that works on legislation like this. Do you think there should be more scientists and academics involved in shaping legislation related to genetics?
A: I think it’s personal – each of us has our own space and capacities for putting back in for what we’ve received. At the moment, it feels like this is the best use of some of my service time. It’s not like I do this in a professional capacity; it’s more like an extracurricular activity. That said, I think we should feel responsible to the knowledge we have, the subject matter expertise, and I think we should be helping to foster discussions around these subjects where we can anticipate issues that maybe others can’t see quite yet. The technology so often moves quicker than the conversations about how it should be used, and so if you’re at the bleeding edge of understanding the technology, you can say, “Hey, let’s start this conversation.” And there is movement. Florida just passed a law months ago banning the use of genetic information in life insurance. But again, I have to be humble and cannot anticipate how far my particular actions will go. But I think it’s important to move the conversation forward.
Q: Is there anything else that you’d like to share with graduate students?
A: I guess I would say, based on what we have focused a bit on here, that it is natural to be thinking about these genetics and society issues, and so it is normal to be asking the environment you’re in to provide better training programs, guidance, and space within the curricula to discuss these issues. For some support in that regard, the National Academies of Sciences released a report, 21st Century Graduate Education Report in STEM, which discusses what makes graduate education today different from education 20 years ago. One of the key points is that biology is so interwoven with society, and curricula should be adapting and helping students engage in their capacity to have those conversations, and to have that extra set of gears needed. So, I would just give this note of encouragement, that while we are searching for truth, being a situated member of society is really important.
 Novembre, J., Johnson, T., Bryc, K., Kutalik, Z., Boyko, A. R., Auton, A., Indap, A., King, K. S., Bergmann, S., Nelson, M. R., Stephens, M., & Bustamante, C. D. (2008). Genes mirror geography within Europe. Nature, 456(7218), 98–101. https://doi.org/10.1038/nature07331
Materials to read on this topic, recommended by John:
Interpreting polygenic scores, polygenic adaptation, and human phenotypic differences by Noah A Rosenberg, Michael D Edge, Jonathan K Pritchard, Marcus W Feldman in Evolution, Medicine, and Public Health (2019)
“The ‘geno-economists’ say DNA can predict our chances of success. Critics counter their methods are naive, offensive or both, but all agree: either way, multigene testing will lead to a social upheaval.” by Jacob Ward in The New York Times (2018)