João Gabriel Genova

The target audience of these essays is undergraduate students.

Essay March 17th: What does replication imply for phylogenetics?

The dichotomous nature of DNA replication is the cornerstone of phylogenetics. One lineage splitting in two mirrors one DNA molecule duplicating itself. Replication is a delicate process and, to prevent harmful effects, a complex repair machinery acts whenever a mistake happens. DNA can also be damaged in various ways, whether chemical, mechanical or through radiation. Although very efficient, this system is not perfect and errors sometimes slip through; an error that escapes repair is called a mutation. If the mutation is passed down and spreads through the population (so that it is no longer restricted to the individual or molecule where it appeared), it's called a substitution. When a change, or an accumulation of changes, is enough to reproductively isolate one lineage from another, speciation occurs. One issue with our current paradigm is that it is built only on vertical transfer of information. In nature, mainly in prokaryotes, horizontal transfer of genetic material is common. Although we know how this phenomenon happens, we are not yet able to fully accommodate it in a phylogenetic framework.

Essay March 24th: The Neutral Theory of Molecular Evolution

It’s commonly thought that a mutation must be either beneficial or harmful to an organism. In 1968, Motoo Kimura, a Japanese geneticist, argued that a large share of the mutations in a genome are neutral, which implies that they are not fixed through natural selection. Kimura’s Neutral Theory states that the random fixation of neutral or very nearly neutral mutations through drift in finite populations is the cause of the majority of evolutionary changes at the molecular level. By neutral we mean that those variants don’t affect the organism’s fitness. This proposal clashed with the Neo-Darwinians of the time, who held that natural selection is the main driving force behind evolutionary change. Comparisons of hemoglobin and other molecules between “living fossils” (organisms that have gone through few evolutionary changes since they appeared on Earth) and rapidly evolving species show that they have undergone a similar number of nucleotide substitutions. This evidence supports Kimura’s predictions.
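
Random fixation by drift can be illustrated with a small simulation. The sketch below is a minimal Wright-Fisher-style model, not Kimura's own analysis: it assumes a fixed number of gene copies, non-overlapping generations and a single new mutation with no effect on fitness. The function names and the population size are illustrative choices.

```python
import random

def neutral_mutation_fixes(n_copies, rng):
    """Follow one new neutral mutation until it is lost or fixed.

    Returns True if the mutation reaches all n_copies gene copies.
    """
    count = 1  # the mutation starts as a single copy
    while 0 < count < n_copies:
        freq = count / n_copies
        # Each copy in the next generation is drawn at random from the
        # current generation, with no advantage for either variant.
        count = sum(1 for _ in range(n_copies) if rng.random() < freq)
    return count == n_copies

def fixation_fraction(n_copies, replicates, seed=1):
    """Fraction of independent runs in which the neutral mutation is fixed."""
    rng = random.Random(seed)
    fixed = sum(neutral_mutation_fixes(n_copies, rng) for _ in range(replicates))
    return fixed / replicates

if __name__ == "__main__":
    print(fixation_fraction(n_copies=20, replicates=2000))
```

With 20 gene copies, the fraction of runs that end in fixation should fall near 1/20, the starting frequency of the mutation, even though selection plays no role.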

Kimura M. 1968. Evolutionary rate at the molecular level. Nature, 217:624.

Essay May 5th: What is tree-thinking?

Tree-thinking is a way of considering objects in a context of descent and inheritance (it can be applied to living and non-living things). In biology, this means working on a phylogenetic basis. Taxa are related through inheritance and descent, sharing common ancestors. If we trace and represent those relationships, we obtain a tree-like figure. In our tree, the subjects of study are at the tips of the diagram. When we trace the history of a tip, we go down along a branch. The point where branches meet is called a node, and each node represents an ancestor shared by the branches joined there. Does that mean that all tips represent extant organisms? No, fossils and extinct taxa can be placed at tips too. Tree-thinking is such a powerful tool because it informs us of the shared history of our objects, and we can plot the events that led to divergence between branches to better understand how those events affected them. One of the most common errors in reading evolutionary trees derives from a tip bias. A being placed right next to B doesn’t mean that they are more closely related than B and C. It doesn’t matter who is alongside whom at the top of the tree; what matters is who most recently shared an ancestor with whom.
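
The rule "what matters is who last shared an ancestor" can be made concrete with a toy tree. This is a minimal sketch on a hypothetical three-taxon tree (A, (B, C)); the node labels `n1` and `n2` and the function names are made up for illustration.

```python
# Hypothetical tree (A, (B, C)), stored as child -> parent.
# "n1" is the root; "n2" is the ancestor shared only by B and C.
PARENT = {"A": "n1", "n2": "n1", "B": "n2", "C": "n2", "n1": None}

def ancestors(node):
    """Ancestors of a node, ordered from the nearest one down to the root."""
    path = []
    node = PARENT[node]
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path

def mrca(x, y):
    """Most recent common ancestor of two tips."""
    ancestors_of_y = set(ancestors(y))
    for node in ancestors(x):
        if node in ancestors_of_y:
            return node
    return None

def depth(node):
    """Number of nodes between this node and the root (the root has depth 0)."""
    return len(ancestors(node))

def more_closely_related(x, y, z):
    """True if x shares a more recent ancestor with y than it does with z."""
    return depth(mrca(x, y)) > depth(mrca(x, z))

if __name__ == "__main__":
    # However the tips are drawn on the page, B is closer to C than to A.
    print(mrca("B", "C"), mrca("A", "B"))
    print(more_closely_related("B", "C", "A"))
```

The order in which A, B and C are written along the top of a drawing never enters the calculation; only the nodes do.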

Baum D. A. & Offner S. 2008. Phylogenics & tree-thinking. The American Biology Teacher, 70:222–229.
O’Hara R. J. 1997. Population thinking and tree thinking in systematics. Zoologica Scripta, 26:323–329.

Essay May 12th: Trees and similarity

Suppose you’re a researcher who wants to study the relationships within a genus containing 15 species: how many trees can you build? The answer is on the order of 2 × 10^14 rooted bifurcating trees, and with 20 species it reaches about 8.2 × 10^21. Building all possible trees and analyzing all relevant parameters are nearly impossible tasks. Even with a computer, an exhaustive search would take an impractical amount of time. So how can scientists arrive at trees that satisfy their hypotheses amidst this huge forest of possibilities? You can’t look at all the trees, but you can search for specific ones. The trick is choosing criteria that narrow down the number of trees. So instead of an astronomical number, you end up with a small batch that satisfies the conditions you set. One of the first methods proposed for building trees was UPGMA (Unweighted Pair-Group Method using Arithmetic Averages), described by Sokal & Sneath (1963). Under it, trees are built from the average similarity shared between groups. One main criticism of UPGMA is that it presumes that all lineages change at the same rate, something that’s not generally observed biologically. A solution for this is the Neighbor-Joining method, developed by Saitou & Nei (1987), which allows rates of change to differ among lineages.
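
The size of tree space can be checked directly. A minimal sketch, assuming labelled, strictly bifurcating trees, for which the number of rooted trees on n taxa is the double factorial (2n - 3)!! and the number of unrooted trees is (2n - 5)!!; the function names are illustrative.

```python
def odd_double_factorial(last):
    """Product 1 * 3 * 5 * ... * last for an odd last; 1 if last < 1."""
    total = 1
    for k in range(3, last + 1, 2):
        total *= k
    return total

def count_rooted_trees(n_taxa):
    """Rooted, bifurcating, labelled trees on n_taxa tips: (2n - 3)!!"""
    if n_taxa < 2:
        return 1
    return odd_double_factorial(2 * n_taxa - 3)

def count_unrooted_trees(n_taxa):
    """Unrooted, bifurcating, labelled trees on n_taxa tips: (2n - 5)!!"""
    if n_taxa < 3:
        return 1
    return odd_double_factorial(2 * n_taxa - 5)

if __name__ == "__main__":
    for n in (4, 10, 15, 20):
        print(n, count_rooted_trees(n), count_unrooted_trees(n))
```

For 15 taxa this gives 213,458,046,676,875 rooted trees (about 2.1 × 10^14), and for 20 taxa about 8.2 × 10^21, which is why exhaustive enumeration stops being an option so quickly.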

Sokal R. R. & Sneath P. H. A. 1963. Principles of Numerical Taxonomy. W. H. Freeman & Co., San Francisco, CA.
Saitou N. & Nei M. 1987. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Molecular Biology and Evolution, 4:406–425.

Essay May 19th: Likelihood

Likelihood is a statistical method for estimating the value of a parameter from a set of data. Where does it fit in phylogenetics? There is an array of things that can be estimated, such as topologies, branch lengths and substitution rates. Those things are the parameters of your model and, like any parameter, each can take many possible values. So what do the likelihood values mean? What am I measuring here? People often misinterpret likelihood values. They read them as the probability that the tree obtained is correct. That’s not true. First, the likelihood isn’t the probability of the tree. It’s the probability of the observed data given a tree and a model, viewed as a function of the parameters, so the likelihoods of different trees need not add up to one. Second, it doesn’t indicate how correct a tree is. It tells us how well a given tree and set of parameter values explain the data, which is why it is used to compare alternatives rather than to certify one of them.
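
A small worked case may help show likelihood as a function of a parameter. The sketch below assumes the simplest possible setting, which is not taken from the essay: two aligned sequences, the Jukes-Cantor model, and a single parameter d (expected substitutions per site separating them). Under that model the probability that a site differs is 3/4 · (1 − e^(−4d/3)). The site counts in the example are hypothetical.

```python
import math

def p_different(d):
    """Jukes-Cantor probability that a site differs at distance d."""
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))

def log_likelihood(d, n_sites, n_diff):
    """Log-likelihood of distance d, up to a constant that does not depend on d."""
    p = p_different(d)
    return n_diff * math.log(p) + (n_sites - n_diff) * math.log(1.0 - p)

def ml_distance_grid(n_sites, n_diff, step=0.001, n_points=2000):
    """Try many values of d and keep the one that makes the data most probable."""
    grid = [step * i for i in range(1, n_points + 1)]
    return max(grid, key=lambda d: log_likelihood(d, n_sites, n_diff))

def ml_distance_closed_form(n_sites, n_diff):
    """Analytical maximum of the same likelihood (the Jukes-Cantor distance)."""
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * n_diff / n_sites)

if __name__ == "__main__":
    # Hypothetical data: 100 aligned sites, 10 of which differ.
    print(ml_distance_grid(100, 10), ml_distance_closed_form(100, 10))
```

Each value of d gets a likelihood, and the estimate is simply the value that scores highest. None of those scores is the probability that d, or a tree, is correct.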

Essay May 26th: Choosing Models.

Models are simplified representations of reality. When reconstructing phylogenies using DNA or amino-acid sequences, we need an evolutionary model that can explain the process of nucleotide or amino-acid substitution. Our model should be able to describe the different probabilities of change from one nucleotide to another. Choosing a model is nothing more than deciding which parameters I’m going to estimate. Let’s take the 4 nucleotides (A, T, C, G): in a general model, each transformation would have a different value, making each one a different parameter. So this model would consist of 12 parameters (each base can change into the other three, amounting to a total of twelve transformations). If the most complex model assumes 12 parameters, in the simplest one all transformations have the same value, making it a single parameter (in molecular evolution this model is known as Jukes-Cantor, or simply JC). A widely used general model is GTR (general time-reversible), which assumes that a change and its reverse share the same rate and therefore has six substitution parameters instead of twelve. After choosing the number of substitution parameters, there are some additional parameters that can be applied. They are the frequency of nucleotides (F), since their proportions may vary between organisms; the proportion of invariable sites (I), the fraction of sites that do not change between the aligned sequences; and rate heterogeneity among sites (Γ, for the gamma distribution usually used to model it). After building all the candidate models, I need to choose which one is the most useful for my purposes. That can be done through tests such as the Likelihood Ratio Test (LRT) or the Akaike Information Criterion (AIC).
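
To show how such a comparison works in practice, here is a minimal sketch of the two criteria, using AIC = 2k − 2 ln L and the likelihood-ratio statistic 2(ln L1 − ln L0). The model labels, parameter counts and log-likelihood values are hypothetical numbers invented for illustration, not results from any data set; the parameter counts follow the essay's way of counting substitution parameters only.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: lower values indicate a preferred model."""
    return 2 * n_params - 2 * log_likelihood

def lrt_statistic(loglik_simple, loglik_general):
    """Likelihood-ratio statistic for a simple model nested in a general one."""
    return 2 * (loglik_general - loglik_simple)

def best_by_aic(models):
    """Return the name of the model with the lowest AIC.

    models maps a name to a (log_likelihood, n_params) pair.
    """
    return min(models, key=lambda name: aic(*models[name]))

if __name__ == "__main__":
    # Hypothetical fits of two nested models to the same alignment.
    models = {
        "simple (1 parameter)": (-1050.0, 1),
        "general (12 parameters)": (-1045.0, 12),
    }
    for name, (loglik, k) in models.items():
        print(name, aic(loglik, k))
    print("preferred:", best_by_aic(models))
    # The statistic is compared with a chi-square distribution whose degrees
    # of freedom equal the difference in parameter count (here 12 - 1 = 11).
    print("LRT statistic:", lrt_statistic(-1050.0, -1045.0))
```

In this made-up case the general model fits slightly better, but not by enough to pay for its eleven extra parameters, so AIC prefers the simple one.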

Sullivan J. & Joyce P. 2005. Model Selection in Phylogenetics. Annual Review of Ecology, Evolution, and Systematics, 36:445–466.

Essay June 2nd: Tree searching: an uphill battle.

Searching for trees is a high-complexity problem. The number of possible trees grows faster than exponentially with each new terminal added. Hill-climbing algorithms help narrow down this forest to an optimal tree. The so-called numeric methods (UPGMA, NJ) are useful for getting a starting point. Once you have located yourself in the landscape, the methods based on an optimality criterion come in. Instead of the heavy work of building all possible trees, they use algorithms for finding the optimal tree. These search engines are the hill-climbing algorithms. The problem is that sometimes they can be “greedy”. That means they are very good at finding a local optimum but can miss the global optimum. To counter that, scientists developed refinement methods. They perturb the hill-climbing search in different ways and see whether it recovers the same tree. Some types of perturbation are: giving new weights to the characters used to construct the tree (useful for parsimony-based topologies), cutting a branch and pasting it in another place, swapping branches between different trees and even doing the hill-climbing backwards.
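
The "greedy" behaviour is easy to reproduce on a toy landscape. The sketch below is not a tree search: it assumes a hypothetical one-dimensional list of scores standing in for tree space, where the neighbours of a position play the role that branch rearrangements play for a tree. Restarting from several starting points is used here as one simple form of perturbation.

```python
# Hypothetical landscape: one score per position, higher is better.
# Position 2 is a local optimum (score 5); position 6 is the global one (score 9).
SCORES = [1, 3, 5, 4, 2, 6, 9, 7]

def hill_climb(start, scores):
    """Move to the best neighbour until no neighbour improves the score."""
    current = start
    while True:
        neighbours = [i for i in (current - 1, current + 1) if 0 <= i < len(scores)]
        best = max(neighbours, key=lambda i: scores[i])
        if scores[best] <= scores[current]:
            return current  # stuck: no neighbour is better
        current = best

def hill_climb_with_restarts(starts, scores):
    """Run the climb from several starting points and keep the best end point."""
    end_points = [hill_climb(start, scores) for start in starts]
    return max(end_points, key=lambda i: scores[i])

if __name__ == "__main__":
    print(hill_climb(0, SCORES))                        # stops on the local optimum
    print(hill_climb_with_restarts([0, 4], SCORES))     # a second start finds the global one
```

A single climb from position 0 stops at the first peak it meets; only a different starting point reveals that a higher peak exists.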

Giribet G. 2007. Efficient tree searches with available algorithms. Evolutionary Bioinformatics Online, 3:341–356.

Essay June 9th: Red Queen and Court Jester

Species are constantly evolving but do not become better adapted. Leigh Van Valen (1973) observed that the probability of a species becoming extinct did not increase or decrease with its age. His explanation for the driving force behind extinction rates is the constant arms race between species. If an organism is constantly evolving, all the others that interact with it must be evolving at the same pace if they are to survive. Such dynamics have been observed in host-parasite systems over the years, supporting Van Valen's hypothesis, which he dubbed the Red Queen (a homage to Lewis Carroll's character). But many scientists, mainly paleontologists, were dissatisfied with this explanation. In their view, abiotic factors are the driving force behind the extinction-speciation balance. Anthony Barnosky (2001) compiled theories that favor abiotic causes and tested them on the evolution of mammals of the northern Rocky Mountains. He labeled them the Court Jester, since abiotic factors don’t follow evolutionary rules, are unexpected and act from outside the system. Barnosky's study concluded that both the Red Queen and the Court Jester affect net speciation rates, but they work on different time scales. The former acts on 'small' time scales and the latter on large ones.

Van Valen L. 1973. A new evolutionary law. Evolutionary Theory, 1:1–30.
Barnosky A. D. 2001. Distinguishing the effects of the Red Queen and Court Jester on Miocene mammal evolution in the northern Rocky Mountains. Journal of Vertebrate Paleontology, 21(1):172–185.

Essay June 23rd: Course evaluation and self-evaluation

The course exceeded my expectations overall. I expected that, because it is exclusively theoretical, the discussions and assigned texts would be tedious, all the more so for someone like me who lacks the molecular “background” and had never worked with historical reconstruction. I was wrong: the discussions flowed well, and the concepts and methods presented were always questioned regarding their application to historical reconstruction. I really liked most of the texts in the recommended bibliography; many of them reflect the course's goals for scientific writing, with clear, objective and direct language. One thing I think could change for future classes is how the essays are reviewed. Peer review should continue, but I think there could also be an open discussion of the essays among the class.
Evaluating my personal performance in the course, I feel satisfied with myself. I started out somewhat discouraged in the classes before the first exam, even considering dropping the course, largely because of my lack of familiarity with the concepts covered. But since my main interest was in the classes of the second part of the course, I decided to continue and did not regret the decision. I believe I have gained a good foundation for understanding studies that use molecular data for phylogenetic reconstruction. I am even thinking, who knows, of using molecular phylogenies in a possible doctoral project. In the first exam, my performance and the difficulty I felt were within what I expected. I also really liked the course's encouragement of writing. Constantly producing texts helped me a lot, not only to improve what I write but also to develop healthy writing habits. Although I missed one class and did not produce the corresponding essay (nor the extra essay), I consider my overall performance in the course to have been on an upward trend.

Self-evaluation: 0.9

Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License