Thursday, January 14, 2016

Gene Spotlight: SLC4A10

(Cross-posted from my other blog)

EXECUTIVE SUMMARY: SLC4A10 is a member of a large family of genes that encode proteins for transporting ions (charged particles) across cell membranes. Within that (super)family, SLC4A10 belongs to a family of transporter proteins specializing in bicarbonate (HCO3-) ion transport, which is important for maintaining a constant pH within the cell --- i.e., preventing it from becoming too acidic or basic for the cell's biological machinery to function. SLC4A10 encodes a version of this transporter protein specific to certain cells of the central nervous system, and mutations disrupting this gene have been found in two instances: first, in a set of autistic twins who participated in a genomic study, and second, in a girl with epilepsy and intellectual disability. Disruption of this gene is thought to make brain cells more excitable, which can lead to seizures (which is probably why the girl in the second case study has them).

Where is it?

Its "cytogenetic band" is given as 2q23-q24 (or, alternatively, 2q24.2), which means that it's on the long arm ("q" as opposed to "p") of chromosome 2, somewhere in the middle.

(Map of chromosome 2, with a red line marking where SLC4A10 is)

This gene spans 360,942 base pairs (which is fairly large, but not enormous; there's a lot of variability in gene size, with the smallest ones only a few hundred bases long and the largest spanning several million bases --- see this e-textbook chapter for details), covering the distance between bases 162,480,845 and 162,841,786 (counting from the tip of the short arm, which is where base numbering on a chromosome starts).
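As a quick check on the arithmetic, the span follows directly from the two coordinates quoted above, as long as both endpoints are counted. Here's a minimal Python sketch (the function and variable names are mine, just for illustration):

```python
# Start and end coordinates of SLC4A10 on chromosome 2, as quoted in the text.
start = 162_480_845
end = 162_841_786


def gene_span(start: int, end: int) -> int:
    """Number of bases between start and end, counting both endpoints."""
    return end - start + 1


span = gene_span(start, end)
print(f"{span:,} base pairs")
```

The "+ 1" is there because a gene that starts and ends on the same base is one base long, not zero.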

What does it do?

The name SLC4A10 refers to its membership in a family of genes encoding similar proteins: solute carrier family (SLC) 4, which is a group of ten genes whose protein products transport bicarbonate ions (HCO3-) across cell membranes.

Bicarbonate ion (and its protonated form, carbonic acid, which forms readily when carbon dioxide dissolves in water) plays a central role in regulating pH, both within cells and outside of them, as in blood. pH is a measure of acidity, expressed as the (negative) logarithm of the concentration of hydrogen ions (H+) in the fluid being tested. Pure water has a "neutral" pH of 7 (meaning that the small fraction of H2O molecules that are dissociated at any given time yields equal concentrations of H+ and OH- --- if an acidic or basic compound is added, it will either add H+ to the solution or remove H+ from it, and thus move the pH down or up); the water inside human bodies is slightly basic and has a pH of around 7.4 ("physiological pH").
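To make the logarithm concrete, here's a small Python sketch of the pH definition and its inverse. The function names are my own; the only inputs are the pH values of 7 and 7.4 mentioned above and the standard definition of pH as the negative base-10 logarithm of the H+ concentration in moles per liter.

```python
import math


def ph_from_concentration(h_molar: float) -> float:
    """pH is the negative base-10 logarithm of the H+ concentration (mol/L)."""
    return -math.log10(h_molar)


def concentration_from_ph(ph: float) -> float:
    """Inverse of the above: H+ concentration (mol/L) at a given pH."""
    return 10 ** (-ph)


neutral_ph = ph_from_concentration(1e-7)      # pure water
physiological_h = concentration_from_ph(7.4)  # roughly 4e-8 mol/L

print(neutral_ph, physiological_h)
```

Because the scale is logarithmic, the seemingly small step from pH 7 to pH 7.4 means the hydrogen ion concentration drops to about 40% of its value in pure water.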

Bicarbonate/carbon dioxide can act as a "buffer" between an acidic or basic substance and the physiological environment: depending on what form it's in, it can either donate (H2CO3 --> HCO3- --> CO3(2-)) or receive (CO3(2-) --> HCO3- --> H2CO3) hydrogen ions, so that the pH of the surrounding fluid changes much less than it otherwise would.
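The usual way to put numbers on a buffer like this is the Henderson-Hasselbalch equation, which isn't discussed in the papers cited here, so treat this as my own illustration. The sketch below assumes an apparent pKa of 6.1 for the carbon dioxide/bicarbonate pair in blood plasma and typical textbook concentrations of 24 mM bicarbonate and 1.2 mM dissolved CO2; none of these figures come from the articles above.

```python
import math

# ASSUMPTION: apparent pKa of the CO2/bicarbonate system in blood plasma.
PKA_CO2_BICARBONATE = 6.1


def buffer_ph(bicarbonate_mm: float, dissolved_co2_mm: float) -> float:
    """Henderson-Hasselbalch: pH = pKa + log10([base] / [acid])."""
    return PKA_CO2_BICARBONATE + math.log10(bicarbonate_mm / dissolved_co2_mm)


# ASSUMPTION: typical textbook plasma values, in millimoles per liter.
typical = buffer_ph(bicarbonate_mm=24.0, dissolved_co2_mm=1.2)

# Hypothetical case: half the bicarbonate has been used up soaking up acid.
depleted = buffer_ph(bicarbonate_mm=12.0, dissolved_co2_mm=1.2)

print(round(typical, 2), round(depleted, 2))
```

With those assumed values the typical case comes out very close to the physiological pH of 7.4, and halving the bicarbonate pulls the pH down by about 0.3.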

Because bicarbonate cannot diffuse across cell membranes by itself, it needs to be transported into cells by ion-exchanging membrane proteins whenever it is needed. The protein produced by SLC4A10 ferries bicarbonate ion and sodium ion into the cell while expelling a chloride ion from the cell. Two bicarbonate ions are imported for every sodium ion, which keeps the net gain/loss of electrical charge at zero. 
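The charge bookkeeping in that last sentence is easy to verify. Here's a tiny Python sketch that tallies the net charge entering the cell per transport cycle, using only the stoichiometry described above (one sodium in, two bicarbonates in, one chloride out); the data layout is my own.

```python
# One transport cycle, as described in the text.
# Each entry: (ion, charge per ion, number of ions, direction),
# where direction is +1 for import into the cell and -1 for export.
cycle = [
    ("Na+", +1, 1, +1),
    ("HCO3-", -1, 2, +1),
    ("Cl-", -1, 1, -1),
]


def net_charge_imported(cycle) -> int:
    """Sum of charge * count * direction over every ion moved in one cycle."""
    return sum(charge * count * direction for _, charge, count, direction in cycle)


print(net_charge_imported(cycle))  # 0, i.e. electrically neutral
```

Exporting a negative ion counts the same as importing a positive one, which is why the chloride leaving the cell cancels out the second bicarbonate coming in.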

This particular gene is primarily expressed in the central nervous system (i.e., the brain and spinal cord), though related genes encode similar bicarbonate-transporting proteins for other tissue types. In mice, SLC4A10 is expressed in some types of brain tissue but not in others: it is specific to gray matter (neurons, but not glial cells) and is not expressed in white matter, and it is also specific to certain regions of the brain: the olfactory bulb, cortex, hippocampus and cerebellum.

In neurons, ion concentrations inside and outside the cell play a role in whether a given neuron will "fire" --- undergo dramatic and rapid change in the electrical potential difference across its membrane, which triggers electrical and/or chemical signaling of adjacent neurons --- so ion transporters in neurons also help mediate neurotransmission.

What mutant versions of this gene have been discovered?

In the article I mentioned in my last post --- Sebat et al., 2007 (full text here) --- the authors report finding a spontaneous deletion of the first coding region of SLC4A10 in a pair of twin girls with autism.

There is also a recent report of a girl with epilepsy and intellectual disability having part of this gene --- a 48,000-base stretch of the 2q24 region falling between coding regions 2 and 3 of SLC4A10 --- moved to another chromosome: chromosome 13.

How do these mutations affect protein function? 

Mice bred with the entire SLC4A10 gene missing were found to have much smaller brain ventricles than normal mice, and also had altered choroid plexus tissue. (The choroid plexus is where cerebrospinal fluid is made and waste is filtered out of it; active-transport proteins are especially dense there.) Researchers found it harder to induce seizures in these mice than in normal mice using the proconvulsant (i.e., seizure-inducing) drugs pentylenetetrazole and pilocarpine.

Neither of the mutations observed in humans involves knocking out the entire gene; one involves deleting the first of twenty-six coding regions, and the other involves switching a fairly long non-coding region with a sequence from another chromosome. Nothing is deleted in that case, but the insertion of something random into the middle of a gene might derail the process of assembling a working protein using that gene's (garbled) instructions. So both mutations impair the production of this protein to an unknown degree --- the protein probably isn't completely absent, but it might be present in reduced quantities or truncated, less-than-fully-functional form.

How common are they?

Very rare. Mutations in this gene are probably only a factor for a tiny, tiny minority of autistic people, whom I would suspect also have seizures. 

Database entries for this gene: AutDB, Entrez Gene, Ensembl, Genatlas, GeneCards, SFARI Gene


Damkier, H., Aalkjaer, C., & Praetorius, J. (2010). Na+-dependent HCO3- Import by the slc4a10 Gene Product Involves Cl- Export. Journal of Biological Chemistry, 285 (35), 26998-27007 DOI: 10.1074/jbc.M110.108712

Gurnett CA, Veile R, Zempel J, Blackburn L, Lovett M, & Bowcock A (2008). Disruption of sodium bicarbonate transporter SLC4A10 in a patient with complex partial epilepsy and mental retardation. Archives of neurology, 65 (4), 550-553 PMID: 18413482 

Jacobs, S., Ruusuvuori, E., Sipila, S., Haapanen, A., Damkier, H., Kurth, I., Hentschke, M., Schweizer, M., Rudhard, Y., Laatikainen, L., Tyynela, J., Praetorius, J., Voipio, J., & Hubner, C. (2008). Mice with targeted Slc4a10 gene disruption have small brain ventricles and show reduced neuronal excitability. Proceedings of the National Academy of Sciences, 105 (1), 311-316 DOI: 10.1073/pnas.0705487105

Sebat, J., Lakshmi, B., Malhotra, D., Troge, J., Lese-Martin, C., Walsh, T., Yamrom, B., Yoon, S., Krasnitz, A., Kendall, J., Leotta, A., Pai, D., Zhang, R., Lee, Y., Hicks, J., Spence, S., Lee, A., Puura, K., Lehtimaki, T., Ledbetter, D., Gregersen, P., Bregman, J., Sutcliffe, J., Jobanputra, V., Chung, W., Warburton, D., King, M., Skuse, D., Geschwind, D., Gilliam, T., Ye, K., & Wigler, M. (2007). Strong Association of De Novo Copy Number Mutations with Autism. Science, 316 (5823), 445-449 DOI: 10.1126/science.1138659

Gene Spotlight: CNTNAP2

(Cross-posted from my other blog)

EXECUTIVE SUMMARY: CNTNAP2 is a large gene near the end of chromosome 7 that encodes a cell-adhesion protein involved in distributing ion channels along axons (the long tails of nerve cells) and in attaching the fatty cells making up the myelin sheath to the surface of the axon. Disruptions in this gene have been associated with autism, epilepsy, Tourette syndrome and other neurodevelopmental disorders. Variations at certain points within the gene that don't alter or disrupt its expression have also been associated with an increased likelihood of autism.

Where is it?

Chromosome 7, in the 7q35 region (i.e., near the end of the long, lower arm of chromosome 7).
CNTNAP2 is, according to Entrez Gene, one of the largest single genes in the human genome; it's about 2.3 million base pairs long, taking up 1.5% of the total space on chromosome 7.

What does it do?

CNTNAP2 encodes a cell-adhesion protein called contactin-associated protein-like 2 (Caspr2), which is part of a superfamily of adhesion proteins specific to nerve cells called neurexins.

During development, Caspr2 plays an important role in organizing the long tail of the neuron, called the axon. Caspr2 directs certain types of voltage-gated potassium ion channels (i.e., channels that open or close in response to changes in membrane potential) to insert themselves into the axon's membrane at specific intervals; it also forms part of the junction between the axon and the fatty glial cells that form the myelin sheath around the axon, which both insulates the axon (like the rubber tubing around a wire insulates the wire) and allows the electrical current to travel faster, in a discontinuous, hopping ("saltatory") manner from node to node along the myelinated axon.

(Image adapted from Wikipedia) 

To allow the current to travel discontinuously, the fatty cells making up the myelin sheath leave small stretches of axon uncovered at regular intervals. These unmyelinated points are called the nodes of Ranvier, and it is at these nodes that Caspr2 plays its traffic-directing role.
(Node of Ranvier and surrounding regions; taken from Figure 4 in Poliak and Peles, 2003)

Caspr2 sits in the membrane of the axon in its juxtaparanodal region (which means, "the region right next to the region next to the node" in ScienceSpeak), where its cytoplasmic domain (the part of a membrane-spanning protein that's inside the cell) links up with potassium ion channels prior to their insertion in the membrane, and directs them to insert adjacent to the complex of adhesion proteins including Caspr2. This ensures that the potassium channels all cluster together in the juxtaparanodal region, rather than distribute themselves more or less evenly along the axon, as they would do without guidance from the adhesion proteins.

This superabundance of potassium ion channels near the nodes of Ranvier makes the nodes hypersensitive to membrane depolarization (which is mediated by traffic of ions, including potassium, into and out of the cell), which allows an action potential (the "firing" of a neuron that happens once it reaches a certain threshold level of membrane depolarization) to be transmitted from node to node more easily.

This gene may also play a role in organizing the layers of cortical tissue during development. 

What mutant versions of this gene have been discovered?

Lots of different ones! Here are (some of) the mutations that have been described so far:

An exchange of genetic material between two chromosomes: 7q35 and 15q26.2, with the breakpoint on chromosome 7 occurring inside CNTNAP2 (in the 11th intron, or noncoding region), and the breakpoint on chromosome 15 occurring in a relatively ill-understood region that's hypothesized to be a gene. This translocation was found in three generations of Old Order Amish, and described in this article in the European Journal of Human Genetics. Depending on whether the translocation was balanced or not (i.e., whether there was any net gain or loss of genetic material), the people having this mutation might be completely healthy and neurotypical, or they might have severe problems and die young.

An autistic woman in Italy was found to be missing a large (12 million base pairs!) chunk of chromosome 7q33-q35; CNTNAP2 is contained within the deleted region.

An autistic boy in the Netherlands was found to have an inversion --- a break in the q arm of one of his copies of chromosome 7 that reversed the sequence of genes on the broken part when it repaired itself --- between regions 7q32.1 and 7q35. Parts of CNTNAP2 --- the promoter region, which is where the enzymes involved in DNA transcription (the first stage of gene expression) attach to the genome and begin transcription, and parts of intron 1 and exon 2, which also contain important regulatory sites --- have been moved to another chromosome entirely: chromosome 1q31.2.

A preschool-aged boy with seizures and autistic traits (but not enough to be diagnosed with an ASD) was found to have an inversion between regions 7q11.22 and 7q35. The break in 7q35 occurs within CNTNAP2, somewhere between exons 10 and 13. 

Nine of eighteen Old Order Amish people with developmental disabilities and a childhood-onset form of epilepsy were found to have a deletion of a single nucleotide (#3709) in coding region 22 of CNTNAP2. The deletion was present in both copies of the gene.

A family in which several members (the father and both children) have Tourette syndrome, obsessive-compulsive disorder, or both, was found to have a complex rearrangement of genes on chromosomes 2 and 7, including some swapping of parts of genes between those two chromosomes. Among other things, part of a gene on chromosome 2 is inserted into CNTNAP2, in a noncoding region. The inserted part is very large (12 million bases, roughly five times the size of CNTNAP2 itself).

In a fairly large sample of families including more than one autistic member, a single-nucleotide change --- a substitution of thymine for adenine --- at a position approximately one-quarter of the way between coding regions 2 and 3 of CNTNAP2 (in other words, in an intron, or noncoding region) was found to occur at somewhat higher rates in autistic children than in their nonautistic siblings. This was especially true if the mutation was inherited from the mother.

A study of 185 Han Chinese families found another single-nucleotide variation in a noncoding region of CNTNAP2 that's associated with an increased likelihood of having autism. 

A study of families participating in the Autism Genetic Resource Exchange found several single-nucleotide changes near the end of the intron between exons (coding regions) 13 and 14 in CNTNAP2, where the presence of a variant nucleotide at one of four different positions (with variation at one site in particular, designated rs2710102, seeming to drive variation at the other three) was associated with delays in development of speech.

Three people (two of them siblings) who had undergone genetic testing for a separate mutation (in a gene called TCF4, the underexpression of which causes Pitt-Hopkins syndrome) were found to have deletions in CNTNAP2. The two siblings were missing exons 2-9; the other person was missing exons 5-8, and also had a mutation rendering a splice site (a place where various enzymes cut out those parts of a transcribed gene that are not needed in protein synthesis, and then join the remaining fragments back together) potentially invisible to splicing enzymes, which could mean that exon 10 has been functionally deleted as well.
Domain map of CNTNAP2, reproduced from Zweier et al., 2009

How do these mutations affect protein function?
(Drawing of CNTNAP2 exons and the protein domains they encode taken from Zweier et al., 2009)

A mutation's effect on protein function depends on where it is in the gene. The color-coded map I posted at the top of this section shows what kind of protein domain each coding region of CNTNAP2 encodes, and what role each domain plays in the protein's overall function (to the extent that either of those things is known, which can vary a lot from gene to gene).

For instance, the deletions mentioned in this article --- exons 5-8 in one person, and exons 2-9 in the others --- include a large block of laminin G domains (exons 5-10) and all three of the discoidin-like (DISC) domains near the end of the protein (exons 2-4).

Both of these groups are on the part of Caspr2 that reaches outside the cell, and are involved in binding to other proteins on other cells to join the two cells together. In the nervous system, the two types of cells likeliest to be joined together are neurons and glial cells, during myelination.

Another domain that's important to Caspr2 function is the PDZ-binding domain at the end of the cytoplasmic half of the protein. That domain binds to PDZ domains on potassium channels while they're free in the cytoplasm and guides them to embed in the cell membrane near Caspr2.

The point mutation described in this article would lead to garbled (or non-)expression of exons 23 and 24, which encode the transmembrane domain (i.e., the part of the protein that is embedded in the cell membrane) and the PDZ-binding domain; absence of those domains from Caspr2 might prevent that protein from clustering the potassium channels near the node of Ranvier. 

How common are they? 

Most of the mutations described above --- the deletions and translocations --- are very rare, possibly even unique to the individuals or families in whom they were discovered.

However, some of the single-nucleotide polymorphisms (SNPs) --- alterations of a single nucleotide base --- are fairly common. The polymorphism described in this article, the presence of thymine at a point in a noncoding region of CNTNAP2 where most people have adenine, is thought to occur in 36% of people.

So, while those variant alleles might be somewhat more common in people with autism than in people without it, there will still be lots of people without autism who also have those genotypes. The prevalence of autism being what it is, there are probably a lot more neurotypical people with a given polymorphism than there are autistic (or otherwise non-neurotypical) people. 
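To put rough numbers on that, here's a back-of-the-envelope sketch in Python. Only the 36% figure comes from the article discussed above; the autism prevalence and the carrier rate among autistic people are round numbers I made up purely for illustration, so the exact output means nothing beyond showing the size of the imbalance.

```python
population = 100_000

# ASSUMPTION: illustrative prevalence, not taken from any of the cited papers.
autism_prevalence = 0.015
# From the text: the variant is thought to occur in about 36% of people.
carrier_rate_general = 0.36
# ASSUMPTION: hypothetical, somewhat higher carrier rate among autistic people.
carrier_rate_autistic = 0.45

autistic = round(population * autism_prevalence)
non_autistic = population - autistic

autistic_carriers = round(autistic * carrier_rate_autistic)
non_autistic_carriers = round(non_autistic * carrier_rate_general)

print(autistic_carriers, non_autistic_carriers)
print(f"about {non_autistic_carriers / autistic_carriers:.0f} "
      "non-autistic carriers per autistic carrier")
```

Even with the variant assumed to be noticeably enriched among autistic people, non-autistic carriers outnumber autistic carriers many times over, simply because there are so many more non-autistic people to begin with.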

(This article from 2015 describes an extended family in Italy who share a homozygous -- present on both copies of chromosome 7 -- deletion spanning exons 2 and 3 of CNTNAP2; everyone in the family who possessed this deletion displayed a range of similar neurodevelopmental disabilities. The authors of the article say that, while heterozygous mutations in CNTNAP2 are common and frequently have no effect on development, homozygous mutations are much rarer and are always accompanied by some obvious disruption of neurological development.)

Database entries for this gene: AutDB, Ensembl, Entrez Gene, GeneCards, Leiden Open Variation Database


Arking, D., Cutler, D., Brune, C., Teslovich, T., West, K., Ikeda, M., Rea, A., Guy, M., Lin, S., & Cook Jr., E. (2008). A Common Genetic Variant in the Neurexin Superfamily Member CNTNAP2 Increases Familial Risk of Autism. The American Journal of Human Genetics, 82 (1), 160-164 DOI: 10.1016/j.ajhg.2007.09.015

Bakkaloglu, B., O'Roak, B., Louvi, A., Gupta, A., Abelson, J., Morgan, T., Chawarska, K., Klin, A., Ercan-Sencicek, A., & Stillman, A. (2008). Molecular Cytogenetic Analysis and Resequencing of Contactin Associated Protein-Like 2 in Autism Spectrum Disorders. The American Journal of Human Genetics, 82 (1), 165-173 DOI: 10.1016/j.ajhg.2007.09.017

Belloso, J., Bache, I., Guitart, M., Caballin, M., Halgren, C., Kirchhoff, M., Ropers, H., Tommerup, N., & Tümer, Z. (2007). Disruption of the CNTNAP2 gene in a t(7;15) translocation family without symptoms of Gilles de la Tourette syndrome. European Journal of Human Genetics, 15 (6), 711-713 DOI: 10.1038/sj.ejhg.5201824

Poliak S, Gollan L, Martinez R, Custer A, Einheber S, Salzer JL, Trimmer JS, Shrager P, & Peles E (1999). Caspr2, a new member of the neurexin superfamily, is localized at the juxtaparanodes of myelinated axons and associates with K+ channels. Neuron, 24 (4), 1037-47 PMID: 10624965

Poliak, S., & Peles, E. (2003). The local differentiation of myelinated axons at nodes of Ranvier. Nature Reviews Neuroscience, 4 (12), 968-980 DOI: 10.1038/nrn1253

Poot, M., Beyer, V., Schwaab, I., Damatova, N., Slot, R., Prothero, J., Holder, S., & Haaf, T. (2009). Disruption of CNTNAP2 and additional structural genome changes in a boy with speech delay and autism spectrum disorder. neurogenetics, 11 (1), 81-89 DOI: 10.1007/s10048-009-0205-1

Rodenas-Cuadrado P., Pietrafusa N., Francavilla T., La Neve A., Striano P., & Vernes S. C. (2015). Characterisation of CASPR2 deficiency disorder - a syndrome involving autism, epilepsy, and language impairment. BioRxiv DOI: 10.1101/034363

Rossi, E., Verri, A., Patricelli, M., Destefani, V., Ricca, I., Vetro, A., Ciccone, R., Giorda, R., Toniolo, D., & Maraschio, P. (2008). A 12Mb deletion at 7q33–q35 associated with autism spectrum disorders and primary amenorrhea. European Journal of Medical Genetics, 51 (6), 631-638 DOI: 10.1016/j.ejmg.2008.06.010

Strauss KA, Puffenberger EG, Huentelman MJ, Gottlieb S, Dobrin SE, Parod JM, Stephan DA, & Morton DH (2006). Recessive symptomatic focal epilepsy and mutant contactin-associated protein-like 2. The New England journal of medicine, 354 (13), 1370-7 PMID: 16571880 

Verkerk AJ, Mathews CA, Joosse M, Eussen BH, Heutink P, Oostra BA, & Tourette Syndrome Association International Consortium for Genetics (2003). CNTNAP2 is disrupted in a family with Gilles de la Tourette syndrome and obsessive compulsive disorder. Genomics, 82 (1), 1-9 PMID: 12809671 

Zweier, C., de Jong, E., Zweier, M., Orrico, A., Ousager, L., Collins, A., Bijlsma, E., Oortveld, M., Ekici, A., & Reis, A. (2009). CNTNAP2 and NRXN1 Are Mutated in Autosomal-Recessive Pitt-Hopkins-like Mental Retardation and Determine the Level of a Common Synaptic Protein in Drosophila. The American Journal of Human Genetics, 85 (5), 655-666 DOI: 10.1016/j.ajhg.2009.10.004

Gene Spotlight: MECP2

(Cross-posted from my other blog)

Where is it?

Near the very end of the X chromosome, at Xq28. Here is a picture of its position relative to some other genes at that part of the X chromosome: You can see that it's not the last gene on there, and there are quite a few known and potential genes following it, but it's really, really close to the end. That picture I just posted? With MECP2 appearing at the far left? That's the very end of a 24-page image. So, based on that I feel comfortable calling MECP2 one of the last genes on the X chromosome. 

What does it do?

It encodes a protein, MeCP2, that can bind to methylated DNA (and also to a variety of other transcription-repressing proteins) and whose function is to repress transcription of its target genes. (More recent research has found that it can also serve as a transcriptional activator.) It has a lot of target genes, and their functions vary widely; many of them are other transcription factors, and many are involved in cell-cell signaling, or in signal transduction within the cell. Overall, transcription and neurotransmission seem to be the physiological processes that the majority of MeCP2 target genes are involved with, though it is also important for nerve and muscle cell growth (and thus, needs to be expressed in different amounts at different times during development). It is highly expressed in nerve cells. It's also been found to have other functions, like RNA splicing, chromatin remodeling and DNA methylation.

What mutant versions of this gene have been discovered?

(Here's a very rough impression of where some of the more common mutations (and some less-common ones that I talk about in the next section) associated with Rett syndrome fall on a map of MECP2 coding regions. Mutations that only change an amino acid are outlined in different shades of red-orange; mutations that produce a truncated version of the MeCP2 protein are outlined in black, and indicated on the map with little stop signs. Image not drawn to scale)

This 1999 article in Nature Genetics (full text here) describes a genetic analysis of 29 girls with Rett syndrome* (8 of whom had a family history of the condition), which found seven point mutations (changes in a single nucleotide) and one case where an extra nucleotide (thymine) was inserted into the gene, which threw off the "reading" of everything that came after, since protein synthesis depends on grouping the nucleotides into threes, and stringing together the amino acids corresponding to each sequence of three nucleotides, or "codons". Changing one nucleotide to another will therefore change one amino acid in the resulting protein, while adding or subtracting a nucleotide will change every amino acid that follows. (Such "frameshift" mutations are much more likely than point mutations to result in a nonfunctional protein). 
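The difference between a point mutation and a frameshift is easier to see on a toy example. The Python sketch below uses the standard genetic code, but the 18-base sequence is invented for illustration and has nothing to do with the real MECP2 sequence; the function and variable names are mine as well.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Read the sequence three bases at a time; stop at a stop codon ('*')."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical toy sequence, not real MECP2.
normal = "ATGGCCGAGCGCAAGGGC"
substituted = normal[:4] + "T" + normal[5:]   # one base changed (C to T)
frameshifted = normal[:3] + "T" + normal[3:]  # one extra T inserted

print(translate(normal))        # MAERKG
print(translate(substituted))   # MVERKG
print(translate(frameshifted))  # MCRAQG
```

The substitution changes exactly one amino acid (A becomes V), while the single inserted T changes every amino acid after the first one. In a real gene, the shifted reading frame also tends to run into a stop codon early, which cuts the protein short.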

They were: a substitution of cytosine for thymine at nucleotide #538; substitutions of thymine for cytosine at nucleotides #390, #471, #547, #656, #837, and #1307; and the aforementioned insertion of (an extra) thymine between nucleotides #694 and #695.

Another genetic analysis described in a 2000 article in Human Molecular Genetics found 17 different mutations in 46 girls with Rett syndrome; these mutations included substitutions of thymine for cytosine at nucleotides #473, #502, #763, #808, #880, and #916; substitutions of guanine for cytosine at nucleotides #905 and #1038; a substitution of thymine for adenine at nucleotide #592; a substitution of cytosine for adenine at nucleotide #1461; a substitution of adenine for guanine at nucleotide #317; and a ten-nucleotide deletion starting at nucleotide #1158. Most of these mutations were in exon 3, though there were a few in exons 2 and 4 as well.

A 2004 analysis of DNA samples from 56 French women and girls with Rett syndrome found five frameshift mutations: a deletion of nucleotide #345, in exon 3; a deletion 202 nucleotides long, starting at position #895; another deletion 53 nucleotides long starting at position #1124; a deletion of 8 nucleotides and an insertion of 18 nucleotides starting at position #989; and an insertion of an AG dinucleotide after nucleotide #996. All of these last four were in exon 4.
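Whether an insertion or deletion shifts the reading frame comes down to whether the net change in length is a multiple of three. Here's a minimal Python check applied to the five mutations just listed; the function name and the dictionary layout are my own.

```python
def shifts_frame(deleted: int = 0, inserted: int = 0) -> bool:
    """True if the net change in length is not a multiple of three."""
    return (inserted - deleted) % 3 != 0


# The five mutations from the 2004 analysis, as described in the text.
mutations = {
    "deletion of nucleotide #345": {"deleted": 1},
    "202-nucleotide deletion at #895": {"deleted": 202},
    "53-nucleotide deletion at #1124": {"deleted": 53},
    "8 deleted and 18 inserted at #989": {"deleted": 8, "inserted": 18},
    "AG inserted after #996": {"inserted": 2},
}

for name, change in mutations.items():
    print(name, "->", "frameshift" if shifts_frame(**change) else "in frame")
```

All five come out as frameshifts. By the same rule, a deletion of exactly three nucleotides (or any multiple of three) would remove whole codons without disturbing the ones downstream.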

An article from this year describes a 41-base deletion in a Korean girl with Rett syndrome; the deleted region started at nucleotide #1152, in exon 4.

Another article from this year found a substitution of thymine for cytosine at nucleotide #535 in a Tunisian girl with Rett syndrome.

This article (full text here) describes 17 mutations: a substitution of thymine for guanine at nucleotide #298; a substitution of adenine for guanine at nucleotide #398; a substitution of guanine for adenine at nucleotide #914; a substitution of thymine for cytosine at nucleotide #730; an insertion of (an extra) guanine after nucleotide #704; an insertion of cytosine after nucleotide #747; and multiple deletions, most of which had starting points between nucleotides 1,000 and 1,200, and all but one of which were deletions of multiple nucleotides. There was also a sequence of 137 nucleotides, starting at position #1169, that was repeated.

This 2009 genomic analysis of 74 people with Rett syndrome in New Zealand turned up four new mutations, including a fairly large deletion (1,596 nucleotides) that encompassed both exons 3 and 4.

There are a lot more --- the International Rett Syndrome Foundation's database of mutations associated with Rett syndrome (RettBASE) lists 4,225 different mutations. Not all of them are in MECP2, but a large majority of them are.

Mutations in MECP2 can also be associated with conditions other than Rett syndrome: this article describes mutations found in five children with Angelman syndrome. Two of them had deletions in exon 4, one had a two-nucleotide deletion in exon 3, and the others had single-base substitutions.

How do these mutations affect protein function?

The MeCP2 protein has two regions (called domains) that are crucial to its function in the cell: the methyl-DNA binding domain (MBD), which allows it to bind to methylated cytosines, and the transcription repression domain (TRD), which binds to other enzymes that condense chromosomal DNA and make it impossible for the enzymes responsible for transcription to bind to it. MeCP2's role in transcription repression seems to be to bring the enzymes that do the actual repressing to its target sequences of DNA, rather than to block transcription itself.

(Image of the structure of the MeCP2 methylDNA-binding domain, showing the amino acids affected by some of the more common mutations)

The MECP2 gene has four exons, of which three contain sequences encoding these domains: Exon 2 encodes most of the DNA-binding domain, with some of it spilling over into exon 3, and parts of exons 3 and 4 encode the transcription repressor domain. So, depending on where it occurs in the gene, a mutation might disrupt either the MeCP2 protein's DNA-binding capacity, or its ability to bind to those other, transcription-repressing enzymes. 

Most of the mutations associated with Rett syndrome (or other conditions mentioned in the above section) change the structure of one of those domains in such a way as to weaken, or completely destroy, its ability to bind to whatever it needs to bind to. This article describes the effect on DNA binding ability of several known mutations (including a few of the most common ones) that alter the amino-acid sequence of the MBD. The mutation with the greatest effect on MeCP2's DNA-binding ability, p.R111G, swaps out a positively-charged amino acid on the long, flexible loop within the MBD for a nonpolar one; since that loop normally lies close to the sugar-and-phosphate "backbone" of the DNA (the part of the DNA to which the A's, T's, G's and C's all attach, and which forms the two outer ridges of the double helix), and since that backbone carries a negative charge (from all the phosphate groups), knocking out positively-charged amino acids in this region will disrupt the attraction between the DNA and the methylDNA-binding region of MeCP2.

Another mutation that can cause a sharp decline in DNA-binding ability, which also happens to be one of the most commonly-occurring mutations in people with Rett syndrome, is p.R133C, which also replaces a positively-charged amino acid with a nonpolar one. This one occurs in a different part of the MBD than p.R111G does, a "beta sheet" made up of long, flat strings of amino acids laid side by side. One of the short loops connecting two of the component strands has a sequence of five amino acids with hydrophobic side chains that create a "pocket" sequestering the methyl groups attached to the DNA. It may not always lead to loss of function, though; this group of mostly Japanese researchers conducted a similar analysis (full text here) of protein function, comparing some of the most common mutant versions of MeCP2 with its normal, "wild-type" form, and they found that the R133C variant bound to DNA almost as readily as the wild-type MeCP2 did. 

Other mutations associated with a near-total loss of DNA-binding ability are p.G114P, which replaces an amino acid in the middle of the long, flexible loop described above with one whose rigidly-structured, bulkier sidechain would greatly restrict the loop's ability to move and re-fold itself to fit into the groove of the DNA helix; and p.D121A and p.D121E, which replace an amino acid with a negatively-charged sidechain (aspartate) on one of the strands of the beta-sheet comprising another of the MBD's DNA-contacting surfaces with, respectively, one with a nonpolar sidechain (alanine) and one with a longer, still negatively-charged sidechain (glutamate). Two other fairly common mutations, p.R106W and p.F155S, throw off the protein's overall folding to such an extent that it becomes unstable at body temperature.
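Names like p.R111G pack three pieces of information into one token: the original amino acid (one-letter code), its position in the protein, and the replacement. Here's a small Python sketch that unpacks that notation and reports the side-chain charge before and after. The charge table is a simplification of my own: it only distinguishes +1, -1 and 0 at physiological pH, and it treats histidine as neutral.

```python
import re

AMINO_ACID_LETTERS = "ACDEFGHIKLMNPQRSTVWY"

# Approximate side-chain charge at physiological pH.
# ASSUMPTION: histidine is counted as neutral; anything not listed is 0.
CHARGE = {"R": +1, "K": +1, "D": -1, "E": -1}

PATTERN = re.compile(
    rf"p\.([{AMINO_ACID_LETTERS}])(\d+)([{AMINO_ACID_LETTERS}])"
)


def charge_change(mutation: str) -> tuple[int, int]:
    """Return (charge before, charge after) for a substitution like 'p.R111G'."""
    match = PATTERN.fullmatch(mutation)
    if match is None:
        raise ValueError(f"not a simple amino-acid substitution: {mutation}")
    original, _position, replacement = match.groups()
    return CHARGE.get(original, 0), CHARGE.get(replacement, 0)


for name in ["p.R111G", "p.R133C", "p.D121A", "p.D121E", "p.R106W", "p.F155S"]:
    before, after = charge_change(name)
    print(f"{name}: {before:+d} -> {after:+d}")
```

Truncating mutations such as p.R168X are rejected on purpose, since "X" stands for a stop signal rather than an amino acid. Charge is also only part of the story: p.D121E leaves the charge unchanged and p.F155S involves no charged residue at all, so their effects have to come from side-chain size and overall folding.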

Several mutations cause translation of the MECP2 transcript to stop prematurely, leading to the production of an incomplete protein. Depending on where the erroneous "stop" signal occurs, the resulting protein might be missing all or part of its transcription-repressor domain.

Mutations occurring downstream of the transcription-repressor domain have also been associated with problems; this experiment showed that mutant versions of MeCP2 that don't have the long tail following the TRD are less stable than wild-type MeCP2, and tend to break down quickly in the cellular environment. 

How common are they?

This article in the European Journal of Human Genetics lists eight MECP2 mutations its authors consider "common," along with each mutation's prevalence among the people with Rett syndrome listed in either the British Isles Rett Survey or the Australian Rett Syndrome Database. Of the 524 cases they looked at, 65 (12.8%) had the mutation p.T158M, which results from the substitution of thymine for cytosine at nucleotide 473; 58 (11.1%) had the mutation p.R168X, which results from the substitution of thymine for cytosine at nucleotide 502; 44 (8.4%) had the mutation p.R270X, which results from the substitution of thymine for cytosine at nucleotide 808; and 42 (8%) had the mutation p.R255X, which results from the substitution of thymine for cytosine at nucleotide 763. The other four mutations listed as "common" in this paper are p.R106W (thymine substituted for cytosine at nucleotide 316), p.R133C (thymine for cytosine at nucleotide 397), p.R294X (thymine for cytosine at nucleotide 880) and p.R306C (thymine for cytosine at nucleotide 916); each accounts for between 3 and 7 percent of all cases surveyed.
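The nucleotide numbers and the amino-acid numbers in those mutation names are related by simple arithmetic, since each amino acid is encoded by three nucleotides. Here is a short Python sketch of that conversion; it assumes the nucleotide positions are counted from the first base of the start codon (the usual convention for coding-sequence numbering, but an assumption on my part here), and the function name is made up for this example.

```python
def codon_of(nucleotide):
    """Map a 1-based coding-sequence position to
    (codon number, position within that codon), both 1-based."""
    codon = (nucleotide - 1) // 3 + 1
    offset = (nucleotide - 1) % 3 + 1
    return codon, offset


# The eight "common" mutations and the nucleotide each one changes,
# as listed in the paragraph above.
common_mutations = {
    "p.T158M": 473,
    "p.R168X": 502,
    "p.R270X": 808,
    "p.R255X": 763,
    "p.R106W": 316,
    "p.R133C": 397,
    "p.R294X": 880,
    "p.R306C": 916,
}

for name, nucleotide in common_mutations.items():
    codon, offset = codon_of(nucleotide)
    residue = int(name[3:-1])  # the number between the two letters
    print(f"{name}: nucleotide {nucleotide} is base {offset} of codon {codon}"
          f" (matches name: {codon == residue})")
```

All eight line up: nucleotide 473 is the second base of codon 158, and each of the seven arginine (R) mutations changes the first base of its codon.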

Another article (full text here) also found that those eight mutations occurred several times in its sample of 116 people with Rett syndrome; these researchers also found p.T158M to be the most common, present in 12 different people. (The next-most common ones were p.R270X, found in eight people, and p.R255X and p.R106W, each found in seven people.) This study also listed three other mutations in its table of "recurring" mutations: a substitution of guanine for cytosine at nucleotide 455 (observed four times), a substitution of thymine for cytosine at nucleotide 965 (observed twice), and a modification of a splice site in exon 4 (an AG sequence becomes GG; this change was also observed only twice).

RettBASE also ranks the various mutations by frequency of occurrence: there, too, p.T158M is the most common, with 363 known occurrences and accounting for 8.59% of all mutations identified so far. Most of the mutations (about two-thirds) listed there are unique. 

Rett syndrome occurs in between 1:10,000 and 1:22,000 girls, and has only been recorded in 20 boys, ever. (Usually if a boy is born with the kind of mutations that would lead to Rett syndrome in a girl, he dies). So when I say a given mutation is found in, say, 10% of all people with Rett syndrome, that would translate into a frequency of between 1:100,000 and 1:220,000 among girls, and roughly half that among all people. So, while some MECP2 mutations might be less rare than others, I'd say they're all rare.
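For anyone who wants to check that back-of-the-envelope calculation, here it is as a few lines of Python. The 10% figure is the illustrative one used above; the assumption that girls make up about half of all births is mine, and the sketch ignores the handful of recorded cases in boys.

```python
from fractions import Fraction


def one_in(frequency):
    """Express a frequency such as 1/100000 as the N in '1 in N'."""
    return round(1 / frequency)


# Range quoted above for Rett syndrome among girls.
rett_among_girls = (Fraction(1, 10_000), Fraction(1, 22_000))

# Illustrative share of Rett cases carrying one particular mutation.
mutation_share = Fraction(1, 10)

# Assumption: about half of all births are girls.
girls_share_of_births = Fraction(1, 2)

among_girls = [one_in(f * mutation_share) for f in rett_among_girls]
among_everyone = [one_in(f * mutation_share * girls_share_of_births)
                  for f in rett_among_girls]

print("Among girls: 1 in", among_girls[0], "to 1 in", among_girls[1])
print("Among everyone: 1 in", among_everyone[0], "to 1 in", among_everyone[1])
```

That works out to between 1 in 100,000 and 1 in 220,000 among girls, or between 1 in 200,000 and 1 in 440,000 if you count everyone.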

Database entries for this gene: 

AutDB, Ensembl, Entrez Gene, GeneCards, Genetics Home Reference, WikiGenes


Amir RE, Van den Veyver IB, Wan M, Tran CQ, Francke U, & Zoghbi HY (1999). Rett syndrome is caused by mutations in X-linked MECP2, encoding methyl-CpG-binding protein 2. Nature genetics, 23 (2), 185-188 PMID: 10508514 

Bienvenu, T. (2000). MECP2 mutations account for most cases of typical forms of Rett syndrome Human Molecular Genetics, 9 (9), 1377-1384 DOI: 10.1093/hmg/9.9.1377

Bienvenu T, Souville I, Poirier K, Aquaviva C, Burglen L, Amiel J, Héron B, Kaminska A, Couvert P, Beldjord C, & Chelly J (2001). Five novel frameshift mutations in exon 3 and 4 of the MECP2 gene identified in Rett patients: Consequences for the molecular diagnosis strategy. Human mutation, 18 (3), 251-252 PMID: 11524737 

Díaz de León-Guerrero, S., Pedraza-Alva, G., & Pérez-Martínez, L. (2011). In sickness and in health: the role of methyl-CpG binding protein 2 in the central nervous system European Journal of Neuroscience, 33 (9), 1563-1574 DOI: 10.1111/j.1460-9568.2011.07658.x 

Fendri-Kriaa N, Hsairi I, Kifagi C, Ellouze E, Mkaouar-Rebai E, Triki C, Fakhfakh F, & The Tunisian network on mental retardation study (2011). A case of a Tunisian Rett patient with a novel double-mutation of the MECP2 gene. Biochemical and biophysical research communications, 409 (2), 270-274 PMID: 21575601 

Free, Andrew, Robert I. D. Wakefield, Brian O. Smith, David T. F. Dryden, Paul N. Barlow, & Adrian P. Bird (2000). DNA Recognition by the Methyl-CpG Binding Domain of MeCP2 Journal of Biological Chemistry, 276 (5), 3353-3360 DOI: 10.1074/jbc.M007224200 

Hite, K., Adams, V., & Hansen, J. (2009). Recent advances in MeCP2 structure and function Biochemistry and Cell Biology, 87 (1), 219-227 DOI: 10.1139/o08-115 

Hoffbuhr K, Devaney JM, LaFleur B, Sirianni N, Scacheri C, Giron J, Schuette J, Innis J, Marino M, Philippart M, Narayanan V, Umansky R, Kronn D, Hoffman EP, & Naidu S (2001). MeCP2 mutations in children with and without the phenotype of Rett syndrome. Neurology, 56 (11), 1486-1495 PMID: 11402105 

Kudo, S., Y. Nomura, M. Segawa, N. Fujita, M. Nakao, C. Schanen, & M. Tamura (2003). Heterogeneity in residual function of MeCP2 carrying missense mutations in the methyl CpG binding domain Journal of Medical Genetics, 40 (7), 487-493 DOI: 10.1136/jmg.40.7.487 

Kumar, A., Kamboj, S., Malone, B., Kudo, S., Twiss, J., Czymmek, K., LaSalle, J., & Schanen, N. (2008). Analysis of protein domains and Rett syndrome mutations indicate that multiple regions influence chromatin-binding dynamics of the chromatin-associated protein MECP2 in vivo Journal of Cell Science, 121 (7), 1128-1137 DOI: 10.1242/jcs.016865

Lee EY, Chung HJ, Ki CS, Yoo JH, & Choi JR (2011). A novel mutation in the MECP2 gene in a Korean patient with Rett syndrome. Annals of clinical and laboratory science, 41 (1), 93-96 PMID: 21325263 

Raizis AM, Saleem M, MacKay R, & George PM (2009). Spectrum of MECP2 mutations in New Zealand Rett syndrome patients. The New Zealand medical journal, 122 (1296), 21-28 PMID: 19652677 

Singh, J., Saxena, A., Christodoulou, J., & Ravine, D. (2008). MECP2 genomic structure and function: insights from ENCODE Nucleic Acids Research, 36 (19), 6035-6047 DOI: 10.1093/nar/gkn591 

Yusufzai, Timur M., & Wolffe, Alan P. (2000). Functional consequences of Rett syndrome mutations on human MeCP2 Nucleic Acids Research, 28 (21), 4172-4179 DOI: 10.1093/nar/28.21.4172