Rahul Patharkar | Hi Marc,
It is great that you are making it easier for people to handle Illumina data (without having to install Linux and learn all the command line stuff)! I am getting an error when trying to map reads to the Arabidopsis genome. I am trying to map my reads to the following TAIR10 genome and annotation files:
ftp://ftp.arabidopsis.org/home/tair/Genes/TAIR10_genome_release/TAIR10_chromosome_files/TAIR10_chr_all.fas
ftp://ftp.arabidopsis.org/home/tair/Genes/TAIR10_genome_release/TAIR10_gff3/TAIR10_GFF3_genes.gff
After a few moments of reading the reference file I get this error:
java.lang.RuntimeException: Gene null is on more than one chromosome (Chr1 and Chr2)
    at matrix.GeneGC$Gene.addExon(GeneGC.java:226)
    at matrix.GeneGC$Gene.access$100(GeneGC.java:205)
    at matrix.GeneGC.calc(GeneGC.java:114)
    at de.mpimp.golm.robin.GUI.RNASeq.mapping.RNASeqReferenceGenomePanel$8.doInBackground(RNASeqReferenceGenomePanel.java:428)
    at de.mpimp.golm.robin.GUI.RNASeq.mapping.RNASeqReferenceGenomePanel$8.doInBackground(RNASeqReferenceGenomePanel.java:406)
    at javax.swing.SwingWorker$1.call(Unknown Source)
    at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at javax.swing.SwingWorker.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
After this error is thrown, RobiNA continues to say "Reading reference file" indefinitely. Are there other files for the Arabidopsis reference that I can use that will work? Any help on this would be much appreciated.
Thanks, Rahul Patharkar
From: Marc Lohse [lohse@mpimp-golm.mpg.de]
Sent: Monday, February 25, 2013 10:40 AM
To: Patharkar, Osric R.
Subject: Re: RobiNA fails reading refence file
Hi Rahul,
I have now identified the cause of your problem. The GFF3 file supplied by TAIR uses different chromosome names than those given in the corresponding FASTA file (also supplied by TAIR)! In the GFF3 file, the chromosomes are called "Chr1", "Chr2", etc., while in the actual genome FASTA file, the names are just numbers: "1", "2", etc. When RobiNA reads the two files, it can't match the gene annotation information given in the GFF3 to the corresponding sequences in the FASTA, and hence finds no genes at all and throws the error message.
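A quick way to confirm this kind of mismatch before loading the files is to compare the sequence names on both sides. The following is only a minimal sketch, not part of RobiNA; the function names are made up for illustration, and it assumes the FASTA name is the first word after ">" and the GFF3 sequence name is the first tab-separated column.

```python
def fasta_names(lines):
    """Collect sequence names (first word after '>') from FASTA lines."""
    names = set()
    for line in lines:
        if line.startswith(">"):
            words = line[1:].split()
            if words:
                names.add(words[0])
    return names


def gff3_seqids(lines):
    """Collect seqids (column 1) from GFF3 feature lines."""
    names = set()
    for line in lines:
        if line.startswith("##FASTA"):
            break  # an embedded sequence section follows; no more features
        if line.startswith("#") or not line.strip():
            continue  # skip comments, directives, and blank lines
        names.add(line.split("\t", 1)[0])
    return names


def unmatched_seqids(fasta_lines, gff3_lines):
    """Return GFF3 seqids that have no sequence of the same name in the FASTA."""
    return sorted(gff3_seqids(gff3_lines) - fasta_names(fasta_lines))
```

Both functions accept any iterable of lines, so open file handles can be passed directly. An empty result means every name used in the GFF3 also appears in the FASTA.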
This is annoying, especially since one would expect that two files from the same source are consistent with respect to the identifiers they use. Anyhow, the problem is easy to solve. Either rename the header lines in the FASTA file to give the chromosome names used in the GFF3 file or vice versa.
For example, if you rename

>1 CHROMOSOME dumped from ADB: Feb/3/09 16:9; last updated: 2009-02-02

to

>Chr1 CHROMOSOME dumped from ADB: Feb/3/09 16:9; last updated: 2009-02-02

(and do the same for all the other header lines starting with a ">" character) GFF3 and FASTA are "in sync" again and RobiNA should be able to work with the files.
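For a large genome file, editing every header by hand is tedious, so the renaming can be scripted. The following is a minimal sketch under stated assumptions, not an official RobiNA or TAIR tool: the function names, the default "Chr" prefix, and the optional `mapping` argument are all hypothetical, and the sequence name is taken to be the text between ">" and the first space.

```python
def rename_header(line, prefix="Chr", mapping=None):
    """Return the line without its line ending, renaming it if it is a FASTA header.

    Names listed in `mapping` are replaced by the mapped value; names that
    already start with `prefix` are kept; all others get `prefix` prepended.
    """
    text = line.rstrip("\r\n")
    if not text.startswith(">"):
        return text  # sequence line: leave untouched
    name, sep, rest = text[1:].partition(" ")
    if not name:
        return text  # header without a name: leave untouched
    if mapping and name in mapping:
        new_name = mapping[name]
    elif name.startswith(prefix):
        new_name = name
    else:
        new_name = prefix + name
    return ">" + new_name + sep + rest


def rename_fasta(src_path, dst_path, prefix="Chr", mapping=None):
    """Copy a FASTA file to dst_path, rewriting only the header lines."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            dst.write(rename_header(line, prefix, mapping) + "\n")
```

A call might look like `rename_fasta("genome.fas", "genome_renamed.fas")`, where both file names are placeholders. If some sequences in the GFF3 do not follow the "Chr" plus number pattern, an explicit `mapping` from FASTA name to GFF3 name can be supplied after checking the names actually used in both files.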
I hope this helps, best greetings, Marc
Marc Lohse, PhD
Max Planck Institute of Molecular Plant Physiology
AG Integrative Carbon Biology
Am Muehlenberg 1
14476 Potsdam-Golm
Tel.: +49 331 5678157
email lohse@mpimp-golm.mpg.de
http://tinyurl.com/IntegrativeCarbonBiology
--------------------------------------------------
From: Patharkar, Osric R. [patharkaro@missouri.edu]
Sent: Tuesday, February 26, 2013 03:56
To: Marc Lohse
Subject: RE: RobiNA fails reading refence file
Hi Marc,
Thanks for the response. The solution you suggested below did not work for me. In fact the testing data set on the RobiNA page uses the "Chr" naming scheme on both the fasta and gff files but I still get the same error when RobiNA is "Reading the reference file". Anyway, help would be greatly appreciated. Thanks, Rahul
Hi Rahul,
I think I have also eliminated the problem that nagged you. I will shortly upload new versions of RobiNA with which this GFF import should work. Please download and install this new version (it will be version 1.2.4), and allow maybe one more hour before the packages are available on our website.
best greetings, Marc |