Welcome to the MapMan Family of Software Forum

Please do not hesitate to register and post your question.

Don't forget to subscribe to your posted message so you get notified on updates.
Every question you post will help others and/or improve the software!

Post a question, post a bug!

Mercator Error
16.10.2014 12:38
Hi!

I submitted a job to Mercator (GeneralMercator422c731d242682996fcc5fab9407aadb). I was trying to annotate the protein file from ftp://ftp.solgenomics.net/genomes/Nicotiana_benthamiana/annotation/. It failed with the following error message.


: FATAL ERROR: The input contains both nucleotide and protein sequences.
Please make sure the input only contains data of either type.
Number of sequence types found in input:
Protein : 76377
Nucleotide: null

So the job supposedly failed because the file contains nucleotide sequences, yet the same error message reports that no nucleotide sequences were found?!


Thanks for your help.

Best,
Andrea

RE: Mercator Error
16.10.2014 16:54 in reply to Andrea Braeutigam.
Hi Andrea,

I guess that it is the Niben.genome.v0.4.4.proteins.fasta file that you were annotating?

I'm not sure why it failed (but I will try and run this locally).
But you are correct that the message is, at best, inaccurate (it claims to have found both nucleotide and protein sequences, then reports 'null' for nucleotides).

Cheers,
Marie.

RE: Mercator Error
17.10.2014 12:02 in reply to Andrea Braeutigam.
Hi Andrea,

I have found the problem. The code tries to determine whether each input sequence consists of nucleotides or amino acids.
In doing so, it also tries to handle sequences that use ambiguous nucleotide codes (e.g. R can be either A or G; see http://www.bioinformatics.org/sms/iupac.html for further examples).

If a FASTA entry does not contain a character that definitely characterises it as 'protein', it can fall into the nucleotide_ambiguous category.
Unfortunately, such entries are not dealt with in the code (a bug which needs fixing).
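To illustrate the kind of check described above, here is a minimal sketch in Python. It is not Mercator's actual code: the function name `classify_sequence`, the alphabets and the category labels other than nucleotide_ambiguous are assumptions made for this example.

```python
# Hypothetical sketch of a sequence-type check; NOT Mercator's implementation.

# Unambiguous nucleotide letters.
NUCLEOTIDE = set("ACGTU")
# IUPAC nucleotide codes, including ambiguity codes (R = A or G, S = C or G, ...).
IUPAC_NUCLEOTIDE = set("ACGTURYSWKMBDHVN")
# The 20 standard amino-acid letters.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")
# Letters that are amino acids but never nucleotide codes: E, F, I, L, P, Q.
PROTEIN_ONLY = AMINO_ACIDS - IUPAC_NUCLEOTIDE


def classify_sequence(seq):
    """Guess the type of one sequence from the letters it contains."""
    letters = set(seq.upper()) - set("*-")  # ignore stop and gap symbols
    if not letters:
        return "empty"
    if letters & PROTEIN_ONLY:
        return "protein"  # contains a letter only a protein can have
    if letters <= NUCLEOTIDE:
        return "nucleotide"
    if letters <= IUPAC_NUCLEOTIDE:
        return "nucleotide_ambiguous"  # e.g. a sequence of only R and S
    return "unknown"
```

Under these assumptions, a sequence made up only of R and S has no protein-only letter and is valid IUPAC nucleotide code, so it lands in the ambiguous category even though it came from a protein file.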

In the input FASTA from the link you provided, I have found two problematic entries:
NbC24331615g0001.1 This contains all Rs and one S (rather bizarre, most likely an incorrect protein).
NbS00016185g0015.1 This is below the minimum length so should not have been evaluated (will fix this also).


I have run a short test with these entries removed and it seems to run.
You can submit the job again with these entries removed. Otherwise, let me know what parameters you would like to run with and I can run it from here.
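For anyone hitting the same error, removing entries by ID can be sketched roughly as follows in Python. The function name `drop_fasta_entries` is made up for this example, and it assumes the record ID is the first whitespace-delimited token on the header line.

```python
# Hypothetical helper for dropping FASTA records by ID; not part of Mercator.

# The two entry IDs named in this thread.
PROBLEM_IDS = {"NbC24331615g0001.1", "NbS00016185g0015.1"}


def drop_fasta_entries(lines, ids_to_drop):
    """Yield FASTA lines, skipping every record whose ID is in ids_to_drop.

    `lines` is any iterable of text lines, for example an open file handle.
    """
    keep = True
    for line in lines:
        if line.startswith(">"):
            header = line[1:].split()
            record_id = header[0] if header else ""
            keep = record_id not in ids_to_drop
        if keep:
            yield line
```

To use it on a real file, open the input for reading and an output file for writing, then pass `drop_fasta_entries(infile, PROBLEM_IDS)` to the output handle's `writelines`. The filtered file can then be resubmitted.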

We are currently porting the Mercator platform to a new infrastructure, so it may be a couple of weeks before I can actually fix these bugs (I have logged them as high priority).

Cheers,
Marie.

RE: Mercator Error
20.10.2014 15:54 in reply to Marie Bolger.
Thanks! Removal of the sequences solved the problem.

Best,
Andrea