FOR C++ EXPERTS - need your help (Jan/06/2007 )

Hi,

I'm designing a small C++ program for simple sequence analysis (GC content, base percentages, length, ...),
and I also want to predict the mRNA transcript for the entered sequence.
I tried this, but each time I run the program the original sequence (DNA) appears at the end of the transcript (mRNA).
So if anyone has an idea about programming, please help. Thanks in advance.

-strawberry-

I'm sure I or someone else will be able to help you out, but you need to give more details. Is your program short enough to post, or can you at least show the relevant code for how you are generating your mRNA transcript?

Also, it would be simpler to do this with perl.

-jaknight-

Well, I'm just a beginner in C++.
Here is the program. Sorry, I tried to attach it as a file, but that failed.

It works fine for the length, % nucleotide, etc., but the mRNA part does not.

CODE
#include<iostream.h>
int main()
{
    char s1[200];
    char s2[20];
    int t,n,mp;
    int i, count1=0,count2=0,count3=0,count4=0;

   cout<<"enter your dna sequence: \n";
   cin>>s1;
  

    for (i=0;s1[i]!='\0';i++)
    {
        if (s1[i]=='A')
            count1=count1+1;
        else if (s1[i]=='T')
            count2=count2+1;
        else if(s1[i]=='G')
            count3=count3+1;
        else
            count4=count4+1;
        }
    
    n=count1+count2+count3+count4;
    cout<<"total length of the sequence= "<<n<<endl;

    cout<<"number of amino acids= "<<n/3<<endl;

    cout<<"%A= "<<count1*100/n<<endl;
    cout<<"%T= "<<count2*100/n<<endl;
    cout<<"%G= "<<count3*100/n<<endl;
    cout<<"%C= "<<count4*100/n<<endl;

    t=(count3+count4)*100/n;
    cout<<"GC%= "<<t<<endl;

    mp=2*(count1+count2)+4*(count3+count4);
    cout<<"melting point of this sequence= "<<mp<<endl;

    for(i=0;i<=n;i++)
    {
        if (s1[i]=='A')
            s2[i]='U';

        else if (s1[i]=='G')
            s2[i]='C';

        else if (s1[i]=='C')
            s2[i]='G';

        else if (s1[i]=='T')
            s2[i]='A';
        else
            s2[i]=' ';

    }
    cout<<"mRNA transcript is "<<s2<<endl;
return 0;
}

-strawberry-

QUOTE (jaknight @ Jan 7 2007, 02:53 PM)
Also, it would be simpler to do this with perl.


I have never used it before.

-strawberry-

One problem is that you declare s2 with a length of only 20, whereas s1 has a length of 200. This will cause problems if the sequence in s1 is longer than 20 characters, and it may be where your error arises. Try that first.
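
A small sketch of the size check this implies, assuming plain char arrays as in the posted code. A sequence of n bases needs n + 1 chars, the extra one being the terminating '\0', so a 20-element s2 holds at most 19 bases. The helper name `fitsInBuffer` is made up for illustration.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper (not from the posted program): returns true when the
// C string `seq`, plus its terminating '\0', fits into `capacity` chars.
bool fitsInBuffer(const char* seq, std::size_t capacity) {
    return std::strlen(seq) + 1 <= capacity;
}
```

Calling `fitsInBuffer(s1, sizeof s2)` before the transcript loop would flag any input that is too long for s2.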

-jaknight-

OK, I don't think that's the problem, because even when I make the s2 length 200, the original sequence appears at the end of s2.
output for s2[200]

-strawberry-

Hi.

For a start, use switch/case instead of your if/else chain.
Don't use #include<iostream.h>; use #include<iostream>. Listen to your compiler.

How is it that this works? You don't have using namespace std; anywhere, yet you write cin rather than std::cin, etc.
I would also not do two iterations over the array... do both at once... build your mRNA and take all counts at once.
You should also make your arrays the same length. If you are going to practice programming, you may as well do it properly.
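
A minimal sketch of the array advice above, assuming the transcript stays in a plain char array as in the posted program. Besides giving both arrays the same size, the output needs a terminating '\0' before it is printed. The posted second loop runs to i<=n and never writes one, so cout keeps reading whatever memory follows s2, which may well be s1. The function name `transcribe` and its parameters are hypothetical.

```cpp
#include <cstddef>

// Hypothetical helper: complements `dna` into `out` the same way the posted
// program does (A->U, T->A, G->C, C->G, anything else -> ' ').
// `outSize` is the full capacity of `out`, including room for the '\0'.
// Returns the number of bases written.
std::size_t transcribe(const char* dna, char* out, std::size_t outSize) {
    if (outSize == 0) {
        return 0;  // no room even for the terminator
    }
    std::size_t i = 0;
    // Stop at the end of the input or when only the terminator slot is left.
    for (; dna[i] != '\0' && i + 1 < outSize; ++i) {
        switch (dna[i]) {
            case 'A': out[i] = 'U'; break;
            case 'T': out[i] = 'A'; break;
            case 'G': out[i] = 'C'; break;
            case 'C': out[i] = 'G'; break;
            default:  out[i] = ' '; break;
        }
    }
    out[i] = '\0';  // terminate so that cout << out stops here
    return i;
}
```

With `char s2[200];`, a call such as `transcribe(s1, s2, sizeof s2)` followed by `cout << s2` should then print only the transcript.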

I must agree with jaknight: this kind of problem is a Perl / Python / Ruby problem, not a C++ one. In addition, the way this is written, it is actually pretty much C.

Please put your code in code blocks; that should enable a quick copy, compile, edit.

-perlmunky-

QUOTE (perlmunky @ Jan 8 2007, 09:49 AM)
Hi.


I would also not do two iterations over the array... do both at once... build your mRNA and take all counts at once.


Hi perlmunky, thanks for your recommendations, I'll give it a try.

But I didn't understand your comment quoted above. Can you explain it, please?

-strawberry-

ok so you have your code like this:

CODE
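#include <cstddef>
#include <iostream>
#include <string>

int main() {
    std::string seq;                   // DNA sequence, populated from stdin
    std::cin >> seq;

    std::string rna(seq.size(), ' ');  // transcript, same length as seq
    int baseCount[4] = {0, 0, 0, 0};   // counts for A, C, G, T

    // You do this:  for (i=0;s1[i]!='\0';i++)  to count bases,
    // then this:    for (i=0;i<=n;i++)         to make the rna,
    // where you could do both in one loop:
    for (std::size_t i = 0; i < seq.size(); i++) {
        // use switch case
        char base = seq[i];
        switch (base) {
            case 'A':
                rna[i] = 'U';
                baseCount[0] += 1;
                break;
            case 'C':
                rna[i] = 'G';
                baseCount[1] += 1;
                break;
            case 'G':
                rna[i] = 'C';
                baseCount[2] += 1;
                break;
            case 'T':
                rna[i] = 'A';
                baseCount[3] += 1;
                break;
            default:
                std::cout << "Sorry but " << base << " is an unknown case!" << std::endl;
                break;
        }
    }

    std::cout << "mRNA transcript is " << rna << std::endl;
    return 0;
}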

This saves you looping over the same array twice. In this case, because of the small size of your input, the two loops are fine but add needless typing.
Additionally, as you are learning C++ rather than C, you should create an object called, say, Dna. Dna should contain the methods toRna, baseCount and meltingPoint:

toRna - takes the DNA and converts it to an mRNA sequence.
baseCount - takes the DNA or mRNA and counts the number of each base.
meltingPoint - computes the melting point of the sequence.

you could then say something like:

CODE
// create a new Dna object (assumes a Dna class with these methods exists)
Dna myDNA("ATCGGGGTTCCCC");

std::string mRNA = myDNA.toRna();
double meltingPoint = myDNA.meltingPoint();

All of that nasty looking switch case is hidden in a nice .h file.
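
What such a header might contain can be sketched roughly as follows. This is only one possible layout, assuming the sequence is stored in a std::string. The method names follow the post, while the constructor, the private member `seq_` and the A, C, G, T order of the counts are made up for illustration. The melting point uses the same 2 per A/T plus 4 per G/C rule as the posted program.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <utility>

// Illustrative class only; not an existing library type.
class Dna {
public:
    explicit Dna(std::string sequence) : seq_(std::move(sequence)) {}

    // Complements each base as the posted program does
    // (A->U, T->A, G->C, C->G); any other character becomes ' '.
    std::string toRna() const {
        std::string rna;
        rna.reserve(seq_.size());
        for (char base : seq_) {
            switch (base) {
                case 'A': rna += 'U'; break;
                case 'T': rna += 'A'; break;
                case 'G': rna += 'C'; break;
                case 'C': rna += 'G'; break;
                default:  rna += ' '; break;
            }
        }
        return rna;
    }

    // Counts in the order A, C, G, T; other characters are ignored.
    std::array<std::size_t, 4> baseCount() const {
        std::array<std::size_t, 4> counts{};  // all zero
        for (char base : seq_) {
            switch (base) {
                case 'A': ++counts[0]; break;
                case 'C': ++counts[1]; break;
                case 'G': ++counts[2]; break;
                case 'T': ++counts[3]; break;
                default:  break;
            }
        }
        return counts;
    }

    // Rule of thumb from the posted program: 2*(A+T) + 4*(G+C).
    double meltingPoint() const {
        const std::array<std::size_t, 4> c = baseCount();
        return 2.0 * (c[0] + c[3]) + 4.0 * (c[1] + c[2]);
    }

private:
    std::string seq_;
};
```

With this, `Dna("ATCGGGGTTCCCC").toRna()` gives "UAGCCCCAAGGGG", and main() no longer needs to know about the switch at all.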

Hope that is of some help.

I suggest you also have a look at the BOOST libraries if you have not done so already.

good luck - C++ is a bitch.

-perlmunky-

Thanks again, I'll go through it once more.

-strawberry-
