<!-- Preferences start here
   * Set PAGETITLE = Barrick Lab :: Publicly Archiving Data
Preferences end here -->
<noautolink>

---+ Publicly Archiving Data

These repositories provide accession numbers for data that cannot easily be included as supplementary information for a research report. An advantage of submitting to these public databases is that your data will be archived in standard formats that others can use more easily.

---++ Submitting Sequences to GenBank

The easiest way to submit a few sequences is to use the [[http://www.ncbi.nlm.nih.gov/WebSub/?tool=genbank][BankIt]] web submission tool. The Geneious submission tool does not properly format GenBank submissions as of v6.06 (@JEB).

---++ Submitting Genome Sequencing Reads to the SRA

| [[http://www.ncbi.nlm.nih.gov/sra][%ICON{external}%]] SRA main page |
| [[https://pdalogin.ncbi.nlm.nih.gov/uls.cgi?url=http://trace.ncbi.nlm.nih.gov/Traces/sra_sub/sub.cgi?login=pda][%ICON{external}%]] SRA login page |
| [[http://www.ncbi.nlm.nih.gov/books/NBK47528/][%ICON{external}%]] NCBI online SRA manual |

Notes:
   * You cannot change aliases in the normal upload area, so be careful to enter them correctly the first time!
   * It is easiest to upload uncompressed FASTQ files.
   * Illumina uploads must be in [[http://en.wikipedia.org/wiki/FASTQ_format][Illumina 1.5+ FASTQ format]], not converted to Sanger FASTQ format.
   * The flow cell number and lane are [[http://en.wikipedia.org/wiki/FASTQ_format][encoded in the name of every read in the FASTQ]].
   * Paired-end or mate-paired FASTQ files must be interleaved: one file that alternates corresponding first and second reads, rather than all of the first reads in one file and all of the second reads in another. The script [[%ATTACHURL%/interleave_paired_fastq.pl.txt][interleave_paired_fastq.pl]] can construct the interleaved file.
   * The script [[%ATTACHURL%/estimate_insert_length.sh][estimate_insert_length.sh]] can be used to estimate the fragment size of a paired library, which is needed to complete the corresponding submission fields.
   * Use the <code>md5sum</code> command to calculate the MD5 checksum for FASTQ files.

---++ Dryad

[[http://www.datadryad.org/][Dryad]] is especially well suited to submitting large data tables and analysis scripts (e.g., in R).
Topic revision: r5 - 2014-12-08 - JeffreyBarrick