The software runs on Linux (for example, Ubuntu) and Windows (7 or later).
It requires the Java Runtime Environment (JRE), Java SE 8u241 or higher.
Please extract the downloaded ZIP file to the hard disk before running the program.
Here we use NGSMLEditor to create an NGSML file from a file in FASTQ format. The content of the FASTQ file is:
Step 1: New NGSML File
Click the "New" button to create a new NGSML file.
Step 2: Add Sequence
Right-click the "ngs" node to open the popup menu, then click the "list_of_seqs" menu item to create the new node.
Right-click the "list_of_seqs" node to add a "seq" child node.
Right-click the "seq" node to add the "nid" attribute.
Right-click the "nid" node and select the "Edit" menu item to edit the value of the node.
Add an "origin" node in the same way as the "nid" node. Right-click the "origin" node and select the "Edit" menu item to enter the sequence.
Step 3: Add Quality Score
Right-click the "ngs" node to add the "list_of_quals" node.
Add the "nid" and "origin" nodes. Right-click the "origin" node to enter the quality score.
Step 4: Add Sequence Information
Right-click the "ngs" node to add the "list_of_seqinfos" node.
Right-click the "list_of_seqinfos" node to add a "seqinfo" node.
Right-click the "seqinfo" node to add a "seq" node.
Right-click the "seq" node to add the "seqref" attribute.
Right-click the "seqinfo" node to add a "qual" node.
Add the "seqref" and "qualref" nodes to the "seq" and "qual" nodes, respectively.
Right-click the "seqref" and "qualref" nodes to enter the reference values.
In this example, the sequence of the first record is "s1" and its quality score is "q1".
Add the second record of the FASTQ file in the same way as the first record.
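
To illustrate how the references tie a record together, here is a minimal Python sketch that follows "seqref" and "qualref" back to the stored sequence and quality score. The element names come from the steps above. Several details are assumptions made for illustration, not taken from the NGSML schema: that "nid", "seqref" and "qualref" are XML attributes, that the children of "list_of_quals" are named "qual", and that "s1" and "q1" are "nid" values. The sequence and quality strings are made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical NGSML fragment. Element names follow the walkthrough;
# the attribute layout and all values are assumptions for illustration.
NGSML = """
<ngs>
  <list_of_seqs>
    <seq nid="s1"><origin>GATTACA</origin></seq>
  </list_of_seqs>
  <list_of_quals>
    <qual nid="q1"><origin>IIIIIII</origin></qual>
  </list_of_quals>
  <list_of_seqinfos>
    <seqinfo>
      <seq seqref="s1"/>
      <qual qualref="q1"/>
    </seqinfo>
  </list_of_seqinfos>
</ngs>
"""


def resolve_records(root):
    """Return (sequence, quality) pairs by following seqref/qualref."""
    # Index the stored sequences and quality scores by their nid.
    seqs = {s.get("nid"): s.findtext("origin") for s in root.find("list_of_seqs")}
    quals = {q.get("nid"): q.findtext("origin") for q in root.find("list_of_quals")}
    records = []
    for info in root.find("list_of_seqinfos"):
        seq_id = info.find("seq").get("seqref")
        qual_id = info.find("qual").get("qualref")
        records.append((seqs[seq_id], quals[qual_id]))
    return records


root = ET.fromstring(NGSML.strip())
records = resolve_records(root)
```

Keeping sequences and quality scores in separate lists and linking them by identifier means a record holds only two short references rather than copies of the data.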
Step 5: Save NGSML File
In the end, the sequences are stored in the "list_of_seqs" node, the quality scores in the "list_of_quals" node,
and the FASTQ records in the "list_of_seqinfos" node.
Click the "Save" button to save the NGSML file. The result is shown below.
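
As a rough illustration of the three-part layout built in the steps above (not the editor's actual output), the following Python sketch assembles the same kind of tree and serializes it. It assumes that "nid", "seqref" and "qualref" are XML attributes, that entries under "list_of_quals" are named "qual", and that identifiers follow an "s1"/"q1" numbering pattern. The two input records are invented.

```python
import xml.etree.ElementTree as ET


def build_ngsml(records):
    """Build an NGSML-like tree from (sequence, quality) string pairs."""
    ngs = ET.Element("ngs")
    seqs = ET.SubElement(ngs, "list_of_seqs")
    quals = ET.SubElement(ngs, "list_of_quals")
    infos = ET.SubElement(ngs, "list_of_seqinfos")
    for i, (sequence, quality) in enumerate(records, start=1):
        seq_id, qual_id = f"s{i}", f"q{i}"  # assumed naming pattern
        # Step 2: store the sequence under list_of_seqs.
        seq = ET.SubElement(seqs, "seq", nid=seq_id)
        ET.SubElement(seq, "origin").text = sequence
        # Step 3: store the quality score ("qual" tag name is assumed).
        qual = ET.SubElement(quals, "qual", nid=qual_id)
        ET.SubElement(qual, "origin").text = quality
        # Step 4: link both through a seqinfo record.
        info = ET.SubElement(infos, "seqinfo")
        ET.SubElement(info, "seq", seqref=seq_id)
        ET.SubElement(info, "qual", qualref=qual_id)
    return ngs


# Made-up two-record input, not the manual's sample file.
tree = build_ngsml([("GATTACA", "IIIIIII"), ("ACGT", "FFFF")])
xml_text = ET.tostring(tree, encoding="unicode")
```

Here `xml_text` holds the serialized document as a string; writing it to disk would correspond to Step 5.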
4. Conversion Between NGS Files and NGSML Files
Users can also use NGSMLEditor to convert between NGS files and NGSML files.
At present, NGSMLEditor supports three formats: FASTA, FASTQ, and SAM.
NGSMLEditor can be executed on Windows and Linux systems.
Format conversion can be invoked from the GUI or from the command line.
4.1 Using NGSMLEditor GUI
4.1.1 Convert FASTQ To NGSML
1. Use the "Add" button to add a FASTQ file.
2. Select FASTQ for "Input" and NGSML for "Output".
3. For "Output Dir", select a folder using the "Browse" button.
4. Click the "Start" button.
4.1.2 Convert NGSML To FASTQ
1. Use the "Add" button to add an NGSML file.
2. Select NGSML for "Input" and FASTQ for "Output".
3. For "Output Dir", select a folder using the "Browse" button.
4. Click the "Start" button.
4.2 Using NGSMLEditor Command Line
The examples below are executed on a Linux system.
Enter "java -jar NGSMLEditor.jar -h" to show the help.
4.2.1 Convert SAM To NGSML
Enter "java -jar NGSMLEditor.jar -c sam2ngsml -i input_path -o output_path" to convert SAM to NGSML.
4.2.2 Convert NGSML To SAM
Enter "java -jar NGSMLEditor.jar -c ngsml2sam -i input_path -o output_path" to convert NGSML to SAM.
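
For batch use, the two commands above can be wrapped in a script. The following Python sketch only uses the "-c", "-i" and "-o" options and the two mode names shown in this section. The helper names, the file names in the usage line, and the idea that the tool reports failure through a non-zero exit status are assumptions.

```python
import subprocess

# Conversion modes that appear in this manual; no other modes are assumed.
KNOWN_MODES = {"sam2ngsml", "ngsml2sam"}


def build_command(mode, input_path, output_path, jar="NGSMLEditor.jar"):
    """Return the argument list for one conversion run."""
    if mode not in KNOWN_MODES:
        raise ValueError(f"unknown conversion mode: {mode}")
    return ["java", "-jar", jar, "-c", mode, "-i", input_path, "-o", output_path]


def convert(mode, input_path, output_path):
    """Run the conversion (hypothetical helper).

    check=True raises CalledProcessError on a non-zero exit status;
    whether NGSMLEditor signals errors that way is an assumption.
    """
    subprocess.run(build_command(mode, input_path, output_path), check=True)


# Placeholder paths for illustration only.
cmd = build_command("sam2ngsml", "reads.sam", "out")
```

Passing the arguments as a list rather than a single string avoids shell quoting problems when paths contain spaces.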
5. Examples of NGSML format
5.1 FASTA NGSML
The content of the FASTA sample file.
The content of the corresponding NGSML file.
5.2 FASTQ NGSML
The content of the FASTQ sample file.
The content of the corresponding NGSML file.
5.3 SAM NGSML
The content of the SAM sample file.
The content of the corresponding NGSML file.
5.4 BAM NGSML
The content of the BAM sample file.
A BAM file is compressed in the BGZF format. Decompressing the BAM file yields the corresponding content in BAM format.
The Sequence Alignment/Map Format Specification can be found at https://github.com/samtools/hts-specs.
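
Because BGZF blocks are valid gzip members, a standard gzip reader can perform the decompression step. The Python sketch below decompresses the start of a stream and checks for the 4-byte BAM magic ("BAM" followed by byte 1) defined in the specification. The function names are hypothetical, and the input is a synthetic in-memory stand-in, not a real alignment file.

```python
import gzip
import io

BAM_MAGIC = b"BAM\x01"  # first four bytes of decompressed BAM content


def read_magic(stream):
    """Decompress the start of a BGZF/gzip stream and return 4 bytes."""
    with gzip.GzipFile(fileobj=stream) as gz:
        return gz.read(4)


def looks_like_bam(stream):
    """Return True if the decompressed content starts with the BAM magic."""
    return read_magic(stream) == BAM_MAGIC


# Synthetic stand-in: gzip-compressed bytes that merely begin with the magic.
fake_bam = io.BytesIO(gzip.compress(BAM_MAGIC + b"\x00" * 8))
is_bam = looks_like_bam(fake_bam)
```

For a file on disk, `open(path, "rb")` could be passed in place of the in-memory stream.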
The content of the corresponding NGSML file. Because the NGSML file is too long, only the r001 read is displayed here.
5.5 CAF NGSML
The content of the CAF sample file.
The content of the corresponding NGSML file.
5.6 VCF NGSML
The content of the VCF sample file.
The content of the corresponding NGSML file.
5.7 BED NGSML
The content of the BED sample file.
The content of the corresponding NGSML file. Because the NGSML file is too long, only the first line is displayed here.
5.8 GTF NGSML
The content of the GTF sample file.
The content of the corresponding NGSML file. Because the NGSML file is too long, only the first line is displayed here.
5.9 GFF3 NGSML
The content of the GFF3 sample file.
The content of the corresponding NGSML file. Because the NGSML file is too long, only the first line is displayed here.
The program and datasets are free to use. For any questions, please do not hesitate to contact us:
Center for Systems Biology (CSB), Soochow University; No.1 Shizi Street, Suzhou, Jiangsu, China
E-mail: bairong.shen@scu.edu.cn, yucj@siso.edu.cn.
Copyright © 2020 CSB. All Rights Reserved.