File: split_fasta.m4

include(manual.h)dnl
HEADER(split_fasta)

SECTION(NAME)
BOLD(split_fasta) - Split a fasta file according to sequence and character counts

SECTION(SYNOPSIS)
CODE(BOLD(split_fasta query_granularity character_granularity fasta_file))

SECTION(DESCRIPTION)
BOLD(split_fasta) is a simple script that splits a fasta file according to user-provided parameters.  The script iterates over the given file and starts a new subfile, input.i, each time the contents of the previous subfile, input.(i-1), exceed either the number of queries given by query_granularity or the number of characters given by character_granularity.

SECTION(OPTIONS)
OPTIONS_BEGIN
OPTIONS_END

SECTION(EXIT STATUS)
On success, returns zero.  On failure, returns non-zero.

SECTION(ENVIRONMENT VARIABLES)

SECTION(EXAMPLES)

To split the fasta file smallpks.fa into pieces of at most 500 queries each, where a piece receives no further sequences once it exceeds 10000 characters, run:
LONGCODE_BEGIN
python split_fasta 500 10000 smallpks.fa
LONGCODE_END
This generates the files input.0, input.1, ..., input.N, where N+1 is the number of pieces needed to hold all sequences in smallpks.fa within the given limits.
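
The splitting rule can be illustrated with a minimal Python sketch. It is not the script's actual source: the function name split_records is hypothetical, and the sketch assumes that only sequence characters (not header lines) count toward character_granularity and that a piece is closed only when the next sequence header is reached, so a sequence is never divided between pieces.

```python
def split_records(lines, query_granularity, character_granularity):
    """Group FASTA lines into pieces (hypothetical helper, not the real script)."""
    pieces = [[]]
    queries = 0  # sequences placed in the current piece
    chars = 0    # sequence characters placed in the current piece
    for line in lines:
        if line.startswith(">"):
            # A new sequence begins; open a new piece if the current one is
            # non-empty and has reached the query limit or exceeded the
            # character limit (assumed reading of the rule).
            full = queries >= query_granularity or chars > character_granularity
            if pieces[-1] and full:
                pieces.append([])
                queries = 0
                chars = 0
            queries += 1
        else:
            # Assumption: only sequence characters are counted.
            chars += len(line.strip())
        pieces[-1].append(line)
    return pieces


# Four tiny sequences, at most 2 queries per piece, character limit of 5.
demo = [">a", "ACGT", ">b", "AC", ">c", "ACGTACGT", ">d", "A"]
for i, piece in enumerate(split_records(demo, 2, 5)):
    print("input.%d" % i, piece)
```

In this sketch the third sequence opens a new piece because the first piece already holds two queries, and the fourth opens another because the second piece has exceeded five characters.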

SECTION(COPYRIGHT)

COPYRIGHT_BOILERPLATE

SECTION(SEE ALSO)

SEE_ALSO_MAKEFLOW

FOOTER