Name | Modified | Size | Downloads / Week |
---|---|---|---|
README.md | 2017-03-13 | 12.6 kB | |
samtools-1.4-solo.tar.bz2 | 2017-03-13 | 3.1 MB | |
samtools-1.4.tar.bz2 | 2017-03-13 | 4.2 MB | |
htslib-1.4.tar.bz2 | 2017-03-13 | 1.1 MB | |
bcftools-1.4-solo.tar.bz2 | 2017-03-13 | 1.8 MB | |
bcftools-1.4.tar.bz2 | 2017-03-13 | 2.8 MB | |
Totals: 6 Items | 12.9 MB | 1 |
HTSlib Release 1.4 (13 March 2017)
-
Incompatible changes: several functions and data types have been changed in this release, and the shared library soversion has been bumped to 2.
- bam_pileup1_t has an additional field (which holds user data)
- bam1_core_t has been modified to allow for >64K CIGAR operations and (along with bam1_t) so that CIGAR entries are aligned in memory
- hopen() has vararg arguments for setting URL scheme-dependent options
- the various tbx_conf_* presets are now const
- auxiliary fields in bam1_t are now always stored in little-endian byte order (previously this depended on if you read a bam, sam or cram file)
- index metadata (accessible via hts_idx_get_meta()) is now always stored in little-endian byte order (previously this depended on if the index was in tbi or csi format)
- bam_aux2i() now returns an int64_t value
- fai_load() will no longer save local copies of remote fasta indexes
- hts_idx_get_meta() now takes a uint32_t * for l_meta (was int32_t *)
-
HTSlib now links against libbz2 and liblzma by default. To remove these dependencies, run configure with options --disable-bz2 and --disable-lzma, but note that this may make some CRAM files produced elsewhere unreadable.
-
Added a thread pool interface and replaced the bgzf multi-threading code to use this pool. BAM and CRAM decoding is now multi-threaded too, using the pool to automatically balance the number of threads between decode, encode and any data processing jobs.
-
New errmod_cal(), probaln_glocal(), sam_cap_mapq(), and sam_prob_realn() functions, previously internal to SAMtools, have been added to HTSlib.
-
Files can now be accessed via Google Cloud Storage using gs: URLs, when HTSlib is configured to use libcurl for network file access rather than the included basic knetfile networking.
-
S3 file access now also supports the "host_base" setting in the $HOME/.s3cfg configuration file.
-
Data URLs ("data:,text") now follow the standard RFC 2397 format and may be base64-encoded (when written as "data:;base64,text") or may include percent-encoded characters. HTSlib's previous over-simplified "data:text" format is no longer supported -- you will need to add an initial comma.
-
When plugins are enabled, S3 support is now provided by a separate hfile_s3 plugin rather than by hfile_libcurl itself as previously. When --enable-libcurl is used, by default both GCS and S3 support and plugins will also be built; they can be individually disabled via --disable-gcs and --disable-s3.
-
The iRODS file access plugin has been moved to a separate repository. Configure no longer has a --with-irods option; instead build the plugin found at https://github.com/samtools/htslib-plugins.
-
APIs to portably read and write (possibly unaligned) data in little-endian byte order have been added.
-
New functions bam_auxB_len(), bam_auxB2i() and bam_auxB2f() have been added to make accessing array-type auxiliary data easier. bam_aux2i() can now return the full range of values that can be stored in an integer tag (including unsigned 32 bit tags). bam_aux2f() will return the value of integer tags (as a double) as well as floating-point ones. All of the bam_aux2 and bam_auxB2 functions will set errno if the requested conversion is not valid.
-
New functions fai_load3() and fai_build3() allow fasta indexes to be stored in a different location to the indexed fasta file.
-
New functions bgzf_index_dump_hfile() and bgzf_index_load_hfile() allow bgzf index files (.gzi) to be written to / read from an existing hFILE handle.
-
hts_idx_push() now reports an error when asked to add a range that is beyond the limits that the given index can handle. This means trying to index chromosomes longer than 2^29 bases with a .bai or .tbi index will report an error instead of apparently working but creating an invalid index entry.
-
VCF formatting is now approximately 4x faster. (Whether this is noticeable depends on what was creating the VCF.)
-
CRAM lossy_names mode now works with TLEN of 0 or TLEN within +/- 1 of the computed value. Note in these situations TLEN will be generated / fixed during CRAM decode.
-
CRAM now supports bzip2 and lzma codecs. Within htslib these are disabled by default, but can be enabled by specifying "use_bzip2" or "use_lzma" in an hts_opt_add() call or via the mode string of the hts_open_format() function.
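The data: URL change noted in the list above can be illustrated with a small sketch. This is not HTSlib's implementation; the function name decode_data_url is hypothetical, and the sketch assumes only the three forms mentioned in the notes (plain text after a comma, percent-encoded text, and a ";base64" payload).

```python
import base64
from urllib.parse import unquote_to_bytes


def decode_data_url(url: str) -> bytes:
    """Decode a minimal RFC 2397 style data: URL into its payload bytes.

    Hypothetical helper for illustration only; it ignores media types
    and any parameters other than a trailing ";base64".
    """
    scheme = "data:"
    if not url.startswith(scheme):
        raise ValueError("not a data: URL")

    # Everything before the first comma is the (possibly empty) header.
    header, comma, payload = url[len(scheme):].partition(",")
    if not comma:
        # The older "data:text" form has no comma and is rejected here.
        raise ValueError("data: URL is missing the initial comma")

    if header.split(";")[-1] == "base64":
        return base64.b64decode(payload)

    # Otherwise the payload is literal text with optional %XX escapes.
    return unquote_to_bytes(payload)
```

Under these assumptions, "data:,hello%20world" and "data:;base64,aGVsbG8=" both decode to short byte strings, while the old "data:hello" form raises an error because it lacks the comma.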
SAMTools Release 1.4 (13 March 2017)
Noteworthy changes in samtools:
-
Fixed Issue #345 - out-by-one error in insert-size in samtools stats
-
bam_split now adds a @PG header to the BAM file
-
Added mate cigar tag support to fixmate
-
Multi-threading is now supported for decoding BAM and CRAM (as well as the previously supported encoding). Most commands that read BAM or CRAM have gained a -@ or --threads argument, providing a significant speed bonus. For commands that both read and write files the threads are shared between decoding and encoding tasks.
-
Added -a option to samtools mpileup to show all locations, including sites with zero depth; repeating the option as -aa or -a -a additionally shows reference sequences without any reads mapped to them (#496).
-
The mpileup text output no longer contains empty columns at zero coverage positions. Previously it would output "...0\t\t..." in some circumstances (zero coverage due to being below a minimum base quality); this has been fixed to output as "...0\t*\t*..." with placeholder '*' characters as in other zero coverage circumstances (see PR #537).
-
To stop it from creating too many temporary files, samtools sort will now not run unless its per-thread memory limit (-m) is set to at least 1 megabyte (#547).
-
The misc/plot-bamstats script now has a -l / --log-y option to change various graphs to display their Y axis log-scaled. Currently this affects the Insert Size graph (PR #589; thanks to Anton Kratz).
-
Fixmate will now also add and update MC (mate CIGAR) tags.
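The zero-coverage fix to the mpileup text output noted above can be illustrated with a short sketch. The function name format_pileup_line and its arguments are hypothetical and do not come from samtools; the sketch assumes only the six basic text columns (reference name, position, reference base, depth, bases, qualities) and one quality character per covering read.

```python
def format_pileup_line(chrom: str, pos: int, ref_base: str,
                       bases: str, quals: str) -> str:
    """Format one text pileup line, using '*' placeholders at zero depth.

    Hypothetical helper for illustration only, not samtools code.
    Depth is taken as the number of quality characters (an assumption).
    """
    depth = len(quals)
    if depth == 0:
        # Emit placeholders instead of leaving the two columns empty.
        bases, quals = "*", "*"
    return "\t".join([chrom, str(pos), ref_base, str(depth), bases, quals])
```

With no covering reads the last three columns come out as 0, * and *, so every line keeps the same number of non-empty tab-separated fields.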
BCFTools Release 1.4 (13 March 2017)
Two new commands - mpileup and csq:
-
The mpileup command has been imported from samtools to bcftools. The reasoning behind this is that bcftools calling is intimately tied to mpileup, and any change to one often requires changes to the other. Only the genotype likelihood (BCF output) part of mpileup has moved to bcftools, while the textual pileup output remains in samtools. The BCF output option in samtools mpileup will likely be removed in a release or two, or when changes to bcftools call are incompatible with the old mpileup output.
The basic mpileup functionality remains unchanged as do most of the command line options, but there are some differences and new features that one should be aware of:
- The option samtools mpileup -t, --output-tags has changed to bcftools mpileup -a, --annotate to avoid conflict with the -t, --targets option common across other bcftools commands.
- -O, --output-BP and -s, --output-MQ are no longer used as they are only for textual pileup output, which is not included in bcftools mpileup. The -O short option is reassigned to --output-type and -s is reassigned to --samples for consistency with other bcftools commands.
- The -g, --BCF, -v, --VCF, and -u, --uncompressed options from samtools mpileup are no longer used, being replaced by the -O, --output-type option common to other bcftools commands.
- The -f, --fasta-ref option is now required by default to help avoid user errors. It can be disabled using --no-reference.
- The option -d, --depth .. max per-file depth now behaves as expected and according to the documentation, and prints meaningful diagnostics.
- The -S, --samples-file option can be used to rename samples on the fly. See man page for details.
- The -G, --read-groups functionality has been extended to allow reassignment, grouping and exclusion of read groups. See man page for details.
- The -l, --positions option is replaced by the -t, --targets and -T, --targets-file options to be consistent with other bcftools commands.
- gVCF output is supported. Per-sample gVCFs created by mpileup can be merged using bcftools merge --gvcf.
- Can generate mpileup output on multiple (indexed) regions using the -r, --regions and -R, --regions-file options. In samtools, one was restricted to a single region with the -r, --region option.
- Several speedups thanks to @jkbonfield (cf3a55a).
-
csq: New command for haplotype-aware variant consequence calling. See man page and paper.
Updates, improvements and bugfixes for many other commands:
-
annotate: --collapse option added. --mark-sites now works with VCF files rather than just tab-delimited files. Now possible to annotate a subset of samples from tab file, not just VCF file (#469). Bugfixes (#428).
-
call: New option -F, --prior-freqs to take advantage of prior knowledge of population allele frequencies. Improved calculation of the QUAL score particularly for REF sites (#449, 7c56870). PLs>=256 allowed in call -m. Bugfixes (#436).
-
concat --naive now works with vcf.gz in addition to bcf files.
-
consensus: handle variants overlapping region boundaries (#400).
-
convert: gvcf2vcf support for mpileup and GATK. New --sex option to assign sex to be used in certain output types (#500). Large speedup of --hapsample and --haplegendsample (e8e369b) especially with --threads option enabled. Bugfixes (#460).
-
cnv: improvements to output (be8b378).
-
filter: bugfixes (#406).
-
gtcheck: improved cross-check mode (#441).
-
index can now specify the path to the output index file. Also, gains the --threads option.
-
merge: Large overhaul of merge command including support for merging gVCF files created by bcftools mpileup --gvcf with the new -g, --gvcf option. New options -F to control filter logic and -0 to set missing data to REF. Resolved a number of longstanding issues (#296, #361, #401, #408, #412).
-
norm: Bugfixes (#385,#452,#439), more informative error messages (#364).
-
query: %END plus %POS0, %END0 (0-indexed) support - allows easy BED format output (#479). %TBCSQ for use with the new csq command. Bugfixes (#488,#489).
-
plugin: A number of new plugins:
- GTsubset (thanks to @dlaehnemann)
- ad-bias
- af-dist
- fill-from-fasta
- fixref
- guess-ploidy (deprecates vcf2sex plugin)
- isecGT
- trio-switch-rate
and changes to existing plugins:
- tag2tag: Added gp-to-gt, pl-to-gl and --threshold options and bugfixes (#475).
- ad-bias: New -d option for minimum depth.
- impute-info: Bugfix (49a9eaf).
- fill-tags: Added ability to aggregate tags for sample subgroups, thanks to @mh11. (#503). HWE tag added as an option.
-
mendelian: Bugfix (#566).
-
reheader: allow multispace delimiters in --samples option.
-
roh: Now possible to process multiple samples at once. This allows considerable speedups for files with thousands of samples where the cost of HMM is negligible compared to I/O and decompressing. In order to fit tens of thousands of samples in memory, a sliding HMM can be used (new --buffer-size option). Viterbi training now uses Baum-Welch algorithm, and works much better. Support for gVCFs or FORMAT/PL tags. Added -o, --output and -O, --output-type options to control output of sites or regions (compression optional). Many bugs fixed: it no longer segfaults on missing PL values, and a typo in the genetic map calculation that resulted in a slowdown and incorrect results has been corrected.
-
stats: Bugfixes (16414e6), new options --af-bins and --af-tag to control allele frequency binning of output. Per-sample genotype concordance tables added (#477).
-
view -a, --trim-alt-alleles: various bugfixes for missing data, and more informative errors should now be given on failure to pinpoint problems.
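The note on the query command above says that %POS0 together with %END allows easy BED output. The reason is that BED intervals are 0-based and half-open, so the start is the 1-based POS minus one while the 1-based inclusive END can be used unchanged. A minimal sketch of that arithmetic follows; the function name to_bed_line is hypothetical and is not part of bcftools.

```python
def to_bed_line(chrom: str, pos: int, end: int) -> str:
    """Build a BED line from a 1-based POS and a 1-based inclusive END.

    Hypothetical helper for illustration only.
    """
    if end < pos:
        raise ValueError("END must not be smaller than POS")
    start0 = pos - 1  # 0-based start, the value described as %POS0
    # BED ends are exclusive, so the 1-based inclusive END is used as-is.
    return f"{chrom}\t{start0}\t{end}"
```

A record at POS 100 with END 150 therefore maps to the BED interval 99 to 150, which spans the same 51 bases.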
General changes:
-
Timestamps are now added to header lines summarising the command (#467).
-
Use of the --threads option should be faster across the board thanks to changes in HTSlib, meaning threads are now shared by the compression and decompression calls.
-
Changes to genotype filtering with -i, --include and -e, --exclude (#454).