Name | Modified | Size | Downloads / Week |
---|---|---|---|
README.txt | 2023-07-25 | 15.8 kB | |
samtools-1.18.tar.bz2 | 2023-07-25 | 9.1 MB | |
htslib-1.18.tar.bz2 | 2023-07-25 | 4.8 MB | |
bcftools-1.18.tar.bz2 | 2023-07-25 | 7.7 MB | |
Totals: 4 Items | | 21.6 MB | 4 |
------------------------------------------------------------------------------
htslib - changes v1.18
------------------------------------------------------------------------------

Updates
-------

* Using CRAM 3.1 no longer gives a warning about the specification
  being draft. Note CRAM 3.0 is still the default output format.
  (PR#1583)

* Replaced use of sprintf with snprintf, to silence potential warnings
  from Apple's compilers and those who implement similar checks.
  (PR#1594, fixes #1586. Reported by Oleksii Nikolaienko)

* Fastq output will now generate empty records for reads with no
  sequence data (i.e. sequence is "*" in SAM format).
  (PR#1576, fixes samtools/samtools#1576. Reported by Nils Homer)

* CRAM decoding speed-ups. (PR#1580)

* A new MN aux tag can now be used to verify that MM/ML base
  modification data has not been broken by hard clipping.
  (PR#1590, PR#1612. See also PR samtools/hts-specs#714 and issue
  samtools/hts-specs#646. Reported by Jared Simpson)

* The base modification API has been improved to make it easier for
  callers to tell unchecked bases from unmodified ones.
  (PR#1636, fixes #1550. Requested by Chris Wright)

* A new bam_mods_queryi() API has been added to return additional data
  about the i-th base modification returned by bam_mods_recorded().
  (PR#1636, fixes #1550 and #1635. Requested by Jared Simpson)

* Speed up index look-ups for whole-chromosome queries. (PR#1596)

* Mpileup now merges adjacent (mis)match CIGAR operations, so CIGARs
  using the X/= operators give the same results as if the M operator
  was used.
  (PR#1607, fixes #1597. Reported by Marcel Martin)

* It's now possible to call bcf_sr_set_regions() after adding readers
  using bcf_sr_add_reader() (previously this returned an error). Doing
  so will discard any unread data, and reset the readers so they iterate
  over the new regions.
  (PR#1624, fixes samtools/bcftools#1918.
  Reported by Gregg Thomas)

* The synced BCF reader can now accept regions with reference names
  including colons and hyphens, by enclosing them in curly braces. For
  example, {chr_part:1-1001}:10-20 will return bases 10 to 20 from
  reference "chr_part:1-1001".
  (PR#1630, fixes #1620. Reported by Bren)

* Add a "samples" directory with code demonstrating usage of HTSlib plus
  a tutorial document. (PR#1589)

Build changes
-------------

* Htscodecs has been updated to 1.5.1 (PR#1654)

* Htscodecs SIMD code now works with Apple multiarch binaries.
  (PR#1587, HTSlib fix for samtools/htscodecs#76. Reported by
  John Marshall)

* Improve portability of "expr" usage in version.sh.
  (PR#1593, fixes #1592. Reported by John Marshall)

* Improve portability to *BSD targets by ensuring _XOPEN_SOURCE is
  defined correctly and that source files properly include "config.h".
  Perl scripts also now all use #!/usr/bin/env instead of assuming that
  it's in /usr/bin/perl.
  (PR#1628, fixes #1606. Reported by Robert Clausecker)

* Fixed NAME entry in htslib-s3-plugin man page so the whatis and
  apropos commands find it.
  (PR#1634, thanks to Étienne Mollier)

* Assorted dependency tracking fixes.
  (PR#1653, thanks to John Marshall)

Documentation updates
---------------------

* Changed Alpine build instructions as they've switched back to using
  openssl. (PR#1609)

* Recommend using -rdynamic when statically linking a libhts.a with
  plugins enabled.
  (PR#1611, thanks to John Marshall. Fixes #1600, reported by
  Jack Wimberley)

* Fixed example in docs for sam_hdr_add_line().
  (PR#1618, thanks to kojix2)

* Improved test harness for base modifications API. (PR#1648)

Bug fixes
---------

* Fix a major bug when searching against a CRAM index where one
  container has start and end coordinates entirely contained within the
  previous container. This would occasionally miss data, and sometimes
  return much more than required.
  The bug affected versions 1.11 to 1.17, although the change in 1.11
  was bug-fixing multi-threaded index queries. This bug did not affect
  index building. There is no need to reindex your CRAM files.
  (PR#1574, PR#1640. Fixes #1569, #1639, samtools/samtools#1808,
  samtools/samtools#1819. Reported by xuxif, Jens Reeder and
  Jared Simpson)

* Prevent CRAM blocks from becoming too big in files with short
  sequences but very long aux tags. (PR#1613)

* Fix bug where the CRAM decoder for CONST_INT and CONST_BYTE codecs may
  incorrectly look for extra data in the CORE block. Note that this bug
  only affected the experimental CRAM v4.0 decoder. (PR#1614)

* Fix crypt4gh redirection so it works in conjunction with non-file IO,
  such as using htsget. (PR#1577)

* Improve error checking for the VCF POS column, when facing invalid
  data.
  (PR#1575, replaces #1570 originally reported and fixed by
  Colin Nolan.)

* Improved error checking on VCF indexing to validate the data is BGZF
  compressed. (PR#1581)

* Fix bug where bin number calculation could overflow when making
  iterators over regions that go to the end of a chromosome. (PR#1595)

* Backport attractivechaos/klib#78 (by Pall Melsted) to HTSlib.
  Prevents infinite loops in kseq_read() when reading broken gzip files.
  (PR#1582, fixes #1579. Reported by Goran Vinterhalter)

* Backport attractivechaos/klib@384277a (by innoink) to HTSlib.
  Fixes the kh_int_hash_func2() macro definition.
  (PR#1599, fixes #1598. Reported by fanxinping)

* Remove a compilation warning on systems with newer libcurl releases.
  (PR#1572)

* Windows: Fixed BGZF EOF check for recent MinGW releases.
  (PR#1601, fixes samtools/bcftools#1901)

* Fixed bug where tabix would not return the correct regions for files
  where the column ordering is end, ..., begin instead of
  begin, ..., end.
  (PR#1626, fixes #1622. Reported by Hiruna Samarakoon)

* sam_format_aux1() now always NUL-terminates Z/H tags. (PR#1631)

* Ensure base modification iterator is reset when no MM tag is present.
  (PR#1631, PR#1647)

* Fix segfault when attempting to write an uncompressed BAM file opened
  using hts_open(name, "wbu"). This was attempting to write BAM data
  without wrapping it in BGZF blocks, which is invalid according to the
  BAM specification. "wbu" is now internally converted to "wb0" to
  output uncompressed data wrapped in BGZF blocks.
  (PR#1632, fixes #1617. Reported by Joyjit Daw)

* Fixed over-strict bounds check in probaln_glocal() which caused it to
  make sub-optimal alignments when the requested band width was greater
  than the query length.
  (PR#1616, fixes #1605. Reported by Jared Simpson)

* Fixed possible double frees when handling errors in
  bcf_hdr_add_hrec(), if particular memory allocations fail. (PR#1637)

* Ensure that bcf_hdr_remove() clears up all pointers to the items
  removed from dictionaries. Failing to do this could have resulted in
  a call requesting a deleted item via bcf_hdr_get_hrec() returning a
  stale pointer. (PR#1637)

* Stop the gzip decompressor from finishing prematurely when an empty
  gzip block is followed by more data. (PR#1643, PR#1646)

------------------------------------------------------------------------------
samtools - changes v1.18
------------------------------------------------------------------------------

New work and changes:

* Add minimiser sort option to collate by an indexed fasta. Expand the
  minimiser sort to arrange the minimiser values in the same order as
  they occur in the reference genome. This acts as an extremely crude
  and simplistic read aligner that can be used to boost read
  compression. (PR#1818)

* Add a --duplicate-count option to markdup. Adds the number of
  duplicates (including itself) to the original read in a 'dc' tag.
  (PR#1816. Thanks to wulj2)

* Make calmd handle unaligned data or empty files without throwing an
  error. This is to make pipelines work more smoothly. A warning will
  still be issued.
  (PR#1841, fixes #1839. Reported by Filipe G.
  Vieira)

* Consistent, more comprehensive flag filtering for fasta/fastq. Added
  --rf/--incl[ude]-flags and long options for -F (--excl[ude]-flags)
  and -f (--require-flags).
  (PR#1842. Thanks to Devang Thakkar)

* Apply fastq --input-fmt-option settings. Previously any options
  specified were not being applied to the input file.
  (PR#1855. Thanks to John Marshall)

* Add fastq -d TAG[:VAL] check. This mirrors view -d and will only
  output alignments that match TAG (and VAL if specified).
  (PR#1863, fixes #1854. Requested by Rasmus Kirkegaard)

* Extend import --order TAG to --order TAG:length. If length is
  specified, the tag format goes from integer to a 0-padded string
  format. This is a workaround for BAM and CRAM that cannot encode an
  order tag of over 4 billion records.
  (PR#1850, fixes #1847. Reported by Feng Tian)

* New -aa mode for consensus. This works like the -aa option in depth
  and mpileup. The single 'a' reports all bases in contigs covered by
  alignments. Double 'aa' (or '-a -a') reports Ns even for the
  references with no alignments against them.
  (PR#1851, fixes #1849. Requested by Tim Fennell)

* Add long option support to samtools index.
  (PR#1872, fixes #1869. Reported by Jason Bacon)

* Be consistent with rounding of "average length" in samtools stats.
  (PR#1876, fixes #1867. Reported by Jelinek-J)

* Add option to ampliconclip that marks reads as unmapped when they do
  not have enough aligned bases left after clipping. Default is to
  unmap reads with zero aligned bases.
  (PR#1865, fixes #1856. Requested by ces)

Bug Fixes:

* [From HTSLib] Fix a major bug when searching against a CRAM index
  where one container has start and end coordinates entirely contained
  within the previous container. This would occasionally miss data, and
  sometimes return much more than required. The bug affected versions
  1.11 to 1.17, although the change in 1.11 was bug-fixing
  multi-threaded index queries. This bug did not affect index building.
  There is no need to reindex your CRAM files.
  (PR#samtools/htslib#1574, PR#samtools/htslib#1640. Fixes
  #samtools/htslib#1569, #samtools/htslib#1639, #1808, #1819.
  Reported by xuxif, Jens Reeder and Jared Simpson)

* Fix a sort -M bug (regression) when merging sub-blocks. Data was
  valid but in a poor order for compression. (PR#1812)

* Fix bug in split output format. Now SAM and CRAM format can be chosen
  as well as BAM. Also a documentation change, see below. (PR#1821)

* Add error checking to view -e filter expression code. Invalid
  expressions were not returning an error code.
  (PR#1833, fixes #1829. Reported by Steve Huang)

* Fix reheader CRAM output version. Sets the correct CRAM output
  version for non-3.0 CRAMs.
  (PR#1868, fixes #1866. Reported by John Marshall)

Documentation:

* Expand the default filtering information on the mpileup man page.
  (PR#1802, fixes #1801. Reported by gevro)

* Add an explanation of the default behaviour of split files on
  generating a file for reads with missing or unrecognised RG tags.
  Also a small bug fix, see above.
  (PR#1821, fixes #1817. Reported by Steve Huang)

* In the INSTALL instructions, switched back to openssl for Alpine.
  This matches the current Alpine Linux practice.
  (PR#1837, see htslib#1591. Reported by John Marshall)

* Fix various typos caught by lintian parsers.
  (PR#1877. Thanks to Étienne Mollier)

* Document consensus --qual-calibration option.
  (PR#1880, fixes #1879. Reported by John Marshall)

* Updated the page about samtools duplicate marking with more detail at
  www.htslib.org/algorithms/duplicate.html

Non user-visible changes and build improvements:

* Removed a redundant line that caused a warning in gcc-13.
  (PR#1838)

------------------------------------------------------------------------------
bcftools - changes v1.18
------------------------------------------------------------------------------

Changes affecting the whole of bcftools, or multiple commands:

* Support auto indexing during writing BCF and VCF.gz via the new
  `--write-index` option

Changes affecting specific commands:

* bcftools annotate

    - The `-m, --mark-sites` option can now be used to mark all sites
      without the need to provide the `-a` file (#1861)

    - Fix a bug where the `-m` function did not respect the
      `--min-overlap` option (#1869)

    - Fix a bug where updating INFO/END resulted in an assertion error
      (#1957)

* bcftools concat

    - New option `--drop-genotypes`

* bcftools consensus

    - Support higher-ploidy genotypes with `-H, --haplotype` (#1892)

    - Allow `--mark-ins` and `--mark-snv` with a character, similarly to
      `--mark-del`

* bcftools convert

    - Support for conversion from tab-delimited files
      (CHROM,POS,REF,ALT) to sites-only VCFs

* bcftools csq

    - New `--unify-chr-names` option to automatically unify different
      chromosome naming conventions in the input GFF, fasta and VCF
      files (e.g. "chrX" vs "X")

    - More versatility in parsing various flavors of GFF

    - A new `--dump-gff` option to help with debugging and investigating
      the internals of GFF parsing

    - When printing consequences in nonsense mediated decay transcripts,
      include 'NMD_transcript' in the consequence part of the
      annotation. This is to make filtering easier and analogous to VEP
      annotations. For example the consequence annotation
            3_prime_utr|PCGF3|ENST00000430644|NMD
      is newly printed as
            3_prime_utr&NMD_transcript|PCGF3|ENST00000430644|NMD

* bcftools gtcheck

    - Add stats for the number of sites matched in the GT-vs-GT,
      GT-vs-PL, etc modes. This information is important for
      interpretation of the discordance score, as only the GT-vs-GT
      matching can be interpreted as the number of mismatching
      genotypes.

* bcftools +mendelian2

    - Fix in command line argument parsing, the `-p` and `-P` options
      were not functioning (#1906)

* bcftools merge

    - New `-M, --missing-rules` option to control the behavior of
      merging of vector tags to prevent mixtures of known and missing
      values in tags when desired

    - Use values pertaining to the unknown allele (<*> or <NON_REF>)
      when available to prevent mixtures of known and missing values
      (#1888)

    - Revamped line matching code to fix problems in gVCF merging where
      split gVCF blocks would not update genotypes (#1891, #1164).

* bcftools mpileup

    - Fix a bug in --indels-v2.0 which caused an endless loop when CIGAR
      operator 'H' or 'P' was encountered

* bcftools norm

    - The `-m, --multiallelics +` mode now preserves phasing (#1893)

    - Symbolic <DEL.*> alleles are now normalized too (#1919)

    - New `-g, --gff-annot` option to right-align indels in forward
      transcripts to follow HGVS 3' rule (#1929)

* bcftools query

    - Force newline character in formatting expression when not given
      explicitly

    - Fix `-H` header output in formatting expressions containing
      newlines

* bcftools reheader

    - Make `-f, --fai` aware of long contigs not representable by 32-bit
      integer (#1959)

* bcftools +split-vep

    - Prevent a segfault when `-i/-e` use a VEP subfield not included in
      `-f` or `-c` (#1877)

    - New `-X, --keep-sites` option complementing the existing
      `-x, --drop-sites` option

    - Force newline character in formatting expression when not given
      explicitly

    - Fix a subtle ambiguity: identical rows must be returned when `-s`
      is applied regardless of `-f` containing the `-a` VEP tag itself
      or not.

* bcftools stats

    - Collect new VAF (variant allele frequency) statistics from
      FORMAT/AD field

    - When counting transitions/transversions, consider also alternate
      het genotypes

* plot-vcfstats

    - Add three new VAF plots
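
Illustration: curly-brace region names

The htslib notes above say the synced BCF reader now accepts reference
names containing colons and hyphens when they are enclosed in curly
braces. The Python sketch below is only an illustration of why the
braces remove the ambiguity; it is not HTSlib's parser. The names
parse_region and Region are hypothetical, and the sketch assumes that
coordinates are written as "start" or "start-end" and that the first
closing brace ends the reference name.

```python
import re
from typing import NamedTuple, Optional


class Region(NamedTuple):
    """Hypothetical container for a parsed region string."""
    name: str
    start: Optional[int]  # None when no coordinates were given
    end: Optional[int]    # None when no end coordinate was given


# Assumed coordinate syntax: "start" or "start-end", digits only.
_COORDS = re.compile(r"^(\d+)(?:-(\d+))?$")


def parse_region(text: str) -> Region:
    """Split a region string into reference name and coordinates.

    A name wrapped in {...} is taken verbatim, so it may itself
    contain ':' and '-'. Without braces, the last ':' is assumed to
    separate the name from the coordinates.
    """
    if text.startswith("{"):
        close = text.find("}")
        if close == -1:
            raise ValueError(f"unterminated '{{' in region: {text!r}")
        name = text[1:close]
        rest = text[close + 1:]
        if rest == "":
            return Region(name, None, None)
        if not rest.startswith(":"):
            raise ValueError(f"expected ':' after '}}' in region: {text!r}")
        coords = rest[1:]
    else:
        name, sep, coords = text.rpartition(":")
        if not sep:
            # No ':' at all, so the whole string is the reference name.
            return Region(text, None, None)

    match = _COORDS.match(coords)
    if match is None:
        raise ValueError(f"bad coordinates in region: {text!r}")
    start = int(match.group(1))
    end = int(match.group(2)) if match.group(2) is not None else None
    return Region(name, start, end)


if __name__ == "__main__":
    # Braced: the whole "chr_part:1-1001" is the reference name.
    print(parse_region("{chr_part:1-1001}:10-20"))
    # Unbraced: the same characters are read as name plus coordinates.
    print(parse_region("chr_part:1-1001"))
```

In this sketch the unbraced string "chr_part:1-1001" is read as
positions 1 to 1001 on "chr_part", which is the ambiguity the braces
are there to avoid.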