Name | Modified | Size | Downloads / Week |
---|---|---|---|
README.txt | 2025-05-30 | 18.9 kB | |
bcftools-1.22.tar.bz2 | 2025-05-30 | 8.2 MB | |
samtools-1.22.tar.bz2 | 2025-05-30 | 9.3 MB | |
htslib-1.22.tar.bz2 | 2025-05-30 | 4.8 MB | |
Totals: 4 Items | 22.3 MB | 6 |
------------------------------------------------------------------------------
htslib - changes v1.22
------------------------------------------------------------------------------

Note this release changes the default output CRAM version from 3.0 to 3.1.
HTSlib and SAMtools have been able to read CRAM 3.1 since version 1.12,
however other tools may not yet be able to cope. We know Noodles reads
CRAM3.1 and htsjdk has a draft implementation that has not yet been released.

HTSlib has options for modifying the output formats, which are exposed in
SAMtools. When specifying an output format you can explicitly change the
version via e.g. `samtools view -O cram,version=3.0 ...`.

Further documentation on this change can be found at
https://www.htslib.org/benchmarks/CRAM.html

HTSlib no longer fetches CRAM reference data from EBI's server by default.
Your organisation may wish to set up local infrastructure to supply
reference sequences, e.g., using the new ref-cache tool included in this
HTSlib release. See the REF_CACHE and REF_PATH environment variables
documented in https://www.htslib.org/doc/reference_seqs.html and the
SAMtools manpage for details.

Updates
-------

* NEW. Add ref-cache, a caching proxy for reference sequences. This is a
  local server of reference sequences, for use when encoding or decoding
  CRAM files that use reference-based compression.
  (PR #1911, PR #1921, PR #1922)

* Add support for matching VCF lines by ID.
  (PR #1844, addresses issue bcftools#1739 reported by Han Cao)

* Make it possible to test for VCF_REF as declared in the documentation.
  (PR #1879)

* Updated VCF code to work with VCF 4.4 prefixed phasing info.
  (PR#1861, fixes #1847. Reported by John Marshall)

* Use the highest VCF version when merging headers.
  (PR#1912, see bcftools#2395 and bcftools#2404)

* Update RLEN calculation for VCF 4.4 and 4.5.
  (PR#1897, fixes #1820. Reported by Dave Lawrence)

* Convert U to T instead of U to N when sam_parsing.
  Though SAM format itself can contain U the BAM format cannot.
  (PR #1854, fixes samtools#2131 reported by James Ferguson)

* Add an hts_crc32 function to use zlib or libdeflate. The libdeflate crc32
  function is faster than native zlib and should be used when available.
  (PR #1850)

* Increase the input block size for bgzip. This deals with a slow down
  introduced in PR #1493 when reading from a pipe.
  (PR #1768, fixes #1767. Reported by Konstantin Riege)

* Allow BYTE_ARRAY_STOP to work on non-zero STOP code with TOK3. Although
  the htscodecs name tokeniser uses a NUL between names there is no reason
  why another value could not be used. This change lets CRAM recognise
  other separator values. (PR #1871)

* Remove cram seek ability to do range queries via SEEK_CUR. A probable
  misfeature from the original implementation.
  (PR #1878, fixes #1877. Reported by Rick Wertenbroek)

* Add hts_tpool_worker_id() API. This may be used to associate data with a
  thread rather than to a job. (PR #1875)

* Update bcf_synced_reader to use htsFile.
  (PR #1868, implements #1862. Requested by Brent Pedersen)

* Exit with return value 1 on tabix parse error. This previously returned 0.
  (PR #1887, fixes #1885. Reported by Fan-iX)

* Automatically recognise BED vs TSV files and add the option -C, --coords
  to set index positions (1 or 0 based coordinates) in annot-tsv. (PR #1894)

* Reading SQ lines with multiple differing LN will now fail. Such lines are
  invalid (by the spec) and previous handling was inconsistent.
  (PR #1882, fixes #1866)

* Return errors instead of EOF after all I/O errors etc in
  hts_itr_multi_next/sam_itr_next/sam_read1/vcf_parse/bcf_read.
  (PR#1899. Thanks to John Marshall)

* Remove UR:file:// and UR:ftp:// from ref search path, plus REF_PATH to
  EBI. Removing EBI as the default fallback when REF_PATH not set prevents
  the unintended DDOS on EBI's servers. (PR#1881.
  PR#1915, fixes oss-fuzz issue 418125747)

Build Changes
-------------

* Detect the presence of getauxval() and elf_aux_info() for *BSD variants.
  (PR #1835, thanks to Brad Smith)

* Make HAVE_ATTRIBUTE_TARGET check also check that SSSE3 intrinsics work.
  Mainly for use with old compilers.
  (PR #1886, fixes #1838 and pysam-developers/pysam#1327. Thanks to John
  Marshall)

* Fix broken tests due to MSYS2 changes. Due to changes in how MSYS2 perl
  reported the identity of the OS it was built for, our tests were failing
  to adapt to the Windows style file locations. (PR #1892)

* Updated htscodecs submodule to version 1.6.3 (PR #1917)

* Fix the script used to build the symbol version file. (PR #1918)

Bug fixes
---------

* Fix possible 1 byte underflow in find_file_extension(). Fixes an issue
  reported by OSS-Fuzz. (PR #1840, fixes oss-fuzz id 71740)

* Replace home-brew string end searching with memchr() to speed up looking
  at long aux tags. (PR #1842)

* Prevent segfault on empty tbi index. This could happen when a VCF file
  has a header but no data lines.
  (PR #1845, fixes bcftools#2286. Reported by Devon Ryan)

* Fix CRAM embed_ref=2 with seqs overlapping ref end.
  (PR #1848 and PR #1849 which fixed oss-fuzz issue 372547397)

* Fix sam_hdr_remove_line_pos() not dealing with the 0 index position
  properly. (PR #1853. Thanks to Julian Regalado Perez)

* Fix threaded sam_read1() after EOF. Prevents sam_read1() getting stuck
  when trying to read after EOF and waiting forever for data that is never
  going to arrive. (PR #1856, fixes #1855. Reported by Yan Gao)

* Fix a bug in breakend detection. It was incorrectly assuming that the ALT
  allele is of equal length to REF allele, but the VCF specification allows
  breakend insertions.
  (PR #1858, fixes bcftools#2317. Reported by Nicolai von Kügelgen).

* Fix cram_encode fuzzer issue caused by negative reference lengths.
  Reported by OSS-Fuzz. (PR #1863 fixes oss-fuzz issue 382922241)

* Fixed a typo in vcf.h.
  (PR #1870, thanks to Yu Wang)

* Reset variant types after updating alleles with bcf_update_alleles() or
  bcf_update_alleles_str(). Prevents an out-of-bounds access by bcftools
  consensus. (PR #1883)

* Recognize T > A[chr15:12345[ breakend type in VCF.
  (PR#1903, fixes bcftools#2389. Reported by Dennis Hendriksen)

* Fix possible buffer overruns in expand_path(). (PR#1907)

Documentation updates
---------------------

* Add instructions to INSTALL for FreeBSD, NetBSD and OpenBSD. (PR #1843)

* Clarify bam_set1() parameter documentation to note that quality values do
  not have the ASCII 33 offset. (PR #1891. Thanks to Chris Wright)

* Fixed incorrectly named table in bam1_t structure documentation.
  (PR #1923. Thanks to Julian Hess)

------------------------------------------------------------------------------
samtools - changes v1.22
------------------------------------------------------------------------------

Note this release changes the default output CRAM version from 3.0 to 3.1.
HTSlib and SAMtools have been able to read CRAM 3.1 since version 1.12,
however other tools may not yet be able to cope. We know Noodles reads
CRAM3.1 and htsjdk has a draft implementation, but not yet released.

HTSlib has options for modifying the output formats, which are exposed in
SAMtools. When specifying an output format you can explicitly change the
version via e.g. `samtools view -O cram,version=3.0 ...`.

Further documentation on this change can be found at
https://www.htslib.org/benchmarks/CRAM.html

HTSlib no longer fetches CRAM reference data from EBI's server by default.
Your organisation may wish to set up local infrastructure to supply
reference sequences, e.g., using the new ref-cache tool included in this
HTSlib release. See the REF_CACHE and REF_PATH environment variables
documented in https://www.htslib.org/doc/reference_seqs.html and the
SAMtools manpage for details.

New work and changes:

* New `samtools checksum` command.
  This checksums sequence, name, quality and barcode tags in an order and
  orientation agnostic manner, to facilitate validation of no data loss
  between raw fastq (or unmapped crams) through alignment, duplication
  marking, sorting, and other processing operations to get to the final
  aligned bam/cram. (PR#2122)

* Extend `samtools sort -M` to distinguish between mapped and unmapped
  files. (PR#2110, fixes #2105. Reported by Armin Töpfer)

* Allow the `samtools sort` "merging from..." message to be silenced.
  Setting the verbosity to 0 or 1 will now silence this message.
  (PR#2197, resolves #2185. Requested by Alex Predeus)

* Add `--save-counts` option to `samtools view`. Adds an option to store
  counts of records processed, accepted and rejected by filtering to a
  file. (PR#2120, resolves #2038. Requested by Chang Y)

* `samtools fasta` and `fastq` can now make faidx/fqidx indexes while
  writing using the `--write-index` option.
  (PR#2125, resolves #2118. Requested by Filipe G. Vieira)

* Add a warning for `samtools fastq` on coordinate sorted data.
  (PR#2176, fixes #2169 and #2161. Reported by wook2014)

* `samtools tview` add `-i` to hide inserts.
  (PR#2123. Thanks to Benjamin Bræstrup Sayoc)

* Show optional headers with `samtools bedcov -H`.
  (PR#2140, fixes #2126. Reported by biounix)

* `samtools consensus` now supports proper multi-threading. Previously this
  was restricted to decompression only, but it should now scale better.
  (PR#2174, supersedes PR#2141)

* Add `samtools consensus -T ref.fa` functionality. This reports the
  reference value if a consensus value cannot be calculated.
  (PR#2153, fixes an additional request in #1915)

* In `samtools consensus`, do not use consensus N for "*" (absent) calls
  that are masked due to insufficient depth.
  (PR#2204, fixes #2167. Reported by sanschiffre)

* Improve `plot-bamstats` quality plots.
  (PR#2143 combined with PR#2116 (thanks to James Gilbert))

* Make `reheader -h` use /tmp and honour TMPDIR. (PR#2168, related to #2165.
  Reported by Zhang Yuanfeng)

* Set sort order header tag to unsorted when ordering is lost during
  `samtools merge`. (PR#2173, fixes #2159. Reported by Filipe G. Vieira)

* Protect against merging CRAM files with different headers.
  (PR#2220, fixes #2218. Reported by Kevin Lewis)

* `samtools stats` bug-fix to checksum calculation for quality values.
  This corrects the checksums but in turn makes the calculated value
  different to that reported by previous samtools versions.
  (PR#2193, fixes #2187)

* Clarification for `samtools stats` when used on files with different sort
  orders. (PR#2198, fixes #2177. Reported by Filipe G. Vieira)

* In `samtools stats`, dovetailed (completely overlapping) read pairs are
  now always counted as inward-oriented. Previously they could have been
  inwards or outwards depending on read ordering.
  (PR#2216, resolves #2210. Requested by Pontus Hüer)

Documentation:

* Correct the example for 1:1 `samtools consensus` coords.
  (PR#2113, fixes #2111. Reported by schorlton-bugseq)

* Documents the fastq format options used in SAMtools and HTSlib.
  (PR#2123, fixes #2121)

* Remove mention of threads from `samtools cat` man page.
  (PR#2162, fixes #2160. Reported by Brandon Pickett)

* Update `samtools merge` man page to include `--template-coordinate`.
  (PR#2164. Thanks to Nils Homer)

* Revised CRAM reference sequence documentation in the samtools man page.
  (PR#2178)

* Added fish shell completion and renamed completion for bash shell. These
  files can be copied to appropriate directories by the user. For full
  functionality it requires Python3.5+ and installed samtools manpages.
  (PR#2203. Thanks to LunarEclipse363)

* Fix URL printed by the `seq_cache_populate.pl` script.
  (PR#2222. Thanks to Charles Plessy)

Bug fixes:

* `samtools consensus` previously could give different results for BAM and
  CRAM files with the same content. This was because MD/NM tag generation
  was disabled in CRAM, but the `decode_md=0` option did nothing with BAM.
  Note with `--no-adj-MQ` both BAM and CRAM gave identical results. Now use
  `--input-fmt-option decode_md=0` to get the old CRAM behaviour.
  Otherwise, both BAM and CRAM will be utilising MD/NM to locally modify
  mapping quality. (PR#2156)

* `samtools consensus` without `-a` previously still padded with leading Ns
  in some cases. It now consistently removes both leading and trailing Ns.
  Use "-a" if you want all reference bases displayed.
  (Part of PR#2174 above)

* Change how `markdup` looks for single reads. Due to changes to `fixmate`
  in 1.21 `markdup` no longer recognised single reads that would normally
  have been part of a pair.
  (PR#2117, fixes #2117. Reported by Kristy Horan)

* Fix `samtools merge` crash on BAM files with malformed headers.
  (PR#2128, fixes #2127. Reported by Frostb1te)

* Fix `faidx --write-index` invalid free.
  (PR#2147, fixes #2142. Reported by Alex Leonard)

* Fix `samtools fastq -i` to force CRAM aux tag decoding.
  (PR#2155, fixes #2155. Reported by Alex Leonard)

Non user-visible changes and build improvements:

* Improve htslib#PRnum support for Cirrus-CI and GitHub Actions. (PR#2115)

* Fix broken tests due to MSYS2 changes. Due to changes in how MSYS2 perl
  reported the identity of the OS it was built for, our tests were failing
  to adapt to the Windows style file locations. (PR #2196)

* Upgrade to `_XOPEN_SOURCE=700`, to match HTSlib. Also replace `usleep()`
  with `nanosleep()`. (PR#2221)

------------------------------------------------------------------------------
bcftools - changes v1.22
------------------------------------------------------------------------------

Changes affecting the whole of bcftools, or multiple commands:

* Add support for matching lines by ID via the --pair-logic and --collapse
  options (#1739)

* The -i/-e filtering expressions

  - The expressions now properly match the regex negation of missing
    values, e.g.
    -i 'TAG!~"\."' (#2355)

  - Added support for Fisher's exact test

* Add the option `-v, --verbosity INT` to all bcftools commands and plugins.
  Verbosity values bigger than 3 are passed to the underlying HTSlib
  library so that the user can investigate network issues and other
  problems occurring at the library level.

Changes affecting specific commands:

* bcftools annotate

  - Fix Number in the header definition of transferred FILTER and ID tags
    (#2335)

* bcftools call

  - The `-s, --samples` option was not working properly, now also supporting
    sample negation as advertised in the manual page, e.g.
    `-s ^sample1,sample2` to include all samples but sample1 and sample2
    (#2380)

* bcftools consensus

  - Preserve entire missing gVCF blocks with --missing (#2350)

  - Fixed a bug, the `-S, --samples-file` option is no longer ignored
    (#2398)

* bcftools convert

  - The command `convert --gvcf2vcf` was not filling the REF allele when
    BCF was output (#243)

* bcftools csq

  - Check the input GFF for features outside transcript boundaries and
    extend the transcript to contain the feature fully (#2323)

  - Add experimental support for alternative genetic code tables,
    accessible via a new option `-C, --genetic-code` (#2368)

  - Change in the `--unify-chr-names` option, no automatic sequence name
    modification is attempted anymore, the prefixes to trim must be given
    explicitly.
    For example, if run with `--unify-chr-names chr,Chromosome,-`, the
    program will trim the "chr" prefix in the VCF, "Chromosome" in the GFF,
    leaving the fasta unchanged (#2378)

* bcftools +fill-tags

  - Thanks to the extension of filtering expressions with Fisher's exact
    test, the plugin can now be used to add FT annotation (#1582)

* bcftools merge

  - Preserve phasing in half-missing genotypes (#2331)

  - The option `--merge none` is expected to create no new multiallelic
    sites, but it should allow merging, say, A>C with A>C,AT (#2333)

  - Make `--merge both` work with indel-only records; for example, the
    multiallelic site G>GT,T should be merged with G>GT (#2339)

  - Do not merge symbolic alleles unless they have not just the same type,
    e.g. <DEL>, but also length, i.e. the INFO/END coordinate (#2362)

  - Fix a bug where an incorrectly formatted gVCF file with overlapping
    blocks would trigger an infinite loop in the program (#2410)

* bcftools mpileup

  - The -r/-R options now merge overlapping regions, preventing the output
    of duplicate sites

* bcftools norm

  - Print the number of removed duplicate sites in the final statistics
    (#2346)

  - Preserve the original alleles in `--old-rec-tag` when `--check-ref s`
    requested (#2357)

  - Print a warning when INFO/SVLEN is not defined as Number=A (#2371)

* plot-vcfstats

  - Make the option `-s, --sample-names` functional again (#2353)

* bcftools +prune

  - New option to remove or annotate clusters of sites within a window

* bcftools query

  - The functions used in -i/-e filtering expressions (such as SUM, MEDIAN,
    etc) can be now used in formatting expressions (#2271).
    If the VCF contains INFO/AD and FORMAT/AD, try:

      bcftools query test.vcf -f '%CHROM:%POS \t [ %AD] \t [ %sSUM(FMT/AD)]'
      bcftools query test.vcf -f '%CHROM:%POS \t [ %AD] \t [ %SUM(FMT/AD)]'
      bcftools query test.vcf -f '%CHROM:%POS \t [ %AD] \t %SUM(FMT/AD)'
      bcftools query test.vcf -f '%CHROM:%POS \t [ %AD] \t %SUM(INFO/AD)'

  - Make it possible to refer to the ID column from the FORMAT expression
    (#2337)

      bcftools query test.vcf -f 'ID=%ID ID=[ %/ID] vs FMT_ID=[ %ID]'

* bcftools roh

  - New visualization tool misc/roh-viz, see below

* bcftools +setGT

  - Support for setting missing genotypes with arbitrary ploidy via
    `-n c:./.` (#2303)

* bcftools +split-vep

  - The `-s, --select` option was extended to print only one consequence.
    Previously it was possible to select a single transcript (e.g., the one
    with the worst consequence), and it was possible to filter by
    consequence severity (e.g., missing or worse), but in some cases
    multiple consequences are reported within a single transcript (e.g.,
    start_lost&splice_region). The extended option allows printing only the
    worst part, for example as --select primary:missense+:worst

* bcftools +trio-dnm2

  - Fix a problem with --strictly-novel option which would neglect the
    presence of the apparent de novo allele in the father for male
    offspring

  - Fix a problem with uncalled mosaic chrX variants in males

* roh-viz

  - HTML/JavaScript visualization of bcftools/roh output and homozygosity
    rate.

* bcftools +vrfs

  - New experimental plugin for scoring variants and assessing site
    noisiness (variant read frequency profiles) from a large number of
    unaffected parental samples
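
The `bcftools mpileup` entry above says that overlapping -r/-R regions are merged so that no site is output twice. The idea can be illustrated with a minimal Python sketch. This is not the bcftools implementation; the function name `merge_regions` and the (chrom, start, end) tuple layout with 1-based inclusive coordinates are assumptions made for illustration only.

```python
def merge_regions(regions):
    """Merge overlapping (chrom, start, end) regions.

    Coordinates are assumed to be 1-based and inclusive (an assumption of
    this sketch). Regions that merely touch (end + 1 == next start) are
    left separate, since they share no site.
    """
    merged = []
    for chrom, start, end in sorted(regions):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Overlaps the previous region on the same chromosome: extend it.
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


regions = [
    ("chr1", 150, 300),
    ("chr1", 100, 200),
    ("chr1", 400, 500),
    ("chr2", 100, 200),
]
merged = merge_regions(regions)
```

With the input above, the two overlapping chr1 regions collapse into one, so each position in 100-300 would be visited once rather than twice.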
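
The `samtools checksum` entry above describes a checksum that is agnostic to record order and read orientation. The Python sketch below shows one way such a property can be obtained; it is not the algorithm samtools uses. The record layout (name, sequence, quality, is_reverse), the helper names, and the choice of summing per-record CRC32 values modulo 2^32 are all assumptions made for illustration.

```python
import zlib

# Complement table for restoring the original read orientation.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def original_orientation(seq, qual, is_reverse):
    """Return sequence and quality as they appeared in the raw read."""
    if is_reverse:
        # Reverse-complement the bases and reverse the qualities.
        return seq.translate(_COMPLEMENT)[::-1], qual[::-1]
    return seq, qual


def record_digest(name, seq, qual, is_reverse):
    """CRC32 of one record after undoing any reverse-strand alignment."""
    seq, qual = original_orientation(seq, qual, is_reverse)
    payload = "\0".join((name, seq, qual)).encode("ascii")
    return zlib.crc32(payload)


def file_checksum(records):
    """Combine per-record digests with addition, which ignores order."""
    total = 0
    for record in records:
        total = (total + record_digest(*record)) & 0xFFFFFFFF
    return total


# Hypothetical data: the same two reads before and after alignment/sorting.
unaligned = [
    ("r1", "ACGTT", "IIII#", False),
    ("r2", "GGCAT", "#IIII", False),
]
aligned = [
    ("r2", "ATGCC", "IIII#", True),   # stored reverse-complemented
    ("r1", "ACGTT", "IIII#", False),
]
same = file_checksum(unaligned) == file_checksum(aligned)
```

Because addition is commutative and each record is first normalised to its original orientation, reordering or reverse-complementing records leaves the total unchanged, while altering a base in any record changes it.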