Creating study carrel named neuroscience-from-bioarxiv
Initializing database
Creating cache from Bioarxiv xml file
10.1101/2020.09.21.305516 10_1101-2020_09_21_305516.pdf
10.1101/2021.02.10.430649 10_1101-2021_02_10_430649.pdf
10.1101/2021.02.12.431018 10_1101-2021_02_12_431018.pdf
10.1101/2021.02.11.430847 10_1101-2021_02_11_430847.pdf
10.1101/2021.02.11.430871 10_1101-2021_02_11_430871.pdf
10.1101/2021.02.12.430963 10_1101-2021_02_12_430963.pdf
10.1101/2021.02.13.429885 10_1101-2021_02_13_429885.pdf
10.1101/2021.02.12.430830 10_1101-2021_02_12_430830.pdf
10.1101/2021.02.12.430739 10_1101-2021_02_12_430739.pdf
10.1101/2021.02.12.430764 10_1101-2021_02_12_430764.pdf
10.1101/2020.01.28.923532 10_1101-2020_01_28_923532.pdf
10.1101/2021.02.11.430762 10_1101-2021_02_11_430762.pdf
10.1101/2020.09.23.308239 10_1101-2020_09_23_308239.pdf
10.1101/2020.09.23.310276 10_1101-2020_09_23_310276.pdf
10.1101/2021.02.11.430806 10_1101-2021_02_11_430806.pdf
10.1101/2021.02.11.430695 10_1101-2021_02_11_430695.pdf
10.1101/2021.02.12.430989 10_1101-2021_02_12_430989.pdf
10.1101/2021.02.12.430979 10_1101-2021_02_12_430979.pdf
10.1101/2021.02.12.430923 10_1101-2021_02_12_430923.pdf
10.1101/727867 10_1101-727867.pdf
10.1101/2020.12.24.424317 10_1101-2020_12_24_424317.pdf
10.1101/2020.10.08.327718 10_1101-2020_10_08_327718.pdf
10.1101/2021.02.09.430550 10_1101-2021_02_09_430550.pdf
10.1101/2021.02.10.430656 10_1101-2021_02_10_430656.pdf
10.1101/2021.02.10.430619 10_1101-2021_02_10_430619.pdf
10.1101/2021.02.11.430789 10_1101-2021_02_11_430789.pdf
10.1101/2021.02.10.430512 10_1101-2021_02_10_430512.pdf
10.1101/2021.02.10.430705 10_1101-2021_02_10_430705.pdf
10.1101/2021.02.10.430606 10_1101-2021_02_10_430606.pdf
10.1101/698605 10_1101-698605.pdf
10.1101/2021.02.10.430563 10_1101-2021_02_10_430563.pdf
10.1101/2020.11.17.386649 10_1101-2020_11_17_386649.pdf
10.1101/2020.05.15.090266 10_1101-2020_05_15_090266.pdf
10.1101/2021.02.01.429246 10_1101-2021_02_01_429246.pdf
10.1101/2021.02.10.430623 10_1101-2021_02_10_430623.pdf
10.1101/2021.02.09.430405 10_1101-2021_02_09_430405.pdf
10.1101/2021.02.10.430367 10_1101-2021_02_10_430367.pdf
10.1101/2021.02.09.430536 10_1101-2021_02_09_430536.pdf
10.1101/2021.02.09.430363 10_1101-2021_02_09_430363.pdf
10.1101/2021.02.09.430460 10_1101-2021_02_09_430460.pdf
10.1101/2021.02.08.430070 10_1101-2021_02_08_430070.pdf
10.1101/2021.02.09.430036 10_1101-2021_02_09_430036.pdf
10.1101/2021.02.08.428881 10_1101-2021_02_08_428881.pdf
10.1101/2021.02.08.430343 10_1101-2021_02_08_430343.pdf
10.1101/2021.02.08.430275 10_1101-2021_02_08_430275.pdf
10.1101/2021.02.08.430270 10_1101-2021_02_08_430270.pdf
10.1101/2021.02.10.430604 10_1101-2021_02_10_430604.pdf
10.1101/2021.02.08.430280 10_1101-2021_02_08_430280.pdf
10.1101/2020.09.02.279521 10_1101-2020_09_02_279521.pdf
10.1101/2020.02.04.934216 10_1101-2020_02_04_934216.pdf
2021-02-14 21:22:16 URL:https://www.biorxiv.org/content/10.1101/2021.02.12.431018v1.full.pdf [149895] -> "./cache/10_1101-2021_02_12_431018.pdf" [1]
2021-02-14 21:22:17 URL:https://www.biorxiv.org/content/10.1101/2021.02.12.430764v1.full.pdf [730138] -> "./cache/10_1101-2021_02_12_430764.pdf" [1]
2021-02-14 21:22:17 URL:https://www.biorxiv.org/content/10.1101/2021.02.12.430979v1.full.pdf [842591] -> "./cache/10_1101-2021_02_12_430979.pdf" [1]
2021-02-14 21:22:17 URL:https://www.biorxiv.org/content/10.1101/2021.02.10.430604v1.full.pdf [679851] -> "./cache/10_1101-2021_02_10_430604.pdf" [1]
2021-02-14 21:22:17 URL:https://www.biorxiv.org/content/10.1101/2020.12.24.424317v2.full.pdf [209489] -> "./cache/10_1101-2020_12_24_424317.pdf" [1]
2021-02-14 21:22:17 URL:https://www.biorxiv.org/content/10.1101/2021.02.12.430963v1.full.pdf [696163] -> "./cache/10_1101-2021_02_12_430963.pdf" [1]
2021-02-14 21:22:17 URL:https://www.biorxiv.org/content/10.1101/2021.02.11.430695v1.full.pdf [1150493] -> "./cache/10_1101-2021_02_11_430695.pdf" [1]
2021-02-14 21:22:17 
URL:https://www.biorxiv.org/content/10.1101/2021.02.12.430830v1.full.pdf [643872] -> "./cache/10_1101-2021_02_12_430830.pdf" [1] 2021-02-14 21:22:17 URL:https://www.biorxiv.org/content/10.1101/2021.02.11.430762v1.full.pdf [378049] -> "./cache/10_1101-2021_02_11_430762.pdf" [1] 2021-02-14 21:22:17 URL:https://www.biorxiv.org/content/10.1101/2021.02.10.430619v1.full.pdf [661726] -> "./cache/10_1101-2021_02_10_430619.pdf" [1] 2021-02-14 21:22:17 URL:https://www.biorxiv.org/content/10.1101/2021.02.12.430989v1.full.pdf [427940] -> "./cache/10_1101-2021_02_12_430989.pdf" [1] 2021-02-14 21:22:17 URL:https://www.biorxiv.org/content/10.1101/2021.02.11.430806v1.full.pdf [577783] -> "./cache/10_1101-2021_02_11_430806.pdf" [1] 2021-02-14 21:22:17 URL:https://www.biorxiv.org/content/10.1101/2020.01.28.923532v3.full.pdf [1791954] -> "./cache/10_1101-2020_01_28_923532.pdf" [1] 2021-02-14 21:22:17 URL:https://www.biorxiv.org/content/10.1101/2021.02.12.430923v1.full.pdf [715550] -> "./cache/10_1101-2021_02_12_430923.pdf" [1] 2021-02-14 21:22:17 URL:https://www.biorxiv.org/content/10.1101/2021.02.09.430036v1.full.pdf [544464] -> "./cache/10_1101-2021_02_09_430036.pdf" [1] 2021-02-14 21:22:17 URL:https://www.biorxiv.org/content/10.1101/2020.05.15.090266v2.full.pdf [1203677] -> "./cache/10_1101-2020_05_15_090266.pdf" [1] 2021-02-14 21:22:17 URL:https://www.biorxiv.org/content/10.1101/2021.02.08.430270v1.full.pdf [804427] -> "./cache/10_1101-2021_02_08_430270.pdf" [1] 2021-02-14 21:22:17 URL:https://www.biorxiv.org/content/10.1101/2021.02.01.429246v2.full.pdf [877153] -> "./cache/10_1101-2021_02_01_429246.pdf" [1] 2021-02-14 21:22:17 URL:https://www.biorxiv.org/content/10.1101/2021.02.08.430275v1.full.pdf [1459264] -> "./cache/10_1101-2021_02_08_430275.pdf" [1] 2021-02-14 21:22:17 URL:https://www.biorxiv.org/content/10.1101/2021.02.10.430367v1.full.pdf [689268] -> "./cache/10_1101-2021_02_10_430367.pdf" [1] 2021-02-14 21:22:18 
URL:https://www.biorxiv.org/content/10.1101/2021.02.10.430512v1.full.pdf [1286961] -> "./cache/10_1101-2021_02_10_430512.pdf" [1] 2021-02-14 21:22:18 URL:https://www.biorxiv.org/content/10.1101/2020.11.17.386649v2.full.pdf [956127] -> "./cache/10_1101-2020_11_17_386649.pdf" [1] 2021-02-14 21:22:18 URL:https://www.biorxiv.org/content/10.1101/2021.02.08.430280v1.full.pdf [1963137] -> "./cache/10_1101-2021_02_08_430280.pdf" [1] 2021-02-14 21:22:18 URL:https://www.biorxiv.org/content/10.1101/2020.09.02.279521v5.full.pdf [1869837] -> "./cache/10_1101-2020_09_02_279521.pdf" [1] 2021-02-14 21:22:18 URL:https://www.biorxiv.org/content/10.1101/2021.02.11.430871v1.full.pdf [1785729] -> "./cache/10_1101-2021_02_11_430871.pdf" [1] 2021-02-14 21:22:18 URL:https://www.biorxiv.org/content/10.1101/2021.02.10.430656v1.full.pdf [2480984] -> "./cache/10_1101-2021_02_10_430656.pdf" [1] 2021-02-14 21:22:18 URL:https://www.biorxiv.org/content/10.1101/2021.02.08.430343v1.full.pdf [1886281] -> "./cache/10_1101-2021_02_08_430343.pdf" [1] 2021-02-14 21:22:18 URL:https://www.biorxiv.org/content/10.1101/2020.10.08.327718v2.full.pdf [2805772] -> "./cache/10_1101-2020_10_08_327718.pdf" [1] 2021-02-14 21:22:18 URL:https://www.biorxiv.org/content/10.1101/2021.02.10.430705v1.full.pdf [4222890] -> "./cache/10_1101-2021_02_10_430705.pdf" [1] 2021-02-14 21:22:18 URL:https://www.biorxiv.org/content/10.1101/698605v3.full.pdf [1986213] -> "./cache/10_1101-698605.pdf" [1] 2021-02-14 21:22:18 URL:https://www.biorxiv.org/content/10.1101/2021.02.11.430847v1.full.pdf [3059886] -> "./cache/10_1101-2021_02_11_430847.pdf" [1] 2021-02-14 21:22:19 URL:https://www.biorxiv.org/content/10.1101/2021.02.10.430649v2.full.pdf [4971886] -> "./cache/10_1101-2021_02_10_430649.pdf" [1] 2021-02-14 21:22:19 URL:https://www.biorxiv.org/content/10.1101/2021.02.10.430623v1.full.pdf [1966479] -> "./cache/10_1101-2021_02_10_430623.pdf" [1] 2021-02-14 21:22:19 URL:https://www.biorxiv.org/content/10.1101/2021.02.08.430070v1.full.pdf 
[4088950] -> "./cache/10_1101-2021_02_08_430070.pdf" [1] 2021-02-14 21:22:19 URL:https://www.biorxiv.org/content/10.1101/2021.02.09.430536v1.full.pdf [2648870] -> "./cache/10_1101-2021_02_09_430536.pdf" [1] 2021-02-14 21:22:19 URL:https://www.biorxiv.org/content/10.1101/2021.02.09.430405v1.full.pdf [3503948] -> "./cache/10_1101-2021_02_09_430405.pdf" [1] 2021-02-14 21:22:19 URL:https://www.biorxiv.org/content/10.1101/2021.02.09.430550v2.full.pdf [4191549] -> "./cache/10_1101-2021_02_09_430550.pdf" [1] 2021-02-14 21:22:19 URL:https://www.biorxiv.org/content/10.1101/2021.02.09.430460v1.full.pdf [4089353] -> "./cache/10_1101-2021_02_09_430460.pdf" [1] 2021-02-14 21:22:19 URL:https://www.biorxiv.org/content/10.1101/2021.02.10.430563v1.full.pdf [3094960] -> "./cache/10_1101-2021_02_10_430563.pdf" [1] 2021-02-14 21:22:20 URL:https://www.biorxiv.org/content/10.1101/2021.02.13.429885v1.full.pdf [6052365] -> "./cache/10_1101-2021_02_13_429885.pdf" [1] 2021-02-14 21:22:20 URL:https://www.biorxiv.org/content/10.1101/2021.02.12.430739v1.full.pdf [4157988] -> "./cache/10_1101-2021_02_12_430739.pdf" [1] 2021-02-14 21:22:20 URL:https://www.biorxiv.org/content/10.1101/2021.02.08.428881v1.full.pdf [8688452] -> "./cache/10_1101-2021_02_08_428881.pdf" [1] 2021-02-14 21:22:21 URL:https://www.biorxiv.org/content/10.1101/2020.02.04.934216v2.full.pdf [9321917] -> "./cache/10_1101-2020_02_04_934216.pdf" [1] 2021-02-14 21:22:21 URL:https://www.biorxiv.org/content/10.1101/2021.02.10.430606v1.full.pdf [7489918] -> "./cache/10_1101-2021_02_10_430606.pdf" [1] 2021-02-14 21:22:22 URL:https://www.biorxiv.org/content/10.1101/727867v2.full.pdf [10837058] -> "./cache/10_1101-727867.pdf" [1] 2021-02-14 21:22:22 URL:https://www.biorxiv.org/content/10.1101/2020.09.21.305516v2.full.pdf [9506384] -> "./cache/10_1101-2020_09_21_305516.pdf" [1] 2021-02-14 21:22:27 URL:https://www.biorxiv.org/content/10.1101/2021.02.11.430789v1.full.pdf [591073] -> "./cache/10_1101-2021_02_11_430789.pdf" [1] 2021-02-14 
21:22:29 URL:https://www.biorxiv.org/content/10.1101/2020.09.23.308239v4.full.pdf [22093690] -> "./cache/10_1101-2020_09_23_308239.pdf" [1] 2021-02-14 21:22:30 URL:https://www.biorxiv.org/content/10.1101/2020.09.23.310276v3.full.pdf [6105307] -> "./cache/10_1101-2020_09_23_310276.pdf" [1] 2021-02-14 21:22:34 URL:https://www.biorxiv.org/content/10.1101/2021.02.09.430363v1.full.pdf [19076613] -> "./cache/10_1101-2021_02_09_430363.pdf" [1] === updating bibliographic database Building study carrel named neuroscience-from-bioarxiv FILE: cache/10_1101-2021_02_12_431018.pdf OUTPUT: txt/10_1101-2021_02_12_431018.txt FILE: cache/10_1101-2020_12_24_424317.pdf OUTPUT: txt/10_1101-2020_12_24_424317.txt FILE: cache/10_1101-2021_02_10_430619.pdf OUTPUT: txt/10_1101-2021_02_10_430619.txt FILE: cache/10_1101-2021_02_12_430830.pdf OUTPUT: txt/10_1101-2021_02_12_430830.txt FILE: cache/10_1101-2021_02_12_430979.pdf OUTPUT: txt/10_1101-2021_02_12_430979.txt FILE: cache/10_1101-2020_09_21_305516.pdf OUTPUT: txt/10_1101-2020_09_21_305516.txt FILE: cache/10_1101-2021_02_12_430923.pdf OUTPUT: txt/10_1101-2021_02_12_430923.txt FILE: cache/10_1101-2020_05_15_090266.pdf OUTPUT: txt/10_1101-2020_05_15_090266.txt FILE: cache/10_1101-2021_02_11_430871.pdf OUTPUT: txt/10_1101-2021_02_11_430871.txt FILE: cache/10_1101-2021_02_11_430695.pdf OUTPUT: txt/10_1101-2021_02_11_430695.txt FILE: cache/10_1101-2021_02_11_430806.pdf OUTPUT: txt/10_1101-2021_02_11_430806.txt FILE: cache/10_1101-2021_02_11_430847.pdf OUTPUT: txt/10_1101-2021_02_11_430847.txt FILE: cache/10_1101-2021_02_12_430963.pdf OUTPUT: txt/10_1101-2021_02_12_430963.txt FILE: cache/10_1101-2021_02_08_430070.pdf OUTPUT: txt/10_1101-2021_02_08_430070.txt FILE: cache/10_1101-2021_02_10_430367.pdf OUTPUT: txt/10_1101-2021_02_10_430367.txt FILE: cache/10_1101-2021_02_09_430036.pdf OUTPUT: txt/10_1101-2021_02_09_430036.txt FILE: cache/10_1101-698605.pdf OUTPUT: txt/10_1101-698605.txt FILE: cache/10_1101-2021_02_11_430789.pdf OUTPUT: 
txt/10_1101-2021_02_11_430789.txt FILE: cache/10_1101-2020_11_17_386649.pdf OUTPUT: txt/10_1101-2020_11_17_386649.txt FILE: cache/10_1101-2021_02_12_430989.pdf OUTPUT: txt/10_1101-2021_02_12_430989.txt FILE: cache/10_1101-2021_02_12_430739.pdf OUTPUT: txt/10_1101-2021_02_12_430739.txt FILE: cache/10_1101-2021_02_11_430762.pdf OUTPUT: txt/10_1101-2021_02_11_430762.txt FILE: cache/10_1101-2021_02_10_430604.pdf OUTPUT: txt/10_1101-2021_02_10_430604.txt FILE: cache/10_1101-2021_02_10_430656.pdf OUTPUT: txt/10_1101-2021_02_10_430656.txt FILE: cache/10_1101-2021_02_10_430512.pdf OUTPUT: txt/10_1101-2021_02_10_430512.txt FILE: cache/10_1101-2020_10_08_327718.pdf OUTPUT: txt/10_1101-2020_10_08_327718.txt FILE: cache/10_1101-2020_09_23_310276.pdf OUTPUT: txt/10_1101-2020_09_23_310276.txt FILE: cache/10_1101-2021_02_09_430363.pdf OUTPUT: txt/10_1101-2021_02_09_430363.txt FILE: cache/10_1101-2021_02_10_430623.pdf OUTPUT: txt/10_1101-2021_02_10_430623.txt FILE: cache/10_1101-2021_02_10_430606.pdf OUTPUT: txt/10_1101-2021_02_10_430606.txt FILE: cache/10_1101-2020_02_04_934216.pdf OUTPUT: txt/10_1101-2020_02_04_934216.txt FILE: cache/10_1101-2021_02_09_430550.pdf OUTPUT: txt/10_1101-2021_02_09_430550.txt FILE: cache/10_1101-2021_02_08_430270.pdf OUTPUT: txt/10_1101-2021_02_08_430270.txt FILE: cache/10_1101-2021_02_08_430343.pdf OUTPUT: txt/10_1101-2021_02_08_430343.txt FILE: cache/10_1101-2021_02_01_429246.pdf OUTPUT: txt/10_1101-2021_02_01_429246.txt FILE: cache/10_1101-2020_09_02_279521.pdf OUTPUT: txt/10_1101-2020_09_02_279521.txt FILE: cache/10_1101-2021_02_08_430280.pdf OUTPUT: txt/10_1101-2021_02_08_430280.txt FILE: cache/10_1101-2021_02_12_430764.pdf OUTPUT: txt/10_1101-2021_02_12_430764.txt FILE: cache/10_1101-2020_01_28_923532.pdf OUTPUT: txt/10_1101-2020_01_28_923532.txt FILE: cache/10_1101-2021_02_09_430536.pdf OUTPUT: txt/10_1101-2021_02_09_430536.txt FILE: cache/10_1101-2021_02_10_430705.pdf OUTPUT: txt/10_1101-2021_02_10_430705.txt FILE: 
cache/10_1101-2021_02_13_429885.pdf OUTPUT: txt/10_1101-2021_02_13_429885.txt FILE: cache/10_1101-2021_02_09_430405.pdf OUTPUT: txt/10_1101-2021_02_09_430405.txt FILE: cache/10_1101-2021_02_08_430275.pdf OUTPUT: txt/10_1101-2021_02_08_430275.txt FILE: cache/10_1101-2021_02_10_430563.pdf OUTPUT: txt/10_1101-2021_02_10_430563.txt FILE: cache/10_1101-2021_02_09_430460.pdf OUTPUT: txt/10_1101-2021_02_09_430460.txt FILE: cache/10_1101-2021_02_10_430649.pdf OUTPUT: txt/10_1101-2021_02_10_430649.txt FILE: cache/10_1101-2021_02_08_428881.pdf OUTPUT: txt/10_1101-2021_02_08_428881.txt FILE: cache/10_1101-727867.pdf OUTPUT: txt/10_1101-727867.txt FILE: cache/10_1101-2020_09_23_308239.pdf OUTPUT: txt/10_1101-2020_09_23_308239.txt === file2bib.sh === id: 10_1101-2021_02_10_430604 author: Youngblut, Nicholas D. title: Struo2: efficient metagenome profiling database construction for ever-expanding microbial genome datasets date: 2021 pages: 4 extension: .pdf txt: ./txt/10_1101-2021_02_10_430604.txt cache: ./cache/10_1101-2021_02_10_430604.pdf Content-Type application/pdf Creation-Date 2021-02-10T14:11:24Z Keywords Last-Modified 2021-02-14T18:25:42Z Last-Save-Date 2021-02-14T18:25:42Z X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 109 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true created 2021-02-10T14:11:24Z date 2021-02-14T18:25:42Z dc:format application/pdf; version=1.4 dc:subject dc:title Struo2: efficient metagenome profiling database construction for ever-expanding microbial genome datasets dcterms:created 2021-02-10T14:11:24Z dcterms:modified 2021-02-14T18:25:42Z 
meta:creation-date 2021-02-10T14:11:24Z meta:keyword meta:save-date 2021-02-14T18:25:42Z modified 2021-02-14T18:25:42Z pdf:PDFVersion 1.4 pdf:charsPerPage ['930', '4650', '2600', '3359'] pdf:docinfo:created 2021-02-10T14:11:24Z pdf:docinfo:creator_tool Chrome pdf:docinfo:keywords pdf:docinfo:modified 2021-02-14T18:25:42Z pdf:docinfo:producer macOS Version 10.14.6 (Build 18G7016) Quartz PDFContext pdf:docinfo:title Struo2: efficient metagenome profiling database construction for ever-expanding microbial genome datasets pdf:encrypted false pdf:hasMarkedContent false pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['0', '0', '0', '0'] producer macOS Version 10.14.6 (Build 18G7016) Quartz PDFContext resourceName b'10_1101-2021_02_10_430604.pdf' subject title Struo2: efficient metagenome profiling database construction for ever-expanding microbial genome datasets xmp:CreatorTool Chrome xmpMM:DocumentID uuid:46a164ee-1dd2-11b2-0a00-ef0927edca00 xmpTPg:NPages 4 === file2bib.sh === id: 10_1101-2020_05_15_090266 author: Zhang, R. 
title: SpacePHARER: Sensitive identification of phages from CRISPR spacers in prokaryotic hosts date: 2021 pages: 6 extension: .pdf txt: ./txt/10_1101-2020_05_15_090266.txt cache: ./cache/10_1101-2020_05_15_090266.pdf Appligent AppendPDF Pro 5.5 Linux Kernel 2.6 64bit Oct 2 2014 Library 10.1.0 Content-Type application/pdf Creation-Date 2021-02-10T16:31:46Z Last-Modified 2021-02-14T21:22:17Z Last-Save-Date 2021-02-14T21:22:17Z X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 43 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true created 2021-02-10T16:31:46Z date 2021-02-14T21:22:17Z dc:format application/pdf; version=1.5 dc:title 34320483 dcterms:created 2021-02-10T16:31:46Z dcterms:modified 2021-02-14T21:22:17Z meta:creation-date 2021-02-10T16:31:46Z meta:save-date 2021-02-14T21:22:17Z modified 2021-02-14T21:22:17Z pdf:PDFVersion 1.5 pdf:charsPerPage ['4925', '4142', '2075', '344', '344', '344'] pdf:docinfo:created 2021-02-10T16:31:46Z pdf:docinfo:creator_tool Appligent AppendPDF Pro 5.5 pdf:docinfo:custom:Appligent AppendPDF Pro 5.5 Linux Kernel 2.6 64bit Oct 2 2014 Library 10.1.0 pdf:docinfo:modified 2021-02-14T21:22:17Z pdf:docinfo:producer xdvipdfmx (20200315) pdf:docinfo:title 34320483 pdf:encrypted false pdf:hasMarkedContent false pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['0', '0', '0', '0', '0', '0'] producer xdvipdfmx (20200315) resourceName b'10_1101-2020_05_15_090266.pdf' title 34320483 xmp:CreatorTool Appligent AppendPDF Pro 5.5 xmpMM:DocumentID uuid:11f1d5d2-b085-11b2-0a00-782dad000000 xmpTPg:NPages 6 === file2bib.sh === id: 
10_1101-2021_02_12_431018 author: Truong Nguyen, Phuoc title: HaVoC, a bioinformatic pipeline for reference-based consensus assembly and lineage assignment for SARS-CoV-2 sequences. date: 2021 pages: 14 extension: .pdf txt: ./txt/10_1101-2021_02_12_431018.txt cache: ./cache/10_1101-2021_02_12_431018.pdf Content-Type application/pdf Creation-Date 2021-02-12T22:55:36Z Last-Modified 2021-02-14T19:16:35Z Last-Save-Date 2021-02-14T19:16:35Z X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 393 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true created 2021-02-12T22:55:36Z date 2021-02-14T19:16:35Z dc:format application/pdf; version=1.4 dc:title HaVoC, a bioinformatic pipeline for reference-based consensus assembly and lineage assignment for SARS-CoV-2 sequences. dcterms:created 2021-02-12T22:55:36Z dcterms:modified 2021-02-14T19:16:35Z meta:creation-date 2021-02-12T22:55:36Z meta:save-date 2021-02-14T19:16:35Z modified 2021-02-14T19:16:35Z pdf:PDFVersion 1.4 pdf:charsPerPage ['1876', '2014', '2602', '2132', '1961', '953', '1972', '1160', '2014', '1880', '2254', '2405', '1553', '833'] pdf:docinfo:created 2021-02-12T22:55:36Z pdf:docinfo:creator_tool Word pdf:docinfo:modified 2021-02-14T19:16:35Z pdf:docinfo:producer macOS Version 11.2.1 (Build 20D74) Quartz PDFContext pdf:docinfo:title HaVoC, a bioinformatic pipeline for reference-based consensus assembly and lineage assignment for SARS-CoV-2 sequences. 
pdf:encrypted false pdf:hasMarkedContent false pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0'] producer macOS Version 11.2.1 (Build 20D74) Quartz PDFContext resourceName b'10_1101-2021_02_12_431018.pdf' title HaVoC, a bioinformatic pipeline for reference-based consensus assembly and lineage assignment for SARS-CoV-2 sequences. xmp:CreatorTool Word xmpMM:DocumentID uuid:58d2adfc-1dd2-11b2-0a00-b30827bd7700 xmpTPg:NPages 14 === file2bib.sh === id: 10_1101-2021_02_11_430806 author: Badaczewska-Dawid, Aleksandra title: BIAPSS - BioInformatic Analysis of liquid-liquid Phase-Separating protein Sequences date: 2021 pages: 3 extension: .pdf txt: ./txt/10_1101-2021_02_11_430806.txt cache: ./cache/10_1101-2021_02_11_430806.pdf Author Content-Type application/pdf Creation-Date 2021-02-04T20:10:30Z Keywords Last-Modified 2021-02-14T21:22:17Z Last-Save-Date 2021-02-14T21:22:17Z PTEX.Fullbanner This is pdfTeX, Version 3.14159265-2.6-1.40.20 (TeX Live 2019) kpathsea version 6.3.1 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 35 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true cp:subject created 2021-02-04T20:10:30Z creator date 2021-02-14T21:22:17Z dc:creator dc:format application/pdf; version=1.5 dc:subject dc:title BIAPSS - BioInformatic Analysis of liquid-liquid Phase-Separating protein Sequences dcterms:created 2021-02-04T20:10:30Z dcterms:modified 2021-02-14T21:22:17Z meta:author meta:creation-date 2021-02-04T20:10:30Z meta:keyword meta:save-date 2021-02-14T21:22:17Z modified 
2021-02-14T21:22:17Z pdf:PDFVersion 1.5 pdf:charsPerPage ['4645', '5732', '5920'] pdf:docinfo:created 2021-02-04T20:10:30Z pdf:docinfo:creator pdf:docinfo:creator_tool LaTeX with hyperref pdf:docinfo:custom:PTEX.Fullbanner This is pdfTeX, Version 3.14159265-2.6-1.40.20 (TeX Live 2019) kpathsea version 6.3.1 pdf:docinfo:keywords pdf:docinfo:modified 2021-02-14T21:22:17Z pdf:docinfo:producer pdfTeX-1.40.20 pdf:docinfo:subject pdf:docinfo:title BIAPSS - BioInformatic Analysis of liquid-liquid Phase-Separating protein Sequences pdf:docinfo:trapped False pdf:encrypted false pdf:hasMarkedContent false pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['2', '0', '0'] producer pdfTeX-1.40.20 resourceName b'10_1101-2021_02_11_430806.pdf' subject title BIAPSS - BioInformatic Analysis of liquid-liquid Phase-Separating protein Sequences trapped False xmp:CreatorTool LaTeX with hyperref xmpMM:DocumentID uuid:85c70edf-1dd2-11b2-0a00-020a27bd7700 xmpTPg:NPages 3 === file2bib.sh === id: 10_1101-2021_02_09_430036 author: Goldsborough, Thibaut title: A comparative study of genomic adaptations to low nitrogen availability in Genlisea aurea date: 2021 pages: 7 extension: .pdf txt: ./txt/10_1101-2021_02_09_430036.txt cache: ./cache/10_1101-2021_02_09_430036.pdf Author Thibaut Gold Content-Type application/pdf Creation-Date 2021-02-09T17:39:12Z Keywords Last-Modified 2021-02-14T21:22:17Z Last-Save-Date 2021-02-14T21:22:17Z X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 44 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true cp:subject created 2021-02-09T17:39:12Z creator 
Thibaut Gold date 2021-02-14T21:22:17Z dc:creator Thibaut Gold dc:format application/pdf; version=1.4 dc:subject dc:title A comparative study of genomic adaptations to low nitrogen availability in Genlisea aurea dcterms:created 2021-02-09T17:39:12Z dcterms:modified 2021-02-14T21:22:17Z meta:author Thibaut Gold meta:creation-date 2021-02-09T17:39:12Z meta:keyword meta:save-date 2021-02-14T21:22:17Z modified 2021-02-14T21:22:17Z pdf:PDFVersion 1.4 pdf:charsPerPage ['3613', '3079', '2428', '2810', '1917', '3361', '1048'] pdf:docinfo:created 2021-02-09T17:39:12Z pdf:docinfo:creator Thibaut Gold pdf:docinfo:creator_tool Word pdf:docinfo:keywords pdf:docinfo:modified 2021-02-14T21:22:17Z pdf:docinfo:producer Mac OS X 10.13.6 Quartz PDFContext pdf:docinfo:subject pdf:docinfo:title A comparative study of genomic adaptations to low nitrogen availability in Genlisea aurea pdf:encrypted false pdf:hasMarkedContent false pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['0', '0', '0', '0', '0', '0', '0'] producer Mac OS X 10.13.6 Quartz PDFContext resourceName b'10_1101-2021_02_09_430036.pdf' subject title A comparative study of genomic adaptations to low nitrogen availability in Genlisea aurea xmp:CreatorTool Word xmpMM:DocumentID uuid:85c6fc4a-1dd2-11b2-0a00-ca0827edca00 xmpTPg:NPages 7 === file2bib.sh === id: 10_1101-2021_02_10_430367 author: Chen, Meili title: Genome Warehouse: A Public Repository Housing Genome-scale Data date: 2021 pages: 18 extension: .pdf txt: ./txt/10_1101-2021_02_10_430367.txt cache: ./cache/10_1101-2021_02_10_430367.pdf Appligent AppendPDF Pro 5.5 Linux Kernel 2.6 64bit Oct 2 2014 Library 10.1.0 Content-Type application/pdf Creation-Date 2021-02-11T01:56:12Z Last-Modified 2021-02-14T20:28:35Z Last-Save-Date 2021-02-14T20:28:35Z X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 50 
access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true created 2021-02-11T01:56:12Z date 2021-02-14T20:28:35Z dc:format application/pdf; version=1.5 dc:title 9691071 dcterms:created 2021-02-11T01:56:12Z dcterms:modified 2021-02-14T20:28:35Z meta:creation-date 2021-02-11T01:56:12Z meta:save-date 2021-02-14T20:28:35Z modified 2021-02-14T20:28:35Z pdf:PDFVersion 1.5 pdf:charsPerPage ['1753', '469', '1895', '2624', '2620', '2657', '2533', '2599', '2634', '2173', '753', '3213', '1644', '926', '1127', '344', '344', '344'] pdf:docinfo:created 2021-02-11T01:56:12Z pdf:docinfo:creator_tool Appligent AppendPDF Pro 5.5 pdf:docinfo:custom:Appligent AppendPDF Pro 5.5 Linux Kernel 2.6 64bit Oct 2 2014 Library 10.1.0 pdf:docinfo:modified 2021-02-14T20:28:35Z pdf:docinfo:producer Acrobat Distiller 9.0.0 (Windows) pdf:docinfo:title 9691071 pdf:encrypted false pdf:hasMarkedContent false pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0'] producer Acrobat Distiller 9.0.0 (Windows) resourceName b'10_1101-2021_02_10_430367.pdf' title 9691071 xmp:CreatorTool Appligent AppendPDF Pro 5.5 xmpMM:DocumentID uuid:dbcd43fb-b085-11b2-0a00-782dad000000 xmpTPg:NPages 18 === file2bib.sh === id: 10_1101-2021_02_10_430619 author: Schutz, Sacha title: Cutevariant: a GUI-based desktop application to explore genetics variations date: 2021 pages: 8 extension: .pdf txt: ./txt/10_1101-2021_02_10_430619.txt cache: ./cache/10_1101-2021_02_10_430619.pdf Author Content-Type application/pdf Creation-Date 2021-02-10T23:59:47Z Keywords Last-Modified 2021-02-14T21:22:16Z Last-Save-Date 2021-02-14T21:22:16Z PTEX.Fullbanner This is pdfTeX, Version 
3.14159265-2.6-1.40.21 (TeX Live 2020) kpathsea version 6.3.2 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 68 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true cp:subject created 2021-02-10T23:59:47Z creator date 2021-02-14T21:22:16Z dc:creator dc:format application/pdf; version=1.5 dc:subject dc:title Cutevariant: a GUI-based desktop application to explore genetics variations dcterms:created 2021-02-10T23:59:47Z dcterms:modified 2021-02-14T21:22:16Z meta:author meta:creation-date 2021-02-10T23:59:47Z meta:keyword meta:save-date 2021-02-14T21:22:16Z modified 2021-02-14T21:22:16Z pdf:PDFVersion 1.5 pdf:charsPerPage ['4040', '3428', '2261', '3414', '3216', '4578', '5638', '627'] pdf:docinfo:created 2021-02-10T23:59:47Z pdf:docinfo:creator pdf:docinfo:creator_tool LaTeX with hyperref pdf:docinfo:custom:PTEX.Fullbanner This is pdfTeX, Version 3.14159265-2.6-1.40.21 (TeX Live 2020) kpathsea version 6.3.2 pdf:docinfo:keywords pdf:docinfo:modified 2021-02-14T21:22:16Z pdf:docinfo:producer pdfTeX-1.40.21 pdf:docinfo:subject pdf:docinfo:title Cutevariant: a GUI-based desktop application to explore genetics variations pdf:docinfo:trapped False pdf:encrypted false pdf:hasMarkedContent false pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['0', '0', '1', '0', '0', '0', '0', '0'] producer pdfTeX-1.40.21 resourceName b'10_1101-2021_02_10_430619.pdf' subject title Cutevariant: a GUI-based desktop application to explore genetics variations trapped False xmp:CreatorTool LaTeX with hyperref xmpMM:DocumentID uuid:85c6c1a1-1dd2-11b2-0a00-0e0a27fd5800 xmpTPg:NPages 8 
=== file2bib.sh === id: 10_1101-2020_12_24_424317 author: Muazzam, Fariha title: Multi-class Cancer Classification and Biomarker Identification using Deep Learning date: 2021 pages: 12 extension: .pdf txt: ./txt/10_1101-2020_12_24_424317.txt cache: ./cache/10_1101-2020_12_24_424317.pdf Content-Type application/pdf Creation-Date 2021-02-11T09:45:39Z Last-Modified 2021-02-14T21:22:16Z Last-Save-Date 2021-02-14T21:22:16Z X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 71 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true created 2021-02-11T09:45:39Z date 2021-02-14T21:22:16Z dc:format application/pdf; version=1.6 dc:language en-GB dc:title Multi-class Cancer Classification and Biomarker Identification using Deep Learning dcterms:created 2021-02-11T09:45:39Z dcterms:modified 2021-02-14T21:22:16Z language en-GB meta:creation-date 2021-02-11T09:45:39Z meta:save-date 2021-02-14T21:22:16Z modified 2021-02-14T21:22:16Z pdf:PDFVersion 1.6 pdf:charsPerPage ['2052', '4575', '4008', '1734', '1778', '1769', '1698', '1509', '909', '2553', '2652', '3042'] pdf:docinfo:created 2021-02-11T09:45:39Z pdf:docinfo:creator_tool Writer pdf:docinfo:modified 2021-02-14T21:22:16Z pdf:docinfo:producer LibreOffice 7.0 pdf:docinfo:title Multi-class Cancer Classification and Biomarker Identification using Deep Learning pdf:encrypted false pdf:hasMarkedContent false pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0'] producer LibreOffice 7.0 resourceName b'10_1101-2020_12_24_424317.pdf' title Multi-class Cancer Classification and 
Biomarker Identification using Deep Learning xmp:CreatorTool Writer xmpMM:DocumentID uuid:85c6c437-1dd2-11b2-0a00-2c09276d7200 xmpTPg:NPages 12 === file2bib.sh === id: 10_1101-2021_02_09_430405 author: Quazi, Sameer title: In-silico Structural and Molecular Docking-Based Drug Discovery Against Viral Protein (VP35) of Marburg Virus: A potent Agent of MAVD date: 2021 pages: 23 extension: .pdf txt: ./txt/10_1101-2021_02_09_430405.txt cache: ./cache/10_1101-2021_02_09_430405.pdf Author Administrator Content-Type application/pdf Creation-Date 2021-02-09T11:42:11Z Last-Modified 2021-02-14T21:22:17Z Last-Save-Date 2021-02-14T21:22:17Z X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 290 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true created 2021-02-09T11:42:11Z creator Administrator date 2021-02-14T21:22:17Z dc:creator Administrator dc:format application/pdf; version=1.4 dc:title In-silico Structural and Molecular Docking-Based Drug Discovery Against Viral Protein (VP35) of Marburg Virus: A potent Agent of MAVD dcterms:created 2021-02-09T11:42:11Z dcterms:modified 2021-02-14T21:22:17Z meta:author Administrator meta:creation-date 2021-02-09T11:42:11Z meta:save-date 2021-02-14T21:22:17Z modified 2021-02-14T21:22:17Z pdf:PDFVersion 1.4 pdf:charsPerPage ['1009', '1911', '2821', '2570', '2562', '2577', '1720', '428', '1276', '556', '1603', '1747', '1012', '1189', '1356', '1452', '1064', '2808', '1825', '1961', '1940', '1464', '342'] pdf:docinfo:created 2021-02-09T11:42:11Z pdf:docinfo:creator Administrator pdf:docinfo:creator_tool PScript5.dll Version 5.2.2 pdf:docinfo:modified 
2021-02-14T21:22:17Z pdf:docinfo:producer Acrobat Distiller 8.1.0 (Windows) pdf:docinfo:title In-silico Structural and Molecular Docking-Based Drug Discovery Against Viral Protein (VP35) of Marburg Virus: A potent Agent of MAVD pdf:encrypted false pdf:hasMarkedContent false pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0'] producer Acrobat Distiller 8.1.0 (Windows) resourceName b'10_1101-2021_02_09_430405.pdf' title In-silico Structural and Molecular Docking-Based Drug Discovery Against Viral Protein (VP35) of Marburg Virus: A potent Agent of MAVD xmp:CreatorTool PScript5.dll Version 5.2.2 xmpMM:DocumentID uuid:85c8294e-1dd2-11b2-0a00-d70827bd3700 xmpTPg:NPages 23 === file2bib.sh === id: 10_1101-2020_09_23_310276 author: Greenfest-Allen, Emily title: NIAGADS Alzheimer's GenomicsDB: A resource for exploring Alzheimer's Disease genetic and genomic knowledge date: 2021 pages: 19 extension: .pdf txt: ./txt/10_1101-2020_09_23_310276.txt cache: ./cache/10_1101-2020_09_23_310276.pdf Appligent AppendPDF Pro 5.5 Linux Kernel 2.6 64bit Oct 2 2014 Library 10.1.0 Author Emily Greenfest-Allen Content-Type application/pdf Creation-Date 2021-02-12T15:45:35Z Last-Modified 2021-02-14T21:22:27Z Last-Save-Date 2021-02-14T21:22:27Z X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 155 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true created 2021-02-12T15:45:35Z creator Emily Greenfest-Allen date 2021-02-14T21:22:27Z dc:creator Emily Greenfest-Allen dc:format 
application/pdf; version=1.7 dc:language en-US dc:title 97992561 dcterms:created 2021-02-12T15:45:35Z dcterms:modified 2021-02-14T21:22:27Z language en-US meta:author Emily Greenfest-Allen meta:creation-date 2021-02-12T15:45:35Z meta:save-date 2021-02-14T21:22:27Z modified 2021-02-14T21:22:27Z pdf:PDFVersion 1.7 pdf:charsPerPage ['1219', '2048', '3230', '2954', '2587', '3080', '3145', '3484', '3621', '3162', '2797', '3464', '3377', '1506', '1005', '722', '265', '309', '266'] pdf:docinfo:created 2021-02-12T15:45:35Z pdf:docinfo:creator Emily Greenfest-Allen pdf:docinfo:creator_tool Appligent AppendPDF Pro 5.5 pdf:docinfo:custom:Appligent AppendPDF Pro 5.5 Linux Kernel 2.6 64bit Oct 2 2014 Library 10.1.0 pdf:docinfo:modified 2021-02-14T21:22:27Z pdf:docinfo:producer Microsoft® Word for Microsoft 365 pdf:docinfo:title 97992561 pdf:encrypted false pdf:hasMarkedContent true pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0'] producer Microsoft® Word for Microsoft 365 resourceName b'10_1101-2020_09_23_310276.pdf' title 97992561 xmp:CreatorTool Appligent AppendPDF Pro 5.5 xmpMM:DocumentID uuid:076548d5-b089-11b2-0a00-782dad000000 xmpTPg:NPages 19 === file2bib.sh === id: 10_1101-2021_02_12_430739 author: Malekian, Negin title: Mutations in bdcA and valS correlate with quinolone resistance in wastewater Escherichia Coli date: 2021 pages: 13 extension: .pdf txt: ./txt/10_1101-2021_02_12_430739.txt cache: ./cache/10_1101-2021_02_12_430739.pdf Author Negin Malekian, Ali Al-Fatlawi, Thomas U. 
Berendonk, Michael Schroeder Content-Type application/pdf Creation-Date 2021-02-12T10:09:54Z Keywords E Coli, Quinolone, Antibiotic Resistance, Genome-Wide Association Study (GWAS) Last-Modified 2021-02-14T21:22:16Z Last-Save-Date 2021-02-14T21:22:16Z PTEX.Fullbanner This is pdfTeX, Version 3.14159265-2.6-1.40.20 (TeX Live 2019) kpathsea version 6.3.1 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 104 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true cp:subject created 2021-02-12T10:09:54Z creator Negin Malekian, Ali Al-Fatlawi, Thomas U. Berendonk, Michael Schroeder date 2021-02-14T21:22:16Z dc:creator Negin Malekian, Ali Al-Fatlawi, Thomas U. Berendonk, Michael Schroeder dc:format application/pdf; version=1.5 dc:subject E Coli, Quinolone, Antibiotic Resistance, Genome-Wide Association Study (GWAS) dc:title Mutations in bdcA and valS correlate with quinolone resistance in wastewater Escherichia Coli dcterms:created 2021-02-12T10:09:54Z dcterms:modified 2021-02-14T21:22:16Z meta:author Negin Malekian, Ali Al-Fatlawi, Thomas U. Berendonk, Michael Schroeder meta:creation-date 2021-02-12T10:09:54Z meta:keyword E Coli, Quinolone, Antibiotic Resistance, Genome-Wide Association Study (GWAS) meta:save-date 2021-02-14T21:22:16Z modified 2021-02-14T21:22:16Z pdf:PDFVersion 1.5 pdf:charsPerPage ['3702', '4965', '4466', '4704', '4734', '1542', '1200', '1401', '2846', '987', '740', '6975', '1619'] pdf:docinfo:created 2021-02-12T10:09:54Z pdf:docinfo:creator Negin Malekian, Ali Al-Fatlawi, Thomas U. 
Berendonk, Michael Schroeder pdf:docinfo:creator_tool LaTeX with hyperref pdf:docinfo:custom:PTEX.Fullbanner This is pdfTeX, Version 3.14159265-2.6-1.40.20 (TeX Live 2019) kpathsea version 6.3.1 pdf:docinfo:keywords E Coli, Quinolone, Antibiotic Resistance, Genome-Wide Association Study (GWAS) pdf:docinfo:modified 2021-02-14T21:22:16Z pdf:docinfo:producer pdfTeX-1.40.20 pdf:docinfo:subject pdf:docinfo:title Mutations in bdcA and valS correlate with quinolone resistance in wastewater Escherichia Coli pdf:docinfo:trapped False pdf:encrypted false pdf:hasMarkedContent false pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0'] producer pdfTeX-1.40.20 resourceName b'10_1101-2021_02_12_430739.pdf' subject title Mutations in bdcA and valS correlate with quinolone resistance in wastewater Escherichia Coli trapped False xmp:CreatorTool LaTeX with hyperref xmpMM:DocumentID uuid:85c69c39-1dd2-11b2-0a00-ce09271d5700 xmpTPg:NPages 13 === file2bib.sh === id: 10_1101-2021_02_08_430270 author: Gerard, David title: Scalable Bias-corrected Linkage Disequilibrium Estimation Under Genotype Uncertainty date: 2021 pages: 22 extension: .pdf txt: ./txt/10_1101-2021_02_08_430270.txt cache: ./cache/10_1101-2021_02_08_430270.pdf Author Content-Type application/pdf Creation-Date 2021-02-06T15:17:17Z Keywords Last-Modified 2021-02-14T21:22:17Z Last-Save-Date 2021-02-14T21:22:17Z PTEX.Fullbanner This is pdfTeX, Version 3.14159265-2.6-1.40.20 (TeX Live 2019/Debian) kpathsea version 6.3.1 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 140 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true 
access_permission:fill_in_form true access_permission:modify_annotations true cp:subject created 2021-02-06T15:17:17Z creator date 2021-02-14T21:22:17Z dc:creator dc:format application/pdf; version=1.5 dc:subject dc:title Scalable Bias-corrected Linkage Disequilibrium Estimation Under Genotype Uncertainty dcterms:created 2021-02-06T15:17:17Z dcterms:modified 2021-02-14T21:22:17Z meta:author meta:creation-date 2021-02-06T15:17:17Z meta:keyword meta:save-date 2021-02-14T21:22:17Z modified 2021-02-14T21:22:17Z pdf:PDFVersion 1.5 pdf:charsPerPage ['2802', '2526', '2381', '3475', '1934', '1023', '1087', '826', '1646', '807', '1100', '1292', '1510', '1328', '1406', '1480', '2341', '1019', '1098', '754', '3273', '1665'] pdf:docinfo:created 2021-02-06T15:17:17Z pdf:docinfo:creator pdf:docinfo:creator_tool LaTeX with hyperref pdf:docinfo:custom:PTEX.Fullbanner This is pdfTeX, Version 3.14159265-2.6-1.40.20 (TeX Live 2019/Debian) kpathsea version 6.3.1 pdf:docinfo:keywords pdf:docinfo:modified 2021-02-14T21:22:17Z pdf:docinfo:producer pdfTeX-1.40.20 pdf:docinfo:subject pdf:docinfo:title Scalable Bias-corrected Linkage Disequilibrium Estimation Under Genotype Uncertainty pdf:docinfo:trapped False pdf:encrypted false pdf:hasMarkedContent false pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['0', '14', '14', '2', '0', '0', '0', '0', '23', '32', '30', '28', '0', '7', '16', '10', '8', '19', '0', '0', '0', '0'] producer pdfTeX-1.40.20 resourceName b'10_1101-2021_02_08_430270.pdf' subject title Scalable Bias-corrected Linkage Disequilibrium Estimation Under Genotype Uncertainty trapped False xmp:CreatorTool LaTeX with hyperref xmpMM:DocumentID uuid:85c73f0b-1dd2-11b2-0a00-810827edca00 xmpTPg:NPages 22 === file2bib.sh === id: 10_1101-2021_02_08_430275 author: Zhang, Jianbo title: Next-generation sequencing-based bulked segregant analysis without sequencing the parental genomes date: 2021 pages: 6 extension: .pdf txt: ./txt/10_1101-2021_02_08_430275.txt cache: 
./cache/10_1101-2021_02_08_430275.pdf Author Content-Type application/pdf Creation-Date 2020-11-24T15:53:05Z Keywords Last-Modified 2021-02-14T21:22:17Z Last-Save-Date 2021-02-14T21:22:17Z PTEX.Fullbanner This is pdfTeX, Version 3.14159265-2.6-1.40.20 (TeX Live 2019/Debian) kpathsea version 6.3.1 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 456 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true cp:subject created 2020-11-24T15:53:05Z creator date 2021-02-14T21:22:17Z dc:creator dc:format application/pdf; version=1.5 dc:subject dc:title Next-generation sequencing-based bulked segregant analysis without sequencing the parental genomes dcterms:created 2020-11-24T15:53:05Z dcterms:modified 2021-02-14T21:22:17Z meta:author meta:creation-date 2020-11-24T15:53:05Z meta:keyword meta:save-date 2021-02-14T21:22:17Z modified 2021-02-14T21:22:17Z pdf:PDFVersion 1.5 pdf:charsPerPage ['5882', '7329', '6126', '2575', '3579', '6462'] pdf:docinfo:created 2020-11-24T15:53:05Z pdf:docinfo:creator pdf:docinfo:creator_tool LaTeX with hyperref pdf:docinfo:custom:PTEX.Fullbanner This is pdfTeX, Version 3.14159265-2.6-1.40.20 (TeX Live 2019/Debian) kpathsea version 6.3.1 pdf:docinfo:keywords pdf:docinfo:modified 2021-02-14T21:22:17Z pdf:docinfo:producer pdfTeX-1.40.20 pdf:docinfo:subject pdf:docinfo:title Next-generation sequencing-based bulked segregant analysis without sequencing the parental genomes pdf:docinfo:trapped False pdf:encrypted false pdf:hasMarkedContent false pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['3', '0', '0', '0', '0', '0'] producer 
pdfTeX-1.40.20 resourceName b'10_1101-2021_02_08_430275.pdf' subject title Next-generation sequencing-based bulked segregant analysis without sequencing the parental genomes trapped False xmp:CreatorTool LaTeX with hyperref xmpMM:DocumentID uuid:85c6d549-1dd2-11b2-0a00-840827bd7200 xmpTPg:NPages 6 === file2bib.sh === id: 10_1101-2020_11_17_386649 author: Danciu, Daniel title: Topology-based Sparsification of Graph Annotations date: 2021 pages: 15 extension: .pdf txt: ./txt/10_1101-2020_11_17_386649.txt cache: ./cache/10_1101-2020_11_17_386649.pdf Content-Type application/pdf Creation-Date 2021-02-10T17:24:37Z Last-Modified 2021-02-14T21:22:17Z Last-Save-Date 2021-02-14T21:22:17Z PTEX.Fullbanner This is pdfTeX, Version 3.14159265-2.6-1.40.21 (TeX Live 2020) kpathsea version 6.3.2 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 107 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true created 2021-02-10T17:24:37Z date 2021-02-14T21:22:17Z dc:format application/pdf; version=1.5 dc:title Topology-based Sparsification of Graph Annotations dcterms:created 2021-02-10T17:24:37Z dcterms:modified 2021-02-14T21:22:17Z meta:creation-date 2021-02-10T17:24:37Z meta:save-date 2021-02-14T21:22:17Z modified 2021-02-14T21:22:17Z pdf:PDFVersion 1.5 pdf:charsPerPage ['3229', '3783', '3431', '3239', '3042', '2396', '2551', '2253', '3283', '3496', '2308', '1552', '3245', '2696', '2690'] pdf:docinfo:created 2021-02-10T17:24:37Z pdf:docinfo:creator_tool TeX pdf:docinfo:custom:PTEX.Fullbanner This is pdfTeX, Version 3.14159265-2.6-1.40.21 (TeX Live 2020) kpathsea version 6.3.2 pdf:docinfo:modified 
2021-02-14T21:22:17Z pdf:docinfo:producer pdfTeX-1.40.21 pdf:docinfo:title Topology-based Sparsification of Graph Annotations pdf:docinfo:trapped False pdf:encrypted false pdf:hasMarkedContent false pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['0', '0', '0', '2', '5', '1', '7', '0', '0', '0', '0', '0', '0', '0', '0'] producer pdfTeX-1.40.21 resourceName b'10_1101-2020_11_17_386649.pdf' title Topology-based Sparsification of Graph Annotations trapped False xmp:CreatorTool TeX xmpMM:DocumentID uuid:85c770f7-1dd2-11b2-0a00-fe08275dc400 xmpTPg:NPages 15 === file2bib.sh === id: 10_1101-2021_02_11_430847 author: Pinatti, Lisa M. title: SearcHPV: a novel approach to identify and assemble human papillomavirus-host genomic integration events in cancer date: 2021 pages: 26 extension: .pdf txt: ./txt/10_1101-2021_02_11_430847.txt cache: ./cache/10_1101-2021_02_11_430847.pdf Author Brenner, Chad Content-Type application/pdf Creation-Date 2021-02-11T21:58:56Z Last-Modified 2021-02-14T21:22:17Z Last-Save-Date 2021-02-14T21:22:17Z X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 250 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true created 2021-02-11T21:58:56Z creator Brenner, Chad date 2021-02-14T21:22:17Z dc:creator Brenner, Chad dc:format application/pdf; version=1.6 dc:title SearcHPV: a novel approach to identify and assemble human papillomavirus-host genomic integration events in cancer dcterms:created 2021-02-11T21:58:56Z dcterms:modified 2021-02-14T21:22:17Z meta:author Brenner, Chad meta:creation-date 2021-02-11T21:58:56Z meta:save-date 
2021-02-14T21:22:17Z modified 2021-02-14T21:22:17Z pdf:PDFVersion 1.6 pdf:charsPerPage ['2756', '1833', '2101', '558', '2288', '2027', '1758', '2115', '2279', '1965', '2251', '2283', '2408', '2357', '2974', '3686', '3256', '2101', '1500', '655', '352', '537', '441', '190', '385', '30'] pdf:docinfo:created 2021-02-11T21:58:56Z pdf:docinfo:creator Brenner, Chad pdf:docinfo:creator_tool Acrobat PDFMaker 21 for Word pdf:docinfo:modified 2021-02-14T21:22:17Z pdf:docinfo:producer Adobe Acrobat Pro DC (32-bit) 21.1.20135 pdf:docinfo:title SearcHPV: a novel approach to identify and assemble human papillomavirus-host genomic integration events in cancer pdf:encrypted false pdf:hasMarkedContent true pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0'] producer Adobe Acrobat Pro DC (32-bit) 21.1.20135 resourceName b'10_1101-2021_02_11_430847.pdf' title SearcHPV: a novel approach to identify and assemble human papillomavirus-host genomic integration events in cancer xmp:CreatorTool Acrobat PDFMaker 21 for Word xmpMM:DocumentID uuid:480d24e0-a20f-4afa-a462-139f117bb089 xmpTPg:NPages 26 === file2bib.sh === id: 10_1101-2021_02_10_430656 author: Zakeri, Mohsen title: A like-for-like comparison of lightweight-mapping pipelines for single-cell RNA-seq data pre-processing date: 2021 pages: 7 extension: .pdf txt: ./txt/10_1101-2021_02_10_430656.txt cache: ./cache/10_1101-2021_02_10_430656.pdf Author Content-Type application/pdf Creation-Date 2021-02-10T20:57:39Z Keywords Last-Modified 2021-02-14T21:22:17Z Last-Save-Date 2021-02-14T21:22:17Z PTEX.Fullbanner This is pdfTeX, Version 3.14159265-2.6-1.40.21 (TeX Live 2020) kpathsea version 6.3.2 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 124 
access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true cp:subject created 2021-02-10T20:57:39Z creator date 2021-02-14T21:22:17Z dc:creator dc:format application/pdf; version=1.5 dc:subject dc:title A like-for-like comparison of lightweight-mapping pipelines for single-cell RNA-seq data pre-processing dcterms:created 2021-02-10T20:57:39Z dcterms:modified 2021-02-14T21:22:17Z meta:author meta:creation-date 2021-02-10T20:57:39Z meta:keyword meta:save-date 2021-02-14T21:22:17Z modified 2021-02-14T21:22:17Z pdf:PDFVersion 1.5 pdf:charsPerPage ['5995', '6561', '3463', '792', '6766', '6861', '5533'] pdf:docinfo:created 2021-02-10T20:57:39Z pdf:docinfo:creator pdf:docinfo:creator_tool LaTeX with hyperref pdf:docinfo:custom:PTEX.Fullbanner This is pdfTeX, Version 3.14159265-2.6-1.40.21 (TeX Live 2020) kpathsea version 6.3.2 pdf:docinfo:keywords pdf:docinfo:modified 2021-02-14T21:22:17Z pdf:docinfo:producer pdfTeX-1.40.21 pdf:docinfo:subject pdf:docinfo:title A like-for-like comparison of lightweight-mapping pipelines for single-cell RNA-seq data pre-processing pdf:docinfo:trapped False pdf:encrypted false pdf:hasMarkedContent false pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['1', '0', '0', '0', '0', '0', '0'] producer pdfTeX-1.40.21 resourceName b'10_1101-2021_02_10_430656.pdf' subject title A like-for-like comparison of lightweight-mapping pipelines for single-cell RNA-seq data pre-processing trapped False xmp:CreatorTool LaTeX with hyperref xmpMM:DocumentID uuid:85c7268f-1dd2-11b2-0a00-ee08271d5700 xmpTPg:NPages 7 === file2bib.sh === id: 10_1101-2021_02_08_430070 author: Zhang, Yao-zhong title: On the application of BERT models for nanopore methylation detection date: 2021 pages: 7 extension: 
.pdf txt: ./txt/10_1101-2021_02_08_430070.txt cache: ./cache/10_1101-2021_02_08_430070.pdf Content-Type application/pdf Creation-Date 2021-02-09T06:48:34Z Last-Modified 2021-02-14T21:22:17Z Last-Save-Date 2021-02-14T21:22:17Z X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 220 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true created 2021-02-09T06:48:34Z date 2021-02-14T21:22:17Z dc:format application/pdf; version=1.7 dc:title On the application of BERT models for nanopore methylation detection dcterms:created 2021-02-09T06:48:34Z dcterms:modified 2021-02-14T21:22:17Z meta:creation-date 2021-02-09T06:48:34Z meta:save-date 2021-02-14T21:22:17Z modified 2021-02-14T21:22:17Z pdf:PDFVersion 1.7 pdf:charsPerPage ['4064', '4333', '5898', '4026', '4707', '1134', '4710'] pdf:docinfo:created 2021-02-09T06:48:34Z pdf:docinfo:creator_tool dvips(k) 2020.1 Copyright 2020 Radical Eye Software pdf:docinfo:modified 2021-02-14T21:22:17Z pdf:docinfo:producer GPL Ghostscript 9.50 pdf:docinfo:title On the application of BERT models for nanopore methylation detection pdf:encrypted false pdf:hasMarkedContent false pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['0', '0', '2', '0', '0', '0', '0'] producer GPL Ghostscript 9.50 resourceName b'10_1101-2021_02_08_430070.pdf' title On the application of BERT models for nanopore methylation detection xmp:CreatorTool dvips(k) 2020.1 Copyright 2020 Radical Eye Software xmpMM:DocumentID uuid:60da6715-a2bf-11f6-0000-35e12fd4d910 xmpTPg:NPages 7 === file2bib.sh === id: 10_1101-2021_02_12_430923 author: Modi, Vivek title: Kincore: a 
web resource for structural classification of protein kinases and their inhibitors date: 2021 pages: 18 extension: .pdf txt: ./txt/10_1101-2021_02_12_430923.txt cache: ./cache/10_1101-2021_02_12_430923.pdf Author vivekmodi Content-Type application/pdf Creation-Date 2021-02-12T12:59:47Z Last-Modified 2021-02-14T21:22:17Z Last-Save-Date 2021-02-14T21:22:17Z X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 121 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true created 2021-02-12T12:59:47Z creator vivekmodi date 2021-02-14T21:22:17Z dc:creator vivekmodi dc:format application/pdf; version=1.7 dc:language en-US dc:title Kincore: a web resource for structural classification of protein kinases and their inhibitors dcterms:created 2021-02-12T12:59:47Z dcterms:modified 2021-02-14T21:22:17Z language en-US meta:author vivekmodi meta:creation-date 2021-02-12T12:59:47Z meta:save-date 2021-02-14T21:22:17Z modified 2021-02-14T21:22:17Z pdf:PDFVersion 1.7 pdf:charsPerPage ['548', '2461', '4120', '4318', '3898', '1805', '3138', '1650', '3611', '1882', '1633', '2942', '3687', '3445', '2654', '2853', '3053', '2600'] pdf:docinfo:created 2021-02-12T12:59:47Z pdf:docinfo:creator vivekmodi pdf:docinfo:creator_tool Microsoft Word pdf:docinfo:modified 2021-02-14T21:22:17Z pdf:docinfo:title Kincore: a web resource for structural classification of protein kinases and their inhibitors pdf:encrypted false pdf:hasMarkedContent true pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0'] resourceName 
b'10_1101-2021_02_12_430923.pdf' title Kincore: a web resource for structural classification of protein kinases and their inhibitors xmp:CreatorTool Microsoft Word xmpMM:DocumentID uuid:86B06038-C2B3-4083-8EB1-C3E7E10688FB xmpTPg:NPages 18 === file2bib.sh === id: 10_1101-2021_02_12_430989 author: Sofer, Tamar title: Benchmarking Association Analyses of Continuous Exposures with RNA-seq in Observational Studies date: 2021 pages: 27 extension: .pdf txt: ./txt/10_1101-2021_02_12_430989.txt cache: ./cache/10_1101-2021_02_12_430989.pdf Author Administrator Content-Type application/pdf Creation-Date 2021-02-12T20:11:32Z Last-Modified 2021-02-14T21:22:17Z Last-Save-Date 2021-02-14T21:22:17Z X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 102 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true created 2021-02-12T20:11:32Z creator Administrator date 2021-02-14T21:22:17Z dc:creator Administrator dc:format application/pdf; version=1.4 dc:title Benchmarking Association Analyses of Continuous Exposures with RNA-seq in Observational Studies dcterms:created 2021-02-12T20:11:32Z dcterms:modified 2021-02-14T21:22:17Z meta:author Administrator meta:creation-date 2021-02-12T20:11:32Z meta:save-date 2021-02-14T21:22:17Z modified 2021-02-14T21:22:17Z pdf:PDFVersion 1.4 pdf:charsPerPage ['1793', '1826', '2384', '2072', '2125', '2065', '2176', '2085', '2167', '2092', '2248', '2203', '2202', '1934', '2013', '2319', '2363', '2242', '2021', '1882', '2977', '3460', '1111', '868', '1358', '994', '946'] pdf:docinfo:created 2021-02-12T20:11:32Z pdf:docinfo:creator Administrator 
pdf:docinfo:creator_tool PScript5.dll Version 5.2.2 pdf:docinfo:modified 2021-02-14T21:22:17Z pdf:docinfo:producer Acrobat Distiller 8.1.0 (Windows) pdf:docinfo:title Benchmarking Association Analyses of Continuous Exposures with RNA-seq in Observational Studies pdf:encrypted false pdf:hasMarkedContent false pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['0', '0', '0', '0', '0', '0', '44', '71', '10', '47', '0', '2', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '1', '0', '0'] producer Acrobat Distiller 8.1.0 (Windows) resourceName b'10_1101-2021_02_12_430989.pdf' title Benchmarking Association Analyses of Continuous Exposures with RNA-seq in Observational Studies xmp:CreatorTool PScript5.dll Version 5.2.2 xmpMM:DocumentID uuid:85c70ff8-1dd2-11b2-0a00-b709275d6100 xmpTPg:NPages 27 === file2bib.sh === id: 10_1101-2021_02_12_430963 author: Gerber, Stefan title: Streamlining differential exon and 3' UTR usage with diffUTR date: 2021 pages: 17 extension: .pdf txt: ./txt/10_1101-2021_02_12_430963.txt cache: ./cache/10_1101-2021_02_12_430963.pdf Author Content-Type application/pdf Creation-Date 2021-02-12T16:44:54Z Keywords Last-Modified 2021-02-14T21:22:16Z Last-Save-Date 2021-02-14T21:22:16Z PTEX.Fullbanner This is pdfTeX, Version 3.14159265-2.6-1.40.21 (TeX Live 2020) kpathsea version 6.3.2 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 313 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true cp:subject created 2021-02-12T16:44:54Z creator date 2021-02-14T21:22:16Z dc:creator dc:format application/pdf; version=1.5 dc:subject dc:title 
Streamlining differential exon and 3' UTR usage with diffUTR dcterms:created 2021-02-12T16:44:54Z dcterms:modified 2021-02-14T21:22:16Z meta:author meta:creation-date 2021-02-12T16:44:54Z meta:keyword meta:save-date 2021-02-14T21:22:16Z modified 2021-02-14T21:22:16Z pdf:PDFVersion 1.5 pdf:charsPerPage ['1413', '2472', '1908', '2213', '2583', '2365', '1177', '2644', '2325', '2335', '2542', '2188', '2388', '2261', '2673', '3135', '1179'] pdf:docinfo:created 2021-02-12T16:44:54Z pdf:docinfo:creator pdf:docinfo:creator_tool LaTeX with hyperref pdf:docinfo:custom:PTEX.Fullbanner This is pdfTeX, Version 3.14159265-2.6-1.40.21 (TeX Live 2020) kpathsea version 6.3.2 pdf:docinfo:keywords pdf:docinfo:modified 2021-02-14T21:22:16Z pdf:docinfo:producer pdfTeX-1.40.21 pdf:docinfo:subject pdf:docinfo:title Streamlining differential exon and 3' UTR usage with diffUTR pdf:docinfo:trapped False pdf:encrypted false pdf:hasMarkedContent false pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '6', '0', '0', '5', '1', '0'] producer pdfTeX-1.40.21 resourceName b'10_1101-2021_02_12_430963.pdf' subject title Streamlining differential exon and 3' UTR usage with diffUTR trapped False xmp:CreatorTool LaTeX with hyperref xmpMM:DocumentID uuid:85c690de-1dd2-11b2-0a00-740827bd7200 xmpTPg:NPages 17 === file2bib.sh === id: 10_1101-2021_02_11_430871 author: Vadnais, David title: ParticleChromo3D: A Particle Swarm Optimization Algorithm for Chromosome and Genome 3D Structure Prediction from Hi-C Data date: 2021 pages: 24 extension: .pdf txt: ./txt/10_1101-2021_02_11_430871.txt cache: ./cache/10_1101-2021_02_11_430871.pdf Author David Vadnais Content-Type application/pdf Creation-Date 2021-02-12T05:29:17Z Last-Modified 2021-02-14T21:22:17Z Last-Save-Date 2021-02-14T21:22:17Z X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler 
X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 128 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true created 2021-02-12T05:29:17Z creator David Vadnais date 2021-02-14T21:22:17Z dc:creator David Vadnais dc:format application/pdf; version=1.7 dc:language en-US dc:title ParticleChromo3D: A Particle Swarm Optimization Algorithm for Chromosome and Genome 3D Structure Prediction from Hi-C Data dcterms:created 2021-02-12T05:29:17Z dcterms:modified 2021-02-14T21:22:17Z language en-US meta:author David Vadnais meta:creation-date 2021-02-12T05:29:17Z meta:save-date 2021-02-14T21:22:17Z modified 2021-02-14T21:22:17Z pdf:PDFVersion 1.7 pdf:charsPerPage ['4816', '5498', '3558', '1617', '3207', '2894', '1245', '2154', '1938', '1478', '2449', '2665', '1841', '1873', '1080', '2052', '1332', '1975', '2190', '3674', '4372', '4469', '1590', '1943'] pdf:docinfo:created 2021-02-12T05:29:17Z pdf:docinfo:creator David Vadnais pdf:docinfo:creator_tool Microsoft Word pdf:docinfo:modified 2021-02-14T21:22:17Z pdf:docinfo:title ParticleChromo3D: A Particle Swarm Optimization Algorithm for Chromosome and Genome 3D Structure Prediction from Hi-C Data pdf:encrypted false pdf:hasMarkedContent true pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0'] resourceName b'10_1101-2021_02_11_430871.pdf' title ParticleChromo3D: A Particle Swarm Optimization Algorithm for Chromosome and Genome 3D Structure Prediction from Hi-C Data xmp:CreatorTool Microsoft Word xmpMM:DocumentID uuid:1B1B6485-34DE-46C5-9550-0CC4C4D733CD xmpTPg:NPages 24 === file2bib.sh === id: 10_1101-2021_02_11_430695 author: Gordon-Rodriguez, 
Elliott title: Learning Sparse Log-Ratios for High-Throughput Sequencing Data date: 2021 pages: 12 extension: .pdf txt: ./txt/10_1101-2021_02_11_430695.txt cache: ./cache/10_1101-2021_02_11_430695.pdf Author Elliott Gordon-Rodriguez, Thomas P. Quinn, John P. Cunningham Content-Type application/pdf Creation-Date 2021-02-11T17:27:55Z Keywords Machine Learning, ICML Last-Modified 2021-02-11T17:27:55Z Last-Save-Date 2021-02-11T17:27:55Z PTEX.Fullbanner This is pdfTeX, Version 3.14159265-2.6-1.40.20 (TeX Live 2019) kpathsea version 6.3.1 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 115 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true cp:subject Proceedings of the International Conference on Machine Learning 2021 created 2021-02-11T17:27:55Z creator Elliott Gordon-Rodriguez, Thomas P. Quinn, John P. Cunningham date 2021-02-11T17:27:55Z dc:creator Elliott Gordon-Rodriguez, Thomas P. Quinn, John P. Cunningham dc:format application/pdf; version=1.5 dc:subject Machine Learning, ICML dc:title Learning Sparse Log-Ratios for High-Throughput Sequencing Data dcterms:created 2021-02-11T17:27:55Z dcterms:modified 2021-02-11T17:27:55Z meta:author Elliott Gordon-Rodriguez, Thomas P. Quinn, John P. Cunningham meta:creation-date 2021-02-11T17:27:55Z meta:keyword Machine Learning, ICML meta:save-date 2021-02-11T17:27:55Z modified 2021-02-11T17:27:55Z pdf:PDFVersion 1.5 pdf:charsPerPage ['4061', '4908', '4284', '3291', '3759', '3982', '4289', '3758', '3949', '3927', '4057', '230'] pdf:docinfo:created 2021-02-11T17:27:55Z pdf:docinfo:creator Elliott Gordon-Rodriguez, Thomas P. 
Quinn, John P. Cunningham pdf:docinfo:creator_tool LaTeX with hyperref pdf:docinfo:custom:PTEX.Fullbanner This is pdfTeX, Version 3.14159265-2.6-1.40.20 (TeX Live 2019) kpathsea version 6.3.1 pdf:docinfo:keywords Machine Learning, ICML pdf:docinfo:modified 2021-02-11T17:27:55Z pdf:docinfo:producer pdfTeX-1.40.20 pdf:docinfo:subject Proceedings of the International Conference on Machine Learning 2021 pdf:docinfo:title Learning Sparse Log-Ratios for High-Throughput Sequencing Data pdf:docinfo:trapped False pdf:encrypted false pdf:hasMarkedContent false pdf:hasXFA false pdf:hasXMP false pdf:unmappedUnicodeCharsPerPage ['0', '2', '8', '14', '6', '0', '0', '0', '0', '0', '0', '0'] producer pdfTeX-1.40.20 resourceName b'10_1101-2021_02_11_430695.pdf' subject Proceedings of the International Conference on Machine Learning 2021 title Learning Sparse Log-Ratios for High-Throughput Sequencing Data trapped False xmp:CreatorTool LaTeX with hyperref xmpTPg:NPages 12 === file2bib.sh === id: 10_1101-2020_09_21_305516 author: Nikolic, Ana title: Copy-scAT: Deconvoluting single-cell chromatin accessibility of genetic subclones in cancer date: 2021 pages: 32 extension: .pdf txt: ./txt/10_1101-2020_09_21_305516.txt cache: ./cache/10_1101-2020_09_21_305516.pdf Content-Type application/pdf Creation-Date 2021-02-12T16:20:22Z Last-Modified 2021-02-14T21:22:18Z Last-Save-Date 2021-02-14T21:22:18Z X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 126 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true created 2021-02-12T16:20:22Z date 2021-02-14T21:22:18Z dc:format application/pdf; version=1.4 
dc:title Copy-scAT: Deconvoluting single-cell chromatin accessibility of genetic subclones in cancer dcterms:created 2021-02-12T16:20:22Z dcterms:modified 2021-02-14T21:22:18Z meta:creation-date 2021-02-12T16:20:22Z meta:save-date 2021-02-14T21:22:18Z modified 2021-02-14T21:22:18Z pdf:PDFVersion 1.4 pdf:charsPerPage ['3381', '4467', '1165', '5419', '866', '4521', '1520', '4215', '669', '1345', '4158', '4270', '3473', '3942', '4230', '3824', '810', '1165', '884', '856', '689', '1095', '704', '725', '776', '807', '846', '1061', '990', '704', '703', '627'] pdf:docinfo:created 2021-02-12T16:20:22Z pdf:docinfo:creator_tool Word pdf:docinfo:modified 2021-02-14T21:22:18Z pdf:docinfo:producer macOS Version 11.2.1 (Build 20D74) Quartz PDFContext pdf:docinfo:title Copy-scAT: Deconvoluting single-cell chromatin accessibility of genetic subclones in cancer pdf:encrypted false pdf:hasMarkedContent false pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0'] producer macOS Version 11.2.1 (Build 20D74) Quartz PDFContext resourceName b'10_1101-2020_09_21_305516.pdf' title Copy-scAT: Deconvoluting single-cell chromatin accessibility of genetic subclones in cancer xmp:CreatorTool Word xmpMM:DocumentID uuid:85c8661d-1dd2-11b2-0a00-f209277d8900 xmpTPg:NPages 32 === file2bib.sh === id: 10_1101-2021_02_12_430830 author: Gergely, Tibély title: Simultaneous estimation of per cell division mutation rate and turnover rate from bulk tumor sequence data date: 2021 pages: 19 extension: .pdf txt: ./txt/10_1101-2021_02_12_430830.txt cache: ./cache/10_1101-2021_02_12_430830.pdf Author Content-Type application/pdf Creation-Date 2021-02-12T12:51:07Z Keywords Last-Modified 2021-02-14T21:22:16Z Last-Save-Date 2021-02-14T21:22:16Z PTEX.Fullbanner This is pdfTeX, Version 3.14159265-2.6-1.40.16 (TeX Live 2015) kpathsea version 6.2.1 
X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 300 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true cp:subject created 2021-02-12T12:51:07Z creator date 2021-02-14T21:22:16Z dc:creator dc:format application/pdf; version=1.5 dc:subject dc:title Simultaneous estimation of per cell division mutation rate and turnover rate from bulk tumor sequence data dcterms:created 2021-02-12T12:51:07Z dcterms:modified 2021-02-14T21:22:16Z meta:author meta:creation-date 2021-02-12T12:51:07Z meta:keyword meta:save-date 2021-02-14T21:22:16Z modified 2021-02-14T21:22:16Z pdf:PDFVersion 1.5 pdf:charsPerPage ['2344', '2892', '3313', '1197', '2366', '1948', '2587', '2787', '1337', '1536', '3109', '1199', '2709', '1289', '3331', '3112', '2139', '2457', '945'] pdf:docinfo:created 2021-02-12T12:51:07Z pdf:docinfo:creator pdf:docinfo:creator_tool LaTeX with hyperref package pdf:docinfo:custom:PTEX.Fullbanner This is pdfTeX, Version 3.14159265-2.6-1.40.16 (TeX Live 2015) kpathsea version 6.2.1 pdf:docinfo:keywords pdf:docinfo:modified 2021-02-14T21:22:16Z pdf:docinfo:producer pdfTeX-1.40.16 pdf:docinfo:subject pdf:docinfo:title Simultaneous estimation of per cell division mutation rate and turnover rate from bulk tumor sequence data pdf:docinfo:trapped False pdf:encrypted false pdf:hasMarkedContent false pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['0', '0', '1', '0', '8', '25', '6', '3', '0', '0', '0', '0', '0', '0', '1', '0', '0', '0', '0'] producer pdfTeX-1.40.16 resourceName b'10_1101-2021_02_12_430830.pdf' subject title Simultaneous estimation of per cell division 
mutation rate and turnover rate from bulk tumor sequence data trapped False xmp:CreatorTool LaTeX with hyperref package xmpMM:DocumentID uuid:85c6b3e8-1dd2-11b2-0a00-130a277d8900 xmpTPg:NPages 19 === file2bib.sh === id: 10_1101-2021_02_11_430789 author: Tyagin, Ilya title: Accelerating COVID-19 research with graph mining and transformer-based learning date: 2021 pages: 9 extension: .pdf txt: ./txt/10_1101-2021_02_11_430789.txt cache: ./cache/10_1101-2021_02_11_430789.pdf Author Ilya Tyagin, Ankit Kulshrestha, Justin Sybrandt, Krish Matta, Michael Shtutman, and Ilya Safro Content-Type application/pdf Creation-Date 2021-02-10T14:30:51Z Keywords Hypothesis Generation, Literature-Based Discovery, Transformer Models, Semantic Networks, Biomedical Recommendation, Last-Modified 2021-02-14T21:22:26Z Last-Save-Date 2021-02-14T21:22:26Z PTEX.Fullbanner This is pdfTeX, Version 3.14159265-2.6-1.40.21 (TeX Live 2020) kpathsea version 6.3.2 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 130 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true cp:subject - Applied computing -> Bioinformatics.Document management and text processing.- Computing methodologies -> Learning latent representations.Neural networks.Information extraction.Semantic networks. 
created 2021-02-10T14:30:51Z creator Ilya Tyagin, Ankit Kulshrestha, Justin Sybrandt, Krish Matta, Michael Shtutman, and Ilya Safro date 2021-02-14T21:22:26Z dc:creator Ilya Tyagin, Ankit Kulshrestha, Justin Sybrandt, Krish Matta, Michael Shtutman, and Ilya Safro dc:description - Applied computing -> Bioinformatics.Document management and text processing.- Computing methodologies -> Learning latent representations.Neural networks.Information extraction.Semantic networks. dc:format application/pdf; version=1.5 dc:language en dc:subject Hypothesis Generation, Literature-Based Discovery, Transformer Models, Semantic Networks, Biomedical Recommendation, dc:title Accelerating COVID-19 research with graph mining and transformer-based learning dcterms:created 2021-02-10T14:30:51Z dcterms:modified 2021-02-14T21:22:26Z description - Applied computing -> Bioinformatics.Document management and text processing.- Computing methodologies -> Learning latent representations.Neural networks.Information extraction.Semantic networks. 
language en meta:author Ilya Tyagin, Ankit Kulshrestha, Justin Sybrandt, Krish Matta, Michael Shtutman, and Ilya Safro meta:creation-date 2021-02-10T14:30:51Z meta:keyword Hypothesis Generation, Literature-Based Discovery, Transformer Models, Semantic Networks, Biomedical Recommendation, meta:save-date 2021-02-14T21:22:26Z modified 2021-02-14T21:22:26Z pdf:PDFVersion 1.5 pdf:charsPerPage ['4414', '5587', '5796', '4953', '6020', '5204', '4953', '6112', '9146'] pdf:docinfo:created 2021-02-10T14:30:51Z pdf:docinfo:creator Ilya Tyagin, Ankit Kulshrestha, Justin Sybrandt, Krish Matta, Michael Shtutman, and Ilya Safro pdf:docinfo:creator_tool LaTeX with acmart 2020/04/30 v1.71 Typesetting articles for the Association for Computing Machinery and hyperref 2020-05-15 v7.00e Hypertext links for LaTeX pdf:docinfo:custom:PTEX.Fullbanner This is pdfTeX, Version 3.14159265-2.6-1.40.21 (TeX Live 2020) kpathsea version 6.3.2 pdf:docinfo:keywords Hypothesis Generation, Literature-Based Discovery, Transformer Models, Semantic Networks, Biomedical Recommendation, pdf:docinfo:modified 2021-02-14T21:22:26Z pdf:docinfo:producer pdfTeX-1.40.21 pdf:docinfo:subject - Applied computing -> Bioinformatics.Document management and text processing.- Computing methodologies -> Learning latent representations.Neural networks.Information extraction.Semantic networks. pdf:docinfo:title Accelerating COVID-19 research with graph mining and transformer-based learning pdf:docinfo:trapped False pdf:encrypted false pdf:hasMarkedContent false pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['0', '0', '0', '5', '0', '0', '0', '0', '0'] producer pdfTeX-1.40.21 resourceName b'10_1101-2021_02_11_430789.pdf' subject - Applied computing -> Bioinformatics.Document management and text processing.- Computing methodologies -> Learning latent representations.Neural networks.Information extraction.Semantic networks. 
title Accelerating COVID-19 research with graph mining and transformer-based learning trapped False xmp:CreatorTool LaTeX with acmart 2020/04/30 v1.71 Typesetting articles for the Association for Computing Machinery and hyperref 2020-05-15 v7.00e Hypertext links for LaTeX xmpMM:DocumentID uuid:85d60a5f-1dd2-11b2-0a00-e00927bd7700 xmpTPg:NPages 9 === file2bib.sh === id: 10_1101-2021_02_12_430764 author: Ascensión, Alex M. title: Triku: a feature selection method based on nearest neighbors for single-cell data date: 2021 pages: 18 extension: .pdf txt: ./txt/10_1101-2021_02_12_430764.txt cache: ./cache/10_1101-2021_02_12_430764.pdf Author Alex M. Ascensión, Olga Ibañez-Solé, Inaki Inza, Ander Izeta, Marcos J. Araúzo-Bravo Content-Type application/pdf Creation-Date 2021-02-12T10:37:24Z Keywords scRNAseq, feature selection, bioinformatics, python Last-Modified 2021-02-14T20:14:07Z Last-Save-Date 2021-02-14T20:14:07Z X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 327 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true created 2021-02-12T10:37:24Z creator Alex M. Ascensión, Olga Ibañez-Solé, Inaki Inza, Ander Izeta, Marcos J. Araúzo-Bravo date 2021-02-14T20:14:07Z dc:creator Alex M. Ascensión, Olga Ibañez-Solé, Inaki Inza, Ander Izeta, Marcos J. Araúzo-Bravo dc:format application/pdf; version=1.5 dc:subject scRNAseq, feature selection, bioinformatics, python dc:title Triku: a feature selection method based on nearest neighbors for single-cell data dcterms:created 2021-02-12T10:37:24Z dcterms:modified 2021-02-14T20:14:07Z meta:author Alex M. 
Ascensión, Olga Ibañez-Solé, Inaki Inza, Ander Izeta, Marcos J. Araúzo-Bravo meta:creation-date 2021-02-12T10:37:24Z meta:keyword scRNAseq, feature selection, bioinformatics, python meta:save-date 2021-02-14T20:14:07Z modified 2021-02-14T20:14:07Z pdf:PDFVersion 1.5 pdf:charsPerPage ['2989', '3432', '3121', '3143', '3126', '3107', '3377', '3213', '3024', '2866', '2104', '2654', '5322', '3235', '2377', '1210', '1017', '851'] pdf:docinfo:created 2021-02-12T10:37:24Z pdf:docinfo:creator Alex M. Ascensión, Olga Ibañez-Solé, Inaki Inza, Ander Izeta, Marcos J. Araúzo-Bravo pdf:docinfo:creator_tool LaTeX with hyperref pdf:docinfo:keywords scRNAseq, feature selection, bioinformatics, python pdf:docinfo:modified 2021-02-14T20:14:07Z pdf:docinfo:producer xdvipdfmx (20190225) pdf:docinfo:title Triku: a feature selection method based on nearest neighbors for single-cell data pdf:encrypted false pdf:hasMarkedContent false pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '396', '0', '0', '0'] producer xdvipdfmx (20190225) resourceName b'10_1101-2021_02_12_430764.pdf' subject scRNAseq, feature selection, bioinformatics, python title Triku: a feature selection method based on nearest neighbors for single-cell data xmp:CreatorTool LaTeX with hyperref xmpMM:DocumentID uuid:6d664b1d-1dd2-11b2-0a00-8208278d5b00 xmpTPg:NPages 18 === file2bib.sh === id: 10_1101-2020_02_04_934216 author: Kirchoff, Kathryn E. 
title: EMBER: Multi-label prediction of kinase-substrate phosphorylation events through deep learning date: 2021 pages: 13 extension: .pdf txt: ./txt/10_1101-2020_02_04_934216.txt cache: ./cache/10_1101-2020_02_04_934216.pdf Content-Type application/pdf Creation-Date 2021-02-10T16:35:43Z Last-Modified 2021-02-14T21:22:17Z Last-Save-Date 2021-02-14T21:22:17Z X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 276 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true created 2021-02-10T16:35:43Z date 2021-02-14T21:22:17Z dc:format application/pdf; version=1.4 dc:title EMBER: Multi-label prediction of kinase-substrate phosphorylation events through deep learning dcterms:created 2021-02-10T16:35:43Z dcterms:modified 2021-02-14T21:22:17Z meta:creation-date 2021-02-10T16:35:43Z meta:save-date 2021-02-14T21:22:17Z modified 2021-02-14T21:22:17Z pdf:PDFVersion 1.4 pdf:charsPerPage ['5129', '5413', '4261', '3539', '4285', '3469', '4716', '3179', '7747', '530', '1564', '978', '677'] pdf:docinfo:created 2021-02-10T16:35:43Z pdf:docinfo:creator_tool LaTeX with hyperref pdf:docinfo:modified 2021-02-14T21:22:17Z pdf:docinfo:producer macOS Version 10.15.7 (Build 19H2) Quartz PDFContext pdf:docinfo:title EMBER: Multi-label prediction of kinase-substrate phosphorylation events through deep learning pdf:encrypted false pdf:hasMarkedContent false pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '18', '29', '4'] producer macOS Version 10.15.7 (Build 19H2) Quartz PDFContext resourceName b'10_1101-2020_02_04_934216.pdf' title EMBER: 
Multi-label prediction of kinase-substrate phosphorylation events through deep learning xmp:CreatorTool LaTeX with hyperref xmpMM:DocumentID uuid:85c76a22-1dd2-11b2-0a00-000927fd5800 xmpTPg:NPages 13 === file2bib.sh === id: 10_1101-2021_02_08_428881 author: Lu, Yang Young title: ACE: Explaining cluster from an adversarial perspective date: 2021 pages: 12 extension: .pdf txt: ./txt/10_1101-2021_02_08_428881.txt cache: ./cache/10_1101-2021_02_08_428881.pdf Author Yang Young Lu, Timothy C. Yu, Giancarlo Bonora, William Stafford Noble Content-Type application/pdf Creation-Date 2021-02-09T13:00:55Z Keywords Machine Learning, ICML Last-Modified 2021-02-14T21:22:17Z Last-Save-Date 2021-02-14T21:22:17Z PTEX.Fullbanner This is MiKTeX-pdfTeX 2.9.4307 (1.40.12) X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 1420 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true cp:subject Proceedings of the International Conference on Machine Learning 2021 created 2021-02-09T13:00:55Z creator Yang Young Lu, Timothy C. Yu, Giancarlo Bonora, William Stafford Noble date 2021-02-14T21:22:17Z dc:creator Yang Young Lu, Timothy C. Yu, Giancarlo Bonora, William Stafford Noble dc:description Proceedings of the International Conference on Machine Learning 2021 dc:format application/pdf; version=1.5 dc:subject Machine Learning, ICML dc:title ACE: Explaining cluster from an adversarial perspective dcterms:created 2021-02-09T13:00:55Z dcterms:modified 2021-02-14T21:22:17Z description Proceedings of the International Conference on Machine Learning 2021 meta:author Yang Young Lu, Timothy C. 
Yu, Giancarlo Bonora, William Stafford Noble meta:creation-date 2021-02-09T13:00:55Z meta:keyword Machine Learning, ICML meta:save-date 2021-02-14T21:22:17Z modified 2021-02-14T21:22:17Z pdf:PDFVersion 1.5 pdf:charsPerPage ['4324', '4945', '3406', '4075', '4894', '3533', '4631', '3844', '4142', '3413', '561', '665'] pdf:docinfo:created 2021-02-09T13:00:55Z pdf:docinfo:creator Yang Young Lu, Timothy C. Yu, Giancarlo Bonora, William Stafford Noble pdf:docinfo:creator_tool LaTeX with hyperref package pdf:docinfo:custom:PTEX.Fullbanner This is MiKTeX-pdfTeX 2.9.4307 (1.40.12) pdf:docinfo:keywords Machine Learning, ICML pdf:docinfo:modified 2021-02-14T21:22:17Z pdf:docinfo:producer pdfTeX-1.40.12 pdf:docinfo:subject Proceedings of the International Conference on Machine Learning 2021 pdf:docinfo:title ACE: Explaining cluster from an adversarial perspective pdf:docinfo:trapped False pdf:encrypted false pdf:hasMarkedContent false pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['0', '0', '10', '27', '0', '0', '0', '0', '0', '0', '0', '0'] producer pdfTeX-1.40.12 resourceName b'10_1101-2021_02_08_428881.pdf' subject Proceedings of the International Conference on Machine Learning 2021 title ACE: Explaining cluster from an adversarial perspective trapped False xmp:CreatorTool LaTeX with hyperref package xmpMM:DocumentID uuid:85c6f0fe-1dd2-11b2-0a00-1a0927edca00 xmpTPg:NPages 12 === file2bib.sh === id: 10_1101-2021_02_10_430649 author: Wen, Zi-Hang title: Bfimpute: A Bayesian factorization method to recover single-cell RNA sequencing data date: 2021 pages: 11 extension: .pdf txt: ./txt/10_1101-2021_02_10_430649.txt cache: ./cache/10_1101-2021_02_10_430649.pdf Author Content-Type application/pdf Creation-Date 2021-02-12T16:42:36Z Keywords Last-Modified 2021-02-14T21:22:16Z Last-Save-Date 2021-02-14T21:22:16Z PTEX.Fullbanner This is pdfTeX, Version 3.14159265-2.6-1.40.21 (TeX Live 2020) kpathsea version 6.3.2 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 
'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 1286 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true cp:subject created 2021-02-12T16:42:36Z creator date 2021-02-14T21:22:16Z dc:creator dc:format application/pdf; version=1.5 dc:subject dc:title Bfimpute: A Bayesian factorization method to recover single-cell RNA sequencing data dcterms:created 2021-02-12T16:42:36Z dcterms:modified 2021-02-14T21:22:16Z meta:author meta:creation-date 2021-02-12T16:42:36Z meta:keyword meta:save-date 2021-02-14T21:22:16Z modified 2021-02-14T21:22:16Z pdf:PDFVersion 1.5 pdf:charsPerPage ['4396', '3587', '3999', '3258', '3868', '3519', '6427', '2602', '3216', '5326', '3921'] pdf:docinfo:created 2021-02-12T16:42:36Z pdf:docinfo:creator pdf:docinfo:creator_tool LaTeX with hyperref pdf:docinfo:custom:PTEX.Fullbanner This is pdfTeX, Version 3.14159265-2.6-1.40.21 (TeX Live 2020) kpathsea version 6.3.2 pdf:docinfo:keywords pdf:docinfo:modified 2021-02-14T21:22:16Z pdf:docinfo:producer pdfTeX-1.40.21 pdf:docinfo:subject pdf:docinfo:title Bfimpute: A Bayesian factorization method to recover single-cell RNA sequencing data pdf:docinfo:trapped False pdf:encrypted false pdf:hasMarkedContent false pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['0', '0', '57', '14', '0', '0', '0', '0', '0', '0', '0'] producer pdfTeX-1.40.21 resourceName b'10_1101-2021_02_10_430649.pdf' subject title Bfimpute: A Bayesian factorization method to recover single-cell RNA sequencing data trapped False xmp:CreatorTool LaTeX with hyperref xmpMM:DocumentID uuid:85c66f59-1dd2-11b2-0a00-f508278d5b00 xmpTPg:NPages 11 === file2bib.sh === id: 
10_1101-2021_02_12_430979 author: Da Silva, Kévin title: StrainFLAIR: Strain-level profiling of metagenomic samples using variation graphs date: 2021 pages: 20 extension: .pdf txt: ./txt/10_1101-2021_02_12_430979.txt cache: ./cache/10_1101-2021_02_12_430979.pdf Content-Type application/pdf Creation-Date 2021-02-12T17:18:42Z Last-Modified 2021-02-14T19:23:11Z Last-Save-Date 2021-02-14T19:23:11Z PTEX.Fullbanner This is pdfTeX, Version 3.14159265-2.6-1.40.21 (TeX Live 2020) kpathsea version 6.3.2 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 94 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true created 2021-02-12T17:18:42Z date 2021-02-14T19:23:11Z dc:format application/pdf; version=1.5 dc:title StrainFLAIR: Strain-level profiling of metagenomic samples using variation graphs dcterms:created 2021-02-12T17:18:42Z dcterms:modified 2021-02-14T19:23:11Z meta:creation-date 2021-02-12T17:18:42Z meta:save-date 2021-02-14T19:23:11Z modified 2021-02-14T19:23:11Z pdf:PDFVersion 1.5 pdf:charsPerPage ['3449', '4370', '2071', '4378', '2438', '4050', '3798', '3720', '2144', '2109', '4338', '4305', '4821', '3388', '3022', '1456', '1895', '1533', '1522', '541'] pdf:docinfo:created 2021-02-12T17:18:42Z pdf:docinfo:creator_tool TeX pdf:docinfo:custom:PTEX.Fullbanner This is pdfTeX, Version 3.14159265-2.6-1.40.21 (TeX Live 2020) kpathsea version 6.3.2 pdf:docinfo:modified 2021-02-14T19:23:11Z pdf:docinfo:producer pdfTeX-1.40.21 pdf:docinfo:title StrainFLAIR: Strain-level profiling of metagenomic samples using variation graphs pdf:docinfo:trapped False pdf:encrypted false 
pdf:hasMarkedContent false pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0'] producer pdfTeX-1.40.21 resourceName b'10_1101-2021_02_12_430979.pdf' title StrainFLAIR: Strain-level profiling of metagenomic samples using variation graphs trapped False xmp:CreatorTool TeX xmpMM:DocumentID uuid:5b2f399e-1dd2-11b2-0a00-d30827bd3700 xmpTPg:NPages 20 === file2bib.sh === id: 10_1101-2021_02_13_429885 author: Househam, Jacob title: A fully automated approach for quality control of cancer mutations in the era of high-resolution whole genome sequencing date: 2021 pages: 36 extension: .pdf txt: ./txt/10_1101-2021_02_13_429885.txt cache: ./cache/10_1101-2021_02_13_429885.pdf Content-Type application/pdf Creation-Date 2021-02-13T13:37:27Z Last-Modified 2021-02-14T21:22:17Z Last-Save-Date 2021-02-14T21:22:17Z X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 216 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true created 2021-02-13T13:37:27Z date 2021-02-14T21:22:17Z dc:format application/pdf; version=1.5 dc:title A fully automated approach for quality control of cancer mutations in the era of high-resolution whole genome sequencing dcterms:created 2021-02-13T13:37:27Z dcterms:modified 2021-02-14T21:22:17Z meta:creation-date 2021-02-13T13:37:27Z meta:save-date 2021-02-14T21:22:17Z modified 2021-02-14T21:22:17Z pdf:PDFVersion 1.5 pdf:charsPerPage ['2646', '3593', '3421', '2574', '2758', '3251', '2949', '3250', '4228', '4125', '2811', '2417', '2788', '2811', '2908', '729', 
'1969', '1259', '1715', '1413', '932', '646', '1253', '862', '744', '744', '745', '710', '711', '711', '711', '711', '712', '1331', '1098', '484'] pdf:docinfo:created 2021-02-13T13:37:27Z pdf:docinfo:modified 2021-02-14T21:22:17Z pdf:docinfo:producer Skia/PDF m90 pdf:docinfo:title A fully automated approach for quality control of cancer mutations in the era of high-resolution whole genome sequencing pdf:encrypted false pdf:hasMarkedContent false pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0'] producer Skia/PDF m90 resourceName b'10_1101-2021_02_13_429885.pdf' title A fully automated approach for quality control of cancer mutations in the era of high-resolution whole genome sequencing xmpMM:DocumentID uuid:a5aab842-f0f5-5f42-b83c-1d098c720a7a xmpTPg:NPages 36 === file2bib.sh === id: 10_1101-2021_02_10_430623 author: Aberasturi, Dillon title: “Single-subject studies”-derived analyses unveil altered biomechanisms between very small cohorts: implications for rare diseases date: 2021 pages: 9 extension: .pdf txt: ./txt/10_1101-2021_02_10_430623.txt cache: ./cache/10_1101-2021_02_10_430623.pdf Author Nima Pouladi Content-Type application/pdf Creation-Date 2021-02-10T16:03:33Z Last-Modified 2021-02-14T21:22:17Z Last-Save-Date 2021-02-14T21:22:17Z X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 388 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true created 2021-02-10T16:03:33Z 
creator Nima Pouladi date 2021-02-14T21:22:17Z dc:creator Nima Pouladi dc:format application/pdf; version=1.4 dc:title “Single-subject studies”-derived analyses unveil altered biomechanisms between very small cohorts: implications for rare diseases dcterms:created 2021-02-10T16:03:33Z dcterms:modified 2021-02-14T21:22:17Z meta:author Nima Pouladi meta:creation-date 2021-02-10T16:03:33Z meta:save-date 2021-02-14T21:22:17Z modified 2021-02-14T21:22:17Z pdf:PDFVersion 1.4 pdf:charsPerPage ['4080', '6240', '7703', '4913', '7277', '7679', '7670', '7920', '8386'] pdf:docinfo:created 2021-02-10T16:03:33Z pdf:docinfo:creator Nima Pouladi pdf:docinfo:creator_tool Word pdf:docinfo:modified 2021-02-14T21:22:17Z pdf:docinfo:producer macOS Version 10.15.7 (Build 19H2) Quartz PDFContext pdf:docinfo:title “Single-subject studies”-derived analyses unveil altered biomechanisms between very small cohorts: implications for rare diseases pdf:encrypted false pdf:hasMarkedContent false pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['0', '0', '43', '145', '4', '0', '2', '0', '0'] producer macOS Version 10.15.7 (Build 19H2) Quartz PDFContext resourceName b'10_1101-2021_02_10_430623.pdf' title “Single-subject studies”-derived analyses unveil altered biomechanisms between very small cohorts: implications for rare diseases xmp:CreatorTool Word xmpMM:DocumentID uuid:85c83ffc-1dd2-11b2-0a00-1509278d5b00 xmpTPg:NPages 9 === file2bib.sh === id: 10_1101-2020_09_02_279521 author: Abi Nader, Clément title: Simulating the outcome of amyloid treatments in Alzheimer’s disease from imaging and clinical data date: 2021 pages: 32 extension: .pdf txt: ./txt/10_1101-2020_09_02_279521.txt cache: ./cache/10_1101-2020_09_02_279521.pdf Author Luigi Content-Type application/pdf Creation-Date 2021-02-10T15:18:40Z Last-Modified 2021-02-14T21:22:17Z Last-Save-Date 2021-02-14T21:22:17Z X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] 
X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 151 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true created 2021-02-10T15:18:40Z creator Luigi date 2021-02-14T21:22:17Z dc:creator Luigi dc:format application/pdf; version=1.7 dc:language en-US dc:title Simulating the outcome of amyloid treatments in Alzheimer’s disease from imaging and clinical data dcterms:created 2021-02-10T15:18:40Z dcterms:modified 2021-02-14T21:22:17Z language en-US meta:author Luigi meta:creation-date 2021-02-10T15:18:40Z meta:save-date 2021-02-14T21:22:17Z modified 2021-02-14T21:22:17Z pdf:PDFVersion 1.7 pdf:charsPerPage ['2434', '2713', '3212', '3195', '2791', '3085', '3958', '1263', '1767', '2315', '1805', '2788', '2584', '1738', '1346', '2581', '2596', '658', '3084', '3005', '3258', '3194', '2934', '2319', '2637', '2729', '2500', '2660', '2549', '2716', '2848', '509'] pdf:docinfo:created 2021-02-10T15:18:40Z pdf:docinfo:creator Luigi pdf:docinfo:creator_tool Microsoft® Word for Microsoft 365 pdf:docinfo:modified 2021-02-14T21:22:17Z pdf:docinfo:producer Microsoft® Word for Microsoft 365 pdf:docinfo:title Simulating the outcome of amyloid treatments in Alzheimer’s disease from imaging and clinical data pdf:encrypted false pdf:hasMarkedContent true pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0'] producer Microsoft® Word for Microsoft 365 resourceName b'10_1101-2020_09_02_279521.pdf' title Simulating the outcome of amyloid treatments in Alzheimer’s disease from imaging and clinical data xmp:CreatorTool 
Microsoft® Word for Microsoft 365 xmpMM:DocumentID uuid:06032D44-3EB9-44B9-86B0-2BD226EEB074 xmpTPg:NPages 32 === file2bib.sh === id: 10_1101-2021_02_10_430512 author: Kim, Catherine title: Prediction of adverse drug reactions associated with drug-drug interactions using hierarchical classification date: 2021 pages: 41 extension: .pdf txt: ./txt/10_1101-2021_02_10_430512.txt cache: ./cache/10_1101-2021_02_10_430512.pdf Author Administrator Content-Type application/pdf Creation-Date 2021-02-10T16:35:24Z Last-Modified 2021-02-14T21:22:17Z Last-Save-Date 2021-02-14T21:22:17Z X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 138 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true created 2021-02-10T16:35:24Z creator Administrator date 2021-02-14T21:22:17Z dc:creator Administrator dc:format application/pdf; version=1.4 dc:title Prediction of adverse drug reactions associated with drug-drug interactions using hierarchical classification dcterms:created 2021-02-10T16:35:24Z dcterms:modified 2021-02-14T21:22:17Z meta:author Administrator meta:creation-date 2021-02-10T16:35:24Z meta:save-date 2021-02-14T21:22:17Z modified 2021-02-14T21:22:17Z pdf:PDFVersion 1.4 pdf:charsPerPage ['799', '1424', '3057', '1322', '2396', '2821', '838', '2839', '2649', '2927', '3046', '2894', '2854', '3091', '2975', '2901', '1457', '2391', '2551', '2648', '2590', '2505', '2444', '2515', '2623', '2360', '2741', '519', '572', '2642', '2718', '1974', '358', '358', '359', '361', '359', '358', '358', '358', '520'] pdf:docinfo:created 2021-02-10T16:35:24Z pdf:docinfo:creator Administrator 
pdf:docinfo:creator_tool PScript5.dll Version 5.2.2 pdf:docinfo:modified 2021-02-14T21:22:17Z pdf:docinfo:producer Acrobat Distiller 8.1.0 (Windows) pdf:docinfo:title Prediction of adverse drug reactions associated with drug-drug interactions using hierarchical classification pdf:encrypted false pdf:hasMarkedContent false pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['0', '0', '0', '0', '16', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0'] producer Acrobat Distiller 8.1.0 (Windows) resourceName b'10_1101-2021_02_10_430512.pdf' title Prediction of adverse drug reactions associated with drug-drug interactions using hierarchical classification xmp:CreatorTool PScript5.dll Version 5.2.2 xmpMM:DocumentID uuid:85c74254-1dd2-11b2-0a00-a108275dc400 xmpTPg:NPages 41 === file2bib.sh === id: 10_1101-2021_02_08_430343 author: Gibbs, David L title: Patient-specific cell communication networks associate with disease progression in cancer date: 2021 pages: 29 extension: .pdf txt: ./txt/10_1101-2021_02_08_430343.txt cache: ./cache/10_1101-2021_02_08_430343.pdf Appligent AppendPDF Pro 5.5 Linux Kernel 2.6 64bit Oct 2 2014 Library 10.1.0 Author Dave Content-Type application/pdf Creation-Date 2021-02-09T04:01:31Z Last-Modified 2021-02-14T21:22:17Z Last-Save-Date 2021-02-14T21:22:17Z X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 334 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true created 2021-02-09T04:01:31Z creator Dave date 
2021-02-14T21:22:17Z dc:creator Dave dc:format application/pdf; version=1.7 dc:language en-US dc:title 96291204 dcterms:created 2021-02-09T04:01:31Z dcterms:modified 2021-02-14T21:22:17Z language en-US meta:author Dave meta:creation-date 2021-02-09T04:01:31Z meta:save-date 2021-02-14T21:22:17Z modified 2021-02-14T21:22:17Z pdf:PDFVersion 1.7 pdf:charsPerPage ['2561', '4352', '3379', '3527', '3566', '3828', '4402', '4535', '4228', '4333', '4207', '3279', '1677', '2262', '2297', '2203', '3241', '3317', '2921', '2659', '338', '338', '338', '338', '338', '338', '338', '338', '338'] pdf:docinfo:created 2021-02-09T04:01:31Z pdf:docinfo:creator Dave pdf:docinfo:creator_tool Appligent AppendPDF Pro 5.5 pdf:docinfo:custom:Appligent AppendPDF Pro 5.5 Linux Kernel 2.6 64bit Oct 2 2014 Library 10.1.0 pdf:docinfo:modified 2021-02-14T21:22:17Z pdf:docinfo:producer Microsoft® Word for Microsoft 365 pdf:docinfo:title 96291204 pdf:encrypted false pdf:hasMarkedContent true pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0'] producer Microsoft® Word for Microsoft 365 resourceName b'10_1101-2021_02_08_430343.pdf' title 96291204 xmp:CreatorTool Appligent AppendPDF Pro 5.5 xmpMM:DocumentID uuid:02a690eb-b082-11b2-0a00-782dad000000 xmpTPg:NPages 29 === file2bib.sh === id: 10_1101-2021_02_09_430550 author: Song, Dongyuan title: scPNMF: sparse gene encoding of single cells to facilitate gene selection for targeted gene profiling date: 2021 pages: 37 extension: .pdf txt: ./txt/10_1101-2021_02_09_430550.txt cache: ./cache/10_1101-2021_02_09_430550.pdf Author Content-Type application/pdf Creation-Date 2021-02-10T07:08:44Z Keywords Last-Modified 2021-02-14T21:22:17Z Last-Save-Date 2021-02-14T21:22:17Z PTEX.Fullbanner This is pdfTeX, Version 3.14159265-2.6-1.40.21 (TeX Live 2020) kpathsea version 6.3.2 X-Parsed-By 
['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 153 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true cp:subject created 2021-02-10T07:08:44Z creator date 2021-02-14T21:22:17Z dc:creator dc:format application/pdf; version=1.5 dc:subject dc:title scPNMF: sparse gene encoding of single cells to facilitate gene selection for targeted gene profiling dcterms:created 2021-02-10T07:08:44Z dcterms:modified 2021-02-14T21:22:17Z meta:author meta:creation-date 2021-02-10T07:08:44Z meta:keyword meta:save-date 2021-02-14T21:22:17Z modified 2021-02-14T21:22:17Z pdf:PDFVersion 1.5 pdf:charsPerPage ['2602', '3331', '3092', '1969', '2476', '2626', '2136', '2674', '2221', '2436', '1603', '2850', '2182', '2921', '3297', '1944', '2318', '2544', '2453', '2588', '860', '2144', '2326', '2412', '896', '970', '1053', '767', '438', '525', '435', '529', '2318', '2544', '2453', '2588', '860'] pdf:docinfo:created 2021-02-10T07:08:44Z pdf:docinfo:creator pdf:docinfo:creator_tool LaTeX with hyperref pdf:docinfo:custom:PTEX.Fullbanner This is pdfTeX, Version 3.14159265-2.6-1.40.21 (TeX Live 2020) kpathsea version 6.3.2 pdf:docinfo:keywords pdf:docinfo:modified 2021-02-14T21:22:17Z pdf:docinfo:producer pdfTeX-1.40.21 pdf:docinfo:subject pdf:docinfo:title scPNMF: sparse gene encoding of single cells to facilitate gene selection for targeted gene profiling pdf:docinfo:trapped False pdf:encrypted false pdf:hasMarkedContent false pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['0', '0', '0', '69', '6', '8', '4', '0', '0', '6', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '4', '2', '34', 
'0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0'] producer pdfTeX-1.40.21 resourceName b'10_1101-2021_02_09_430550.pdf' subject title scPNMF: sparse gene encoding of single cells to facilitate gene selection for targeted gene profiling trapped False xmp:CreatorTool LaTeX with hyperref xmpMM:DocumentID uuid:85c7129e-1dd2-11b2-0a00-4808278d5b00 xmpTPg:NPages 37 === file2bib.sh === id: 10_1101-2021_02_10_430606 author: Wei, Zheng title: NeuronMotif: Deciphering transcriptional cis-regulatory codes from deep neural networks date: 2021 pages: 31 extension: .pdf txt: ./txt/10_1101-2021_02_10_430606.txt cache: ./cache/10_1101-2021_02_10_430606.pdf Content-Type application/pdf Creation-Date 2021-02-11T07:26:25Z Keywords Last-Modified 2021-02-14T21:22:17Z Last-Save-Date 2021-02-14T21:22:17Z X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 161 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true created 2021-02-11T07:26:25Z date 2021-02-14T21:22:17Z dc:format application/pdf; version=1.4 dc:subject dc:title NeuronMotif: Deciphering transcriptional cis-regulatory codes from deep neural networks dcterms:created 2021-02-11T07:26:25Z dcterms:modified 2021-02-14T21:22:17Z meta:creation-date 2021-02-11T07:26:25Z meta:keyword meta:save-date 2021-02-14T21:22:17Z modified 2021-02-14T21:22:17Z pdf:PDFVersion 1.4 pdf:charsPerPage ['1957', '1927', '1871', '1344', '1768', '265', '1654', '1374', '3237', '1483', '3133', '3345', '3309', '2950', '3308', '3422', '2952', '1097', '2527', '2241', '2894', '2551', '2511', '2949', '3107', '3322', '2890', '1419', '2894', '3279', '1145'] 
pdf:docinfo:created 2021-02-11T07:26:25Z pdf:docinfo:creator_tool Word pdf:docinfo:keywords pdf:docinfo:modified 2021-02-14T21:22:17Z pdf:docinfo:producer macOS 版本 10.14.6(版号 18G8012) Quartz PDFContext pdf:docinfo:title NeuronMotif: Deciphering transcriptional cis-regulatory codes from deep neural networks pdf:encrypted false pdf:hasMarkedContent false pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['0', '32', '4', '0', '12', '0', '0', '0', '0', '0', '1', '20', '0', '0', '0', '0', '0', '0', '55', '132', '27', '119', '88', '8', '0', '0', '5', '0', '0', '0', '0'] producer macOS 版本 10.14.6(版号 18G8012) Quartz PDFContext resourceName b'10_1101-2021_02_10_430606.pdf' subject title NeuronMotif: Deciphering transcriptional cis-regulatory codes from deep neural networks xmp:CreatorTool Word xmpMM:DocumentID uuid:85c819d0-1dd2-11b2-0a00-9708277d8900 xmpTPg:NPages 31 === file2bib.sh === id: 10_1101-2020_10_08_327718 author: Jambor, Helena title: Creating Clear and Informative Image-based Figures for Scientific Publications date: 2021 pages: 36 extension: .pdf txt: ./txt/10_1101-2020_10_08_327718.txt cache: ./cache/10_1101-2020_10_08_327718.pdf Author Tracey Weissgerber Comments Company Content-Type application/pdf Creation-Date 2021-02-11T08:45:34Z Keywords Last-Modified 2021-02-14T21:22:17Z Last-Save-Date 2021-02-14T21:22:17Z SourceModified D:20210211084452 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 166 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true cp:subject created 2021-02-11T08:45:34Z creator Tracey Weissgerber date 2021-02-14T21:22:17Z dc:creator 
Tracey Weissgerber dc:format application/pdf; version=1.6 dc:language EN-US dc:subject dc:title dcterms:created 2021-02-11T08:45:34Z dcterms:modified 2021-02-14T21:22:17Z language EN-US meta:author Tracey Weissgerber meta:creation-date 2021-02-11T08:45:34Z meta:keyword meta:save-date 2021-02-14T21:22:17Z modified 2021-02-14T21:22:17Z pdf:PDFVersion 1.6 pdf:charsPerPage ['3190', '1904', '3819', '3897', '910', '2670', '754', '2640', '1136', '3780', '2209', '2046', '2394', '2062', '1692', '1524', '1280', '2079', '2312', '2250', '2370', '3715', '3836', '2374', '3859', '4070', '3003', '1054', '1685', '1658', '1475', '890', '412', '3866', '3908', '2364'] pdf:docinfo:created 2021-02-11T08:45:34Z pdf:docinfo:creator Tracey Weissgerber pdf:docinfo:creator_tool Acrobat PDFMaker 20 for Word pdf:docinfo:custom:Comments pdf:docinfo:custom:Company pdf:docinfo:custom:SourceModified D:20210211084452 pdf:docinfo:keywords pdf:docinfo:modified 2021-02-14T21:22:17Z pdf:docinfo:producer Adobe PDF Library 20.13.96 pdf:docinfo:subject pdf:docinfo:title pdf:encrypted false pdf:hasMarkedContent true pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0'] producer Adobe PDF Library 20.13.96 resourceName b'10_1101-2020_10_08_327718.pdf' subject title xmp:CreatorTool Acrobat PDFMaker 20 for Word xmpMM:DocumentID uuid:5d3787c4-53fe-422c-847f-b8011720723d xmpTPg:NPages 36 === file2bib.sh === id: 10_1101-698605 author: Sarantopoulou, Dimitra title: Comparative evaluation of full-length isoform quantification from RNA-Seq date: 2021 pages: 37 extension: .pdf txt: ./txt/10_1101-698605.txt cache: ./cache/10_1101-698605.pdf Author Thomas Brooks Content-Type application/pdf Creation-Date 2021-02-11T17:18:36Z Last-Modified 2021-02-14T21:22:17Z Last-Save-Date 2021-02-14T21:22:17Z X-Parsed-By 
['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 315 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true created 2021-02-11T17:18:36Z creator Thomas Brooks date 2021-02-14T21:22:17Z dc:creator Thomas Brooks dc:format application/pdf; version=1.7 dc:title Comparative evaluation of full-length isoform quantification from RNA-Seq dcterms:created 2021-02-11T17:18:36Z dcterms:modified 2021-02-14T21:22:17Z meta:author Thomas Brooks meta:creation-date 2021-02-11T17:18:36Z meta:save-date 2021-02-14T21:22:17Z modified 2021-02-14T21:22:17Z pdf:PDFVersion 1.7 pdf:charsPerPage ['2300', '2678', '3181', '3316', '2068', '3156', '2759', '1252', '2934', '2853', '1939', '1704', '2616', '1349', '1899', '2026', '1841', '2304', '2390', '2985', '1795', '2952', '3087', '3064', '2972', '2970', '2485', '1728', '2104', '2438', '2468', '2365', '613', '1062', '771', '951', '697'] pdf:docinfo:created 2021-02-11T17:18:36Z pdf:docinfo:creator Thomas Brooks pdf:docinfo:creator_tool Microsoft® Word for Microsoft 365 pdf:docinfo:modified 2021-02-14T21:22:17Z pdf:docinfo:producer Microsoft® Word for Microsoft 365 pdf:docinfo:title Comparative evaluation of full-length isoform quantification from RNA-Seq pdf:encrypted false pdf:hasMarkedContent true pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0'] producer Microsoft® Word for Microsoft 365 resourceName b'10_1101-698605.pdf' title Comparative evaluation of full-length 
isoform quantification from RNA-Seq xmp:CreatorTool Microsoft® Word for Microsoft 365 xmpMM:DocumentID uuid:c591608c-c000-4b6e-931d-a5352ac54f59 xmpTPg:NPages 37 === file2bib.sh === id: 10_1101-2021_02_10_430563 author: Bandrowski, Anita title: SPARC Data Structure: Rationale and Design of a FAIR Standard for Biomedical Research Data date: 2021 pages: 16 extension: .pdf txt: ./txt/10_1101-2021_02_10_430563.txt cache: ./cache/10_1101-2021_02_10_430563.pdf Author Calmi2 Content-Type application/pdf Creation-Date 2021-02-10T08:53:11Z Last-Modified 2021-02-14T21:22:18Z Last-Save-Date 2021-02-14T21:22:18Z X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 530 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true created 2021-02-10T08:53:11Z creator Calmi2 date 2021-02-14T21:22:18Z dc:creator Calmi2 dc:format application/pdf; version=1.7 dc:language en-US dc:title SPARC Data Structure: Rationale and Design of a FAIR Standard for Biomedical Research Data dcterms:created 2021-02-10T08:53:11Z dcterms:modified 2021-02-14T21:22:18Z language en-US meta:author Calmi2 meta:creation-date 2021-02-10T08:53:11Z meta:save-date 2021-02-14T21:22:18Z modified 2021-02-14T21:22:18Z pdf:PDFVersion 1.7 pdf:charsPerPage ['2645', '6469', '5580', '4241', '6477', '4383', '4414', '5014', '3391', '4754', '3775', '6024', '3764', '3020', '2517', '3837'] pdf:docinfo:created 2021-02-10T08:53:11Z pdf:docinfo:creator Calmi2 pdf:docinfo:creator_tool Microsoft® Word for Microsoft 365 pdf:docinfo:modified 2021-02-14T21:22:18Z pdf:docinfo:producer Microsoft® Word for Microsoft 365 pdf:docinfo:title SPARC Data 
Structure: Rationale and Design of a FAIR Standard for Biomedical Research Data pdf:encrypted false pdf:hasMarkedContent true pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0'] producer Microsoft® Word for Microsoft 365 resourceName b'10_1101-2021_02_10_430563.pdf' title SPARC Data Structure: Rationale and Design of a FAIR Standard for Biomedical Research Data xmp:CreatorTool Microsoft® Word for Microsoft 365 xmpMM:DocumentID uuid:F3D69740-A838-49AB-AADD-0C81C78493F2 xmpTPg:NPages 16 === file2bib.sh === id: 10_1101-2021_02_01_429246 author: Zheng, Hongyu title: Sequence-specific minimizers via polar sets date: 2021 pages: 24 extension: .pdf txt: ./txt/10_1101-2021_02_01_429246.txt cache: ./cache/10_1101-2021_02_01_429246.pdf Author Content-Type application/pdf Creation-Date 2021-02-10T23:12:39Z Keywords Last-Modified 2021-02-14T21:22:17Z Last-Save-Date 2021-02-14T21:22:17Z PTEX.Fullbanner This is pdfTeX, Version 3.14159265-2.6-1.40.17 (TeX Live 2016) kpathsea version 6.2.2 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 202 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true cp:subject created 2021-02-10T23:12:39Z creator date 2021-02-14T21:22:17Z dc:creator dc:format application/pdf; version=1.5 dc:subject dc:title Sequence-specific minimizers via polar sets dcterms:created 2021-02-10T23:12:39Z dcterms:modified 2021-02-14T21:22:17Z meta:author meta:creation-date 2021-02-10T23:12:39Z meta:keyword meta:save-date 2021-02-14T21:22:17Z modified 2021-02-14T21:22:17Z pdf:PDFVersion 
1.5 pdf:charsPerPage ['3341', '4822', '3505', '2808', '3432', '3527', '3426', '3056', '2923', '3266', '3567', '3791', '3419', '4322', '3187', '3142', '1097', '3740', '2460', '3254', '3286', '1064', '1574', '785'] pdf:docinfo:created 2021-02-10T23:12:39Z pdf:docinfo:creator pdf:docinfo:creator_tool LaTeX with hyperref package pdf:docinfo:custom:PTEX.Fullbanner This is pdfTeX, Version 3.14159265-2.6-1.40.17 (TeX Live 2016) kpathsea version 6.2.2 pdf:docinfo:keywords pdf:docinfo:modified 2021-02-14T21:22:17Z pdf:docinfo:producer pdfTeX-1.40.17 pdf:docinfo:subject pdf:docinfo:title Sequence-specific minimizers via polar sets pdf:docinfo:trapped False pdf:encrypted false pdf:hasMarkedContent false pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['0', '0', '1', '0', '7', '4', '10', '13', '12', '0', '2', '3', '0', '0', '0', '0', '0', '0', '13', '2', '2', '0', '0', '0'] producer pdfTeX-1.40.17 resourceName b'10_1101-2021_02_01_429246.pdf' subject title Sequence-specific minimizers via polar sets trapped False xmp:CreatorTool LaTeX with hyperref package xmpMM:DocumentID uuid:85c73a57-1dd2-11b2-0a00-6f09275d6100 xmpTPg:NPages 24 === file2bib.sh === id: 10_1101-2021_02_08_430280 author: Kasukurthi, Mohan V title: SALTS – SURFR (sncRNA) And LAGOOn (lncRNA) Transcriptomics Suite date: 2021 pages: 23 extension: .pdf txt: ./txt/10_1101-2021_02_08_430280.txt cache: ./cache/10_1101-2021_02_08_430280.pdf Author glen borchert Content-Type application/pdf Creation-Date 2021-02-08T17:59:42Z Last-Modified 2021-02-14T21:22:17Z Last-Save-Date 2021-02-14T21:22:17Z X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 332 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true 
access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true created 2021-02-08T17:59:42Z creator glen borchert date 2021-02-14T21:22:17Z dc:creator glen borchert dc:format application/pdf; version=1.5 dc:language en-US dc:title SALTS – SURFR (sncRNA) And LAGOOn (lncRNA) Transcriptomics Suite dcterms:created 2021-02-08T17:59:42Z dcterms:modified 2021-02-14T21:22:17Z language en-US meta:author glen borchert meta:creation-date 2021-02-08T17:59:42Z meta:save-date 2021-02-14T21:22:17Z modified 2021-02-14T21:22:17Z pdf:PDFVersion 1.5 pdf:charsPerPage ['1572', '3069', '4757', '4428', '2929', '2020', '5103', '3677', '2141', '4289', '4322', '3554', '3166', '6567', '2370', '4458', '4580', '4205', '3674', '4080', '3597', '3637', '1271'] pdf:docinfo:created 2021-02-08T17:59:42Z pdf:docinfo:creator glen borchert pdf:docinfo:creator_tool Microsoft® Word 2016 pdf:docinfo:modified 2021-02-14T21:22:17Z pdf:docinfo:producer Microsoft® Word 2016 pdf:docinfo:title SALTS – SURFR (sncRNA) And LAGOOn (lncRNA) Transcriptomics Suite pdf:encrypted false pdf:hasMarkedContent true pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0'] producer Microsoft® Word 2016 resourceName b'10_1101-2021_02_08_430280.pdf' title SALTS – SURFR (sncRNA) And LAGOOn (lncRNA) Transcriptomics Suite xmp:CreatorTool Microsoft® Word 2016 xmpMM:DocumentID uuid:85c75b5e-1dd2-11b2-0a00-c70827bd7200 xmpTPg:NPages 23 === file2bib.sh === id: 10_1101-2021_02_10_430705 author: Stassen, Shobana V. 
title: VIA: Generalized and scalable trajectory inference in single-cell omics data date: 2021 pages: 24 extension: .pdf txt: ./txt/10_1101-2021_02_10_430705.txt cache: ./cache/10_1101-2021_02_10_430705.pdf Content-Type application/pdf Creation-Date 2021-02-10T05:27:48Z Last-Modified 2021-02-14T18:00:57Z Last-Save-Date 2021-02-14T18:00:57Z X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 391 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true created 2021-02-10T05:27:48Z date 2021-02-14T18:00:57Z dc:format application/pdf; version=1.4 dc:title VIA: Generalized and scalable trajectory inference in single-cell omics data dcterms:created 2021-02-10T05:27:48Z dcterms:modified 2021-02-14T18:00:57Z meta:creation-date 2021-02-10T05:27:48Z meta:save-date 2021-02-14T18:00:57Z modified 2021-02-14T18:00:57Z pdf:PDFVersion 1.4 pdf:charsPerPage ['5373', '2864', '5483', '5526', '5130', '345', '6236', '5474', '2917', '4605', '4025', '3650', '5031', '4563', '5222', '5654', '5634', '4294', '3842', '3498', '4377', '4589', '4431', '1921'] pdf:docinfo:created 2021-02-10T05:27:48Z pdf:docinfo:creator_tool Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.75 Safari/537.36 pdf:docinfo:modified 2021-02-14T18:00:57Z pdf:docinfo:producer Skia/PDF m77 pdf:docinfo:title VIA: Generalized and scalable trajectory inference in single-cell omics data pdf:encrypted false pdf:hasMarkedContent false pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', 
'0', '0', '0'] producer Skia/PDF m77 resourceName b'10_1101-2021_02_10_430705.pdf' title VIA: Generalized and scalable trajectory inference in single-cell omics data xmp:CreatorTool Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.75 Safari/537.36 xmpMM:DocumentID uuid:3dc71ea7-1dd2-11b2-0a00-7a08275dc400 xmpTPg:NPages 24 === file2bib.sh === id: 10_1101-2021_02_09_430363 author: Bayer, Johanna M. M. title: Accommodating site variation in neuroimaging data using hierarchical and Bayesian models date: 2021 pages: 20 extension: .pdf txt: ./txt/10_1101-2021_02_09_430363.txt cache: ./cache/10_1101-2021_02_09_430363.pdf Author Content-Type application/pdf Creation-Date 2021-02-09T06:02:04Z Keywords Last-Modified 2021-02-14T21:22:27Z Last-Save-Date 2021-02-14T21:22:27Z PTEX.Fullbanner This is pdfTeX, Version 3.14159265-2.6-1.40.19 (TeX Live 2018) kpathsea version 6.3.0 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 165 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true cp:subject created 2021-02-09T06:02:04Z creator date 2021-02-14T21:22:27Z dc:creator dc:format application/pdf; version=1.5 dc:subject dc:title Accommodating site variation in neuroimaging data using hierarchical and Bayesian models dcterms:created 2021-02-09T06:02:04Z dcterms:modified 2021-02-14T21:22:27Z meta:author meta:creation-date 2021-02-09T06:02:04Z meta:keyword meta:save-date 2021-02-14T21:22:27Z modified 2021-02-14T21:22:27Z pdf:PDFVersion 1.5 pdf:charsPerPage ['1502', '4844', '5626', '4771', '2956', '3003', '1660', '2526', '3391', '4243', '1765', '3323', '2090', 
'3826', '5707', '3101', '4217', '4665', '4787', '959'] pdf:docinfo:created 2021-02-09T06:02:04Z pdf:docinfo:creator pdf:docinfo:creator_tool LaTeX with hyperref pdf:docinfo:custom:PTEX.Fullbanner This is pdfTeX, Version 3.14159265-2.6-1.40.19 (TeX Live 2018) kpathsea version 6.3.0 pdf:docinfo:keywords pdf:docinfo:modified 2021-02-14T21:22:27Z pdf:docinfo:producer pdfTeX-1.40.19 pdf:docinfo:subject pdf:docinfo:title Accommodating site variation in neuroimaging data using hierarchical and Bayesian models pdf:docinfo:trapped False pdf:encrypted false pdf:hasMarkedContent false pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['0', '0', '0', '0', '0', '0', '0', '10', '9', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0'] producer pdfTeX-1.40.19 resourceName b'10_1101-2021_02_09_430363.pdf' subject title Accommodating site variation in neuroimaging data using hierarchical and Bayesian models trapped False xmp:CreatorTool LaTeX with hyperref xmpMM:DocumentID uuid:85d672f8-1dd2-11b2-0a00-b309275d6100 xmpTPg:NPages 20 === file2bib.sh === id: 10_1101-2021_02_11_430762 author: Schäffer, Alejandro A. 
title: Ribovore: ribosomal RNA sequence analysis for GenBank submissions and database curation date: 2021 pages: 28 extension: .pdf txt: ./txt/10_1101-2021_02_11_430762.txt cache: ./cache/10_1101-2021_02_11_430762.pdf Content-Type application/pdf Creation-Date 2021-02-11T12:21:20Z Last-Modified 2021-02-14T21:22:17Z Last-Save-Date 2021-02-14T21:22:17Z PTEX.Fullbanner This is pdfTeX, Version 3.14159265-2.6-1.40.20 (TeX Live 2019/MacPorts 2019.50896_2) kpathsea version 6.3.1 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 186 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true created 2021-02-11T12:21:20Z date 2021-02-14T21:22:17Z dc:format application/pdf; version=1.5 dc:title Ribovore: ribosomal RNA sequence analysis for GenBank submissions and database curation dcterms:created 2021-02-11T12:21:20Z dcterms:modified 2021-02-14T21:22:17Z meta:creation-date 2021-02-11T12:21:20Z meta:save-date 2021-02-14T21:22:17Z modified 2021-02-14T21:22:17Z pdf:PDFVersion 1.5 pdf:charsPerPage ['3536', '3441', '3231', '3411', '3332', '2676', '3013', '3264', '3585', '3505', '3340', '3147', '3237', '2993', '3597', '3360', '3849', '3022', '3179', '3559', '3323', '3293', '3281', '3253', '3263', '5532', '1767', '692'] pdf:docinfo:created 2021-02-11T12:21:20Z pdf:docinfo:creator_tool TeX pdf:docinfo:custom:PTEX.Fullbanner This is pdfTeX, Version 3.14159265-2.6-1.40.20 (TeX Live 2019/MacPorts 2019.50896_2) kpathsea version 6.3.1 pdf:docinfo:modified 2021-02-14T21:22:17Z pdf:docinfo:producer pdfTeX-1.40.20 pdf:docinfo:title Ribovore: ribosomal RNA sequence analysis for GenBank submissions and 
database curation pdf:docinfo:trapped False pdf:encrypted false pdf:hasMarkedContent false pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['0', '0', '0', '0', '0', '0', '17', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0'] producer pdfTeX-1.40.20 resourceName b'10_1101-2021_02_11_430762.pdf' title Ribovore: ribosomal RNA sequence analysis for GenBank submissions and database curation trapped False xmp:CreatorTool TeX xmpMM:DocumentID uuid:85c6efdd-1dd2-11b2-0a00-d909278d5b00 xmpTPg:NPages 28 === file2bib.sh === id: 10_1101-2020_09_23_308239 author: Schultz, Bruce T title: The COVID-19 PHARMACOME: A method for the rational selection of drug repurposing candidates from multimodal knowledge harmonization date: 2021 pages: 31 extension: .pdf txt: ./txt/10_1101-2020_09_23_308239.txt cache: ./cache/10_1101-2020_09_23_308239.pdf Appligent AppendPDF Pro 5.5 Linux Kernel 2.6 64bit Oct 2 2014 Library 10.1.0 Content-Type application/pdf Creation-Date 2021-02-11T08:22:35Z Last-Modified 2021-02-14T21:22:20Z Last-Save-Date 2021-02-14T21:22:20Z X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 5558 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true created 2021-02-11T08:22:35Z date 2021-02-14T21:22:20Z dc:format application/pdf; version=1.7 dc:language en-US dc:title 2599906 dcterms:created 2021-02-11T08:22:35Z dcterms:modified 2021-02-14T21:22:20Z language en-US meta:creation-date 2021-02-11T08:22:35Z meta:save-date 2021-02-14T21:22:20Z modified 2021-02-14T21:22:20Z pdf:PDFVersion 1.7 pdf:charsPerPage ['2584', 
'1515', '2194', '2218', '491', '2118', '1934', '1624', '2074', '2074', '1754', '1826', '2154', '1228', '1624', '1173', '1263', '2086', '2088', '508', '2314', '1363', '534', '4487', '5088', '348', '344', '347', '349', '344', '7141'] pdf:docinfo:created 2021-02-11T08:22:35Z pdf:docinfo:creator_tool Appligent AppendPDF Pro 5.5 pdf:docinfo:custom:Appligent AppendPDF Pro 5.5 Linux Kernel 2.6 64bit Oct 2 2014 Library 10.1.0 pdf:docinfo:modified 2021-02-14T21:22:20Z pdf:docinfo:producer Microsoft® Word for Microsoft 365 pdf:docinfo:title 2599906 pdf:encrypted false pdf:hasMarkedContent true pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0'] producer Microsoft® Word for Microsoft 365 resourceName b'10_1101-2020_09_23_308239.pdf' title 2599906 xmp:CreatorTool Appligent AppendPDF Pro 5.5 xmpMM:DocumentID uuid:65fc1ff4-b086-11b2-0a00-782dad000000 xmpTPg:NPages 31 === file2bib.sh === id: 10_1101-2020_01_28_923532 author: Ahmadi, Saba title: The Landscape of Precision Cancer Combination Therapy: A Single-Cell Perspective date: 2021 pages: 46 extension: .pdf txt: ./txt/10_1101-2020_01_28_923532.txt cache: ./cache/10_1101-2020_01_28_923532.pdf Author Schaffer, Alejandro (NIH/NLM/NCBI) [E] Content-Type application/pdf Creation-Date 2021-02-12T10:57:20Z Last-Modified 2021-02-14T20:40:41Z Last-Save-Date 2021-02-14T20:40:41Z X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 310 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations 
true created 2021-02-12T10:57:20Z creator Schaffer, Alejandro (NIH/NLM/NCBI) [E] date 2021-02-14T20:40:41Z dc:creator Schaffer, Alejandro (NIH/NLM/NCBI) [E] dc:format application/pdf; version=1.7 dc:language en-US dc:title The Landscape of Precision Cancer Combination Therapy: A Single-Cell Perspective dcterms:created 2021-02-12T10:57:20Z dcterms:modified 2021-02-14T20:40:41Z language en-US meta:author Schaffer, Alejandro (NIH/NLM/NCBI) [E] meta:creation-date 2021-02-12T10:57:20Z meta:save-date 2021-02-14T20:40:41Z modified 2021-02-14T20:40:41Z pdf:PDFVersion 1.7 pdf:charsPerPage ['1785', '909', '2031', '2819', '3037', '2594', '1347', '1985', '2038', '3076', '2477', '1952', '1872', '1658', '1185', '2016', '2878', '1058', '1201', '2930', '2007', '1643', '2801', '3005', '2714', '3060', '2616', '3072', '1293', '2812', '2597', '2058', '1247', '856', '1289', '2511', '2560', '2403', '2051', '1906', '2272', '3658', '3506', '3579', '3509', '1722'] pdf:docinfo:created 2021-02-12T10:57:20Z pdf:docinfo:creator Schaffer, Alejandro (NIH/NLM/NCBI) [E] pdf:docinfo:creator_tool Microsoft® Word for Office 365 pdf:docinfo:modified 2021-02-14T20:40:41Z pdf:docinfo:producer Microsoft® Word for Office 365 pdf:docinfo:title The Landscape of Precision Cancer Combination Therapy: A Single-Cell Perspective pdf:encrypted false pdf:hasMarkedContent true pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0'] producer Microsoft® Word for Office 365 resourceName b'10_1101-2020_01_28_923532.pdf' title The Landscape of Precision Cancer Combination Therapy: A Single-Cell Perspective xmp:CreatorTool Microsoft® Word for Office 365 xmpMM:DocumentID uuid:647F7B6F-D5FD-4AA7-A927-7A5C48053B39 xmpTPg:NPages 46 === file2bib.sh === id: 10_1101-2021_02_09_430460 author: 
Banerjee, Shayantan title: Sequence neighborhoods enable reliable prediction of pathogenic mutations in cancer genomes date: 2021 pages: 39 extension: .pdf txt: ./txt/10_1101-2021_02_09_430460.txt cache: ./cache/10_1101-2021_02_09_430460.pdf Appligent AppendPDF Pro 5.5 Linux Kernel 2.6 64bit Oct 2 2014 Library 10.1.0 Content-Type application/pdf Creation-Date 2021-02-09T18:13:05Z Last-Modified 2021-02-14T21:22:18Z Last-Save-Date 2021-02-14T21:22:18Z X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 501 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true created 2021-02-09T18:13:05Z date 2021-02-14T21:22:18Z dc:format application/pdf; version=1.5 dc:title 99120235 dcterms:created 2021-02-09T18:13:05Z dcterms:modified 2021-02-14T21:22:18Z meta:creation-date 2021-02-09T18:13:05Z meta:save-date 2021-02-14T21:22:18Z modified 2021-02-14T21:22:18Z pdf:PDFVersion 1.5 pdf:charsPerPage ['2933', '3586', '3699', '3184', '3141', '3534', '3190', '3343', '3362', '3202', '3398', '3616', '3061', '3423', '3333', '3280', '3702', '3698', '2901', '3903', '3853', '3930', '3937', '1941', '1445', '809', '880', '1975', '3658', '3874', '1427', '341', '341', '341', '341', '341', '341', '341', '341'] pdf:docinfo:created 2021-02-09T18:13:05Z pdf:docinfo:creator_tool Appligent AppendPDF Pro 5.5 pdf:docinfo:custom:Appligent AppendPDF Pro 5.5 Linux Kernel 2.6 64bit Oct 2 2014 Library 10.1.0 pdf:docinfo:modified 2021-02-14T21:22:18Z pdf:docinfo:producer Skia/PDF m90 pdf:docinfo:title 99120235 pdf:encrypted false pdf:hasMarkedContent false pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage 
['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0'] producer Skia/PDF m90 resourceName b'10_1101-2021_02_09_430460.pdf' title 99120235 xmp:CreatorTool Appligent AppendPDF Pro 5.5 xmpMM:DocumentID uuid:3332bc21-b083-11b2-0a00-782dad000000 xmpTPg:NPages 39 === file2bib.sh === id: 10_1101-2021_02_09_430536 author: Lin, Cui-Xiang title: Genome-wide prediction and integrative functional characterization of Alzheimer’s disease-associated genes date: 2021 pages: 47 extension: .pdf txt: ./txt/10_1101-2021_02_09_430536.txt cache: ./cache/10_1101-2021_02_09_430536.pdf Content-Type application/pdf Creation-Date 2021-02-08T13:33:39Z Last-Modified 2021-02-14T21:22:18Z Last-Save-Date 2021-02-14T21:22:18Z X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 X-TIKA:parse_time_millis 509 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true created 2021-02-08T13:33:39Z date 2021-02-14T21:22:18Z dc:format application/pdf; version=1.4 dc:title Genome-wide prediction and integrative functional characterization of Alzheimer’s disease-associated genes dcterms:created 2021-02-08T13:33:39Z dcterms:modified 2021-02-14T21:22:18Z meta:creation-date 2021-02-08T13:33:39Z meta:save-date 2021-02-14T21:22:18Z modified 2021-02-14T21:22:18Z pdf:PDFVersion 1.4 pdf:charsPerPage ['1664', '2073', '2208', '2343', '2100', '2396', '2229', '2243', '2075', '2132', '2088', '2141', '2348', '2222', '2377', '2256', '2313', '2108', '2317', '2344', '2180', '2187', '2176', '2145', '2140', '2082', '2090', 
'2146', '2191', '2089', '3111', '2933', '2995', '3073', '2378', '1830', '1583', '1697', '2746', '2631', '1348', '1962', '1975', '681', '7158', '1185', '1009'] pdf:docinfo:created 2021-02-08T13:33:39Z pdf:docinfo:creator_tool Word pdf:docinfo:modified 2021-02-14T21:22:18Z pdf:docinfo:producer macOS Version 10.15.7 (Build 19H114) Quartz PDFContext pdf:docinfo:title Genome-wide prediction and integrative functional characterization of Alzheimer’s disease-associated genes pdf:encrypted false pdf:hasMarkedContent false pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '9', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0'] producer macOS Version 10.15.7 (Build 19H114) Quartz PDFContext resourceName b'10_1101-2021_02_09_430536.pdf' title Genome-wide prediction and integrative functional characterization of Alzheimer’s disease-associated genes xmp:CreatorTool Word xmpMM:DocumentID uuid:85c859ad-1dd2-11b2-0a00-5408271d5700 xmpTPg:NPages 47 === file2bib.sh === id: 10_1101-727867 author: Tangherloni, Andrea title: scAEspy: a tool for autoencoder-based analysis of single-cell RNA sequencing data date: 2021 pages: 28 extension: .pdf txt: ./txt/10_1101-727867.txt cache: ./cache/10_1101-727867.pdf Author Andrea Tangherloni, Federico Ricciuti, Daniela Besozzi, Pietro Liò, Ana Cvejic Content-Type application/pdf Creation-Date 2021-02-12T16:25:04Z Keywords Autoencoders, scRNA-Seq, Dimensionality reduction, Clustering, Batch correction, Data integration Last-Modified 2021-02-14T21:11:17Z Last-Save-Date 2021-02-14T21:11:17Z PTEX.Fullbanner This is pdfTeX, Version 3.14159265-2.6-1.40.20 (TeX Live 2019) kpathsea version 6.3.1 X-Parsed-By ['org.apache.tika.parser.DefaultParser', 'org.apache.tika.parser.pdf.PDFParser'] X-TIKA:content_handler ToTextContentHandler X-TIKA:embedded_depth 0 
X-TIKA:parse_time_millis 3963 access_permission:assemble_document true access_permission:can_modify true access_permission:can_print true access_permission:can_print_degraded true access_permission:extract_content true access_permission:extract_for_accessibility true access_permission:fill_in_form true access_permission:modify_annotations true cp:subject created 2021-02-12T16:25:04Z creator Andrea Tangherloni, Federico Ricciuti, Daniela Besozzi, Pietro Liò, Ana Cvejic date 2021-02-14T21:11:17Z dc:creator Andrea Tangherloni, Federico Ricciuti, Daniela Besozzi, Pietro Liò, Ana Cvejic dc:format application/pdf; version=1.5 dc:subject Autoencoders, scRNA-Seq, Dimensionality reduction, Clustering, Batch correction, Data integration dc:title scAEspy: a tool for autoencoder-based analysis of single-cell RNA sequencing data dcterms:created 2021-02-12T16:25:04Z dcterms:modified 2021-02-14T21:11:17Z meta:author Andrea Tangherloni, Federico Ricciuti, Daniela Besozzi, Pietro Liò, Ana Cvejic meta:creation-date 2021-02-12T16:25:04Z meta:keyword Autoencoders, scRNA-Seq, Dimensionality reduction, Clustering, Batch correction, Data integration meta:save-date 2021-02-14T21:11:17Z modified 2021-02-14T21:11:17Z pdf:PDFVersion 1.5 pdf:charsPerPage ['1552', '2911', '3437', '3290', '3172', '2974', '2835', '2923', '3313', '3204', '3158', '3151', '3207', '3309', '3262', '2383', '4689', '5805', '6016', '2696', '1341', '1788', '1428', '1745', '1871', '1746', '1423', '1664'] pdf:docinfo:created 2021-02-12T16:25:04Z pdf:docinfo:creator Andrea Tangherloni, Federico Ricciuti, Daniela Besozzi, Pietro Liò, Ana Cvejic pdf:docinfo:creator_tool LaTeX with hyperref pdf:docinfo:custom:PTEX.Fullbanner This is pdfTeX, Version 3.14159265-2.6-1.40.20 (TeX Live 2019) kpathsea version 6.3.1 pdf:docinfo:keywords Autoencoders, scRNA-Seq, Dimensionality reduction, Clustering, Batch correction, Data integration pdf:docinfo:modified 2021-02-14T21:11:17Z pdf:docinfo:producer pdfTeX-1.40.20 pdf:docinfo:subject 
pdf:docinfo:title scAEspy: a tool for autoencoder-based analysis of single-cell RNA sequencing data pdf:docinfo:trapped False pdf:encrypted false pdf:hasMarkedContent false pdf:hasXFA false pdf:hasXMP true pdf:unmappedUnicodeCharsPerPage ['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '2', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0'] producer pdfTeX-1.40.20 resourceName b'10_1101-727867.pdf' subject title scAEspy: a tool for autoencoder-based analysis of single-cell RNA sequencing data trapped False xmp:CreatorTool LaTeX with hyperref xmpMM:DocumentID uuid:81d7d1e9-1dd2-11b2-0a00-030a276d7200 xmpTPg:NPages 28 10_1101-2021_02_10_430604 txt/../ent/10_1101-2021_02_10_430604.ent 10_1101-2020_05_15_090266 txt/../ent/10_1101-2020_05_15_090266.ent 10_1101-2021_02_11_430806 txt/../ent/10_1101-2021_02_11_430806.ent 10_1101-2021_02_09_430036 txt/../ent/10_1101-2021_02_09_430036.ent 10_1101-2021_02_10_430367 txt/../ent/10_1101-2021_02_10_430367.ent 10_1101-2021_02_08_430070 txt/../ent/10_1101-2021_02_08_430070.ent 10_1101-2021_02_10_430619 txt/../ent/10_1101-2021_02_10_430619.ent 10_1101-2021_02_12_431018 txt/../ent/10_1101-2021_02_12_431018.ent 10_1101-2021_02_08_430275 txt/../ent/10_1101-2021_02_08_430275.ent 10_1101-2021_02_10_430656 txt/../ent/10_1101-2021_02_10_430656.ent 10_1101-2020_12_24_424317 txt/../ent/10_1101-2020_12_24_424317.ent 10_1101-2020_09_23_308239 txt/../ent/10_1101-2020_09_23_308239.ent 10_1101-2021_02_08_430270 txt/../ent/10_1101-2021_02_08_430270.ent 10_1101-2021_02_12_430963 txt/../ent/10_1101-2021_02_12_430963.ent 10_1101-2021_02_09_430405 txt/../ent/10_1101-2021_02_09_430405.ent 10_1101-2021_02_12_430923 txt/../ent/10_1101-2021_02_12_430923.ent 10_1101-2021_02_12_430989 txt/../ent/10_1101-2021_02_12_430989.ent 10_1101-2020_09_23_310276 txt/../ent/10_1101-2020_09_23_310276.ent 10_1101-2020_11_17_386649 txt/../ent/10_1101-2020_11_17_386649.ent 10_1101-2021_02_12_430979 txt/../ent/10_1101-2021_02_12_430979.ent 
10_1101-2021_02_08_428881 txt/../ent/10_1101-2021_02_08_428881.ent 10_1101-2021_02_12_430739 txt/../ent/10_1101-2021_02_12_430739.ent 10_1101-2021_02_11_430695 txt/../ent/10_1101-2021_02_11_430695.ent 10_1101-2020_02_04_934216 txt/../ent/10_1101-2020_02_04_934216.ent 10_1101-2021_02_12_430830 txt/../ent/10_1101-2021_02_12_430830.ent 10_1101-2021_02_10_430623 txt/../ent/10_1101-2021_02_10_430623.ent 10_1101-2021_02_11_430847 txt/../ent/10_1101-2021_02_11_430847.ent 10_1101-2021_02_10_430649 txt/../ent/10_1101-2021_02_10_430649.ent 10_1101-2021_02_11_430789 txt/../ent/10_1101-2021_02_11_430789.ent 10_1101-2021_02_12_430764 txt/../ent/10_1101-2021_02_12_430764.ent 10_1101-2020_09_21_305516 txt/../ent/10_1101-2020_09_21_305516.ent 10_1101-2021_02_10_430563 txt/../ent/10_1101-2021_02_10_430563.ent 10_1101-2021_02_08_430343 txt/../ent/10_1101-2021_02_08_430343.ent 10_1101-2021_02_09_430550 txt/../ent/10_1101-2021_02_09_430550.ent 10_1101-2021_02_09_430363 txt/../ent/10_1101-2021_02_09_430363.ent 10_1101-2020_09_02_279521 txt/../ent/10_1101-2020_09_02_279521.ent 10_1101-2021_02_11_430871 txt/../ent/10_1101-2021_02_11_430871.ent 10_1101-2020_10_08_327718 txt/../ent/10_1101-2020_10_08_327718.ent 10_1101-2021_02_13_429885 txt/../ent/10_1101-2021_02_13_429885.ent 10_1101-698605 txt/../ent/10_1101-698605.ent 10_1101-2021_02_08_430280 txt/../ent/10_1101-2021_02_08_430280.ent 10_1101-2021_02_01_429246 txt/../ent/10_1101-2021_02_01_429246.ent 10_1101-2021_02_10_430705 txt/../ent/10_1101-2021_02_10_430705.ent 10_1101-2021_02_10_430606 txt/../ent/10_1101-2021_02_10_430606.ent 10_1101-2021_02_10_430512 txt/../ent/10_1101-2021_02_10_430512.ent 10_1101-2021_02_11_430762 txt/../ent/10_1101-2021_02_11_430762.ent 10_1101-2021_02_09_430460 txt/../ent/10_1101-2021_02_09_430460.ent 10_1101-727867 txt/../ent/10_1101-727867.ent 10_1101-2020_01_28_923532 txt/../ent/10_1101-2020_01_28_923532.ent 10_1101-2021_02_09_430536 txt/../ent/10_1101-2021_02_09_430536.ent 10_1101-2021_02_10_430604 
txt/../pos/10_1101-2021_02_10_430604.pos 10_1101-2021_02_11_430806 txt/../pos/10_1101-2021_02_11_430806.pos 10_1101-2021_02_12_431018 txt/../pos/10_1101-2021_02_12_431018.pos 10_1101-2020_05_15_090266 txt/../pos/10_1101-2020_05_15_090266.pos 10_1101-2021_02_10_430619 txt/../pos/10_1101-2021_02_10_430619.pos 10_1101-2020_12_24_424317 txt/../pos/10_1101-2020_12_24_424317.pos 10_1101-2021_02_09_430036 txt/../pos/10_1101-2021_02_09_430036.pos 10_1101-2021_02_08_430070 txt/../pos/10_1101-2021_02_08_430070.pos 10_1101-2020_09_23_310276 txt/../pos/10_1101-2020_09_23_310276.pos 10_1101-2021_02_09_430405 txt/../pos/10_1101-2021_02_09_430405.pos 10_1101-2021_02_08_430275 txt/../pos/10_1101-2021_02_08_430275.pos 10_1101-2021_02_12_430963 txt/../pos/10_1101-2021_02_12_430963.pos 10_1101-2021_02_10_430367 txt/../pos/10_1101-2021_02_10_430367.pos 10_1101-2021_02_10_430656 txt/../pos/10_1101-2021_02_10_430656.pos 10_1101-2021_02_08_430270 txt/../pos/10_1101-2021_02_08_430270.pos 10_1101-2021_02_11_430847 txt/../pos/10_1101-2021_02_11_430847.pos 10_1101-2020_09_23_308239 txt/../pos/10_1101-2020_09_23_308239.pos 10_1101-2021_02_12_430739 txt/../pos/10_1101-2021_02_12_430739.pos 10_1101-2021_02_12_430923 txt/../pos/10_1101-2021_02_12_430923.pos 10_1101-2021_02_12_430989 txt/../pos/10_1101-2021_02_12_430989.pos 10_1101-2021_02_08_428881 txt/../pos/10_1101-2021_02_08_428881.pos 10_1101-2021_02_12_430830 txt/../pos/10_1101-2021_02_12_430830.pos 10_1101-2021_02_10_430623 txt/../pos/10_1101-2021_02_10_430623.pos 10_1101-2020_11_17_386649 txt/../pos/10_1101-2020_11_17_386649.pos 10_1101-2021_02_12_430764 txt/../pos/10_1101-2021_02_12_430764.pos 10_1101-2020_02_04_934216 txt/../pos/10_1101-2020_02_04_934216.pos 10_1101-2021_02_11_430695 txt/../pos/10_1101-2021_02_11_430695.pos 10_1101-2021_02_11_430871 txt/../pos/10_1101-2021_02_11_430871.pos 10_1101-2021_02_12_430979 txt/../pos/10_1101-2021_02_12_430979.pos 10_1101-2021_02_13_429885 txt/../pos/10_1101-2021_02_13_429885.pos 
10_1101-2021_02_11_430789 txt/../pos/10_1101-2021_02_11_430789.pos 10_1101-2021_02_10_430563 txt/../pos/10_1101-2021_02_10_430563.pos 10_1101-2021_02_10_430649 txt/../pos/10_1101-2021_02_10_430649.pos 10_1101-2020_09_21_305516 txt/../pos/10_1101-2020_09_21_305516.pos 10_1101-2021_02_09_430550 txt/../pos/10_1101-2021_02_09_430550.pos 10_1101-2021_02_08_430343 txt/../pos/10_1101-2021_02_08_430343.pos 10_1101-698605 txt/../pos/10_1101-698605.pos 10_1101-2021_02_01_429246 txt/../pos/10_1101-2021_02_01_429246.pos 10_1101-2021_02_10_430606 txt/../pos/10_1101-2021_02_10_430606.pos 10_1101-2021_02_08_430280 txt/../pos/10_1101-2021_02_08_430280.pos 10_1101-2020_09_02_279521 txt/../pos/10_1101-2020_09_02_279521.pos 10_1101-2021_02_10_430512 txt/../pos/10_1101-2021_02_10_430512.pos 10_1101-2020_10_08_327718 txt/../pos/10_1101-2020_10_08_327718.pos 10_1101-2021_02_09_430363 txt/../pos/10_1101-2021_02_09_430363.pos 10_1101-2021_02_10_430705 txt/../pos/10_1101-2021_02_10_430705.pos 10_1101-727867 txt/../pos/10_1101-727867.pos 10_1101-2020_01_28_923532 txt/../pos/10_1101-2020_01_28_923532.pos 10_1101-2021_02_09_430460 txt/../pos/10_1101-2021_02_09_430460.pos 10_1101-2021_02_11_430762 txt/../pos/10_1101-2021_02_11_430762.pos 10_1101-2021_02_09_430536 txt/../pos/10_1101-2021_02_09_430536.pos 10_1101-2021_02_10_430604 txt/../wrd/10_1101-2021_02_10_430604.wrd 10_1101-2020_05_15_090266 txt/../wrd/10_1101-2020_05_15_090266.wrd 10_1101-2021_02_09_430036 txt/../wrd/10_1101-2021_02_09_430036.wrd 10_1101-2021_02_12_431018 txt/../wrd/10_1101-2021_02_12_431018.wrd 10_1101-2021_02_11_430806 txt/../wrd/10_1101-2021_02_11_430806.wrd 10_1101-2020_12_24_424317 txt/../wrd/10_1101-2020_12_24_424317.wrd 10_1101-2021_02_08_430070 txt/../wrd/10_1101-2021_02_08_430070.wrd 10_1101-2021_02_10_430367 txt/../wrd/10_1101-2021_02_10_430367.wrd 10_1101-2021_02_10_430619 txt/../wrd/10_1101-2021_02_10_430619.wrd 10_1101-2021_02_11_430847 txt/../wrd/10_1101-2021_02_11_430847.wrd 10_1101-2021_02_10_430656 
txt/../wrd/10_1101-2021_02_10_430656.wrd 10_1101-2020_09_23_310276 txt/../wrd/10_1101-2020_09_23_310276.wrd 10_1101-2021_02_12_430830 txt/../wrd/10_1101-2021_02_12_430830.wrd 10_1101-2021_02_12_430989 txt/../wrd/10_1101-2021_02_12_430989.wrd 10_1101-2021_02_08_430275 txt/../wrd/10_1101-2021_02_08_430275.wrd 10_1101-2021_02_12_430739 txt/../wrd/10_1101-2021_02_12_430739.wrd 10_1101-2021_02_11_430695 txt/../wrd/10_1101-2021_02_11_430695.wrd 10_1101-2021_02_12_430923 txt/../wrd/10_1101-2021_02_12_430923.wrd 10_1101-2021_02_12_430963 txt/../wrd/10_1101-2021_02_12_430963.wrd 10_1101-2021_02_09_430405 txt/../wrd/10_1101-2021_02_09_430405.wrd 10_1101-2021_02_08_430270 txt/../wrd/10_1101-2021_02_08_430270.wrd 10_1101-2020_02_04_934216 txt/../wrd/10_1101-2020_02_04_934216.wrd 10_1101-2021_02_10_430649 txt/../wrd/10_1101-2021_02_10_430649.wrd 10_1101-2021_02_12_430764 txt/../wrd/10_1101-2021_02_12_430764.wrd 10_1101-2020_11_17_386649 txt/../wrd/10_1101-2020_11_17_386649.wrd 10_1101-2021_02_11_430789 txt/../wrd/10_1101-2021_02_11_430789.wrd 10_1101-2020_09_23_308239 txt/../wrd/10_1101-2020_09_23_308239.wrd 10_1101-2021_02_12_430979 txt/../wrd/10_1101-2021_02_12_430979.wrd 10_1101-2021_02_10_430623 txt/../wrd/10_1101-2021_02_10_430623.wrd 10_1101-2021_02_11_430871 txt/../wrd/10_1101-2021_02_11_430871.wrd 10_1101-2021_02_08_428881 txt/../wrd/10_1101-2021_02_08_428881.wrd 10_1101-2021_02_10_430563 txt/../wrd/10_1101-2021_02_10_430563.wrd 10_1101-2021_02_13_429885 txt/../wrd/10_1101-2021_02_13_429885.wrd 10_1101-2021_02_08_430343 txt/../wrd/10_1101-2021_02_08_430343.wrd 10_1101-2021_02_10_430606 txt/../wrd/10_1101-2021_02_10_430606.wrd 10_1101-2020_09_02_279521 txt/../wrd/10_1101-2020_09_02_279521.wrd 10_1101-2021_02_10_430512 txt/../wrd/10_1101-2021_02_10_430512.wrd 10_1101-2020_09_21_305516 txt/../wrd/10_1101-2020_09_21_305516.wrd 10_1101-698605 txt/../wrd/10_1101-698605.wrd 10_1101-2020_10_08_327718 txt/../wrd/10_1101-2020_10_08_327718.wrd 10_1101-2021_02_08_430280 
txt/../wrd/10_1101-2021_02_08_430280.wrd 10_1101-2021_02_10_430705 txt/../wrd/10_1101-2021_02_10_430705.wrd 10_1101-2021_02_09_430363 txt/../wrd/10_1101-2021_02_09_430363.wrd 10_1101-2021_02_09_430550 txt/../wrd/10_1101-2021_02_09_430550.wrd 10_1101-2021_02_01_429246 txt/../wrd/10_1101-2021_02_01_429246.wrd 10_1101-2021_02_11_430762 txt/../wrd/10_1101-2021_02_11_430762.wrd 10_1101-2021_02_09_430460 txt/../wrd/10_1101-2021_02_09_430460.wrd 10_1101-2020_01_28_923532 txt/../wrd/10_1101-2020_01_28_923532.wrd 10_1101-727867 txt/../wrd/10_1101-727867.wrd 10_1101-2021_02_09_430536 txt/../wrd/10_1101-2021_02_09_430536.wrd Done mapping. Reducing neuroscience-from-bioarxiv === reduce.pl bib === id = 10_1101-2020_09_21_305516 author = Nikolic, Ana title = Copy-scAT: Deconvoluting single-cell chromatin accessibility of genetic subclones in cancer date = 2021 pages = 32 extension = .pdf mime = application/pdf words = 10376 sentences = 1280 flesch = 67 summary = Copy-scAT: Deconvoluting single-cell chromatin accessibility of genetic subclones in cancer uses single-cell epigenomic data to infer copy number variants (CNVs) that define cancer cells. We have tested the ability of Copy-scAT to use scATAC data to call CNVs with three different approaches genome sequencing (WGS) data for adult GBM (aGBM) surgical resections (n = 4 samples, 3,647 cells). adult GBM samples identified using both methods, versus total numbers of gains detected by scATAC or Number of chromosome-arm level gains detected in adult GBM samples identified using both methods, (c) Multiple myeloma samples were profiled by both scATAC and the single-cell CNV assay. CNVs are detected in scATAC clusters with Copy-scAT in pediatric GBM samples. 
cache = ./cache/10_1101-2020_09_21_305516.pdf txt = ./txt/10_1101-2020_09_21_305516.txt === reduce.pl bib === id = 10_1101-2021_02_10_430649 author = Wen, Zi-Hang title = Bfimpute: A Bayesian factorization method to recover single-cell RNA sequencing data date = 2021 pages = 11 extension = .pdf mime = application/pdf words = 8418 sentences = 1302 flesch = 71 summary = Bfimpute: A Bayesian factorization method to recover single-cell RNA sequencing data Recovering dropout events in a sparse gene expression matrix for scRNA-seq data is a long-standing matrix completion We introduce Bfimpute, a Bayesian factorization imputation algorithm that reconstructs two latent gene and cell matrices to impute final gene expression matrix within each cell group, with or without the aid of cell type labels or bulk Bfimpute achieves better accuracy than other six publicly notable scRNA-seq imputation methods on simulated Key words: single cell; RNA-seq; imputation; Bayesian factorization impute dropout events by adopting the bulk RNA-seq data imputation of single cell RNA-seq data could be applied by Bfimpute recovers dropout values and improves cell type identification in the simulated data. and the imputed data by Bfimpute, scImpute, and DrImpute for the human embryonic stem cell differentiation study. imputation method scimpute for single-cell rna-seq data. cache = ./cache/10_1101-2021_02_10_430649.pdf txt = ./txt/10_1101-2021_02_10_430649.txt === reduce.pl bib === id = 10_1101-2021_02_12_431018 author = Truong Nguyen, Phuoc title = HaVoC, a bioinformatic pipeline for reference-based consensus assembly and lineage assignment for SARS-CoV-2 sequences. date = 2021 pages = 14 extension = .pdf mime = application/pdf words = 3786 sentences = 502 flesch = 61 summary = HaVoC, a bioinformatic pipeline for reference-based consensus assembly and lineage assignment for SARS-CoV-2 sequences. 
HaVoC, a bioinformatic pipeline for reference-based consensus assembly and lineage 2 Several new variants of SARS-CoV-2 have emerged globally, of which the 18 based assemblies on raw SARS-CoV-2 sequences in addition to identifying lineages to detect 26 variants of concern, we have developed an open source bioinformatic pipeline called HaVoC 27 monitor the spread of SARS-CoV-2 variants of concern during local outbreaks. currently being used in Finland for monitoring the spread of SARS-CoV-2 variants. SARS-CoV2, variant detection, reference assembly, lineage identification, coronavirus, 40 surveillance of virus variants by sequencing the SARS-CoV-2 genomes would provide a fast 80 to query SARS-CoV-2 fastq sequence libraries and assigns lineages to them individually in 92 processing and a reference genome of SARS-CoV-2 in a separate FASTA file. The likelihood of emergence of novel SARS-CoV-2 variants of concern is increased and 209 Emerging SARS-CoV-2 Variants. cache = ./cache/10_1101-2021_02_12_431018.pdf txt = ./txt/10_1101-2021_02_12_431018.txt === reduce.pl bib === id = 10_1101-2021_02_11_430847 author = Pinatti, Lisa M. title = SearcHPV: a novel approach to identify and assemble human papillomavirus-host genomic integration events in cancer date = 2021 pages = 26 extension = .pdf mime = application/pdf words = 6849 sentences = 788 flesch = 57 summary = SearcHPV: a novel approach to identify and assemble human papillomavirus-host genomic integration events in cancer squamous cell carcinomas; however, the impact of HPV integration into the host human genome SearcHPV uncovered HPV integration sites adjacent to known cancer-related detection of HPV-human integration sites from targeted capture DNA sequencing data. developed a novel HPV integration detection tool for targeted capture sequencing data, which we SearcHPV showed a high frequency of HPV16 integration with a total of six events in UM-SCCIn this study, SearcHPV also called HPV integration sites within TP63. 
HPV integration sites have been associated with structural variations in the human genome3, 8, 37, which supports an additional genetic mechanism as to why HPV integration sites Genome-wide analysis of HPV integration in human and their integration sites in host genomes through next generation sequencing data. identify viruses and their integration sites using next-generation sequencing of human cancer cache = ./cache/10_1101-2021_02_11_430847.pdf txt = ./txt/10_1101-2021_02_11_430847.txt === reduce.pl bib === id = 10_1101-2021_02_12_430963 author = Gerber, Stefan title = Streamlining differential exon and 3' UTR usage with diffUTR date = 2021 pages = 17 extension = .pdf mime = application/pdf words = 6710 sentences = 896 flesch = 62 summary = adenylation site databases to enable differential 3' UTR usage analysis. Conclusions: diffUTR enables differential 3' UTR analysis and more generally facilitates DEU9 Popular bin-based DEU methods are provided by the limma [25,24], edgeR [23] and DEXSeq [22]41 Bins are prepared from various types of gene annotations as well as, optionally, additional APA-driven segmentation and extension, then read counts among statistically-significant genes, especially for bins with a higher expression (Figure 3A).78 diffUTR provides three main plot types to explore differential bin usage analyses, each with a88 Plotted are the UTR bins found statistically significant (binand gene-level FDR deuBinPlot (Figure 4B) provides bin-level statistic plots for a given gene, similar to those99 than CDS bins, including counts of 3' UTR when calculating overall gene expression could under-121 diffUTR streamlines DEU analysis and outperforms alternative methods in inferring UTR changes,127 For differential UTR analysis, gene-level results are ob-206 cache = ./cache/10_1101-2021_02_12_430963.pdf txt = ./txt/10_1101-2021_02_12_430963.txt === reduce.pl bib === id = 10_1101-2021_02_11_430871 author = Vadnais, David title = ParticleChromo3D: A Particle Swarm 
Optimization Algorithm for Chromosome and Genome 3D Structure Prediction from Hi-C Data date = 2021 pages = 24 extension = .pdf mime = application/pdf words = 10071 sentences = 1053 flesch = 63 summary = ParticleChromo3D: A Particle Swarm Optimization Algorithm for Chromosome and Genome 3D Structure Prediction from Hi-C Data chromosome and genome structure reconstruction from Hi-C data using Particle Swarm Optimization approach chromosome bin, according to the particle swarm algorithm, and then iterates its position towards a global best This paper presents ParticleChromo3D, a new distance-based algorithm for chromosome 3D structure The structures generated by ParticleChromo3D also shows that the result at swarm size Structures generated by ParticleChromo3D at different swarm size values. obtained by comparing the ParticleChromo3D algorithm's output structure to the simulated dataset's true plot of ParticleChromo3D SCC performance on 500KB GM12878 cell Hi-C data for chromosome 1 to 23. plot of ParticleChromo3D SCC performance on 500KB GM12878 cell Hi-C data for chromosome 1 to 23. chromosome 3D structure reconstruction algorithms on the GM12878 data set at both the 1MB and 500KB chromosome and genome structures reconstructed from Hi-C data. cache = ./cache/10_1101-2021_02_11_430871.pdf txt = ./txt/10_1101-2021_02_11_430871.txt === reduce.pl bib === id = 10_1101-2021_02_13_429885 author = Househam, Jacob title = A fully automated approach for quality control of cancer mutations in the era of high-resolution whole genome sequencing date = 2021 pages = 36 extension = .pdf mime = application/pdf words = 10584 sentences = 1257 flesch = 49 summary = know tumour purity and the ploidy of a CNA segment, then the VAF mutations mapped A fully automated approach for quality control of cancer mutations in the era of high-resolution whole genome sequencing. A fully automated approach for quality control of cancer mutations in the era of high-resolution whole genome sequencing. 
cache = ./cache/10_1101-2021_02_13_429885.pdf txt = ./txt/10_1101-2021_02_13_429885.txt === reduce.pl bib === id = 10_1101-2021_02_12_430830 author = Gergely, Tibély title = Simultaneous estimation of per cell division mutation rate and turnover rate from bulk tumor sequence data date = 2021 pages = 19 extension = .pdf mime = application/pdf words = 8181 sentences = 793 flesch = 68 summary = Simultaneous estimation of per cell division mutation rate and turnover rate from bulk tumor sequence data widely available bulk sequencing data where mutations from individual cells are and genomic mutation rate from bulk sequencing data. based on the maximum likelihood estimation of the parameters of a generative model of tumor growth and mutations. human hepatocellular carcinoma sample reveals an elevated per cell division mutation rate and high cell turnover. 
Due to the limitations of bulk sequencing, which only essays mutation frequencies for a population of cells from each tumor sample and does not The estimation is based on a maximum likelihood fit of the parameters of a birth-death model to the measured mutant and be estimated from readcount data, to separate the effects of the mutation rate We use pre-generated division trees from the ELynx suite at predetermined turnover rate values. Using the turnover rate, we also estimated the number of cell cache = ./cache/10_1101-2021_02_12_430830.pdf txt = ./txt/10_1101-2021_02_12_430830.txt === reduce.pl bib === id = 10_1101-2021_02_12_430739 author = Malekian, Negin title = Mutations in bdcA and valS correlate with quinolone resistance in wastewater Escherichia Coli date = 2021 pages = 13 extension = .pdf mime = application/pdf words = 7516 sentences = 1093 flesch = 70 summary = Mutations in bdcA and valS correlate with quinolone resistance in wastewater Escherichia Coli Here, we systematically screen for candidate quinolone resistance-conferring mutations. coli and performed a genome-wide association study (GWAS) correlating over 200,000 mutations against quinolone resistance phenotypes. significant mutations including one located at the active site of the biofilm dispersal genes bdcA and six silent In summary, we demonstrate that GWAS effectively and comprehensively identifies resistance mutations Keywords: E Coli; Quinolone; Antibiotic Resistance; Genome-Wide Association Study (GWAS) direct route to resistance is mutations in the drug targets gyrA and parC. In summary, we aim to show that a bacterial genomewide association study can effectively and comprehensively identify targets relevant to antibiotic resistance. Based on representative resistance phenotypes, the authors selected 103 isolates for sequencing with Illumina MiSeq, 92 of which are available from coli bdcA may act indirectly on antibiotic resistance. 
cache = ./cache/10_1101-2021_02_12_430739.pdf txt = ./txt/10_1101-2021_02_12_430739.txt === reduce.pl bib === id = 10_1101-2021_02_12_430764 author = Ascensión, Alex M. title = Triku: a feature selection method based on nearest neighbors for single-cell data date = 2021 pages = 18 extension = .pdf mime = application/pdf words = 9518 sentences = 1135 flesch = 64 summary = Triku: a feature selection method based on nearest neighbors for single-cell data Triku is a feature selection method that favours genes defining the main Single-cell RNA sequencing (scRNA-seq) is a powerful technology to study the biological heterogeneity of tissues at the individual cell level, allowing the characterization of new cell populations and cell states–i.e. cell types responding to different scRNA-seq datasets are multidimensional, i.e. the expression profile per cell consists of multiple genes. feature selection method: 1) the ability to recover basic dataset structure (main cell low, meaning that features selected with the different methods yielded clustering solutions that were quite similar to the manually-labeled cell types, although there are We first studied the expression pattern of genes selected by triku and other methods, To evaluate the cluster expression of selected genes in benchmarking datasets, for proteins within the genes selected by different FS methods in the two sets of benchmarking datasets. 
cache = ./cache/10_1101-2021_02_12_430764.pdf txt = ./txt/10_1101-2021_02_12_430764.txt === reduce.pl bib === id = 10_1101-2020_01_28_923532 author = Ahmadi, Saba title = The Landscape of Precision Cancer Combination Therapy: A Single-Cell Perspective date = 2021 pages = 46 extension = .pdf mime = application/pdf words = 16705 sentences = 1572 flesch = 66 summary = We focus our analysis on genes encoding protein targets that encode receptors on the cell all "modular", including one part that specifically targets the tumor cell via one gene/protein and MadHitter and each patient receives an optimal personalized combination of targeted therapies from a prespecified set (pill bottle). Cohort and Individual Target Set Sizes as Functions of Tumor Killing and Given the single-cell tumor data sets and the ILP optimization framework described above, we filtering as this threshold is decreased), decreases the size of the target cell surface receptor gene heterogeneity of the cancer, number of patients within the data set, size of target gene set, lack of used for filtering the gene set to avoid targeting non-cancerous tissues. the genes in the optimal target set, the expression of that gene in that non-tumor cell exceeds the set of genes which is known to be targetable to cell 𝐶. cache = ./cache/10_1101-2020_01_28_923532.pdf txt = ./txt/10_1101-2020_01_28_923532.txt === reduce.pl bib === id = 10_1101-2021_02_11_430762 author = Schäffer, Alejandro A. title = Ribovore: ribosomal RNA sequence analysis for GenBank submissions and database curation date = 2021 pages = 28 extension = .pdf mime = application/pdf words = 16496 sentences = 1489 flesch = 65 summary = Ribovore: ribosomal RNA sequence analysis for GenBank submissions and database curation alignments of SSU, LSU and 5S rRNA from all three domains as well as from organelles, along with secondary structure predictions for selected sequences. 
Ribovore software package for the analysis of SSU rRNA and LSU rRNA sequences 18S SSU rRNA database of 1091 sequences was updated most recently on September 27, 2018 by running version 0.28 of the Ribovore program ribodbmaker on an input set of 579,279 GenBank sequences returned from the eukaryotic SSU rRNA The results of ribotyper and rRNA sensor are combined and each sequence is separated into one of four outcome classes depending on whether it passed or failed each input a set of candidate sequences and a specified rRNA model (e.g. SSU.Bacteria) two blastn databases: one of 1267 bacterial and archaeal 16S SSU rRNA sequences cache = ./cache/10_1101-2021_02_11_430762.pdf txt = ./txt/10_1101-2021_02_11_430762.txt === reduce.pl bib === id = 10_1101-2020_09_23_308239 author = Schultz, Bruce T title = The COVID-19 PHARMACOME: A method for the rational selection of drug repurposing candidates from multimodal knowledge harmonization date = 2021 pages = 31 extension = .pdf mime = application/pdf words = 8797 sentences = 1318 flesch = 57 summary = The COVID-19 PHARMACOME: A method for the rational selection of drug repurposing COVID-19 PHARMACOME, a comprehensive drug-target-mechanism graph generated from a initial version of the COVID-19 PHARMACOME, a comprehensive drug-target-mechanism graph representing COVID-19 pathophysiology mechanisms that includes both drug targets Figure 3: Overlap of compound hits between different drug repurposing screening experiments. space overlap between different COVID-19 drug repurposing screenings. 
The COVID-19 PHARMACOME associates pathways derived from drug repurposing targets Figure 4 shows the distribution of repurposing drugs in the COVID-19 cause-and-effect graph, overlap analysis allows for the identification of repurposing drugs targeting mechanisms that Virus-response mechanisms are targets for repurposing drugs Figure 5: Visualization of drug repurposing candidates (and their targets) used in combination treatment as our own drug repurposing screening results, we were able to identify mechanisms targeted COVID-19 PHARMACOME, we are now able to link repurposing drugs, their targets and the SARS-CoV-2 protein interaction map reveals targets for drug repurposing. cache = ./cache/10_1101-2020_09_23_308239.pdf txt = ./txt/10_1101-2020_09_23_308239.txt === reduce.pl bib === id = 10_1101-2020_09_23_310276 author = Greenfest-Allen, Emily title = NIAGADS Alzheimer's GenomicsDB: A resource for exploring Alzheimer's Disease genetic and genomic knowledge date = 2021 pages = 19 extension = .pdf mime = application/pdf words = 5987 sentences = 592 flesch = 52 summary = The NIAGADS Alzheimer's Genomics Database (GenomicsDB) is an interactive knowledgebase for Alzheimer's disease (AD) genetics that provides access to GWAS summary statistics datasets The website makes available >70 genome-wide summary statistics datasets from GWAS and efficient real-time data analysis and variant or gene report generation. Gene reports provide summaries of co-located ADRD risk-associated variants and have pages linking summary statistics to variant and gene annotations, this resource makes these summary statistics available for browsing (on dataset, gene, and variant reports and as genome NIAGADS GenomicsDB variant reports and a track is available on the genome browser. The NIAGADS GenomicsDB includes allele frequency data from 1000 Genomes (phase 3, version visualizations for summarizing search results and annotations in gene and variant reports. 
compare NIAGADS GWAS summary statistics tracks to each other, against annotated gene or A detailed report is provided for each of the GWAS summary statistics and ADSP meta-analysis cache = ./cache/10_1101-2020_09_23_310276.pdf txt = ./txt/10_1101-2020_09_23_310276.txt === reduce.pl bib === id = 10_1101-2021_02_11_430806 author = Badaczewska-Dawid, Aleksandra title = BIAPSS - BioInformatic Analysis of liquid-liquid Phase-Separating protein Sequences date = 2021 pages = 3 extension = .pdf mime = application/pdf words = 2698 sentences = 301 flesch = 53 summary = BIAPSS BioInformatic Analysis of liquid-liquid Phase-Separating protein Sequences web platform named BIAPSS (BioInformatic Analysis of liquidliquid Phase-Separating protein Sequences) which offers the users interactive data analytic tools for facilitating the discovery of statistically significant sequence signals for proteins with Phase-Separating protein Sequences. The objective of BIAPSS is to enable a rapid and on-the-fly deep statistical analysis of LLPS-driver proteins using the pool of sequences with The comparison to benchmarks of various protein groups enables statistical inference of specific phase-separating affinities. Furthermore, the residue-resolution biophysical regularities inferred from BIAPSS will help not only to accurately identify regions prone to phase separation but also to design sequence modifications targeting various biomedical applications. for comprehensive sequence-based analysis of LLPS proteins. the driving forces for phase separation of prion-like RNA binding proteins. disordered protein regions encode a driving force for liquid-liquid phase separation? of proteins driving liquid-liquid phase separation. 
cache = ./cache/10_1101-2021_02_11_430806.pdf txt = ./txt/10_1101-2021_02_11_430806.txt === reduce.pl bib === id = 10_1101-2021_02_11_430695 author = Gordon-Rodriguez, Elliott title = Learning Sparse Log-Ratios for High-Throughput Sequencing Data date = 2021 pages = 12 extension = .pdf mime = application/pdf words = 7973 sentences = 817 flesch = 60 summary = Log-ratios are an important class of features for analyzing high-throughput sequencing (HTS) metagenomic data for HTS data, and more generally, high-dimensional CoDa. Unlike existing methods, CoDaCoRe is simultaneously scalable, interpretable, sparse, and accurate. unlabelled datasets, {xi}ni=1, as a method for identi CoDaCoRe variable selection for the first (most explanatory) log-ratio on the Crohn disease data (Rivera-Pinto et al., 2018). more generally, in the field of CoDa. Learning Sparse Log-Ratios for High-Throughput Sequencing Data cache = ./cache/10_1101-2021_02_11_430695.pdf txt = ./txt/10_1101-2021_02_11_430695.txt === reduce.pl bib === id = 10_1101-2021_02_12_430979 author = Da Silva, Kévin title = StrainFLAIR: Strain-level profiling of metagenomic samples using variation graphs date = 2021 pages = 20 extension = .pdf mime = application/pdf words = 10624
sentences = 992 flesch = 66 summary = StrainFLAIR: Strain-level profiling of metagenomic samples using variation graphs results show that StrainFLAIR was able to distinguish and estimate the abundances of close strains, as approaches to handle multiple similar genomes as with strains use gene clustering and then select the StrainFLAIR assigns and estimates species and strain abundances of a bacterial metagenomic sample graph, called the "node abundance", is computed, first focusing on unique mapped reads (first step). Strain-level abundances are then obtained by exploiting the specific genes of each reference genome from the reference variation graph thus simulating a new strain to be identified and quantified. strains from a sequenced sample, mapped onto this graph. Reference strains relative abundances expected and computed by StrainFLAIR or cache = ./cache/10_1101-2021_02_12_430979.pdf txt = ./txt/10_1101-2021_02_12_430979.txt === reduce.pl bib === id = 10_1101-2021_02_12_430989 author = Sofer, Tamar title = Benchmarking Association Analyses of Continuous Exposures with RNA-seq in Observational Studies date = 2021 pages = 27 extension = .pdf mime = application/pdf words = 8136 sentences = 626 flesch = 48 summary = Benchmarking Association Analyses of Continuous Exposures with RNA-seq in Observational Studies as well as linear regression-based analyses for studying the association of continuous exposures generation of empirical null distribution of association p-values, and we apply the pipeline to Many studies of phenotypes associated with gene expression from RNA-seq consist of small Residual permutation approach for simulations and for empirical p-value computation covariates, and outcome distributions; and (b) their
relationships, aside from the exposureoutcome association, are the same as in the real data, we used a residual permutation approach. association studies applied to residual permutations were included to compute empirical papproach to study the distribution of p-values under the null of no association between the phenotypes and RNA-seq, and used this approach to further study power, and to compute approaches for transcriptome-wide analysis of RNA-seq in population-based studies, including more comprehensive study of statistical permutation approaches for RNA-seq association cache = ./cache/10_1101-2021_02_12_430989.pdf txt = ./txt/10_1101-2021_02_12_430989.txt === reduce.pl bib === id = 10_1101-2021_02_12_430923 author = Modi, Vivek title = Kincore: a web resource for structural classification of protein kinases and their inhibitors date = 2021 pages = 18 extension = .pdf mime = application/pdf words = 7913 sentences = 666 flesch = 62 summary = Kincore: a web resource for structural classification of protein kinases and their inhibitors result, among the DFGin structures, we distinguished between the catalytically active kinase conformation pages for kinase phylogenetic groups, genes, conformational labels, PDBids, ligands and ligand types. options to download data – database tables as a tab separated files; the kinase structures as PyMOL Kincore provides conformational assignments and ligand type labels to protein kinase structures from Figure 1: Representative protein kinase structure (3ETA_A) displaying the residues used to define inhibitor The distribution of different ligand types across kinase conformations is provided in Table 1. Table 1: Distribution of ligand types across protein kinase conformations (Number of chains). 
including conformational and ligand type labels and C-helix position, kinase family, gene name, Uniprot provides the number of kinase chains in the group across different conformations with their Database table provides the list of all the PDB chains with conformational labels and ligand cache = ./cache/10_1101-2021_02_12_430923.pdf txt = ./txt/10_1101-2021_02_12_430923.txt === reduce.pl bib === id = 10_1101-727867 author = Tangherloni, Andrea title = scAEspy: a tool for autoencoder-based analysis of single-cell RNA sequencing data date = 2021 pages = 28 extension = .pdf mime = application/pdf words = 15281 sentences = 2865 flesch = 72 summary = scAEspy: a tool for autoencoder-based analysis of single-cell RNA sequencing data This computational tool allows for coupling low-dimensional probabilistic representation of gene expression data with the downstream analysis to consider the Finally, the currently available AEs cannot be directly exploited to obtain the latent space or to generate synthetic cells. to show the cells in this embedded space or as a starting point for other dimensionality reduction approaches (e.g., t-SNE and UMAP) as well as downstream analyses Non-linear approaches for dimensionality reduction can be effectively used to capture the non-linearities among the gene interactions that may exist in the highdimensional expression space of scRNA-Seq data [16]. be effectively applied to analyse disparate types of single-cell data from different flexible method developed to cluster single-cell data; (ii) a centroid is calculated batch-effect correction methods for single-cell rna sequencing data. 
Wang, D., Gu, J.: VASC: dimension reduction and visualization of single-cell RNA-seq data by deep cache = ./cache/10_1101-727867.pdf txt = ./txt/10_1101-727867.txt === reduce.pl bib === id = 10_1101-2020_12_24_424317 author = Muazzam, Fariha title = Multi-class Cancer Classification and Biomarker Identification using Deep Learning date = 2021 pages = 12 extension = .pdf mime = application/pdf words = 4252 sentences = 426 flesch = 57 summary = classification, feature extraction and relevant gene identification through deep learning methods for 12 This research picks up from detection of different types of cancer RNA-Seq expressions using deep neural classification of gene expression profiles for different kinds of cancers. Hence, the effectiveness of deep learning models for feature extraction and relevant gene identification is performed revealing substantial results and they produced five high-ranked gene sets and reduced feature This study was aimed at classifying 12 types of cancer and identifying relevant genes and the results show were able to identify cancer-relevant pathways and genes for the sets, that different experiments generated, A deep learning approach for cancer detection and relevant gene Tumor gene expression data classification via sample expansion-based deep learning. Identification of a multi-cancer gene expression Multi-class Cancer Classification and Biomarker Identification using Deep Learning cache = ./cache/10_1101-2020_12_24_424317.pdf txt = ./txt/10_1101-2020_12_24_424317.txt === reduce.pl bib === id = 10_1101-2020_10_08_327718 author = Jambor, Helena title = Creating Clear and Informative Image-based Figures for Scientific Publications date = 2021 pages = 36 extension = .pdf mime = application/pdf words = 12824 sentences = 1189 flesch = 56 summary = journals in three fields; plant sciences, cell biology and physiology (n=580 papers).
figures were uncommon (physiology 16%, cell biology 12%, plant sciences 2%). among papers published in top journals in plant sciences, cell biology and physiology. contained images (plant science: 68%, cell biology: 72%, physiology: 55%). in physiology (49%) and cell biology (55%), and 28% of plant science papers provided and 29% of plant sciences papers contained no scale information on any image. Some publications use insets to show the same image at two different scales (cell Figure 1: Image types and reporting of scale information and insets physiology and plant science papers contained some images that were inaccessible to B: Most papers explain colors in image-based figures, however, explanations are less Figure 4: Using scale bars to annotate image size Creating clear and informative image-based figures for scientific publications. cache = ./cache/10_1101-2020_10_08_327718.pdf txt = ./txt/10_1101-2020_10_08_327718.txt === reduce.pl bib === id = 10_1101-2021_02_09_430550 author = Song, Dongyuan title = scPNMF: sparse gene encoding of single cells to facilitate gene selection for targeted gene profiling date = 2021 pages = 37 extension = .pdf mime = application/pdf words = 13512 sentences = 1548 flesch = 64 summary = (scPNMF) method to select informative genes from scRNA-seq data in an unsupervised way. Therefore, for scRNA-seq data analysis, informative gene selection Besides scRNA-seq data analysis, informative gene selection is also crucial for designing number and a scRNA-seq dataset, scPNMF selects informative genes based on its weight matrix; First, the informative genes selected by scPNMF lead to the most accurate cell clustering.
the informative genes and weight matrix of scPNMF lead to the best cell type prediction accuracy Figure 3: Benchmarking scPNMF against 11 informative gene selection methods on seven scRNA-seq (b) UMAP visualization of cells in the Zheng4 dataset based on 100 informative genes selected by We benchmark scPNMF against the 11 gene selection methods in terms of cell type prediction We propose scPNMF, an unsupervised gene selection and data projection method for scRNA-seq For cell type prediction, we project every targeted gene profiling dataset and its scRNA-seq cache = ./cache/10_1101-2021_02_09_430550.pdf txt = ./txt/10_1101-2021_02_09_430550.txt === reduce.pl bib === id = 10_1101-2021_02_10_430656 author = Zakeri, Mohsen title = A like-for-like comparison of lightweight-mapping pipelines for single-cell RNA-seq data pre-processing date = 2021 pages = 7 extension = .pdf mime = application/pdf words = 6557 sentences = 568 flesch = 64 summary = A like-for-like comparison of lightweight-mapping pipelines for single-cell RNA-seq data pre-processing benchmark comparing the kallisto-bustools pipeline (2) for single-cell demonstrate that, when configured to match the computational complexity of kallisto-bustools as closely as possible, alevin-fry processes Alevin-fry (3) is a new pipeline for single-cell RNA-seq benchmarking STARsolo (9), kallisto-bustools (2) and alevin-fry (3), out new tools like alevin-fry for the pre-processing of single-cell data, (1), we have now created a simple-to-follow tutorial for speedoptimized single-cell pre-processing using alevin-fry (https:// by Booeshaghi and Pachter (1) change when a like-for-like comparison between alevin-fry and kallisto-bustools is carried out, we The time and memory used by the relevant steps of the alevin-fry and kallisto-bustools pipelines for pre-processing the 20 diverse tagged-end single-cell RNA-seq datasets used in (1). 
A comparison of the resulting count matrices obtained from alevin-fry and kallisto-bustools, as run in this manuscript, for the pbmc_10k_v3 dataset. peak memory than alevin-fry, with the kallisto-bustools pipeline using cache = ./cache/10_1101-2021_02_10_430656.pdf txt = ./txt/10_1101-2021_02_10_430656.txt === reduce.pl bib === id = 10_1101-2021_02_10_430619 author = Schutz, Sacha title = Cutevariant: a GUI-based desktop application to explore genetics variations date = 2021 pages = 8 extension = .pdf mime = application/pdf words = 4932 sentences = 632 flesch = 66 summary = Cutevariant: a GUI-based desktop application to explore genetics variations Cutevariant is a user-friendly GUI based desktop application for genomic research designed to search for variations in DNA samples collected in annotated files and encoded in the Variant Calling Format. application imports data into a local relational database wherefrom complex filter-queries can be built either Key words: genomics, DNA variant, desktop application, Domain Specific Language, Graphic User Interface applications import the data from VCF files into an indexed Cutevariant imports data from VCF files into a normalized Fig. 2: The Cutevariant main view showing the variants list sub-window (middle), different controllers sub-windows but not all are Just like Variant Tools, Cutevariant supports operations Features Cutevariant BrowseVCF VCF-Miner VCF-Explorer VCF-Server VCF-Filters GEMINI Variant Tools SnpSift Comparaison of time performance between cutevariant and VCF-miner for importation and query execution. 3. 
Pablo Cingolani, Adrian Platts, Le Lily Wang, Melissa VCF-Miner: GUI-based application for mining variants cache = ./cache/10_1101-2021_02_10_430619.pdf txt = ./txt/10_1101-2021_02_10_430619.txt === reduce.pl bib === id = 10_1101-2021_02_11_430789 author = Tyagin, Ilya title = Accelerating COVID-19 research with graph mining and transformer-based learning date = 2021 pages = 9 extension = .pdf mime = application/pdf words = 9408 sentences = 807 flesch = 58 summary = Accelerating COVID-19 research with graph mining and transformer-based learning develop text mining techniques that can help the science community answer high-priority scientific questions related to COVID-19. is currently customized and available in the open domain to massively process COVID-19 related queries. Both systems are the next generation of the AGATHA knowledge network mining transformer model [37]. (1) Most of the existing HG systems are domain-specific (e.g., genedisease interactions) that is usually expressed in limiting the processed information (e.g., significant filtering vocabulary and papers a trained deep bi-LSTM model for extracting predicates from unstructured text. For instance, the node representing the entity "COVID-19" is connected to every sentence and predicate that The prior AGATHA semantic network only includes UMLS terms that appear in SemMedDB predicates [18] which is a major limitation. obtain embeddings per node in the semantic graph, we train AGATHA system ranking model. cache = ./cache/10_1101-2021_02_11_430789.pdf txt = ./txt/10_1101-2021_02_11_430789.txt === reduce.pl bib === id = 10_1101-2021_02_10_430512 author = Kim, Catherine title = Prediction of adverse drug reactions associated with drug-drug interactions using hierarchical classification date = 2021 pages = 41 extension = .pdf mime = application/pdf words = 11859 sentences = 1137 flesch = 56 summary = into DDIs. 
In this study, a hierarchical machine learning model was created to predict DDIassociated ADRs and pharmacological insight thereof for any drug pair. drugs' chemical structures as inputs to predict their target, enzyme, and transporter (TET) Development of RFCs for Prediction of Target, Enzyme, and Transporter Profiles of Drugs Development of a Model for Prediction of DDI-associated ADRs from TET Profiles of Drugs ADR prediction from Target, Enzyme, and Transporter Profiles of Drug Pairs To predict ADRs of a drug pair from its TET profiles, Random Forest Classifier (RFC), Application of the SVM model for DDI-associated ADRs Involving Three Major Drugs through predicted PRR changes of drug pairs upon removal of each of the targets, enzymes, and changes of drug pairs were predicted by the model upon removal of each of the targets, enzymes, Target, enzyme, and transporter (TET) profiles of atorvastatin and concomitant drugs, cache = ./cache/10_1101-2021_02_10_430512.pdf txt = ./txt/10_1101-2021_02_10_430512.txt === reduce.pl bib === id = 10_1101-2021_02_10_430705 author = Stassen, Shobana V. 
title = VIA: Generalized and scalable trajectory inference in single-cell omics data date = 2021 pages = 24 extension = .pdf mime = application/pdf words = 13590 sentences = 1383 flesch = 53 summary = VIA: Generalized and scalable trajectory inference in single-cell omics data strategy to compute pseudotime, and reconstruct cell lineages based on lazy-teleporting random walks Step 1: Single-cell level graph is clustered such that each node user defined start cell) is first computed by the expected hitting time for a lazy-teleporting random walk along an network topology and single-cell level pseudotime/lineage probability properties onto an embedding using GAMs, as The cell fates and their lineage pathways are then computed by a two-stage probabilistic method, graph-traversal allows it to infer cell fates when the underlying data spans combinations of multifurcating detected cell fates annotated (o) lineage pathway and gene-pseudotime trend shown for the CD41 Megakaryocytic Figure 3 VIA infers trajectories in single-cell multi-omic and image datasets (a) Major lineages of human Single cells are represented by graph nodes that are connected based on cache = ./cache/10_1101-2021_02_10_430705.pdf txt = ./txt/10_1101-2021_02_10_430705.txt === reduce.pl bib === id = 10_1101-2021_02_10_430606 author = Wei, Zheng title = NeuronMotif: Deciphering transcriptional cis-regulatory codes from deep neural networks date = 2021 pages = 31 extension = .pdf mime = application/pdf words = 12013 sentences = 1107 flesch = 63 summary = Each point is a decoupled motif generate by a sample set of sequence. Only the max activation value of the decoupled motifs in Fig. 3b are significantly higher than the decoupled motifs of other neurons in layer 3 of Basset-3 model. discovered (q-value < 0.001) from the neuron in convolutional output layer of Basset, BD-5 and BD-10 model.
c, The number of motif discovered (q-value < 0.01) from the neuron in layer 3 of Basset model using different sub-patterns in the input feature map of the max pooling layer to split the sequences set of which are DNA-sequence based DCNN models with 3 general convolutional layers for stacking sequences of different synonymous motifs with the maximum activation value In summary, we presented NeuronMotif as an effective algorithm to reveal the cisregulatory motif grammar learned by DCNN model that use DNA sequence to annotate sequences indicate more synonymous motif mixture in this DCNN model. cache = ./cache/10_1101-2021_02_10_430606.pdf txt = ./txt/10_1101-2021_02_10_430606.txt === reduce.pl bib === id = 10_1101-698605 author = Sarantopoulou, Dimitra title = Comparative evaluation of full-length isoform quantification from RNA-Seq date = 2021 pages = 37 extension = .pdf mime = application/pdf words = 12853 sentences = 1332 flesch = 55 summary = Comparative evaluation of full-length isoform quantification from RNA-Seq Full-length isoform quantification from RNA-Seq is a key goal in transcriptomics analyses benchmarking, isoform quantification, simulated data, pseudo-alignment, RNA-Seq, short Given the difficulty in full-length isoform quantification, many RNA-Seq studies simply analysis performed on the known true isoform quantifications of the simulated data to the For the simulated data we started with 11 real RNA-Seq samples: six liver and six the isoform expression level using idealized and realistic simulated data, with full and true counts), for the set of expressed isoforms in sample 1 in C) idealized and D) realistic data. Method effect on differential expression analysis, using realistic data. Method effect on differential expression analysis, using realistic data. RSEM is a gene/isoform abundance tool for RNA-Seq data which uses a generative model S1 Fig. Method effect on full-length isoform quantification using simulated data. 
cache = ./cache/10_1101-698605.pdf txt = ./txt/10_1101-698605.txt === reduce.pl bib === id = 10_1101-2021_02_10_430563 author = Bandrowski, Anita title = SPARC Data Structure: Rationale and Design of a FAIR Standard for Biomedical Research Data date = 2021 pages = 16 extension = .pdf mime = application/pdf words = 10901 sentences = 1026 flesch = 48 summary = investigators across the SPARC consortium that provide key details about organ-specific circuitry, including structural (BIDS), the SDS has been designed to capture the large variety of data generated by SPARC investigators who are description of the SPARC curation process and the automated tools for complying with the SDS, including the SDS validator and Software to Organize Data Automatically (SODA) for SPARC. required to organize their data files and metadata organized according to the SPARC Data Structure data according to the SPARC Dataset Structure. is the preferred file format for tabular data in SPARC, the Data files are organized into 3 different top-level folders, The organization structure of the files and folders for a SPARC dataset.
https://github.com/SciCrunch/sparc-curation/releases/tag/dataset-template-1.2.3 investigators include folders that organize data along a from these subjects, data files are organized within fields, the curation team developed a SPARC Dataset files/folders, and share datasets with the SPARC cache = ./cache/10_1101-2021_02_10_430563.pdf txt = ./txt/10_1101-2021_02_10_430563.txt === reduce.pl bib === id = 10_1101-2020_11_17_386649 author = Danciu, Daniel title = Topology-based Sparsification of Graph Annotations date = 2021 pages = 15 extension = .pdf mime = application/pdf words = 8205 sentences = 774 flesch = 67 summary = Experiments on 10,000 RNA-seq datasets show that RowDiff combined with MultiBRWT results in a 30% reduction in annotation footprint over Mantis-MST, the previously known most a binary matrix, where the k-mer set indexes the rows and each annotation label specifies a column. Starting from any vertex in the de Bruijn graph, Algorithm 1 defines a traversal leading to an anchor Each row in a RowDiff-transformed annotation matrix has the same or fewer set bits than A naïve implementation of the RowDiff construction would be to load the matrix A in memory, and gradually replace its rows with their sparsified counterpart, while traversing the graph. We now note that, when querying annotations for paths in the graph, or sets of rows corresponding to vertices We constructed annotated de Bruijn graphs from the RNA-Seq data set in the same We now compare the representation size for RowDiff and other state-of-the-art graph annotation compression methods. cache = ./cache/10_1101-2020_11_17_386649.pdf txt = ./txt/10_1101-2020_11_17_386649.txt === reduce.pl bib === id = 10_1101-2020_05_15_090266 author = Zhang, R.
title = SpacePHARER: Sensitive identification of phages from CRISPR spacers in prokaryotic hosts date = 2021 pages = 6 extension = .pdf mime = application/pdf words = 2191 sentences = 283 flesch = 64 summary = Summary: SpacePHARER (CRISPR Spacer Phage-Host Pair Finder) is a sensitive and fast tool for de novo prediction of phage-host relationships via identifying phage genomes that match CRISPR spacers in genomic or metagenomic data. SpacePHARER gains sensitivity by comparing spacers and phages at the protein level, optimizing its scores for matching SpacePHARER by searching a comprehensive spacer list against all complete phage genomes. methods compare individual CRISPR spacers with phage To increase sensitivity, (1) we compare protein coding sequences because phage genomes are mostly coding, and, (0) Preprocess input: scan the phage genome and CRISPR spacers in six ORFs q of CRISPR spacers extracted from one prokaryotic genome, and each target set T comprises the putative protein sequences t from a single phage. The performance of SpacePHARER was evaluated on the spacer test set against a target database predicted the correct host for more phages than BLASTN BLASTN in detecting phage-host pairs, due to searching cache = ./cache/10_1101-2020_05_15_090266.pdf txt = ./txt/10_1101-2020_05_15_090266.txt === reduce.pl bib === id = 10_1101-2021_02_01_429246 author = Zheng, Hongyu title = Sequence-specific minimizers via polar sets date = 2021 pages = 24 extension = .pdf mime = application/pdf words = 15440 sentences = 1407 flesch = 71 summary = minimizers focus on sampling fewer k-mers on a random sequence and use universal hitting sets (sets suggests, a UHS is a set of k-mers that "hits" every w-long window of every possible sequence (hence the the elements of the polar sets are in the sequence: the higher the energy, the more spread apart the k-mers have densities upper bounded by |U|/σk, because only k-mers from the universal hitting set can be selected. 
Section 2.2 gives a formal definition of the link energy of a polar set and Theorem 1 gives upper and lower bounds using this link energy for the density of a minimizer compatible with a polar set. form a link, which in turn is the number of k-mer pairs in the polar set that are exactly w bases away on S. A context is charged if the minimizer selects a different k-mer in the first window than in the second cache = ./cache/10_1101-2021_02_01_429246.pdf txt = ./txt/10_1101-2021_02_01_429246.txt === reduce.pl bib === id = 10_1101-2021_02_10_430623 author = Aberasturi, Dillon title = “Single-subject studies”-derived analyses unveil altered biomechanisms between very small cohorts: implications for rare diseases date = 2021 pages = 9 extension = .pdf mime = application/pdf words = 9478 sentences = 748 flesch = 58 summary = published S3-type N-of-1-pathways MixEnrich to two paired samples (e.g., diseased vs unaffected tissues) for determining patient-specific enriched genes sets: Odds Ratios (S3-OR) and S3-variance using these models to derive effect sizes and statistical significance in singlesubject studies of transcriptomes, these samples are isogenic or quasi-isogenic, and thus do not necessarily generalize to a group of subjects (cohort-level signal). The novel bioinformatic method identifies meaningful biomechanism differences between very small cohorts by using single-subject-study-derived effect sizes for gene sets. 
(B) For the generalized linear model-based analyses, we applied a different filtering process to the raw data where we eliminated all the transcripts with 0 counts for each subject and then calculated the coefficient 2.3 Description of the Generalized Linear Models and application of Inter-N-of-1 methods for small cohort comparison and their evaluation in the Breast Cancer Data the analysis of subsets of the TCGA Breast Cancer data, genes were declared differentially expressed if their abs(log2FC) > log2(1.2) and their FDR-adjusted p-value < cache = ./cache/10_1101-2021_02_10_430623.pdf txt = ./txt/10_1101-2021_02_10_430623.txt === reduce.pl bib === id = 10_1101-2021_02_09_430405 author = Quazi, Sameer title = In-silico Structural and Molecular Docking-Based Drug Discovery Against Viral Protein (VP35) of Marburg Virus: A potent Agent of MAVD date = 2021 pages = 23 extension = .pdf mime = application/pdf words = 5941 sentences = 1038 flesch = 64 summary = In-silico Structural and Molecular Docking-Based Drug Discovery Against Viral Protein (VP35) of Marburg Virus: A potent Agent of MAVD including structure-based drug-like compounds screening from online databases, molecular The final small molecules of drug-like compounds would have more effective and selected for the molecular docking with FGI-103 antiviral drug-using AutoDock 4.2 software. After that, FGI-103 was set and screen other drug-like compounds from PubChem databases. The finally selected drug-like compounds were docked with the P1 site of VP35 of based on ap1 site for ligand in every dock for VP35 MARV utilizing a grid chart of 50 × 50 × 50 The ADMET properties of finally selected drug-like compounds were checked to utilize 2D molecules structure of selected drug-like compounds (A) represents the 2D The molecule structure of three drug-like compounds is shown in Figure 6. 
"In-Silico Structural and Molecular Docking-Based Drug Discovery "In-Silico Structural and Molecular Docking-Based Drug Discovery cache = ./cache/10_1101-2021_02_09_430405.pdf txt = ./txt/10_1101-2021_02_09_430405.txt === reduce.pl bib === id = 10_1101-2021_02_10_430367 author = Chen, Meili title = Genome Warehouse: A Public Repository Housing Genome-scale Data date = 2021 pages = 18 extension = .pdf mime = application/pdf words = 4875 sentences = 656 flesch = 66 summary = Running title: Chen M et al / Genome Assembly Data Repository 21 Genomics Data Center (NGDC), part of the China National Center for Bioinformation 40 archive high-quality genome sequences and annotations, GWH is equipped with a 46 Collectively, GWH serves as an important resource for genome-scale data 51 https://bigd.big.ac.cn/) [13], the aim of GWH is to accept data submissions worldwide 78 GWH is a centralized resource housing genome-scale data, with the purpose to 105 GWH not only accepts genome assembly associated data through an on-line 111 GWH will assign a unique accession number to the submitted genome assembly upon 149 GWH provides data visualization for both genome 163 Collectively, GWH is a user-friendly portal for genome data submission, release, and 209 Database resources of the National Genomics Data 302 Genome assembly accession number is prefixed with "GWH", followed by four 334 Genome assembly accession number is prefixed with "GWH", followed by four 334 cache = ./cache/10_1101-2021_02_10_430367.pdf txt = ./txt/10_1101-2021_02_10_430367.txt === reduce.pl bib === id = 10_1101-2021_02_09_430536 author = Lin, Cui-Xiang title = Genome-wide prediction and integrative functional characterization of Alzheimer’s disease-associated genes date = 2021 pages = 47 extension = .pdf mime = application/pdf words = 20656 sentences = 4864 flesch = 79 summary = Genome-wide prediction and integrative functional characterization of Alzheimer's disease-associated genes example, a module-trait network 
approach was proposed and applied to identify gene functional enrichment-based approach to identify negative genes that are not likely associated genes through an optimal selection of networks and machine learning FGN, and prediction of AD-associated genes using machine learning models (Fig. 1). addition, we tested their enrichment in three AD-related gene sets associated with The top-ranked genes are enriched in AD-associated functions and phenotypes These results provide additional evidence that our predicted genes are associated with The top-ranked genes are associated with AD based on miRNA-target networks We investigated whether top-ranked genes were functionally related to AD-associated We tested whether the top-ranked k genes were more likely to interact with AD-associated related to AD-associated genes or miRNAs based on miRNA-target interaction networks. cache = ./cache/10_1101-2021_02_09_430536.pdf txt = ./txt/10_1101-2021_02_09_430536.txt === reduce.pl bib === id = 10_1101-2021_02_09_430363 author = Bayer, Johanna M. M. title = Accommodating site variation in neuroimaging data using hierarchical and Bayesian models date = 2021 pages = 20 extension = .pdf mime = application/pdf words = 13439 sentences = 1891 flesch = 70 summary = Accommodating site variation in neuroimaging data using hierarchical and Bayesian models The potential of normative modeling to make individualized predictions has led to structural neuroimaging results that go beyond the case-control approach. in a similar way for multi-site modeling in a pooled neuroimaging data set, which contained 7499 participants that org/abide/) data set to compare a non-linear, Gaussian version of the model, to a linear hierarchical Bayesian version and mathematical description of our approach to include site as predictor in a normative hierarchical Bayesian model.
With the aim to create reliable normative models in multi-site neuroimaging data, we developed and compared two model is also able to capture non-linear effects between age and thickness of the cortical region ("Hierarchical Bayesian Gaussian Process term, which allows to model non-linear association between age and cortical thickness measures. The only models that perform better for most regions than the mean of the training data set are the Hierarchical Bayesian cache = ./cache/10_1101-2021_02_09_430363.pdf txt = ./txt/10_1101-2021_02_09_430363.txt === reduce.pl bib === id = 10_1101-2021_02_09_430460 author = Banerjee, Shayantan title = Sequence neighborhoods enable reliable prediction of pathogenic mutations in cancer genomes date = 2021 pages = 39 extension = .pdf mime = application/pdf words = 15659 sentences = 1731 flesch = 40 summary = experimentally validated cancer mutation data in this study, we explored various string-based evolutionary features resulted in the development of a pan-cancer mutation effect prediction Distinguishing between driver and passenger mutations from sequenced cancer genomes is a Recent studies have identified specific signatures or patterns of mutations in different cancer than passenger mutations and built probabilistic models to identify driver genes that had this study, missense mutations from 58 genes that were pan-cancer-based were combined from We used the same datasets to judge our model's ability to predict rare driver mutations based Driver and Passenger Mutations' Features Used to Train NBDriver are Significantly Although our method's focus was to identify missense driver mutations from sequenced cancer surrounding driver and passenger mutations obtained from sequenced cancer genomes. computational prediction of driver missense mutations," Cancer Res., vol. functionally validated cancer-related missense mutations," Genome Biology, vol. 
Figure 7: Differences in the distribution of features between driver and passenger mutations cache = ./cache/10_1101-2021_02_09_430460.pdf txt = ./txt/10_1101-2021_02_09_430460.txt === reduce.pl bib === id = 10_1101-2021_02_08_430070 author = Zhang, Yao-zhong title = On the application of BERT models for nanopore methylation detection date = 2021 pages = 7 extension = .pdf mime = application/pdf words = 5183 sentences = 586 flesch = 60 summary = On the application of BERT models for nanopore methylation detection with deep learning models, have achieved significant performance improvements on nanopore methylation recurrent patterns of positional-signal-shift in the context window surrounding target 5-methylcytosine that the refined BERT model can achieve competitive or even better results than the state-of-the-art biRNN of datasets from the different research groups, BERT models demonstrate a good generalization Fig. 1: Basic BERT's and refined BERT's model structure used for methylation detection. a refined BERT model to take account of signal-shift patterns in the proposed refined BERT model achieves a competitive or even better result explore applying the BERT model for the nanopore methylation detection 2.2 Applying BERT models for nanopore methylation For the cross-sample evaluation, we train models on one dataset and test a BERT model to pay more attention to center positions. In-sample evaluation of different deep learning models on 5mC datasets. 
cache = ./cache/10_1101-2021_02_08_430070.pdf txt = ./txt/10_1101-2021_02_08_430070.txt === reduce.pl bib === id = 10_1101-2021_02_09_430036 author = Goldsborough, Thibaut title = A comparative study of genomic adaptations to low nitrogen availability in Genlisea aurea date = 2021 pages = 7 extension = .pdf mime = application/pdf words = 3128 sentences = 477 flesch = 70 summary = A comparative study of genomic adaptations to low nitrogen availability in Genlisea aurea is a carnivorous plant that grows on nitrogen-poor waterlogged sandstone aurea's genome, CDS and non-coding DNA 2) Determination of transcriptomic nitrogen content and codon usage bias associated with higher nitrogen content tRNAs (among codons that are coding for the same amino a considerably lower number of nitrogen atoms in its genome than the two other plant species. has higher nitrogen counts per molecular unit in genomic DNA, CDS, Non-Coding DNA, protein, aurea has a higher nitrogen usage in its DNA, RNA and proteins Figure 2: Average number of nitrogen atoms per molecular unit in genomic DNA, CDS, Non-Coding DNA, aurea had lower nitrogen content in tRNA sequences but not in other Figure 3: Bar graph representing the codon usage bias and tRNA nitrogen content in G.
cache = ./cache/10_1101-2021_02_09_430036.pdf txt = ./txt/10_1101-2021_02_09_430036.txt === reduce.pl bib === id = 10_1101-2021_02_08_428881 author = Lu, Yang Young title = ACE: Explaining cluster from an adversarial perspective date = 2021 pages = 12 extension = .pdf mime = application/pdf words = 7909 sentences = 790 flesch = 66 summary = A common workflow in single-cell RNA-seq analysis is to project the data to a latent space, cluster the cells in that space, and identify sets of marker genes that explain the differences among the nonlinear embedding model which maps the gene expression to the low-dimensional representation where the groups A notable feature of ACE's approach is that, by identifying genes jointly, the method moves away from the notion Input: gene expression matrix Deep autoencoder learns low-dimensional representation Embedding clustering Clustering is neuralized and concatenated with the encoder Differentiation analysis by ACE Output: gene relevance ACE takes as input a single-cell gene expression matrix and learns a low-dimensional representation for each Next, a neuralized version of the k-means algorithm is applied to the learned representation to identify cell groups. input gene expression profile that lead the neuralized clustering model to alter the assignment from one group to the other. 
cache = ./cache/10_1101-2021_02_08_428881.pdf txt = ./txt/10_1101-2021_02_08_428881.txt === reduce.pl bib === id = 10_1101-2021_02_08_430343 author = Gibbs, David L title = Patient-specific cell communication networks associate with disease progression in cancer date = 2021 pages = 29 extension = .pdf mime = application/pdf words = 11335 sentences = 1445 flesch = 58 summary = tumor microenvironment, the method identified ligands, receptors and cells meeting certain criteria of 56 9,234 samples in The Cancer Genome Atlas (TCGA), starting from a network of 64 cell types and 1,894 62 Data sources including TCGA and cell-sorted gene expression, bulk tumor expression, cell type scores, 78 ligands and receptors for each of the 64 cell types in xCell, using the source gene expression data. With this procedure, a network scaffold is induced, where cells produce ligands that bind to receptors on 113 (PFI) and tumor stage for each sample, a matrix of patient-specific edge weights was constructed 206 number of high weight edges in each tumor type did not associate with the number of samples, as might 254 in the tumor stage contrast, a majority of ligand-producing cells include GMP cells, Osteoblasts, MSC 283 In the PFI results, Th1 cells appeared in 13 high scoring edges in SKCM, all with 394 cache = ./cache/10_1101-2021_02_08_430343.pdf txt = ./txt/10_1101-2021_02_08_430343.txt === reduce.pl bib === id = 10_1101-2021_02_08_430275 author = Zhang, Jianbo title = Next-generation sequencing-based bulked segregant analysis without sequencing the parental genomes date = 2021 pages = 6 extension = .pdf mime = application/pdf words = 6404 sentences = 694 flesch = 68 summary = Next-generation sequencing-based bulked segregant analysis without sequencing the parental genomes identified using BSA-Seq, a technology in which next-generation sequencing (NGS) is applied to bulked segregant analysis (BSA). 
recently developed the significant structural variant method for BSASeq data analysis that exhibits higher detection power than standard to analyze BSA-Seq data in which genome sequences of one parent served as the reference sequences in genotype calling, and thus We analyzed a public BSA-Seq dataset using our modified method and the standard allele frequency and Gmethod allows the detection of such associations without sequencing the parental genomes, leading to further lower the the BSA-Seq data with the genome sequences of both the parents101 when the parental genome sequences are used to aid BSA-Seq data 193 The allele frequency method: The ΔAF value of each SNP in 267 BSA-Seq data analysis using the genome sequences of both the parents and the bulks. BSA-Seq data analysis using only the bulk genome sequences. cache = ./cache/10_1101-2021_02_08_430275.pdf txt = ./txt/10_1101-2021_02_08_430275.txt === reduce.pl bib === id = 10_1101-2021_02_08_430270 author = Gerard, David title = Scalable Bias-corrected Linkage Disequilibrium Estimation Under Genotype Uncertainty date = 2021 pages = 22 extension = .pdf mime = application/pdf words = 7219 sentences = 1582 flesch = 69 summary = Scalable Bias-corrected Linkage Disequilibrium Estimation Under Genotype Uncertainty Keywords and phrases: attenuation bias, genotype likelihood, linkage disequilibrium, polyploidy, reliability ratio. Let XiA and XiB be the posterior means at loci A and B for individual Equations (5)–(7) take the naive estimators most researchers use in practice (the sample covariance/correlation of posterior means) and inflate these by a multiplicative effect. Gerard and Ferrão, 2019] to obtain the posterior moments for each individual's genotype at each SNP reliability ratios of most SNPs only increase their correlation estimates by less than 10%. 
To evaluate the LD estimates of high reliability ratio SNPs, we calculated the MLEs for ρ2 applied to simple linear regression with an additive effects model (where the SNP effect is proportional to the dosage), result in the standard ordinary least squares estimates when using the extreme reliability ratio of PotVar0080327, the genotype-error adjusted correlation estimate is -1. cache = ./cache/10_1101-2021_02_08_430270.pdf txt = ./txt/10_1101-2021_02_08_430270.txt === reduce.pl bib === id = 10_1101-2021_02_10_430604 author = Youngblut, Nicholas D. title = Struo2: efficient metagenome profiling database construction for ever-expanding microbial genome datasets date = 2021 pages = 4 extension = .pdf mime = application/pdf words = 1409 sentences = 157 flesch = 56 summary = Struo2: efficient metagenome profiling database construction for ever-expanding microbial genome datasets 1 Struo2: efficient metagenome profiling database construction for ever-expanding 10 Mapping metagenome reads to reference databases is the standard approach for 12 reference databases often lack recently generated genomic data such as 15 method for constructing custom databases; however, the pipeline does not scale well with the 17 not allow for efficient database updating as new data are generated. 20 HUMAnN3 databases that can be easily updated with new genomes and/or individual gene Struo2 enables feasible database generation for continually increasing large-scale 25 ● Pre-built databases: http://ftp.tue.mpg.de/ebio/projects/struo2/ 26 ● Utility tools: https://github.com/nick-youngblut/gtdb_to_taxdump 28 Metagenome profiling involves mapping reads to reference sequence databases and is 39 computational resources, which led us to create Struo for straight-forward custom metagenome 54 CPU hours per genome versus ~2.4 for Struo (Figure 1B). 67 taxonomy (available at https://github.com/nick-youngblut/gtdb_to_taxdump ). 
(2020) Struo: a pipeline for building custom databases for cache = ./cache/10_1101-2021_02_10_430604.pdf txt = ./txt/10_1101-2021_02_10_430604.txt === reduce.pl bib === id = 10_1101-2021_02_08_430280 author = Kasukurthi, Mohan V title = SALTS – SURFR (sncRNA) And LAGOOn (lncRNA) Transcriptomics Suite date = 2021 pages = 23 extension = .pdf mime = application/pdf words = 12363 sentences = 1218 flesch = 58 summary = given transcriptome provided as either a raw user-generated RNA-Seq dataset or NCBI SRR file identifier. SURFR identifies all ncRNA fragments (both annotated and novel) and their expressions in up to ten datasets per comprehensively compare all fragment expressions identified in up to 30 individual datasets by entering multiple SURFR session IDs window detailing each fragment identified in the individual, selected small RNA-Seq dataset. of the results page redirects the user to a SURFR window detailing the expressions of all full length sncRNAs in the provided datasets. Fragments" window (Figure 2D) for each fragment identified in the individual, selected small RNA-Seq dataset within its host gene along with the fragment's expression (RPM) in each individual small RNA-Seq dataset, and lncRNAs expressed in a given human transcriptome from either a user-provided RNA-Seq dataset or publically More importantly, however, LAGOOn identified MALAT1 as the most highly expressed lncRNA in MDAMB-231 breast cancer cells (Figure 9). cache = ./cache/10_1101-2021_02_08_430280.pdf txt = ./txt/10_1101-2021_02_08_430280.txt === reduce.pl bib === id = 10_1101-2020_02_04_934216 author = Kirchoff, Kathryn E. 
title = EMBER: Multi-label prediction of kinase-substrate phosphorylation events through deep learning date = 2021 pages = 13 extension = .pdf mime = application/pdf words = 8121 sentences = 726 flesch = 59 summary = EMBER: Multi-label prediction of kinase-substrate phosphorylation events through deep learning task of kinase-motif phosphorylation prediction as a multi-label kinase or substrate, as well as protein scaffolds that facilitate structural orientation and downstream catalysis of the reaction, modify the efficacy of motif phosphorylation. prediction of phosphorylation events), a deep learning approach for predicting multi-label kinase-motif phosphorylation relationships. example, the TLK kinase family only has nine positive labels (verified TLK-motif interactions) and more than 10,000 resulting data set is comprised of 7302 phosphorylatable motifs and their reaction-associated kinase families (Table 1). The final output is a vector, k, of length eight, where each value corresponds to the probability that the motif a was phosphorylated by one of the kinase families indicated in We sought to illuminate the relationship between kinase-family dissimilarity and phosphorylated motif-group dissimilarity described results provide motivation to incorporate both motif dissimilarity and kinase relatedness into the predictive model, as of kinase-motif prediction compared to the single-label approaches. 
cache = ./cache/10_1101-2020_02_04_934216.pdf txt = ./txt/10_1101-2020_02_04_934216.txt === reduce.pl bib === id = 10_1101-2020_09_02_279521 author = Abi Nader, Clément title = Simulating the outcome of amyloid treatments in Alzheimer’s disease from imaging and clinical data date = 2021 pages = 32 extension = .pdf mime = application/pdf words = 12164 sentences = 1038 flesch = 56 summary = Simulating the outcome of amyloid treatments in Alzheimer's disease from imaging and clinical data When applied to multimodal imaging and clinical data from the Alzheimer's Disease Neuroimaging Initiative our * Data used in preparation of this article were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database Keywords : Alzheimer's Disease ; Clinical trials ; Disease progression; Amyloid hypothesis; of large datasets of different data modalities, such as clinical scores, or brain imaging measures to model Alzheimer's disease progression based on specific assumptions on the biochemical combining traditional DPMs with dynamical models of Alzheimer's disease progression. In this work we present a novel computational model of Alzheimer's disease progression to multi-modal imaging and clinical data from the Alzheimer's Disease Neuroimaging To simulate the long-term progression of Alzheimer's disease we first project the AD subjects Figure 3 Model-based progression of Alzheimer's disease. 
clinical data, based on the estimation of latent biomarkers' relationships governing Alzheimer's cache = ./cache/10_1101-2020_09_02_279521.pdf txt = ./txt/10_1101-2020_09_02_279521.txt Building ./etc/reader.txt 10_1101-2021_02_09_430460 10_1101-727867 10_1101-2021_02_13_429885 10_1101-2021_02_09_430363 10_1101-2021_02_13_429885 10_1101-2021_02_09_430460 number of items: 50 sum of words: 466,439 average size in words: 9,328 average readability score: 61 nouns: preprint; data; cell; genes; version; gene; copyright; review; author; holder; funder; peer; license; %; cells; preprintthis; analysis; methods; perpetuity; model; licenseavailable; number; set; sequence; expression; sequences; results; datasets; cancer; values; seq; method; dataset; figure; models; mutations; value; p; information; ad; sample; time; types; structure; sets; approach; k; samples; size; drug verbs: is; are; was; be; has; using; were; posted; certified; made; display; granted; used; based; biorxiv; have; https://www.zotero.org/google-docs/?hsltkm; associated; �; been; selected; set; given; found; compared; shown; obtained; identified; including; use; generated; show; see; performed; identify; shows; provided; do; known; related; developed; described; defined; applied; following; calculated; provide; expected; expressed; proposed adjectives: -; single; different; available; other; high; such; specific; non; same; genome; human; new; more; multiple; similar; large; first; small; many; low; random; clinical; best; individual; significant; average; linear; functional; higher; most; genomic; original; biological; international; polar; wide; relative; possible; subject; additional; computational; total; multi; standard; real; full; top; various; dimensional adverbs: not; also; only; more; then; however; e.g.; well; thus; most; as; here; first; very; respectively; highly; therefore; out; often; finally; further; even; significantly; so; hence; previously; top; still; fully; up; instead; much; currently; 
at; less; specifically; recently; now; randomly; least; directly; similarly; rather; next; already; easily; better; usually; generally; moreover pronouns: we; it; our; their; its; they; i; them; us; one; itself; https://doi.org/10.1101/2020.09.21.305516; you; themselves; your; https://doi.org/10.1101/2020.11.17.386649; https://doi.org/10.1101/2020.01.28.923532; he; 𝒙; ​sample​; u; s; https://doi.org/10.1101/2021.02.10.430705; ∆̂′; ours; m′; https://doi.org/10.1101/2021.02.12.430830; his; 𝑙𝑎; λ; ourselves; n; my; il-; https://doi.org/10.1101/2021.02.08.430280; http://paperpile.com/b/5tes3g/x5omi; 𝜟; 𝑒𝑖; 𝑆∗of; ’s; τ2; α; ʻʻuniprotdom_postmodenzʼ; y∗; yij; yes; whole-644; when398; uw; us- proper nouns: al; february; et; international; j.; m.; .; nc; rna; s.; d.; c.; a.; nd; k; m; r.; c; l.; p.; s; j; b.; e.; g.; figure; fig; t.; a; k.; t; by; supplementary; n; h.; µm; r; seq; n.; e; d; http://creativecommons.org/licenses/by-nc/4.0/; data; f.; alzheimer; b; genome; li; y.; y keywords: february; international; rna; cell; gene; seq; figure; supplementary; mutation; fig; data; alzheimer; set; sequence; sars; pca; motif; method; gwas; covid-19; cancer; δaf; vql; vp35; vcf; vaf; utr; usc; umls; type; tet; target; swarm; surfr; subject; struo; strain; stad; ssu; sparc; snp; skcm; single; siamese; sds; sda; scc; rrna; ribovore; remdesivir one topic; one dimension: 10 file(s): ./cache/10_1101-2020_09_21_305516.pdf titles(s): Copy-scAT: Deconvoluting single-cell chromatin accessibility of genetic subclones in cancer three topics; one dimension: 10; org; 10 file(s): ./cache/10_1101-2021_02_09_430536.pdf, ./cache/10_1101-2021_02_09_430460.pdf, ./cache/10_1101-2021_02_11_430762.pdf titles(s): Genome-wide prediction and integrative functional characterization of Alzheimer’s disease-associated genes | Sequence neighborhoods enable reliable prediction of pathogenic mutations in cancer genomes | Ribovore: ribosomal RNA sequence analysis for GenBank submissions and database curation 
five topics; three dimensions: 10 org doi; 10 org 2021; org https google; http com https; 10 sequences org file(s): ./cache/10_1101-727867.pdf, ./cache/10_1101-2021_02_09_430536.pdf, ./cache/10_1101-2021_02_09_430460.pdf, ./cache/10_1101-2021_02_08_430343.pdf, ./cache/10_1101-2021_02_11_430762.pdf titles(s): scAEspy: a tool for autoencoder-based analysis of single-cell RNA sequencing data | Genome-wide prediction and integrative functional characterization of Alzheimer’s disease-associated genes | Sequence neighborhoods enable reliable prediction of pathogenic mutations in cancer genomes | Patient-specific cell communication networks associate with disease progression in cancer | Ribovore: ribosomal RNA sequence analysis for GenBank submissions and database curation Type: biorxiv title: neuroscience-from-bioarxiv date: 2021-02-14 time: 21:20 username: emorgan patron: Eric Morgan email: emorgan@nd.edu input: eMBISSE4cL.xml ==== make-pages.sh htm files ==== make-pages.sh complex files ==== make-pages.sh named enities ==== making bibliographics id: 10_1101-2021_02_10_430623 author: Aberasturi, Dillon title: “Single-subject studies”-derived analyses unveil altered biomechanisms between very small cohorts: implications for rare diseases date: 2021 words: 9478 sentences: 748 pages: 9 flesch: 58 cache: ./cache/10_1101-2021_02_10_430623.pdf txt: ./txt/10_1101-2021_02_10_430623.txt summary: published S3-type N-of-1-pathways MixEnrich to two paired samples (e.g., diseased vs unaffected tissues) for determining patient-specific enriched genes sets: Odds Ratios (S3-OR) and S3-variance using these models to derive effect sizes and statistical significance in singlesubject studies of transcriptomes, these samples are isogenic or quasi-isogenic, and thus do not necessarily generalize to a group of subjects (cohort-level signal). 
The novel bioinformatic method identifies meaningful biomechanism differences between very small cohorts by using single-subject-study-derived effect sizes for gene sets. (B) For the generalized linear model-based analyses, we applied a different filtering process to the raw data where we eliminated all the transcripts with 0 counts for each subject and then calculated the coefficient 2.3 Description of the Generalized Linear Models and application of Inter-N-of-1 methods for small cohort comparison and their evaluation in the Breast Cancer Data the analysis of subsets of the TCGA Breast Cancer data, genes were declared differentially expressed if their abs(log2FC) > log2(1.2) and their FDR-adjusted p-value < id: 10_1101-2020_09_02_279521 author: Abi Nader, Clément title: Simulating the outcome of amyloid treatments in Alzheimer’s disease from imaging and clinical data date: 2021 words: 12164 sentences: 1038 pages: 32 flesch: 56 cache: ./cache/10_1101-2020_09_02_279521.pdf txt: ./txt/10_1101-2020_09_02_279521.txt summary: Simulating the outcome of amyloid treatments in Alzheimer''s disease from imaging and clinical data When applied to multimodal imaging and clinical data from the Alzheimer''s Disease Neuroimaging Initiative our * Data used in preparation of this article were obtained from the Alzheimer''s Disease Neuroimaging Initiative (ADNI) database Keywords : Alzheimer''s Disease ; Clinical trials ; Disease progression; Amyloid hypothesis; of large datasets of different data modalities, such as clinical scores, or brain imaging measures to model Alzheimer''s disease progression based on specific assumptions on the biochemical combining traditional DPMs with dynamical models of Alzheimer''s disease progression. 
In this work we present a novel computational model of Alzheimer's disease progression to multi-modal imaging and clinical data from the Alzheimer's Disease Neuroimaging To simulate the long-term progression of Alzheimer's disease we first project the AD subjects Figure 3 Model-based progression of Alzheimer's disease. clinical data, based on the estimation of latent biomarkers' relationships governing Alzheimer's
id: 10_1101-2020_01_28_923532 author: Ahmadi, Saba title: The Landscape of Precision Cancer Combination Therapy: A Single-Cell Perspective date: 2021 words: 16705 sentences: 1572 pages: 46 flesch: 66 cache: ./cache/10_1101-2020_01_28_923532.pdf txt: ./txt/10_1101-2020_01_28_923532.txt summary: We focus our analysis on genes encoding protein targets that encode receptors on the cell all "modular", including one part that specifically targets the tumor cell via one gene/protein and MadHitter and each patient receives an optimal personalized combination of targeted therapies from a prespecified set (pill bottle). Cohort and Individual Target Set Sizes as Functions of Tumor Killing and Given the single-cell tumor data sets and the ILP optimization framework described above, we filtering as this threshold is decreased), decreases the size of the target cell surface receptor gene heterogeneity of the cancer, number of patients within the data set, size of target gene set, lack of used for filtering the gene set to avoid targeting non-cancerous tissues. the genes in the optimal target set, the expression of that gene in that non-tumor cell exceeds the set of genes which is known to be targetable to cell 𝐶.
id: 10_1101-2021_02_12_430764 author: Ascensión, Alex M.
title: Triku: a feature selection method based on nearest neighbors for single-cell data date: 2021 words: 9518 sentences: 1135 pages: 18 flesch: 64 cache: ./cache/10_1101-2021_02_12_430764.pdf txt: ./txt/10_1101-2021_02_12_430764.txt summary: Triku: a feature selection method based on nearest neighbors for single-cell data Triku is a feature selection method that favours genes defining the main Single-cell RNA sequencing (scRNA-seq) is a powerful technology to study the biological heterogeneity of tissues at the individual cell level, allowing the characterization of new cell populations and cell states–i.e. cell types responding to different scRNA-seq datasets are multidimensional, i.e. the expression profile per cell consists of multiple genes. feature selection method: 1) the ability to recover basic dataset structure (main cell low, meaning that features selected with the different methods yielded clustering solutions that were quite similar to the manually-labeled cell types, although there are We first studied the expression pattern of genes selected by triku and other methods, To evaluate the cluster expression of selected genes in benchmarking datasets, for proteins within the genes selected by different FS methods in the two sets of benchmarking datasets. id: 10_1101-2021_02_11_430806 author: Badaczewska-Dawid, Aleksandra title: BIAPSS - BioInformatic Analysis of liquid-liquid Phase-Separating protein Sequences date: 2021 words: 2698 sentences: 301 pages: 3 flesch: 53 cache: ./cache/10_1101-2021_02_11_430806.pdf txt: ./txt/10_1101-2021_02_11_430806.txt summary: BIAPSS BioInformatic Analysis of liquid-liquid Phase-Separating protein Sequences web platform named BIAPSS (BioInformatic Analysis of liquidliquid Phase-Separating protein Sequences) which offers the users interactive data analytic tools for facilitating the discovery of statistically significant sequence signals for proteins with Phase-Separating protein Sequences. 
The objective of BIAPSS is to enable a rapid and on-the-fly deep statistical analysis of LLPS-driver proteins using the pool of sequences with The comparison to benchmarks of various protein groups enables statistical inference of specific phase-separating affinities. Furthermore, the residue-resolution biophysical regularities inferred from BIAPSS will help not only to accurately identify regions prone to phase separation but also to design sequence modifications targeting various biomedical applications. for comprehensive sequence-based analysis of LLPS proteins. the driving forces for phase separation of prion-like RNA binding proteins. disordered protein regions encode a driving force for liquid-liquid phase separation? of proteins driving liquid-liquid phase separation. id: 10_1101-2021_02_10_430563 author: Bandrowski, Anita title: SPARC Data Structure: Rationale and Design of a FAIR Standard for Biomedical Research Data date: 2021 words: 10901 sentences: 1026 pages: 16 flesch: 48 cache: ./cache/10_1101-2021_02_10_430563.pdf txt: ./txt/10_1101-2021_02_10_430563.txt summary: investigators across the SPARC consortium that provide key details about organ-specific circuitry, including structural (BIDS), the SDS has been designed to capture the large variety of data generated by SPARC investigators who are description of the SPARC curation process and the automated tools for complying with the SDS, including the SDS validator and Software to Organize Data Automatically (SODA) for SPARC. required to organize their data files and metadata organized according to the SPARC Data Structure data according to the SPARC Dataset Structure. is the preferred file format for tabular data in SPARC, the Data files are organized into 3 different top-level folders, The organization structure of the files and folders for a SPARC dataset. 
https://github.com/SciCrunch/sparc-curation/releases/tag/dataset-template-1.2.3 investigators include folders that organize data along a from these subjects, data files are organized within fields, the curation team developed a SPARC Dataset files/folders, and share datasets with the SPARC
id: 10_1101-2021_02_09_430460 author: Banerjee, Shayantan title: Sequence neighborhoods enable reliable prediction of pathogenic mutations in cancer genomes date: 2021 words: 15659 sentences: 1731 pages: 39 flesch: 40 cache: ./cache/10_1101-2021_02_09_430460.pdf txt: ./txt/10_1101-2021_02_09_430460.txt summary: experimentally validated cancer mutation data in this study, we explored various string-based evolutionary features resulted in the development of a pan-cancer mutation effect prediction Distinguishing between driver and passenger mutations from sequenced cancer genomes is a Recent studies have identified specific signatures or patterns of mutations in different cancer than passenger mutations and built probabilistic models to identify driver genes that had this study, missense mutations from 58 genes that were pan-cancer-based were combined from We used the same datasets to judge our model's ability to predict rare driver mutations based Driver and Passenger Mutations' Features Used to Train NBDriver are Significantly Although our method's focus was to identify missense driver mutations from sequenced cancer surrounding driver and passenger mutations obtained from sequenced cancer genomes. computational prediction of driver missense mutations," Cancer Res., vol. functionally validated cancer-related missense mutations," Genome Biology, vol. Figure 7: Differences in the distribution of features between driver and passenger mutations
id: 10_1101-2021_02_09_430363 author: Bayer, Johanna M. M.
title: Accommodating site variation in neuroimaging data using hierarchical and Bayesian models date: 2021 words: 13439 sentences: 1891 pages: 20 flesch: 70 cache: ./cache/10_1101-2021_02_09_430363.pdf txt: ./txt/10_1101-2021_02_09_430363.txt summary: Accommodating site variation in neuroimaging data using hierarchical and Bayesian models The potential of normative modeling to make individualized predictions has led to structural neuroimaging results that go beyond the case-control approach. in a similar way for multi-site modeling in a pooled neuroimaging data set, which contained 7499 participants that org/abide/) data set to compare a non-linear, Gaussian version of the model, to a linear hierarchical Bayesian version and mathematical description of our approach to include site as predictor in a normative hierarchical Bayesian model. With the aim to create reliable normative models in multi-site neuroimaging data, we developed and compared two model is also able to capture non-linear effects between age and thickness of the cortical region ("Hierarchical Bayesian Gaussian Process term, which allows to model non-linear association between age and cortical thickness measures. 
The only models that perform better for most regions than the mean of the training data set are the Hierarchical Bayesian
id: 10_1101-2021_02_10_430367 author: Chen, Meili title: Genome Warehouse: A Public Repository Housing Genome-scale Data date: 2021 words: 4875 sentences: 656 pages: 18 flesch: 66 cache: ./cache/10_1101-2021_02_10_430367.pdf txt: ./txt/10_1101-2021_02_10_430367.txt summary: Running title: Chen M et al / Genome Assembly Data Repository Genomics Data Center (NGDC), part of the China National Center for Bioinformation archive high-quality genome sequences and annotations, GWH is equipped with a Collectively, GWH serves as an important resource for genome-scale data https://bigd.big.ac.cn/) [13], the aim of GWH is to accept data submissions worldwide GWH is a centralized resource housing genome-scale data, with the purpose to GWH not only accepts genome assembly associated data through an on-line GWH will assign a unique accession number to the submitted genome assembly upon GWH provides data visualization for both genome Collectively, GWH is a user-friendly portal for genome data submission, release, and Database resources of the National Genomics Data Genome assembly accession number is prefixed with "GWH", followed by four
id: 10_1101-2021_02_12_430979 author: Da Silva, Kévin title: StrainFLAIR: Strain-level profiling of metagenomic samples using variation graphs date: 2021 words: 10624 sentences: 992 pages: 20 flesch: 66 cache: ./cache/10_1101-2021_02_12_430979.pdf txt: ./txt/10_1101-2021_02_12_430979.txt summary: StrainFLAIR: Strain-level profiling of metagenomic samples using variation graphs results show that StrainFLAIR was able to distinguish and estimate the abundances of close strains, as approaches to handle multiple similar genomes as with strains use gene clustering and then select the StrainFLAIR assigns and
estimates species and strain abundances of a bacterial metagenomic sample graph, called the "node abundance", is computed, first focusing on unique mapped reads (first step). Strain-level abundances are then obtained by exploiting the specific genes of each reference genome from the reference variation graph thus simulating a new strain to be identified and quantified. strains from a sequenced sample, mapped onto this graph. Reference strains relative abundances expected and computed by StrainFLAIR or
id: 10_1101-2020_11_17_386649 author: Danciu, Daniel title: Topology-based Sparsification of Graph Annotations date: 2021 words: 8205 sentences: 774 pages: 15 flesch: 67 cache: ./cache/10_1101-2020_11_17_386649.pdf txt: ./txt/10_1101-2020_11_17_386649.txt summary: Experiments on 10,000 RNA-seq datasets show that RowDiff combined with MultiBRWT results in a 30% reduction in annotation footprint over Mantis-MST, the previously known most a binary matrix, where the k-mer set indexes the rows and each annotation label specifies a column. Starting from any vertex in the de Bruijn graph, Algorithm 1 defines a traversal leading to an anchor Each row in a RowDiff-transformed annotation matrix has the same or fewer set bits than A naïve implementation of the RowDiff construction would be to load the matrix A in memory, and gradually replace its rows with their sparsified counterpart, while traversing the graph. We now note that, when querying annotations for paths in the graph, or sets of rows corresponding to vertices We constructed annotated de Bruijn graphs from the RNA-Seq data set in the same We now compare the representation size for RowDiff and other state-of-the-art graph annotation compression methods.
id: 10_1101-2021_02_08_430270 author: Gerard, David title: Scalable Bias-corrected Linkage Disequilibrium Estimation Under Genotype Uncertainty date: 2021 words: 7219 sentences: 1582 pages: 22 flesch: 69 cache: ./cache/10_1101-2021_02_08_430270.pdf txt: ./txt/10_1101-2021_02_08_430270.txt summary: Scalable Bias-corrected Linkage Disequilibrium Estimation Under Genotype Uncertainty Keywords and phrases: attenuation bias, genotype likelihood, linkage disequilibrium, polyploidy, reliability ratio. Let XiA and XiB be the posterior means at loci A and B for individual Equations (5)–(7) take the naive estimators most researchers use in practice (the sample covariance/correlation of posterior means) and inflate these by a multiplicative effect. Gerard and Ferrão, 2019] to obtain the posterior moments for each individual''s genotype at each SNP reliability ratios of most SNPs only increase their correlation estimates by less than 10%. To evaluate the LD estimates of high reliability ratio SNPs, we calculated the MLEs for ρ2 applied to simple linear regression with an additive effects model (where the SNP effect is proportional to the dosage), result in the standard ordinary least squares estimates when using the extreme reliability ratio of PotVar0080327, the genotype-error adjusted correlation estimate is -1. id: 10_1101-2021_02_12_430963 author: Gerber, Stefan title: Streamlining differential exon and 3'' UTR usage with diffUTR date: 2021 words: 6710 sentences: 896 pages: 17 flesch: 62 cache: ./cache/10_1101-2021_02_12_430963.pdf txt: ./txt/10_1101-2021_02_12_430963.txt summary: adenylation site databases to enable differential 3'' UTR usage analysis. 
Conclusions: diffUTR enables differential 3' UTR analysis and more generally facilitates DEU Popular bin-based DEU methods are provided by the limma [25,24], edgeR [23] and DEXSeq [22] Bins are prepared from various types of gene annotations as well as, optionally, additional APA-driven segmentation and extension, then read counts among statistically-significant genes, especially for bins with a higher expression (Figure 3A). diffUTR provides three main plot types to explore differential bin usage analyses, each with a Plotted are the UTR bins found statistically significant (bin- and gene-level FDR deuBinPlot (Figure 4B) provides bin-level statistic plots for a given gene, similar to those than CDS bins, including counts of 3' UTR when calculating overall gene expression could under- diffUTR streamlines DEU analysis and outperforms alternative methods in inferring UTR changes, For differential UTR analysis, gene-level results are ob-
id: 10_1101-2021_02_12_430830 author: Gergely, Tibély title: Simultaneous estimation of per cell division mutation rate and turnover rate from bulk tumor sequence data date: 2021 words: 8181 sentences: 793 pages: 19 flesch: 68 cache: ./cache/10_1101-2021_02_12_430830.pdf txt: ./txt/10_1101-2021_02_12_430830.txt summary: Simultaneous estimation of per cell division mutation rate and turnover rate from bulk tumor sequence data widely available bulk sequencing data where mutations from individual cells are and genomic mutation rate from bulk sequencing data. based on the maximum likelihood estimation of the parameters of a generative model of tumor growth and mutations. human hepatocellular carcinoma sample reveals an elevated per cell division mutation rate and high cell turnover.
Due to the limitations of bulk sequencing, which only essays mutation frequencies for a population of cells from each tumor sample and does not The estimation is based on a maximum likelihood fit of the parameters of a birth-death model to the measured mutant and be estimated from readcount data, to separate the effects of the mutation rate We use pre-generated division trees from the ELynx suite at predetermined turnover rate values. Using the turnover rate, we also estimated the number of cell id: 10_1101-2021_02_08_430343 author: Gibbs, David L title: Patient-specific cell communication networks associate with disease progression in cancer date: 2021 words: 11335 sentences: 1445 pages: 29 flesch: 58 cache: ./cache/10_1101-2021_02_08_430343.pdf txt: ./txt/10_1101-2021_02_08_430343.txt summary: tumor microenvironment, the method identified ligands, receptors and cells meeting certain criteria of 56 9,234 samples in The Cancer Genome Atlas (TCGA), starting from a network of 64 cell types and 1,894 62 Data sources including TCGA and cell-sorted gene expression, bulk tumor expression, cell type scores, 78 ligands and receptors for each of the 64 cell types in xCell, using the source gene expression data. 
With this procedure, a network scaffold is induced, where cells produce ligands that bind to receptors on (PFI) and tumor stage for each sample, a matrix of patient-specific edge weights was constructed number of high weight edges in each tumor type did not associate with the number of samples, as might in the tumor stage contrast, a majority of ligand-producing cells include GMP cells, Osteoblasts, MSC In the PFI results, Th1 cells appeared in 13 high scoring edges in SKCM, all with
id: 10_1101-2021_02_09_430036 author: Goldsborough, Thibaut title: A comparative study of genomic adaptations to low nitrogen availability in Genlisea aurea date: 2021 words: 3128 sentences: 477 pages: 7 flesch: 70 cache: ./cache/10_1101-2021_02_09_430036.pdf txt: ./txt/10_1101-2021_02_09_430036.txt summary: A comparative study of genomic adaptations to low nitrogen availability in Genlisea aurea A comparative study of genomic adaptations to low nitrogen availability in Genlisea aurea is a carnivorous plant that grows on nitrogen-poor waterlogged sandstone aurea's genome, CDS and non-coding DNA 2) Determination of transcriptomic nitrogen content and codon usage bias associated with higher nitrogen content tRNAs (among codons that are coding for the same amino a considerably lower number of nitrogen atoms in its genome than the two other plant species. has higher nitrogen counts per molecular unit in genomic DNA, CDS, Non-Coding DNA, protein, aurea has a higher nitrogen usage in its DNA, RNA and proteins Figure 2: Average number of nitrogen atoms per molecular unit in genomic DNA, CDS, Non-Coding DNA, aurea had lower nitrogen content in tRNA sequences but not in other Figure 3: Bar graph representing the codon usage bias and tRNA nitrogen content in G.
id: 10_1101-2021_02_11_430695 author: Gordon-Rodriguez, Elliott title: Learning Sparse Log-Ratios for High-Throughput Sequencing Data date: 2021 words: 7973 sentences: 817 pages: 12 flesch: 60 cache: ./cache/10_1101-2021_02_11_430695.pdf txt: ./txt/10_1101-2021_02_11_430695.txt summary: Log-ratios are an important class of features for analyzing high-throughput sequencing (HTS) metagenomic data for HTS data, and more generally, high-dimensional CoDa. Unlike existing methods, CoDaCoRe is simultaneously scalable, interpretable, sparse, and accurate. unlabelled datasets, {xi}ni=1, as a method for identiLearning Sparse Log-Ratios for High-Throughput Sequencing Data CoDaCoRe variable selection for the first (most explanatory) log-ratio on the Crohn disease data (Rivera-Pinto et al., 2018). more generally, in the field of CoDa.
id: 10_1101-2020_09_23_310276 author: Greenfest-Allen, Emily title: NIAGADS Alzheimer's GenomicsDB: A resource for exploring Alzheimer's Disease genetic and genomic knowledge date: 2021 words: 5987 sentences: 592 pages: 19 flesch: 52 cache: ./cache/10_1101-2020_09_23_310276.pdf txt: ./txt/10_1101-2020_09_23_310276.txt summary: The NIAGADS Alzheimer's Genomics Database (GenomicsDB) is an interactive
knowledgebase for Alzheimer's disease (AD) genetics that provides access to GWAS summary statistics datasets The website makes available >70 genome-wide summary statistics datasets from GWAS and efficient real-time data analysis and variant or gene report generation. Gene reports provide summaries of co-located ADRD risk-associated variants and have pages linking summary statistics to variant and gene annotations, this resource makes these summary statistics available for browsing (on dataset, gene, and variant reports and as genome NIAGADS GenomicsDB variant reports and a track is available on the genome browser. The NIAGADS GenomicsDB includes allele frequency data from 1000 Genomes (phase 3, version visualizations for summarizing search results and annotations in gene and variant reports. compare NIAGADS GWAS summary statistics tracks to each other, against annotated gene or A detailed report is provided for each of the GWAS summary statistics and ADSP meta-analysis
id: 10_1101-2021_02_13_429885 author: Househam, Jacob title: A fully automated approach for quality control of cancer mutations in the era of high-resolution whole genome sequencing date: 2021 words: 10584 sentences: 1257 pages: 36 flesch: 49 cache: ./cache/10_1101-2021_02_13_429885.pdf txt: ./txt/10_1101-2021_02_13_429885.txt summary: know tumour purity and the ploidy of a CNA segment, then the VAF mutations mapped A fully automated approach for quality control of cancer mutations in the era of high-resolution whole genome sequencing.
id: 10_1101-2020_10_08_327718 author: Jambor, Helena title: Creating Clear and Informative Image-based Figures for Scientific Publications date: 2021 words: 12824 sentences: 1189 pages: 36 flesch: 56 cache: ./cache/10_1101-2020_10_08_327718.pdf txt: ./txt/10_1101-2020_10_08_327718.txt summary: journals in three fields; plant sciences, cell biology and physiology (n=580 papers). figures were uncommon (physiology 16%, cell biology 12%, plant sciences 2%). among papers published in top journals in plant sciences, cell biology and physiology. contained images (plant science: 68%, cell biology: 72%, physiology: 55%). in physiology (49%) and cell biology (55%), and 28% of plant science papers provided and 29% of plant sciences papers contained no scale information on any image. Some publications use insets to show the same image at two different scales (cell Figure 1: Image types and reporting of scale information and insets physiology and plant science papers contained some images that were inaccessible to B: Most papers explain colors in image-based figures, however, explanations are less Figure 4: Using scale bars to annotate image size Creating clear and informative image-based figures for scientific publications.
id: 10_1101-2021_02_08_430280 author: Kasukurthi, Mohan V title: SALTS – SURFR (sncRNA) And LAGOOn (lncRNA) Transcriptomics Suite date: 2021 words: 12363 sentences: 1218 pages: 23 flesch: 58 cache: ./cache/10_1101-2021_02_08_430280.pdf txt: ./txt/10_1101-2021_02_08_430280.txt summary: given transcriptome provided as either a raw user-generated RNA-Seq dataset or NCBI SRR file identifier. SURFR identifies all ncRNA fragments (both annotated and novel) and their expressions in up to ten datasets per comprehensively compare all fragment expressions identified in up to 30 individual datasets by entering multiple SURFR session IDs window detailing each fragment identified in the individual, selected small RNA-Seq dataset. of the results page redirects the user to a SURFR window detailing the expressions of all full length sncRNAs in the provided datasets. Fragments" window (Figure 2D) for each fragment identified in the individual, selected small RNA-Seq dataset within its host gene along with the fragment''s expression (RPM) in each individual small RNA-Seq dataset, and lncRNAs expressed in a given human transcriptome from either a user-provided RNA-Seq dataset or publically More importantly, however, LAGOOn identified MALAT1 as the most highly expressed lncRNA in MDAMB-231 breast cancer cells (Figure 9). id: 10_1101-2021_02_10_430512 author: Kim, Catherine title: Prediction of adverse drug reactions associated with drug-drug interactions using hierarchical classification date: 2021 words: 11859 sentences: 1137 pages: 41 flesch: 56 cache: ./cache/10_1101-2021_02_10_430512.pdf txt: ./txt/10_1101-2021_02_10_430512.txt summary: into DDIs. In this study, a hierarchical machine learning model was created to predict DDIassociated ADRs and pharmacological insight thereof for any drug pair. 
drugs'' chemical structures as inputs to predict their target, enzyme, and transporter (TET) Development of RFCs for Prediction of Target, Enzyme, and Transporter Profiles of Drugs Development of a Model for Prediction of DDI-associated ADRs from TET Profiles of Drugs ADR prediction from Target, Enzyme, and Transporter Profiles of Drug Pairs To predict ADRs of a drug pair from its TET profiles, Random Forest Classifier (RFC), Application of the SVM model for DDI-associated ADRs Involving Three Major Drugs through predicted PRR changes of drug pairs upon removal of each of the targets, enzymes, and changes of drug pairs were predicted by the model upon removal of each of the targets, enzymes, Target, enzyme, and transporter (TET) profiles of atorvastatin and concomitant drugs, id: 10_1101-2020_02_04_934216 author: Kirchoff, Kathryn E. title: EMBER: Multi-label prediction of kinase-substrate phosphorylation events through deep learning date: 2021 words: 8121 sentences: 726 pages: 13 flesch: 59 cache: ./cache/10_1101-2020_02_04_934216.pdf txt: ./txt/10_1101-2020_02_04_934216.txt summary: EMBER: Multi-label prediction of kinase-substrate phosphorylation events through deep learning task of kinase-motif phosphorylation prediction as a multi-label kinase or substrate, as well as protein scaffolds that facilitate structural orientation and downstream catalysis of the reaction, modify the efficacy of motif phosphorylation. prediction of phosphorylation events), a deep learning approach for predicting multi-label kinase-motif phosphorylation relationships. example, the TLK kinase family only has nine positive labels (verified TLK-motif interactions) and more than 10,000 resulting data set is comprised of 7302 phosphorylatable motifs and their reaction-associated kinase families (Table 1). 
The final output is a vector, k, of length eight, where each value corresponds to the probability that the motif a was phosphorylated by one of the kinase families indicated in We sought to illuminate the relationship between kinase-family dissimilarity and phosphorylated motif-group dissimilarity described results provide motivation to incorporate both motif dissimilarity and kinase relatedness into the predictive model, as of kinase-motif prediction compared to the single-label approaches. id: 10_1101-2021_02_09_430536 author: Lin, Cui-Xiang title: Genome-wide prediction and integrative functional characterization of Alzheimer’s disease-associated genes date: 2021 words: 20656 sentences: 4864 pages: 47 flesch: 79 cache: ./cache/10_1101-2021_02_09_430536.pdf txt: ./txt/10_1101-2021_02_09_430536.txt summary: Genome-wide prediction and integrative functional characterization of Alzheimer''s disease-associated genes example, a module-trait network approach was proposed and applied to identify gene 63 functional enrichment-based approach to identify negative genes that are not likely 94 associated genes through an optimal selection of networks and machine learning 98 FGN, and prediction of AD-associated genes using machine learning models (Fig. 1). addition, we tested their enrichment in three AD-related gene sets associated with 122 The top-ranked genes are enriched in AD-associated functions and phenotypes 154 These results provide additional evidence that our predicted genes are associated with 194 The top-ranked genes are associated with AD based on miRNA-target networks 227 We investigated whether top-ranked genes were functionally related to AD-associated 229 We tested whether the top-ranked k genes were more likely to interact with AD-associated 576 related to AD-associated genes or miRNAs based on miRNA-target interaction networks. 
id: 10_1101-2021_02_08_428881 author: Lu, Yang Young title: ACE: Explaining cluster from an adversarial perspective date: 2021 words: 7909 sentences: 790 pages: 12 flesch: 66 cache: ./cache/10_1101-2021_02_08_428881.pdf txt: ./txt/10_1101-2021_02_08_428881.txt summary: A common workflow in single-cell RNA-seq analysis is to project the data to a latent space, cluster the cells in that space, and identify sets of marker genes that explain the differences among the nonlinear embedding model which maps the gene expression to the low-dimensional representation where the groups A notable feature of ACE's approach is that, by identifying genes jointly, the method moves away from the notion Input: gene expression matrix Deep autoencoder learns low-dimensional representation Embedding clustering Clustering is neuralized and concatenated with the encoder Differentiation analysis by ACE Output: gene relevance ACE takes as input a single-cell gene expression matrix and learns a low-dimensional representation for each Next, a neuralized version of the k-means algorithm is applied to the learned representation to identify cell groups. input gene expression profile that lead the neuralized clustering model to alter the assignment from one group to the other. id: 10_1101-2021_02_12_430739 author: Malekian, Negin title: Mutations in bdcA and valS correlate with quinolone resistance in wastewater Escherichia Coli date: 2021 words: 7516 sentences: 1093 pages: 13 flesch: 70 cache: ./cache/10_1101-2021_02_12_430739.pdf txt: ./txt/10_1101-2021_02_12_430739.txt summary: Mutations in bdcA and valS correlate with quinolone resistance in wastewater Escherichia Coli Here, we systematically screen for candidate quinolone resistance-conferring mutations. coli and performed a genome-wide association study (GWAS) correlating over 200,000 mutations against quinolone resistance phenotypes.
significant mutations including one located at the active site of the biofilm dispersal genes bdcA and six silent In summary, we demonstrate that GWAS effectively and comprehensively identifies resistance mutations Keywords: E Coli; Quinolone; Antibiotic Resistance; Genome-Wide Association Study (GWAS) direct route to resistance is mutations in the drug targets gyrA and parC. In summary, we aim to show that a bacterial genomewide association study can effectively and comprehensively identify targets relevant to antibiotic resistance. Based on representative resistance phenotypes, the authors selected 103 isolates for sequencing with Illumina MiSeq, 92 of which are available from coli bdcA may act indirectly on antibiotic resistance. id: 10_1101-2021_02_12_430923 author: Modi, Vivek title: Kincore: a web resource for structural classification of protein kinases and their inhibitors date: 2021 words: 7913 sentences: 666 pages: 18 flesch: 62 cache: ./cache/10_1101-2021_02_12_430923.pdf txt: ./txt/10_1101-2021_02_12_430923.txt summary: Kincore: a web resource for structural classification of protein kinases and their inhibitors result, among the DFGin structures, we distinguished between the catalytically active kinase conformation pages for kinase phylogenetic groups, genes, conformational labels, PDBids, ligands and ligand types. options to download data – database tables as a tab separated files; the kinase structures as PyMOL Kincore provides conformational assignments and ligand type labels to protein kinase structures from Figure 1: Representative protein kinase structure (3ETA_A) displaying the residues used to define inhibitor The distribution of different ligand types across kinase conformations is provided in Table 1. Table 1: Distribution of ligand types across protein kinase conformations (Number of chains). 
including conformational and ligand type labels and C-helix position, kinase family, gene name, Uniprot provides the number of kinase chains in the group across different conformations with their Database table provides the list of all the PDB chains with conformational labels and ligand id: 10_1101-2020_12_24_424317 author: Muazzam, Fariha title: Multi-class Cancer Classification and Biomarker Identification using Deep Learning date: 2021 words: 4252 sentences: 426 pages: 12 flesch: 57 cache: ./cache/10_1101-2020_12_24_424317.pdf txt: ./txt/10_1101-2020_12_24_424317.txt summary: classification, feature extraction and relevant gene identification through deep learning methods for 12 This research picks up from detection of different types of cancer RNA-Seq expressions using deep neural classification of gene expression profiles for different kinds of cancers. Hence, the effectiveness of deep learning models for feature extraction and relevant gene identification is performed revealing substantial results and they produced five high-ranked gene sets and reduced feature This study was aimed at classifying 12 types of cancer and identifying relevant genes and the results show were able to identify cancer-relevant pathways and genes for the sets, that different experiments generated, A deep learning approach for cancer detection and relevant gene Tumor gene expression data classification via sample expansionbased deep learning. 
Identification of a multi-cancer gene expression Multi-class Cancer Classification and Biomarker Identification using Deep Learning id: 10_1101-2020_09_21_305516 author: Nikolic, Ana title: Copy-scAT: Deconvoluting single-cell chromatin accessibility of genetic subclones in cancer date: 2021 words: 10376 sentences: 1280 pages: 32 flesch: 67 cache: ./cache/10_1101-2020_09_21_305516.pdf txt: ./txt/10_1101-2020_09_21_305516.txt summary: Copy-scAT: Deconvoluting single-cell chromatin accessibility of genetic subclones in cancer uses single-cell epigenomic data to infer copy number variants (CNVs) that define cancer cells. We have tested the ability of Copy-scAT to use scATAC data to call CNVs with three different approaches genome sequencing (WGS) data for adult GBM (aGBM) surgical resections (n = 4 samples, 3,647 cells). adult GBM samples identified using both methods, versus total numbers of gains detected by scATAC or Number of chromosome-arm level gains detected in adult GBM samples identified using both methods, (c) Multiple myeloma samples were profiled by both scATAC and the single-cell CNV assay. chromosome-arm level gains detected in adult GBM samples identified using both methods, versus total CNVs are detected in scATAC clusters with Copy-scAT in pediatric GBM samples. id: 10_1101-2021_02_11_430847 author: Pinatti, Lisa M.
title: SearcHPV: a novel approach to identify and assemble human papillomavirus-host genomic integration events in cancer date: 2021 words: 6849 sentences: 788 pages: 26 flesch: 57 cache: ./cache/10_1101-2021_02_11_430847.pdf txt: ./txt/10_1101-2021_02_11_430847.txt summary: SearcHPV: a novel approach to identify and assemble human papillomavirus-host genomic integration events in cancer squamous cell carcinomas; however, the impact of HPV integration into the host human genome SearcHPV uncovered HPV integration sites adjacent to known cancer-related detection of HPV-human integration sites from targeted capture DNA sequencing data. developed a novel HPV integration detection tool for targeted capture sequencing data, which we SearcHPV showed a high frequency of HPV16 integration with a total of six events in UM-SCCIn this study, SearcHPV also called HPV integration sites within TP63. HPV integration sites have been associated with structural variations in the human genome3, 8, 37, which supports an additional genetic mechanism as to why HPV integration sites Genome-wide analysis of HPV integration in human and their integration sites in host genomes through next generation sequencing data. 
identify viruses and their integration sites using next-generation sequencing of human cancer id: 10_1101-2021_02_09_430405 author: Quazi, Sameer title: In-silico Structural and Molecular Docking-Based Drug Discovery Against Viral Protein (VP35) of Marburg Virus: A potent Agent of MAVD date: 2021 words: 5941 sentences: 1038 pages: 23 flesch: 64 cache: ./cache/10_1101-2021_02_09_430405.pdf txt: ./txt/10_1101-2021_02_09_430405.txt summary: In-silico Structural and Molecular Docking-Based Drug Discovery Against Viral Protein (VP35) of Marburg Virus: A potent Agent of MAVD including structure-based drug-like compounds screening from online databases, molecular The final small molecules of drug-like compounds would have more effective and selected for the molecular docking with FGI-103 antiviral drug-using AutoDock 4.2 software. After that, FGI-103 was set and screen other drug-like compounds from PubChem databases. The finally selected drug-like compounds were docked with the P1 site of VP35 of based on ap1 site for ligand in every dock for VP35 MARV utilizing a grid chart of 50 × 50 × 50 The ADMET properties of finally selected drug-like compounds were checked to utilize 2D molecules structure of selected drug-like compounds (A) represents the 2D The molecule structure of three drug-like compounds is shown in Figure 6. 
"In-Silico Structural and Molecular Docking-Based Drug Discovery id: 10_1101-698605 author: Sarantopoulou, Dimitra title: Comparative evaluation of full-length isoform quantification from RNA-Seq date: 2021 words: 12853 sentences: 1332 pages: 37 flesch: 55 cache: ./cache/10_1101-698605.pdf txt: ./txt/10_1101-698605.txt summary: Comparative evaluation of full-length isoform quantification from RNA-Seq Full-length isoform quantification from RNA-Seq is a key goal in transcriptomics analyses benchmarking, isoform quantification, simulated data, pseudo-alignment, RNA-Seq, short Given the difficulty in full-length isoform quantification, many RNA-Seq studies simply analysis performed on the known true isoform quantifications of the simulated data to the For the simulated data we started with 11 real RNA-Seq samples: six liver and six the isoform expression level using idealized and realistic simulated data, with full and true counts), for the set of expressed isoforms in sample 1 in C) idealized and D) realistic data. Method effect on differential expression analysis, using realistic data. RSEM is a gene/isoform abundance tool for RNA-Seq data which uses a generative model S1 Fig. Method effect on full-length isoform quantification using simulated data.
id: 10_1101-2020_09_23_308239 author: Schultz, Bruce T title: The COVID-19 PHARMACOME: A method for the rational selection of drug repurposing candidates from multimodal knowledge harmonization date: 2021 words: 8797 sentences: 1318 pages: 31 flesch: 57 cache: ./cache/10_1101-2020_09_23_308239.pdf txt: ./txt/10_1101-2020_09_23_308239.txt summary: The COVID-19 PHARMACOME: A method for the rational selection of drug repurposing COVID-19 PHARMACOME, a comprehensive drug-target-mechanism graph generated from a initial version of the COVID-19 PHARMACOME, a comprehensive drug-target-mechanism graph representing COVID-19 pathophysiology mechanisms that includes both drug targets Figure 3: Overlap of compound hits between different drug repurposing screening experiments. space overlap between different COVID-19 drug repurposing screenings. The COVID-19 PHARMACOME associates pathways derived from drug repurposing targets Figure 4 shows the distribution of repurposing drugs in the COVID-19 cause-and-effect graph, overlap analysis allows for the identification of repurposing drugs targeting mechanisms that Virus-response mechanisms are targets for repurposing drugs Figure 5: Visualization of drug repurposing candidates (and their targets) used in combination treatment as our own drug repurposing screening results, we were able to identify mechanisms targeted COVID-19 PHARMACOME, we are now able to link repurposing drugs, their targets and the SARS-CoV-2 protein interaction map reveals targets for drug repurposing. 
id: 10_1101-2021_02_10_430619 author: Schutz, Sacha title: Cutevariant: a GUI-based desktop application to explore genetics variations date: 2021 words: 4932 sentences: 632 pages: 8 flesch: 66 cache: ./cache/10_1101-2021_02_10_430619.pdf txt: ./txt/10_1101-2021_02_10_430619.txt summary: Cutevariant: a GUI-based desktop application to explore genetics variations Cutevariant is a user-friendly GUI-based desktop application for genomic research designed to search for variations in DNA samples collected in annotated files and encoded in the Variant Calling Format. application imports data into a local relational database wherefrom complex filter-queries can be built either Key words: genomics, DNA variant, desktop application, Domain Specific Language, Graphic User Interface applications import the data from VCF files into an indexed Cutevariant imports data from VCF files into a normalized Fig. 2: The Cutevariant main view showing the variants list sub-window (middle), different controllers sub-windows but not all are Just like Variant Tools, Cutevariant supports operations Features Cutevariant BrowseVCF VCF-Miner VCF-Explorer VCF-Server VCF-Filters GEMINI Variant Tools SnpSift Comparison of time performance between Cutevariant and VCF-Miner for importation and query execution. 3. Pablo Cingolani, Adrian Platts, Le Lily Wang, Melissa VCF-Miner: GUI-based application for mining variants id: 10_1101-2021_02_11_430762 author: Schäffer, Alejandro A. title: Ribovore: ribosomal RNA sequence analysis for GenBank submissions and database curation date: 2021 words: 16496 sentences: 1489 pages: 28 flesch: 65 cache: ./cache/10_1101-2021_02_11_430762.pdf txt: ./txt/10_1101-2021_02_11_430762.txt summary: Ribovore: ribosomal RNA sequence analysis for GenBank submissions and database curation alignments of SSU, LSU and 5S rRNA from all three domains as well as from organelles, along with secondary structure predictions for selected sequences.
Ribovore software package for the analysis of SSU rRNA and LSU rRNA sequences 18S SSU rRNA database of 1091 sequences was updated most recently on September 27, 2018 by running version 0.28 of the Ribovore program ribodbmaker on an input set of 579,279 GenBank sequences returned from the eukaryotic SSU rRNA The results of ribotyper and rRNA sensor are combined and each sequence is separated into one of four outcome classes depending on whether it passed or failed each input a set of candidate sequences and a specified rRNA model (e.g. SSU.Bacteria) two blastn databases: one of 1267 bacterial and archaeal 16S SSU rRNA sequences id: 10_1101-2021_02_12_430989 author: Sofer, Tamar title: Benchmarking Association Analyses of Continuous Exposures with RNA-seq in Observational Studies date: 2021 words: 8136 sentences: 626 pages: 27 flesch: 48 cache: ./cache/10_1101-2021_02_12_430989.pdf txt: ./txt/10_1101-2021_02_12_430989.txt summary: Benchmarking Association Analyses of Continuous Exposures with RNA-seq in Observational Studies as well as linear regression-based analyses for studying the association of continuous exposures generation of empirical null distribution of association p-values, and we apply the pipeline to Many studies of phenotypes associated with gene expression from RNA-seq consist of small Residual permutation approach for simulations and for empirical p-value computation covariates, and outcome distributions; and (b) their relationships, aside from the exposure-outcome association, are the same as in the real data, we used a residual permutation approach.
association studies applied to residual permutations were included to compute empirical papproach to study the distribution of p-values under the null of no association between the phenotypes and RNA-seq, and used this approach to further study power, and to compute approaches for transcriptome-wide analysis of RNA-seq in population-based studies, including more comprehensive study of statistical permutation approaches for RNA-seq association id: 10_1101-2021_02_09_430550 author: Song, Dongyuan title: scPNMF: sparse gene encoding of single cells to facilitate gene selection for targeted gene profiling date: 2021 words: 13512 sentences: 1548 pages: 37 flesch: 64 cache: ./cache/10_1101-2021_02_09_430550.pdf txt: ./txt/10_1101-2021_02_09_430550.txt summary: (scPNMF) method to select informative genes from scRNA-seq data in an unsupervised way. Therefore, for scRNA-seq data analysis, informative gene selection Besides scRNA-seq data analysis, informative gene selection is also crucial for designing number and a scRNA-seq dataset, scPNMF selects informative genes based on its weight matrix; First, the informative genes selected by scPNMF lead to the most accurate cell clustering. the informative genes and weight matrix of scPNMF lead to the best cell type prediction accuracy Figure 3: Benchmarking scPNMF against 11 informative gene selection methods on seven scRNA-seq (b) UMAP visualization of cells in the Zheng4 dataset based on 100 informative genes selected by We benchmark scPNMF against the 11 gene selection methods in terms of cell type prediction We propose scPNMF, an unsupervised gene selection and data projection method for scRNA-seq For cell type prediction, we project every targeted gene profiling dataset and its scRNA-seq id: 10_1101-2021_02_10_430705 author: Stassen, Shobana V. 
title: VIA: Generalized and scalable trajectory inference in single-cell omics data date: 2021 words: 13590 sentences: 1383 pages: 24 flesch: 53 cache: ./cache/10_1101-2021_02_10_430705.pdf txt: ./txt/10_1101-2021_02_10_430705.txt summary: VIA: Generalized and scalable trajectory inference in single-cell omics data strategy to compute pseudotime, and reconstruct cell lineages based on lazy-teleporting random walks Step 1: Single-cell level graph is clustered such that each node user defined start cell) is first computed by the expected hitting time for a lazy-teleporting random walk along an network topology and single-cell level pseudotime/lineage probability properties onto an embedding using GAMs, as The cell fates and their lineage pathways are then computed by a two-stage probabilistic method, graph-traversal allows it to infer cell fates when the underlying data spans combinations of multifurcating detected cell fates annotated (o) lineage pathway and gene-pseudotime trend shown for the CD41 Megakaryocytic Figure 3 VIA infers trajectories in single-cell multi-omic and image datasets (a) Major lineages of human Single cells are represented by graph nodes that are connected based on id: 10_1101-727867 author: Tangherloni, Andrea title: scAEspy: a tool for autoencoder-based analysis of single-cell RNA sequencing data date: 2021 words: 15281 sentences: 2865 pages: 28 flesch: 72 cache: ./cache/10_1101-727867.pdf txt: ./txt/10_1101-727867.txt summary: scAEspy: a tool for autoencoder-based analysis of single-cell RNA sequencing data This computational tool allows for coupling low-dimensional probabilistic representation of gene expression data with the downstream analysis to consider the Finally, the currently available AEs cannot be directly exploited to obtain the latent space or to generate synthetic cells.
to show the cells in this embedded space or as a starting point for other dimensionality reduction approaches (e.g., t-SNE and UMAP) as well as downstream analyses Non-linear approaches for dimensionality reduction can be effectively used to capture the non-linearities among the gene interactions that may exist in the high-dimensional expression space of scRNA-Seq data [16]. be effectively applied to analyse disparate types of single-cell data from different flexible method developed to cluster single-cell data; (ii) a centroid is calculated batch-effect correction methods for single-cell rna sequencing data. Wang, D., Gu, J.: VASC: dimension reduction and visualization of single-cell RNA-seq data by deep id: 10_1101-2021_02_12_431018 author: Truong Nguyen, Phuoc title: HaVoC, a bioinformatic pipeline for reference-based consensus assembly and lineage assignment for SARS-CoV-2 sequences. date: 2021 words: 3786 sentences: 502 pages: 14 flesch: 61 cache: ./cache/10_1101-2021_02_12_431018.pdf txt: ./txt/10_1101-2021_02_12_431018.txt summary: HaVoC, a bioinformatic pipeline for reference-based consensus assembly and lineage assignment for SARS-CoV-2 sequences. Several new variants of SARS-CoV-2 have emerged globally, of which the based assemblies on raw SARS-CoV-2 sequences in addition to identifying lineages to detect variants of concern, we have developed an open source bioinformatic pipeline called HaVoC monitor the spread of SARS-CoV-2 variants of concern during local outbreaks. currently being used in Finland for monitoring the spread of SARS-CoV-2 variants.
SARS-CoV2, variant detection, reference assembly, lineage identification, coronavirus, surveillance of virus variants by sequencing the SARS-CoV-2 genomes would provide a fast to query SARS-CoV-2 fastq sequence libraries and assigns lineages to them individually in processing and a reference genome of SARS-CoV-2 in a separate FASTA file. The likelihood of emergence of novel SARS-CoV-2 variants of concern is increased and Emerging SARS-CoV-2 Variants. id: 10_1101-2021_02_11_430789 author: Tyagin, Ilya title: Accelerating COVID-19 research with graph mining and transformer-based learning date: 2021 words: 9408 sentences: 807 pages: 9 flesch: 58 cache: ./cache/10_1101-2021_02_11_430789.pdf txt: ./txt/10_1101-2021_02_11_430789.txt summary: Accelerating COVID-19 research with graph mining and transformer-based learning develop text mining techniques that can help the science community answer high-priority scientific questions related to COVID-19. is currently customized and available in the open domain to massively process COVID-19 related queries. Both systems are the next generation of the AGATHA knowledge network mining transformer model [37]. (1) Most of the existing HG systems are domain-specific (e.g., gene-disease interactions) that is usually expressed in limiting the processed information (e.g., significant filtering vocabulary and papers a trained deep bi-LSTM model for extracting predicates from unstructured text. For instance, the node representing the entity "COVID-19" is connected to every sentence and predicate that The prior AGATHA semantic network only includes UMLS terms that appear in SemMedDB predicates [18] which is a major limitation. obtain embeddings per node in the semantic graph, we train AGATHA system ranking model.
id: 10_1101-2021_02_11_430871 author: Vadnais, David title: ParticleChromo3D: A Particle Swarm Optimization Algorithm for Chromosome and Genome 3D Structure Prediction from Hi-C Data date: 2021 words: 10071 sentences: 1053 pages: 24 flesch: 63 cache: ./cache/10_1101-2021_02_11_430871.pdf txt: ./txt/10_1101-2021_02_11_430871.txt summary: ParticleChromo3D: A Particle Swarm Optimization Algorithm for Chromosome and Genome 3D Structure Prediction from Hi-C Data chromosome and genome structure reconstruction from Hi-C data using Particle Swarm Optimization approach chromosome bin, according to the particle swarm algorithm, and then iterates its position towards a global best This paper presents ParticleChromo3D, a new distance-based algorithm for chromosome 3D structure The structures generated by ParticleChromo3D also shows that the result at swarm size Structures generated by ParticleChromo3D at different swarm size values. obtained by comparing the ParticleChromo3D algorithm's output structure to the simulated dataset's true plot of ParticleChromo3D SCC performance on 500KB GM12878 cell Hi-C data for chromosome 1 to 23. chromosome 3D structure reconstruction algorithms on the GM12878 data set at both the 1MB and 500KB chromosome and genome structures reconstructed from Hi-C data. id: 10_1101-2021_02_10_430606 author: Wei, Zheng title: NeuronMotif: Deciphering transcriptional cis-regulatory codes from deep neural networks date: 2021 words: 12013 sentences: 1107 pages: 31 flesch: 63 cache: ./cache/10_1101-2021_02_10_430606.pdf txt: ./txt/10_1101-2021_02_10_430606.txt summary: Each point is a decoupled motif generated by a sample set of sequence. Only the max activation value of the decoupled motifs in Fig. 3b are significantly higher than the decoupled motifs of other neurons in layer 3 of Basset-3 model.
discovered (q-value < 0.001) from the neuron in convolutional output layer of Basset, BD-5 and BD-10 model. c, The number of motif discovered (q-value < 0.01) from the neuron in layer 3 of Basset model using different sub-patterns in the input feature map of the max pooling layer to split the sequences set of which are DNA-sequence based DCNN models with 3 general convolutional layers for stacking sequences of different synonymous motifs with the maximum activation value In summary, we presented NeuronMotif as an effective algorithm to reveal the cis-regulatory motif grammar learned by DCNN model that use DNA sequence to annotate sequences indicate more synonymous motif mixture in this DCNN model. id: 10_1101-2021_02_10_430649 author: Wen, Zi-Hang title: Bfimpute: A Bayesian factorization method to recover single-cell RNA sequencing data date: 2021 words: 8418 sentences: 1302 pages: 11 flesch: 71 cache: ./cache/10_1101-2021_02_10_430649.pdf txt: ./txt/10_1101-2021_02_10_430649.txt summary: Bfimpute: A Bayesian factorization method to recover single-cell RNA sequencing data Recovering dropout events in a sparse gene expression matrix for scRNA-seq data is a long-standing matrix completion We introduce Bfimpute, a Bayesian factorization imputation algorithm that reconstructs two latent gene and cell matrices to impute final gene expression matrix within each cell group, with or without the aid of cell type labels or bulk Bfimpute achieves better accuracy than other six publicly notable scRNA-seq imputation methods on simulated Key words: single cell; RNA-seq; imputation; Bayesian factorization impute dropout events by adopting the bulk RNA-seq data imputation of single cell RNA-seq data could be applied by Bfimpute recovers dropout values and improves cell type identification in the simulated data. and the imputed data by Bfimpute, scImpute, and DrImpute for the human embryonic stem cell differentiation study. imputation method scimpute for single-cell rna-seq data.
id: 10_1101-2021_02_10_430604 author: Youngblut, Nicholas D. title: Struo2: efficient metagenome profiling database construction for ever-expanding microbial genome datasets date: 2021 words: 1409 sentences: 157 pages: 4 flesch: 56 cache: ./cache/10_1101-2021_02_10_430604.pdf txt: ./txt/10_1101-2021_02_10_430604.txt summary: Struo2: efficient metagenome profiling database construction for ever-expanding microbial genome datasets Mapping metagenome reads to reference databases is the standard approach for reference databases often lack recently generated genomic data such as method for constructing custom databases; however, the pipeline does not scale well with the not allow for efficient database updating as new data are generated. HUMAnN3 databases that can be easily updated with new genomes and/or individual gene Struo2 enables feasible database generation for continually increasing large-scale ● Pre-built databases: http://ftp.tue.mpg.de/ebio/projects/struo2/ ● Utility tools: https://github.com/nick-youngblut/gtdb_to_taxdump Metagenome profiling involves mapping reads to reference sequence databases and is computational resources, which led us to create Struo for straight-forward custom metagenome CPU hours per genome versus ~2.4 for Struo (Figure 1B). taxonomy (available at https://github.com/nick-youngblut/gtdb_to_taxdump ).
(2020) Struo: a pipeline for building custom databases for id: 10_1101-2021_02_10_430656 author: Zakeri, Mohsen title: A like-for-like comparison of lightweight-mapping pipelines for single-cell RNA-seq data pre-processing date: 2021 words: 6557 sentences: 568 pages: 7 flesch: 64 cache: ./cache/10_1101-2021_02_10_430656.pdf txt: ./txt/10_1101-2021_02_10_430656.txt summary: A like-for-like comparison of lightweight-mapping pipelines for single-cell RNA-seq data pre-processing benchmark comparing the kallisto-bustools pipeline (2) for single-cell demonstrate that, when configured to match the computational complexity of kallisto-bustools as closely as possible, alevin-fry processes Alevin-fry (3) is a new pipeline for single-cell RNA-seq benchmarking STARsolo (9), kallisto-bustools (2) and alevin-fry (3), out new tools like alevin-fry for the pre-processing of single-cell data, (1), we have now created a simple-to-follow tutorial for speed-optimized single-cell pre-processing using alevin-fry (https:// by Booeshaghi and Pachter (1) change when a like-for-like comparison between alevin-fry and kallisto-bustools is carried out, we The time and memory used by the relevant steps of the alevin-fry and kallisto-bustools pipelines for pre-processing the 20 diverse tagged-end single-cell RNA-seq datasets used in (1). A comparison of the resulting count matrices obtained from alevin-fry and kallisto-bustools, as run in this manuscript, for the pbmc_10k_v3 dataset.
peak memory than alevin-fry, with the kallisto-bustools pipeline using id: 10_1101-2021_02_08_430275 author: Zhang, Jianbo title: Next-generation sequencing-based bulked segregant analysis without sequencing the parental genomes date: 2021 words: 6404 sentences: 694 pages: 6 flesch: 68 cache: ./cache/10_1101-2021_02_08_430275.pdf txt: ./txt/10_1101-2021_02_08_430275.txt summary: Next-generation sequencing-based bulked segregant analysis without sequencing the parental genomes identified using BSA-Seq, a technology in which next-generation sequencing (NGS) is applied to bulked segregant analysis (BSA). recently developed the significant structural variant method for BSA-Seq data analysis that exhibits higher detection power than standard to analyze BSA-Seq data in which genome sequences of one parent served as the reference sequences in genotype calling, and thus We analyzed a public BSA-Seq dataset using our modified method and the standard allele frequency and Gmethod allows the detection of such associations without sequencing the parental genomes, leading to further lower the the BSA-Seq data with the genome sequences of both the parents when the parental genome sequences are used to aid BSA-Seq data The allele frequency method: The ΔAF value of each SNP in BSA-Seq data analysis using the genome sequences of both the parents and the bulks. BSA-Seq data analysis using only the bulk genome sequences. id: 10_1101-2020_05_15_090266 author: Zhang, R. title: SpacePHARER: Sensitive identification of phages from CRISPR spacers in prokaryotic hosts date: 2021 words: 2191 sentences: 283 pages: 6 flesch: 64 cache: ./cache/10_1101-2020_05_15_090266.pdf txt: ./txt/10_1101-2020_05_15_090266.txt summary: Summary: SpacePHARER (CRISPR Spacer Phage-Host Pair Finder) is a sensitive and fast tool for de novo prediction of phage-host relationships via identifying phage genomes that match CRISPR spacers in genomic or metagenomic data.
SpacePHARER gains sensitivity by comparing spacers and phages at the protein level, optimizing its scores for matching SpacePHARER by searching a comprehensive spacer list against all complete phage genomes. methods compare individual CRISPR spacers with phage To increase sensitivity, (1) we compare protein coding sequences because phage genomes are mostly coding, and, (0) Preprocess input: scan the phage genome and CRISPR spacers in six ORFs q of CRISPR spacers extracted from one prokaryotic genome, and each target set T comprises the putative protein sequences t from a single phage. The performance of SpacePHARER was evaluated on the spacer test set against a target database predicted the correct host for more phages than BLASTN BLASTN in detecting phage-host pairs, due to searching id: 10_1101-2021_02_08_430070 author: Zhang, Yao-zhong title: On the application of BERT models for nanopore methylation detection date: 2021 words: 5183 sentences: 586 pages: 7 flesch: 60 cache: ./cache/10_1101-2021_02_08_430070.pdf txt: ./txt/10_1101-2021_02_08_430070.txt summary: On the application of BERT models for nanopore methylation detection with deep learning models, have achieved significant performance improvements on nanopore methylation recurrent patterns of positional-signal-shift in the context window surrounding target 5-methylcytosine that the refined BERT model can achieve competitive or even better results than the state-of-the-art biRNN of datasets from the different research groups, BERT models demonstrate a good generalization Fig. 1: Basic BERT''s and refined BERT''s model structure used for methylation detection. 
a refined BERT model to take account of signal-shift patterns in the proposed refined BERT model achieves a competitive or even better result explore applying the BERT model for the nanopore methylation detection 2.2 Applying BERT models for nanopore methylation For the cross-sample evaluation, we train models on one dataset and test a BERT model to pay more attention to center positions. In-sample evaluation of different deep learning models on 5mC datasets. id: 10_1101-2021_02_01_429246 author: Zheng, Hongyu title: Sequence-specific minimizers via polar sets date: 2021 words: 15440 sentences: 1407 pages: 24 flesch: 71 cache: ./cache/10_1101-2021_02_01_429246.pdf txt: ./txt/10_1101-2021_02_01_429246.txt summary: minimizers focus on sampling fewer k-mers on a random sequence and use universal hitting sets (sets suggests, a UHS is a set of k-mers that "hits" every w-long window of every possible sequence (hence the the elements of the polar sets are in the sequence: the higher the energy, the more spread apart the k-mers have densities upper bounded by |U|/σk, because only k-mers from the universal hitting set can be selected. Section 2.2 gives a formal definition of the link energy of a polar set and Theorem 1 gives upper and lower bounds using this link energy for the density of a minimizer compatible with a polar set. form a link, which in turn is the number of k-mer pairs in the polar set that are exactly w bases away on S. A context is charged if the minimizer selects a different k-mer in the first window than in the second ==== make-pages.sh questions ==== make-pages.sh search ==== make-pages.sh topic modeling corpus Zipping study carrel