#bioc2020-workshops

2020-06-23

Marcel Ramos Pérez (16:15:28): > @Marcel Ramos Pérez has joined the channel

Marcel Ramos Pérez (16:15:29): > set the channel description: BioC2020 Workshops

Kevin Rue-Albrecht (16:39:27): > @Kevin Rue-Albrecht has joined the channel

2020-06-24

Mikhail Dozmorov (20:18:07): > @Mikhail Dozmorov has joined the channel

2020-06-30

Frank Rühle (06:20:25): > @Frank Rühle has joined the channel

2020-07-02

Qian Liu (13:18:22): > @Qian Liu has joined the channel

2020-07-07

Charlotte Soneson (10:10:32): > @Charlotte Soneson has joined the channel

2020-07-10

Veronica Jimenez Jacinto (20:06:36): > @Veronica Jimenez Jacinto has joined the channel

2020-07-13

Mike Smith (04:02:56): > @Mike Smith has joined the channel

Mike Smith (04:04:53): > I was looking at the Bioc2020 website and wondered what the different workshop levels mean? What’s a 100 level workshop compared to 200 level?

Charlotte Soneson (05:43:32): > Roughly: > * 100: introductory workshops, which do not require so much previous experience with Bioconductor > * 200: more advanced workshops emphasizing use of Bioconductor for common tasks, assuming some experience with Bioconductor > * 500: either more complex analysis workflows, or workshops more aimed at (new or current) developers

Mike Smith (09:00:09): > Thanks! Is that explained anywhere? Maybe the aim is to avoid words like ‘beginner’ or ‘advanced’, but it just confused me.

Charlotte Soneson (09:01:44): > It was described in the 2018 workshop book: https://bioconductor.github.io/BiocWorkshops/. Not sure about this year actually. - Attachment (bioconductor.github.io): The Bioconductor 2018 Workshop Compilation > This book contains all the workshops presented at the Bioconductor 2018 Conference

Mike Smith (09:09:18): > Can’t say I read the 2020 website cover-to-cover, but I didn’t see it mentioned. I like the descriptions in that 2018 version though - I think ‘200 series’ rather than ‘200 level’ makes it sound more like a course grouping than something where I need to have earned some credits to be allowed up to the next rung.

Mike Smith (09:10:16): > I’ll also take the opportunity to say thanks to everyone who’s organising the conference, I’m really looking forward to it, rather than just sounding like I’m critiquing the website.

Charlotte Soneson (09:15:01): > Issue opened: https://github.com/Bioconductor/BioC2020/issues/116 @Mikhail Dozmorov - what do you think?

Mikhail Dozmorov (09:28:24): > We indeed don’t have a clear explanation of the workshop labels. I think, the

Mikhail Dozmorov (09:29:14): > “level” corresponds to the level of complexity/skills required for workshops. Will put together some explanations on the site.

Mikhail Dozmorov (09:42:30): > Suggesting the wording: https://github.com/Bioconductor/BioC2020/issues/116#issuecomment-657567957 @Charlotte Soneson, @Mike Smith what do you think?

Charlotte Soneson (09:47:50): > I think the 500 category may need to be a bit broader - the way I remember the discussion during the original classification was that it was basically two types of workshops - those aimed at (new or current) developers, and those that represented more complex workflows. Maybe @Levi Waldron and @Lorena Pantano have some thoughts here too.

Mike Smith (09:57:47): > Sorry, I wrote my thoughts in github. Just copying them here: > > I think the descriptions are good and definitely clarify the groupings for me. > > A couple of things strike me: > > - Why are the levels 100, 200, 500? Why not 1, 2, 3? Maybe there’s a good reason for that, but it just seems odd to me. Is a level 500 2.5 times harder than level 200? I think part of the problem is that somehow the word ‘level’ makes me think the number is important - that’s why I prefer ‘series’ from the 2018 book, which seems more categorical. > > - I also find ‘N level’ strange to parse grammatically. I guess that’s because you see ‘Level 100’ all the time, but ‘100 Level Workshops’ makes me think we’re describing how many levels there are. It’s a pretty minor point, but I find it jarring and it probably adds to my confusion.

Lorena Pantano (14:00:07): > @Lorena Pantano has joined the channel

Lorena Pantano (14:37:37): > yes, I followed the directions as Charlotte has said. I was helping @Matthew McCall, and I think that comes from other conferences?

Aedin Culhane (14:47:11): > @Aedin Culhane has joined the channel

2020-07-14

Kayla Interdonato (14:14:49): > @Kayla Interdonato has joined the channel

2020-07-16

Spencer Nystrom (08:46:13): > @Spencer Nystrom has joined the channel

Petr Smirnov (09:43:15): > @Petr Smirnov has joined the channel

James MacDonald (09:43:39): > @James MacDonald has joined the channel

Kelly Street (09:43:54): > @Kelly Street has joined the channel

Ruth Schmidt (09:45:55): > @Ruth Schmidt has joined the channel

Christopher Eeles (09:46:06): > @Christopher Eeles has joined the channel

Petra Palenikova (15:23:26): > @Petra Palenikova has joined the channel

2020-07-17

Julia Hlavka-Zhang (09:39:10): > @Julia Hlavka-Zhang has joined the channel

2020-07-18

Devika Agarwal (05:19:41): > @Devika Agarwal has joined the channel

Ting Sun (14:03:56): > @Ting Sun has joined the channel

Pratima Chennuri (17:45:20): > @Pratima Chennuri has joined the channel

2020-07-20

Mariela (05:22:59): > @Mariela has joined the channel

Mikhael Manurung (07:01:12): > @Mikhael Manurung has joined the channel

Monika Krzak (12:41:48): > @Monika Krzak has joined the channel

Rydham Goyal (13:15:46): > @Rydham Goyal has joined the channel

Brianna Barry (18:28:55): > @Brianna Barry has joined the channel

Paula Beati (18:38:10): > @Paula Beati has joined the channel

shr19818 (22:13:14): > @shr19818 has joined the channel

2020-07-21

Helen Horkan (07:27:54): > @Helen Horkan has joined the channel

Frederick Tan (11:34:58): > @Frederick Tan has joined the channel

2020-07-22

B P Kailash (03:14:46): > @B P Kailash has joined the channel

2020-07-23

Lori Shepherd (09:53:50): > @Lori Shepherd has joined the channel

Tania Guerrero (22:28:58): > @Tania Guerrero has joined the channel

2020-07-24

Levi Waldron (04:45:55): > @Levi Waldron has joined the channel

Maria Doyle (05:44:04): > @Maria Doyle has joined the channel

Isha Goel (07:47:51): > @Isha Goel has joined the channel

Shian Su (09:52:32): > @Shian Su has joined the channel

Koen Van den Berge (11:16:37): > @Koen Van den Berge has joined the channel

dnusskern (15:05:15): > @dnusskern has joined the channel

Leonardo Collado Torres (16:16:17): > @Leonardo Collado Torres has joined the channel

2020-07-26

Hans-Richard Brattbakk (06:16:26): > @Hans-Richard Brattbakk has joined the channel

Reza Rezaei (09:59:08): > @Reza Rezaei has joined the channel

Keren Xu (21:40:21): > @Keren Xu has joined the channel

Alex Bott (22:07:56): > @Alex Bott has joined the channel

2020-07-27

Alexandra Garnham (03:14:57): > @Alexandra Garnham has joined the channel

Triin (03:29:58): > @Triin has joined the channel

Aditi Verma (05:19:31): > @Aditi Verma has joined the channel

Sonali (08:57:39): > @Sonali has joined the channel

Vivek Das (09:12:01): > @Vivek Das has joined the channel

Levi Waldron (10:42:18): > The original descriptions were lost, but they were: 100 “learn”, 200 “use”, 500 “develop”

Kevin Rue-Albrecht (10:59:01): > and what’s the reason again for skipping 300 and 400 ?

Jenny Drnevich (11:21:43): > @Jenny Drnevich has joined the channel

Jenny Drnevich (11:38:36): > It would help to have a chat for the workshop zoom sessions

Charlotte Soneson (11:39:45): > You can still use the pathable chat for the respective session.

Jenny Drnevich (11:42:31): > Thanks @Charlotte Soneson!

CristinaChe (12:05:16): > @CristinaChe has joined the channel

Kozo Nishida (12:47:47): > @Kozo Nishida has joined the channel

Corina Lesseur (12:55:08): > @Corina Lesseur has joined the channel

2020-07-28

Levi Waldron (08:58:10): > @Marcel Ramos Pérez @Sehyun Oh @C. Mirzayi (please do not tag this account) are you here?

Sehyun Oh (08:58:16): > @Sehyun Oh has joined the channel

C. Mirzayi (please do not tag this account) (08:58:16): > @C. Mirzayi (please do not tag this account) has joined the channel

Dania Machlab (09:02:40): > @Dania Machlab has joined the channel

Jeroen Gilis (09:02:43): > @Jeroen Gilis has joined the channel

Malte Thodberg (09:03:02): > @Malte Thodberg has joined the channel

Hyun-Hwan Jeong (09:03:13): > @Hyun-Hwan Jeong has joined the channel

Levi Waldron (09:18:55): > <!channel> this is a good place to ask workshop-related questions - we’ll ask workshop authors and TAs to keep an eye out here.

Levi Waldron (09:19:29): > Use polls for questions to be asked at the end of the workshop, here for other discussion, real-time questions, etc

Jenny Drnevich (09:20:12): > I just counted and I have 6 windows open for Ludwig’s workshop: 1) Zoom for the presentation, 2) pathable for the chat/polls, 3) this Slack channel for real-time Q & A per @Levi Waldron’s suggestion, 4) RStudio server instance where I found the 5) presentation slides that are good for links and references and 6) the workshop vignette Geistlinger_enrichOmics.html. Luckily I have 3 monitors and I don’t need all windows at once, but it seems a bit extreme!

Levi Waldron (09:22:30) (in thread): > Ugh, good point @Jenny Drnevich :frowning:. Not sure this channel is really necessary, it may be more useful for after-the-workshop.

Jenny Drnevich (09:24:45) (in thread): > Slack will be more permanent than pathable so it’s not a bad idea to have the Q & A here, but it’s just one more thing

Jenny Drnevich (09:36:22): > Does getGenesets() get GOALL mappings?

joost groot (09:42:05): > @joost groot has joined the channel

Marcin Kaszkowiak (09:48:20): > @Marcin Kaszkowiak has joined the channel

Mikie Phan (09:48:21): > @Mikie Phan has joined the channel

Zuzanna Nowicka (09:49:21): > @Zuzanna Nowicka has joined the channel

Kirill Tsyganov (09:59:12): > @Kirill Tsyganov has joined the channel

Robert Castelo (09:59:17): > @Robert Castelo has joined the channel

Izabela Mamede (10:00:05): > @Izabela Mamede has joined the channel

Charlie George (10:01:33): > @Charlie George has joined the channel

Vince Carey (10:03:20): > @Vince Carey has joined the channel

Marcel Ramos Pérez (10:04:17) (in thread): > tagging @Ludwig Geistlinger

Ludwig Geistlinger (10:04:21): > @Ludwig Geistlinger has joined the channel

syliu (10:04:44): > @syliu has joined the channel

syliu (10:08:28): > @Ludwig Geistlinger Thank you for the great workshop! I want to follow up with the question “in the ORA example you showed, what’s the gene background used for the enrichment analysis?”

Ludwig Geistlinger (10:10:11) (in thread): > With mode = "GO.db" it uses topGO::annFUN.org under the hood for retrieving the mapping. Not sure whether this answers the question?

Ludwig Geistlinger (10:10:46) (in thread): > also: use mode = "biomart" to obtain from Biomart

Ludwig Geistlinger (10:16:34) (in thread): > Thank you for this question. Please see Supplementary Discussion S2.2.2 Choosing the background of the benchmarking paper (https://academic.oup.com/bib/article/doi/10.1093/bib/bbz158/5722384) for details and implications of the different choices here. In short, we build the background by intersecting genes contained in the gene set collection and genes measured in the experiment.
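
The background construction described above can be sketched in base R. This is only a minimal illustration under stated assumptions: the gene IDs, the variable names and the `ora_pvalue` helper are hypothetical, and the one-sided hypergeometric test via `phyper` is a generic ORA test, not necessarily what the workshop package does internally.

```r
# Toy data (hypothetical gene IDs, not taken from the workshop)
gene_sets <- list(
  setA = c("g1", "g2", "g3", "g4"),
  setB = c("g3", "g5", "g9")
)
measured <- paste0("g", 1:8)           # genes measured in the experiment
de_genes <- c("g1", "g2", "g3", "g6")  # genes called differentially expressed

# Background: genes in the gene set collection AND measured in the experiment
background <- intersect(unique(unlist(gene_sets)), measured)

# Hypothetical helper: one-sided hypergeometric p-value for one gene set
ora_pvalue <- function(set, de, bg) {
  set <- intersect(set, bg)        # restrict the set to the background
  de  <- intersect(de, bg)         # restrict the DE genes to the background
  k   <- length(intersect(set, de))  # DE genes that fall in the set
  # P(X >= k) when drawing length(de) genes from the background
  phyper(k - 1, length(set), length(bg) - length(set), length(de),
         lower.tail = FALSE)
}

p_values <- vapply(gene_sets, ora_pvalue, numeric(1),
                   de = de_genes, bg = background)
```

Note that `g6` is measured but belongs to no gene set, and `g9` is in a gene set but was not measured, so both drop out of the test; this is the practical effect of the intersection.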

syliu (10:22:21) (in thread): > Thank you! very detailed discussion :+1:

Ludwig Geistlinger (10:22:23) (in thread): > it seems to be the reality of an all-online format? On the other hand, I would see the fully rendered vignette as the ultimate point to go to, if I needed to prioritize.

Sonali (10:39:27): > <!here>: @Haibo Liu gave an excellent introduction to ATAC-seq, his slides are stored here: https://github.com/haibol2016/ATACseqQCWorkshop/blob/master/inst/vignettes/ATACseqQC.pptx. T

Haibo Liu (10:39:31): > @Haibo Liu has joined the channel

Yuri Kotliarov (10:52:10): > @Yuri Kotliarov has joined the channel

Prat (10:59:34): > @Prat has joined the channel

Prat (11:00:31): > After ATACSeqQC, should only NFR fragments be used for downstream analysis? i.e. remove mono-, di-, tri-, etc.

Daniela Cassol (11:03:04): > @Daniela Cassol has joined the channel

Levi Waldron (11:16:17) (in thread): > @Haibo Liu a question for you

Veronica Jimenez Jacinto (12:00:18) (in thread): > Yes, I agree with you. Thanks for the link.

Nur M Shahir (13:07:22): > @Nur M Shahir has joined the channel

Mikhail Dozmorov (13:10:29) (in thread): > Also, thanks to @Jianhong for helping with the session. @Haibo Liu has answered all questions; please check the chat on the workshop page.

Joselyn Chávez (13:11:19): > @Joselyn Chávez has joined the channel

Jianhong (13:12:17): > @Jianhong has joined the channel

Lauren Hsu (13:56:36): > @Lauren Hsu has joined the channel

Spencer Nystrom (14:22:31): > I had a question about the ATACSeqQC steps. What is the reasoning behind removing reads less than 38bp? Do these small fragments correspond to a specific type of poor quality signal? Can signal in this size range be used as an additional data quality metric?

Mikhail Dozmorov (14:27:18) (in thread): > @Haibo Liu?

Dr Awala Fortune O. (15:54:56): > @Dr Awala Fortune O. has joined the channel

Jenny Drnevich (17:11:34) (in thread): > @Ludwig Geistlinger I had a chance to check this out. Looks like topGO::annFUN.org is pulling from org.Hs.eg.db using an older db method and not the newer select() method, and they are only pulling the directly annotated GO terms for each gene and not the parental GO terms as well.

Jenny Drnevich (17:12:42) (in thread):
> > go.gs <- getGenesets(org = "hsa", db = "go", onto = "BP", mode = "GO.db")
> Using cached version from 2020-07-28 21:06:43
> > go.gs[1]
> $`GO:0000002_mitochondrial_genome_maintenance`
>  [1] "10000" "1890"  "291"   "4205"  "4358"  "4976"  "55154" "55186" "80119" "84275" "92667"
> [12] "9361"
>
> > select(org.Hs.eg.db, keys = "GO:0000002", columns = "ENTREZID",keytype = "GO")
> 'select()' returned 1:many mapping between keys and columns
>            GO EVIDENCE ONTOLOGY ENTREZID
> 1  GO:0000002      TAS       BP      291
> 2  GO:0000002      IMP       BP     1890
> 3  GO:0000002      ISS       BP     4205
> 4  GO:0000002      IMP       BP     4358
> 5  GO:0000002      IMP       BP     4976
> 6  GO:0000002      NAS       BP     9361
> 7  GO:0000002      IMP       BP    10000
> 8  GO:0000002      IBA       BP    55154
> 9  GO:0000002      IDA       BP    55186
> 10 GO:0000002      IEA       BP    80119
> 11 GO:0000002      IDA       BP    84275
> 12 GO:0000002      IMP       BP    92667
> > select(org.Hs.eg.db, keys = "GO:0000002", columns = "ENTREZID",keytype = "GOALL")
> 'select()' returned 1:many mapping between keys and columns
>         GOALL EVIDENCEALL ONTOLOGYALL ENTREZID
> 1  GO:0000002         IMP          BP      142
> 2  GO:0000002         TAS          BP      291
> 3  GO:0000002         IDA          BP     1763
> 4  GO:0000002         IMP          BP     1890
> 5  GO:0000002         IBA          BP     3980
> 6  GO:0000002         ISS          BP     4205
> 7  GO:0000002         IMP          BP     4358
> 8  GO:0000002         IMP          BP     4976
> 9  GO:0000002         IMP          BP     7156
> 10 GO:0000002         IEA          BP     7157
> 11 GO:0000002         NAS          BP     9361
> 12 GO:0000002         IMP          BP    10000
> 13 GO:0000002         IEA          BP    10891
> 14 GO:0000002         IEA          BP    11232
> 15 GO:0000002         IBA          BP    55154
> 16 GO:0000002         IDA          BP    55186
> 17 GO:0000002         IEA          BP    80119
> 18 GO:0000002         IEA          BP    83667
> 19 GO:0000002         IDA          BP    84275
> 20 GO:0000002         IMP          BP    92667
> 21 GO:0000002         TAS          BP    92667
> 22 GO:0000002         IEA          BP   201163
> 23 GO:0000002         IMP          BP   219736
>
> I also found this recently in the rrvgo package: https://support.bioconductor.org/p/131510/

Ludwig Geistlinger (17:21:03) (in thread): > it’s a good point, but rather than replacing it, I think a better solution would be to make this an argument to the function. I think there are scenarios where you want to have just the directly annotated genes and other scenarios where you want to make use of the transitivity. have to think about it.

Ludwig Geistlinger (17:21:12) (in thread): > thanks for looking closer!

Jenny Drnevich (17:52:24) (in thread): > An argument for backwards compatibility is a good idea. Using only direct annotations could reduce the redundancy of GO terms, but it still seems like those genes should be included in the counts for parental terms. I used to use GOstats::hyperGTest() with the conditional test that would remove genes from significant child terms before testing their parental terms. But then quicker, easier functions & packages came along…

Julie Zhu (18:05:40): > @Julie Zhu has joined the channel

Ludwig Geistlinger (18:17:14) (in thread): > I mean the question is whether the annotation function (here getGenesets) or the enrichment method should take care of the implications of the parent-child scheme. From what I have seen so far (e.g. topGO elim, weight, and parentchild algorithms, mgsa, GOstats) it would be rather taken care of by the enrichment method. But still, I see the point to also have the option to have this available at the point of annotation.

Jenny Drnevich (18:20:38) (in thread): > Yes, if I just wanted to annotate a list of genes then I only pull the direct GO annotations. It is the enrichment methods that should account for pulling genes from child terms when testing parent terms. If the enrichment methods you use all take these into account, then there isn’t an issue.
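
The difference between direct annotations and annotations propagated to parent terms, as discussed in this thread, can be sketched on a toy ontology in base R. Everything here is an assumption for illustration: the term names, gene names and the `ancestors` helper are hypothetical, and the code does not reproduce how org.Hs.eg.db, topGO or getGenesets build their mappings.

```r
# Hypothetical child -> parent edges of a tiny ontology (a DAG)
parents <- list(
  T_child = "T_mid",
  T_mid   = "T_root",
  T_other = "T_root",
  T_root  = character(0)
)

# Direct annotations only (analogous in spirit to the GO keytype)
direct <- list(
  T_child = c("geneA", "geneB"),
  T_mid   = "geneC",
  T_other = "geneD",
  T_root  = character(0)
)

# All ancestors of a term, following parent edges recursively
ancestors <- function(term) {
  p <- parents[[term]]
  unique(c(p, unlist(lapply(p, ancestors))))
}

# Propagate each term's genes to all of its ancestors
# (analogous in spirit to the GOALL keytype)
propagated <- direct
for (term in names(direct)) {
  for (anc in ancestors(term)) {
    propagated[[anc]] <- union(propagated[[anc]], direct[[term]])
  }
}

lengths(direct)      # gene counts per term, direct annotations only
lengths(propagated)  # gene counts per term after propagation
```

In this toy case the root term has no direct genes but collects all four genes after propagation, which mirrors the 12 versus 23 Entrez IDs seen for GO:0000002 above.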

Haibo Liu (22:35:55): > @Haibo Liu has joined the channel

Haibo Liu (22:39:17) (in thread): > Please see this paper for the rationale why we should remove fragments < 38 bp: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3046479/ - Attachment (PubMed Central (PMC)): Rapid, low-input, low-bias construction of shotgun fragment libraries by high-density in vitro transposition > We characterize and extend a highly efficient method for constructing shotgun fragment libraries in which transposase catalyzes in vitro DNA fragmentation and adaptor incorporation simultaneously. We apply this method to sequencing a human genome and …

Haibo Liu (22:42:31) (in thread): > If you have a large number of effective reads (passing all filtering criteria), you may use only NFR fragments for peak calling; otherwise the power to detect open chromatin regions will be compromised.

2020-07-29

Marc RJ Carlson (08:39:47): > @Marc RJ Carlson has joined the channel

Ludwig Geistlinger (11:46:20): > set the channel topic: Discussions and questions regarding individual Bioc2020 workshops

Ludwig Geistlinger (12:38:39): > <!here>: there were a number of questions in the enrichOmics workshop yesterday where time didn’t permit me to answer them, and which I collected and answered here. I am not sure how to best reach everyone who asked a question, but I am also pasting these answers in the chat section of the workshop of the pathable conference website. - File (Plain Text): enrichOmics_questions.txt

Leonardo Collado Torres (12:47:48): > For the recount workshop, I collected the questions on a gist and advertised that gist on Twitter, the pathable website and Slack (I think, maybe I forgot that one)

Mirko Signorelli (12:51:14): > @Mirko Signorelli has joined the channel

Nick Owen (13:02:25): > @Nick Owen has joined the channel

David Burton (13:04:44): > @David Burton has joined the channel

Kevin Rue-Albrecht (13:18:04): > Cross-posting here - Attachment: Attachment > For those who haven’t spotted it on Twitter, I’ve just posted replies to all our Q&A’s https://twitter.com/KevinRUE67/status/1288516611805151234?s=20

James Ashmore (13:32:28): > @James Ashmore has joined the channel

Hyun-Hwan Jeong (14:00:11): > I have a question on the PCA workshop. Is pre-filtering of scRNA-seq data mandatory prior to running PCA?

Levi Waldron (14:17:06) (in thread): > @Aedin Culhane

Leonardo Collado Torres (14:18:52) (in thread): > here’s the gist I was referring to https://gist.github.com/lcolladotor/e8fba98ece3126ba7b0a843490edde63 for the recount workshop

bogdan tanasa (14:44:08): > @bogdan tanasa has joined the channel

Lukas Weber (16:04:38): > @Lukas Weber has joined the channel

Kasper D. Hansen (17:02:06): > @Kasper D. Hansen has joined the channel

Christopher Eeles (17:18:26): > <!here> 200: A workshop on discovering biomarkers from high-throughput response screens - we will be answering questions about this workshop in this thread!

Christopher Eeles (17:19:12) (in thread): > Q1: Where can one get both drug/radiation data and genomic data for the same cell lines? Any recommended databases?

Christopher Eeles (17:19:19) (in thread): > Q2: Beyond the PDX and radiation features, is there any reason why PharmacoGx needs to be cancer specific? Would it make sense to consider generalizing to in vitro and in vivo pharmacology studies in non-oncology indications?

Christopher Eeles (17:19:31) (in thread): > Q3: Is it easy to create your own Xena object?

Christopher Eeles (17:26:21) (in thread): > Workshop vignette: https://bioc2020.pathable.co/meetings/virtual/vsatG7WGPAbHprox5 - Attachment (bioc2020.pathable.co): BioC 2020 > BioC 2020

Riyue Sunny Bao (17:36:35): > @Riyue Sunny Bao has joined the channel

Nitesh Turaga (18:01:12): > @Nitesh Turaga has joined the channel

Nitesh Turaga (18:01:38): > @Qian Liu Questions from the workshop: > 1. if you have a very large project involving hundreds of samples, what are your suggestions? > 2. do you have samples for running Rcwl in a cluster?

Qiang Hu (18:26:51): > @Qiang Hu has joined the channel

Qian Liu (18:29:59): > For the other question, you can make a pull request here for adding new tools/pipelines: https://github.com/hubentu/RcwlRecipes/tree/master/Rcwl

Qiang Hu (18:39:45): > For large projects with many samples, Rcwl has a function runCWLBatch to run jobs in parallel via BiocParallel. This function requires a list of inputs, for example, a list of fastqs for a list of samples. This function will use sge/slurm/… to assign a job for each sample to different computing nodes. I have used this tool to run jobs for more than one hundred samples. It works smoothly.

Qiang Hu (18:41:43): > We have a small example to run jobs in a cluster in the devel vignette. I am happy to share my experience running heavy jobs in clusters.

2020-07-30

Sachendra Kumar (01:32:52): > @Sachendra Kumar has joined the channel

Markus Schroeder (08:36:26): > @Markus Schroeder has joined the channel

Leonardo Collado Torres (08:44:42): > Qian Liu and Qiang Hu also created the #rcwl channel for more discussions about their work

Kevin Rue-Albrecht (08:45:34): > I nominate DelayedArray as the workshop of the year :slightly_smiling_face: https://www.youtube.com/watch?time_continue=281&v=Ew_3RdtszBs&feature=emb_logo - Attachment (YouTube): Effectively using the DelayedArray framework to support the analysis of large datasets

Kevin Rue-Albrecht (09:45:34): > Hoping that Terra/AnVIL people can make it to the Reproducibility BoF tomorrow 12pm EST. I’m aiming at an interactive discussion and I think they would have some nice insights about it! :slightly_smiling_face:

Leonardo Collado Torres (09:58:33): > Hi @Sehyun Oh & Levi, I had a few questions: > * Are there bandwidth recommendations for using Terra? > * If you make a plot on Terra, does it try to make it into a small figure file? > * Can you run interactive visualizations like using plotly or running a “local” shiny app?

Leonardo Collado Torres (10:02:40): > The first 2 are in the context of the CDSB workshops we run. I’m more afraid of bandwidth problems and either the website being too slow to use or frequent disconnections than helping students install software on their computers. > > We’ve also heard feedback that students don’t like it if we teach using a local HPC or something because then they run into the installation issues by themselves and can get stuck there once the workshop is over. But then again, it’s hard to scale a class if you want to have time to help each student: even now virtually for #CDSB2020 we set a cap at 50 students. (The other non technical reason being that we want to get to know everyone and try to establish personal trust relationships, which helps when we send direct emails like @Alejandro Reyes described and for community building purposes)

Alejandro Reyes (10:02:43): > @Alejandro Reyes has joined the channel

Kevin Rue-Albrecht (10:02:52) (in thread): > @Levi Waldron

Ana Beatriz Villaseñor Altamirano (10:03:01): > @Ana Beatriz Villaseñor Altamirano has joined the channel

Kevin Rue-Albrecht (10:03:31) (in thread): > @Sehyun Oh

Leonardo Collado Torres (10:03:49) (in thread): > Just pinging others here: @Joselyn Chávez @Ana Beatriz Villaseñor Altamirano @Veronica Jimenez Jacinto

Frederick Tan (10:18:41) (in thread): > I imagine the bandwidth recommendations are similar to Galaxy … most things are fairly light but specific applications could be more heavy

Frederick Tan (10:19:04) (in thread): > Can definitely run plot.ly and Shiny! @Vince Carey @BJ Stubbs

Frederick Tan (10:20:18) (in thread): > e.g. this workshop material docs.google.com/document/d/1vR555ldiejOaOzlM9TVYUlKzxvab5NrlD5jHzOmvutw

Levi Waldron (10:43:54) (in thread): > Yes!

Levi Waldron (10:44:52) (in thread): > I mean no, I have to be at the Scientific Advisory Board meeting :disappointed:. @Sehyun Oh can you make it?

Kevin Rue-Albrecht (10:45:39) (in thread): > you mean you can’t double-meeting that one too? :stuck_out_tongue_winking_eye:

Sehyun Oh (10:47:09) (in thread): > Yes, I can make it.

Kevin Rue-Albrecht (10:47:39) (in thread): > (@Levi Waldron not sure if anyone told you in the end that you made an appearance in Aedin’s workshop :wink: was fun)

Levi Waldron (10:47:44) (in thread): > app.terra.bio doesn’t have high bandwidth requirements, and I think generally Cloud computing is better than working locally on a slow connection because large downloads never come to your computer.

Levi Waldron (10:48:14) (in thread): > Heh, yes I did realize that :smile: well I won’t rule out making another double-meeting appearance!

Levi Waldron (10:51:28) (in thread): > I’ll have to let @Sehyun Oh answer your other two questions :)

Adrija Kalvisa (10:53:21): > @Adrija Kalvisa has joined the channel

Aedin Culhane (11:00:57): > Questions from @Mikhail Dozmorov HiC workshop

Aedin Culhane (11:00:59): > * when we do HiC analysis, usually we assume no chromosomal aberrations in the underlying genome. how does diffHiC deal with translocations or CNV ? > * also if you can please add the question how we can call the LOOPS in diffHIC or the LOOPS are called externally > * When extracting chr data, the loop say 1:22, is that chr 1:22. Are XY ignored? Why? > * Is there an intuitive explanation why you see those non-linear relationships in the MD plot?

Wendy Wong (11:06:04): > @Wendy Wong has joined the channel

Kirill Tsyganov (11:08:58) (in thread): > can I also add this question > * when you merging multiple data sets (.hic files) into single hic_exp under the hood a full outer join happens, is it ever problematic? like if samples have big difference in library size/contact frequencies types (i.e start and end coordinates are different). if this is a problem do you have any suggestions how to filter out low abundance contacts ?

Mirko Signorelli (11:21:57): > This might be a pretty stupid question, but I will ask it anyway :see_no_evil:: is it possible to ask more than one question in the Polls tab? It seems that I am only able to edit the question I already asked, but cannot ask another one…

Sehyun Oh (11:23:38) (in thread): > 2) I don’t think Terra does anything extra to affect the figure size when you make a plot. 3) Interactive visualization (plotly, shiny app) is available in Terra.

Marcel Ramos Pérez (11:25:08): > I agree the interface isn’t user-friendly. If you click on the submit word in “submit again”, you should be able to add another question.

Mirko Signorelli (11:25:58): > Great, thanks for your help! :slightly_smiling_face:

Marcel Ramos Pérez (11:31:59): > Follow up > Q: Is there a way to cancel asking a question after you’ve clicked on the “Enter your own” text? > A: There is no “Cancel” button. AFAIK you’d have to refresh your browser.

Ludwig Geistlinger (11:42:23) (in thread): > I also like how the haircut changes between performances within the workshop

Nitesh Turaga (12:13:46): > Lunch with core team link: https://pathable.zoom.us/j/95005217841?pwd=UmdSWU5sMzBCQmFORVZSSHJlRVhCQT09

Levi Waldron (12:15:08): > I missed it, did the admin person say why the video wasn’t working?

Nitesh Turaga (12:15:16): > No…

Levi Waldron (12:16:00): > I thought I heard “it was your fault” or “it wasn’t your fault”, not sure which would be better :smile:

Nitesh Turaga (12:16:10): > Haha!

Nitesh Turaga (12:19:41): > Someone with moderator privileges should filter questions we have already answered in the polls.

Kayla Interdonato (12:27:59): > <!here> I had some unanswered questions from my package workshop. I’ve gone through and answered these questions, which you can find in the following text file. Thanks for listening and I hope everyone got some great takeaways :smiley: - File (Plain Text): workshopQA.txt

Mikhail Dozmorov (12:40:08) (in thread): > > when we do HiC analysis, usually we assume no chromosomal aberrations in the underlying genome. How does diffHiC deal with translocations or CNV ? > > * HiCcompare can remove the effect of CNVs. We assume that translocations, duplications, and deletions would lead to changes in chromatin interaction frequencies. If one dataset has them and the other doesn’t, the differences in interaction frequencies (M value) will be consistently shifted. Loess normalization will regress this effect out.

Mikhail Dozmorov (12:40:28) (in thread): > > also if you can please add the question how we can call the LOOPS in diffHIC or the LOOPS are called externally > > * HiCcompare and multiHiCcompare detect differential loops and interactions. To call loops in individual Hi-C matrices, I suggest Fit-Hi-C or HiCCUPS. Please, review the list of Hi-C analysis tools for various purposes, including loop calling: https://github.com/mdozmorov/HiC_tools

Mikhail Dozmorov (12:41:13) (in thread): > > When extracting chr data, the loop say 1:22, is that chr 1:22. Are XY ignored? Why? > > * We can analyze Hi-C matrices on X and Y chromosomes. The tutorial processes them separately simply because X and Y do not fit into the numeric 1-22 for-loop sequence.
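
One way to cover X and Y in the same loop as chromosomes 1-22 is to iterate over chromosome names rather than numbers. A minimal base-R sketch, assuming a "chr" prefix and a placeholder per-chromosome step (both are illustrative assumptions, not the tutorial's code):

```r
# c() coerces the numbers to character, so this yields "chr1", ..., "chr22", "chrX", "chrY"
chromosomes <- paste0("chr", c(1:22, "X", "Y"))

per_chr <- lapply(chromosomes, function(chr) {
  # Placeholder for the per-chromosome work, e.g. reading that chromosome's matrix;
  # here it only returns the name so the sketch stays self-contained.
  chr
})
names(per_chr) <- chromosomes
```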

Mikhail Dozmorov (12:41:34) (in thread): > > Is there an intuitive explanation why you see those non-linear relationships in the MD plot? > > * These are technical biases. Sequence-specific biases will affect two Hi-C datasets (genomes) to the same extent, so their effect on interaction frequencies will be the same. Technical biases will affect the data unpredictably, leading to complex nonlinear deviations. That’s why we chose data-driven loess regression to remove such unpredictable and impossible to model biases.
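
The idea of removing a nonlinear trend from an MD plot with loess can be sketched on simulated data in base R. This is a hedged illustration only: the interaction frequencies, the sine-shaped bias and the `span` value are made up, and the code uses `stats::loess` directly rather than any HiCcompare function.

```r
set.seed(1)
n <- 500
D <- sample(1:50, n, replace = TRUE)   # distance between interacting regions (toy units)
if1 <- rexp(n, rate = 0.01) + 1        # interaction frequencies, dataset 1 (simulated)
bias <- 0.5 * sin(D / 8)               # hypothetical distance-dependent technical bias
if2 <- if1 * 2^(bias + rnorm(n, sd = 0.1))  # dataset 2 = dataset 1 + bias + noise

M <- log2(if2 / if1)                   # log fold change between the two datasets
fit <- loess(M ~ D, span = 0.5)        # data-driven estimate of the trend of M along D
M_adj <- M - predict(fit)              # subtract the fitted trend from M

c(before = sd(M), after = sd(M_adj))   # spread of M before and after the adjustment
```

Because the trend is estimated from the data, no explicit model of the bias is needed, which is the point made in the answer above.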

Mikhail Dozmorov (12:42:01) (in thread): > > when you merging multiple data sets (.hic files) into single hic_exp under the hood a full outer join happens, is it ever problematic? like if samples have big difference in library size/contact frequencies types (i.e start and end coordinates are different). if this is a problem do you have any suggestions how to filter out low abundance contacts? > > * Hi-C matrices must be the same resolution, which is typical if processed by the same pipeline. Internally, the full-merge works by merging pairs of regions occurring in multiple datasets and filling zeros for other pairs. Low-abundance interactions should always be interpreted with caution. In the differential analysis, we have a “logCPM” column corresponding to average interaction frequency. If low, use caution.
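
The full merge with zero filling described above can be sketched with base R's `merge`. The column names, the toy contact lists and the filtering threshold are assumptions for illustration, and a plain average stands in for the "logCPM" column; this is not the package's internal code.

```r
# Two toy sparse Hi-C contact lists at the same resolution (hypothetical values)
hic1 <- data.frame(region1 = c(0, 0, 100000),
                   region2 = c(0, 100000, 200000),
                   IF1     = c(50, 12, 3))
hic2 <- data.frame(region1 = c(0, 100000, 200000),
                   region2 = c(0, 200000, 200000),
                   IF2     = c(40, 5, 60))

# Full outer join on the region pair; pairs absent from one dataset become NA
merged <- merge(hic1, hic2, by = c("region1", "region2"), all = TRUE)

# Fill missing interaction frequencies with zeros
merged$IF1[is.na(merged$IF1)] <- 0
merged$IF2[is.na(merged$IF2)] <- 0

# Drop low-abundance pairs by average frequency (the cutoff of 5 is arbitrary)
merged$avgIF <- (merged$IF1 + merged$IF2) / 2
kept <- merged[merged$avgIF >= 5, ]
```

Pairs that exist in only one dataset survive the join with a zero in the other, so an abundance filter of this kind is one way to treat them with the caution suggested above.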

Ayush Raman (12:42:37): > @Ayush Raman has joined the channel

C. Mirzayi (please do not tag this account) (13:01:15): > Thank you to everyone who attended my workshop on Monday (Epidemiology for Bioinformaticians), I have answered all the questions that I didn’t have time to answer here: https://cmirzayi.github.io/EpiForBioWorkshop2020/articles/questionResponses.html Happy to answer additional questions either here, as an issue on Github, or via email! - Attachment (cmirzayi.github.io): Answers to BioC2020 Questions > EpiForBioWorkshop2020

Rene Welch (14:12:46): > @Rene Welch has joined the channel

Antonio Colaprico (16:30:33): > @Antonio Colaprico has joined the channel

Tanya Grancharova (16:33:09): > @Tanya Grancharova has joined the channel

Erica Feick (17:20:39): > @Erica Feick has joined the channel

Aedin Culhane (17:41:53): > After workshops, anyone want to hang out/chat? Pop into https://meet.bioconductor.org/yEoLukjsyI - Attachment (meet.bioconductor.org): Jitsi Meet > Join a WebRTC video conference powered by the Jitsi Videobridge

Leonardo Collado Torres (18:01:20): > @Avi Srivastava @Michael Love > > How is tximeta() building the TxDb object? Would it benefit from GenomicState::GenomicStateHub(type = "TxDb")? http://research.libd.org/GenomicState/reference/GenomicStateHub.html (Q by LCT) Edit: I see that you use GenomicFeatures::makeTxDbFromGFF() at https://github.com/mikelove/tximeta/blob/master/R/tximeta.R#L685-L690. GenomicState would be similar to the EnsDb approach you have at https://github.com/mikelove/tximeta/blob/master/R/tximeta.R#L640-L641. Though I need to update GenomicState for Gencode v32 & newer > > At http://research.libd.org/GenomicState/reference/gencode_txdb.html I do also use GenomicFeatures, but I do other things so the resulting TxDb is something I can use with derfinderPlot() more easily. It’s a package I made years after the initial suggestion by @Davide Risso in his review of recountWorkflow to move the 8-15 lines of code to another home to make it easier for users. I think that I have a valid *Hub key from @Lori Shepherd, so I could add support for newer Gencode versions too if you want (human and mouse + some other updates I owe Lori :stuck_out_tongue:). If you wanted to use them, you could access the data either using GenomicState::GenomicStateHub() or directly with AnnotationHub. - Attachment (research.libd.org): Create a Gencode TxDb object — gencode_txdb > This function builds a transcript database (TxDb) object which you can then use to build a Gencode GenomicState object. This function will download the data from Gencode, import it into R, process it and build the TxDb object.

Avi Srivastava (18:01:26): > @Avi Srivastava has joined the channel

Michael Love (18:01:26): > @Michael Love has joined the channel

Davide Risso (18:01:26): > @Davide Risso has joined the channel

Michael Love (18:03:30): > @Leonardo Collado Torres this is great, so if you have the GENCODE GTFs parsed and on AnnotationHub I will use those, i thought they were not there though

Leonardo Collado Torres (18:04:35): > they weren’t hehe, that’s why I made GenomicState

Michael Love (18:04:37): > but does that put you in a position of having to manually parse the GTF with every gencode release?

Michael Love (18:04:58): > seems like it should be done programmatically so no one is having to do it manually each time

Leonardo Collado Torres (18:05:01): > yes and no, you can use GenomicState::gencode_txdb() to build it on the fly

Leonardo Collado Torres (18:05:16): > but to submit to the Hubs, I do need to run it

Leonardo Collado Torres (18:05:59): > https://github.com/LieberInstitute/GenomicState/blob/master/R/gencode_txdb.R

Michael Love (18:06:19): > i think i’m missing the crux though, is your function much faster than makeTxDbFromGFF?

Michael Love (18:06:32): > but generates an equivalent txdb?

Leonardo Collado Torres (18:08:34): > I would need to compare them again to remember the details (I’ll get back to you later, just popped a beer with Aedin and co :P)

Michael Love (18:09:16): > ok sounds good :slightly_smiling_face:

Michael Love (18:09:23): > gonna go get dinner with my little ones

Ray Su (18:13:11): > @Ray Su has joined the channel

Levi Waldron (18:31:44): > Happy hour at https://meet.bioconductor.org/yEoLukjsyI !

Connie Li Wai Suen (20:27:31): > @Connie Li Wai Suen has joined the channel

bogdan tanasa (22:45:28): > Dear Aedin (@Aedin Culhane), thank you again for a very vibrant and informative workshop! If I may ask, which materials would you recommend for an easier math intro to UMAP and tSNE? (My question may have been asked before; sorry if I have missed the answer :slightly_smiling_face:) Many thanks!

Noor Pratap Singh (22:52:34): > @Noor Pratap Singh has joined the channel

2020-07-31

Mikie Phan (08:31:14) (in thread): > Dear @Mikhail Dozmorov, thank you for the great workshop! What are your suggestions for normalization of asymmetric matrices from capture Hi-C experiments (where targeted loci are enriched by hybridization with biotinylated oligo probes)?

Krutika (08:38:17): > @Krutika has joined the channel

Leonardo Collado Torres (09:29:47): - File (R): Gencode_TxDb_GenomicsState_vs_GenomicFeatures.R

Michael Love (09:34:05): > I’ll definitely add code to check for GENCODE in AHub

Leonardo Collado Torres (09:34:08): > here you go @Michael Love. The resulting objects are very similar; it’s just that the one from GenomicState has the seqinfo() data. I need the chr length for some derfinderPlot code. If you wanted to subset for 1 chr, then GenomicState::gencode_txdb() has an argument for doing so. So the main difference is having the data on the Hub already processed vs downloading the GTF file and building the TxDb object. > > I do realize that I need to add support for mouse from Gencode + latest human versions.

Leonardo Collado Torres (09:34:18): > ahh, you beat me hehe

Michael Love (09:35:17): > but then for the case it’s not in the AHub, what is the benefit from GenomicState vs makeTxDbFromGFF? I’m already adding the chr lengths via a UCSC call (not me but GenomeInfoDb functions actually)

Leonardo Collado Torres (09:37:59): > I’m also using GenomeInfoDb. The advantage was for the end user, who only has to run a single function instead of several lines of code, which was Davide’s point in his review

Michael Love (09:38:44): > aha, so i think i would stick with my current codebase but will add checks for AHub for GENCODE (whereas currently i only do this for Ensembl): https://github.com/mikelove/tximeta/issues/40

Mikhail Dozmorov (09:40:31) (in thread): > Thank you, @Mikie Phan. Capture Hi-C is different. I haven’t worked with such data yet; I only know of the CHiCAGO pipeline developed for Capture Hi-C data analysis. Try searching for the “capture” keyword on this page: https://github.com/mdozmorov/HiC_tools. You may find some other tools for Capture Hi-C normalization.

Leonardo Collado Torres (09:41:22):
> > txdb_gs_live <- gencode_txdb(genome = "hg38")
> 2020-07-31 09:38:40 importing ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_31/gencode.v31.annotation.gtf.gz
> trying URL 'ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_31/gencode.v31.annotation.gtf.gz'
> downloaded 40.8 MB
>
> 2020-07-31 09:39:53 keeping relevant chromosomes
> 2020-07-31 09:39:53 preparing metadata
> Prepare the 'metadata' data frame ... OK
> 2020-07-31 09:39:59 building the txdb object
> Warning message:
> In .get_cds_IDX(mcols0$type, mcols0$phase) :
>   The "phase" metadata column contains non-NA values for features of type stop_codon. This information was
>   ignored.
> > txdb_gs_live
> TxDb object:
> # Db type: TxDb
> # Supporting package: GenomicFeatures
> # Data source: ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_31/gencode.v31.annotation.gtf.gz
> # Organism: Homo sapiens
> # Taxonomy ID: 9606
> # miRBase build ID: NA
> # Genome: hg38
> # Nb of transcripts: 226882
> # Db created by: GenomicFeatures package from Bioconductor
> # Creation time: 2020-07-31 09:40:47 -0400 (Fri, 31 Jul 2020)
> # GenomicFeatures version at creation time: 1.40.0
> # RSQLite version at creation time: 2.2.0
> # DBSCHEMAVERSION: 1.2
> > seqinfo(txdb_gs_live)
> Seqinfo object with 25 sequences (1 circular) from hg38 genome:
>   seqnames seqlengths isCircular genome
>   chr1      248956422      FALSE   hg38
>   chr2      242193529      FALSE   hg38
>   chr3      198295559      FALSE   hg38
>   chr4      190214555      FALSE   hg38
>   chr5      181538259      FALSE   hg38
>   ...             ...        ...    ...
>   chr21      46709983      FALSE   hg38
>   chr22      50818468      FALSE   hg38
>   chrX      156040895      FALSE   hg38
>   chrY       57227415      FALSE   hg38
>   chrM          16569       TRUE   hg38

Leonardo Collado Torres (09:41:29): > just showing how it works “live”

Michael Love (09:43:09): > i think we converged on probably similar code to get those lengths for GENCODE txomes

Michael Love (09:43:22): > :smile:

Leonardo Collado Torres (09:53:07): > hehe yup

Pablo Rodriguez (12:57:54): > @Pablo Rodriguez has joined the channel

Vivek Das (15:06:52): > Can anyone share the video link to Pete’s talk/workshop here?

Sehyun Oh (15:13:40): > @Vivek Das https://www.youtube.com/watch?time_continue=281&v=Ew_3RdtszBs&feature=emb_logo - Attachment (YouTube): Effectively using the DelayedArray framework to support the analysis of large datasets

Vivek Das (15:16:01): > Thanks :pray: @Sehyun Oh

Petr Smirnov (17:21:24) (in thread): > Question answers added here for posterity’s sake: > 1. The datasets available from the packages are curated from large, public data projects, including the DepMap from the Sanger and Broad, but also other publications. All are available through the package. > 2. Nothing specifically limits these packages to cancer screening. We focus on cancer because of our own research; furthermore, cancer provides many genomically and transcriptomically different models that can be used to assess associations of particular molecular states with treatment response. > 3. Xeva is actually quite easy to use with your own data. We provide several interfaces to load your data into a Xeva object, including one that loads directly from the popular StudyLog tool, as well as a more open spreadsheet-based data import.

Aedin Culhane (18:12:26): > Post-conference chat on bit.ly/bioc2020 is still going on if you want to join

2020-08-03

Hena Ramay (01:45:42): > @Hena Ramay has joined the channel

Junyan Xu (13:07:21): > @Junyan Xu has joined the channel

2020-08-05

rohitsatyam102 (09:58:01): > @rohitsatyam102 has joined the channel

2020-08-06

Lukas Weber (23:35:41): > I think it was mentioned that workshop materials will be available for one week after the conference, and talks/presentations for a full year. For the workshops, will the videos/presentations also be available for a year? (so the one week only applies to the online computational materials)?

2020-08-07

Kevin Rue-Albrecht (04:49:06): - Attachment: Attachment > By August 14:

Kevin Rue-Albrecht (04:50:59) (in thread): > I’m aware it doesn’t mention workshops. I think @Sean Davis has the full answer

Lukas Weber (10:32:56) (in thread): > thanks, just asked in the larger channel too

Kevin Rue-Albrecht (10:34:03) (in thread): > I saw. Well done. I think that other channel is better suited for the question in fact. :thumbsup:

2020-08-10

Huipeng Li (04:01:47): > @Huipeng Li has joined the channel

2020-08-30

Bob Policastro (11:00:33): > @Bob Policastro has joined the channel

2020-09-02

Ying Xu (22:04:05): > @Ying Xu has joined the channel

2020-09-03

Modupeh Betts (21:23:34): > @Modupeh Betts has joined the channel

2020-09-17

David Zhang (06:51:02): > @David Zhang has joined the channel

2020-10-07

Nathan (23:23:51): > @Nathan has joined the channel

2020-10-14

Zhiwei Bao (02:32:06): > @Zhiwei Bao has joined the channel

Nitika Kandhari (19:57:31): > @Nitika Kandhari has joined the channel

2020-10-24

Kevin Stachelek (20:41:20): > @Kevin Stachelek has joined the channel

2020-10-26

Jeffrey O’Brien (14:03:00): > @Jeffrey O’Brien has joined the channel

2020-11-18

eugenia.galeota (05:38:27): > @eugenia.galeota has joined the channel

Spiro Stilianoudakis (18:27:18): > @Spiro Stilianoudakis has joined the channel

2020-12-02

Konstantinos Geles (Constantinos Yeles) (05:42:17): > @Konstantinos Geles (Constantinos Yeles) has joined the channel

2020-12-05

Mahmoud Ahmed (23:50:47): > @Mahmoud Ahmed has joined the channel

2020-12-11

Dario Righelli (12:07:52): > @Dario Righelli has joined the channel

Dario Righelli (12:28:50): > Hi guys, I’m having an issue with the EuroBioc2020 workshop template, do I post it here or somewhere else?

Charlotte Soneson (12:34:30) (in thread): > Probably #biocworkshops is the best place if it’s a technical issue

Dario Righelli (12:36:41) (in thread): > Thanks!

2020-12-12

Huipeng Li (00:41:30): > @Huipeng Li has joined the channel

2020-12-15

Francesc Català (05:57:49): > @Francesc Català has joined the channel

2020-12-16

Francesc Català (11:41:09): > @Francesc Català has left the channel

2020-12-19

Soumya Banerjee (16:37:22): > @Soumya Banerjee has joined the channel

2020-12-21

Harithaa Anand (04:10:24): > @Harithaa Anand has joined the channel

2020-12-22

Mike Smith (07:04:54): > @Mike Smith has left the channel

2020-12-31

Alexander Toenges (12:01:22): > @Alexander Toenges has joined the channel

2021-01-01

Bernd (14:03:32): > @Bernd has joined the channel

2021-01-17

rohitsatyam102 (03:46:54): > https://www.festivalofgenomics.com/register

2021-01-22

Annajiat Alim Rasel (15:41:41): > @Annajiat Alim Rasel has joined the channel

2021-02-07

Mikhael Manurung (11:10:04): > @Mikhael Manurung has left the channel

2021-02-24

Kyle Alford (11:40:05): > @Kyle Alford has joined the channel

2021-04-20

Levi Waldron (11:51:38): > archived the channel