#developers-forum

2019-08-08

Mike Smith (03:52:52): > @Mike Smith has joined the channel

2019-08-09

Charlotte Soneson (15:08:10): > @Charlotte Soneson has joined the channel

2019-08-14

Martin Morgan (06:14:32): > @Martin Morgan has joined the channel

Martin Morgan (06:19:39): > Here’s an email describing the forum https://stat.ethz.ch/pipermail/bioc-devel/2019-August/015359.html with first meeting Thursday 12 noon Eastern at https://bluejeans.com/136043474?src=join_info (Meeting ID: 136 043 474). It’ll be recorded…

Peter Hickey (06:55:21): > @Peter Hickey has joined the channel

Laurent Gatto (06:55:39): > @Laurent Gatto has joined the channel

Deepak Tanwar (08:07:33): > @Deepak Tanwar has joined the channel

Kayla Interdonato (08:13:30): > @Kayla Interdonato has joined the channel

hcorrada (08:22:40): > @hcorrada has joined the channel

Lori Shepherd (08:24:04): > @Lori Shepherd has joined the channel

Jayaram Kancherla (09:12:54): > @Jayaram Kancherla has joined the channel

Craig (09:25:22): > @Craig has joined the channel

Stephanie Hicks (09:27:25): > @Stephanie Hicks has joined the channel

Marcel Ramos Pérez (10:32:23): > @Marcel Ramos Pérez has joined the channel

Leonardo Collado Torres (11:21:45): > @Leonardo Collado Torres has joined the channel

Aaron Lun (11:44:22): > @Aaron Lun has joined the channel

kipper fletez-brant (11:51:32): > @kipper fletez-brant has joined the channel

Aaron Lun (11:51:44): > @Mike Smith I’m not sure I want to present something in a formal sense, but I don’t mind giving updates on the state of play of various packages.

Mike Smith (12:31:33): > If you wouldn’t mind giving a 10-15 minute update on things that would be awesome. My plan was for someone to ‘present’ something on that kind of scale, with the intention that it would lead into a similar length discussion, so incomplete or experimental things are good (probably encouraged).

Mike Smith (12:32:18): > I plan on showing some recent updates to biomaRt & to gauge opinions on its future

Aedin Culhane (14:43:13): > @Aedin Culhane has joined the channel

2019-08-15

Michael Stadler (02:25:56): > @Michael Stadler has joined the channel

Kevin Rue-Albrecht (08:29:43): > @Kevin Rue-Albrecht has joined the channel

Mike Smith (10:24:50): > Here’s a link to some slides for the upcoming call: https://docs.google.com/presentation/d/1kFjOb6Yg5PXvsYxbRN2_al_HYMOOM_qfWmHQgevOAHw/edit?usp=sharing

Qian Liu (10:34:36): > @Qian Liu has joined the channel

Kevin Rue-Albrecht (11:21:13): > the meeting is in 40 min right? (last minute check that I’ve got the right time zone difference)

Kevin Rue-Albrecht (11:27:05): > I haven’t prepared anything to present, but I’d be curious to hear > 1. whether anyone in the meeting is using iSEE in any way in their own work > 2. pros and cons from a user perspective (e.g., researchers, core facilities), > 3. whether anything could be done to broaden its adoption, make it easier to use/deploy > 4. whether anyone prefers another interactive browser and if so for what reason (no hard feelings :stuck_out_tongue:)

Hervé Pagès (11:32:24): > @Hervé Pagès has joined the channel

Aaron Lun (11:32:28): > Uh. Should I have slides?

Aaron Lun (11:32:31): > I’ll make some slides.

Mike Smith (11:32:52): > Only if you want to

Aaron Lun (11:33:00): > gonna use one of these fancy templates that I never use.

Mike Smith (11:34:23): > I’m just putting a bullet point agenda together, but today may have the feeling of throwing stuff at the wall and seeing what sticks

Mike Smith (11:39:56): > Agenda for August 15th 2019: > - Introduction > - Short presentation/discussion on biomaRt, recent issues & future plans - Me > - “State of play of various packages” - Aaron Lun > - Defining steps to clarify our position on HDF5/DelayedArray/ExperimentHub (see discussion in #bigdata-rep) > - Picking future topics & meeting times

Aaron Lun (11:52:03): > https://drive.google.com/open?id=1TJgEP9_wxGflz2vwSobIro3bFqvdxlEp

Hervé Pagès (11:52:22): > Won’t be able to stay long today. Need to drive my daughter to her summer camp at 9:10 am PCT.

Aaron Lun (11:57:00): > Hm. Bluejeans has been trying to connect for minutes.

Mike Smith (11:57:40): > It says you’re in there

Aaron Lun (11:58:39): > Hm. just got a big “waiting for network”

Aaron Lun (11:59:21): > Will see if it gets better somewhere else.

Daniela Cassol (12:06:56): > @Daniela Cassol has joined the channel

Leonardo Collado Torres (12:21:35): > I think that RStudio used https://github.com/r-lib/revdepcheck for the scenario Tiago is describing

Leonardo Collado Torres (12:23:33): > See http://r-pkgs.had.co.nz/release.html

Leonardo Collado Torres (12:23:40): - File (PNG): Capture.PNG

Kasper D. Hansen (12:24:03): > @Kasper D. Hansen has joined the channel

Leonardo Collado Torres (12:24:14): > looks like revdep_maintainers() is how they get the list of maintainer emails
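
To illustrate the workflow being discussed, here is a minimal sketch using r-lib’s revdepcheck; the function names follow its README at the time, so treat the exact API as an assumption rather than a guarantee:

```r
# Sketch: reverse-dependency checking with r-lib/revdepcheck.
# Function names follow the package README; verify against the current API.
# remotes::install_github("r-lib/revdepcheck")
library(revdepcheck)

# Check all reverse dependencies of the package in the current directory
revdep_check(num_workers = 4)

# Summarise the results and list maintainer emails for notification
revdep_report_summary()
revdep_maintainers()
```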

Tim Triche (12:26:43): > @Tim Triche has joined the channel

Dan Bunis (12:27:22): > @Dan Bunis has joined the channel

Leonardo Collado Torres (12:27:38): > (I just sent Tiago an email because I can’t find him on the Bioc Slack)

Tim Triche (12:27:39) (in thread): > re: using iSEE: yes

Tim Triche (12:28:18) (in thread): > re: broaden: is there documentation of how to hook up different types of data to iSEE without likely having it break when updated?

Tim Triche (12:28:24) (in thread): > e.g. MultiAssayExperiment stuff

Kevin Rue-Albrecht (12:28:37) (in thread): > If there is some interest, I could try to have something for a future meeting. Today was just a bit of a last minute thought

Tim Triche (12:29:11) (in thread): > yeah I just joined this channel and realized this is where all the planning/agenda was happening

Tim Triche (12:29:13) (in thread): > sorry

Tim Triche (12:30:17) (in thread): > cons: putting it on shinyapps.io seems to break shinyapps.io :smile:

Kevin Rue-Albrecht (12:30:59) (in thread): > Nothing to worry about :slightly_smiling_face: Happy to collect thoughts and feedback for future agendas. I think that’s what the channel is about

Kevin Rue-Albrecht (12:31:51) (in thread): > > cons: putting it on shinyapps.io seems to break shinyapps.io > That’s a known issue if your dataset doesn’t fit in the default shinyapps container size (1GB)

Kevin Rue-Albrecht (12:32:36) (in thread): > even the allen dataset doesn’t fit in there; I had to downsample genes and cells for the RStudio Shiny contest

JiefeiWang (12:32:54): > @JiefeiWang has joined the channel

Tim Triche (12:34:12) (in thread): > oh, it doesn’t seem to be an issue only with the default size

Tim Triche (12:34:21) (in thread): > it breaks the +$40/month size too

Tiago C. Silva (12:36:58): > @Tiago C. Silva has joined the channel

Kevin Rue-Albrecht (12:38:23) (in thread): > Hm.. Not sure about that. Still, always a good idea to check how much memory the app uses at runtime, e.g. https://community.rstudio.com/t/killed-error-during-installation-for-many-packages/24239/5?u=kevinrue - Attachment (RStudio Community): “Killed” error during installation for many packages > When I open that project and run gc() I see: > gc() used (Mb) gc trigger (Mb) max used (Mb) Ncells 5475504 292.5 8828510 471.5 8828510 471.5 Vcells 16484956 125.8 23798222 181.6 19711902 150.4 I am not certain how R mutates into the copy, but if it performs a full memory copy before making the mutation, then you could very easily run out of memory here.

Tim Triche (12:39:08) (in thread): > I am wondering if lazy data access might help avoid this blowing up

Kevin Rue-Albrecht (12:39:56) (in thread): > Yup. Are your SCE objects h5-backed or in memory?

Tim Triche (12:40:00): > anyone know if these slides will be available afterwards?

Tim Triche (12:40:05) (in thread): > yes:wink:

Leonardo Collado Torres (12:40:17): > Mike said that they are recording the broadcast

Tim Triche (12:40:18) (in thread): > thinking the former is preferable now

Kevin Rue-Albrecht (12:40:38) (in thread): > > yes :wink: > that’s not an answer to an “either or” question :stuck_out_tongue:

Tim Triche (12:40:47) (in thread): > some are h5, some in-core

Kevin Rue-Albrecht (12:40:56) (in thread): > ahh ok (my question wasn’t properly phrased as “either or”, fair answer then :grimacing:)

Tim Triche (12:41:17): > @Aaron Lunthat was terrific

Stephanie Hicks (12:41:21): - File (PNG): Screen Shot 2019-08-15 at 12.40.49 PM.png

Tim Triche (12:42:10): > we can help with cell annotation and multi-sample since that has begun to eat our weekly scRNA meetings

Tim Triche (12:45:12): > hmmm, CRAPseq actually forces that question, what happens when you look at several measurements in the same cell in the same well

Tim Triche (12:45:37): > the nucleus is physically separated from the cytoplasmic RNA and organelle DNA in the prep

Leonardo Collado Torres (12:49:36): > as far as I know, you can’t make a Slack channel public once it’s private @Stephanie Hicks https://twitter.com/slackhq/status/835541575144992769?lang=en, so you have to start the new public channel from scratch - Attachment (twitter): Attachment > @Ronef We don’t allow converting private channels to public channels for privacy reasons. We’re listening to feedback on this though. :bow:

Stephanie Hicks (12:50:06): > then we will start with a new channel:upside_down_face:

Martin Morgan (12:52:08): > osca is tantalizingly close to oscar…

Leonardo Collado Torres (12:57:35): > or a google doc for signing up?@Mike Smith

Leonardo Collado Torres (12:58:01): > unless you want to curate the list of who talks

Tim Triche (12:58:12): > google sheets?

Tim Triche (12:58:19): > (month, slot1, slot2)

Tim Triche (12:58:49): > like the REMC data

Kevin Rue-Albrecht (12:59:29): > > or a google doc for signing up?@Mike Smith > Yes. I think we should have at least a place to store ideas of topics. > That said, I think individual speakers might only know closer to the date whether they can present. We should probably expect some flexibility to pick topics in the week leading up to the meeting

Leonardo Collado Torres (13:03:17) (in thread): > I needed to click on “See all 27 members” of the channel in order to see Tiago. Sorry@Tiago C. Silva!

Tiago C. Silva (13:06:13) (in thread): > no worries!

Kasper D. Hansen (13:23:27): > The alt exp sounds pretty nice and I very strongly hope that this gets merged into SummarizedExperiment

Kasper D. Hansen (13:23:47): > It is not an MAE because you have a 1:1 sample mapping

Kasper D. Hansen (13:24:45): > I have wanted this in methylation for a long time. Use case: on the methylation arrays we have normal CpGs, we have control probes and we have SNP probes. 3 different classes, and I would love to have them separated as objects for all the reasons @Aaron Lun outlined

Kasper D. Hansen (13:25:23): > I can understand potential issues with the current implementation (which can change of course), but I would LOVE to see this in SummarizedExperiment

Kasper D. Hansen (13:26:11): > One thing to think about is size. In my use case the data is “small”, so in-memory is not an issue

Tim Triche (13:32:41): > well, how about CpH vs. CpG vs. SNPs for WGBS

Tim Triche (13:32:45): > now it’s not small anymore

Alejandro Reyes (13:32:59): > @Alejandro Reyes has joined the channel

Kasper D. Hansen (13:34:21): > @Tim Triche true, but not sure that is what we intend for this representation. Perhaps we do. I don’t know, I just know I WANT THIS

Tim Triche (13:35:49): > yeah I’m about to use it for a project of my own where we are ALSO using singleR

Tim Triche (13:36:09): > it feels like we’re changing the tires while going 100mph but that’s the fun part

Tim Triche (14:13:15): > ps. @Stephanie Hicks did you create the :oscar: channel?

Tim Triche (14:13:27): > Orchestrating Single Cell Analysis in R :wink:

Tim Triche (14:13:34): > per@Martin Morgan

Stephanie Hicks (14:15:28): > #osca-book

Tim Triche (14:16:17): > :grouch:

Stephanie Hicks (14:22:18): > @Tim Triche ooooh I just caught what you and @Martin Morgan suggested: oscar!

Tim Triche (14:34:32): > bonus: there doesn’t seem to be a CRAN or BioC package called “oscar” yet

Tim Triche (14:34:47): > > R> install("oscar") > Bioconductor version 3.10 (BiocManager 1.30.4), R 3.6.1 (2019-07-05) > Installing package(s) 'oscar' > Update old packages: 'BSgenome', 'csaw', 'edgeR', 'ExperimentHub', 'ggpubr', > 'lpSolveAPI', 'modelr', 'quantreg', 'Rhtslib', 'scater', 'scran', > 'shinystan', 'tinytex', 'tximport' > Update all/some/none? [a/s/n]: n > Warning message: > package 'oscar' is not available (for R version 3.6.1) >

Martin Morgan (14:37:25): > And ‘o’ is an under-represented initial letter > > > nms = rownames(available.packages(repos=BiocManager::repositories())) > > table(tolower(substr(nms, 1, 1))) > > a b c d e f g h i j k l m n o p > 742 993 1361 871 670 689 1016 581 583 125 197 588 1613 467 358 1473 > q r s t u v w x y z > 189 1829 1874 812 136 223 241 95 52 58 >

Tim Triche (14:38:19): > who wants to do the honors:wink:

Mike Smith (16:28:30) (in thread): > Is that statistically rigorous?

Boris Hejblum (18:51:15): > @Boris Hejblum has joined the channel

2019-08-16

Federico Marini (09:37:36): > @Federico Marini has joined the channel

Kayla Interdonato (14:52:38): > Recording of yesterday’s meeting is now available on the Bioconductor YouTube channel (https://www.youtube.com/watch?v=CcJgcDy2qEI&t=53s) - Attachment (YouTube): Developers Forum 01

Aaron Lun (16:02:12): > Hm. Don’t recognize the second presenter.

Aaron Lun (16:02:36): > What a weird accent.

Aedin Culhane (17:20:10): > Thanks for posting. I was looking at SingleR… what channel is a good place to ask a question about it?

Aaron Lun (17:20:21): > #sc-signature

Dan Bunis (17:25:12): > ^^^ We’ve been discussing there while working on it, so there is great!

Aaron Lun (17:26:07): > Woah, dan and aedin have the same identicon.

Aedin Culhane (18:15:33): > :wink: Great taste @Dan Bunis… Or maybe it’s the Irish flag colors for us both :flag-ie:

2019-08-18

Luke Zappia (21:26:27): > @Luke Zappia has joined the channel

2019-08-28

Davide Risso (05:42:49): > @Davide Risso has joined the channel

2019-08-29

FelixErnst (05:37:13): > @FelixErnst has joined the channel

2019-09-06

Martin Morgan (12:18:28): > @Mike Smith one suggestion for next meeting (next week? I don’t think this has been formally scheduled?) is some discussion of ‘best practices’, e.g., related to object serialization in packages (and the Experiment/AnnotationHub…)

2019-09-11

Mike Smith (12:00:55): > The next Bioconductor Developers’ Forum is scheduled for Thursday 19th September at 09:00 PDT / 12:00 EDT / 18:00 CEST > We will be using BlueJeans and the meeting can be joined via: https://bluejeans.com/645510122?src=join_info (Meeting ID: 645 510 122)

Mike Smith (12:15:38) (in thread): > Sounds like an excellent topic, would you like to lead/introduce what we currently recommend? I seem to remember reading some recent discussion, but I’m struggling to find it either in Slack or on the devel mailing list

Martin Morgan (12:19:11) (in thread): > sure I’ll maybe make a couple of slides to provide orientation, hoping for discussion & input from others…

Mike Smith (12:27:51) (in thread): > Sounds perfect, thanks

Aedin Culhane (18:59:32): > Mike, is this being announced outside of Slack too? On the bioc-devel email list, on Twitter, Stack Overflow, etc.

Aaron Lun (19:02:48): > BioC-devel got an email.

Stephanie Hicks (20:05:53): > maybe announce on twitter too?

Stephanie Hicks (20:06:32): > unfortunately I won’t be able to make this one though.

2019-09-12

Lluís Revilla (05:13:53): > @Lluís Revilla has joined the channel

Lluís Revilla (05:15:22): > set the channel topic: Platform for BioC developers to describe existing software infrastructure to other members of the BioC community, to present plans for future developments, and discuss changes that may impact other developers/software tools within the BioC.

Aaron Lun (20:57:58): > What’s the agenda?

Hervé Pagès (21:29:00): > I’ve been offered a 20 min. slot to explain how I plan to break hundreds of packages 2 days before the next release by making last minute changes to DataFrame/DFrame:sweat_smile:

2019-09-13

Laurent Gatto (04:25:11) (in thread): > Best topic ever!

Mike Smith (08:39:50): > We’ll also have some discussion on serialisation of objects, hopefully highlighting some pitfalls, best practices etc

2019-09-19

Mike Smith (07:06:35): > Just a reminder that our next meeting is today at 09:00 PDT / 12:00 EDT / 18:00 CEST > > We have two principal topics on the agenda: > - Introduction to DataFrames and the impact of recent changes - @Hervé Pagès > - Pitfalls and best practices for serialising Bioconductor objects - @Lori Shepherd > Those titles are my ‘best guesses’ & shouldn’t be considered legally binding :male-judge: I’m looking forward to some lively discussion on both topics.

Nick Eagles (10:37:21): > @Nick Eagles has joined the channel

Aaron Lun (11:28:55): > will be late

Sehyun Oh (11:34:07): > @Sehyun Oh has joined the channel

Leonardo Collado Torres (12:17:22): > Is data from ExperimentHub / AnnotationHub accessible outside R? If you distribute some type of text file on those resources would users from other languages be able to access it?

Leonardo Collado Torres (12:19:21): > I was going to say: I guess that ExperimentHub / AnnotationHub could have a data.frame with URLs to the text files (hosted somewhere else) such that R users could access them, but also users from other languages > > But then I remembered that ExperimentHub / AnnotationHub stores the data on AWS, right? So maybe there’s a way to provide those AWS URLs to the text/flat files from other languages

Leonardo Collado Torres (12:20:13) (in thread): > From Martin Morgan on the BlueJeans chat: yep, though the interface isn’t well developed, e.g., https://annotationhub.bioconductor.org

Leonardo Collado Torres (12:20:25) (in thread): > thanks!

Kasper D. Hansen (12:26:54): > My understanding is that ExperimentHub now allows a resource to be stored at an arbitrary URL (i.e. outside of S3). It used to be S3-only.

Kasper D. Hansen (12:28:00): > This may require some more development; for example, we could have md5 hashes of external resources so we can check whether they have changed since the resource was put into ExperimentHub, but developing this into a solid feature requires some thinking

Lori Shepherd (12:54:45): > The download should still use the API fetch call that constructs the file path from the given location_prefix and rdatapath - so I think it (https://annotationhub.bioconductor.org and https://experimenthub.bioconductor.org) should work regardless of where the data is hosted
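
A hedged sketch of what access from outside R might look like: the Hubs expose a web service at the hosts above, and a resource can be pulled by constructing a fetch URL from its id. The "/fetch/<id>" endpoint and the example id are assumptions based on the discussion here; the web interface isn’t formally documented, so verify before relying on it.

```r
# Sketch: fetching a Hub resource without the AnnotationHub package.
# The "/fetch/<id>" endpoint and the example id are assumptions; the web
# interface is not formally documented, so verify before relying on it.
hub_fetch_url <- function(id, host = "https://annotationhub.bioconductor.org") {
  paste0(host, "/fetch/", id)
}

destfile <- tempfile()
download.file(hub_fetch_url("5012"), destfile, mode = "wb")  # hypothetical id
```

The same URL scheme would work from any language with an HTTP client, which is the point raised above.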

Hervé Pagès (14:19:27): > @Mike Smith Were you suggesting that objects that point to a local EH or AH cached file (e.g. HDF5Array) start using some kind of path that abstracts the exact location on disk (e.g. something like "HubId:EH1039") instead of the path to the local cached file? That way, if someone ships the serialized object in their package, the object won’t break on the user’s machine. (Every time the object needs to access the on-disk data, the path is expanded to the local cached file.) This would prevent this kind of problem: https://stat.ethz.ch/pipermail/bioc-devel/2019-September/015537.html I’ll implement something like this for HDF5Array and TENxMatrix objects.

Hervé Pagès (14:42:17): > Added to the TODO list: https://github.com/Bioconductor/HDF5Array/issues/20

Dan Bunis (15:08:03): > Is there a recording of today’s forum?

Kayla Interdonato (15:20:02): > Yes, it will be on the Bioconductor YouTube soon. I’ll also post the link here when it’s available.

Mike Smith (15:48:10) (in thread): > Yes, that type of abstraction is pretty much what I was thinking. The ID is checked for in the local cache first, and if it’s not available it can be grabbed from a remote resource. > > I’m not sure how it should respond in the example given on the mailing list, since there the saved version is intended to be a subset of the data. If the operations aren’t stored, then loading the object from file will presumably restore the original, complete data, which presumably isn’t what would be expected.

Kayla Interdonato (16:09:44): > Here is the link to today’s developer forum recording: https://www.youtube.com/watch?v=NXztWuJSItk&t=46s - Attachment (YouTube): Developers Forum 02

Mike Smith (16:49:57) (in thread): > Actually, am I being dumb? Are the subsetting operations etc. stored somewhere? Otherwise, how could I create different subsets in different objects that all link back to the same file on disk?

Hervé Pagès (17:20:47) (in thread): > Yep, the subsetting and other delayed operations are stored in the DelayedArray object so they also get serialized.
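
This behaviour can be seen directly: subsetting an HDF5-backed DelayedArray records a delayed operation on top of the seed, and both the operation stack and the file path travel with the serialized object. A small sketch (HDF5Array API as I understand it at this point):

```r
# Sketch: delayed operations are part of the object, so they get serialized.
library(HDF5Array)  # also loads DelayedArray

h5file <- tempfile(fileext = ".h5")
writeHDF5Array(matrix(runif(1e4), nrow = 100), h5file, name = "mat")

A <- HDF5Array(h5file, "mat")
B <- A[1:10, 1:5]        # delayed subset; no data is read yet
showtree(B)              # displays the stack of delayed operations
saveRDS(B, tempfile())   # the operations (and the path to h5file) are saved
```

This is also why shipping such a serialized object breaks on another machine: the stored path to h5file is machine-specific, which is the problem the Hub-ID abstraction above is meant to solve.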

Peter Hickey (19:42:43): > re: use of updateObject() in pkgs. Yep, I’ve used them in bsseq and minfi (probably incompletely and imperfectly)
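
For readers unfamiliar with the mechanism, here is a minimal sketch of an updateObject() method for a hypothetical class; the class and slot names are illustrative, not taken from bsseq or minfi:

```r
# Sketch: updateObject() for a hypothetical class "MyAssay", so objects
# serialized under an old class definition can be repaired on reload.
library(BiocGenerics)  # provides the updateObject() generic

setClass("MyAssay", representation(counts = "matrix", version = "character"))

setMethod("updateObject", "MyAssay", function(object, ..., verbose = FALSE) {
  if (verbose) message("updateObject(object = 'MyAssay')")
  if (!.hasSlot(object, "version") || length(object@version) == 0L)
    object@version <- "1.0.0"  # populate a slot added after serialization
  validObject(object)
  object
})
```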

2019-09-20

Aaron Lun (00:57:53): > Man, I look like such a nerd.

Aaron Lun (00:59:31): > Next time I’m turning my video off.

Aaron Lun (01:01:00): > Y’know, I think I sound like Sean Connery when I speak, but this playback is really disappointing.

Almut (02:16:23): > @Almut has joined the channel

Peter Hickey (09:38:08): > re getting the ‘right’ python installation for use with R: https://github.com/hafen/rminiconda perhaps useful?

Aaron Lun (17:48:28): > If that works, it’s the bomb

2019-09-22

Stephanie Hicks (21:56:04): > omg thanks@Peter Hickey!

2019-10-07

Kelly Street (10:23:40): > @Kelly Street has joined the channel

Lori Shepherd (12:53:13): > <!channel> Hello. Not sure if I missed it, but noticed I don’t have anything in the calendar - when is the next meeting and is there an agenda set yet?

2019-10-08

Mike Smith (09:46:50): > No confirmed date or agenda as of yet. I was working towards October 17th (I think we’re on a third Thursday of the month pattern, n = 2).

Mike Smith (09:48:59): > I think I’ve promised a list of suggested topics at some point, so here’s the list from the original proposal, a few of which have already been covered. These are pretty much based on my perspective, but if anyone else has similar suggestions or wants to take on one of these and run with it, that would be awesome: General Topics > ● Best practice for communicating changes to other developers & users, e.g. a BioC ‘technical area’ rather than vignettes? Additional contributions to https://www.bioconductor.org/developers/? > ● Is the BioC-devel mailing list sufficient, or would something more advanced/modern be beneficial? > ● Supporting Bioconductor software hosted in external software repositories, e.g. Bioconda. > ● When to use generics, e.g. https://stat.ethz.ch/pipermail/bioc-devel/2019-May/015035.html > ● Maintenance & improvements of the core infrastructure (Ranges, SummarizedExperiment, Hubs), incl. documentation, naming, what goes where. > ● Use of startup messages in packages - proliferation, suppression, guidelines for developers… > ● Reasons behind staged installation in R 3.6 and potential impact. > ● Static vs dynamic linking for ‘external’ software libraries distributed as R packages. > ● Managing large data resources and complicated vignettes - what’s wrong with GitHub? > ● Pain points in package submission and maintenance. > ● Playing better with Python. Specific Topics > ● Serialised representation of SummarizedExperiment objects. > ● HDF5 reading strategies and benchmarks. > ● Implementing alternative DelayedArray backends. > ● Highlighting use cases and/or performance of beachmat. > ● Should time be spent making biomaRt more ‘Ensembl-centric’, or are alternative annotation packages or more modern APIs better places to invest time? > ● Authoring workflows via BioC and F1000. How smooth is this process? What are the thoughts of the authors of recent workflows? > ● Reasons why Oxford Nanopore sequencing is (maybe) under-represented

2019-10-10

Mike Smith (10:57:52): > The next Bioconductor Developers’ Forum is scheduled for Thursday 17th October at 09:00 PDT / 12:00 EDT / 18:00 CEST > > We will be using BlueJeans and the meeting can be joined via: https://bluejeans.com/528142528 (Meeting ID: 528 142 528)

Mike Smith (11:07:29): > I thought we would take a break from the PowerPoint presentation style and instead listen to / share experiences of working with some of the core BioC packages. @Stephanie Hicks is going to share her experience submitting data to ExperimentHub, which will surely be an interesting tale given that about half the BioC core team have contributed advice to the GitHub submission! So if you have opinions on whether SummarizedExperiment is too complex, the best way to add data to ExperimentHub, the optimal parameters for disk-backed datasets, or anything similar, come prepared to share them. > > I also thought, with the upcoming release of BioC 3.10, it might be interesting to discuss how we as developers prepare for the release date & update our systems afterwards. In the run-up to release, do you cease development :double_vertical_bar:, create a new branch :branch:, cram as many bug fixes as possible into the last week :hourglass_flowing_sand:, or go on holiday until it’s all over :beach_with_umbrella:? Afterwards, do you immediately spin up a new docker image :whale:, format your hard disk :fire:, or keep using the old version until someone emails you a bug :bug: & you can’t reproduce it? I’d love to know.

Constantin Ahlmann-Eltze (11:07:42): > @Constantin Ahlmann-Eltze has joined the channel

Federico Marini (11:11:41): > upvote for the tips and tricks pre-during-post release

Jacob Morrison (11:23:39): > @Jacob Morrison has joined the channel

2019-10-17

Mike Smith (05:01:34): > Just as a reminder, the next call is scheduled for today (Thursday 17th October) at 09:00 PDT / 12:00 EDT / 18:00 CEST > > We will be using BlueJeans and the meeting can be joined via: https://bluejeans.com/528142528 (Meeting ID: 528 142 528)

Mike Smith (05:03:07): > I’m on holiday with a temperamental internet connection, so if I am not present for the start of the call please begin without me. The two suggested topics are listed above.

Stephanie Hicks (11:41:35): > my slides for today’s call: https://docs.google.com/presentation/d/1WZxLC-4iYt0xR1p__HHbGpoyZXW4kFRAkcIj234PAE8/edit?usp=sharing - File (Google Slides): 2019-10-16-bioc-devel-forum-hicks

Tim Triche (12:00:06): > pushes thumbsup like a wild monkey

Tim Triche (12:10:27): > is BlueJeans giving anyone else problems? Mine spins and spins without connecting:disappointed:

Lori Shepherd (12:10:42): > no - we are connected

Tim Triche (12:19:45): > OK I figured out a kludge to make it go

Marcus Kinsella (12:51:07): > @Marcus Kinsella has joined the channel

2019-10-21

Kayla Interdonato (13:11:35): > Here’s the link to the developers forum from last week: https://www.youtube.com/watch?v=1iwhQuuHIK0. - Attachment (YouTube): Developers Forum 03

2019-10-23

Vince Carey (10:18:56): > @Vince Carey has joined the channel

Vince Carey (12:01:05): > Is there an agenda document from the past forum? Additionally, were any action items derived from Stephanie’s presentation?

2019-11-04

Izaskun Mallona (07:57:33): > @Izaskun Mallona has joined the channel

2019-11-14

Lluís Revilla (03:44:18): > In light of the recent discussions in #general, I think it would be interesting to have a developers forum about how to maintain a package in the long term, and how and when the Bioconductor team intervenes in maintaining packages

2019-11-15

Liz Ing-Simmons (12:11:39): > @Liz Ing-Simmons has joined the channel

2019-11-18

Mike Smith (04:19:11): > The next Bioconductor Developers’ Forum is scheduled for Thursday 21st November at 09:00 PST / 12:00 EST / 18:00 CET > We will be using BlueJeans and the meeting can be joined via: https://bluejeans.com/114067881 (Meeting ID: 114 067 881)

2019-11-21

Mike Smith (03:36:52): > Updated agenda for today: > > • User and Developer session at EuroBioc2019 - Laurent Gatto > • Update on issues and need-to-know items post BioC 3.10 release - Lori Shepherd & Hervé Pagès > • Improving findability of BioC packages - Steffen Neumann

Steffen Neumann (11:13:26): > @Steffen Neumann has joined the channel

Steffen Neumann (11:18:31): > Improving findability of BioC packages: notes here: https://docs.google.com/document/d/1RQOmczzhkW60gLpiKuyvZhJVCNziqgbVrYKn8BhFT3M/edit#

Hervé Pagès (11:30:41): > finDability

Mike Smith (11:51:46): > Thanks for posting the link, Steffen (and that was definitely me who introduced a typo in the subject!)

Nitesh Turaga (13:01:02): > @Nitesh Turaga has joined the channel

Steffen Neumann (13:01:46): > we’ll have mulled wine and a campfire in front of the building on 19.12, so I might not make it

Mike Smith (13:04:14): > Thanks very much @Steffen Neumann @Lori Shepherd @Hervé Pagès & @Laurent Gatto for the contributions today.

Lori Shepherd (13:05:26): > Links to slides from today: https://docs.google.com/presentation/d/12LYotJPibq84D-iHE46egRnW_JW1uNAzdWD0W8BKHGc/edit?usp=sharing

Mike Smith (13:05:50): > @Lori Shepherd I have some additional R 4.0-related problems that I had to fix in the lpsymphony package, relating to options that had been removed from R CMD config. Happy to add them to your list of gotchas.

Lori Shepherd (13:06:04): > Yes please!

Kayla Interdonato (15:12:13): > Here’s the link to the recording of today’s developer forum: https://www.youtube.com/watch?v=wXswaDt_ax4&t=24s - Attachment (YouTube): Developers Forum 04

Steffen Neumann (16:44:10): > Someone asked where the ~180 materials on BioC in TESS come from: https://tess.elixir-europe.org/materials?tools=Bioconductor If you check the “Activity” tab of https://tess.elixir-europe.org/materials/lab-9-1-efficient-and-parallel-evaluation#home you see that some scraper pulled HTML files off the bioconductor.org/help/course-materials/ section

2019-11-22

Mike Smith (02:38:21): > Did you actively add your notes from yesterday’s call? That’s a pretty swift update to the Elixir site if not. I wasn’t even aware we were making the slides available on the BioC site.

Steffen Neumann (02:41:11): > Hi @Mike Smith, I didn’t do anything there; apparently BioC is an important enough project that they send their scraper across the BioC website. Unfortunately that scraper is a bit buggy, and 2 out of 2 URLs pointing to the BioC website were broken. tess@elixir has been informed :slightly_smiling_face:

Martin Morgan (06:25:50) (in thread): > I noticed the link to material on this page https://tess.elixir-europe.org/materials/lab-9-1-efficient-and-parallel-evaluation#home hasn’t respected the use of relative URLs (href = /foo translates to https://bioconductor.org/foo, not https://bioconductor.org/help/course-materials/foo), and that there is a link to both .Rmd and .html files for the corresponding entry for https://tess.elixir-europe.org/materials/cancer-immuno-oncology-bioconductor-and-beyond, but only the .Rmd (appearing first in the courses page link, but less useful…) appears. If whomever at elixir is responsible wants to get in touch we can review the material more systematically and/or establish standard operating procedures to make these more correct…

Hervé Pagès (12:54:37) (in thread): > Kayla makes the material from the developer forum available on the website: https://bioconductor.org/help/course-materials/ @Kayla Interdonato BioC/R version should be 3.11/4.0 for Lori’s talk. Thx!

Kayla Interdonato (12:55:24) (in thread): > @Hervé Pagès Thanks for catching that, I’ll make the change!

2019-12-16

Mike Smith (09:54:29): > The next Bioconductor Developers’ Forum is scheduled for Thursday 19th December at 09:00 PST / 12:00 EST / 18:00 CET > We will be using BlueJeans and the meeting can be joined via: https://bluejeans.com/114067881 (Meeting ID: 114 067 881)

Mike Smith (09:55:14): > Sorry for the short notice, we’re further into December than I realised!

Lukas Weber (10:49:52): > @Lukas Weber has joined the channel

Martin Morgan (15:15:29): > one possible topic from the Bioc dev team & especially @Nitesh Turaga is using the bioconductor_full docker image to reproduce (Linux) build system errors; the devel version of the docker image is configured with environment variables etc. very close to the production build machine.

Kasper D. Hansen (15:37:31): > one way to have this discussion is to put together a rough howto, let people try it and then have a roundtable on what could be improved/people’s experience with it?

Mike Smith (16:51:56): > I might not have given people enough notice for that, but I'd certainly support it if @Nitesh Turaga has time to put a howto together & people can give it a whirl.

2019-12-17

Jean Yang (02:41:09): > @Jean Yang has joined the channel

Mireia Ramos-Rodríguez (06:10:14): > @Mireia Ramos-Rodríguez has joined the channel

2019-12-18

Martin Morgan (13:50:17): > looks like the plan to use bioconductor_full to replicate build system errors is a little too ambitious for tomorrow (but quite productive for thinking about how this could be done) so I’ll withdraw volunteering nitesh… sorry about that

Mike Smith (14:59:02): > I think I'd still be keen to at least discuss how it might work. I've been trying to use bioconductor_full to build the OSCA book and learning quite a bit about the order in which .Rprofile and .Renviron are loaded etc; definitely had me puzzled for a while

Mike Smith (15:12:43): > Would anyone be interested in giving a brief overview of what Zarr / ZarrExperiment (https://github.com/Bioconductor/ZarrExperiment) is? I'm CC'd on a pull request, but that was the first time I'd heard of Zarr

Hervé Pagès (19:01:59): > Definitely interested!

Mike Smith (19:10:27): > In talking or listening?

Hervé Pagès (19:15:48): > Listening, sorry. Like you, I’ve no idea what Zarr is.

2019-12-19

Mike Smith (10:37:55): > Agenda for today: > * Update from EuroBioc2019 user/developer sessions > * Adding additional compression filters to rhdf5 - Me > * Maybe some discussion on docker containers… > * AOB > Some slides for HDF5 can be found at https://docs.google.com/presentation/d/1SjPB3yEenzFNWiwPLIFBBte6Et2v3WGwBLqVyq8JFZk/edit?usp=sharing

Domenick Braccia (12:42:54): > @Domenick Braccia has joined the channel

Hervé Pagès (13:05:47) (in thread): > Great presentations and interesting discussions. Thanks to the presenters!

Sean Davis (13:06:56): > @Sean Davis has joined the channel

Mike Smith (13:09:59) (in thread): > Yes, thanks to everyone who’s participated over the last five (?) months. I’ve really enjoyed the discussions.

Hervé Pagès (13:25:38) (in thread): > Ok I took the time to read a little bit about Zarr/ZarrExperiment. Sounds like it would be fun trying to come up with a Zarr backend for DelayedArray. Putting this on my list of fun side projects to work on when I need a break from baby-sitting the builds.

Mike Smith (14:57:50) (in thread): > Cool, I might come calling for a 15 minute overview at some point!

Shubham Gupta (20:38:47): > @Shubham Gupta has joined the channel

2019-12-25

Wendy Wong (12:03:28): > @Wendy Wong has joined the channel

2020-01-01

Peter Hickey (20:01:34): > anyone aware of efforts to turn https://github.com/KlugerLab/FIt-SNE into a proper R package?

Aaron Lun (20:06:19): > No, but if you’re going to do it, it would make life much easier down the stack if you make it a PR into Rtsne.

Peter Hickey (21:29:11): > unlikely to have time myself

2020-01-05

Steffen Neumann (09:20:43): > Happy new year! For some time now BioC has had DOIs, e.g. 10.18129/B9.bioc.Risa pointing to the package page. How are these registered / resolved? I am asking since https://www.doi2bib.org/bib/10.18129/B9.bioc.Risa only lists the first two authors, and I'd like to figure out where the rest are. (Did doi2bib fetch only two authors? Did BioC only push 2 authors to doi.org?) Any pointers? Also, the year in the bibtex says 2017 (and does so for all packages).

Steffen Neumann (10:28:12) (in thread): > Ah, just found https://support.bioconductor.org/p/101831/ saying that @Sean Davis came up with the code. Sean, any way I can figure out the author issue?

Sean Davis (10:41:54) (in thread): > I suspect that the DOI was registered with only two authors. There is not an automated process to update the DOI once registered as far as I know. We should revisit the issue, I guess and then publicize the process. Just out of curiosity, did additional authors get added to Risa?

Steffen Neumann (17:44:12) (in thread): > No, the last commit to https://github.com/ISA-tools/Risa/blob/master/DESCRIPTION was in 2016

Sean Davis (18:44:58) (in thread): > I’ll look into it. Thanks for the report.

2020-01-08

Kayla Interdonato (11:34:29): > Happy new year everyone! The video from the December developer forum is now available on YouTube (https://www.youtube.com/watch?v=TZXGJxzLzMM) and has been added to the course materials on the Bioconductor website. @Charlotte Soneson and @Aaron Lun, if you want to share a link to your slides from the EuroBioc2019 update portion I can add those to the course materials as well. - Attachment (YouTube): Developers Forum 05

Aaron Lun (11:41:27): > Here are my slides - File (PowerPoint Presentation): iSEE_v2.pptx

Sean Davis (12:20:21): > Thanks much for sharing.

Charlotte Soneson (15:36:56): > Mine are here: https://docs.google.com/presentation/d/1cUohUF8s-s0IjslPYyyxPWfH3QYHu45Nssm7PgkpIMw/edit?usp=sharing

2020-01-13

Tim Triche (16:23:43): > Zarr is interesting – it seems like a lot of people got fed up with HDF5 and wanted something simpler and easier to distribute across chunks

Tim Triche (16:23:56): > now I understand why HCA uses it

2020-01-20

Mike Smith (06:59:25): > The next Bioconductor Developers' Forum is scheduled for Thursday 23rd January at 09:00 PST / 12:00 EST / 18:00 CET > > We will be using BlueJeans and the meeting can be joined via: https://bluejeans.com/114067881 (Meeting ID: 114 067 881)

Mike Smith (07:01:16): > One topic that I would like to bring up is the new Windows toolchain being used by CRAN, where some packages that rely on pre-compiled DLLs are failing, presumably because the DLLs were only compiled with the old toolchain. I don't know if we have a strategy to address this yet.

Stephanie Hicks (19:36:14): > @Mike Smith I won't be able to make it, but I'm interested in this topic. Looking forward to watching the video afterwards

2020-01-23

Mike Smith (11:52:39): > Here are a few slides to remind me what I want to talk about later: https://docs.google.com/presentation/d/1HcmpATBxIqM3Uvw2YCpTlPIybK__MUEEqt9xybnnQsQ/edit?usp=sharing

2020-01-27

Kayla Interdonato (10:55:49): > The January developer forum video and slides have now been added to the Bioconductor course materials: https://www.bioconductor.org/help/course-materials/

2020-02-03

Hervé Pagès (04:54:23): > As discussed during our last forum, we've started to build a small subset of Bioconductor packages with Rtools 4.0 + R-testing: https://bioconductor.org/checkResults/3.11/bioc-testing-LATEST/

2020-02-08

Sara Ballouz (05:51:53): > @Sara Ballouz has joined the channel

Thanh Le Viet (16:39:27): > @Thanh Le Viet has joined the channel

2020-02-17

Mike Smith (09:51:15): > The next Bioconductor Developers' Forum is scheduled for Thursday 20th February at 09:00 PST / 12:00 EST / 18:00 CET > > We will be using BlueJeans and the meeting can be joined via: https://bluejeans.com/114067881 (Meeting ID: 114 067 881) > > Please let me know either here or via direct message if you have any topics you'd like to raise.

2020-02-18

Stephany Orjuela (03:03:46): > @Stephany Orjuela has joined the channel

Robert Castelo (17:05:44): > @Robert Castelo has joined the channel

2020-02-20

Mike Smith (04:21:41): > Just a reminder that the next Developers' Forum is today at 09:00 PST / 12:00 EST / 18:00 CET. We have two scheduled topics for today: > * @Nitesh Turaga will be talking about developments, potential use cases, and future directions for the Bioconductor docker containers > * @Robert Castelo is presenting the surprisingly tricky problem of determining the exact dependency path that leads to unexpected packages being loaded by your code

Robert Castelo (11:58:29): > hi, here is the link to my slides: https://docs.google.com/presentation/d/1QUhkGQ_5ELfq0DCEbix5S4TyXLy1ZRZ9dyYCo3FL9Ek/edit?usp=sharing

Tim Triche (12:23:15): > these are terrific, thank you for posting the slides!

Nitesh Turaga (12:39:26): > My slides, and attached Q&A at the end. I might have missed the longer discussions. - File (PDF): Feb-2020-developer-forum.pdf

Federico Marini (15:05:41): > Cool in-depth investigation, Robert!

Federico Marini (15:06:07): > itdepends is also simply great

Ludwig Geistlinger (15:57:47): > @Ludwig Geistlinger has joined the channel

2020-02-21

Aaron Lun (00:53:38): > Speaking of dependency killing, I am going to split scater into two packages. The offspring scuttle will be GGPLOT-FREE, which should slice a few dependencies off scater's previous clients. End-users should be unaffected; scater will just re-export whatever it used to have from scuttle. Or maybe just Depends on it.

Aaron Lun (00:56:19): > oh. It’ll be beautiful. I think I can get rid of almost 20 dependencies.

Charlotte Soneson (08:40:56): > Inspired by @Robert Castelo's talk yesterday (and building on some of his functions) we wrote a couple of additional, complementary functions to assess the impact of removing a set of direct dependencies on the total set of dependencies of a package. We put them here for now: https://gist.github.com/csoneson/a468e4257af429edcde8837488a9956b

Kevin Rue-Albrecht (09:13:31): > Neat!

Robert Castelo (10:12:52): > @Aaron Lun let's analyse your case :):

Robert Castelo (10:16:09): > > pkgDepMetrics("scater", db) > ImportedBy Exported Usage DepOverlap > S4Vectors 4 268 1.492537 0.02631579 > DelayedArray 3 166 1.807229 0.18421053 > BiocNeighbors 2 48 4.166667 0.19736842 > SummarizedExperiment 4 81 4.938272 0.30263158 > BiocParallel 4 67 5.970149 0.09210526 > ggplot2 35 455 7.692308 0.48684211 > viridis 1 10 10.000000 0.51315789 > BiocSingular 3 28 10.714286 0.25000000 > DelayedMatrixStats 8 68 11.764706 0.23684211 > SingleCellExperiment 9 53 16.981132 0.31578947 > ggbeeswarm 1 4 25.000000 0.52631579 > BiocGenerics NA 139 NA 0.01315789 > Matrix NA 117 NA 0.02631579 > Rcpp NA 25 NA 0.01315789 > beachmat NA 0 NA 0.19736842 > > so not only ggplot2 but also viridis and ggbeeswarm take out 0.5 from the dependency graph of 'scater'. Using @Charlotte Soneson's code: > > checkRemoveDeps("scater", "ggplot2", db) > Number of nodes in original graph: 76 > Number of edges in original graph: 154 > Number of nodes in graph without ggplot2: 76 > Number of edges in graph without ggplot2: 153 > > it doesn't seem that by removing ggplot2 alone you would get the expected gain. There are in fact three dependency paths from scater to ggplot2: > > g <- pkgDepGraph("scater", db) > igraph::all_simple_paths(graph_from_graphnel(g), "scater", "ggplot2") > [[1]] > + 2/76 vertices, named, from bbd3224: > [1] scater ggplot2 > > [[2]] > + 3/76 vertices, named, from bbd3224: > [1] scater ggbeeswarm ggplot2 > > [[3]] > + 3/76 vertices, named, from bbd3224: > [1] scater viridis ggplot2 > > so you need to get rid of ggbeeswarm and viridis as well to remove the upstream dependencies on ggplot2.

Federico Marini (10:17:41): > I think Aaron meant anything gg-related in his cleansing action

Charlotte Soneson (10:17:59): > And if all of those are removed, the number of dependencies will be reduced not by 20, but by 39 :smile: > > > a <- checkRemoveDeps("scater", c("ggplot2", "viridis", "ggbeeswarm"), db) > Number of nodes in original graph: 76 > Number of edges in original graph: 154 > Number of nodes in graph without ggplot2, viridis, ggbeeswarm: 37 > Number of edges in graph without ggplot2, viridis, ggbeeswarm: 92 >

Federico Marini (10:18:21): > but it is nice to see it confirmed also with cold numbers

Martin Morgan (11:51:18): > I wanted to check a package that I'm developing, but it isn't in the db used to calculate dependencies. So I added a row with > > my <- read.dcf("pkgserver/DESCRIPTION") > db <- rbind(db, my[, match(colnames(db), colnames(my))]) > rownames(db)[nrow(db)] <- "pkgserver" > > I have 5 direct dependencies and 'only' 24 total dependencies so far, almost all from deciding to use dplyr. > > > pkgDepMetrics("pkgserver", db) > Loading required package: itdepends > ImportedBy Exported Usage DepOverlap > dplyr 7 261 2.681992 0.87500000 > tibble 2 38 5.263158 0.54166667 > digest 1 9 11.111111 0.04166667 > rappdirs 1 7 14.285714 0.04166667 > BiocManager 1 5 20.000000 0.04166667 > > It struck me from Robert's presentation yesterday and the scater example how little functionality is used from individual packages; e.g., in dplyr I only use > > > use = itdepends::dep_usage_pkg("pkgserver") > > use %>% filter(pkg %in% "dplyr") %>% count(fun, sort=TRUE) > # A tibble: 8 x 2 > fun n > <chr> <int> > 1 %>% 28 > 2 mutate 14 > 3 filter 10 > 4 left_join 3 > 5 select 3 > 6 distinct 2 > 7 pull 2 > 8 inner_join 1 > > It seems like this exercise can be as informative about development effort in the packages one depends on as it is about one's own package – are all of the other functions actually necessary / used?

Robert Castelo (12:23:30): > You could attempt to answer that question by calculating the fraction of the 261 functions exported by dplyr that are actually imported by some package. You could obtain the list of packages that import dplyr functionality and then, for each of those packages, fetch the function calls imported from dplyr. The problem is that dplyr is in fact imported by a large number of packages: > > revdeps <- tools::package_dependencies("dplyr", db, reverse=TRUE)[[1]] > length(revdeps) > [1] 1721 > > And itdepends::dep_usage_pkg() would need to have all those 1721 packages installed in your system to analyse their namespaces.

Martin Morgan (13:29:16): > I mentioned yesterday that I installed all Bioc software packages & their dependencies on my laptop, just for fun. > > options(Ncpus = parallel::detectCores() - 2) > db = available.packages(repos=BiocManager::repositories()[[1]]) > pkgs = rownames(db) > BiocManager::install(pkgs) > > I'm not sure exactly how successful that was (I didn't have some system dependencies). I added the library path of these installed packages to .libPaths(), and found the reverse dependencies amongst these – 208 CRAN or Bioconductor packages > > db = installed.packages(library) > installed = rownames(db) > revdeps = tools::package_dependencies(target, db, reverse = TRUE)[[1]] > > I then ran dep_usage_pkg() on each of these; it requires that the package namespace be loaded. It was surprisingly quick, and consumed less than 5 Gb of memory. So let's see… > > > filter(rds, pkg == "dplyr") %>% count(fun) %>% count(n, sort = TRUE) > # A tibble: 35 x 2 > n nn > <int> <int> > 1 1 71 > 2 2 20 > 3 3 14 > 4 4 8 > 5 6 6 > 6 5 5 > 7 9 5 > 8 13 5 > 9 7 3 > 10 8 3 > > says that 71 packages depended on dplyr for a single function, 20 depended on dplyr for 2, etc. Of course some packages were heavy hitters > > > filter(rds, pkg == "dplyr") %>% count(revdep, sort=TRUE) > # A tibble: 208 x 2 > revdep n > <chr> <int> > 1 dbplyr 74 > 2 DiagrammeR 38 > 3 TPP 32 > 4 lipidr 31 > 5 proBatch 25 > 6 radiant.data 25 > 7 sjmisc 25 > 8 tidygraph 25 > 9 highcharter 24 > 10 TPP2D 24 > # … with 198 more rows > > The functions being used (not weighted by use in the package, so the max n would be 206) is > > > filter(rds, pkg == "dplyr") %>% count(fun, sort = TRUE) > # A tibble: 177 x 2 > fun n > <chr> <int> > 1 select 118 > 2 mutate 111 > 3 filter 109 > 4 bind_rows 84 > 5 group_by 84 > 6 %>% 81 > 7 arrange 65 > 8 left_join 58 > 9 summarise 53 > 10 ungroup 46 > # … with 167 more rows > > so one definitely gets a sense of what is being used. 
Only 177 functions out of the 261 exported by dplyr get used in this collection of packages.
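For anyone wanting to reproduce the headline dependency counts without itdepends or the gist, here is a minimal base-R sketch; it assumes network access to fetch the package database, and "scater" is used purely as an illustrative package name:

```r
## Count a package's transitive strong dependencies with base R only.
## BiocManager::repositories() makes both CRAN and Bioconductor
## packages visible in the database.
db <- available.packages(repos = BiocManager::repositories())

deps <- tools::package_dependencies(
    "scater", db = db,
    which = c("Depends", "Imports", "LinkingTo"),
    recursive = TRUE          # follow dependencies of dependencies
)[["scater"]]

length(deps)  # total packages pulled in, directly or indirectly
```

Dropping `recursive = TRUE` gives only the direct dependencies, so comparing the two lengths shows how much of the footprint is inherited rather than declared.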

Kayla Interdonato (13:37:52): > Don’t want to interrupt the great discussions but I just wanted to let everyone know that the recording from yesterday’s developer forum is now available on the YouTube channel (https://www.youtube.com/watch?v=xsM4nN85cok) as well as the course materials on the website (https://www.bioconductor.org/help/course-materials/). - Attachment (YouTube): Developers Forum 07

Robert Castelo (13:50:38): > Thanks @Kayla Interdonato for posting the materials, and @Martin Morgan this looks like a good way to identify candidate functions for deprecation. Such identification would nicely add to the report you mentioned yesterday that we could build with some of the described dependency metrics, and become a useful tool to assist developers in maintaining their packages.

Federico Marini (19:04:50): > Could we make the content of this discussion somehow more permanent? Idea: Link the rendered version to one of the Developer resources on the main Bioc site?

Federico Marini (19:05:27): > Most developers are indeed actively/silently here - but this could indeed be of wider interest

2020-02-24

Lluís Revilla (04:25:17): > Yes, please make this discussion accessible to people outside slack

Lluís Revilla (04:25:49): > This discussion will also be of interest to dplyr developers

FelixErnst (05:10:15): > Maybe it would be worth a try to involve Jim Hester in such a way that itdepends ends up on CRAN or is otherwise recycled in a permanent place. Otherwise, it might not be worth a lot to "record" the discussion.

Mike Smith (09:49:52): > I've been taking a look at the 'missing' calls to Biostrings that @Robert Castelo mentioned in his talk e.g. > > ImportedBy Exported Usage DepOverlap > Biostrings            NA 240 NA     0.0750 > > It looks to me like there are a few things itdepends doesn't pick up. Specifically methods, e.g. importMethodsFrom(Biostrings, match), or built-in constants, e.g. Biostrings::DNA_BASES. I see similar things in biomaRt where progress::progress_bar$new isn't picked up. > > I guess some of those are probably edge cases, but if it really doesn't identify methods then some dependencies may be under-reported substantially; I would be happy for someone to find out I'm wrong there.

Robert Castelo (09:58:59): > On Friday I sent an email to Jim Hester with cc to Martin, asking specifically about itdepends making it to CRAN. So far I haven’t got an answer.

FelixErnst (11:20:36) (in thread): > The same is also the case if you need to use the subseq generic from XVector. You have to import it, but maybe never call it. For this I got an NA as well.

2020-02-25

Hena Ramay (12:49:29): > @Hena Ramay has joined the channel

2020-03-02

Shubham Gupta (13:03:24): > In my package, I want to include some standalone python code (it will not be needed by the package but I want to keep things together for the future). I have created a folder named python at the root of the package. In the .Rbuildignore file I have added ^python$. Should that be enough to avoid any error/warning in the build report?

Lorena Pantano (13:39:53): > @Lorena Pantano has joined the channel

Lorena Pantano (13:41:57): > Hi, I am having an issue with my packages and I am not sure what it is because I cannot reproduce it. One of my packages, http://bioconductor.org/checkResults/devel/bioc-LATEST/DEGreport/malbec2-buildsrc.html, is failing in a line like this: class(counts) %in% c("matrix", "data.frame") are not all TRUE. I tried to reproduce this, but after using the docker image bioconductor/devel_core2, installing and loading the package with devtools, and running my vignette, everything works. Can somebody help me on how to reproduce this error? Thanks!

Aaron Lun (13:42:43): > Probably because class(counts) is not of length 1.

Aaron Lun (13:43:03): > Think there’s an environment variable that got flipped at some point.

Lorena Pantano (13:43:21): > thanks, maybe I should use bioconductor/bioconductor_docker:devel

Lorena Pantano (13:43:37): > I just need to find the docker to install to reproduce this…

Marcel Ramos Pérez (13:43:58): > class() of a matrix can be c("matrix", "array")

Martin Morgan (14:03:18): > bioconductor/bioconductor_docker:devel is closer to the actual build system and would be a better start; some additional usage techniques are discussed at http://bioconductor.org/help/docker/#usage. The build system does not use a docker container and the build process is relatively unique, so it is not completely captured.

Hervé Pagès (15:26:44) (in thread): > The usual place for these standalone scripts is in inst/scripts. Note that with this approach the scripts get included and distributed with the package. This helps make things more transparent for the end-user, e.g. if the scripts were used to pre-process some data that you are including in the package.

Kasper D. Hansen (15:28:47): > @Lorena Pantano You need something like this > > _R_CHECK_LENGTH_1_LOGIC2_=true R CMD check PKG.tar.gz >

Kasper D. Hansen (15:28:58): > (i.e. setting an environment variable)

Shubham Gupta (15:29:47) (in thread): > Thanks. It does make sense. However, I want to keep the scripts to myself and don't want to include them in the package. In this case, if I include the folder in .Rbuildignore, would it still be moved to the Bioconductor distribution? I want to keep them in my GitHub repo but don't want them to be distributed.

Kasper D. Hansen (15:29:59): > It might be good to have some tips and tricks for this. For example, in S4, I would use something like > > is(counts, "data.frame") || is(counts, "matrix") > > but I'm not 100% sure it works with S3

Hervé Pagès (15:33:28): > I would just use is.matrix(counts) || is.data.frame(counts) here.

Hervé Pagès (15:36:47) (in thread): > Yes, in that case put them wherever you want and use .Rbuildignore to exclude the folder.

Lorena Pantano (15:37:44): > thanks all

Davide Risso (15:38:20): > Yeah, I had a similar issue in my EDASeq package, where I was using class(x) == "matrix" (which throws an error in bioc-devel) instead of the more appropriate is.matrix(x)

Federico Marini (15:43:35): > I recall having seen this as an entry on the R developer blog

Federico Marini (15:44:05): > https://developer.r-project.org/Blog/public/2019/11/09/when-you-think-class.-think-again/index.html

Federico Marini (15:44:38): > … plus, BiocCheck suggests avoiding class and switching to is or similar

Federico Marini (15:44:57): > (actually, the suggestion has been there for quite a bit longer than the recent post)

Lori Shepherd (15:46:18): > http://bioconductor.org/developers/how-to/troubleshoot-build-report/ also provides some background. I will be updating this document tonight/tomorrow because of the matrix-as-array changes in R-devel, and we implemented the logical checks where class == and != will cause errors now.

Lori Shepherd (15:46:35): > But there is information addressing this in there as well

Shubham Gupta (22:05:16): > My package got accepted on Bioconductor last month. I have made some changes and pushed them to my GitHub repo. I have also followed the steps to sync with Bioconductor: https://bioconductor.org/developers/how-to/git/sync-existing-repositories/ How do I track progress (build fail/success, version update on the website) after I push my changes to the Bioconductor repo?

2020-03-03

Lluís Revilla (06:02:30): > @Shubham Gupta You can create an RSS feed of the builds (https://www.bioconductor.org/developers/rss-feeds/); you can check it manually or programmatically with BiocPkgTools' biocBuildReport()

Martin Morgan (07:45:51): > More directly, follow the 'build reports' link from the home page of https://bioconductor.org to see reports of the nightly builds. Note the 'This page was generated on…' stamp at the top of the page to get a sense of when the last successful build occurred, and the 'snapshot date' to know when the last commit included in the report was made.
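As a sketch of the programmatic route mentioned above: the argument and column names below follow BiocPkgTools' documentation and may differ between versions, and "MyPackage" is a placeholder, not a real package:

```r
## Pull the nightly build report into a data.frame and filter it to one
## package, to check build/check results from a script rather than the
## website.
library(BiocPkgTools)

rpt <- biocBuildReport(version = "devel")
subset(rpt, pkg == "MyPackage",
       select = c("node", "stage", "result"))
```

Running this right after a push will not show the new version; as noted above, a commit only appears in the report after the next nightly snapshot.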

Shubham Gupta (09:04:54): > Thank you @Martin Morgan and @Lluís Revilla. Although I don't see my latest changes since I submitted them last night, I will hopefully get them in tonight's build

2020-03-04

Jialin Ma (15:01:43): > @Jialin Ma has joined the channel

2020-03-09

Tim Triche (09:15:01): > Hi all, > Is there a release schedule for 3.11? > Thanks,

Lori Shepherd (09:18:08): > I will be announcing it later today - I was working on the website page update before announcing it here and on the mailing list, but since you asked: it is tentatively April 28th (following R 4.0's tentative release on April 24th) -

Tim Triche (09:44:18): > Thanks@Lori Shepherd!

Shubham Gupta (14:24:48): > I am facing an issue with the Bioconductor / R build system. I use the command R CMD build and it shows different behavior for the function ksmooth from the stats package on Windows. On Linux, everything is fine. > My package name is DIAlignR. The tests related to the ksmooth and loess functions fail with i386 but succeed on x64: DIAlignR.Rcheck/tests_i386/testthat.Rout.fail fails but DIAlignR.Rcheck/tests_x64/testthat.Rout succeeds. > The web link is https://www.bioconductor.org/checkResults/3.11/bioc-LATEST/DIAlignR/tokay2-checksrc.html

Shubham Gupta (14:28:17): > I have printed the output numbers from the tests. Seems like the computation is different for these base-package functions in i386 vs x64? Why so?

Aaron Lun (14:28:35): > welcome to the club.

Shubham Gupta (14:28:41): > The link for the Linux build is https://www.bioconductor.org/checkResults/3.11/bioc-LATEST/DIAlignR/malbec2-checksrc.html

Shubham Gupta (14:29:17): > This one is successful, with DIAlignR.Rcheck/tests/testthat.Rout

Aaron Lun (14:29:27): > I can’t say that it’s the cause of your problem, but it is well known that precision of mathematical operations differs between i386 and x64.

Shubham Gupta (14:29:55): > Ohh. That sucks

Shubham Gupta (14:29:58): > :disappointed:

Aaron Lun (14:30:01): > This is really a question for#bioc-builds, though.

Shubham Gupta (14:30:15): > Will post there

2020-03-10

Lori Shepherd (09:37:52): > The Bioconductor Release 3.11 (tentative) schedule has been posted. Please see the following for important deadlines: http://bioconductor.org/developers/release-schedule/. We encourage all maintainers to be active in fixing any package ERRORs in release 3.10 and 3.11 ASAP. Cheers.

2020-03-11

Aaron Lun (19:35:27): > @Mike Smith We should write a requirements list for a book builder.

Aaron Lun (19:35:31): > Who else was on this? @Sean Davis?

2020-03-12

Robert Castelo (10:16:37): > hi, regarding the discussion on package dependencies and particularly my last message on Feb 24th about itdepends making it to CRAN, i exchanged an email with Jim Hester, who said this will not happen before the next BioC release. So, i've decided to implement the bits of functionality that enable the metrics we discussed at the last Developer's Forum by posting a pull request in BiocPkgTools. I've added documentation, including a new section in the vignette that illustrates that functionality. @Charlotte Soneson once this PR is accepted, maybe you could add the other bits you made to calculate how the dependency graph changes when you remove a dependency. i think a useful addition would be to add a column in the table of pkgDepMetrics() with the Jaccard index once that dependency is removed. - Attachment (YouTube): Developers Forum 07 - Attachment (Bioconductor): BiocPkgTools > Bioconductor has a rich ecosystem of metadata around packages, usage, and build status. This package is a simple collection of functions to access that metadata from R. The goal is to expose metadata for data mining and value-added functionality such as package searching, text mining, and analytics on packages.

Sean Davis (11:13:52): > Thanks,@Robert Castelo. Merged and pushed to devel.

2020-03-13

Aaron Lun (03:29:03): > @Sean Davis were you the other one who was interested in the proposed book builder?

Aaron Lun (03:29:25): > @Mike Smith I sent you a link to a gdoc. I'll add @Sean Davis if you're interested.

Mike Smith (03:36:44): > Cool, thanks. Not sure it's really a devel forum topic per se, but happy to discuss somewhere else. I've got a few ideas for this, and have been trialling some workflows for parallel building of chapters.

Aaron Lun (03:38:09): > Hm. Why did I think it came out of the dev forum?

Aaron Lun (03:38:12): > Oh

Aaron Lun (03:38:26): > I got the TAB meeting and the dev forum mixed up.

Aaron Lun (03:38:44): > Right. Well, we should move this discussion to#bioc-builds.

Mike Smith (03:39:21): > #TooManyMeetings ?

2020-03-15

Charlotte Soneson (11:31:46): > Thanks @Robert Castelo for adding your functions to BiocPkgTools! I just made a PR (https://github.com/seandavi/BiocPkgTools/pull/44) including the additional functionality from our side - it adds a column to the output of pkgDepMetrics, and also lets you find the dependency gain from excluding multiple dependencies at once.

2020-03-16

Mike Smith (08:03:48): > The next Bioconductor Developers' Forum is scheduled for Thursday 19th March at 09:00 PDT / 12:00 EDT / 17:00 CET (Note: this is 1 hour earlier in Europe, check here!) > > We will be using BlueJeans and the meeting can be joined via: https://bluejeans.com/114067881 (Meeting ID: 114 067 881)

Mike Smith (08:09:25): > We've been running these for a little while now, so I thought it would be good to gauge people's opinions on the types of topics we've been covering and the direction we should take things in the future. If you have any thoughts on what you'd like covered by the Developers' Forum please bring it to the meeting, write them here, or message me directly. Feel free to suggest anything of interest; don't feel like you need to match the sort of direction we've gone in previously.

Lori Shepherd (08:17:18): > Maybe another announcement in the general channel too – maybe get some new people that weren’t aware of this channel?

Henrik Bengtsson (09:08:39): > @Henrik Bengtsson has joined the channel

Lukas Weber (09:22:38): > just noticed the meeting time is 12:00 EST – we have started daylight savings time here now, so does this mean 1pm EDT for the US east coast? I think we are in this confusing time right now when some regions have started daylight savings but others have not

Sergi Sayols (13:17:13): > @Sergi Sayols has joined the channel

Mike Smith (13:46:18) (in thread): > I didn't realise this was ever a thing, thanks for pointing it out! I think there are more participants from the US, so let's keep the time the same there and 1 hour earlier for Europe. I'll update the post above. Reply with :angry: if you disagree with this or :+1: if it's fine.

2020-03-17

Joselyn Chávez (12:42:50): > @Joselyn Chávez has joined the channel

Henrik Bengtsson (20:57:39): > I'll try to join for the Bioc dev call (my first ever if so). I'd like to discuss/suggest some steps toward harmonizing R CMD check on Bioconductor with that on CRAN, e.g. using _R_CHECK_PACKAGES_USED_IN_TESTS_USE_SUBDIRS_=true (set by --as-cran): > > If set to a true value, also check the R code in common unit test subdirectories of tests for undeclared package dependencies. Default: false (but true for CRAN submission checks). > There are packages on Bioconductor that do not declare all of their dependencies. Using the above will catch that. Not only that, another advantage of this setting is that it will help CRAN package maintainers to reverse-dependency check also with Bioconductor packages, e.g. using revdepcheck::revdep_check(). Right now lots of Bioconductor packages fail such tests because of the above.

Hervé Pagès (23:53:07): > Why wouldn't R CMD check do this by default? We could probably add this, but it's important to keep in mind that the more we deviate R CMD check from its factory settings, the harder we make it for developers to reproduce what they see on the build report. For this particular setting, maybe someone wants to make the case for changing the default with the R core people? Sounds like everybody would benefit from such a change.

2020-03-18

Lori Shepherd (08:46:54): > Agreed. We are already seeing many, many confused developers trying to reproduce the errors from the length and logic checks we implemented. They don't understand how to implement optional checks to reproduce them in order to correct them. I would push for more of these to be defaults as well.

Vince Carey (11:29:14): > From left field: We can control the R CMD check activity at the developer level by embedding it in a function in BiocCheck … the rcmdcheck package could be relevant. We want more uniformity of developer experience … but if they have to set environment variables and run R CMD check we lose control/support that could be offered at the level of running functions and getting/interpreting output.

Vince Carey (11:45:33): > It seems to me that R CMD check, as great as it is, produces too much output that gets in the way of finding, interpreting and fixing conditions. The rcmdcheck package separates notes, warnings, and errors in a useful way.

Hervé Pagès (16:47:08): > rcmdcheck adds nice eye candy on top of R CMD check’s raw output and is probably well maintained, but IMO the added value is not enough to justify making a core build system functionality like R CMD check rely on yet another CRAN package that we have no control over. The situation with knitr and rmarkdown introducing breaking changes every couple of months is already painful enough.

Henrik Bengtsson (18:18:49) (in thread): > Personally, I think there is no reason not to use R CMD check --as-cran (modulo the test that attempts to check if the package is already on CRAN or not) also on Bioconductor.

Henrik Bengtsson (18:19:57) (in thread): > Without having tracked it, I’d assume that several R CMD check --as-cran checks eventually trickle up to R CMD check.

Henrik Bengtsson (18:22:13) (in thread): > Good point about Bioc developers being able to reproduce it locally. They basically run either R CMD check or R CMD check --as-cran - anything else complicates it. I have an open R wish to support something like R CMD check --flavor={cran,bioc,...} that would address this (https://github.com/HenrikBengtsson/Wishlist-for-R/issues/16).

Henrik Bengtsson (18:23:19) (in thread): > R CMD BiocCheck could achieve this though, correct?

Henrik Bengtsson (18:24:49) (in thread): > Though, BiocCheck is a naughty hack - and does it work for people without admin/sudo rights?

Hervé Pagès (18:29:58) (in thread): > I don’t think they’d be able to install it if they don’t own the R_HOME folder.

Henrik Bengtsson (18:32:23) (in thread): > So, I guess, the BiocCheck approach is then also not really an ideal solution. Lots of people don’t have those rights on shared environments, and with larger and larger annotation pkgs etc., it becomes more and more tedious to have a solid Bioc devel env on your local machine/notebook as well.

Henrik Bengtsson (18:34:19) (in thread): > If there were an easy way (e.g. an R CMD check option) to temporarily use a custom ~/.R/check.Renviron, I think 99.9% of the cases would be covered.

Henrik Bengtsson (18:37:34) (in thread): > One could do something like: > > R_CHECK_ENVIRON=~/.R/bioc-check.Renviron R CMD check ... > > where ~/.R/bioc-check.Renviron could be installed with/via BiocManager. But it’s still a bit tedious, easy to forget, and maybe even more so for people using devtools, rcmdcheck, and similar tools.

Vince Carey (19:30:45): > I was not proposing a transition to rcmdcheck. I was wondering if we could add some capacity to BiocCheck to provide more precise data to developers. Furthermore, I do not understand why BiocCheck functionality requires special privileges. Its current implementation does, but isn’t that just to give the capacity to use R CMD to run it?

Martin Morgan (20:13:39): > I guess you’re referring to the message > > Failed to copy the script/BiocCheck or script/BiocCheckGitClone script > to /usr/local/lib/R/bin. If you want to be able to run 'R CMD > BiocCheck' you'll need to copy it yourself to a directory on your PATH, > making sure it is executable. See the BiocCheck vignette for more > information. > > and the purpose of trying to copy the script somewhere in the execution path was indeed to be able to run R CMD BiocCheck. Putting the script in $R_HOME/bin was I think originally meant to be a convenience, so that you didn’t have to manually add the script to your search path. > > One could simply omit the attempt to copy (and hence the message), and the user could either follow the vignette to install the script in some executable path or invoke the package (as now) interactively in an R session as BiocCheck::BiocCheck(package = ...). But I believe that functionality does not require that the package be installed in that location. (One part of BiocCheck examines email subscription to the support site and mailing list; we did not want to make that part of membership public, so require credentials for this step.) If the wording or approach can be made more slick, a good direction to go would be an issue or pull request on https://github.com/Bioconductor/BiocCheck; see the R/install.R file and the vignette for details of the current implementation. > > When contributing new packages, https://github.com/Bioconductor/Contributions#r-cmd-check-environment describes the method used to configure the Single Package Builder (SPB) with environment variables; following this approach gets the environment closer to the one used by the nightly build system. > > We had also hoped to implement these SPB flags by default on the bioconductor/bioconductor_docker:devel docker image, but that turns out to be a work-in-progress.

2020-03-19

Mike Smith (06:22:56) (in thread): > Just a reminder of the call today. Topics are: > * With the upcoming release and switch to R-4.0, Lori and the BioC core team will give us an overview of some of the changes that developers need to be looking out for. > * Assuming @Henrik Bengtsson can make it, some discussion on the BioC build system, how it diverges from CRAN, and whether they should be more in sync. > * I’d also like to review topics we’ve covered in previous calls, and get participant feedback on future directions and topics for the Developers’ Forum, so come armed with any suggestions you may have.

Mike Smith (11:16:50): > To help with point three, there’s a list of previous calls and topics in https://docs.google.com/document/d/1ZC2hcC_ABzKV6WmAPz1CjU_IalPdlyOOVkbKYUR7fAk/edit?usp=sharing That’s based on my notes, which are definitely incomplete, so if I’ve missed anyone or anything out please let me know.

Lori Shepherd (11:23:36): > And I created a newly updated slide deck for build issues: https://docs.google.com/presentation/d/11N3WSM0QonVo7jV7KBTIpZe72DSlB_liq8HGf9XiGD8/edit?usp=sharing

Mike Smith (13:17:20): > Thanks @Lori Shepherd for your presentation, and everybody else who attended or contributed to today’s meeting. It was great to learn a bit more about both the Bioc build system and how CRAN submission works.

Mike Smith (13:19:48): > Our next Developers’ Forum is scheduled for Thursday 16th April at 09:00 PDT / 12:00 EDT / 18:00 CEST (Check here!) > > We will be using BlueJeans and the meeting can be joined via: https://bluejeans.com/114067881 (Meeting ID: 114 067 881)

2020-03-20

Hervé Pagès (15:59:19): > As discussed during yesterday’s meeting, the build report now provides a link to the R settings used on the build machines: https://bioconductor.org/checkResults/3.11/bioc-LATEST/messina/malbec2-checksrc.html

Kayla Interdonato (16:32:27): > The recording from yesterday’s developers forum is now available on our YouTube channel, https://www.youtube.com/watch?v=6cXvIULNseM. I will be adding it to the course materials on the website soon. - Attachment (YouTube): Developers Forum 08

2020-03-23

FelixErnst (07:27:44) (in thread): > fyi: the comment in Renviron.bioc mentioning the docker image currently uses a different/wrong name. @Nitesh Turaga

Nitesh Turaga (08:03:09) (in thread): > I’ll take a look. Pull requests are welcome though:blush:

FelixErnst (11:27:36) (in thread): > Is the source of the build system available on GitHub?

Nitesh Turaga (11:29:51) (in thread): > https://github.com/Bioconductor/BBS/blob/master/3.11/Renviron.bioc

FelixErnst (11:34:10) (in thread): > :hushed: Ok next time, then :blush:

Nitesh Turaga (11:45:26) (in thread): > https://github.com/Bioconductor/BBS/commit/24addc4feeb2e4b00975a23fc19bfecc527a32fc

2020-03-24

Edgar (13:24:13): > @Edgar has joined the channel

Jake Wagner (19:05:19): > @Jake Wagner has joined the channel

2020-03-31

Levi Waldron (09:30:49): > @Levi Waldron has joined the channel

2020-04-10

Shubham Gupta (01:38:02): > Hi, I need to update a list in my package without creating copies (due to big size) > > test <- function(param = TRUE){ > x <- list("a" = data.frame(a1 = c(1,2), a2 = c(1,1)), > "b" = data.frame(b1 = c(2,3), b2 = c(1,2))) > x <- updateList(x) > x > } > > updateList <- function(x){ > for(name in names(x)) x[[name]] <- rbind(x[[name]], c(4,4)) > x > } > > Is there any way to do it with list2env or some other approach?

Hervé Pagès (02:13:39): > Not sure this is a good channel for this. We don’t know enough so it depends. Is the list “big” in the sense that it has thousands (or millions) of small data.frames and you need to add a few rows to all the data.frames? Or is it “big” in the sense that it has only a few data.frames – and these data.frames are not necessarily big at the beginning – but you want to slowly grow them in a loop by adding a few rows to them at each iteration? I suggest you ask again but provide more details in the #bigdata-rep channel.

Shubham Gupta (12:11:46): > Thanks. It has thousands (or millions) of small data.frames and I need to add a few rows to all of them.

Shubham Gupta (12:11:59): > I will post in that channel

Hervé Pagès (12:48:15): > That’s the simple case. Just use an environment instead of a list.
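
A minimal sketch of that approach (toy names, mirroring the example above). Environments have reference semantics in R, so rebinding an element never copies the container:

```r
## Toy stand-in for "thousands of small data.frames", held in an
## environment rather than a list.
x <- new.env(parent = emptyenv())
x$a <- data.frame(a1 = c(1, 2), a2 = c(1, 1))
x$b <- data.frame(b1 = c(2, 3), b2 = c(1, 2))

updateEnv <- function(x) {
  ## [[<- on an environment rebinds in place; the environment (unlike a
  ## list) is never duplicated. The individual data.frames are still
  ## copied by rbind(), but that cost is per element, not per container.
  for (name in ls(x)) x[[name]] <- rbind(x[[name]], c(4, 4))
  invisible(x)
}

updateEnv(x)
nrow(x$a)  # 3
```

list2env(x) converts an existing named list into such an environment, and as.list(x) goes back the other way when a list is needed at the end.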

Martin Morgan (13:35:50): > To me this sounds like you should re-think the way things are implemented so that they are more ‘R’-like – instead of a million data frames, have one data frame with a column that partitions it into groups. Instead of iterating over many data.frames, transform vectors in this partitioned data.frame once. I guess if performance were important (it seems like it could well be) then I’d emphasize finding vector-oriented ways of accomplishing what are currently iterative solutions. Also, I think it’s worthwhile to verify that the slow step is actually updating list elements – I think current R does NOT duplicate the entire list when individual elements are updated > > > x = list(a = 1, b = 2) > > .Internal(inspect(x)) > @7fca11942808 19 VECSXP g0c2 [REF(1),ATT] (len=2, tl=0) > @7fca10e2a9e0 14 REALSXP g0c1 [REF(3)] (len=1, tl=0) 1 > @7fca10e2a970 14 REALSXP g0c1 [REF(3)] (len=1, tl=0) 2 > ATTRIB: > @7fca10e2cf80 02 LISTSXP g0c0 [REF(1)] > TAG: @7fca1101c890 01 SYMSXP g0c0 [MARK,REF(5731),LCK,gp=0x4000] "names" (has value) > @7fca11942888 16 STRSXP g0c2 [REF(1)] (len=2, tl=0) > @7fca118622e0 09 CHARSXP g0c1 [MARK,REF(5),gp=0x61] [ASCII] [cached] "a" > @7fca10ad8dc0 09 CHARSXP g0c1 [MARK,REF(10),gp=0x61] [ASCII] [cached] "b" > > x[["b"]] = 3 > > .Internal(inspect(x)) > @7fca11942808 19 VECSXP g0c2 [REF(1),ATT] (len=2, tl=0) > @7fca10e2a9e0 14 REALSXP g0c1 [REF(3)] (len=1, tl=0) 1 > @7fca10e2a820 14 REALSXP g0c1 [REF(3)] (len=1, tl=0) 3 > ATTRIB: > @7fca10e2cf80 02 LISTSXP g0c0 [REF(1)] > TAG: @7fca1101c890 01 SYMSXP g0c0 [MARK,REF(5731),LCK,gp=0x4000] "names" (has value) > @7fca11942888 16 STRSXP g0c2 [REF(65535)] (len=2, tl=0) > @7fca118622e0 09 CHARSXP g0c1 [MARK,REF(5),gp=0x61] [ASCII] [cached] "a" > @7fca10ad8dc0 09 CHARSXP g0c1 [MARK,REF(12),gp=0x61] [ASCII] [cached] "b" > > show (via the pointer addresses, @...) that the entire list is not copied by updating an individual element

Hervé Pagès (13:57:05): > > instead of a million data frames, have one data frame with a column that partitions it into groups > That is assuming all the data.frames have the same columns of course, which was not clear based on Shubham’s little toy example. But if that’s the case, then using a SplitDataFrame (which is basically a big DataFrame that keeps track of the groups of rows) maybe would be an option here.
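
If the data.frames do share the same columns, a hedged sketch of the SplitDataFrame idea (toy data; split() on an S4Vectors DataFrame returns a SplitDataFrameList, i.e. one big table partitioned by group):

```r
suppressPackageStartupMessages(library(S4Vectors))

## One big table with a grouping column, instead of many small tables.
df <- DataFrame(v1 = 1:6, v2 = 6:1,
                group = rep(c("a", "b", "c"), each = 2))

## SplitDataFrameList: behaves like a list of DataFrames but is backed
## by a single partitioned DataFrame under the hood.
sdf <- split(df[, c("v1", "v2")], df$group)
sdf[["b"]]
```

Vectorized operations can then be applied to the underlying columns once, rather than once per group.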

2020-04-11

Kylie Bemis (13:49:00): > @Kylie Bemis has joined the channel

2020-04-15

Tim Triche (17:15:34): > quick silly question (I was trying to use the compositions package and it’s borked against R-4.0):

Tim Triche (17:16:26): > > **** byte-compile and prepare package for lazy loading > Error in formals(scale) <- c(formals(scale), alist(... = )) : > inserting non-tracking CDR in tracking cell > Error: unable to load R code in package 'compositions' > Execution halted > ERROR: lazy loading failed for package 'compositions' > * removing '/home/tim/R/x86_64-pc-linux-gnu-library/4.0/compositions' >

Tim Triche (17:16:29): > what does this mean? other than a memory write barrier violation – why is alist() throwing it?

Tim Triche (17:17:11): > the offending code redefines the formal arguments for base::scale, it seems

Tim Triche (17:17:48): > I tracked it back to https://svn.r-project.org/R/trunk/src/main/memory.c

Tim Triche (17:21:54): > I just grabbed an alpha build of R-4.0 and installed it on a newish machine; I don’t appear to have set any hard write limit flags at configure time. But this code branch appears to be specific to write violation testing.

Nitesh Turaga (17:23:19): > What is the piece of code you are running?

Nitesh Turaga (17:23:29): > I don’t have the same issue when loading the package

Nitesh Turaga (17:23:35): > on R-4.0-alpha

Tim Triche (17:23:45): > this is an attempt to install the compositions package on R-4.0 from CRAN

Nitesh Turaga (17:24:32): > Interesting. I was able to install it just fine.

Tim Triche (17:24:34): > it is a dependency for ggtern, which I wanted to use to look at whether a dirichlet-categorical extremality value is useful for feature engineering

Nitesh Turaga (17:24:38): > > > sessionInfo() > R version 4.0.0 alpha (2020-04-07 r78176) > Platform: x86_64-apple-darwin17.0 (64-bit) > Running under: macOS Mojave 10.14.6 > > Matrix products: default > BLAS: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib > LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib > > locale: > [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > other attached packages: > [1] compositions_1.40-5 bayesm_3.1-4 robustbase_0.93-6 > [4] tensorA_0.36.1 > > loaded via a namespace (and not attached): > [1] DEoptimR_1.0-8 BiocManager_1.30.10 compiler_4.0.0 > [4] tools_4.0.0 Rcpp_1.0.4.6 > > library(compositions) >

Tim Triche (17:24:40): > huh. what architecture

Tim Triche (17:24:51): > ah – you are using binary packages?

Tim Triche (17:25:10): > > > sessionInfo() > R version 4.0.0 alpha (2020-04-06 r78160) > Platform: x86_64-pc-linux-gnu (64-bit) > Running under: Ubuntu Focal Fossa (development branch) > > Matrix products: default > BLAS: /usr/lib/R/lib/libRblas.so > LAPACK: /usr/lib/R/lib/libRlapack.so > > locale: > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C > [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 > [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 > [7] LC_PAPER=en_US.UTF-8 LC_NAME=C > [9] LC_ADDRESS=C LC_TELEPHONE=C > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > loaded via a namespace (and not attached): > [1] compiler_4.0.0 tools_4.0.0 >

Tim Triche (17:26:04): > I’ll try bumping to a newer R-devel and see if that resolves anything

Nitesh Turaga (17:26:27): > We are just 16 commits apart…shouldn’t be that big a change.

Tim Triche (17:26:31): > also @Nitesh Turaga since you’re here – I need to update some credential-to-email mappings

Tim Triche (17:26:47): > it seems like all of my coauthors on any package are now set to be me

Nitesh Turaga (17:26:52): > But yes, I did use the R-4.0-alpha binary which is available

Nitesh Turaga (17:27:10): > Could you send me an email about that on bioc-devel ?:smile:

Tim Triche (17:27:17): > yes, will do

Tim Triche (17:27:18): > thanks

Tim Triche (17:27:38): > if you compile compositions from source does it install cleanly on your R?

Nitesh Turaga (17:27:38): > Perfect.

Martin Morgan (17:41:19) (in thread): > both release and devel will be on R-4.0 after the release, so that’s where you want to be.

Tim Triche (17:45:06) (in thread): > I’m currently using R-4.0 for everything, thanks!

Kasper D. Hansen (22:53:50): > This seems bad > > the offending code redefines the formal arguments for base::scale

2020-04-16

Al J Abadi (02:05:16): > @Al J Abadi has joined the channel

Mike Smith (04:26:08): > A reminder that our next Developers’ Forum is scheduled for today (Thursday 16th April) at 09:00 PDT / 12:00 EDT / 18:00 CEST (Check here!) > > We will be using BlueJeans and the meeting can be joined via: https://bluejeans.com/114067881 (Meeting ID: 114 067 881) > > We have two topics for today: > * @Joselyn Chávez will present “Writing our first Bioconductor package as members of the CDSB community”, describing the package development and submission experiences from the perspective of first-time Bioconductor contributors. > * I would also like to ask the audience for their perspectives on unit testing, particularly dealing with ‘problematic’ tests, e.g. relying on external resources, long runtimes, operating-system-specific code etc.

Marcel Ramos Pérez (10:34:49) (in thread): > It works for me on R 4.0 beta from source

Marcel Ramos Pérez (10:35:13) (in thread): > > > sessionInfo() > R version 4.0.0 beta (2020-04-14 r78225) > Platform: x86_64-pc-linux-gnu (64-bit) > Running under: Ubuntu 19.10 > > Matrix products: default > BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.8.0 > LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.8.0 > > locale: > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C > [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 > [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 > [7] LC_PAPER=en_US.UTF-8 LC_NAME=C > [9] LC_ADDRESS=C LC_TELEPHONE=C > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > other attached packages: > [1] compositions_1.40-5 bayesm_3.1-4 robustbase_0.93-6 > [4] tensorA_0.36.1 colorout_1.2-2 > > loaded via a namespace (and not attached): > [1] DEoptimR_1.0-8 BiocManager_1.30.10 compiler_4.0.0 > [4] tools_4.0.0 Rcpp_1.0.4.6 > > >

Nitesh Turaga (13:13:10): > @Leonardo Collado Torres You were about to say something about github-actions after @FelixErnst. Want to discuss here?

FelixErnst (13:13:29): > Hi @Nitesh Turaga

Nitesh Turaga (13:13:35): > Hi!

FelixErnst (13:13:55): > Yes I tried them, but currently I wouldn’t recommend setting them up for devel

FelixErnst (13:14:13): > ubuntu-devel is not available using Jim Hester’s action templates

FelixErnst (13:14:33): > macOS-devel fail because of binaries being weird

FelixErnst (13:15:08): > and windows-devel works except for RCurl. I had that in my dependency tree and then it didn’t work

Leonardo Collado Torres (13:15:27): > (I’ll respond too in a little bit, I’ll wait for Felix)

FelixErnst (13:15:31): > for devel travis-CI is the most reliable in my opinion

Mike Smith (13:15:51): > I think there are some issues with R-devel now being 4.1. Lots of URLs were failing for me yesterday because of that.

Mike Smith (13:16:34): > I was also trying to use Rtools 40 since there’s an option in the setup action for that, but it would just fail and I couldn’t figure out how to get an error log

FelixErnst (13:17:19): > yep, tried that as well.

FelixErnst (13:17:47): > What I forgot to say was that for hands-on fixing, nothing beats a docker container from @Nitesh Turaga

FelixErnst (13:21:25): > @Leonardo Collado Torres you wanted to say something as well, didn’t you?

Leonardo Collado Torres (13:21:42): > yup, I’m testing still:stuck_out_tongue:

Nitesh Turaga (13:21:44): > Thanks for the input on github actions @FelixErnst. I’m trying to see if there is a way I can use the bioconductor container to set up basic functions like R CMD build / check and R CMD BiocCheck. I’m not totally familiar with github actions, and still learning about them. I figure using that container would solve a lot of the system-dependency installation problems.

Nitesh Turaga (13:22:29): > Although that wouldn’t be on multiple platforms, it would just be on linux.

Nitesh Turaga (13:22:54): > But I’m looking into Jim Hester’s code to see how he sets up builds on Windows and Mac.

Charlotte Soneson (13:22:55): > There’s a bit of that in https://github.com/seandavi/BiocActions too, right?

Nitesh Turaga (13:23:30): > @Charlotte Soneson I saw that the iSEEu package uses github actions really well :smile:

Charlotte Soneson (13:23:52): > I tried to combine the docker containers with the regular operating systems here: https://github.com/csoneson/ExploreModelMatrix/actions/runs/76576511

Charlotte Soneson (13:24:03): > Seems to work, even if I think it’s a bit of a hack

Nitesh Turaga (13:25:03): > This is great!@Charlotte Soneson

FelixErnst (13:28:18): > Nice!@Charlotte Soneson

FelixErnst (13:29:17): > @Nitesh Turaga maybe you want to invest some time into an image based directly on r-ver, so that the rstudio install is removed

Marcel Ramos Pérez (13:29:41): > Has anyone looked at using bioconductor_docker:devel with Travis?

FelixErnst (13:29:43): > I guess that would make the image a bit more resource friendly for just testing

Nitesh Turaga (13:30:45): > But you don’t have to use Rstudio though, why is that making a difference?

Mike Smith (13:31:20) (in thread): > It’s a great hack. I think you should have just told us all to stop talking and that you’d figured it out!

Charlotte Soneson (13:31:49): > I decided to invest a bit into github actions since travis suddenly refused to autodeploy with pkgdown - easier without having to add keys here and there:slightly_smiling_face:

FelixErnst (13:32:18): > nope I don’t have to, but it is with dependencies roughly 500 Mb which could be removed. Just mentioning.

Leonardo Collado Torres (13:33:36): > So, I’ve been playing around with this since yesterday with my derfinderPlot repo at https://github.com/leekgroup/derfinderPlot/commits/master. > * Through usethis https://www.tidyverse.org/blog/2020/04/usethis-1-6-0/ like usethis::use_github_action('check-standard') https://github.com/leekgroup/derfinderPlot/commit/928c1e90d296bac75ba33fd12838b4b3f908627c#diff-357b9284d07b04b8b0ff2f83db19a345R13 (tests R 3.6 on linux, mac, win + devel on mac). I realized you might not want to test in R release for your BioC pkg (in my case, I’m using a bioc-devel-only function at some point), plus issues with the macOS devel (eventually GenomicFeatures, derfinder and ggbio failed to install). > * Searching the BioC slack I ran into https://github.com/csoneson/dreval/blob/master/.github/workflows/R-CMD-check.yaml and https://github.com/seandavi/BiocActions/blob/master/.github/workflows/main.yml. I edited the usethis::use_github_action('check-standard') output based on what I saw in Charlotte and Sean’s work at https://github.com/leekgroup/derfinderPlot/commit/eb1bad9605417e1f7ea7c344221f21c4639b6d19 (the previous commit copies the file). This tests only using bioconductor/bioconductor_docker:devel for pushes/PRs made to the master branch. > * In my most recent commit (which I just did) https://github.com/leekgroup/derfinderPlot/blob/7b4104804ea5b3e93227cf147ea8ed387bbf5830/.github/workflows/check-bioc.yml I’m trying to automatically detect and configure the docker info based on whether you are working on the master or RELEASE_* branches. But, well, I haven’t figured out how to do this properly, as I’m getting YAML syntax errors and the like (see other earlier commits). > My goal would be to have a GitHub action that uses either the BioC devel docker or the release one that I do not have to manually edit for every BioC release. I only want to deploy (pkgdown) and run the coverage tests (covr) on the bioc-devel version. 
> > Yes, testing on Mac and Windows could be nice (via the R versions that Jim Hester maintains), but I don’t need to for most cases (and currently it’s tricky to get an R 4.0.0 or 4.1 run up for me: aka, some issues with installing dependencies). Also, recently I’ve run into hard-to-debug issues that I could only reproduce using Bioc’s dockers.

Charlotte Soneson (13:33:50) (in thread): > Well, there’s still the problem with defining the Bioc release (which is why I added the dockers) - right now it will grab the latest version for the R version you specify. Perhaps it’s possible to somehow build that into the runner matrix and pass it to BiocManager.

Marcel Ramos Pérez (13:35:14) (in thread): > was this usingpkgdown::deploy_site_github?

Charlotte Soneson (13:36:20) (in thread): > Yes

Nitesh Turaga (13:48:33): > This is also very useful@Leonardo Collado Torres!

Tim Triche (16:02:22): > OK so I verified that compositions only refuses to load when R is compiled with --enable-strict-barrier

Tim Triche (16:02:44): > and that is why the CDR modification for the formals of base::scale is a problem

Tim Triche (16:02:57): > per https://svn.r-project.org/R/trunk/src/main/memory.c

Tim Triche (16:03:19): > but now I have to wonder – is it not an issue for developers to typically run R without strict memory fencing?

Tim Triche (16:03:58): > > > sessionInfo() > R version 4.0.0 beta (2020-04-15 r78231) > Platform: x86_64-pc-linux-gnu (64-bit) > Running under: Ubuntu Focal Fossa (development branch) > > Matrix products: default > BLAS: /usr/lib/R/lib/libRblas.so > LAPACK: /usr/lib/R/lib/libRlapack.so > > locale: > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C > [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 > [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 > [7] LC_PAPER=en_US.UTF-8 LC_NAME=C > [9] LC_ADDRESS=C LC_TELEPHONE=C > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C > > attached base packages: > [1] grDevices datasets parallel graphics stats4 stats utils > [8] methods base > > other attached packages: > [1] compositions_1.40-5 bayesm_3.1-4 robustbase_0.93-6 > [4] tensorA_0.36.1 S4Vectors_0.25.15 BiocGenerics_0.33.3 > [7] skeletor_1.0.4 gtools_3.8.2 useful_1.2.6 > [10] knitr_1.28 forcats_0.5.0 stringr_1.4.0 > [13] dplyr_0.8.5 purrr_0.3.3 readr_1.3.1 > [16] tidyr_1.0.2 tibble_3.0.0 ggplot2_3.3.0 > [19] tidyverse_1.3.0 BiocManager_1.30.10 > > loaded via a namespace (and not attached): > [1] tidyselect_1.0.0 xfun_0.13 haven_2.2.0 lattice_0.20-41 > [5] colorspace_1.4-1 vctrs_0.2.4 generics_0.0.2 rlang_0.4.5 > [9] pillar_1.4.3 glue_1.4.0 withr_2.1.2 DBI_1.1.0 > [13] dbplyr_1.4.2 modelr_0.1.6 readxl_1.3.1 lifecycle_0.2.0 > [17] plyr_1.8.6 munsell_0.5.0 gtable_0.3.0 cellranger_1.1.0 > [21] rvest_0.3.5 fansi_0.4.1 DEoptimR_1.0-8 broom_0.5.5 > [25] Rcpp_1.0.4.6 scales_1.1.0 backports_1.1.6 jsonlite_1.6.1 > [29] fs_1.4.1 hms_0.5.3 stringi_1.4.6 grid_4.0.0 > [33] cli_2.0.2 tools_4.0.0 magrittr_1.5 crayon_1.3.4 > [37] pkgconfig_2.0.3 ellipsis_0.3.0 xml2_1.3.1 reprex_0.3.0 > [41] lubridate_1.7.8 assertthat_0.2.1 httr_1.4.1 rstudioapi_0.11 > [45] R6_2.4.1 compiler_4.0.0 nlme_3.1-147 >

Tim Triche (16:05:01): > so while previously, formals(scale) <- c(formals(scale), alist(... = )) threw an error at load time, now it doesn’t:

Tim Triche (16:05:13): > > > library(compositions) > Loading required package: tensorA > > Attaching package: 'tensorA' > > The following object is masked from 'package:base': > > norm > > Loading required package: robustbase > Loading required package: bayesm > > Attaching package: 'bayesm' > > The following object is masked from 'package:gtools': > > rdirichlet > > Welcome to compositions, a package for compositional data analysis. > Find an intro with "? compositions" > > > Attaching package: 'compositions' > > The following objects are masked from 'package:S4Vectors': > > cor, cov, var > > The following objects are masked from 'package:BiocGenerics': > > normalize, var > > The following objects are masked from 'package:stats': > > cor, cov, dist, var > > The following objects are masked from 'package:base': > > %*%, scale, scale.default >

Tim Triche (16:05:41): > this seems like maybe a bad thing. (I think I got into the habit of compiling with --enable-strict-barrier from Dirk)

Tim Triche (16:05:55): > Is this compile option no longer recommended?

Martin Morgan (18:03:26) (in thread): > The mention of R-devel makes me nervous – both the release and devel branches of bioc will use R-4.0 for the next six months.

Al J Abadi (23:04:31): > I think it’s nice to limit the length of the lines in source code / messages so it’s easier to review. One of the drawbacks is trying to break the messages into multiple lines, as it may look messy in the console. Does anyone have a quick solution that does not require breaking the message itself?

Marcel Ramos Pérez (23:29:52): > I’m not sure what you mean. Perhaps use paste0?

Al J Abadi (23:46:02): > For instance the following message is long: > > bar <- 42 > egg <- 24 > message(sprintf("foo foo foo foo foo foo foo foo foo foo foo %s foo foo foo foo foo foo foo foo foo foo foo foo foo foo foo foo foo foo foo foo foo foo %s foo foo foo foo foo foo foo", bar, egg)) > > And if you use SHIFT+RETURN to break it in the middle, well, you can try on different console widths and see that it becomes messy

2020-04-17

Martin Morgan (05:02:03) (in thread): > …though remember that > > message(paste0("foo is: '", bar, "'")) > > is the same as > > message("foo is '", bar, "'") >

Martin Morgan (05:16:55) (in thread): > I’m not a fan of sprintf() these days, especially since message() / warning() / stop() all paste0() their ... arguments together. > > I think there are two approaches to ‘formatting’ messages. The first is to enter the message in short lines which get pasted together into a single line using the implicit paste() as something like > > message( > "I'm a really helpful ", > "message formatted for the ", > "narrow window that I'm now ", > "using in slack." > ) > > This creates a long message that gets wrapped, in an ugly way, to the R console. > > The second approach is to try harder using strwrap(), which wraps lines to the current console width. Something like > > txt <- paste0( > "I'm a really helpful ", > "message formatted for the ", > "narrow window that I'm now ", > "using in slack." > ) > breaks <- strwrap(txt) > > where breaks is now a vector with elements of text appropriate for the size of the current console. To use these, they need to be pasted together with a new line separator > > message(paste(breaks, collapse="\n")) > > Arguments to strwrap() allow control of first line and subsequent line indent / exdent. The approach lends itself to helper functions that break and wrap the text or similar > > message(.wmsg(...)) # one wrapper > .message(...) # another wrapper > > and usually one gets frustrated when one wants to have more complicated formatting, e.g., > > warning( > ## argh, I can't use my wrapper! > "something bad happened:\n", > " filename: '", filename, "'\n", > " bad thing: ", reason > ) >
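
The helper alluded to above can be a one-liner; a sketch (the name .wmsg is illustrative; S4Vectors exports a similar wmsg()):

```r
## Wrap a message to the current console width (getOption("width")).
.wmsg <- function(...) paste(strwrap(paste0(...)), collapse = "\n")

message(.wmsg(
  "I'm a really helpful message that gets re-wrapped to whatever ",
  "console width happens to be in effect when it is emitted."
))
```

Because the wrapping happens at call time, the same message displays cleanly on any console width.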

Kayla Interdonato (13:35:02): > @Joselyn Chávez Could you send me a link to your slides from yesterday’s developer forum? We like to provide a copy of the slides with the recording on the course materials on the Bioc website. Thanks!

Joselyn Chávez (13:36:15): > Great! @Kayla Interdonato https://docs.google.com/presentation/d/1R0O86Uglvr2slzqN7o1P0IBPcTTXepSAOTbKGMBfPus/edit#slide=id.p

Hervé Pagès (13:44:14) (in thread): > Actually you still can, but it becomes a little bit more complicated: > > stop(wmsg("You need to install ", > "package blah to ", > "perform this ", > "operation. ", > "Please install the ", > "package with:"), > "\n    BiocManager::install(\"blah\")\n", > wmsg("and try again.") > ) >

Daniela Cassol (14:40:01): > Hello all, > I just noticed this message on the release_3.10 branch of the systemPipeR package: > > Citation (from within R, enter citation("systemPipeR")): > > Important note to the maintainer of the systemPipeR package: An error occured while trying to generate the citation from the CITATION file. This typically occurs when the file contains R code that relies on the package to be installed e.g. it contains calls to things like packageVersion() or packageDate() instead of using meta$Version or meta$Date. See R documentation for more information. > > I also know this branch is already frozen, but is there any way I can fix this? > Thanks in advance!

Hervé Pagès (14:42:18): > Unfortunately it’s too late. Sorry.

Daniela Cassol (14:43:43) (in thread): > Okay, thank you for letting me know! I will fix this on the devel branch to avoid this issue in the future.

Shubham Gupta (16:18:08): > Hi, I am trying to use the .data pronoun from rlang to avoid defining global variables in my package. However, now my tests fail because a column can’t be found in .data. > > output <- dplyr::group_by(df, .data$var1, .data$var2) %>% dplyr::summarise(var4 = dplyr::lst(var3)) %>% dplyr::ungroup() %>% as.data.frame() > > The error comes if I define var4 = dplyr::lst(.data$var3)

Shubham Gupta (16:20:33): > > 2: dplyr::group_by(df, .data$var1, .data$var2) %>% dplyr::summarise(var4 = dplyr::lst(.data$var3)) %>% dplyr::ungroup() %>% as.data.frame() > 3: withVisible(eval(quote(`_fseq`(`_lhs`)), env, env)) > 4: eval(quote(`_fseq`(`_lhs`)), env, env) > 5: eval(quote(`_fseq`(`_lhs`)), env, env) > 6: `_fseq`(`_lhs`) > 7: freduce(value, `_function_list`) > 8: function_list[[i]](value) > 9: dplyr::summarise(., var4 = dplyr::lst(.data$var3)) > 10: summarise.tbl_df(., var4 = dplyr::lst(.data$var3)) > 11: summarise_impl(.data, dots, environment(), caller_env()) > 12: dplyr::lst(.data$var3) > 13: lst_quos(xs) > 14: eval_tidy(xs[[i]], unique_output) > 15: .data$var3 > 16: `$.rlang_data_pronoun`(.data, var3) > 17: data_pronoun_get(x, nm) > 18: rlang:::abort_data_pronoun(x) > 19: abort(msg, "rlang_error_data_pronoun_not_found") > 20: signal_abort(cnd) >

Shubham Gupta (16:21:24): > Seems like the dplyr::lst() function doesn’t allow .data

Leonardo Collado Torres (18:55:12): > @Nitesh Turaga @Charlotte Soneson @FelixErnst I now have a working GitHub Actions workflow that uses the Bioconductor dockers (release or devel depending on the GitHub branch), caches the R packages, and runs BiocCheck, pkgdown, and covr. Some things got tricky because of the docker containers vs the Linux OS (like getting BiocCheck to work properly in that setting and the configuration required by pkgdown). I’m sure there are probably ways to make the code cleaner. > > Anyway, I’m quite happy with it and made suggestions in 5 different repos all linking to https://github.com/r-lib/actions/issues/84. If they like the suggestions, I would make the related PRs too. > > I thought you would be interested too. If you feel inclined, please feel free to chime in on the “issues” (suggestions) I made on GitHub so others not in this Slack can see your comments.

2020-04-20

Peter Hickey (18:48:47): > This is a bit of a thought bubble … > A somewhat common occurrence I’ve seen in BioC package reviews is that the vignette has incorrect instructions for installing the package, or these instructions are missing entirely. > I wonder whether we might be able to autogenerate this section via a function (in BiocStyle?).

Aaron Lun (18:49:33): > I wonder how people even get to the vignette without either (i) passing through the landing page or (ii) having already installed the package.

Al J Abadi (18:50:47): > Hi @Hervé Pagès, > With the bi-weekly build report emails, there seems to be a lag, which can confuse some. For example, our latest build on the 19th was successful, but on the 20th (yesterday) we received an email saying that our build failed (which I assume relates to the second-to-last build).

Peter Hickey (18:50:55) (in thread): > I think it’s via google

Peter Hickey (18:51:20) (in thread): > Do timezones explain it?

Al J Abadi (18:53:48) (in thread): > I actually kind of corrected for that, the received email was early today (21st), Melbourne time

Hervé Pagès (18:59:28) (in thread): > Can we please discuss this on the appropriate channel? mixOmics is all clear on today’s report (April 20th) and I don’t see that any email was sent for mixOmics today. So maybe we are talking about another package? Please do NOT answer here. Thanks!

Al J Abadi (19:07:57) (in thread): > sorry about the channel hiccup, I recently joined multiple channels and I got confused this time:slightly_smiling_face:

2020-04-21

Martin Morgan (14:21:23) (in thread): > @Mike Smith sounds like a pretty good idea… to complement (and / or incorporate in?) BiocStyle::use_vignette_*()?

2020-04-22

Daniela Cassol (13:36:04): > Hello everyone, I have a quick/silly question. :grin: I was wondering if, in an example on a man page, I could use a file from another Bioc package, for example: x <- system.file("extdata/", "file.txt", package="XXX"). On my build/check I don’t have any errors, but I am not sure of the behavior on the Bioc instance.

Lori Shepherd (13:40:58): > I think as long as you specify the package correctly in the DESCRIPTION it shouldn’t be a problem.

Martin Morgan (13:50:56) (in thread): > I agree it shouldn’t be a problem, but was wondering about the specifics? If it is generally useful, maybe the file should be ExperimentHub resources…

Daniela Cassol (13:50:59): > Under “Depends” or “Imports”?

Daniela Cassol (13:53:03) (in thread): > It is not very important, but the idea is to point to an Rmd vignette that we have stored in the data package

Hervé Pagès (14:25:15) (in thread): > Don’t use trailing slashes (e.g. use "extdata", not "extdata/"). This works and is used a lot. As long as package XXX is installed of course, which you can ensure by putting XXX in Depends or Suggests. With the former, XXX will automatically be installed when the user installs your package, so x <- system.file("extdata", "file.txt", package="XXX") will work out-of-the-box. With the latter, XXX is only suggested so doesn’t get automatically installed when the user installs your package. In this case you need to put something like this in your example: > > # In this example we'll use data from the XXX package: > library(XXX) > x <- system.file("extdata", "file.txt", package="XXX") > > The call to library() is not strictly needed. It’s only here to make sure that the code will fail early and with a clear error message if the package is not installed. > Hope this helps.

Daniela Cassol (18:58:14) (in thread): > Yes, this helped a lot. Thank you for the detailed explanation!:slightly_smiling_face:

2020-04-24

Daniela Cassol (02:05:25) (in thread): > @Martin Morgan and @Hervé Pagès Another question, now related to imports/package dependencies. Our package depends on the ShortRead package. Now I want to also include the magrittr package. > > library(ShortRead) > library(magrittr) > > Attaching package: 'magrittr' > > The following object is masked from 'package:ShortRead': > > functions > > Also, when building and checking the package, it presents this warning: > > * checking whether package 'systemPipeR' can be installed ... WARNING > Found the following significant warnings: > Warning: replacing previous import 'ShortRead::functions' by 'magrittr::functions' when loading 'systemPipeR' > > Any suggestions on the best way to solve this? > Thank you very much :)

Hervé Pagès (04:37:07) (in thread): > Are the library statements you’re showing us from the vignette or the examples in a man page? It makes sense to do library(magrittr) in your vignette or man pages only if you’re going to make calls to magrittr in the code chunks of the vignette or in your examples. But if what you want to do is use magrittr internally in your package then do NOT do library(magrittr). Instead you should put it in the Imports field and use selective imports (e.g. importFrom(magrittr, set_colnames)) in your NAMESPACE file to import only what you need. This will avoid the name collision with ShortRead.
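A sketch of the selective-import setup Hervé describes, assuming the pipe is what you actually need from magrittr (adjust the imported names to your own usage):

```r
## DESCRIPTION (fragment):
##   Imports: magrittr
##
## NAMESPACE (fragment) -- import only what you use, so the
## 'functions' symbol exported by both packages never collides:
##   import(ShortRead)
##   importFrom(magrittr, "%>%")

## With roxygen2, the equivalent tag above any function definition:
#' @importFrom magrittr %>%
NULL
```

Selective imports also keep your package insulated from future exports added to either dependency.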

Daniela Cassol (12:20:39) (in thread): > I was importing both packages, import(ShortRead, magrittr), and the error was induced by an example on the man pages. However, now I used importFrom to select a few functions from magrittr and fixed the issue. Thank you again! I appreciate it!

2020-04-26

Henrik Bengtsson (01:40:15): > Hi. It’s fresh, it’s experimental, but it works: > > $ R CMD check --as=bioconductor MANOR_1.59.4.tar.gz > * using --as=bioconductor > * using Bioconductor version: 3.11 (per BiocVersion) > * using R_CHECK_ENVIRON="/home/hb/R/x86_64-pc-linux-gnu-library/4.0/rcli/bioc-3.11/check.Renviron" (5 lines; 1814 bytes setting 5 environment variables: '*R_CHECK_EXECUTABLES*', '*R_CHECK_EXECUTABLES_EXCLUSIONS*', '*R_CHECK_LENGTH_1_CONDITION*', '*R_CHECK_LENGTH_1_LOGIC2*', '*R_CHECK_S3_METHODS_NOT_REGISTERED*') > * using log directory '/home/hb/repositories/other/MANOR.Rcheck' > * using R version 4.0.0 (2020-04-24) > * using platform: x86_64-pc-linux-gnu (64-bit) > * using session charset: UTF-8 > * checking for file 'MANOR/DESCRIPTION' ... OK > * this is package 'MANOR' version '1.59.4' > ... > > and > > $ R CMD check --as=bioconductor::BiocCheck MANOR_1.59.4.tar.gz > * using --as=bioconductor::BiocCheck > * using Bioconductor version: 3.11 (per BiocVersion) > * using R_CHECK_ENVIRON="/home/hb/R/x86_64-pc-linux-gnu-library/4.0/rcli/bioc-3.11/check.Renviron" (5 lines; 1814 bytes setting 5 environment variables: '*R_CHECK_EXECUTABLES*', '*R_CHECK_EXECUTABLES_EXCLUSIONS*', '*R_CHECK_LENGTH_1_CONDITION*', '*R_CHECK_LENGTH_1_LOGIC2*', '*R_CHECK_S3_METHODS_NOT_REGISTERED*') > This is BiocCheck version 1.23.6. BiocCheck is a work in progress. > Output and severity of issues may change. Installing package... > * Checking Package Dependencies... > * Checking if other packages can import this one... > ... > > I’ve tested it on Linux and Windows, but not macOS. Install as follows and give it a spin: > > remotes::install_github("HenrikBengtsson/rcli") > rcli::install() > > https://github.com/HenrikBengtsson/rcli

Martin Morgan (13:13:37) (in thread): > https://github.com/Bioconductor/BiocStyle/pull/76 implements enhancements to use_vignette_html() that provide a template with installation and other instructions relevant to package vignettes.

2020-04-28

brian capaldo (11:49:25): > @brian capaldo has joined the channel

Federico Marini (16:04:25): > Guessing most of the interested people are here, I’ll ask here: > is any particular hour planned for the release? Guess it is all set up :slightly_smiling_face:

2020-04-29

Kevin Blighe (06:57:39): > @Kevin Blighe has joined the channel

Vince Carey (09:02:34): > Here’s a possible developers forum topic: exception handling. Specific use case – some code in a shiny app may throw a warning – how to send it to the app via showNotification? I assume tryCatch would be involved but I am not well-versed in its use.

Vince Carey (09:03:26): > Another question for developers: is it possible to get the result of showNotification in shiny to show more prominently in the UI? By default it pops up in the bottom right corner and is easily missed.

Federico Marini (09:09:12): > Re: notifications, there are other “systems”, first one coming to mind is shinyAlert

Federico Marini (09:09:13): > https://daattali.com/shiny/shinyalert-demo/ - Attachment (daattali.com): shinyalert package > Easily create pretty popup messages (modals) in Shiny

Federico Marini (09:09:47): > but they mostly are not shiny-native

Federico Marini (09:10:22): > could be a meaningful request to do, actually - prompting e.g. the “anchor point” of the notification to show up

Federico Marini (09:11:33): > 2nd one coming to mind is from the french group,https://github.com/dreamRs/shinypop

Federico Marini (09:12:11): > 3rd one is shinytoastr

Federico Marini (09:12:13): > https://github.com/MangoTheCat/shinytoastr

Federico Marini (09:12:50): > 1 and 3 are on CRAN afaik

Henrik Bengtsson (09:42:27) (in thread): > You want to use withCallingHandlers() for non-error conditions because, contrary to tryCatch(), it will continue the evaluation of the expression after you have handled the condition.
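A small base-R sketch of that pattern. In a shiny server you might pass notify = function(msg) shiny::showNotification(msg, type = "warning"); the wrapper name and callback argument here are illustrative, not an established API:

```r
## Run `expr`, routing any warnings to a `notify` callback and then
## continuing evaluation -- tryCatch() would unwind the expression instead.
run_with_notifications <- function(expr, notify = message) {
    withCallingHandlers(
        expr,
        warning = function(w) {
            notify(conditionMessage(w))
            invokeRestart("muffleWarning")  # don't also emit the warning normally
        }
    )
}
```

Because withCallingHandlers() resumes after the handler returns, the expression still produces its value; invokeRestart("muffleWarning") just stops the warning from additionally reaching the console.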

Keegan Korthauer (13:05:08): > @Keegan Korthauer has joined the channel

Kevin Rue-Albrecht (16:15:31): > duration=NULL even creates permanent notifications

2020-04-30

Jared Andrews (14:21:13): > @Jared Andrews has joined the channel

Dan Bunis (14:42:01): > Not sure if this is where I should post, but I also think deprecation might make for a good discussion topic so :man-shrugging:: > > Say I’m making an update to a function that ends up expanding the utility of one of its inputs. So now I’d like to update the name of that input to reflect its expanded usage. Is this the proper way? > > I see https://bioconductor.org/developers/how-to/deprecation/ but it focuses on full functions, so I’m wondering if my below adaptation is proper. #s are meant to match the same steps. > > 0. Leave the old input, but make a new.input with the new name that defaults to the old.input. Ultimately utilize the new.input in the code. > 1. Current devel / next release: Use if ('!default'(old.input)) {.Deprecated(...)} in code to add a warning() whenever that old.input is given a non-default value. (Essentially outputs Warning: 'old.input' is deprecated. use 'new.input' instead.) > 2. Next devel / next, next release: Skip the .Defunct() step because it has no input for outputting a custom name rather than the name of the function it was called within. Instead: simply remove the old.input entirely, as R will auto-generate an error message. Or, if there is an input that causes R not to output such an error, swap .Deprecated() for stop("'old.input' is deprecated. use 'new.input' instead.") > 3. Next, next devel / 3 releases later: Remove the old input entirely and all warning code.

Hervé Pagès (16:08:09): > I can’t really figure out if what you want to do is rename the data set that you use as input for the improved function, or rename the argument of the function that the data set is passed to, or both.

brian capaldo (16:09:15): > Isn’t this what scater went through when it went from the SCESet class to the SingleCellExperiment class?

Hervé Pagès (16:10:39): > I missed that story but that sounds like renaming a class, which is another story.

Aaron Lun (16:12:19): > I try to forget about the SCESet class.

Hervé Pagès (16:12:45): > is it working?

brian capaldo (16:24:57): > apologies for bringing up bad memories

Aaron Lun (16:25:44): > Is what working?

brian capaldo (16:29:13): > forgetting about it

Aaron Lun (16:29:25): > forgetting about what?

Hervé Pagès (16:29:43): > seems to be working

brian capaldo (16:29:44): > seems to be working

Aaron Lun (16:30:41): > Anyway,@Dan Bunis, you can have a look at scuttle and see an implementation of a long-standing wish: to replace all the underscores in the argument names with dots.

Aaron Lun (16:31:32): > The deprecation phase now has both arguments in the function definition; in the release after next I will add a deprecation message; and 3 releases down the line, I will remove the underscores.

Hervé Pagès (16:34:06): > you could have reduced the whole process to 2 releases by adding the new argument and deprecating the old argument at the same time

Dan Bunis (16:35:11): > Sorry, I hadn’t seen any of this discussion til @Aaron Lun tagged me. @Hervé Pagès I want to rename an argument that receives a string (which ultimately does refer to rownames of target data) without breaking anyone’s code.

Aaron Lun (16:35:52) (in thread): > I know, but scater has been really unstable and I wanted to give people a chance to breathe.

Dan Bunis (16:36:31): > Aaron’s suggestion ultimately sounds pretty similar to my own, except with an extra release cycle at the beginning where there is no message. And without the .Defunct-ish error step 2

Hervé Pagès (16:36:44): > ok, that sounds like what @Aaron Lun is planning to do with renaming function arguments in scuttle.

Dan Bunis (16:43:12): > Just to summarize, and be sure I’m clear, this is the typical suggestion then: > 0. Leave the old input, but make a new.input with the new name that defaults to the old.input. Ultimately utilize the new.input in the code. > 1. Current devel / next release: Use if ('!default'(old.input)) {.Deprecated(...)} in code to add a warning() whenever that old.input is given a non-default value. (Essentially outputs Warning: 'old.input' is deprecated. use 'new.input' instead.) > 2. ~~Next devel / next, next release: Skip the .Defunct() step because it has no input for outputting a custom name rather than the name of the function it was called within. Instead: simply remove the old.input entirely as R will auto-generate an error message. Or if there is an input that causes R not to output such an error, swap .Deprecated() for stop("'old.input' is deprecated. use 'new.input' instead.")~~ > 3. Next ~~next~~ devel / ~~3~~ 2 releases later: Remove the old input entirely and warning code.

Hervé Pagès (16:48:29): > I’ve gone through this function-argument renaming business on a few occasions in the past. Unfortunately you need to have both arguments temporarily, so the signature of the function will look ugly. The idiom I was using in the body of the function was something like: > > if (!missing(old.argument)) { > if (!missing(new.argument)) > stop("you can only specify one of 'new.argument' or 'old.argument'") > .Deprecated(msg="argument 'old.argument' is deprecated, please use 'new.argument' instead") > new.argument <- old.argument > } > > Then, 6 months later (next devel cycle), I would just remove the old argument. You can add an extra Defunct step before the removal: > > if (!missing(old.argument)) > .Defunct(msg="argument 'old.argument' is defunct, please use 'new.argument' instead") > > if you want to give your users another extra 6 months to adapt.

Dan Bunis (16:51:22): > Awesome, thanks! (That’s pretty much what I was trying to say:smile:. Sorry for being confusing!)

Hervé Pagès (16:59:47) (in thread): > understandable

Hervé Pagès (17:01:39) (in thread): > BTW I’m with you for using dots instead of underscores in function args.

Aaron Lun (17:05:12) (in thread): > most excellent

Martin Morgan (17:41:14): > I’d take a pragmatic approach, with the goal being not to shoot the unsuspecting user in the foot. > > Mostly I’d live with my bad decisions, focusing instead on making better decisions in the future. > > If it were really important, then it seems like the right thing to do is to create a new function (with an even better name!) and deprecate the original. > > (realistically, we’re not all writing packages used by many, especially in the stages where we’re still tinkering with argument names)

Dan Bunis (18:30:54): > Oh man, that sounds drastic@Martin Morgan, but the goal there is to add protection for any user/package that doesn’t utilize the input within the deprecation period? > > I’m talking about a pretty minor customization input in a visualization function, so I don’t think that a new function is necessary as it has zero effect on any calculations that might be downstream in someone’s code. Also, the target package is new as of this release cycle. (So probably few enough users to alternatively just mash the change right into the current release lol, but I’m not going to do that.)

2020-05-01

Vince Carey (13:47:43): > Deprecation and long-term stability are important topics and I am glad to see them discussed here. There is a bit of vagueness about the concept “API” in connection with this. It would be interesting to think about adding some “memory” to BiocCheck that could warn if “previous API not respected with these changes, please revert and deprecate function f”. One might say that the API is that set of functionalities covered by tests and examples. But we know it is more than that – anything that a user/developer could possibly rely upon is the actual meaning.

Martin Morgan (16:00:16): > FWIW We’ve been talking about a BiocCheck hackathon to suggest, develop and add / remove checks from BiocCheck, so keep thinking about things that would make BiocCheck better… tentatively thinking about this for the week of May 11…

Aaron Lun (16:05:59): > *cough* no more 80 character warnings *cough*

Marcel Ramos Pérez (16:08:03): > more stringent checks for 80 char width:wink:

Hervé Pagès (16:13:43): > @Aaron Lun AFAIK it’s only a NOTE

Hervé Pagès (16:18:46): > Another highly recommended reading about coding style from the same guy who has strong opinions about C++:https://www.kernel.org/doc/html/latest/process/coding-style.html

Aaron Lun (16:30:02): > Well, when you’re talking about C, a coding style is sort of like putting lipstick on a pig.

Hervé Pagès (16:34:29): > You realize that C++ is a super-set of C right? So also a pig but a much fatter one.

Aaron Lun (16:37:34): > slander!

Hervé Pagès (16:38:51): > Clarification: I’ve nothing against pigs, even fat ones

Aaron Lun (16:39:45): > it’s more like a family of pigs

Aaron Lun (16:39:54): > and you can pick the pig that you want to use for any given project

Aaron Lun (16:40:02): > sometimes you have 2 pigs at once

Aaron Lun (16:40:39): > templating and functional programming? multiple inheritance and lambdas? Go for it!

Hervé Pagès (16:42:17): > The papa pig (a.k.a. C) syntax had many many offsprings and R is one of them. It’s a really big family of pigs these days.

Dan Bunis (17:03:31) (in thread): > Right near the top: “Tabs are 8 characters, and thus indentations are also 8 characters. There are heretic movements that try to make indentations 4 (or even 2!) characters deep, and that is akin to trying to define the value of PI to be 3.”:rolling_on_the_floor_laughing:

Robert Ivánek (19:09:02): > @Robert Ivánek has joined the channel

2020-05-04

Tim Triche (11:23:37): > it’s been too long since I’ve seen The Lion King. “you can be a big pig too! [oi]”

Hervé Pagès (18:19:30): > Am I missing something about the random number generator in R? The probability of getting a duplicate here is roughly 1e5/1e9, but I get one every time I try: > > > anyDuplicated(sample(1e9, 1e5, replace=TRUE)) > [1] 44725 > > anyDuplicated(sample(1e9, 1e5, replace=TRUE)) > [1] 32160 > ... etc... > > Maybe I should rush to the casino before my luck changes :moneybag:

Aaron Lun (18:20:11): > sounds like the birthday problem, right?

Hervé Pagès (18:22:22): > Sure. With 1e5 students in the classroom and 1 billion days on the calendar.

Aaron Lun (18:23:44): > > > 1 - exp(-(1e5)^2/2/1e9) > [1] 0.9932621 > > Using the approximation on Wikipedia.

Aaron Lun (18:24:03): > So, 99.3% probability of getting a duplicate.

Hervé Pagès (18:24:55): > ah ok, so that’s it. Need to go back to some basics in probabilities I guess. Thx!

Hervé Pagès (18:34:57): > That’s actually a really good approximation. Exact calculation: > > > 1 - exp(sum(log(1 - (0:99999)/1e9))) > [1] 0.9932628 >

2020-05-05

Tim Triche (10:46:23): > hola @Hervé Pagès I have a question about HDF5Array and DelayedArray / SummarizedExperiment objects

Tim Triche (10:47:09): > at present, if I want to share a “pickled” HDF5-backed RangedSummarizedExperiment or MultiAssayExperiment via (say) Dropbox, everything works great except I have to tell the share-ee how to relocate the backing files:

Tim Triche (10:48:33): > > # Suppose AML_MAE is composed of SummarizedExperiments "CNA", "mRNA", and "DNAme" > # Then each lives in AML_MAE@ExperimentList$name_of_SE > > # Example: AML_MAE@ExperimentList$DNAme is a SummarizedExperiment (actually a GenomicRatioSet) so let's stub that out: > DNAme <- AML_MAE@ExperimentList$DNAme > > # Then the critical bit for an assay is this: # Note: don't ever do this. It will break when BioC changes --t > > DNAme@assays$data$Beta@seed > # An object of class "HDF5ArraySeed" > # Slot "filepath": > # [1] "/home/tim/Dropbox/TARGET_AML/AAML_MERGED_EPIC/assays.h5" > # > # Slot "name": > # [1] "assay001" > # > # Slot "type": > # Error in slot(object, what) : > # no slot of name "type" for this object of class "HDF5ArraySeed" > > # In order to move that around, you have to reassign it. > > DNAme@assays$data$Beta@seed@filepath <- "/tmp/assays.h5" # or whatever # Also don't do this. > > # Clinical covariates (often updated, e.g. when St. Jude validates another fusion): > class(AML_MAE@colData) > # [1] "DataFrame" > # attr(,"package") > # [1] "S4Vectors" > > # That could just as easily live somewhere else, though: > # https://bioconductor.org/packages/release/bioc/vignettes/SQLDataFrame/inst/doc/SQLDataFrame.html >

Tim Triche (10:50:23): > @Kasper D. Hansen and @Vince Carey recoiled with a gagging noise loud enough that I could hear it from thousands of miles away when we discussed this situation this morning. Is there a recommendation for a BioC-release-invariant way to do this? I’ve written wrapper functions obviously, but the fact that I’m reaching in and fiddling with slots inside list elements inside slots suggests that… maybe this is not Best Practices(tm).

Tim Triche (10:51:23): > Maybe I should just write down the things that I do as supplementary files, and then suggest people cite them as examples of What Not To Do.

Tim Triche (10:52:23): > (I have many more. Want to learn the wrong way to compute across an HDF5 array, or pileup a bunch of BAMs in a folder? Got that covered too!)

Hervé Pagès (15:32:56): > Use saveHDF5SummarizedExperiment() to share a HDF5-backed SummarizedExperiment derivative. > I don’t see an easy way to achieve something like this with a MAE object though. > One workaround is to (1) save the individual experiments (using saveHDF5SummarizedExperiment() on SE objects and regular saveRDS() on non-SE objects), (2) tell your share-ee to load them individually (with loadHDF5SummarizedExperiment() on SE objects and readRDS() on non-SE objects), and (3) tell them to reassemble the MAE by calling the MAE constructor on the experiments they loaded in (2). The extra decorations (colData, sampleMap, etc…), if any, would also need to be individually serialized, loaded on the share-ee’s side, and passed to the MAE constructor. Sounds like the MAE package could provide something like this, e.g. via saveHDF5MultiAssayExperiment() and loadHDF5MultiAssayExperiment() functions. Pinging Marcel @Marcel Ramos Pérez? > Let’s please use the bigdata-rep channel to continue this discussion.
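A rough sketch of that workaround. It assumes HDF5Array and MultiAssayExperiment are installed, and that mae is an existing MultiAssayExperiment with an SE-derived "DNAme" experiment and a non-SE "CNA" experiment; all file and directory names are illustrative:

```r
library(HDF5Array)
library(MultiAssayExperiment)

## Sharer's side: serialize each experiment plus the decorations.
saveHDF5SummarizedExperiment(mae[["DNAme"]], dir = "share/DNAme")
saveRDS(mae[["CNA"]], "share/CNA.rds")            # non-SE experiment
saveRDS(colData(mae), "share/colData.rds")
saveRDS(sampleMap(mae), "share/sampleMap.rds")

## Share-ee's side: reload the pieces and reassemble the MAE.
mae2 <- MultiAssayExperiment(
    experiments = list(
        DNAme = loadHDF5SummarizedExperiment("share/DNAme"),
        CNA   = readRDS("share/CNA.rds")
    ),
    colData   = readRDS("share/colData.rds"),
    sampleMap = readRDS("share/sampleMap.rds")
)
```

saveHDF5SummarizedExperiment() writes both the HDF5 data and a serialized shell into dir, so the loaded object finds its backing file wherever the directory ends up — which is exactly the relocation problem described above.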

Marcel Ramos Pérez (15:50:57): > Yes, this functionality is built in to curatedTCGAData when reading methylation files and could be moved to MultiAssayExperiment

Tim Triche (17:08:07): > saveHDF5SummarizedExperiment seems to have the same issue w/r/t needing the HDF5 to be where it expects

Tim Triche (17:08:21): > AFAIK, MAE just delegates instantiation to HDF5Array (?)

Tim Triche (17:08:55): > oh I just saw the #bigdata-rep reference, heading there now

Tim Triche (17:09:15): > @Hervé Pagès use hashtags next time e.g. #bigdata-rep :wink:

2020-05-07

Sean Davis (13:06:57): > Hi, <!channel>. Is there any interest in having a developer forum call devoted to “big data” discussion? HDF5, APIs, cloud-backed resources, AnVIL/other cloud projects, R-based approaches to out-of-memory computing, language-agnostic approaches, etc. This would potentially inform the project on strengths, weaknesses, opportunities, and challenges in approaching large datasets, and help classify “big data” problems into actionable chunks.

Jianhong (14:17:28): > @Jianhong has joined the channel

Tim Triche (18:20:40): > Quite

Stephanie Hicks (22:42:19): > @Sean Davis i’m definitely interested in this, especially from the perspective of people’s experience combining various “big data” bioc packages together. For example, I have dabbled with combining rhdf5, HDF5Array, and DelayedArray and found it to be extremely powerful, but also found it frustrating to figure out how to make them work together in an optimal way.

2020-05-08

Martin Morgan (02:56:50): > Join us for ‘BiocCheck-a-thon’, a week-long virtual hackathon to improve BiocCheck and the consistency and quality of Bioconductor packages! See https://github.com/Bioconductor/BiocCheck/wiki for details; starting May 18.

Lluís Revilla (04:54:51) (in thread): > Does BiocCheck only apply to new packages? Is there a way to suggest checks for the daily builds? Sometimes, once a package is in Bioconductor, there are drastic changes that might need to be automatically checked.

Mike Smith (08:55:20) (in thread): > Thanks Martin. Andrzej is officially still in charge of BiocStyle, but I think lockdown with two children is pretty limiting on how much time he can dedicate to it. I’ll offer to review the pull request and see what he says.

Tim Triche (14:37:31) (in thread): > you got time for a weekly call on this stuff? I was thinking “Stephanie could answer that!” a lot on Wednesday

Stephanie Hicks (22:44:35) (in thread): > ha! can’t do a weekly call atm, but happy to answer Qs on slack:upside_down_face:

2020-05-09

Laurent Gatto (13:14:33) (in thread): > I would certainly be interested in this ‘big data’ topic, and would be happy to contribute some efforts on the mass spec side. Some developments are presented here (https://www.biorxiv.org/content/10.1101/2020.04.29.067868v2) and more is being developed and tested.

Tim Triche (16:12:02) (in thread): > close enough for government work. would be biweekly at worst, though

Tim Triche (16:12:11) (in thread): > (as in every other week)

2020-05-10

Sangram Keshari Sahu (09:26:50): > @Sangram Keshari Sahu has joined the channel

2020-05-11

Mike Smith (12:59:14): > <!channel> I’m thinking that it might be nice to dedicate most of this month’s developer teleconference to the #bioccheck-a-thon that will be taking place that week. The call is scheduled to fall close to the middle of that week and seems like a natural place for people to present work in progress or plan for the final two days of the hack-a-thon. It would be great to gauge whether this would be interesting to people, so if you think this would be beneficial, please reply with a :+1: and if not then :-1:

Tim Triche (14:12:08): > would be super if there were also a way to discuss how testing interacts with “big data” backends like HDF5 and GenomicFiles, or whether something could be finagled into BiocCheck to handle interaction with AnnotationHub/ExperimentHub

2020-05-12

Eli Miller (15:00:20): > @Eli Miller has joined the channel

Stuart Lee (20:31:27): > @Stuart Lee has joined the channel

2020-05-14

Johannes Rainer (02:30:14): > @Johannes Rainer has joined the channel

saskia (07:44:14): > @saskia has joined the channel

2020-05-18

Huipeng Li (11:03:50): > @Huipeng Li has joined the channel

B P Kailash (11:30:02): > @B P Kailash has joined the channel

Mikhael Manurung (15:11:12): > @Mikhael Manurung has joined the channel

2020-05-22

Matt Ritchie (08:41:24): > @Matt Ritchie has joined the channel

Shian Su (08:43:21): > @Shian Su has joined the channel

2020-05-26

Gabriele Sales (09:37:00): > @Gabriele Sales has joined the channel

2020-05-27

Logan Knecht (18:50:26): > @Logan Knecht has joined the channel

Logan Knecht (18:52:44): > Hello! A co-worker of mine mentioned that there were some people experiencing Catalina issues with R and that I might find someone with the same issues here.

Logan Knecht (18:52:54): > If this is the incorrect spot for this - please let me know!

Logan Knecht (18:53:58): > I’m installing R as an executable dependency for an electron application I’ve been working on. Previously I was using version 3.5.2 on OSX High Sierra; the issue is that I’m now trying to get this working on Catalina with 4.0.0

Logan Knecht (18:54:24): > Everything runs the same except when I run either the 3.5.2 or 4.0.0 installation I get this feedback

Logan Knecht (18:55:02): > > dyld: Library not loaded: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libR.dylib > Referenced from: /Users/lknecht/Repositories/FAUST_Nextflow_Desktop/electron_faust_nextflow_desktop/app/binaries/r/r-mac/bin/exec/R >

Logan Knecht (18:55:29): > Which is strange because I’m using an executable at /Users/lknecht/Repositories/FAUST_Nextflow_Desktop/electron_faust_nextflow_desktop/app/binaries/r/r-mac/bin/exec/R to run it

Logan Knecht (18:55:48): > So my question is - has anyone else encountered an issue like this? Was it a Catalina issue?

Sean Davis (22:07:13): > Hi, @Logan Knecht. I’d suggest asking in a broader forum such as R-SIG-Mac or stackoverflow. The developer forum here is bioconductor-focused.

2020-05-28

Logan Knecht (14:23:22): > Word word - Figured it was worth a shot

Sean Davis (15:57:45): > Keep them coming. Just wanted you to get to an answer faster….

2020-06-01

Aaron Lun (19:43:35): > Indeed.

Aaron Lun (19:44:02): > Well, Roxygen won’t say what the default is in the argument description, but in most cases it should be apparent from the usage.

Hervé Pagès (21:54:37): > just to be clear, I don’t think there is any difference between the traditional way and the Roxygen way to document an argument: you’re free to put whatever there and it’s up to you to say something about the default value or not.

Aaron Lun (22:03:03): > That is correct AFAIK.
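To make Aaron’s point concrete, here is a minimal sketch (the function and its arguments are hypothetical): Roxygen generates the usage section from the function signature, so the default value appears there automatically, but the argument description only mentions a default if you write it yourself.

```r
#' Truncate strings.
#'
#' @param x A character vector.
#' @param n Integer scalar giving the maximum length. Roxygen will not
#'   state the default here; if you want it in the description you must
#'   write, e.g., "defaults to 10" by hand.
#' @export
truncateStr <- function(x, n = 10) {
  substr(x, 1, n)
}
```

The generated usage entry, `truncateStr(x, n = 10)`, is where the default becomes apparent.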

2020-06-02

Kasper D. Hansen (12:31:46): > so what you’re saying is that Kevin is wrong

Aaron Lun (12:32:09): > Why?

Kasper D. Hansen (12:40:28): > oh, yeah, the usage section is probably inferred from the function signature

Kasper D. Hansen (12:41:18): > In “classic” Rd all we have is a check that the usage should be the same as the function signature, and this is one part of classic Rd I have always wondered why I need to write rather than have autogenerated. Perhaps the only part.

2020-06-03

Tim Triche (10:45:40): > Roxygen (and skeletor’s “make doc” target) ended my relationship with editing .Rd files forever

Aaron Lun (11:02:48): > I will say, though, that seeing if people can edit the Rd files directly is a good way to gauge someone’s skill level.

Aaron Lun (11:04:28): > A sign of true class, as it were. Like a necktie that you have to tie yourself and isn’t just connected to an elastic loop.

Tim Triche (11:16:18): > Or screen-printed onto the t-shirt

Federico Marini (11:29:39): > Everyone can pitch a tent with the new 2-second ones from Quechua:smile:

Hervé Pagès (14:26:23): > Good. So we all agree that using Roxygen is shameless cheating:smile:

Hervé Pagès (14:29:32): > @Tim Triche I like that. Let’s start printing man pages on t-shirts. Maybe people will start reading them?

2020-06-04

Daniela Cassol (13:46:18): > Hello Everyone! > Is there any specific guideline for building a package with an interactive Shiny-based graphical user interface? > I was wondering what the best policy is for the UI and server files, and the Shiny module functions of the shinyapp. Can we add these files inside the inst folder and copy them when the user launches the app? > The second type is an R script like global.R: it has no R functions, just R code that Shiny uses by convention to resolve global settings and variables. > I couldn’t find any guidelines regarding this matter; please let me know if I missed the documentation on the webpage.

Sean Davis (14:08:48): > @Daniela Cassol see https://github.com/mangothecat/shinyAppDemo for an example.

Daniela Cassol (14:11:02) (in thread): > Thank you very much!:slightly_smiling_face:

Aaron Lun (14:12:16): > See iSEE for a complex real-life example: github.com/iSEE/iSEE.

Daniela Cassol (14:17:37) (in thread): > amazing package!:slightly_smiling_face:

Federico Marini (15:25:30): > If you need a mid-size example -> https://bioconductor.org/packages/release/bioc/html/ExploreModelMatrix.html - Attachment (Bioconductor): ExploreModelMatrix > Given a sample data table and a design formula, generate an interactive application to explore the resulting design matrix.

Martin Morgan (15:52:40) (in thread): > a really nice feature of this is that all the code is in the R directory, so it is checked by R CMD check. Also, writing code this way encourages standard best practices, with stand-alone functions that can be documented and tested independently of the apparatus for presenting a shiny application.
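A minimal sketch of that layout (all names are hypothetical): the UI builder, the server, and the launcher live in R/ as ordinary functions rather than as scripts under inst/, so R CMD check sees them and they can be unit-tested.

```r
# UI builder: an ordinary function returning a Shiny UI object
appUI <- function() {
  shiny::fluidPage(
    shiny::numericInput("n", "Observations:", value = 50),
    shiny::plotOutput("hist")
  )
}

# Server: a plain function with the standard Shiny signature
appServer <- function(input, output, session) {
  output$hist <- shiny::renderPlot(hist(rnorm(input$n)))
}

#' Launch the application
#' @export
launchApp <- function(...) {
  shiny::shinyApp(ui = appUI(), server = appServer, ...)
}
```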

Daniela Cassol (21:03:35) (in thread): > Thank you very much!

Daniela Cassol (21:04:12) (in thread): > Thank you,@Martin Morgan!

2020-06-05

Leopoldo Valiente (12:39:37): > @Leopoldo Valiente has joined the channel

2020-06-06

Olagunju Abdulrahman (19:57:15): > @Olagunju Abdulrahman has joined the channel

2020-06-09

Mike Smith (16:50:57): > Cross-posting from #bigdata-rep a call for input on disk-backed data as a potential topic for our next Developers’ Forum - Attachment: Attachment > Hi <!channel>, @Sean Davis and I were thinking about focusing the next developer forum on disk-based ‘big data’ topics. Does anyone have experience with formats like TileDB, Zarr, Parquet, HDF5 etc. that they’d be willing to share? I thought it’d be cool to get some perspective on things other than HDF5, but any opinions or experiences on the topic would be great.

2020-06-11

Aaron Lun (11:07:12): > The developer forum isn’t today, right?

Nitesh Turaga (16:32:36): > @Leonardo Collado Torres Are we able to edit the indentation spacing with styler? It defaults to 2 spaces; can we make it 4? I can’t seem to find the argument / setting for this.

2020-06-12

Tim Triche (15:55:27): > I have a related question for styler

Tim Triche (15:56:04): > I’ve been using it in vim/ALE and the default tidyverse_style transformer… kind of sucks. Are there any plans from anyone to add a bioc_style transformer? (camelCase, etc.)

Tim Triche (15:56:19): > maybe that would be a good time to add the 4-spaces apostasy:wink:

Marcel Ramos Pérez (15:56:39) (in thread): > @Nitesh Turaga https://github.com/r-lib/styler/issues/331
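For reference, the workaround discussed in that issue boils down to passing indent_by to tidyverse_style() (the file path below is illustrative):

```r
library(styler)

# Restyle a whole package with 4-space indentation
style_pkg(transformers = tidyverse_style(indent_by = 4))

# Or a single file
style_file("R/foo.R", transformers = tidyverse_style(indent_by = 4))
```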

Shian Su (19:45:34): > Let’s just compromise at 3 spaces shall we?

Aaron Lun (19:45:55): > You !=4 spacers make me sick.

2020-06-13

Martin Morgan (06:13:23): > https://github.com/lcolladotor/biocthis/blob/master/R/bioc_style.R seems like a promising location to add a pull request, and to move what looks like a useful package down the line a bit @Leonardo Collado Torres – although asking for a bioc_style camelCase transformer seems paradoxical?

Nitesh Turaga (11:14:52) (in thread): > Thanks @Marcel Ramos Pérez

Tim Triche (12:47:53): > biocStyler:smile:

Tim Triche (12:49:12): > I need to write/use biocStyleR ASAP before I become a two-spacing snake_case tidyverse degenerate. Vim is like a little game where I make the yellow lights go away now

Tim Triche (12:52:35): > it doesn’t look like bioc_style goes beyond 2->4 spacing, and I’m not sure how to use it as the default handler for style_file in ALE – that said it looks like a suitable challenge so thanks for the tip

2020-06-15

Lluís Revilla (09:11:32): > I wrote a blog post about submissions to Bioconductor: https://llrs.dev/2020/06/bioconductor-submissions/ Hope it is useful to reviewers too: - File (PNG): eda-1.png

Martin Morgan (11:30:54) (in thread): > Pretty interesting read! An overall statistic you mention is “Around 50% of the ~400 yearly packages submitted are approved” but is that really ‘50% of opened issues’? My feeling is that the acceptance rate is much higher once one removes submissions that are trivially incorrect (e.g., not pointing to a git repository; opening a new issue rather than pushing to a repository that is already tracked)

Lluís Revilla (12:04:11) (in thread): > Yes, once the trivially incorrect are removed the approval rate rises above 75%

Lluís Revilla (12:04:34) (in thread): > See the figure about the approval rating of each reviewer.

Tim Triche (12:07:41) (in thread): > it turns out that this happens in lintr so no worries

Tim Triche (12:08:03) (in thread): > " lintr
> let ale_r_lintr_options="lintr::with_defaults(object_name_linter=NULL, trailing_blank_lines_linter=NULL, trailing_whitespace_linter=NULL, infix_spaces_linter=NULL)"
>
> " styler
> " let ale_r_styler_options="styler::bioc_style"

Tim Triche (12:08:40) (in thread): > if one uses object_name_linter=camelCase_linter or whatever it’s called (in the middle of 5 conversations) then that solves that.

Hervé Pagès (12:37:40) (in thread): > I was surprised to see my approval rate at about 80%. Didn’t decline a submission for years. (Last time I did was more than 10 years ago, so many years before the switch from svn to git and from the old tracker to the new one.) Although it’s not rare that people withdraw their submission or that the bot closes a submission for inactivity. But I don’t think this happens for 20% of my reviews. I don’t keep track of the numbers but my feeling is that it happens less often than that.

Lluís Revilla (12:41:22) (in thread): > Will check that, but maybe some issues are closed due to unresponsive submitter/inactivity…

Hervé Pagès (12:43:02) (in thread): > Yes it could be that some issues get closed after they got assigned a reviewer and before the reviewer actually gets a chance to start the review.

Lluís Revilla (12:44:50) (in thread): > Mmh not sure how to check that, will look more closely to the labels

Hervé Pagès (12:49:10) (in thread): > Maybe compute reviewers’ approval ratings based on issues where the assignee actually got involved. A simple criterion (even though it’s not perfect) is to consider only issues where the assignee has posted at least once.

2020-06-18

Lori Shepherd (09:26:09) (in thread): > how does it handle issues assigned to multiple people or reviews that switched reviewers?

Hervé Pagès (12:06:20): > Can’t find the link to today’s forum. Anyone?

Lori Shepherd (12:08:28): > https://bluejeans.com/114067881

Lori Shepherd (12:09:36): > although it looks like the discussion on bigdata-rep might be rescheduled for a later forum

Aaron Lun (12:24:44): > Wait what? Is it over?

Lori Shepherd (12:27:41): > mike smith was not on and the tentative presenters were not on – so we will reschedule – sorry everyone for the confusion

Tim Triche (13:31:46): > oh fudge

Tim Triche (13:32:23): > thanks @Mike Smith and @Lori Shepherd for managing this process

Lluís Revilla (18:34:32) (in thread): > When I collect the issues I don’t get the comments or the history of the issue (also, I discarded those issues with multiple reviewers). So for this analysis I had to use the photo finish. The code is at https://github.com/llrs/blogR/blob/9781cc7cedddc75467a5a469befa21057f8a8fba/content/post/2020-06-01-bioconductor-submissions.en.Rmd. I am writing a package to collect all the information of an issue. Once I manage to have all the information together I will make another post looking at comments, relabelling, time between replies, how many people commented and so on.

2020-06-19

Mike Smith (11:38:33): > Hi all, sorry for the confusion around yesterday’s devel forum - totally my fault. Life things got a bit much over the last few weeks & I completely dropped the ball on this. If it still works for people we can reschedule to next Thursday (25th June) and go ahead with the big data topics then.

Will Townes (13:36:21): > @Will Townes has joined the channel

Dirk Eddelbuettel (13:39:55): > @Dirk Eddelbuettel has joined the channel

2020-06-22

Sean Davis (09:46:24): > Adding another related topic as a possible forum item: ALTREP. From another set of discussions elsewhere, an ALTREP developer session, with Luke / Gabe as guests but also input from Jiefei / Aaron could certainly be interesting.

Sean Davis (09:47:37): > Unrelated, is there a scheduling document or a set of notes collected about dev forum topics somewhere?

2020-06-23

Mike Smith (16:01:59) (in thread): > Well originally it was this channel, but it’s become more of a general discussion for ‘dev’ topics. In the past there hasn’t been a glut of topics; if someone’s suggested something it’s pretty much been on the agenda the next week. > > I have a list of previous topics at https://docs.google.com/document/d/1ZC2hcC_ABzKV6WmAPz1CjU_IalPdlyOOVkbKYUR7fAk/edit?usp=sharing and I’d be happy for there to be a list of suggestions

Mike Smith (16:23:20): > I’d like to put at least a rough agenda together for the call on Thursday. If you’d like to share your knowledge / experience with a particular on-disk technology or have some specific questions for the discussion, can you reply here and I’ll put something together.

2020-06-24

Sean Davis (06:32:33) (in thread): > I pinned this topic list and added a bunch more topics as suggestions.

Sean Davis (06:33:44): > Live dev forum topic list: https://docs.google.com/document/d/1ZC2hcC_ABzKV6WmAPz1CjU_IalPdlyOOVkbKYUR7fAk/edit?usp=sharing - File (Google Docs): Bioconductor Developers’ Forum - Schedule & Topics

Mike Smith (10:42:15): > Thanks a lot for the suggestions@Sean Davis, there’s some great topics in there

Mike Smith (10:46:40): > A reminder that our next Developers’ Forum is scheduled for tomorrow (Thursday 25th June) at 09:00 PDT / 12:00 EDT / 18:00 CEST (Check here!) > > We will be using BlueJeans and the meeting can be joined via https://bluejeans.com/114067881 (Meeting ID: 114 067 881) > > The general topic will be technologies for accessing ‘big data’ on-disk, i.e. HDF5, TileDB, Zarr, etc. If you’d like to share your knowledge / experience with a particular on-disk technology or have some specific questions for the discussion, please reply here and I’ll try to put a little structure to our discussion.

Aaron Wolen (11:11:28): > @Aaron Wolen has joined the channel

2020-06-25

Hervé Pagès (13:09:34): > Very interesting TileDB/tiledb/TileDBArray introductions by @Stavros Papadopoulos, @Dirk Eddelbuettel and @Aaron Lun. I enjoyed it very much. Thanks @Mike Smith for organizing these meetings!

Stavros Papadopoulos (13:10:53): > @Stavros Papadopoulos has joined the channel

Mike Smith (13:11:06): > Here’s a link to the Github repo if you want to check out TileDB-R: https://github.com/TileDB-Inc/TileDB-R

Stavros Papadopoulos (13:12:07): > Thanks @Mike Smith for organizing this and thanks everyone for attending! We are happy to answer any questions.

Dirk Eddelbuettel (13:20:27): > Oh, and forgot to mention that I also keep a few informal Docker containers around. @Aaron Lun uses one to run CI for his very nice package – I tagged it ‘bioc’ and it has what he needs baked in, incl. our package:

Dirk Eddelbuettel (13:20:51): - File (Shell): Untitled

Dirk Eddelbuettel (13:21:44): > If you use that with the usual -v $PWD:/somedir -w /somedir options you can read/write some local TileDB data to play around.

Dirk Eddelbuettel (13:23:37): > With that, big thanks to @Mike Smith for hosting and MCing!

2020-06-28

Lorena Pantano (14:40:50): > quick question: the web page in bioc shows my package broken: https://bioconductor.org/packages/release/bioc/html/DEGreport.html but then I see all is alright: http://bioconductor.org/checkResults/release/bioc-LATEST/DEGreport/. Can I trust the latter saying all is good? Thanks! - Attachment (Bioconductor): DEGreport > Creation of a HTML report of differential expression analyses of count data. It integrates some of the code mentioned in DESeq2 and edgeR vignettes, and report a ranked list of genes according to the fold changes mean and variability for each selected gene.

Lori Shepherd (21:05:20): > Trust the build report as long as the time stamp is updated for the current day. The landing pages have a delay before regenerating. I’ll be sure to double-check the generation tomorrow though, to be sure the scripts are still running. Cheers

2020-06-30

Davide Risso (02:29:56): > Is there a recording of the last dev forum meeting? I couldn’t attend but I’m very interested

Mike Smith (04:28:46) (in thread): > I’m afraid not. The BlueJeans storage space was full, and we only found that out when we hit record in the session. It’s been cleaned up now, but unfortunately there wasn’t a chance to do that mid-meeting.

Davide Risso (04:53:47) (in thread): > Ok, np! Hopefully next time I’ll be able to join

Will Townes (11:09:57) (in thread): > For me the takeaway is TileDB is a really exciting tool for big data. I didn’t fully understand this part, but there was some mention of limitations in a high-performance computing environment, something about the Lustre file system not always playing nicely with it; perhaps others can elaborate.

Michael Love (21:13:02): > @Michael Love has joined the channel

Michael Love (21:16:59) (in thread): > this was a really interesting read:slightly_smiling_face:

2020-07-01

Mike Smith (05:01:31) (in thread): > My understanding is that this stems from the fact that the Lustre file system stores file metadata separately from the files themselves. If you have lots of small files on a Lustre system there’s a large overhead accessing and updating that metadata in addition to actually obtaining the data. This is normally via some relatively slow network connection, and this then becomes the bottleneck in accessing the data. Because a TileDB file is actually a bunch of separate files, one for each chunk, it can suffer heavily from this issue, and a monolithic file format like HDF5 might be more appropriate for this scenario.

Lluís Revilla (05:40:19) (in thread): > Thanks all for the feedback!

2020-07-02

Aedin Culhane (12:43:13): > this, https://builder.r-hub.io/about.html, was supported by the R Consortium. It runs R CMD check on a package on different systems on request. Would it be useful to have a website that runs BiocCheck given a GitHub URL? - Attachment (builder.r-hub.io): R-hub package builder > R package builder, by the R-hub project of the R Consortium, easing the R package development process.

Tim Triche (14:49:06): > yes

Sean Davis (14:51:53): > Note that one can likely get this effect on GitHub already using GitHub Actions and running BiocCheck as an add-on to a workflow like this one: https://github.com/tidyverse/dplyr/blob/master/.github/workflows/R-CMD-check.yaml

Tim Triche (14:57:58): > thanks Sean! that’s really helpful to know!

Sean Davis (15:26:42): > Here is the workflow that seems to be working (more or less) for Bioc2020 Workshop contributors; if you have similar ones, consider sharing them here. It generates a pkgdown website, runs R CMD check, and builds a docker container containing the dependencies and the software package itself. https://github.com/seandavi/BuildABiocWorkshop2020 > * https://github.com/seandavi/BuildABiocWorkshop2020/blob/master/.github/workflows/basic_checks.yaml
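A BiocCheck step along the lines Sean mentions might look roughly like this (a hedged sketch; the step is illustrative and not taken from either linked workflow):

```yaml
# Append to the steps of an R-CMD-check job; assumes BiocManager is available.
- name: Run BiocCheck
  run: |
    BiocManager::install("BiocCheck")
    BiocCheck::BiocCheck(".")
  shell: Rscript {0}
```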

Dirk Eddelbuettel (15:42:05) (in thread): > Nice, I should read that closely. I sometimes do it the other way around and have a directory docker/ in the source repo to create the Docker container (via hub.docker.com) I use for CI. Added bonus: portable. Users can debug with the container wherever they please and nobody is tied to the CI environment of the week^Hyear.

Sean Davis (21:17:58) (in thread): > The docker container built by the GH actions above gets deposited to hub.docker.com, for exactly the reasons you give. Folks can run the entire environment with only docker (or using google compute engine [https://gist.github.com/seandavi/5da4a73d94bc24236cf204196feddc85], for instance).

2020-07-03

Sean Davis (07:24:44): > https://github.com/r-lib/actions

Mike Smith (07:30:36): > This seems like it’d be a perfect topic for a devel call - I seem to remember @Leonardo Collado Torres, @Charlotte Soneson & @FelixErnst discussing GitHub Actions based CI in here previously. > > Edit: I think the previous discussion starts here: https://community-bioc.slack.com/archives/CLUJWDQF4/p1587164112094800

Mike Smith (07:32:33): > Would anyone be willing to talk through their setup? I know my approach has been to blindly copy-paste other people’s workflows, and then push increasingly irritated commits when it doesn’t run for me. I think it’d be great to hear someone talk about the various steps.

Federico Marini (08:00:28) (in thread): > I’d also suggest another master of GHA settings, @Kevin Rue-Albrecht

Federico Marini (08:00:45) (in thread): > I think he makes coffee upon pushing to GH now:slightly_smiling_face:

Kevin Rue-Albrecht (08:02:06) (in thread): > haha - happy to share thoughts, not sure about leading the session though, I also copy paste a lot:wink:

Charlotte Soneson (08:04:26) (in thread): > also happy to contribute from my experiences

Kevin Rue-Albrecht (08:04:53) (in thread): > though that sounds like an excellent opportunity to exchange ideas with @Leonardo Collado Torres: there have been recent troubles even with just the checkout action, which affect later deployment to GitHub Pages. I think Leo took up the battle with checkout@v2 while I’ve established a stable way to continue with checkout@v1 until the smoke clears

Kevin Rue-Albrecht (08:44:15) (in thread): > while i’m on it, I also set up a (simple) workflow to compile markdown (xaringan) slides to PDF and HTML, that might be of interest to anyone giving tutorials and lectures

Sean Davis (10:46:48): > I think we can keep the bar low for anyone’s time. I can do some slides and a walkthrough–maybe 20-25 minutes just to do some level-setting. After that, open up for others to share and discuss. Does that work for folks?@Mike Smith?

Sean Davis (10:47:22): > I’ll use the BuildABiocWorkshop as the use case.

Mike Smith (11:33:06): > That sounds great to me. I like to think the audience for these meetings are enthusiastic about the topic, but not necessarily already heavily invested, so an intro to continuous integration and what Github actions brings to the table would be perfect - followed by finding out how we as BioC developers can/are making use of it. For example, I think pkgdown stuff is very cool, but it’s not a topic I’ve ever looked into or tried to use myself.

Tim Triche (12:03:39): > this looks handy too (from Jim Hester’s actions repo): https://ropenscilabs.github.io/actions_sandbox/ - Attachment (ropenscilabs.github.io): Github actions with R > An introduction to using github actions with R.

2020-07-04

Umar Ahmad (08:20:28): > @Umar Ahmad has joined the channel

2020-07-07

Leonardo Collado Torres (13:38:01) (in thread): > Hi@Mike Smith. When are you thinking about having this call? I’ll be out July 10-19th for vacations, but I’d be happy to talk about the GHA stuff I’ve done along with others

Leonardo Collado Torres (13:38:53) (in thread): > on the Google Calendar the next dev forum is July 16th, which I would be unavailable for

Mike Smith (16:24:38) (in thread): > Yep, the plan is to go ahead with the 16th; otherwise it runs a bit close to Bioc2020. That’s a real shame you won’t be able to make it, although I’m sure it’s a topic that will come up frequently here. I’m also on vacation and will be missing this one too.

Mike Smith (16:33:11): > The next Bioconductor Developers’ Forum is scheduled for Thursday 16th July at 09:00 PDT / 12:00 EDT / 18:00 CEST - You can find a calendar invite at https://bit.ly/3gCsFXO We will be using BlueJeans and the meeting can be joined via https://bluejeans.com/114067881 (Meeting ID: 114 067 881) > > This month we’ll be focusing on continuous integration with GitHub Actions, led by @Sean Davis, and discussing how we can benefit from GHA across a wide variety of development tasks, including cross-platform testing of code and website deployment.

Mike Smith (16:33:18): > I’m on vacation for the meeting this month, so@Lori Shepherdhas kindly agreed to host, and will almost certainly make you wish I was away more often!

Leonardo Collado Torres (18:09:45) (in thread): > sure no problem and enjoy your vacations too ^^

2020-07-08

Pariksheet Nanda (06:44:16): > @Pariksheet Nanda has joined the channel

David Zhang (14:43:21): > @David Zhang has joined the channel

Leonardo Collado Torres (15:57:09) (in thread): > I invited several LIBD folks like @Nick Eagles to attend:smiley:So hopefully we won’t miss out on the fun hehe

2020-07-12

USLACKBOT (15:47:15): > This message was deleted.

Martin Morgan (16:03:40) (in thread): > There was some discussion on this thread https://stat.ethz.ch/pipermail/r-package-devel/2020q2/005490.html where for instance my then-current view that software package authorship was analogous to manuscript authorship (you’d cite or acknowledge someone who influenced your thinking, but not give them authorship) was critiqued based on the interpretation, I think, of the Authors: field as a statement of copyright https://stat.ethz.ch/pipermail/r-package-devel/2020q2/005498.html. To me this is still strange – for instance, and I guess I’m still stuck in the academic mode, I would not want to be the ‘author’ of a package that used my ideas to implement obviously incorrect software, especially not without my permission. But the idea that the Authors field is a necessary acknowledgement / assertion of copyright seemed to be the consensus of the thread. There is also some discussion of the relatively cryptic meaning of aut / ctb / cre in https://stat.ethz.ch/pipermail/r-package-devel/2020q2/005507.html. > > I don’t think publication status influences authorship…

Shubham Gupta (16:26:23): > Trying to understand how save works with the compress and version arguments.
> Q1: What is the difference between version = 3 and version = 2?
> > x <- list(1:10000)
> > pryr::object_size(x) # 40.1 kB
> > save(x, file = "x.rda", version = 2, compress = TRUE) # Filesize = 20.7 kB
> > save(x, file = "x.rda", version = 3, compress = TRUE) # Filesize = 118 bytes
> > save(x, file = "x.rda", version = 2, compress = FALSE) # Filesize = 40.1 kB
> > save(x, file = "x.rda", version = 3, compress = FALSE) # Filesize = 40.1 bytes
> Why is the file size so small with version = 3? What kind of compression is used in this case? Update: if I add some noise, then versions 2 and 3 behave the same. Must be something to do with the integer sequence.
> Q2: In the above case the file size is the same as I see with pryr::object_size(). I have developed a package where I save another object, a list of lists of data frames. In this case the file size doesn’t match object_size():
> > library(DIAlignR)
> > data(XIC_QFNNTDIVLLEDFQK_3_DIAlignR)
> > xic <- XIC_QFNNTDIVLLEDFQK_3_DIAlignR
> > pryr::object_size(xic) # 61.6 kB
> > save(xic, file = "xic.rda", version = 3, compress = FALSE) # 53.4 kB
> I don’t understand why these numbers are different.

Dirk Eddelbuettel (17:35:33): > Q1: ALTREP is the main reason for v2 versus v3, and it handles things like sequences really well by declaring them as a sequence. If you try a random vector there should be less difference.

Dirk Eddelbuettel (17:37:02): > Q2: Just a hunch: the SEXP over-allocates and pryr may report the total size. When saving, only the actual data may be saved. (That’s a guess…)

Shubham Gupta (17:57:50): > Makes sense. Thanks Dirk:slightly_smiling_face:
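For anyone curious, the ALTREP effect Dirk describes can be seen directly (a sketch; exact sizes depend on the R version, and version = 3 requires R >= 3.5): a compact sequence is serialized as its start/length description, while a materialized vector of the same values is written out in full.

```r
x <- list(1:10000)        # ALTREP compact sequence
y <- list(sample(10000))  # materialized integer vector, same values

save(x, file = "x.rda", version = 3, compress = FALSE)  # tiny
save(y, file = "y.rda", version = 3, compress = FALSE)  # roughly full size

file.size("x.rda")
file.size("y.rda")
```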

Martin Morgan (19:41:23) (in thread): > Kind of a nice complement to @Pariksheet Nanda’s question in #general https://community-bioc.slack.com/archives/C35G93GJH/p1594560378113400 about whether it’s worthwhile to get a package in Bioconductor, compared to distribution via github or other; about 1/2 your citations seem also to mention Bioconductor (the first one I looked at mentioned DESeq2 and EnhancedVolcano in more-or-less the same sentence…) - Attachment: Attachment > Hi folks, I’m making the case for improving the quality of 3 GitHub packages developed by collaborators to be brought into Bioconductor, and was hoping to have citations to justify several months of PhD work for why it’s worth having things like tests, CI, more frugal memory management, better integration with Bioconductor infrastructure instead of calling ad-hoc command line tools (!), and improved maintenance from a larger developer community. Can anyone point me towards a citable resource talking about the benefits of having packages in Bioconductor that would convince the average non-programmer? I suspect there’s a broader, standard paper I should cite, in which case it would be nice to know what “catch phrases” are convincing over simply sharing code on GitHub, etc.

Pariksheet Nanda (23:06:22) (in thread): > Regarding greater scrutiny and quality control, the others can correct my conjecture here; from what I’ve seen stalking the new-package GitHub reviews, the Bioconductor team seems to follow ideas of the Zone of Proximal Development, where the reviewers, instead of enforcing some unreasonably high bar, will account for how experienced the author is and help the author improve their work. I’ve read at least one review where the author was asking how they did, and the reply was that it’s fine if things are less than perfect for their first package and that their next package will be better. So I think most authors who make the effort to read and follow the guidelines will do well. I’ve only ever seen one rejection, where the author used data.frame for all data manipulation of genomic data and used no Bioconductor objects at all:face_palm:and was told as such, and still appealed the rejection on the mailing list.

2020-07-13

Mike Smith (08:47:43): > <!channel> Just a reminder that the next Developers’ Forum discussing GitHub Actions will take place this coming Thursday 16th July at 09:00 PDT / 12:00 EDT / 18:00 CEST - You can find a calendar invite at https://bit.ly/3gCsFXO

Pariksheet Nanda (10:15:43) (in thread): > Yes! I have had stale pull requests on GitHub projects because it wasn’t clear that upstream was dead / uninterested.

Genevieve Stein-O’Brien (10:15:46): > @Genevieve Stein-O’Brien has joined the channel

Robert Castelo (14:01:28): > @Martin Morgan moving this thread to developers-forum. Is there some automatic way of getting the packages considered to be core infrastructure? (I thought grepping for maintainer in the Maintainer field would work but for instance Biostrings has Hervé as maintainer.) - Attachment: Attachment > That’s a great question, and I look forward to the citable answers. > > There are many anecdotal stories, of course, including the ticker of citations mentioning Bioconductor at http://bioconductor.org/help/publications/ where there are several citations to Bioconductor (usually as the source for a package) per week. Also following the link ‘PubMed Central’ at the top right of the page tells us that there are 38285 full text citations to ‘bioconductor’ in the scientific literature (a fraction of these are false positives). The annual reports http://bioconductor.org/about/annual-reports/ used to cite, pre- about 2010 (e.g., section 6.1), citations to individual packages, but this stopped when it became problematic [for me] to confidently claim citations to papers that, e.g., introduce a package but also significant statistical innovation, as a citation to the package. > > The download stats for Bioconductor packages are available http://bioconductor.org/packages/stats/ and it would be very interesting to ask about downloads of packages that make use of core infrastructure (based on dependencies, for instance…) versus those that do not. This seems like a nice graph theory problem, too, removing packages whose downloads are because they implement core infrastructure. A nice problem for @Robert Castelo and other? 
> > It would be tempting to add to the narrative the success of individual Bioconductor packages, and the role this success had on the academic careers of the authors of those packages (there are more than a few examples of this) but it’s hard to know how to specify the control group, e.g., Seurat is obviously successful without being in Bioconductor…

Martin Morgan (14:57:59) (in thread): > Probably it is ‘maintainers@’ plus the usual suspects, unfortunately….

2020-07-14

Levi Waldron (04:19:46) (in thread): > I recently learned that having layers of trustworthiness of packages, including a defined set of “core” packages, matters to open-source software validation for regulatory purposes. For example, see https://www.pharmar.org/white-paper/, which defines layers of baseline trustworthiness of R packages according to “base”, “recommended”, and “contributed”, a subset of the latter including “popular”. I’m not sure how to accomplish the same in Bioconductor (BiocViews, shell “base” / “recommended” / “popular” packages existing only for their Imports field, a simple updatable list?) but it seems worth considering. - File (PNG): image.png

Vince Carey (08:17:16) (in thread): > Nice catch @Levi Waldron. One of the specific aims of the bioc resubmission is “Enhance reliability and performance of core genome analysis infrastructure components through improved formal testing disciplines and modernization of continuous integration/continuous delivery methods of the project.” Let’s do this!

Vince Carey (08:25:08) (in thread): > Most depended upon, moderately depended upon: - File (PNG): depanal.png

Vince Carey (08:33:28) (in thread): > Here’s the covr::package_coverage result for Biobase
> > package_coverage()
> Biobase Coverage: 55.16%
> R/anyMissing.R: 0.00%
> R/methods-container.R: 0.00%
> R/methods-MIAxE.R: 0.00%
> R/methods-ScalarObject.R: 0.00%
> R/packages.R: 0.00%
> R/strings.R: 0.00%
> src/anyMissing.c: 0.00%
> src/matchpt.c: 0.00%
> R/methods-aggregator.R: 4.17%
> R/vignettes.R: 6.76%
> R/tools.R: 11.28%
> R/methods-MIAME.R: 22.94%
> R/environment.R: 28.33%
> src/envir.c: 31.51%
> R/rowOp-methods.R: 43.48%
> R/methods-VersionsNull.R: 50.00%
> R/methods-AnnotatedDataFrame.R: 52.60%
> R/methods-ExpressionSet.R: 59.15%
> src/rowMedians.c: 64.29%
> src/rowMedians_TYPE-template.h: 69.33%
> R/AllGenerics.R: 70.33%
> R/zzz.R: 72.73%
> R/updateObjectTo.R: 80.00%
> R/methods-eSet.R: 80.26%
> src/sublist_extract.c: 81.89%
> R/VersionsClass.R: 83.08%
> R/methods-AssayData.R: 85.05%
> R/methods-VersionedClass.R: 88.37%
> R/methods-NChannelSet.R: 96.15%
> R/methods-MultiSet.R: 100.00%
> R/methods-SnpSet.R: 100.00%
> src/Rinit.c: 100.00%
> which shows variability, but also indicates different approaches to source-file naming – we have best practices recommendations for this. I regret to say that I am quite non-compliant in that domain. In any event, improving test coverage might be a basis for a hackathon-like event in the future?

Michael Love (11:32:10) (in thread): > I thought the coverage badge was motivating…

Michael Love (11:32:36) (in thread): > I ended up adding loads of unit tests, and found some corner cases, and trimmed old code

Robert Castelo (12:01:14) (in thread): > @Martin Morgan I’ve done a little research into the relationship between downloads (median monthly downloads in the last 12 months) and the number of dependencies on “core infrastructure packages”. You can find a gist with the code here and I’m attaching an RDS file with the resulting data. As you can see in the attached image, there’s no difference in downloads between having or not having dependencies on “core infrastructure packages”. I’ve tried to be very inclusive in the definition of “core infrastructure package” and I’m sure this can be done more accurately in a number of ways. - File (PNG): downloadsbydeps.png - File (Gzip): downloadsbydeps.rds

Henrik Bengtsson (23:35:10): > On a side note: Maybe it’s time to deprecate those old R/anyMissing.R + src/anyMissing.c and src/rowMedians*.c functions in Biobase. anyMissing() is not needed because anyNA() has existed since R 3.1.0 (2014-04-10). The rowQ() -> rowRanks(), rowMedians(), ... functions moved to matrixStats in 2007 and have since undergone memory and speed improvements and have a solid set of test cases.
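Henrik’s suggested replacements can be sanity-checked with base R alone; matrixStats::rowMedians() would be the fast drop-in for the compiled rowMedians(), and the apply() call below is just a slow reference implementation, not the matrixStats code:

```r
# anyNA() (base R >= 3.1.0) covers what Biobase::anyMissing() did;
# apply() gives a base-R reference for row medians, standing in for
# matrixStats::rowMedians().
m <- matrix(c(1, NA, 3, 4, 5, 6), nrow = 2)

anyNA(m)                            # TRUE
apply(m, 1, median, na.rm = TRUE)   # 3 5
```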

2020-07-15

Henrik Bengtsson (02:51:55) (in thread): > @Kevin Blighe, packages that fail to build or do not pass checks are “removed” (=archived) from CRAN if maintainers don’t address the issues in time. I find CRAN to be stricter than Bioconductor when it comes to check requirements.

Martin Morgan (12:31:30) (in thread): > I think Kevin hits the pluses and Henrik the minuses; I think Bioconductor should move toward enforcing --as-cran (maybe there are very narrow exceptions), and more-or-less automatically deprecating packages that ERROR in devel after x builds; I think we’ve been too conservative in trying to persist packages after their maintainers have lost interest. We’re working toward a more persistent view on ‘build reports’ so that it’s easy to see that things have been going wrong for too long…

Spencer Nystrom (14:35:07): > @Spencer Nystrom has joined the channel

2020-07-16

Pedro Baldoni (04:10:26): > @Pedro Baldoni has joined the channel

Lori Shepherd (08:59:22): > Just a reminder: > The next Bioconductor Developers’ Forum is scheduled for today Thursday 16th July at 09:00 PDT / 12:00 EDT / 18:00 CEST - You can find a calendar invite at https://bit.ly/3gCsFXO We will be using BlueJeans and the meeting can be joined via: https://bluejeans.com/114067881 (Meeting ID: 114 067 881) > > This month we’ll be focusing on continuous integration with Github Actions, led by @Sean Davis, and discussing how we can benefit from GA across a wide variety of development tasks, including cross-platform testing of code and website deployment.

Lori Shepherd (11:59:21): > Reminder: Starting momentarily

Kevin Rue-Albrecht (12:06:19): > @Aaron Lun multi-tasking - File (PNG): image.png

Ludwig Geistlinger (12:12:35) (in thread): > well, sometimes there are important things to do …

Kevin Rue-Albrecht (12:13:24) (in thread): > meetings: the one thing that motivates us enough to clean the house instead :stuck_out_tongue:

Tim Triche (12:17:53): > is there a repo that accompanies Sean’s presentation? I missed it

Nitesh Turaga (12:18:24): > No, but we can ask if he would like to publish it. The talk is recorded though.

Charlotte Soneson (12:19:53): > https://github.com/seandavi/BuildABiocWorkshop2020

Will Townes (13:55:47): > R package question: if I want to link to the help file of a function from another package in the “see also” field of a function’s documentation, but I don’t want to require users of my package to install the other package, does that package have to appear under “suggests” in the DESCRIPTION? Eg, in roxygen
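For reference, the roxygen form Will is asking about might look like the sketch below; fit_toy is a made-up function and limma::lmFit is just an illustrative link target, but a cross-package \link like this is exactly what raises the Suggests question:

```r
#' Toy wrapper (illustrative only)
#'
#' @param x a numeric vector
#' @seealso \code{\link[limma]{lmFit}} in the \pkg{limma} package -
#'   a cross-package link like this is what typically obliges listing
#'   the target package under Suggests in DESCRIPTION.
#' @export
fit_toy <- function(x) mean(x)
```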

Henrik Bengtsson (14:11:02) (in thread): > Correct, list those packages under Suggests:

Dirk Eddelbuettel (14:11:12) (in thread): > That is actually also something that bugs me. If one adds the link but does not have a Suggests entry etc then R CMD check dumps at least a not-very-elegant line about ‘pkg missing for link’ (or some such). Plus, there was recent chatter on r-devel (or was it r-package-devel?) about how this changed. I used to use this feature; I am torn whether I should just avoid it.

Will Townes (14:15:21) (in thread): > thanks for replies. A wrinkle: my package is for CRAN but the other package is on bioconductor. So, if I suggest a bioconductor package it won’t automatically be installed by install.packages(). So then when the user clicks on the link it will be a dead URL. I’m thinking of just using instead since this will tell the user about the package without my having to put it in suggests and cause build errors or dead URLs in the docs.

Daniela Cassol (15:51:06): > Hello everyone! > Is there an automatic way to find how many Bioc packages depend on a specific CRAN package?

Martin Morgan (16:04:47): > > db_bioc = available.packages(repos = BiocManager::repositories()[1:4]) > db = available.packages(repos = BiocManager::repositories()) > deps = tools::package_dependencies("dplyr", db, reverse = TRUE, recursive = TRUE) > sum ( unique(unlist(deps)) %in% db_bioc[, "Package"] ) > > which gives 865 for the devel branch. recursive = TRUE means that both direct and indirect dependencies are being counted.
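Martin’s snippet needs network access to the repositories; the effect of recursive = TRUE can also be seen offline with a toy matrix shaped like available.packages() output (the pkgA/pkgB/pkgC names are made up):

```r
# pkgA imports pkgB, and pkgB imports pkgC, so the reverse dependencies
# of pkgC are pkgB directly, plus pkgA only when recursive = TRUE.
db <- rbind(
  c(Package = "pkgA", Depends = NA, Imports = "pkgB", LinkingTo = NA),
  c(Package = "pkgB", Depends = NA, Imports = "pkgC", LinkingTo = NA),
  c(Package = "pkgC", Depends = NA, Imports = NA,     LinkingTo = NA)
)

tools::package_dependencies("pkgC", db, reverse = TRUE)$pkgC
# direct only: "pkgB"
tools::package_dependencies("pkgC", db, reverse = TRUE, recursive = TRUE)$pkgC
# direct and indirect: "pkgA" and "pkgB"
```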

Daniela Cassol (16:22:51) (in thread): > Thank you very much! Very helpful!:slightly_smiling_face:

Sean Davis (17:08:15) (in thread): > See response from Charlotte below.

Dirk Eddelbuettel (21:11:25): > Is the recording for the @Sean Davis talk up? I was tied up with some other things…

Dirk Eddelbuettel (21:12:41) (in thread): > (This misses a closing ) in the last line.) > > I have a shorter two-liner scribbled away on my hard-drive based on something by @Martin Morgan from ~2018:

Dirk Eddelbuettel (21:13:17) (in thread): - File (R): Untitled

Dirk Eddelbuettel (21:13:45) (in thread): > The current one is likely better due to more use of BiocManager()

Kayla Interdonato (22:10:14): > It will be available tomorrow morning (EST time), sorry for the delay.

2020-07-17

Kayla Interdonato (10:34:02): > The recording for yesterday’s developers forum is now available on YouTube (https://www.youtube.com/watch?v=-OjwMal80KY) as well as under Courses on the Bioconductor website. - Attachment (YouTube): Developers Forum 12

Tim Triche (10:38:00) (in thread): > thanks both!

Hervé Pagès (11:39:26) (in thread): > Just to clarify: like install.packages(), BiocManager::install() doesn’t install suggested packages by default either. So you’d have the dead URL problem even if both packages were on Bioconductor (or on CRAN).

Hervé Pagès (11:49:36) (in thread): > Most importantly: they don’t do the same thing. 2 notable differences: (1) the current one counts direct and indirect reverse deps, not just the direct ones, and (2) it looks at all Bioconductor repos, not just software. Hence the 2 code snippets give very different results.

2020-07-20

Dr Awala Fortune O. (02:14:47): > @Dr Awala Fortune O. has joined the channel

Dr Awala Fortune O. (02:15:45): > Hello happy meet you

2020-07-21

Carl McIntosh (12:49:22): > @Carl McIntosh has joined the channel

2020-07-29

Riyue Sunny Bao (17:39:08): > @Riyue Sunny Bao has joined the channel

2020-07-30

Malte Thodberg (08:53:10): > @Malte Thodberg has joined the channel

Paula Beati (08:54:35): > @Paula Beati has joined the channel

Simina Boca (11:48:58): > @Simina Boca has joined the channel

Ayush Raman (12:43:14): > @Ayush Raman has joined the channel

Rene Welch (14:09:51): > @Rene Welch has joined the channel

2020-07-31

CristinaChe (18:01:29): > @CristinaChe has joined the channel

2020-08-04

Lambda Moses (00:27:14): > @Lambda Moses has joined the channel

2020-08-07

Mikhail Dozmorov (20:03:15): > @Mikhail Dozmorov has joined the channel

2020-08-08

RGentleman (00:01:57): > @RGentleman has joined the channel

2020-08-17

Stuart Lee (03:32:43): > Curious to hear people’s thoughts on this recent article in PLoS Comp Bio (h/t @Shian Su): Ten Simple Rules for Developing Usable Software in Computational Biology

Dan Bunis (03:46:47): > There was actually a Birds of a Feather session at BioC2020 on exactly this. I think the recording should end up on YouTube soon (the first batch of recordings went up today), and you can check out #bioc2020-bof-10sr for discussion from after the session.

Stuart Lee (07:43:55): > ah cool, that’s good to know!

Kasper D. Hansen (17:26:29): > It seems to be from 2017.

Kasper D. Hansen (17:27:21): > Anyway, these are sound things to think about, but also very general.

Kasper D. Hansen (17:27:48): > Not sure there is anything compbio-specific in this advice

2020-08-18

Spencer Nystrom (17:28:09): > Quick question regarding package submission. I have a dependency for a package that isn’t on CRAN yet, but I will submit to CRAN soon. Is it alright to go ahead and submit my bioc package for review and to pend acceptance on the dependency?

Martin Morgan (17:43:31): > no, the build system won’t install your non-CRAN / Bioc package, so your package won’t pass build / check.

Spencer Nystrom (18:09:42): > That’s what I thought, but wanted to confirm, thanks Martin!

2020-08-19

Mike Smith (09:03:56): > Sorry for the short notice, the next Developers Forum will take place this coming Thursday 20th August at 09:00 PDT / 12:00 EDT / 18:00 CEST - You can find a calendar invite at https://bit.ly/3iUidMb We will be using BlueJeans and the meeting can be joined via: https://bluejeans.com/114067881 (Meeting ID: 114 067 881) > * @Aaron Lun is going to discuss the steps required to wrangle compilation of the OSCA book into the Bioconductor build system - it’s much more than just pressing ‘knit’ in RStudio! > * I will present a prototype interface for browsing the Bioconductor git repository and rapid searching of the entire codebase across all packages. > Please let me know if you’d like anything else added to the agenda.

Stephanie Hicks (09:21:11): > @Mike Smith — just confirming, this will be recorded? I can’t make it live, but would greatly appreciate being able to watch afterwards

Mike Smith (09:29:46) (in thread): > It should be!  The only time we didn’t manage to record was the first time, when we learnt you could run out of Bluejeans storage space - that shouldn’t happen again. @Kayla Interdonato is great at getting them on Youtube in short order after we’re done, and the link is usually posted in here too.

Stephanie Hicks (09:30:03): > awesome, thanks!

Leonardo Collado Torres (10:07:03): > thx! I won’t be able to join live either

Stephanie Hicks (10:21:37) (in thread): > yes

Mike Smith (10:25:03) (in thread): > Yes, absolutely. My hope is that they’re perceived as an open, friendly meeting for anyone who’s interested in the software development side of BioC to join in conversation and learn new things. There are no prerequisites to joining, and people are welcome to be as involved or passive as they like.

Simina Boca (10:28:20): > Is it a 1 hour meeting?

Simina Boca (10:28:34): > I will try to join regardless but I have a meeting after 1 hour

Mike Smith (10:30:30): > Yes, we try to stick to 1 hour maximum. They occasionally run over, but we’ll never start a new topic after 1 hour, and you won’t be the only person leaving after an hour! We’ll also leave the recording going until the end in case there’s anything you miss.

2020-08-20

Mike Smith (11:04:22): > Here are some slides for today’s call: https://docs.google.com/presentation/d/1kjApFvGmSN39RImfGQZAeOGf4oHuRq8YGbCXK8yoTzQ/edit?usp=sharing - File (Google Slides): BiocCodeTools-DevelForum

Leonardo Collado Torres (11:13:57): > hehe, more RSS sources: http://bioc-code-tools.msmith.de/gitlist/recount/master/rss/. Can you get an RSS for a query?

Leonardo Collado Torres (11:14:34): > currently I use IFTTT for some queries

Mike Smith (11:17:49) (in thread): > You have answered my own question of “does anyone use RSS in 2020?”! What do you mean by “query”?

Leonardo Collado Torres (11:50:48): > how can you search, say, .size_factors? ".size_factors" still returns many results that are size_factors. Just an example of a search that gets a bit complicated

Leonardo Collado Torres (11:52:15) (in thread): > I have a query for “recount” and another one for “derfinder” using the BioC RSS to learn if anyone is making changes related to my packages. This is how I can easily notice when BioC core edits my packages

Mike Smith (11:54:51): > Is this in the code search tool? It should take regular expressions, so I think \.size_factors would get you that. Admittedly it doesn’t find any hits, and I don’t know if that’s a false negative or a true negative.

Mike Smith (11:56:16): > \.listMarts finds me only that and not listMarts inside the biomaRt source.
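The distinction Mike and Leo are circling is just regex escaping of the leading dot, which can be checked with plain grepl() (the two code lines below are made-up examples):

```r
code <- c("x <- .size_factors(y)",   # the dotted name
          "sf <- size_factors(y)")   # the plain name

grepl("size_factors", code)      # TRUE TRUE  - the unescaped dot matches both
grepl("\\.size_factors", code)   # TRUE FALSE - the escaped dot is a literal "."
```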

Mike Smith (11:57:45) (in thread): > Short answer is I don’t know, but this is just a mirror of the Bioc repo, so if there’s already an official RSS feed the new one is redundant.

Nitesh Turaga (11:59:22): > Anyone have a link to this session?

Mike Smith (11:59:34): > https://bluejeans.com/114067881

Leonardo Collado Torres (13:39:08) (in thread): > well, IFTTT is the one that actually enables the query, and you can feed to it any RSS. Right now I’m using the official BioC one

Leonardo Collado Torres (13:39:45) (in thread): > also, note that the official BioC one has a global RSS, not a package specific one like the one your site provides

Nitesh Turaga (13:56:14): > @Mike Smith

Nitesh Turaga (13:56:16): > http://bioconductor.org/developers/rss-feeds/gitlog.xml

2020-08-25

Kayla Interdonato (11:15:45): > Sorry for the delay on getting the recordings posted - I was on vacation at the end of last week and I was experiencing some technical difficulties with YouTube yesterday. The issue with YouTube is still being explored, so for the time being the recordings are available only on the course materials:https://www.bioconductor.org/help/course-materials/.

Stephanie Hicks (11:19:05): > Thanks@Kayla Interdonato!

Stephanie Hicks (12:07:30): > @Aaron Lun @Hervé Pagès — I really appreciated the discussion on building and deploying OSCA at the last developers forum! I’m sorry I missed it, but it was a really helpful discussion. Just for clarity, if I am building a bioc book on a different topic, should I follow the same workflow with the various trojan packages and GitHub Actions if I were interested in having it built on the BBS?

Hervé Pagès (13:52:38): > If you start with a repo that is already structured as an R package with a Makefile in the vignettes folder, you don’t need the trojan hack. Aaron needs the hack because he started the book before we had the new books builder. He started with an OSCA repo that is not structured as an R package. But now that we reuse BBS to build books and because BBS only takes R packages he created another repo (OrchestratingSingleCellAnalysisWrapper) that is structured as an R package and he uses his trojan hack to keep that other repo in sync with the primary OSCA repo. > So in short: organize the GitHub repo of your book as an R package and you don’t need any sort of hack. You’ll still need to use GitHub Actions to deploy the book though.

Aaron Lun (19:34:44): > TBH I would have still used the current set-up, even with full knowledge of the current BBS-based deployment. The wrapper package is out of sight, out of mind; and I’m willing to pay the cost of one level of redirection to have a nice pristine repo for the book sources.

Hervé Pagès (21:31:27): > “out of sight, out of mind” sounds good but this is at the cost of a somewhat tricky/hacky setup

Hervé Pagès (21:33:24): > I guess the question is: what makes the pristine repo so appealing? What makes it more “pristine” than an R package?

Aaron Lun (22:00:11): > If I had to say, it would be the fact that the book repo is already a standalone unit. It can already be built and deployed by itself - after all, that’s what I was doing with it locally. I didn’t like the idea of hard-coding any more deployment details, given that the nature of the deployment has changed several times already (Hutch servers from Rob, then an attempt on the Genentech servers, then my laptop, then AnVil with Vince, and finally the BBS).

Aaron Lun (22:01:34): > I didn’t want to have to move files to different directories every time we decided to do something different, so the trojan approach provides a degree of modularity that insulates the book’s contents from whatever mud-wrestling was required to get it to build on the target system.

Aaron Lun (22:03:40): > Even the current approach on the BBS is hardly a routine use ofR CMD build, so who knows what we’ll do in the future.

Hervé Pagès (22:48:30): > mmh.. I don’t know. With the R package format, building the book becomes as simple as running a completely standard R CMD build command on the repo. Anybody can do it (granted they have the computing power), not just BBS. The fact that there is a Makefile in the vignettes folder is not really relevant, many packages have that. So I see the R package format as something that has intrinsic value, beyond the current BBS-based books builds. It allows you to specify authorship, licence, deps, versions, etc… using a standard and well documented mechanism. Sure we don’t know what we’ll do in the future but we know what we’re doing now. So starting with an R package now makes a lot of sense. If we change in the future, then maybe a hack will be needed to convert the R package into whatever form will be needed then, but only then. I’m just not sure what benefit there would be for @Stephanie Hicks to start with something else. All I see is that you started with something different, you like it, and you don’t want to change it. It’s fair and I respect that. But I wouldn’t suggest anybody who is starting a book now follow that path.

2020-08-26

Aaron Lun (03:29:55): > If the BBS turns out to be reliable for book compilation, then I can see some benefit towards locking into a package structure. I guess we’ll see. > > For some more context: I have always imagined the idealized build system for the OSCA book to be something that can cd into the repo and run bookdown::render_book("index.Rmd"). I would say that all my bookdown books (of which OSCA is but one) are built around that philosophy. Perhaps this layout could even be considered the convention for bookdown projects, though I don’t know enough other projects to be able to say that for sure. > > I still hold out hope for the realization of this ideal build system - it is, for bookdown books, a more natural way of compilation than the BBS’s package-based set-up, and it’s how I compile each of the books locally - which is why my book-related repositories are organized the way they are.

Aaron Lun (03:48:15): > Putting that aside: the real problem is how to streamline the deployment of the compiled book. It would make a lot of sense for Bioconductor’s infrastructure to host the HTMLs directly, then we wouldn’t need another 1-2 repositories to serve as vehicles for GitHub Pages deployments.

Kasper D. Hansen (04:40:49): > Caveat: I have not looked at the file / package structures that Aaron uses. But in my reading of this discussion, @Hervé Pagès is advocating the advantage of using a DESCRIPTION file with its specification of dependencies, authorship, license etc. That seems like a pretty useful file to have. On the other hand @Aaron Lun is advocating for the simplicity of using a bookdown-style repo, which makes tons of sense from a writing perspective.

Kasper D. Hansen (04:41:51): > @Aaron Lun seems to have taken the sensible approach with the wrapper, where the wrapper package is essentially a DESCRIPTION, a Makefile and some additional directory structure, which is irritating.

Kasper D. Hansen (04:42:49): > In my limited experience doing stuff like this, I tend to always end up with a Makefile though

Kasper D. Hansen (04:43:01): > Then there is the issue of web page deployment.

Kasper D. Hansen (04:43:48): > It seems like we are almost there though, apart from the html deployment

Kasper D. Hansen (04:45:15): > So Aaron, you find the cd vignettes/book step to be irritating? Or is it all about deployment

Kasper D. Hansen (04:45:42): > (I also think we should figure out some standard way to have vignettes represented in Github btw, but that is a different conversation)

Martin Morgan (07:15:23): > Does the Bioconductor infrastructure already support the book as part of the package landing page, if the build product is transferred to inst/doc ?

Saulius Lukauskas (07:48:03): > @Saulius Lukauskas has joined the channel

Sean Davis (08:20:45) (in thread): > Pkgdown works pretty well for fully documenting and presenting vignettes as well as package documentation.

Sean Davis (08:23:33) (in thread): > Variations of pkgdown work pretty well. If there is interest in something Bioc-centric, something akin to pkgdown seems pretty sensible. I like the fact that I can test my documentation, vignettes, and site aesthetics locally and expect identical results online.

Aaron Lun (12:05:20) (in thread): > It is slightly annoying but the bigger reason is, yes, the deployment: the book contents should be compartmentalized from the implementation details of the deployment. Of course this can bend a little (e.g., Dockerfiles in the main) but I would draw the line at the wholesale restructuring required to get it onto the BBS.

Aaron Lun (12:16:28): > I was of the understanding that compiled vignettes would only show up in the vignettes() output if they were actually built by the VignetteBuilder, and not the Makefile.

Hervé Pagès (12:35:56): > @Martin Morgan I don’t think it does. I think the landing page generation script would pick up all the HTML documents that end up in inst/doc (there are 42) and list them as individual vignettes.

Martin Morgan (13:03:45): > On the bioconductor landing pages, it’s just that the URL exists (and I think everything under inst/doc is propagated) and that the individual chapters are actually sufficient to give the illusion of a book – ‘chapter1.html’ contains the book sidebar, etc… but of course the proof is in the pudding…

Aaron Lun (13:04:49): > another consideration is that the book chapters are not standalone HTMLs; I think they depend on other files in the same directory (e.g., images, CSS files, etc.).

Hervé Pagès (13:38:48): > It wouldn’t be hard to tweak the landing page generation script to only display the link to inst/doc/index.html. So the book would live at https://bioconductor.org/packages/release/books/vignettes/OrchestratingSingleCellAnalysisWrapper/inst/doc/index.html which is super ugly compared to the neat https://osca.bioconductor.org/. One benefit of deploying the book at the ugly location though is that then deployment at the primary location becomes just a matter of running rsync between the 2 locations. > Anyway, all this feels a little bit convoluted. Sounds like we should deploy directly at the primary destination, which is what Aaron does at the moment, with the help of GitHub Actions and a few additional GitHub repositories. Sounds like there should be a simpler way to deploy though. A cron job running once or twice a week should do the trick. Could be done on BBS’s side but we would need to know the destination. Maybe via an additional field in the DESCRIPTION file?

Martin Morgan (13:57:37): > Instead of everyone-getting-a-CNAME maybe a more consistent strategy is https://bioconductor.org/books/release/OSCA… (if only the package name were brief; I guess this implies a devel / release discipline for books, which I think is a good idea?) redirecting (including https://bioconductor.org/books/OSCA) to the complicated URL

Hervé Pagès (14:02:05): > Sounds good to me. I don’t think we need to have the complicated URL at all. It’s not more complicated to deploy the books directly at the “canonical” URLs.

Hervé Pagès (14:14:22): > I can start working on this next week. Book authors who want to host their book at a different place would just need to mirror https://bioconductor.org/books/release/MyBook. It’s on them. > Does that work for you @Aaron Lun? We’re trying to help you get rid of the OrchestratingSingleCellAnalysis-release and OrchestratingSingleCellAnalysis-devel repos :wink:

Aaron Lun (14:14:32): > hell yeah

Aaron Lun (14:14:47): > osca.bioconductor.org would probably need some redirects for a while

Hervé Pagès (14:23:17): > Should we declare vignettes/book/docs the standard place for build products? In which case all that deployment will do is put everything that’s under this folder at the canonical location.

Aaron Lun (14:23:32): > sounds sensible to me.

Hervé Pagès (15:29:03): > Some preliminary testing: I just manually pushed the content of vignettes/book/docs/ from the latest successful build to http://bioconductor.org/books/devel/OSCA/ @Aaron Lun What’s up with the javascript code that gets displayed as plain text at the bottom of the Welcome page?

Aaron Lun (15:35:14): > I have no idea. I don’t observe this in my local builds, though I did observe it with Vince’s Anvil builds.

Aaron Lun (15:38:52): > If i had to guess, something is bleeding over from the other chapters.

Vince Carey (17:23:12): > Maybe we need another channel for this. I remain concerned that there is a monolithic aspect of this process that could benefit from critical evaluation from uninvested parties. A cross-reference is one thing, inter-chapter computational interdependencies another. If we are not able to build chapters in isolation, to build chapters in parallel, to check chapter and book validity without going through a full build, I think we may want to reevaluate.

Vince Carey (17:24:12): > Don’t get me wrong. I think this is a huge flagship development in the project. I think it has been done extremely well.

Aaron Lun (17:24:12): > what’s the alternative?

Hervé Pagès (17:25:27): > Maybe we should follow up on the#bioc-buildschannel?

Vince Carey (17:25:45): > One alternative is to avoid interdependencies and allow redundant computations between chapters. This would permit parallelization. I don’t know if it is that much better, but it seems to me worth considering.

Vince Carey (17:26:24): > I think we are talking about something independent of the build system per se. The book concept is valuable and complex and so is building it.

Vince Carey (17:27:50): > So the authors and editors have one outlook and the production system may have another – I’d like to get a handle on the first group first. Could a more decoupled approach lead to less burdens on the editor – who I take to be Aaron at this time – at least he is the coordinator of the process of creating the content?

2020-08-27

Tim Triche (08:19:02): > possibly offtopic, but when I opt to hand MulticoreParam a non-NA resultDir, why do I get a bunch of results along the lines of value instead of actual saved data objects? Is there a way to indicate that I wish to save the job/task results rather than just value?

Tim Triche (08:19:27): > e.g. in enmity, a moderately disgusting package that takes the worst parts of several others:

Tim Triche (08:19:31): > > # are the results saved? Load them first, if so. > resultDir <- bpresultdir(BPPARAM) > if (!is.na(resultDir)) { > if (verbose) message("Loading saved results from ", resultDir) > grs <- .loadBpFiles(BPPARAM) > if (verbose) message("OK.") > } > > where .loadBpFiles(BPPARAM) is defined as > > # load results of BiocParallel jobs when !is.na(bpresultdir(BPPARAM)) > .loadBpFiles <- function(BPPARAM) { > > resultDir <- bpresultdir(BPPARAM) > if (!is.na(resultDir)) { > stopifnot(dir.exists(resultDir)) # the result dir must exist > resultFiles <- list.files(resultDir, patt="^BP.*Rda$") > sapply(resultFiles, .loadBpFile, resultDir=resultDir) > } > > } > # load one BiocParallel saved result > .loadBpFile <- function(resultFile, resultDir) { > > if (!is.na(resultDir)) load(file.path(resultDir, resultFile)) > > } >

Tim Triche (08:21:34): > I feel like I’m missing a critical part here that would otherwise allow me to checkpoint some pretty big jobs.

Tim Triche (08:22:37): > the above is from https://github.com/trichelab/enmity/blob/master/R/scNMTFileListToGR.R

Tim Triche (08:22:53): > and the entire package is at https://github.com/trichelab/enmity

Tim Triche (08:24:14): > but the only real issue I have is avoiding either a) running out of RAM as a result of an in-memory DelayedMatrix forced on our HPC cluster by issues with HDF5 writing or b) not being able to recover intermediate results when that happens

Tim Triche (08:24:58): > I’ve mostly solved a), so b) is the part I’m after now.

FelixErnst (08:25:32): > I am not sure, but I guess the result of load() is “A character vector of the names of objects created, invisibly.” and not the objects themselves. Would this explain your problem?

Tim Triche (08:25:54): > crap, perhaps so. is there a provision to save the objects rather than just their names?

Tim Triche (08:25:59): > thanks for catching that

FelixErnst (08:26:37): > well you get the name of the object. So you can select it via get() I suppose. Maybe not the most straightforward solution

Tim Triche (08:27:42): > hmm, I could add a hook to save it as an .rds file in the event that!is.na(bpresultdir(BPPARAM))

Tim Triche (08:27:49): > this is testable, at least

Tim Triche (08:28:08): > actually

FelixErnst (08:28:40): > That would be the alternative, but if the .RData contains only one object, using get() might be sufficient. You wouldn’t need to change code in other places

Tim Triche (08:29:35): > oh good call

Tim Triche (08:30:02): > weird, I thought that load() returned a list if assigned. Looks like a major oversight on my part. Thanks!

Tim Triche (08:32:17): > > foo <- matrix(rnorm(n=10000), nrow=100) > save(foo, file="foo.Rda") > bar <- get(load("foo.Rda")) > identical(foo, bar) > # [1] TRUE >

FelixErnst (08:32:18): > It is just a guess. Let me know, how it turns out

Tim Triche (08:32:22): > yep, that does it. Thanks!

Tim Triche (08:33:01): > actually let me make sure it isn’t simply retrieving the existing foo

Tim Triche (08:34:03): > > foo <- matrix(rnorm(n=10000), nrow=100) > save(foo, file="foo.Rda") > baz <- foo > rm(foo) > bar <- get(load("foo.Rda")) > identical(bar, baz) # baz == foo, by another name > # [1] TRUE >
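A slightly safer variant of the same idiom, assuming exactly one object per .Rda file: load() into a fresh environment so the lookup can never collide with an object of the same name in the caller’s workspace (loadOne() is an illustrative name, not an existing API):

```r
# load() returns the name(s) of the restored objects; reading into a
# throwaway environment lets us pull the object out by that name safely.
loadOne <- function(file) {
  e <- new.env(parent = emptyenv())
  nm <- load(file, envir = e)
  stopifnot(length(nm) == 1L)   # assumption: one object per file
  e[[nm]]
}

foo <- matrix(rnorm(10000), nrow = 100)
tf <- tempfile(fileext = ".Rda")
save(foo, file = tf)
identical(loadOne(tf), foo)     # TRUE, with no rm()/get() gymnastics needed
```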

Tim Triche (08:34:07): > yep, that does it. Thanks!

2020-08-28

Aaron Lun (01:50:44) (in thread): > While I can’t reproduce it, I’ve made a change that might fix it.

Hervé Pagès (11:28:20) (in thread): > Thx. OrchestratingSingleCellAnalysisWrapper currently building on malbec1 and nebbiolo1…

Hervé Pagès (14:47:10) (in thread): > https://bioconductor.org/checkResults/3.12/books-LATEST/ … oops! > I think that’s because the book builds are competing for resources with the workflow builds. I made some adjustments to the build schedule to avoid that, hopefully. We’ll see next Tuesday. > Anyway, deploying from the source tarball produced by nebbiolo1 looks good: http://bioconductor.org/books/devel/OSCA/

Aaron Lun (15:29:53) (in thread): > AWESOME

Aaron Lun (15:30:18) (in thread): > a tick over 3 hours, very respectable.

Aaron Lun (15:30:31) (in thread): > Especially given I just keep on adding stuff.

Aaron Lun (15:31:06) (in thread): > What’s the problem with the clash? I don’t see anything concerning.

Hervé Pagès (15:31:37) (in thread): > It just slowed everything down. Hence the timeout on malbec1.

2020-08-30

Aaron Lun (23:04:13) (in thread): > I guess we should kill osca-dev.bioconductor.org and redirect people to http://bioconductor.org/books/devel/OSCA/. I guess the Bioconductor domain must have something aliasing osca-dev.bioconductor.org to GitHub pages right now, so it should be simple enough to change that. - Attachment (bioconductor.org): Orchestrating Single-Cell Analysis with Bioconductor > Or: how I learned to stop worrying and love the t-SNEs.

2020-08-31

Hervé Pagès (01:28:15) (in thread): > Let me work on the automated book deployment to http://bioconductor.org/books/release/ and http://bioconductor.org/books/devel/ first. BTW it would make things easier if the automatic deployment was at http://bioconductor.org/books/release/bookname where bookname is just the name of the book package. So can we rename OrchestratingSingleCellAnalysisWrapper -> OSCA? Alternatively we could use a DESCRIPTION field for that e.g. PublishedName: OSCA but I’d rather keep things as straightforward as possible.

Aaron Lun (01:42:34) (in thread): > I’d be happy to do OSCA.

Hervé Pagès (03:20:46) (in thread): > Great. @Nitesh Turaga How do we proceed? Aaron renames the GitHub repo and we rename the upstream repo at git.bioconductor.org? Thanks

2020-09-02

Hervé Pagès (13:26:40) (in thread): > @Aaron Lun Can you please rename on GitHub? Once you’re done Nitesh (@Nitesh Turaga) will reclone on git.bioconductor.org. Thx!

Aaron Lun (13:43:10) (in thread): > it is done. https://github.com/LTLA/OSCA

Hervé Pagès (13:43:49) (in thread): > Perfect. @Nitesh Turaga can you reclone? Thx

Nitesh Turaga (13:59:28) (in thread): > Yes

Nitesh Turaga (13:59:29) (in thread): > I’m on it

Nitesh Turaga (14:04:38) (in thread): > Should be done…please check now.

Hervé Pagès (14:09:16) (in thread): > Thx Nitesh. We’ll see on Friday how things go with the builds…

2020-09-06

Bob Policastro (08:35:43): > @Bob Policastro has joined the channel

2020-09-08

Aaron Lun (18:00:55) (in thread): > Moving to #bioc-builds

Henrik Bengtsson (19:34:45): > Hi. I’ve got a question about a potential but unusual package submission to Bioconductor. > > BACKGROUND: > The TopDom method - An Efficient and Deterministic Method for Identifying Topological Domains in Genomes - was published in 2016. The authors implemented the method in R in the form of some rudimentary R scripts hosted on their website (no longer available). The TopDom method appears to be somewhat popular. > > As part of a HiC project I worked on, I ended up wrapping their TopDom script into a proper R package (https://github.com/HenrikBengtsson/TopDom), adding package tests, help pages, obvious bug fixes, etc. It’s a stable, solid package version of their R code. The package tests depend on data in a standalone TopDomData package (https://github.com/HenrikBengtsson/TopDomData). These data are a subset of their supplementary data (no longer available; used to be hosted on their webpage; I’ve reached out to them about the original data but they seem to have lost them). > > QUESTION: > I think it would benefit the community if these two packages were hosted and especially archived on a proper R repository, e.g. CRAN or Bioconductor. I’ve got the original authors’ permission to publish both under GPL-3. Would Bioconductor accept these packages as-is? These packages do not make use of any Bioconductor data structures etc. There is no vignette. They pass R CMD check --as-cran with all OKs. I’m not interested in doing more work on them (I’ve given them looots of love already). > > The TopDom package will not receive any further updates other than fixing typos. I don’t want it to digress from the original TopDom scripts. I have zero interest in taking over the lead of developing TopDom or even answering questions regarding the method. I am just trying to find a home for them so that they’re archived somewhere solid. PS. I could submit ‘TopDom’ to CRAN to make sure it is archived there. 
The ‘TopDomData’ package is too big for CRAN, and since I cannot get access to the original data I cannot move to a smaller chromosome that would fit the 5 MiB limit on CRAN. I could submit just ‘TopDom’, but then the community would lose out on the validation part. So, although CRAN would be more straightforward, there’s a Catch-22 regarding ‘TopDomData’.

Dirk Eddelbuettel (19:55:03): > As for ‘bigger data’, there is the Anderson + Eddelbuettel approach described here:https://journal.r-project.org/archive/2017/RJ-2017-026/index.html

Martin Morgan (19:56:04): > A vignette is a required component for Bioconductor packages. @Dirk Eddelbuettel might point to solutions for CRAN packages wanting to provide larger data (and I see he has!); large files wouldn’t work for Bioconductor (the solution we’d pursue would be an ExperimentHub package / collection of resources https://bioconductor.org/packages/ExperimentHub) and we wouldn’t go for a user-based hosting solution either (because the data isn’t under control of a resource with long-term commitment). Interoperability is really a key component of the Bioconductor ecosystem, so the absence of interoperability would definitely trigger reviewer flags. Why not just continue hosting the way you have until now? - Attachment (Bioconductor): ExperimentHub > This package provides a client for the Bioconductor ExperimentHub web resource. ExperimentHub provides a central location where curated data from experiments, publications or training courses can be accessed. Each resource has associated metadata, tags and date of modification. The client creates and manages a local cache of files retrieved enabling quick and reproducible access.
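For readers unfamiliar with the ExperimentHub route Martin mentions, here is a minimal sketch of how hub-hosted data is typically found and retrieved. The query term "TopDomData" and the record ID are hypothetical; real IDs come from the hub metadata of an accepted ExperimentHub package.

```r
## Sketch only, assuming the data were published through an ExperimentHub
## package; "TopDomData" is a hypothetical search term for illustration.
library(ExperimentHub)

eh <- ExperimentHub()              # connect and cache hub metadata locally
hits <- query(eh, "TopDomData")    # free-text search over titles/tags
hits                               # lists matching records and their EH IDs

## A specific record is then retrieved (and cached) by its identifier:
## dat <- eh[["EH1234"]]           # hypothetical ID; returns the R object
```

The point of the hub model is that the files live on Bioconductor-controlled storage with long-term commitment, which addresses exactly the archival concern raised above.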

Spencer Nystrom (21:06:05): > Could you consider hosting the data on Zenodo?

Spencer Nystrom (21:07:13): > You can put Git repos in Zenodo and they are assigned DOI’s. So your package would get archived in a still usable form if you had instructions in a README or something.

Henrik Bengtsson (22:59:25): > Sorry, I should have mentioned that the TopDomData package is only 14 MB - but still too large for CRAN (5 MB limit). > > Thanks for the quick and clear reply. It confirms what I thought - Bioconductor is not the place for these types of packages. This helps me going forward since it makes CRAN the only solution. I’ll go ahead and submit TopDom there and postpone the problem of finding a location for the too-large (14 MB) TopDomData package. > > Why not just continue hosting the way you have until now? > Mainly because it is not a long-term solution/archive. GitHub might disappear or I might get grumpy and remove those repositories - neither is good for science. That the original TopDom website vanished without existing in any archive is another good example of this. Also, it prevents anyone else on CRAN or Bioconductor from depending on the package.

Henrik Bengtsson (23:05:38): > @Spencer Nystrom thx - using Zenodo might be a good, low-barrier solution for TopDomData. (If the TopDom authors end up finding their supplementary data, I should be able to use Chr 22 and get the size below 5 MB. If that ever happens, I can submit that to CRAN but until then Zenodo is good.)

Hervé Pagès (23:40:01): > Just to be clear, and sorry if I’m stating the obvious here, but “interoperability” = “use the well established containers that are already available in the Bioconductor ecosystem for the kind of data you are dealing with.” Of course, if you’re dealing with stuff for which there is no well established container, you do what you want.

2020-09-10

Kasper D. Hansen (05:29:33): > @Henrik Bengtsson Not ideal, but @Mikhail Dozmorov has been working on this problem. He may be interested in taking over. It sounds to me that, Bioc-wise, the main thing is a vignette, which could mostly be about explaining the historical context here.

Kasper D. Hansen (05:30:12): > I’ll say that we don’t have completely standardized solutions for Hi-C in the project. A number of different packages use different backends

Kasper D. Hansen (05:32:28): > Most Hi-C data could benefit from a GenomicRanges and I am sure that wrapping the current implementation in some of these existing implementations is pretty easy. However, it is not zero work. But perhaps this is something @Mikhail Dozmorov could have a look at, since he has some infrastructure for TAD prediction and this is after all just a different method for the same problem

Kasper D. Hansen (05:34:29): > Related, this seems like a great anecdotal story with all the issues of archiving. The original paper was in NAR; it’s disappointing the code is going bust

Mikhail Dozmorov (08:25:28): > @Henrik Bengtsson, we are currently facing the same data hosting question, using GitHub as a temporary solution. We are looking into ExperimentHub and other options, and can help with TopDomData.

Mikhail Dozmorov (08:26:14): > I’m not sure how strict Bioconductor is about forcing GRanges or GInteractions, which are good to use, but rewriting the code would take some time. Hi-C data is very diverse; depending on the application, containers are less well established. If some leeway is OK, the most important thing seems to be the vignette. I can help with that; the main question is whether the TopDom code can be accepted to BioC as is?

Kasper D. Hansen (09:07:52): > I am assuming the input to TopDom is a matrix.

Kasper D. Hansen (09:08:31): > Don’t we have a simple class that we can put a HiC matrix into?

Kasper D. Hansen (09:08:46): > GInteractions are great and sparse, but are not fully a classic matrix

Kasper D. Hansen (09:09:13): > I mean, GInteractions are probably better, but we could use something simple

Kasper D. Hansen (09:09:42): > There are also various attempts at providing on-disk storage which again is great, but a bit overkill for this probably

Kasper D. Hansen (09:10:28): > In bnbc we have a class which is essentially a GRanges + a list of HiC matrices because it is focused on multiple samples. It kind of sucks, but works and is simple

FelixErnst (09:11:51) (in thread): > GRanges + matrix make me think about SummarizedExperiment. Wouldn’t that work?

Kasper D. Hansen (09:52:26) (in thread): > the “assay” data is a symmetric genome X genome matrix for 1 sample. So unlike a SummarizedExperiment which has multiple samples in the columns, it is really more a single GRanges list + a matrix

Mikhail Dozmorov (10:01:39): > I asked my student, but yes, my recollection is TopDom takes in a matrix and, overall, was easy to use. We use the @Henrik Bengtsson version

Mikhail Dozmorov (10:02:11): > What I’m not sure about is how to deal with the fact that the original TopDom code is also on GitHub, https://github.com/jasminezhoulab/TopDom, without a license.

Kasper D. Hansen (10:33:08): > It sounds like Henrik has done the license negotiation for us

Mikhail Dozmorov (10:42:52): > Then it would be great to have it on BioC. We can help with the data and the vignette

Aaron Lun (11:10:34): > There are ContactMatrix classes in InteractionSet. Consider starting from there and making a PR for whatever modifications are needed. I spent a lot of time a few years ago trying to avoid fragmentation of the BioC Hi-C infrastructure, I’d rather not see it fall apart in a squiggly heap again.
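As a rough illustration of the ContactMatrix route suggested here, the sketch below wraps a plain contact matrix in InteractionSet’s ContactMatrix so that bins carry genomic coordinates. The coordinates and counts are invented for illustration; this is not TopDom’s actual interface.

```r
## Sketch: a toy 4x4 Hi-C contact matrix whose row/column bins are
## annotated with (made-up) genomic coordinates via ContactMatrix.
library(GenomicRanges)
library(InteractionSet)

bins <- GRanges("chr22",
                IRanges(start = seq(1, 4e6, by = 1e6), width = 1e6))
counts <- matrix(rpois(16, lambda = 5), nrow = 4)   # toy contact counts

cm <- ContactMatrix(counts, anchor1 = 1:4, anchor2 = 1:4, regions = bins)

anchors(cm, type = "row")   # GRanges of the row bins
as.matrix(cm)               # recover the plain matrix for a low-level method
```

The last line shows why this container can coexist with a matrix-based method like TopDom: a thin wrapper can accept a ContactMatrix, call `as.matrix()`, and hand the result to the existing low-level code.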

Henrik Bengtsson (11:37:38) (in thread): > Correct. Licensing was something the authors hadn’t thought of. (As a community we need to do more to explain the importance of a license, explain what it means to choose one, what copyright means, … Journals can do a lot more to help here too. I’m sure many editors and reviewers don’t understand the importance of this too)

Henrik Bengtsson (11:41:32) (in thread): > They (the lead, not the PhD student who I think did the work) created that GitHub repo long after. It doesn’t contain the original version. “Luckily” I caught that and it’s in my git repo’s history, but original implementations are also available via TopDom::legacy(). It’s good that they have a GitHub presence although not active - hopefully it’s a first step for them.

Henrik Bengtsson (11:46:02) (in thread): > (Somewhat side-tracking): This is where I’ve always disagreed with the Bioconductor strategy. I think Bioconductor should allow for low-level APIs/packages and high-level APIs/packages. They can (and should) be separated. Many many methods and algorithms can be designed and implemented using basic data types such as matrices and vectors. They are best maintained, understood, and optimized if they are not blurred by higher-level data structures. Higher-level APIs can then be modeled on top of the well defined lower-level APIs.

Henrik Bengtsson (11:47:57) (in thread): > These low-level packages are more stable over time, e.g. they are not affected by the constant drift of new, re-designed higher-level data structures.

Henrik Bengtsson (11:51:35) (in thread): > Also, blocking low-level packages from entering Bioconductor because they lack a higher-level API compatible with the latest Bioconductor data structures is unfortunate. Who knows, maybe the original TopDom authors would have considered submitting to Bioconductor if the threshold was lower (not sure it would happen in this case - but you understand the gist). Higher-level wrappers/packages can always be added later. (I understand that this removes some of the motivations for providing APIs compatible with the Bioconductor ecosystem/philosophy)

Henrik Bengtsson (11:56:41) (in thread): > In some cases, the low-level API and high-level API are in the same package (which is ok unless the high-level API brings in a huge amount of dependencies). What I find very unfortunate is when there’s no low-level API when there could/should be one.

Aaron Lun (12:10:04) (in thread): > If it’s just going to be a low-level API, then it seems like it could just be a CRAN package. There’s really no reason for it to be in Bioconductor if it’s not going to interoperate.

Hervé Pagès (12:13:01) (in thread): > It’s generally very easy for a package that already has a well designed low-level API to add the high-level one on top of it. In my experience submitters understand the value of the high-level API and have no problem adding it. For the long term maintenance of their package, the separation between low-level and high-level API is good, but it’s their choice. Maybe we should encourage it more but that’s a different story.

Henrik Bengtsson (12:15:11) (in thread): > Yes, this is how I look at CRAN and Bioconductor too. It gets blurry when the method is in the field of bioinformatics. For instance, I think segmentation methods such as DNAcopy (on Bioc) and PSCBS (on CRAN) are better suited for CRAN, where they reach a wider audience and can evolve without the Bioconductor “constraints”. OTOH, some people expect to find such methods on Bioconductor.

Hervé Pagès (12:17:36) (in thread): > The “difficult” packages are those that came up with their own high-level container for single cell data and don’t have a low-level API. Then it’s a lot of work for them to switch to SingleCellExperiment and some are reluctant to do it. In these cases, yes, CRAN is probably a better place.

Henrik Bengtsson (12:19:11) (in thread): > > In my experience submitters understand the value of the high-level API and have no problem adding it. > But I think there are people out there that have developed awesome “low-level” APIs/methods that are not willing, don’t have the experience, or the resources, to go the extra mile(s) making it compatible with Bioconductor. Their option is CRAN (which is OK with me).

Hervé Pagès (12:24:53) (in thread): > S4 phobia, I know:smirk:

2020-09-15

Spencer Nystrom (13:35:20): > quick CRAN question: is it frowned upon to ping the r-pkg-devel mailing list about review progress? I’ve had a package sitting in pending for 20 days & not sure whether to just wait for an update or what. I think policy changed recently where pending no longer means they’re waiting on an author response, but I can’t find any guidelines on this.

Dirk Eddelbuettel (14:06:09): > That is a misunderstanding of the r-package-devel list! It is not official – only CRAN is – but aims to help keep load off CRAN. I am one of three founding admins there. > > Now, 20 days is long. I also think the description of ‘pending’ at other places is wrong. Pending, AFAIK, is “them”, whereas ‘waiting’ is them waiting on you. I had my own battles with them lately and have one bouncing in and out of newbies right now as they are being truly obnoxious on DESCRIPTION etc (on the important grounds of AUTHORS and COPYRIGHTS, which I already addressed via files in inst/ …). It all can be such a royal PITA but I guess it is worth it in the longer run.

Spencer Nystrom (14:09:59) (in thread): > Thanks, Dirk! I didn’t want to read too much into others interpretations of each directory, but was just starting to get puzzled as everyone else’s package leaves, but mine remains after so many days. Guess I’ll just “hurry up and wait”.

Dirk Eddelbuettel (14:12:20) (in thread): > It’s tough. I run echo ls -lR | ncftp incoming | less every day to spy on it, where ‘incoming’ is an ncftp bookmark. I think after three weeks it is fair to inquire. > > What is the package name?

Spencer Nystrom (14:12:34) (in thread): > cmdfun

Spencer Nystrom (14:13:15) (in thread): > So the r-help mailing list is the one to ping?

Dirk Eddelbuettel (14:13:25) (in thread): > Yes, I’d email. Nooooo.

Dirk Eddelbuettel (14:14:30) (in thread): > Let’s recap. My evaluations: > * r-help: Newbs. Low signal/noise > * r-devel: R dev in general. Good. > * r-package-devel: Package dev in general. Very good list, but not speaking for CRAN as no human is. If you want to talk to CRAN, talk to CRAN.

Dirk Eddelbuettel (14:15:04) (in thread): > So when I say ‘I would email’ it means emailing cran or cran-submissions. Whichever it is you last heard from.

Spencer Nystrom (14:15:26) (in thread): > Gotcha. Think I just lost my way in the slew of possible lists & contact points.

Spencer Nystrom (14:16:04) (in thread): > Thanks for taking the time to respond!

Dirk Eddelbuettel (14:16:34) (in thread): > No worries. We are all navigating the same mine field together, without maps. Or helmets.

Spencer Nystrom (14:17:11) (in thread): > :helmet_with_white_cross::r::fire:

Hervé Pagès (14:56:40) (in thread): > but with masks:mask:

2020-09-16

Lluís Revilla (03:45:58) (in thread): > I had similar doubts too. Knowing where to address a question is confusing between CRAN & R mailing lists.

Lluís Revilla (03:46:41) (in thread): > BTW this package/vignette was very helpful to track the package (it is updated each hour):https://lockedata.github.io/cransays/articles/dashboard.html

Mike Morgan (11:02:30): > @Mike Morgan has joined the channel

2020-09-17

Mike Smith (09:13:10): > We’re scheduled for our next developers call next week. I don’t have any specific topics in mind, so I thought this might be an opportunity to find out what topics people are interested in covering. Below is a list of suggestions; feel free to vote on as many as you’re interested in. If there are any other topics you’d like us to consider covering please let me know and I’ll add them as options.

Simple Poll (09:13:19): > @Simple Poll has joined the channel

Unknown User (09:14:58): > (poll not captured in export)

Unknown User (09:15:14): > (poll not captured in export)

Tim Triche (09:55:46): > CRAM? debugging horrorshow frozen APIs?:wink:

2020-09-18

Robert Castelo (13:42:01): > > $ svn > > svn: error: The subversion command line tools are no longer provided by Xcode. > Where is the empathy for older generations???:weary:what’s next, removing vi?? the terminal window??:sob:

Nitesh Turaga (13:42:22): > What’svi?:stuck_out_tongue:

Hervé Pagès (13:51:33): > that trend started when they removed the Fortran compiler a few years ago

Hervé Pagès (13:57:22): > At the time the R community complained and Steve Jobs answered: https://stat.ethz.ch/pipermail/r-sig-mac/2011-March/008116.html Want to file a complaint for svn, Rob?

Robert Castelo (14:07:32): > interesting, and i don’t wanna complain as there are workarounds in macOS to access old svn repos, but the moment i saw the message i felt it added to my yearning for things i used to be able to do and now can’t anymore:sob::smile:

Dirk Eddelbuettel (14:08:45) (in thread): > Interesting post. “Apologies for the rant but I really do feel that Apple is behaving stupidly and shortsightedly in this case.” Looks like the R community hasn’t really learned that lesson over the subsequent nine years. Ah well. As I usually quip, at least the hardware is more expensive…

Nitesh Turaga (14:08:51): > It’s incredible Steve Jobs replied though.

Federico Marini (14:12:38) (in thread): > if they can manage to exitvi

Hervé Pagès (14:55:58) (in thread): > I hear you. Like I’ve been using Thunderbird+IMAP for the last 15 years to manage my fredhutch.org address but this week the IT folks at the Hutch disabled IMAP access. They want to force everybody to use Microsoft Outlook. Well, they don’t really say that, the official reason being that “IMAP is legacy and not secure”.:rage:

Robert Castelo (18:08:41) (in thread): > :man-facepalming:propose they use pine+pop3 maybe; since pine was developed at the UW you might have better luck:sweat_smile:https://en.m.wikipedia.org/wiki/Pine_(email_client) - Attachment: Pine (email client) > Pine is a freeware, text-based email client which was developed at the University of Washington. The first version was written in 1989, and announced to the public in March 1992. Source code was available for only the Unix version under a license written by the University of Washington. Pine is no longer under development, and has been replaced by the Alpine client, which is available under the Apache License.

Hervé Pagès (18:29:56) (in thread): > No kidding. This would still be a lot better than the horrible (and I’m sure expensive) Microsoft Exchange platform that they’re going full steam with. Can’t believe people are still falling for this like they were in the 90’s:face_with_rolling_eyes:

2020-09-21

Belinda Phipson (20:18:18): > @Belinda Phipson has joined the channel

2020-09-22

Spencer Nystrom (06:35:29) (in thread): > Well, the package was accepted, published to CRAN for a few days, then I guess re-reviewed and archived for a minor bug. Now have to wait 30 days to resubmit. Really appreciating Bioc’s release model/ability to hotfix minor issues now… oh well, so it goes.:helmet_with_white_cross::mask:

Lluís Revilla (06:52:03) (in thread): > Oh, sorry to hear that… I’m still learning how & when to fix packages on CRAN

Dirk Eddelbuettel (07:42:07) (in thread): > > Now have to wait 30 days to resubmit. > What makes you say that? I read the CRAN Repo Policies carefully, and wrote a bot to track changes, but that is news to me.

Spencer Nystrom (08:19:21) (in thread): > The person from CRAN told me I had to wait a month to resubmit.

Spencer Nystrom (08:19:55) (in thread): > In order to “take the time to carefully read CRAN policy”

Spencer Nystrom (08:20:09) (in thread): > :face_with_rolling_eyes:

Spencer Nystrom (08:24:37) (in thread): > I understood those rules to be for package updates, not issues caught in review. What’s frustrating is it seems my package was rushed through review then immediately flagged so now it’s technically a “new version” not a revise & resubmit.

Spencer Nystrom (08:26:23) (in thread): > It’s honestly not a big deal, since the only reason I wanted it out was as a dependency for the fall bioc release, but it’s unlikely a revise&resubmit would be accepted before the bioc deadline anyway, so I don’t really think it’s worth pushing back on.

Dirk Eddelbuettel (08:32:39) (in thread): > @Lluís Revilla With all due respect I would recommend somewhat strongly against placing all your bets on tertiary documentation. There is Writing R Extensions. There is the CRAN Policy. Read those.

Dirk Eddelbuettel (08:32:56) (in thread): > @Spencer Nystrom Who said that, and can you quote the actual wording?

Dirk Eddelbuettel (08:33:45) (in thread): > @Lluís Revilla Wrong context. “1 to 2 months” is for already accepted packages as a desired update cadence, which is something completely different than what we talk about here and now: initial acceptance.

Spencer Nystrom (08:34:13) (in thread): > Email this morning from Kurt Hornik. > > > Checking this creates ~/meme in the user's home dir, in violation of the > CRAN Policy's > > Packages should not write in the user's home filespace (including > clipboards), nor anywhere else on the file system apart from the R > session's temporary directory (or during installation in the location > pointed to by TMPDIR > > We thus had to archive your package. > > Please allow yourself at least one month to more carefully reading the > CRAN Policy before possibly submitting a new version of your package. >

Dirk Eddelbuettel (08:34:28) (in thread): > > What’s frustrating is it seems my package was rushed through review then immediately flagged so now it’s technically a “new version” not a revise & resubmit. > I had the exact same issue this month and am with you. Very frustrating.

Spencer Nystrom (08:35:20) (in thread): > It’s difficult to fault the reviewers, as the system is just overloaded, they’re set up to fail.

Dirk Eddelbuettel (08:35:44) (in thread): > @Spencer Nystrom Thanks. The ‘do not write to user filespace’ rule is frequently discussed on r-package-devel. They also try to catch it in R CMD check - odd it passed initially. I had not seen the 30 day mandate anywhere. Interesting.

Spencer Nystrom (08:36:37) (in thread): > Yeah that was an oversight on my part in the vignette. I was super anal about not writing to userspace everywhere else. This just slipped past.

Dirk Eddelbuettel (08:36:45) (in thread): > Completely agree on overloaded. We (i.e. R Foundation) always offer support and resources but CRAN itself is not too successful on scaling up:confused:

Spencer Nystrom (08:41:26) (in thread): > Anyway, thanks for your input. Glad it’s not just me.:+1:

Hervé Pagès (18:33:00) (in thread): > I think I remember from a discussion I had in the past with the CRAN folks that they sometimes use a read-only partition to run the checks in order to catch packages that write to the user filespace. This is the kind of check that is too hard to implement in R CMD check without a special setup like the read-only partition. Could be that on submission they didn’t use that setup, thus your package went thru, but then it got caught later when they checked the package with the read-only setup.

Dirk Eddelbuettel (18:37:13) (in thread): > They do, and you are correct. I wasn’t thinking this through because I was aware of what Kurt does there, and for the usual reason we can’t easily accommodate that on our machines as we can’t “just like that” bind-remount our partitions. So R CMD check ... just won’t see it.

2020-09-23

Leonardo Collado Torres (21:55:00) (in thread): > I hate my official email too because they also blocked IMAP and want us to use Outlook. But then they don’t enable/install any of the things that would make it actually useful:confused:

Hervé Pagès (22:09:51) (in thread): > Ah! Glad I’m not the only one. I feel less alone now. Maybe we should start a protest to claim our right to use IMAP!:bomb:

2020-09-24

Mike Smith (07:15:26): > Thanks all for responding to the poll. Looks like there are some clear topics of interest. I haven’t been very successful at finding anyone to present this month (my fault for leaving it a bit late), so unless anyone has anything super pressing they’d like to discuss I suggest we postpone this month’s call. > > The next call will be 22nd October and I’ll start hunting for speakers today.

Martin Morgan (09:40:51): > I could talk my way through the second most popular topic, AnVIL (https://anvilproject.org, https://anvil.terra.bio), today if people are up for it… I could make a mess of ALTREP, but probably better to get someone with real experience to talk about that - Attachment (The AnVIL): Migrate Your Genomic Analysis Workflows to the Cloud > Analyze large, open & controlled-access genomic datasets with familiar tools and reproducible workflows in a secure cloud-based execution environment.

Vince Carey (10:14:11): > would this be at noon edt today?

Martin Morgan (10:29:21): > Yes, noon EDT today

Lori Shepherd (10:37:04): > And/Or – with the release coming up if anyone had any development or release related questions –

Martin Morgan (11:27:59): > Ok <!channel> we’ll meet at 12pm EDT today at https://bluejeans.com/114067881 for a Q&A about the upcoming release https://bioconductor.org/developers/release-schedule/ and then an ad hoc presentation on AnVIL (https://anvilproject.org, https://anvil.terra.bio) - Attachment (The AnVIL): Migrate Your Genomic Analysis Workflows to the Cloud > Analyze large, open & controlled-access genomic datasets with familiar tools and reproducible workflows in a secure cloud-based execution environment.

Aaron Lun (12:09:02): > <!here>DEV FORUM NOW!

USLACKBOT (13:13:45): > This message was deleted.

Ludwig Geistlinger (13:21:52) (in thread): > If you would like to take over maintainer responsibilities for GenRank for the case that the original maintainer remains unresponsive, just respond to the above thread on the support site or the corresponding thread on the bioc-devel mailing list.

Lori Shepherd (13:32:04) (in thread): > Yes this is true. We like the original maintainers to give the okay but if they have truly been unresponsive and we have someone interested in taking over we normally allow that.

Lori Shepherd (13:36:28) (in thread): > If you do – can you please cc myself and/or maintainer@bioconductor.org

Federico Marini (16:52:01): > On a similar line to @Kevin Blighe’s comment: wow, BioNet is also going that path - and JunctionSeq too. > Given my geographical vicinity - and also for the sake of interest - I’ll try to reach out through other channels to the authors of BioNet and kindly ask what their plan is

2020-09-25

Lori Shepherd (07:46:32): > someone is trying to take over BioNet – but we are having contact issues with the maintainer.

Federico Marini (08:13:07) (in thread): > uh good to know - I emailed as well Marcus Dittrich, after one try on the phone

Federico Marini (08:13:16) (in thread): > (which seems to be non-existent?)

Federico Marini (08:13:52) (in thread): > I might have some interest as well in (co-) taking over, if the opportunity comes up

Kayla Interdonato (12:33:18): > The recording from yesterday’s devel forum is now available on our YouTube (https://www.youtube.com/watch?v=d_fhhHTzrqI&feature=youtu.be) as well as on the course materials on the website (https://www.bioconductor.org/help/course-materials/). - Attachment (YouTube): Developers Forum 14

Henrik Bengtsson (14:10:13): > Hi. Just a friendly reminder to all Bioconductor package developers to check your packages once in a while with: > > R CMD check --as-cran ... > > It will spot lots of mistakes and also real bugs that you don’t get when checking without --as-cran. It also gives great suggestions. When I run revdepcheck::revdep_check(), I still see quite a few serious bugs in Bioconductor packages, especially the ones of type: > > Error in if (x == 1) message("x == 1") : the condition has length > 1 > Error: 'length(x) = 2 > 1' in coercion to 'logical(1)' > > In the worst case, these types of bugs can result in invalid analytical results without anyone noticing, and they might only occur for some types of input data. If you don’t know what these are, see <https://github.com/HenrikBengtsson/Wishlist-for-R/issues/38> and <https://github.com/HenrikBengtsson/Wishlist-for-R/issues/48>.
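To make the class of bug concrete, here is a minimal sketch of the length-greater-than-one condition error and some common defensive patterns (which alternative is right depends on the intended semantics):

```r
## Sketch: the "condition has length > 1" bug class.
x <- c(1, 2)

## Historically, `if (x == 1)` silently used only x[1]; with the
## _R_CHECK_LENGTH_1_CONDITION_ environment setting (and in recent R
## versions) it is an error instead:
## if (x == 1) message("x == 1")
## Error in if (x == 1) ... : the condition has length > 1

## Robust alternatives, depending on what was actually meant:
if (any(x == 1)) message("at least one element equals 1")
if (all(x == 1)) message("all elements equal 1")
if (identical(x, 1)) message("x is exactly the scalar 1")
```

The danger Henrik points out is the silent historical behaviour: code that only ever tested `x[1]` could produce wrong results for vector input without any warning.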

Al J Abadi (22:06:31) (in thread): > Thanks@Henrik Bengtsson. This is really useful. We had in fact had this issue in our package which surfaced only a while ago.

2020-09-26

FelixErnst (07:03:48) (in thread): > Hi Henrik, thx for the heads up. I haven’t had any luck finding a summary of the options turned on by setting --as-cran. Do you have some resources describing the exact effects? > > On the hint regarding the length of logical values in if conditions: My guess was that, starting a year (?) ago, the BBS also checks for that (http://bioconductor.org/checkResults/devel/bioc-LATEST/Renviron.bioc). Am I mistaken or are there other settings to consider?

Henrik Bengtsson (18:11:39) (in thread): > I think --as-cran is described in the WRE, e.g. help.search()

Henrik Bengtsson (18:14:24) (in thread): > I thought so too but e.g. MIGSA has an error of this kind.

Henrik Bengtsson (18:15:09) (in thread): > (Maybe it’s because it’s in the vignette?!?)

2020-09-27

FelixErnst (03:35:52) (in thread): > I found it at https://cran.r-project.org/doc/manuals/r-devel/R-ints.html#Tools after checking in WRE more closely. I don’t see an additional setting referring to the length of conditions. - Attachment (cran.r-project.org): R Internals > R Internals

FelixErnst (03:38:01) (in thread): > My interest in this was a bit out of panic™, because I got hit by that once upon a time.

2020-10-01

Al J Abadi (23:19:36): > Hi, we’re just migrating our parallel computations to BiocParallel. On my machine RNGseed works with SnowParam but not with MulticoreParam. I was wondering if I’m doing it right, or if it’s a known/intended behaviour. > > r > library(BiocParallel) > bppsnow <- SnowParam(workers = 2, RNGseed = 3) > sp1 <- bplapply(1:3, function(x){ > runif(100) > }, BPPARAM=bppsnow) > > sp2 <- bplapply(1:3, function(x){ > runif(100) > }, BPPARAM=bppsnow) > > identical(sp1, sp2) > #> [1] TRUE > > > bppmc <- MulticoreParam(workers = 2, RNGseed = 3) > mcp1 <- bplapply(1:3, function(x){ > runif(100) > }, BPPARAM=bppmc) > > mcp2 <- bplapply(1:3, function(x){ > runif(100) > }, BPPARAM=bppmc) > > identical(mcp1, mcp2) > #> [1] FALSE > > > <sup>Created on 2020-10-02 by the [reprex package](https://reprex.tidyverse.org) (v0.3.0)</sup> > > <details> > > <summary>Session info</summary> > > r > devtools::session_info() > #> ─ Session info ─────────────────────────────────────────────────────────────── > #> setting value
> #> version R version 4.0.2 (2020-06-22) > #> os macOS Catalina 10.15
> #> system x86_64, darwin17.0
> #> ui X11
> #> language (EN)
> #> collate en_AU.UTF-8
> #> ctype en_AU.UTF-8
> #> tz Australia/Melbourne
> #> date 2020-10-02
> #> > #> ─ Packages ─────────────────────────────────────────────────────────────────── > #> package * version date lib source
> #> assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.0.0) > #> backports 1.1.10 2020-09-15 [1] CRAN (R 4.0.2) > #> BiocParallel * 1.23.2 2020-07-06 [1] Bioconductor
> #> callr 3.4.4 2020-09-07 [1] CRAN (R 4.0.2) > #> cli 2.0.2 2020-02-28 [1] CRAN (R 4.0.0) > #> crayon 1.3.4 2017-09-16 [1] CRAN (R 4.0.0) > #> desc 1.2.0 2018-05-01 [1] CRAN (R 4.0.0) > #> devtools 2.3.2 2020-09-18 [1] CRAN (R 4.0.2) > #> digest 0.6.25 2020-02-23 [1] CRAN (R 4.0.0) > #> ellipsis 0.3.1 2020-05-15 [1] CRAN (R 4.0.0) > #> evaluate 0.14 2019-05-28 [1] CRAN (R 4.0.0) > #> fansi 0.4.1 2020-01-08 [1] CRAN (R 4.0.0) > #> fs 1.5.0 2020-07-31 [1] CRAN (R 4.0.2) > #> glue 1.4.2 2020-08-27 [1] CRAN (R 4.0.2) > #> highr 0.8 2019-03-20 [1] CRAN (R 4.0.0) > #> htmltools 0.5.0 2020-06-16 [1] CRAN (R 4.0.2) > #> knitr 1.30 2020-09-22 [1] CRAN (R 4.0.2) > #> magrittr 1.5 2014-11-22 [1] CRAN (R 4.0.2) > #> memoise 1.1.0 2017-04-21 [1] CRAN (R 4.0.0) > #> pkgbuild 1.1.0 2020-07-13 [1] CRAN (R 4.0.2) > #> pkgload 1.1.0 2020-05-29 [1] CRAN (R 4.0.2) > #> prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.0.0) > #> processx 3.4.4 2020-09-03 [1] CRAN (R 4.0.2) > #> ps 1.3.4 2020-08-11 [1] CRAN (R 4.0.2) > #> R6 2.4.1 2019-11-12 [1] CRAN (R 4.0.0) > #> remotes 2.2.0 2020-07-21 [1] CRAN (R 4.0.2) > #> rlang 0.4.7 2020-07-09 [1] CRAN (R 4.0.2) > #> rmarkdown 2.4 2020-09-30 [1] CRAN (R 4.0.2) > #> rprojroot 1.3-2 2018-01-03 [1] CRAN (R 4.0.0) > #> sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 4.0.0) > #> snow 0.4-3 2018-09-14 [1] CRAN (R 4.0.0) > #> stringi 1.5.3 2020-09-09 [1] CRAN (R 4.0.2) > #> stringr 1.4.0 2019-02-10 [1] CRAN (R 4.0.0) > #> testthat 2.3.2 2020-03-02 [1] CRAN (R 4.0.0) > #> usethis 1.6.3 2020-09-17 [1] CRAN (R 4.0.2) > #> withr 2.3.0 2020-09-22 [1] CRAN (R 4.0.2) > #> xfun 0.18 2020-09-29 [1] CRAN (R 4.0.2) > #> yaml 2.2.1 2020-02-01 [1] CRAN (R 4.0.0) > #> > #> [1] /Library/Frameworks/R.framework/Versions/4.0/Resources/library > > > </details> > > </details> >

Al J Abadi (23:25:10) (in thread): > PS: With our current setup, clusterSetRNGStream does work even with FORK

2020-10-02

Martin Morgan (09:43:30) (in thread): > yes, this seems to be broken; I’ll look into it. I’ve opened an issue at https://github.com/Bioconductor/BiocParallel/issues/122

Martin Morgan (13:49:35): > Check out the instructions on GitHub when you create a new repository > > git remote add origin git@github.com:mtmorgan/tmp.git > git branch -M main > git push -u origin main > > Note the second line: ‘main’, not ‘master’! This will cause a lot of problems for new packages and the build system; we’ll mull this over for the next several weeks…

Aaron Lun (13:54:32): > oh man.

Aaron Lun (13:55:31): > Fortunately I never pay attention to the GitHub instructions anyway.

Aaron Lun (13:55:33): > masterFTW

Al J Abadi (22:40:53) (in thread): > Thanks Martin.

2020-10-03

Federico Marini (11:40:13): > I think I read some discussion about the use of the word master and the related slavery connotation. > Not being a native speaker, I did not explicitly think of that too much

Federico Marini (11:40:36): > but it is indeed a freshly released feature -> https://www.zdnet.com/article/github-to-replace-master-with-main-starting-next-month/ - Attachment (ZDNet): GitHub to replace ‘master’ with ‘main’ starting next month | ZDNet > All new Git repositories on GitHub will be named “main” instead of “master” starting October 1, 2020.

Robert Castelo (12:13:18): > I guess then the academic world should stop spelling out MSc as Master of Science and stick to its original Latin meaning Magister Scientiae, or translate it as Magister of Science and Magister degree.

Aaron Lun (13:34:24): > I was very irritated when I first heard about it

Aaron Lun (13:34:40): > and now that it’s happened, I continue to be very irritated about it.

2020-10-04

Hervé Pagès (18:30:06): > I couldn’t care less about the cosmetic changes that Microsoft introduced in the last couple of months but this one bothers me too. It’s not like the other branches are “slave branches”. Makes no sense at all. Also git existed before GitHub and there is a long history in the git community to call this the “master branch”, with no connotation whatsoever. So many books and documentation just follow that convention. All this happened way before some inspired dude at Microsoft came up with a brilliant idea during a boring meeting. I hope he got a raise. This is Microsoft trying to “catch up” with the Open Source movement (or more accurately, trying to repair their image with declarations like “Microsoft loves Linux” after they tried to destroy it in the late 90’s) but still not fully embracing/understanding the Open Source culture.

Hervé Pagès (18:48:10): > Great explanation of the meaning of “master” in the git context: https://lore.kernel.org/git/20200504174548.r3zaftqcq52yhd2u@chatter.i7.local/

2020-10-05

Federico Marini (09:04:39): > I had that def in mind as well, @Hervé Pagès - especially taking it also from the music corner, where “the master copy” is the relevant one

Aaron Lun (11:28:50): > yeah, and now my protagonist in my favorite video game will have to change his name. (Halo, FYI.)

Aaron Lun (11:29:55): > This change is absurd. Putting aside any moral arguments, it’s going to break so many things.

Spencer Nystrom (20:59:37) (in thread): > I dunno, “Main Boss” has a nice ring to it.

2020-10-06

Shian Su (01:15:43): > From what I’ve seen, Git’s master does derive from master/slave terminology.

Shian Su (01:16:40): > Git was developed as a replacement for BitKeeper, and BitKeeper used master/slave terminology: https://github.com/bitkeeper-scm/bitkeeper/blob/master/doc/HOWTO.ask#L223

Shian Su (01:17:45): > An email from Linus early on in git’s development also refers to master/slave: https://marc.info/?l=git&m=111968031816936&w=2

Aaron Lun (01:28:51): > I don’t really care how the term originated. The words master and slave hold no power over me. They also get the same number of points in scrabble.

Hervé Pagès (01:58:22): > AFAIK the master/slave terminology has never been used in any official git/GitHub documentation. The git folks still haven’t decided what they’re going to do; there’s a lot of push back (they know this is the kind of change that has the potential to break A LOT of things). That didn’t stop the new GitHub management from going ahead with the change. sigh

Henrik Bengtsson (03:09:35): > @Marcel Ramos Pérez, I saw your comment on last week’s ‘Talk with the Bioc Core Team’ about possibly using pkgdown for Bioc package landing pages. pkgdown does not support generic package vignettes - it’s hardcoded to vignettes written in the Rmarkdown format only. Vignettes in any other type of vignette format are silently dropped. I looked into whether or not this could be fixed, but pkgdown doesn’t make use of tools::buildVignette[s]() to build the Rmarkdown vignettes. It didn’t look like an easy fix to me.

Kasper D. Hansen (04:58:12): > I will come out in support of the terminology switch. It is indeed a pretty unfortunate term which should not be used. I can accept the opinion that the effort of purging the term is too high, but there is no doubt that the term is very unfortunate.

Kasper D. Hansen (05:01:28): > Personally, I don’t think the technical price of switching the term will be that great, but I admit I am not a git expert by any measure.

Federico Marini (05:14:55): > probably larger projects would feel this pain more - me as a single user, that’s doable in an hour or so? (ok, now I just jinxed it..)

Kasper D. Hansen (05:22:52): > For now, according to the article, this is for new repos. And they are working on a simple fix for existing repos; we’ll see how simple that really is.

Federico Marini (05:53:55): > could that be it already?

Federico Marini (05:53:56): > https://github.com/settings/repositories

Federico Marini (05:54:15): - File (PNG): image.png

Kasper D. Hansen (06:26:48): > yes, which is consistent with the article saying the change is in effect from Oct 1

Dirk Eddelbuettel (07:55:22): > The possibility to change the name of the main repo has been around for a few months already, and some projects have changed. The whole process was pretty well publicized and reported upon. And e.g. AFAICT the default instruction after creating a new repo got this new middle line added around the same time, making the name set explicit:

Dirk Eddelbuettel (07:56:27): - File (Shell): Untitled

Tim Triche (10:09:17): > this is the least worst solution to an unfortunate situation, nicely done IMHO

Vince Carey (11:26:32) (in thread): > @Marcel Ramos Pérez I’m unclear on the virtues of pkgdown for landing pages – are there examples of the benefits? Supposing that they are substantial, what would be the downside to strongly encouraging use of Rmarkdown for all Bioc vignettes, so that content in vignette folders would not be lost in the landing pages?

Marcel Ramos Pérez (11:48:29) (in thread): > The benefits would be that we avoid all the backend coding required to make the landing pages possible, by using a HTML template and running the R code for each package. I don’t think RMarkdown has to be used for all Bioc vignettes. The landing page is cosmetic and can probably reference PDF vignettes as long as we have a yaml header template that supports it: https://pkgdown.r-lib.org/reference/build_articles.html

Dirk Eddelbuettel (11:55:43) (in thread): > I probably should not say this because it is somewhat premature, but I began using the Python-based ‘Material for MkDocs’ in a few packages. My needs were similar to what @Henrik Bengtsson said (i.e. non-html vignettes) but also support for (old-school GNU) ChangeLog and NEWS.Rd. Plus general flexibility and …. a desire to escape the cookie-cutter look and feel of pkgdown. I tweeted once about it and added it by now to a handful of repos of mine: https://twitter.com/eddelbuettel/status/1305159954010124288 Happy to chat in DM. I would think Bioc has the firepower to mount something similar on whichever framework is seen as most suitable – which may just be extensions to pkgdown. - Attachment (twitter): Attachment > Delighted to have found a nice, clean, extensible package doc site generator: Material for MkDocs by @squidfunk
> > Added #Rstats scripts for NEWS, ChangeLog, man/, … which should be in a repo “soon”. See > https://eddelbuettel.github.io/anytime > https://eddelbuettel.github.io/nanotime > https://eddelbuettel.github.io/drat https://pbs.twimg.com/media/EhzbRHZWsAAohZ8.jpg

Marcel Ramos Pérez (12:04:11) (in thread): > I would still need to come up with an example that allows for PDF vignettes, but we have GitHub Actions (biocthis::use_bioc_github_actions) to help generate pkgdown websites for Bioconductor packages. > > Thanks for sharing @Dirk Eddelbuettel! This is a nice solution for PDF vignettes. It would be good to see which framework / setup is easiest to deploy so as to minimize maintenance overhead.

Dirk Eddelbuettel (12:06:57) (in thread): > Right. And I haven’t automated the generation of the markdown stubs yet, but e.g. the anytime doc site embeds the pdf document in a tag (that I believe to be html5 compliant but I could be wrong). That’s the “currently best” solution I could come up with: https://eddelbuettel.github.io/anytime/vignette/

Hervé Pagès (13:13:55) (in thread): > Which term should not be used? master? Ok then many people on this slack should fix their resume as suggested by @Robert Castelo

Robert Castelo (13:39:57) (in thread): > I wasn’t thinking of people’s CVs because you can’t change the degree you got backwards in time, but just wondering whether universities should make a similar change by renaming their current and/or future master degrees to magister degrees, assuming the term master originates in this context from mastery. - Attachment (merriam-webster.com): Definition of MASTERY > the authority of a master : dominion; the upper hand in a contest or competition : superiority, ascendancy; possession or display of great skill or technique… See the full definition

Hervé Pagès (14:06:59) (in thread): > Waiting for the universities to make the change is just a lame excuse for not removing an unfortunate term from your resume NOW.

Kasper D. Hansen (14:49:25) (in thread): > The master term for git, which - whether or not it’s true - can be viewed as coming from a master-slave context.

Kasper D. Hansen (14:49:59) (in thread): > We are trying to be polite and welcoming, and that includes taking people’s feelings into consideration.

Kasper D. Hansen (14:50:12) (in thread): > It is btw. ironic that I’m lecturing on being polite.

Hervé Pagès (14:56:50) (in thread): > I just hope you can accept the opinion that a lot of people actually doubt that the term master is very unfortunate, unless you artificially force it into a master/slave context.

Kasper D. Hansen (15:06:16) (in thread): > I can accept that opinion. But I also think people should reflect on what matters - is it the intention behind the usage or is it the perception? I have personally reached the conclusion that the perception matters quite a bit, which is also what we tend to agree on in most civil discourse.

Kasper D. Hansen (15:07:01) (in thread): > I also think, following the pointers to e.g. Linus above, that it’s not clear that the original intent was not master-slave, although I am not a git historian.

Hervé Pagès (15:26:43) (in thread): > Exactly, the perception is what matters. And before someone went to dig out some 15+ year old indication that the master/slave terminology was used in BitKeeper, nobody had any problem with naming their git branch “master”. Personally I didn’t, but maybe you did? OTOH I always found the master/slave terminology questionable and possibly offensive to some people, and I’m glad R’s --slave option is going away. But I never had that feeling with a git master branch and I don’t see how you could possibly genuinely connect it to the master/slave thing unless you work really hard on it. I also acknowledge that some people tend to work harder on connecting things than I do.

Will Townes (15:26:47): > @Will Townes has left the channel

Dirk Eddelbuettel (16:49:35): > BTW, just saw this in a U of I slack: https://github.com/github/renaming

Dirk Eddelbuettel (16:50:35): > > If you haven’t renamed your default branch yet, consider waiting until later this year. We’re investing in tools to make renaming the default branch of an existing repository a seamless experience for both maintainers and contributors.

Hervé Pagès (16:56:31) (in thread): > but not main. You’ll score less in scrabble with that one.

Aaron Lun (16:59:03) (in thread): > indeed

Hervé Pagès (16:59:38) (in thread): > default would have been a much better choice

Hervé Pagès (17:00:28) (in thread): > especially if you can place it in a corner:wink:

Hervé Pagès (17:11:59) (in thread): > or primary, even better

Hervé Pagès (17:40:36): > Since we are on renaming core stuff and breaking a lot of things along the way, I propose that we seize the opportunity to also change the naming scheme of release branches. Right now we use RELEASE_X_Y where X_Y is the Bioconductor version. Note that this is what we were using in the good ol’ time of Subversion. When we switched to git we imported trunk as master and the RELEASE_X_Y branches as-is (with no renaming) without giving it too much thought. However, I always had 2 issues with this: > 1. It’s a lot of typing. Every time I need to switch back and forth between master and RELEASE_X_Y (e.g. in the context of backporting fixes) my fingers complain (Ok ok I could have created convenient aliases for git checkout RELEASE_3_11 and git checkout master but I didn’t.) > 2. Most Bioconductor packages are primarily hosted on GitHub. Having branches named RELEASE_X_Y there does not have much meaning if you’re not aware of the Bioconductor context and the Bioconductor release naming scheme. > Therefore I propose that starting with the next BioC release we name the release branches BIOC_X_Y. So for example the branch we will create a couple of days before this month’s release will be BIOC_3_12 instead of RELEASE_3_12. Maybe not much less typing but at least it would make the purpose of these branches a lot clearer, especially when you see them on GitHub.

Aaron Lun (17:51:45): > well, if we’re talking about typing, maybe something that doesn’t involve holding down shift all the time would help.

Aaron Lun (17:51:55): > like bioc3.12.

Hervé Pagès (17:52:19): > I’m all for that. Are dots allowed in branch names?

Aaron Lun (17:52:26): > AFAIK

Aaron Lun (17:52:36): > -would also be fine

Aaron Lun (17:52:48): > a bit less natural than.though

Hervé Pagès (17:56:09): > Right, we use the dot everywhere else for the BioC version so being consistent would be nice. I believe there is also a long tradition of using caps for the 1st and last letter of BioC. So we would need to decide between consistency vs maximum convenience (i.e. all lower case).

Aaron Lun (17:57:09): > the bioc docker images are already all lower case, so… seems fine to me.

Hervé Pagès (17:58:04): > ah didn’t know that. If there is a precedent, then fine.

Aaron Lun (17:58:30): > not by choice, I should add: dockerhub just doesn’t give you the option!

Aaron Lun (23:33:05) (in thread): > They can take my master branch from my cold dead hands. I don’t care either way about the moral issue, and even the technical issues are solvable with enough smart people in the room. But I don’t like to be told how to feel and act. So I’m going to keep making repos with a master branch and making them the default on GitHub. People are free to get hurt if they like, but offense is taken, not given.

2020-10-07

Robert Castelo (05:11:31): > i find the worktree trick very useful to avoid typing RELEASE_X_Y all the time: > > git checkout -b RELEASE_3_11 upstream/RELEASE_3_11 > git merge upstream/RELEASE_3_11 > git merge origin/RELEASE_3_11 > git checkout master > mkdir -p ../bioc3.11 > git worktree add ../bioc3.11/myPackage RELEASE_3_11 > cd ../bioc3.11/myPackage > git status > On branch RELEASE_3_11 > Your branch is up-to-date with 'upstream/RELEASE_3_11'. > nothing to commit, working tree clean > cd ../../myPackage > git status > On branch master > Your branch is up-to-date with 'origin/master'. > nothing to commit, working tree clean > > so that you switch branches by simply changing directories.

Luke Zappia (05:25:28): > Is there a recommended/standard (hopefully) simple Bioc object for storing data from a VCF file?

FelixErnst (05:59:41) (in thread): > I think you might want to have a look at this: https://bioconductor.org/packages/3.11/bioc/html/VariantAnnotation.html - Attachment (Bioconductor): VariantAnnotation > Annotate variants, compute amino acid coding changes, predict coding outcomes.

FelixErnst (06:02:58) (in thread): > But VCF files are kind of a blind spot for me. So probably other people have first-hand experience. I also think that this might have been more suitable for a post in #general. Maybe you want to move the thread there, but I am not sure about this

Sean Davis (07:26:37) (in thread): > Just confirming that VariantAnnotation is the place to look.

Al J Abadi (15:28:33) (in thread): > @Hervé Pagès yes, as any branch can be default under the current framework to refer to the ‘base’ branch - File (PNG): image.png

Hervé Pagès (16:39:23) (in thread): > but what if I want the primary master branch to be the base branch by default but only in trunk?

Levi Waldron (16:40:13): > I’m a little late to the conversation about GitHub’s master -> main change, but I want to say 1) please avoid tone that comes across as dismissive or ridiculing of differing perspectives, 2) this is a change that reasonable people, including actual descendants of slaves, can disagree on (e.g. https://www.wired.com/story/tech-confronts-use-labels-master-slave/), and 3) the lack of significance of the word to any one of us individually is irrelevant; the point is what the overarching goals are and whether the change contributes to those goals. Raising conversations like this might have been one of the goals. @Hervé Pagès and @Aaron Lun I would welcome the chance to discuss it, even though I’d prefer another venue that doesn’t involve typing… - Attachment (Wired): Tech Confronts Its Use of the Labels ‘Master’ and ‘Slave’ > Companies and programmers are reexamining how technical terms are used amid Black Lives Matter protests. But some worry the changes are empty symbolism.

2020-10-08

Aaron Lun (03:58:31): > Well, I would say that there’s a considerable lack of hard data on this matter, which is rather unbecoming of us as scientists. So I propose an experiment based on the following two premises: > > - A more inclusive community is valuable because we have more contributors to get more work done. > - A more inclusive community is valuable because we have more users to justify the grant funding. > > There may also be some fluff about good feelings and happy thoughts and all that, but those are difficult to quantify so I’m leaving them out. Based on the points above, I propose the following experimental design: > > 1. Randomly allocate all BioC packages into three groups. > 2. In the first group, change the default branch to main. > 3. In the second group, change the default branch to a “control” name. > 4. In the last group, leave the default branch as master. > > Over the next few releases, we then collect data on the number of commits and the number of downloads as our two primary readouts. (More sophisticated breakdowns are possible based on GitHub usernames, IP addresses, etc. but I’ll leave those details aside for now.) The million dollar question is the following: Is the level of “community involvement” in package development/use affected by the group to which that package is allocated? If the existence of a master branch is truly non-inclusive, this should manifest as an increase in community involvement (i.e., more commits or downloads) once the branch has been changed to main. But if there is no difference, then it really doesn’t matter whether certain people find it offensive, because those same people probably weren’t going to contribute to the project in the first place. > > Now, a tricky question is the choice of a control name. You don’t want to just compare the first and last groups because there may be an increase in the number of commits in the former just to deal with the branch name change. So we need a control name to account for this effect, where this control still has the same properties as master. Personally, I would propose trump because I would hypothesize that this insults the same type of people who find master offensive. (It’s also pretty close to trunk.) > > If we see a significant increase in contributors or users, then I will grudgingly accept that this change has a useful practical effect. Otherwise this discussion is purely academic and I will need more alcohol to enjoy it. > > P.S. I know I said random, but please put all my packages in the last group. > > P.P.S. Though I would also be amused by committing to trump. It’ll be funny for the next 1-48 months.

Laurent Gatto (05:23:52) (in thread): > I think you miss an important point, Aaron. It has absolutely nothing to do with quantifying the community involvement, or the quality of the code, the number of contributions, or anyone’s productivity. It’s simply about appreciating that other people might feel differently and, with little personal effort, improving the way we communicate.

Levi Waldron (05:38:02) (in thread): > Bioconductor currently does not have that many contributors from under-represented minorities, so the experiment seems kind of self-affirming. The most direct measurement would be to reach outside the current community and ask, but other people have already done that.

Aaron Lun (11:08:15) (in thread): > I don’t think that people’s feelings are important if they don’t have any practical consequence.

Aaron Lun (11:21:13) (in thread): > And you’re more than welcome to do general promotion of BioC to minorities if you want to determine the effect over the course of the evaluation.

Aedin Culhane (11:54:35) (in thread): > Words matter. There is no justification for using terms that exclude when it’s fixable. Let’s just make the change

Aaron Lun (12:04:47) (in thread): > Well, I’m not convinced that anyone’s being quantifiably excluded here.

Aaron Lun (12:05:29) (in thread): > Based on these comments, I can only assume that people are afraid that their ideology won’t stack up against some hard empirical data.

Aaron Lun (12:09:40) (in thread): > And no, words don’t really matter. They only matter with respect to their effect on people’s behavior. And that’s what I’m proposing to measure here. Surveys about feelings are completely irrelevant to me.

Hervé Pagès (12:41:29) (in thread): > I think words and feelings are important. But primary feelings, not proxy feelings e.g. “I think it’s offensive because I assume it might be offensive to some people from a group I don’t belong to but I’m not going to ask them”. So yes, showing some data would have helped. It would have been a very different story if GitHub’s new management had taken the time to collect some data about how offensive the name of the master branch is to people, especially people from the black community. This would have been the right way to go, and the only way to justify such a disruptive change. But no, the whole thing is based on air.

Levi Waldron (13:30:20) (in thread): > @Aaron Lun I would point out flaws in your experimental design and its premises, and ask how much time you’ve spent searching primary literature before stating that no relevant empirical data exist. But I find your tone not conducive to real conversation, and I think it’s time to close this off-topic and anti-productive thread. @Hervé Pagès I agree with you that speaking on behalf of others without consulting them is a bad thing, but wouldn’t just assume that’s what happened. Maybe Microsoft/GitHub did no research before making a disruptive change - I don’t know - but my own humility makes me more inclined to leave open the possibility that other people might know something I don’t, especially on a topic that’s not my own.

Hervé Pagès (13:35:32) (in thread): > They might know something that we don’t, in which case they should share. So at best, the decision making process lacked transparency. At worst, they know nothing more than you and I do.

Aaron Lun (13:37:40) (in thread): > Insofar as this thread is winding up, I’ll just finish by saying that I’m not the person proposing the change, so I don’t see how the responsibility for giving hard contributor data lies on me. You’re more than welcome to propose a better experimental design… which is a step forward fromnoexperimental design. And no, I don’t think my premises are flawed.

Michael Lawrence (18:18:58): > @Michael Lawrence has joined the channel

2020-10-09

Henrik Bengtsson (12:56:08): > Suggestion for Bioc package pages: Report on the DESCRIPTION field OS_type under ‘Details’, e.g. it’s not clear from https://bioconductor.org/packages/release/bioc/html/bigmemoryExtras.html that the package is only for OS_type: unix

Lori Shepherd (13:16:42): > We could look at adding another field – I will mention that this is why the platform badge at the top only displays some platforms

Lori Shepherd (13:18:15): > And when you click on that badge it takes you to the bottom section where you can see that the windows binary is blank/unavailable

Henrik Bengtsson (13:28:44) (in thread): > Could the Windows build/badge ever become “blank” because of Bioc build problems, e.g. a system dep got upgraded that is hard to build on Windows? … which would be in contrast to the developer explicitly saying “this is only supported on Unix”.

Henrik Bengtsson (13:35:58) (in thread): > As you probably guessed, I noticed this after reading https://stat.ethz.ch/pipermail/bioc-devel/2020-October/017296.html. Regarding that, technically I think the Bioc build system could scan the package dependency graph for hard dependencies (Depends, Imports, LinkingTo) that have OS_type: unix specifications and conclude when a package is only supported on Unix-like systems. That would make UnsupportedPlatforms: win in .BBSoptions redundant - and one less thing for developers to know about (… and the Bioc build system one step closer to standard R package builds/checks)

Hervé Pagès (18:15:38) (in thread): > The .BBSoptions mechanism is more “fine grained” than OS_type, e.g. it allows us to specify things like “unsupported on 32-bit Windows” or “unsupported on mac” or “unsupported on a given builder” (where the builder is specified by name). We sometimes need that level of control. Very few developers know about OS_type or .BBSoptions but it doesn’t really matter because even when they know about it, they will very rarely go ahead and mark their package as unsupported on Windows when they were not able to make it work on this platform. We generally do this for them :wink: I agree that we could probably make the unsupported platform information more visible on the package landing page. BTW what are the bigmemoryExtras rev deps that need to be marked as unsupported on Windows? Can’t see any. Thanks!

Henrik Bengtsson (21:44:15) (in thread): > Regardless of .BBSoptions, I’d argue that a package known to only work on Unix should declare so in OS_type, and if it doesn’t, that’s a mistake. It’s also a nice gesture to anyone who considers depending on it. R CMD INSTALL acknowledges it and will give an informative error message if the current OS does not support it, e.g. with OS_type: windows you’ll get: > > $ R CMD INSTALL teeny_0.1.0.tar.gz > * installing to library '/home/hb/R/x86_64-pc-linux-gnu-library/4.0-custom' > ERROR: Windows-only package > * removing '/home/hb/R/x86_64-pc-linux-gnu-library/4.0-custom/teeny' > > I don’t think reverse dependencies should have to repeat this, just like they don’t have to repeat SystemRequirements. > > There’s also Archs, but honestly I’m not sure of its purpose/where it’s documented, e.g. it might be something that R CMD build adds/updates. Scanning CRAN gives: > > > db <- tibble::as_tibble(utils::available.packages(repos = "https://cran.r-project.org")) > > table(db$OS_type) > unix > 32 > > table(db$Archs) > i386, x64 x64 > 11 1 > > and R CMD check --as-cran is happy with it if you add it to the DESCRIPTION.
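(For readers unfamiliar with the field, a hypothetical DESCRIPTION fragment — package name and version invented — declaring a Unix-only package would look like:)

```
Package: myUnixOnlyPkg
Version: 0.1.0
Title: Hypothetical Example of a Unix-Only Package
OS_type: unix
```

With `OS_type: unix` declared, `R CMD INSTALL` on Windows refuses the install with an informative error, mirroring the `OS_type: windows` example Henrik shows above.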

2020-10-11

Kozo Nishida (21:43:43): > @Kozo Nishida has joined the channel

2020-10-13

Lori Shepherd (09:35:09): > We think we are still about a week away from the support site upgrade rollout. We would appreciate any additional feedback for the testing site before we officially launch. Please access the testing site at http://supportupgrade.bioconductor.org/. Please report any bugs or enhancements by opening an issue at https://github.com/Bioconductor/SupportUpgrade/issues

2020-10-14

Matthew McCall (13:00:23): > @Matthew McCall has joined the channel

David Burton (13:02:49): > @David Burton has joined the channel

2020-10-16

Kasper D. Hansen (04:50:43): > What are best practices for testing code that depends on the internet? We have a package whose purpose is accessing remote data. For most of the code, if we want to run it, we should really run it where it accesses the remote files. However, that requires internet connectivity etc. Someone else must have thought about this? What are the best practices?

FelixErnst (04:51:50): > httptest::skip_if_disconnected might be an option to look into

Sean Davis (05:21:08) (in thread): > One standard approach is to refactor a bit and then use “mocking.” Here is what that looks like in python:https://realpython.com/testing-third-party-apis-with-mocks/ - Attachment (realpython.com): Mocking External APIs in Python – Real Python > Let’s look at how to test the use of an external API using Python mock objects.

Sean Davis (05:23:00) (in thread): > And capabilities in R for a related approach:https://cran.r-project.org/web/packages/httptest/vignettes/httptest.html
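
The skip-and-mock pattern being discussed here can be sketched roughly as follows (a minimal, hypothetical example using testthat + httptest; the URL and test names are made up, and fixtures for with_mock_api() would first need to be recorded with capture_requests()):

```r
library(testthat)
library(httptest)

# Layer 1: one live test that validates the endpoint is reachable at all.
test_that("the endpoint is alive", {
  skip_if_disconnected()                            # skip cleanly when offline
  res <- httr::GET("https://example.org/api/v1/status")
  expect_false(httr::http_error(res))
})

# Layer 2: everything else runs against recorded responses, so the
# parsing/processing code is exercised without any network access.
with_mock_api({
  test_that("response parsing works offline", {
    res <- httr::GET("https://example.org/api/v1/status")
    expect_true(is.list(httr::content(res)))
  })
})
```

The split matters for the trade-off raised later in this thread: the live test catches a changed return format, while the mocked tests stay fast and reliable on the build machines.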

Kasper D. Hansen (06:26:26) (in thread): > So if I understand this correctly, you just do two things (which we are already doing) > 1. you separate retrieving the data and processing the data. > 2. For testing you save a local copy of what the retrieved data looks like and run it through the processor > This is easy for us to do. But IMO it is a bit fake. What happens if the remote service (for example) changes its return format? Then the test will pass, but the package won’t work

FelixErnst (07:19:17) (in thread): > For me that is the reason why I just skip the test if there is no connection. I set up the tests on the other low-level parts to be run with dummy data (not even a mockup of the connection) before the actual test with the remote connection. That way I know that parsing/loading/whatever works or doesn’t, and, if the test with the connection is performed, that the parsing of the current return values works

Martin Morgan (07:29:48) (in thread): > One test that validates the endpoint, for example tests the openapi version. The rest mocks as Sean says. This is very much in the spirit of unit tests, where you verify that the specific functionality you implement in the code works.

Kasper D. Hansen (07:31:09) (in thread): > So I should have a test validating the endpoint, despite the fact that it requires an internet connection and that the endpoint is responding?

Martin Morgan (09:51:30) (in thread): > That’s what I think, yes. The two problems with internet tests seem to be the cost of data transfer, and the ‘flakiness’ of the service / internet. The approach limits data transfer, and exposes only one call to flakiness. And if the endpoint isn’t responding, then the package is broken…

Martin Morgan (10:07:15) (in thread): > But I’ll mention an example of a service one of my packages relies on. The API was not versioned. I encouraged versioning. Here is the sequence of versions used, reverse chronologically, over 79 commits > > version: "0.2.0" > version: "0.1.0" > version: "0.0.1" > version: "0.1" > version: '0.1' > version: "0.1" > > :disappointed:but actually I think it’s getting better, and it was really helpful to make the API provider aware of the value of semantic versioning.

Sean Davis (10:33:24) (in thread): > There is another reason to separate concerns clearly for testing. For third-party services, the separation of concerns between 1) basic internet access to the endpoint, 2) api data model for request (query parameters, etc.), 3) data model for response (return format), and 4) processing of response are all important details. Numbers 1-3 should likely undergo testing daily, hourly, or even more often (and independently from client code changes) depending on the priority of catching problems early and often that stem from internet connectivity issues or changes to a backend that are out-of-sync with client changes.

2020-10-17

Kevin Blighe (08:24:24): > @Kevin Blighe has joined the channel

2020-10-20

Mahmoud Ahmed (07:44:20): > @Mahmoud Ahmed has joined the channel

2020-10-21

Mike Smith (10:37:41): > The next Developers Forum will take place this coming Thursday 22nd Oct at 09:00 PDT / 12:00 EDT / 18:00 CEST - You can find a calendar invite at https://tinyurl.com/y4y2xptb We will be using BlueJeans and the meeting can be joined via https://bluejeans.com/114067881 (Meeting ID: 114 067 881). @Constantin Ahlmann-Eltze is going to introduce us to the sparseMatrixStats package for high-performance computation on sparse matrices. Some of the benchmarking in the package vignette looks very impressive, and I’m excited to learn more about both sparse matrices in general and the performance gains the package might help us realise. > > Please let me know if you’d like anything else added to the agenda.

2020-10-22

Mike Smith (10:51:02): > <!channel> Just a reminder that this meeting is happening in about an hour. I thought it might also be an opportunity to discuss mocking and unit tests following the discussion in this channel a few weeks ago, with some examples in the context of biomaRt and Ensembl.

Tim Triche (10:51:35): > hot. Is the sparseMatrixStats discussion still on, though?

Constantin Ahlmann-Eltze (10:51:47): > Sure:slightly_smiling_face:

Dirk Eddelbuettel (13:03:28): > That was a really nice presentation. As for the comparison to Python (and its sparse matrix support), I also do not know, but when we added sparse matrix support to RcppArmadillo (with the help of a Google Summer of Code student) we referred to some SciPy code via reticulate just for unit testing – that may be a starting point for “simple / first” benchmarks.

2020-10-23

Kayla Interdonato (09:38:42): > @Constantin Ahlmann-EltzeIf you could share your slides from yesterday’s presentation I can be sure they get added to the website along with the video recording.

Constantin Ahlmann-Eltze (10:41:15) (in thread): > Sure, that would be great :) - File (PDF): 2020-10-22_sparseMatrixStats.pdf

Kayla Interdonato (10:47:08) (in thread): > Thank you!

Kayla Interdonato (15:32:02): > The recording from yesterday’s devel forum is now available on our YouTube (https://youtu.be/uw90J1Oy0Bc) as well as on the course materials on the website (https://www.bioconductor.org/help/course-materials/). - Attachment (YouTube): Developers Forum 15

2020-11-04

Regina Reynolds (15:55:52): > @Regina Reynolds has joined the channel

Hervé Pagès (17:01:38) (in thread): > So now I’m being told by the CoC Committee that my comments made “some members (in particular Bioconductor members who are descendants of slaves) feel unwelcome by insisting on the usage of offensive terms.” Sorry but this is nonsense. > > I thought my point was clear, but if it was not, here it goes again. My point is that GitHub’s management didn’t provide evidence that the stand-alone term “master” was offensive per se, i.e. when not used in the master/slave context. Collecting data before making such a disruptive change is what responsible professionals do. Please note that I also said that I always found the master/slave terminology questionable and possibly offensive to some people, and that I was glad that it was going away (e.g. R --slave). > > So I only expressed my opinion and I stand by it. Maybe I expressed it too loudly for some people’s taste, or maybe I was rude, in which case I apologize. But I fail to see how I expressed it in a way that would make “some members (in particular Bioconductor members who are descendants of slaves) feel unwelcome by insisting on the usage of offensive terms”. I think this is a serious accusation. I also think it’s a moot one because what I was discussing was the offensiveness of the standalone term in the first place. That’s not the same as insisting on using a term that I know is offensive, which is what I’m being reproached for here. I’ve not done that. I’ve never done that. I will never do that. That’s not me. So this accusation is unfair and a gross misinterpretation of what I said. > > Just wanted to clarify things.

Aaron Lun (18:01:07) (in thread): > Ah, they probably meant to send that to me.

Hervé Pagès (18:06:53) (in thread): > What, you didn’t get that email too? Should come soon. Now you got a heads-up of what to expect.

Aaron Lun (18:08:26) (in thread): > oh my god

Aaron Lun (18:08:32) (in thread): > i just deleted it out of instinct

Aaron Lun (18:10:57) (in thread): > Well. If we’re going to start banning arbitrarily offensive words, maybe we can ban the usage of%>%? I find its usage deeply offensive.

Hervé Pagès (18:13:46) (in thread): > No more offensive than lines of code that are 80+ characters wide. They’re exclusionary in assuming that everybody can afford a big screen.

2020-11-10

Shubham Gupta (10:20:14): > Converting an int64 vector to a data frame changes its values. > > > x <- data.frame("a" = bit64::as.integer64(c(7742456255764097691, 6462000664077079508))) > > x > a > 1 7742456255764098048 > 2 6462000664077079552 > > The last digits are changed. Does anyone know what could be happening? I am using R version 4.0.3

Dirk Eddelbuettel (10:25:46): > The issue is more fundamental. The values you assigned do not even make it into bit64 correctly:

Dirk Eddelbuettel (10:26:07): - File (R): Untitled

Dirk Eddelbuettel (10:27:16): > It works if you start from character:

Dirk Eddelbuettel (10:27:36): - File (R): Untitled

Shubham Gupta (10:44:58): > Thanks. This makes sense. Previously the values were numeric, which is why they did not match
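
For reference, the root cause Dirk points at can be demonstrated directly (a small illustration, not from the thread): the numeric literals are parsed by R as doubles before bit64 ever sees them, and doubles represent integers exactly only up to 2^53.

```r
# Doubles have 53 bits of mantissa, so integers above 2^53 (~9.0e15)
# are rounded to the nearest representable double at parse time.
2^53 + 1 == 2^53                          # TRUE: already beyond exact range
print(7742456255764097691, digits = 22)   # not the digits that were typed

# Starting from character strings avoids the lossy double round-trip:
x <- bit64::as.integer64("7742456255764097691")
```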

2020-11-14

Aaron Lun (04:32:55): > @Martin Morgan your thoughts on https://github.com/LTLA/BiocSeed would be welcome.

Martin Morgan (14:10:11) (in thread): > @Henrik Bengtsson might have comments. > > Is this what one wants from nested calls? > > > f = function() { setBiocSeed(1); unsetBiocSeed() } > > g = function(x) { rnorm(1); setBiocSeed(2); if (x) f(); print(rnorm(1)); unsetBiocSeed() } > > g(FALSE) > [1] -1.130333 > > g(TRUE) > [1] 0.5761661 > > Are there concerns that the serialized version of objects change for subtle (to the end user) reasons, e.g., when slots of an S4 object are renamed? > > The pattern used by setwd() or options(), for instance > > old_wd <- setwd("/tmp") > on.exit(setwd(old_wd)) > > has (in my opinion; I feel like I’m going to be pilloried for this) good things to recommend for it, including error recovery and straight-forward management of state. > > The purists might be concerned about the arbitrary but not random selection of seed from the serialized hash; is there a solution using parallel::nextRNGStream()?

Aaron Lun (14:52:13) (in thread): > Thanks Martin. > > Is this what one wants from nested calls? > Oops. Fixed. > > Are there concerns that the serialized version of objects change for subtle (to the end user) reasons, e.g., when slots of an S4 object are renamed? > Yes, I was thinking about this last night. I switched x to anything that is coercible to a character vector. Hopefully, e.g., double->character is a well-defined operation. Open to other ways of portably getting bytes from an object. > > has (in my opinion; I feel like I’m going to be pilloried for this) good things to recommend for it, including error recovery and straight-forward management of state. > Yes, that might eliminate my nesting shenanigans. > > The purists might be concerned about the arbitrary but not random selection of seed from serialized hash; is there a solution using parallel::nextRNGStream() ? > That’s kind of the point, though, to be deterministic in the seed selection.

2020-11-15

Martin Morgan (11:21:06) (in thread): > I guess there’s a subtle distinction between ‘arbitrary’ and ‘pseudo-random’ – aim for deterministic selection of a pseudo-random, rather than arbitrary, seed.

Aaron Lun (14:40:32) (in thread): > you mean: generate an arbitrary seed via the hash, and then use the seed to call a PRNG to generate a pseudo-random seed?

Kasper D. Hansen (15:27:25): > Personally, I don’t like this

Kasper D. Hansen (15:28:41): > I don’t see a convincing argument in the vignette. I can see a thousand ways where this could result in potential problems

Aaron Lun (15:33:40): > I just spent a week debugging a chapter of the book because I missed a set.seed() call prior to a random step. So I don’t want to do that again.

Aaron Lun (15:34:31): > Implementations of theoretically deterministic steps like KMKNN or IRLBA should abstract away the random component if they want to be plug-and-play replacements for other algorithms.

Kasper D. Hansen (15:34:41): > You did not set the seed and you got confused because the results changed?

Kasper D. Hansen (15:35:01): > You don’t abstract it away by setting the seed

Aaron Lun (15:35:04): > I didn’t even know that there was a random component in that function.

Aaron Lun (15:35:10): > Sure you can.

Kasper D. Hansen (15:35:13): > No

Kasper D. Hansen (15:35:18): > Its a fake fix

Aaron Lun (15:35:25): > I’m pretty sure I can.

Kasper D. Hansen (15:35:54): > There are two possibilities. Either the algorithm converges and gives you exact precision, in which case you don’t care that it’s random

Kasper D. Hansen (15:36:15): > Or you take something that is random and just arbitrarily decide what the result should be by setting the seed

Aaron Lun (15:36:26): > But I do care that it’s random, because it progresses the RNG stream and affects all downstream results.

Kasper D. Hansen (15:38:02): > So? In that case, whatever confidence you reported downstream results with, is arbitrary

Aaron Lun (15:38:45): > yes, but do you know how annoying it is to have to continually update the arbitrary cluster numbers because they all flipped around when you shifted the stream?

Kasper D. Hansen (15:39:46): > I sympathize with your use case because writing a book is one of the rare few cases where I would like full control over the random stream

Aaron Lun (15:39:50): > You should try to maintain the book at some point. Then you’ll see why this is necessary.

Kasper D. Hansen (15:40:18): > But the solution to that is custom book-specific code. Not stuff that should get put into packages used by people for analysis

Aaron Lun (15:40:19): > Not just for the book. But for all practical analyses where you are making hard statements about specific outputs.

Aaron Lun (15:40:35): > The book contains real analysis scripts where I talk about “cluster 5” or blah blah blah.

Aaron Lun (15:40:42): > This is no different from what people do IRL.

Kasper D. Hansen (15:41:00): > If your hard statements change upon the random stream, they should not have been made

Aaron Lun (15:41:37): > You probably haven’t been doing a lot of single-cell analysis, then.

Kasper D. Hansen (15:42:24): > So the issue is you run a random algorithm and you get a different result every time. The solution to this is not to ignore that you ran a random algorithm by setting the seed

Kasper D. Hansen (15:43:09): > In fact, I would argue that when you see the results change before your eyes, you realize that the results were not so confident

Aaron Lun (15:44:44): > Of course the solution is to set the seed. That’s what everyone does at the top of their scripts and then they forget about everything downstream. Do you think people are actually looking at each step and checking for randomness? hell no.

Aaron Lun (15:45:11): > No one bothers to do multiple runs with different seeds. When your t-SNE takes an hour to run, you’re just happy you got back a result.

Kasper D. Hansen (15:46:06): > I’m not saying I’m doing this systematically every time. But I am often inadvertently doing this when I re-run my script.

Kasper D. Hansen (15:51:44): > If you start to put this into packages you’ll make everything look nice and deterministic, and what is really happening is you pick a specific version of reality

Aaron Lun (18:50:26): > Whether you like it or not, people are already picking a specific version of reality without knowing anything about it. If they’re going to slap set.seed(0) at the top of their scripts, why don’t we just make life easier for them and do it ourselves?

Aaron Lun (18:53:15): > Do you know the full implications of calling set.seed(0)? I bet you don’t, and I bet you also don’t explore the full 2^32 - 1 space of possible seeds. At least BiocSeed will go to parts of the RNG stream that human minds will not touch.

Aaron Lun (18:54:03): > Furthermore, I would say that getting new results when you re-run your script is not a feature, it’s a bug.

2020-11-16

Martin Morgan (03:09:38): > I feel khansen is correct in saying that interpretations depending on a specific random number seed are not robust scientific results. In a t-SNE one isn’t expecting points to be at identical physical locations, but for the scientific interpretation derived from the visualization to be approximately consistent across random number streams. > > And at the same time I agree with wizard_of_oz that for expository purposes it can be very helpful to fix (in the sense of making constant) the output. > > I think the compromise (and practice that we try to encourage during the package review process, e.g., https://github.com/Bioconductor/Contributions/issues/1764#issuecomment-727247619) of explicitly setting the seed for a vignette is appropriate, but setting the seed in package code is not. > > I viewed BiocSeed as an attempt to better manage the random number stream, e.g., making individual steps robust to the ordering of the overall workflow. Maybe a more explicit approach is to disable functionality by default, so the user (e.g., in a vignette) must say BiocSeed::enableBiocSeed(). This provides the opportunity for explaining why the seed is being set – reproducibility for exposition purposes – and why it should not be set in general – so that one has confidence in the analytic repeatability of the scientific insights being made. As the author of a complicated exposition like a book, this reduces the chance of missing a set.seed() from once per code chunk to once per markdown document.

Martin Morgan (03:23:28) (in thread): > My rough understanding is that the L’Ecuyer (referenced on ?parallel::nextRNGStream()) random number generator has a period of 2^191. It is divided into 2^64 streams, each of period 2^127 and each appropriately pseudo-random relative to the others. So I was suggesting that you would use the hash to select one of the 2^64 streams, rather than to choose one of the 2^191 starting points in the period of the RNG.
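
A rough sketch of what that suggestion could look like (hypothetical; hash_to_int() and the number of stream jumps are placeholders for a real hash of the input object):

```r
# Select an L'Ecuyer-CMRG stream deterministically from a data-derived hash,
# rather than feeding the hash straight into set.seed().
RNGkind("L'Ecuyer-CMRG")
set.seed(1)                                 # fixed, documented starting stream
seed <- .Random.seed
n_advance <- 7L                             # would come from hash_to_int(x)
for (i in seq_len(n_advance))
  seed <- parallel::nextRNGStream(seed)     # jump to the next independent stream
assign(".Random.seed", seed, envir = globalenv())
rnorm(3)                                    # draws now come from the selected stream
```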

Spencer Nystrom (08:25:21): > Perhaps another useful addition could be some sort of check that each setBiocSeed is followed by an unsetBiocSeed? This may be more trouble than it’s worth, but if it wound up in a package, it’d be nice to have something to add as a unit test against this, because I imagine it could be tricky to debug.

Mike Smith (11:13:05): > The next Developers Forum will take place this coming Thursday 19th Nov at 09:00 PST / 12:00 EST / 18:00 CET - You can find a calendar invite at https://tinyurl.com/BiocDevel-2020-09-19 We will be using BlueJeans and the meeting can be joined via https://bluejeans.com/114067881 (Meeting ID: 114 067 881) > > This month Gabe Becker (https://twitter.com/groundwalkergmb) will introduce the topic of Alternate Representations of R Objects, or ALTREP. It’s been in R for a while (first introduced in R 3.5.0), ALTREP was the most requested Developers’ Forum topic in our survey (https://community-bioc.slack.com/archives/CLUJWDQF4/p1600348498070900), and this is sure to be an interesting look into some of the fundamentals of how R works. - Attachment (twitter.com): Gabe Becker (@groundwalkergmb) | Twitter > The latest Tweets from Gabe Becker (@groundwalkergmb). Doer of #rstats and #reproducibility things. I’m not influential, but people who are think some of my ideas are pretty good. Formerly Scientist @genentech. Pleasant Hill, CA

Martin Morgan (11:17:39) (in thread): > @JiefeiWang would be a great addition to the ALTREP discussion: https://bioconductor.org/packages/SharedObject https://github.com/Jiefei-Wang/Travel https://github.com/Jiefei-Wang/HighFive - Attachment (Bioconductor): SharedObject > This package is developed for facilitating parallel computing in R. It is capable to create an R object in the shared memory space and share the data across multiple R processes. It avoids the overhead of memory dulplication and data transfer, which make sharing big data object across many clusters possible.

JiefeiWang (11:34:54) (in thread): > Thanks for the reference. It is great that the ALTREP topic will be covered by the Developers Forum. I’ll be more than happy to introduce some interesting usage of ALTREP if I can have a lightning talk after Gabe’s talk.

Levi Waldron (12:42:48) (in thread): > I agree with this statement that explicitly setting the seed for a vignette is appropriate, but setting the seed in package code is not. I once wasted quite a few CPU hours on a cluster because a package function set the seed and, as a side effect, I got back thousands of identical simulation results even though that function was just one part of the simulation. It wasn’t obvious because it was a default side effect. Now I’m curious to recall what CRAN package it was and whether the maintainer heeded my plea to get rid of the seed side effect!

2020-11-17

Henrik Bengtsson (12:46:26): > Here’s my take on the idea of setting the random seed based on input data. > > The short version: > > I argue that statistically this is not a good idea, for the same reason we don’t want to set set.seed(42) in our ~/.Rprofile startup file. If that were a good idea, R would already do it for us. > > The long version: > > Methods that rely on resampling techniques only give accurate results on average across random-number sequences - there’s always a non-zero probability that one of these random-number sequences will produce a very biased result. The correctness of these methods is that they are correct on average. If we happened to find interesting, significant results by chance, these findings will not survive another run, the scrutiny of other people, or the passage of time. That is, the statistical method and the scientific process are designed to catch such false results. > > With a deterministic estimator f(), the estimate can be calculated as y = f(Data). In contrast, we can think of an estimator that relies on resampling techniques as y = f(Data, {RNG_i}, i) where ‘RNG_i’ is the random-number sequence with initial seed ‘i’ and ‘{RNG_i}’ is the set of all possible random-number sequences. When we fix the random seed ‘i’, we effectively get an estimator y = f(Data, RNG_i). If we are unlucky, RNG_i will be very biased. See https://stats.stackexchange.com/a/157646 for an example using R code to illustrate this point. > > AFAIU, the proposal here is to make the random seed a function of the data, i.e. i = h(Data). This is just like the latter case, where we get y = f(Data, {RNG_i}, i=h(Data)) = f(Data). That is, the resampling technique used by the estimator is no longer random, and for some input data there is a non-zero probability that you end up with a “poor” random seed that gives you a very biased estimate. 
> > I think the key point to understand here is that there is not just a single instance of ‘Data’ - there are many instances out there and there will always be cases where using the random seed i = h(Data) will produce very biased results. This means that if our method is hard-coded to use i = h(Data), it will contribute to incorrect scientific results being published and rerunning the method again will not detect this mistake.
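
Henrik’s argument can be illustrated with a toy bootstrap (my own example, not from the linked answer): once the seed is a function of the data, the resampled estimate is deterministic, so an unlucky seed stays unlucky no matter how many times the analysis is rerun.

```r
# Bootstrap estimate of a mean with an explicitly chosen seed.
boot_mean <- function(x, seed) {
  set.seed(seed)
  mean(replicate(200, mean(sample(x, replace = TRUE))))
}

x <- rnorm(100)                                  # hypothetical "Data"
hashed_seed <- 42L                               # stand-in for i = h(Data)
stopifnot(identical(boot_mean(x, hashed_seed),   # reruns can never disagree,
                    boot_mean(x, hashed_seed)))  # so they can't flag a bad seed
spread <- sapply(1:20, function(s) boot_mean(x, seed = s))
range(spread)                                    # seed-to-seed variation the fixed choice hides
```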

Kasper D. Hansen (16:33:30) (in thread): > This is the kind of stuff we want to avoid

Jeroen Ooms (18:30:59): > @Jeroen Ooms has joined the channel

Gabriel Becker (18:51:01): > @Gabriel Becker has joined the channel

Gabriel Becker (19:06:48): > Hi all. > > I have developed a simple illustrative package as a companion to my presentation on Thurs, which you can find at https://github.com/ALTREP-examples/vectorwindow I don’t know how closely I’ll be able to go through the code, given that I have a lot of introductory slides to get through first, but if you want to look through it and come with questions, I’ll be happy to answer them. If not, we’ll get through what we can. > > See you in a couple days.

2020-11-18

Aaron Lun (03:23:48) (in thread): > I understand the theoretical implications well enough. But there are real practical costs to following this ideal to the letter. All it takes is for a user to forget their set.seed() call and they can never get back their results. At best, this is a surprise; at worst, in enterprise settings where we’re pushing data through pipelines en masse, this is a bug. > > I would further say that, for most people, this theoretical ideal is not followed in practice. Users are conditioned to slap set.seed() at the top of their scripts, usually with some very non-random seed like 0 or 100 or, indeed, 42. They are already choosing a specific version of the random results to report, whether we like it or not. The subset of users who are cognizant enough to perform multiple runs may continue to do so, but I would argue that this is the minority and so my defaults should not cater to them. > > The issue is particularly pronounced when the randomization occurs at a low level of the stack (e.g., NN-search, PCA implementations). Users are basically forced to spam set.seed() everywhere to make sure that the results are reproducible and robust to reordering of the analysis code that would otherwise change the random number stream. I would consider this an antipattern, and the various chapters of the OSCA book provide an excellent example of how this plays out in a large-scale analysis. > > In any case, I would happily accept the low chance of false results (which would never be zero anyway, even for deterministic algorithms) against the much higher chance of user pain from not getting exactly reproducible results.

Aaron Lun (03:24:17) (in thread): > Which BiocSeed does avoid.

Aaron Lun (03:33:21) (in thread): > I would say that my experience with the book is simply a microcosm of what happens in a real analysis. For me, there is no difference in the treatment of the seeds between the two. I throw down set.seed() calls everywhere to make sure that the results are robust to code reordering and insertion/removal of random steps. This was a lesson painfully learnt (e.g., when BiocParallel changed the random seed, so you got different results depending on how many workers were assigned) and I don’t fancy going through it again. The cost of irreproducible results is real and high; and if you’re worried about scientific correctness, well, the probability of getting false positives with low-sample-size data analyzed by deterministic algorithms is much higher than getting an unfortunate series of ~10000 random draws to do some initialization or whatever.

Henrik Bengtsson (13:03:20) (in thread): > > All it takes is for a user to forget their set.seed() call and they can never get back their results. At best, this is a surprise; > So this can easily be resolved by recording the random seed at the start of the pipeline. Pipelines/packages/functions could return the incoming random seed as, say, an attribute. > > At worst, in enterprise settings where we’re pushing data through pipelines en masse, this is a bug. > To me that is a bug of the “enterprise settings” - not the statistical methods. There are methods that rely on proper randomness across multiple runs by design and we should not step away from that requirement via hacks. It’s a risky business and I don’t think we want to spread this within such important fields as the Bioconductor community represents. > > Also, note that there are non-pseudo RNGs for which we cannot set an initial seed, e.g. random::randomNumbers(10). It’s not totally unlikely that there will be an RNGkind() for those in R at some point. > > I would further say that, for most people, this theoretical ideal is not followed in practice. Users are conditioned to slap set.seed() at the top of their scripts, usually with some very non-random seed like 0 or 100 or, indeed, 42. They are already choosing a specific version of the random results to report, whether we like it or not. > I think it is our responsibility to explain what the implications could be from such practices. We c/should raise the question of whether they’re really needed. But I don’t think we should be in the business of promoting throwing in a set.seed() as a quick fix. > > In any case, I would happily accept the low chance of false results (which would never be zero anyway, even for deterministic algorithms) against the much higher chance of user pain from not getting exactly reproducible results. > My counter argument/question: how would you feel if someone died during a clinical trial because of that choice of coding? Or someone who should get a treatment doesn’t get it? 
> > So, I’m on the conservative end that prioritizes correctness before speed and convenience. I arrived at this after having been bitten by too many exponentially costly bugs and hacks over the years - including my own. I might even support it if R would only allow the random seed to be set at the top of scripts and never inside an R package.

Aaron Lun (13:53:28) (in thread): > > So this can easily be resolved by recording the random seed at the start of the pipeline. > Which then requires people to modify their code with a set.seed() anyway. > > There are methods that rely on proper randomness across multiple runs by design and we should not step away from their requirement via hacks. > Seems like those functions should automatically trigger multiple runs if they care about it so much, rather than hoping that the user will run it multiple times. > > how would you feel if someone died during clinical trials because of that choice of coding? Or someone who should get a treatment doesn’t get it? > I have no feelings towards that whatsoever. But you could say the same for any deterministic algorithm, e.g., choice of statistical test, choice of read aligner, etc. Arbitrary decisions aren’t new here. > > So, I’m on the conservative end that prioritize correctness before speed and convenience. > If your idea of correctness is to do multiple runs, then it is the function’s responsibility to set them off, rather than pushing that onto the user. (And then we still need to set a seed to ensure that the multi-run results are reproducible.) If your idea of correctness is related to a single run, well, I would say my choice of seed is no worse than any other.

Henrik Bengtsson (13:56:50) (in thread): > > Which then requires people to modify their code with a set.seed() anyway. > Nah, the random state is in globalenv()$.Random.seed, so that’s what needs to be recorded, e.g. oseed <- globalenv()$.Random.seed. To reset, one can use globalenv()$.Random.seed <- oseed. This can be wrapped in a neat API. (BTW, I find it a bit odd that R stores the RNG state in the global environment)

Aaron Lun (13:58:12) (in thread): > that looks even more grotesque than set.seed().

Henrik Bengtsson (13:58:32) (in thread): > It’s not about look - it’s about correctness

Aaron Lun (13:58:44) (in thread): > but you’re setting the seed anyway, so what’s the difference?

Henrik Bengtsson (14:02:43) (in thread): > You don’t have to set the seed. If your pipeline returns attr(result, "initial_seed") then the user of your pipeline has the option to rerun with the exact same seed via globalenv()$.Random.seed <- attr(result, "initial_seed"). (It would be neat if set.seed(attr(result, "initial_seed")) supported this, but I don’t think it does.)
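
A minimal sketch of this record-and-replay idea (hypothetical function names, not part of any package; the "pipeline" is a stand-in rnorm() call):

```r
# Record the incoming RNG state as an attribute of the result, so a run can
# be replayed exactly without ever calling set.seed().
run_pipeline <- function() {
  if (!exists(".Random.seed", envir = globalenv())) runif(1)  # force RNG init
  initial_seed <- globalenv()$.Random.seed    # record incoming state
  result <- rnorm(5)                          # stand-in for the real analysis
  attr(result, "initial_seed") <- initial_seed
  result
}

rerun_with_seed <- function(result, f) {
  assign(".Random.seed", attr(result, "initial_seed"), envir = globalenv())
  f()
}

r1 <- run_pipeline()
r2 <- rerun_with_seed(r1, run_pipeline)
stopifnot(identical(as.vector(r1), as.vector(r2)))  # exact replay
```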

Henrik Bengtsson (14:10:22) (in thread): > > If your idea of correctness is to do multiple runs, then it is the function’s responsibility to set them off, rather than pushing that onto the user. > Many methods relying on RNG already do “multiple runs”, e.g. bootstrap etc. How many iterations depends on the wanted precision. It’s that choice of precision that leaves room for the non-zero probability of getting biased results. There is no solution that will remove this uncertainty - we can just choose to make it smaller and we might be able to improve on the methods/algorithms such that they converge quicker. > > I think this comes back to “re-validation” - a fundamental concept of statistics and science - we can’t get around it because there is no absolute truth.

Henrik Bengtsson (14:11:43) (in thread): > > I have no feelings towards that whatsoever. But you could say the same for any deterministic algorithm, e.g., choice of statistical test, choice of read aligner, etc. Arbitrary decisions aren’t new here. > Sure, arbitrary decisions are made all the time, but it doesn’t mean we should accept incorrect ones when we are aware of them.

Henrik Bengtsson (14:19:07) (in thread): > > There is no solution that will remove this uncertainty - we can just choose to make it smaller > I should clarify this further: if the random seed is such that we get a random sequence that results in a greatly biased estimate, it does not matter how many iterations we run. It will still be biased. > > We can only protect ourselves against such biases due to the random seed by rerunning with other random seeds. Then a rhetorical question is: How do you get hold of those “other random seeds”?

Aaron Lun (21:51:47) (in thread): > If you’re defining correctness by long-term average behavior, most of our inferences are only correct on average anyway (i.e., with new draws from the population), even for deterministic algorithms. So what’s the big deal with converting a random algorithm into a deterministic one using a data-derived seed? Long-term correctness is still preserved; when someone replicates the study with a new dataset, they get a different seed and do their analysis in some other part of the random number stream. It’s not like we’re always using the same position in the stream for every dataset. > > Practically, your proposed approach of returning the seed in an attribute would inevitably devolve to a set.seed() call or its equivalent. I don’t pass results to people or systems; I pass code and data to them, and they regenerate the results accordingly. Under your proposal, I would either have to embed the 626-int seed vector in my Rmarkdown/R script, or I will have to pass along an additional RDS object containing the seed and modify the code to assign it to .Random.seed. This is unnecessarily convoluted and I can predict that there would be no uptake among my users. > > The reality is that, when there is an unexpected change in the results, the response of most users will be to slap a set.seed() at the top and move on. (Or indeed, maybe they already did that defensively.) I simply plan to make life easier by just doing it for them, and at least I will choose some more diverse seeds than the usual set.seed(0) that everyone would otherwise do. This gives reproducibility by default; and if you are so inclined, you can just do BiocSeed::disableBiocSeed() and re-run everything at your leisure to evaluate the stability of the results.

Henrik Bengtsson (22:05:51) (in thread): > Personally, I’m concerned about this approach and I’d even say I’d stay away from any methods implementing it until I’m convinced it’s statistically sound to do so. That’s just my personal opinion based on my limited knowledge. However, I would like to suggest that you bring this up to a much larger audience for a solid scrutiny before going ahead with this approach.

2020-11-19

Kevin Blighe (08:27:55): > @Kevin Blighe has joined the channel

Henrik Bengtsson (18:30:10) (in thread): > Thanks for a great presentation. It made me wanna update {matrixStats} to make use of REAL_NO_NA(x). For backward-compatibility reasons, what’s the best way to check whether ALTREP is supported or not? My current idea is to check whether or not ALTREP_METHODS is defined.

Gabriel Becker (19:01:38) (in thread): > if REAL_NO_NA(x) exists as a symbol at all it will work on both ALTREP and non-ALTREP REALSXP objects

Gabriel Becker (19:01:51) (in thread): > (if not, your code won’t compile :wink:)

Henrik Bengtsson (19:05:58) (in thread): > yes, that part I got. But, I want my code to compile all the way back to R 2.12.0. So, if ALTREP is not available, I’ll just assume there are NAs (as the code does right now). > > I think I just spotted an ALTREP macro in the source code, so would: > > #ifdef ALTREP > ... > #else > ... > #endif > > be a good pattern?

Gabriel Becker (19:07:00) (in thread): > ALTREP shouldn’t (I think?) be a macro unless you’re defining USE_RINTERNALS. the best way would be to dig around for where in the headers (hopefully) the R version is defined, and test it against 3.5.0 (I incorrectly said 3.4.0 in the talk for some reason but that was wrong)

Gabriel Becker (19:07:27) (in thread): > but just test r version against 3.5.0 if you can (I don’t know where in the header that lives OTTOMH)

Henrik Bengtsson (19:09:07) (in thread): > Got it. I’ll dig around a bit more. It would be nice if we could come up with a de facto standard for this. I guess I could always check for REAL_NO_NA. Thxs

Gabriel Becker (19:11:19) (in thread): > looks like R version lives in Rversion.h and config.h after config

Gabriel Becker (19:12:00) (in thread): > I dunno how easy that will be to work with in a preprocessor though

Gabriel Becker (19:12:14) (in thread): > REAL_NO_NA is also a function, most things are going to be functions

Henrik Bengtsson (19:13:37) (in thread): > Ah… I thought REAL_NO_NA was a macro.

Gabriel Becker (19:14:11) (in thread): > I mean, all of these things are macros in the r internals but they’re (mostly inlined) functions when USE_RINTERNALS isn’t defined, I think

Gabriel Becker (19:17:33) (in thread): > currently the sortedness macros are just that (KNOWN_SORTED and the like) but I wouldn’t want to rely on them being macros forever instead of inlined functions

Gabriel Becker (19:21:20) (in thread): > so right now #ifdef KNOWN_SORTED would work but that definitely may break someday

Henrik Bengtsson (19:22:40) (in thread): > Mkay… I just now confirmed that the following works for me on R 3.3.3 and R 4.0.3: > > #ifdef REAL_NO_NA > static R_INLINE int has_NA(SEXP x) { > int mode = TYPEOF(x); > switch (mode) { > case INTSXP: return !INTEGER_NO_NA(x); > case LGLSXP: return !LOGICAL_NO_NA(x); > case REALSXP: return !REAL_NO_NA(x); > default : return 1; > } > } > #else > static R_INLINE int has_NA(SEXP x) { > return 1; > } > #endif > > So, you’re saying I shouldn’t use that?

Henrik Bengtsson (19:27:41) (in thread): > Ahh… my bad. Scratch that. #ifdef REAL_NO_NA is always FALSE.

Henrik Bengtsson (19:51:13) (in thread): > Ok, I’ll be comparing to the R version. WRE explains how to do this in a safe, backward-compatible way. So, I can define my own HAS_ALTREP as: > > #include <Rversion.h> > #if defined(R_VERSION) && R_VERSION >= R_Version(3, 5, 0) > #define HAS_ALTREP > #endif > > and then use it as, for instance: > > #ifdef HAS_ALTREP > static R_INLINE int has_NA(SEXP x) { > int mode = TYPEOF(x); > switch (mode) { > case INTSXP: return !INTEGER_NO_NA(x); > case LGLSXP: return !LOGICAL_NO_NA(x); > case REALSXP: return !REAL_NO_NA(x); > default : return 1; > } > } > #else > static R_INLINE int has_NA(SEXP x) { > return 1; > } > #endif >

Henrik Bengtsson (19:51:37) (in thread): > I’ve just confirmed (for real) that the above works.

Gabriel Becker (20:04:51) (in thread): > :+1:

Henrik Bengtsson (21:20:40) (in thread): > Thxs for your help

2020-11-20

Kayla Interdonato (11:37:01): > @Gabriel BeckerWould you be able to provide a link to your slides? I’ll be posting the recording to yesterday’s meeting on our website under course materials and it would be great to have the slides there as well.

Gabriel Becker (13:50:13): > here they are@Kayla Interdonato - File (PDF): talk2.pdf

Kayla Interdonato (13:50:36) (in thread): > Thank you!

Gabriel Becker (14:09:58) (in thread): > please let me know when it’s up

Kayla Interdonato (14:52:19): > The recording and material from yesterday’s devel forum is now available on our Youtube (https://youtu.be/8i7ziLqsE2s) as well as the course materials on the website (https://www.bioconductor.org/help/course-materials/). - Attachment (YouTube): Developers Forum 16

2020-11-30

Will Macnair (05:52:32): > @Will Macnair has joined the channel

2020-12-03

Mike Smith (12:27:41): > The next Developers Forum will take place next Thursday 10th Dec at 09:00 PST / 12:00 EST / 18:00 CET - You can find a calendar invite at https://tinyurl.com/BiocDevel-2020-12-10. Please note this is one week earlier than originally scheduled to avoid clashing with EuroBioc2020. The event calendar has been updated. > > We will be using BlueJeans and the meeting can be joined via: https://bluejeans.com/114067881 (Meeting ID: 114 067 881) > > This call will be a direct follow-on from our introduction to ALTREP last month. @JiefeiWang will show us some of the difficulties with working with ALTREP in practice, and introduce the Travel package (https://github.com/Jiefei-Wang/Travel) he’s been developing to ease some of these limitations when writing C++ code. > > Please let me know if you have any other topics you’d like to discuss this month.

2020-12-10

Mike Smith (08:35:46): > Just a quick reminder that @JiefeiWang will be presenting today at 09:00 PST / 12:00 EST / 18:00 CET

Michael Lawrence (11:55:16): > It would help to send a calendar invite. If it doesn’t show up on my calendar, I won’t be there.

Mike Smith (11:59:32) (in thread): > It’s in the BioC events calendar at https://bioconductor.org/help/events/ and there’s a link where a calendar invite can be downloaded in my announcement message/email. Maybe those aren’t sufficient, so I can have a think about other mechanisms.

Simina Boca (14:47:19): > Has there been a weird Git/GitHub update?

Simina Boca (14:47:41): > I want to update a package and I got a popup to input my GitHub username and password, which I had not gotten before

Simina Boca (14:48:05): > Also, the BioC package link is no longer upstream

Simina Boca (14:48:10): > I.e. it used to be:

Simina Boca (14:48:11): > $ git remote -v > origin https://github.com/leekgroup/swfdr.git (fetch) > origin https://github.com/leekgroup/swfdr.git (push) > upstream git@git.bioconductor.org:packages/swfdr (fetch) > upstream git@git.bioconductor.org:packages/swfdr (push)

Simina Boca (14:48:18): > Now it is:

Simina Boca (14:48:29): > $ git remote -v > origin https://github.com/SiminaB/swfdr.git (fetch) > origin https://github.com/SiminaB/swfdr.git (push) > upstream https://github.com/leekgroup/swfdr.git (fetch) > upstream https://github.com/leekgroup/swfdr.git (push)

FelixErnst (14:49:40): > The remote names are just a convention, not a requirement. You could also use the remote name whatever instead of upstream and git couldn’t care less

Nitesh Turaga (14:50:18): > It needs to be an SSH protocol.

Nitesh Turaga (14:50:31): > You are getting the password and username issue because of that.

Nitesh Turaga (14:50:58): > You can do, > > git remote set-url upstream git@git.bioconductor.org:packages/swfdr > > that will fix it.

Dirk Eddelbuettel (14:56:09) (in thread): > Oh, nice. I just did the equivalent yesterday by editing .git/config directly :)

Simina Boca (14:56:37): > Thank you so much (as usual!)@Nitesh Turaga!

Simina Boca (14:57:16): > Is there a reason I didn’t come across this before?

Simina Boca (14:57:35): > (I keep my own notes for how to do this lol since I only do these updates a couple of times a year)

Nitesh Turaga (14:59:02): > I’m not too sure, it could have happened if you switched computers or did a fresh clone or something of that nature.

Simina Boca (14:59:04): > Oh, I also got this email from GitHub: > > Hi @SiminaB, > > > > We have detected that you recently attempted to authenticate to GitHub using an older version of Git for Windows. GitHub has changed how users authenticate when using Git for Windows, and now requires the use of a web browser to authenticate to GitHub. To be able to login via web browser, users need to update to the latest version of Git for Windows. You can download the latest version at: > > > > https://gitforwindows.org/ > > If you cannot update Git for Windows to the latest version please see the following link for more information and suggested workarounds: > > > > https://aka.ms/gcmcore-githubauthchanges > > If you have any questions about these changes or require any assistance, please reach out to GitHub Support and we’ll be happy to assist further. > > > > *https://support.github.com/contact > > Thanks, > > The GitHub Team

Nitesh Turaga (15:00:01): > Have you tried the Ubuntu app in windows for this sort of terminal based work ? It usually works really well for me.

Nitesh Turaga (15:00:15): > my git version is git version 2.24.3 (Apple Git-128)

Simina Boca (15:00:48): > Good idea!

Simina Boca (15:00:54): > I’ve used that app but for other things

Simina Boca (15:01:33): > the only painful part is that the home directory on that app is in a weird place, but I guess you can alias

Jared Andrews (15:01:48): > WSL is definitely the way to go.

Kayla Interdonato (15:43:53): > The recording of today’s developer forum is now available on youtube (https://www.youtube.com/watch?v=biygNnJA1oY) as well as the course materials on the Bioconductor website,https://www.bioconductor.org/help/course-materials/. - Attachment (YouTube): Developers Forum 17

2020-12-11

Michael Lawrence (14:58:20) (in thread): > Thanks, I will watch for your announcement email then.

Michael Lawrence (15:02:31): > I watched the recording. Without looking at the code, I think the virtual memory trick being used by the travel package may be the same that was presented by Hannes Mühleisen at a DSC a few years back: https://www.r-project.org/dsc/2017/slides/DSC2017-mprotect.pdf

Martin Morgan (15:03:29): > @JiefeiWang^^

Michael Lawrence (15:04:53): > If memory serves, he came up with that on the plane flight to the conference.

JiefeiWang (19:42:58): > Thanks for sharing! I am reading his slides. It is amazing that someone had considered the idea of the virtual pointer 3 years earlier. The signal handler solution he mentioned in the slides is another possibility for the virtual pointer. I also considered it when I was designing the Travel package, but there are some limitations with the signal handler. Besides the performance issue he mentioned, using a generic signal handler for a specific purpose also seems problematic, so I did not use this strategy. The current implementation for Travel is file mapping + a virtual file: the OS determines how to manage the memory pages and the filesystem (fuse on Linux and dokan on Windows) provides the virtual file. As they do it in parallel, this is much faster than doing everything in a single thread. Also, mapping the same file in another R process would not double your swap space usage, for the OS knows it is the same file. This makes sharing a virtual pointer in R-level parallel computing much easier.

JiefeiWang (19:46:00): > I haven’t explored the signal handler as deeply as Hannes did, so it is very helpful to see his conclusion about it.

2020-12-12

Huipeng Li (00:40:20): > @Huipeng Li has joined the channel

2020-12-13

Paul Harrison (19:18:22): > @Paul Harrison has joined the channel

2020-12-15

Francesc Català (05:41:54): > @Francesc Català has joined the channel

Jialin Ma (14:17:21) (in thread): > This is an amazing and clever package! Thanks for making it. > By the way, I was shocked when reading your example in the README > > > x <- make_sequence_altrep(n = 1024*1024*1024*64, start = 1, step = 2) > > x[1] <- 100 > > x[1:10] > [1] 100 3 5 7 9 11 13 15 17 19 > > and thought you somehow avoided allocation of the new array. But I had overlooked its size, which can be handled in memory.

Jialin Ma (14:20:50) (in thread): > One question out of curiosity: > If someone uses DATAPTR trying to write data to the “virtual file”, what would happen?

2020-12-17

James MacDonald (11:12:25): > @James MacDonald has joined the channel

2020-12-22

JiefeiWang (08:01:36) (in thread): > Hi, sorry for the late reply, somehow this thread was not highlighted in my Slack. As for your question: you are free to overwrite the data in the vector as there is a write cache for each ALTREP, so if one uses DATAPTR to write data to the ALTREP, the data will be sent to the write cache and the pointer will work exactly as you expect. Furthermore, Travel provides a function to allow developers to determine how to process the write request. For example, it is possible to make a compressed vector which allows you to write data into the vector via its pointer, and the data will be compressed in real time. I hope this answers your question.

2020-12-23

Simina Boca (10:41:32): > In trying to update a package I got an error: > > remote: Error: Illegal version bump from ‘1.17.0’ to ‘1.18.0’.

Simina Boca (10:41:47): > It says to: > > remote: Check http://bioconductor.org/developers/how-to/version-numbering/ > remote: for details.

Simina Boca (10:42:24): > But I don’t see that error listed. Anyone have experience with this?

Simina Boca (10:42:53): > Maybe I should have updated 1.17.0 to 1.17.1?

Charlotte Soneson (10:45:37): > Yes - the middle number is always odd in devel, and 1.17.z will automatically become 1.18.0 in the next release. The last sentence on the page above outlines the typical behaviour in package development: > > z- should be incremented sequentially during regular package development - no limitation on the size of z - bumped at release time to 0 for all packages.

Simina Boca (10:48:10): > Got it! Thank you so much!
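The devel bump rule Charlotte quotes above can be sketched as a tiny helper. This is purely illustrative (the function name bump_devel is made up, and this is not an actual Bioconductor tool): during development only z of x.y.z is incremented, with y staying odd in devel.

```python
def bump_devel(version: str) -> str:
    """Increment z of an x.y.z devel version; y must be odd in devel."""
    x, y, z = (int(part) for part in version.split("."))
    assert y % 2 == 1, "devel versions have an odd middle number"
    return f"{x}.{y}.{z + 1}"

print(bump_devel("1.17.0"))  # -> 1.17.1
```

At release time the core team bumps y to the next even number and resets z to 0 for all packages, which is why pushing 1.18.0 yourself is rejected as an illegal version bump.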

2021-01-05

FelixErnst (06:45:17): > @Mike Smith can you post/send the link to the code browsing tool you showed in EuroBioc2020? I can’t find a link on your website, but I think it was on a subdomain, wasn’t it?

Vince Carey (09:18:22): > @FelixErnst do you mean http://bioc-code-tools.msmith.de/ - Attachment (bioc-code-tools.msmith.de): Landing Page – Layout Examples – Pure > A layout example that shows off a responsive product landing page.

FelixErnst (09:19:03) (in thread): > Thanks!

Hervé Pagès (13:30:17): > Maybe this should be linked from the Bioconductor website, if not already?

2021-01-06

FelixErnst (10:12:21): > If I remember correctly, Mike mentioned that it is on the agenda. But when/how/who, I don’t know.

FelixErnst (10:15:15): > But I agree that this is/would be such a nice addition that I cannot comprehend how I misplaced that link

2021-01-07

Nitesh Turaga (16:53:40): > When is the january developer forum? Anyone have a date?

Lori Shepherd (17:31:15): > It’s on the Google calendar. Currently scheduled for the 21st

2021-01-08

Nitesh Turaga (01:17:52): > Thanks Lori.

2021-01-14

Federico Marini (08:53:19): > Picking up on the topic of package dependencies again: https://github.com/jokergoo/pkgndep

2021-01-15

Aaron Lun (19:58:54): > @Martin Morgan what do you think of my proposal here https://github.com/LTLA/SingleR/issues/170#issuecomment-761278882, or does something already exist in an appropriate base package?

2021-01-16

Martin Morgan (01:48:47) (in thread): > commented on the issue; maybe @Hervé Pagès has input?

2021-01-19

Pablo Rodriguez (04:56:41): > @Pablo Rodriguez has joined the channel

Mike Smith (08:25:34): > The next Bioconductor Developers’ Forum is this Thursday 21st January at 09:00 PST / 12:00 EST / 18:00 CET - You can find a calendar invite attached to this post and at https://tinyurl.com/BiocDevel-2021-01. This month we will hear from Jass Bagga and Erdal Cosgun from Microsoft Genomics. They will introduce the work they’ve been doing in collaboration with the Bioconductor core team (thanks @Nitesh Turaga) to provide access to Bioconductor and other genomics tools in the Azure cloud. If you’re interested in discussing scalable platforms for running analysis workflows, interactive notebook environments, or accessing large-scale public datasets, this should be a great event. > > We will be using BlueJeans and the meeting can be joined via: https://bluejeans.com/114067881 (Meeting ID: 114 067 881) - File (Binary): bioc-devel-forum-21-01.ics

Mike Smith (08:27:46) (in thread): > Please let me know if you try the ICS file and it doesn’t work in your calendar/timezone/OS etc. Digital calendars seem to be a continuing enigma to me!

2021-01-21

Kayla Interdonato (21:33:33): > The recording from today’s developers’ forum is now available on our YouTube - https://www.youtube.com/watch?v=ALUTe8eyLOg&feature=youtu.be as well as under the course materials on the Bioconductor website. - Attachment (YouTube): Developers Forum 18

2021-01-22

Annajiat Alim Rasel (15:43:39): > @Annajiat Alim Rasel has joined the channel

Martin Morgan (16:44:33): > Wow, the video in this blog post provides a really great intro to using RStudio / Bioconductor in AnVIL: https://terra.bio/try-rstudio-in-terra/; it complements the developer forum talk earlier this week, showing some features of cloud-based computing like fast access to cloud-based data, easily sharing analyses, and scalable compute power. - Attachment (Terra.Bio): Try out RStudio in Terra - Terra.Bio > A sneak preview at RStudio in Terra, with a short video demonstrating how to launch RStudio, import data and use Bioconductor for scRNAseq

Aaron Lun (16:46:56): > I wonder if it is worth revisiting the thought of building the books on some of these systems, to take the compute burden off the BBS.

Martin Morgan (16:50:35) (in thread): > With your new modular structure and a WDL that wouldn’t be too complicated to create, you could launch appropriate instances to build each module, in parallel, and have the book built in (relatively) no time… Or skipping the workflow, get a really big instance and build in parallel.

2021-01-23

Aaron Lun (01:42:12) (in thread): > would be interesting to do a few runs, see the cost-benefit analysis. Might be the only way if the book keeps on expanding as I hope it will

Vince Carey (06:32:11) (in thread): > The initial design of book was monolithic owing to cross-referencing of results between chapters … how is this handled now?

Aaron Lun (20:56:56) (in thread): > With a lot of work

Aaron Lun (20:57:27) (in thread): > each subbook now scrapes its own references and puts it in itsinstfor retrieval by other books.

2021-01-28

Robert Castelo (13:07:15): > hi, i’m working on a new package, decided to start using roxygen2 and found the following glitch: when calling devtools::document() the generated NAMESPACE file always includes the following line: > > exportClasses("") > > which i always have to remove, otherwise i get the error empty name in directive 'exportClasses' in 'NAMESPACE' file at install. i do define a couple of S4 classes in the package but i can’t see anything in the associated #' lines which could be causing this, and after googling for quite some time i haven’t been able to find the problem. any hint from roxygen2 experts here?

FelixErnst (13:10:43): > Do you have a roxygen block starting with an empty line? I had the same issue and it turned out that there was an empty line in a roxygen block above NULL with a @name tag as the first element in the block. I figured it out the hard way, by removing code and adding each block back in until I narrowed it down.

Robert Castelo (13:13:32): > Do you mean in the R/AllClasses.R file where i define the classes and use the @exportClass directive, or anywhere in the rest of the files?

FelixErnst (13:14:40): > Probably the first

Robert Castelo (13:18:31): > mmmm..i don’t see that pattern, the three classes i’m defining look like: > > #' blabla class > #' This is a class xxx. > #' > #' @slot first slot. > #' > #' @slot second slot. > #' > #' @importClassesFrom somewhere thatclass > #' > #' @name blabla-class > #' @rdname blabla-class > #' @exportClass > setClass("blabla", representation(first="character")) >

Aaron Lun (13:18:51): > Why not just use @export?

FelixErnst (13:19:17): > ~Why not just use @export?~ What he said :)

Robert Castelo (13:20:01): > aha, well, i saw @exportClass somewhere and thought that would be the right directive to export a class, but let me try using just @export.

FelixErnst (13:21:56): > I think the exportClass tag requires a value: https://r-pkgs.org/namespace.html#export-s4

FelixErnst (13:22:57): > #' @exportClass blabla

Robert Castelo (13:23:14): > bingo!!!! using @export instead of @exportClass works!!!

Robert Castelo (13:23:34): > but then, what’s the difference between using @export and @exportClass?

FelixErnst (13:24:45): > The requirement for a value, I guess.

Robert Castelo (13:29:18): > ok, i’ve tried @exportClass blabla and it also works without adding the buggy exportClasses(""). thank you so much!!!:+1::clap:

2021-01-29

Milan Malfait (06:26:02): > @Milan Malfait has joined the channel

Aaron Lun (21:29:57): > @Hervé Pagès could S4Vectors host generics for a “flexible rbind” and “flexible cbind”? For example, one could imagine a function that is more tolerant to mismatches in the number of columns when rbind-ing DFs by filling in missing columns in each DF with NAs.

Aaron Lun (21:30:17): > SingleCellExperiment would have some interest in extending these generics to SCEs, if they were available.
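The "flexible rbind" idea above can be sketched in plain Python, with rows as dicts and None standing in for NA. Everything here is illustrative: the function name flexible_rbind is made up, and this is not the eventual Bioconductor implementation (which later took shape as the combineRows()/combineCols() generics discussed below in the thread).

```python
# Hedged sketch: concatenate row-wise, None-filling columns that are
# missing from either input, in first-seen column order.
def flexible_rbind(*tables):
    """Concatenate lists of row-dicts, filling absent columns with None."""
    all_cols = []
    for table in tables:
        for row in table:
            for col in row:
                if col not in all_cols:
                    all_cols.append(col)   # preserve first-seen column order
    return [{col: row.get(col) for col in all_cols}
            for table in tables for row in table]

m1 = [{"mpg": 21.0, "cyl": 6}]          # has mpg, lacks drat
m2 = [{"cyl": 8, "drat": 3.9}]          # has drat, lacks mpg
combined = flexible_rbind(m1, m2)
# combined[0]["drat"] is None; combined[1]["mpg"] is None
```

The same shape of behavior is what dplyr::bind_rows() provides for data frames, and what a tolerant rbind for DFs/SEs would need to generalize.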

2021-01-30

Vince Carey (06:22:10) (in thread): > like this? > > > m1 = mtcars[,1:4] > > m2 = mtcars[,2:5] > > bind_rows(m1,m2)[31:34,] > mpg cyl disp hp drat > Maserati Bora...31 15.0 8 301 335 NA > Volvo 142E...32 21.4 4 121 109 NA > Mazda RX4...33 NA 6 160 110 3.9 > Mazda RX4 Wag...34 NA 6 160 110 3.9 >

Vince Carey (06:22:37) (in thread): > bind_rows is in dplyr

Aaron Lun (14:16:41) (in thread): > something like that. But extensible to SE’s and other things in the BioC class hierarchy.

Ruizhu HUANG (18:06:29): > @Ruizhu HUANG has joined the channel

2021-02-02

Hervé Pagès (04:17:31) (in thread): > Sure. Do you have these generics/methods somewhere already? The generics could go to BiocGenerics. I guess there would be methods for DataFrame objects? Note that we have something similar to dplyr::bind_rows() in S4Vectors for binding the metadata columns of Vector derivatives: > > gr1 <- GRanges("A", IRanges(1:10, 15)) > mcols(gr1) <- mtcars[1:10, 1:3] > gr2 <- GRanges("A", IRanges(1:8, 12)) > mcols(gr2) <- mtcars[11:18, 2:4] > mcols(c(gr1, gr2)) > # DataFrame with 18 rows and 4 columns > # mpg cyl disp hp > # <numeric> <numeric> <numeric> <numeric> > # 1 21.0 6 160 NA > # 2 21.0 6 160 NA > # 3 22.8 4 108 NA > # 4 21.4 6 258 NA > # 5 18.7 8 360 NA > # ... ... ... ... ... > # 14 NA 8 275.8 180 > # 15 NA 8 472.0 205 > # 16 NA 8 460.0 215 > # 17 NA 8 440.0 230 > # 18 NA 4 78.7 66 >

Aaron Lun (04:24:20) (in thread): > no, i was just thinking of writing one for SCE but then I realized how deep the rabbit hole went. I guess I could start with making a PR for DFs unless it’s easier for you to just factor out the mcols() handling into a separate function.

Aaron Lun (04:24:41) (in thread): > … and we should both go to sleep.

Hervé Pagès (12:22:56) (in thread): > Any suggestion for the name? It’s by far the hardest part.

Aaron Lun (12:23:34) (in thread): > Hmm. I too was wondering this.

Aaron Lun (12:23:43) (in thread): > we can’t really use merge.

Hervé Pagès (12:24:05) (in thread): > of course not

Hervé Pagès (12:25:23) (in thread): > it’s about binding rows right? merge can go in any direction (vertical or horizontal)

Aaron Lun (12:26:03) (in thread): > The DF use-case would be binding rows. I would also like a separate (but related) generic to bind columns, which is what I will be extending for the SCE’s.

Hervé Pagès (12:27:54) (in thread): > Are you thinking of 2 separate verbs, one for each direction, or one single verb for combining the SCE’s?

Aaron Lun (12:28:23) (in thread): > two separate verbs

Aaron Lun (12:28:53) (in thread): > so we’d want somethingRow and somethingCol, etc. with the same “something”, whatever that is.

Aaron Lun (12:29:14) (in thread): > guess we could just use “combine”.

Hervé Pagès (12:29:27) (in thread): > Maybe this is a case where knowing more about what the merging/combining will do exactly would help come up with good names.

Hervé Pagès (12:30:12) (in thread): > You said 2 separate verbs?

Hervé Pagès (12:31:33) (in thread): > oh, I missed what you said ~after~ right before that, I was typing so didn’t pay attention

Hervé Pagès (12:33:40) (in thread): > would nrow(combineRows(sce1, sce2)) be equal to nrow(sce1) + nrow(sce2)?

Aaron Lun (12:34:31) (in thread): > yes

Aaron Lun (12:34:43) (in thread): > and similarly for combineColumns

Hervé Pagès (12:40:04) (in thread): > So basically combineRows would do what rbind does right now but would accept SE objects that don’t necessarily have the same columns?

Hervé Pagès (12:52:41) (in thread): > At the lower level, combineRows would call various rbind/cbind/merge/combine/bind_rows things on the vertical and horizontal slots to “combine” them in a sensible manner. Might also call combineRows and/or combineCols methods defined for some of the slots to combine (e.g. assays). I guess some aspects of all this internal merging/combining will be controlled by a bunch of extra arguments. > Anyways, sounds like BiocGenerics would be a good place for the combineRows/combineCols generics, next to the combine generic.

Aaron Lun (12:52:54) (in thread): > yes, that’s right; missing columns would either be removed, filled in with NAs or zeroes, depending on the user’s choice. The NA-fill would be done in a reasonably memory-efficient manner via the ConstantMatrix, currently sitting around in SCE.

Hervé Pagès (12:55:18) (in thread): > About their signature: combineRows/Cols(x, y, ...) like for combine()? Making them binary makes it easier to support extra arguments.

Aaron Lun (12:55:44) (in thread): > yeah sure

Hervé Pagès (12:56:47) (in thread): > great, I’m going to add them to BiocGenerics

Aaron Lun (13:00:27) (in thread): > thanks. If you can move the mcols code around for the DF combineRows method, I will start looking at the SE methods.

Hervé Pagès (13:25:04) (in thread): > New combineRows/combineCols generics are in BiocGenerics 0.37.1: https://github.com/Bioconductor/BiocGenerics/commit/3299912300a823d08df27b72eb87a12375179327 Will look at adding a combineRows() method for DataFrame objects to S4Vectors.

Sanchit Saini (17:16:49): > @Sanchit Saini has joined the channel

2021-02-03

Michael Lawrence (13:17:59) (in thread): > This is a great development.

2021-02-04

Chris Vanderaa (17:38:56): > @Chris Vanderaa has joined the channel

Nitesh Turaga (23:41:56): > How can I confirm make -j is working in parallel on a Mac?

Dirk Eddelbuettel (23:49:27) (in thread): > Does it have something like htop to see processes launched? Else wrap /usr/bin/time in front of make; with parallel use, wall time (3rd col) and total time (1st col) should differ. (Also, I think make uses a ‘process-parallel’ approach; if you have more than 1 core it should just work…)

2021-02-05

Hervé Pagès (01:10:55) (in thread): > Maybe you’re asking for a specific project but if you’re wondering if make -j in general actually works in parallel on a mac, you can try this fun little experiment: > 1. Put the following in a toy Makefile (make sure to indent with a tab, not with spaces): > > > all: job1 job2 job3 job4 > @echo "done with all jobs" > > job1: > @echo "starting job1" > @sleep 5 > @echo "done with job1" > > job2: > @echo "starting job2" > @sleep 5 > @echo "done with job2" > > job3: > @echo "starting job3" > @sleep 5 > @echo "done with job3" > > job4: > @echo "starting job4" > @sleep 5 > @echo "done with job4" > > 2. Then run: > > time make all > > time make all -j4 > > Seeing is believing.

Hervé Pagès (01:27:35) (in thread): > How much a specific project will be able to take advantage of this depends on the dependency graph. The worst-case scenario is when the tree of deps is linear. For example: > > all: job1 job2 job3 job4 > @echo "done with all jobs" > > job1: job2 > @echo "starting job1" > @sleep 5 > @echo "done with job1" > > job2: job3 > @echo "starting job2" > @sleep 5 > @echo "done with job2" > > job3: job4 > @echo "starting job3" > @sleep 5 > @echo "done with job3" > > job4: > @echo "starting job4" > @sleep 5 > @echo "done with job4" > > Then make -j4 won't be able to run anything in parallel. You actually see this when compiling R with make -j20: even though the C code gets compiled in parallel (and this part is amazingly fast if you can use dozens of cores), the last steps when the core packages get installed and byte-compiled are always performed linearly.
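The contrast between the two Makefiles can also be sketched in plain Python (an illustration of the scheduling idea, not of make itself; job names and sleep times are arbitrary stand-ins):

```python
# Independent "jobs" can run concurrently, but a linear dependency chain
# forces serial execution -- the same contrast as the two Makefiles above.
import time
from concurrent.futures import ThreadPoolExecutor

def job(name, seconds=0.2):
    time.sleep(seconds)  # stand-in for real work, like `sleep 5` above
    return name

# Independent jobs: analogous to `make -j4` on the first Makefile.
t0 = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    list(pool.map(job, ["job1", "job2", "job3", "job4"]))
parallel = time.perf_counter() - t0

# Linear chain: each job "depends" on the next, so they run one at a time.
t0 = time.perf_counter()
for name in ["job4", "job3", "job2", "job1"]:
    job(name)
serial = time.perf_counter() - t0

print(parallel < serial)  # the parallel run finishes well before the serial one
```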

Nitesh Turaga (13:18:05) (in thread): > Thanks Herve and Dirk. That helps.

Dirk Eddelbuettel (13:55:49) (in thread): > You are welcome! When I was younger and smarter I also remembered to do thesleep Ntrick Herve used here—it is very appropriate.

2021-02-09

Aaron Lun (16:03:23): > Well. This is a problem. In Julia: > > struct aaron > thing::Vector{Float64} > end > > x = [1.0, 2.0, 3.0]; > me = aaron(x); > x[1] = 4; > > me.thing > # 3-element Array{Float64,1}: > # 4.0 > # 2.0 > # 3.0 > > How does the average user survive with this pass-by-reference behavior? The relationships between objects aren’t clear from the object definitions. At least with C/C++ I know when this is happening because I’ve got pointers floating around.

Aaron Lun (16:06:10): > An R/BioC-like experience would require that the constructors perform a deep copy. Not sure I like that.

Hervé Pagès (16:43:20): > > How does the average user survive with this pass-by-reference behavior? > The same way they survive it with numpy, I guess: > > >>> import numpy as np > >>> a = np.array([1, 2, 3, 4, 5]) > >>> b = a > >>> a[0] = 99 > >>> b > array([99, 2, 3, 4, 5]) >
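Plain Python (without numpy) shows the same reference semantics when a mutable argument is stored on an object; a minimal sketch mirroring the Julia example, with a hypothetical `Aaron` class:

```python
# Storing a mutable argument in a constructor keeps a reference, not a copy,
# so mutating the original list is visible through the object.
class Aaron:  # hypothetical analogue of the Julia `aaron` struct
    def __init__(self, thing):
        self.thing = thing  # aliases the caller's list; no copy is made

x = [1.0, 2.0, 3.0]
me = Aaron(x)
x[0] = 4.0          # mutate the original list...
print(me.thing)     # [4.0, 2.0, 3.0] -- ...and the change shows through `me`
```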

Aaron Lun (16:44:17): > yes, I thought numpy would have this too. It seems dangerous.

Hervé Pagès (16:45:07): > Of course, if you come from R, you are strongly biased about this.

Aaron Lun (16:46:29): > well, I came from C++ before R, so I understand the why. But there must be so many bugs in analysis code where someone just modifies, e.g., a list and it propagates unintentionally to a class member that was bound to that same list.

Hervé Pagès (16:48:21): > Can’t you write a constructor in Julia that enforces deep copy of the supplied member values?

Aaron Lun (16:49:07): > yes, that would be the workaround.

Aaron Lun (16:51:10): > though even constructors are looking like they'll be a pain to write; each struct definition instantiates its own default constructor, and I don't know how to override that. You can have different constructors, but it seems they require different numbers of arguments to distinguish them.

Hervé Pagès (16:55:14): > If the constructor cannot be overridden, seems like a project that wanted to avoid the pass-by-ref could come up with a convention to provide constructor functions with a recognizable prefix, e.g. new_aaron().

Aaron Lun (16:56:18): > yes, that’s also what I was thinking. Ormake_aaron().

Hervé Pagès (16:56:41): > or go_aaron()!

Aaron Lun (16:56:46): > indeed

Aaron Lun (16:57:14): > Though if such a hypothetical project did deepcopies as a default, we'd probably be told how inefficient we were.

Hervé Pagès (16:57:54): > yes they’d say: you guys should use R!:grin:

Hervé Pagès (16:59:59): > More seriously, it would be a fake deepcopy, i.e. it would use a COW (copy-on-write) mechanism like R does. Don't know if Julia supports this though.

Aaron Lun (17:05:18): > probably need a kind of Locked layer wrapping the underlying objects

Spencer Nystrom (19:47:17): > Wait, can you not do: > > struct aaron > thing::Vector{Float64} > aaron(x) = new(deepcopy(x)) > end >
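The deepcopy-in-the-constructor idea translates directly to Python; a minimal sketch using the stdlib `copy` module and the same hypothetical class name:

```python
import copy

# Eager deep copy in the constructor: the copy happens immediately (no COW),
# so later mutations of the caller's object cannot leak into the instance.
class Aaron:  # hypothetical analogue of the Julia struct above
    def __init__(self, thing):
        self.thing = copy.deepcopy(thing)

x = [1.0, 2.0, 3.0]
me = Aaron(x)
x[0] = 4.0
print(me.thing)  # [1.0, 2.0, 3.0] -- insulated from the mutation
```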

Spencer Nystrom (19:51:57) (in thread): > might have to futz with swapping x with thing? It's been a while since I did this.

Hervé Pagès (20:32:38) (in thread): > Nice! Hopefully this doesn’t trigger the deep copy until required (COW).

Aaron Lun (20:41:19) (in thread): > I’m pretty sure it fires right away, doesn’t seem like Julia has any COW mechanism.

Spencer Nystrom (20:47:24) (in thread): > I am unsure when it triggers. But I am pretty certain there is a way to do some form of COW. Like I mentioned before, my time with Julia was pre 1.0 so maybe some stuff has changed.

Aaron Lun (20:51:04) (in thread): > yeah, just tested with top right now; the memory spike suggests that deepcopy fires right away.

Aaron Lun (21:14:17) (in thread): > this behavior has some implications for how to design classes. For example, we can’t easily create complex abstractions when the abstraction layer is so easily penetrated by users accidentally modifying the internals.

Hervé Pagès (22:19:37) (in thread): > :cow:would have been nice

2021-02-10

Johannes Rainer (01:58:20): > Hi all, I stumbled over a performance issue with DataFrame: adding a column with $ is very slow (compared to a plain data.frame): > > library(S4Vectors) > library(microbenchmark) > df <- data.frame(a = 1:1000, b = "b") > DF <- as(df, "DataFrame") > microbenchmark(df$d <- 5L, DF$d <- 5L) > Unit: microseconds > expr min lq mean median uq max neval cld > df$d <- 5L 15.0 17.65 26.119 23.4 28.15 139.3 100 a > DF$d <- 5L 6307.9 6935.85 8204.356 7471.1 8404.60 27048.6 100 b > > using cbind instead of $ is already faster, but I was wondering if there was also another possibility to increase the performance of adding columns to a DataFrame?

Kasper D. Hansen (04:06:36): > @Aaron Lun If you read (parts of) John Chambers' book (Programming with Data), he specifically discusses how S was designed to do pass-by-value to decrease bugs and make it easier to reason about code. I totally agree with your reaction. Like you, I can see how pass-by-reference makes it possible to write more efficient code, but it becomes much harder to reason about.

Spencer Nystrom (10:08:19) (in thread): > Yeah, I was wrong, Stefan says no: https://stackoverflow.com/a/58150884

Hervé Pagès (11:14:02) (in thread): > Not sure where you were wrong. Your new(deepcopy(x)) in the constructor works as expected for me. Only thing is that, with no :cow:, the deep copy is immediately triggered, which is expensive and unnecessary if the code downstream doesn't try to modify x or its copy.

Spencer Nystrom (11:15:23) (in thread): > Right right. For some reason I thought there was a way to delay the deepcopy.

Spencer Nystrom (11:16:27) (in thread): > (or that the compiler could figure it out) which is where I was wrong.

Aaron Lun (12:13:11) (in thread): > Having thought about it, it may be sufficient to proceed by ensuring all of our hypothetical BioC constructors do a deepcopy by default, with an option to turn this off if the user really knows what they’re doing. This would give us “safety by default” plus optional performance for power users.
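The "deepcopy by default, with an opt-out for power users" pattern is easy to sketch; a hypothetical Python constructor (class and parameter names are invented for illustration, not an actual BioC API):

```python
import copy

class Thing:  # hypothetical "safety by default" constructor pattern
    def __init__(self, values, copy_input=True):
        # Deep-copy unless the caller explicitly opts out for performance.
        self.values = copy.deepcopy(values) if copy_input else values

x = [1, 2, 3]
safe = Thing(x)                     # default: insulated copy
fast = Thing(x, copy_input=False)   # power user: shares the caller's list
x[0] = 99
print(safe.values[0], fast.values[0])  # 1 99
```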

Aaron Lun (12:14:20) (in thread): > Deepcopies aren't too bad. We basically do this all the time anyway in SE::assay when we modify the dimnames of the assay.

Liz Ing-Simmons (12:36:10): > @Liz Ing-Simmons has left the channel

Michael Lawrence (16:23:27) (in thread): > Probably because SimpleList is similarly slow compared to list, because it uses the default implementation of setListElement(). It could save time by delegating to the underlying list, while also adjusting the mcols() as necessary. It will always be slower than base because of S4 dispatch and the mcols, but it could be faster. Are you interested in giving it a try via a PR?

2021-02-11

Johannes Rainer (01:50:47) (in thread): > Thanks! I will have a look at it and eventually come up with a PR.

2021-02-12

Janani Ravi (15:52:32): > @Janani Ravi has joined the channel

2021-02-13

Aaron Lun (19:50:31): > @Hervé Pagès it would be so nice if you could export something like S4Vectors:::wmsg2. Would save me a lot of typing to format my own error messages.

2021-02-14

Hervé Pagès (15:26:42): > Do you mean wmsg or wmsg2? The former is what I use everywhere to format error and warning messages, and it's already exported. The latter is a more specialized version that is used internally by S4Vectors::setValidity2 to format the string returned by the validity method. Your validity methods will automatically benefit if you register them with setValidity2, so there should be no need to call wmsg2 directly.

Aaron Lun (18:25:29): > ah, excellent, S4Vectors::wmsg will do nicely.

2021-02-15

Robert Castelo (12:54:44): > hi, i have a question about package deprecation and its downstream dependencies. i wanted to try the package APAlyzer and found that it depends on (imports) DESeq. The result is that after installing APAlyzer in macOS i get a warning saying that DESeq is not available (in linux i cannot even install it): > > BiocManager::install("APAlyzer") > Bioconductor version 3.12 (BiocManager 1.30.10), R 4.0.3 (2020-10-10) > Installing package(s) 'APAlyzer' > Warning: dependency 'DESeq' is not available > > and, consequently, when i try to load APAlyzer i get an error: > > library(APAlyzer) > Error: package or namespace load failed for 'APAlyzer' in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]): > there is no package called 'DESeq' > > i've checked that in Bioc 3.11 i can install and load APAlyzer without any warning about its deprecation, so i guess DESeq was not yet deprecated in Bioc 3.11. > > so, if DESeq was deprecated in Bioc 3.12, shouldn't APAlyzer still work through this release cycle (this is what i'd understand from the package end-of-life documentation here)? > > no need for me to make APAlyzer work in Bioc 3.12 since i can run it on Bioc 3.11, but i just wonder whether the package end-of-life process is correct in this case.

Martin Morgan (13:49:21): > Deprecation is a bit strange. If the package doesn’t build & check (DESeq does not in 3.12), then it doesn’t really matter that it’s labelled ‘deprecated’ — it’s still not available. There could still be value in the deprecation label, with the landing page indicating that the package is not just broken, but won’t be fixed.

Federico Marini (14:28:41): > I think DESeq's deprecation cycle was already initiated in the previous round

Federico Marini (14:29:26): > you could theoretically always grab the source and install "version-agnostic" - not the most kosher thing to do, but it would at least allow you to run APAlyzer?

Henrik Bengtsson (15:02:12): > WISH: Hi, I'd like to propose a change in how Bioconductor does version bumps, or more specifically when they occur: > > If a Bioconductor package does not change (and has no commits), then it would be useful if the package version remained the same across future Bioconductor releases. For example, if affx 1.41.1 was last updated in 2014, then the following Bioconductor release would bump that to affx 1.42.0, but after that, it would stay affx 1.42.0 as long as there have been no updates to the package.
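The release-time bump the proposal refers to is x.y.z → x.(y+1).0; a minimal sketch of that arithmetic (`release_bump` is a hypothetical helper, not a real BiocManager function):

```python
# Bioconductor-style release bump: increment the middle component, reset
# the patch level, e.g. 1.41.1 -> 1.42.0.
def release_bump(version):
    x, y, z = (int(part) for part in version.split("."))
    return f"{x}.{y + 1}.0"

print(release_bump("1.41.1"))  # 1.42.0
```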

Henrik Bengtsson (15:03:16) (in thread): > One problem with the current "always-bump-the-version" approach for each new Bioconductor cycle is that it is not possible to know whether there have been any changes in a package. This means that you have to resort to reading NEWS and ChangeLog files, which not all packages maintain, to figure out if there have been any updates. It also means that you have to go to the source code and do git diff:s to see what's changed, if anything. > > Also, as a package maintainer, if your package only occasionally updates, you have to either do dummy updates in order to keep your NEWS file up-to-date, or ignore the fact that the NEWS file does not show the latest automatically bumped version. If you do the latter, an end-user might look at the version number on the package page, then look at the NEWS file, and conclude that the NEWS file is not maintained. > > So, my proposal is to not bump the version of a package that hasn't received any updates in either the Bioc release or the Bioc devel branch. > > Take affy (https://www.bioconductor.org/packages/release/bioc/html/affy.html) as an example. Bioc release provides affy 1.68.0. The affy NEWS file mentions version 1.41.1. To figure out if there have been any real updates between 1.41.1 and 1.68.0, a user needs to do git clone and git log/diff to figure this out. With my proposal, there would have only been a version bump from affx 1.41.1 to affx 1.42.0 when Bioconductor 2.14 was released in 2014. The Bioc devel version of affx would remain at 1.42.0 until the developer makes a change, in which case it should become 1.43.1 (to fit the current versioning scheme). If the developer does not bump to (> 1.43.0), then the build system will ignore it. Next, when Bioconductor 2.16 comes around, the build system notices that there have been zero changes/commits and will keep affx 1.42.0 as the current version. This way we would not have dummy updates of affx 1.44.0, 1.46.0, …, 1.68.0.

Martin Morgan (15:19:08) (in thread): > There is an implicit change with each release — the version of package dependencies, and also the version of R on which the distributed package was built and ‘last known good’ from the build system perspective. This is important for binary installations (Windows, macOS) in particular. Being able to scan version numbers can be very helpful when debugging support site posts.

Hervé Pagès (15:47:35) (in thread): > Exactly. Avoiding dummy version bumps would be nice and has been proposed before. Unfortunately it doesn't work for Windows and Mac binary packages. These must be rebuilt for every new version of R, and they also need to be rebuilt when things like S4 class definitions or generic/method tables change in the packages they depend on. The systematic version bumps at each new release prevent ending up with different binaries of the same package with the same version number out there, which would be harmful.

Henrik Bengtsson (16:15:00) (in thread): > But, Bioconductor provides different package repos for different releases and R versions for these reasons (e.g. https://bioconductor.org/packages/3.12/bioc), so you can have different binaries for the same package versions. Are you arguing that the version bumping is another layer of protection on top of this?

Henrik Bengtsson (16:16:44) (in thread): > Being able to look at a package version to infer what Bioc release it's from could be convenient, but doesn't sessionInfo() give a much better picture for that?

Henrik Bengtsson (16:20:46) (in thread): > I also think the current version "creep" is unfortunate when it comes to publications. Mentioning that affx 1.42.0 was used for a particular study only serves as a breadcrumb for reproducibility; someone who reads that article years later and tries to reproduce it won't know how relevant the affx 1.68.0 they just installed is compared to what was published.

Martin Morgan (16:23:32) (in thread): > People install packages in all kinds of ways, for instance by installing packages into the same library across Bioconductor (and R) releases. When they report sessionInfo() and it says affy is at 1.42.0, you wouldn't know whether they installed it yesterday or five years ago, with other packages from the current R & Bioconductor or with packages from some point over the last five years. > > In terms of reproducibility, knowing that affy 1.42.0 was used is actually quite useful currently, because you know the analysis was done a number of years ago, and that the relevant Bioconductor packages are from a previous release, not the current release. > > If version numbers were not changed, as already mentioned, '1.42.0' in Bioconductor release 3.12 (using R version 4.0) would not be the same as '1.42.0' in Bioconductor release 3.13 (using R version 4.1). Many things could have changed, large and small.

Martin Morgan (16:31:10) (in thread): > I'm not sure whether '1.42.0' was meant to be an actual case in point, but trying to install under current bioc-devel (R-devel, Bioc-3.13) and using the source tarball available from http://bioconductor.org/packages/2.14/bioc/html/affy.html > > $ bioc-devel CMD INSTALL ~/Downloads/affy_1.42.3.tar.gz > * installing to library '/Users/ma38727/Library/R/4.1/Bioc/3.13/library' > ERROR: dependency 'BiocInstaller' is not available for package 'affy' > * removing '/Users/ma38727/Library/R/4.1/Bioc/3.13/library/affy' > - Attachment (Bioconductor): affy > The package contains functions for exploratory oligonucleotide array analysis. The dependence on tkWidgets only concerns few convenience functions. 'affy' is fully functional without it.

Henrik Bengtsson (16:32:06) (in thread): > > People install packages in all kinds of ways … > I'd argue that bumping the package version to get around this is very much a hack for a bigger problem. At best it would give you just a hint that they have this problem. > > I think the only solution for this problem is education, education, and education. (I also think the current Bioc release/devel split by R version might contribute to this problem by tricking people into bad hacks, but that is a much longer discussion for another time) > > … you wouldn't know whether they installed it yesterday or five years ago > Is that what BiocManager::valid() is for? If not already done, it could certainly check for installation timestamps.

Martin Morgan (16:39:28) (in thread): > BiocManager::valid() doesn't currently need to look at timestamps, and timestamps wouldn't resolve whether 1.42.0 installed yesterday was installed from the 2.10 archive or from the 3.13 archive… The 'Built' field of installed.packages() would tell one about the R version, but there are two releases of Bioconductor per version of R….

Henrik Bengtsson (16:39:45) (in thread): > affxparser was just an example since I know it's rarely updated. > > ERROR: dependency 'BiocInstaller' is not available for package 'affy' > So, this works as expected, I'd say. affx wouldn't make it into the next Bioc release because it depends on BiocInstaller, which is no longer available from the Bioc repositories. The only way for affx to be part of the next release would be to remove that dependency, and then its version would be bumped.

Hervé Pagès (16:42:40) (in thread): > It’s very simple: If the content of a package has changed, its version must change. We want to stick to that paradigm, it’s a healthy one, even though you’re right that in theory, we don’t strictly need it because we have releases. But still. Throwing that paradigm away would open a new can of worms.

Henrik Bengtsson (16:51:48) (in thread): > I think lots of this could be resolved by relying on things such as Depends: R (>= 4.0.3), but also Depends: Bioconductor (>= 3.10), when a package maintainer believes they require them. R would then take it from there. If there's an incompatible version, the package wouldn't load, even if the user were to "force" install it.

Hervé Pagès (16:59:40) (in thread): > I like the idea of injecting something like BiocVersion (>= 3.13) in the package DESCRIPTION files at release time, at the same time that we do the version bumps and branch. It certainly wouldn't hurt and would make a no-version-bump-if-no-change approach less scary.

Hervé Pagès (17:03:58) (in thread): > … and we should also seriously consider using branch names like BIOC_3_13 instead of the current RELEASE_3_13 so the Bioconductor "stamping" would be consistent and immediately recognizable on GitHub.

Henrik Bengtsson (17:05:40) (in thread): > Oh no :zany_face: I didn't mean that you would insert it into "my" Bioc packages. It should be used by the developers/maintainers who require it. For example, Bioc Core could do it for Biobase and friends if they think it's truly needed. > > For my packages I can control this myself, e.g. we know that the only hard requirement for affxparser is Depends: R (>= 2.14.0) - it works with any Bioconductor version because it has no dependencies on it. OTOH, if it depended on Biobase, and Biobase required R (>= 4.0.0), then affxparser would inherit that requirement regardless of its own claim of R (>= 2.14.0).

Martin Morgan (17:08:33) (in thread): > >= 4.0.3 is an interesting assertion that the package will be compatible with all future versions of R! BiocVersion (== 3.13) (is that the right format?) would be less prescient, but it would also require a version bump of the package (right?), so then Henrik would be back at the top of this thread, with the package version changing each release… > > I think it's naive to think that the package maintainer will be conscientious enough to do this; for example, I'm not sure when the last commit from the 'maintainer' of affy was? Certainly it would lead to a much smaller release…

Hervé Pagès (17:10:45) (in thread): > @Henrik Bengtsson But we create branches for each release, remember? We do it for all packages, whether you need it or not. It would be the same for injecting the BiocVersion (>= 3.13) or BiocVersion (>= 3.13 && < 3.14) thing at branch creation. We don't want (and don't need) to leave this responsibility to the developers.

Henrik Bengtsson (17:13:16) (in thread): > BiocManager::valid(): The Built: field in packageDescription(pkg_name) carries information about the install time when installed from source. When installed from a pre-built binary, that timestamp is for when the binary was built, which I guess is a clue in itself. Otherwise file.mtime(find.package(pkg_name)) should be a good clue for when a package was installed. So, these clues could be used to assert that Bioconductor packages were installed during the expected time period.

Hervé Pagès (17:14:14) (in thread): > Because again, once we push a new release, all the binaries (Windows and Mac) actually need that kind of stamping because they differ between BioC versions, EVEN if the package source has not changed.

Henrik Bengtsson (17:14:50) (in thread): > My starting point is/was: if the package source has not changed, then the version should not be bumped. If there is a change to the source, then the version should be changed. This includes changes toDESCRIPTION.

Henrik Bengtsson (17:18:31) (in thread): > Yes, binaries differ when built on a different R version, and also when on the same R version but configured differently (e.g. ./configure with and without --enable-R-shlib), linking to different libraries (some from other R packages), and so on. But the source is still the same.

Hervé Pagès (17:18:38) (in thread): > I agree in theory, but as mentioned before, even if the source has not changed, the various binaries produced for various BioC versions can have different content (mostly cached S4 stuff). So how do you suggest we distinguish them if we don't use different version numbers, or if we don't stamp them somehow by injecting something in their DESCRIPTION files?

Henrik Bengtsson (17:21:54) (in thread): > But, for the same source, all these different binaries are gated by the package repository URL, which is unique for each operating system and architecture, each R x.y.* version, and each Bioconductor x.y version. (So, I argue that modifying the source to add another layer on top of this is not needed and also "a bit odd")

Henrik Bengtsson (17:29:00) (in thread): > > "We don't want (and don't need) to leave this responsibility to the developers." > But that's what your build/check system is for. If something is not working with a new version of R, or with an updated version of, say, Biobase, then the packages that are affected produce R CMD check ERRORs, WARNINGs, and NOTEs. The package maintainer would then be required to fix that in order to make it into the next release cycle.

Hervé Pagès (17:30:19) (in thread): > Experience has taught us that we can't trust the gating provided by the URL / R x.y version / Bioconductor version. As Martin said, people manage to install packages in all kinds of ways. Adding some kind of stamping to the DESCRIPTION file is one way to improve the situation. It doesn't have to be a commit to the package git repo. It could be done on-the-fly by the build system when the package source tarballs and .zip and .tgz files are produced. R CMD build already injects stuff in the DESCRIPTION file, so the exact content of this file already varies slightly depending on when/where R CMD build was run.

Hervé Pagès (17:33:49) (in thread): > > But that’s what your build/check system is for > We only run the build/check system for the current release/devel so we have no way to validate older things. OTOH the stamping remains forever.

Hervé Pagès (17:49:51) (in thread): > If I can choose between using the Built field to infer the BioC version, or using an explicit stamp like "This package belongs to Bioconductor 3.13", I choose the latter without hesitation. It's not only more reliable but it also talks to people. Using file.mtime() is probably even less reliable than the Built field.

Henrik Bengtsson (17:53:48) (in thread): > > As Martin said, people manage to install packages in all kinds of ways. > That's a problem of the installation path (e.g. R_LIBS_USER) not carrying information on architecture, R version, and, in the case of Bioconductor, Bioc version. That's a universal problem in R, e.g. if a user installs to ~/Rlibs there's nothing we can do to save them when they go from R 4.0.4 to R 4.1.0 in a couple of months. R should really produce a big fat warning on startup about this problem and make it really really hard to disable that warning. In lieu of that, BiocManager::install() could check for this and maybe even refuse to install to such "unsafe" folders? > > Related: On Unix, the default installation path is effectively: > > R_LIBS_USER=~/R/%p-library/%v > > which R expands to something like ~/R/x86_64-pc-linux-gnu-library/4.0 during startup. > > Ideally, there would be a mechanism for also including the Bioconductor version, e.g. > > R_LIBS_USER=~/R/%p-library/%v-%b > > expanded as ~/R/x86_64-pc-linux-gnu-library/4.0-Bioc3.12. Bioconductor already has features in base R, e.g. chooseBioCmirror() and R_BIOC_VERSION, so it doesn't seem like an impossible task to get this feature in. > > Even without built-in support for this, BiocManager::install() could provide this service for the user, e.g. install Bioconductor packages to ~/R/x86_64-pc-linux-gnu-library/4.0-Bioc3.12, and library(Bioconductor) could prepend it to .libPaths().

Hervé Pagès (18:01:59) (in thread): > Another issue with the no-version-bump-if-no-change approach is that it doesn’t “leave room” for minor version bumps in release. Let’s say we have Biobase 2.50.0 in both BioC 3.12 and BioC 3.13 but now for some reason we need to make a minor change to the release version but not in devel. We’ll bump the version to 2.50.1 in release and end up in the weird situation where the version in release is higher than in devel. Unless we also bump the version to 2.50.1 in devel (even though we didn’t make any change there) but now we end up with 2 packages with the same version but different content. Unless we bump the version to 2.50.2 in devel but then how are we going to bump the version in release next time we need to make another fix there? In other words, we no longer have a simple/straightforward versioning scheme.

Henrik Bengtsson (18:12:16) (in thread): > The "fork" problem of release and devel is a good point. My proposal would require the devel version to follow the current Bioc version scheme as soon as there's an update to it. So, if the devel branch says 2.50.0, it signals that there have been zero changes to it relative to the release branch. If there's a change, then it has to become 2.51.1. (The git server and build system could protect against pushing anything but 2.51.z)

Hervé Pagès (18:34:03) (in thread): > Protecting against push only works if the developer tries to push something. In case the release branch changes and its version gets bumped to ~2.51.1~ 2.50.1, at the same time something must happen to the devel branch. We can't be passive and wait for something to happen to it, we need to be active. Of course it's technically possible to detect an event like this and automatically trigger some action on the devel branch, but we're adding more failure points to the whole system and making things slightly more confusing for the developer.

Henrik Bengtsson (18:39:58) (in thread): > > In case the release branch changes and its version gets bumped to 2.51.1 … > Did you mean to write 2.50.1 here?

Hervé Pagès (21:02:59) (in thread): > indeed, corrected

Henrik Bengtsson (21:17:23) (in thread): > > … at the same time something must happen to the devel branch. We can’t be passive and wait that something happens to it, we need to be active. > Why does the devel branch have to be updated when the release branch is updated? I think it’s perfectly fine to have release=2.50.1 and devel=2.50.0 - is that a problem? It would reflect that the release branch is actually ahead of the develop branch.

Hervé Pagès (21:35:48) (in thread): > > BiocVersion (== 3.13) (is that the right format?) would be less prescient, but it would also require a version bump of the package (right?) > @Martin Morgan If we accept the notion that stamping the DESCRIPTION file is not the same as modifying the source (and stamping can be done on-the-fly when producing the source tarball and package binaries, at the same time that other fields are also injected in the DESCRIPTION file), then maybe it's ok to not require a version bump just because of the stamping, I don't know. Anyway, my point was that stamping could have some value on its own, even with the current version scheme (systematic version bumps at each release). It's also easy to implement and feels like a natural thing to do right after branch creation. A few years ago I actually thought of adding something like Bioconductor-version: 3.13 to my own packages, but now that BiocVersion is around we should probably take advantage of it.

Hervé Pagès (21:45:53) (in thread): > @Henrik Bengtsson Not necessarily a problem, but it could be one if the fix that went into 2.50.1 is meant to be a release-only thing. Then having devel still at 2.50.0 indeed suggests that it's behind release when it's not. The release and devel branches are just different branches that will never be merged (in general), and neither can be considered to be behind or ahead of the other. So I see some potential for confusion.

2021-02-16

Robert Castelo (03:47:56): > @Federico Marini DESeq in Bioc 3.11 says the following: > > library(DESeq) > Welcome to 'DESeq'. For improved performance, usability and > functionality, please consider migrating to 'DESeq2'. > > BiocManager::version() > [1] '3.11' > > which doesn't signal very clearly that it is being deprecated, especially to package developers that import DESeq. @Martin Morgan as i reported above, in the next (current) release 3.12 DESeq directly fails, breaking the Bioc integrity of its downstream dependencies, which somehow have been released along with it. DESeq not only has APAlyzer as a downstream dependency, it is currently (3.12) imported by 14 other software packages and suggested in 8 more. None of the 14 packages that import DESeq work in the current release; another example after installing it in macOS: > > library(vulcan) > Error: package or namespace load failed for 'vulcan' in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]): > there is no package called 'DESeq' > > and for the other 8, their vignettes cannot be reproduced. i know this is being fixed in the current devel cycle, where DESeq is not there anymore, but i don't think we should release packages that cannot be installed or loaded. the current DESeq2 is being imported by some 70 other packages; if someday we get a DESeq3, the current deprecation procedure applied to DESeq2 could lead to a substantial number of released unusable packages. i'd argue that package deprecation, however we do it, should avoid breaking the integrity of the release.

Hervé Pagès (04:42:29): > DESeq didn't look like it was actively maintained for the last 6 years. The last commit it received from its authors/maintainers is from 2014. Any commit git log shows after that is from a core team member. These are either version bumps at release time, tweaks to its BiocViews, or a small fix to help the package build again after a breaking change in ggplot2. At some point during the BioC 3.11 devel cycle, a change somewhere in R or in one of DESeq's deps broke the package, and this time the core team didn't try to repair it but contacted the maintainer, who agreed that we deprecate the package. What we do in this case is add a deprecation message to the package startup message, but since the package never built again, the new versions of the package with the deprecation message (1.40.0 or above) never propagated. > Not a good situation, I agree, so how do we improve this? More generally speaking, how do we deal with the situation where a package breaks, won't get repaired, but a bunch of other packages depend on it? We're kind of assuming that maintainers keep an eye on the packages they depend on and will take action if they depend on a package that breaks and doesn't get repaired. Obviously that's not enough, so it seems that we should contact them (at least), and also deprecate their packages too, with the option to undeprecate if they address the issue before the next release. > Finally, note that whatever we do, unfortunately we can't completely avoid released unusable packages. Packages can disappear from CRAN anytime, and, when this happens, it breaks all the Bioconductor packages in release and devel that depend on them.

Robert Castelo (04:57:42): > Sure, CRAN packages can disappear anytime, but here I'd say we should at least strive to maintain the integrity of the release for Bioc packages. What I don't understand in this situation is: if, as you say, DESeq broke in the Bioc 3.11 devel cycle, how is it possible that all its downstream dependencies didn't break? Because if they had also broken, they would have either been fixed or wouldn't have made it to the release. I guess I'm missing something here about how the build system works.

Hervé Pagès (05:16:24): > DESeq's vignette broke at some point during the BioC 3.11 devel cycle. Here is the final report for BioC 3.11: https://bioconductor.org/checkResults/3.11/bioc-20201017/DESeq/malbec2-buildsrc.html A broken vignette doesn't necessarily mean that the reverse deps break. Most of the time they don't. But I agree that they shouldn't propagate. Not sure how they did in this case.

Robert Castelo (06:55:54): > Well, in the report you show, the vignette breaks when building DESeq 1.40.0, which means that that specific version of DESeq was never built. This is confirmed by the package landing page for Bioc 3.11, here, where the available tarball is for version 1.39.0. So maybe the reverse deps built in Bioc 3.11 were using DESeq 1.39.0.

Mike Smith (10:55:02): > Does anyone have any topics they’d like to present at the next Developers Forum teleconference? I’ve been absent from here for a few weeks and it looks like there’s been loads of discussion! > > If not I was thinking about doing something where we compare and contrast some of the unit testing packages available in R, mostly because I always forget how RUnit is supposed to be run. It’d be great to know if anyone would be willing to give a short overview on their favourite solution.

Hervé Pagès (12:42:57) (in thread): > Right, in the case of BioC 3.11 that's probably what happened. But in the case of BioC 3.12, no version of DESeq ever propagated, so the rev deps shouldn't have propagated either. The script that controls propagation is supposed to block packages with impossible deps, but it looks like in this case it didn't. :worried:

Vince Carey (13:12:25): > +1 on unit testing discussion. Are there upsides/downsides to RUnit over testthat? Are there alternatives? Is measurement of test coverage with covr a useful practice?

Dirk Eddelbuettel (13:20:52): > Very happy tinytest user here. Converted (almost) all my RUnit use cases to tinytest (across a few dozen packages). Converted one testthat package too, for performance reasons. Am less familiar with testthat so a "comparison" is tricky. But would be happy to talk about why I like tinytest and what I see as its key strengths. The author is a friend too… He was recently asked to compare and managed to keep it at tweet length: https://twitter.com/markvdloo/status/1359974695114801154 - Attachment (twitter): Attachment > @msberends @rstats_tweets @BrodieGaslam tests install with the pkg so users can test your pkg at their infra, testing in parallel, auto-record side effects (like changes in env vars), set env vars during testing, simple test file layout, test results are data, dependency-free. https://pbs.twimg.com/media/Et-ZsbVXEAEuhDj.png
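For those who haven't seen it, a tinytest setup is small enough to sketch inline. The layout below is a sketch for a hypothetical package "mypkg"; the two commented headers denote separate files, so treat this as a layout illustration rather than a single runnable script (see the tinytest documentation for the authoritative layout):

```r
# tests/tinytest.R -- the single runner file that R CMD check executes
# (sketch for a hypothetical package "mypkg"):
if (requireNamespace("tinytest", quietly = TRUE)) {
  tinytest::test_package("mypkg")
}

# inst/tinytest/test_basics.R -- test files install with the package,
# so users can re-run them on their own infrastructure:
expect_equal(1 + 1, 2)
expect_true(is.numeric(pi))
expect_error(stop("boom"))
```
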

2021-02-17

Sean Davis (04:35:12) (in thread): > @Tusharkanti Ghosh, any thoughts on future topics?

Tusharkanti Ghosh (04:35:18): > @Tusharkanti Ghosh has joined the channel

Mike Smith (16:25:48): > Thanks @Vince Carey, those are exactly the sort of questions I think it'd be nice to address. There are at least 3 popular packages that I know of, but they all work slightly differently, and I thought it'd be great to get an overview of where they are similar and different. Thanks @Dirk Eddelbuettel for volunteering to cover tinytest. The call will be next Thursday (Feb 25th) if anyone would like to spend 10 minutes introducing RUnit, testthat, or something else I've missed. I'm happy to coordinate things.

Marcel Ramos Pérez (16:38:06): > Did you mean Thursday Feb 25th? :)

Andres Wokaty (16:39:43): > @Andres Wokaty has joined the channel

Martin Morgan (17:41:36) (in thread): > I think it would be useful to include a discussion of what a unit test is, and what developer practices can help make code more testable…

Henrik Bengtsson (21:07:21) (in thread): > Wed or Thur?

Dirk Eddelbuettel (21:08:26) (in thread): > See the next post by @Marcel Ramos Pérez in the main thread. In short, always Thu, always 11h Central for me, so likely 1pm for you.

Henrik Bengtsson (21:39:49) (in thread): > Yes, I saw that comment, but it was phrased as a question, so as a sporadic attendee it wasn't clear. (@Mike Smith may I suggest you edit your post in case others wonder)

Henrik Bengtsson (21:44:41) (in thread): > "always 11h Central for me so likely 1pm" - thinko or typo? you meant 9 am PST, eh? (tried with anytime, but it failed to parse :stuck_out_tongue_winking_eye:)

Dirk Eddelbuettel (21:47:09) (in thread): > Yeah yeah yeah. I clearly talk more with people east of me than people west of me. And besides, may I not introduce a new off-by-four class of error?

Dirk Eddelbuettel (21:49:02) (in thread): > @Mike Smith being the thorough fellow that he is will generally send out invites with proper dates and times, generally fool-proof even for '90 proof' fools like me …

2021-02-18

Mike Smith (03:44:33) (in thread): > So thorough that I have trouble reading a calendar (don't get me started on daylight savings!). I'll send a proper announcement email/post later today. I like to have the topic decided before I do that. Normally that happens behind the scenes, but that didn't work this month, so I'm doing a public request for contributions.

2021-02-22

Mike Smith (11:56:08): > The next Bioconductor Developers' Forum is this Thursday 25th February at 09:00 PST / 12:00 EST / 18:00 CET - you can find a calendar invite attached and at https://tinyurl.com/BiocDevel-2021-02 > * This month we're going to focus on Unit Testing! We'll introduce what unit tests are and why they might be useful for your R package development. > * We'll also introduce some of the common R packages that help us write and run unit tests. To that end @Dirk Eddelbuettel will give an overview of 'tinytest' and how he makes use of it in his work. > * If you're a fan (or indeed not!) of testthat, RUnit, or any other testing strategy and would be willing to give a 10 minute overview please let me know ASAP! > We will be using BlueJeans and the meeting can be joined via: https://bluejeans.com/114067881 (Meeting ID: 114 067 881) - File (Binary): BiocDevel-2021-02.ics

Dirk Eddelbuettel (11:58:04) (in thread): > FWIW my (draft) slides have something on both context (“How does R call any tests ?”) and transition from RUnit.

2021-02-24

Stuart Lee (00:34:34): > This might be of interest: https://melvidoni.rbind.io/publication/2021-rsatd/ - Attachment (Dr Melina Vidoni): Self-Admitted Technical Debt in R Packages: An Exploratory Study [Preprint] | Dr Melina Vidoni > Self-Admitted Technical Debt (SATD) is a particular case of Technical Debt (TD) where developers explicitly acknowledge their sub-optimal implementation decisions. Although previous studies have demonstrated that SATD is common in software projects and negatively impacts their maintenance, they have mostly approached software systems coded in traditional object-oriented programming (OOP), such as Java, C++ or .NET. This paper studies SATD in R packages and reports the results of a three-part study. The first part mined more than 500 R packages available on GitHub and analysed more than 164k of comments to generate a dataset. The second part administered crowd-sourcing to analyse the quality of the extracted comments, while the third part conducted a survey to address developers' perspectives regarding SATD comments. The main findings indicate that a large amount of outdated code is left commented, with SATD accounting for about 3% of comments. Code Debt was the most common type, but there were also traces of Algorithm Debt, and there is a considerable amount of comments dedicated to circumventing CRAN checks. Moreover, package authors seldom address the SATD they encounter and often add it as self-reminders.

Vince Carey (07:41:43) (in thread): > Interesting concepts. I had heard the phrase “technical debt” (TD) but did not think it through. Clearly it implies that those taking on TD may have to pay back – and at an inconvenient time! The paper describes its methodology in detail but I don’t yet have a sense of how one could establish milestones for paying down the debt. Perhaps the best one can do is have a systematic code/comment review process and grade the results over time.

Vince Carey (08:03:57) (in thread): > https://www.usdebtclock.org/ … imagine something like this at bioconductor.org … or CRAN, but measuring technical debt. Finally, I searched the paper for "surplus" to no avail. Could debt (identifiably suboptimal code/coding processes) be offset by other approaches to increasing code value, such as adding vignettes and demonstrating achievement of valid solutions to substantive problems, and should this be taken into account? - Attachment (usdebtclock.org): U.S. National Debt Clock : Real Time > US National Debt Clock : Real Time U.S. National Debt Clock

Hervé Pagès (17:54:39) (in thread): > It's almost impossible to quantify the TD and its interest for open source software like Bioconductor, but the concept still applies, and my feeling is that the TD tends to be underestimated. This kind of relates to a paper you posted a link to a few months ago, Vince, about the importance of spending resources on maintenance, which includes paying back the debt.

2021-02-25

Mike Smith (05:50:57) (in thread): > Was anyone here contacted by the authors to be part of the survey? I feel a little disappointed that none of my packages on Github made it into their 500 selected packages.

Mike Smith (11:51:57) (in thread): > Just a quick reminder that this starts in 10 minutes. Really looking forward to the discussion.

Dirk Eddelbuettel (13:15:21): > Thanks to Mike and everybody for having me, and for a very lively discussion. Slides are here: - File (PDF): bioc_testing_feb2021.pdf

2021-02-26

Kasper D. Hansen (02:35:28): > Nice slides @Dirk Eddelbuettel. This was probably mentioned, but from the point of view of Bioconductor, I think integration with our build system output is pretty important. In the long run, developers will interact with errors not generated on their own setup. I don't know what tinytest outputs in case of an error, but here is a successful build report using tinytest: http://bioconductor.org/checkResults/devel/bioc-LATEST/sRACIPE/malbec2-checksrc.html and here is one with RUnit: http://bioconductor.org/checkResults/devel/bioc-LATEST/minfi/malbec2-checksrc.html. I am not saying RUnit is better here (to really make a call on this, I would like to see the output with failed tests); I am just saying this is a point of consideration for our project.

Mike Smith (03:53:45): > Well, I'm glad we covered this topic - I've learnt lots of new things! I must have seen the test output for some packages in the build reports, but I'd never wondered about its absence in my reports, or clicked that it could be related to the choice of testing infrastructure. Here's another example with RUnit where I actually find the output a bit overwhelming, as there are so many separate files: http://bioconductor.org/checkResults/release/bioc-LATEST/illuminaio/nebbiolo1-checksrc.html

Mike Smith (04:01:37): > I'll also add that as someone who hasn't used RUnit, I don't think I've missed those details from the build system. In my experience with testthat, if a build fails you get enough information in the standard check output to tell you that the problem is in the tests (and which test it was), e.g. > > * checking tests ... > Running 'testthat.R' > ERROR > Running the tests in 'tests/testthat.R' failed. > ... > ══ Failed tests ════════════════════════════════════════════════════════════════ > ── Error (test_problemPage.R:4:1): (code run outside of `test_that()`) ───────── > Error: Used authorPattern return zero results. > Backtrace: > █ > 1. └─BiocPkgTools::problemPage(includeOK = TRUE) test_problemPage.R:4:0 > 2. └─BiocPkgTools:::checkMe(...) > > [ FAIL 1 | WARN 1 | SKIP 0 | PASS 21 ] > Error: Test failures > > The yellow/red sign on the build report is enough to alert me to the problem, and I'd typically then try to recreate the error locally on my own hardware.

Kasper D. Hansen (04:56:13): > Perhaps I formulated myself badly. I have been using RUnit because back in the day it was the only option. My main point is that IMO a desirable feature is for a package test suite to (a) run all tests even if some of them fail (probably all frameworks do this) and (b) have enough output visible in our build logs that you can comprehensively identify what is wrong, even on platforms you have less ready access to. We can also potentially change the way our build logs are generated and displayed, of course.

Kasper D. Hansen (04:56:48): > This is especially useful for packages which are still widely used but you are no longer actively working on.

Kasper D. Hansen (04:58:18): > And it is also a comment on the slides, which are more focused on how to write the tests and have no example of what a failed test's output looks like (which I do care about).

Lluís Revilla (05:32:20) (in thread): > I think I took part in this study. I reviewed some R comments in a survey. I don't think the authors of the packages used got any message.

Mike Smith (05:40:19) (in thread): > My initial reading of the methods made me think they selected the packages for analysis, extracted the authors from the DESCRIPTION files, and then emailed those authors. On reflection it would be a bit weird to potentially review your own comments, so maybe I misread the methodology.

Dirk Eddelbuettel (08:22:35) (in thread): > You could just mock up a package quickly and add setup_tinytest(), or add the three lines to an existing package – then you can engineer exactly the failure you want to see. Also, given the 130+ CRAN examples, which likely all have public CI, you will probably find a few examples at Travis, GitHub (Actions), Azure (Pipelines). > My biggest beef here, if any, is the poor report back from R CMD check. The reporting from tinytest itself is pretty clear, and yes, a wider example would have been nice too. Good suggestion!

Dirk Eddelbuettel (08:47:00) (in thread): - File (Plain Text): Untitled

Dirk Eddelbuettel (08:47:55) (in thread): > So went digging and here is the last fail for Rcpp a while back (URL is https://github.com/RcppCore/Rcpp/runs/1717107641?check_suite_focus=true). You can see just above that all other tests still pass. (This one is nasty; at one point I was frustrated by a version comparison, so now I set it in a header and in DESCRIPTION, and this test matches them … and fails when one forgets to update the header).

Dirk Eddelbuettel (08:50:47) (in thread): > So again, as I said: these test runners are overall more alike than different. If you’re happy with your current selection, just rock on.

Kasper D. Hansen (09:19:40) (in thread): > Thanks, that looks good to me

Dirk Eddelbuettel (09:21:19) (in thread): > (You need to, as you likely did, make it "wide" to avoid the line breaks here.) > Yes, the basic information is all there: where it happened, what was received, what was expected, … > When you run tinytest by hand or interactively, the reporting is actually nice as the running counters 'overwrite in place' on the screen. Which … of course ends up with repeated lines in R CMD check logfiles. That is less nice, but a 'fee' we pay. Can't win 'em all.

Dirk Eddelbuettel (09:42:09) (in thread): > And because no good deed goes unpunished, here is tiledb at CRAN failing 2 out of 702 tests over a seemingly tiny time difference: https://www.r-project.org/nosvn/R.check/r-release-macos-x86_64/tiledb-00check.html

Krithika Bhuvanesh (11:05:54): > @Krithika Bhuvanesh has joined the channel

Krithika Bhuvanesh (11:08:58): > @Krithika Bhuvanesh has left the channel

Krithika Bhuvanesh (12:26:19): > @Krithika Bhuvanesh has joined the channel

Henrik Bengtsson (12:57:11) (in thread): > > (a) run all tests even if some of them fail (probably all frameworks do this) > R CMD check on vanilla tests/*.R test scripts runs all test scripts even if some fail. It's been like that for a few years, but, in the past, it did indeed stop at the first test script that failed.

Henrik Bengtsson (13:15:49) (in thread): > > (b) have enough output visible in our build logs that you can comprehensively identify what is wrong, even on platforms you have less ready access to. > This can be controlled by the environment variable _R_CHECK_TESTS_NLINES_: "Number of trailing lines of test output to reproduce in the log. If 0 all lines except the R preamble are reproduced. Default: 13." [R Internals] > > This applies to each tests/*.R script that failed - not to the pool of them. So, if you have four failed tests, you'll get 13 tail lines of error output for each of them. > > PS. This has to be set prior to launching R CMD check, i.e. it is not possible to set it from within one of the package test scripts.

Henrik Bengtsson (13:27:10) (in thread): > The fact that you get R CMD check error output for each failed tests/*.R script is the number one reason why I (still) stay with bare-bones, vanilla tests/*.R scripts. > > All the other test frameworks (RUnit, testthat, tinytest, …) rely on a single tests/testall.R file that then runs the test scripts living in some other folder. This means that these frameworks can get at most _R_CHECK_TESTS_NLINES_ (= 13) lines of error output in total, regardless of how many test scripts fail. This is an unfortunate limitation, particularly when you try to troubleshoot errors on a remote machine (e.g. CRAN). To solve this, R Core would have to implement something different for these test frameworks. > > However, a feasible workaround would be to have these test frameworks generate individual tests/*.R files from, say, inst/tests/*.R, e.g. tinytest::build_tests(). Basically, a pre-compiler. If this could be done automatically during R CMD build or R CMD check that would be awesome, but I'm not sure there's such a "hook".
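A bare-bones vanilla tests/*.R script of the kind described above needs no framework at all; a generic sketch (the file name and assertions are made up for illustration):

```r
# tests/test_basics.R -- a vanilla test script, no framework required.
# R CMD check runs every tests/*.R file, and each failing script gets
# its own tail of error output in the check log.
x <- c(1, 2, 3)
stopifnot(
  identical(sum(x), 6),      # stopifnot() errors on the first FALSE
  identical(rev(rev(x)), x),
  all(x > 0)
)
cat("all assertions passed\n")
```

If the script runs to completion, R CMD check records the test as passing; any uncaught error fails the check and lands in the corresponding .Rout.fail log.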

Dirk Eddelbuettel (13:32:50) (in thread): > But you are not limited to interacting with tests via R CMD check. It is one hook for 'aggregation', just as lots of things are checked in that run. For a more fine-grained look at tests you have the tools to run per directory, package, file, …

Henrik Bengtsson (13:36:39) (in thread): > Just like @Kasper D. Hansen, I'm mostly interested in enough breadcrumbs to be able to troubleshoot errors occurring on remote machines but also architectures that I don't have access to, e.g. CRAN servers. I know how to troubleshoot "interactively" locally.

Dirk Eddelbuettel (13:37:37) (in thread): > cp -ax inst/tinytest/*.R tests/ – there, I fixed it for you :wink: (and of course the markdown parser got stoopid). This may need a library(tinytest) in each too, as you forgo the aggregating runner.

Dirk Eddelbuettel (13:40:32) (in thread): > I prefer to be “pragmatic” rather than “dogmatic” in my use of these tools. There is one aspect I have control over (do I want tests? if so, how? which runner?) and one I very much do not (whatever R Core does).C’est la vie

Dirk Eddelbuettel (13:41:56) (in thread): > But I am with you and Konrad on the limitation here; for example at GH Actions I currently have limited visibility if something fails in an R CMD check run.

Henrik Bengtsson (14:29:18) (in thread): > FWIW, on CI systems, we always have the possibility to dump the tests/*.R.{out,fail} logs at the end.

Dirk Eddelbuettel (14:31:05) (in thread): > And/or to set the env var for max lines to zero, as I do for run.sh from r-ci which is my go-to for CI: https://github.com/eddelbuettel/r-ci/blob/master/docs/run.sh#L43 – but the last time this bit me it was a different (more complicated, to me) CI setup by someone else…

2021-02-27

Matt Ritchie (05:05:23) (in thread): > The authors are planning a follow-up study that looks into this for Bioconductor packages, so if you have suggestions re: how to improve the methodology used, do contact them

2021-03-01

Kayla Interdonato (12:14:27): > The recording from Thursday's developers forum is now available on our YouTube - https://www.youtube.com/watch?v=nmY1jIhY9-Q as well as under the course materials on the Bioconductor website. - Attachment (YouTube): Developers Forum 19

2021-03-02

Mike Smith (11:56:39) (in thread): > Thanks a lot Kayla!

2021-03-10

Michael Lawrence (13:00:04): > Suggested topic for a future developer forum: the in-progress working group on OOP in R: https://github.com/RConsortium/OOP-WG. We'd like to start discussing how we could facilitate porting frameworks like Bioconductor to our proposed system.

2021-03-12

USLACKBOT (15:23:05): > This message was deleted.

USLACKBOT (15:24:04): > This message was deleted.

Michael Lawrence (16:00:07) (in thread): > For some reason I recall Simon Urbanek doing something like this, but I might be misremembering.

Hervé Pagès (16:49:16) (in thread): > One could work around this by moving the sensitive parts of the R code to the C/C++ level and produce/distribute binary packages. Actually there's currently an attempt at sneaking this kind of closed-source package into Bioconductor that I'm a little bit concerned about: https://github.com/Bioconductor/Contributions/issues/1886 - Attachment: #1886 rawrr > Repository: https://github.com/cpanse/rawrr (the rest of the issue follows the standard Bioconductor submission checklist)

2021-03-15

Pablo Rodriguez (06:02:49) (in thread): > Electron lets you encapsulate Shiny apps into executables; not sure if you can do the same with an R package… - Attachment (electronjs.org): Application Distribution | Electron > NOTE: the location of Electron's prebuilt binaries is indicated with electron/ in the examples below.

Kasper D. Hansen (08:12:25) (in thread): > @Hervé Pagès It sounded concerning and I took a look. Unfortunately it is about reading mass spec data, and the mass spec world is full of proprietary code and data formats, so I think it is a step forward. Might make sense to ping people more on top of this.

Hervé Pagès (12:22:04) (in thread): > Thanks for taking a look. Hopefully the mass spec experts on this Slack can take a look too. @Laurent Gatto, @Johannes Rainer, @Steffen Neumann: would be interesting to hear your thoughts on this. Any feedback you have on the rawrr submission would be appreciated. Thanks!

Steffen Neumann (12:45:52) (in thread): > I am lacking some context :disappointed: Depending on the topic, there would be #metabolomics or #proteomics for deeper discussion.

Hervé Pagès (13:13:45) (in thread): > See the link to rawrr’s submission I posted above.

Laurent Gatto (13:19:48) (in thread): > There are open formats for MS data that are supported by mzR. The proprietary binary formats aren't open and need vendor libraries to be parsed. rawrr uses mono to parse these raw vendor files on non-Windows platforms. I assume it bundles the vendor libraries in the package, hence the closed-source component of the package.

Hervé Pagès (13:26:03) (in thread): > Yes, rawrr bundles the RawFileReader closed-source proprietary DLLs. Is this something we’re ok with? This creates a precedent. I’m not aware of any other Bioconductor package that does something like this. I’m not even sure the RawFileReader’s license allows us to distribute something like this. I’ve asked the rawrr folks to contact the RawFileReader people for clarifications.

Laurent Gatto (13:33:39) (in thread): > Yes, RawFileReader is the vendor library. I think they are allowed to distribute it, but users need to agree to a licence when using it (typically clicking yes on first use, or something along those lines)

Laurent Gatto (13:35:03) (in thread): > The reason is totally legit, but I agree it creates a precedent. The package is currently on github and available/installable for anyone to use, as far as I know. Not having it on Bioc wouldn’t, I believe, stop the community from using it.

Hervé Pagès (13:59:42) (in thread): > Assuming we get the green light from the RawFileReader people for redistributing their DLLs, how does rawrr fit in the Bioconductor proteomics ecosystem? Should it return the data in some particular format, e.g. in a Spectra object or derivative? There wouldn't be much incentive to violate the open source dogma if it doesn't interoperate nicely with the mzR stack, would there? I'm not familiar enough with the topic to know what level of interoperability is expected here.

Vince Carey (14:05:10) (in thread): > This is the first proprietary DLL to be distributed by Bioconductor? I think it would be better to have a function that retrieves the DLL when the user needs it. I am concerned about the precedent.

Vince Carey (14:09:51) (in thread): > So my response at the moment is a) do not redistribute proprietary DLLs in Bioconductor and b) if the package does not make use of Bioc classes or otherwise interact meaningfully with the Bioc ecosystem, it should not be in Bioconductor. I would be willing to provide guidance on how to solve b, and once it is solved, could look into strategies for defining helper functions that acquire and link to the said DLLs. The priority to be assigned to these offers is likely low but if there is an urgency let me know.

Hervé Pagès (14:13:18) (in thread): > Thanks Vince. I think we are on the same page. I'll forward this to the rawrr folks through https://github.com/Bioconductor/Contributions/issues/1886

Hervé Pagès (14:41:58) (in thread): > Done: https://github.com/Bioconductor/Contributions/issues/1886#issuecomment-799658952 - Attachment: Comment on #1886 rawrr > Hi @cpanse , @tobiasko , > > There’s been some internal discussion about this. Bioconductor is open source and has never distributed proprietary DLLs before so this would create a precedent. We think it would be better to have the RawFileReader DLLs automatically retrieved by the rawrr package the first time the user needs them (they could get installed in a place like tools::R_user_dir(which="cache"), in particular it’s important to not install them in rawrr’s installation folder because on many systems this is a read-only place). This would also provide a natural stop for asking the user to agree on the terms of the RawFileReader licence. > > The package would also need to interact meaningfully with the Bioconductor ecosystem e.g. by making use of Bioc existing classes for data representation. This is a generic requirement that applies to all submissions. For example, in the single cell experiment world, submission that deal with single cell data are strongly encouraged to use the SingleCellExperiment class, or at least to be able to operate on SingleCellExperiment objects. > > We suggest that you join the community-bioc Slack if you’ve not done so already, in particular the #metabolomics or #proteomics channels. This is a great place to meet other Bioconductor developers and discuss the interoperability topic with them. > > Best,
> H.

Laurent Gatto (14:58:05) (in thread): > As far as I can remember, rawrr does use Bioconductor data structures. It is a backend for the Spectra package. Spectra defines an API and backends that use DataFrame (for small data) or mzML files (via mzR), and rawrr provides a backend for vendor files. So it does pass Vince's point b above.

Steffen Neumann (15:00:06) (in thread): > Would be interesting to check if https://cran.r-project.org/web/packages/opentimsr/index.html can also serve as a Spectra backend. - Attachment (cran.r-project.org): opentimsr: An Open-Source Loader for Bruker’s timsTOF Data Files > A free, open-source package designed for handling .tdf data files produced by Bruker’s ‘timsTOF’ mass spectrometers. Fast, free, crossplatform, with no reading through EULAs or messing with binary .dll files involved.

Laurent Gatto (15:00:21) (in thread): > The README file says > > The package provides access to proprietary Thermo Fisher Scientific Orbitrap instrument data as a stand-alone R package or serves as MsRawFileReaderBackend for the Bioconductor Spectra package. > so it seems to serve its own data structure and work as a backend for Spectra. - Attachment (Bioconductor): Spectra > The Spectra package defines an efficient infrastructure for storing and handling mass spectrometry spectra and functionality to subset, process, visualize and compare spectra data. It provides different implementations (backends) to store mass spectrometry data. These comprise backends tuned for fast data access and processing and backends for very large data sets ensuring a small memory footprint.

Laurent Gatto (15:04:43) (in thread): > @Hervé Pagès - I'm happy to help out if there are specific points with this submission; just ping me in the issue. Due to a heavy teaching load this week, it might take a bit of time to get back to you though.

Hervé Pagès (15:07:04) (in thread): > Thanks Laurent. I've encouraged the rawrr folks to join the Slack (#metabolomics and #proteomics channels) to discuss interoperability with you there.

Kasper D. Hansen (16:32:27) (in thread): > I can see some potential legal issues with distributing the DLL. I can also see why we - out of principle - don't want to host packages which depend on proprietary DLLs. However, I don't see that downloading the DLL through a function solves the latter issue. I think it only addresses the legal stuff.

Henrik Bengtsson (21:36:59) (in thread): > There's a big security issue in going into the business of distributing 3rd-party binaries, compared with binaries built from 3rd-party source. CRAN clearly draws the line there - no binaries at all [https://cran.r-project.org/web/packages/policies.html] (not even when built by the package maintainers themselves) - Attachment (cran.r-project.org): CRAN Repository Policy > CRAN Repository Policy

Henrik Bengtsson (21:40:06) (in thread): > Then there's also the scientific and reproducibility aspect: what happens when the 3rd party goes out of business and there is no one to care about these binaries 10 years from now, but there's still data out there that needs to be parsed by them, and it'll require researchers to dig out old machines with old operating systems in order to use them?

2021-03-16

Hervé Pagès (00:45:12) (in thread): > Yes, closed proprietary software stinks, nothing new here. People will decide if they want to use rawrr or not, and when they are asked to accept the terms of the licence, it will be their choice, not mine. At least, with the automatic download of the DLLs on first use, we’ll stay away from the business of redistributing these DLLs.

Mike Smith (04:16:57): > The next Bioconductor Developers’ Forum is this Thursday 18th March at 09:00 PST / 12:00 EST / 17:00 CET - You can find a calendar invite attached and at https://tinyurl.com/BiocDevel-2021-03 PLEASE NOTE: this is an hour earlier for those of us not yet in daylight savings time! > * This month we’re discussing object-oriented programming in R - specifically the efforts of the R Consortium Object-Oriented Programming Working Group and how Bioconductor could be ported to the new OO system they’re proposing. > * We’ll be led by @Michael Lawrence, plus others from the working group. This should be a really interesting insight into the progress they’re making, and give the Bioconductor community a chance to shape the direction of a cornerstone of R’s future. > * As a primer there’s a really great short summary of the S3 and S4 systems at https://github.com/RConsortium/OOP-WG/blob/master/proposal/proposal.org which I thoroughly enjoyed reading. > We will be using BlueJeans and the meeting can be joined via: https://bluejeans.com/114067881 (Meeting ID: 114 067 881) - File (Binary): BiocDevel-2021-03.ics

Johannes Rainer (05:44:43) (in thread): > Sorry for being late on this topic. I know the people from the rawrr package and AFAIK (and as @Laurent Gatto wrote above) they also provide a backend to import MS data directly from the Thermo proprietary raw data files. This backend is supposed to be used with the Spectra BioC package, so users can also load data from these proprietary data files. Like all of you I’m not a big fan of closed source and bundling DLLs along with the package. Generally I like that the developers picked up our idea of the backends and implemented this - just the lack of openness is a real problem I believe (legally, security-wise and eventually for long-term support). You can also include me in any discussions with them @Hervé Pagès if needed.

Hervé Pagès (12:03:26) (in thread): > I will. Thanks Johannes

Nathan Eastwood (15:30:33): > @Nathan Eastwood has joined the channel

Hervé Pagès (17:30:56) (in thread): > FWIW, this conversation has now moved to the #proteomics channel.

2021-03-18

Michael Lawrence (13:13:37): > For anyone interested in exploring the adoption of R7 (just presented at the Dev Forum) by Bioconductor, feel free to join the #r7 channel.

Kayla Interdonato (13:27:08): > @Michael Lawrence Would you be able to provide a link to your slides? I’ll be posting the recording under course materials on the website and it would be great to have the slides as well.

Michael Lawrence (13:35:04) (in thread): > Could I email you a PDF? Or is there an easier way than email?

Kayla Interdonato (13:37:45) (in thread): > Either email (kayla.morrell@roswellpark.org) or I think you can attach a PDF on Slack if you want to direct message me. Whichever way works for me.

Hervé Pagès (15:28:26) (in thread): > why not the R7 channel? (upper case)

Michael Lawrence (15:40:10) (in thread): - File (PDF): R7 for Bioc.pdf

Michael Lawrence (15:40:20) (in thread): > Good idea about the Slack attachment.

Michael Lawrence (15:41:41) (in thread): > Just assumed that Slack is case insensitive (channels seem to always be lowercase)

Hervé Pagès (15:46:11) (in thread): > ah ok. I guess that makes sense. Now I understand why our channel names are so ugly.

2021-03-19

Kayla Interdonato (10:40:46): > The recording from the developers’ forum yesterday is now available on our YouTube - https://www.youtube.com/watch?v=_QsFRiOBjt8 - as well as under the course materials on the Bioconductor website. - Attachment (YouTube): Developers Forum 20

2021-03-23

Lambda Moses (22:53:45): > Thank you for your answers to my previous development related questions. I also learnt a lot from the developers forum talks, though I never spoke up to ask questions during the sessions. Now I have more questions. For Python dependencies, we have basilisk. I used it in a package and it’s really cool. But for those who use Rcpp, what are your experiences in managing C++ dependencies in SystemRequirements? Any suggestions, especially when there are multiple C++ dependencies (like in sf)? Also, can CMake be used in Bioconductor packages?

Lambda Moses (22:53:49) (in thread): > For me, it’s fine on Mac since I have homebrew, but sometimes it’s a pain on a Linux server (I don’t have root to use yum) when the dependency is not already installed system-wide, and installing it with Anaconda sometimes introduced ABI errors when the dependency requires a recent version of gcc but there’s a system-wide outdated gcc. That’s why I never got sf to compile on that server: sf requires gdal >= 2.0.1, which requires a more recent version of gcc, which Anaconda has, but that outdated system-wide gcc caused trouble. But even with root, it’s still kind of a pain to get sf’s C++ dependencies to install properly on all systems in GitHub Actions. I’m not really specifically asking about sf; that’s a particularly nasty case. I’m still asking because I’m writing a package that calls OpenCV with Rcpp for really fast image processing and I don’t feel like writing RcppOpenCV (or should there be an RcppOpenCV?). It works on my computer but I anticipate nasty compiler errors from users for mysterious reasons (I have got mysterious compiler errors from RcppArmadillo before, so…). I really wish that C++ had something like CRAN or pip or conda, so I wouldn’t have to debate what to do when there’s a C++ library implementing something cool but introducing it as a dependency is such a pain that I would consider copying and pasting from that library or reinventing the wheel. In R and Python, I wouldn’t hesitate to use a dependency unless it’s really easy to do without.

2021-03-24

Dirk Eddelbuettel (00:03:24) (in thread): > It’s complicated, and there are (as you found out) few shortcuts. I put a few notes concerning approaches for package dependencies into this note covering a simpler case: https://arxiv.org/abs/1911.06416 The most general case of external system requirements simply has no fully applicable solution for all operating systems and requirements. - Attachment (arXiv.org): Thirteen Simple Steps for Creating An R Package with an External… > We describe how we extend R with an external C++ code library by using the Rcpp package. Our working example uses the recent machine learning library and application ‘Corels’ providing optimal yet…

Vince Carey (06:05:48) (in thread): > Is containerization a relevant strategy? When it seems likely that many users will not be able to establish the necessary runtime infrastructure for, e.g., OpenCV, then extending a well-established container to include it could simplify things for the user and ultimately the developer. The user must learn something about, e.g., docker client and its usage, but that will be repaid before too long.

Kasper D. Hansen (06:47:17) (in thread): > This is a common situation.

Kasper D. Hansen (06:48:28) (in thread): > You want to depend on an external library which is a pain to install. There are two extreme approaches to this (1) distribute the library inside an R package (2) have a configure file which finds and links the library.

Kasper D. Hansen (06:49:49) (in thread): > Choosing (1) is a big headache to you. You now become responsible for getting your package to compile on different systems. If the library is hard to install, that makes it even worse. HOWEVER, there are great benefits to your users because IF you can get it to work, it just … works.

Kasper D. Hansen (06:51:13) (in thread): > Choosing (2) is easy for you; you say that the library is hard to install and you essentially (through the configure call) require the user to fix this problem on their own machine. However, you’re essentially passing the buck to the user. As you yourself note (with the sf example), that can be a real pain for users

Kasper D. Hansen (06:51:36) (in thread): > Philosophically, I like to think of this as a tradeoff between user time and developer time.

Kasper D. Hansen (06:52:19) (in thread): > With (1) you minimize user time at (great) cost to (your) developer time. With (2) you minimize the cost to the developer, but you increase the user time, potentially substantially.

Kasper D. Hansen (06:53:20) (in thread): > In my mind (1) is really the best solution for the R ecosystem, but it can also be a potentially impossible solution because you are now responsible for solving compilation problems for code you didn’t even write yourself.

Kasper D. Hansen (06:53:37) (in thread): > You should avoid CMAKE

Spencer Nystrom (07:53:42) (in thread): > Just a brief additional comment on Kasper’s points: another determining factor behind whether you can include the library is of course how the software is licensed. If it’s too restrictive, you can’t include it even if you wanted to.

Kasper D. Hansen (09:30:52) (in thread): > That’s important as well, and I am completely ignoring that

Dirk Eddelbuettel (09:33:00) (in thread): > Well, first paragraph on page 3 of https://arxiv.org/pdf/1911.06416.pdf but hey, I already linked to it, didn’t I?:wink:

Dirk Eddelbuettel (09:34:30) (in thread): > Actually, more the opening first paragraph of the actual steps, as a prerequisite. So corrected to para 3 on pg 1:slightly_smiling_face:

Kasper D. Hansen (09:50:52) (in thread): > Dude, you’re like posting papers and stuff

Kasper D. Hansen (09:51:18) (in thread): > But it was a good read, I read it this morning. Short though, but that’s sometimes good

Dirk Eddelbuettel (09:52:05) (in thread): > Right. Would you have read it if it was longer?:wink: I had a few days at that point and more to say than a blog post, so it became a note…

Dirk Eddelbuettel (09:52:40) (in thread): > Additions/extensions welcome if there is something from the Bioc side of things.

Kasper D. Hansen (10:38:36) (in thread): > I think it is an outstanding high-level overview. And an overview is needed for this. You could write a lot on each of the different sections, and that would be good as well, but it might be better served as different pieces.

Kasper D. Hansen (10:39:38) (in thread): > For example, I have been playing with the idea of writing about how to write an autoconf script, which would clearly help the community (in case anyone reads it), but so far it’s in the pile of “good ideas, not enough time” (a polite way of saying lack of follow-through)

Dirk Eddelbuettel (10:41:53) (in thread): > I am with you. That topic alone, along maybe with the outside-of-our-circles-even-more-prevalent cmake and the somewhat obnoxiously-named but popular anticonf shell scripts by Jeroen (of which I also have a few in packages, less aggressively named), is a paper-let. Happy to help – we can file that in the drawer of “projects we should start or have started but which have an uncertain future or completion date”….

Dirk Eddelbuettel (10:42:48) (in thread): > (I actually just tutored someone on autoconf the other day (maybe three or four weeks ago) and now I can’t even remember where that was. Involved pkg-config too, which often helps / shortens this….)

Kasper D. Hansen (10:43:35) (in thread): > yeah, I mean the skeleton you would want is to a large extent identical between different libraries, and describing that + how you handle library-specific cases would be very useful

Kasper D. Hansen (10:44:23) (in thread): > I have run into this a lot with a multi-user R installation I maintain for Hopkins biostat, which has a - shall we say - peculiar setup, and where I find that many packages have configure scripts assuming things they shouldn’t assume

Kasper D. Hansen (10:44:52) (in thread): > For example whether the configure script inherits CFLAGS from R

Dirk Eddelbuettel (10:45:17) (in thread): > WRE is clear on this. No. That’s why there are PKG_*.

Dirk Eddelbuettel (10:45:51) (in thread): > Strange that Kurt et al would not squash that in an R CMD check.

Kasper D. Hansen (10:51:21) (in thread): > Let’s just say I have found this in many packages including packages with R-core authors. It’s a small thing, but with the wrong assumptions it’ll break on my setup.

Dirk Eddelbuettel (10:52:28) (in thread): > So send PRs / patches.

Dirk Eddelbuettel (10:52:55) (in thread): > Also += vs = I presume.

Kasper D. Hansen (10:55:12) (in thread): > I’ve done that of course

Dirk Eddelbuettel (11:09:43) (in thread): > Specialised academic or HPC setups are hard to generalize from though. Definition of one-offs.

Kasper D. Hansen (11:34:54) (in thread): > My point was really that there is a standard recipe which should cover 98% of cases; it just needs to get written

2021-03-25

Catherine Ross (11:39:35): > @Catherine Ross has joined the channel

2021-03-29

Ludwig Geistlinger (10:07:51): > Are there best practices for “unit-testing” Shiny apps? My understanding until now was that this is kind of hard to do, because you can unit-test code and outputs, but once the app is up, you do not really have a way of programmatically testing its functionality. Have there been additional developments in this realm that I am not aware of? Tagging @Mike Smith for the Bioconductor perspective, but I would be interested in general what people might have worked out for this already.

Kasper D. Hansen (10:19:22): > I am so behind on all things Shiny that I’m not worth listening to, but for this particular question I would really ping the Shiny people and see if they have something for testthat. It’s the same company after all

Kasper D. Hansen (10:19:42): > As you note, unit testing GUIs is hard.

Kasper D. Hansen (10:19:50): > (likewise for plotting functions)

Marcel Ramos Pérez (10:26:59): > Last time I looked at this you could use RSelenium to run a headless browser and test some features of the app, but it seems like there are newer ways of doing this: https://shiny.rstudio.com/articles/testing-overview.html

Pablo Rodriguez (10:46:10): > The shinytest pkg and shiny’s testServer() are great ways to test your apps. - Attachment (blog.rstudio.com): shinytest - Automated testing for Shiny apps > Continuing our series on new features in the RStudio v1.2 preview release, we would like to introduce shinytest. shinytest is a package to perform automated testing for Shiny apps, which allows us to: Record Shiny tests with ease. Run and troubleshoot Shiny tests. shinytest is available on CRAN, supported in RStudio v1.2 preview and can be installed as follows: install.packages(“shinytest”) Recording Tests This is the general procedure for recording tests:
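A minimal sketch of the testServer() style (assuming shiny >= 1.5, where testServer() is available; the toy server function and reactive names are invented for illustration):

```r
library(shiny)

# A toy server function whose internals we want to test
server <- function(input, output, session) {
  doubled <- reactive(input$x * 2)
  output$txt <- renderText(doubled())
}

# testServer() runs the server logic headlessly -- no browser required.
# Inside the expression we can set inputs and then assert on reactives
# and outputs directly.
testServer(server, {
  session$setInputs(x = 21)
  stopifnot(doubled() == 42)
  stopifnot(output$txt == "42")
})
```

Because this exercises only the server function, it catches logic bugs but not UI-level glitches, which is the trade-off discussed below.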

Ludwig Geistlinger (12:22:11): > Thanks a lot for the helpful suggestions everyone!

Kevin Rue-Albrecht (12:39:11): > FWIW, in iSEE, we’ve focused on testing the ins and outs of the server-side code: https://github.com/iSEE/iSEE/tree/master/tests/testthat It takes care of checking that individual functions produce the expected output, and that environment variables that are used to manage the app state have the expected value after key operations. > Lots of glitches can still happen at runtime (e.g. concurrency issues between callbacks due to rapid clicks) but I’d argue that most runtime issues that are not covered by our unit tests would be difficult to replicate anyway.

Kevin Rue-Albrecht (12:40:02): > With that said, I won’t pretend to be up to date with the latest best practices in unit testing of Shiny apps. I haven’t done my homework on that side in ages.

Aaron Lun (12:42:43): > One of them (shinytest?) involves taking screenshots and comparing them. Pretty crude, doesn’t work for collapsible boxes, false positives with minor UI changes, etc. etc.

Kevin Rue-Albrecht (12:43:59): > On the plus side, our rate of GitHub issues about bugs is rather small, which can be explained either by a small user base (?) or the fact that our unit testing strategy catches most of the issues before they turn into Shiny-level issues

Ludwig Geistlinger (12:59:37): > Yep, from the link Marcel provided it looks like your options range from (1) regular unit tests, (2) server function tests that seem to encapsulate what Kevin describes for the iSEE server code tests, and (3) snapshot-based tests via the shinytest package, which are however very sensitive to changes and based on comparing screenshots, as Aaron points out.

Kasper D. Hansen (14:26:52) (in thread): > Obviously, this is because of your superior testing strategy!

Kasper D. Hansen (14:28:00): > So then the testing requires a browser? Would that work on our build server?

Hervé Pagès (15:32:55): > All the build machines have a browser.

2021-03-31

Lisa Cao (12:50:42): > @Lisa Cao has joined the channel

2021-04-01

Sean Davis (07:37:25) (in thread): > https://cran.r-project.org/web/packages/RSelenium/vignettes/shinytesting.html

2021-04-07

Nitesh Turaga (13:34:09): > I’d like to ask how I can set alerts for the failure of a certain step in a GitHub action. > > I have a GitHub action which does a docker image update for me weekly on Friday, and even though a single “step” in the job fails (the build docker image step)… the entire thing completes and becomes green (indicating success). >   > An example of such a failure is https://github.com/Bioconductor/bioconductor_docker/runs/2255687109?check_suite_focus=true Any advice on how I can set alerts for single steps in GitHub actions?

Nitesh Turaga (13:34:30): > I found a “slack” alert, but I’m trying to understand if there is a better way to do this.

Aaron Lun (13:48:04): > Usually you don’t have to do anything as long as your status!=0. Stuff fails hard for me without any effort

Nitesh Turaga (13:48:25): > Right, in my shell script I’m guessing, right?

Aaron Lun (13:48:39): > yeah

Aaron Lun (13:49:01): > I guess you could try with set -e, set -u, but I’ve never done that and my actions explode on failure.
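A hypothetical sketch of that advice in workflow form (the step name and image tag here are illustrative, not taken from the actual workflow):

```yaml
steps:
  - name: Build docker image
    shell: bash
    # set -euo pipefail aborts the step on the first failing command
    # (and on unset variables / failures inside pipes), so the job is
    # marked red instead of silently completing green
    run: |
      set -euo pipefail
      docker build -t bioconductor/bioconductor_docker:devel .
```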

Nitesh Turaga (13:50:34): > Yep, @Hervé Pagès just gave me this exact advice

Nitesh Turaga (14:01:26): > Thanks @Aaron Lun

Dirk Eddelbuettel (14:19:15): > FWIW I see a pattern like the below with set -eo pipefail every now and then but have not needed it in “my” actions.

Dirk Eddelbuettel (14:20:12): - File (YAML): Untitled

Nitesh Turaga (14:23:03): > what does pipefail do?

Dirk Eddelbuettel (14:26:18): > man bash says: “If set, the return value of a pipeline is the value of the last (rightmost) command to exit with a non-zero status, or zero if all commands in the pipeline exit successfully. This option is disabled by default.” > > So it may help with error propagation.
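A quick demonstration of the difference (plain bash, nothing Bioconductor-specific):

```shell
# Without pipefail the pipeline's status is that of its last command,
# so an early failure is swallowed; with pipefail the rightmost
# non-zero status wins.
bash -c 'false | cat; echo "without pipefail: exit=$?"'
bash -c 'set -o pipefail; false | cat; echo "with pipefail: exit=$?"'
```

The first line prints `exit=0` (the failure of `false` is hidden behind the successful `cat`), the second prints `exit=1`, which is exactly why pipefail helps error propagation in CI steps.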

Nitesh Turaga (14:51:48): > Aaah, thanks @Dirk Eddelbuettel

Henrik Bengtsson (16:56:36): > In R we can set the default user and site package library paths via the environment variables R_LIBS_USER and R_LIBS_SITE. What’s particularly neat is that they support “conversion specifiers”, e.g. if I use: > > R_LIBS_USER=~/R/%p-library/%v > > R will then expand that to ~/R/x86_64-pc-linux-gnu-library/4.0 if I’m running R 4.0.5 on Linux (that’s actually the default R_LIBS_USER on Linux). This allows you to use separate package library paths for Bioc release and Bioc devel, e.g. > > R_LIBS_USER=~/R/%p-library/%v-bioc_3.12 > > and > > R_LIBS_USER=~/R/%p-library/%v-bioc_3.13 > > Now, since R is already aware of the environment variable R_BIOC_VERSION (it’s in the R source code), it would be super neat to have a conversion specifier also for that, e.g. > > R_LIBS_USER=~/R/%p-library/%v-bioc_%b > > That way you’d never have to update R_LIBS_USER when moving between R versions and Bioc versions. It can be added to ~/.Renviron once. > > Turns out that it’s straightforward to add such a specifier. It’s a one-line addition to base:::.expand_R_libs_env_var(), as proposed on https://github.com/HenrikBengtsson/Wishlist-for-R/issues/123. [EDIT +2 hours: It’s a little bit more complicated because the default value of R_BIOC_VERSION is only available from the tools package, so some code shuffling is needed] > > Comments?
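The existing expansion can be inspected from R itself via the internal helper mentioned above (a sketch; the exact result depends on your platform and R version):

```r
# %p expands to the platform triplet and %v to the major.minor R
# version (see ?.libPaths and ?Startup). .expand_R_libs_env_var() is
# the internal helper R applies to R_LIBS_USER at startup.
base:::.expand_R_libs_env_var("~/R/%p-library/%v")
# on Linux with R 4.0.x: something like "~/R/x86_64-pc-linux-gnu-library/4.0"
```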

2021-04-08

Mike Smith (02:51:45) (in thread): > I think this would be very useful. It’s very similar to the strategy I currently use in .Rprofile to switch between release and devel Bioconductor versions. Here’s my .Rprofile - probably can be improved but it’s been working for years. > > R_LIB_ROOT = "/mnt/data/R-lib/" # do note the trailing slash! > > r_version = paste0(R.Version()["major"], ".", R.Version()["minor"]) > > bioc_version = Sys.getenv("BIOC_VERSION") > > if(nzchar(bioc_version)) { > R_USER_LIB_PATH = paste0(R_LIB_ROOT, r_version, "-", bioc_version) > } else { > R_USER_LIB_PATH = paste0(R_LIB_ROOT, r_version) > } > > dir.create(R_USER_LIB_PATH, showWarnings=FALSE, recursive=TRUE) > if (!is.element(R_USER_LIB_PATH, .libPaths())){ > .libPaths(c(R_USER_LIB_PATH, .libPaths())) > } > > I wasn’t aware of R_BIOC_VERSION, so I’m just using BIOC_VERSION, and I launch R with a script that sets that variable plus either R release or devel. This also lets me have two shortcuts on my desktop to launch RStudio running either release or devel.

2021-04-12

Lori Shepherd (08:09:59): > Also – we are announcing a tentative release schedule shortly – hopefully later today. I know we tried last release to have a question/answer session if anyone had questions about the release, so it might also be a good time for that if we were going to do one

Lori Shepherd (08:10:35): > It’s a tentative schedule since R still hasn’t announced their release, but we need to start thinking and getting things in order for the release so it’s not too delayed behind the R release

Kasper D. Hansen (10:09:49): > Any idea what is happening on the R side? I can totally understand that everything is behind this year; it certainly is on my end. Are they expecting to just suddenly say “we’ll release in 2 wks”?

Kasper D. Hansen (10:10:09): > Anyway, it is what it is

Dirk Eddelbuettel (10:13:55): > Being a keen student of their schedule I also wondered… The March 31 release of 4.0.5 probably didn’t help in ensuring April. As I recall, Peter generally just ‘drops’ the date with a three week schedule for alpha, beta and rc releases. Maybe our two esteemed colleagues here who also happen to be R Core members have a bit more insight….

Michael Lawrence (18:05:19): > Sounds like mid-May would be a reasonable expectation, but no promises yet. Some decisions being made around Mac M1 support.

2021-04-13

Mike Smith (16:35:18): > The next Bioconductor Developers’ Forum is this Thursday 15th April at 09:00 PST / 12:00 EST / 18:00 CEST - You can find a calendar invite attached and at https://tinyurl.com/BiocDevel-2021-04 > * As we’re fast approaching the release of Bioconductor 3.13, we’ll use this meeting to go over the current release schedule and give some reminders of the things package developers might need to be aware of. If you’re relatively new to supporting a Bioconductor package (or just want a refresher!) this is a great chance to ask any questions you may have of members of the core team that manage the version transition. If you’ve got questions about version numbering:1234:, deprecation cycles:recycle:, or the last day you can make a change:hourglass_flowing_sand:, this is the opportunity to ask! > * @Lori Shepherd will also present some proposed major changes to BiocFileCache and how those might affect packages you’re maintaining or using. > * If anyone else has another topic they’d like to discuss or present, please let me know. - File (Binary): BiocDevel-2021-04.ics

2021-04-14

Dipanjan Dey (05:20:46): > @Dipanjan Dey has joined the channel

Sudarshan (09:10:44): > @Sudarshan has joined the channel

2021-04-15

Harshita Ojha bs17b012 (06:08:20): > @Harshita Ojha bs17b012 has joined the channel

Pierre-Luc Germain (07:59:34): > @Pierre-Luc Germain has joined the channel

2021-04-19

Annekathrin Ludt (04:33:30): > @Annekathrin Ludt has joined the channel

Kayla Interdonato (10:05:41): > The recording from Thursday’s devel forum is now available on the Bioconductor YouTube (https://www.youtube.com/watch?v=UWsBcpvsQU8) as well as under course materials on the Bioconductor website. - Attachment (YouTube): Developers Forum 21

2021-04-20

Henrik Bengtsson (21:10:59): > I just watched the latest Bioc Devel Forum and found the discussion around the migration of BiocFileCache’s default file cache location. I like that deprecation plan. There’s also a useful discussion around moving the cache files automagically for users, if possible to do so atomically. > > AFAIU, users will only ever notice if the default olddir != newdir. The examples, suggestions, and discussions on automating the move appear to have circulated around moving files. However, I didn’t see a discussion on just moving folder olddir to newdir. That’s an atomic(*) file operation by itself, it shouldn’t consume any extra disk space, it’s super fast, and should either work or not work. It should work for everyone who hasn’t already got the new newdir (you could also easily work around it if it is empty). Did you consider that? > > (*) The one place where a move of a folder might not be atomic is where you move the folder across to another physical disk. In that case mv olddir newdir becomes cp -R olddir newdir followed by a rm -rf olddir, which is not atomic. However, for that to be the case the user would have had to have gone into the OS cache folder (e.g. ~/.cache) and created a specific symbolic link.
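The folder-move idea can be sketched in a few lines of shell (the paths here are stand-ins for the real ~/.cache locations):

```shell
# Renaming a directory on the same filesystem is a single rename()
# system call: fast, atomic, and no extra disk space consumed.
base=$(mktemp -d)
mkdir -p "$base/old-cache" "$base/R"
echo "entry" > "$base/old-cache/db.sqlite"
mv "$base/old-cache" "$base/R/new-cache"   # atomic on one device
ls "$base/R/new-cache"
```

Across devices `mv` degrades to copy-then-delete, which is the non-atomic case flagged in the footnote above.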

Henrik Bengtsson (21:15:36) (in thread): > On a side note: if you’re automating the cache folder/files move, I think it should only happen based on explicit user input, e.g. accepting a prompt in interactive mode or calling a specific function. Otherwise, there’s a risk that current, long-running pipelines/jobs might break. It’s probably only the user who can know when it’s safe to make the move. > > PS. My interest in this is also that CRAN just asked me to update my R.cache package to align with tools::R_user_dir(.packageName, which="cache"). So it’s very similar to the BiocFileCache migration, although the cost of losing the cache should be smaller.

2021-04-21

Spencer Nystrom (10:25:14) (in thread): > I don’t have much skin in the game here but I had a similar thought. Also, wouldn’t it be feasible to do some sanity checks before trying the move, like that the user has write permission in the target location & that enough space exists at the target for a copy (then tell the user if not)? Then try/catch the copy operation, then ask the user again if it’s OK to delete the old cache.

2021-04-27

Constantin Ahlmann-Eltze (04:18:53) (in thread): > Hey, I just saw this discussion and wanted to say that creating a symbolic link from ~/.cache to another disk is exactly what I have done on our cluster, as the disk space for the home directory is very limited, which I think isn’t too uncommon

Henrik Bengtsson (11:56:21) (in thread): > @Constantin Ahlmann-Eltze, in that case a mv would still be a quick, atomic mv, because ~/.cache/BiocFileCache (old) and ~/.cache/R/BiocFileCache (new) are still on the same disk. It’s only when the “old” or the “new” cache folder has been symlinked to another disk that there could be a problem. > > I also haven’t checked: it might be that file.rename() doesn’t work when source and destination are on different disks; it could be that it gives an error on some or all OSes.

2021-04-28

Mateusz Staniak (18:16:04): > @Mateusz Staniak has joined the channel

2021-04-29

Mike Smith (04:26:36): > Question for those that use GitHub Actions. We’ve entered the weird limbo period prior to an R release, where R-devel is now R 4.2, but R-release is still 4.0. This breaks all my GH Action workflows with Bioconductor version '3.13' requires R version '4.1'; R version is too new errors. What’re people’s strategies for handling this?

Kasper D. Hansen (04:27:26): > Well, you need to switch to using the R-4.1-alpha/beta branch

FelixErnst (04:27:34) (in thread): > Keep calm and ignore it

FelixErnst (04:27:45) (in thread): > At least I try:slightly_smiling_face:

Kasper D. Hansen (04:27:56): > Having said that, I have always found this check weird and irritating

Kasper D. Hansen (04:28:15) (in thread): > It’s hard to ignore because it makes BiocManager not work

Kasper D. Hansen (04:29:07): > I don’t really see any purpose to this for this time period and I wish we did not have to deal with this.

Mike Smith (04:29:30) (in thread): > The time prior to a release is when I need rapid feedback on whether the changes I’ve made work cross-platform (or on a specific platform) and don’t break stuff.

FelixErnst (04:30:18) (in thread): > I know and I would like to do that as well, but what alternative is there?

FelixErnst (04:31:56) (in thread): > Modifying all the GHA to use R 4.1? Is it already available from r-lib? Even then all the binaries are not available from CRAN for R 4.1 anyway

Mike Smith (04:32:06) (in thread): > I was hoping someone would answer that! Maybe whoever manages the install R action has an [alpha] tag that I don’t know about. Or you could install R manually in the workflow, but it’s a lot of hassle for a 2 week window.

Mike Smith (04:33:30) (in thread): > My issue is that it’s not obvious (at least to me) how you do that in a GH workflow. I was hoping someone knew a straightforward way of getting there.

FelixErnst (04:34:14) (in thread): > But getting back to the original question: Do all GHA break or just the non-linux ones?

Mike Smith (04:34:57) (in thread): > For me it’s all of them: https://github.com/grimbough/Rhdf5lib/actions/runs/793809033

FelixErnst (04:35:29) (in thread): > This keeps the linux one alive > > { os: ubuntu-latest, r: 'devel', cont: "bioconductor/bioconductor_docker:devel", rspm: "https://packagemanager.rstudio.com/cran/__linux__/focal/latest" } >

FelixErnst (04:35:38) (in thread): > It uses the container setup by Nitesh

Kasper D. Hansen (04:36:22) (in thread): > I am not up to speed on GH actions, but analyzing this, this is very much a Bioconductor-release issue; the CRAN-centric people don’t care in the same way. We might need to provide a solution

Kasper D. Hansen (04:36:52) (in thread): > How do GH Action scripts work? Do you have your own private script, or does it refer to / source some master script?

FelixErnst (04:37:39) (in thread): > The code is from biocthis (shout-out to @Leonardo Collado Torres)

Kasper D. Hansen (04:38:06) (in thread): > but still, when a fix is needed it then needs to get individually adapted by people.

FelixErnst (04:39:04) (in thread): > Basically all the GHA scripts I know, are based on an R-setup by r-lib (https://github.com/r-lib/actions)

Mike Smith (04:39:08) (in thread): > I find the GitHub naming a bit weird. The “actions” are building blocks, that you piece together into a “workflow”. So there’s an Action for “installing R” and you can provide an argument for the R version.

Mike Smith (04:40:01) (in thread): > Each package needs to have their own workflow, but a change in the “install R” action would affect any workflow that uses it.

Kasper D. Hansen (04:41:03) (in thread): > ah ok

Mike Smith (04:42:15) (in thread): > From Felix’s link, the specific action for installing R ishttps://github.com/r-lib/actions/tree/master/setup-r

Mike Smith (04:43:15) (in thread): > Maybe I can just supply 4.1.0 and it’ll work - that’d be nice.

FelixErnst (04:43:32) (in thread): > I think this will work right now

FelixErnst (04:43:47) (in thread): > But after the 19th you have to revert to devel again

Mike Smith (04:44:42) (in thread): > Except we won’t, because BioC 3.14 will also be R 4.1 (I think)

FelixErnst (04:47:51) (in thread): > You are right. Forgot about that

Mike Smith (04:48:19) (in thread): > I often think I want a “bioconductor.org/which-r-should-i-use” page that just flashes that at me in massive letters.

FelixErnst (04:48:20) (in thread): > Dang, so you are asking the question I would have asked in 3 weeks

FelixErnst (04:51:03) (in thread): > Do I feel stupid now… It’s not like I haven’t done this dance a year ago

Mike Smith (04:57:10) (in thread): > My original question came from a vague memory of doing a terrible job of coping with this a year ago, and hoping someone else’s smart solution existed. Maybe I just shouldn’t make any changes in these 3 weeks, then I don’t need to test anything! The backlog of open issues suggests that’s not going to be a well-received plan though.

Mike Smith (04:58:11) (in thread): > Unfortunately setup-r r-version: 4.1.0 doesn’t work yet: https://github.com/grimbough/Rhdf5lib/actions/runs/795439993

FelixErnst (04:59:32) (in thread): > ~I am not sure whether it changes anything, but maybe r: 4.1 is worth a try?~ I saw you did that for 20.04 already

FelixErnst (05:01:00) (in thread): > So I guess going with the container is the best option, isn’t it? At least the Linux tests work then

FelixErnst (05:01:18) (in thread): > https://github.com/microbiome/mia/runs/2459977637

Mike Smith (05:04:40) (in thread): > Personally, since I use Linux as my desktop environment, it doesn’t get me too much, but for others at least that’s one additional platform covered. I’ll spin up my Mac VM and rely on the build system for now.

Mike Smith (05:04:52) (in thread): > Thanks for the suggestions!

Charlotte Soneson (05:55:26) (in thread): > I have the same problem with my GHA, unfortunately also no clear solution :confused: For mac, if the R version is set to devel, the setup-r action gets the .pkg file from https://mac.r-project.org/high-sierra/R-devel/R-devel.pkg - I guess you could get the current R 4.1 branch similarly from https://mac.r-project.org/high-sierra/R-4.1-branch/R-4.1-branch.pkg, but it’s not clear to me exactly what should be the trigger for taking this path in the action (https://github.com/r-lib/actions/blob/0d6c4b3efe82090bc03e99c585d9e89003117931/setup-r/src/installer.ts#L440-L442) - currently it’ll go to https://cloud.r-project.org/bin/macosx in all other cases.

Mike Smith (06:01:40) (in thread): > I just forked r-lib/actions to see if I could fudge a way to get it to accept the URL. However it then does some checking on the version number and doesn’t like my 4.1-branch

Mike Smith (06:03:16) (in thread): > I was hoping there’d be a “non-versioned” R-alpha.pkg or similar and I could just use the alpha keyword like we use devel. However, if you have to also supply 4.1 or whatever is appropriate it’s pretty brittle

Kasper D. Hansen (06:04:47) (in thread): > Well, eventually - if my memory serves me well - the alpha becomes beta becomes release

Kasper D. Hansen (06:05:10) (in thread): > What we need is a script where we can say “bioc devel” and it just gets the stuff we need.

Mike Smith (06:09:18) (in thread): > I think we had a BOF session at a conference where we discussed the merit of some Bioconductor-specific actions that would take a Bioc version and handle the rest. Unfortunately, I have a feeling we concluded with “Mike will look into it”.

Lluís Revilla (06:23:42) (in thread): > Maybe this is where @Leonardo Collado Torres’s biocthis could help? If it doesn’t tackle this corner case already(?)

Luke Zappia (06:55:10) (in thread): > My workflow (which I originally based off the biocthis one) pulls the Bioconductor docker container and gets the R version in there. I just checked and it’s still pointing to R 4.1 so maybe that’s a solution.

Leonardo Collado Torres (08:32:08) (in thread): > I disabled macOS and Windows for now on my biocthis GHA on the spatialLIBD package as you can see at https://github.com/LieberInstitute/spatialLIBD/blob/master/.github/workflows/check-bioc.yml#L54-L56 - Attachment: .github/workflows/check-bioc.yml:54-56 > > - { os: ubuntu-latest, r: '4.1', bioc: '3.13', cont: "bioconductor/bioconductor_docker:devel", rspm: "[https://packagemanager.rstudio.com/cran/*_linux_*/focal/latest](https://packagemanager.rstudio.com/cran/*_linux_*/focal/latest)" } > #- { os: macOS-latest, r: '4.1', bioc: '3.13'} > #- { os: windows-latest, r: '4.1', bioc: '3.13'} >

Leonardo Collado Torres (08:34:22) (in thread): > r-lib/actions uses https://github.com/r-lib/actions/blob/master/setup-r/src/installer.ts#L434-L458 for getting the R versions on macOS. The root url is https://cloud.r-project.org/bin/macosx/ although right now R 4.1 is https://mac.r-project.org/high-sierra/R-4.1-branch/R-4.1-branch.pkg which is under an additional sub-folder unlike say https://cran.r-project.org/bin/macosx/R-4.0.5.pkg - Attachment: setup-r/src/installer.ts:434-458 > > function getFileNameMacOS(version: string): string { > const filename: string = util.format("R-%s.pkg", version); > return filename; > } > > function getDownloadUrlMacOS(version: string): string { > if (version == "devel") { > return "[https://mac.r-project.org/high-sierra/R-devel/R-devel.pkg](https://mac.r-project.org/high-sierra/R-devel/R-devel.pkg)"; > } > const filename: string = getFileNameMacOS(version); > > if (semver.eq(version, "3.2.5")) { > // 3.2.5 is 'special', it is actually 3.2.4-revised... > return "[https://cloud.r-project.org/bin/macosx/old/R-3.2.4-revised.pkg](https://cloud.r-project.org/bin/macosx/old/R-3.2.4-revised.pkg)"; > } > if ([semver.lt](http://semver.lt)(version, "3.4.0")) { > // older versions are in /old > return util.format( > "[https://cloud.r-project.org/bin/macosx/old/%s](https://cloud.r-project.org/bin/macosx/old/%s)", > filename > ); > } > > return util.format("[https://cloud.r-project.org/bin/macosx/%s](https://cloud.r-project.org/bin/macosx/%s)", filename); > } >

Leonardo Collado Torres (08:37:40) (in thread): > this was discussed a bit a year ago at https://github.com/r-lib/actions/pull/68 - Attachment: #68 Update R-devel URL for macOS > Similar to travis-ci/travis-build#1885 > > This just updates the R-devel URL to: > > http://mac.r-project.org/high-sierra/R-4.0-branch/R-4.0-branch.pkg > > I’ve opted not to change the gfortran install from homebrew, but I can add that as well.

Leonardo Collado Torres (08:48:24) (in thread): > I just created https://github.com/r-lib/actions/issues/286#issue-870978686

Aaron Lun (11:14:56) (in thread): > Just don’t have an R requirement, obviously.

Marcel Ramos Pérez (13:54:11) (in thread): > We get this same issue on the CRAN devel builder with BiocManager. We have a potential fix in the works that converts the error into a persistent message: https://github.com/Bioconductor/BiocManager/tree/future The patch isn’t finalized; we still have to discuss other solutions. This could work for GHA for now.

Henrik Bengtsson (15:54:29) (in thread): > This is what I suggested at https://github.com/Bioconductor/BiocManager/issues/99#issuecomment-827765049:

2021-04-30

Mike Smith (15:19:14): > I fudged together my own action with some hardcoded URLs that are used if you provide the version as prerelease for Mac or Windows. Not sure how frequently those URLs change but if anyone wants to make use of it there’s an example at: https://github.com/grimbough/Rhdf5lib/blob/de6e24590b557f6c67614aff0ffcbfe7d26a6fde/.github/workflows/main.yml#L22-L23

Mike Smith (15:19:32): > The action step is at https://github.com/grimbough/Rhdf5lib/blob/de6e24590b557f6c67614aff0ffcbfe7d26a6fde/.github/workflows/main.yml#L55

2021-05-03

Stephen Chen (07:50:38): > @Stephen Chen has joined the channel

Krithika Bhuvanesh (16:50:03): > @Krithika Bhuvanesh has left the channel

2021-05-04

Leonardo Collado Torres (15:16:37): > this is really cool Mike!

Leonardo Collado Torres (15:16:59): > it could lead to a PR for r-lib/actions :smiley:

2021-05-05

Abel Torres Espin (09:54:54): > @Abel Torres Espin has joined the channel

2021-05-07

Jenny Drnevich (14:51:30): > @Jenny Drnevich has joined the channel

2021-05-11

Megha Lal (16:44:50): > @Megha Lal has joined the channel

2021-05-14

Robert Castelo (10:17:48): > Hi, my package GenomicScores was building and checking fine until yesterday, when I got the error shown in the attached image while executing one of the examples, on the Linux platform only. It says something about a corrupt cache, but it doesn’t look like something that depends on the package. Do you have any hint about what I should do to fix this? - File (PNG): Screenshot 2021-05-14 at 16.15.56.png

Lori Shepherd (10:29:19): > https://community-bioc.slack.com/archives/CDSG30G66/p1620981462066000 - Attachment: Attachment > question about nebbiolo1 and devel branch package checks. Some of my packages (tximeta, fishpond) cannot build on Linux because > Corrupt Cache: index file > See AnnotationHub's TroubleshootingTheCache vignette section on corrupt cache > cache: /home/biocbuild/.cache/R/AnnotationHub > should I just sit tight, or investigate? Note that the other builders don’t have error

2021-05-17

Mike Smith (14:59:09): > We’re scheduled to have a developers call this Thursday, right after the release of R-4.1. The full list of changes in this version can be found at: https://stat.ethz.ch/R-manual/R-devel/doc/html/NEWS.html I thought it might be fun to go through some of the changes together, maybe with different people highlighting the change they’re most interested in or affected by. > > Would anyone be interested in that? And if so do I have any volunteers? - first come first served if you want to demonstrate |> !

Hervé Pagès (15:08:33): > Note that this Thursday is release day for Bioconductor, which means that the BioC core team is going to be very busy making sure that the release goes smoothly. I don’t know about the other core team members but I know that I won’t be able to do anything else that day.

Lori Shepherd (15:10:59): > I also will most likely be tied up with release tasks.

Kasper D. Hansen (15:23:27): > Is the default for BiocParallel really to use all CPUs? When I execute `bpparam()` it returns an object with

Kasper D. Hansen (15:23:30): > bpnworkers: 18

Kasper D. Hansen (15:24:00): > I am surprised by this

Federico Marini (15:27:37): > Hm, I recall it is the available ones minus 2

Federico Marini (15:27:58): > or well, the minimum between 2 or that (nCores - 2)

Hervé Pagès (15:34:09): > The man page for MulticoreParam/multicoreWorkers says: > > workers: ‘integer(1)’ Number of workers. Defaults to all cores > > available as determined by ‘detectCores’. > which is a little bit misleading since: > > > parallel::detectCores() > [1] 8 > > multicoreWorkers() > [1] 6 > > So yes, it seems to be nb of cores minus 2, at least on Linux and for the MulticoreParam backend, but I don’t know if that applies to other platforms/backends.

Kasper D. Hansen (15:47:57): > That’s relevant to me. I probably have 20 cores on this node and not 18

Kasper D. Hansen (15:48:16): > Is this really the right default?

Kasper D. Hansen (15:48:37): > Guess I can make a system wide configuration change on our cluster

Hervé Pagès (15:49:09): > What would the “right default” be? FWIW make -j uses all available cpus too.

Kasper D. Hansen (15:49:18): > I think 1

Aaron Lun (15:50:22): > I too would prefer 1, then I don’t have to defensively register and setAutoBPPARAM everything to SerialParam inside my functions.

Hervé Pagès (15:55:05): > I don’t know. FWIW I solve the problem by not using parallelization at all by default in my code, so by default it behaves as if the nb of cpus = 1. It seems that if you’re using bplapply() you would expect some parallel evaluation to happen by default. Maybe bplapply() should support BPPARAM=NULL to fall back to lapply().
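
A hedged sketch of that fallback idea (this is not BiocParallel’s actual code; the wrapper name is hypothetical):

```r
## Hypothetical wrapper illustrating the suggestion above: treat a NULL
## backend as "run serially" rather than picking a parallel default.
bplapply_or_serial <- function(X, FUN, ..., BPPARAM = NULL) {
  if (is.null(BPPARAM)) {
    lapply(X, FUN, ...)  # no parallel machinery touched at all
  } else {
    BiocParallel::bplapply(X, FUN, ..., BPPARAM = BPPARAM)
  }
}
```

With this shape, parallel evaluation only happens when the caller explicitly passes a backend, e.g. `BPPARAM = BiocParallel::MulticoreParam(4)`.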

Kasper D. Hansen (15:56:30): > The system for BiocParallel where you can specify a default backend through options would - in my opinion - suggest that you want bpparam() to return these options if they are set, but 1 otherwise

Henrik Bengtsson (15:56:46): > I’m also in the default-to-sequential-processing camp. As more and more R packages turn to parallel processing, there will be more and more issues with tools competing for our CPU cores. The problem explodes on multi-tenant environments but also when you run multiple software concurrently on your own machine. Making parallelization an explicit choice by the user helps avoid this.

Kasper D. Hansen (15:56:49): > Although as my Q reveals, I am not an expert user

Hervé Pagès (16:00:05): > I’m also in favor of avoiding the parallelization machinery entirely by default, i.e. have bpparam() return NULL and bplapply() fall back to lapply() in that case.

Hervé Pagès (16:01:42): > And to keep the current behavior of backend constructors like MulticoreParam() as it is now (i.e. use a certain nb of cpus by default).

Kasper D. Hansen (16:03:01): > I would still advocate that you need a special argument to MulticoreParam() to use all available CPUs

Kasper D. Hansen (16:03:32): > But the most important thing is default behavior

Kasper D. Hansen (16:03:42): > of bpparam()

Henrik Bengtsson (16:10:21): > BTW, note that parallel::detectCores() may return 1 (e.g. VMs) or 2 (e.g. GitHub Actions, Travis CI, and AppVeyor CI). It may even return NA_integer_ on some systems. So, you need to use `ncores <- max(parallel::detectCores() - 2, 1, na.rm = TRUE)` to get it right. … or use parallelly::availableCores(omit = 2L) [https://www.jottr.org/2021/04/30/parallelly-1.25.0/]. availableCores() comes with other benefits, e.g. it respects HPC env vars, cgroups, containers, R options, etc + allows you to override the default number of CPU cores via R option and env var. (disclaimer: I’m the author; it grew out of a wild-wild-west behavior on shared HPC resources)
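
Spelled out as code, the guard Henrik describes (the parallelly call is shown exactly as he gives it; nothing else is added):

```r
## parallel::detectCores() can return NA, 1, or 2 depending on the
## platform, so clamp the "all cores minus two" heuristic to >= 1.
ncores <- max(parallel::detectCores() - 2L, 1L, na.rm = TRUE)

## Alternative from the parallelly package, which also respects HPC
## environment variables, cgroups/containers, and R options:
## ncores <- parallelly::availableCores(omit = 2L)
```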

Kasper D. Hansen (16:20:18): > that seems like what we want

Martin Morgan (17:39:55): > BiocParallel default cores — https://github.com/Bioconductor/BiocParallel/issues/141 — comments welcome

2021-05-18

Robert Castelo (02:00:27) (in thread): > This could be changed to falling back to the SerialParam class. This is the current code of bpparam: > > function (bpparamClass) > { > if (missing(bpparamClass)) > bpparamClass <- names(registered())[1] > default <- registered()[[bpparamClass]] > result <- getOption(bpparamClass, default) > if (is.null(result)) > stop("BPPARAM '", bpparamClass, "' not registered() or in names(options())") > result > } > > Replace the line > > bpparamClass <- names(registered())[1] > > by > > bpparamClass <- names(registered())[3] > > which gives “SerialParam”.

Mike Smith (04:49:36) (in thread): > Maybe the serendipitous timing wasn’t so great after all! Perhaps it would be better to push the developers call back a week.

2021-05-20

Mike Smith (05:09:21) (in thread): > Ok, we’ll do the call next week (May 27th). I’ll send the announcement email later. Good luck with the release today everyone!

Hervé Pagès (12:09:57) (in thread): > Thanks Mike!

2021-05-21

Guido Barzaghi (04:15:41): > @Guido Barzaghi has joined the channel

2021-05-25

Mike Smith (11:33:43): > The next Bioconductor Developers’ Forum is this Thursday 27th May at 09:00 PST / 12:00 EST / 18:00 CEST - You can find a calendar invite attached and at https://tinyurl.com/BiocDevel-2021-05 I thought it might be fun to go through some of the changes introduced in R-4.1 together, with different people highlighting the change they’re most interested in or affected by. > > The full list of changes in this version can be found at: https://stat.ethz.ch/R-manual/R-devel/doc/html/NEWS.html If you’d be willing to talk for between 1 and 5 minutes on anything introduced in R-4.1 please reply here so we don’t all pick the same topic. I think @Martin Morgan has already selected the native pipe - thanks Martin! - File (Binary): BiocDevel-2021-05.ics

Quang Nguyen (12:20:12): > @Quang Nguyen has joined the channel

2021-05-26

Henrik Bengtsson (14:26:51): > Hi, it looks like the Bioc build/check pipeline still doesn’t pick up all _R_CHECK_LENGTH_1_LOGIC2_ bugs, i.e. the ones you get from using && instead of & in expressions like c(FALSE,TRUE) && c(TRUE,FALSE). > > While running revdep checks on matrixStats with these checks enabled, I ran into this bug for CopywriteR but the Bioc system doesn’t pick it up, cf. https://bioconductor.org/checkResults/release/bioc-LATEST/CopywriteR/. See https://github.com/PeeperLab/CopywriteR/issues/33 for details and how to reproduce this with R CMD check.
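
For reference, a sketch of reproducing this locally with the check flag enabled. The environment variable is the `_R_CHECK_LENGTH_1_LOGIC2_` setting documented in ‘R Internals’; the tarball name is a placeholder, not a specific version.

```sh
## Enable the length-1 logic check with verbose errors, then run
## R CMD check on the package; the tarball name is illustrative.
export _R_CHECK_LENGTH_1_LOGIC2_=verbose
R CMD check CopywriteR_*.tar.gz
```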

Lori Shepherd (14:29:44): > FWIW it’s checked when packages are submitted to Bioconductor – @Hervé Pagès I thought we had this environment flag set on the daily builders too but maybe not?

Hervé Pagès (15:27:52) (in thread): > Hi Henrik, > Please use the #bioc-builds channel for this kind of question. _R_CHECK_LENGTH_1_LOGIC2_ is defined and set as follows on the build machines: > > _R_CHECK_LENGTH_1_LOGIC2_=package:_R_CHECK_PACKAGE_NAME_,abort,verbose > > However note that the R CMD check CopywriteR error is triggered during evaluation of the code in the vignette. And since we skip this step when running R CMD check (this is an important difference with the CRAN builds), we cannot catch the problem during the CHECK step, only during the BUILD step. But the thing is that, sadly, _R_CHECK_LENGTH_1_LOGIC2_ seems to be ignored during R CMD build, which one might argue should not come as a surprise given the name of the variable, but still sad nonetheless. > I can’t think of an easy solution right now. To provide some context: the reason we skip evaluation of the vignette during R CMD check is to save CPU cycles and because we assume that any problem in the vignette would have been caught during R CMD build.

Henrik Bengtsson (15:53:13) (in thread): > > … sadly, _R_CHECK_LENGTH_1_LOGIC2_ seems to be ignored during R CMD build, … > I’m certain it’ll work if you don’t use package:... to condition it on only code in the package. That is, using: > > _R_CHECK_LENGTH_1_LOGIC2_=verbose > > will definitely catch it. (That’s actually what I put in my personal ~/.Renviron). The downside with this is that it will also catch this type of bug occurring in dependent packages. OTH, I’d argue it’s an important enough bug that it should be caught and, when caught, reported upstream. > > Alternatively, assuming you only run R CMD build when there is an update, you could run one round of R CMD check immediately afterward where you also check the vignettes and “somehow” report that together with the build log. > > (subscribed to #bioc-builds now)

Henrik Bengtsson (15:55:53) (in thread): > > _R_CHECK_LENGTH_1_LOGIC2_ … which one might argue should not come as a surprise given the name of the variable, but still sad nonetheless. > I also always thought the _R_CHECK_ prefix referred to R CMD check, but I think someone had another explanation for the prefix that was unrelated to R CMD check. Maybe it was @Gabriel Becker?

Hervé Pagès (16:48:23) (in thread): > Yes, using _R_CHECK_LENGTH_1_LOGIC2_=verbose works. Interesting! So it’s not that _R_CHECK_LENGTH_1_LOGIC2_ is ignored by R CMD build, it’s just that the very special value it was set to made it non-operational in the R CMD build context. Thanks for the clarification. > That gives us an easy solution but, as you pointed out, it will trigger R CMD build errors for packages that don’t have the problem but rely on other packages that do. Since we are at the beginning of a new development cycle, let’s try this anyway and see how it goes. We can always step back if it’s causing too much trouble.

Hervé Pagès (16:59:43) (in thread): > Done (for _R_CHECK_LENGTH_1_CONDITION_ and _R_CHECK_LENGTH_1_LOGIC2_): https://github.com/Bioconductor/BBS/commits/master The 3.14 builds have started already so we will wait until they’re finished tomorrow to deploy. This means the change won’t be reflected before Friday’s report.

Henrik Bengtsson (17:20:50) (in thread): > Awesome! > > … as you pointed out, it will trigger R CMD build errors for packages that don’t have the problem but rely on other packages that have it > On the upside, this might be the push we need to get rid of these bugs sooner, so we’ll get to a point where R will make _R_CHECK_LENGTH_1_CONDITION_=true and _R_CHECK_LENGTH_1_LOGIC2_=true the default. These bugs are no fun and can be damaging. > > The current strategy over on CRAN is to get this fixed at submission time (new and updated packages). I suspect they’ll eventually scale up and turn on these checks across all existing packages too and then give maintainers two weeks to fix it. I’m a bit surprised it’s taking so long given the severity - _R_CHECK_LENGTH_1_CONDITION_ was added in R 3.4.0 (Apr 2017) and _R_CHECK_LENGTH_1_LOGIC2_ in R 3.6.0 (Apr 2019).

Hervé Pagès (18:37:30) (in thread): > Yes I’m surprised too and I wish they would push harder to expedite this. So the new settings on the Bioc builders will help give this a little push even if I didn’t really anticipate having us entering the business of doing QC on CRAN packages:pensive:

Henrik Bengtsson (18:57:43) (in thread): > BTW, are you setting those to verbose only for R CMD build or also for R CMD check? It’s probably less intrusive to only do it for R CMD build to catch it in the vignettes. I thought you did it only for build. (But I don’t mind doing it also in check so we can start nagging upstream CRAN package maintainers)

Hervé Pagès (19:07:25) (in thread): > For both. The same environment variables are used at any stage of the builds, even for the INSTALL and BUILD BIN stages where most of them have no impact at all. I think that using different variables for different stages would make things too confusing/too hard to reproduce.

Alan O’C (20:29:00): > @Alan O’C has joined the channel

2021-05-27

Lori Shepherd (07:49:03) (in thread): > That is good to know about verbose. We didn’t use true before so that we could limit the error to the specific package code and not display the ones coming from other packages, which ended up being too noisy. So, trying to follow the conversation: does verbose still limit to package code only?

Hervé Pagès (10:37:39) (in thread): > No it doesn’t. I think it’s like using true but more verbose.

Lori Shepherd (10:58:24) (in thread): > This was very deceiving even when implementing it in the SPB code, because it would pick up errors in the dependency packages and leave an ERROR on the current package

Lori Shepherd (10:58:30) (in thread): > I don’t think that is what we want on the build system or SPB?

Hervé Pagès (11:01:37) (in thread): > It’s an experiment, see discussion above.

Hervé Pagès (11:05:42) (in thread): > Not sure we want this on the SPB though, probably not.

Lori Shepherd (11:08:03) (in thread): > I’ll leave as is on the SPB - at least for now. perhaps in the future we could consider updating

Henrik Bengtsson (14:16:30) (in thread): > @Lori Shepherd, it’s the package:_R_CHECK_PACKAGE_NAME_,... part of the env var value that causes the setting to apply only to the package currently being checked. You can even do package:future,... to apply it to only a specific package (here future). What follows in ... is the action R should take when running into one of these two bugs at run-time. The default is false, which results in a warning (as R has always done). If true, we get an error. If verbose, we get an error with details on why. (Note, it’s not true,verbose; it’s either true or verbose). Then one can throw in an abort too (in addition to true or verbose), which will cause R to terminate instead of giving an error (so only useful when running R CMD check). > > So, I’d say, for R CMD check, using package:_R_CHECK_PACKAGE_NAME_,verbose is the best, since it doesn’t give “false positives” for the individual package maintainers. Since that does not apply to R CMD build, using verbose there is the second-best alternative if we want to catch these bugs in the vignettes too.
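
The value grammar Henrik describes, written out as example ~/.Renviron lines (the package name future is his own example; everything else follows the ‘R Internals’ documentation):

```
## Apply only to the package currently being checked; error with details:
_R_CHECK_LENGTH_1_LOGIC2_=package:_R_CHECK_PACKAGE_NAME_,verbose
## Apply only to the 'future' package; abort R with details:
_R_CHECK_LENGTH_1_LOGIC2_=package:future,abort,verbose
## Apply everywhere; plain error, no extra details:
_R_CHECK_LENGTH_1_LOGIC2_=true
```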

Henrik Bengtsson (14:41:36) (in thread): > The above is documented in ‘R Internals’ together with all the other _R_CHECK_... settings. > > These checks are done at runtime, which means they apply also whenever we run R. The assertion of length(x) == 1 in if (x) ... happens in src/main/eval.c:asLogicalNoNA() at: https://github.com/wch/r-source/blob/trunk/src/main/eval.c#L2169-L2179 The assertion of length(x) == 1 and length(y) == 1 in x && y and x || y happens in src/main/coerce.c:asLogical2() at: https://github.com/wch/r-source/blob/trunk/src/main/coerce.c#L1770-L1780 The actual parsing of and acting on the environment variables happens in: https://github.com/wch/r-source/blob/trunk/src/main/errors.c#L2150-L2287

2021-05-31

Henrik Bengtsson (13:28:16) (in thread): > FYI, Bioc devel now catches these bugs, e.g. https://bioconductor.org/checkResults/devel/bioc-LATEST/CopywriteR/nebbiolo2-buildsrc.html and https://bioconductor.org/checkResults/devel/bioc-LATEST/bnem/nebbiolo2-buildsrc.html

Henrik Bengtsson (14:06:47) (in thread): > I browsed through https://bioconductor.org/checkResults/devel/bioc-LATEST/index.html, and there are quite a few packages that suffer from these bugs. I spotted at least one where the bug is in a dependent package, so it’ll require reporting upstream to get this fixed, but it certainly looks doable.

2021-06-01

Hervé Pagès (16:51:08) (in thread): > Yup, now we have 101 software packages in the 3.14 daily report with either a “the condition has length > 1” or a “length > 1 in coercion to logical” failure. We need a plan :worried:.

Henrik Bengtsson (17:00:59) (in thread): > I guess that’s good … those previously hidden bugs are now revealed. For a plan: Sounds like you’ve got a way to programmatically pull out which the problematic packages are. I guess sending out a message to the package maintainers to look into the problem and fix it if it occurs in their package. That should weed out most of them. Then rerun and narrow in on problems with upstream packages not on Bioconductor.

Hervé Pagès (17:03:44) (in thread): > The number of packages that actually contain the problem is < 50. I’ll discuss with the team what to do about them. Thanks!

Henrik Bengtsson (17:08:11) (in thread): > (FWIW, for one of the packages I checked, this caught a typo where the code used something like foo(x, param=row(df)) but it’s clear that the developer meant foo(x, param=nrow(df)))

Hervé Pagès (17:09:43) (in thread): > Do you think you can contact them? Thanks

Hervé Pagès (17:11:31) (in thread): > (preferably by opening an issue on GitHub, but if that doesn’t work and you have to use email, please cc @Lori Shepherd and me, thx)

Henrik Bengtsson (17:48:32) (in thread): > I’ve been doing this for many++ packages already but I’ve already spent way more time than I have on doing so. This is why I reached out to you in the first place. BTW, any CRAN or Bioconductor package that does not link to a GitHub/GitLab via URL or BugReports, I ignore - emailing maintainers in the dark is too much work. For some of them, I found a GitHub repo but they don’t link to it from the package DESCRIPTION file. I think you should add to your Bioconductor onboarding process to make sure URL and BugReports are populated, unless the maintainer explicitly says they only communicate via email.

Henrik Bengtsson (17:49:32) (in thread): > FYI, I’ve been doing this for several years whenever I spot a problem.

Hervé Pagès (18:06:53) (in thread): > > I’ve been doing this for many++ packages already but I’ve already spent way more time than I have on doing so. > Kind of the same feeling here. We’ve already spent a lot of time last year chasing packages where this kind of error was happening (using the previous setting package:_R_CHECK_PACKAGE_NAME_,abort,verbose). It’s a very time-consuming process. > > One problem I see with the new setting is that even though the FAILURE REPORTs try hard to be helpful and provide as much detail as they can, they’re not particularly easy to understand and can be confusing. Sometimes they indicate that the problem occurs in a dependency even though that’s not really true. For example, for things like which(is.na(mat), arr.ind = dim(mat)) (found in MSnbase) or stopifnot(class(a) == class(b) || (class(a) == "pgrid" && class(b) == "wpp")) (found in optimalFlow), the FAILURE REPORT indicates that the error occurs in base::which() or base::stopifnot(), respectively. Technically true, but somehow misleading. In the case of base::which(), I would argue that the function should be able to catch the misuse by doing better argument checking, so the issue is kind of on both sides. > > So yes it would be great to eradicate these errors, but the new setting (verbose) is very noisy and can be misleading. We’ll keep it for now but we might need to step back at some point if things become too ugly.

Henrik Bengtsson (18:22:25) (in thread): > I think you should be able to put the burden of troubleshooting on the package maintainer, especially so with package:_R_CHECK_PACKAGE_NAME_,abort,verbose. The community can help out when the maintainer can’t figure it out themselves. I also hope that fixing some of the 50 packages (spotted with verbose) might also fix it for some of the other packages. > > From a scientific point of view, the risk from leaving those bugs unfixed is too high. There are so many ways the produced results can be messed up because of these bugs. If these bugs happen in some essential packages, these risks scale up quickly and become a cost/burden/unknown/debt for many researchers out there.

2021-06-08

Arjun Krishnan (09:22:56): > @Arjun Krishnan has joined the channel

2021-06-10

Pariksheet Nanda (10:25:41): > I have a fresh R 4.1 installation and am developing a package with Bioconductor dependencies. Surprisingly, after installing devtools and running devtools::install() in the package’s git-cloned directory, all the Bioconductor dependencies were installed! Without BiocManager being installed. Has anyone encountered this behavior? I can’t find anything in the ChangeLogs for R 4.1 or devtools that explains how devtools::install automagically installs packages from bioconductor.org

Pariksheet Nanda (10:27:45) (in thread): > I just checked my getOption("repos") which only lists https://cloud.r-project.org as the CRAN mirror and nothing else

Marcel Ramos Pérez (11:58:11) (in thread): > https://github.com/r-lib/remotes/blob/master/R/install-bioc.R but I’d recommend using BiocManager::install("user/repo") to be safe

2021-06-11

Pariksheet Nanda (08:30:56) (in thread): > Thanks for pointing me to that source file. Yeah, I usually use BiocManager::install(). This was the first time I saw the fallback to install dependencies from Bioconductor.

Mike Smith (12:14:41): > Anyone have any topics they’d like us to cover at a future developers forum? I’m happy to structure something, but looking for inspiration or topics.

2021-06-14

Tim Triche (14:53:28): > BiocFileCache and friends?

2021-06-15

Mike Smith (05:36:52): > The next Developers Forum will be this Thursday at 09:00 PST / 12:00 EST / 18:00 CEST - You can find a calendar invite attached and at https://tinyurl.com/BiocDevel-2021-06 We’ll be using BlueJeans at https://bluejeans.com/114067881/ @Martin Morgan will present some initial steps towards translating error messages in {BiocManager} into multiple languages. This isn’t something I’ve thought about before, and it will be great to have some idea how much work is required from a technical side, in addition to creating the actual translations. > > There are some notes available at https://github.com/Bioconductor/BiocManager/blob/po-translations/README-Translation.md that may be of interest. In particular it would be great to get the perspective of the community (both developers and users) on the points in the “Concerns / next-steps” section - File (Binary): BiocDevel-2021-06.ics

Henrik Bengtsson (15:00:12) (in thread): > @Martin Morgan, I guess you’re already aware of https://github.com/MichaelChirico/potools?

Martin Morgan (15:22:42) (in thread): > Yes, Michael and I have been talking a little offline. Perhaps Michael will be able to join us on Thursday, 12 noon US Eastern (how do I invite someone to this Slack? — michaelchirico4@gmail.com)

2021-06-17

Dirk Eddelbuettel (10:40:39) (in thread): > I actually just (Twitter) DM-ed him Mike’s tweet. I guess I was unaware of this convo – but better safe than sorry:slightly_smiling_face:

2021-06-18

Vince Carey (13:56:57): > @Mike Smith – here’s an idea for a developer forum: package lifecycle concepts, with particular attention to simplifying the tasks of deprecating classes and datasets. Maybe not a whole session, but @Marcel Ramos Pérez and @Hervé Pagès have some developments in that area.

2021-06-21

Martin Morgan (10:33:38): > Last week’s developer forum on translating help messages is at https://youtu.be/M8Fyj2HMYVw; if you’re interested in further discussion, visit the #documentation-translation channel - Attachment (YouTube): Developers Forum 23

2021-06-28

Ben Story (12:09:28): > @Ben Story has joined the channel

2021-07-05

Chouaib Benchraka (01:57:06): > @Chouaib Benchraka has joined the channel

2021-07-16

Robert Castelo (10:57:54): > hi, in updating my R/BioC installation on macOS, a package called stringi asks whether I want to install the newer version from source; if I say yes, I get the following compilation errors: > > [...] > ***** Compiler settings used: > CC=clang -mmacosx-version-min=10.13 > LD=clang++ -mmacosx-version-min=10.13 -std=gnu++14 > CFLAGS=-isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk -fPIC > CPPFLAGS=-isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk -I/usr/local/include -UDEBUG -DNDEBUG > CXX=clang++ -mmacosx-version-min=10.13 -std=gnu++14 > CXXFLAGS=-isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk -fPIC -D_XPG6 > LDFLAGS= > LIBS= > > **** libs > clang++ -mmacosx-version-min=10.13 -std=gnu++11 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I. -Iicu55 -DU_STRINGI_PATCHES -Iicu55/unicode -Iicu55/common -Iicu55/i18n -DU_STATIC_IMPLEMENTATION -DU_COMMON_IMPLEMENTATION -DU_I18N_IMPLEMENTATION -DUCONFIG_USE_LOCAL -UDEBUG -DNDEBUG -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk -I/usr/local/include -D_XPG6 -fPIC -fPIC -Wall -g -O2 -c stri_brkiter.cpp -o stri_brkiter.o > In file included from stri_brkiter.cpp:32: > In file included from ./stri_stringi.h:36: > In file included from ./stri_external.h:67: > In file included from /Library/Frameworks/R.framework/Resources/include/R.h:50: > /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/cmath:317:9: error: no member named 'signbit' in the global > namespace > using ::signbit; > ~~~~^ > [...] > > For now I’ve installed the older binary version, but I’m a bit worried that this problem may pop up with other packages for which I may want to have the very latest version. I’ve Googled a bit without success; does anyone working with macOS have a hint about what the problem and the solution could be?

2021-07-20

Al J Abadi (08:23:38): > I have a script that pushes the package changes to Bioconductor if checks and builds look good. I tried to set it up on GitHub as a workflow by providing the ssh keys and other config files using secrets but it won’t work. It looks like the following: > > - name: Push to Bioconductor > run: | > ## set up logger > mkdir log > touch log/log.txt > ## set up ssh > mkdir ~/.ssh > # copy the config file > cp .github/config ~/.ssh/config > wd=$PWD > cd ~/.ssh > # set up ssh agent > ssh-agent /bin/sh > eval `ssh-agent -s` > # add ssh key > ssh-add - <<< "$SSH_KEY" > # add known host > ssh-keyscan $BIOC_HOST > ~/.ssh/known_hosts > cd $wd > ## push > # add github account that has push access to Bioc. > git config user.name 'Al J Abadi' > git config user.email 'al.jal.abadi@gmail.com' > # ensure you have write access to the package > ssh -T git@git.bioconductor.org &>> log/log.txt > # create a temp branch to push (could also use HEAD) > git checkout -b current_branch > git push git@git.bioconductor.org:packages/mixOmics current_branch:master &>> log/log.txt > env: > SSH_KEY: ${{ secrets.PRIVATE_SSH_KEY_AL }} > BIOC_HOST: ${{ secrets.BIOC_HOST }} > > I can see that I have write access after this set up. Also, trying to make illegal pushes (i.e. while remote is ahead), I get what is expected : > > error: failed to push some refs to 'git.bioconductor.org:packages/mixOmics' > hint: Updates were rejected because the remote contains work that you do > hint: not have locally. > > (N.B. I was using illegal pushes as a test case to ensure I’m not constantly pushing while fine-tuning the workflow) > > But while trying to make valid pushes, I get: > > git@git.bioconductor.org: Permission denied (publickey). > > Would appreciate any insights.

Nitesh Turaga (09:10:10): > Hi @Al J Abadi, why would you need to have this set up? Can you tell me how this helps you? Theoretically, as long as you have the private key set up on that machine correctly, it should work. Make sure the permissions on your key are correct, i.e. chmod 400 ~/.ssh/id_rsa. Are you able to push from your local machine? If so, there is nothing wrong from our end (Bioconductor) > > Check out http://bioconductor.org/developers/how-to/git/faq/

Al J Abadi (20:17:12): > Hi@Nitesh Turaga, thank you for your response and the tip. I’ll try it and see how I go. > > Are you able to push from your local machine? If so, there is nothing wrong from our end (Bioconductor) > Yes > > Why would you need to have this set up? Can you tell me how this helps you? > For some reason, we constantly find ourselves having to reconfigure the id_rsa on local machines (this could arise from some IT issues we are having, I’m not sure though). I was trying to automate it on GitHub so that, for instance, an update to our devel package is as easy as pushing to a specific branch on GitHub (with some checks obviously). This way, IT issues won’t make updating our package a troublesome process we keep putting off. Also, this way and probably more importantly, even if I - the package maintainer - am too busy to push an important update, a PR from a contributor to that branch reminds me of pushing an important update and the process is as easy as merging that PR. I hope that makes sense.

Nitesh Turaga (21:18:32): > Ok, you are welcome to configure it. Did the permissions check work?

2021-07-22

Konrad J. Debski (03:48:40): > @Konrad J. Debski has joined the channel

Mike Smith (05:17:27): > The Developers’ Forum teleconference is taking a summer hiatus. The plans for this month fell through at short notice and I will be travelling for all of August (although feel free to organise something without me!) We’ll be returning in October with new impetus, new topics (and a new video platform). In some very-forward-planning I’m excited to announce that @Jeroen Ooms will be presenting in November on the R-Universe project!

2021-07-23

Nitesh Turaga (09:22:00): > For Conda users, any idea how I can debug this? It stems from trying to build a package using R CMD build COTAN/ (new package submission): https://gist.github.com/nturaga/4747303ce4361910d0a0720566948e7c I’m not a heavy conda user, and one of the Stack Overflow suggestions was to use Python 3.7.9 and above, but that doesn’t seem to help anything.

Nitesh Turaga (09:22:19): > Since it’s a basilisk issue, I’ll tag @Aaron Lun.

Nitesh Turaga (09:23:53): > I see that it says conda clean --packages may resolve my issue, but I’m such a conda / basilisk beginner that I’m not even sure how this would help. (Remove unused packages? But there are no packages; in fact there has never been a conda on this machine!)

Batool Almarzouq (15:52:44): > @Batool Almarzouq has joined the channel

2021-07-24

Kevin Blighe (20:33:55): > Seems that 2 of my packages are failing build, with the same error: > > Error: processing vignette 'RegParallel.Rmd' failed with diagnostics: > The size of the connection buffer (131072) was not large enough > to fit a complete line: > * Increase it by setting `Sys.setenv("VROOM_CONNECTION_SIZE")` > > Is this something that’s known?

Dirk Eddelbuettel (20:46:04) (in thread): > Maybe check the GitHub repo of the package implied by the beginning of that message for other/related issues? It was very recently updated…

Kevin Blighe (20:52:13) (in thread): > Hmm, I will see if they pass r cmd check and BiocCheck

2021-07-26

Alan O’C (08:04:09) (in thread): > Yeah seems naively like vroom just added that env var

2021-07-30

lalchung nungabt (04:42:55): > @lalchung nungabt has joined the channel

2021-08-05

Krutik Patel (05:41:45): > @Krutik Patel has joined the channel

Kevin Blighe (16:47:30) (in thread): > Unfortunately, I have no idea what’s happening with it. Still failing: https://master.bioconductor.org/checkResults/3.13/data-experiment-LATEST/RegParallel/malbec2-buildsrc.html

Kevin Blighe (16:48:59) (in thread): > I added this line in the vignette (and also tried placing it at the top), but no success: https://github.com/kevinblighe/RegParallel/blob/master/vignettes/RegParallel.Rmd#L188

2021-08-07

Octavio Morante-Palacios (04:36:44): > @Octavio Morante-Palacios has joined the channel

Kevin Blighe (19:47:07): > I have posted the above as an issue on vroom:https://github.com/r-lib/vroom/issues/364

Spencer Nystrom (23:07:33) (in thread): > I have nothing substantive to add except a quick heads up this may not get traction for a week or so:https://twitter.com/hadleywickham/status/1423763753594761225?s=19 - Attachment (twitter): Attachment > The tidyverse team is taking next week (Aug 9-13) off. Time away is so important for sustainable open source maintenance; we all need a break to recharge. You won’t hear from us on GitHub next week, but we look forward to working with y’all again when we get back! #rstats

Kevin Blighe (23:10:21) (in thread): > Thanks@Spencer Nystrom! Oh dear, this may linger on a bit!

2021-08-11

Rory Stark (13:06:46): > @Rory Stark has joined the channel

2021-08-12

Alan O’C (17:57:18) (in thread): > Just saw this in the wild for the first time in GEOquery::getGEO

Kevin Blighe (18:09:55) (in thread): > Yep, they’ve changed something in vroom. I asked Sean and it’s not anything he has modified in GEOquery itself

Alan O’C (18:12:48) (in thread): > Yeah I figured, this ran without error a few weeks back and I’m on release

Kevin Blighe (18:15:43) (in thread): > Hmm, maybe I’ll try that buffer size that worked for you; although, this close to the release cycle deadline, I may just leave that part out of the vignettes for now and wait until the next devel

Alan O’C (19:08:02) (in thread): > There’s something funny going on there, but yeah for that example (I’d forgotten it’s taken from the PCAtools vignette:slightly_smiling_face:) that magic number seems to work

Kevin Blighe (19:08:58) (in thread): > Luck O’ the Irish:shamrock:

2021-08-14

Mikhail Dozmorov (10:09:34): > I’m using the latest bioconductor:devel Docker image. When building vignettes, it generates: WARNING 'qpdf' is needed for checks on size reduction of PDFs. I’ve tried compiling/installing https://github.com/qpdf/qpdf within the container, but it creates issues with permissions. Tested on a couple of packages that worked previously. Any advice?

Dirk Eddelbuettel (10:45:11): > Modify the container and install qpdf (and/or please state what “permission” error you are getting – my containers tend to be Ubuntu or Debian based, so as root I just call apt). You can (locally) just dump your modified container. Additionally, maybe PR a change to the official Dockerfile.

Martin Morgan (11:31:49): > I think there’s an effort at https://github.com/nturaga/bioconductor_build_docker to create a docker image that reflects the build system more completely, rather than the current images, which are meant to use packages. A big ‘cost’ in terms of size is the inclusion of LaTeX. Scanning the Dockerfile shows qpdf is included in the image. @Nitesh Turaga or @Andres Wokaty may be a more reliable :wink: source of information…

Mikhail Dozmorov (11:39:34): > I’ve been using bioconductor/bioconductor_docker:devel, and its Dockerfile doesn’t seem to have qpdf. Adding it as suggested helped (thanks, @Dirk Eddelbuettel). I’m not sure if I should use bioconductor/bioconductor_docker:devel or nturaga/bioconductor_build_docker:master

Sean Davis (12:19:39) (in thread): > tagging@Nitesh Turaga

Nitesh Turaga (21:49:29): > Hi @Mikhail Dozmorov Can you tell me which package you are using that requires qpdf?

Nitesh Turaga (21:50:05): > Just as a caution, the nturaga/bioconductor_build_docker:master is a first attempt at replicating the Linux build machine.

Nitesh Turaga (21:50:09): > It’s not complete.

Nitesh Turaga (21:50:45): > Martin explains the differences well, between the bioconductor_docker images and my test build_docker image.

Nitesh Turaga (21:51:34): > My naive understanding is you shouldn’t need qpdf unless you are trying to build vignettes.

Nitesh Turaga (21:53:25): > I see a few Bioconductor packages that Suggest it, and one that Imports it, POMA. - File (PNG): Screen Shot 2021-08-14 at 9.52.28 PM.png

Mikhail Dozmorov (21:59:55): > Thank you, Nitesh, Martin. I tested the preciseTADhub and SpectralTAD vignettes; both failed without qpdf. Perhaps something is wrong with the YAML header? But everything worked previously. qpdf in this case is an external dependency; adding it to the Dockerfile in the apt-get section helped. I can make a PR, but am still not sure why it fails in the first place.

2021-08-15

Dirk Eddelbuettel (09:24:02): > (No, it is just something that R CMD check --as-cran does for pdf vignettes. You do not govern that in the yaml header.)

2021-08-16

Jeroen Gilis (07:48:42): > @Jeroen Gilis has joined the channel

Kevin Blighe (20:50:33) (in thread): > Well, that was not quite the response for which I had hoped on GitHub. @Sean Davis, it seems that the vroom developer is not too interested in a fix here.

2021-08-17

Kevin Blighe (09:33:46) (in thread): > @Alan O’C, I’m not fully following that thread. The issue is solved by simply specifying the buffer in quotes, like Sys.setenv(VROOM_CONNECTION_SIZE = "500000")?

Alan O’C (09:34:03) (in thread): > Seems that way, yeah

Alan O’C (09:34:43) (in thread): > The reason the magic 72 works is that it stops R from storing the env variable as “5e+05”; it gets stored as “500072” instead
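The behaviour Alan describes can be reproduced in a plain R session. A minimal sketch (the 500000/500072 values come from the thread) of why the bare number breaks and why quoting the value side-steps the problem entirely:

```r
# Sys.setenv() coerces its value with as.character(); for a bare double,
# R may pick scientific notation, which vroom cannot parse as an integer.
as.character(500000)   # "5e+05"  -- what gets stored for the round number
as.character(500072)   # "500072" -- the "magic 72" keeps fixed notation

# Passing the value as a string stores it literally, no magic needed:
Sys.setenv(VROOM_CONNECTION_SIZE = "500000")
Sys.getenv("VROOM_CONNECTION_SIZE")
```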

2021-08-18

Aedin Culhane (09:40:47): > If you have developed a Bioconductor package and learned the challenges along the way, or if you have tips and tricks for would-be developers, please consider joining the new Bioconductor Mentorship program to help new would-be developers get from R scripts to a Bioconductor package. Expression of interest form at https://forms.gle/fya5JEArTT5kNEGr9 <!channel> - Attachment (Google Docs): Bioconductor Developer Mentorship Program - Expression of Interest > Please indicate your interest to learn more and potentially become a mentor in the Bioconductor Developer Mentorship program. The Bioconductor Developer Mentorship Program will form a part of the Bioconductor “welcome mat” that the Bioconductor Community Advisory Board are developing. It will be a group-mentorship/buddy program for would-be, new, or hesitant developers or for developers who wish to refresh their skills. The goal of the program is to welcome and onboard new developers, develop educational material to assist new developers, improve the quality of packages submitted to Bioconductor, and strengthen community and interactions between Bioconductor developers. Mentors should have at least 1 package either in the release or accepted in the development branch of Bioconductor. Each program cycle would run for 6 months (or the development cycle of Bioconductor package release). For more information about the program see the Bioconductor Website [link] or Google doc [https://docs.google.com/document/d/1Q-Hxmy0ZcKzKSbB-dtg02gJRlZ0Vi6WNOTF-W3bwjmY/edit?usp=sharing]

2021-08-19

Ava Hoffman (she/her) (11:32:44): > @Ava Hoffman (she/her) has joined the channel

Ava Hoffman (she/her) (11:32:50): > @Ava Hoffman (she/her) has left the channel

2021-08-31

Kevin Rue-Albrecht (07:39:03): > Hi all, just to point out that I’ve created #github-actions to describe an issue that I’m having, without cross-posting it everywhere. > I also thought the channel could be useful, as I haven’t found a channel here to specifically discuss issues about GitHub Actions for Bioconductor packages.

2021-09-06

Eddie (08:22:38): > @Eddie has joined the channel

Eddie (08:27:14): > @Eddie has left the channel

2021-09-10

Gian Marco Franceschini (02:57:36): > @Gian Marco Franceschini has joined the channel

2021-09-13

Kevin Blighe (07:20:50) (in thread): > Back to this. I updated to next BioC Devel last night and I am pretty sure that this issue ‘breaks’ GEOquery.

Alan O’C (12:16:38) (in thread): > Yeah I imagine so, it seemed to be the header that broke that one example.

Amy Guillaumet (15:19:06): > @Amy Guillaumet has joined the channel

2021-09-16

margherita mutarelli (09:26:39): > @margherita mutarelli has joined the channel

Martin Morgan (11:01:16): > BiocParallel’s handling of random number streams has been updated. This is likely a breaking change for any package that uses random numbers in bplapply() or bpiterate(). Details of the change and new behavior (generally much improved over the previous situation) are at http://bioconductor.org/packages/devel/bioc/vignettes/BiocParallel/inst/doc/Random_Numbers.pdf. >   > I had thought the changes would be ‘final’, but probably there is another iteration (see https://github.com/Bioconductor/BiocParallel/pull/140 for commentary) that might also break behavior. >   > If your package is affected by these changes, then please be patient while we work toward a good solution. >   > Regardless of the details of the final solution, if your package uses random numbers in bplapply() or bpiterate(), then the exact stream of random numbers seen by FUN will be different from before the update, reflecting intrinsic differences in how these streams are generated. >   > Martin

Kasper D. Hansen (13:26:57): > Do we have any idea of the limit or periodicity of the separate streams? I sense that there is an assumption that we can easily have 10^6 separate independent streams? (Which may very well be true; I’m just asking if we know)

Martin Morgan (13:35:57): > Yes, it’s described in parallel::nextRNGStream() > > This uses as its underlying generator 'RNGkind("L'Ecuyer-CMRG")', > of L'Ecuyer (1999), which has a seed vector of 6 (signed) integers > and a period of around 2^191. Each 'stream' is a subsequence of > the period of length 2^127 which is in turn divided into > 'substreams' of length 2^76. > > Each element of X currently gets its own substream, so 2^76 is the periodicity of a call to FUN(), with 2^127 calls to FUN() available.
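The base-R machinery Martin refers to can be exercised directly. A minimal sketch using only the parallel package (this is the underlying mechanism, not BiocParallel's internals):

```r
# L'Ecuyer-CMRG seeds are length-7 integer vectors (RNG kind + 6 seed
# values); parallel can advance them to the next independent stream or
# substream without generating any random numbers in between.
RNGkind("L'Ecuyer-CMRG")
set.seed(1)
s0 <- .Random.seed                    # seed for the current stream
s1 <- parallel::nextRNGStream(s0)     # start of the next stream (2^127 apart)
s2 <- parallel::nextRNGSubStream(s1)  # next substream within s1 (2^76 apart)
length(s1)                            # 7
```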

Henry Miller (18:34:51): > @Henry Miller has joined the channel

2021-09-17

Martin Morgan (11:41:59): - Attachment: Attachment > and the winner, in terms of what went wrong with my attempt to monitor breakage, is that the anti_join() arguments in my script are reversed — should be anti_join(current_reports, checkpoint_reports, by = c("pkg", "version")) (actually I’m not sure whether "result" should be included in the by =) > > Breaking packages (as of today) are > > anti_join(current_reports, checkpoint_reports) |> print(n = Inf) |> count(pkg, arrange = TRUE) > Joining, by = c("pkg", "version", "result") > # A tibble: 25 × 3 > pkg version result > <chr> <chr> <chr> > 1 BASiCS 2.5.4 ERROR > 2 batchelor 1.9.1 ERROR > 3 BEclear 2.9.0 ERROR > 4 BiocSingular 1.9.1 ERROR > 5 Cardinal 2.11.2 ERROR > 6 ChromSCape 1.3.21 ERROR > 7 fgsea 1.19.2 ERROR > 8 FindMyFriends 1.23.0 ERROR > 9 GDCRNATools 1.13.1 ERROR > 10 HiCBricks 1.11.0 ERROR > 11 LineagePulse 1.13.0 WARNINGS > 12 metaseqR2 1.5.12 ERROR > 13 methylGSA 1.11.0 ERROR > 14 mixOmics 6.17.26 ERROR > 15 motifbreakR 2.7.1 WARNINGS > 16 proFIA 1.19.0 WARNINGS > 17 scater 1.21.3 ERROR > 18 scDblFinder 1.7.5 WARNINGS > 19 scone 1.17.1 ERROR > 20 scPCA 1.7.1 ERROR > 21 scran 1.21.3 ERROR > 22 sesame 1.11.12 ERROR > 23 signatureSearch 1.7.3 ERROR > 24 SingleR 1.7.1 ERROR > 25 zinbwave 1.15.2 ERROR

2021-09-18

Vince Carey (08:20:48): > Open question on this: could you have accelerated the assessment of breakage with the docker:devel and the current devel binaries of all reverse dependencies? If this makes sense, could we facilitate this kind of experiment on AnVIL?

2021-09-20

Dania Machlab (14:50:50): > @Dania Machlab has joined the channel

2021-09-25

Haichao Wang (07:20:07): > @Haichao Wang has joined the channel

2021-10-04

Alan O’C (19:31:01): > Has the RNG behaviour changed again since? I’m having to revert my previous changes

Alan O’C (19:31:30) (in thread): > What I’m really asking here is “will I have to fix it a third time?”

2021-10-05

Martin Morgan (06:22:53) (in thread): > Yes, the RNG behavior has changed for the final time — the streams seen by workers are different, the global seed set with set.seed() is ignored (reproducibility requires use of the RNGseed= argument to SerialParam(), SnowParam(), or MulticoreParam()), and bplapply() never increments the global seed. Thanks for your patience and diligence. I’ll post a more extensive update later today.

Henrik Bengtsson (11:46:56) (in thread): > Famous last words: “for the final time” - you like to live dangerously:stuck_out_tongue_winking_eye:

Alan O’C (11:51:46) (in thread): > Great, thanks! Sorry this probably came across more snarky than I intended but it was almost 1am for me and I was just trying to sincerely ask “should I keep an eye out for more changes before the next release”:slightly_smiling_face:

Alan O’C (11:52:17) (in thread): > My clarity of communication (not fantastic at the best of times) tends to go down the toilet after hours

2021-10-12

Jonathan Carroll (23:45:56): > @Jonathan Carroll has joined the channel

2021-10-13

David Zhang (12:06:38): > I have recently been introduced to OOP in R and am considering the possibility of using OOP to extend existing Bioconductor S4 classes in future projects/packages. > > From my reading, I am finding that the OOP style of programming offers more guidelines for structuring code, dependent on the problem you’re trying to solve, and it looks to be beneficial to think about the software design from the project outset. However, I am struggling with these ideas and have tried to read through some material about design patterns but, to me, they don’t seem to transfer easily to developing bioinformatics tools. Additionally, the resources I have come across on using S4 seem more tailored towards practical implementation rather than top-level design. > > I was wondering if there are any resources available that cover design patterns for developing bioinformatic software? Or more generally, how to plan/structure/design bioinformatics software using OOP? Thanks!

Michael Lawrence (16:30:56): > Well there’s probably no better example than the Bioconductor infrastructure (S4Vectors, IRanges, GenomicRanges, etc).

Kasper D. Hansen (21:04:17): > OOP in R is quite different from general OOP.

Kasper D. Hansen (21:05:53): > My two comments on OOP in R > 1. The success of the approach is all about data structures, ie. classes. > 2. IRanges and friends are indeed a model of success for developing a rich set of methods as well, but that is extremely unusual. Most packages should only have a few methods extending existing ones. Most functions in an analysis package should not be methods.

Kasper D. Hansen (21:07:09): > I am pointing out 2) because one common side effect of reading about OOP is that you start wanting to make everything into methods.

2021-10-14

Mike Smith (02:14:40): > @Kasper D. Hansen can you expand on why it’s better to avoid making lots of methods and to stick with functions instead?

Alan O’C (04:47:37): > I was going to say, most OOP books tend to be written for the likes of Java. I don’t know that I’ve seen an OOP recommendation for functional languages

Vince Carey (06:57:25) (in thread): > Don’t miss the discussion at https://community-bioc.slack.com/archives/C35G93GJH/p1632965408036800 - Attachment: Attachment > hello! I’m preparing a lecture on object oriented programming and reading through Hadley Wickham’s Advanced R book on S4 (https://adv-r.hadley.nz/s4.html#learning-more). He notes that Bioconductor has some of the best material for S4 and points to a link from 2017 from @Martin Morgan @Hervé Pagès (https://bioconductor.org/help/course-materials/2017/Zurich/S4-classes-and-methods.html). Do people have fav recs on more recent S4 material?

David Zhang (12:35:34) (in thread): > Thanks Kasper! this is helpful

Michael Lawrence (12:48:17): > It would be interesting to think through functional OOP idioms for common design patterns.

Kasper D. Hansen (21:28:18): > Ok, so I’ll expand a bit here. Since methods are decoupled from classes in S4, one might wonder: what is really the benefit of a method? Let’s take a method foo() which dispatches on two classes A and B. You can of course mimic dispatch with the code

Kasper D. Hansen (21:30:59): > > foo <- function(x) { > if(class(x) == "A") > "some code" > if(class(x) == "B") > "some other code" > } > > The alternative would be to make foo a generic and have methods for two signatures.

Kasper D. Hansen (21:32:03): > The pros and the cons are - in my opinion > 1. as a function, it is easier to inspect and debug foo, especially for the user > 2. as a generic, there is a certain beauty to the approach

Kasper D. Hansen (21:32:59): > What you really get is the dispatching. So my rule is, never write a generic unless you’re making 3 separate dispatches.

Kasper D. Hansen (21:34:36): > Generics - IMO - are especially useful for extending other people’s packages and approaches. I can define a new class and implement a new method for a generic, and then I can plug into another package’s goodies. But in my experience, this happens very rarely. In the typical analysis software package, you have some data containers and you have some analysis code that works on these data containers, but which often does not really make sense outside your package.

Kasper D. Hansen (21:36:05): > There are of course exceptions. We should all implement methods for standard generics like show and also - in specific domains - for widely used generics in Bioconductor such as start() etc. if the class has a GenomicRanges inside of it. My point is that these cross-package examples are actually the exception, not the rule, for most software packages
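As a concrete illustration of the pattern being discussed, here is a minimal sketch of a generic that genuinely earns its keep by behaving differently per class (the class and generic names are invented for illustration):

```r
library(methods)

# Two toy containers with different payload types.
setClass("NumBox",  representation(x = "numeric"))
setClass("CharBox", representation(x = "character"))

# One generic, one method per class -- dispatch does the branching that
# the if(class(x) == ...) version of foo() would do by hand.
setGeneric("payloadSize", function(object) standardGeneric("payloadSize"))
setMethod("payloadSize", "NumBox",  function(object) sum(object@x))
setMethod("payloadSize", "CharBox", function(object) sum(nchar(object@x)))

payloadSize(new("NumBox",  x = c(1, 2)))  # dispatches to the NumBox method
payloadSize(new("CharBox", x = "abc"))    # dispatches to the CharBox method
```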

Kasper D. Hansen (21:38:22) (in thread): > Henrik is pointing out that you should never write code like this, for various reasons. Instead you should do inherits(x, "A") (or perhaps is(x, "A"), I might add)
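A quick sketch of why the thread prefers inherits() (or is()) over comparing class(x) directly: class() can return a vector, so the equality test is not a scalar condition:

```r
library(methods)

# An S3 object with two classes, as commonly produced by subclassing.
x <- structure(list(), class = c("B", "A"))

class(x) == "A"   # c(FALSE, TRUE): length-2 logical, unsafe inside if()
inherits(x, "A")  # TRUE: checks the whole class vector
is(x, "A")        # TRUE: also understands S4 inheritance
```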

Kasper D. Hansen (21:40:27): > What happens to developers new to OOP is that - because the line is blurred between a function and a method - you start seeing methods everywhere. But in general, many users will have a worse user experience, because of worse debugging tools and worse display tools. It can be very useful to use generics - when you really use the dispatch system. Hence my advice of (1) require at least 3 signatures and (perhaps) (2) a generic you could imagine would be useful outside of your package.

Kasper D. Hansen (21:41:06): > It took me a while to get here. I have certainly been in the “let’s make a bunch of generics” camp

Peter Hickey (21:42:44) (in thread): > this is a familiar chat:slightly_smiling_face:

Hervé Pagès (22:37:11): > On that topic I just reviewed a package today that was systematically defining S4 generics instead of regular functions, even if no S4 objects are involved and even if it seems very unlikely that dispatch will ever be involved: https://github.com/Bioconductor/Contributions/issues/2346#issuecomment-943640794 This is of course not desirable. > > Generics, methods and dispatch are the 3 primary ingredients to make the ecosystem extensible. IMO it’s not so much the number of signatures that will drive the decision (because you can’t know this in advance) but whether or not you want to allow other contributors to implement methods for new types of objects to come in the future. > > But there’s a 4th ingredient that I think is often overlooked: coercion. In the case of Kasper’s foo() function, the function could be implemented to work on a preferred data structure A, and start with: > > foo <- function(x) { > if (!is(x, "A")) > x <- as(x, "A") > ... > operates on an A object > } > > I actually took this approach 2 years ago for the SummarizedExperiment() constructor (https://github.com/Bioconductor/SummarizedExperiment/commit/0d74843c8923a917fecf0e2b75c95c8456da5cb3). This allowed me to turn it into a regular function (instead of an S4 generic with a bunch of methods), with many code simplifications, and to get rid of the ellipsis (which should also be avoided whenever possible).

Kasper D. Hansen (22:57:26): > In all honesty this is also how I tend to get dispatch-like behaviour inside a function. But it is also commonly done in S3 / S4 methods (and when it is done in S4 methods that end up using callNextMethod(), it is really hard for a user to follow the code)

Kasper D. Hansen (22:58:32): > I will argue that most developers tend to overestimate the number of people who would want to extend their approach. Which is why I think it makes sense to start with the set of developers you have control over: yourself.

2021-10-15

Mike Smith (06:09:46): > Thanks a lot for the expanded comments, very helpful discussion. On that note, anyone interested in presenting something like this at a developers forum call? Seems like a pretty good topic for that audience. I’d love to see some of the pitfalls in detail and, if we’re seeing some package submissions with inappropriate approaches, it might be useful for new developers to either join or have a video available later.

Kasper D. Hansen (15:01:51): > I’m happy to rant about it in person

Yagmur Simsek (16:56:08): > @Yagmur Simsek has joined the channel

2021-10-18

John Kerl (09:16:05): > @John Kerl has joined the channel

Ludwig Geistlinger (11:20:33): > It’s been a couple of years since I worked in Python, so I am not necessarily up to speed with the latest developments in that space, but I am currently considering wrapping some functional genomics datasets in a Python package. > > Now, before embarking on this idea, I wanted to better understand whether Python has concepts similar to our Bioconductor experimental data packages. I am taking the TabulaMurisData package as an example, which serves single-cell RNA-seq datasets from the TabulaMuris project as SingleCellExperiments. Could one e.g. think about a corresponding Python package that serves single-cell RNA-seq datasets from the TabulaMuris project as AnnData objects? Or is this not something people would do in Python? > > I am not sure how many Python developers we have in the channel, but any comments/pointers will be greatly appreciated. Maybe @Aaron Lun and/or @Luke Zappia can comment? Thanks!

Luke Zappia (11:23:53) (in thread): > People in my lab have worked on sfaira: https://github.com/theislab/sfaira. It’s not quite the same model, but it is designed to make it easy for people to access public datasets as AnnData objects. They are always keen to have more datasets.

Ludwig Geistlinger (11:24:49) (in thread): > Thanks a lot, I’ll check that out!

2021-10-19

Dario Righelli (09:00:32): > @Dario Righelli has joined the channel

2021-10-22

JP Cartailler (10:41:47): > @JP Cartailler has joined the channel

2021-10-26

Jenny Drnevich (11:21:55): > My institution has a compute cluster and it is past time to update R/Bioc on it. I have to request installation of new R versions and they aren’t too happy to do it more than once a year. I am definitely going to wait until Bioconductor 3.14 is out, hopefully Oct 27. But this is based on R 4.1.1 and R has 4.1.2 scheduled for release on Nov 1. Would it make sense for me to request 4.1.1 or wait until 4.1.2? We host internal mirrors of CRAN/Bioconductor that get updated once a week and all packages would get compiled with whatever version of R they install.

Kasper D. Hansen (11:26:41): > If you do that, it’s honestly not that much more work to compile R yourself

Kasper D. Hansen (11:27:00): > The hard part of compiling R is handling the dependencies, but they tend to change much less frequently

Kasper D. Hansen (11:28:21): > For some definition of not that much more work. I do it for our compute cluster where I have a place to install R which can be used by everyone else, and ok - it has taken some time for me to hone that system

Kasper D. Hansen (11:28:59): > But IMO it is essential to have R updated every 6 months. And most of what I spend my time maintaining is the packages, not R itself; I guess that’s what I mean

Kasper D. Hansen (11:29:27): > So if you do the packages, you have done 80-90% of the work

Kasper D. Hansen (11:30:09): > Hmm, I guess you may not be dealing with the devel branch?

Jenny Drnevich (11:30:09): > I agree that it is essential to update R, but the support team for the cluster does not and users are not allowed to compile their own R versions.

Jenny Drnevich (11:30:20): > No, this is for release only

Lori Shepherd (11:30:20): > probably wait until 4.1.2 – we would update R on our builders within a week or two after the release to be on the latest version. And I assume there is normally good reason for an R-patched version, so it’s worth having the latest updates

Kasper D. Hansen (11:30:53): > I don’t see how this is either enforceable or sensible: > > users are not allowed to compile their own R versions.

Kasper D. Hansen (11:31:22): > You do need sysadmin help if you want to install R in such a way that everyone on the system can benefit

Jenny Drnevich (11:36:21): > I’m sure you can get around it but they STRONGLY recommend that only they install software. They use a module system to handle all the bioinformatic software to make sure all the dependencies are the right versions, etc. Normally it is pretty nice to have them install software for you… They do provide quite a lot: https://help.igb.illinois.edu/Biocluster_Applications

Kasper D. Hansen (11:39:19): > It is possible to set up the module system to have users install their own modules.

Kasper D. Hansen (11:39:42): > Anyway, of course the nicest and easiest solution is for someone else to take care of it. No disagreement there.

Jenny Drnevich (11:40:18) (in thread): > Hmm, so maybe I should wait a couple of weeks after 4.1.2 until your builders are all updated. They install the most commonly used packages in the main library (currently 477!) but any updates have to go into your personal library

Dirk Eddelbuettel (11:40:33): > (Also just to mention it: depending on the Linux you’re on, installing the build-deps can be one command, namely apt build-dep r-base (if you have a src-dep reference); that could still be of help even to the sysadmin folks.)

Kasper D. Hansen (11:41:17): > So there are three approaches (1) convince sysadmins (2) accept (which is what you currently do) (3) roll your own (with additional cost of time). I am just noting that IME - which granted is based on extensive experience in the area - the cost of rolling out R is less than the cost of maintaining packages which you already do.

Kasper D. Hansen (11:41:48): > but time is valuable

Kasper D. Hansen (11:43:22): > For your specific question, I agree with @Lori Shepherd that I would wait for 4.1.2. Although it means you will have some weeks where you don’t have access to the latest release, and that may or may not matter to you and your users

Jenny Drnevich (11:44:12): > It would be easier to do (1) convince sysadmins if there were more R users requesting upgrades!

Kasper D. Hansen (11:45:15): > When I try to assess this I tend to think of (a) my own needs (which I know) and (b) any needs that someone has communicated to me. Of course, most people accept status quo, so I rarely have any specific requests. What I am trying to say is I recommend not thinking about hypotheticals but only about what you know about from you and your users

Kasper D. Hansen (11:45:51): > So if you (and your group / collaborators) are happy with waiting a few weeks for 4.1.2, that’s what I would do

Jenny Drnevich (11:46:02): > Due to COVID we are still running R 4.0.3 / Bioc 3.12 so we can wait a few more weeks.

Jenny Drnevich (11:48:30): > It doesn’t help that the R workshops we run always finish right around R/Bioc upgrades instead of starting right afterwards. Thanks for all your input, @Kasper D. Hansen, @Lori Shepherd and @Dirk Eddelbuettel!

Henrik Bengtsson (13:03:32): > I’d like to pick up on the thread about installing R from source as a non-privileged user. As @Kasper D. Hansen says, it’s straightforward when all system library dependencies have been installed - it’s a matter of ~5 minutes of compilation time and you’re done. All it takes is: > > $ curl -O https://cran.r-project.org/src/base/R-4/R-4.1.1.tar.gz > $ tar xzf R-4.1.1.tar.gz > $ cd R-4.1.1 > $ mkdir -p "$HOME/software/R-4.1.1" > $ ./configure --prefix="$HOME/software/R-4.1.1" > $ make > $ make install > > Then add: > > export PATH=$HOME/software/R-4.1.1/bin:$PATH > > and you’ve got R/Rscript (= R 4.1.1) ready to go. > > The required system libraries are often, but not always, already available on most Linux systems. Any missing ones can be installed by sysadmins directly from well-established, trusted, official Linux repositories, i.e. it’s unlikely that sysadmins won’t want to install them. > > Like Kasper, I had to learn how to do this because systems I had access to were outdated. I now do this regularly on various RedHat/CentOS systems, which are commonly found in academia. Another advantage of installing from source is that you can install multiple versions of R in parallel (as the above example shows). > > However, I never kept track of exactly which OS system libraries one needs in order to build R from source. Does anyone know a ready-to-go reference that lists all required RedHat/CentOS dependencies? Knowing that will lower the threshold for anyone who’s about to build R from source for the first time.

Dirk Eddelbuettel (13:13:37): > > all required RedHat/CentOS dependencies > You probably want to talk to the RH/FC/CentOS/WhateverBrandingIsThisWeek maintainer. That may in fact be my pal Inaki (at least for FC) but it used to be a fellow called Tom Calloway who piped in on the lists every now and then.

Kasper D. Hansen (13:31:12): > There is the stuff you need for core R, but - very importantly - the additional stuff you need for whatever subset of CRAN/Bioc packages your org needs.

Henrik Bengtsson (13:55:32) (in thread): > Yes, but I wanted to keep it simple for new-comers and identify the minimal set of required system libraries to get started. On top of these required libraries, it would be nice to identify the most common set of libraries that’ll allow end-users to install 95% of CRAN and Bioconductor packages out of the box. > > What complicates things further is that RedHat/CentOS comes with really dated compilers, which don’t support C++14 - something that more and more R packages rely on. There are officially supported solutions to that too, but I’ll leave that for now.

Federico Marini (16:27:42) (in thread): > just my 2 cents, but one can relatively safely say > > make -j 8 > > and speed up the compilation quite nicely:wink:
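Since no one in the thread had a canonical list, here is a hedged sketch of what typically suffices on CentOS/RHEL 7. The package names below are the usual candidates, not an authoritative list, and `yum-builddep` (from yum-utils) can derive the set automatically from the EPEL R package:

```shell
# Option 1 (assumes EPEL is enabled): let yum work out R's build deps
# from the distribution's own R source package.
sudo yum install -y yum-utils epel-release
sudo yum-builddep -y R

# Option 2: name the commonly needed toolchain and -devel libraries directly.
sudo yum install -y \
    gcc gcc-c++ gcc-gfortran make \
    readline-devel zlib-devel bzip2-devel xz-devel pcre2-devel \
    libcurl-devel openssl-devel \
    libX11-devel libXt-devel cairo-devel \
    libpng-devel libjpeg-turbo-devel libtiff-devel \
    texinfo
```

With these in place, Henrik’s configure/make recipe above should run through; if `./configure` still fails, its error message names whichever header is missing.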

2021-10-28

Jun Yu (11:27:45): > @Jun Yu has joined the channel

2021-11-11

Mike Smith (07:50:08): > After a hiatus the Developers Forum will be back next Thursday 18th November at 09:00 PST / 12:00 EST / 18:00 CET - you can find a calendar invite attached and at https://tinyurl.com/BiocDevel-2021-11 > @Jeroen Ooms will give an overview of the R Universe project (https://r-universe.dev/). This project offers individuals or organisations a platform to build, test and release R packages outside of CRAN or Bioconductor. As well as being potentially useful to those who are developing packages, there’s also a tonne of infrastructure engineering behind the build system, and it should be really interesting to learn more about this. > > Please note the meeting details have changed; we’re now on Zoom: https://embl-org.zoom.us/j/99712225827?pwd=ZUdWRGhqalgzRmsxbjY3RkRGRHMzZz09 - File (Binary): BiocDevel-2021-11.ics

Henrik Bengtsson (17:16:11): > metagenomeFeatures (https://bioconductor.org/packages/metagenomeFeatures) was removed in Bioconductor 3.14. Is there a reason why its reverse dependencies greengenes13.5MgDb (bioconductor.org/packages/greengenes13.5MgDb), ribosomaldatabaseproject11.5MgDb (https://bioconductor.org/packages/ribosomaldatabaseproject11.5MgDb), and silva128.1MgDb (https://bioconductor.org/packages/silva128.1MgDb) were not removed at the same time? They all use Imports: metagenomeFeatures.

Henrik Bengtsson (17:17:51) (in thread): > I spotted this while trying to install all Bioc packages; available.packages() shows that those three packages are available. But they obviously fail to install since metagenomeFeatures is not available.

Henrik Bengtsson (17:31:10) (in thread): > A related, but less obvious, case is TOAST (https://bioconductor.org/packages/TOAST), which imports RefFreeEWAS (cran.r-project.org/package=RefFreeEWAS) from CRAN, but that was archived on 2021-09-19 (prior to the Bioc 3.14 release). Of course, contrary to metagenomeFeatures, that CRAN package may very well be revived at any time.

2021-11-12

Mike Smith (08:31:05) (in thread): > At a guess, something like the TOAST example slips through because the build system doesn’t install all dependencies every time, so it was happy to keep building TOAST with the existing installation of RefFreeEWAS. I think I remember instances like this being reported before, e.g. https://stat.ethz.ch/pipermail/bioc-devel/2020-November/017516.html > Not sure how frequently the builder libraries get deleted and repopulated.

Lori Shepherd (14:37:03) (in thread): > Yes, TOAST is interesting because we didn’t pick it up; we would pick it up when we do an R update and then deprecate accordingly. > > The annotation packages were an oversight – we will update them to show as deprecated in 3.14 and remove them in 3.15. @Kayla Interdonato when we branch we should do this.

2021-11-14

Henrik Bengtsson (15:58:21): > Would it be possible for Bioconductor to validate, say before each Bioc release, that the maintainer contact info (packageDescription(pkgname)$Maintainer) is still valid?
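A rough sketch (a hypothetical helper, not an existing Bioc tool) of how such a sweep could start. utils::maintainer() reads the Maintainer field, and the regex below is only a syntactic sanity check; real validation would need an actual mail probe:

```r
## Collect maintainer addresses for a set of installed packages (base
## packages here, just so the sketch is self-contained) and flag any
## that do not even look like an email address.
pkgs <- rownames(installed.packages(priority = "base"))
addresses <- vapply(pkgs, function(p) {
  m <- utils::maintainer(p)           # parses DESCRIPTION's Maintainer field
  sub(".*<([^>]+)>.*", "\\1", m)      # pull the bare address out of "Name <addr>"
}, character(1))
looks_ok <- grepl("^[^@ ]+@[^@ ]+\\.[^@ ]+$", addresses)
addresses[!looks_ok]                  # candidates for manual follow-up
```

The same loop run over a full release library would at least separate syntactically broken entries from addresses that merely bounce.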

Henrik Bengtsson (15:59:41) (in thread): > As an example, I stumbled uponaffydata(bioconductor.org/packages/affydata), with outdated email addresses as I tried to report an issue.

Hervé Pagès (18:08:01) (in thread): > I guess we could do that but then the question is: what do we do about the invalid addresses? > We already spend a lot of time chasing people around during the 6-8 weeks that precede a release if their package is failing on the daily reports. We rely on the maintainer address for that. If the address bounces back, we try to find another valid contact by looking at the Author field or at other packages from the same group/lab, or by googling around (Google Scholar can be very helpful). This is quite time-consuming and, unfortunately, it cannot really be automated. > > What would be easy to automate is to simply deprecate packages for which the Maintainer address bounces back. However I don’t think this is what you’re suggesting for a longtimer like affydata with so many reverse deps (direct or indirect). Community involvement could help here e.g. maybe someone could “adopt” affydata?

Henrik Bengtsson (18:20:08) (in thread): > In general: if not deprecated, then maybe an automatic takeover of maintenance by the Bioc team. I don’t think having packages without a maintainer is good for anyone. (I’m starting to understand some of the aggressive CRAN rules)

Hervé Pagès (18:32:43) (in thread): > Well that’s the thing: we (the BioC team) don’t have the resources to maintain these packages. We’re actually trying to figure out how to put some orphaned longtimers like VariantAnnotation in the hands of the community.

Henrik Bengtsson (19:28:47) (in thread): > Exactly - this is tricky. But I think being stricter on deprecation could be a solution. For example, make it a policy that non-maintained packages will be deprecated, in order to set expectations for the community. Procrastinating will probably just make the problem bigger later on.

2021-11-15

Alan O’C (06:15:04) (in thread): > Is there a list of potential adoptee packages? My guess is that most of the older packages like affydata have probably broken in most ways they can at this point

Alan O’C (06:36:30) (in thread): > Also not sure if being strict on deprecation is the best plan. The recent noise about 900 packages being booted from CRAN because lubridate was failing on Solaris is probably not a good path for Bioc imo

2021-11-16

Alvaro Sanchez (03:16:19): > @Alvaro Sanchez has joined the channel

Nils Eling (03:32:17): > @Nils Eling has joined the channel

Hervé Pagès (04:10:47): > Starting with S4Vectors 0.33.3, DataFrame is finally a virtual class (https://github.com/Bioconductor/S4Vectors/commit/1271fe8592b863f4b53d6d32a18099cfa705963a). Expect some turbulences on Wednesday on the report for BioC 3.15! I’ll try to fix as much as I can.

Kevin Rue-Albrecht (05:08:05): > Just when I had finished writing an episode for The Carpentries :sob: https://carpentries-incubator.github.io/bioc-project/05-s4/index.html#the-dataframe-class :laughing:

Kevin Rue-Albrecht (05:08:40) (in thread): - File (PNG): image.png

Vince Carey (08:03:12) (in thread): > IMHO this falls under the “scope of services” of Bioc, which is not formally defined. What is a “non-maintained” package – one for which the named maintainer does not answer emails? Shall we draft a policy and have it ratified by TAB/CAB? Deprecation and removal are manual processes at this point that consume important core effort. Deprecation is sometimes followed by restoration, which is also time-consuming. Guidance in the domain of controlling external dependencies is surely worth a workup for contributors. Quite a bit of effort has been devoted to reporting on dependency-related vulnerability in BiocPkgTools. Keeping an open ecosystem viable depends strongly on contributor alertness and commitment to fixing on a relatively tight schedule. It would be good to have a system-wide review of vulnerabilities of this ilk – packages that have nontrivial reverse dependencies that themselves show little sign of maintenance – can we define warning signs and mobilize community members to seek out the maintainers of such packages and somehow reduce the risk of collapse? I have done a little work on expanding the package-check reporting at https://vjcitn.github.io/BiocBuildTools/articles/cicd1.html#supporting-developers-with-build-system-enhancements and perhaps that can be pushed forward. - Attachment (vjcitn.github.io): Continuous integration and delivery approaches for Bioconductor > BiocBuildTools

Jenny Drnevich (08:44:40): > Is DataFrame moving to a virtual class going to change much for the end user as opposed to developers? I am almost clueless as to the technical details. One thing I noticed that had changed in the last year or so is that callingclass()on a DataFrame returns “DFrame”, which didn’t use to be the case: > > > library(S4Vectors) > > DF1 <- DataFrame( > + Integers = c(1L, 2L, 3L), > + Letters = c("A", "B", "C"), > + Floats = c(1.2, 2.3, 3.4) > + ) > > DF1 > DataFrame with 3 rows and 3 columns > Integers Letters Floats > <integer> <character> <numeric> > 1 1 A 1.2 > 2 2 B 2.3 > 3 3 C 3.4 > > class(DF1) > [1] "DFrame" > attr(,"package") > [1] "S4Vectors" >

Hervé Pagès (10:55:33) (in thread): > Doesn’t change a thing. AFAICT everything you wrote still stands. That’s because you focus on the API and don’t talk internal representation, which is good. Nice document BTW!

Kevin Rue-Albrecht (10:57:52) (in thread): > Thanks! - File (PNG): image.png

Hervé Pagès (11:03:27) (in thread): > From an end-user point of view nothing will change, except for what’s returned by class() as you already noticed. Same for developers: DFrame is just a concrete DataFrame subclass and should be considered an implementation detail. The less the end user/developer knows about it the better. It would not be desirable to see people start to code specifically towards DFrame (e.g. by using things like is(x, "DFrame") and/or by documenting the argument of their function as being a DFrame) when they don’t need to.

Kevin Rue-Albrecht (11:09:26) (in thread): > To be fair, a lot of what “I” have written is copied from the man page of the class xD

Marcel Ramos Pérez (11:38:59) (in thread): > That’s a clever way to have us read the manual:grin:Cool graphics BTW!

Hervé Pagès (13:49:02) (in thread): > To clarify, the “turbulences” I anticipate on tomorrow’s report are because there are still many serialized DataFrame instances around that will cause problems. I’ve already taken care of many of them (the fix is to simply pass them through updateObject()) but I’m sure tomorrow’s report will reveal many more.:sweat_smile:
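The design point Hervé makes above - code against the virtual parent, never the concrete subclass - can be illustrated with plain S4. The class names here are invented, standing in for DataFrame (virtual) and DFrame (concrete):

```r
library(methods)

## A virtual parent and one concrete subclass, mimicking the
## DataFrame (virtual) / DFrame (concrete) arrangement.
setClass("ParentVirtual", representation("VIRTUAL"))
setClass("ConcreteChild", contains = "ParentVirtual",
         slots = c(x = "numeric"))

obj <- new("ConcreteChild", x = c(1, 2, 3))

class(obj)[1]             # the concrete class leaks out of class()...
is(obj, "ParentVirtual")  # ...but type checks should use is() on the parent
```

Because `is()` follows the class hierarchy, code written against the virtual parent keeps working no matter which concrete subclass an object happens to be.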

2021-11-17

Hervé Pagès (14:07:33): > Gosh.. this is very red: https://bioconductor.org/checkResults/3.15/bioc-LATEST/ > More than I expected :woozy_face: Working on it…

Kasper D. Hansen (17:01:42): > Why change the API? Why not just call the virtual class DFrame?

Kasper D. Hansen (17:02:43): > Probably too late for this comment.

Kasper D. Hansen (17:03:28): > But don’t we have to reserialize everything? Every single SummarizedExperiment.

Hervé Pagès (17:11:01): > No API change. The DataFrame/DFrame change was announced 2 years ago: https://www.bioconductor.org/help/course-materials/2019/BiocDevelForum/02-DataFrame.pdf > Yes, there’s a lot of stuff to reserialize. Still working on it.

Henrik Bengtsson (18:00:25) (in thread): > For the record, I stumbled upon the ‘Installing every CRAN package in R on CentOS 7’ blog post 2016-10-08 (https://clint.id.au/?p=1428), which shows yum install ... instructions for all required system libraries at the time.

2021-11-18

Martin Morgan (11:56:20): > <!here>reminder of today’s developer forum starting in 4 minuteshttps://community-bioc.slack.com/archives/CLUJWDQF4/p1636635008046100 - Attachment: Attachment > After a hiatus the Developers Forum will be back next Thursday 18th November at 09:00 PST / 12:00 EST / 18:00 CET - You can find a calendar invite attached and at https://tinyurl.com/BiocDevel-2021-11 > > @Jeroen Ooms will give an overview of the R Universe project (https://r-universe.dev/). This project offers individuals or organisations a platform to build, test and release R packages outside of CRAN or Bioconductor. As well as being potentially useful to those who are developing packages, there’s also a tonne of infrastructure engineering to create the build system, and it should be really interesting to learn more about this. > > Please note the meeting details have changed; We’re now on Zoom: > https://embl-org.zoom.us/j/99712225827?pwd=ZUdWRGhqalgzRmsxbjY3RkRGRHMzZz09

Martin Morgan (16:21:21): > Great presentation from@Jeroen Oomsabouthttps://r-universe.dev! Thanks so much; video recording should be posted in the next several days…

Henrik Bengtsson (22:10:38) (in thread): > And then I stumbled upon getsysreqs, which queries https://sysreqs.r-hub.io/ for system dependencies for one or several R packages: > > # remotes::install_github("mdneuzerling/getsysreqs") > > getsysreqs::get_sysreqs(c("plumber", "rmarkdown"), distribution = "centos", release = "7") > [1] "libcurl-devel" "openssl-devel" "make" "pandoc" > [5] "libicu-devel" "libsodium-devel" >

2021-11-19

Jeroen Ooms (04:06:52): > Thank you, the slides are here:https://jeroen.github.io/bioc2021

Chris Vanderaa (10:49:48): > Hello! I need advice regarding the implementation of an updateObject() method.Context: I’m helping maintain the QFeatures class that inherits directly from MultiAssayExperiment. QFeatures contains an additional attribute/slot called @assayLinks (required to contain AssayLinks objects). We recently realized that the updateObject() method (inherited from MultiAssayExperiment) does not work on QFeatures because the default value for the @assayLinks slot (generated after calling new()) is invalid when the QFeatures object is not empty. I therefore need to implement an updateObject() for QFeatures objects.Implementation:the current solution I have is > > setMethod("updateObject", "QFeatures", > function(object, ..., verbose = FALSE) > { > if (verbose) > message("updateObject(object = 'QFeatures')") > ## Store slots that are specific to QFeatures > al <- object@assayLinks > ## Update the MAE slots > ## We use updateObjectFromSlots to update only MAE slots. > ## Note that the function throws a warning when slots are > ## dropped ("version" and "assayLinks" in this case). > ans <- suppressWarnings(updateObjectFromSlots( > object, objclass = "MultiAssayExperiment" > )) > ## Create the updated QFeatures object > new("QFeatures", > ExperimentList = experiments(ans), > colData = colData(ans), > sampleMap = sampleMap(ans), > metadata = metadata(ans), > assayLinks = al) > } > ) > > My question: does the code make sense to you? Does this implementation follow the Bioc expectations?

Hervé Pagès (12:16:43): > An updateObject() method should update the existing object instead of trying to create a new object from scratch. Concretely this means that it should update the slots of the object instead of creating a new object with new(). > It looks like the method for MultiAssayExperiment objects is using new(). However, even if they are careful to do new(class(object), ...) (supposedly to play nice with subclasses), this doesn’t work, because, as you noticed, this approach won’t be able to set the additional slots of the derived objects properly. > > Once they’ve fixed this, your updateObject() method will be: > > setMethod("updateObject", "QFeatures", > function(object, ..., verbose=FALSE) > { > if (verbose) > message("updateObject(object = 'QFeatures')") > object@assayLinks <- updateObject(object@assayLinks, ..., verbose=verbose) > callNextMethod() > } > ) > > A general rule is that methods for QFeatures objects should be agnostic of the internal representation of their parent class. Concretely this means that you should only deal with your own slots (assayLinks) and use callNextMethod() to let parent methods deal with the rest of your object. > > Please open an issue on GitHub for MultiAssayExperiment. Thanks!
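The callNextMethod() pattern Hervé recommends can be exercised with throwaway classes; all names here are made up, and the `touch` generic simply stands in for updateObject():

```r
library(methods)

setGeneric("touch", function(object, ...) standardGeneric("touch"))

setClass("Base", representation(a = "numeric"))
setClass("Derived", contains = "Base", representation(b = "numeric"))

setMethod("touch", "Base", function(object, ...) {
  object@a <- object@a + 1      # the parent only touches its own slot
  object
})

setMethod("touch", "Derived", function(object, ...) {
  object@b <- object@b * 2      # handle the subclass-specific slot...
  callNextMethod(object)        # ...then delegate the rest to the parent
})

d <- touch(new("Derived", a = 1, b = 3))
d@a  # updated by the Base method
d@b  # updated by the Derived method
```

Passing the partially updated object to callNextMethod() is the key move: each class in the hierarchy only ever sees and modifies its own slots.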

Laurent Gatto (12:25:17) (in thread): > Thank you Hervé for your help!

Kayla Interdonato (14:57:11): > The recording from yesterday’s developer forum is now available on the Bioconductor YouTube channel (https://www.youtube.com/watch?v=JefyrHl63-0) as well as the course materials (https://www.bioconductor.org/help/course-materials/) - Attachment (YouTube): Developers Forum 24

Marcel Ramos Pérez (19:56:33) (in thread): > Thanks, this should be fixed in MultiAssayExperiment v1.21.3.

2021-11-22

Chris Vanderaa (04:30:46) (in thread): > Once again, thank you very much@Hervé Pagèsand@Marcel Ramos Pérez!:pray:

2021-11-24

Helge Hecht (13:15:47): > @Helge Hecht has joined the channel

2021-11-26

Francesc Català (06:43:42): > @Francesc Català has left the channel

Francesc Català (06:45:46): > @Francesc Català has joined the channel

2021-11-29

Jenny Drnevich (14:32:53): > My local compute cluster admin is trying to set up an internal mirror of Bioconductor but is having trouble. Here is the Rprofile.site: > > if (file.exists('/private_stores/mirror/R/cran')) { >         suppressMessages(options("repos" = c(CRAN="file:///private_stores/mirror/R/cran"))) > } > if (file.exists('/private_stores/mirror/R/bioconductor')) { >         suppressMessages(options(BIOCONDUCTOR_CONFIG_FILE="file:///private_stores/mirror/R/bioconductor/config.yaml")) >         suppressMessages(options(BIOCONDUCTOR_ONLINE_VERSION_DIAGNOSIS=FALSE)) >         suppressMessages(options(BioC_mirror = "file:///private_stores/mirror/R/bioconductor")) > } > > Here’s what the admin told me: “about to give up on bioconductor. i think there is a bug in BiocManager. it won’t let me use a local mirror. I kept on getting ‘Error: Bioconductor version cannot be validated; no internet connection?’ which is supposed to get disabled by setting the BIOCONDUCTOR_ONLINE_VERSION_DIAGNOSIS variable to false, but it isn’t honoring it. After plenty of reinstalls, getting Error in `[<-.numeric_version`(`*tmp*`, , 2, value = NA_integer_) : when trying to do any BiocManager function. my guess is it either doesn’t support file:// anymore and it must be an http site, or it’s just some other bug. cran works fine.”

Martin Morgan (14:40:52): > Hi @Jenny Drnevich, I opened an issue at https://github.com/Bioconductor/BiocManager/issues/122

Jenny Drnevich (14:41:52) (in thread): > Thanks! I wasn’t sure of the right location to ask.

Dario Strbenac (18:34:37): > @Dario Strbenac has joined the channel

Dario Strbenac (19:00:01): > I broke S4 method dispatch in the development branch of ClassifyR by defining a setClassUnion with the same pair of classes but a different union name from the one already in S4Vectors (i.e. DataFrame_OR_NULL). I expect the second output in both cases. The first scenario results in the DataFrame being treated like a SimpleList. Any insights from the experts? R doesn’t emit any warnings about the class union trampling an existing one, so I would not be aware that it was unless Hervé told me. > > library(S4Vectors) > setClassUnion("DataFrameOrNULL", c("DataFrame", "NULL")) > DataFrame(A = 1:3, B = 4:6) > > R version 4.1.2 and S4Vectors 0.32.3 output: > > DataFrame of length 2 > names(2): A B > > R Under Development and S4Vectors 0.33.5 output: > > DataFrame with 3 rows and 2 columns > A B > <integer> <integer> > 1 1 4 > 2 2 5 > 3 3 6 >

2021-12-07

Dirk Eddelbuettel (12:50:34): > (Not quite sure where to put this so I put it here). December is the time of the year when I go about updating the BH package, compromising between remaining somewhat current yet not being quite as volatile as Boost itself, which has releases every four months. So I just looked at the CRAN page and noticed this:

Dirk Eddelbuettel (12:50:36): - File (PNG): image.png

Dirk Eddelbuettel (12:51:20): > The package actually has no R code. It is just a vessel for C++ headers. So Depends: is off (and outdated), as is Imports:. Worth cleaning up, one day? Maybe for the next Bioconductor release in a few months? Obviously not urgent as it does zero harm. (And yes, more packages have LinkingTo: as they should…)

2021-12-10

Alex Mahmoud (10:04:00): > @Alex Mahmoud has joined the channel

2021-12-14

Megha Lal (08:24:39): > @Megha Lal has left the channel

2021-12-15

Roger Vargas (11:14:34): > @Roger Vargas has joined the channel

2022-01-06

Kurt Showmaker (14:05:31): > @Kurt Showmaker has joined the channel

2022-02-01

Stephanie Hicks (20:24:27): > @Stephanie Hicks has left the channel

2022-02-14

Federico Marini (07:44:31): > Ever returning discussion/Consideration:https://twitter.com/jokergoo_gu/status/1493113544040239106?s=20&t=7iHqBx2UsKPE1mlLN7RzQA - Attachment (twitter): Attachment > My new preprint: pkgndep: a tool for analyzing dependency heaviness of R packages . We proposed a new metric named “dependency heaviness” to answer the question: “which parent contributes high dependencies to its child package?” > https://doi.org/10.1101/2022.02.11.480073 https://pbs.twimg.com/media/FLfd56RWUAsQkMW.jpg

2022-02-23

Susanna (12:08:36): > @Susanna has joined the channel

2022-02-24

Mike Morgan (11:59:41): > I’m currently working on porting some new developments in the miloR package into Rcpp, but struggling conceptually with passing R sparse matrices into my functions - what is the current (best-ish) practice for this? I’ve seen there is an RcppSparseMatrix package in development that defines sparse matrix classes, but it isn’t currently available on CRAN for R 4.1.1. Is the advice to follow the Rcpp Gallery example and define a sparse matrix class internally in my package (à la https://gallery.rcpp.org/articles/sparse-matrix-class/)? - Attachment (gallery.rcpp.org): Constructing a Sparse Matrix Class in Rcpp > Creating a lightweight sparse matrix class in Rcpp

Dirk Eddelbuettel (12:05:41): > Hi Mike – the rcpp-devel list may be more on point :slightly_smiling_face: One of the problems with sparse matrices is the “sparse” (bad pun alert triggered) support in base R. My usual approach is to construct matrix types supported by the Matrix package, and on the C++ side to stick with, for example, RcppArmadillo, which has some support and some converter examples. Another example is a (small, simple) import + export helper for tiledb, see https://github.com/TileDB-Inc/TileDB-R/blob/master/R/SparseMatrix.R

Dirk Eddelbuettel (12:07:56): > There are alsoslam,spam,SparseM, … on CRAN but we don’t really have much in terms of a unification.

Mike Morgan (12:18:52): > Thanks for the speedy reply Dirk. OK - I just wanted to make sure that I wasn’t missing some super obvious class definition somewhere.

Dirk Eddelbuettel (12:21:17): > No worries. And at the interface we’re back to dealing with what R itself supports, so it is likely Rcpp::S4 for the access to Matrix objects.
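A small sketch of the Matrix-package route Dirk describes: the dgCMatrix class stores its data in compressed sparse column form in the i/p/x slots, which is what C++ code receiving the S4 object would read out:

```r
library(Matrix)  # ships with every R installation

## Build a 3x3 sparse matrix; numeric x gives a dgCMatrix (CSC layout).
m <- sparseMatrix(i = c(1, 3, 2), j = c(1, 1, 2),
                  x = c(10, 20, 30), dims = c(3, 3))

inherits(m, "dgCMatrix")  # TRUE
m@i  # 0-based row indices, as C++ code expects
m@p  # column pointers (one entry per column, plus one)
m@x  # the non-zero values
```

On the C++ side an Rcpp function would take the object as Rcpp::S4 and pull those three slots, or let RcppArmadillo convert it to an arma::sp_mat.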

Peter Hickey (16:23:04): > beachmat(https://bioconductor.org/packages/devel/bioc/html/beachmat.html) by@Aaron Lunmay also be suitable - Attachment (Bioconductor): beachmat (development version) > Provides a consistent C++ class interface for reading from and writing data to a variety of commonly used matrix types. Ordinary matrices and several sparse/dense Matrix classes are directly supported, third-party S4 classes may be supported by external linkage, while all other matrices are handled by DelayedArray block processing.

Aaron Lun (16:23:26): > jesus this hummus sound is terrifying

Dario Strbenac (21:30:54): > We are beginning the process of converting a package’s documentation from hand-written Rd files into Roxygen2 format. However, I looked at S4Vectors, AnnotationHub and GenomicFeatures to get some ideas about the syntax for S4 methods and I found that they all use the traditional Rd writing approach. I also don’t find any mention of it in the Bioconductor Package Guidelines for Developers and Reviewers. Are there downsides to using it for a package that makes extensive use of S4 methods? (Also, I find no minimal example in Roxygen2’s vignette.) > > S4 methods are a little more complicated. Unlike S3 methods, all S4 methods must be documented. You can document them in three places … > … yes, but what’s the syntax to use?

Kasper D. Hansen (21:45:22): > I’m guessing you’re going to see little support for S4 because the roxygen2 people don’t use it. And I have to say I find roxygen2 to not really be an improvement.

Peter Hickey (21:47:45): > are you planning to do this manually or with something likehttps://yihui.org/rd2roxygen/? - Attachment (yihui.org): Rd2roxygen - Yihui Xie | 谢益辉 > The package Rd2roxygen helps R package developers who used to write R documentation in the raw LaTeX-like commands but now want to switch their documentation to roxygen2, which is a convenient tool …

Dirk Eddelbuettel (21:48:50): > I use it a little. Ande.gsearching for@slotat GH gets 60-some hits:https://cs.github.com/?q=org%3Acran+%40slot

Dario Strbenac (21:50:42): > Ah,Rd2roxygen could be convenient to see how the Rd file translates for S4 methods. I’ll give it a go.

Kasper D. Hansen (22:00:51): > I’ll also add that I don’t love our support for S4 in the first place. You can write man pages but they tend to be much less structured than “normal” pages. Also an issue - IMO - when you have methods for different signatures in different packages.
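For the concrete syntax question upthread, a minimal hedged sketch of roxygen2 tags for an S4 class and method. The class is invented for illustration; the #' lines are ordinary comments, so the snippet runs as plain R, and roxygen2 would turn them into Rd files:

```r
library(methods)

#' An example container for numeric values.
#'
#' @slot values Numeric payload stored by the object.
#' @exportClass ExampleBox
setClass("ExampleBox", slots = c(values = "numeric"))

#' @describeIn ExampleBox Number of stored values.
#' @param x An ExampleBox object.
#' @exportMethod length
setMethod("length", "ExampleBox", function(x) length(x@values))

b <- new("ExampleBox", values = c(1, 2, 3))
length(b)  # dispatches to the S4 method
```

roxygen2 also supports @rdname and @aliases for collecting several methods into one Rd file, which is closer to the combined man pages the core packages use.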

Marcel Ramos Pérez (22:18:10) (in thread): > We also have some examples here:https://code.bioconductor.org/search/search?q=%40slot

2022-02-25

Joshua Sodicoff (16:20:15): > @Joshua Sodicoff has joined the channel

2022-03-04

Dario Strbenac (00:00:00): > Interestingly, Rd2roxygen doesn’t work well, either. I ran into the issue which other people described in 2017. It only converts one section into Roxygen if you have multiple sections (e.g. Constructor, Accessors, Summary - I have used Herve Pages’ documentation of core packages as the basis for my documentation) and it mangles \item lists within \describe.

Dario Strbenac (01:00:05) (in thread): > Yes, it looks clunky. > > \describe{ > \item{}{ > \code{show(result)}: Prints a short summary of what \code{result} contains. > }} >

Hervé Pagès (03:29:26) (in thread): > Oouch, I hope you picked up a more or less recent man page like https://github.com/Bioconductor/GenomicRanges/blob/master/man/GPos-class.Rd In retrospect there are some very old man pages that I’m not too proud of. Would need to spend some time updating/rewriting them.

2022-03-14

Federico Marini (06:46:48): > Not sure which one is the most adequate venue for this question, but I think this might be a good starter. > > StringDB usage in R as of now is relatively confined to the stringdb package itself. > I was wondering whether there is a need to have db dumps of the PPIs from that, e.g. into the ExperimentHub framework. > I do recall I discussed a little with @Ludwig Geistlinger re: BioPlex, and I think that is somehow supported at least by a caching system. > Wondering if others might find it useful to have a compact command to fetch that info in a programmatic way, to be further used in the steps of interpretation

Ludwig Geistlinger (10:04:33): > I think both stringdb and bioplex would benefit from being accessible via a Hub. Whether it should be ExperimentHub is a good question, I think we would actually need something like a NetworkHub.

Vince Carey (10:22:57) (in thread): > where does ndex fit in? ndexbio.org - can we take advantage of that system in conceiving a “network hub”?

Ludwig Geistlinger (10:28:55) (in thread): > yes it would actually be a great starting point for a network hub in having collected already a large variety of relevant networks. Similar to StringDB, we have the ndexr Bioc package which implements downloading from the ndex server.

Vince Carey (11:02:14) (in thread): > we have a relationship with Dexter Pratt through NCI ITCR - consider an R21 or U01 to get this funded

Federico Marini (11:12:11): > I was thinking of EHub because the definition of “Experiment” in this case is quite relaxed

Federico Marini (11:12:22): > but not “really” an AnnotationHub

Federico Marini (11:12:49): > I mean, in principle it is the same construct behind it, only the A or the E changing

Federico Marini (11:13:45): > what I would love to have is, as I said, the “one click to the format one might need”, possibly with interconversions, leveraging whatever is already in there

2022-03-17

Martin Morgan (10:30:14): > @Mike Smith I was wondering if a developer forum on basilisk would be a good idea? Maybe @Alan O’C (or @Luke Zappia or of course @Aaron Lun) could present? Maybe a general introduction / why the Bioconductor community should adopt it, a brief walk through how a developer might write a package that uses it, and enough of a glimpse ‘under the hood’ so that developers with unusual circumstances can grok what is going on.

Alan O’C (17:36:11) (in thread): > I don’t know that I’d be the best person for the job, but I’d be happy to do it. Having said that, I am in “strictly no new commitments” mode until May

2022-03-20

Sarvesh Nikumbh (21:40:49): > @Sarvesh Nikumbh has joined the channel

2022-03-22

Michael Lawrence (18:15:57): > @Michael Lawrence has joined the channel

2022-03-24

Martin Morgan (11:38:16) (in thread): > @Alan O’C would you be up for something in late May? I think the traditional developer forum schedule is the third Thursday (May 19) at 12 noon US Eastern

2022-03-25

Alan O’C (10:27:30) (in thread): > I can’t commit at the moment but I can say a tentative probably

Martin Morgan (10:29:53) (in thread): > OK thanks that’s good enough for now; hopefully it wouldn’t be a big lift for you, kind of fun to put it together & share! Thanks

2022-03-31

Federico Marini (12:02:47): > I cannot find a “dedicated” channel for this, so I am putting it here for us all developers. > I am toying with the idea of having a binder instance for packages I am developing, where all dependencies are set and users can “try before~buy~even installing”. > Do any of you have previous experiences and tips for this?

Jenny Drnevich (12:09:57): > pwd

Jenny Drnevich (12:10:07) (in thread): > Oops!

Federico Marini (12:16:56) (in thread): > :slightly_smiling_face:

Alan O’C (12:19:18) (in thread): > /home/bioc-slack/

Vince Carey (14:31:04): > @Federico Marini check with @Sean Davis?

Dirk Eddelbuettel (14:37:48): > One alternative I became aware of very recently thanks to another R dev is https://gitpod.io - allows one to add any (public) Docker container. Might be easier to set up than binder. Have not played yet myself.

Sean Davis (14:49:09) (in thread): > I can easily host containers that you control on Orchestra. The downside of binder for R is that the build time can be rather long. Orchestra instances will launch ~immediately if no autoscaling is required and in ~90 seconds if autoscaling kicks in.

Sean Davis (14:50:11) (in thread): > As the Bioc Build System develops, I can see us having docker containers and Orchestra launchers for all packages.

2022-04-01

Federico Marini (06:00:57) (in thread): > Yeah I was thinking that could be some wider-scope aim that would be made possible. > Could be excellent in demo’ing software packages

Federico Marini (06:01:17) (in thread): > THanks Dirk! Will look into that

2022-04-04

Sean Davis (13:46:24) (in thread): > Happy to chat,@Federico Marini, to get a better sense of what you have in mind and to see if there is anything we can work on together.

2022-04-08

Ludwig Geistlinger (09:48:55): > Hi developers,@Vince Careywill speak about “Open Computational Ecosystems for Modern Genomics” on Monday, April 11th, 3-4 PM EST. Feel free to tune inhttps://harvard.zoom.us/j/97173440183?pwd=eHI1ODRub0p5NGNEZncwU0lURlJjdz09

2022-04-15

Nicole Ortogero (15:19:42): > @Nicole Ortogero has joined the channel

2022-04-18

Dirk Eddelbuettel (10:55:58): > Is there an equivalent to tools::CRAN_package_db() for Bioconductor? When ‘computing’ over packages that need to be built or updated this db is really handy. I would love to query something like this for, say, Rgraphviz to be told it needs graph, BiocGenerics. How would I go about that?

Hervé Pagès (11:25:20): > If available.packages(contrib.url(BiocManager::repositories())) doesn’t provide the fields you’re looking for, maybe try > > views <- tempfile() > download.file("[https://bioconductor.org/packages/3.15/bioc/VIEWS](https://bioconductor.org/packages/3.15/bioc/VIEWS)", views) > bioc_db <- read.dcf(views) > > It has a lot more fields, e.g. rev deps through the dependsOnMe, importsMe, and suggestsMe fields.
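Building on Hervé’s snippet, reverse dependencies can then be looked up directly from the parsed VIEWS matrix - a sketch, assuming the `dependsOnMe` field he mentions is present for the package queried:

```r
views <- tempfile()
download.file("https://bioconductor.org/packages/3.15/bioc/VIEWS", views)
bioc_db <- read.dcf(views)  # one row per package, one column per field

# Locate the Rgraphviz row and split its comma-separated rev deps.
# Returns NA if the field is absent for that package.
row <- match("Rgraphviz", bioc_db[, "Package"])
rev_deps <- bioc_db[row, "dependsOnMe"]
strsplit(rev_deps, ",[[:space:]]*")[[1]]
```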

Martin Morgan (11:25:41) (in thread): > I would have used base R > > > db = available.packages(repos = BiocManager::repositories()) > > tools::package_dependencies("Rgraphviz", db, recursive=TRUE) > $Rgraphviz > [1] "methods" "utils" "graph" "grid" "stats4" > [6] "graphics" "grDevices" "BiocGenerics" "stats" > > This isn’t ideal because BiocManager::repositories() only provides information for the version of Bioconductor relevant to your version of R; you could work around this with > > repos = sub("packages/[0-9.]+", "packages/3.13", BiocManager::repositories()) >

Dirk Eddelbuettel (11:27:06) (in thread): > That’s spot on Martin – thank you. The rest of my test uses the same functions from tools so getting the db via available.packages() is the missing link.

Dirk Eddelbuettel (11:27:57): > Also nice. Looking at the rendered html I suspected it had to be there somewhere:slightly_smiling_face:

2022-04-19

Sean Davis (09:31:08) (in thread): > The BiocPkgTools package (https://www.bioconductor.org/packages/release/bioc/html/BiocPkgTools.html) has some additional functionality, but it sounds like you probably have the answer you need. - Attachment (Bioconductor): BiocPkgTools > Bioconductor has a rich ecosystem of metadata around packages, usage, and build status. This package is a simple collection of functions to access that metadata from R. The goal is to expose metadata for data mining and value-added functionality such as package searching, text mining, and analytics on packages.

Dirk Eddelbuettel (09:34:49) (in thread): > Thanks @Sean Davis – I am in fact coming from a (working) base R setup which I need to complement with some Bioconductor packages, so I have a preference for existing working workflows rather than new bells and whistles.

Dirk Eddelbuettel (20:17:43) (in thread): > On second thought the dependency footprint of BiocPkgTools is a bit on the heavy side so I won’t use it here. Thanks though.

2022-04-25

Sean Davis (13:51:30) (in thread): > Thx for the feedback,@Dirk Eddelbuettel. The BiocPkgTools package is a bit of a utility package and probably doesn’t adhere to “do one thing and do it well.”

2022-05-03

Ray Su (06:55:44): > @Ray Su has joined the channel

2022-05-05

Flavio Lombardo (05:56:48): > @Flavio Lombardo has joined the channel

2022-05-06

Dirk Eddelbuettel (11:14:09) (in thread): > For completeness, what I was working on is now ‘out’ and announced; 19k binary packages for 20.04 with (parts of) 3.14 – and I am currently adding 22.04 with (parts of) 3.15. https://twitter.com/eddelbuettel/status/1522220397357244416 - Attachment (twitter): Attachment > Using #RStats on @ubuntu LTS cloud, server, desktop?
> > The new #CRANapt repo has 19000 .deb binaries with full dependencies and apt integration. Demo of a full tidyverse installation in 18 seconds (on @awscloud) below. > > More details at https://eddelbuettel.github.io/r2u/

Martin Morgan (13:41:49) (in thread): > Pretty neat, Dirk! FWIW on the bioconductor/bioconductor_docker (e.g., :devel or :RELEASE_3_15) containers, a new feature BiocManager::install("tidyverse") (any Bioconductor package or its dependencies, which happen to include tidyverse) installs from a repository of pre-built binaries. > > > BiocManager::install("tidyverse") > 'getOption("repos")' replaces Bioconductor standard repositories, see > '?repositories' for details > > replacement repositories: > CRAN:[https://packagemanager.rstudio.com/cran/__linux__/focal/latest](https://packagemanager.rstudio.com/cran/__linux__/focal/latest)... > trying URL '[https://bioconductor.org/packages/3.16/container-binaries/bioconductor_docker/src/contrib/colorspace_2.0-3_R_x86_64-pc-linux-gnu.tar.gz](https://bioconductor.org/packages/3.16/container-binaries/bioconductor_docker/src/contrib/colorspace_2.0-3_R_x86_64-pc-linux-gnu.tar.gz)' > Content type 'application/x-tar' length 2625739 bytes (2.5 MB) > ================================================== > downloaded 2.5 MB > ... > > trying URL '[https://bioconductor.org/packages/3.16/container-binaries/bioconductor_docker/src/contrib/tidyverse_1.3.1_R_x86_64-pc-linux-gnu.tar.gz](https://bioconductor.org/packages/3.16/container-binaries/bioconductor_docker/src/contrib/tidyverse_1.3.1_R_x86_64-pc-linux-gnu.tar.gz)' > Content type 'application/x-tar' length 425138 bytes (415 KB) > ================================================== > downloaded 415 KB > > * installing **binary** package 'colorspace' ... > * DONE (colorspace) > ... > * installing **binary** package 'tidyverse' ... > * DONE (tidyverse) > > The downloaded source packages are in > '/tmp/RtmpMtysxN/downloaded_packages' > Updating HTML index of packages in '.Library' > Making 'packages.html' ... 
done > > I’m not sure timing-wise how this compares to your approach (I have to do something, I think, to make docker performant on M1 macs…) and it relies on the docker container designed to run (almost) all Bioconductor packages. > > Would be neat to hear what would be required to extend your approach to Bioconductor packages (not sure if the Bioconductor release cycle causes problems…).

Dirk Eddelbuettel (14:04:24) (in thread): > Do your premade binaries account for system dependencies?

Martin Morgan (14:16:31) (in thread): > Yes, the docker container comes with required system dependencies, mostly here.

Dirk Eddelbuettel (14:17:41) (in thread): > I see. That’s nice in that it covers it, and it is a wee bit more restrictive than what I do as it works only within / on top of your container. Still, binaries are good. And choice is good.

Martin Morgan (14:38:53) (in thread): > (should say that this is mostly the work of@Nitesh Turaga; I’m just a fan of approaches like these that speed / ease package installation)

Dirk Eddelbuettel (14:42:23) (in thread): > An under-complained-about aspect also is that the ‘application package manager’ being separate from the system one means that when the system upgrades, say, libgsl23 to libgsl25, all your R packages using GSL break. Can’t happen when you integrate the packaging, which is what I have done, but it gets esoteric:slightly_smiling_face:And it limits you to setups where you can. As your containers are ephemeral it is also less of a concern. On longer-lived servers, desktops, laptops it is nice to not get bitten by this. > > But yes, ease-of-use and speed and lack of installation failures are bigger factors.

2022-05-10

Daniel Adediran (05:05:35): > @Daniel Adediran has joined the channel

2022-05-11

Sean Davis (17:03:15): > Perhaps a bit off-topic, but does anyone have a good Bioc-centric Contributor markdown for inclusion in packages that incorporates bioc coding style recs, etc.?

2022-05-13

Leonardo Collado Torres (10:17:19): > styler::style_file("file.Rmd", transformers = biocthis::bioc_style()) is something I use… likely not the answer you were looking for

Sean Davis (13:42:43) (in thread): > Not what I was looking for, but even better!

2022-05-16

Pedro Sanchez (07:02:05): > @Pedro Sanchez has joined the channel

Pedro Sanchez (07:44:55): > Hi all! Can someone explain a little bit more in depth what’s the difference between SnowParam and MulticoreParam’s distributed and shared memory computing?

Leonardo Collado Torres (08:04:56) (in thread): > Hehe :) I’m glad you found it useful ^^

Vince Carey (13:53:05): > I’ll take a shot at it. With SnowParam, you are orchestrating independent possibly heterogeneous task executors. Each one needs to get sufficient information to do all aspects of the task – packages, functions, data must be fully available to each executor separately. With MulticoreParam, the executors can share many aspects of resources necessary to carry out the task, because they are in the same shared-memory system and each executor is started with the fork() system call that replicates resources available in the orchestrating process.
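At the call site the two backends look identical in BiocParallel; only the execution model Vince describes differs. A sketch, assuming BiocParallel is installed (`slow_square` is a made-up example function):

```r
library(BiocParallel)

slow_square <- function(i) { Sys.sleep(0.1); i^2 }

# MulticoreParam: workers are fork()ed from this process and share its
# memory pages (Unix-alikes only; not available on Windows).
res_mc <- bplapply(1:4, slow_square, BPPARAM = MulticoreParam(workers = 2))

# SnowParam: workers are independent R processes; packages, functions,
# and data must be shipped to each one separately.
res_snow <- bplapply(1:4, slow_square, BPPARAM = SnowParam(workers = 2))

# Same results either way; only how they were computed differs.
identical(res_mc, res_snow)
```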

Kasper D. Hansen (14:12:04): > However, experience tells us that even with fork() (MulticoreParam) it is extremely easy to get memory duplication.

2022-05-17

Pedro Sanchez (03:21:45): > Thanks guys! I got it:grin:

Vince Carey (07:56:08): > Just to make sure I understand @Kasper D. Hansen’s point – if you are using MulticoreParam on a machine with G gigs of RAM and you want k workers, you probably need to be sure that no worker is using more than G/(k+1) gigs or you can crash everything with “cannot allocate” errors.

Vince Carey (07:59:45): > I am stating this very crudely and refinements are welcome. One of the attractions (to me) of kubernetes is that in principle one can start to address the notion that different parts of the problem have different resource requirements at different points in the workflow, and one can “program” this in a potentially fault-tolerant/autoscaling way. I do not have any clear examples illustrating these potentials.

2022-05-18

Dario Strbenac (05:00:00): > bpparam() will choose good options for whatever computer the software is running on, so it makes a good default parameter for package code.
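That default-parameter pattern can look like this (a sketch; `analyzeColumns` is a made-up function name):

```r
library(BiocParallel)

# Expose the backend as an argument defaulting to bpparam(), which
# picks a sensible backend for the current machine; callers can still
# override it, e.g. with SerialParam() for debugging.
analyzeColumns <- function(x, BPPARAM = bpparam()) {
    bplapply(seq_len(ncol(x)), function(ii) sum(x[, ii]), BPPARAM = BPPARAM)
}
```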

Kasper D. Hansen (13:08:07): > I am commenting on the tantalizing prospect of shared memory between processes with fork. Say for example you have a massive matrix X and you want to do computations across columns. You can then do something like > > mclapply(seq_len(ncol(X)), function(ii) sum(X[,ii]), mc.cores = 10) > > with the hope that X does not get duplicated 10 times. I am assuming the function appears to be read-only on the object. However, in practice, I have often observed that X does in fact get duplicated and it is extremely hard to reason about when or why.

Martin Morgan (13:41:54) (in thread): > The reason for this might be here: https://stat.ethz.ch/pipermail/r-devel/2015-July/071554.html, where the garbage collector actually touches the memory page holding the object, hence marks it for copying in the forked process. The implication is that eventually all memory will be duplicated; there’s kind of a race between the persistence of the forked process and duplication by the garbage collector.

Leonardo Collado Torres (14:19:38) (in thread): > ohh!!

Martin Morgan (16:07:04) (in thread): > I’m not sure, but @JiefeiWang’s SharedObject package (see the vignette) might avoid this kind of effect – the data are memory-mapped and I believe marked ‘read-only’, so possibly outside R’s garbage collector… Would be interesting to explore…

JiefeiWang (16:35:11) (in thread): > It is a known issue that the memory of the forked R process will eventually get copied. If you know that the object should never be duplicated, there is an option for enforcing it in SharedObject. For example, in section 5 of the vignette > > share(1:4, copyOnWrite = FALSE) > > This will force R to use the same object even if you are trying to change its value. Therefore, you can create a single large matrix with copyOnWrite = FALSE and then send it to all workers (the worker does not have to be a forked process); any change in the matrix will never duplicate the object and the change can be observed by all workers immediately.

Kasper D. Hansen (22:06:16) (in thread): > I knew it was the fault of the garbage collector but I never had a solution until now. I’ll check the SharedObject package out.

Kasper D. Hansen (22:07:23) (in thread): > Also, while I knew it was the fault of the garbage collector, I also found it very hard to reason about.

2022-06-07

Simon Pearce (09:27:56): > @Simon Pearce has joined the channel

Simon Pearce (09:30:47): > Is there a way to set a global default in a package? I would like to be able to specify a genome (e.g. hg19 or hg38) and have various functions in my package load the correct data object based on that (with manual specifying if not already defined)

Dirk Eddelbuettel (09:36:52) (in thread): > “Many”. My current favorite (for an entire config file) is something like this (which requires R 4.0.0 or later). The code example pulls a per-package config file location > > .defaultConfigFile <- function() { > pkgdir <- tools::R_user_dir(packageName()) # ~/.local/share/R/ + package > if (dir.exists(pkgdir)) { > fname <- file.path(pkgdir, "config.dcf") > if (file.exists(fname)) { > return(fname) > } > } > return("") > } >

Dirk Eddelbuettel (09:40:36) (in thread): > I then read from the config file via cfg <- read.dcf() and access the (named) values as cfg[1, , drop=TRUE]. Now, if you just want a value, a nice way is getOption("defaultGene", "hg19") which allows your users to specify another one (via options()), but if they don’t your default of hg19, say, holds.

Simon Pearce (09:43:49) (in thread): > Do you have an example of that in use?

Dirk Eddelbuettel (09:46:01) (in thread): > Yes, both. Which one are you after?

Simon Pearce (09:46:46) (in thread): > The config one, I think I get how to use the getOption one

Dirk Eddelbuettel (09:48:39) (in thread): > Sure. There is an entire (not particularly efficiently re-used) set in something I wrote recently: https://github.com/eddelbuettel/r2u/blob/master/R/init.R It does the second step too and sticks the values it reads into a per-package ‘global but hidden’ environment.

Dirk Eddelbuettel (09:50:54) (in thread): > There are also a bunch of CRAN packages helping with configuration storage and retrieval, so if you want to outsource this (at the small cost of an extra dependency) you can too, but my needs are often simple enough for the code above.

Simon Pearce (09:52:45) (in thread): > Ok, thanks, I’ll take a look. I think the getOption one will be most appropriate for me.

Spencer Nystrom (10:11:35) (in thread): > As usual, Dirk’s spot on here. My only advice for the getOption approach is if you are calling it in several spots in your codebase, you should abstract it into a private function so you can easily update the global default of your package in the future.
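That abstraction can be as small as a private getter (a sketch; the option name "mypkg.genome" and the helper name are illustrative):

```r
# Private helper centralizing the package-wide default; every function
# that needs a genome calls this instead of getOption() directly, so
# the default can be changed in one place.
.getGenome <- function(genome = NULL) {
    if (!is.null(genome)) return(genome)
    getOption("mypkg.genome", "hg19")
}

# Users override the default globally with:
# options(mypkg.genome = "hg38")
```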

Dirk Eddelbuettel (10:12:58) (in thread): > Yes I did just that as well with getter helpers in the few packages where I now do that. And in general I quite like this scheme.

Spencer Nystrom (10:15:07) (in thread): > Yeah while I usually am not a fan of reasoning about global state, it has come in handy in a few spots for me, at least from a UX perspective anyway.

2022-06-08

Simon Pearce (04:27:17): > I have a function in my package to extend various dplyr functions on the qseaSet objects defined in the qsea package. So I have a function called arrange.qseaSet that calls my local function to do the work, like this: > > arrange.qseaSet <- function(.data, ..., .by_group = FALSE){arrangeQset(.data, ..., .by_group)} > > This seems to work fine, but when I check with devtools it gives me this error: > > ❯ checking R code for possible problems ... NOTE > Error: object 'arrange' not found whilst loading namespace 'mesa' > Execution halted > > How do I fix that?

Alan O’C (05:54:31) (in thread): > It reads like you need to add an import declaration to your NAMESPACE (importFrom(dplyr, arrange), or the equivalent roxygen #' @importFrom dplyr arrange)? > > In the good old days you also needed to tag S3 methods with roxygen, but I think now you just need to #' @export them. > > R is pretty laid back at dispatching S3 methods, so if you have dplyr loaded when working with the package locally, it won’t complain. devtools::check() tends to run things in a clean environment so is much more strict about imports etc than e.g. devtools::load_all()

Simon Pearce (06:00:18) (in thread): > My NAMESPACE file is being automatically generated by roxygen, where should I put #' @importFrom dplyr arrange?

Alan O’C (06:01:26) (in thread): > Any .R file; for larger packages I tend to put imports in the documentation for functions that use them, but smaller packages I’ll use a central file: https://github.com/Alanocallaghan/snifter/blob/master/R/snifter-internal.R

Simon Pearce (06:08:00) (in thread): > ok, thanks, I’ll put them each in where I define each new S3 method.
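Put together, the method file might look like this (a sketch; `arrangeQset` is the package-internal helper Simon mentions):

```r
#' @importFrom dplyr arrange
#' @export
arrange.qseaSet <- function(.data, ..., .by_group = FALSE) {
    arrangeQset(.data, ..., .by_group = .by_group)
}
```

With these tags, roxygen writes both `importFrom(dplyr, arrange)` and `S3method(arrange, qseaSet)` into NAMESPACE, which registers the method and satisfies `devtools::check()`.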

Simon Pearce (10:55:36): > Next question, best way to cache a result? I calculate something over a reference genome and would like to save it for the future.

Simon Pearce (11:08:09) (in thread): > As my previous method of saving the result into an object in data/ is not going to fly

Michael Stadler (11:16:27): > I realise this is self-inflicted and that BioC core is on it, and I am happy to wait until arm64 is officially supported, but in the meanwhile just in case someone else bumped into this and got further than me: > I am using R 4.2 / BioC devel 3.16 on arm64 (apple silicon), and have been successful for the most part (packages compile from sources without problems), but I fail with the recently updated Rhtslib (version 1.99.5). It compiles fine but upon test loading the package it throws: > > Error in dyn.load(dll_copy_file) : > unable to load shared object '/.../Rhtslib.so': > dlopen(/.../Rhtslib.so, 0x0006): symbol not found in flat namespace '_hts_version' > > Any idea what could be the issue?

Spencer Nystrom (11:24:15) (in thread): > Just to clarify, Is this for an R package or just an analysis?

Simon Pearce (11:34:12) (in thread): > For an R package

Simon Pearce (11:34:38) (in thread): > (although I’m interested in useful ways to cache results for analysis too)

Spencer Nystrom (11:37:03) (in thread): > And is the idea to cache a result that a user generated or do you have a core dataset that is used as an asset to the package that shouldn’t change? > > For analysis, look into the targets package, it is amazing.

Vince Carey (11:37:43): > I’ve seen this, but it was when I was trying to port to new htslib. What exactly are you doing to install Rhtslib?

Martin Morgan (11:41:07) (in thread): > BiocFileCache (maybe this vignette section) or the memoise package are likely candidates.

Michael Stadler (11:42:06): > I am running BiocManager::install("Rhtslib") and say “yes” when asked if it should try to compile it from sources. I also tried to clone the repository and run devtools::load_all() on it, which fails in the same way. Not sure if this helps, but I inspected the shared objects for contained symbols and got: > > $ nm Rhtslib/src/Rhtslib.so | grep hts_version > U _hts_version > $ nm Rhtslib/src/htslib-1.15.1/libhts.a | grep hts_version > 0000000000000780 T _hts_version > U _hts_version >

Simon Pearce (11:43:47) (in thread): > The result is a set of CG positions across the genome, using a BSgenome object. Done once per genome

Simon Pearce (11:46:14) (in thread): > I’d rather not have to specify a cache directory if possible

Vince Carey (11:46:35): > I don’t have my M1 with me right now but I will check later. Sorry you are running into this.

Simon Pearce (11:48:28) (in thread): > It only takes a couple of minutes, but I want to reuse the result per bam file I’m reading in. memoise looks the likely solution

Simon Pearce (11:49:41) (in thread): > Although, does memoise work nicely with BiocParallel?

Hervé Pagès (12:49:48) (in thread): > BiocFileCache is for on-disk caching so the caching persists across sessions. memoise is for in-memory caching so the caching only lasts for the current session. You need to first decide what kind of caching you want.
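A minimal illustration of the in-memory flavor (assuming the memoise package is installed; `slowLookup` is a made-up stand-in for an expensive computation):

```r
library(memoise)

slowLookup <- function(genome) {
    Sys.sleep(2)  # stand-in for an expensive genome-wide computation
    paste("positions for", genome)
}

cachedLookup <- memoise(slowLookup)

system.time(cachedLookup("hg38"))  # first call: computed (~2s)
system.time(cachedLookup("hg38"))  # repeat call: served from memory
```

The cache lives only in the current R session; for persistence across sessions, BiocFileCache is the on-disk counterpart.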

Hervé Pagès (12:53:52) (in thread): > Now a tricky question is what kind of caching works better in the context of BiocParallel. Well it depends:wink:

Hervé Pagès (13:07:18) (in thread): > On a Mac Mini M1 with Monterey: > > > library(BiocManager) > Bioconductor version 3.16 (BiocManager 1.30.18), R 4.2.0 (2022-04-22) > > > install("Rhtslib", force=TRUE) > Bioconductor version 3.16 (BiocManager 1.30.18), R 4.2.0 (2022-04-22) > Installing package(s) 'Rhtslib' > Warning: unable to access index for repository[https://bioconductor.org/packages/3.16/bioc/bin/macosx/big-sur-arm64/contrib/4.2](https://bioconductor.org/packages/3.16/bioc/bin/macosx/big-sur-arm64/contrib/4.2): > cannot open URL '[https://bioconductor.org/packages/3.16/bioc/bin/macosx/big-sur-arm64/contrib/4.2/PACKAGES](https://bioconductor.org/packages/3.16/bioc/bin/macosx/big-sur-arm64/contrib/4.2/PACKAGES)' > Warning: unable to access index for repository > ... > Package which is only available in source form, and may need > compilation of C/C++/Fortran: 'Rhtslib' > Do you want to attempt to install these from sources? (Yes/no/cancel) y > installing the source package 'Rhtslib' > > trying URL '[https://bioconductor.org/packages/3.16/bioc/src/contrib/Rhtslib_1.99.5.tar.gz](https://bioconductor.org/packages/3.16/bioc/src/contrib/Rhtslib_1.99.5.tar.gz)' > Content type 'application/x-gzip' length 4581330 bytes (4.4 MB) > ================================================== > downloaded 4.4 MB > > * installing **source** package 'Rhtslib' ... > **** using non-staged installation via StagedInstall field > **** libs > cd "htslib-1.15.1" && make -f "Makefile.Rhtslib" > clang -arch arm64 -falign-functions=64 -Wall -g -O2 -fpic -fvisibility=hidden -I. -I/opt/R/arm64/include -D_FILE_OFFSET_BITS=64 -c -o kfunc.o kfunc.c > clang -arch arm64 -falign-functions=64 -Wall -g -O2 -fpic -fvisibility=hidden -I. -I/opt/R/arm64/include -D_FILE_OFFSET_BITS=64 -c -o kstring.o kstring.c > clang -arch arm64 -falign-functions=64 -Wall -g -O2 -fpic -fvisibility=hidden -I. -I/opt/R/arm64/include -D_FILE_OFFSET_BITS=64 -c -o bcf_sr_sort.o bcf_sr_sort.c > ... 
> clang -arch arm64 -falign-functions=64 -Wall -g -O2 -fpic -fvisibility=hidden -I. -I/opt/R/arm64/include -D_FILE_OFFSET_BITS=64 -c -o hfile_libcurl.o hfile_libcurl.c > ar -rc libhts.a kfunc.o kstring.o bcf_sr_sort.o bgzf.o errmod.o faidx.o header.o hfile.o hts.o hts_expr.o hts_os.o md5.o multipart.o probaln.o realn.o regidx.o region.o sam.o synced_bcf_reader.o vcf_sweep.o tbx.o textutils.o thread_pool.o vcf.o vcfutils.o cram/cram_codecs.o cram/cram_decode.o cram/cram_encode.o cram/cram_external.o cram/cram_index.o cram/cram_io.o cram/cram_stats.o cram/mFILE.o cram/open_trace_file.o cram/pooled_alloc.o cram/string_alloc.o htscodecs/htscodecs/arith_dynamic.o htscodecs/htscodecs/fqzcomp_qual.o htscodecs/htscodecs/htscodecs.o htscodecs/htscodecs/pack.o htscodecs/htscodecs/rANS_static4x16pr.o htscodecs/htscodecs/rANS_static.o htscodecs/htscodecs/rle.o htscodecs/htscodecs/tokenise_name3.o hfile_libcurl.o > ranlib libhts.a > clang -arch arm64 -dynamiclib -install_name /usr/local/lib/libhts.3.dylib -current_version 3.1.15 -compatibility_version 3.1.15 -L/opt/R/arm64/lib -fvisibility=hidden -o libhts.dylib kfunc.o kstring.o bcf_sr_sort.o bgzf.o errmod.o faidx.o header.o hfile.o hts.o hts_expr.o hts_os.o md5.o multipart.o probaln.o realn.o regidx.o region.o sam.o synced_bcf_reader.o vcf_sweep.o tbx.o textutils.o thread_pool.o vcf.o vcfutils.o cram/cram_codecs.o cram/cram_decode.o cram/cram_encode.o cram/cram_external.o cram/cram_index.o cram/cram_io.o cram/cram_stats.o cram/mFILE.o cram/open_trace_file.o cram/pooled_alloc.o cram/string_alloc.o htscodecs/htscodecs/arith_dynamic.o htscodecs/htscodecs/fqzcomp_qual.o htscodecs/htscodecs/htscodecs.o htscodecs/htscodecs/pack.o htscodecs/htscodecs/rANS_static4x16pr.o htscodecs/htscodecs/rANS_static.o htscodecs/htscodecs/rle.o htscodecs/htscodecs/tokenise_name3.o hfile_libcurl.o -lz -lm -lbz2 -llzma -lcurl > ln -sf libhts.dylib libhts.3.dylib > mkdir -p 
"/Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/library/Rhtslib/include/htslib" > cd "htslib-1.15.1/htslib" && cp * "/Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/library/Rhtslib/include/htslib" > clang -arch arm64 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -D_FILE_OFFSET_BITS=64 -I"/Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/library/Rhtslib/include" -I'/Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/library/zlibbioc/include' -I/opt/R/arm64/include -fPIC -falign-functions=64 -Wall -g -O2 -c R_init_Rhtslib.c -o R_init_Rhtslib.o > mkdir -p "/Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/library/Rhtslib/usrlib" > cd "htslib-1.15.1" && cp libhts.a "/Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/library/Rhtslib/usrlib" > clang -arch arm64 -dynamiclib -Wl,-headerpad_max_install_names -undefined dynamic_lookup -single_module -multiply_defined suppress -L/Library/Frameworks/R.framework/Resources/lib -L/opt/R/arm64/lib -o Rhtslib.so R_init_Rhtslib.o /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/library/Rhtslib/usrlib/libhts.a -lcurl -F/Library/Frameworks/R.framework/.. 
-framework R -Wl,-framework -Wl,CoreFoundation > mkdir -p "/Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/library/Rhtslib/testdata/tabix" > cd "htslib-1.15.1/test" && (cp *.sam *.bam *.vcf *.bcf *.cram *.fa *.fa.fai *.gff *.bed "/Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/library/Rhtslib/testdata" 2>/dev/null || true) && cd tabix && (cp *.sam *.bam *.vcf *.bcf *.cram *.fa *.fa.fai *.gff *.bed "/Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/library/Rhtslib/testdata/tabix" 2>/dev/null || true) > installing to /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/library/Rhtslib/libs > **** R > **** inst > **** byte-compile and prepare package for lazy loading > **** help > ***** installing help indices > **** building package indices > **** installing vignettes > **** testing if installed package can be loaded > * DONE (Rhtslib) > > The downloaded source packages are in > '/private/var/folders/l6/k_g8thn139g8ygrgdv73c10r0000gs/T/RtmpNnCij6/downloaded_packages' >

Hervé Pagès (13:07:37) (in thread): > sessionInfo(): > > R version 4.2.0 (2022-04-22) > Platform: aarch64-apple-darwin20 (64-bit) > Running under: macOS Monterey 12.4 > > Matrix products: default > BLAS: /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRblas.0.dylib > LAPACK: /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRlapack.dylib > > locale: > [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_GB/en_US.UTF-8 > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > other attached packages: > [1] BiocManager_1.30.18 > > loaded via a namespace (and not attached): > [1] compiler_4.2.0 tools_4.2.0 >

Michael Stadler (13:12:35) (in thread): > Thank you Hervé, I guess that means I have an issue with my local environment that did not surface before. Would you mind sharing your .R/Makevars? I can also post mine tomorrow when I am back in the office.

Hervé Pagès (13:15:22) (in thread): > No .R/Makevars. Nothing special really, this is a stock R install using the latest binary available at https://mac.r-project.org/

Hervé Pagès (13:17:06) (in thread): > Nope sorry, the R binary is the official binary from CRAN: https://cran.r-project.org/bin/macosx/ Thanks to @Andres Wokaty for setting up this Mac Mini.

Andres Wokaty (13:51:42) (in thread): > I’m still working on the setup. It’s about 75% of the way.

Kasper D. Hansen (14:34:24) (in thread): > @Michael Stadler You should NEVER have .R/Makevars, especially if you’re using the official binary.

Vince Carey (15:54:13) (in thread): > I am just confirming Herve’s result on the M1 macbook air.

Simon Pearce (16:08:24) (in thread): > Apparently the code I’m actually looking at takes just over a minute for the whole hg38 genome. I can probably just not bother caching that with how often I actually need to call it. But a simple > > .onLoad <- function(libname, pkgname) { > getCGPositions <<- memoise::memoise(getCGPositions) > } > > appears to work nicely. With BiocParallel apply functions, it will calculate it in parallel for the first n elements (however many cores are being used), and then use the memoised value for the rest.

Hervé Pagès (16:55:46) (in thread): > Alternatively you could call getCGPositions() before entering the bplapply loop and pass the result as an additional argument to your bplapply callback function. That would avoid having all the nodes recompute the same thing. I’m not even sure you need any kind of caching for this.
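A minimal R sketch of the precompute-then-pass pattern described above (getCGPositions(), processSample(), and sample_list are hypothetical names from the thread, not from any actual package):

```r
library(BiocParallel)

## Compute the expensive result once, up front, in the main session
## (getCGPositions() is the hypothetical slow function from the thread).
cg_positions <- getCGPositions()

## Pass it to every worker as an extra argument, so no node recomputes it.
res <- bplapply(sample_list, function(x, cg) {
    processSample(x, cg)  # hypothetical per-element worker
}, cg = cg_positions, BPPARAM = bpparam())
```

Compared with memoising inside .onLoad, this also avoids the first-round duplicate computation on each worker.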

2022-06-09

Michael Stadler (01:44:30) (in thread): > Thanks for all the helpful replies - I will try without the .R/Makevars and report back here

Michael Stadler (01:50:53) (in thread): > It worked :tada:, and that’s a much simpler setup (though without openMP support, but that’s fine on a laptop I guess). Thanks to everyone!

Martin Morgan (08:30:12) (in thread): > So what was in .R/Makevars?

2022-06-13

Michael Stadler (02:45:05) (in thread): > I was following the many recipes out there that advise using Homebrew llvm, so .R/Makevars primarily defined the location of the compilers and some additional compiler and linker flags. Here is my (non-functional in the case of Rhtslib) version: > > LLVM_DIR=$(shell brew --prefix llvm) > LIBS_DIR=/opt/R/arm64 > GFORTRAN_DIR=$(LIBS_DIR)/gfortran > SDK_DIR=$(shell xcrun --show-sdk-path) > > CC=$(LLVM_DIR)/bin/clang -isysroot $(SDK_DIR) -target arm64-apple-macos12 > CXX=$(LLVM_DIR)/bin/clang++ -isysroot $(SDK_DIR) -target arm64-apple-macos12 > FC=$(GFORTRAN_DIR)/bin/gfortran -mtune=native > > CXX11=$(LLVM_DIR)/bin/clang++ -isysroot $(SDK_DIR) -target arm64-apple-macos12 > CXX14=$(LLVM_DIR)/bin/clang++ -isysroot $(SDK_DIR) -target arm64-apple-macos12 > CXX17=$(LLVM_DIR)/bin/clang++ -isysroot $(SDK_DIR) -target arm64-apple-macos12 > CXX20=$(LLVM_DIR)/bin/clang++ -isysroot $(SDK_DIR) -target arm64-apple-macos12 > > CFLAGS=-falign-functions=64 -g -O2 -Wall -pedantic -Wno-implicit-function-declaration > CXXFLAGS=-falign-functions=64 -g -O2 -Wall -pedantic > FFLAGS=-g -O2 -Wall -pedantic > > SHLIB_OPENMP_CFLAGS=-fopenmp > SHLIB_OPENMP_CXXFLAGS=-fopenmp > SHLIB_OPENMP_FFLAGS=-fopenmp > > PKG_CXXFLAGS += $(SHLIB_OPENMP_CXXFLAGS) > PKG_LIBS = $(SHLIB_OPENMP_CXXFLAGS) > > CPPFLAGS=-I$(LLVM_DIR)/include -I$(LIBS_DIR)/include > LDFLAGS=-L$(LLVM_DIR)/lib -L$(LIBS_DIR)/lib > FLIBS=-L$(GFORTRAN_DIR)/lib/gcc/aarch64-apple-darwin21/12.0.0 -L$(GFORTRAN_DIR)/lib -lgfortran -lemutls_w -lm -Wl,-rpath,$(GFORTRAN_DIR)/lib >

2022-06-14

Hervé Pagès (10:06:11) (in thread): > See https://stat.ethz.ch/pipermail/r-sig-mac/2021-June/014112.html if all you are after is OpenMP support.

2022-06-15

Henrik Bengtsson (13:53:22) (in thread): > I stumbled upon the following in my old notes: https://github.com/tdhock/mclapply-memory (Toby Dylan Hocking, Why does mclapply take so much memory?)

Henrik Bengtsson (13:57:21) (in thread): > (EDIT: whoops, I see that @Martin Morgan already shared and paraphrased this comment, but I leave it here to not add confusion by removing it again) > > Also https://stat.ethz.ch/pipermail/r-devel/2015-July/071554.html: > > > From: Joshua Bradley <jgbradley1 at gmail.com> > > > > I have been having issues using parallel::mclapply in a memory-efficient > > way and would like some guidance. I am using a 40 core machine with 96 GB > > of RAM. I’ve tried to run mclapply with 20, 30, and 40 mc.cores and it has > > practically brought the machine to a standstill each time to the point > > where I do a hard reset. > > When mclapply forks to start a new process, the memory is initially > shared with the parent process. However, a memory page has to be > copied whenever either process writes to it. Unfortunately, R’s > garbage collector writes to each object to mark and unmark it whenever > a full garbage collection is done, so it’s quite possible that every R > object will be duplicated in each process, even though many of them > are not actually changed (from the point of view of the R programs). > > One thing on my near-term to-do list for pqR is to re-implement R’s > garbage collector in a way that will avoid this (as well as having > various other advantages, including less memory overhead per object). > > Radford Neal

2022-07-01

kent riemondy (13:45:41): > @kent riemondy has joined the channel

2022-07-04

Andrew J. Rech (19:45:17): > @Andrew J. Rech has joined the channel

2022-07-28

Athena Chen (13:23:51): > @Athena Chen has joined the channel

2022-07-29

Aedin Culhane (15:59:11): > Developers forum birds of a feather at Bioc2022 in 45 mins

Mercedeh Javanbakht Movassagh (16:53:12): > @Mercedeh Javanbakht Movassagh has joined the channel

Rushika (16:58:41): > @Rushika has joined the channel

Wes W (17:04:11): > @Wes W has joined the channel

Samuel Gamboa (17:07:01): > @Samuel Gamboa has joined the channel

Aedin Culhane (19:51:17): > Thanks for a great conversation at the developers birds of a feather today at Bioc2022. Kristyna’s idea of a hackathon and working on a package clinic was great. So I am copying those interested in this thread. @Kelly Street Margaret has great ideas on this too. @Kristyna Kupkova @Lori Shepherd @Margaret Turner @Claire Rioualen

Aedin Culhane (19:54:14): > @Alex Mahmoud and @Andres Wokaty also had great ideas at the developers birds of a feather. @Alex Mahmoud suggested creating a pathway for learning how to develop and use the cloud

Aedin Culhane (19:56:24): > @Alex Mahmoud would be a great person to speak on the developers forum about how best to write code so it can be deployed on any cloud platform @Mike Smith @Kevin Rue-Albrecht

Margaret Turner (20:16:39): > @Margaret Turner has joined the channel

2022-07-30

Jeroen Gilis (12:10:18): > Quick question: the latest build reports for bioc release are from July 15th; when will the next build be?

Lori Shepherd (12:56:05): > We are investigating the issue.

Lori Shepherd (12:58:02) (in thread): > @Hervé Pagès I know we saw this the other day. @Andres Wokaty

2022-08-02

Andres Wokaty (12:27:03): > The next build report for 3.15 software is available. Sorry for the delay in getting it back up.

2022-08-15

Pedro Sanchez (10:34:11): > Hi all! Is there a way to set progressbar = TRUE by default? I am using BPPARAM = bpparam() but do not know how to print the progress bar

Martin Morgan (10:49:41) (in thread): > You could register your ‘custom’ bpparam with a progress bar, e.g., register(MulticoreParam(progressbar = TRUE)). Then bpparam() will return the customized version.

Pedro Sanchez (11:40:29) (in thread): > But will bpparam() then always return MulticoreParam(), or will it have its normal OS-agnostic use?

Martin Morgan (12:07:38) (in thread): > The OS-agnostic selection of params is set when the package is loaded; you can see the order with registered(). When you register(...), you revise the order, so it is no longer OS-agnostic (for the duration of your R session). > > I think it would help to clarify what you’re trying to do. If you’re writing a script for your own use, and you might sometimes run on Windows, sometimes on Linux / macOS, you could > > param <- bpparam() > bpprogressbar(param) <- TRUE > register(param) > > If you’re writing a package you might force the use of a progress bar > > f = function(param = bpparam()) { > if (!bpprogressbar(param)) { > message("I'm giving you a progress bar whether you want one or not!") > bpprogressbar(param) <- TRUE > } > ... > } > > or something a little more friendly > > f = function(param = bpparam(), progressbar = TRUE) { > bpprogressbar(param) = progressbar > ... > } > > This discussion sounds familiar, was it brought up in some other forum?

Pedro Sanchez (12:28:43) (in thread): > That’s super helpful Martin!! Thank you for taking the time to explain it. And yes, it’s because I’m writing a package

2022-08-22

Erin Kelley (15:10:03): > @Erin Kelley has joined the channel

2022-08-24

Jeroen Ooms (09:58:24): > Question (from non-bioc user): what is the policy of which bioc packages are maintained under https://github.com/bioconductor or under the author’s personal GitHub account or neither? What do people use as an issue tracker for packages that are not on GitHub?

Lori Shepherd (10:03:41): > Bioconductor core team maintained and developed packages are generally the only ones on the Bioconductor GitHub. Personal GitHubs are encouraged for development, with git.bioconductor.org being the canonical location for the Bioconductor builds and what’s available from BiocManager::install

Lori Shepherd (10:04:19): > support.bioconductor.org and emailing the listed maintainer would be other options if a package does not have an individual personal GitHub

Jeroen Ooms (10:11:15): > OK thanks:slightly_smiling_face:

Jeroen Ooms (10:12:49): > One more question: once a branch is released (e.g. bioc 3.15) do packages still get updates or patches in their RELEASE_3_15 branch, or is everything fixed and frozen at that point?

Lori Shepherd (10:16:37): > older releases are frozen – but the current release branch, in this case RELEASE_3_15, can still receive updates or patches – we strongly request that new features are only introduced in devel and that only bug fixes go to release, but it is up to the maintainer

Jeroen Ooms (10:21:31): > Ok, thanks so much!

2022-08-25

Jeroen Ooms (04:47:57): > Is there a meta/api somewhere (or just a json file) for programmatically querying the current release and devel version of bioc?

Jeroen Ooms (04:49:03): > The best I have come up with is this, but there is probably a better way:slightly_smiling_face: > > grep('devel ->', readLines('[ftp://ftp.gwdg.de/pub/misc/bioconductor/packages/](ftp://ftp.gwdg.de/pub/misc/bioconductor/packages/)'), value = T) >

Lluís Revilla (05:14:54): > I don’t think there is something like this, but it would be great if Bioconductor provided a file such as the one CRAN provides at https://cran.r-project.org/src/contrib/PACKAGES.rds for both devel and release (I think it is created with tools::write_PACKAGES and updated with tools::update_PACKAGES)

Martin Morgan (07:55:00): > @Jeroen Ooms > > > yml <- yaml::read_yaml("[https://bioconductor.org/config.yaml](https://bioconductor.org/config.yaml)") > > yml$devel_version > [1] "3.16" > > yml$release_version > [1] "3.15" > > This might also be useful: > > > yml$r_ver_for_bioc_ver |> tail() |> str() > List of 6 > $ 3.11: chr "4.0" > $ 3.12: chr "4.0" > $ 3.13: chr "4.1" > $ 3.14: chr "4.1" > $ 3.15: chr "4.2" > $ 3.16: chr "4.2" > > This is parsed by BiocManager:::.version_map(). @Lluís Revilla PACKAGES* files are part of all so-called “CRAN-style” repositories, and are parsed by, e.g., available.packages(repos = BiocManager::repositories()). They are available at, for instance, > > > paste0( contrib.url(BiocManager::repositories()[1]), "/PACKAGES.rds" ) > [1] "[https://bioconductor.org/packages/3.16/bioc/src/contrib/PACKAGES.rds](https://bioconductor.org/packages/3.16/bioc/src/contrib/PACKAGES.rds)" > > and also for the other (data annotation, data experiment, etc) repositories and platforms. There is also https://bioconductor.org/packages/3.16/bioc/VIEWS which contains the information used to create the package ‘landing pages’ and includes Linux, macOS, and Windows availability.

Lluís Revilla (08:25:47): > Many thanks @Martin Morgan I wasn’t fully aware of how it retrieved the packages. I thought there was a yaml file or json.

Martin Morgan (08:30:42) (in thread): > CRAN repositories were developed before yaml / json and use so-called ‘Debian Control Files’; PACKAGES can be parsed with read.dcf(), with any url accessed through a url connection: read.dcf(url("https://..."))
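For instance, a PACKAGES file can be read straight from the repository (a sketch; it assumes network access, and the fields selected are just examples):

```r
## PACKAGES is a 'Debian Control File'; read.dcf() returns a character matrix
## with one row per package and the requested fields as columns.
pkgs <- read.dcf(
    url("https://bioconductor.org/packages/3.16/bioc/src/contrib/PACKAGES"),
    fields = c("Package", "Version", "License")
)
head(pkgs)
```

The same call works against the data annotation and experiment repositories by changing the URL.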

Jeroen Ooms (08:39:45) (in thread): > Thank you, this is very useful!

2022-08-26

Chris Vanderaa (10:51:28): > Hello Bioc devels! > I am experiencing a segfault issue using a combination of the bioconductor/bioconductor_docker:devel image + reticulate + sklearn… Here is my setup: > 1. I start an interactive Docker container using: > > > $ sudo docker run -i -t bioconductor/bioconductor_docker:devel bash > > (the image was pulled 6 days ago) > 2. I open the R console and run: > > > BiocManager::install("reticulate") > > library(reticulate) > > repl_python() ## not installing miniconda > > 3. I run the following python code: > > >>> import sklearn.impute > >>> X = [[0, 1, 3], [3, 4, 5]] > >>> gen = sklearn.metrics.pairwise_distances_chunked(X) > >>> for chunk in gen: > ... print(chunk) > > This code is useless, but it is a minimal example that reproduces the error I encounter with sklearn.impute.KNNImputer. > The last command leads to the segfault error: > > ***** caught segfault ***** > address 0x7f2d543bc100, cause 'memory not mapped' > > Traceback: > 1: py_call_impl(callable, dots$args, dots$keywords) > 2: builtins$eval(compiled, globals, locals) > 3: py_compile_eval(code, capture = FALSE) > 4: doTryCatch(return(expr), name, parentenv, handler) > 5: tryCatchOne(expr, names, parentenv, handlers[[1L]]) > 6: tryCatchList(expr, names[-nh], parentenv, handlers[-nh]) > 7: doTryCatch(return(expr), name, parentenv, handler) > 8: tryCatchOne(tryCatchList(expr, names[-nh], parentenv, handlers[-nh]), names[nh], parentenv, handlers[[nh]]) > 9: tryCatchList(expr, classes, parentenv, handlers) > 10: tryCatch(py_compile_eval(code, capture = FALSE), error = handle_error, interrupt = handle_interrupt) > 11: repl() > 12: doTryCatch(return(expr), name, parentenv, handler) > 13: tryCatchOne(expr, names, parentenv, handlers[[1L]]) > 14: tryCatchList(expr, classes, parentenv, handlers) > 15: tryCatch(repl(), interrupt = identity) > 16: repl_python() > > I could make two additional observations: > * The python code runs fine when instead of opening the R console (2.) 
I open the python console (using /usr/bin/python3), still in the Docker container > * I tested the issue on my local installation outside the Docker container and the example runs correctly > Any ideas what is happening or suggestions on how to fix this? :pray:

Pedro Sanchez (10:58:24) (in thread): > Isn’t the correct way to install reticulate with install.packages("reticulate")?

Chris Vanderaa (11:04:30) (in thread): > I thought BiocManager::install is more versatile. I gave install.packages a try and the issue remains… but thanks for the answer

Martin Morgan (11:53:38) (in thread): > Not too helpful other than to say yes I can reproduce this, and to note that BiocManager::install() and install.packages() are not quite the same on the docker image – BiocManager uses the RStudio binary package repository for ‘fast’ installs, whereas install.packages() installs from source. But installing from source via install.packages() doesn’t change the outcome…

Hervé Pagès (15:53:26) (in thread): > I was also able to reproduce this on the rocker/rstudio image so this doesn’t seem to have much to do with Bioconductor. BTW I found it challenging to get the Python module scikit-learn installed on this image. The way I finally did it was to start the image with -e ROOT=TRUE which allows me to use sudo. Then in the RStudio Terminal I did: > > sudo apt-get update > sudo apt-get install python3-pip > sudo -H pip3 install scikit-learn >

Chris Vanderaa (16:55:29) (in thread): > Thanks for the feedback and for pinpointing the issue to the rocker/rstudio image.

2022-08-29

Jeroen Ooms (06:07:40): > q: how often does https://bioconductor.org/packages/json/3.16/bioc/packages.json get updated?

Jeroen Ooms (06:10:44): > I noticed a package that has an outdated url in the json:

Jeroen Ooms (06:10:46): > > bioc <- jsonlite::fromJSON('[https://bioconductor.org/packages/json/3.16/bioc/packages.json](https://bioconductor.org/packages/json/3.16/bioc/packages.json)') > bioc$crisprScore$URL > bioc$crisprScore$BugReports >

Jeroen Ooms (06:11:22): > This was updated 18 days ago:https://github.com/crisprVerse/crisprScore/commit/d910afb94

Lori Shepherd (07:18:39): > it should be updated daily. I will look into this

Lori Shepherd (08:53:36): > @Andres Wokaty / @Hervé Pagès Can you check the generation of the VIEWS file on the builders – it doesn’t look like it has updated recently, and the information in the json is created off the VIEWS – The VIEWS for this package has a last git commit date of 07-28 and there have been commits to the git.bioconductor.org location after that – There seems to be a bug in the generation of the VIEWS file on the builder

Andres Wokaty (09:46:55) (in thread): > I’ll take a look

Andres Wokaty (22:46:05) (in thread): > I think they pushed without bumping the version – the last commit is the change for the URL, without a version bump

2022-08-30

Jeroen Ooms (03:57:56) (in thread): > Would it be possible to re-generate the json from the new descriptions, even if the version did not change? Perhaps you can cache based on the sha of the description file, instead of the version number?

Lisa Breckels (07:20:17): > @Lisa Breckels has joined the channel

Lori Shepherd (08:18:32) (in thread): > we do updates on the builders based on version bumps. Any changes in packages should be documented with a version bump or the system will not propagate the new version to the users.

Jeroen Ooms (08:24:13) (in thread): > Ah ok, well I guess we have to wait then for the author to change the version number for the url to be fixed.

Hervé Pagès (12:12:59) (in thread): > Looks like they just did that. Provided it passes the next BUILD/CHECK, crisprScore 1.1.15 should propagate in the next 24 hours or so.

2022-08-31

Vince Carey (19:52:49) (in thread): > Would it be correct to say that at this time this is resolved in rocker/rstudio?

2022-09-01

Chris Vanderaa (07:18:07) (in thread): > Yes indeed! rocker/rstudio:devel does the job while rocker/rstudio:latest fails :pray: How/when will this propagate to bioconductor/bioconductor_docker:devel?

Vince Carey (08:05:36) (in thread): > Well, I did not follow exactly how the error was triggered, but > > stvjc@stvjc-XPS-13-9300:~/BIOC_SOURCES$ docker run -ti bioconductor/bioconductor_docker:devel bash > root@817acf19e482:/# python3 > Python 3.8.10 (default, Jun 22 2022, 20:18:18) > [GCC 9.4.0] on linux > Type "help", "copyright", "credits" or "license" for more information. > >>> import sklearn > >>> import sklearn.impute > >>> X = [[0, 1, 3], [3, 4, 5]] > >>> gen = sklearn.metrics.pairwise_distances_chunked(X) > >>> for chunk in gen: > ... print(chunk) > ... > [[0. 4.69041576] > [4.69041576 0. ]] > > this morning.

Chris Vanderaa (11:13:55) (in thread): > In the bioconductor/bioconductor_docker:devel, this example works in python, but it doesn’t work through R with reticulate… As pointed out by @Hervé Pagès, the rocker/rstudio image (I assume rocker/rstudio:latest) leads to the same issue, so this has to do with the rocker image. From your comment I thought you meant that this issue is solved in rocker/rstudio:devel, which it is! I am now curious when the current rocker/rstudio:devel will be used by bioconductor/bioconductor_docker:devel

Vince Carey (13:55:41) (in thread): > Because I ran the example in the snippet above in bioconductor/bioconductor_docker:devel, I assumed that the problem was resolved in that container too, but maybe not?

Dario Strbenac (19:30:18): > Has anyone seen the error ‘undefined reference to __gxx_personality_v0’ during installation from source? It happens on the Windows builder but not on the Linux builder. My understanding is that only SystemRequirements: C++14 is necessary.

Alan O’C (19:32:02) (in thread): > I thought you needed to set CXX_STD in src/Makevars?

Alan O’C (19:32:12) (in thread): > Well, Makevars.win in this case

Dario Strbenac (19:45:02) (in thread): > It is not the impression that I have. > > 1.2.4 Using C++ Code: Packages without a src/Makevars or src/Makefile file may specify that they require C++11 for code in the src directory by including ‘C++11’ in the ‘SystemRequirements’ field of the DESCRIPTION file, e.g. > > > SystemRequirements: C++11 > > However, I think I noticed what I am missing. > > The ‘NeedsCompilation’ field should be set to “yes” if the package contains native code which needs to be compiled. > I didn’t put a NeedsCompilation field in my DESCRIPTION file. I’ve managed to write R-only code for the past twelve years of being an R programmer, but now I am learning about this for the first time.

Alan O’C (19:52:33) (in thread): > Ah, nice

Kasper D. Hansen (21:43:27) (in thread): > You shouldn’t ever have to set CXX_STD I think

2022-09-02

Chris Vanderaa (03:42:51) (in thread): > On my side, I still have the problem of running the python snippet above through reticulate when using bioconductor/bioconductor_docker:devel.

Martin Morgan (06:14:59) (in thread): > Bioconductor ‘devel’ is using R-4.2.1 until the next release (end of October?) so the docker container won’t switch to rstudio:devel (using R-devel) until after that. You could clone https://github.com/Bioconductor/bioconductor_docker and change https://github.com/Bioconductor/bioconductor_docker/blob/181fbf5591afa4d22fcd4d1aceb2b0b7183ba7e9/Dockerfile#L2 to read FROM rocker/rstudio:devel and rebuild locally > > git clone [https://github.com/Bioconductor/bioconductor_docker](https://github.com/Bioconductor/bioconductor_docker) > cd bioconductor_docker > ## edit... > docker build -t bioconductor_docker:rstudio-devel . > > and confirm that it still works. One difference between rstudio:4.2.1 (used by bioconductor_docker:devel) and rstudio:devel is the version of python installed > > $ docker run -it --rm rocker/rstudio:4.2.1 python3 --version > WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested > Python 3.8.10 > > versus version 3.10.4 on rstudio:devel > > I think this in turn is because the base ubuntu differs – 20.04 on rstudio:4.2.1, 22.04 on rstudio:devel. You’d have to do some apt-get magic to get python3.10 on 20.04 and I’m not strong enough to know what that is. So doing something like > > docker run -it --rm bioconductor/bioconductor_docker:devel bash > ## apt magic to get python3.10 as default > # R > > and then test that this solves the problem.

Martin Morgan (06:17:01) (in thread): > Maybe @Hervé Pagès can help with the apt-magic, if that seems like a useful thing to explore; I don’t know that it would be good to change the bioconductor_docker:devel container ‘upstream’ in this way, but at least there would be more understanding of the problem

Vince Carey (12:38:02) (in thread): > Now I think I understand how to generate the error: put the python snippet above in a file and use reticulate::py_run_file. Now I see the error with the bioc container.

Vince Carey (12:42:04) (in thread): > This then raises the question of whether basilisk within the container could allow use of this code and avoid the problem. I introduced a function skPWD (PWD = pairwise distance) in the devel branch of BiocSklearn. As long as the versions of libstdc++.so.6 are compatible between the linux run time and the conda components used for BiocSklearn, it works. A discussion of events arising when these are incompatible is at https://github.com/LTLA/basilisk/issues/20

Hervé Pagès (17:00:05) (in thread): > This needs to be fixed in rocker/rstudio:latest. Have you reported the problem to the RStudio folks @Chris Vanderaa?

2022-09-05

Chris Vanderaa (05:48:01) (in thread): > @Martin Morgan: thanks for the guidelines. Since I’m struggling with apt magic as well, I’ll mess with the Dockerfile until the rstudio Docker image is fixed in latest. @Vince Carey: it has been on my todo list for a while, but I should indeed get more experience with basilisk, maybe it could solve (or at least clarify) this type of issue for later builds. @Hervé Pagès: indeed, while still trifling with the Docker images, I hope for a quick transfer of rocker/rstudio:devel to rocker/rstudio:latest. I just reported the issue: https://github.com/rocker-org/rocker/issues/500

Chris Vanderaa (08:20:45) (in thread): > From the maintainers’ answer, switching to the reference libblas solves the issue: > > ARCH=$(uname -m) > update-alternatives --set "libblas.so.3-${ARCH}-linux-gnu" "/usr/lib/${ARCH}-linux-gnu/blas/libblas.so.3" > update-alternatives --set "liblapack.so.3-${ARCH}-linux-gnu" "/usr/lib/${ARCH}-linux-gnu/lapack/liblapack.so.3" > > Thanks a lot for the support!

2022-09-07

Lori Shepherd (16:38:21): > Bioconductor 3.16 Release Schedule and Deadlines

2022-09-12

Kasper D. Hansen (13:26:03): > A fresh install of minfi with Suggests dependencies on Bioc devel (on OS X intel). This is crazy to see (output attached in separate file because it’s too long)… Hmm, looking at this, I guess that dependencies="Suggests" installs these dependencies for all packages, not just for minfi?

Kasper D. Hansen (13:26:27): - File (Markdown (raw)): minfi_depends.md

Alan O’C (13:32:04) (in thread): > Indeed: > > dependencies: logical indicating whether to also install uninstalled > packages which these packages depend on/link > to/import/suggest (and so on recursively). >

Kasper D. Hansen (13:34:44) (in thread): > Yeah, kind of crazy it’s recursive IMO, but I can figure out a way around it

Alan O’C (13:36:25) (in thread): > I agree, would be nice to be able to just get the one layer if you want rather than installing half of CRAN

Marcel Ramos Pérez (14:06:12) (in thread): > Did you try dependencies = TRUE? I think that should only install Suggests for the current package according to ?install.packages.
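A sketch of the distinction as ?install.packages describes it (repository setup via BiocManager is assumed here for a Bioconductor package like minfi):

```r
## dependencies = TRUE is shorthand for
## c("Depends", "Imports", "LinkingTo", "Suggests") for the named package,
## but only c("Depends", "Imports", "LinkingTo") for the packages it pulls in,
## so Suggests is NOT applied recursively:
install.packages("minfi", dependencies = TRUE,
                 repos = BiocManager::repositories())

## A character vector, by contrast, is applied to all packages recursively,
## which is what drags in a large slice of CRAN:
install.packages("minfi",
                 dependencies = c("Depends", "Imports", "LinkingTo", "Suggests"),
                 repos = BiocManager::repositories())
```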

2022-09-19

Ryan Williams (16:50:50): > @Ryan Williams has joined the channel

2022-09-25

Maria Doyle (20:52:48): > @Maria Doyle has joined the channel

2022-09-27

Jennifer Holmes (16:14:48): > @Jennifer Holmes has joined the channel

2022-10-06

Devika Agarwal (13:10:46): > @Devika Agarwal has joined the channel

2022-10-12

Taiyun Kim (01:44:51): > @Taiyun Kim has joined the channel

Taiyun Kim (08:00:20): > Hi Bioc developers! > I am having an issue fetching from the Bioc git repository on my mac and I would much appreciate some help resolving it. > I have generated an rsa key on my mac and added the public key to my GitHub and Bioconductor Git Credentials, but when I try git fetch upstream I am getting the message shown below. > > git@git.bioconductor.org: Permission denied (publickey). > fatal: Could not read from remote repository. > > Please make sure you have the correct access rights > and the repository exists. > > Any suggestions on how to resolve this?

Lori Shepherd (08:03:43) (in thread): > https://contributions.bioconductor.org/git-version-control.html#faq Have you checked that you indeed see access rights for the package? > > ssh -T git@git.bioconductor.org > > There is also reference to FAQ #13, 14, 15 if you see you have access rights but still can’t push, including some commands that should be included for further assistance

Taiyun Kim (08:16:30) (in thread): > When I check https://git.bioconductor.org/BiocCredentials/permissions_by_user/, I see the package I want to fetch, but when I try the following in the terminal > > ssh -T git@git.bioconductor.org > > I get the message below… > > git@git.bioconductor.org: Permission denied (publickey). >

Taiyun Kim (08:17:01) (in thread): > I have also tried by specifying the SSH key with ~/.ssh/config file but it doesn’t seem to help..
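For reference, a typical ~/.ssh/config stanza for this setup looks like the following (the key file path is illustrative; the remote user must be git):

```
Host git.bioconductor.org
    HostName git.bioconductor.org
    User git
    IdentityFile ~/.ssh/id_rsa_bioconductor
    IdentitiesOnly yes
```

With a stanza like this, `ssh -T git@git.bioconductor.org` should report the packages the key has access to rather than "Permission denied".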

Lori Shepherd (08:18:12) (in thread): > and when you see the package on the app, does it have RW next to it?

Taiyun Kim (08:18:55) (in thread): > It only lists the names of the packages.

Lori Shepherd (08:21:48) (in thread): > As a developer you should have RW next to your package – what is the name of your package so I can check the settings on our side

Taiyun Kim (08:22:50) (in thread): > scReClassifyandPhosR

Lori Shepherd (08:24:57) (in thread): > All of our settings are correct. I would suggest trying to generate a new key, adding to the GitCredentials App, and using access with the newly generated key

Taiyun Kim (08:25:55) (in thread): > Ok, I will try that. Thanks!

Taiyun Kim (08:56:52) (in thread): > It worked! :pray: Not sure what the issue was with the old SSH key though.

Nitesh Turaga (12:33:28): > When I use a .First <- function() { y <- 1; rm(list = ls()) } in my .Rprofile.site, why does the variable y persist? > > Should the rm(list = ls()) remove it from the scope? > > What am I doing wrong?

Nitesh Turaga (12:33:47): > (or.Rprofile) too

Hervé Pagès (12:45:37) (in thread): > Hi @Nitesh Turaga, it doesn’t for me: > > > cat(readLines(".Rprofile"), sep="\n") > .First <- function() { y <- 1; rm(list = ls()) } > > ls() > character(0) > > Note that you don’t need to do rm() at all, y is local to the function: > > > cat(readLines(".Rprofile"), sep="\n") > .First <- function() { y <- 1 } > > ls() > character(0) > > For y to persist, you’d need to define it in the global environment with <<-: > > > cat(readLines(".Rprofile"), sep="\n") > .First <- function() { y <<- 1 } > > ls() > [1] "y" >

Nitesh Turaga (12:52:46) (in thread): > Got it! Thanks@Hervé Pagès. > > I had to restart my R session and clear “.RData”. This was a naive mistake:confused:

Nick Robertson (20:12:51): > @Nick Robertson has joined the channel

Ellis Patrick (21:40:13): > @Ellis Patrick has joined the channel

Ellis Patrick (21:46:55): > Hi, we are having a problem pushing to our new submission simpleSeg https://github.com/Bioconductor/Contributions/issues/2774 I believe we have tripped ourselves up by writing the wrong maintainer in the description file (it was supposed to be Alex, who submitted the issue… ). I am happy to stay maintainer, but I don’t appear to have access rights to simpleSeg.

Lori Shepherd (22:15:31) (in thread): > I’m on ET so I’ll have a look in the morning to help straighten it out

Yue Cao (23:16:06): > @Yue Cao has joined the channel

Yue Cao (23:18:53): > Hi BioC team, I am trying to activate my email associated with my package scFeatures (https://github.com/Bioconductor/Contributions/issues/2815) on https://git.bioconductor.org/BiocCredentials/account_activation/. > However it shows that the email address is not associated with a Bioconductor package, even though this is the email we put down in the maintainer field of the description. > Wondering if you are able to help with it, thanks!

2022-10-13

Lori Shepherd (07:44:22) (in thread): > I will have to create Alex a GitCredentials account – Can you please have them email me the email address they would like to use to manage their git credentials, and whether they have a GitHub account they would like auto-associated with ssh keys

Lori Shepherd (07:46:29) (in thread): > scFeatures has not passed the precheck to start an official review and therefore is not in our git.bioconductor.org system yet. Once the package passes prechecks, it will be added to git and assigned an official reviewer. At this current stage please update your github repo and ping Vince that you have uploaded changes.

Ellis Patrick (08:54:16) (in thread): > Fantastic! Thanks so much :)

Yue Cao (18:19:11) (in thread): > Got it, thanks for the clarification!

2022-10-14

Martin Morgan (15:41:50): > For a CRAN package it is ‘easy’ to include a badge (e.g., the CRAN version badge in the README of this package just includes a link to the image returned by https://www.r-pkg.org/badges/version/rjsoncons). Is it similarly easy to add Bioconductor badges?

Martin Morgan (15:58:25) (in thread): > Answering my own question, I guess (obviously!) I can use the links to badges that appear on package landing pages, like http://bioconductor.org/packages/BiocParallel, and the ‘last commit’ shield. Maybe I was hoping for different badges (e.g., version number, last-updated date, downloads per month) - Attachment (Bioconductor): BiocParallel > This package provides modified versions and novel implementation of functions for parallel evaluation, tailored to use with Bioconductor objects.

Marcel Ramos Pérez (16:27:22) (in thread): > Perhaps Bioconductor could have a page similar to https://www.r-pkg.org/services#badges :thinking_face: to show which ones are available to use

2022-10-16

Nick Robertson (21:54:21) (in thread): > We addressed the issues that came to light in the prechecks, but @Yue Cao doesn’t appear to have received access to the Bioconductor git for the scFeatures package. Are there any other steps that need to be completed?

2022-10-17

Milan Malfait (04:21:51) (in thread): > usethis::use_bioc_badge() may also be useful, though I think it only shows the Bioc build status

Lori Shepherd (07:59:24) (in thread): > I will process the packages shortly to assign reviewers – at that time they should have access

Martin Morgan (08:23:05) (in thread): > Thanks @Milan Malfait; I looked in biocthis::use_bioc_<tab> but did not think to look in usethis::. You’re right that it inserts the build status badge; maybe a biocthis enhancement use_bioc_badges() would allow insertion of one or more badges…

Milan Malfait (11:10:46) (in thread): > I put something together and opened a PR in biocthis: https://github.com/lcolladotor/biocthis/pull/35, basically just to add the badges from the Bioc landing page to a package’s README
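For anyone landing on this thread later, a minimal sketch of what such a helper might generate. The `bioc_badges` name is hypothetical, and the bioconductor.org/shields URL pattern is an assumption based on the badge images embedded on package landing pages:

```r
## Hypothetical helper: build Markdown for a couple of the badges shown
## on a Bioconductor landing page. The shields URL pattern below is an
## assumption inferred from the images served at bioconductor.org.
bioc_badges <- function(pkg) {
  shield <- function(path)
    sprintf("https://bioconductor.org/shields/%s/%s.svg", path, pkg)
  landing <- sprintf("https://bioconductor.org/packages/%s", pkg)
  ## one "[![label](image)](link)" Markdown badge per entry
  sprintf(
    "[![%s](%s)](%s)",
    c("release build", "years in Bioconductor"),
    c(shield("build/release/bioc"), shield("years-in-bioc")),
    landing
  )
}

cat(bioc_badges("BiocParallel"), sep = "\n")
```

Pasting the output into a README gives clickable badges that point back to the package landing page.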

2022-10-18

Jianhai Zhang (12:28:50): > @Jianhai Zhang has joined the channel

Jianhai Zhang (12:29:26): > I documented my function `covis` (a method for the SVG class) as below. However, I get two warnings. Can anyone suggest a fix? The same problem exists for the function `spatial_hm`. The repo to reproduce the warnings: https://github.com/jianhaizhang/spatialHeatmap, via `roxygen2::roxygenize('spatialHeatmap'); devtools::build('spatialHeatmap', vignettes = FALSE); rcmdcheck::rcmdcheck('spatialHeatmap_2.1.3.tar.gz')`. Documentation of covis.R: `#' @name covis`, `#' @rdname covis,SVG-method`, `#' @docType methods`. Warnings: > Undocumented S4 methods: > generic ‘covis’ and siglist ‘SVG’ > generic ‘spatial_hm’ and siglist ‘SVG’ > > checking Rd \usage sections … > Objects in \usage without \alias in documentation object ‘covis’: > ‘\S4method{covis}{SVG}’ > > Objects in \usage without \alias in documentation object ‘spatial_hm’: > ‘\S4method{spatial_hm}{SVG}’

Lluís Revilla (15:48:56) (in thread): > I think you also need to export the methods in order to be able to use them (even if they are internal)

Jianhai Zhang (15:49:43) (in thread): > both covis and spatial_hm are exported, not internal.

Alan O’C (15:57:38) (in thread): > You need to use @alias there, not @rdname, right?

Alan O’C (15:59:04) (in thread): > although if you only have one method for covis/spatial_hm then it doesn’t need to be an s4 generic

Jianhai Zhang (15:59:12) (in thread): > You mean this?

Jianhai Zhang (15:59:40) (in thread): > `@alias covis,SVG-method` does not work either.

Jianhai Zhang (16:01:41) (in thread): > covis/spatial_hm are two of the methods for S4 class SVG.

Alan O’C (16:02:05) (in thread): > Do you anticipate ever adding a covis method for another class?

Alan O’C (16:02:13) (in thread): > (or the same question for spatial_hm)

Jianhai Zhang (16:02:19) (in thread): > no.

Alan O’C (16:02:47) (in thread): > Then it doesn’t need to be s4, you can just have a normal function and check the class of the input is SVG

Alan O’C (16:03:43) (in thread): > It should make the documentation easier and often makes it easier for end users to debug things

Jianhai Zhang (16:04:11) (in thread): > That was the design before.

Jianhai Zhang (16:05:15) (in thread): > So if one function is associated with multiple S4 classes, then it should be designed as an S4 method?

Alan O’C (16:06:09) (in thread): > Generally people use S4 (or S3) generics because they want to be able to dispatch to the correct method for a number of possible different inputs

Marcel Ramos Pérez (18:03:54) (in thread): > I think you need `@aliases covis,SVG-method`; note you can use `@describeIn` instead to document methods

Marcel Ramos Pérez (18:05:04) (in thread): > Also make sure to add `#'` on empty lines so that the roxygen block is a single block that is easier to identify

Jianhai Zhang (21:43:54) (in thread): > Technically, the following works: `#' @name covis`, `#' @rdname covis`, `#' @aliases covis,SVG-method`. Thanks a lot. Thanks also to Alan O’C for the explanations on S4 method design.
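A minimal self-contained sketch of the pattern that resolved the warning: an S4 generic plus a method, documented with `@rdname`/`@aliases` so that `R CMD check` finds an `\alias` for the method. The class, slot, and method body here are illustrative stand-ins, not the real spatialHeatmap API:

```r
library(methods)

## Illustrative S4 class (not the real spatialHeatmap SVG class).
setClass("SVG", slots = c(coords = "list"))

#' Co-visualization of an SVG object
#'
#' @param x An object of class `SVG`.
#' @return The number of coordinate sets (stand-in behavior).
#' @name covis
#' @rdname covis
#' @export
setGeneric("covis", function(x) standardGeneric("covis"))

#' @rdname covis
#' @aliases covis,SVG-method
setMethod("covis", "SVG", function(x) length(x@coords))
```

Running roxygen2 over a block like this produces a single covis.Rd carrying both the generic's alias and the `covis,SVG-method` alias, which is what silences the "Undocumented S4 methods" warning.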

2022-10-28

Brian Schilder (08:30:46): > @Brian Schilder has joined the channel

Vandenbulcke Stijn (12:29:31): > @Vandenbulcke Stijn has joined the channel

2022-11-03

Haichao Wang (12:43:06): > hey, should I email webmaster@bioconductor.org to change the email address linked to my account on the bioc support forum?

Lori Shepherd (12:55:52) (in thread): > There is also the #support-site channel here for discussions like this

Natay Aberra (13:02:07): > @Natay Aberra has joined the channel

Natay Aberra (13:02:57) (in thread): > You should be able to change your email from your user profile.

2022-11-07

Lambda Moses (18:54:42): > I get this when pushing to upstream after a version bump for a bug fix. This happens in both the devel and release branches. Is anyone else also getting this problem? > > remote: Error: Please bump the version again and push. > remote: > remote: The build did not start as expected. If the issue persists, > remote: please reach out at bioc-devel@r-project.org or post on the > remote: Github issue where your package is being reviewed. > remote: > remote: 500 Server Error: Internal Server Error for url: https://issues.bioconductor.org/start_build > remote: > To git.bioconductor.org:packages/Voyager.git > 3e4c55c..b1f2a0a HEAD -> RELEASE_3_16 >

Lori Shepherd (19:12:43) (in thread): > What do you have as your remotes ‘git remote -v’ ? Did you set up any hooks on your GitHub? This should not happen and since your packages are accepted it should not trigger anything on issues.

Lambda Moses (19:30:58) (in thread): > This is the output of `git remote -v`: > > origin git@github.com:pachterlab/voyager.git (fetch) > origin git@github.com:pachterlab/voyager.git (push) > upstream git@git.bioconductor.org:packages/Voyager.git (fetch) > upstream git@git.bioconductor.org:packages/Voyager.git (push) >

Lori Shepherd (19:36:20) (in thread): > I’ll investigate on our end.

Lambda Moses (20:23:00) (in thread): > Thanks!

2022-11-08

Lukas Weber (00:23:34) (in thread): > I had this same error in a push to SpatialExperiment package (devel branch) earlier today too (I was planning to wait a day or two before trying again, but looks like it may be a more common issue right now). Thanks!

Lambda Moses (00:24:18) (in thread): > Then apparently it’s not just the packages that are newly accepted into Bioconductor.

Lori Shepherd (10:34:28): > I believe this issue should be resolved.@Lambda Moses/@Lukas WeberCould you please try again and let me know if you continue to see this ERROR.

Lukas Weber (10:43:59) (in thread): > Thank you! Yes, I just pushed a new version bump now and it has worked

Nitesh Turaga (11:01:29) (in thread): > FWIW, the two remotes `origin` and `upstream` differ for @Lambda Moses – notice the caps on `voyager` vs `Voyager`.

Lori Shepherd (11:03:40) (in thread): > capital V is what we have on git.bioconductor.org and I can confirm the pull from there –

2022-11-09

Lambda Moses (01:23:13) (in thread): > That’s just the repo name. It should be capital V in the DESCRIPTION file. The lower case v in the GitHub repo is to reduce confusion in the pkgdown website URL. I don’t think it’s relevant to the issue here since I push to the GitHub and Bioconductor repos separately.

2022-11-15

Mike Smith (06:09:18): > I was wondering if anyone had thoughts on packages that use CMake to configure compiled code. The reason I’m asking is that the HDF5 group are proposing to discontinue using autotools and only support CMake for the library. This would completely change how I bundle and build the distribution of HDF5 in Rhdf5lib. > > My initial reaction is that this is a bad thing, as it’ll require users to install something extra outside of R if they want to build from source, whereas it’s safe to assume autotools is installed. Does anyone have any other thoughts on this? > > I don’t think any Bioconductor packages use CMake, but there are some on CRAN. None that I’ve found mention CMake as a ‘SystemRequirement’ in the DESCRIPTION, so maybe my assumptions are wrong.

Henrik Bengtsson (07:45:45) (in thread): > FWIW, https://cran.r-project.org/package=nloptr mentions CMake - Attachment (cran.r-project.org): nloptr: R Interface to NLopt > Solve optimization problems using an R interface to NLopt. NLopt is a free/open-source library for nonlinear optimization, providing a common interface for a number of different free optimization routines available online as well as original implementations of various other algorithms. See https://nlopt.readthedocs.io/en/latest/NLopt_Algorithms/ for more information on the available algorithms. Building from included sources requires ‘CMake’. On Linux and ‘macOS’, if a suitable system build of NLopt (2.7.0 or later) is found, it is used; otherwise, it is built from included sources via ‘CMake’. On Windows, NLopt is obtained through ‘rwinlib’ for ‘R <= 4.1.x’ or grabbed from the ‘Rtools42 toolchain’ for ‘R >= 4.2.0’.

Dirk Eddelbuettel (07:51:00) (in thread): > We can do one better than pointing at a single package:

Dirk Eddelbuettel (07:51:27) (in thread): - File (R): Untitled

Dirk Eddelbuettel (07:53:22) (in thread): > And WRE now even has a Section 1.2.8 on cmake: https://rstudio.github.io/r-manuals/r-exts/Creating-R-packages.html#using-cmake

Kasper D. Hansen (08:48:34): > But what’s the advantage of dropping autotools? I am not an expert, but my impression is that once autotools is up and running, it’s pretty nice and - in my (very limited) experience - more robust than CMake.

Kasper D. Hansen (08:50:31): > Anyway, the fact that we now have a section in WRE is a good sign and I would be happy to help with a CMake migration if that is necessary. I just don’t get why HDF5 wants to migrate.

Dirk Eddelbuettel (09:04:40): > @Kasper D. Hansen We are in a shielded little corner of our software world. I personally also like autotools and configure better and find cmake (config) code to be almost impenetrable and very verbose. The rest of the world, though, is very different and extensively uses cmake, including for package management and compile-time procurement of (source-level) resources. It is very common, and growing.

Mike Smith (09:16:17): > It’s not really a migration, because HDF5 can currently be built with autotools or CMake. It’s up to the user to choose. I think it is a significant burden on them to keep both mechanisms in sync and up-to-date. It’s also true that autotools on Windows is much less well supported, and I end up shipping binaries that I built with CMake for that platform.
> > Here’s the blog post from HDF5 with some more details (https://www.hdfgroup.org/2022/11/can-we-remove-the-autotools/). They’re also asking for feedback from the community, which I plan to do once I’ve read everyone’s replies here:wink: - Attachment (The HDF Group): Can We Remove the Autotools? - The HDF Group > HDF5 can be built using two build systems: the Autotools (since HDF5 1.0) and CMake (since HDF5 1.8.5). For a long time, the Autotools were better maintained and CMake was more of an “alternative” build system that we primarily used for handling Windows support (the legacy Visual Studio projects were removed in HDF5 1.8.11). This is no longer the case though—CMake support in HDF5 is (almost) as good as Autotools support and CMake, in general, is much more commonly used now than when we first introduced it. So why are we still hanging on to the legacy Autotools?

Mike Smith (11:53:19) (in thread): > Thanks @Dirk Eddelbuettel. How did you make `db`? `available.packages()` doesn’t seem to list the “SystemRequirements” field.

Dirk Eddelbuettel (11:53:59) (in thread): > I am sorry – I omitted a line that I will add now. It is a call to tools::CRAN_package_db(). ESS for the win, was still in my R:misc buffer anyway :slightly_smiling_face:

Dirk Eddelbuettel (11:57:08) (in thread): > Also, it should be 12 not 10 once I learn how to call grepl:

Dirk Eddelbuettel (11:57:19) (in thread): - File (R): Untitled

Dirk Eddelbuettel (11:57:45) (in thread): > And then there may be undeclared ones. I vaguely recall httpuv also using it.
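Dirk's snippets were shared as file attachments and are not reproduced above, but the kind of query being discussed might look roughly like this (the `find_cmake_pkgs` helper is hypothetical, and `tools::CRAN_package_db()` needs network access, so the filtering is factored out so it can be illustrated on a toy data frame):

```r
## Hypothetical helper: packages whose declared SystemRequirements
## field mentions CMake (case-insensitively). grepl() returns FALSE
## for NA entries, so missing fields are simply dropped.
find_cmake_pkgs <- function(db) {
  sort(unique(db$Package[grepl("cmake", db$SystemRequirements,
                               ignore.case = TRUE)]))
}

## With live CRAN metadata (requires network access):
## db <- tools::CRAN_package_db()
## find_cmake_pkgs(db)

## Toy illustration of the filter itself:
toy <- data.frame(
  Package            = c("nloptr", "abc", "xyz"),
  SystemRequirements = c("CMake (>= 3.2.0)", NA, "GNU make"),
  stringsAsFactors   = FALSE
)
find_cmake_pkgs(toy)
```

As noted in the thread, this only finds packages that declare CMake; undeclared uses need a source search such as the GitHub mirror query above.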

2022-11-16

Mike Smith (03:21:44) (in thread): > I suspect there are quite a few that don’t declare it. A quick search on the CRAN GitHub mirror gives 500+ “files written in CMake” (https://github.com/search?l=CMake&o=desc&q=org%3Acran++language%3ACMake+language%3ACMake&s=indexed&type=Code). Not sure how to return unique repos, but I guess it’s not just those 12 packages with 50 files each.

Dirk Eddelbuettel (08:24:09) (in thread): > I would do a GitHub search (in the ‘cran’ org for the mirror) for CMakeLists.txt. In fact, here is one, yielding ‘about 100 files’: https://github.com/search?q=org%3Acran%20CMakeLists.txt&type=code

Lambda Moses (19:42:39): > I see this warning in the check results: > > Warning: S3 methods 'effectiveLibSizes.default', 'effectiveLibSizes.DGEList', 'effectiveLibSizes.DGEGLM', 'effectiveLibSizes.DGELRT' were declared in NAMESPACE but not found > > which I haven’t seen on my computer. I also see it in the check results of some other packages like EGSEA. Any idea what’s causing it?

Pedro Baldoni (19:48:14) (in thread): > I’d suggest posting this on the support site. I think the warning message is due to a recent renaming of effectiveLibSizes to getNormLibSizes in the devel version of edgeR

2022-11-17

rebecca jaszczak (16:59:59): > @rebecca jaszczak has joined the channel

Hervé Pagès (19:46:25) (in thread): > Here is the recent change in edgeR: > > commit d7fe5d6c8a17465aa5fd15cebe057675772a2153 (HEAD -> master, origin/master, origin/HEAD) > Author: Gordon Smyth <smyth@wehi.edu.au> > Date: Mon Nov 7 21:16:59 2022 +1100 > > Rename effectiveLibSizes() to getNormLibSizes(). > > effectiveLibSizes should probably be deprecated in favor of the new function to make the transition smoother. @Pedro Baldoni Note that the support site is for user-oriented questions.

Lambda Moses (22:14:51) (in thread): > The problem is that I never used anything from edgeR in my package, although it’s most likely suggested by a dependency. So I don’t think there’s anything I can do about it.

2022-11-18

Hervé Pagès (18:19:51) (in thread): > There’s always something you can do, I guess. You can identify the dependency that causes the warning and ask them to fix the problem, typically by opening an issue on GitHub. You can also open an issue for edgeR and ask them to deprecate effectiveLibSizes in favor of the new function.
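As a sketch, the kind of soft deprecation being suggested usually looks like the following. The function bodies are stand-ins for illustration, not edgeR's real implementation:

```r
## New name carries the implementation (stand-in body for illustration:
## a real library-size computation would live here).
getNormLibSizes <- function(y) colSums(y)

## Old name becomes a thin wrapper that warns via .Deprecated() but
## keeps returning the same result for a release cycle, so reverse
## dependencies have time to migrate.
effectiveLibSizes <- function(y) {
  .Deprecated("getNormLibSizes")
  getNormLibSizes(y)
}

m <- matrix(1:6, nrow = 2)
suppressWarnings(effectiveLibSizes(m))  # same result as getNormLibSizes(m)
```

With this pattern, `R CMD check` stops flagging the old name as missing from downstream NAMESPACEs while users see a clear pointer to the replacement.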

2022-11-21

Jonathan Griffiths (05:29:07): > @Jonathan Griffiths has joined the channel

2022-11-25

Jianhai Zhang (22:15:01): > How can I install the same R devel as the bioc devel (R Under development (unstable) (2022-10-25 r83175) – “Unsuffered Consequences”) on my local computer? > I found this tutorial, but it is too complex: http://singmann.org/installing-r-devel-on-linux/

Jianhai Zhang (22:16:49): > Why is the R devel (2022-10-25) on Bioc devel earlier than that (2022-10-31) on Bioc release?

Jianhai Zhang (23:09:09): > I downloaded the bioc docker from https://www.bioconductor.org/help/docker/. When I run sudo apt-get install ghostscript, is the password bioc? It says Sorry, try again. > > rstudio@e06d45bc5923:/$ sudo apt-get install ghostscript > [sudo] password for rstudio: > Sorry, try again.

2022-11-26

Vince Carey (06:58:32) (in thread): > You don’t really need r83175 to work effectively with R-devel and the Bioconductor devel branch. You do need a reasonably current version of R-devel; R-devel sources can change quite rapidly for relatively minor alterations – you would see this with “svn log” in the svn URL: https://svn.r-project.org/R/trunk. Changes to our installed version of R-devel for the build system are relatively infrequent, to confer stability on the overall process of building and checking thousands of packages. I would say that an R-devel version that is current to within a month or two, and for which BiocManager::valid() can return TRUE, should be sufficient for your development and testing. If you really want to put the effort into getting an exact match, you’ll have to use subversion commands to extract the specific revision you want to test. svn co --help seems to give sufficient information.

Vince Carey (07:10:44) (in thread): > Please give the command you are using to run docker.

Jianhai Zhang (19:21:33) (in thread): > docker run -it --user rstudio bioconductor/bioconductor_docker:devel bash

Vince Carey (22:57:13) (in thread): > You can add software to the image by omitting the --user rstudio; you will be logged in as root and do not even have to use sudo. After you finish your changes, you can use docker commit to save the enhanced image locally and then use it as rstudio.
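In command form, the workflow described above might look like this (the container name `bioc-devel` and the local tag `bioconductor_docker:devel-gs` are made-up names for illustration; the image tag and package are from the thread):

```shell
# Start the container as root (omit --user rstudio) so apt works without sudo
docker run -it --name bioc-devel bioconductor/bioconductor_docker:devel bash

# Inside the container:
apt-get update && apt-get install -y ghostscript
exit

# Back on the host: save the modified container as a new local image
docker commit bioc-devel bioconductor_docker:devel-gs

# Later runs can use the enhanced image as the rstudio user again
docker run -it --user rstudio bioconductor_docker:devel-gs bash
```

Note that `docker commit` only captures changes made in that specific container; rebuilding from a Dockerfile is the more reproducible long-term option.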

2022-11-27

Hervé Pagès (20:56:48) (in thread): > one uses R devel (4.3) and the other one R 4.2.2

2022-11-30

Jianhai Zhang (04:24:13) (in thread): > Thanks.

2022-12-01

Belinda Phipson (19:12:46) (in thread): > @Lambda Moses Gordon has pushed a fix for edgeR, so it should resolve now if you bump your version

Belinda Phipson (19:13:28) (in thread): > I also had the same warning come up and it has resolved this morning. I emailed Gordon and he pushed the fix ASAP.

2022-12-05

Joseph (13:20:17): > @Joseph has joined the channel

Joseph (13:27:05): > Hi everyone! I am new to the Bioconductor submission process and was hoping to get some help understanding the licensing. When choosing the license for my DESCRIPTION file, is it as simple as choosing an appropriate one from the Wikipedia page (https://en.wikipedia.org/wiki/Comparison_of_free_and_open-source_software_licenses)? Or is there a more intricate process involved? Also, do I need to consider discrepancies between libraries I import (example: dplyr on CRAN, which uses the MIT license, and limma on Bioconductor, which uses GPL-2)? Thanks in advance for the help! - Attachment: Comparison of free and open-source software licenses > This comparison only covers software licenses which have a linked Wikipedia article for details and which are approved by at least one of the following expert groups: the Free Software Foundation, the Open Source Initiative, the Debian Project and the Fedora Project. For a list of licenses not specifically intended for software, see List of free-content licences.

Kasper D. Hansen (21:38:08): > This is a good question; we have discussed in the technical advisory board that we need some written guidelines on it.

Kasper D. Hansen (21:38:19): > So this is my personal point of view

Kasper D. Hansen (21:39:39): > First, yes, in general it is as easy as picking a license and adding it to your DESCRIPTION file. The main complications arise when you include code other people wrote in your package. Also, co-authors need to be consulted here. I have some comments on the import question below.

Kasper D. Hansen (21:41:34): > Second, let me comment a bit on the legalese, in a hopefully objective way. We have a number of open source licenses available. These licenses use terms which have certain interpretations in the community, and often more than one interpretation – you can find many opinions about this on the internet. Legally, such licenses have to be interpreted by the courts (who will assign real legal meaning to the terms) and this has not happened at a large scale yet, although we had a recent ruling.

Kasper D. Hansen (21:43:15): > This is not just stuff I can write. As an example, the court system in Germany has ruled that the term “non-commercial usage” essentially only allows for personal usage. For example, a public university using the code in teaching is considered commercial usage in Germany. This is an example of a term where I think some people (perhaps even the authors of some licenses) would be surprised at this interpretation.

Kasper D. Hansen (21:44:39): > This emphasizes that most discussions about licenses are not always super grounded in reality. I say this because you can find very strong opinions about this from people who sound like they know what they are talking about, but often it is just opinions. I would say an organization like the FSF has had some lawyers look at this, but again, this is “just” lawyers and not the courts.

Kasper D. Hansen (21:48:01): > I can give a quick rundown of licenses, and I am happy to include more details. This is a dated view - it’s been a while since I have systematically looked at this. > > There are the really free licenses like BSD and MIT which allow users to do essentially whatever they want with the code. This includes the situation where a company incorporates the code into commercial software, a situation many academics are horrified by. I am not; in many cases I think that is a success story. I will say that some academic code is widely used, and could be considered worth something, but that wide usage is usually occurring partly because the code was freely licensed. I have certainly heard about code which has less usage because of the license. This is why I like stuff like BSD and MIT, although neither is my license of choice.

Kasper D. Hansen (21:48:58): > Then we have stuff like GPL and friends which tries to restrict usage for the common good. A lot of good stuff has happened because of GPL, but personally, I want my software to be as widely used as possible, see previous paragraph.

Kasper D. Hansen (21:50:50): > Personally, I use Artistic-2.0 which (at least was) widely used in Bioc by the initial creators. Artistic-2.0 essentially allows any usage WITH CREDIT to the authors. Note that both GPL and MIT/BSD really don’t care about whether authors get credit. This goes against much of the reward structure in academia. I don’t have an issue with people using my code, but I would like to get some recognition for it. Perhaps there is a better license than Artistic-2.0; I am not up to date on this.

Kasper D. Hansen (21:51:49): > Regarding import and friends: this is an interpretation issue. For better or worse, the view in the R community is that you can choose your license freely irrespective of what your package depends on.

Kasper D. Hansen (21:52:25): > The exception is really only when you include code written by others in the package. For example, Rgraphviz bundles Graphviz and therefore has inherited its license.

Dirk Eddelbuettel (21:53:12) (in thread): > > which tries to restrict usage for the common good > No, as the GPL is about redistribution. You can use GPL-licensed code as you see fit in any way (many corporations do) but it places limits on (re-)distribution. That is a feature.

Kasper D. Hansen (21:54:12): > Some packages say things like “the code in these files is under license A and the code in these files is under license B”. Those licenses are very hard to parse computationally. For this reason, I believe that we will work towards only having single licenses, i.e. all the code in a package is under the same license. This is what CRAN has asked for a while. But I am not sure we have a final decision on this for the project.
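For reference, the License field syntax that R itself supports (per Writing R Extensions) already covers a single standard license, alternatives the user may choose between, and a template license with a pointer to a file:

```
License: Artistic-2.0
License: GPL-2 | GPL-3
License: MIT + file LICENSE
```

Anything beyond these standardized forms (e.g. per-file licensing) is what makes machine parsing hard.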

Dirk Eddelbuettel (21:54:26) (in thread): > That discussion is as old as free software. GitHub has a good overview, and some sites such as https://choosealicense.com/ are helpful. (Sorry, started to type this hours ago but got distracted.) - Attachment (Choose a License): Choose an open source license > Non-judgmental guidance on choosing a license for your open source project

Kasper D. Hansen (21:55:03) (in thread): > Absolutely, I got tired and was not precise here. To really understand GPL you need to read something that is not included in my description. The same goes for MIT/BSD

Dirk Eddelbuettel (21:55:37) (in thread): > Not so. CRAN has multi-license packages. Sometimes you get a license from things you include, but your own code may be different.

Kasper D. Hansen (21:56:10) (in thread): > Nevertheless, I have been asked by CRAN to not do this. Perhaps this policy has changed, but I know for a fact I was asked to do this.

Kasper D. Hansen (21:56:33) (in thread): > Although - granted - it was some years ago

Kasper D. Hansen (22:01:33) (in thread): > Ok, I read the CRAN policies. Multi-license packages are - in my reading - neither explicitly allowed nor explicitly prohibited.

Dirk Eddelbuettel (22:04:33) (in thread): > Do `db <- tools::CRAN_package_db()`, which gets an NbOfPackages x 65-or-so-columns data frame, and tabulate the license column. Many point to an external license file. Here is one from a package I am involved with: https://github.com/corels/rcppcorels/blob/master/inst/LICENSE

Kasper D. Hansen (22:04:59) (in thread): > This really needs some better history with examples and what we have learned. The community does have some anecdotal experiences that would be helpful to describe.

Kasper D. Hansen (22:11:16) (in thread): > I believe you. Your example is not great because you have a LICENSE in inst whereas R-exts specifies it needs to be in the package source top level directory, and the licenses mentioned in the file are not the same as those mentioned in the License field of the DESCRIPTION. But yes, I have seen multi-licenses before and perhaps the changes I was asked to make have been considered too restrictive for CRAN. I am just pointing out the policy does not explicitly allow it, and we know lack of explicit permission is sometimes interpreted …

Kasper D. Hansen (22:12:49) (in thread): > “example not great” in the sense that this should not be copy and pasted. It is of course a great example of a multi-license package

Dirk Eddelbuettel (22:14:50) (in thread): > It goes both ways. I know Hannes includes a bazillion things inside the sources of duckdb (as ‘vendored’ C++ libraries). Yet he includes a one-line license for CWI only. Likely incorrect in the stricter sense as he ships other sources with his. > > Debian is actually stricter here than CRAN, and when we upload we have to explicitly enumerate licenses (and copyrights). > > Also, what you mentioned earlier on ‘hard to parse’ is ‘so what’. Python has PEPs for that, Debian has parseable copyright formats. It can (and should?) be accommodated, but both CRAN and BioC are smaller … and resources are still finite – so no auto-parseable setups.

Dirk Eddelbuettel (22:17:02) (in thread): > Anyway, it’s a complicated (and old) topic. And despite pumping a 1000 words out between us I doubt we added that much more ‘signal’ over the ‘noise’.

Kasper D. Hansen (22:18:02) (in thread): > Absolutely. Well. I do think there are a couple of R specific comments here and I also think that I put out my view on academic credit and did a very short explanation of why so many core packages are Artistic 2.0

Kasper D. Hansen (22:18:45) (in thread): > We should however have a guide on this in Bioc and clearly we would want you to read it:slightly_smiling_face:

Dirk Eddelbuettel (22:19:21) (in thread): > Narrator: Dirk was last seen hiding under his table.

Kasper D. Hansen (22:19:58) (in thread): > That’s your prize for speaking up:slightly_smiling_face:Thanks for the comments btw. because I was def. imprecise.

2022-12-06

Joseph (13:07:54): > This is great! Thank you so much for taking the time to answer this@Kasper D. Hansenand@Dirk Eddelbuettel.

Kevin Rue-Albrecht (13:24:29): > For what it’s worth, licensing options are mentioned in a few places in the guidelines: > * http://contributions.bioconductor.org/description.html?q=license#description-license > * http://contributions.bioconductor.org/license.html?q=license#license > It’s not much help if you’re looking for advice on choosing a license, but it gives a few good pointers. > You’ll also find a link to a list of licenses used in various parts of R: https://www.r-project.org/Licenses/ - Attachment (contributions.bioconductor.org): Chapter 6 The DESCRIPTION file | Bioconductor Packages: Development, Maintenance, and Peer Review > The DESCRIPTION file must be properly formatted. The following sections will review some important notes regarding fields of the DESCRIPTION file and associated files. 6.1 Package This is the… - Attachment (contributions.bioconductor.org): Chapter 9 The LICENSE file | Bioconductor Packages: Development, Maintenance, and Peer Review > The license field should preferably refer to a standard license (see wikipedia) using one of ’s standard specifications. Licenses restricting use, e.g., to academic or non-profit researchers, are…

Marcel Ramos Pérez (14:01:54) (in thread): > FWIW there are numerous online resources that can guide a developer when choosing a license (first two links are from GitHub): https://choosealicense.com/, https://github.com/readme/guides/open-source-licensing and https://www.digitalocean.com/community/tutorials/understanding-open-source-software-licenses

2022-12-12

Umran (17:57:51): > @Umran has joined the channel

2022-12-13

Ana Cristina Guerra de Souza (08:59:41): > @Ana Cristina Guerra de Souza has joined the channel

2022-12-16

Laurent Gatto (11:41:26): > The Bioconductor classes and methods working group is going to reflect on what classes should be considered ‘official’ and thus re-used in newly submitted packages, and to what extent this should possibly be enforced during review. > > If you would like to join the discussion, please open an issue in the working group github repo and tag yourself. We plan our first round of discussions for Jan 2023. - Attachment (workinggroups.bioconductor.org): Chapter 2 Currently Active Working Groups / Committees | Bioconductor Working Groups: Guidelines and activities > The following describe currently active working groups listed in alphabetical order. If you are interested in becoming involved with one of these groups please contact the group leader(s). 2.1… - Attachment (contributions.bioconductor.org): Chapter 4 Important Bioconductor Package Development Features | Bioconductor Packages: Development, Maintenance, and Peer Review > 4.1 biocViews Packages added to the Bioconductor Project require a biocViews: field in their DESCRIPTION file. The field name “biocViews” is case-sensitive and must begin with a lower-case ‘b’….

Laurent Gatto (11:53:28) (in thread): > @Lluís Revilla, I believe you brought this up here some time ago and might be interested.

Lluís Revilla (12:04:09) (in thread): > Thanks for the ping Laurent. Yes, I’m interested and Vince Carey emailed me some time ago to contribute more on this.

Lluís Revilla (12:04:35) (in thread): > Count me in

Laurent Gatto (12:05:24) (in thread): > Could you still ping yourself in an issue, to make sure, and if you want send a PR to add yourself as a working group member.

Lluís Revilla (12:28:04) (in thread): > Done!

Malte Thodberg (13:19:39) (in thread): > I would like to join this discussion as well if possible!

2022-12-17

Dario Righelli (03:00:23) (in thread): > hi @Laurent Gatto I’d like to join the working group, do I have to open a new issue or just ping myself in an already opened one?

Laurent Gatto (03:35:57) (in thread): > As you prefer: if you have specific points you would like to highlight, feel free to open an issue and spell them out (even if some have been mentioned by others already) or just reply to one that is already open saying that you would like to join too (and adding your comments if relevant).

2022-12-19

Dario Righelli (03:41:05) (in thread): > ok thanks!

Aedin Culhane (06:36:49): > Has anyone created a Bioconductor package with Rust code? @Michael Lynch

Michael Lynch (06:36:52): > @Michael Lynch has joined the channel

Vince Carey (06:41:09): > @Aedin Culhane the rd4 package https://github.com/Bioconductor/Contributions/issues/2732 uses Rust; it was building and moving through review but stalled at the submitter’s end. - Attachment: #2732 (inactive) rd4 initial submission > Update the following URL to point to the GitHub repository of
> the package you wish to submit to Bioconductor > > • Repository: https://github.com/fulcrumgenomics/rd4 > > Confirm the following by editing each check box to ‘[x]’ > > I understand that by submitting my package to Bioconductor,
> the package source and all review commentary are visible to the
> general public. > I have read the Bioconductor Package Submission
> instructions. My package is consistent with the Bioconductor
> Package Guidelines. > I understand Bioconductor <https://bioconductor.org/developers/package-submission/#naming|Package Naming Policy> and acknowledge
> Bioconductor may retain use of package name. > I understand that a minimum requirement for package acceptance
> is to pass R CMD check and R CMD BiocCheck with no ERROR or WARNINGS.
> Passing these checks does not result in automatic acceptance. The
> package will then undergo a formal review and recommendations for
> acceptance regarding other Bioconductor standards will be addressed. > My package addresses statistical or bioinformatic issues related
> to the analysis and comprehension of high throughput genomic data. > I am committed to the long-term maintenance of my package. This
> includes monitoring the support site for issues that users may
> have, subscribing to the bioc-devel mailing list to stay aware
> of developments in the Bioconductor community, responding promptly
> to requests for updates from the Core team in response to changes in
> R or underlying software. > I am familiar with the Bioconductor code of conduct and
> agree to abide by it. > > I am familiar with the essential aspects of Bioconductor software
> management, including: > > ☑︎ The ‘devel’ branch for new packages and features. > ☑︎ The stable ‘release’ branch, made available every six
> months, for bug fixes. > ☑︎ Bioconductor version control using Git
> (optionally via GitHub). > > For questions/help about the submission process, including questions about
> the output of the automatic reports generated by the SPB (Single Package
> Builder), please use the #package-submission channel of our Community Slack.
> Follow the link on the home page of the Bioconductor website to sign up.

Peter Hickey (17:03:52): > Not BioC, but general R + Rust advice: https://github.com/r-rust

2022-12-20

Vince Carey (06:09:05): > Thanks Pete! @Hervé Pagès might like https://ivanceras.github.io/svgbob-editor/

2022-12-22

Hervé Pagès (09:45:19) (in thread): > :blush: didn’t really work on https://github.com/Bioconductor/SparseArray/blob/143bd1917dc719076a51b0f041ba55e9503da14f/man/SVT_SparseArray-class.Rd#L150-L154

2023-01-03

Nitesh Turaga (15:24:38): > Hi folks, Happy new year! > > Where in the DESCRIPTION file would I put a package that is only used in unit testing? Can I use roxygen @importFrom-like tags in unit test functions via testthat?

Dirk Eddelbuettel (15:25:22) (in thread): > Suggests is common, and is left without complaint by R CMD check --as-cran

Nitesh Turaga (15:25:42) (in thread): > Perfect! Thanks@Dirk Eddelbuettel

Dirk Eddelbuettel (15:26:44) (in thread): > And pkg::function helps with the import considerations. You may want to also test for presence via if (requireNamespace("pkg", quietly = TRUE))

Nitesh Turaga (15:32:07) (in thread): > Thank you!
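A minimal sketch of the pattern Dirk describes, assuming a hypothetical Suggests-only package called `optionalPkg` with a `helper()` function (both names invented for illustration):

```r
# DESCRIPTION would list:  Suggests: optionalPkg
# Every use is guarded so the dependency stays optional:
use_helper <- function(x) {
  if (requireNamespace("optionalPkg", quietly = TRUE)) {
    optionalPkg::helper(x)          # hypothetical function
  } else {
    message("optionalPkg not installed; skipping")
    NULL
  }
}

# In a testthat test file, the equivalent guard is:
#   testthat::skip_if_not_installed("optionalPkg")
```

With the guard in place, `R CMD check` passes whether or not the suggested package is present.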

2023-01-04

Kozo Nishida (04:47:15): > Hi all, > Will the “Bioconductor Developers’ Forum” be held this month? > I’d like to check so I can manage the event calendar. https://bioconductor.org/help/events/

Federico Marini (10:59:55): > Maybe I missed it, or I had something else in place to show me this: I recall BiocCheck used to show the location of the “offending” errors/warnings/notes. This behavior seems to be gone in the current release. > Is there a way to display these locations again? I tried searching the docs but was unsuccessful so far. Thanks in advance!

Marcel Ramos Pérez (11:02:07): > Hi Federico, if you are running BiocCheck interactively you should see a .BiocCheck folder in your directory. It will contain the full report.

Federico Marini (11:04:19): > ah I see, thanks a lot. Keeping an eye on the RStudio project “main folder” does not help to spot it. I will keep the file open and see if it updates upon re-generation :wink:

Federico Marini (11:05:11): > aaaaaaah there are also my good old namespace import suggestions from :love_letter: codetoolsBioc :love_letter:

Federico Marini (11:05:37): > (which I know are to be taken with a grain of salt, but they still help me make sure the bulk of it is in)

2023-01-06

Henrik Bengtsson (11:29:26): > I just noticed that Bioc package pages now link to the corresponding folder in https://code.bioconductor.org/, e.g. on https://bioconductor.org/packages/devel/bioc/html/illuminaio.html there’s: > > Bioc Package Browser: https://code.bioconductor.org/browse/illuminaio/ Neat! :pray:

Lori Shepherd (11:33:32) (in thread): > Credit to @Mike Smith for the PR to add it :slightly_smiling_face:

Nitesh Turaga (16:10:14): > Has anyone had success with the Remotes: field in the DESCRIPTION of R packages to auto-install dependencies? > > It seems when I try it, it says there is no package > > Error in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]) : > there is no package called 'myPackage' > Calls: <Anonymous> ... loadNamespace -> withRestarts -> withOneRestart -> doWithOneRestart > Execution halted > ERROR: lazy loading failed for package 'mainPackage' > > The exact annotation I have in my Remotes: field is > > Remotes: github::privateorg/myPackage@main >

Nitesh Turaga (16:10:41): > I took the Remotes: suggestion from https://cran.r-project.org/web/packages/remotes/vignettes/dependencies.html

Alan O’C (16:15:35) (in thread): > How are you loading/installing your package? Remotes has worked fine for me. > > I’ve not tried with private repos, but I imagine as long as your PAT is set up right it should be fine.

Nitesh Turaga (16:16:12) (in thread): > PAT is set up right, i’m able to confirm this, because when I doinstall_github()it works without a hitch.

Nitesh Turaga (16:16:27) (in thread): > I’m just trying an R CMD INSTALL at the moment.

Alan O’C (16:18:20) (in thread): > Ah. Yeah, I think Remotes is picked up by remotes::install_* and devtools::install_*, but not by the base R infra, e.g. install.packages will also ignore it I think. This is (partly) why there’s the warning at the end about CRAN submission

Nitesh Turaga (16:21:14) (in thread): > I see! So, just install them separately then? > > remotes::install_github('privateorg/myPackage') > > ## then > > remotes::install_github('privateorg/mainPackage') >

Nitesh Turaga (16:21:35) (in thread): > Cause I don’t understand why the first line works, and not within theRemotes:field

Nitesh Turaga (16:21:52) (in thread): > > > packageVersion('remotes') > [1] '2.4.2' >

Marcel Ramos Pérez (16:23:44) (in thread): > R CMD INSTALL wouldn’t pick the Remotes field up; only remotes::install_local and friends would, AFAICT

Alan O’C (16:24:36) (in thread): > remotes::install_github('privateorg/mainPackage') should work on its own in that context

Nitesh Turaga (16:25:58) (in thread): > Let me try again

Ludwig Geistlinger (16:31:45) (in thread): > What worked for me is to put the remote (private) GitHub-only package name as-is into Depends:, Imports:, or Suggests: (i.e. Imports: myPackage) and, in addition, declare the repo via Remotes: privateorg/myPackage

Nitesh Turaga (16:37:02) (in thread): > I see, let me try that as well.

Nitesh Turaga (16:39:59) (in thread): > This is the correct answer! Thanks @Ludwig Geistlinger, @Marcel Ramos Pérez, and @Alan O’C
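Putting the thread's conclusion together, the working DESCRIPTION can be sketched as follows (package and org names are the placeholders used in the thread):

```
Package: mainPackage
Imports: myPackage
Remotes: github::privateorg/myPackage@main
```

`remotes::install_github('privateorg/mainPackage')` (or the devtools equivalents) then resolves `myPackage` from the private repo via the Remotes field, while base R tools such as `R CMD INSTALL` and `install.packages()` ignore that field entirely.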

2023-01-09

Simon Pearce (07:22:05): > Does anyone happen to have an example of automatically running tests via GitLab CI, rather than GitHub Actions? I need to develop internally on a GitLab instance for “political” reasons.

Alan O’C (07:53:05) (in thread): > I don’t have an example handy but I did this in a previous org. Are you just looking to get started, or do you have a sticking point?

Tiago C. Silva (17:19:23) (in thread): > @Simon Pearce Here is my .gitlab-ci.yml; I hope it helps you. You will also need to set up the CI/CD runner. I am using pkgdown to create the documentation and covr to create the examples/test coverage reports. > > # The Docker image that will be used to build your app > image: tiagochst/bioc_libs:latest > > stages: > - check > - pages > - coverage > > pages: > stage: pages > script: > - R -e "devtools::install()" > - R -e "pkgdown::build_site()" > - R -e "covr::gitlab(type='all')" > artifacts: > paths: > - public > > coverage: > stage: coverage > dependencies: > - pages > script: > - R -e 'covr::to_cobertura(covr::package_coverage(), filename = "public/cobertura-coverage.xml")' > artifacts: > reports: > coverage_report: > coverage_format: cobertura > path: public/cobertura-coverage.xml > > check: > stage: check > script: > - R -e 'devtools::check()' >

2023-01-10

Simon Pearce (01:35:20) (in thread): > Thanks. > I’ve not used GitLab CI before. I have GitHub Actions running nicely via the biocthis package, but now need to swap (awkwardly, in the week I had earmarked for development within our group!), so I don’t have much of a starting point apart from what biocthis made for me.

Atuhurira Kirabo Kakopo (10:52:06): > @Atuhurira Kirabo Kakopo has joined the channel

Nitesh Turaga (11:39:29): > Is there a way for BiocCheck::BiocCheck() to identify which files the T/F-style abbreviations of TRUE/FALSE are coming from?

Nitesh Turaga (11:39:38): > or a fancy regex for it?

Marcel Ramos Pérez (11:40:54): > Yes, checking the <package>.BiocCheck folder will give you all the locations after running BiocCheck::BiocCheck on the source pkg folder
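As a rough stand-in for the "fancy regex" asked about above (this is not BiocCheck's own detection logic, which inspects parsed code, so expect some false positives in strings and comments), a grep along these lines locates bare T/F in a package's R/ directory:

```shell
# set up a tiny demo R/ directory with one offending line
mkdir -p demo_pkg/R
printf 'flag <- T\nkeep <- FALSE\n' > demo_pkg/R/example.R

# report file:line for a bare T or F that is not part of a longer identifier
grep -rnE '(^|[^[:alnum:]._])[TF]([^[:alnum:]._]|$)' demo_pkg/R
# -> demo_pkg/R/example.R:1:flag <- T
```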

Robert Shear (14:11:21): > @Robert Shear has joined the channel

Nick R. (18:39:59): > @Nick R. has joined the channel

Nick R. (18:53:31): > Hi all, I was wondering if it is possible to add an organisation-level maintainer to several packages? We would like to make it easier for our group to collaborate on packages and push fixes even when the primary maintainer is not available.

2023-01-11

Laurent Gatto (01:57:18) (in thread): > My 2 cents: as long as that email is registered on the support forum, I suppose it should pass the checks. While I do see the benefits, the risk is that because there’s no single person responsible for addressing issues, nobody is really responsible (I think that’s also why there’s a note (or is it a warning?) when multiple persons have the maintainer role in the DESCRIPTION file).

Simon Pearce (03:09:41) (in thread): > Thanks @Tiago C. Silva, managed to get it to work with that (and help from our Scientific Computing team to set up the runners).

Nick R. (17:25:59) (in thread): > I totally understand the diffusion-of-responsibility perspective, and from our side there will always be a person who should be the first contact for each package. One of the main problems is when students are the maintainers and they disappear.

Nick R. (17:27:22) (in thread): > So would the best way be to add a second maintainer with a valid email to the description file? Will this be enough to give them access rights?

2023-01-12

Laurent Gatto (00:34:55) (in thread): > Adding a second maintainer will trigger a note or warning when checking (for the reasons above). Having said that, being a maintainer (as defined in the DESCRIPTION file) and having write access to the Bioconductor git repo (provided by Bioc core upon acceptance) are two different things. I believe that the maintainer and another author could be given access (to be confirmed by @Lori Shepherd) so that if/when the maintainer leaves, the other author could take over and update the roles in the DESCRIPTION file.

Lori Shepherd (08:02:27) (in thread): > I agree with everything said thus far. There still has to be someone maintaining access keys for an organizational-level email, if that is utilized. Simply adding a maintainer to the DESCRIPTION will not give them access rights; we would need to change access on our side as well and create an email-associated account on the BiocCredentials app. The fears with multiple maintainers and an organizational-level account are, as Laurent pointed out: 1. that no one would be responsible; 2. if there are internal disputes between maintainers on the direction of a package, it could get messy; 3. multiple maintainers have to be careful not to push and override other maintainers’ changes, which is why we recommend the approach of PRs to a locally maintained repo with a single maintainer pushing to Bioconductor. It should also be noted that when/if there are ever issues with a package, we generally only reach out to the listed maintainer. There will still need to be some level of control from someone to manage the access keys if a different organizational-level email is used. A few people who tried this in the past then did not monitor that organizational-level email and missed important deadlines or notifications to fix packages, so be cautious. Also, if this approach is used, make sure it is not something that would require sign-up (like a mailing-list email), for the same reason as stated: auto emails would then fail and bounce, and we consider that an invalid email, which is not allowed.

Nick R. (21:18:41) (in thread): > Our ideal situation is for the nominated maintainer to keep their current responsibilities and to have the organisation-level maintainer as a contingency plan. We have a lot of packages under the SydneyBioX umbrella and maintainers are not always as responsive as we would like. Sometimes, due to inter-dependencies, a bug in one package can cause problems in other packages in our environment. For these reasons, we would like an organisation-level account with push access to the BioC repo that will allow us to make changes when the maintainer is unavailable.

Nick R. (21:23:13) (in thread): > We have already created an email for the task (maths.bioconductor@sydney.edu.au). In regard to your concern that the email may not be answered: all emails to that address will be forwarded to me, and I will ensure that any administrative issues are dealt with in a timely manner.

2023-01-20

Lluís Revilla (02:44:40) (in thread): > Hi @Laurent Gatto, my calendar is getting full for the end of January. Will we have a starting meeting by the end of this month? Or do you prefer to continue the discussions via the GitHub repo for some more time (or a channel here)?

Laurent Gatto (02:45:26) (in thread): > Sorry about that - I’ll follow up shortly to set a date for our meeting.

Lluís Revilla (02:57:30) (in thread): > Great:+1:

2023-01-23

Mark Jens (07:56:21): > @Mark Jens has joined the channel

Nick R. (22:34:36): > Hi all, I was wondering if there is an API for retrieving information about package builds on the BioC build servers (i.e. build status and where problems occurred). I am trying to build a dashboard to monitor several packages and I haven’t been able to find any documentation of an API.

Peter Hickey (22:35:57) (in thread): > Are the package badges sufficient? Something like what Aaron has done with https://github.com/LTLA/LTLA/blob/master/README.md

Nick R. (22:40:06) (in thread): > That’s a really good suggestion! I was planning something a little more elaborate, but if an API isn’t available I’ll probably use something like this until I’ve written the web scraping code.

2023-01-24

Martin Grigorov (02:56:18): > @Martin Grigorov has joined the channel

Martin Morgan (02:56:28) (in thread): > BiocPkgTools provides summaries of build reports (see the vignette).
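A sketch of that route, assuming the BiocPkgTools package is installed; as I recall it, `biocBuildReport()` downloads and tidies the daily build report (so it needs network access), but check the vignette for the exact interface:

```r
# sketch: query the daily build report with BiocPkgTools
library(BiocPkgTools)
report <- biocBuildReport()          # one row per package / node / stage
subset(report, pkg == "ClassifyR")   # build status for a single package
```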

Federico Marini (08:45:00): > FYI, https://github.com/git/git/security/advisories/GHSA-c738-c5qq-xg89 Got it from our sysadmins

Vince Carey (11:01:20) (in thread): > @Nick R.keep us posted about your dashboard development. I would agree that an API would be appropriate for this information.

2023-01-25

Tim Triche (14:49:16): > nice catch. updating now

2023-01-26

Yu Zhang (12:31:43): > @Yu Zhang has joined the channel

Hervé Pagès (13:19:53) (in thread): > Luckily nobody has crazy .gitattributes files in their packages: > > biocbuild@nebbiolo2:~/bbs-3.16-bioc/MEAT0$ wc */.gitattributes > 2 14 85 BASiCS/.gitattributes > 17 42 378 BatchQC/.gitattributes > 7 14 103 DEScan2/.gitattributes > 2 10 122 ELMER/.gitattributes > 4 8 52 fgsea/.gitattributes > 6 12 86 gdsfmt/.gitattributes > 6 12 86 HIBAG/.gitattributes > 1 2 35 iCNV/.gitattributes > 1 11 65 iterClust/.gitattributes > 1 2 12 missMethyl/.gitattributes > 1 3 17 MOFA2/.gitattributes > 4 8 52 motifcounter/.gitattributes > 0 2 11 MSstatsShiny/.gitattributes > 5 8 53 MultiAssayExperiment/.gitattributes > 3 8 51 PharmacoGx/.gitattributes > 0 0 0 Prostar/.gitattributes > 3 8 51 psichomics/.gitattributes > 4 8 58 regsplice/.gitattributes > 17 42 378 scoreInvHap/.gitattributes > 4 8 58 SimBu/.gitattributes > 6 12 86 SNPRelate/.gitattributes > 94 234 1833 total >

2023-01-30

Nitesh Turaga (12:29:02): > Has anyone weighed the benefits of the assertthat package vs just using stopifnot for validation in R packages? It throws better error messages, and there isn’t any serious slowdown in the checking (benchmarking), but I’m wondering if there are other issues I’m not considering. I do not like the additional package dependency, though.

Marcel Ramos Pérez (12:43:35) (in thread): > I haven’t, but usually if statements with clear stop calls are enough for my purposes. For more robust checking, we started adding checking functions in BiocBaseUtils, e.g., isCharacter. Both assertthat and BiocBaseUtils have minimal deps.

Marcel Ramos Pérez (12:45:21) (in thread): > FWIW, stopifnot allows an error message as the name of the argument, e.g., stopifnot("here is an error" = FALSE)

Spencer Nystrom (12:52:12) (in thread): > Wow I never knew that, and the syntax is wild

Nitesh Turaga (14:14:10) (in thread): > I knew about the stopifnot syntax, but it’s not pleasant as the message gets longer :slightly_smiling_face:
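A quick sketch of the named-argument `stopifnot()` form Marcel mentions (available since R 3.5), wrapped in a hypothetical validator so the condition name becomes the error message:

```r
# the name of each condition becomes the error message when it fails
check_n <- function(n) {
  stopifnot("`n` must be a single positive number" =
              is.numeric(n) && length(n) == 1L && n > 0)
  n
}

check_n(3)    # returns 3
# check_n(-1) # errors with: `n` must be a single positive number
```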

Nick R. (21:06:57) (in thread): > I made a little dashboard. The web scraping code can be found here. - Attachment (nick-robo-biocbuildcheck-dash-db466s.streamlit.app): Streamlit > This app was built in Streamlit! Check it out and visit https://streamlit.io for more awesome community apps. :balloon:

Nick R. (21:36:26) (in thread): > Let me know if you think I should add anything

2023-01-31

Malte Thodberg (03:58:24) (in thread): > I’ve moved on to using the checkmate package, mainly because I find it quicker to write than assertthat

Francesc Català-Moll (14:33:59): > @Francesc Català-Moll has joined the channel

2023-02-01

Nick R. (18:44:18): > It seems like R CMD INSTALL on lconway is not using the Makevars found in src/, unlike the other hosts. ClassifyR only has a check warning on lconway, and inspecting it we can see that lconway is using the incorrect C++ standard (-std=gnu++14) to compile the .cpp files.

Kasper D. Hansen (21:27:16): > I looked a bit at this, @Nick R.

Kasper D. Hansen (21:28:56): > Like you, I am surprised that the mac platform uses -std=gnu++14 despite the fact that you set CXX_STD in Makevars. I have no experience with C++ standards, but when I read the section in R-exts, I get the impression that it should work.

Kasper D. Hansen (21:29:59): > But you’re wrong when you say it works on Linux. The compilation call on Linux also includes -std=gnu++14, but despite this you don’t get any warnings, probably because it is a different compiler (Clang vs GCC)

Kasper D. Hansen (21:31:40): > Note that both platforms have -std=gnu++14 as part of their standard CXX compiler call (see the configuration page), so one possibility is that your Makevars is not enough to force a switch from CXX to CXX11. But here I have no experience; I am just offering a hypothesis

Kasper D. Hansen (21:33:34): > On Windows you do get -std=gnu++11, but this call is part of the CXX call on Windows, so I am again offering my hypothesis that your Makevars doesn’t force the switch on any platform

Nick R. (21:42:53): > I didn’t notice that it’s using 14 on Linux because I only checked the Windows build. I guess it isn’t using Makevars at all.

Nick R. (21:44:14): > It seems to be working as expected on devel, which is even stranger

Kasper D. Hansen (21:45:37): > yeah, that’s interesting

Kasper D. Hansen (21:46:01): > But is this worth thinking about, because this seems to be a change that should only go in devel?

Kasper D. Hansen (21:46:17): > (ok, yes, it is worth thinking about)

Kasper D. Hansen (21:54:33): > I don’t see any mention of this in R-devel NEWS

2023-02-02

Dirk Eddelbuettel (11:10:25): > I would venture that R CMD INSTALL does not miss src/Makevars (or src/Makevars.win). Its use is more than critically important. If I had to guess, I would think that any confusion is due to the (arguably confusing!) choice of multiple CXX*FLAGS. You do need, say, CXX14FLAGS when you do CXX_STD = CXX14; ditto for 11 or 17 or 20. Also, when you see -std=gnu++14, that is just a quirk of how R interacts with gcc/g++. I just did some work on a C++-version-specific bug/misfeature, and helped a few R(cpp) users confused because R-devel now checks and barks about ‘Specified C++11: please update to current default of C++17’ (which may become its own can of worms). Happy to expand.

Dirk Eddelbuettel (12:24:28): > Also: R 4.2.0 defaults to C++14, and there was a recent r-devel commit correcting it on Windows; see https://developer.r-project.org/blosxom.cgi/R-devel/2023/01/26#n2023-01-26. And R 4.3.0 is expected to default to C++17.

Kasper D. Hansen (12:59:59): > @Dirk Eddelbuettel I don’t know if you looked at the code. You’re the expert here. @Nick R. is essentially assuming that the only thing he needs to do to enforce C++11 is to have > > CXX_STD = CXX11 > > in Makevars. Nothing in DESCRIPTION and nothing anywhere else. Just casually reading R-exts, this is the same impression I get.

Dirk Eddelbuettel (13:02:18): > That is correct (as seen in lots of CRAN packages). And as of a week ago, r-devel now says ‘NOTE Specified C++11: please update to current default of C++17’, which scares a lot of people. > > Also correct is what I wrote: ‘CXXFLAGS’ will be ignored, and once you set C++11 as the standard you need ‘CXX11FLAGS’, which is a gotcha.
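The two gotchas together, as a src/Makevars sketch (the extra flag is a placeholder for illustration):

```
# src/Makevars
CXX_STD = CXX11          # selects the C++11 toolchain: $(CXX11) $(CXX11STD)

# package-level flags still go in PKG_CXXFLAGS; but site/user-level
# flags are now read from CXX11FLAGS rather than CXXFLAGS
PKG_CXXFLAGS = -DMY_PKG_DEFINE
```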

Kasper D. Hansen (13:18:09): > Yes, so the confusing thing is that on the Bioc build servers we can see the CXX11FLAGS and CXX11 etc. and, to my casual eye, it looks correctly set up.

Kasper D. Hansen (13:19:00): > But on one build, setting the Makevars as above doesn’t seem to switch to CXX11FLAGS as expected, while on the newer build, it works as expected

Kasper D. Hansen (13:19:33): > This could be a build system change (although to my eye the two systems look similar), it could be an R change, or something else.

Kasper D. Hansen (13:20:48): > DOH (as expected), I was getting confused by the various versions etc.

Kasper D. Hansen (13:21:33): > In release (version 3.2.6) you don’t have the Makevars file, @Nick R., as you can see in the Bioc git at https://code.bioconductor.org/browse/ClassifyR/RELEASE_3_16/

Kasper D. Hansen (13:21:38): > Ok, that solves it

Dirk Eddelbuettel (13:45:00): > Got it. I have played similar tricks on myself (mostly locally) with cleanup scripts that rm src/Makevars, which is good when you have configure creating one from src/Makevars.in but deadly when you do not, as you just deleted your build instructions. Oops. All good then.

Nick R. (17:11:00): > Oh, it was my mistake then:sweat_smile:. Thanks for the help

2023-02-06

Martin Grigorov (07:32:30): > Hi! What is the upstream Git repo for this package: https://bioconductor.org/packages/release/bioc/html/bgx.html? > I see “Source Repository git clone https://git.bioconductor.org/packages/bgx” at the bottom of that page, but I am not sure whether this is the main repo or a mirror of it.

Lori Shepherd (07:37:15): > git.bioconductor.org/packages/ is the upstream git repo that Bioconductor uses for any package. Maintainers may keep separate GitHub repos that we hope are in sync with what is there, but if you are looking for the source code of a distributed Bioconductor package, that is the correct location.

Martin Grigorov (07:37:57) (in thread): > Thank you,@Lori Shepherd!

Aidan Lakshman (14:08:54): > @Aidan Lakshman has joined the channel

Ying Chen (21:35:25): > @Ying Chen has joined the channel

2023-02-07

Rishi (04:32:57): > @Rishi has joined the channel

2023-02-20

Martin Grigorov (08:48:13): > Hi! I’m experiencing a problem while trying to build the qrqc package. Maybe someone here can help! > The qrqc package uses Rhtslib::pkgconfig("PKG_LIBS") to get the path to htslib.a (https://github.com/vsbuffalo/qrqc/commit/30eebece49543574866700baa17695c6567fb50e) > But then the build fails with: > > $ ~/bbs-3.17-bioc/R/bin/R CMD build qrqc > ... > gcc -shared -L/home/biocbuild/bbs-3.17-bioc/R/lib -L/usr/local/lib -o qrqc.so R_init_io.o io.o '/home/biocbuild/bbs-3.17-bioc/R/library/Rhtslib/usrlib/libhts.a' -lcurl -L/home/biocbuild/bbs-3.17-bioc/R/lib -lR > /usr/bin/ld: cannot find '/home/biocbuild/bbs-3.17-bioc/R/library/Rhtslib/usrlib/libhts.a': No such file or directory > collect2: error: ld returned 1 exit status > make: *** [/home/biocbuild/bbs-3.17-bioc/R/share/make/shlib.mk:10: qrqc.so] Error 1 > ERROR: compilation failed for package 'qrqc' > * removing '/tmp/RtmpeHmlbV/Rinst1a96c65e7c59f/qrqc' > ----------------------------------- > ERROR: package installation failed > > The file exists: > > ll /home/biocbuild/bbs-3.17-bioc/R/library/Rhtslib/usrlib/ > total 10480 > drwxrwxr-x 2 biocbuild biocbuild 4096 Feb 11 03:40 ./ > drwxrwxr-x 10 biocbuild biocbuild 4096 Feb 11 03:40 ../ > -rw-rw-r-- 1 biocbuild biocbuild 6824016 Feb 11 03:40 libhts.a > -rwxrwxr-x 1 biocbuild biocbuild 3894744 Feb 11 03:40 libhts.so* > lrwxrwxrwx 1 biocbuild biocbuild 9 Feb 11 03:40 libhts.so.2 -> libhts.so* >

Martin Grigorov (08:52:02) (in thread): > the problem is that Rhtslib::pkgconfig("PKG_LIBS") returns the path wrapped in single quotes. If I hardcode the path without the quotes then the build works!

Martin Grigorov (09:00:06) (in thread): > https://github.com/vsbuffalo/qrqc/pull/10

Martin Grigorov (09:00:52) (in thread): > @Hervé Pagès Could you please review this PR, since you introduced the usage of Rhtslib? I guess it worked for you somehow
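For reference, the pattern I understand Rhtslib's documentation to suggest for src/Makevars looks roughly like this, evaluating pkgconfig() at build time rather than hardcoding its output (double-check against the current Rhtslib vignette before relying on it):

```
RHTSLIB_LIBS = $(shell echo 'Rhtslib::pkgconfig("PKG_LIBS")' |\
    "$(R_HOME)/bin/R" --vanilla --slave)
PKG_LIBS = $(RHTSLIB_LIBS)
```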

2023-02-27

Mike Smith (09:52:47): > I’m currently seeing the following warning on the build system from some of my packages that bundle other libraries: > > * checking compiled code ... WARNING > Note: information on .o files is not available > File /home/pkgbuild/packagebuilder/workers/jobs/2925/R-libs/Rarr/libs/Rarr.so: > Found __printf_chk, possibly from printf (C) > Found __sprintf_chk, possibly from sprintf (C) > Found abort, possibly from abort (C) > Found printf, possibly from printf (C) > Found puts, possibly from printf (C), puts (C) > Found stderr, possibly from stderr (C) > > I remember this got elevated from a NOTE to a WARNING fairly recently; however, I’m not able to reproduce it at all on my local machine. Does anyone know if there’s a specific setting required to trigger this warning? I’ve tried using the Renviron.bioc and the version of R-devel available on the build system.

Hervé Pagès (13:46:48): > Hi Mike, > Note that both the SPB and daily builds use the --library=/path/to/library --install=check:captured-install-output.txt trick when running R CMD check in order to avoid an unnecessary reinstallation of the package during the check process. Strangely, it seems that we only get the “Compiled code should not call entry points which might terminate R” warning when using this trick. Not sure why. In theory this should produce check results that are identical to a regular R CMD check. I’ll ask the CRAN folks about this (they shared that undocumented trick with us a few years ago; I think they use it for the CRAN checks too, or at least they used to). > I can reproduce the warning on my laptop with: > > R CMD INSTALL Rarr_0.99.3.tar.gz >toto.txt 2>&1 > R CMD check --no-vignettes --timings --library=/home/hpages/R/R-4.3.r83438/library --install=check:toto.txt Rarr_0.99.3.tar.gz > > FWIW we also started to see this warning for several other packages on the daily report a couple of months ago, including for Rhtslib and Rhdf5lib. Note that the warning “propagates” to rev dep packages that are linked (via LinkingTo) to Rhtslib or Rhdf5lib, because they are statically linked. > Not sure what’s the best course of action from here. I remember that Martin (@Martin Morgan) was using the following trick in Rsamtools a few years ago when the package included its own copy of samtools: > * Use the following preprocessor options when invoking gcc: -Dprintf=_samtools_printf -Dexit=_samtools_exit -Dabort=_samtools_abort etc… > * Define “R-friendly” functions _samtools_printf, _samtools_exit, _samtools_abort, etc… that use things like Rprintf and error instead of printf and abort/exit. > BTW I’m not sure what the problem is with sprintf. Looks like the chances of false positives are actually quite high with this warning, so it’s kind of unfortunate that it got recently elevated from a NOTE to a WARNING. > Anyways, can we just ignore it for now?
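The Rsamtools-style preprocessor trick described above can be sketched like this. The names are illustrative; in a real package the wrapper would include <R_ext/Print.h> and forward to Rprintf/REvprintf, but here a vfprintf stub keeps the sketch self-contained and compilable outside R:

```c
#include <stdarg.h>
#include <stdio.h>

/* In src/Makevars you would add something like:
 *   PKG_CPPFLAGS = -Dprintf=_pkg_printf -Dabort=_pkg_abort
 * so the bundled library's calls get rewritten at compile time. */

/* Stand-in for R's Rvprintf/REvprintf; a real package would include
 * <R_ext/Print.h> and forward to R's own printing routines. */
static int pkg_vprintf(const char *fmt, va_list ap) {
    return vfprintf(stdout, fmt, ap);
}

/* R-friendly replacement for printf: same signature, same return value
 * (number of characters written), so callers are none the wiser. */
int _pkg_printf(const char *fmt, ...) {
    va_list ap;
    va_start(ap, fmt);
    int n = pkg_vprintf(fmt, ap);
    va_end(ap);
    return n;
}
```

The same shape works for `_pkg_abort`/`_pkg_exit` wrappers that call R's `error()` instead of terminating the process.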

Mike Smith (14:36:03): > Thanks @Hervé Pagès. I should have realised I was also seeing the warning in my GitHub action, which mimics the BBS trick of passing the install results (https://github.com/grimbough/Rarr/actions/runs/4281211813/jobs/7453955174#step:7:435). That would have saved me some time compiling R-devel from January :face_with_peeking_eye: I’m happy to ignore it for now in the existing packages like Rhdf5lib, since the code hasn’t changed and no one has ever complained about it before. I mostly wanted to figure out why I wasn’t seeing the warning, as it’s appearing for a package I have in submission but I hadn’t observed it myself.

2023-02-28

Kasper D. Hansen (08:05:40): > The issue here is that if you call (say) sprintf() from C code in an R package you can crash stuff. However, you can also have this symbol in the DLL without it ever being invoked, because of the way the code runs. For example, if the library defines a function that is never used in the package.

Kasper D. Hansen (08:06:06): > Herve, you could look at this by saving the compiled version of the library and running otool on the DLL

Kasper D. Hansen (08:09:53): > I have had the same NOTE (now WARNING) in Rgraphviz for many years. In Rgraphviz’s case it’s because the Graphviz DLL defines a number of output functions that are not being called from the R interface. Hervé’s solution is the right one, but I have been hesitant to go through the entire Graphviz code base (which is huge) and find all the occurrences.

Kasper D. Hansen (08:10:42): > It might be nice to (together) define a number of printf_override() functions, or perhaps even figure out how to do this with macros. I expect many external libraries will have this issue

Dirk Eddelbuettel (08:31:35) (in thread): > Not correct on the “can crash”. It “merely” garbles R’s output which is why the powers that be prefer we use the one used by R when everything gets synchronised. > > Now,abort()andexit()would take a (likely interactive) R session down so that is why they are banned.

Kasper D. Hansen (08:33:22) (in thread): > Thanks for being a bit more clear on this, I was mixing up the different calls

Dirk Eddelbuettel (08:35:25) (in thread): > No mas. And thanks to @Hervé Pagès for resurrecting the trick by @Martin Morgan. I had seen that "somewhere somehow" years ago and it bugged me that I could never find it. Might be worth a generalised header solution. > > Still leaves the other nag about which Duncan M just posted on r-package-devel the other day. Also annoying and likely not fixable with preprocessor tricks.

2023-03-01

jeremymchacón (12:12:13): > @jeremymchacón has joined the channel

2023-03-06

Lambda Moses (10:50:30): > When will the Bioc 3.17 release schedule be posted?

Lori Shepherd (11:12:08): > shortly – within this week. I am working on it now. We just got word last week of the estimated R 4.3 release date, which is what we schedule our spring release around

2023-03-07

Dario Righelli (05:09:13): > Hi all, sorry for the lame question: my package DEScan2 is failing a test on versions 3.16 and 3.17. > I'm fixing the error, but how do I push the fixed version to both Bioconductor versions? > In other words, version 3.16 is the RELEASE and 3.17 is the DEVEL, is that right? > Thanks in advance

Vince Carey (06:12:58): > literally, for the release use branch RELEASE_3_16, and for devel, use master - until tomorrow! There will be an announcement tomorrow when the default branch name for devel is changed to devel. See the box at bioconductor.org for some relevant links.

Dario Righelli (07:28:14): > Thanks Vince!

Marcel Ramos Pérez (10:28:29): > Hi Dario! This section has the details: http://contributions.bioconductor.org/git-version-control.html?q=bug%20fix#bug-fix-in-release-and-devel

Dario Righelli (10:50:51) (in thread): > thanks @Marcel Ramos Pérez, yes indeed I followed this tutorial, but I wasn't sure about the 3.17 version, because my package started to fail on both versions 3.16 and 3.17. > So, in the end, the "devel" version is still 3.17

Marcel Ramos Pérez (11:07:18) (in thread): > That's right. The devel version is not prominent on the website but checking this page helps: https://bioconductor.org/checkResults/
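The fix-in-both-branches workflow from the linked tutorial can be rehearsed in a throwaway local repository (all names, versions, and commit messages below are illustrative; with a real package you would finish with `git push upstream master RELEASE_3_16`):

```shell
# Throwaway repo standing in for a package clone (git >= 2.28 for -b)
git init -q -b master demo && cd demo
git config user.email you@example.org
git config user.name "You"
echo 'Version: 1.2.0' > DESCRIPTION
git add DESCRIPTION && git commit -qm 'initial'
git branch RELEASE_3_16                 # release branch forks here

# Fix the bug in devel (master) and bump the devel version
echo 'Version: 1.3.1' > DESCRIPTION
git commit -qam 'fix test failure'

# Port the same fix to the release branch, with its own z-bump
git checkout -q RELEASE_3_16
git cherry-pick -n master               # stage the fix without committing
echo 'Version: 1.2.1' > DESCRIPTION     # release keeps its own x.y series
git commit -qam 'fix test failure (release)'

# For a real Bioconductor package, now push both branches:
#   git push upstream master RELEASE_3_16
git log --oneline --all
```

The `cherry-pick -n` keeps the two branches' version numbers independent, which matters because release and devel must each get their own z-bump for the builders to pick up the change.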

2023-03-08

Unyime William (01:16:33): > @Unyime William has joined the channel

2023-03-10

Edel Aron (15:24:42): > @Edel Aron has joined the channel

2023-03-12

Belinda Phipson (21:14:19): > Hi, I am a (long term) maintainer of the missMethyl Bioconductor package along with Jovana Maksimovic. I am attempting to fix a bug and can't push to Bioconductor. I suspect it is because only Jovana has the rights to push changes/bug fixes etc. Is it possible that I am granted access as well? And how do I go about doing that?

2023-03-13

Lori Shepherd (07:44:00) (in thread): > Asking here or, better, #bioc_git or the bioc-devel@r-project.org mailing list is fine for requests. You do actually have access, but it looks like you have to set up your BiocCredentials account and add SSH keys for identification: BiocCredentials. I will private message you with a few other details.

Lori Shepherd (08:24:22): > The release schedule for Bioconductor release 3.17 has been announced. Please be mindful of important dates and deadlines: Bioconductor 3.17 Release Schedule

2023-03-15

Laurent Gatto (12:58:54): > Has anyone experience with integrating python + cython code in R/Bioc packages?

Joseph (23:47:45) (in thread): > Hi @Lori Shepherd, I'm sorry if this is redundant, but I just wanted to confirm. I have submitted my package to Bioconductor, and am currently in the process of making edits based on my reviewer's comments. The March 31st deadline is only for the initial submission, correct - not package edits/corrections?

2023-03-16

Mike Morgan (05:27:32): > I'm trying to wrap my head around error handling and passing the traceback from Rcpp up to R when using bp* functions from BiocParallel, to give informative error messages to the user. I'm using Rcpp::stop() to get from the Rcpp/C++ level up to R; however, when using bplapply() and bpok() the traceback consists of the bp* call stack - is this a limit of how deep the traceback goes, or am I missing something? An example of the kind of thing I'm attempting below:
> 
>     ## Inside function (1)
>     func1 <- function(...){
>         # execution code
>         bp.list <- bplapply(..., FUN = func2, args)
> 
>         # bp.list is the output from bplapply()
>         bperr <- attr(bp.list[[x]], "traceback")
> 
>         # then add bperr to the return object
>         output.list[[x]] <- list(..., "ERROR" = bperr)
>         return(output.list)
>     }
> 
>     mainfunc <- function(...){
>         result.list <- func1(...)
> 
>         if (other_failure_condition) {  # check for other failure condition
>             print(lapply(result.list, `[[`, "ERROR"))
>             stop(unique(unlist(lapply(result.list, `[[`, "ERROR"))))
>         }
>         return(result)
>     }
> 
>     mainfunc(args)
> 
>     ## Example output (frames joined for readability):
>     [[140]]
>     37: testNhoods(sim1.mylo, design = ~Condition + (1 | Condition), design.df = sim1.meta, glmm.solver = "Fisher", fail.on.error = TRUE)
>     36: glmmWrapper(Y = dge$counts, disper = 1/dispersion, Xmodel = x.model, Zmodel = z.model, off.sets = offsets, randlevels = rand.levels, reml = REML, glmm.contr = glmm.cont, genonly = geno.only,
>     35: bptry({ bplapply(seq_len(nrow(Y)), BPOPTIONS = bpoptions(stop.on.error = FALSE), FUN = function(i, Xmodel, Zmodel, Y, off.sets, randlevels,
>     34: tryCatch(expr, ..., bplist_error = bplist_error, bperror = bperror)
>     33: tryCatchList(expr, classes, parentenv, handlers)
>     32: tryCatchOne(tryCatchList(expr, names[-nh], parentenv, handlers[-nh]), names[nh], parentenv, handlers[[nh]])
>     31: doTryCatch(return(expr), name, parentenv, handler)
>     30: tryCatchList(expr, names[-nh], parentenv, handlers[-nh])
>     29: tryCatchOne(expr, names, parentenv, handlers[[1L]])
>     28: doTryCatch(return(expr), name, parentenv, handler)
>     27: bplapply(seq_len(nrow(Y)), BPOPTIONS = bpoptions(stop.on.error = FALSE), FUN = function(i, Xmodel, Zmodel, Y, off.sets, randlevels, disper, genonly, kin.ship, glmm.contr, reml) {
>     26: bplapply(seq_len(nrow(Y)), BPOPTIONS = bpoptions(stop.on.error = FALSE), FUN = function(i, Xmodel, Zmodel, Y, off.sets, randlevels, disper, genonly, kin.ship, glmm.contr, reml) {
>     25: .bpinit(manager = manager, X = X, FUN = FUN, ARGS = ARGS, BPPARAM = BPPARAM, BPOPTIONS = BPOPTIONS, BPREDO = BPREDO)
>     24: bploop(manager, BPPARAM = BPPARAM, BPOPTIONS = BPOPTIONS, ...)
>     23: bploop.lapply(manager, BPPARAM = BPPARAM, BPOPTIONS = BPOPTIONS, ...)
>     22: .bploop_impl(ITER = ITER, FUN = FUN, ARGS = ARGS, BPPARAM = BPPARAM, BPOPTIONS = BPOPTIONS, BPREDO = BPREDO, reducer = reducer, progress.length = length(redo_index))
>     21: .collect_result(manager, reducer, progress, BPPARAM)
>     20: .manager_recv(manager)
>     19: .manager_recv(manager)
>     18: .recv_any(manager$backend)
>     17: .recv_any(manager$backend)
>     16: .bpworker_EXEC(msg, bplog(backend$BPPARAM))
>     15: tryCatch({ do.call(msg$data$fun, msg$data$args) }, error = function(e) {
>     14: tryCatchList(expr, classes, parentenv, handlers)
>     13: tryCatchOne(expr, names, parentenv, handlers[[1L]])
>     12: doTryCatch(return(expr), name, parentenv, handler)
>     11: do.call(msg$data$fun, msg$data$args)
>     10: (function (...) BiocParallel:::.workerLapply_impl(...))(X = 1:140, FUN = function (i, Xmodel, Zmodel, Y, off.sets, randlevels, disper, genonly,
>     9: BiocParallel:::.workerLapply_impl(...)
>     8: do.call(lapply, args)
>     7: (function (X, FUN, ...) { FUN <- match.fun(FUN)
>     6: FUN(X[[i]], ...)
>     5: withCallingHandlers({ tryCatch({ FUN(...)
>     4: tryCatch({ FUN(...) }, error = handle_error)
>     3: tryCatchList(expr, classes, parentenv, handlers)
>     2: tryCatchOne(expr, names, parentenv, handlers[[1L]])
>     1: value[[3L]](cond)
> 
> None of these are actually the error message I'm expecting from the Rcpp::stop() call. It seems the deepest part of the traceback is a tryCatch() call, which is well above the Rcpp::stop() call. The docs for traceback() suggest it should not be limited. Any thoughts/advice?

Martin Morgan (07:27:53) (in thread): > BiocParallel uses tryCatch() to catch the error and report the error message to the user, but as you note this means that the traceback generated by R only goes to the tryCatch; similar limitations are seen in other R code. I think this could be enhanced to provide a full traceback; maybe open an issue at https://github.com/Bioconductor/BiocParallel/issues?

Lori Shepherd (07:32:01) (in thread): > correct. The March 31 deadline is to submit to the tracker, and we say reviewers will have a chance to do at least an initial review of a package. Packages in review will still have to pass review/corrections and be accepted by April 19 to be included in the next release. Packages submitted after March 31 we cannot guarantee we will have time to review, as there are a lot of other tasks that need to be completed at release time; those packages, and packages that do not meet the April 19 deadline, will still be able to be in Bioconductor, but they will be in the devel branch until the next fall release.

Mike Morgan (08:02:03) (in thread): > Thanks for the suggestion Martin - duly noted and issue submitted.

Joseph (13:12:39) (in thread): > Got it, thank you!

2023-03-22

Michael Love (15:13:07): > Has anyone updated their GitHub Actions to build pages based on pushes to the devel branch? If not I'll just mess around and find out

Lambda Moses (15:34:12) (in thread): > Yes, use something like
> 
>     on:
>       push:
>         branches: [devel]

2023-03-23

Davide Risso (08:03:55): > Hi everyone! I'm developing a package (https://github.com/drisso/learn2count) that depends on an R-forge package (https://r-forge.r-project.org/R/?group_id=522). I know that the Bioconductor policy is that packages should only depend on CRAN and Bioc packages, but is there a reason for not including R-forge in the list? It seems a reputable and stable enough repository, does it not? I totally understand that we don't want to depend on random GitHub packages, but I was somewhat surprised about treating R-forge in the same way. In this specific case I actually need only one function that I can easily re-write, but I was curious about this and I thought I'd ask… - Attachment (r-forge.r-project.org): R-Forge: countreg: Count Data Regression: R Development Page > S3 functions for generalized count data regression and related tools

Kasper D. Hansen (11:47:19): > I am far from an expert but isn’t R-forge more of a development site?

2023-03-24

Federico Marini (09:20:51): > R-forge is something I saw around before the GitHub approach took over

Federico Marini (09:21:26): > so overall I would say it is still “unofficially out”. Just some space to do software development, in R

Federico Marini (09:22:13): > it is still under the domain of the “r-project”, but AFAIK there is no check on what lands there

Federico Marini (09:25:06): > given that the devel status for countreg (likely entered once by the maintainers) is:

Federico Marini (09:25:08): - File (PNG): image.png

Federico Marini (09:25:31): > probably they simply forgot to update that?

Dirk Eddelbuettel (09:40:12): > It is (was?) a SourceForge (remember that?) clone that WU Vienna host(s|ed) and that has long atrophied after many-but-not-all projects started using GH. So it is still alive. Yet some of us recently noticed a fresh new mirror of it at GH: https://github.com/r-forge I am unsure who owns it / what the plan is.

Federico Marini (09:47:24): > oh yeah SourceForge. That ended up quite weird

Vince Carey (10:23:54): > I’d say the bottom line here is to keep our policy of only Bioc and CRAN and contact the countreg author to ask them if they would want to move to CRAN or Bioc.

Lluís Revilla (10:27:55): > What is the policy of Bioc on Additional_repositories? Would something like drat or r-universe be accepted? (On CRAN they are accepted, so this could be another solution.)

Lori Shepherd (10:29:44): > Currently Bioc requires all packages to be on CRAN or Bioc; no additional repositories beyond that for stated package dependencies

Dirk Eddelbuettel (10:30:02) (in thread): > For Suggests: only though which seems reasonable for CRAN.

Lluís Revilla (10:30:44) (in thread): > Ah, yes. I forgot that here the problem is that it is a Depends:

Dirk Eddelbuettel (10:31:24) (in thread): > So easy for BioC to say no as CRAN does too:grin:

Martin Morgan (10:55:49) (in thread): > I think Additional_repositories are problematic for the user anyway, because any repository in this field is not automatically handled by install.packages()?

Dirk Eddelbuettel (10:58:25) (in thread): > This is a multi-year discussion and CRAN is moving in the right direction that optional dependencies in Suggests need to be tested for. > > I personally find it lovely that I can have additional data packages, or alternative implementations, in Suggests and for example explicitly install them in CI and tests. > > The case you describe is a different one. Valid in and by itself ("user experience"), but to me both too narrow and not relevant for the use cases that are supported.

Dirk Eddelbuettel (11:00:31) (in thread): > (Per ?install.packages, setting dependencies=TRUE will install Suggests (including from Additional_repositories:), and R CMD check tests the URL if listed.)

Alexander Bender (13:37:32): > @Alexander Bender has joined the channel

2023-03-27

Federico Marini (18:49:42): > fyi, just happened to me - so I followed the solution indicated here: https://github.blog/2023-03-23-we-updated-our-rsa-ssh-host-key/ - Attachment (The GitHub Blog): We updated our RSA SSH host key | The GitHub Blog > At approximately 05:00 UTC on March 24, out of an abundance of caution, we replaced our RSA SSH host key used to secure Git operations for GitHub.com. - File (PNG): image.png

2023-03-28

Alan O’C (04:17:12) (in thread): > Was definitely a bit spooky googling and hoping I wasn’t falling for a very sophisticated attack

Federico Marini (09:59:39) (in thread): > no, just github being overcautious, somehow

Federico Marini (09:59:40) (in thread): > :smile:

Alan O’C (11:23:41) (in thread): > Aye, butwhat ifthe hackers got full control of github and managed to put out a blog post and a bunch of news articles about it?:stuck_out_tongue:

Aidan Lakshman (13:11:43): > R integers are 32-bit - does this mean that the int type in C is also always an int32_t type?

Dirk Eddelbuettel (13:20:49) (in thread): > There is a technicality here in that int32_t guarantees you 32 bits (4 bytes) whereas int, going back to grandfathers Kernighan and Ritchie, is "implementation dependent". Some of us are old enough for vague memories of 16-bit ints.

Hervé Pagès (13:21:05): > R integers just match the int type in C. According to the C spec the int type is not guaranteed to be 32-bit, but it turns out to be 32-bit on the platforms that R supports, so R integers are effectively 32-bit in practice, and it's safe to assume that.

Dirk Eddelbuettel (13:21:12) (in thread): > "In practice" right now they are the same on common 32-bit and 64-bit machines.

Hervé Pagès (13:28:09): > In other words, when manipulating R integers in C, your code should use int, not int32_t.

Hervé Pagès (13:33:40) (in thread): > The same hackers that made up covid and global warming:scream:

Aidan Lakshman (17:26:48) (in thread): > Yep, that's why I was asking - I know R guarantees 32-bit integers, but I wasn't sure if that extended to the C backend… nowadays I doubt it's really an issue, but I was mainly wondering if INTEGER(…) would return an int* or an int32_t*

Aidan Lakshman (17:27:46) (in thread): > Sorry, just realizing my initial question was unclear - I know you're not always guaranteed 32-bit ints in C :sweat_smile:, in my head I was thinking about the R-C interface but I definitely did not articulate that haha

Dirk Eddelbuettel (17:28:28) (in thread): - File (PNG): image.png

Aidan Lakshman (17:28:47) (in thread): > :grimacing:yep, brb

Dirk Eddelbuettel (17:29:16) (in thread): > In all seriousness the R headers are not all that hidden and the manuals – dense as they are – contain all the truth. The quarto-rendered versions are nice: https://rstudio.github.io/r-manuals/

Dirk Eddelbuettel (17:29:42) (in thread): > The last two are important too besides Writing R Extensions

Aidan Lakshman (17:34:24) (in thread): > Ah that is a great render, much easier on the eyes than WRE haha

Aidan Lakshman (17:34:32) (in thread): - File (PNG): IMG_0725

Aidan Lakshman (17:34:50) (in thread): > :D

Joseph (21:38:53): > Hi everyone, I was wondering if I might be able to get some help figuring out what exactly is wrong with my build. I am getting the below error and attached some screenshots of my roxygen2 import statements and the imports section of my DESCRIPTION file. I can’t seem to replicate the error on my end and only see this after pushing to the repository on Bioconductor. Thank you in advance! > > Error: objects rowSums, colSums, rowMeans, colMeans are not exported by 'namespace:BiocGenerics' > - File (PNG): image.png - File (PNG): image.png

Peter Hickey (22:34:41) (in thread): > I think it'll be due to these recent changes by @Hervé Pagès: https://github.com/Bioconductor/BiocGenerics/commit/5d95680bc8cafd0833ee96415b07f6ad18d53b99

Peter Hickey (22:37:39) (in thread): > I'm not sure if it's something you need to fix in your package by importing rowSums etc. from MatrixGenerics rather than BiocGenerics in your NAMESPACE, or if it's something to be fixed/updated in one of your dependencies

Peter Hickey (22:38:11) (in thread): > a link to your package would help someone with time to take a look into it

Peter Hickey (22:38:41) (in thread): > Hopefully Herve will chime in, although I think he’s offline for a few days

Joseph (22:56:02) (in thread): > Thanks a lot @Peter Hickey! I am currently in the review process for getting my package accepted. Here is a link to the GitHub however. https://github.com/GuoLabUCSD/OutSplice

Peter Hickey (23:36:32) (in thread): > let your reviewer know (if you haven't already) about this thread and I think they'll be able to work through it with you. > I suspect it'll be a case of waiting a few days to see if things settle down as this change makes its way through core BioC packages, because BiocGenerics is at the very bottom of the BioC dependency stack

Peter Hickey (23:38:03) (in thread): > from my brief look at your package I think there's nothing for you to do right now, because you don't directly import any of rowSums, colSums, rowMeans, or colMeans in your NAMESPACE

2023-03-29

Hervé Pagès (02:38:41) (in thread): > Pete is right, I did it. :blush: Yesterday I moved the rowSums, colSums, rowMeans, colMeans generics from BiocGenerics to MatrixGenerics, so now they need to be imported from the latter (they are in MatrixGenerics >= 1.11.1). I didn't have time to assess the damage caused by this change yet, sorry, will do in a few days and repair as much as possible. The daily build report for devel will help (this change will be reflected on the next report). > I just realized that genefilter needed its imports to be fixed, so I just did that (in genefilter 1.81.3), see https://github.com/Bioconductor/genefilter/commit/ebab04c0c856357e7eb2b5d3695c953a427f7cde. > Unfortunately I caught this too late for the current builds, so this package alone will introduce a lot of red on tomorrow's report (genefilter has a lot of reverse deps, direct and indirect; OutSplice is one of them). @Joseph OutSplice also explicitly imports rowSums, colSums, rowMeans, colMeans from BiocGenerics. Since these are the only things you import from BiocGenerics, you can just replace every occurrence of BiocGenerics with MatrixGenerics. That's what I did for genefilter (I also specified MatrixGenerics (>= 1.11.1) in the Imports field). Hope this helps and sorry for the inconvenience.

Matteo Tiberti (10:11:35): > @Matteo Tiberti has joined the channel

Lori Shepherd (13:40:14) (in thread): > @Hervé Pagès just a feeling that this probably affects a decent number of packages – was there an announcement on bioc-devel@r-project.org announcing this as well?

Hervé Pagès (15:03:20) (in thread): > Just announced now.

Joseph (16:29:46) (in thread): > Thanks @Hervé Pagès. I'm sorry, but can you elaborate on what you meant by explicitly imports? For OutSplice we just import GenomicRanges, GenomicFeatures, and IRanges, which depend on BiocGenerics. Does this mean I should still specify MatrixGenerics in my imports as an addition?

Himel Mallick (16:56:28): > @Himel Mallick has joined the channel

Peter Hickey (17:52:47): > What’s BioC’s commitment to 32-bit OSs/systems? > The 3.16 release notes (https://bioconductor.org/news/bioc_3_16_release/) say “Bioconductor 3.16 is compatible with R 4.2, and is supported on Linux, 64-bit Windows, and Intel 64-bit macOS 10.13 (High Sierra) or higher” > It seems clear what that means for Windows and macOS but is that an implicit commitment to 32-bit Linux systems?

Peter Hickey (17:55:49): > I’ve had a helpful report of a test failure inDelayedMatrixStatsonppc32(https://github.com/PeteHaitch/DelayedMatrixStats/issues/90) but I don’t have resources to investigate much more at the moments - Attachment: #90 One test fails on ppc32: Error: cannot allocate vector of size 305.2 Mb > > > R version 4.2.3 (2023-03-15) -- "Shortstop Beagle" > Copyright (C) 2023 The R Foundation for Statistical Computing > Platform: powerpc-apple-darwin10.8.0 (32-bit) > > R is free software and comes with ABSOLUTELY NO WARRANTY. > You are welcome to redistribute it under certain conditions. > Type 'license()' or 'licence()' for distribution details. > > R is a collaborative project with many contributors. > Type 'contributors()' for more information and > 'citation()' on how to cite R or R packages in publications. > > Type 'demo()' for some demos, 'help()' for on-line help, or > 'help.start()' for an HTML browser interface to help. > Type 'q()' to quit R. > > > library(testthat) > > library(DelayedMatrixStats) > Loading required package: MatrixGenerics > Loading required package: matrixStats > > Attaching package: 'MatrixGenerics' > > The following objects are masked from 'package:matrixStats': > > colAlls, colAnyNAs, colAnys, colAvgsPerRowSet, colCollapse, > colCounts, colCummaxs, colCummins, colCumprods, colCumsums, > colDiffs, colIQRDiffs, colIQRs, colLogSumExps, colMadDiffs, > colMads, colMaxs, colMeans2, colMedians, colMins, colOrderStats, > colProds, colQuantiles, colRanges, colRanks, colSdDiffs, colSds, > colSums2, colTabulates, colVarDiffs, colVars, colWeightedMads, > colWeightedMeans, colWeightedMedians, colWeightedSds, > colWeightedVars, rowAlls, rowAnyNAs, rowAnys, rowAvgsPerColSet, > rowCollapse, rowCounts, rowCummaxs, rowCummins, rowCumprods, > rowCumsums, rowDiffs, rowIQRDiffs, rowIQRs, rowLogSumExps, > rowMadDiffs, rowMads, rowMaxs, rowMeans2, rowMedians, rowMins, > rowOrderStats, rowProds, rowQuantiles, rowRanges, rowRanks, > 
rowSdDiffs, rowSds, rowSums2, rowTabulates, rowVarDiffs, rowVars, > rowWeightedMads, rowWeightedMeans, rowWeightedMedians, > rowWeightedSds, rowWeightedVars > > Loading required package: DelayedArray > Loading required package: stats4 > Loading required package: Matrix > Loading required package: BiocGenerics > > Attaching package: 'BiocGenerics' > > The following objects are masked from 'package:stats': > > IQR, mad, sd, var, xtabs > > The following objects are masked from 'package:base': > > Filter, Find, Map, Position, Reduce, anyDuplicated, aperm, append, > as.data.frame, basename, cbind, colnames, dirname, do.call, > duplicated, eval, evalq, get, grep, grepl, intersect, is.unsorted, > lapply, mapply, match, mget, order, paste, pmax, [pmax.int](http://pmax.int), pmin, > [pmin.int](http://pmin.int), rank, rbind, rownames, sapply, setdiff, sort, table, > tapply, union, unique, unsplit, which.max, which.min > > Loading required package: S4Vectors > > Attaching package: 'S4Vectors' > > The following objects are masked from 'package:Matrix': > > expand, unname > > The following objects are masked from 'package:base': > > I, expand.grid, unname > > Loading required package: IRanges > > Attaching package: 'DelayedArray' > > The following objects are masked from 'package:base': > > apply, rowsum, scale, sweep > > > Attaching package: 'DelayedMatrixStats' > > The following objects are masked from 'package:matrixStats': > > colAnyMissings, rowAnyMissings > > > > > test_check("DelayedMatrixStats") > Loading required package: rhdf5 > > Attaching package: 'HDF5Array' > > The following object is masked from 'package:rhdf5': > > h5ls > > R(43227,0xa0dfb620) malloc: ***** mmap(size=320004096) failed (error code=12) > ***** error: can't allocate region > ***** set a breakpoint in malloc_error_break to debug > R(43227,0xa0dfb620) malloc: ***** mmap(size=320004096) failed (error code=12) > ***** error: can't allocate region > ***** set a breakpoint in malloc_error_break to debug 
> [ FAIL 1 | WARN 0 | SKIP 0 | PASS 14731 ] > > ══ Failed tests ════════════════════════════════════════════════════════════════ > ── Error ('test_GitHub_issues.R:37:3'): Issue 54 is fixed ────────────────────── > Error: cannot allocate vector of size 305.2 Mb > Backtrace: > ▆ > 1. ├─testthat::expect_equal(rowsum(as.matrix(m3), S), rowsum(m3, S)) at test_GitHub_issues.R:37:2 > 2. │ └─testthat::quasi_label(enquo(object), label, arg = "object") > 3. │ └─rlang::eval_bare(expr, quo_get_env(quo)) > 4. ├─base::rowsum(as.matrix(m3), S) > 5. ├─base::as.matrix(m3) > 6. └─DelayedArray::as.matrix.Array(m3) > 7. └─DelayedArray:::.from_Array_to_matrix(x, ...) > 8. ├─base::as.array(x, drop = TRUE) > 9. └─DelayedArray::as.array.Array(x, drop = TRUE) > 10. └─DelayedArray:::.from_Array_to_array(x, ...) > 11. ├─DelayedArray::extract_array(x, index) > 12. └─DelayedArray::extract_array(x, index) > 13. ├─methods::callNextMethod() > 14. └─DelayedArray (local) .nextMethod(x = x, index = index) > 15. ├─DelayedArray::extract_array(x@seed, index) > 16. └─DelayedArray::extract_array(x@seed, index) > 17. └─base::unlist(res, use.names = FALSE) > > [ FAIL 1 | WARN 0 | SKIP 0 | PASS 14731 ] > Error: Test failures > Execution halted >

Hervé Pagès (19:02:28) (in thread): > I mean having something like this in your NAMESPACE file:
> 
>     importFrom(BiocGenerics,colMeans)
>     importFrom(BiocGenerics,colSums)
>     importFrom(BiocGenerics,rowMeans)
>     importFrom(BiocGenerics,rowSums)
> 
> which I think you had earlier in OutSplice/NAMESPACE. But you replaced them already with:
> 
>     importFrom(MatrixGenerics,colMeans)
>     importFrom(MatrixGenerics,colSums)
>     importFrom(MatrixGenerics,rowMeans)
>     importFrom(MatrixGenerics,rowSums)
> 
> and you also replaced BiocGenerics (>= 0.44.0) with MatrixGenerics (>= 1.11.1) in your Imports field, so you're all good. :white_check_mark:

Kasper D. Hansen (19:04:47): > Could this be an issue of endianness, since it fails on ppc32? I added a comment to the issue

Hervé Pagès (19:25:17) (in thread): > Actually you only seem to use colSums in the OutSplice package (in the calcBurden() function), so why would you import the other 3 generics? If you're going to import selectively, only import what you need.

Joseph (19:35:57) (in thread): > Hi @Hervé Pagès, sorry for the confusion. Originally I didn't have any importFrom statements involving BiocGenerics. However, when I got the error saying those 4 functions weren't being exported, I tried adding all of the above importFrom statements. It actually seems to be building correctly on nebbiolo1 now without any of these additional importFrom statements at all, but not merida1.

Hervé Pagès (19:47:03): > Darwin 10.8.0, really? That’s Mac OS X Snow Leopard 10.6.8 from 2011. I don’t think it’s reasonable to try to support this. As our recent release announcements say, Bioconductor supports Intel 64-bit macOS 10.13 (High Sierra) or higher, like CRAN does.

Kasper D. Hansen (19:51:40): > One thing I found interesting from the github issue is that the poster has had no issues with ~1,400 R packages and this is one of the few where something breaks.

Hervé Pagès (19:54:48) (in thread): > Oh, you just removed the importFrom(MatrixGenerics, ...) statements, I see. I mean, if you're only using colSums() on an ordinary matrix or data.frame, then you don't need the colSums generic at all, you only need the colSums base function defined in the base package, and you automatically get that without having to specify any import. So, yes, no need to introduce an unnecessary dependency on MatrixGenerics.

Joseph (20:01:23) (in thread): > Yup, that was my thinking! Thanks so much for the time @Hervé Pagès! I'm guessing the reason it is building correctly on nebbiolo1 (Linux) but not merida1 (macOS) is because the auto-repair / changes haven't been adjusted there yet?

Hervé Pagès (20:12:25) (in thread): > Looks like the daily builds didn’t kick off on merida1 today, that’s why. We’ll investigate. Please ignore merida1’s results until this is resolved. Thanks!

2023-03-30

Mike Smith (04:42:05): > rhdf5 is one of the other packages that isn’t working so well on that platform (https://github.com/grimbough/rhdf5/issues/122) - Attachment: #122 Some tests fails on PowerPC: [ FAIL 39 | WARN 0 | SKIP 2 | PASS 970 ] > ``> > R version 4.2.3 (2023-03-15) -- "Shortstop Beagle" > Copyright (C) 2023 The R Foundation for Statistical Computing > Platform: powerpc-apple-darwin10.8.0 (32-bit) > > R is free software and comes with ABSOLUTELY NO WARRANTY. > You are welcome to redistribute it under certain conditions. > Type 'license()' or 'licence()' for distribution details. > > R is a collaborative project with many contributors. > Type 'contributors()' for more information and > 'citation()' on how to cite R or R packages in publications. > > Type 'demo()' for some demos, 'help()' for on-line help, or > 'help.start()' for an HTML browser interface to help. > Type 'q()' to quit R. > > > library(testthat) > > library(rhdf5) > > > > test_check("rhdf5") > R(60123,0xa0dfb620) malloc: ***** mmap(size=4000002048) failed (error code=12) > ***** error: can't allocate region > ***** set a breakpoint in malloc_error_break to debug > R(60123,0xa0dfb620) malloc: ***** mmap(size=4000002048) failed (error code=12) > ***** error: can't allocate region > ***** set a breakpoint in malloc_error_break to debug > [ FAIL 39 | WARN 0 | SKIP 2 | PASS 970 ] > > ══ Skipped tests ═══════════════════════════════════════════════════════════════ > • On CRAN (2) > > ══ Failed tests ════════════════════════════════════════════════════════════════ > ── Failure ('test_H5A.R:112:5'): unsigned 32-bit integer attributes are read correctly ── >H5Aread(aid, bit64conversion = “int”)did not produce any warnings. > Backtrace: > ▆ > 1. ├─testthat::expect_true(any([is.na](http://is.na)(expect_warning(H5Aread(aid, bit64conversion = "int"))))) at test_H5A.R:112:4 > 2. │ └─testthat::quasi_label(enquo(object), label, arg = "object") > 3. │ └─rlang::eval_bare(expr, quo_get_env(quo)) > 4. 
└─testthat::expect_warning(H5Aread(aid, bit64conversion = "int")) > ── Failure ('test_H5A.R:112:5'): unsigned 32-bit integer attributes are read correctly ── > any([is.na](http://is.na)(expect_warning(H5Aread(aid, bit64conversion = "int")))) is not TRUE > >actual: FALSE >expected: TRUE > ── Failure ('test_H5A.R:116:5'): unsigned 32-bit integer attributes are read correctly ── > expect_silent(H5Aread(aid, bit64conversion = "double")) not equivalent to c(1:9, 2^31). > 10/10 mismatches (average diff: 2.9e+08) > [1] 1.68e+07 - 1 == 1.68e+07 > [2] 3.36e+07 - 2 == 3.36e+07 > [3] 5.03e+07 - 3 == 5.03e+07 > [4] 6.71e+07 - 4 == 6.71e+07 > [5] 8.39e+07 - 5 == 8.39e+07 > [6] 1.01e+08 - 6 == 1.01e+08 > [7] 1.17e+08 - 7 == 1.17e+08 > [8] 1.34e+08 - 8 == 1.34e+08 > [9] 1.51e+08 - 9 == 1.51e+08 > ... > ── Failure ('test_H5A.R:120:5'): unsigned 32-bit integer attributes are read correctly ── >x3not equivalent to bit64::as.integer64(c(1:9, 2^31)). > Mean relative difference: 3.844444 > ── Failure ('test_H5P_dcpl.R:16:5'): Filters can be set ──────────────────────── > H5Pset_szip(dcpl, options_mask = 1L, pixels_per_block = 8L) is not more than 0. Difference: -1 > ── Failure ('test_H5P_dcpl.R:17:5'): Filters can be set ──────────────────────── > H5Pget_nfilters(dcpl) not equal to 3L. > 1/1 mismatches > [1] 2 - 3 == -1 > ── Failure ('test_h5create.R:263:5'): attributes can be added using file name ── > "foo_attr" %in% names(h5readAttributes(file = h5File, name = "foo")) is not TRUE > >actual: FALSE >expected: TRUE > ── Failure ('test_h5read.R:83:5'): Reading attributes too ────────────────────── > as.character(attributes(baa)$scale) not equal to attributes(D)$scale. > Lengths differ: 0 is not 1 > ── Error ('test_h5read.R:350:5'): Reading SZIP ───────────────────────────────── > Error: 'idx' argument is outside the range of filters set on this property list. > Backtrace: > ▆ > 1. ├─... %>% expect_equal(c(64, 32)) at test_h5read.R:350:4 > 2. ├─testthat::expect_equal(., c(64, 32)) > 3. 
│ └─testthat::quasi_label(enquo(object), label, arg = "object") > 4. │ └─rlang::eval_bare(expr, quo_get_env(quo)) > 5. ├─testthat::expect_is(., "matrix") > 6. │ └─testthat::quasi_label(enquo(object), label, arg = "object") > 7. │ └─rlang::eval_bare(expr, quo_get_env(quo)) > 8. ├─testthat::expect_silent(h5read(szip_file, "DS1")) > 9. │ └─testthat:::quasi_capture(enquo(object), NULL, evaluate_promise) > 10. │ ├─testthat (local) .capture(...) > 11. │ │ ├─withr::with_output_sink(...) > 12. │ │ │ └─base::force(code) > 13. │ │ ├─base::withCallingHandlers(...) > 14. │ │ └─base::withVisible(code) > 15. │ └─rlang::eval_bare(quo_get_expr(.quo), quo_get_env(.quo)) > 16. └─rhdf5::h5read(szip_file, "DS1") > 17. └─rhdf5:::h5readDataset(...) > 18. └─base::tryCatch(...) > 19. └─base (local) tryCatchList(expr, classes, parentenv, handlers) > 20. └─base (local) tryCatchOne(expr, names, parentenv, handlers[[1L]]) > 21. └─value[[3L]](cond) > 22. └─rhdf5:::h5checkFilters(h5dataset) > 23. └─rhdf5::H5Pget_filter(pid, i - 1) > ── Failure ('test_h5readAttributes.R:31:5'): 64-bit integer attributes are read correctly ── >x1 <- h5readAttributes(h5File, “A”, bit64conversion = “int”)did not produce any warnings. > ── Failure ('test_h5readAttributes.R:32:5'): 64-bit integer attributes are read correctly ── > "int64" %in% names(x1) && "uint32" %in% names(x1) is not TRUE > >actual: FALSE >expected: TRUE > ── Failure ('test_h5readAttributes.R:33:5'): 64-bit integer attributes are read correctly ── > x1$int64 not equivalent to c(1:9, NA). > target is NULL, current is numeric > ── Failure ('test_h5readAttributes.R:34:5'): 64-bit integer attributes are read correctly ── > x1$uint32 not equivalent to c(1:9, NA). > target is NULL, current is numeric > ── Failure ('test_h5readAttributes.R:37:5'): 64-bit integer attributes are read correctly ── > x2$int64 not equivalent to c(1:9, 2^32). 
> target is NULL, current is numeric > ── Failure ('test_h5readAttributes.R:38:5'): 64-bit integer attributes are read correctly ── > x2$uint32 not equivalent to c(1:9, 2^31). > target is NULL, current is numeric > ── Failure ('test_h5readAttributes.R:39:5'): 64-bit integer attributes are read correctly ── > x2$int64 inherits from‘NULL’not‘character’. > ── Failure ('test_h5readAttributes.R:40:5'): 64-bit integer attributes are read correctly ── > x2$int64 has type 'NULL', not 'double'. > ── Failure ('test_h5readAttributes.R:43:5'): 64-bit integer attributes are read correctly ── > x3$int64 not equivalent to c(1:9, 2^32). > target is NULL, current is numeric > ── Failure ('test_h5readAttributes.R:44:5'): 64-bit integer attributes are read correctly ── > x3$uint32 not equivalent to c(1:9, 2^31). > target is NULL, current is numeric > ── Failure ('test_h5readAttributes.R:45:5'): 64-bit integer attributes are read correctly ── > x3$int64 is not an S3 object > ── Failure ('test_h5write.R:39:5'): Attributes are written too ───────────────── > "scale" %in% names(h5readAttributes(file = h5File, name = "B")) is not TRUE > >actual: FALSE >expected: TRUE > ── Failure ('test_h5writeAttributes.R:33:5'): Adding attribute to file ───────── >attr_backhas length 0, not length 4. > ── Failure ('test_h5writeAttributes.R:34:5'): Adding attribute to file ───────── > all(...) is not TRUE > >actual: FALSE >expected: TRUE > ── Failure ('test_h5writeAttributes.R:35:5'): Adding attribute to file ───────── > attr_back$char_attr[1] inherits from‘NULL’not‘character’. > ── Failure ('test_h5writeAttributes.R:36:5'): Adding attribute to file ───────── > attr_back$int_attr[1] inherits from‘NULL’not‘character’. > ── Failure ('test_h5writeAttributes.R:37:5'): Adding attribute to file ───────── > attr_back$numeric_attr[1] inherits from‘NULL’not‘character’. > ── Failure ('test_h5writeAttributes.R:63:5'): Adding attribute to group ──────── >attr_backhas length 0, not length 4. 
> ── Failure ('test_h5writeAttributes.R:64:5'): Adding attribute to group ──────── > all(...) is not TRUE > >actual: FALSE >expected: TRUE > ── Failure ('test_h5writeAttributes.R:65:5'): Adding attribute to group ──────── > attr_back$char_attr[1] inherits from‘NULL’not‘character’. > ── Failure ('test_h5writeAttributes.R:66:5'): Adding attribute to group ──────── > attr_back$int_attr[1] inherits from‘NULL’not‘character’`. > ── Failure (‘test_h5writeAttributes.R:67:5’): Adding attribute to…

Jonathan Griffiths (04:58:42): > There was a discussion a while ago about some C++-related NOTEs that were recently upgraded to WARNINGs: https://community-bioc.slack.com/archives/CLUJWDQF4/p1677509567090139 Is there any clearer official guidance on this? Can we ignore them? (They are not directly introduced by code in the package, as far as I can tell.) - Attachment: Attachment > I'm currently seeing the following warning on the build system from some of my packages that bundle other libraries: > > * checking compiled code ... WARNING > Note: information on .o files is not available > File /home/pkgbuild/packagebuilder/workers/jobs/2925/R-libs/Rarr/libs/Rarr.so: > Found __printf_chk, possibly from printf (C) > Found __sprintf_chk, possibly from sprintf (C) > Found abort, possibly from abort (C) > Found printf, possibly from printf (C) > Found puts, possibly from printf (C), puts (C) > Found stderr, possibly from stderr (C) > I remember this got elevated from a NOTE to WARNING fairly recently, however I'm not able to reproduce it at all on my local machine. Does anyone know if there's a specific setting required to trigger this warning? I've tried using the Renviron.bioc and the version of R-devel available on the build system.

Mike Smith (05:10:48) (in thread): > My reading of the responses was that it's ok to ignore these warnings for the time being, as the underlying issue has actually been there for a long time and we were fine with it when they were just NOTEs. This is particularly the case if you're only seeing them as a result of linking to another package - I'm not sure what your scenario is. > > FWIW I did actually find some instances of code that really was potentially troublesome in some places e.g. sprintf() to stderr, that should have used the appropriate R mechanisms instead, so they aren't all false positives. > > I also note that it's only a warning on Linux, and remains a NOTE on the other platforms. As I had different reports for each platform (due to code differences & linking with different libraries on each OS), I focused on addressing Linux first to make the warning go away.

Vince Carey (06:55:23): > On the 32-bit linux: I think in response to @Peter Hickey we should revise the wording of our statement of "supported on". I don't know exactly what should be said. Our support commitment is best demonstrated through the systems we build and check on, but may be further limited by enumerating the packages that must pass check (and will receive core developer time) for the ecosystem to be viable for its main purposes. I would propose referring to the platform descriptions on the build reports. I would be happy to see the set of platforms on which we build grow, but we lack both the human and material assets to grow much more, having adopted M1 Mac alongside Intel Mac. The work of @Martin Grigorov and colleagues on ARM (64!) Linux has been very instructive, and one could say that we are moving in the direction of "supporting" that platform too, insofar as @Alex Mahmoud is using qemu emulation in GitHub Actions. As the ARM64 platform gains broader adoption, hardware will likely need to come within our build system. I would say the fate of packages on ppc32 and other "old" platforms depends on community provision of relevant testing platforms and community/maintainer willingness to do the porting work and verification. Clarification of the general "support perimeter" for Bioconductor is a long-standing concern of mine, and I see us remaining flexible for the foreseeable future, but open to suggestions on steps that could be taken to increase developer and user productivity.

Jonathan Griffiths (06:56:28) (in thread): > Thanks for the information!

Martin Grigorov (07:07:40) (in thread): > Most of the Bioc/R packages just work on Linux ARM64! > For some of the packages the maintainers found that there were some minor arithmetic bugs and fixed them! > > The main problem is that most of the package maintainers do not have access to Linux ARM64 hardware and it is hard to debug the issues:confused:

Martin Grigorov (07:08:22) (in thread): > At https://community-bioc.slack.com/archives/C02CWTCB1LJ/p1680094441498579 I shared a few ways to work around this, but it is a real issue!

Yikun Jiang (11:39:55): > @Yikun Jiang has joined the channel

Yikun Jiang (11:50:53) (in thread): > :100:

Yikun Jiang (12:00:16) (in thread): > Just random thoughts: we could write a doc about how to test Bioconductor on Linux aarch64 somewhere (such as the arm-linux repo we mentioned before). Then we can link the doc from the final Linux ARM64 CI page (https://yikun.github.io/bioconductor-0301/report/long-report.html).

2023-03-31

Wes W (10:46:56): > let me know if this is the wrong place to ask this, but once you have your package done and say you want to publish a paper describing and benchmarking the package, what are some of your fav journals to submit to for this kind of thing?

Lluís Revilla (10:54:06) (in thread): > You can submit it to the Journal of Open Source Software: https://joss.theoj.org/ . Some also publish in the F1000Research Bioconductor gateway: https://f1000research.com/gateways/bioconductor - Attachment (joss.theoj.org): Journal of Open Source Software > Journal of Open Source Software (JOSS) is a developer friendly, open access journal for research software packages. - Attachment (f1000research.com): Articles from gateway Bioconductor - F1000Research > Read the latest peer reviewed Bioconductor articles and more on F1000Research

Davide Risso (11:07:54) (in thread): > I personally like PLOS Computational Biology. Their software papers are treated similarly to a research paper (e.g., 6 figures), unlike, say, Bioinformatics, whose Application Notes are ridiculously short (1 1/2 pages w/ 1 figure).

Davide Risso (11:08:58) (in thread): > I vaguely remember@Matt Ritchiementioning he liked NAR Genomics and Bioinformatics, but I haven’t submitted anything there yet

Sarvesh Nikumbh (12:04:30) (in thread): > Not quite PLOS Computational Biology, but OUP Bioinformatics Application Notes now accept articles up to 4 pages long (still one figure though): https://academic.oup.com/bioinformatics/pages/instructions_for_authors?login=true#Types%20of%20Manuscript

Aidan Lakshman (15:41:50) (in thread): > BMC Bioinformatics is a low-impact journal that seems to publish a lot of these things. Past that, seconding PLOS Comp Bio; Bioinformatics is a nice middle ground, and then if it's really good, NAR

2023-04-01

Wes W (12:23:35) (in thread): > thank you all for the recommendations! this is extremely helpful

vedha (14:44:07): > @vedha has joined the channel

2023-04-03

Jeroen Ooms (07:04:36): > I am getting installation errors, because R-4.3alpha on macOS has changed the default root directory for binary packages to big-sur-x86_64, for example: https://cran.r-project.org/bin/macosx/big-sur-x86_64/contrib/4.3/

Jeroen Ooms (07:05:05): > But the bioc repository still uses the old dir, so we get: > > Warning: Warning: unable to access index for repository[https://bioconductor.org/packages/3.17/bioc/bin/macosx/big-sur-x86_64/contrib/4.3](https://bioconductor.org/packages/3.17/bioc/bin/macosx/big-sur-x86_64/contrib/4.3): > cannot open URL '[https://bioconductor.org/packages/3.17/bioc/bin/macosx/big-sur-x86_64/contrib/4.3/PACKAGES](https://bioconductor.org/packages/3.17/bioc/bin/macosx/big-sur-x86_64/contrib/4.3/PACKAGES)' > Warning: Warning: unable to access index for repository[https://bioconductor.org/packages/3.17/data/annotation/bin/macosx/big-sur-x86_64/contrib/4.3](https://bioconductor.org/packages/3.17/data/annotation/bin/macosx/big-sur-x86_64/contrib/4.3): > cannot open URL '[https://bioconductor.org/packages/3.17/data/annotation/bin/macosx/big-sur-x86_64/contrib/4.3/PACKAGES](https://bioconductor.org/packages/3.17/data/annotation/bin/macosx/big-sur-x86_64/contrib/4.3/PACKAGES)' >

Jeroen Ooms (07:27:35): > Perhaps a quick fix is to create a symlink big-sur-x86_64 -> . inside /packages/3.17/bioc/bin/macosx/
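
The proposed fix can be sketched against an illustrative directory tree (the paths below are stand-ins, not the actual server layout):

```shell
# Create a stand-in for the .../bioc/bin/macosx tree
mkdir -p /tmp/bioc-demo/macosx/contrib/4.3
cd /tmp/bioc-demo/macosx

# Self-referencing symlink: the new big-sur-x86_64 name points back
# at the current directory, so .../macosx/big-sur-x86_64/contrib/4.3
# resolves to the existing .../macosx/contrib/4.3
ln -sfn . big-sur-x86_64

readlink big-sur-x86_64
```

Because the link target is `.`, no files are duplicated and both the old and new repository URLs keep working.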

Vince Carey (08:43:59): > Noted. We will get back to you here.

Mengbo Li (20:45:07): > @Mengbo Li has joined the channel

2023-04-04

Jacques SERIZAY (05:00:20): > @Jacques SERIZAY has joined the channel

2023-04-05

Vince Carey (13:36:42) (in thread): > We are working on resolving this and there will be additional updates here.

Joseph (14:34:55): > Hi everyone, apologies in advance if this is the wrong place to ask this. Has anyone worked with data from the TCGA / the Broad GDAC Firebrowse API? I would like to include a small subset of some example data (https://gdac.broadinstitute.org/) with our package. However, I am having difficulty making sense of the data usage policy (https://broadinstitute.atlassian.net/wiki/spaces/GDAC/pages/844333156/Data+Usage+Policy) and cannot seem to get into contact with anyone from the Broad or the Center for Cancer Genomics, so I just wanted to see if anyone has done something similar in the past? Thanks in advance.

Dario Strbenac (20:00:02): > Why not just TCGAbiolinks? You should be demonstrating interoperability with existing infrastructure. Firehose is also old and not harmonised across projects. - Attachment (Bioconductor): TCGAbiolinks > The aim of TCGAbiolinks is : i) facilitate the GDC open-access data retrieval, ii) prepare the data using the appropriate pre-processing strategies, iii) provide the means to carry out different standard analyses and iv) to easily reproduce earlier research results. In more detail, the package provides multiple methods for analysis (e.g., differential expression analysis, identifying differentially methylated regions) and methods for visualization (e.g., survival plots, volcano plots, starburst plots) in order to easily develop complete analysis pipelines.

Nick R. (21:25:35): > Hi all, I was wondering what the recommended way is to depend on a Suggests of a dependency. The motivating example is limma, which suggests statmod, yet sometimes throws an error when it is not available. What is the correct way to depend on statmod in my package if it will not be included in the namespace or used in any of the code?

Hervé Pagès (23:56:32) (in thread): > Maybe add it to your own Suggests? But if your package absolutely needs it then add it to Imports and selectively import a random symbol from it in your NAMESPACE, so R CMD check doesn't complain.

2023-04-06

Nick R. (00:11:41) (in thread): > Thanks for the suggestion. I was hoping there might be a more elegant solution but I guess it’s just a limitation of R’s package dependency management. Am I correct in thinking that it should not be in limma’s Suggests if it causes an error when it is not available?

Hervé Pagès (00:54:52) (in thread): > No, that’s expected from packages in Suggests. They provide additional functionality and if the package is not installed then that functionality doesn’t work, which generally translates into an error. But the error should be graceful and with an informative error message e.g. something like “you need to install package foo to use functionality xxx”. It should also fail as early as possible.
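
That pattern can be sketched in R (the function name and message below are illustrative, not limma's actual code):

```r
# Hypothetical wrapper around functionality that needs the suggested
# package 'statmod'; it fails early, gracefully, and with an
# actionable message when the package is missing.
fitWithStatmod <- function(n) {
  if (!requireNamespace("statmod", quietly = TRUE)) {
    stop("Package 'statmod' is required for fitWithStatmod(); ",
         "install it with install.packages(\"statmod\").",
         call. = FALSE)
  }
  statmod::gauss.quad(n)
}
```

Because the check uses `requireNamespace()` rather than `library()`, statmod stays in Suggests and is only loaded when the functionality is actually used.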

Pedro Sanchez (05:44:38) (in thread): > cBioPortalData would be another option, right Dario? - Attachment (Bioconductor): cBioPortalData > The cBioPortalData R package accesses study datasets from the cBio Cancer Genomics Portal. It accesses the data either from the pre-packaged zip / tar files or from the API interface that was recently implemented by the cBioPortal Data Team. The package can provide data in either tabular format or with MultiAssayExperiment object that uses familiar Bioconductor data representations.

Vince Carey (06:27:39) (in thread): > And if ^^ is not clearly articulated in the contributions.bioconductor.org guidelines, please file a PR to that doc resource

Tim Triche (11:55:46) (in thread): > It took me years before Aaron Lun explained that to me

Tim Triche (11:56:27) (in thread): > Sort of. The two are not always in sync, especially for copy number calls

Tim Triche (11:56:37) (in thread): > Careful down this rabbit hole

Andres Wokaty (13:16:10) (in thread): > I’ve created the symlinks. It should take a few days for propagation to happen and resolve these.

Marcel Ramos Pérez (18:06:34) (in thread): > There's a whole cadre of packages that provide TCGA data in and outside of Bioconductor. To name a few that use Firehose data: RTCGAToolbox gets data directly from the Broad GDAC Firehose (and curatedTCGAData provides that data in MultiAssayExperiment format). FirebrowseR (https://github.com/mariodeng/FirebrowseR) is also available but it hasn't been updated in quite a while… cBioPortalData also provides TCGA data from the TCGA project and from published studies. FWIW, issues with the actual cBioPortal data should be filed at https://github.com/cBioPortal/cBioPortal; they're pretty responsive. GenomicDataCommons provides harmonized data.

Kasper D. Hansen (21:22:32) (in thread): > Aaron is a busy dude

2023-04-07

Jayaram Kancherla (12:42:38) (in thread): > yup, I wrote a wrapper for @Aaron Lun's rds2cpp to read serialized R objects in Python. The painful part is the build process rather than the development. cibuildwheel is super nice but also super slow. ref: https://github.com/BiocPy/rds2py

Laurent Gatto (13:59:02) (in thread): > Thanks!

Jayaram Kancherla (13:59:40) (in thread): > let me know if you run into any issues, happy to help

2023-04-08

Jeroen Ooms (09:06:00) (in thread): > thank you!

2023-04-11

Nick R. (19:36:14): > Several packages are having trouble building on Mac due to a missing library (example, another, and one more). > > ERROR: R installation problem: File /Library/Frameworks/R.framework/Versions/4.3/Resources/lib/libgcc_s.1.dylib not found! > > These are all packages that include C source code.

Henrik Bengtsson (19:47:03) (in thread): > Seehttps://stat.ethz.ch/pipermail/bioc-devel/2023-April/019605.html

Nick R. (19:47:36) (in thread): > Thanks!

2023-04-13

Rory Stark (06:53:27): > In January, on the developer mailing list, there was a short thread entitled "Warnings in 'checking compiled code' of R CMD check" where what was previously a NOTE was now a WARNING when linking to Rhtslib. In that thread, @Hervé Pagès said "it's ok to ignore the warning". 22 packages link to Rhtslib and hence continue to have a WARNINGS status in the build. I just wanted to confirm that this is OK and we can expect the packages to generate warnings from here on out?

Mike Smith (10:29:57) (in thread): > I think that's still the consensus following various discussions in here on the topic, e.g. https://community-bioc.slack.com/archives/CLUJWDQF4/p1680166722059619 and https://community-bioc.slack.com/archives/CLUJWDQF4/p1677509567090139 - Attachment: Attachment > There was a discussion a while ago about some C++-related NOTES that were recently upgraded to WARNINGS: https://community-bioc.slack.com/archives/CLUJWDQF4/p1677509567090139 > > Is there any clearer official guidance on this? Can we ignore them? (as they are not directly introduced by code in the package, as far as I can tell?) - Attachment: Attachment > I'm currently seeing the following warning on the build system from some of my packages that bundle other libraries: > > * checking compiled code ... WARNING > Note: information on .o files is not available > File /home/pkgbuild/packagebuilder/workers/jobs/2925/R-libs/Rarr/libs/Rarr.so: > Found __printf_chk, possibly from printf (C) > Found __sprintf_chk, possibly from sprintf (C) > Found abort, possibly from abort (C) > Found printf, possibly from printf (C) > Found puts, possibly from printf (C), puts (C) > Found stderr, possibly from stderr (C) > I remember this got elevated from a NOTE to WARNING fairly recently, however I'm not able to reproduce it at all on my local machine. Does anyone know if there's a specific setting required to trigger this warning? I've tried using the Renviron.bioc and the version of R-devel available on the build system.

Hervé Pagès (11:00:30) (in thread): > Actually it seems that this NOTE was made a WARNING only in the case where the sprintf symbol is found in the shared object (.so file). The other symbols still generate a NOTE only. The good news is that Oleksii Nikolaienko convinced the HTSlib folks to remove all usage of sprintf from the HTSlib source code (see https://github.com/Bioconductor/Rhtslib/issues/31#issuecomment-1495609870). So next time we update Rhtslib with the latest HTSlib, these warnings should go away. - Attachment: Comment on #31 R CMD check warning on the __sprintf_chk symbol > Hi @hpages ,
> Please see the following discussion. Guys from HTSlib were very nice to substitute sprintfs with snprintfs so we could get rid of R CMD check warnings. Would it be possible to update Rhtslib to the most recent version?
> [Suggested solution for sprintfs in samtools parts of Rhtslib is that we fix them ourselves]
> Thanks!

Mike Smith (11:03:34) (in thread): > Excellent, I didn't realise it was only for that specific warning, but did notice it was either NOTE or WARNING on different platforms. I did the same switch to snprintf in rhdf5 to make the warnings go away there: https://github.com/grimbough/rhdf5/commit/d82311100b00075b58447b37485169701bb0363b

2023-04-14

Lambda Moses (01:38:50): > Any server admins here? How do you update R on servers? The person responsible for my institution's bioinformatics core is leaving, and I'm going to take over managing some of the bioinformatics core servers. To be honest, I find it kind of annoying that newer versions of Bioconductor force us to use newer versions of R although the packages most likely do run on somewhat older versions of R and we can't easily change the R version on a server. I haven't found a way for users to have independent versions of R that would still work with RStudio Server (at least the free version). Also when the major version is changed, we'll have to reinstall all the packages, which can take quite a while on Linux because they have to be compiled from source. Since the R major version changes only once a year, reinstalling all packages isn't too bad, but I don't want many users to wake up the next day and find all their packages gone because I upgraded from R 4.2 to R 4.3. I suppose I can send out a notification a week in advance and do the update on Saturday. I have asked for one day of sudo to update R system wide on my lab's server, but I would be more cautious with the bioinformatics core servers because more people are using them.

Laurent Gatto (02:22:50) (in thread): > We use different virtual machines for different R versions, each VM being accessible via ssh and RStudio server.

Hervé Pagès (02:33:21) (in thread): > > although the packages most likely do run on somewhat older versions of R > some will, some won't, and we don't want to play Russian roulette, do we?

Lambda Moses (02:39:13) (in thread): > @Laurent Gatto That sounds like a good idea. Do you use Singularity?

Pedro Sanchez (02:54:39) (in thread): > Although ours is not a set of servers but a high-performance computer (Linux), my labmates and I access it in the same way. We have Singularity images with the different versions of R or other languages. We install a core set of packages and then we allow users to install other packages they want in an .Rproject-dependent way. So, we also use ssh and RStudio Server

Davide Risso (02:55:47) (in thread): > I second @Laurent Gatto's comment. We (well, I'm just a user…) use Singularity and Docker containers (typically the devel and the latest release of the Bioconductor docker images)

Davide Risso (02:57:58) (in thread): > Related q for @Laurent Gatto re RStudio Server: our IT doesn't like that because it doesn't play nice with queue management systems (e.g. Slurm). Do you have a solution for that?

Jianhai Zhang (05:05:22): > Is there a way to know the list of installed packages on the daily building machines? In the help files of my package, there is library(SeuratData) (https://github.com/satijalab/seurat-data) and InstallData("stxBrain") (https://satijalab.org/seurat/articles/spatial_vignette.html). If not installed, can any Bioc administrator install it? Thanks. - Attachment (satijalab.org): Analysis, visualization, and integration of spatial datasets with Seurat > Seurat

Martin Morgan (06:18:52) (in thread): > I think the only thing to be managed is R version – RStudio and R / BiocManager user library management take care of the rest. Our HPC uses ‘modules’ which require limited user skills (hence something I can do!) and seems to be quite robust and light-weight. I’ve never administered a shared system, so…

Jeroen Ooms (07:42:05): > Hello, me again :slightly_smiling_face: Is it known/expected that the latest Bioconductor binaries for macOS now require at least macOS 12.0 (instead of 11.0, like CRAN)? @Vince Carey @Andres Wokaty

Jeroen Ooms (07:42:37): > When testing on macOS 11 (Big Sur), I get these sorts of errors:

Jeroen Ooms (07:42:39): > > Error in dyn.load(file, DLLpath = DLLpath, ...) : > unable to load shared object '/Users/runner/work/_temp/Library/fgsea/libs/fgsea.so': > dlopen(/Users/runner/work/_temp/Library/fgsea/libs/fgsea.so, 6): Symbol not found: __ZNKSt3__115basic_stringbufIcNS_11char_traitsIcEENS_9allocatorIcEEE3strEv > Referenced from: /Users/runner/work/_temp/Library/fgsea/libs/fgsea.so (which was built for Mac OS X 12.0) > Expected in: /usr/lib/libc++.1.dylib >

Jeroen Ooms (07:43:51): > It would be nice to target the same version of macOS as CRAN does, version 11.0 (Big Sur)

Jeroen Ooms (07:45:19): > I think you can accomplish this by adding -mmacosx-version-min=11.0 to your CPPFLAGS, but Simon Urbanek knows best.
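
If the build machines pick this up, the flag would presumably be applied across the compile and link flag sets, along these lines (a sketch of a ~/.R/Makevars fragment; the build system's actual configuration may differ):

```make
# Illustrative ~/.R/Makevars fragment targeting macOS 11 (Big Sur)
# so that produced binaries also load on 11.x, not just 12+
CPPFLAGS += -mmacosx-version-min=11.0
CFLAGS   += -mmacosx-version-min=11.0
CXXFLAGS += -mmacosx-version-min=11.0
LDFLAGS  += -mmacosx-version-min=11.0
```

Passing the flag at link time as well as compile time matters, since the deployment target is recorded in the final shared object.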

Lori Shepherd (07:48:24) (in thread): > If you look at the long report https://bioconductor.org/checkResults/3.17/bioc-LATEST/long-report.html you will see at the top for each builder that it shows installed pkgs, with a number that links to a page listing the packages and version numbers.

Lori Shepherd (07:49:05) (in thread): > What is your package? Make sure any package you need is stated in your DESCRIPTION as a Depends, Imports, or Suggests and remember all dependent packages must be on CRAN or Bioconductor in order to be installed.

Kasper D. Hansen (08:57:00): > @Jeroen Ooms I think I am correct in stating that we want to stick to the CRAN requirements for macOS. Thanks for the report and proposed fix

Jianhai Zhang (16:56:01) (in thread): > My package is spatialHeatmap.

Joseph (17:43:17) (in thread): > Hi @Lori Shepherd. I had some other quick questions about the deadlines. I see that April 21st is listed as the last chance for packages to pass the R CMD build/check tests, but the 24th is the last day to commit changes? Does this mean we can still push changes to the devel branch up until the 24th? > > Also, sorry if this has been covered, but I am having some trouble understanding the documentation. If we have bug fixes or updates to our package after the 24th, will these not get included until the October release?

Jianhai Zhang (20:42:37): > Hello, I have an example dataset to use in my vignette, which is > 10M. I want to host it (saved using saveRDS) on a git repo and read it from the vignette directly. However, there is always an error. When I manually download the .rds file from the git repo, everything is fine. Does anyone have solutions? > > Code: download.file('https://github.com/jianhaizhang/SHM_data/blob/main/spatial_single_cell/srt_sc.rds', "test.rds", method="wget"); test <- readRDS('test.rds')

Dirk Eddelbuettel (20:59:38) (in thread): > You can place the rds file inside a package, and then access it via system.file("data", "file.rds", package="yourdatapackage"). I do that in one package with a data set for tests. The package can then live in a so-called Additional_repositories: repo. Brooke and I once wrote about this very trick for 'suggested' data packages, see https://doi.org/10.32614/RJ-2017-026. As for your direct download from GH, that works too but note that the 'raw' URL for your dataset is https://github.com/jianhaizhang/SHM_data/raw/main/spatial_single_cell/srt_sc.rds which works for me: > > > download.file("[https://github.com/jianhaizhang/SHM_data/raw/main/spatial_single_cell/srt_sc.rds](https://github.com/jianhaizhang/SHM_data/raw/main/spatial_single_cell/srt_sc.rds)", "/tmp/srt_sc.rds") > trying URL '[https://github.com/jianhaizhang/SHM_data/raw/main/spatial_single_cell/srt_sc.rds](https://github.com/jianhaizhang/SHM_data/raw/main/spatial_single_cell/srt_sc.rds)' > Content type 'application/octet-stream' length 12185651 bytes (11.6 MB) > ================================================== > downloaded 11.6 MB > > > x <- readRDS("/tmp/srt_sc.rds") > > > > The key for any such GH download is to pick the 'raw' link.

Jianhai Zhang (22:00:46) (in thread): > Thank you so much!

Dirk Eddelbuettel (22:02:42) (in thread): > (The direct download approach is still tricky as you need to fail gracefully when there is no connection etc.)
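
One way to fail gracefully around such a download (a sketch, reusing the URL from the thread; the message text is illustrative):

```r
url <- "https://github.com/jianhaizhang/SHM_data/raw/main/spatial_single_cell/srt_sc.rds"
dest <- file.path(tempdir(), "srt_sc.rds")

# Treat any warning or error (no network, 404, timeout, ...) as a
# soft failure instead of aborting the vignette build
ok <- tryCatch({
  download.file(url, dest, mode = "wb")
  TRUE
}, error = function(e) FALSE, warning = function(w) FALSE)

if (ok) {
  test <- readRDS(dest)
} else {
  message("Data file unavailable; skipping this vignette chunk.")
}
```

Wrapping the vignette chunk this way keeps R CMD build from failing on machines without network access.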

Lori Shepherd (22:48:54) (in thread): > We like commits to be in by the 21st because that guarantees a build on the daily builder before we freeze and make the release. After the release, if there are bug corrections/fixes they would have to be pushed to both the new release_3_17 and devel branches. New features should only be introduced to devel tho.

Lori Shepherd (22:51:57) (in thread): > We do not allow data to be hosted on GitHub; it must be a trusted server… our hosted Bioconductor, institution-level servers, Zenodo, etc… this is Bioconductor policy

Lori Shepherd (22:53:55) (in thread): > Please do not download from GitHub, Dropbox, or other personal-level sites

Joseph (22:59:12) (in thread): > Great, thanks @Lori Shepherd. Our package was recently accepted, but there are some very small corrections I would like to include. Does this mean I can push to the devel branch as I have been doing, as long as it builds correctly and everything is done before the 21st?

Lori Shepherd (23:05:25) (in thread): > You can push at any time to devel except for about 3 hours when we create the new release. Everything pushed before the announcement of the freeze will be included in both the new release branch and devel. After that time any bug fixes would need to be pushed to both branches. > But short answer…yes

Lori Shepherd (23:06:54) (in thread): > If you download from GitHub we will reach out to correct it or the package will be at risk of deprecation

Joseph (23:11:47) (in thread): > Ok, great! Thank you so much for your time!

2023-04-15

Lluís Revilla (07:08:40) (in thread): > Also the Additional_repositories trick is not supported, as per the discussion in https://community-bioc.slack.com/archives/CLUJWDQF4/p1679668184759999 - Attachment: Attachment > Currently Bioc requires all packages to be on CRAN or Bioc; no additional repositories beyond that for stated package dependencies

Vince Carey (07:10:49) (in thread): > Please look at http://contributions.bioconductor.org/data.html. A large data resource should be prepared as an ExperimentHub component (or AnnotationHub if it happens to be annotation). See the ExperimentHubData/AnnotationHubData packages. - Attachment (contributions.bioconductor.org): Chapter 13 Package data | Bioconductor Packages: Development, Maintenance, and Peer Review > When developing a software package, an excellent practice is to give a comprehensive illustration of the methods in the package using an existing experiment data package, annotation data or data…

2023-04-17

Laurent Gatto (02:30:46) (in thread): > Our admins use Proxmox virtual environments, which come with the Ceph storage system (I am only a user here). Software bundles are handled by HPC modules, as mentioned by Martin.

Laurent Gatto (02:41:42) (in thread): > @Davide Risso - yes, good point. The R/RStudio VMs aren’t part of the HPC cluster - they run on the same hardware, but are provided for different uses. The R/RStudio VMs (we have R 4.1, 4.2 and devel) have a decent amount of RAM (I think 64G, 8 cores) for interactive use. If a user needs more capacity and parallel processing, then they would use the cluster’s compute nodes via slurm.

2023-04-18

Jianhai Zhang (13:10:02): > Hello, can any administrator install the promises package on palomino3 Windows Server? I got an error due to absence of this package: http://bioconductor.org/checkResults/devel/bioc-LATEST/spatialHeatmap/palomino3-buildsrc.html

Andres Wokaty (14:07:13): > @Jianhai Zhang I updated R to the R RC for the coming release; it usually takes a few days for the builders to reinstall everything, but I’ll make sure there isn’t an issue with promises.

Andres Wokaty (14:35:39) (in thread): > There doesn’t seem to be an issue with promises and it was installed during the course of that run, so we should see the number of errors on the Windows devel builder decrease in tomorrow’s report.

Jermiah Joseph (16:41:36): > @Jermiah Joseph has joined the channel

2023-04-19

Michael Love (07:59:32) (in thread): > it was something even more basic that I was missing:if: github.ref == 'refs/heads/master' &&->if: github.ref == 'refs/heads/devel' &&
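For context, the guard in question lives in a workflow file; a hypothetical excerpt (job and step names invented) showing the corrected condition after Bioconductor's master-to-devel branch rename:

```yaml
# .github/workflows/check.yml (hypothetical excerpt)
on: [push, pull_request]
jobs:
  docs:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Deploy pkgdown site
        # a guard still keyed on refs/heads/master silently never fires
        # once the default branch has been renamed to devel
        if: github.ref == 'refs/heads/devel' && github.event_name == 'push'
        run: echo "deploying docs"
```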

Vince Carey (09:16:20) (in thread): > is this all sorted yet?

2023-04-20

Helen Lindsay (08:08:23): > @Helen Lindsay has joined the channel

Jianhai Zhang (23:14:17) (in thread): > The data are hosted on Zenodo now.

2023-04-21

Vince Carey (00:10:55) (in thread): > Great. Now you can create an ExperimentHubData package with information about the Zenodo address for retrieval and metadata, etc. See the vignette for the ExperimentHubData package.

Lori Shepherd (07:27:09) (in thread): > Or at minimum use BiocFileCache to do file caching.

2023-04-22

Jianhai Zhang (18:13:30) (in thread): > Thanks for the suggestion.

2023-04-25

Jeroen Ooms (06:14:46) (in thread): > no

Jeroen Ooms (06:14:53) (in thread): > I am still getting these errors

Vince Carey (09:39:15) (in thread): > Tomorrow is our official release of 3.17. We’ll tackle it Thursday if all goes according to plan.

Jeroen Ooms (09:39:49) (in thread): > ok. lmk if you need any help

2023-04-26

Henrik Bengtsson (11:42:22): > Issue: https://code.bioconductor.org/browse/ doesn’t show any commits after 2023-04-25 11:44:10 UTC (i.e. ~28 hours ago). I pushed commits ~12 hours ago. Is that a bug, or intentional?

Mike Smith (12:12:56): > It’s possible that the operation to update is still running, as it will be syncing every package after the version bump. I’d be surprised if it took more than 24 hours (it hasn’t in the past), but that seems the most likely explanation. I’ll see if I can diagnose anything.

Nils Eling (15:35:15): > Hi all, is there any way to perform a bug fix on the 3.16 release? I only caught this a week ago and therefore wasn’t able to push it to the 3.16 branch. I believe that most users won’t update their R version for quite some time and therefore won’t work with the correct version of the package.

Lori Shepherd (15:37:26): > unfortunately no. We do not allow any changes to frozen branches since we have no way to test them or build against the dependencies anymore

Lori Shepherd (15:40:38): > Thanks to all developers and community members for contributing to the project! Bioconductor 3.17 is now available! > Please see the full release announcement:https://bioconductor.org/news/bioc_3_17_release/

Henrik Bengtsson (18:41:45): > The <https://bioconductor.org/config.yaml> file says that Bioc devel (3.18) targets R 4.3.0. Is that correct, or should it be R 4.4.0 (=R devel)? > > ## CHANGE THIS WHEN WE RELEASE A VERSION: > release_version: "3.17" > r_version_associated_with_release: "4.3.0" > r_version_associated_with_devel: "4.3.0" > > ## CHANGE THIS WHEN WE RELEASE A VERSION: > devel_version: "3.18" >

Marcel Ramos Pérez (18:46:25): > Yes, that is right. Note the pattern in the releases: https://bioconductor.org/about/release-announcements/ and more concretely https://contributions.bioconductor.org/use-devel.html?q=R-devel#which-version-of-r

Kasper D. Hansen (18:50:46): > We’re in the off-devel cycle, because 4.4 doesn’t get released until a year from now.

Kasper D. Hansen (18:51:13): > So 3.18 will use 4.3.x with x being whatever minor release is operational when 3.18 is released

Kasper D. Hansen (18:52:22): > so 4.3.0 will almost surely (based on history) really be 4.3.1 or 4.3.2, but those don’t exist yet, which is why I think the yaml says 4.3.0. I’m guessing it will be updated later

Henrik Bengtsson (19:08:59): > Doh … I should know this (by now). Despite all the coffee I had today, I apparently didn’t have enough. (My instinct today was that we move to R-devel as soon as a new version is available, but the rule of thumb is that it only happens ~1/2 year in.)

2023-04-27

Nils Eling (04:26:58) (in thread): > Thanks! That’s unfortunate.

Nils Eling (04:50:51): > Hi all, a question related to my previous one and one that I already posted on the mailing list: It seems that for the past couple of months install/build/check on macOS has taken more than twice as long as on Linux/Windows (e.g. https://bioconductor.org/checkResults/release/bioc-LATEST/scater/). This leads to TIMEOUT for packages with longer check times (e.g. basilisk, benchdamic, BgeeDB, ChIPseeker, ClassifyR, etc.). It seems that the macOS binaries for those packages are not up to date with the current release version (e.g. for my package imcRtools the macOS binary version is 1.5.2 while release was 1.5.5 –> 1.6.0). Can we expect the install/build/check times to go down again or would we need to rewrite the checks?

Vince Carey (09:56:18) (in thread): > Our hypothesis at the moment is that by introducing the macOS 11.3 SDK on the builders as noted at https://mac.r-project.org/ this problem will be resolved. @Jeroen Ooms – any improvements? @Andres Wokaty can indicate whether I have it right. - Attachment (mac.r-project.org): R for macOS Developers > This is the home for experimental binaries and documentation related to R for macOS. To learn more about the R software or download released versions, please visit www.r-project.org.

Andres Wokaty (10:32:50) (in thread): > Yes, we’re using MacOSX 11 SDK and targeting MacOSX 11 following Simon’s recommendations to build binaries. We also resolved some issues with linked libraries. Please try getting a more recent version of the binaries. If you’re still having an issue, it would be helpful to know more about how you’re testing as well as the package.

Hervé Pagès (16:07:54): > Unfortunately merida1 (Intel Mac) is a much slower machine than nebbiolo1 (Linux) or palomino3 (Windows). I don’t think the TIMEOUT we see for imcRtools on merida1 is new because we’ve used this machine since the beginning of the BioC 3.17 daily builds last November. Also the fact that no x86_64 Mac binary is available for imcRtools suggests that the TIMEOUT has been here since the beginning of the 3.17 builds (note that the Mac binary version 1.5.2 that is currently available is an arm64 binary). > The unit tests in the package take more than 10 min. (see timings at the bottom of https://bioconductor.org/checkResults/3.17/bioc-LATEST/imcRtools/nebbiolo1-checksrc.html or https://bioconductor.org/checkResults/3.17/bioc-LATEST/imcRtools/nebbiolo1-checksrc.html). Have you considered moving the heaviest tests to the long tests builds? See https://contributions.bioconductor.org/long-tests.html for more info. - Attachment (contributions.bioconductor.org): B Long Tests | Bioconductor Packages: Development, Maintenance, and Peer Review > B.1 What are they Code in the tests subdirectory of all Bioconductor software packages is run by R CMD check on a daily basis as part of the Bioconductor nightly builds. The maximum amount of time…
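Opting in to the long-tests builds described in that chapter is mostly mechanical: slow tests move into a `longtests/` directory that mirrors `tests/`, and a `.BBSoptions` file at the top of the package switches them on. A minimal sketch (the package name and test file contents are hypothetical):

```shell
# Sketch: restructure a package so heavy tests run in Bioconductor's
# long-tests builds instead of the daily R CMD check.
set -e
pkg=$(mktemp -d)/MyPkg                 # hypothetical package source tree
mkdir -p "$pkg/tests" "$pkg/longtests"

# fast tests stay under tests/ as usual
printf 'testthat::test_check("MyPkg")\n' > "$pkg/tests/testthat.R"
# heavy tests use the same runner layout under longtests/
printf 'testthat::test_check("MyPkg")\n' > "$pkg/longtests/testthat.R"

# .BBSoptions at the package top level opts in to the long-tests builds
printf 'RunLongTests: TRUE\n' > "$pkg/.BBSoptions"
cat "$pkg/.BBSoptions"
```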

2023-04-28

Nils Eling (02:53:41) (in thread): > Thanks Hervé! I wasn’t fully aware of the change in machines and will move some of the tests to long tests.

Jeroen Ooms (07:32:52) (in thread): > Yes it looks like all builds that failed in the past week, are now passing. Thanks for fixing this!

2023-05-01

Michael Love (14:33:12): > I haven’t poked around on this at all yet, but any thoughts? Users are getting > > unable to load shared object '/Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/site-library/DESeq2/libs/DESeq2.so': > > on Mac x86 machines with the Mac binary in Bioc 3.17.https://support.bioconductor.org/p/9151097

Andres Wokaty (15:45:44) (in thread): > I’ll look into this.

Kasper D. Hansen (15:51:42) (in thread): > Isn’t this about the user having to install the new version of gfortran?

Kasper D. Hansen (15:52:29) (in thread): > From https://mac.r-project.org/tools/

Kasper D. Hansen (15:53:05) (in thread): > Or is the build supposed to statically link against gfortran?

Kasper D. Hansen (15:55:21) (in thread): > The last issue on that page (from Cynthia) seems weird: it tries to reference a 4.2/Resources/lib/libR.dylib (note the version 4.2 instead of the expected 4.3)

Andres Wokaty (15:55:44) (in thread): > No, they shouldn’t have to. When we create the binaries, we include some libraries for the user. The problem is with the code generating the binaries, which I’ll fix.

Kasper D. Hansen (15:58:38) (in thread): > yeah, so that’s static linking against gfortran

Hervé Pagès (16:11:27) (in thread): > It’s not static linking. Simon bundles those libs in the R binary e.g. for the intel binary: > > merida1:~ biocbuild$ ls /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/ > libR.dylib libRblas.dylib libRlapack.dylib libquadmath.0.dylib > libR.dylib.dSYM libRblas.dylib.dSYM libRlapack.dylib.dSYM pkgconfig > libRblas.0.dylib libRblas.vecLib.dylib libgcc_s.1.1.dylib > libRblas.0.dylib.dSYM libRblas.vecLib.dylib.dSYM libgfortran.5.dylib > > so we just modify the paths to link to these instead (with install_name_tool).

Kasper D. Hansen (16:20:32) (in thread): > ah! Interesting!

Hervé Pagès (17:04:04) (in thread): > see https://github.com/Bioconductor/BBS/issues/281 for the details

2023-05-02

Robert Castelo (10:13:02): > Hi, the devel version of GSVA doesn’t seem to be available through BiocManager::install(): > > BiocManager::version() > [1] '3.18' > BiocManager::install("GSVA", force=TRUE) > 'getOption("repos")' replaces Bioconductor standard repositories, see > 'help("repositories", package = "BiocManager")' for details. > Replacement repositories: > CRAN: https://cloud.r-project.org > Bioconductor version 3.18 (BiocManager 1.30.20), R 4.3.0 (2023-04-21) > Installing package(s) 'GSVA' > Warning message: > package 'GSVA' is not available for Bioconductor version '3.18' > > A version of this package for your version of R might be available elsewhere, > see the ideas at https://cran.r-project.org/doc/manuals/r-patched/R-admin.html#Installing-packages > > .. and in fact the devel landing page still points to the last devel version 1.47.3, instead of the current devel of the package 1.49.0. Other packages that I maintain do install their corresponding devel versions, so maybe the problem is specific to this package.

Lori Shepherd (10:20:31): > So while the landing page says 1.47.3, if you look at the bottom of the landing page there are no source or binary available. If you look at the build report http://bioconductor.org/checkResults/devel/bioc-LATEST/GSVA/ – while it’s green OK, there is a red button indicating that it was held back from propagating. Hovering over the red dot, it says that BiocSingular is unavailable, so it held back propagation (which in turn was held back because ScaledMatrix is unavailable and indeed did fail to build)

Lori Shepherd (10:21:49): > packages can be found on the builder because we do an install step before build/check – but if a package can’t be propagated and found by the general user because of failed propagation of a dependency, it will show a clean build report yet will not propagate until its dependency is available

Lori Shepherd (10:30:09): > And while we will reach out to any package that is failing – and there are auto notifications of package failures – we encourage maintainers that have packages failing due to dependencies to reach out to other maintainers to politely encourage quick fixes:slightly_smiling_face:

Kasper D. Hansen (10:38:01): > Social pressure is real and it works!

Robert Castelo (11:49:52): > ah, ok, I saw the green OK labels and forgot to hover over the red dots. If I’ve navigated correctly the dependency paths, the problem seems to stem from DelayedArray (hi @Hervé Pagès!!), but I still do not completely understand the situation. Looking at the log of DelayedArray, one could speculate that the internal transition to depend on S4Arrays could be the source of the problem: > > Author: Hervé Pagès <hpages.on.github@gmail.com> > Date: Sun Apr 30 19:11:45 2023 -0700 > > small tweak > > commit 84ac5ab023cb987287a75ea67860e502a97705f8 > Author: Hervé Pagès <hpages.on.github@gmail.com> > Date: Sun Apr 30 17:43:34 2023 -0700 > > DelayedArray now depends on S4Arrays > > commit 87328f68b8c33474b596054650db2efdb88772ca > Author: J Wokaty <jwokaty@users.noreply.github.com> > Date: Tue Apr 25 10:51:17 2023 -0400 > > bump x.y.z version to odd y following creation of RELEASE_3_17 branch > > However, there is no source/binary available for DelayedArray before that transition either, shouldn’t the build system have propagated the source and binary versions that follow right after the bumping commit on April 25th??

Hervé Pagès (12:18:23): > Looks like GSVA depends indirectly on ScaledMatrix (via BiocSingular), which fails to pass CHECK in BioC 3.18: https://bioconductor.org/checkResults/3.18/bioc-LATEST/ScaledMatrix/nebbiolo2-checksrc.html The reason R CMD check complains that “there is no package called ‘ResidualMatrix’” is because for some reason it seems to think that the ScaledMatrix.Rmd vignette needs ResidualMatrix. But since the latter is not in Suggests, that causes R CMD check to fail (remember that we use _R_CHECK_SUGGESTS_ONLY_=true for the Linux builds). > Yes, a PR for ScaledMatrix or a little bit of social pressure might help.
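To catch this class of failure before the builders do, one can run the same stricter check locally by setting the environment variable the Linux builders use; a sketch (the tarball name is hypothetical):

```shell
# Sketch: mimic the Bioconductor Linux builders, which run R CMD check
# with _R_CHECK_SUGGESTS_ONLY_=true so that code, examples, or vignettes
# using a package absent from Depends/Imports/Suggests fail the check.
export _R_CHECK_SUGGESTS_ONLY_=true
echo "checking with _R_CHECK_SUGGESTS_ONLY_=$_R_CHECK_SUGGESTS_ONLY_"
# then, against your built tarball (hypothetical file name):
#   R CMD check ScaledMatrix_1.9.0.tar.gz
```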

Lori Shepherd (12:22:01) (in thread): > A PR / fix should be applied to both the RELEASE_3_17 and devel branches; we plan to reintroduce _R_CHECK_SUGGESTS_ONLY_=true on release in the next few weeks but removed it temporarily so more packages could initially pass in the release.

Hervé Pagès (12:46:54) (in thread): > > there is no source/binary available for DelayedArray > Hmm…I see a source tarball here: https://bioconductor.org/packages/3.18/DelayedArray. Furthermore, I don’t see a red LED for nebbiolo2 revealing that a package is not allowed to propagate because the source tarball for DelayedArray is not available:thinking_face:

Robert Castelo (13:00:54) (in thread): > you’re right, i saw the windows and mac binaries missing, and overlooked the presence of the source tarball, then hovering over the red dot from ResidualMatrix said that DelayedArray wasn’t available, but of course this was for windows, not linux. sorry for the noise.

Robert Castelo (13:10:36) (in thread): > so, if i understood correctly what happened, _R_CHECK_SUGGESTS_ONLY_=true was removed temporarily to help packages pass to the release without errors, but kept intact in the devel build system, which resulted in some packages missing source tarballs and binaries in the devel after the version bump due to errors derived from such a flag. i’d suggest for the next release to synchronize that action, i.e., remove _R_CHECK_SUGGESTS_ONLY_=true in both release and devel, so that the next devel cycle starts with as many source tarballs and binaries as possible, and then reintroduce it simultaneously in both branches.

Hervé Pagès (13:26:19): > > so that the next devel cycle starts with as many source tarballs > We purposely didn’t want to do this. By not propagating source tarballs and their rev deps that fail due to the _R_CHECK_SUGGESTS_ONLY_=true setting on Linux, we increase social pressure and these errors (which are super easy to fix) will get taken care of faster:wink:

Robert Castelo (13:45:56): > ah .. okey .. i see the logic, but this implies that development in pkgs that need those missing upstream deps cannot resume right after the release, until those fixes are propagated .. well it can resume, but one won’t be able to build and test .. let’s increase social pressure by putting an emoji:rage:

Hervé Pagès (13:49:18): > That emoji might have more impact if you put it somewhere on the ScaledMatrix GitHub repo:wink:

Lori Shepherd (13:51:34): > The idea of installing all packages and then applying the red propagation block is that packages that depend on another can still get an accurate build report even though the dependency is failing in build/check, so most individual package functionality can be verified. And while it’s inconvenient, you could still download the git repo and install a needed dependency manually (though yes, less than ideal; better that the dependency’s maintainer fixes their package asap)

Henrik Bengtsson (14:46:43) (in thread): > Contrary to CRAN, there are no emails going out to Bioc package maintainers when their packages start to fail, correct? I always thought it was up to the community to pitch in and report issues to maintainers whenever there’s a problem, unless the maintainer notices it themselves.

Hervé Pagès (15:00:26) (in thread): > > Contrary to CRAN, there are no emails going out to Bioc package maintainers > Yes there are. Frequency can vary but automatic notification emails are usually sent at least once a week for ERRORs or TIMEOUTs on Linux only. I think they were not activated in 3.18 yet (the 3.18 builds are very new and we were waiting for them to stabilize), but they should be by now.

Henrik Bengtsson (15:20:57) (in thread): > Perfect. I didn’t know that.

Kasper D. Hansen (17:59:21) (in thread): > However, up to the release I had issues with 2 packages: bsseq and affxparser. I got emails about bsseq but not affxparser.

Henrik Bengtsson (21:32:42) (in thread): > > I got emails about bsseq but not affxparser. > The ‘affxparser’ issue was an MS Windows-only error. From @Hervé Pagès’s comment above, it sounds like automatic emails only happen for issues detected on the Linux check servers.

Hervé Pagès (22:28:40) (in thread): > Yep, for Linux only. The reason for this is that the risk of false positives tends to be significantly higher on Windows and Mac, so we’ve always been a little hesitant to include these platforms in the automatic emails.

2023-05-05

Steffen Neumann (12:12:15): > Hi, I hope this is not a total FAQ: is there a distribution-independent way to install system dependencies?

Steffen Neumann (12:14:57) (in thread): > The BioC docker container is based on Ubuntu / Debian’ish apt install stuff. I want / need to test building on e.g. Fedora, and the existing GitHub Actions have apt install libxyz-dev sprinkled across them. So for rpm-based distributions that’d need to be rpm, yum, dnf or whatever. > Ideas? Preferably ones that do not involve conda?
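One low-tech piece of this is detecting the package manager at runtime, so a single CI script runs on both apt- and dnf-based images. A sketch (it deliberately does not solve the harder problem that package names themselves differ per distro, e.g. libxml2-dev vs libxml2-devel):

```shell
# Sketch: pick the system package manager at runtime so one CI script
# can run on Debian/Ubuntu (apt) and Fedora (dnf) images alike.
detect_pkg_mgr() {
  if command -v apt-get >/dev/null 2>&1; then echo apt-get
  elif command -v dnf   >/dev/null 2>&1; then echo dnf
  elif command -v yum   >/dev/null 2>&1; then echo yum
  else echo none; return 1
  fi
}

mgr=$(detect_pkg_mgr) || true
echo "detected: $mgr"
# a CI step would then dispatch, e.g.:
#   sudo "$mgr" install -y <distro-specific package names>
```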

Dirk Eddelbuettel (12:36:40) (in thread): > Yes/no/kinda. No in the ‘cannot easily bring what is system-level stuff into application level’ but Inaki’s bspm comes pretty close when and where it can rely on system-level repos, such as his ‘copr cran’ repo for Fedora, Detlef’s repo for OpenSUSE, or my r2u for Ubuntu. The pak package tries to fill the gap too but fundamentally there are reasons why some things are at the OS level.

Steffen Neumann (12:38:33) (in thread): > Thanks, I will try https://cran.r-project.org/web/packages/bspm/index.html - Attachment (cran.r-project.org): bspm: Bridge to System Package Manager > Enables binary package installations on Linux distributions. Provides functions to manage packages via the distribution’s package manager. Also provides transparent integration with R’s install.packages() and a fallback mechanism. When installed as a system package, interacts with the system’s package manager without requiring administrative privileges via an integrated D-Bus service; otherwise, uses sudo. Currently, the following backends are supported: DNF, APT, ALPM.

Dirk Eddelbuettel (12:39:05) (in thread): > It is of course ‘easiest when wrapped’ so do look at r2u: https://eddelbuettel.github.io/r2u/ - Attachment (eddelbuettel.github.io): CRAN as Ubuntu Binaries - r2u > Cheap, fast, reliable – pick any three!

Steffen Neumann (12:40:40) (in thread): > No, my stuff already works on Ubuntu, but I will add GitHub Actions testing in Fedora containers to the OS matrix or so

Dirk Eddelbuettel (12:41:29) (in thread): > Gotcha. We have a container rocker/r-bspm:f38 I used this week for a ‘on Fedora’ check I needed to do.

Steffen Neumann (12:43:45) (in thread): > Iirc I came across that but failed to understand it. I might come back with more questions when I try again. But not today anymore. Thanks for the input and heads up!

2023-05-08

Axel Klenk (08:47:30): > @Axel Klenk has joined the channel

2023-05-12

Mike Morgan (12:08:49): > This isn’t per se a Bioconductor question, so please feel free to tell me to go away and ask elsewhere. > > I’m hitting a computational bottleneck due to large dense matrix multiplications - this is all being done with RcppArmadillo for scalability reasons (https://github.com/MarioniLab/miloR/tree/genetic_case). Conceptually, this seems like exactly what GPUs are designed for; however, from reading around, the overhead of moving data from CPU to GPU and back is considerable and might wipe out any performance gains. > > So, has anyone experience of handling a similar situation during their package development, and what approach did you opt for eventually?

Kasper D. Hansen (13:03:21): > is this for one specific dataset or do you want to have a general solution?

Kasper D. Hansen (13:03:40): > I am saying this, because it is not too hard to roll your own on your own system but it is harder to generalize

Kasper D. Hansen (13:04:03): > Matrix mult (depending on the dimensions) is embarrassingly parallel and you can exploit that

Kasper D. Hansen (13:04:53): > Once you start to hit memory limits, data access is really the bottleneck, but IME that gets to be very system dependent

Aaron Lun (14:28:44): > also, at least some GPU operations are non-deterministic, depending on the partitioning of tasks to threads/cores/whatever. numerical imprecision can quickly add up and give you different results between runs

Aaron Lun (14:29:06): > and logistically it’s just a pain. expensive compute on AWS and all GPUs on local clusters are always hogged by NN trainers

Dirk Eddelbuettel (14:30:48) (in thread): > Funny way to spell bitcoin miners you have there.

Mike Morgan (16:25:15): > Thanks Kasper and Aaron. This is part of package devel, so needs to be generalized across systems. The dimensions range from 100’s to 1000’s, usually with 2 or more matrix multiplication steps. It sounds like it might be more headache than it’s worth at this point, and I’d use my time better trying to get a parallelized version running.

2023-05-17

Federico Marini (03:49:06): > Encountering a weird error which I cannot really pinpoint, for a report that used to run fine in the previous bioc version

Federico Marini (03:49:42): > I am integrating a couple of sce objects and doing the “common joint processing” - sce_processed_all is a list of sces > > > sce_processed_all <- mapply(FUN=runPCA, x=sce_processed_all, subset_row=all.hvgs, > + MoreArgs=list(ncomponents=25, BSPARAM=RandomParam()), > + SIMPLIFY=FALSE) > Error: BiocParallel errors > 1 remote errors, element index: 1 > 0 unevaluated and other errors > first remote error: > Error in .read_block_OLD(x, viewport, as.sparse = as.sparse): could not find function ".read_block_OLD" >

Federico Marini (03:49:59): > a google search did not even find the read_block_OLD function

Federico Marini (03:50:21): > I did a quick scan of code.bioconductor.org as well: https://code.bioconductor.org/search/search?q=read_block_OLD - Attachment (code.bioconductor.org): Bioconductor Code: Search > Search source code across all Bioconductor packages

Federico Marini (03:51:24): > any clue which underlying package is leading to this failure?

Federico Marini (03:52:41): > I could also use a fallback solution with “indexing” things, but this one is a little more elegant

Federico Marini (03:54:18): > I could not see where I actually turn on the parallel computing here, any help is very appreciated:wink:

Peter Hickey (04:06:47): > I’m guessing it’s related to the stuff @Hervé Pagès is doing re-factoring DelayedArray/S4Arrays/SparseArray (https://github.com/Bioconductor/DelayedArray/blob/devel/TODO) although the function name in the error is slightly different

Peter Hickey (04:07:28): > but that’s all I can guess:sweat_smile:

Federico Marini (05:36:34): > likely -> - File (PNG): image.png

Federico Marini (05:36:46): > thanks Pete:wink:

Mike Morgan (05:51:10): > A user raised an issue on miloR this morning relating to this issue, but I don’t get it on my local machine. I’ll point the user to the DelayedArray issue for now.

Martin Morgan (07:15:24) (in thread): > BiocManager::valid() ?

Hervé Pagès (13:54:35) (in thread): > sessionInfo() please? Make sure you have the latest S4Arrays (1.1.4) and DelayedArray (0.27.2)

2023-05-18

Oluwafemi Oyedele (05:53:35): > @Oluwafemi Oyedele has joined the channel

Federico Marini (06:47:05) (in thread): > it is, for the 3.17 - need to be on it for consistency with the start of the project

Federico Marini (06:48:02) (in thread): > as in the reply to@Martin Morgan, I still am (have to be) on bioc 3.17

Federico Marini (06:48:21) (in thread): > I probably don’t even have access to these versions (correctly)

Federico Marini (06:50:05) (in thread): > > > sessionInfo() > R version 4.3.0 (2023-04-21) > Platform: x86_64-apple-darwin20 (64-bit) > Running under: macOS Monterey 12.6.4 > > Matrix products: default > BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib > LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0 > > locale: > [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 > > time zone: Europe/Berlin > tzcode source: internal > > attached base packages: > [1] stats4 stats graphics grDevices utils datasets methods base > > other attached packages: > [1] CellMixS_1.16.0 kSamples_1.2-9 SuppDists_1.1-9.7 bluster_1.10.0 Matrix_1.5-4 > [6] scDblFinder_1.14.0 celldex_1.10.0 SingleR_2.2.0 batchelor_1.16.0 BiocSingular_1.16.0 > [11] DropletUtils_1.20.0 scran_1.28.1 scater_1.28.0 ggplot2_3.4.2 scuttle_1.10.1 > [16] iSEE_2.12.0 SingleCellExperiment_1.22.0 SummarizedExperiment_1.30.1 Biobase_2.60.0 GenomicRanges_1.52.0 > [21] GenomeInfoDb_1.36.0 IRanges_2.34.0 S4Vectors_0.38.1 BiocGenerics_0.46.0 MatrixGenerics_1.12.0 > [26] matrixStats_0.63.0 knitr_1.42 > > loaded via a namespace (and not attached): > [1] splines_4.3.0 later_1.3.1 BiocIO_1.10.0 bitops_1.0-7 filelock_1.0.2 > [6] tibble_3.2.1 R.oo_1.25.0 XML_3.99-0.14 lifecycle_1.0.3 edgeR_3.42.2 > [11] doParallel_1.0.17 lattice_0.21-8 MASS_7.3-60 magrittr_2.0.3 limma_3.56.1 > [16] sass_0.4.6 rmarkdown_2.21 jquerylib_0.1.4 yaml_2.3.7 metapod_1.8.0 > [21] httpuv_1.6.11 cowplot_1.1.1 DBI_1.1.3 RColorBrewer_1.1-3 ResidualMatrix_1.10.0 > [26] zlibbioc_1.46.0 purrr_1.0.1 R.utils_2.12.2 RCurl_1.98-1.12 rappdirs_0.3.3 > [31] circlize_0.4.15 GenomeInfoDbData_1.2.10 ggrepel_0.9.3 irlba_2.3.5.1 dqrng_0.3.0 > [36] DelayedMatrixStats_1.22.0 codetools_0.2-19 DelayedArray_0.26.2 DT_0.27 tidyselect_1.2.0 > [41] shape_1.4.6 ScaledMatrix_1.8.1 viridis_0.6.3 shinyWidgets_0.7.6 BiocFileCache_2.8.0 > [46] 
GenomicAlignments_1.36.0 jsonlite_1.8.4 GetoptLong_1.0.5 BiocNeighbors_1.18.0 ellipsis_0.3.2 > [51] ggridges_0.5.4 iterators_1.0.14 foreach_1.5.2 tools_4.3.0 Rcpp_1.0.10 > [56] glue_1.6.2 gridExtra_2.3 xfun_0.39 mgcv_1.8-42 dplyr_1.1.2 > [61] HDF5Array_1.28.1 shinydashboard_0.7.2 withr_2.5.0 BiocManager_1.30.20 fastmap_1.1.1 > [66] rhdf5filters_1.12.1 fansi_1.0.4 shinyjs_2.1.0 digest_0.6.31 rsvd_1.0.5 > [71] R6_2.5.1 mime_0.12 colorspace_2.1-0 RSQLite_2.3.1 R.methodsS3_1.8.2 > [76] tidyr_1.3.0 utf8_1.2.3 generics_0.1.3 data.table_1.14.8 rtracklayer_1.60.0 > [81] httr_1.4.6 htmlwidgets_1.6.2 S4Arrays_1.0.4 pkgconfig_2.0.3 gtable_0.3.3 > [86] blob_1.2.4 ComplexHeatmap_2.16.0 XVector_0.40.0 htmltools_0.5.5 rintrojs_0.3.2 > [91] clue_0.3-64 scales_1.2.1 png_0.1-8 rstudioapi_0.14 rjson_0.2.21 > [96] nlme_3.1-162 curl_5.0.0 shinyAce_0.4.2 cachem_1.0.8 rhdf5_2.44.0 > [101] GlobalOptions_0.1.2 BiocVersion_3.17.1 parallel_4.3.0 miniUI_0.1.1.1 vipor_0.4.5 > [106] AnnotationDbi_1.62.1 restfulr_0.0.15 pillar_1.9.0 grid_4.3.0 vctrs_0.6.2 > [111] promises_1.2.0.1 dbplyr_2.3.2 beachmat_2.16.0 xtable_1.8-4 cluster_2.1.4 > [116] beeswarm_0.4.0 evaluate_0.21 Rsamtools_2.16.0 cli_3.6.1 locfit_1.5-9.7 > [121] compiler_4.3.0 rlang_1.1.1 crayon_1.5.2 ggbeeswarm_0.7.2 viridisLite_0.4.2 > [126] BiocParallel_1.34.1 munsell_0.5.0 Biostrings_2.68.1 colourpicker_1.2.0 ExperimentHub_2.8.0 > [131] sparseMatrixStats_1.12.0 bit64_4.0.5 Rhdf5lib_1.22.0 KEGGREST_1.40.0 statmod_1.5.0 > [136] shiny_1.7.4 interactiveDisplayBase_1.38.0 AnnotationHub_3.8.0 igraph_1.4.2 memoise_2.0.1 > [141] bslib_0.4.2 xgboost_1.7.5.1 bit_4.0.5 >

Tim Triche (12:37:24): > @Aaron Lun @Kevin Rue-Albrecht is there any appetite for a NonlinearEmbeddingMatrix type of container in SingleCellExperiment? I’m using LinearEmbeddingMatrix to store diagonalized NMF fits and realized it would be nice to store uwot’s returned model alongside UMAP coords (for similar reasons as with NMF in LEM)

Aaron Lun (12:38:26): > probably best to create another package. the LEM itself really should be in a separate package dedicated to EMs, its presence in SCE is just a historical quirk that no one’s fixed.

Tim Triche (12:38:50): > BiocEmbeddings or some such?

Tim Triche (12:39:01): > are LEMs used anywhere besides in SCEs?

Tim Triche (12:39:37): > sometimes I worry that SCE and the helper functions that do stuff in it (e.g. logNormCounts) have been atomized into a million tiny packages

Tim Triche (12:40:13): > on the other hand if we want an unrefactored Big Ball of Mud, Seurat is:arrow_right:thataways

Tim Triche (12:41:52): > I’ll punt on this until I hear someone make a case for anything fancier than a reduced.dim.matrix with attr(., "fit") set to the returned uwot model

2023-05-19

Hervé Pagès (01:47:13) (in thread): > Puzzling! The re-factoring Pete was mentioning could have been involved but it’s in BioC 3.18 only, so I’m not sure I have a good explanation for the error you are getting. Can you try to re-install S4Arrays with BiocManager::install("S4Arrays", force=TRUE) and see if that helps?

Federico Marini (04:46:43) (in thread): > will do!

Federico Marini (04:48:32) (in thread): > Ok, tried but the error persists:disappointed:

Federico Marini (04:49:08) (in thread): > can we set the backend to go serial in mapply? If that avoids the biocparallel error, it could be a good enough patch on it

Martin Morgan (07:45:42) (in thread): > BiocParallel::register(SerialParam()) maybe; you could try something like trace(bplapply, tracer = quote(print(sys.calls()))) to figure out where bplapply is being called. If this is somehow the source of the problem, then I assume that you have two libraries on your system with incompatible versions of packages installed in each, and it would be good to understand why BiocParallel is losing track of the ‘main’ configuration

Federico Marini (14:36:51) (in thread): > good point

Federico Marini (14:36:57) (in thread): > thanks, I will try!

Federico Marini (14:41:54) (in thread): > Tracing on, and I get > > [[23]] > rsvd.default(x, k = k, nu = nu, nv = nv, ...) > > [[24]] > A %*% O > > [[25]] > A %*% O > > [[26]] > .super_BLOCK_mult(x, y, MULT = `%*%`) > > [[27]] > S4Arrays:::bplapply2(seq_len(length(x_grid)), FUN = .left_mult, > x = x, y = y, grid = x_grid, MULT = MULT, BPPARAM = BPPARAM) > > [[28]] > BiocParallel::bplapply(X, FUN, ..., BPPARAM = BPPARAM) > > [[29]] > .doTrace(print(sys.calls()), "on entry") > > [[30]] > eval.parent(exprObj) > > [[31]] > eval(expr, p) > > [[32]] > eval(expr, p) > > Error: BiocParallel errors > 1 remote errors, element index: 1 > 0 unevaluated and other errors > first remote error: > Error in .read_block_OLD(x, viewport, as.sparse = as.sparse): could not find function ".read_block_OLD" >

Federico Marini (14:42:10) (in thread): > this is happening even after editing the call of mapply to

Federico Marini (14:42:19) (in thread): > > sce_processed_all <- mapply(FUN=runPCA, x=sce_processed_all, subset_row=all.hvgs, > MoreArgs=list(ncomponents=25, BSPARAM=RandomParam(), > BPPARAM=BiocParallel::SerialParam()), > SIMPLIFY=FALSE) >

Federico Marini (14:42:42) (in thread): > in case the parallel execution would have been handled by runPCA, as the trace seemed to point to

Federico Marini (14:43:35) (in thread): > All in all, I can say, the usual joys of post-update-of-R

Martin Morgan (16:07:16) (in thread): > I think the usual rules of debugging apply – make it simpler, e.g., by > > runPCA(sce_processed_all[[1]], ...) > > and simpler still by peeling back the layers of the call stack. I ended up somewhere like > > > showMethods(runPCA) > Function: runPCA (package BiocSingular) > x="ANY" > x="SingleCellExperiment" > > so I guess you're expecting to get at the second, and I see > > > selectMethod(runPCA, "SingleCellExperiment") > Method Definition: > > function (x, ...) > { > .local <- function (x, ..., altexp = NULL, name = "PCA") > { > if (!is.null(altexp)) { > y <- altExp(x, altexp) > } > else { > y <- x > } > reducedDim(x, name) <- calculatePCA(y, ...) > x > } > .local(x, ...) > } > <bytecode: 0x1411eee80> > <environment: namespace:scater> > > I'm guessing you end up at calculatePCA()? So something like > > calculatePCA(sce_processed_all[[1]], ...) > > and so on until you have a simplified example

2023-05-22

Hervé Pagès (22:35:34) (in thread): > @Federico Marini I'm not 100% sure but I have a feeling that the .read_block_OLD error could be specific to the DelayedArray 0.26.2 binaries for Windows and Mac. See https://support.bioconductor.org/p/9152409/#9152412. If you have Xcode you can try to reinstall from source with BiocManager::install("DelayedArray", type="source", force=TRUE). Otherwise the DelayedArray 0.26.3 binary for Intel Mac should become available tomorrow. I apologize for the inconvenience.

2023-05-23

Hervé Pagès (11:55:10) (in thread): > Looks like it did the trick for this user:https://github.com/Bioconductor/HDF5Array/issues/55#issuecomment-1558704896 - Attachment: Comment on #55 Error with read_block() when using writeHDF5Array() > Installing from source solved the problem, thanks!

2023-05-25

Jacob Krol (17:13:27): > @Jacob Krol has joined the channel

2023-06-07

Alyssa Obermayer (18:29:16): > @Alyssa Obermayer has joined the channel

Nitesh Turaga (19:16:24): > When comparing two objects in R, I'm getting identical(x, y) as FALSE, but all.equal(x, y) as TRUE. > > When I eyeball the two objects they look the same to me. It's a complex object. > > How do I find out what the difference is between the two objects? I had high hopes for all.equal. Anything else I can use to just give me the difference between the two?

Axel Klenk (19:33:02): > Without having actually used them myself, there are the packages diffobj and waldo for this purpose.

Spencer Nystrom (19:47:10) (in thread): > You could use waldo::compare() on a complex object to get a nice diff.

Kasper D. Hansen (20:51:41): > What do you want to do here? For example, all.equal() has a precision argument (tolerance), so for this function - which I believe is useful for anything floating point-wise - equality is up to a user-defined tolerance. That is not the same as identical.

Kasper D. Hansen (20:52:17): > If your two objects - say numeric matrices - “fail” theall.equal()test, it prints out some details on what is different, but only if it fails

Kasper D. Hansen (20:53:19): > ?identical tells us > > The function 'all.equal' is also sometimes used to test equality > > this way, but was intended for something different: it allows for > > small differences in numeric results.

Kasper D. Hansen (20:53:55): > … ok, forget my comments, I didn't read your question well enough

Dirk Eddelbuettel (21:09:18) (in thread): > Quick demo below. We use diffobj::diffObj (which predates waldo by some time) to show students differences between their answer and the reference solution. The usual R print works there. Here I mock up an example where the difference does not jump out as it is below the print precision.

Dirk Eddelbuettel (21:09:29) (in thread): - File (R): Untitled

Dirk Eddelbuettel (21:10:34) (in thread): > It does however inform you that the difference is in column a. So a quick follow-up can help:

Dirk Eddelbuettel (21:10:42) (in thread): - File (R): Untitled

2023-06-08

Hervé Pagès (00:21:38) (in thread): > I'll add testthat::expect_identical() to the list of suggestions. I find its description of the differences between the 2 objects quite user-friendly.

Hervé Pagès (00:27:20) (in thread): > (could be that it uses waldo behind the scenes, so it might actually not be different from what you get with waldo::compare())

Nitesh Turaga (03:41:46) (in thread): > Thanks all. I’ll try waldo

Alan O'C (06:36:01) (in thread): > Also worth checking tolerance (and scale if things get really desperate) in all.equal
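The distinctions discussed in this thread can be seen on a tiny base-R example; the waldo/diffobj calls are left commented since they are third-party packages:

```r
# identical() vs all.equal(): a difference below default print precision
x <- c(a = 1, b = 2)
y <- c(a = 1, b = 2 + 1e-12)

identical(x, y)                  # FALSE: the objects differ bitwise
isTRUE(all.equal(x, y))          # TRUE: within the default numeric tolerance
all.equal(x, y, tolerance = 0)   # a character description of the difference

# waldo::compare(x, y, tolerance = 0)  # per-element diff, if waldo is installed
# diffobj::diffObj(x, y)               # side-by-side diff, if diffobj is installed
```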

2023-06-09

Tomi Suomi (10:36:50): > @Tomi Suomi has joined the channel

2023-06-19

Lluís Revilla (09:41:46): > I see error messages "404 Not Found" at https://code.bioconductor.org/browse/ and https://code.bioconductor.org/search/. Is there (planned) maintenance?

Mike Smith (09:58:48) (in thread): > Nope, this isn't planned. Thanks for the heads up, I've passed the status on to the IT team here at EMBL.

Lluís Revilla (11:49:14) (in thread): > Thank you for this service!

Vince Carey (13:14:59) (in thread): > Some of us have seen this as an intermittent error that is overcome with a second attempt in short order.

2023-06-20

Mike Smith (03:49:28) (in thread): > In this instance it was an issue with the storage back end, which needed our IT services to resolve, rather than something transient. > > If you do run into intermittent errors, please let me know and I’ll see if I can spot anything in the logs. It’s not an issue I’ve encountered myself.

Lluís Revilla (03:51:08) (in thread): > Thanks for solving it so quickly. I haven't found intermittent errors, and I did check again after 5 minutes before asking here.

2023-06-28

Andrew Ghazi (10:56:12): > @Andrew Ghazi has joined the channel

2023-07-04

Alexander Bender (09:36:11): > @Alexander Bender has left the channel

2023-07-13

Joseph (19:27:55): > Hi everyone, > I just had to push an update to the release branch for my package and was wondering how long it usually takes to see the updated version on the Bioconductor webpage for my package? Sorry, I am new to Bioconductor package development and git as well so I just want to make sure everything is working!

Lori Shepherd (19:34:38): > Once you push, it has to build and check on the build system before it is reflected on the landing page, and the builds happen once a day: https://bioconductor.org/checkResults/3.17/bioc-LATEST/. The top of the build page will have the date, and it shows the version built

Kasper D. Hansen (21:44:21): > Remember you need to update the R package version to trigger a build.

2023-07-18

Hervé Pagès (17:20:41): > Minor correction: software packages are built every day, no matter what, but, after you push a change, the modified package won't propagate to the public repos if its version is not higher than that of the already published package.

2023-07-20

Almog Angel (08:23:08): > @Almog Angel has joined the channel

Almog Angel (08:32:55): > Hello everyone! I am working on a package that uses GeneSetCollection to store ~60K gene signatures. As a result, the size of the object is around 600MB. The problem is that when I load this GeneSetCollection from an RDS file, R becomes very slow. More specifically, the console becomes unresponsive and takes a long time to execute any commands. Could anyone shed some light on why this might be happening?

Lluís Revilla (08:55:18) (in thread): > GeneSetCollections store each gene set's information independently of the other gene sets, so loading it or translating to a different id might be slow: you might have genes from different annotations and multiple sources, and the annotation libraries must be loaded. Probably this pushes the limits of the RAM you are using. > There was a "project" 4 years ago to provide new classes to store gene sets more efficiently (there was a Birds of a Feather at Bioc2019). Currently there are two packages that might work better for such large numbers: BiocSet (in Bioconductor) and BaseSet (I'm the developer of the latter, which is on CRAN, and I'm preparing a new release soon). I don't know what you want to do with your big GSC, but you can work with it in BaseSet with this transformation: TS <- BaseSet::tidySet(geneIds(my_big_gsc)). Hopefully you won't have problems loading and using TidySets.

Almog Angel (09:00:43) (in thread): > Thanks Lluis. Actually, I am not trying to manipulate or use my GSC object at all. I am simply reading this object into R, and then everything I run in my R console takes long to execute.

Almog Angel (09:01:41) (in thread): > It’s also not a RAM problem, there is enough free memory

Lluís Revilla (09:07:34) (in thread): > How did you generate it? Could you share length(unique(lapply(gsc, geneIdType))) and length(gsc)? I'm not sure about the internals, but maybe there are too many connections to the annotation's internal database, which is causing problems.

Lluís Revilla (09:08:35) (in thread): > But my question about what you want to do is because if you don’t need the GeneSetCollection class maybe you can simplify the object, clean the gene set collection and work with them that way.

Almog Angel (09:11:05) (in thread): > To be honest, I prefer using a simple list of genes. The only reason I use GSC is because it is recommended by Bioconductor. Moreover, a simple list does the same job, faster and with half the size of the object.

Almog Angel (09:13:13) (in thread): > > > length(unique(lapply(signatures_collection, geneIdType))) > [1] 1 > > length(signatures_collection) > [1] 65208 >

Lluís Revilla (09:19:11) (in thread): > Several packages use them because they come with methods that make some operations easier. If you don't need it, don't use it:smile:. > If you were developing a package that would be a different thing. Good luck!

Almog Angel (09:21:16) (in thread): > :slightly_smiling_face:Thanks

Kasper D. Hansen (09:44:10) (in thread): > You're sure it's not RAM? If the 600MB is the size of the RDS you may need substantially more RAM to load it.

Kasper D. Hansen (09:44:48) (in thread): > Are you saying that R is more unresponsive than right after you have created the object, prior to serializing it? That would be weird to me

Almog Angel (09:52:40) (in thread): > It's not RAM; I am not using R locally and there is enough free RAM on my server. That's indeed very weird. This object causes R to be unresponsive and laggy, and once I remove it (using the broom icon in RStudio) R is back to normal.

Martin Morgan (10:00:58) (in thread): > does the same thing happen if you use R from the console, even in RStudio? i.e., is this an RStudio-specific issue (maybe trying to display the object in the ‘Environment’ pane?)

Almog Angel (10:33:29) (in thread): > @Martin Morgan, interesting… You are right. This problem only occurs in RStudio

Martin Morgan (11:27:42) (in thread): > I think a weird short-term work-around might be to start variable names with a ., e.g., .signatures_collection. They then won't be listed in the 'Environment' pane. > > Maybe there is an RStudio expert here who can provide a better solution, or point to an existing or new RStudio defect where this can be reported?
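Martin's dot-prefix workaround relies on ls() (and hence RStudio's Environment pane) skipping names that start with a dot by default; a minimal sketch:

```r
# Names beginning with "." are hidden from ls() unless explicitly requested
.sig <- list(big = 1:10)  # hidden from the Environment pane
sig  <- list(big = 1:10)  # visible as usual

ls()                  # lists "sig" but not ".sig"
ls(all.names = TRUE)  # includes ".sig" as well
```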

Kasper D. Hansen (12:39:43) (in thread): > yeah RStudio does some overrides of R stuff

Kasper D. Hansen (12:40:24) (in thread): > But if it is the Enivornment pane (which I think sounds super plausible) I think it should have happened also before you serialized the object

2023-07-25

Nitesh Turaga (17:11:11): > The BioCcontainers repo doesn't show up as available for the RELEASE_3_17 branch. What am I doing wrong? > > ➜ Documents docker run -it bioconductor/bioconductor_docker:RELEASE_3_17 bash > root@30573ea624d5:/# cd > root@30573ea624d5:~# R > > R version 4.3.0 (2023-04-21) -- "Already Tomorrow" > Copyright (C) 2023 The R Foundation for Statistical Computing > Platform: aarch64-unknown-linux-gnu (64-bit) > > R is free software and comes with ABSOLUTELY NO WARRANTY. > You are welcome to redistribute it under certain conditions. > Type 'license()' or 'licence()' for distribution details. > > Natural language support but running in an English locale > > R is a collaborative project with many contributors. > Type 'contributors()' for more information and > 'citation()' on how to cite R or R packages in publications. > > Type 'demo()' for some demos, 'help()' for on-line help, or > 'help.start()' for an HTML browser interface to help. > Type 'q()' to quit R. > > > BiocManager::repositories() > 'getOption("repos")' replaces Bioconductor standard repositories, see > 'help("repositories", package = "BiocManager")' for details. > Replacement repositories: > CRAN:[https://packagemanager.posit.co/cran/latest](https://packagemanager.posit.co/cran/latest)BioCsoft > "[https://bioconductor.org/packages/3.17/bioc](https://bioconductor.org/packages/3.17/bioc)" > BioCann > "[https://bioconductor.org/packages/3.17/data/annotation](https://bioconductor.org/packages/3.17/data/annotation)" > BioCexp > "[https://bioconductor.org/packages/3.17/data/experiment](https://bioconductor.org/packages/3.17/data/experiment)" > BioCworkflows > "[https://bioconductor.org/packages/3.17/workflows](https://bioconductor.org/packages/3.17/workflows)" > BioCbooks > "[https://bioconductor.org/packages/3.17/books](https://bioconductor.org/packages/3.17/books)" > CRAN > "[https://packagemanager.posit.co/cran/latest](https://packagemanager.posit.co/cran/latest)" >

Nitesh Turaga (17:11:43): > It works fine for RELEASE_3_16

2023-07-26

Charlotte Soneson (00:12:09) (in thread): > Could it be related tohttps://github.com/rstudio/rstudio/issues/12908? Which version of RStudio are you using? We had issues with RStudio becoming very slow when working with large lists, and updating to version 2023.06 solved it for us. - Attachment: #12908 Changes to environment very slow when large sf object is loaded. > System details > > > RStudio Edition : Desktop > RStudio Version : 2023.03.0+386 > OS Version : Windows 10 21H2 > R Version : 4.2.2 > > > Steps to reproduce the problem > > > sessionInfo() > options(timeout = 60000) > u <- "[https://prod-is-usgs-sb-prod-publish.s3.amazonaws.com/641f0b82d34e807d39b8a1c1/sample_very_large_sf.rds](https://prod-is-usgs-sb-prod-publish.s3.amazonaws.com/641f0b82d34e807d39b8a1c1/sample_very_large_sf.rds)" > f <- tempfile(fileext = ".rds") > download.file(u, f, mode = "wb") > > test <- readRDS(f) > > x <- 1 > > # observe very slow response when removing x > rm(x) > > # also notice that operations that mutate the environment are SLOW > test2 <- test[1:100, ] > rm(test2) > > # operations that mutate existing data are not slow? > test$test <- sample(1:10, nrow(test), replace = TRUE) > > > > > sessionInfo() > > R version 4.2.2 (2022-10-31 ucrt) > Platform: x86_64-w64-mingw32/x64 (64-bit) > Running under: Windows 10 x64 (build 19044) > > Matrix products: default > > locale: > [1] LC_COLLATE=English_United States.utf8 LC_CTYPE=English_United States.utf8 > [3] LC_MONETARY=English_United States.utf8 LC_NUMERIC=C > [5] LC_TIME=English_United States.utf8 > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > loaded via a namespace (and not attached): > [1] compiler_4.2.2 cli_3.6.0 tools_4.2.2 withr_2.5.0 fs_1.6.0 > [6] glue_1.6.2 rstudioapi_0.14 reprex_2.0.2 lifecycle_1.0.3 rlang_1.0.6.9000 > > > Also note an error emitted while typing out the [1:100, ] part of the below. 
> > > test2 [#12895](- test[1:100, ] > > Error in order(typeScores, scores) : argument lengths differ > In addition: Warning message: > In completions$type == .rs.acCompletionTypes$DATAFRAME & !completions$context %in% : > longer object length is not a multiple of shorter object length > > > I also notice that after running the reprex, the RStudio download monitor process name remains in Task Manager and is the process that is thrashing when the environment changes… This is likely unrelated, but was an odd observation worth noting. > > Describe the problem in detail > > I updated to the new rstudio and have found that, when I have large sf data.frames loaded - the one in the sample triggers the issue - rstudio thrashes the RStudio R Session process every time the environment is modified. Watching Task Manager, I see that process hammering a CPU for a couple seconds every time I execute code that touches the R environment. I’ve verified that the same does not occur in a R session run outside of RStudio. My suspicion is that this is related to <https://github.com/rstudio/rstudio/issues/12895) but I’ve not attempted to use the viewer on these data.frames. > > Describe the behavior you expected > > In prior versions of sf, performance has been satisfactory with data this large in the environment. I can confirm that downgrading to RStudio 2022.12.0 fixes the slowness. > > ☑︎ I have read the guide for submitting good bug reports. > ☑︎ I have installed the latest version of RStudio, and confirmed that the issue still persists. > ☑︎ If I am reporting an RStudio crash, I have included a diagnostics report. > ☑︎ I have done my best to include a minimal, self-contained set of instructions for consistently reproducing the issue.

Guillaume Devailly (08:16:45): > @Guillaume Devailly has joined the channel

Frederick Tan (09:15:12): > @Frederick Tan has joined the channel

Louis Le Nézet (10:32:37): > @Louis Le Nézet has joined the channel

Louis Le Nézet (10:39:29): > Hi! > > I'm working on a package where I have a new object class. This class is needed for some of the functions I'm working on and not necessary for others (a simple vector, list or df could do). > What would be the best way to proceed for all the functions? > Here is what I've done until now: > > dosomething <- function(x, ...){ > UseMethod("dosomething") > } > > do_something <- function(v1, v2, v3){...} # return a vector > > dosomething.default <- function(v1, v2, v3){ > do_something(v1, v2, v3)} > > dosomething.myclass <- function(obj){ > obj$v4 <- dosomething(obj$v1, obj$v2, obj$v3) > obj > } > > By doing so the user can specify either the needed vectors directly, or the object containing those vectors in slots. > The returned value would also differ: either a single vector or the updated object.

Kevin Rue-Albrecht (11:07:57) (in thread): > First reaction: that looks like S3 classes, why not S4 (which is more common in Bioconductor)? > (I’m a bit rusty about the S3 stuff tbh)

Spencer Nystrom (12:46:41) (in thread): > This looks like reasonable S3 to me, though the extrado_something()method doesn’t seem necessary. I would just code it directly indo_something.default()

Kevin Rue-Albrecht (12:47:42) (in thread): > Oh Ididn’tmean unreasonable, I was just curious whether there was any particular reason to prefer S3 over S4 in this instance :)

Spencer Nystrom (12:48:14) (in thread): > Oh I wasn’t disagreeing with you Kevin! I just meant it looked ‘correct’ to me.

Spencer Nystrom (12:48:50) (in thread): > As in, how I would expect an S3 implementation to look.

Kevin Rue-Albrecht (12:50:41) (in thread): > Yup. It seems sensible to me too, in fact. “Same in same out” is how many generics tend to operate, in my experience

Alex Mahmoud (14:40:38) (in thread): > Reposting answer here from #containers: It works as expected on my end. > > docker run --rm -it --platform linux/amd64 bioconductor/bioconductor_docker:RELEASE_3_17 Rscript -e "BiocManager::repositories()" > 'getOption("repos")' replaces Bioconductor standard repositories, see > 'help("repositories", package = "BiocManager")' for details. > Replacement repositories: > CRAN:[https://packagemanager.posit.co/cran/__linux__/jammy/latest](https://packagemanager.posit.co/cran/__linux__/jammy/latest)BioCcontainers > "[https://bioconductor.org/packages/3.17/container-binaries/bioconductor_docker](https://bioconductor.org/packages/3.17/container-binaries/bioconductor_docker)" > BioCsoft > "[https://bioconductor.org/packages/3.17/bioc](https://bioconductor.org/packages/3.17/bioc)" > BioCann > "[https://bioconductor.org/packages/3.17/data/annotation](https://bioconductor.org/packages/3.17/data/annotation)" > BioCexp > "[https://bioconductor.org/packages/3.17/data/experiment](https://bioconductor.org/packages/3.17/data/experiment)" > BioCworkflows > "[https://bioconductor.org/packages/3.17/workflows](https://bioconductor.org/packages/3.17/workflows)" > BioCbooks > "[https://bioconductor.org/packages/3.17/books](https://bioconductor.org/packages/3.17/books)" > CRAN > "[https://packagemanager.posit.co/cran/__linux__/jammy/latest](https://packagemanager.posit.co/cran/__linux__/jammy/latest)" > > One thing to note though is that we now have arm64 containers as well as the default amd64, and the arm64 containers have binaries disabled (because they are incompatible). We do have arm64 binaries built and will hopefully get them into production for 3.18. In the meantime, you might have to add --platform linux/amd64 to your docker run command to make sure you are using the right container if you are on an arm64 machine (e.g. M-chip Macs).
Note however that it will use an emulator if you are on arm64 and explicitly request the linux/amd64 container, and that can slow down your computation

2023-07-27

Louis Le Nézet (06:07:40) (in thread): > Sorry, I'm not quite familiar with OOP in R. > The thing is that I have an S4 object "Pedigree" for which I'm developing multiple dedicated functions that could be used outside of the Pedigree objects. > That's why I was going for developing generic functions and dedicated methods for the Pedigree class. > So maybe the best would be to do something like this: > > dosomething <- function(x, ...){ > UseMethod("dosomething") > } > > dosomething.default <- function(v1, v2, v3){ > # return a vector > } > > setMethod("dosomething", "Pedigree", function(obj){ > obj$v4 <- dosomething(obj$v1, obj$v2, obj$v3) > obj > }) >

Spencer Nystrom (07:09:04) (in thread): > So you have an S4 class that will have methods defined for it, but you want those methods to also be available to other non-S4 objects? In that case you should probably use S4. > > See the "generic functions and dispatch" section here: http://adv-r.had.co.nz/S4.html

Louis Le Nézet (09:13:40) (in thread): > That’s effectively, what I wanted to do. > Is something like this better ? > > setClass( > "Pedigree", > slots = c(v1 = "character", v2 = "character", v3 = "character", v4 = "character"), > prototype = prototype(v1 = "A", v2 = "B", v3 = "C", v4 = character()) > ) > > setGeneric("dosomething", function(obj, ...) { > standardGeneric("dosomething") > }) > > setMethod("dosomething", "Pedigree", function(obj) { > obj@v4 <- dosomething(obj@v1, obj@v2, obj@v3) > obj > }) > > setMethod("dosomething", "character", function(obj, v2 = "B", v3="C") { > paste(obj, v2, v3) > }) > > dosomething(new("Pedigree")) > dosomething(c("A")) > dosomething(c("A"), c("D")) >

Spencer Nystrom (09:19:11) (in thread): > Yeah that looks right. What’s nice about S4 is you can have dispatch against multiple argument types, so you could even make different methods based on the type of v1 or v2 if you needed to at some point.
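Spencer's point about dispatch on multiple argument types can be illustrated with a toy S4 generic; the generic and method names below are made up, not part of any Pedigree package:

```r
library(methods)

# A generic that dispatches on the classes of BOTH arguments
setGeneric("describe", function(x, y) standardGeneric("describe"))

setMethod("describe", signature("character", "numeric"),
          function(x, y) paste(x, "and a number"))

setMethod("describe", signature("numeric", "numeric"),
          function(x, y) x + y)

describe("a", 1)  # picks the (character, numeric) method
describe(1, 2)    # picks the (numeric, numeric) method
```

This is the main practical advantage of S4 over S3 here: S3's UseMethod() only dispatches on the first argument.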

Louis Le Nézet (09:31:26) (in thread): > Okay ! > Thanks a lot for the help !

2023-07-28

Konstantinos Daniilidis (13:47:09): > @Konstantinos Daniilidis has joined the channel

2023-07-29

Aia Oz (07:50:34): > @Aia Oz has joined the channel

2023-08-01

Simon Pearce (05:10:48): > I'm using BiocParallel inside my package to speed up several different functions. > I'd like to set my package to automatically parallelise everything, without having to specify it in every function. > I thought that the default for BiocParallel::registered() would be to use a SerialParam unless it had been explicitly set, but that isn't true (a MulticoreParam is typically the default). > Is there a way to check whether BiocParallel::register() has been explicitly called? Or do I need to set a global variable/option or something?

Martin Morgan (12:05:15) (in thread): > Actually, the default is MulticoreParam() on Linux / macOS, SnowParam() on Windows, with the rationale that if one didn't want to use parallel evaluation (SerialParam()) one wouldn't use BiocParallel. > > There is no way to check that BiocParallel::register() has been called; other packages sometimes have an option <pkgname>.BiocParallel or similar. But I'm not really sure that I follow how this would parallelize everything automatically…? Also it's worth carefully evaluating (a) whether the code can be re-written so that it is 'vectorized' (operates on vectors) rather than iterative (implied by lapply() etc.) and (b) even if iteration is the best option, whether parallel evaluation is effective. For the latter, a concern is that each iteration doesn't do much work, and the cost of starting / stopping (sending data to / retrieving results from workers) outweighs the parallel evaluation. Kind of conversely, managing memory can be very tricky in a parallel context – each worker can consume at most a 1/n share of the total memory for n workers (usually much less), which is something you as a developer don't have insight into on your user's computer.
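One possible shape for the "<pkgname>.BiocParallel option" pattern Martin mentions; the option name mypkg.BPPARAM and the helper below are hypothetical, not an established BiocParallel API, and the sketch assumes BiocParallel is installed:

```r
library(BiocParallel)

# Hypothetical package-internal helper: serial by default, parallel only if
# the user has opted in by setting an option at the top of their script.
.bpparam <- function() {
  getOption("mypkg.BPPARAM", default = SerialParam())
}

# In the user's script, once:
options(mypkg.BPPARAM = SerialParam())  # or MulticoreParam(workers = 4)

# Inside package functions, every BiocParallel call consults the helper:
res <- bplapply(1:4, sqrt, BPPARAM = .bpparam())
```

This keeps the user-facing knob in one place while every internal function stays explicit about its back end.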

2023-08-02

Simon Pearce (04:34:10) (in thread): > Thank you for the response, Martin. > OK, I've only checked on Linux/macOS, not used Windows. > My thought was to have the option be settable at the top of any script, and then all subsequent functions can check whether it has been set and use parallelisation if so (with an appropriate number of cores as set by the user). > I'm building off another package that uses BiocParallel already, and mostly want to wrap those functions. But you are right that I should probably verify that it is actually necessary to parallelise, and that it really does speed up the calculations. I suspect so, at least on the HPC that I'm working on, which has lots of RAM available, but I should definitely verify that.

2023-08-03

Jiefei Wang (13:40:10): > @Jiefei Wang has joined the channel

Ritika Giri (15:57:33): > @Ritika Giri has joined the channel

2023-08-07

Jiaji George Chen (11:21:05): > @Jiaji George Chen has joined the channel

2023-08-08

Peter Hickey (02:24:18): > Could I please get a couple of people to post their output from running the following code with an up-to-date installation of BioC 3.17 (i.e. the current release) > > suppressPackageStartupMessages(library(MatrixGenerics)) > showMethods("colSdDiffs") > selectMethod("colSdDiffs", "matrix") > showMethods("rowSdDiffs") > selectMethod("rowSdDiffs", "matrix") > > suppressPackageStartupMessages(library(DelayedMatrixStats)) > showMethods("colSdDiffs") > selectMethod("colSdDiffs", "matrix") > showMethods("rowSdDiffs") > selectMethod("rowSdDiffs", "matrix") > > BiocManager::version() > BiocManager::valid() > sessionInfo() >

Peter Hickey (02:25:32) (in thread): > Here’s what I’m getting > > > showMethods("colSdDiffs") > Function: colSdDiffs (package MatrixGenerics) > x="ANY" > x="matrix_OR_array_OR_table_OR_numeric" > > > selectMethod("colSdDiffs", "matrix") > Method Definition: > > function (x, rows = NULL, cols = NULL, na.rm = FALSE, diff = 1L, > trim = 0, ..., useNames = NA) > { > matrixStats::colSdDiffs(x, rows = rows, cols = cols, na.rm = na.rm, > diff = diff, trim = trim, ..., useNames = !isFALSE(useNames)) > } > <bytecode: 0x137967e08> > <environment: namespace:MatrixGenerics> > > Signatures: > x > target "matrix" > defined "matrix_OR_array_OR_table_OR_numeric" > > showMethods("rowSdDiffs") > Function: rowSdDiffs (package MatrixGenerics) > x="ANY" > x="matrix_OR_array_OR_table_OR_numeric" > > > selectMethod("rowSdDiffs", "matrix") > Method Definition: > > function (x, rows = NULL, cols = NULL, na.rm = FALSE, diff = 1L, > trim = 0, ..., useNames = NA) > { > matrixStats::rowSdDiffs(x, rows = rows, cols = cols, na.rm = na.rm, > diff = diff, trim = trim, ..., useNames = !isFALSE(useNames)) > } > <bytecode: 0x1402da0a8> > <environment: namespace:MatrixGenerics> > > Signatures: > x > target "matrix" > defined "matrix_OR_array_OR_table_OR_numeric" > > > > suppressPackageStartupMessages(library(DelayedMatrixStats)) > > showMethods("colSdDiffs") > Function: colSdDiffs (package MatrixGenerics) > x="ANY" > x="DelayedMatrix" > x="dgCMatrix" > x="matrix" > x="matrix_OR_array_OR_table_OR_numeric" > > > selectMethod("colSdDiffs", "matrix") > Method Definition: > > function (x, rows = NULL, cols = NULL, na.rm = FALSE, diff = 1L, > trim = 0, ..., useNames = TRUE) > { > if (!is.null(rows) && !is.null(cols)) > x <- x[rows, cols, drop = FALSE] > else if (!is.null(rows)) > x <- x[rows, , drop = FALSE] > else if (!is.null(cols)) > x <- x[, cols, drop = FALSE] > if (is.na(useNames)) { > deprecatedUseNamesNA() > } > else if (!useNames) { > colnames(x) <- NULL > } > apply(x, MARGIN = 2L, FUN = sdDiff, 
na.rm = na.rm, diff = diff, > trim = trim, ...) > } > <bytecode: 0x154551b90> > <environment: namespace:matrixStats> > > Signatures: > x > target "matrix" > defined "matrix" > > showMethods("rowSdDiffs") > Function: rowSdDiffs (package MatrixGenerics) > x="ANY" > x="DelayedMatrix" > x="dgCMatrix" > x="matrix_OR_array_OR_table_OR_numeric" > > > selectMethod("rowSdDiffs", "matrix") > Method Definition: > > function (x, rows = NULL, cols = NULL, na.rm = FALSE, diff = 1L, > trim = 0, ..., useNames = NA) > { > matrixStats::rowSdDiffs(x, rows = rows, cols = cols, na.rm = na.rm, > diff = diff, trim = trim, ..., useNames = !isFALSE(useNames)) > } > <bytecode: 0x1402da0a8> > <environment: namespace:MatrixGenerics> > > Signatures: > x > target "matrix" > defined "matrix_OR_array_OR_table_OR_numeric" > > > > BiocManager::version() > [1] ‘3.17’ > > BiocManager::valid() > 'getOption("repos")' replaces Bioconductor standard repositories, see 'help("repositories", > package = "BiocManager")' for details. 
> Replacement repositories: > BioCsoft:[https://bioconductor.org/packages/3.17/bioc](https://bioconductor.org/packages/3.17/bioc)BioCann:[https://bioconductor.org/packages/3.17/data/annotation](https://bioconductor.org/packages/3.17/data/annotation)BioCexp:[https://bioconductor.org/packages/3.17/data/experiment](https://bioconductor.org/packages/3.17/data/experiment)BioCworkflows:[https://bioconductor.org/packages/3.17/workflows](https://bioconductor.org/packages/3.17/workflows)BioCbooks:[https://bioconductor.org/packages/3.17/books](https://bioconductor.org/packages/3.17/books)CRAN:[https://packagemanager.posit.co/cran/latest](https://packagemanager.posit.co/cran/latest)[1] TRUE > > sessionInfo() > R version 4.3.1 (2023-06-16) > Platform: aarch64-apple-darwin20 (64-bit) > Running under: macOS Ventura 13.4.1 > > Matrix products: default > BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib > LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0 > > locale: > [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 > > time zone: Australia/Melbourne > tzcode source: internal > > attached base packages: > [1] stats4 stats graphics grDevices datasets utils methods base > > other attached packages: > [1] DelayedMatrixStats_1.22.1 DelayedArray_0.26.7 S4Arrays_1.0.5 > [4] abind_1.4-5 IRanges_2.34.1 S4Vectors_0.38.1 > [7] BiocGenerics_0.46.0 Matrix_1.6-0 MatrixGenerics_1.12.3 > [10] matrixStats_1.0.0 > > loaded via a namespace (and not attached): > [1] sparseMatrixStats_1.12.2 lattice_0.21-8 grid_4.3.1 > [4] renv_1.0.0 compiler_4.3.1 tools_4.3.1 > [7] Rcpp_1.0.11 BiocManager_1.30.21.1 crayon_1.5.2 >

Peter Hickey (02:26:23) (in thread): > In particular, the colSdDiffs,matrix method changes (to the wrong thing) upon loading DelayedMatrixStats

Peter Hickey (02:27:22) (in thread): > I’m seeing this on both Ubuntu (installing packages from source) and macOS arm64 (installing package binaries)

Lluís Revilla (07:37:24) (in thread): > From a fresh BiocManager installation > > Function: colSdDiffs (package MatrixGenerics) > x="ANY" > x="matrix_OR_array_OR_table_OR_numeric" > > Method Definition: > > function (x, rows = NULL, cols = NULL, na.rm = FALSE, diff = 1L, > trim = 0, ..., useNames = NA) > { > matrixStats::colSdDiffs(x, rows = rows, cols = cols, na.rm = na.rm, > diff = diff, trim = trim, ..., useNames = !isFALSE(useNames)) > } > <bytecode: 0x55efef637890> > <environment: namespace:MatrixGenerics> > > Signatures: > x > target "matrix" > defined "matrix_OR_array_OR_table_OR_numeric" > Function: rowSdDiffs (package MatrixGenerics) > x="ANY" > x="matrix_OR_array_OR_table_OR_numeric" > > Method Definition: > > function (x, rows = NULL, cols = NULL, na.rm = FALSE, diff = 1L, > trim = 0, ..., useNames = NA) > { > matrixStats::rowSdDiffs(x, rows = rows, cols = cols, na.rm = na.rm, > diff = diff, trim = trim, ..., useNames = !isFALSE(useNames)) > } > <bytecode: 0x55eff53f7f80> > <environment: namespace:MatrixGenerics> > > Signatures: > x > target "matrix" > defined "matrix_OR_array_OR_table_OR_numeric" > Function: colSdDiffs (package MatrixGenerics) > x="ANY" > x="DelayedMatrix" > x="dgCMatrix" > x="matrix" > x="matrix_OR_array_OR_table_OR_numeric" > > Method Definition: > > function (x, rows = NULL, cols = NULL, na.rm = FALSE, diff = 1L, > trim = 0, ..., useNames = TRUE) > { > if (!is.null(rows) && !is.null(cols)) > x <- x[rows, cols, drop = FALSE] > else if (!is.null(rows)) > x <- x[rows, , drop = FALSE] > else if (!is.null(cols)) > x <- x[, cols, drop = FALSE] > if (is.na(useNames)) { > deprecatedUseNamesNA() > } > else if (!useNames) { > colnames(x) <- NULL > } > apply(x, MARGIN = 2L, FUN = sdDiff, na.rm = na.rm, diff = diff, > trim = trim, ...) 
> } > <bytecode: 0x55effd392460> > <environment: namespace:matrixStats> > > Signatures: > x > target "matrix" > defined "matrix" > Function: rowSdDiffs (package MatrixGenerics) > x="ANY" > x="DelayedMatrix" > x="dgCMatrix" > x="matrix_OR_array_OR_table_OR_numeric" > > Method Definition: > > function (x, rows = NULL, cols = NULL, na.rm = FALSE, diff = 1L, > trim = 0, ..., useNames = NA) > { > matrixStats::rowSdDiffs(x, rows = rows, cols = cols, na.rm = na.rm, > diff = diff, trim = trim, ..., useNames = !isFALSE(useNames)) > } > <bytecode: 0x55eff53f7f80> > <environment: namespace:MatrixGenerics> > > Signatures: > x > target "matrix" > defined "matrix_OR_array_OR_table_OR_numeric" > > BiocManager::version() > [1] '3.17' > 'getOption("repos")' replaces Bioconductor standard repositories, see > 'help("repositories", package = "BiocManager")' for details. > Replacement repositories: > CRAN:[https://cloud.r-project.org](https://cloud.r-project.org)> BiocManager::valid() > [1] TRUE > > sessionInfo() > R version 4.3.1 (2023-06-16) > Platform: x86_64-pc-linux-gnu (64-bit) > Running under: Ubuntu 22.04.2 LTS > > Matrix products: default > BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 > LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0 > > locale: > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C > [3] LC_TIME=es_ES.UTF-8 LC_COLLATE=en_US.UTF-8 > [5] LC_MONETARY=es_ES.UTF-8 LC_MESSAGES=en_US.UTF-8 > [7] LC_PAPER=es_ES.UTF-8 LC_NAME=C > [9] LC_ADDRESS=C LC_TELEPHONE=C > [11] LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C > > time zone: Europe/Madrid > tzcode source: system (glibc) > > attached base packages: > [1] stats4 stats graphics grDevices utils datasets methods > [8] base > > other attached packages: > [1] DelayedMatrixStats_1.22.1 DelayedArray_0.26.7 > [3] S4Arrays_1.0.5 abind_1.4-5 > [5] IRanges_2.34.1 S4Vectors_0.38.1 > [7] BiocGenerics_0.46.0 Matrix_1.5-4.1 > [9] MatrixGenerics_1.12.3 matrixStats_1.0.0 > > loaded 
via a namespace (and not attached): > [1] sparseMatrixStats_1.12.2 lattice_0.21-8 grid_4.3.1 > [4] compiler_4.3.1 tools_4.3.1 Rcpp_1.0.11 > [7] BiocManager_1.30.21.1 crayon_1.5.2 >

Lluís Revilla (07:49:16) (in thread): > And from inside Rstudio: > > > suppressPackageStartupMessages(library(MatrixGenerics)) > showMethods("colSdDiffs") > #> Function: colSdDiffs (package MatrixGenerics) > #> x="ANY" > #> x="matrix_OR_array_OR_table_OR_numeric" > selectMethod("colSdDiffs", "matrix") > #> Method Definition: > #> > #> function (x, rows = NULL, cols = NULL, na.rm = FALSE, diff = 1L, > #> trim = 0, ..., useNames = NA) > #> { > #> matrixStats::colSdDiffs(x, rows = rows, cols = cols, na.rm = na.rm, > #> diff = diff, trim = trim, ..., useNames = !isFALSE(useNames)) > #> } > #> <bytecode: 0x5556dc3f4e78> > #> <environment: namespace:MatrixGenerics> > #> > #> Signatures: > #> x > #> target "matrix" > #> defined "matrix_OR_array_OR_table_OR_numeric" > showMethods("rowSdDiffs") > #> Function: rowSdDiffs (package MatrixGenerics) > #> x="ANY" > #> x="matrix_OR_array_OR_table_OR_numeric" > selectMethod("rowSdDiffs", "matrix") > #> Method Definition: > #> > #> function (x, rows = NULL, cols = NULL, na.rm = FALSE, diff = 1L, > #> trim = 0, ..., useNames = NA) > #> { > #> matrixStats::rowSdDiffs(x, rows = rows, cols = cols, na.rm = na.rm, > #> diff = diff, trim = trim, ..., useNames = !isFALSE(useNames)) > #> } > #> <bytecode: 0x5556dd767ab8> > #> <environment: namespace:MatrixGenerics> > #> > #> Signatures: > #> x > #> target "matrix" > #> defined "matrix_OR_array_OR_table_OR_numeric" > > suppressPackageStartupMessages(library(DelayedMatrixStats)) > showMethods("colSdDiffs") > #> Function: colSdDiffs (package MatrixGenerics) > #> x="ANY" > #> x="DelayedMatrix" > #> x="dgCMatrix" > #> x="matrix" > #> x="matrix_OR_array_OR_table_OR_numeric" > selectMethod("colSdDiffs", "matrix") > #> Method Definition: > #> > #> function (x, rows = NULL, cols = NULL, na.rm = FALSE, diff = 1L, > #> trim = 0, ..., useNames = TRUE) > #> { > #> if (!is.null(rows) && !is.null(cols)) > #> x <- x[rows, cols, drop = FALSE] > #> else if (!is.null(rows)) > #> x <- x[rows, , drop = FALSE] > 
#> else if (!is.null(cols)) > #> x <- x[, cols, drop = FALSE] > #> if ([is.na](http://is.na)(useNames)) { > #> deprecatedUseNamesNA() > #> } > #> else if (!useNames) { > #> colnames(x) <- NULL > #> } > #> apply(x, MARGIN = 2L, FUN = sdDiff, na.rm = na.rm, diff = diff, > #> trim = trim, ...) > #> } > #> <bytecode: 0x5556f05a88a0> > #> <environment: namespace:matrixStats> > #> > #> Signatures: > #> x > #> target "matrix" > #> defined "matrix" > showMethods("rowSdDiffs") > #> Function: rowSdDiffs (package MatrixGenerics) > #> x="ANY" > #> x="DelayedMatrix" > #> x="dgCMatrix" > #> x="matrix_OR_array_OR_table_OR_numeric" > selectMethod("rowSdDiffs", "matrix") > #> Method Definition: > #> > #> function (x, rows = NULL, cols = NULL, na.rm = FALSE, diff = 1L, > #> trim = 0, ..., useNames = NA) > #> { > #> matrixStats::rowSdDiffs(x, rows = rows, cols = cols, na.rm = na.rm, > #> diff = diff, trim = trim, ..., useNames = !isFALSE(useNames)) > #> } > #> <bytecode: 0x5556dd767ab8> > #> <environment: namespace:MatrixGenerics> > #> > #> Signatures: > #> x > #> target "matrix" > #> defined "matrix_OR_array_OR_table_OR_numeric" > > BiocManager::version() > #> [1] '3.17' > BiocManager::valid() > #> 'getOption("repos")' replaces Bioconductor standard repositories, see > #> 'help("repositories", package = "BiocManager")' for details. 
> #> Replacement repositories: > #> CRAN:[https://ftp.cixug.es/CRAN](https://ftp.cixug.es/CRAN)#> Warning: 1 packages out-of-date; 0 packages too new > #> > #> * sessionInfo() > #> > #> R version 4.3.1 (2023-06-16) > #> Platform: x86_64-pc-linux-gnu (64-bit) > #> Running under: Ubuntu 22.04.2 LTS > #> > #> Matrix products: default > #> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 > #> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0 > #> > #> locale: > #> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C > #> [3] LC_TIME=es_ES.UTF-8 LC_COLLATE=en_US.UTF-8 > #> [5] LC_MONETARY=es_ES.UTF-8 LC_MESSAGES=en_US.UTF-8 > #> [7] LC_PAPER=es_ES.UTF-8 LC_NAME=C > #> [9] LC_ADDRESS=C LC_TELEPHONE=C > #> [11] LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C > #> > #> time zone: Europe/Madrid > #> tzcode source: system (glibc) > #> > #> attached base packages: > #> [1] stats4 stats graphics grDevices utils datasets methods > #> [8] base > #> > #> other attached packages: > #> [1] DelayedMatrixStats_1.22.1 DelayedArray_0.26.7 > #> [3] S4Arrays_1.0.5 abind_1.4-5 > #> [5] IRanges_2.34.1 S4Vectors_0.38.1 > #> [7] BiocGenerics_0.46.0 Matrix_1.5-4.1 > #> [9] MatrixGenerics_1.12.3 matrixStats_1.0.0 > #> > #> loaded via a namespace (and not attached): > #> [1] crayon_1.5.2 vctrs_0.6.3 cli_3.6.1 > #> [4] knitr_1.43 rlang_1.1.1 xfun_0.39 > #> [7] purrr_1.0.1 styler_1.10.1 glue_1.6.2 > #> [10] htmltools_0.5.5 rmarkdown_2.23 R.cache_0.16.0 > #> [13] grid_4.3.1 evaluate_0.21 fastmap_1.1.1 > #> [16] sparseMatrixStats_1.12.2 yaml_2.3.7 lifecycle_1.0.3 > #> [19] BiocManager_1.30.21.1 compiler_4.3.1 fs_1.6.3 > #> [22] Rcpp_1.0.11 rstudioapi_0.15.0 R.oo_1.25.0 > #> [25] lattice_0.21-8 R.utils_2.12.2 digest_0.6.33 > #> [28] reprex_2.0.2 magrittr_2.0.3 R.methodsS3_1.8.2 > #> [31] tools_4.3.1 withr_2.5.0 > #> > #> Bioconductor version '3.17' > #> > #> * 1 packages out-of-date > #> * 0 packages too new > #> > #> create a valid installation with > #> 
> #> BiocManager::install("gert", update = TRUE, ask = FALSE, force = TRUE) > #> > #> more details: BiocManager::valid()$too_new, BiocManager::valid()$out_of_date > sessionInfo() > #> R version 4.3.1 (2023-06-16) > #> Platform: x86_64-pc-linux-gnu (64-bit) > #> Running under: Ubuntu 22.04.2 LTS > #> > #> Matrix products: default > #> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 > #> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0 > #> > #> locale: > #> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C > #> [3] LC_TIME=es_ES.UTF-8 LC_COLLATE=en_US.UTF-8 > #> [5] LC_MONETARY=es_ES.UTF-8 LC_MESSAGES=en_US.UTF-8 > #> [7] LC_PAPER=es_ES.UTF-8 LC_NAME=C > #> [9] LC_ADDRESS=C LC_TELEPHONE=C > #> [11] LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C > #> > #> time zone: Europe/Madrid > #> tzcode source: system (glibc) > #> > #> attached base packages: > #> [1] stats4 stats graphics grDevices utils datasets methods > #> [8] base > #> > #> other attached packages: > #> [1] DelayedMatrixStats_1.22.1 DelayedArray_0.26.7 > #> [3] S4Arrays_1.0.5 abind_1.4-5 > #> [5] IRanges_2.34.1 S4Vectors_0.38.1 > #> [7] BiocGenerics_0.46.0 Matrix_1.5-4.1 > #> [9] MatrixGenerics_1.12.3 matrixStats_1.0.0 > #> > #> loaded via a namespace (and not attached): > #> [1] crayon_1.5.2 vctrs_0.6.3 cli_3.6.1 > #> [4] knitr_1.43 rlang_1.1.1 xfun_0.39 > #> [7] purrr_1.0.1 styler_1.10.1 glue_1.6.2 > #> [10] htmltools_0.5.5 rmarkdown_2.23 R.cache_0.16.0 > #> [13] grid_4.3.1 evaluate_0.21 fastmap_1.1.1 > #> [16] sparseMatrixStats_1.12.2 yaml_2.3.7 lifecycle_1.0.3 > #> [19] BiocManager_1.30.21.1 compiler_4.3.1 fs_1.6.3 > #> [22] Rcpp_1.0.11 rstudioapi_0.15.0 R.oo_1.25.0 > #> [25] lattice_0.21-8 R.utils_2.12.2 digest_0.6.33 > #> [28] reprex_2.0.2 magrittr_2.0.3 R.methodsS3_1.8.2 > #> [31] tools_4.3.1 withr_2.5.0 > > > Created on 2023-08-08 with reprex v2.0.2 - Attachment (reprex.tidyverse.org): Prepare Reproducible Example Code via the Clipboard

Peter Hickey (17:31:23) (in thread): > Thanks @Lluís Revilla! That looks much the same as what I’m seeing

Peter Hickey (17:31:48): > @Hervé Pagès any ideas what’s happening with what I’m seeing in https://community-bioc.slack.com/archives/CLUJWDQF4/p1691475858184039? - Attachment: Attachment > Could I please get a couple of people to post their output from running the following code with an up-to-date installation of BioC 3.17 (i.e. the current release) > > suppressPackageStartupMessages(library(MatrixGenerics)) > showMethods("colSdDiffs") > selectMethod("colSdDiffs", "matrix") > showMethods("rowSdDiffs") > selectMethod("rowSdDiffs", "matrix") > > suppressPackageStartupMessages(library(DelayedMatrixStats)) > showMethods("colSdDiffs") > selectMethod("colSdDiffs", "matrix") > showMethods("rowSdDiffs") > selectMethod("rowSdDiffs", "matrix") > > BiocManager::version() > BiocManager::valid() > sessionInfo()

2023-08-09

Hervé Pagès (01:00:46) (in thread): > That colSdDiffs() method is defined in the DelayedMatrixStats package: > > hpages@XPS15:~/DelayedMatrixStats$ find . -type d -name '.git' -prune -o -type f -exec grep -Hw 'colSdDiffs' {} \; | grep setMethod > ./R/colSdDiffs.R:setMethod("colSdDiffs", "DelayedMatrix", > ./R/colSdDiffs.R:setMethod("colSdDiffs", "matrix", matrixStats::colSdDiffs) > > But no such thing for rowSdDiffs(): > > hpages@XPS15:~/DelayedMatrixStats$ find . -type d -name '.git' -prune -o -type f -exec grep -Hw 'rowSdDiffs' {} \; | grep setMethod > ./R/rowSdDiffs.R:setMethod("rowSdDiffs", "DelayedMatrix", > > The fact that selectMethod("colSdDiffs", "matrix") reports that the method is defined in the matrixStats package is admittedly misleading here. I suspect that this is because the definition argument of the setMethod() statement is set to matrixStats::colSdDiffs, which is a function that belongs to matrixStats, and selectMethod() wants to report that instead of reporting the package where the method effectively gets defined.:man-shrugging:
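Hervé’s explanation can be reproduced with a minimal sketch (the generic and function names below are made up for illustration, not from any of the packages discussed): when `setMethod()` is handed a pre-existing function object as its `definition`, the registered method keeps that function’s environment, which is what `selectMethod()` then displays.

```r
library(methods)

# A stand-in for matrixStats::colSdDiffs: a plain function created in
# its own environment, i.e. not where the method will be registered.
f_impl <- local(function(x) apply(x, 2, stats::sd))

setGeneric("demoColSd", function(x) standardGeneric("demoColSd"))

# Register the pre-existing function directly, mirroring
# setMethod("colSdDiffs", "matrix", matrixStats::colSdDiffs).
setMethod("demoColSd", "matrix", f_impl)

# selectMethod() reports the environment of f_impl itself,
# not the environment in which setMethod() was called.
m <- selectMethod("demoColSd", "matrix")
print(environment(m))
```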

Peter Hickey (01:37:15) (in thread): > Thanks Hervé! I stared at that file yesterday and somehow did not see that :tired_face:. I’ll remove that extraneous setMethod("colSdDiffs", "matrix") from DelayedMatrixStats

Michael Lawrence (10:03:24): > I was disappointed that there wasn’t a developer day at this year’s conference. That was always a lot of fun. Do we still have a regular developer forum? Sorry that I’ve been out of the loop.

2023-08-10

Stevie Pederson (09:21:45): > @Stevie Pederson has joined the channel

2023-08-13

Mikhail Dozmorov (21:39:15): > https://code.bioconductor.org/browse/ seems to be down. Found it by timing out on https://code.bioconductor.org/browse/DEXSeq/, and then when trying “Git Browser” at https://code.bioconductor.org/ - Attachment (code.bioconductor.org): Bioconductor Code > Search and browse the source code of Bioconductor packages.

2023-08-15

Tyrone Chen (02:24:39): > @Tyrone Chen has joined the channel

Kevin Rue-Albrecht (04:10:24): > Hi fellow developers > I know I’ve run into this in the past, but can’t remember how to chase down the issue, given the generic (lack of) information given: > > ❯ checking for code/documentation mismatches ... WARNING > Warning in formals(fun) : argument is not a function > Warning in formals(fun) : argument is not a function > > ❯ checking R code for possible problems ... NOTE > Warning in formals(fun) : argument is not a function > Warning in body(fun) : argument is not a function > Warning in formals(fun) : argument is not a function > Warning in body(fun) : argument is not a function > > https://github.com/iSEE/iSEEpathways/actions/runs/5857843251/job/15880591443?pr=17#step:23:117 > Does anyone remember what to look for?

Martin Grigorov (04:34:48) (in thread): > since you use shiny this might be related - https://github.com/rstudio/shiny/issues/1676#issuecomment-314801760 - Attachment: Comment on #1676 R 3.4.0 gives warning on all Shiny apps > @kmezhoud, it sounds like it is probably a problem in another package, not Shiny. If you put options(warn=2, shiny.error=recover) at the top of your server function, it might help you find the source of the problem, by throwing an error when the warning happens.
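The generic debugging trick from that thread is useful on its own: promote warnings to errors so the offending call shows up in the traceback. A minimal sketch (the function here is a stand-in, not Shiny code):

```r
# Promote warnings to errors so that traceback()/recover can show
# exactly where a warning such as "argument is not a function" is raised.
options(warn = 2)

f <- function() warning("something odd")

# With warn = 2 the warning now surfaces as a catchable error.
res <- tryCatch(f(), error = function(e) conditionMessage(e))
print(res)  # the message notes it was converted from a warning
```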

Kevin Rue-Albrecht (04:39:19) (in thread): > Hm. Interesting. I didn’t pick that up on my Google radar. I do note that the post is from 2017; I’ve been developing a bunch of shiny-related packages in the last few years without having that warning (at least not on a regular basis), and the warning only occurs during rcmdcheck, not at runtime, so I suspect that my issue is a little bit different in nature. In any case, thanks for the pointer, the thread does offer a few ideas to chase the issue.

Kevin Rue-Albrecht (07:08:42) (in thread): > Thanks again, but unfortunately, I can confirm that this does not seem to identify my issue, as the warning does not appear at runtime but only during rcmdcheck. > I’ve also tried removing from the package documentation the only object that’s not a function (embedPathwaysResultsMethods) and I still get the warning. > I’m currently comparing with another very similar package where I’m not getting the warning at all: https://github.com/iSEE/iSEEde/actions/runs/5248688387/job/14202402761

Vince Carey (07:21:04): > @Michael Lawrence asked about developer day and the continuity of the developer forum. This used to be run on a given Thursday of each month, IIRC. It would be great to restart this with a series of topics and invited speakers. As for interest in restoring a developer-targeted component of the annual conference, I have heard this from other quarters as well. The conference committee will need to be engaged to see how to address this. @Levi Waldron

Vince Carey (07:25:53): > Off the top of my head, the following developer-oriented topics could be of interest for monthly meetings: SparseArray, tatami, Biocpy, unifying the approach to github action usage with attention to the telemetry action for instrumenting build and check, container image management, road maps for (BiocParallel, BiocCheck, BiocManager), orchestrating workshops and analyzing performance of workshop deployment, r2u/bioc2u …

Nick Owen (07:58:50): > @Nick Owen has joined the channel

2023-08-16

Kevin Rue-Albrecht (05:22:56) (in thread): > If my memory and intuition serve, I think it’s something to do with the Roxygen doc strings. My setup is a little bit complex with joint man pages for related functions, but I think roxygen is picking up something unexpected. > Like I said, I only see one object that’s not a function, and that didn’t seem to help. > I think I’ll go nuclear, wipe out the doc strings, and paste them back in progressively, hoping to spot when the issue appears. Lots of rcmdcheck runs ahead of me :sweat_smile:

Kevin Rue-Albrecht (08:31:38) (in thread): > Hmpf. Bad strategy. If I remove too much roxygen the package doesn’t even build, and if I leave just enough that it builds, I still get the warning. I’ll have to find a smarter approach to get to the bottom of this

Nikhil Mane (14:38:09): > @Nikhil Mane has joined the channel

Kevin Rue-Albrecht (16:26:05) (in thread): > Whoops. Found it. There were two methods where I defined an internal .local <- function() ... for no reason other than a naive copy-paste from seeing it in the showMethods(...) output of some related code. I don’t know why, but it turns out this is not something I’m supposed to write myself. Removing the .local stuff removed the warnings

Kevin Rue-Albrecht (16:26:09) (in thread): > see https://github.com/iSEE/iSEEpathways/blob/rcmdcheck/R/FgseaEnrichmentPlot-class.R#L138

Kevin Rue-Albrecht (16:26:52) (in thread): > (link might not be valid soon, as it points to the branch that I work on, and where I will fix the issue)

Kevin Rue-Albrecht (16:28:15) (in thread): > Persistent link here: https://github.com/iSEE/iSEEpathways/blob/a8a211962a0586646ad3c215d374c830b48593a9/R/FgseaEnrichmentPlot-class.R#L138
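For anyone who hits the same check output: the method bodies printed by `showMethods()`/`getMethod()` are wrapped in a `.local()` function that `setMethod()` generates internally; copying that wrapper into your own source, as happened here, is what can trip the code/documentation checks. A minimal sketch with a made-up generic:

```r
library(methods)

setGeneric("demoFun", function(x, ...) standardGeneric("demoFun"))

# Problematic shape: a hand-written .local() wrapper copied from
# showMethods() output. This is machinery setMethod() is supposed to
# generate itself, and writing it by hand can confuse R CMD check's
# code/documentation comparison.
setMethod("demoFun", "numeric", function(x, ...) {
    .local <- function(x, scale = 1, ...) x * scale
    .local(x, ...)
})

# Fix: declare the extra arguments directly in the method definition
# and let setMethod() do any wrapping it needs. (This re-registration
# replaces the method above.)
setMethod("demoFun", "numeric", function(x, scale = 1, ...) {
    x * scale
})

print(demoFun(2, scale = 3))  # 6
```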

Lambda Moses (17:44:44): > I’m debating if this should go in the main (release) or devel branch. I found an embarrassing bug in my package on Bioc 3.17, and realized that fixing it needs some pretty invasive changes, such as changes in several methods for the new S4 class that is mostly used internally but is exported, and adding some new exported functions. The package of interest is SpatialFeatureExperiment (SFE), and the S4 class of interest inherits from VirtualSpatialImage from SpatialExperiment to manage images; it’s just a simple thin wrapper around terra::SpatRaster. SFE is still experimental and isn’t too popular yet (downloaded by around 90 distinct IPs per month in the past few months, while SCE has over 10,000 per month), so I don’t think I have to be that conservative in making changes. I’m debating if it should go to the main branch because it’s a bug fix and it’s kind of urgent, or if it should go to the devel branch because it’s an invasive change. Any suggestions?

2023-08-17

Vince Carey (11:24:45): > Hi @Lambda Moses – my view would be that the opportunity to fix the bug in release should be taken. We should minimize all possible downstream effects. There is a tradeoff between achieving stability in release and constraining developers’ improvements to their code. I don’t want to create a precedent and I am glad you have posed the question. I’d be interested to hear other voices on the topic – @Hervé Pagès?

2023-08-18

Matteo Tiberti (09:14:03): > hi everyone, we are working on the review for the inclusion of our Moonlight2R package, and we have an R problem we are a bit stuck on. We’re getting this Error in the examples for one of our functions during R CMD check (it’s the first time we’ve used this one): > > * checking examples ... [37s/42s] ERROR > Running examples in 'Moonlight2R-Ex.R' failed > The error most likely occurred in: > > > ### Name: GSEA > > ### Title: GSEA > > ### Aliases: GSEA > > > > ### **** Examples > > > > data("DEGsmatrix") > > dataFEA <- GSEA(DEGsmatrix = DEGsmatrix) > Error in loadNamespace(name) : there is no package called 'org.Hs.eg.db' > > this is strange because org.Hs.eg.db is actually installed (I can load it with library). Our function only uses it in conjunction with clusterProfiler::bitr as far as I can see. Any idea what we might be doing wrong? Here is the function header and example: > > #' GSEA > #' > #' This function carries out the GSEA enrichment analysis. > #' @param DEGsmatrix DEGsmatrix output from DEA such as dataDEGs > #' @param top is the number of top BP to plot > #' @param plot if TRUE return a GSEA's plot > #' @importFrom grDevices dev.list > #' @importFrom grDevices graphics.off > #' @importFrom clusterProfiler bitr > #' @importFrom DOSE gseDO > #' @importFrom DOSE simplot > #' @return return GSEA result > #' @export > #' @examples > #' data("DEGsmatrix") > #' dataFEA <- GSEA(DEGsmatrix = DEGsmatrix) > GSEA <- function (DEGsmatrix, top, plot = FALSE){ >

Jacques SERIZAY (09:17:46) (in thread): > My guess is that you need to import org.Hs.eg.db and add it to your NAMESPACE/DESCRIPTION.

Matteo Tiberti (09:35:14) (in thread): > I see - I’ll try that, thank you

Matteo Tiberti (09:50:30) (in thread): > seems to have worked, thanks a lot! A pretty straightforward fix. The error message made me think there was something else going on, but happy it works now!

Kevin Rue-Albrecht (11:19:14) (in thread): > If it’s only used in the man page, you can add it to Suggests:. If you import anything from it in your package, then add it to Imports:.
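A sketch of how that split looks in a DESCRIPTION file (the package lists here are illustrative, not Moonlight2R’s actual dependencies):

```
Imports:
    clusterProfiler,
    DOSE,
    org.Hs.eg.db
Suggests:
    knitr,
    rmarkdown
```

Packages under Imports: must be installed and are loadable whenever the package is used; Suggests: packages are only needed for examples, vignettes, or tests.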

Hervé Pagès (11:41:48): > @Lambda Moses It’s a tough call, but adding new exported functions should not be an issue. A more problematic situation is if you need to modify the behavior of existing user-facing functions. If this is the case, make sure to add a note in the man page of each affected function about that (preferably a VERY VISIBLE one at the beginning of the man page). Depending on the severity of the situation you might also consider having the function emit a warning. > An alternative is to implement new functions (using new names) with the correct behavior and preserve the old ones but deprecate them in favor of the new ones. This is the least invasive approach but is not always practical. > Note that these recommendations apply even if you only fix the devel version of your package. We should not change the behavior of functions in release, but IIUC it sounds like in your case it might be unavoidable.
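The deprecation route can be sketched with base R’s `.Deprecated()` (both function names below are hypothetical):

```r
# New function implementing the corrected behavior.
newSummarize <- function(x) {
    mean(x, na.rm = TRUE)
}

# Old, buggy entry point kept for backwards compatibility:
# it emits a deprecation warning pointing at the replacement,
# then forwards to the new function.
oldSummarize <- function(x) {
    .Deprecated("newSummarize")
    newSummarize(x)
}

print(newSummarize(c(1, 2, NA)))  # 1.5
```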

Victor Yuan (12:35:32): > @Victor Yuan has joined the channel

Lambda Moses (14:10:39): > @Vince Carey @Hervé Pagès Thank you for your informative comments. Yes, some existing exported functions are changed, but I don’t fret about it because the class is mostly used internally. I’m more inclined to merge to main since it’s an embarrassing bug.

2023-08-21

Louis Le Nézet (10:50:47): > Hi! > > I’m working on generic methods and I have some trouble with the order of the parameters in the methods and the generic. > Most of the methods are as follows: > * A general method taking a set of vectors of the same length and processing them > * A method for a data.frame where all the needed vectors are in the columns > * A method for my S4 class where the data.frame containing the vectors is in a slot > My aim is that the function needs to detect the type of input and correctly handle the mandatory and optional arguments. > What I’m doing: > * The generic asks for the first vector / object and the optional arguments > * The method for character asks for all mandatory arguments > * The method for data.frame checks for the mandatory arguments in the columns > * My S4 method calls the data.frame method and stores the result in the object > > > setGeneric("fct1", function(obj, opt1 = TRUE, opt2 = 1, ...) { > standardGeneric("fct1") > }) > > setMethod("fct1", "character", function( > obj, opt1 = TRUE, opt2 = 1, mdt1, mdt2 > ) { > id <- obj > do_stuff() > give_back_vector() > }) > > setMethod("fct1", "data.frame", function( > obj, opt1 = TRUE, opt2 = 1 > ) { > check_columns(obj, c("id", "mdt1", "mdt2")) > fct1(obj$id, opt1 = opt1, opt2 = opt2, obj$mdt1, obj$mdt2) > }) > > setMethod("fct1", "MyClass", function( > obj, opt1 = TRUE, opt2 = 1 > ) { > obj$df$res <- fct1(obj$df, opt1 = opt1, opt2 = opt2) > obj > }) > > Is this the right way to go?

Louis Le Nézet (10:55:02) (in thread): > The problem with ordering the arguments like that is that you need to spell out the optional arguments when calling the function with the vectors: > > fct1(id, TRUE, 1, mdt1, mdt2) > fct1(df) > fct1(obj) >

Marcel Ramos Pérez (11:11:06) (in thread): > You may be better off creating a plain function that generates the result of fct1 rather than wedging arguments into the fct1 method (the method signature should match the generic’s exactly).

Jacques SERIZAY (11:43:03) (in thread): > Does this answer your question (I named yourfct1asmyFun): > > #' @examples > #' MC <- MyClass(df = data.frame(id = 'ID', mdt1 = 'this is mdt. 1', mdt2 = 'this is mdt. 2')) > #' MC_modified <- myFun( > #' x = MC, > #' opt1 = 'this is opt. 1', > #' opt2 = 'this is opt. 2' > #' ) > #' MC_modified@df$res > NULL > > # ~~~~~~~~~~~~ Defining class MyClass ~~~~~~~~~~~~ # > > methods::setClass("MyClass", > slots = c( > df = "data.frame" > ) > ) > > MyClass <- function(df) {methods::new("MyClass", df = df)} > > > # ~~~~~~~~~~~~ Defining function myFun ~~~~~~~~~~~~ # > > setGeneric("myFun", function(x, ...) {standardGeneric("myFun")}) > > setMethod("myFun", "character", function(x, mdt1, mdt2, opt1 = TRUE, opt2 = 1) { > print("Character method...") > if (missing(mdt1)) {stop("mdt1 missing")} > if (missing(mdt2)) {stop("mdt2 missing")} > # Do whatever > id <- paste0(c(x, mdt1, mdt2, opt1, opt2), collapse = ' / ') > id > }) > > setMethod("myFun", "data.frame", function(x, opt1 = TRUE, opt2 = 1) { > print("data.frame method...") > # Check that columns exist > if (!all(c("id", "mdt1", "mdt2") %in% colnames(x))) stop("`id`, `mdt1` and `mdt2` columns must be present") > # Redirects to `chararcter` method > myFun(x = x$id, mdt1 = x$mdt1, mdt2 = x$mdt2, opt1 = opt1, opt2 = opt2) > }) > > setMethod("myFun", "MyClass", function(x, opt1 = TRUE, opt2 = 1) { > print("MyClass method...") > x@df$res <- myFun(x@df, opt1 = opt1, opt2 = opt2) > x > }) >

Jacques SERIZAY (12:01:30) (in thread): > Why the thumbs down, @Marcel Ramos Pérez? I agree it’s not the simplest approach, but is it invalid nonetheless?

Marcel Ramos Pérez (12:03:46) (in thread): > The method signatures are inconsistent with the generic. Another option is to add all the arguments to the generic.

Jacques SERIZAY (12:11:22) (in thread): > If I document all the params, the .Rd is pretty clear to me: > > doc package:myPkg R Documentation > > Documentation > > Description: > > Documentation > > Usage: > > MyClass(df) > > myFun(x, ...) > > ## S4 method for signature 'character' > myFun(x, mdt1, mdt2, opt1 = TRUE, opt2 = 1) > > ## S4 method for signature 'data.frame' > myFun(x, opt1 = TRUE, opt2 = 1) > > ## S4 method for signature 'MyClass' > myFun(x, opt1 = TRUE, opt2 = 1) > > Arguments: > > df: a data.frame with columns 'id', 'mdt1' and 'mdt2' > > x: a character, data.frame or MyClass object > > ...: Passed to appropriate method > > mdt1: mdt1 arg. > > mdt2: mdt2 arg. > > opt1: opt1 arg. > > opt2: opt2 arg. > > Examples: > > MC <- MyClass(df = data.frame(id = 'ID', mdt1 = 'this is mdt. 1', mdt2 = 'this is mdt. 2')) > MC_modified <- myFun( > x = MC, > opt1 = 'this is opt. 1', > opt2 = 'this is opt. 2' > ) > MC_modified@df$res > > check()returns 0 errors, 0 warnings and 0 notes. > > I could not find any guidelinesagainst methods signatures being different from the generic. In fact, if I remove the ellipsis from the generic, I get this message, which suggests that the ellipsis is valid: > > r > devtools::document() > ℹ Updating myPkg documentation > ℹ Loading myPkg > Error in `load_all()`: > ! Failed to load R/myfun.R > Caused by error in `rematchDefinition()`: > ! methods can add arguments to the generic 'myFun' only if '...' is an argument to the generic > Run `rlang::last_trace()` to see where the error occurred. >

Jacques SERIZAY (12:14:24) (in thread): > Please let me know if it’s a requirement for Bioc to not use the ellipsis in the generic, I’m putting together a package with a new method and was considering using it:sweat_smile:

Marcel Ramos Pérez (14:44:08) (in thread): > It’s okay to have the ellipsis, but the signatures must match across methods in addition to matching the generic. See https://developer.r-project.org/howMethodsWork.pdf

Jacques SERIZAY (15:59:45) (in thread): > “Therefore, in principle all methods must have exactly the same formal arguments as the generic function. Some slight trickery in the call to setMethod() allows methods to insert some new arguments corresponding to the . . . argument in the generic function.” That would make me a trickster :fox_face:. Thanks @Marcel Ramos Pérez for the info!

Hervé Pagès (22:28:14) (in thread): > @Louis Le Nézet If you name the extra arguments mdt1 and mdt2 (e.g. fct1(id, mdt1=x1, mdt2=x2)), then you don’t need to pass the optional arguments. This is also better style than doing fct1(id, TRUE, 1, mdt1, mdt2).

Hervé Pagès (22:58:44) (in thread): > @Jacques SERIZAY @Marcel Ramos Pérez Yes, in the case of a generic with the ellipsis, individual methods can introduce new formal arguments. This is actually very common practice. @Louis Le Nézet 3 more things: > * “signature” is not the same as “list of formal arguments”. The former is the subset of the latter that is used for method dispatch, and it can be specified via the signature argument in setGeneric(). In your case, I would actually suggest that you restrict dispatch to obj by specifying signature="obj". > * Do not put curly brackets around standardGeneric("fct1") otherwise you’ll get a non-standard generic instead of a standard one. > * Finally, IMO maybe a cleaner approach in this particular case is to put optional arguments opt1 and opt2 after the ellipsis: > > > setGeneric("fct1", signature="obj", > function(obj, ..., opt1=TRUE, opt2=1) standardGeneric("fct1") > ) > > setMethod("fct1", "character", > function(obj, mdt1, mdt2, opt1=TRUE, opt2=1) { > ... > } > ) > > setMethod("fct1", "data.frame", > function(obj, opt1=TRUE, opt2=1) { > ... > } > ) > > This way the user no longer needs to name the mdt1 and mdt2 arguments, i.e. they can just do fct1(id, mdt1, mdt2). However, now they need to name the optional arguments, but that seems more natural to me.

2023-08-22

Hervé Pagès (02:17:20) (in thread): > Ah… unfortunately, putting formal arguments after the ellipsis in the generic leads to other problems due to a bug in R. See https://bugs.r-project.org/show_bug.cgi?id=18538 for the details. Even though the bug got fixed recently in R devel (4.4), it seems that it is still present in R 4.3.1.

Louis Le Nézet (03:39:34) (in thread): > I didn’t know it was possible to restrict the signature to one argument, it’s quite convenient! > Being able to put the ellipsis before the optional arguments would be really nice. > But with the bug still present, maybe I should go for: > > setGeneric("fct1", signature="obj", > function(obj, ...) standardGeneric("fct1") > ) > > setMethod("fct1", "character", > function(obj, mdt1, mdt2, opt1=TRUE, opt2=1) { > ... > } > ) > > setMethod("fct1", "data.frame", > function(obj, opt1=TRUE, opt2=1) { > ... > } > ) >

Hervé Pagès (11:32:44) (in thread): > I suppose that would work. Note that in this case you actually don’t need to specify signature="obj" since the signature includes all arguments preceding the ellipsis by default. > > Also: right now it seems that the method for data.frame returns a vector parallel to the input data.frame, whereas the method for MyClass objects returns a modified MyClass object. Maybe it would be more consistent to have the method for data.frame return the input data.frame with the new column added to it? In other words, I’m suggesting that both methods behave like endomorphisms, i.e. that they return an object of the same type as the input. This would make their behavior more predictable.
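The endomorphism suggestion above can be sketched as follows (a minimal, hypothetical illustration: the placeholder computation and the num_child column name are invented for this sketch, not taken from the actual package):

```r
## Sketch: the data.frame method returns the input data.frame with a
## new column added, mirroring how the MyClass method returns a
## modified MyClass object -- both methods are endomorphisms.
library(methods)

setGeneric("fct1", signature = "obj",
    function(obj, ...) standardGeneric("fct1")
)

setMethod("fct1", "data.frame", function(obj, opt1 = TRUE, opt2 = 1) {
    ## placeholder computation for the new per-row column
    obj$num_child <- seq_len(nrow(obj)) * opt2
    obj  # same type as the input
})
```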

Louis Le Nézet (12:21:40) (in thread): > Okay I will try to do so !

2023-08-23

Matteo Tiberti (03:50:58): > hi, again, I have a (probably naive) question to share. Following the review of our package, we have set LazyData: false in our DESCRIPTION. As an unintended consequence, we now get a bunch of warnings during R CMD check for several data objects: > > Data with usage in documentation object 'NCG' but not in code: > 'NCG' > > I tried looking around and found that this might be due to a bug in roxygen2, except we do @usage as suggested in the linked issue in the data object definition, for what I can tell: > > ... > #' @docType data > #' @usage data(NCG) > #' @name NCG > ... > > in fact we get this in man/NCG.Rd: > > \usage{ > data(NCG) > } > > it is then used in the @examples section of a function: > > #' @examples > ... > #' data(NCG) > > and in the function definition: > > ... > # Load NCG file > data(NCG) > ... > > We’re a bit at a loss about what we might be doing wrong - any suggestions?

Marcel Ramos Pérez (10:11:16) (in thread): > off is an invalid value for the LazyData field. Either delete the field or set it to false

Matteo Tiberti (10:13:09) (in thread): > Thank you - my bad, it is actually set to false. I’ll update the original message

Marcel Ramos Pérez (10:15:31) (in thread): > Do you have the rest of the data documentation roxygen block? Or even better, a GH repo you can point to?

Matteo Tiberti (10:17:28) (in thread): > you can have a look at this: https://github.com/ELELAB/Moonlight2R/tree/fix76_description_updates. There might be small differences with respect to the content of my message because I’ve been making some attempts at fixing it in the meantime that I haven’t committed, as they didn’t work

Matteo Tiberti (10:18:33) (in thread): > the rest of the data block: > > #' Network of Cancer Genes 7.0 > #' > #'@description A dataset retrived from Network of Cancer Genes 7.0 > #'@details The NCG_driver is reported as a OCG or TSG when at least one of three > #' three databases have documented it. These are cosmic gene census (cgc), > #' vogelstein et al. 2013 or saito et al. 2020. The NCG_driver is reported as a > #' candidate, when literature support the gene as a cancer driver. > #'@docType data > #'@usage data(NCG) > #'@name NCG > #'@aliases NCG > #'@return A 3347x7 table > #' > #'@format The format have been rearranged from the original. > #' <symbold>|<NCG_driver>|<NCG_cgc_annotation>|<NCG_vogelstein_annotation>| > #' <NCG_saito_annotation>|<NCG_pubmed_id> > #'@source \url{[http://ncg.kcl.ac.uk/](http://ncg.kcl.ac.uk/)} > #'@references Comparative assessment of genes driving cancer and somatic > #'evolution in non-cancer tissues: an update of the Network of Cancer Genes (NCG) > #'resource. > #'Dressler L., Bortolomeazzi M., Keddar M.R., Misetic H., Sartini G., > #'Acha-Sagredo A., Montorsi L., Wijewardhane N., Repana D., Nulsen J., > #'Goldman J., Pollit M., Davis P., Strange A., Ambrose K. and Ciccarelli F.D. > #' > "NCG" >

Matteo Tiberti (10:19:12) (in thread): > this is just one example; we have many data blocks with this same problem, and some that don’t seem to trigger it (but they look the same to me - I couldn’t see any meaningful difference)

Matteo Tiberti (10:19:16) (in thread): > thanks for helping!

Marcel Ramos Pérez (11:45:43) (in thread): > I didn’t get the warning when running R CMD check on fix76_description_updates

Matteo Tiberti (11:59:40) (in thread): > I see, this is kind of weird. Can I ask what setup you’re using? I’m on a Docker image derived from bioconductor/bioconductor_docker:devel built on 07/08 (R 4.3.1, roxygen2 7.2.3)

Matteo Tiberti (12:22:14) (in thread): > I have just tried recloning and switching branch from scratch just to be safe but I’m still getting the warning, unfortunately

2023-08-24

Lachlan Baer (01:13:06): > @Lachlan Baer has joined the channel

Matteo Tiberti (06:57:50) (in thread): > for future reference, after much head bashing I found my problem - we hadn’t updated data/datalist to reflect the content of the data folder; once we did so, the WARNING went away. Is this expected behaviour?

Matteo Tiberti (07:18:12) (in thread): > anyhow, thanks for helping @Marcel Ramos Pérez!

Martin Grigorov (08:52:17): > Hi! Does BiocFileCache use the R_DEFAULT_INTERNET_TIMEOUT env var from Renviron.bioc?

Martin Grigorov (08:53:00) (in thread): > I am trying to debug https://bioconductor.org/checkResults/3.18/bioc-LATEST/MSnbase/kunpeng2-checksrc.html

Martin Grigorov (08:55:01) (in thread): > it uses rpx (which uses BiocFileCache) to download a file from http://ftp.pride.ebi.ac.uk/pride/data/archive/2012/03/PXD000001/ but it often times out on kunpeng2 and the test tries to parse incomplete XML, i.e. BiocFileCache downloads ~300MB of the 429M file

Martin Grigorov (08:56:45) (in thread): > I wonder which timeout setting I could increase to make it work

Lori Shepherd (11:09:29) (in thread): > BiocFileCache uses httr::GET to download files - so I believe using httr::timeout would be the most appropriate

Martin Grigorov (13:54:51) (in thread): > Thank you, @Lori Shepherd! > This is the change I made in the rpx package: > > biocbuild@kunpeng2 ~/g/rpx (devel)> git diff > diff --git a/R/cache.R b/R/cache.R > index 52ed49f..fa3e1f6 100644 > --- a/R/cache.R > +++ b/R/cache.R > @@ -168,7 +168,7 @@ allPXD <- function(cache = rpxCache()) { > return(.read_and_parse_sitemap(rid$rpath)) > } else { > ## update and return > - bfcdownload(cache, rid$rid, ask = FALSE) > + bfcdownload(cache, rid$rid, ask = FALSE, config = timeout(3000)) > rid <- bfcquery(cache, url) > return(.read_and_parse_sitemap(rid$rpath)) > } > > It seems it does what I needed - it is still downloading the file!

Lori Shepherd (14:01:24) (in thread): > side note - knowing that different internet connections can play a factor, maybe it would be worth making it an argument to the function so that users could adjust it as necessary?

Martin Grigorov (14:04:12) (in thread): > I discussed it with Laurent Gatto (the author of rpx and MSnbase) at https://github.com/lgatto/MSnbase/issues/595. For the time being I think I can even revert the local change after downloading the files. The next runs will use them from ~/.cache/R/rpx/... - Attachment: #595 R CMD check failure on Linux ARM64

2023-08-30

Dario Strbenac (09:00:02): > If I run load_all() three times while developing ClassifyR, the first time is fine, the second gives a warning and the third an error. What is the cause? Should I use an alternative rapid prototyping approach? > > > load_all() > ℹ Loading ClassifyR > > load_all() > ℹ Loading ClassifyR > Warning message: > class "Surv" is defined (with package slot ClassifyR) but no metadata object found to revise superClass information---not imported? Making a copy in package ClassifyR > > load_all() > ℹ Loading ClassifyR > in method for 'ROCplot' with signature '"ClassifyResult"': no definition for class "ClassifyResult" > Error in `load_all()`: > ! Failed to load R/classes.R > Caused by error in `setOldClass()`: > ! inconsistent old-style class information for "Surv"; the class is defined but does not extend "oldClass" and is not valid as the data part >

Marcel Ramos Pérez (10:24:08): > It seems like R is confusing the S4 class that was loaded for the S3 one after doing load_all() a third time. ~~This is likely because there is no importClassesFrom directive in the NAMESPACE and also survival doesn’t exportClasses("survival")~~. Any recommendations @Martin Morgan @Hervé Pagès?

Martin Morgan (11:25:24) (in thread): > As ever, the first step is to produce a simple reproducible example. For instance I > > devtools::create("ClassifyRTest") > > then added a file R/Surv.R > > setOldClass("Surv") > > setClass("A", slots = c(x = "Surv")) > > Repeatedly running devtools::load_all() did not cause problems… > > EDIT: {survival} uses S3 so there are no classes to export / import.

2023-09-02

Louis Le Nézet (13:06:03): > Hi, > I’m looking for a way to get rid of the check() ‘global variable’ error when using dplyr. > I have a function like: > > dad_child <- df[df$dadid != missid, c("dadid", "id")] %>% > group_by(dadid) %>% > summarise(child = list(id)) %>% > mutate(num_child_dir = lengths(child)) %>% > rename(id = dadid) > > But then I get the following error with check(): > > num_child,character: no visible binding for global variable 'child' > Undefined global functions or variables: > child > > One suggested solution on Stack Overflow was to add .data$ before all variables; it works but it is now deprecated: > > Use of .data in tidyselect expressions was deprecated in tidyselect 1.2.0. > i Please use `"child"` instead of `.data$child` > > What is the recommended way to proceed?

Dirk Eddelbuettel (13:23:55) (in thread): > 1. Just do child <- NULL outside your function in the same file (creating a dummy) > 2. Use utils::globalVariables(c("child")) > Either suppresses the error.

Spencer Nystrom (13:43:29) (in thread): > Another workaround is to quote the names. Like summarise("child" = foo(bar)).

Louis Le Nézet (13:51:24) (in thread): > Thanks, works like a charm !

Henrik Bengtsson (21:11:04) (in thread): > > Just do child <- NULL outside your function … > I always do that inside my functions, so that it does not affect other functions checked, i.e. I don’t want to risk missing true positive child mistakes in other functions. That’s also why I never liked globalVariables(). Of course, one can put child <- NULL outside the function but inside a local() environment.
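The workarounds from this thread can be sketched together (a minimal illustration; the dadid/id column names come from the earlier example, and the wrapper name num_child is invented):

```r
## (1) Dummy binding inside the function, so the NULL cannot mask a
##     genuine 'child' mistake elsewhere in the file:
num_child <- function(df) {
  child <- NULL  # silences "no visible binding" in R CMD check
  df |>
    dplyr::group_by(dadid) |>
    dplyr::summarise(child = list(id)) |>
    dplyr::mutate(num_child_dir = lengths(child))
}

## (2) Package-wide declaration, typically placed in R/globals.R
##     (only meaningful inside a package):
## utils::globalVariables(c("child"))

## (3) Quote the name at the point of use:
##     dplyr::summarise(df, "child" = list(id))
```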

2023-09-03

Louis Le Nézet (14:18:01): > Hi, > I’m now working on the roxygen documentation for some S4 methods. > I don’t quite know how to write it properly when the type of the first argument differs between methods and also when the return value changes. > Should I put everything on the generic or should I separate the documentation? > > #' My function > #' > #' @param obj Vector of fathers ids or a pedigree > #' @param momid Vector of mothers ids > #' @param missid Character defining the missing ids > #' > #' @return A paste of the father and mother ids > #' or a pedigree with the parents ids > #' @docType methods > #' @export > setGeneric("myfunction", signature = "obj", > function(obj, ...) standardGeneric("myfunction") > ) > > #' @docType methods > #' @aliases myfunction,character > #' @rdname myfunction > setMethod("myfunction", "character", function(obj, momid, missid = "0") { > paste(obj, momid, sep = missid) > }) > > #' @docType methods > #' @aliases myfunction,Pedigree > #' @rdname myfunction > setMethod("myfunction", "Pedigree", > function(obj, missid = "0") { > obj$par <- myfunction(obj$dadid, obj$momid, missid) > obj > } > ) >

Louis Le Nézet (16:05:30) (in thread): > Otherwise I would have gone for something like > > #' My function > #' > #' @param obj An object either a character vector or a pedigree > #' @param ... Arguments to be passed to methods > #' > #' @docType methods > #' @export > setGeneric("myfunction", signature = "obj", > function(obj, ...) standardGeneric("myfunction") > ) > > #' @docType methods > #' @aliases myfunction,character > #' @rdname myfunction > #' @param dadid A character vector > #' @param momid A character vector > #' @param missid Character defining the missing ids > #' @usage ## S4 method for signature 'character' > #' @usage myfunction(dadid, momid, missid = "0") > #' @return A character vector with the parents ids > setMethod("myfunction", "character", function(obj, momid, missid = "0") { > dadid <- obj > paste(dadid, momid, sep = missid) > }) > > #' @docType methods > #' @aliases myfunction,Pedigree > #' @param ped A pedigree object > #' @param missid Character defining the missing ids > #' @usage ## S4 method for signature 'Pedigree' > #' @usage myfunction(ped, missid = "0") > #' @return A pedigree with the parents ids > #' @include pedigreeClass.R > #' @rdname myfunction > setMethod("myfunction", "Pedigree", > function(obj, missid = "0") { > ped <- obj > ped$par <- myfunction(ped$dadid, ped$momid, missid) > ped > } > ) > > The usage section looks like > > myfunction(obj, ...) > > ## S4 method for signature 'character' > > myfunction(dadid, momid, missid = "0") > > ## S4 method for signature 'Pedigree' > > myfunction(ped, missid = "0") > > But the value section is not divided by methods

2023-09-04

Alan O’C (03:15:49) (in thread): > It would probably be more informative to document the possible types of value accepted in one place, rather than distributed across a bunch of different documentation pages.

Louis Le Nézet (03:33:59) (in thread): > I agree. > Do you have a good example of an S4 generic / methods function with different inputs / outputs for the methods?

2023-09-05

Martin Morgan (04:04:23) (in thread): > The message is a bit subtle, and I’m guessing that your example is incomplete. Here’s a simple reproducible example > > > mtcars |> filter(.data$mpg > 30) |> select(.data$mpg) > mpg > Fiat 128 32.4 > Honda Civic 30.4 > Toyota Corolla 33.9 > Lotus Europa 30.4 > Warning message: > Use of .data in tidyselect expressions was deprecated in tidyselect 1.2.0. > ℹ Please use `"mpg"` instead of `.data$mpg` > > But as hinted at by ‘tidyselect’, the deprecation actually comes from the select() part of the formulation (in a new R session, …) > > > mtcars |> filter(.data$mpg > 30) |> select("mpg") > mpg > Fiat 128 32.4 > Honda Civic 30.4 > Toyota Corolla 33.9 > Lotus Europa 30.4 > > > > This makes sense. The .data pronoun disambiguates where mpg comes from – the mtcars data frame – and .data$mpg is a vector suitable for use in filtering. It doesn’t make sense to have the vector .data$mpg as a (scalar) column selection. > > Actually, when the error occurs it provides information on where it comes from > > This warning is displayed once every 8 hours. > Call `lifecycle::last_lifecycle_warnings()` to see where this warning was > generated. > > lifecycle::last_lifecycle_warnings() > [[1]] > <warning/lifecycle_warning_deprecated> > Warning: > Use of .data in tidyselect expressions was deprecated in tidyselect 1.2.0. > ℹ Please use `"mpg"` instead of `.data$mpg` > --- > Backtrace: > ▆ > 1. ├─dplyr::select(filter(mtcars, .data$mpg > 30), .data$mpg) > 2. └─dplyr:::select.data.frame(filter(mtcars, .data$mpg > 30), .data$mpg) > 3. └─tidyselect::eval_select(expr(c(...)), data = .data, error_call = error_call) > ... > > so from dplyr::select(). So the solution is to use .data$... and "..." as appropriate, and avoid giving R carte blanche to ignore incorrectly used variables.

Lori Shepherd (08:21:35): > The release schedule for Bioconductor release 3.18 can be found at:https://bioconductor.org/developers/release-schedule/. Please be mindful of important deadlines.

2023-09-12

Louis Le Nézet (09:08:49): > Hi, > I’m working on some plot checks with testthat::test_that() and the function vdiffr::expect_doppelganger(). > The problem I’m facing is that the graphical parameters par() aren’t the same when I run devtools::test() and devtools::check(). > I tried to fix par() with a list but I can’t control the graphics correctly, so the plot isn’t exactly the same. > What should I do?

2023-09-13

Christopher Chin (17:03:29): > @Christopher Chin has joined the channel

2023-09-18

Beatriz Calvo Serra (05:37:07): > @Beatriz Calvo Serra has joined the channel

2023-09-19

Louis Le Nézet (09:32:48): > Hi, > I’m encountering the following error with an S4 class I’m building: > > Undocumented S4 methods: > generic '[[<-' and siglist 'Pedigree,ANY,missing' > All user-level objects in a package (including S4 classes and methods) > should have documentation entries. > > From what I understood, I need to add a new method with the given signature. > But I don’t understand what would call such a signature and what the expected result should be. > I would have guessed that it is called when the user wants to assign a new value to a given slot, but the last argument is missing. > Does anyone know?

Kasper D. Hansen (10:48:38): > This is a documentation error

Kasper D. Hansen (10:48:51): > You need to add a manpage which documents this signature

Louis Le Nézet (10:53:40): > The thing is that I don’t know the meaning of this signature. > Is it something like: > > Pedigree[[i]] <- value > setMethod("[[<-", c(x = "Pedigree", i = "ANY", value = "missing"), > function(x, i, value) { > slot(x, i) <- value > validObject(x) > x > } > ) > > But if so, why is the value argument missing? > Or is it meant to delete the slot by assigning nothing?

Hervé Pagès (13:42:44): > > From what I understood I need to add a new method with the given signature. > It’s the other way around: R CMD check is telling you that you already have a method with this signature but that you don’t have the corresponding alias in any of your man pages. It seems that in this case the alias you need to add would be \alias{[[<-,Pedigree,ANY,missing-method}. Yes, this is a strange method and I’m not sure what the use case for such a method would be. What’s strange about your message is that you don’t seem to be aware that your package defines such a method; that’s where I’m really confused. After you install and load the package, what does showMethods("[[<-") report?

2023-09-20

Louis Le Nézet (03:57:46): > Hi, > It makes more sense now. Apparently the signature of Pedigree[[i]] <- value as an S4 method is \S4method{[[}{Pedigree,ANY,missing}(x, i, j, ...) <- value in the Rd file. > But for a reason I don’t quite understand, the documentation was present in the R script (in roxygen) and present in the man page but not recognized by check(). > Everything works fine now, even if I don’t know what changed (I’m unable to reproduce the error). > Thanks!

Alan O’C (06:44:02): > https://github.com/LouisLeNezet/kinship2/blob/b19415014f96fad2afde4b62bb9938ecfca2431f/R/pedigreeClass.R#L115

Alan O’C (06:44:49): > It seems like this method gets aliased to x="Pedigree", i="ANY", j="missing" somehow > > Most likely because the signature is c(x, i, j, value), but the argument list is x, i, j, ..., value

2023-09-22

Amanda Hiser (13:53:15): > @Amanda Hiser has joined the channel

2023-09-24

Teemu Daniel Laajala (04:34:47): > @Teemu Daniel Laajala has joined the channel

2023-09-25

Nikhita (18:40:40): > @Nikhita has joined the channel

2023-10-03

Louis Le Nézet (09:35:07): > Hi, > The error came back without me knowing why… > The script is: > > #' @rdname extract-methods > #' @return The pedigree object with the slot `i` replaced by `value`. > setMethod("[[<-", c(x = "Pedigree", i = "ANY", j = "missing", value = "ANY"), > function(x, i, j, value) { > slot(x, i) <- value > validObject(x) > x > } > ) > > The man page looks like > > % Generated by roxygen2: do not edit by hand > % Please edit documentation in R/pedigreeClass.R > \name{show,Pedigree-method} > \alias{[[<-,Pedigree,ANY,missing,ANY-method} > \title{Pedigree methods} > \usage{ > \S4method{[[}{Pedigree,ANY,missing,ANY}(x, i, j) <- value > } > > The error I get: > > Undocumented S4 methods: > generic '[[<-' and siglist 'Pedigree,ANY,missing' > > What should I do to solve this?

Hervé Pagès (13:26:31): > FWIW, personally I try to keep the signatures of my methods as simple as possible, e.g. in the above case I’d probably do something like this: > > setMethod("[[<-", "Pedigree", > function(x, i, j, value) { > if (!missing(j)) > stop("'x[[i, j]] <- value' is not supported on a Pedigree object") > slot(x, i) <- value > validObject(x) > x > } > ) > > Then the alias and \usage lines in the man page just need to be \alias{[[<-,Pedigree-method} and \S4method{[[}{Pedigree}(x, i, j) <- value. This should avoid the kind of annoyance that you’re going through with the complicated method signature that you’re currently using. > More importantly, please note that defining a [[<- method that is basically a disguised generic slot setter (i.e. x[["foo"]] <- value does x@foo <- value) is not recommended. We recommend that you define slot-specific accessor methods instead, one getter and setter per slot. A typical approach is to name the getter and setter like the slot. This will allow the user to do foo(x) and/or foo(x) <- value to get and/or set the foo slot. Not all slots necessarily need to have a corresponding pair of getter/setter, only the slots that you want to expose to the user. And even the exposed slots don’t necessarily need to have a getter and a setter, but maybe only one of them. It’s an important part of the design process of a new class to decide what parts of an object are going to be exposed and how they’re going to be exposed.
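The getter/setter recommendation above can be sketched like this (a minimal, hypothetical illustration: the "hints" slot name is invented for the sketch, and the real Pedigree class has its own slots):

```r
## Slot-specific accessors instead of a disguised x[[i]] <- value setter.
library(methods)

setClass("Pedigree", representation(hints = "list"))

## Getter: hints(x)
setGeneric("hints", function(x) standardGeneric("hints"))
setMethod("hints", "Pedigree", function(x) x@hints)

## Setter: hints(x) <- value
setGeneric("hints<-", function(x, value) standardGeneric("hints<-"))
setReplaceMethod("hints", "Pedigree", function(x, value) {
    x@hints <- value
    validObject(x)  # keep the object valid after modification
    x
})
```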

2023-10-04

Louis Le Nézet (07:17:04): > Hi, > Thanks a lot for this complete answer. It makes it really clear for me. > I will modify my code accordingly !

Martin Morgan (16:02:13): > I’ve been developing ‘AlphaMissense’ to submit as a package to make the variant annotations derived from DeepMind / AlphaMissense (publication, Zenodo) easily available to the R and Bioconductor communities. I appreciate any feedback / issues / etc. before submitting; I am particularly interested in a second vignette that provides compelling biological examples motivating the package. Respond here or as issues on the repository! - Attachment (Zenodo): Predictions for AlphaMissense > This repository provides AlphaMissense predictions. Please see the README for more details. For questions about AlphaMissense or the prediction database please email mailto:alphamissense@google.com|alphamissense@google.com.

Ludwig Geistlinger (16:10:43) (in thread): > I am thinking about adding a vignette on visualization of the variant landscape and regions of interest using gosling.

Hervé Pagès (21:26:16) (in thread): > Parentheses are missing after installed.packages: > > if (!"BiocManager" %in% rownames(installed.packages)) > install.packages("BiocManager", repos = "[https://cran.r-project.org](https://cran.r-project.org)") > > Confusingly, the above actually seems to work but it will always re-install BiocManager.
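The reason the buggy version "seems to work" is that installed.packages without parentheses is the function object itself; rownames() of a function is NULL, so the %in% test is always FALSE and the install branch always runs. A minimal corrected sketch (the requireNamespace() variant is an alternative idiom, not from the thread):

```r
## rownames() of the function object is NULL, so the buggy test is
## always FALSE and the install always triggers:
stopifnot(is.null(rownames(installed.packages)))

## Corrected: call installed.packages() to get the matrix of installed
## packages and test against its row names:
if (!"BiocManager" %in% rownames(installed.packages()))
    install.packages("BiocManager", repos = "https://cran.r-project.org")

## Lighter-weight alternative that avoids scanning every library:
## if (!requireNamespace("BiocManager", quietly = TRUE))
##     install.packages("BiocManager", repos = "https://cran.r-project.org")
```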

2023-10-05

James Hennessy (11:25:14): > @James Hennessy has joined the channel

James Hennessy (11:30:46): > hey, does anybody know the proxy URL for bioconductor?

Lori Shepherd (11:36:28): > can you be more specific? what are you trying to do?

James Hennessy (12:59:38): > set up a nexus proxy to download bioconductor packages

James Hennessy (12:59:49): > will http://www.bioconductor.org work?

Lori Shepherd (13:07:55): > so you’re downloading packages outside of R? Are you using current R or a legacy version?

2023-10-06

Jake Fanale (15:12:39): > @Jake Fanale has joined the channel

Martin Morgan (18:35:16) (in thread): > I’ve been made aware of the #alpha-missense channel, where other conversations continue

2023-10-08

Ramon Massoni-Badosa (18:27:48): > @Ramon Massoni-Badosa has joined the channel

2023-10-10

Joseph (12:20:36): > Hi everyone, I seem to be getting an error on kunpeng2 for my package “OutSplice” in the devel branch, and I am having some trouble figuring out what is wrong. It seems to be working on every other system. I have seen several other posts involving errors on kunpeng2, so I am not sure if this is something I need to look into fixing. If this does need correcting on my end, when is the deadline to get it fixed for the 3.18 release? > > Build Report: https://bioconductor.org/checkResults/devel/bioc-LATEST/OutSplice/kunpeng2-checksrc.html

Andres Wokaty (12:57:52) (in thread): > Hi @Joseph, we recommend that you try to address the issue on Linux arm64; there’s some guidance at https://blog.bioconductor.org/posts/2023-07-14-linux-arm64-github-actions/ and at https://blog.bioconductor.org/posts/2023-06-09-debug-linux-arm64-on-docker/. OutSplice will still be included in the 3.18 release. - Attachment (blog.bioconductor.org): Bioconductor community blog - Testing Packages on Linux ARM64 with GitHub Actions > How to use GitHub Actions to systematically build and test a Bioconductor package on Linux ARM64 architecture. - Attachment (blog.bioconductor.org): Bioconductor community blog - Emulated build and test of Bioconductor packages for Linux ARM64 > Build and test for Linux ARM64 with Docker on x86_64 host

Stevie Pederson (20:39:03) (in thread): > Hi @Joseph. I had a similar problem which resolved itself after a couple of days with no input from me. base::assign(".ptime", proc.time(), pos = "CheckExEnv") was the common line in my build error. I also followed the above and built a local Docker image for testing, with everything passing on my local Docker. However, when following the above, you will need to install all of your package dependencies before running R CMD build xyz, and for me that took 6 hours. So if you’re savvy enough with Docker images (I’m not), I’d highly recommend mounting a local directory for the R packages on your Docker image and setting that to be where .libPaths() looks. I wasn’t smart enough, so I had to go through the 6hr install twice and just made sure I had the second attempt at installations running from the time I arrived at the office until I tried build testing late in the day. You will also need to install pdflatex on that Docker image (apt install texlive-latex-extra) to ensure R CMD check runs successfully

2023-10-11

Martin Grigorov (03:24:15) (in thread): > The issue is easily reproducible on kunpeng2. I can test patches if this would be easier for you!

Romane Libouban (05:41:35): > @Romane Libouban has joined the channel

2023-10-12

Louis Le Nézet (07:10:24): > Hi everyone, > I’m looking for the simplest way to link one vignette to another. vignette("rectangle", package = "tidyr") works, but I would like something like a hyperlink [Rectangle vignette](#something). > Does anyone know how to do it? > Thanks!

Lluís Revilla (07:29:19): > The first option is the safest, as some vignettes are PDFs. But if you know where the vignette is hosted (PDF or not) you could use that as the URL.

Joseph (13:41:09) (in thread): > Thanks for the replies everyone! Similar to what happened with @Stevie Pederson, it seems that the issue has resolved itself for now.

2023-10-13

Steffen Neumann (07:51:11): > Hi, we have an organically grown Shiny app, and I want to improve modularisation and separate logic and GUI. Do we have good examples in BioC?

Steffen Neumann (07:53:23) (in thread): > I am thinking about putting all logic into an R package and having the GUI separate. I’d love to have everything in one GitHub repo to maintain consistency, so I’d like to avoid mypackage and mypackageGUI, although the latter would simplify (Shiny) dependency management.

Jared Andrews (09:08:09) (in thread): > Look at iSEE or genetonic.

Jared Andrews (09:09:16) (in thread): > iSEE takes the breaking things up to the extreme, but it’s probably the most complex one in Bioconductor by far.

Jared Andrews (09:09:51) (in thread): > Also see the guidelines for Shiny app packages in the contributions book

Mike Smith (09:09:58) (in thread): > One example that I quite like is openPrimeRui (https://code.bioconductor.org/browse/openPrimeRui/tree/devel/). The Shiny app is placed in inst/shiny and the code is in the standard R directory.

Martin Morgan (11:12:19) (in thread): > does code in inst/shiny get checked by R CMD check? I would think not, which would not be a good thing, based on the innumerable mistakes I make that get caught by R CMD check

Jared Andrews (11:17:11) (in thread): > The guidelines currently suggest a function that returns a Shiny app rather than sticking it in inst/shiny. https://contributions.bioconductor.org/shiny.html
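A minimal sketch of the "function returning a Shiny app" pattern mentioned above (the function name myPackageApp and the toy UI are invented for illustration):

```r
library(shiny)

## The package exports one function that builds and returns the app;
## all real logic lives in ordinary, testable package functions, and
## this code sits in R/ where R CMD check sees it.
myPackageApp <- function() {
    ui <- fluidPage(
        numericInput("n", "Observations", value = 50, min = 1),
        plotOutput("hist")
    )
    server <- function(input, output, session) {
        output$hist <- renderPlot(hist(rnorm(input$n)))
    }
    shinyApp(ui, server)
}
```

The user then launches the app with myPackageApp().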

Dirk Eddelbuettel (11:45:44) (in thread): > Years ago, and way before golem and whatnot, I went this route of splitting ‘core’ functionality into a package for testing standalone etc. with as lite a Shiny shim as possible. But I haven’t done real Shiny work in years.

Steffen Neumann (12:01:02) (in thread): > Thanks for the replies; since more complex apps will have images, CSS, static stuff etc., I like the openPrimeR approach Mike mentioned.

Jared Andrews (12:03:14) (in thread): > I have a package/app in review that has both images and CSS if you’ve any interest: https://github.com/j-andrews7/CRISPRball

2023-10-14

Jake Fanale (17:07:16): > Hey everyone! I am working on developing a package, and I’m trying to add it to the Bioconductor repo. However, I can’t find my package on https://git.bioconductor.org/. How would I resolve this?

Sean Davis (17:10:32) (in thread): > Hi, Jake. Has your package already been submitted for review to the Contributions Submission Tracker (https://github.com/Bioconductor/Contributions)?

Jake Fanale (17:11:09) (in thread): > It has, here: https://github.com/Bioconductor/Contributions/issues/3097 - Attachment: #3097 CytoMethIC > Update the following URL to point to the GitHub repository of
> the package you wish to submit to Bioconductor > > • Repository: https://github.com/zhou-lab/CytoMethIC > > Confirm the following by editing each check box to ‘[x]’ > > I understand that by submitting my package to Bioconductor,
> the package source and all review commentary are visible to the
> general public. > I have read the Bioconductor Package Submission
> instructions. My package is consistent with the Bioconductor
> Package Guidelines. > I understand Bioconductor <https://bioconductor.org/developers/package-submission/#naming|Package Naming Policy> and acknowledge
> Bioconductor may retain use of package name. > I understand that a minimum requirement for package acceptance
> is to pass R CMD check and R CMD BiocCheck with no ERROR or WARNINGS.
> Passing these checks does not result in automatic acceptance. The
> package will then undergo a formal review and recommendations for
> acceptance regarding other Bioconductor standards will be addressed. > My package addresses statistical or bioinformatic issues related
> to the analysis and comprehension of high throughput genomic data. > I am committed to the long-term maintenance of my package. This
> includes monitoring the support site for issues that users may
> have, subscribing to the bioc-devel mailing list to stay aware
> of developments in the Bioconductor community, responding promptly
> to requests for updates from the Core team in response to changes in
> R or underlying software. > I am familiar with the Bioconductor code of conduct and
> agree to abide by it. > > I am familiar with the essential aspects of Bioconductor software
> management, including: > > ☑︎ The ‘devel’ branch for new packages and features. > ☑︎ The stable ‘release’ branch, made available every six
> months, for bug fixes. > ☑︎ Bioconductor version control using Git
> (optionally via GitHub). > > For questions/help about the submission process, including questions about
> the output of the automatic reports generated by the SPB (Single Package
> Builder), please use the #package-submission channel of our Community Slack.
> Follow the link on the home page of the Bioconductor website to sign up.

Sean Davis (17:14:01) (in thread): > Ah. Looks like the review might not be fully completed. I’d suggest following up with a comment on the issue tracker.

Jake Fanale (17:14:34) (in thread): > Okay. When is the deadline for packages to be finished with the review process?

Jake Fanale (17:17:07) (in thread): > Never mind, found the article about it being due this coming Wednesday. Thank you for your help Sean!

Sean Davis (17:18:16) (in thread): > And thank you for what looks like a nice contribution!

2023-10-18

Jake Fanale (10:26:23): > Hi, I’m getting an issue where I can’t push to my upstream/devel because I don’t have permission to do so (even though I am the package maintainer and created the branch myself). Is there a known fix for this? - File (PNG): image.png

Lori Shepherd (10:39:52) (in thread): > I think I see the issue on our config side. Hold tight and I’ll let you know when to try again

Lori Shepherd (10:43:00) (in thread): > @Jake Fanale I think the issue is fixed. Please try again

Jake Fanale (10:45:23) (in thread): > It seems to have worked, thank you!

Lori Shepherd (10:45:44) (in thread): > sorry for the temporary inconvenience. glad it’s working now. cheers

2023-10-25

Lori Shepherd (16:13:29): > Bioconductor Core Team is pleased to release Bioc 3.18! Thank you to all developers and community members for contributing to the project. The full release announcement can be found at: https://bioconductor.org/news/bioc_3_18_release/

2023-10-26

Ramon Massoni-Badosa (09:48:36) (in thread): > thanks to the Bioconductor Core Team for your incredible work:tada:

2023-10-30

Robert Castelo (06:22:54): > Hi, I can see at the Bioc Docker Hub that three days ago the images for the new release 3_18 were pushed, but not the devel image/tag. Is there a plan for this to happen in the coming days, or is this a glitch in the pipeline that prevents the devel image from being pushed to the Docker Hub?

Vince Carey (08:56:14): > we are aware, thanks for patience

2023-10-31

Chiagoziem David (08:55:17): > @Chiagoziem David has joined the channel

2023-11-01

Marcel Ramos Pérez (11:07:14) (in thread): > Hi Robert! The devel image bioconductor/bioconductor_docker:devel-amd64 is available now. We are still resolving some issues with the arm64 one. Thanks for your patience.

Robert Castelo (13:30:58) (in thread): > Great! It’s working! Thanks for all the efforts!!

2023-11-07

LAR (14:44:55): > @LAR has joined the channel

LAR (14:54:04): > Hello everyone: > > As shown here on the Bioconductor website, my package is receiving a general error of: > > there is no package called 'BiocStyle' > > My package only uses the BiocStyle package at the top of .Rmd files in the vignettes folder in the format: > > --- > title: 'Manuscripts' > package: pkgName > bibliography: pkgName.bib > output: > BiocStyle::html_document: > toc_float: true > tidy: TRUE > border-width: 5px > vignette: > > \usepackage[utf8]{inputenc} > %\VignetteIndexEntry{"Title"} > %\VignetteEngine{knitr::rmarkdown} > %\VignetteEncoding{UTF-8} > %\VignettePackage{pkgName} > --- > > First, I added BiocStyle to the “Imports” part of my DESCRIPTION file based on instructions here. However, the same error remained. > > Second, I followed advice here, and added BiocStyle to the “Suggests” part of my DESCRIPTION file (which appears as below with more packages called). However, the same error remains. > > Suggests: > BiocStyle (>= 3.18), > BiocGenerics (>= 0.29.1) > > I’m pretty lost on how to resolve this problem, and would greatly appreciate some advice. Thank you again.

Hervé Pagès (15:01:23) (in thread): > Did you push your changes to the Bioconductor git server? Looking at the bigPint repo there (git clone https://git.bioconductor.org/packages/bigPint), I don’t see BiocStyle in the Suggests field (I checked the devel and RELEASE_3_18 branches).

LAR (15:34:48) (in thread): > Thanks, Hervé. No, I’ve not pushed the changes to the Bioconductor git server. I’ve been running devtools::check() and devtools::build() on my local computer to check if it solves it before pushing.

Hervé Pagès (16:31:35) (in thread): > And adding BiocStyle to Suggests doesn’t solve the problem on your local computer? Of course you want to make sure that BiocStyle is installed before running devtools::check() or devtools::build().

Nick R. (21:08:14): > Hi all, I was wondering if there is a list of AnnotationData, ExperimentData and Workflow packages somewhere, like the list of Software packages that is available?

Lori Shepherd (21:53:43): > Yes: https://bioconductor.org/packages/devel/data/experiment/src/contrib/PACKAGES https://bioconductor.org/packages/devel/data/annotation/src/contrib/PACKAGES https://bioconductor.org/packages/devel/workflows/src/contrib/PACKAGES

Nick R. (21:56:49): > Thanks!

Nick R. (22:23:45) (in thread): > How are packages included in these lists? spicyWorkflow is not in the workflow list. I’m guessing this might be because it has been failing for a while. > On the other hand, scFeatures has been failing too but is still in the package list for software. > > Is there a list somewhere that includes all packages, regardless of build status?

2023-11-08

Lori Shepherd (07:05:34) (in thread): > The VIEWS file should, I think, include packages regardless of build status

Lori Shepherd (07:07:52) (in thread): > https://www.bioconductor.org/packages/release/bioc/VIEWS There is one for release and devel, with the same schema as above, so change bioc to data/experiment, data/annotation, or workflows

Michael Love (08:16:49): > Just noticed that SparseArray and DelayedArray don’t have arm64 binaries. I may have missed an announcement elsewhere, but is this known/intended? https://bioconductor.org/packages/release/bioc/html/SparseArray.html https://bioconductor.org/packages/release/bioc/html/DelayedArray.html - File (PNG): Screenshot 2023-11-07 at 7.35.57 PM.png - File (PNG): Screenshot 2023-11-07 at 7.37.28 PM.png

Lori Shepherd (08:27:50) (in thread): > https://bioconductor.org/checkResults/release/bioc-mac-arm64-LATEST/SparseArray/kjohnson1-checksrc.html SparseArray has been in ERROR state since release, so it is not available yet; DelayedArray depends on SparseArray, so it is not available either

Michael Love (08:42:20) (in thread): > got it, thanks for update Lori

Lori Shepherd (08:42:51) (in thread): > I think @Hervé Pagès has pushed up some changes recently; I don’t know if it was specific to this or not

Hervé Pagès (11:11:41) (in thread): > To be more precise: if you don’t have BiocStyle installed, R CMD build and R CMD check will always produce the “there is no package called ‘BiocStyle’” error that you see on nebbiolo1, whether BiocStyle is in Suggests or not, because BiocStyle is used in your vignette and R CMD build or R CMD check won’t install it automatically. Once you install BiocStyle, this error will go away for you, but, if BiocStyle is not in Suggests, you will still see it on nebbiolo1, because of the _R_CHECK_SUGGESTS_ONLY_=true setting we use on this machine. The reason we use this setting is to detect undeclared dependencies like this. Hope this helps.
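[Editor’s note] For readers hitting the same error: the fix on the package side is simply declaring the vignette dependency in DESCRIPTION. A minimal sketch (the field contents are illustrative, not bigPint’s actual DESCRIPTION):

```
Suggests:
    BiocStyle,
    knitr,
    rmarkdown
VignetteBuilder: knitr
```

The strictness used on the build machine can be reproduced locally by setting `_R_CHECK_SUGGESTS_ONLY_=true` in the environment before running `R CMD check`, which makes any undeclared suggested dependency fail the check just as it does on nebbiolo1.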

Hervé Pagès (11:56:59) (in thread): > Yes it was. My bad that I didn’t see this Mac arm64 specific error earlier. > The Mac arm64 binaries should become available tomorrow (it always takes longer for these binaries to propagate because of the sluggishness of kjohnson1, our Mac arm64 builder). > Sorry for the inconvenience.

Hervé Pagès (13:11:18) (in thread): > I was able to take some shortcuts. The Mac arm64 binaries for SparseArray and DelayedArray are now available in BioC 3.18. Many more Mac arm64 binaries that depend on those will become available tomorrow.

Michael Love (15:54:02) (in thread): > thanks!

Nick R. (17:09:49) (in thread): > Thanks! That’s great

Hervé Pagès (17:19:04) (in thread): > kjohnson1 finished earlier than expected (https://bioconductor.org/checkResults/3.18/bioc-mac-arm64-LATEST/) and 100+ Mac arm64 binaries just propagated.

LAR (21:02:23) (in thread): > Thank you so much, Hervé. This has been really helpful, and it did resolve the problem! I’m now fixing a few more issues and hope to push the changes to Bioconductor. I’ll post here once it is complete. Thanks again!:slightly_smiling_face:

2023-11-09

LAR (05:38:13) (in thread): > Hello Hervé. I believe the issue was resolved and that I pushed the changes to the devel and release versions on Bioconductor using instructions here. > > I didn’t get any errors when pushing changes, but I’m not sure how successful it was. It seems not much changed on the main Bioconductor page for my package (here). > > If I should ask about this somewhere else, please let me know. And thanks again for your advice. - Attachment (Bioconductor): bigPint > Methods for visualizing large multivariate datasets using static and interactive scatterplot matrices, parallel coordinate plots, volcano plots, and litre plots. Includes examples for visualizing RNA-sequencing datasets and differentially expressed genes.

LAR (09:30:21) (in thread): > Shucks. I’ve been following instructions for doing a version bump under the “New package workflow” section (here), and made it up to point 5 (“Push changes to the Bioconductor and GitHub repositories”). > > However, upon attempting the line > > git push upstream devel > > I receive the error: “! [remote rejected] devel -> devel (pre-receive hook declined) > error: failed to push some refs to ‘git.bioconductor.org:packages/bigPint.git’”

LAR (09:32:34) (in thread): > I see in Bioc-devel mail archives (here) that this error can be something on the Bioconductor developer side. Should I post my issue somewhere in particular to determine if it is a problem on my end or Bioconductor end? > > Apologies for consecutive issues here. I’m happy to post in a more appropriate location if needed. > > Thank you again for your time and advice.

Hervé Pagès (11:49:02) (in thread): > Yes, please ask on the bioc-devel mailing list or in the #bioc_git channel. Thanks!

2023-11-14

Federico Marini (19:46:19) (in thread): > biased view here, since I use it for all my shiny apps (GeneTonic being indeed one of them): I think it is better to have the code of the app also inside the R/ folder itself. The potential benefit of code being caught is one of the reasons.

Federico Marini (19:46:39) (in thread): > plus: the function becomes “fully exportable” if this is what you would like to do

Federico Marini (19:47:25) (in thread): > and yes, iSEE is probably a bit over the top if you start down the path of de-coupling things

Federico Marini (19:48:15) (in thread): > as for the “extra content”: it is all very possible to have the app basically anywhere and have these components acknowledged and recognized/used by the app no matter where that is

Jared Andrews (19:55:47) (in thread): > I also prefer it be a function in /R. It just makes it easier to find for those interested in how it works under the hood.

2023-11-16

Francesc Català-Moll (13:05:26): > Hello everyone, > > I would like to open a discussion about the use of classes in package development for Bioconductor. I have followed Bioconductor’s guidelines and used existing S4 classes for my package. However, during the review, I was suggested to use a more modern class. > > I understand that this suggestion may be intended to improve the compatibility or performance of my package in the future. However, this would require a significant rewrite of my code, which is challenging given the time I have already invested in developing my package using a Bioconductor class. > > I think it would be helpful if Bioconductor could tag classes to help developers make more informed decisions about which classes to use. Some possible tags could be: > > 1. “Deprecated”: For classes that are no longer recommended for use. > 2. “Legacy”: For classes that are still used in some existing packages, but are not recommended for new developments. > 3. “Recommended”: For classes that are recommended for use in the development of new packages. > 4. “Experimental”: For classes that are in development or testing phase. > > This could prevent situations where developers invest time in using a class, only to find out later that there is a newer or preferred class available. > > I would like to hear what others think about this topic. Have you had similar experiences? Do you think it would be helpful to have tags indicating the status of classes? > > Thank you for your time and I look forward to your feedback.

Hervé Pagès (16:27:06) (in thread): > Hi Francesc, thanks for your input. > This kind of labelling would be a good starting point for the Classes and Methods Working Group: https://github.com/Bioconductor/BiocClassesWorkingGroup Note that interoperability within the Bioconductor ecosystem can be greatly improved by the ability to coerce from one class to the other. Unfortunately, in the case of phyloseq vs TreeSummarizedExperiment, I don’t see that coercion from one class to the other is supported at the moment. If this coercion makes sense, maybe it’s something that we could request from the authors/maintainers of the TreeSummarizedExperiment package. I’ll check with them. > With this functionality, users of your phyloseq-centric package will get access to all the TreeSummarizedExperiment-centric tools available in other packages. Hopefully this will help mitigate your reviewer’s concerns about the “lack of modernity” of the phyloseq class and spare you a costly refactoring.

2023-11-17

Sudarshan (00:51:22) (in thread): > @Leo Lahti @Tuomas Borman the mia R/BioC package based on TreeSE has this functionality: https://www.bioconductor.org/packages/release/bioc/html/mia.html - Attachment (Bioconductor): mia > mia implements tools for microbiome analysis based on the SummarizedExperiment, SingleCellExperiment and TreeSummarizedExperiment infrastructure. Data wrangling and analysis in the context of taxonomic data is the main scope. Additional functions for common task are implemented such as community indices calculation and summarization.

2023-11-20

Almog Angel (11:21:52): > Hey everyone, > I’m curious about how you all handle memory management in package development. Are there specific strategies, packages, or functions you’ve found particularly effective? > Also, I’m interested in your thoughts on using gc() to free memory within the package; is it recommended? > Looking forward to your insights.

Alan O’C (12:22:13) (in thread): > It’s generally not recommended to use gc() (e.g. http://adv-r.had.co.nz/memory.html#gc)

Hervé Pagès (12:22:17) (in thread): > If your package only contains R code (i.e. it doesn’t contain C or C++ code), there’s generally no need to worry about memory management. Of course you still want to pay close attention to the memory usage of your functions and to the memory footprint of your objects. If they seem excessive, then some investigation is required. In this case we would need some details in order to be able to provide more useful feedback.

Vince Carey (12:22:56): > I’d say that in general we let R do what it likes in the space of memory management. Very large memory images are problematic, so out-of-memory data management with SQLite or HDF5 is common. The Rcollectl package produces traces of resource consumption that can be timestamped. The profvis package is also useful. As for gc(), the code browser code.bioconductor.org is useful; I searched for gc\( and found quite a few instances. I see a couple of nice thread responses now.

Almog Angel (12:26:52) (in thread): > I found out that the main problem with memory in my package was caused by running a function that fits an ML model. In my package, I generate simulation data, which is a large list of numeric matrices (>5GB), that I use to train the model. The problem arises when I introduce parallel processing. For example, with 10 CPUs the memory usage might be >50GB. Are there solutions for that?

Alan O’C (12:28:32) (in thread): > You probably need to be careful what’s being passed to each of the parallel processes. On some systems with some parallel options, each process will get a copy of the entire R workspace. If each of those contains the full list of numeric matrices, you’ll get 10 copies of the 5GB list, plus another copy for each process whenever the list for that process gets modified

Almog Angel (12:32:17) (in thread): > Sorry about the following question, I am not a computer scientist: Is there a way to run parallel processes that all use the same source of memory instead of each one using a copy of the entire data?

Alan O’C (12:32:58) (in thread): > Herve can almost certainly remember the specifics better than I :)

Vince Carey (12:33:05) (in thread): > https://bioconductor.org/packages/release/bioc/html/SharedObject.html - Attachment (Bioconductor): SharedObject > This package is developed for facilitating parallel computing in R. It is capable to create an R object in the shared memory space and share the data across multiple R processes. It avoids the overhead of memory dulplication and data transfer, which make sharing big data object across many clusters possible.

Vince Carey (12:33:36) (in thread): > not the only way but worth knowing about

Almog Angel (12:33:44) (in thread): > :heart:Thanks!

Almog Angel (12:34:49) (in thread): > Finally I can test my package without killing our lab server:joy:

Hervé Pagès (12:37:44) (in thread): > @Almog Angel Does each worker need access to the entire list of matrices to do the job? If not, then you should be able to give them only the subset of the big list that they need. This would avoid generating 10 copies of the big list.

Almog Angel (12:44:19) (in thread): > @Hervé Pagès Actually the way I do that is more complex. > I use parallel::mclapply to run through the list of matrices with X cores. Within the mclapply I use a function that trains the ML model, which also supports multiple cores. > I ran multiple tests and found that the fastest is to use 0.75X of the cores for mclapply and 0.25X of the cores for the ML training function. However, I did not test what happens with the memory.

Hervé Pagès (12:55:46) (in thread): > We recommend the use of BiocParallel instead of parallel. > Be aware that if you use 10 workers for parallel::mclapply and, say, 4 workers for the ML training function, then you’re using 10 x 4 cores in total! That’s what happens with nested parallelization. > Is your function that trains the ML model parallelized at the R level (i.e. via BiocParallel or parallel) or at the C/C++ level? My understanding is that the latter will generally be able to use a shared memory approach out-of-the-box.

Almog Angel (12:58:59) (in thread): > Thanks, I will check BiocParallel now. The function I am using is the following: > > options(rf.cores=nRFcores, mc.cores=1) > model <- randomForestSRC::var.select(frac ~ ., as.data.frame(data), method = "vh.vimp", verbose = FALSE, refit = TRUE, fast = TRUE) > > They ask you to define options for how many RF cores and how many MC cores.

Hervé Pagès (13:21:13) (in thread): > Using BiocParallel::bplapply with 10 workers should not generate 10 copies of the entire list of matrices. Only copies of the individual list elements that are being processed by the 10 workers at any given time. So if your list is big (e.g. 200+ elements), these copies are not going to significantly increase memory usage (only by 5%). > However, you also want to make sure that you are not passing some big object to the function passed to BiocParallel::bplapply (FUN argument). If FUN has additional arguments, then keep in mind that all the objects passed to these arguments will be copied 10 times. > Looks like randomForestSRC is parallelized at the C level (via OpenMP), so it should be able to do shared-memory parallel computing. Maybe you could try using fewer workers at the BiocParallel::bplapply or parallel::mclapply level and more workers at the ML training level and see if that reduces memory usage (maybe at the cost of a small slowdown, but that’s still better than killing your lab server:wink:).
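[Editor’s note] The suggested split could be sketched like this in R (a hypothetical rework of the loop from this thread; `big_list`, the worker counts, and the 16-core budget are illustrative assumptions, not tested values):

```r
library(BiocParallel)

fit_one <- function(mat, rf_cores) {
  ## let randomForestSRC parallelize internally via OpenMP
  options(rf.cores = rf_cores, mc.cores = 1)
  randomForestSRC::var.select(frac ~ ., as.data.frame(mat),
                              method = "vh.vimp", verbose = FALSE,
                              refit = TRUE, fast = TRUE)
}

## 4 R-level workers x 4 OpenMP threads = 16 cores in total;
## bplapply only ships the list element each worker is processing,
## not the whole big_list
models <- bplapply(big_list, fit_one, rf_cores = 4,
                   BPPARAM = MulticoreParam(workers = 4))
```

Shifting cores from the R level (where each worker forks a process) to the OpenMP level (where threads share memory) is what reduces the peak memory footprint.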

Almog Angel (13:26:14) (in thread): > @Hervé Pagès Thanks! I am changing my code to BiocParallel as we speak. I understand that the recommended environment for Linux/Mac is MulticoreParam. I will let you know if there is improvement in memory usage :)

2023-11-21

Mark Robinson (04:32:08): > @Mark Robinson has joined the channel

2023-11-27

Changqing (20:16:44): > @Changqing has joined the channel

2023-11-28

Steffen Neumann (16:21:28): > Hi, tearing my hair out on this: My Dockerfile has RUN R -e 'devtools::install_deps("myPackage")' in it. When a package fails to install, the docker build silently continues (resulting in a broken build)

Steffen Neumann (16:23:03) (in thread): > you can reproduce with > > R -e 'install.packages("jhgjhg")' ; echo $? > > which will yield 0. Any way to elegantly solve that without lengthy tryCatch stuff in the R code?

Dirk Eddelbuettel (16:31:01) (in thread): > Maybe options(warn=2) to die sooner? > > And/or maybe fight the cause? (I.e. find which package(s) fail to install and see why?) That said, I am having pretty good luck with my r2u system in both containers, automated CI runs (for work and for my projects) and for interactive debugging. install_deps() is really a remotes function these days, and I have that wrapped in littler’s installDeps.r, which I use often. There are nearly 400 BioC packages in r2u, and the alpha-stage bioc2u has (last I heard) about 3800 for near complete coverage. - Attachment (eddelbuettel.github.io): CRAN as Ubuntu Binaries - r2u > Easy, fast, reliable – pick all three!

Steffen Neumann (17:59:18) (in thread): > Fighting the cause is what I want … but I didn’t see the failure until the very end, because the earlier RUN silently didn’t fail. The options() is what I needed

Henrik Bengtsson (21:43:16) (in thread): > > Rscript -e 'pkg="jhgjhg"; install.packages(pkg); loadNamespace(pkg)' > > should exit with 1 if the installation fails. > > PS. See also https://github.com/HenrikBengtsson/Wishlist-for-R/issues/34 - Attachment: #34 WISH: install.packages had option to throw error code if install fails > When installing packages using scripts (e.g. Dockerfiles) it is nice to have install.packages throw an error instead of a warning if the package installation fails. One can simulate this of course using withCallingHandlers(warning = stop), but it’s not ideal to treat any warning as an error; since some warnings are just that and do not mean that the package installation has failed. Or maybe there’s already a good work-around for this I’m just overlooking? > > Thanks for any ideas and for maintaining this Wishlist!
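[Editor’s note] Applied to the Dockerfile case from this thread, the two suggestions could look like this (a sketch only; "myPackage" is a placeholder, and these lines are untested):

```dockerfile
# Option 1 (options(warn=2)): promote install warnings to errors,
# so the Rscript process exits non-zero and the build step fails.
RUN Rscript -e 'options(warn = 2); devtools::install_deps("myPackage")'

# Option 2 (loadNamespace): verify a package actually loads after
# installing it; loadNamespace() errors out if the install failed.
RUN Rscript -e 'pkg <- "somePackage"; install.packages(pkg); loadNamespace(pkg)'
```

Either way the `RUN` step propagates a non-zero exit code, so `docker build` stops at the failing layer instead of producing a broken image.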

2023-12-06

Mikhail Dozmorov (11:41:49): > I’m running into a series of issues with Bioconductor 3.18, on R 4.3.2, Mac (Intel). > 1. BiocManager::install("ChAMP") fails: “package ‘ChAMP’ is not available for Bioconductor version ‘3.18’”. I can install it from the binary archive: https://bioconductor.org/packages/ChAMP/ > 2. Dependencies of ChAMP fail to install as also “not available”, e.g., DMRcate. > 3. The DSS package, one of DMRcate’s dependencies, cannot be run because “DSS.so cannot be opened because the developer cannot be verified.” > 4. The DSS package cannot even be installed via conda because it fails to download from the BioC 3.18 repository. Example: “ERROR: post-link.sh was unable to download any of the following URLs: https://bioconductor.org/packages/3.18/data/annotation/src/contrib/GenomeInfoDbData_1.2.11.tar.gz”. > Are these issues specific to ChAMP installation, or more general BioC 3.18 issues? Not sure where else to ask.

Lori Shepherd (11:47:03) (in thread): > https://support.bioconductor.org/p/9155456/#9155482 The DMRcate maintainer has altered the package to build without the DSS dependency, and both DMRcate and ChAMP propagated on today’s build and should become available. https://bioconductor.org/checkResults/release/bioc-LATEST/DMRcate/ and https://bioconductor.org/checkResults/release/bioc-LATEST/ChAMP/

Lori Shepherd (11:48:44) (in thread): > DSS, because it is failing on Linux, won’t be available for any platform. They have been contacted but so far have been unresponsive

Mikhail Dozmorov (11:50:44) (in thread): > Will try the new DMRcate tomorrow. If it will work without DSS, that should solve the issues. Thank you!

2023-12-18

Francesc Català-Moll (05:19:34): > Hello everyone, > > I’ve noticed something in the documentation that I’d like to discuss. According to the documentation, R CMD check is expected to last less than 40 minutes. However, in the build report, I’ve observed that a timeout error (TIMEOUT) occurs when R CMD check exceeds 15 minutes. Here’s the error message I found: > > TIMEOUT: Running testthat.R ERROR TIMEOUT: R CMD check exceeded 15 mins > > Could someone clarify this discrepancy? > > Thank you for your help.

Lori Shepherd (07:23:11): > Is this for an already accepted package or one that is currently being submitted? On the daily builder we have increased the timeout limit for various reasons; the single package builder still uses a stricter time limit, as packages should run efficiently (remember we have over 2000 packages to build nightly). If the issue is testing, may I suggest moving some long-running tests to the long test format that only runs once a week? See http://contributions.bioconductor.org/long-tests.html.

Francesc Català-Moll (09:34:31) (in thread): > Dear@Lori Shepherd, > > Thank you for your suggestion and assistance with our package that is currently under review (Bioconductor Contributions Issue #3158). We understand the importance of package build efficiency, especially considering the large number of packages that are built daily. > > We have increased the coverage of the package as requested, without using the examples in the computation. This has resulted in an overlap as all exported functions now have their own examples and tests, which has increased the build time. > > We appreciate your suggestion to move some long-running tests to the long test format that only runs once a week. We will consider this option and see how we can implement it to improve the efficiency of our package. > > Thank you again for your help. > > Best

2023-12-19

Johannes Rainer (02:26:39): > Hi all, > I’m having issues with the current Bioconductor docker image (from Friday 15.12.). rlang failed to update: > > > BiocManager::install("rlang") > 'getOption("repos")' replaces Bioconductor standard repositories, see > 'help("repositories", package = "BiocManager")' for details. > Replacement repositories: > CRAN: https://cloud.r-project.org > Bioconductor version 3.19 (BiocManager 1.30.22), R Under development (unstable) > (2023-12-13 r85679) > Installing package(s) 'rlang' > trying URL 'https://cloud.r-project.org/src/contrib/rlang_1.1.2.tar.gz' > Content type 'application/x-gzip' length 763521 bytes (745 KB) > ================================================== > downloaded 745 KB > > * installing *source* package 'rlang' ... > ** package 'rlang' successfully unpacked and MD5 sums checked > ** using staged installation > ** libs > using C compiler: 'gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0' > gcc -I"/usr/local/lib/R/include" -DNDEBUG -I./rlang/ -I/usr/local/include -fvisibility=hidden -fpic -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -c capture.c -o capture.o > gcc -I"/usr/local/lib/R/include" -DNDEBUG -I./rlang/ -I/usr/local/include -fvisibility=hidden -fpic -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -c internal.c -o internal.o > In file included from internal/internal.c:32, > from internal.c:1: > internal/tests.c: In function 'ffi_test_Rf_warning': > internal/tests.c:104:3: error: format not a string literal and no format arguments [-Werror=format-security] > 104 | Rf_warning(r_chr_get_c_string(msg, 0)); > | ^~~~~~~~~~~~~~~~~~ > internal/tests.c: In function 'ffi_test_Rf_error': > internal/tests.c:108:3: error: format not a string literal and no format arguments [-Werror=format-security] > 108 |
Rf_error(r_chr_get_c_string(msg, 0)); > | ^~~~~~~~~~~~~~ > internal/tests.c: In function 'ffi_test_Rf_warningcall': > internal/tests.c:113:3: error: format not a string literal and no format arguments [-Werror=format-security] > 113 | Rf_warningcall(call, r_chr_get_c_string(msg, 0)); > | ^~~~~~~~~~~~~~~~~~~~~~~~~~ > internal/tests.c: In function 'ffi_test_Rf_errorcall': > internal/tests.c:117:3: error: format not a string literal and no format arguments [-Werror=format-security] > 117 | Rf_errorcall(call, r_chr_get_c_string(msg, 0)); > | ^~~~~~~~~~~~~~~~~~~~~~ > cc1: some warnings being treated as errors > make: *** [/usr/local/lib/R/etc/Makeconf:191: internal.o] Error 1 > ERROR: compilation failed for package 'rlang' > * removing '/usr/local/lib/R/host-site-library/rlang' > > I don’t know if that’s an issue of rlang or of the docker? Anybody have a solution to this? sessionInfo from the docker: > > > sessionInfo() > R Under development (unstable) (2023-12-13 r85679) > Platform: x86_64-pc-linux-gnu > Running under: Ubuntu 22.04.3 LTS > > Matrix products: default > BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 > LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0 > > locale: > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C > [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 > [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 > [7] LC_PAPER=en_US.UTF-8 LC_NAME=C > [9] LC_ADDRESS=C LC_TELEPHONE=C > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C > > time zone: Etc/UTC > tzcode source: system (glibc) > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > loaded via a namespace (and not attached): > [1] compiler_4.4.0 >

Mike Smith (06:07:59) (in thread): > The Rf_errorcall and similar functions were updated in the R source a few weeks ago (https://github.com/wch/r-source/commit/360aeff9fc2ea3def1b0370c16239570fac6bdfc). I guess the interface has changed slightly and rlang is no longer using them correctly.

Johannes Rainer (09:58:58) (in thread): > Thanks - so, that means waiting for rlang to be updated… and not using the Bioc docker image at the moment.

Marcel Ramos Pérez (11:17:34) (in thread): > It looks like the rlang binary (from packagemanager) works: > > > library(rlang) > > Attaching package: 'rlang' > > The following object is masked from 'package:base': > > %||% > > > packageVersion("rlang") > [1] '1.1.2' >

Mike Smith (11:22:20) (in thread): > I’m not sure what would happen if you actually tried to use the affected functions, given they’ve been compiled against a different version of R, but it’s probably safe otherwise. You can also turn off the -Werror=format-security flag in the C compiler to get it working from source. Something like CFLAGS=-w in $HOME/.R/Makevars would achieve that. It probably comes with the same caveats as using the binary, but this is just a warning not an error. I think it’s the Ubuntu base image that actually sets this compiler flag. It’s not the default on my system.

2023-12-20

Johannes Rainer (01:39:02) (in thread): > Thanks Mike! that worked!

Stevie Pederson (21:07:51): > Hi all. I’m prepping another package for submission, and a regular comment for new submissions is to request removal of github actions from the devel branch. A common outcome seems to be that people just stop using github actions, which in reality may not be encouraging best practice (assuming using github actions falls under the ‘best-practice’ banner). I know that you can set up a separate branch (e.g. gh-actions) with actions enabled on that branch, then manually copy devel onto the branch gh-actions to run testing before pushing to github/devel or bioconductor/devel. However, the workflow yaml would be devel specific, making testing of bugfixes on the current release branch by manual copying between branches a lot more cumbersome, as the workflow yaml would need to be temporarily rewritten for different R versions and Bioc versions. The general issue of github actions doesn’t seem to be discussed in the Git Version Control chapter of the Contributions book, and I may have poor googling skills, so I haven’t found clear guidance on this idea elsewhere. Is there guidance available somewhere on 1) how to set up gh-actions on a separate branch so that testing is automated without the need for manually copying between branches, or 2) how to manage this using both devel & bugfixes on the current release? > > Also tagging @Lachlan Baer as I know he has a keen interest in this.
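[Editor’s note] One way to scope a workflow to a dedicated branch, as described above, is to restrict its trigger so it never runs on devel. A minimal sketch (branch name, container tag, and steps are illustrative, and this is not official Bioconductor guidance):

```yaml
# .github/workflows/check.yaml, committed only on the 'gh-actions' branch
on:
  push:
    branches: [gh-actions]

jobs:
  check:
    runs-on: ubuntu-latest
    container: bioconductor/bioconductor_docker:devel
    steps:
      - uses: actions/checkout@v4
      - run: R CMD build .
      - run: R CMD check --no-manual *.tar.gz
```

Since the workflow file only exists on that branch and only triggers on pushes to it, the devel branch pushed to git.bioconductor.org stays free of .github/workflows content.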

Sean Davis (21:15:26) (in thread): > I’m curious to hear the rationale as well. .Rbuildignore should handle a .github directory just fine.

2023-12-21

jim rothstein (00:50:33): > @jim rothstein has joined the channel

Amanda Hiser (11:24:19): > Good morning! I’m currently building a pkgdown site for my Bioconductor package. I’d like to include a package logo in the final site, but to do so, pkgdown requires the logo image to be saved in the man/figures/ directory. During package submission, I was told that Bioconductor requires images to be stored in inst/figures/, and that the man/figures/ directory must be removed (which I did). However, this leads to a couple of problems now that I’m working with pkgdown - for example, without a man/figures/ directory, I can’t direct pkgdown to the image files I want to use in the site. I also can’t use pkgdown::build_favicons(), as the function looks for an image in man/figures/ before building the favicons from the package logo. Has anyone else run into this problem? Is there a good workaround for not being able to use a man/figures/ directory when working with pkgdown sites and images?

Vince Carey (11:28:28): > Hi Amanda. Are you using a gh-pages branch for your pkgdown site? This is a good practice to separate the potentially voluminous html and image data from the software per se. The folders you require could be restored for the gh-pages branch but absent for the devel branch.

Vince Carey (11:32:37) (in thread): > Thanks for these comments @Stevie Pederson. We are preparing recommendations on this topic. One reason the specifics have lagged is that we want our approach to be robust to possible changes in our approach to source code repository management. Please use a separate branch for the action workflow at this time.

Amanda Hiser (11:34:47) (in thread): > Yes, I am using a gh-pages branch. I hadn’t considered adding these folders to that branch instead (I’m new to pkgdown, clearly), I will try that! Thank you!

Stevie Pederson (19:31:47) (in thread): > Thanks @Vince Carey. That’s really encouraging to hear. I figure these decisions are usually made with good reason and was keen to follow my understanding of ‘best practice’, so I’ll carry on with that strategy & will look forward to seeing how things play out.

2023-12-28

David Rach (08:45:01): > @David Rach has joined the channel

2023-12-29

Ahmad Al Ajami (04:49:59): > @Ahmad Al Ajami has joined the channel

2024-01-10

Francesc Català-Moll (05:48:28): > Hello everyone, > > I’m experiencing an issue with my package in Bioconductor. For the first time, the build process is giving me the following warning: “* Checking for Bioconductor software dependencies… * WARNING: No Bioconductor dependencies detected; consider a CRAN submission” (build report). > > This is new behavior for my package (issue #3158) and it has taken me by surprise. I’ve noticed that issue #3268 is also experiencing this error, so I suspect it might be something internal to BiocCheck. > > Has anyone else experienced this issue or have any idea how to solve it? Any help would be greatly appreciated. > > Thank you. - Attachment: #3158 dar - Attachment: #3268 Submission of the tidyomics package

Hervé Pagès (08:30:32) (in thread): > The warning message is a little dry but is mostly to let the reviewer know that the package does not seem to be reusing any of the resources already available in the Bioconductor ecosystem (e.g. some classes like SummarizedExperiment or GRanges). Sometimes it’s ok, sometimes it’s not. Only a close examination of the situation by a human will tell. > I don’t think the warning should be a blocker for the review process to move forward though.

Francesc Català-Moll (08:55:18) (in thread): > Thank you for your response @Hervé Pagès. > > What surprises me is that this warning only appears in the build made on the Bioconductor server; it does not appear locally or in GitHub Actions. Could there be a specific configuration on the Bioconductor server that causes this? I would appreciate any clarification on this. > > I am attaching BiocCheck.log and the link to the latest action from the GitHub repository for further reference. Please let me know if you need any additional information. - File (Plain Text): 00BiocCheck.log

Lluís Revilla (09:38:25) (in thread): > Are you using the same environment variables as the SPB (https://github.com/Bioconductor/packagebuilder/blob/devel/check.Renviron)? Often these are overlooked and they are important for CRAN and Bioconductor submissions

Marcel Ramos Pérez (09:51:12) (in thread): > @Francesc Català-Moll The check is meant to run on the SPB where package submissions are reviewed. If you need to reproduce, use BiocCheck::BiocCheckGitClone() on the source package folder

Jacques SERIZAY (11:21:19) (in thread): > Hi all, > Do you guys know if this error is somewhat related? > > * installing **source** package 'S4Arrays' ... > **** using staged installation > **** libs > using C compiler: 'gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0' > gcc -I"/usr/local/lib/R/include" -DNDEBUG -I'/usr/local/lib/R/site-library/S4Vectors/include' -I/usr/local/include -fpic -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -c R_init_S4Arrays.c -o R_init_S4Arrays.o > gcc -I"/usr/local/lib/R/include" -DNDEBUG -I'/usr/local/lib/R/site-library/S4Vectors/include' -I/usr/local/include -fpic -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -c S4Vectors_stubs.c -o S4Vectors_stubs.o > gcc -I"/usr/local/lib/R/include" -DNDEBUG -I'/usr/local/lib/R/site-library/S4Vectors/include' -I/usr/local/include -fpic -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -c abind.c -o abind.o > gcc -I"/usr/local/lib/R/include" -DNDEBUG -I'/usr/local/lib/R/site-library/S4Vectors/include' -I/usr/local/include -fpic -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -c array_selection.c -o array_selection.o > array_selection.c: In function 'C_Lindex2Mindex': > array_selection.c:352:17: error: format not a string literal and no format arguments [-Werror=format-security] > 352 | error(errmsg_buf()); > | ^~~~~~~~~ > array_selection.c: In function 'C_Mindex2Lindex': > array_selection.c:416:17: error: format not a string literal and no format arguments [-Werror=format-security] > 416 | error(errmsg_buf()); > | ^~~~~~~~~ > cc1: some warnings being treated as errors > make: ***** [/usr/local/lib/R/etc/Makeconf:191: array_selection.o] Error 1 > ERROR: compilation failed for package 'S4Arrays' > * removing '/usr/local/lib/R/site-library/S4Arrays' > > I manage to installrlangbut it has been updated 
yesterday to 1.1.3, don’t know if this was addressing the issue raised by@Johannes Rainer. > > > BiocManager::install('rlang') > 'getOption("repos")' replaces Bioconductor standard repositories, see > 'help("repositories", package = "BiocManager")' for details. > Replacement repositories: > CRAN:[https://cloud.r-project.org](https://cloud.r-project.org)Bioconductor version 3.19 (BiocManager 1.30.22), R Under development (unstable) > (2024-01-03 r85769) > Installing package(s) 'rlang' > trying URL '[https://cloud.r-project.org/src/contrib/rlang_1.1.3.tar.gz](https://cloud.r-project.org/src/contrib/rlang_1.1.3.tar.gz)' > Content type 'application/x-gzip' length 763765 bytes (745 KB) > ================================================== > downloaded 745 KB > > * installing **source** package 'rlang' ... > **** package 'rlang' successfully unpacked and MD5 sums checked > **** using staged installation > **** libs > using C compiler: 'gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0' > gcc -I"/usr/local/lib/R/include" -DNDEBUG -I./rlang/ -I/usr/local/include -fvisibility=hidden -fpic -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -c capture.c -o capture.o > gcc -I"/usr/local/lib/R/include" -DNDEBUG -I./rlang/ -I/usr/local/include -fvisibility=hidden -fpic -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -c internal.c -o internal.o > gcc -I"/usr/local/lib/R/include" -DNDEBUG -I./rlang/ -I/usr/local/include -fvisibility=hidden -fpic -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -c rlang.c -o rlang.o > gcc -I"/usr/local/lib/R/include" -DNDEBUG -I./rlang/ -I/usr/local/include -fvisibility=hidden -fpic -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -c version.c -o version.o > gcc -shared -L/usr/local/lib/R/lib -L/usr/local/lib -o rlang.so capture.o internal.o rlang.o 
version.o -L/usr/local/lib/R/lib -lR > installing to /usr/local/lib/R/site-library/00LOCK-rlang/00new/rlang/libs > **** R > **** inst > **** byte-compile and prepare package for lazy loading > **** help > ***** installing help indices > ***** copying figures > **** building package indices > **** testing if installed package can be loaded from temporary location > **** checking absolute paths in shared objects and dynamic libraries > **** testing if installed package can be loaded from final location > **** testing if installed package keeps a record of temporary installation path > * DONE (rlang) > > Using latest bioconductor_docker:devel (docker run --pull always -it bioconductor/bioconductor_docker:devel /bin/bash) : > > > sessionInfo() > R Under development (unstable) (2024-01-03 r85769) > Platform: x86_64-pc-linux-gnu > Running under: Ubuntu 22.04.3 LTS > > Matrix products: default > BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 > LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0 > > locale: > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C > [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 > [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 > [7] LC_PAPER=en_US.UTF-8 LC_NAME=C > [9] LC_ADDRESS=C LC_TELEPHONE=C > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C > > time zone: Etc/UTC > tzcode source: system (glibc) > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > loaded via a namespace (and not attached): > [1] compiler_4.4.0 >

Jacques SERIZAY (11:24:13) (in thread): > I’ve managed to have S4Arrays install OK after adding -w to CFLAGS as pointed out by @Mike Smith, but do you guys know if a long-term solution for this issue is in the works? Is there any better way to install these packages when using bioconductor_docker:devel? All my GitHub Actions fail because of this issue :angry:

Kasper D. Hansen (13:51:26) (in thread): > I think there should be one check for IMPORTS/DEPENDS and another check for SUGGESTS

Kasper D. Hansen (13:51:36) (in thread): > This package Suggests Bioc packages

Marcel Ramos Pérez (15:53:15) (in thread): > Bioconductor packages in Suggests are not a strong enough dependency IMO, but the check could be added as a NOTE. It would require more work to discern which packages are only using BiocStyle versus those providing extensive Bioc package use in their examples.

Kasper D. Hansen (15:54:42) (in thread): > It does provide evidence that the authors plug into Bioc IMO

Kasper D. Hansen (15:56:17) (in thread): > Also, and this is just my opinion, I don’t think this check should be there at all. I am not sure what it helps with, given that we should absolutely be open to genomics packages which do not depend on existing packages. The real issue is more complicated: we want new packages to depend on existing packages when it makes sense

Marcel Ramos Pérez (16:00:35) (in thread): > Ultimately it is up to the reviewer to make the decision; the warning can be ignored. I’ve had some packages that come in with missed opportunities to re-use Bioc infrastructure so I think it is better to flag those early on.

Francesc Català-Moll (16:02:43) (in thread): > However, in my case I have multiple imported bioconductor packages and the warning pops up anyway: - File (PNG): Screenshot_2024-01-10-21-59-34-645_com.github.android.png

Marcel Ramos Pérez (16:18:59) (in thread): > @Francesc Català-Moll I can’t reproduce it locally with BiocCheck:::checkDESCRIPTIONFile("~/bioc/dar"). Update: this has been resolved in version 1.39.18

Xin (20:45:40): > @Xin has joined the channel

2024-01-11

Johannes Rainer (05:31:32) (in thread): > I’m currently using this workaround in my GHA (thanks to @Philippe Hauchamps) based on withr: > > - name: Install dependencies > run: | > withr::with_makevars( > c(CFLAGS = "-w", > CXXFLAGS = "-w", > CPPFLAGS = "-w"), > { > (code installing dependencies) > }, > assignment = "+=") > shell: Rscript {0} > > Would however be more than happy if that could be fixed somehow upstream…

Mike Smith (05:38:27) (in thread): > I suspect @Hervé Pagès will get to it soon. He’s already fixed it in XVector (https://code.bioconductor.org/browse/XVector/commit/26ce2819a1b46b16e4023c50e1d987707ee491ba). Maybe you can copy that idea and submit a pull request to S4Arrays to get it done ASAP. - Attachment (code.bioconductor.org): Bioconductor Code: XVector > Browse the content of Bioconductor software packages.

Johannes Rainer (08:51:30) (in thread): > Good point; done - not sure if I did it correctly though - I’m no C guru. Also, I assume there are more packages affected?

Jacques SERIZAY (09:37:17) (in thread): > @Johannes Rainer pff sorry I opened another PR for the exact same purpose and came back here to link to it, and then realized you had opened one in the meantime :man-gesturing-ok: FYI though, you added format declarations even where the next arg is a string, but in my case a local install worked by just adding declarations where needed, like what @Hervé Pagès did for XVector. Not a guru either so I’m not sure this is the right terminology, but I think not all your changes are required. > I think rhdf5, SparseArray and DelayedArray are also in need of PRs, I’ll get to work on that

Johannes Rainer (09:41:05) (in thread): > great work, thanks! I closed my PR - and thanks for looking into the other packages too :+1:

Jacques SERIZAY (09:54:47) (in thread): > @Hervé Pagès @Mike Smith opened PRs for rhdf5 and SparseArray. If I flag other packages that need fixes I’ll open other PRs if that’s ok

Jacques SERIZAY (09:55:41) (in thread): > Thanks Mike and Johannes for suggestions for quick/long-term fixes!

Saga (11:59:11): > @Saga has joined the channel

Hervé Pagès (21:42:52) (in thread): > @Jacques SERIZAY @Johannes Rainer PRs merged. Thanks!

2024-01-12

Johannes Rainer (01:54:04) (in thread): > Thanks @Jacques SERIZAY!

Sridhar N (11:31:32): > @Sridhar N has joined the channel

2024-01-16

Jacques SERIZAY (04:45:18) (in thread): > @Hervé Pagès FYI another PR for Rsamtools: https://github.com/Bioconductor/Rsamtools/pull/60 - Attachment: #60 Fix gcc’s “format-security” warning

Hervé Pagès (14:18:36) (in thread): > Done (with some tweak). Thanks!

2024-01-19

Charlotte Soneson (11:39:37): > Not sure where the best place to post this is, so I’ll try here (sorry if I missed it already being discussed somewhere). In Bioc 3.18/3.19 I’m running into an issue with TENxPBMCData, when used together with e.g. SingleR. It seems related to the logcounts assay being a DelayedMatrix. More precisely, I run the following: > > pbmc3k <- TENxPBMCData::TENxPBMCData(dataset = "pbmc3k") > pbmc3k <- scuttle::logNormCounts(pbmc3k) > rownames(pbmc3k) <- scuttle::uniquifyFeatureNames( > ID = SummarizedExperiment::rowData(pbmc3k)$ENSEMBL_ID, > names = SummarizedExperiment::rowData(pbmc3k)$Symbol_TENx > ) > ref_monaco <- celldex::MonacoImmuneData() > pred_monaco_main <- SingleR::SingleR(test = pbmc3k, ref = ref_monaco, > labels = ref_monaco$label.main) > > which fails with > > Error in .Primitive("[")(new("HDF5ArraySeed", filepath = "/Users/charlottesoneson/Library/Caches/org.R-project.R/R/ExperimentHub/11df840c3de1e_1605", : > object of type 'S4' is not subsettable > > The error appears locally (on mac), on our servers, and on GitHub Actions. In Bioc 3.17, it runs without an error. I don’t think it’s a very recent issue; looking back at earlier GitHub Actions runs, it was likely already the cause of failures at least late last year. I’ve tried to remove and re-download the indicated object from my cache, but that didn’t help. Converting the logcounts assay of the pbmc3k object to a regular matrix solves the problem. I was wondering if there was maybe an update to (e.g.) HDF5Array that may have rendered the array stored in ExperimentHub invalid somehow, or if anyone has another explanation for this issue :slightly_smiling_face:

Vince Carey (11:48:43): > Replicated on linux in 3.19. Will explore…

Hervé Pagès (11:56:48): > HDF5ArraySeed objects are not subsettable, only HDF5Array objects. An HDF5Array object is just an HDF5ArraySeed object wrapped inside a DelayedArray shell.
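A small R sketch of the distinction Hervé describes (toy data; seed() is the DelayedArray accessor for the underlying seed object):

```r
library(HDF5Array)

M <- writeHDF5Array(matrix(1:6, nrow = 2))  # an HDF5Array (DelayedArray shell)
s <- seed(M)                                # the raw HDF5ArraySeed underneath

M[1, ]                # fine: DelayedArray objects support [
## s[1, ]             # error: object of type 'S4' is not subsettable
DelayedArray(s)[1, ]  # re-wrapping the seed restores subsetting
```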

Charlotte Soneson (13:09:57) (in thread): > Thanks - I wonder why it’s trying to subset the HDF5ArraySeed… the array looks fine as far as I can understand, and I can do other things with it :thinking_face: > > > class(assay(pbmc3k, "logcounts")) > [1] "DelayedMatrix" > attr(,"package") > [1] "DelayedArray" > > mean(assay(pbmc3k, "logcounts")) > [1] 0.03709643 >

Hervé Pagès (14:41:35) (in thread): > Yes, the array looks fine. The error occurs inSingleR::classifySingleR(). I’m trying to understand what’s going on…

Hervé Pagès (15:51:00) (in thread): > A slightly simpler example that reproduces the error: > > library(HDF5Array) > library(scuttle) > library(SingleR) > > ref <- .mockRefData() > test <- .mockTestData(ref) > assay(test, withDimnames=FALSE) <- writeHDF5Array(assay(test), as.sparse=TRUE) > > ref <- scuttle::logNormCounts(ref) > test <- scuttle::logNormCounts(test) > > trained <- trainSingleR(ref, label=ref$label) > test <- assay(test, "logcounts") > > classifySingleR(test, trained) > > This was working prior to BioC 3.18. Still need to figure out what has changed in HDF5Array, SingleR, or beachmat that breaks it.

Hervé Pagès (16:43:33) (in thread): > In the meantime a workaround is to do: > > assay(pbmc3k, withDimnames=FALSE) <- writeHDF5Array(assay(pbmc3k), as.sparse=FALSE) > > right after loading pbmc3k with TENxPBMCData::TENxPBMCData(dataset = "pbmc3k").

Jacques SERIZAY (19:08:58) (in thread): > @Michael Lawrence I noticed you took care of the call to warning() in rtracklayer (https://github.com/lawremi/rtracklayer/commit/86407bbef2d02455053b7b7c96afe9c5ce6949e7), thanks!! Any chance you could bump the version and push to Bioconductor upstream, when you have a few minutes to spare? Even though this does not impede installs on the BBS, all continuous integration workflows which rely on rtracklayer and Bioconductor’s devel Docker crash :disappointed:

Hervé Pagès (20:02:32) (in thread): > Looks like a bug in the tatami code: > > library(Rcpp) > library(beachmat) > > cat(" > #include \"Rtatami.h\" > #include <vector> > #include <algorithm> > > // Not necessary in a package context, it's only used for this vignette: > // [[Rcpp::depends(beachmat)]] > > // [[Rcpp::export(rng=false)]] > Rcpp::NumericVector column_sums(Rcpp::RObject initmat) { > Rtatami::BoundNumericPointer parsed(initmat); > const auto& ptr = parsed->ptr; > > auto NR = ptr->nrow(); > auto NC = ptr->ncol(); > std::vector<double> buffer(NR); > Rcpp::NumericVector output(NC); > auto wrk = ptr->dense_column(); > > for (int i = 0; i < NC; ++i) { > auto extracted = wrk->fetch(i, buffer.data()); > output[i] = std::accumulate(extracted, extracted + NR, 0.0); > } > > return output; > } > ", file="column_sums.cpp") > > sourceCpp("column_sums.cpp") > > Then: > > m <- matrix(101:115, ncol=3) > column_sums(initializeCpp(m)) > # [1] 515 540 565 > column_sums(initializeCpp(0.5 * m)) > # [1] 257.5 270.0 282.5 > > library(HDF5Array) > > M1 <- writeHDF5Array(m) > column_sums(initializeCpp(M1)) > # [1] 515 540 565 > column_sums(initializeCpp(0.5 * M1)) > # [1] 257.5 270.0 282.5 > > M2 <- writeHDF5Array(m, as.sparse=TRUE) > column_sums(initializeCpp(M2)) > # [1] 515 540 565 > column_sums(initializeCpp(0.5 * M2)) > # Error in .Primitive("[")(new("HDF5ArraySeed", filepath = "/tmp/RtmpfIUb7y/HDF5Array_dump/auto00002.h5", : > # object of type 'S4' is not subsettable > > @Aaron Lun?

Aaron Lun (20:03:18) (in thread): > huh. thanks, i’ll look at it later

2024-01-20

Aaron Lun (02:24:25) (in thread): > Alright, the issue is that it’s an HDF5 array for which is_sparse() returns TRUE. tatami sees this and then tries to call extract_sparse_array on the HDF5ArraySeed, but there isn’t any such method. This falls back to the ANY method, which fails as observed above. > > > selectMethod("extract_sparse_array", signature="HDF5ArraySeed") > Method Definition: > > function (x, index) > { > slice <- S4Arrays:::subset_by_Nindex(x, index) > as(slice, "SparseArray") > } > <bytecode: 0x55a5869e7188> > <environment: namespace:SparseArray> > > Signatures: > x > target "HDF5ArraySeed" > defined "ANY" > > The immediate fix for @Charlotte Soneson is to import beachmat.hdf5, which should direct tatami to use native C++ bindings for HDF5 rather than attempting to call the absent method. > > > library(beachmat.hdf5) > > column_sums(initializeCpp(0.5 * M2)) > [1] 257.5 270.0 282.5 > > The longer-term fix is probably to add an extract_sparse_array method for HDF5ArraySeeds that have sparse=TRUE.

Charlotte Soneson (03:17:26) (in thread): > Thank you @Hervé Pagès and @Aaron Lun! I can confirm that importing beachmat.hdf5 solves the problem for me.

2024-01-21

Hervé Pagès (02:38:31) (in thread): > @Aaron Lun No code should still be using extract_sparse_array(). The generic was renamed OLD_extract_sparse_array() in May 2023. At the time I took care of the renaming in beachmat (see https://github.com/tatami-inc/beachmat/commit/12397bcc7b9c6e9efe4dbf7df7b28fff0f0906ee) as well as in any other package that was using extract_sparse_array(): HDF5Array, DelayedRandomArray, TileDBArray, alabaster.matrix, and SCArray. Note that there are plenty of OLD_extract_sparse_array() methods, including one for HDF5ArraySeed objects: > > > showMethods(OLD_extract_sparse_array) > Function: OLD_extract_sparse_array (package DelayedArray) > x="ConstantArraySeed" > x="DelayedAbind" > x="DelayedAperm" > x="DelayedNaryIsoOp" > x="DelayedSubassign" > x="DelayedSubset" > x="DelayedUnaryIsoOp" > x="DelayedUnaryIsoOpStack" > x="DelayedUnaryIsoOpWithArgs" > x="dgCMatrix" > x="dgRMatrix" > x="H5SparseMatrixSeed" > x="HDF5ArraySeed" > x="lgCMatrix" > x="lgRMatrix" > x="SparseArraySeed" > > The extract_sparse_array() generic will make a comeback at some point but is not ready yet: it has almost no useful methods at the moment. > I believe the tatami stuff was added to beachmat after this renaming. Easiest fix for now is to replace the 3 occurrences of extract_sparse_array in beachmat/inst/include/tatami_r/UnknownMatrix.hpp with OLD_extract_sparse_array. I’ll make the change.

Kunal Bagga (07:58:13): > @Kunal Bagga has joined the channel

Hervé Pagès (22:36:24) (in thread): > @Aaron Lun Before I submit the PR, do you think you can resync beachmat’s GitHub repo with git.bioconductor.org? If you also bring the RELEASE_3_18 branch to GitHub, I can send a PR for release too. Thx!

2024-01-22

Aaron Lun (01:05:11) (in thread): > okay, let me do it instead; the offending code comes from a vendored library that exists in a separate repo, so i’ll just do it there and it’ll propagate its way down.

Hervé Pagès (12:26:36) (in thread): > beachmat does not seem to support matrices obtained by slicing a multidimensional DelayedArray: > > a <- array(1:120, 6:4) > M3 <- DelayedArray(a)[ , 2, ] > initializeCpp(M3) > # Error: object has no 'class' attribute > > Bug or known limitation?

Aaron Lun (12:28:19) (in thread): > that’s known, i just didn’t have the strength/willpower/need to write an API for higher-dimensional access.

Aaron Lun (12:28:48) (in thread): > (hence the “mat” part of “beachmat” ==> matrix, otherwise I’d have called it beacharray or something.)

Hervé Pagès (12:29:44) (in thread): > I understand that the mat part is for matrix and I don’t expect it to handle multidimensional arrays but M3 is a matrix

Aaron Lun (12:29:44) (in thread): > (but if you have thoughts, I’d be happy to see what we can do)

Hervé Pagès (12:31:53) (in thread): > no particular thoughts, but maybe a more informative error message?

Aaron Lun (12:31:58) (in thread): > Yes, M3 is a matrix but beachmat goes through and unpacks the delayed operations for native execution in C++. So when it does so, it encounters the higher-dimensional array lurking underneath. > > In this case, it should probably fall back to block processing. I thought it was doing that anyway, but I guess the try/catch didn’t reach all the way.

2024-01-26

Eszter Ari (07:54:17): > @Eszter Ari has joined the channel

Eszter Ari (08:01:10): > Hi Everyone, > I have 2 questions about creating an ExperimentHubData Bioconductor package. > We are developing a new R package aiming to submit to Bioconductor: https://github.com/ELTEbioinformatics/mulea. For this I’ve created a muleaData package aiming to become an ExperimentHubData Bioconductor package: https://github.com/ELTEbioinformatics/muleaData. I have created R files in .rds format that will be uploaded to the Microsoft Azure Genomic Data Lake of Bioconductor (but the SAS token has expired so it needs to be renewed). I also created the metadata.csv. > > Question 1: > When I run the ExperimentHubData::makeExperimentHubMetadata(pathToPackage = getwd()) code I get this message: > # missing or NA values for ‘Coordinate_1_based set to TRUE’ > # Loading valid species information. > # Error in if (all((meta$Location_Prefix == “https://bioconductorhubs.blob.core.windows.net/annotationhub/”) | : > # missing value where TRUE/FALSE needed > Is this OK? (These are not genomic data so I cannot provide Coordinate_1_based.) > > Question 2: > The rds files are data.frames that can be handled with the mulea package. How should I create the function called muleaData(?) within the muleaData ExperimentHubData Bioconductor package to read these rds files from the server? I mean, I see this example: > > apData["EH166"] > > at https://bioconductor.org/packages/release/bioc/vignettes/ExperimentHub/inst/doc/ExperimentHub.html Thanks for the help!

Lori Shepherd (08:07:43) (in thread): > If you plan to use the Bioconductor Microsoft Azure Genomic Data Lake, Please remove the Location_Prefix from your metadata.csv. Location_Prefix is only necessary if data is stored on a different server (like ensembl, zenodo, etc) . This will make the ERROR in question 1 disappear and the internal code will default to our provided location; its getting hung on a NA base location.

Eszter Ari (08:09:25) (in thread): > Thank you @Lori Shepherd. Meanwhile I wrote my second question as well. :slightly_smiling_face:

Lori Shepherd (08:13:39) (in thread): > not sure I fully understand question 2, but if you’re asking how it loads… a single bracket “[” will list the metadata of the resources without downloading them, a double bracket “[[” will download the resource. This should be done on the ExperimentHub object database (i.e. ExperimentHub() or a subset or query of that main database object). Once your resources are uploaded and we add them to the database, you will be able to find the resources with queries or directly with the provided EH ids. You have currently specified Rds as the dispatch class, which means that upon calling the double bracket to retrieve the resource, the resource will also be loaded into R with the readRDS function. If you do not want them loaded automatically, then a different dispatch class should be chosen
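A sketch of what that looks like in practice once the resources are in the hub (the query string matches this package, but the EH id here is a hypothetical placeholder):

```r
library(ExperimentHub)

eh <- ExperimentHub()            # connect to the hub database
hits <- query(eh, "muleaData")   # narrow by metadata; nothing is downloaded
hits                             # lists matching records and their EH ids

hits["EH0000"]                   # single bracket: metadata of one record only
df <- hits[["EH0000"]]           # double bracket: download; with DispatchClass
                                 # "Rds" the file is read in via readRDS()
```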

Eszter Ari (08:15:38) (in thread): > Without the Location_Prefix appearing in the metadata.csv and without the rds file having been uploaded to the server I got this warning from ExperimentHubData::makeExperimentHubMetadata(pathToPackage = getwd()): > > missing or NA values for ‘Coordinate_1_based set to TRUE’ > > Loading valid species information. > > Error in .checkSourceurlPrefixesAreValid(object@SourceUrl) : > > sourceurl provided has an invalid prefix (missing protocol). Source urls should be > > full uris that point to the original resources used in a recipe. > Is this OK?

Lori Shepherd (08:18:07) (in thread): > if the function ERRORs then it is not okay, as we won’t be able to get the resources into the database, but on quick glance of the metadata file they do look like they should be valid. Let me look into this more as there might be a bug somewhere in the validation code

Eszter Ari (08:18:54) (in thread): > Many thanks. Please wait a sec. I pull the new matedata file

Eszter Ari (08:19:03) (in thread): > push (not pull)

Eszter Ari (08:20:02) (in thread): > done

Lori Shepherd (08:20:29) (in thread): > I will be in touch about this soon. Let me know if the answer to #2 made sense. When we add the data to the database we also provide an initial example query to get you started on finding your resources and retrieving the EH ids.

Eszter Ari (08:24:18) (in thread): > the answer to question 2 made sense I think. So I don’t have to create a so-called muleaData function within the muleaData package that handles (lists, reads) the rds files? Then I am not sure what to write in the vignette and man files…

Lori Shepherd (08:25:06) (in thread): > You can show how to find the resources and download an example of one of the resources. Also document and describe your resources.

Lori Shepherd (08:25:56) (in thread): > If you have a group that are related and should be downloaded together, a helper function to auto-download multiple at once? (not sure if appropriate or not since I haven’t looked into the package in detail yet)

Eszter Ari (08:27:35) (in thread): > The mulea package will use a single muleaData rds for one function call and not more at once.

Eszter Ari (08:27:48) (in thread): > so no grouping is needed

Eszter Ari (08:35:16) (in thread): > So once I am able to upload the rds files and muleaData is accepted by Bioconductor, will this work automatically (there is an rds called “Transcription_factor_TFLink_Caenorhabditis_elegans_All_EnsemblID.rds”)? Without explicitly creating a function called muleaData in the package? > > muleaData["Transcription_factor_TFLink_Caenorhabditis_elegans_All_EnsemblID"] > muleaData[["Transcription_factor_TFLink_Caenorhabditis_elegans_All_EnsemblID"]] > > Then, before acceptance, will a man page containing this: > > \name{muleaData} > \alias{muleaData} > \title{muleaData} > \usage{ > muleaData() > } > \description{ > Downloads muleaData. > } > \examples{ > muleaData[["Transcription_factor_TFLink_Caenorhabditis_elegans_All_EnsemblID"]] > } > > give an error when running rcmdcheck::rcmdcheck()?

Eszter Ari (08:37:16) (in thread): > > checking examples ... ERROR > Running examples in 'muleaData-Ex.R' failed > The error most likely occurred in: > > > ### Name: muleaData > > ### Title: muleaData > > ### Aliases: muleaData > > > > ### **** Examples > > > > muleaData[["Transcription_factor_TFLink_Caenorhabditis_elegans_All_EnsemblID"]] > Error: object 'muleaData' not found > Execution halted >

Lori Shepherd (08:50:36) (in thread): > so the issue with the valid sourceurl is www without https. So it’s having issues with www.ensembl.org

Lori Shepherd (08:51:02) (in thread): > I have some questions about your data . This looks like annotation data for a variety of organisms. Why is it not possible to use the existing annotation objects provided in AnnotationHub ? for example your first few relate to Arabidopis thaliana > > > ah = AnnotationHub() > > query(ah, c("Arabidopsis", "ensembl")) > AnnotationHub with 7 records > # snapshotDate(): 2023-12-26 > # $dataprovider: FANTOM5,DLRP,IUPHAR,HPRD,STRING,SWISSPROT,TREMBL,ENSEMBL,CE... > # $species: Arabidopsis thaliana > # $rdataclass: SQLiteFile, OrgDb > # additional mcols(): taxonomyid, genome, description, > # coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags, > # rdatapath, sourceurl, sourcetype > # retrieve records with, e.g., 'object[["AH91655"]]' > > title > AH91655 | LRBaseDb for Arabidopsis thaliana (Thale cress, v001) > AH97723 | LRBaseDb for Arabidopsis thaliana (Thale cress, v002) > AH100430 | LRBaseDb for Arabidopsis thaliana (Thale cress, v003) > AH107152 | LRBaseDb for Arabidopsis thaliana (Thale cress, v004) > AH111344 | LRBaseDb for Arabidopsis thaliana (Thale cress, v005) > AH113865 | LRBaseDb for Arabidopsis thaliana (Thale cress, v006) > AH114076 | org.At.tair.db.sqlite > > > > ah["AH114076"] > AnnotationHub with 1 record > # snapshotDate(): 2023-12-26 > # names(): AH114076 > # $dataprovider:[ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/](ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/)# $species: Arabidopsis thaliana > # $rdataclass: OrgDb > # $rdatadateadded: 2023-10-03 > # $title: org.At.tair.db.sqlite > # $description: NCBI gene ID based annotations about Arabidopsis thaliana > # $taxonomyid: 3702 > # $genome: NCBI genomes > # $sourcetype: NCBI/ensembl > # $sourceurl:[ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/](ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/),[ftp://ftp.ensembl.org/p](ftp://ftp.ensembl.org/p)... 
> # $sourcesize: NA > # $tags: c("NCBI", "Gene", "Annotation") > # retrieve record with 'object[["AH114076"]]' > > > > ah[["AH114076"]] > loading from cache > OrgDb object: > | DBSCHEMAVERSION: 2.1 > | Db type: OrgDb > | Supporting package: AnnotationDbi > | DBSCHEMA: ARABIDOPSIS_DB > | ORGANISM: Arabidopsis thaliana > | SPECIES: Arabidopsis > | TAIRSOURCENAME: Tair > | TAIRSOURCEDATE: 2023-Sep12 > | TAIRSOURCEURL:[https://www.arabidopsis.org/](https://www.arabidopsis.org/)| TAIRGOURL:[https://www.arabidopsis.org/download_files/GO_and_PO_Annotations/Gene_Ontology_Annotations/ATH_GO_GOSLIM.txt.gz](https://www.arabidopsis.org/download_files/GO_and_PO_Annotations/Gene_Ontology_Annotations/ATH_GO_GOSLIM.txt.gz)| TAIRGENEURL:[https://www.arabidopsis.org/download_files/Genes/TAIR10_genome_release/TAIR10_functional_descriptions](https://www.arabidopsis.org/download_files/Genes/TAIR10_genome_release/TAIR10_functional_descriptions)| TAIRSYMBOLURL:[https://www.arabidopsis.org/download_files/Public_Data_Releases/TAIR_Data_20220630/gene_aliases_20220630.txt.gz](https://www.arabidopsis.org/download_files/Public_Data_Releases/TAIR_Data_20220630/gene_aliases_20220630.txt.gz)| TAIRPATHURL:[ftp://ftp.plantcyc.org/pmn/Pathways/Data_dumps/PMN15_January2021/pathways/ara_pathways.20210325.txt](ftp://ftp.plantcyc.org/pmn/Pathways/Data_dumps/PMN15_January2021/pathways/ara_pathways.20210325.txt)| TAIRPMIDURL:[https://www.arabidopsis.org/download_files/Public_Data_Releases/TAIR_Data_20220630/Locus_Published_20220630.txt.gz](https://www.arabidopsis.org/download_files/Public_Data_Releases/TAIR_Data_20220630/Locus_Published_20220630.txt.gz)| TAIRCHRURL:[https://www.arabidopsis.org/download_files/Maps/seqviewer_data/sv_gene.data](https://www.arabidopsis.org/download_files/Maps/seqviewer_data/sv_gene.data)| 
TAIRATHURL:[https://www.arabidopsis.org/download_files/Microarrays/Affymetrix/affy_ATH1_array_elements-2010-12-20.txt](https://www.arabidopsis.org/download_files/Microarrays/Affymetrix/affy_ATH1_array_elements-2010-12-20.txt)| TAIRAGURL:[https://www.arabidopsis.org/download_files/Microarrays/Affymetrix/affy_AG_array_elements-2010-12-20.txt](https://www.arabidopsis.org/download_files/Microarrays/Affymetrix/affy_AG_array_elements-2010-12-20.txt)| CENTRALID: TAIR > | TAXID: 3702 > | KEGGSOURCENAME: KEGG GENOME > | KEGGSOURCEURL:[ftp://ftp.genome.jp/pub/kegg/genomes](ftp://ftp.genome.jp/pub/kegg/genomes)| KEGGSOURCEDATE: 2011-Mar15 > | GOSOURCENAME: Gene Ontology > | GOSOURCEURL:[http://current.geneontology.org/ontology/go-basic.obo](http://current.geneontology.org/ontology/go-basic.obo)| GOSOURCEDATE: 2023-07-27 > | GOEGSOURCEDATE: 2023-Sep11 > | GOEGSOURCENAME: Entrez Gene > | GOEGSOURCEURL:[ftp://ftp.ncbi.nlm.nih.gov/gene/DATA](ftp://ftp.ncbi.nlm.nih.gov/gene/DATA)| EGSOURCEDATE: 2023-Sep11 > | EGSOURCENAME: Entrez Gene > | EGSOURCEURL:[ftp://ftp.ncbi.nlm.nih.gov/gene/DATA](ftp://ftp.ncbi.nlm.nih.gov/gene/DATA)Please see: help('select') for usage information > > The annotation hub also potential provides more recent builds than the provide 109? > Granted there isn’t a record for every species you provide but for the ones that are available you probably should be grabbing those and then only upload ones that are not available?

Eszter Ari (08:57:42) (in thread): > So our GMTs and data.frames contain 5, 10 and 20 consecutive genes (created with a sliding window) – for example, it also contains GO and transcription factors and the target genes in other GMTs – to do enrichment analysis. Therefore we cannot simply use the already existing AnnotationHub. > We have 880 GMT files from 27 organisms collected from 16 different databases and mapped to 4 different IDs: https://github.com/ELTEbioinformatics/GMT_files_for_mulea

2024-01-29

Johannes Rainer (01:55:14) (in thread): > Hi, also chiming in :slightly_smiling_face: - and sorry for the maybe dumb questions, just want to understand a bit what the annotations are that you provide. > > So, if I get it correctly, the data you provide in your files is not already present in any other annotation resources, but you create these files (and content) based on various different annotation resources (sort of combining their data)? > > If so, would it be possible to re-use/link to these annotation resources? > > Also, do you plan to update your data for newer e.g. Ensembl releases? Then there's also the question of whether this is ExperimentHub data or AnnotationHub data.

2024-01-31

Clemens Kohl (07:20:46): > @Clemens Kohl has joined the channel

Clemens Kohl (07:29:35): > Hi everyone, > I am using pkgdown to create a website for my package (https://github.com/VingronLab/APL) and the website is built on a separate branch, gh-pages. The files however still end up in the git pack files, which leads to a .pack file >5MB. One of the libraries the website uses is a plotly library which by itself is already 3.5MB, making it virtually impossible to get below the 5MB limit enforced by BiocCheck. Is there any way to both keep the website and get rid of the large *.pack files from git? I already went through the git history and deleted all large files I do not use anymore.

Alan O’C (12:11:25) (in thread): > If you don’t push the gh-pages branch to bioc git, then the large objects will never be seen by bioconductor checks

Clemens Kohl (12:15:44) (in thread): > Thanks for the tip Alan! Am I understanding this correctly, that it will not transfer the .git/objects/pack/*.pack files (of the main branch) to the bioc git if I don't push the gh-pages branch (which I definitely don't plan to do)? I somehow assumed these files are also transferred, but I am not really familiar with them.

Alan O’C (12:17:06) (in thread): > Yes, git only gets copies of objects on the branches/refs that you push to a given remote. eg, I can have a 10GB file on branch silly locally, but if I only ever pull/push to github on main, then github will never see the silly branch or the 10GB file

Clemens Kohl (12:18:25) (in thread): > Amazing! Thank you for the help and the explanation!

Vince Carey (15:43:17) (in thread): > It looks like contributions.bioconductor.org does not address github pages management (to separate branch) but it should. If anyone finds time to make a relevant edit that would be great.

2024-02-05

Dirk Eddelbuettel (08:51:46): > When I use available.packages() to do some computation on the repo, I do not see RcisTarget in 3.18 (but, as a spot-check, in 3.17). Is that expected? The website lists it as a normally available package too. (And I do know about BiocManager for actual installation, this was just for some dependency-graph related task.) > > > biocrepo317 <- paste0("[https://bioconductor.org/packages/3.17/bioc](https://bioconductor.org/packages/3.17/bioc)") > > biocrepo318 <- paste0("[https://bioconductor.org/packages/3.18/bioc](https://bioconductor.org/packages/3.18/bioc)") > > apBIOC317 <- data.table(ap="Bioc", as.data.frame(available.packages(repos=biocrepo317))) > > apBIOC318 <- data.table(ap="Bioc", as.data.frame(available.packages(repos=biocrepo318))) > > apBIOC317[grepl("^Rcis", Package), 1:5] > ap Package Version Priority Depends > <char> <char> <char> <char> <char> > 1: Bioc RcisTarget 1.20.0 <NA> R (>= 3.5.0) > > apBIOC318[grepl("^Rcis", Package), 1:5] > Empty data.table (0 rows and 5 cols): ap,Package,Version,Priority,Depends > > >

Mike Smith (09:06:48) (in thread): > That package hasn’t successfully built on Linux since the last release. If the Linux builder doesn’t complete, then the package tarball doesn’t propagate to the Bioc repository, and then the package is not listed as available. > > There have been some discussions about whether a missing source package should render it unavailable if there are working compiled versions, but as far as available.packages() is concerned it’s broken on all platforms.

Lori Shepherd (09:07:18) (in thread): > RcisTarget has yet to build on 3.18 and 3.19 for Linux source, which is likely why it is not listed, as it is unavailable for 3.18 and 3.19. The maintainer has been made aware and responded that they are looking into the error

Dirk Eddelbuettel (09:07:37) (in thread): > Thank you, that is spot on and why I (on Linux, which I failed to mention, my bad) do not see it.

2024-02-08

Peace Sandy (04:47:53): > @Peace Sandy has joined the channel

2024-02-15

Mx (18:02:19): > @Mx has joined the channel

2024-02-19

Matteo Tiberti (09:37:13): > hi, just a quick question - what is BioC’s stance on using ::: to access internal functions of other packages? I would assume this is not considered good practice, since e.g. the internal API might change without warning. > > More generally, what is considered best practice in order to use a function of another package that has not been exported? I would think the best option would be to ask the package developers to make it public, as suggested in the documentation of :::, and failing that to reimplement it or include it in our own codebase as our own internal or exported function (following what the original code license allows)

Lluís Revilla (10:26:43) (in thread): > It should be avoided; I think R CMD check warns about it. It is generally recommended to ask the developer of the package you want to depend on to export the function (there are several examples of this in r-package-devel). It might help to uncover some issues with the approach though. > Keep in mind that copying it into your package is regulated by the licenses of the other package and yours …

Matteo Tiberti (10:27:39) (in thread): > thanks for the insight - sure, license applies
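A small R illustration of the options discussed in this thread (pkgA and .tidy_names are made-up names; only the getNamespaceExports call below is real base R):

```r
## Hypothetical: pkgA keeps .tidy_names() internal.
## Option 1 (fragile): reach in with ::: -- R CMD check flags this,
## and the internal API may change without notice.
# result <- pkgA:::.tidy_names(x)

## Option 2 (preferred): ask the maintainer to export it, then import it.
## You can check whether a symbol is already part of a public API:
"var" %in% getNamespaceExports("stats")   # TRUE, so importing is safe

## Option 3: copy the function into your own package, license permitting,
## and maintain it as your own internal helper.
```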

2024-02-23

Lambda Moses (22:58:15): > I want your opinions about this. When you subset matrices, data frames, and SCE objects with numeric(0), logical(0), or a vector of all FALSE, you actually don’t get an error, warning, or message, but get something back with 0 rows or columns. What do you think of this behavior? Would it be dangerous to downstream code? > > I’m asking because I need to deal with this scenario for SpatialFeatureExperiment, and I have an opportunity to choose what I think makes more sense, which is to give a warning, though I may also stick to the same behavior as SCE without warnings or messages, which I find confusing and surprising.

2024-02-24

Lambda Moses (00:04:15): > Another place where I consider breaking with convention is withDimnames in getters and setters, which defaults to TRUE in SCE. To be honest, I find it super annoying and I really want to change the default to FALSE in SFE. A justifiable reason for me to break with convention: colGeometry and rowGeometry in SFE are sf data frames, and sf is part of the Tidyverse, which generally doesn’t like row names. When I subset sf data frames, the row names are dropped. This is just that sf is unlike matrices.

2024-02-25

Vince Carey (08:29:57): > Hi @Lambda Moses. IMHO it is your prerogative to introduce changes of this sort. I would doubt that much would break if a warning were produced when a 0-dim object is encountered. For the withDimnames default, it would be good to hear from @Hervé Pagès in case there is some hard-to-anticipate consequence.

Lambda Moses (15:18:01): > I do want to give a warning because 0 rows/cols did cause weird errors in my case because most people don’t anticipate it. I’m trying to be consistent with existing convention so users don’t need to learn that many new things after learning SCE, so it’s for user friendliness. Meanwhile I really want withDimnames = FALSE for SFE to be more user friendly.

Kasper D. Hansen (23:45:02): > Like@Vince Carey, I think that warnings or errors with 0-dim objects are uncontroversial. I agree they are almost always user errors. You can argue you should allow it following the R conventions, but you can also argue the R conventions are wrong.

Kasper D. Hansen (23:50:07): > I am more on the fence on the withDimnames question. Here, there are two conflicting traditions: the “new” tradition (in the R sense) of the tidyverse, which is modeled on relational databases, which do not have row names. There are certain computational advantages to this choice, especially depending on the underlying storage format. And then you have the “old” tradition (in the R sense) of having row names be a primary key into the rows of the object (with some confusion, since data.frames are required to have unique rownames whereas matrices aren’t). Here, you can argue that users expect something like a SummarizedExperiment to have row names. This is especially useful in the microarray or classic gene expression world where the row names would be probes or gene names.

Kasper D. Hansen (23:51:19): > What would the row names mean in the SFE setup? Do they have some actual interpretation (like, say, gene names) or are they essentially made-up identifiers? That would guide my choice here.

Kasper D. Hansen (23:52:32): > (I would expect made-up identifiers and I would therefore think it makes sense to not have dimnames).

Kasper D. Hansen (23:52:52): > However, I don’t understand why you think having no rownames makes things more user-friendly.

Kasper D. Hansen (23:55:45): > Finally, just a note: what the tidyverse uses as data.frames are not - despite their insistence - actually R data.frames, which are required to have row.names. I understand why they made the choice, but it does muddy the waters a bit.

2024-02-26

Lambda Moses (00:49:38) (in thread): > Yes, in the case of col and rowGeometries. colGeometry row names correspond to cell IDs, and rowGeometry row names correspond to genes. The annoying part is really that sf drops the row names after subsetting, so in some internal functions, I have to add the row names back.

Lambda Moses (00:52:53) (in thread): > Like when people are trained using Tidyverse, they don’t expect row names. Often I get the error from withDimnames when I forget to add the row names while the entries are already correctly ordered.

Alan O’C (05:34:11) (in thread): > Where this conflicts with the Bioconductor experience, which involves a lot of matrices and dataframe-alikes with rownames, I’d tend to stick with the Bioc style. Unless, of course, you’re specifically writing a tidy-style package, eg tidybulk

Alan O’C (05:35:58) (in thread): > Also it may simply be grumpy user bias on my part, but as somebody who learned base R before tidyverse, the insistence that I must never have row names (and removing them if I ever add them) is quite frustrating

Kasper D. Hansen (08:30:41) (in thread): > I would say that @Alan O’C’s answer exemplifies my question: saying that this change makes it more user friendly is only true for some people. I do think you should consider having no row names, but I don’t think user-friendliness is a good way to think about it.

Kasper D. Hansen (08:30:56) (in thread): > Perhaps people like @Alan O’C and me who like row names are slowly dying out though.

Kasper D. Hansen (08:34:32) (in thread): > So constantly adding in rownames has a performance implication because with most object structures, adding a rowname means the object is duplicated.

Kasper D. Hansen (08:36:19) (in thread): > So there are performance (and convenience) reasons to drop rownames, but it also sounds like you have some good use for the names. So the tradeoff here is tricky. I can see this is a hard choice to make.

Alan O’C (09:22:57) (in thread): > Also strictly speaking I think sf is integrated with tidyverse but is not a part of it, as evidenced by the fact that they use data.frames not tibbles. > > While I’m doing grumpy user rants, a great example of a bad warning is the warning that setting row names on a tibble is deprecated, which to me is a lie since setting rownames is a no-op

Hervé Pagès (10:58:47): > @Lambda Moses Not sure why something like x[integer(0)] would deserve a warning. Subsetting by a numeric subscript of length N is expected to return a vector of length N. N=0 is not a special case in that regard. Note that the same happens in Python e.g. if x is a string or list then x[3:3] returns an empty string or list. There are many conventions that are questionable in R but this is not one of them. > As for withDimnames’s default: you should not change it. When a new class B extends an existing class A, instances of B are still A objects (via is()) so are expected to behave like A objects. Keep in mind that there’s code around that has been written to work on A objects, and this code assumes that any object that satisfies is(x, "A") follows A’s conventions and behavior. Breaking away from these conventions and behavior has the potential to break any code that deals with A objects. > In other words, extending an existing class A means accepting a contract. It’s not ok to change the contract for your new B objects by overwriting methods defined for A or an ancestor of A.

Henrik Bengtsson (11:32:32) (in thread): > > Not sure why something like x[integer(0)] would deserve a warning. Subsetting by a numeric subscript of length N is expected to return a vector of length N. N=0 is not a special case in that regard. > I agree with @Hervé Pagès. > > If one would like to “protect” or inform the user about a possible mistake, it should probably be done at a different level where the result of the subsetting is used, i.e. error/warn about length(y) == 0 if that results from y <- x[subset] (or any other call) and is not supported/not anticipated.

Aaron Lun (12:47:24) (in thread): > also agree. zero-length subset is not a special case
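For reference, the base-R convention the thread describes can be checked directly; a quick sketch:

```r
## Zero-length subscripts return zero-length results, with no warning.
x <- setNames(1:5, letters[1:5])
x[integer(0)]                             # named integer(0)
x[c(FALSE, FALSE, FALSE, FALSE, FALSE)]   # same

m <- matrix(1:6, nrow = 2)
dim(m[, logical(0), drop = FALSE])        # 2 0: a 2x0 matrix

df <- data.frame(a = 1:3, b = 4:6)
dim(df[integer(0), ])                     # 0 2: zero rows, columns kept
```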

Lambda Moses (20:32:13): > Alright, thank you for your helpful comments. In the release version of SFE, something like sfe[, logical(0)] leads to a mysterious error coming from the spatial graphs because I hadn’t thought about this scenario before a collaborator stumbled upon it, so now I have to deal with it. I suppose I will stick to convention and not give a warning in the subset method itself, but I will give an error or warning in the crop method when spatial cropping has removed all cells. > > Regarding withDimnames: SFE has new getters and setters such as rowGeometry(), rowGeometry<-, colGeometry(), and colGeometry<-, which are internally stored in int_colData. The implementation and user interface are meant to emulate those of reducedDims. There’s also the new localResult getter and setter, and it’s much more similar to reducedDims in that those are matrices and not sf. I won’t change the defaults of reducedDims for SFE, and maybe I’ll keep withDimnames = TRUE for the geometries, since in some cases I do use the row names, like when the MULTIPOINT geometries for transcript spots are read in for some but not all genes. I haven’t benchmarked the performance impact of putting the row names back into sf data frames though.

Jonathan Carroll (23:07:27) (in thread): > I’m being pedantic here, but tibbles do have the data.frame class (in order to fall back to that when a more specific method is not available) and while they’re not printed, they do have (integer-as-character) row.names and you can set them > > tbl <- tibble::tibble(a=3:5, b=6:8) > tbl > #> # A tibble: 3 × 2 > #> a b > #> <int> <int> > #> 1 3 6 > #> 2 4 7 > #> 3 5 8 > > is(tbl, "data.frame") > #> [1] TRUE > > row.names(tbl) > #> [1] "1" "2" "3" > > tbl["2", ] > #> # A tibble: 1 × 2 > #> a b > #> <int> <int> > #> 1 4 7 > > row.names(tbl) <- LETTERS[1:3] > #> Warning: Setting row names on a tibble is deprecated. > > row.names(tbl) > #> [1] "A" "B" "C" > > tbl > #> # A tibble: 3 × 2 > #> a b > #> * <int> <int> > #> 1 3 6 > #> 2 4 7 > #> 3 5 8 > > (note the asterisk above the fake rownames which indicates that they are there, but not printed)

Kasper D. Hansen (23:22:10) (in thread): > tryrow.names(tbl[1:2,])

Kasper D. Hansen (23:23:13) (in thread): > Also, the output ofis()is just the output of a “promise” the tibble authors make, which is - IMO - not totally true.

Kasper D. Hansen (23:24:17) (in thread): > That promise is that a tibble fully conforms to the data.frame operations in R and my claim is that this is not fully true.

Jonathan Carroll (23:24:22) (in thread): > They’re not persistent, no, but even that does have some rownames. The “promise” is that "data.frame" is in this vector > > class(tbl) > [1] "tbl_df" "tbl" "data.frame" >

Kasper D. Hansen (23:25:51) (in thread): > yeah, that’s just a vector. I could do > > > x = 1:3 > > class(x) = c("tbl_df", "tbl", "data.frame") > > is(x, "data.frame") > [1] TRUE > > is(x, tibble) > > is(x, "tbl") > [1] TRUE >

Kasper D. Hansen (23:26:28) (in thread): > I mean, theclassattribute is just a vector. You can do anything with that. We have no formal way of ensuring that it makes sense

Kasper D. Hansen (23:27:30) (in thread): > That is one of the deficiencies of S3

Jonathan Carroll (23:28:32) (in thread): > Your claim was “they are not actually R data.frames” but I don’t see where that is currently the case. Sure, in the future they could make something incompatible, but tibbles currently do have rownames, even if they’re clobbered from anything you might have set.

Kasper D. Hansen (23:29:07) (in thread): > They don’t act like data.frames, since they don’t retain rownames when subsetting

Kasper D. Hansen (23:31:42) (in thread): > But you can also argue that there is not a very explicit definition of what exactly a data.frame is, and I would agree with that

Kasper D. Hansen (23:32:36) (in thread): > But conversely, there is absolutely data frame code out there which uses rownames and which assumes they are retained when subsetting, for example

Jonathan Carroll (23:38:54) (in thread): > Retaining rownames is not part of the contract of that class (because S3 is just a label) and I’ve seen plenty of merge(a, b) code that produces wrong results because people have assumed rownames would have a consistent ordering across that. > > Part of this issue is that you’re using tibble:::`[.tbl_df` to subset, which makes no guarantees about preserving rownames (in fact promises to remove them). I would claim that this behaviour belongs to that function, not the object. Instead, if you want data.frame semantics for subsetting, you should do > > row.names(`[.data.frame`(tbl, 1:2, )) > [1] "a" "b" > > but that isn’t the default behaviour.

Jonathan Carroll (23:50:09) (in thread): > “The tibble API does not behave the same way as the data.frame API” I could entirely agree with, but as far as R is concerned, a tibble is a data.frame.

2024-02-27

Kasper D. Hansen (11:43:27) (in thread): > I don’t think you can make that statement with that level of conviction. In S3 we don’t have formally defined classes, so I would argue that the contract you enter when you do something like the tibble developers do, is that a certain API is satisfied. But not only are there no class definitions, it is also not clear what that API is.

Kasper D. Hansen (11:44:52) (in thread): > I am of the opinion that the subset operator and how it behaves is an integral part of the class definition. Certainly, the API is part of the class definition from an operational point of view, since - what this really means - is that you’re supposed to be able to use tibbles everywhere you use data.frames.

Kasper D. Hansen (11:45:25) (in thread): > It is just unclear exactly what the API is. And we haven’t even talked about the C level API for some of these classes.

Kasper D. Hansen (11:46:32) (in thread): > Now, I can appreciate the opinion that the (for example) subsetting and rownames are not part of this API. I just disagree with it.

Kasper D. Hansen (11:48:25) (in thread): > rownames have always been integral to data frames. For example - at least according to my memory (which often fails me though) - when data frames were allowed to have no rownames, the default “1”, “2”, etc. was introduced exactly because it was considered so important that they had unique rownames.

Kasper D. Hansen (11:51:37) (in thread): > I do think that the operational advantages of putting databases into tibbles and having them appear as data frames are big. And I appreciate that the developers were between a rock and a hard place. Either you don’t claim tibbles are data.frames, in which case there is tons of code where tibbles would not work. Or you claim that tibbles are data frames and ignore the API violations (or try to argue that those are not actually API violations, as you’re doing). Either choice has some consequences. And I respect that tibbles went for the second choice. However, it is not going to stop me from saying that I believe there are API violations.

2024-02-28

Tram Nguyen (16:07:15): > @Tram Nguyen has joined the channel

2024-03-04

Sean Davis (13:21:54): > Is anyone aware of the support status for quarto as a vignette engine? @Henrik Bengtsson or others?

Dirk Eddelbuettel (13:30:37) (in thread): > There is a fresh thread on r-package-devel where the respective developer (Christophe D.) is pleading with CRAN / trying to work out how to get the package itself onto CRAN to have an initial run at VignetteEngine: quarto. > > It looks to be close. Uwe L just asked for a resubmission.

Hervé Pagès (13:54:44) (in thread): > FWIW, we already have quarto books (thanks to Jacques Serizay): https://bioconductor.org/books/release/BiocBookDemo/

Sakshi Varshney (14:24:28): > @Sakshi Varshney has joined the channel

2024-03-05

Vamika Mendiratta (02:08:45): > @Vamika Mendiratta has joined the channel

Mike Smith (03:37:57) (in thread): > I think support is almost there in {quarto} now e.g. https://github.com/quarto-dev/quarto-r/blob/main/vignettes/hello.qmd Edit: This is probably held up by what Dirk mentioned. If I install {quarto} from GitHub I’m able to build a package with a Qmd vignette using VignetteBuilder: quarto
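For reference, the setup Mike describes looks roughly like this. This is a sketch based on the {quarto} package's development-version example vignette at the time; field names and the engine name are assumptions, not a guarantee of the final released interface:

```
# DESCRIPTION (sketch)
Suggests: quarto
VignetteBuilder: quarto

# vignettes/intro.qmd front matter (sketch)
---
title: "Introduction"
vignette: >
  %\VignetteIndexEntry{Introduction}
  %\VignetteEngine{quarto::html}
  %\VignetteEncoding{UTF-8}
---
```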

Blessing Ene Anyebe (14:22:54): > @Blessing Ene Anyebe has joined the channel

2024-03-06

Dirk Eddelbuettel (11:11:46) (in thread): > Looks like the CRAN side just got sorted out: - File (PNG): image.png

2024-03-14

Mike Smith (11:33:40): > Could someone with an ARM64 Mac and a recent installation of R-devel test the output of R CMD config CXX for me? I’ve had a report from Brian Ripley that rhdf5filters is using conflicting C++ standards during installation (-std=gnu++11 -std=gnu++17) but AFAICS this is coming from R CMD config CXX. I see the same output on a recent GitHub Actions run on that platform (https://github.com/grimbough/rhdf5filters/actions/runs/8266244317/job/22613769283#step:8:25) but it doesn’t look like that on kjohnson3 (https://bioconductor.org/checkResults/3.19/bioc-mac-arm64-LATEST/kjohnson3-NodeInfo.html), which has a new (but not quite bleeding edge) version of R-devel. It’d be great to get a second opinion from someone who has access to an ARM64 Mac before I conclude this really isn’t my issue.

Charlotte Soneson (11:51:40) (in thread): > Here’s what I get: > * first with the R devel that I had installed (from January 3) > > > % R CMD config CXX > clang++ -arch arm64 -std=gnu++17 > % R > > R Under development (unstable) (2024-01-03 r85769) -- "Unsuffered Consequences" > > > * then after installing a new R devel (I did not change anything else) > > > % R CMD config CXX > clang++ -arch arm64 -std=gnu++11 -std=gnu++17 > % R > > R Under development (unstable) (2024-03-12 r86109) -- "Unsuffered Consequences" >

Kasper D. Hansen (11:52:06) (in thread): > > $ R --version > R Under development (unstable) (2024-01-20 r85814) -- "Unsuffered Consequences" > Copyright (C) 2024 The R Foundation for Statistical Computing > Platform: aarch64-apple-darwin20 > > R is free software and comes with ABSOLUTELY NO WARRANTY. > You are welcome to redistribute it under the terms of the > GNU General Public License versions 2 or 3. > For more information about these matters see[https://www.gnu.org/licenses/](https://www.gnu.org/licenses/). > > $ R CMD config CXX > clang++ -arch arm64 -std=gnu++17 >

Kasper D. Hansen (11:52:55) (in thread): > So looks like it confirms Charlotte’s output.

Kasper D. Hansen (11:53:07) (in thread): > This is using the R-devel binaries from Simon Urbanek

Kasper D. Hansen (11:53:23) (in thread): > https://mac.r-project.org - Attachment (mac.r-project.org): R for macOS Developers > This is the home for experimental binaries and documentation related to R for macOS. To learn more about the R software or download released versions, please visit www.r-project.org.

Charlotte Soneson (11:53:35) (in thread): > Same for me

Kasper D. Hansen (11:59:23) (in thread): > This is confirmed by the configuration report which you can see athttps://mac.r-project.org/logs/log-R-devel.big-sur.arm64.html#conf

Kasper D. Hansen (11:59:38) (in thread): > > R is now configured for aarch64-apple-darwin20 > > Source directory: /Volumes/Builds/R4/R-devel > Installation directory: /Library/Frameworks > > C compiler: clang -arch arm64 -falign-functions=64 -Wall -g -O2 > Fortran fixed-form compiler: /opt/gfortran/bin/gfortran -arch arm64 -Wall -g -O2 > > Default C++ compiler: clang++ -arch arm64 -std=gnu++11 -std=gnu++17 -falign-functions=64 -Wall -g -O2 > C++11 compiler: clang++ -arch arm64 -std=gnu++11 -std=gnu++11 -falign-functions=64 -Wall -g -O2 > C++14 compiler: clang++ -arch arm64 -std=gnu++11 -std=gnu++14 -falign-functions=64 -Wall -g -O2 > C++17 compiler: clang++ -arch arm64 -std=gnu++11 -std=gnu++17 -falign-functions=64 -Wall -g -O2 > C++20 compiler: clang++ -arch arm64 -std=gnu++11 -std=gnu++20 -falign-functions=64 -Wall -g -O2 > C++23 compiler: clang++ -arch arm64 -std=gnu++11 -std=gnu++2b -falign-functions=64 -Wall -g -O2 > Fortran free-form compiler: /opt/gfortran/bin/gfortran -arch arm64 -Wall -g -O2 > Obj-C compiler: clang -arch arm64 -falign-functions=64 -Wall -g -O2 -fobjc-exceptions > > Interfaces supported: X11, aqua, tcltk > External libraries: pcre2, readline, curl > Additional capabilities: PNG, JPEG, TIFF, NLS, cairo, ICU > Options enabled: framework, shared BLAS, R profiling, memory profiling > > Capabilities skipped: > Options not enabled: > > Recommended packages: yes >

Kasper D. Hansen (12:00:18) (in thread): > Click on “build” and then scroll up

Kasper D. Hansen (12:00:39) (in thread): > I think sending this to Ripley should be the proof you need

Mike Smith (12:07:14) (in thread): > Awesome, thanks both for the reports and the extra detective work@Kasper D. Hansen. I was hunting around for that config file and couldn’t find it. I think I skipped past the link since it has the date Mar 31 2023 written next to it!

2024-03-15

Calandra Grima (01:58:22): > @Calandra Grima has joined the channel

2024-03-17

Raihanat Adewuyi (20:43:01): > @Raihanat Adewuyi has joined the channel

2024-03-18

Steffen Neumann (08:48:58): > Hi@Hervé Pagès, I think you broke opentimsr:slightly_smiling_face:

Steffen Neumann (08:51:17) (in thread): > I ran into the issue https://github.com/michalsta/opentims/issues/25 and am trying to fix it. I found that you helped the R team to fix something: https://cran.r-project.org/doc/manuals/r-release/NEWS.html in https://bugs.r-project.org/show_bug.cgi?id=18538 and I am unable to find where in opentims the issue is now - Attachment: #25 Installation opentimsr: arguments (‘na.rm’) after ‘…’ must appear in the same place at the end of the argument list > Hi,
> on R-4.3.3 I am getting: > > > R CMD INSTALL /tmp/opentimsr_1.0.13.tar.gz > * installing to library ‘/usr/local/lib/R/site-library’ > * installing **source** package ‘opentimsr’ ... > **** using staged installation > **** libs > using C compiler: ‘gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0’ > using C++ compiler: ‘g++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0’ > using C++17 > g++ -std=gnu++17 -I"/usr/share/R/include" -DNDEBUG -I'/usr/local/lib/R/site-library/Rcpp/include' -fpic -g -O2 -ffile-prefix-map=/build/r-base-14Q6vq/r-base-4.3.3=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -c RcppExports.cpp -o RcppExports.o > ... > g++ -std=gnu++17 -shared -L/usr/lib/R/lib -Wl,-Bsymbolic-functions -flto=auto -ffat-lto-objects -flto=auto -Wl,-z,relro -o opentimsr.so RcppExports.o Rinterface.o converters.o opentims.o scan2inv_ion_mobility_converter.o so_manager.o thread_mgr.o tof2mz_converter.o zstddec_cpl.o -L/usr/lib/R/lib -lR > installing to /usr/local/lib/R/site-library/00LOCK-opentimsr/00new/opentimsr/libs > **** R > **** inst > **** byte-compile and prepare package for lazy loading > Error : in method for ‘range’ with signature ‘x="OpenTIMS"’: arguments (‘na.rm’) after ‘...’ in the generic must appear in the method, in the same place at the end of the argument list > Error: unable to load R code in package ‘opentimsr’ > Execution halted > ERROR: lazy loading failed for package ‘opentimsr’ > * removing ‘/usr/local/lib/R/site-library/opentimsr’ > > > with > > > R version 4.3.3 (2024-02-29) > Platform: x86_64-pc-linux-gnu (64-bit) > Running under: Ubuntu 22.04.4 LTS >

Steffen Neumann (08:52:46) (in thread): > I only find ellipses at the end in opentimsr, so it should have been safe. What am I missing?

Steffen Neumann (09:26:03) (in thread): > The only relevant hint I see is that in plain R without anything loaded there is > > > range > function (..., na.rm = FALSE) .Primitive("range") > > but I have no idea how that plays/collides with the method definition :-(

Vince Carey (10:25:44) (in thread): > Have you tried patching to > > setMethod("range", > "OpenTIMS", > function(x, from, to, by=1L, ..., na.rm=TRUE){ > > – there is no example so I don’t know if this will work

Marcel Ramos Pérez (12:01:29) (in thread): > It seems like the method is not correctly specified, i.e., not using all the arguments in the generic. Where is the range generic? There should be an @importFrom pointing to the S4 generic. If you’re using the S3 generic it should look like range.<myS3class>

Steffen Neumann (12:10:15) (in thread): > Cool, thanks for your input. I am not the maintainer of opentimsr, only a user trying to fix it :slightly_smiling_face: Where do I find (or where should be) the definition of the generic? Currently, it is not in https://github.com/michalsta/opentims/blob/master/opentimsr/NAMESPACE, and I don’t see any definition elsewhere in the package https://github.com/michalsta/opentims/blob/d90c02355ac17014fefc47b6647e207ca6d0e760/opentimsr/R/opentimsr.R#L49. > So it seems that a corresponding generic needs to be defined, as I also don’t see anything in e.g. BiocGenerics

Marcel Ramos Pérez (12:15:17) (in thread): > I would contact the maintainer and ask for clarification. I didn’t see anything in MatrixGenerics either

Vince Carey (12:25:31) (in thread): > is this relevant? > > > showMethods("range") > Function: range (package base) > x="CompressedIntegerList" > x="CompressedIRangesList" > x="CompressedLogicalList" > x="CompressedNumericList" > x="CompressedRleList" > x="COO_SparseArray" > x="DelayedArray" > x="GenomicRanges" > x="GRangesList" > x="IntegerRanges" > x="IntegerRangesList" > x="SparseArraySeed" > x="StitchedGPos" > x="StitchedIPos" > x="SVT_SparseArray" > > > getMethod("range", "GenomicRanges") > Method Definition: > > function (x, ..., na.rm = FALSE) > { > .local <- function (x, ..., with.revmap = FALSE, ignore.strand = FALSE, > na.rm = FALSE) > { > if (!identical(na.rm, FALSE)) > warning("'na.rm' argument is ignored") >

Vince Carey (12:27:33) (in thread): > > > getGeneric("range") > standardGeneric for "range" defined from package "base" > belonging to group(s): Summary > > function (x, ..., na.rm = FALSE) > standardGeneric("range", .Primitive("range")) > <bytecode: 0x556f1c2635d0> > <environment: 0x556f1c26ae98> > Methods may be defined for arguments: x, na.rm > Use showMethods(range) for currently available ones. >

Steffen Neumann (13:03:39) (in thread): > Thanks everyone, I fixed the method definition and sent a PR
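For readers following along, the shape of the fix can be sketched as below. This is only an illustration based on Vince’s suggestion above; the argument names `from`, `to`, and `by` are hypothetical and the actual PR may differ. The rule the error message enforces: the generic is `range(x, ..., na.rm = FALSE)`, so any method-specific arguments must be inserted where the generic’s `...` sits, and `na.rm` must stay after `...` at the very end.

```r
## Sketch only: arguments besides x/.../na.rm are hypothetical placeholders.
## Extra method arguments go in place of '...', and 'na.rm' must remain
## after '...' at the end of the argument list, matching the generic.
setMethod("range", "OpenTIMS",
    function(x, from, to, by = 1L, ..., na.rm = FALSE) {
        ## method body elided
    })
```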

Hervé Pagès (14:21:24) (in thread): > > @Hervé Pagès, I think you broke opentimsr > Wow, really? opentimsr’s Depends and Imports: > > Depends: > R (>= 3.0.0) > Imports: > Rcpp (>= 0.12.0), > methods, > DBI, > RSQLite > > I think you’re overestimating my powers :wink:

Marcel Ramos Pérez (15:26:22) (in thread): > FWIW, it looks like the S4 generic is defined as part of a group rather than individually. See ?methods::Summary and getGroupMembers("Summary")

Vince Carey (17:01:43) (in thread): > https://stat.ethz.ch/R-manual/R-devel/library/base/html/groupGeneric.html and https://stat.ethz.ch/R-manual/R-devel/library/methods/html/S4groupGeneric.html are good to know
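As those pages describe, `range` belongs to the `Summary` group generic, so a single group method can cover it along with its siblings:

```r
## Defining an S4 method for the 'Summary' group covers range() along
## with max(), min(), sum(), prod(), any(), and all() in one go:
library(methods)
getGroupMembers("Summary")
```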

2024-03-22

Daniel Mullen (15:04:43): > @Daniel Mullen has joined the channel

Daniel Mullen (15:25:21): > I hope this is the right place to ask, but when building/submitting a package for Bioconductor review, if we have necessary datasets larger than 5 MB, is the accepted practice to handle these with AnnotationHub and/or ExperimentHub? Also, the files in question include a mix of genomic datasets used by functions in the package, as well as an example dataset for use in our function examples/vignette, so should these datasets be split between AnnotationHub and ExperimentHub, or should we roll them all into a single ExperimentHub record (since the genomic datasets are curated and we aren’t expecting to update them in the future)? Thank you for your assistance!

Hervé Pagès (16:44:09) (in thread): > The channel to use for this is#package-submission > > if we have necessary datasets that are larger than >5 Mb, is the accepted practice for handling these to use AnnotationHub and/or ExperimentHub? > Yes > > or should we roll them all into a single ExperimentHub (since the genomic datasets are curated and we aren’t expecting to update them in the future)? > Experimental data should go to ExperimentHub, and, if the genomic datasets are annotations, they should go to AnnotationHub. There might be use cases for these annotations beyond the scope of your package so people should be able to find them in AnnotationHub.
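For anyone new to the hubs, the user-facing side of this advice looks roughly like the sketch below. The package name and record ID are hypothetical placeholders; once data are in a hub, users fetch them by ID instead of the package shipping >5 MB files in its tarball.

```r
## Minimal sketch; "MyPackage" and "EH1234" are hypothetical placeholders.
library(ExperimentHub)
eh <- ExperimentHub()
query(eh, "MyPackage")   # browse records contributed for the package
dat <- eh[["EH1234"]]    # download (and cache) one resource by its ID
```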

Daniel Mullen (16:46:29) (in thread): > That makes sense. Thank you for your assistance and the heads up on that channel. If I have any further questions I’ll post them there.

2024-03-23

Chioma Onyido (12:46:40): > @Chioma Onyido has joined the channel

Chioma Onyido (12:47:26): > @Chioma Onyido has left the channel

2024-03-24

Najla Abassi (09:58:02): > @Najla Abassi has joined the channel

2024-03-25

Kevin Stachelek (13:17:42): > @Kevin Stachelek has joined the channel

2024-03-26

Sara m.morsy (16:09:06): > @Sara m.morsy has joined the channel

Sara m.morsy (23:42:33): > Hi, is anyone interested in joining a team to develop a Bioconductor package?

2024-03-27

Jovana Maksimovic (00:38:49): > @Jovana Maksimovic has joined the channel

abhich (05:45:56): > @abhich has joined the channel

Lluís Revilla (07:55:58) (in thread): > I am curious. Are you offering a position or a collaboration to develop the package?

Jeroen Ooms (09:04:59): > Hello, I am having another attempt at indexing https://bioc.r-universe.dev. I am running into an issue that for some packages I cannot even create a source package from the checkout because the Maintainer field in the DESCRIPTION file has invalid syntax. Is this known/expected? R does not allow multiple entries in the Maintainer field. It concerns ccrepe, cleanUpdTSeq, DegCre, fcScan, IntOMICS, NanoStringDiff, pmp, r3Cseq, RepViz, synapter, customProDB. Is this something that could be fixed?

Lori Shepherd (09:13:03) (in thread): > not sure how you are getting the checkouts … just looking at the first entry, is this because the maintainer field has multiple entries? > > Maintainer: Emma Schwager <emma.schwager@gmail.com>, Craig Bielski <craig.bielski@gmail.com>, George Weingart <george.weingart@gmail.com> > > because yes, legacy packages that use Author/Maintainer instead of Authors@R exist, and ones that use Maintainer instead of Authors@R could have multiple listed. It also does not affect making a source tar on our system or for myself locally https://bioconductor.org/checkResults/devel/bioc-LATEST/ccrepe/ > > SoftwarePkg$ R CMD build ccrepe > * checking for file 'ccrepe/DESCRIPTION' ... OK > * preparing 'ccrepe': > * checking DESCRIPTION meta-information ... OK > * installing the package to build vignettes > * creating vignettes ... OK > * checking for LF line-endings in source and make files and shell scripts > * checking for empty or unneeded directories > * building 'ccrepe_1.39.1.tar.gz' >

Kasper D. Hansen (09:13:51) (in thread): > I agree with @Jeroen Ooms that this is a bug because R-exts very clearly says that the maintainer is a single person.

Kasper D. Hansen (09:14:03) (in thread): > However, I am surprised that R CMD check does not catch this

Kasper D. Hansen (09:14:20) (in thread): > This is undoubtedly why we see packages with this behaviour

Kasper D. Hansen (09:14:41) (in thread): > I just checked the build report for ccrepe and it has multiple maintainers and nothing in the check log

Kasper D. Hansen (09:15:09) (in thread): > So I guess the long term solution would be for R CMD check to actually check this

Lori Shepherd (09:15:19) (in thread): > R CMD build would be more appropriate. Since it fails if you use Authors@R in this case but not if you use Authors/Maintainer

Kasper D. Hansen (09:16:42) (in thread): > I will leave this decision to people more knowledgeable than me, but I believe the intention is to have minimal checks in build

Lori Shepherd (09:18:40) (in thread): > it already does it for Authors@R in R CMD build, I think. But R still allows use of Author/Maintainer rather than just Authors@R (where it is enforced). Authors@R is checked on incoming packages now in BiocCheck, but the issue is that it didn’t always exist and there are a lot of legacy packages that do not use Authors@R. @Hervé Pagès at one point was against adding the BiocCheck but maybe he has more to say on it too

Jeroen Ooms (09:19:28) (in thread): > The reason why R CMD check does not catch it is because on CRAN this is already rejected before check runs (I think when building a source pkg)

Kasper D. Hansen (09:19:32): > Just an aside from following up on Jeroen’s question. I looked at the release build for ccrepe (this is not about calling out that package), but I was kind of surprised to see something like > > * checking R code for possible problems ... NOTE > calculate.z.stat.and.p.value: no visible global function definition for > 'var' > calculate.z.stat.and.p.value: no visible global function definition for > 'pnorm' > ccrepe: no visible binding for global variable 'cor' > nc.score: no visible global function definition for 'na.omit' > nc.score: no visible global function definition for 'complete.cases' > qc_filter: no visible binding for global variable 'x' > Undefined global functions or variables: > complete.cases cor na.omit pnorm var x > Consider adding > importFrom("stats", "complete.cases", "cor", "na.omit", "pnorm", "var") > to your NAMESPACE file. > > be a NOTE. Of course, this is how it has always been, but the issue is that I don’t think we catch (bad) NOTEs being introduced over the lifetime of a package. I think this particular note is the kind where making sure it gets resolved would improve the quality of the code

Dirk Eddelbuettel (09:20:20) (in thread): > I don’t think that is true @Jeroen Ooms – CRAN permits ‘old school’ Author: and Maintainer: in uploads today for existing packages, but they have been enforcing Authors@R for new packages for a while. (Mind you, with a single Maintainer.)

Kasper D. Hansen (09:20:31) (in thread): > Well, CRAN is not R etc etc, but you’re probably right about the origin. I stand by my assessment that this seems like a check that could / should be done

Lori Shepherd (09:20:37): > We struggle with people even cleaning up WARNINGS …

Jeroen Ooms (09:20:38) (in thread): > @Dirk Eddelbuettel that is unrelated to this

Jeroen Ooms (09:21:05) (in thread): > It is not allowed to have a package with a Maintainer field that has multiple names in it.

Kasper D. Hansen (09:21:08) (in thread): > Are you sure about this @Jeroen Ooms because @Lori Shepherd says that if you use Authors@R, this actually does get checked

Jeroen Ooms (09:21:37) (in thread): > And also you can have only one author with the cre role when you use Authors@R

Kasper D. Hansen (09:21:50) (in thread): > I mean, I read R-exts and you’re right about the requirement, but does CRAN actually have the custom check for old style Maintainer?

Dirk Eddelbuettel (09:22:03) (in thread): > @Kasper D. Hansen See what I wrote above. Old-form is allowed (at CRAN), but it is assumed (apparently) to have only one Maintainer.

Kasper D. Hansen (09:24:33) (in thread): > I agree with you Dirk (I think). The question is whether CRAN has custom code that checks this beyond what we already have in R CMD build/check

Kasper D. Hansen (09:24:46) (in thread): > The specs clearly say a single maintainer though.

Kasper D. Hansen (09:26:59): > One could dream of a system where we allow certain NOTEs/WARNINGs after review, then those allowances get stored, and then we heavily flag packages where new issues pop up.

Dirk Eddelbuettel (09:27:04) (in thread): > I was actually noodling with the idea of proposing extra optional checks called via a hook (similar to cleanup) but later convinced myself that maybe a package is easier for that. I often forget URL and BugReports; CRAN doesn’t care but I get an implicit naughtygram from @Jeroen Ooms via r-universe. More checks are always better, but getting changes into CRAN / base R takes long.

Kasper D. Hansen (09:28:10) (in thread): > That would be useful, also to R-core who could use this to look at proposals as well

Kasper D. Hansen (09:28:32) (in thread): > We have BiocCheck which does something like that

Jeroen Ooms (09:40:36) (in thread): > OK found it, CRAN checks it at the incoming (submission) stage. If you submit to CRAN you get this error: https://github.com/r-devel/r-svn/blob/master/src/library/tools/R/QC.R#L8533

Jeroen Ooms (09:46:01) (in thread): > If you check with --as-cran I also see that

Kasper D. Hansen (09:47:42) (in thread): > Ah good catch

Kasper D. Hansen (09:47:57) (in thread): > Ok, you’re 100% right then:slightly_smiling_face:

Lori Shepherd (09:48:22) (in thread): > we cannot use --as-cran since it specifies a lot of additional checks on the daily builder. Again, this is already checked on incoming Bioconductor packages as well with BiocCheck. Just not a check for existing packages.

Lori Shepherd (09:50:56) (in thread): > Again, it would be great if this could actually be put in R CMD build like it already is for Authors@R instead of both groups making ad-hoc checks for it

Lori Shepherd (09:52:00) (in thread): > I can reach out to these packages but we will not fix it for them obviously. And again, I know @Hervé Pagès in the past had strong feelings about not punishing those that do not use Authors@R / a single maintainer

Kasper D. Hansen (09:52:51) (in thread): > We can probably get this resolved pretty easily. I don’t think this is punishing; it is just reminding people of the standard

Lori Shepherd (09:54:43) (in thread): > It’s interesting that it was only these packages though, as there are other packages with multiple maintainers on our build report that are not listed here

Jeroen Ooms (09:55:17) (in thread): > What I don’t understand is how you reach out to maintainers if it is ambiguous who it is? The reason CRAN enforces the single maintainer is because they need a contact point for the package, and they will archive the package if that contact cannot be reached.

Lori Shepherd (09:55:28) (in thread): > @Kasper D. Hansen you’d be surprised; most argue that they want multiple people notified when a package is failing, to be listed as a maintainer to make changes, and also for recognition of work

Lori Shepherd (09:55:50) (in thread): > @Jeroen Ooms see above – that is actually the intention of most of these packages

Kasper D. Hansen (09:56:29) (in thread): > Oh, I agree with that sentiment - it is much nicer to get notifications to multiple people. But that’s not the standard (unfortunately).

Lori Shepherd (09:57:36) (in thread): > CRAN standard – this was one of herve’s strong arguments against enforcing it

Kasper D. Hansen (09:57:38) (in thread): > Almost every time I get contacted as a maintainer, my first task is to forward it to other people

Kasper D. Hansen (09:58:19) (in thread): > Well, R-exts very clearly states that the Maintainer field has a single person, at least now. I looked it up

Dirk Eddelbuettel (09:59:07) (in thread): > And they (wrongly, as you and I feel) argue that the single field cannot be a mailing list or alias. Silly but ‘those are the rules’

Lori Shepherd (10:00:12) (in thread): > well the problem with many mailing lists (as we / I’ve encountered a few in Bioconductor) is they involve sign-up … so an auto-notification system, like our BBS emails, actually never gets through because the messages are stuck in a sign-up step

Lori Shepherd (10:01:07) (in thread): > alias however – completely agree

Dirk Eddelbuettel (10:01:32) (in thread): > @Lori Shepherd Their objection predates that, and I explicitly stated alias too because you can have a local alias (Rcpp did; they told us to remove it). Anyway – no point in arguing about it, it is their stated fact. > > Re --as-cran, and you probably know this, but you could try to use it yet opt out of the checks giving you trouble, as there are a lot of individual env vars to turn things on and off. But might be too much work …

Lori Shepherd (10:02:08) (in thread): > yes agreed alias should probably be allowed if the one maintainer is enforced

Jeroen Ooms (10:28:23) (in thread): > OK I guess this is more involved than I thought. I’ll try to work around it on my end then.

Jeroen Ooms (10:44:57): > One final question (more of a comment than a question:sweat_smile:)

Jeroen Ooms (10:45:50): > There is a package called IntOMICS on git and bioc; however, the package name in the DESCRIPTION seems to be IntOMICSr (with an extra r at the end): https://www.bioconductor.org/packages/devel/bioc/html/IntOMICS.html

Jeroen Ooms (10:46:11): > Is this known? It is certainly confusing my bot

Jeroen Ooms (10:49:49): > It is not reflected in the bioc metadata: > > bioc <- jsonlite::fromJSON('https://bioconductor.org/packages/json/3.19/bioc/packages.json') > bioc$IntOMICS$Package >

Kasper D. Hansen (10:50:34): > I think this may be a recent name change. And I may have been (indirectly) involved in it. The issue is that Intomics is a registered trademark and the name of a bioinformatics company. Unrelated to the company, a researcher published an algorithm with the same name. Unfortunately, since it is all about genomics data, this is a case where a company has to defend their trademark or lose it.

Jeroen Ooms (10:51:09): > OK so maybe the metadata should be updated accordingly?

Kasper D. Hansen (10:51:22): > Perhaps this change is following some private negotiation between the company and the package authors.

Kasper D. Hansen (10:51:59): > Yeah, I am guessing the authors may have wanted to rename their package and thought that this could happen through DESCRIPTION, not realizing that other changes need to be made as well

Kasper D. Hansen (10:53:12): > Additional information: (1) the trademark has been active for quite a few years, well preceding the name of the method (in effect, that method name should not have been used); (2) the company was recently acquired and had a name change, but the trademark still exists.

Jeroen Ooms (10:53:36): > Perhaps it makes sense to just retire the old package name and re-introduce it with the new name?

Kasper D. Hansen (10:54:10): > Probably the core team needs to get in touch with the package authors about how to resolve this. There is some guesswork on my end here as well

Jeroen Ooms (10:54:30): > Fortunately nothing seems to depend on it, so renaming it should not break other pkgs.

Kasper D. Hansen (10:55:04): > It’s unfortunate all around because they used this name in their research paper.

Sara m.morsy (11:04:57) (in thread): > It is a collaboration, and we will submit to BioHackathon EU to finalize it by the end of this year

Lluís Revilla (11:12:08) (in thread): > I might be interested depending on more details, such as the area of the package, any publication related to it, why you seek a collaborator, …

Hervé Pagès (12:09:57): > FWIW the mismatch between repo and package names causes this error on the daily builds: https://bioconductor.org/checkResults/devel/bioc-LATEST/IntOMICS/

Hervé Pagès (12:18:05) (in thread): > > I know@Hervé Pagèsin the past had strong feelings about not enforcing punishing those that do not use of Authors@R / a single maintainer > Just to clarify: true and false. Yes strong feelings about not punishing those that still use the good ol’ Author/Maintainer fields, but no opinion about those that use multiple maintainers (if that works for them, and for us, then why not, we’re not CRAN).

Lori Shepherd (12:19:17): > It’s been failing. They asked on bioc-devel a while ago. I have been in the process of helping them reset it on our system but it has been a struggle

Lori Shepherd (12:20:43): > They needed to reset branches and currently the package is still failing because they didn’t update all the necessary files to pass R CMD build / R CMD check – so the removal / replacement is in progress

Henrik Bengtsson (13:13:05) (in thread): > I just verified what was already said above. R CMD build teeny does not complain about: > > Package: teeny > Version: 1.0 > Title: A Minimal, Valid, Complete R Package > Description: A minimal, valid, complete R package that can be used as a baseline for testing and troubleshooting various components of R and R packages. > Author: Alice Bobson <alice@example.org>, Charlie Carolson <charlie@example.org> > Maintainer: Alice Bobson <alice@example.org>, Charlie Carolson <charlie@example.org> > License: GPL (>= 3) > > but with: > > Package: teeny > Version: 1.0 > Title: A Minimal, Valid, Complete R Package > Description: A minimal, valid, complete R package that can be used as a baseline for testing and troubleshooting various components of R and R packages. > Authors@R: c( > person("Alice", "Bobson", role=c("aut", "cre", "cph"), > email = "alice@example.org"), > person("Charlie", "Carolson", role=c("aut", "cre", "cph"), > email = "charlie@example.org")) > License: GPL (>= 3) > > it gives: > > $ R CMD build teeny > * checking for file 'teeny/DESCRIPTION' ... OK > * preparing 'teeny': > * checking DESCRIPTION meta-information ... ERROR > Authors@R field gives more than one person with maintainer role: > Alice Bobson <alice@example.org> [aut, cre, cph] > Charlie Carolson <charlie@example.org> [aut, cre, cph] > > See section 'The DESCRIPTION file' in the 'Writing R Extensions' > manual. >

Henrik Bengtsson (13:21:38) (in thread): > … and the multiple Maintainer issue is caught by R CMD check --as-cran: > > $ R CMD check --as-cran teeny_1.0.tar.gz > * using log directory '/home/henrik/repositories/teeny.Rcheck' > * using R version 4.3.3 (2024-02-29) > * using platform: x86_64-pc-linux-gnu (64-bit) > * R was compiled by > gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 > GNU Fortran (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 > * running under: Ubuntu 22.04.4 LTS > * using session charset: UTF-8 > * using option '--as-cran' > * checking for file 'teeny/DESCRIPTION' ... OK > * this is package 'teeny' version '1.0' > * checking CRAN incoming feasibility ... [3s/12s] WARNING > Maintainer: 'Alice Bobson <alice@example.org>, Charlie Carolson <charlie@example.org>' > > The maintainer field is invalid or specifies more than one person > > New submission > * checking package namespace information ... OK > ... > > but not with R CMD check without --as-cran.

Lori Shepherd (13:23:07) (in thread): > yes, that was my point as well. It seems like if it is to be truly enforced then R-core should adjust R CMD build to fail for both cases of multiple maintainers, Authors@R and Maintainer, not just the former.

Henrik Bengtsson (13:25:39) (in thread): > For the sake of Bioc, I think it’s worth considering Author/Maintainer being deprecated and moving to Authors@R.

Lori Shepherd (13:25:56) (in thread): > we do in BiocCheck for incoming packages… at least I think we do… I’ll double check whether it’s an enforcement or a suggestion

Dirk Eddelbuettel (13:26:44) (in thread): > Not picking a fight here with you, but this needs no change at their end. Every upload to CRAN is required to pass --as-cran so this is covered and tested. It is too bad your local circumstances prevent you from running R CMD check --as-cran in your ‘prod’, but the ability is there.

Henrik Bengtsson (13:27:21) (in thread): > > we do in BiocCheck for incoming packages (edited) > With the risk of taking this thread in a different direction - would it be feasible to run BiocCheck before each new release?

Lori Shepherd (13:27:54) (in thread): > again, there are many checks that are specific to CRAN, so we cannot use --as-cran .. nor is Bioconductor CRAN, as Herve pointed out

Lori Shepherd (13:29:06) (in thread): > There would be tons of ERRORs besides the Authors@R one running BiocCheck on legacy packages, as we have implemented many new policies over the years – we have talked about doing this and posting the results somewhere and slowly working towards implementing it, but my guess is it would be a long / multi-year process to have it fully implemented on accepted packages

Hervé Pagès (13:30:42) (in thread): > Link to the experimental BiocCheck builds: https://bioconductor.org/checkResults/3.19/bioc-testing-LATEST/ They run on a small subset of software packages only.

Henrik Bengtsson (13:31:50) (in thread): > I peeked at the R CMD check source code (https://github.com/wch/r-source/blob/cf81bfdbfcbd98388234c06d4c3e3f6a90166046/src/library/tools/R/QC.R#L7379-L7418), and unfortunately it does not seem that this particular CRAN-incoming check can be enabled with a specific *R_CHECK_nnn* environment variable. Some can, but not this one.

Dirk Eddelbuettel (13:35:57) (in thread): > @Henrik Bengtsson I did the same by glancing at what env vars are indexed in the R Internals manual giving some control over what runs (and I keep forgetting how many / few have the switches). Putting an opt-in switch around the test may be a patch that gets merged, and would help @Lori Shepherd.

Henrik Bengtsson (14:10:27) (in thread): > > Putting an opt-in switch around the test may be a patch that gets merged … > Yes, that could be worth exploring and a quick solution. > > Related to this: The current R CMD check (src/library/tools/QC.R) is a rather complicated nested set of local … local functions that are hard to debug and play around with. I think a rewrite of the check framework could greatly benefit the R Project, the Bioconductor Project, and beyond. Imagine if we could support check “extensions” like R CMD check --flavor=CRAN, R CMD check --flavor=Bioconductor, or R CMD check --flavor=CRAN,Bioconductor,lintr. This is of course a big project. I’ve started on that idea in <https://github.com/HenrikBengtsson/Wishlist-for-R/issues/16>.

2024-03-28

Jeroen Ooms (06:15:46): > Does Bioconductor have a particular strategy to deal with dependencies that are archived on CRAN? Like a particular CRAN snapshot? I am running into a handful of packages in devel that cannot be built because they have a hard dependency on something that was archived on CRAN in the past few months. Is this related to your CRANhaven project @Henrik Bengtsson?

Lori Shepherd (06:40:13): > We talked about CRANhaven for release, not for devel. But it is still under discussion. We currently pick these up on the builders after we reinstall R. If a CRAN package remains archived, Bioconductor packages that depend on it will have to remove the dependency or be deprecated, so we would rather know sooner than later and notify maintainers that they have to start modifying their packages

Almog Angel (07:04:24): > Seems like I cannot use BiocParallel::bplapply while training models with lightgbm::lgb.train. I tried using the num_threads = 1 parameter for lightgbm but it still fails to train models in parallel, resulting in: > > LightGBM Model > (Booster handle is invalid) > > Any ideas?
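One possible explanation (an assumption on my part, not a confirmed diagnosis): lgb.Booster objects wrap external pointers, which do not survive serialization to BiocParallel workers, so a handle created or touched in the main session can arrive invalid on the other side. A common workaround is to do all training inside the worker function and return only serializable objects, e.g. a saved model file:

```r
## Sketch under the assumption that the invalid handle comes from an
## external pointer being serialized across workers. X and y are the
## caller's training data.
library(BiocParallel)
library(lightgbm)

fit_one <- function(i, X, y) {
    dtrain <- lgb.Dataset(X, label = y)
    booster <- lgb.train(params = list(objective = "regression",
                                       num_threads = 1L),
                         data = dtrain, nrounds = 10L)
    path <- tempfile(fileext = ".txt")
    lgb.save(booster, path)   # return a file path, not the live handle
    path
}

paths <- bplapply(1:4, fit_one, X = X, y = y, BPPARAM = MulticoreParam(2))
## models can then be reloaded in the main session with lgb.load()
```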

Henrik Bengtsson (11:41:08) (in thread): > Yes, CRANhaven is meant to give an extra cushion for when a package is archived on CRAN. It’s meant to add some extra leeway for the R community and projects like Bioconductor against the disruptive, sudden, zero-notice archival of CRAN packages. > > Packages can currently live on CRANhaven for five weeks. Importantly, a third of archived packages return to CRAN at some point. Approx 50% of the packages that eventually get unarchived do so within the five-week time limit. The exact lifespan on CRANhaven can and will be fine-tuned going forward, but the objective is to strike a balance between lowering friction for the R community and keeping some “pressure” on package maintainers to resubmit fixes to CRAN. (An extreme alternative, which I don’t think we want to use, would be a “CRANafterlife” repository that would serve a forever-version of CRAN, which btw I think is already covered by your CRAN R-universe) > > Other things have to be considered too, e.g. what should happen when we’re close to new Bioconductor releases? As @Lori Shepherd suggests, one approach is to only use CRANhaven as a fallback for Bioc release (“cushion”), but not Bioc devel (“no cushion”). > > We’re already learning things from CRANhaven. For example, a perfectly fine package (all OK) gets archived on CRAN if the maintainer email address bounces or there’s no response. I’ve already notified one maintainer who didn’t know their package was archived. This means CRANhaven can also serve as a mechanism to notify maintainers who are not aware (e.g. automatically creating a GitHub issue, or through the community reaching out through different channels). The CRANhaven Dashboard is meant to help with this process. > > So, yes, I definitely think CRANhaven will be able to remove some of the pain points that Bioconductor and its community experience from CRAN packages being archived.

Jeroen Ooms (11:48:07) (in thread): > Hmm, some bioc packages depend on CRAN packages that were archived in 2023.

Lori Shepherd (11:48:58) (in thread): > these should already have been caught and notified or deprecated. Do you check which packages are currently deprecated?

Jeroen Ooms (11:59:12) (in thread): > I was going over all packages in https://bioconductor.org/packages/json/3.19/bioc/packages.json. Should I filter out the ones that have "PackageStatus": "Deprecated"?

Lori Shepherd (12:02:02) (in thread): > most likely, as those are the ones that are marked as deprecated in 3.19 and for removal in 3.20. Some of these may return, but normally not more than a few do: https://support.bioconductor.org/p/9156893/

Lori Shepherd (12:02:15) (in thread): > they are identified as failing on our builders for an extended period of time and have been notified and are unresponsive

Jeroen Ooms (12:03:29) (in thread): > OK thanks I will remove those from my list then

Lori Shepherd (12:15:49) (in thread): > These will also exclude packages that were deprecated at the maintainer’s own request, so yes, probably a good thing to do

Joselyn Chávez (12:58:39): > Dear developers, I would like to hear your opinion on an issue. > > We are preparing the Giotto package for submission to Bioconductor. We are listing diverse packages under the Imports section, but we’re facing a conflict with some functions that have the same name. We are already using the package::function syntax when calling them in our code, but we still get the warning when loading our package, and of course when running BiocCheck. Is there a recommended way to deal with these conflicts? - File (PNG): Screenshot 2024-03-28 at 12.42.06 PM.png

Dirk Eddelbuettel (13:06:17) (in thread): > Are you using a ‘global’ Imports: pkg (also in NAMESPACE) or the more selective and preferred @importFrom pkgA fA1 fA2? That second way is preferred if you can. A classic case is importing, say, select and mutate from dplyr but not filter.
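To illustrate the point, a selective import keeps the conflicting name out of the package namespace entirely. dplyr is used purely as an example here; substitute the packages from the screenshot, and the helper below is hypothetical:

```r
## In NAMESPACE, import only the functions you need:
##   importFrom(dplyr, select, mutate)
## instead of the conflict-prone
##   import(dplyr)
## With roxygen2, annotate the function that uses them:

#' @importFrom dplyr select mutate
my_helper <- function(df) {
    mutate(select(df, value), doubled = value * 2)
}
```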

Jonas Schuck (14:20:23): > @Jonas Schuck has joined the channel

Joselyn Chávez (15:39:40) (in thread): > Thanks a lot, that worked!

2024-03-29

Manisha Nair (06:14:29): > @Manisha Nair has joined the channel

2024-03-30

Artür Manukyan (15:49:16): > @Artür Manukyan has joined the channel

2024-04-10

Ludwig Lautenbacher (04:44:05): > @Ludwig Lautenbacher has joined the channel

Ludwig Lautenbacher (04:56:47): > Continuing the conversation from #general > > Hi everyone! I have a quick question that I hope you are able to help me with. I’m preparing to submit a package to Bioconductor. The package is a client for a web server, and its R code resides in a subdirectory of the main GitHub repository. Can I submit it from this subdirectory, or do I need to create a separate repository for the submission? I would prefer to have the client and server code in the same repo but if that is not supported I will separate them. > Thanks, @Kasper D. Hansen and @Vince Carey, for your assistance! For those interested, here’s the repository: https://github.com/wilhelm-lab/koina/tree/feature/rclient. Currently, the R package resides in clients/koinar alongside the Python package. Keeping all code in one repository seems more manageable for an overview of all related code. > Is it possible to build only a subfolder in Bioconductor? Creating a separate repository appears to be an excessive step, considering the package will be mirrored to git.bioconductor.org. However, I’ll opt for that if there are no simpler alternatives.

Vince Carey (05:39:40) (in thread): > I had a look at the repo and found it quite interesting, so I hope we will be able to work together. At this time I don’t see a path to submitting the package as a subfolder of a larger repo, but that might be a feature of my ignorance of git/github capabilities. Clearly this is a result of conventions that have evolved over time in Bioconductor, that have worked for hundreds if not thousands of submissions (see https://github.com/Bioconductor/Contributions/issues/ with over 3000 issues closed). I don’t see a way to relax this constraint at this time, so I would ask that you take the step of submitting via separating the client code. (It’s conceivable that you could come up with a maintenance approach for a submission to CRAN that would not require you to make this separation, and then Bioconductor packages could interoperate with the koina client package by declaring the relationship and calling its functions.) We will discuss the technical issue in Bioc core and will comment back here if there are insights that would simplify the process.

Alan O’C (07:17:56) (in thread): > That’s a branch, not a subdirectory? If it was a subdirectory you could separate it into a git submodule, then include the submodule in the parent project, and submit the R client to bioc as normal

Alan O’C (07:18:36) (in thread): > Using branches like that is a bit of a code smell to me if I’m honest

Ludwig Lautenbacher (07:44:12) (in thread): > It’s a branch at the moment because I’m still working on it. Once the client is done I will merge it to main.

Alan O’C (07:45:37) (in thread): > Then yeah just pull out the history of clients/koinar into its own repo and make that a submodule of the main repo, submit koinar to bioc as normal

Ludwig Lautenbacher (09:18:14) (in thread): > Thanks for your help! I submitted the package now: https://github.com/Bioconductor/Contributions/issues/3392 - Attachment: #3392 KoinaR - R package to interface with Koina web service > Update the following URL to point to the GitHub repository of
> the package you wish to submit to Bioconductor > > • Repository: https://github.com/wilhelm-lab/koinar

Alan O’C (09:19:54) (in thread): > Awesome, hope it goes well:slightly_smiling_face:

Hervé Pagès (21:12:51): > @Aaron Lun Why isn’t a round trip with saveObject/readObject bringing back the original matrix? > > library(alabaster.base) > m <- matrix(1:12, ncol=3) # dense matrix > tmp <- tempfile() > saveObject(m, tmp) > m2 <- readObject(tmp) # NOT a dense matrix > identical(m2, m) > # [1] FALSE > class(m2) > # [1] "ReloadedMatrix" > # attr(,"package") > # [1] "alabaster.matrix" > identical(as.matrix(m2), m) > # [1] TRUE > > It does for a DataFrame. > > I would not expect this from something that intends to provide a language-agnostic equivalent to RDS-based serialization. Why not just bring back the original object by default, like saveRDS/readRDS does, unless specified otherwise?

Aaron Lun (21:55:55): > I did consider it, and for a period of development, I did try to recapitulate the class exactly. But in the end I gave up, for several reasons. > * In the specific case of matrices and arrays, lazy loading was very beneficial for a wide variety of our applications (e.g., Shiny), given that most uses of complex objects like, e.g., SEs didn’t need all of the assays in memory at once. So any attempt to default to restoring the original class increased resource usage (loading time and RAM consumption) across the board and made everyone unhappy. > * On a similar note, the ReloadedMatrix class is particularly useful for deduplication when people update annotations or metadata in an SE without changing the assay values. The saveObject,ReloadedArray-method is specialized to avoid actually saving the object, instead just linking to the existing file, saving time, bandwidth and storage space. Restoring the original class would lose that provenance information when the object gets fed back into saveObject. > * Class preservation was previously achieved by having language-specific tags to instruct each client to coerce to the right class. However, those tags would get wiped upon a roundtrip via another language, given that R and Python don’t have exact equivalents of each other’s data structures. After a few of these trips between languages, the disk representation ends up converging to the “lowest common denominator” as all sets of tags get wiped. I suppose one could ask each language’s reader/saver to preserve the other languages’ tags when doing a roundtrip, but I think that would result in some rather unpredictable behavior, as the result would depend on the not-easily-user-visible history of each object. > * In the end, no one really seemed bothered by it, as it was easy enough to get to a suitable representation for their downstream work. Each application can decide how it wants to handle things after (or even during) the loading process. 
For example, scRNAseq and celldex override the alabaster readers so that matrices are restored in memory, for compatibility with the previous representations. > So yes, I know where you’re coming from, and I tried, but the philosophical benefits of a perfect roundtrip were outweighed by the practical realities of people actually using this system.

2024-04-11

Lambda Moses (02:18:52): > @Aaron Lun Another question about alabaster: I think it makes sense to save the assay matrices as h5 so we can use DelayedArray and not load them into memory. But you also save colData, rowData, and spatialCoords as h5. In those cases, why did you choose h5 instead of, say, arrow parquet, which is also supported by different languages? Is it to not add more system dependencies?

Aaron Lun (02:34:20): > Mostly because I already had it in HDF5 and I couldn’t be bothered to change it, given that we’d be loading it into memory anyway. The secondary reason is the dependencies. The exact format doesn’t really matter here because the default DF readers will load everything into memory anyway. > > That said, it’s totally possible to create another specification for, e.g., parquet_data_frame and read/write that into, say, this. I just haven’t encountered a real use case where the DF is so large that it warrants that kind of treatment.

Aaron Lun (02:44:58): > Also, I vaguely remember arrow taking an eternity to install on my laptop. Don’t know what it was doing, but I decided I didn’t want to deal with that on a regular basis.

Hervé Pagès (05:16:37): > How about giving the user the choice between ordinary matrices coming back as ReloadedMatrix objects or in their original form by adding an arg to readObject()? E.g. something like readObject(path, strict=TRUE) would return the original object, with strict=FALSE as the default since this is the most useful behavior for the typical alabaster use cases. > Of course it’s not a big deal when the original object is an ordinary matrix, since the user could just call as.matrix() on the object returned by readObject(). However, for more complex objects where matrices can hide in many corners (e.g. an SCE or a GRanges object with a matrix in the metadata columns), the user won’t necessarily know how to restore the original object. > For example, with this RangedSummarizedExperiment object: > > library(SummarizedExperiment) > m <- matrix(101:112, nrow=4) > gr <- GRanges("chr1", IRanges(1, 11:14)) > mcols(gr) <- DataFrame(m=I(m)) > counts <- matrix(rpois(8, 0.9), nrow=4) > rse <- SummarizedExperiment(list(counts=counts), rowRanges=gr) > > People won’t understand why, after a round trip to alabaster, the returned object is not identical to the original one, and they won’t know where the differences are or what they are. Are the differences in form only, or in content? They have no easy way to tell. Knowing that they can actually get the object back in strictly identical content and form by using strict=TRUE would be reassuring, as it would demonstrate that saveObject() didn’t lose or alter the original content. > > BTW have you considered using save.alabaster/read.alabaster instead of saveObject/readObject?

Lambda Moses (06:14:51) (in thread): > It does take quite a while to compile the arrow R package and make sure it matches the C++ version, but I’ve been using binaries as well. I think it’s worthwhile to deal with parquet because it’s becoming common in spatial transcriptomics; in Vizgen MERFISH (using CellPose) and 10X Xenium output, for example, the cell segmentation vertices come in a parquet file. For Xenium, there’s also a csv file for the segmentation vertices, but it’s so much faster to read parquet compared to csv, even when using fread. I also find parquet attractive because when you read it in Python or JavaScript as well, the data is not duplicated in memory. > > For alabaster.sfe (still working on it), I’ve been storing the geometries on disk as GeoParquet, which is read back into R as sf quickly. GeoParquet files are also much smaller than GeoJSON. Here GeoParquet IO is done by GDAL rather than arrow, and you have to install GDAL in order to install sf anyway, though a newer version of GDAL is required for GeoParquet. I can also use SQL to query the parquet file, which can be more helpful for transcript spot geometries, when each gene has millions of spots and you don’t need to load the spots of all genes at once.

Raymond Lesiyon (22:17:44): > @Raymond Lesiyon has joined the channel

2024-04-12

Aaron Lun (02:53:54) (in thread): > In the past, I did have an argument in readObject() that did that. And then I removed it because I could see it getting complicated. Specifically, I could foresee different users asking for different options to coerce to their favorite in-memory type, and my function signatures would get too cluttered. I also didn’t want to be the arbiter of what was a “worthy” coercion to include. > > So, this task is now left to the user, or more typically, to the application that calls alabaster. You’ve already seen scRNAseq’s scLoadObject override for S(C)Es, but the same approach can be applied to override the loading of each child component. For example, the matrices hiding in the mcols can be coerced by overriding the loading generic with specific behavior whenever the dense_array object type is encountered. > > More generally, I don’t think I can guarantee that the same object is returned after a roundtrip to alabaster’s on-disk representation. The most obvious example is that attributes are lost; other languages don’t really have this concept, so they just get sliced off. (While less obvious, we also lose ALTREP optimizations that can impact performance before/after a roundtrip.) One can check the specifications here to see what alabaster considers important such that support is mandated across all language clients. > > Otherwise, if 100% recovery of the object in the R session is important, sacrificing inter-language portability and using an RDS file is the easiest solution. From rds2py, we have a fair amount of experience with loading RDS files in Python, and much information is discarded along the way when creating the appropriate data structure - that’s just how it is. > > As for save.alabaster: no, I haven’t considered it, but I probably wouldn’t do it unless I was forced to. I don’t like putting the package name in function names; it seems redundant when namespacing can be used.
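As a hypothetical sketch of that kind of override, assuming alabaster.base’s altReadObject/altReadObjectFunction mechanism and that the on-disk object type is exposed as metadata$type (check the alabaster.base documentation for the exact signatures before relying on this):

```r
library(alabaster.base)

## Replace the reader used for nested objects: realize any dense_array
## as an ordinary in-memory array instead of a file-backed ReloadedArray.
old <- altReadObjectFunction(function(path, metadata, ...) {
    obj <- readObject(path, metadata, ...)   # default behavior
    if (identical(metadata$type, "dense_array")) {
        obj <- as.array(obj)                 # drop the file-backed wrapper
    }
    obj
})

## ... readObject() calls made here now return plain arrays for any
## dense_array encountered, including matrices hiding in mcols() ...

altReadObjectFunction(old)  # restore the default reader
```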

Aaron Lun (02:55:31) (in thread): > sounds interesting. maybe the same on-disk representation can be generalized for ParquetDataFrames.

Jovana Maksimovic (03:13:48): > Hi Bioconductor Team, @Calandra Grima is currently updating my and @Belinda Phipson’s missMethyl package to work with Illumina’s EPIC v2 array. > Currently there is an annotation for this available on AnnotationHub (https://bioconductor.org/packages/devel/data/annotation/html/EPICv2manifest.html); however, we cannot easily integrate it within our workflow, as missMethyl assumes the annotation data format used by the minfi package. > Instead, we are currently using these packages, which are compatible with minfi: > * https://github.com/jokergoo/IlluminaHumanMethylationEPICv2anno.20a1.hg38 > * https://github.com/jokergoo/IlluminaHumanMethylationEPICv2manifest > Will having these GitHub packages as dependencies cause problems during the Bioconductor submission process? > Also, are you aware whether the author of these annotation packages is planning to submit them to Bioconductor, and in which format (annotation package or annotation hub)? - Attachment (Bioconductor): EPICv2manifest (development version) > A data.frame containing an extended probe manifest for the Illumina Infinium Methylation v2.0 Kit. Contains the complete manifest from the Illumina-provided EPIC-8v2-0_EA.csv, plus additional probewise information described in Peters et al. (2024).

Lluís Revilla (04:16:34) (in thread): > I am just another member of Bioconductor, but packages on CRAN and Bioconductor can only depend on packages in these repositories; if packages outside them are needed, they should be in Suggests, with an Additional_repositories field added. You can find more information here about what is allowed and how to do it: https://contributions.bioconductor.org/bioconductor-package-submissions.html - Attachment (contributions.bioconductor.org): Chapter 1 Bioconductor Package Submissions | Bioconductor Packages: Development, Maintenance, and Peer Review > Introduction Types of Packages Package Naming Policy Author/Maintainer Expectations Submission Experiment data package Annotation package Workflow package Review Process Following Acceptance…

Renuka Potbhare (05:20:41): > @Renuka Potbhare has joined the channel

Hervé Pagès (05:56:50) (in thread): > > if packages outside [CRAN/Bioconductor] are needed they should be in a suggests and add a Additional_repositories field. > I don’t see this in our doc and I don’t think it’s a good idea. At the very best they should be in Enhances, not Suggests. This means that they won’t be installed for R CMD check, so any code in your package making use of functionality from the packages in Enhances will need to be dead code, which is not good. This is why we strongly discourage this practice.

Lluís Revilla (06:00:40) (in thread): > sorry, I didn’t mention that I was referencing the R manual Writing R Extensions: https://cran.r-project.org/doc/manuals/r-release/R-exts.html. Of course, Bioconductor can restrict those rules.

Lluís Revilla (06:00:44) (in thread): > @Hervé Pagès I’m interested in hearing your position on this, especially considering that it is now very easy to create additional repositories and there are several movements to create new repositories aspiring to become officially recognized by R/R-core.

Hervé Pagès (10:54:44) (in thread): > I don’t know what “becoming officially recognized by R/R-core” will mean concretely so I guess we’ll just wait and see…

Lluís Revilla (11:38:27) (in thread): > Yeah, that’s what I’ve been working on. The lack of criteria is not helping. > In my opinion, the same recognition CRAN and Bioconductor have: at least that CRAN and Bioconductor would halt submissions with the same name as a package in that repository, and would allow depending on packages in those other repositories.

Hervé Pagès (12:47:22) (in thread): > > Specifically, I could foresee different users asking for different options to coerce to their favorite in-memory type, and my function signatures would get too cluttered. I also didn’t want to be the arbiter of what was a “worthy” coercion to include. > Just to clarify, I was only suggesting an option to get things back in their original in-memory type, e.g. via a simple toggle strict or with.original.type. Doesn’t seem like too much clutter and doesn’t really invite crazy requests. > > As for the save.alabaster; no, I haven’t considered it, but I probably wouldn’t do it unless I was forced to. I don’t like putting the package name in the function names, seems redundant when namespacing can be used. > The alabaster part in the name here refers to the format, not the name of the package (also, strictly speaking, the functions are defined in alabaster.base, not in alabaster). The saveObject/readObject names are so generic. save*/read* functions are usually suffixed with the name of the format they work with, e.g. saveRDS/readRDS, write.dcf/read.dcf, etc… Helps convey what the function is really about.

Kasper D. Hansen (13:32:31): > This is a big problem. I had not understood that the recently submitted manifest package is not following the minfi standard. That is super confusing and unexpected. The package name should IMO change. Our standard for annotation packages is that there is some relationship between package name and data format and that seems broken here.

Lori Shepherd (13:42:32) (in thread): > Those two GitHub packages were recently submitted to Bioconductor and are included in devel, so they are available and you shouldn’t have a problem listing them in your package

Kasper D. Hansen (14:14:03): > Now, some of this is caused by minfi not having an official annotation file for EPIC2 and I really, really need to fix that. But my comment is about package naming amongst the annotation packages

Lori Shepherd (14:23:01): > James and I asked Zuguang about his packages when we were running the release annotation pipeline, to know whether they were going to be submitted or we should provide them, since so many were asking for the EPIC v2 and many pointed to the unofficial GitHub packages. I might suggest asking him to update the format on the GitHub repos and resubmit an updated version. It was noted that over a year ago you were pinged about providing these packages for 3.17 yourself, which never resolved.

2024-04-13

Kasper D. Hansen (07:46:08): > Lori, I have absolutely no issue with Zuguang’s packages (I assume you refer to https://github.com/jokergoo/IlluminaHumanMethylationEPICv2manifest). That would be unreasonable. My issue is with the (accepted) EPICv2manifest package. But looking more carefully, I no longer have an issue. I think I was thrown off by the title of the page https://bioconductor.org/packages/devel/data/annotation/html/EPICv2manifest.html, which specifically is “Illumina Infinium MethylationEPIC v2.0 extended manifest from Peters et al. 2024”. To me, that title seems very similar to the package-style names we have for minfi annotation packages, and I think that is a problem because the EPICv2manifest package is not a minfi-style annotation package. However, as I write this, it is clear to me that I was confused by the title of the HTML page, since the actual package name is substantially different. I must also add that I think it is great that the EPICv2manifest package exists; I was just unhappy with the name (well, as I said above, more specifically, I was unhappy with the title of the HTML page). - Attachment (Bioconductor): EPICv2manifest (development version) > A data.frame containing an extended probe manifest for the Illumina Infinium Methylation v2.0 Kit. Contains the complete manifest from the Illumina-provided EPIC-8v2-0_EA.csv, plus additional probewise information described in Peters et al. (2024).

Lori Shepherd (18:21:15): > Ah ok. I missed that this was referenced too. Sorry about that. What exactly about the HTML page is problematic? Trying to understand

Ankitha Ramaiyer (20:48:31): > @Ankitha Ramaiyer has joined the channel

2024-04-14

Jovana Maksimovic (23:45:44) (in thread): > Thanks everyone for your feedback. Good to know that those packages have now been submitted and we can proceed accordingly.

2024-04-15

Peter Hickey (00:21:11): > I’ve just realised the DelayedMatrixStats vignette fails R CMD check at the ‘checking running R code from vignettes’ step (https://bioconductor.org/checkResults/3.18/bioc-LATEST/DelayedMatrixStats/nebbiolo2-checksrc.html). > > The particular code chunk (https://code.bioconductor.org/browse/DelayedMatrixStats/blob/RELEASE_3_18/vignettes/DelayedMatrixStatsOverview.Rmd#L72) demonstrates some code that causes an error, so I’d included an error = TRUE in the chunk header. > I think I was following advice similar to https://r-pkgs.org/vignettes.html#sec-vignettes-eval-option, but is that advice incorrect/outdated or am I doing something else wrong? > > Unfortunately it seems like it’s been failing for months(?) and I somehow overlooked it :grimacing:
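A plausible explanation, not confirmed in the thread: the ‘checking running R code from vignettes’ step executes the tangled (purl’d) R script extracted from the vignette, and chunk options such as error = TRUE do not survive tangling, so the extracted code aborts at the error even though weaving succeeds. One commonly suggested workaround is to also mark the chunk purl = FALSE so it is excluded from the tangled script; the chunk contents below are illustrative:

````
```{r, error=TRUE, purl=FALSE}
## error=TRUE lets knitr render the error message when weaving the
## vignette; purl=FALSE keeps this chunk out of the tangled .R file
## that R CMD check runs, where chunk options would otherwise be lost.
stop("demonstration of an expected error")
```
````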

2024-04-17

Robert Castelo (06:57:59): > Hi, this morning I noticed that one of the packages I maintain, atena, was updated in my R-devel installation, while I knew that I had not submitted changes upstream. The landing page at Bioconductor of the devel version shows that indeed the version has been bumped from 1.7.0 to 1.9.0, as if we were switching to a new release. However, if I clone the upstream repo and look at the log, there is no recent commit with such a version bump, but I do see a duplication of the last commit from October, albeit with a different commit hash: > > $ git clone git@git.bioconductor.org:packages/atena > $ cd atena > $ grep Version DESCRIPTION > Version: 1.9.0 > $ git log | head -12 > commit db720afe17ca78f4d0ea2c4c3c49ad6281aa70dc > Author: J Wokaty <jennifer.wokaty@sph.cuny.edu> > Date: Tue Oct 24 11:36:26 2023 -0400 > > bump x.y.z version to odd y following creation of RELEASE_3_18 branch > > commit 634a46974205fe83e7e83c1d6900ae070efd71fc > Author: J Wokaty <jennifer.wokaty@sph.cuny.edu> > Date: Tue Oct 24 11:36:26 2023 -0400 > > bump x.y.z version to even y prior to creation of RELEASE_3_18 branch > > Has anybody noticed the same thing? I was actually fixing some bugs and updating documentation; should I override the version bump to what would be the proper 1.7.1 version when I push those changes upstream?

Lori Shepherd (07:02:09): > You can’t go backwards (or it’s strongly suggested not to go backwards), so if there is a greater version in the wild I’d continue on. > Of note, if you read closely, those are not actually duplicate commits. One bumps to even for the release (bump prior to branch) and one bumps to odd (following creation of the branch), which is expected.

Lori Shepherd (07:03:37): > We can follow up and look into it. Remember that at release this bump and branch is done automatically by the core team on all packages for maintainers

Robert Castelo (07:05:33): > Oops, you’re right, but this gets even weirder, because the version was bumped without a commit :scream:

Lori Shepherd (07:31:24): > But when I look at the commit it is actually 1.9.0? > > lorikern@jbcj433:~/BioconductorPackages/SoftwarePkg/atena(devel)$ git show db720afe17ca78f4d0ea2c4c3c49ad6281aa70dc > commit db720afe17ca78f4d0ea2c4c3c49ad6281aa70dc (HEAD -> devel, upstream/master, upstream/devel) > Author: J Wokaty <jennifer.wokaty@sph.cuny.edu> > Date: Tue Oct 24 11:36:26 2023 -0400 > > bump x.y.z version to odd y following creation of RELEASE_3_18 branch > > diff --git a/DESCRIPTION b/DESCRIPTION > index 7e2070b..55637d5 100644 > --- a/DESCRIPTION > +++ b/DESCRIPTION > @@ -1,7 +1,7 @@ > Package: atena > Type: Package > Title: Analysis of Transposable Elements > -Version: 1.8.0 > +Version: 1.9.0 >
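Lori’s point that the two release-time commits are not duplicates can be reproduced in a throwaway repository. This is a self-contained sketch (scratch directory, illustrative committer identity), not anything run against git.bioconductor.org:

```shell
# Recreate the release-time double bump: one commit moves DESCRIPTION to
# the even (release) version, the next to the odd (devel) version.
set -e
cd "$(mktemp -d)"
git init -q
printf 'Package: atena\nVersion: 1.8.0\n' > DESCRIPTION
git add DESCRIPTION
git -c user.name=core -c user.email=core@example.org commit -q \
    -m "bump x.y.z version to even y prior to creation of RELEASE_3_18 branch"
printf 'Package: atena\nVersion: 1.9.0\n' > DESCRIPTION
git -c user.name=core -c user.email=core@example.org commit -qa \
    -m "bump x.y.z version to odd y following creation of RELEASE_3_18 branch"
# Same author and near-identical messages, but the diffs differ:
# only the second commit introduces 1.9.0.
git show HEAD -- DESCRIPTION | grep '^[+-]Version'
```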

Martin Morgan (07:55:08): > R-devel recently changed its version, maybe you have an alias or something such that you are updating a different library?

Robert Castelo (08:08:53): > Found it, thanks for the hints, I messed up with different libraries, sorry for the noise!!

Amanda Hiser (11:27:31): > I’m currently updating an annotation package (SomaScan.db), and I’m unable to replicate an R CMD check failure (from the build report) using the bioconductor/bioconductor_docker:devel Docker image. I’m assuming this is because the devel image is still using release 3.18, with R 4.3, while the build report is generated using Ubuntu 22.04.3 LTS, R version 4.4.0 beta (on nebbiolo1). When will the RELEASE_3_19 docker image become available, or when will the devel image be updated to reflect the architecture used for the upcoming release? If that won’t happen until after the 3.19 release is finalized, is there another Docker image I can use in the meantime that replicates the build/check system currently used on nebbiolo1? Otherwise, I need to wait a week between builds to confirm that the check failure was cleared, as this is an annotation and not a software package

Amanda Hiser (12:32:05) (in thread): > Here is the sessionInfo() after launching the devel image using the Quick Start instructions from https://github.com/Bioconductor/bioconductor_docker?tab=readme-ov-file#quick-start: > > R version 4.3.1 (2023-06-16) > Platform: aarch64-unknown-linux-gnu (64-bit) > Running under: Ubuntu 22.04.3 LTS > > Matrix products: default > BLAS: /usr/lib/aarch64-linux-gnu/openblas-pthread/libblas.so.3 > LAPACK: /usr/lib/aarch64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0 > > locale: > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 > [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C > [9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C > > time zone: Etc/UTC > tzcode source: system (glibc) > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > loaded via a namespace (and not attached): > [1] compiler_4.3.1 tools_4.3.1 >

Andres Wokaty (13:08:54) (in thread): > ~You can update your docker image with docker pull bioconductor/bioconductor_docker:RELEASE_3_18, which should give you 3.19 with R 4.4.~

Andres Wokaty (13:17:08) (in thread): > Sorry, I see that devel now gives R 4.5, so my prior advice isn’t helpful. > > You can try using the BBS container by specifying R 4.4.0: docker pull ghcr.io/bioconductor/bioconductor_salt:devel-jammy-bioc-3.19-r-4.4.0. You can read more about it at https://github.com/Bioconductor/bioconductor_salt

Amanda Hiser (13:33:45) (in thread): > Oh great, that sounds perfect! I’m getting an error when I run the docker pull command that you provided, see below. Are there any additional files I need locally to pull this onto an arm64 Mac? > > > docker pull ghcr.io/bioconductor/bioconductor_salt:devel-jammy-bioc-3.19-r-4.4.0 > devel-jammy-bioc-3.19-r-4.4.0: Pulling from bioconductor/bioconductor_salt > no matching manifest for linux/arm64/v8 in the manifest list entries >

Alex Mahmoud (13:40:19) (in thread): > In order to work with the BBS/salt container on an arm64 Mac, you need to add --platform linux/amd64 to the docker commands to force use of the emulator. The bioconductor_docker containers are available for both amd64 and arm64 natively, but pre-compiled binaries are only built continuously for amd64 (on arm64 packages would build from source)

Amanda Hiser (13:44:16) (in thread): > Got it, that makes sense. I’m new to Docker so I appreciate the explanation, the container is successfully downloading now. Thank you!

Alex Mahmoud (14:25:19) (in thread): > No worries, and thank you for the report! This allowed us to notice that devel (3.19) bioconductor_docker containers have moved to the new R devel (4.5) since R devel (4.4) is now in pre-release phase. This is a bug in the R version detection in the container build scripts which I am currently fixing

2024-04-18

Amanda Hiser (11:35:48) (in thread): > Sorry to come back to this, @Alex Mahmoud, but is the ghcr.io/bioconductor/bioconductor_salt:devel-jammy-bioc-3.19-r-4.4.0 image capable of being run in an RStudio Server session? I’ve tried creating a docker run command similar to the examples here (https://www.bioconductor.org/help/docker/) to get a web browser pointing to the image, but I’ve run into a couple of issues and I’m wondering if this isn’t something the image is configured to do. Again, might just be because of my inexperience with Docker. My command looks like this: > > docker run \ > --platform linux/amd64 \ > -e PASSWORD=test \ > -p 8787:8787 \ > ghcr.io/bioconductor/bioconductor_salt:devel-jammy-bioc-3.19-r-4.4.0 > > Which produces this error: Fatal error: you must specify '--save', '--no-save' or '--vanilla'. I created a small Dockerfile to specify ENTRYPOINT ["sh"], which resolves the error, but the site launched from that running container (at http://localhost:8787) doesn’t point to anything, just a blank page. This could be because I’m just making mistakes, but I wanted to make sure there isn’t some fundamental configuration of this image (or something similar) preventing me from doing this - Attachment (bioconductor.org): Bioconductor - Docker for Bioconductor > The Bioconductor project aims to develop and share open source software for precise and repeatable analysis of biological data. We foster an inclusive and collaborative community of developers and data scientists.

Alex Mahmoud (11:47:33) (in thread): > No worries, happy to help, especially since this is cascading from a bug on our side. In short, no, that container is meant to give an R environment mimicking the Bioconductor Build System Linux machine, so it doesn’t provide an RStudio interface, just an R command-line interface. (Just so you know, in case it’s helpful in the future, you may change the ENTRYPOINT at runtime via --entrypoint /bin/sh to avoid having to make a Dockerfile and new container image if that’s all you’re changing.) > > To unblock you right now, since the tagged containers are still rebuilding with the correct versions after the bugfix, you can still use the old R 4.4 Bioc 3.19 container, which is still available by hash. So if your goal is to get RStudio with those versions, you likely want to use ghcr.io/bioconductor/bioconductor@sha256:f5d5dfbbccdb2c750f507e678f93f59b87dd6cbcfcacda38f8f6fec681fdbc5d with your command, and that should work. Feel free to let me know if you’re encountering more issues (tagging is helpful too so I don’t miss the message).

Amanda Hiser (11:49:42) (in thread): > Had a feeling there was an easier way than creating a Dockerfile, should have known about the --entrypoint arg :woman-facepalming: I’ll give that command a shot and let you know if it doesn’t work, I really appreciate the quick response!

Alex Mahmoud (16:26:51) (in thread): > The bioconductor/bioconductor:devel container has also been updated and now has the correct R version. So you can now use it as expected. Sorry again for the bug, and thank you for reporting it!

Amanda Hiser (16:27:21) (in thread): > That’s great news, thank you so much!!

Alex Mahmoud (16:37:07) (in thread): > Actually, re-reading the original message, just want to clarify in case helpful: The bug was that the devel containers were running too new of an R version (4.5.0 instead of 4.4.0, due to R being in the three-version window when R-devel is bumped up and the previous R-devel becomes pre-release for a few weeks). I notice that your issue is that you had an older version of the devel container, so you need to do a docker pull first, and you should be good. eg > > docker pull bioconductor/bioconductor_docker:devel > > docker run --rm bioconductor/bioconductor_docker:devel Rscript -e 'BiocManager::version(); R.version' > > will return 3.19 and R 4.4.0

Amanda Hiser (16:37:49) (in thread): > Thank you for the clarification!

2024-04-19

S L (08:01:06): > @S L has joined the channel

2024-04-20

Lambda Moses (00:24:11): > Is it just me? I can’t install any package from Bioc devel. Not sure if it’s related: the search functionality doesn’t work on the Bioc website. > > > BiocManager::install("SingleCellExperiment", force = TRUE, update = FALSE) > 'getOption("repos")' replaces Bioconductor standard repositories, see 'help("repositories", package = > "BiocManager")' for details. > Replacement repositories: > CRAN: https://cran.rstudio.com/ > Bioconductor version 3.19 (BiocManager 1.30.22), R Under development (unstable) (2024-01-09 r85796) > Installing package(s) 'SingleCellExperiment' > Warning message: > package 'SingleCellExperiment' is not available for Bioconductor version '3.19' >

Robert Castelo (04:26:03): > I can install it without trouble using the devel docker container, see attached image. - File (PNG): Screenshot 2024-04-20 at 10.25.31.png

Lambda Moses (04:35:52): > OK, actually I could install some packages, but many packages are missing, including all the *Experiment packages and, I suppose, all packages that import them, so now there are 1075 packages while I expect over 2000. > > > df <- available.packages(repos = "https://bioconductor.org/packages/3.19/bioc") > > nrow(df) > [1] 1075 > > "SingleCellExperiment" %in% rownames(df) > [1] FALSE >

Lambda Moses (04:38:28): > I tried it on my laptop and in the rstudio:devel docker container on my lab’s server and got the same results. So it’s not my home wifi’s problem. Now I begin to wonder if it has something to do with different Bioconductor mirrors.

Lambda Moses (04:40:20): > There are still over 2000 packages on Bioc 3.18 > > > df <- available.packages(repos = "https://bioconductor.org/packages/3.18/bioc") > > nrow(df) > [1] 2216 >

Robert Castelo (04:45:39): > Ok, I can reproduce this, I’m also getting only 1075 packages for Bioc 3.19 with those instructions, and if I try to install SingleCellExperiment outside the docker container, which I guess does not rely on binary packages, then I can also reproduce the problem: > > > BiocManager::version() > [1] '3.19' > > BiocManager::install("SingleCellExperiment", force = TRUE, update = FALSE) > Bioconductor version 3.19 (BiocManager 1.30.22), R Under development (unstable) > (2024-04-02 r86266) > Installing package(s) 'SingleCellExperiment' > Warning message: > package 'SingleCellExperiment' is not available for Bioconductor version '3.19' > > A version of this package for your version of R might be available elsewhere, > see the ideas at https://cran.r-project.org/doc/manuals/r-devel/R-admin.html#Installing-packages >

Lambda Moses (04:46:25): > Yeah, some docker containers use rspm’s binary repo

Kasper D. Hansen (08:19:39): > Sometimes we have had hiccups when R-beta is released (which it was on the 12th) and Bioconductor switches to R-beta instead of R-devel. But I can’t really see how this should have the impact described above.

Lori Shepherd (10:23:17): > We updated to R beta this weekend and reset the repo. Many packages failed on that first run and are therefore unavailable, as they didn’t propagate. We expect this to resolve over the weekend.

Dirk Eddelbuettel (11:04:25) (in thread): > I thought this was an interesting twist, and did not recall seeing ‘prerel’ before so I tooted with screenshot.https://mastodon.social/@eddelbuettel/112297727829069211 - Attachment (Mastodon): Dirk Eddelbuettel (@eddelbuettel@mastodon.social) > Attached: 1 image > > #rstats flavour ‘prerel’ now used at CRAN on macOS and Windows – don’t recall having seen that before. R 4.4.0 will be released in five days.

Alex Mahmoud (12:25:08) (in thread): > We actually build and host our own binary repo for Bioconductor packages and their CRAN dependencies > > > df <- available.packages(repos = "https://bioconductor.org/packages/3.19/container-binaries/bioconductor_docker") > > nrow(df) > [1] 4308 > > It only defaults to RSPM for other packages

2024-04-25

Mercedes Guerrero (05:02:06): > @Mercedes Guerrero has joined the channel

2024-04-26

Jeroen Ooms (04:43:35): > I see a new version of bioc will be released next week. Does that mean all repos automatically get a “bump x.y.z version to even y” commit? When exactly does this happen?

Jeroen Ooms (04:52:39) (in thread): > How can one see which distro these Linux binaries are built for?

Vince Carey (06:14:23): > @Lori Shepherdwill tell the exact details; an announcement will be made.

Vince Carey (06:15:32): > > library(VariantAnnotation) > library(GenomicFeatures) > debug(VariantAnnotation:::.localCoordinates) > debugMethod("predictCoding", c(query="CollapsedVCF", subject="TxDb", seqSource="FaFile", varAllele="missing")) > debugMethod("mapToTranscripts", c(x="GenomicRanges", transcripts="GRangesList")) > > are steps taken to search for a possible bug in VariantAnnotation

Vince Carey (06:16:44): > > debugMethod <- > function (mname, sig) > { > trace(mname, signature = sig, tracer = browser) > } > > has been in my .Rprofile for ~20 years. Have better approaches to debugging S4 emerged in the meantime?
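
A minimal, self-contained illustration of the same trace()-based approach, using a toy generic and class invented for this sketch and a message() tracer in place of browser so it runs non-interactively:

```r
library(methods)

# Toy S4 generic/class, invented for this example
setGeneric("area", function(shape) standardGeneric("area"))
setClass("Square", representation(side = "numeric"))
setMethod("area", "Square", function(shape) shape@side^2)

# Same shape as the debugMethod() helper above, but with a
# non-interactive tracer instead of browser
debugMethod <- function(mname, sig) {
  trace(mname, signature = sig, tracer = bquote(message("entering ", .(mname))))
}
debugMethod("area", "Square")

area(new("Square", side = 3))  # emits the tracing message, returns 9
```

Swapping the tracer back to browser reproduces the interactive behavior; options(error = recover) is a complementary tool, but trace(signature = ) still appears to be the standard way to stop inside one specific S4 method.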

Lori Shepherd (06:17:27): > @Jeroen Ooms Yes, on the Tuesday before the release we send an announcement that all repos will be frozen for a few hours. The core team then does a version bump to even y and creates a RELEASE_3_19 branch. Then we do another y version bump to make devel odd again.
> Then we send another announcement when we unfreeze the repos
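
The even/odd y dance described here can be sketched with base R's package_version (the version numbers are invented for illustration):

```r
# At release time: devel (odd y) gets bumped to even y for the
# RELEASE branch, then bumped again so devel goes back to odd y.
bump_y <- function(v) {
  v <- unclass(package_version(v))[[1]]  # e.g. c(2, 17, 3)
  v[2] <- v[2] + 1                       # bump y
  v[3] <- 0                              # reset z
  paste(v, collapse = ".")
}

devel      <- "2.17.3"         # odd y: devel before the release
release    <- bump_y(devel)    # "2.18.0": even y, lands on RELEASE_3_19
next_devel <- bump_y(release)  # "2.19.0": odd y again on devel
```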

Alex Mahmoud (09:43:07) (in thread): > They are built per container. bioconductor_docker is the name for our RStudio container built on top of rocker/rstudio. We build them in the container to also ensure sysdeps exist, so we only distribute them for our containers

Amanda Hiser (10:57:24) (in thread): > When we update the NEWS file, should we add an entry or account for the version bump that the core team will do? As in, should the most recent entry in the NEWS.md file use the expected bumped version, or should it match the current version (prior to the bump)?

Lori Shepherd (11:02:36) (in thread): > you may use either in the NEWS as we will pick up both

Amanda Hiser (11:02:48) (in thread): > Oh perfect, thanks!

Marcel Ramos Pérez (12:27:45) (in thread): > Hi Amanda, note that the NEWS file is meant to be seen by end users via news(package = "package"). End users will have release versions of packages and therefore the news should coincide (for clarity) with the release version of a package. E.g., devel version of package: 1.3.2 & version in news file before release: 1.4.0

Kasper D. Hansen (16:44:23): > What is the difference between http://master.bioconductor.org/packages/stats/bioc/minfi/minfi_stats.tab and http://master.bioconductor.org/packages/stats/bioc/minfi_stats.tab?

Kasper D. Hansen (16:45:18): > The first URL is - I think - the official download stats but what is the second one (where the numbers are 10x bigger)?

Marcel Ramos Pérez (21:13:37) (in thread): > CC:@Robert Shear

2024-04-29

Mike Smith (04:22:50): > Interestingly you can swap in any package name in the second one, e.g. https://bioconductor.org/packages/stats/bioc/rhdf5_stats.tab In fact, it doesn’t even have to be a package name; x_stats.tab works too. > > I think it’s a global table across all packages. It’s the same as: https://bioconductor.org/packages/stats/bioc/bioc_stats.tab

Samuel Gunz (09:35:28): > @Samuel Gunz has joined the channel

2024-04-30

Lambda Moses (18:24:05): > Just wondering, how often do people have trouble installing the arrow package, especially if you need to compile it? I’m asking because I currently use GeoParquet to save processed geometries from SpatialFeatureExperiment, because I like how small the file is and how fast it is to read. But meanwhile I kind of wonder if it’s a good idea, or maybe I should also use GeoJSON in case the user has trouble installing arrow.

Kasper D. Hansen (22:16:59): > I am maintaining a user-installed R on an HPC system, so I don’t exactly have control over the system. It was a bit of a pain to get arrow installed, but not worse than some of the geospatial packages, for example.

2024-05-01

Lambda Moses (00:49:50): > You still need GDAL to read GeoJSON. If arrow isn’t worse than installing GDAL then I suppose arrow is fine and I really like its features. I have also had problems with installing system dependencies on my lab’s server when I don’t have root. I think from now on I’ll always run RStudio Server from docker containers on the server to better manage the R version change every year in order to use the newest Bioc release. That also manages system dependencies and is good for reproducibility. When I do have root, I installed apptainer (formerly singularity) to run docker containers. I’ll write instructions on how to run docker containers with apptainer when you don’t have root.

Kasper D. Hansen (08:50:37): > I have a conda environment where I then compile R and packages from source. That’s a bit finicky because conda doesn’t always provide working binaries (which is disappointing and can be hard to debug). But once deployed it works well on an HPC system.

Lambda Moses (13:56:36): > I also did that, but sometimes I ran into issues on CentOS 7, where the compiler is so old that it won’t compile packages that require C++17 or even C++14. Also, I don’t want to change the system-wide R version every year in case it disrupts other people’s work. So I use docker.

Lori Shepherd (15:30:22): > Bioconductor Core Team is pleased to release Bioc 3.19! Thank you to all developers and community members for contributing to the project. The full release announcement can be found at: https://bioconductor.org/news/bioc_3_19_release/

Henrik Bengtsson (20:14:36) (in thread): > A few comments: > 1. CentOS 7 will be defunct on 2024-06-30 (no more updates, no security updates, …). Hopefully, your organization will upgrade very soon, e.g. to Rocky 8. > 2. Your life will be easier on Rocky 8, because you have more modern tools and system libraries by default. > 3. RedHat provides “Software Collections” (SCLs) for CentOS 7 and Rocky 8. Specifically, the devtoolset SCLs provide newer versions of compiler tools, cf. <https://www.c4.ucsf.edu/software/scl.html>. You should be good with GCC 10, i.e. devtoolset-10, for building R and compiling all CRAN and Bioconductor packages. Have you checked if those are available on your system? Without them, you’ll have to compile your own GCC tools from source, but that is likely to cause a lot of other issues, so I would not go there.

Henrik Bengtsson (20:20:01) (in thread): > > Also don’t want to change the system wide R version every year in case it disrupts other people’s work. > Make sure you learn about environment modules. They make it super easy to have multiple versions of the same software installed in parallel, e.g. module load r/4.4.0 to change the shell to use R 4.4.0, module load r/2.15.3 to change it to R 2.15.3, and so on. I’d say it’s an essential tool/setup when doing science and reproducible research, both in single-user and multi-user environments. See <https://www.c4.ucsf.edu/software/software-repositories.html> and <https://wynton.ucsf.edu/hpc/software/software-repositories.html> for HPC environments that provide them.

Jared Andrews (20:30:28) (in thread): > We also use singularity images to run R/RStudio on our HPC as managing the system dependencies is easier. Modules are available for each R version as well, but they can’t share packages with the singularity RStudio images due to compiler differences, etc.

2024-05-02

Lambda Moses (02:08:08) (in thread): > > GCC 10, i.e. devtoolset-10, for building R and compiling all CRAN and Bioconductor packages. Have you checked if those are available on your system? > Actually I have root access to one of the servers, though I use my non-root account for data analyses. I tried to use yum or dnf to install a newer GCC but it didn’t work, and nothing could replace the super old system GCC.

Leo Lahti (10:45:27): > @Leo Lahti has joined the channel

Henrik Bengtsson (11:14:22) (in thread): > > … Modules are available for each R version as well, but they can’t share packages with the singularity RStudio images due to compiler differences, etc. > To clarify for others following this: > > TL;DR: > Running R via Linux containers will mess up your regular R package installations if you also run R installed outside of containers. > > Details: > This means that the R installations done on the bare-bones operating system are not binary compatible with the ones done in Linux containers (e.g. Apptainer/Singularity), meaning packages installed in the user’s personal R package library path (e.g. ~/R/x86_64-pc-linux-gnu-library/4.4) must not mix and match. The safest is to set a different R_LIBS_USER if users use both, e.g. R_LIBS_USER=~/R/ubuntu24_04-%p-library/%v (for container-based package installations). > > The same problem happens for Python packages, etc. > > (… and then we have Conda, which manages to wreak havoc by itself without us being able to blame Linux containers, but that’s a different story)
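
A sketch of that separation as ~/.Renviron fragments (the concrete path templates are illustrative assumptions; R expands %p to the platform string and %v to the R major.minor version):

```
## Inside the container image (e.g. via its site-wide Renviron):
R_LIBS_USER=~/R/ubuntu24_04-%p-library/%v

## On the bare-metal host, the usual default:
R_LIBS_USER=~/R/%p-library/%v
```

This keeps container-installed and host-installed packages in disjoint library trees, so the two R installations never load each other's compiled binaries.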

Jared Andrews (11:17:38) (in thread): > Setting R_LIBS_USER directly in the images is indeed how we handle it, so by default, users don’t have to do anything.

Henrik Bengtsson (11:19:19) (in thread): > > Actually I have root access to one of the servers though I use my non-root account for data analyses. I tried to use yum or dnf to install newer GCC but it didn’t work and nothing could replace the super old system GCC. > I don’t think dnf is available on CentOS 7, but I might be wrong. At least it’s not on the CentOS 7.9 system I checked. It only has yum. > > Installing devtoolset-10 will not replace the default GCC. That is by design. Please see https://wynton.ucsf.edu/hpc/software/scl.html for how SCLs work with GCC (that page is for Rocky 8, where “devtoolset” is now called “gcc-toolset”). To be clear, you want to install: > > $ yum info devtoolset-10 > ... > Installed Packages > Name : devtoolset-10 > Arch : x86_64 > Version : 10.1 > Release : 0.el7 > Size : 2.2 k > Repo : installed > From repo : centos-sclo-rh > Summary : Package that installs devtoolset-10 > License : GPLv2+ > Description : This is the main package for devtoolset-10 Software Collection. >

2024-05-03

Jeroen Ooms (03:45:20): > We updated https://bioc.r-universe.dev to include all current packages including win/mac binaries. It would be interesting to compare the results with the current Bioconductor build infrastructure, to test if we see similar problems for packages failing on e.g. windows or vignettes.

Federico Marini (12:49:45): > I guess some of you might have seen this already? https://nvd.nist.gov/vuln/detail/CVE-2024-27322 Don’t know if there’s a more relevant channel than this one, but it might be something to keep in mind

Dirk Eddelbuettel (16:08:50) (in thread): > Maybe also read https://aitap.github.io/2024/05/02/unserialize.html

Aaron Lun (17:11:04): > already exploiting it to print ascii anime art on everyone’s computers

2024-05-04

Federico Marini (12:29:47): > A follow-up by Bob Rudis on this: https://rud.is/b/2024/05/03/cve-2024-27322-should-never-have-been-assigned-and-r-data-files-are-still-super-risky-even-in-r-4-4-0/ - Attachment (rud.is): CVE-2024-27322 Should Never Have Been Assigned And R Data Files Are Still Super Risky Even In R 4.4.0 - rud.is > I had not planned to blog this (this is an incredibly time-crunched week for me) but CERT/CC and CISA made a big deal out of a non-vulnerability in R, and it’s making the round on socmed, so here we are. A security vendor decided to try to get some hype before 2024 RSAC and made… Continue reading →

2024-05-05

Dario Strbenac (20:00:00): > It appears there is a disconnect between the website and package installation. All of scClassify’s indicators are O.K. However, > > library(BiocManager) > install("scClassify") > Bioconductor version 3.19 (BiocManager 1.30.22), R 4.4.0 (2024-04-24) > package 'scClassify' is not available for Bioconductor version '3.19' > > One of the dependencies, Cepo, is failing, which is likely the cause of the installation error. The maintainer is updating the package, but I thought I would report this scenario for improvement.

Lori Shepherd (20:52:27): > I’m not sure where or how to better reflect this and am open to suggestions…. Technically the package passes, except a dependency is not passing, so where or how this should be indicated gets tricky…. I can see if there would be a way to pull this information from the logs and change the availability shield, but it doesn’t seem quite right to reflect it on the build shield since the package is passing on its own….
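
The situation being described — a package that passes its own build yet is uninstallable because something in its dependency chain failed to propagate — can be sketched in base R (the dependency table and the otherPkg name are invented; scClassify/Cepo are taken from the report above):

```r
# Toy dependency table: package -> direct dependencies
deps <- list(
  scClassify = "Cepo",
  Cepo       = character(0),
  otherPkg   = character(0)
)
failed <- "Cepo"  # failed to build/propagate

# A package is unavailable if it failed itself, or if anything
# in its recursive dependency chain did
unavailable <- function(pkg) {
  if (pkg %in% failed) return(TRUE)
  any(vapply(deps[[pkg]], unavailable, logical(1)))
}

vapply(names(deps), unavailable, logical(1))
# scClassify and Cepo come out unavailable; otherPkg is fine
```

This is why the build shield can be green while BiocManager::install() still reports the package as unavailable.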

2024-05-06

Luke Zappia (08:18:25): > Is it possible to reset the {BiocFileCache} cache for {zellkonverter} (on release and devel)? Thanks!

Lori Shepherd (08:21:49) (in thread): > may I ask why this is needed? it doesn’t look like it is failing in any way and shouldn’t a cache update as needed?

Luke Zappia (08:24:45) (in thread): > There is an error loading from the cache in one of the long tests: https://bioconductor.org/checkResults/release/bioc-longtests-LATEST/zellkonverter/nebbiolo1-checksrc.html. The main build passes fine so I’m happy to leave it if it’s annoying to do, but it’s been like that for a while so I thought I should try and fix it.

Lori Shepherd (08:32:02) (in thread): > Does zellkonverter use its own unique cache or the default BiocFileCache on a system?

Luke Zappia (08:34:50) (in thread): > I guess the default one. It downloads some example files for the long tests using: > > library(BiocFileCache) > cache <- BiocFileCache(ask = FALSE) > file <- bfcrpath(cache, ...) > > Caching isn’t used anywhere in the actual package, just the long tests.

Lori Shepherd (08:36:06) (in thread): > Thanks. Instead of getting rid of the entire default cache, I’d like to remove the individual files from the cache then … Is it just the single file https://storage.googleapis.com/gtex_analysis_v9/snrna_seq_data/GTEx_8_tissues_snRNAseq_atlas_071421.public_obs.h5ad?

Luke Zappia (08:37:35) (in thread): > Yeah, I think that is the only one that seems to be an issue. That would be great, thanks!

Lluís Revilla (16:35:40) (in thread): > On CRAN, a package that is failing would be removed from the repository after a warning to the package and all its reverse dependencies. I couldn’t find any text on the website or book about roles or expectations of maintainers after acceptance in the release branch; I thought we had to fix those “reasonably quickly” or something like that. > But perhaps there could be a warning on a package’s website (bioconductor.org/packages/pkg) if it is not available due to a dependency, so that users are aware of this issue if there is no further action from Bioconductor

2024-05-07

Robert Shear (12:00:00) (in thread): > @Kasper D. Hansen, The first link is the correct link to report the download statistics for the minfi package. The second link should return a status 404. Instead it returns the equivalent of http://master.bioconductor.org/packages/stats/bioc/, which is the download statistics for all the software packages combined. > > This is a bug. I will open an issue on the bioconductor/bio-web-stats project. > > Thank you for identifying it.

2024-05-08

Stephany Orjuela (11:56:20): > @Stephany Orjuela has left the channel

2024-05-09

Robert Castelo (07:05:37): > Dear Git experts, I cannot understand the following interaction with the Bioconductor Git server: > 1. Download a fresh clone of a Bioconductor package from my GitHub repo and add the Bioconductor upstream remote: > > > $ git clone git@github.com:rcastelo/GenomicScores > $ cd GenomicScores > $ git remote add upstream git@git.bioconductor.org:packages/GenomicScores > > 2. Verify we’ve cloned the current devel branch and version of the package: > > $ git status > On branch devel > Your branch is up to date with 'origin/devel'. > > nothing to commit, working tree clean > $ grep Version DESCRIPTION > Version: 2.17.0 > > 3. Fetch the upstream branches and check out the upstream/RELEASE_3_19 branch without the -b option: > > $ git fetch upstream > remote: Enumerating objects: 159, done. > [...] > $ git checkout upstream/RELEASE_3_19 > Note: switching to 'upstream/RELEASE_3_19'. > > You are in 'detached HEAD' state. You can look around, make experimental > changes and commit them, and you can discard any commits you make in this > state without impacting any branches by switching back to a branch. > > If you want to create a new branch to retain commits you create, you may > do so (now or later) by using -c with the switch command. Example: > > git switch -c <new-branch-name> > > Or undo this operation with: > > git switch - > > Turn off this advice by setting config variable advice.detachedHead to false > > HEAD is now at 7e55935 bump x.y.z version to even y prior to creation of RELEASE_3_19 branch > > 4. Verify the status and version of this RELEASE_3_19 branch and DESCRIPTION file: > > $ git status > HEAD detached at upstream/RELEASE_3_19 > nothing to commit, working tree clean > $ grep Version DESCRIPTION > Version: 2.16.0 > > 5. 
Checkout the devel branch: > > $ git checkout devel > Previous HEAD position was 7e55935 bump x.y.z version to even y prior to creation of RELEASE_3_19 branch > Switched to branch 'devel' > Your branch is up to date with 'origin/devel'. > > 6. Because I should have checked out upstream/RELEASE_3_19 with the -b option, let’s do it now and verify again the status and version of the branch and the DESCRIPTION file: > > $ git checkout -b upstream/RELEASE_3_19 > Switched to a new branch 'upstream/RELEASE_3_19' > $ git status > On branch upstream/RELEASE_3_19 > nothing to commit, working tree clean > $ grep Version DESCRIPTION > Version: 2.17.0 > > Here is the question: why is upstream/RELEASE_3_19 now pointing to the devel branch?? I also get this situation when checking out directly from devel to upstream/RELEASE_3_19 with -b, which is how I came across it, but wanted to point out that without -b I do see the correct release version. Thanks!

Vince Carey (07:25:42): > Hi Robert, I’ve never used -b with checkout. Since it says it switches to a new branch, my sense is that it is making a local branch with this name based on the current code image, which was 2.17 when you did it. The references to -b in the contributions document section 21 all seem to pertain to making a new local branch. I will defer immediately to any git expert who can clarify.

Artür Manukyan (07:50:08) (in thread): > True, I only use it when I intend to create a new branch and check out there

Martin Grigorov (07:54:22): > you need to do: git checkout -b newBranchName existingBranchOrTagName

Lori Shepherd (07:54:30): > A -b creates a new branch locally and switches to it, instead of creating and then switching in two steps. Normally I get explicit when doing it so there is no confusion, so when I use it I do something like git checkout -b RELEASE_3_19 upstream/RELEASE_3_19, which will create a local RELEASE_3_19 branch and track/sync with upstream/RELEASE_3_19 (assuming I fetched and it exists)

Martin Grigorov (07:55:01): > by skipping the existingBranchOrTagName it will use the currently checked out branch

Robert Castelo (09:54:20): > Thanks everyone, it’s true I was forgetting the newBranchName value :man-facepalming: With git checkout -b RELEASE_3_19 upstream/RELEASE_3_19 things work fine.

Dirk Eddelbuettel (10:08:25): > There is a micro-bug in Rdisop in that a call to Rf_error() provides the character* message, but not the format string "%s". That blew up my build (for r2u) using the default compiler settings in the distro, as R 4.4.0 flags this as an error.

Vince Carey (10:20:10): > thanks dirk, did you report to the contributor or should i start that process

Vince Carey (10:20:51) (in thread): > i have no familiarity with the package

Dirk Eddelbuettel (10:22:31) (in thread): > Please do. I have no familiarity with the Bioc processes.

Vince Carey (10:25:36) (in thread): > It is strange because the github repo has an issue with a request for deprecation and abandonment dating back a few years. Could take a little while to resolve.https://github.com/sneumann/Rdisop/issues/23 - Attachment: #23 Deprecate Rdisop in BioC and archive this repo. > Hi,
> I am afraid I’ll be unable to keep maintenance of Rdisop,
> as the number of open bugs is increasing, and the C++ part
> has never really been maintained by me at all. In case someone
> wants to adopt the package, please contact me.
> Yours,
> Steffen

Dirk Eddelbuettel (10:26:13) (in thread): > :sad-parrot:

Martin Morgan (10:28:42) (in thread): > Was wondering what the error is? It seems like most of the calls are of the form Rf_error("oops"); are you saying that it is supposed to be Rf_error("%s", "oops")? That would be a problem for many packages! - Attachment (code.bioconductor.org): Bioconductor Code: Search > Search source code across all Bioconductor packages

Dirk Eddelbuettel (10:29:48) (in thread): > Yes and yes.

Vince Carey (10:30:28) (in thread): > @Laurent Gatto could you have a look at this – Rdisop is a high-ranking package, but Steffen N. has asked to drop maintenance. A couple of CRAN packages depend on it. Should we take steps to adopt it into the core-maintained collection? Is it that significant for proteomics?

Dirk Eddelbuettel (10:30:28) (in thread): > I think you can try to turn the -Werror equivalent off in the compiler settings; mine are on by distro default, so it barked on the sources from 3.19.

Lori Shepherd (10:30:45) (in thread): > FWIW: Rdisop never requested formal deprecation in Bioconductor (and it’s not failing, which is why it still remains) and as far as I can tell never put this request out on the bioc-devel mailing list. Maybe someone would be willing to take it over, as indicated by their own posted issue, if they announced it formally to the community

Martin Morgan (10:36:47) (in thread): > Writing R Extensions says > > …in the simplest case can be called with a single character string argument giving the error message > Also I didn’t see any NEWS entry?

Dirk Eddelbuettel (10:37:34) (in thread): > I am fairly positive I had to adjust a few packages of mine in the previous six to eight months leading up to R 4.4.0. This is very much not news.

Dirk Eddelbuettel (10:39:30) (in thread): > Now I am confused too as a quick spot check did not throw an error. Hm.

Dirk Eddelbuettel (10:39:54) (in thread): - File (R): Untitled

Dirk Eddelbuettel (10:43:06) (in thread): > Anyway here is your MVP error ticket. Using plain vanilla r2u, based on plain vanilla ubuntu 24.04

Dirk Eddelbuettel (10:43:17) (in thread): - File (Shell): Untitled

Dirk Eddelbuettel (10:43:52) (in thread): > The container rocker/r2u:24.04 is ‘not quite officially published’ but available, as are the r2u builds for 24.04 and 3.19. This is all ‘in process’ for me. (And installBioc.r is a simple littler wrapper around BiocManager.)

Henrik Bengtsson (10:44:20) (in thread): > Could it be that exceptionMesg in > > Rf_error(exceptionMesg); > > ends up containing formatting symbols, e.g. %, which brings Rf_error() down the wrong path?

Dirk Eddelbuettel (10:44:58) (in thread): > No.

Dirk Eddelbuettel (10:46:08) (in thread): > I just made it Rf_error("%s", exceptionMsg) and all was well. This is something loads of other CRAN packages (incl. some of mine) fixed over the last few months. Nothing special here AFAICT.

Dirk Eddelbuettel (10:48:34) (in thread): > It’s a Debian/Ubuntu thing to have -Werror=format-security. If you remove that from the compiler invocation you should be good.

Henrik Bengtsson (10:48:40) (in thread): > Oh, so maybe it’s okay to use a string literal, but not a string variable. The compiler cannot validate the string variable, so considers it unsafe. That would make sense.

Martin Morgan (12:23:52) (in thread): > A close reading of the error message says that Rf_error(fmt) produces an error but Rf_error(fmt, bar) does not (even though the second form is problematic if fmt contains more than one format specifier). Using the code search facility I was only able to find (and confirm) one other case (HilbertVisGUI) where Rf_error() is invoked with a single variable. - Attachment (code.bioconductor.org): Bioconductor Code: Search > Search source code across all Bioconductor packages

Vince Carey (14:33:30) (in thread): > To explore this a little more I modified Makevars to include the -Werror=format-security and R 4.4 CMD check produces WARNING > > Non-portable flags in variable 'PKG_CXXFLAGS': > -Werror=format-security >

Vince Carey (14:34:08) (in thread): > (that’s after repairing the Rf_error calls)

Vince Carey (14:46:58) (in thread): > I have notified the maintainers of corMID and enviGCMS that someone has to take over Rdisop; I hope one of them will step up.

2024-05-10

Nitesh Turaga (04:32:22): > Hi folks, > > These two packages from the new release 3.19 are missing the manual pages (404, Page not found) > > hdxmsqc <- “https://bioconductor.org/packages/release/bioc/manuals/hdxmsqc/man/hdxmsqc.pdf” > > rawdiag <- “https://bioconductor.org/packages/release/bioc/manuals/rawDiag/man/rawDiag.pdf

Lori Shepherd (07:09:54): > likely because they have yet to build and are unavailable for release 3.19

Lori Shepherd (07:10:49): > I would expect this to resolve once they fix the packages

Lori Shepherd (07:22:38) (in thread): > haven’t forgotten about this, but apparently it’s a little more complicated than normal, and I’m trying to trace down whether it’s a bug in BiocFileCache or in your code base. Long story short, when I search for the entry to remove, it’s not found > > > bfcquery(bfc, "https://storage.googleapis.com/gtex_analysis_v9/snrna_seq_data/GTEx_8_tissues_snRNAseq_atlas_071421.public_obs.h5ad", field="rname") > # A tibble: 0 × 10 > # ℹ 10 variables: rid <chr>, rname <chr>, create_time <dbl>, access_time <dbl>, > # rpath <chr>, rtype <chr>, fpath <chr>, last_modified_time <dbl>, > # etag <chr>, expires <dbl> > > so nothing to remove…. but indeed when I try the bfcrpath call I can reproduce the error > > > cache = bfc > > bfcrpath(cache, "https://storage.googleapis.com/gtex_analysis_v9/snrna_seq_data/GTEx_8_tissues_snRNAseq_atlas_071421.public_obs.h5ad") > adding rname 'https://storage.googleapis.com/gtex_analysis_v9/snrna_seq_data/GTEx_8_tissues_snRNAseq_atlas_071421.public_obs.h5ad' > |======================================================================| 100% > > Error in bfcrpath(cache, "https://storage.googleapis.com/gtex_analysis_v9/snrna_seq_data/GTEx_8_tissues_snRNAseq_atlas_071421.public_obs.h5ad") : > not all 'rnames' found or unique. 
> In addition: Warning messages: > 1: download failed > web resource path: 'https://storage.googleapis.com/gtex_analysis_v9/snrna_seq_data/GTEx_8_tissues_snRNAseq_atlas_071421.public_obs.h5ad' > local file path: '/home/biocbuild/.cache/R/BiocFileCache/28dc287f95d760_GTEx_8_tissues_snRNAseq_atlas_071421.public_obs.h5ad' > reason: Not Found (HTTP 404). > 2: bfcadd() failed; resource removed > rid: BFC189 > fpath: 'https://storage.googleapis.com/gtex_analysis_v9/snrna_seq_data/GTEx_8_tissues_snRNAseq_atlas_071421.public_obs.h5ad' > reason: download failed > 3: In value[[3L]](cond) : > trying to add rname 'https://storage.googleapis.com/gtex_analysis_v9/snrna_seq_data/GTEx_8_tissues_snRNAseq_atlas_071421.public_obs.h5ad' produced error: > bfcadd() failed; see warnings() >

Lori Shepherd (07:23:42) (in thread): > actually when I navigate to that url in a browser it says no such bucket? > > <Error> > <Code>NoSuchBucket</Code> > <Message>The specified bucket does not exist.</Message> > </Error> > > and I can verify with a httr::HEAD call > > > library(httr) > > HEAD("https://storage.googleapis.com/gtex_analysis_v9/snrna_seq_data/GTEx_8_tissues_snRNAseq_atlas_071421.public_obs.h5ad") > Response [https://storage.googleapis.com/gtex_analysis_v9/snrna_seq_data/GTEx_8_tissues_snRNAseq_atlas_071421.public_obs.h5ad] > Date: 2024-05-10 11:23 > Status: 404 > Content-Type: application/xml; charset=UTF-8 > <EMPTY BODY> >

Lori Shepherd (07:24:50) (in thread): > so it seems like it might be an issue with how the file is being stored/distributed
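A hedged sketch of how a stale entry like the one in this thread can be guarded against: check the remote resource first, and prune any cache entries that point at it. This assumes BiocFileCache and httr are installed; the URL is the one from the thread (which currently returns 404).

```r
library(BiocFileCache)
library(httr)

url <- "https://storage.googleapis.com/gtex_analysis_v9/snrna_seq_data/GTEx_8_tissues_snRNAseq_atlas_071421.public_obs.h5ad"
bfc <- BiocFileCache()

if (status_code(HEAD(url)) == 200L) {
    path <- bfcrpath(bfc, url)  # download, or reuse the cached copy
} else {
    # search both rname and fpath, since a failed bfcadd() can leave
    # entries keyed under either field
    hits <- rbind(bfcquery(bfc, url, field = "rname"),
                  bfcquery(bfc, url, field = "fpath"))
    if (nrow(hits) > 0) bfcremove(bfc, hits$rid)
    message("remote resource unavailable; stale cache entries removed")
}
```

This mirrors what Lori did by hand above (HEAD to confirm the 404, then query and remove), just wrapped into one guard.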

Maria Doyle (07:59:36): > :trophy: Last Chance to Nominate for the Bioconductor Awards 2024! :trophy: Time is running out to recognise individuals who have made impactful contributions to our community. Don’t miss this opportunity to honour their hard work and achievements! > :star2: Criteria: Significant contributions in one or more ways to the project > :busts_in_silhouette: Selection: 4 awardees based on impact > :alarm_clock: Deadline: 15th May 2024 > > Nominate Now: Nomination Form > Learn More: Bioconductor Awards

2024-05-11

Aleru Divine (04:04:58): > @Aleru Divine has joined the channel

Dirk Eddelbuettel (10:53:08) (in thread): > @Vince Carey FYI I just encountered the same issue in package oligo. Another one-line fix, injecting "%s" before the variable carrying the message string.

Martin Morgan (11:08:06) (in thread): > I think the offending call is at lines 53-54 of oligo/src/basecontent.c > > sprintf(errmess, "Unknown base %c in row %d, column %d.", seq[j], i+1, j+1); > error(errmess); > > and it would be better to > > error("Unknown base %c in row %d, column %d.", seq[j], i+1, j+1); > > to avoid the equally problematic use of sprintf() - Attachment (code.bioconductor.org): Bioconductor Code: oligo > Browse the content of Bioconductor software packages.

Dirk Eddelbuettel (11:08:30) (in thread): > Yes that also works. I opted for minimalism.

2024-05-12

Vince Carey (14:22:58) (in thread): > @Benilton S Carvalho^^

2024-05-13

Luke Zappia (02:31:02) (in thread): > :crying_cat_face:It worked locally but maybe I forgot to check if it was loading an already cached version. Thanks for looking into it, I’ll see if I can sort it out.

2024-05-14

Steve Lianoglou (11:11:29): > @Steve Lianoglou has joined the channel

2024-05-15

Elisa Gómez de Lope (07:38:17): > @Elisa Gómez de Lope has joined the channel

Elisa Gómez de Lope (07:41:01): > Dear dev experts, I’m building a package called funOmics. I’m currently facing an issue with the NMF package. One of the functions in funOmics uses this package. The package is imported both in the NAMESPACE file and in the functions.R file, where the functions are located (in roxygen2 format: @import, @importFrom). I have also tried to call the functions as NMF::nmf() and NMF::coef() instead of just nmf() and coef(), however the error persists. > The error (see below) only gets removed if I explicitly load the library itself, i.e., if I have library(NMF) before calling the package function that, internally, uses one of its functions. I have explored a range of possible solutions but I have not been successful. What would you recommend? I know it’s not ideal but, should I add library(NMF) in the vignette (this means the user will have to do that when using this specific function)? Or should I add it to the Depends in the DESCRIPTION file? Weirdly enough, this issue only happens with this library, not with any other. > Error traceback: > > Error in `do.call()`: > ! 'what' must be a function or character string > Backtrace: > 1. funOmics::summarize_pathway_level(X, kegg_sets, type = "nmf") > 2. funOmics:::aggby_dimred(ifunmat, type) > 4. NMF::nmf(X, rank = 1) > 6. NMF::nmf(x, rank, NULL, ...) > 7. NMF (local) .local(x, rank, method, ...) > 9. NMF::nmf(x, rank, method, seed = seed, model = model, ...) > 11. NMF::nmf(x, rank, method = strategy, ...) > 12. NMF (local) .local(x, rank, method, ...) > 13. base::do.call(...) > > When in vignette: > > |........................... | 54% [unnamed-chunk-18] > Quitting from lines 182-184 [unnamed-chunk-18] (funomics_vignette.Rmd) > > Error in `do.call()`: > ! 'what' must be a function or character string > Backtrace: > 1. funOmics::summarize_pathway_level(X, kegg_sets, type = "nmf") > 2. funOmics:::aggby_dimred(ifunmat, type) > 4. NMF::nmf(X, rank = 1) > 6. NMF::nmf(x, rank, NULL, ...) 
> 7. NMF (local) .local(x, rank, method, ...) > 9. NMF::nmf(x, rank, method, seed = seed, model = model, ...) > 11. NMF::nmf(x, rank, method = strategy, ...) > 12. NMF (local) .local(x, rank, method, ...) > 13. base::do.call(...) > Execution halted >

Martin Morgan (08:11:49) (in thread): > Without attaching NMF, I did > > > do.call(NMF::nmf, list(matrix(), rank = 1)) > Error: NMF::nmf - Input matrix x contains at least one null or NA-filled row. > In addition: Warning message: > In min(x, na.rm = TRUE) : no non-missing arguments to min; returning Inf > > which finds the function and then fails. On the other hand if I just specify nmf > > > do.call(nmf, list(matrix(), rank = 1)) > Error in h(simpleError(msg, call)) : > error in evaluating the argument 'what' in selecting a method for function 'do.call': object 'nmf' not found > > Not quite the same error but pointing to the problem – do.call() looks on the search path (packages that are attached), not in packages that are merely loaded. So somewhere in the nmf code there is a do.call() that assumes the object is on the search path. This is an issue for the NMF maintainer to fix; I don’t think there’s an easy way for you to work around this other than Depend:ing on NMF.
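Martin’s diagnosis can be reproduced with base R alone. A small sketch using the tools package (loaded but not attached): do.call() with a character name is resolved via match.fun() on the search path, while a function object is used directly.

```r
# tools has its namespace loaded, but is NOT attached to the search path
loadNamespace("tools")

# passing the function object itself: found, works
ok <- do.call(tools::file_ext, list("vignette.Rmd"))

# passing a character name: do.call()/match.fun() only search attached
# packages, so "file_ext" is not found
bad <- tryCatch(do.call("file_ext", list("vignette.Rmd")),
                error = function(e) "not found")

ok   # "Rmd"
bad  # "not found"
```

This is the same failure mode as in the NMF backtrace above: a do.call("name", ...) somewhere in the package assumes the name is on the search path.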

Tim Triche (09:23:46) (in thread): > Could also try RcppML::nmf():wink:

Alan O’C (10:07:26) (in thread): > I’d suggest setting eval = FALSE on the ?pkg::fun code blocks, it pops up help pages when I try to run the vignette locally

Elisa Gómez de Lope (10:38:55) (in thread): > thanks a lot for your comments! As a short-term solution I’m adding NMF to Depends and opened an issue to see if/how this can be solved on NMF’s end.

Elisa Gómez de Lope (10:39:56) (in thread): > @Tim TricheI’m using it with rank=1 and got different outputs with NMF::nmf() and RcppML::nmf(). I guess the internal method is different?

Tim Triche (11:47:57) (in thread): > Yes RcppML diagonalizes the factors to stabilize convergence

Tim Triche (11:49:26) (in thread): > https://www.biorxiv.org/content/10.1101/2021.09.01.458620v2 - Attachment (bioRxiv): Fast and interpretable non-negative matrix factorization for atlas-scale single cell data > Non-negative matrix factorization (NMF) is a popular method for analyzing strictly positive data due to its relatively straightforward interpretation. However, NMF has a reputation as a less efficient alternative to the singular value decomposition (SVD), a standard operation that is highly optimized in most linear algebra libraries. Sparse single-cell sequencing assays, now feasible in thousands of subjects and millions of cells, generate data matrices with tens of thousands of strictly non-negative transcript abundance entries. We present an extremely fast NMF implementation made available in the RcppML (Rcpp Machine Learning library) R package that rivals the runtimes of state-of-the-art Singular Value Decomposition (SVD). NMF can now be run quickly on desktop computers to analyze sparse single-cell datasets consisting of hundreds of thousands of cells. Our method improves upon current NMF implementations by introducing a scaling diagonal to increase interpretability, guarantee consistent regularization penalties across different random initializations, and symmetry in symmetric factorizations. We use our method to show how NMF models learned on standard log-normalized count data are interpretable dimensional reductions, describe interpretable patterns of coordinated gene activities, and explain biologically relevant metadata. We believe NMF has the potential to replace PCA in most single-cell analyses, and the presented NMF implementation overcomes previous challenges with long runtime. > > ### Competing Interest Statement > > The authors have declared no competing interest.

2024-05-23

Lori Shepherd (14:20:52): > Due to unforeseen circumstances, our daily builders and new submission builders are currently offline. We are sorry for the inconvenience and will restore as soon as possible

2024-05-24

James W. MacDonald (16:30:03): > @James W. MacDonald has joined the channel

James W. MacDonald (16:38:02): > On Linux, if I do > > library(BiocManager) > install("zlibbioc", configure.args="--with-libzbioc") > install("oligo") > > Where oligo imports zlibbioc in the DESCRIPTION file, as well as in the NAMESPACE, as well as having #include "zlib.h" in the source files, my expectation is that oligo will be built against the headers in zlibbioc. However, I get things like this: > > In file included from /usr/include/zlib.h:34, > from ParserGzXYS.c:4: > > Where the site-wide zlib headers are being used instead. Is there another step that I am missing?

2024-05-27

Alan O’C (04:29:22): > Do you not need to specify LinkingTo in DESCRIPTION for that use case?
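If LinkingTo is indeed the missing piece, the opt-in in oligo’s DESCRIPTION might look like the following sketch. The Imports line is what the zlibbioc vignette already asks for; LinkingTo is the addition, which puts zlibbioc’s include directory on the compiler’s header search path so its zlib.h is found before the system one.

```
Imports: zlibbioc
LinkingTo: zlibbioc
```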

Martin Morgan (11:46:57) (in thread): > Yes, zlibbioc expects the package to opt-in to its use, usually only on Windows, as outlined in the vignette. It might be possible to put the installed location of zlibbioc (somewhere under system.file(package = "zlibbioc")) on the appropriate LD_LIBRARY_PATH and include location as environment variables after installing zlibbioc and before oligo… > > zlibbioc was intended mainly to support zlib on Windows back in the day when R did not contain a Windows zlib. Those days are over, so really zlibbioc should be removed; the library itself is very old (version 1.2.5, versus current 1.3.1) so exposes any security issues fixed in that interval.

2024-05-28

James W. MacDonald (10:49:19) (in thread): > The vignette says (edited) > > All packages wishing to use the libraries in zlibbioc must > * Add Imports: zlibbioc to the DESCRIPTION file. > * Add import(zlibbioc) to the NAMESPACE file. > Reference the relevant include file in your C source code: > > #include "zlib.h" > > Which does not appear specific to Windows, as the next part of the vignette goes into detail about how to get it to work on Windows.

James W. MacDonald (10:53:14) (in thread): > Anyway, this is in reference to this question on the support site (https://support.bioconductor.org/p/9158498/), which might indicate that the code used in oligo needs to be updated to use more modern versions of zlib. Or maybe I misunderstand the errors?

James W. MacDonald (11:11:49) (in thread): > Ah, bingo! Thanks Alan

2024-05-29

Martin Morgan (10:01:31) (in thread): > oligo would need additional commands in its Makevars file, but this wouldn’t be useful because most users have not built zlibbioc with the necessary flags. I see the error (as a warning) on the builders https://bioconductor.org/checkResults/release/bioc-LATEST/oligo/ so yes oligo code needs to be updated…

2024-05-30

Joshua Hamilton (14:57:13): > @Joshua Hamilton has joined the channel

2024-06-01

Martin Morgan (11:00:18) (in thread): > I see that this is first reported in the Bioc 3.1 build report from 2015 on Ubuntu 14.04; the file was introduced in 2010. I’d guess that this has always been incorrect, a pretty significant bug, and surprising that it hasn’t led to segfaults and other problems. > > I think the changes are easy > > diff --git a/src/ParserGzXYS.c b/src/ParserGzXYS.c > index f324111..4df4caa 100644 > --- a/src/ParserGzXYS.c > +++ b/src/ParserGzXYS.c > @@ -9,7 +9,7 @@ > /***************************************************************************************************************************** > **** countLines: counts lines, returns integer > *****************************************************************************************************************************/ > -static int gzcountLines(gzFile *file){ > +static int gzcountLines(gzFile file){ > int lines = 0; > char buffer[1000]; > char *token; > @@ -38,7 +38,7 @@ static int gzcountLines(gzFile *file){ > *****************************************************************************************************************************/ > > static char *gzxys_header_field(const char *currentFile, const char *field){ > - gzFile *fp; > + gzFile fp; > int j; > char *result, *final; > char buffer[LINEMAX]; > @@ -124,7 +124,7 @@ static void gzread_one_xys(const char *filename, double *signal, > SEXP R_read_gzxys_files(SEXP filenames, SEXP verbosity){ > int nfiles, nrows, i, verbose, *ptr2xy; > double *ptr2signal; > - gzFile *fp; > + gzFile fp; > SEXP signal, xy, output; > SEXP dimnames, dimnamesxy, fnames, colnamesxy, namesout, dates; > char *d0, *d1; > > There’s another warning that should be addressed > > basecontent.c:54:8: warning: format string is not a string literal (potentially insecure) [-Wformat-security] > error(errmess); > ^~~~~~~~~~~~~ > basecontent.c:54:8: note: treat the string as an argument to avoid this > error(errmess); > ^ > "%s", > 1 > > also an easy fix > > diff --git a/src/basecontent.c b/src/basecontent.c > index a4ee78b..90b70cf 100644 > --- a/src/basecontent.c > +++ b/src/basecontent.c > @@ -13,8 +13,6 @@ > > //#include "strutils.h" > > -char errmess[256]; > - > SEXP basecontent(SEXP x) > { > SEXP rv, rownames, colnames, dimnames, dim; > @@ -50,8 +48,7 @@ SEXP basecontent(SEXP x) > ig++; > break; > default: > - sprintf(errmess, "Unknown base %c in row %d, column %d.", seq[j], i+1, j+1); > - error(errmess); > + error("Unknown base %c in row %d, column %d.", seq[j], i+1, j+1); > } > } > INTEGER(rv)[i ] = ia; > > See also https://community-bioc.slack.com/archives/CLUJWDQF4/p1715440086931989?thread_ts=1715264410.574839&cid=CLUJWDQF4 Is Benilton @Benilton S Carvalho still active with oligo? I don’t see a github URL in the DESCRIPTION file.

Benilton S Carvalho (14:41:14): > @Benilton S Carvalho has joined the channel

Benilton S Carvalho (14:44:59) (in thread): > Working on this, @Martin Morgan and @James MacDonald.

Benilton S Carvalho (22:05:07) (in thread): > I believe I have addressed the issues. Apologies for the radio-silence.

2024-06-02

Muskan Singh (08:14:00): > @Muskan Singh has joined the channel

2024-06-03

Martin Morgan (11:29:51) (in thread): > Thanks @Benilton S Carvalho; I think the changes were made to the RELEASE_3_19 branch but also need to be made to devel? > > Also FWIW in thinking about it I can see that this is not usually a catastrophic bug – gzFile is an ‘opaque’ pointer defined in zlib.h as void *, typically stored as a pointer p that needs to be cast (gzFile) p in the internal zlib code. In the oligo code the incorrect definition gzFile *fp is therefore a void ** pointer, which will occupy the same space as p, and the internal cast (gzFile) fp will usually do the right thing.

Benilton S Carvalho (11:31:08) (in thread): > I was pretty sure I had pushed to devel as well. Will double check in a bit.

Benilton S Carvalho (13:01:28) (in thread): > @Martin Morgan, it should be now on devel. Apologies for that.

2024-06-05

S L (07:51:52): > Hello, I am developing an ExperimentHub package to go with a Software package and I am running into some trouble with it; I would be really glad if anyone with previous experience in this could help. This package “contains” an SQLite database with some precomputed values used by the software package as well as example data (github link here https://github.com/slrvv/CENTREprecomputed). I followed the instructions here: https://bioconductor.org/packages/devel/bioc/vignettes/HubPub/inst/doc/CreateAHubPackage.html. The metadata.csv is already in the production database and I am hosting the data on my own server. However when I try to use the SQLite database I get the following error > > > library("ExperimentHub") > > eh <- ExperimentHub() > snapshotDate(): 2024-05-28 > > query(eh, "CENTREprecomputed") > ExperimentHub with 6 records > # snapshotDate(): 2024-05-28 > # $dataprovider: ENCODE, FANTOM5, big.databio.org and MPI for molecular genetics > # $species: Homo sapiens > # $rdataclass: character, SQLiteConnection > # additional mcols(): taxonomyid, genome, description, coordinate_1_based, maintainer, rdatadateadded, > # preparerclass, tags, rdatapath, sourceurl, sourcetype > # retrieve records with, e.g., 'object[["EH9540"]]' > > title > EH9540 | Precomputed Fisher combined p-values and crup correlations database > EH9541 | H3K4me3 ChIP-seq HeLa-S3 from ENCODE > EH9542 | H3K4me1 ChIP-seq HeLa-S3 from ENCODE > EH9543 | H3K27ac ChIP-seq HeLa-S3 from ENCODE > EH9544 | Control ChIP-seq HeLa-S3 from ENCODE > EH9545 | RNA-seq gene quantifications HeLa-S3 from ENCODE > > eh[["EH9540"]] > see ?CENTREprecomputed and browseVignettes('CENTREprecomputed') for documentation > loading from cache > Error: failed to load resource > name: EH9540 > title: Precomputed Fisher combined p-values and crup correlations database > reason: dbExistsTable(conn, "metadata") is not TRUE > > The only example of a metadata table inside a SQLite database I found was 
here https://bioconductor.org/packages/release/bioc/vignettes/AnnotationForge/inst/doc/MakingNewAnnotationPackages.pdf but this is for AnnotationHub packages and doesn’t seem to be what I want to have.

James W. MacDonald (09:46:08) (in thread): > The error you get says that there isn’t a metadata table in your SQLite DB. That should be a simple fix - the metadata table is meant to say where all the underlying data come from. > > As an example > > > dbGetQuery(org.Hs.eg_dbconn(), "select * from metadata;") > name value > 1 DBSCHEMAVERSION 2.1 > 2 Db type OrgDb > 3 Supporting package AnnotationDbi > 4 DBSCHEMA HUMAN_DB > 5 ORGANISM Homo sapiens > 6 SPECIES Human > 7 EGSOURCEDATE 2024-Mar12 > 8 EGSOURCENAME Entrez Gene > 9 EGSOURCEURL ftp://ftp.ncbi.nlm.nih.gov/gene/DATA > 10 CENTRALID EG > 11 TAXID 9606 > 12 GOSOURCENAME Gene Ontology > 13 GOSOURCEURL http://current.geneontology.org/ontology/go-basic.obo > 14 GOSOURCEDATE 2024-01-17 > 15 GOEGSOURCEDATE 2024-Mar12 > 16 GOEGSOURCENAME Entrez Gene > 17 GOEGSOURCEURL ftp://ftp.ncbi.nlm.nih.gov/gene/DATA > 18 KEGGSOURCENAME KEGG GENOME > 19 KEGGSOURCEURL ftp://ftp.genome.jp/pub/kegg/genomes > 20 KEGGSOURCEDATE 2011-Mar15 > 21 GPSOURCENAME UCSC Genome Bioinformatics (Homo sapiens) > 22 GPSOURCEURL > 23 GPSOURCEDATE 2024-Feb29 > 24 ENSOURCEDATE 2023-Nov22 > 25 ENSOURCENAME Ensembl > 26 ENSOURCEURL ftp://ftp.ensembl.org/pub/current_fasta > 27 UPSOURCENAME Uniprot > 28 UPSOURCEURL http://www.UniProt.org/ > 29 UPSOURCEDATE Thu Apr 18 21:39:39 2024 >

James W. MacDonald (09:47:02) (in thread): > So if you just create a metadata table with a name and value column, I would imagine you will be good to go.
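Following James’s suggestion, a minimal sketch of creating such a table with RSQLite; the file name and the name/value rows below are placeholders for this example, not the package’s actual metadata.

```r
library(RSQLite)

con <- dbConnect(SQLite(), "precomputed.sqlite")

# a two-column name/value table, mirroring the org.Hs.eg.db layout above
metadata <- data.frame(
    name  = c("Db type", "Supporting package", "DBSCHEMAVERSION"),
    value = c("CENTREprecomputedDb", "AnnotationDbi", "2.1"),
    stringsAsFactors = FALSE
)
dbWriteTable(con, "metadata", metadata, overwrite = TRUE)

# the check that failed in the original error should now pass
stopifnot(dbExistsTable(con, "metadata"))
dbDisconnect(con)
```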

2024-06-06

S L (06:28:38) (in thread): > Hi! Thanks for the reply. What is the meaning of CENTRALID? And what options for Db type exist? I haven’t seen that mentioned anywhere. Also, should I say that the Supporting package is AnnotationDbi if the package is an ExperimentHub package?

Martin Morgan (11:39:42): > Drawing attention to this post to the R-devel mailing list https://stat.ethz.ch/pipermail/r-devel/2024-June/083449.html which is likely to be relevant (and an important step in the right direction) for package development using C / C++ code.

2024-06-11

Ziru Chen (04:36:25): > @Ziru Chen has joined the channel

Alik Huseynov (05:24:35): > @Alik Huseynov has joined the channel

2024-06-17

Dario Strbenac (12:00:05): > Poor, neglected S4Vectors’ DataFrame. > > library(tidyomics) > data <- DataFrame(sample = LETTERS[1:5], score = rnorm(5)) > data |> filter(score > 0) > Error in UseMethod("filter") : > no applicable method for 'filter' applied to an object of class "c('DFrame', 'DataFrame', 'SimpleList', 'DF_OR_df_OR_dt', 'RectangularData', 'List', 'DataFrame_OR_NULL', 'Vector', 'list_OR_List', 'Annotated', 'vector_OR_Vector')" >

Stevie Pederson (12:31:26) (in thread): > I’m keen to try bringing extraChIPs closer to the tidyomics strategies, but for now, there’s always the following. Not sure if that’ll cause conflicts with tidyomics though. > > library(extraChIPs) > data |> as_tibble() |> dplyr::filter(...) >

Alik Huseynov (12:33:08) (in thread): > I don’t expect dplyr::filter to work on DataFrame, it has to be some tidy/tibble class

Stevie Pederson (12:35:21) (in thread): > Yeah, that’s why I wrote extraChIPs::as_tibble() for DataFrames, GenomicRanges, GenomicInteractions, Seqinfo (etc, etc) objects. Obvs, before tidyomics really took off…

Alik Huseynov (12:39:44) (in thread): > looks good!

Hervé Pagès (14:39:59) (in thread): > > I don’t expect dplyr::filter to work on DataFrame, it has to be some tidy/tibble class > Unless someone implements a filter method for DataFrame, which is what we need here. data |> as_tibble() |> filter(...) kind of works but it returns a tibble. With a filter method for DataFrame, we would be able to do data |> filter(...) and that would return another DataFrame.
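An untested sketch of the kind of method Hervé describes: dplyr::filter() is an S3 generic, and the error message above shows "DataFrame" in the dispatch class vector, so a filter.DataFrame() method that round-trips through a data.frame would dispatch and hand back a DataFrame. The method name and implementation here are hypothetical, not part of any package.

```r
library(S4Vectors)
library(dplyr)

# hypothetical S3 method: filter as a data.frame, coerce back to DataFrame
filter.DataFrame <- function(.data, ..., .preserve = FALSE) {
    as(filter(as.data.frame(.data), ..., .preserve = .preserve), "DataFrame")
}

d <- DataFrame(sample = LETTERS[1:5], score = rnorm(5))
d |> filter(score > 0)  # now returns a DataFrame rather than erroring
```

A real implementation would likely live in plyranges or a tidyomics package and preserve metadata columns more carefully; this only shows the dispatch mechanics.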

Alik Huseynov (14:50:12) (in thread): > yes, that would be indeed awesome.

Alik Huseynov (15:06:26) (in thread): > btw, is anyone already working on a filter method for DataFrame?

Hervé Pagès (15:09:58) (in thread): > I’m not. I think that’s something maybe for the tidyomics package? Anyone willing to open an issue and ask there https://github.com/tidyomics/tidyomics/issues? Thanks

Alik Huseynov (15:10:40) (in thread): > good idea, I will do that

Alik Huseynov (15:19:49) (in thread): > https://github.com/tidyomics/tidyomics/issues/19

Stevie Pederson (21:18:00) (in thread): > Nice one!

2024-06-20

Lucio Queiroz (08:58:45): > @Lucio Queiroz has joined the channel

Jared Andrews (11:08:47): > I feel like I remember a package that had a class for storing arbitrary differential expression results from DESeq2/limma/edgeR in a consistent format, but I cannot for the life of me remember what it was called or if it was even complete. Does anybody have any idea what I’m talking about?

2024-06-21

Jacques SERIZAY (11:08:13) (in thread): > I can think of ViDGER, but this package is for visualization of the results from these packages (not sure it included limma though). Not sure this is what you are looking for https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6954399/ - Attachment (PubMed Central (PMC)): Interpretation of differential gene expression results of RNA-seq data: review and integration > Differential gene expression (DGE) analysis is one of the most common applications of RNA-sequencing (RNA-seq) data. This process allows for the elucidation of differentially expressed genes across two or more conditions and is widely used in many applications …

Jared Andrews (16:41:44) (in thread): > Unfortunately that’s not it. I want to say it kept the results in rowData in a SummarizedExperiment or something, but had clean ways of accessing them that were consistent across the methods. Perhaps I made it up.

2024-06-23

Charlotte Soneson (10:55:21) (in thread): > Makes me think of iSEEde :slightly_smiling_face: https://www.bioconductor.org/packages/iSEEde/ - Attachment (Bioconductor): iSEEde > This package contains diverse functionality to extend the usage of the iSEE package, including additional classes for the panels or modes facilitating the analysis of differential expression results. This package does not perform differential expression. Instead, it provides methods to embed precomputed differential expression results in a SummarizedExperiment object, in a manner that is compatible with interactive visualisation in iSEE applications.

Jared Andrews (11:05:53) (in thread): > Ah! That does look like what I’m thinking of, thanks!

2024-06-30

Clara Tejido (12:50:49): > @Clara Tejido has joined the channel

2024-07-02

Constanza Perez (18:54:28): > @Constanza Perez has joined the channel

Hervé Pagès (23:46:25): > A heads-up that the following are now deprecated in DelayedArray 0.31.5: SparseArraySeed objects plus the OLD_extract_sparse_array() and read_sparse_block() generics. This is part of the modernization of block-processing of sparse datasets. In particular read_block() now loads sparse blocks in an SVT_SparseArray object instead of a SparseArraySeed or dgCMatrix object, hence leveraging the efficiency of the former. For example the code below is between 3x and 4x faster in BioC 3.20 vs BioC 3.19: > > library(ExperimentHub) > fname <- ExperimentHub()[["EH1039"]] > library(HDF5Array) > oneM <- TENxMatrix(fname, group="mm10") > cv <- colVars(oneM) # block processed > > The packages possibly affected by these deprecations are: batchelor, beachmat, SCArray, DropletUtils, glmGamPoi, scuttle, DelayedMatrixStats, alabaster.matrix, scran, dreamlet, DelayedTensor, TileDBArray, CAGEfightR, DelayedRandomArray, TSCAN, scater, SCArray.sat. I’ll take care of each of them in the next few days.

2024-07-03

Aedin Culhane (05:19:51): > In a review from GigaScience, they state “register any new software application in the bio.tools and SciCrunch.org databases to receive RRID (Research Resource Identification Initiative ID) and biotoolsID identifiers.” Does Bioconductor do this? Have you done this before?

Kasper D. Hansen (17:48:46): > I have never heard of these

Kasper D. Hansen (17:49:14): > Perhaps I should have.

2024-07-04

Mike Smith (03:54:09): > I’ve not heard of SciCrunch, but have used bio.tools in the past. There’s 1500 tools in the ‘BioConductor’ collection, so I suspect someone has systematically added a lot, but not recently. 6-7 years ago we registered all our de.NBI tools in bio.tools and it was a nightmare. Manual entry was tedious, but the API submission via JSON was equally painful with no easy way to validate your JSON until submission and little in the way of debugging when it failed. That was a few years ago, so hopefully it’s improved, but I did not enjoy the process. We simply stopped doing any updates after a while. > > As far as making tools visible or referable, I don’t really see what bio.tools brings that Bioconductor doesn’t already have. The landing page already has most (all?) of the information on the bio.tools page, plus we get a DOI for long term referencing, and the information is kept up-to-date automatically by the build system. Perhaps for more independent tools it provides a useful service, but I think Bioconductor actually does the same job well.

rodrigo arcoverde (05:51:52): > @rodrigo arcoverde has joined the channel

Hervé Pagès (12:57:56) (in thread): > My understanding is that the planned move from biocViews to the EDAM ontology should facilitate the inclusion of Bioconductor tools in the bio.tools registry.

Robert Castelo (13:45:01): > Hi, I’d say that I’ve always observed how vectorized operations have better performance than iterative operations based on sapply(), but I recently encountered a counterexample with IntegerList objects, and I’d like to ask the experts here why in this case that observation does not hold. Here is a minimal code making this point: > > suppressPackageStartupMessages(library(IRanges)) > > ## create an IntegerList object of random integers > p <- 10000 > l <- as.list(sample(10:100, size=100, replace=TRUE)) > l <- lapply(l, function(n, p) sample(1:p, size=n, replace=FALSE), p) > il <- IntegerList(l) > > ## sample a long vector of real numbers > z <- rnorm(p) > > ## create a function f1() that iterates through the list > ## and sums the real numbers at positions indicated by the > ## random sets of integers > f1 <- function(x) sapply(x, function(y) sum(z[y])) > > ## create the same function this time using vectorized > ## operations of the IntegerList class > f2 <- function(x) sum(relist(z[unlist(x)], x)) > > bench::mark(f1(il), f2(il), iterations=1000) > # A tibble: 2 × 13 > expression min median `itr/sec` mem_alloc `gc/sec` n_itr n_gc > <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl> <int> <dbl> > 1 f1(il) 357us 382us 2472. 175KB 0 100 0 40.5ms > 2 f2(il) 754us 813us 1200. 85KB 12.1 99 1 82.5ms > # ℹ 5 more variables: total_time <bch:tm>, result <list>, memory <list>, > # time <list>, gc <list> > > as you can see, f2() using vectorized operations on IntegerList objects takes twice as much time as f1() using the iterative sapply(). What could be an explanation behind the slower performance of vectorized operations in this case? .. maybe the overhead of object management (creating, validation, dispatching, etc.)? .. is there any IRanges sorcery I don’t know of that would beat the sapply() incantation?:sweat_smile:

Marcel Ramos Pérez (16:05:50) (in thread): > If you start with a plain numeric vector, then yes it will take some time to transform the numeric vector into a list-like representation (e.g., with relist). So I don’t think a transformation and summation operation would beat lapply/vapply. An advantage to having *List objects would be to have the representation of z already as a NumericList() based on the workflow but that’s not always possible. In general, both methods shown here are pretty fast.

Hervé Pagès (22:39:55) (in thread): > @Robert Castelo This is actually a small loop (the length of your IntegerList object is only 100). There’s a little bit of a small fixed overhead with f2() that gives f1() an advantage on such a small loop. With bigger loops, that advantage goes away. E.g. with an IntegerList object of length 1000 (i.e. size=1000): > > > microbenchmark(f1(il), f2(il)) > Unit: microseconds > expr min lq mean median uq max neval cld > f1(il) 1220.493 1255.0415 1370.7596 1293.117 1348.178 4701.557 100 a > f2(il) 655.448 677.7475 753.8372 694.706 771.238 3876.535 100 b > > size=10000: > > > microbenchmark(f1(il), f2(il)) > Unit: milliseconds > expr min lq mean median uq max neval cld > f1(il) 12.107002 12.92722 15.079087 13.569822 13.941332 114.59181 100 a > f2(il) 2.204237 2.39086 3.240399 3.088899 3.248178 8.30704 100 b > > size=100000: > > > microbenchmark(f1(il), f2(il)) > Unit: milliseconds > expr min lq mean median uq max neval cld > f1(il) 133.20419 143.62076 158.36233 148.42031 152.44105 290.7321 100 a > f2(il) 17.84308 24.58816 31.06603 26.70258 27.27606 178.2969 100 b > > etc…

Hervé Pagès (22:47:55) (in thread): > If you know that the length of the IntegerList object is going to be small (e.g. one list element per chromosome), and you really want to use an IntegerList rather than an ordinary list, then call the IRangesList() constructor with compress=FALSE to get a SimpleIRangesList instead of a CompressedIRangesList. This should make your f1() implementation or any code relying on lapply()/sapply() slightly faster.
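The same compress = FALSE idea applies to the IntegerList() constructor used earlier in the thread; a small self-contained sketch (the list contents here are arbitrary sample data):

```r
library(IRanges)

l <- lapply(1:100, function(i) sample(1:10000, size = 50))

il_compressed <- IntegerList(l)                    # CompressedIntegerList (default)
il_simple     <- IntegerList(l, compress = FALSE)  # SimpleIntegerList

# the Simple representation keeps each element as an ordinary integer
# vector, so lapply()/sapply()-style iteration avoids the
# unlist/partitioning overhead of the Compressed representation
class(il_simple)
```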

2024-07-05

Hervé Ménager (03:36:21): > @Hervé Ménager has joined the channel

Claire Rioualen (03:37:30): > @Claire Rioualen has joined the channel

Antonin Thiébaut (04:37:23): > @Antonin Thiébaut has joined the channel

S L (11:07:05): > Hello, I am developing an ExperimentHub package to go with a Software package and I am running into some trouble with it. The package should expose an SQL database. I am following what is written in https://bioconductor.org/packages/release/bioc/vignettes/AnnotationForge/inst/doc/MakingNewAnnotationPackages.pdf in chapter 7, “Setting up a package to expose a SQLite database object”. At one point it is mentioned: “Once you have set up the metadata you will need to create a class for your package > that extends the AnnotationDb class. In the case of the org.Hs.eg.db package, the > class is defined to be a OrgDb class” What does this class extension actually mean, and where should it go inside the package? Another question is on the DB schemas. Are they always needed? In particular, in a case where there are no relationships between the tables? In my case HUMAN_DB and the like are not applicable. I know there are some packages that use schemas defined for the particular packages, like https://gitlab.com/42analytics1/public/reactome.db-r-package/-/blob/main/static_files/REACTOME_DB.sql?ref_type=heads - Attachment (GitLab): static_files/REACTOME_DB.sql · main · 42Analytics / Public / reactome.db R package · GitLab > GitLab.com

Michael Lawrence (14:05:52): > I am hitting some sort of git hook issue when trying to commit rtracklayer: > > remote: FATAL: W refs/heads/master packages/rtracklayer m.lawrence DENIED by fallthru > remote: error: hook declined to update refs/heads/master > To git.bioconductor.org:packages/rtracklayer.git > ! [remote rejected] master -> master (hook declined) > error: failed to push some refs to 'git.bioconductor.org:packages/rtracklayer.git' > > Thanks for any help.

Marcel Ramos Pérez (14:11:27) (in thread): > Hi Michael, pushing to master has been disabled. Please use the devel branch.

Robert Castelo (15:30:32) (in thread): > Thanks for the tips!! I’m not quite getting the speed-up with longer IntegerList objects within a package, and I was wondering whether the z[unlist(il)] bit might not be picking up the right unlist() method for IntegerList objects, but I could not find what import I should use. E.g., I’m importing relist() from IRanges, but if I try to import unlist() from IRanges, roxygen2 tells me “@importFrom Excluding unknown export from IRanges: unlist”. How can I import the correct unlist() method for IntegerList objects?

Hervé Ménager (15:36:38) (in thread): > > Hi, it is true that the 1500 Bioconductor tools were imported as a one-time batch a few years ago. At some point there were challenges in using the bio.tools API, but we hope that more recent versions are easier to use, and they now allow for entry validation before creation. We (I’m part of the bio.tools and EDAM teams) have started discussing with members of the Bioconductor community how to integrate the biocViews and bio.tools perspectives. The idea is that, with adequate documentation and integration, there would be no additional (or even less) work to publish Bioconductor packages. Publishing a package to a more general registry would improve its visibility to bioinformaticians who might not have looked for a package in Bioconductor.

Hervé Pagès (16:25:15) (in thread): > Always import the S4 generic function rather than a specific method (method dispatch will take care of finding the right method). The unlist() generic is defined in BiocGenerics, so just @importFrom BiocGenerics unlist.
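In roxygen2 terms, the advice amounts to tagging the generic’s home package (BiocGenerics) rather than IRanges; relist() stays imported from IRanges as in the original code. A minimal sketch of the tags:

```r
# roxygen2 import tags: import the unlist() S4 generic from BiocGenerics
# (where it is defined); method dispatch finds the IntegerList method.
#' @importFrom BiocGenerics unlist
#' @importFrom IRanges relist
NULL
```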

2024-07-09

Nitesh Turaga (12:05:58): > Anyone use positron here? What are your thoughts?

Nitesh Turaga (12:06:17): > https://github.com/posit-dev/positron

2024-07-11

Jenny Drnevich (09:50:50) (in thread): > I haven’t heard about it. The “next-generation data science IDE” description makes me wonder if that is their intended eventual replacement for RStudio?

Lluís Revilla (09:53:48) (in thread): > For the moment they want to keep maintaining both.

2024-07-14

Sudipta Hazra (18:20:02): > @Sudipta Hazra has joined the channel

2024-07-15

Michael Lawrence (13:43:11) (in thread): > So this is a permanent change? Everything goes into devel from now on?

Marcel Ramos Pérez (13:44:36) (in thread): > Yes, we wrote a blog post about it some time ago: https://blog.bioconductor.org/posts/2023-03-01-transition-to-devel/ - Attachment (Bioconductor community blog): Renaming the Default Branch to Devel – Bioconductor community blog > During the 3.17 devel cycle, Bioconductor will rename the default branch devel.

Michael Lawrence (13:46:14) (in thread): > I have a bunch of stuff that went into master. Is there an easy way to bring that over?

Michael Lawrence (13:47:04) (in thread): > It would be hard to believe that I am the only one surprised by this.

Marcel Ramos Pérez (13:49:18) (in thread): > we have some code that would help with that, esp. for maintainers with lots of packages on GH and locally: https://github.com/bioconductor/branchrename

Michael Lawrence (13:53:25) (in thread): > Ok, thanks for the pointer. I am not sure if I feel comfortable using those tools. I will just cherrypick each commit over.

Marcel Ramos Pérez (13:59:48) (in thread): > master and devel branches should be clones… if master is ahead of devel you can map and push with (no need to cherry-pick) > > git push upstream master:devel > > and then rename the old master branch with > > git branch -m master devel > > (more details in the blog post)

Michael Lawrence (15:43:17) (in thread): > Ok, everything seems to work. Thanks for your help. git freaks me out, to be honest.

Ahmad Al Ajami (18:28:17): > Hi everybody, I hope this is the right place to ask the following. > I am using the BuildABiocWorkshop template to create a package demo. I keep getting the following error on gh-pages when building the vignette with r-build-and-check: the condition has length > 1. This error is thrown after running a simple runPCA(sce). > Locally, I am able to run the entire vignette, knit an html file, and execute devtools::check() with no errors whatsoever. I am using BiocManager version 3.19 and R version 4.4.1. I tried many things but this error on GitHub persists. Has anyone faced such an error with runPCA? > I am happy to share a link to the repo if needed. > Thanks a lot and cheers!

Vince Carey (19:16:53): > do provide a link

2024-07-16

Ahmad Al Ajami (02:38:25) (in thread): > https://github.com/ahmadalajami/scIGDWorkflowDemoBioC2024 Thanks a million, Vince!

Hiru (06:56:24): > @Hiru has joined the channel

Hiru (07:17:01): > Hello! May I know the average review time around this time of the year? I would like to get my package reviewed before the deadline for my thesis in late August.

Lori Shepherd (07:20:01): > it can vary – I generally moderate packages (quick glance and assign reviewers) once or twice a week – once a reviewer is assigned we say to allow reviewers 2-3 weeks for a review – making sure you have run R CMD build, R CMD check, BiocCheck, and have followed the important Bioconductor features in contributions.bioconductor.org will help speed things along

Hiru (07:24:31) (in thread): > Thanks for the quick reply! I can confirm running check(), build() and BiocCheck() with only a couple of notes (I believe I can justify them). > Would it be realistic to expect a review in about a month? I submitted my package last week.

Lori Shepherd (07:25:26) (in thread): > yes. I plan on moderating package later today, tomorrow at the latest so as long as there are no other issues you should be assigned a reviewer shortly

Hiru (07:29:12) (in thread): > Thank you for the assurance!

Vince Carey (10:23:29) (in thread): > i dont see an error log in your repo

Ahmad Al Ajami (10:34:31) (in thread): > Because I literally just managed to find a way around it. After more than 50 commits:white_check_mark::green_heart: All I had to do was subset nrows in the sce object before running runPCA(). Something like this: sce <- sce[chosen_hvgs, ]. Keeping the original sce object and running runPCA(sce, subset_row=chosen_hvgs) was throwing an error. I had to choose around 500 features only, which is OK for the demo session purpose. Anything more was causing an error. > > Although the error message from the checks was unclear to me (the condition has length > 1), I do believe it was memory that was causing the issue. I am not sure how much is allocated, but it did take a big chunk of memory when run locally

Ahmad Al Ajami (10:36:35) (in thread): > Thanks a lot for offering to help though. Much appreciated!

Hervé Pagès (12:10:47) (in thread): > @Hiru Note that we have a dedicated channel for questions about package submissions/reviews: #package-submission. Thanks!

Hiru (12:19:55) (in thread): > Thanks!

2024-07-19

Henrik Bengtsson (02:16:23) (in thread): > > Although the error message from the checks was unclear to me (the condition has length > 1), > This is a bug in some of the R code you’re using. Calling traceback() when you get the error would tell you exactly in which function. > > See https://github.com/HenrikBengtsson/Wishlist-for-R/issues/38 for details on this error message. In the past, R ignored these types of bugs, but starting with R 4.2.0 (2022-04-22), it considers it an error. From the NEWS in R 4.2.0: > * Calling if() or while() with a condition of length greater than one gives an error rather than a warning. … > Source: https://cran.r-project.org/doc/manuals/r-release/NEWS.html - Attachment: #38 ROBUSTNESS: Give error if condition for control statements have length != 1 > Issue > > Control statements if(cond) and while(cond) in R gives an error if length(cond) == 0, but if length(cond) > 1, then only a warning is produced, e.g. > > > x <- 1:2 > > if (x == 1) message(“x == 1”) > x == 1 > Warning message: > In if (x == 1) message(“x == 1”) : > the condition has length > 1 and only the first element will be used > > By the design of if and while control statements, it makes no sense to use a condition with a length other than one. Because of this there is not logical reason why the above should not be considered an error rather than a warning. > > Suggestion > > The long-term goal should be that R always produces an error if the length of the condition differs from one. However, due to only a warning has been produces this far, there is a great risk that there exist lots of code that will break immediately if an error is generated. In order to avoid wreaking havoc, a migration from producing a warning to an error may go via an option check.condition with default value FALSE. When FALSE, the current behavior remains and only a warning is produced. With TRUE, an error is produced. 
R CMD check --as-cran could enable options(check.condition = TRUE) such that all new packages submitted to CRAN must pass this new requirement. This will also allow individual developers to run checks locally. > > Patch > > A complete patch is available in https://github.com/HenrikBengtsson/r-source/compare/hb-develop…HenrikBengtsson:hb-feature/check-condition. Example: > > > options(“check.condition”) > $check.condition > [1] FALSE > > if (x == 1) message(“x == 1”) > x == 1 > Warning message: > In if (x == 1) message(“x == 1”) : > the condition has length > 1 and only the first element will be used > > > options(check.condition = TRUE) > > x <- 1:2 > > if (x == 1) message(“x == 1”) > Error in if (x == 1) message(“x == 1”) : the condition has length > 1 > > References > > • R-devel thread ‘vector arguments to if()’, 2002-11-30. https://stat.ethz.ch/pipermail/r-devel/2002-November/025537.html > • R-devel thread ‘Control statements with condition with greater than one should give error (not just warning) [PATCH]’ on 2017-03-03 (https://mailman.stat.ethz.ch/pipermail/r-devel/2017-March/thread.html#73817|https://mailman.stat.ethz.ch/pipermail/r-devel/2017-March/thread.html#73817)
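A minimal base-R illustration of the behavior change described above (requires R >= 4.2.0; in older versions the same code only warned):

```r
x <- 1:2

# x == 1 yields c(TRUE, FALSE); under R >= 4.2.0, if() with a
# length-2 condition is an error, which we capture here:
msg <- tryCatch(
  if (x == 1) "matched",
  error = conditionMessage
)
msg  # "the condition has length > 1"
```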

2024-07-25

rodrigo arcoverde (09:38:27): > Hi all, > Has anyone here tried to use programs that are not installed through conda in a conda environment created by basilisk? My problem is that one program does not exist on the bioconda channel for Windows, since that channel doesn’t support Windows, but the program (igblast) does exist for Windows. What would be the best way to set it up to be used within a basilisk conda env? Should I ask Windows users to download the program and place it inside the R package?

Aaron Lun (11:39:37): > if you can pip install it (either via pip= or via a vendored path=), you can define windows-specific instructions. https://github.com/crisprVerse/crisprScore/blob/devel/R/basilisk.R has windows-specific paths (though it just skips windows in this case).

2024-07-26

rodrigo arcoverde (05:04:03) (in thread): > Thanks for the answer! Unfortunately, it does not install via pip. What exactly do you mean by a vendored path? Could it be a URL with the program for windows?

rodrigo arcoverde (05:05:11) (in thread): > The program itself is not in python, but I am using it in a python script

rodrigo arcoverde (05:08:22) (in thread): > Here is the program itself: https://ftp.ncbi.nih.gov/blast/executables/igblast/release/LATEST/

rodrigo arcoverde (05:48:07) (in thread): > Another question: how can I set up subdirectories for osx-64 in a conda env, to make some osx-64 packages available on an arm64 OS? In conda, I can do: > > conda config --env --set subdir osx-64 > > and then I can install as normal on arm64

Qiwen Octavia Huang (19:48:15): > @Qiwen Octavia Huang has joined the channel

2024-07-27

Aaron Lun (03:53:39) (in thread): > 1. If it’s a standalone (non-conda, non-python) program, basilisk doesn’t have much to say on the matter. You’ll just have to download it and install it separately, or prompt the user to do so. > 2. No support for pulling from osx-64 right now. We use reticulate::conda_install and there’s no opportunity to inject a conda config between the internal create and install calls. Perhaps a --platform osx-64 can be added to the additional_create_arguments=, but I haven’t tried. If it works you could consider making a PR to the basilisk repo at this line here.

2024-07-29

Vince Carey (05:45:01): > Checking in here: If you are conversant with CMake (cmake.org) please upvote.

rodrigo arcoverde (05:46:28) (in thread): > Thanks for all the answers, Aaron! I will see if I can make it work for windows by prompting the user to download it, and then come back to the arm-64.

Stevie Pederson (10:27:54): > Heads up for those who use pkgdown on their personal repos:https://www.tidyverse.org/blog/2024/07/pkgdown-2-1-0/ - Attachment (tidyverse.org): pkgdown 2.1.0 > pkgdown 2.1.0 includes two major new features: support for quarto vignettes and a “light switch” that lets the reader switch between light and dark mode. It also contains a bunch of other improvements to both the user and the developer experience.

Dirk Eddelbuettel (10:41:43) (in thread): > It’s actually been out at CRAN for a little over three weeks.

Stevie Pederson (10:49:08) (in thread): > Ah sorry. Thought I was being helpful. I always get anxious about breaking changes with these posit releases, but this one looks relatively benign.

Dirk Eddelbuettel (10:50:51) (in thread): > They are definitely laggy with their press releases. If you want to see ‘change as it happens’ I built a service for that … some 17 (?) or so years ago. CRANberries has an RSS feed, and skeets and toots if you’re into that. The plain (old) html is at https://dirk.eddelbuettel.com/cranberries/

Tim Triche (11:04:47): > quarto support for pkgdown is great!

Stevie Pederson (23:07:25) (in thread): > Does bioc support this? Haven’t tried yet, but it might be pretty cool in future

2024-07-30

Tim Triche (10:19:08) (in thread): > Not sure but one way to find out

Hervé Pagès (14:53:13) (in thread): > > Does bioc support this? > FWIW BioC supports quarto-based books thanks to @Jacques SERIZAY’s work: https://bioconductor.org/packages/BiocBook - Attachment (Bioconductor): BiocBook > A BiocBook can be created by authors (e.g. R developers, but also scientists, teachers, communicators, …) who wish to 1) write (compile a body of biological and/or bioinformatics knowledge), 2) containerize (provide Docker images to reproduce the examples illustrated in the compendium), 3) publish (deploy an online book to disseminate the compendium), and 4) version (automatically generate specific online book versions and Docker images for specific Bioconductor releases).

Jorge Kageyama (17:49:28): > @Jorge Kageyama has joined the channel

2024-08-01

Matt H (08:52:06): > @Matt H has joined the channel

Matt H (09:52:40) (in thread): > Hi, > > I’ve tried adding --platform osx-64 and --subdir osx-64 to the additional_create_args as additional_create_args=c("--override-channels", "--platform osx-64"), and additional_create_args=c("--override-channels", "--subdir osx-64"), which produces this error when trying to call our function that uses basilisk: > > /Users/loregroup/Library/Caches/org.R-project.R/R/basilisk/1.17.2/0/bin/conda create --yes --prefix /Users/loregroup/Library/Caches/org.R-project.R/R/basilisk/1.17.2/scifer/1.7.5/igblast_wrap_basilisk2 'python=3.9.19' --quiet -c bioconda -c conda-forge --override-channels '--platform osx-64' > usage: conda [-h] [-V] command ... > conda: error: unrecognized arguments: --platform osx-64 > Error: Error creating conda environment '/Users/loregroup/Library/Caches/org.R-project.R/R/basilisk/1.17.2/scifer/1.7.5/igblast_wrap_basilisk2' [exit code 2] > Called from: stopf(fmt, envname, result, call. = FALSE) > > and > > + /Users/loregroup/Library/Caches/org.R-project.R/R/basilisk/1.17.2/0/bin/conda create --yes --prefix /Users/loregroup/Library/Caches/org.R-project.R/R/basilisk/1.17.2/scifer/1.7.5/igblast_wrap_basilisk_test 'python=3.9.19' --quiet -c bioconda -c conda-forge --override-channels '--subdir osx-64' > usage: conda [-h] [-V] command ... > conda: error: unrecognized arguments: --subdir osx-64 > Error: Error creating conda environment '/Users/loregroup/Library/Caches/org.R-project.R/R/basilisk/1.17.2/scifer/1.7.5/igblast_wrap_basilisk_test' [exit code 2] > Called from: stopf(fmt, envname, result, call. = FALSE) > > I have been able to use conda create -n env_name -c bioconda -c conda-forge --override-channels --platform osx-64 and --platform osx-64 through the terminal to create conda environments with the osx-64 platform on an osx-arm64 device (M2 silicon), conda version 24.7.1. > > I’m unsure if the issue is coming from conda or reticulate. 
Changing the order where --platform osx-64 goes doesn’t seem to help, as it doesn’t recognize the argument at all. From what I could gather, both --platform and --subdir were added in conda version 23.10.0 (https://github.com/conda/conda/pull/11794)?

Matt H (11:55:52) (in thread): > Additionally I think the conda binary that basilisk uses by default is version 4.12.0?

Aaron Lun (11:56:09) (in thread): > i’m pretty sure those need to be two separate strings, i.e., c("--platform", "osx-64"). and make sure you use bioc-devel.
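The point about separate strings is general to how argument vectors reach a subprocess: each vector element becomes one argv entry, so a flag and its value must be split. A small base-R illustration:

```r
# One string -> one (quoted) argument that conda cannot parse:
args_bad <- c("--override-channels", "--platform osx-64")
shQuote(args_bad[2])  # "'--platform osx-64'"  (a single argv entry)

# Flag and value as separate elements -> two argv entries,
# which is what conda's argument parser expects:
args_good <- c("--override-channels", "--platform", "osx-64")
length(args_good)  # 3
```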

2024-08-02

Matt H (04:41:12) (in thread): > Thank you so much! I’m checking that now,

2024-08-05

Jeroen Ooms (08:14:59): > Hello, I noticed zlibbioc has PackageStatus: "Deprecated" in bioc 3.20. Does this mean it will be removed? There are 1729 packages depending on it.

Lori Shepherd (09:11:34) (in thread): > @Martin Morgan / @Hervé Pagès could one of you comment on this? It was thought that it is no longer needed?

Matt H (10:06:30) (in thread): > Hi Aaron, > Thanks again for responding and your help! Your suggestion worked when I changed the conda= argument from conda.cmd to a path to a newer conda version, 24.7.1. This allowed us to progress to an error about incompatible architecture, which I’ve encountered before and which seems to be a more complex problem. > > Error: /libpython3.9.dylib' (mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64e' or 'arm64')) >

Martin Morgan (10:07:20) (in thread): > On Linux & macOS it was always meant as a placeholder, deferring to system libraries. On Windows it was required because there was no alternative zlib. But I think this has changed, and it is not necessary there (R provides an alternative?). But I don’t have the resources to confirm that. It is old (v.1.2.5), which is problematic from a security perspective. Minimally the zlib source code should be updated. > > I guess there are only 34 direct reverse CRAN / Bioc dependencies so not quite so daunting…

Jeroen Ooms (11:40:31) (in thread): > Yes all platforms just have-lzthese days, there is no need to have an R package for this.

Jeroen Ooms (11:44:05) (in thread): > Apologies, my count is the recursive hard deps

Aaron Lun (11:48:48) (in thread): > You don’t mention the context in which this failure occurs. > > I would guess that it happens when loading Python via reticulate. > > I would further guess that your R installation uses a native arm64 build. > > My final guess is that it is not possible to link an x86-64 python to an arm64 R build. So if you want to use this environment, you’ll have to do so outside of reticulate.

2024-08-06

Matt H (04:51:00) (in thread): > Yes sorry, the error occurred while trying to create the environment when calling the function that runs igblast. Thank you so much for the assistance thus far.

2024-08-14

Michael Lawrence (09:26:25): > I was just browsing the R source code and discovered that this has been possible since at least 2013: > > > setClass("Foo", slots = c(foo = "character")) > > `@<-.Foo` <- function(object, name, value) { attr(object, name) <- paste0(value, "."); object } > > foo <- new("Foo") > > foo@foo <- "bar" > > foo@foo > [1] "bar." > > Maybe others knew about this? It’s interesting in that it allows for some degree of encapsulation of S4 slots. A simple framework could be devised that e.g. allows for per-slot validation, automatic coercion, etc. @() was made generic during the S7 work last year. We made it skip S4 objects though, for obvious reasons. I didn’t realize that @<-() was dispatching even on S4 objects. Anyway, people should be migrating to S7 anyway, and it already supports this level of encapsulation.
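For comparison, the per-slot validation mentioned here is built into S7 properties; a minimal sketch along the lines of the range example in the S7 documentation (assumes the S7 CRAN package is installed):

```r
library(S7)

# A class whose validator runs both at construction and on @<- assignment
Range <- new_class("Range",
  properties = list(start = class_double, end = class_double),
  validator = function(self) {
    if (self@end < self@start) "@end must be >= @start"
  }
)

r <- Range(start = 1, end = 10)
r@end <- 20          # fine; the validator re-runs and passes
try(r@end <- 0)      # validation error: @end must be >= @start
```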

Tim Triche (11:09:27): > What’s S7?

Marcel Ramos Pérez (11:11:00) (in thread): > https://www.youtube.com/watch?v=P3FxCvSueag - Attachment (YouTube): Hadley Wickham | An introduction to R7 | RStudio (2022)

Tim Triche (11:11:36): > Ok, now I see what S7 is. Should this become an item of emphasis for BioC? It might allow some cleanups of cruft and dependencies that make the “chain” a bit heavyweight for “light” operations such as in GEOquery.

Tim Triche (11:12:44) (in thread): > Ah, I remember R6 now. Is S7 meant to be a foundation in R 5.0 and forwards?

Tim Triche (11:16:04): > Reading through the documentation, it seems to address a number of issues related to interop (e.g. wasm and python) and common use cases (e.g. Seurat encouraging willy-nilly use of attr(a@b$c[‘d’], “efg”))

Tim Triche (11:18:12): > Are you still on the TAB? If not, who represents Genentech’s perspective?

Michael Lawrence (13:12:19) (in thread): > It’s marrying improved S4-style formal class definitions with a form of multiple dispatch based on S3. We do hope to get it into base R. I’ve really enjoyed working with it, not to pat myself on the back:wink:

Michael Lawrence (13:13:23) (in thread): > No one, as far as I know. I wouldn’t mind rejoining.

Tim Triche (13:13:27) (in thread): > I’m going to take a stab at implementing a rectangular file-backed SummarizedExperiment-like class with it, using biocthis for the template generation. Wish me luck:wink:

Michael Lawrence (13:14:06) (in thread): > Good luck!:slightly_smiling_face:

Lluís Revilla (13:24:38): > Thanks @Michael Lawrence, this might come in handy. In relation to @Tim Triche’s question: I thought that S7 was not yet ready for general use (or for Bioconductor’s usage), as CRAN’s package is still under version 1.0. Perhaps I am remembering something already fixed in the 0.1.1 version… Would you recommend updating packages to use it?

Charlotte Soneson (13:28:38) (in thread): > @Michael LawrenceApplication is open until end of August:https://community-bioc.slack.com/archives/C35G93GJH/p1721836160278269 - Attachment: Attachment > :star2: Annual Nominations Open for CAB & TAB! :star2: > > Are you interested in contributing to Bioconductor decision-making, or do you know someone who would be a great fit? Join our advisory boards! > > :globe_with_meridians: The Community Advisory Board (CAB) aims to: > • enable productive and respectful participation in the Bioconductor project by users and developers at all levels of experience > • empower user and developer communities by coordinating training and outreach activities > :wrench: The Technical Advisory Board (TAB) aims to: > • develop strategies to ensure long-term technical suitability of core infrastructure for the Bioconductor mission > • identify and pursue technical and scientific aspects of funding strategies for long-term viability of the project > :date: Apply by Aug 31: > • CAB Application Form > • TAB Application Form > Don’t miss this opportunity to impact Bioconductor’s future! > > Feel free to share this on LinkedIn or https://genomic.social/@bioconductor/112842242945767836|Mastodon

2024-08-15

Maria Doyle (11:49:48): > Great question,@Lluís Revilla! I’m also curious to hear thoughts on this. By the way, that S7 discussion was really interesting. If anyone’s up for turning thoughts like these into short blog posts for Bioconductor, it could be helpful for the community, and it would be great to have more technical posts. I’m happy to help out if needed.

Kasper D. Hansen (12:47:59): > I am returning to one of my pet peeves: bug fixes in release. We used to have a policy (not always followed) that any bug fix in release needed to be accompanied by an email to bioc explaining the change. I don’t see this policy anymore in the “book”: https://contributions.bioconductor.org/index.html - Attachment (contributions.bioconductor.org): Welcome | Bioconductor Packages: Development, Maintenance, and Peer Review > This is a minimal example of using the bookdown package to write a book. The output format for this example is bookdown::gitbook.

Kasper D. Hansen (12:48:20): > Also, it is not totally clear where the reporting should happen - on the support site or on Slack etc.

Kasper D. Hansen (12:48:33): > What are people’s opinions here?

Kasper D. Hansen (12:49:16): > Originally, this policy was introduced because we have a stable release and - by definition - any bug fix in release is a change to a stable thing, and those things should not be undertaken lightly

Vince Carey (15:37:40): > Hi @Kasper D. Hansen – I personally don’t remember an email policy for bug fixes in release. Nor do I see any statement of a policy restricting modifications to code in release to “bug/doc fixes”, though such a statement should be prominent IMHO. You are right that bug fixes can alter stability, but our working approach to date is that when a bug is identified it is repaired with auditing in git – the changes are identifiable and an explanation should be present in the log and NEWS. “Stability” is affected, but the archiving practices imply that any given environment should be reproducible with available resources. Are there changes that have been problematic for you, and should a different practice be considered?

Vince Carey (15:38:18): > If an email approach is to be taken, I’d suggest that bioc-devel be the target? The readership is presumably committed to thinking about the ecosystem as a whole and would have some chance of identifying pitfalls?

Michael Lawrence (16:19:20) (in thread): > It’s ready for general usage. We are still making improvements to it, but I haven’t encountered any major bugs. Please try it out.

2024-08-16

Shian Su (02:53:07): > Are there any thoughts on integrating pkgdown documentation into BioC? There are a few tidyomics packages now with very nice pkgdown documentation, but in order to find them you’d need to click through to the github page first. I wonder if there’s some nice way to serve the pkgdown documentation directly on the BioC package landing page.

Lluís Revilla (05:00:10) (in thread): > You can have more than one URL in the DESCRIPTION’s URL field, with a link to the html pkgdown webpage. Some packages do this for several releases.
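A sketch of what that can look like in a DESCRIPTION file (hypothetical owner/package names; the URL field takes a comma-separated list):

```
URL: https://github.com/owner/pkgname,
    https://owner.github.io/pkgname/
BugReports: https://github.com/owner/pkgname/issues
```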

Lluís Revilla (05:01:29) (in thread): > Great! Thanks for the work on it!

Vince Carey (09:59:15) (in thread): > There have definitely been thoughts about this. I would agree that we want the surfacing of advanced documentation by developers, including interactive resources and sites, to work well for bioconductor contributing developers. If you would like to experiment with modifications to the landing page design/functionality (or other aspects of bioc’s web presence via the main site), the bioconductor.org repo is open under the bioc github organization. I was able to get a local instance running using the README doc. Landing-page generation is part of the BBS, also open.

Lluís Revilla (10:08:01) (in thread): > There is also the effort of CRAN (and Bioconductor) to provide the manual pages via html using pkg2HTML (@Vince Carey I hope this can progress further). > To make it easier to find those repos: https://github.com/Bioconductor/bioconductor.org and https://github.com/Bioconductor/BBS

Vince Carey (11:42:05) (in thread): > Thanks for the reminder @Lluís Revilla… I was going to ask Kurt what the next step is in this process. Do you happen to know?

Lluís Revilla (11:50:17) (in thread): > During useR! 2024 we worked on some accessibility issues found in manual pages generated by pkg2HTML. I am not sure if those were fixed in r-devel (there were also two issues that were harder to decide). Besides publishing the meta.rds, I am not sure what the next steps are if pkg2HTML is not fixed. Perhaps the next step for both CRAN and Bioconductor packages should be to fix all references to documentation, as this alone will take time

2024-08-17

Lluís Revilla (14:24:54): > ANACONDA is enforcing the policy and restricting its usage (which limits it for learning/teaching): https://www.theregister.com/2024/08/08/anaconda_puts_the_squeeze_on This might impact basilisk and related packages…

Vince Carey (14:48:52) (in thread): > right, modifications to basilisk have been made to avoid use of the default channels. If you see risks of inadvertent use of anaconda, please let us know

2024-08-19

Shian Su (00:05:03) (in thread): > Thanks for the responses. Adding it to URL would be one solution, but preferably it would appear under the “Documentation” section if available, rather than in the URL section. I don’t know if there’s a meaningful way to automatically detect pkgdown pages from the URL field; they should mostly have the format https://username.github.io/Package/, so one way would be to move any such field to the Documentation section as “pkgdown site” or something to that effect, though it does not feel like a robust solution.

Vince Carey (08:47:59) (in thread): > I agree that it should show in Documentation. It will take a) a guideline for developers who want to do it, so that we have uniform information in DESCRIPTION to support the discovery and linking, and b) a modification to landing-page generation to use the information. Let’s assume the guideline takes the form of PkgdownURL: https://[maintainername].github.io… are there github API calls that could be used to find candidate URLs for this field, filling maintainer name from data available in BiocPkgTools?

Rema Gesaka (09:37:21): > @Rema Gesaka has joined the channel

Tim Triche (11:21:46) (in thread): > biocthis has templates for this, e.g. https://trichelab.github.io/FileSetExperiment/ - Attachment (trichelab.github.io): A rectangular FileSet-backed object with rowData and colData > A FileSetExperiment is a SummarizedExperiment backed by FileSets.

Tim Triche (11:22:14) (in thread): > Mildly fussy to get the first run but afterwards awesome. Thanks @Leonardo Collado Torres!

Lluís Revilla (11:38:07) (in thread): > Some maintainers host the pkgdown website on their own domain: https://biocor.llrs.dev/. Should these developers be forced to move the website to the github domain? I don’t think that would be a wise move. To check whether existing URLs are pkgdown websites, one could check whether the link points to a page with some class starting with “pkgdown”; the HTML pages have classes like “pkgdown-footer-right” and “pkgdown-footer-left” (I am not sure if this can be hidden/changed, probably not easily). This could provide enough proof that the website is indeed pkgdown. > However, this doesn’t check that the rendered version matches the version on Bioconductor; that would need an additional check. One can also have a dev version and a released version in pkgdown, which is something to keep in mind too.
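A minimal sketch of the class-based heuristic described above, using xml2; the helper name `is_pkgdown_site` is hypothetical, and the `pkgdown-` class prefix is the one mentioned in the thread:

```r
## Hypothetical helper: guess whether a URL serves a pkgdown site by
## looking for CSS classes starting with "pkgdown" in the rendered HTML.
is_pkgdown_site <- function(url) {
  html <- tryCatch(xml2::read_html(url), error = function(e) NULL)
  if (is.null(html)) return(NA)  # unreachable or not HTML
  nodes <- xml2::xml_find_all(html, "//*[@class]")
  classes <- unlist(strsplit(xml2::xml_attr(nodes, "class"), "\\s+"))
  any(startsWith(classes, "pkgdown"))
}

is_pkgdown_site("https://biocor.llrs.dev/")
```

As noted above, this only establishes that the page is pkgdown-generated; it says nothing about whether the rendered docs match the Bioconductor release.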

Vince Carey (11:47:18) (in thread): > Thanks @Lluís Revilla… very informative. For now let’s make it voluntary via the URL field and see how that works; we can gather data on adoption and discuss how to surface it better.

2024-08-21

Ludwig Lautenbacher (12:30:23): > Hi all, I’m in a bit of a dilemma at the moment. I implemented my package using setRefClass. My package reviewer now mentioned that S4 classes are preferred over R6, asking me to give a reason why an R6 class is required. Coming from a background of mostly Python, I found the structure enabled by setRefClass a lot easier to understand and work with. I could attempt to rewrite this as an S4 class, but the benefit is not clear to me. Can somebody explain to me why S4 classes are preferred? The documentation only says that they are, without explaining why. - Attachment (contributions.bioconductor.org): Chapter 15 R code | Bioconductor Packages: Development, Maintenance, and Peer Review > Everyone has their own coding style and formats. There are however some best practice guidelines that Bioconductor reviewers will look for. can be a robust, fast and efficient programming language…

2024-08-22

Kevin Rue-Albrecht (03:50:47) (in thread): > That’s an excellent question. > * I’ve got a (distant) background of Python/Java/other and also see the value in those other implementations of OOP. > * Having written a bunch of Bioconductor packages myself, I just got used to S4 (when R6 didn’t exist or wasn’t as popular as it is now) and never bothered trying to move over. > * I believe that you have a very valid discussion point. I agree that it would be more helpful if the Bioconductor documentation/guidance gave a bit of context/motivation for preferring S4 over other implementations. > * I can see that the guidance you linked states that S4 is preferred over S3, but I do not see any mention of R6. The reviewer may have extrapolated here. The guidance may need to be updated to address R6. > * I hope you’ll find that Bioconductor is a very open-minded community. The guidance is a live document which is regularly updated. I’ve mostly contributed to the Shiny section based on my own experience, and I’m very open to others contributing their own perspective to the discussion. One of the key objectives of the Bioconductor project is interoperability, which is more easily achieved when everyone follows the same guidelines. Once guidelines are established, the challenge moves to striking a balance between backward-compatibility and keeping up-to-date with new technology

Lluís Revilla (04:25:21) (in thread): > With S4 you get validity checks for objects, and methods to update objects if the class definition changes. You can implement the same in S3 or R6, but they aren’t automatically enforced as in S4. However, there was recently a suggestion to use S7, a different OOP system that has just been released. It mixes ideas from several other paradigms; maybe you’ll find that one easier.
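A minimal illustration of the automatic validity enforcement mentioned above (the `Peptide` class and its regex are made up for this sketch):

```r
library(methods)

## Hypothetical S4 class: the validity function is enforced automatically
## by new(), and by validObject() after slot modifications.
setClass("Peptide",
  slots = c(sequence = "character"),
  validity = function(object) {
    if (!grepl("^[ACDEFGHIKLMNPQRSTVWY]+$", object@sequence))
      "'sequence' contains a non-amino-acid letter"
    else TRUE
  })

p   <- new("Peptide", sequence = "ACDK")                   # passes validity
bad <- try(new("Peptide", sequence = "AJZ"), silent = TRUE) # J/Z: rejected
```

In S3 or R6 the same check would have to be called explicitly at every construction and mutation site, which is the enforcement gap described above.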

Laurent Gatto (05:23:46) (in thread): > Another possible element is that in R, we are used to a pass-by-value semantic. Pass-by-reference isn’t a concept that many R users are familiar with, and could lead to unexpected changes to an object. However, if there are reasons to use references internally and add a pass-by-value interface to avoid such a confusion, I don’t see any issue (and there are already such examples in Bioconductor).
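A small sketch of the pattern described above: keep reference semantics internal and expose a copy-then-modify, pass-by-value interface (the `Counter` class is just for illustration):

```r
library(methods)

## Reference class with mutable state (base R's setRefClass)
Counter <- setRefClass("Counter",
  fields  = list(n = "numeric"),
  methods = list(bump = function() n <<- n + 1))

## Pass-by-value facade: callers never observe mutation of their object
increment <- function(counter) {
  out <- counter$copy()  # deep copy, so the caller's object is untouched
  out$bump()
  out
}

a <- Counter$new(n = 0)
b <- increment(a)
a$n  # unchanged
b$n  # incremented
```

This preserves the internal convenience of reference semantics while matching the copy-on-modify behaviour R users expect.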

Vince Carey (06:48:00) (in thread): > code.bioconductor.orgcan be used to identify existing uses of setRefClass. I would agree that the guidelines should include material on choices to be made in this domain.

Michael Lawrence (14:21:46) (in thread): > Is there a link to the submission issue? Some context would help. Point of clarification: R6 and setRefClass() are different things. setRefClass() is actually a feature of S4 that enables mutable objects that interact via message passing. The feature was colloquially known as R5. R6 was developed by Posit, largely because S4 reference classes were not meeting the performance requirements of Shiny.

2024-08-28

Ludwig Lautenbacher (08:08:04) (in thread): > Thanks everyone for the feedback! @Laurent Gatto that is valuable feedback! Adding an interface that takes both the model and the inputs as arguments and just calls the method is easy enough to do. @Michael Lawrence here is the issue: https://github.com/Bioconductor/Contributions/issues/3392. That explains why you couldn’t find setRefClass in the R6 documentation. - Attachment: #3392 KoinaR - R package to interface with Koina web service
> • Repository: https://github.com/wilhelm-lab/koinar

Ludwig Lautenbacher (11:37:58): > Hi all, I have another (unrelated) question. My package creates an SSL connection (it’s a client for a web server) in both the vignette and the tests. In the past this was no issue and both the vignette & tests passed. > Now when I push to Bioconductor, the build pipeline has failed three times in a row with an SSL connection timeout. Locally it works without issue. Was there a relatively recent update to the build pipeline that prevents connections? > And is there a way to get more detailed information from the build pipeline?

Lluís Revilla (12:10:12) (in thread): > In general it is best if you can check and build your package without needing an internet connection. Machines like CRAN’s and Bioconductor’s that make many requests might be flagged and blocked. Here are some resources that can come in handy: https://books.ropensci.org/http-testing/
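For instance, with httptest (covered in the rOpenSci book linked above) one can record real responses once and replay them offline in tests; the sketch below assumes a hypothetical package wrapper `get_predictions()`:

```r
library(testthat)
library(httptest)

## Record fixtures once against the live service, e.g. inside
## capture_requests({ get_predictions("PEPTIDE") }); afterwards
## with_mock_api() replays them with no network access at all.
with_mock_api({
  test_that("predictions are parsed from the recorded response", {
    res <- get_predictions("PEPTIDE")  # hypothetical package function
    expect_s3_class(res, "data.frame")
  })
})
```

This keeps the build machines from hitting the web service at all, avoiding both timeouts and rate-limit blocks.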

2024-08-29

Ludwig Lautenbacher (04:09:17) (in thread): > Thank you for the guide! I was hoping to get away without mocking an API, but httptest seems to make it rather easy. I will give it a try!

2024-09-06

Lori Shepherd (07:12:24): > The 3.20 release schedule has been announced. Please be aware of important deadlines:https://bioconductor.org/developers/release-schedule/ - Attachment (bioconductor.org): Bioconductor - Release: Schedule > The Bioconductor project aims to develop and share open source software for precise and repeatable analysis of biological data. We foster an inclusive and collaborative community of developers and data scientists.

Lori Shepherd (07:13:23) (in thread): > On our linux builders (nebbiolo1/nebbiolo2) we have started to see SSL timeout issues from a number of different sources. We are currently investigating the issue.

2024-09-07

Michael Lawrence (13:28:01): > Preview of a to-be-released interface to LLMs: > > library(wizrd) > get_mean <- function(name) mean(get(name)) > var <- 1:3 > llama3() |> equip(get_mean) |> predict("What is the mean of var?") > # [1] "The mean of \"var\" is 2." >

Tyler Sagendorf (14:34:10): > @Tyler Sagendorf has joined the channel

2024-09-10

Alex Qin (03:46:00): > @Alex Qin has joined the channel

Philippa Doherty (09:56:56): > @Philippa Doherty has joined the channel

Robert Castelo (13:05:17): > Hi, due to vulnerabilities such as DDoS attacks through ssh on port 22, organizations are starting to block the use of this port (inbound and outbound), which makes it impossible to, e.g., clone a GitHub repo from git@github.com or git@git.bioconductor.org. This is not yet the case at my own university, but I was today at another one which is doing it, so I’m afraid sooner or later mine may block it too. For GitHub I found a workaround in the first answer here, which consists of adding the following instructions to .ssh/config: > > Host github.com > Hostname ssh.github.com > Port 443 > > Would it be possible to have such a workaround for git.bioconductor.org? (I tried, but it didn’t work) > > Thanks!! - Attachment (Stack Overflow): How to fix “ssh: connect to host github.com port 22: Connection timed out” for git push/pull/… commands? > I am under a proxy and I am pushing in to git successfully for quite a while. > Now I am not able to push into git all of a sudden. > I have set the RSA key and the proxy and double checked them, with no

2024-09-11

Davide Risso (05:02:39) (in thread): > @Tim Triche did you get around to doing this? A bunch of us are thinking about spatial data containers and we were wondering if S7 is the way to go for this new class…

Shila Ghazanfar (05:06:47): > @Shila Ghazanfar has joined the channel

Alex Mahmoud (11:24:49) (in thread): > I think it’s doable, we’d just need to proxy requests from that port to what gitolite expects. However, we’re looking into migrating away from a self-hosted git server anyway, so unless urgent, might be worth just trying to implement it on the new stack rather than with gitolite cc@Vince Carey@Lori Shepherd

Michael Lawrence (13:14:14) (in thread): > I think it would be worth forming a working group that attempts to reimplement some of the core Bioc classes in S7. It would likely bring a lot of learnings to the S7 project, while also getting a head start on a transition, if we decide to go that way.

Tim Triche (13:23:07) (in thread): > Still working on my PoC @Davide Risso but agree with @Michael Lawrence

Michael Lawrence (13:24:48) (in thread): > Anyone know the process for initiating such a group?

Tim Triche (13:59:50) (in thread): > I would join, I’m having a hell of a time gluing other packages’ genetics together and re-exporting to S7 methods

Tim Triche (14:00:19) (in thread): > s/genetics/generics/;

2024-09-12

Davide Risso (03:04:34) (in thread): > There is a BiocClasses working group (https://github.com/Bioconductor/BiocClassesWorkingGroup). Probably different focus, but it may be worth coordinating with them

Robert Castelo (03:31:07) (in thread): > Hi Alex, right now, at least for myself, it’s not urgent, but I’d definitely consider implementing it in the new stack.

Tim Triche (10:53:24) (in thread): > Wow it’s been a while since that was active

2024-09-16

Ludwig Lautenbacher (05:51:28): > Hi all, I have a minor issue with the build pipeline. My latest commit triggered the build, but I never got the usual status comment, even though the commit was three days ago. > Can I check the status of such a case? Or should I retrigger the pipeline with a new commit?

Robert Castelo (06:05:15) (in thread): > Did you bump the version in theDESCRIPTIONfile? (in general, builds are only triggered when the minor version increases)

Ludwig Lautenbacher (07:08:33) (in thread): > I think I did; I got a comment stating that the build was triggered: Received a valid push on git.bioconductor.org; starting a build for commit id: bfbad87113e540ae0292e3f622396af801921899. Would I get the build start comment if I hadn’t bumped the version?

Robert Castelo (07:53:52) (in thread): > This I don’t know; I just mentioned the version bumping because it’s a common misstep, especially among developers contributing their first package to Bioconductor. If you bumped the version, then you’ll need to wait for an answer from somebody who knows the internals of the Bioconductor build system (cc: @Lori Shepherd)

Lori Shepherd (07:56:02) (in thread): > I’m sorry for the inconvenience. We have experienced some issues with our builders, including the one that is tied to the new package submission process. It is associated with the updates announced here: https://stat.ethz.ch/pipermail/bioc-devel/2024-September/020583.html We are working on getting it back online as soon as possible

Ludwig Lautenbacher (08:00:33) (in thread): > Thank you for letting me know!

2024-09-17

Lori Shepherd (11:36:19) (in thread): > We have linked the single package builder back up. It should behave as normal now and give back build reports. Please let us know if you experience any further issues

2024-09-18

Aaron Lun (13:22:28): > the section on basilisk in the package submission guidelines is strange. it advises developers to use basilisk instead of reticulate, but those two are not mutually exclusive; basilisk uses reticulate under the hood anyway. > > a better text would be something like: > > Bioconductor packages may interface with Python, e.g., via the reticulate package. However, the Bioconductor package should not require manual installation of Python packages by end-users. If the Bioconductor package relies on the creation of specific Python environments, this process should be automated - typically using the basilisk package - so that no further action is required by the end-user after R package installation.
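For reference, the basilisk pattern that automates the environment creation described above looks roughly like this; the package name, environment name, and pinned versions are placeholders, not a real package:

```r
library(basilisk)

## Declared once at package level: a fully pinned conda environment
## that basilisk provisions automatically on first use.
my_env <- BasiliskEnvironment(
  envname  = "my_env",
  pkgname  = "MyPackage",  # placeholder package name
  packages = c("numpy==1.26.4", "pandas==2.2.2")
)

## basiliskRun() activates the environment transparently, so end-users
## never install Python packages themselves after R package installation.
run_numpy_mean <- function(x) {
  basiliskRun(env = my_env, fun = function(x) {
    np <- reticulate::import("numpy")
    np$mean(x)
  }, x = x)
}
```

Note that reticulate still does the actual Python bridging here; basilisk only manages and isolates the environment, which is exactly why the two are complementary rather than alternatives.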

2024-09-19

Jeroen Ooms (19:02:37): > I ran into a package which references a non-existing git URL for bug reports: https://github.com/bioc/CONSTANd/blob/devel/DESCRIPTION#L18 Does this not get flagged in BiocCheck?

2024-09-20

Marcel Ramos Pérez (09:39:17) (in thread): > It would be great to re-use that functionality from R CMD check --as-cran, but no, it doesn’t get flagged in BiocCheck

Dirk Eddelbuettel (12:42:21) (in thread): > Would the add-on package urlchecker catch this? Asking mostly because I am not entirely sure exactly where it scans for URLs besides README{,.md} and Rd files. - Attachment (cran.r-project.org): urlchecker: Run CRAN URL Checks from Older R Versions > Provide the URL checking tools available in R 4.1+ as a package for earlier versions of R. Also uses concurrent requests so can be much faster than the serial versions.
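Trying it on the case above is a one-liner; the path is illustrative (a local checkout of the package source):

```r
## Run the same URL checks that R-devel's R CMD check performs,
## from any R version; broken or redirected URLs are reported
## together with the file they appear in.
urlchecker::url_check("path/to/CONSTANd")
```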

2024-09-22

Almog Angel (03:33:31): > Hey developers. I ran into an issue with a package dependency hosted on R-Forge—the pracma package. Even though I correctly listed pracma under “Imports” in the DESCRIPTION file, my package is unable to load it. Here’s the error I encountered: > > Error in pracma::lsqlincon(spillMat[rows, rows], x, lb = 0) : > cannot open file '/Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/library/quadprog/R/quadprog.rdb': No such file or directory > > It seems like pracma is failing due to a missing dependency, specifically quadprog. Any advice on how to ensure pracma and its dependencies load correctly from R-Forge?

Dirk Eddelbuettel (16:58:53) (in thread): > Bothpracmaandquadprogare standard CRAN packages. Can you install them the usual way viainstall.packages()? You should get CRAN-made binaries for R 4.4.* on macOS for arm64.

2024-09-23

Almog Angel (02:03:02) (in thread): > Seems like I’m facing issues with indirect dependencies in my package. DuringR CMD check, errors are triggered because owlready2 (a Python package) and quadprog (an R package) are required by other packages my package depends on (pracma and ontoProc). I’m unsure how to handle these dependencies without listing them directly in my DESCRIPTION file, as they aren’t used directly by my package.

Lori Shepherd (06:58:10) (in thread): > What is the name of your package? We can look at the install log to see why those packages did not install correctly

Almog Angel (07:31:16) (in thread): > xCell2

Lori Shepherd (07:41:13) (in thread): > Per the message about quadprog: if you are using the functionality that requires that package, then it will need to be listed in your DESCRIPTION, because pracma doesn’t have a hard dependency on it (it is only in Suggests for optional features). @Vince Carey / @Andres Wokaty any thoughts on owlready2? Shouldn’t this be handled in the ontoProc package, with no additional set-up in this package?

Almog Angel (07:46:39) (in thread): > Thanks @Lori Shepherd. Regarding ontoProc, since it’s a Bioconductor package, I think it would be easier to solve this dependency problem there. I also posted an issue on their GitHub repository.

Andres Wokaty (10:49:56) (in thread): > reticulate should be handling owlready2. I will look at the error reported on nebbiolo2 and teran2.

Vince Carey (11:17:45) (in thread): > owlready2 will be handled entirely through basilisk in ontoProc; moving in that direction presently. You should not need to do anything in BBS.

2024-09-25

Hechen Li (17:54:29): > @Hechen Li has joined the channel

Hechen Li (18:08:45): > Hi everyone, recently I updated the vignettes in my package, using %\VignetteIndexEntry to specify custom titles. However, on the landing page, one of the vignettes is not displayed with the specified title; the file name is shown instead. I checked a lot but couldn’t find out why. The other three vignettes don’t have this issue. Does anyone have a clue? > > Landing page: https://bioconductor.org/packages/devel/bioc/html/scMultiSim.html, “spatialCCI.html” should be “Simulating Spatial Cell-Cell Interactions” > > Code: https://code.bioconductor.org/browse/scMultiSim/tree/devel/vignettes/ - Attachment (Bioconductor): scMultiSim (development version) > scMultiSim simulates paired single cell RNA-seq, single cell ATAC-seq and RNA velocity data, while incorporating mechanisms of gene regulatory networks, chromatin accessibility and cell-cell interactions. It allows users to tune various parameters controlling the amount of each biological factor, variation of gene-expression levels, the influence of chromatin accessibility on RNA sequence data, and so on. It can be used to benchmark various computational methods for single cell multi-omics data, and to assist in experimental design of wet-lab experiments. - Attachment (code.bioconductor.org): Bioconductor Code: scMultiSim > Browse the content of Bioconductor software packages.

Marcel Ramos Pérez (18:12:26) (in thread): > just a wild guess but could it have something to do with a lowercase file extension? spatialCCI.rmd :thinking_face: cc: @Lori Shepherd

2024-09-26

Louis Le Nézet (07:06:48): > Hi everyone, > I’m trying to test shiny modules using shinytest2. > To do so, I’m using a data set available in the package. > I’m following the Bioconductor recommendation by doing the following: > > data_env <- new.env(parent = emptyenv()) > utils::data("sampleped", envir = data_env, package = "Pedixplorer") > pedi <- shiny::reactive({ > ped1 <- Pedigree(data_env[["sampleped"]]) > ped1[famid(ped(ped1)) == "1"] > }) > > Unfortunately this doesn’t seem to work when launching test_check(): > > Failed to locate globals in server function. > i This may be due to non-standard evaluation or other dynamic code. The app may not work as expected. > Caused by error in `globalsByName()`: > ! Identified global objects via static code inspection (structure(function (); {; .dependents$register(); if (.invalidated || .running) {; ..stacktraceoff..(self$.updateValue()); }; ...; ped1 <- Pedigree(data_env[["sampleped"]]); ped1[famid(ped(ped1)) == "1"]; }), class = c("reactiveExpr", "reactive", "function"))). Failed to locate global object in the relevant environments: 'data_env' > > Do you know how to fix this?

Lori Shepherd (07:19:04) (in thread): > @Marcel Ramos Pérez / @Hechen Li The package landing pages are built off the VIEWS page. It is currently not being picked up there, per https://www.bioconductor.org/packages/devel/bioc/VIEWS so there might be a bug in biocViews or in how the VIEWS is generated @Andres Wokaty

Lambda Moses (13:03:04): > I got this comment from users several times. They save a SpatialFeatureExperiment object that has DelayedArrays to RDS and then find that the DelayedArrays no longer work when the RDS is loaded on a different computer. Reading 10X data from h5 gives DelayedArray for assays. I have done it myself and it took me a while to realize. It’s actually an issue with any SummarizedExperiment that has DelayedArray assays; if I don’t check, I can’t readily tell whether the assays are DelayedArrays or good old dgCMatrix. Do you think it’s a good idea to write a saveRDS method for SummarizedExperiment which gives a warning if the SummarizedExperiment object contains DelayedArray? I know that there’s saveHDF5SummarizedExperiment but saving RDS is just too tempting.
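One way to implement such a guard (a sketch only, not the eventual BiocGenerics solution; `save_se_rds` and `has_delayed_assays` are hypothetical names): scan the assays for DelayedArray data before serializing.

```r
library(SummarizedExperiment)

## Does any assay hold out-of-memory DelayedArray data?
has_delayed_assays <- function(se) {
  any(vapply(assays(se), function(a) is(a, "DelayedArray"), logical(1)))
}

## Sketch of a guarded saveRDS(): warn that the RDS will not be portable
## and point users to the HDF5-aware saver instead.
save_se_rds <- function(se, file) {
  if (has_delayed_assays(se))
    warning("assays contain DelayedArray data; the RDS file will not be ",
            "portable. Consider HDF5Array::saveHDF5SummarizedExperiment().")
  saveRDS(se, file)
}
```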

Marcel Ramos Pérez (13:31:48) (in thread): > Right, saving to RDS is natural but in this case it breaks the link to the internal seed. Perhaps @Hervé Pagès can weigh in, but I think that saveRDS is meant to work on any R object that exists in RAM and not on-disk. There isn’t a generic for saveRDS since it is a plain function AFAICT. Perhaps the SummarizedExperiment show method should have an indication of the internal assay class?

Marcel Ramos Pérez (13:32:27) (in thread): > How are you running that bit of code? I can’t reproduce with R CMD check

Lambda Moses (13:33:25) (in thread): > The terra package for raster geospatial data does have an S4 method for saveRDS. What it does is define a generic if one isn’t defined already.

Lambda Moses (13:33:29) (in thread): > https://rspatial.github.io/terra/reference/serialize.html - Attachment (rspatial.github.io): saveRDS and serialize for SpatVector and SpatRaster* — serialize > serialize and saveRDS for SpatVector, SpatRaster, SpatRasterDataset and SpatRasterCollection. Note that these objects will first be “packed” with wrap, and after unserialize/readRDS they need to be unpacked with rast or vect. > Extensive use of these functions is not recommended. Especially for SpatRaster it is generally much more efficient to use writeRaster and write, e.g., a GTiff file.

Lambda Moses (13:34:28) (in thread): > https://github.com/rspatial/terra/blob/e62c73372b008d0ef58957319a1a5a7286d7e338/R/Agenerics.R#L76

Marcel Ramos Pérez (13:46:01) (in thread): > I was able to reproduce with devtools::test(). I’d probably avoid running tests with that and use R CMD check instead. Somehow it seems that devtools::test() is changing the calling environment.

Hervé Pagès (14:37:20) (in thread): > Yes, we could define a saveRDS() generic in BiocGenerics and have a saveRDS() method for SummarizedExperiment derivatives that throws an error if the object contains out-of-memory assays. The error message would point the user to saveHDF5SummarizedExperiment(). I’ve created 2 issues for that: > * https://github.com/Bioconductor/BiocGenerics/issues/18 > * https://github.com/Bioconductor/SummarizedExperiment/issues/83 - Attachment: #18 Add saveRDS() generic - Attachment: #83 Add saveRDS() method for SummarizedExperiment derivatives

Kasper D. Hansen (14:46:22) (in thread): > I think this is a good solution to a common problem

Lambda Moses (15:30:25) (in thread): > Awesome, thank you!

Hervé Pagès (22:31:04) (in thread): > Done. Now you get: > > saveRDS(se, "se.rds") > # Error in saveRDS(se, "se.rds") : > # SummarizedExperiment object contains out-of-memory data so cannot be > # serialized reliably. Please use saveHDF5SummarizedExperiment() from the > # HDF5Array package instead. Also see '?containsOutOfMemoryData' in the > # BiocGenerics package for some context. >

2024-09-27

Hechen Li (17:32:06) (in thread): > Thanks for the suggestion! It was fixed after changing the file extension to strictly “Rmd”. I did change that before, but I didn’t notice that my git configuration was case-insensitive.

2024-09-29

Louis Le Nézet (11:50:49) (in thread): > Thanks for the answer !

Louis Le Nézet (11:56:50): > Hi, > I’m trying to make unit tests reproducible across windows and linux. > The issue is that the default device doesn’t have the same parameters, and the size of the text changes just a little, but enough to throw off the unit tests. > I tried to use: > > dev.new(width = 10, height = 10, > pin = c(10, 10), din = c(10, 10), fin = c(10, 10), > mai = c(0, 0, 0, 0), omi = c(0, 0, 0, 0), > cin = c(0.10, 0.10), cex = 1, mkh = 0.001, new = TRUE, > bg = "white", family = "HersheySans", > usr = c(0, 1, 0, 1), xaxp = c(0, 1, 5), yaxp = c(0, 1, 5), > fig = c(0, 1, 0, 1), mar = c(1, 1, 1, 1) > ) > > to fix the different parameters, but the results keep being different. > Has anyone succeeded in making reliable graphical tests?

Louis Le Nézet (12:00:07) (in thread): > Even after using this function, the par() values differ between linux and windows: > > linux windows > $cra 14.400000 19.250526 10.8 14.4 > $cxy 0.017143674 0.024526239 0.01685092 0.02409176 > $din 9.9895833 9.9945319 10.14159 10.14159 > $fin 9.9895833 9.9945319 10.14159 10.14159 > $pin 8.7495833 8.1545319 8.901593 8.301593 > $plt 0.082085506 0.957956204 0.102055805 0.917955137 0.08085515 0.95858639 0.10057592 0.91914485 >

Louis Le Nézet (12:45:35) (in thread): > I’ve tried to change it by adding: > > par_lst <- list( > "pin" = c(8, 8), "cex" = 1, "mai" = c(1, 1, 1, 1), > "fin" = c(6, 6), "bg" = "white", "family" = "HersheySans", > "usr" = c(0, 1, 0, 1), xaxp = c(0, 1, 5), yaxp = c(0, 1, 5), > "fig" = c(0, 1, 0, 1), "mar" = c(1, 1, 1, 1), xpd = TRUE > ) > R.devices::devNew("pdf", width = 10, height = 10, par = par_lst) > plot.new() > > to the testthat.R file, and this does seem to help.
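An alternative that sidesteps device differences altogether is vdiffr, which renders plots through svglite so snapshots compare the same way on every platform; the plot object in this sketch is a placeholder:

```r
library(testthat)
library(vdiffr)

test_that("pedigree plot is stable across platforms", {
  ## vdiffr renders the figure via svglite, so device defaults and
  ## font metrics do not vary between Windows and Linux runners.
  ## `pedigree_object` is a placeholder for the object under test.
  expect_doppelganger("basic-pedigree", function() plot(pedigree_object))
})
```

On first run the SVG snapshot is recorded; later runs fail only if the rendered SVG changes, which makes the comparison deterministic across OSes.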

2024-09-30

Kasper D. Hansen (09:22:03): > Technically, there is something harder about cross-platform tests, where you compute a reference object on one platform and test against it across platforms. When I think about this (and I have had this experience in a different context), it tests both reproducibility and differences across platforms. That is a much harder requirement.

Kasper D. Hansen (09:22:37): > For numerical accuracy, this also tests whether the operational precision (the precision you use to compare across platforms) is too high.

2024-10-01

Louis Le Nézet (03:27:31): > It is indeed difficult. I’ve decided to relax my requirements a bit by adding a rounding function (i.e. signif) with a precision parameter that I can reduce for the tests. > With that, the tests pass on windows and linux.

Kasper D. Hansen (08:55:21): > If you’re doing numerical computations, thinking about accuracy is really important. I have had a lot of surprises when tracking tests across operating systems
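In base R the difference between exact and tolerance-based comparison is easy to demonstrate:

```r
x <- 0.1 + 0.2

x == 0.3                   # FALSE: binary doubles cannot represent 0.3 exactly
isTRUE(all.equal(x, 0.3))  # TRUE: all.equal() compares within ~1.5e-8
abs(x - 0.3) < 1e-12       # TRUE: an explicit tolerance also works
```

The `signif()` approach above and `all.equal()`'s `tolerance` argument are two ways of choosing that operational precision; the key point is making it loose enough to absorb cross-platform floating-point drift but tight enough to still catch real regressions.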

Louis Le Nézet (16:14:05): > After fixing this error I’ve finally achieved a stable package. > Everything was working nicely with unit tests for shiny modules using shinytest2. > However, I’ve just changed the github action to run with R 4.4 instead of 4.3, and the following error appears: > > {shinytest2} R info 21:15:49.14 Error while initializing AppDriver: > Cannot find shiny port number. Error lines found: > Loading required package: shiny > {shiny} R stderr ----------- Loading required package: shiny > > I’ve succeeded in reproducing it in two separate conda environments that differ only by the R version. The shiny and shinytest2 packages are the same versions. > This error only happens on linux when I’m using a module of my package with covr::package_coverage(). > It does not appear on windows, nor when using devtools::test, devtools::check or R CMD check. > I’ve tried to narrow it down by simplifying the testthat.R. It seems that adding install_path somehow loads the shiny library. > On another note, R 4.4 takes much longer to run (48.88 s) than R 4.3 (20.60 s). > I would like to keep covr::package_coverage() as I’m using it with GitHub Actions for the codecov badge. > Does anyone have a solution?

2024-10-02

masembe brian (01:36:35): > @masembe brian has joined the channel

Vince Carey (05:22:30) (in thread): > do you want to share the URL for the github repo where one can find the code to reproduce this?

Louis Le Nézet (05:49:52) (in thread): > It’s available here: https://github.com/LouisLeNezet/Pedixplorer/pull/3 For the moment it works with R 4.3, but whenever I change it to R 4.4 it crashes. - Attachment: #3 Bump r > Bump to R version 4.4

Louis Le Nézet (06:31:31) (in thread): > I’ve tried to narrow it down further by using two identical functions: one is declared in testthat.R (works), the other is declared in my package and crashes.

Louis Le Nézet (06:38:01) (in thread): > The only difference between the two in the output is: <environment: namespace:Pedixplorer>. Maybe the shiny library isn’t attached to it?

Louis Le Nézet (11:17:33) (in thread): > In the end I just needed to increase the load_timeout, as R 4.4 takes longer to initialize (why, I don’t know). > But it now works!

Davide Risso (16:19:50) (in thread): > Cc@Dario Righelli

Eva Hamrud (19:05:33): > @Eva Hamrud has joined the channel

2024-10-04

Monisa Hasnain (02:04:51): > @Monisa Hasnain has joined the channel

2024-10-05

Ebele Lynda Okolo (13:28:59): > @Ebele Lynda Okolo has joined the channel

2024-10-06

Federico Marini (10:43:27) (in thread): > @Jared Andrews something not complete (yet) would also be the DeeDee package https://github.com/lea-rothoerl/DeeDee/ --> we are in the process of trying to give the class its best shape (so far)

Jared Andrews (10:48:04) (in thread): > Hmm, I have some similar(ish) stuff in https://github.com/j-andrews7/iBET, similarly unfinished though usable. See the shinyDECorr function; though I didn’t go through the effort of making a class to capture the DE results in a consistent way, I just have it guess based on column names for the usual suspects.

Jared Andrews (10:49:46) (in thread): > Sort of gave it up as I wanted to make a package of shiny modules for lots of base plot types to build off to save time, but just haven’t gotten around to it quite yet.

Federico Marini (10:57:30) (in thread): > Yeah, the thing is indeed that, more than an ad-hoc thing somewhere in another package, it would make more sense to have something with a clear structure - and ideally, well, have our own reports and tools all use it consistently, which is something that should be striven for anyway (think of all the possible flavors one would have in all of this).

Federico Marini (10:58:15) (in thread): > speaking of defining the proper class, we actually wanted to take the whole first concept of DeeDee a bit further, but again - time constraints did not play in our favor

Federico Marini (10:58:59) (in thread): > One of the students in my group will toy a bit with implementation ideas, if you want to exchange thoughts, why not:wink:

Jared Andrews (11:10:49) (in thread): > Yeah, I’d be intrigued. I’ll take a closer look at it this week.

2024-10-08

Vince Carey (07:27:39) (in thread): > The working group exists and @Laurent Gatto is currently the lead. The document (README at https://github.com/Bioconductor/BiocClassesWorkingGroup) should be updated to identify all team members. IMHO transitioning to S7 is within scope. The group would deal with current and new designs in S4 and advantages of S7.

Dario Strbenac (08:00:06): > The release branch hasn’t been built since last week.

Lori Shepherd (08:02:04): > There should be a new one today. See https://bioconductor.org/checkResults/ - release is only built on Tue and Fri

Tim Triche (08:43:02) (in thread): > Thanks Vince!!

Tim Triche (08:44:14) (in thread): > I fired up usethis/biocthis again yesterday to kick more tires, provoked by some junior folks at the Hutch. Is there somewhere I can purchase more hours in the day :grin:

Michael Lawrence (17:31:56) (in thread): > It would be great to reactivate that working group, and get some more people on board. For example, I think @Hervé Pagès would be a key participant if we are seriously considering a transition. Significant problems include breaking serialized instances and the inability to incrementally change, since it is currently not possible to extend an S4 class with an S7 class or vice-versa (at least with all of the features still working). Probably the biggest issue design-wise would be that S7 only supports single inheritance. It would be interesting to consider the use cases of multiple inheritance e.g. within S4Vectors and how S7 might accommodate them. The easiest thing would be moving to S7 generics, as the syntax is cleaner and behavior easier to understand, while leaving the classes alone.

Jorge Kageyama (19:28:44) (in thread): > hello, in our case this change broke some pipelines that counted on being able to save just the in-memory part. I wonder if it would be possible to have a warning instead of being forced to use saveHDF5SummarizedExperiment. The reason is that we, for example, use this in nextflow pipelines, where it is much easier to pass a file, and we had multiple .rds files referencing a single dataset

Hervé Pagès (20:06:40) (in thread): > FWIW multiple inheritance is extensively (but cautiously) used in the S4Vectors/IRanges/GenomicRanges stack and beyond. E.g. the CompressedIRangesList class in the IRanges package extends IRangesList (for the semantics) and CompressedList (for the compressed list representation). This allows CompressedIRangesList to inherit a bunch of methods defined for CompressedList objects in general (e.g. unlist(), lapply(), lengths(), getListElement(), etc…). In other words, the CompressedList class serves as an umbrella for the 25 or so list-like classes defined in IRanges/GenomicRanges that use the compressed list representation, thus avoiding the need to redefine the same unlist() method over and over for each of them.
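[Editor’s note] The contains= pattern Hervé describes can be sketched in a few lines of S4; the class and generic names below (RangesListSketch, CompressedListSketch, unlist2) are invented for illustration and are not the actual IRanges definitions:

```r
library(methods)

## Two virtual parents: one carries the semantics, one the representation
setClass("RangesListSketch", representation("VIRTUAL"))
setClass("CompressedListSketch",
    representation("VIRTUAL", unlistData = "ANY", partitioning = "integer"))

## Multiple inheritance via contains=: the concrete class picks up
## methods defined on either parent
setClass("CompressedRangesListSketch",
    contains = c("RangesListSketch", "CompressedListSketch"))

## A method defined once on the representation parent ...
setGeneric("unlist2", function(x) standardGeneric("unlist2"))
setMethod("unlist2", "CompressedListSketch", function(x) x@unlistData)

## ... is inherited by every compressed subclass, which is the
## "umbrella" effect described above
x <- new("CompressedRangesListSketch",
         unlistData = 1:10, partitioning = c(4L, 10L))
unlist2(x)  # 1:10
```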

Hervé Pagès (20:47:02) (in thread): > Some rough count of the number of classes in Bioconductor that use multiple inheritance: 194 classes in 56 packages. (Count obtained by grep’ing the entire code base for occurrences of contains = c(...) where c(...) has 2 or more elements after removal of "VIRTUAL".)

Hervé Pagès (21:03:43) (in thread): > > Significant problems include breaking serialized instances and the inability to incrementally change > Given that, plus the lack of multiple inheritance, the only realistic path to this transition I can think of is to start from scratch, unfortunately. Start with an S7Vectors package with S7 versions of Vector, List, SimpleList, DataFrame, Rle, Hits, etc… and see how it goes. Import all the unit tests from S4Vectors and make sure they pass on your S7 objects. Then, if you’re satisfied with what you have, go for S7 versions of IRanges, IRangesList, CharacterList, NumericList, etc…, e.g. in a new S7IRanges package or something like that. It’s going to be a bottom-up approach. A HUGE refactoring effort! (No wonder, after we’ve spent 20 years or so learning and mastering S4 to implement hundreds of S4 classes.) Do we have the resources for that? Personally I don’t.

Hervé Pagès (22:11:04) (in thread): > It would be easy to replace the error with a warning. However note that an easy workaround in this case is to call base::saveRDS() instead, which is what I do in saveHDF5SummarizedExperiment() to serialize the in-memory part of the object. See: https://github.com/Bioconductor/HDF5Array/blob/6d4a8f7420471231d72804b5a1ac3585259056b9/R/saveHDF5SummarizedExperiment.R#L120-L123 I could improve the error message to suggest the use of base::saveRDS() though, for users who really know what they’re doing.
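[Editor’s note] For reference, the two serialization routes under discussion look like this (a sketch; `se` is assumed to be a SummarizedExperiment with HDF5-backed assays):

```r
library(HDF5Array)

## Recommended route: writes the assay data to an HDF5 file and the
## in-memory parts to an .rds inside the chosen directory, so the
## result is self-contained and relocatable
saveHDF5SummarizedExperiment(se, dir = "se_store", replace = TRUE)
se2 <- loadHDF5SummarizedExperiment("se_store")

## Escape hatch (what the improved error message points to): serialize
## only the in-memory part; the .rds then still references the existing
## on-disk HDF5 data, so it breaks if that file moves or is deleted
base::saveRDS(se, "se.rds")
```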

Jorge Kageyama (22:27:51) (in thread): > I see, yes, that works!

Hervé Pagès (22:34:22) (in thread): > I edited the error message in SummarizedExperiment 1.35.4. Now you get: > > > saveRDS(cmb3, "cmb3.rds") > Error in saveRDS(cmb3, "cmb3.rds") : > This SummarizedExperiment object contains out-of-memory data so cannot > be serialized reliably. Please use saveHDF5SummarizedExperiment() from > the HDF5Array package instead. Alternatively you can call > base::saveRDS() on it but only if you know what you are doing. > See '?containsOutOfMemoryData' in the BiocGenerics package for more > information. >

2024-10-09

Ludwig Lautenbacher (04:49:50): > Hi all, I got a failure in my package build that is a bit confusing to me. On 2 out of 6 hosts the CHECK step fails because the Spectra package is missing to build my vignette. I think this is because in my DESCRIPTION file I specified Spectrum, not Spectra, as a suggested package. So far this all makes sense to me. The confusing part is why this passed the CI pipeline during the review process and why this doesn’t fail on the other hosts as well. The only reason that I could come up with was that the preinstalled packages for the hosts are different, but if I compare the Installed pkgs for a host where it works and one where it doesn’t, both of them list the Spectra package. > > Can someone explain why it works on some hosts but not on others? And can I fix this by just updating my DESCRIPTION file? > > Here is the build report: https://bioconductor.org/checkResults/devel/bioc-LATEST/koinar/teran2-checksrc.html

Vince Carey (05:03:33): > @Ludwig Lautenbacher the issue is that you use Spectra in your vignette but it is not declared in Suggests. The error message is a little strange but arises from the fact that R CMD check will create an isolated installed package collection based on the declarations in DESCRIPTION and only use those packages. This is dictated by the environment variable *R_CHECK_SUGGESTS_ONLY* which became a default in CRAN checking and in Bioc checking. In Writing R Extensions find > > On most systems, R CMD check can be run with only those packages declared in ‘Depends’ and ‘Imports’ by setting environment variable *R_CHECK_DEPENDS_ONLY*=true, whereas setting *R_CHECK_SUGGESTS_ONLY*=true also allows suggested packages, but not those in ‘Enhances’ nor those not mentioned in the DESCRIPTION file. It is recommended that a package is checked with each of these set, as well as with neither. >
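[Editor’s note] To reproduce this check flavor locally, the variable can be set in the file that R CMD check reads (`~/.R/check.Renviron`). A minimal sketch; note the variable names carry leading and trailing underscores (`_R_CHECK_SUGGESTS_ONLY_`), which Slack’s italics markup has eaten in the messages above:

```
# ~/.R/check.Renviron -- set ONE of these per check run, as
# Writing R Extensions recommends checking with each in turn
_R_CHECK_DEPENDS_ONLY_=true    # only Depends/Imports packages visible
_R_CHECK_SUGGESTS_ONLY_=true   # Suggests also visible; Enhances and
                               # undeclared packages are not
```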

Vince Carey (05:41:50) (in thread): > Thanks @Hervé Pagès. I just watched the video. I gather that the package is now called S7 (the video refers to R7) and it is available on CRAN. @Hervé Pagès’ analysis seems sound to me; perhaps automation could be used to develop aspects of S7Vectors and IRanges7 to allow assessment of impacts. I would assume that contributors of new Bioc packages who want to employ S7 would be free to do so, but not at the expense of violations of the general recommendation on interoperability with existing classes that has been central to the project. I would recommend that those interested in working on S7 in Bioc present to the Tech Advisory Board. A generally open presentation in “developers forum” could also be arranged via zoom as concepts come into focus.

Louis Le Nézet (06:55:46): > Hi! > I’m currently updating the vignettes in my package and I’ve added an interactive plot to one of them. > However this makes the html vignette quite large (i.e. 4.4 MB) and therefore I get the error > > ERROR: Package tarball exceeds the Bioconductor size requirement. > Package Size: 5.7 MB > Size Requirement: 5.00 MB > > Is there a way to keep the vignette as it is, or should I leave the plotly chunk unevaluated?

Mike Smith (09:07:38) (in thread): > To follow up on this, the *R_CHECK_SUGGESTS_ONLY* variable is set in the .Renviron file on the builders. The example file you can find at https://bioconductor.org/checkResults/devel/bioc-LATEST/Renviron.bioc indicates that it’s only set on the Linux builders, hence the success/failure pattern that you’re seeing: > > # - Only the Bioconductor ***Linux*** builders use the above setup at the moment (i.e. all > # packages except base and recommended packages are installed in <R_HOME>/site-library). > # This means that 'R CMD check' can only expose undeclared dependencies on the Bioconductor > # Linux builders. It will NOT expose them on the Bioconductor Windows or Mac builders. >

Vince Carey (12:06:53): > We need to discuss this internally. There may be a way to reduce the volume consumed by an interactive plot. I don’t like the idea of forcing reduced functionality in bioc packages. For now you may have to unevaluate.

Michael Lawrence (12:31:46) (in thread): > Sounds good. We’re going to work on enabling extension of S4 classes with S7 classes, so that new classes can be built in S7. And as I said, new code (or old code) can always use S7 generics, which have a cleaner syntax, IMHO.
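[Editor’s note] A rough side-by-side of the two generic syntaxes, assuming the CRAN S7 package’s documented API; the `shrink`/`shrink7` names are invented for illustration:

```r
library(methods)
library(S7)

## S4: generic and method are two separate top-level calls, with
## dispatch implied by the method signature
setGeneric("shrink", function(x, ...) standardGeneric("shrink"))
setMethod("shrink", "numeric", function(x, ...) x / 2)

## S7: the generic declares its dispatch arguments up front, and
## methods are registered by assignment
shrink7 <- new_generic("shrink7", dispatch_args = "x")
method(shrink7, class_numeric) <- function(x, ...) x / 2

shrink(4)   # 2
shrink7(4)  # 2
```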

Simon Pearce (12:41:50): > Personally I feel that the 5MB limit could be a bit higher

David Rach (14:57:41): > For CRAN/devtools check(), I have quite a few of these “no visible binding for global variable” notes come up. Still a bit confused about their cause; is there a good way to straightforwardly address/resolve them? Thanks for any insight! David - File (PNG): 1000008344.png

Dirk Eddelbuettel (15:02:44) (in thread): > “CRAN giveth and CRAN taketh.” Complaining about the number here does not change things much, maybe start a discussion on r-package-devel instead?

Dirk Eddelbuettel (15:06:17) (in thread): > That happens under non-standard evaluation when the parser triggered by R CMD check (and the like) thinks a global was spotted; it is also common with data.table where we can address columns directly. One fix is to declare the identifiers via utils::globalVariables(), i.e. add a line utils::globalVariables(c("Fluorophore", "AdjustedY", "Detector", "value", ".data", "Backups", "Clusters")) and adjust as needed.
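[Editor’s note] The two common fixes side by side (a sketch; the column names are taken from Dirk’s example and the data frame is made up):

```r
## Option 1: declare the symbols once, package-wide, typically in a
## file like R/globals.R, so the checker stops flagging them
utils::globalVariables(c("Fluorophore", "AdjustedY"))

## Option 2 (for tidyverse-style code): use the .data pronoun so the
## checker can see these are data-mask column lookups, not globals
library(dplyr)
df <- data.frame(Fluorophore = c("FITC", "PE"),
                 AdjustedY   = c(0.4, 0.9))
filter(df, .data$AdjustedY > 0.5)  # keeps the PE row
```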

Kasper D. Hansen (16:27:05) (in thread): > I think@Simon Pearceis complaining about Bioc limits

Kasper D. Hansen (16:27:40) (in thread): > or change your syntax in your code

Dirk Eddelbuettel (16:28:30) (in thread): > Oh, my bad in that case. CRAN also has (somewhat) arbitrary and old (by today’s standards) limits for NOTE.

2024-10-10

Dario Righelli (04:30:56) (in thread): > If you want, I can invite you/you can join (@Michael Lawrence, @Tim Triche) to the #biocclasses channel. I’ve just sent a message (following this discussion) to try to meet at the beginning of November.

Dirk Eddelbuettel (21:15:20) (in thread): > To make this concrete, how do you suggest changing the syntax? A concrete example would be below, and it would warn on mpg and hp and cyl, as the checker in base R ‘cannot know’ these are columns of the data.frame/data.table that is D. - File (PNG): image.png

Kasper D. Hansen (21:19:48) (in thread): > I don’t think this is easy to fix with data.table syntax. But the same errors can also come from tidyverse expressions and those can usually be fixed.

2024-10-14

Erick Navarro (16:36:26): > @Erick Navarro has joined the channel

2024-10-15

Sean Davis (00:31:56): > @Mike SmithI’m in the midst of updating some older book material and wanted to incorporate some of your new styling, particularly around the questions/solutions. I’m using quarto and have working cross-reference environments. What I’d like to do is to use your MSMB callout styles since they look SO professional. Is there code available for the css and for how to use the styles?

Mike Smith (06:34:10) (in thread): > I think I use the Quarto callout utilities documented in https://quarto.org/docs/authoring/callouts.html I have the following in the _quarto.yml to define the titles etc: > > language: > en: > # Tasks (green, unnumbered) > callout-tip-title: "Task" # tip - green > # Questions & Exercises (Questions - yellow, Exrs. - red, numbered) > callout-warning-title: "Question" > crossref-wrn-title: "Question" > crossref-wrn-prefix: "Question" > > callout-important-title: "Exercise" > crossref-imp-title: "Exercise" > crossref-imp-prefix: "Exercise" > # Solutions (blue, unnumbered) > callout-note-title: "Solution" # note - blue > > callout-icon: false >
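[Editor’s note] With that _quarto.yml in place, and assuming Quarto’s cross-referenceable callouts (1.4+), a numbered Question plus an unnumbered Solution in a .qmd would look something like this (the `#wrn-` id prefix ties a callout-warning to the crossref-wrn entries above; the label is invented for illustration):

```
::: {#wrn-normalize .callout-warning}
Which normalization method is appropriate for these counts?
:::

As discussed in @wrn-normalize, we first need to correct for depth.

::: {.callout-note collapse="true"}
Library-size normalize the counts before any comparison.
:::
```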

2024-10-18

Mike Morgan (11:08:13): > I get a BiocCheck warning for Empty or missing \value sections found in man page(s). But this is in the man page for the package itself, so there is no return value. Adding @return NULL doesn’t fix the warning. Any suggestions?

Marcel Ramos Pérez (11:16:52) (in thread): > Hi Mike, does it have a \docType{package} tag? If it does, it shouldn’t trigger the warning.
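[Editor’s note] With roxygen2, the usual way to get that tag is the "_PACKAGE" sentinel, which generates a package-level man page carrying \docType{package} (file name and placement are conventional, not required):

```r
## R/mypkg-package.R (hypothetical file in the package)
#' @keywords internal
"_PACKAGE"
```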

Yasmine Elnasharty (14:02:08): > @Yasmine Elnasharty has joined the channel

2024-10-22

Edward Zhao (03:04:48): > @Edward Zhao has joined the channel

Edward Zhao (03:08:43): > Hello all, wondering if anyone else has run into a problem with the R arrow package causing build errors on macOS. > > Error: processing vignette 'VisiumIO.Rmd' failed with diagnostics: > unable to load shared object '/Library/Frameworks/R.framework/Versions/4.4-x86_64/Resources/library/arrow/libs/arrow.so': > dlopen(/Library/Frameworks/R.framework/Versions/4.4-x86_64/Resources/library/arrow/libs/arrow.so, 0x0006): symbol not found in flat namespace (_ZSTD_compress) > --- failed re-building 'VisiumIO.Rmd' > > See links here: > * BayesSpace on BioC3.20 fails:https://bioconductor.org/checkResults/devel/bioc-LATEST/BayesSpace/ > * VisiumIO on BioC 3.20 fails:https://bioconductor.org/checkResults/devel/bioc-LATEST/VisiumIO/ > * VisiumIO on BioC 3.19 is ok:https://bioconductor.org/checkResults/release/bioc-LATEST/VisiumIO/ > > * interestingly, it seems like BioC 3.19 uses a newer version of macOS to test than BioC 3.20

Leonardo Collado Torres (14:14:40) (in thread): > I think that you might have been thinking about DEFormats: https://bioconductor.org/packages/DEFormats/

Jared Andrews (14:16:49) (in thread): > Hmm, I don’t think so, it was specifically for the results tables.

2024-10-23

Hong Qin (17:49:27): > @Hong Qin has joined the channel

2024-10-25

Hervé Pagès (01:58:49): > I’m considering having BiocGenerics depend on dplyr in BioC 3.21. Here’s why: > > Many packages put dplyr on the search path (by having it in Depends): AMARETTO, AnVIL, BiocSet, broadSeq, BubbleTree, cellxgenedp, ceRNAnetsim, DFplyr, Guitar, HiCcompare, octad, omada, optimalFlow, Organism.dplyr, pfamAnalyzeR, protGear, QuaternaryProd, RegEnrich, Rtpca, TPP2D, TPP. > > This masks a few S4 generics defined in BiocGenerics, like combine(), intersect(), setdiff(), union(), and therefore breaks things like: > * combine(data.frame(aa=11:13), data.frame(bb=21:24)) if one of the packages above gets loaded after BiocGenerics; > * intersect(IRanges("11-20"), IRanges("15-30")) if one of the packages above gets loaded after IRanges. > More generally, any package that implements an S4 method for one of the above generics will break if one of the packages above is loaded after it. > > Note that tidySummarizedExperiment does something peculiar in that area: it does not Depend on dplyr, it only Imports it. However it also calls an internal function (tidySummarizedExperiment::tidyverse_attach()) in its .onAttach() hook to explicitly put dplyr (and other packages) on the search path. The result is the same as with the packages above, i.e. loading it will also break the combine() and intersect() methods defined in other packages. > > Of course the user can work around this by using BiocGenerics::combine() or BiocGenerics::intersect(), but this is barely a good situation. Few users would know what to do, and it’s a bummer to have to use this in an interactive session. > > One way to solve this problem once and for all is to have BiocGenerics depend on dplyr (via Depends). That way the former would always be in front of the latter in the search() path. Additionally, this will give us the opportunity to turn some popular dplyr verbs like mutate(), select(), filter(), etc… into S4 generics in BiocGenerics if that’s something that the authors of the tidyxxx and plyxxx packages in Bioconductor think they could benefit from. 
Luckily dplyr is quite popular, so most users already have it installed. It’s also quite light, so having BiocGenerics depend on it should not be a problem. > > Any thoughts?
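[Editor’s note] The masking described here is easy to reproduce in a fresh session; whichever package is attached last sits earlier in the search() path and wins (a sketch, run interactively):

```r
suppressPackageStartupMessages({
  library(BiocGenerics)  # defines S4 generics intersect(), setdiff(), ...
  library(dplyr)         # attached last, so its verbs mask the generics
})

## dplyr now sits in front of BiocGenerics on the search path
find("intersect")  # "package:dplyr" comes first

## S4 dispatch still works if the generic is called explicitly,
## which is exactly the workaround the proposal wants to avoid
## BiocGenerics::intersect(IRanges("11-20"), IRanges("15-30"))
```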

Shian Su (03:07:11) (in thread): > It’s not the intended purpose of Depends and it feels wrong to use it this way. Namespace conflicts are a fact of life every R user has to deal with eventually; I think documenting common conflicts and resolution strategies should suffice.

Alik Huseynov (05:43:02) (in thread): > I think adding dplyr is important since it is already widely used and quite handy. And in some cases, combination with magrittr pipes will make the code cleaner and easier to follow (provided that it doesn’t slow things down)

Aaron Lun (11:34:11) (in thread): > having carefully avoided tidyverse contamination in all my packages, this change would be quite the disappointment.

Hervé Pagès (11:39:15) (in thread): > @Shian Su Yes, a fact of life and a major annoyance, like bugs. If we know about them and have the possibility to address them, it doesn’t feel right to do nothing and pass them on to the user. We created BiocGenerics many years ago with the precise goal of fighting name conflicts. It’s been very effective at doing that. Also note that if we turn some of the dplyr verbs into S4 generics (which I think we should do at some point), we’ll have to put dplyr in BiocGenerics’s Depends anyway. This is going to be easier to do, with less risk of disruption, if BiocGenerics already depends on dplyr.

Hervé Pagès (11:44:32) (in thread): > @Aaron Lun How would this change affect your packages? Also how hard was it to carefully avoid tidyverse contamination? You make it sound like it was a lot of work that this change will throw away. FWIW all my packages also avoid tidyverse contamination but I wouldn’t say “carefully” because I didn’t do anything for that. It just happened :wink:

Hervé Pagès (11:52:53) (in thread): > @Alik Huseynov Thanks! Do people still need magrittr’s pipe? Can’t they use the pipe in base R instead? It’s been around for a while now. Just curious.

Alik Huseynov (12:09:27) (in thread): > Yes, and it is quite handy. I always use magrittr pipes in my workflows, but if I co-develop something then I always follow the agreement with the main dev. > Honestly, I try to avoid the base R pipe since there are limitations on how one can use it

Aaron Lun (12:17:56) (in thread): > No effect on the packages per se, but it adds an (unnecessary) transitive dependency. I don’t like the increased exposure to ~~Rstudio’s~~ Posit’s potential screw-ups. Every dependency should have a good reason to justify its existence, and the avoidance of naming conflicts with a non-base package just doesn’t seem like a very good reason. If we do this, it opens the door to a whole bunch of other requests to Depend on widely used packages that aren’t relevant to us. Also, dplyr is not that light, with a bunch of transitive dependencies of its own. Who knows what they’ll do in the future. > > As for what I do: given a choice between using a tidyverse-related package and not (either by using another package, or by just writing it myself), I will choose the latter. Sometimes I have no choice (e.g., httr2 pulls in magrittr, RSQLite pulls in rlang), and I just tolerate the increased dependency burden through gritted teeth. But other times, I can rejoice - for example, the upcoming biocmake + Rlibigraph submissions will liberate me from an igraph dependency and its associated tidyverse dependencies (magrittr, etc.).

Lambda Moses (12:22:16) (in thread): > I also have this problem for the SpatialFeatureExperiment build. Any updates?

Dirk Eddelbuettel (12:30:05) (in thread): - File (R): Untitled

Dirk Eddelbuettel (12:32:21) (in thread): > I mentioned this thread somewhere and someone rightly observed that you could in fact consider the zero-dependency rewrite of the verbs provided by the package poorman https://cran.r-project.org/package=poorman - Attachment (cran.r-project.org): poorman: A Poor Man’s Dependency Free Recreation of ‘dplyr’ > A replication of key functionality from ‘dplyr’ and the wider ‘tidyverse’ using only ‘base’.

Dirk Eddelbuettel (12:35:42) (in thread): > Might also provide a base layer for S4 generics, possibly to be replaced by dplyr itself when present. Just to make things more complicated still :wink:

Alik Huseynov (12:51:56) (in thread): > > As for what I do: given a choice between using a tidyverse-related package and not (either by using another package, or by just writing it myself), I will choose the latter. > @Aaron Lun so basically you suggest avoiding tidyverse packages completely?

Aaron Lun (12:54:11) (in thread): > as a general rule, yes. There are some exceptions with irreplaceable functionality, e.g., shiny pulls in a lot of tidyverse stuff. But I don’t see any real benefits from the NSE, dplyr verbs, piping or such. The syntactic sugar is not worth the extra dependencies.

Hervé Pagès (12:57:34) (in thread): > I feel the same but it doesn’t change the fact that 99.9% of Bioconductor users already have dplyr installed. There’s also a growing interest in bringing more tidy functionality into Bioconductor, with many tidyxxx and dplyxxx package submissions and the tidyomics project. We cannot ignore that.

Alik Huseynov (13:01:52) (in thread): > I’m more for making code more clear and pipes are the best ways to do it. Maybe tidyverse syntax will become a separate language..

Alik Huseynov (13:02:21) (in thread): > > There’s also a growing interest in bringing more tidy functionality in Bioconductor with many tidyxxx and dplyxx package submissions and the tidyomics project. We can not ignore that > I fully agree on that!

Hervé Pagès (13:12:20) (in thread): > BTW I didn’t mean to start a debate about tidyverse syntax vs “traditional” syntax; this is just a matter of taste. All I want is to avoid name clashes in Bioconductor. There’s an easy way to achieve that and make both syntaxes coexist peacefully in the ecosystem.

Hervé Pagès (13:20:53) (in thread): > @Dirk Eddelbuettel Yes, 15 packages, but the number of packages is maybe not the best metric. Total size is only 5.9M. It only takes 44 sec to download/install/compile on my Linux laptop. library(dplyr) only adds 1 package to the search() path. So pretty light and clean IMO.

Hervé Pagès (13:22:08) (in thread): > And this would only matter for the 0.01% of Bioconductor users who don’t already have dplyr on their machine.

Aaron Lun (13:37:00) (in thread): > The install time is not the (major) problem. It’s the increased exposure to the dplyr developers’ decisions. We become subject to their whims - what if they add or remove verbs? What if they decide to add even more transitive dependencies? And as usual, any changes will probably be made in the middle of our release cycle, so we’ll be scrambling to react again.

Dirk Eddelbuettel (13:37:35) (in thread): > Install time is largely irrelevant to me: with r2u I get all of tidyverse (say) in 18 seconds; dplyr alone likely in four or five. > > But you brushed over @Aaron Lun’s point of added fragility and more or less ignored it, and I think we will just have to disagree here. I agree that programming style is indeed mostly optional, which is why imposing one choice is kinda sad. - Attachment (eddelbuettel.github.io): CRAN as Ubuntu Binaries - r2u > Easy, fast, reliable – pick all three!

Dirk Eddelbuettel (13:38:08) (in thread): > “Jinx” as my wife would say. We posted the same riposte largely simultaneously.

Hervé Pagès (14:05:36) (in thread): > Changes/instabilities in the dplyr stack will not affect your package if you don’t depend on that stack. It will only affect Bioconductor tidyxxx/plyxxx packages. Whether I put dplyr in BiocGenerics’s Depends or not won’t make any difference in that regard. > Hey, it’s not like I’m proposing to add Seurat to BiocGenerics’s Depends. C’mon guys!

Lluís Revilla (15:16:55) (in thread): > I would recommend also sharing this plan on the bioc-devel mailing list, to give all the Bioconductor maintainers the opportunity to share their experience.

Lluís Revilla (15:19:16) (in thread): > In my opinion, while it has been more consistent, with fewer deprecations now than before (with reshape, reshape2, plyr, dplyr), it doesn’t need to be a dependency. See the talk at useR! 2024 (pdf near the bottom): https://userconf2024.sched.com/event/1c8z3 which is based on https://dplyr.tidyverse.org/reference/dplyr_extending.html - Bioconductor classes could avoid the namespace conflicts by implementing these methods.

Lluís Revilla (15:22:15) (in thread): > As a developer, even if I have a package installed, that is no reason to depend on it when I create a new package. One can provide functions to convert to such classes and let users use those other packages and methods.

Hervé Pagès (16:16:39) (in thread): > > As a developer even if I have a package installed it is no reason to depend on it when I create a new package. > Just to clarify, you won’t need to explicitly depend on dplyr, or worry about indirectly depending on it, when you create your new package. > When you create a new package and go through the process of carefully picking the things to put in your Depends and Imports fields, you almost always end up depending indirectly on things that you never asked for. That’s just another fact of life. And those indirect deps can and will change. You have no control over that, but it is what it is and it’s fine.

Hervé Pagès (17:09:25) (in thread): > > what if they add or remove verbs? > Well, isn’t that a situation where having dplyr at the far end of the search() path instead of near its beginning will be a good thing? Because no matter what they do, they won’t be able to mask any of our verbs.

2024-10-26

Kylie Bemis (00:58:22) (in thread): > Having briefly added dplyr verbs to Cardinal and then later deprecated and removed them, I have mixed feelings. (I didn’t want to be responsible for the S4 generics, and they broke/changed the signature a couple of times.) I’d be happy to add them to Cardinal, but I really don’t want them in matter (where I’d much rather remove dependencies). Is there a way to have a TidyGenerics package that depends on both dplyr and BiocGenerics and loads them in such a way that there’s a single source of tidy generics for Bioconductor packages that want them, avoiding the masking issue?

Hervé Pagès (01:22:33) (in thread): > That would not solve the problem. The dplyr verbs mask some S4 generics defined in BiocGenerics (e.g. combine() and intersect()), so we need to introduce the dependency on dplyr at the BiocGenerics level. If you introduce it at the TidyGenerics level, then loading TidyGenerics will mask the S4 generics defined in BiocGenerics, so the problem remains. I feel that it would actually make the situation slightly worse because it would increase the chances of masking.

Shian Su (01:48:15) (in thread): > If I understood it right when I sent my first message, this has nothing to do with actually incorporating dplyr verbs or creating any true dependency; it’s using Depends as a hack to avoid namespace masking. “Fixing” the mask issue leads to an uncomfortable implication that the two are interoperable - could you combine objects from both universes of packages? Just for the record, I am not strongly opposed to the idea; it’s just an uncomfortable solution to the problem.

Shian Su (02:03:07) (in thread): > Also, does legitimizing this technique have unintended consequences if it were more broadly used in other packages?

Shian Su (02:03:57) (in thread): > Do we end up in an arms race of people trying to push each other down the search path?

Hervé Pagès (02:24:59) (in thread): > I don’t think so. Not in an ecosystem where we have a central place (BiocGenerics) to define and share verbs.

Hervé Pagès (02:26:01) (in thread): > Discussion moved tohttps://github.com/Bioconductor/BiocGenerics/issues/20 - Attachment: #20 Add dplyr to BiocGenerics’s Depends field > Proposal > > Add dplyr to BiocGenerics’s Depends field > > Motivation > > The problem > > We have currently 31 software packages in Bioconductor that put dplyr on the search() path, either intentionally by having it in Depends, or unintentionally by depending (directly or indirectly) on a package that has it in Depends. See at the end of this post for the full list. > > This masks some of the S4 generics defined in BiocGenerics, like combine(), intersect(), setdiff(), union(), and therefore breaks things like: > > • combine(data.frame(aa=11:13), data.frame(bb=21:24)) if any of these 31 packages gets loaded after BiocGenerics; > • intersect(IRanges("11-20"), IRanges("15-30")) if any of these 31 packages gets loaded after IRanges. > > More generally, any package that implements an S4 method for one of the above generics will break if any of these 31 packages is loaded after it. > > For the anecdote, we’ve found at least one package that manages to put dplyr on the search() path without having it in Depends (and without depending on a package that has it in Depends): tidySummarizedExperiment. It does so by calling an internal function (tidySummarizedExperiment::tidyverse_attach()) in its .onAttach() hook to explicitly put dplyr (and other packages) on the search path. The result is the same as with any of the 31 packages mentioned previously i.e. loading it will also break the combine() and intersect() methods defined in other packages. > > Sure, the user can work around this by using BiocGenerics::combine() or BiocGenerics::intersect(), but: > > • few users would know what to do; > • it can be a real annoyance to have to use this in an interactive session. > > So not really a satisfying answer to the problem. > > Why is it a problem only now? 
> > There’s a growing interest in bringing more tidy functionality in Bioconductor with more and more tidyxxx and dplyxx package submissions to Bioconductor. All these packages depend (via Depends or Imports) on dplyr for its popular verbs (mutate, select, filter, etc…). As more tidyxxx/dplyxx packages make it into Bioconductor, the more frequent name clashing events are going to be. > > The proposed solution > > One way to solve this problem once for all – and to greatly improve the user experience – is to have BiocGenerics depend on dplyr (via Depends). That way the former would always be in front of the latter in the search() path. Additionally this will give us the opportunity to turn some popular dplyr verbs like mutate(), select(), filter(), etc… into S4 generics in BiocGenerics if that’s something that the authors of the tidyxxx and plyxxx packages in Bioconductor think they could benefit from. > > Note that we’ve created BiocGenerics many years ago with the precise goal to fight name conflicts, and it’s been a very effective way of doing that. This proposal is the continuation of this on-going effort, but with the difference that it would be the first time that BiocGenerics depends on a package that is not part of base R. > > However IMO this is a minor concern. Here’s why: > > 1. Even though dplyr itself depends on 15 packages (not counting base packages), all of them are pretty small. The total size of the 16 source tarballs (dplyr + its 15 deps) is only 5.9M. > 2. Installing these 15 packages from source is pretty fast: it takes less than a minute to download/install/compile them on an average Linux laptop. Hey, dplyr is NOT duckdb! :wink: (Of course users who install binary packages – e.g. Windows and Mac users – don’t care about compilation times.) > 3. In fact, dplyr is already imported (directly or indirectly) by hundreds of Bioconductor packages, so 99.9% (made up number!) of Bioconductor users already have it installed. Therefore 1. and 2. 
above would only matter for the 0.1% of Bioconductor users who don’t already have dplyr on their machine. > > It’s also worth noting that having BiocGenerics depend on dplyr will put the latter at the far end of the search() path in the typical Bioconductor session. This will provide additional protection against potential additions to the dplyr vocabulary by the dplyr developers because none of these additions will be able to mask a Bioconductor verb. > > Comments welcome > > Please note that this is not the place to debate about tidyverse syntax vs “traditional” syntax, nor about which one is your favorite. Let’s keep the discussion focused on name clashes in Bioconductor and how to avoid them. This proposal is an easy way to achieve that for the dplyr verbs, and to make the tidyverse and “traditional” syntax coexist peacefully in the ecosystem. > > List of Bioconductor software packages that put dplyr on the search() path (as of Oct 25, 2024) > > AlphaMissenseR, AMARETTO, AnVIL, bioCancer, BiocSet, broadSeq, BubbleTree, canceR, cBioPortalData, cellxgenedp, ceRNAnetsim, DFplyr, GNOSIS, Guitar, HiCcompare, IsoformSwitchAnalyzeR, octad, omada, optimalFlow, Organism.dplyr, pfamAnalyzeR, protGear, QuaternaryProd, RegEnrich, Rtpca, tidyomics, tidySingleCellExperiment, tidySpatialExperiment, tidySummarizedExperiment, TPP, TPP2D > > How this list was obtained: > > see details > 1. Clone the BioC manifest repo:
> > git clone [https://git.bioconductor.org/admin/manifest.git](https://git.bioconductor.org/admin/manifest.git) > > 2. Put the following code in put_dplyr_on_search_path.R:
> > are_you_putting_me_on_the_search_path <- function(you, me, verbose=FALSE) > { > if (isTRUE(verbose)) > message("Checking ", you, " ... ", appendLF=FALSE) > Rscript <- file.path(R.home("bin"), "Rscript") > fmt <- "suppressMessages(library(%s));cat(\"package:%s\"%%in%%search())" > Rexpr <- sprintf(fmt, you, me) > out <- suppressWarnings( > system2(Rscript, c("-e", paste0("'", Rexpr, "'")), stdout=TRUE) > ) > if (isTRUE(verbose)) > message("OK") > status <- attr(out, "status") > if (!is.null(status) && status != 0) > return(NA) > identical(out, "TRUE") > } > ## Check all software packages in BioC 3.21 (all of them need to be installed): > manifest <- "path/to/manifest/software.txt" > softpkgs <- read.table(manifest)[[2]] > length(softpkgs) # 2281 software packages in BioC 3.21 on Oct 25, 2024 > ## Takes about 3-4 hours on a Linux machine that has (almost) all the software > ## packages installed: > system.time(yes <- sapply(softpkgs, are_you_putting_me_on_the_search_path, "dplyr", TRUE)) > saveRDS(yes, "put_dplyr_on_search_path.rds") > table(yes, useNA="ifany") > > 3. Adjust manifest. > 4. Run script with path/to/R/bin/Rscript put_dplyr_on_search_path.R >put_dplyr_on_search_path.log 2>&1 &

2024-10-27

Edward Zhao (01:11:48) (in thread): > I sent an email to the mailing list but have not heard back

Eva Hamrud (18:30:50): > Hello, I was wondering if another worker/environment has been added to the Bioconductor build checks for BioC 3.20? I pushed changes for the last time over a week ago and my package passed all the checks, but when I checked the page again today I see my package has failed on the Linux environment ‘kunpeng2’ (https://bioconductor.org/checkResults/3.20/bioc-LATEST/mixOmics/). I’m certain I didn’t see any errors last week, so I’m not sure where this is coming from as I haven’t pushed any other changes to Bioconductor. I know that all checks need to have been passed by Friday 25th October so the package can be in BioC 3.20, which is released later this week. Could someone please advise on what I need to do?

Lori Shepherd (19:19:16) (in thread): > If your package is building on the other platforms it will still be released in bioc 3.20 so no worries there.

2024-10-28

Martin Grigorov (05:59:13) (in thread): > Hi! The kunpeng2 builder is not new. It has been used to test the build on Linux ARM64 for around a year now. > The problem in the test is the comparison of floating point numbers. As you may know, arithmetic operations with FP numbers are not exact. You need to check that the difference is within some tolerance.

Martin Grigorov (05:59:48) (in thread): > here is an example - https://stackoverflow.com/questions/61360074/compare-floats-in-r - Attachment (Stack Overflow): Compare floats in R > Disclaimer > > I was not sure whether to post that here or on CV but after having read what is on topic on CV I think it is more R specific than purely statistical. Thus, I posted it here. > > Problem > > C…
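Martin’s point can be sketched in a few lines of R: compare floating point results within a tolerance rather than for exact equality (a generic illustration, not the actual mixOmics test):

```r
## Floating point arithmetic is not exact, and results can differ in the
## last bits across platforms (e.g. x86_64 vs ARM64), so exact equality
## comparisons are fragile:
x <- 0.1 + 0.2
y <- 0.3

x == y                   # FALSE on most platforms
isTRUE(all.equal(x, y))  # TRUE: all.equal() compares within a tolerance
abs(x - y) < 1e-8        # TRUE: the same idea written out by hand
```

In testthat-based unit tests, expect_equal() takes a tolerance argument for the same purpose.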

Eva Hamrud (18:26:24) (in thread): > ok will look into that, thank you!

Changqing (19:37:12): > I ran into a zlib.h not found error on kunpeng2. I am using PKG_LIBS = -pthread -lz $(RHTSLIB_LIBS) to include zlib and it was working until yesterday. Did something change on kunpeng2? > > * installing to library '/home/biocbuild/R/R-4.4.1/site-library' > * installing **source** package 'FLAMES' ... > **** using non-staged installation via StagedInstall field > **** libs > using C++ compiler: 'g++ (conda-forge gcc 14.2.0-1) 14.2.0' > using C++17 > ... > classes/GeneAnnotationParser.cpp:8:10: fatal error: zlib.h: No such file or directory > 8 | #include "zlib.h" > | ^~~~~~~~~~~~~~ >

Lori Shepherd (20:38:26) (in thread): > @Martin Grigorov

2024-10-29

Martin Grigorov (05:17:42) (in thread): > Which package fails ? It is not clear from the provided information

Martin Grigorov (05:17:49) (in thread): > ah, FLAMES !

Martin Grigorov (05:30:55) (in thread): > export CPATH="$CPATH:/home/biocbuild/miniforge3/include" fixed it: > > biocbuild@kunpeng2 ~/git [1]> R CMD build FLAMES (base) > * checking for file 'FLAMES/DESCRIPTION' ... OK > * preparing 'FLAMES': > * checking DESCRIPTION meta-information ... OK > * cleaning src > * installing the package to build vignettes > * creating vignettes ... OK > * cleaning src > * checking for LF line-endings in source and make files and shell scripts > * checking for empty or unneeded directories > * building 'FLAMES_1.99.2.tar.gz' >

Travis Blimkie (11:19:09): > @Travis Blimkie has joined the channel

2024-10-30

FAIZAN (12:31:33): > @FAIZAN has joined the channel

Henrik Bengtsson (13:41:57): > Question on removal of broken packages: Is the cleanup of broken Bioc packages automatic or manual? Take for instance the ‘ISAnalytics’ package: it failed all checks in Bioc 3.19, and all checks in Bioc devel (now 3.20). There have been no updates to this package since 2023-07-26. Yet, it looks like it was “approved” for the Bioc 3.20 release. Wasn’t this supposed to be deprecated and removed? This is a general question; I happened to spot this one because it keeps popping up as a broken package in my revdep checks. Am I misunderstanding the deprecation process, is there a bug in some script, or something else?

Lori Shepherd (13:46:25) (in thread): > It is manual. I was more lenient this release than most releases because we were having so much trouble with our builders this release cycle that I didn’t think it fair to developers to deprecate without consistent reports. One of my pet projects is to build a report database so that the length of failures can be more publicly traceable, but for now I have a script that I keep locally for this. That particular package did build/check this release cycle and didn’t start failing until mid Sept (we also normally give packages 4-6 weeks of consistent failure before deprecation)

Lori Shepherd (13:49:42) (in thread): > in general the process is … auto failure notifications from the builders… separate emails if it keeps failing and on my radar for deprecation … if it fails for 4-6 weeks with no effort or replies then marked for deprecation in devel…. if not fixed before release it would move forward and be deprecated in the next release and removed from the next devel… they still would have the release cycle to fix in release to bring it back …

Henrik Bengtsson (15:25:21) (in thread): > Got it - thanks for clarifying. BTW, is there a way to see if a package is marked for deprecation, e.g. a custom field in DESCRIPTION?

Lori Shepherd (15:27:21) (in thread): > Yes, if a package has PackageStatus: Deprecated in the DESCRIPTION, that’s our way of marking it, and it will appear struck out in the build report. I’ll also try to announce on bioc-devel and the support site a few times during release

Lori Shepherd (15:28:26) (in thread): > and I try to keep this page updated: https://bioconductor.org/about/removed-packages/

Lori Shepherd (16:31:02): > Bioconductor 3.20 is released! Thanks to all developers and community members for contributing to the project! Please see the full release announcement: https://bioconductor.org/news/bioc_3_20_release/

Hervé Pagès (21:16:41) (in thread): > Finally resolved (in BioC 3.21) by leveraging the generics package: https://github.com/Bioconductor/BiocGenerics/issues/20#issuecomment-2448791233 - Attachment: Comment on #20 Add dplyr to BiocGenerics’s Depends field > Alternative solution implemented in BiocGenerics 0.53.1 (BioC 3.21, requires R 4.5): b18a5f2 > > The good news is that the dplyr folks don’t need to do anything :smiley: : > > > library(IRanges) > > intersect(IRanges("11-20"), IRanges("15-30")) > # IRanges object with 1 range and 0 metadata columns: > # start end width > # <integer> <integer> <integer> > # [1] 15 20 6 > > library(dplyr) > > intersect(IRanges("11-20"), IRanges("15-30")) # breaks in BioC 3.20! > # IRanges object with 1 range and 0 metadata columns: > # start end width > # <integer> <integer> <integer> > # [1] 15 20 6 > > > > Not many dplyr verbs are actually defined as S3 generics in the generics package, but we are just lucky here that the “setops verbs” (intersect, union, etc…) are. However, the good thing is that if other dplyr verbs turn out to clash with verbs in Bioconductor, we could always ask the tidyverse folks to put them in generics so we can use the same setup as with the “setops verbs” to avoid the clash. This kind of change is pretty straightforward and should not be disruptive.
> > sessionInfo() > > > R Under development (unstable) (2024-10-22 r87265) > Platform: x86_64-pc-linux-gnu > Running under: Ubuntu 23.10 > > Matrix products: default > BLAS: /home/hpages/R/R-4.5.r87265/lib/libRblas.so > LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.11.0 > > locale: > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C > [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 > [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 > [7] LC_PAPER=en_US.UTF-8 LC_NAME=C > [9] LC_ADDRESS=C LC_TELEPHONE=C > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C > > time zone: America/Los_Angeles > tzcode source: system (glibc) > > attached base packages: > [1] stats4 stats graphics grDevices utils datasets methods > [8] base > > other attached packages: > [1] dplyr_1.1.4 IRanges_2.41.0 S4Vectors_0.45.0 > [4] BiocGenerics_0.53.1 generics_0.1.3 > > loaded via a namespace (and not attached): > [1] utf8_1.2.4 R6_2.5.1 tidyselect_1.2.1 magrittr_2.0.3 > [5] glue_1.8.0 tibble_3.2.1 pkgconfig_2.0.3 lifecycle_1.0.4 > [9] cli_3.6.3 fansi_1.0.6 vctrs_0.6.5 compiler_4.5.0 > [13] pillar_1.9.0 rlang_1.1.4 > >

2024-10-31

Hervé Pagès (11:52:35) (in thread): > See https://stat.ethz.ch/pipermail/bioc-devel/2024-October/020695.html

Tim Triche (14:50:40) (in thread): > thanks for all your hard work on this Lori!

2024-11-03

Dirk Eddelbuettel (11:25:42): > [ Deleted some line noise when I had fallen for a new dependency not properly flagged in my setup. Entirely my bad. ]

2024-11-12

Changqing (02:09:12): > How do I make the citation on the Bioconductor package landing page show et al.? I tried editing CITATION to this: > > bibentry( > bibtype = "article", > author = "Luyi Tian, Jafar S Jabbari, Rachel Thijssen, et al.", > ... > textVersion = > "Tian, L., Jabbari, J. S., Thijssen, R., et al. (2021). Comprehensive characterization of single-cell full-length isoforms in human and mouse with long-read sequencing. Genome Biology, 22(1), 310. [https://doi.org/10.1186/s13059-021-02525-6](https://doi.org/10.1186/s13059-021-02525-6)" > ) > > If I do citation("FLAMES") within R it prints what I expect, the textVersion string. > But our package page shows Tian L, Jabbari JS, Thijssen R, al. e (2021) for authors. - Attachment (Bioconductor): FLAMES (development version) > Semi-supervised isoform detection and annotation from both bulk and single-cell long read RNA-seq data. Flames provides automated pipelines for analysing isoforms, as well as intermediate functions for manual execution.

Mariangela Santorsola (06:42:18): > @Mariangela Santorsola has joined the channel

Lori Shepherd (06:44:12): > @Andres Wokaty – might have to look into how the citation files are created on the BBS

Dirk Eddelbuettel (08:35:17) (in thread): > Have you tried protecting the ‘et al.’ with a set of curlies: {et al.} which signals to bibtex and its tools not to split and invert the string?

Andres Wokaty (12:25:02) (in thread): > I think this is due to how biocViews generates it. I will try to correct it.

2024-11-13

Luke Zappia (02:14:47) (in thread): > It’s probably better to write an entry that lists all the authors. Partly so that anyone who uses citation() gets the full information, but also because everyone on the paper deserves to be acknowledged.

Robert Castelo (03:16:11): > Hi, according to the documentation on Bioconductor docker containers at https://bioconductor.org/help/docker the docker container bioconductor/bioconductor_docker:devel should run the current Bioconductor devel, however: > > $ docker run bioconductor/bioconductor_docker:devel Rscript -e 'BiocManager::version()' > [1] '3.20' > > In the Docker Hub profile of Bioconductor I noticed the recently updated tag devel-amd64, and this image seems to be running the current Bioconductor devel: > > $ docker run bioconductor/bioconductor_docker:devel-amd64 Rscript -e 'BiocManager::version()' > [1] '3.21' > > So, has the devel tag been dropped in favor of devel-amd64 (and should the documentation on the web then be updated)? Thanks!

Marcel Ramos Pérez (09:53:43) (in thread): > Hi Robert! Thanks for the question. IMO the devel tag should point to the latest amd64 build (@Alex Mahmoud feel free to comment). As for the status of the builds, it looks like there are still some issues with RStudio in the devel containers, e.g. in rocker/rstudio:devel. We will have to wait a bit until the image is usable with a newer version of RStudio. I do know that using the RStudio daily builds gets around the error that you may be seeing: > > symbol lookup error: /usr/lib/rstudio-server/bin/rsession: undefined symbol: Rf_countContexts >

Alex Mahmoud (09:57:44) (in thread): > The devel-amd64 tag builds separately, hence why it pushed. The ‘devel’ tag expects both amd64 and arm64 to build before propagating, and the latter has some issues. The switch from Ubuntu 22 to Ubuntu 24 has caused some unexpected issues, so the process to update and rebuild the containers is taking longer than expected. Sorry about that, but it should only be temporary, and the usual tags should hopefully all be back as expected in a couple of weeks

Robert Castelo (10:38:54) (in thread): > I see, thanks for the clarifications. In my case, and probably for others too, I do not use RStudio from the container; I use the container as a convenient way to build and test the devel version of my packages on the command line.

Alex Mahmoud (12:37:50) (in thread): > For that you could use Bioconductor/r-ver:devel instead. I believe that one built successfully for R devel, and is a lighter container with R but not RStudio

Alex Mahmoud (12:38:58) (in thread): > The documentation needs updating to reflect all the new flavors… it’s been on my todo list for a while but I keep pushing it off to do other things

Robert Castelo (13:28:26) (in thread): > This is great, thanks!!

2024-11-14

Lori Shepherd (07:34:18) (in thread): > We investigated this a little bit. When generating the landing pages we use readCitationFile, which creates a citation/bibentry object, and then we use print(citation, style="html") for formatting. It appears correct from readCitationFile, but in our investigation we found that choosing any style in the print function except textVersion or Bibtex will result in this erroneous “al. e”. > > > citation > Tian, L., Jabbari, J. S., Thijssen, R., et al. (2021). Comprehensive > characterization of single-cell full-length isoforms in human and mouse > with long-read sequencing. Genome Biology, 22(1), 310. [https://doi.org/10.1186/s13059-021-02525-6](https://doi.org/10.1186/s13059-021-02525-6) > A BibTeX entry for LaTeX users is > > @Article{, > author = {Luyi Tian and Jafar S Jabbari and Rachel Thijssen and et al.}, > title = {Comprehensive Characterization of Single-Cell Full-Length Isoforms in Human and Mouse with Long-Read Sequencing}, > journal = {Genome Biology}, > year = {2021}, > volume = {22}, > number = {1}, > pages = {310}, > doi = {10.1186/s13059-021-02525-6}, > } > > > print(citation, style="text") > Tian L, Jabbari JS, Thijssen R, al. e (2021). "Comprehensive > Characterization of Single-Cell Full-Length Isoforms in Human and Mouse > with Long-Read Sequencing." *Genome Biology*, **22**(1), 310. > doi:10.1186/s13059-021-02525-6 > <[https://doi.org/10.1186/s13059-021-02525-6](https://doi.org/10.1186/s13059-021-02525-6)>. > > > print(citation, style="html") > <p>Tian L, Jabbari JS, Thijssen R, al. e (2021). > “Comprehensive Characterization of Single-Cell Full-Length Isoforms in Human and Mouse with Long-Read Sequencing.” > <em>Genome Biology</em>, <b>22</b>(1), 310. > <a href="[https://doi.org/10.1186/s13059-021-02525-6](https://doi.org/10.1186/s13059-021-02525-6)">doi:10.1186/s13059-021-02525-6</a>. 
> </p> > > > print(citation, style="Bibtex") > @Article{, > author = {Luyi Tian and Jafar S Jabbari and Rachel Thijssen and et al.}, > title = {Comprehensive Characterization of Single-Cell Full-Length Isoforms in Human and Mouse with Long-Read Sequencing}, > journal = {Genome Biology}, > year = {2021}, > volume = {22}, > number = {1}, > pages = {310}, > doi = {10.1186/s13059-021-02525-6}, > } > > > print(citation, style="md") > Tian L, Jabbari JS, Thijssen R, al. e (2021). "Comprehensive > Characterization of Single-Cell Full-Length Isoforms in Human and Mouse > with Long-Read Sequencing." *Genome Biology*, ***22***(1), 310. > [doi:10.1186/s13059-021-02525-6]([https://doi.org/10.1186/s13059-021-02525-6](https://doi.org/10.1186/s13059-021-02525-6)). > > > print(citation, style="latex") > Tian L, Jabbari JS, Thijssen R, al. e (2021). > ``Comprehensive Characterization of Single-Cell Full-Length Isoforms in Human and Mouse with Long-Read Sequencing.'' > \emph{Genome Biology}, \bold{22}(1), 310. > \Rhref{[https://doi.org/10.1186/s13059-021-02525-6}{doi:10.1186](https://doi.org/10.1186/s13059-021-02525-6}{doi:10.1186)\slash{}s13059\-021\-02525\-6}. > > So it appears to be an issue with the bibentry print function in the utils package. For what it’s worth, it’s also recommended to use person() when defining authors, and you can get around this by defining “et al.” in the list of persons > > bibentry( > bibtype = "article", > author = c(person(given="Luyi", family="Tian"), > person(given="Jafar S", family="Jabbari"), > person(given="Rachel", family="Thijssen"), > person(given="et al.")), > title = "Comprehensive Characterization of Single-Cell Full-Length Isoforms in Human and Mouse with Long-Read Sequencing", > > which renders correctly > > > citation = readCitationFile("CITATION") > > print(citation, style="html") > <p>Tian L, Jabbari J, Thijssen R, et al. (2021). 
> “Comprehensive Characterization of Single-Cell Full-Length Isoforms in Human and Mouse with Long-Read Sequencing.” > <em>Genome Biology</em>, <b>22</b>(1), 310. > <a href="[https://doi.org/10.1186/s13059-021-02525-6](https://doi.org/10.1186/s13059-021-02525-6)">doi:10.1186/s13059-021-02525-6</a>. > </p> >

Robert Castelo (12:41:34): > Hi, any hint why GenomicRanges::setdiff() or GenomicRanges::union() with ignore.strand=TRUE has stopped working in devel? There are no committed changes to IRanges or GenomicRanges :thinking_face: > > library(GenomicRanges) > example(setdiff) > > setdff> ## --------------------------------------------------------------------- > setdff> ## A. SET OPERATIONS > setdff> ## --------------------------------------------------------------------- > setdff> > setdff> x <- GRanges("chr1", IRanges(c(2, 9) , c(7, 19)), strand=c("+", "-")) > > setdff> y <- GRanges("chr1", IRanges(5, 10), strand="-") > > setdff> union(x, y) > GRanges object with 3 ranges and 0 metadata columns: > seqnames ranges strand > <Rle> <IRanges> <Rle> > [1] chr1 2-7 + > [2] chr1 9-19 - > [3] chr1 5-10 - > ------- > seqinfo: 1 sequence from an unspecified genome; no seqlengths > > setdff> union(x, y, ignore.strand=TRUE) > Error in .local(x, y, ...) : unused argument (ignore.strand = TRUE) > > sessionInfo() > R Under development (unstable) (2024-11-03 r87286) > Platform: x86_64-pc-linux-gnu > Running under: Ubuntu 20.04.6 LTS > > Matrix products: default > BLAS: /projects_fg/soft/R/R-devel/lib/R/lib/libRblas.so > LAPACK: /projects_fg/soft/R/R-devel/lib/R/lib/libRlapack.so; LAPACK version 3.12.0 > > locale: > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C > [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 > [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 > [7] LC_PAPER=en_US.UTF-8 LC_NAME=C > [9] LC_ADDRESS=C LC_TELEPHONE=C > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C > > time zone: Europe/Madrid > tzcode source: system (glibc) > > attached base packages: > [1] stats4 stats graphics grDevices utils datasets methods > [8] base > > other attached packages: > [1] GenomicRanges_1.59.0 GenomeInfoDb_1.43.0 IRanges_2.41.0 > [4] S4Vectors_0.45.1 BiocGenerics_0.53.2 generics_0.1.3 > [7] colorout_1.3-0 > > loaded via a namespace (and not attached): > [1] zlibbioc_1.53.0 httr_1.4.7 compiler_4.5.0 
> [4] R6_2.5.1 tools_4.5.0 XVector_0.47.0 > [7] GenomeInfoDbData_1.2.13 UCSC.utils_1.3.0 jsonlite_1.8.9 >

Robert Castelo (13:04:36): > mmmm… I’ve found a suspicious change in BiocGenerics @Hervé Pagès? :sweat_smile: > > commit b18a5f2b74f506454eba15e8a229e12206f9e57f > Author: Hervé Pagès <hpages.on.github@gmail.com> > Date: Wed Oct 30 16:41:34 2024 -0700 > > BiocGenerics 0.53.1 > > Add CRAN package generics to Depends field. > > The default methods for S4 generics union(), intersect(), and setdiff() > now are generics::union(), generics::intersect(), and generics::setdiff(), > respectively. See '?BiocGenerics::setops' for more information. >

Hervé Pagès (14:33:16) (in thread): > You’ve probably seen some Warning: multiple methods tables found for ‘intersect’ when you loaded GenomicRanges with library(GenomicRanges). You need to get rid of them by reinstalling S4Vectors, IRanges, GenomeInfoDb, and GenomicRanges. Then example(setdiff) should work again. > I still need to bump the versions of these 4 packages to force reinstallation on the end-user machines.
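The reinstall Hervé describes can be done in one call (a sketch; assumes BiocManager is installed — force = TRUE reinstalls even though the version numbers have not been bumped yet):

```r
## Reinstall the four packages so their S4 method tables are rebuilt
## against the new BiocGenerics (force = TRUE reinstalls despite
## unchanged version numbers):
BiocManager::install(
    c("S4Vectors", "IRanges", "GenomeInfoDb", "GenomicRanges"),
    force = TRUE, update = FALSE, ask = FALSE
)
```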

2024-11-15

Robert Castelo (04:29:44) (in thread): > Thanks, that was an easy fix!

2024-11-18

Peter Hickey (22:58:07): > Anyone else having trouble installing or successfully installed systemfonts with R-devel on macOS (arm64)? > Figured I’d ask here before going further down the rabbit hole.

Peter Hickey (22:59:44) (in thread): > In case someone feels inclined to look, here is the output of running BiocManager::install('systemfonts') on my machine - File (Binary): systemfonts_installation.log

2024-11-19

Jeroen Ooms (08:06:47) (in thread): > Can you try again now please? I think there was a glitch.

Aidan Lakshman (12:23:05): > The devel build of one of my packages is failing due to compiled code not finding Calloc and Free… I haven’t made any pushes since the (successful) previous release. I may have missed an r-devel announcement; were the Calloc/Free macros changed or moved to a different header? I can try changing them to R_Calloc/R_Free or just to standard calloc/free, just curious.

Aidan Lakshman (12:26:29) (in thread): > looks like the latest build is compiling with -DSTRICT_R_HEADERS=1 on Ubuntu, which I suppose indicates that I may have forgotten a header include… it’s only on Ubuntu though, not on Windows; I’m not sure when that change happened.

Aidan Lakshman (12:30:00) (in thread): > ah, I may have answered my own question… STRICT_R_HEADERS will undef Calloc and Free (link). Guess I’ll change them to the non-macro’d versions or the normal ones.

Marcel Ramos Pérez (12:32:34) (in thread): > Yes, it’s in the R-devel NEWS: https://cran.r-project.org/doc/manuals/r-devel/NEWS.html > > Strict R headers are now the default. This removes the legacy definitions of PI, Calloc, Realloc and Free: use M_PI, R_Calloc, R_Realloc or R_Free instead.

Aidan Lakshman (12:34:35) (in thread): > ah, I have to start reading the R-devel NEWS more regularly. Thanks for the link, time to go fix my code:sweat_smile:

Lluís Revilla (14:04:53) (in thread): > I created an account to share them as they happen on Mastodon: https://fosstodon.org/@R_devs_news. If you use RSS feeds you can also subscribe to the original feed (see links on profile).

Peter Hickey (16:07:07) (in thread): > Thanks, Jeroen. It’s worked this morning. What was the issue? I don’t see any changes

Kylie Bemis (18:56:21) (in thread): > It was mentioned on the BioC-devel mailing list too. I still need to fix my packages as well. Was already using R_Calloc() but somehow was still using Free()

Dirk Eddelbuettel (19:03:35) (in thread): > For completeness, there is also the original RSS feed set up by Duncan Murdoch years ago, which I still follow: https://developer.r-project.org/RSSfeeds.html

2024-11-20

Jeroen Ooms (05:17:15) (in thread): > The script downloads freetype2 if it is not found on your system, but it may conflict with some other libraries installed on the machine. I fixed it such that it uses the right one.

Jared Andrews (12:56:50): > CRAN allows packages containing Rust crates or code and provides the rustc toolchain on its build platforms. Has Bioconductor had any discussions about supporting that as well? More info in the CRAN note, the r-rust FAQ, and the extendr suite. > > As Rust gains popularity and the ecosystem of performant and useful crates grows, there are instances where wrapping some of that functionality in R packages is enticing.
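For context, such packages typically follow this shape (a hedged sketch following the extendr/r-rust conventions; the rustlib directory and crate name are illustrative, not taken from any specific package): the crate lives under src/, and src/Makevars drives cargo to produce a static library that is linked into the package’s shared object.

```make
# src/Makevars (illustrative sketch): build the Rust crate in src/rustlib
# as a static library and link it into the package's shared object.
LIBDIR = rustlib/target/release
STATLIB = $(LIBDIR)/librustlib.a
PKG_LIBS = -L$(LIBDIR) -lrustlib

$(SHLIB): $(STATLIB)

$(STATLIB):
	cargo build --release --manifest-path=rustlib/Cargo.toml
```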

Andres Wokaty (14:55:28) (in thread): > There was a package submitted that required rust/cargo and we installed rust and cargo on a devel builder, but the submission became inactive. My sense is that Bioconductor is open to it.

Jared Andrews (15:01:00) (in thread): > Ah yeah, I see it now. Certainly seems doable then. Thanks!

Hervé Pagès (16:16:11) (in thread): > Good to know that CRAN is embracing Rust. Thanks for sharing these links @Jared Andrews. Note that the “CRAN note” link seems to take me to the same place as the “extendr” link. Is that intended?

Andres Wokaty (16:21:01) (in thread): > I forgot that ANCOMBC has a dependency on a CRAN package that depends on a package with Rust and Cargo as system requirements so they are already installed on the linux builders.

Jared Andrews (16:23:27) (in thread): > @Hervé Pagès nope, that was not meant to be the same link; corrected it. CRAN is providing relatively loose guidelines and sort of leaving it up to the community to figure the rest out, it seems.

Lluís Revilla (16:53:33) (in thread): > If it helps, there are some efforts from package developers to improve CRAN/R policy, as it has led to many problems recently: https://github.com/r-devel/r-svn/pull/182 (and some others in other rust-r organizations). - Attachment: #182 Improves check for Rust > yesterday a check for packages which compile Rust was added: 6114d41 > > This check claims that “It is impossible to tell definiitively if a package compiles rust code”. This is not true. Rust code can be compiled if and only if there is a Cargo.toml metadata file. This PR changes the check_rust check to quit if the package does not contain a Cargo.toml anywhere under the src directory.

2024-11-24

Jeroen Ooms (10:12:57) (in thread): > The R in Rust community is also very active on r-universe: https://r-universe.dev/search?q=topic%3Arust

2024-11-25

Jeroen Ooms (10:04:46): > Does the bioc git server automatically sync from GitHub if that is the development repo? Or does the maintainer manually have to keep them in sync? > I noticed because the package SpotSweeper has a broken DESCRIPTION file on bioc, which was fixed on GitHub last week.

Andres Wokaty (10:10:06) (in thread): > The maintainer has to manually keep them in sync. I can ping them about it.

Jeroen Ooms (14:20:00) (in thread): > So FYI, the problem is the blank line on line 5: https://code.bioconductor.org/browse/SpotSweeper/blob/devel/DESCRIPTION

2024-11-26

Jeroen Ooms (08:19:22): > Also the package ReactomeGSA seems to have no name anymore in its DESCRIPTION :thinking_face:: https://github.com/bioc/ReactomeGSA/commit/4926a5bc853fc2a9b01a79cce70f84d3d2e3900b

Lori Shepherd (08:24:05): > I can reach out to this one. I’m doing some package notifications today anyway for other failing Bioc packages. @Andres Wokaty did you already reach out to SpotSweeper?

Dirk Eddelbuettel (08:32:03): > While we’re at it, can you reach out to HilbertVis? I still have the compilation failure there because of the requirement for a format string (turning into an error with the compiler default switches for formats). Error messages in thread.

Dirk Eddelbuettel (08:32:18) (in thread): - File (Shell): Untitled

Lori Shepherd (08:51:04) (in thread): > yes I’ve notified them twice already but I’ll try again > > We only have this trigger a warning on our system instead of an error.

Dirk Eddelbuettel (09:05:39) (in thread): > Yep. As I understand it, -Werror=format-security would elevate it to an error. If you can control your CXXFLAGS easily enough you could consider adding it. But sometimes build systems are cast in concrete and iron, so if you can’t I’d understand.

Lori Shepherd (09:20:08) (in thread): > I think we purposely only added it as a warning until it becomes more formalized (or maybe to make sure there weren’t a cascading amount of packages affected) but @Andres Wokaty / @Hervé Pagès would know for sure

Dirk Eddelbuettel (09:21:08) (in thread): > It’s always an error for me — and it is not tickled by any other of the approx. 450 BioC packages I compile for r2u.

Andres Wokaty (09:22:37) (in thread): > Yes

Andres Wokaty (12:03:33) (in thread): > Our build system uses -Wall. I remember issues with -Werror=format-security in containers, so some packages might have an issue. We could discuss? I think if we did it, it should be only for devel. I will note that R Universe seems to have it as a warning too.

2024-12-06

Kevin Rue-Albrecht (09:28:33): > Not sure where to find the information, happy to be redirected: > Anyone here knows why Bioconductor is stuck at 3.18 on bioconda? > Thanks!

Stevie Pederson (09:54:51) (in thread): > Hi Kevin! I think there were initially some issues building R 4.4 on Windows, and if one platform fails, the process gets held up. At least that’s how I read this: https://github.com/bioconda/bioconda-recipes/issues/49778. There’s also some discussion about 3.20 here, and it looks like it might not be too far off: https://github.com/bioconda/bioconda-recipes/pull/51889. > > (Not even remotely a conda expert though) - Attachment: #49778 Work list for R 4.4/Bioconductor 3.19 builds - Attachment: #51889 Update BioConductor packages to BioC 3.20

2024-12-10

Francesc Català-Moll (01:45:15): > Hello everyone, > > I am trying to install the Maaslin2 package using BiocManager::install("Maaslin2") on R 4.4.1 (Bioconductor 3.20), but I encounter the following warning message: > > BiocManager::install("Maaslin2") > #> 'getOption("repos")' replaces Bioconductor standard repositories, see > #> 'help("repositories", package = "BiocManager")' for details. > #> Replacement repositories: > #> CRAN: [https://cloud.r-project.org/](https://cloud.r-project.org/) > #> Bioconductor version 3.20 (BiocManager 1.30.25), R 4.4.1 (2024-06-14) > #> Installing package(s) 'Maaslin2' > #> Warning: package 'Maaslin2' is not available for Bioconductor version '3.20' > #> > #> A version of this package for your version of R might be available elsewhere, > #> see the ideas at > #> [https://cran.r-project.org/doc/manuals/r-patched/R-admin.html#Installing-packages](https://cran.r-project.org/doc/manuals/r-patched/R-admin.html#Installing-packages) > > However, according to the package’s webpage, Maaslin2 seems to be available for the current release and its build status is reported as OK. > > Could you help me understand why BiocManager is not allowing me to install Maaslin2? Is there some issue with repository synchronization or something else related to the Bioconductor version I’m using? > > Any insights or suggestions would be greatly appreciated. > > Thanks!

Mike Smith (03:59:01) (in thread): > TL;DR The metagenomeSeq package is failing to build, and Maaslin2 depends on that package. > > For more details on how to reach that conclusion you have to do a bit of digging to explore why the package isn’t available. I agree it’s a bit confusing that it has a green ‘passing build’ badge, but then isn’t installable. > > On the landing page you linked to, you can find the ‘Package Archives’ section, where there is nothing listed for ‘Source Package’, ‘Windows Binary’, etc. If the package were available there would be a link to download it - that’s the same link R would use for the installation. That’s why R reports “package not available”. > > This combination of green build status but no actual package to download indicates it isn’t propagating from the build system to the Bioconductor repository. To figure out why, you can take a look at the build report for Maaslin2. On the far right there are three small red circles, and hovering over them brings up a popup that says “NO, package depends on ‘metagenomeSeq’ which is not available”. Unfortunately, the fix requires either metagenomeSeq to be fixed, or Maaslin2 to remove its dependency on that package.
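The propagation gap Mike describes can also be seen directly from R, without digging through landing pages. A minimal sketch, assuming `BiocManager` is installed and configured for the release in question:

```r
## Does the Bioconductor repository actually serve a Maaslin2 package?
## (A green build badge only means the build passed, not that the
## package propagated.)
repos <- BiocManager::repositories()
avail <- available.packages(repos = repos)
"Maaslin2" %in% rownames(avail)
## FALSE while propagation is blocked, matching the
## "package 'Maaslin2' is not available" message
```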

Vince Carey (04:39:40) (in thread): > I have written to the metagenomeSeq maintainer to see whether the failing unit tests can be addressed. The failures seem nontrivial to me.

Lori Shepherd (07:00:25) (in thread): > The metagenomeSeq maintainer has also been contacted at least 2 other times since the October release; they have so far been unresponsive.

Leonardo Collado Torres (10:46:52): > Initially while helping @Daianna Gonzalez-Padilla with GHA https://github.com/LieberInstitute/smokingMouse/commit/a93e73fd6872b8be9f2738bc1a51f5ef83cf43d7 using bioc 3.21 (bioc-devel) we noticed issues at https://github.com/LieberInstitute/smokingMouse/actions/runs/12257533078 with > > subscript out of bounds > > on Rmd vignettes. I see this also on basically all my packages https://lcolladotor.github.io/pkgs/. Like https://bioconductor.org/checkResults/devel/bioc-LATEST/derfinder/nebbiolo1-buildsrc.html. > > Are you aware of any changes to Rmd vignettes (with BiocStyle) that might be causing this?

Daianna Gonzalez-Padilla (10:46:56): > @Daianna Gonzalez-Padilla has joined the channel

Leonardo Collado Torres (10:47:48): > In @Daianna Gonzalez-Padilla’s case, just switching back to bioc 3.20 https://github.com/LieberInstitute/smokingMouse/commit/1b69354cba00d60d7517258230a4ff95866e12a1 ran ok on GHA https://github.com/LieberInstitute/smokingMouse/actions/runs/12258961989.

Marcel Ramos Pérez (11:13:43) (in thread): > From what I can tell, it looks like smokingMouse was not added to the bib object and therefore you get the error: > > > bib[['smokingMouse']] > Error in s[[i]] : subscript out of bounds >
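A minimal sketch of that failure mode and a more informative lookup; that `bib` is a RefManageR `BibEntry`-like object indexed by citation key is taken from the thread, and the helper name is hypothetical:

```r
## bib[['smokingMouse']] errors with "subscript out of bounds" when no
## entry with that key was ever added to the bib object. A guarded
## lookup fails with a clearer message:
getRef <- function(bib, key) {
  if (!key %in% names(bib)) {
    stop("citation '", key, "' missing from bib object")
  }
  bib[[key]]
}
```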

Leonardo Collado Torres (11:15:34) (in thread): > Interesting. I do use RefManageR in nearly all my packages. So maybe there’s a common error there in bioc 3.21. Thanks @Marcel Ramos Pérez! > > cc @Daianna Gonzalez-Padilla

Francesc Català-Moll (11:35:32) (in thread): > Thanks@Mike Smith

Vince Carey (11:50:55) (in thread): > I did hear back from the metagenomeSeq maintainer who says it will be looked at tonight.

Leonardo Collado Torres (12:29:20) (in thread): > I just fixed the issues related to thebibobject on bioc 3.21. Thanks@Marcel Ramos Pérez! > > cc@Daianna Gonzalez-Padilla

2024-12-17

kent riemondy (09:36:47): > I have a CRAN package (valr) that depends on rtracklayer for reading bigwigs and gtf files. Recently I was notified by CRAN of an AddressSanitizer (ASAN) error coming from the UCSC library code vendored in rtracklayer (relevant log linked in additional issues here and shown in reply to this message). We don’t link against rtracklayer’s source code and only use R functions from rtracklayer. > > I’m wondering if Bioc packages are tested in the Bioconductor build system with the various additional checks done by CRAN for compiled packages? And if so, if this ASAN error has been seen previously? - Attachment (cran.r-project.org): valr: Genome Interval Arithmetic > Read and manipulate genome intervals and signals. Provides functionality similar to command-line tool suites within R, enabling interactive analysis and visualization of genome-scale data. Riemondy et al. (2017) doi:10.12688/f1000research.11997.1.

kent riemondy (09:39:31) (in thread): > log with ASAN error (truncated to fit in slack message) > > * using log directory '/data/gannet/ripley/R/packages/tests-gcc-SAN/valr.Rcheck' > * using R Under development (unstable) (2024-12-10 r87437) > * using platform: x86_64-pc-linux-gnu > * R was compiled by > gcc-14 (GCC) 14.2.0 > GNU Fortran (GCC) 14.2.0 > ... truncated ... > * checking tests ... [210s/289s] ERROR > Running 'testthat.R' [208s/287s] > Running the tests in 'tests/testthat.R' failed. > Complete output: > > # This file is part of the standard setup for testthat. > > # It is recommended that you do not modify it. > > # > > # Where should you do additional test configuration? > > # Learn more about the roles of various files in: > > # *[https://r-pkgs.org/tests.html](https://r-pkgs.org/tests.html)> # *[https://testthat.r-lib.org/reference/test_package.html#special-files](https://testthat.r-lib.org/reference/test_package.html#special-files)> > > library(testthat) > > library(valr) > > > > test_check("valr") > ================================================================= > ==3660193==ERROR: AddressSanitizer: dynamic-stack-buffer-overflow on address 0x7ffea4b8aaa5 at pc 0x7f3f6887be26 bp 0x7ffea4b8a980 sp 0x7ffea4b8a140 > READ of size 7 at 0x7ffea4b8aaa5 thread T0 > #0 0x7f3f6887be25 in strlen ../../../../latest/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:391 > #1 0x7f3f4476502d in cloneStringZ ucsc/common.c:35 > #2 0x7f3f447650bd in chromNameCallback ucsc/bbiRead.c:209 > #3 0x7f3f4475c1c6 in rTraverse ucsc/bPlusTree.c:255 > #4 0x7f3f447670d3 in bbiChromList ucsc/bbiRead.c:221 > #5 0x7f3f44729eae in bbiSeqLengths /tmp/RtmpvSRLei/R.INSTALL35bab815c41ec/rtracklayer/src/bbiHelper.c:7 > #6 0x7f3f4472cb7d in BWGFile_seqlengths /tmp/RtmpvSRLei/R.INSTALL35bab815c41ec/rtracklayer/src/bigWig.c:218 > #7 0x716d7c in R_doDotCall /data/gannet/ripley/R/svn/R-devel/src/main/dotcode.c:754 > #8 0x8b5122 in bcEval_loop 
/data/gannet/ripley/R/svn/R-devel/src/main/eval.c:8672 > #9 0x8709bf in bcEval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:7505 > #10 0x836f32 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1167 > #11 0x842172 in R_execClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:2393 > #12 0x84ac98 in R_execMethod /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:2562 > #13 0x7f3f66eaab98 in R_dispatchGeneric /data/gannet/ripley/R/svn/R-devel/src/library/methods/src/methods_list_dispatch.c:1151 > #14 0xa6496e in do_standardGeneric /data/gannet/ripley/R/svn/R-devel/src/main/objects.c:1348 > #15 0x8b65db in bcEval_loop /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:8057 > #16 0x8709bf in bcEval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:7505 > #17 0x836f32 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1167 > #18 0x842172 in R_execClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:2393 > #19 0x8358aa in applyClosure_core /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:2306 > #20 0x8375b6 in Rf_applyClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:2328 > #21 0x8375b6 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1280 > #22 0x8f3afb in forcePromise.part.0.lto_priv.0 /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:976 > #23 0x83858c in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1187 > #24 0x8f3afb in forcePromise.part.0.lto_priv.0 /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:976 > #25 0x87b4ff in forcePromise /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:956 > #26 0x87b4ff in getvar /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:5839 > #27 0x8aea10 in bcEval_loop /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:7852 > #28 0x8709bf in bcEval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:7505 > #29 0x836f32 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1167 > #30 0x842172 in R_execClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:2393 > #31 
0x8358aa in applyClosure_core /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:2306 > #32 0x8375b6 in Rf_applyClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:2328 > #33 0x8375b6 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1280 > #34 0x85b866 in do_set /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:3571 > #35 0x8379e6 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1232 > #36 0x847453 in do_begin /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:3000 > #37 0x8379e6 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1232 > #38 0x842172 in R_execClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:2393 > #39 0x8358aa in applyClosure_core /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:2306 > #40 0x8b36c4 in Rf_applyClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:2328 > #41 0x8b36c4 in bcEval_loop /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:8093 > #42 0x8709bf in bcEval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:7505 > #43 0x836f32 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1167 > #44 0x842172 in R_execClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:2393 > #45 0x84ac98 in R_execMethod /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:2562 > #46 0x7f3f66eaab98 in R_dispatchGeneric /data/gannet/ripley/R/svn/R-devel/src/library/methods/src/methods_list_dispatch.c:1151 > #47 0xa6496e in do_standardGeneric /data/gannet/ripley/R/svn/R-devel/src/main/objects.c:1348 > #48 0x8b65db in bcEval_loop /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:8057 > #49 0x8709bf in bcEval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:7505 > #50 0x836f32 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1167 > #51 0x842172 in R_execClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:2393 > #52 0x84ac98 in R_execMethod /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:2562 > #53 0x7f3f66eaab98 in R_dispatchGeneric 
/data/gannet/ripley/R/svn/R-devel/src/library/methods/src/methods_list_dispatch.c:1151 > #54 0xa6496e in do_standardGeneric /data/gannet/ripley/R/svn/R-devel/src/main/objects.c:1348 > #55 0x8b65db in bcEval_loop /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:8057 > #56 0x8709bf in bcEval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:7505 > #57 0x836f32 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1167 > #58 0x842172 in R_execClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:2393 > #59 0x8358aa in applyClosure_core /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:2306 > #60 0x8375b6 in Rf_applyClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:2328 > #61 0x8375b6 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1280 > #62 0x85b866 in do_set /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:3571 > #63 0x8379e6 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1232 > #64 0x847453 in do_begin /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:3000 > #65 0x8379e6 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1232 > #66 0x863a89 in do_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:3945 > #67 0x89f7f3 in bcEval_loop /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:8122 > #68 0x8709bf in bcEval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:7505 > #69 0x836f32 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1167 > #70 0x842172 in R_execClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:2393 > #71 0x8358aa in applyClosure_core /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:2306 > #72 0x8375b6 in Rf_applyClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:2328 > #73 0x8375b6 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1280 > #74 0x864bf3 in do_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:3963 > #75 0x89f7f3 in bcEval_loop /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:8122 > #76 0x8709bf in bcEval 
/data/gannet/ripley/R/svn/R-devel/src/main/eval.c:7505 > #77 0x836f32 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1167 > #78 0x842172 in R_execClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:2393 > #79 0x8358aa in applyClosure_core /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:2306 > #80 0x8612d7 in Rf_applyClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:2328 > #81 0x8612d7 in R_forceAndCall /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:2460 > #82 0x4a8bd6 in do_lapply /data/gannet/ripley/R/svn/R-devel/src/main/apply.c:75 > #83 0xa4c5ea in do_internal /data/gannet/ripley/R/svn/R-devel/src/main/names.c:1410 > #84 0x8ac86d in bcEval_loop /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:8142 > #85 0x8709bf in bcEval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:7505 > #86 0x836f32 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1167 > #87 0x842172 in R_execClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:2393 > #88 0x8358aa in applyClosure_core /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:2306 > #89 0x8375b6 in Rf_applyClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:2328 > #90 0x8375b6 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1280 > #91 0x9c3ae9 in Rf_ReplIteration /data/gannet/ripley/R/svn/R-devel/src/main/main.c:265 > #92 0x9c3ae9 in R_ReplConsole /data/gannet/ripley/R/svn/R-devel/src/main/main.c:317 > #93 0x9c4feb in run_Rmainloop /data/gannet/ripley/R/svn/R-devel/src/main/main.c:1219 > #94 0x9cf202 in Rf_mainloop /data/gannet/ripley/R/svn/R-devel/src/main/main.c:1226 > #95 0x42a0df in main /data/gannet/ripley/R/svn/R-devel/src/main/Rmain.c:29 > #96 0x7f3f6702950f in __libc_start_call_main (/lib64/libc.so.6+0x2950f) (BuildId: 8257ee907646e9b057197533d1e4ac8ede7a9c5c) > #97 0x7f3f670295c8 in __libc_start_main_alias_2 (/lib64/libc.so.6+0x295c8) (BuildId: 8257ee907646e9b057197533d1e4ac8ede7a9c5c) > #98 0x42aac4 in _start 
(/data/gannet/ripley/R/gcc-SAN3/bin/exec/R+0x42aac4) (BuildId: c86b172163d2a55ba4aa15f088ae96484901ba52) > > Address 0x7ffea4b8aaa5 is located in stack of thread T0 > SUMMARY: AddressSanitizer: dynamic-stack-buffer-overflow ../../../../latest/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:391 in strlen > Shadow bytes around the buggy address: > 0x7ffea4b8a800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0x7ffea4b8a880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0x7ffea4b8a900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0x7ffea4b8a980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0x7ffea4b8aa00: ca ca ca ca 00 cb cb cb cb cb cb cb 00 00 00 00 > =>0x7ffea4b8aa80: ca ca ca ca[05]cb cb cb cb cb cb cb 00 00 00 00 > 0x7ffea4b8ab00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0x7ffea4b8ab80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0x7ffea4b8ac00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0x7ffea4b8ac80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 0x7ffea4b8ad00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > Shadow byte legend (one shadow byte represents 8 application bytes): > Addressable: 00 > Partially addressable: 01 02 03 04 05 06 07 > Heap left redzone: fa > Freed heap region: fd > Stack left redzone: f1 > Stack mid redzone: f2 > Stack right redzone: f3 > Stack after return: f5 > Stack use after scope: f8 > Global redzone: f9 > Global init order: f6 > Poisoned by user: f7 > Container overflow: fc > Array cookie: ac > Intra object redzone: bb > ASan internal: fe > Left alloca redzone: ca > Right alloca redzone: cb > ==3660193==ABORTING > * checking package vignettes ... OK > * checking re-building of vignette outputs ... [182s/233s] OK > * DONE > Status: 1 ERROR >

Lluís Revilla (09:40:25) (in thread): > There has been a different report on R-package-devel of a similar problem: a CRAN package receiving notifications that Bioconductor packages do not pass the additional checks (memory leaks or other compiled-code problems). I don’t think Bioconductor checks as extensively as CRAN (there is no Brian Ripley on Bioconductor)

kent riemondy (09:55:40) (in thread): > thanks. do you happen to have a link to one of these discussions on R-package-devel? i can’t seem to find any reports from looking through the past years messages.

Lluís Revilla (09:57:45) (in thread): > See this thread from today:https://stat.ethz.ch/pipermail/r-package-devel/2024q4/011300.html

kent riemondy (09:59:46) (in thread): > thanks!

Lori Shepherd (10:51:02) (in thread): > @Vince Careycan you cc me on the email that was responded too? metagenomeSeq is still failing and there have been no updates to the git repository to attempt to fix

2024-12-18

Kasper D. Hansen (09:19:40) (in thread): > We have recently had a Bioconductor release and these checks are now enabled in the devel branch of Bioconductor. This means that you should expect these issues to get fixed, but we will probably have a slower timeline for this than CRAN

Kasper D. Hansen (09:21:13) (in thread): > However, stuff like this - fixing errors in specific libraries that are included inside an R package - can be extremely complicated, and it is also something the package maintainer may have no idea about how to address

Lluís Revilla (09:21:45) (in thread): > @Kasper D. HansenIs there information about these checks enabled? I wasn’t aware of changes regarding new checks on Bioconductor build machines

Kasper D. Hansen (09:23:01) (in thread): > ok, let me back pedal a little bit actually. I don’t know about this specific check, but we have a number of new checks in R-devel that are now being used because we switched to R-devel

Kasper D. Hansen (09:23:54) (in thread): > So anything that is standard in R-devel is also being used now. Sometimes “we” also enable specific checks if they are deemed important

Kasper D. Hansen (09:24:29) (in thread): > But for example, the check for using the Rf_ prefix in calls to R functions in C code is now being run in Bioc devel

Kasper D. Hansen (09:26:42) (in thread): > Yeah, I looked at @kent riemondy’s original post and I’m quite sure we are not running all of these CRAN tests. I do think we should consider running some of them.

kent riemondy (10:02:14) (in thread): > Thanks, glad to hear these checks are being rolled out and considered. It’s unclear to me if there are new policies on CRAN related to these compiled-code checks, as we have had this dependency for many years without seeing this error. > > Also, agreed that fixing the error is not trivial. The code in question is common code used in the core UCSC library (https://github.com/ucscGenomeBrowser/kent/blob/4d1c370b871b21ba22113be8a13e0cda4275028b/src/lib/common.c#L32) that hasn’t changed in 10+ years.
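For readers following along, the pattern ASAN is flagging here is `strlen()` being called on a fixed-size buffer that is not guaranteed to be NUL-terminated (the traceback shows `cloneStringZ` reading a chromosome-name buffer handed over by the B+ tree traversal). The sketch below is a hypothetical bounded variant to illustrate the safe pattern; it is not the actual UCSC/rtracklayer fix, and `clone_string_n` is an invented name:

```c
/* Hypothetical safe variant of a string-clone helper: never read past
 * `size` bytes, and always NUL-terminate the copy. strnlen() stops at
 * `size`, so a non-terminated buffer cannot trigger an over-read. */
#include <stdlib.h>
#include <string.h>

char *clone_string_n(const char *s, size_t size) {
    size_t len = strnlen(s, size);   /* bounded, unlike strlen(s) */
    char *out = malloc(len + 1);
    if (out) {
        memcpy(out, s, len);
        out[len] = '\0';
    }
    return out;
}
```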

Lluís Revilla (10:17:13) (in thread): > @kent riemondy the ASAN and other additional checks are not stable and are updated very frequently. Some of these issues, like yours or the one in the linked thread on r-package-devel, have been present on Bioconductor for at least one release or more. So it is the improved compilers and checks on CRAN that surfaced them.

Kasper D. Hansen (11:32:54) (in thread): > Also, the Kent code for bigWig is the reference implementation, but I would say Kent has a reputation for writing code that runs well on the UCSC system and not caring too much about whether it is portable to other systems.

2024-12-19

Dirk Eddelbuettel (08:06:14): > CRAN informed me that the upload of package BH (which I tend to update once a year even though the underlying Boost.org releases three times a year) led to a new build error in Bioconductor package fgsea, which has a hard-wired ‘C++11’ compilation standard now conflicting with Boost headers requiring a minimum of C++14. Editing DESCRIPTION is all it took. Can you make an interim release of the package? It is itself a dependency of a number of other CRAN packages.
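For reference, the kind of one-line change Dirk describes is sketched below. Dirk mentions editing DESCRIPTION; whether a package pins the standard there or in `src/Makevars` varies, so both variants are shown (the exact fgsea diff is not in the thread):

```
# DESCRIPTION — before (forces C++11, conflicting with newer Boost):
SystemRequirements: C++11
# after — require C++14, or drop the field entirely
# (recent R defaults to C++17 where available):
SystemRequirements: C++14

# src/Makevars — the equivalent change if the standard is set there:
CXX_STD = CXX14
```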

Lori Shepherd (08:08:13) (in thread): > has the fgsea maintainer been contacted to make the change?

Dirk Eddelbuettel (08:14:46) (in thread): > Not by me, maybe by CRAN, but I essentially just woke up to their email, verified the issue and suggested fix before going on a run. I can send an email if you want me to; I do not know if you all prefer internal channels with a CC somewhere for traceability….

Lori Shepherd (08:16:40) (in thread): > if you could, since we are not aware of the issue, that would be great. Please cc bioconductorcoreteam@gmail.com and we can help follow up

Dirk Eddelbuettel (08:43:36) (in thread): > Done!

kent riemondy (19:04:29) (in thread): > Ok, thank you both for the clarification and additional context.

2024-12-21

Davide (14:02:14): > @Davide has joined the channel

2024-12-28

Michael Hungbo (07:29:30): > @Michael Hungbo has joined the channel

Pascal-Onaho (07:55:11): > @Pascal-Onaho has joined the channel

2024-12-29

Yahya Jahun (04:01:33): > @Yahya Jahun has joined the channel

2024-12-30

Jeroen Ooms (16:08:16): > I see a lot of packages failing to build on Windows that have removed the dependency on zlibbioc. Note that if your package uses zlib and you remove zlibbioc, you must add PKG_LIBS = -lz to your src/Makevars instead, for example: https://github.com/Bioconductor/XVector/pull/5/files
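A minimal sketch of such a Makevars, assuming the package only needs zlib and has no other link flags (merge with any existing PKG_LIBS rather than replacing it):

```make
# src/Makevars (and src/Makevars.win for Windows): link the system zlib
# directly now that the package no longer depends on zlibbioc
PKG_LIBS = -lz
```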

2024-12-31

Vince Carey (05:12:26): > Noted. I hope@Hervé Pagèscan have a look Jan 2 or earlier.

Jeroen Ooms (11:15:53) (in thread): > Same issue for GrafGen, affyio, affyPLM and HiCDOC

2025-01-01

Pariksheet Nanda (15:47:54): > @Pariksheet Nanda has joined the channel

2025-01-02

George Kitundu (00:52:28): > @George Kitundu has joined the channel

2025-01-03

Gloria Amor Arroyo (05:35:57): > @Gloria Amor Arroyo has joined the channel

2025-01-07

Vince Carey (15:01:39): > This is being addressed by notifying developers.

2025-01-08

Kasper D. Hansen (16:10:17): > Right now, for Bioc-devel, on the bigsur-arm64 platform, the XML package is missing as a binary, and so are BSgenome and rtracklayer. Could someone remind me what is going on here? I could find some old messages saying that (essentially) R-core no longer wants to maintain XML and wants people to use other parsers.

Kasper D. Hansen (16:11:51): > This has some downstream impact. When I moved to Apple Silicon I started to rely heavily on the binaries, because it was getting a bit complicated to get the full toolchain install on macOS. Perhaps this was a bad decision on my part, but it does mean that a missing binary can have a lot of impact when you develop on this platform

Dirk Eddelbuettel (16:16:38) (in thread): > XML is still on CRAN, as it has been for decades, and was just updated this week. It has been in maintenance mode for years, but not been removed or archived likely because of dependencies.

Dirk Eddelbuettel (16:18:21) (in thread): > And for what it is worth the Debian builder page for the r-cran-xml package (that I look after, based on CRAN package XML) has arm64 binaries too.https://buildd.debian.org/status/package.php?p=r-cran-xmlSo no apparent ‘technical’ reason you can’t have XML on arm64 that I can see at a quick glance. CRAN has binaries for both macOS flavors too:https://cran.r-project.org/package=XML

Kasper D. Hansen (16:26:07) (in thread): > Thanks

Kasper D. Hansen (16:40:24) (in thread): > So digging around a bit, I think it is because CRAN does not have binaries for this platform for devel. The binaries listed on the page above are in contrib/4.4 whereas BiocManager looks in contrib/4.5

Kasper D. Hansen (16:43:14): > Ok, so repeating here instead of in the thread: looks like CRAN is not producing binaries for devel (R 4.5) for macOS? And I am guessing that is why there is no binary for BSgenome which otherwise builds well (with WARNINGS). rtracklayer has some other issues in devel.

Dirk Eddelbuettel (17:06:39) (in thread): > You may need to talk to Simon. I found that these builds can at times be laggy. They happen though, “eventually”.

Kasper D. Hansen (17:33:17) (in thread): > Thanks, I asked on r-sig-mac

Lori Shepherd (21:07:36): > CRAN normally rolls out binaries for R-devel (4.5) throughout the months leading up to the release so some are available and others have not been produced yet.

Kasper D. Hansen (21:21:32): > It makes sense that they need to wait to settle on the final toolchain and that they don’t want to do the builds until the toolchain has been resolved. I guess this is what I will get told again on r-sig-mac. Guess I need to get back to figuring out how to get the current devel tool chain installed.

2025-01-09

Ludwig Lautenbacher (08:53:58): > Hi all, is there a way to get notified of the build pipeline for my bioconductor package or do I have to periodically check if the package is updated?

Lori Shepherd (08:57:42) (in thread): > what do you mean by build pipeline and updated? The builds happen automatically for release and devel; the timeline is on the overview page: https://bioconductor.org/checkResults/ . You (assuming you are listed as the maintainer in the DESCRIPTION) will be automatically notified if the package begins to fail on the Linux platform. The package will only be “updated” if you push a version bump, and after a release with a core team bump

Sean Davis (08:59:21) (in thread): > In addition to Lori’s note that you’ll be emailed if you package starts to fail, you can use tooling in BiocPkgTools to build your own reporting. If you want something quick and easy, you can usehttps://seandavi.github.io/BiocPkgTools/reference/problemPage.html - Attachment (seandavi.github.io): generate hyperlinked HTML for build reports for Bioc packages — problemPage > This is a quick way to get an HTML report of packages maintained by a specific developer > or which depend directly on a specified package. The function is keyed to filter based on either > the maintainer name or by using the ‘Depends’, ‘Suggests’ and ‘Imports’ fields in package descriptions.
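A usage sketch of the function Sean links; the maintainer-name pattern below is a placeholder, and the default output is an HTML widget of failing builds:

```r
## Sketch: report build problems for packages whose maintainer name
## matches a pattern (placeholder value shown)
library(BiocPkgTools)
problemPage(authorPattern = "Lautenbacher")
```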

Dirk Eddelbuettel (09:02:02) (in thread): > CRAN always builds for r-release, r-devel, and r-oldrel on both Windows (these days only 64-bit, it used to also do 32-bit) and macOS (both x86_64/amd64 (“intel”) and arm64 (“m*“)). See e.g. for BiocManager: https://cran.r-project.org/package=BiocManager. But both the Windows and macOS builds seem to have a “bus factor” of one, so if something stalls it may not be addressed immediately. But the machinery is up and running: I just checked a package of mine updated yesterday, it already has all four macOS binaries. > > So maybe something is up that is specific to package XML.

Lori Shepherd (09:05:17) (in thread): > even on the BiocManager package you referenced there is no r-devel for mac binaries… the mac binaries tend to be slower to appear for R-devel and in our experience happen over time…

Kasper D. Hansen (09:06:48) (in thread): > Simon answered on r-sig-mac

Kasper D. Hansen (09:07:04) (in thread): > He has not started the builds for macOS for R-devel yet, but hopes to do so within the week

Kasper D. Hansen (09:07:28) (in thread): > Apparently the issue is disk space on the build machine. Seems like something that should be solvable somehow

Dirk Eddelbuettel (09:10:34) (in thread): > Yes, my bad. Windows always has r-devel, but macOS does not, including for my package. Sorry about my clearly false statement. R-universe does, though, and I seem to look often enough at its builds to have mixed that up. Truly sorry.

Ludwig Lautenbacher (09:17:31) (in thread): > The https://bioconductor.org/checkResults/ page is very close to what I was looking for. The only point I’m missing is a list of packages that are scheduled for the next build. Does that exist? > Do I understand correctly that the release is updated twice a week on Mondays and Thursdays while devel is updated daily?

Sean Davis (09:18:00) (in thread): > All packages are built every day (devel).

Lori Shepherd (09:18:43) (in thread): > yes Devel is built daily and release is on Monday and Thursdays and that is across all packages

Ludwig Lautenbacher (09:20:27) (in thread): > Perfect thank you! And does the list of packages scheduled for the next build exist or do I have to just wait and see?

Lluís Revilla (09:20:37) (in thread): > If it helps, I subscribed to the RSS feed here on Slack; every time there is a new build I get a DM with the result (I don’t remember how I set it up, as it was years ago)

Sean Davis (09:21:35) (in thread): > @Ludwig LautenbacherAll packages in the release/devel cycle are built during a build cycle.

Ludwig Lautenbacher (09:22:51) (in thread): > @Sean DavisI thought only packages which had a version bump?

Ludwig Lautenbacher (09:23:08) (in thread): > @Lori Shepherdwhat did you mean in your inital message with “core team bump”?

Lori Shepherd (09:24:32) (in thread): > all packages are built/checked every run. > Only packages with a version bump will propagate to end users / be made available for download via BiocManager. > > Twice a year there is a Bioconductor release. All software packages get a version bump at that time by the core team so that the new release has a new version number for tracking

Ludwig Lautenbacher (09:28:42) (in thread): > I think I get it now! Thank you all for the clarification!

2025-01-10

Federico Marini (07:02:49): > In case your GHA are/will be failing…https://bsky.app/profile/gaborcsardi.org/post/3lfeu6tzr5c2l - Attachment (bluesky): Attachment > PSA: GitHub Actions switched ubuntu-latest to ubuntu-24.04, which does not have #Rstats pre-installed. Use r-lib/actions/setup-r@v2 (see https://github.com/r-lib/actions) to install it, or use a Docker container, e.g.  > docker run https://ghcr.io/r-hub/r-minimal/r-minimal:latest R …
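A minimal workflow sketch along the lines of the linked PSA; the action versions and the final check commands are illustrative, not taken from any particular repo:

```yaml
# .github/workflows/check.yaml — install R explicitly, since
# ubuntu-latest (now 24.04) no longer ships with R preinstalled
on: push
jobs:
  R-CMD-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: r-lib/actions/setup-r@v2
        with:
          r-version: 'release'
      - run: R CMD build . && R CMD check *.tar.gz
```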

2025-01-11

Hervé Pagès (18:56:49) (in thread): > A few things: > * This is a conversation about builds. I suppose @Andres Wokaty is not following this channel (#developers-forum), otherwise I’m sure they would have clarified the situation about the missing macOS binaries on CRAN for R-devel. Thanks @Lori Shepherd for stepping in and clarifying. > * BTW this is not a new situation at all. For at least the last 10 years CRAN has started to produce macOS package binaries for R-devel only a few weeks before the new R release, typically 4-6 weeks before. > * Not something that makes our lives easier from a build system point of view, but we’ve learned how to handle it and we’ve been dealing with it for many years. Note that XML and other difficult CRAN packages are installed on the Mac builders (thanks to @Andres Wokaty). So this is not the reason why there are no macOS binaries for BSgenome or rtracklayer at the moment. > * The reason rtracklayer fails to build on macOS and kunpeng2 is clearly indicated on the build report: it’s because microRNA is not available on these machines. And the reason microRNA is not available is because it has a compilation error on these machines. See for example https://bioconductor.org/checkResults/3.21/bioc-LATEST/microRNA/lconway-install.html > * rtracklayer’s maintainer Michael Lawrence has been notified about this situation: https://github.com/lawremi/rtracklayer/issues/136. > Any chance rtracklayer can get rid of that dep, @Michael Lawrence? > Thanks!

2025-01-12

Federico Marini (05:37:44): > https://github.com/docker/for-mac/issues/7520 - Attachment: #7520 [Workaround in description] Mac is detecting Docker as a malware and keeping it from starting > Description > > Whenever Docker is started, this error is shown: > > > Malware Blocked. “com.docker.socket” was not opened because it contains malware. this action did not harm your Mac. > > Reproduce > > 1. Start Docker > 2. See the error > Image > > Workaround > > Tip > > If you face this issue, try the following procedure: > > 1. Quit Docker Desktop and check that no remaining docker processes are running using the Activity Monitor > 2. Run the following commands: > > #!/bin/bash > > # Stop the docker services > echo “Stopping Docker…” > sudo pkill ‘[dD]ocker’ > > # Stop the vmnetd service > echo “Stopping com.docker.vmnetd service…” > sudo launchctl bootout system /Library/LaunchDaemons/com.docker.vmnetd.plist > > # Stop the socket service > echo “Stopping com.docker.socket service…” > sudo launchctl bootout system /Library/LaunchDaemons/com.docker.socket.plist > > # Remove vmnetd binary > echo “Removing com.docker.vmnetd binary…” > sudo rm -f /Library/PrivilegedHelperTools/com.docker.vmnetd > > # Remove socket binary > echo “Removing com.docker.socket binary…” > sudo rm -f /Library/PrivilegedHelperTools/com.docker.socket > > # Install new binaries > echo “Install new binaries…” > sudo cp /Applications/Docker.app/Contents/Library/LaunchServices/com.docker.vmnetd /Library/PrivilegedHelperTools/ > sudo cp /Applications/Docker.app/Contents/MacOS/com.docker.socket /Library/PrivilegedHelperTools/ > > 1. Restart Docker Desktop. > > If that still doesn’t work, download one of the currently supported release from the Release notes and re-apply step 2. > > As suggested <https://github.com/docker/for-mac/issues/7520#issuecomment-2578291149|running this command> is working for most of people that had this problem. 
> > Original issue details### docker version > > Client: > Version: 26.1.4 > API version: 1.45 > Go version: go1.21.11 > Git commit: 5650f9b > Built: Wed Jun 5 11:26:02 2024 > OS/Arch: darwin/arm64 > Context: desktop-linux > Cannot connect to the Docker daemon at unix:///Users/admin/.docker/run/docker.sock. Is the docker daemon running? > > (Can’t get docker started to check more details) > > —- > Asked for a friend running Docker in the same version and this is the output: > > Client: > Version: 27.0.3 > API version: 1.46 > Go version: go1.21.11 > Git commit: 7d4bcd8 > Built: Fri Jun 28 23:59:41 2024 > OS/Arch: darwin/arm64 > Context: desktop-linux > > Server: Docker Desktop 4.32.0 (157355) > Engine: > Version: 27.0.3 > API version: 1.46 (minimum version 1.24) > Go version: go1.21.11 > Git commit: 662f78c > Built: Sat Jun 29 00:02:44 2024 > OS/Arch: linux/arm64 > Experimental: false > containerd: > Version: 1.7.18 > GitCommit: ae71819c4f5e67bb4d5ae76a6b735f29cc25774e > runc: > Version: 1.7.18 > GitCommit: v1.1.13-0-g58aa920 > docker-init: > Version: 0.19.0 > GitCommit: de40ad0 > > docker info > > lient: > Version: 27.0.3 > Context: desktop-linux > Debug Mode: false > Plugins: > buildx: Docker Buildx (Docker Inc.) > Version: v0.15.1-desktop.1 > Path: /Users/lorenzo/.docker/cli-plugins/docker-buildx > compose: Docker Compose (Docker Inc.) > Version: v2.28.1-desktop.1 > Path: /Users/lorenzo/.docker/cli-plugins/docker-compose > debug: Get a shell into any image or container (Docker Inc.) > Version: 0.0.32 > Path: /Users/lorenzo/.docker/cli-plugins/docker-debug > desktop: Docker Desktop commands (Alpha) (Docker Inc.) > Version: v0.0.14 > Path: /Users/lorenzo/.docker/cli-plugins/docker-desktop > dev: Docker Dev Environments (Docker Inc.) > Version: v0.1.2 > Path: /Users/lorenzo/.docker/cli-plugins/docker-dev > extension: Manages Docker extensions (Docker Inc.) 
> Version: v0.2.25 > Path: /Users/lorenzo/.docker/cli-plugins/docker-extension > feedback: Provide feedback, right in your terminal! (Docker Inc.) > Version: v1.0.5 > Path: /Users/lorenzo/.docker/cli-plugins/docker-feedback > init: Creates Docker-related starter files for your project (Docker Inc.) > Version: v1.3.0 > Path: /Users/lorenzo/.docker/cli-plugins/docker-init > sbom: View the packaged-based Software Bill Of Materials (SBOM) for an image (Anchore Inc.) > Version: 0.6.0 > Path: /Users/lorenzo/.docker/cli-plugins/docker-sbom > scout: Docker Scout (Docker Inc.) > Version: v1.10.0 > Path: /Users/lorenzo/.docker/cli-plugins/docker-scout > > Server: > Containers: 10 > Running: 9 > Paused: 0 > Stopped: 1 > Images: 41 > Server Version: 27.0.3 > Storage Driver: overlay2 > Backing Filesystem: extfs > Supports d_type: true > Using metacopy: false > Native Overlay Diff: true > userxattr: false > Logging Driver: json-file > Cgroup Driver: cgroupfs > Cgroup Version: 2 > Plugins: > Volume: local > Network: bridge host ipvlan macvlan null overlay > Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog > Swarm: inactive > Runtimes: io.containerd.runc.v2 runc > Default Runtime: runc > Init Binary: docker-init > containerd version: ae71819c4f5e67bb4d5ae76a6b735f29cc25774e > runc version: v1.1.13-0-g58aa920 > init version: de40ad0 > Security Options: > seccomp > Profile: unconfined > cgroupns > Kernel Version: 6.6.32-linuxkit > Operating System: Docker Desktop > OSType: linux > Architecture: aarch64 > CPUs: 12 > Total Memory: 7.657GiB > Name: docker-desktop > ID: 1e75072f-7d8f-47c3-917a-43dc08d31755 > Docker Root Dir: /var/lib/docker > Debug Mode: false > HTTP Proxy: http.docker.internal:3128 > HTTPS Proxy: http.docker.internal:3128 > No Proxy: hubproxy.docker.internal > Labels: > com.docker.desktop.address=unix:///Users/lorenzo/Library/Containers/com.docker.docker/Data/docker-cli.sock > Experimental: false > Insecure Registries: > 
hubproxy.docker.internal:5555 > 127.0.0.0/8 > Live Restore Enabled: false > > Diagnostics ID > > Can’t get a Diagnostics ID because I’m not able to open docker, the error is from MacOS > > Additional Info > > I tried installing older versions of Docker but the error is the same to all of them.

Federico Marini (05:37:50): > couple of PSAs these days..

Federico Marini (05:40:04): > –> https://docs.docker.com/desktop/cert-revoke-solution/ - Attachment (Docker Documentation): Fix startup issue for Mac > Learn how to resolve issues affecting macOS users of Docker Desktop, including startup problems and false malware warnings, with upgrade, patch, and workaround solutions.

Charlotte Soneson (07:31:35) (in thread): > Re: microRNA, I have reached out to the current maintainer (I believe the email address in the DESCRIPTION file is no longer active). It looks like it may just be a matter of replacing one no-longer-remapped mkCharLen by Rf_mkCharLen.

2025-01-13

Hervé Pagès (12:51:25) (in thread): > Thanks@Charlotte Soneson!

2025-01-15

Claire Rioualen (05:37:49): > Hi, I’m currently working with package metadata and have a couple of questions: :smiley: > * Can I specify a Bioconductor version for the biocViewsVocab from the biocViews package? I can’t find the information in the manual. If not, is it safe to assume it’s updated according to Bioconductor versions, and in line with BiocManager::version()? > * Can I extract a specific subset of the vocabulary? E.g. vocabulary related only to Software and not Data? > * How can I get the information, for all packages, of when they were first added to the Bioconductor repository? That would be the information displayed as “in Bioconductor since” on each package webpage; however, I can’t find it using BiocPkgTools. > Thanks in advance :)

Lori Shepherd (08:27:15) (in thread): > * It’s safe to assume it’s updated according to the Bioconductor versions. We really don’t change the vocab often unless there is a requested new term. That is reflected with the minor version bumps in between releases (and generally documented in the NEWS file when a new term is added). > * See https://www.bioconductor.org/packages/release/bioc/vignettes/biocViews/inst/doc/HOWTO-BCV.html#querying-a-repository which talks about creating a sublist. > * Give me a few minutes and I’ll post more on the third question…
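The vocabulary-subset question above can be sketched in a few lines. A hedged sketch, assuming the getSubTerms() helper and biocViewsVocab data object that biocViews documents; since the ontology ships inside the installed package, it tracks whatever version BiocManager::version() installed:

```r
# Hedged sketch: subset the biocViews vocabulary to the "Software" branch.
library(biocViews)

data(biocViewsVocab)  # ontology graph bundled with the installed biocViews

# getSubTerms() walks the graph below a given term; here, everything
# under "Software" (i.e. excluding the Annotation/Experiment branches).
software_terms <- getSubTerms(biocViewsVocab, term = "Software")
head(software_terms)
```

The same call with term = "AnnotationData" or "ExperimentData" would give the other branches.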

Lori Shepherd (08:44:56) (in thread): > The years in Bioc are a little trickier; if there isn’t something in BiocPkgTools, there probably should be. Currently we parse the Bioconductor manifest files for the badge on the landing page. This however presents some challenges if you start to evaluate non-software packages, as we didn’t keep a manifest file for, say, annotation packages. The webstats for packages tries to be better about a package list and evaluates the packages available via the PACKAGES file in our legacy releases, but it isn’t calculating a “since” or keeping track of that information; it just wants the official package list for a given release.

Lori Shepherd (09:27:10) (in thread): > I’m looking at converting this to an R function that I could contribute to BiocPkgTools…

Claire Rioualen (09:56:31) (in thread): > Thanks a lot, this is very helpful!:pray:

2025-01-17

Julian Stamp (10:02:24): > @Julian Stamp has joined the channel

Julian Stamp (10:08:08): > Hi, > > I am not sure whether I am in the right place for my question; please feel free to refer me elsewhere. > I just published a package on CRAN that has a Bioconductor dependency (Rhdf5lib). All R CMD checks that I did on all platforms are working. Both Linux and Windows are working on CRAN too, but the macOS build fails with an error message that points to the Rhdf5lib dependency (https://www.r-project.org/nosvn/R.check/r-release-macos-arm64/smer-00install.html). Does anyone have an idea why the macOS arm64 build would be different with respect to Bioconductor dependencies? > > I was also looking for check results for Rhdf5lib on Bioconductor but could not find them. Are there any publicly available?

Lluís Revilla (10:57:17) (in thread): > Checks on Bioconductor for Rhdf5lib are at https://bioconductor.org/checkResults/release/bioc-LATEST/Rhdf5lib/. I am not familiar with C dependencies and namespaces, but is _H5Treclaim related to Rhdf5lib? It could also be a problem on the CRAN machine, but I recently saw some flags on macOS for packages with C code (I think it was this: https://github.com/R-macos/recipes or something similar). Good luck!

Julian Stamp (11:07:31) (in thread): > Thank you for the quick response! > > You are right, _H5Treclaim comes from a header-only library that I package with the R package in inst/include. I apologize, it does not seem to be Rhdf5lib related.

2025-01-20

Robert Castelo (12:25:49): > Hi, > > More than a week ago I fixed the package gDNAx in both devel (1.5.1) and release (1.4.1); both versions build, check and install correctly without errors or warnings, except for a TIMEOUT on ‘taishan’ for the release version (see here and here). However, while the tarball and binary versions of the package have propagated correctly for the devel version, this did not happen with the release version (see the landing page here), where the package version is still 1.4.0. Since the release version is giving a TIMEOUT on ‘taishan’, I suspect that it might have to do with that, but if I recall correctly, in the past, if a package did not build on one of the platforms (this was often Windows), the new version would still propagate on the platforms where it built correctly. The TIMEOUT is difficult to interpret, because it seems to be happening during vignette creation, but no other clue is given (see the report page here). Is there anything I could try to do to help propagate the release version of the package?

Vince Carey (12:29:04): > @Martin Grigorov^^

Martin Grigorov (12:48:21): > Thanks for the ping! I’ll check it tomorrow!

2025-01-21

Martin Grigorov (06:26:11) (in thread): > > > BiocManager::install("gDNAx", force = TRUE) > Bioconductor version 3.20 (BiocManager 1.30.25), R 4.4.2 (2024-10-31) > Installing package(s) 'gDNAx' > trying URL '[https://bioconductor.org/packages/3.20/bioc/src/contrib/gDNAx_1.4.1.tar.gz](https://bioconductor.org/packages/3.20/bioc/src/contrib/gDNAx_1.4.1.tar.gz)' > Content type 'application/x-gzip' length 752585 bytes (734 KB) > ================================================== > downloaded 734 KB > > * installing **source** package 'gDNAx' ... > **** package 'gDNAx' successfully unpacked and MD5 sums checked > **** using staged installation > **** R > **** inst > **** byte-compile and prepare package for lazy loading > **** help > ***** installing help indices > **** building package indices > **** installing vignettes > **** testing if installed package can be loaded from temporary location > **** testing if installed package can be loaded from final location > **** testing if installed package keeps a record of temporary installation path > * DONE (gDNAx) > > The downloaded source packages are in > '/home/biocbuild/tmp/Rtmpa4IyzH/downloaded_packages' > > It installed without any problem! I guess it should be OK in the next run!

Robert Castelo (08:13:03) (in thread): > Hi, great, so today the landing page finally shows the tarball and binaries of version 1.4.1. Thanks!

2025-01-23

Julian Stamp (13:05:50) (in thread): > I raised this issue with HighFive and heard the response that the symbol _H5Treclaim is part of HDF5, not HighFive (https://github.com/BlueBrain/HighFive/discussions/1073). Would this point to it being an issue with the CRAN environment or the way I set up the Makevars file?

2025-01-30

Pariksheet Nanda (11:29:38): > Is the code browser down for anyone else?https://code.bioconductor.org/browse/

Lori Shepherd (11:33:55) (in thread): > It’s working for me, but if anyone else is having issues please let us know and we can look further into it.

Pariksheet Nanda (11:34:21) (in thread): > It’s working again for me!

2025-01-31

Tim Triche (09:53:05) (in thread): > likewise

Bastien CHASSAGNOL (17:11:37): > @Bastien CHASSAGNOL has joined the channel

2025-02-03

Joseph (19:01:19): > Hi everyone! So for my lab, we currently have a regular GitHub account which we used for our initial “OutSplice” package submission to Bioconductor. We are now considering switching the regular account to an organization and I was wondering if anyone knew if this would pose any sort of issue with Bioconductor’s connection with our GitHub account. I am assuming not, but I wanted to check anyway! Thanks for the help!

Lori Shepherd (19:35:30): > If the package is already in Bioconductor then it does not affect it at all. After acceptance, Bioconductor only knows of the git.bioconductor.org location.

2025-02-05

Hiru (06:33:49): > Hello everyone, I attempted to send an email to bioc-devel@r-project.org, but it seems that the body of my email remains empty upon delivery. I am sharing the email contents here in hopes of receiving assistance. Subject: Segfault in viewMeans from IRanges on R-devel with Bioc-devel I am reaching out to seek advice regarding an issue with one of the Bioconductor packages I maintain, EpiCompare. Recently, I observed that the package fails checks and tests when running on R-devel with Bioc-devel packages. > > The primary concern is that the traceback indicates a segmentation fault occurring within the C code for the viewMeans function in the widely-used IRanges package. However, I noticed that this function has not been modified in years, which adds to the confusion. > > For context: > • Both the current and past devel versions of EpiCompare, which pass checks on the current R and Bioconductor versions, exhibit the same failure on R-devel. > • I have reproduced this issue using GitHub Actions and on my personal Linux machine, which confirms its consistency across environments. > • The affected portion of my code has not been changed for a considerable amount of time, making it challenging to pinpoint the root cause of the issue. > > Full traceback: > ***** caught segfault ***** > address 0x1, cause 'memory not mapped' > > Traceback: > 1: .Call2("C_viewMeans_RleViews", trim(x), na.rm, PACKAGE = "IRanges") > 2: viewMeans(v, na.rm = na.rm) > 3: viewMeans(v, na.rm = na.rm) > 4: viewMeans2(v, na.rm = na.rm) > 5: FUN(X[[i]], ...) > 6: lapply(names(numvar), function(seqname) { v <- Views(numvar[[seqname]], bins_per_chrom[[seqname]]) viewMeans2(v, na.rm = na.rm)}) > 7: lapply(names(numvar), function(seqname) { v <- Views(numvar[[seqname]], bins_per_chrom[[seqname]]) viewMeans2(v, na.rm = na.rm)}) > 8: GenomicRanges::binnedAverage(bins = gr_windows, numvar = data_cov, varname = "score", na.rm = FALSE) > 9: FUN(X[[i]], ...) > 10: lapply(X = S, FUN = FUN, ...) 
> 11: doTryCatch(return(expr), name, parentenv, handler) > 12: tryCatchOne(expr, names, parentenv, handlers[[1L]]) > 13: tryCatchList(expr, classes, parentenv, handlers) > 14: tryCatch(expr, error = function(e) { call <- conditionCall(e) if (!is.null(call)) { if (identical(call[[1L]], quote(doTryCatch))) call <- sys.call(-4L) dcall <- deparse(call, nlines =1L) prefix <- paste("Error in", dcall, ": ") LONG <- 75L sm <- strsplit(conditionMessage(e),"\n")[[1L]] w <- 14L + nchar(dcall, type = "w") + nchar(sm[1L], type = "w") if (is.na(w)) w <- 14L + nchar(dcall, type = "b") + nchar(sm[1L], type = "b") if (w > LONG) prefix <- paste0(prefix, "\n ") } else prefix <- "Error : " msg <- paste0(prefix, conditionMessage(e),"\n") .Internal(seterrmessage(msg[1L])) if (!silent && isTRUE(getOption("show.error.messages"))) { cat(msg, file = outFile) .Internal(printDeferredWarnings()) } invisible(structure(msg, class = "try-error", condition = e))}) > 15: try(lapply(X = S, FUN = FUN, ...), silent = TRUE) > 16: sendMaster(try(lapply(X = S, FUN = FUN, ...), silent = TRUE)) > 17: FUN(X[[i]], ...) > 18: lapply(seq_len(cores), inner.do) > 19: apply_fun(FUN = FUN, X, ...) > 20: bpplapply(X = peakfiles, workers = workers, FUN = function(gr) { gr <- compute_percentiles(gr = gr,thresholding_cols = intensity_cols, initial_threshold = 0) gr_names <- names(GenomicRanges::mcols(gr)) intens_col <- gr_names[gr_names %in% paste(intensity_cols, "percentile", sep = "_")][1] data_cov <-GenomicRanges::coverage(gr, weight = intens_col) rm(gr) gr <- GenomicRanges::binnedAverage(bins = gr_windows,numvar = data_cov, varname = "score", na.rm = FALSE) return(gr$score)}, ...) > 21: rebin_peaks(peakfiles = peakfiles, genome_build = "hg19", bin_size = 5000, workers = 1) > > The error is reproducible with built-in data. 
For more details, please refer to the GitHub issue here: https://github.com/neurogenomics/EpiCompare/issues/155 As a relatively new programmer, I find this situation puzzling and would greatly appreciate any guidance or suggestions on how to approach debugging or resolving this problem. - Attachment: #155 rebin_peaks: Segfault (memory not mapped) for R-devel v4.5.0 > 1. Bug description > > All functions which call rebin_peaks fail under R-devel v4.5.0 (tested with 2025-01-19 r87600 and prior). Everything works as expected with the latest R-release. > > 2. Reproducible example > > Running example for rebin_peaks > > devtools::load_all() > #> :information_source: Loading EpiCompare > #> Warning: replacing previous import ‘Biostrings::pattern’ by ‘grid::pattern’ > #> when loading ‘genomation’ > > data(“CnR_H3K27ac”) > data(“CnT_H3K27ac”) > peakfiles <- list(CnR_H3K27ac=CnR_H3K27ac, CnT_H3K27ac=CnT_H3K27ac) > > peakfiles_rebinned <- rebin_peaks(peakfiles = peakfiles, > genome_build = “hg19”, > bin_size = 5000, > workers = 1) > #> Standardising peak files in 647,114 bins of 5,000 bp. > #> Warning in apply_fun(FUN = FUN, X, …): scheduled cores 1, 2 did not deliver > #> results, all values of the jobs will be affected > #> Merging data into matrix. > #> Error in `rownames<-`(`*tmp*`, value = c(“chr1:1-5000”, “chr1:5001-10000”, : attempt to set ‘rownames’ on an object with no dimensions > > Created on 2025-01-20 with reprex v2.1.1 > > Data > > (Built-in) > > 3.
Session info > > utils::sessionInfo() > #> R Under development (unstable) (2025-01-19 r87600) > #> Platform: x86_64-pc-linux-gnu > #> Running under: Ubuntu 22.04.5 LTS > #> > #> Matrix products: default > #> BLAS: /home/hd423/downloads/R-devel/lib/libRblas.so > #> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0 LAPACK version 3.10.0 > #> > #> locale: > #> [1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8
> #> [4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8
> #> [7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C
> #> [10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C
> #> > #> time zone: Etc/UTC > #> tzcode source: system (glibc) > #> > #> attached base packages: > #> [1] stats graphics grDevices utils datasets methods base
> #> > #> loaded via a namespace (and not attached): > #> [1] compiler_4.5.0 fastmap_1.2.0 cli_3.6.3 tools_4.5.0
> #> [5] htmltools_0.5.8.1 withr_3.0.2 fs_1.6.5 glue_1.8.0
> #> [9] yaml_2.3.10 rmarkdown_2.29 knitr_1.49 reprex_2.1.1
> #> [13] xfun_0.50 digest_0.6.37 lifecycle_1.0.4 rlang_1.1.5
> #> [17] evaluate_1.0.3 > > Created on 2025-01-20 with reprex v2.1.1

Hiru (06:35:39) (in thread): > It appears that the spoon package is also failing for a similar reason, though the error originates from a different dependency package. Reference: Email with the subject “[Bioc-devel] spoon problems reported Multiple platform build/check BioC 3.21”

Lluís Revilla (09:03:14) (in thread): > Even if the function hasn’t been modified in years, it could be that it has had problems for years (or that the R or C standards changed since then). Having said that, could you reduce the reproducible example to only use IRanges viewMeans and still segfault? It would help to detect where the issue might be.

2025-02-06

Kasper D. Hansen (07:16:49): > I have done relatively little research on the following question so for now I am just looking for a quick answer. I have gotten a report from a CRAN developer that his package fails on CRAN due to a sanitizer issue with Rgraphviz, its dependency.

Kasper D. Hansen (07:18:21): > How do I get a system where I can run gcc-ASAN (or clang-ASAN) while running R CMD check, so I can try to reproduce this error? Are there instructions somewhere?

Dirk Eddelbuettel (07:28:08) (in thread): > There are. I have provided Docker containers for this for a decade; I have a package on CRAN called sanitizers, with known true-positive errors that fail under sanitizer use but not otherwise, which you can use to validate your setup. Note that in my containers ‘RD’ is the binary you should use. > > These days a popular container is also provided by Winston at his wch/r-debug repo. That is a ‘sumo’ build with many variants in one large container. You also need RD* for the different builds.

Dirk Eddelbuettel (07:28:58) (in thread): > The rhub setup also had it / has it. It switched to relying on GHA, and when I tried it before it did not work for me. It may work now, and would be easiest if it does.
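The container workflow Dirk describes usually boils down to a few commands. A hedged sketch; the image names are recalled from memory (the ‘RD’ binary is per Dirk’s note above), so verify them against the rocker and wch/r-debug documentation before relying on this:

```shell
# Pull a sanitizer-enabled image (names assumed; check Docker Hub):
docker pull rocker/r-devel-san     # Dirk's sanitizer build of R-devel
docker pull wch1/r-debug           # Winston's "sumo" multi-variant build

# Mount the package sources and run the check with the instrumented binary.
# In the rocker image the instrumented R is called 'RD'; the wch1/r-debug
# image ships several variants (RDsan, RDcsan, RDvalgrind, ...).
docker run --rm -v "$PWD":/work -w /work rocker/r-devel-san \
    RD CMD check --no-manual mypackage_1.0.0.tar.gz
```

Validating the setup first with Dirk’s sanitizers package (which should fail under ASAN) confirms the instrumentation is actually active.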

Kasper D. Hansen (07:39:00) (in thread): > Thanks a lot Dirk. Amazing help as always.

Dirk Eddelbuettel (07:43:02) (in thread): > It lacks a decent writeup. If you have the patience and time to take some notes TODAY we could whack out a quick arXiv note that may one day be a ‘ten tips to fight ASAN errors’ …

Vince Carey (07:52:15) (in thread): > @Lori Shepherd @Marcel Ramos Pérez can this info be added to the contributions guide? Esp. once we have a report on Kasper’s journey. Thanks Dirk!

Kasper D. Hansen (08:00:43) (in thread): > Great idea, I’ll take notes. I am unlikely to start my journey TODAY though:slightly_smiling_face:But hopefully tomorrow/weekend.

Mike Smith (08:53:22) (in thread): > I went through this journey two weeks ago for rhdf5 and found Dirk’s UBSAN Docker containers to be super helpful. Happy to chime in with my experiences too.

Mike Smith (08:55:27) (in thread): > Here’s a GHA workflow I put together to test I could reproduce the CRAN errors and then my fixes: https://github.com/grimbough/rhdf5/actions/runs/12745442237/workflow

Dirk Eddelbuettel (08:58:11) (in thread): > Thanks for the kind words. The thing that remains so frustrating is that CRAN is still not quite “on the Docker train”. If only they made their setup a container, replication would be so much easier. But between BDR using Fedora and bleeding-edge compilers as well as sometimes-changing configs, some of the containers can get stale – hence the need for mine and Winston’s. And the initial validation step is critical. > > Overall all this is getting better, but at a slower-than-glacial pace not really commensurate with the ‘fix your package in N weeks or else’ deadlines. Oh well. They still run an amazing free gift for the world.

2025-02-07

Mike Smith (08:43:52) (in thread): > I’ve no idea what’s changed, and I’ll open an issue with IRanges, but this seems to be a MRE that replicates what’s happening. > > library(IRanges) |> suppressPackageStartupMessages() > v <- IRanges::Views(Rle(values = integer(0)), > IRanges(start = 1, end = 2)) > IRanges::viewMeans(v) > > > > ***** caught segfault ***** > address 0x1, cause 'memory not mapped' > > Traceback: > 1: .Call2("C_viewMeans_RleViews", trim(x), na.rm, PACKAGE = "IRanges") > 2: IRanges::viewMeans(v) > 3: IRanges::viewMeans(v) >

Hiru (08:46:35) (in thread): > Thanks, Mike! I can replicate your example with R v4.5.0 > > No errors on the previous version for the same code.

Levi Waldron (10:08:23): > I just updated a convenience script for running the Bioconductor Docker containers, at https://github.com/waldronlab/bioconductor/. Demo of functionality in thread:

Levi Waldron (10:09:01) (in thread): > > % ./bioconductor -h > Usage: bioconductor.sh [-v version] [-e envtype] [-p port] [-w password] [-d dockerhome] [-l] [-h] > -v version Specify the Bioconductor version (e.g., 'devel', 'RELEASE_X_Y', 'X.Y'). > -e envtype Specify the environment type ('rstudio', 'bash', or 'R'). Default is 'rstudio'. > -p port Specify the port number. Default is 8787. > -w password Specify the RStudio password. Default is 'bioc'. > -d dockerhome Specify the Docker home directory. Default is '/Users/Levi/dockerhome'. > -l List all available Bioconductor Docker versions. > -h Show this help message. > > The $DOCKER_RPKGS environment variable is optional and used to specify the R packages directory on the host machine. > If not set, the default is '$HOME/.docker-$version-packages'. > This directory is mounted into the Docker container at /usr/local/lib/R/host-site-library. > This allows R packages installed in the container to be persisted on the host machine and shared across multiple containers or sessions. >

Levi Waldron (10:09:22) (in thread): > > % ./bioconductor -v > Error: Version is required. Use the '-v' flag to specify a version. Use the '-l' flag to list available versions or the '-h' flag for examples. >

Levi Waldron (10:09:35) (in thread): > > % ./bioconductor -l > Fetching available versions from Docker Hub... > latest > devel > devel-amd64 > RELEASE_3_20 > 3.20 > RELEASE_3_20-R-4.4.2 > 3.20-R-4.4.2 > devel-R-devel > devel-arm64 > devel-R-4.4.1 > RELEASE_3_19 > 3.19 > RELEASE_3_19-R-4.4.1 > 3.19-R-4.4.1 > devel-R-4.4.0 > RELEASE_3_19-R-4.4.0 > 3.19-R-4.4.0 > RELEASE_3_18 > 3.18 > RELEASE_3_18-R-4.3.3 > 3.18-R-4.3.3 > devel-R-patched > RELEASE_3_17 > 3.17 > RELEASE_3_17-R-4.3.1 > 3.17-R-4.3.1 > RELEASE_3_18-R-4.3.2 > 3.18-R-4.3.2 > RELEASE_3_18-R-4.3.1 > 3.18-R-4.3.1 > devel-R-latest > RELEASE_3_17-R-4.3.0 > 3.17-R-4.3.0 > RELEASE_3_16 > 3.16 > RELEASE_3_16-R-4.2.3 > 3.16-R-4.2.3 > RELEASE_3_16-R-4.2.2 > 3.16-R-4.2.2 > RELEASE_3_15 > RELEASE_3_14 > RELEASE_3_13 > RELEASE_3_12 > RELEASE_3_11 > bioc2020.1 > bioc2020 > RELEASE_3_10 >

Levi Waldron (10:09:55) (in thread): > > % ./bioconductor -v devel -e R > devel: Pulling from bioconductor/bioconductor_docker > Digest: sha256:4143648f3d64ab3996b96f106327c0b681a910345083f5e34cf3d6f45b19416f > Status: Image is up to date for bioconductor/bioconductor_docker:devel > docker.io/bioconductor/bioconductor_docker:devel > > What's next: > View a summary of image vulnerabilities and recommendations → docker scout quickview bioconductor/bioconductor_docker:devel > > ================================================== > Installed packages will go in host directory: /Users/Levi/.docker-devel-packages > RStudio home directory will be mounted on host directory: /Users/Levi/dockerhome > ================================================== > > > R Under development (unstable) (2025-01-28 r87664) -- "Unsuffered Consequences" > Copyright (C) 2025 The R Foundation for Statistical Computing > Platform: aarch64-unknown-linux-gnu > > R is free software and comes with ABSOLUTELY NO WARRANTY. > You are welcome to redistribute it under certain conditions. > Type 'license()' or 'licence()' for distribution details. > > Natural language support but running in an English locale > > R is a collaborative project with many contributors. > Type 'contributors()' for more information and > 'citation()' on how to cite R or R packages in publications. > > Type 'demo()' for some demos, 'help()' for on-line help, or > 'help.start()' for an HTML browser interface to help. > Type 'q()' to quit R. > > > >

Mike Smith (10:18:23) (in thread): > I’ve created an issue at https://github.com/Bioconductor/IRanges/issues/55 with a few more details.

2025-02-10

Lluís Revilla (06:04:40): > With BiocGenerics now depending on generics, one of my packages on CRAN (BaseSet) broke: what is the recommendation for dealing with conflicting generics (BaseSet::tidy vs generics::tidy)? I use Suggests: GSEABase; would following the recommendations in WRE to conditionally import classes from it help? Thanks in advance

Lluís Revilla (06:07:03): > The previous issue was only found when CRAN started testing with Bioc 3.21: do Bioconductor package maintainers notify CRAN maintainers when their upcoming release breaks reverse-dependency packages? This is required by CRAN, but I’m not sure what the policy or recommendation is on Bioconductor.

Hervé Pagès (14:58:47) (in thread): > At the root of the problem is the fact that generics and BaseSet both define a tidy() S3 generic, so the outcome of calling tidy() is inherently unpredictable (it depends on the order in which the packages are loaded). Not a good situation, especially for the end user. > Note that the fact that BiocGenerics now depends on generics just helps expose the issue. In other words, any user who had loaded generics after BaseSet would have run into the same error, even before BiocGenerics started to depend on generics. > All this can be avoided by having BaseSet depend on generics (via Depends or Imports, importing only tidy) and turning the various tidy() methods defined in BaseSet into methods for generics::tidy(). > The quick fix for now, just to make CRAN happy and not get kicked out, is to call BaseSet::tidy() in all your examples and vignettes, but that doesn’t really address the usability issue.
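The fix Hervé describes (re-basing a package’s methods on generics::tidy() rather than defining a competing S3 generic) is typically just a NAMESPACE change plus method registration. A hedged sketch using the common roxygen re-export idiom; the class name TidySet and the method body are illustrative, not taken from BaseSet’s actual sources:

```r
# In DESCRIPTION:  Imports: generics

# Instead of defining a competing generic with
#   tidy <- function(x, ...) UseMethod("tidy")
# import and re-export the one from {generics}:

#' @importFrom generics tidy
#' @export
generics::tidy

# Methods now dispatch off the shared generic, so loading {generics}
# (or any package that re-exports it) can no longer mask them:

#' @export
tidy.TidySet <- function(x, ...) {
  # ...convert x into the package's tidy representation...
  x
}
```

With this arrangement the load order of the two packages no longer changes which generic wins, which is exactly the unpredictability described above.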

Hervé Pagès (15:14:36) (in thread): > Thanks Mike, very useful! Taking a look now…

Lluís Revilla (16:05:51) (in thread): > Thanks Hervé! Note that the purpose of the generics is different: one is to convert to a tibble, the other to convert to BaseSet classes. My current fix is not quick but will be more permanent: I think that generic was a bad decision.

Hervé Pagès (17:12:11) (in thread): > Oh yeah, you can also use a different name, I should have mentioned that. It’s even better, especially if the two generics do different things. Somehow I assumed that you really liked the name and wanted to stick to it:wink:

Hervé Pagès (17:43:28) (in thread): > To be really honest (and blunt), CRAN packages that depend on Bioconductor are anomalies in my view. I understand that people might want to be able to reuse some of our infrastructure packages like Biobase, BiocGenerics, Biostrings, etc., but at the same time don’t want to go thru our approval process, follow our release cycles etc… However they also need to understand what it means to have a package on CRAN that depends on Bioconductor: > * CRAN doesn’t have the notion of a release for their package collection as a whole. Packages are expected to pass R CMD check on R current, old, and devel at all times. However this won’t always be possible if a package depends on Bioconductor. Some change from one Bioconductor release to the next could break a package on CRAN in a way that makes it impossible for the package to restore compatibility with both versions of Bioconductor. It has never happened so far (AFAIK) but I can easily imagine situations where we need to make changes to a core Bioconductor class that lead to such a situation. We’ve just been lucky so far that it hasn’t happened yet. > * We have no way to know if a change in BioC devel breaks a CRAN package, until CRAN starts testing with BioC devel. This usually only happens 2 or 3 months before our Spring release and might happen around our Fall release, a few days before or after. And even when CRAN starts to test with BioC devel, we have no automated/systematic way to know what packages that breaks. It’s up to each Bioconductor package maintainer to check the health of their rev deps on CRAN if they have any. I wonder how many BioC developers are willing to do that:wink: > All this to say that the right place for a package that depends on Bioconductor is… Bioconductor! Then they’re in the daily report, receive automatic failure notifications, get reminders about the upcoming releases and so on…

2025-02-11

Lluís Revilla (04:13:55) (in thread): > I like the name but the description doesn’t match what mine does, the definition is different and the problems aren’t worth keeping it.

Lluís Revilla (04:56:34) (in thread): > Let’s see it the other way around, too: > > Bioconductor packages that depend on CRAN are anomalies in my view. I understand that people might want to be able to reuse some of our infrastructure packages like generics, Matrix, survival, etc., but at the same time don’t want to go thru our approval process, follow our release cycles, etc… However they also need to understand what it means to have a package on Bioconductor that depends on CRAN. > Even if there are multiple repositories, with different processes and focus, I think they/we should work together to make the work of users, developers and repository maintainers easier. That’s the whole point of having a repository.

Lluís Revilla (04:59:56) (in thread): > About the specific points raised: > * Packages are expected to pass R CMD check --as-cran but not on old releases, although it is checked. CRAN packages can check the version of the Bioconductor package to behave one way or another, adapting to the Bioconductor release cycle and packages used. > * There is tools::package_dependencies to list the packages affected and tools::check_packages_in_dir to test reverse dependencies. This works on Bioconductor packages even if they are from CRAN. I bet CRAN maintainers are not freely willing to check reverse dependencies (I don’t like to do that, especially because it is a bit difficult) but as it is a CRAN policy they(/we) follow it through. My question was towards setting a policy, or at least a recommendation, in the Bioconductor guidelines. > BaseSet doesn’t depend on Bioconductor, it only suggests packages on it. I could remove that functionality without problems for myself. See also my previous question about conditionally depending on class definitions… I only suggest Bioconductor packages because I think it would be useful for Bioconductor: the package started as a 1-year collaboration within Bioconductor. That approach/package was driven out of Bioconductor, so I submitted to CRAN. > It is good to know there is no interest in packages outside Bioconductor working with Bioconductor. I will take note of this and work with other repositories willing to work with each other.
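The reverse-dependency lookup mentioned above can be tried offline against any db matrix shaped like the output of available.packages(); a minimal sketch, where "pkgA" and "pkgB" are invented package names used purely for illustration:

```r
## Toy db matrix in the shape returned by available.packages();
## "pkgA" and "pkgB" are made-up package names.
db <- rbind(
  c(Package = "S4Vectors", Depends = "", Imports = "",
    LinkingTo = "", Suggests = "", Enhances = ""),
  c(Package = "pkgA", Depends = "S4Vectors", Imports = "",
    LinkingTo = "", Suggests = "", Enhances = ""),
  c(Package = "pkgB", Depends = "", Imports = "pkgA",
    LinkingTo = "", Suggests = "", Enhances = "")
)
rownames(db) <- db[, "Package"]

## Which packages in db have a strong dependency on S4Vectors?
revdeps <- tools::package_dependencies("S4Vectors", db = db, reverse = TRUE)
revdeps
```

Against a real repository one would pass db = available.packages() (or a merged CRAN + Bioconductor db) instead of the toy matrix.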

Hervé Pagès (18:41:02) (in thread): > Fixed in IRanges 2.41.3.

Hervé Pagès (18:51:50): > A replacement for t(t(m) / colSums(m)): > > m / as_tile(colSums(m), along=2) > > Faster and more memory efficient! Available in S4Arrays 1.7.3 (see ?as_tile).

Kasper D. Hansen (19:05:46): > Well, the best standard way to code this is using sweep()

Kasper D. Hansen (19:06:09): > But sounds good

Hervé Pagès (19:24:10): > sweep() uses aperm() internally, so it is as inefficient as t(t(m) / colSums(m))
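For anyone following along, the two base-R spellings being compared here compute the same column-normalization; a quick check in plain R (the as_tile() variant is omitted since it needs S4Arrays >= 1.7.3):

```r
set.seed(0)
m <- matrix(runif(100), 20, 5)

## Column-normalize: divide each column of m by its sum
x1 <- t(t(m) / colSums(m))          # double-transposition trick
x2 <- sweep(m, 2, colSums(m), "/")  # sweep() along margin 2

all.equal(x1, x2)  # TRUE
colSums(x1)        # each column now sums to 1
```

Both forms materialize intermediate copies (sweep() via aperm()), which is the inefficiency the as_tile() approach is meant to avoid.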

2025-02-12

Hiru (08:31:59) (in thread): > Thank you for the update! Glad to hear the issue has been resolved. > > Could you please confirm whether the changes have been pushed to Bioconductor devel? It appears that the S4Vectors dependency is still at version v0.45.2, despite the updated version (v0.45.4) on GitHub.

2025-02-13

Anushka Paharia (09:24:45): > @Anushka Paharia has joined the channel

Aaron Lun (13:14:45) (in thread): > looks like it doesn’t work correctly with DelayedArrays yet: > > > library(DelayedArray) > > set.seed(0) > > m <- matrix(runif(100), 20, 5) > > X <- t(t(m)/colSums(m)) > > X[1,] > [1] 0.07990106 0.07897805 0.04052479 0.03774897 0.09526068 > > Y <- DelayedArray(m) / as_tile(colSums(m), along=2) > > Y[1,] > [1] 0.07990106 0.06927500 0.03664700 0.03625099 0.08559679 >

Hervé Pagès (13:16:56) (in thread): > Not ready for that. But that’s actually my 2nd main use case, 1st being SparseArray objects where the double-transposition trick can be very expensive.

Hervé Pagès (13:18:25) (in thread): > But good catch, thanks! I should block it on DelayedArray objects until it’s ready.

JP Flores (13:54:25): > @JP Flores has joined the channel

Hervé Pagès (15:26:51) (in thread): > I was actually hoping that DelayedArray(m) / as_tile(colSums(m), along=2) would just fail with a “no method found” kind of error because I only have a / method for DelayedArray#vector, not for DelayedArray#array, in DelayedArray. But this long-standing bug in the methods package got me again: > > > is(array(), "vector") > [1] TRUE > > So I need to add DelayedArray#array and array#DelayedArray methods to work around this.

Hervé Pagès (15:59:28) (in thread): > Done in DelayedArray 0.33.6.

2025-02-14

Sarah (20:19:51): > @Sarah has joined the channel

2025-02-19

Josias Rodrigues (17:44:23): > @Josias Rodrigues has joined the channel

2025-02-20

James Nemesh (09:08:40): > @James Nemesh has joined the channel

James Nemesh (09:13:57): > Hi! I was wondering how strictly two of the BiocCheck::BiocCheck('new-package'=TRUE) NOTE checks are enforced when submitting new packages. The two notes I have are: > 1. NOTE: Consider shorter lines; 170 lines (4%) are > 80 characters long. > 2. NOTE: The recommended function length is 50 lines or less. There are 11 functions greater than 50 lines. > I find that because my variable names are long and descriptive, formatting lines to be <= 80 characters is pushing more of my functions to be over 50 lines in length. > I’m not against refactoring truly long methods (I have 2-3 of the 11 that I think are reasonable), but others are only that long because of the tight column-width formatting. > Thanks for any advice.

Lori Shepherd (09:16:45) (in thread): > Depends on the reviewer, but if you can argue your case you should be okay. These notes are suggestions to encourage good coding practices and simpler, easier-to-understand functions. If the lines are long because of descriptive naming of variables, this would of course be encouraged over shortening

James Nemesh (09:19:01) (in thread): > Thanks! Most of the checks are actually very reasonable (runnable examples? Unit tests? Crazy talk!). It’s really all style-guide stuff. In Python I use the formatter black, which enforces well-formatted code but not function length.

2025-02-21

Sean Davis (10:32:49): > Somewhat off-topic, but we had a small discussion in #tech-advisory-board about research software engineers and supporting them. https://zenodo.org/records/10436166 came up and I thought I might share here. - Attachment (Zenodo): Getting Started with the RSE Movement within your Organization: A Guide for Individuals > Recognizing the critical role of Research Software Engineers (RSEs), this guide serves as a resource for individuals aspiring to champion the RSE movement within their organizational contexts. By offering practical steps and tips, the guide aims to instigate positive change by connecting RSEs and cultivating a cohesive RSE community. Structured into specific sections, it begins with creating awareness and assessing interest, then progresses to the establishment of an informal RSE community and the recruitment of allies. Ultimately, the guide guides individuals in forming an RSE group or society within their organization. These actions set the foundation for collaborative efforts, support systems, and advocacy, enabling individuals to drive impactful change and foster a conducive environment for the flourishing of Research Software Engineers in their organization.

Pariksheet Nanda (15:18:15) (in thread): > In Python, black may not enforce the line length, but flake8’s warning defaults to 79. Historically, line-length limits/conventions can be attributed to terminal sizes and Fortran (<= 72 characters for Fortran 66, because sequence numbers were required in columns 73-80 for when you drop your punch cards and need to reorder them). Also, IBM line printers had a 132-character limit. Personally I find it easier to read shorter columns of code, much like the shorter columns in newsprint and some journal articles.

2025-02-25

Louis Le Nézet (06:01:08): > Hi ! > Has anyone tried to use BiocStyle with pkgdown ?

Federico Marini (06:31:35): > Not directly that, but FWIW @Kevin Rue-Albrecht came up with a combination of colors that you can put in a css file and that gives you something biocstyle-ish

Federico Marini (06:32:05): > https://isee.github.io/iSEE/ is an example of something deployed with that combination - Attachment (isee.github.io): Interactive SummarizedExperiment Explorer > Create an interactive Shiny-based graphical user interface for exploring data stored in SummarizedExperiment objects, including row- and column-level metadata. The interface supports transmission of selections between plots and tables, code tracking, interactive tours, interactive or programmatic initialization, preservation of app state, and extensibility to new panel types via S4 classes. Special attention is given to single-cell data in a SingleCellExperiment object with visualization of dimensionality reduction results.

Federico Marini (06:32:27): > an alternative combination would be this: https://isee.github.io/iSEEDemoEuroBioC2024/ - Attachment (isee.github.io): iSEEDemoEuroBioC2024 > This package provides the environment and materials for the package demo at the European Bioconductor Meeting 2024. The environment includes the iSEE package and all its extensions developed under the iSEE organisation. The materials provide a brief overview of the core package functionality, and a walkthrough of the development of an example extension package.

Federico Marini (06:32:37): - File (PNG): image.png

2025-02-27

Hervé Pagès (15:52:02) (in thread): > I missed this, sorry. Everything looks fine on the daily report and both IRanges 2.41.3 and S4Vectors 0.45.4 have propagated to the public package repos for BioC 3.21. Please keep in mind that it usually takes between 24 and 48 hours for changes made on GitHub or git.bioconductor.org to propagate all the way to the public package repos and package landing pages.

Dario Strbenac (18:00:02): > Any plans to add an R6-specific section to the Developer’s Guide? The review checklist also lacks S4 fundamentals, such as providing a constructor function instead of expecting the package user to use new().

Marcel Ramos Pérez (18:36:36) (in thread): > We don’t really have a lot of R6 use cases but perhaps we can just refer to the https://adv-r.hadley.nz/r6.html section in the Developer’s Guide

Hiru (19:20:25) (in thread): > Thank you very much for all the fixes! My package is passing all checks with your recent changes.

Vince Carey (22:22:39) (in thread): > @Dario Strbenac please file an issue on the developer guide github repo so these topics get picked up

2025-02-28

Kateřina Matějková (09:20:50): > @Kateřina Matějková has joined the channel

2025-03-01

Yao-Chung Chen (05:55:41): > @Yao-Chung Chen has joined the channel

Yao-Chung Chen (06:03:28): > Hello, > Does anyone know if changed citation content (inst/CITATION) will be rendered on the devel landing page? Or will it only be reflected on the release landing page when the new version is released? I have been waiting for several days for this (all builds pass on the devel branch). Thank you!

Henrik Bengtsson (10:34:11) (in thread): > Please share the package name or URL. Then I’m sure someone here will be able to look into it and give you a constructive answer.

Lori Shepherd (14:13:24) (in thread): > Agreed. I can look into it if I know the package name

Hervé Pagès (21:47:32) (in thread): > Did you bump the version, @Yao-Chung Chen?

2025-03-05

Lori Shepherd (15:40:16): > Bioconductor 3.21 Release Schedule: https://bioconductor.org/developers/release-schedule/ - Attachment (bioconductor.org): Bioconductor - Release: Schedule > The Bioconductor project aims to develop and share open source software for precise and repeatable analysis of biological data. We foster an inclusive and collaborative community of developers and data scientists.

Dario Strbenac (16:00:00) (in thread): > There seem to be some key differences between S4 and R6. The advice that the package user should not use new directly, and that the package developer should instead provide a wrapper function with the same name as the class as a constructor, seems to be unsuitable for R6.
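For readers unfamiliar with the S4 convention being discussed, a minimal sketch of a class with an exported constructor of the same name; the class and slot names here ("ToyResult", "labels") are made up for illustration:

```r
library(methods)

## "ToyResult" is a made-up class used only to illustrate the convention
setClass("ToyResult", slots = c(labels = "character"))

## Exported constructor with the same name as the class, so users call
## ToyResult(...) and never need new("ToyResult", ...) themselves
ToyResult <- function(labels = character()) {
  new("ToyResult", labels = as.character(labels))
}

tr <- ToyResult(c("empty", "nucleus"))
```

A constructor like this can also coerce or validate its inputs before calling new(), which is the main reason the convention exists; R6 generators instead expose their own $new() method, which is why the advice transfers poorly.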

2025-03-07

James Nemesh (12:37:52): > I have a question about the preliminary review of my package, and what I need to do to fix it. The feedback I received was: > > Seurat objects and dgeMatrices are not a standard Bioconductor class. You can still provide functionality but the package should be updated to work directly with Bioconductor standard classes. > > My package does not have any interfaces (public functions) that require Seurat objects. Seurat is used internally to perform single-cell differential expression and gene module calculations. > I’m not clear on what I need to change to deal with this feedback - I can swap in SingleCellExperiment for private function signatures that users can’t call (these are not exported functions!), but am I restricted from using Seurat within a function to run differential expression on that object? > > Edit: Link to my submission: https://github.com/Bioconductor/Contributions/issues/3750#issuecomment-2706603744

Vince Carey (13:01:41): > @James Nemesh thanks for writing. I will have a look; if you want to give the link to your repo others could weigh in too.

Vince Carey (14:59:39): > I’ll put a few comments in the issue noted above. I think the issue was closed because a checklist element was deleted, but we can discuss there and then see what next steps are warranted.

James Nemesh (15:05:56): > @Vince Carey Hi, do you mean the checklist in most of the issues, i.e.: > > Confirm the following by editing each check box to '[x]' > > I understand that by submitting my package to Bioconductor, > the package source and all review commentary are visible to the > general public. > > My coworker / author submitted this and didn’t include those. Sounds like we need to add that to the initial issue. I’m more than happy to resubmit with all of those requirements.

Vince Carey (16:28:40): > I have now entered my comments at https://github.com/Bioconductor/Contributions/issues/3750#issuecomment-2707493912. Apropos the preceding query, yes, that checklist needs to be part of the issue stream. - Attachment: Comment on #3750 DropSift > Thanks for this submission, I think it covers very interesting functionality and it would be nice to have it in the ecosystem. Since it doesn’t use Bioconductor packages or classes, it could be best suited for CRAN. But let’s discuss some of the steps that would bring it closer to Bioconductor. > > Here’s the start of the man page for a key ingestion function: > > > readDgeFile package:DropSift R Documentation > > Read in DGE matrix > > Description: > > Read in a dense DGE matrix in McCarroll lab format (cell barcodes > in columns, genes in rows) or read the expected 10x MTX files from > a directory. > > Usage: > > readDgeFile(dgeMatrixFile, cell_features = NULL) > > Arguments: > > dgeMatrixFile: The file to parse. If a directory, then the expected > 10x MTX files are read. > > > > Is “McCarroll lab format” very widely known/used? I had not heard of it. Perhaps
> some additional background on the lab and the format are warranted in the
> description. Or perhaps this text is just leftover from something more private/local. > > If 10x MTX files are relevant, then functions in the TENxIO package (vignette) should be used to
> ingest them. > > — > > Running example(readDgeFile), the writing step seems pretty slow. Once the
> example is completed, one could get closer to Bioconductor data usage patterns
> with > > > > library(SingleCellExperiment) > > sce = SingleCellExperiment(dge_matrix) > > colData(sce) = DataFrame(cell_features) > > sce > class: SingleCellExperiment > dim: 25697 4120 > metadata(0): > assays(1): '' > rownames(25697): A1BG A1BG-AS1 ... ZYX ZZEF1 > rowData names(0): > colnames: NULL > colData names(5): cell_barcode num_reads pct_intronic pct_mt > frac_contamination > reducedDimNames(0): > mainExpName: NULL > altExpNames(0): > > > > Now we have the option of setting assayNames, adding metadata, computing
> and storing dimension-reduced re-expressions of the assay data in ways
> that are well-practiced in other packages and workflows. > > — > > There is no need to have new software for parsing h5ad format. > > > h5ad_file <- system.file("extdata", "adata_example.h5ad.gz", package = > "DropSift") > library(zellkonverter) > library(R.utils) > dd = tempfile() > gunzip(h5ad_file, dest=dd, ext="gz") > x = readH5AD(dd) > > > > now we have > > > > x > class: SingleCellExperiment > dim: 5 5 > metadata(5): NHashID expression_data_type input_id > optimus_output_schema_version pipeline_version > assays(2): X exon_counts > rownames(5): IGFBP4 INPP4B ENSG00000251293 MADD-AS1 ENSG00000275016 > rowData names(32): gene_names ensembl_ids ... > number_cells_detected_multiple number_cells_expressing > colnames(5): GGGTTTACATAACTCG TTGTTGTTCCTTATCA AGCTACAGTGAGCCAA > ACTCCCAGTAGAGCTG AACAACCGTCTGTCCT > colData names(41): cell_names CellID ... pct_mitochondrial_molecules > input_id > reducedDimNames(0): > mainExpName: NULL > altExpNames(0): > > assay(x) > 5 x 5 sparse Matrix of class "dgCMatrix" > GGGTTTACATAACTCG TTGTTGTTCCTTATCA AGCTACAGTGAGCCAA > IGFBP4 . . . > INPP4B 1 3 1 > ENSG00000251293 7 5 . > MADD-AS1 . . . > ENSG00000275016 . . . > ACTCCCAGTAGAGCTG AACAACCGTCTGTCCT > IGFBP4 . . > INPP4B 1 1 > ENSG00000251293 . . > MADD-AS1 . . > ENSG00000275016 . . > > > > You have a nice package and vignette and it can clearly be used according to
> its documentation. However, to be part of the Bioconductor ecosystem, and to get
> an expeditious review, the coding choices need to reflect awareness and
> acceptance of the value of the general approaches to data structure and package
> interop and reuse that have been part of the project going on 24 years. Exceptions
> can be made but in this case I feel that conscious adoption of a few of the most
> prominent components of Bioconductor would be a prerequisite for continuing a review. > > Please feel free to add more comment in this issue.

James Nemesh (17:02:07) (in thread): > Hi! Thanks so much for your comments and feedback! > > Would it be best to modify the first post in the issue to add the checklist, or should I append it? I think we can address many of the other issues you raised, and that’s all appreciated!

Vince Carey (17:03:50) (in thread): > @Lori Shepherd ^^ I’d propose adding the checklist where it belongs and reopening the issue so that we have the whole history in place. But @Lori Shepherd will give the final instruction.

James Nemesh (17:04:04) (in thread): > Great, thanks again!

Lori Shepherd (17:34:51) (in thread): > Please open a new issue to make sure the system picks it up

James Nemesh (17:35:10) (in thread): > Will do! Thanks again, sorry for the mess.

Vince Carey (17:35:22) (in thread): > right, we can cross reference the old issue stream that is closed, as needed.

2025-03-09

yuvashree v (04:13:52): > @yuvashree v has joined the channel

2025-03-11

Lluís Revilla (03:59:15): > FYI: Martin Maechler posted on R-devel about some changes to S4 showMethods output. He is looking for feedback from Bioconductor developers: https://stat.ethz.ch/pipermail/r-devel/2025-March/083881.html

Sanika Menkudale (07:11:52): > @Sanika Menkudale has joined the channel

2025-03-12

Sisi Wang (19:42:16): > @Sisi Wang has joined the channel

2025-03-13

Mihai Todor (06:59:39): > @Mihai Todor has joined the channel

2025-03-17

Jesulolufemi Adelanke (12:56:02): > @Jesulolufemi Adelanke has joined the channel

2025-03-18

James Nemesh (11:48:46): > Hi! My co-author and I have been having some discussion about what Bioconductor “wants” for a package. The goal of our package is to read in a few inputs, apply some machine learning, and generate a classification - in this case, to take a single-cell data set and classify “empty” and “nuclei” droplets. > > I like the idea of a package that can take file arguments and emit a result file - you can call this package on the command line using Rscript, and it makes it trivial to integrate into a larger scRNASeq workflow. There’s no real R knowledge required - you can just call the library in a trivial way via: Rscript -e 'library(my_library)' -e 'do_the_thing(in=my_input_file, out=my_output_file)'. My co-author thinks that perhaps the Bioconductor way is to leave off all of the file parsing, leave that to the user, and instead only expose the API - for example, the API might take in an already parsed SingleCellExperiment object with the correct metadata and emit a dataframe that contains the single-cell identifiers and classification labels, but not serialize the results to disk. That requires the user to write some R code to handle parsing data into the correct format, and writing the results to disk. Not necessarily a huge amount of R code, but it limits the userbase somewhat. > > What’s the Bioconductor philosophy? (I’ll note that in either case, we support API access to the functionality. I want to support “extra” functionality to remove that burden from a user.)

Vince Carey (11:51:32): > I don’t think there’s a simple answer. I think that we have benefited from the definition of shareable/extensible data classes that can be populated by independently defined parsers that work for different platforms. There is no reason that a CLI-oriented utility like the one you sketch above should not be available from a Bioconductor package. But the package should make use of the data classes for the values they provide.

James Nemesh (11:52:48): > I absolutely agree on using the data classes already established by Bioconductor so the API is reusable within the ecosystem.

Peter Hickey (17:57:01): > Just to give you the view of an experienced R/Bioconductor user (who I get might not be your target audience, so feel free to take with a grain of salt:slightly_smiling_face:). > I don’t like R software that writes to file instead of returning R objects that enable me to continue my analysis where I already am. > Same feeling for ‘scripting’-style R code - R is primarily an interactive data analysis language in my mind. > If I’m using Bioconductor for single-cell analysis, I’m using SingleCellExperiment objects and I want something that integrates with that. > How I get the data into a SingleCellExperiment object is a different task/responsibility

Jorge Kageyama (18:04:06): > I’ll add my pinch of salt: at Genentech we have people that want to use Bioconductor packages but don’t want to touch or learn R, so I know your pain. In general, still, we want single responsibilities in our code, for many reasons, but an easy one is: what happens if there is an error on your write? Does the entire computation get lost? > That leaves us with hybrid approaches: we have a Bioconductor repo and a wrapper/utils repo with collections of functions that make your life easier, and these are the ones we incorporate, for example, in Nextflow. If a user can install one library, they can install two, and that keeps the Bioconductor pieces “pure” while still adding a layer of accessibility.

James Nemesh (18:19:57): > Thanks for the feedback! We definitely plan on having the API available to do processing - you can always run the code in an interactive mode. > The file inputs are simply a wrapper around it for convenience. We’ve discussed having a 2nd package for the hybrid approach, which has its pluses and minuses in terms of putting related code in the same place - if we had more functions that required wrappers, it might be more attractive to approach the problem that way. Internally that’s actually how we run this already - as part of a larger workflow, where it is one of a few methods for the same task - but it has become the default one over time.

Shian Su (18:28:06): > Internally, read, process, and write should be three decoupled actions. This gives users a chance to create data for your processing function if their raw data doesn’t fit your reader’s format. It also gives them the choice/responsibility of how to write their data out, e.g. it could be csv, xlsx, rds, … > > You can still have your CLI by trivially piping your constructed output into the writer: Rscript -e "dostuff() |> writestuff(outfile)"
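The decoupling described above can be sketched in a few lines. This is not DropSift's actual method; the function names (read_counts, classify_droplets, write_calls) and the median-based threshold are made up purely to illustrate separating the three responsibilities:

```r
## Hypothetical read / process / write split; names and logic are
## illustrative only, not from any real package.
read_counts <- function(path) {
  as.matrix(read.csv(path, row.names = 1, check.names = FALSE))
}

classify_droplets <- function(counts) {
  ## Toy classifier: call a barcode a nucleus if its total count
  ## exceeds the median total (stand-in for real machine learning)
  totals <- colSums(counts)
  data.frame(barcode = colnames(counts),
             total   = totals,
             label   = ifelse(totals > median(totals), "nucleus", "empty"))
}

write_calls <- function(calls, path) {
  write.csv(calls, path, row.names = FALSE)
}
```

Interactive users stop after classify_droplets() and keep working with the data.frame; a CLI wrapper simply chains all three steps in one Rscript -e expression, as Shian suggests.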

2025-03-19

Robert Shear (10:13:05): > @Robert Shear has left the channel

iniobong simeon (10:23:48): > @iniobong simeon has joined the channel

2025-03-20

Leonardo Collado Torres (09:23:24) (in thread): > You might be interested in https://lcolladotor.github.io/biocthis/reference/use_bioc_pkgdown_css.html which uses the CSS Kevin came up with. - Attachment (lcolladotor.github.io): Create a css with Bioconductor colors for pkgdown — use_bioc_pkgdown_css > This function creates the pkgdown/extra.css file with Bioconductor-style > colors that will make your pkgdown documentation websites much cooler ^_^.

Vince Carey (16:27:09): > For developers who use basilisk: I prematurely merged a PR to the sources that makes a transition to virtual environment management of python infrastructure. This change has been reverted and the current devel version of 1.19.3 is in the intended state. If you have a checkout of basilisk source code from git, you may have to use the commands git fetch && git reset --hard origin/devel to get a clean source checkout. Feel free to DM me with concerns, and I will summarize back as appropriate.

2025-03-21

Kasper D. Hansen (16:00:57): > I don’t know if I should ask here or on the devel email list, but I would really like to get some help with Rgraphviz devel on nebbiolo1. It fails with a core dump while building the Rgraphviz.Rnw vignette. It works on all other platforms on the build system. It works on the platforms I have access to, including the bioconductor-devel docker container (well, here it “kind of” works because the container does not have LaTeX installed).

Kasper D. Hansen (16:01:15): > I would really love for someone to do the following to see if the error can be replicated

Kasper D. Hansen (16:01:59): > > # check out the Rgraphviz sources, we ned 2.51.5 > R CMD build --no-build-vignettes Rgraphviz > R CMD check Rgraphviz_2.51.5.tar.gz >

Kasper D. Hansen (16:02:07): > and / or

Kasper D. Hansen (16:02:56): > > # check out the Rgraphviz sources, we ned 2.51.5 > R CMD build --no-build-vignettes Rgraphviz > R CMD INSTALL Rgraphviz_2.51.5.tar.gz > R CMD Sweave Rgraphviz/vignettes/Rgraphviz.Rnw > > and see if you can figure out which line creates the buffer overflow

Kasper D. Hansen (16:27:51): > … managed to install LaTeX in the Docker container. The package passes R CMD check. But I am running it as linux/amd64 emulated under Apple Silicon, so perhaps this does something to buffer overflows

Martin Morgan (18:31:47) (in thread): > I used therocker UBSAN container, tangling and then runningRD -f vignettes/Rgraphviz.R(I thinkRDgets the sanitizer-enabled version of R) shows the error at > > > ################################################### > > ### code chunk number 41: figbipartite > > ################################################### > > plot(g, attrs = att, nodeAttrs=nA, subGList = sgL) > ================================================================= > ==57148==ERROR: AddressSanitizer: heap-use-after-free on address 0x51200052d080 at pc 0x7ffff2764317 bp 0x7ffffffc21a0 sp 0x7ffffffc2198 > READ of size 8 at 0x51200052d080 thread T0 > #0 0x7ffff2764316 in cleanup1 /tmp/RtmpRFNn5A/R.INSTALLab7116d71997/Rgraphviz/src/graphviz/lib/dotgen/rank.c:62:21 > #1 0x7ffff2764316 in dot_rank /tmp/RtmpRFNn5A/R.INSTALLab7116d71997/Rgraphviz/src/graphviz/lib/dotgen/rank.c:585:5 > #2 0x7ffff2739a8a in dot_layout /tmp/RtmpRFNn5A/R.INSTALLab7116d71997/Rgraphviz/src/graphviz/lib/dotgen/dotinit.c:263:9 > #3 0x7ffff29b6d5f in gvLayoutJobs /tmp/RtmpRFNn5A/R.INSTALLab7116d71997/Rgraphviz/src/graphviz/lib/gvc/gvlayout.c:90:2 > #4 0x7ffff29c410c in gvLayout /tmp/RtmpRFNn5A/R.INSTALLab7116d71997/Rgraphviz/src/graphviz/lib/gvc/gvc.c:90:5 > #5 0x7ffff273779a in Rgraphviz_doLayout /tmp/RtmpRFNn5A/R.INSTALLab7116d71997/Rgraphviz/src/doLayout.c:253:5 > #6 0x7ffffeca494c in R_doDotCall /tmp/R-devel/src/main/dotcode.c:757:11 > #7 0x7ffffecf26a2 in do_dotcall /tmp/R-devel/src/main/dotcode.c:1437:11 > #8 0x7ffffee44009 in bcEval_loop /tmp/R-devel/src/main/eval.c:8118:14 > #9 0x7ffffedf86e4 in bcEval /tmp/R-devel/src/main/eval.c:7501:16 > #10 0x7ffffedf6e1f in Rf_eval /tmp/R-devel/src/main/eval.c:1167:8 > #11 0x7ffffee01575 in R_execClosure /tmp/R-devel/src/main/eval.c:2393:22 > #12 0x7ffffedfd06c in applyClosure_core /tmp/R-devel/src/main/eval.c:2306:16 > #13 0x7ffffedf779f in Rf_applyClosure /tmp/R-devel/src/main/eval.c:2328:16 > #14 0x7ffffedf779f in Rf_eval 
/tmp/R-devel/src/main/eval.c:1280:12 > #15 0x7ffffee10215 in do_set /tmp/R-devel/src/main/eval.c:3567:8 > #16 0x7ffffedf71eb in Rf_eval /tmp/R-devel/src/main/eval.c:1232:12 > #17 0x7ffffee0beb2 in do_begin /tmp/R-devel/src/main/eval.c:2996:10 > #18 0x7ffffedf71eb in Rf_eval /tmp/R-devel/src/main/eval.c:1232:12 > #19 0x7ffffee01575 in R_execClosure /tmp/R-devel/src/main/eval.c:2393:22 > #20 0x7ffffedfd06c in applyClosure_core /tmp/R-devel/src/main/eval.c:2306:16 > #21 0x7ffffee4edfd in Rf_applyClosure /tmp/R-devel/src/main/eval.c:2328:16 > #22 0x7ffffee4edfd in bcEval_loop /tmp/R-devel/src/main/eval.c:8089:16 > #23 0x7ffffedf86e4 in bcEval /tmp/R-devel/src/main/eval.c:7501:16 > #24 0x7ffffedf6e1f in Rf_eval /tmp/R-devel/src/main/eval.c:1167:8 > #25 0x7ffffee01575 in R_execClosure /tmp/R-devel/src/main/eval.c:2393:22 > #26 0x7ffffee00857 in R_execMethod /tmp/R-devel/src/main/eval.c:2562:11 > #27 0x7ffffac1c787 in R_dispatchGeneric /tmp/R-devel/src/library/methods/src/methods_list_dispatch.c:1154:19 > #28 0x7ffffef9fce3 in do_standardGeneric /tmp/R-devel/src/main/objects.c:1348:13 > #29 0x7ffffedf78a2 in Rf_eval /tmp/R-devel/src/main/eval.c:1264:9 > #30 0x7ffffee01575 in R_execClosure /tmp/R-devel/src/main/eval.c:2393:22 > #31 0x7ffffedfd06c in applyClosure_core /tmp/R-devel/src/main/eval.c:2306:16 > #32 0x7ffffedf779f in Rf_applyClosure /tmp/R-devel/src/main/eval.c:2328:16 > #33 0x7ffffedf779f in Rf_eval /tmp/R-devel/src/main/eval.c:1280:12 > #34 0x7ffffef3af37 in Rf_ReplIteration /tmp/R-devel/src/main/main.c:265:2 > #35 0x7ffffef3db50 in R_ReplConsole /tmp/R-devel/src/main/main.c:317:11 > #36 0x7ffffef3d964 in run_Rmainloop /tmp/R-devel/src/main/main.c:1219:5 > #37 0x7ffffef3dc4a in Rf_mainloop /tmp/R-devel/src/main/main.c:1226:5 > #38 0x555555663dd4 in main /tmp/R-devel/src/main/Rmain.c:29:5 > #39 0x7ffffe1b9ca7 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16 > #40 0x7ffffe1b9d64 in __libc_start_main csu/../csu/libc-start.c:360:3 > #41 
0x555555580370 in _start (/usr/local/lib/R/bin/exec/R+0x2c370) (BuildId: 5f5cd5f376807c2106419d2502da92a4b967104f) > > 0x51200052d080 is located 192 bytes inside of 272-byte region [0x51200052cfc0,0x51200052d0d0) > freed by thread T0 here: > #0 0x55555561ff0a in free (/usr/local/lib/R/bin/exec/R+0xcbf0a) (BuildId: 5f5cd5f376807c2106419d2502da92a4b967104f) > #1 0x7ffff2763e37 in cleanup1 /tmp/RtmpRFNn5A/R.INSTALLab7116d71997/Rgraphviz/src/graphviz/lib/dotgen/rank.c:72:3 > #2 0x7ffff2763e37 in dot_rank /tmp/RtmpRFNn5A/R.INSTALLab7116d71997/Rgraphviz/src/graphviz/lib/dotgen/rank.c:585:5 > #3 0x7ffff2739a8a in dot_layout /tmp/RtmpRFNn5A/R.INSTALLab7116d71997/Rgraphviz/src/graphviz/lib/dotgen/dotinit.c:263:9 > #4 0x7ffff29b6d5f in gvLayoutJobs /tmp/RtmpRFNn5A/R.INSTALLab7116d71997/Rgraphviz/src/graphviz/lib/gvc/gvlayout.c:90:2 > #5 0x7ffff29c410c in gvLayout /tmp/RtmpRFNn5A/R.INSTALLab7116d71997/Rgraphviz/src/graphviz/lib/gvc/gvc.c:90:5 > #6 0x7ffff273779a in Rgraphviz_doLayout /tmp/RtmpRFNn5A/R.INSTALLab7116d71997/Rgraphviz/src/doLayout.c:253:5 > #7 0x7ffffeca494c in R_doDotCall /tmp/R-devel/src/main/dotcode.c:757:11 > #8 0x7ffffecf26a2 in do_dotcall /tmp/R-devel/src/main/dotcode.c:1437:11 > #9 0x7ffffee44009 in bcEval_loop /tmp/R-devel/src/main/eval.c:8118:14 > #10 0x7ffffedf86e4 in bcEval /tmp/R-devel/src/main/eval.c:7501:16 > #11 0x7ffffedf6e1f in Rf_eval /tmp/R-devel/src/main/eval.c:1167:8 > #12 0x7ffffee01575 in R_execClosure /tmp/R-devel/src/main/eval.c:2393:22 > #13 0x7ffffedfd06c in applyClosure_core /tmp/R-devel/src/main/eval.c:2306:16 > #14 0x7ffffedf779f in Rf_applyClosure /tmp/R-devel/src/main/eval.c:2328:16 > #15 0x7ffffedf779f in Rf_eval /tmp/R-devel/src/main/eval.c:1280:12 > #16 0x7ffffee10215 in do_set /tmp/R-devel/src/main/eval.c:3567:8 > #17 0x7ffffedf71eb in Rf_eval /tmp/R-devel/src/main/eval.c:1232:12 > #18 0x7ffffee0beb2 in do_begin /tmp/R-devel/src/main/eval.c:2996:10 > #19 0x7ffffedf71eb in Rf_eval /tmp/R-devel/src/main/eval.c:1232:12 > #20 
0x7ffffee01575 in R_execClosure /tmp/R-devel/src/main/eval.c:2393:22 > #21 0x7ffffedfd06c in applyClosure_core /tmp/R-devel/src/main/eval.c:2306:16 > #22 0x7ffffee4edfd in Rf_applyClosure /tmp/R-devel/src/main/eval.c:2328:16 > #23 0x7ffffee4edfd in bcEval_loop /tmp/R-devel/src/main/eval.c:8089:16 > #24 0x7ffffedf86e4 in bcEval /tmp/R-devel/src/main/eval.c:7501:16 > #25 0x7ffffedf6e1f in Rf_eval /tmp/R-devel/src/main/eval.c:1167:8 > #26 0x7ffffee01575 in R_execClosure /tmp/R-devel/src/main/eval.c:2393:22 > #27 0x7ffffee00857 in R_execMethod /tmp/R-devel/src/main/eval.c:2562:11 > #28 0x7ffffac1c787 in R_dispatchGeneric /tmp/R-devel/src/library/methods/src/methods_list_dispatch.c:1154:19 > #29 0x7ffffef9fce3 in do_standardGeneric /tmp/R-devel/src/main/objects.c:1348:13 > #30 0x7ffffedf78a2 in Rf_eval /tmp/R-devel/src/main/eval.c:1264:9 > #31 0x7ffffee01575 in R_execClosure /tmp/R-devel/src/main/eval.c:2393:22 > #32 0x7ffffedfd06c in applyClosure_core /tmp/R-devel/src/main/eval.c:2306:16 > > previously allocated by thread T0 here: > #0 0x5555556201a3 in malloc (/usr/local/lib/R/bin/exec/R+0xcc1a3) (BuildId: 5f5cd5f376807c2106419d2502da92a4b967104f) > #1 0x7ffff2a2d40b in gmalloc /tmp/RtmpRFNn5A/R.INSTALLab7116d71997/Rgraphviz/src/graphviz/lib/common/memory.c:49:10 > #2 0x7ffff2a2d40b in zmalloc /tmp/RtmpRFNn5A/R.INSTALLab7116d71997/Rgraphviz/src/graphviz/lib/common/memory.c:27:10 > #3 0x7ffff276924b in new_virtual_edge /tmp/RtmpRFNn5A/R.INSTALLab7116d71997/Rgraphviz/src/graphviz/lib/dotgen/fastgr.c:167:9 > #4 0x7ffff2769c10 in virtual_edge /tmp/RtmpRFNn5A/R.INSTALLab7116d71997/Rgraphviz/src/graphviz/lib/dotgen/fastgr.c:201:22 > #5 0x7ffff276ee2b in class1 /tmp/RtmpRFNn5A/R.INSTALLab7116d71997/Rgraphviz/src/graphviz/lib/dotgen/class1.c:104:3 > #6 0x7ffff2763216 in dot_rank /tmp/RtmpRFNn5A/R.INSTALLab7116d71997/Rgraphviz/src/graphviz/lib/dotgen/rank.c:563:5 > #7 0x7ffff2739a8a in dot_layout 
/tmp/RtmpRFNn5A/R.INSTALLab7116d71997/Rgraphviz/src/graphviz/lib/dotgen/dotinit.c:263:9 > #8 0x7ffff29b6d5f in gvLayoutJobs /tmp/RtmpRFNn5A/R.INSTALLab7116d71997/Rgraphviz/src/graphviz/lib/gvc/gvlayout.c:90:2 > #9 0x7ffff29c410c in gvLayout /tmp/RtmpRFNn5A/R.INSTALLab7116d71997/Rgraphviz/src/graphviz/lib/gvc/gvc.c:90:5 > #10 0x7ffff273779a in Rgraphviz_doLayout /tmp/RtmpRFNn5A/R.INSTALLab7116d71997/Rgraphviz/src/doLayout.c:253:5 > #11 0x7ffffeca494c in R_doDotCall /tmp/R-devel/src/main/dotcode.c:757:11 > #12 0x7ffffecf26a2 in do_dotcall /tmp/R-devel/src/main/dotcode.c:1437:11 > #13 0x7ffffee44009 in bcEval_loop /tmp/R-devel/src/main/eval.c:8118:14 > #14 0x7ffffedf86e4 in bcEval /tmp/R-devel/src/main/eval.c:7501:16 > #15 0x7ffffedf6e1f in Rf_eval /tmp/R-devel/src/main/eval.c:1167:8 > #16 0x7ffffee01575 in R_execClosure /tmp/R-devel/src/main/eval.c:2393:22 > #17 0x7ffffedfd06c in applyClosure_core /tmp/R-devel/src/main/eval.c:2306:16 > #18 0x7ffffedf779f in Rf_applyClosure /tmp/R-devel/src/main/eval.c:2328:16 > #19 0x7ffffedf779f in Rf_eval /tmp/R-devel/src/main/eval.c:1280:12 > #20 0x7ffffee10215 in do_set /tmp/R-devel/src/main/eval.c:3567:8 > #21 0x7ffffedf71eb in Rf_eval /tmp/R-devel/src/main/eval.c:1232:12 > #22 0x7ffffee0beb2 in do_begin /tmp/R-devel/src/main/eval.c:2996:10 > #23 0x7ffffedf71eb in Rf_eval /tmp/R-devel/src/main/eval.c:1232:12 > #24 0x7ffffee01575 in R_execClosure /tmp/R-devel/src/main/eval.c:2393:22 > #25 0x7ffffedfd06c in applyClosure_core /tmp/R-devel/src/main/eval.c:2306:16 > #26 0x7ffffee4edfd in Rf_applyClosure /tmp/R-devel/src/main/eval.c:2328:16 > #27 0x7ffffee4edfd in bcEval_loop /tmp/R-devel/src/main/eval.c:8089:16 > #28 0x7ffffedf86e4 in bcEval /tmp/R-devel/src/main/eval.c:7501:16 > #29 0x7ffffedf6e1f in Rf_eval /tmp/R-devel/src/main/eval.c:1167:8 > #30 0x7ffffee01575 in R_execClosure /tmp/R-devel/src/main/eval.c:2393:22 > #31 0x7ffffee00857 in R_execMethod /tmp/R-devel/src/main/eval.c:2562:11 > #32 0x7ffffac1c787 in R_dispatchGeneric 
/tmp/R-devel/src/library/methods/src/methods_list_dispatch.c:1154:19 > > I ran the container on my mac as > > Rgraphviz devel% docker run --cap-add SYS_PTRACE -it -v $PWD:/work -w /work rocker/r-devel-ubsan-clang bash > > There is a lot of additional output but slack won’t let me paste it all in…

Kasper D. Hansen (18:59:15) (in thread): > Superb! Thanks a lot. Will explore soon.

Kasper D. Hansen (18:59:55) (in thread): > Interesting that this looks like relatively old (as in not changed in a long time) code

2025-03-24

Vince Carey (10:36:34) (in thread): > Barging in here, uninvited, but interested in UBSAN. Using the container on a mac with --platform=linux/amd64 and it seems fairly slow. Anyway, on installing Rgraphviz from source I had > > **** testing if installed package can be loaded from temporary location > refstr.c:79:29: runtime error: member access within null pointer of type 'refstr_t' (aka 'struct refstr_t') > SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior refstr.c:79:29 > **** checking absolute paths in shared objects and dynamic libraries > **** testing if installed package can be loaded from final location > refstr.c:79:29: runtime error: member access within null pointer of type 'refstr_t' (aka 'struct refstr_t') > SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior refstr.c:79:29 > **** testing if installed package keeps a record of temporary installation path > > Not sure if that was already observed. Now let's try the vignette.

Vince Carey (10:42:41) (in thread): > I get the same error at figbipartite. Let's see if we can fix it.

Vince Carey (10:52:18) (in thread): > Failing in agopen. I think I will move to an issue for Rgraphviz on GitHub.

Vince Carey (11:12:29) (in thread): > Certain facts are given here. I think we need to tell the graphviz folks about certain concerns, but also block the offending code in the vignette. I will try to have a PR soon.

Vince Carey (11:51:59) (in thread): > https://github.com/vjcitn/RgraphvizFIX/tree/VJCblock blocks the offending code. Maybe try on BBS to see?

Vince Carey (12:44:07) (in thread): > It seems to me that under UBSAN, Rgraphviz will not build. My process in container winds up with > > #29 0x7ffffedf6e1f in Rf_eval /tmp/R-devel/src/main/eval.c:1167:8 > #30 0x7ffffee01575 in R_execClosure /tmp/R-devel/src/main/eval.c:2393:22 > #31 0x7ffffee00857 in R_execMethod /tmp/R-devel/src/main/eval.c:2562:11 > #32 0x7ffffac1c787 in R_dispatchGeneric /tmp/R-devel/src/library/methods/src/methods_list_dispatch.c:1154:19 > #33 0x7ffffef9fce3 in do_standardGeneric /tmp/R-devel/src/main/objects.c:1348:13 > > SUMMARY: AddressSanitizer: 507246 byte(s) leaked in 32399 allocation(s). > > [2]- Exit 1 RD CMD build RgraphvizFIX >

Vince Carey (12:45:58) (in thread): > I will try in a different environment. I would love to understand how to leverage this information to improve the package. I suspect it will take some time.

Kasper D. Hansen (13:03:25) (in thread): > Yeah, so a few comments (I still haven't had time to look at this)

Kasper D. Hansen (13:05:19) (in thread): > The Graphviz code in the package is quite old by now. This is partly because the last time I updated the code, in 2012 (I think; might have been 1-2 years later), it was a massive effort, especially to create the Windows version. Now, due to changes by Tomas, it is much easier. So one approach is to rip the bandaid off, update Graphviz, and hope this fixes the problem. It may or may not, but it is not totally crazy to try.

Vince Carey (13:29:33) (in thread): > The vendored code is ~10yo? Didn’t think to check on that…

Kasper D. Hansen (13:32:41) (in thread): > Yes. Now, I am not sure how much has changed in that time wrt. the layout routines vs. the rendering (which we don't use). I definitely want to move to the new code base soon, but I wanted to preferably have a clean build first. Your PR fix (which just removes the code that causes the overflow) achieves that

Vince Carey (13:33:34) (in thread): > I hope so…. didn’t try on BBS.

2025-03-25

Vince Carey (09:57:19) (in thread): > It isn’t clear to me why bioc git does not have an update to Rgraphviz

Kasper D. Hansen (10:01:02) (in thread): > Thanks for reminding me

Kasper D. Hansen (10:03:43) (in thread): > done

Kasper D. Hansen (10:04:11) (in thread): > This also includes a patch from Andrew McDavid

2025-03-27

Changqing (04:20:25): > I kept getting this error from my GitHub Actions macOS instances: > > Warning: Warning: unable to access index for repository https://bioconductor.org/packages/3.21/workflows/bin/macosx/big-sur-arm64/contrib/4.5: > cannot open URL 'https://bioconductor.org/packages/3.21/workflows/bin/macosx/big-sur-arm64/contrib/4.5/PACKAGES' > > It seems to be on the latest release of BiocManager: > > Skipping install of 'BiocManager' from a cran remote, the SHA1 (1.30.25) has not changed since last install. > Use `force = TRUE` to force installation > > And R 4.5: > > BiocManager::install(version = "3.21", ask = FALSE, force = TRUE) > shell: /usr/local/bin/Rscript {0} > env: > has_testthat: true > run_covr: false > run_pkgdown: true > has_RUnit: false > cache-version: cache-v1 > R_REMOTES_NO_ERRORS_FROM_WARNINGS: true > RSPM: > NOT_CRAN: true > TZ: UTC > GITHUB_TOKEN: ***** > GITHUB_PAT: ***** > R_LIBS_USER: /Users/runner/work/_temp/Library > _R_CHECK_SYSTEM_CLOCK_: FALSE > XML_CONFIG: /usr/local/opt/libxml2/bin/xml2-config > 'getOption("repos")' replaces Bioconductor standard repositories, see > 'help("repositories", package = "BiocManager")' for details. 
> Replacement repositories: > CRAN: https://cran.rstudio.com > Bioconductor version 3.21 (BiocManager 1.30.25), R Under development (unstable) > (2025-03-10 r87922) > Old packages: 'edgeR', 'GenomeInfoDbData', 'limma', 'cluster', 'Matrix' > Warning: Warning: unable to access index for repository https://bioconductor.org/packages/3.21/workflows/bin/macosx/big-sur-arm64/contrib/4.5: > cannot open URL 'https://bioconductor.org/packages/3.21/workflows/bin/macosx/big-sur-arm64/contrib/4.5/PACKAGES' >

Andres Wokaty (09:19:18) (in thread): > Thanks for mentioning this. I’ll fix it.

David Rach (21:56:45): > Hi All, I remember seeing a discussion early last year about .qmd's almost being ready for vignettes on CRAN (vs regular .Rmd). I believe someone has since shown an example of a Bioconductor vignette done with .qmd, but I can't find the details (@Hervé Pagès?) > > My vignettes are currently all .Rmd, but I am primarily using .qmd for my work reports and websites, so I'm wondering where .qmd support for Bioconductor vignettes stands a year later. Was the previous issue the difference in the render/knit, or a BiocCheck note? > > Thanks, > David

2025-03-28

Laurent Gatto (03:58:25) (in thread): > See https://github.com/Bioconductor/BiocStyle/issues/114#issuecomment-2753619409 - Attachment: Comment on #114 support for quarto theme > Thanks @LiNk-NY for looking into this. Agreed it will be a useful addition. Any idea who's primarily responsible for BiocStyle now? Is it still @aoles? > > — > > Just a quick glance at the screenshot: do you know what's happening with the author and affiliation columns? They look a bit jumbled up there.

2025-03-31

Shian Su (19:57:36): > So I vibe-coded my way into a working minimap2 interface for R. Thoughts on whether this is a viable project? https://github.com/Shians/minimap2-ai-r

Shian Su (19:58:23): > I need to decide if I want to pay for Cursor since I’m now out of free tokens.:cry:

Kasper D. Hansen (20:02:34): > I don't know. I could see the appeal for teaching, for example. I have always aligned in a separate process.

Kasper D. Hansen (20:02:46): > I would look into whether people are using Rbowtie

Kasper D. Hansen (20:03:06): > That seems to me to have a similar potential use case

Kasper D. Hansen (20:03:56): > … so basically you should ignore my comment since I am clearly not the target audience, but you should look into Rbowtie

Dirk Eddelbuettel (20:21:32) (in thread): > Micro-comment from glancing at your DESCRIPTION: You do not need SystemRequirements: C++11 (unless you plan compilation on ancient R versions) because C++11 has been the minimum for several years and it is already C++14 with R 4.4.* (and maybe even R 4.3.*).

Shian Su (20:22:13): > Thanks for pointing me to Rbowtie; they actually have a very interesting approach of compiling the binary and using system.file to find the binary and system to call it. That's actually 1000x simpler than me trying to work out how to use minimap2's C interface (with heavy AI assistance). How does one smuggle a binary from the src area out into inst?

Shian Su (20:24:06) (in thread): > Thanks Dirk, the majority of this code is AI generated as an experiment over the weekend. I appreciate experienced eyes checking over it for obvious issues. My goal was simply to make it generate alignments identical to the output of the CLI tool.

Dirk Eddelbuettel (20:25:49) (in thread): > I always have to look it up, but WRE under R 4.4.3 says > > C++ standards: From version 4.0.0 R required and defaulted to > > C++11; from R 4.1.0 it defaulted to C++14 and from R 4.3.0 to C++17 > > (where available). For maximal portability a package should either > > specify a standard (*note Using C++ code::) or be tested under all of > > C++11, C++14 and C++17.

Shian Su (20:27:01) (in thread): > Is it preferable to specify C++17 or remove it entirely?

Dirk Eddelbuettel (20:27:36) (in thread): > As for the 'can I smuggle a binary in': it leads to the usual problem of 'will it run everywhere' etc., but I have an example for that too: time-series seasonal adjustment (e.g. in package seasonal) uses a binary provided by the US Commerce Department (so free for everybody) but shipped as a binary. We created a wrapper x13binary that seasonal depends upon. You could borrow that model. Both are on CRAN, and we have an R Journal paper on the approach.

Dirk Eddelbuettel (20:28:41) (in thread): > > Is it preferable to specify C++17 or remove it entirely? > The recommendation (and by now a check from R CMD check) is to remove it entirely unless you need a feature that only C++11 had, so just leaving it 'open' and 'relaxed' is best. The compiler knows what to do.
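
Dirk's recommendation amounts to deleting a single DESCRIPTION line. A hypothetical diff (the package name and surrounding fields are invented for illustration; only the removed line matters):

```diff
 Package: minimap2r
 Imports: Rcpp
 LinkingTo: Rcpp
-SystemRequirements: C++11
```

With the line gone, R CMD INSTALL simply compiles with whatever C++ standard the installed R version defaults to (C++17 from R 4.3.0).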

Shian Su (20:29:49) (in thread): > I have the full source code, so I'd be compiling the binary on the target machine; the only smuggling is from src, where it'd normally be built, into inst, and I'm not sure if there's a standard way to do that or if I just do some ../../inst/bin shenanigans in the Makefile.

Dirk Eddelbuettel (20:30:13) (in thread): > No, a local binary, made at install time, is fine. You control the path! Function system.file(..., package="mypackage") is your friend.

Dirk Eddelbuettel (20:32:01) (in thread): > (For seasonal we decided to decouple, as x13binary is released less often. So the binary, made at install or build time, lives there. seasonal calls it. Same thing otherwise.)

Shian Su (20:37:16): > In terms of use case, the first motivation is FLAMES, which currently has to grab minimap2 via basilisk to make the pipeline more user friendly in terms of installation. The second motivation is that I want to one day get into some real-time long-read processing using R, whereby reads streaming off a Nanopore sequencer can be processed in real time to compute statistics and make decisions. This is step 1 towards that, because through the C interface I can keep the built index in memory as batches of new reads stream in, which is essential for real-time processing.

Shian Su (20:38:44) (in thread): > Thanks, cp bin/minimap2 ../../inst/bin it is then; very straightforward.
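
Besides copying into inst/ from a Makefile, 'Writing R Extensions' describes a supported hook for exactly this: if src/install.libs.R exists, R CMD INSTALL sources it after compilation instead of the default shared-object copy, with variables such as R_PACKAGE_DIR, R_ARCH and SHLIB_EXT available. A minimal sketch (the minimap2 binary path is an assumption, and the package/binary names are placeholders):

```r
# src/install.libs.R -- sourced by R CMD INSTALL after 'make' finishes.
# R_PACKAGE_DIR is the package's installation directory.

# Replicate the default behaviour: copy any shared objects to libs/.
so <- Sys.glob(paste0("*", SHLIB_EXT))
libs <- file.path(R_PACKAGE_DIR, paste0("libs", R_ARCH))
dir.create(libs, recursive = TRUE, showWarnings = FALSE)
if (length(so)) file.copy(so, libs, overwrite = TRUE)

# Then install the freshly built standalone binary under <pkg>/bin.
bindir <- file.path(R_PACKAGE_DIR, "bin")
dir.create(bindir, showWarnings = FALSE)
file.copy("minimap2", bindir, overwrite = TRUE)
```

At run time the binary can then be located and invoked with something like `system2(system.file("bin", "minimap2", package = "mypackage"), args = c("-a", ref, reads))`, which avoids the `../../inst` relative-path shuffling entirely.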

Dirk Eddelbuettel (20:39:49) (in thread): > "In theory, yes. In practice there will be hiccups." Just kidding.

Shian Su (21:28:57) (in thread): > So far it seems to be working; it'd be cool to turn this into a near-zero maintenance package that just rebuilds with each release of minimap2 via GitHub Actions.

2025-04-02

Tim Triche (12:21:19) (in thread): > hey @Peter(Yizhou) Huang this could be helpful for you

Peter(Yizhou) Huang (12:21:23): > @Peter(Yizhou) Huang has joined the channel

Tim Triche (12:22:16) (in thread): > this is pretty slick! vibe coding for the win

Peter(Yizhou) Huang (13:46:01) (in thread): > I guess it would be super handy in the BAM-slicing case, where we slice target regions from genomic BAMs -> convert to fastq -> align against a transcript reference using minimap2, all together in the R environment.

Tim Triche (13:53:11) (in thread): > :100:

Shian Su (18:17:22) (in thread): > This was my first experience with agentic models that did multi-file edits with controllable context. I decided to try problems at 3 levels of difficulty. > > The first was just a simple implementation of an S4 class along with unit tests. It had errors in the unit tests because the coding LLM is still bad at maths and couldn't correctly calculate the number of rows in the matrix that it declared. Easy fix, and the rest was perfectly functional. Iteratively asking it to implement more methods and calculate internal invariants was fun, particularly invariants, because those need to be updated in any mutating method and are easy to miss when coding manually. > > The second attempt was to do something I had almost zero understanding of. I wanted a web-app todo list with a hierarchical task structure, dependencies and automation. It produced a surprisingly decent initial product, but the second it broke, I didn't have the technical skills to debug it and it became an unrecoverable mess. LLMs at this stage are not capable of effectively debugging their own code. > > So the third attempt was this. I know that it's absolutely possible in principle because we already have things like Rsamtools. I know that it shouldn't be overly complex because the code in the Python bindings looks quite simple. I had no idea how to actually build C code into a shared library to call with Rcpp. The biggest time sink was actually getting it all to compile on my ARM Mac. Minimap2 makes heavy use of SSE instructions from x86 and it took a lot of trial and error with the Makevars to get it to work. But it was great to have the LLM set up 95% of the code so I could focus on the hard problems. It also required that I had already wasted sufficient portions of my life dealing with compiler errors to resolve the issues that came up. > > Overall 8 out of 10 :robot_face:. Would vibe code again.

Shian Su (18:19:58) (in thread): > I’m not going to sign up to Cursor, since I already have a GitHub copilot subscription and VS Code already has these features in their preview release.

2025-04-04

Kasper D. Hansen (20:47:53): > To me, it sounds like you have some compelling use cases. Another approach would be to use basilisk to install minimap2 via conda.

Kasper D. Hansen (20:49:04): > Most importantly, you’re building this for yourself

2025-04-09

Felix Lisso (04:02:42): > @Felix Lisso has joined the channel

2025-04-14

Robert Castelo (11:08:44): > Hi, one of my packages, gDNAx, is failing only on Windows (link) with the following error, which seems to be prompted by some problem when accessing data downloaded from ExperimentHub, hosted in my own server: > > * checking for file 'gDNAx/DESCRIPTION' ... OK > * preparing 'gDNAx': > * checking DESCRIPTION meta-information ... OK > * installing the package to build vignettes > * creating vignettes ... ERROR > --- re-building 'gDNAx.Rmd' using rmarkdown > [E::hts_idx_load3] Could not load local index file 'E:\biocbuild\bbs-3.21-bioc\tmpdir\RtmpOeGSKm/s32gDNA0.bai' : Invalid argument > [E::hts_idx_load3] Could not load local index file 'E:\biocbuild\bbs-3.21-bioc\tmpdir\RtmpOeGSKm/s32gDNA0' : No such file or directory > > Quitting from gDNAx.Rmd:100-106 [unnamed-chunk-3] > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > <error/rlang_error> > Error in `h()`: > ! error in evaluating the argument 'obj' in selecting a method for function 'unname': isOpen(<BamFile>) is not 'TRUE' > --- > Backtrace: > ▆ > 1. ├─gDNAx::gDNAdx(bamfiles, txdb) > 2. │ └─gDNAx:::.checkPairedEnd(bfl, singleEnd, BPPARAM) > 3. │ ├─S4Vectors::unname(vapply(bfl, testpe, FUN.VALUE = logical(1L))) > 4. │ └─base::vapply(bfl, testpe, FUN.VALUE = logical(1L)) > 5. │ └─gDNAx (local) FUN(X[[i]], ...) > 6. │ ├─base::close(bf) > 7. │ └─Rsamtools::close.BamFile(bf) > 8. │ └─base::stop("isOpen(<BamFile>) is not 'TRUE'") > 9. └─base::.handleSimpleError(...) > 10. 
└─base (local) h(simpleError(msg, call)) > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > Error: processing vignette 'gDNAx.Rmd' failed with diagnostics: > error in evaluating the argument 'obj' in selecting a method for function 'unname': isOpen(<BamFile>) is not 'TRUE' > --- failed re-building 'gDNAx.Rmd' > > I bumped the version already to see whether simply rebuilding would clean up the error, but today I see that it doesn't. Since it builds on every other platform, I assumed this must be a Windows-specific problem, but in the next devel build, Windows does not show that problem, although then Linux shows a similar one with one other file among the ones downloaded through the ExperimentHub. Release 3.20 has been happily building through the entire release cycle. Let me know if you have any hint that I could apply to the last commit before release :sweat_smile: or any other more long-term clues that I could try to follow, thanks!

2025-04-15

Rasmus Hindström (07:07:53): > @Rasmus Hindström has joined the channel

2025-04-29

Jeroen Ooms (13:16:25): > There are currently ~200 bioc software packages that have a hard dependency (e.g. Imports:) on an annotation/experiment data package: see this script for a list. Note that such a dependency makes it more difficult to install the package, and also such data packages are not available for e.g. WebAssembly. If such a dependency is not strictly required (only used for e.g. certain examples/vignettes) you can make your software package more portable and easier to install by moving this dependency into a Suggests:.
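
The usual pattern for a Suggests-only data package is to guard each use at run time with requireNamespace(). A sketch (the data package name and its accessor function are made up; substitute your own):

```r
# DESCRIPTION would list the data package under Suggests:
#   Suggests: SomeExperimentData

# In R code and examples, fail gracefully when it is absent:
getDemoData <- function() {
  if (!requireNamespace("SomeExperimentData", quietly = TRUE)) {
    stop("Package 'SomeExperimentData' is needed for this example; ",
         "install it with BiocManager::install('SomeExperimentData')")
  }
  SomeExperimentData::loadDemo()  # hypothetical accessor
}
```

Vignette chunks can use the same guard, e.g. `eval = requireNamespace("SomeExperimentData", quietly = TRUE)` in the chunk options, so the vignette still builds when the data package is unavailable.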