#hubmapr

2024-04-12

Stephanie Hicks (11:46:03): > @Stephanie Hicks has joined the channel

Christine Hou (11:46:19): > @Christine Hou has joined the channel

Stephanie Hicks (11:47:45): > set the channel description: Informal research group working towards creating a Bioconductor API to access HuBMAP data

Federico Marini (11:47:52): > @Federico Marini has joined the channel

Martin Morgan (11:47:57): > @Martin Morgan has joined the channel

Shila Ghazanfar (11:48:01): > @Shila Ghazanfar has joined the channel

Stephanie Hicks (11:49:22): > hi folks! I wanted to connect us all together in a channel to start conversations about working towards creating a Bioconductor API to access HuBMAP data.

Stephanie Hicks (11:50:48): > @Christine Hou is a master’s student working in my group who joined a few months back, but has been doing some self-reading the last few months about the hca and cellxgenedp packages until I get more bandwidth to meet regularly. I’m happy to say that we just started to meet regularly!

Stephanie Hicks (11:52:48): > I shared with Christine how @Federico Marini @Shila Ghazanfar @Martin Morgan had thought about this idea previously with HuBMAPR and she has been tinkering with these ideas the last few weeks while also focused on understanding how the previous packages worked.

Stephanie Hicks (11:53:20): > @Christine Hou please feel free to introduce yourself and thank you for joining this project!

Christine Hou (11:54:35): > @Stephanie Hicks Thanks for the introduction and for creating the Slack channel for us! It is my pleasure to work on this project :slightly_smiling_face:! > > Hi! I am Christine Hou, a 1st year graduate student in the Biostatistics ScM program at the JHU Bloomberg School of Public Health. Nice to see you all here! The hca and cellxgenedp packages are really impressive, and I learned a lot from these two packages!

Stephanie Hicks (13:46:36): > Thanks! also, I just want to add that I’ve asked Christine to create a branch in HuBMAPR to start pushing code as she works on the project. Let me know if anyone has concerns with that.

2024-04-15

Federico Marini (14:57:05): > Hi :slightly_smiling_face: Here’s Federico, long time user, slightly less long time developer, head of a Bioinformatics group that cannot let go of hands-on development :slightly_smiling_face:

2024-05-27

Aedin Culhane (21:16:16): > @Aedin Culhane has joined the channel

2024-06-23

Christine Hou (15:19:03): > Hi! I am Christine. > I am here to ask a question regarding the HuBMAP API. Currently I am exploring some of the available APIs, and I am very confused about the HuBMAP Ingest API. I followed the instructions and tried some links, but I obtained > > Error in curl::curl_fetch_memory(url, handle = handle) : > SSL peer certificate or SSH remote key was not OK: [uuid.hubmapconsortium.org] SSL: no alternative certificate subject name matches target host name 'uuid.hubmapconsortium.org' > > I am totally new to this error, so I wondered whether anyone has met a similar error message before. Any hint/instruction to help solve it will be greatly appreciated. Thanks! :slightly_smiling_face: - Attachment (smart-api.info): SmartAPI > The SmartAPI project aims to maximize the FAIRness (Findability, Accessibility, Interoperability, and Reusability) of web-based Application Programming Interfaces (APIs)
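
The error quoted above means that none of the names listed in the server's certificate matched the host name the client asked for. A minimal sketch of that check, written in Python purely for illustration (the project itself is R): `hostname_matches` and `certificate_covers` are hypothetical helpers, the matching rule is a simplified form of what TLS libraries apply, and the example name lists are invented, not taken from the real certificate.

```python
from typing import Iterable


def hostname_matches(hostname: str, pattern: str) -> bool:
    """Simplified check of one certificate name against a host name.

    A "*" is honoured only as the entire leftmost label and stands for
    exactly one label; everything else must match case-insensitively.
    """
    host_labels = hostname.lower().rstrip(".").split(".")
    pat_labels = pattern.lower().rstrip(".").split(".")
    if len(host_labels) != len(pat_labels):
        return False
    if pat_labels[0] == "*" and len(pat_labels) > 2:
        # Wildcard: compare everything to the right of the first label.
        return host_labels[1:] == pat_labels[1:]
    return host_labels == pat_labels


def certificate_covers(hostname: str, names: Iterable[str]) -> bool:
    """True if any name listed in the certificate covers `hostname`."""
    return any(hostname_matches(hostname, name) for name in names)


# Invented name list: a certificate issued for a different domain
# does not cover the requested host, so verification fails.
print(certificate_covers("uuid.hubmapconsortium.org", ["*.example.org"]))
```

When the check fails for a host you do not control, the certificate has to be fixed on the server side, or the client has to use a host name the certificate does cover.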

2024-08-09

Christine Hou (14:35:30): > @channel Hi all! I want to share the good news that I have almost finished writing the HuBMAPR package. I created a new repo on my GitHub to push new code, and just invited @Federico Marini @Shila Ghazanfar via GitHub. Can you accept the invitations :slightly_smiling_face:? If I missed someone, please tell me. > > I really hope to hear some feedback from all of you to build a better package. You can comment via issues to offer any suggestion/comment. Looking forward to reading them! Thanks.

Shila Ghazanfar (22:30:58) (in thread): > wonderful Christine, thanks for doing so, will do

2024-08-11

Federico Marini (09:09:50) (in thread): > Done!

Federico Marini (09:10:12) (in thread): > Brief response since I am on holiday but will look into it as soon as i am back! > Thanks for inviting us!

2024-08-12

Shila Ghazanfar (00:58:11) (in thread): > @Christine Hou i just posted an issue finding a bug with a particular dataset UUID… i can push the necessary change to the function and namespace if you like.. take a look and let me know thanks!

Christine Hou (08:43:55) (in thread): > @Shila Ghazanfar Thank you so much for checking the errors. I had not even noticed these kinds of small details. Just pushed the new code!

Shila Ghazanfar (19:54:01) (in thread): > All good christine, ill flag anything else i see

2024-08-14

Martin Morgan (15:42:16): > For what it’s worth I spent a bit of time developing rglobus for interacting with Globus. Once you’ve used Christine’s amazing package to get a HuBMAP dataset id, you could use rglobus to download some or all of the data from within R, without using a browser, so your workflow would be fully scripted / reproducible.

Christine Hou (15:59:24) (in thread): > I just read through this package! Thank you so much for developing such an amazing package to transfer Globus files to a local computer. I am not sure whether it would be a better idea to combine both packages, i.e. to replace files.R with your R scripts. But I am totally fine with separate packages working on different aspects. :slightly_smiling_face:

Martin Morgan (16:17:43) (in thread): > I started the package as a function in HuBMAPR but it got too complicated! And I thought about my colleagues who use Globus but not HuBMAP, so I think it’s better as a stand-alone package. > > Probably there are light-weight convenience functions that could go in HuBMAPR so that the user doesn’t need to know details (e.g., about how to discover the HuBMAP Public collection).

Christine Hou (16:25:11) (in thread): > I agree! I think it would be much better to keep them separate to avoid over-complication. What’s more, files.R can also be kept in the HuBMAPR package to let users know whether the dataset is open access via Globus or restricted. > > We can update the vignettes in HuBMAPR to let users know that the rglobus package can help to download files within R. What do you think about this?

Martin Morgan (16:37:15) (in thread): > yes updating the vignette sounds like a good solution.

2024-09-15

Stephanie Hicks (06:14:30): > hi folks! I hope your september is going well. Christine has been hard at work on this package and I wanted to mention it is now under review in the contributions queue (https://github.com/Bioconductor/Contributions/issues/3532) - Attachment: #3532 HuBMAPR > Update the following URL to point to the GitHub repository of
> the package you wish to submit to Bioconductor > > • Repository: https://github.com/christinehou11/HuBMAPR

Stephanie Hicks (06:18:35): > There is also a pkgdown website helping to describe the package (https://christinehou11.github.io/HuBMAPR/). - Attachment (christinehou11.github.io): Interface to HuBMAP > HuBMAP provides an open, global bio-molecular atlas of the human body at the cellular level. The datasets(), samples(), donors(), publications(), and collections() functions retrieves the information for each of these entity types. *_details() are available for individual entries of each entity type. *_derived() are available for retrieving derived datasets or samples for individual entries of each entity type. Data files can be accessed using files_globus_url().

Stephanie Hicks (06:19:44): > On a related note, we are almost through drafting a manuscript to submit to Bioinformatics Applications Note (2-3 pages, 1 figure). I’ve taken a close pass through it, but would like Christine’s help with a few small things. Once that is complete, we anticipate sending you all the manuscript on Overleaf. I want to put this on your radar as I assume it will happen in the next week.

Stephanie Hicks (06:20:28): > Feedback is welcome on the package and I will loop back here soon to ask for your edits/suggestions/comments on the manuscript. Thank you all!

Christine Hou (12:49:56): > Thanks for your help @Stephanie Hicks! Really appreciate your review and comments on the manuscript. > I just finished revising some details, and I think the manuscript is almost ready to be shared with all of you! > I would be really grateful if you could all take a quick look and give some feedback. If @Martin Morgan can help with the details in the Methods section, it will be perfect. > Thanks for your time!

2024-09-16

Martin Morgan (07:19:49) (in thread): > happy to provide feedback when I have access to the manuscript; I might not be very responsive over the next week or so.

Stephanie Hicks (17:47:49): > hi all! Apologies on my slow response. Here is the manuscript (https://www.overleaf.com/6137149132fcnwrchqwqbw#06488b) and I’ve put it in the HuBMAPR folder at the top of this channel - Attachment (overleaf.com): Overleaf, Online LaTeX Editor > An online LaTeX editor that’s easy to use. No installation, real-time collaboration, version control, hundreds of LaTeX templates, and more.

Stephanie Hicks (17:47:58) (in thread): > done!:slightly_smiling_face:

2024-09-18

Federico Marini (04:34:47): > sorry for the silence here on my end. > Fantastic work @Christine Hou for wrapping it up! I will go in with a full round of review “parallel” to the official bioc one in case I spot something

Shila Ghazanfar (07:37:12): > Just read through the MS, excellent work @Christine Hou! made some small edits, overall is reading nicely – you may be asked about how the data gets downloaded and how to arrange especially with many files, whether it caches, etc.. i think this all goes to rglobus but may be worth addressing more head-on in the text

2024-09-19

Federico Marini (10:03:00): > I did a full run on my machine and did find a test not passing, don’t know if this is expected. After all, it does build on the BBS, so I can’t judge too much on the reasons. > Maybe it is worth setting up a full spectrum of CI/CD for testing on all main types of OSs? Happy to share a good yml that we are currently using for all our latest packages

Federico Marini (10:05:43): - File (PNG): image.png

Federico Marini (10:06:13): > (which is fixed by setting the expected value to 0, but maybe we can simply use another uuid? Also the one from the example has 0 rows returned in my instance)

Christine Hou (10:51:39) (in thread): > It looks really weird because the package should have passed all tests at the time I submitted it to Bioconductor for review; otherwise the package would not have been able to continue to the peer review step. I will double check later.

Federico Marini (10:52:15) (in thread): > I could not agree more with that, which is why I was very surprised it gave an error on my machine

Federico Marini (10:52:45): > meanwhile: GHA for check+bioccheck are all set and running

Federico Marini (10:52:55): > and they recently gave this error -> https://github.com/christinehou11/HuBMAPR/actions/runs/10943280275/job/30382302613

Federico Marini (10:53:18): > which is something that can probably happen randomly if the APIs are not responding correctly/in time

Federico Marini (10:53:37): > I am sure it will go away if I re-run this

Federico Marini (10:54:56): > we can also probably add a few biocViews more to reflect the size/scope of what we are fetching?

Christine Hou (10:54:57): > The API does not work very well sometimes even when I try to retrieve something else outside the package

Christine Hou (10:55:38) (in thread): > Do you have any idea? Feel free to add if you have some :slightly_smiling_face:

Federico Marini (10:55:51): > yeah, I had that fear as well. Don’t know if it is worth adding some kind of fallback mechanisms that do not trigger the “hard error” that would then lead to have the package fail in building/checking
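
A minimal sketch of the fallback idea raised above, written in Python purely for illustration (the package itself is R). `fetch_or_none` and `make_flaky` are hypothetical names, not part of HuBMAPR or any HuBMAP client. The idea shown is to retry a flaky call a few times with exponential backoff and, if it still fails, return `None` so the caller can skip instead of hitting a hard error.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def fetch_or_none(
    fetch: Callable[[], T],
    retries: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[T]:
    """Call `fetch` up to `retries` times; return None instead of raising."""
    for attempt in range(retries):
        try:
            return fetch()
        except (ConnectionError, TimeoutError):
            # Transient network failure: back off exponentially and retry,
            # but do not sleep after the final attempt.
            if attempt < retries - 1:
                sleep(base_delay * 2 ** attempt)
    return None


def make_flaky(fail_times: int, value: str) -> Callable[[], str]:
    """Hypothetical stand-in for a remote API that fails `fail_times` times."""
    calls = {"n": 0}

    def fetch() -> str:
        calls["n"] += 1
        if calls["n"] <= fail_times:
            raise ConnectionError("service not responding")
        return value

    return fetch


# Two failures followed by a success are absorbed by the retries.
result = fetch_or_none(make_flaky(2, "ok"), sleep=lambda seconds: None)
print(result)
```

A test or vignette chunk that receives `None` can then be skipped with a message rather than failing the whole build or check.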

Federico Marini (11:00:11) (in thread): > DataImport, ThirdPartyClient, Spatial, Infrastructure would be my first best guesses

Federico Marini (11:00:39) (in thread): > it is not a mandatory thing but it would help others see this “passively” when they search for those views

Federico Marini (11:00:47) (in thread): > and add accordingly into the vignette, I guess

Federico Marini (11:08:03) (in thread): > (I mean, hubmap is so vast one cannot catch them all. or it could, but then it is more like a “fake information” if we tick all boxes)

Federico Marini (11:15:33) (in thread): > aaaand it is gone, all:white_check_mark:

Christine Hou (11:36:00): > I edited the query language, and all tests passed! I also edited the DESCRIPTION file to include more in biocViews. - File (PNG): Screenshot 2024-09-19 at 11.35.35.png

Christine Hou (11:36:36): > Thanks for pointing out the error! @Federico Marini Very helpful!

Federico Marini (11:37:13) (in thread): > there might be one or two conflicts in the PR coming in from my branch

Federico Marini (11:37:18) (in thread): > I’ll fix that beforehand!

Christine Hou (11:38:10) (in thread): > I will first push the changes to the GitHub!

Christine Hou (11:38:23) (in thread): > Then you can have more checks based on the new version

Federico Marini (11:38:30) (in thread): > k, just ping me once it is there

Federico Marini (11:38:50) (in thread): > I will solve the conflicts after dinner, and then move on to the manuscript

Christine Hou (11:38:59) (in thread): > After you finish the checks, I will push to bioconductor remote

Federico Marini (11:39:09) (in thread): > Ok, will give you a greenlight

Federico Marini (11:39:55): > One thought on a thing I probably would like to see in the vignette somewhere - what format is the data that one would download via globus?

Federico Marini (11:40:06): > i.e. “how Bioc-ready” would that be?

Federico Marini (11:40:44): > Given the broad spectrum of data types it includes, it might be hard to make a clear statement, but probably worth doing a mention somewhere

Federico Marini (11:41:22): > (biased view: If it would be all SE-like, it would be half-a-breeze to set up a full instance of iSEEindex to have it browsable in iSEE, yummy yummy.:smile:)

Federico Marini (11:41:54): > ((this with a new feature we have currently coming up in devel where we can use a call to an r function to have an object loaded/loadable))

Christine Hou (12:01:27): > The data types are broad, and the Globus folder for each dataset uuid does not include only raw data, but also some reports, analyses, images, or even some unrecognized data types (with names like 0 1 2 3 …). That’s why I mentioned in the manuscript that the Globus online website helps users to “preview and download raw data products, downstream analysis reports, metadata files, and visualizations.” I think I can add this sentence to the vignettes. I think it is enough to mention it briefly near the end because each dataset uuid has totally different file components.

Christine Hou (12:22:49): > @Federico Marini just pushed the new code!

Federico Marini (16:17:40) (in thread): > agree on the line you propose.

Federico Marini (16:18:02) (in thread): > which is a bit of a pity, because this makes it “diverge” a bit from e.g. the cellxgenedp package

Christine Hou (16:39:30) (in thread): > Feel free to have more checks! Just message me once you finish and I will give a bump up and push new codes to bioconductor repo! Thanks for your help!

Federico Marini (18:14:26) (in thread): > Ok, I did solve the conflict

Federico Marini (18:14:41) (in thread): > and while at it, I took some time to extend even more the test coverage:wink:

Federico Marini (18:19:47) (in thread): > all ready for you @Christine Hou https://github.com/christinehou11/HuBMAPR/pull/14 - Attachment: #14 Testfix gha > • Fixing the tests > • setting up Github actions for full check > • extra biocViews > • news slightly tweaked (got a note in BiocCheck..)

Federico Marini (18:19:55) (in thread): > (version’s bumped too)

Federico Marini (18:20:13) (in thread): > merge that one in and push upstream anytime

Christine Hou (18:54:55) (in thread): > Successfully merged! Thanks for your help! The Bioconductor check also passed and our package is now waiting for review.

Stephanie Hicks (21:25:44): > Thank you both so much @Federico Marini @Christine Hou for making these changes today!

2024-09-20

Federico Marini (10:26:27): > YVW :slightly_smiling_face: Sorry for having you wait a bit on my round of feedback, was away+sick afterwards

2024-09-21

Federico Marini (11:41:52): > I am done with my pass on the manuscript. Most comments are in and I see @Christine Hou is already on it :slightly_smiling_face:

Federico Marini (11:42:22): > it is very well written already, so I could focus on the tiny details where devils might hide

Federico Marini (11:42:40): > but overall: It reads very well and captures nicely the work you did!

Christine Hou (11:44:27): > @Federico Marini Thanks for your time and efforts! Really appreciate your work on both manuscript and package development!

Federico Marini (11:45:32): > my pleasure :slightly_smiling_face: thank you for acknowledging this, you did really most of it AND beautifully

Federico Marini (11:45:49): > (I, for my side, am very annoyed i cannot fix the hyperref conflict:grimacing:)

Christine Hou (11:47:39): > Did you add anything specific that caused the conflict?

Federico Marini (11:54:05): > nah, it is more like, we load it, but the document class does that too

Christine Hou (11:55:53): > Not sure from my side. I am also a beginner in Overleaf lol

Christine Hou (11:57:03): > Can you clarify the “license” you commented on in the data availability section? Not sure what it is. It would be perfect if you could give an example

Federico Marini (12:02:48): > we have Artistic-2.0

Federico Marini (12:02:52): > we can simply say that

Federico Marini (12:03:06): > like literally the software license we are using for the pkg

Christine Hou (12:03:28): > oh i misunderstood. That is clear for me now.

2024-09-22

Federico Marini (05:06:56): > re: latex -> I call it the sunday morning win

Federico Marini (05:06:59): - File (PNG): image.png

Federico Marini (05:07:10): - File (PNG): image.png

Federico Marini (05:07:11): > :tada:

Federico Marini (05:09:09): > (source of the solution: https://stackoverflow.com/questions/71263063/latex-error-option-clash-for-package-hyperref) - Attachment (Stack Overflow): LaTeX Error: Option clash for package hyperref