#biocpython
2023-05-05
Kozo Nishida (00:19:50): > @Kozo Nishida has joined the channel
Vince Carey (00:20:55): > @Vince Carey has joined the channel
Isaac Virshup (00:20:55): > @Isaac Virshup has joined the channel
Jayaram Kancherla (00:20:55): > @Jayaram Kancherla has joined the channel
Johannes Rainer (00:21:36): > @Johannes Rainer has joined the channel
Martin Morgan (00:34:37): > @Martin Morgan has joined the channel
Charlotte Soneson (00:39:41): > @Charlotte Soneson has joined the channel
Kozo Nishida (00:45:59): > First I am interested in sharing Bioconductor’s data resources (== AnnotationHub, ExperimentHub) with Python. > Can data from *Hub already be imported into Python?
Lori Shepherd (00:47:12): > @Lori Shepherd has joined the channel
Jayaram Kancherla (01:34:25): > that is exactly why we started the BiocPy effort. We have a system called ArtifactDB that stores genomic data in language-agnostic formats and allows analysts to easily transition between languages. In addition, we developed the rds2py package, which can directly read serialized rds files in python. We might need to add translators and support more classes, but it currently demonstrates how one can read MAEs, SEs and SCEs
Tuomas Borman (02:53:04): > @Tuomas Borman has joined the channel
Kayla Interdonato (05:30:30): > @Kayla Interdonato has joined the channel
Stephanie Hicks (07:04:35): > @Stephanie Hicks has joined the channel
Isaac Virshup (07:11:28) (in thread): > Depends on the data. Some is in serialized R objects, some is in text based files, some may even be custom text based files. > > In genomic-features we have some code for looking through the files here: https://github.com/scverse/genomic-features/blob/main/src/genomic_features/ensembl/ensembldb.py#L64-L112 But you can also do this from the annotationhub web interface
irem Kahveci (07:12:26): > @irem Kahveci has joined the channel
Brian Schilder (08:46:54): > @Brian Schilder has joined the channel
Brian Schilder (09:05:01): > Hi everyone, excited to see lots of discussion on this. A couple of thoughts: > * I’m the maintainer of the rworkflows R package + GitHub Action. I’ve thought for a while it would be great to have an equivalent of this but for python, that could install, test, document and containerise any python package through CI. The cookiecutter-scverse workflow seems to be a step in the right direction, though it would be even better to see an implementation that relies on a centrally maintained GH action (which would make it much easier to distribute fixes/upgrades to all users of that workflow). This is the strategy that rworkflows adopts, such that users of the rworkflows action only have to supply a short workflow script that calls the action with certain arguments selected. > * Perhaps also related, I created the R package echoconda to make interfacing with python/conda environments as easy as possible. It works via either reticulate and/or basilisk, so that it can cover more use cases. > * I’m also a huge fan of anndata, both in its python and R implementations. I’ve extended some of anndata’s functionality in my other R package scKirby which aims to ingest/convert any single-cell object format, including from python to R (and vice versa). > Just wanted to post these here in case anyone finds them helpful. Happy to answer any questions, or if anyone has feedback on how to improve these projects.
Gregor Sturm (09:37:46): > @Gregor Sturm has joined the channel
Gregor Sturm (09:41:26) (in thread): > > though it would be even better to see an implementation that relies on a centrally maintained GH action (which would make it much easier to distribute fixes/upgrades to all users of that workflow) > Note that we have a mechanism in place to automatically create pull requests to all repos using the template should an update be required.
Danila (09:51:58): > @Danila has joined the channel
Frederick Tan (09:55:34): > @Frederick Tan has joined the channel
Martin Morgan (10:05:06) (in thread): > @Brian Schilder a recent scverse hackathon activity tried to unite the current ‘n’ AnnData interfaces in R to a single one; it is probably still at the ‘n + 1’ stage, but still has momentum and it might be worthwhile to make sure that scKirby’s needs are being met so that scKirby does not provide an ‘n + 2’ (I didn’t know about scKirby beforehand…) solution :wink:. The github repository is currently at https://github.com/scverse/anndataR. Contact Robrecht Cannoodt to participate more.
Brian Schilder (10:08:56) (in thread): > very cool, hadn’t heard of this! i’ll be sure to reach out to Robrecht. > currently scKirby heavily relies on this anndata R package, so it’s great to hear there’s been further development on this
Robert Shear (13:30:15): > @Robert Shear has joined the channel
Steve Lianoglou (14:06:42): > @Steve Lianoglou has joined the channel
Mark Keller (15:30:24): > @Mark Keller has joined the channel
2023-05-08
Sebastian Lobentanzer (08:55:43): > @Sebastian Lobentanzer has joined the channel
2023-05-10
Thiago Britto-Borges (08:42:08): > @Thiago Britto-Borges has joined the channel
2023-05-11
Noriaki Sato (03:28:24): > @Noriaki Sato has joined the channel
2023-05-12
Aljes Binkevich (07:35:38): > @Aljes Binkevich has joined the channel
2023-05-17
Hassan Kehinde Ajulo (11:55:47): > @Hassan Kehinde Ajulo has joined the channel
2023-06-09
Aedin Culhane (17:25:01): > @Aedin Culhane has joined the channel
2023-06-21
Vince Carey (06:01:48): > This channel has gotten a little stale – I was looking for a python channel, but I hope we can discuss both the specifics of BiocPy by @Jayaram Kancherla and general concerns of python interoperation here.
Vince Carey (06:03:17): > On a recent TAB call we discussed interop via basilisk in relation to a relatively recent Config/reticulate field in DESCRIPTION as noted under “Format” at https://rstudio.github.io/reticulate/articles/python_dependencies.html. - Attachment (rstudio.github.io): Managing an R Package’s Python Dependencies > reticulate
Vince Carey (06:12:31): > One question that arose concerned the weight of basilisk’s retention of version-specific conda environments for basilisk-using packages. I recently experimented with upgrading the version of hail used in BiocHail and noted > > 1.3G 1.0.0 > 0 1.0.0-00LOCK > 4.0K 1.0.0_dir.expiry > 0 1.0.0_dir.expiry-00LOCK > 1.2G 1.1.1 > 0 1.1.1-00LOCK > 4.0K 1.1.1_dir.expiry > 0 1.1.1_dir.expiry-00LOCK >
> This is the result of using du -sh * in the folder in .cache/R/… where basilisk installs miniconda. Folder 1.0.0 corresponds to BiocHail 1.0.0, 1.1.1 to BiocHail 1.1.1. Let’s ignore the fact that 1.1.1 would be devel and 1.0.0 is release. Each version of the basilisk-dependent package gets its own python interpreter and runtime.
Vince Carey (06:12:50): > The associated environment spec is > > bsklenv <- basilisk::BasiliskEnvironment( > envname = "bsklenv", packages = "pandas==1.3.5", > pkgname = "BiocHail", pip = c("hail==0.2.118", "ukbb_pan_ancestry==0.0.2") > ) >
Vince Carey (06:13:58): > This is not a complaint, but an indication for a need to manually remove obsolete python infrastructure as package versions mature. I do not know how the Config/reticulate approach works and some information on this would be welcome.
Jayaram Kancherla (10:31:13): > conda clean --all
is usually what i do to remove unused packages across all my conda environments. I don’t see an explicit function from reticulate to perform this but that should help clean up some of these files.
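To see which version-specific environments are taking up space before deleting anything, a per-directory size summary similar to the `du -sh *` listing above can be sketched in Python. This is only an illustration: `summarize_cache` and `dir_size_bytes` are hypothetical helper names, the cache location has to be supplied by the caller (the exact path is not given in this thread), and sizes are apparent file sizes rather than the disk blocks that `du` reports.

```python
import os
from pathlib import Path


def dir_size_bytes(path):
    """Sum the sizes of regular files under `path`, skipping symlinks."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = Path(root) / name
            if not file_path.is_symlink():
                total += file_path.stat().st_size
    return total


def summarize_cache(cache_dir):
    """Map each immediate subdirectory of `cache_dir` to its size in bytes.

    Plain files at the top level (for example lock or expiry files)
    are ignored; only directories are reported.
    """
    cache_dir = Path(cache_dir)
    return {
        child.name: dir_size_bytes(child)
        for child in sorted(cache_dir.iterdir())
        if child.is_dir()
    }
```

Calling `summarize_cache(...)` on the folder that holds the per-version environments returns a dict keyed by version directory, which makes it easier to decide which obsolete environments to remove by hand.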
Andres Wokaty (14:33:08): > @Andres Wokaty has joined the channel
2023-06-22
Thiago Britto-Borges (11:32:13): > Hi channel, > I’m interested in helping to translate/port some bioc GRanges verbs to Python.
Jayaram Kancherla (15:52:02) (in thread): > thanks for reaching out on twitter. github.com/biocpy is where I’ve been spending chunks of my time to build some of these core data structures and representations. Feel free to reach out if you are overwhelmed :slightly_smiling_face:
2023-06-29
ChiaSin (12:51:14): > @ChiaSin has joined the channel
2023-06-30
Aaron Lun (13:20:26): > @Aaron Lun has joined the channel
Aaron Lun (13:21:09): > basilisk has bumped its default versions of python and miniconda. Downstream authors should not be affected if they pinned their required versions of python, otherwise they may see some resolution errors.
2023-07-13
Brian Schilder (07:01:51): > @Brian Schilder has joined the channel
2023-07-24
Ludwig Geistlinger (10:59:17): > @Ludwig Geistlinger has joined the channel
Davide Risso (10:59:37): > @Davide Risso has joined the channel
Pedro Sanchez (11:09:41): > @Pedro Sanchez has joined the channel
Alan O’C (13:10:35): > @Alan O’C has joined the channel
Peter Hickey (17:31:36): > @Peter Hickey has joined the channel
2023-07-25
Luke Zappia (02:47:20): > @Luke Zappia has joined the channel
2023-07-27
Jacques SERIZAY (19:29:02): > @Jacques SERIZAY has joined the channel
2023-08-02
Beth Cimini (08:20:15): > @Beth Cimini has joined the channel
2023-08-04
Trisha Timpug (09:34:52): > @Trisha Timpug has joined the channel
2023-08-11
Aaron Lun (18:45:30): > now with a logo - File (PNG): image.png
2023-08-14
Vince Carey (22:49:15): > How can we bring the biocpy packages into the Bioc build and check framework? One speculation is that the biocpy packages have quarto documents as vignettes and these are checked by BiocCheck. @Marcel Ramos Pérez @Jayaram Kancherla – does this make sense?
Jayaram Kancherla (23:20:34): > The current set of packages use pyscaffold to bootstrap the package setup process. pyscaffold uses tox under the hood to build, test, generate documentation and publish the package to PyPI. I’ve also been using quarto as vignettes for our internal packages and it works really well. The CI/CD job runs the vignettes every night to ensure no bugs have been introduced by the packages and their downstream dependencies.
Jayaram Kancherla (23:21:28) (in thread): > tox is pretty neat since it creates isolated environments for running all steps in the development workflow.
2023-08-15
Marcel Ramos Pérez (05:59:21): > @Marcel Ramos Pérez has joined the channel
Vince Carey (06:06:01) (in thread): > Thanks for the details. Here’s a situation that would be nice to avoid: Upon pip install biocpy
with a very informally managed python ecosystem (apparently python 3.9) on my ubuntu 20.04 laptop, I have > > ERROR: genomicranges 0.2.11 has requirement numpy==1.22.1, but you'll have numpy 1.23.0 which is incompatible. > ERROR: genomicranges 0.2.11 has requirement pandas==1.4.2, but you'll have pandas 2.0.3 which is incompatible. >
Sebastian Lobentanzer (06:36:29) (in thread): > I would advocate for using only dedicated python environments. In my projects I use Poetry for managing the environments, and it deals with dependency relationships beautifully. A lock file makes sure that all package versions are recorded.
Sebastian Lobentanzer (06:37:42) (in thread): > I think installing anything into a system Python installation with whatever may be in there already cannot be recommended any more.
Vince Carey (06:52:21) (in thread): > Yes. So far I have made progress by doing the work inside a container.
Sebastian Lobentanzer (07:01:40) (in thread): > It is actually a good point: the ecosystem should provide simple and comprehensive guidance on how to set up an environment that is not fraught with myriads of dependency issues and other maintenance time sinks. I’d be more than happy to help with that, maybe we can create a landing page of sorts.
Vince Carey (07:13:46) (in thread): > Thanks! I do think a landing page makes sense, and we would link to it from bioconductor.org. Another question concerns building and testing on mac and windows – would that be a straightforward change to the .github workflow?
Sebastian Lobentanzer (07:14:15) (in thread): > yes; the runners can just use different OS distributions to account for that
Sebastian Lobentanzer (07:15:30) (in thread): > using the runs-on
parameter
Jayaram Kancherla (10:27:40) (in thread): > yup, it’s straightforward to set up CI/CD to create python wheels for different architectures and OS. It gets a little complicated when the packages are interfacing with c/c++ libraries; I’ve used cibuildwheel, and while it works great, it’s insanely slow. @Vince Carey I’ll also publish a conda env file that installs all the packages from biocpy. I tried to keep the versioning loose unless I absolutely have to, which is what you ran into because pandas changed their api in v2.
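The pip errors quoted earlier in this thread come from exact (`==`) pins that disagree with what is already installed. A minimal sketch of that check, using only the standard library and assuming Python 3.8+, might look like the following. `installed_versions` and `find_pin_conflicts` are hypothetical helper names, and versions are compared as plain strings, which is a simplification of how pip actually compares versions.

```python
from importlib import metadata


def installed_versions(names):
    """Look up installed versions for the given distribution names.

    Distributions that are not installed are left out of the result.
    """
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass
    return found


def find_pin_conflicts(pins, installed):
    """Return (name, pinned, found) for each exact pin that is not met.

    Plain string comparison: a simplification, since pip normalizes
    versions before comparing them.
    """
    conflicts = []
    for name, pinned in pins.items():
        found = installed.get(name)
        if found is not None and found != pinned:
            conflicts.append((name, pinned, found))
    return conflicts


# Version numbers taken from the pip error messages quoted in this thread.
pins = {"numpy": "1.22.1", "pandas": "1.4.2"}
installed = {"numpy": "1.23.0", "pandas": "2.0.3"}
conflicts = find_pin_conflicts(pins, installed)
```

In practice `installed` would come from `installed_versions(pins)` inside the environment being checked; a dedicated environment with a lock file, as suggested above, avoids the mismatch in the first place.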
2023-08-16
Alex Mahmoud (14:38:15): > @Alex Mahmoud has joined the channel
2023-11-01
Jayaram Kancherla (19:17:20): > @Hervé Pagès I remember when I presented the biocpy work at the TAB, you mentioned reusing the C functionality from the IRanges package. Would you be interested in making the C code its own library so we can reuse the same functionality in both R and Python? We already do this a lot for @Aaron Lun’s C++ code; especially tatami representations and the scran single cell methods.
Hervé Pagès (19:17:27): > @Hervé Pagès has joined the channel
Aaron Lun (19:19:33) (in thread): > :+1: it would be very interesting to have BioC core “officially” share its C code for re-use elsewhere
Hervé Pagès (19:47:12) (in thread): > There is definitely some appeal to that, but I won’t have the bandwidth for such refactoring of IRanges’s C code. Maybe someone else wants to give this a try? Would that be a good Outreachy project? (probably really tough though)
Hervé Pagès (19:58:43) (in thread): > Also where does all this fit with respect to PyRanges? https://pyranges.readthedocs.io/
Jayaram Kancherla (20:06:57) (in thread): > That would be a great outreachy project if they are still accepting applications. I’m refactoring interval operations from biocpy’s genomicranges into its own package (IRanges) to support our language agnostic data stores in artifactdb. > > pyranges is great but does not fully support our use cases and interval operations currently provided in Bioc’s IRanges package
Hervé Pagès (20:19:43) (in thread): > There’s one cohort of Outreachy interns every 6 months. You’re not worried that using the exact same names for the Python packages and R packages might confuse people?
Hervé Pagès (20:21:46) (in thread): > Even though tensorflow is already doing that… so maybe that’s fine.
Jayaram Kancherla (22:01:30) (in thread): > i am in favor of using the same names; it gives a sense of familiarity and means fewer package names to remember
2023-11-03
Charlotte Soneson (03:58:49): > :wave: I was wondering if someone may have an explanation/solution for the following basilisk issue, which we started seeing in Bioc 3.18 (it is also there in 3.19): On Linux, when the fallback is activated, reticulate cannot be loaded since there is already a newer version of reticulate loaded (by basilisk). Investigating in more detail, it seems that the clusterCalls in basiliskStart (here, specifically this line, but the same happens e.g. if .libPaths() is called in the cluster process) also load the basilisk namespace (and thus also reticulate, from the parent library) in the cluster process, which prevents the fallback version of reticulate from being loaded there later. This did not seem to happen in Bioc 3.17. We can reproduce the behaviour on GitHub Actions with a minimal example from the basiliskStart man page; compare ubuntu-latest (3.17) and ubuntu-latest (3.18) in this run. The most obvious things that are different between the two are the Bioc/basilisk/basilisk.utils versions and the fixed versions of reticulate (and consequently of R) in the fallback environments, but we were not able to find anything in the updates that would be the obvious cause for this behaviour. On mac it works fine (the error on GHA seems to be an intermittent conda issue, it works locally with Bioc 3.18), and the windows GHA issue seems different. I’d be curious to hear if anyone has an idea of what might be the reason. Thanks! (cc @Michael Stadler)
Aaron Lun (10:03:44): > Hm. Nothing obvious occurs to me. Perhaps you can try moving the library() call from 206 to be in front of .activate_condaenv on 202.
2023-11-06
Michael Stadler (02:36:05): > @Michael Stadler has joined the channel
Michael Stadler (02:44:40): > @Aaron Lun Good idea! I have tried to move the library() call from 206 before the .activate_condaenv on 202, but I get the same error: > > Error in checkForRemoteErrors(lapply(cl, recvResult)) : > one node produced an error: Package 'reticulate' version 1.34.0 cannot be unloaded: > Error in unloadNamespace(package) : namespace 'reticulate' is imported by 'basilisk' so cannot be unloaded >
> Intriguingly, when I step through the expressions one by one, by placing a browser()
statement after the creation of the child process on 198, and check the loaded namespaces on the child, I find that it is running the fallback R version (4.3.1) and the namespace is initially clean: > > > print(clusterCall(proc, sessionInfo)) > [[1]] > R version 4.3.1 (2023-06-16) > Platform: x86_64-conda-linux-gnu > Running under: CentOS Linux 7 (Core) > > Matrix products: default > BLAS/LAPACK: /tungstenfs/groups/gbioinfo/stadler/sharedHome/home_nfs_cache_R/basilisk/1.15.0/0/envs/fallback/lib/libopenblasp-r0.3.24.so; LAPACK version 3.11.0 > > locale: > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 > [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C > [9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C > > time zone: Europe/Zurich > tzcode source: system (glibc) > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > loaded via a namespace (and not attached): > [1] compiler_4.3.1 parallel_4.3.1 >
> Now running clusterCall(proc, function() { library("reticulate", character.only=TRUE, lib.loc = file.path(R.home(), "library")) })
throws the above error, and checking the child namespace now gives: > > > print(clusterCall(proc, sessionInfo)) > [[1]] > R version 4.3.1 (2023-06-16) > Platform: x86_64-conda-linux-gnu > Running under: CentOS Linux 7 (Core) > > Matrix products: default > BLAS/LAPACK: /tungstenfs/groups/gbioinfo/stadler/sharedHome/home_nfs_cache_R/basilisk/1.15.0/0/envs/fallback/lib/libopenblasp-r0.3.24.so; LAPACK version 3.11.0 > > locale: > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 > [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C > [9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C > > time zone: Europe/Zurich > tzcode source: system (glibc) > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > loaded via a namespace (and not attached): > [1] basilisk.utils_1.15.0 compiler_4.3.1 basilisk_1.15.0 Matrix_1.6-1.1 cli_3.6.1 > [6] tools_4.3.1 parallel_4.3.1 rstudioapi_0.15.0 dir.expiry_1.11.0 Rcpp_1.0.11 > [11] reticulate_1.34.0 grid_4.3.1 jsonlite_1.8.7 filelock_1.0.2 rlang_1.1.1 > [16] png_0.1-8 lattice_0.22-5 >
> It seems that something automatically (?) loads the namespaces of basilisk and that will also load reticulate, unfortunately from the library of the parent instead of the one from the child. There is no other call or expression between the clean namespace on the child and the library call that throws the error. What could trigger this automatic loading?
Martin Morgan (07:35:13): > This sounds (from a distance) like standard parallel behavior – a function definition includes the environment in which it is defined, and for a package function that includes imported namespaces. So sending even the anonymous function function() { library("reticulate", character.only=TRUE, lib.loc = file.path(R.home(), "library")) }
also includes the package namespace and imports. > > One could try something like > > fun = function() { ... } > environment(fun) <- .GlobalEnv > clusterCall(proc, fun) >
> to get started…
Michael Stadler (09:00:08): > Thank you @Martin Morgan, that was indeed the explanation. Setting the environment() of the called function(s) to .GlobalEnv no longer triggers the loading of the basilisk namespace on the child. I wonder why this seems to behave differently in BioC 3.18 compared to BioC 3.17, both on R-4.3.1 (see this run on GHA), but I think thanks to your explanation we will be able to suggest a workaround.
2023-11-07
Michael Stadler (07:19:12): > We have created a PR for basilisk here that implements @Martin Morgan’s suggestion. When testing locally, this seems to work. - Attachment: #30 load reticulate on fallback earlier and w/o env > This is related to the bioc slack discussion starting with https://community-bioc.slack.com/archives/C056CEJTH5Z/p1698998329036819 from @csoneson > > By loading reticulate on the fallback process earlier and using a function defined in .GlobalEnv, the fallback R process does not inherit the parent’s reticulate namespace anymore, which otherwise leads to an error in Bioconductor 3.18 and onwards.
Vince Carey (14:18:00): > question: is there anything analogous to sessionInfo in python?
Jayaram Kancherla (14:26:10) (in thread): > not really, it’s usually a combination of sys and pip. I use this snippet in some of my scripts > > import sys > import subprocess > > print(sys.version_info) > print(sys.platform) > > out = subprocess.Popen(['pip', 'list'], > stdout=subprocess.PIPE, > stderr=subprocess.STDOUT) > stdout,stderr = out.communicate() > > print(stdout.decode('ascii')) >
Charlotte Soneson (14:27:55) (in thread): > Maybehttps://pypi.org/project/sinfo/? - Attachment (PyPI): sinfo > sinfo outputs version information for modules loaded in the current session, Python, and the OS.
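For a rough sessionInfo-like report without shelling out to pip or installing an extra package, the standard library alone goes a fair way. The sketch below assumes Python 3.8+ (for `importlib.metadata`); `session_info` is a hypothetical helper name, and matching imported modules to installed distributions by name is an approximation that misses packages whose import name differs from their distribution name.

```python
import platform
import sys
from importlib import metadata


def session_info():
    """Report the interpreter, the platform, and versions of loaded packages.

    A package is listed when an installed distribution shares its
    (normalized) name with a top-level module that has been imported.
    """
    installed = {}
    for dist in metadata.distributions():
        try:
            name = dist.metadata.get("Name")
        except Exception:
            # Skip distributions whose metadata cannot be read.
            continue
        if name:
            installed[name.lower().replace("-", "_")] = dist.version

    loaded = {module.split(".")[0].lower() for module in sys.modules}
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "packages": {
            name: version
            for name, version in sorted(installed.items())
            if name in loaded
        },
    }
```

Printing the returned dict at the end of a script or notebook gives a compact record of what was actually loaded, which is closer in spirit to R’s sessionInfo than a full `pip list`.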
2023-11-08
Luke Zappia (03:44:31): > I lost track of the conversation, does anyone know if the problem above might be the cause of the error I’m seeing for {zellkonverter}? > > Quitting from lines 42-48 [read] (zellkonverter.Rmd) > Error: processing vignette 'zellkonverter.Rmd' failed with diagnostics: > ImportError: /home/biocbuild/.cache/R/basilisk/1.14.0/zellkonverter/1.12.0/zellkonverterAnnDataEnv-0.10.2/lib/python3.11/site-packages/h5py/../../.././libcurl.so.4: undefined symbol: nghttp2_option_set_no_rfc9113_leading_and_trailing_ws_validation > Run `reticulate::py_last_error()` for details. >
> It’s only on some platforms so maybe it could also be a cache thing?
Vince Carey (05:37:00) (in thread): > > >>> sinfo() > The `sinfo` package has changed name and is now called `session_info` to become more discoverable and self-explanatory. The `sinfo` PyPI package will be kept around to avoid breaking old installs and you can downgrade to 0.3.2 if you want to use it without seeing this message. For the latest features and bug fixes, please install `session_info` instead. The usage and defaults also changed slightly, so please review the latest README at https://gitlab.com/joelostblom/session_info. > ----- > sinfo 0.3.4 > ----- > Python 3.8.18 (default, Sep 11 2023, 13:40:15) [GCC 11.2.0] > Linux-6.1.0-1025-oem-x86_64-with-glibc2.17 > 16 logical CPU cores, x86_64 > ----- > Session information updated at 2023-11-08 05:33 >
Hervé Pagès (13:17:21) (in thread): > @Andres Wokaty Can you try to delete basilisk’s cache on the nebbiolos to see if that helps? Thanks!
Hervé Pagès (14:41:35) (in thread): > @Luke Zappia I just deleted basilisk’s cache on the nebbiolos. Let’s see how things go on tomorrow’s build reports: > * https://bioconductor.org/checkResults/3.18/bioc-LATEST/zellkonverter/nebbiolo2-buildsrc.html > * https://bioconductor.org/checkResults/3.19/bioc-LATEST/zellkonverter/nebbiolo1-buildsrc.html
Hervé Pagès (21:02:34) (in thread): > Deleting basilisk’s cache on nebbiolo2 didn’t solve the problem. The only benefit of having done so is that R CMD build now shows the output of repopulating the cache (but also, not surprisingly, this time it took much longer for R CMD build to fail, 500 sec. instead of 30 sec.). > This will show up on tomorrow’s 3.18 report but as a heads-up I’ve attached the output of R CMD build zellkonverter on nebbiolo2 below. > I find this part intriguing: > > ==> WARNING: A newer version of conda exists. <== > current version: 4.12.0 > latest version: 23.10.0 >
> but I know nothing about conda. > BTW simplest way to reproduce this on nebbiolo2 is with: > > library(zellkonverter) > example_h5ad <- system.file("extdata", "krumsiek11.h5ad", package = "zellkonverter") > readH5AD(example_h5ad) >
> Unfortunately, using the above code I can’t reproduce the error on bioconductor_docker:RELEASE_3_18
: > > > library(zellkonverter) > > example_h5ad <- system.file("extdata", "krumsiek11.h5ad", package = "zellkonverter") > > readH5AD(example_h5ad) > ... populating basilisk's cache so lots of output here ... > ... > ... but after a while ... > ... > ... SUCCESS! ... > class: SingleCellExperiment > dim: 11 640 > metadata(2): highlights iroot > assays(1): X > rownames(11): Gata2 Gata1 ... EgrNab Gfi1 > rowData names(0): > colnames(640): 0 1 ... 158-3 159-3 > colData names(1): cell_type > reducedDimNames(0): > mainExpName: NULL > altExpNames(0): > Warning message: > The names of these selected uns$highlights items have been modified to match R conventions: '0' > -> 'X0', '159' -> 'X159', '319' -> 'X319', '459' -> 'X459', and '619' -> 'X619' > > > sessionInfo() > R version 4.3.2 (2023-10-31) > Platform: x86_64-pc-linux-gnu (64-bit) > Running under: Ubuntu 22.04.3 LTS > > Matrix products: default > BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 > LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0 > > locale: > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 > [4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 > [7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C > [10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C > > time zone: Etc/UTC > tzcode source: system (glibc) > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > other attached packages: > [1] zellkonverter_1.12.0 BiocManager_1.30.22 > > loaded via a namespace (and not attached): > [1] crayon_1.5.2 cli_3.6.1 rlang_1.1.2 > [4] png_0.1-8 jsonlite_1.8.7 DelayedArray_0.28.0 > [7] dir.expiry_1.10.0 SummarizedExperiment_1.32.0 S4Vectors_0.40.1 > [10] RCurl_1.98-1.13 stats4_4.3.2 MatrixGenerics_1.14.0 > [13] Biobase_2.62.0 grid_4.3.2 filelock_1.0.2 > [16] abind_1.4-5 bitops_1.0-7 SingleCellExperiment_1.24.0 > [19] IRanges_2.36.0 basilisk_1.14.0 GenomeInfoDb_1.38.0 > [22] 
compiler_4.3.2 Rcpp_1.0.11 XVector_0.42.0 > [25] rstudioapi_0.15.0 lattice_0.22-5 reticulate_1.34.0 > [28] SparseArray_1.2.2 parallel_4.3.2 GenomeInfoDbData_1.2.11 > [31] GenomicRanges_1.54.1 Matrix_1.6-1.1 withr_2.5.2 > [34] tools_4.3.2 matrixStats_1.1.0 zlibbioc_1.48.0 > [37] S4Arrays_1.2.0 basilisk.utils_1.14.0 BiocGenerics_0.48.1 >
- File (Plain Text): zellkonverter-nebbiolo2.txt
2023-11-09
Luke Zappia (02:47:07) (in thread): > :crying_cat_face: I was hoping that would work but I guess not. Thanks for trying though! Maybe it is something to do with the version of libcurl that is installed/found by Python? This is the only thing I find with a quick search https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=270940. > > I’m not sure how much I can help given it is working locally and in docker.
Hervé Pagès (10:46:28) (in thread): > libcurl and libnghttp2 versions that get installed in the zellkonverterAnnDataEnv-0.10.2 environment are the same on nebbiolo2 compared to docker (bioconductor_docker:RELEASE_3_18). I’ll dig more into this after breakfast :wink: :coffee:
2023-11-10
Hervé Pagès (00:18:26) (in thread): > Some progress on this. > > The issue seems to be related to how the zellkonverterAnnDataEnv-0.10.2 conda env gets activated. If I simply activate it with reticulate::use_condaenv()
, then I can load the anndata module: > > library(reticulate) > envpath <- "~/.cache/R/basilisk/1.14.0/zellkonverter/1.12.0/zellkonverterAnnDataEnv-0.10.2" > use_condaenv(envpath, required=TRUE) > import("anndata") # no problem >
> and then zellkonverter::readH5AD(example_h5ad) works fine. > > However, if instead I activate it with basilisk::useBasiliskEnv()
, then trying to load the anndata module produces the error: > > library(reticulate) > library(basilisk) > envpath <- "~/.cache/R/basilisk/1.14.0/zellkonverter/1.12.0/zellkonverterAnnDataEnv-0.10.2" > useBasiliskEnv(envpath) > import("anndata") > # Error in py_module_import(module, convert = convert) : > # ImportError: /home/hpages/.cache/R/basilisk/1.14.0/zellkonverter/1.12.0/zellkonverterAnnDataEnv-0.10.2/lib/python3.11/site-packages/h5py/../../.././libcurl.so.4: undefined symbol: nghttp2_option_set_no_rfc9113_leading_and_trailing_ws_validation > # Run `reticulate::py_last_error()` for details. >
> useBasiliskEnv() is what zellkonverter::readH5AD() uses internally to activate the zellkonverterAnnDataEnv-0.10.2 conda env. It’s basically a wrapper around use_condaenv()
: > > > useBasiliskEnv > function (envpath) > { > envpath <- normalizePath(envpath, mustWork = TRUE) > activateEnvironment(envpath) > use_condaenv(envpath, required = TRUE) > py_config() > invisible(NULL) > } >
> but with the important difference that it calls activateEnvironment() before calling use_condaenv(). activateEnvironment() is defined in basilisk.utils. It is in the business of altering and/or unsetting some environment variables, and setting new ones. It uses a complicated mechanism to determine what variables to touch. Interestingly, on both nebbiolo1 and the docker image, it determines that PATH and LD_LIBRARY_PATH need to be modified (amongst other variables), but in different ways. However, the alteration of PATH and LD_LIBRARY_PATH doesn’t seem to be what’s causing the import("anndata") error on nebbiolo1. > > So far I was not able to identify what change to what environment variable is responsible for this error. I’ll keep investigating.
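One way to narrow down which variable matters is to dump the environment before and after the activation step and diff the two snapshots. The sketch below is a generic illustration, not something from basilisk itself: `parse_env` and `diff_env` are hypothetical helper names, and it assumes the snapshots were saved as `KEY=VALUE` lines (as printed by `env`), with no multi-line values.

```python
def parse_env(text):
    """Parse KEY=VALUE lines into a dict, splitting on the first '='.

    Lines without '=' are ignored; multi-line values are not supported.
    """
    env = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            env[key] = value
    return env


def diff_env(before, after):
    """Group variables into added, removed, and changed between snapshots."""
    added = {k: after[k] for k in after.keys() - before.keys()}
    removed = {k: before[k] for k in before.keys() - after.keys()}
    changed = {
        k: (before[k], after[k])
        for k in before.keys() & after.keys()
        if before[k] != after[k]
    }
    return {"added": added, "removed": removed, "changed": changed}
```

Running the same diff on a machine where the import fails and on one where it works, then comparing the two results, shortens the list of candidate variables to test one at a time.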
Luke Zappia (01:49:15) (in thread): > Thanks! I noticed on the latest report there is a different error about a vignette file missing. Not sure if something has changed or if that’s a different message from the same cause?
Hervé Pagès (12:32:08) (in thread): > You mean the “Failed to locate ‘weave’ output file” error on the report for 3.19, right? That’s another story. > This seems to be another regression in R devel where R CMD build displays this unhelpful error message instead of the real vignette error. Almost all the packages that have a vignette error on nebbiolo1 (102/117) now display this generic message. > Maybe it was fixed in more recent revisions of R devel, I need to check. FYI there was another regression in R devel (reported here https://stat.ethz.ch/pipermail/r-devel/2023-November/082993.html) that was affecting the behavior of R CMD INSTALL, and it got fixed a few days ago. We’ll need to update R on the devel builders.
2023-11-11
Aaron Lun (03:21:42) (in thread): > Currently updating my local R to 3.19, but I will just mention that reticulate::use_condaenv doesn’t (AFAICT) activate the environment in the conda sense. It loads the embedded python library but it doesn’t run the conda environment’s activation scripts… again, AFAICT, because who really knows what’s going on here. > > Activating the environment seemed important. At least, that’s what the conda instructions tell me to do.
Hervé Pagès (15:37:38) (in thread): > Thanks Aaron. Was the decision to add the basilisk.utils::activateEnvironment() step driven by a desire to comply with the conda instructions, or was it because real problems were encountered when sticking to a bare reticulate::use_condaenv() call? If the latter, then it sounds like maybe this is something that should be addressed in reticulate::use_condaenv(). Have you discussed this with the reticulate folks? Gosh, 422 open issues for reticulate: https://github.com/rstudio/reticulate/issues! Quite discouraging indeed… :worried:
2023-11-13
Vince Carey (10:26:27) (in thread): > @Michael Stadler I tried using your fork at fix_fallback_3_18 for a resistant environment in GCP, but ran into > > Error in py_module_import(module, convert = convert) : > ImportError: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /home/jupyter/.cache/R/basilisk/1.13.4/BiocSIMBA/0.0.0/bsklenv/lib/python3.9/site-packages/matplotlib/_path.cpython-39-x86_64-linux-gnu.so) > Run `reticulate::py_last_error()` for details. > > Error in unserialize(node$con): error reading from connection > Traceback: >
> a) what is the status of the PR, b) isn’t the fallback (testload setting) supposed to solve this?
Michael Stadler (10:37:39) (in thread): > @Vince Carey There are probably people who are in a better position than me to comment on this (in particular @Aaron Lun), but according to my understanding: > a) The PR (`fix_fallback_3_18`, both my original suggestion as well as the adapted one including Aaron’s suggestions in commit https://github.com/LTLA/basilisk/pull/30/commits/d81532a7153efa87c352b49015772c191bf29da4) works for me as expected in my environment. This is an old fedora core server that depends on the fallback solution. > b) Yes, I assume that too and I am confused that you do get this message. In my testing, I have forced the fallback solution. Here is the code that I use for testing: > > setBasiliskForceFallback(TRUE) # force fallback > fbpath <- "/path/to/local/basilisk/conda/env" # this should not matter > res <- basiliskStart(env = fbpath) >
> I am doing my testing under R 4.4 (devel), BioC 3.19 (devel): > > R Under development (unstable) (2023-10-25 r85412) > Platform: x86_64-pc-linux-gnu > Running under: CentOS Linux 7 (Core) > > Matrix products: default > BLAS/LAPACK: FlexiBLAS OPENBLAS; LAPACK version 3.10.1 > > locale: > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 > [6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C > > time zone: Europe/Zurich > tzcode source: system (glibc) > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > other attached packages: > [1] basilisk_1.13.4 testthat_3.2.0 > > loaded via a namespace (and not attached): > [1] stringi_1.7.12 lattice_0.21-9 digest_0.6.33 magrittr_2.0.3 grid_4.4.0 pkgload_1.3.3 > [7] fastmap_1.1.1 rprojroot_2.0.4 jsonlite_1.8.7 Matrix_1.6-1.1 processx_3.8.2 pkgbuild_1.4.2 > [13] sessioninfo_1.2.2 brio_1.1.3 urlchecker_1.0.1 ps_1.7.5 promises_1.2.1 purrr_1.0.2 > [19] cli_3.6.1 shiny_1.7.5.1 rlang_1.1.2 crayon_1.5.2 basilisk.utils_1.15.0 ellipsis_0.3.2 > [25] remotes_2.4.2.1 withr_2.5.2 cachem_1.0.8 devtools_2.4.5 tools_4.4.0 dir.expiry_1.11.0 > [31] parallel_4.4.0 memoise_2.0.1 httpuv_1.6.12 filelock_1.0.2 reticulate_1.34.0 vctrs_0.6.4 > [37] R6_2.5.1 mime_0.12 png_0.1-8 lifecycle_1.0.3 stringr_1.5.0 fs_1.6.3 > [43] htmlwidgets_1.6.2 usethis_2.2.2 miniUI_0.1.1.1 desc_1.4.2 callr_3.7.3 later_1.3.1 > [49] glue_1.6.2 profvis_0.3.8 Rcpp_1.0.11 rstudioapi_0.15.0 xtable_1.8-4 htmltools_0.5.7 > [55] compiler_4.4.0 prettyunits_1.2.0 >
> (`basilisk` was cloned from GitHub and not yet synchronized with Bioconductor, so it does not have the version bump to 1.15.0 yet).
Martin Morgan (10:47:54) (in thread): > Also not an expert but naively when I encountered this problem I added > > packages = "libcxx==16.0.6", # or as appropriate >
> to the `BasiliskEnvironment` command. But again not an expert, and I don’t really understand what this ‘fallback’ stuff is about…
Aaron Lun (10:49:34) (in thread): > Whatever `BiocSIMBA` is, it should be calling `testload` to check that its packages load, and that it diverts to the fallback if the package fails due to LIB… errors.
Aaron Lun (10:50:26) (in thread): > @Martin Morgan see the “Testing package loads” section in `?basiliskStart`
Martin Morgan (12:23:47) (in thread): > Thanks for the pointer @Aaron Lun; I now understand fallback. In my case, I was helping someone debug their package. Something close to a reproducible example (on my macOS Sonoma 14.1.1, which I guess has a specific libc++ version but I don’t know how to determine it…) is in a package > > bsklenv <- basilisk::BasiliskEnvironment( > envname = "bsklenv4", > pkgname = "CellxGCensusR", > packages = character(), > pip = "cellxgene-census==1.7.0" > ) > > mtm_test <- > function() > { > proc <- basiliskStart(bsklenv) > on.exit(basiliskStop(proc)) > > basiliskRun(proc, function(...) { > sc <- reticulate::import("cellxgene_census") > }) > } >
> which fails with > > Error in py_module_import(module, convert = convert) : > OSError: Could not find/load shared object file: libllvmlite.dylib > Error was: dlopen(/Users/ma38727/Library/Caches/org.R-project.R/R/basilisk/1.15.0/CellxGCensusR/0.0.0.9000/bsklenv4/lib/python3.9/site-packages/llvmlite/binding/libllvmlite.dylib, 0x0006): Library not loaded: @rpath/libc++.1.dylib > Referenced from: <1D8F8542-2D9E-3A33-9878-EDBDAFC802D7> /Users/ma38727/Library/Caches/org.R-project.R/R/basilisk/1.15.0/CellxGCensusR/0.0.0.9000/bsklenv4/lib/python3.9/site-packages/llvmlite/binding/libllvmlite.dylib > Reason: tried: '/Users/ma38727/Library/Caches/org.R-project.R/R/basilisk/1.15.0/CellxGCensusR/0.0.0.9000/bsklenv4/lib/python3.9/lib-dynload/../../libc++.1.dylib' (no such file), '/Users/ma38727/bin/R-devel/lib/libc++.1.dylib' (no such file), '/Library/Java/JavaVirtualMachines/jdk-18.0.2.jdk/Contents/Home/lib/server/libc++.1.dylib' (no such file) >
> Note that the **search path starts inside the `bsklenv4` directory**, where there is no libc++ (there appears also to be no libc++ in the R-devel/lib directory in this built-from-source installation) > > bsklenv4 $ find . -name "libc++*" > bsklenv4 $ >
> When I update the package to include a specific version of libc++ > > bsklenv <- basilisk::BasiliskEnvironment( > envname = "bsklenv5", > pkgname = "CellxGCensusR", > packages = "libcxx==16.0.6", > pip = "cellxgene-census==1.7.0" > ) >
> R is able to load the `cellxgene_census` python module, and libc++ is in the conda environment > > bsklenv5 $ find . -name "libc++*" > ./lib/libc++.a > ./lib/libc++experimental.a > ./lib/libc++.1.0.dylib > ./lib/libc++.dylib > ./lib/libc++.1.dylib >
Aaron Lun (12:25:49) (in thread): > seems like a problem on the conda side if it’s not pulling the libs it needs into the conda env
Hervé Pagès (23:10:36) (in thread): > Looks like the culprit is Ubuntu package `libcurl4-gnutls-dev`, which got updated on Oct 3 (see Changelog here https://ubuntu.pkgs.org/22.04/ubuntu-updates-main-amd64/libcurl4-gnutls-dev_7.81.0-1ubuntu1.14_amd64.deb.html) and landed on the nebbiolos at some point after that. Not sure why because this is not a package we normally install on the Linux builders. > Problem is that the package messes things up by introducing the following symlinks in `/lib/x86_64-linux-gnu`: > > biocbuild@nebbiolo1:/lib/x86_64-linux-gnu$ ls -l libcurl.a libcurl.so > lrwxrwxrwx 1 root root 16 Oct 3 13:15 libcurl.a -> libcurl-gnutls.a > lrwxrwxrwx 1 root root 17 Oct 3 13:15 libcurl.so -> libcurl-gnutls.so >
> Because of these symlinks, R internet module (`internet.so` located in `R.home("modules")`) gets linked to `libcurl-gnutls.so.4` instead of `libcurl.so.4`. @Andres Wokaty We should remove `libcurl4-gnutls-dev` from the nebbiolos and reinstall R. I know it’s on your list to update R on nebbiolo1 but we’ll also need to recompile it on nebbiolo2. Something we need to look at is the output of `ldd internet.so | grep curl` (after `cd ~/bbs-X.Y-bioc/R/modules`). It should produce: > > libcurl.so.4 => /lib/x86_64-linux-gnu/libcurl.so.4 (0x00007f5d072b7000) >
> but not: > > libcurl-gnutls.so.4 => /lib/x86_64-linux-gnu/libcurl-gnutls.so.4 (0x00007fd33e369000) >
> like we see at the moment. Thanks! > P.S.: I already did `sudo apt-get remove libcurl4-gnutls-dev` on ~nebbiolo1~ nebbiolo2.
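The `ldd` check described in this message can be wrapped in a small helper so it is easy to repeat after each R rebuild. This is only a minimal sketch: the function name `classify_curl_link` is made up for illustration and is not part of any builder tooling, and the module path stays elided as `~/bbs-X.Y-bioc/R/modules` exactly as in the message.

```shell
#!/bin/sh
# classify_curl_link (hypothetical helper): read `ldd internet.so` output on
# stdin and report which libcurl flavour the module is linked against.
classify_curl_link() {
    # Keep only the lines that mention curl; tolerate "no match".
    curl_lines=$(grep 'curl' || true)
    case "$curl_lines" in
        *libcurl-gnutls.so.4*) echo "wrong: linked to libcurl-gnutls.so.4" ;;
        *libcurl.so.4*)        echo "ok: linked to libcurl.so.4" ;;
        *)                     echo "unknown: no libcurl in ldd output" ;;
    esac
}

# Usage on a builder (path elided as in the thread):
#   cd ~/bbs-X.Y-bioc/R/modules && ldd internet.so | classify_curl_link
```

The gnutls pattern is tested first so that a module linked to both flavours is still flagged.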
2023-11-14
Aaron Lun (11:36:14) (in thread): > oops, forgot to answer your question. No, nothing actually required env activation AFAIK. I suppose it could be necessary if, e.g., it adds something from the conda env to the PATH that is indirectly called somewhere else (and thus can’t be easily handled by just specifying the full path to the binary in the conda environment).
Hervé Pagès (12:14:46) (in thread): > Thanks. I’m thinking that maybe the `activateEnvironment()` step could be made optional in `useBasiliskEnv()`. Would just be a matter of adding an extra argument to the latter, e.g. `activate.env` with default to `TRUE`, plus adding the same argument to `basiliskStart()` and `basiliskRun()`. Could even be `FALSE` by default. IIUC it could be that 99.9% of basilisk clients don’t need that extra step so that would reduce the risk of things going wrong for them. What do you think? I’d prepare a PR if you’re ok with that.
Aaron Lun (12:16:21) (in thread): > I was thinking the same thing.
Hervé Pagès (12:24:18) (in thread): > Great! I take this as a yes:wink:
Hervé Pagès (12:25:19) (in thread): > @Andres Wokaty Heads-up: I’ll go ahead and update R on nebbiolo1 as soon as the data-experiment builds are done today (in a few minutes). I want to confirm that this actually solves the zellkonverter issue. Also this should take care of cleaning the devel report from the 2 annoying `R CMD INSTALL` and `R CMD build` regressions that are supposed to be fixed in the latest R devel.
Andres Wokaty (12:39:57) (in thread): > @Hervé Pagès Did you uninstall `libcurl4-gnutls-dev` on nebbiolo2 already? > > jwokaty@nebbiolo2:~$ sudo apt-get remove libcurl4-gnutls-dev > [sudo] password for jwokaty: > Reading package lists... Done > Building dependency tree... Done > Reading state information... Done > Package 'libcurl4-gnutls-dev' is not installed, so not removed > 0 upgraded, 0 newly installed, 0 to remove and 1 not upgraded. >
Hervé Pagès (12:54:08) (in thread): > No I didn’t. Weird! I thought I saw it on the machine yesterday when I checked. Did you run some `apt` command before that (e.g. autoremove) or something that could have taken care of doing some cleaning? I don’t think anything relies on `libcurl4-gnutls-dev` so (1) I don’t know how it ended up on nebbiolo1, and (2) `apt` cleaning options could maybe detect that and automatically decide to get rid of it.
Hervé Pagès (12:56:48) (in thread): > Anyway, `internet.so` is linked to the wrong lib on nebbiolo2 so recompiling R will hopefully fix that: > > biocbuild@nebbiolo2:~/bbs-3.18-bioc/R/modules$ ldd internet.so | grep curl > libcurl-gnutls.so.4 => /lib/x86_64-linux-gnu/libcurl-gnutls.so.4 (0x00007fbc38774000) >
Andres Wokaty (12:57:00) (in thread): > No, I did run `apt update` and `apt upgrade` but it wasn’t listed anywhere and I didn’t run `apt autoremove`. I did run `ldd internet.so | grep curl` and the output was incorrect (as expected).
Hervé Pagès (13:05:29) (in thread): > Oops, I got confused, it’s on nebbiolo2 that I removed `libcurl4-gnutls-dev` yesterday, not on nebbiolo1. Sorry for the confusion.
Hervé Pagès (13:41:07) (in thread): > FWIW what we explicitly install on the Linux builders is `libcurl4-openssl-dev` which is listed as conflicting with `libcurl4-gnutls-dev`: > > biocbuild@nebbiolo1:~$ dpkg-query -s libcurl4-openssl-dev | grep Conflicts > Conflicts: libcurl4-gnutls-dev, libcurl4-nss-dev, libssl-dev (<< 1.1), libssl1.0-dev >
> In other words we must have one or the other on the machine, but we can’t have both. As a matter of fact `apt-get remove` knows that, and it’s smart enough to automatically install the other one if you remove one.
Hervé Pagès (13:48:07) (in thread): > Furthermore, I see that the docker images have `libcurl4-gnutls-dev`, not `libcurl4-openssl-dev`, but somehow they manage to have `internet.so` linked to the correct lib (I don’t know how R gets installed there). HOWEVER, if I go in the terminal on the docker image and compile my own R, then `internet.so` gets linked to the wrong lib and I can reproduce the zellkonverter issue. @Alex Mahmoud Do you know why the docker images have `libcurl4-gnutls-dev` and not `libcurl4-openssl-dev`? Do we have any control on that?
Alex Mahmoud (14:26:58) (in thread): > I am not sure. We inherit that from rocker (https://github.com/rocker-org/rocker-versioned2/blob/master/scripts/install_R_source.sh#L66) but it does seem to be `libcurl4-openssl-dev`
Alex Mahmoud (14:28:07) (in thread): > The only other mention I see is here: https://github.com/rocker-org/rocker-versioned2/blob/62a752289fd278370b72a8dafc8a1c0501bd6019/scripts/install_verse.sh#L59 but seems to be a temporary swapping (and that should only affect tidyverse and ml-verse flavors in theory)
Hervé Pagès (14:35:56) (in thread): > and somehow we end up with `libcurl4-gnutls-dev` on the docker, like we did on the nebbiolos, when it’s not supposed to be there. Really confusing!
Alex Mahmoud (14:38:08) (in thread): > My best (blind) guess is that some library installed afterwards likely depends on `libcurl4-gnutls-dev` and is swapping it although not explicitly listed. Trying to verify/identify
Alex Mahmoud (14:48:07) (in thread): > The culprit is probably `librdf0-dev` (https://github.com/Bioconductor/bioconductor_docker/blob/89662185c85beff166c24fdd88d422140318aeb6/bioc_scripts/install_bioc_sysdeps.sh#L66) which depends on `libraptor2-dev` which depends on `libcurl4-gnutls-dev`. Probably worth discussing in the meeting tomorrow to figure out how to deal with it
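A dependency chain like the one suggested here can be confirmed or ruled out by asking apt directly on an Ubuntu machine. The following is a rough sketch, assuming `apt-cache` is available; `direct_deps` is an illustrative helper name, not something used in the Docker scripts.

```shell
#!/bin/sh
# direct_deps (hypothetical helper): read `apt-cache depends <pkg>` output on
# stdin and print the package names found on "Depends:" lines, one per line.
# Alternatives are prefixed with "|" in that output and are included too;
# "PreDepends:", "Suggests:" and similar lines are ignored.
direct_deps() {
    sed -n 's/^[[:space:]]*|\{0,1\}Depends:[[:space:]]*//p' | sed 's/[<>]//g'
}

# Usage: each grep succeeds only if the link in the chain really exists.
#   apt-cache depends librdf0-dev    | direct_deps | grep -x libraptor2-dev
#   apt-cache depends libraptor2-dev | direct_deps | grep -x libcurl4-gnutls-dev
```

Going the other way, `apt-cache rdepends libcurl4-gnutls-dev` lists the packages that declare a dependency on it.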
Alex Mahmoud (14:49:04) (in thread): > `libgdal-dev` (https://github.com/Bioconductor/bioconductor_docker/blob/89662185c85beff166c24fdd88d422140318aeb6/bioc_scripts/install_bioc_sysdeps.sh#L142), same for `libproj-dev` (https://github.com/Bioconductor/bioconductor_docker/blob/89662185c85beff166c24fdd88d422140318aeb6/bioc_scripts/install_bioc_sysdeps.sh#L46) and `libnetcdf-dev` (https://github.com/Bioconductor/bioconductor_docker/blob/89662185c85beff166c24fdd88d422140318aeb6/bioc_scripts/install_bioc_sysdeps.sh#L38), also depend on `libcurl4-gnutls-dev`,
Hervé Pagès (15:11:22) (in thread): > hmm… I wonder how “strongly” these things depend on `libcurl4-gnutls-dev` because doing `sudo apt-get remove libcurl4-gnutls-dev` on nebbiolo1 didn’t remove or even mention them, they are still there. Plus I can compile/link R packages sf and proj4 without any issue. Last but not least, when I installed `libgdal-dev`, `libproj-dev`, and `libnetcdf-dev` on my laptop, that didn’t swap `libcurl4-openssl-dev` for `libcurl4-gnutls-dev`. > So maybe the only real culprit is `librdf0-dev`, which we don’t have on the nebbiolos. It doesn’t seem to be needed by any Bioconductor package or any of their deps.
Alex Mahmoud (15:17:40) (in thread): > Rocker comment implies it’s needed for https://cran.r-project.org/web/packages/redland/index.html; not sure if we care for it, nor do I know if there is another package for which it was added to the list. Beyond fixing this specific issue though, it does raise an interesting question about conflicting dependencies, especially for bioc2u where everything is done through apt - Attachment (cran.r-project.org): redland: RDF Library Bindings in R > Provides methods to parse, query and serialize information stored in the Resource Description Framework (RDF). RDF is described at https://www.w3.org/TR/rdf-primer/. This package supports RDF by implementing an R interface to the Redland RDF C library, described at https://librdf.org/docs/api/index.html. In brief, RDF provides a structured graph consisting of Statements composed of Subject, Predicate, and Object Nodes.
Hervé Pagès (15:29:12) (in thread): > Well, my understanding is that in theory `libcurl4-gnutls-dev` and `libcurl4-openssl-dev` should be interchangeable. So maybe the only problem we’re having here is that the latest update to `libcurl4-gnutls-dev` (from Oct 3) introduced the suspicious symlinks in `/lib/x86_64-linux-gnu/` that seem to break things (see my comment above from yesterday about this). I’m going to check with the maintainer of `libcurl4-gnutls-dev` about those symlinks.
2023-11-15
Hervé Pagès (12:37:55) (in thread): > Good news is that zellkonverter is green today on nebbiolo1 (devel) and nebbiolo2 (release): > * https://bioconductor.org/checkResults/3.19/bioc-LATEST/zellkonverter/nebbiolo1-buildsrc.html > * https://bioconductor.org/checkResults/3.18/bioc-LATEST/zellkonverter/nebbiolo2-buildsrc.html > The BUILD error on kunpeng2 (Linux arm64) also seems to be about system lib mismatches, but with different libraries involved. I don’t have access to this machine so won’t be able to troubleshoot that one. > Didn’t hear back from the `libcurl4-gnutls-dev` maintainer yet. I emailed him directly but maybe there’s a better way, I don’t know (it’s my first time reporting an issue with a Debian/Ubuntu package). > Still need to work on the `activate.env` PR for basilisk.
Hervé Pagès (13:01:13) (in thread): > @Alex Mahmoud FWIW today I did `sudo apt-get install librdf0-dev` on my laptop and that didn’t swap `libcurl4-openssl-dev` for `libcurl4-gnutls-dev`. So none of `libgdal-dev`, `libproj-dev`, `libnetcdf-dev`, or `librdf0-dev` resulted in a swap for me. Still a mystery why the nebbiolos got swapped at some point and why the docker image is also swapped. Note that I’m on Ubuntu 23.10 vs 22.04 for the nebbiolos and the docker. > Anyways, maybe we should investigate a way to “pin” `libcurl4-openssl-dev` to avoid the risk of an accidental swap in the future.
Alex Mahmoud (13:02:17) (in thread): > That’s helpful to know… I have no idea what else would cause it to swap given it’s not mentioned explicitly anywhere…
Hervé Pagès (14:31:25) (in thread): > @Aaron Lun Do you think you could re-sync basilisk’s repo on GitHub with the repo at git.bioconductor.org? Will avoid potential merge conflicts when I submit the PR. Thanks!
Aaron Lun (15:18:48) (in thread): > done
Hervé Pagès (22:23:56) (in thread): > oops, realizing now that `basilisk.utils::activateEnvironment()` will also need the new arg. Think you can also re-sync basilisk.utils? Thx!
2023-11-16
Aaron Lun (01:03:31) (in thread): > also done
Aaron Lun (01:03:35) (in thread): > oops, wrong thread
Aaron Lun (01:03:40) (in thread): > also done
Luke Zappia (02:37:20) (in thread): > Thanks for your help @Hervé Pagès!
2023-12-29
Manvi Yaduvanshi (10:00:27): > @Manvi Yaduvanshi has joined the channel
2024-01-10
Bernie Mulvey (15:04:01): > @Bernie Mulvey has joined the channel
Stephanie Hicks (15:07:34): > hi @Bernie Mulvey – I wanted to connect / introduce you to the wonderful @Luke Zappia who is the lead developer of zellkonverter.
Bernie Mulvey (15:10:38) (in thread): > Thanks @Stephanie Hicks! @Luke Zappia, I’m encountering problems with the Bioc 3.18 version of zellkonverter: I have to uninstall and reinstall reticulate and zellkonverter, and then zellkonverter will work for exactly 1 R session. If I try and run it again in a new session, Rstudio aborts (or the plain R GUI crashes without warning). > > It looks like from a quick search of the Slack that there are some ongoing headaches with reticulate/zellkonverter, but was curious if you had any insights into what this might be the result of.
2024-01-11
Luke Zappia (02:50:01) (in thread): > That sounds weird. I vaguely remember something like this happening before but I can’t remember the details. Possibly it’s something to do with environments/compiling. Can you please open an issue with more detail about what you tried at https://github.com/theislab/zellkonverter?
Vince Carey (06:38:38) (in thread): > This could be challenging but interesting to solve. I am concerned that users who get R from conda could have hard-to-diagnose conditions. I don’t see an issue at zellkonverter yet, but I would recommend @Bernie Mulvey provide full details on how R and conda are being used independently of the problem with zellkonverter usage, with the sessionInfo() result after the failure. To establish a “clean room” to try to reproduce the problem I used docker. > > docker run -ti bioconductor/bioconductor_docker:RELEASE_3_18 bash >
> then > > BiocManager::install("zellkonverter") > BiocManager::install("scRNAseq") > library(zellkonverter) > example(SCE2AnnData) >
> This is a little tedious because everything that is needed is installed, but we have binaries for packages installed in the container so they install quickly.
Vince Carey (06:41:16) (in thread): > https://github.com/waldronlab/bioconductor is worth a visit for a script that can simplify efficient use of docker for everyday analysis.
Bernie Mulvey (09:05:42) (in thread): > Thanks all! I’ll give it a try in the Docker and see what happens. I suspect that my issue may be rooted deeper in something to do with makevars, $PATH, homebrew, etc.
Bernie Mulvey (09:44:52) (in thread): > So yes, the Docker environment run works. (And I was using a different function from zellkonverter before, so I tried `example(SCE2AnnData)` on my main machine and still got a segfault, complete with the same blaring obnoxious interference on my speakers.)
Bernie Mulvey (09:50:47) (in thread): > Ah, just tried running it from R in the terminal and got the following (instead of a full-out crash):
***** caught segfault *****
address 0x40, cause 'invalid permissions'
***** caught segfault *****
***** caught segfault *****
address 0x18, cause 'invalid permissions'
***** caught segfault *****
address 0x30, cause 'invalid permissions'
***** caught segfault *****
address 0x0, cause 'invalid permissions'
***** caught segfault *****
address 0x48, cause 'invalid permissions'
Error in match(attrnames, specials) :
2 arguments passed to .Internal(match) which requires 4
***** caught segfault *****
address 0x58, cause 'invalid permissions'
***** caught segfault *****
***** caught segfault *****
address 0x50, cause 'invalid permissions'
address 0x10, cause 'invalid permissions'
address 0x28, cause 'invalid permissions'
***** caught segfault *****
Error: no more error handlers available (recursive errors?); invoking 'abort' restart
Error: no more error handlers available (recursive errors?); invoking 'abort' restart
***** caught segfault *****
Error: no more error handlers available (recursive errors?); invoking 'abort' restart
Error: no more error handlers available (recursive errors?); invoking 'abort' restart
address 0x38, cause 'invalid permissions'
address 0x8, cause 'invalid permissions'
Traceback:
1: Error: no more error handlers available (recursive errors?); invoking 'abort' restart
Traceback:
Traceback:
1: structure(list(message = as.character(message), call = call), class = class)
2: simpleError(msg, call)
3: tryCatchOne(expr, names, parentenv, handlers[[1L]])
4: tryCatchList(expr, classes, parentenv, handlers)
5: tryCatch({ oldpythonpath <- Sys.getenv("PYTHONPATH") newpythonpath <- Sys.getenv("RETICULATE_PYTHONPATH", unset = paste(config$pythonpath, system.file("python", package = "reticulate"), sep = .Platform$path.sep)) local({
***** caught segfault *****
1: example(SCE2AnnData)
Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
Sys.setenv(PYTHONPATH = newpythonpath)structure(list(message = as.character(message), call = call), class = class)
2: simpleError(msg, call)
3: pairlist(NULL, NULL, NULL)
4: tryCatchList(expr, classes, parentenv, handlers)
5: tryCatch({ oldpythonpath <- Sys.getenv("PYTHONPATH") newpythonpath <- Sys.getenv("RETICULATE_PYTHONPATH", unset = paste(config$pythonpath, system.file("python", package = "reticulate"), sep = .Platform$path.sep)) local({ Sys.setenv(PYTHONPATH = newpythonpath) on.exit(Sys.setenv(PYTHONPATH = oldpythonpath), add = TRUE) py_initialize(config$python, config$libpython, config$pythonhome, address 0x20, cause 'invalid permissions'
on.exit(Sys.setenv(PYTHONPATH = oldpythonpath), add = TRUE) config$virtualenv_activate, config$version >= "3.0",
Traceback:
1: example(SCE2AnnData)
Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
Traceback:
interactive(), numpy_load_error) })Selection: 1: py_initialize(config$python, config$libpython, config$pythonhome, config$virtualenv_activate, config$version >= "3.0", interactive(), numpy_load_error)example(SCE2AnnData)
Selection:
Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
Selection: })}, error = function(e) { Sys.setenv(PATH = oldpath) if (is.na(curr_session_env)) { Sys.unsetenv("R_SESSION_INITIALIZED") } else { Sys.setenv(R_SESSION_INITIALIZED = curr_session_env) } stop(e)})
6: initialize_python()
}, error = function(e) { Sys.setenv(PATH = oldpath) if (is.na(curr_session_env)) { Sys.unsetenv("R_SESSION_INITIALIZED") } else { Sys.setenv(R_SESSION_INITIALIZED = curr_session_env) } stop(e)})
6: initialize_python()
7: ensure_python_initialized()
8: py_config()
9: useBasiliskEnv("python", package = "reticulate")
10: basiliskStart(env, full.activation = full.activation, fork = fork, shared = shared, testload = testload)
11: basiliskRun(fun = function(sce) { adata <- zellkonverter::SCE2AnnData(sce) zellkonverter::AnnData2SCE(adata)}, env = zellkonverterAnnDataEnv(), sce = seger)
12: eval(ei, envir)
13: eval(ei, envir)
14: withVisible(eval(ei, envir))
15: source(tf, local, echo = echo, prompt.echo = paste0(prompt.prefix, getOption("prompt")), continue.echo = paste0(prompt.prefix, getOption("continue")), verbose = verbose, max.deparse.length = Inf, encoding = "UTF-8", skip.echo = skips, keep.source = TRUE)
16: example(SCE2AnnData)
Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
Selection: 7: ensure_python_initialized()
8: py_config()
9: useBasiliskEnv(envpath, full.activation)
10: basiliskStart(env, full.activation = full.activation, fork = fork, shared = shared, testload = testload)
11: basiliskRun(fun = function(sce) { adata <- zellkonverter::SCE2AnnData(sce) zellkonverter::AnnData2SCE(adata)}, env = zellkonverterAnnDataEnv(), sce = seger)
12: eval(ei, envir)
13: eval(ei, envir)
14: withVisible(eval(ei, envir))
15: source(tf, local, echo = echo, prompt.echo = paste0(prompt.prefix, getOption("prompt")), continue.echo = paste0(prompt.prefix, getOption("continue")), verbose = verbose, max.deparse.length = Inf, encoding = "UTF-8", skip.echo = skips, keep.source = TRUE)
16: example(SCE2AnnData)
Then, bizarrely, I get several errors before/while I try and choose a “possible action” before I can actually send a response:
Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
Selection:
***** caught bus error *****
address 0x411000006, cause 'invalid alignment'
Traceback:
1: example(SCE2AnnData)
Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
Selection:
***** caught bus error *****
address 0x156031808, cause 'invalid alignment'
Traceback:
1: example(SCE2AnnData)
Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
Selection:
***** caught bus error *****
address 0xfffffe67ae584000, cause 'invalid alignment'
Traceback:
1: example(SCE2AnnData)
Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
Selection: 3
***** caught bus error *****
address 0x11c7bc000, cause 'invalid alignment'
Traceback:
1: example(SCE2AnnData)
Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
Selection: 3
***** caught bus error *****
address 0x10a574000, cause 'invalid alignment'
Traceback:
1: example(SCE2AnnData)
Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
Selection: 3
Luke Zappia (10:24:59) (in thread): > Yeah, that looks to me like something has been messed up in paths somewhere. **{basilisk}** is supposed to help avoid that but I think it can still happen sometimes, particularly if you are using **{reticulate}** in other ways in the same session. > > Are you using `SCE2AnnData()` rather than `writeH5AD()`? Because that runs in your normal **{reticulate}** environment rather than the special environment **{basilisk}** creates for **{zellkonverter}**.
Bernie Mulvey (11:05:05) (in thread): > I’ve had all of the same issues using both `example(SCE2AnnData)` and any file I’ve tried with `readH5AD("somedata.h5ad")`. I haven’t tried `writeH5AD` explicitly yet.
Vince Carey (11:55:47) (in thread): > Do we have sessionInfo() after the bad events?
Bernie Mulvey (12:40:20) (in thread): > Ah, yes sorry, there’s sessionInfo() in the git issue I posted. However, this does not reflect the state of R after the crash since R is terminated by the errors. It does reflect the session if I solely load zellkonverter but don’t run any of its commands (i.e., the session state when `example(SCE2AnnData)` is initially called). > > I completely wiped Anaconda and R and am reinstalling R, all of my packages, then Rstudio, then anaconda, so we’ll see what happens then. https://github.com/theislab/zellkonverter/issues/108 - Attachment: #108 Segfaults with CRAN R, BioC 3.18, and fresh Anaconda Install > Hi, > > After several months of not utilizing the zellkonverter package for no particular reason, I came back to some code I had previously used and began encountering segfaults on my personal computer. The datasets I was trying to load were moderately large (though well within the 32GB RAM + 500GB swap allocated to my R session). > > I have already tried the following things, most comprehensively in this order:
> • Run basilisk.utils::clearExternalDir()
> • remove.packages(c("basilisk", "reticulate", "zellkonverter"))
> • reinstalling basilisk, reticulate, zellkonverter individually, with or without type set to “source”
> • reinstalling only zellkonverter and letting the installer handle the reticulate/basilisk dependencies
> • remove ~/anaconda3 , ~/.anaconda , ~/.conda , ~/.miniconda , ~/mambaforge directories and reinstall Anaconda for Mac, followed by both tries at R package reinstallations above
> • manually initializing a basilisk environment before running zellkonverter for the “first” time using basilisk::setupBasiliskEnv()
> • removing /opt/homebrew/bin from $PATH entirely to prevent conflicts with brew’s python
> • placing /opt/homebrew/bin at the very end of $PATH to include homebrew but guarantee other python binaries supersede homebrew
>
> Regardless, I get this segfault at most or a total crash of R and RStudio (I can get the segfault messages reliably running R from a command line window):
>
> Minimal example:
>
> ```
> library(zellkonverter)
> example(SCE2AnnData)
> ```
>
> Example output (after package loading messages, etc):
>
> ```
> ***** caught segfault *****
> address 0x40, cause 'invalid permissions'
> ***** caught segfault *****
> ***** caught segfault *****
> address 0x18, cause 'invalid permissions'
> ***** caught segfault *****
> address 0x30, cause 'invalid permissions'
> ***** caught segfault *****
> address 0x0, cause 'invalid permissions'
> ***** caught segfault *****
> address 0x48, cause 'invalid permissions'
> Error in match(attrnames, specials) :
> 2 arguments passed to .Internal(match) which requires 4
> ***** caught segfault *****
> address 0x58, cause 'invalid permissions'
> ***** caught segfault *****
> ***** caught segfault *****
> address 0x50, cause 'invalid permissions'
> address 0x10, cause 'invalid permissions'
> address 0x28, cause 'invalid permissions'
> ***** caught segfault *****
> Error: no more error handlers available (recursive errors?); invoking 'abort' restart
> Error: no more error handlers available (recursive errors?); invoking 'abort' restart
> ***** caught segfault *****
> Error: no more error handlers available (recursive errors?); invoking 'abort' restart
> Error: no more error handlers available (recursive errors?); invoking 'abort' restart
> address 0x38, cause 'invalid permissions'
> address 0x8, cause 'invalid permissions'
> Traceback:
> 1: Error: no more error handlers available (recursive errors?); invoking 'abort' restart
> Traceback:
> Traceback:
> 1: structure(list(message = as.character(message), call = call), class = class)
> 2: simpleError(msg, call)
> 3: tryCatchOne(expr, names, parentenv, handlers[[1L]])
> 4: tryCatchList(expr, classes, parentenv, handlers)
> 5: tryCatch({ oldpythonpath <- Sys.getenv("PYTHONPATH") newpythonpath <- Sys.getenv("RETICULATE_PYTHONPATH", unset = paste(config$pythonpath, system.file("python", package = "reticulate"), sep = .Platform$path.sep)) local({
> ***** caught segfault *****
> 1: example(SCE2AnnData)
> Possible actions:
> 1: abort (with core dump, if enabled)
> 2: normal R exit
> 3: exit R without saving workspace
> 4: exit R saving workspace
> Sys.setenv(PYTHONPATH = newpythonpath)structure(list(message = as.character(message), call = call), class = class)
> 2: simpleError(msg, call)
> 3: pairlist(NULL, NULL, NULL)
> 4: tryCatchList(expr, classes, parentenv, handlers)
> 5: tryCatch({ oldpythonpath <- Sys.getenv("PYTHONPATH") newpythonpath <- Sys.getenv("RETICULATE_PYTHONPATH", unset = paste(config\)pythonpath, system.file(“python”, package = “reticulate”), sep = .Platform\(path.sep)) local({ Sys.setenv(PYTHONPATH = newpythonpath) on.exit(Sys.setenv(PYTHONPATH = oldpythonpath), add = TRUE) py_initialize(config\)python, config\(libpython, config\)pythonhome, address 0x20, cause ‘invalid permissions’ > on.exit(Sys.setenv(PYTHONPATH = oldpythonpath), add = TRUE) config\(virtualenv_activate, config\)version >= “3.0”, > Traceback: > 1: example(SCE2AnnData) > Possible actions: > 1: abort (with core dump, if enabled) > 2: normal R exit > 3: exit R without saving workspace > 4: exit R saving workspace > Traceback: > interactive(), numpy_load_error) })Selection: 1: py_initialize(config\(python, config\)libpython, config\(pythonhome, config\)virtualenv_activate, config$version >= “3.0”, interactive(), numpy_load_error)example(SCE2AnnData) > Selection: > Possible actions: > 1: abort (with core dump, if enabled) > 2: normal R exit > 3: exit R without saving workspace > 4: exit R saving workspace > Selection: })}, error = function(e) { Sys.setenv(PATH = oldpath) if ([is.na](http://is.na)(curr_session_env)) { Sys.unsetenv(“R_SESSION_INITIALIZED”) } else { Sys.setenv(R_SESSION_INITIALIZED = curr_session_env) } stop(e)}) > 6: initialize_python() > }, error = function(e) { Sys.setenv(PATH = oldpath) if ([is.na](http://is.na)(curr_session_env)) { Sys.unsetenv(“R_SESSION_INITIALIZED”) } else { Sys.setenv(R_SESSION_INITIALIZED = curr_session_env) } stop(e)}) > 6: initialize_python() > 7: ensure_python_initialized() > 8: py_config() > 9: useBasiliskEnv(“python”, package = “reticulate”) > 10: basiliskStart(env, full.activation = full.activation, fork = fork, shared = shared, testload = testload) > 11: basiliskRun(fun = function(sce) { adata <- zellkonverter::SCE2AnnData(sce) zellkonverter::AnnData2SCE(adata)}, env = zellkonverterAnnDataEnv(), sce = 
seger) > 12: eval(ei, envir) > 13: eval(ei, envir) > 14: withVisible(eval(ei, envir)) > 15: source(tf, local, echo = echo, prompt.echo = paste0(prompt.prefix, getOption(“prompt”)), continue.echo = paste0(prompt.prefix, getOption(“continue”)), verbose = verbose, max.deparse.length = Inf, encoding = “UTF-8”, skip.echo = skips, keep.source = TRUE) > 16: example(SCE2AnnData) > Possible actions: > 1: abort (with core dump, if enabled) > 2: normal R exit > 3: exit R without saving workspace > 4: exit R saving workspace > Selection: 7: ensure_python_initialized() > 8: py_config() > 9: useBasiliskEnv(envpath, full.activation) > 10: basiliskStart(env, full.activation = full.activation, fork = fork, shared = shared, testload = testload) > 11: basiliskRun(fun = function(sce) { adata <- zellkonverter::SCE2AnnData(sce) zellkonverter::AnnData2SCE(adata)}, env = zellkonverterAnnDataEnv(), sce = seger) > 12: eval(ei, envir) > 13: eval(ei, envir) > 14: withVisible(eval(ei, envir)) > 15: source(tf, local, echo = echo, prompt.echo = paste0(prompt.prefix, getOption(“prompt”)), continue.echo = paste0(prompt.prefix, getOption(“continue”)), verbose = verbose, max.deparse.length = Inf, encoding = “UTF-8”, skip.echo = skips, keep.source = TRUE) > 16: example(SCE2AnnData) > > > Then, bizarrely at this point, I get several errors before/while I try and choose a “possible action” before I can actually send a response: > >
> Possible actions: > 1: abort (with core dump, if enabled) > 2: normal R exit > 3: exit R without saving workspace > 4: exit R saving workspace > Selection: > ***** caught bus error ***** > address 0x411000006, cause ‘invalid alignment’ > Traceback: > 1: example(SCE2AnnData) > Possible actions: > 1: abort (with core dump, if enabled) > 2: normal R exit > 3: exit R without saving workspace > 4: exit R saving workspace > Selection: > ***** caught bus error ***** > address 0x156031808, cause ‘invalid alignment’ > Traceback: > 1: example(SCE2AnnData) > Possible actions: > 1: abort (with core dump, if enabled) > 2: normal R exit > 3: exit R without saving workspace > 4: exit R saving workspace > Selection: > ***** caught bus error ***** > address 0xfffffe67ae584000, cause ‘invalid alignment’ > Traceback: > 1: example(SCE2AnnData) > Possible actions: > 1: abort (with core dump, if enabled) > 2: normal R …
Bernie Mulvey (14:57:16) (in thread): > @Luke Zappia @Vince Carey I somehow managed to resolve this, and I don’t think completely reinstalling R, RStudio, and all my packages was ultimately necessary to do it. I posted my solution on the GitHub issue, and I am completely mystified as to how the approach I took finally resolved the problem.
Vince Carey (15:04:16) (in thread): > Glad you are up and running again. Are you saying that you have to have a debugger running in order for you to use zellkonverter?
Bernie Mulvey (15:06:25) (in thread): > No, I just needed to run debug(install_conda()) the one time in order to get basilisk properly set up. After that, zellkonverter was able to take off and build its little container successfully and use it without a debugger. Otherwise, trying to run zellkonverter would throw an error during its attempt to build its env, with a message about zlib.1.dylib, stemming from its own internal call to basilisk.utils::install_conda().
Vince Carey (15:40:57) (in thread): > Thanks for sticking with it. I really want to prevent events of this sort, but getting to the root causes may be too disruptive. I hope the path forward remains smooth.
2024-01-12
Luke Zappia (03:20:02) (in thread): > Yeah, thanks for finding a fix and sharing it. I’m not really sure what the root cause was or why that helped but glad you got it working.
Bernie Mulvey (13:58:13) (in thread): > @Vince Carey pun intended?
Bernie Mulvey (14:00:43) (in thread): > It seems like basilisk is supposed to offer to auto-install its little container-ized space if it doesn’t detect one upon package installation + loading, which never happened for me. So my only guess is that something about the file paths it looks in or tries to install to by default on Apple Silicon Macs is wrong. (But that doesn’t really explain how debug() magically gets it to find the right place and proceed.)
2024-01-15
Luke Zappia (05:07:15): > Is it possible to find out what has been installed in the {basilisk} environment for {zellkonverter} devel? There is a build error I can’t replicate anywhere else (including the build system docker container): https://bioconductor.org/checkResults/devel/bioc-LATEST/zellkonverter/nebbiolo1-checksrc.html. The only thing I can think of is that a different version of something has been installed somewhere. Possibly clearing the cache would help but I would like to know what the cause is.
2024-01-16
Vince Carey (06:38:47): > OK, we need to use session_info within python to get at this, I think. @Luke Zappia can you provide a call to python session_info after a successful ‘check’? I am not completely clear how this should be done but some related reporting of the python environment needs to be integrated into the build system, and then we need a mechanism for comparing python session_info outputs.
Vince Carey (06:43:55): > It seems we are going to need a “steering committee” for the evolution of basilisk and bioc/python interaction. I am working on a PR that uses miniconda version py311_23.11.0-2. This could have ramifications for build system outcomes for basilisk-using packages, and it would be good to coordinate changes of this sort. Interested parties should comment in thread on this post.
Luke Zappia (07:20:45) (in thread): > Locally I would probably just manually activate the conda environment and do `conda list` to see what is installed. > > Doing it as part of the check is probably more difficult because you would need to automatically find any {basilisk} environments. Also, Python doesn’t really have a standard `sessionInfo()` equivalent. {reticulate} can give some info but I’m not sure how that works with {basilisk}.
Charlotte Soneson (07:22:23) (in thread): > FWIW, I can replicate the error locally (on an M1 mac). This is what’s in my conda environment:
>
> % conda list -p /Users/charlottesoneson/Library/Caches/org.R-project.R/R/basilisk/1.15.1/zellkonverter/1.13.1/zellkonverterAnnDataEnv-0.10.2
> # packages in environment at /Users/charlottesoneson/Library/Caches/org.R-project.R/R/basilisk/1.15.1/zellkonverter/1.13.1/zellkonverterAnnDataEnv-0.10.2:
> #
> # Name                    Version       Build                     Channel
> anndata                   0.10.2        pyhd8ed1ab_0              conda-forge
> array-api-compat          1.4           pyhd8ed1ab_0              conda-forge
> bzip2                     1.0.8         h93a5062_5                conda-forge
> c-ares                    1.25.0        h93a5062_0                conda-forge
> ca-certificates           2023.11.17    hf0a4a13_0                conda-forge
> cached-property           1.5.2         hd8ed1ab_1                conda-forge
> cached_property           1.5.2         pyha770c72_1              conda-forge
> exceptiongroup            1.2.0         pyhd8ed1ab_2              conda-forge
> h5py                      3.10.0        nompi_py311h393cb7e_100   conda-forge
> hdf5                      1.14.2        nompi_h3aba7b3_100        conda-forge
> krb5                      1.21.2        h92f50d5_0                conda-forge
> libaec                    1.1.2         h13dd4ca_1                conda-forge
> libblas                   3.9.0         20_osxarm64_openblas      conda-forge
> libcblas                  3.9.0         20_osxarm64_openblas      conda-forge
> libcurl                   8.5.0         h2d989ff_0                conda-forge
> libcxx                    16.0.6        h4653b0c_0                conda-forge
> libedit                   3.1.20191231  hc8eb9b7_2                conda-forge
> libev                     4.33          h93a5062_2                conda-forge
> libexpat                  2.5.0         hb7217d7_1                conda-forge
> libffi                    3.4.2         h3422bc3_5                conda-forge
> libgfortran               5.0.0         13_2_0_hd922786_1         conda-forge
> libgfortran5              13.2.0        hf226fd6_1                conda-forge
> liblapack                 3.9.0         20_osxarm64_openblas      conda-forge
> libnghttp2                1.58.0        ha4dd798_1                conda-forge
> libopenblas               0.3.25        openmp_h6c19121_0         conda-forge
> libsqlite                 3.44.2        h091b4b1_0                conda-forge
> libssh2                   1.11.0        h7a5bd25_0                conda-forge
> libzlib                   1.2.13        h53f4e23_5                conda-forge
> llvm-openmp               17.0.6        hcd81f8e_0                conda-forge
> natsort                   8.4.0         pyhd8ed1ab_0              conda-forge
> ncurses                   6.4           h463b476_2                conda-forge
> numpy                     1.26.0        py311hb8f3215_0           conda-forge
> openssl                   3.2.0         h0d3ecfb_1                conda-forge
> packaging                 23.2          pyhd8ed1ab_0              conda-forge
> pandas                    2.1.1         py311h9e438b8_1           conda-forge
> pip                       23.3.2        pyhd8ed1ab_0              conda-forge
> python                    3.11.5        h47c9636_0_cpython        conda-forge
> python-dateutil           2.8.2         pyhd8ed1ab_0              conda-forge
> python-tzdata             2023.4        pyhd8ed1ab_0              conda-forge
> python_abi                3.11          4_cp311                   conda-forge
> pytz                      2023.3.post1  pyhd8ed1ab_0              conda-forge
> readline                  8.2           h92ec313_1                conda-forge
> scipy                     1.11.3        py311h93d07a4_1           conda-forge
> setuptools                69.0.3        pyhd8ed1ab_0              conda-forge
> six                       1.16.0        pyh6c4a22f_0              conda-forge
> sqlite                    3.44.2        hf2abe2d_0                conda-forge
> tk                        8.6.13        h5083fa2_1                conda-forge
> tzdata                    2023d         h0c530f3_0                conda-forge
> wheel                     0.42.0        pyhd8ed1ab_0              conda-forge
> xz                        5.4.5         h80987f9_0
> zlib                      1.2.13        h53f4e23_5                conda-forge
> zstd                      1.5.5         h4f39d0f_0                conda-forge
>
> Here’s the traceback in case it tells you something :slightly_smiling_face:
>
> > out <- readH5AD(temp, X_name = "X")
> Error in py_call_impl(callable, call_args$unnamed, call_args$named) :
>   AttributeError: 'dict' object has no attribute 'shape'
>   Error raised while reading key '/' of <class 'h5py._hl.files.File'> to /
>   Run `reticulate::py_last_error()` for details.
> > reticulate::py_last_error()
>
> ── Python Exception Message ─────────────────────────────────────────────────────────────────────────────────────────────────
> Traceback (most recent call last):
>   File "/Users/charlottesoneson/Library/Caches/org.R-project.R/R/basilisk/1.15.1/zellkonverter/1.13.1/zellkonverterAnnDataEnv-0.10.2/lib/python3.11/site-packages/anndata/_io/h5ad.py", line 254, in read_h5ad
>     adata = read_dispatched(f, callback=callback)
>             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>   File "/Users/charlottesoneson/Library/Caches/org.R-project.R/R/basilisk/1.15.1/zellkonverter/1.13.1/zellkonverterAnnDataEnv-0.10.2/lib/python3.11/site-packages/anndata/experimental/_dispatch_io.py", line 46, in read_dispatched
>     return reader.read_elem(elem)
>            ^^^^^^^^^^^^^^^^^^^^^^
>   File "/Users/charlottesoneson/Library/Caches/org.R-project.R/R/basilisk/1.15.1/zellkonverter/1.13.1/zellkonverterAnnDataEnv-0.10.2/lib/python3.11/site-packages/anndata/_io/utils.py", line 205, in func_wrapper
>     return func(*args, **kwargs)
>            ^^^^^^^^^^^^^^^^^^^^^
>   File "/Users/charlottesoneson/Library/Caches/org.R-project.R/R/basilisk/1.15.1/zellkonverter/1.13.1/zellkonverterAnnDataEnv-0.10.2/lib/python3.11/site-packages/anndata/_io/specs/registry.py", line 249, in read_elem
>     return self.callback(read_func, elem.name, elem, iospec=get_spec(elem))
>            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>   File "/Users/charlottesoneson/Library/Caches/org.R-project.R/R/basilisk/1.15.1/zellkonverter/1.13.1/zellkonverterAnnDataEnv-0.10.2/lib/python3.11/site-packages/anndata/_io/h5ad.py", line 234, in callback
>     return AnnData(
>            ^^^^^^^^
>   File "/Users/charlottesoneson/Library/Caches/org.R-project.R/R/basilisk/1.15.1/zellkonverter/1.13.1/zellkonverterAnnDataEnv-0.10.2/lib/python3.11/site-packages/anndata/_core/anndata.py", line 362, in __init__
>     self._init_as_actual(
>   File "/Users/charlottesoneson/Library/Caches/org.R-project.R/R/basilisk/1.15.1/zellkonverter/1.13.1/zellkonverterAnnDataEnv-0.10.2/lib/python3.11/site-packages/anndata/_core/anndata.py", line 593, in _init_as_actual
>     self._layers = Layers(self, layers)
>                    ^^^^^^^^^^^^^^^^^^^^
>   File "/Users/charlottesoneson/Library/Caches/org.R-project.R/R/basilisk/1.15.1/zellkonverter/1.13.1/zellkonverterAnnDataEnv-0.10.2/lib/python3.11/site-packages/anndata/_core/aligned_mapping.py", line 334, in __init__
>     self.update(vals)
>   File "<frozen _collections_abc>", line 949, in update
>   File "/Users/charlottesoneson/Library/Caches/org.R-project.R/R/basilisk/1.15.1/zellkonverter/1.13.1/zellkonverterAnnDataEnv-0.10.2/lib/python3.11/site-packages/anndata/_core/aligned_mapping.py", line 202, in __setitem__
>     value = self._validate_value(value, key)
>             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>   File "/Users/charlottesoneson/Library/Caches/org.R-project.R/R/basilisk/1.15.1/zellkonverter/1.13.1/zellkonverterAnnDataEnv-0.10.2/lib/python3.11/site-packages/anndata/_core/aligned_mapping.py", line 78, in _validate_value
>     if self.parent.shape[axis] != dim_len(val, i):
>                                   ^^^^^^^^^^^^^^^
>   File "/Users/charlottesoneson/Library/Caches/org.R-project.R/R/basilisk/1.15.1/zellkonverter/1.13.1/zellkonverterAnnDataEnv-0.10.2/lib/python3.11/functools.py", line 909, in wrapper
>     return dispatch(args[0].__class__)(*args, **kw)
>            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>   File "/Users/charlottesoneson/Library/Caches/org.R-project.R/R/basilisk/1.15.1/zellkonverter/1.13.1/zellkonverterAnnDataEnv-0.10.2/lib/python3.11/site-packages/anndata/utils.py", line 90, in dim_len
>     return x.shape[axis]
>            ^^^^^^^
> AttributeError: 'dict' object has no attribute 'shape'
> Error raised while reading key '/' of <class 'h5py._hl.files.File'> to /
>
> ── R Traceback ──────────────────────────────────────────────────────────────────────────────────────────────────────────────
> ▆
> 1. └─zellkonverter::readH5AD(temp, X_name = "X")
> 2. └─basilisk::basiliskRun(...)
> 3. └─zellkonverter (local) fun(...)
> 4. └─anndata$read_h5ad(file, backed = if (backed) "r" else FALSE)
> 5. └─reticulate:::py_call_impl(callable, call_args$unnamed, call_args$named)
>
Luke Zappia (07:26:05) (in thread): > Thank you! I’ll check and see if there are any package differences compared to what I have installed. At least I know it’s not only the build system that has an issue even though I can’t replicate it. On the other hand I have to work out what it is and how to fix it…
Charlotte Soneson (07:27:45) (in thread): > I’ve also found `conda list` to provide the most informative output. We have recently moved to explicitly pinning the versions of as many packages as possible in `basilisk.R` (including platform-dependent ones, e.g. https://github.com/fmicompbio/orthos/blob/devel/R/basilisk.R) to try to avoid version mismatches…
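A minimal sketch of how such pinning could be checked automatically, assuming the package specs are plain strings of the form `name==version`. The helper name `unpinned_specs` and the example list are hypothetical and not part of basilisk; the sketch only flags any spec that does not name one exact version.

```python
import re

# Assumption: a fully pinned spec looks like "pandas==2.1.1".
# Ranges (">=1.26"), wildcards ("1.2.*") and bare names count as unpinned.
PINNED = re.compile(r"^[A-Za-z0-9_.\-]+==[A-Za-z0-9_.!+\-]+$")


def unpinned_specs(specs):
    """Return the specs that are not pinned to one exact version."""
    return [spec for spec in specs if not PINNED.match(spec)]


# Hypothetical example list; only the first two entries are fully pinned.
example_specs = ["anndata==0.10.2", "h5py==3.10.0", "numpy>=1.26", "pandas"]
print(unpinned_specs(example_specs))
```

A check like this only covers the packages that are listed explicitly; transitive dependencies resolved by conda can still drift unless they are listed and pinned too.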
Charlotte Soneson (07:29:05) (in thread): > I also got a lot of these: > > /Users/charlottesoneson/Library/Caches/org.R-project.R/R/basilisk/1.15.1/zellkonverter/1.13.1/zellkonverterAnnDataEnv-0.10.2/lib/python3.11/site-packages/anndata/_core/anndata.py:522: FutureWarning: The dtype argument is deprecated and will be removed in late 2024. > warnings.warn( >
Charlotte Soneson (07:32:09) (in thread): > Maybe it can also be reproduced on the GitHub Actions runners?
Luke Zappia (08:39:16) (in thread): > I should check GHA, that’s a good idea, but I’m not sure if they are working with R devel. I’ll look into the `FutureWarning`s again, but I couldn’t find what was causing them before. I agree they are annoying though.
Charlotte Soneson (09:08:25) (in thread): > I think the `FutureWarning` is from AnnData: https://github.com/scverse/anndata/blob/73dabaa12deb0ddda0c58bc09d2cac7e46a3a339/anndata/_core/anndata.py#L523 (but maybe you meant why they were triggered here, sorry). GitHub Actions work well with R devel; if you use Mike’s Bioc actions, specifying that you want to use Bioc devel will automatically retrieve the right R.
Vince Carey (09:57:58) (in thread): > @Luke Zappia the session_info package on PyPI is not sufficient/standard? We’re going to need some programmatic approach to compare environments. If it is a call to conda list from R at time of check success, so be it. You can send me the output at stvjc@channing.harvard.edu and we will try to compare to the build system at time of check failure.
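One way such a comparison could look, as a rough sketch: parse two saved `conda list` text outputs and report every package whose version or build differs. The function names are hypothetical, and the `failing` listing below is made-up illustration data, not output from the build system.

```python
def parse_conda_list(text):
    """Map package name -> (version, build) from `conda list` text output."""
    packages = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the commented header
        fields = line.split()
        if len(fields) >= 3:  # the channel column may be missing
            packages[fields[0]] = (fields[1], fields[2])
    return packages


def diff_environments(passing, failing):
    """Return {name: (entry_in_passing, entry_in_failing)} for differing packages."""
    a, b = parse_conda_list(passing), parse_conda_list(failing)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(a.keys() | b.keys())
        if a.get(name) != b.get(name)
    }


passing = """
# Name    Version   Build                Channel
anndata   0.10.2    pyhd8ed1ab_0         conda-forge
hdf5      1.14.2    nompi_h3aba7b3_100   conda-forge
"""

# Made-up listing for illustration only.
failing = """
anndata   0.10.2    pyhd8ed1ab_0         conda-forge
hdf5      1.14.3    example_build_0      conda-forge
zlib      1.2.13    h53f4e23_5           conda-forge
"""

for name, (ok, bad) in diff_environments(passing, failing).items():
    print(name, ok, bad)
```

A package missing from one side shows up with `None` for that side. This only compares the conda environment, so a difference in an R package on either system would not appear in the result.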
Vince Carey (09:58:55) (in thread): > @Charlotte Soneson it seems what you are doing with comprehensive pinning is the recommended approach.
Charlotte Soneson (10:11:21) (in thread): > I’m not an expert here and maybe there are workarounds, but it seems to me that two potential issues with relying on `session_info` would be that (1) it would assume that each developer includes `session_info` in their basilisk conda environment (as I guess the call has to be made from a python session within that environment), and (2) it doesn’t really capture everything that is in the environment, just what is explicitly loaded in the python session (and we may not have a single python session running with all the used packages loaded at once).
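The second point can be illustrated with the standard library alone. The sketch below contrasts what is loaded in the current interpreter (`sys.modules`) with what is installed in the environment (`importlib.metadata`); it is not how `session_info` works internally, and the helper names are hypothetical.

```python
import sys
from importlib import metadata


def loaded_top_level_modules():
    """Top-level modules imported so far in this Python session."""
    return sorted({name.split(".")[0] for name in sys.modules
                   if not name.startswith("_")})


def installed_distributions():
    """Every distribution installed in the environment, loaded or not."""
    found = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name is not None:  # skip distributions with broken metadata
            found[name] = dist.version
    return found


loaded = loaded_top_level_modules()
installed = installed_distributions()
print(f"{len(loaded)} modules loaded, {len(installed)} distributions installed")
```

A report built from the first function changes with whatever the session happened to import, while the second reflects the whole environment. The second still has to run inside the environment being inspected, and it does not see non-Python conda packages such as `hdf5` or `zlib`.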
2024-01-17
Luke Zappia (10:23:08) (in thread): > Ok, I’ve made some updates to GHA and now they run and I can reproduce the error there (at least on the systems that don’t fail for other reasons…). > > I’m not much closer to working out what the cause might be though. It might be something to do with how the file is written on the R side rather than how Python reads it but it’s hard to check because it works locally for me. @Charlotte Soneson Could you please try running this code in your environment that fails and sending me the file? That could help with working out which side the problem is on. I’m sure you have more important things to do though so totally fine if you can’t.
>
> library(zellkonverter)
> library(scRNAseq)
> sce <- ZeiselBrainData()
>
> assay(sce, "layer") <- DelayedArray::DelayedArray(counts(sce))
>
> writeH5AD(sce, "test-file.h5ad")
>
> # Check if you get the error
> readH5AD("test-file.h5ad", X_name = "X")
>
Charlotte Soneson (10:29:03) (in thread): > :+1: file sent via DM
Luke Zappia (10:37:49) (in thread): > Thanks! I can reproduce the error with this file :tada:. I think that this confirms that 1) it’s something to do with writing and 2) it’s system dependent (to some degree at least). I’m still not sure what’s going on but at least there’s something to look into now.
Luke Zappia (11:39:16) (in thread): > I think a small change in {HDF5Array} was the issue. I have pushed an update for compatibility with that. Thanks for your help!
Sebastian Lobentanzer (12:42:48) (in thread): > I’d definitely be interested in how this evolves!
2024-01-18
Vince Carey (08:24:15): > github.com/vjcitn/basilisk.utils is a PR for the main repo at LTLA which uses the miniconda version py311_23.11.0-2 as the default conda version to install. It can be overridden by setting BASILISK_MINICONDA_VERSION. If this latest version is used, the versions of external packages used for testing basilisk need to be changed. github.com/vjcitn/basilisk is a PR that carries out the selection of updated versions of test packages. I have updated github.com/vjcitn/BiocSklearn to use the new conda, pinning more recent versions of scikit-learn and numpy with no untoward effects. Others may wish to test out these draft versions of basilisk.utils and basilisk to assess effects on python/basilisk-dependent packages.
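The "default, unless an environment variable says otherwise" pattern described here can be sketched as follows. This is only an illustration of the pattern in Python, not the basilisk.utils implementation (which is R); the function name is hypothetical, while the variable name and default value are taken from the message above.

```python
import os

# Default taken from the message above.
DEFAULT_MINICONDA_VERSION = "py311_23.11.0-2"


def miniconda_version(environ=os.environ):
    """Return the override from BASILISK_MINICONDA_VERSION, else the default."""
    # An unset or empty variable falls back to the default.
    return environ.get("BASILISK_MINICONDA_VERSION") or DEFAULT_MINICONDA_VERSION


print(miniconda_version())
```

Passing the environment as an argument keeps the lookup easy to test without changing the real process environment.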
Alan O’C (13:00:18) (in thread): > Means dropping support for Windows, no? Aaron previously updated and it broke every dependency on that platform: https://github.com/LTLA/basilisk/issues/24 - Attachment: #24 Error in installConda() : conda installation failed with status code ‘2’ > Hello, thank you very much for developing this software. I encountered the following error while running writeH5AD locally. How can I correct it? > > > writeH5AD(sce, file = ‘D:/filtered_featured_bc_matrix.h5ad’)
> > Error in installConda() : conda installation failed with status code ‘2’
2024-01-19
Aaron Lun (01:56:41): > @Vince Carey re your PRs: do you want me to just transfer the two repos to the BioC GitHub org? This would be a good way to pass the torch to the proposed basilisk steering committee (BaSC for short), who could then do the necessary changes without waiting for my approval. (Though I would of course be happy to provide an opinion on any PR, given that there is a lot of historical baggage + weird hacks in the code that don’t necessarily explain themselves.)
Aaron Lun (02:14:54): > <!channel> anyway this is a good time to reiterate the call for volunteers for the BaSC (see :point_up:). > > Why get involved? > * Have a say on the future of Bioconductor’s Python capabilities. You want a feature/updates/bug fix/whatever? Be the change you want to see in the world. > * Spread the pain of maintenance. Suffering shared is suffering halved. Perhaps different members can help to debug build failures on whatever platforms they have access to. > * Something something credit/acknowledgement that you can stick on a CV somewhere, if it’s sufficiently formalized.
Vince Carey (05:51:45) (in thread): > Thanks @Alan O’C for reminding us of this. I don’t have ready access to Windows to explore this problem. I have used basilisk with the updated miniconda on M1 mac without trouble. So there may be a need to special-case windows somehow. This might push me over the threshold to get a windows machine for my 24/7 use. Are you up for joining BaSC?
Vince Carey (06:36:33) (in thread): > Is it worth revisiting https://conda.io/projects/conda/en/latest/user-guide/install/windows.html#installing-in-silent-mode? I see some negative findings in the issues at https://github.com/conda/conda/issues/10611 but some time has passed … - Attachment: #10611 Miniconda installation fails > Current Behavior > > In our company, we have a “software kiosk” for installing Miniconda silently on Windows 10 machines. For some people it does not work. As there are no installation logs, the software delivery team started to make some analysis of its own. What is observed that for the people for whom the installation does not work, the installer is blocked from forking subprocesses.
> Is this a known issue? Of course, we do not know if Windows is blocking or if it is something with the installer, if some can instruct on how I can provide more information, I would definitively like to do so. I also tried to ask here: https://stackoverflow.com/questions/66744993/miniconda-only-installs-partially-how-can-i-find-the-error, but unfortunately there has not be an answer that could resolve the issue. > > Steps to Reproduce > > > C:\WINDOWS\ccmcache\32\Files\Miniconda3-py39_4.9.2-Windows-x86_64.exe /InstallationType=AllUsers /AddToPath=1 /RegisterPython=1 /S /D=C:\Company\Miniconda >
> > Expected Behavior > > The installer should be able to install Miniconda. > > Environment Information > `conda info`
> > > Not Applicable (i.e. does not work, as conda does not get installed) >
> `conda config --show-sources`
> > > Not Applicable (i.e. does not work, as conda does not get installed) >
> `conda list --show-channel-urls`
> > > Not Applicable (i.e. does not work, as conda does not get installed) >
Alan O’C (07:26:52) (in thread): > I wouldn’t be keen to have the conda versions mismatched so widely, but if there’s a workaround it’d be fine
Alan O’C (07:27:23) (in thread): > Would have to ask my boss if this is something I can spend time on, as my extracurriculars are a bit oversubscribed at the moment
2024-01-21
stefano mangiola (05:16:57): > @stefano mangiola has joined the channel
Vince Carey (05:18:03) (in thread): > Yes, go ahead with the transfer.
2024-01-22
Aaron Lun (01:07:06) (in thread): > oops. looks like i can’t transfer a repo without write permissions on the target org
Aaron Lun (01:07:21) (in thread): > I wonder if I can give you owner rights, then you can transfer it yourself
stefano mangiola (18:51:46): > @Michael Milton, this is the biocPython channel I was mentioning to you. @Michael Milton is well versed in both R and Python
Michael Milton (18:51:49): > @Michael Milton has joined the channel
2024-01-23
Vince Carey (06:40:44) (in thread): > Update – after a couple of hours on windows I confirm that silent installation for a newer miniconda distro is currently problematic for windows, so I am going to withdraw the change to basilisk.utils for now.
Alan O’C (08:44:26) (in thread): > Yeah it’s unfortunate, hopefully we can find a solution, I’m sure eventually it’ll become a major blocking issue
Vince Carey (09:01:29) (in thread): > I accepted the invitation for basilisk but I can’t see settings. I wonder if it is because I am not listed as a collaborator? github handle vjcitn
Aaron Lun (10:42:08) (in thread): > hm. looks like I don’t have an option to give you owner permissions for a personal repo.
Aaron Lun (10:42:36) (in thread): > wait - a bit of a hack, but maybe I can transfer it to OSCA-source, and then you can transfer it to Bioc from there.
Aaron Lun (10:42:45) (in thread): > Do you have permissions on OSCA source?
Vince Carey (11:06:32) (in thread): > i am an owner of OSCA-source, it looks like
2024-01-25
Aaron Lun (01:48:57) (in thread): > in progress now
Aaron Lun (01:49:19) (in thread): > the only thing i would ask is that you hold onto the `submission` branch, as this syncs with the JOSS submission.
Vince Carey (15:34:28) (in thread): > OK, I transferred. Now “hold onto” submission just means we don’t remove it? I have no plan to remove any branches.
Vince Carey (15:34:47) (in thread): > Do you want to transfer basilisk.utils too?
Aaron Lun (15:36:40) (in thread): > I think i did
Aaron Lun (15:36:47) (in thread): > yes, just avoid deleting `submission`
Aaron Lun (15:37:17) (in thread): > (I have the paper itself in a separate repo anyway, but JOSS requires it to be on the main repo during submission.)
Vince Carey (15:37:30) (in thread): > oh yes, i see the utils package now.
2024-01-29
Saga (23:52:49): > @Saga has joined the channel
2024-01-31
Bernie Mulvey (14:21:30) (in thread): > @Luke Zappia just an update here: I found another issue that was getting between me and successfully installing/using the environment, namely relating to DYLD_FALLBACK_LIBRARY_PATH being set. https://github.com/rstudio/rstudio/issues/13967 - Attachment: #13967 Mac + Reticulate + RStudio 2023.09.0 or later > System details > > > RStudio Edition : Desktop > RStudio Version : RStudio 2023.09.0 / 2023.09.1 > OS Version : Mac OS 14.1.1 > R Version : 2023-11-20 r85569 >
> > Steps to reproduce the problem > > Python is installed via home brew. The path is passed to Reticulate as follows. > > > Sys.setenv(RETICULATE_PYTHON = "/opt/Homebrew/Cellar/python@3.11/3.11.6_1/bin/python3") > Sys.setenv(TENSORFLOW_PYTHON = "/opt/Homebrew/Cellar/python@3.11/3.11.6_1/bin/python3") > library(reticulate) >
> > In RStudio versions prior to Desert Sunflower, when I import umap, it works as expected. > > > > reticulate::repl_python() > Python 3.11.6 (/opt/homebrew/Cellar/python@3.11/3.11.6_1/bin/python3) > Reticulate 1.34.0 REPL -- A Python interpreter in R. > Enter 'exit' or 'quit' to exit the REPL and return to R. > >>> import numpy as np > >>> import pandas as pd > >>> import umap.umap_ as umap >
> > No issues here. > > Describe the problem in detail > > In 2023.09.0 and later (including the current version – 2023.09.1), the correct Python is picked up but then I get an error: > > > > reticulate::repl_python() > Python 3.11.6 (/opt/homebrew/Cellar/python@3.11/3.11.6_1/bin/python3) > Reticulate 1.34.0 REPL -- A Python interpreter in R. > Enter 'exit' or 'quit' to exit the REPL and return to R. > >>> import numpy as np > >>> import pandas as pd > >>> import umap.umap_ as umap > Traceback (most recent call last): > File "<python_lib_path>/site-packages/llvmlite/binding/ffi.py", line 134, in *_getattr_* > return self._fntab[name] > ~~~~~~~~~~~~~~~~~~~~~^^^^^^ > KeyError: 'LLVMPY_AddSymbol' > > During handling of the above exception, another exception occurred: > > Traceback (most recent call last): > File "<python_lib_path>/site-packages/llvmlite/binding/ffi.py", line 115, in _load_lib > self._lib_handle = ctypes.CDLL(str(lib_path)) > ^^^^^^^^^^^^^^^^^^^^^^^^^^ > File "<python_lib_path>/lib/python3.11/ctypes/*_init_*.py", line 376, in *_init_* > self._handle = _dlopen(self._name, mode) > ^^^^^^^^^^^^^^^^^^^^^^^^^ > OSError: dlopen(<python_lib_path>/site-packages/llvmlite/binding/libllvmlite.dylib, 0x0006): Library not loaded: @rpath/libz.1.dylib > Referenced from: <dynamic_library_path> /site-packages/llvmlite/binding/libllvmlite.dylib > Reason: tried: '<other_paths>' > > The above exception was the direct cause of the following exception: > > Traceback (most recent call last): > File "<stdin>", line 1, in <module> > File "<python_lib_path>/site-packages/umap/umap_.py", line 29, in <module> > import numba > File "<r_library_path>/python/rpytools/loader.py", line 119, in _find_and_load_hook > return _run_hook(name, _hook) > ^^^^^^^^^^^^^^^^^^^^^^ > ... 
[repeated stack trace omitted for brevity] > File "<python_lib_path>/site-packages/llvmlite/binding/*_init_*.py", line 4, in <module> > from .dylib import * > File "<r_library_path>/python/rpytools/loader.py", line 119, in _find_and_load_hook > return _run_hook(name, _hook) > ^^^^^^^^^^^^^^^^^^^^^^ > ... [repeated stack trace omitted for brevity] > File "<python_lib_path>/site-packages/llvmlite/binding/dylib.py", line 36, in <module> > ffi.lib.LLVMPY_AddSymbol.argtypes = [ > ^^^^^^^^^^^^^^^^^^^^^^^^ > File "<python_lib_path>/site-packages/llvmlite/binding/ffi.py", line 137, in *_getattr_* > cfn = getattr(self._lib, name) > ^^^^^^^^^ > File "<python_lib_path>/site-packages/llvmlite/binding/ffi.py", line 129, in _lib > self._load_lib() > File "<python_lib_path>/site-packages/llvmlite/binding/ffi.py", line 123, in _load_lib > raise OSError("Could not find/load shared object file") from e > OSError: Could not find/load shared object file >
> > I trimmed the output to anonymise but I hope the details are clear. > > I installed and uninstalled RStudio leaving all other libraries (including R, Python, and umap) the same and I was able to replicate this behavior whereby there is no error with prior versions of RStudio and an error with a newer version of RStudio. In addition, if I launch Python through terminal, I do not get an error. Finally, many other Python modules load ok but umap always leads to the same error. > > All of this suggests that some change in Desert Sunflower broke the configuration in my system. I suspect this has to do with the change in the handling of DYLD_FALLBACK_LIBRARY_PATH. > > I have installed and reinstalled reticulate and all other R and Python libraries but nothing seems to help. It is only when I revert to the old version of RStudio that the error goes away. > > I also get this error if I try to import numba when running Python through Reticulate in the newer version of RStudio but not otherwise. As numba is used by umap, I suspect that this may be interrelated or even the source of the error. However, I am clueless beyond that. > > My system is a bit cruddy and has old libraries. Something could have been left behind that is causing the issue or it may be the path. However, the fact that the older version of RStudio and running Python directly on Terminal give me the expected behavior makes me pretty sure some change in the new version is key to this. > > If this is replicable then the following steps “should” replicate it. > > 1. Install Python 3.11 using Homebrew. > 2. Install umap-learn using pip. > 3. Set Python path, run the interpreter, and load the library. > 4. Import umap or numba. > > Describe the behavior you expected > > Reticulate + Python in newer versions of RStudio should be the same as Python on terminal and Reticulate + Python on older versions of RStudio. > > For now, I am going to simply use an older version of RStudio but I am happy to help debug if anyone has specific questions. I must admit I am unsure of where to start as it’s quite a specific error, but I am happy to report specifics from both prior and new versions if you tell me what specifics to report. > > • [X] I have read the guide for submitting good bug reports. > • [X] I have installed the latest version of RStudio, and confirmed that the issue still persists. > • [X] If I am reporting an RStudio crash, I have included a diagnostics report. > • [X] I have done my best to include a minimal, self-contained set of instructions for consistently reproducing the issue.
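The report above offers to supply specifics from both RStudio versions without knowing which ones matter. A minimal sketch of what could be collected is shown here, assuming the relevant difference is in the dynamic-loader environment seen by the Python process. The helper name `library_lookup_report` is hypothetical, and the code only reports what it finds; it does not confirm the cause of the error. Running it once from Terminal and once inside `reticulate::repl_python()` would give two reports to compare.

```python
import ctypes
import ctypes.util
import os
import sys


def library_lookup_report(lib_name="z"):
    """Collect loader-related settings visible to this Python process.

    `lib_name` is the short library name ("z" for libz), as accepted by
    ctypes.util.find_library.
    """
    report = {
        "executable": sys.executable,
        # macOS dynamic-loader search paths; None when the variable is unset.
        "DYLD_FALLBACK_LIBRARY_PATH": os.environ.get("DYLD_FALLBACK_LIBRARY_PATH"),
        "DYLD_LIBRARY_PATH": os.environ.get("DYLD_LIBRARY_PATH"),
        # Where ctypes believes the library lives, or None if it cannot find it.
        "found": ctypes.util.find_library(lib_name),
        "loadable": False,
    }
    if report["found"] is not None:
        try:
            ctypes.CDLL(report["found"])
            report["loadable"] = True
        except OSError:
            # Found by name but could not be opened by the loader.
            pass
    return report


if __name__ == "__main__":
    for key, value in library_lookup_report().items():
        print(f"{key}: {value}")
```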
2024-02-05
Alan O’C (08:24:54) (in thread): > I am tentatively willing to be involved, once the scater/scuttle/scran replacement comes in, so I’m at least trading one maintenance burden for another
2024-02-08
Julien Roux (03:08:00): > @Julien Roux has joined the channel
2024-02-09
Julien Roux (04:49:41): > Hi all, > I recently ran into an issue when trying RNA velocity analysis with velociraptor, which creates a conda environment through basilisk to run the scvelo Python package. The environment could not be created because of package conflicts. > I tracked the issue down to interference from my local miniconda3 installation, notably the line channel_priority: strict in my ~/.condarc. To be honest, I did not expect basilisk to take my ~/.condarc into consideration. Is this expected behaviour? > If so, I think it should be documented somewhere, for example in the vignette. > What do you think?
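A quick way to see whether a local `~/.condarc` sets `channel_priority` is sketched below. This is a rough illustration only: the helper `condarc_setting` is hypothetical, it is a naive line scan rather than a YAML parser, and it looks only at top-level `key: value` lines.

```python
from pathlib import Path


def condarc_setting(text, key):
    """Return the value of a top-level `key: value` line, or None if absent.

    Deliberately naive: it ignores nested entries and list items and does
    not handle quoting, so treat the result as a hint only.
    """
    for line in text.splitlines():
        stripped = line.split("#", 1)[0].rstrip()  # drop trailing comments
        if not stripped or stripped[0].isspace():
            continue  # skip blank lines, comment lines, and nested entries
        name, sep, value = stripped.partition(":")
        if sep and name.strip() == key:
            return value.strip() or None
    return None


if __name__ == "__main__":
    condarc = Path.home() / ".condarc"
    if condarc.is_file():
        value = condarc_setting(condarc.read_text(), "channel_priority")
        print("channel_priority:", value)
    else:
        print("no ~/.condarc found")
```

If this prints `strict`, the setting described in the message is present in the user-level configuration.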
Vince Carey (08:53:24): > @Julien Roux thanks for this observation. The basilisk.utils package includes considerable documentation about configuration. If there is in fact no material addressing your concern please file an issue at https://github.com/bioconductor/basilisk/issues as the maintenance of the package has been transferred to bioc core. A pull request would be welcome; perhaps the basilisk.utils repo would be appropriate. There are many ways in which conda and R might interact. Check the output of reticulate::py_config() in different scenarios as there may be some relationships among installed packages that arise from this configuration element.
Julien Roux (09:23:41) (in thread): > Thanks, I’ll have a look!
Alan O’C (09:41:59): > I would expect basilisk to respect your condarc, honestly, given a common pattern is to store conda environments/packages/etc in a single place to make them easier to manage, or to get around disk quotas
2024-02-15
Vince Carey (05:38:46): > https://github.com/basnijholt/unidep claims to be a single source of truth for python interdependency resolution. Interest? Also, does poetry have a role in evolution of bioc-python interop?
Sebastian Lobentanzer (05:59:18) (in thread): > Interesting! I have been using poetry almost exclusively for most of my Python projects, and the dependency resolving works much better than the conda/pip I know from before.
Sebastian Lobentanzer (05:59:43) (in thread): > There is also pixi: https://prefix.dev - Attachment (prefix.dev): prefix.dev – solving software package management > The software package management platform for Python, C++, R, Rust and more
2024-02-16
Jayaram Kancherla (00:31:19) (in thread): > the newer versions of conda come with mamba as the solver and it does a pretty good and fast job at dependency resolution
Vince Carey (05:59:41) (in thread): > @Jayaram Kancherla we’d love to use the more recent miniconda in basilisk to get mamba functionality but are blocked because silent installation is not supported on windows.
Jayaram Kancherla (09:35:06) (in thread): > just ran into this on twitter -https://x.com/charliermarsh/status/1758216803275149389?s=46&t=M-P74_YozEAT_CQxRpkAjQ - Attachment (X (formerly Twitter)): Charlie Marsh (@charliermarsh) on X > Announcing uv: an extremely fast Python package installer and resolver, written in Rust. > > uv is designed as a drop-in alternative to pip, pip-tools, and virtualenv. > > With a warm cache, uv installs are near-instant. Here, it’s > 75x faster than pip and pip-tools.
2024-02-17
Manvi Yaduvanshi (10:54:49): > Hey! This is Manvi Yaduvanshi. I’m a newbie contributor to Bioc, lately contributed to Sweave2Rmd. Was off for a while but may I know any issues or anything I can contribute in here. Along with a few resources?
2024-02-28
Maria Doyle (08:45:41): > @Maria Doyle has joined the channel
2024-03-06
Jayaram Kancherla (09:47:54): > Hello, everyone! > > I shared this with a few folks, we recently launched a tutorial highlighting the functionality implemented through biocpy @ https://biocpy.github.io/tutorial/ This tutorial delves into the functionalities of key data structures and packages we have developed in Python, that align closely with their Bioconductor implementations. We’ve also included workflows to make transitioning between R and Python easier for various analysis use cases. If you are interested and want to contribute, don’t hesitate to jump in or reach out. Excited to hear your thoughts, contributions, and any feedback you might have!
2024-03-07
Aaron Lun (08:30:51): > :point_up: above work is also compatible with the recently migrated representations in the scRNAseq and celldex packages, so you can get the same data in both R and Python for your workflows.
2024-03-16
Aaron Lun (00:35:50): > @Vince Carey see https://github.com/Bioconductor/basilisk/pull/38; perhaps more general coordination across the project to remove defaults would be helpful. - Attachment: #38 Commented on the licensing problems with the defaults channel. > The defaults packages haven’t been free to use for a long time, which makes it difficult to use any default-dependent Bioconductor packages in commercial settings. Perhaps a more general notification to affected packages (e.g., FLAMES) would also be helpful if they can be persuaded to drop defaults.
2024-03-25
Tim Triche (17:22:24): > @Tim Triche has joined the channel
2024-03-27
David Rach (13:20:42): > @David Rach has joined the channel
2024-03-29
Manisha Nair (06:16:59): > @Manisha Nair has joined the channel
2024-04-04
Aaron Lun (13:54:37): > @Vince Carey see https://github.com/Bioconductor/basilisk.utils/pull/11, conda’s non-OSS license is coming back to bite - Attachment: #11 Provide an option to use mamba-forge instead of Miniconda. > This has an actual, proper, open-source license compared to Anaconda, which makes it a lot safer to use in commercial environments that would otherwise be restricted by Anaconda’s ToS. > > This option is currently not set as the default as it is liable to break a whole bunch of downstream packages. I’d guess that BioC may want to set it as the default as the license is more in line with BioC’s OSS goals, but this would require at least one release’s worth of break-fixes, e.g., for all of the packages that pull from defaults.
Vince Carey (13:55:51): > Thanks for the tip. We’ll have to tackle this soon.
2024-04-10
Vince Carey (12:06:58) (in thread): > There may be a path forward with mambaforge … I just tried a silent installation and it succeeded
Vince Carey (12:07:12) (in thread): > Contrary to what I wrote on a basilisk.utils issue
2024-04-28
Danielle Callan (08:19:47): > @Danielle Callan has joined the channel
2024-04-29
Julie Iskander (23:57:16): > @Julie Iskander has joined the channel
2024-05-09
Vince Carey (07:12:04): > https://github.com/conda/conda/issues/10611#issuecomment-2102421331 may be a path to achieving windows support when upgrading the miniconda selection in basilisk - Attachment: Comment on #10611 Miniconda installation fails > From the documentation: > > grafik > > That means, using /S requires /D as well. This is what the documentation says, but from my testing (with a Windows 11 sandbox) this is not needed. > > It seems to fail only when it cannot write to the destination, I suggest to run silent uninstall first. > > > miniconda3\Uninstall-Miniconda3.exe /S >
2024-05-31
Vince Carey (18:13:17) (in thread): > silent installation with recent miniconda or miniforge continues to fail on windows
2024-07-11
Sathish Kumar (06:03:43): > @Sathish Kumar has joined the channel
2024-07-23
Aaron Lun (19:55:10): > @here planning to switch to miniforge as the default in basilisk, to avoid the licensing headaches from miniconda’s defaults. This is liable to break any package that was implicitly relying on main, default, etc. channels; these should now be explicitly listed in the channels= of the BasiliskEnvironment() constructor.
Jacques SERIZAY (23:22:53) (in thread): > @Aaron Lun does that mean that mamba is available to use now? I have not followed the development of basilisk over the past few months. I just found this issue (https://github.com/Bioconductor/basilisk.utils/pull/11) which seems to suggest that it is, but I don’t find any information in the doc/manual? Is everything happening under the hood? - Attachment: #11 Provide an option to use miniforge instead of Miniconda. > This has an actual, proper, open-source license compared to Anaconda, which makes it a lot safer to use in commercial environments that would otherwise be restricted by Anaconda’s ToS. > > This option is currently not set as the default as it is liable to break a whole bunch of downstream packages. I’d guess that BioC may want to set it as the default as the license is more in line with BioC’s OSS goals, but this would require at least one release’s worth of break-fixes, e.g., for all of the packages that pull from defaults.
2024-07-24
Aaron Lun (18:48:29): > IT IS DONE, 1.17.1. The latest version of basilisk will (i) use Miniforge by default, and (ii) ignore the .condarc on the host machine. If your package previously had an implicit dependency on any of the Anaconda channels (e.g., main, default, etc.) or on the default Python version, you should change it to be explicitly declared. Otherwise, miniforge should be a plug-and-play replacement for the old Miniconda installation.
Aaron Lun (18:51:22) (in thread): > I believe that the latest versions of conda use libmamba (the C++ library underneath mamba) for resolution, see here. We’re using the very latest release of miniforge (24.3.0-0 as of writing) so I would expect it to effectively be the same as using mamba under the hood. Indeed I see some “mamba” mentions popping up in the installation stdout, so I assume that it’s involved somewhere. - Attachment (conda.org): Conda 23.10.0: libmamba is now the default solver | conda.org > :bullettrain_side: The speedy conda-libmamba-solver becomes the default solver in the Conda 23.10.0 release. Please fasten your seat belts.
Jacques SERIZAY (23:23:42) (in thread): > Thanks Aaron, nice to hear about this. I think I have not been playing with latest conda for a while, so I had probably missed this!
2024-09-11
Charlotte Soneson (15:15:11): > I’m curious to learn if anyone knows if it’s possible to use reticulate to display a matplotlib plot in an R Markdown R chunk. The attached Rmd provides a reproducible example (assuming the presence of a conda environment containing at least matplotlib) - in the R chunk, I have not found a way to get the plot to display (using an interactive backend just brings up the plot in a new window, with a non-interactive one it doesn’t display at all). - File (Plain Text): test_python_plotting.Rmd
2024-09-12
Philippa Doherty (12:09:50): > @Philippa Doherty has joined the channel
2024-09-13
H. Emre (09:17:18): > @H. Emre has joined the channel
2024-09-20
Camille Guillermin (09:29:37): > @Camille Guillermin has joined the channel
2024-09-24
Alan O’C (04:38:44) (in thread): > A truly heinous way would be to use matplotlib to save to disk then knitr::include_graphics it
Charlotte Soneson (04:48:26) (in thread): > Yes, that works (in the example above I used plt$savefig + knitr::include_graphics in an R chunk), but I agree that it doesn’t feel very smooth. In the end, for now I resorted to just plotting in a python chunk (accessing the object from the R session :woman-shrugging:)
2024-10-04
Monisa Hasnain (02:05:36): > @Monisa Hasnain has joined the channel
2024-10-23
Abdullah Al Nahid (16:59:36): > @Abdullah Al Nahid has joined the channel
Juan Henao (17:25:18): > @Juan Henao has joined the channel
Hong Qin (17:44:34): > @Hong Qin has joined the channel
2024-10-29
Maximilian (06:14:25): > @Maximilian has joined the channel
2024-11-07
Malvika Kharbanda (22:09:36): > @Malvika Kharbanda has joined the channel
2024-11-11
Jared Andrews (12:40:37): > @Jared Andrews has joined the channel
Josh Steier (12:42:24): > @Josh Steier has joined the channel
Josh Steier (13:03:39): > Hi everyone, > I have a background in computational biology(MS applied math and stats degree from Stony Brook University), and a strong background in machine learning, where I currently work as a machine learning engineer. > I’ve been working in machine learning for roughly 3-4 years, including my educational background, and programming in Python since I was younger. > I’m eager to contribute to Bioconductor and don’t know where to start, can someone help me with that? > Thank you!
Mikhail Dozmorov (13:06:39): > @Mikhail Dozmorov has joined the channel
Jayaram Kancherla (14:00:12) (in thread): > Hi Josh, Thank you for your interest in contributing to Bioconductor. > > For BiocPy, we have two potential projects that might interest you: 1) implementing the R/CompressedLists (see discussion) in Python, which would be a great introduction to both the R and the Python Bioconductor ecosystems, or 2) developing representations for spatial experiments and implementing downstream analysis methods (discussion). > > I would recommend (1) if you want to get started. I’m also happy to meet and discuss these options and explore what best aligns with your interests.
Josh Steier (14:17:37) (in thread): > Hi Jayaram, > Thank you very much for the potential projects, and nice to meet you. > I would like to focus on (1) firstly and I appreciate the second one as well. > In order to be most effective I’ll prioritize (1), and please let me know of any additional information I might need to get started, just getting my feet wet here! > We can meet sometime too, perhaps when works best for you. > I’m available at these times and have general flexibility(EST time zone, US): > 1). Tuesday 1pm-3pm. > 2). Wednesday(and generally, every day of the week, except Friday), 12pm-3pm. > 3). Friday, 11am-3pm. > > Let me know, and feel free to give any additional information, I would love to get started before meeting, and I appreciate your time and help.
Jayaram Kancherla (14:20:33) (in thread): > No worries, we have a developers guide: https://github.com/BiocPy/developer_guide to get started so that our python package management is kind of consistent. Feel free to take a look and let me know if you have any questions. I’ll send you a calendar invite for us to touch base!
2024-11-12
Kylie Bemis (08:31:55): > @Kylie Bemis has joined the channel
Matteo Tiberti (09:05:25): > @Matteo Tiberti has joined the channel
Marouen Ben Guebila (22:34:16): > @Marouen Ben Guebila has joined the channel
2024-11-13
Dharmesh Dinesh Bhuva (17:32:15): > @Dharmesh Dinesh Bhuva has joined the channel
2024-11-15
Louise (15:01:21): > @Louise has joined the channel
2024-11-19
Ritika Giri (09:56:16): > @Ritika Giri has joined the channel
2024-11-25
Sergius Nyah (17:51:44): > @Sergius Nyah has joined the channel
Sergius Nyah (17:52:46) (in thread): > Thanks for this, Jay! > Would begin contributing as well!
Sergius Nyah (17:56:38) (in thread): > After reading the guidelines, how do I choose which Repo to contribute to ?
2024-12-12
Jayaram Kancherla (01:24:42): > This may come in handy for scaffolding new Python packages so that the package structure and management is more consistent. Check it out and let us know if you have any thoughts or issues. https://github.com/biocpy/biocsetup
Jayaram Kancherla (01:26:19) (in thread): > Feel free to pickup issues across any current packages or contribute a new package that interests you
Sergius Nyah (01:28:19) (in thread): > Great! Thanks sir
2024-12-21
hcorrada (00:08:20): > @hcorrada has joined the channel
2024-12-23
Jayaram Kancherla (13:38:26): > this might be of interest here: https://github.com/Wainberg/ryp > > - running R code inside Python > - quickly transferring huge datasets between Python (NumPy/pandas/polars) and R without writing to disk > - interactively working in both languages at the same time >
2024-12-26
Sergius Nyah (21:32:33): > Hello everyone, > > I hope this message finds you well. My name is Sergius, and I’m excited to introduce myself as a new contributor to the community. I’ve taken the time to thoroughly review the developer guide and complete the necessary steps to get started. > > As I’m eager to begin contributing, I was wondering if someone could kindly point me in the right direction. Could you please let me know where I can find issues to work on?
2024-12-28
Tejaswi Velugapally (05:24:17): > @Tejaswi Velugapally has joined the channel
Pascal-Onaho (07:55:04): > @Pascal-Onaho has joined the channel
2024-12-29
Yahya Jahun (04:01:39): > @Yahya Jahun has joined the channel
2025-01-02
Najuma Najeem (11:01:13): > @Najuma Najeem has joined the channel
2025-01-09
Ouma Ronald (07:17:39): > @Ouma Ronald has joined the channel
2025-01-13
Vince Carey (14:31:51): > Here’s what was offered at basel hackathon but not presented: https://docs.google.com/presentation/d/1KFD8UegdmMla0Z4CR4pIaco3YIiyUcGcu90n4cIjzoY/edit?usp=sharing - File (Google Slides): Language-agnostic, self-describing data structures for genomics (copy)
Vince Carey (14:34:41): > This is the basic package for SpatialData, working from the scverse/spatialdata zarr-oriented format for diverse spatial transcriptomics platforms: https://github.com/HelenaLC/SpatialData … in the same repo find SpatialData.data and SpatialData.plot
2025-01-14
neslihan oztas (10:19:21): > @neslihan oztas has joined the channel
2025-01-16
David Akwuru (09:07:37): > @David Akwuru has joined the channel
Jeremiah Akintomide (09:07:57): > @Jeremiah Akintomide has joined the channel
2025-01-22
Jayaram Kancherla (10:23:17) (in thread): > Hi @Sergius Nyah, I’m guessing you are interested in contributing to Python packages. I’ve tagged a couple of issues for first time contributors. If you are interested, please take a look: https://github.com/BiocPy
Sergius Nyah (15:04:46) (in thread): > Okay thanks. Would do!
2025-01-29
apekshya kandel (09:33:26): > @apekshya kandel has joined the channel
2025-02-11
Kevin (15:21:03): > @Kevin has joined the channel
Aaron Lun (16:33:12): > basilisk updated to use miniforge 24.11.3-0 and reticulate 1.40.0 in the fallback.
2025-02-14
Sarah (20:19:55): > @Sarah has joined the channel
2025-02-25
Artür Manukyan (10:56:36): > @Artür Manukyan has joined the channel
2025-02-27
Vince Carey (05:44:30): > We put significant effort into interoperation with scverse/spatialdata, but conditions with dask (and with synchronization for other modules) have proven problematic. The issues connected with this comment are illustrative and are an opportunity for an interested reader to go in and make a PR to dask. - Attachment: Comment on #11146 Dask 2024.5.1 removed .attrs > I currently don’t have enough bandwidth to take this on. This would need someone from the community taking a look at it and providing a first PR for it.
Luca Marconato (09:28:47): > @Luca Marconato has joined the channel
Jayaram Kancherla (10:51:27) (in thread): > This happened a couple of times for us as we started writing biocpy classes. It’s one of the reasons we wrote BiocFrame (& provide coercions back and forth from pandas and polars). A short-term solution is to explicitly & strictly version the set of packages needed for spatialdata to work, something similar to, but not as robust as, your comment in the R/spatial data package
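The "explicitly & strictly version" suggestion can be backed by a runtime check that the installed packages match the pinned versions. The sketch below uses only the standard library. The `PINS` mapping and the `check_pins` helper are hypothetical, and the version numbers are placeholders for illustration, not the versions spatialdata actually requires.

```python
from importlib import metadata

# Hypothetical pins for illustration only; not a tested, known-good set.
PINS = {"dask": "2024.4.1", "zarr": "2.18.2"}


def check_pins(pins, get_version=metadata.version):
    """Return {package: (wanted, found)} for every package that misses its pin.

    `found` is None when the package is not installed. `get_version` can be
    swapped out for testing.
    """
    mismatches = {}
    for name, wanted in pins.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None  # not installed at all
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in check_pins(PINS).items():
        print(f"{name}: wanted {wanted}, found {found}")
```

Exact string equality is the strictest possible comparison; a real setup would more likely put the pins in the package's dependency metadata and let the installer enforce them.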
2025-02-28
Luca Marconato (08:45:44) (in thread): > For the moment in spatialdata Python we are also constraining the version of Dask we can install.
2025-03-01
Jayaram Kancherla (13:37:28) (in thread): > @Luca Marconato: have you all looked into ray instead of dask? I am not against dask and have used it quite a lot in my packages. but have heard good things about ray and the community support it has.
2025-03-02
Luca Marconato (12:53:13) (in thread): > Thanks for sharing. The reason why we went with dask is that it has support both for Zarr and Parquet and it’s quite flexible with data types (so for instance it was extended to geopandas via dask-geopandas). It seems that ray only supports Parquet. Zarr (OME-Zarr to be precise) is a key technology for us, so for the moment we don’t have much of a way around Dask.
Jayaram Kancherla (20:39:32) (in thread): > makes sense, didn’t know ray doesn’t support these formats. I thought they would’ve abstracted the file format from their compute interfaces. thank you for explaining this.
2025-03-24
Vince Carey (13:31:14): > no biocpy call today?
2025-03-25
Jayaram Kancherla (21:31:42) (in thread): > apologies, I’m still on a break but will be back for the next one on 4/7.
2025-04-17
Khanh (21:26:21): > @Khanh has joined the channel
2025-04-25
Aaron Lun (00:26:57): > @channel https://github.com/Bioconductor/basilisk/pull/52
2025-04-28
Vince Carey (07:37:37): > i can’t make working group call today at 130 but might join at 2 if it happens