#voyager

2022-11-01

U7T29M3DG (23:11:10): > @U7T29M3DG has joined the channel

UA5GZMWHM (23:11:30): > @UA5GZMWHM has joined the channel

U02EN9EQQ5U (23:11:30): > @U02EN9EQQ5U has joined the channel

U91L3C2KF (23:11:30): > @U91L3C2KF has joined the channel

U7T29M3DG (23:13:23): > /github subscribe pachterlab/voyager

U01UF27E9P0 (23:49:55):

2022-11-02

U01UF27E9P0 (01:27:33):

U01UF27E9P0 (01:52:13):

U01UF27E9P0 (01:54:32):

U01UF27E9P0 (10:06:18):

U7T29M3DG (18:30:47): > Hi <@UA5GZMWHM>

U7T29M3DG (18:30:54): > I’ve updated the description in the main branch.

U7T29M3DG (18:31:08): > Separately, I’ve edited the Slide-seq, Xenium, and MERFISH vignettes.

U7T29M3DG (18:31:11): > Can you rebuild?

U7T29M3DG (18:32:27): > FYI Bioconductor 3.16 is released: https://twitter.com/Bioconductor/status/1587874254414123008?s=20&t=mjmQPflNjrgI0OlP0Zz8EQ - Attachment (twitter): Attachment > Bioconductor 3.16 is Released. Thanks to all developers and community members for contributing to the project! Please see the full release announcement: https://bioconductor.org/news/bioc_3_16_release/ #Rstats @Bioconductor

U7T29M3DG (18:32:50): > Voyager is here: https://bioconductor.org/packages/release/bioc/html/Voyager.html - Attachment (Bioconductor): Voyager > Voyager to SpatialFeatureExperiment (SFE) is just like scater to SingleCellExperiment. While SFE is a new S4 class, Voyager implements basic exploratory spatial data analysis (ESDA) methods for SFE. This first version supports univariate global spatial ESDA methods such as Moran’s I, permutation testing for Moran’s I, and correlograms. Voyager also implements plotting functions to plot SFE data and ESDA results. Multivariate ESDA and univariate local metrics will be added in later versions.

U01UF27E9P0 (18:33:52):

U01UF27E9P0 (18:57:29):

U7T29M3DG (18:59:34): > Ok good point. We should check that it’s consistent in the vignettes as well. We can stick to capital for the package (lowercase on the GitHub repo).

U7T29M3DG (18:59:42): > I think I erroneously changed a few of these. Apologies.

UA5GZMWHM (19:44:55): > It seems that the MERFISH vignette is taking much longer to build on GitHub Actions than on Tolva. I monitored memory use while running it on Tolva. It never exceeded 7 GB, which is the amount of RAM the GitHub Action runner has. However, it gets to about 6.4 GB. The limited memory probably made it really slow. It’s still running and hasn’t thrown an out of memory error yet.

UA5GZMWHM (20:34:56) (in thread): > The reason why it’s so much slower on GitHub Actions is probably not RAM but that dbscan is not installed there, so a less efficient algorithm is used for k nearest neighbors. I found that for the CosMX and Xenium datasets, k nearest neighbors takes much longer on GitHub Actions than on Tolva, and dbscan is not installed on GitHub Actions. I just pushed a new tag that makes GitHub Actions install dbscan, so I’ll soon find out if this is the case.

U01UF27E9P0 (20:42:09):

2022-11-03

U7T29M3DG (10:08:56): > <@UA5GZMWHM>: let me know when the site is rebuilt (the MERFISH link is still broken)

UA5GZMWHM (10:10:18): > I’m having trouble building it on GitHub Actions. It just never finished. I let it run overnight and it still didn’t finish in 6 hours so the system automatically cancelled the job. I’m trying to use their Mac machine, which has more RAM. I still wonder why that happens since it built without any problem on my laptop with 8 GB of RAM while I was doing other things, though it’s quite a bit slower than on Tolva.

U01UF27E9P0 (11:56:01):

U7T29M3DG (11:59:02): > The site looks beautiful.

UA5GZMWHM (12:09:38): > That was a test, where I only built the MERFISH vignette to see if it works. It worked, so I’m now using GitHub’s Mac instance to rebuild the whole thing. I think I

UA5GZMWHM (12:09:53): > I’ll stick to Mac since it’s so much faster to install dependencies since there’re binaries.

UA5GZMWHM (12:10:23): > Actually I checked the specs of 2022 laptop models. They’re way more powerful than the Mac instance on GitHub.

U7T29M3DG (12:32:14): > Exactly. The current MacBook **Air** can be purchased with a 2 TB disk and 32 GB of RAM.

UA5GZMWHM (13:06:39): > It’s remarkable how quickly it evolved. My 2017 MacBook Pro has 8 GB of RAM and 4 cores. Now many laptops have 10 cores.

U01UF27E9P0 (15:03:28):

UA5GZMWHM (15:12:19): > The rebuild is complete. It took over an hour. If you want multiple vignettes for each technology, then rebuilding all the vignettes on GitHub Actions will take hours. I wonder if we should just do it on Tolva or on my laptop with lazy = TRUE so unedited old vignettes will not be rebuilt. But that’s bad for reproducibility, since old vignettes that are not rebuilt might no longer work as the packages update. Or maybe rebuild everything with every Bioconductor release or important bug fix rather than every time the website needs to be updated. I’m not sure if I can get GitHub Actions not to build old unchanged vignettes.
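
The "skip unedited vignettes" idea in the message above can be illustrated with a minimal sketch. This is not pkgdown's `lazy = TRUE` logic and not the project's workflow: the content-hash cache, the `cache.json` file name, and both function names are hypothetical, chosen only to show how a build step might decide which `.Rmd` sources changed since the last build.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def vignettes_to_rebuild(vignette_dir: Path, cache_file: Path) -> list[str]:
    """List vignette sources whose content differs from the last recorded build."""
    # Hypothetical cache format: {"file name": "digest at previous build"}.
    cache = json.loads(cache_file.read_text()) if cache_file.exists() else {}
    changed = []
    for src in sorted(vignette_dir.glob("*.Rmd")):
        if cache.get(src.name) != file_digest(src):
            changed.append(src.name)
    return changed


def record_build(vignette_dir: Path, cache_file: Path) -> None:
    """Store current digests so the next run can skip unchanged vignettes."""
    digests = {src.name: file_digest(src) for src in sorted(vignette_dir.glob("*.Rmd"))}
    cache_file.write_text(json.dumps(digests, indent=2))
```

A sketch like this only detects edits to the vignette source. It does not catch the reproducibility concern raised in the same message, where an unchanged vignette stops working because a dependency was updated, which is why a periodic full rebuild would still be needed.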

U7T29M3DG (15:55:31): > That’s a very good point you raise and something to think about / discuss. As many more vignettes are added in the future this has to somehow scale.

U7T29M3DG (15:55:57): > And I agree that having GitHub Actions build the vignettes is a good bar for reproducibility.

U01UF27E9P0 (18:06:58):

U01BMGGNFEF (19:36:49): > @U01BMGGNFEF has joined the channel

U023DK2HCM7 (19:37:38): > @U023DK2HCM7 has joined the channel

U7YCV0V8F (19:37:41): > @U7YCV0V8F has joined the channel

2022-11-04

U7T29M3DG (03:06:59): > kallisto bustools used for Visium pre-processing: - File (PNG): image.png

U7T29M3DG (03:07:00): > from https://www.nature.com/articles/s41587-022-01517-6 - Attachment (Nature): Spatial mapping of the total transcriptome by in situ polyadenylation > Nature Biotechnology - Spatial RNA sequencing is extended beyond poly-A transcripts to capture the full transcriptome.

2022-11-07

U7YCV0V8F (13:42:54): > https://www.nature.com/articles/s41592-022-01604-1 - Attachment (Nature): Light-Seq: light-directed in situ barcoding of biomolecules in fixed cells and tissues for spatially indexed sequencing > Nature Methods - Light-Seq uses light-directed DNA barcoding in fixed cells and tissues for multiplexed spatial indexing and subsequent next generation sequencing. This approach blends spatial and…

U7YCV0V8F (13:43:27): > > We first validated the barcoding chemistry in vitro on a glass surface coated with immobilized DNA strands. By adding fluorescently labeled barcode strands and using custom photomasks (for example, of a cat) to crosslink them to the surface, we were able to create patterns with a single barcode strand (Fig. 2b) or use sequential rounds of barcoding with unique strands to pattern multiple regions on the same slide, such as the three-color Penrose triangle (Fig. 2c). >

U7YCV0V8F (13:43:29): > this is super cool

U7YCV0V8F (13:43:33): > custom photo masks!

U7YCV0V8F (13:46:19): > super interesting idea

U7YCV0V8F (13:47:52): > > Using a ×10 objective, a single mirror in our DMD setup can yield a practical resolution <2 µm based on estimating the full-width at half-maximum on a dot array (Extended Data Fig. 1a,b). >

U7YCV0V8F (13:48:04): > I think they meant to say >2µm

U7YCV0V8F (13:48:45): > omg

U7YCV0V8F (13:48:47): > they use PER

U7YCV0V8F (13:48:56): > > Next, to read out the sequence of target DNAs with their corresponding crosslinked barcode sequences by NGS, the crosslinked bases must be addressed without loss of either the barcode identity or the barcoded sequence. To this end, we developed a cross-junction synthesis reaction to copy both the barcoded DNA sequence and barcode into a new single strand of DNA without a crosslink (Fig. 3a). For this we use a strategy similar to our previously developed Primer Exchange Reaction (PER)35. We use a primer with a strand displacing polymerase that copies a new strand until it is halted at the crosslink point. We designed the sequences around the crosslink to have an identical domain so that the extended primer can reach across the junction and be templated on the opposing strand through a branch migration36,37,38 competition between two identical domains. The single-stranded DNA product of this cross-junction reaction can then be amplified and read out with standard NGS pipelines. >

U7YCV0V8F (13:50:16): > im impressed with this paper

U7T29M3DG (15:06:43): > Are we meeting now <@UA5GZMWHM> <@U02EN9EQQ5U>?

UA5GZMWHM (15:06:54): > Isn’t it 1 pm?

UA5GZMWHM (15:07:15): > Oh right, now. I forgot. Can we meet on zoom? I’m still at home

UA5GZMWHM (15:07:31): > Or I’m coming in 15 minutes

U7T29M3DG (15:09:06): > It was supposed to be at noon

U7T29M3DG (15:09:17): > If you come in 15 minutes that’s fine.

U7T29M3DG (15:09:18): > Thanks!

UA5GZMWHM (15:09:24): > I’m sorry

U7T29M3DG (15:09:29): > No worries at all.

UA5GZMWHM (16:09:06): > https://www.icloud.com/reminders/0a29JTb57nc8dSZJsG-RrapWA#Voyager

U7T29M3DG (16:28:41): > https://twitter.com/slavov_n/status/1582347828818456576?s=20&t=NUhjO9n396K9ud5pKi6Azg - Attachment (twitter): Attachment > The results of different methods applied to the same scRNA-seq data differ substantially. > > This is true even for fold changes, as shown below for Seurat and Scanpy. > > The differences between selected transcript “markers” are even larger: https://www.biorxiv.org/content/10.1101/2022.05.09.490241v2 via @davisjmcc

UA5GZMWHM (17:57:11): > This paper has some comparisons of the results from spdep, GeoDa, and PySAL: https://onlinelibrary.wiley.com/doi/pdf/10.1111/gean.12319

UA5GZMWHM (18:00:08): > https://openaccess.nhh.no/nhh-xmlui/bitstream/handle/11250/2565494/Bivand_Wong.pdf?sequence=2

UA5GZMWHM (18:08:30): > The second one is more relevant

U01UF27E9P0 (18:48:52):

2022-11-08

U7YCV0V8F (10:35:50): > https://www.biorxiv.org/content/10.1101/2022.11.06.515380v1 - Attachment (bioRxiv): Modular cell type organization of cortical areas revealed by in situ sequencing > The cortex is composed of neuronal types with diverse gene expression that are organized into specialized cortical areas. These areas, each with characteristic cytoarchitecture, connectivity, and neuronal activity, are wired into modular networks. However, it remains unclear whether cortical areas and their modular organization can be similarly defined by their transcriptomic signatures. Here we used BARseq, a high-throughput in situ sequencing technique, to interrogate the expression of 107 cell type marker genes in 1.2 million cells over a mouse forebrain hemisphere at cellular resolution. De novo clustering of gene expression in single neurons revealed transcriptomic types that were consistent with previous single-cell RNAseq studies. Within medium-grained cell types that are shared across all cortical areas, gene expression and the distribution of fine-grained cell types vary along the contours of cortical areas. The compositions of transcriptomic types are highly predictive of cortical area identity. We grouped cortical areas into modules so that areas within a module, but not across modules, had similar compositions of transcriptomic types. Strikingly, these modules match cortical subnetworks that are highly interconnected, suggesting that cortical areas that are similar in cell types are also wired together. This “wire-by-similarity” rule reflects a novel organizing principle for the connectivity of cortical areas. Our BARseq-based strategy is high-throughput and low-cost, and scaling up this approach to many animals can potentially reveal the brain-wide molecular architecture across individuals, developmental times, and disease models. Competing Interest Statement: A.M.Z. is a founder and equity owner of Cajal Neuroscience and a member of its scientific advisory board. The remaining authors declare no competing interests.

U7YCV0V8F (15:03:59): > Gene expression and morphology

U7YCV0V8F (15:04:02): > https://doi.org/10.1038/s41592-022-01667-0

U7T29M3DG (18:07:11): > <!channel>: when is the call tomorrow scheduled for?

UA5GZMWHM (22:45:56): > 9 am

U7T29M3DG (23:22:56): > Thanks

2022-11-09

U7YCV0V8F (01:45:14): > I completely forgot to add this meeting to my calendar and double booked 9am. I can be on by 9:30

U7YCV0V8F (01:45:28): > Worst case I’ll get an update from you all offline

U7T29M3DG (02:54:49): > 9:30am should be fine.

Unknown User (11:55:39): > [Unsupported block type: call]

U7T29M3DG (12:47:59): > “One voice can be stronger than a thousand voices.”

U7YCV0V8F (12:48:32): > Which season/episode?

UA5GZMWHM (12:49:19): > Star Trek: Voyager episode The Gift

UA5GZMWHM (12:49:28): > When Seven of Nine joined the crew

2022-11-11

U7YCV0V8F (20:47:11): > https://gist.github.com/diegohaz/6ca633834ee54f781e1294f11e5f51ae

2022-11-16

U7YCV0V8F (13:02:36): > This seminar is happening now <@UA5GZMWHM> <@U02EN9EQQ5U> and may be useful to you

U7YCV0V8F (13:02:37): > https://cedars-sinai.zoom.us/j/99340398719?pwd=UFhoU2ZhNkJzOUE1L0txbi9mRW5kQT09

U7YCV0V8F (13:02:54): - File (PNG): image.png

U7T29M3DG (13:06:57): > Thanks for sharing the link. Looks like an interesting talk.

U7T29M3DG (13:07:42): > This is the author of spatial cut-and-tag: https://www.science.org/doi/10.1126/science.abg7216

U7T29M3DG (13:08:37): > dbit-seq: https://www.cell.com/cell/pdf/S0092-8674(20)31390-8.pdf

UA5GZMWHM (13:58:56): > <@U02EN9EQQ5U> Are you already working on a vignette? OK, it took me quite a while to catch up on updating the museumst database. Now I think I’ll begin writing a vignette that Lior really wants. I’ll do the ones you are not already working on.

U02EN9EQQ5U (14:00:50): > <@UA5GZMWHM> I’m working on the seqFISH vignette and Slide-Seq landing page for the website. I pushed the Visium landing page to the repo some time last week.

UA5GZMWHM (14:01:08): > OK, then I’ll work on using spatial methods on non-spatial scRNA-seq data

UA5GZMWHM (14:02:21): > Also, you need to push a tag to rebuild the website

U02EN9EQQ5U (14:03:02): > Yes. I figured it made sense to rebuild when the landing pages were completed

UA5GZMWHM (14:37:23): > FYI, there’s a bug in the newest version of ggplot2 that affects Voyager. Just don’t use anything with hexbin until that bug is fixed.

UA5GZMWHM (14:37:43): > Or use an older version of ggplot2

UA5GZMWHM (16:46:20): > <@U02EN9EQQ5U> Please remember to update _pkgdown.yml to put the landing pages into the navbar menu when you rebuild the website

U02EN9EQQ5U (16:47:10): > Will do <@UA5GZMWHM>

UA5GZMWHM (19:09:01) (in thread): > https://github.com/tidyverse/ggplot2/issues/5037 - Attachment: #5037 hexbin is broken > It looks like since at least the latest release (3.4.0) geom_hex is broken in some way. For an example, see <https://ggplot2.tidyverse.org/reference/geom_hex.html#ref-examples|the documentation>: > > image > > (Current state on left, expected output on the right.)

U01UF27E9P0 (23:58:55):

2022-11-17

U7T29M3DG (02:40:48): > I looked at the spatial atac-seq paper. We should add a vignette for this sometime. The paper is here:https://www.nature.com/articles/s41586-022-05094-1 - Attachment (Nature): Spatial profiling of chromatin accessibility in mouse and human tissues > Nature - Spatial-ATAC-seq—spatially resolved chromatin accessibility profiling of tissue sections using next-generation sequencing—delineated tissue-region-specific epigenetic…

U7T29M3DG (02:40:57): > The data is here: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE171943 - Attachment (ncbi.nlm.nih.gov): GEO Accession viewer > NCBI’s Gene Expression Omnibus (GEO) is a public archive and resource for gene expression data.

2022-11-18

U01UF27E9P0 (19:02:06):

2022-11-20

U7T29M3DG (11:27:08): > Hi <@UA5GZMWHM>

U7T29M3DG (11:27:35): > I’ve reviewed the nonspatial vignette and it’s very good. However I think we should delete the part where we paint on the UMAP (the initial UMAP is fine). And just stick to the Moran’s I analysis.

U7T29M3DG (11:27:52): > Furthermore, I do think we should add the basic QC up front (knee plot, mitochondrial gene check, and resultant filtering).

U7T29M3DG (11:28:38): > I also think we should perform the standard clustering and marker finding, and then display the Moran’s I specifically for (all) the main marker genes for PBMC.
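
For readers following the Moran’s I discussion, a minimal pure-Python sketch of the global statistic is given here. It is not Voyager’s implementation, and the dict-of-dicts weights format and the `chain_weights` helper are hypothetical. For non-spatial scRNA-seq data the neighbor graph would presumably come from k nearest neighbors in expression space rather than physical coordinates, which is an assumption not stated in the thread. The formula used is $I = \frac{n}{\sum_{ij} w_{ij}} \cdot \frac{\sum_{ij} w_{ij}(x_i - \bar{x})(x_j - \bar{x})}{\sum_i (x_i - \bar{x})^2}$.

```python
def morans_i(values, weights):
    """Global Moran's I.

    values  -- list of numbers, one per observation (e.g. one gene's expression per cell)
    weights -- dict of dicts; weights[i][j] is the weight from observation i to neighbor j
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # Sum of all weights (the normalizing constant W).
    total_weight = sum(w for nbrs in weights.values() for w in nbrs.values())
    # Weighted cross-products of deviations between neighbors.
    cross = sum(w * dev[i] * dev[j] for i, nbrs in weights.items() for j, w in nbrs.items())
    # Sum of squared deviations.
    denom = sum(d * d for d in dev)
    return (n / total_weight) * cross / denom


def chain_weights(n):
    """Hypothetical toy graph: observations on a line, binary weights between adjacent ones."""
    weights = {i: {} for i in range(n)}
    for i in range(n - 1):
        weights[i][i + 1] = 1.0
        weights[i + 1][i] = 1.0
    return weights
```

On the toy chain graph, a smooth gradient such as `[1, 2, 3, 4]` gives a positive value and an alternating pattern such as `[1, -1, 1, -1]` gives a negative one, which is the qualitative behavior one would look for when ranking marker genes by Moran’s I.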

UA5GZMWHM (17:36:04): > I did delete the umap part. I did a major edit though, with the correlograms, and DE, which are really interesting.

2022-11-26

U7T29M3DG (11:08:35): > <@UA5GZMWHM> and <@U02EN9EQQ5U> have made some quick progress on organization of the website, vignettes etc. Shall we schedule a call soon again (with <@U01BMGGNFEF> and <@U91L3C2KF>) to discuss the Python implementation? Any questions so far?

U91L3C2KF (11:09:11): > How is Tuesday morning UAB

U91L3C2KF (11:09:20): > Morning ish

U7T29M3DG (11:19:29): > UAB?

U91L3C2KF (13:03:18): > University of Alabama Birmingham, go Dragons!

U91L3C2KF (13:03:55): > Autocomplete gone wrong. Anytime in the morning California time is good for us

2022-11-27

U7T29M3DG (23:32:13): > Sounds good.

U7T29M3DG (23:33:43): > 9am?

2022-11-28

U91L3C2KF (08:41:52): > 9am (17:00 GMT) is good

2022-11-29

Unknown User (12:00:58): > [Unsupported block type: call]

Unknown User (12:13:21): > [Unsupported block type: call]

U91L3C2KF (12:44:31): > peturhelgi@gmail.com

U04CYELHK5H (12:45:42): > @U04CYELHK5H has joined the channel

U7YCV0V8F (12:49:50): > <@UA5GZMWHM> https://github.com/pachterlab/COVID19-County/blob/8b5d30832534bd08f8ae41a7cebb8acd59d9a9a0/.github/workflows/build.yml#L7-L11

U7YCV0V8F (12:50:14): > More info about cron: https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows

U7T29M3DG (13:00:49): > <@U04CYELHK5H> and <@U01BMGGNFEF>: what are your GitHub usernames?

U7T29M3DG (13:00:58): > I’ve made the new repo (voyager-testing)

UA5GZMWHM (13:05:06) (in thread): > Good to know. Thanks!

U04CYELHK5H (19:01:50) (in thread): > Mine is peturhelgi and Sindri’s is baroona

U7T29M3DG (19:10:20) (in thread): > Thanks. Added you to the repo.

2022-12-01

U01UF27E9P0 (02:13:21):

2022-12-03

UA5GZMWHM (00:33:40): > I haven’t figured out a way to keep the devel website from overwriting the actual website when deployed. How about this: I can make GitHub Actions build the website every day at 2 am, but not deploy it. You can git clone the repo, check out the gh-pages-devel branch, and open the index.html file to look at the devel version of the website. Is this acceptable? If not I may try to host it outside GitHub Pages, like Travis. Or self-host it on Tolva.

UA5GZMWHM (00:37:52): > How about this: I still want to use GitHub Actions. I’ll fork Voyager and create the devel website from the fork. I’ll give you write access to that fork. When we’re ready to update the main actual website, we can create a pull request with that fork, and the main website will be updated when we merge that pull request.

UA5GZMWHM (00:39:26): > So the URL of the devel website will be https://lambdamoses.github.io/voyager/dev/ (“dev” so it can’t be found by search engines).

UA5GZMWHM (00:40:30): > Then I suppose we no longer need the documentation-devel branch. Are you OK with deleting it?

U02EN9EQQ5U (00:41:27): > I think that should be ok - I haven’t yet pushed anything unique to the devel branch

UA5GZMWHM (00:41:47): > So I’m deleting it now before I fork

UA5GZMWHM (00:42:11): > The devel version of the website will be in the documentation branch of the fork

U7T29M3DG (00:43:08): > That works

U01UF27E9P0 (01:49:48):

U01UF27E9P0 (16:59:33):

U01UF27E9P0 (17:32:24):

U01UF27E9P0 (17:50:42):

2022-12-05

U04DA10TABX (15:23:04): > @U04DA10TABX has joined the channel

U01UF27E9P0 (20:20:08):

2022-12-06

UA5GZMWHM (22:12:51): > I’ve been stupid. I forgot that the time zone in the scheduled run is UTC, so the devel website just started building though I scheduled it to run at 2:22 am every day. We’ll see if it works.
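
The time zone mix-up above comes from GitHub Actions interpreting `schedule` cron expressions in UTC. A minimal sketch of the conversion follows. It assumes the intended local time zone is US Pacific standard time at a fixed UTC-8 offset and ignores daylight saving time; both function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed UTC-8 offset for Pacific standard time; daylight saving is ignored.
PST = timezone(timedelta(hours=-8), "PST")


def local_to_utc_cron(hour, minute, local_tz=PST):
    """Return a daily cron expression in UTC for a local wall-clock time."""
    # The calendar date is arbitrary; only the time of day matters for a daily job.
    local = datetime(2022, 12, 7, hour, minute, tzinfo=local_tz)
    utc = local.astimezone(timezone.utc)
    return f"{utc.minute} {utc.hour} * * *"


def utc_cron_in_local(hour, minute, local_tz=PST):
    """Return the local wall-clock time (HH:MM) at which a UTC cron time fires."""
    utc = datetime(2022, 12, 7, hour, minute, tzinfo=timezone.utc)
    return utc.astimezone(local_tz).strftime("%H:%M")
```

Under these assumptions, a cron time of 2:22 UTC fires at 18:22 Pacific on the previous calendar day, and a job meant to run at 2:22 am Pacific would need `22 10 * * *`.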

2022-12-07

UA5GZMWHM (00:14:19): > It worked. Here’s the devel website: https://lambdamoses.github.io/voyager/dev/index.html - Attachment (lambdamoses.github.io): From geospatial to spatial omics > SpatialFeatureExperiment (SFE) is a new S4 class for working with spatial single-cell genomics data. The voyager package implements basic exploratory spatial data analysis (ESDA) methods for SFE. This first version supports univariate global spatial ESDA methods such as Morans I, permutation testing for Morans I, and correlograms. The Voyager package also implements plotting functions to plot SFE data and ESDA results. Multivariate ESDA and univariate local metrics will be added in later versions.

U7T29M3DG (00:15:46): > Awesome!

UA5GZMWHM (15:56:16): > <@U02EN9EQQ5U> I ran ExperimentHubData::makeExperimentHubMetadata(".") which gave an error from SourceType = "Rds", but it should be RDS. I fixed that and pushed it to upstream. Did you get any email from Lori Shepherd in the last 2 weeks? There was one on November 29 which I cc-ed to you.

UA5GZMWHM (16:43:02): > That ggplot2 hexbin bug has been fixed: https://github.com/tidyverse/ggplot2/issues/5037#event-7982206123

U7T29M3DG (17:47:20): > Note to self: look at pixel-seq (h/t <@UCG42NA2U>)

UCG42NA2U (17:47:23): > @UCG42NA2U has joined the channel

U02EN9EQQ5U (18:24:29): > <@UA5GZMWHM> I just checked and still haven’t received anything beyond the initial thread - I’ll follow up

U02EN9EQQ5U (19:18:37): > Ok <@UA5GZMWHM>, now just waiting for Bioconductor to add the seqFISH metadata to ExperimentHub and to be assigned new IDs. I cc’ed you on the most recent email I sent this evening

2022-12-09

U7T29M3DG (15:51:49): > Another vignette we should write is for CODEX, which is spatial single-cell proteomics data. This paper has a good (available) dataset: https://www.nature.com/articles/s41592-022-01651-8 - Attachment (Nature): Annotation of spatially resolved single-cell data with STELLAR > Nature Methods - STELLAR (spatial cell learning) is a geometric deep learning model that works with spatially resolved single-cell datasets to both assign cell types in unannotated datasets based…

UCG42NA2U (18:49:03): > https://www.sciencedirect.com/science/article/pii/S0092867422013678 - Attachment (sciencedirect.com): Polony gels enable amplifiable DNA stamping and spatial transcriptomics of chronic pain > Methods for acquiring spatially resolved omics data from complex tissues use barcoded DNA arrays of low- to sub-micrometer features to achieve single-…

2022-12-12

U7T29M3DG (11:58:39): > Should we set up a brief meeting this week to catch up on the status of the various vignettes etc?

U02EN9EQQ5U (12:02:46) (in thread): > Yes - I think the 9am time slot has been working so far. I’m available Wednesday through Friday at that time this week.

U7T29M3DG (12:13:47) (in thread): > Wednesday 9am sounds good

UA5GZMWHM (13:11:25) (in thread): > OK, see you then

2022-12-13

UA5GZMWHM (00:42:54): > I copied and pasted some code from spdep to SFE with minor modifications when implementing my faster way of finding distance-based neighbors. Because spdep has a GPL license, I might change the license of SFE to GPL3.
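
The thread does not describe the faster method itself, so the following is only a generic illustration of distance-based neighbor finding, not the spdep code and not the SFE implementation. It uses a uniform grid so each point is compared only with points in adjacent cells instead of with every other point. The function name and the dict output format are hypothetical.

```python
from collections import defaultdict
from math import floor, hypot


def distance_neighbors(points, max_dist):
    """Return {i: sorted neighbor indices} for all pairs within max_dist (inclusive).

    points   -- list of (x, y) tuples
    max_dist -- positive distance threshold; also used as the grid cell size
    """
    # Bucket each point into a square cell of side max_dist.
    grid = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        grid[(floor(x / max_dist), floor(y / max_dist))].append(idx)

    neighbors = {i: [] for i in range(len(points))}
    for idx, (x, y) in enumerate(points):
        cx, cy = floor(x / max_dist), floor(y / max_dist)
        # Any point within max_dist must lie in the same cell or one of the 8 adjacent cells.
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for other in grid.get((cx + dx, cy + dy), ()):
                    ox, oy = points[other]
                    if other != idx and hypot(x - ox, y - oy) <= max_dist:
                        neighbors[idx].append(other)
        neighbors[idx].sort()
    return neighbors
```

For roughly uniform point densities this avoids the all-pairs comparison, which is the usual motivation for spatial indexing in neighbor searches.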

U7T29M3DG (01:02:59): > I’m fine with that.

U7YCV0V8F (01:40:52): > I think it’s worth considering whether this would affect adoption of SFE within industry

U7YCV0V8F (01:41:46): > My intuition is that companies may be hesitant to create production level code for which SFE + voyager are dependencies if it means that they have to also make their code GPL

UA5GZMWHM (01:46:31): > Incorporating code is different from dependency. Actually R itself is GPL while R packages don’t have to be GPL.

UA5GZMWHM (01:48:57): > Anyway, I might end up removing the code I copied and pasted after some benchmarks

U7T29M3DG (02:28:45): > I agree with you <@U7YCV0V8F> but I figured Voyager can remain BSD-2, and SFE is likely to become fairly rigid after a while. But yeah - I generally prefer BSD-2.

U7T29M3DG (13:18:51): > Reminder: Voyager meeting tomorrow at 9am PST.

UA5GZMWHM (17:57:14): > OK, I have no background in business and I believe in ethics over profit, “utopian” you may call it. I don’t understand why companies using SFE would want to make their software proprietary to begin with. Presumably they use it for research, but besides Matlab, GraphPad Prism, Mathematica, SAS, and SPSS, I don’t see proprietary software that is commonly used in our field. Those 5 mentioned mainly do the basics so they don’t sound like what SFE is used for, and I’ve seen many open source Matlab extensions in spatial transcriptomics. Moreover, image processing in our field is switching from Matlab to Python and pharma companies are switching from SAS to R. The cool part of GPL is that any derivative of the free software will remain free as in freedom, so nobody can take away the users’ freedom to know what’s in the software, to distribute it, and to modify it as they see fit. Why is that important? To avoid mass surveillance, to make sure that monopolistic companies can’t lock our culture in their vaults (Disney+ does that), and to give users control so we don’t get the dystopian scenarios like companies forcing us to buy new devices with planned obsolescence (remember the software deliberately slowing down older iPhones) or remotely censoring books (Amazon did that to 1984 back in 2009). Without free and open source software (FOSS), the cool new developments in our field wouldn’t be possible. > That said, we don’t live in a perfect world, so I wouldn’t say that everyone who sells proprietary software is evil. For example, RStudio/Posit profits from premium proprietary versions of RStudio and related tools like RStudio Connect, so they can pay their staff to write better open source software that may not be rewarded in academia, which is RStudio’s social mission as a B Corp. To be honest, I’m using LibreOffice, a FOSS alternative to Microsoft Office, and find it kind of sucks. I also use DarkTable, a FOSS alternative to Adobe Lightroom, to edit photos. It’s decent, but it often crashes and making an HDR is kind of a pain in the ass compared to Lightroom. I’m using proprietary CleanMyMac because the FOSS alternatives all suck. It could be taking away my freedom because it scans my whole disk, and it carries a surveillance risk because I can’t see the source code and so can’t know if it does surveillance. Probably the FOSS alternatives would be better if the programmers were paid a living wage to write them, and the donations are probably not enough to do that. Well, but then I’ve seen local non-profits raising enough money to have full time staff. But again, I don’t see how those in industry profit from making proprietary software based on SFE. Like Cell Ranger is open source. Nanostring’s various R packages to analyze GeoMX DSP data are also open source. I believe they profit from selling reagent kits and machines rather than software. So I wonder why not using GPL is relevant. Most likely SFE is so specialized that it won’t be used in proprietary software that does the pernicious things just mentioned and we already have an open source culture, so I’m not completely not OK with not using copyleft. But I initially chose GPL to make a moral stance against those pernicious practices and against placing profit over people and planet in general.

U7T29M3DG (18:44:10): > During grad school I actually sat next to Richard Stallman many nights in the CS building, and that’s when I first learned about GPL and started to understand its logic and purpose. I think it’s an admirable approach, to not just advocate for but force open source by copyleft. But over time I grew to prefer free licenses such as MIT and BSD. The reason is that in academia, there is little practical difference for whether software that is being used is GPL or BSD/MIT. But in industry it makes a big difference. GPL, first of all, has nothing to do with free / not free. It only protects open source. But companies can sell GPL code, and sometimes do. However, companies frequently put together codebases from many different places and tools, and as soon as they use GPL code they can no longer use other code unless they GPL their whole company’s codebase. Many (most, all?) companies won’t do this, because they want to keep at least part of their code closed source. So what ends up happening is that they either use GPL and violate the rules, or just avoid using GPL code (very common). This ends up slowing down the dissemination of software, not speeding it up. > > As I’ve said before, in the case of SFE, I don’t really think it matters whether it’s licensed GPL or BSD. If there’s a good reason to GPL it, such as using GPL code inside it, I’m fine with that. But I’d still prefer BSD if possible. > > BTW, Cell Ranger is not open source (only early versions are). But it is free. I’d much rather it cost money, but be open source under BSD... and crucially, carefully documented. That way academic groups could build on it, and in turn 10x could benefit from distributed development. Of course ideally it would be BSD and free.

UA5GZMWHM (20:22:45): > My bad. I found a GitHub repo for Cell Ranger so thought it’s open source. I’m aware that you can sell GPL software since the “free” is about freedom rather than not costing money. I’m also not against, say, having open source software available, but charging users for using the same software on a server provided by the company to cover server cost. I wonder what are non-pernicious reasons why companies would keep some code closed source if they are not profiting from the software anyway, besides that the code is not ready for release. My concern is about the pernicious behaviors of proprietary software that take away our freedom.

U7T29M3DG (20:34:05): > I 100% agree with you that companies in (comp)bio should make software free. I think one of the reasons they sometimes don’t is not so much that they have ulterior motives, but just that they think they can profit from selling the software for $. 30-40 years ago this was true in some areas, e.g. software for molecular dynamics. With very few exceptions, it’s proved not to be true, and so companies are just succumbing to wishful thinking. I do think in some cases software is also closed because companies are hiding details about their data. Obviously this is also not a good reason.

2022-12-14

U023DK2HCM7 (11:58:01) (in thread): > Is this in person or via zoom?

U02EN9EQQ5U (12:00:31) (in thread): > We’ve been meeting by zoom :point_up_2:

U023DK2HCM7 (12:01:29) (in thread): > What’s the link?

U02EN9EQQ5U (12:03:21) (in thread): > Hm, usually someone sends a meeting invite using the Zoom plug in on Slack. Give me a sec and I’ll set up a call

U023DK2HCM7 (12:03:37) (in thread): > Thank you! :)

Unknown User (12:04:16): > [Unsupported block type: call]

U02EN9EQQ5U (12:04:50) (in thread): > Lior beat me to it <@U023DK2HCM7> :slightly_smiling_face:

UA5GZMWHM (12:06:16): > https://lambdamoses.github.io/voyager/dev/index.html

U7T29M3DG (12:07:15): > https://pachterlab.github.io/voyager/

U91L3C2KF (12:23:16): > https://support.10xgenomics.com/spatial-gene-expression/software/pipelines/latest/algorithms/imaging

U7T29M3DG (12:30:45): > https://nanostring.com/blog/spotlight-on-spatial-informatics-overlay-of-structurally[…]n-geomx-dsp-tissue-images-with-spatialomicsoverlay-r-package/ - Attachment (NanoString): Spotlight on Spatial Informatics: Overlay of Structurally Profiled ROIs on Tissue Images with SpatialOmicsOverlay R Package > Spatial biology is changing the way we view biology. Advancements in spatial biology are now enabling us to compartmentalize a tissue in terms of cells, tissue structure, or disease state.

U7T29M3DG (12:37:29): > https://github.com/pachterlab/voyager-testing

U91L3C2KF (12:38:10): > <@U02EN9EQQ5U>what is your github account?

U02EN9EQQ5U (12:41:05) (in thread): > kayla-jackson

U91L3C2KF (12:45:31): > https://github.com/pmelsted/voyagerpy

2023-01-02

U7T29M3DG (08:17:28): > I talked to <@U91L3C2KF> and <@U04CYELHK5H> today about having regular weekly meetings from now on, starting next week, on Mondays at 11am PST. Hope this works for everybody.

2023-01-03

UCG42NA2U (19:57:04): > https://genome.cshlp.org/content/early/2021/05/25/gr.271288.120 - Attachment (genome.cshlp.org): Characterizing spatial gene expression heterogeneity in spatially resolved single-cell transcriptomics data with nonuniform cellular densities > An international, peer-reviewed genome sciences journal featuring outstanding original research that offers novel insights into the biology of all organisms

2023-01-06

UA5GZMWHM (01:44:29): > https://doi.org/10.1016/j.jspi.2010.03.045 - Attachment (sciencedirect.com): The Moran coefficient for non-normal data > This paper summarizes findings that extend statistical distribution properties of the Moran coefficient index measuring spatial autocorrelation to non…

2023-01-09

UA5GZMWHM (14:01:14): > zoom link?

UA5GZMWHM (14:05:15): > https://caltech.zoom.us/j/83677894726

U7T29M3DG (15:07:44): > Hi everyone,

U7T29M3DG (15:08:10): > I’m very very sorry. I put this meeting into the calendar in Iceland in the wrong time zone… I won’t miss any more of these. Apologies.

2023-01-10

UA5GZMWHM (16:33:31): > I’m invited to give an online talk for the Advanced Biomedical Computation series at Harvard on January 30, about SFE and Voyager. I’m pretty sure that someone will ask about a Python version. Is it OK if I mention the Python development by Sindri and Pall?

U01UF27E9P0 (17:01:35): U01UF27E9P0 (17:01:55) (in thread): U01UF27E9P0 (17:22:11): U01UF27E9P0 (17:23:05): U7T29M3DG (17:31:12) (in thread): > I’m personally fine with it but it’s up to <@U04CYELHK5H>, <@U01BMGGNFEF> and <@U91L3C2KF>. Also it might make sense to decide on that closer to the date, when we have a better idea how far along we are/will be.

U91L3C2KF (17:36:17) (in thread): > I’m fine with it, just saying that it’s in the works. We will know more in a couple of weeks

U01UF27E9P0 (19:07:03): UA5GZMWHM (22:32:23): > I moved the website-devel back to the pachterlab repo. The devel website works:https://pachterlab.github.io/voyager/dev/ - Attachment (pachterlab.github.io): From geospatial to spatial omics > SpatialFeatureExperiment (SFE) is a new S4 class for working with spatial single-cell genomics data. The voyager package implements basic > exploratory spatial data analysis (ESDA) methods for SFE. This first version > supports univariate global spatial ESDA methods such as Morans I, > permutation testing for Morans I, and correlograms. The Voyager package also implements > plotting functions to plot SFE data and ESDA results. Multivariate ESDA > and univariate local metrics will be added in later versions.

UA5GZMWHM (22:32:34): > The release website still works. It was not overwritten.

UA5GZMWHM (22:37:14): > So I’m deleting my fork of Voyager under my personal account

2023-01-11

U7T29M3DG (12:20:41): > Thanks<@UA5GZMWHM>

U7T29M3DG (12:21:06): > Can you point me to where the pages are that build into this dev website? I.e., if I want to edit a specific tech landing page or vignette, where do I edit?

U02EN9EQQ5U (12:24:49): > The text for the dev vignettes is under the documentation-devel branch in the vignettes folder. All of the existing landing pages follow the *_landing.Rmd convention

U7T29M3DG (12:27:53): > Thanks!

U01UF27E9P0 (19:02:18):

2023-01-12

U01UF27E9P0 (01:34:50): U01UF27E9P0 (02:02:40):

2023-01-16

UA5GZMWHM (13:57:32): > Today is Martin Luther King Jr. Day. Are we still meeting?

U91L3C2KF (13:59:51): > <@U7T29M3DG>?

UA5GZMWHM (14:02:16): > Anyway, it’s probably not an institute holiday in Iceland. We can still meet if you wish

U7T29M3DG (14:03:01): > Yes

Unknown User (14:03:04): > [Unsupported block type: call]

UA5GZMWHM (14:26:26): > categorical: dittoseq

UA5GZMWHM (14:26:44): > https://bioconductor.org/packages/release/bioc/html/dittoSeq.html - Attachment (Bioconductor): dittoSeq > A universal, user friendly, single-cell and bulk RNA sequencing visualization toolkit that allows highly customizable creation of color blindness friendly, publication-quality figures. dittoSeq accepts both SingleCellExperiment (SCE) and Seurat objects, as well as the import and usage, via conversion to an SCE, of SummarizedExperiment or DGEList bulk data. Visualizations include dimensionality reduction plots, heatmaps, scatterplots, percent composition or expression across groups, and more. Customizations range from size and title adjustments to automatic generation of annotations for heatmaps, overlay of trajectory analysis onto any dimensionality reduciton plot, hidden data overlay upon cursor hovering via ggplotly conversion, and many more. All with simple, discrete inputs. Color blindness friendliness is powered by legend adjustments (enlarged keys), and by allowing the use of shapes or letter-overlay in addition to the carefully selected dittoColors().

U7T29M3DG (14:28:06): > import seaborn as sns > dittoseq_cmap = sns.diverging_palette(250, 15, s=75, l=40, n=256, center="dark")

U7T29M3DG (14:28:18): > You can also use the library colorcet, which has a color palette called diverging.dittoseq

U7T29M3DG (14:28:22): > import colorcet as cc > dittoseq_cmap = cc.cm.diverging.dittoseq

2023-01-19

UA5GZMWHM (17:42:06): > https://link.springer.com/article/10.1007/s10109-006-0034-9 - Attachment (SpringerLink): Hidden negative spatial autocorrelation > Journal of Geographical Systems - Mostly lip service treatments of negative spatial autocorrelation (NSA) appear in the literature, although spatial scientists confront it in practice. NSA was…

UA5GZMWHM (19:08:33): > I wonder what you think. I looked into scater’s log normalization, which is actually implemented in scuttle. Unlike Seurat, it uses log2 rather than natural log, so the differences in log normalized values can be interpreted as log fold changes. Also, while in Seurat, before the log, the raw count is divided by the size factor (colSums(x)/10000), the size factor in scater is (colSums(x)/mean(colSums(x))). In addition, scater has other kinds of transforms like asinh.

UA5GZMWHM (19:09:15): > Here’s where I want your comments: Do you think it’s better to use and then explain how scuttle/scater’s log normalization works and how it’s different from Seurat, or use the Seurat way because Seurat is so popular?
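To make the comparison concrete, here is a minimal sketch of the two normalizations as described in the message above, written in plain Python with no dependencies. The function names are made up for illustration and are not the scuttle or Seurat APIs; the pseudocount of 1 in the log2 version and the genes-by-cells layout are assumptions.

```python
import math

def col_sums(counts):
    """Total counts per cell; `counts` is genes x cells (rows are genes)."""
    n_cells = len(counts[0])
    return [sum(row[j] for row in counts) for j in range(n_cells)]

def lognorm_seurat_style(counts, scale=10000):
    """Natural log of 1 + count / size factor, with size factor = library size / scale."""
    totals = col_sums(counts)
    return [[math.log1p(x / (totals[j] / scale)) for j, x in enumerate(row)]
            for row in counts]

def lognorm_scuttle_style(counts, pseudocount=1):
    """log2 of pseudocount + count / size factor,
    with size factor = library size / mean library size (pseudocount is assumed)."""
    totals = col_sums(counts)
    mean_total = sum(totals) / len(totals)
    size_factors = [t / mean_total for t in totals]
    return [[math.log2(x / size_factors[j] + pseudocount) for j, x in enumerate(row)]
            for row in counts]

# Toy matrix: 3 genes (rows) x 2 cells (columns), library sizes 10 and 30.
toy = [[2, 6],
       [3, 9],
       [5, 15]]
seurat_like = lognorm_seurat_style(toy)
scuttle_like = lognorm_scuttle_style(toy)
```

In the toy matrix both cells have the same composition and differ only in depth, so either normalization gives the two cells identical values; what differs between the two is the log base and how the size factor is scaled.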

U01UF27E9P0 (19:16:53):

2023-01-20

U04CYELHK5H (06:11:19): > Whatever you go for, I’d say it’d be more transparent if you explain the differences, and if there is an analytic reason you chose either one, explain why

U01UF27E9P0 (19:17:50):

2023-01-21

U01UF27E9P0 (19:20:33): U7T29M3DG (19:40:20): > I prefer (colSums(x)/mean(colSums(x)))

U7T29M3DG (19:40:34): > as the normalizing factor for depth. And then log following that.

2023-01-22

U04CYELHK5H (08:04:32): > btw, the links between the vignettes are outdated since they still point to lambdamoses.github.com

U01UF27E9P0 (19:11:53):

2023-01-23

U91L3C2KF (10:28:35): > I’m sick so I won’t be able to join the meeting tonight

U7T29M3DG (11:28:46): > Hope you feel better soon. We’ll post an update after the call.

Unknown User (14:01:20): > [Unsupported block type: call]

U7T29M3DG (14:12:39): > The vignette we are talking about is here:https://pachterlab.github.io/voyager/dev/articles/nonspatial.html - Attachment (pachterlab.github.io): Apply spatial analyses to non-spatial scRNA-seq data > Voyager

U04CYELHK5H (14:24:04): > https://github.com/pmelsted/voyagerpy

U01UF27E9P0 (19:15:37): U7T29M3DG (19:19:17): > Meeting minutes (in brief): > * Lambda implemented the simplified normalization, as did Pétur. Things look like they are matching up. > * Kayla suggested that the seqFish data might illustrate a situation where the same normalization across all cells is problematic. > * There was a minor bug on the Voyager devel website. To be fixed, along with finalizing the vignettes that are being implemented in Python. > * We’ll discuss the Python website next week (Pétur will be away).

2023-01-24

U01UF27E9P0 (19:14:13): UA5GZMWHM (19:21:25): > Heads up: I’ll create a tag and update the release website on January 29 because I’m giving the online talk on January 30.

2023-01-25

U04CYELHK5H (11:22:14): > <@UA5GZMWHM> I have a question about the Visium graph. You said that the weights you used on the edges, for computing Moran’s I, were 1 (or a constant) divided by the number of adjacent edges. I take it that for node i, the weights of the edges of that node sum up to 1 (or some constant). However, if nodes i and j are neighbours, and they don’t have the same degree, then W[i, j] != W[j, i], where W is the weight matrix. Is this the desired behaviour? Would it not make sense for the weight matrix to be symmetric, and to handle the perimeter of the tissue differently?

UA5GZMWHM (11:26:26): > Right, it’s not symmetric. I think row normalization is a way to adjust for edge effect as spots on the edge have fewer neighbors. That said, it’s just the default of spdep. I haven’t benchmarked different types of edge weights in different cases. But the W style, with row normalization, is recommended for Moran plot, while the binary matrix is recommended for Getis-Ord Gi*

U04CYELHK5H (11:29:25): > OK, thank you. Just wanted to make sure we’re on the same page. When you say row normalization, are you talking about normalizing the rows in the assay?

UCG42NA2U (11:39:15): > <@UA5GZMWHM> I wonder whether you have checked the Moran’s I function that MERINGUE implements? https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8494224/ - Attachment (PubMed Central (PMC)): Characterizing spatial gene expression heterogeneity in spatially resolved single-cell transcriptomic data with nonuniform cellular densities > Recent technological advances have enabled spatially resolved measurements of expression profiles for hundreds to thousands of genes in fixed tissues at single-cell resolution. However, scalable computational analysis methods able to take into consideration …

UA5GZMWHM (11:41:28): > No, I mean row normalizing the adjacency matrix
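The asymmetry discussed above is easy to see on a tiny graph. The following is a minimal sketch in plain Python, not spdep code; `row_normalize` is a hypothetical helper and the three-node path graph stands in for a tissue with spots on its perimeter.

```python
def row_normalize(adjacency):
    """Divide each row of a binary adjacency matrix by its row sum (the node's degree)."""
    weights = []
    for row in adjacency:
        degree = sum(row)
        # Isolated nodes (degree 0) keep an all-zero row.
        weights.append([a / degree if degree else 0.0 for a in row])
    return weights

# Path graph 0 - 1 - 2: node 1 is interior (degree 2),
# nodes 0 and 2 sit on the edge (degree 1).
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
W = row_normalize(A)

# The edge between nodes 0 and 1 gets a different weight in each direction.
weight_0_to_1 = W[0][1]
weight_1_to_0 = W[1][0]
```

Every row of `W` sums to 1, so each spot's spatial lag is an average over its neighbors no matter how many it has, at the cost of `W[0][1]` and `W[1][0]` no longer being equal.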

U04CYELHK5H (11:42:07): > Ah I see, that makes more sense now

U7YCV0V8F (12:36:18): > How are probe sets generated for spatial data <@UA5GZMWHM>? This is what I could find on the 10x Genomics website: https://support.10xgenomics.com/spatial-gene-expression-ffpe/probe-sets/overview

UA5GZMWHM (12:37:21): > I don’t know. I think it’s probably 10X proprietary info. For FFPE, they sequence the probes instead of the cDNAs.

UA5GZMWHM (12:38:29): > Oh right, you reminded me. We should probably write a vignette for Visium FFPE, because the QC will be different from frozen section Visium since FFPE Visium datasets don’t have mitochondrial counts.

UA5GZMWHM (12:42:36) (in thread): > I looked at the source code. Meringue also row normalizes the adjacency matrix.
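For reference, a minimal sketch of the textbook Moran's I, I = (n / S0) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2, with S0 the sum of all weights. This is not the spdep or MERINGUE implementation; the function name and the four-node path graph are illustrative assumptions. It shows that the choice between binary and row-normalized weights changes the value on the same data.

```python
def morans_i(values, weights):
    """Global Moran's I for `values` given a dense weight matrix `weights`."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    denom = sum(d * d for d in dev)
    return (n / s0) * cross / denom

# Path graph 0 - 1 - 2 - 3 with a smooth gradient as the feature.
x = [1.0, 2.0, 3.0, 4.0]

binary = [[0, 1, 0, 0],
          [1, 0, 1, 0],
          [0, 1, 0, 1],
          [0, 0, 1, 0]]

# Same graph, each row divided by the node's degree.
row_normalized = [[0.0, 1.0, 0.0, 0.0],
                  [0.5, 0.0, 0.5, 0.0],
                  [0.0, 0.5, 0.0, 0.5],
                  [0.0, 0.0, 1.0, 0.0]]

i_binary = morans_i(x, binary)
i_row_normalized = morans_i(x, row_normalized)
```

Both values are positive for this gradient, but they are not equal, because row normalization gives the two end nodes more weight per edge than the interior nodes.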

U01UF27E9P0 (19:15:26):

2023-01-26

U01UF27E9P0 (19:18:36):

2023-01-27

U01UF27E9P0 (02:20:45): U01UF27E9P0 (19:24:28):

2023-01-28

U01UF27E9P0 (23:44:09):

2023-01-29

U01UF27E9P0 (19:09:20):

2023-01-30

U01UF27E9P0 (00:25:40): U01UF27E9P0 (00:45:55): U01UF27E9P0 (02:46:35): UA5GZMWHM (13:33:16): > Here’s the zoom link to the online talk I’m giving today at 1 pm:https://partners.zoom.us/j/82826415806

U7YCV0V8F (13:35:23): > Could I get access to https://github.com/pmelsted/voyagerpy?

U01UF27E9P0 (13:53:49): Unknown User (14:01:29): > [Unsupported block type: call]

U91L3C2KF (14:07:59) (in thread): > done

U7YCV0V8F (14:09:10) (in thread): > thank you

U01UF27E9P0 (15:38:08): U01UF27E9P0 (21:39:34):

2023-01-31

UA5GZMWHM (01:37:59): > https://escholarship.org/uc/item/6jg661wx - Attachment (escholarship.org): Spatial autocorrelation and red herrings in geographical ecology > Author(s): Diniz, JAF; Bini, L M; Hawkins, Bradford A. | Abstract: Aim Spatial autocorrelation in ecological data can inflate Type I errors in statistical analyses. There has also been a recent claim that spatial autocorrelation generates ‘red herrings’, such that virtually all past analyses are flawed. We consider the origins of this phenomenon, the implications of spatial autocorrelation for macro-scale patterns of species diversity and set out a clarification of the statistical problems generated by its presence. Location To illustrate the issues involved, we analyse the species richness of the birds of western/central Europe, north Africa and the Middle East. Methods Spatial correlograms for richness and five environmental variables were generated using Moran’s I coefficients. Multiple regression, using both ordinary least-squares (OLS) and generalized least squares (GLS) assuming a spatial structure in the residuals, were used to identify the strongest predictors of richness. Autocorrelation analyses of the residuals obtained after stepwise OLS regression were undertaken, and the ranks of variables in the full OLS and GLS models were compared. Results Bird richness is characterized by a quadratic north-south gradient. Spatial correlograms usually had positive autocorrelation up to c. 1600 km. Including the environmental variables successively in the OLS model reduced spatial autocorrelation in the residuals to non-detectable levels, indicating that the variables explained all spatial structure in the data. In principle, if residuals are not autocorrelated then OLS is a special case of GLS. 
However, our comparison between OLS and GLS models including all environmental variables revealed that GLS de-emphasized predictors with strong autocorrelation and long-distance clinal structures, giving more importance to variables acting at smaller geographical scales. Conclusion Although spatial autocorrelation should always be investigated, it does not necessarily generate bias. Rather, it can be a useful tool to investigate mechanisms operating on richness at different spatial scales. Claims that analyses that do not take into account spatial autocorrelation are flawed are without foundation.

U01UF27E9P0 (19:21:51): UA5GZMWHM (20:01:46): > <@U7T29M3DG> You mentioned spatial autocorrelation in 1D for chromatin accessibility. While sf needs to be hacked to apply to 1D, the time series tradition has temporal autocorrelation and may be more relevant.
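As a rough illustration of the time series view, here is a sketch of the usual lag-k sample autocorrelation applied to a 1D signal such as values along a genomic coordinate. The function name and toy signals are assumptions made for the example.

```python
def lag_autocorrelation(signal, lag=1):
    """Sample autocorrelation at the given lag:
    sum_t (x_t - mean)(x_{t+lag} - mean) / sum_t (x_t - mean)^2."""
    n = len(signal)
    mean = sum(signal) / n
    dev = [v - mean for v in signal]
    num = sum(dev[t] * dev[t + lag] for t in range(n - lag))
    den = sum(d * d for d in dev)
    return num / den

# A smooth trend gives positive lag-1 autocorrelation,
# an alternating signal gives negative lag-1 autocorrelation.
trend = [1.0, 2.0, 3.0, 4.0]
alternating = [1.0, -1.0, 1.0, -1.0]

r_trend = lag_autocorrelation(trend, lag=1)
r_alternating = lag_autocorrelation(alternating, lag=1)
```

With binary weights between adjacent positions on a chain of n points, Moran's I equals this lag-1 autocorrelation multiplied by n / (n - 1), so the two traditions measure essentially the same thing in 1D.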

2023-02-01

U01UF27E9P0 (03:05:02): U01UF27E9P0 (03:23:02): U01UF27E9P0 (19:18:43):

2023-02-02

U01UF27E9P0 (19:21:46):

2023-02-03

UA5GZMWHM (01:55:30): > I pushed a small subset of a 10x visium dataset to the voyager-testing repo. The counts and log normalized counts matrices are mtx files. I also included centroid coordinates and spot polygons. Let me know if you have problems with the test dataset.

UA5GZMWHM (01:55:47): > Some food for thought when it comes to spatially variable genes:https://doi.org/10.1080/13658810902832591 - Attachment (Taylor & Francis): Detecting negative spatial autocorrelation in georeferenced random variables > Negative spatial autocorrelation refers to a geographic distribution of values, or a map pattern, in which the neighbors of locations with large values have small values, the neighbors of locations…

U01UF27E9P0 (19:10:58):

2023-02-04

U01UF27E9P0 (01:30:59): U01UF27E9P0 (19:03:31):

2023-02-05

U01UF27E9P0 (19:17:09):

2023-02-06

Unknown User (14:02:30): > [Unsupported block type: call]

UA5GZMWHM (14:11:45): > https://github.com/pachterlab/voyager-testing

U7T29M3DG (14:43:23): > https://www.bioconductor.org/about/release-announcements/

U7YCV0V8F (14:45:22): > * Early March deadline for completing Voyager for the next Bioconductor release > * Set up compatibility tests for Visium, 10xv3 b/w R/Py > * Set up automatic testing framework
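One way the R/Python compatibility tests could work is to export the same intermediate result from both sides and compare entry by entry within a tolerance. A rough sketch under that assumption follows; `matrices_match`, the tolerances, and the toy values are hypothetical and not part of voyager-testing.

```python
import math

def matrices_match(r_matrix, py_matrix, rel_tol=1e-6, abs_tol=1e-9):
    """Return True if two dense matrices (lists of rows) have the same shape
    and agree entry by entry within the given tolerances."""
    if len(r_matrix) != len(py_matrix):
        return False
    for r_row, py_row in zip(r_matrix, py_matrix):
        if len(r_row) != len(py_row):
            return False
        for r_val, py_val in zip(r_row, py_row):
            if not math.isclose(r_val, py_val, rel_tol=rel_tol, abs_tol=abs_tol):
                return False
    return True

# Hypothetical log-normalized values exported from the R and Python runs;
# they differ only in the number of digits written to disk.
from_r = [[0.0, 1.5849625], [2.3219281, 0.0]]
from_py = [[0.0, 1.58496250072], [2.32192809489, 0.0]]
same = matrices_match(from_r, from_py)
```

A tolerance-based check like this avoids false failures from floating point and file-format rounding, while a genuine difference in normalization or weights would still be flagged.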

U7T29M3DG (17:53:08): > Hi<@UA5GZMWHM>,

U7T29M3DG (17:53:33): > Can you post the slides here you ended up presenting last week? (sorry if you already did so and I missed it)

U01UF27E9P0 (19:28:40): UA5GZMWHM (22:59:31) (in thread): > I’m not sure why I need to post it - File (PDF): ABC presentation.pdf

U7T29M3DG (23:16:36) (in thread): > Thanks<@UA5GZMWHM>! Just wanted to have a record of it here and I thought it would be useful for our collaborators to see it.

2023-02-07

U01UF27E9P0 (19:08:27): U01UF27E9P0 (23:38:04) (in thread):

2023-02-08

U01UF27E9P0 (01:07:51): U01UF27E9P0 (10:22:47): U01UF27E9P0 (19:24:31):

2023-02-09

U01UF27E9P0 (19:12:26):

2023-02-10

U01UF27E9P0 (19:13:28):

2023-02-11

U01UF27E9P0 (19:11:31):

2023-02-12

U01UF27E9P0 (19:17:45):

2023-02-13

Unknown User (14:01:51): > [Unsupported block type: call]

U91L3C2KF (14:14:13): > pkgdown .yaml: https://github.com/pachterlab/voyager/blob/main/.github/workflows/pkgdown.yaml

U7T29M3DG (14:14:45): > https://nbviewer.jupyter.org/

UA5GZMWHM (16:55:00): > I have the PDF.https://onlinelibrary.wiley.com/doi/abs/10.1111/j.0016-7363.2006.00679.x

U02EN9EQQ5U (17:32:22) (in thread): > Would you mind sharing the PDF? I’d like to read this.

UA5GZMWHM (17:53:50) (in thread): > I got it from Docuserve. I think sharing it here counts as private study. Not 100% sure if it’s legal but I think it most likely is. - File (PDF): beyond mule kicks.pdf

U02EN9EQQ5U (17:55:25) (in thread): > :+1:

U02EN9EQQ5U (17:55:32) (in thread): > thank you

U01UF27E9P0 (19:14:54): UA5GZMWHM (20:50:38): > https://www.mdpi.com/2571-905X/2/3/27 - Attachment (MDPI): Negative Spatial Autocorrelation: One of the Most Neglected Concepts in Spatial Statistics > Negative spatial autocorrelation is one of the most neglected concepts in quantitative geography, regional science, and spatial statistics/econometrics in general. This paper focuses on and contributes to the literature in terms of the following three reasons why this neglect exists: Existing spatial autocorrelation quantification, the popular form of georeferenced variables studied, and the presence of both hidden negative spatial autocorrelation, and mixtures of positive and negative spatial autocorrelation in georeferenced variables. This paper also presents details and insights by furnishing concrete empirical examples of negative spatial autocorrelation. These examples include: Multi-locational chain store market areas, the shrinking city of Detroit, Dallas-Fort Worth journey-to-work flows, and county crime data. This paper concludes by enumerating a number of future research topics that would help increase the literature profile of negative spatial autocorrelation.

UA5GZMWHM (23:40:47): > The spdep package does not implement multivariate spatial data analysis methods. The spatialreg, adespatial, and GWmodel packages do. Also, for variograms, I would need the gstat package. I want to get your opinion here: Do you think it’s better to add the multivariate spatial packages in the Imports field, meaning they have to be installed in order to install Voyager, or in the Suggests field, meaning they don’t have to be installed when Voyager is installed until the user uses the multivariate functions?
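The same trade-off exists on the Python side. As a loose analogy to the Suggests option, here is a sketch of an optional dependency that is only checked for when the feature that needs it is called. `require_optional` is a hypothetical helper, not something taken from Voyager or voyagerpy.

```python
import importlib
import importlib.util

def require_optional(module_name, feature):
    """Import an optional top-level package on demand, or fail with a message
    that tells the user which package the requested feature needs."""
    if importlib.util.find_spec(module_name) is None:
        raise ImportError(
            f"{feature} needs the optional package '{module_name}'; "
            "install it to use this feature."
        )
    return importlib.import_module(module_name)

# The check runs only when the feature is used, so the core package
# installs and loads without the optional dependency.
json_module = require_optional("json", feature="demo feature")
```

The cost of this style is that a missing package surfaces at call time rather than at install time, which is the same cost the Suggests field carries.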

U04L7R4KCDS (23:41:50): > @U04L7R4KCDS has joined the channel

Unknown User (23:41:55): > [Unsupported block type: section] > > [Unsupported block type: section] > > [Unsupported block type: section] > > [Unsupported block type: context]

2023-02-14

UA5GZMWHM (16:24:50): > Looks interesting, though I haven’t read it yet. I have the PDF from Docuserve.https://link.springer.com/article/10.1007/s43071-022-00031-w - Attachment (SpringerLink): Some useful details about the Moran coefficient, the Geary ratio, and the join count indices of spatial autocorrelation > Journal of Spatial Econometrics - Popular spatial autocorrelation (SA) indices employed in spatial econometrics include the Moran Coefficient (MC), the Geary Ratio, (GR)…

U01UF27E9P0 (19:45:00):

2023-02-15

U01UF27E9P0 (19:28:28): U01UF27E9P0 (21:04:12):

2023-02-16

U01UF27E9P0 (19:04:27):

2023-02-17

U01UF27E9P0 (19:00:31):

2023-02-18

U01UF27E9P0 (19:00:07):

2023-02-19

U01UF27E9P0 (19:14:15):

2023-02-20

U01UF27E9P0 (20:28:33):

2023-02-21

UA5GZMWHM (20:33:46): > A long time ago, I was writing Cosmodrome, which I eventually quit because of my lack of expertise and to be honest interest in image processing and the file format woes. Someone else at UCSC is picking up the slack, though I’m not sure how they deal with overlaps between FOVs and the global coordinates. Their pipeline includes conversion to SpaceTx format though I’m not sure which original formats are supported but it could be a step forward in dealing with file format woes.https://www.biorxiv.org/content/10.1101/2023.02.17.529010v1?ct= - Attachment (bioRxiv): A Unified Pipeline for FISH Spatial Transcriptomics > In recent years, high-throughput spatial transcriptomics has emerged as a powerful tool for investigating the spatial distribution of mRNA expression and the effects it may have on cellular function. There is a lack of standardized tools for analyzing spatial transcriptomics data, leading many groups to write their own in-house tools that are often poorly documented and not generalizable to other datasets. Currently, the only publicly available tools for extracting annotated transcript locations from raw multiplexed fluorescent in situ hybridization (FISH) images are starfish, which is lacking in some key areas, and MERlin, which is restricted to only MERFISH data. To address this, we have expanded and improved the starfish library and used those tools to create PIPEFISH, a semi-automated and generalizable pipeline that performs transcript annotation for FISH-based spatial transcriptomics. PIPEFISH has options for image processing, decoding, and cell segmentation, and calculates quality control metrics on the output to allow the user to assess the pipeline’s performance on their data. 
We used this pipeline to annotate transcript locations from three real datasets from three different common types of FISH image-based experiments: MERFISH, seqFISH, and targeted in situ sequencing (ISS), and verified that the results were high quality using the internal quality metrics of the pipeline and also a comparison to a orthogonal method of measuring RNA expression in a similar tissue sample. We have made PIPEFISH publicly available through Github for anyone interested in analyzing data from FISH-based spatial transcriptomic assays. > > ### Competing Interest Statement > > The authors have declared no competing interest.

U7T29M3DG (21:20:33): > Interesting. Thanks for sharing<@UA5GZMWHM>

U7YCV0V8F (21:24:59): > They miscapitalized GitHub:confused:

2023-02-22

U01UF27E9P0 (19:18:30): U01UF27E9P0 (19:19:59): U01UF27E9P0 (19:59:38):

2023-02-23

U01UF27E9P0 (01:32:42): UA5GZMWHM (19:02:27): > <@U7T29M3DG>That’s the recent Xenium paper I mentioned

UA5GZMWHM (19:02:29): > https://www.biorxiv.org/content/10.1101/2023.02.13.528102v1?ct= - Attachment (bioRxiv): Optimizing Xenium In Situ data utility by quality assessment and best practice analysis workflows > The Xenium In Situ platform is a new spatial transcriptomics product commercialized by 10X Genomics capable of mapping hundreds of transcripts in situ at a subcellular resolution. Given the multitude of commercially available spatial transcriptomics technologies, recommendations in choice of platform and analysis guidelines are increasingly important. Herein, we explore eight preview Xenium datasets of the mouse brain and two of human breast cancer by comparing scalability, resolution, data quality, capacities and limitations with eight other spatially resolved transcriptomics technologies. In addition, we benchmarked the performance of multiple open source computational tools when applied to Xenium datasets in tasks including cell segmentation, segmentation-free analysis, selection of spatially variable genes and domain identification, among others. This study serves as the first independent analysis of the performance of Xenium, and provides best-practices and recommendations for analysis of such datasets. > > ### Competing Interest Statement > > Mats Nilsson is advisor to 10X Genomics. Malte D. Leucken has received talk honorariums from Pfizer and Janssen pharmaceuticals, and has been a contractor for the Chan Zuckerberg Initiative. Fabian J. Theis consults for Immunai Inc., Singularity Bio B.V., CytoReason Ltd, and Omniscope Ltd, and has ownership interest in Dermagnostix GmbH and Cellarity. All other authors declare no competing interests.

2023-02-25

U01UF27E9P0 (19:01:43):

2023-02-26

U01UF27E9P0 (19:16:24):

2023-02-27

U02EN9EQQ5U (13:59:33): > Hi<!channel>I’m going to be a few minutes late to today’s meeting - I’m leaving an appointment that ran a bit late.

U7T29M3DG (14:00:34): > No worries.

Unknown User (14:00:37): > [Unsupported block type: call]

U04CYELHK5H (16:19:41): > I pushed some changes to the README.md in voyager-testing. It covers a proposed directory structure for testing the compatibility of the vignettes/notebooks. I also included a sample Python script reflecting the non-plotting content of the Visium basic notebook

U01UF27E9P0 (19:00:33):

2023-02-28

U7YCV0V8F (03:38:11): > Bookmarking: https://twitter.com/adamgayoso/status/1417989820278730758?s=20 - Attachment (twitter): Attachment > SeuratDisk is IMO the easiest way to move single-cell data objects between R and python. Does anyone have any other favorites that are 1) straightforward to install/use and 2) work as expected?

U01UF27E9P0 (19:09:34):

2023-03-01

U01UF27E9P0 (19:01:18):

2023-03-02

U01UF27E9P0 (01:09:03): U01UF27E9P0 (18:54:33) (in thread): UA5GZMWHM (23:30:17): > I’m thinking about changing the default branch on GitHub to devel. The main branch at present is the same as the RELEASE_3_16 branch synced with Bioconductor git and isn’t really necessary. Meanwhile, I do sometimes push code that doesn’t work yet to devel as a backup before pushing to Bioconductor git.

2023-03-03

U01UF27E9P0 (19:13:30):

2023-03-04

U7T29M3DG (15:52:29): > https://www.nature.com/articles/s41592-023-01801-6 - Attachment (Nature): Designing spatial transcriptomic experiments > Nature Methods - Optimal design of spatial transcriptomic experiments allows statistical evaluation of the impact of various biological and technological features on the discovery of cell phenotypes.

U7T29M3DG (15:53:05): > “Bost et al. provide R software for their method while Baker et al. provide a Python package that implements their framework. This highlights a trend in single-cell and spatial transcriptomics: several cutting-edge methods are being developed across different programming languages, underlining the need for interoperability and standardized data structures.”

U01UF27E9P0 (19:03:01):

2023-03-05

U01UF27E9P0 (19:07:27):

2023-03-06

UA5GZMWHM (01:32:41): > Here’s the Seurat vs. Scanpy PCA report:https://lambdamoses.github.io/thevoyages/posts/2023-03-05-seurat-vs-scanpy-pca/

U7T29M3DG (11:48:43): > Very interesting- thanks<@UA5GZMWHM>. Looking forward to discussing today on the Voyager call.

U7YCV0V8F (12:34:10): > Informative post- thanks for writing it<@UA5GZMWHM>

Unknown User (14:01:38): > [Unsupported block type: call]

U04CYELHK5H (14:18:33): > https://github.com/pachterlab/voyager-testing

U02EN9EQQ5U (14:27:23): > https://pachterlab.github.io/voyager/dev/articles/vig8_codex.html - Attachment (pachterlab.github.io): CODEX exploratory data analysis > Voyager

U7T29M3DG (14:30:40): > https://pmelsted.github.io/voyagerpy/ - Attachment (pmelsted.github.io): From geospatial to spatial omics > VoyagerPy is a Python package which aims to provide the same functionality for the Python community as Voyager does for the R community. We use AnnData as our data structure which should be familiar to some. The package provides plotting functions useful for analysing spatial single-cell genomics data.

U01UF27E9P0 (19:17:22): UA5GZMWHM (20:40:37): > I have also done the comparison with Seurat clipping. The differences in embeddings are sizable, especially with the later PCs.https://lambdamoses.github.io/thevoyages/posts/2023-03-05-seurat-vs-scanpy-pca/

2023-03-07

U01UF27E9P0 (21:26:47): U01UF27E9P0 (21:48:23): U01UF27E9P0 (21:55:26):

2023-03-13

UA5GZMWHM (12:48:15): > We still have almost a month.https://bioconductor.org/developers/release-schedule/

U04CYELHK5H (12:49:20): > I won’t be able to join tonight, but that’s good news

U7T29M3DG (12:58:13): > Yes, that timeline looks great.

U91L3C2KF (13:02:40): > We are meeting at 11am California time.

U04CYELHK5H (13:04:59): > Then I might be able to join you for a bit. The time difference is 7hrs now, right?

U91L3C2KF (13:08:58): > Yes

Unknown User (14:00:26): > [Unsupported block type: call]

U01UF27E9P0 (20:23:26):

2023-03-14

Unknown User (13:00:38): > [Unsupported block type: call]

U7T29M3DG (13:36:08): > https://www.sciencedirect.com/science/article/pii/S0305440320302259 - Attachment (sciencedirect.com): The application of Local Indicators for Categorical Data (LICD) to explore spatial dependence in archaeological spaces > Global and local analyses of spatial autocorrelation are commonplace in spatial archaeology. However, they are exclusively focused on continuous numer…

U01UF27E9P0 (20:17:46): U01UF27E9P0 (20:30:35):

2023-03-15

U023DK2HCM7 (21:51:11): > Hi all, I wrote a GitHub Actions workflow that automatically generates ipynb/Google Colab notebooks from the R vignettes and automatically regenerates the notebooks when a vignette changes (without changing the link pointing to the Colab notebook). > > In short: I am using jupytext to convert the Rmds to ipynbs (thanks to <@U02EN9EQQ5U> for finding jupytext). > > Of note: For jupytext and the resulting Google Colab to work, I had to make some changes to all vignettes (which should not affect their R/html rendering): > 1. Add the following to the yaml header: > > jupyter: > kernelspec: > display_name: R > language: R > name: ir > > 2. Add hidden Google Colab installs at the top (after the yaml header). > > 3. Replace knitr graphics with markdown (and include full link to picture).

If you follow steps 1-3 for a new vignette, its ipynb will automatically be generated and it should run when opened in Colab. Make sure to add any potential additional dependencies in step 2 and if necessary add data downloads to hidden cells. All notebooks will be automatically updated without changing the Colab links when a vignette is added/updated, so everything regarding the Colab notebooks should now be automated without requiring any manual intervention (except for adding the Colab link to the landing page once - already done for existing landing pages, though the links will need to be updated to main when devel is merged).

Finally, I am still working on testing all of the new notebooks + working on an automatic testing step that checks whether the notebooks run without an error (work in progress). This workflow will not check whether the output is correct, just that there is no error when executing the notebook. All of this is in the documentation-devel branch. Please let me know if anything does not work as expected or if I messed something else up in the process of building this. :slightly_smiling_face:
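Step 1 above could be scripted for new vignettes. The following is only a sketch: `add_r_kernelspec` is a hypothetical helper, not part of the workflow described, and the nesting of the kernelspec fields is assumed to follow the standard Jupyter metadata layout, since the indentation did not survive in the message.

```python
# Assumed nesting: kernelspec under jupyter, the three fields under kernelspec.
KERNELSPEC_BLOCK = """jupyter:
  kernelspec:
    display_name: R
    language: R
    name: ir
"""

def add_r_kernelspec(rmd_text):
    """Insert the R kernelspec block at the end of an Rmd's YAML header.
    Returns the text unchanged if a `jupyter:` entry is already present."""
    lines = rmd_text.split("\n")
    if not lines or lines[0].strip() != "---":
        raise ValueError("no YAML header found")
    # Locate the closing '---' of the header.
    for idx in range(1, len(lines)):
        if lines[idx].strip() == "---":
            close = idx
            break
    else:
        raise ValueError("YAML header is not closed")
    if any(line.startswith("jupyter:") for line in lines[1:close]):
        return rmd_text
    block_lines = KERNELSPEC_BLOCK.rstrip("\n").split("\n")
    return "\n".join(lines[:close] + block_lines + lines[close:])

example = "---\ntitle: Demo vignette\n---\n\nSome text.\n"
converted = add_r_kernelspec(example)
```

The function is idempotent, so running it over all vignettes on every change would leave already-prepared headers untouched.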

2023-03-16

UA5GZMWHM (18:44:31): > It’s not super in depth but it confirms my observation that library size is related to histological regions:https://www.biorxiv.org/content/10.1101/2023.03.15.532733v1?ct= - Attachment (bioRxiv): Library size confounds biology in spatial transcriptomics data > Spatial molecular technologies have revolutionised the study of disease microenvironments by providing spatial context to tissue heterogeneity. Recent spatial technologies are increasing the throughput and spatial resolution of measurements, resulting in larger datasets. The added spatial dimension and volume of measurements poses an analytics challenge that has, in the short-term, been addressed by adopting methods designed for the analysis of single-cell RNA-seq data. Though these methods work well in some cases, not all necessarily translate appropriately to spatial technologies. A common assumption is that total sequencing depth, also known as library size, represents technical variation in single-cell RNA-seq technologies, and this is often normalised out during analysis. Through analysis of several different spatial datasets, we noted that this assumption does not necessarily hold in spatial molecular data. To formally assess this, we explore the relationship between library size and independently annotated spatial regions, across 23 samples from 4 different spatial technologies with varying throughput and spatial resolution. We found that library size confounded biology across all technologies, regardless of the tissue being investigated. Statistical modelling of binned total transcripts shows that tissue region is strongly associated with library size across all technologies, even after accounting for cell density of the bins. Through a benchmarking experiment, we show that normalising out library size leads to sub-optimal spatial domain identification using common graph-based clustering algorithms. 
On average, better clustering was achieved when library size effects were not normalised out explicitly, especially with data from the newer sub-cellular localised technologies. Taking these results into consideration, we recommend that spatial data should not be specifically corrected for library size prior to analysis unless strongly motivated. We also emphasise that spatial data are different to single-cell RNA-seq and care should be taken when adopting algorithms designed for single cell data. > > ### Competing Interest Statement > > The authors have declared no competing interest.

U01UF27E9P0 (20:38:09): U023DK2HCM7 (20:55:03): > I finished going through all of the notebooks/vignettes. All notebooks should run now (even after being regenerated when there are changes to the vignette), except for create_sfe_v2.ipynb, which for some reason runs out of RAM in the unpaid Colab version.

U023DK2HCM7 (21:02:10): > Links added to all the landing pages (though the devel website is not being built right now because the Mac binary for the new R version is not out yet)

2023-03-17

U7T29M3DG (18:44:28): > https://www.biorxiv.org/content/10.1101/2023.03.13.532412v1 - Attachment (bioRxiv): Analysis of RNA processing directly from spatial transcriptomics data reveals previously unknown regulation > Technical advances have led to an explosion in the amount of biological data available in recent years, especially in the field of RNA sequencing. Specifically, spatial transcriptomics (ST) datasets, which allow each RNA molecule to be mapped to the 2D location it originated from within a tissue, have become readily available. Due to computational challenges, ST data has rarely been used to study RNA processing such as splicing or differential UTR usage. We apply the ReadZS and the SpliZ, methods developed to analyze RNA process in scRNA-seq data, to analyze spatial localization of RNA processing directly from ST data for the first time. Using Moran’s I metric for spatial autocorrelation, we identify genes with spatially regulated RNA processing in the mouse brain and kidney, re-discovering known spatial regulation in Myl6 and identifying previously-unknown spatial regulation in genes such as Rps24, Gng13, Slc8a1, Gpm6a, Gpx3, ActB, Rps8 , and S100A9 . The rich set of discoveries made here from commonly used reference datasets provides a small taste of what can be learned by applying this technique more broadly to the large quantity of Visium data currently being created. > > ### Competing Interest Statement > > The authors have declared no competing interest.

2023-03-18

UA5GZMWHM (02:15:37): > My whereabouts, thinking out loud about MULTISPATI PCA, haven’t gotten to the faster RSpectra-based implementation yet: https://lambdamoses.github.io/thevoyages/posts/2023-03-17-multispati-part-1/

U01UF27E9P0 (20:23:28):

2023-03-19

U01UF27E9P0 (20:20:37):

2023-03-20

Unknown User (14:00:13): > [Unsupported block type: call]

U04CYELHK5H (14:01:43): > afk - File (JPEG): IMG_3713

U91L3C2KF (14:05:42): > I need to update

U7YCV0V8F (14:10:03): > https://github.com/scverse/scanpy/blob/99697347a067297ba182747dbfe843a8f46a7a22/scanpy/neighbors/__init__.py#L466

U7YCV0V8F (14:16:34): > def _get_indices_distances_from_dense_matrix(D, n_neighbors: int):
>     sample_range = np.arange(D.shape[0])[:, None]
>     indices = np.argpartition(D, n_neighbors - 1, axis=1)[:, :n_neighbors]
>     indices = indices[sample_range, np.argsort(D[sample_range, indices])]
>     distances = D[sample_range, indices]
>     return indices, distances

U7YCV0V8F (14:16:43): > D is symmetric (computed from pairwise_distances)

U7YCV0V8F (14:16:50): > https://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise_distances.html - Attachment (scikit-learn): sklearn.metrics.pairwise_distances > Examples using sklearn.metrics.pairwise_distances: Agglomerative clustering with different metrics Agglomerative clustering with different metrics
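For reference, a minimal runnable sketch of the function quoted above. The toy distance matrix (four points on a line) is made up for illustration and is built with plain NumPy rather than `pairwise_distances`; the leading underscore is dropped from the function name to mark it as a copy, not scanpy's own code.

```python
import numpy as np


def get_indices_distances_from_dense_matrix(D, n_neighbors):
    """For each row of D, return the n_neighbors smallest entries, nearest first."""
    sample_range = np.arange(D.shape[0])[:, None]
    # argpartition moves the n_neighbors smallest entries to the front, unordered
    indices = np.argpartition(D, n_neighbors - 1, axis=1)[:, :n_neighbors]
    # sort only those candidates by distance
    indices = indices[sample_range, np.argsort(D[sample_range, indices])]
    distances = D[sample_range, indices]
    return indices, distances


# Toy data (assumption): four points on a line, symmetric pairwise distances.
points = np.array([0.0, 1.0, 3.0, 7.0])
D = np.abs(points[:, None] - points[None, :])

indices, distances = get_indices_distances_from_dense_matrix(D, n_neighbors=2)
print(indices)
print(distances)
```

Because the diagonal of D is zero, each point comes back as its own first neighbor, so `n_neighbors` here includes the point itself.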

UA5GZMWHM (16:49:11): > They reopened the abstract submission for 24 hours for me. That’s so nice of them!

UA5GZMWHM (16:50:13): > Sorry I didn’t explain it well. This is the package I’m playing around with to plot the H&E image behind Visium spots: https://dieghernan.github.io/tidyterra/index.html - Attachment (dieghernan.github.io): tidyverse Methods and ggplot2 Helpers for terra Objects > Extension of the tidyverse for SpatRaster and SpatVector objects of the terra package. It includes also new geom_ functions that provide a convenient way of visualizing terra objects with ggplot2.

U01UF27E9P0 (20:27:19):

2023-03-21

U01UF27E9P0 (20:18:58):

2023-03-22

UA5GZMWHM (20:21:45): > https://doi.org/10.1101/2023.03.17.533200 - Attachment (bioRxiv): Statistical modeling and analysis of multiplexed imaging data > The rapid development of multiplexed imaging technologies has enabled the spatial cartography of various healthy and tumor tissues. However, the lack of adequate statistical models has hampered the use of multiplexed imaging to efficiently compare tissue composition across sample groups, for instance between healthy and tumor tissue samples. Here, we developed two statistical models that accurately describe the distribution of cell counts observed in a given field of view in an imaging experiment. The parameters of these distributions are directly linked to the field of view size and also to properties of the studied cell type such as cellular density and spatial aggregation. Using these models, we identified statistical tests that have improved statistical power for differential abundance testing of tissue composition compared to the commonly used rank-based test. Our analysis revealed that spatial aggregation is the main determinant of statistical power and that to have sufficient power to detect differences in cell counts when cells are highly aggregated may require sampling of hundreds of fields of view. To overcome this challenge, we provide a new stratified sampling strategy that might significantly reduce the number of required samples. > > ### Competing Interest Statement > > The authors have declared no competing interest.

2023-03-23

U01UF27E9P0 (20:25:30):

2023-03-24

U01UF27E9P0 (20:14:35):

2023-03-26

UA5GZMWHM (18:34:36): > I’m going to change the documentation-devel website build to Bioconductor 3.17, so I can begin to change the vignettes to reflect the devel version of SFE and Voyager and write new vignettes on devel features.

U7T29M3DG (18:52:55): > Sounds good.

UA5GZMWHM (19:17:18): > https://lambdamoses.github.io/thevoyages/posts/2023-03-25-multispati-part-2/

2023-03-27

UA5GZMWHM (14:01:45): > Are we still meeting today?

U04CYELHK5H (14:03:29): > I thought so. Any reason we might not?

Unknown User (14:05:03): > [Unsupported block type: call]

U7T29M3DG (14:06:21): > Sorry. I can’t come on the call today. Apologies

U04CYELHK5H (14:17:43): > https://github.com/pmelsted/voyagerpy/tree/dev

2023-03-29

U01UF27E9P0 (19:01:41):

2023-03-30

U01UF27E9P0 (20:31:15):

2023-03-31

U01UF27E9P0 (20:34:37):

2023-04-01

U01UF27E9P0 (20:16:23):

2023-04-02

U01UF27E9P0 (20:20:39):

2023-04-03

Unknown User (14:01:07): > [Unsupported block type: call]

UA5GZMWHM (14:36:21): > <@U04CYELHK5H> I read the sp.correlogram source code. The nblag function finds the neighbor lists for higher orders of neighbors. As for edge weights, it does not use distance-based edge weights, but only applies different styles of normalization to the adjacency matrix, such as “W” for row-normalized. https://github.com/r-spatial/spdep/blob/82514fb81da0c6c5285bf4219d7979dac34acec4/R/sp.correlogram.R#L35

U04CYELHK5H (14:37:48): > does that mean that all edges have weight 1 and then row-normalized?

UA5GZMWHM (14:38:15): > Yes, though row normalization is one of the styles

UA5GZMWHM (14:38:22): > Distance-based edge weights are a relatively new feature in spdep and apparently they haven’t updated this function for that. Maybe I can open an issue to ask them.

U04CYELHK5H (14:41:35): > OK, thank you. I will look into it. I didn’t get the same result in VoyagerPy as you did in the vignette. Isn’t it computing Moran’s I over the extended neighborhood?

UA5GZMWHM (14:45:45): > Not the extended neighborhood, but the higher order one. nblag creates a list of neighborhood graphs, with one element of the list per lag. You can run the R code in debug mode to see the structure of the objects, with debug(sp.correlogram)
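A rough Python sketch of the idea discussed here, not a port of spdep: it assumes that lag-k neighbors are the nodes whose shortest path on the graph is exactly k steps, and that every edge has weight 1 before row normalization (style “W”). The function names and the toy path graph are hypothetical, and nodes with no neighbors at a given lag are simply skipped.

```python
from collections import deque


def lagged_neighbors(neighbors, max_lag):
    """For each lag k = 1..max_lag, list the nodes exactly k steps from each node."""
    n = len(neighbors)
    lags = [[[] for _ in range(n)] for _ in range(max_lag)]
    for start in range(n):
        dist = {start: 0}
        queue = deque([start])
        while queue:  # breadth-first search, truncated at max_lag
            node = queue.popleft()
            if dist[node] == max_lag:
                continue
            for nb in neighbors[node]:
                if nb not in dist:
                    dist[nb] = dist[node] + 1
                    queue.append(nb)
        for node, d in dist.items():
            if d >= 1:
                lags[d - 1][start].append(node)
    for lag in lags:
        for nbs in lag:
            nbs.sort()
    return lags


def morans_i(x, neighbors):
    """Moran's I with binary edges that are row-normalized (each row sums to 1)."""
    n = len(x)
    mean = sum(x) / n
    z = [v - mean for v in x]
    numerator = 0.0
    s0 = 0.0  # sum of all weights; each non-empty row contributes 1
    for i, nbs in enumerate(neighbors):
        if not nbs:
            continue
        w = 1.0 / len(nbs)
        numerator += sum(w * z[i] * z[j] for j in nbs)
        s0 += 1.0
    return (n / s0) * numerator / sum(v * v for v in z)


# Toy example (assumption): six nodes on a path, values increasing along it.
neighbors = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4]]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

lags = lagged_neighbors(neighbors, max_lag=2)
correlogram = [morans_i(x, lag) for lag in lags]
print(correlogram)
```

Each element of `lags` plays the role of one neighborhood graph in the list, and the correlogram is Moran’s I computed separately on each of them.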

U04CYELHK5H (14:46:54): > thank you! I’ll take a look at it

UA5GZMWHM (15:56:59): > https://github.com/BiocPy

UA5GZMWHM (16:14:32): > I just attended Aaron Lun’s webinar. Another thing here: an interoperable way to write R objects like SCE to disk that is not RDS: https://github.com/ArtifactDB I’m thinking about writing one for SFE, maybe for Bioconductor 3.18 or 3.19.

UA5GZMWHM (16:15:26): > Pall mentioned ultimately writing everything Voyager in C++. This has already been done in scran: https://github.com/LTLA/scran.chan

U01UF27E9P0 (20:22:38):

2023-04-04

U01UF27E9P0 (20:23:04):

2023-04-06

U01UF27E9P0 (20:30:17):

2023-04-07

U01UF27E9P0 (20:33:37):

2023-04-08

U01UF27E9P0 (20:17:16):

2023-04-09

U01UF27E9P0 (20:29:08):

2023-04-10

Unknown User (14:01:52): > [Unsupported block type: call]

U7T29M3DG (14:02:15): > Hi everyone, > > I am double-booked right now… very sorry.

U7T29M3DG (14:02:21): > Can we move the meeting to a bit later today?

U7T29M3DG (14:02:26): > Or else have it without me.

UA5GZMWHM (14:03:04): > OK, I can move it

U02EN9EQQ5U (14:03:06): > I can meet later if it’s not too late for <@U04CYELHK5H> and <@U91L3C2KF>

UA5GZMWHM (14:03:36): > When does it work for you then?

U91L3C2KF (14:03:50): > Later is fine

U04CYELHK5H (14:06:31): > I was just getting out of the pool. When do you want to meet?

U7T29M3DG (14:11:24): > Would 12:30pm PST work?

U7T29M3DG (14:11:32): > (i.e. 1 hour 20 minutes from now)?

U7T29M3DG (14:12:50): > Ok great. See you soon. Thanks very much for being accommodating.

Unknown User (15:30:21): > [Unsupported block type: call]

U7T29M3DG (15:46:29): > https://pmelsted.github.io/voyagerpy/ - Attachment (pmelsted.github.io): From geospatial to spatial omics > VoyagerPy is a Python package which aims to provide the same functionality for the Python community as Voyager does for the R community. We use AnnData as our data structure which should be familiar to some. The package provides plotting functions useful for analysing spatial single-cell genomics data.

U91L3C2KF (15:48:00): > https://github.com/pmelsted/voyagerpy/tree/gh-pages

U023DK2HCM7 (15:49:49): > lauraluebbert

U023DK2HCM7 (15:50:00): > https://github.com/lauraluebbert

U91L3C2KF (15:50:26): > https://github.com/pmelsted/voyagerpy-notebooks

U04CYELHK5H (16:23:11): > Are the vignettes online compatible with the main or devel version of Voyager? I’m running the nonspatial.Rmd from the documentation branch and get an error in .central_plotter: unused argument point_fun. Are the vignettes on the website rendered from the documentation branch or from somewhere else?

UA5GZMWHM (16:25:50): > point_fun is from devel version of scater

U04CYELHK5H (16:26:41): > so the vignettes are using the devel version of Voyager?

U04CYELHK5H (16:27:07): > or am I supposed to be using scater devel?

UA5GZMWHM (16:28:48): > For now just scater devel. But I’ll soon change to Bioconductor devel

U01UF27E9P0 (20:22:39):

2023-04-11

U7YCV0V8F (19:28:11): > https://github.com/scverse/spatialdata

U01UF27E9P0 (20:30:54):

2023-04-12

U01UF27E9P0 (20:27:06):

2023-04-13

U01UF27E9P0 (20:27:47):

2023-04-14

UA5GZMWHM (15:30:37): > Good news: my abstract was accepted by the Bioconductor conference, for a long workshop (90 minutes). Now I’m debating if I’m going to Boston to give the workshop in person because flying is so bad for the environment so I would not fly unless there’s a very compelling reason.

U7YCV0V8F (15:32:39): > I think getting accepted to give a long workshop is a very compelling reason

U91L3C2KF (17:19:38): > You can also carbon offset with most airlines

UA5GZMWHM (17:43:37): > I don’t really believe in carbon offset, because many of those companies might no longer exist before they fulfill their pledge. Also the way they plant trees might not be ecologically sound. I don’t want them to plant monocultures that don’t form viable ecosystems. Also it’s not always the case that more trees = better. For instance, in the American West where the Ponderosa Pine grows, a century of fire suppression led to unnaturally high forest density, with many small trees growing beneath the pines, which made the forest more prone to severe fires that kill the ancient pines because there’s more fuel.

2023-04-15

U01UF27E9P0 (20:16:04):

2023-04-16

U91L3C2KF (06:04:11): > There are other ways, this method stores co2 as rocks https://en.m.wikipedia.org/wiki/Carbfix - Attachment: Carbfix > Carbfix is an Icelandic company that has developed a novel approach (CO2-to-stone) to capturing and storing CO2 by its capture in water and its injection into subsurface basalts. Once in the subsurface, the injected CO2 reacts with the host rock forming stable carbonate minerals, thus providing for the safe, long-term storage of the captured gas. > Approximately 200 tons were injected into subsurface basalts in 2012. Research results published in 2016 showed that 95% of the injected CO2 was solidified into calcite within 2 years, using 25 tonnes of water per tonne of CO2. Since this time this successful carbon capture and storage approach has been upscaled at Hellisheiði and ongoing research is implementing this approach at other sites across Europe.

U01UF27E9P0 (20:17:24):

2023-04-17

Unknown User (14:00:36): > [Unsupported block type: call]

U01UF27E9P0 (20:27:30):

2023-04-18

U01UF27E9P0 (20:31:45):

2023-04-19

U01UF27E9P0 (20:19:53):

2023-04-20

U01UF27E9P0 (20:30:22):

2023-04-21

U01UF27E9P0 (20:15:49):

2023-04-22

U01UF27E9P0 (12:20:34):

2023-04-23

U7YCV0V8F (19:25:46): > As we work on the voyager preprint we should cite this - File (PNG): image.png

U7YCV0V8F (19:34:22): > I didn’t know this but when scanpy does DE they mask out the cells with zero expression when computing the number of elements for the t-test (n)

U7YCV0V8F (19:34:23): > ns_other = np.count_nonzero(self.groups_masks[self.ireference])

U7YCV0V8F (19:35:24): > https://github.com/scverse/scanpy/blob/08be4e9a09a0deb9ca662dc8dbb9160521bb3382/scanpy/tools/_rank_genes_groups.py#L211
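How to read that line depends on what `groups_masks` holds, which is worth checking in the linked source. As a small sketch with made-up toy arrays (not scanpy's data structures), this shows what `np.count_nonzero` returns for a boolean group-membership mask versus for the expression values themselves:

```python
import numpy as np

# Hypothetical toy data: six cells in two groups, one gene's expression per cell.
groups = np.array(["a", "a", "b", "b", "b", "b"])
expression = np.array([0.0, 2.0, 0.0, 0.0, 5.0, 1.0])

# Boolean membership mask for the reference group "b".
reference_mask = groups == "b"

# On a boolean mask, count_nonzero counts the True entries, i.e. the group size.
n_cells_in_group = np.count_nonzero(reference_mask)

# On expression values, it counts the cells with non-zero expression.
n_expressing = np.count_nonzero(expression[reference_mask])

print(n_cells_in_group, n_expressing)
```

The two counts differ whenever the group contains zero-expression cells, so which array is passed in determines the n used in the test.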

2023-04-24

U91L3C2KF (04:15:01): > How much is slight?

U01UF27E9P0 (05:23:15): U04CYELHK5H (13:13:01): > Meeting in the usual hour?

Unknown User (14:01:38): > [Unsupported block type: call]

U02EN9EQQ5U (14:02:07): > I’m so sorry - I’ve double booked this meeting time and won’t be able to make it today.

UA5GZMWHM (14:17:38): > Here’s the Google Doc for the preprint: https://docs.google.com/document/d/1eZkaMjAEv5OoRRfb4q_9nLSECjnwAjJRrSpoWgpAgXc/edit

UA5GZMWHM (14:20:39): > Super last minute so just to cover the basics for PR purposes in the first version on April 26. I’ll write the draft of the Voyager paper before May 10 when I have to submit the draft of my thesis.

UA5GZMWHM (14:24:14): > <@U7T29M3DG>Any suggestions? I think the preprint should say: > 1. Outline of functionalities of SFE and Voyager > 2. Compatibility tests > 3. Justifications for default parameters > 4. The website — building it on GitHub Actions and the Colab notebooks

U01UF27E9P0 (14:38:37): UA5GZMWHM (18:21:23): > I think I’m going to Boston to give the Voyager workshop in person, because I want to meet the outstanding Harvard and Broad spatial people. I’ll donate $300 to my Green Fondo fundraiser because this is a conservative rough estimate of the equity weighted social cost of the carbon emission from this trip from UCSC: https://sustainability.ucsc.edu/initiatives/social-cost-carbon.html. I was also told that Bioconductor’s travel assistance is last resort for those without funding. So how much can the lab pay for the trip?

2023-04-25

U01UF27E9P0 (04:03:43): U01UF27E9P0 (04:06:57): U01UF27E9P0 (04:14:02): U01UF27E9P0 (04:28:16): U01UF27E9P0 (15:02:52): U01UF27E9P0 (15:26:02): U023DK2HCM7 (15:32:11): > Hi all, apologies for missing the meeting yesterday. I’ll take a look at the doc; let me know if there’s anything else I can help with.

U01UF27E9P0 (15:59:56): U04CYELHK5H (18:05:51): > Did you guys update the visium_10.Rmd and nonspatial.Rmd to reflect our options for computing HVGs? I.e. using lowess=FALSE in modelGeneVar? For nonspatial.Rmd, you need to call it explicitly like in the visium vignette

U01UF27E9P0 (19:46:52): U01UF27E9P0 (20:07:58): UA5GZMWHM (20:12:52) (in thread): > Yes I did

U01UF27E9P0 (20:34:06): U01UF27E9P0 (22:30:25):

2023-04-26

U01UF27E9P0 (00:05:42): U01UF27E9P0 (01:33:50): U01UF27E9P0 (01:48:27): U01UF27E9P0 (01:57:07): U01UF27E9P0 (08:25:26): U01UF27E9P0 (08:26:53): U01UF27E9P0 (08:54:48): U01UF27E9P0 (08:56:17): U01UF27E9P0 (11:49:52): U01UF27E9P0 (18:05:27): UA5GZMWHM (19:25:03): > The new website built successfully. I have added landing pages for spatial analysis methods and new vignettes for bivariate and multivariate methods. There’re some glitches like I forgot to add that Colab package installation code chunk, but it’s mostly good to go:https://pachterlab.github.io/voyager/ - Attachment (pachterlab.github.io): From geospatial to spatial omics > SpatialFeatureExperiment (SFE) is a new S4 class for working with > spatial single-cell genomics data. The voyager package implements basic > exploratory spatial data analysis (ESDA) methods for SFE. Univariate methods > include univariate global spatial ESDA methods such as Morans I, > permutation testing for Morans I, and correlograms. Bivariate methods > include Lees L and cross variogram. Multivariate methods include MULTISPATI > PCA and multivariate local Gearys C recently developed by Anselin. The > Voyager package also implements plotting functions to plot SFE data and ESDA > results.

U01UF27E9P0 (23:20:48):

2023-04-27

UA5GZMWHM (17:52:06): > I have announced the release on Mastodon. Here is a summary of the updates since the last release, many of which I haven’t gotten a chance to talk about in our weekly meetings: https://lambdamoses.github.io/thevoyages/posts/2023-04-26-voyager-1-2-0/

2023-04-29

U01UF27E9P0 (18:10:38):

2023-04-30

U01UF27E9P0 (03:50:05): U01UF27E9P0 (17:54:35):

2023-05-01

Unknown User (14:00:59): > [Unsupported block type: call]

UA5GZMWHM (14:03:06): > Google Doc for Voyager paper: https://docs.google.com/document/d/1eZkaMjAEv5OoRRfb4q_9nLSECjnwAjJRrSpoWgpAgXc/edit?usp=share_link - File (Google Docs): Voyager preprint

U7T29M3DG (14:08:01): > https://pmelsted.github.io/voyagerpy/ - Attachment (pmelsted.github.io): From geospatial to spatial omics > VoyagerPy is a Python package which aims to provide the same functionality for the Python community as Voyager does for the R community. We use AnnData as our data structure which should be familiar to some. The package provides plotting functions useful for analysing spatial single-cell genomics data.

U04CYELHK5H (14:26:33): > I’m sorry I totally forgot the meeting today because it’s holiday

U04CYELHK5H (14:27:00): > What did you think about the vpy site?

U023DK2HCM7 (14:28:15): > Google Colab notebooks to test run: > <@UA5GZMWHM>: Visium (all 5 notebooks) > <@U02EN9EQQ5U>: Slide-Seq (3) + CosMX (2) > Laura: seqFISH (1) + CODEX (2) + Chromium (1) + Merfish (2) + Xenium (2) > > If any of them error out, please let me know the name of the notebook and the error. :slightly_smiling_face:

U023DK2HCM7 (14:32:40): > You can skip the create_sfe notebook

U04CYELHK5H (14:34:20): > Also, voyagerpy is available on PyPI, but v0.1.0 didn’t have all the requirements listed (and maybe some other bug, haven’t had the time to test it thoroughly). I was wondering if we should release v0.1.1 straightaway?

U7T29M3DG (14:43:10): > https://www.oscar-system.org/ - Attachment (OSCAR Computer Algebra System): Home > OSCAR, an Open Source Computer Algebra Research System written in Julia

U7YCV0V8F (14:46:21): > <@U04CYELHK5H> <@U91L3C2KF> can you add the .ipynb notebooks to the repo?

U02EN9EQQ5U (14:47:17): > https://github.com/pmelsted/voyagerpy-notebooks

U7T29M3DG (14:55:29): > Summary of the meeting today:

U7T29M3DG (14:57:38): > * <@UA5GZMWHM> is going to focus on the preprint in the next few days, writing the introduction, making Figures 1/2 etc. We had a discussion on the overall main points to make, structure etc., emphasizing the Python - R concordance. > * <@U7YCV0V8F>, <@U02EN9EQQ5U> and <@U7T29M3DG> will work on adding a bunch of non-spatial vignettes for different technologies, including notebooks for the pre-processing > * <@U023DK2HCM7> is going to move the Python notebooks over to the <@U91L3C2KF> VoyagerPy repo (with your permission <@U91L3C2KF>) and work on automated generation via Actions and also Colab construction. > * Addition of some Methods to the Methods Tab in the Voyager site (including Concordex)

U7T29M3DG (14:57:55): > Feel free to add/edit

U7YCV0V8F (14:58:12): > Here is the GitHub action that can be copied to autogenerate markdown/html from the ipynb notebooks: https://github.com/pachterlab/kallistobustools/blob/master/.github/workflows/build_site.yml

U04CYELHK5H (15:12:23) (in thread): > <@U7T29M3DG> I think I have added them to https://github.com/pmelsted/voyagerpy-notebooks. Due to my preliminary stackoverflow search, there are other notebooks there as well. Do you think it should be on a separate (or main) branch of the voyagerpy repo <@U91L3C2KF>? If they were a part of the voyagerpy repo, it’d make sense to me to use the latest version on PyPI, what do you think?

U01UF27E9P0 (17:42:23): U01UF27E9P0 (20:23:14): U023DK2HCM7 (21:40:30): > All of the R Colab notebooks that I assigned to myself run except create_sfe_v2 which runs out of RAM on Colab for reasons Lambda and I did not understand

UA5GZMWHM (22:30:39): > I think it’s where the data frame read from the csv file is converted into a matrix

2023-05-02

U01UF27E9P0 (18:15:55):

2023-05-03

UA5GZMWHM (04:05:01): > <@U04CYELHK5H> I’m currently making Figure 2 of the preprint, to show that the R and Python versions give consistent results. I’m trying to make a Visium spot graph to make some ESDA plots, and I saw the function find_visium_graph, which returns a NetworkX graph. But it’s unclear to me how to add that graph to the adata. The 10X Chromium vignette used scanpy to find the knn graph, so it’s not clear how to add the Visium graph. Please help, thanks!

U01UF27E9P0 (04:11:22): UA5GZMWHM (21:01:05): > <@U04CYELHK5H> In addition to the vignettes, is Moran’s I also covered by the compatibility tests?

2023-05-04

U01UF27E9P0 (03:17:49): U04CYELHK5H (07:52:52) (in thread): > <@UA5GZMWHM> this function was written a long time ago when the focus wasn’t on the two vignettes we’ve now implemented (visium_10x and nonspatial). I will need to look into it more closely before I can give you a definitive answer

U04CYELHK5H (07:53:23) (in thread): > we’re working on it right now

U01UF27E9P0 (17:56:22): UA5GZMWHM (17:57:57) (in thread): > OK, thanks!

UA5GZMWHM (18:11:09) (in thread): > This also requires you to work on find_visium_graph, right?

UA5GZMWHM (21:40:11): > I find myself needing supplementary figures, so I created a new document: https://docs.google.com/document/d/1xY91Ya5Xfg6KjqHqIqb8Y5CZL8YBvv7wOZ4rrESFToU/edit?usp=sharing - File (Google Docs): Supplementary tables and figures

U01UF27E9P0 (23:32:35):

2023-05-05

UA5GZMWHM (04:06:45): > <!channel> Updates about the preprint: I’m mostly done writing the main text, including listing the defaults in the vignettes and explaining the reasons. I’m still waiting for updates on the Moran’s I compatibility tests and on adding the Visium graph to the AnnData object in the Python implementation, so I have made only half of Figure 2, with the PCA comparisons between Seurat and Scanpy and between the R and Python implementations of Voyager. I think it would be nice to have the Moran’s I comparison, because Voyager is for ESDA. I’m sure that I need to abridge some parts and move some parts to Methods or Supplementary Information for the version submitted to Nature Methods. I also still need to write the “ESDA case study” section, for which I’ll just put some highlights from the vignettes for now. But for the final version I want to use a published dataset and show what I found with ESDA that the original authors did not.

U01UF27E9P0 (17:55:37):

2023-05-06

U01UF27E9P0 (17:58:15): U01UF27E9P0 (19:07:01):

2023-05-07

U01UF27E9P0 (04:31:08): U01UF27E9P0 (17:48:16):

2023-05-08

Unknown User (14:01:10): > [Unsupported block type: call]

UA5GZMWHM (14:34:15): > https://bioconductor.org/packages/release/bioc/manuals/BiocNeighbors/man/BiocNeighbors.pdf

UA5GZMWHM (14:34:53): > KmknnIndex

UA5GZMWHM (14:35:15): > https://bioconductor.org/packages/devel/bioc/vignettes/BiocNeighbors/inst/doc/exact.html

U04CYELHK5H (14:54:00): > https://scanpy.readthedocs.io/en/stable/generated/scanpy.pp.neighbors.html

U91L3C2KF (15:00:30): > https://github.com/scverse/scanpy/blob/f03c5b407412de447480733a1a1a0e33e0c871d2/scanpy/neighbors/__init__.py#L800

U01UF27E9P0 (17:59:12): UA5GZMWHM (19:08:05): - File (PNG): image.png

UA5GZMWHM (19:15:46): > The dashed line is around 1.5e-8, which is sqrt(.Machine$double.eps) in R
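For anyone checking this from the Python side, a small sketch of the same quantity. `sys.float_info.epsilon` is the double-precision machine epsilon, the counterpart of R's `.Machine$double.eps`; the `close_enough` helper is hypothetical and only illustrates using that value as a relative tolerance, which may not be how the figure was made.

```python
import math
import sys

# Machine epsilon for IEEE 754 doubles (2**-52).
eps = sys.float_info.epsilon

# Its square root is 2**-26, roughly 1.5e-8.
tolerance = math.sqrt(eps)


def close_enough(a, b, tol=tolerance):
    """Hypothetical helper: True if the relative difference is below tol."""
    scale = max(abs(a), abs(b))
    if scale == 0.0:
        return True
    return abs(a - b) / scale < tol


print(tolerance)
print(close_enough(1.0, 1.0 + 1e-10), close_enough(1.0, 1.0 + 1e-6))
```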

U7T29M3DG (19:18:50): > Wow

U7YCV0V8F (19:23:24): > That’s an impressive figure

UA5GZMWHM (23:02:24): > Heads up: spdep and esda have different defaults for local Moran. The results match when I set mlvar = FALSE in spdep. There’s a variance term in the denominator of the local Moran expression. The spdep default is TRUE, which divides by the maximum likelihood variance (divide by n), while PySAL’s esda package divides by n-1. mlvar = FALSE makes spdep divide by n-1 as well. I noted this in the paper, but I’m not sure if I’m going to set a default in Voyager, which always follows spdep defaults.
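To make the difference concrete, here is a plain Python sketch of local Moran's I with the two variance conventions. It is neither spdep's nor esda's code: the function is hypothetical, the `mlvar` argument only mirrors the name of the spdep option, and row-normalized binary weights on a toy path graph are assumed.

```python
def local_moran(x, neighbors, mlvar=True):
    """Local Moran's I_i = z_i * (mean of neighbors' z) / variance."""
    n = len(x)
    mean = sum(x) / n
    z = [v - mean for v in x]
    sum_sq = sum(v * v for v in z)
    # mlvar=True: maximum likelihood variance (divide by n); otherwise n - 1
    variance = sum_sq / n if mlvar else sum_sq / (n - 1)
    result = []
    for i, nbs in enumerate(neighbors):
        spatial_lag = sum(z[j] for j in nbs) / len(nbs) if nbs else 0.0
        result.append(z[i] * spatial_lag / variance)
    return result


# Toy data (assumption): five nodes on a path.
x = [1.0, 2.0, 4.0, 7.0, 11.0]
neighbors = [[1], [0, 2], [1, 3], [2, 4], [3]]

ml = local_moran(x, neighbors, mlvar=True)
unbiased = local_moran(x, neighbors, mlvar=False)
print(ml)
print(unbiased)
```

Under these assumptions the two versions differ only by the constant factor (n - 1) / n, so the spatial pattern is identical and only the scale changes.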

2023-05-09

U01UF27E9P0 (17:53:33):

2023-05-10

U01UF27E9P0 (18:03:45):

2023-05-11

U7YCV0V8F (15:09:59): > To make the voyagerpy docs building a little easier I moved the two notebooks 10x_visium.ipynb and nonspatial.ipynb to an examples folder in the voyagerpy repo. I plan on checking out the gh-pages branch and syncing the notebooks directly from main and then using gh-actions to build the notebooks into HTML

U04CYELHK5H (15:11:50): > So you’re keeping the ipynbs on main?

U7YCV0V8F (15:15:12): > Yea in an examples folder. The reason i think this is a nice way to do things is that we envision making more google colab notebooks that directly load the voyagerpy package and so having the ipynb files as near to the code that they run makes sense. Additionally if users have issues that are applicable to that actual use of the tool I imagine requiring them to submit a google colab notebook replicating the example as an issue. As we solve more issues we can simply add those user-generated ipynb notebooks to the voyagerpy repo

U04CYELHK5H (15:17:25): > Alright, makes sense

U01UF27E9P0 (18:12:06): U7YCV0V8F (19:56:17): > I think it would be useful to make a similar diagram for the Python object that is built on AnnData - File (PNG): image.png

U7YCV0V8F (19:56:21): > Is that something the folks in Iceland could help with?

U7YCV0V8F (19:56:51): > You would make a drop-in replacement for the SFE object representation

U7YCV0V8F (19:57:00): > but with python annotations

U7YCV0V8F (19:57:14): > ie in python the rows are cells and the columns are genes

U7YCV0V8F (19:57:22): > etc

U7YCV0V8F (19:58:17): > actually wait i just realized something- is there a notion of an SFE object that sits on disk? or does each python/R language have its own on-disk representation?

U7YCV0V8F (19:58:21): > ideally they would be the same

UA5GZMWHM (20:11:34): > Or are you going to use SpatialData whose preprint recently came out? I haven’t read it carefully so I’m not sure how helpful it is.

UA5GZMWHM (20:13:19): > At present, the on disk part is different between the languages. The gene count matrix can already be on disk with DelayedArray for R. When you read a h5 file from 10X, then on disk operations are used by default. However, the geometries and local results are in memory at present. There’s such a thing as DelayedDataFrame in Bioconductor so I can use it for dimension reductions and local results. For the geometries, I can use sedona, which allows for on disk geometric operations, though there aren’t as many options as when the geometries are in memory.

UA5GZMWHM (20:14:54): > I may work on that for the next release in October. I’m also going to work on a language agnostic on disk serialization with Genentech alabaster; it already exists for SingleCellExperiment:https://github.com/ArtifactDB/alabaster.sce

UA5GZMWHM (20:16:22): > Alabaster is only on Bioconductor 3.18 (devel) right now

U7YCV0V8F (20:32:41): > The python representation has already been developed by The Icelanders

U7YCV0V8F (20:32:48): > I think an on-disk representation is important

U7YCV0V8F (20:37:35): > on a separate note <@U91L3C2KF> can you help me find the actions file that builds the voyager website

U7YCV0V8F (20:37:56): > unless im doing something dumb I cant seem to find it here https://github.com/pmelsted/voyagerpy/tree/gh-pages

U7T29M3DG (20:46:52): > The websites in VoyagerPy have not been built by GitHub Actions (yet)

U7YCV0V8F (20:47:12): > ohhhhh

U7YCV0V8F (20:47:12): > ok

2023-05-12

U01UF27E9P0 (17:49:57):

2023-05-13

U01UF27E9P0 (18:01:47):

2023-05-14

U01UF27E9P0 (18:04:55):

2023-05-15

U7YCV0V8F (12:18:50): > <@U91L3C2KF> can you set up a personal access token for the voyagerpy repo

U7YCV0V8F (12:18:54): - File (PNG): image.png

U7YCV0V8F (12:18:57): > https://github.com/marketplace/actions/repo-file-sync-action

U7YCV0V8F (12:20:25): > Once that is set up it will sync the ipynb notebooks from the main branch into the gh-pages branch.

U7YCV0V8F (12:20:56): > and I have put together a github action that will build the gh-pages from the ipynb directly using the convert_notebook.sh and nb_strip.py scripts

U7YCV0V8F (12:26:39): > Alternatively <@U91L3C2KF> i think you can elevate my privileges so that i can create the token

Unknown User (14:01:52): > [Unsupported block type: call]

UA5GZMWHM (14:21:05): > https://www.biorxiv.org/content/10.1101/2023.05.05.539647v1?ct= - Attachment (bioRxiv): SpatialData: an open and universal data framework for spatial omics > Spatially resolved omics technologies are transforming our understanding of biological tissues. However, handling uni- and multi-modal spatial omics datasets remains a challenge owing to large volumes of data, heterogeneous data types and the lack of unified spatially-aware data structures. Here, we introduce SpatialData, a framework that establishes a unified and extensible multi-platform file-format, lazy representation of larger-than-memory data, transformations, and alignment to common coordinate systems. SpatialData facilitates spatial annotations and cross-modal aggregation and analysis, the utility of which is illustrated via multiple vignettes, including integrative analysis on a multi-modal Xenium and Visium breast cancer study. > > ### Competing Interest Statement > > J.M. holds equity in Glencoe Software which builds products based on OME-NGFF. F.J.T. consults for Immunai Inc., Singularity Bio B.V., CytoReason Ltd, Cellarity, and Omniscope Ltd, and has ownership interest in Dermagnostix GmbH and Cellarity.

U7YCV0V8F (15:47:14): > Ok i pretty much have everything set for the website (minus some css stuff)

U7YCV0V8F (15:47:31): > <@U04CYELHK5H> could you add notebooks with pictures to the voyager-notebooks repo so i can test what happens with the images?

U7YCV0V8F (15:50:45): > More specifically could you add the visium_10x.ipynb and nonspatial.ipynb notebooks with images embedded in them to the voyager-notebooks repo?

U7YCV0V8F (16:01:19): > To update everyone: > 1. when a push is made to the voyagerpy github repo, a github action is triggered that > > 1. synchronizes the ipynb notebooks that are in the examples folder within the main branch into the examples folder in the gh-pages branch. > 2. Then, a github actions job is run that > > 1. converts all of the notebooks using the convert_notebooks.sh and nb_strip.py to html compatible files as was previously done > 2. deploys the website > Still to be done at some point: automate adding the notebooks (and links to them) to the table on the website. Also TODO: make sure the CSS and header match on the generated html files (from the notebooks)

U04CYELHK5H (17:49:16): > We can execute the notebooks in the convert_notebooks.sh script by adding the --execute flag to the jupyter nbconvert command. I’m testing it locally right now.

U04CYELHK5H (17:54:06): > Do we want to have the images embedded or as separate files in a subdirectory?

U01UF27E9P0 (17:55:34): U01UF27E9P0 (19:19:21): U01UF27E9P0 (19:20:38) (in thread): U01UF27E9P0 (20:43:20): U023DK2HCM7 (20:47:49): > <@U04CYELHK5H>I fixed gget and yanked the version with the requirement that was causing problems, so you should be able to install it without specifying a version now:slightly_smiling_face:

U01UF27E9P0 (20:55:28): U01UF27E9P0 (21:03:18): U01UF27E9P0 (21:25:00): U01UF27E9P0 (21:51:14) (in thread): U01UF27E9P0 (21:54:30):

2023-05-16

U01UF27E9P0 (18:07:06):

2023-05-17

UA5GZMWHM (00:27:07): > <@U7YCV0V8F><@U02EN9EQQ5U>Kind of last minute: May I talk a little about concordex in tomorrow’s presentation at Cedars-Sinai?

U7YCV0V8F (08:01:04): > I’m ok with that

U7T29M3DG (08:39:14): > :+1:

UA5GZMWHM (10:03:51): > Thanks!

U7T29M3DG (13:09:38): > <@UA5GZMWHM>’s review made the 10x blog: https://www.10xgenomics.com/blog/spatially-resolved-transcriptomics-an-introductory-overview-of-spatial-gene-expression-profiling-methods - Attachment (10x Genomics): Spatially resolved transcriptomics: An introductory overview of spatial gene expression profiling methods - 10x Genomics > Why is profiling the spatial location of biological components essential? How is spatial profiling achieved? This blog answers those questions, with a focus on methods that spatially resolve mRNA targets or the transcriptome.

U01UF27E9P0 (18:05:51):

2023-05-18

U01UF27E9P0 (17:51:29):

2023-05-19

U01UF27E9P0 (01:29:34): U01UF27E9P0 (18:10:48):

2023-05-20

U01UF27E9P0 (17:49:45):

2023-05-21

U01UF27E9P0 (17:49:24):

2023-05-22

U7T29M3DG (10:33:55): > Hi everyone, > > Sorry for the late notice but can we move the meeting to 12pm today (from 11am PST). I have to attend a consortium call at 11.

U04CYELHK5H (12:21:07): > No problem!

U023DK2HCM7 (13:28:45) (in thread): > I might be a bit late

U7YCV0V8F (15:03:02): > Zoom?

Unknown User (15:03:04): > [Unsupported block type: call]

U7T29M3DG (15:10:14): > https://caltech.zoom.us/j/88306445328 passcode: ussvoyager

U01UF27E9P0 (17:57:29):

2023-05-23

U01UF27E9P0 (17:54:00):

2023-05-24

U01UF27E9P0 (18:01:56):

2023-05-25

U01UF27E9P0 (17:58:09):

2023-05-26

U01UF27E9P0 (18:05:38):

2023-05-27

UA5GZMWHM (00:54:14): > Are we still going to meet on Memorial Day?

U023DK2HCM7 (14:11:39): > Anima Anandkumar named her new package Voyager as well: https://twitter.com/deep__ai/status/1662507426992779264?s=46&t=1ZZOf9kQsZAlnUIYgUDEoQ - Attachment (Twitter): DeepAI on Twitter > :star-struck:Lowkey Goated When Open-Ended Embodiment Is The Vibe:star-struck: Check out the new paper by @AnimaAnandkumar and team - Voyager: An Open-Ended Embodied Agent with Large Language Models. #ComputerScience https://t.co/A6NaV9ejP2

U01UF27E9P0 (17:58:15):

2023-05-28

U01UF27E9P0 (18:01:03):

2023-05-29

UA5GZMWHM (01:56:38): > Are we still meeting on Memorial Day? If not then I’m going on a big trip to Westside. If we are meeting then I’m going on the trip on Tuesday instead.

U7T29M3DG (02:31:45): > Since it’s an official Caltech holiday tomorrow we should move the meeting.

U7T29M3DG (02:32:20): > I’m tentatively moving it to Thursday.

U7T29M3DG (02:32:31): > In the meantime let’s touch base via Slack updates on Tuesday.

U7T29M3DG (02:32:39): > Go ahead and do your bike ride tomorrow<@UA5GZMWHM>

U7T29M3DG (13:08:44): > Sorry, I just realized I’m at UCLA on Thursday.

U7T29M3DG (13:08:54): > Let’s just postpone to next Monday

UA5GZMWHM (13:32:37): > I kind of feel like crap today because I mess up my sleep schedule all the time. I suppose I’ll try to finish editing my thesis today and go on the trip tomorrow, when the sun comes out in the afternoon in Westside. My updates are just that I wrote a supplementary note on MULTISPATI PCA, from a suggestion after my defense. Writing it actually helped me to better understand MULTISPATI PCA.

U01UF27E9P0 (17:47:52):

2023-05-30

U01UF27E9P0 (18:02:28):

2023-05-31

U7YCV0V8F (15:44:14): > Some feedback for voyagerpy > 1. add a vp.read_anndata method to make reading in pre-made anndata objects easier > 2. vp.plt.plot_barcode_data breaks if a column has NaN values in it > 3. the QC notebooks ought to not use scanpy if voyagerpy is intended to deprecate it > 4. The following code crashed my runtime (used a bunch of RAM) > > > adata.X = adata.X.astype('float64') > adata.layers['counts'] = adata.X.copy() > > # or, equivalently: > vp.utils.log_norm_counts(adata, inplace=True) > > adata.layers['logcounts'] = adata.X.copy() >

U01UF27E9P0 (17:52:52): U04CYELHK5H (18:02:56): > Thanks <@U7YCV0V8F> > 1. Good call. > 2. good to know, will add a fix > 3. We prefer having Scanpy only in the notebooks, since we will eventually phase it out. We aren’t using any Scanpy functions internally. Iirc we only use it for pca, neighbours, and leiden. The KmKnn is next up for implementation. > 4. Do you know which line was too heavy? My guess is the log_norm_counts but can you check again? Are the matrices sparse and/or huge? I don’t remember whether the lognormalized counts are sparse, but they definitely should be so this doesn’t happen
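
A minimal sketch of how log-normalization can stay sparse, which is the property asked for in point 4. This is not voyagerpy's log_norm_counts: the function name log_norm_counts_sparse, the library-size size factors, and the log2(x + 1) transform are all assumptions made for illustration.

```python
import numpy as np
import scipy.sparse as sp


def log_norm_counts_sparse(counts, base=2.0):
    """Library-size normalize and log-transform without densifying.

    Hypothetical helper, not the voyagerpy implementation.
    counts: (cells x genes) matrix, dense or sparse.
    """
    counts = sp.csr_matrix(counts, dtype=np.float64)
    # Size factor per cell: library size relative to the mean library size.
    lib_sizes = np.asarray(counts.sum(axis=1)).ravel()
    size_factors = lib_sizes / lib_sizes.mean()
    size_factors[size_factors == 0] = 1.0  # leave empty cells untouched
    # Row scaling via a sparse diagonal matrix keeps zeros as zeros.
    scaled = sp.diags(1.0 / size_factors) @ counts
    # log1p maps 0 to 0, so only the stored non-zeros are transformed.
    return (scaled.log1p() / np.log(base)).tocsr()


counts = sp.csr_matrix(np.array([[0, 2, 0, 4],
                                 [1, 0, 0, 0],
                                 [0, 0, 3, 3]]))
logcounts = log_norm_counts_sparse(counts)
print(type(logcounts).__name__, logcounts.nnz)
```

Because log(0 + 1) = 0, the result has the same sparsity pattern as the counts, so memory use scales with the number of non-zero entries rather than with cells x genes.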

2023-06-01

U04CYELHK5H (11:30:56): > <@UA5GZMWHM> <@U7YCV0V8F> and others, how do you think we should handle NaN and inf values when computing Moran’s I, local Moran, losh, etc.? Should we remove the node in the graph or evaluate it as zero?

U01UF27E9P0 (17:49:10): UA5GZMWHM (18:32:22): > In spdep, there’s the na.action argument indicating what to do with NA’s. The default is to throw an error. You can also set it to drop the NA’s, or convert NA’s to 0. However, not all spdep functions have this argument, and my own implementations of MULTISPATI PCA and Lee’s L don’t have an argument to deal with NA’s, because I haven’t come across them in spatial transcriptomics data.

2023-06-02

U04CYELHK5H (09:37:16): > Ok, I was just wondering since Sina encountered an error when plotting columns which contain NaNs. Maybe it is enough to just filter out NaNs and infs for basic plotting

U7YCV0V8F (16:13:21): > Yea that should be OK- as long as a warning is reported to the user
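
A minimal sketch of the "remove the node and warn" option discussed above, applied to global Moran's I. The function name morans_i_drop_nonfinite and the dense weights matrix are assumptions chosen to keep the example short; this is neither spdep's na.action nor voyagerpy code.

```python
import warnings

import numpy as np


def morans_i_drop_nonfinite(x, w):
    """Global Moran's I after removing nodes whose value is NaN or +-inf.

    Hypothetical helper. x: 1-D values, one per node.
    w: dense (n, n) spatial weights matrix.
    """
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    keep = np.isfinite(x)
    n_dropped = int((~keep).sum())
    if n_dropped:
        warnings.warn(
            f"Removed {n_dropped} node(s) with non-finite values",
            RuntimeWarning,
        )
    x = x[keep]
    w = w[np.ix_(keep, keep)]  # drop the matching rows and columns
    z = x - x.mean()
    # I = (n / S0) * (z' W z) / (z' z), with S0 the sum of all weights
    return len(x) / w.sum() * (z @ w @ z) / (z @ z)


# Chain graph 0-1-2-3-4 where the last node has a missing value.
w = np.zeros((5, 5))
for i in range(4):
    w[i, i + 1] = w[i + 1, i] = 1.0
x = [1.0, 2.0, 3.0, 4.0, np.nan]
print(morans_i_drop_nonfinite(x, w))
```

Dropping a node also removes its edges, so its neighbours lose a link; treating the value as zero instead would keep the graph intact but pull the statistic toward whatever zero means for that variable.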

U7YCV0V8F (16:14:09): > on a separate note- is it possible to simplify the requirements.txt to just the absolute minimum (for packages that are explicitly imported?)

U7YCV0V8F (16:14:32): > The reason i ask is im running into issues installing the package on google colab due to conflicting requirements

U7YCV0V8F (16:15:24): > For example,

U7YCV0V8F (16:16:12): > Im not super familiar with python builds but can any of these package requirements be loosened? > > anndata = ">=0.8" > esda = "^2.4.3" > geopandas = "^0.12.0" > libpysal = "^4.7.0" > matplotlib = "~3.6" > networkx = ">=3.0" > numpy = "^1.24.3" > opencv-python = "^4.7.0.72" > pandas = "^1.3" > scikit-learn = ">=1.2" > scipy = ">=1.9" > shapely = ">=1.7" > statsmodels = ">=0.13" >

U7YCV0V8F (16:20:10): > One good test to make sure the package is easy to install is to fire up a google colab test notebook https://colab.research.google.com/notebooks/empty.ipynb and run the following code > > !pip install --quiet git+https://github.com/pmelsted/voyagerpy > > which gives the following error > > ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts. > numba 0.56.4 requires numpy<1.24,>=1.18, but you have numpy 1.24.3 which is incompatible. > tensorflow 2.12.0 requires numpy<1.24,>=1.22, but you have numpy 1.24.3 which is incompatible. >

U7YCV0V8F (16:20:38): > Once thats fixed I can finish putting together the rest of the QC notebooks for the voyager website

U04CYELHK5H (17:05:36): > I’m not requiring numba, but it seems like tensorflow (pre-installed on the colab servers) and numba are both restricting the versions of numpy

U04CYELHK5H (17:06:17): > Do you think we should put an upper bound for numpy?

U7YCV0V8F (17:07:24): > is it even necessary to specify a version for numpy at this point?

U7YCV0V8F (17:07:38): > oh i guess there is an exclusion range

U7YCV0V8F (17:07:43): > maybe just specify the same range?

U04CYELHK5H (17:12:04): > Sure, that shouldn’t be an issue. But we’re not using requirements.txt, we’re using pyproject.toml and using poetry for building and publishing

U04CYELHK5H (17:13:12): > Also, I fixed the nan issue on the dev branch, and I’m finishing log_norm_counts so it doesn’t cast the sparse matrix into dense

U7YCV0V8F (17:14:26): > Im not sure how it works with pyproject.toml- maybe just remove the specific version?

U7YCV0V8F (17:14:35): > that way it defaults to whatever is on the computer

U04CYELHK5H (17:15:50): - File (PNG): IMG_5245

U04CYELHK5H (17:18:15): > I see now that I set it to ^1.24.3, meaning it will use the newest compatible version 1.X, not reaching into higher major versions. For it to be compatible with tensorflow, we should change it to ^1.22
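
A small sketch of the caret arithmetic described above, assuming Poetry's documented caret rule (the upper bound bumps the leftmost non-zero component). The helpers caret_range and overlaps are hypothetical and written only for this illustration; they are not part of Poetry or pip.

```python
def caret_range(spec):
    """Expand a caret constraint such as '^1.24.3' into (lower, upper).

    Both bounds are (major, minor, patch) tuples; lower is inclusive,
    upper is exclusive. Hypothetical helper, not Poetry's own parser.
    """
    parts = [int(p) for p in spec.lstrip("^").split(".")]
    lower = tuple(parts + [0] * (3 - len(parts)))
    # Find the leftmost non-zero component; if all are zero, use the last given.
    for i, p in enumerate(parts):
        if p != 0:
            break
    else:
        i = len(parts) - 1
    upper = lower[:i] + (lower[i] + 1,) + (0,) * (2 - i)
    return lower, upper


def overlaps(a, b):
    """True if two half-open version ranges share at least one version."""
    return max(a[0], b[0]) < min(a[1], b[1])


# numpy<1.24,>=1.22 as reported for tensorflow 2.12.0 in the pip error above
tensorflow_numpy = ((1, 22, 0), (1, 24, 0))

print(caret_range("^1.24.3"))                              # lower and upper bound
print(overlaps(caret_range("^1.24.3"), tensorflow_numpy))  # no shared version
print(overlaps(caret_range("^1.22"), tensorflow_numpy))    # shared versions exist
```

With ^1.22 the allowed range still extends to the newest 1.x release, but it now includes versions that tensorflow accepts, so a resolver has something valid to pick.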

U7YCV0V8F (17:30:03): > Awesome! Let me know when the requirements have been updated/can install on google colab so I can finish the voyagerpy QC notebooks for these 8 different single-cell assays

U01UF27E9P0 (18:09:13):

2023-06-03

U01UF27E9P0 (17:49:47): UA5GZMWHM (18:30:44): > I wonder where did you get the NaN’s though. I guess there’s probably a division by 0 somewhere. In ggplot2 NA’s and Inf’s are removed and you get a warning about it.

2023-06-04

U01UF27E9P0 (17:55:21):

2023-06-05

U7T29M3DG (14:02:07): > This is the Zoom link for today: https://caltech.zoom.us/j/85299352410

U7YCV0V8F (14:07:59): > I suggest modifying the QC metrics to be consistent with this function: https://github.com/sbooeshaghi/mx/blob/8a71f674de639e2c702fe7999d099161f59c7b54/mx/mx_inspect.py#L128-L150

U04CYELHK5H (14:22:33): > https://github.com/pmelsted/voyagerpy/pull/17 - Attachment: #17 :arrows_counterclockwise: synced file(s) with pmelsted/voyagerpy > synced local file(s) with pmelsted/voyagerpy. > > Changed files > > • synced local directory examples/ with remote directory examples/ > > * * * > > This PR was created automatically by the repo-file-sync-action workflow run #5180377421

U01UF27E9P0 (18:14:49):

2023-06-06

U023DK2HCM7 (00:35:47): > Just a thought while reading the manuscript: Currently, it is not common practice to write such a detailed explanation on every aspect of a new software (though it should be!). It might be worth mentioning something about this in the abstract or introduction to prepare the reader for what they are about to read.

U01UF27E9P0 (17:59:09):

2023-06-07

U7YCV0V8F (14:26:23): > The results section of the manuscript currently focuses on describing voyager from the R package perspective- I think the results could be improved by instead describing voyager as methods that are language agnostic (with references to the Python and R implementations when appropriate)

U7YCV0V8F (14:28:17): > This would help make the manuscript more succinct- the text (in my opinion) is overly detailed

U7YCV0V8F (14:30:36): > I think the main ideas (that could guide a slight restructuring and minification of the manuscript) are > 1. SFE data object (with python and R implementations for storing relevant info) > 2. univariate/bivariate/multivariate spatial statistic methods > 3. Compatibility tests between R and python > 4. Extensive documentation & Tutorials

U7YCV0V8F (14:33:01): > My personal opinion is that the text is way too long to say those 4 main points

U7YCV0V8F (14:36:54): > I suggest moving a lot of the material to a Supplementary Note.

U023DK2HCM7 (14:44:23): > If we are going to restructure the text, I would like to suggest the following: > Biologists reading this are first and foremost going to be interested in WHAT Voyager will allow them to do. Once they are sold on that, then they will want to know HOW. I think the case studies should be moved to the beginning of the results section and the figures containing many panels could be split into several (maybe one per technology).

U023DK2HCM7 (14:48:59): > I think that section could be extended while other sections are a little wordy right now as Sina mentioned

U01UF27E9P0 (18:08:00):

2023-06-12

U7T29M3DG (10:48:06): > Hi everyone, > > I apologize but I’ll have to miss the Voyager meeting today.

Unknown User (14:02:23): > [Unsupported block type: call]

UA5GZMWHM (20:19:07) (in thread): > The problem is that at present, some of the methods referred to in the paper are only implemented in the R version, such as the variogram and the bivariate and multivariate methods. I’m not sure how to refer to these methods in a language agnostic way because it would be misleading to imply that they are also implemented in the Python version.

UA5GZMWHM (20:24:33) (in thread): > But I can remove the spdep and gstat references in the Results section; those seem more appropriate in the Methods section.

2023-06-13

U01UF27E9P0 (17:28:36): U01UF27E9P0 (20:44:00):

2023-06-14

U01UF27E9P0 (15:53:15): U01UF27E9P0 (19:19:40):

2023-06-15

U01UF27E9P0 (19:28:47):

2023-06-16

U01UF27E9P0 (18:59:26):

2023-06-17

U01UF27E9P0 (18:47:44):

2023-06-18

UA5GZMWHM (12:55:01): > Since my parents are here and I’m busy entertaining them in their limited time here, how about we move the meeting to Thursday?

U01UF27E9P0 (18:46:04):

2023-06-19

U04CYELHK5H (10:45:31): > I cant make it today, either

U01UF27E9P0 (18:54:33):

2023-06-20

U02EN9EQQ5U (12:20:38) (in thread): > I’m going to be a TA for the Intro to Python bootcamp all this week and the course generally runs from 8a-5p. Please set the meeting at your own convenience and I’ll step out if I can

U01UF27E9P0 (18:52:07):

2023-06-21

U01UF27E9P0 (19:04:19):

2023-06-22

U023DK2HCM7 (14:20:44): > Are we meeting today?

UA5GZMWHM (15:57:52): > Ah I forgot to ask for a time. I can do today before 4:30 pm, or tomorrow afternoon.

U023DK2HCM7 (16:19:56) (in thread): > Either works for me

UA5GZMWHM (16:27:48) (in thread): > Oh right, it would be too late for the Icelandic people. How about do tomorrow at noon?

U023DK2HCM7 (16:37:10) (in thread): > 12:30 would be better for me if that’s not too late

UA5GZMWHM (17:36:09) (in thread): > OK, we can do 12:30 if the Icelandic people don’t find it too late.

U01UF27E9P0 (18:23:43):

2023-06-23

UA5GZMWHM (02:23:09) (in thread): > Actually I don’t really have updates. I’ve just been busy entertaining my parents and updating the museum database.

U023DK2HCM7 (02:34:42) (in thread): > Either works for me, I don’t have any updates either

Unknown User (15:33:30): > [Unsupported block type: call]

UA5GZMWHM (15:38:43): > If nobody joins then I’ll end the meeting

U02EN9EQQ5U (15:39:22) (in thread): > Sorry - still helping out with the python bootcamp today and tomorrow.

U023DK2HCM7 (15:49:03) (in thread): > Sorry I was under the impression that we wouldn’t meet since there were no updates

U01UF27E9P0 (19:05:14):

2023-06-24

U01UF27E9P0 (18:43:16):

2023-06-25

U01UF27E9P0 (18:38:23):

2023-06-26

Unknown User (14:00:14): > [Unsupported block type: call]

2023-07-02

UA5GZMWHM (23:58:33): > Are we still going to meet tomorrow as Lior is in Iceland for his father in law’s funeral? Also it’s the Independence Day holiday in the US. I can still meet if you wish as I try to submit the Voyager paper ASAP.

2023-07-03

U7T29M3DG (01:43:38): > Let’s skip meeting today.

2023-07-05

UA5GZMWHM (02:40:32): > Updates on the manuscript: I suppose it’s done-ish, waiting for Lior’s final word before submitting. The ESDA case studies figure was split into two, one spatial and one non-spatial, to make the panels larger and easier to read; this did not significantly affect the text. I also added two supplementary figures on clustering with MULTISPATI PCs with top positive eigenvalues. I abridged the abstract to 150 words as required by Nature Methods. The main text has 4200 something words excluding abstract, figure legends, and methods. Ideally it should be 3000 but up to 5000 words are allowed with editorial discretion, so I don’t think I can add content. What remains to do before submitting: > 1. Determine author order. Specifically, I don’t know where to place Sindri. A tentative order (no offense intended) is: Lambda, Petur, Kayla, Laura, Sina, Sindri, Lior, and Pall. > 2. Write a cover letter. > 3. Some reformatting

U01UF27E9P0 (03:31:48): UA5GZMWHM (17:22:08) (in thread): > Also plan B if it’s rejected by Nature Methods: maybe Nature Biotechnology, Nature Genetics, Nature Communications, Genome Biology (where Giotto is published), Nucleic Acids Research, or somewhere else?

2023-07-10

Unknown User (14:00:38): > [Unsupported block type: call]

U01UF27E9P0 (18:07:50):

2023-07-11

U01UF27E9P0 (18:05:37):

2023-07-12

U01UF27E9P0 (18:07:05):

2023-07-13

U01UF27E9P0 (18:06:08): U01UF27E9P0 (21:38:55):

2023-07-14

U01UF27E9P0 (00:07:28): U01UF27E9P0 (18:11:37):

2023-07-15

U01UF27E9P0 (18:08:46):

2023-07-16

U01UF27E9P0 (15:04:48): U01UF27E9P0 (18:11:05):

2023-07-17

Unknown User (14:01:00): > [Unsupported block type: call]

UA5GZMWHM (14:07:31): > <!channel>Reminder Voyager crew meeting

U91L3C2KF (14:07:53): > I’m on vacation so I won’t be attending

U02EN9EQQ5U (14:08:18) (in thread): > Hi<@UA5GZMWHM>, I’m also on vacation until Wednesday.

UA5GZMWHM (14:10:21): > My updates: Just waiting for Lior to make more comments on the manuscript before submitting it. We plan to submit it to Nature Biotechnology, and if it’s rejected, then try Nature Methods. I have been updating the Museum database (super duper behind) and analyzing a UCI collaborator’s Visium dataset. That prompted me to think more about dealing with multiple biological replicates and case vs. control and will lead to new features in SFE.

UA5GZMWHM (14:10:48): > OK, have a nice trip, Kayla and Pall!

U01UF27E9P0 (18:09:59): U01UF27E9P0 (20:09:52):

2023-07-18

U7T29M3DG (02:04:07) (in thread): > I think we should try Nature Biotechnology first. Then Nature Methods. I agree with the follow on journals.

U7T29M3DG (02:04:22) (in thread): > Also, as I’m working through the manuscript I’ve left some comments on places where references need to be added.

U01UF27E9P0 (18:09:21):

2023-07-19

U7T29M3DG (08:58:16): > I’m done going through the paper.<@UA5GZMWHM>: can you just look at the handful of comments left?

U04CYELHK5H (09:39:03): > I just published the documentation for VoyagerPy to https://voyagerpy.readthedocs.io/

U04CYELHK5H (09:46:43): > I think we can unpublish the github.io page, since readthedocs can execute and render the notebooks. Also, I feel like it is easier to maintain because > a) It doesn’t use jekyll (I don’t really fancy it) > b) expects the documentation to lie next to the code in a docs/ directory, which allows versioning of the documentation > c) Doesn’t require the pull-request extravaganza to sync the code with the website

U7T29M3DG (11:06:38): > Two things: > 1. It’s very important to have the preprocessing notebooks <@U7YCV0V8F> has built on the website (whichever it is). > 2. I’m not a fan of the ads that are displayed on the current readthedocs page

U7T29M3DG (11:06:55): - File (PNG): image.png

U04CYELHK5H (11:22:01): > Ahh, yes I forgot about those. I think they have to be there since it is a free service. I think there is a way to have the same stuff hosted on github

U02EN9EQQ5U (11:58:02): > I added the preprocessing notebooks to the documentation branch of the Voyager repo. They should be available with the next build of the website

UA5GZMWHM (12:58:36): > I need to fix some typos and broken references on the documentation website

U01UF27E9P0 (14:19:02): U01UF27E9P0 (17:39:59): U7T29M3DG (19:17:23): > PS please also look at the references I suggested in the Supplement<@UA5GZMWHM>

U01UF27E9P0 (20:19:23):

2023-07-20

UA5GZMWHM (03:15:54): > Here’s the cover letter: https://docs.google.com/document/d/1j-FnHM2O5MMVg69O2yfwhI1WZgIx3G5Z/edit - File (Word Document): caltech_lh_standard_template_div_R4BfoDV.docx

U01UF27E9P0 (03:25:48): U01UF27E9P0 (03:57:07) (in thread): U01UF27E9P0 (06:31:23): U7T29M3DG (09:07:02): > Thanks for drafting the letter<@UA5GZMWHM>

U7T29M3DG (09:07:40): > I’ve made some edits, and think it’s good to go. Let’s touch base today and finalize any remaining items before submitting. Any last minute suggestions / edits from anyone else? cc<!channel>

UA5GZMWHM (17:29:30): > So far it looks good. If you don’t have any last minute suggestions, then I’m going to submit it.

UA5GZMWHM (17:58:07): > I’ll post it on bioRxiv first.<@U01BMGGNFEF>What’s your ORCID? If you don’t have one, please open an account on ORCID. Nature Biotechnology encourages all authors to link to ORCID and corresponding authors are required to have it.

U01BMGGNFEF (18:05:23) (in thread): > 0009-0005-0779-6923

UA5GZMWHM (18:05:33) (in thread): > Thank you!

U01UF27E9P0 (18:08:50): UA5GZMWHM (18:27:41): > Uploaded to bioRxiv

UA5GZMWHM (19:02:46): > Submitted. For some reason, bioRxiv only emailed me and Pall, while I did put Lior as a corresponding author. Nature Biotechnology only lets me put one corresponding author, and that is Lior.

U023DK2HCM7 (19:12:07): > Congrats!!

2023-07-21

U01UF27E9P0 (05:18:10): U7T29M3DG (13:56:56): > One thing: it’s fairly urgent that we > a) settle on the Python page- your choice Pétur but it needs to be Github or readthedocs fully decided (my preference is the github) > b) make sure the preprocessing notebooks are up on there (they are now on the R page- thanks<@U02EN9EQQ5U>!)

U04CYELHK5H (14:07:50): > Sure, we can do github.io. I was thinking that we should have the gh-pages pull the documentation from the main branch on push, and skip the branch sync action. If the example notebooks live in the docs/examples folder, it is easy to execute them and render them

UA5GZMWHM (16:48:18): > <@U91L3C2KF>I forgot to add the funding to the Acknowledgement in the paper. Which grant are you using for the project?

U91L3C2KF (17:33:25): > Same one as we used for the busz paper

UA5GZMWHM (17:38:15): > Thanks!

U01UF27E9P0 (18:02:53): U01UF27E9P0 (20:30:09):

2023-07-22

U01UF27E9P0 (18:03:21):

2023-07-23

U01UF27E9P0 (18:09:37): UA5GZMWHM (23:50:34): > Unless you really have important updates, no meeting on July 24, since I don’t have any updates except for the paper submission and the Bioconductor workshop. Maybe no meetings until we get reviews back, except when we implement new features.

2023-07-24

U01UF27E9P0 (18:05:23): UA5GZMWHM (19:58:47): > <@U04CYELHK5H>Can you put Sina’s preprocessing notebooks on the readthedocs website?

UA5GZMWHM (23:39:11): > https://www.mdpi.com/2227-7390/9/19/2465/htm - Attachment (MDPI): Geary’s c and Spectral Graph Theory > Spatial autocorrelation, of which Geary’s c has traditionally been a popular measure, is fundamental to spatial science. This paper provides a new perspective on Geary’s c. We discuss this using concepts from spectral graph theory/linear algebraic graph theory. More precisely, we provide three types of representations for it: (a) graph Laplacian representation, (b) graph Fourier transform representation, and (c) Pearson’s correlation coefficient representation. Subsequently, we illustrate that the spatial autocorrelation measured by Geary’s c is positive (resp. negative) if spatially smoother (resp. less smooth) graph Laplacian eigenvectors are dominant. Finally, based on our analysis, we provide a recommendation for applied studies.

2023-07-25

U01UF27E9P0 (00:04:46): UA5GZMWHM (14:39:33) (in thread): > They’re there now. Thanks!

U04CYELHK5H (14:42:54) (in thread): > They’re not being executed in the build process but I can try and make it work

UA5GZMWHM (14:44:14) (in thread): > I saw them on the github.io page, which is built with sphinx. The output can be seen. That’s built with readthedocs, right?

UA5GZMWHM (14:44:37) (in thread): > Also, can you add Sina as an author on the website?

U04CYELHK5H (14:45:39) (in thread): > No, readthedocs was used to build and serve the site. Now this is done using github actions, and using sphinx-docs therein for generating the html

UA5GZMWHM (14:46:18) (in thread): > I see. I’m a bit confused.

U04CYELHK5H (14:46:31) (in thread): > And yes, I will add Sina as an author, after I put my son to bed

UA5GZMWHM (14:46:41) (in thread): > OK, good night!

UA5GZMWHM (14:47:15) (in thread): > BTW, I’ve been a night owl since I was a kid:joy::owl:

U04CYELHK5H (14:47:36) (in thread): > Basically, readthedocs is not doing anything right now. The website is built using sphinx (invoked by actions), and served via github pages

UA5GZMWHM (14:48:38) (in thread): > I see. The paper got rejected by Nature Biotechnology, so I can edit it again, to improve it, before I submit it to Nature Methods. I need to change the URL to the Python website.

U04CYELHK5H (14:49:26) (in thread): > So the final url for vpy is the github.io

U01UF27E9P0 (18:05:18):

2023-07-27

U7T29M3DG (15:01:03): > The Colab notebook for preprocessing 10x is broken (link doesn’t work): https://pachterlab.github.io/voyager/articles/chromium_landing.html - Attachment (pachterlab.github.io): 10X Chromium Single Cell 3’ v3 Processing Workflows with Voyager > Voyager

U02EN9EQQ5U (15:14:32) (in thread): > Just fixed this for the next build

U7T29M3DG (15:59:14): > Hi everyone, > > The Voyager paper was desk rejected by Nature Biotech. with a suggestion to transfer it to Genome Biology. I think we should fix several aspects of the website / paper etc. before submitting elsewhere (we can discuss where later) and Lambda is already on it.

U7T29M3DG (15:59:26) (in thread): > Thx!

UA5GZMWHM (17:46:42): > I’m aiming for Nature Methods, where the squidpy paper is published. I think there’s some hope because I don’t find our paper worse than the squidpy paper. I may say in the cover letter our paper is a sequel to that paper expanding on what it aims to do in a cool way. VoyagerPy does not aim to supplant squidpy, but expands the ecosystem. Nor does Voyager R aim to supplant other methods like BayesSpace. Here’s a summary of the edits to the paper: > 1. I put more emphasis on the “what you CAN do” in Introduction. > 2. I changed the URL to the VoyagerPy website, which now is the github.io site. > 3. In Discussion, I also noted that ESDA may help improve model based spatial analyses for specific tasks by giving a better understanding of gene expression in space and revealing neglected aspects of spatial phenomena. This is akin to discovery of overdispersion leading to widespread adoption of the negative binomial model. > 4. I’m ~~mutilating~~ editing a note written by Lior and Nicolas Bray on why Seurat and scanpy are both wrong in log fold change. I’m adding a section on scran, on how its log fold change works. tl;dr, for Mann Whitney U test, which is implemented in VoyagerPy according to scran’s implementation, the AUC, which is easily calculated from the U statistic, is used as effect size instead of log fold change. Log fold change is calculated for t test, and it’s pretty much the log fold change in geometric means of counts divided by size factor + pseudocount in the two groups. Aaron Lun’s coding style isn’t easy to read but is elegant to write, so it took me a while to figure that out. My coding style is influenced by his as I read and in some cases copied source code of SingleCellExperiment and scater when writing SFE and Voyager. > To do before I leave for Boston on August 1: > 1. Remake the log fold change Seurat vs. scanpy plot and make plot for Voyager AUC. > 2. Make tables comparing functionalities of Voyager, squidpy, Seurat v4 and above, Giotto, and STUtility (now superseded by semla), their underlying data structures, and their spatial neighborhood graph methods. The squidpy paper has a similar table in the supplement. > 3. Abridge the paper a little, because it got longer after the edits. > 4. I already added Nicolas Bray as an author in the manuscript. Lior has notified him. We still need to figure out where he should be in the author list, his ORCID, and affiliation. > 5. I just remembered as I’m writing here: Probably I should briefly mention OME-Zarr and SpatialData in the Discussion as ways to potentially better work with the images. That said, I was a reviewer for the paper about the samui browser in Biological Imaging from the Stephanie Hicks lab (I’m going to meet her in Boston next week). The paper just got accepted today. They chose cloud optimized GeoTIFF instead of Zarr as they find the former easier to query on the cloud. So I think it’s too early to say if OME-Zarr is the future of multiplexed imaging data format that so troubled Kayla and I over a year ago, so I’m not 100% sure about mentioning it.
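
A minimal sketch of the relation mentioned in point 4 of the edits: the AUC effect size is the Mann-Whitney U statistic divided by the number of cross-group pairs, AUC = U / (n1 * n2). The function mann_whitney_auc and the toy values are made up for illustration; this is neither scran's C++ code nor the VoyagerPy implementation.

```python
def mann_whitney_auc(group, rest):
    """AUC effect size from the Mann-Whitney U statistic.

    U counts the pairs where the value in `group` exceeds the value in
    `rest`, with ties counted as one half. Hypothetical helper; the
    pairwise loop is O(n1 * n2), whereas real implementations use ranks.
    """
    u = 0.0
    for g in group:
        for r in rest:
            if g > r:
                u += 1.0
            elif g == r:
                u += 0.5
    return u / (len(group) * len(rest))


# Toy expression values of one gene in a cluster vs. all other cells.
cluster = [3, 5, 2, 0]
others = [0, 1, 0, 2]
print(mann_whitney_auc(cluster, others))
```

An AUC of 0.5 means a random cell from the cluster is as likely to be above as below a random cell from the rest, and values above 0.5 indicate up-regulation in the cluster, which matches the AUC > 0.5 filter used later in the thread.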

U01UF27E9P0 (17:51:05): UA5GZMWHM (17:53:41): > BTW, here’s the website for the workshop I’m presenting in Boston: https://lambdamoses.github.io/VoyagerWorkshop/ - Attachment (lambdamoses.github.io): VoyagerWorkshop > This package contains the workshop material demonstrating using > Voyager for exploratory spatial data analysis (ESDA) in a Visium dataset.

UA5GZMWHM (17:55:44): > Giotto authors are also at the conference, presenting a package demo.

U01UF27E9P0 (20:26:48):

2023-07-28

U7T29M3DG (18:45:53): > I thought you’d find this interesting in terms of geospatial data:

U7T29M3DG (18:45:53): > https://overpass-turbo.eu/ - Attachment (overpass-turbo.eu): overpass turbo > A web based data mining tool for OpenStreetMap which runs any kind of Overpass API query and shows the results on an interactive map.

U7T29M3DG (18:46:10): > You can write code in this website and then have it show you things (chatGPT is useful for helping write the code)

U7T29M3DG (18:47:46): > For example, here are all ATM machines in Cologne Germany:

U7T29M3DG (18:47:53): - File (PNG): image.png

U7T29M3DG (18:47:54): > The code:

U7T29M3DG (18:47:54): > [out:json]; > area[name="Köln"]->.searchArea; > ( > node(area.searchArea)["amenity"="atm"]; > way(area.searchArea)["amenity"="atm"]; > relation(area.searchArea)["amenity"="atm"]; > ); > out center;

2023-07-29

UA5GZMWHM (04:15:48): > I just realized that the VoyagerPy Wilcoxon test in find_markers is really slow, much slower than scanpy and scran. Not sure if I did something incorrectly or if it should be more optimized in the next version.

U04CYELHK5H (04:17:39): > I know. I think Sindri was very upfront about that when he implemented it.

U01BMGGNFEF (11:04:13): > Well the methods in scanpy and scran do two very different things. In our case scran is just plain faster than scipy.stats which is what I used. There could be some improvement with a custom implementation of the mann whitney test for a future version. There is a function get_marker_genes in voyagerpy that does the same as the scanpy one that should be a little faster.

UA5GZMWHM (15:57:15): > scran has its own implementation in C++. I checked if the results in VoyagerPy and scran match. They don’t for the large p-values and AUC’s below 0.5, but nobody cares about those anyway. It’s probably because of differences in the implementation of Wilcoxon in scran and scipy, like how to deal with ties, but I haven’t checked the source code because the C++ code is so hard to read. They largely match for the genes with FDR < 0.05 and AUC > 0.5, which are the genes that matter, with small differences above 1.5e-8 but unlikely to lead to different biological conclusions. That’s when I use pval type = “all” and direction = “up” in scran, which I think is the most sensible though not the default.

UA5GZMWHM (15:57:36): > In contrast, the log FC and FDR in Seurat and scanpy (default parameters) differ wildly.

UA5GZMWHM (16:01:21): > Each facet is for a cluster, which I didn’t annotate but I don’t think we need the cell type annotations to make the point. - File (PNG): image.png - File (PNG): image.png - File (PNG): image.png - File (PNG): image.png

UA5GZMWHM (16:04:40): > I also reproduced the notorious Pullin and McCarthy plot on Seurat vs. Scanpy log FC, with the same PBMC 5k dataset used in the non-spatial case study. I need to add a note on how Seurat and Scanpy deal with either group being all 0 for a gene. The first plot is with the all 0 entries and the +-20 something for scanpy comes from the epsilon = 1e-9 used to prevent invalid values in log2 when the cluster is all 0. The second plot is without those all 0’s. - File (PNG): image.png - File (PNG): image.png
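A minimal sketch of why an all-zero group gives such extreme values. The two formulas are an assumption about the defaults at the time: a pseudocount of 1 added to the mean of back-transformed values (Seurat-style), and epsilon added to the back-transformed mean (scanpy-style). Only the epsilon = 1e-9 value comes from the message above; the function names are hypothetical and the inputs are log1p-normalized values.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def log2fc_pseudocount_style(group, rest, pseudocount=1.0):
    """log2 of (mean of expm1 values + pseudocount), group minus rest."""
    a = _mean([math.expm1(v) for v in group]) + pseudocount
    b = _mean([math.expm1(v) for v in rest]) + pseudocount
    return math.log2(a) - math.log2(b)


def log2fc_epsilon_style(group, rest, eps=1e-9):
    """log2 ratio of (expm1 of the mean + eps), group over rest."""
    a = math.expm1(_mean(group)) + eps
    b = math.expm1(_mean(rest)) + eps
    return math.log2(a / b)
```

With the rest of the cells all 0, the epsilon-style denominator is just eps, so the ratio blows up, while the pseudocount-style value stays moderate.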

U01UF27E9P0 (18:04:03):

2023-07-30

U01UF27E9P0 (00:22:16): UA5GZMWHM (04:15:38): > Coauthors: Please help if you can. My writing style is just like the horrible mess in my apartment. I hoard things, though not to a pathological point, reasoning that they may be useful. Please identify where I put too many unnecessary details and help abridging the paper if you have the time. Please use suggestion mode so I can check if the meaning is preserved. Thanks!

U7T29M3DG (11:37:38): > Will do. But reading over so far it’s very, very good!

U7T29M3DG (11:38:15): > This figure is nuts:

U7T29M3DG (11:38:17): > Wow

U7T29M3DG (11:38:19): - File (PNG): image.png

UA5GZMWHM (15:52:41): > I mutilated your beautiful note on log fold change :grimacing:

UA5GZMWHM (16:46:36) (in thread): > Seurat first uses a minimum cutoff of percentage of cells in the cluster of interest expressing the gene and log fold change to select a smaller subset of genes to test before it runs DE. Scanpy tests all genes, which is often over 10x more than what Seurat tests. Both Seurat and Scanpy perform multiple testing correction for each cluster separately. But this is still nuts because the Seurat p-values are often well over 10x smaller than the scanpy ones. And Seurat uses Bonferroni while scanpy by default uses BH.
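For reference, a minimal sketch of the two corrections named above, written from the textbook definitions rather than from either package's source. The function names are hypothetical, and n here is simply the number of p-values passed in; which n each package actually uses is not settled in this thread.

```python
def bonferroni(pvals):
    """Bonferroni: multiply each p-value by the number of tests, cap at 1."""
    n = len(pvals)
    return [min(1.0, p * n) for p in pvals]


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values (step-up, monotone)."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * n / rank)
        adjusted[i] = running_min
    return adjusted
```

For the same set of tests, the Bonferroni-adjusted value is never smaller than the BH-adjusted one, so differences in the other direction have to come from the raw p-values or from how many genes are tested.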

U01UF27E9P0 (18:04:00):

2023-07-31

UA5GZMWHM (02:05:13) (in thread): > And that’s the beauty of open source. We can dig into the source code to see how exactly things are done. Imagine if everyone used SAS and SPSS. Something weird happens, and we can’t inspect the source code to see what happened.

U01UF27E9P0 (18:01:49): U01UF27E9P0 (21:29:32):

2023-08-01

U01UF27E9P0 (18:05:57):

2023-08-02

U01UF27E9P0 (18:01:19):

2023-08-03

U01UF27E9P0 (18:06:05):

2023-08-04

U01UF27E9P0 (18:03:30): U01UF27E9P0 (20:31:20):

2023-08-05

U01UF27E9P0 (18:02:22):

2023-08-06

U01UF27E9P0 (18:00:07):

2023-08-07

U01UF27E9P0 (18:06:33):

2023-08-08

U01UF27E9P0 (18:05:32):

2023-08-09

U01UF27E9P0 (00:01:39):

2023-08-10

U01UF27E9P0 (18:13:34): UA5GZMWHM (19:41:35): > https://www.overleaf.com/project/649a0f199ba1677c6f7206d6 Supplementary note 1, Moran’s I and MULTISPATI

UA5GZMWHM (19:42:28): > Supplementary Notes on Overleaf:https://www.overleaf.com/project/636c5c8a5d33da2fbd8f936a

2023-08-11

UA5GZMWHM (01:24:38): > Next release of Quarto and RStudio will have functionalities of Paperpile, to search online for papers to add to references. It’s free. I look forward to it because my days at Caltech are numbered so soon I’ll no longer be able to use the lab subscription to Paperpile.

UA5GZMWHM (01:48:48): > Here’s the cover letter:https://docs.google.com/document/d/1j-FnHM2O5MMVg69O2yfwhI1WZgIx3G5Z/edit?usp=drive_link&ouid=108696065868254037245&rtpof=true&sd=true - File (Word Document): caltech_lh_standard_template_div_R4BfoDV.docx

U01UF27E9P0 (02:25:26): UA5GZMWHM (02:49:36): > I’m done merging all the supplementary notes into the same LaTeX file on Overleaf. Our supplementary notes took more work to write than the worst papers I’ve seen when curating the Museum database.

U01UF27E9P0 (18:03:22): U01UF27E9P0 (21:19:57):

2023-08-12

U01UF27E9P0 (18:07:10):

2023-08-15

U01UF27E9P0 (19:56:58):

2023-08-17

U01UF27E9P0 (18:05:53):

2023-08-18

U01UF27E9P0 (18:08:50): U01UF27E9P0 (22:08:20):

2023-08-19

U01UF27E9P0 (18:01:13):

2023-08-20

UA5GZMWHM (04:34:37): > v2 of the preprint passed screening very quickly:https://doi.org/10.1101/2023.07.20.549945 - Attachment (bioRxiv): Voyager: exploratory single-cell genomics data analysis with geospatial statistics > Exploratory spatial data analysis (ESDA) can be a powerful approach to understanding single-cell genomics datasets, but it is not yet part of standard data analysis workflows. In particular, geospatial analyses, which have been developed and refined for decades, have yet to be fully adapted and applied to spatial single-cell analysis. We introduce the Voyager platform, which systematically brings the geospatial ESDA tradition to (spatial) -omics, with local, bivariate, and multivariate spatial methods not yet commonly applied to spatial -omics, united by a uniform user interface. Using Voyager, we showcase biological insights that can be derived with its methods, such as biologically relevant negative spatial autocorrelation. Underlying Voyager is the SpatialFeatureExperiment data structure, which combines Simple Feature with SingleCellExperiment and AnnData to represent and operate on geometries bundled with gene expression data. Voyager has comprehensive tutorials demonstrating ESDA built on GitHub Actions to ensure reproducibility and scalability, using data from popular commercial technologies. Voyager is implemented in both R/Bioconductor and Python/PyPI, and features compatibility tests to ensure that both implementations return consistent results. > > ### Competing Interest Statement > > The authors have declared no competing interest.

U01UF27E9P0 (18:04:44):

2023-08-21

U01UF27E9P0 (18:34:23):

2023-08-22

U01UF27E9P0 (18:11:49): UA5GZMWHM (21:24:09): > For the next release of Voyager and SFE in late October or early November, I’m editing the code to deal with cases and controls (just basic stuff like whether to compute spatial statistics jointly on different biological replicates in the same condition or separately on each section) and maybe 3D data (got serial section data). I would say my ideas about these things haven’t been fully developed yet. Do you have thoughts on these issues? Like how does batch effect manifest spatially? How to meaningfully compare spatial statistics across sections and across conditions? Python team, would you like to implement something like that in VoyagerPy? <@U04CYELHK5H><@U91L3C2KF>

2023-08-23

U01UF27E9P0 (11:25:01): U01UF27E9P0 (13:13:18) (in thread): UA5GZMWHM (15:53:17): > https://github.com/BiocPy/scranpy There’s a Bioconductor for Python. Maybe you no longer need to reimplement scran functionalities. <@U04CYELHK5H>

UA5GZMWHM (16:33:49): > News: The first preprint that I know of using the commercialized Slide-seq, i.e. Curio Seeker, was posted yesterday. I requested an example dataset, but I can’t download it immediately as with 10X and Vizgen; someone will message me instead. If they’re fine with it, then I think we should add an example dataset to the SFEData package and write a vignette on that dataset, maybe to replace the existing Slide-seq one. If not, then at least I think I should add a function readSeeker to the SFE package for the next release to read the standardized output. https://doi.org/10.1101/2023.08.21.554210 - Attachment (bioRxiv): A Spatiotemporal Molecular Atlas of the Ovulating Mouse Ovary > Ovulation is essential for reproductive success, yet the underlying cellular and molecular mechanisms are far from clear. Here, we applied high-resolution spatiotemporal transcriptomics to map out cell-type- and ovulation-stage-specific molecular programs as function of time during follicle maturation and ovulation in mice. Our analysis revealed dynamic molecular transitions within granulosa cell types that occur in tight coordination with mesenchymal cell proliferation. We identified new molecular markers for the emerging cumulus cell fate during the preantral-to-antral transition. We describe transcriptional programs that respond rapidly to ovulation stimulation and those associated with follicle rupture, highlighting the prominent roles of apoptotic and metabolic pathways during the final stages of follicle maturation. We further report stage-specific oocyte-cumulus cell interactions and diverging molecular differentiation in follicles approaching ovulation. Collectively, this study provides insights into the cellular and molecular processes that regulate mouse ovarian follicle maturation and ovulation with important implications for advancing therapeutic strategies in reproductive medicine. > > ### Competing Interest Statement > > The authors have declared no competing interest.

UA5GZMWHM (16:50:04): > OK, actually there’s an earlier one that I missed:https://www.biorxiv.org/content/10.1101/2023.03.23.533992v1 - Attachment (bioRxiv): A Single-Nucleus Atlas of Seed-to-Seed Development in Arabidopsis > Extensive studies of the reference plant Arabidopsis have enabled deep understandings of tissues throughout development, yet a census of cell types and states throughout development are lacking. Here, we present a single-nucleus transcriptome atlas of seed-to-seed development employing over 800,000 nuclei, encompassing a diverse set of tissues across ten developmental stages, with spatial transcriptomic validation of the dynamic seed and silique. Cross-organ analyses revealed transcriptional conservation of cell types throughout development but also heterogeneity within individual cell types influenced by organ-of-origin and developmental timing, including groups of transcription factors, suggesting gatekeeping by transcription factor activation. This atlas provides a resource for the study of cell type specification throughout the continuum of development, and a reference for stimulus-response and genetic perturbations at the single-cell resolution. > > One-Sentence Summary A single nucleus atlas of seed-to-seed development in Arabidopsis charts a course through the lifecycle of an organism. > > ### Competing Interest Statement > > The authors have declared no competing interest.

UA5GZMWHM (17:09:14): > <@U91L3C2KF><@U04CYELHK5H> Aaron Lun, who wrote scran and SCE, is implementing a common infrastructure for R and Python single cell data. This includes not only a C++ library for scran, but also C++ libraries to access R and Python matrices from C++. That includes DelayedArray for on-disk operations. https://github.com/tatami-inc/tatami https://github.com/BiocPy/mattress

U04CYELHK5H (17:48:38) (in thread): > Wow, that could come in handy. Depends on the ease of installation and how it compares to scran

U04CYELHK5H (17:53:38): > I guess it makes sense to have the same C++ backend for R and Python if you’re implementing the same package. This would allow easier compatibility testing and even unit testing.

UA5GZMWHM (18:09:07): > Definitely. It might also be faster.

U7T29M3DG (18:18:49): > FYI we’re meeting with Aaron Lun on Monday at 1pm PST. Would be great if someone from Iceland can join the call.

UA5GZMWHM (18:42:50): > It’s PDT

U01UF27E9P0 (19:12:57): U01UF27E9P0 (21:00:51):

2023-08-24

U01UF27E9P0 (01:19:00): UA5GZMWHM (18:31:44): > Again, thank you <@U7T29M3DG> for the tweet thread to gather feedback. Based on the messages I got, and on more reflection, I think I would do these in the last revision of the paper before submitting it to Nature Methods: > 1. It seems that the Seurat vs. scanpy part has attracted more comments than the spatial part, probably because this is more familiar to the audience than the spatial part. There’s a suggestion to make Venn diagrams of marker genes using adjusted p < 0.05 and logFC > 2 cutoffs from Seurat, scanpy, and scran. But then the supplementary note on log FC would look too much like a paper in its own right. Maybe the non-spatial Seurat vs. scanpy stuff should be split off as a separate paper which I’ll cite in the Voyager paper and that paper can probably go to a journal like Bioinformatics, well, if I make the same plots for a few other datasets, which is easy. I’m debating whether I should do it. Pro: it looks like a separate paper and it doesn’t use any spatial statistics. Being a separate paper may also make the results more visible since readers don’t have to dig into the supplements. This way I can make more room to go in more depth in the spatial case studies. Con: it’s relevant to the compatibility tests for Voyager so not entirely off topic, and it’s part of VoyagerPy. > 2. When it comes to compatibility tests, I now wonder if Seurat, squidpy, and Giotto give the same Moran’s I. Given that log FC and PCA have such trouble despite the supposedly simple math, I can’t assume that those packages give the same Moran’s I because of the simple math. This is relevant because Seurat actually uses Moran’s I to find spatially variable genes. > 3. It will be clearer after we talk to Aaron Lun next Monday. I think the Discussion section should put more emphasis on the common framework across languages to resolve the Seurat vs. scanpy divide. 
A future direction is to make a common C++ backend for ESDA just like libscran from Aaron. Maybe we should put Aaron in the Acknowledgement. BTW, there has long been such a C++ backend in the geospatial field, in GEOS, GDAL, and PROJ although most people don’t directly call them from C++.

U01UF27E9P0 (18:44:34):

2023-08-25

U01UF27E9P0 (00:16:00): U01UF27E9P0 (18:54:26):

2023-08-26

UA5GZMWHM (17:17:37): > There’s another attempt at something akin to SFE and SpatialData:https://www.biorxiv.org/content/10.1101/2023.06.07.543950v2 - Attachment (bioRxiv): emObject: domain specific data abstraction for spatial omics > Recent advances in high-parameter spatial biology have yielded a rapidly growing new class of biological data, allowing researchers to more comprehensively characterize cellular state and morphology in native tissue context. However, spatial biology lacks a cohesive data abstraction on which to build novel computational tools and algorithms, making it difficult to fully leverage these emergent data. Here, we present emObject, a domain-specific data abstraction for spatial biology data and experiments. We demonstrate the simplicity, flexibility, and extensibility of emObject for a range of spatial omics data types, including the analysis of Visium, MIBI, and CODEX data, as well as for integrated spatial multiomic experiments. The development of emObject is an essential step towards building a unified data science ecosystem for spatial biology and accelerating the pace of scientific discovery. > > ### Competing Interest Statement > > E.A.G.B., M.H., M.K.R., M.F.B., A.E.T., A.L., B.W., and A.T.M. are employees of Enable Medicine, Inc. and may hold equity.

U01UF27E9P0 (18:05:28): UA5GZMWHM (19:53:43): > I just found this thing which may be a useful resource to learn spatial statistics:https://link.springer.com/referencework/10.1007/978-3-662-60723-7 - Attachment (SpringerLink): Handbook of Regional Science > The multi-volume Handbook of Regional Science provides up-to-date knowledge in the field. Composed by renowned experts. 2nd updated edition.

2023-08-28

UA5GZMWHM (14:29:22): > Here’s the zoom link for the 1 pm meeting with Aaron Lun: https://caltech.zoom.us/j/84221231000 <@U0157SYMGBT> you can join if you’re interested. It will be about a C++ backend in scran and Voyager called by both R and Python.

U01UF27E9P0 (18:21:15):

2023-08-29

USLACKBOT (13:35:26): > community-bioc has joined this channel by invitation from Pachter Lab.

Aaron Lun (13:35:26): > @Aaron Lun has joined the channel

U01UF27E9P0 (18:11:28): UA5GZMWHM (19:05:18): > Below are links to the OSCA book and scran-related repos: https://bioconductor.org/books/release/OSCA/ https://github.com/LTLA/scran.chan https://github.com/tatami-inc/tatami https://github.com/LTLA/libscran https://github.com/LTLA/scran.jl https://github.com/BiocPy/scranpy https://github.com/BiocPy/mattress - Attachment (bioconductor.org): Orchestrating Single-Cell Analysis with Bioconductor > Or: how I learned to stop worrying and love the t-SNEs.

2023-08-30

U01UF27E9P0 (18:10:00):

2023-08-31

UA5GZMWHM (03:43:14): > I’m going to switch to the devel version of Voyager and SFE for vignettes in documentation-devel soon. A lot of updates need to be made to the website.

2023-09-02

UA5GZMWHM (05:03:01): > <@U02EN9EQQ5U>https://doi.org/10.1101/2023.08.30.555624 - Attachment (bioRxiv): Gene count normalization in single-cell imaging-based spatially resolved transcriptomics > Recent advances in imaging-based spatially resolved transcriptomics (im-SRT) technologies now enable high-throughput profiling of targeted genes and their locations in fixed tissues. Normalization of gene expression data is often needed to account for technical factors that may confound underlying biological signals. Here, we investigate the potential impact of different gene count normalization methods with different targeted gene panels in the analysis and interpretation of im-SRT data. Using different simulated gene panels that overrepresent genes expressed in specific tissue anatomical regions or cell types, we find that normalization methods that use scaling factors derived from gene counts differentially impact normalized gene expression magnitudes in a region- or cell type-specific manner. We show that these normalization-induced effects may reduce the reliability of downstream differential gene expression and fold change analysis, introducing false positive and false negative results when compared to results obtained from gene panels that are more representative of the gene expression of the tissue’s component cell types. These effects are not observed without normalization or when scaling factors are not derived from gene counts, such as with cell volume normalization. Overall, we caution that the choice of normalization method and gene panel may impact the biological interpretation of the im-SRT data. > > ### Competing Interest Statement > > The authors have declared no competing interest.

2023-09-05

UA5GZMWHM (16:28:30): > https://bioconductor.org/developers/release-schedule/

2023-09-06

U01UF27E9P0 (17:56:58): UA5GZMWHM (22:20:31): > <@U04CYELHK5H><@U7YCV0V8F> Please help! I wasted a lot of time trying to figure out why this used to work but stopped working. https://github.com/pmelsted/voyagerpy/issues/25 - Attachment: #25 model_gene_var stopped working > I’m rerunning some code that used to work, which was used to make figures for the preprint. Here’s a colab notebook to reproduce the error: https://colab.research.google.com/drive/1T5-3AI2rYsWR-bqZzaF2BmGXyOU3M0in?usp=sharing > > The gene names are the Ensembl IDs so there shouldn’t be duplicates. After data normalization, this code > > > gene_var = vp.utils.model_gene_var(adata.layers['logcounts'], gene_names=adata.var_names) > > > gives this error from pandas: > > > InvalidIndexError: Reindexing only valid with uniquely valued Index objects > > > Pandas version is 1.5.3

2023-09-08

U01UF27E9P0 (13:08:32): UA5GZMWHM (18:17:00): > Coauthors: I’m going to split off supplementary note on log fold change and comparisons between Seurat and scanpy (I have also added Giotto) into a separate paper, which has the same authors as the Voyager paper. Then I’ll remove Nicolas Bray from the Voyager paper because his contributions are all in that separate paper. I’ll also make some more room for Voyager case studies. Are you OK with that?

UA5GZMWHM (18:17:28): > Here are some new plots about compatibility tests:

UA5GZMWHM (18:18:30): > Using default parameters, Seurat, scanpy, and Giotto mostly get different highly variable genes, which causes huge differences in PCA and very likely in anything downstream. - File (PNG): image.png

UA5GZMWHM (18:19:45): > Oh, BTW, I chose these colors for the following reasons: Seurat is for R’s color on GitHub, but somewhat lighter so it’s easier to tell apart from scanpy. Scanpy is for Python’s color on GitHub. Giotto is for the color of Giotto’s logo.

UA5GZMWHM (18:21:03): > When I used default parameters for data normalization and scaling, the PCA embeddings are very different. Giotto also does the normalization quite differently and the documentation didn’t explain why the defaults are chosen. - File (PNG): image.png

UA5GZMWHM (18:23:27): > I also quantified the differences in the PCA eigenvectors (gene loadings). Because the HVG’s are very different, for each pair, I used the genes shared by the two packages. Maybe a bit problematic. - File (PNG): image.png

UA5GZMWHM (18:24:42): > I also tried using the same HVG’s from Seurat. For Giotto, I changed the parameters so the data normalization is the same as in Seurat and scanpy. The PCA is still different because of different ways the data is scaled. - File (PNG): image.png

UA5GZMWHM (18:27:57): > Seurat by default clips the scaled data values at 10. Scanpy just z-scores the genes, no clipping. Giotto by default first scales and centers the genes, and then it scales and centers the cells so the gene expression profile of each cell has mean 0 and variance 1; I’m not sure if it makes sense. Genes are scaled so more highly expressed genes don’t drown out less expressed genes. But when performing PCA, where each dimension is a linear combination of genes, I’m not sure if it makes sense to scale the cells.
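The three scaling behaviors described above can be sketched as follows. This is an illustration under assumptions, not any package's actual code: rows are cells and columns are genes, the sample standard deviation is used, clipping is applied only to the upper side, and all function names are hypothetical.

```python
import statistics


def _zscore(values):
    """Center and scale one vector; constant vectors become all zeros."""
    mu = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample SD: an assumption
    return [(v - mu) / sd if sd > 0 else 0.0 for v in values]


def scale_genes(matrix):
    """z-score each gene (column) across cells: the scanpy-like step."""
    scaled_columns = [_zscore(column) for column in zip(*matrix)]
    return [list(row) for row in zip(*scaled_columns)]


def clip_upper(matrix, max_value=10.0):
    """Cap scaled values at max_value: the Seurat-like extra step."""
    return [[min(v, max_value) for v in row] for row in matrix]


def scale_cells(matrix):
    """z-score each cell (row) across genes: the Giotto-like extra step."""
    return [_zscore(row) for row in matrix]
```

Roughly, `scale_genes(m)` alone corresponds to the scanpy description, `clip_upper(scale_genes(m))` to the Seurat one, and `scale_cells(scale_genes(m))` to the Giotto one.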

UA5GZMWHM (18:28:41): > The differences in eigenvectors are compared numerically. Here the same HVG’s are used. - File (PNG): image.png

UA5GZMWHM (18:34:08): > When I don’t do the clipping in Seurat and don’t scale cells in Giotto, the results would match those of scanpy. The Seurat vs. scanpy and scanpy vs. Giotto lines coincide because Seurat and Giotto use the same R package irlba for PCA by default. It takes quite some digging to find out those hidden defaults and probably most users don’t dig this far. That’s also why I don’t think it’s a good idea for Giotto to reimplement many Seurat functionalities and to implement its own new object so it’s less interoperable with Seurat and SCE — users have to use the reimplementation, and since most wouldn’t deviate much from the defaults, the results from Giotto’s reimplementation would be quite different from those of Seurat (most people use Seurat) and anything downstream should be interpreted differently from Seurat results. - File (PNG): image.png

UA5GZMWHM (18:38:08): > Seurat, squidpy, and Giotto also give different Moran’s I results. Seurat does Moran’s I differently from squidpy and Giotto: Seurat uses a distance matrix and always uses the inverse distance squared as spatial weights, while squidpy and Giotto use a spatial neighborhood graph. For some reason, Seurat’s Moran’s I’s are smaller than those from squidpy and Giotto. Nevertheless, this might not significantly affect downstream results, since the Pearson and Spearman correlations are 0.99 or 1. - File (PNG): image.png
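A minimal sketch of how the choice of spatial weights alone changes Moran's I. It uses the standard formula I = (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2, with z the centered values and S0 the sum of all weights. The two weight constructions follow the description above (inverse distance squared versus a k nearest neighbor graph), but the unnormalized binary kNN weights and all function names are assumptions, not the packages' actual code.

```python
import math


def morans_i(values, weights):
    """Global Moran's I for a dense n x n weight matrix with zero diagonal."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    num = sum(weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in z)
    return (n / s0) * (num / den)


def inverse_distance_squared_weights(coords):
    """w_ij = 1 / d_ij^2 for every pair of distinct points."""
    n = len(coords)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                w[i][j] = 1.0 / math.dist(coords[i], coords[j]) ** 2
    return w


def knn_weights(coords, k):
    """w_ij = 1 if j is among the k nearest neighbors of i (ties by index)."""
    n = len(coords)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: (math.dist(coords[i], coords[j]), j),
        )
        for j in others[:k]:
            w[i][j] = 1.0
    return w
```

On a toy gradient along a line, both weight schemes give a positive Moran's I, but the values differ because the distance-based weights also include distant, dissimilar pairs.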

UA5GZMWHM (18:40:30): > Some people use a p-value or log fold change cutoff to find sets of marker genes. Because Seurat and scanpy compute both p-values and log fold changes differently (Wilcoxon is used in both, and default parameters are used), the resulting sets of marker genes are different. If adjusted p-values are used with p < 0.05 as a cutoff, for most clusters, the sets largely agree, but in some clusters they disagree quite a bit. - File (PNG): image.png - File (PNG): image.png

UA5GZMWHM (18:41:28): > Because of the way Seurat computes logFC, Seurat’s logFC is usually lower than that of scanpy. As a result, if the logFC > 2 threshold is used, we get way more markers from scanpy than Seurat. - File (PNG): image.png

U7T29M3DG (19:13:05) (in thread): > :+1:

U7T29M3DG (19:16:12): > Hi everyone, > > Just to follow up on<@UA5GZMWHM>’s message: we decided to split off the discordance examples (between Scanpy and Seurat) and turn it into a standalone paper. This was suggested by several people, and with increasing numbers of comparisons makes sense. > > We are going to leave the author list the same (except<@U8PUSP5PS>will be removed from the Voyager paper because he didn’t contribute there). We’ll try to get this out fairly quickly, and would be great if you could give feedback on the comparisons and/or suggest comparisons we may have overlooked.

UA5GZMWHM (19:28:40): > Here’s the document for the spinoff paper:https://www.overleaf.com/read/gsqtmvgcnmpr - Attachment (overleaf.com): Overleaf, Online LaTeX Editor > An online LaTeX editor that’s easy to use. No installation, real-time collaboration, version control, hundreds of LaTeX templates, and more.

UA5GZMWHM (19:38:48): > Actually, what about submitting the Voyager paper shortly after the Bioc 3.18 release? I’m implementing new features on multiple samples and 3D data for Bioc 3.18 anyway. I have many ideas for other new features, but I might not have enough time to implement and test them before the end of October. Now with much of the compatibility test contents in a separate paper, I have more room in the Voyager paper for another case study for multiple samples, taken from a new vignette for the next release.

UA5GZMWHM (19:39:09): > I think that will make the paper quite a bit cooler.

UA5GZMWHM (19:43:29): > I’ll create a new GitHub repo for the code for the spinoff paper maybe this weekend, and move the code used to make the plots to that repo.

U05QPDL3TNF (22:25:33): > @U05QPDL3TNF has joined the channel

2023-09-09

UA5GZMWHM (01:55:14): > I’m moving relevant code from the Voyager preprint code repo to this repo for the split-off Seurat vs. scanpy paper: https://github.com/pachterlab/MERJLBABMP_2023 <@U05QPDL3TNF> since your downsampling results will be part of this paper, please put your analysis code in this repo. What’s your GitHub username?

U05QPDL3TNF (02:28:46): > Sounds good! My username is josephrich98

2023-09-11

U04CYELHK5H (18:12:46) (in thread): > I’m sorry I never saw this message/issue until now. The problem is that when you slice an AnnData object you get a view of the original AnnData object instead of a new one. > > If you copy it via adata = adata[adata.obs['in_tissue']==True].copy() you should not get the error. Maybe we should check for AnnData views whenever we perform in-place operations on AnnData objects. If I remember correctly, one cannot mutate a view.

UA5GZMWHM (18:13:37) (in thread): > It would be nice to give a warning when a view is passed to the function. Somehow it worked with the view with v0.1.0.

U04CYELHK5H (18:25:01) (in thread): > I think that was because it was not truly in-place. These last lines of the function alter the reference to the adata object, which copies it, I think: https://github.com/pmelsted/voyagerpy/blob/058ce8c833ed08ca92cb6151e3e6803a92172add/voyagerpy/utils/utils.py#L137-L142 https://anndata.readthedocs.io/en/latest/tutorials/notebooks/getting-started.html#Views-and-copies - Attachment (anndata): Getting started with anndata > Authors: Adam Gayoso, Alex Wolf Note This tutorial is based on a blog posts by Adam in 2021 and Alex in 2017. In this tutorial, we introduce basic properties of the central object, AnnData(“Annotat…
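The warning suggested in this thread could look roughly like the sketch below. It relies only on the `is_view` attribute and `copy()` method that AnnData objects expose; the helper name `as_actual` is hypothetical and is not part of VoyagerPy, and whether to copy automatically or only warn is a design choice left open here.

```python
import warnings


def as_actual(adata):
    """Return adata unchanged, or a copy (with a warning) if it is a view."""
    if getattr(adata, "is_view", False):
        warnings.warn(
            "Received a view of an AnnData object; copying it before "
            "in-place operations.",
            UserWarning,
            stacklevel=2,
        )
        return adata.copy()
    return adata
```

A function that modifies its input could call `adata = as_actual(adata)` first, so that a sliced object such as `adata[adata.obs['in_tissue'] == True]` is turned into a real copy before anything is written to it.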

2023-09-12

UA5GZMWHM (02:55:45): > <@U04CYELHK5H><@U91L3C2KF> These are the header-only libraries of C++ code from @Aaron Lun that can be a common foundation for non-spatial single cell analyses in Voyager. https://pypi.org/project/assorthead/ Plus there is Lowess here, which wasn’t incorporated into VoyagerPy. Could be a starting point for a C++ implementation of Voyager that both the R and Python packages wrap, so it will be easier to pass compatibility tests. The entirely non-spatial part should be a separate package, which I could develop further on top of scran.chan. I don’t think I have enough time to do it for Bioc 3.18 (late October). For 3.19 (late April, 2024), I might vendor some of these libraries in R packages to submit to Bioconductor so R package Voyager can begin to transition to C++. We still need to implement spatial analyses in C++ and make them more efficient than spdep. I definitely think spdep’s correlogram and gstat’s variogram can be parallelized and greatly sped up with more modern practices. This kind of thing is actually not new. In the geospatial field, sf in R and GeoPandas in Python already both wrap C++ libraries like GDAL, PROJ, and GEOS, and I know additional R packages wrapping other C/C++ spatial libraries like lwgeom, mapshaper (actually it’s javascript), and s2 which have wider applications outside R. - Attachment (PyPI): assorthead > Assorted C++ headers

UA5GZMWHM (03:01:45): > I think if I put several of these related header-only libraries in a Bioconductor R package, I may call it scHeaderverse or BiocHeaderverse. But I would probably do several packages.

2023-09-13

UA5GZMWHM (11:55:21): > Could be the future of SFE: https://paleolimbot.github.io/geoarrow/articles/geoarrow.html It supports larger than memory data and the data isn’t copied when using both R and Python in the same notebook. - Attachment (paleolimbot.github.io): Getting started with geoarrow > geoarrow

2023-09-14

UA5GZMWHM (23:59:35): > <@U7T29M3DG> Some updates: I compared the k nearest neighbor (KNN) graphs from Seurat, scanpy, and Giotto. Or more precisely, whichever graph is used downstream for graph-based clustering. They actually don’t use the KNN graph directly; rather, they first make a KNN graph, and then use it to make a shared nearest neighbor (SNN) graph, which is then used for clustering. They do it in very different ways (I explained in the notebook how). So the neighborhood sizes are very different. First, with everything default, the neighborhoods on the graph don’t match well, I think in part due to the different neighborhood sizes. - File (PNG): image.png

2023-09-15

UA5GZMWHM (00:11:47): > These are the neighborhood sizes (degree on the neighborhood graph), from default settings. Seurat tends to get much larger neighborhoods. Seurat finds Jaccard indices of neighborhoods of cell x and cell y, so x and y get an edge in the SNN graph if the Jaccard index is at least 1/15. Scanpy uses a function from the umap package to get the SNN graph and I don’t know how it works. Giotto’s default is k = 30, which is why the degree is capped at 30, but the size of the intersection between neighborhoods of cell x and cell y is used to filter edges, so many cells have <30 neighbors. - File (PNG): image.png
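A minimal sketch of the Jaccard-based SNN construction described above for Seurat: two cells get an edge when the Jaccard index of their KNN neighborhoods is at least 1/15. Whether each neighborhood includes the cell itself, and the brute-force loop over all pairs, are assumptions made for illustration; the function names are hypothetical.

```python
def jaccard(a, b):
    """Jaccard index of two sets: |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def snn_edges(knn, prune=1 / 15):
    """Build SNN edges from a dict mapping each cell to its neighbor set.

    Returns {(x, y): jaccard} for pairs whose index is positive and >= prune.
    """
    cells = sorted(knn)
    edges = {}
    for pos, x in enumerate(cells):
        for y in cells[pos + 1:]:
            score = jaccard(knn[x], knn[y])
            if score > 0 and score >= prune:
                edges[(x, y)] = score
    return edges
```

Because a low threshold such as 1/15 keeps any pair sharing even a small fraction of neighbors, the degree of a cell in the resulting graph can be far larger than k, which is consistent with the larger neighborhoods noted above.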

U7T29M3DG (00:18:10): > Incredible work Lambda. That’s amazing. It seems it’s not even possible to try to get them to match. > I’m honestly rather shocked at the extent of this difference.

UA5GZMWHM (00:26:10): > One may argue that if these differences don’t affect the clusters, then they shouldn’t matter. But the clusters (using default settings, except that I set resolution = 1 for Seurat to match scanpy and Giotto) don’t match that well either, though in most cases clusters can be matched. Rows are Seurat clusters, columns are scanpy clusters, and color is the Jaccard index. - File (PNG): image.png
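The cluster comparison described above can be sketched as a table of Jaccard indices between every pair of clusters from two labelings of the same cells. This is an illustrative version with a hypothetical function name, not the code used to make the plot.

```python
def cluster_jaccard(labels_a, labels_b):
    """Jaccard index for each (cluster in a, cluster in b) pair.

    labels_a and labels_b give the cluster label of each cell, in the
    same cell order. Returns {(label_a, label_b): jaccard}.
    """
    members_a, members_b = {}, {}
    for cell, (a, b) in enumerate(zip(labels_a, labels_b)):
        members_a.setdefault(a, set()).add(cell)
        members_b.setdefault(b, set()).add(cell)
    return {
        (a, b): len(cells_a & cells_b) / len(cells_a | cells_b)
        for a, cells_a in members_a.items()
        for b, cells_b in members_b.items()
    }
```

A cluster that is split in the other labeling shows up as a row with two or more moderately large entries instead of one entry close to 1.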

UA5GZMWHM (00:26:32): > These are Leiden clusters

UA5GZMWHM (00:27:58): - File (PNG): image.png

UA5GZMWHM (00:33:34): > I did the analyses again fixing k = 20 and using top 30 PC’s but default everything else (Seurat2, scanpy2, Giotto2 in the plot), and then with the same scaled data, same highly variable genes, same PCA, and k = 20 and top 30 PC’s (Seurat3, scanpy3, Giotto3). I think a lot of the differences are caused by having different highly variable genes, and the overlap in HVG’s is pretty small among the 3 packages. Which package finds better HVG’s? I don’t know, haven’t benchmarked that, but probably most users don’t care either and just accept the default. Here Seurat’s HVGs are used, so as a result the neighbors from the different settings (the only difference is whether to clip scaled data to 10) in Seurat are more similar. - File (PNG): image.png

UA5GZMWHM (00:34:44): > But despite using the same HVG and getting the same PCA, the neighborhoods are still very different between the packages, due to the different ways the SNN was computed. - File (PNG): image.png

UA5GZMWHM (00:35:27): > Again, that has something to do with neighborhood size - File (PNG): image.png

UA5GZMWHM (00:37:19): > With the same PCA, the clusters do match better. Here rows are Seurat clusters and columns are scanpy clusters and colors are Jaccard index. - File (PNG): image.png

UA5GZMWHM (00:37:53): - File (PNG): image.png

U7T29M3DG (01:11:28): > Agreed, it’s much better, but there are clusters being split (e.g. cluster 3 between Seurat and scanpy) in ways that are rather striking and frankly, even given my prior, which was rather cynical, I’m shocked.

Aaron Lun (13:42:30) (in thread): > FWIW I’d be happy for any level of contribution from you guys, up to and including taking it all off my hands (libscran + scran.chan + scranpy). Though scran.js is a little bit finicky re. Wasm compilation, so I’ll spare you guys from dealing with that.

U0157SYMGBT (13:44:03): > @U0157SYMGBT has joined the channel

2023-09-18

U01UF27E9P0 (20:47:38): U01UF27E9P0 (23:36:30): U01UF27E9P0 (23:46:03) (in thread):

2023-09-19

U01UF27E9P0 (00:50:25): U01UF27E9P0 (01:00:36) (in thread): UA5GZMWHM (15:13:59): > https://genomebiology.biomedcentral.com/articles/10.1186/s13059-023-03045-1 - Attachment (BioMed Central): Disparities in spatially variable gene calling highlight the need for benchmarking spatial transcriptomics methods - Genome Biology > Identifying spatially variable genes (SVGs) is a key step in the analysis of spatially resolved transcriptomics data. SVGs provide biological insights by defining transcriptomic differences within tissues, which was previously unachievable using RNA-sequencing technologies. However, the increasing number of published tools designed to define SVG sets currently lack benchmarking methods to accurately assess performance. This study compares results of 6 purpose-built packages for SVG identification across 9 public and 5 simulated datasets and highlights discrepancies between results. Additional tools for generation of simulated data and development of benchmarking methods are required to improve methods for identifying SVGs.

U01UF27E9P0 (21:43:21):

2023-09-20

U01UF27E9P0 (00:08:44): U01UF27E9P0 (00:08:52): U01UF27E9P0 (00:09:09) (in thread): Alik Huseynov (04:22:41): > @Alik Huseynov has joined the channel

Alik Huseynov (07:40:44) (in thread): > geoarrow would be great, especially in combination with dplyr and tidyverse

UA5GZMWHM (17:21:52): > <@U04CYELHK5H> scranpy is now on PyPI. Please consider using it for VoyagerPy. Also, Aaron would like someone to take over the scranpy and headers project. If you or someone in your lab is interested, that would be great. https://pypi.org/project/scranpy/ - Attachment (PyPI): scranpy > Analyze multi-modal single-cell data!

Aaron Lun (17:26:10) (in thread): > (also happy to take any contributions to the C++ libraries themselves)

U7T29M3DG (21:19:18): > Voyager has already been cited in a paper using it to analyze Visium data for Alzheimer’s disease (postmortem human prefrontal cortex, compared to mouse)

U7T29M3DG (21:19:20): > https://www.biorxiv.org/content/10.1101/2023.07.24.550282v1.full

2023-09-21

U7YCV0V8F (06:08:09) (in thread): - File (JPEG): Image from iOS

2023-09-23

U01UF27E9P0 (02:29:05):

2023-09-25

U01UF27E9P0 (18:09:16):

2023-09-26

U7YCV0V8F (04:36:19): > https://doi.org/10.1093/bioinformatics/btad550

2023-09-28

U01UF27E9P0 (18:07:08):

2023-09-30

U01UF27E9P0 (17:59:38):

2023-10-01

U01UF27E9P0 (18:00:33):

2023-10-02

U01UF27E9P0 (18:03:22):

2023-10-03

U01UF27E9P0 (18:01:09):

2023-10-04

U01UF27E9P0 (18:04:57):

2023-10-05

U01UF27E9P0 (18:04:12):

2023-10-06

U01UF27E9P0 (18:03:02):

2023-10-07

U01UF27E9P0 (17:58:38):

2023-10-08

U01UF27E9P0 (18:00:28):

2023-10-09

U01UF27E9P0 (18:05:26):

2023-10-10

U01UF27E9P0 (18:07:14):

2023-10-11

U01UF27E9P0 (18:06:47):

2023-10-12

U01UF27E9P0 (18:06:40):

2023-10-13

U01UF27E9P0 (18:08:02):

2023-10-16

U01UF27E9P0 (18:18:06):

2023-10-17

U7T29M3DG (10:18:26): > https://x.com/benjraphael/status/1714018750419111989?s=46&t=0bO41hICU1XW_xBrgZBPaQ - Attachment (X (formerly Twitter)): Ben Raphael on X > Our new method GASTON learns a “topographic map” of a tissue slice, enabling simultaneous identification of spatial domains and gene expression gradients in spatial transcriptomics https://t.co/mnZSxzUkqx > > Led by @UthsavC w/ @BrianJohnArnold @hrksrkr @Cong992 Sereno + Kohei (1/7)

2023-10-20

UA5GZMWHM (19:50:06): > I just learnt about this package: https://bioconductor.org/packages/release/bioc/html/tomoda.html <@U7T29M3DG> suggested using Voyager on 1D data. I think I should write a vignette on Tomo-seq, computing Moran’s I, Lee’s L, and MULTISPATI on 1D data. I can use this package for visualization. I know a really cool dataset from the spiny mouse: https://genome.cshlp.org/content/33/8/1424 - Attachment (Bioconductor): tomoda > This package provides many easy-to-use methods to analyze and visualize tomo-seq data. The tomo-seq technique is based on cryosectioning of tissue and performing RNA-seq on consecutive sections. (Reference: Kruse F, Junker JP, van Oudenaarden A, Bakkers J. Tomo-seq: A method to obtain genome-wide expression data with spatial resolution. Methods Cell Biol. 2016;135:299-307. doi:10.1016/bs.mcb.2016.01.006) The main purpose of the package is to find zones with similar transcriptional profiles and spatially expressed genes in a tomo-seq sample. Several visualization functions are available to create easy-to-modify plots. - Attachment (genome.cshlp.org): Spatial transcriptomics reveals asymmetric cellular responses to injury in the regenerating spiny mouse (Acomys) ear > An international, peer-reviewed genome sciences journal featuring outstanding original research that offers novel insights into the biology of all organisms

2023-10-25

U01UF27E9P0 (18:05:42):

2023-11-01

UA5GZMWHM (15:13:20): > https://www.paulamoraga.com/book-spatial/ - Attachment (paulamoraga.com): Welcome | Spatial Statistics for Data Science: Theory and Practice with R > A book for Spatial Statistics for Data Science with R.

2023-11-08

UA5GZMWHM (17:08:54): > Talks a bit about interoperability: https://www.biorxiv.org/content/10.1101/2023.10.30.563174v1?ct=

2023-11-22

U01UF27E9P0 (10:53:45):

2023-11-27

U01UF27E9P0 (22:27:05):

2023-11-29

U01UF27E9P0 (20:03:46):

2023-12-02

U01UF27E9P0 (20:20:10):

2023-12-03

U01UF27E9P0 (16:31:03):

2023-12-04

U01UF27E9P0 (16:43:20):

2023-12-05

U01UF27E9P0 (16:52:52):

2023-12-06

U01UF27E9P0 (16:49:49):

2023-12-07

U01UF27E9P0 (16:38:58): UA5GZMWHM (17:24:07): > I have asked related questions on how representative TMAs are in my shared Reminders (sorry for all the spam), and here’s a study that kind of answers those questions. https://doi.org/10.1101/2023.12.04.569986

2023-12-08

U01UF27E9P0 (15:26:13): U01UF27E9P0 (15:46:36) (in thread): U01UF27E9P0 (16:20:46) (in thread): U01UF27E9P0 (16:34:02):

2023-12-09

UA5GZMWHM (02:34:33): > Apparently my Museum paper/book has been very influential, and there’s a new paper doing something similar for single cell data analysis tools: https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-023-05590-9 I intended to collect info on which data analysis tools were used in each paper collecting new data and there’s a column for that in the spreadsheet, but I can’t keep up with filling it in so it’s mostly left blank. I wonder why nobody is plotting the institutions on maps like I did. - Attachment (BioMed Central): SingleScan: a comprehensive resource for single-cell sequencing data processing and mining - BMC Bioinformatics > Single-cell sequencing has shed light on previously inaccessible biological questions from different fields of research, including organism development, immune function, and disease progression. The number of single-cell-based studies increased dramatically over the past decade. Several new methods and tools have been continuously developed, making it extremely tricky to navigate this research landscape and develop an up-to-date workflow to analyze single-cell sequencing data, particularly for researchers seeking to enter this field without computational experience. Moreover, choosing appropriate tools and optimal parameters to meet the demands of researchers represents a major challenge in processing single-cell sequencing data. However, a specific resource for easy access to detailed information on single-cell sequencing methods and data processing pipelines is still lacking. In the present study, an online resource called SingleScan was developed to curate all up-to-date single-cell transcriptome/genome analyzing tools and pipelines. All the available tools were categorized according to their main tasks, and several typical workflows for single-cell data analysis were summarized.
In addition, spatial transcriptomics, which is a breakthrough molecular analysis method that enables researchers to measure all gene activity in tissue samples and map the site of activity, was included along with a portion of single-cell and spatial analysis solutions. For each processing step, the available tools and specific parameters used in published articles are provided and how these parameters affect the results is shown in the resource. All information used in the resource was manually extracted from related literature. An interactive website was designed for data retrieval, visualization, and download. By analyzing the included tools and literature, users can gain insights into the trends of single-cell studies and easily grasp the specific usage of a specific tool. SingleScan will facilitate the analysis of single-cell sequencing data and promote the development of new tools to meet the growing and diverse needs of the research community. The SingleScan database is publicly accessible via the website at http://cailab.labshare.cn/SingleScan .

U01UF27E9P0 (16:26:51):

2023-12-10

U01UF27E9P0 (16:26:56):

2023-12-11

U01UF27E9P0 (16:35:53):

2023-12-12

U01UF27E9P0 (16:48:56):

2023-12-13

UA5GZMWHM (15:28:53): > https://www.biorxiv.org/content/10.1101/2022.12.22.521681v2?ct= - Attachment (bioRxiv): voyAGEr: free web interface for the analysis of age-related gene expression alterations in human tissues > We herein introduce voyAGEr, an online graphical interface to explore age-related gene expression alterations in 49 human tissues. voyAGEr offers a visualisation and statistical toolkit for the finding and functional exploration of sex- and tissue-specific transcriptomic changes with age. In its conception, we developed a novel bioinformatics pipeline leveraging RNA sequencing data, from the GTEx project, encompassing more than 900 individuals. voyAGEr reveals transcriptomic signatures of the known asynchronous ageing between tissues, allowing the observation of tissue-specific age-periods of major transcriptional changes, associated with alterations in different biological pathways, cellular composition, and disease conditions. Notably, voyAGEr was created to assist researchers with no expertise in bioinformatics, providing a supportive framework for elaborating, testing and refining their hypotheses on the molecular nature of human ageing and its association with pathologies, thereby also aiding in the discovery of novel therapeutic targets. voyAGEr is freely available at https://compbio.imm.medicina.ulisboa.pt/app/voyAGEr. > > ### Competing Interest Statement > > The authors have declared no competing interest.

2023-12-14

U01UF27E9P0 (16:26:57):

2023-12-15

UA5GZMWHM (18:06:20): > The third paper citing Voyager that I know of: https://doi.org/10.1101/2023.12.13.571385 - Attachment (bioRxiv): A Comparative Analysis of Imaging-Based Spatial Transcriptomics Platforms > Spatial transcriptomics is a rapidly evolving field, overwhelmed by a multitude of technologies. This study aims to offer a comparative analysis of datasets generated from leading in situ imaging platforms. We have generated spatial transcriptomics data from serial sections of prostate adenocarcinoma using the 10x Genomics Xenium and NanoString CosMx SMI platforms. Additionally, orthogonal single-nucleus RNA sequencing (snRNA-seq) was performed on the same FFPE tissue to establish a reference for the tumor’s transcriptional profiles. We assessed various technical aspects, such as reproducibility, sensitivity, dynamic range, cell segmentation, cell type annotation, and congruence with single-cell profiling. The practicality of assessing cellular organization and biomarker localization was evaluated. Although fewer genes are measured (CosMx: 960, Xenium: 377, with an overlap of 125), Xenium consistently demonstrates higher sensitivity, a broader dynamic range, and better alignment with single-cell reference profiles. Conversely, CosMx’s out-of-the-box segmentation outperformed Xenium’s, resulting in noticeable transcript misassignment in Xenium within certain tissue areas. However, the impact of this on the cells’ transcriptional profile was minimal. Together, this comprehensive comparison of two leading commercial platforms for spatial transcriptomics provides essential metrics for assessing their performance, offering invaluable insights for future research and technological advancements in this dynamic field. > > ### Competing Interest Statement > > L.G.M., J.T.P., D.P.C., D.P.K., K.B.J., K.W., M.J.R., F.S-D., N.K.R., M.Z., I.S.V., S.R.V.K., L.M.B., J.L.W. and N.B.
declare no professional or financial affiliations with 10X Genomics or NanoString Technologies. During the conduct of this study, L.G.M. served as a Scientific Advisor for Millenium Sciences (no longer in this role), Omniscope, and ArgenTag. N.B. is Chief Scientist at Deepcell. S.R.V.K. is the founder of and a consultant for Faeth Therapeutics and Transomic Technologies. None of the authors received any form of payment or compensation from these companies. Consumables used in this study from both companies were purchased at full price, acquired at a discounted rate, or provided free of charge, although not specifically for this study. Subscription to AtoMx was provided at no cost by NanoString Technologies.

2023-12-17

UA5GZMWHM (00:58:43): > Relevant to radiomics in mice: https://www.biorxiv.org/content/10.1101/2023.11.27.568800v1?ct=

2023-12-19

U05DTFCM9TQ (15:25:12): > @U05DTFCM9TQ has joined the channel

UA5GZMWHM (20:29:19): > https://doi.org/10.1101/2023.12.07.570603 - Attachment (bioRxiv): Systematic benchmarking of imaging spatial transcriptomics platforms in FFPE tissues > Emerging imaging spatial transcriptomics (iST) platforms and coupled analytical methods can recover cell-to-cell interactions, groups of spatially covarying genes, and gene signatures associated with pathological features, and are thus particularly well-suited for applications in formalin fixed paraffin embedded (FFPE) tissues. Here, we benchmarked the performance of three commercial iST platforms on serial sections from tissue microarrays (TMAs) containing 23 tumor and normal tissue types for both relative technical and biological performance. On matched genes, we found that 10X Xenium shows higher transcript counts per gene without sacrificing specificity, but that all three platforms concord to orthogonal RNA-seq datasets and can perform spatially resolved cell typing, albeit with different false discovery rates, cell segmentation error frequencies, and with varying degrees of sub-clustering for downstream biological analyses. Taken together, our analyses provide a comprehensive benchmark to guide the choice of iST method as researchers design studies with precious samples in this rapidly evolving field. > > ### Competing Interest Statement > > The authors have declared no competing interest.

2023-12-20

UA5GZMWHM (22:58:58): > OK, I didn’t read it carefully but it looks cool. They did both Visium and micro CT. https://doi.org/10.1126/scitranslmed.ade4619

2024-01-04

UA5GZMWHM (19:01:37): > https://doi.org/10.1101/2023.12.03.569744 - Attachment (bioRxiv): Systematic comparison of sequencing-based spatial transcriptomic methods > Recent advancements of sequencing-based spatial transcriptomics (sST) have catalyzed significant advancements by facilitating transcriptome-scale spatial gene expression measurement. Despite this progress, efforts to comprehensively benchmark different platforms are currently lacking. The extant variability across technologies and datasets poses challenges in formulating standardized evaluation metrics. In this study, we established a collection of reference tissues and regions characterized by well-defined histological architectures, and used them to generate data to compare six sST methods. We highlighted molecular diffusion as a variable parameter across different methods and tissues, significantly impacting the effective resolutions. Furthermore, we observed that spatial transcriptomic data demonstrate unique attributes beyond merely adding a spatial axis to single-cell data, including an enhanced ability to capture patterned rare cell states along with specific markers, albeit being influenced by multiple factors including sequencing depth and resolution. Our study assists biologists in sST platform selection, and helps foster a consensus on evaluation standards and establish a framework for future benchmarking efforts that can be used as a gold standard for the development and benchmarking of computational tools for spatial transcriptomic analysis. > > ### Competing Interest Statement > > The authors have declared no competing interest.

2024-01-09

UA5GZMWHM (20:15:01): > New special item in the museum, special because they used a custom large format Visium-like spatial transcriptomics method on a large slide that contains sections of a whole mouse. Both creepy/gruesome and cool. https://www.nature.com/articles/s41590-023-01722-8 - Attachment (Nature): A pairwise cytokine code explains the organism-wide response to sepsis > Nature Immunology - Chevrier and colleagues uncovered a hierarchical cytokine circuit arising from the pairwise effects of TNF with IL-18, IFN-γ or IL-1β, which explains the organism-wide…

2024-01-10

UA5GZMWHM (03:51:12): > <@U05QPDL3TNF> <@U02EN9EQQ5U> I documented in detail how I curate the Museum database, including setting up the RSS feed, what is deemed relevant, and what the columns in the database mean. It’s on CryptPad, a privacy-friendly, secure, and open source alternative to Google Docs: https://cryptpad.fr/code/#/2/code/view/d0BfTcGUpiAyEC+8q7vJfNtuo8di47Fj801F3GGm0RY/p/ If you’re interested in becoming a curator, I’ll dm you the password or you can sign up on CryptPad and I’ll share the document with you. Your help in manual curation or in creating more automated ways is greatly appreciated because I’m struggling to keep up. It’s taking up so much time that I’m so behind in writing SFE and Voyager. This database is very valuable, not only for updating the digital humanities analyses in the book, but also as a great resource for learning about spatial transcriptomics. Benefits of being a curator: you will be forced to be very up to date in this field and you will learn about new diseases, species, and institutions. - Attachment (cryptpad.fr): Encrypted Code > CryptPad: end-to-end encrypted collaboration suite

U05QPDL3TNF (13:18:36) (in thread): > Sounds great! Once I wrap up this Seurat-Scanpy work I’d be happy to help. It would be great to meet and talk through your process at that point as well!

U02EN9EQQ5U (14:10:57) (in thread): > I’m in! I agree with Joe that it would be great to meet to talk about this whenever you have free time

UA5GZMWHM (17:05:48) (in thread): > Great, I don’t have much scheduled so can be anytime

U05QPDL3TNF (19:19:35) (in thread): > Same, I’m flexible. Maybe sometime Friday or next week?

UA5GZMWHM (19:20:19) (in thread): > What about Friday 2 pm?

U05QPDL3TNF (19:21:36) (in thread): > Works for me

U02EN9EQQ5U (19:21:59) (in thread): > Same

UA5GZMWHM (19:23:30) (in thread): > Alright, see you then

2024-01-11

U01UF27E9P0 (16:33:22):

2024-01-12

U01UF27E9P0 (16:33:44): U02EN9EQQ5U (17:05:15) (in thread): > <@U05QPDL3TNF> We are in room 222, next to the big conference room

UA5GZMWHM (17:08:06) (in thread): > https://cryptpad.fr/profile/#/2/profile/view/BvXbo8XtvTiQDaobqbUeONT8TQplex8BEICXEUPnbhI/ - Attachment (cryptpad.fr): CryptPad > CryptPad: end-to-end encrypted collaboration suite

2024-01-13

U01UF27E9P0 (16:19:59): UA5GZMWHM (17:47:43): > https://doi.org/10.1101/2024.01.11.575135 - Attachment (bioRxiv): Comparative analysis of multiplexed in situ gene expression profiling technologies > The burgeoning interest in in situ multiplexed gene expression profiling technologies has opened new avenues for understanding cellular behavior and interactions. In this study, we present a comparative benchmark analysis of six in situ gene expression profiling methods, including both commercially available and academically developed methods, using publicly accessible mouse brain datasets. We find that standard sensitivity metrics, such as the number of unique molecules detected per cell, are not directly comparable across datasets due to substantial differences in the incidence of off-target molecular artifacts impacting specificity. To address these challenges, we explored various potential sources of molecular artifacts, developed novel metrics to control for them, and utilized these metrics to evaluate and compare different in situ technologies. Finally, we demonstrate how molecular false positives can seriously confound spatially-aware differential expression analysis, requiring caution in the interpretation of downstream results. Our analysis provides guidance for the selection, processing, and interpretation of in situ spatial technologies. > > ### Competing Interest Statement > > A.H. was employed by 10x Genomics from July 2020 to September 2021 and owns stock in the company. In the past 3 years, R.S. has received compensation from Bristol-Myers Squibb, ImmunAI, Resolve Biosciences, Nanostring, 10x Genomics, Neptune Bio, and the NYC Pandemic Response Lab. R.S. is a co-founder and equity holder of Neptune Bio.

2024-01-14

UA5GZMWHM (01:47:57): > Bianca Dumitrascu sent me this paper, which seems relevant to <@U05QPDL3TNF>’s work on Seurat vs. Scanpy: https://www.pnas.org/doi/10.1073/pnas.1901326117

U01UF27E9P0 (16:32:47):

2024-01-16

U01UF27E9P0 (16:39:16):

2024-01-17

U01UF27E9P0 (16:21:42):

2024-01-20

U01UF27E9P0 (16:21:35):

2024-01-21

U01UF27E9P0 (16:26:12):

2024-01-22

U01UF27E9P0 (16:50:30):

2024-01-25

U01UF27E9P0 (16:22:19):

2024-01-26

U01UF27E9P0 (16:37:46): U023DK2HCM7 (20:42:01): > Hi <@U04CYELHK5H>, is there a function to compute the spatial neighborhood graph in voyagerpy? I am trying to compute spatial Moran’s I for a custom spatial technology, but I don’t think there is a spatial Moran’s I tutorial on the website yet, is that right?
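
For readers with the same question, the two pieces involved (a spatial neighborhood graph and global Moran's I) can be sketched with NumPy alone. This deliberately does not use the voyagerpy API, which is not shown in this thread. The helper names `knn_weights` and `morans_i`, the k-nearest-neighbor graph, the row-standardized weights, and the toy 10 x 10 grid are all assumptions chosen for illustration; the statistic itself is the standard definition I = (n / S0) * (z' W z) / (z' z), with z the mean-centered values and S0 the sum of all weights.

```python
import numpy as np

def knn_weights(coords, k):
    """Row-standardized k-nearest-neighbor spatial weights as a dense (n, n) matrix."""
    coords = np.asarray(coords, dtype=float)
    n = coords.shape[0]
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)  # exclude self-neighbors
    nbrs = np.argsort(d, axis=1)[:, :k]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0 / k
    return W

def morans_i(x, W):
    """Global Moran's I of values x under spatial weights W."""
    x = np.asarray(x, dtype=float)
    z = x - x.mean()
    return (len(x) / W.sum()) * (z @ W @ z) / (z @ z)

# Toy data: a 10 x 10 grid of spots with two artificial "genes".
side = 10
rows, cols = np.divmod(np.arange(side * side), side)
coords = np.column_stack([cols, rows])
W = knn_weights(coords, k=4)
gradient = cols.astype(float)                # smooth left-to-right gradient
checker = ((rows + cols) % 2).astype(float)  # checkerboard pattern
i_gradient = morans_i(gradient, W)
i_checker = morans_i(checker, W)
print(i_gradient, i_checker)
```

The gradient gives a strongly positive value and the checkerboard a strongly negative one. The dense matrix is only practical for small datasets; a sparse matrix would be the natural replacement for real data.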

2024-01-27

U01UF27E9P0 (16:22:52):

2024-01-28

U01UF27E9P0 (16:29:05):

2024-01-29

U01UF27E9P0 (16:14:24):

2024-01-30

U01UF27E9P0 (16:28:28):

2024-01-31

U01UF27E9P0 (16:39:44):

2024-02-01

U01UF27E9P0 (16:27:33):

2024-02-03

U01UF27E9P0 (16:31:31):

2024-02-04

U01UF27E9P0 (16:27:33):

2024-02-05

U01UF27E9P0 (00:19:49): U01UF27E9P0 (00:22:49): UA5GZMWHM (01:57:19): > I made two “great” discoveries this weekend: > 1. Geometric operations with sf actually do work with 3D data, although I can only perform the operations in x and y. It’s much easier than I thought and I can avoid a hacky way I came up with earlier to add the z coordinates back. > 2. I heard about the sfheaders package a long time ago, but I just found out that it can replace much of my clunky code to convert plain data frames into sf geometries, and it is orders of magnitude faster. Great news for larger datasets.

U01UF27E9P0 (16:39:52): U01UF27E9P0 (18:11:20):

2024-02-06

UA5GZMWHM (13:16:29): > https://doi.org/10.1101/2024.02.01.578436 - Attachment (bioRxiv): SEraster: a rasterization preprocessing framework for scalable spatial omics data analysis > Motivation Spatial omics data demand computational analysis but many analysis tools have computational resource requirements that increase with the number of cells analyzed. This presents scalability challenges as researchers use spatial omics technologies to profile millions of cells. Results To enhance the scalability of spatial omics data analysis, we developed a rasterization preprocessing framework called SEraster that aggregates cellular information into spatial pixels. We apply SEraster to both real and simulated spatial omics data prior to spatial variable gene expression analysis to demonstrate that such preprocessing can reduce resource requirements while maintaining high performance. We further integrate SEraster with existing analysis tools to characterize cell-type spatial cooccurrence. Finally, we apply SEraster to enable analysis of a mouse pup spatial omics dataset with over a million cells to identify tissue-level and cell-type-specific spatially variable genes as well as cooccurring cell-types that recapitulate expected organ structures. Availability and implementation Source code is available on GitHub (https://github.com/JEFworks-Lab/SEraster) with additional tutorials at https://JEF.works/SEraster. > > ### Competing Interest Statement > > The authors have declared no competing interest.

UA5GZMWHM (14:33:34): > I’ve been having this idea for a long time, though I haven’t added it to the SFE package yet. I think when I implement it (most likely after the hackathon), I’ll add the option to do the binning at the level of transcript spots in smFISH-based data. It could deserve a vignette like when I use bins of different sizes and square vs. hexagonal to see spatial autocorrelation at different length scales.
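
The binning idea in the message above can be illustrated with a small NumPy sketch. It is not the SFE implementation, which according to the message does not exist yet. The function name `bin_spots`, the toy coordinates, and the restriction to square bins are assumptions for the example, and hexagonal bins are not covered here.

```python
import numpy as np

def bin_spots(x, y, genes, bin_size):
    """Count transcript spots per gene in square bins of side `bin_size`.

    Returns the sorted gene names and an array of shape (n_genes, n_rows, n_cols).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    gene_ids, gene_idx = np.unique(np.asarray(genes), return_inverse=True)
    # Bin index of every spot, measured from the lower-left corner of the data.
    col = np.floor((x - x.min()) / bin_size).astype(int)
    row = np.floor((y - y.min()) / bin_size).astype(int)
    counts = np.zeros((len(gene_ids), row.max() + 1, col.max() + 1), dtype=int)
    # np.add.at accumulates correctly when several spots fall in the same bin.
    np.add.at(counts, (gene_idx, row, col), 1)
    return gene_ids, counts

# Four hypothetical transcript spots from two genes.
x = [0, 1, 2, 9]
y = [0, 1, 8, 9]
genes = ["A", "A", "B", "A"]
gene_ids, fine = bin_spots(x, y, genes, bin_size=5)
_, coarse = bin_spots(x, y, genes, bin_size=10)
print(fine.shape, coarse.shape)
```

Running the same spots through several values of `bin_size` and computing a spatial autocorrelation statistic on each grid is one way to look at autocorrelation across length scales.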

Alik Huseynov (16:08:55): > Nice! Binning is an important part of working with rowGeometries as a segmentation-free approach.

2024-02-07

U01UF27E9P0 (16:46:15):

2024-02-09

U01UF27E9P0 (16:39:09): UA5GZMWHM (19:58:13): > Have to do before I leave: > * Vignette for SpatialGenomics (seqFISH), data from Rahma, Brian Williams, Matt Thomson > * Vignette for Open-ST > * Joe works on vignette with Alik to learn SFE > * Joe improves VoyagerPy > * 3 papers: Joe’s paper (Cell Systems), SFE (Bioinformatics, compare with Giotto, SCE, SPE, Squidpy, MolExp, do geometric operations, update overview workflow on 3D and multiple samples), Voyager (Nature Methods) > * Email more people on the workshop & hackathon > * Visium HD vignette > * Voyager paper: non-spatial data, new case studies on smFISH (rowGeometry) and multiple samples

UA5GZMWHM (22:13:34): > Another thing about the SFE paper that I forgot to mention: it already got started, since the SFE package has a function to reformat smFISH transcript spot coordinate data and write the reformatted results to disk. Ideally only transcript spots of certain genes of interest are loaded into memory, since I anticipate the primary use of this to be visualization and you can’t visualize transcript spots of too many genes at once before they get way too crowded. So the files on disk get a complicated structure. I already started the alabaster.sfe repo. It’s for an on-disk, language-agnostic serialization of the SFE object. Genentech’s ArtifactDB has already implemented this for SCE and SPE.

UA5GZMWHM (22:14:28): > Now the tidySingleCellExperiment paper is under review at Nature Methods. It can be extended to SFE, just as sf is the part of the Tidyverse for spatial data.

2024-02-10

U01UF27E9P0 (02:09:26): U01UF27E9P0 (02:12:22): U01UF27E9P0 (02:13:49): U01UF27E9P0 (16:44:34):

2024-02-11

U01UF27E9P0 (16:26:55):

2024-02-12

U01UF27E9P0 (16:34:27):

2024-02-18

Artür Manukyan (05:08:45): > @Artür Manukyan has joined the channel

2024-03-01

U01UF27E9P0 (00:13:47):

2024-03-02

UA5GZMWHM (23:34:21): > https://www.researchgate.net/publication/328723512_Tobler%27s_First_Law_of_Geography

2024-03-03

U01UF27E9P0 (01:42:10):

2024-03-04

U01UF27E9P0 (03:21:27): U01UF27E9P0 (03:23:51):

2024-03-05

UA5GZMWHM (16:35:28): > https://www.biorxiv.org/content/10.1101/2023.08.18.553810v2 - Attachment (bioRxiv): 13C-SpaceM: Spatial single-cell isotope tracing reveals heterogeneity of de novo fatty acid synthesis in cancer > Metabolism has emerged as a key factor in homeostasis and disease including cancer. Yet, little is known about the heterogeneity of metabolic activity of cancer cells due to the lack of tools to directly probe it. Here, we present a novel method, 13C-SpaceM for spatial single-cell isotope tracing of glucose-dependent de novo lipogenesis. The method combines imaging mass spectrometry for spatially-resolved detection of 13C6-glucose-derived 13C label incorporated into esterified fatty acids with microscopy and computational methods for data integration and analysis. We validated 13C-SpaceM on a spatially-heterogeneous normoxia-hypoxia model of liver cancer cells. Investigating cultured cells, we revealed single-cell heterogeneity of lipogenic acetyl-CoA pool labelling degree upon ACLY knockdown that is hidden in the bulk analysis and its effect on synthesis of individual fatty acids. Next, we adapted 13C-SpaceM to analyze tissue sections of mice harboring isocitrate dehydrogenase (IDH)-mutant gliomas. We found a strong induction of de novo fatty acid synthesis in the tumor tissue compared to the surrounding brain. Comparison of fatty acid isotopologue patterns revealed elevated uptake of mono-unsaturated and essential fatty acids in the tumor. Furthermore, our analysis uncovered substantial spatial heterogeneity in the labelling of the lipogenic acetyl-CoA pool indicative of metabolic reprogramming during microenvironmental adaptation. Overall, 13C-SpaceM enables novel ways for spatial probing of metabolic activity at the single cell level. Additionally, this methodology provides unprecedented insight into fatty acid uptake, synthesis and modification in normal and cancerous tissues. > > ### Competing Interest Statement > > T.A. 
holds patents on imaging mass spectrometry and leads a startup on single-cell metabolomics incubated at the BioInnovation Institute. G.J.P. is a scientific advisory board member for Cambridge Isotope Laboratories and has a collaborative research agreement with Thermo Scientific.

U01UF27E9P0 (18:46:58):

2024-03-08

UA5GZMWHM (15:42:55): > https://www.biorxiv.org/content/10.1101/2024.03.04.583400v1 - Attachment (bioRxiv): A Spatial Multi-Modal Dissection of Host-Microbiome Interactions within the Colitis Tissue Microenvironment > The intricate and dynamic interactions between the host immune system and its microbiome constituents undergo dynamic shifts in response to perturbations to the intestinal tissue environment. Our ability to study these events on the systems level is significantly limited by in situ approaches capable of generating simultaneous insights from both host and microbial communities. Here, we introduce Microbiome Cartography (MicroCart), a framework for simultaneous in situ probing of host features and its microbiome across multiple spatial modalities. We demonstrate MicroCart by comprehensively investigating the alterations in both gut host and microbiome components in a murine model of colitis by coupling MicroCart with spatial proteomics, transcriptomics, and glycomics platforms. Our findings reveal a global but systematic transformation in tissue immune responses, encompassing tissue-level remodeling in response to host immune and epithelial cell state perturbations, and bacterial population shifts, localized inflammatory responses, and metabolic process alterations during colitis. MicroCart enables a deep investigation of the intricate interplay between the host tissue and its microbiome with spatial multiomics. > > ### Competing Interest Statement > > The authors have declared no competing interest.

U01UF27E9P0 (23:01:36):

2024-03-14

UA5GZMWHM (19:02:23): > They did both Visium and micro CT in mouse after tenotomy and with 30% body surface area burn. The experiment is kind of gruesome. https://www.nature.com/articles/s41413-024-00320-0 - Attachment (Nature): The HIF-1α/PLOD2 axis integrates extracellular matrix organization and cell metabolism leading to aberrant musculoskeletal repair > Bone Research - The HIF-1α/PLOD2 axis integrates extracellular matrix organization and cell metabolism leading to aberrant musculoskeletal repair

UA5GZMWHM (19:15:57): > Now we have another benchmark study:https://doi.org/10.1101/2024.03.12.584114 - Attachment (bioRxiv): Benchmarking clustering, alignment, and integration methods for spatial transcriptomics > Spatial transcriptomics (ST) is advancing our understanding of complex tissues and organisms. However, building a robust clustering algorithm to define spatially coherent regions in a single tissue slice, and aligning or integrating multiple tissue slices originating from diverse sources for essential downstream analyses remain challenging. Numerous clustering, alignment, and integration methods have been specifically designed for ST data by leveraging its spatial information. The absence of benchmark studies complicates the selection of methods and future method development. Here we systematically benchmark a variety of state-of-the-art algorithms with a wide range of real and simulated datasets of varying sizes, technologies, species, and complexity. Different experimental metrics and analyses, like adjusted rand index (ARI), uniform manifold approximation and projection (UMAP) visualization, layer-wise and spot-to-spot alignment accuracy, spatial coherence score (SCS), and 3D reconstruction, are meticulously designed to assess method performance as well as data quality. We analyze the strengths and weaknesses of each method using diverse quantitative and qualitative metrics. This analysis leads to a comprehensive recommendation that covers multiple aspects for users. The code used for evaluation is available on GitHub. Additionally, we provide jupyter notebook tutorials and documentation to facilitate the reproduction of all benchmarking results and to support the study of new methods and new datasets (). > > ### Competing Interest Statement > > The authors have declared no competing interest.

2024-03-19

U01UF27E9P0 (23:39:22): U01UF27E9P0 (23:43:03):

2024-03-20

U01UF27E9P0 (17:27:50): UA5GZMWHM (21:39:09): > Good news today: someone from CMU achieved my dream back in 2021 to reconstruct 3D cell segmentation from 2D segmentation from each z plane.https://www.biorxiv.org/content/10.1101/2024.03.08.584082v1 - Attachment (bioRxiv): 3DCellComposer - A Versatile Pipeline Utilizing 2D Cell Segmentation Methods for 3D Cell Segmentation > Background Cell segmentation is crucial in bioimage informatics, as its accuracy directly impacts conclusions drawn from cellular analyses. While many approaches to 2D cell segmentation have been described, 3D cell segmentation has received much less attention. 3D segmentation faces significant challenges, including limited training data availability due to the difficulty of the task for human annotators, and inherent three-dimensional complexity. As a result, existing 3D cell segmentation methods often lack broad applicability across different imaging modalities. > > Results To address this, we developed a generalizable approach for using 2D cell segmentation methods to produce accurate 3D cell segmentations. We implemented this approach in 3DCellComposer, a versatile, open-source package that allows users to choose any existing 2D segmentation model appropriate for their tissue or cell type(s) without requiring any additional training. Importantly, we have enhanced our open source CellSegmentationEvaluator quality evaluation tool to support 3D images. It provides metrics that allow selection of the best approach for a given imaging source and modality, without the need for human annotations to assess performance. Using these metrics, we demonstrated that our approach produced high-quality 3D segmentations of tissue images, and that it could outperform an existing 3D segmentation method on the cell culture images with which it was trained. 
> > Conclusions 3DCellComposer, when paired with well-trained 2D segmentation models, provides an important alternative to acquiring human-annotated 3D images for new sample types or imaging modalities and then training 3D segmentation models using them. It is expected to be of significant value for large scale projects such as the Human BioMolecular Atlas Program. > > ### Competing Interest Statement > > The authors have declared no competing interest.

2024-03-21

UA5GZMWHM (05:19:30): > https://www.nature.com/articles/s41592-024-02212-x - Attachment (Nature): SpatialData: an open and universal data framework for spatial omics > Nature Methods - SpatialData is a user-friendly computational framework for exploring, analyzing, annotating, aligning and storing spatial omics data that can seamlessly handle large multimodal…

UA5GZMWHM (05:22:23): > Now what, I guess I can try to submit the SFE paper to Nature Methods as well. Not 100% sure if it should still be separate from the Voyager paper, but I already have a lot to say about SFE so maybe it should be a separate paper. Not sure if they will reject it because it’s quite similar to SpatialData. There’s a SpatialData R package in progress, not on Bioc yet. Thing is it’s more difficult to work with Zarr in R. There’s the Rarr package but it doesn’t support all Zarr functionalities.

UA5GZMWHM (05:40:19): > Maybe I’ll be fine. I’ve seen similar papers being published in the same issue of Nature Methods before. I know some good cases in spatial transcriptomics, but here’s a bad case: Alevin fry, which basically just copied kallisto bustools and nobody cites it. SFE is still a contribution to the R community. I suppose I can move the first case study in the Voyager paper to the SFE paper since it doesn’t involve spatial statistics but involves geometric operations, so there’s more room for other case studies for Voyager. I can also put more emphasis on the map analogy and that sample relations tree. For the second case study, I can get a published Visium and Mass Spec metabolomics dataset (that will be a vignette anyway) and do the alignment like I have done for Cathal from Adelaide though I can’t use his dataset.

Alik Huseynov (06:44:20): > I think SFE as a separate paper would make sense and could probably be done before things get updated on SpatialDataR. > And a separate paper on Voyager spatial analysis/geospatial stats stuff.

UA5GZMWHM (14:35:35): > I already kind of wrote a new SFE vignette for the workshop. I wonder which way you prefer: > 1. Put SFE vignettes on the Voyager documentation website > 2. Create a documentation branch in the SFE repo for more extensive vignettes just like for Voyager. So the vignettes will be on the separate SFE documentation website rather than the Voyager one.

Alik Huseynov (16:08:56): > I think 2. is preferable, especially for the SFE paper; it would be cool to have. I also started a small vignette on Seurat to SFE conversion (Visium and image-based ST).

U01UF27E9P0 (16:11:30): U01UF27E9P0 (20:51:21):

2024-03-22

UA5GZMWHM (01:41:42): > Google doc for the SFE paper: https://docs.google.com/document/d/11ps2j_Ds2CIt0InMM1qKws7YncMc5DUDD0Qeyr36UYg/edit?usp=sharing

U01UF27E9P0 (17:41:25):

2024-03-23

U01UF27E9P0 (17:36:25):

2024-03-24

U7T29M3DG (00:43:59): > I agree with Alik that a separate SFE paper makes sense, and glad to see you’ve started it.

U01UF27E9P0 (17:26:51):

2024-03-25

Pedro Sanchez (06:39:59): > @Pedro Sanchez has joined the channel

U01UF27E9P0 (07:21:25): U01UF27E9P0 (07:21:25) (in thread): Pedro Sanchez (07:40:02): > @Pedro Sanchez has left the channel

U01UF27E9P0 (08:12:45) (in thread): U01UF27E9P0 (17:48:03):

2024-03-26

Alik Huseynov (05:55:51): > interesting paper from<@U91L3C2KF>:+1:I’m glad that Harmony performed the best.https://www.biorxiv.org/content/10.1101/2024.03.19.585562v1 - Attachment (bioRxiv): Batch correction methods used in single cell RNA-sequencing analyses are often poorly calibrated > As the number of experiments that employ single-cell RNA-sequencing (scRNA-seq) grows it opens up the possibility of combining results across experiments or processing cells from the same experiment assayed in separate sequencing runs. The gain in the number of cells that can be compared comes at the cost of batch effects that may be present. Several methods have been proposed to combat this for scRNA-seq datasets. > > We compared seven widely used method used for batch correction of scRNA-seq datasets. We present a novel approach to measure the degree to which the methods alter the data in the process of batch correction, both at the fine scale comparing distances between cells as well as measuring effects observed across clusters of cells. We demonstrate that many of the published method are poorly calibrated in the sense that the process of correction creates measurable artifacts in the data. > > In particular, MNN, SCVI and LIGER performed poorly in our tests, often altering the data considerably. Batch correction with Combat, BBKNN and Seurat introduced artifacts that could be detected in our setup. However, we found that Harmony was the only method that consistently performed well, in all the testing methodology we present. Due to these result Harmony is the only method we can safely recommend using when performing batch correction of scRNA-seq data. > > ### Competing Interest Statement > > The authors have declared no competing interest.

U01UF27E9P0 (17:38:54):

2024-03-27

U01UF27E9P0 (00:18:45): U01UF27E9P0 (00:18:45) (in thread): UA5GZMWHM (08:18:07): > > Important update: The Bioconductor Release will be May 1 following the release of R 4.4 on April 24. > > The Bioconductor 3.18 branch will be frozen Monday April 15th. After that date, no changes will be permitted ever on that branch. > > The deadline for devel Bioconductor 3.19 for packages to pass R CMD build and R CMD check is Friday April 26th. While you will still be able to make commits past this date, this ensures any changes pushed to git.bioconductor.org are reflected in at least one build report before the devel branch will be copied to a release 3.19 branch. >

UA5GZMWHM (08:19:08): > So we have a little over a month to work on the next release, not including vignettes that are not on Bioconductor.

2024-03-29

U01UF27E9P0 (17:22:58):

2024-03-30

UA5GZMWHM (01:00:06): > Here’s my abstract for Bioc2024. Please let me know if you have comments or suggestions. I have to submit it on April 1. https://docs.google.com/document/d/1SrPK2s_102xZdJrBc_gcddV0K6NspSO19MB6hBmLJ2A/edit?usp=sharing

U01UF27E9P0 (17:35:08):

2024-03-31

Alik Huseynov (15:58:38) (in thread): > looks good, thanks!

U01UF27E9P0 (17:23:26):

2024-04-01

UA5GZMWHM (16:52:31) (in thread): > If nobody has comments then I’ll submit it

U01UF27E9P0 (17:27:05):

2024-04-02

U01UF27E9P0 (17:32:09): UA5GZMWHM (20:54:55): > Just found this: the glorious era of ISH atlases, prequel to spatial transcriptomics, is still alive. https://doi.org/10.1016/j.neures.2017.10.009 https://doi.org/10.1073/pnas.2020125118

2024-04-03

U01UF27E9P0 (17:30:10):

2024-04-04

U01UF27E9P0 (17:26:40):

2024-04-05

U01UF27E9P0 (17:26:21):

2024-04-06

Alik Huseynov (06:19:43): > <@UA5GZMWHM>and<@U7T29M3DG>regarding our discussion of ST techs comparison. It is finally online on bioRxiv. ST techs (Visium, RNAScope, Vizgen, Xenium and Molecular Cartography) as well as snRNAseq. > there is also discussion on tech-specific segmentation part as well as the robustness and practical usage of machines.https://www.biorxiv.org/content/10.1101/2024.04.03.586404v1 - Attachment (bioRxiv): Comparison of spatial transcriptomics technologies used for tumor cryosections > Background: Spatial transcriptomics (ST) methods provide single cell transcriptome profiles within the endogenous cell tissue context. Thus, they are ideally suited to resolve intra-tumor heterogeneity and interactions between tumor and non-malignant cells in their microenvironment. However, different ST technologies exist and are rapidly evolving. It is frequently not clear how they address a given research question in cancer research. Here, we investigated fresh frozen tissue sections from medulloblastoma with enhanced nodularity (MBEN) using four distinct imaging-based ST methodologies (RNAscope HiPlex, Molecular Cartography, MERFISH/Merscope, and Xenium), alongside sequencing-based ST on Visium slides. Results: Our comparative case study describes critical aspects of the different workflows and identifies informative parameters to assess the experimental results. We evaluate sensitivity and specificity across platforms, show how technology dependent features affect the results and provide guidance for the experimental design. Furthermore, we demonstrate how cell segmentation can be improved and/or additional custom readouts can be integrated by reimaging of the slides after the ST experiment. Conclusions: The different ST methods come with specific features that should be considered in the method selection and experimental design. 
We anticipate that the insight gained in our study will facilitate successful application of ST for the analysis of fresh frozen tissue sections of solid tumors. > > ### Competing Interest Statement > > The authors have declared no competing interest.

UA5GZMWHM (08:21:04) (in thread): > I took a look at the paper. About Moran’s I, it says > > Bounds of Moran’s I go from -1 to +1 (similar to Pearson correlation coefficients). A value round 0 indicates spatially random pattern, < 0 towards -1 negative spatial autocorrelation (chessboard-like pattern), > 0 to towards 1 indicates positive spatial autocorrelation (clustered, also gradient-like patterns). > That’s not entirely accurate. The lower and upper bounds actually depend on the spatial neighborhood graph. The bounds are the smallest and largest eigenvalues of the doubly centered adjacency matrix of the graph, scaled by the number of observations divided by the sum of the weights. In most cases, the upper bound is around 1, but the lower bound is usually around -0.6 in 2D tessellations. Cite this paper about the lower and upper bounds: https://onlinelibrary.wiley.com/doi/10.1111/j.1538-4632.1984.tb00797.x

UA5GZMWHM (08:31:40) (in thread): > Also I don’t think it’s appropriate to cite the Voyager paper when it comes to Moran’s I itself; you should cite the original paper instead: https://rss.onlinelibrary.wiley.com/doi/abs/10.1111/j.2517-6161.1948.tb00012.x

Alik Huseynov (08:51:55) (in thread): > Thanks! We will correct that. > I remember from here https://www.paulamoraga.com/book-spatial/spatial-autocorrelation.html#global-morans-i that the global ranges were -1 to 1. - Attachment (paulamoraga.com): Chapter 8 Spatial autocorrelation | Spatial Statistics for Data Science: Theory and Practice with R > Spatial autocorrelation is used to describe the extent to which a variable is correlated with itself through space. This concept is closely related to Tobler’s First Law of Geography, which states…

U01UF27E9P0 (17:23:28): U7T29M3DG (18:25:31): > Hi @Alik Huseynov

U7T29M3DG (18:25:56): > That’s a great preprint and will be very useful to people (and us in particular). Congratulations!

UA5GZMWHM (18:41:17) (in thread): > I think it should be fine to say “usually go from -1 to 1”. But the “usually” is important because often the lower bound is closer to -0.5 and in some pathological cases the bounds are quite far from -1 and 1. See this paper for some pathological cases (I can send you the pdf if you’re interested): https://link.springer.com/article/10.1007/s10109-019-00293-3 - Attachment (SpringerLink): Spatial autocorrelation for massive spatial data: verification of efficiency and statistical power asymptotics > Journal of Geographical Systems - Being a hot topic in recent years, many studies have been conducted with spatial data containing massive numbers of observations. Because initial developments for…
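
The dependence of the bounds on the neighborhood graph discussed in this thread can be checked numerically. The following is only a minimal sketch under stated assumptions: binary (0/1) weights, two toy 10 x 10 lattices (a four-neighbor rook grid and a six-neighbor layout resembling Visium spots), and the extreme eigenvalues of the centered adjacency matrix scaled by n / S0, where S0 is the sum of all weights. The function names (`lattice_edges`, `extreme_eigenvalue`, `morans_i_bounds`) are hypothetical and not part of Voyager or any other package; a plain-Python shifted power iteration stands in for a proper eigensolver so the sketch has no dependencies.

```python
import math
import random

ROOK = ((1, 0), (0, 1))           # square grid, 4 neighbors per interior node
HEX = ((1, 0), (0, 1), (1, -1))   # axial coordinates, 6 neighbors per interior node


def lattice_edges(k, directions):
    """Undirected edges of a k x k lattice patch (hypothetical helper).

    `directions` lists one offset per pair of opposite neighbor directions.
    """
    edges = []
    for q in range(k):
        for r in range(k):
            for dq, dr in directions:
                q2, r2 = q + dq, r + dr
                if 0 <= q2 < k and 0 <= r2 < k:
                    edges.append((q * k + r, q2 * k + r2))
    return k * k, edges


def extreme_eigenvalue(n, edges, sign, n_iter=500):
    """Largest (sign=+1) or smallest (sign=-1) eigenvalue of M A M restricted
    to centered vectors, where A is the binary adjacency matrix and M centers
    a vector. Uses power iteration on sign * M A M + shift * I."""
    degree = [0] * n
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
    shift = max(degree)  # at least the spectral radius of A
    rng = random.Random(0)
    x = [rng.random() for _ in range(n)]
    rayleigh = 0.0
    for _ in range(n_iter):
        mean = sum(x) / n
        x = [v - mean for v in x]                      # apply M
        norm = math.sqrt(sum(v * v for v in x))
        x = [v / norm for v in x]
        ax = [0.0] * n
        for i, j in edges:                             # sparse A @ x
            ax[i] += x[j]
            ax[j] += x[i]
        rayleigh = sum(a * v for a, v in zip(ax, x))   # x' A x with x centered
        x = [sign * a + shift * v for a, v in zip(ax, x)]
    return rayleigh


def morans_i_bounds(n, edges):
    """Approximate (lower, upper) bounds of Moran's I for a binary graph."""
    s0 = 2 * len(edges)  # each undirected edge contributes two unit weights
    scale = n / s0
    return (scale * extreme_eigenvalue(n, edges, -1),
            scale * extreme_eigenvalue(n, edges, +1))


for name, directions in (("rook", ROOK), ("hex", HEX)):
    n_nodes, edge_list = lattice_edges(10, directions)
    lower, upper = morans_i_bounds(n_nodes, edge_list)
    print(f"{name}: lower = {lower:.3f}, upper = {upper:.3f}")
```

On these toy lattices the rook grid admits a lower bound near -1 because it is bipartite, so a perfect chessboard pattern exists, whereas the six-neighbor layout cannot get close to -1. Row-standardized weights or irregular tessellations would give different numbers, so this is an illustration rather than a substitute for the cited paper.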

2024-04-07

Alik Huseynov (08:27:51): > Thanks Lior! hopefully we will get it published before this year ends

U01UF27E9P0 (17:19:17):

2024-04-08

U01UF27E9P0 (17:29:55):

2024-04-09

U01UF27E9P0 (17:20:37):

2024-04-10

U01UF27E9P0 (17:22:53):

2024-04-11

U01UF27E9P0 (17:23:42):

2024-04-12

U01UF27E9P0 (17:16:46):

2024-04-13

U01UF27E9P0 (17:22:19):

2024-04-14

U01UF27E9P0 (19:36:57):

2024-04-15

U01UF27E9P0 (17:46:09):

2024-04-16

U01UF27E9P0 (17:33:03):

2024-04-17

U01UF27E9P0 (17:34:07):

2024-04-18

U01UF27E9P0 (17:37:48):

2024-04-19

U01UF27E9P0 (17:29:21):

2024-04-20

U01UF27E9P0 (17:28:22):

2024-04-21

U01UF27E9P0 (17:22:49):

2024-04-22

U01UF27E9P0 (17:26:23):

2024-04-23

U01UF27E9P0 (17:25:25): UA5GZMWHM (18:55:41): > https://doi.org/10.1101/2024.04.18.590091 - Attachment (bioRxiv): Spatial Transcriptomics in Mechanomics: New Horizons in Exploring the Mechanoregulation of Bone Regeneration > In over 100 years, the field of bone mechanobiology has sought experimental techniques to unravel the molecular mechanisms governing the phenomenon of mechanically-regulated bone (re)modelling. Each cell within a fracture site resides within different local micro-environments characterized by different levels of mechanical strain - thus, preserving the spatial location of each cell is critical in relating cellular responses to mechanical stimuli. Our spatial transcriptomics based “mechanomics” platform facilitates spatially-resolved analysis of the molecular profiles of cells with respect to their local in vivo mechanical environment by integrating time-lapsed in vivo micro-computed tomography, spatial transcriptomics, and micro-finite element analysis. We investigate the transcriptomic responses of cells as a function of the local strain magnitude by identifying the differential expression of genes in regions of high and low strain within a fracture site. Our platform thus has the potential to address fundamental open questions within the field and identify novel mechano-responsive targets to enhance bone regeneration. > > ### Competing Interest Statement > > The authors have declared no competing interest.

2024-04-24

U01UF27E9P0 (17:22:48):

2024-04-25

U01UF27E9P0 (17:35:21):

2024-04-26

U01UF27E9P0 (03:58:00): U01UF27E9P0 (03:58:00) (in thread): U01UF27E9P0 (03:58:13): U01UF27E9P0 (03:58:13) (in thread): U01UF27E9P0 (03:58:22): U01UF27E9P0 (03:58:22) (in thread): U01UF27E9P0 (17:29:33):

2024-04-27

U01UF27E9P0 (03:21:29): U01UF27E9P0 (03:21:30) (in thread): U01UF27E9P0 (17:19:20):

2024-04-28

U01UF27E9P0 (17:22:25):

2024-04-29

U01UF27E9P0 (05:44:17): U01UF27E9P0 (17:24:30):

2024-04-30

U01UF27E9P0 (17:20:00):

2024-05-01

U01UF27E9P0 (17:26:50):

2024-05-02

U01UF27E9P0 (17:31:39):

2024-05-03

U01UF27E9P0 (17:34:57):

2024-05-04

U01UF27E9P0 (17:22:29):

2024-05-05

U01UF27E9P0 (17:16:20):

2024-05-06

U01UF27E9P0 (17:23:43):

2024-05-07

U01UF27E9P0 (17:26:27):

2024-05-08

U01UF27E9P0 (17:37:22): U7T29M3DG (20:35:35): > FYI https://x.com/shyam_lab/status/1788240248390250764 - Attachment (X (formerly Twitter)): Shyam Prabhakar Lab (@shyam_lab) on X > Wonderful to see BANKSY incorporated into this @satijalab spatial data analysis workflow. Looking forward to trying out the other parts. Well deserved recognition for BANKSY’s developers @vipul1891 @NigelChouS @jleechung!

2024-05-09

U01UF27E9P0 (17:40:36):

2024-05-10

U01UF27E9P0 (17:24:11):

2024-05-11

U01UF27E9P0 (17:15:31):

2024-05-12

U01UF27E9P0 (17:17:56):

2024-05-13

U01UF27E9P0 (17:28:14):

2024-05-15

U01UF27E9P0 (00:25:22):

2024-05-17

U01UF27E9P0 (20:22:00):

2024-05-31

U01UF27E9P0 (10:53:55):

2024-06-06

U01UF27E9P0 (02:16:13):

2024-06-13

UA5GZMWHM (18:03:47): > Impact of spatial autocorrelation on frequency distribution: https://chjs.mat.utfsm.cl/volumes/09/01/Chun_Griffith(2018).pdf

UA5GZMWHM (18:08:49): > https://www.researchgate.net/profile/Daniel-Griffith/publication/312462981_Positive_spa[…]able-frequency-distributions.pdf?origin=publication_detail

U7T29M3DG (19:24:14): > This is very interesting; thanks for sending <@UA5GZMWHM>

UA5GZMWHM (21:04:12): > On data normalization for imaging based technologies, relevant to the Xenium vignette: https://www.biorxiv.org/content/10.1101/2023.08.30.555624v2

U7T29M3DG (22:24:55): > Yes, this paper was written based on <@U02EN9EQQ5U>’s ideas which Jean learned of when talking to Kayla at CSHL. :disappointed:

U02EN9EQQ5U (22:27:24) (in thread): > Yes - interesting that the manuscript hasn’t advanced far past the initial conversation that Jean and I had

2024-06-15

U05QPDL3TNF (14:05:21): > I’ve finished the Xenium vignette for voyagerpy. It all looks qualitatively similar to Voyager in R.

2024-06-17

UA5GZMWHM (00:15:34): > https://www.nature.com/articles/s41592-024-02299-2 - Attachment (Nature): The tidyomics ecosystem: enhancing omic data analyses > Nature Methods - tidyomics offers a software ecosystem for omic data manipulation and analysis that bridges Bioconductor with the tidyverse framework.

Alik Huseynov (15:03:57) (in thread): > I saw that paper before, but didn’t know the ideas were taken from you. Using cell volume is interesting but again depends on cell segmentation quality, probably considering segmented cell shape as another factor would be advantageous.

Alik Huseynov (16:11:50) (in thread): > And when using a segmentation-free approach, i.e. spatial binning on transcript coordinates, the resulting bins are all the same size and shape. What normalization approach would be best here, something like simple row standardization?
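
For readers unfamiliar with the segmentation-free approach mentioned above, a minimal sketch of square binning on transcript coordinates might look like the following. It assumes transcripts arrive as `(x, y, gene)` tuples in the same units as the bin size; the function name `bin_transcripts` and the toy data are hypothetical, and the sketch deliberately does not address the normalization question raised in the thread.

```python
import math
from collections import Counter


def bin_transcripts(transcripts, bin_size):
    """Count transcripts per (bin_x, bin_y, gene) on a square grid.

    `transcripts` is an iterable of (x, y, gene) tuples (assumed layout);
    every bin has the same size and shape by construction.
    """
    counts = Counter()
    for x, y, gene in transcripts:
        key = (math.floor(x / bin_size), math.floor(y / bin_size), gene)
        counts[key] += 1
    return counts


# Toy transcript spots (made up for illustration only)
spots = [
    (0.5, 0.5, "A"),
    (1.2, 0.3, "A"),
    (9.9, 0.1, "B"),
    (10.0, 0.0, "B"),
    (12.5, 14.0, "A"),
]
counts = bin_transcripts(spots, bin_size=10.0)
for key in sorted(counts):
    print(key, counts[key])
```

Because all bins cover the same area, the total count per bin differs only through transcript density, which is why the choice of normalization is an open question here rather than something this sketch settles.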

UA5GZMWHM (20:13:22): > https://doi.org/10.1101/2024.05.17.594641

2024-06-26

U7T29M3DG (17:11:11): > Hi <@UA5GZMWHM> I’m just meeting with <@U05QPDL3TNF> as I’ve just returned from my travels. Shall I start reading / editing the SFE paper so that Joe can then move on to continuing with Voyager?

UA5GZMWHM (23:43:50): > You can edit the text. I’m also traveling right now. Just moved out yesterday and I’m now in San Francisco

2024-06-27

Alik Huseynov (06:00:36) (in thread): > I updated the comparison table and will be editing and adding what’s missing as well.

2024-07-06

U01UF27E9P0 (03:40:19): U01UF27E9P0 (22:33:02):

2024-07-07

U01UF27E9P0 (06:02:47): U01UF27E9P0 (06:21:37):

2024-07-16

U01UF27E9P0 (17:30:53) (in thread):

2024-07-19

UA5GZMWHM (12:01:01): > https://doi.org/10.1101/2024.05.31.596908 - Attachment (bioRxiv): SpaNorm: spatially-aware normalisation for spatial transcriptomics data > Library size normalisation is necessary to enable comparisons between observations in transcriptomic datasets. Numerous methods have been developed to normalise these effects with sample and gene specific adjustments. However, in spatial transcriptomics data, normalisation is complicated by the fact that spatial region-specific library size confounds biology. The most popular approach of adapting methods developed for single-cell RNA-seq data has been shown to excessively remove biological signals associated with spatial domains and thus results in poorer downstream domain identification. To this end, we propose the first spatially-aware normalisation method, SpaNorm. SpaNorm concurrently models spatial library size effects and the underlying smooth biology, to tease apart these effects, and thereby remove library size effects without removing biology. This is achieved through optimal decomposition of spatially smooth variation into those related and unrelated to library size and the use of location-specific scaling factors. Using 27 tissue samples from 6 datasets spanning 4 spatial platforms, we show that SpaNorm outperforms current state of the art methods at retaining biological information in the form of spatial domains and spatially variable genes (SVGs) better than 4 commonly used single-cell normalisation approaches. SpaNorm is versatile and it can be used for both spot-based and subcellular spatial transcriptomics data. Notably, the benefit of using SpaNorm is more pronounced for the latter data such as those from Xenium, STOmics and CosMx platforms for which the proportion of genes exhibiting region-specific library size effect is higher. SpaNorm works equally well with segmented cell-level data and spot-based data where each spot contains multiple cells. 
> > ### Competing Interest Statement > > The authors have declared no competing interest.

Alik Huseynov (13:41:41) (in thread): > Interesting that they are using thin-plate splines. I used those a long time ago to map 3D surfaces/meshes.

2024-07-21

U01UF27E9P0 (20:55:27): U01UF27E9P0 (20:55:27) (in thread):

2024-07-22

UA5GZMWHM (09:03:19): > I’m going to present the updates of SFE and Voyager at Bioc2024 on July 24. Workshop website here: https://lambdamoses.github.io/SFEWorkshop2024/index.html There should be an interactive instance on https://workshop.bioconductor.org/ coming up soon, run out of a Docker container, but it hasn’t been deployed yet. - Attachment (lambdamoses.github.io): SFEWorkshop2024 > This workshop showcases new features in SpatialFeatureExperiment and Voyager, including support for transcript spot geometries in rowGeometries, plotting transcript spots, aggregating transcript spots with a spatial grid or any geometry, aggregating cell level data with a spatial grid or any geometry, splitting SFE objects with geometries, and improved support for images and affine transformations.

2024-07-29

U01UF27E9P0 (17:48:23) (in thread): U01UF27E9P0 (17:49:24): U01UF27E9P0 (17:49:24) (in thread):

2024-07-30

UA5GZMWHM (13:31:30): > There was a talk about this at the Bioc2024 conference and I met the author. The most important thing I learnt: there’re 6 Visium barcodes that consistently have low UMI counts due to barcode sequence motif.https://doi.org/10.1101/2024.06.06.597765 - Attachment (bioRxiv): SpotSweeper: spatially-aware quality control for spatial transcriptomics > Quality control (QC) is a crucial step to ensure the reliability and accuracy of the data obtained from RNA sequencing experiments, including spatially-resolved transcriptomics (SRT). Existing QC approaches for SRT that have been adopted from single-nucleus RNA sequencing (snRNA-seq) methods are confounded by spatial biology and are inappropriate for SRT data. In addition, no methods currently exist for identifying histological tissue artifacts unique to SRT. Here, we introduce SpotSweeper, spatially-aware QC methods for identifying local outliers and regional artifacts in SRT. SpotSweeper evaluates the quality of individual spots relative to their local neighborhood, thus minimizing bias due to biological heterogeneity, and uses multiscale methods to detect regional artifacts. Using SpotSweeper on publicly available data, we identified a consistent set of Visium barcodes/spots as systematically low quality and demonstrate that SpotSweeper accurately identifies two distinct types of regional artifacts, resulting in improved downstream clustering and marker gene detection for spatial domains. > > ### Competing Interest Statement > > The authors have declared no competing interest. > > * DLPFC > : dorsolateral prefrontal cortex > MAD > : median absolute deviation > QC > : quality control > sc/snRNA-seq > : single-cell/nucleus RNA-sequencing > SRT > : spatially-resolved transcriptomics > WM > : white matter

UA5GZMWHM (15:10:41): > https://arxiv.org/abs/2405.18779 - Attachment (arXiv.org): Categorization of 31 computational methods to detect spatially variable genes from spatially resolved transcriptomics data > In the analysis of spatially resolved transcriptomics data, detecting spatially variable genes (SVGs) is crucial. Numerous computational methods exist, but varying SVG definitions and methodologies lead to incomparable results. We review 31 state-of-the-art methods, categorizing SVGs into three types: overall, cell-type-specific, and spatial-domain-marker SVGs. Our review explains the intuitions underlying these methods, summarizes their applications, and categorizes the hypothesis tests they use in the trade-off between generality and specificity for SVG detection. We discuss challenges in SVG detection and propose future directions for improvement. Our review offers insights for method developers and users, advocating for category-specific benchmarking.

Alik Huseynov (16:15:49) (in thread): > Thanks! That is an interesting paper.

UA5GZMWHM (17:14:01): > Just discovered a really cool R package: https://github.com/isciences/exactextractr

2024-08-07

U01UF27E9P0 (17:32:19): U01UF27E9P0 (17:32:19) (in thread): U01UF27E9P0 (17:33:17): U01UF27E9P0 (17:33:18) (in thread):

2024-08-26

U01UF27E9P0 (15:30:50):

2024-08-27

Nilesh Kumar (11:31:50): > @Nilesh Kumar has joined the channel

NILESH KUMAR (11:37:13): > @NILESH KUMAR has joined the channel

U01UF27E9P0 (17:18:08):

2024-08-28

U01UF27E9P0 (18:06:02):

2024-09-09

UA5GZMWHM (17:56:02): > I’ve been working on a new project, but I need to focus on the Voyager project again to prepare for the next Bioconductor release this October. I need to submit alabaster.sfe to Bioconductor and deal with the arrow object size error for Xenium v3. I also need to update the documentation website and vignettes.

UA5GZMWHM (18:00:08): > https://bioconductor.org/developers/release-schedule/ - Attachment (bioconductor.org): Bioconductor - Release: Schedule > The Bioconductor project aims to develop and share open source software for precise and repeatable analysis of biological data. We foster an inclusive and collaborative community of developers and data scientists.

2024-09-19

U01UF27E9P0 (15:48:36):

2024-09-23

U01UF27E9P0 (15:39:24):

2024-10-08

Alik Huseynov (03:59:06): > Hybrid hackathon on image-based spatial omics. Languages can be R or Python. https://spatialhackathon.github.io/ <@U7T29M3DG> <@U02EN9EQQ5U> <@U05QPDL3TNF> this might be interesting for you since one of the main points is on data normalization of imST. 10X, NanoString and Vizgen will contribute datasets as well. <@UA5GZMWHM> please share with people at Columbia - Attachment (ELIXIR-Germany SpaceHack): Home > Github Pages for SpatialHackathon

U02EN9EQQ5U (12:49:01) (in thread): > Thanks for sharing @Alik Huseynov

2024-10-10

Alik Huseynov (15:25:09): > I would like to share this recent talk from the SpatialData devs: https://www.youtube.com/watch?v=LjHf451ICPY Interoperability is planned with SFE and VoyagerPy as well. <@U05QPDL3TNF> and <@U7T29M3DG> a few questions: > * would some of the geospatial Python implementations (https://py.geocompx.org/ and https://geographicdata.science/book/intro.html) be used in VoyagerPy? > * any updates on VoyagerPy development? > Thanks! - Attachment (YouTube): Developing data infrastructures and analytical systems for spatial omics

U05QPDL3TNF (17:01:36): > Hi Alik. Thanks for sharing - I’m excited to listen. > > Voyagerpy generally uses geopandas, shapely, and scipy for geospatial statistical analysis and image geometries. I translated a Visium and a Xenium vignette from R to Python, and am waiting until the Voyager paper is near submission to add a few more vignettes (especially Visium HD and MERFISH).

2024-10-11

Alik Huseynov (04:30:00): > Great, Thanks Joe!

2024-10-15

Alik Huseynov (05:51:43) (in thread): > FYI, I just opened this on SpatialData-VoyagerPy integration -> https://github.com/pachterlab/voyagerpy/issues/29 - Attachment: #29 test SpatialData-VoyagerPy integration > Hi @josephrich98, as for Xenium vignette, I think one can relatively easily make SpatialData-VoyagerPy integration. Since the object is AnnData in any case, and after loading the data, SpatialData-specific .zarr file can be stored.
> Xenium output is read with spatialdata_io.xenium (NOTE: Xenium-specific .zarr files are never used as input)
> Short example: > > > from spatialdata_io import xenium > import spatialdata as sd > sdata = xenium("path/to/xenium_outs") > # save as .zarr to disk > sdata.write("xenium.zarr") > > # re-load .zarr file if needed > sdata = sd.read_zarr("xenium.zarr") > sdata["table"] # AnnData object, use for the Xenium vignette > > > > To convert from SpatialData to AnnData, basically <https://spatialdata.scverse.org/en/stable/tutorials/notebooks/notebooks/examples/squidpy_integration.html#example-of-conversion-from-spatialdata-to-the-legacy-anndata-showing-also-that-affine-transformations-are-handled|this one> > > > sdata = sd.read_zarr("xenium.zarr") > from spatialdata_io.experimental import from_legacy_anndata, to_legacy_anndata > # convert to AnnData > adata = to_legacy_anndata(sdata, include_images=True) > > > > <https://spatialdata.scverse.org/en/stable/tutorials/notebooks/notebooks/examples/squidpy_integration.html#xenium-example|xenium example> with squidpy integration

2024-10-22

UA5GZMWHM (20:04:00): > @Alik Huseynov <@U05QPDL3TNF> <@U7T29M3DG> Recently Alik and I have been working on some Visium HD functionalities. I’m developing new software infrastructure and an approach to ESDA and spatial QC based on unpublished Visium HD data from a collaborator, Yvon, though I’ll use a dataset from the 10X website for the vignette. I still haven’t wrapped up the Voyager and SFE papers due to perfectionism. I’m debating whether the Visium HD part should be included in the papers or should be a separate paper. Shall we talk on Zoom sometime next week about the scope of the existing SFE and Voyager papers? Bianca and Yvon would also like to join. Bianca prefers next Tuesday or Thursday and not Wednesday.

U05QPDL3TNF (20:41:09) (in thread): > I’m flexible to meet next week and glad to hear things are moving forward!

2024-10-23

Alik Huseynov (00:26:02) (in thread): > Great! Thursday works for me, flexible there.

U7T29M3DG (01:15:06) (in thread): > Next Thursday is good for me.

UA5GZMWHM (01:18:18) (in thread): > Sounds good

Alik Huseynov (01:29:49) (in thread): > Looking forward!

UA5GZMWHM (14:16:20) (in thread): > What time would work for you?

2024-10-24

Alik Huseynov (02:19:27) (in thread): > Between 17:00-19:00 CET

2024-10-29

UA5GZMWHM (12:18:41) (in thread): > <@U7T29M3DG>Are you available at that time?

U7T29M3DG (14:52:42) (in thread): > Thursday 10:00am PST works for me, and I think that’s in @Alik Huseynov’s window.

UA5GZMWHM (15:29:34) (in thread): > I sent you a calendar invite

Alik Huseynov (18:01:50) (in thread): > I got the invite, thanks.

2024-10-31

UA5GZMWHM (13:19:02): > What needs to be done for the papers: > 1. Compatibility tests for Xenium > 2. Update the documentation website > 3. Write the manuscripts (already pinned in this channel) > We’ll meet again this time in two weeks to check progress.

UA5GZMWHM (13:19:22): > <@U02EN9EQQ5U> may join next time if interested

U02EN9EQQ5U (13:19:55) (in thread): > Sure - I’ll be sure to add my availability for the next planning meeting.

2024-11-04

U01UF27E9P0 (23:14:36):

2024-11-05

U01UF27E9P0 (18:00:06) (in thread):

2024-11-08

UA5GZMWHM (17:29:04): > The alabaster.sfe package has been submitted to Bioconductor https://pachterlab.github.io/alabaster.sfe/ - Attachment (pachterlab.github.io): Language agnostic on disk serialization of SpatialFeatureExperiment > Builds upon the existing ArtifactDB project, expending alabaster.spatial for language agnostic on disk serialization of SpatialFeatureExperiment.

2024-11-14

UA5GZMWHM (11:54:28): > I’m updating the documentation websites. Now Aaron Lun has made a lot of progress on libscran. Now we can call the same underlying C++ library to do basic QC, find highly variable genes, and do clustering and DE from R and Python. This should make compatibility testing much easier. https://github.com/libscran/scrapper?tab=readme-ov-file https://github.com/BiocPy/scranpy

UA5GZMWHM (11:54:55): > That said we should still write out how exactly the HVG and DE methods work in this package, the way we did for Seurat and scanpy
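
A minimal sketch of what one such compatibility check could look like in Python, assuming the per-gene statistics and HVG lists have already been exported from the R and Python runs. The function names, argument names, and the tolerance are hypothetical and for illustration only; this does not call scrapper or scranpy and is not the project's actual test code.

```python
import numpy as np


def jaccard(a, b):
    """Jaccard index of two collections of gene names (1.0 if both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def compare_results(r_values, py_values, r_hvgs, py_hvgs, rtol=1e-6):
    """Summarize agreement between an R run and a Python run.

    r_values / py_values: per-gene statistics in the same gene order
    (hypothetical inputs, e.g. variances exported from each language).
    r_hvgs / py_hvgs: names of the genes each run selected as highly variable.
    """
    r_values = np.asarray(r_values, dtype=float)
    py_values = np.asarray(py_values, dtype=float)
    return {
        "values_match": bool(np.allclose(r_values, py_values, rtol=rtol)),
        "max_abs_diff": float(np.max(np.abs(r_values - py_values))),
        "hvg_jaccard": jaccard(r_hvgs, py_hvgs),
    }


# Toy inputs standing in for exported results
report = compare_results(
    r_values=[1.0, 2.0],
    py_values=[1.0, 2.0 + 1e-9],
    r_hvgs=["A", "B", "C"],
    py_hvgs=["B", "C", "D"],
)
print(report)
```

If both languages really do call the same C++ code, the numeric check should pass at a tight tolerance, so any mismatch would point at differences in preprocessing or defaults rather than in the algorithm itself.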

U01UF27E9P0 (12:01:29): U01UF27E9P0 (12:23:54): U01UF27E9P0 (13:12:14): U7T29M3DG (21:28:56): > What was the Python analog of alabaster that you mentioned?

UA5GZMWHM (22:12:54): > dolomite, for example https://github.com/ArtifactDB/dolomite-base

2024-11-15

U01UF27E9P0 (19:18:59) (in thread):

2024-11-21

U01UF27E9P0 (21:07:11): UA5GZMWHM (21:10:03): > I FINALLY got the documentation website to build successfully. Here’s the new Xenium vignette: https://pachterlab.github.io/voyager/articles/vig5_xenium.html I decided not to go further down the rabbit hole since it’s already pretty long; the multi-scale ESDA stuff should be a separate vignette.

U7T29M3DG (21:13:38): > WOW!!!

U7T29M3DG (21:13:43): > That’s a beautiful vignette

2024-11-22

UA5GZMWHM (17:21:29): > I need y’all to do more work on Voyager. Bianca isn’t very happy with me spending all my time on Voyager because she needs me to do my new project to get tenure.

U02EN9EQQ5U (17:25:24) (in thread): > At our next meeting, can we spend some time discussing the state of the project and timeline/plan for the upcoming features? We can decide what we can take off your plate

UA5GZMWHM (17:26:19) (in thread): > Yes; you can see some of the wish list of new features in the GitHub issues of SFE and Voyager

U7T29M3DG (22:06:34): > Hi <@UA5GZMWHM>

U7T29M3DG (22:08:03): > A few things: first, as <@U02EN9EQQ5U> said we’ll make a plan. Second, Bianca makes a fair point, and I think one thing is to see if she can contribute to the project, which undoubtedly she can as she is brilliant, and then she can coauthor on it (and it will help with her tenure!) To that end, we should make sure our next meeting includes her. Third, I do think we’re close to getting out our paper so hopefully it won’t be much longer.

UA5GZMWHM (22:16:50): > These are some time-consuming things (not new Visium HD-related on disk functionalities) that may help with the paper: > 1. For the new Voyager paper, a case study or vignette on spatial autocorrelation at different length scales. Aggregate a single cell resolution dataset into spatial bins of various sizes (aggregate method for SFE), compute Moran’s I and MULTISPATI, and compare results across resolutions. > 2. Rewrite the existing vignette that applies local Moran’s I on the k-nearest neighbor graph in gene expression space so it’s more like MERINGUE and does not depend on clustering > 3. Edit existing R basic Visium, 10xv3, and Xenium vignettes to use scrapper and edit the corresponding Python vignettes to use scranpy and update compatibility tests > 4. Vignette/case study with multiple samples (I have a Visium dataset from mouse adipose to get to SFEData) comparing Moran’s I and MULTISPATI > Would anyone be interested?
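
A rough Python sketch of the idea in item 1, assuming cells are points with one expression value each, square bins, and binary rook (edge-sharing) neighbor weights on the bin grid. The helper names `bin_sums` and `morans_i_grid` are made up for this illustration; this is not the SFE `aggregate` method or Voyager's Moran's I implementation, MULTISPATI is left out, and the data are simulated.

```python
import numpy as np


def bin_sums(x, y, values, bin_size, extent):
    """Sum per-cell `values` into square bins of side `bin_size` over [0, extent)^2."""
    n_bins = int(np.ceil(extent / bin_size))
    # Clamp so that a point lying exactly on the upper edge stays in the last bin
    ix = np.minimum((x // bin_size).astype(int), n_bins - 1)
    iy = np.minimum((y // bin_size).astype(int), n_bins - 1)
    grid = np.zeros((n_bins, n_bins))
    np.add.at(grid, (iy, ix), values)  # unbuffered add handles repeated indices
    return grid


def morans_i_grid(grid):
    """Global Moran's I on a regular grid with binary rook weights.

    I = (N / W) * sum_ij w_ij z_i z_j / sum_i z_i^2, where z is the centered grid.
    Undefined (division by zero) if all bins hold the same value.
    """
    z = grid - grid.mean()
    # Horizontal and vertical neighbor pairs, each counted in both directions
    cross = 2.0 * ((z[:, :-1] * z[:, 1:]).sum() + (z[:-1, :] * z[1:, :]).sum())
    n_rows, n_cols = grid.shape
    w_sum = 2.0 * (n_rows * (n_cols - 1) + (n_rows - 1) * n_cols)
    return (grid.size / w_sum) * cross / (z ** 2).sum()


# Simulated stand-in for a single cell resolution dataset (not real data)
rng = np.random.default_rng(0)
extent = 100.0
x = rng.uniform(0, extent, 5000)
y = rng.uniform(0, extent, 5000)
expr = rng.poisson(1 + 4 * x / extent)  # expression increases along x

for bin_size in (5.0, 10.0, 25.0):
    grid = bin_sums(x, y, expr, bin_size, extent)
    print(f"bin size {bin_size:>5}: Moran's I = {morans_i_grid(grid):.3f}")
```

Running the same statistic over several bin sizes is the comparison across resolutions described in the list; a real case study would swap in the actual aggregation and spatial neighborhood graph.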

2024-11-23

U01UF27E9P0 (01:58:43): U01UF27E9P0 (14:06:28):

2024-11-24

U01UF27E9P0 (08:31:40):

2024-11-25

UA5GZMWHM (10:39:29) (in thread): > Please claim if you’re interested

2024-12-02

U01UF27E9P0 (16:38:35): U02EN9EQQ5U (17:08:15): > Meeting link: https://columbiauniversity.zoom.us/j/92934036311?pwd=AfcvDaNA4HIlQgzVhmAaMIeel8MqbH.1

Unknown User (17:38:21): > [Unsupported block type: call]

UA5GZMWHM (17:49:48): > @Alik Huseynov Lior would like to do a work session on Zoom for 2 hours for us to work together. He’s proposing December 16, 11 am PST. Would this time work for you?

2024-12-03

Alik Huseynov (08:44:51) (in thread): > any alternative date after Dec 16th?

UA5GZMWHM (21:44:09): > <@U7T29M3DG> Alik is available on December 17 or later. Can you choose another date for the work session?

U7T29M3DG (21:47:59): > Yes. December 18th is good. 10am PST Dec. 18?

2024-12-04

Alik Huseynov (02:41:35): > Dec. 18th, perfect! Thanks

2024-12-18

UA5GZMWHM (13:02:06): > <@U7T29M3DG> Do you have a zoom link for the work session?

Unknown User (13:03:11): > [Unsupported block type: call]

U7T29M3DG (13:03:32): > My 8:30 meeting this morning ran late and I’m walking over to the office now. So I’ll be there in a few minutes

UA5GZMWHM (13:10:53): > @Alik Huseynov

U7T29M3DG (13:13:54): > <@U05QPDL3TNF>

U7T29M3DG (13:15:13): > https://docs.google.com/document/d/11ps2j_Ds2CIt0InMM1qKws7YncMc5DUDD0Qeyr36UYg/edit?tab=t.0

2024-12-19

U01UF27E9P0 (22:37:10):

2024-12-23

Unknown User (13:01:12): > [Unsupported block type: call]

2025-01-16

UA5GZMWHM (15:32:42): > No pressure if you have lost precious things to the fire or have evacuated. I wrote more in the supplementary methods of the SFE paper and made the supplementary figures. Shall we meet again sometime next week to discuss the SFE paper and make some edits before submitting it?

UA5GZMWHM (15:33:59): > Also, I want to reorganize some of the code Joe used in the Seurat vs. Scanpy paper into a package to compare single cell analysis results, tentatively called scCompare. Or can you come up with a better name?

U05QPDL3TNF (16:08:50): > That sounds good. My schedule is flexible to meet. > > And I think the idea for scCompare is great. I can already envision how we can organize it. Perhaps we can discuss more in our next SFE meeting?

Alik Huseynov (17:21:30): > If next week Mo or Fr works after 7pm CET, I can join the call on SFE paper

2025-01-21

UA5GZMWHM (13:27:14): > Can we do Thursday?

U7T29M3DG (15:56:59): > Thursday this week?

UA5GZMWHM (16:15:30): > Yes

2025-01-23

UA5GZMWHM (09:57:14): > <@U7T29M3DG> Can y’all comment on the figures of the SFE paper? If you have no objections to the contents of the figures, then I’ll go make the final cosmetic changes.

U7T29M3DG (12:03:59): > Hi <@UA5GZMWHM> I will take a final look today

2025-01-29

UA5GZMWHM (20:43:41): > Shall we schedule an online meeting to write the cover letter for the SFE paper?

UA5GZMWHM (21:16:49): > https://www.biorxiv.org/content/10.1101/2024.12.06.627195v2 - Attachment (bioRxiv): MuSpAn: A Toolbox for Multiscale Spatial Analysis > The generation of spatial data in biology has been transformed by multiplex imaging and spatial-omics technologies, such as single cell spatial transcriptomics. These approaches permit a detailed mapping of cell populations and phenotypes within the tissue context, which reveals that tissues are complex ecosystems that include multiple organisational structures over different length scales. Quantitative methods for maximising the information that can be retrieved from these images have not kept pace with technological advances in platforms, and no standard methodology has emerged for spatial data analysis. Proposed pipelines are often tailored to individual studies, leading to a fragmented landscape of available methods, and no clear guidance about which statistical tools are best suited to a particular research question. > > In response to these challenges, we present MuSpAn, a Multiscale Spatial Analysis package designed to provide straightforward access to well-established and cutting-edge mathematical tools for analysing spatial data. MuSpAn provides easy to use, flexible, and interactive access to quantitative methods developed from mathematical fields that include spatial statistics, topological data analysis, network theory, geometry, probability and ecology. Users can construct custom pipelines from tools across these fields to address specific biological problems, or conduct unbiased exploration of data for discovery spatial biology. In summary, MuSpAn is an extensive platform which enables multiscale analysis of spatial data, ranging from the subcellular to the tissue-scale. > > ### Competing Interest Statement > > MuSpAn is released under an Academic Use licence, and is therefore available to licence commercially.

2025-01-30

U01UF27E9P0 (07:59:23): U01UF27E9P0 (10:28:04) (in thread): UA5GZMWHM (10:32:53): > alabaster.sfe is accepted by Bioconductor https://github.com/Bioconductor/Contributions/issues/3650#issuecomment-2624103974 - Attachment: Comment on #3650 alabaster.sfe > This package has a great vignette that demonstrates many of the features of the serialization approach. Test coverage is very good. I would say this is ready to accept. It looks like SFEData uses RDS to ship its example SFEs via ExperimentHub. Do you have a plan to use the language-agnostic serialization for a next-generation SFEData? And have you worked on python client code that can ingest the new serializations? @lshep this is accepted.

Alik Huseynov (12:03:57) (in thread): > I could do Wed 05th or Friday 07th after 19:00 CET.

UA5GZMWHM (12:28:01) (in thread): > <@U7T29M3DG> does it work for you?

2025-02-03

UA5GZMWHM (12:10:10): > https://doi.org/10.1101/2024.12.13.628390

2025-02-04

U01UF27E9P0 (20:21:18):

2025-02-07

U01UF27E9P0 (12:31:18): U01UF27E9P0 (12:31:19) (in thread): U01UF27E9P0 (12:51:29): UA5GZMWHM (20:50:00): > Do you think the Voyager documentation website should have a FAQ page?

2025-02-10

UA5GZMWHM (01:37:14): > <@U7T29M3DG> @Alik Huseynov Here’s the SFE cover letter: https://docs.google.com/document/d/157myWoJhqzwkEi2dBOtQ3f8vFYvUYuq9/edit?usp=sharing&ouid=108696065868254037245&rtpof=true&sd=true

U7T29M3DG (13:09:15): > Looks great!

Alik Huseynov (16:09:15): > Nice cover letter!

2025-02-11

U01UF27E9P0 (04:23:55): U01UF27E9P0 (11:47:09) (in thread): U01UF27E9P0 (11:49:43) (in thread): U01UF27E9P0 (21:03:22) (in thread):

2025-02-20

UA5GZMWHM (23:36:00): > I moved the supplementary info into a separate document: https://docs.google.com/document/d/1up18fZVutJZxpIugD56HqrxvmbSqse0DZemLKqanmVI/edit?usp=sharing

2025-02-21

UA5GZMWHM (01:32:19): > Again, the main text of the SFE paper is here: https://docs.google.com/document/d/11ps2j_Ds2CIt0InMM1qKws7YncMc5DUDD0Qeyr36UYg/edit?usp=sharing

UA5GZMWHM (01:34:09): > <@U7T29M3DG> @Alik Huseynov <@U05QPDL3TNF> I did some more edits. Do you agree or disagree with posting it on bioRxiv and submitting it? Is there anything else you want before submitting?

Alik Huseynov (12:22:35): > Having it on bioRxiv would be great.

U7T29M3DG (17:19:48): > The paper looks good to me Lambda. Go ahead and submit! (can you give access on the supp info to lakigigar@gmail.com)?

2025-02-24

U01UF27E9P0 (10:02:24):

2025-02-26

U01UF27E9P0 (07:18:58) (in thread): U7T29M3DG (11:13:37): > Good news. Major hurdle cleared. The paper is out for review at Nature Methods.

2025-02-27

Alik Huseynov (09:40:52) (in thread): > very fast, great! Hopefully we will know the result soon

UA5GZMWHM (09:46:28): > BioRxiv seems to be behind

U7T29M3DG (11:10:18): > Yeah… it can sometimes take a few days.

U7T29M3DG (16:10:51): > Manuscript now screened

2025-02-28

Alik Huseynov (03:32:47): > it is now on BioRxiv https://www.biorxiv.org/content/10.1101/2025.02.24.640007v1.full.pdf

2025-03-10

Bernie Mulvey (23:53:07): > @Bernie Mulvey has joined the channel

Bernie Mulvey (23:53:38): > @Bernie Mulvey has left the channel

2025-03-17

U01UF27E9P0 (10:05:12):

2025-03-27

U01UF27E9P0 (10:32:39):

2025-03-28

U01UF27E9P0 (05:47:31): U01UF27E9P0 (05:47:31) (in thread):

2025-04-05

U01UF27E9P0 (14:50:21):

2025-04-09

U7T29M3DG (14:00:29): - File (Plain Text): Untitled

U7T29M3DG (14:01:09): > Sorry, I’m not sure all reviews came through

U7T29M3DG (14:01:11): > Pasting them below

U7T29M3DG (14:01:31): - File (Plain Text): Untitled

2025-04-13

U7T29M3DG (15:05:18): > Hi <@UA5GZMWHM>

U7T29M3DG (15:05:27): > I’d like to schedule a zoom call this week to discuss the reviews; when is convenient for you?

UA5GZMWHM (15:06:20): > Not Tuesday 4:39-6 pm EDT and not Friday. Other times work

U7T29M3DG (16:18:39): > How about Wednesday 9:30am PST (=12:30pm EST)?

UA5GZMWHM (23:13:44): > Sounds good

2025-04-14

Alik Huseynov (17:12:05) (in thread): > Hi Lior and Lambda, > I can go through the comments over the Easter holidays.

2025-04-16

Unknown User (12:27:17): > [Unsupported block type: call]

UA5GZMWHM (12:30:50): > Response to reviews: https://docs.google.com/document/d/1QJMMubCtQcAUq9ZcAaaZgbTY1Jko6b2eo4TllbOEHuI/edit?tab=t.0

UA5GZMWHM (12:42:37): > https://github.com/ArtifactDB/dolomite-sfe

U05QPDL3TNF (12:52:54): > sfe to anndata github request: https://github.com/pachterlab/SpatialFeatureExperiment/issues/56 - Attachment: #56 Conversion of sfe Object to AnnData > Hello, > > I wanted to ask if there is a built-in function to convert a sfe object to AnnData object, or if there is a way to save the sfe object as a h5ad file to facilitate interoperability with Python. > > Thanks!