Functions for querying the HGNC REST API
Marcel Ramos
CUNY School of Public Health, New York, NY
9 May 2025
HGNCREST
The HGNCREST package provides functions for querying the
HGNC REST API. The functions follow the HUGO Gene Nomenclature Committee
(HGNC) REST API documentation at https://www.genenames.org/help/rest/. There are three
main operations that can be performed with this package:
- fetching general information about the HGNC database (hgnc_info)
- fetching information about a specific gene (hgnc_fetch)
- searching for genes based on a query (hgnc_search)
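For orientation, the three operations correspond to the info, fetch, and search endpoints described in the HGNC REST documentation. The sketch below only builds request URLs with base R and sends nothing over the network. `hgnc_url()` is a hypothetical helper written for illustration, not part of HGNCREST, and the base URL `https://rest.genenames.org` is assumed from the HGNC documentation rather than taken from this package.

```r
# Hypothetical helper (not part of HGNCREST): build an HGNC REST URL.
# Assumes the base URL given in the HGNC REST documentation.
hgnc_url <- function(endpoint = c("info", "fetch", "search"),
                     field = NULL, value = NULL,
                     base = "https://rest.genenames.org") {
    endpoint <- match.arg(endpoint)
    # The info endpoint takes no further path components
    if (identical(endpoint, "info"))
        return(paste(base, endpoint, sep = "/"))
    # fetch and search both need a field and a value (or query)
    stopifnot(is.character(field), length(field) == 1L,
              is.character(value), length(value) == 1L)
    paste(base, endpoint, field, value, sep = "/")
}

hgnc_url("info")
hgnc_url("fetch", "symbol", "BRCA1")
hgnc_url("search", "symbol", "BRCA1")
```

The package functions presumably wrap requests of this shape and parse the responses into R objects, so users should not need to build URLs themselves.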
Installation
if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("waldronlab/HGNCREST")
Package load
library(HGNCREST)
General information
The hgnc_info function returns general information about
the HGNC database. It includes searchableFields and
storedFields metadata.
## $lastModified
## [1] "2025-05-06T13:38:21.968Z"
##
## $numDoc
## [1] 45901
##
## $responseHeader
## $responseHeader$QTime
## [1] 1
##
## $responseHeader$status
## [1] 0
##
##
## $searchableFields
## [1] "hgnc_id" "rna_central_id" "alias_name" "locus_group"
## [5] "symbol" "location" "mane_select" "name"
## [9] "rgd_id" "entrez_id" "status" "uniprot_ids"
## [13] "alias_symbol" "ccds_id" "omim_id" "ucsc_id"
## [17] "mgd_id" "prev_symbol" "curator_notes" "refseq_accession"
## [21] "ena" "locus_type" "ensembl_gene_id" "vega_id"
## [25] "prev_name"
##
## $storedFields
## [1] "locus_type" "horde_id" "bioparadigms_slc"
## [4] "enzyme_id" "prev_name" "date_symbol_changed"
## [7] "refseq_accession" "mgd_id" "homeodb"
## [10] "omim_id" "alias_name" "gtrnadb"
## [13] "pubmed_id" "alias_symbol" "date_approved_reserved"
## [16] "ccds_id" "location" "name"
## [19] "uuid" "lsdb" "status"
## [22] "alias_name" "_version_" "cosmic"
## [25] "rna_central_id" "date_name_changed" "ensembl_gene_id"
## [28] "vega_id" "mirbase" "location"
## [31] "prev_symbol" "curator_notes" "cd"
## [34] "mamit-trnadb" "ena" "lncipedia"
## [37] "snornabase" "prev_name" "gene_group"
## [40] "merops" "ucsc_id" "uniprot_ids"
## [43] "imgt" "symbol" "mane_select"
## [46] "rgd_id" "entrez_id" "date_modified"
## [49] "lncrnadb" "gencc" "locus_group"
## [52] "orphanet" "iuphar" "hgnc_id"
## [55] "agr" "gene_group_id" "pseudogene.org"
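Because the result is an ordinary list, its elements can be compared with base R set operations. The sketch below finds fields that are stored in records but cannot be used as a search key. To run without network access it uses a mock `info` list holding a small hand-copied subset of the output above; in a live session `info` would presumably be the value returned by hgnc_info.

```r
# Mock of the list structure shown above, using a small hand-copied
# subset of values (an assumption made so the example runs offline).
info <- list(
    numDoc = 45901L,
    searchableFields = c("hgnc_id", "symbol", "name", "status", "ena",
                         "location"),
    storedFields = c("hgnc_id", "symbol", "name", "status", "ena",
                     "uuid", "pubmed_id", "location", "location")
)

# Stored fields can contain repeats (e.g. "location" above), so deduplicate
stored <- unique(info$storedFields)

# Fields that are returned in records but cannot be used as a search key
not_searchable <- setdiff(stored, info$searchableFields)
not_searchable
```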
Searchable fields
The searchableFields function is a convenience function
that returns a character vector of searchable fields in the HGNC
database.
## [1] "ccds_id" "alias_symbol" "uniprot_ids" "status"
## [5] "entrez_id" "rgd_id" "name" "mane_select"
## [9] "symbol" "location" "locus_group" "alias_name"
## [13] "hgnc_id" "rna_central_id" "prev_name" "vega_id"
## [17] "locus_type" "ensembl_gene_id" "ena" "refseq_accession"
## [21] "curator_notes" "prev_symbol" "mgd_id" "ucsc_id"
## [25] "omim_id"
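A vector like this is handy for checking a field name before sending a request. The following is a minimal sketch under stated assumptions: `fields` is a hand-copied subset of the output above (in a live session it would come from searchableFields), and `check_field()` is a hypothetical helper, not a function exported by HGNCREST.

```r
# Hand-copied subset of the searchable fields listed above (assumption:
# in a live session this vector would come from searchableFields).
fields <- c("symbol", "name", "hgnc_id", "ena", "status", "locus_type")

# Hypothetical helper (not part of HGNCREST): stop early on an unknown field
check_field <- function(field, choices = fields) {
    if (!is.character(field) || length(field) != 1L || !field %in% choices)
        stop("'field' must be one of: ", paste(choices, collapse = ", "))
    field
}

# A valid field is returned unchanged
check_field("symbol")

# A misspelled field raises an error, caught here for illustration
tryCatch(check_field("symbl"), error = function(e) conditionMessage(e))
```

Failing locally with a clear message avoids a round trip to the server for a request that cannot succeed.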
Fetching gene information
The hgnc_fetch function returns a tibble
with information about the gene specified by the
searchableField and value arguments.
hgnc_fetch("ena", "BC040926")
## # A tibble: 1 × 27
## date_approved_reserved gene_group_id hgnc_id alias_symbol date_symbol_changed
## <chr> <list> <chr> <list> <chr>
## 1 2009-07-20T00:00:00Z <int [1]> HGNC:37… <chr [1]> 2010-11-25T00:00:0…
## # ℹ 22 more variables: locus_group <chr>, refseq_accession <list>,
## # lncipedia <chr>, vega_id <chr>, name <chr>, uuid <chr>, status <chr>,
## # locus_type <chr>, prev_symbol <list>, ena <list>, symbol <chr>,
## # entrez_id <chr>, gene_group <list>, date_name_changed <chr>,
## # ensembl_gene_id <chr>, location_sortable <chr>, prev_name <list>,
## # ucsc_id <chr>, rna_central_id <list>, date_modified <chr>, location <chr>,
## # agr <chr>
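Several columns in the result, such as alias_symbol and prev_symbol, are list-columns, since a gene can carry more than one value for those fields. The sketch below shows one way to inspect and flatten such columns with base R. It uses a mock data frame with placeholder values (not real HGNC records) so that it runs without network access; only the column types mirror the output above.

```r
# Mock record with placeholder values; only the column types mirror the
# output above (character columns plus list-columns).
res <- data.frame(symbol = "GENE1", status = "Approved")
res$alias_symbol <- list(c("ALIAS1", "ALIAS2"))
res$prev_symbol <- list("OLD1")

# Which columns are list-columns?
list_cols <- names(res)[vapply(res, is.list, logical(1))]
list_cols

# Extract the values for the first (here, only) record
res$alias_symbol[[1]]

# Collapse every list-column to a single string, e.g. before writing a CSV
flat <- res
flat[list_cols] <- lapply(res[list_cols], function(col)
    vapply(col, paste, character(1), collapse = ";"))
flat
```

The same indexing with `[[` applies to the tibble returned by hgnc_fetch, because a tibble is also a data frame.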
Searching for genes
The hgnc_search function searches for genes based on a
query. It returns a data.frame with information about the
genes that match the query.
hgnc_search("symbol", "BRCA1")
## hgnc_id symbol score
## 1 HGNC:1100 BRCA1 4.694909
Advanced search
The hgnc_search function also allows for more complex
queries using the query argument. The query follows the HGNC REST API
query syntax; in the example below it is supplied as a character vector
of terms and operators.
hgnc_search("symbol", c("ZNF*", "AND", "status:Approved")) |>
    head()
## hgnc_id symbol score
## 1 HGNC:12991 ZNF2 1.018147
## 2 HGNC:13089 ZNF3 1.018147
## 3 HGNC:13139 ZNF7 1.018147
## 4 HGNC:13154 ZNF8 1.018147
## 5 HGNC:55280 ZNF8-DT 1.018147
## 6 HGNC:56757 ZNF8-ERVK3-1 1.018147
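The character vector passed above reads as a single boolean expression. As an assumption about how such a vector maps onto the REST query syntax (the package internals are not shown here), the pieces can be thought of as being joined with spaces and then percent-encoded before being placed in a request URL. The sketch uses only base R.

```r
# Query pieces as passed to hgnc_search in the example above
query <- c("ZNF*", "AND", "status:Approved")

# Assumption: the pieces are joined with spaces to form one expression
expr <- paste(query, collapse = " ")
expr

# Percent-encode the spaces so the expression is safe inside a URL
encoded <- utils::URLencode(expr)
encoded

# Decoding recovers the original expression
identical(utils::URLdecode(encoded), expr)
```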
Searching for genes based on multiple criteria
In this example, we search for genes whose name matches “MAPK interacting” and whose locus type is “gene with protein product”. Note that “locus_type” is one of the searchable fields in the HGNC database, so it can be used to filter the search results.
hgnc_search(
    "name",
    c("MAPK interacting", "AND", "locus_type:gene with protein product")
)
## hgnc_id symbol score
## 1 HGNC:7110 MKNK1 8.771411
## 2 HGNC:7111 MKNK2 8.771411
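When several filters are combined, writing the query vector by hand becomes error-prone. One possible approach is a small helper that assembles the vector from named arguments. `build_query()` below is hypothetical and not part of HGNCREST; it only reproduces the vector form used in the call above and assumes that every filter is joined with AND.

```r
# Hypothetical helper (not part of HGNCREST): combine a search term with
# named field filters, joined by AND, in the vector form used above.
build_query <- function(term, ...) {
    filters <- c(...)
    if (is.null(filters))
        return(term)
    # Every filter must be named after the field it applies to
    stopifnot(!is.null(names(filters)), all(nzchar(names(filters))))
    clauses <- paste(names(filters), filters, sep = ":")
    # Interleave "AND" before every filter clause
    c(term, as.vector(rbind("AND", clauses)))
}

query <- build_query("MAPK interacting",
                     locus_type = "gene with protein product")
query
```

The resulting `query` is the same character vector that was typed out in the example above, so it could be passed as the second argument to hgnc_search.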
Conclusion
The HGNCREST package provides a convenient way to query
the HGNC REST API from R. It allows users to fetch information about
specific genes, search for genes based on a query, and get general
information about the HGNC database.
sessionInfo
## R version 4.5.0 (2025-04-11)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.2 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: Etc/UTC
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] HGNCREST_0.99.6 httr2_1.1.2 BiocStyle_2.37.0
##
## loaded via a namespace (and not attached):
## [1] vctrs_0.6.5 cli_3.6.5 knitr_1.50
## [4] rlang_1.1.6 xfun_0.52 textshaping_1.0.1
## [7] jsonlite_2.0.0 glue_1.8.0 htmltools_0.5.8.1
## [10] BiocBaseUtils_1.11.0 ragg_1.4.0 sass_0.4.10
## [13] rappdirs_0.3.3 rmarkdown_2.29 tibble_3.2.1
## [16] evaluate_1.0.3 jquerylib_0.1.4 fastmap_1.2.0
## [19] yaml_2.3.10 lifecycle_1.0.4 bookdown_0.43
## [22] BiocManager_1.30.25 compiler_4.5.0 fs_1.6.6
## [25] pkgconfig_2.0.3 htmlwidgets_1.6.4 systemfonts_1.2.3
## [28] digest_0.6.37 R6_2.6.1 utf8_1.2.5
## [31] pillar_1.10.2 curl_6.2.2 magrittr_2.0.3
## [34] bslib_0.9.0 tools_4.5.0 pkgdown_2.1.2
## [37] cachem_1.1.0 desc_1.4.3