R/cg-internals.R
process_drug_link_ncit.Rd
Runs the full sequence that scrapes, parses, and stores the NCI Drug Dictionary found at CancerGov.org, along with any correlates to the NCI Thesaurus, in a Postgres database.
process_drug_link_ncit( conn, verbose = TRUE, render_sql = TRUE, expiration_days = 30 )
conn | Postgres connection object. |
---|---|
verbose | When reading from a slow connection, this prints some output on every iteration so you know it's working. |
Unlike the other process_* functions, this one uses regex to find the NCI Thesaurus Code, if present, in the scraped URLs instead of performing any additional scraping.
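The regex step described above can be sketched roughly as follows. This is a minimal illustration, not the package's actual implementation: the helper name `extract_ncit_code`, the example URLs, and the assumption that the code appears as a `code=C<digits>` query parameter are all hypothetical.

```r
# Hypothetical helper: pull an NCI Thesaurus code (e.g. "C1234") out of URLs.
# Assumption: the code appears as a `code=C<digits>` query parameter.
extract_ncit_code <- function(urls) {
  pattern <- "code=(C[0-9]+)"
  # regexec() keeps capture groups; non-matching URLs yield character(0)
  hits <- regmatches(urls, regexec(pattern, urls))
  # Return the first capture group, or NA when no code is present
  vapply(
    hits,
    function(m) if (length(m) >= 2) m[2] else NA_character_,
    character(1),
    USE.NAMES = FALSE
  )
}

# Made-up URLs for illustration only
urls <- c(
  "https://example.org/ConceptReport.jsp?dictionary=NCI_Thesaurus&code=C1234",
  "https://example.org/drug/def/no-code-here"
)
codes <- extract_ncit_code(urls)
print(codes)
```

URLs without a recognizable code map to `NA`, which mirrors the "if present" behavior described above without requiring a second request to the site.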
The links to Drug Pages are scraped from the Data Dictionary URL, across every page up to the maximum page number, and are saved to a Drug Link Table in the cancergov schema. The URLs in the Drug Link Table are then scraped for any HTML Tables of synonyms, and the results are written to a Drug Link Synonym Table. The links to active clinical trials and the NCIt mappings are also derived and stored in their respective tables.