All NCIt codes that have never been scraped, or whose last scrape is older than the expiration period (expiration_days), are scraped from the NCI Thesaurus at the "https://ncithesaurus.nci.nih.gov/ncitbrowser/pages/concept_details.jsf?dictionary=NCI_Thesaurus&code=%s&ns=ncit&type=synonym&key=null&b=1&n=0&vse=null#" path, where %s is replaced by the NCIt code.
get_ncit_synonym( conn, sleep_time = 5, expiration_days = 30, verbose = TRUE, render_sql = TRUE )
Argument | Description |
---|---|
conn | Postgres connection object. |
sleep_time | Time in seconds for the system to sleep before each scrape with |
verbose | When reading from a slow connection, this prints some output on every iteration so you know it's working. |
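
The selection rule and the roles of sleep_time and verbose can be illustrated with a minimal base R sketch. It assumes that a code is due when it has never been scraped or when its last scrape is more than expiration_days old. The helper names (`codes_to_scrape`, `scrape_all`), the `scrape_log` data frame, the stub `fetch` function, and the codes themselves are hypothetical and are not part of the package.

```r
# URL template taken from the description; %s is the NCIt code placeholder.
ncit_url_template <- "https://ncithesaurus.nci.nih.gov/ncitbrowser/pages/concept_details.jsf?dictionary=NCI_Thesaurus&code=%s&ns=ncit&type=synonym&key=null&b=1&n=0&vse=null#"

# Hypothetical helper: return codes never scraped or scraped too long ago.
codes_to_scrape <- function(log, expiration_days = 30, today = Sys.Date()) {
  age_days <- as.numeric(today - log$last_scraped)  # NA when never scraped
  due <- is.na(log$last_scraped) | age_days > expiration_days
  log$ncit_code[due]
}

# Hypothetical helper: visit each URL, pausing before every request.
scrape_all <- function(urls, fetch, sleep_time = 5, verbose = TRUE) {
  results <- vector("list", length(urls))
  for (i in seq_along(urls)) {
    Sys.sleep(sleep_time)
    if (verbose) message(sprintf("[%d/%d] %s", i, length(urls), urls[i]))
    results[[i]] <- fetch(urls[i])
  }
  results
}

# Illustrative scrape log with placeholder codes and dates.
scrape_log <- data.frame(
  ncit_code = c("C1000", "C2000", "C3000"),
  last_scraped = as.Date(c(NA, "2024-01-01", "2024-02-25")),
  stringsAsFactors = FALSE
)

due_codes <- codes_to_scrape(scrape_log, expiration_days = 30,
                             today = as.Date("2024-03-01"))
due_urls <- sprintf(ncit_url_template, due_codes)

# Stub fetch that only measures the URL; sleep_time = 0 keeps the demo instant.
results <- scrape_all(due_urls, fetch = function(url) nchar(url),
                      sleep_time = 0, verbose = FALSE)
```

In this sketch the first code is selected because it has no scrape date and the second because its last scrape is 60 days old, while the third (5 days old) is skipped. A real run would pass a function that downloads and parses the page in place of the stub.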
The links to Drug Pages are scraped from the Data Dictionary URL, across every page up to the maximum page number, and are saved to a Drug Link Table in the cancergov schema. The URLs in the Drug Link Table are then scraped for any HTML tables of synonyms, and the results are written to a Drug Link Synonym Table. The links to active clinical trials and NCIt mappings are also derived and stored in their respective tables.
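
As a rough illustration of the link and table extraction described above, the sketch below parses an inline HTML fragment with the rvest package. Whether the package itself uses rvest is an assumption, and the HTML, drug name, and synonyms are invented placeholders rather than real page content.

```r
library(rvest)

# Placeholder fragment standing in for a scraped page (not real content).
page <- minimal_html('
  <a href="/drug-a">Drug A</a>
  <table>
    <tr><th>Type</th><th>Synonym</th></tr>
    <tr><td>US brand name</td><td>Examplex</td></tr>
    <tr><td>Code name</td><td>EX-001</td></tr>
  </table>
')

# Collect link targets, analogous to populating a drug link table.
links <- html_attr(html_elements(page, "a"), "href")

# Parse every HTML table on the page; each one becomes a data frame.
tables <- html_table(html_elements(page, "table"))
synonyms <- tables[[1]]
```

Each parsed table could then be written to the database, for example with `DBI::dbWriteTable()`, although the exact table names and write logic used by the package are not stated in this section.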