NAME

UMLSQuery - A module to query a UMLS MySQL installation


SYNOPSIS

    use UMLSQuery;
    my $U = new UMLSQuery;
    $U->init( u      => 'username',
              p      => 'password',
              h      => 'hostname',
              dbname => 'umls');
    $U->getCUI(string/aui/sui/lui, sab=>)
    $U->getAUI(string/cui/sui/lui, sab=>)
    $U->getSTR(cui/aui/sui/lui, sab=>)
    $U->getSAB(string/cui/aui/sui/lui)
    $U->mapToId(phrase, idtype=>cui/lui/sui/aui, sab=>)
    $U->getParents(aui/cui, rela=>, sab=>)
    $U->getCommonParent(aui/cui, aui/cui, rela=>, sab=>)
    $U->getChildren(aui/cui, rela=>, sab=>)
    $U->getCommonChild(aui/cui, aui/cui, rela=>, sab=>)
    $U->getDistBF(cui_1, cui_2, rela=>, maxR=>)
    $U->getAvailableSAB('search string')
    $U->finish()


DESCRIPTION

This module lets you connect to a MySQL UMLS installation and run common queries. If you need a query that is not provided here, contact me at nigam@stanford.edu.

$U->init(u => 'username', p => 'password', h => 'hostname', dbname => 'umls');
Provide the username, password, host and dbname of a valid UMLS MySQL database. You can optionally provide a port=> value if your MySQL server is not listening on port 3306.

$U->getCUI(string/aui/sui/lui, sab=>)
This function accepts any text string, an aui (Atom Unique Identifier), sui (String Unique Identifier) or lui (Lexical Unique Identifier) and gets its cui (Concept Unique Identifier). The search is for an exact match. To restrict the search to a particular dictionary, provide the sab value. The following searches for 'prostate' in the SNOMED-CT vocabulary.
        $U->getCUI('prostate', sab=>'SNOMEDCT')

$U->getAUI(string/cui/sui/lui, sab=>)
This function accepts any text string, a cui (Concept Unique Identifier), sui (String Unique Identifier) or lui (Lexical Unique Identifier) and gets its aui (Atom Unique Identifier). The search is for an exact match. To restrict the search to a particular dictionary, provide the sab value. The following searches for 'prostate' in the SNOMED-CT vocabulary.
        $U->getAUI('prostate', sab=>'SNOMEDCT')

$U->getSTR(cui/aui/sui/lui, sab=>)
This function accepts a cui (Concept Unique Identifier), aui (Atom Unique Identifier), sui (String Unique Identifier) or lui (Lexical Unique Identifier) and gets its string. The search is for an exact match. To restrict the search to a particular dictionary, provide the sab value. The following gets the string for the aui 'A0812060' in the SNOMED-CT vocabulary.
        $U->getSTR('A0812060', sab=>'SNOMEDCT')

$U->getSAB(string/cui/aui/sui/lui)
This function accepts any text string, a cui (Concept Unique Identifier), aui (Atom Unique Identifier), sui (String Unique Identifier) or lui (Lexical Unique Identifier) and gets the dictionary or dictionaries (sab) it belongs to. If a string is provided, the search is for an exact match.
        $U->getSAB('prostate')

$U->mapToId(phrase, idtype=>cui/lui/sui/aui, sab=>)
This function accepts a phrase (up to 10 words) and maps it to the chosen idtype (can be restricted by sab if desired). The function first looks for an exact match for the phrase. If none is found, it generates all possible permutations of the words and attempts an exact match for each one (with right truncation of words to look for partial matches). The following tries to find a CUI belonging to SNOMED-CT for 'intraductal carcinoma of prostate'.
        $U->mapToId('intraductal carcinoma of prostate', idtype=>'cui', sab=>'SNOMEDCT');

The function returns a hash in which each key is a particular permutation and each value is an array of id - string pairs. The above example returns:

        key                     id        string
        -------------------------------------------------------
        carcinoma               C0007097  Carcinoma
        intraductal             C1644197  Intraductal
        prostate                C0033572  Prostate
        carcinoma prostate      C0600139  Carcinoma prostate
        intraductal carcinoma   C0007124  Intraductal carcinoma
        prostate carcinoma      C0600139  Prostate carcinoma
        carcinoma of prostate   C0600139  Carcinoma of prostate

In case of multiple matches for a key, each id - string pair is pushed onto that key's array.
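The returned structure described above can be walked with a pair of nested loops. The following is a minimal sketch, not the module's own code: it assumes each value is an array reference holding [id, string] array references, which is one plausible Perl layout for "an array of pairs". The %matches hash is a hand-written stand-in using rows from the table above, and flatten_matches is a hypothetical helper.

```perl
use strict;
use warnings;

# Hand-written stand-in for the hash returned by mapToId.
# Assumed layout: key => [ [id, string], [id, string], ... ]
my %matches = (
    'carcinoma'             => [ [ 'C0007097', 'Carcinoma' ] ],
    'prostate'              => [ [ 'C0033572', 'Prostate' ] ],
    'carcinoma of prostate' => [ [ 'C0600139', 'Carcinoma of prostate' ] ],
);

# Hypothetical helper: flatten the hash into [key, id, string] rows,
# sorted by key so the output order is stable.
sub flatten_matches {
    my ($m) = @_;
    my @rows;
    for my $key ( sort keys %$m ) {
        for my $pair ( @{ $m->{$key} } ) {
            my ( $id, $string ) = @$pair;
            push @rows, [ $key, $id, $string ];
        }
    }
    return @rows;
}

# Print one tab-separated line per id - string pair.
print join( "\t", @$_ ), "\n" for flatten_matches( \%matches );
```

If the module returns the pairs in a different form (for example as a single joined string), only the inner unpacking line needs to change.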

$U->getParents(aui/cui, rela=>, sab=>)
This function accepts a cui or aui and returns all its parents, optionally restricted to a particular relationship type (rela, 188 possible values) and a vocabulary (sab). The example below finds all isa parents of 'C0600139'.
        $U->getParents('C0600139', rela=>'isa');

The function returns a hash, where the keys are the direct parents of the id and the values are the ids forming the path to the root node. The ids are always reported as aui. The example above returns:

        direct parent   Path to the root
        ---------------------------------------------------
        A3407646        A3684559.A3713095.A3506985.A3407646
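The path value shown above is a dot-separated string of auis, so it can be split into a list for further processing. This is a small sketch under assumptions: the %parents hash is a hand-written stand-in for the returned hash, path_ids is a hypothetical helper, and the reading that the first id is the root (because the direct parent appears last) is inferred from the example rather than stated by the module.

```perl
use strict;
use warnings;

# Hand-written stand-in for the hash returned by getParents.
my %parents = (
    'A3407646' => 'A3684559.A3713095.A3506985.A3407646',
);

# Hypothetical helper: split a dot-separated path into its auis.
sub path_ids {
    my ($path) = @_;
    return split /\./, $path;
}

for my $parent ( sort keys %parents ) {
    my @ids = path_ids( $parents{$parent} );
    # Assumption: the path runs from the root (first) to the direct parent (last).
    printf "%s: %d nodes on path, starts at %s\n", $parent, scalar @ids, $ids[0];
}
```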

$U->getCommonParent(aui/cui, aui/cui, rela=>, sab=>)
This function accepts a pair of cuis or auis and returns their common parent, optionally restricted to a particular relationship type (rela, 188 possible values) and a vocabulary (sab). The example below finds the common parent of 'C0600139' and 'C0007124' along any rela type.
        $U->getCommonParent('C0600139','C0007124');

The function returns the common parent (aui) and the distance (dist) from the query nodes. The above example returns:

        aui             dist
        -------------------------------------------------------
        A0689089        4 links from C0600139 3 links from C0007124

$U->getChildren(aui/cui, rela=>, sab=>)
This function accepts a cui or aui and returns all its direct children, optionally restricted to a particular relationship type (rela, 188 possible values) and a vocabulary (sab). The example below finds all isa children of 'C0376358'.
        $U->getChildren('C0376358',rela=>'isa');

The function returns a hash, where the keys are the direct children of the id and each value is the query id. The ids are reported in the same form as the query id (cui for a cui query, aui for an aui query). The example above returns:

        child           parent
        ------------------------
        C0347001        C0376358
        C1330959        C0376358
        C1302530        C0376358
        C1282482        C0376358

$U->getCommonChild(aui/cui, aui/cui, rela=>, sab=>)
This function accepts a pair of cuis or auis and returns their common child, optionally restricted to a particular relationship type (rela, 188 possible values) and a vocabulary (sab). The example below finds the common child of 'C0376358' and 'C0346554' along any rela type.
        $U->getCommonChild('C0376358','C0346554')

The function returns the common child and the ids of the query nodes. The example above returns:

        C0600139        common child of C0376358 and C0346554

$U->getDistBF(cui_1, cui_2,rela=>,maxR=>)
This function accepts two cuis, performs a breadth-first search starting from cui_1, and reports the number of links at which cui_2 is found. The search is aborted if cui_2 is not found within a radius of maxR links. If maxR is not provided, it defaults to 3.
        $U->getDistBF('C0600139','C0007124')

The above example returns 2.
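The radius-limited breadth-first search described above can be illustrated on a toy graph. This is a sketch of the general technique, not the module's implementation: the graph is an in-memory hash with made-up node names instead of UMLS relationship rows, dist_bf is a hypothetical function, and returning undef when the target lies outside the radius is an assumption (the module's behaviour in that case is not documented here).

```perl
use strict;
use warnings;

# Toy undirected graph; node names are made up and are not real cuis.
my %graph = (
    A => [ 'B', 'C' ],
    B => [ 'A', 'D' ],
    C => ['A'],
    D => [ 'B', 'E' ],
    E => [ 'D', 'F' ],
    F => ['E'],
);

# Hypothetical function: number of links from $start to $goal,
# or undef if $goal is not reached within maxR links (default 3).
sub dist_bf {
    my ( $graph, $start, $goal, %opt ) = @_;
    my $max_r = defined $opt{maxR} ? $opt{maxR} : 3;
    return 0 if $start eq $goal;

    my %seen     = ( $start => 1 );
    my @frontier = ($start);
    for my $dist ( 1 .. $max_r ) {
        my @next;
        for my $node (@frontier) {
            for my $nbr ( @{ $graph->{$node} || [] } ) {
                next if $seen{$nbr}++;          # skip nodes already visited
                return $dist if $nbr eq $goal;  # found at this radius
                push @next, $nbr;
            }
        }
        last unless @next;                      # nothing left to expand
        @frontier = @next;
    }
    return;    # search aborted: goal not within $max_r links
}

my $d = dist_bf( \%graph, 'A', 'D' );
print defined $d ? "A to D: $d links\n" : "A to D: not found\n";
```

Expanding one full frontier per iteration keeps the loop counter equal to the current radius, which makes the maxR cutoff a simple loop bound.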

$U->getAvailableSAB('search string')
This function returns a hash of the available vocabularies whose description contains the 'search string'. The keys are the sab values and the values are their descriptions. The example below searches for sab that have SNOMED in their description.
        $U->getAvailableSAB('SNOMED')

It returns:

        sab             description
        ---------------------------------------------------------------------------
        SNOMEDCT        SNOMED Clinical Terms, 2006_01_31
        SCTSPA          SNOMED Clinical Terms, Spanish Language Edition, 2005_10_31
        SNM             SNOMED-2, 2
        SNMI            SNOMED International, 1998

$U->finish()
Disconnect and end the querying.