dc.contributor.advisor: Sang, Nguyen Thi Thanh
dc.contributor.author: Minh, Tran Ngoc Khanh
dc.description.abstract: The aim of this thesis is to build a drug dictionary that, whenever a user enters a keyword for a drug name or drug usage, presents two types of results: relevant drugs found by applying Text Classification, Text Clustering and the Vector Space Model; and keyword-matching results obtained with SQL statements against the database. The first step is to divide the drug information in the database into k groups by Text Clustering, based on the similarity between objects. Next, SQL operators query and return all results matching the input keyword. The system also calculates and presents the dominant group, based on the results of the database search; the dominant group is the group that occurs most often among those results. Then, for Text Classification, the K-Nearest Neighbor algorithm finds the K most similar drugs in the dominant group. As a result, users receive both the keyword-matching results of the database search and the relevant drugs found by Text Mining. TF-IDF is used to convert every sentence of drug information into a Vector Space Model representation, in which each element of a vector is a weighted number. The similarity, or distance, between two vectors is calculated with the Cosine Similarity measure. Data pre-processing before mining, another important step, is also described in detail. Finally, the tools and resources used to build the dictionary are introduced. [en_US]
dc.publisher: International University - HCMC [en_US]
dc.subject: Drug dictionary [en_US]
dc.title: Building a drug dictionary [en_US]
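
The pipeline outlined in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the thesis implementation: the drug sentences, the cluster labels in `GROUPS` (which stand in for the output of the Text Clustering step), the substring match (which stands in for the SQL query), and every function name are hypothetical. It also assumes that the K nearest neighbours are ranked by cosine similarity to the TF-IDF vector of the input keyword, which the abstract does not state explicitly.

```python
import math
from collections import Counter

# Hypothetical sample data; GROUPS stands in for the clustering output.
DRUGS = [
    "paracetamol relieves pain and fever",
    "ibuprofen relieves pain and inflammation",
    "amoxicillin treats bacterial infection",
    "cetirizine relieves allergy symptoms",
]
GROUPS = [0, 0, 1, 2]


def tokenize(text):
    """Lower-case, split on whitespace, keep alphabetic tokens only."""
    return [t for t in text.lower().split() if t.isalpha()]


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict) per document, plus the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    idf = {term: math.log(n_docs / df) + 1.0 for term, df in doc_freq.items()}
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({t: (c / len(tokens)) * idf[t] for t, c in counts.items()})
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def keyword_match(docs, keyword):
    """Stand-in for the database search: case-insensitive substring match."""
    return [i for i, d in enumerate(docs) if keyword.lower() in d.lower()]


def dominant_group(matched_ids, group_of):
    """Group label that occurs most often among the matched records."""
    counts = Counter(group_of[i] for i in matched_ids)
    return counts.most_common(1)[0][0] if counts else None


def relevant_drugs(docs, group_of, keyword, k=2):
    """Return (keyword matches, K most similar records in the dominant group)."""
    matches = keyword_match(docs, keyword)
    group = dominant_group(matches, group_of)
    if group is None:
        return matches, []
    vectors, idf = tfidf_vectors(docs)
    q_tokens = tokenize(keyword)
    q_counts = Counter(t for t in q_tokens if t in idf)
    query = {t: (c / len(q_tokens)) * idf[t] for t, c in q_counts.items()}
    candidates = [i for i in range(len(docs)) if group_of[i] == group]
    ranked = sorted(candidates, key=lambda i: cosine(query, vectors[i]), reverse=True)
    return matches, ranked[:k]
```

Calling `relevant_drugs(DRUGS, GROUPS, "pain")` returns two lists of record indices: the keyword matches and the nearest neighbours drawn from the dominant group. A real system would replace `keyword_match` with the SQL query and `GROUPS` with labels produced by the clustering step.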
