Abstract
Background: The KNIME platform offers several tools for the analysis of chem- and pharmacoinformatics data. Unless sufficient in-house data is available for the analysis of interest, third-party data must be fetched into KNIME. Many data sources offer valuable data, but including this data in a workflow is not always straightforward.
Objective: Here we discuss different ways of accessing public data sources. We give an overview of KNIME nodes for different sources, with references to available example workflows. For data sources with no dedicated KNIME node, we present a general approach to accessing a web interface via KNIME.
In addition, we discuss necessary steps before the data can be analysed, such as data curation, chemical standardisation and the merging of datasets.
Keywords: KNIME, database, data mining, web service, data curation, chemical standardization, REST, API.