KNIME

An open source data platform that can access and navigate complex data

KNIME is an open source data analytics, reporting, and integration platform. It can manipulate complex data and apply powerful analytics and statistics.

KNIME is written in Java and based on Eclipse, whose extension mechanism it uses to add plugins that provide additional functionality. It is released under GPL v3 with an exception that allows others to use the well-defined node API to add proprietary extensions.

KNIME integrates various components for machine learning and data mining through its modular data pipelining concept. A graphical user interface lets users assemble nodes for data preprocessing (ETL: Extraction, Transformation, Loading), modeling, data analysis, and visualization.
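The pipelining idea can be illustrated with a small sketch. This is not KNIME's node API: the `Node` class, its fields, and the sample data below are hypothetical, chosen only to show how independent steps can be chained and how a workflow can be executed partially, with each step keeping its result for later inspection.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional

Table = List[Dict[str, Any]]  # a table is modelled as a list of row dictionaries


@dataclass
class Node:
    """One processing step (hypothetical); `func` maps an input table to an output table."""
    name: str
    func: Callable[[Table], Table]
    upstream: Optional["Node"] = None
    result: Optional[Table] = field(default=None, repr=False)

    def execute(self) -> Table:
        """Run this node, first running any upstream node that has no cached result."""
        if self.result is None:
            source = self.upstream.execute() if self.upstream else []
            self.result = self.func(source)
        return self.result


# Illustrative three-step workflow: read -> filter -> aggregate (made-up data).
reader = Node("reader", lambda _: [
    {"customer": "a", "spend": 120.0},
    {"customer": "b", "spend": 80.0},
    {"customer": "c", "spend": 200.0},
])
high_spend = Node(
    "filter",
    lambda rows: [r for r in rows if r["spend"] >= 100.0],
    upstream=reader,
)
mean_spend = Node(
    "mean",
    lambda rows: [{"mean_spend": sum(r["spend"] for r in rows) / len(rows)}],
    upstream=high_spend,
)

# Executing only up to the filter leaves the aggregation step untouched.
filtered = high_spend.execute()
```

Calling `mean_spend.execute()` afterwards reuses the cached output of the filter step instead of recomputing it, which mirrors the idea of executing some steps now and inspecting intermediate results later.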

KNIME allows users to visually create data flows (or pipelines), selectively execute some or all analysis steps, and later inspect the results, models, and interactive views. The core version already includes hundreds of modules for data integration (file I/O, database nodes supporting all common database management systems), data transformation (filter, converter, combiner), and the commonly used methods for data analysis and visualization. With the free Report Designer extension, KNIME workflows can be used as data sets to create report templates that can be exported to document formats such as doc, ppt, xls, pdf, and others. Other capabilities of KNIME are:

  • KNIME's core architecture allows processing of large data volumes that are limited only by the available hard disk space (most other open source data analysis tools work in main memory and are therefore limited by the available RAM). For example, KNIME allows analysis of 300 million customer addresses, 20 million cell images, and 10 million molecular structures.
  • Additional plugins allow the integration of methods for text mining and image mining, as well as time series analysis.
  • KNIME integrates various other open source projects, e.g. machine learning algorithms from Weka, the R statistics project, as well as LIBSVM, JFreeChart, ImageJ, and the Chemistry Development Kit.
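
The difference between disk-bound and memory-bound processing in the first point can be sketched as follows. This is a generic illustration of streaming a file row by row, not a description of how KNIME implements its data tables; the function names, the column names, and the demo file are assumptions made for the example.

```python
import csv
import os
import tempfile


def column_mean(path: str, column: str) -> float:
    """Stream a CSV file row by row and return the mean of one numeric column."""
    total = 0.0
    count = 0
    with open(path, newline="") as handle:
        # DictReader yields one row at a time, so memory use does not grow
        # with the size of the file.
        for row in csv.DictReader(handle):
            total += float(row[column])
            count += 1
    if count == 0:
        raise ValueError("no rows in file")
    return total / count


def write_demo_file(n_rows: int) -> str:
    """Write a small hypothetical CSV to a temporary file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    with os.fdopen(fd, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["customer_id", "spend"])
        for i in range(n_rows):
            writer.writerow([i, i % 10])
    return path
```

Because only a running total and a row count are held in memory, the same function works whether the file has a thousand rows or several hundred million; the practical limit becomes disk space and read time rather than RAM.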

Since 2006, KNIME has been used in pharmaceutical research, but it is also used in other areas such as CRM customer data analysis, business intelligence, and financial data analysis.