Natural Language Processing and Watson APIs


The year 2016, marked by Google DeepMind's win in the strategy game Go, is definitely the year of artificial intelligence. Cognitive computing, machine learning, neural networks, natural language processing: these new concepts offer tremendous opportunities for bringing god-like intelligence and judgment capabilities to our everyday technical and business applications.

In January, we at IBM gave a presentation to the FinTech community in Geneva on cognitive computing and on how to use Watson APIs. This article summarizes what cognitive computing is and how to put it into practice easily.

Geneva, January 11, 2016 : Sasha Lazarevic, Alexandre Gaillard (Swiss Fintech Leader, InvestGlass CEO), and Pierre Kauffmann (IBM Cognitive Solutions Architect)

 
But first of all, I feel I need to clarify some concepts:

Artificial intelligence (AI): an attempt by scientists to translate the activity of neurons in the brain into a set of logical constructs and models. It was a very popular concept in the 1980s, but it didn’t produce important practical results.

Cognitive computing: a comeback of AI 30 years later, but based on machine learning models and big data, and with the pragmatic goal of solving specific practical problems (image recognition, autonomous driving, expert advisory solutions, and similar).

Machine learning: statistical techniques based on regression and classification, used for pattern detection. When implemented through specific algorithms, these models can improve themselves (“learn”).

Supervised machine learning: using a “training” set of data (or sample) whose outcomes are known, in order to determine the initial regression parameters and improve them over time.
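To make the idea of a training set concrete, here is a minimal sketch in plain Python. It assumes a toy data set and a simple straight-line model fitted by gradient descent; the function names, the data, and the learning-rate settings are all made up for illustration and are not taken from any particular tool.

```python
# Toy supervised learning: fit y = w*x + b to a training set with known outcomes.
# The data and the hyperparameters are illustrative assumptions.

def fit_linear(samples, learning_rate=0.01, epochs=5000):
    """Estimate the regression parameters (w, b) by gradient descent."""
    w, b = 0.0, 0.0  # initial regression parameters
    n = len(samples)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in samples) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in samples) / n
        # Each pass nudges the parameters towards a better fit ("learning").
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

def predict(w, b, x):
    """Apply the learned parameters to a new, unseen input."""
    return w * x + b

# Training set: inputs paired with known outcomes (generated from y = 2x + 1).
training_set = [(0, 1), (1, 3), (2, 5), (3, 7), (4, 9)]
w, b = fit_linear(training_set)
print(w, b, predict(w, b, 10))
```

The parameters start at zero and are improved on every pass over the training set, which is all that "learning" means in this context.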

Machine learning is already used in multiple domains: optical recognition, speech recognition, spam detection, robotics, hedge fund trading strategies, etc. It’s been implemented through open-source or commercially available solutions like R libraries, the Python scikit-learn library, Weka, IBM SPSS, etc. All these tools use one or more of 40 different statistical algorithms.

So, is there anything new here in cognitive computing?

Well, there is a new piece that now comes into the story and aims to completely change the way we interact with computers and machines in general: Natural Language Processing (NLP).

NLP is a technology that enables machines to understand the meaning of text. This understanding is made possible through a set of techniques like shallow and deep parsing, information extraction, and semantic analysis.

Let’s look at NLP in more detail

The most common practical NLP implementation is the Question Answering (QA) system. The architecture of a QA system is very simple:
1) Question Processing,
2) Passage Retrieval and
3) Answer Extraction

The question processing module takes your question, parses it into tokens (or words), and builds a tree structure from it. It identifies the type of question (factoid, procedural, causal, etc.) and the key words, and from these it detects semantic relationships in the question (e.g. role{person, company}). Passage retrieval, the second step, fetches relevant passages from the document database using the key words already identified. Finally, candidate answers are extracted from the passages and scored on their statistical significance, and the best answer is returned together with its confidence level.
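The three steps can be sketched in a few lines of Python. This is a deliberately naive illustration, not how Watson or any production QA system works: it assumes key-word overlap as the only scoring signal, skips the parse tree and the semantic relationships entirely, and all function names and sample documents are hypothetical.

```python
import re

# Small, hand-picked stop-word list (an assumption for this sketch).
STOP_WORDS = {"a", "an", "the", "is", "was", "of", "in", "who", "what",
              "why", "how", "do", "does", "did", "to", "by", "and"}

def tokenize(text):
    """Split text into lower-cased word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def process_question(question):
    """Step 1: tokenize, guess the question type, keep the key words."""
    tokens = tokenize(question)
    first = tokens[0] if tokens else ""
    qtype = {"how": "procedural", "why": "causal"}.get(first, "factoid")
    keywords = {t for t in tokens if t not in STOP_WORDS}
    return qtype, keywords

def retrieve_passages(keywords, documents, top_n=2):
    """Step 2: rank documents by the number of key words they contain."""
    scored = []
    for doc in documents:
        overlap = len(keywords & set(tokenize(doc)))
        if overlap:
            scored.append((overlap, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_n]]

def extract_answer(keywords, passages):
    """Step 3: score candidate sentences, return the best with its confidence."""
    best, best_score = None, 0.0
    for passage in passages:
        for sentence in re.split(r"(?<=[.!?])\s+", passage):
            hits = len(keywords & set(tokenize(sentence)))
            score = hits / len(keywords) if keywords else 0.0
            if score > best_score:
                best, best_score = sentence, score
    return best, best_score

def answer(question, documents):
    """Run the three steps in order."""
    qtype, keywords = process_question(question)
    passages = retrieve_passages(keywords, documents)
    best, confidence = extract_answer(keywords, passages)
    return qtype, best, confidence

documents = [
    "Watson is a question answering system built by IBM. It won Jeopardy in 2011.",
    "Geneva is a city in Switzerland. It hosts many financial companies.",
]
qtype, best, confidence = answer("Who built Watson?", documents)
print(qtype, "|", best, "|", confidence)
```

The confidence here is simply the share of key words found in the winning sentence; a real system would combine many more signals before ranking its candidate answers.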

In the context of NLP, machine learning is implemented through the bag-of-words model, a matrix representation of text in which each sentence is represented as a numeric multiset (or “bag”) of its words. This allows regression analysis to be used to estimate the regression parameters. With a training set of data, these parameters are refined and can then be used to recognize new, previously unseen questions and match them to appropriate answers.
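Here is a minimal sketch of the bag-of-words representation in plain Python. Note the simplification: instead of fitting regression parameters, it matches a new question to the closest known one using cosine similarity, which is a much simpler stand-in. The function names and the sample question/answer pairs are invented for illustration.

```python
import math
import re
from collections import Counter

def bag_of_words(text):
    """Represent a text as a multiset (bag) of its lower-cased words."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def to_matrix(texts):
    """Build the vocabulary and one row of word counts per text."""
    bags = [bag_of_words(t) for t in texts]
    vocabulary = sorted(set().union(*bags))
    return vocabulary, [[bag[word] for word in vocabulary] for bag in bags]

def cosine(a, b):
    """Cosine similarity between two bags of words."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def match(question, known_pairs):
    """Return the stored answer whose question is most similar, with its score."""
    q = bag_of_words(question)
    scored = [(cosine(q, bag_of_words(known_q)), ans) for known_q, ans in known_pairs]
    score, ans = max(scored, key=lambda pair: pair[0])
    return ans, score

# Hypothetical training pairs: known questions with their answers.
known_pairs = [
    ("how do I open an account", "Fill in the online form."),
    ("what are the trading fees", "Fees are listed on the pricing page."),
]
vocabulary, matrix = to_matrix([q for q, _ in known_pairs])
ans, score = match("How can I open a new account?", known_pairs)
print(vocabulary)
print(matrix)
print(ans, score)
```

Each row of the matrix is one question and each column is one word of the vocabulary, so a previously unseen question can be compared with the known ones purely numerically, even though word order is lost.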

How to use all this?

IBM Watson is the most sophisticated and most comprehensive NLP implementation to date. It is available in two forms: as a packaged solution in the form of advisors (Oncology Advisor, Pharmaceutical Research Advisor, Wealth Management Advisor, etc.) sold for (relatively) big bucks, or in the form of APIs.

Watson APIs are a very practical way of putting NLP into practice and integrating it into your applications. Enabling AI is becoming very easy: you just make simple calls to Watson APIs. APIs are currently available for machine dialog, relationship extraction, sentiment and emotion analysis, concept analysis, personality insights, and many other very practical use cases. You can provide your own documents or use a preloaded news and document database, and process these documents with the APIs of your choice. Let me mention the latest addition to the list: PhD API. So, if you don't have a PhD yet, make up for it with this API! 🙂

A broad, though not exhaustive, list of the currently available APIs can be found here:
https://lzrvc.com/wordpress/wordpress/watson-api-catalog

How should you start?

Create your ID on the Watson Developer Cloud and Bluemix platforms, and start coding. A lot of documentation and advice is available online and through forum support:
http://www.ibm.com/smarterplanet/us/en/ibmwatson/developercloud/

Let the new era of applied artificial intelligence begin. IBM is democratizing cognitive computing through its Watson APIs: for most use cases, the development and testing of AI solutions is completely free of charge.

You can find our presentation slides here:

Sasha Lazarevic, April 2016