EXPLAINED: How and why technology plays the most important role in the business world


Technology is everywhere, and it is hard to imagine a world, or even a single use case, untouched by it. From the ubiquitous GPS voice finding the optimal route to work, to OTT platforms making recommendations based on our viewing behavior, to our favorite voice-based assistants, to chatbots answering basic questions, to algorithmic trading robots seeking extra bang for your buck, to determining the best treatment response from radiation oncology reports, to picking up weather warnings on a trek up the Himalayas, our interactions with technology are everywhere and all-consuming.

As technology becomes pervasive, it is not only becoming less invasive and more effective, it is also growing smarter, like a child learning a new trick each day. A car driving itself from point A to point B, or a walk through a cashierless Amazon Go store, is no longer limited to science fiction; it is now a reality. At the core of this steep learning curve lies the concept of machine learning (ML).

Machine learning can be thought of as the brain behind the systems, platforms, or beeping lights of technology. Statistics can be considered ground zero for ML, as both fields of study analyze observations to reveal some underlying process. However, the two diverge in their assumptions, terminology, and techniques. Statistical approaches rely on foundational assumptions and explicit models of structure, such as the use of samples that are assumed to be drawn from a specified underlying probability distribution. In contrast, machine learning seeks to extract knowledge from large datasets without such restrictions. ML aims to automate decision making by learning from known examples to determine the underlying structure.

One area where ML has made a significant contribution is the recognition of handwritten characters using Optical Character Recognition (OCR) techniques. Unconstrained handwritten character recognition is an area of continuous study, as perfecting it would mean that machines could read, understand, and interpret handwritten text without any errors. Add to it the different writing styles, languages, use of phrases, grammar, and cultural nuances of tone and voice, and the challenge becomes mammoth. Applications of a perfected OCR solution would range from automated onboarding of customers by your bank, to speedy analysis of feedback forms at malls or coffee shops, to translation of file notes at medical labs. The possibilities are limitless.

“For us at Whyte, the domain of interest was to digitize the client onboarding form, personal information documents, and tax reports for the end clients of one of our clients, an international bank. Once we had the handwritten forms digitized with a fair level of accuracy and confidence, we could use the readily available robotic process automation platforms to build a robot that could onboard end client files automatically. OCR was the natural tool to pick, and with the Google, Microsoft, and Tesseract OCR engines readily and freely available to use, we embarked on a journey to automate and save countless hours for our client’s service staff. Very soon we realized the human nature of the problem, encountered writing styles, and realized that even reading a handwritten date could mean parsing through 10 possible ways of writing a date. This is when we conceptualized the Intelligent Character Recognition (ICR) engine that leveraged ML techniques. The dataset used for the ICR consisted of unconstrained handwritten forms, scanned at 300 pixels per inch with eight bits of greyscale per pixel. Advancements in specific areas of computer-aided drawing, image processing, and digital recognition were used to identify the written text,” says Soumen Mukherjee, Co-Founder and Director, Sales and Strategy, Group Whyte.
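To make the date problem concrete, here is a minimal Python sketch of what "parsing through 10 possible ways of writing a date" can look like once the characters have been recognized. The list of layouts and the function name `parse_handwritten_date` are hypothetical and chosen for illustration; they are not Whyte's actual list or code. The sketch also assumes day-first ordering, so an ambiguous string such as 03/04/2021 is read as 3 April.

```python
from datetime import date, datetime
from typing import Optional

# Hypothetical list of ten layouts a form-filler might use (day-first assumed).
CANDIDATE_FORMATS = [
    "%d/%m/%Y", "%d-%m-%Y", "%d.%m.%Y", "%d/%m/%y", "%Y-%m-%d",
    "%d %b %Y", "%d %B %Y", "%B %d, %Y", "%b %d, %Y", "%d%m%Y",
]


def parse_handwritten_date(text: str) -> Optional[date]:
    """Try each candidate layout in turn; return None if none of them fits."""
    cleaned = " ".join(text.strip().split())  # collapse stray whitespace
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date()
        except ValueError:
            continue  # this layout does not match, try the next one
    return None
```

A string that matches none of the layouts returns `None`, which is the natural point at which to hand the field to a human reviewer.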

Process at Whyte 

The process starts with noise reduction of the input image: median filtering is applied and the image contrast is adjusted so that each character stands out from the background page. Written characters are then brought to a common size, a step called size normalization, and any abnormal streaks or anomalies are omitted from the analysis. This also makes the process quicker, as junk is not analyzed. With clean handwritten text as input, the ICR structural and contour analysis algorithms begin to extract features from the characters and classify them as letters or numbers. At this point, the curvature at every point along the inner and outer contours of each character’s image is analyzed on eight features, three concave and five convex. Each feature is in turn associated with a direction or, simply put, with the most typical way in which that character is normally written.
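The first three steps (median filtering, contrast adjustment, and size normalization) can be sketched in a few lines of NumPy. This is an illustrative sketch, not Whyte's implementation: the 3x3 window, the ink threshold of 128, the 16x16 target size, and all function names are assumptions. It expects an eight-bit greyscale image with dark ink on a light page.

```python
import numpy as np


def median_filter3(img: np.ndarray) -> np.ndarray:
    """3x3 median filter; border pixels are repeated to pad the edges."""
    padded = np.pad(img, 1, mode="edge")
    h, w = img.shape
    # Stack the nine shifted views of the image, then take the per-pixel median.
    windows = np.stack(
        [padded[r:r + h, c:c + w] for r in range(3) for c in range(3)]
    )
    return np.median(windows, axis=0).astype(img.dtype)


def stretch_contrast(img: np.ndarray) -> np.ndarray:
    """Linearly rescale grey levels so the darkest is 0 and the lightest 255."""
    lo, hi = int(img.min()), int(img.max())
    if hi == lo:  # blank image, nothing to stretch
        return np.zeros_like(img)
    return ((img.astype(np.float64) - lo) * 255.0 / (hi - lo)).astype(np.uint8)


def normalize_size(img: np.ndarray, size: int = 16,
                   ink_threshold: int = 128) -> np.ndarray:
    """Crop to the bounding box of the ink, then resample to size x size
    with nearest-neighbour sampling (aspect ratio is not preserved)."""
    ink = img < ink_threshold
    if not ink.any():  # no ink found, return a blank page
        return np.full((size, size), 255, dtype=np.uint8)
    rows = np.flatnonzero(ink.any(axis=1))
    cols = np.flatnonzero(ink.any(axis=0))
    crop = img[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]
    r_idx = np.arange(size) * crop.shape[0] // size
    c_idx = np.arange(size) * crop.shape[1] // size
    return crop[np.ix_(r_idx, c_idx)]
```

Chaining the three calls, `normalize_size(stretch_contrast(median_filter3(img)))`, removes isolated specks, pushes ink and page to opposite ends of the grey scale, and hands every character to the later stages at the same size.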

The ICR engine was conceived with a feed-forward neural network at its heart that combined and scored the outcomes of all the algorithms and finally decided on the character that it believed was being read.
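As a rough illustration of how a feed-forward network can combine the outputs of several algorithms into one decision, consider the NumPy sketch below. Everything in it is assumed for illustration: the three-letter toy alphabet, the two analysers and their scores, the single hidden layer of eight units, and the random, untrained weights. The article does not describe the actual architecture of the ICR engine.

```python
import numpy as np

LABELS = ["A", "B", "C"]  # toy alphabet, for illustration only


def softmax(z: np.ndarray) -> np.ndarray:
    """Turn raw scores into probabilities that sum to 1."""
    exp = np.exp(z - z.max())  # subtract the max for numerical stability
    return exp / exp.sum()


def forward(scores, w1, b1, w2, b2):
    """One hidden layer: combine the analysers' scores, then score each label."""
    hidden = np.tanh(w1 @ scores + b1)
    return softmax(w2 @ hidden + b2)


def make_random_network(n_inputs, n_hidden, n_labels, seed=0):
    """Untrained weights; a real engine would learn these from labelled forms."""
    rng = np.random.default_rng(seed)
    return (
        rng.normal(scale=0.5, size=(n_hidden, n_inputs)), np.zeros(n_hidden),
        rng.normal(scale=0.5, size=(n_labels, n_hidden)), np.zeros(n_labels),
    )


# Two hypothetical analysers each emit one score per label for the same glyph.
structural_scores = np.array([0.7, 0.2, 0.1])
contour_scores = np.array([0.6, 0.3, 0.1])
scores = np.concatenate([structural_scores, contour_scores])

params = make_random_network(scores.size, n_hidden=8, n_labels=len(LABELS))
probabilities = forward(scores, *params)
decision = LABELS[int(np.argmax(probabilities))]
```

Because the weights here are random, the decision itself is meaningless; the point is the data flow, in which every algorithm's score feeds the same network and the label with the highest probability wins.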

This simplest form of ML made the ICR a potent learning machine. The neural network learned from its mistakes as much as it did from its successes. Each wrong identification led to a self-correction that tweaked its internal workings slightly, so that it would not repeat the same calculations and arrive at the same scores the next time it encountered that situation. It is the ML that makes the ICR intelligent and, in concept, improves character recognition accuracy to over 80%. With this, Whyte remains confident of automating the bank’s client onboarding process. We were able to model a straight-through rate of up to 57%, i.e. more than half the work did not require any human intervention, leaving the bank’s employees with time to serve other customers.
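The self-correction described above is, in most neural networks, a small gradient step on the weights after each labelled example. The sketch below shows the idea on the simplest possible case, a single softmax layer trained with cross-entropy loss. The feature values, the learning rate of 0.5, and the function names are assumptions for illustration, and the starting weights are deliberately rigged so that the first guess is wrong.

```python
import numpy as np


def softmax(z: np.ndarray) -> np.ndarray:
    exp = np.exp(z - z.max())  # subtract the max for numerical stability
    return exp / exp.sum()


def predict(weights: np.ndarray, features: np.ndarray) -> np.ndarray:
    """Probability of each label for one character's feature vector."""
    return softmax(weights @ features)


def correct_mistake(weights, features, true_label, learning_rate=0.5):
    """One gradient step on the cross-entropy loss for a single example."""
    probs = predict(weights, features)
    target = np.zeros_like(probs)
    target[true_label] = 1.0
    error = probs - target  # positive where the network was over-confident
    return weights - learning_rate * np.outer(error, features)


features = np.array([0.9, 0.1, 0.2, 0.8])  # hypothetical feature vector
true_label, wrong_label = 0, 1
weights = np.zeros((3, features.size))
weights[wrong_label] = features  # rigged so that the first guess is wrong

before = predict(weights, features)
after_one = predict(correct_mistake(weights, features, true_label), features)

trained = weights
for _ in range(20):  # keep correcting on the same example
    trained = correct_mistake(trained, features, true_label)
```

After a single correction the probability assigned to the true label is higher than before, and after repeated corrections the network's top guess switches to the true label, which is the "not arriving at the same scores next time" behavior in miniature.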

Just as the ICR engine came from finding algorithms that were each very accurate at recognizing specific aspects of handwritten text and then bringing them together in the right combinations, Whyte believes that the business landscape is replete with use cases where answers to portions of the problem are already available, and it takes only a step back to look at the full picture and stitch the canvas together.



