How to integrate machine learning into my workflow: A definitive guide

 

This article introduces machine learning (ML) and shows how ML tools can be integrated into an existing workflow in any industry. It begins by defining ML, artificial intelligence (AI), and workflows. It then sets out a step-by-step, four-phase procedure for bringing ML into your workflow without disrupting or compromising business aims. Finally, it gives seven examples of ML in existing workflows to illustrate how effective and simple ML can be.


CONTENTS

  1. What is machine learning?

  2. What is Artificial Intelligence?

  3. What is a workflow?

  4. What can machine learning do for my workflow?

  5. How to integrate machine learning into my workflow, step by step

    • Define

    • Prototype

    • Production

    • Measure

  6. Examples of machine learning tools in existing workflows

    • Predictive text

    • Smart search

    • Smart chronology builder

    • Smart analytics and reporting

    • Anomaly usage detection 

    • End point detection

    • Redaction of information

WHAT IS MACHINE LEARNING?

Machine learning (ML) is a field of artificial intelligence (AI) and computer science. It uses algorithms and data to imitate the way humans learn. ML tools typically improve in accuracy over time: the more data and repetitions a process sees, the more it can learn and apply to similar examples.

ML is a particularly important component of data science. Algorithms can be trained to make classifications or predictions, unearthing useful insights. These insights can drive businesses forward because they often relate to key growth metrics or user behaviour.

ML can speed up data-heavy or repetitive tasks and make them more accurate. Over time, algorithms ‘learn’ to the point where they can ‘predict’ outcomes without using up any sweat hours.


WHAT IS ARTIFICIAL INTELLIGENCE?

The birth of AI dates back to the 1950s. In his seminal 1950 paper, “Computing Machinery and Intelligence”, Alan Turing asks the following question: ‘Can machines think?’. From this question comes the ‘Turing Test’, in which a human interrogator has the job of distinguishing between human and computer responses. AI has developed hugely since then, but it is important to grasp its origins, because this ‘hot 21st century topic’ is all too often misunderstood. At base level, AI involves the use of computers to mimic human problem-solving and decision-making capabilities.

John McCarthy penned the following definition for AI in 2004, and it serves as a solid and broad foundation for contemporary understanding: “[AI] is the science and engineering of making intelligent machines, especially intelligent computer programmes. It is related to the similar task of using computers to understand human intelligence, but AI does not have to confine itself to methods that are biologically observable.”

Thus AI is a broad umbrella term for intelligent (human-like) computing processes, and machine learning is one aspect of AI that focuses specifically on mimicking human learning capabilities.

WHAT IS A WORKFLOW?

A workflow is a sequence of steps in which data is passed between humans and systems. Workflows occur across a broad range of businesses and industries. The core function of a workflow is to process a set of data along a path, from undone to done, raw to refined.

Workflows can be broadly grouped into three categories:

  1. Process workflow: This is when a set of tasks is repetitive and the outcomes are predictable. 

  2. Case workflow: The path required to complete the flow is unknown from the outset and case dependent.

  3. Project workflow: The path to completion is mostly predictable, but there is room for flexibility en route. Crucially, a project workflow is only good for one instance. Unlike a process workflow, it is not meant to be repeated exactly each time.

It can be useful to keep in mind that if data is not moving along a chain of events, then you do not have a workflow. A series of unlinked tasks (pay a receipt > pick up a coffee > lock the conference room) is ‘task management’, not a workflow.


WHAT CAN MACHINE LEARNING DO FOR MY WORKFLOW?

ML can take over predictable tasks that humans would otherwise have to do along the data chain of a workflow. Any workflow, even a case workflow, has junctures or bottlenecks along the path from start to finish where a lot of time and data processing is needed. ML can fast-track these junctures, streamlining the journey from beginning to outcome. ML can also help a workplace and its team members simplify data management, and it is particularly useful when data is managed in large quantities. The larger the data set, the greater the potential gain from ML.

ML can be plugged into a juncture along a workflow with varying degrees of customisation around it. So although ML is currently most often applied to process workflows with large data funnels, it also applies to project workflows. There it can speed up the parts of a project that repeat from similar projects, freeing up more time for teams to focus on the crucial, unique customisations.

Lastly, ML can dig beneath the superficial layer of a data set to reveal information/findings that are otherwise opaque, difficult, or time consuming to locate. In this way, ML not only holds the potential to fast track a workflow, but it can also unearth new layers of depth/nuance in a workflow and its outcomes.

HOW TO INTEGRATE MACHINE LEARNING INTO MY WORKFLOW, STEP BY STEP

An ML workflow can be divided into four main phases:

  1. Define: Understand the business needs and how the business actually works. What data is being passed along the chain? How is it passed? What are the desired outcomes of this chain, and who or what manages the flow? In this first phase it is also important to isolate the pain points: where is the time and resource drain? What are the ‘sweat hours’ along the data chain? It is imperative to have a clear architecture, or map, defining how data moves and where it stagnates.

  2. Prototype: After the define phase, you should have a clear idea of the specific points along the data chain that drain time and resources. These are the areas to focus on. At each of these points you need to collect the data, prepare it, and train a model to process it, so that the task can be repeated over time with accurate and desired outcomes. Remember that ML requires a ‘learning’ phase, in which the model gets better at the task with more repetition. Some ML software is ready to use and has already gone through this prototyping and testing phase, but custom ML needs to be prototyped in order to fit its unique context. You then have to evaluate the outcomes to confirm that the algorithms, their level of detail, and the point at which you ‘crunch the data’ are right.

  3. Production: This is when you deploy a model based on successful results in the prototyping phase. It is a phase in which the ‘integration’ happens because you knit the model and its outcomes into the overall workflow. 

  4. Measure: This final phase is when you gather and analyse insights to check that the predictions made by the ML model are meeting your business needs. You can also monitor predictions over time to observe fluctuations and gain a deeper understanding of your data sets, cross-referencing and comparing across time periods and data streams.
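To make the four phases more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the task (deciding whether a document can be processed automatically or needs manual review), the two features (hours in queue and page count), the tiny data sets, and the very simple nearest-centroid model. A real prototype would use your own data and most likely an established ML library.

```python
from statistics import mean

# Define: hypothetical labelled examples from the bottleneck you identified.
# Each item is ((hours_in_queue, pages), label). All values are invented.
TRAIN = [
    ((1.0, 2.0), "auto"), ((1.5, 3.0), "auto"), ((2.0, 2.5), "auto"),
    ((8.0, 40.0), "manual"), ((9.0, 35.0), "manual"), ((7.5, 50.0), "manual"),
]
HELD_OUT = [((1.2, 2.2), "auto"), ((8.5, 45.0), "manual"), ((2.1, 3.5), "auto")]


def train(examples):
    """Prototype: learn one centroid (mean point) per label."""
    by_label = {}
    for features, label in examples:
        by_label.setdefault(label, []).append(features)
    return {
        label: tuple(mean(column) for column in zip(*rows))
        for label, rows in by_label.items()
    }


def predict(model, features):
    """Production: return the label whose centroid is closest to the input."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: squared_distance(model[label]))


def accuracy(model, examples):
    """Measure: share of held-out examples that are predicted correctly."""
    hits = sum(predict(model, features) == label for features, label in examples)
    return hits / len(examples)


model = train(TRAIN)
print(predict(model, (1.1, 2.0)))
print(accuracy(model, HELD_OUT))
```

The held-out examples are kept separate from the training examples so that the measure phase reports how the model behaves on data it has not seen before.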

EXAMPLES OF MACHINE LEARNING IN EXISTING WORKFLOWS

Predictive text

This is probably the most widely used form of ML, freely available through Google, WhatsApp, Microsoft Outlook and numerous other software products. Predictive text that completes words has been around for some time, but more recently Google introduced ML to speed up the typing of emails in Gmail by offering whole sentences and curated responses. As with any ML, the more you use the predictive prompts (by selecting or editing the options offered), the more accurately they match your tone, your voice and your likely responses.
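As a rough illustration of the idea, the sketch below suggests the next word by counting which words have followed which in earlier messages. This is a simple bigram count, not how Gmail or any other product named above actually works, and the message history is invented.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count which word follows which across the example sentences."""
    following = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            following[current][nxt] += 1
    return following


def suggest(following, word, k=2):
    """Return up to k of the most frequent words seen after `word`."""
    return [w for w, _ in following[word.lower()].most_common(k)]


# Hypothetical message history; more history gives better suggestions.
history = [
    "thanks for the update",
    "thanks for the quick reply",
    "thanks for your help",
]
model = train_bigrams(history)
print(suggest(model, "for"))
```

Adding more messages to `history` changes the counts, which is the sense in which the suggestions improve with use.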

Smart search

This is particularly valuable for industries that work with enormous data sets, such as legal, insurance, accounting reporting and academic research. ML searches are able to pick up not just exact matches in your data set, but also broader and more nuanced near matches. In other words, searching is no longer about choosing the right combination of words, but about expressing the general intent and letting the ML discover similar and linked findings. At its best, this type of ML tool actually surpasses the skill of a human researcher who would otherwise be searching through mounds of documents with a highlighter, not just because it’s so much quicker, but because it can detect really refined and nuanced similarities. A great example of this is Doc Insights smart search software.
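The sketch below shows the basic principle of ranking documents by similarity to a query rather than by exact match. It uses a plain bag-of-words cosine similarity as a stand-in; commercial tools, including the one named above, are likely to use far richer language models, and the document names and contents here are invented.

```python
import math
import re
from collections import Counter


def vectorise(text):
    """Turn text into a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two count vectors (0.0 if either is empty)."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, documents, top_k=2):
    """Rank documents by similarity to the query, best match first."""
    q = vectorise(query)
    scored = [(cosine(q, vectorise(text)), name) for name, text in documents.items()]
    ranked = sorted(scored, reverse=True)[:top_k]
    return [name for score, name in ranked if score > 0]


# Hypothetical documents.
docs = {
    "lease.txt": "The tenant may end the lease early with notice",
    "invoice.txt": "Payment is due within thirty days of the invoice date",
    "memo.txt": "Notice of early lease termination was sent to the tenant",
}
print(search("early termination of lease by tenant", docs))
```

None of the documents contains the query as an exact phrase, yet the two lease-related documents are ranked ahead of the invoice.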

Smart chronology builder

Ordering findings from data sets can be a laborious task. Moreover, there is not always an obvious and quickly distinguishable date to align with the finding in question. A smart chronology builder sorts findings by date, or by another order of your choosing, and then offers backlinks to the original documents so you can cross-check and locate origins. This can save a lot of time in building a legal case, for example, where there might be more than one date attached to one document and a chronology of findings needs to be built. Doc Insights offers just this kind of smart chronology builder.
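A very simplified version of the idea can be sketched as follows. It assumes dates are written in the YYYY-MM-DD format and finds them with a regular expression, whereas an ML-based builder would be expected to cope with many date formats and ambiguous cases. The document names and contents are invented.

```python
import re
from datetime import date

# Assumption: dates appear as YYYY-MM-DD.
DATE_PATTERN = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")


def build_chronology(documents):
    """Return (date, document name, sentence) tuples sorted by date."""
    events = []
    for name, text in documents.items():
        for sentence in text.split(". "):
            for year, month, day in DATE_PATTERN.findall(sentence):
                try:
                    found = date(int(year), int(month), int(day))
                except ValueError:
                    continue  # skip impossible dates such as 2021-13-45
                # The document name acts as the backlink to the source.
                events.append((found, name, sentence.strip(". ")))
    return sorted(events)


# Hypothetical documents; one of them carries more than one date.
docs = {
    "letter.txt": "Contract signed on 2021-03-15. Notice served on 2021-09-01.",
    "email.txt": "Payment missed on 2021-06-30.",
}
for found, name, sentence in build_chronology(docs):
    print(found.isoformat(), name, sentence)
```

Each entry keeps the name of the document it came from, so a reader can go back to the source and cross-check.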

Smart analytics and reporting

This is when analytics or metrics based on a set of data are generated for you automatically. An example could be analytics on user activity offered in the backend of a website builder. Similarly, some eDiscovery tools offer interactive and appealing analytics and reporting based on the data set you choose to pull through the software. A great example is Zylab automated eDiscovery. This is incredibly useful for presenting information and for gaining insights into your business or research area. There is an interesting crossover here between linguistic and visual communication of data.

Anomaly usage detection 

This is used in security for corporations and larger businesses. It is a service offered by networking providers, an example being Cisco Systems. ML enables the operations and security teams in a business to observe and track patterns in device and software usage over time. When an anomaly in the pattern occurs, a red flag is raised, and the point at which the anomaly occurs is isolated for further investigation. Banks also use this in tracking spending habits. If you suddenly spend a huge amount of money in two countries you never normally visit, this is flagged as anomalous, and banks can automatically freeze and isolate the transactions for further investigation.
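The banking example can be sketched with a simple statistical rule: flag any new amount that lies more than a chosen number of standard deviations from the historical mean. This z-score check is only a baseline chosen for illustration, not the method used by any provider named above, and the spending figures and threshold are invented.

```python
from statistics import mean, stdev


def find_anomalies(history, new_values, threshold=3.0):
    """Flag new values that lie more than `threshold` standard
    deviations away from the mean of the historical values."""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # No variation in the history: anything different is unusual.
        return [v for v in new_values if v != mu]
    return [v for v in new_values if abs(v - mu) / sigma > threshold]


# Hypothetical card transactions in one currency.
usual_spend = [42.0, 38.5, 51.0, 45.2, 40.8, 47.3, 44.1]
print(find_anomalies(usual_spend, [46.0, 39.9, 2500.0]))
```

A production system would typically learn patterns across many signals at once (location, device, time of day), but the principle of comparing new behaviour against a learned baseline is the same.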

End point detection

Again, this is a crossover between ML and information security that is relevant to larger businesses. It involves monitoring ‘end point’ devices - laptops, mobile phones and tablets - to isolate errors or anomalous behaviour so as to pick up viruses or cyber attacks. The trawling through a huge amount of data on end point usage happens automatically and on an ongoing basis (and can be largely ‘private’, or hidden from other human eyes). Human intervention occurs only when the ML tool detects a break in the pattern.

Redaction of information

This is when natural language processing (an application of ML) is used to recognise particular information or fields across documents. Once recognised, these are redacted, meaning blocked out or removed from the document. It is particularly useful for law firms or governments who may need to publish a large volume of documents that may contain sensitive or private information. A redaction tool can search through the large volume of documents and redact sensitive information far more quickly than any human could. The documents can then be published with only the sensitive information omitted, while the rest of the information - fit for public view - is made public. There is an open-source ML-powered redaction tool available for free download, created by Hadeda AI.
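The mechanics of redaction can be sketched with two regular expressions, one for email addresses and one for phone numbers. This is a deliberately simple stand-in: it is not the Hadeda AI tool, and an NLP-based redactor would also recognise names, addresses and other fields that fixed patterns cannot capture. The patterns and sample text are assumptions made for illustration.

```python
import re

# Assumed patterns for two kinds of sensitive field.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}


def redact(text):
    """Replace every match of each pattern with a labelled placeholder."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label} REDACTED]", text)
    return text


print(redact("Contact Jane at jane.doe@example.com or +27 21 555 0199."))
```

Note that the name in the sample sentence is left untouched, which is exactly the kind of gap that a trained language model is meant to close.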


For more information about how to integrate machine learning into your workflow, get in touch with the Doc Insights team and book a demo! Doc Insights offers subscription solutions and custom enterprise solutions.