Natural Language Processing #NLP – Giving doctors the freedom to write what they want by Dr. Anuradha Monga

Healthcare produces more data records than any other industry. Over the past few decades, provider workflows have shifted substantially from capturing data in paper-based records to electronic capture and storage.

Electronic health records (EHRs) have clearly emerged as an innovative technology to facilitate the transition. However, despite these advancements, EHRs have not yet delivered credible benefits in population health management, health information exchange, patient care coordination and clinical analytics.

One of the biggest barriers to success with EHRs has been the disparate forms of data, which are difficult to aggregate and analyze. Doctors feel comfortable writing notes that follow the flow of their clinical thinking, but EHRs are not designed to capture medical information in a doctor’s natural language. This limitation often leads to poor EHR usability, and as a result a lot of valuable information is left outside the scope of analysis. Newer technologies may now make it possible to plug such gaps. Natural language processing (NLP) is one such technology, which providers are adopting in the hope of improving clinical outcomes and simplifying the daunting task of entering data into a computer.

Clinical data is not consistent, making analysis difficult

An EHR captures data in four main ways:

  • Clinical data is entered directly into pre-structured templates
  • Scanned documents are uploaded into the system
  • Text reports are transcribed by speech recognition technology or by dictation and manual data entry
  • Data is fed into the EHR through interfaces with other information systems such as laboratory systems, radiology systems or monitoring devices


Clinical data is usually presented in either a structured or an unstructured format. Structured data comes from constrained choices captured through templates such as physician order sets, drop-down menus and check boxes. Aggregating, analyzing and reporting on structured data is easier, but such data does not give the record an individualized, customized character. Unstructured data, on the other hand, consists of free-text narratives and clinical notes, i.e. doctors’ notes, patient encounter notes, patient health records etc., and lets physicians and patients record their observations, complaints and concepts in their own words. Unstructured data is a rich source of information about a patient’s health, but it is a challenge to transform it into structured, analyzable data that can be used to improve care outcomes. Natural language processing can help overcome this challenge.

Unstructured clinical notes are a mine of golden data; the wait to explore them ends with NLP


NLP is a data science based technology that extracts data from free text. Clinicians can use it to convert medical notes into structured, standardized formats. Automated processing of textual data can help providers use clinical documentation data for a variety of purposes, including but not limited to:

  • Improving communication between healthcare teams, and thereby outcomes
  • Reducing the overhead costs of clinical documentation
  • Improving revenues by automating coding and documentation
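
To make the idea of converting a free-text note into a structured record concrete, here is a minimal sketch. It uses simple regular expressions rather than a real NLP engine, and the sample note, the function name `extract_vitals` and the output field names are all assumptions made for illustration; a production system would need far more robust handling.

```python
import re

# Hypothetical free-text note, invented for illustration only.
NOTE = "Pt c/o chest pain x 3 days. BP 150/95 mmHg, HR 88 bpm. Denies fever."


def extract_vitals(note: str) -> dict:
    """Pull blood pressure and heart rate out of a free-text note.

    Returns a dict with only the fields that were found.
    """
    record = {}
    # "BP 150/95" -> systolic and diastolic values
    bp = re.search(r"\bBP\s*(\d{2,3})\s*/\s*(\d{2,3})", note, re.IGNORECASE)
    if bp:
        record["systolic_mmhg"] = int(bp.group(1))
        record["diastolic_mmhg"] = int(bp.group(2))
    # "HR 88" -> heart rate
    hr = re.search(r"\bHR\s*(\d{2,3})", note, re.IGNORECASE)
    if hr:
        record["heart_rate_bpm"] = int(hr.group(1))
    return record


print(extract_vitals(NOTE))
# {'systolic_mmhg': 150, 'diastolic_mmhg': 95, 'heart_rate_bpm': 88}
```

Once the values sit in named fields, they can be aggregated and reported on like any other structured data.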


Computers can be given the ability to infer the intended meaning of words, thus enabling them to identify trends and patterns in huge datasets. 

NLP can change the course of the way chronic diseases are managed:

One of the most promising areas for NLP use cases in healthcare is predictive analytics and risk scoring. Carefully deployed AI tools can be used for risk stratification and for identifying hotspots in chronic diseases.

NLP can be used to tag socioeconomic terms hidden in free text notes to identify the social determinants of health. This can be augmented with machine learning to develop risk scores by proactive identification of trends from clinical and social data, laboratory reports, diagnoses etc. It is possible to create algorithms and train them on clinical record data to identify disease symptoms accurately. 
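
As a rough illustration of tagging socioeconomic terms in free text, the sketch below matches a note against a small keyword lexicon. The categories, the terms and the function name `tag_sdoh` are hypothetical; a real lexicon would be curated by clinicians, and the tags would then feed a separately trained risk model.

```python
import re

# Hypothetical lexicon of social-determinant terms, for illustration only.
SDOH_LEXICON = {
    "housing": ["homeless", "evicted", "shelter"],
    "employment": ["unemployed", "lost his job", "lost her job"],
    "food_security": ["food bank", "skips meals"],
}


def tag_sdoh(note: str) -> dict:
    """Return {category: [matched terms]} for terms found in the note."""
    text = note.lower()
    tags = {}
    for category, terms in SDOH_LEXICON.items():
        # Whole-word (or whole-phrase) matches only.
        hits = [t for t in terms
                if re.search(r"\b" + re.escape(t) + r"\b", text)]
        if hits:
            tags[category] = hits
    return tags


note = ("Patient is unemployed and currently living in a shelter; "
        "skips meals to afford insulin.")
print(tag_sdoh(note))
# {'housing': ['shelter'], 'employment': ['unemployed'], 'food_security': ['skips meals']}
```

Keyword matching of this kind misses negation ("not homeless") and paraphrase, which is one reason it is usually combined with machine learning.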

Clinical records are a rich source of information regarding the symptoms of many diseases. Grouping of such similar symptoms can help in syndrome identification on the basis of disease presentation. As a result, it may be possible to unearth clusters which may otherwise not be suspected. Routinely available information in electronic health records, such as demographic and geographical location data and primary care free-text clinical records should be leveraged while making use of such algorithms.  
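
One simple way to group records with similar symptoms is to compare their symptom sets with Jaccard similarity. The sketch below is an assumption-laden toy: the records, the 0.5 threshold and the greedy grouping strategy are all chosen for illustration, and it assumes symptoms have already been extracted from the notes.

```python
def jaccard(a: set, b: set) -> float:
    """Share of symptoms two records have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_records(records: dict, threshold: float = 0.5) -> list:
    """Greedy single-pass grouping.

    A record joins the first group whose seed record is similar enough;
    otherwise it starts a new group.
    """
    groups = []  # each entry is (seed_symptoms, [record_ids])
    for rec_id, symptoms in records.items():
        for seed, members in groups:
            if jaccard(seed, symptoms) >= threshold:
                members.append(rec_id)
                break
        else:
            groups.append((symptoms, [rec_id]))
    return [members for _, members in groups]


# Hypothetical symptom sets, invented for illustration only.
RECORDS = {
    "rec1": {"fever", "cough", "fatigue"},
    "rec2": {"fever", "cough", "sore throat"},
    "rec3": {"joint pain", "morning stiffness"},
    "rec4": {"joint pain", "morning stiffness", "fatigue"},
}

print(group_records(RECORDS))
# [['rec1', 'rec2'], ['rec3', 'rec4']]
```

Demographic or location fields could be added to each record's set in the same way, so that groups reflect where and in whom a presentation occurs as well as what it looks like.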

Why off the shelf NLP engines may not be what the doctors want:


While it sounds easy, free-text healthcare data comes with its own challenges. Word sense ambiguity is perhaps one of the hardest problems amid the noise of free-text clinical notes. Accurately translating notes into structured patient information about medical procedures, symptoms, tests etc. depends on the algorithm’s ability to assign the correct interpretation to each relevant medical term. For example, doctors use the acronym RA in different contexts with different meanings: it can stand for right atrium, right arm or rheumatoid arthritis, depending on the case presentation and clinical context.
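
The RA example can be sketched as a context-overlap rule: pick the sense whose cue words appear most often around the acronym. The cue lists and the function name `disambiguate_ra` are hypothetical and hand-written for illustration; this is a simplified stand-in for word sense disambiguation, not a clinical-grade method.

```python
import re

# Hypothetical context cue words for each sense of "RA".
SENSE_CUES = {
    "rheumatoid arthritis": {"joint", "joints", "stiffness",
                             "methotrexate", "swelling"},
    "right atrium": {"echo", "echocardiogram", "dilated",
                     "valve", "ventricle"},
    "right arm": {"fracture", "laceration", "cuff", "forearm", "wrist"},
}


def disambiguate_ra(sentence: str):
    """Return the sense with the most cue words in the sentence.

    Returns None when no cue word is present. Ties go to the sense
    listed first in SENSE_CUES.
    """
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    scores = {sense: len(cues & words) for sense, cues in SENSE_CUES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(disambiguate_ra("Echo shows dilated RA and normal valve function."))
# right atrium
print(disambiguate_ra("Known RA on methotrexate, morning stiffness in joints."))
# rheumatoid arthritis
print(disambiguate_ra("BP cuff placed on RA."))
# right arm
```

Hand-written cue lists do not scale across specialties or institutions, which is why an off-the-shelf engine tends to struggle with local abbreviations.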

Automatically disambiguating the senses of acronyms, symbols and words used in a doctor’s clinical notes can significantly reduce the human effort needed to develop more accurate systems. A data-driven approach, in which an algorithm infers patterns from data, should combine a supervised and an unsupervised learning phase to yield benefits. In supervised learning, every item of the training data is labeled with the correct answer. In unsupervised learning, on the other hand, the computer recognizes patterns automatically, without labels. The true potential of an NLP and machine learning algorithm can only be harnessed when it is trained on data from the provider’s own environment.
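
To show what "labeled with the correct answer" means in practice, here is a minimal sketch of supervised learning: a tiny naive Bayes classifier with add-one smoothing, trained on a handful of labeled sentences. The training sentences, labels and function names are invented for illustration; naive Bayes is simply one common choice, not a method named by this article, and a real system would be trained on notes from the provider’s own environment.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


def train(labeled_examples):
    """labeled_examples: list of (sentence, sense_label) pairs."""
    word_counts = defaultdict(Counter)  # label -> word frequencies
    label_counts = Counter()            # label -> number of examples
    vocab = set()
    for sentence, label in labeled_examples:
        tokens = tokenize(sentence)
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocab.update(tokens)
    return word_counts, label_counts, vocab


def predict(model, sentence):
    """Return the label with the highest naive Bayes log-probability."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total)  # prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(sentence):
            if tok in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical labeled training data: each sentence carries its sense.
TRAINING = [
    ("RA with joint swelling and morning stiffness", "rheumatoid arthritis"),
    ("RA flare, started methotrexate", "rheumatoid arthritis"),
    ("echo shows dilated RA", "right atrium"),
    ("RA pressure elevated on catheterization", "right atrium"),
]

model = train(TRAINING)
print(predict(model, "dilated RA on echo"))         # right atrium
print(predict(model, "joint stiffness, known RA"))  # rheumatoid arthritis
```

Because the classifier only knows the words it was trained on, retraining it on a given provider’s notes is what lets it pick up local abbreviations and writing habits.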

Word sense disambiguation based NLP plays a significant role in improved analytics and patient outcomes:

Language processing based on word sense disambiguation lets the computer mine clinical documents accurately, which can bridge gaps in documentation and aid clinical decision support and clinical documentation improvement programs.

Less ambiguity in clinical data makes more insightful data extraction possible. When the computer can infer the intended meaning of words, it can easily find useful patterns in heaps of data. IBM’s Watson supercomputer technology is an apt example of how NLP can facilitate meaningful analytics by identifying such patterns. IBM’s content analytics process is used to collect and analyze structured and unstructured data. Its similarity analytics uses NLP and machine learning to analyze a large number of variables in a patient’s medical history and present condition, identify patterns and compare them with similar conditions and potential outcomes.

There is no doubt that word sense disambiguation enabled NLP technology can have a potentially huge impact on clinical data analytics, thanks to its ability to infer the meanings of extracted data more accurately. Data analytics for improved patient outcomes is not the only benefit of this technology; it can also support billing accuracy. By supporting clinical documentation improvement programs, it can also help improve clinical workflows.

SymptomAI by “PredictDisease” is a healthcare analytics platform driven by artificial intelligence, NLP and machine learning. It assists patients and primary care physicians by measuring the potential risk of a chronic disease that starts with minor symptoms. The platform leverages data from lifestyle activities, social media and website forums, scientific research papers and family history, matching these with known signs, symptoms and other demographic characteristics for the early detection of chronic disease. It takes social and biological determinants of health into account to predict the risk score. Visit us at www.predictdisease.com or write to us at info@predictdisease.com for more info.

Author
Dr. (Maj) Anuradha Monga

A versatile military veteran with expertise in healthcare management, Anuradha has acquired real world experience in areas of Hospital operations, Health insurance claims management and mass insurance, Healthcare IT, NABH implementation and digital marketing.
