dc.description.abstract |
Cardiovascular diseases remain the leading cause of mortality in the Philippines,
with ischemic heart disease as the most prevalent condition. At the Philippine
General Hospital (PGH), echocardiography is a critical diagnostic tool used in
managing these conditions; however, the process of documenting and managing
echocardiographic reports remains inefficient and labor-intensive, relying on paper-based
systems and manual data transcription into the OpenMRS platform. This
study aimed to develop a web-based system that automates the extraction and
structuring of 2D echocardiographic data from scanned PDF reports to improve
workflow efficiency and data usability for clinical and research purposes.
The system, built using Laravel and integrated with Tesseract OCR and OpenCV,
allows users to upload scanned PDF files, preprocess images (grayscale conversion,
binarization, and resizing), define Regions of Interest (ROI), and extract structured
data. Comparative analysis showed that while Tesseract OCR benefitted
from preprocessing, PyMuPDF produced more accurate outputs, even from image-based
PDFs. The system also enables automatic generation of patient summary
reports and export of structured data in CSV format.
The results demonstrate that combining OCR with manually defined Regions of
Interest significantly enhances the accuracy of text extraction from
medical documents. This solution reduces administrative workload, minimizes human
error, and provides structured datasets for advanced research and potential
AI-driven applications—ultimately improving the quality of cardiovascular care at
PGH. |
en_US |