From episodes of care to diagnosis codes: automatic text categorization for medico-economic encoding

Ruch, Patrick ; Gobeill, Julien ; Tbahriti, Imad ; Geissbühler, Antoine

In: AMIA 2008 Symposium Proceedings, 2008, p. 636-640


    Summary
    We report on the design and evaluation of an original system to help assign ICD (International Classification of Diseases) codes to clinical narratives. The task is defined as a multi-class, multi-document classification task. We combine a set of machine learning and data-poor methods to generate a single automatic text categorizer, which returns a ranked list of ICD codes. The combined ranking system currently obtains a precision of 75% at high ranks and a recall of about 63% for the top twenty returned codes, against a theoretical upper bound of about 79% (inter-coder agreement). The performance of the data-poor classifier is weak, whereas the use of temporal features such as anamnesis and prescription contents results in a statistically significant improvement.
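
The summary reports precision at high ranks and recall over the top twenty returned codes. As a minimal sketch of how such rank-based measures are commonly computed, assuming the standard definitions (the record does not give the exact formulas used in the paper), the following Python functions score one ranked list against a set of reference codes. The function names and the sample ICD-10 codes are illustrative assumptions, not data from the study.

```python
def precision_at_k(ranked, gold, k):
    """Fraction of the top-k returned codes that appear in the reference set."""
    top = ranked[:k]
    if not top:
        return 0.0
    return sum(1 for code in top if code in gold) / len(top)


def recall_at_k(ranked, gold, k):
    """Fraction of the reference codes that appear among the top-k returned codes."""
    if not gold:
        return 0.0
    return len(set(ranked[:k]) & set(gold)) / len(gold)


# Hypothetical example: one episode of care, a ranked list from a categorizer,
# and the codes assigned by a human coder (illustrative values only).
ranked = ["I10", "E11.9", "J18.9", "N39.0", "I50.9"]
gold = {"I10", "J18.9", "K21.9"}

p_at_1 = precision_at_k(ranked, gold, 1)
p_at_5 = precision_at_k(ranked, gold, 5)
r_at_5 = recall_at_k(ranked, gold, 5)
```

In a full evaluation these per-episode values would be averaged over all episodes of care, and recall at twenty would use `k=20`. Because human coders themselves disagree, inter-coder agreement (about 79% here) rather than 100% acts as the practical ceiling for recall.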