MIDRC/ACR radiology report deidentifier
Overview
This radiology report de-identifier model takes one CSV, XLSX, or JSON file as input. The input file must contain the following headers: Facility ID, Patient ID, Accession Number, Report. The model generates one output file of the same file type as the input, with the following headers: Facility ID, Patient ID, Accession Number, De-ID Report. If an error occurs during the model run, no output file is generated. A log file is written to the output directory for each run.
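The header requirement above can be checked before a run. The following is a minimal sketch and not part of the de-identifier itself: it assumes a CSV input, assumes header names must match exactly (including case and spacing), and uses the hypothetical helper names `missing_headers` and `check_csv`. The sample row is made up for illustration.

```python
import csv
import io

# Headers the Overview lists as required in the input file.
REQUIRED_HEADERS = ["Facility ID", "Patient ID", "Accession Number", "Report"]


def missing_headers(fieldnames):
    """Return the required headers that are absent from `fieldnames`.

    Assumption: names must match exactly (case and spacing included).
    """
    present = set(fieldnames or [])
    return [h for h in REQUIRED_HEADERS if h not in present]


def check_csv(text):
    """Read the header row of CSV text and list any missing required headers."""
    reader = csv.DictReader(io.StringIO(text))
    return missing_headers(reader.fieldnames)


if __name__ == "__main__":
    # Made-up sample: every field except "Report" is left blank.
    sample = (
        "Facility ID,Patient ID,Accession Number,Report\n"
        ",,,Placeholder report text.\n"
    )
    print(check_csv(sample))  # prints []
```

An empty list means all four required headers are present. The same `missing_headers` check could be applied to the column names of an XLSX sheet or the keys of a JSON record once those are loaded.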
Setup
Create a folder to be used for your outputs.
Create a folder to be used for intermediate file types (referred to below as scratch).
Ensure the fields in your CSV, XLSX, or JSON input match the provided examples. Note: all fields except "Report" can be blank if desired.
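The setup steps above can be scripted. This is a rough sketch under stated assumptions: the folder names `input`, `output`, and `scratch`, the file name `reports.csv`, and the helper `prepare_run` are all hypothetical, and the single row is placeholder text rather than one of the provided examples. It creates the folders and writes a one-row CSV with the four required headers, leaving everything except "Report" blank.

```python
import csv
from pathlib import Path

# Required input headers, as listed in the Overview.
HEADERS = ["Facility ID", "Patient ID", "Accession Number", "Report"]


def prepare_run(base_dir):
    """Create input, output, and scratch folders and a sample CSV under base_dir.

    All folder and file names here are hypothetical choices for illustration.
    """
    base = Path(base_dir)
    input_dir = base / "input"
    output_dir = base / "output"
    scratch_dir = base / "scratch"
    for folder in (input_dir, output_dir, scratch_dir):
        # exist_ok=True makes the script safe to re-run.
        folder.mkdir(parents=True, exist_ok=True)

    input_file = input_dir / "reports.csv"
    with input_file.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=HEADERS)
        writer.writeheader()
        # Only "Report" is filled in; the other fields may be blank.
        writer.writerow(
            {
                "Facility ID": "",
                "Patient ID": "",
                "Accession Number": "",
                "Report": "Placeholder report text.",
            }
        )
    return input_file, output_dir, scratch_dir


if __name__ == "__main__":
    input_file, output_dir, scratch_dir = prepare_run("deid_run")
    print(input_file, output_dir, scratch_dir)
```

The returned paths can then be supplied to whichever run method is used.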
Run as a Docker container
After building the Docker image with the included Dockerfile (tagged de_id, as the command below expects), run:
docker run \
    -v /path/to/input:/input \
    -v /path/to/output:/output \
    -v /path/to/scratch:/scratch \
    --gpus all de_id --input_filepath "path\to\connect\input_file"
Run as a Windows Python executable
After unzipping the executable, open cmd and run:
cd extract-folder/dist/De-id
De-id.exe --device_list "cpu" --input_filepath "path\to\input_file" --output_dir "path\to\output_dir" --scratch_dir "path\to\scratch_dir"