Data Bytes - Extracting Data from PDFs with Python

Monday, October 7, 2024
12:00pm - 1:00pm
Event URL will be sent via registration email.

Registration is required. There are 31 seats available.

Ever found yourself with a collection of information-rich PDFs that you wish you could easily combine into an analysis-ready dataset? Join Johns Hopkins Data Services for this Data Bytes session, where we will give an overview of the kinds of data that may be present in PDFs and demo several Python packages that you can use to extract and combine that data.

Data Bytes are short, data-related talks hosted by Data Services and offered during lunch on Mondays. Visit the Data Bytes Schedule to see all Data Bytes sessions this fall.