J. Kamal, Ohio State University Medical Center (OSUMC) data warehouse
13+ year project
Information Warehouse - IW
IW Data Marts: Business, Clinical, Research, External
Started with financial.
Web scorecards, dashboards, NLP, data mining.
Powerful ad hoc query tools.
Oracle BI Discoverer.
iCount - user query tool for aggregates: returns aggregate counts of patient cohorts to define patient populations for research grants; tied to standard terminology; all users have access. No IRB needed.
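A minimal sketch of the kind of aggregate-only cohort count described for iCount. The notes do not describe how iCount is built, so the `diagnosis` table, its columns, the sample codes, and the `cohort_count` helper are all hypothetical. The point is only that the caller gets a count back and never sees patient-level rows.

```python
import sqlite3


def cohort_count(conn, code_prefix):
    """Count distinct patients with a diagnosis code starting with code_prefix.

    Only the aggregate is returned; no patient-level rows leave the function.
    """
    row = conn.execute(
        "SELECT COUNT(DISTINCT patient_id) FROM diagnosis WHERE code LIKE ? || '%'",
        (code_prefix,),
    ).fetchone()
    return row[0]


# Hypothetical in-memory data for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE diagnosis (patient_id INTEGER, code TEXT)")
conn.executemany(
    "INSERT INTO diagnosis VALUES (?, ?)",
    [(1, "250.00"), (1, "250.02"), (2, "250.00"), (3, "401.9")],
)

print(cohort_count(conn, "250"))  # 2 (patients 1 and 2)
```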
IW has >4K users
Queries increasing exponentially.
Users: Administrators, managers, clinicians, researchers, finance.
50 dashboards of approx 1K indicators. Powerful visual reports.
Research portal, Investigator profile
CIO dashboard.
MRSA reporting
Antimicrobial
Restraint Orders
Falls
Diabetes mgmt
Genetic data storage and analysis
Patient tracker automates patient flow recording
Managers notified of idle processes
Data driven issue awareness
More scorecards
Physician Profiling system
De-identified IW
Hagop Mekhjian MD Chief Medical Officer
Posters 14, 41, 102, 46, 91, ... (11+ posters)
IW at OSUMC has Honest Broker status. Two protocols (internal + external); worked with IRB and Research Office to
make processes robustly meet the criteria for non-human subject research.
C. Weng et al. Comparing Effectiveness of Clinical Registry vs. Clinical Data Warehouse for Supporting Clinical Trial Recruitment: Case Study
Columbia University
TECOS: Trial Evaluating Cardiovascular Outcomes with Sitagliptin
Diabetes Registry created in 2005, contains 5K patients, few variables A1C, urine microalb., LDL chol
vs.
Clinical Data Warehouse
Too many false positives recruiting with Registry, PI turned to Informatics/Data Warehouse
Warehouse had 2X the # of variables of interest vs. Registry (8/12 vs. 4/12).
Temporarily Ineligible (awaiting result)
Definitely Ineligible
Potentially Eligible > recent visit, must be referred by their primary care provider.
Confirmed Eligible
Target n=60
                      Registry     Warehouse
Potentially Eligible  2033         100
Confirmed Eligible    29 (6.6%)    31 (31%)
Nonconsent rate       Similar      Similar
No. working days      74           59
Patients/wk           1            2.5
Total in 3 months     14           30
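The Patients/wk figures appear consistent with dividing the 3-month totals by the number of working days. The sketch below checks that arithmetic; the five-day working week is an assumption, since the notes do not say how the weekly rate was computed.

```python
WORKING_DAYS_PER_WEEK = 5  # assumption: not stated in the notes


def weekly_rate(patients, working_days):
    """Patients recruited per working week."""
    return patients / (working_days / WORKING_DAYS_PER_WEEK)


registry = weekly_rate(14, 74)   # 14 patients over 74 working days
warehouse = weekly_rate(30, 59)  # 30 patients over 59 working days

print(round(registry, 2), round(warehouse, 2))  # 0.95 2.54
```

Rounded, these match the reported 1 and 2.5 patients per week.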
Registry lacked rich clinical data for evaluating exclusion criteria
Columbia study site rated #3 of 300+ sites worldwide.
Warehouse needs sophisticated query skills.
Registry has better quality for disease-specific markers.
CDW more effective in excluding the ineligible.
Dynamically generated registries linked to CDW can lead to promising recruitment solutions.
Planning a dynamic Protocol-specific screening tool that utilizes CDW and Registry
PI Weng NLM R01
Payne P. TRITON Project
Integrative Translational Research Information Management Platform
CLL Research Consortium
NCI funded P01 for CLL
n=5,000 cohort
Objectives:
Modernize legacy information system
Provide tools to increase research productivity
-data exchange
-bio-specimen mgmt
-adverse event detection
-protocol driven Decision support
-integrative query and data aggregation
Provide community access to technologies
Systems Design:
Open source, standards compliant
portal based approach
caTissue
caAERS
caGRID
SQL
MySQL
GWT environment "Google look and feel"
Current Project Status
billing, budgeting, trial mgmt, close-out and reporting
Model-driven design/architecture:
Real-world protocols > represent as a logical model > map to standards and data sources
Some lessons:
Don't use just PIs - get research staff "on the ground".
Use Project Management methods.
Organization perception is just as important as technical functionality.
Mining Clinical Data using Minimal Predictive Rules
Batal I. and Hauskrecht M., University of Pittsburgh
Rule Induction methods represent knowledge
Association Rules more complete than greedy algorithms - e.g. trees; generate a lot of rules
Add correlation measure - statistical test or interestingness measure (evaluates each rule individually)
We should consider the nested structure of the rules.
Minimal Predictive Rule [MPR]:
predicts class significantly better than the subrules
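A minimal sketch of the MPR check: a rule "antecedent => class" is kept only if it predicts the class significantly better than every subrule, including the empty rule (the class prior). The notes do not record which significance test the authors use; the one-sided binomial test, the `alpha` threshold, the row layout, and all function names below are assumptions made for illustration.

```python
from itertools import combinations
from math import comb


def binom_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def coverage(rows, antecedent, target):
    """Return (rows covered by antecedent with label == target, rows covered)."""
    covered = [r for r in rows if antecedent <= r["items"]]
    hits = sum(1 for r in covered if r["label"] == target)
    return hits, len(covered)


def is_mpr(rows, antecedent, target, alpha=0.05):
    """True if the rule beats every proper subrule at significance level alpha."""
    hits, n = coverage(rows, antecedent, target)
    if n == 0:
        return False
    items = sorted(antecedent)
    for size in range(len(items)):  # proper subsets only, empty set included
        for sub in combinations(items, size):
            sub_hits, sub_n = coverage(rows, frozenset(sub), target)
            p_sub = sub_hits / sub_n  # subrule confidence; sub_n >= n > 0
            if binom_tail(hits, n, p_sub) >= alpha:
                return False  # not significantly better than this subrule
    return True


def make_rows(items, positives, negatives):
    """Build toy rows: each has a frozenset of items and a 0/1 label."""
    return ([{"items": frozenset(items), "label": 1}] * positives
            + [{"items": frozenset(items), "label": 0}] * negatives)


# Toy data: item "a" is predictive of class 1, item "b" adds nothing.
rows = (make_rows("ab", 9, 1) + make_rows("a", 9, 1)
        + make_rows("b", 1, 9) + make_rows("", 1, 9))

print(is_mpr(rows, frozenset({"a"}), 1))       # True
print(is_mpr(rows, frozenset({"a", "b"}), 1))  # False
```

In the toy data, "a => 1" beats the prior, while "a, b => 1" is no better than its subrule "a => 1" and is therefore dropped. This is how the nested structure of the rules removes redundant ones.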
Lossy pruning speeds the mining, at the risk of missing some MPRs.
A-MPR prunes 98% of the search space without changing AUC.
An efficient algorithm for approximate mining.