TrialTwin :: Open Data Repository

Regulatory Repository | 30+ Years of Historical Data

We have built an integrated repository of regulatory data in the Life Sciences space.

Using Open Data allows users to trace the entire lifecycle of a drug or medical device:
  • Starting with chemical compounds (NLM's PubChem)
  • Through clinical trials (ClinicalTrials.gov, WHO's ICTRP)
  • Documentation on the regulatory pathway (IND, NDA, etc.)
  • Reported adverse events (FDA's FAERS / MAUDE)
  • Manufacturer payments to providers (HHS' OpenPayments)
  • Medicare reimbursement data (CMS' Provider Utilization and Payment Data)

The Repository offers users a 360-degree view of each previously cleared drug or medical device.
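To make the lifecycle idea concrete, here is a minimal sketch of how records from different Open Data sources could be lined up into one timeline per product. The `LifecycleEvent` class, the `build_timeline` function, the normalized product key, and every sample record are hypothetical and invented for illustration; this is not a description of how the Repository is actually implemented.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class LifecycleEvent:
    """One record about a product, tagged with the Open Data source it came from."""
    product: str   # hypothetical product key used to link sources together
    source: str    # e.g. "PubChem", "ClinicalTrials.gov", "FAERS"
    stage: str     # lifecycle stage that the source describes
    when: date
    summary: str


def build_timeline(events, product):
    """Return all events for one product, oldest first (case-insensitive match)."""
    key = product.strip().lower()
    return sorted(
        (e for e in events if e.product.strip().lower() == key),
        key=lambda e: e.when,
    )


# Placeholder records, invented purely for illustration.
events = [
    LifecycleEvent("exampledrug", "FAERS", "adverse events",
                   date(2015, 6, 1), "placeholder report"),
    LifecycleEvent("exampledrug", "PubChem", "compound",
                   date(2001, 3, 9), "placeholder compound entry"),
    LifecycleEvent("otherdrug", "OpenPayments", "payments",
                   date(2018, 1, 15), "placeholder payment"),
    LifecycleEvent("ExampleDrug", "ClinicalTrials.gov", "clinical trial",
                   date(2008, 11, 20), "placeholder trial"),
]

for event in build_timeline(events, "ExampleDrug"):
    print(event.when.isoformat(), event.source, event.stage, sep=" | ")
```

In practice the hard part is the linking key: each source identifies products differently, so a real pipeline would need a mapping step before a simple match like this one can work.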


Medicine Navigator

Once we have integrated all the Open Data related to a drug, we will build a navigation mechanism that lets users explore this massive amount of data through personalized "views".

We think these "views" would be a good starting point:
  • Intellectual Property View: relevant data extracted from the US Patent and Trademark Office

  • Chemical View: compound-level information

  • Payments & Providers View: this US-specific view displays the money flows in the US healthcare system

  • Research View: publications relevant to the specific drug

  • Regulatory View: data gathered across offices of regulatory agencies

  • Foreign Views: country-specific data from sources outside the US (initially Mexico and Spain)
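One simple way to think about these "views" is as named filters over the same pool of integrated records. The sketch below assumes each record carries a `source` field. The `VIEW_SOURCES` mapping, the `select_view` function, and the sample records are hypothetical, and the assignment of sources to views is only a guess based on the list above.

```python
# Hypothetical mapping from view name to the sources that feed it.
VIEW_SOURCES = {
    "Intellectual Property": {"USPTO"},
    "Chemical": {"PubChem"},
    "Payments & Providers": {"OpenPayments", "CMS"},
    "Research": {"NLM"},
    "Regulatory": {"FDA"},
    "Foreign: Spain": {"AEMPS"},
}


def select_view(records, view_name):
    """Keep only the records whose source belongs to the requested view."""
    try:
        sources = VIEW_SOURCES[view_name]
    except KeyError:
        raise ValueError(f"unknown view: {view_name!r}") from None
    return [r for r in records if r["source"] in sources]


# Placeholder records, invented purely for illustration.
records = [
    {"source": "USPTO", "title": "placeholder patent record"},
    {"source": "PubChem", "title": "placeholder compound record"},
    {"source": "AEMPS", "title": "placeholder Spanish record"},
]

print(select_view(records, "Chemical"))
```

Keeping the views as data rather than code would make it straightforward to add a country-specific view later by adding one entry to the mapping.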

TrialTwin - Medicine Navigator

Spain

Here's a view of a type of Open Data sourced from Spain's regulatory agency, AEMPS.

TrialTwin - Open Data Repository - Spain's data

FDA's 510(k)

We download and process thousands of PDFs from the FDA and extract all the text from those documents.
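As a rough illustration of this kind of processing, the sketch below extracts text from a folder of PDFs and tidies the result. It assumes the third-party `pypdf` library, which the page does not say is used. The function names and folder layout are hypothetical, and scanned PDFs without a text layer would additionally need OCR, which is not shown.

```python
import re
from pathlib import Path


def extract_pdf_text(path):
    """Extract raw text from one PDF using pypdf (an assumption; imported lazily)."""
    from pypdf import PdfReader  # third-party: pip install pypdf

    reader = PdfReader(str(path))
    # Image-only pages may yield an empty string, so fall back to "".
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def normalize_text(raw):
    """Tidy extracted text: rejoin hyphenated line breaks and collapse whitespace."""
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", raw)  # "sub-\nstantial" -> "substantial"
    text = re.sub(r"[ \t]+", " ", text)           # runs of spaces/tabs -> one space
    text = re.sub(r" ?\n ?", "\n", text)          # drop spaces around newlines
    text = re.sub(r"\n{3,}", "\n\n", text)        # keep at most one blank line
    return text.strip()


def process_folder(folder):
    """Yield (file name, cleaned text) for every PDF directly inside a folder."""
    for pdf_path in sorted(Path(folder).glob("*.pdf")):
        yield pdf_path.name, normalize_text(extract_pdf_text(pdf_path))
```

Note that rejoining words across line breaks is a heuristic: it also merges genuinely hyphenated terms that happen to wrap at the hyphen, so a production pipeline may want a dictionary check at that step.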

TrialTwin - Open Data Repository - FDA's 510(k)

EMA - Summary of Opinions

Here's a view of a type of Open Data sourced from the European Union's regulatory agency, EMA.

TrialTwin - Open Data Repository - EMA's Summary of Opinions

Open Data to Train AI, ML

The data stored in our Open Data Repository module can be used to train AI / ML models. Think of this data as the "Ground Truth" of what has happened in the US pharma space over the last 30 years.

Users can leverage Life Sciences-specific regulatory documents crafted before the age of "AI", including:
  • 40,000+ Protocols, Statistical Analysis Plans (SAPs), Informed Consent Forms (ICFs)

  • 70,000+ FDA application files

  • 110,000 full FDA labels ("SPL")

Users can train their models on the text extracted from all those documents, which contains 600+ million words.
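For readers wondering what preparing such text for training could look like, here is a minimal sketch that counts words and packs documents into JSON Lines, a format many training pipelines accept. The record fields, the `count_words` and `to_jsonl` functions, and the sample documents are all hypothetical; the word-counting rule is one reasonable choice among several and would not necessarily reproduce the figure quoted above.

```python
import json
import re

# A "word" here is a run of letters/digits, optionally joined by hyphens or apostrophes.
WORD_RE = re.compile(r"\w+(?:[-']\w+)*")


def count_words(text):
    """Count words in a piece of extracted text."""
    return len(WORD_RE.findall(text))


def to_jsonl(documents):
    """Turn (doc_id, doc_type, text) tuples into JSONL plus a total word count."""
    lines = []
    total_words = 0
    for doc_id, doc_type, text in documents:
        words = count_words(text)
        total_words += words
        record = {"id": doc_id, "type": doc_type, "words": words, "text": text}
        lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines), total_words


# Placeholder documents, invented purely for illustration.
documents = [
    ("doc-1", "protocol", "Placeholder protocol text."),
    ("doc-2", "label", "Placeholder label text, not a real label."),
]

jsonl, total_words = to_jsonl(documents)
print(jsonl)
print("total words:", total_words)
```

Storing the document type alongside the text lets a user train on one category, such as labels only, without reprocessing the corpus.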

We can also include additional Open Data from other US agencies, including:
  • CMS – Medicare

  • HHS – healthcare

  • NLM – research and publication references

Contact us

Please contact us for more details.