This project is an ongoing effort to create language-agnostic semantic models of data science code, with the goal of promoting knowledge sharing, automation, and intelligent tooling in the data science community. Elements of the project include:
Models of data science code
Data Science Ontology
The Data Science Ontology is a knowledge base about data science with a focus on computer programming. The concepts of the ontology are drawn from statistics, machine learning, and the practice of software engineering for data science. Besides cataloging and organizing data science concepts, the ontology provides semantic annotations of commonly used Python and R packages, such as pandas and scikit-learn in Python and the stats package in R.
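As a rough illustration of what a semantic annotation could look like, the sketch below maps concrete functions in pandas, scikit-learn, and R stats to language-agnostic concept names. The `Annotation` class, the `lookup` helper, and the concept identifiers (`read-tabular-file`, `k-means`) are hypothetical names chosen for this example. They are not taken from the ontology itself, whose actual schema may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Annotation:
    """Hypothetical record linking a concrete function to an abstract concept."""
    language: str  # e.g. "python" or "r"
    package: str   # e.g. "pandas"
    name: str      # function or class name within the package
    concept: str   # illustrative concept identifier, not an official one


# Illustrative entries only; the concept names are made up for this sketch.
ANNOTATIONS = [
    Annotation("python", "pandas", "read_csv", "read-tabular-file"),
    Annotation("python", "sklearn.cluster", "KMeans", "k-means"),
    Annotation("r", "stats", "kmeans", "k-means"),
]


def lookup(language: str, package: str, name: str) -> Optional[str]:
    """Return the concept annotated for a function, or None if unannotated."""
    for ann in ANNOTATIONS:
        if (ann.language, ann.package, ann.name) == (language, package, name):
            return ann.concept
    return None


# Implementations in different languages resolve to the same concept.
same_concept = lookup("python", "sklearn.cluster", "KMeans") == lookup("r", "stats", "kmeans")
```

The point of the sketch is the last line: once code in either language is annotated, tools can reason about the shared concept rather than about package-specific names.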
Catlab.jl is an experimental programming framework and computer algebra system for applied category theory, written in Julia. The focus is on monoidal categories and wiring diagrams (also known as string diagrams), with support for manipulating, normalizing, serializing, and visualizing morphisms in monoidal categories.
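To make the idea of morphisms in a monoidal category more concrete, here is a minimal Python sketch of boxes with typed input and output wires, supporting sequential composition and the monoidal product (side-by-side placement). This is not Catlab.jl's API, which is written in Julia. The names `Box`, `compose`, and `otimes` are assumptions made for this illustration only.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Box:
    """A morphism with typed input and output wires (illustrative only)."""
    name: str
    dom: Tuple[str, ...]    # types of the input wires
    codom: Tuple[str, ...]  # types of the output wires


def compose(f: Box, g: Box) -> Box:
    """Sequential composition: feed the outputs of f into the inputs of g."""
    if f.codom != g.dom:
        raise TypeError(f"cannot compose: {f.codom} does not match {g.dom}")
    return Box(f"compose({f.name}, {g.name})", f.dom, g.codom)


def otimes(f: Box, g: Box) -> Box:
    """Monoidal product: place f and g side by side, concatenating wires."""
    return Box(f"otimes({f.name}, {g.name})", f.dom + g.dom, f.codom + g.codom)


f = Box("f", ("A",), ("B",))
g = Box("g", ("B",), ("C",))
h = Box("h", ("X",), ("Y",))

# A small wiring diagram: f followed by g, running in parallel with h.
diagram = otimes(compose(f, g), h)
```

Here `diagram` has input wires `("A", "X")` and output wires `("C", "Y")`. A real system would also track the internal structure of the diagram so that it can be normalized, serialized, and drawn. This sketch records only the boundary types and a descriptive name.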