Replication package for the paper "Bridging the Language Gap: An Empirical Study of Bindings for Open Source Machine Learning Libraries in Software Package Ecosystems"

Data Overview

The data is derived from the Libraries.io dataset. The extracted data can be found in the data/ folder:

all_bindings.csv: Contains 250,668 bindings identified by BindFind, alongside their respective host library names.
data/labelled_data: This directory contains labelled binding data, split into training, validation, and testing sets.
data/binding_qa: This directory contains the performance results of BERT-like models on our labelled dataset.
rq2_ml_repos.csv and rq2_ml_bindings.csv: Provide details on 546 ML libraries and their 2,436 bindings.
labelled_rq3_pop_ml_repos.csv and labelled_rq3_pop_ml_bindings.csv: Provide details on 40 popular ML libraries and their 133 bindings.
rq3_pop_ml_repos_tags.csv and labelled_rq3_pop_ml_bindings_versions.csv: Provide the results of our version matching analysis for 3,785 tags and 3,277 versions.
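
As a quick sanity check after downloading the package, the CSV files can be inspected before opening the notebooks. The following is a minimal sketch using only the Python standard library. The helper name summarize_csv is hypothetical, and the path data/all_bindings.csv is an assumption based on the folder layout described above. The sketch makes no assumption about column names and simply reports whatever header the file has.

```python
import csv
from pathlib import Path


def summarize_csv(path):
    """Return (column names, number of data rows) for a CSV file."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])          # first row is treated as the header
        row_count = sum(1 for _ in reader)  # remaining rows are data rows
    return header, row_count


if __name__ == "__main__":
    # Assumed location: the list above places the extracted data under data/.
    csv_path = Path("data") / "all_bindings.csv"
    if csv_path.exists():
        columns, rows = summarize_csv(csv_path)
        print(f"{csv_path}: {rows} rows, columns = {columns}")
    else:
        print(f"{csv_path} not found; adjust the path to your checkout.")
```

The reported row count can then be compared against the figures listed above, for example the 250,668 bindings in all_bindings.csv.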

Environment Setup

We provide an environment.yml file that can be used with Conda to create an environment with all the necessary dependencies:

conda env create -f environment.yml

Reproducing the Study

To replicate the results presented in our paper, use the Jupyter notebooks in the analyze_notebooks directory. The scripts used for data collection are in the data_collection directory.

asgaardlab/MLBindings
