Customize Data Repos

Nyckollas Brandão edited this page Jul 1, 2023 · 8 revisions

Customize Data Repositories

The PHYLOViZ Web Platform allows for customization of data repositories, providing flexibility in storing and managing phylogenetic data. By default, the platform utilizes a set of repositories to store the generated resources and user-uploaded files. However, it supports the integration of additional repositories, allowing users to tailor the data storage and retrieval process according to their specific needs.

Advantages of Custom Data Repositories

  1. Diverse Data Models: The platform's flexible architecture enables the use of multiple repositories with different data models simultaneously. This allows for the incorporation of various representations of phylogenetic data, accommodating diverse data structures and formats.

  2. Efficient Resource Relations: Custom data repositories can focus on establishing efficient relationships between resources, optimizing querying and retrieval operations. By utilizing repositories specifically designed for phylogenetic data, such as PhyloDB, the platform ensures streamlined access to interconnected resources.

  3. Scalability: Using a scalable service such as Amazon S3 (Simple Storage Service) as a file repository allows efficient storage of large files, such as typing data, distance matrices, or tree representations. The platform can therefore handle increasingly large datasets without sacrificing performance.

Using a Custom Data Repository

Adapters are what allow repositories with different interfaces to be integrated into the platform. An adapter wraps existing repository access code so that it conforms to a common interface, which keeps the repository compatible with the platform. This approach, known as the Adapter design pattern, lets the platform use a variety of repositories without changes to its core business logic. In the PHYLOViZ Web Platform, the code that accesses a data repository is therefore referred to as an "adapter".
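As a minimal sketch of the Adapter pattern in this setting, the example below assumes a hypothetical `TreeStore` interface standing in for the platform's common interface and a hypothetical `LegacyKeyValueClient` standing in for an existing storage client. Neither name comes from the platform's code; they only illustrate how an adapter translates one API into another.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Common interface the service layer depends on (hypothetical name and shape).
interface TreeStore {
    void save(String treeId, String newick);
    Optional<String> find(String treeId);
}

// Existing client with its own, incompatible API (hypothetical stand-in).
class LegacyKeyValueClient {
    private final Map<String, byte[]> blobs = new HashMap<>();

    void putBlob(String key, byte[] value) { blobs.put(key, value); }

    // Returns null when the key is absent.
    byte[] getBlob(String key) { return blobs.get(key); }
}

// Adapter: translates TreeStore calls into LegacyKeyValueClient calls.
class KeyValueTreeStoreAdapter implements TreeStore {
    private final LegacyKeyValueClient client;

    KeyValueTreeStoreAdapter(LegacyKeyValueClient client) { this.client = client; }

    @Override
    public void save(String treeId, String newick) {
        client.putBlob("trees/" + treeId, newick.getBytes(StandardCharsets.UTF_8));
    }

    @Override
    public Optional<String> find(String treeId) {
        byte[] raw = client.getBlob("trees/" + treeId);
        return raw == null
                ? Optional.empty()
                : Optional.of(new String(raw, StandardCharsets.UTF_8));
    }
}
```

Code that depends only on `TreeStore` does not need to change when a different adapter, backed by a different client, is supplied.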

To incorporate a new data repository into the platform, complete the following steps for each affected resource:

  1. Create a New Data Repository Class: Add a new class that implements the DataRepository interface. Its methods are the ones called by the service layer, and each must adapt the repository's access code to the platform's data model contract.

  2. Define Data Repository Specifics: Add a new class that implements the DataRepositorySpecificData interface. This class defines the data stored in the resource's metadata to identify the resource in the repository and to provide the information needed to access it.

  3. Identify the Repository: Add a new constant to the enum RepositoryId, which globally identifies the new repository.

For example, if you want to add a new repository that handles trees, you would create a new TreeDataRepository class, a new TreeDataRepositorySpecificData class, and a new constant in the TreeRepositoryId enum.
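The three steps can be sketched for a tree repository as follows. This is only an illustration: the platform's real interfaces are not shown on this page, so `TreeDataRepository`, `TreeDataRepositorySpecificData`, and `TreeRepositoryId` are declared here as simplified stand-ins. The method names `store` and `load`, the enum constants, and the `InMemory...` classes are all assumptions.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Step 3: enum identifying repositories; IN_MEMORY is the new constant.
// The constants shown here are hypothetical.
enum TreeRepositoryId { PHYLODB, IN_MEMORY }

// Stand-in for the platform's specific-data contract (real members unknown).
interface TreeDataRepositorySpecificData { }

// Stand-in for the platform's repository contract (method names are assumptions).
interface TreeDataRepository {
    TreeDataRepositorySpecificData store(String treeId, String newick);
    Optional<String> load(TreeDataRepositorySpecificData data);
}

// Step 2: what the resource metadata keeps so the tree can be located again.
final class InMemoryTreeSpecificData implements TreeDataRepositorySpecificData {
    private final String key;

    InMemoryTreeSpecificData(String key) { this.key = key; }

    String key() { return key; }
}

// Step 1: the new repository class, adapting a plain map to the contract.
class InMemoryTreeDataRepository implements TreeDataRepository {
    private final Map<String, String> trees = new ConcurrentHashMap<>();

    @Override
    public TreeDataRepositorySpecificData store(String treeId, String newick) {
        String key = "tree:" + treeId;
        trees.put(key, newick);
        return new InMemoryTreeSpecificData(key);
    }

    @Override
    public Optional<String> load(TreeDataRepositorySpecificData data) {
        if (data instanceof InMemoryTreeSpecificData) {
            String key = ((InMemoryTreeSpecificData) data).key();
            return Optional.ofNullable(trees.get(key));
        }
        // Metadata written by a different repository cannot be resolved here.
        return Optional.empty();
    }
}
```

Returning the specific data from `store` is one possible design: the caller can save it in the resource's metadata and pass it back later to `load`, without knowing how the repository addresses its contents.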

Finally, register the new code in the configuration to make the new data repository accessible by the microservices (API). To do this, navigate to the DataRepositoryConfig.java file and add the new beans to the maps, linking each of the created classes to the RepositoryId.
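The registration step might look roughly like the sketch below. The actual contents of DataRepositoryConfig.java are not shown on this page, so everything here is an assumption: the class name `DataRepositoryConfigSketch`, the method names, the shape of the maps, and the `PhyloDb...` and `InMemory...` classes are hypothetical. The mention of beans suggests that the real file is a dependency-injection configuration class; the sketch uses plain Java methods so that it runs without any framework.

```java
import java.util.EnumMap;
import java.util.Map;

// Simplified stand-ins for the platform's types (members omitted).
enum TreeRepositoryId { PHYLODB, IN_MEMORY }

interface TreeDataRepository { }

interface TreeDataRepositorySpecificData { }

// Hypothetical existing repository and its specific data.
class PhyloDbTreeDataRepository implements TreeDataRepository { }

class PhyloDbTreeSpecificData implements TreeDataRepositorySpecificData { }

// Hypothetical new repository and its specific data.
class InMemoryTreeDataRepository implements TreeDataRepository { }

class InMemoryTreeSpecificData implements TreeDataRepositorySpecificData { }

// Plain-Java stand-in for the configuration class.
class DataRepositoryConfigSketch {

    // Links each repository id to the repository instance that serves it.
    Map<TreeRepositoryId, TreeDataRepository> treeDataRepositories() {
        Map<TreeRepositoryId, TreeDataRepository> map = new EnumMap<>(TreeRepositoryId.class);
        map.put(TreeRepositoryId.PHYLODB, new PhyloDbTreeDataRepository());
        map.put(TreeRepositoryId.IN_MEMORY, new InMemoryTreeDataRepository()); // new entry
        return map;
    }

    // Links each repository id to the class describing its metadata.
    Map<TreeRepositoryId, Class<? extends TreeDataRepositorySpecificData>> treeSpecificDataTypes() {
        Map<TreeRepositoryId, Class<? extends TreeDataRepositorySpecificData>> map =
                new EnumMap<>(TreeRepositoryId.class);
        map.put(TreeRepositoryId.PHYLODB, PhyloDbTreeSpecificData.class);
        map.put(TreeRepositoryId.IN_MEMORY, InMemoryTreeSpecificData.class); // new entry
        return map;
    }
}
```

Keying both maps by the same enum means that a missing registration shows up as an absent key for that constant, which is straightforward to check in a test.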

Optionally, the repository can also be integrated into the workflow tasks/tools.

Customization for Enhanced Functionality

Custom data repositories offer enhanced functionality and performance for specific use cases. For example, when efficient storage and retrieval of distance matrices are required, incorporating a repository optimized for this purpose can provide significant advantages. By selectively utilizing specialized repositories alongside the default repositories, users can achieve more effective solutions tailored to their specific requirements.

By leveraging the customization capabilities of data repositories, users of the PHYLOViZ Web Platform can efficiently store, manage, and retrieve phylogenetic data in a manner that aligns with their unique research needs.