Feature 213 db load instructions #214

Merged
merged 32 commits into from
Jul 25, 2023
Commits
1b76b78
Issue #213 initial commit of script and config files
bikegeek Jul 21, 2023
398535d
support added to create and delete database, and modify xml specifica…
bikegeek Jul 23, 2023
4cdf718
added missing database name to the db drop and create commands
bikegeek Jul 23, 2023
4e4ef8a
delete database before creating
bikegeek Jul 23, 2023
0a8dca8
fix syntax on drop and create database, terminating single quote shou…
bikegeek Jul 23, 2023
31595e1
refactored to only handle database prep
bikegeek Jul 23, 2023
1fd34ea
first commit of code that specifically updates the XML specification …
bikegeek Jul 23, 2023
6af3670
rename dataclass, remove call to non-existent method
bikegeek Jul 23, 2023
cdd5a2b
update logging messages
bikegeek Jul 23, 2023
f97962c
fix syntax for granting access
bikegeek Jul 23, 2023
ade196e
another fix syntax for granting access
bikegeek Jul 23, 2023
aba1a77
yet another fix syntax for granting access
bikegeek Jul 23, 2023
a78ccb4
fix grant command by cleaning up spaces and quotations
bikegeek Jul 23, 2023
a044dd3
fix grant command with terminating quote
bikegeek Jul 23, 2023
b61c895
fix schema command
bikegeek Jul 23, 2023
186d062
fix schema command-missing space
bikegeek Jul 23, 2023
b7a7cdc
replace check_output with run for subprocess
bikegeek Jul 24, 2023
76464d4
use correct path to schema file
bikegeek Jul 24, 2023
6ce3c21
make entries more generic
bikegeek Jul 24, 2023
77f4771
Completed instructions, first commit
bikegeek Jul 24, 2023
79a889b
fixed syntax for first code block
bikegeek Jul 24, 2023
e27e0d8
Added text to use the existing example for subsetting the data.
bikegeek Jul 24, 2023
f2fc467
Added explicit directions for running the met_db_load script.
bikegeek Jul 24, 2023
0268aa2
Fixed formatting of list.
bikegeek Jul 24, 2023
756796f
fix xml_specification setting example
bikegeek Jul 24, 2023
8080a03
fix instructions in Load data section to use consistent language
bikegeek Jul 24, 2023
5a5ad76
add troubleshooting content when database is non-existent
bikegeek Jul 25, 2023
f7ce8c3
attempt to reformat troubleshooting table
bikegeek Jul 25, 2023
f52c039
additional content to troubleshooting using original formatting
bikegeek Jul 25, 2023
c32fed5
clean up formatting in troubleshooting table
bikegeek Jul 25, 2023
a7a53aa
fix grammar in troubleshooting
bikegeek Jul 25, 2023
133991c
RST syntax errors
bikegeek Jul 25, 2023
32 changes: 32 additions & 0 deletions METdbLoad/sql/scripts/data_loading_config.yaml
@@ -0,0 +1,32 @@

# Configuration file used to load MET ASCII data into
# a database.

dbname: dummy_dbname
username: dbuser
password: dbpassword
host: localhost
port: 1234

# Location (full path and schema file) to the sql schema. Replace with the location
# of your METdataio source code. PROVIDE THE FULL PATH TO THE SCHEMA FILE, NO RELATIVE PATHS OR
# ENVIRONMENT VARIABLES.
schema_location: /full-path-to/mv_mysql.sql

# Name and location of the XML specification file
xml_specification: /full-path-to/db_load_specification.xml

# Databases are grouped, select an existing group.
group: Testing
description: My test database

# Directory (full path) to where the MET data resides.
data_dir: /path-to-met-data

# Set the appropriate setting to True to indicate what type of data is
# being loaded.
load_stat: True
load_mode: False
load_mtd: False
load_mpr: False
load_orank: False
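
The comments in this configuration file ask for full paths with no relative paths or environment variables. A minimal sketch of a pre-flight check on the parsed settings follows. It is not part of this PR: `validate_config`, `REQUIRED_KEYS`, and `PATH_KEYS` are hypothetical names, the key list is taken from what the two scripts in this PR read, and the check assumes the YAML has already been parsed into a dictionary (for example with `yaml.safe_load`) on a POSIX system.

```python
import os.path

# Keys read by db_prep.py and generate_xml_spec.py in this PR.
REQUIRED_KEYS = (
    'dbname', 'username', 'password', 'host', 'port',
    'schema_location', 'xml_specification', 'group', 'description',
    'data_dir', 'load_stat', 'load_mode', 'load_mtd', 'load_mpr', 'load_orank',
)
# Settings that the config comments say must be full paths.
PATH_KEYS = ('schema_location', 'xml_specification', 'data_dir')


def validate_config(config: dict) -> list:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    for key in REQUIRED_KEYS:
        if key not in config:
            problems.append(f'missing key: {key}')
    for key in PATH_KEYS:
        value = str(config.get(key, ''))
        if not value:
            continue  # a missing key was already reported above
        if '$' in value or value.startswith('~'):
            problems.append(f'{key} uses an environment variable or ~: {value}')
        elif not os.path.isabs(value):
            problems.append(f'{key} is not a full path: {value}')
    return problems
```

Calling `validate_config(db_config_info)` before creating the database or generating the XML would surface these mistakes as readable messages rather than as a `KeyError` or a failed mysql call later on.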
35 changes: 35 additions & 0 deletions METdbLoad/sql/scripts/db_load_specification.xml
@@ -0,0 +1,35 @@
<load_spec>
<connection>
<management_system>mysql</management_system>
<host>localhost:3306</host>
<database>mv_integrating_fire</database>
<user>mvadmin</user>
<password>160GiltVa0D5M</password>
</connection>

<folder_tmpl>/scratch/vdunham/</folder_tmpl>

<verbose>true</verbose>
<insert_size>1</insert_size>
<stat_header_db_check>true</stat_header_db_check>
<mode_header_db_check>false</mode_header_db_check>
<mtd_header_db_check>false</mtd_header_db_check>
<drop_indexes>false</drop_indexes>
<apply_indexes>false</apply_indexes>

<load_stat>true</load_stat>
<load_mode>false</load_mode>
<load_mtd>false</load_mtd>
<load_mpr>false</load_mpr>
<load_orank>false</load_orank>

<load_val>
<field name="met_tool">
<val>point_stat</val>
</field>
</load_val>

<group>RAL Projects</group>
<description>MET output generated for SOARS research.</description>

</load_spec>
150 changes: 150 additions & 0 deletions METdbLoad/sql/scripts/db_prep.py
@@ -0,0 +1,150 @@
'''
Creates the METviewer database to store MET output. Requires a YAML configuration
file (data_loading_config.yaml) with relevant database information (e.g. username,
password).
'''
import os.path
import subprocess

import yaml
import argparse
from dataclasses import dataclass
import logging

logging.basicConfig(encoding='utf-8', level=logging.DEBUG)


@dataclass
class DatabaseInfo:
    '''
    Data class for keeping the relevant information for loading the
    METviewer database.
    '''

    db_name: str
    user_name: str
    password: str
    host_name: str
    port_number: int
    schema_path: str
    config_file_dir: str

    def __init__(self, config_obj: dict, config_file_dir: str):
        '''

        Args:
            config_obj: A dictionary containing the
                        settings to be used in creating the database.
        '''

        self.db_name = config_obj['dbname']
        self.user_name = config_obj['username']
        self.password = config_obj['password']
        self.host_name = config_obj['host']
        self.port_number = config_obj['port']
        self.schema_path = config_obj['schema_location']
        self.config_file_dir = config_file_dir

    def create_database(self):
        '''
        Create the commands to create the database.

        Returns: None

        '''
        # Command to create the database, set up permissions, and load the schema.
        uname_pass_list = ['-u', self.user_name, ' -p', self.password, ' -e ']
        uname_pass = ''.join(uname_pass_list)
        create_list = ["'create database ", self.db_name, "'"]
        create_str = ''.join(create_list)
        create_cmd = uname_pass + create_str
        logging.debug(f'database create string: {create_cmd}')

        # Permissions
        perms_list = ['"',"GRANT INSERT, DELETE, UPDATE, INDEX, DROP ON " ,
                      self.db_name,
                      '.* to ', "'mvuser'", "@'%'", '"']

        perms_str = ''.join(perms_list)
        perms_cmd = uname_pass + perms_str
        logging.debug(f'database grant permissions string: {perms_cmd}')


        # Schema
        schema_full_path = os.path.join(self.schema_path,
                                        'METdataio/METdbLoad/sql/mv_mysql.sql')
        schema_list = [ "-umvadmin -p",self.password, " ", self.db_name, ' < ',
                        schema_full_path]
        schema_cmd = ''.join(schema_list)
        logging.debug(f'Schema command: {schema_cmd}')



        try:
            self.delete_database()
        except subprocess.CalledProcessError:
            logging.info("Database doesn't exist. Ignoring this error.")
            pass

        try:
            create_db = subprocess.run(['mysql', create_cmd])
            db_permissions = subprocess.run(['mysql', perms_cmd])
            db_schema = subprocess.run(['mysql', schema_cmd])
        except subprocess.CalledProcessError:
            logging.error('Error in executing mysql commands')

    def delete_database(self):
        '''
        Create the commands to delete a database.
        Returns: None

        '''

        # Command to delete the database
        uname_pass_list = ['-u', self.user_name, ' -p', self.password, ' -e ']
        uname_pass = ''.join(uname_pass_list)
        drop_list = ["'drop database ", self.db_name, "'"]
        drop_str = ''.join(drop_list)
        drop_cmd = uname_pass + drop_str
        logging.debug(f'Drop database command: {drop_cmd}')

        try:
            _ = subprocess.run(['mysql', drop_cmd])

        except subprocess.CalledProcessError:
            logging.error('Error in executing mysql commands')


if __name__ == "__main__":

    # Create a parser
    parser = argparse.ArgumentParser()

    # Add arguments to the parser
    parser.add_argument('action')
    parser.add_argument('config_file')

    # Parse the arguments
    args = parser.parse_args()

    # Get arguments value
    action = args.action
    config_file = args.config_file

    action_requested = str(action).lower()
    logging.debug(f'Action requested: {action_requested}')
    logging.debug(f'YAML Config file to use: {str(config_file)}')
    config_file_dir = os.path.dirname(config_file)
    logging.debug(f'Directory of config file: {config_file_dir}')

    with open(config_file, 'r') as cf:
        db_config_info = yaml.safe_load(cf)
        db_loader = DatabaseInfo(db_config_info, config_file_dir)
        if action_requested == 'create':
            db_loader.create_database()
        elif action_requested == 'delete':
            db_loader.delete_database()
        else:
            logging.warning(
                f'{action_requested} is not a supported option. Only "create" and '
                f'"delete" are supported options.')
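
Two details of db_prep.py are worth noting. Each mysql invocation is assembled into one string and passed to `subprocess.run` as a single list element, so the client receives the user, password, and statement as one argument, and the `<` in the schema command is not interpreted as a redirect because no shell is involved. Also, `subprocess.run` raises `CalledProcessError` only when `check=True` is passed, so the `except` branches are not reached as written. The following is a sketch of one alternative, not the PR's implementation: `build_mysql_args`, `grant_statement`, and `load_schema` are hypothetical helper names, the `mvuser` grantee and the grant list are copied from the script, and the example values come from data_loading_config.yaml. Nothing is executed when the block is loaded.

```python
import subprocess


def build_mysql_args(user: str, password: str, statement: str) -> list:
    """Build an argument list that runs one SQL statement through the mysql client."""
    # Each option is its own list element; the password is attached to its option.
    return ['mysql', '-u', user, f'--password={password}', '-e', statement]


def grant_statement(db_name: str, grantee: str = 'mvuser') -> str:
    """SQL text matching the permissions granted in db_prep.py."""
    return (f"GRANT INSERT, DELETE, UPDATE, INDEX, DROP ON {db_name}.* "
            f"TO '{grantee}'@'%'")


def load_schema(user: str, password: str, db_name: str, schema_file: str) -> None:
    """Feed the schema file to mysql on stdin instead of relying on a shell '<'."""
    with open(schema_file, 'rb') as schema:
        subprocess.run(
            ['mysql', '-u', user, f'--password={password}', db_name],
            stdin=schema,
            check=True,  # raise CalledProcessError on a non-zero exit status
        )


# Example argument list using the placeholder values from data_loading_config.yaml.
create_args = build_mysql_args('dbuser', 'dbpassword', 'create database dummy_dbname')
```

Running a command would then look like `subprocess.run(create_args, check=True)`, wrapped in the same `try`/`except subprocess.CalledProcessError` structure the script already uses.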
121 changes: 121 additions & 0 deletions METdbLoad/sql/scripts/generate_xml_spec.py
@@ -0,0 +1,121 @@
'''
Generates the XML load specification (load_met.xml) from the settings in a YAML configuration file.
'''
import os.path
import subprocess

import yaml
import argparse
from dataclasses import dataclass
import logging

logging.basicConfig(encoding='utf-8', level=logging.DEBUG)


@dataclass
class DatabaseLoadingInfo:
    '''
    Data class for keeping the relevant information for loading the
    METviewer database.
    '''

    db_name: str
    user_name: str
    password: str
    host_name: str
    port_number: int
    group: str
    schema_path: str
    data_dir: str
    xml_spec_file: str
    load_stat: bool
    load_mode: bool
    load_mtd: bool
    load_mpr: bool
    load_orank: bool
    config_file_dir: str

    def __init__(self, config_obj: dict, config_file_dir:str):
        '''

        Args:
            config_obj: A dictionary containing the
                        settings to be used in creating the database.
        '''

        self.db_name = config_obj['dbname']
        self.user_name = config_obj['username']
        self.password = config_obj['password']
        self.host_name = config_obj['host']
        self.port_number = config_obj['port']
        self.group = config_obj['group']
        self.schema_path = config_obj['schema_location']
        self.data_dir = config_obj['data_dir']
        self.xml_spec_file = config_obj['xml_specification']
        self.load_stat = config_obj['load_stat']
        self.load_mode = config_obj['load_mode']
        self.load_mtd = config_obj['load_mtd']
        self.load_mpr = config_obj['load_mpr']
        self.load_orank = config_obj['load_orank']
        self.description = config_obj['description']
        self.config_file_dir = config_file_dir


    def update_spec_file(self):
        '''
        Edit the XML specification file to reflect the settings in the
        YAML configuration file.
        '''

        # Assign the host with the host and port assigned in the YAML config file
        import xml.etree.ElementTree as et
        tree = et.parse(self.xml_spec_file)
        root = tree.getroot()

        for host in root.iter('host'):
            host.text = self.host_name + ":" + str(self.port_number)

        for dbname in root.iter('database'):
            dbname.text = self.db_name

        for user in root.iter('user'):
            user.text = self.user_name

        for password in root.iter('password'):
            password.text = self.password

        for data_folder in root.iter('folder_tmpl'):
            data_folder.text = self.data_dir

        for group in root.iter('group'):
            group.text = self.group

        for desc in root.iter('description'):
            desc.text = self.description

        tree.write(os.path.join(self.config_file_dir, 'load_met.xml'))



if __name__ == "__main__":

    # Create a parser
    parser = argparse.ArgumentParser()

    # Add arguments to the parser
    parser.add_argument('config_file')

    # Parse the arguments
    args = parser.parse_args()

    # Get arguments value
    config_file = args.config_file

    logging.debug(f'Config file to use: {str(config_file)}')
    config_file_dir = os.path.dirname(config_file)
    logging.debug(f'Directory of config file: {config_file_dir}')

    with open(config_file, 'r') as cf:
        db_config_info = yaml.safe_load(cf)
        db_loader = DatabaseLoadingInfo(db_config_info, config_file_dir)
        db_loader.update_spec_file()
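
`update_spec_file` rewrites the connection, folder, group, and description elements, but the `load_*` settings it reads from the YAML file are not written into the XML. The sketch below applies the same ElementTree approach to a small in-memory document and also copies the `load_*` flags. That extension is an assumption about what may be wanted, not something this PR does, and `apply_settings`, `LOAD_FLAGS`, `SPEC`, and `SETTINGS` are hypothetical names used only for illustration.

```python
import xml.etree.ElementTree as ET

# Flag names shared by data_loading_config.yaml and db_load_specification.xml.
LOAD_FLAGS = ('load_stat', 'load_mode', 'load_mtd', 'load_mpr', 'load_orank')


def apply_settings(xml_text: str, settings: dict) -> str:
    """Return xml_text with the host, database, and load_* elements updated."""
    root = ET.fromstring(xml_text)
    for host in root.iter('host'):
        host.text = f"{settings['host']}:{settings['port']}"
    for database in root.iter('database'):
        database.text = settings['dbname']
    for flag in LOAD_FLAGS:
        for element in root.iter(flag):
            # The XML specification in this PR uses lowercase true/false.
            element.text = str(bool(settings[flag])).lower()
    return ET.tostring(root, encoding='unicode')


# A trimmed-down specification, just large enough to exercise the function.
SPEC = ('<load_spec><connection><host>localhost:3306</host>'
        '<database>old_db</database></connection>'
        '<load_stat>true</load_stat><load_mode>false</load_mode>'
        '<load_mtd>false</load_mtd><load_mpr>false</load_mpr>'
        '<load_orank>false</load_orank></load_spec>')
SETTINGS = {'host': 'localhost', 'port': 1234, 'dbname': 'dummy_dbname',
            'load_stat': False, 'load_mode': True, 'load_mtd': False,
            'load_mpr': False, 'load_orank': False}
updated = apply_settings(SPEC, SETTINGS)
```

Working on a string rather than a file keeps the sketch easy to test; in the script itself the same loops would sit next to the existing ones before `tree.write`.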