MET 11.1.0 unable to install on MacOS using Homebrew #2775

Open
23 tasks
HathewayWill opened this issue Dec 24, 2023 · 32 comments
Labels
alert: NEED ACCOUNT KEY Need to assign an account key to this issue component: build process Build process issue priority: medium Medium Priority requestor: Community General Community type: bug Fix something that is not working

Comments

@HathewayWill

Replace italics below with details for this issue.

Describe the Problem

MET 11.1.0 fails to build NETCDF-CXX

Expected Behavior

MET would compile like 11.0.0

Environment

Describe your runtime environment:
1. Machine: Virtual Machine
2. OS: MacOS 13
3. Software version number(s): 13.4 beta

To Reproduce

See attached zip file with logs and compile.sh script

MET_FAIL_MACOS.zip

Relevant Deadlines

List relevant project deadlines here or state NONE.

Funding Source

Define the source of funding and account keys here or state NONE.

Define the Metadata

Assignee

  • Select engineer(s) or no engineer required
  • Select scientist(s) or no scientist required

Labels

  • Review default alert labels
  • Select component(s)
  • Select priority
  • Select requestor(s)

Milestone and Projects

  • Select Milestone as the next bugfix version
  • Select Coordinated METplus-X.Y Support project for support of the current coordinated release
  • Select MET-X.Y.Z Development project for development toward the next official release

Define Related Issue(s)

Consider the impact to the other METplus components.

Bugfix Checklist

See the METplus Workflow for details.

  • Complete the issue definition above, including the Time Estimate and Funding Source.
  • Fork this repository or create a branch of main_<Version>.
    Branch name: bugfix_<Issue Number>_main_<Version>_<Description>
  • Fix the bug and test your changes.
  • Add/update log messages for easier debugging.
  • Add/update unit tests.
  • Add/update documentation.
  • Push local changes to GitHub.
  • Submit a pull request to merge into main_<Version>.
    Pull request: bugfix <Issue Number> main_<Version> <Description>
  • Define the pull request metadata, as permissions allow.
    Select: Reviewer(s) and Development issue
    Select: Milestone as the next bugfix version
    Select: Coordinated METplus-X.Y Support project for support of the current coordinated release
  • Iterate until the reviewer(s) accept and merge your changes.
  • Delete your fork or branch.
  • Complete the steps above to fix the bug on the develop branch.
    Branch name: bugfix_<Issue Number>_develop_<Description>
    Pull request: bugfix <Issue Number> develop <Description>
    Select: Reviewer(s) and Development issue
    Select: Milestone as the next official version
    Select: MET-X.Y.Z Development project for development toward the next official release
  • Close this issue.
@HathewayWill HathewayWill added alert: NEED ACCOUNT KEY Need to assign an account key to this issue alert: NEED CYCLE ASSIGNMENT Need to assign to a release development cycle alert: NEED MORE DEFINITION Not yet actionable, additional definition required type: bug Fix something that is not working labels Dec 24, 2023
@HathewayWill
Author

ld: warning: directory not found for option '-L/Users/workhorse/WRF/MET-11.1.0/external_libs/lib/lib'
ld: warning: directory not found for option '-L=/Users/workhorse/WRF/MET-11.1.0/external_libs/lib:/Users/workhorse/WRF/MET-11.1.0/external_libs/lib'

These are lines 134 and 135 of the netcdf-cxx install log.
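
The second warning shows a single -L option that carries a colon-separated list, which ld reads as one directory name. As a minimal sketch, assuming the list came from a PATH-style variable, the helper below expands such a list into one -L flag per directory. The name paths_to_ldflags is hypothetical and is not part of compile_MET_all.sh.

```shell
#!/bin/sh
# Hypothetical helper: turn a colon-separated directory list into one
# -L flag per directory, skipping empty entries and duplicates.
paths_to_ldflags() {
    list=$1
    seen=":"
    out=""
    old_ifs=$IFS
    IFS=:
    for dir in $list; do
        # The list was already split on ":"; restore normal splitting.
        IFS=$old_ifs
        [ -n "$dir" ] || continue
        case "$seen" in
            *":$dir:"*) continue ;;
        esac
        seen="$seen$dir:"
        out="${out:+$out }-L$dir"
    done
    IFS=$old_ifs
    printf '%s\n' "$out"
}
```

For the duplicated path in the warning above, this would print a single -L flag for the external_libs/lib directory.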

@jprestop
Collaborator

jprestop commented Jan 9, 2024

Hi @HathewayWill. Could you please try changing the following line in compile_MET_all.sh from:

configure_lib_args="-lhdf5_hl -lhdf5 -lz"

to

configure_lib_args="-lnetcdf -lhdf5_hl -lhdf5 -lz"

and see if you get a successful compilation? Please let us know how it goes. Thanks!

@jprestop jprestop removed alert: NEED MORE DEFINITION Not yet actionable, additional definition required alert: NEED CYCLE ASSIGNMENT Need to assign to a release development cycle labels Jan 9, 2024
@jprestop jprestop added this to the MET 11.1.1 (bugfix) milestone Jan 9, 2024
@HathewayWill
Author

> Hi @HathewayWill. Could you please try changing the following line in compile_MET_all.sh from:
>
> configure_lib_args="-lhdf5_hl -lhdf5 -lz"
>
> to
>
> configure_lib_args="-lnetcdf -lhdf5_hl -lhdf5 -lz"
>
> and see if you get a successful compilation? Please let us know how it goes. Thanks!

Sadly, that didn't work. Here are the log files again.

@HathewayWill
Author

MET_FAIL_MACOS 2.zip

@HathewayWill
Author

HathewayWill commented Jan 15, 2024

@jprestop

Found a fix for the NetCDF-CXX build, but I don't know why it works.

Needed configure_lib_args="-lnetcdf -lm -lhdf5_hl -lhdf5 -lz"
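
As a rough sketch of applying that change, the function below prints a copy of a build script with the NetCDF-CXX link flags replaced. The old and new values are the ones quoted in this thread. The function name patch_lib_args, and the assumption that the assignment sits on a single line, are illustrative only.

```shell
#!/bin/sh
# Sketch: print a copy of a build script with the link flags replaced.
# Leading indentation of the assignment is preserved via \1.
patch_lib_args() {
    script=$1
    sed 's/^\([[:space:]]*\)configure_lib_args="-lhdf5_hl -lhdf5 -lz"$/\1configure_lib_args="-lnetcdf -lm -lhdf5_hl -lhdf5 -lz"/' "$script"
}

# Usage (writes a new file instead of editing in place, because
# `sed -i` takes different arguments in BSD sed on macOS and GNU sed):
#   patch_lib_args compile_MET_all.sh > compile_MET_all.patched.sh
```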

@HathewayWill
Author

but now we have a different error:

met.configure.log
met.make.log
met.make_install.log
met.make_test.log

very confused @jprestop

@jprestop
Collaborator

Hi @HathewayWill.

I see in the met.make_test.log file:

*** Running Wavelet-Stat on APCP using a GRIB forecast and netCDF observation ***
../src/tools/core/wavelet_stat/wavelet_stat \
        ../data/sample_fcst/2005080700/wrfprs_ruc13_12.tm00_G212 \
        ../out/pcp_combine/sample_obs_2005080712V_12A.nc \
        config/WaveletStatConfig_APCP_12 \
        -outdir ../out/wavelet_stat -v 2
DEBUG 1: Start grid_stat by workhorse(501) at 2024-01-15 18:15:58Z  cmd: ../src/tools/core/grid_stat/grid_stat ../out/pcp_combine/sample_fcst_12L_2005080712V_12A.nc ../out/pcp_combine/sample_obs_2005080712V_12A.nc config/GridStatConfig_APCP_12 -outdir ../out/grid_stat -v 2
DEBUG 2: OMP_NUM_THREADS is not set. Defaulting to 1 thread. Recommend setting OMP_NUM_THREADS for faster runtimes.
DEBUG 2: OpenMP running on 1 thread(s).
DEBUG 1: Default Config File: /Users/workhorse/WRF/MET-11.1.0/share/met/config/GridStatConfig_default
DEBUG 1: User Config File: config/GridStatConfig_APCP_12
GSL_RNG_TYPE=mt19937
GSL_RNG_SEED=1
DEBUG 1: Forecast File: ../out/pcp_combine/sample_fcst_12L_2005080712V_12A.nc
DEBUG 1: Observation File: ../out/pcp_combine/sample_obs_2005080712V_12A.nc
DEBUG 2: Processing masking regions.
terminate called after throwing an instance of 'netCDF::exceptions::NcNotNCF'
  what():  NetCDF: Unknown file format
file: ncFile.cpp  line:88
FATAL: Received Signal Abort. Exiting 6
make[1]: *** [grid_stat] Error 6
make[1]: *** Waiting for unfinished jobs....

Let's check on your NetCDF installations. Can you please tell me if all of the following files exist in your /Users/workhorse/WRF/MET-11.1.0/external_libs/include and /Users/workhorse/WRF/MET-11.1.0/external_libs/lib directories?

Files for NetCDF4 C:
$MET_NETCDF/include/netcdf.h
$MET_NETCDF/lib/libnetcdf.a
$MET_NETCDF/lib/libnetcdf.so

Files for NetCDF4 C++:
$MET_NETCDF/include/netcdf
$MET_NETCDF/lib/libnetcdf_c++4.a
$MET_NETCDF/lib/libnetcdf_c++4.so
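
The check above could be scripted roughly as follows. This is a sketch, not part of the MET install scripts: check_netcdf_files is a hypothetical name, and it assumes that on macOS a shared library may be named with a .dylib suffix instead of .so, so either suffix is accepted.

```shell
#!/bin/sh
# Sketch: report which of the listed NetCDF files are missing under a prefix.
# Returns 0 if everything is present, 1 otherwise.
check_netcdf_files() {
    prefix=$1
    missing=0
    # Headers and static libraries for NetCDF4 C and C++.
    for rel in include/netcdf.h lib/libnetcdf.a include/netcdf lib/libnetcdf_c++4.a; do
        if [ ! -e "$prefix/$rel" ]; then
            echo "missing: $prefix/$rel"
            missing=1
        fi
    done
    # Shared libraries: accept .so or .dylib (assumption for macOS).
    for base in lib/libnetcdf lib/libnetcdf_c++4; do
        if [ ! -e "$prefix/$base.so" ] && [ ! -e "$prefix/$base.dylib" ]; then
            echo "missing: $prefix/$base.so (or .dylib)"
            missing=1
        fi
    done
    return $missing
}

# Usage:
#   check_netcdf_files "$MET_NETCDF" && echo "all files present"
```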

@HathewayWill
Author

@jprestop

They appear to be in there.
Screenshot 2024-01-16 at 5 07 10 PM
Screenshot 2024-01-16 at 5 09 22 PM

for netcdf-c++ I had to add -lm and -lnetcdf

@jprestop
Collaborator

@HathewayWill Ah yes, very confusing indeed. I think @georgemccabe figured out the problem. The compile_MET_all.sh script was running "make test" using MAKE_ARGS. Since some tests rely on the output of other tests to succeed, running "make test" in parallel won't work and explains the confusing information in the log file where it says it is running wavelet_stat, but then the log information refers to grid_stat. I have modified the compile_MET_all.sh script and have added "-lnetcdf -lm" to configure_lib_args for the compilation of NetCDF-CXX. Please download the new script and try again.

@HathewayWill
Author

@jprestop @georgemccabe

So does make test need to have the make args removed for each one if running in parallel processing?

I'm running a WRF run right now but give me tonight and I'll test it later.

Sounds like that was the error which will make everyone happy that it's fixed using dtc-mosit and WRF-mosit

@jprestop
Collaborator

Hi @HathewayWill

> So does make test need to have the make args removed for each one if running in parallel processing?

I don't think I understand your question. I'm not sure what you mean by "each one".

To help clarify, we changed:

run_cmd "make ${MAKE_ARGS} test > $(pwd)/met.make_test.log 2>&1"

to

run_cmd "make test > $(pwd)/met.make_test.log 2>&1"

Maybe you mean - does ${MAKE_ARGS} need to be removed in calls to the external libraries' "make test" commands? If so, the answer, unfortunately, is I don't know if the external libraries "make test" commands rely on the output of other tests to succeed. All I can say is that I haven't experienced this problem previously in installations on various machines, so I think until we encounter a problem it is likely ok to leave as-is.
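
One way to picture the change is a small helper that applies the parallel flags to build targets only. This is a sketch under stated assumptions: MAKE_ARGS is the variable from compile_MET_all.sh, while make_cmd and the default of "-j 4" are hypothetical. Treating install as serial follows the commit message referenced later in this thread.

```shell
#!/bin/sh
# Sketch: choose the make command line per target.
# MAKE_ARGS comes from the thread; the default value here is illustrative.
MAKE_ARGS=${MAKE_ARGS:--j 4}

make_cmd() {
    target=$1
    case "$target" in
        test|install)
            # Run serially: some MET tests read files written by earlier tests.
            echo "make $target"
            ;;
        "")
            # Default build target, run in parallel.
            echo "make ${MAKE_ARGS}"
            ;;
        *)
            echo "make ${MAKE_ARGS} $target"
            ;;
    esac
}

# Usage:
#   $(make_cmd test) > met.make_test.log 2>&1
```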

@HathewayWill
Author

@jprestop

That was what I was getting at.

You answered my question about the removal.

Sorry for the confusion

@HathewayWill
Author

@jprestop

Testing it now.

I was reading the new compile_MET script and noticed that the make install for MET doesn't use ${MAKE_ARGS}. Can MET not be installed in parallel?

run_cmd "make install > met.make_install.log 2>&1"

@HathewayWill
Author

HathewayWill commented Jan 19, 2024

@jprestop

Tested it and it got worse.

Before, it would fail at the test step; now it fails at met.make.

Here are the relevant log files.
met.make.log
configure.log

compile_MET_all.log

HathewayWill referenced this issue Jan 19, 2024
Removing ${MAKE_ARGS} from "make install" and "make test" for MET.  Removing "met" prefix from met.configure.log because we really need config.log for any detail.  It is confusing to have met.configure.log when that does not contain useful information.
@HathewayWill
Author

@jprestop @georgemccabe
untitled folder.zip

different error now.

@jprestop
Collaborator

I'm wondering if these files are corrupted:

DEBUG 1: Forecast File: ../out/pcp_combine/sample_fcst_12L_2005080712V_12A.nc
DEBUG 1: Observation File: ../out/pcp_combine/sample_obs_2005080712V_12A.nc

Could you please send them to us following the directions here?

@HathewayWill
Author

> I'm wondering if these files are corrupted:
>
> DEBUG 1: Forecast File: ../out/pcp_combine/sample_fcst_12L_2005080712V_12A.nc
> DEBUG 1: Observation File: ../out/pcp_combine/sample_obs_2005080712V_12A.nc
>
> Could you please send them to us following the directions here?

@jprestop I'm having issues with ubuntu and the ftp protocol.

@jprestop
Collaborator

jprestop commented Jan 19, 2024

You could also try to attach the files here, @HathewayWill.

@HathewayWill
Author

> You could also try to attach the files here, @HathewayWill.

@jprestop @georgemccabe

https://we.tl/t-tobDvvKulo

Not sure if you can access this; otherwise, email me directly.

@jprestop
Collaborator

Hi @HathewayWill. Well, the NetCDF files do not seem to be corrupted. I copied them over and ran "ncdump" on them, and that worked fine. I also copied them to our project machine and ran the command that is causing you problems:

/nrit/ral/met-11.1.0/bin/grid_stat sample_fcst_12L_2005080712V_12A.nc sample_obs_2005080712V_12A.nc GridStatConfig_APCP_12 -outdir ./out/grid_stat -v 2

but I got a successful run. I did not have the problem you are experiencing:

*** Running Grid-Stat on APCP using netCDF input for both forecast and observation ***
../src/tools/core/grid_stat/grid_stat \
        ../out/pcp_combine/sample_fcst_12L_2005080712V_12A.nc \
        ../out/pcp_combine/sample_obs_2005080712V_12A.nc \
        config/GridStatConfig_APCP_12 \
        -outdir ../out/grid_stat -v 2
DEBUG 1: Start grid_stat by workhorse(501) at 2024-01-19 01:39:36Z  cmd: ../src/tools/core/grid_stat/grid_stat ../out/pcp_combine/sample_fcst_12L_2005080712V_12A.nc ../out/pcp_combine/sample_obs_2005080712V_12A.nc config/GridStatConfig_APCP_12 -outdir ../out/grid_stat -v 2
DEBUG 2: OMP_NUM_THREADS is not set. Defaulting to 1 thread. Recommend setting OMP_NUM_THREADS for faster runtimes.
DEBUG 2: OpenMP running on 1 thread(s).
DEBUG 1: Default Config File: /Users/workhorse/WRF/MET-11.1.0/share/met/config/GridStatConfig_default
DEBUG 1: User Config File: config/GridStatConfig_APCP_12
GSL_RNG_TYPE=mt19937
GSL_RNG_SEED=1
DEBUG 1: Forecast File: ../out/pcp_combine/sample_fcst_12L_2005080712V_12A.nc
DEBUG 1: Observation File: ../out/pcp_combine/sample_obs_2005080712V_12A.nc
DEBUG 2: Processing masking regions.
terminate called after throwing an instance of 'netCDF::exceptions::NcNotNCF'
  what():  NetCDF: Unknown file format
file: ncFile.cpp  line:88
FATAL: Received Signal Abort. Exiting 6
make[1]: *** [grid_stat] Error 6
make: *** [test] Error 2

Let's have you try running the same command outside of "make test". In the directory /Users/workhorse/WRF/MET-11.1.0/MET-11.1.0/scripts, could you please run the following:

export TEST_OUT_DIR=/Users/workhorse/WRF/MET-11.1.0/MET-11.1.0
/Users/workhorse/WRF/MET-11.1.0/bin/grid_stat \
../out/pcp_combine/sample_fcst_12L_2005080712V_12A.nc \
../out/pcp_combine/sample_obs_2005080712V_12A.nc \
config/GridStatConfig_APCP_12 \
-outdir ../out/grid_stat -v 2

Please give that a try and post the output here. Please let me know if you have any questions.
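
Since the error is "NetCDF: Unknown file format", it may also help to check what the failing machine sees in the first bytes of each input file. The sketch below classifies a file by its leading four bytes: classic netCDF files begin with "CDF" plus a version byte, and HDF5-based files begin with the HDF5 signature. The name nc_kind is hypothetical. Where the netCDF utilities are installed, `ncdump -k <file>` reports the file kind directly.

```shell
#!/bin/sh
# Sketch: classify a file by its first four bytes (printed as hex).
nc_kind() {
    magic=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')
    case "$magic" in
        43444601) echo "classic" ;;                        # "CDF" 0x01
        43444602) echo "64-bit offset" ;;                  # "CDF" 0x02
        43444605) echo "CDF-5" ;;                          # "CDF" 0x05
        89484446) echo "HDF5-based (possibly netCDF-4)" ;; # 0x89 "HDF"
        *)        echo "unknown" ;;
    esac
}

# Usage:
#   nc_kind ../out/pcp_combine/sample_fcst_12L_2005080712V_12A.nc
```

If the files are recognized here but grid_stat still rejects them, the problem more likely lies in the libraries grid_stat was linked against than in the files themselves.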

@HathewayWill
Author

@jprestop

I'm going to rebuild the Mac and test it again. I can't even reproduce the error on my own machine; now it stops before the previous error.

Do you have a mac machine available there?

@jprestop
Collaborator

Hi @HathewayWill. We have two developers who have successfully installed MET-11.1 on their Macs. One was using 13.6.2 (Ventura) and the other was using 12.6.2 (Monterey). Both compiled using the GNU compilers.

@HathewayWill
Author

Morning @jprestop

Okay that's good to know. Let me rebuild my mac and double check on my side. I wonder if it's the shell script.

Are they using homebrew GNU compilers or something else?

@jprestop
Collaborator

@HathewayWill. They both used GNU compilers that were installed via MacPorts.

@HathewayWill
Author

@jprestop

That might be the cause: I'm using Homebrew.

Can you ask them which version of the GNU compilers from MacPorts they are using?

@jprestop
Collaborator

I would think that Homebrew and MacPorts would be similar, but it could be an issue. One of the developers was using MacPorts 12.3.0 and also used the compile_MET_all.sh script successfully.

@HathewayWill HathewayWill changed the title Met 11.1.0 unable to install on MacOS Met 11.1.0 unable to install on MacOS using Homebrew Jan 28, 2024
@HathewayWill
Author

@jprestop @georgemccabe

So for grins I tried to compile MET v11.0.0 using the same structure for installation as I did with V11.1.0

V11.0.0 didn't install so I am going to check my structure.

@HathewayWill
Author

HathewayWill commented Jan 28, 2024

@jprestop @georgemccabe

> So for grins I tried to compile MET v11.0.0 using the same structure for installation as I did with V11.1.0
>
> V11.0.0 didn't install so I am going to check my structure.

Got it to work on MacOS Sonoma but not 100% on MacOS Ventura. On Ventura the MET tests have errors, but METplus still runs successfully.

The fixes you implemented worked on Sonoma, but I have attached the logs for Ventura.
MET_logs.zip

Screenshot from 2024-01-28 05-15-49
Screenshot from 2024-01-28 05-16-20

@HathewayWill
Author

HathewayWill commented Jan 30, 2024

> @jprestop @georgemccabe
> So for grins I tried to compile MET v11.0.0 using the same structure for installation as I did with V11.1.0
> V11.0.0 didn't install so I am going to check my structure.
>
> Got it to work on MacOS Sonoma but not 100% on MacOS Ventura. On Ventura the MET tests have errors, but METplus still runs successfully.
>
> The fixes you implemented worked on Sonoma, but I have attached the logs for Ventura. MET_logs.zip

And now Sonoma doesn't work. This is very confusing to me.

@HathewayWill
Author

@jprestop @georgemccabe @JohnHalleyGotway

With your permission, I'm going to close this issue and open two separate ones for the two Mac operating systems. I think there are two different issues going on, one for each OS, and I want to keep them separate.

I will reference this issue though in the new ones if that is okay with you?

@jprestop
Collaborator

Hi @HathewayWill.

This situation is certainly very strange, particularly considering our developers have had successful compilations on various Mac OSs. This could be something in your environment, although it's not clear yet.

Even though MET's configure ran successfully for you, I do see the following error:

ld: Undefined symbols:
  _H5Pset_all_coll_metadata_ops, referenced from:
      _main in ccnwjJ9B.o
collect2: error: ld returned 1 exit status
configure:18015: $? = 1

The other developers did not receive that error. I have their config.log files and would like to step through to see the differences, but I haven't had a chance to look into the above error or to compare the log files yet.
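
For comparing config.log files, a short filter that lists only the symbols the linker reported as undefined may be a useful starting point. This is a sketch that assumes the macOS ld message layout quoted above ("ld: Undefined symbols:" followed by indented "<symbol>, referenced from:" lines). The name undefined_symbols is hypothetical. Note that a failed link test inside configure is not always fatal, since configure scripts commonly probe for optional functions, which would be consistent with configure finishing successfully here.

```shell
#!/bin/sh
# Sketch: print each symbol reported under "Undefined symbols" in a log file.
undefined_symbols() {
    awk '
        /Undefined symbols/ { in_block = 1; next }
        in_block && /, referenced from:/ {
            sym = $1
            sub(/,$/, "", sym)   # drop the trailing comma after the symbol
            print sym
            next
        }
        # A non-indented line ends the linker message block.
        in_block && /^[^ \t]/ { in_block = 0 }
    ' "$1" | sort -u
}

# Usage:
#   undefined_symbols config.log
```

Running it on the log from the failing machine and on a log from a working one would show whether the two builds differ in which link probes fail.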

@HathewayWill
Author

@jprestop

I will retest and see what I can find and attach log files here.

@JohnHalleyGotway JohnHalleyGotway added component: build process Build process issue priority: medium Medium Priority requestor: Community General Community labels Feb 8, 2024
@JohnHalleyGotway JohnHalleyGotway changed the title Met 11.1.0 unable to install on MacOS using Homebrew MET 11.1.0 unable to install on MacOS using Homebrew Feb 9, 2024