JDBC.load() hangs on load from Netezza data source #52

Open
metanoid opened this issue Oct 23, 2019 · 4 comments

Comments

@metanoid

I'm unable to load the results of a SQL query against a Netezza database. The call simply hangs until the Julia process is killed.

It's not clear what the problem is because there are no warnings or errors.

To reproduce:

using DataFrames
using JavaCall
JavaCall.addClassPath("C:/JDBC/nzjdbc.jar")
using JDBC
JDBC.init()
classforname("org.netezza.Driver")
hostname = "netezza_system"
port = "5480"
database = "SYSTEM"
username = ENV["netezza_user"];
password = ENV["netezza_pwd"];
connectionstring = "jdbc:netezza://$(hostname):$(port)/$(database);user=$(username);password=$(password)";
myquery = "SELECT *   FROM EXAMPLE_TABLE  LIMIT 1000"

cnxn = JDBC.Connection(connectionstring)
csr = cursor(cnxn)
execute!(csr, myquery)
src = JDBC.Source(csr)
df = JDBC.load(DataFrame, src)

The final JDBC.load call above never returns a result.

Is there something I can do to diagnose the issue?

julia> versioninfo()
Julia Version 1.0.5
Commit 3af96bcefc (2019-09-09 19:06 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.0 (ORCJIT, skylake)
Environment:
  JULIA_NUM_THREADS = 4

Package versions:

  [a93c6f00] DataFrames v0.18.4
  [6042db11] JDBC v0.5.0
  [494afd89] JavaCall v0.7.2
@metanoid
Author

I think the issue is this:

JDBC.load(DataFrame, src) calls Tables.materializer(DataFrame)(src) (in JDBC :: tables.jl)
which calls columntable(src) (from Tables :: namedtuples.jl)
which calls columns(src) (from Tables :: fallbacks.jl)

And it is this function which hangs.

Whereas this is able to print all the data:

for i in rows(csr)
    println(i)
end

@aviks
Member

aviks commented Oct 23, 2019

Yeah, I was about to say, can you just use the low-level functions?

stmt = createStatement(conn)
rs = executeQuery(stmt, myquery)
for r in rs
    println(getInt(r, 1), getString(r, "FIELDNAME"))  # ....
end

@metanoid
Author

> Yeah, I was about to say, can you just use the low-level functions?

Yes, I can, and that works. I just need to figure out how to get the output into a DataFrame now.

@metanoid
Author

The below seems to work for my needs:

df_schema = Tables.schema(src)
df = DataFrame(collect(df_schema.types), collect(df_schema.names))
for i in rows(csr)
    push!(df, i)
end
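
For anyone who wants to try the row-by-row pattern without a database, here is a minimal self-contained sketch of the same idea. Everything in it is an assumption made for illustration: `mock_rows` is a hypothetical stand-in for `rows(csr)` (assumed here to yield one tuple per row), and `col_names` / `col_types` stand in for what `Tables.schema(src)` would report. It builds the empty DataFrame from typed empty vectors, which should not depend on the older types-and-names constructor used above.

```julia
using DataFrames

# Hypothetical stand-ins for the schema reported by Tables.schema(src)
col_names = [:ID, :NAME]
col_types = [Int, String]

# Hypothetical stand-in for rows(csr): one tuple per result row (assumed shape)
mock_rows = [(1, "alpha"), (2, "beta"), (3, "gamma")]

# Create one empty, typed column per field, then wrap them in a DataFrame
empty_columns = [T[] for T in col_types]
df = DataFrame(empty_columns, col_names)

# Append the rows one at a time, as in the workaround above
for r in mock_rows
    push!(df, r)
end
```

Pushing row by row avoids the `columns(src)` fallback path that appeared to hang, at the cost of one `push!` per row.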
