[BUG] Demo code raises "no output schema" exception #1105

Open
breadbread1984 opened this issue Nov 29, 2024 · 0 comments
Labels
bug Something isn't working

Comments

@breadbread1984

Bug description

The demo code in the "Extract and save Item embeddings" section raises ValueError: no output schema.

Steps/Code to reproduce bug

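Follow the demo up to the "Extract and save Item embeddings" section. The exception is raised by the workflow.fit_transform(Dataset(item_features)) call (line 56 of extract_item_feature.py in the traceback under Additional context).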

Expected behavior

Environment details

  • Merlin version:
  • merlin 0.0.1
  • merlin-core 0+untagged.1.g6d396aa
  • merlin-dataloader 0+untagged.1.g1441a12
  • merlin-hps 1.0.0
  • merlin-models 0+untagged.1.geb1e541
  • merlin-sok 2.0.0
  • merlin-systems 0+untagged.1.ga19d311
  • Platform: Linux 12ce9556ef42 5.4.0-200-generic
  • Python version: 3.10.12
  • PyTorch version (GPU?): N/A
  • Tensorflow version (GPU?): 2.12.0+nv23.6
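
The version list above does not include nvtabular, although the traceback passes through it. A minimal sketch for collecting the installed versions in one step, using only the standard library's importlib.metadata. The package names in PACKAGES are an assumption about which distributions are relevant here, and collect_versions is a hypothetical helper name:

```python
from importlib import metadata

# Assumed list of distributions worth reporting; adjust as needed.
PACKAGES = [
    "merlin-core",
    "merlin-dataloader",
    "merlin-models",
    "merlin-systems",
    "nvtabular",
    "tensorflow",
]


def collect_versions(names):
    """Map each distribution name to its installed version, or 'not installed'."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    for name, version in collect_versions(PACKAGES).items():
        print(f"{name} {version}")
```

Run inside the same container or environment as the demo, this prints one line per package that can be pasted into the report.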

Additional context

  File "/root/raid/common_models/recommend_system/merlin/aliccp/extract_item_feature.py", line 56, in main
    item_embeddings = workflow.fit_transform(Dataset(item_features)).to_ddf().compute()
  File "/usr/local/lib/python3.10/dist-packages/nvtabular/workflow/workflow.py", line 236, in fit_transform
    self.fit(dataset)
  File "/usr/local/lib/python3.10/dist-packages/nvtabular/workflow/workflow.py", line 213, in fit
    self.executor.fit(dataset, self.graph)
  File "/usr/local/lib/python3.10/dist-packages/merlin/dag/executors.py", line 501, in fit
    ).sample_dtypes()
  File "/usr/local/lib/python3.10/dist-packages/merlin/io/dataset.py", line 1169, in sample_dtypes
    _real_meta = self.engine.sample_data(n=n)
  File "/usr/local/lib/python3.10/dist-packages/merlin/io/dataset_engine.py", line 64, in sample_data
    _head = _ddf.partitions[partition_index].head(n)
  File "/usr/local/lib/python3.10/dist-packages/dask/dataframe/core.py", line 1268, in head
    return self._head(n=n, npartitions=npartitions, compute=compute, safe=safe)
  File "/usr/local/lib/python3.10/dist-packages/dask/dataframe/core.py", line 1302, in _head
    result = result.compute()
  File "/usr/local/lib/python3.10/dist-packages/dask/base.py", line 314, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/dask/base.py", line 599, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/dask/threaded.py", line 89, in get
    results = get_async(
  File "/usr/local/lib/python3.10/dist-packages/dask/local.py", line 511, in get_async
    raise_exception(exc, tb)
  File "/usr/local/lib/python3.10/dist-packages/dask/local.py", line 319, in reraise
    raise exc
  File "/usr/local/lib/python3.10/dist-packages/dask/local.py", line 224, in execute_task
    result = _execute_task(task, data)
  File "/usr/local/lib/python3.10/dist-packages/dask/core.py", line 119, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "/usr/local/lib/python3.10/dist-packages/dask/optimization.py", line 990, in __call__
    return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
  File "/usr/local/lib/python3.10/dist-packages/dask/core.py", line 149, in get
    result = _execute_task(task, cache)
  File "/usr/local/lib/python3.10/dist-packages/dask/core.py", line 119, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "/usr/local/lib/python3.10/dist-packages/dask/utils.py", line 72, in apply
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/merlin/dag/executors.py", line 103, in transform
    transformed_data = self._execute_node(node, transformable, capture_dtypes, strict)
  File "/usr/local/lib/python3.10/dist-packages/merlin/dag/executors.py", line 117, in _execute_node
    upstream_outputs = self._run_upstream_transforms(
  File "/usr/local/lib/python3.10/dist-packages/merlin/dag/executors.py", line 135, in _run_upstream_transforms
    node_output = self._execute_node(
  File "/usr/local/lib/python3.10/dist-packages/merlin/dag/executors.py", line 117, in _execute_node
    upstream_outputs = self._run_upstream_transforms(
  File "/usr/local/lib/python3.10/dist-packages/merlin/dag/executors.py", line 135, in _run_upstream_transforms
    node_output = self._execute_node(
  File "/usr/local/lib/python3.10/dist-packages/merlin/dag/executors.py", line 125, in _execute_node
    transform_output = self._run_node_transform(node, transform_input, capture_dtypes, strict)
  File "/usr/local/lib/python3.10/dist-packages/merlin/dag/executors.py", line 255, in _run_node_transform
    raise exc
  File "/usr/local/lib/python3.10/dist-packages/merlin/dag/executors.py", line 242, in _run_node_transform
    transformed_data = node.op.transform(selection, input_data)
  File "/usr/local/lib/python3.10/dist-packages/merlin/systems/dag/ops/workflow.py", line 107, in transform
    output = self.workflow._transform_df(transformable)
  File "/usr/local/lib/python3.10/dist-packages/nvtabular/workflow/workflow.py", line 256, in _transform_df
    raise ValueError("no output schema")
ValueError: no output schema
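
Reading the traceback from the bottom up: the error is raised by an inner NVTabular workflow (called through merlin/systems/dag/ops/workflow.py via self.workflow._transform_df) while the outer workflow is still sampling dtypes inside fit. The following is a toy, standard-library-only model of that situation, not NVTabular or Merlin code. ToyWorkflow, ToyTransformOp and run are hypothetical names, and the assumption that the guard fires because the inner workflow has no output schema set is inferred only from the error message:

```python
class ToyWorkflow:
    """Hypothetical stand-in for a workflow that needs a schema before transforming."""

    def __init__(self):
        self.output_schema = None  # only set by fit()

    def fit(self, rows):
        # Use the sorted column names seen in the data as a stand-in schema.
        self.output_schema = sorted({key for row in rows for key in row})
        return self

    def transform(self, rows):
        # Assumed guard, modelled on the message in the traceback.
        if not self.output_schema:
            raise ValueError("no output schema")
        return [{key: row.get(key) for key in self.output_schema} for row in rows]


class ToyTransformOp:
    """Hypothetical operator that delegates to an inner workflow."""

    def __init__(self, workflow):
        self.workflow = workflow

    def transform(self, rows):
        return self.workflow.transform(rows)


def run(op, rows):
    """Apply the operator and report a ValueError as text instead of raising."""
    try:
        return op.transform(rows)
    except ValueError as exc:
        return f"ValueError: {exc}"


if __name__ == "__main__":
    rows = [{"item_id": 1, "item_brand": 7}]
    print(run(ToyTransformOp(ToyWorkflow()), rows))            # inner workflow never fitted
    print(run(ToyTransformOp(ToyWorkflow().fit(rows)), rows))  # inner workflow fitted first
```

If the real failure follows the same pattern, checking whether the workflow passed to the operator in the demo has an output schema before it is wrapped would narrow down the cause.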
@breadbread1984 breadbread1984 added the bug Something isn't working label Nov 29, 2024