KeyError in _find_most_related_community_from_entities #106

Open
RoadToNowhereX opened this issue Dec 6, 2024 · 1 comment

@RoadToNowhereX

After extraction and indexing, I get the following KeyErrors no matter what I ask the LLM.

Frontend: kotaemon
Python 3.10
Local LLM: Nemo_saiga
Local Embedding Model: gte-qwen2-7B

INFO:httpx:HTTP Request: POST http://127.0.0.1:1234/v1/chat/completions "HTTP/1.1 200 OK"
User-id: 1, can see public conversations: True
Session reasoning type None use mindmap (default) use citation (default) language (default)
Session LLM
Reasoning class <class 'ktem.reasoning.simple.FullDecomposeQAPipeline'>
Reasoning state {'app': {'regen': False}, 'pipeline': {}}
Thinking ...
Chosen rewrite pipeline DecomposeQuestionPipeline(
(llm): ChatOpenAI(api_key=null, base_url=http://127.0.0...., frequency_penalty=None, logit_bias=None, logprobs=None, max_retries=None, max_retries_=2, max_tokens=4096, model=gguf/embedding-..., n=1, organization=None, presence_penalty=None, stop=None, temperature=None, timeout=None, tool_choice=None, tools=None, top_logprobs=None, top_p=None)
)
INFO:httpx:HTTP Request: POST http://127.0.0.1:1234/v1/chat/completions "HTTP/1.1 200 OK"
Rewrite result []
searching in doc_ids []
INFO:ktem.index.file.pipelines:Skip retrieval because of no selected files: DocumentRetrievalPipeline(
(vector_retrieval): <function Function._prepare_child.<locals>.exec at 0x0000024B5BFF4160>
(embedding): <function Function._prepare_child.<locals>.exec at 0x0000024B5BFF4670>
)
searching in doc_ids []
INFO:ktem.index.file.pipelines:Skip retrieval because of no selected files: DocumentRetrievalPipeline(
(vector_retrieval): <function Function._prepare_child.<locals>.exec at 0x0000024B5C076EF0>
(embedding): <function Function._prepare_child.<locals>.exec at 0x0000024B5C0771C0>
)
INFO:httpx:HTTP Request: POST http://127.0.0.1:5678/v1/embeddings "HTTP/1.1 200 OK"
GraphRAG embedding dim 3584
INFO:nano-graphrag:Load KV full_docs with 0 data
INFO:nano-graphrag:Load KV text_chunks with 0 data
INFO:nano-graphrag:Load KV llm_response_cache with 438 data
INFO:nano-graphrag:Load KV community_reports with 0 data
INFO:nano-graphrag:Loaded graph from E:\AI\LLM\kotaemon-NanoGraphRAG\ktem_app_data\user_data\files\nano_graphrag\8983c848-526b-4783-a255-c0115fed6e63\input\graph_chunk_entity_relation.graphml with 1772 nodes, 1127 edges
INFO:nano-vectordb:Load (1563, 3584) data
INFO:nano-vectordb:Init {'embedding_dim': 3584, 'metric': 'cosine', 'storage_file': 'E:\AI\LLM\kotaemon-NanoGraphRAG\ktem_app_data\user_data\files\nano_graphrag\8983c848-526b-4783-a255-c0115fed6e63\input\vdb_entities.json'} 1563 data
INFO:httpx:HTTP Request: POST http://127.0.0.1:5678/v1/embeddings "HTTP/1.1 200 OK"
Traceback (most recent call last):
File "E:\CondaEnvironments\kotaemon-nanographrag_env\lib\site-packages\gradio\queueing.py", line 575, in process_events
response = await route_utils.call_process_api(
File "E:\CondaEnvironments\kotaemon-nanographrag_env\lib\site-packages\gradio\route_utils.py", line 276, in call_process_api
output = await app.get_blocks().process_api(
File "E:\CondaEnvironments\kotaemon-nanographrag_env\lib\site-packages\gradio\blocks.py", line 1923, in process_api
result = await self.call_function(
File "E:\CondaEnvironments\kotaemon-nanographrag_env\lib\site-packages\gradio\blocks.py", line 1520, in call_function
prediction = await utils.async_iteration(iterator)
File "E:\CondaEnvironments\kotaemon-nanographrag_env\lib\site-packages\gradio\utils.py", line 663, in async_iteration
return await iterator.__anext__()
File "E:\CondaEnvironments\kotaemon-nanographrag_env\lib\site-packages\gradio\utils.py", line 656, in __anext__
return await anyio.to_thread.run_sync(
File "E:\CondaEnvironments\kotaemon-nanographrag_env\lib\site-packages\anyio\to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
File "E:\CondaEnvironments\kotaemon-nanographrag_env\lib\site-packages\anyio\_backends\_asyncio.py", line 2441, in run_sync_in_worker_thread
return await future
File "E:\CondaEnvironments\kotaemon-nanographrag_env\lib\site-packages\anyio\_backends\_asyncio.py", line 943, in run
result = context.run(func, *args)
File "E:\CondaEnvironments\kotaemon-nanographrag_env\lib\site-packages\gradio\utils.py", line 639, in run_sync_iterator_async
return next(iterator)
File "E:\CondaEnvironments\kotaemon-nanographrag_env\lib\site-packages\gradio\utils.py", line 801, in gen_wrapper
response = next(iterator)
File "E:\AI\LLM\kotaemon\libs\ktem\ktem\pages\chat\__init__.py", line 981, in chat_fn
for response in pipeline.stream(chat_input, conversation_id, chat_history):
File "E:\AI\LLM\kotaemon\libs\ktem\ktem\reasoning\simple.py", line 535, in stream
docs, infos = self.retrieve(message, history)
File "E:\AI\LLM\kotaemon\libs\ktem\ktem\reasoning\simple.py", line 130, in retrieve
retriever_docs = retriever_node(text=query)
File "E:\CondaEnvironments\kotaemon-nanographrag_env\lib\site-packages\theflow\base.py", line 1097, in __call__
raise e from None
File "E:\CondaEnvironments\kotaemon-nanographrag_env\lib\site-packages\theflow\base.py", line 1088, in __call__
output = self.fl.exec(func, args, kwargs)
File "E:\CondaEnvironments\kotaemon-nanographrag_env\lib\site-packages\theflow\backends\base.py", line 151, in exec
return run(*args, **kwargs)
File "E:\CondaEnvironments\kotaemon-nanographrag_env\lib\site-packages\theflow\middleware.py", line 144, in __call__
raise e from None
File "E:\CondaEnvironments\kotaemon-nanographrag_env\lib\site-packages\theflow\middleware.py", line 141, in __call__
_output = self.next_call(*args, **kwargs)
File "E:\CondaEnvironments\kotaemon-nanographrag_env\lib\site-packages\theflow\middleware.py", line 117, in __call__
return self.next_call(*args, **kwargs)
File "E:\CondaEnvironments\kotaemon-nanographrag_env\lib\site-packages\theflow\base.py", line 1017, in _runx
return self.run(*args, **kwargs)
File "E:\AI\LLM\kotaemon\libs\ktem\ktem\index\file\graph\nano_pipelines.py", line 385, in run
entities, relationships, reports, sources = asyncio.run(
File "E:\CondaEnvironments\kotaemon-nanographrag_env\lib\asyncio\runners.py", line 44, in run
return loop.run_until_complete(main)
File "E:\CondaEnvironments\kotaemon-nanographrag_env\lib\asyncio\base_events.py", line 649, in run_until_complete
return future.result()
File "E:\AI\LLM\kotaemon\libs\ktem\ktem\index\file\graph\nano_pipelines.py", line 155, in nano_graph_rag_build_local_query_context
use_communities = await _find_most_related_community_from_entities(
File "E:\CondaEnvironments\kotaemon-nanographrag_env\lib\site-packages\nano_graphrag\_op.py", line 698, in _find_most_related_community_from_entities
related_community_keys = sorted(
File "E:\CondaEnvironments\kotaemon-nanographrag_env\lib\site-packages\nano_graphrag\_op.py", line 702, in <lambda>
related_community_datas[k]["report_json"].get("rating", -1),
KeyError: '2'
Session reasoning type simple use mindmap (default) use citation inline language (default)
Session LLM saiga
Reasoning class <class 'ktem.reasoning.simple.FullQAPipeline'>
Reasoning state {'app': {'regen': False}, 'pipeline': {}}
Thinking ...
Retrievers [DocumentRetrievalPipeline(DS=<kotaemon.storages.docstores.lancedb.LanceDBDocumentStore object at 0x0000024B571F0F40>, FSPath=WindowsPath('E:/AI/LLM/kotaemon-NanoGraphRAG/ktem_app_data/user_data/files/index_1'), Index=<class 'ktem.index.file.index.IndexTable'>, Source=<class 'ktem.index.file.index.Source'>, VS=<kotaemon.storages.vectorstores.chroma.ChromaVectorStore object at 0x0000024B571F00A0>, get_extra_table=False, llm_scorer=LLMTrulensScoring(concurrent=True, normalize=10, prompt_template=<kotaemon.llms.prompts.template.PromptTemplate object at 0x0000024B538D7B50>, system_prompt_template=<kotaemon.llms.prompts.template.PromptTemplate object at 0x0000024B538D5090>, top_k=3, user_prompt_template=<kotaemon.llms.prompts.template.PromptTemplate object at 0x0000024B538D59F0>), mmr=False, rerankers=[CohereReranking(cohere_api_key='', model_name='rerank-multilingual-v2.0')], retrieval_mode='hybrid', top_k=10, user_id=1), GraphRAGRetrieverPipeline(DS=<theflow.base.unset_ object at 0x0000024B636A1D50>, FSPath=<theflow.base.unset_ object at 0x0000024B636A1D50>, Index=<class 'ktem.index.file.index.IndexTable'>, Source=<theflow.base.unset_ object at 0x0000024B636A1D50>, VS=<theflow.base.unset_ object at 0x0000024B636A1D50>, file_ids=[], user_id=<theflow.base.unset_ object at 0x0000024B636A1D50>), DocumentRetrievalPipeline(DS=<kotaemon.storages.docstores.lancedb.LanceDBDocumentStore object at 0x0000024B574712D0>, FSPath=WindowsPath('E:/AI/LLM/kotaemon-NanoGraphRAG/ktem_app_data/user_data/files/index_3'), Index=<class 'ktem.index.file.index.IndexTable'>, Source=<class 'ktem.index.file.index.Source'>, VS=<kotaemon.storages.vectorstores.chroma.ChromaVectorStore object at 0x0000024B57472C50>, get_extra_table=False, llm_scorer=LLMTrulensScoring(concurrent=True, normalize=10, prompt_template=<kotaemon.llms.prompts.template.PromptTemplate object at 0x0000024B538D59C0>, system_prompt_template=<kotaemon.llms.prompts.template.PromptTemplate object at 0x0000024B5393D240>, top_k=3, user_prompt_template=<kotaemon.llms.prompts.template.PromptTemplate object at 0x0000024B5393EC50>), mmr=False, rerankers=[CohereReranking(cohere_api_key='', model_name='rerank-multilingual-v2.0')], retrieval_mode='hybrid', top_k=10, user_id=1), NanoGraphRAGRetrieverPipeline(DS=<theflow.base.unset_ object at 0x0000024B636A1D50>, FSPath=<theflow.base.unset_ object at 0x0000024B636A1D50>, Index=<class 'ktem.index.file.index.IndexTable'>, Source=<theflow.base.unset_ object at 0x0000024B636A1D50>, VS=<theflow.base.unset_ object at 0x0000024B636A1D50>, file_ids=['a1a001b8-0d74-4fb3-b208-cd57fdde2db6'], user_id=<theflow.base.unset_ object at 0x0000024B636A1D50>)]
searching in doc_ids []
INFO:ktem.index.file.pipelines:Skip retrieval because of no selected files: DocumentRetrievalPipeline(
(vector_retrieval): <function Function._prepare_child.<locals>.exec at 0x0000024B5D3D24D0>
(embedding): <function Function._prepare_child.<locals>.exec at 0x0000024B5D3D2710>
)
searching in doc_ids []
INFO:ktem.index.file.pipelines:Skip retrieval because of no selected files: DocumentRetrievalPipeline(
(vector_retrieval): <function Function._prepare_child.<locals>.exec at 0x0000024B593CFEB0>
(embedding): <function Function._prepare_child.<locals>.exec at 0x0000024B5D5CBBE0>
)
INFO:httpx:HTTP Request: POST http://127.0.0.1:5678/v1/embeddings "HTTP/1.1 200 OK"
GraphRAG embedding dim 3584
INFO:nano-graphrag:Load KV full_docs with 0 data
INFO:nano-graphrag:Load KV text_chunks with 0 data
INFO:nano-graphrag:Load KV llm_response_cache with 438 data
INFO:nano-graphrag:Load KV community_reports with 0 data
INFO:nano-graphrag:Loaded graph from E:\AI\LLM\kotaemon-NanoGraphRAG\ktem_app_data\user_data\files\nano_graphrag\8983c848-526b-4783-a255-c0115fed6e63\input\graph_chunk_entity_relation.graphml with 1772 nodes, 1127 edges
INFO:nano-vectordb:Load (1563, 3584) data
INFO:nano-vectordb:Init {'embedding_dim': 3584, 'metric': 'cosine', 'storage_file': 'E:\AI\LLM\kotaemon-NanoGraphRAG\ktem_app_data\user_data\files\nano_graphrag\8983c848-526b-4783-a255-c0115fed6e63\input\vdb_entities.json'} 1563 data
INFO:httpx:HTTP Request: POST http://127.0.0.1:5678/v1/embeddings "HTTP/1.1 200 OK"
Traceback (most recent call last):
File "E:\CondaEnvironments\kotaemon-nanographrag_env\lib\site-packages\gradio\queueing.py", line 575, in process_events
response = await route_utils.call_process_api(
File "E:\CondaEnvironments\kotaemon-nanographrag_env\lib\site-packages\gradio\route_utils.py", line 276, in call_process_api
output = await app.get_blocks().process_api(
File "E:\CondaEnvironments\kotaemon-nanographrag_env\lib\site-packages\gradio\blocks.py", line 1923, in process_api
result = await self.call_function(
File "E:\CondaEnvironments\kotaemon-nanographrag_env\lib\site-packages\gradio\blocks.py", line 1520, in call_function
prediction = await utils.async_iteration(iterator)
File "E:\CondaEnvironments\kotaemon-nanographrag_env\lib\site-packages\gradio\utils.py", line 663, in async_iteration
return await iterator.__anext__()
File "E:\CondaEnvironments\kotaemon-nanographrag_env\lib\site-packages\gradio\utils.py", line 656, in __anext__
return await anyio.to_thread.run_sync(
File "E:\CondaEnvironments\kotaemon-nanographrag_env\lib\site-packages\anyio\to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
File "E:\CondaEnvironments\kotaemon-nanographrag_env\lib\site-packages\anyio\_backends\_asyncio.py", line 2441, in run_sync_in_worker_thread
return await future
File "E:\CondaEnvironments\kotaemon-nanographrag_env\lib\site-packages\anyio\_backends\_asyncio.py", line 943, in run
result = context.run(func, *args)
File "E:\CondaEnvironments\kotaemon-nanographrag_env\lib\site-packages\gradio\utils.py", line 639, in run_sync_iterator_async
return next(iterator)
File "E:\CondaEnvironments\kotaemon-nanographrag_env\lib\site-packages\gradio\utils.py", line 801, in gen_wrapper
response = next(iterator)
File "E:\AI\LLM\kotaemon\libs\ktem\ktem\pages\chat\__init__.py", line 981, in chat_fn
for response in pipeline.stream(chat_input, conversation_id, chat_history):
File "E:\AI\LLM\kotaemon\libs\ktem\ktem\reasoning\simple.py", line 287, in stream
docs, infos = self.retrieve(message, history)
File "E:\AI\LLM\kotaemon\libs\ktem\ktem\reasoning\simple.py", line 130, in retrieve
retriever_docs = retriever_node(text=query)
File "E:\CondaEnvironments\kotaemon-nanographrag_env\lib\site-packages\theflow\base.py", line 1097, in __call__
raise e from None
File "E:\CondaEnvironments\kotaemon-nanographrag_env\lib\site-packages\theflow\base.py", line 1088, in __call__
output = self.fl.exec(func, args, kwargs)
File "E:\CondaEnvironments\kotaemon-nanographrag_env\lib\site-packages\theflow\backends\base.py", line 151, in exec
return run(*args, **kwargs)
File "E:\CondaEnvironments\kotaemon-nanographrag_env\lib\site-packages\theflow\middleware.py", line 144, in __call__
raise e from None
File "E:\CondaEnvironments\kotaemon-nanographrag_env\lib\site-packages\theflow\middleware.py", line 141, in __call__
_output = self.next_call(*args, **kwargs)
File "E:\CondaEnvironments\kotaemon-nanographrag_env\lib\site-packages\theflow\middleware.py", line 117, in __call__
return self.next_call(*args, **kwargs)
File "E:\CondaEnvironments\kotaemon-nanographrag_env\lib\site-packages\theflow\base.py", line 1017, in _runx
return self.run(*args, **kwargs)
File "E:\AI\LLM\kotaemon\libs\ktem\ktem\index\file\graph\nano_pipelines.py", line 385, in run
entities, relationships, reports, sources = asyncio.run(
File "E:\CondaEnvironments\kotaemon-nanographrag_env\lib\asyncio\runners.py", line 44, in run
return loop.run_until_complete(main)
File "E:\CondaEnvironments\kotaemon-nanographrag_env\lib\asyncio\base_events.py", line 649, in run_until_complete
return future.result()
File "E:\AI\LLM\kotaemon\libs\ktem\ktem\index\file\graph\nano_pipelines.py", line 155, in nano_graph_rag_build_local_query_context
use_communities = await _find_most_related_community_from_entities(
File "E:\CondaEnvironments\kotaemon-nanographrag_env\lib\site-packages\nano_graphrag\_op.py", line 698, in _find_most_related_community_from_entities
related_community_keys = sorted(
File "E:\CondaEnvironments\kotaemon-nanographrag_env\lib\site-packages\nano_graphrag\_op.py", line 702, in <lambda>
related_community_datas[k]["report_json"].get("rating", -1),
KeyError: '8'

@RoadToNowhereX

Error: Expecting property name enclosed in double quotes
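
Both tracebacks fail inside the `sorted(...)` key at line 702, where `related_community_datas[k]` is indexed with a community id ('2', '8') that is not in the dict, and the log reports `Load KV community_reports with 0 data`. Below is a minimal sketch of a defensive version of that ranking step, assuming the missing ids are simply communities that have no stored report. The function name `rank_community_keys` and the sample data are hypothetical and are not part of nano-graphrag.

```python
def rank_community_keys(key_counts, community_datas):
    """Rank community ids by (mention count, report rating), highest first.

    key_counts: dict mapping community id -> how many retrieved entities refer to it.
    community_datas: dict mapping community id -> stored report record.
    Ids without a stored report are skipped instead of raising KeyError.
    """
    present = [k for k in key_counts if k in community_datas]
    return sorted(
        present,
        key=lambda k: (
            key_counts[k],
            # Fall back to -1 when the record has no report_json or no rating.
            community_datas[k].get("report_json", {}).get("rating", -1),
        ),
        reverse=True,
    )


if __name__ == "__main__":
    # Hypothetical data: ids "2" and "8" have no report, as in the tracebacks.
    counts = {"2": 3, "8": 1, "5": 3}
    reports = {"5": {"report_json": {"rating": 7.5}}}
    print(rank_community_keys(counts, reports))
```

This only avoids the crash. With zero community reports loaded, retrieval would still return no communities, so it is probably worth checking whether the community report generation step finished during indexing.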