Feature request: Enhanced Geospatial and Temporal Search #740
An approach more closely related (?) to attributes and types: extending the graph with additional data sources. Implementation:
from txtai.graph import Graph
import geopandas as gpd
import pandas as pd
from qwikidata.linked_data_interface import get_entity_dict_from_api


class EnhancedGraph(Graph):
    # Sketch only: self.graph is assumed to be the underlying NetworkX graph.
    # Adapt the constructor call and attribute name to the txtai version in use.
    def __init__(self):
        super().__init__()
        self.gdf = gpd.GeoDataFrame()
        self.temporal_data = pd.DataFrame()

    def add_geospatial_node(self, node_id, geometry, **attrs):
        self.graph.add_node(node_id, geometry=geometry, **attrs)
        # DataFrame.append was removed in pandas 2.0, so build a one-row frame and concat
        row = gpd.GeoDataFrame([{'node_id': node_id, **attrs}], geometry=[geometry])
        self.gdf = row if self.gdf.empty else pd.concat([self.gdf, row], ignore_index=True)

    def add_temporal_node(self, node_id, timestamp, **attrs):
        self.graph.add_node(node_id, timestamp=timestamp, **attrs)
        row = pd.DataFrame([{'node_id': node_id, 'timestamp': timestamp, **attrs}])
        self.temporal_data = row if self.temporal_data.empty else pd.concat([self.temporal_data, row], ignore_index=True)

    def import_wikidata(self, entity_id):
        entity_dict = get_entity_dict_from_api(entity_id)
        node_id = entity_dict['id']
        # 'claims' maps each property ID to a list of statements; keep the first value per property
        attrs = {}
        for prop, statements in entity_dict['claims'].items():
            mainsnak = statements[0]['mainsnak']
            if 'datavalue' in mainsnak:
                attrs[prop] = mainsnak['datavalue']['value']
        self.graph.add_node(node_id, **attrs)
        return node_id

    def to_geopandas(self):
        return self.gdf

    def to_temporal_pandas(self):
        return self.temporal_data

    def import_geojson(self, file_path):
        gdf = gpd.read_file(file_path)
        for idx, row in gdf.iterrows():
            # Drop the geometry column so it is not passed twice
            attrs = row.drop(gdf.geometry.name).to_dict()
            self.add_geospatial_node(idx, row.geometry, **attrs)

    def import_temporal_csv(self, file_path, timestamp_col, node_id_col):
        df = pd.read_csv(file_path, parse_dates=[timestamp_col])
        for idx, row in df.iterrows():
            # Drop the ID and timestamp columns so they are not duplicated in attrs
            attrs = row.drop([timestamp_col, node_id_col]).to_dict()
            self.add_temporal_node(row[node_id_col], row[timestamp_col], **attrs)

    def spatial_query(self, geometry):
        return self.gdf[self.gdf.intersects(geometry)]

    def temporal_query(self, start_time, end_time):
        mask = (self.temporal_data['timestamp'] >= start_time) & (self.temporal_data['timestamp'] <= end_time)
        return self.temporal_data.loc[mask]


graph = EnhancedGraph()

# Import geospatial data
graph.import_geojson("cities.geojson")

# Import temporal data
graph.import_temporal_csv("events.csv", timestamp_col="event_date", node_id_col="event_id")

# Import Wikidata
node_id = graph.import_wikidata("Q64")

# Perform spatial and temporal queries
# some_polygon is a shapely geometry supplied by the caller
cities_in_area = graph.spatial_query(some_polygon)
events_in_timeframe = graph.temporal_query(pd.Timestamp("2023-01-01"), pd.Timestamp("2023-12-31"))

# Convert to GeoDataFrame or DataFrame for further analysis
gdf = graph.to_geopandas()
temporal_df = graph.to_temporal_pandas()

This implementation enhances TxtAI's graph capabilities by:
Regarding the initial type problem: the approach integrates with TxtAI's ecosystem by extending its Graph class and using compatible libraries such as geopandas and pandas. It also builds on the underlying NetworkX graph structure, adding geospatial and temporal capabilities on top of it.
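The `import_wikidata` method above depends on the shape of Wikidata's entity JSON, where `claims` is a dictionary keyed by property ID and each value is a list of statements. A minimal, standard-library sketch of that extraction step follows. The entity below is a hand-written, trimmed-down stand-in with illustrative values (not a real API response), and `first_values` is a hypothetical helper name:

```python
# Hand-written stand-in for a Wikidata entity dict; values are illustrative only.
entity = {
    "id": "Q64",
    "claims": {
        "P17": [
            {"mainsnak": {"snaktype": "value", "property": "P17",
                          "datavalue": {"type": "wikibase-entityid",
                                        "value": {"id": "Q183"}}}}
        ],
        "P625": [
            {"mainsnak": {"snaktype": "value", "property": "P625",
                          "datavalue": {"type": "globecoordinate",
                                        "value": {"latitude": 52.5, "longitude": 13.4}}}}
        ],
        # A statement without a datavalue (snaktype "novalue") must be skipped.
        "P1": [
            {"mainsnak": {"snaktype": "novalue", "property": "P1"}}
        ],
    },
}


def first_values(entity_dict):
    """Map each property ID to the value of its first statement that has one."""
    attrs = {}
    for prop, statements in entity_dict.get("claims", {}).items():
        for statement in statements:
            snak = statement.get("mainsnak", {})
            if "datavalue" in snak:
                attrs[prop] = snak["datavalue"]["value"]
                break
    return attrs


attrs = first_values(entity)
print(attrs)
```

Note that values can be nested dictionaries (entity references, coordinates), so they may need flattening before being stored as node attributes or DataFrame columns.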
Here's a plan to enhance TxtAI with geospatial and temporal search capabilities:
1. Extend indexing for geospatial data:
2. Implement temporal search functionalities:
3. Integrate with existing semantic search:
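As a rough illustration of how the three steps above could fit together, the sketch below post-filters semantic search hits by distance and time window. It does not use TxtAI's API: the `Hit` record, `haversine_km`, `filter_hits`, and the sample data are all hypothetical, and it assumes each hit already carries a similarity score, coordinates, and a timestamp.

```python
from dataclasses import dataclass
from datetime import datetime
from math import asin, cos, radians, sin, sqrt


@dataclass
class Hit:
    """Hypothetical search hit: a semantic score plus location and time metadata."""
    id: str
    score: float  # semantic similarity, higher is better
    lat: float
    lon: float
    timestamp: datetime


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def filter_hits(hits, center, radius_km, start, end):
    """Keep hits within radius_km of center and inside [start, end], best score first."""
    lat0, lon0 = center
    kept = [
        h for h in hits
        if haversine_km(lat0, lon0, h.lat, h.lon) <= radius_km
        and start <= h.timestamp <= end
    ]
    return sorted(kept, key=lambda h: h.score, reverse=True)


# Made-up sample data
hits = [
    Hit("a", 0.91, 52.52, 13.40, datetime(2023, 5, 1)),   # close, inside the window
    Hit("b", 0.95, 48.85, 2.35, datetime(2023, 6, 1)),    # inside the window, too far away
    Hit("c", 0.80, 52.50, 13.45, datetime(2022, 1, 1)),   # close, outside the window
    Hit("d", 0.85, 52.40, 13.06, datetime(2023, 12, 1)),  # within 50 km, inside the window
]

results = filter_hits(
    hits,
    center=(52.52, 13.405),
    radius_km=50,
    start=datetime(2023, 1, 1),
    end=datetime(2023, 12, 31),
)
print([h.id for h in results])
```

Filtering after the semantic search keeps the sketch simple. With large result sets, applying the spatial and temporal constraints before or during retrieval would likely be more efficient.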
This implementation:
To use this enhanced graph:
This approach extends TxtAI's capabilities while maintaining simplicity and integration with its existing ecosystem.