
Commit

Merge pull request #65 from jiayuasu/GeoSpark-for-Spark-1.X
GeoSpark bumps to 0.5.2
jiayuasu authored Mar 3, 2017
2 parents bae6012 + ac4481c commit 63ac484
Showing 12 changed files with 617 additions and 118 deletions.
4 changes: 2 additions & 2 deletions LICENSE
@@ -1,6 +1,6 @@
The MIT License (MIT)

Copyright (c) 2016 Mohamed Sarwat
Copyright (c) 2015-2017 Arizona State University Data Systems Lab (http://www.datasyslab.org)

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
@@ -18,4 +18,4 @@ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
SOFTWARE.
12 changes: 6 additions & 6 deletions README.md
@@ -20,9 +20,11 @@ GeoSpark artifacts are hosted in Maven Central: [**Maven Central Coordinates**](

| Version | Summary |
|:----------------: |----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|0.5.2| **Bug fix:** Fix [Issue #58](https://github.com/DataSystemsLab/GeoSpark/issues/58) and [Issue #60](https://github.com/DataSystemsLab/GeoSpark/issues/60); **Performance enhancement:** (1) Deprecate all old Spatial RDD constructors. See the JavaDoc [here](http://www.public.asu.edu/~jiayu2/geospark/javadoc/0.5.2/). (2) Recommend the new SRDD constructors, which take an additional RDD storage level and automatically cache rawSpatialRDD to accelerate the internal SRDD analyze step|
|0.5.1| **Bug fix:** (1) GeoSpark: Fix inaccurate KNN results when K is large; (2) GeoSpark: Replace an incompatible Spark API call [Issue #55](https://github.com/DataSystemsLab/GeoSpark/issues/55); (3) Babylon: Remove the JPG output format temporarily due to the lack of OpenJDK support|
| 0.5.0| **Major update:** We are pleased to announce the initial version of [Babylon](https://github.com/DataSystemsLab/GeoSpark/tree/master/src/main/java/org/datasyslab/babylon), a large-scale in-memory geospatial visualization system extending GeoSpark. Babylon is integrated with GeoSpark: just import GeoSpark and enjoy! More details are available here: [Babylon GeoSpatial Visualization](https://github.com/DataSystemsLab/GeoSpark/tree/master/src/main/java/org/datasyslab/babylon)|
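The 0.5.2 recommendation above — constructors that take a storage level and cache rawSpatialRDD so the analyze step does not recompute the source — can be illustrated with a dependency-free sketch. All names here are hypothetical stand-ins; real GeoSpark does this over Spark RDDs with `StorageLevel`:

```java
import java.util.List;
import java.util.function.Supplier;

/** Dependency-free sketch (hypothetical names) of the 0.5.2 pattern: cache the
 *  raw dataset at construction so the internal analyze step (record count plus
 *  dataset boundary) does not re-read the source for each pass. */
class SpatialData {
    final long count;
    final double[] mbr = {Double.POSITIVE_INFINITY, Double.POSITIVE_INFINITY,
                          Double.NEGATIVE_INFINITY, Double.NEGATIVE_INFINITY};

    SpatialData(Supplier<List<double[]>> source, boolean cacheOnConstruct) {
        if (cacheOnConstruct) {
            List<double[]> pts = source.get();   // materialized exactly once
            this.count = pts.size();             // analyze pass 1: count
            for (double[] p : pts) grow(p);      // analyze pass 2: boundary
        } else {
            this.count = source.get().size();        // deprecated style: the source
            for (double[] p : source.get()) grow(p); // is recomputed per pass
        }
    }

    private void grow(double[] p) {
        mbr[0] = Math.min(mbr[0], p[0]); mbr[1] = Math.min(mbr[1], p[1]);
        mbr[2] = Math.max(mbr[2], p[0]); mbr[3] = Math.max(mbr[3], p[1]);
    }
}
```

The caching variant reads the source once, while the deprecated style reads it once per analysis pass; on Spark that difference means re-running the whole input pipeline.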


# Important features ([more](https://github.com/DataSystemsLab/GeoSpark/wiki/GeoSpark-Important-Features))
## Spatial Resilient Distributed Datasets (SRDDs)
Supported Spatial RDDs: PointRDD, RectangleRDD, PolygonRDD, LineStringRDD
@@ -44,9 +46,10 @@ Inside, Overlap, DatasetBoundary, Minimum Bounding Rectangle, Polygon Union
## Spatial Operation
Spatial Range Query, Spatial Join Query, and Spatial K Nearest Neighbors Query.
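The three query types listed above can be sketched with a simplified, single-machine analogue (hypothetical class, points as `{x, y}` arrays); GeoSpark itself runs these over distributed Spatial RDDs, typically with spatial indexes:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Dependency-free sketch of range, join, and KNN queries on in-memory points. */
class SpatialOps {
    /** Spatial range query: points inside the rectangle {minX, minY, maxX, maxY}. */
    static List<double[]> rangeQuery(List<double[]> points, double[] env) {
        List<double[]> hits = new ArrayList<>();
        for (double[] p : points)
            if (p[0] >= env[0] && p[0] <= env[2] && p[1] >= env[1] && p[1] <= env[3])
                hits.add(p);
        return hits;
    }

    /** Spatial join query: pair each rectangle with the points it contains. */
    static List<List<double[]>> joinQuery(List<double[]> points, List<double[]> envs) {
        List<List<double[]>> result = new ArrayList<>();
        for (double[] env : envs) result.add(rangeQuery(points, env));
        return result;
    }

    /** Spatial K nearest neighbors of (qx, qy) by Euclidean distance. */
    static List<double[]> knn(List<double[]> points, double qx, double qy, int k) {
        List<double[]> sorted = new ArrayList<>(points);
        sorted.sort(Comparator.comparingDouble((double[] p) -> Math.hypot(p[0] - qx, p[1] - qy)));
        return sorted.subList(0, Math.min(k, sorted.size()));
    }
}
```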

# GeoSpark Tutorial ([more](https://github.com/DataSystemsLab/GeoSpark/wiki/GeoSpark-Tutorial))
# GeoSpark Tutorial ([more](https://github.com/DataSystemsLab/GeoSpark/wiki))
The full GeoSpark tutorial is available on the GeoSpark GitHub Wiki: [https://github.com/DataSystemsLab/GeoSpark/wiki](https://github.com/DataSystemsLab/GeoSpark/wiki)

#Babylon Visualization Framework on GeoSpark
# Babylon Visualization Framework on GeoSpark
Babylon is a large-scale in-memory geospatial visualization system.

Babylon provides native support for general cartographic design by extending GeoSpark to process large-scale spatial data. It can visualize Spatial RDDs and Spatial Queries and render super-high-resolution images in parallel.
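The rendering idea can be illustrated with a simplified single-machine sketch (hypothetical names): map each spatial object to pixels, then count hits per cell. Real Babylon distributes this over Spark partitions and merges per-partition pixel buffers:

```java
import java.util.List;

/** Dependency-free sketch of point rasterization, the core of a heat-map render. */
class Rasterizer {
    /** Rasterize points in [minX,maxX] x [minY,maxY] onto a width x height grid;
     *  each cell counts how many points fall inside it. */
    static int[][] render(List<double[]> points, double minX, double minY,
                          double maxX, double maxY, int width, int height) {
        int[][] pixels = new int[height][width];
        for (double[] p : points) {
            int px = (int) Math.min(width - 1, (p[0] - minX) / (maxX - minX) * width);
            int py = (int) Math.min(height - 1, (p[1] - minY) / (maxY - minY) * height);
            if (px >= 0 && py >= 0) pixels[py][px]++;   // skip points left of/below the canvas
        }
        return pixels;
    }
}
```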
@@ -81,11 +84,9 @@ Please refer to [JTS Topology Suite website](http://tsusiatsoftware.net/jts/main

* Email us!

## Contributors
## Contact
* [Jia Yu](http://www.public.asu.edu/~jiayu2/) (Email: [email protected])

* [Jinxuan Wu](http://www.public.asu.edu/~jinxuanw/) (Email: [email protected])

* [Mohamed Sarwat](http://faculty.engineering.asu.edu/sarwat/) (Email: [email protected])

## Project website
@@ -96,4 +97,3 @@ GeoSpark is one of the projects under [Data Systems Lab](http://www.datasyslab.o

# Thanks for the help from GeoSpark community
We appreciate the help and suggestions from GeoSpark users: [**Thanks List**](https://github.com/DataSystemsLab/GeoSpark/wiki/GeoSpark-Community-Thanks-List)

4 changes: 2 additions & 2 deletions pom.xml
@@ -3,7 +3,7 @@
<modelVersion>4.0.0</modelVersion>
<groupId>org.datasyslab</groupId>
<artifactId>geospark</artifactId>
<version>0.5.1-spark-1.x</version>
<version>0.5.2-spark-1.x</version>

<name>${project.groupId}:${project.artifactId}</name>
<description>Geospatial extension for Apache Spark</description>
@@ -64,7 +64,7 @@
<dependency>
<groupId>org.wololo</groupId>
<artifactId>jts2geojson</artifactId>
<version>0.7.0</version>
<version>0.10.0</version>
<exclusions>
<exclusion>
<groupId>com.vividsolutions</groupId>
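With the version bump above, Spark 1.x users pull this release from Maven Central using the coordinates the README links to; a minimal dependency block for a consumer's `pom.xml`:

```xml
<dependency>
  <groupId>org.datasyslab</groupId>
  <artifactId>geospark</artifactId>
  <version>0.5.2-spark-1.x</version>
</dependency>
```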
191 changes: 156 additions & 35 deletions src/main/java/org/datasyslab/geospark/spatialRDD/LineStringRDD.java
@@ -13,16 +13,16 @@
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.FlatMapFunction;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.storage.StorageLevel;
import org.datasyslab.geospark.enums.FileDataSplitter;
import org.datasyslab.geospark.formatMapper.LineStringFormatMapper;
import org.wololo.geojson.GeoJSON;
import org.wololo.jts2geojson.GeoJSONWriter;

import com.vividsolutions.jts.geom.Envelope;
import com.vividsolutions.jts.geom.Geometry;
import com.vividsolutions.jts.geom.Polygon;
import com.vividsolutions.jts.geom.LineString;

// TODO: Auto-generated Javadoc
/**
* The Class LineStringRDD.
*/
@@ -32,18 +32,19 @@ public class LineStringRDD extends SpatialRDD{
* Instantiates a new line string RDD.
*
* @param rawSpatialRDD the raw spatial RDD
* @deprecated Please append RDD Storage Level after all the existing parameters
*/
public LineStringRDD(JavaRDD<Polygon> rawSpatialRDD) {
this.rawSpatialRDD = rawSpatialRDD.map(new Function<Polygon,Object>()
@Deprecated
public LineStringRDD(JavaRDD<LineString> rawSpatialRDD) {
this.rawSpatialRDD = rawSpatialRDD.map(new Function<LineString,Object>()
{
@Override
public Object call(Polygon spatialObject) throws Exception {
public Object call(LineString spatialObject) throws Exception {
return spatialObject;
}

});
this.boundary();
this.totalNumberOfRecords = this.rawSpatialRDD.count();
this.analyze();
}

/**
@@ -56,11 +57,12 @@ public Object call(Polygon spatialObject) throws Exception {
* @param splitter the splitter
* @param carryInputData the carry input data
* @param partitions the partitions
* @deprecated Please append RDD Storage Level after all the existing parameters
*/
@Deprecated
public LineStringRDD(JavaSparkContext SparkContext, String InputLocation, Integer startOffset, Integer endOffset, FileDataSplitter splitter, boolean carryInputData, Integer partitions) {
this.setRawSpatialRDD(SparkContext.textFile(InputLocation, partitions).flatMap(new LineStringFormatMapper(startOffset, endOffset, splitter, carryInputData)));
this.boundary();
this.totalNumberOfRecords = this.rawSpatialRDD.count();
this.analyze();
}


@@ -73,11 +75,12 @@ public LineStringRDD(JavaSparkContext SparkContext, String InputLocation, Intege
* @param endOffset the end offset
* @param splitter the splitter
* @param carryInputData the carry input data
* @deprecated Please append RDD Storage Level after all the existing parameters
*/
@Deprecated
public LineStringRDD(JavaSparkContext SparkContext, String InputLocation, Integer startOffset, Integer endOffset, FileDataSplitter splitter, boolean carryInputData) {
this.setRawSpatialRDD(SparkContext.textFile(InputLocation).flatMap(new LineStringFormatMapper(startOffset, endOffset, splitter, carryInputData)));
this.boundary();
this.totalNumberOfRecords = this.rawSpatialRDD.count();
this.analyze();
}

/**
@@ -88,11 +91,12 @@ public LineStringRDD(JavaSparkContext SparkContext, String InputLocation, Intege
* @param splitter the splitter
* @param carryInputData the carry input data
* @param partitions the partitions
* @deprecated Please append RDD Storage Level after all the existing parameters
*/
@Deprecated
public LineStringRDD(JavaSparkContext SparkContext, String InputLocation, FileDataSplitter splitter, boolean carryInputData, Integer partitions) {
this.setRawSpatialRDD(SparkContext.textFile(InputLocation, partitions).flatMap(new LineStringFormatMapper(splitter, carryInputData)));
this.boundary();
this.totalNumberOfRecords = this.rawSpatialRDD.count();
this.analyze();
}


@@ -103,11 +107,12 @@ public LineStringRDD(JavaSparkContext SparkContext, String InputLocation, FileDa
* @param InputLocation the input location
* @param splitter the splitter
* @param carryInputData the carry input data
* @deprecated Please append RDD Storage Level after all the existing parameters
*/
@Deprecated
public LineStringRDD(JavaSparkContext SparkContext, String InputLocation, FileDataSplitter splitter, boolean carryInputData) {
this.setRawSpatialRDD(SparkContext.textFile(InputLocation).flatMap(new LineStringFormatMapper(splitter, carryInputData)));
this.boundary();
this.totalNumberOfRecords = this.rawSpatialRDD.count();
this.analyze();
}


@@ -118,11 +123,12 @@ public LineStringRDD(JavaSparkContext SparkContext, String InputLocation, FileDa
* @param InputLocation the input location
* @param partitions the partitions
* @param userSuppliedMapper the user supplied mapper
* @deprecated Please append RDD Storage Level after all the existing parameters
*/
@Deprecated
public LineStringRDD(JavaSparkContext sparkContext, String InputLocation, Integer partitions, FlatMapFunction userSuppliedMapper) {
this.setRawSpatialRDD(sparkContext.textFile(InputLocation, partitions).flatMap(userSuppliedMapper));
this.boundary();
this.totalNumberOfRecords = this.rawSpatialRDD.count();
this.analyze();
}

/**
@@ -131,11 +137,141 @@ public LineStringRDD(JavaSparkContext sparkContext, String InputLocation, Intege
* @param sparkContext the spark context
* @param InputLocation the input location
* @param userSuppliedMapper the user supplied mapper
* @deprecated Please append RDD Storage Level after all the existing parameters
*/
@Deprecated
public LineStringRDD(JavaSparkContext sparkContext, String InputLocation, FlatMapFunction userSuppliedMapper) {
this.setRawSpatialRDD(sparkContext.textFile(InputLocation).flatMap(userSuppliedMapper));
this.boundary();
this.totalNumberOfRecords = this.rawSpatialRDD.count();
this.analyze();
}

/**
* Instantiates a new line string RDD.
*
* @param rawSpatialRDD the raw spatial RDD
* @param newLevel the new level
*/
public LineStringRDD(JavaRDD<LineString> rawSpatialRDD, StorageLevel newLevel) {
this.rawSpatialRDD = rawSpatialRDD.map(new Function<LineString,Object>()
{
@Override
public Object call(LineString spatialObject) throws Exception {
return spatialObject;
}

});
this.analyze(newLevel);
}

/**
* Instantiates a new line string RDD.
*
* @param SparkContext the spark context
* @param InputLocation the input location
* @param startOffset the start offset
* @param endOffset the end offset
* @param splitter the splitter
* @param carryInputData the carry input data
* @param partitions the partitions
* @param newLevel the new level
*/
public LineStringRDD(JavaSparkContext SparkContext, String InputLocation, Integer startOffset, Integer endOffset,
FileDataSplitter splitter, boolean carryInputData, Integer partitions, StorageLevel newLevel) {
this.setRawSpatialRDD(SparkContext.textFile(InputLocation, partitions).flatMap(new LineStringFormatMapper(startOffset, endOffset, splitter, carryInputData)));
this.analyze(newLevel);
}


/**
* Instantiates a new line string RDD.
*
* @param SparkContext the spark context
* @param InputLocation the input location
* @param startOffset the start offset
* @param endOffset the end offset
* @param splitter the splitter
* @param carryInputData the carry input data
* @param newLevel the new level
*/
public LineStringRDD(JavaSparkContext SparkContext, String InputLocation, Integer startOffset, Integer endOffset,
FileDataSplitter splitter, boolean carryInputData, StorageLevel newLevel) {
this.setRawSpatialRDD(SparkContext.textFile(InputLocation).flatMap(new LineStringFormatMapper(startOffset, endOffset, splitter, carryInputData)));
this.analyze(newLevel);
}

/**
* Instantiates a new line string RDD.
*
* @param SparkContext the spark context
* @param InputLocation the input location
* @param splitter the splitter
* @param carryInputData the carry input data
* @param partitions the partitions
* @param newLevel the new level
*/
public LineStringRDD(JavaSparkContext SparkContext, String InputLocation, FileDataSplitter splitter, boolean carryInputData, Integer partitions, StorageLevel newLevel) {
this.setRawSpatialRDD(SparkContext.textFile(InputLocation, partitions).flatMap(new LineStringFormatMapper(splitter, carryInputData)));
this.analyze(newLevel);
}


/**
* Instantiates a new line string RDD.
*
* @param SparkContext the spark context
* @param InputLocation the input location
* @param splitter the splitter
* @param carryInputData the carry input data
* @param newLevel the new level
*/
public LineStringRDD(JavaSparkContext SparkContext, String InputLocation,
FileDataSplitter splitter, boolean carryInputData, StorageLevel newLevel) {
this.setRawSpatialRDD(SparkContext.textFile(InputLocation).flatMap(new LineStringFormatMapper(splitter, carryInputData)));
this.analyze(newLevel);
}


/**
* Instantiates a new line string RDD.
*
* @param sparkContext the spark context
* @param InputLocation the input location
* @param partitions the partitions
* @param userSuppliedMapper the user supplied mapper
* @param newLevel the new level
*/
public LineStringRDD(JavaSparkContext sparkContext, String InputLocation, Integer partitions, FlatMapFunction userSuppliedMapper, StorageLevel newLevel) {
this.setRawSpatialRDD(sparkContext.textFile(InputLocation, partitions).flatMap(userSuppliedMapper));
this.analyze(newLevel);
}

/**
* Instantiates a new line string RDD.
*
* @param sparkContext the spark context
* @param InputLocation the input location
* @param userSuppliedMapper the user supplied mapper
* @param newLevel the new level
*/
public LineStringRDD(JavaSparkContext sparkContext, String InputLocation, FlatMapFunction userSuppliedMapper, StorageLevel newLevel) {
this.setRawSpatialRDD(sparkContext.textFile(InputLocation).flatMap(userSuppliedMapper));
this.analyze(newLevel);
}

/**
* Minimum bounding rectangle.
*
* @return the rectangle RDD
*/
@Deprecated
public RectangleRDD MinimumBoundingRectangle() {
JavaRDD<Envelope> rectangleRDD = this.rawSpatialRDD.map(new Function<Object, Envelope>() {
public Envelope call(Object spatialObject) {
Envelope MBR = ((Geometry)spatialObject).getEnvelopeInternal();
return MBR;
}
});
return new RectangleRDD(rectangleRDD);
}

/**
Expand All @@ -159,19 +295,4 @@ public Iterable<String> call(Iterator<Object> iterator) throws Exception {
}
}).saveAsTextFile(outputLocation);
}

/**
* Minimum bounding rectangle.
*
* @return the rectangle RDD
*/
public RectangleRDD MinimumBoundingRectangle() {
JavaRDD<Envelope> rectangleRDD = this.rawSpatialRDD.map(new Function<Object, Envelope>() {
public Envelope call(Object spatialObject) {
Envelope MBR = ((Geometry)spatialObject).getEnvelopeInternal();
return MBR;
}
});
return new RectangleRDD(rectangleRDD);
}
}
}
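The `MinimumBoundingRectangle` method shown in the diff (deprecated in this commit) maps every geometry to its envelope via JTS `Geometry.getEnvelopeInternal()`. A dependency-free sketch of the same per-geometry computation, with hypothetical names and geometries represented as lists of `{x, y}` vertices:

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: per-geometry minimum bounding rectangle, mirroring
 *  getEnvelopeInternal() without the JTS dependency. */
class Mbr {
    /** Returns {minX, minY, maxX, maxY} for one geometry's vertex list. */
    static double[] envelope(List<double[]> geometry) {
        double minX = Double.POSITIVE_INFINITY, minY = Double.POSITIVE_INFINITY;
        double maxX = Double.NEGATIVE_INFINITY, maxY = Double.NEGATIVE_INFINITY;
        for (double[] v : geometry) {
            minX = Math.min(minX, v[0]); maxX = Math.max(maxX, v[0]);
            minY = Math.min(minY, v[1]); maxY = Math.max(maxY, v[1]);
        }
        return new double[]{minX, minY, maxX, maxY};
    }

    /** Analogue of rawSpatialRDD.map(geom -> envelope): one MBR per geometry. */
    static List<double[]> minimumBoundingRectangles(List<List<double[]>> geometries) {
        List<double[]> out = new ArrayList<>();
        for (List<double[]> g : geometries) out.add(envelope(g));
        return out;
    }
}
```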