Sunday, July 14, 2019

A Star Is Born at UCR


In the Big Data Lab at UCR, we are happy to announce the first release of UCR STAR, formally the UCR Spatio-temporal Active Repository []. STAR is offered as a service to the research community to provide easy access to existing big spatio-temporal datasets through an interactive exploratory interface. Researchers and developers can choose from the datasets and explore them interactively on a map. Users can also search and filter those datasets as if they were shopping for their research, except that everything is free. The website is best accessed through a desktop browser, but a limited mobile-friendly interface is also provided. You can find more details on how to use the archive at [].


To better understand how to use UCR STAR, we provide an overview of the main features.

Map Interface

At the heart of UCR STAR, the map provides an interactive exploratory interface for the dataset. Similar to Google Maps or other web maps, you can zoom in/out and pan around. The goal is to get a quick overview of the data distribution, coverage, and accuracy. If the dataset seems a good fit for your work, you can then download it to further explore and analyze it.
The "Zoom All" button quickly zooms out or in to fit the currently selected dataset.

Dataset Selector

The dataset selector shows a list of all available datasets and allows you to select one of them to view. Currently, you can only choose one dataset at a time by clicking it. The selected dataset is highlighted.

Dataset Details

When a dataset is selected, its details are displayed as shown above. Mainly, these details include the following:
  • Project homepage: The homepage of the project that released this dataset.
  • Full download: A direct (external) link to download this dataset from its original source.
  • Subset download: Downloads only the subset of the data that is currently visible on the map. Note that this subset is served directly by UCR STAR, so the data is not in its original format.
  • Size: Size of the dataset in bytes. Note that this number depends on the dataset format; it is an estimate for the original decompressed dataset.
  • Number of records: Total number of records (features) in the dataset.
  • Number of points: Total number of points in the geometries of the data. For example, a polygon with a thousand points counts as one feature and 1,000 points.
  • Format: The format of the original file, e.g., CSV, GeoJSON, or Shapefile.
  • Geometry type: The type of geometries in the file, e.g., point, linestring, or polygon. Some datasets contain a mix of different geometries.
  • Attributes: A list of all attributes in this dataset. For some datasets, you can hover over an attribute to reveal its description.
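
To make the difference between "number of records" and "number of points" concrete, here is a minimal Python sketch that computes both for a GeoJSON feature collection. It is only an illustration and not the code UCR STAR runs; the function names and the sample data are made up, and whether UCR STAR counts the repeated closing point of a polygon ring is an assumption.

```python
def count_points(coords):
    """Recursively count the positions in a GeoJSON "coordinates" array."""
    if not coords:
        return 0
    # A position is a flat list of numbers, e.g. [lon, lat].
    if isinstance(coords[0], (int, float)):
        return 1
    return sum(count_points(c) for c in coords)


def summarize(feature_collection):
    """Return (number of records, number of points) for a FeatureCollection."""
    features = feature_collection["features"]
    n_points = 0
    for feature in features:
        geom = feature.get("geometry")
        if geom is None:
            continue  # a record without geometry contributes no points
        if geom["type"] == "GeometryCollection":
            n_points += sum(count_points(g["coordinates"]) for g in geom["geometries"])
        else:
            n_points += count_points(geom["coordinates"])
    return len(features), n_points


# Made-up sample: one square polygon and one point.
# Assumption: the closing position of the ring (a repeat of the first) is counted.
sample = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {},
         "geometry": {"type": "Polygon",
                      "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]]]}},
        {"type": "Feature", "properties": {},
         "geometry": {"type": "Point", "coordinates": [2, 3]}},
    ],
}

records, points = summarize(sample)
print(records, points)
```

In this sample the polygon counts as one record even though it carries five positions, which is why the two numbers can differ widely for line and polygon datasets.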

Text Search

You can quickly filter datasets using a text query. The query is matched against the name, description, and any additional (hidden) tags associated with each dataset. The text below the search box shows the number of datasets that match the current query.
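
As a rough illustration of this kind of search, the sketch below matches a query against the name, description, and tags of each dataset and reports the match count. The `Dataset` class, its fields, and the rule that every word of the query must appear somewhere (case-insensitively) are assumptions for the example; the actual matching logic of UCR STAR may differ.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    # Hypothetical catalog entry; the field names are illustrative only.
    name: str
    description: str = ""
    tags: list = field(default_factory=list)


def matches_text(ds, query):
    """True if every word of the query occurs in the name, description, or tags."""
    terms = query.lower().split()
    haystack = " ".join([ds.name, ds.description, *ds.tags]).lower()
    return all(term in haystack for term in terms)


def search(catalog, query):
    """Return the matching datasets together with their count."""
    hits = [ds for ds in catalog if matches_text(ds, query)]
    return hits, len(hits)


# Made-up catalog entries.
catalog = [
    Dataset("lakes", "Water bodies extracted from a map", ["hydrography"]),
    Dataset("roads", "Road network segments", ["transportation"]),
    Dataset("parks", "Parks and green areas", ["recreation"]),
]

hits, count = search(catalog, "road network")
print(count, [ds.name for ds in hits])
```

An empty query matches every dataset in this sketch, which mirrors the behavior of a search box that has not been filled in.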

Advanced Filter

You can open the advanced filter dialog by clicking the advanced filter button. The dialog provides the following filters for the datasets.
  • Data size: Filter datasets by total size in bytes.
  • Feature count: Filter by number of records (features).
  • Number of points: Filter by total number of points in the features.
  • Geometry type: The type of features in the dataset. If none is selected, this filter is not applied.
  • Format type: The original file format of the data. As with geometry type, if none is selected, this filter is not applied.
All these filters are ANDed together along with the text search. If you want to clear all filters (including the text filter), you can click the Clear All Filters button.
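
The way the filters combine can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the site's implementation: the dictionary keys, the `apply_filters` signature, and the sample catalog are hypothetical, and the text filter here only looks at the dataset name to keep the example short.

```python
def in_range(value, lo=None, hi=None):
    """True if value lies within [lo, hi]; a bound of None means 'no limit'."""
    return (lo is None or value >= lo) and (hi is None or value <= hi)


def apply_filters(catalog, text="", size=(None, None), records=(None, None),
                  points=(None, None), geometry_types=(), formats=()):
    """Keep only the datasets that satisfy every active filter (logical AND)."""
    text = text.lower()
    result = []
    for ds in catalog:
        checks = [
            text in ds["name"].lower(),          # an empty text matches everything
            in_range(ds["size"], *size),
            in_range(ds["records"], *records),
            in_range(ds["points"], *points),
            # With no selection, the geometry and format filters are not applied.
            not geometry_types or ds["geometry"] in geometry_types,
            not formats or ds["format"] in formats,
        ]
        if all(checks):
            result.append(ds)
    return result


# Made-up catalog entries.
catalog = [
    {"name": "lakes", "size": 5_000_000, "records": 8_000, "points": 900_000,
     "geometry": "polygon", "format": "Shapefile"},
    {"name": "roads", "size": 70_000_000, "records": 500_000, "points": 4_000_000,
     "geometry": "linestring", "format": "CSV"},
    {"name": "sensors", "size": 200_000, "records": 3_000, "points": 3_000,
     "geometry": "point", "format": "GeoJSON"},
]

small_non_points = apply_filters(
    catalog, size=(None, 10_000_000), geometry_types=("polygon", "linestring")
)
print([ds["name"] for ds in small_non_points])
```

Clearing all filters corresponds to calling `apply_filters(catalog)` with the defaults, which returns the whole catalog.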

Show Index Boundaries

Though shown in the advanced filter dialog, this option does not filter the data in any way. If you check that box, the map shows, in addition to the dataset, the boundaries of the index partitions that we use to store this dataset. This is mainly helpful for advanced users who want to understand how we manage such big datasets. You can click one of the partitions to see more details about it.

Share Button

The share button shows a share dialog that allows you to share the current view of the dataset in two ways.
  1. Permalink: The first one provides a permanent link to the current view. This includes the current dataset and location. You can bookmark this link or send it to your colleagues to point out an interesting part of the dataset.
  2. Embed: This option provides a JavaScript snippet that you can add to your own website to share the current view. This again includes the selected dataset and location. The snippet adds a map that shows the dataset and provides the same interactivity (zoom in/out and pan). There is no need to install any software on your server to add such a visualization.
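
The idea behind a permalink, namely that the selected dataset and the map location travel inside the link itself, can be illustrated with a small Python sketch. The URL layout, the parameter names, and the base address below are purely hypothetical and do not reflect the actual link format used by UCR STAR.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Placeholder address for illustration; not the real UCR STAR URL.
BASE_URL = "https://example.org/map"


def make_permalink(dataset, lat, lon, zoom):
    """Encode the current view (dataset, map center, zoom level) in a URL."""
    query = urlencode({
        "dataset": dataset,
        "lat": f"{lat:.5f}",
        "lon": f"{lon:.5f}",
        "zoom": zoom,
    })
    return f"{BASE_URL}?{query}"


def parse_permalink(url):
    """Recover the view from a link produced by make_permalink."""
    params = parse_qs(urlparse(url).query)
    return {
        "dataset": params["dataset"][0],
        "lat": float(params["lat"][0]),
        "lon": float(params["lon"][0]),
        "zoom": int(params["zoom"][0]),
    }


# "lakes" is a made-up dataset name; the coordinates are arbitrary.
link = make_permalink("lakes", 33.97, -117.33, 10)
view = parse_permalink(link)
print(link)
print(view)
```

Because the whole view is stored in the link, whoever opens it lands on the same dataset and location without any state being kept on their side.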

Embed Feature

Below is an example of embedding a visualization by simply copying and pasting the embed code provided by UCR STAR.
