Safeguarding Guide

MapSafe uses geomasking, encryption, and notarisation to safeguard geospatial datasets. Each of these techniques is explained below.

How do I load data?

First, make sure the data you want to mask is in shapefile format. MapSafe can only load zipped shapefiles. Once you have the zip file, click the 'Choose File' button in the Upload tab and select the zipped shapefile from your file system, as shown in the video at the end of this page.
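Before uploading, it can help to confirm that the zip archive actually holds a complete shapefile. The sketch below is only an illustration and is not part of MapSafe: the function name `missing_shapefile_parts` is hypothetical, and it checks only for the three components every shapefile requires (.shp, .shx and .dbf).

```python
import zipfile
from pathlib import PurePosixPath

# Components every shapefile must include (geometry, index, attributes).
REQUIRED_EXTENSIONS = {".shp", ".shx", ".dbf"}


def missing_shapefile_parts(zip_source):
    """Return the required extensions that are absent from a zipped shapefile.

    zip_source may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(zip_source) as archive:
        found = {PurePosixPath(name).suffix.lower() for name in archive.namelist()}
    return REQUIRED_EXTENSIONS - found
```

An empty result means all three required parts are present; optional parts such as .prj are not checked here.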

What is geographic masking?

Geographic masks are a set of techniques that alter the location of points in a map to protect privacy without overly affecting any spatial patterns. In other words, geographic masks allow researchers to publish useful maps of approximate locations, without exposing sensitive data or violating anyone's privacy. Of course, this is a trade-off: with more masking comes more privacy, but this privacy comes at the cost of information loss. If we apply too much masking to our data, the end result may not resemble the original data whatsoever. While the balance between privacy and information loss can be tricky, it's best to err on the side of privacy.

MapSafe uses the Maskmy.XYZ tool for masking. The tool performs donut masking, which is a funny term for a simple concept: moving each point randomly between a minimum and maximum distance. A more comprehensive explanation of the masking feature can be found in this document.

Donut masking
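As a rough illustration of the donut masking idea, the sketch below moves a point by a random distance between a minimum and a maximum, in a random direction. It is a minimal sketch and not the code used by the masking tool: the function name `donut_mask`, the uniform sampling of distance, and the flat-Earth conversion from metres to degrees (reasonable only for short distances) are all assumptions.

```python
import math
import random

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def donut_mask(lat, lon, min_m, max_m, rng=random):
    """Move a point a random distance in [min_m, max_m] in a random direction."""
    distance = rng.uniform(min_m, max_m)
    bearing = rng.uniform(0.0, 2.0 * math.pi)
    # Convert the north and east offsets from metres to radians of
    # latitude and longitude (small-distance approximation).
    d_lat = (distance * math.cos(bearing)) / EARTH_RADIUS_M
    d_lon = (distance * math.sin(bearing)) / (
        EARTH_RADIUS_M * math.cos(math.radians(lat))
    )
    return lat + math.degrees(d_lat), lon + math.degrees(d_lon)


# Example: displace a point by between 100 m and 500 m.
masked_lat, masked_lon = donut_mask(-36.85, 174.76, 100, 500)
```

A larger minimum distance gives more privacy, while a larger maximum distance distorts spatial patterns more, which is the trade-off described above.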

What is hexagonal binning?

Hexagonal binning (hexbinning) is the other option for anonymising geospatial datasets in MapSafe. Geographic points are aggregated into hexagonal cells using Uber's h3-js library. Users can choose the spatial resolution (i.e., the Uber H3 spatial indexing level) along with the buffer radius for encoding. The buffer radius (in km) specifies how far the coverage of the binning should span.

Depending on the area depicted in the dataset, a suitable spatial resolution level will need to be chosen to balance privacy and utility. Large hexagons (lower resolution) would group many distant locations into the same cell, while small hexagons (higher resolution) may produce results that are too sparse to cover the entire area. The chosen resolution should therefore be coarse enough to cover the entire area densely, yet fine enough to preserve location details.
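To get a feel for how cell size changes with the H3 resolution level, the sketch below estimates the average cell area at each level. It does not use the h3 library: it relies only on the fact that H3 has 2 + 120 × 7^r cells at resolution r, and it divides an approximate Earth surface area by that count. The helper names and the example area threshold are hypothetical.

```python
EARTH_SURFACE_KM2 = 510_065_622  # approximate surface area of the Earth


def h3_cell_count(resolution):
    """Total number of H3 cells at a given resolution (0 to 15)."""
    return 2 + 120 * 7 ** resolution


def average_cell_area_km2(resolution):
    """Average H3 cell area in square kilometres at a given resolution."""
    return EARTH_SURFACE_KM2 / h3_cell_count(resolution)


def coarsest_resolution_below(max_area_km2):
    """Lowest resolution whose average cell area is at most max_area_km2."""
    for resolution in range(16):
        if average_cell_area_km2(resolution) <= max_area_km2:
            return resolution
    return None


# Example: list the average cell area for every resolution level.
for r in range(16):
    print(r, f"{average_cell_area_km2(r):.6f} km^2")
```

Each step up in resolution shrinks the average cell to roughly one seventh of its previous area, so moving one level in either direction noticeably shifts the balance between privacy and detail.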

Hexagonal Binning

(Image acquired from https://www.kontur.io/blog/why-we-use-h3/)



What is encryption?

Encryption uses a passphrase to transform the original data into a form unrecoverable by an adversary. MapSafe uses the encryption facility provided by the Web Crypto API, which is built into the browser. A 15-term passphrase is randomly generated and used to encrypt the data; it is later required to recover the original data. Encryption is performed in browser memory at three levels:

  • The full 15-term passphrase is used to encrypt the original dataset,
  • 10 terms of the full passphrase are used to encrypt the masked dataset, and
  • 5 terms of the full passphrase are used to encrypt the 'more' masked dataset.

The corresponding passphrase is later required to decrypt each of the three levels.
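The three-level passphrase scheme can be sketched as follows. This is an illustration under stated assumptions, not MapSafe's implementation, which runs in the browser: the word list is hypothetical, the guide does not say which 10 or 5 terms are used (the first ones are assumed here), and the key derivation function (PBKDF2) is also an assumption. The sketch stops at key derivation because Python's standard library does not include a symmetric cipher.

```python
import hashlib
import secrets

# Hypothetical word list; a real generator would draw from a much larger one.
WORDS = [
    "amber", "birch", "cobalt", "delta", "ember", "fjord", "granite", "harbour",
    "island", "juniper", "kelp", "lagoon", "meadow", "nectar", "orchid", "pebble",
]


def generate_passphrase(n_terms=15):
    """Pick n_terms random words using a cryptographically secure source."""
    return [secrets.choice(WORDS) for _ in range(n_terms)]


def tier_passphrases(terms):
    """Build one passphrase per level.

    Assumption: the shorter passphrases are prefixes of the full one.
    """
    return {
        "original": " ".join(terms),         # full 15 terms
        "masked": " ".join(terms[:10]),      # first 10 terms
        "more_masked": " ".join(terms[:5]),  # first 5 terms
    }


def derive_key(passphrase, salt):
    """Stretch a passphrase into a 32-byte key with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", passphrase.encode("utf-8"), salt, 100_000)


tiers = tier_passphrases(generate_passphrase())
salt = secrets.token_bytes(16)
keys = {level: derive_key(phrase, salt) for level, phrase in tiers.items()}
```

Under this assumed design, someone holding the full passphrase can rebuild the two shorter ones, while someone given only the 5-term passphrase cannot reach the other two levels.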

A detailed description of the encryption and decryption processes is provided in this image.

Encryption Decryption Flow


The obfuscation (masking and hexbinning) and encryption are carried out at multiple levels:

Encryption decryption levels


What is notarisation?

Notarisation creates a digital fingerprint of the data. Usually this is done via a cryptographic hash function that generates a 64-character string. Even the slightest change in the data creates a completely different hash value. MapSafe mints (stores) the hash value of the encrypted file (containing the original and obfuscated geospatial datasets) as a public record on a tamper-proof Ethereum blockchain under the user's blockchain account. Using the MetaMask wallet, a user can mint this hash under their Ethereum account. Once the hash value is minted, a URL to the Ethereum address where the data is stored as a public record is presented.
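The fingerprinting step can be illustrated with a short sketch. It assumes SHA-256, which produces a 64-character hexadecimal string; the guide does not name the hash function MapSafe uses, and the function name `fingerprint` and the sample bytes are hypothetical.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


original = fingerprint(b"encrypted dataset bytes")
altered = fingerprint(b"encrypted dataset bytez")  # one character changed

print(len(original))        # 64
print(original == altered)  # False
```

Anyone who later receives the encrypted file can recompute its hash and compare it with the value recorded on the blockchain to confirm that the file has not been changed.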

Setting up for Notarisation

All data should be notarised on the Ethereum mainnet. However, for testing one can use a public Ethereum testnet, such as Goerli. To mint data on either network, you need to carry out these tasks.

  • First, visit Vanity-ETH to generate an Ethereum vanity address and a private key.
  • Second, using your generated Ethereum address, request Ether from this Goerli Faucet.
  • Third, install the MetaMask wallet and connect it to the Goerli testnet or mainnet.
  • If using the testnet, you can check the Ether balance and all minted data on your address at the Goerli Etherscan.

How do I use the MapSafe tool?

Watch this video to learn how to safeguard your data, from start to finish!