Mapsafe uses geomasking, encryption, and notarisation to safeguard geospatial datasets. Each of these techniques is explained below.
First, make sure the data you want to mask is in shapefile format. Mapsafe can only load zipped shapefiles. Once you have the zip file, click the 'Choose File' button in the Upload tab and select the zipped shapefile from your file system, as shown in the video at the end of this page.
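Before uploading, it can help to confirm that the zip archive really contains a complete shapefile. The sketch below is not part of Mapsafe; it is a minimal local check that assumes the three mandatory shapefile components (.shp, .shx, .dbf) sit somewhere inside the archive. The function name `missing_shapefile_parts` and the file names are hypothetical.

```python
import io
import zipfile
from pathlib import PurePosixPath

# The three components every shapefile must have.
REQUIRED_EXTENSIONS = {".shp", ".shx", ".dbf"}


def missing_shapefile_parts(zip_source):
    """Return the mandatory shapefile extensions absent from the archive.

    zip_source may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(zip_source) as archive:
        present = {PurePosixPath(name).suffix.lower() for name in archive.namelist()}
    return REQUIRED_EXTENSIONS - present


# Demo with an in-memory archive (hypothetical file names, empty contents).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    for name in ("sites.shp", "sites.shx", "sites.dbf", "sites.prj"):
        archive.writestr(name, b"")
buffer.seek(0)

# An empty set means no mandatory part is missing.
print(missing_shapefile_parts(buffer))
```

For a file on disk, pass its path instead of the in-memory buffer.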
Geographic masks are a set of techniques that alter the location of points in a map to protect privacy without overly affecting any spatial patterns. In other words, geographic masks allow researchers to publish useful maps of approximate locations without exposing sensitive data or violating anyone's privacy. Of course, this is a trade-off: with more masking comes more privacy, but this privacy comes at the cost of information loss. If we apply too much masking to our data, the end result may not resemble the original data at all. While the balance between privacy and information loss can be tricky, it's best to err on the side of privacy.
Mapsafe uses the Maskmy.XYZ tool for masking. The tool performs donut masking, which is a funny term for a simple concept: moving each point in a random direction by a random distance between a chosen minimum and maximum. A more comprehensive explanation of the masking feature can be found in this document.
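To make the idea of donut masking concrete, here is a minimal sketch. It is not the Maskmy.XYZ implementation: the function names, the use of a spherical Earth with a small-offset approximation, and the choice to draw the distance uniformly are all assumptions made for illustration.

```python
import math
import random

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, used for the degree conversion


def donut_mask(lat, lon, min_dist_m, max_dist_m, rng=random):
    """Move one point a random distance in [min_dist_m, max_dist_m] in a random direction."""
    if not 0 <= min_dist_m <= max_dist_m:
        raise ValueError("expected 0 <= min_dist_m <= max_dist_m")
    distance = rng.uniform(min_dist_m, max_dist_m)
    bearing = rng.uniform(0.0, 2.0 * math.pi)
    # Small-offset approximation: convert north/east offsets in metres to degrees.
    d_lat = (distance * math.cos(bearing)) / EARTH_RADIUS_M
    d_lon = (distance * math.sin(bearing)) / (
        EARTH_RADIUS_M * math.cos(math.radians(lat))
    )
    return lat + math.degrees(d_lat), lon + math.degrees(d_lon)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres, used here only to check the displacement."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# Hypothetical point, masked by between 100 m and 500 m.
masked = donut_mask(-36.8485, 174.7633, 100.0, 500.0)
print(masked, haversine_m(-36.8485, 174.7633, *masked))
```

The minimum distance is what gives the "donut" its hole: no masked point can land closer to its true location than that inner radius. Drawing the distance uniformly places more points near the inner edge per unit area; a real tool may weight the draw differently.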
Hexagonal binning (hexbinning) is the other option for anonymising geospatial datasets in Mapsafe. Geographic points are aggregated into hexagonal cells using Uber's h3-js library. Users can choose the spatial resolution (i.e., the Uber H3 spatial indexing level) along with the buffer radius for encoding. The buffer radius (in km) specifies how far the coverage of the binning should span.
Depending on the area depicted in the dataset, a suitable spatial resolution level needs to be chosen to balance privacy and utility. Large hexagons (lower resolutions) place many distant locations in the same cell, while small hexagons (higher resolutions) may produce results that are too sparse to cover the entire area. The chosen resolution should therefore be coarse enough that the cells cover the area densely, yet fine enough that each cell still preserves useful location detail.
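The resolution trade-off can be seen with a toy aggregation. The sketch below deliberately does not use H3: as a stand-in it bins points into a plain square latitude/longitude grid, which is enough to show how cell size changes the result. The function names, coordinates, and cell sizes are hypothetical.

```python
import math
from collections import Counter


def square_cell(lat, lon, cell_deg):
    """Stand-in for an H3 cell index: a square grid cell of side cell_deg degrees."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def bin_points(points, cell_deg):
    """Count how many points fall into each grid cell."""
    return Counter(square_cell(lat, lon, cell_deg) for lat, lon in points)


points = [
    (-36.8485, 174.7633),
    (-36.8487, 174.7636),  # a close neighbour of the first point
    (-36.8604, 174.7803),
    (-36.8955, 174.7955),
]

coarse = bin_points(points, 0.1)   # large cells: distant points share a cell
fine = bin_points(points, 0.001)   # small cells: mostly one point per cell

print(len(coarse), max(coarse.values()))
print(len(fine), max(fine.values()))
```

With the large cells every point collapses into one bin, hiding locations but also hiding the spatial pattern; with the small cells most bins hold a single point, which keeps detail but offers little protection. Choosing an H3 resolution involves the same judgement.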
(Image acquired from https://www.kontur.io/blog/why-we-use-h3/)
Encryption uses a passphrase to transform the original data into a form unrecoverable by an adversary. Mapsafe uses the encryption facility provided by the Web Crypto API, which is built into the browser. A 15-term passphrase is randomly generated and used to encrypt the masked dataset; the passphrase is later required to recover the original data. The masked dataset is encrypted in browser memory at three levels.
These passphrases are later required to decrypt each of the three levels.
A detailed description of the encryption and decryption processes is provided in this image.
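As an illustration of the passphrase step only, the sketch below generates a random 15-term passphrase and stretches it into a 256-bit key. It does not reproduce Mapsafe's code: the word list, the use of PBKDF2 with SHA-256, the iteration count, and the function names are all assumptions, and the actual encryption (which Mapsafe performs in the browser) is left out.

```python
import hashlib
import secrets

# Hypothetical, deliberately tiny word list; a real one would hold thousands of words.
WORDLIST = [
    "anchor", "basalt", "cedar", "delta", "ember", "fjord", "glacier", "harbour",
    "island", "jetty", "kelp", "lagoon", "meadow", "nectar", "orchid", "pebble",
]


def generate_passphrase(n_terms=15, words=WORDLIST):
    """Pick n_terms words using a cryptographically secure random source."""
    return " ".join(secrets.choice(words) for _ in range(n_terms))


def derive_key(passphrase, salt, iterations=200_000):
    """Stretch the passphrase into a 32-byte key (assumed scheme: PBKDF2-HMAC-SHA256)."""
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, iterations, dklen=32
    )


passphrase = generate_passphrase()
salt = secrets.token_bytes(16)  # stored alongside the ciphertext; it is not secret
key = derive_key(passphrase, salt)

print(passphrase)
print(len(key))
```

The same passphrase and salt always yield the same key, which is why the passphrase must be kept safe: without it, the key cannot be rebuilt and the data cannot be decrypted.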
Notarisation creates a digital fingerprint of the data. Usually this is done via a cryptographic hash function that generates a 64-character string. Even the slightest change in the data creates a completely different hash value. Mapsafe mints (stores) the hash value of the encrypted file (containing the original and obfuscated geospatial datasets) as a public record on a tamper-proof Ethereum blockchain under the user's blockchain account. Using the MetaMask wallet, a user can mint this hash under their Ethereum account. Once the hash value is minted, a link to the Ethereum address where the data is stored as a public record is presented.
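The fingerprinting idea can be sketched in a few lines. The example assumes SHA-256, which produces a 64-character hexadecimal digest; the page does not state which hash function Mapsafe uses, and the function name and sample bytes are hypothetical.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of data as a 64-character hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


original = b"encrypted dataset bytes"
altered = b"encrypted dataset bytes."  # one extra byte

print(fingerprint(original))
print(fingerprint(altered))
print(fingerprint(original) == fingerprint(altered))
```

Anyone holding a copy of the file can recompute its fingerprint and compare it with the minted value; a mismatch means the file is not the one that was notarised.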
All data should be notarised on the Ethereum mainnet. However, for testing, one can use a public Ethereum testnet such as Goerli. To mint data on either network, you need to carry out these tasks.
Watch this video to learn how to safeguard your data, from start to finish!