# Context

#### Tools | GIS Mapping

If the data set associated with the network contains latitude and longitude coordinates, it is now possible to display a graphical object per observation/row on a **Google Map**.

The coordinates need to be loaded as continuous variables and thus have to be discretized. While the choice of the discretization can have an impact on the machine-learned model (if these coordinates are useful for the model), it does not have any impact on the mapping. The continuous values are utilized directly.

The graphical objects have four dimensions: **Shape**, **Color**, **Size** and **Opacity**.

Each of these dimensions can be:

- **Fixed**, i.e., identical for each object/observation,
- Based on the value of a variable, i.e., specific to each observation:
  - **Directly** extracted from the observation described in the data set when the variable is:
    - an **Observable Random Node** and the value is **Not Missing**,
  - **Inferred** with the current Bayesian network when the variable is:
    - an **Observable Random Node** and the value is **Missing**,
    - the **Target Node**,
    - a **Not Observable Random Node**,
    - a **Function Node** with numerical values.

For inference, all the non-missing values of the **Observable Random Nodes** are set as hard evidence.

The value that will be utilized for the mapping depends on the type of the variable:

- **Discrete:** the state is chosen with the **Maximum a posteriori** criterion,
- **Continuous:** the mean value is computed with the posterior probability distribution, normalized to bring all values into the range [0,1],
- **Function Node:** except when used to define the **Shape**, the value is normalized to bring all values into the range [0,1], by using the **Minimum** and **Maximum Values** set in the wizard.
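The three rules can be sketched in a few lines of Python. This is only an illustration, not the tool's implementation: the function names are hypothetical, `state_values` stands for one representative value per discretized bin (an assumption), and clamping the normalized result to [0,1] is also an assumption.

```python
from typing import Sequence


def map_state(posterior: Sequence[float]) -> int:
    """Return the rank of the most probable state (maximum a posteriori)."""
    return max(range(len(posterior)), key=lambda i: posterior[i])


def posterior_mean(posterior: Sequence[float], state_values: Sequence[float]) -> float:
    """Expected value of a discretized continuous node under its posterior.

    `state_values` holds one representative value per bin (assumption).
    """
    return sum(p * v for p, v in zip(posterior, state_values))


def normalize(value: float, minimum: float, maximum: float) -> float:
    """Min-max normalization into [0, 1], clamped at the bounds (assumption)."""
    if maximum == minimum:
        return 0.0
    return min(1.0, max(0.0, (value - minimum) / (maximum - minimum)))


posterior = [0.2, 0.5, 0.3]
rank = map_state(posterior)                          # most probable state
mean = posterior_mean(posterior, [100.0, 300.0, 500.0])
scaled = normalize(mean, 100.0, 500.0)               # value in [0, 1]
```

With this posterior, the second state is selected for a discrete node, while a continuous node would be represented by its posterior mean before normalization.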

Three shapes are defined: Circle (1), Square (2) and Triangle (3).

When not **Fixed**, the shape is chosen based on its rank and the inferred value:

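- For **Discrete** nodes, the shape is derived from the state's rank using the modulo operation, so the three shapes are reused cyclically when the node has more than three states;
- For **Function** nodes, the shape is derived from the value of the **Function** after converting it into an integer.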
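One plausible reading of the shape rule is sketched here in Python. The exact formulas are not reproduced in the text, so the details are assumptions: the rank is taken as zero-based, the result is shifted by one to land on the shape codes 1 to 3, and the function value is truncated with `int()` before the same modulo is applied. The function names are hypothetical.

```python
SHAPES = {1: "Circle", 2: "Square", 3: "Triangle"}


def shape_for_rank(rank: int) -> str:
    """Shape for a discrete state; `rank` is assumed to be zero-based."""
    return SHAPES[rank % len(SHAPES) + 1]


def shape_for_function_value(value: float) -> str:
    """Shape for a Function node: truncate to an integer, then wrap (assumption)."""
    return SHAPES[int(value) % len(SHAPES) + 1]


# A node with four states reuses the first shape for its fourth state.
shapes = [shape_for_rank(r) for r in range(4)]
```

Under these assumptions, the first three states map to Circle, Square, and Triangle in that order, which matches the three-bin example later in this section.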

When not **Fixed** to the user-defined **Fixed Value**, the size is chosen based on this **Fixed Value** and the inferred value:

- For **Discrete** variables, the size is computed from the normalized state's rank;
- For **Continuous** and **Function** nodes, the size is computed from the normalized value.

When not **Fixed** to the chosen color, the color is chosen as follows:

- For **Discrete** variables: the color is chosen based on the state's rank and the **Secondary Color Palette**,
- For **Continuous** and **Function** nodes: the normalized value is directly used to define a color on the user-defined scale **Min**, **Mid** (if checked), and **Max**.
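To illustrate how a normalized value can be placed on a **Min**/**Mid**/**Max** scale, here is a minimal Python sketch. It assumes linear interpolation in RGB space and a **Mid** color located at 0.5; neither is stated in the text, and the function names are hypothetical.

```python
from typing import Optional, Tuple

RGB = Tuple[int, int, int]


def lerp(a: RGB, b: RGB, t: float) -> RGB:
    """Linear interpolation between two RGB colors, t in [0, 1]."""
    return tuple(round(x + (y - x) * t) for x, y in zip(a, b))


def color_on_scale(t: float, c_min: RGB, c_max: RGB, c_mid: Optional[RGB] = None) -> RGB:
    """Map a normalized value onto a Min/(Mid)/Max color scale.

    Assumption: when a Mid color is given, it sits exactly at t = 0.5.
    """
    t = min(1.0, max(0.0, t))
    if c_mid is None:
        return lerp(c_min, c_max, t)
    if t <= 0.5:
        return lerp(c_min, c_mid, t * 2)
    return lerp(c_mid, c_max, (t - 0.5) * 2)


blue, white, red = (0, 0, 255), (255, 255, 255), (255, 0, 0)
low = color_on_scale(0.0, blue, red, white)
middle = color_on_scale(0.5, blue, red, white)
high = color_on_scale(1.0, blue, red, white)
```

Leaving `c_mid` unset corresponds to the case where **Mid** is not checked, so the scale runs directly from **Min** to **Max**.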

When not **Fixed** to the user-defined **Fixed Value**, the opacity is chosen based on this **Fixed Value** and the inferred value:

- For **Discrete** variables, the opacity is computed from the normalized state's rank;
- For **Continuous** and **Function** nodes, the opacity is computed from the normalized value.
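The size and opacity rules share the same structure: a user-defined **Fixed Value** combined with a normalized quantity. The sketch below assumes that the **Fixed Value** acts as a maximum that is multiplied by the normalized quantity, and that a zero-based rank is normalized as `rank / (n_states - 1)`. Both are assumptions; the actual formulas may differ, for instance by adding an offset so that the smallest objects remain visible.

```python
def normalized_rank(rank: int, n_states: int) -> float:
    """Map a zero-based state rank to [0, 1] (assumed normalization)."""
    if n_states < 2:
        return 1.0
    return rank / (n_states - 1)


def scale_by_fixed_value(fixed_value: float, normalized: float) -> float:
    """Scale the user-defined Fixed Value by a normalized quantity in [0, 1]."""
    return fixed_value * normalized


# Discrete node with three states, Fixed Value of 25 for the size.
size_middle_state = scale_by_fixed_value(25.0, normalized_rank(1, 3))

# Continuous or Function node with a normalized value of 0.8 and a
# Fixed Value of 1.0 for the opacity.
opacity = scale_by_fixed_value(1.0, 0.8)
```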

**Example**

Let's use a data set that contains house sale prices for King County, which includes Seattle. It describes homes sold between May 2014 and May 2015. More precisely, we have extracted the 94 houses that are more than 100 years old, that have been renovated, and come with a basement.

After having set *Price (K$)* as a **Target Node**, we've used the **Augmented Markov Blanket** algorithm for generating the following network:

The **Function Node** *Certainty* is defined as:

*1-Entropy(?Price (K$)?, yes)*
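To make the intent of this formula concrete, the following Python sketch computes one minus the normalized Shannon entropy of a posterior distribution. It assumes that the second argument (`yes`) requests an entropy normalized by the logarithm of the number of states, so that the result lies in [0,1]. The function names are hypothetical and not part of the formula language.

```python
import math
from typing import Sequence


def normalized_entropy(posterior: Sequence[float]) -> float:
    """Shannon entropy divided by log2(number of states), in [0, 1]."""
    n = len(posterior)
    if n < 2:
        return 0.0
    entropy = -sum(p * math.log2(p) for p in posterior if p > 0)
    return entropy / math.log2(n)


def certainty(posterior: Sequence[float]) -> float:
    """1 - normalized entropy: 1 for a certain prediction, 0 for a uniform one."""
    return 1.0 - normalized_entropy(posterior)


sure = certainty([1.0, 0.0, 0.0])             # fully concentrated posterior
unsure = certainty([0.25, 0.25, 0.25, 0.25])  # uniform posterior
```

Used as an opacity, such a value makes houses with a confident price prediction appear more opaque than those with a diffuse posterior.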

The first three parameters of this wizard are the general settings of the mapping:

- **Map Type**: Roads, Terrain, Satellite or Hybrid,
- **Latitude**: the continuous variable to use for the latitude coordinate,
- **Longitude**: the continuous variable to use for the longitude coordinate.

This setting generates the following map that takes into account four different variables:

- The **Observable** variable *Overall grade given to the housing unit* (discretized into three bins) defines the shape. The values are directly __read in the data set__ to determine the corresponding discrete bin, if not missing:
  - *<= 7.5*: CIRCLE
  - *<= 8.5*: SQUARE
  - *> 8.5*: TRIANGLE
- The **Observable** variable *Living room area in 2015* defines the size, with 25 as the maximum (set in the **Fixed** field). The continuous values are directly __read in the data set__, if not missing,
- The **Target Node** *Price (K$)* defines the color. The values are __the inferred posterior mean values__,
- The **Function Node** *Certainty* defines the opacity. The values are __inferred__.
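The shape assignment for the grade variable can be reproduced with a short Python sketch. The thresholds 7.5 and 8.5 and the shape names come from the example; the function name and the use of `bisect_left` to locate the bin are illustrative choices.

```python
from bisect import bisect_left

# Upper bounds of the first two bins; the third bin is open-ended (> 8.5).
THRESHOLDS = [7.5, 8.5]
SHAPES = ["CIRCLE", "SQUARE", "TRIANGLE"]


def shape_for_grade(grade: float) -> str:
    """Return the shape of the bin that contains `grade`.

    bisect_left keeps a value equal to a threshold in the lower bin,
    which matches the '<=' bounds of the example.
    """
    return SHAPES[bisect_left(THRESHOLDS, grade)]


shapes = [shape_for_grade(g) for g in (7, 8, 9)]
```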