First Impression on PyScript Through the Eyes of GIS! Distance Calculation!

Milad Korde and Bobby Basnet


Advanced knowledge of HTML, Advanced knowledge of JavaScript, Intermediate knowledge of GitHub, Intermediate knowledge of GIS

You have been using JavaScript for a while and always wished to use Python instead? I feel you! PyScript is the answer. Not that this is the only answer but in my opinion one of the best ones!

My first exposure to web development in Python 3 was when I learned about CherryPy! CherryPy is written in Python, and components that deliver dynamic content are written as Python classes tightly integrated with CherryPy’s core. CherryPy is actively developed and enjoys a large user community. But this post is not about CherryPy! It is about PyScript!

PyScript is a framework that allows users to create rich Python applications in the browser using HTML’s interface and the power of PyodideWASM, and modern web technologies. The PyScript framework provides users at every experience level with access to an expressive, easy-to-learn programming language with countless applications.

Let’s see how you can use it in its simplest format coming directly from the


    <py-script> print('Now you can!') </py-script>|


The first thing catching the eye is the <py-script> tag replacing the famous <script> as JavaScript enabler. You need to put your Python code between <py-script> and </py-script>. Following is a simple converter for grades based on a certain range. This is similar to what you see in learning systems such as CANVAS or Moodle, where the instructor sets rules to convert percentages or grades to letters that will appear on your transcript.

    <meta charset="UTF-8">
    <meta name="viewport"
          content="width=device-width, user-scalable=no, initial-scale=1.0, maximum-scale=1.0, minimum-scale=1.0">
    <meta http-equiv="X-UA-Compatible" content="ie=edge">
    <link rel="stylesheet" href="" />

   <center> <h1>Simple usage of PyScript</h1> </center>
    <div id="convert" align="text-center"></div>
    <py-script output="convert"> 

lst = [65,78,89,73,67,90,96,67,87,78]

for i in lst:
    if i >=50 and i <= 60:
        print ("F")
    elif i >=61 and i <= 70:
        print ("D")
    elif i >=71 and i  <=80:
        print ("C")
    elif i >=81 and i <=90:
        print ("A")


We will take this to another level. A GIS level. We want to improve the code and HTML at the same time. Our intention is to write an application that can take care of distance calculation. Our distance calculation is based on a simple cartesian coordinate system. So, for now, no curvature of the earth is included. We start with a semi-hard coded version. It means you still ask for a user input but in a very ugly way. Upon running the page, you will be asked to insert a value. No help! No suggestion. No message that can help you understand what to do! You have to insert the x1,y1,x2, and y2 in an orderly fashion to make the code calculate the distance between those two points:

File’s name on GitHub: 02_Distance.html

    <meta charset="UTF-8">
    <meta name="viewport"
          content="width=device-width, user-scalable=no, initial-scale=1.0, maximum-scale=1.0, minimum-scale=1.0">
    <meta http-equiv="X-UA-Compatible" content="ie=edge">
    <link rel="stylesheet" href="" />

    <h1>Distance Calculator</h1>

    <div id="Distance"></div>


    <py-script output="Distance"> 
x1= int(input("What is the longitude of the first point"))
y1= int(input("What is the latitude of the first point"))
x2= int(input("What is the longitude of the second point"))
y2= int(input("What is the latitude of the first point"))

def distance(x1, y1, x2, y2):
    dx = x2 - x1
    dy = y2 - y1
    dsquared = dx*dx + dy*dy
    result = dsquared**0.5
    return result

r = distance(x1,y1, x2, y2)

print (r)


We are trying to improve your python writing and HTML management simultaneously. Getting 4 values separately and an HTML pop-up message is painful and brutal. So let’s take care of that:

File’s name on GitHub: 03_Distance with list input.html

<!DOCTYPE html>
<html lang="en">
      <title> Distance Calculator </title>
    <!-- Pyscript -->
    <!--<link rel="stylesheet" href="" />-->
    <!-- bootstrap for styling , Pyscript can also be used for styling instead of bootstrap --> 
    <link rel="stylesheet" href="" integrity="sha384-9aIt2nRpC12Uk9gS9baDl411NQApFmC26EwAOH8WgZl5MYYxFfc+NcPb1dKGj7Sk" crossorigin="anonymous" />
    body {
        background-color: #fcfcfc;
        padding-top: 55px;
        min-height: 100vh;


 <div id="distance"></div>



from js import document,alert

x1,y1, x2,y2 = [float(x) for x in input("Enter 4 values with a space between them: ").split()]

def distance(x1,y1,x2,y2):
    dx = x2 - x1
    dy = y2 - y1
    dsquared = dx*dx + dy*dy
    calculated_distance = dsquared**0.5
    return calculated_distance

result = distance(x1,y1,x2,y2)

#print ("The distance between two points is", result)

pyscript.write("The Distnce is",result)


    <div class="container">
        <h1 class="text-center">
            Insert coordinates of your chosen two points with a space between every value to calculate the distance. </h1>
           <p align="center"> Example: 23.52 45.2 24 67.08 </p>
        <br />

       <!--  <p><h3>Enter the coordinates</h3>
        <input id="x1,y1, x2,y2" type="text" class="form-control" aria-label="Large" aria-describedby="inputGroup-sizing-sm" placeholder="Enter the Latitude of the first point" /></p> -->

       <!--  <center>
            <button id="distance" type="submit"  class="btn btn-info" pys-onClick="distance">Calculate the Distance Between Two Points</button>

        </center> -->
        <textarea id="The Distnce is" class="form-control" rows="2" aria-label="Large" aria-describedby="inputGroup-sizing-sm" placeholder="Result"></textarea></p>


Is it possible to improve it? The answer is yes. So what we imagine is a web-based app that can ask for inputs (coordinates) and returns the result of the calculation:

To achieve this, we need to take care of a few things:

  1. The proxy that is going to take care of the event handler. In this case, I have a button (Calculate the Distance Between two Points) that will trigger the function on click and show the result in the Result box. To achieve that, I will use create_proxy from the PyoDide library.
  2. I also have a result box that should receive the result upon the click on the button:

document.getElementById("Distance").innerHTML = 'Calculated Distance: ' + str(calculated_distance)

3. The second function is set to create the proxy itself and create a JavaScript Proxy.

File’s name on Github: 04_Distance Calculator.html

from js import document, alert
#imported this to create a proxy function that will not execute when the page is loaded
from pyodide import create_proxy
def button_click(event):
	FirstLat = document.getElementById("FirstLat").value
	FirstLong= document.getElementById('FirstLong').value
	SecLat= document.getElementById('SecLat').value
	SecLong= document.getElementById('SecLong').value
	<!-- #validate input
	if(not(FirstLat.isnumeric()) and not(FirstLong.isnumeric()) and not(SecLat.isnumeric()) and not(SecLong.isnumeric())):
		alert("Invalid Input(s), please make sure you have valid cordinates input")
		return False -->
	#since the input values will be string, convert to float
	dx = float(SecLat) - float(FirstLat)
	dy = float(SecLong) - float(FirstLong)
	dsquared = dx*dx + dy*dy
	calculated_distance = dsquared**0.5
	document.getElementById("Distance").innerHTML = 'Calculated Distance: ' + str(calculated_distance)

def setup():
	# The page is ready, clear the "page loading"
	document.getElementById("msg").innerHTML = ''
	# Create a JsProxy for the callback function
	click_proxy = create_proxy(button_click)
	# Set the listener to the callback
	e = document.getElementById("button")
	e.addEventListener("click", click_proxy)

As you may notice, I have commented out the validation part with an HTML toggle comment because I know I will use the code in an HTML file. Also, you may ask why we are validating if we do not include it. Some of you may want to use it for other kinds of use. Then you can use the validation strategy. In our case, and specifically in the case of distance calculation based on a Cartesian coordinate system, you will have negative numbers: Western hemisphere, for example! If you put the verification in the act, you will have problems. Please try it yourself!

Comparing Multiple Fields From a Table to One Single Field in the Same Table – In ArcPro and in Python

Prerequisites: Intermediate Python, Intermediate ArcGIS Pro

Previously I had written a post regarding the calculation of a field based on another field here. However, the old post was based on comparing only two fields. In simple words, what I was trying to find consisted of populating a newly created field just by looking at the content of another field in the same table.

In this post, I want to take a step further and compare more than 1 field. The idea comes from a question that one of my friends asked, which tries to determine the contamination in samples based on a reference value:

“How can I determine if 50% of all the variables (field in GIS) that I have in a table are bigger than the reference variable?”

The same question can be formulated in another way:

“How can I determine if the reference variable is smaller than 50% of all the variables ?”

Not Clear? I feel you! I felt the same the very first time I was asked the question. So let’s take a look at the table and make it easier (hopefully):

So what I want to do is to determine whether the majority of values stored in RD1,RD2,RD3,RD4,RD5,RD6,RD7,SH1, and Site 11 are bigger than the result stored in the field called Blank for each row. Notice that I am not showing all the fields because I am interested in 50% of them, and it does not matter which one should be included. Therefore I am choosing the first 9 appearing in the above picture. This brings another point: we have 18 variables, excluding “Blank.” So if I find 9 values in a row where the number is bigger than the value stored in Blank, then I want that row to be kept in the table, and if the opposite happens, I want that record (row) to be removed from the table!

If it was a matter of comparison between two variables, it would have been possible to solve by introducing two parameters (arguments) in a def like I did in this post! But here, we have more variables, and to compare them to each other, I need to get some help from Python. That help comes from *argument.

I name my function as defcom and introduce variables into it:

defcom (!Blank!,!RD1!,!RD2!, !RD3!,!RD4!, !RD5!, !RD6!,!RD7!, !SH1!,!Site11!)

Compare Fields Example_Seif

Next, I use the Code Block in calculate field to call the function:

def defcom (Blank,RD1,RD2,*argument):
    if Blank >= RD1 and Blank >= RD2 and Blank >= RD3 and Blank >= RD4 and Blank >= RD5 and Blank >= RD5 and Blank >= RD6 and Blank >= RD7 and Blank >= SH1 and Blank >= Site11 :
        return "delete"
        return "keep"

In this way, I can avoid repeating all the variables as arguments when I define my function and just use them to do the if statement and/or call them whenever I want to call them.

The above code written in the Calculate Field will produce a new field (column) and will assign “delete” or “keep” based on the outcome of the if statement. In this way, I know which “records” (rows) can be kept or deleted and eventually answer the question we asked at the beginning of the post.

In Python

But what if you want to try this in Python? I will make this a little bit easier by introducing a variable for every field that we might have in our imaginary table. So that our (imaginary) table is composed of a header and one row:

Blank= 0.5 RD1= 0.3 RD2= 0.2 RD3= 0.2 RD4= 0.6 RD5= 0.7 RD6= 0.6 RD7= 0.9 SH1= 0.2 Site11= 0.6

This might not be efficient, but I think it might be better for educational purposes. In addition, you can copy and paste it into your text editor to see the result, play with it, change it, etc. The same variables can be introduced in a list or tuple or simply called from a data frame, but we now stick to each field’s single variable and value.

After setting variables, we replicate the same concept. The only things that have changed are that I am using *args instead of *arguments and putting variables inside parenthesis for if statement:

Blank= 0.5
RD1= 0.3
RD2= 0.2
RD3= 0.2
RD4= 0.6
RD5= 0.7
RD6= 0.6
RD7= 0.9
SH1= 0.2
Site11= 0.6

def defcom (Blank,RD1,RD2,*args):
    if (Blank >= RD1 and Blank >= RD2 and Blank >= RD3 and Blank >= RD4 and
    Blank >= RD5 and Blank >= RD6 and Blank >= RD6 and Blank >= RD7 and
    Blank >= SH1 and Blank >= Site11) :
        return ("delete")
        return ("keep")

e= defcom(Blank,RD1,RD2)

print (e)

A good exercise would be to run through a for loop for a real table so the code can decide whether to keep the row or not! Next time!

Converting Geographic Coordinate System for Vector Data Models in PostgreSQL While Using Leaflet for Web Mapping!


  1. Introductory knowledge of JavaScript,
  2. Advanced knowledge of HTML
  3. Intermediate knowledge of SQL
  4. Advanced knowledge of PostgreSQL
  5. Advanced Knowledge of PostGIS
  6. Advanced knowledge of Leaflet

The assumption of this post is that you know how to connect a database to the Leaflet project.

If you have used Leaflet for web mapping development, you probably have seen how easy it is to manage, visualize, and change geospatial data by converting them to GeoJson files. On the other hand, if you are not in academia in any quality you will probably want to opt for open source geospatial tools that make you save a lot of money. In that case, Leaflet looks like a good choice.

One way to manipulate geospatial data is to set a local server on your computer using Leafletjs. What I like about this approach is the possibility of connecting a database of geospatial layers (PostgreSQL in my case) to our HTML file.

The first concern consists of taking care of coordinate systems and projection. This issue especially comes up when visualizing your layer on the Leaflet base map. Since most web-based maps use World Geodetic System 1984 as a datum, they might not fit the original GCS of the layer you want to show.

It is known that in Leaflet we can change the geographic coordinate system (GCS) by using the Coordinate Reference System (CRS) method directly in your index.html:

var crs = new L.Proj.CRS('EPSG:3006',
  '+proj=utm +zone=33 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs',
    resolutions: [
      8192, 4096, 2048, 1024, 512, 256, 128
    origin: [0, 0]

An efficient way to control GCS in your shapefiles is to store and change them in a database (PostgreSQL) and call them through the Leaflet by converting them to GeoJson files. In this way, you can control the GCS and Projection by calling numbers from the catalog of spatial reference systems, also known as spatial_ref_sys, stored as a table in your database. (The spatial_ref_sys table is a PostGIS included and Open Geospatial Consortium (OGC) compliant database table lists over 3000 known spatial reference systems and details needed to transform/reproject between them).

Now let’s connect a shapefile from my database without checking whether it fits the Leaflet base map. The layer I chose comprises 15 countries formerly under Imperial Great Britain.

It is no surprise that you cannot see those 15 countries because they are showing up as a tiny aggregation of polygons in the middle of the Indian Ocean.

If I zoom to the area contained by the circle, then I can see those 15 countries.

So apparently, there is a problem with projection.

It is possible to check the GCS by using the following code, which will return the Spatial Reference ID (srid) of the layer as 3857:

SELECT ST_SRID(geom) FROM countries_formerly_under_gb LIMIT 1;

One way to solve the problem is to update the spatial reference (GCS) to WGS 84 even without checking the GCS of the source layer in my database by assuming that my base map in Leaflet has the same coordinate system:

select UpdateGeometrySRID( 'public', 'countries_formerly_under_gb', 'geom', 4326);

Does the above code solve the problem?

Let’s introduce another unknown variable. So far, I did not know the spatial reference of my base map coming from Leaflet, and I hoped for 4326 as the number of srid, which is one of the projected variants of WGS 84. However, after loading the data, I got the same result. So at this point, my second choice was srid= 3857 (from spatial_ref_sys), which is a geographic coordinate system based on WGS 84 used frequently in web mapping. But this time the difference is that I have already converted my geom to 4326 by using the previous line of code. So now I need to convert from 4326 to 3857:

ALTER TABLE countries_formerly_under_gb
 ALTER COLUMN geom TYPE geometry(MultiPolygon, 3857)
   USING ST_Transform(ST_SetSRID(geom, 4326), 3857);

And what I get on my localhost is the following:

The 15 Countries now are showing up correctly on my base map

What Tutorials Will Not Tell You on Geostatistical Predicted Surfaces! In R and ArcGIS Pro! Part 4- (Semi)Variogram, Error and Final Assessment!

6. Next, we want to fit a variogram model to the binned data and add it to our graph. For this, you’ll need to select the sill (“psill”), nugget, and range values appropriately or the curve may not appear on the graph. Let’s take a look at the components of variogram:

Typical semivariogram
Figure 1a. Components of a variogram

Figure 1b. Variogram

When you look at the model of a semivariogram, you’ll notice that at a certain distance, the model levels out. The distance where the model first flattens out is known as the range. The value that the semivariogram model attains at the range (the value on the y-axis) is called the sill. The partial sill is the sill minus the nugget. Theoretically, at zero separation distance (lag = 0), the semivariogram value is 0. However, at a small separation distance, the semivariogram often exhibits a nugget effect, which is some value greater than 0. For example, if the semivariogram model intercepts the y-axis at 2, then the nugget is 2.

The nugget effect can be attributed to measurement errors or spatial sources of variation at distances smaller than the sampling interval or both. Measurement error occurs because of the error inherent in measuring devices. Natural phenomena can vary spatially over a range of scales. Variation at microscales smaller than the sampling distances will appear as part of the nugget effect. Before collecting data, it is important to gain some understanding of the scales of spatial variation.

The general formula for creating the variogram is:

γ(si,sj) = ½ var(Z(si) - Z(sj))

If two locations, si and sj, are close to each other in terms of the distance measure of d(si, sj), you expect them to be similar, so the difference in their values, Z(si) – Z(sj), will be small. As si and sj get farther apart, they become less similar, so the difference in their values, Z(si) – Z(sj), will become larger. This can be seen in the above figure, which shows the anatomy of a typical semivariogram.

Covariance function

The covariance function is defined to be

C(si, sj) = cov(Z(si), Z(sj)),

Covariance is a scaled version of correlation. So, when two locations, si and sj, are close to each other, you expect them to be similar, and their covariance (a correlation) will be large. As si and sj get farther apart, they become less similar, and their covariance becomes zero. This can be seen in the following figure, which shows the anatomy of a typical covariance function.

Typical covariance function
Figure 2. Covariance

Notice that the covariance function decreases with distance, so it can be thought of as a similarity function.

Relationship between semivariogram and covariance function

There is a relationship between the semivariogram and the covariance function:

 γ(si, sj) = sill - C(si, sj),

This relationship can be seen in the figures. Because of this equivalence, you can perform prediction in Geostatistical Analyst using either function. (All semivariograms in Geostatistical Analyst have sills.)

Semivariograms and covariances cannot be just any function. For the predictions to have nonnegative kriging standard errors, only some functions may be used as semivariograms and covariances. Geostatistical Analyst offers several choices that are acceptable, and you can try different ones for your data. You can also have models that are made by adding several models together—this construction provides valid models, and you can add up to four of them in Geostatistical Analyst. There are some instances when semivariograms exist, but covariance functions do not. For example, there is a linear semivariogram, but it does not have a sill, and there is no corresponding covariance function. Only models with sills are used in Geostatistical Analyst. There are no hard-and-fast rules on choosing the “best” semivariogram model. You can look at your empirical semivariogram or covariance function and choose a model that looks appropriate. You can also use validation and cross-validation as a guide.

Once you’ve created the empirical semivariogram, you can fit a model to the points forming the empirical semivariogram. The modeling of a semivariogram is similar to fitting a least-squares line in regression analysis. Select a function to serve as your model, for example, a spherical type that rises at first and then levels off for larger distances beyond a certain range.

Let’s get back to the coding: The “model” can be one of the following:

  • “Exp” (for exponential)
  • “Sph” (for spherical)
  • “Gau” (for Gaussian)
  • “Mat” (for Matern)
  • Linear

kriging models
Figure 3.Different types of variogram models


After a lot of attempts I was able to get an acceptable model with the following parameters:

TheVariogramModel <- vgm(psill=45, ,model="EXP", nugget=0.01, range=0.5)
plot(TheVariogram, model=TheVariogramModel)

Figure 4. The fitted (almost) line to the variogram

Note that for random data there is very little autocorrelation so the range is very small and we reach the sill quickly. In Arcpro instead, the modeling process will take care of setting the parameters which makes it easier to run but less appealing to understand what is happening.

would you like to know how a spatially correlated variable looks like? Take a look at the following figure:

Figure 5. A high spatially correlated variogram

Semivariograms are estimated for subsets of input points independently (the input data is first divided into overlapping subsets of a specified size (defaulted to 100 points per subset) and a spectrum of semivariograms is estimated for each subset. When spectrums are very different for different subsets, the process is not globally stationary, and the predictions made by EBK may be more accurate than the predictions from methods that require global stationery.

Figure 6. Subsets, Source: ESRI

The above graphic shows many pairs of locations, linked by blue lines that are approximately the same length and orientation. They have approximately the same spatial similarity (dependence), which provides statistical replication in a spatial setting so that prediction becomes possible.

Checking for nonstationarity using semivariograms in ArcPro

In spatial modeling of the semivariogram, you begin with a graph of the empirical semivariogram, computed as,

Semivariogram(distance h) = 0.5 * average [ (value at location ivalue at location j)2]

for all pairs of locations separated by distance h. The formula involves calculating half the difference squared between the values of the paired locations. To plot all pairs quickly becomes unmanageable. Instead of plotting each pair, the pairs are grouped into lag bins. For example, compute the average semivariance for all pairs of points that are greater than 40 meters but less than 50 meters apart. The empirical semivariogram is a graph of the averaged semivariogram values on the y-axis and distance (or lag) on the x-axis (see diagram below) accessible in ArcPro.

Figure 7. The Empirical Semivariogram

The presence of large differences in the spectrums in different locations indicates global nonstationarity. Also, When empirical Bayesian kriging is performed in the Geostatistical Wizard, you are able to see the subsets that were used to calculate the predicted value.

Figure 8. Subsets of variograms and empirical variogram

After you have created the empirical semivariogram, you can fit a line to the points to form the empirical semivariogram model. A number of semivariogram models can be employed, and different models may lead to different interpolations. Therefore, selecting an appropriate model to capture the features of the data is critical. The selected model influences the prediction of the unknown values, particularly when the shape of the curve near the origin differs significantly. The steeper the curve near the origin, the more influence the closest neighbors will have on the prediction.

The following graphic shows the Exponential model applied to sample data. This model is typically applied when spatial autocorrelation decreases exponentially with increasing distance, disappearing completely only at an infinite distance.


The following graphic shows the K-Bessel model (available with ArcPro) applied to sample data. This model is typically applied when most of the empirical covariances fall inside the confidence intervals and a few fall just outside the intervals.

But still, this model is not a good one:

Transformations are used to bring your data closer to a normal distribution and to satisfy stationarity assumptions (Refer to Part 1 where we discussed transformation and stationary).

When you apply K-Bessel in the General Properties, an additional graph for the Shape parameter appears. In addition, a new Transformation tab appears, which displays the distribution of the fitted transformations (one for each simulation). The transformation distribution is colored by density, and quantile lines are provided. The K-Bessel semivariogram is the most flexible and accurate, though it takes the longest to calculate. If you are willing to wait to get the most accurate results, K-Bessel is the best choice.

As you change parameters within the wizard, the preview map updates to reflect the new model parameters.

The Error?

The error graph displays the difference between the known values and their predictions. The ArcPro will calculate the error automatically in addition to the QQplot and the prediction.

The standardized error graph displays the error divided by the estimated errors. The standardized error equation is represented by the blue line. To assess the extent of error, each item is squared, and then the average of these squared errors is taken. This average is called the root mean square (or RMS).

Over the course of your geostatistical interpolation process, you will have created several prediction surfaces. Comparison helps you determine how good one model is relative to another. You can evaluate each model by determining which best meets the ideal statistical values based on the cross-validation statistics. With the Geostatistical Wizard, you can assess how parameters affect your data samples. Consider the following columns: measured (the measured value at a point ); predicted (the prediction at that point when it is removed from the input ); and error (the difference between the measured and predicted values). Because an over- or under-prediction can occur at any point, errors can be positive or negative.

You must consider two things when comparing the results from different methods: optimality and validity. For example, if a particular model has a small RMS prediction error, you might conclude that it is the optimal model. However, another model’s RMS prediction error may be closer to the average estimated prediction standard error. Therefore, the second model might be more valid because, when you predict at a point without data, you have only the estimated standard errors to assess your uncertainty of that prediction.

You must also check that the RMS standardized value is close to 1. When the RMS standardized value is close to 1 and the average estimated prediction standard errors are close to the RMS prediction errors from cross-validation, you can be confident that the model is appropriate.

The goal of comparison is to end up with a prediction surface with the following qualities:

  • Unbiased (centered on the true values)
  • Standardized mean prediction errors near 0
  • Valid prediction standard errors
  • Small RMS prediction errors
  • Average standard error near RMS prediction errors
  • Standardized RMS prediction errors near 1

A good-quality model has standardized mean prediction errors near 0, as well as small RMS prediction errors, average standard errors near RMS prediction errors, and standardized RMS prediction errors near 1. The points should be as close as possible to the dashed gray line. Look for points that deviate greatly from the line.

Finally what makes the ArcPro a versatile tool to apply geostatistical prediction is the Cross-Validation. Cross-validation is a technique that allows you to find how well your interpolation model fits your data. “Cross-validation works by removing a single point from the dataset and using all remaining points to predict the location of the point that was removed“. The predicted value is then compared to the real value, and many statistics are generated to determine the accuracy of the prediction.

What Tutorials Will Not Tell You on Geostatistical Predicted Surfaces! In R and ArcGIS Pro! Part 3- (Semi)Variogram, Covariance and Autocorrelation?

If you have had enough introduction to Kriging you can skip to “Modeling a Semivariogram” part and also you can download the data I used here to replicate the code in R.

There are multiple ways of predicting values based on geostatistical approaches. This process known also as Spatial interpolation “is the prediction of exact values of attributes at unsampled locations from measurements made at control points in the same area (O’Sullivan and Unwin, 2010)”. Inverse Distance Weighting (IDW) is a famous deterministic interpolation method and maybe the easiest one to apply but arbitrary and limiting in some cases. Other approaches and especially Kriging provide more insight and they are based on error calculation.

A common way of visualizing the spatial autocorrelation of a variable is a variogram plot. Methods such as Empirical Bayesian Kriging (EBK) accounts for uncertainty in semivariogram estimation by simulating many semivariograms from the input data. When modeling the semivariogram, the autocorrelation can be examined and quantified.

Variogram or semivariogram? Which one is the correct one?

Some authors call it a semivariogram stating that a semivariogram is half of a variogram while others use these two words synonymously (Bachmaer and Backes, 2008). I will be using the terms interchangeably since the discussion about which one is the correct one is not what I am interested in at this point.

Some characteristics of Kriging:

  • Kriging is a statistical interpolation method that is optimal in the sense that it makes the best use of what can be inferred about the spatial structure ( in the surface to be interpolated from an analysis of the control point data.
  • Kriging relies on the semi-variogram. In simple terms, semivariograms quantify autocorrelation because it graphs out the variance of all pairs of data according to distance.
  • Kriging is a method developed in France by Georges Matheron based on the work of Dani Krige. The method was used in the South Africa mining industry. There are different types of Kriging. Three steps are involved in Ordinary Kriging (O’Sullivan and Unwin (2010):

1. Producing a description of the spatial variation in the sample control point data
2. Summarizing this spatial variation by a regular mathematical
3. Using this model to determine interpolation weights

EBK can produce 4 output surfaces: prediction surfaces, prediction standard error surfaces, probability surfaces indicating whether a critical value is exceeded, and quantile surfaces for a predetermined probability level.

Figure 1a. Different types of geostatistical layer produced by EBK. Source: ESRI

Figure 1b. EBK procedure. Source: ESRI

EBK works by first dividing the data into subsets of a given size, estimating a semivariogram model from the data, using the semivariogram to simulate a new value at each of the input data locations, and generating a new semivariogram model from the simulated data. It then repeats the steps, based on the number of simulations, to create a spectrum of semivariograms and, finally, mixes the local surfaces together into the final surface. By generating a variogram, we will be able to look at the variance of the differences of any measurement among pairs of stations at different distances.

Part of the ArcGIS Geostatistical Analyst extension is a geostatistical interpolation method that uses probabilistic techniques to quantify the uncertainty associated with interpolated values. Empirical Bayesian Kriging (EBK) works by building local models on subsets of the data and then combining them to create the final surface. Using EBK, you can model spatial relationships automatically, and the results are often better than the results of other methods. Empirical Bayesian kriging is an interpolation method that accounts for the error in estimating the underlying semivariogram through repeated simulations (see Figure 11. Hence why all those lines in figure 12 coming from ArcPro).

  • Modeling a Semivariogram

The semivariogram (Figure 2) and covariance (Figure 3) functions quantify the assumption that things nearby tend to be more similar than things that are farther apart (first law of geography). Semivariogram and covariance both measure the strength of statistical correlation as a function of distance.

Semivariance is a measure of the degree of spatial dependence between samples. The magnitude of the semivariance between points depends on the distance between them: a smaller distance yields a smaller semivariance, and a larger distance results in a larger one.

The plot of the semivariances, as a function of distance from a point, is referred to as a semivariogram. A semivariogram shows the spatial autocorrelation of the measured sample points and displays the statistical correlation of nearby data points. As the distance increases, the likelihood of these data points being related becomes smaller. In other words: Semi-variograms take 2 sample locations and calls the distance between both points h.

When a strong spatial correlation exists, the semivariogram can help reconstruct missed values and interpolation makes reliable predictions at the unsampled locations. If data are not spatially correlated, there is no way to predict the value between measurement locations other than assigning a statistical measure (for example, arithmetic mean) to all prediction locations. As the data correlation gets stronger, fewer samples are needed for reliable data prediction and interpolation.

Typical semivariogram
Figure 2. Components of a (semi)variogram

Typical covariance function
Figure 3. Covariance

How to create a semivariogram in R:

1. Call the needed packages (We may not use all of them):

library (geoR)

2. Call your table. Here I have converted a shapefile of Ozone measurements to a CSV. I have multiple variables in the table. Make sure that you have lat and long because we are going to base the variogram on them. (If your table does not have coordinates you can assign them randomly in Excel (or any other program) but you also need to know the rules of the coordinate system to avoid wrong coordinates)

ozone <- read.csv("E:/Ozone.csv")

3. Get the summary of distance by feeding the lat and long to the summary function. I have my lat and long on columns 20 and 21. Using the Euclidean formula manually may be practical for 2 observations but can get more complicated when measuring the distance between many observations. The dist() function simplifies this process by calculating distances between our observations (rows) using their features (columns). Remember that you can apply different types of distance calculations with dist function but here we leave it as default which is Euclidean.

dists <- dist(ozone [,20:21])


Min.          1st Qu.     Median      Mean          3rd Qu.         Max.
0.006507   1.546469   2.988936    3.626425     4.146660   12.850608

Remember that we are again, making a matrix (See Part 1) of distances. you can also recall dists seeing the pairwise calculation of points which is a long table.

4. Now let’s bring the geography in. We need to convert a simple data frame into a spatial data frame object (remember that we have a CSV which is considered non-spatial).

coordinates(ozone)= ~ Long+Lat

And plot the x and y:

plot (ozone$Long, ozone$Lat)

Figure 4. Coordinates plotted based on lat and long

5. Next, we create a simple variogram based on the variable storing the emission. This “bins” the data together by breaking up the distances between each of the points based on a “lag” size between the distances. The distances between pairs at which the variogram is calculated are called lags. For instance, lags may be calculated for samples that are 10 feet apart, then samples that are 20 feet apart, then 30 feet, etc. In this case, the distance between lags is 10 feet.

TheVariogram = variogram(AGE_YR_NUM~1, data=ozone)
plot (TheVariogram)

Figure 5. Semivariance without the fitting line

So what is the meaning of the semivariogram that we produced? What you need to know is that every point is given by the difference (variance) of a pair of samples (points) in our ozone table and the value of the emission stored in AGE_YR_NUM variable (called field in GIS). But one thing that might flee from your attention is that this variance is based on the distance. Indeed, in ArcPro if you click one of the points in your variogram you will see that a line will connect two samples as follows:

study points specific
Figure 6. Samples implementation in the variogram

Every point on the variogram will belong to a bin (category of distance). So now I may ask a question: By looking at Figure 6 and considering the sampled two points, where the variance point will fall on the variogram (Figure 5)?

  • Enter the name of the variogram in R and you’ll see a table with the following values:
  • np – number of points in the lag (bin)
  • dist – the average distance between points in the lag
  • gamma – mean for the lag

Figure 4. The variogram table in R

Note that for random data (such as the data I used) there is very little autocorrelation so the range is very small and we reach the sill quickly (we will see it for our data in the following part).

In the next part, we will finish the discussion by modeling the variogram.

What Tutorials Will Not Tell You on Geostatistical Predicted Surfaces! In R and ArcGIS Pro! Part 2 -Normal Distribution

Distribution and Normal Distribution:

One of the other parameters to be taken into account while assessing the spatial autocorrelation is the normal distribution. The normal distribution is one of those concepts that can be explained by complex definitions and even graphs such as the famous bell shape curve. I will use the one definition that I like and I was able to memorize: Normal distribution means that the mean and median are similar. If you would like to know more go ahead and read about the Gaussian curve to get more details about this concept. But why normal distribution? As I explained in part 1 geostatistical interpolation requires your data to be normally distributed.

Let’s say you do not have access to a database with some variables distributed normally. We still can get an idea about ND by using one of the functions incorporated in R which is called rnorm. If you have your own table and it is in CSV format with some numerical variable you can use the following code otherwise skip to the next paragraph.

df <- read.csv('C:\\Path\\To\\DataFile.csv')
v1 <- df[[1]] # by column number
v2 <- df[["col1"]] # by column name

We can assign any number to the function rnorm and the function will take care of creating a normal distribution for us:

r <- rnorm(100)

But how would you know that this is a normal distribution? Let’s try another approach this time. The function is called hist. The main parameter is used to assign a title:

histplot <- hist(r, main ="Normal Distribution")

To do this in ArcPro you can go to Geostatistical Analysis > Explore Data > Histogram.

At this point, we can go back to the bell shape curve definition of normal distribution. Therefore, any variable you plot from any database in order to be considered normally distributed should have a similar shape to the above picture. Remember that you may not get a perfect normal distribution which will happen frequently!

What are the characteristics of normally distributed data (again)?

To answer that question we can use skewness function:


Which gives us: -0.2627771. Closer is the skewness to 0 more the data is normally distributed. In some sources, the normal distribution is assessed with the mean and the median being close together (as I mentioned previously), and the kurtosis near 3. However, I am not introducing the Kurtosis in this post.

How does this look like in ArcPro?

Considering that you are running the Geostatistical Wizard you would check the histogram as follows. Remember that the histogram is moved to Symbology unless you use a Kriging:

Note: The following graph is not based on the data we created randomly in the previous paragraph.

After you map the data, use the Exploratory Spatial Data Analysis (ESDA) tools to examine the data in more quantitative ways (like the QQ plot). These tools give you a deeper understanding of the phenomenon you are investigating and help you make more informed decisions about how to construct the interpolation model.

The Normal QQPlot tool compares your dataset to a standard normal distribution. The x-axis plots the quantile values of the standard normal distribution, and the y-axis plots the corresponding quantile values of the dataset. You can investigate points that depart from a normal distribution by selecting them and examining their locations on a map. You can also apply Log transformation. The QQ plot assesses how evenly distributed the values are relative to the mean value. If the data is normally distributed the points fall on the 45-degree reference line. If the data is not normally distributed, the points deviate from the reference line. The normal QQ plot displays how closely the error aligns with the standard normal distribution. The center of the reference line is the most meaningful; deviations at the ends are very common.

QQ Plot in R:
qqnorm(r, pch = 1, frame = FALSE)


Now let’s add the reference line:

qqline(r, col="green", lwd =2)

So what is your assessment? Is our data normally distributed according to the last paragraph?


What Tutorials Will Not Tell You on Geostatistical Predicted Surfaces! In R and ArcGIS Pro! Part 1- Spatial (Auto)Correlation

One of the critiques I always have against tutorials is their straightforwardness in teaching “how”s rather than “why”s! In a more simplistic way, it is always easier to know “how” you can, for example, run a Kriging Interpolation rather than “why” running a Kriging interpolation. As a Ph.D. student, I learned that for me it is going to be a matter of why and I will have to do less with hows. Hence, this post attempts to review my knowledge on Geostatistical Predicted Surface from a slightly different and mixed perspective. I will use academic sources, online tutorials, and expert knowledge to combine everything possible and create a useful point of reference for myself and whoever wants to learn the concept of surface prediction with more details.

In this post and its following parts 2, 3 and 4, we will be focused on geostatistics, a discipline working on distance weighting of observations based on the control point (stations with exact values) to determine the value of other points. The keyword to remember this long definition is interpolation! To put it in simple words we can use an example: What would be the value of precipitation on the points not covered by precipitation station? Obviously, we cannot cover our cities with precipitation stations or any other type of station so that is why we need to use interpolation techniques to guestimate the value of non-surveyed points.


    1. Basic knowledge of Geostatistical Analysis Tools in ArcPro
    2. Intermediate knowledge of R and packages (libraries) for visualizing geographic data
    3. Intermediate knowledge of statistics

Spatial Autocorrelation:

Why starting with Spatial (Auto)correlation? O’sullivan & Unwin (2014) believe that before moving forward with any type of statistical analysis (like geostatistically predicted surface) we need to make sure of the existence of spatial autocorrelation. Any spatial data set is likely to have self-correlation or autocorrelation (O’sullivan & Unwin, 2014). Just think about the precipitation! it is very unlikely to have different amounts of precipitation between to points close to each other! Therefore, those two points are somehow correlated. This known correlation is often subject to the distance hence why a lot of statistical approaches use distance calculations among points or observations. Another “reason for developing analytical approaches to spatial autocorrelation is to provide a more objective basis for deciding whether or not there really is a spatial pattern, and if so, how unusual that pattern is”  (O’sullivan & Unwin, 2014, p.200).

Spatial Structure:

Any autocorrelation measure must be based on both the spatial
structure of the geographic objects in the study and the similarity
or difference of attribute values at locations near one another. Hence why researchers took care of creating a spatial weight matrix. The spatial relationship between all pairs of locations is measured using the spatial weights matrix generally denoted W. Let’s take a look at this matrix:

Figure 1. A typical weight matrix used in spatial autocorrelation studies (source: O’sullivan & Unwin, 2014, Chapter 7)

But what does this matrix mean? In the first row, we record the spatial relationship between the first value (or observation, point) and all other values. So W11 is the amount of relationship between the first point and itself, and W12 is the amount of relationship between the first point and second point (hence why the number 1,2) and so on. Here we need to remember that the relationship between every point or value with itself should be set to zero. Therefore, W11  W22 and all the other values in the matrix referencing the relationship between a point and itself are set to zero. All the components of the matrix can be denoted as Wij if we want to talk generally about components of the matrix without specifying the row and column.

Now the question is how the amount (it could be binary) of relationship is calculated? There are several methods to do this including adjacency rules such as Rook’s (sometimes called contiguity edges only) and Queen’s (sometimes called contiguity edges corner) adjacency, distance threshold, nearest neighbor, and triangulations. I will be focused on Rook’s and Queen’s adjacency in this post.

Some methods use binary classification assigning 0 and 1 to the correlation where 1 means correlated and 0 means not correlated. Other approaches use the range from 0 to 1 where a number closer to 1 means a highly correlated relationship and a number closer to 0 means less correlated. The process of weighing the strength of the relationship between two points (or values) is often subject to the distance which is denoted as stationarity in some sources such as ESRI guidelines and tutorials for geostatistical predictions. The concept is applied to the distance between two geographic points or values and that is what ESRI means when it says: “stationarity means that the relationship between two points and their values depends on the distance between them and not their exact locations”. So if you choose two pairs of points in four different locations the variation between their values must not be huge otherwise a method such as Kriging would not work optimally. Geostatistical interpolations assume that the data is stationary. If your data is not stationary, you need to use transformations to stabilize variances and then use the Empirical Bayesian Kriging method. This process is often included in multiple software.

Once the spatial structure is defined we can proceed to link the centroids of polygons as follows:

Two examples of spatial structures (i) Rook’s adjacency and (ii) Queen’s adjacency

A less confusing representation of the same concept is as follows:

(source: O’sullivan & Unwin, 2014, Chapter 7)

Let’s take a look at it in a more elaborated way (or you can skip to the next part: Code in R):

Consider that you are looking at the counties in any geographic extent of the U.S. so that we have:

Equation 1. Weighting correlation based on shared Boundaries between adjacent locations (source: O’sullivan & Unwin, 2014, Chapter 7)

where z is a power factor, lij is the length of the shared boundary between
counties, I, and j, and li is the length of the perimeter of county i.

To do all the above steps in R:

1. Call the needed packages:


2. Download the geographic data using the getData method from the raster library. GADM is a web-based data center where you can download geographic boundaries. I do not specify where I want my data to be downloaded in my code hence it is going to be stored in the current directory (where I have my R language installed):

usa <- raster::getData('GADM', country='USA', level=2)

Remember that you can also use 'alt' instead of 'GADM' which stands for altitude (elevation) hence useful in elevation studies; the alt data were aggregated from SRTM 90m resolution data between -60 and 60 latitudes. Also, level is the parameter for detecting different boundaries, and value 1 is for State boundary. I have used value 2 which will give me the counties. The name of the country should be set on 3 letters ISO standard, so for example, Italy would be ITA.

3. Now it is time to use poly2nb method to create the matrix and a neighboring list:

mw_matrix <- poly2nb(usa, row.names=usa$OBJECTID, queen=FALSE)

The most important parameter here is the queen=FALSE which reminds us of the rule of adjacency we explained in the above lines. if TRUE, a single shared boundary point meets the contiguity condition, if FALSE, more than one shared point is required; note that more than one shared boundary point does not necessarily mean a shared boundary line.

Recall (which means to type mw_matrix and press enter) the variable mw_matrix to see a report of relationships (which is called links):

Neighbour list object:
Number of regions: 3148
Number of nonzero links: 17622
Percentage nonzero weights: 0.1778221
Average number of links: 5.59784
9 regions with no links: 180612 223 4002 3757 3689 8690 8714 8849 2524

4. The function nb2mat generates a weights matrix for a neighbours list with spatial weights:

wmus <- nb2mat(mw_matrix, style='B', zero.policy = TRUE)

if you want to check the number of polygons:


and the number of neighbors for each polygon:

i <- rowSums(wmus)

And if we plot i:

Apparently, the average of neighbors for the USA counties is 5.59 based on our methodology.

If we want more details:

summary (mw_matrix, zero.policy = NULL, scale =1)

Link number distribution:

0     1     2     3      4       5       6        7      8     9    10   11   12   13  15    16   23
9    32   48  100  352  871  1030  502  148  41    7     2     1     2    1       1    1

32 least connected regions:
183 114 209 159 2997 3001 3790 3841 9246 9297 8589 8832 8807 8654 9445 9366 9310 9355 9358 9559 9160 9234 9362 9560 9417 9469 9528 9285 9305 9097 2547 2531 with 1 link
1 most connected region:
7137 with 23 links

5. Now let’s bring some visualization and GEOGRAPHY!

Load a shapefile of counties or whatever type of polygons you have access to:

shape <- readOGR("C:/Users/milad/Desktop/Counties/Counties.shp")

We can use poly2nb function to build a neighbors list based on regions with contiguous boundaries, that is sharing one or more boundary point. (see above for rook and queen adjacency). If you set the queen parameter on TRUE you will get a queen’s adjacency based linkage.

wr <- poly2nb(shape, row.names=pols$Id, queen=FALSE)

Plot the links:

plot(shape, col='gray', border='blue')
xy <- coordinates(shape)
plot (wr, xy, col='red' , lwd=1, add=TRUE)



Now we can create a weight matrix based on the links:

fr <- nb2mat (wr, style='B', zero.policy = TRUE)

In order to see a portion of the matrix we can feed the  rows and columns (I am interested in 10 rows and 10 columns):

fr[1:10, 1:10]

What is the meaning of the matrix? Let’s go back to paragraph 4: It must be easy to guess! For example, polygon 1 has a link with polygon 3 and 4. BUT, this might be confusing if you are comparing 1 to 1, 2 to 2, and so on. Do not do that! Here, because of the indexing system, we do not have the header section ([,1] [,2] and so on) starting with the number 0. So if you want to find the correlation between the first county and itself (which equals 0) you need to compare the first number of the first column (0) to the first number of the first row [,1].


Now you are ready to run the Spatial Moran’s I or other methods to assess the spatial autocorrelation which is an important phase in geostatistical surface prediction! Here I showed two of the most important adjacency rules. Other types of adjacency such as nearest neighbors, distance-based, and Delaunay have their own functions and libraries in R.

To do all the above steps in ArcGIS Pro (Python) you can easily use the Generate Spatial Weights Matrix tool and eventually Convert Spatial Weights Matrix to Table in order to visualize your matrix. Otherwise, you can directly use Moran’s I tool.

In the next part, we discuss another important parameter: Normal Distribution!

Populating Empty Cells Based on Another Field with Python in ArcPro and ArcMap!

Populating fields based on multiple values might be a challenging task when it is done manually. Especially when we have millions of records. This post will provide a Pythonic guide to fill the blank cells based on the value in another field in the ArcMap Field Calculator and Calculate Field in ArcPro.


  1. Basic knowledge of ArcPro or ArcMap Field Calculator (It is called Calculate Field in ArcPro)
  2. Basic knowledge of Python (version 2 and 3) “if statements.
  3. Basic knowledge of Database Management and Primary Key.


Let’s talk about the assumptions. We have a fictitious table with 1 million records, including all the cable companies in the U.S present in every community or neighborhood. Fortunately, the number of unique companies is less than 20, but they are repeated for each neighborhood, making the table a colossal record holder of data. The following picture is showing a few records at the beginning of the table.


This table is not a regular table since there are multiple records for the same company, and the Address is not populated for all the cells in the field of Address. The question is, how would you fill the rest of the addresses. A plausible way to do so is to look at the Legal Name and match the Address since we know that each company has only one legal name and one Address associated with that name. The same process could be done with the PSID since it could be considered a Primary Key in terms of database management, BUT as you may notice, the PSID could change! Look at the COX Communications case: the company has different PSID starting at row 6. Hence we go with the Legal Name.

We would like to populate the Address field based on the value of another field. If all the values in the source field were assigned a unique value (such as the same name for all the records), we would not have to use Python to solve the problem. Since multiple values are existing in the field of Legal Name, we need to:

1. look at them one by one and

2. then compare it to the first Address assigned to that company

3. finally copy that Address and paste it to the empty cells where the name of the company is the same.

So at this point, we right-click the Address field header and bring up the Calculate Field to do the job:


  1. Construct a definition (def) and name it “compare” or whatever name you want, loading the Legal Name and Address as two parameters of the definition.
  2. Set the rule by introducing the if and pass the name of the company as a string
  3. Use the return function to assign the Address
  4. Use else and return to take care of the False condition returning the same value in the field in case the name is not corresponding to the value you stated as Legal Name.
  5. Recall the function (definition) named compare in the Address.

Calculate Field in ArcPro
Enter a caption

Remember that you need to do the same process for all the unique company names, and since they are just a few, it is possible to tackle the problem with Python. If the number of companies was a significant number, like hundreds of them, the scenario would flip completely.

How does this look in ArcMap? (Notice the difference in naming between attributes of the def compare and the name of fields. Here I am changing them intentionally):

Field Calculator


I just tried to change the attribute names to LegName and Add because I was curious about how it would mess up with the recalling process. Remember that LegName and Add will refer to the fields you are calling in the Address Box, as shown in the following picture so nothing wrong will happen. You can do the same in ArcPro if you like to confuse yourself!


Final product with all the cells filled with the correct Address:


Land Cover Detection in Google Earth Engine Using Landsat8 Imagery

Geographic Information System(s) (GIS) ha(s)ve come a long way from its inception and now has got integrated with mainstream applications and plays a very crucial role in the business decision process for every industry. The future of GIS is in Cloud Computing (Medium). The licensing costs and implementation costs are alleviated by Open Source GIS. The Google Earth Engine is one of the pioneers in cloud-based geospatial computation and definitely one of the solutions which provides a software-free (on client side) way to solve problems or simply to educate younger generations.

One of the most used operations in GIS and remote sensing is the land cover and land use detection and change studies. The process is known to be computationally time-consuming especially for the vast study areas. GEE though, makes it a very smooth and easy process by providing the API and running JavaScript on the code editor. Here I will show the first step to get the data from the API, prepare it for your study area, and setting bands to detect a certain type of land cover (vegetation).

The very first thing we need is to upload a 6-month composite landsat8 imagery:


var l8 = ee.ImageCollection(‘LANDSAT/LC08/C01/T1’);


All the codes you see after the name LANDSAT, “/LC08/C01/T1″, have their own meaning which is related to the type of imagery. The extent is another concern. This composite image covers the whole world. We can eventually narrow down changing the center point of our layer:


Map.setCenter (51.388973, 35.689198, 9);


Remember that Map.setCenter method in GEE JavaScript puts the longitude first and then latitude. The level of zoom is the last attribute of the method. In my case the zoom is set to 9. I am also focusing on Tehran and I knew the center I was going to choose: Tehran’s relative center point is on long 51 and lat 35. If you are not sure about what is the center point of your study area you can go to and find the numbers by searching the name of the city you are interested in or by dropping a pin on the map (Clicking the map).

The next thing is to set the time span. You can easily change the date to obtain a different time span. In my case I am interested in the imagery between January 2020 and May 2020:


var composite = ee.Algorithms.Landsat.simpleComposite({
collection: l8.filterDate(‘2020-1-1’, ‘2020-5-1’),
asFloat: true


Now let’s display the result:


Map.addLayer(composite, {bands: [‘B6’, ‘B5’, ‘B4’], max: [0.3, 0.4, 0.3]}, “Test”, 1 , 0.8 );


So here let’s bring up the documentation:

Map.addLayer(eeObject, visParams, name, shown, opacity)

Our eeObject is the variable composite and our visualization parameter is given by the bands (visParams, lets you specify the minimum and maximum values to display). You can play around to change the color composite and false color. For example, if you set the B5,B4,B3 instead of B6,B5,B4 you will get a red color composite (False color) reflecting the vegetation:

Here you need to have a basic knowledge of bands in satellite imagery and specifically their order in Landsat 8 and the range of reflectance which is not discussed in this post. To discover what values to use, activate the Inspector tab and click around on the map to get an idea of the range of pixel values. You can also give a name to your visualization as a string and set the default shown attribute to 1 so every time you run the script the visualization will pop-up and finally the opacity will help you to see the reference layer in case you are interested in the name of the places.

The final result is shown in the following picture. Please click the image for a higher quality.


پول مجازی، بیت کوین، و معاملات تمرکز زدایی شده

اولین بار اسم پول مجازی رو در این لینک دیدم. قضیه باور نکردنی بود تا جایی که کنجکاوی من رو مجبور کرد تا یاد بگیرم اصولا کلمه “کریپتو کارنسی”،”بلاک چین” و”بیت کوین” یعنی چی. در سال نود و دو هنوز بخش سوال و جواب سایت بیت کوین دات اورگ راه اندازی نشده بود. اگر حوصله داشته باشید میتونید جواب تقریبا همه سوالات خودتون رو راجع به بیت کوین(که همون پول مجازی هست) در این سایت پیدا کنید. ولی یک مشکلی که این صفحه نگارش شده به زبان فارسی داره تکنیکی بودن آن است. مثلا در قسمت هایی از اولین پاراگراف این صفحه میخوانید: “بیت کوین یک شبکه توافقی است” و”اولین شبکه پرداخت نقطه به نقطه تمرکز زدایی شده” و”بدون هیچگونه اختیار مرکزی”. تمامی این توصیفات برای کسی که هیچ ایده ای از کریپتو کارنسی و یا عملکرد بانک ها نداره بسیار پیچیده به نظر خواهد آمد. البته شایدم نه. به هر حال تصمیم  داشتم که نوشته ای کامل راجع به مفاهیم پول مجازی بنویسم که برای همه قابل فهم باشه. به همین خاطر مقاله ها و مراجعی به زبان ایتالیایی و انگلیسی فراهم کردم تا بتونم به نحو بهتری معنی هر کلمه رو انتقال بدم. در ادامه گزیده ای از آنچه در چهار سال اخیر آموختم رو در اخیتار فارسی زبانان قرار میدم. باشد که قابل استفاده باشد


اولین مفهومی که باید باهاش آشنا بشیم بلاک چین هست. ولی میخوام اینجا قبل از توضیح دادن اینکه بلاک چین چی هست به مشکلی که ازطریق بلاک چین میشه حل کرد بپردازم: فکر کنید دوست شما بابک در یک مسافرت تفریحی به سر میبره و به پول احتیاج داره و حساب بانکیش هم خالی از پوله ( آخر با جیب خالی کی میره مسافرت؟) فعلا شما فرض کنید که پول نداره و به شما زنگ میزنه میگه فلانی یه مقدار پول به حساب من واریز کن. شما زنگ میزنید بانک و به بانکدار میگین که از حساب شما یک میلیون تومان بریزه به حساب دوستتون. بانکدار هم میره حساب شمارو چک میکنه و در صورت کافی بودن موجودی یک میلیون برداشت میکنه و  به حساب دوستتون میریزه. تمام این پروسه از طریق بانک شما، بانک بابک، بانک مرکزی کشور مطبوعتون انجام میشه. در واقع این یکی از دلایلی هست که ما از بانک استفاده میکنیم: “نقل و انتقال پول توسط یک واسطه که همون بانکه و ایجاد یک سازمان واسطه برای اداره کردن نقل و انتقالات ارزی”. در اینجا لازمه که بگم کریپتو کارنسی یکی از اهدافش حذف کردن این واسطه هست. یعنی اینکه شما بتونی پول رو به دست بابک برسونی بدون اینکه بانک بفهمه شما چه قدر پول داری، به چه کسی میخوای پول بفرستی، اون شخص تو کدوم بانک و یا شعبه حساب داره و چه قدر پول از شما گرفته. خوب این آرزوی همه تبهکارها، قاچاقچی ها،  و اختلاس گرهاست ولی به یاد داشته باشید که این فقط یک روی سکه هست

شاید بپرسین خوب مگه مشکل چیه که بانک این کارو انجام بده؟ قبل از ارائه هر گونه توضیح باید این رو در نظر داشته باشیم که ساختار بانک مدرن توسط ایتالیایی های ثروتمند و به خصوص خانواده مدیچی در قرون وسطی ریخته شد(اگر براتون جالبه بیشتر در این مورد بدونین بهتون تماشای سریال مدیچی رو توصیه میکنم). بانک های مدرن توسط افراد ثروتمند بنیاد نهاده شدند و اولین هدف آنها بیشتر کردن سرمایه اولیه بانک هست در حالی که بلاک چین توسط یک تیم متخصص نرم افزار و ریاضیات بنا نهاده شده و هدف خاصی رو برای انباشت سرمایه دنبال نمیکنه

جواب سوال قبل رو میتونیم جور دیگه هم بدیم: سیستم بانکی خیلی نقاط ضعف داره. اگر یه کم تامل کنیم خیلی هاش به ذهنمون میان. مثلا اینکه ما در اکثر مواقع نمیدونیم که بانک با پول ما چه کار میکنه. به کی وام میده و چه قدر میده؟ آیا پول ما در جای درست سرمایه گذاری میشه؟ آیا بانک برای دادن وام های میلیونی و میلیاردی گیرنده وام رو چک میکنه؟ این فقط تعدادی انگشت شمار از سوالاتی هست که میتونه بدون جواب باقی بمونه. حالا سوالی که ما باید بهش جواب بدیم اینه که آیا سیستمی وجود داره که بتونیم باهاش پول بفرستیم و یا دریافت کنیم بدون اینکه واسطه ای باشه؟ بدون اینکه ثروتمندان با واسطه قرار دادن خودشون بر ثروت خود بیفزایند میتوان پولی رو از جایی به جای دیگر منتقل کرد؟ چرا باید برای هر انتقال، پولی به بانک پرداخت کنیم؟ دست آخر آنها کاری را انجام میدن که خود ما هم میتونیم انجام بدیم. یعنی ثبت نقل و انتقالات پول

جواب: بله! سیستمی هست که میشه باهاش دست بانک رو کوتاه کرد و آن سیستم همین بلاک چین و پول مجازی است. توجه داشته باشید که ما میتونیم همون کاری رو که بانک انجام میده بین خودمون انجام بدیم. ما میتونیم یک لیست مجازی از نقل و انتقالات پول بین خودمون نگهداری کنیم: مثلا آقای فلانی در تاریخ فلان یک میلیون پول به بابک داد. اگر ما این واریز رو نقطه شروع سیستم مجازی خودمون به حساب بیاریم میتونیم تمام پولی که بابک بعد از تولد سیستم مجازی نقل وانتقال به دست اورده و یا به اشخاص دیگه داده رو حساب کنیم و بدونیم. شاید در این نقطه سر و صدای طرفدارهای حقوق شخصی در بیاد که آقا یعنی چی که همه عالم و آدم بدونن من چه قدر در آوردم و یا چه قدر خرج کردم که این یک بحث دیگه ای هست که در این منوال نمیگنجد. فقط به گفتن این بسنده میکنم که هویت واقعی فرد در بلاک چین برملا نمیشه مگر در مواقع خاص. حالا سوال دیگه اینه که چند نفر لازم داریم که بتونیم این سیستم رو شروع کنیم؟ جواب: حداقل سه نفر. اگر دونفر باشن کافیه که فقط یک نفرلیست رو داشته باشه ولی خوب اومدیم و یارو تو زرد از آب در اومد و لیست رو دستکاری کرد، حالا چی؟ پس بهتره که دو نفر باشن. حالا که دو نفر هستن اومدیم و هر جفتشون نالوتی و غیر قابل اعتماد بودن، پس بهتره سه نفر باشن تا نفر سوم حواسش به این دوتای دیگه باشه. اگر هم که همه فاسدن پس بهتره که همه جمع کنیم بریم سراغ کارمون پولم بدیم به بانک و والسلام. حالا جدا از شوخی سوال دیگه اینه که گیریم که ما اینکارو کردیم، نقل و انتقالات رو چه جوری باید ثبت کنیم؟ جواب: نقل و انتقلات حد نصاب دارن. مثلا ما میتونیم یک تراکنش و یا ده تراکنش(در زبان انگلیسی ترانزاکشن) رو مد نظر قرار بدیم. وقتی به اون حد نصاب رسیدیم اون لیست رو تا کنیم، مهر و موم کنیم و بزاریم یه گوشه



یادداشت: به این عمل (تا کردن لیست نقل و انتقالات و قرار دادنش در یک جای امن) در زبان تکنیکی ماینینگ گفته میشه ولی ما مهر و موم صداش میکنیم چون در این مرحله به خاطر سپاری این تفسیر ساده تر از کلمه های قلمبه و سلمبه انگلیسی هست. بر خلاف سیستم بانکی که خودش عهده دار چاپ و تکثیر پول هست کریپتوکارنسی ماده اصلی رو از استفاده کنندگان این سیستم جذب میکنه. یعنی هر چی استفاده کننده (ماینر) بیشتر باشه پول بیشتری تولید و تقسیم میشه. دقت داشته باشید که در جمله قبل منظور از تولید همون یافتن بیت کوین ها هست. در واقع این همان اتفاقی است که افتاده و بیت کوینی که در اولین تراکنش فقط سه سنت ارزش داشت همین چند روز پیش از مرزشش هزار دلار گذشت. البته فاکتور های زیادی در قیمت بیت کوین تاثیر دارند ولی یکی از مهمترین  آنها همان تعداد استفاده کنندگان آن است. هر چه بیشتر، قیمت بالاتر، و ارزش آن بیشتر. جا داره که یک بار دیگه از کوتاه شدن دست بانک ها از سازمان دهی پول یاد کنیم چرا که در کریپتو کارنسی مردم نقش اول رو دارند و نه سیاست های پولی و مالی که اغلب اوقات برای تامین امنیت عده ای خاص بر جامعه تحمیل میشوند

Hash Power

تمامی تراکنش ها (یا همون معامله ها) به وسیله یک الگوریتم مهر و موم میشن که یکی از نویسندگان اونرو ماشین جادویی نام گذاشته. این ماشین جادویی در زبان تخصصی “هش پاور” نامیده میشه که ما باهاش کاری نداریم و همون ماشین جادویی صداش میکنیم

یادتونه گفتم که “همه میتونن از همه چی خبر داشته باشن”؟ خوب زیادم راست نگفتم. چون در واقع وقتی شما قراره اون لیست رو مهر و موم کنید احتیاج به یه ماشین جادویی دارین که اینکارو براتون بکنه. پس فرض کنید که شما به این ماشین یه ورودی میدین به نام لیست شماره یک و در خروجی این ماشین یه تیکه کاغذ تا شده دریافت میکنید که هیچی توش ننوشته جز یک کد. میدونم یک کم داره پیچیده میشه ولی با من باشید تا آخر این  داستان رو با هم کشف کنیم. اون تیکه کاغذ کد شده برای من و شما هیچ معنی خاصی نداره ولی اگر بدیمش به ماشین جادویی اون میتونه لیست نقل و انتقالات رو به ما بده تا ما چک کنیم ببینیم چه خبره. حالا سوال دیگه اینه که اصلا چرا همه ما باید یه لیست از نقل و انتقالات داشته باشیم؟ خوب یکیمون که مورد اعتماد همه هست این کارو بکنه و ما هم از روی نسخه اون کپی کنیم وختم به خیر بشه ماجرا. جواب: اینجا اون نقطه ای از ماجرا هست که فرق بین بلاک چین و اقتصادی که ما میشناسیم پدیدار میشه. کسانی که وقت میگذارن و صحت نقل و انتقالات رو حساب میکنن جایزه نقدی(بیت) دریافت میکنن و این همان شالوده پول مجازی هست


به خاطر داشته باشید شاید یک نفر بخواد دزدی بکنه و لیست نقل و انتقالات رو به نفع خودش تغییر بده. ایشون نه تنها یک لیست از نقل و انتقالات بلکه همه لیست های بعدی رو نیز باید تغییر بده و با توجه به این که همه لیست ها به هم مرتبط هستن به وقوع پیوستن این امر دور از انتظاره (یا بهتر بگم: هنوز کسی نتونسته اینکارو بکنه). حالا اگر به جای یک نفر مثلا شش نفر در جمع ده نفره ما تصمیم بگیرن فساد کنن تکلیف چیه؟ جواب: این همان نقطه ضعفی هست که باعث فروپاشی بیت کوین و بلاک چین میشه. سوال دیگه شکاکان به بیت کوین اینه که خوب چطوری میشه ثابت کرد که یک نفر یک بیت کوین رو چند بار خرج نکنه؟ در واقع جواب این سوال در پاراگراف های بالا داده شده. بیت کوین وهر کسری از اون کد گذاری شده و خرج کردن چند باره آن ممکن  نیست. ولی به خاطر داشته باشید که این جواب کامل نیست و میشه در رابطه با امنیت بیت کوین کتاب های قطوری نوشت

خالق بیت کوین

خیلی تحقیقات انجام شده تا خالق بیت کوین شناخته بشه ولی هیچ کمکی نکرده جز افزودن بر شک کنجکاوها. ساتوشی ناکاموتو اسم شخصی هست که یک مقاله علمی درباره بیت کوین رو در سال دوهزار و هشت به یک ایمیل لیست در رابطه با کریپتو کارنسی فرستاد. بعد از مشهور شدن بیت کوین خیلی ها به سراغ فاش کردن هویت ایشان رفتند ولی همگی با دستان خالی برگشتند. وی در ایمیل هایی که به دریافت کننده های اولیه بیت کوین میفرستاد (تا سیستم رو امتحان کنند) خودش رو فردی معرفی میکرد که در ژاپن زندگی کرده در حالی که سرویس ایمیلش روی یک سرور آلمانی ثبت شده بود، انگلیسی رو عالی مینوشت و متخصص کامپیوتر و نرم افزار هم بود. حتی شخصی به همین نام در کالیفرنیای شمالی پیدا شد که با چهره ای آسیایی میتوانست گزینه خوبی برای شناسایی ناکاموتو باشد ولی بعدها از سوی خود وی تکذیب شد و دیگران هم نتوانستند دستش را رو کنند

تصویر صفحهٔ نخست مقاله‌ ناکاموتو

تصویر ساتوشی ناکاموتو مهندسی ژاپنی تبار که رسانه ها به دلیل شباهت اسمی وی را به عنوان خالق بیت کوین شناسایی کردند. وی این موضوع را بارها رد کرده است

کیف پول الکترونیکی

حالا سوال بعدی که پیش میاد اینه که ما کجا میتونیم بیت کوینامونو ذخیره کنیم؟ جواب: کیف پول الکترونیکی. شما قبل از اینکه شروع کنین به ماینینگ و یا کسب درآمد از طریق پول مجازی باید یه کیف پول الکترونیکی برای خودتون دست و پا کنین. توی این لینک میتونید اطلاعات بیشتری به دست بیارید.  من به شخصه از گرین آدرس استفاده میکنم و تا الان راضی بودم. تعداد زیادی نرم افزار وجود داره که چه به صورت آن لاین و یا چه به صورت آف لاین خدمات ذخیره بیت کوین ارائه میدن. به خاطر داشته باشید که ذخیره آن لاین یعنی اینکه شما باید برای دسترسی به حسابتون به اینترنت وصل بشید. برای ذخیره بیت کوین در حساب های آف لاین که روی سیستم عاملتون نصب میشه فقط کافیه که به سیستم خودتون دسترسی داشته باشید. خیلی از سرمایه گذاران بیت کوین دارایی هاشون رو در کامپیوترهایی ذخیره میکنند که سیستم اینترنتش نه تنها غیر فعاله بلکه از توی کامپیوتر حذف شده. اینکار برای بالا بردن امنیت ذخیره کردن هست. کاری که من خودم نمیکنم ولی اکیدا توصیه میکنم

در تصویر زیر مثالی از دریافت بیت بر روی حساب خودم رو برای شما قرار دادم: اون رمز 34 کارکتری که میبینید در واقع شماره حساب کیف پول الکترونیکی هست که به حساب من پول واریز کرده. کیف پول الکترونیکی منم یک رمز مثل این داره. آخرین باری که برای من پول واریز شده 76 بار تایید شده (به کادر سبز رنگ در زیر کد نگاه کنید). یادتون هست که تو پاراگراف دوم و سوم گفتیم که نقل و انتقالات باید از طرف دیگران تایید بشن؟ این شماره تعداد کسانی هست که این تراکنش رو تایید کردن. یعنی 76 نفر میدونن که آقا و یا خانم ایکس برای میلاد 1065.60 بیت معادل 3.93 دلار آمریکا پول ریخته. توجه داشته باشید که معادل سازی به دلارطبق ارزش روزانه بیت کوین حساب میشه و اون 1065.60 بیت در روز و ساعت و دقیقه ای که من اسکرین شات گرفتم 3.93 دلار ارزش داشته . هر بار که شما به کیف پول الکترونیکیتون سر میزنید این مقدار بسته به ارزش بیت کوین به صورت خودکار حساب خواهد شد


چه طوری بیت کوین و یا بیت (واحد کوچکتر بیت کوین) به دست بیاریم؟

راه های مختلفی هست: در این لینک میتونید بیشتر بخونید. اولیش اینه که عضوی از بلاک چین بشین و به ازای تایید نقل و انتقالات جایزه دریافت کنید که این کار اسمش همون ماینینگ کردنه. اون 76 نفری که توی تصویر بالا تراکنش رو تایید کردن یه مقداری بیت برای این کارشون دریافت کردن. اگر حوصله این کار و دانش اون رو ندارید میتونید مستقیما بیت کوین بخرین. شاید اون اوائل که بیت کوین فقط چند سنت ارزش داشت این خیلی ایده خوبی بود ولی امروز که دارم این پست رو آپدیت میکنم قیمت بیت کوین بعد از کلی نوسان به خاطر تصمیم دولت مرکزی چین در قبول نکردن بیت کوین به عنوان روش سرمایه گزاری از مرز چهار هزار دلار گذشته  در حالی که تا کمی بیشتر از یک ماه پیش به بالای پنج هزار دلار رسیده بود(پیش بینی ها حاکی از اینه که در آینده نزدیک به شش هزار دلار خواهد رسید). میتونین در این لینک نوسانات قیمت رو ببینید. به خاطر داشته باشید که هر بیت کوین از یک میلیون بیت تشکیل شده و نرخ روزمره بیتی که شما میتونید به دست بیارید بر طبق شاخصه وسایل و تکنیکی که استفاده میکنید تغییر میکنه. ولی عامل مهم دیگه نرخ بیت کوین هست. مثلا یک سال قبل سایت بیت کوین گت واسه تموم کردن هر نظر سنجی نزدیک  500 تا 800 بیت به شما جایزه میداد ولی از وقتی قیمت بیت کوین بالا رفت فقط بین 30 تا 70 یا 80 بیت جایزه میده. در این لینک میتونید از این سایت دیدن کنید و با به ثمر رساندن یکی از عملیات ها مقداری بیت دریافت کنید. از دیدن ویدئو گرفته تا شرکت کردن در نظر سنجی های مختلف و یا حتی رسوندن یک جعبه به جایی. البته محدودیت های جغرافی و یا تحریم ممکنه دسترسی خیلی از فارسی زبانان رو به این سایت محدود کنه چون اکثر نظر سنجی ها برای ایالات متحده طراحی شده و خدمات دیگه هم شاید محدود به آمریکای شمالی باشه. خودتون کنجکاوی کنین ببینین چه خبره

آیا ارز مجازی دیگری وجود دارد؟

 بله. به غیر از بیت کوین ارزهای دیگری هم وجود دارند. تا امروز که این پست رو آپدیت میکنم نزدیک هزار واحد ارزی مثل بیت کوین به وجود اومدن. دوتا از معروفترین آنها اثریوم و لایت کوین هستن

اثریوم و لایت کوین به ترتیب ارزش کمتری نسبت به بیت کوین دارند. مثلا میتونیم این دو واحد رو به نقره و برنز در برابر طلا که بیت کوین هست تشبیه کنیم. ارزش یک واحد اثریوم مدت هاست که حدود 300 دلار آمریکاست و لایت کوین حدود 45 دلار. خیلی از کسانی که به سرمایه گذاری در این زمینه روی میارن در اغلب مواقع شاید این دو واحد رو به خاطر قیمت کمترشون ترجیح بدن. البته فراموش نکنید که دارم از کسایی حرف میزنم که میخوان منابع مالی واقعیشون رو تبدیل به پول مجازی بکنن! که تعدادشون خیلی کمه. سایت های مختلفی برای خرید و فروش این دو واحد ارزی هست. یکیشونو میتونید اینجا ببینید. توی این وب سایت میتونید لایت کوین بخرین و بفروشین

ارزش بیت کوین

:بیت کوین همه ویژگی های طلا رو داره

اول: هر واحد آن به صورت انفرادی برابر با دیگر واحدهاست (مثل شمش طلا). دوم: قابل تقسیم به واحدهای کوچکتر است(هر بیت کوین به یک میلیون بیت تقسیم میشه و حتی در آینده در صورت لزوم هر بیت میتونه به قسمت های کوچکتر هم تقسیم بشه). سوم: به عنوان وسیله مبادله قابل قبول است. چهارم: مقدار مشخصی از آن در دسترس است (بیست و یک میلیون). پنجم: همه ورژن های بیت کوین با هم برابرند. ششم: قابل حمل است. هفتم: میتواند بارها استفاده بشود بدون آنکه از بین برود. اگر دقت کنید متوجه میشید که بیت کوین همه این خصوصیات رو حتی بهتر از پول نقد برخورداره و حتی خیلی ها بیت کوین رو بهتر از طلا به حساب میاورند. به غیر از این شش مشخصه بیت کوین میتونه خریده بشه و یا فروخته بشه. در حال حاضر هیچ پروسه خرید و فروش واحدی برای بیت کوین وجود نداره و به همین دلیل توی مارکت های مختلف قیمت ها مختلف هستند. آفرینندگان بین کوین معتقدند که با باز کردن پای افراد بیشتردر فرآیند ماینینگ ارزش بیت کوین بالاتر میره. در واقع این فرضیه از مرحله فرضیه بودن گذشته و ثابت شده که با افزایش ماینر ها و تزانزاکشن ها قیمت بیت کوین بالاتر میره

در آخر باید بگم که آینده از آن ارز مجازی خواهد بود. در واقع خیلی از پیش بینی هایی که در رابطه با بالا رفتن قیمت بیت کوین  شد درست از آب درآمد. مطمئنا ارزش این واحد بالاتر هم خواهد رفت. در حال حاضر تعداد بسیار کمی از مردم دنیا این ارز رو میشناسن و اکثر رسانه ها به دلیل فشار های دولتی حاضر نیستند که اخبار بیت کوین در دسترس همه قرار بگیره چون میتونه باعث نابودی سیستم بانکی در تمام دنیا بشه و باعث بشه که میلیون ها نفر شغل خود را از دست بدهند. اگر روزی بیت کوین بتونه سیستم بانکی حال حاضر رو شکست بده دیگر سرمایه گذاری شکل و قاعده خودش رو از دست خواهد داد. از نظر بهره وری دیگر احتیاجی به ضرب سکه و ارز نیست و میشه از این طریق میلیونها تن کاغذ و منابع فلزی رو ذخیره کرد.امنیت این سیستم بالاست و همه به کار هم نظارت میکنن و احتمال اختلاس، رشوه، تبانی و غیره نزدیک به صفر است. با این حال همه متخصصان در زمینه  ارز مجازی قبول دارند که کم و کاستی هایی هم در این سیستم وجود داره و راه برای بهتر کردن اون همیشه بازه. در واقع سیستم های جدید که در زبان انگلیسی ازشون به عنوان فورک یاد میشه (مثل بیت کوین کش) راه کار های پیشنهادی برای بهتر کردن بیت کوین هستند

یادداشت: قیمت بیت کوین در اوایل ماه دسامبر به بیست هزار دلار رسید