Map (go) related operations and matters needing attention

So far, we have talked about the collection of high-level data types belonging to containers for single elements. They either store elements in succession or store them as interdependent pointers, each representing an independent value that belongs to a particular type.

The dictionary we’re going to talk about today is different. It stores not a collection of single values, but a collection of key-value pairs.

What is the key value pair? It is a word literally translated from English key-value pair. As the name implies, a key pair represents a pair of keys and values.

Note that a “key” and a “value” represent a separate value that belongs to a particular type, and bundling them together is a key value. In the Go specification, to avoid ambiguity, they change the key-value pair to a different name, called the “key-element pair”. IWe also use this seemingly clearer word to explain.

Go The dictionary type of a language is actually a specific implementation of a hash table. In this implementation, the biggest difference between a key and an element is that the type of the former is limited, whereas the latter can be of any type.

To explore the reasons for this limitation, we first need to understand one of the most important processes in hash tables: mapping. You can think of a key as an index of an element, and we can look up the element paired with it in the hash table by the key.

This correspondence between keys and elements is called “mapping” in mathematics, which is also the original meaning of the word “map”. The mapping process of hash tables exists in the operation of adding, deleting, modifying and checking pairs of keys and elements.

aMap := map[string]int{

“one”: 1,

“two”: 2,

“three”: 3,

}

v, ok := aMap[“two”]

if ok {

fmt.Printf(“The value of key %q: %d\n”, k, v)

} else {

fmt.Println(“Not found!”)

}

For example, if we want to find the element value corresponding to a key value in the hash table, then we need to pass the key value as a parameter to the hash table. The hash table first uses the hash function (hash function) to convert the key value to the hash value.

A hash value is usually an unsigned integer. A hash table holds a certain number of buckets, also known as hash buckets, which evenly store the key-element pairs that its hash table receives.

Therefore, the hash table uses the lower bits of the hash value of the key to locate a hash bucket, and then goes to the hash bucket to find the key. Because key-element pairs are always stored bundled together, once the key is found, the corresponding element value must be found.

Subsequently, the hash table returns the corresponding element value as the result. As long as this key-element pair exists in the hash table, it must be found, because the hash table adds, modifies, deletes keys-element pairs of the time mapping process, as described earlier.

Now we know that the first step in the mapping process is to convert the key value to the hash value. In the dictionary of Go language, each key value is represented by its hash value. That is to say, dictionaries do not store any key values, but only store their hash values.

Did you feel anything about it? Let’s look down.

Our question today is: what types of key types can be found in dictionaries?

This problem can be found in the Go language specification, but it is not so simple. Its typical answer is that the key types of the Go dictionary cannot be function types, dictionary types, and slice types.

Problem analysis

Let’s analyze this problem.

Go Language specification stipulates that operators must be applied between values of key types = = and! In other words, the value of the key type must be supported. Because the values of function types, dictionary types, and slice types do not support judgment operations, dictionary key types cannot be these types.

In addition, if the type of the key is of the interface type, then the actual type of the key value can not be the above three types, otherwise in the process of running the program will cause panic (that is, run-time panic). For instance:

var badMap2 = map[interface{}]int{

“1”: 1,

[]int{2}: 2, // This will trigger panic.

3: 3,

}

The type of variable badMap2 here is the dictionary type with a key type of interface{} and a value type of int. Such statements do not cause any errors. Or, I have escaped the examination of the Go language compiler through such a statement.

Notice that I literally initialized the dictionary while declaring it, so that it contained three key-element pairs. The key value of second key element pairs is []int{2}, and the element value is 2. Such a key value does not make the Go language compiler wrong, because it is grammatically speaking.This is acceptable.

But when we run this code, the runtime system of the Go language finds the problem here, throwing a panic and pointing the root to the line in the literal that defines the second key – element pair. The later we find the problem, the problem is corrected.The higher the cost, it is best not to set the key type of the dictionary to any interface type. If you have to do so, make sure that the code is within the controllable range.

Also note that if the key type is an array type, make sure that the element type of that type is not a function type, dictionary type, or slice type.

For example, since the element type of the type [1][] string is [] string, it cannot be used as the key type of the dictionary type. In addition, if the type of the key is a structured type, the type of the field in it is also guaranteed to be legitimate. No matter how deep the illegal type is buried,For example, map[[1][2][3][]string]int, Go language compiler will pull it out.

You may be wondering why the value of the key type must support the judgement operation. As I said earlier, once the Go language locates a hash bucket, it tries to find key values in that bucket. How did you find it?

First, each hash bucket will store all the hash values of all the keys it contains. The Go language compares these hash values with the hash value of the lookup key to see if there is equality. If none of the equals is present, then there is no key value to look for in the bucket, and the Go language willThe results will be returned immediately.

If there is equality, then use the key value itself to contrast. Why do we have to compare? The reason is that the hash values of different values may be the same. There is a term called “hash collision”.

Therefore, even if the hash value is the same, the key value is not necessarily the same. If the values of the key types cannot be judged to be equal, then the mapping process cannot continue. Finally, only if the hash and key values of the keys are equal can the matching key-element pairs be found.

Knowledge expansion

Question 1: what types of key types should be considered as the key types of dictionaries?

As you can see by now, there are some types of values in Go that support grading, and some that do not. Which of these types of values are better suited to be the key types of dictionaries?

Here we leave aside the context in which we use dictionaries, only from a performance perspective. In the mapping process described earlier, “converting key values to hash values” and “comparing the key values to be looked up to the key values in the hash bucket” are obviously two important and time-consuming operations.

Therefore, it can be said that the faster the hashing and judgment operations, the more appropriate the corresponding type as a key type.

For all the basic types, pointer types, and array types, structured types, and interface types, the Go language has a set of algorithms that correspond to them. The algorithm includes hash and judgement.

Take the hash operation as an example, the smaller the width is, the faster the type is. The width of a type is the number of bytes that its single value needs to occupy. For example, a value of type bool, int8, and uint8 takes 1 bytes, so the widths of these types are all 1.

The type with width less than 4 is the fastest, followed by the type with width greater than 4 and less than or equal to 8, and the type with width less than or equal to 8 and less than or equal to 16.

This is true for Boolean, integer, floating point, complex, and pointer types. For a string type, because its width is variable, it depends on the specific length of its value, the shorter the length, the faster the hash.

All of these are basic types. Let’s look at advanced types. Hashing the value of an array type is actually getting the hash value of each element in turn and merging it, so the speed depends on its element type and its length. Details are the same as above.

Similarly, hashing the value of a structure type is essentially hashing and merging all its field values, so the key is the type of its fields and the number of fields. For the interface type, the specific hash algorithm is determined by the actual type of the value.

I don’t recommend that you use these advanced data types as dictionary key types, not only because they are hashed and rated slowly, but also because they have variables in their values.

For example, for an array, I can arbitrarily change the value of its elements, but before and after the change, it represents two different key values.

The value of the structure type might be better, because if I could control access to the fields in it, I could prevent the outside world from modifying it. The interface type is the most dangerous type of the key type of the dictionary.

Remember? If the Go runtime system finds that a key does not support the wait operation in this case, a panic is thrown immediately. In the worst case, this is enough to crash the program.

Which of these basic types should be preferred? The answer is to give priority to numerical type and pointer type. In general, the smaller the width of a type, the better. If the string type is not selected, it is better to impose additional constraints on the length of the key value.

Then what is the unusual situation? Generally speaking, Go language sometimes optimizes the dictionary’s adding, deleting, modifying and checking operations.

For example, if the key type of a dictionary is a string type, or if the key type of a dictionary is an integer type with a width of 4 or 8.

Question 2: will a read operation succeed in a dictionary with a value of nil? What about writing operations?

Well, to avoid burning the brain too long, let’s start with a simple question. Because a dictionary is a reference type, its value is nil when we declare only variables of a dictionary type without initializing them.

Will it be successful to try to get the corresponding element value from the key value on such a variable, or to add a key-element pair? This is a simple question, but one we must keep in mind because it involves the stability of the program at run time.

Let me give you an answer. Except for adding key-element pairs, we do nothing on a dictionary with a value of nil that causes no errors. When we try to add a key-element pair to a dictionary with a value of nil, the go runtime system immediately throws a panIC.

summary

This time we have discussed some confusing questions related to dictionary type. For example, why is the key type of dictionary constrained? For example, what type of type should we choose as the key type of dictionary?

I started with Go language specification and answered these questions on the basis of Go language source code. After seriously reading this article, you should have a certain understanding of the mapping process in the dictionary.

Also, you should know something about the hashing and judging operations that Go does on legitimate key types.

Again, always be aware of operations that might trigger panic, such as adding key-element pairs to a dictionary with a value of nil.

Leave a Reply Cancel reply