HashMap源码解析

原创

CoffeeLand

修改于 2020-03-27 10:11:54

5010

修改于 2020-03-27 10:11:54

文章被收录于专栏：CoffeeLandCoffeeLand

Table of Content

hashMap的数据结构
hashMap的原理
hashMap的hash产生
hashMap的hash碰撞
hashMap的重要概念
hashMap的put方法
hashMap的get方法

hashMap的数据结构

 transient Node<K,V>[] table;
 
  static class Node<K,V> implements Map.Entry<K,V> {
        final int hash;
        final K key;
        V value;
        Node<K,V> next;

        Node(int hash, K key, V value, Node<K,V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
        ...
    }

hashMap的原理

hashMap的hash产生

 
 static final int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }
  // h >>> 16 是让h无符号右移16

是用key计算hashCode,然后与key做无符号右移16位 , 是为了让高位移动,让hash均匀

^是异或运算符，异或的规则是转换成二进制比较，相同为0，不同为1.

& 是与运算， 11为1

(h = key.hashCode())^(h >>>16); 的原因是n-1的高位是0, 因此(n-1) & hash 会导致部分高位无法使用，造成系统的浪费，为了能够使用高位来做index，因此将h>>>16位, 这样的话(n-1)& hash将会变得更加均匀

hashmap不用Mod取余的原因

最早计算index = hash % n

hash计算的值一般都会很大,那么取余操作分部的话基本都会落到最后一个桶里
取余的效率没有位操作快,一般认为，Java的%、/操作比&慢10倍左右，因此采用&运算而不是h % length会提高性能。
因为HashMap中的容量都是2的幂次,这样的话2的幂-1都是11111结尾的(16->10000,15->01111)，当容量一定是2^n时,tab[i = (2^n - 1) & hash] == tab [i=(hash%2^n)]

hashMap的hash碰撞

hashMap里hash算法可能使得hashCode(key1)==hashCode(key2) 从而产生hash碰撞.

hashMap解决hash碰撞

如果出现hash碰撞, hashMap将相同的hashCode值的元素, 按照链表的形式存储

当hash值相同，按照链表存储，当链表的长度>8 ,将链表转化为二叉树; 当

hashMap的重要概念

static final int DEFAULT_INITIAL_CAPACITY =1<<4; 默认的初始容量16, 容量不同于size, size是map里的实际数量

static final int MAXIMUM_CAPACITY =1<<30; hashMap最大的容量

static final float DEFAULT_LOAD_FACTOR =0.75f; 默认的加载因子，负载因子=size除以实际容量，值越大，hash表越满

static final int TREEIFY_THRESHOLD =8; 当链表的长度大于8, 将链表转化为红黑树

static final int UNTREEIFY_THRESHOLD =6; 在进行resize的操作中，如果红黑树的元素小于6，将红黑树转化为链表。


    /**
     * The default initial capacity - MUST be a power of two.
     */
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16 默认的初始容量

    /**
     * The maximum capacity, used if a higher value is implicitly specified
     * by either of the constructors with arguments.
     * MUST be a power of two <= 1<<30.
     */
    static final int MAXIMUM_CAPACITY = 1 << 30;

    /**
     * The load factor used when none specified in constructor.
     */
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    /**
     * The bin count threshold for using a tree rather than list for a
     * bin.  Bins are converted to trees when adding an element to a
     * bin with at least this many nodes. The value must be greater
     * than 2 and should be at least 8 to mesh with assumptions in
     * tree removal about conversion back to plain bins upon
     * shrinkage.
     */
    static final int TREEIFY_THRESHOLD = 8; 

    /**
     * The bin count threshold for untreeifying a (split) bin during a
     * resize operation. Should be less than TREEIFY_THRESHOLD, and at
     * most 6 to mesh with shrinkage detection under removal.
     */
    static final int UNTREEIFY_THRESHOLD = 6;//在进行resize的操作中，如果红黑树的元素小于6， 将红黑树转化为链表。

    /**
     * The smallest table capacity for which bins may be treeified.
     * (Otherwise the table is resized if too many nodes in a bin.)
     * Should be at least 4 * TREEIFY_THRESHOLD to avoid conflicts
     * between resizing and treeification thresholds.
     */
    static final int MIN_TREEIFY_CAPACITY = 64;

hashMap的put方法

tab[i =(n -1)& hash])可以看出hash的主要作用是帮我们找出key在hashMap中对应的数组下标

1. 计算key的hash值

2. 根据key的hash找出要存储元素的位置

1）tab[i=(n-1)&hash] == null 这个位置没有元素,所以创建新的Node

2) tab[i=(n-1)&hash] != null 这个位置不为空, 产生了hash碰撞, 分为以下两种情况

i) 要插入的元素在Node数组里

ii) 要插入的元素在TreeNode里

iii) 要插入的元素链表里

3) 根据onlyIfAbsent的输入值,来判断是否更新, false更新,true不更新

public V put(K key, V value) 
{
        return putVal(hash(key), key, value, false, true);
}
    
/**
     * Implements Map.put and related methods.
     *
     * @param hash hash for key
     * @param key the key
     * @param value the value to put
     * @param onlyIfAbsent if true, don't change existing value
     * @param evict if false, the table is in creation mode.
     * @return previous value, or null if none
     */
final V putVal(int hash, K key, V value, boolean onlyIfAbsent, boolean evict) 
{
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
        if ((p = tab[i = (n - 1) & hash]) == null) //这个位置为null， 所以创建新的数组元素
            tab[i] = newNode(hash, key, value, null);
        else { // 元素已经存在, 用新的值来替换,要
            Node<K,V> e; K k;
            
            //要插入的元素在,在数组里
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))  
                e = p;
            //要插入的元素在TreeNode里
            else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            //要插入的元素链表里
            else { 
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
            if (e != null) { // existing mapping for key, will replace the old value once onlyifAbsent is false 
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        ++modCount;
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
    }

hashMap的get方法

get方法是通过计算key的hash值,并且比较key相等,然后取出key的value

1) 用key所对应的hash找出那个位置上的第一个Node

2) if与第一个Node的hash相同并且key相同，直接return 第一个Node

3) if 与第一个node的hash相同, 判断node是在treeNode里还是在链表里

i) if first是TreeNode，则getTreeNode返回

ii) if first是链表,则用用do while去找first的next的key与输入的key相同的node并返回

public V get(Object key) {
        Node<K,V> e;
        return (e = getNode(hash(key), key)) == null ? null : e.value;
    }
    
  final Node<K,V> getNode(int hash, Object key) {
        Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (first = tab[(n - 1) & hash]) != null) {
            if (first.hash == hash && // always check first node
                ((k = first.key) == key || (key != null && key.equals(k))))
                return first;
            if ((e = first.next) != null) {
                if (first instanceof TreeNode)
                    return ((TreeNode<K,V>)first).getTreeNode(hash, key);
                do {
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        return e;
                } while ((e = e.next) != null);
            }
        }
        return null;
    }