【数据结构】哈希表中——拉链法网站首页 技术教程

【数据结构】哈希表中——拉链法

蓝色学者i 2024-06-17 10:48:31

简介【数据结构】哈希表中——拉链法

前言

书接上回，我们接着来讲解哈希表相关的知识。

拉链法

拉链法是解决哈希冲突的另外一种思路，顾名思义，当两个key值哈希冲突时，会将两个值挂在一个链表上，因此也不会发生踩踏事件，是比较常用的解决哈希冲突的一种方式。

思路分析

要实现出这样的哈希表，我们的哈希表就要设计成一个能挂上链表的结构，插入、查找、删除操作也要兼顾链表操作，当然就不需要三个节点状态了。

结构实现

节点：通过上述分析，节点除了存储的数据外，还应该有一个指针指向链表的下一个节点。

template<class K,class V>
struct HashNode
{
	HashNode<K, V>* _next;
	pair<K, V> _kv;

	HashNode(const pair<K,V>& kv)
		: _kv(kv)
		, _next(nullptr)
	{ }
};

哈希表结构：

template<class K,class V, class hash = HashFunc<K>>
class HashTable
{
public:
private:
	vector<Node*> _tables;
	size_t _n = 0;
};

函数实现

gethash

在类模板里我们引入了一个hash的类模板参数，其实这个就是用来得到对应key的hash值的，因为不是所有类型都可以直接转成整形进行哈希的。我们可以采用类模板的方式来对其他类型进行返回key。

template<class K>
struct HashFunc
{
	size_t operator()(const K& key)
	{
		return key;
	}
};

template<>
struct HashFunc<string>
{
	size_t operator()(const string& s)
	{
		size_t hash = 0;
		for (auto c : s)
		{
			hash += c;
			hash *= 31;
		}
		return hash;
	}
};

这里使用了一个类模板特化来处理string类。

getnextPrime

扩容时可以两倍扩容，但参照库里给出的方案按照一定的数进行扩容造成哈希冲突的可能性较小：

size_t GetNextPrime(size_t prime)
{
	// SGI
	static const int __stl_num_primes = 28;
	static const unsigned long __stl_prime_list[__stl_num_pri
	{
		53, 97, 193, 389, 769,
		1543, 3079, 6151, 12289, 24593,
		49157, 98317, 196613, 393241, 786433,
		1572869, 3145739, 6291469, 12582917, 25165843,
		50331653, 100663319, 201326611, 402653189, 805306457,
		1610612741, 3221225473, 4294967291
	};

	size_t i = 0;
	for (; i < __stl_num_primes; ++i)
	{
		if (__stl_prime_list[i] > prime)
			return __stl_prime_list[i];
	}

	return __stl_prime_list[i];
}

插入

插入的大多数逻辑与开放地址法中的插入基本一致，只是原来节点的线性探测变成了对链表的头插。

bool insert(const pair<K,V> kv)
{
	// 检查是否出现过
	if (find(kv.first)) return false;

	hash gethash;

	// 扩容
	if (_n == _tables.size())
	{
		size_t newsize = GetNextPrime(_tables.size());
		vector<Node*> newtables(newsize, nullptr);

		for (Node*& cur : _tables)
		{
			while (cur)
			{
				Node* next = cur->_next;

				size_t hashi = gethash(cur->_kv.first) % newtables.size();

				//headin
				cur->_next = newtables[hashi];
				newtables[hashi] = cur;

				cur = next;

			}
		}
		_tables.swap(newtables)
	}

	// 插入
	size_t hashi = gethash(kv.first) % _tables.size();
	Node* newnode = new Node(kv);
	Node* cur = _tables[hashi];

	// 头插
	newnode->_next = _tables[hashi];
	_tables[hashi] = newnode;

	//负载因子++
	_n++;

	return true;
}

删除

删除操作即找到对应的位置进行删除即可,需要注意的是头删后要把对应节点置为空，防止下次非法访问。

bool Erase(const K& key)
{
	Node* ret = find(key);
	if (!ret) return false;

	Hash gethash;
	size_t hashi = gethash(key) % _tables.size();

	Node* cur = _tables[hashi];

	if (ret == cur)
	{
		// 头删
		cur = ret;
		_tables[hashi] = nullptr;
	}
	else
	{
		while (cur->_next != ret)
		{
			cur = cur->_next;
		}

		cur->_next = ret->_next;
		
	}

	delete ret;
	return true;
}

查找

Node* find(const K& key)
{
	if (!_table.size())
		return nullptr;

	hash gethash;
	size_t hashi = gethash(key) % _tables.size();

	Node* cur = _tables[hashi];
	while (cur)
	{
		if (cur->_kv.first == key)
		{
			return cur;
		}
		cur = cur->_next;
	}
	return nullptr;
}