{"id":1098,"date":"2020-04-30T22:19:02","date_gmt":"2020-05-01T05:19:02","guid":{"rendered":"http:\/\/209.126.2.187\/?p=1098"},"modified":"2024-02-03T19:18:18","modified_gmt":"2024-02-04T03:18:18","slug":"rolling-hash","status":"publish","type":"post","link":"https:\/\/nanzhou.cc\/index.php\/2020\/04\/30\/rolling-hash\/","title":{"rendered":"Rolling Hash"},"content":{"rendered":"\n<h2 class=\"wp-block-heading\">Summary<\/h2>\n\n\n\n<p>Rolling hash is a common technique to compare if two substrings are equal or not. Given <code>O(n)<\/code> preprocessing time, the comparison becomes <code>O(1)<\/code> at the best case.  <\/p>\n\n\n\n<h2 class=\"wp-block-heading\">OJ<\/h2>\n\n\n\n\n<table id=\"tablepress-15\" class=\"tablepress tablepress-id-15\">\n<thead>\n<tr class=\"row-1\">\n\t<td class=\"column-1\"><\/td><th class=\"column-2\">Trick<\/th><th class=\"column-3\">Difficulty<\/th>\n<\/tr>\n<\/thead>\n<tbody class=\"row-striping row-hover\">\n<tr class=\"row-2\">\n\t<td class=\"column-1\"><a href=\"https:\/\/leetcode.com\/problems\/distinct-echo-substrings\/\" target=\"_blank\" rel=\"noopener noreferrer\"> LC 1316. Distinct Echo Substrings<\/a><\/td><td class=\"column-2\">Rolling Hash + string_view or DP in Linear Structure + string_view<\/td><td class=\"column-3\">6 points<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<!-- #tablepress-15 from cache -->\n\n\n\n<h2 class=\"wp-block-heading\">Details<\/h2>\n\n\n\n<p><\/p>\n\n\n\n<p>We use two large coprime numbers <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/nanzhou.cc\/wp-content\/ql-cache\/quicklatex.com-94535dde4210bbf2c3391d3f1d784d9e_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#98;\" title=\"Rendered by QuickLaTeX.com\" height=\"13\" width=\"8\" style=\"vertical-align: 0px;\"\/> for base and <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/nanzhou.cc\/wp-content\/ql-cache\/quicklatex.com-52d0d439c6f21bce07adcdd01aafd152_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#109;\" title=\"Rendered by QuickLaTeX.com\" height=\"8\" width=\"15\" style=\"vertical-align: 0px;\"\/> for mod. We use a polynomial to calculate the hash value. Typical values are <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/nanzhou.cc\/wp-content\/ql-cache\/quicklatex.com-6054bcde3147c8c45f3143e6bfa49429_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#98;&#61;&#49;&#101;&#57;&#43;&#55;\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"88\" style=\"vertical-align: -2px;\"\/> and <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/nanzhou.cc\/wp-content\/ql-cache\/quicklatex.com-e6ffba2f047dba359e44edcc9738ed06_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#109;&#61;&#49;&#101;&#57;&#43;&#57;\" title=\"Rendered by QuickLaTeX.com\" height=\"14\" width=\"96\" style=\"vertical-align: -2px;\"\/> <\/p>\n\n\n\n<p><p class=\"ql-center-displayed-equation\" style=\"line-height: 25px;\"><span class=\"ql-right-eqno\"> &nbsp; <\/span><span class=\"ql-left-eqno\"> &nbsp; <\/span><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/nanzhou.cc\/wp-content\/ql-cache\/quicklatex.com-f351358b8fe26151d49abaaf443c3d45_l3.png\" height=\"25\" width=\"231\" class=\"ql-img-displayed-equation quicklatex-auto-format\" alt=\"&#92;&#091;&#102;&#40;&#115;&#41;&#32;&#61;&#32;&#92;&#115;&#117;&#109;&#32;&#115;&#091;&#105;&#093;&#32;&#92;&#116;&#105;&#109;&#101;&#115;&#32;&#98;&#94;&#105;&#32;&#92;&#112;&#109;&#111;&#100;&#32;&#109;&#92;&#093;\" title=\"Rendered by QuickLaTeX.com\"\/><\/p><\/p>\n\n\n\n<p>Given a substring <code>s[i, j]<\/code>, it is not difficult to get that the hash value of it is,<\/p>\n\n\n\n<p><p class=\"ql-center-displayed-equation\" style=\"line-height: 38px;\"><span class=\"ql-right-eqno\"> &nbsp; <\/span><span class=\"ql-left-eqno\"> &nbsp; <\/span><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/nanzhou.cc\/wp-content\/ql-cache\/quicklatex.com-3861409af2bae695bb813bdbdfa1e80c_l3.png\" height=\"38\" width=\"415\" class=\"ql-img-displayed-equation quicklatex-auto-format\" alt=\"&#92;&#091;&#102;&#40;&#115;&#091;&#105;&#44;&#32;&#106;&#093;&#41;&#32;&#61;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#40;&#102;&#40;&#115;&#091;&#48;&#44;&#32;&#106;&#093;&#41;&#32;&#45;&#32;&#102;&#40;&#115;&#091;&#48;&#44;&#32;&#105;&#32;&#45;&#32;&#49;&#093;&#41;&#32;&#43;&#32;&#109;&#41;&#125;&#123;&#98;&#94;&#105;&#125;&#32;&#92;&#112;&#109;&#111;&#100;&#32;&#109;&#92;&#093;\" title=\"Rendered by QuickLaTeX.com\"\/><\/p><\/p>\n\n\n\n<p>We could preprocess tables for all the <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/nanzhou.cc\/wp-content\/ql-cache\/quicklatex.com-059b448c5f73a9b24682c5537c1b889f_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#102;&#091;&#48;&#44;&#32;&#105;&#093;\" title=\"Rendered by QuickLaTeX.com\" height=\"18\" width=\"41\" style=\"vertical-align: -5px;\"\/> and <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/nanzhou.cc\/wp-content\/ql-cache\/quicklatex.com-e14013cb594d0cd33ef9c26fbeb120a4_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#98;&#94;&#105;\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"13\" style=\"vertical-align: 0px;\"\/> in <code>O(n)<\/code>. After that, the complexity of getting a hash value for any substring is <code>O(1)<\/code>. <\/p>\n\n\n\n<p>We then have the following truth,<\/p>\n\n\n\n<p><p class=\"ql-center-displayed-equation\" style=\"line-height: 18px;\"><span class=\"ql-right-eqno\"> &nbsp; <\/span><span class=\"ql-left-eqno\"> &nbsp; <\/span><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/nanzhou.cc\/wp-content\/ql-cache\/quicklatex.com-0e184f28565f23e34d15ab18639bbb8e_l3.png\" height=\"18\" width=\"431\" class=\"ql-img-displayed-equation quicklatex-auto-format\" alt=\"&#92;&#091;&#102;&#40;&#115;&#116;&#114;&#48;&#41;&#61;&#61;&#102;&#40;&#115;&#116;&#114;&#49;&#41;&#32;&#92;&#99;&#101;&#110;&#116;&#101;&#114;&#110;&#111;&#116;&#92;&#105;&#109;&#112;&#108;&#105;&#101;&#115;&#32;&#115;&#116;&#114;&#48;&#61;&#61;&#115;&#116;&#114;&#49;&#92;&#32;&#40;&#104;&#97;&#115;&#104;&#92;&#32;&#99;&#111;&#108;&#108;&#105;&#115;&#105;&#111;&#110;&#41;&#92;&#093;\" title=\"Rendered by QuickLaTeX.com\"\/><\/p><\/p>\n\n\n\n<p><p class=\"ql-center-displayed-equation\" style=\"line-height: 18px;\"><span class=\"ql-right-eqno\"> &nbsp; <\/span><span class=\"ql-left-eqno\"> &nbsp; <\/span><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/nanzhou.cc\/wp-content\/ql-cache\/quicklatex.com-2988b62fa1c0783bcfc5d3c5c0e17417_l3.png\" height=\"18\" width=\"275\" class=\"ql-img-displayed-equation quicklatex-auto-format\" alt=\"&#92;&#091;&#102;&#40;&#115;&#116;&#114;&#48;&#41;&#92;&#99;&#101;&#110;&#116;&#101;&#114;&#110;&#111;&#116;&#61;&#102;&#40;&#115;&#116;&#114;&#49;&#41;&#32;&#92;&#105;&#109;&#112;&#108;&#105;&#101;&#115;&#32;&#115;&#116;&#114;&#48;&#92;&#99;&#101;&#110;&#116;&#101;&#114;&#110;&#111;&#116;&#61;&#115;&#117;&#98;&#49;&#92;&#093;\" title=\"Rendered by QuickLaTeX.com\"\/><\/p><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"cpp\" class=\"language-cpp\">namespace {\n\nusing ll = unsigned long long;\nll kBase = 31;\nll kMod = 1000'000'007;\nconstexpr int kMax = 100'005;\n\n\/\/ sum(s[i] * kBase^i)\nll hashes[kMax];\n\/\/ kBase^i\nll base_to_i[kMax];\n\/\/ inverse(kBase^i)\nll base_to_i_inverse[kMax];\n\n}\n\nll FastPower(ll val, ll n);\n\n\/\/ we have\n\/\/  num * x = 1 \n\/\/  num ^ (kMod - 1) = 1\n\/\/ so x = num ^ (kMod - 2)\nll GetInverse(ll val) {\n  return FastPower(val, kMod - 2);\n}\n\nll FastPower(ll val, ll n) {\n  if (n == 0) {\n    return 1;\n  }\n  if (n == 1) {\n    return val;\n  }\n  if (n % 2 == 0) {\n    return FastPower((val * val) % kMod, n \/ 2);\n  }\n  return (FastPower((val * val) % kMod, (n - 1) \/ 2) * val) % kMod;\n}\n\nvoid Preprocess(string_view str) {\n  int n = str.size();\n  ll curr_base = 1;\n  ll curr_sum = 0;\n  for (int i = 0; i &lt; n; ++i) {\n    curr_sum += (str[i] - 'a') * curr_base; \n    curr_sum %= kMod;\n    hashes[i] = curr_sum;\n    base_to_i[i] = curr_base;\n    curr_base = (curr_base * kBase) % kMod;\n  }\n\n  base_to_i_inverse[n - 1] = GetInverse(base_to_i[n - 1]);\n  for (int i = n - 2; i >= 0; --i) {\n    base_to_i_inverse[i] = (base_to_i_inverse[i + 1] * kBase) % kMod;\n  }\n}\n  \nll GetHash(int l, int r) {\n  return ((hashes[r] + kMod - (l - 1 >= 0 ? hashes[l - 1] : 0)) * base_to_i_inverse[l]) % kMod;\n}<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Two Hash Functions<\/h3>\n\n\n\n<p>In <a href=\"https:\/\/leetcode.com\/problems\/longest-duplicate-substring\/\">LC 1044.&nbsp;Longest Duplicate Substring<\/a>, you might find out one hash rolling hash function causes collisions. Another technique for rolling hash is to use Two hash functions (two bases and two mods). <\/p>\n\n\n\n<p><p class=\"ql-center-displayed-equation\" style=\"line-height: 25px;\"><span class=\"ql-right-eqno\"> &nbsp; <\/span><span class=\"ql-left-eqno\"> &nbsp; <\/span><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/nanzhou.cc\/wp-content\/ql-cache\/quicklatex.com-e7db46efa9f2ac18e97e7a8ef2579d29_l3.png\" height=\"25\" width=\"239\" class=\"ql-img-displayed-equation quicklatex-auto-format\" alt=\"&#92;&#091;&#102;&#95;&#105;&#40;&#115;&#41;&#32;&#61;&#32;&#92;&#115;&#117;&#109;&#32;&#115;&#091;&#105;&#093;&#32;&#92;&#116;&#105;&#109;&#101;&#115;&#32;&#98;&#95;&#105;&#94;&#105;&#32;&#92;&#112;&#109;&#111;&#100;&#32;&#123;&#109;&#95;&#105;&#125;&#92;&#093;\" title=\"Rendered by QuickLaTeX.com\"\/><\/p><\/p>\n\n\n\n<p><p class=\"ql-center-displayed-equation\" style=\"line-height: 43px;\"><span class=\"ql-right-eqno\"> &nbsp; <\/span><span class=\"ql-left-eqno\"> &nbsp; <\/span><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/nanzhou.cc\/wp-content\/ql-cache\/quicklatex.com-69370e0ec87efca4f716a4d05bc990d4_l3.png\" height=\"43\" width=\"447\" class=\"ql-img-displayed-equation quicklatex-auto-format\" alt=\"&#92;&#091;&#102;&#95;&#49;&#40;&#115;&#091;&#105;&#44;&#32;&#106;&#093;&#41;&#32;&#61;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#40;&#102;&#95;&#49;&#40;&#115;&#091;&#48;&#44;&#32;&#106;&#093;&#41;&#32;&#45;&#32;&#102;&#95;&#49;&#40;&#115;&#091;&#48;&#44;&#32;&#105;&#32;&#45;&#32;&#49;&#093;&#41;&#32;&#43;&#32;&#109;&#95;&#49;&#41;&#125;&#123;&#98;&#95;&#49;&#94;&#105;&#125;&#32;&#92;&#112;&#109;&#111;&#100;&#32;&#123;&#109;&#95;&#49;&#125;&#92;&#093;\" title=\"Rendered by QuickLaTeX.com\"\/><\/p><\/p>\n\n\n\n<p><p class=\"ql-center-displayed-equation\" style=\"line-height: 43px;\"><span class=\"ql-right-eqno\"> &nbsp; <\/span><span class=\"ql-left-eqno\"> &nbsp; <\/span><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/nanzhou.cc\/wp-content\/ql-cache\/quicklatex.com-9aff7bc5c8f523676d7c6db71570f61b_l3.png\" height=\"43\" width=\"447\" class=\"ql-img-displayed-equation quicklatex-auto-format\" alt=\"&#92;&#091;&#102;&#95;&#50;&#40;&#115;&#091;&#105;&#44;&#32;&#106;&#093;&#41;&#32;&#61;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#40;&#102;&#95;&#50;&#40;&#115;&#091;&#48;&#44;&#32;&#106;&#093;&#41;&#32;&#45;&#32;&#102;&#95;&#50;&#40;&#115;&#091;&#48;&#44;&#32;&#105;&#32;&#45;&#32;&#49;&#093;&#41;&#32;&#43;&#32;&#109;&#95;&#50;&#41;&#125;&#123;&#98;&#95;&#50;&#94;&#105;&#125;&#32;&#92;&#112;&#109;&#111;&#100;&#32;&#123;&#109;&#95;&#50;&#125;&#92;&#093;\" title=\"Rendered by QuickLaTeX.com\"\/><\/p><\/p>\n\n\n\n<p>Use <code>f_1 ^ f_2<\/code> directly here will cause collisions. Instead, we use a hash set of pairs to store these two values. HashSet in the standard library helps us handle hash collisions, which is sufficient to pass the OJ.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"cpp\" class=\"language-cpp\">struct PairHash {\n    size_t operator() (pair&lt;ll, ll&gt; const &amp;p) const {\n        return hash&lt;int&gt;()(p.first) ^ hash&lt;int&gt;()(p.second);\n    }\n};\n\nunordered_set&lt;pair&lt;int, int&gt;, PairHash&gt; visited;<\/code><\/pre>\n\n\n\n<p>Another technique you might use here is the <strong>Modular<\/strong> <strong>Multiplicative Inverse<\/strong>. Please refer to <a href=\"http:\/\/161.97.122.139\/index.php\/2020\/06\/20\/modular-multiplicative-inverse-concepts\/\">this post<\/a>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Rolling Hash Template <\/h2>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"cpp\" class=\"language-cpp\">\n\nclass RollingHash {\npublic:\n  using ll = unsigned long long;\n  ll hashes[kMaxN];\n  ll base_to_i[kMaxN];\n  ll base_to_i_inverse[kMaxN];\n  \n  const int kBase;\n  const int kMod;\n  \n  RollingHash(string_view str, int kBase, int kMod) : kBase(kBase), kMod(kMod) {\n    int n = str.size();\n    ll curr_base = 1;\n    ll curr_sum = 0;\n    for (int i = 0; i &lt; n; ++i) {\n      curr_sum += (str[i] - 'a') * curr_base; \n      curr_sum %= kMod;\n      hashes[i] = curr_sum;\n      base_to_i[i] = curr_base;\n      curr_base = (curr_base * kBase) % kMod;\n    }\n    \n    base_to_i_inverse[n - 1] = GetInverse(base_to_i[n - 1]);\n    for (int i = n - 2; i &gt;= 0; --i) {\n      base_to_i_inverse[i] = (base_to_i_inverse[i + 1] * kBase) % kMod;\n    }\n  }\n  \n  \/\/ s[i, i + len - 1], s[j, j + len - 1]\n  bool Equal(int i, int j, int len, string_view str) {\n    int hash0 = (GetRelativeHash(i, i + len - 1) * base_to_i[j]) % kMod, hash1 = (GetRelativeHash(j, j + len - 1) * base_to_i[i]) % kMod;\n    \/\/ if (hash0 != hash1) {\n    \/\/     return false;\n    \/\/ } else {\n    \/\/     return str.substr(i, len) == str.substr(j, len);\n    \/\/ }\n    return hash0 == hash1;\n  }\n  \n  int GetHash(int left, int right) {\n    return (GetRelativeHash(left, right) * base_to_i_inverse[left]) % kMod;\n  }\n\nprivate:\n  \/\/ https:\/\/oi-wiki.org\/math\/fermat\/\n  \/\/ https:\/\/oi-wiki.org\/math\/inverse\/\n  ll GetInverse(int val) {\n    return FastPower(val, kMod - 2);\n  }\n  \n  ll FastPower(ll val, unsigned int n) {\n    if (n == 0) {\n      return 1;\n    }\n    if (n == 1) {\n      return val;\n    }\n    if (n % 2 == 0) {\n      return FastPower((val * val) % kMod, n \/ 2);\n    }\n    return (FastPower((val * val) % kMod, (n - 1) \/ 2) * val) % kMod;\n  }\n  \n  ll GetRelativeHash(int left, int right) {\n    return ((ll)hashes[right]  + kMod - ((left &gt;= 1) ? hashes[left - 1] : 0)) % kMod;\n  }\n};\n\n<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>Summary Rolling hash is a common technique to compare if two substrings are equal or not. Given O(n) preprocessing time, the comparison becomes O(1) at the best case. OJ Details We use two large coprime numbers for base and for mod. We use a polynomial to calculate the hash value. Typical values are and &nbsp;&#8230;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[48,54],"tags":[],"class_list":["post-1098","post","type-post","status-publish","format-standard","hentry","category-alg","category-string"],"_links":{"self":[{"href":"https:\/\/nanzhou.cc\/index.php\/wp-json\/wp\/v2\/posts\/1098","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/nanzhou.cc\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/nanzhou.cc\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/nanzhou.cc\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/nanzhou.cc\/index.php\/wp-json\/wp\/v2\/comments?post=1098"}],"version-history":[{"count":24,"href":"https:\/\/nanzhou.cc\/index.php\/wp-json\/wp\/v2\/posts\/1098\/revisions"}],"predecessor-version":[{"id":1694,"href":"https:\/\/nanzhou.cc\/index.php\/wp-json\/wp\/v2\/posts\/1098\/revisions\/1694"}],"wp:attachment":[{"href":"https:\/\/nanzhou.cc\/index.php\/wp-json\/wp\/v2\/media?parent=1098"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/nanzhou.cc\/index.php\/wp-json\/wp\/v2\/categories?post=1098"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/nanzhou.cc\/index.php\/wp-json\/wp\/v2\/tags?post=1098"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}