Jump to content

[LE] Tips for hashing strings


Recommended Posts

Hello,

 

I need to write a simple hashmap in papyrus so that a string can be associated with a unique integer. The aim for it is to be simple and fast. No crypto-security at risk here :D If it collides in a few instances, no big deal, nothing happened.

 

I know of the md5sum hashing but I think it's way too complicated, and other popular algorithms I found on places like stackoverflow require bitwise operations, which (as far as I know) are not available for papyrus.

 

So this is what I came up with so far: convert the first few characters to numbers, add them and just return with that.

int function getHashCode(String akString)
    int i = 0
    int hash = 0
    while( i < 40)    ; we can count on the string having at least 100 characters, this surely wont go out of bounds.
        String char = StringUtil.GetNthChar(akString, i)
        int iChar = StringUtil.AsOrd(char)
        hash += iChar
        i += 1
    endwhile
    return hash
endfunction

I'm not iterating over the entire string so that it's faster. It couldn't get any simpler, and I'm fairly sure this would already work "well enough", but is there perhaps a better algorithm that is just as simple and fast but would get less collisions? Or could be an improvement in some way?

If someone has tips on how to do this better it'd be welcome :smile:

 

Thanks

Link to comment
Share on other sites

next script code is based on this https://www.geeksforgeeks.org/how-string-hashcode-value-is-calculated/

;; How is a string hashcode calculated?
;; The hashcode value of a String is calculated with the help of a formula:
;;
;; s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
;;
;; where: ^ - refers to the exponential operand
;; n - represents the length of the string
;; s - represents the i-th character of the string

 

 

;-----------------------------------
Int FUNCTION myF_GetHashOF(String s)
;-----------------------------------
; https://www.creationkit.com/index.php?title=GetNthChar_-_StringUtil
; https://www.creationkit.com/index.php?title=AsOrd_-_StringUtil

    int iHash
    int n = StringUtil.GetLength(s)

    IF (n > 7)           ; 31^6 is still inside maxLongint of 0x7fffffff
        n = 7            ; limit the string hash to seven chars only
    ENDIF

int i = 0                ; we start with Zero like array
    WHILE (i < n)
        int T = StringUtil.AsOrd( StringUtil.GetNthChar(s, i) )        ; 'A' --> 65 decimal

; *** Function above is case-insensitive (like all SKSE string functions right now).  ***

        IF     (T < 65)
        ELSEIF (T > 122)
        ELSE                    ; convert all letters from 'A'..'Z' to 'a'..'z'
            T = T + 32          ; 65 + 32 = 97
        ENDIF

        i = i + 1
        T = T * Math.POW(31, (n - i))

        iHash = iHash + T
    ENDWHILE

    RETURN iHash
ENDFUNCTION

 

 

 

You wrote: "I found on places like stackoverflow require bitwise operations, which (as far as I know) are not available for papyrus."

This is not right. Bitwise operation on type "Int" are available with pure papyrus code without SKSE. But type "Int" is limited to numbers 31-bit positive.

 

 

 

; 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9 8 7 6 5 4 3 2 1 0  bits          signed integer 32-bit
; /  /  /  /  /  /  /  /  /  /  /  /  /  /  /  /  /  /  /  /  /  /  / / / / / / / / / /
; 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0 0 0 0 0 0 0 0 0 0 = 0x00000000    positive
; 0  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1 1 1 1 1 1 1 1 1 1 = 0x7fffffff    range
; -------------------------------------------------------------------------------------
; 1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0 0 0 0 0 0 0 0 0 0 = 0x80000000    negative
; 1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1 1 1 1 1 1 1 1 1 1 = 0xffffffff    range

;-----------------------
Int FUNCTION POW2(Int p) Global  ; internal helper
;-----------------------
; 2^0 = 1
; 2^1 = 2
; 2^2 = 4
; 2^3 = 8
; 2^4 = 16

; https://www.creationkit.com/index.php?title=Pow_-_Math
    RETURN Math.POW(2.0, p as Float) as Int

; 2^p = i
;--------
;int i = 0x01
;    WHILE (p)        ; (p != 0)
;        i = i * 2
;        p = p - 1        
;    ENDWHILE
;    RETURN i    
ENDFUNCTION


;-------------------------------------
Int FUNCTION SwitchBit(Int n, Int p=0) Global
;-------------------------------------
IF (p < 0)
    RETURN n
ENDIF
;---------
IF (p > 31)
    RETURN n
ENDIF
;---------
    p = POW2(p)         ; convert "p" into power of 2 to divide n with

IF ((n/p) % 2 == 0)
    RETURN (n+p)        ; set bit, if was unset before
ENDIF
;---------
    RETURN (n-p)        ; unset bit, if was set before
ENDFUNCTION


;------------------------------------
Int FUNCTION UnsetBit(Int n, Int p=0) Global
;------------------------------------
IF (p < 0)
    RETURN n
ENDIF
;---------
IF (p > 31)
    RETURN n
ENDIF
;---------
    p = POW2(p)          ; convert "p" into power of 2 to divide n with

    IF ((n/p) % 2 == 1)
        n = n - p        ; unset bit
    ENDIF

    RETURN n        
ENDFUNCTION


;----------------------------------
Int FUNCTION SetBit(Int n, Int p=0) Global
;----------------------------------
; https://www.creationkit.com/index.php?title=User:PROXiCiDE/MathUtil
;Int Function Bit_Set(Int aiValue, Int aiIndex = 1)
;    Return Math.LogicalOr(aiValue,Bit_MakeByte(aiIndex))
;EndFunction

IF (p < 0)
    RETURN n
ENDIF
;---------
IF (p > 31)
    RETURN n
ENDIF
;---------
    p = POW2(p)          ; convert "p" into power of 2 to divide n with

    IF ((n/p) % 2 == 0)
        n = n + p        ; set bit
    ENDIF

    RETURN n
ENDFUNCTION

;------------------------------
Bool FUNCTION Bit(Int n, Int p) Global  ; internal helper
;------------------------------
    p = Math.POW(2.0, p as Float) as Int

IF ((n/p) % 2 == 1)
    Return TRUE          ; bit is set
ENDIF
;---------
    Return False         ; bit is unset
ENDFUNCTION


;-----------------------------------
Bool FUNCTION GetBit(Int n, Int p=0) Global
;-----------------------------------
; https://www.creationkit.com/index.php?title=User:PROXiCiDE/MathUtil
;Bool Function Bit_Get(Int aiValue, Int aiIndex = 1)
;    Return Math.LogicalAnd(aiValue,Bit_MakeByte(aiIndex)) != 0
;EndFunction

IF (p < 0)
    Return False
ENDIF
;---------
IF (p > 31)
    Return False
ENDIF
;---------
    RETURN Bit(n, p)
ENDFUNCTION

 

 

Edited by ReDragon2013
Link to comment
Share on other sites

Hmm, that's the actual implementation of Java's hashCode() for the String class? I googled it and multiple sources say yes. Weird, I always thought it would be a lot more complicated. Seems great to use, thanks. :smile:

 

 

Bitwise operation on type "Int" are available with pure papyrus code without SKSE. But type "Int" is limited to numbers 31-bit positive.

Hence the parentheses with "as far as I know". I stand corrected, that's good to know :smile:

Link to comment
Share on other sites

  • Recently Browsing   0 members

    • No registered users viewing this page.
×
×
  • Create New...