# Fortran Wiki hash table example

A module which is an example implementation of object oriented hash tables in Fortran 2003. The module is licensed under the LGPL. This module could be extended to be generic by use of the transfer function, or specialization through type extension. The methods are attached to the objects via type-bound procedures. Additional PRIVATE statements should be inserted or uncommented for the production code; the raw objects and methods have been left exposed here for testing purposes. This library and the corresponding test program compile with the Intel Fortran compiler, version 11.1.046. The program and library seem to be correct, but they have not been thoroughly tested.

## Code

### Library

This library was fashioned after the description of hash table data structures in The Practice of Programming.

! Module implementing an OO hash table (dictionary) in Fortran 2003.
! Compiles and runs with accompanying test program under the Intel
! Fortran Compiler, version 11.1.046

! Copyright (c) Izaak Beekman 2010

! This program is free software: you can redistribute it and/or modify
! the Free Software Foundation, either version 3 of the License, or
! (at your option) any later version.

! This program is distributed in the hope that it will be useful,
! but WITHOUT ANY WARRANTY; without even the implied warranty of
! MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
! GNU Lesser General Public License for more details.

! You should have received a copy of the GNU Lesser General Public License
! along with this program.  If not, see <http://www.gnu.org/licenses/>.

MODULE hashtbl
IMPLICIT NONE ! Use strong typing
INTEGER, PARAMETER :: tbl_size = 50

TYPE sllist
TYPE(sllist), POINTER :: child => NULL()
CHARACTER(len=:), ALLOCATABLE :: key, val
CONTAINS
PROCEDURE :: put  => put_sll
PROCEDURE :: get  => get_sll
PROCEDURE :: free => free_sll
END TYPE sllist

TYPE hash_tbl_sll
TYPE(sllist), DIMENSION(:), ALLOCATABLE :: vec
INTEGER                                 :: vec_len = 0
LOGICAL                                 :: is_init = .FALSE.
CONTAINS
PROCEDURE :: init => init_hash_tbl_sll
PROCEDURE :: put  => put_hash_tbl_sll
PROCEDURE :: get  => get_hash_tbl_sll
PROCEDURE :: free => free_hash_tbl_sll
END TYPE hash_tbl_sll

PUBLIC :: hash_tbl_sll

Here we see the two derived types the hash table is built from, and the type-bound procedures (methods) each of these data objects has1.

The first object, sllist is a singly linked list, with allocatable scalar CHARACTER type key and value components. (The string lengths may be allocated at runtime.) This type has 3 methods: 1. put to put (and create if necessary) the key-value pair into the appropriate list element. 1. get to retrieve the string (value) corresponding to the key, if the key-value pair exist. 1. free to completely destroy the linked list.

The second object is the hash table. It contains elements to store some meta-data like the vector length (important for hashing the keys) and the state of the object, as well as the array of singly linked lists, vec. The methods associated with this object are: 1. init to initialize the object by allocating vec to be a certain length, and keeping track of the object state and meta-data. 1. put to hash the key and store the value in the proper element of the appropriate linked list, creating it if necessary. 1. get to hash the key and retrieve the associated value, if the key-value pair exists. 1. free to destroy the hash table, freeing up memory and keeping track of meta-data and object state.

Below is a schematic of an instantiated hash table object, as implemented in this library:

The rest of the module, including all the procedures follows below.

CONTAINS

RECURSIVE SUBROUTINE put_sll(list,key,val)
CLASS(sllist),    INTENT(inout) :: list
CHARACTER(len=*), INTENT(in)    :: key, val
INTEGER                         :: keylen, vallen

keylen = LEN(key)
vallen = LEN(val)
IF (ALLOCATED(list%key)) THEN
IF (list%key /= key) THEN
IF ( .NOT. ASSOCIATED(list%child) ) ALLOCATE(list%child)
CALL put_sll(list%child,key,val)
END IF
ELSE
IF (.NOT. ALLOCATED(list%key)) &
ALLOCATE(CHARACTER(len=keylen) :: list%key)
list%key = key
IF (ALLOCATED(list%val)) DEALLOCATE(list%val)
ALLOCATE(CHARACTER(len=vallen) :: list%val)
list%val = val
END IF
END SUBROUTINE put_sll

RECURSIVE SUBROUTINE get_sll(list,key,val)
CLASS(sllist),                 INTENT(in)    :: list
CHARACTER(len=*),              INTENT(in)    :: key
CHARACTER(len=:), ALLOCATABLE, INTENT(out)   :: val
INTEGER                                      :: vallen

vallen = 0
IF (ALLOCATED(list%key) .AND. (list%key == key)) THEN
vallen = LEN(list%val)
IF (ALLOCATED(val)) DEALLOCATE(val)
ALLOCATE(CHARACTER(len=vallen) :: val)
val = list%val
ELSE IF(ASSOCIATED(list%child)) THEN ! keep going
CALL get_sll(list%child,key,val)
ELSE ! At the end of the list, no key found
IF (ALLOCATED(val)) DEALLOCATE(val) ! Exit indication
RETURN
END IF
END SUBROUTINE get_sll

RECURSIVE SUBROUTINE free_sll(list)
CLASS(sllist), INTENT(inout) :: list
IF (ASSOCIATED(list%child)) THEN
CALL free_sll(list%child)
DEALLOCATE(list%child)
END IF
list%child => NULL()
IF (ALLOCATED(list%key)) DEALLOCATE(list%key)
IF (ALLOCATED(list%val)) DEALLOCATE(list%val)
END SUBROUTINE free_sll

The above procedures are bound to the singly linked list objects. The first dummy argument is the object it is bound to, and is passed automatically whenever the procedure is invoked via the object. Since this is a singly linked list we can only traverse the list in one direction, and we can define the methods associated with the list relatively tersely using direct recursion.

  SUBROUTINE init_hash_tbl_sll(tbl,tbl_len)
CLASS(hash_tbl_sll),   INTENT(inout) :: tbl
INTEGER,     OPTIONAL, INTENT(in)    :: tbl_len

IF (ALLOCATED(tbl%vec)) DEALLOCATE(tbl%vec)
IF (PRESENT(tbl_len)) THEN
ALLOCATE(tbl%vec(0:tbl_len-1))
tbl%vec_len = tbl_len
ELSE
ALLOCATE(tbl%vec(0:tbl_size-1))
tbl%vec_len = tbl_size
END IF
tbl%is_init = .TRUE.
END SUBROUTINE init_hash_tbl_sll

! The first part of the hashing procedure using the string
! collating sequence
ELEMENTAL FUNCTION sum_string(str) RESULT(sig)
CHARACTER(len=*), INTENT(in)   :: str
INTEGER                        :: sig
CHARACTER, DIMENSION(LEN(str)) :: tmp
INTEGER :: i

FORALL (i=1:LEN(str))
tmp(i) = str(i:i)
END FORALL
sig = SUM(ICHAR(tmp))
END FUNCTION sum_string

SUBROUTINE put_hash_tbl_sll(tbl,key,val)
CLASS(hash_tbl_sll), INTENT(inout) :: tbl
CHARACTER(len=*),    INTENT(in)    :: key, val
INTEGER                            :: hash

hash = MOD(sum_string(key),tbl%vec_len)
CALL tbl%vec(hash)%put(key=key,val=val)
END SUBROUTINE put_hash_tbl_sll

SUBROUTINE get_hash_tbl_sll(tbl,key,val)
CLASS(hash_tbl_sll),           INTENT(in)    :: tbl
CHARACTER(len=*),              INTENT(in)    :: key
CHARACTER(len=:), ALLOCATABLE, INTENT(out)   :: val
INTEGER                                      :: hash

hash = MOD(sum_string(key),tbl%vec_len)
CALL tbl%vec(hash)%get(key=key,val=val)
END SUBROUTINE get_hash_tbl_sll

SUBROUTINE free_hash_tbl_sll(tbl)
CLASS(hash_tbl_sll), INTENT(inout) :: tbl
INTEGER     :: i, low, high

low  = LBOUND(tbl%vec,dim=1)
high = UBOUND(tbl%vec,dim=1)
IF (ALLOCATED(tbl%vec)) THEN
DO i=low,high
CALL tbl%vec(i)%free()
END DO
DEALLOCATE(tbl%vec)
END IF
tbl%is_init = .FALSE.
END SUBROUTINE free_hash_tbl_sll

END MODULE hashtbl

The procedures defined above operate on the hash table object. Because of the high degree of encapsulation and abstraction implementing these procedures is relatively easy; we can use the methods defined for the singly linked list.

### Test Program

This test program exercises the library defined above, and tests it for correctness. This is not very thorough, but it is a decent start.

! Test program for module hashtbl implementing an OO hash table (dictionary)
! in Fortran 2003. Compiles and runs with accompanying test program under
! the Intel Fortran Compiler, version 11.1.046

! Copyright (c) Izaak Beekman 2010

! This program is free software: you can redistribute it and/or modify
! the Free Software Foundation, either version 3 of the License, or
! (at your option) any later version.

! This program is distributed in the hope that it will be useful,
! but WITHOUT ANY WARRANTY; without even the implied warranty of
! MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
! GNU Lesser General Public License for more details.

! You should have received a copy of the GNU Lesser General Public License
! along with this program.  If not, see <http://www.gnu.org/licenses/>.

PROGRAM test_hashtbl
USE hashtbl
IMPLICIT NONE
TYPE(hash_tbl_sll)            :: table
CHARACTER(len=:), ALLOCATABLE :: out
INTEGER, parameter :: tbl_length = 100
INTEGER            :: sum, i, rand_int1, rand_int2 ! 4 byte integer, hopefully
REAL               :: rand
CHARACTER(len=4)   :: rand_str1, rand_str2 ! each char should be 1 byte

PRINT*, ' '
PRINT*, 'This program is free software: you can redistribute it and/or &
&modify it under the terms of the GNU Lesser General Public License&
& as published by the Free Software Foundation, either version 3 of&
PRINT*, ' '
PRINT*, 'This program is distributed in the hope that it will be useful,&
& but WITHOUT ANY WARRANTY; without even the implied warranty of&
& MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the &
&GNU Lesser General Public License for more details.'
PRINT*, ' '
PRINT*, 'You should have received a copy of the GNU Lesser General Public &
&License along with this program.  If not, see &
PRINT*, ' '

CALL table%init(tbl_length)
CALL table%put(key='first_name', val='John')
PRINT*, 'Hash: ', MOD(sum_string('first_name'),tbl_length)
CALL table%put(key='last_name', val='Smith')
PRINT*, 'Hash: ', MOD(sum_string('last_name'),tbl_length)
CALL table%put(key='birthday', val='July 30, 1964')
PRINT*, 'Hash: ', MOD(sum_string('birthday'),tbl_length)
CALL table%put(key='hair_color', val='brown')
PRINT*, 'Hash: ', MOD(sum_string('hair_color'),tbl_length)
CALL table%put(key='eye_color', val='brown')
PRINT*, 'Hash: ', MOD(sum_string('eye_color'),tbl_length)
CALL table%put(key='weight', val='213 lbs')
PRINT*, 'Hash: ', MOD(sum_string('weight'),tbl_length)
CALL table%put(key='height', val='6''3"')
PRINT*, 'Hash: ', MOD(sum_string('height'),tbl_length)

PRINT*, ' '

CALL table%get(key='first_name',val=out)
PRINT*, out
CALL table%get('last_name',out)
PRINT*, out
CALL table%get('birthday',out)
PRINT*, out
CALL table%get('hair_color',out)
PRINT*, out
CALL table%get('eye_color',out)
PRINT*, out
CALL table%get('weight',out)
PRINT*, out
CALL table%get('height',out)
PRINT*, out

! INCLUDE 'stress_test.f90'

PRINT*, ' '

sum = 0
PRINT*, 'Indices of the hash table with content:'
DO i = LBOUND(table%vec,dim=1), UBOUND(table%vec,dim=1)
IF (ALLOCATED(table%vec(i)%key)) THEN
PRINT*, i
sum = sum + 1
END IF
END DO
PRINT*, 'Total used elements:', sum
CALL table%free
PRINT*, ' '
STOP 0
END PROGRAM test_hashtbl

The stress test from the INCLUDE statement above, stress_test.f90, can be seen below. It should be noted that these tests fail. We can check that the reason of this failure comes from the RANDOM_NUMBER(.) function, which repeats a value of rand_int1 coupled with two different values of rand_int2. This fact can be seen by printing the list of nodes of table%vec(idx) when out/=rand_str2, where idx is a hash index of rand_str1.

  ! Included file for test_hashtbl.f90
PRINT*, ' '

sum = 0
PRINT*, 'Mild stress test.'
DO i = 1,2000 ! 4byte integers default on most systems
CALL RANDOM_NUMBER(rand)
rand_int1 = NINT(rand*1000)
rand_str1 = TRANSFER(rand_int1,rand_str1)
CALL RANDOM_NUMBER(rand)
rand_int2 = NINT(rand*1000)
rand_str2 = TRANSFER(rand_int2,rand_str2)
CALL table%put(key=rand_str1, val=rand_str2)
CALL table%get(key=rand_str1,val=out)
IF (TRANSFER(out,rand_int2) /= rand_int2) THEN
PRINT*, 'Error, i=',i,' key=',rand_int1,' Val=',rand_int2,' &
&Out=',TRANSFER(out,rand_int2)
sum = sum + 1
END IF
END DO
PRINT*, 'Number of errors:',sum

A call graph with profiling information can be generated. This can be seen below, but the profiling information is useless because the test program is not long enough (i.e. it doesn’t call the various subroutines as many times as is needed for meaningful profiling information to be produced).

This image can be made using the build instructions in the following section, using gprof and gprof2dot.py.

### Build Instructions

The following is a GNU Make makefile for building the library, building the test program, profiling the library and test program, building the call graph, and building the hash table data structure schematic.

# Make file for hash table (dictionary) example.  Makes a call graph with profiling information and
# a schematic of the data structure.

# Copyright (c) 2010 Izaak Beekman

# This program is free software: you can redistribute it and/or modify
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.

# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU Lesser General Public License for more details.

# You should have received a copy of the GNU Lesser General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.

# Compilers and flags
FC = ifort
FCFLAGS = -g -traceback -fno-omit-frame-pointer -p -fno-inline -fno-inline-functions #-ipo -O3
LDFLAGS = $(FCFLAGS) #-fast FCSYNCHK = -syntax-only -warn # Build rules COMPILE.f90 =$(FC) $(FCFLAGS) -c$<

LINK.f90    = $(FC)$(LDFLAGS) -o $@$^

# Portability macros
RM = rm
DOT = dot
PROFILER = gprof

# Compilation pattern rules, may need to override builtins
# For GNU Make
MAKE = $(MAKE) -R # to override %.o: %.f90$(COMPILE.f90)

%: %.o
$(LINK.f90) %.mod: %.f90 # Just lowercase module names$(FC) $(FCSYNCHK)$<

# Patern rule for making PNGs from graphviz dot files
%.png: %.dot
$(DOT) -Tpng <$< > $@ %.dot: % ./$<  $(ARGS)$(PROFILER) ./$< |$(GPROF2DOT) > $@ # Default target, call graph of test_hashtbl test_hashtbl.png: .PRECIOUS: test_hashtbl.dot .PRECIOUS:test_hashtbl test_hashtbl: modhashtbl.o call_graph.dot: test_hashtbl .PHONY: test_hashtbl.mod test_hashtbl.mod: modhashtbl.mod # Build the image of our dictionary/hash-table data type data_struct.png: 88x31-CC-by-sa.png .PHONY: clean clean: -$(RM) -f *.o *.mod

In the stress test case, there are two possible problems:

1. Keys are generated by RANDOM_NUMBER(), so same keys could be put into the hash table. Repetivie keys are not well-treated in this module. In the subroutine put_sll(list, key, val), there is not code to treat the value with the same key appeared before.

2. Another risk is the use of transfer function. Although the problem may not appear in this test case, it is important to carefully choose the size of the transferred type. Also, my personal experience suggests that it might be better to use transfer(key, your original key type) of the module in certian if statements. For example, in the module, we have:

IF (list%key /= key) THEN
IF ( .NOT. ASSOCIATED(list%child) ) ALLOCATE(list%child)
CALL put_sll(list%child,key,val)
END IF

If your original key is an integer, it might be better to use following code:

IF (transfer(list%key, integer) /= transfer(key, integer)) THEN
IF ( .NOT. ASSOCIATED(list%child) ) ALLOCATE(list%child)
CALL put_sll(list%child,key,val)
END IF

1. Finalization should also be added to these objects, but I am unsure of compiler support for this feature as of June 5th, 2010.