Dictionaries

RedBlack.RBDict ($\S{3.3}$)

PureFun.RedBlack.RBDictType
RedBlack.RBDict{O,K,V} where O
RedBlack.RBDict{K,V}(ord=Base.Order.Forward)
RedBlack.RBDict(iter, o::Ordering=Base.Order.Forward)

Immutable dictionary implemented using a red-black tree (balanced binary search tree). All major operations are $\mathcal{O}(\log{}n)$. Note the ordering parameter, the RBDict iterates in sorted order according to the ordering O. In addition to the main PFDict methods, RBDict implements delete, popmin, and popmax.

Examples

julia> using PureFun, PureFun.RedBlack

julia> f = RedBlack.RBDict(("zyz" => 1, "abc" => 2, "ghi" => 3))
PureFun.RedBlack.RBDict{Base.Order.ForwardOrdering, String, Int64} with 3 entries:
  "abc" => 2
  "ghi" => 3
  "zyz" => 1

julia> b = RedBlack.RBDict(("zyz" => 1, "abc" => 2, "ghi" => 3), Base.Order.Reverse)
PureFun.RedBlack.RBDict{Base.Order.ReverseOrdering{Base.Order.ForwardOrdering}, String, Int64} with 3 entries:
  "zyz" => 1
  "ghi" => 3
  "abc" => 2

julia> popmin(f)
PureFun.RedBlack.RBDict{Base.Order.ForwardOrdering, String, Int64} with 2 entries:
  "ghi" => 3
  "zyz" => 1

julia> popmax(f)
PureFun.RedBlack.RBDict{Base.Order.ForwardOrdering, String, Int64} with 2 entries:
  "abc" => 2
  "ghi" => 3

julia> popmin(b)
PureFun.RedBlack.RBDict{Base.Order.ReverseOrdering{Base.Order.ForwardOrdering}, String, Int64} with 2 entries:
  "ghi" => 3
  "abc" => 2

julia> popmax(b)
PureFun.RedBlack.RBDict{Base.Order.ReverseOrdering{Base.Order.ForwardOrdering}, String, Int64} with 2 entries:
  "zyz" => 1
  "ghi" => 3

julia> delete(b, "ghi")
PureFun.RedBlack.RBDict{Base.Order.ReverseOrdering{Base.Order.ForwardOrdering}, String, Int64} with 2 entries:
  "zyz" => 1
  "abc" => 2

# forward-ordered by default, so:
julia> d = RedBlack.RBDict{Base.Order.ForwardOrdering}{String,Int}();
julia> d === RedBlack.RBDict{String,Int}()
true
source

Tries ($\S{10.3.1}$)

These tries use path compression (exercise 10.10, although using the "Compressed Trie with digit numbers" variant presented here), resulting in compact and efficient dictionaries

PureFun.Tries.@trieMacro
@trie(Name, edgemap = ...)
Trie.@trie(Name, edgemap = ..., keyfunc = ...)

Tries are defined by an optional keyfunc, which takes keys and returns an iterator of simpler keys, and the edgemap, which maps the simpler keys to subtries. if keyfunc is not specified it will be set to the identity function for all types except for Strings, for which it is codeunits (in order to properly handle variable width character encodings). Once Name has been defined, it can be used like any other PureFun.PFDict

Examples

julia> using PureFun.Tries
julia> @trie(SimpleMap, edgemap=PureFun.Association.List)

julia> s = SimpleMap{String,Int}()
SimpleMap{String, Int64}()

julia> setindex(s, 42, "hello world")
SimpleMap{String, Int64}(...):
  "hello world" => 42

julia> SimpleMap(c => c for c in 'a':'e')
SimpleMap{Char, Char}(...):
  'e' => 'e'
  'd' => 'd'
  'c' => 'c'
  'b' => 'b'
  'a' => 'a'

The edgemap can be any data structure which implements the PFDict interface: it iterates pairs, has get, setindex, and isempty methods, and an empty constructor that has the signature edgemap{K,V}(). A tricky detail is that edgemap{K,V} should be a concrete type for concrete K and V, something you have to account for when defining the trie type if your edgemap dictionary type has extra type parameters. For example, here we must specify the ordering parameter for our RedBlack dictionary in order to use it as the edgemap:

julia> @trie(RedBlackMap,
             edgemap = PureFun.RedBlack.RBDict{Base.Order.ForwardOrdering})

julia> RedBlackMap(("hello" => "world", "reject" => "fascism"))
RedBlackMap{String, String}(...):
  "hello"  => "world"
  "reject" => "fascism"

Cache-efficient bitmap tries

PureFun.Contiguous.biterate breaks single integer keys into iterators of smaller integer keys. PureFun.Contiguous.bitmap is a fast dictionary for small integer keys. By combining them, we end up with a bitmap-trie

julia> @trie(BitMapTrie,
             edgemap = PureFun.Contiguous.bitmap(16),
             keyfunc = PureFun.Contiguous.biterate(4))

julia> BitMapTrie(x => Char(x) for x in rand(UInt16, 5))
BitMapTrie{UInt16, Char}(...):
  0x5f60 => '彠'
  0xce8b => '캋'
  0x92c6 => '鋆'
  0xa2ce => 'ꋎ'
  0xadff => '귿'

julia> @trie(BitMapTrie64,
             edgemap = PureFun.Contiguous.bitmap(64),
             keyfunc = PureFun.Contiguous.biterate(6))

julia> b = BitMapTrie64(x => 2x for x in 1:1_000)
BitMapTrie64{Int64, Int64}(...):
  64  => 128
  128 => 256
  192 => 384
  256 => 512
  320 => 640
  384 => 768
  448 => 896
  512 => 1024
  576 => 1152
  640 => 1280
  ⋮   => ⋮

julia> b[13]
26

Tries themselves can be edgemaps for other tries:

julia> @trie(BitMapTrie,
             edgemap = PureFun.Contiguous.bitmap(16),
             keyfunc = PureFun.Contiguous.biterate(4))
julia> PureFun.Tries.@trie StringTrie Main.BitMapTrie codeunits

julia> StringTrie(("hello" => 1, "world" => 2))
StringTrie{String, Int64}(...):
  "world" => 2
  "hello" => 1
source

HashMap: $\S{10.3.1}$ exercise 10.11

PureFun.HashMaps.@hashmapMacro
HashMaps.@hashmap(Name, approx = ..., exact = ...)
HashMaps.@hashmap(Name, approx = ..., exact = ..., hashfunc = ...)

From exercise 10.11 in $\S{10.3.1}$:

Another common data structure that involves multiple layers of finite maps is the hash table. Complete the following implementation . . .

functor HashTable(structure Approx : FiniteMap
                  structure Exact : FiniteMap
                  val hash : Exact.Key → Approx.Key) : FiniteMap =
struct
    type Key = Exact.Key
    type α Map = α Exact.Map Approx.Map
    ...
    fun lookup(k,m) = Exact.lookup(k, Approx.lookup(hash k, m))
end

The advantage of this representation is that Approx can use an efficient key type (such as integers) and Exact can use a trivial implementation (such as association lists)

To define a hash map, provide a container type for approx, one for exact, and, optionally, a hash function. To avoid confusion, the container type arguments must be named.

Examples

The dictionary types for the approx and exact maps can be any type DictType that satisfies the following:

  • DictType{K,V} describes a concrete type
  • DictType{K,V}() initializes an empty dictionary for keys of type K and values of type V
  • has methods for get(::DictType, key, default) and setindex(::DictType, val, key)

In the following example, we have to specify the ordering parameter for the red-black dictionary, so that it conforms to the required specifications. Note also you might need to fully specify object names including their module (so e.g. Main.MyApproxMap rather than MyApproxMap). You can get around this restriction by defining const type aliases, as in this example:

julia> using PureFun, PureFun.HashMaps

julia> const RB = PureFun.RedBlack.RBDict{Base.Order.ForwardOrdering}
PureFun.RedBlack.RBDict{Base.Order.ForwardOrdering}

julia> const assoclist = PureFun.Association.List
PureFun.Association.List

julia> HashMaps.@hashmap(MyHashMap, approx = RB, exact = assoclist)

julia> setindex(MyHashMap{String,Int}(), 42, "hello world")
MyHashMap{String, Int64}(...):
  "hello world" => 42

julia> MyHashMap(char => Int(char) for char in 'a':'f')
MyHashMap{Char, Int64}(...):
  'c' => 99
  'd' => 100
  'e' => 101
  'f' => 102
  'b' => 98
  'a' => 97

With this general framework for hashmaps as nested finite maps, we can implement the Hash Array Mapped Trie, which features prominently among Clojure's standard data structures, and in FunctionalCollections.jl. We assemble a bitmapped trie to use as our approx map, and use an association list for the exact map. For no particular reason except to demonstrate the functionality, here instead of relying on the usual hash function we specify our own hash function (in this case, it is just the hash of the hash):

Examples

julia> using PureFun, PureFun.Tries, PureFun.HashMaps

julia> PureFun.Tries.@trie(BitMapTrie64,
                           edgemap = PureFun.Contiguous.bitmap(64),
                           keyfunc = PureFun.Contiguous.biterate(6))

julia> HashMaps.@hashmap(HAMT,
                         approx   = Main.BitMapTrie64,
                         exact    = PureFun.Association.List,
                         hashfunc = hash ∘ hash)

julia> HAMT{String,Int}()
HAMT{String, Int64}()

julia> HAMT(x => Char(x) for x in 97:105)
HAMT{Int64, Char}(...):
  105 => 'i'
  99  => 'c'
  98  => 'b'
  103 => 'g'
  102 => 'f'
  97  => 'a'
  104 => 'h'
  100 => 'd'
  101 => 'e'
source

Association.List

PureFun.Association.ListType
Association.List{K,V}()
Association.List(iter)

A dictionary implemented as a linked list of pairs. Adding/updating new items is very fast, but lookups can be as bad as $\mathcal{O}(n)$. Almost identical to Base.ImmutableDict, but with a few extra bells and whistles to conform the expected interface for PureFun.PFDict.

Examples

julia> using PureFun, PureFun.Association

julia> Association.List(k => Char(k) for k in rand(UInt16, 5))
PureFun.Association.List{UInt16, Char} with 5 entries:
  0x5d5d => '嵝'
  0x6ab9 => '檹'
  0x4eb0 => '亰'
  0xd55c => '한'
  0xd018 => '퀘'
source

Function reference

Base.setindexMethod
setindex(d::PFDict, v, i)

Return a new dictionary with the value at key i set to v

source
Base.getMethod
get(d::PFDict, key, default)

Get the value associated with key, or return default.

source