Dictionaries
RedBlack.RBDict
($\S{3.3}$)
PureFun.RedBlack.RBDict
— TypeRedBlack.RBDict{O,K,V} where O
RedBlack.RBDict{K,V}(ord=Base.Order.Forward)
RedBlack.RBDict(iter, o::Ordering=Base.Order.Forward)
Immutable dictionary implemented using a red-black tree (balanced binary search tree). All major operations are $\mathcal{O}(\log{}n)$. Note the ordering parameter, the RBDict iterates in sorted order according to the ordering O
. In addition to the main PFDict
methods, RBDict
implements delete
, popmin
, and popmax
.
Examples
julia> using PureFun, PureFun.RedBlack
julia> f = RedBlack.RBDict(("zyz" => 1, "abc" => 2, "ghi" => 3))
PureFun.RedBlack.RBDict{Base.Order.ForwardOrdering, String, Int64} with 3 entries:
"abc" => 2
"ghi" => 3
"zyz" => 1
julia> b = RedBlack.RBDict(("zyz" => 1, "abc" => 2, "ghi" => 3), Base.Order.Reverse)
PureFun.RedBlack.RBDict{Base.Order.ReverseOrdering{Base.Order.ForwardOrdering}, String, Int64} with 3 entries:
"zyz" => 1
"ghi" => 3
"abc" => 2
julia> popmin(f)
PureFun.RedBlack.RBDict{Base.Order.ForwardOrdering, String, Int64} with 2 entries:
"ghi" => 3
"zyz" => 1
julia> popmax(f)
PureFun.RedBlack.RBDict{Base.Order.ForwardOrdering, String, Int64} with 2 entries:
"abc" => 2
"ghi" => 3
julia> popmin(b)
PureFun.RedBlack.RBDict{Base.Order.ReverseOrdering{Base.Order.ForwardOrdering}, String, Int64} with 2 entries:
"ghi" => 3
"abc" => 2
julia> popmax(b)
PureFun.RedBlack.RBDict{Base.Order.ReverseOrdering{Base.Order.ForwardOrdering}, String, Int64} with 2 entries:
"zyz" => 1
"ghi" => 3
julia> delete(b, "ghi")
PureFun.RedBlack.RBDict{Base.Order.ReverseOrdering{Base.Order.ForwardOrdering}, String, Int64} with 2 entries:
"zyz" => 1
"abc" => 2
# forward-ordered by default, so:
julia> d = RedBlack.RBDict{Base.Order.ForwardOrdering}{String,Int}();
julia> d === RedBlack.RBDict{String,Int}()
true
Tries
($\S{10.3.1}$)
These tries use path compression (exercise 10.10, although using the "Compressed Trie with digit numbers" variant presented here), resulting in compact and efficient dictionaries
PureFun.Tries.@trie
— Macro@trie(Name, edgemap = ...)
Trie.@trie(Name, edgemap = ..., keyfunc = ...)
Tries are defined by an optional keyfunc
, which takes keys and returns an iterator of simpler keys, and the edgemap
, which maps the simpler keys to subtries. if keyfunc
is not specified it will be set to the identity function for all types except for Strings, for which it is codeunits
(in order to properly handle variable width character encodings). Once Name
has been defined, it can be used like any other PureFun.PFDict
Examples
julia> using PureFun.Tries
julia> @trie(SimpleMap, edgemap=PureFun.Association.List)
julia> s = SimpleMap{String,Int}()
SimpleMap{String, Int64}()
julia> setindex(s, 42, "hello world")
SimpleMap{String, Int64}(...):
"hello world" => 42
julia> SimpleMap(c => c for c in 'a':'e')
SimpleMap{Char, Char}(...):
'e' => 'e'
'd' => 'd'
'c' => 'c'
'b' => 'b'
'a' => 'a'
The edgemap
can be any data structure which implements the PFDict interface: it iterates pairs, has get
, setindex
, and isempty
methods, and an empty constructor that has the signature edgemap{K,V}()
. A tricky detail is that edgemap{K,V}
should be a concrete type for concrete K
and V
, something you have to account for when defining the trie type if your edgemap dictionary type has extra type parameters. For example, here we must specify the ordering parameter for our RedBlack dictionary in order to use it as the edgemap
:
julia> @trie(RedBlackMap,
edgemap = PureFun.RedBlack.RBDict{Base.Order.ForwardOrdering})
julia> RedBlackMap(("hello" => "world", "reject" => "fascism"))
RedBlackMap{String, String}(...):
"hello" => "world"
"reject" => "fascism"
Cache-efficient bitmap tries
PureFun.Contiguous.biterate
breaks single integer keys into iterators of smaller integer keys. PureFun.Contiguous.bitmap
is a fast dictionary for small integer keys. By combining them, we end up with a bitmap-trie
julia> @trie(BitMapTrie,
edgemap = PureFun.Contiguous.bitmap(16),
keyfunc = PureFun.Contiguous.biterate(4))
julia> BitMapTrie(x => Char(x) for x in rand(UInt16, 5))
BitMapTrie{UInt16, Char}(...):
0x5f60 => '彠'
0xce8b => '캋'
0x92c6 => '鋆'
0xa2ce => 'ꋎ'
0xadff => '귿'
julia> @trie(BitMapTrie64,
edgemap = PureFun.Contiguous.bitmap(64),
keyfunc = PureFun.Contiguous.biterate(6))
julia> b = BitMapTrie64(x => 2x for x in 1:1_000)
BitMapTrie64{Int64, Int64}(...):
64 => 128
128 => 256
192 => 384
256 => 512
320 => 640
384 => 768
448 => 896
512 => 1024
576 => 1152
640 => 1280
⋮ => ⋮
julia> b[13]
26
Tries themselves can be edgemaps for other tries:
julia> @trie(BitMapTrie,
edgemap = PureFun.Contiguous.bitmap(16),
keyfunc = PureFun.Contiguous.biterate(4))
julia> PureFun.Tries.@trie StringTrie Main.BitMapTrie codeunits
julia> StringTrie(("hello" => 1, "world" => 2))
StringTrie{String, Int64}(...):
"world" => 2
"hello" => 1
HashMap
: $\S{10.3.1}$ exercise 10.11
PureFun.HashMaps.@hashmap
— MacroHashMaps.@hashmap(Name, approx = ..., exact = ...)
HashMaps.@hashmap(Name, approx = ..., exact = ..., hashfunc = ...)
From exercise 10.11 in $\S{10.3.1}$:
Another common data structure that involves multiple layers of finite maps is the hash table. Complete the following implementation . . .
functor HashTable(structure Approx : FiniteMap structure Exact : FiniteMap val hash : Exact.Key → Approx.Key) : FiniteMap = struct type Key = Exact.Key type α Map = α Exact.Map Approx.Map ... fun lookup(k,m) = Exact.lookup(k, Approx.lookup(hash k, m)) end
The advantage of this representation is that
Approx
can use an efficient key type (such as integers) andExact
can use a trivial implementation (such as association lists)
To define a hash map, provide a container type for approx
, one for exact
, and, optionally, a hash function. To avoid confusion, the container type arguments must be named.
Examples
The dictionary types for the approx and exact maps can be any type DictType
that satisfies the following:
DictType{K,V}
describes a concrete typeDictType{K,V}()
initializes an empty dictionary for keys of typeK
and values of typeV
- has methods for
get(::DictType, key, default)
andsetindex(::DictType, val, key)
In the following example, we have to specify the ordering parameter for the red-black dictionary, so that it conforms to the required specifications. Note also you might need to fully specify object names including their module (so e.g. Main.MyApproxMap
rather than MyApproxMap
). You can get around this restriction by defining const
type aliases, as in this example:
julia> using PureFun, PureFun.HashMaps
julia> const RB = PureFun.RedBlack.RBDict{Base.Order.ForwardOrdering}
PureFun.RedBlack.RBDict{Base.Order.ForwardOrdering}
julia> const assoclist = PureFun.Association.List
PureFun.Association.List
julia> HashMaps.@hashmap(MyHashMap, approx = RB, exact = assoclist)
julia> setindex(MyHashMap{String,Int}(), 42, "hello world")
MyHashMap{String, Int64}(...):
"hello world" => 42
julia> MyHashMap(char => Int(char) for char in 'a':'f')
MyHashMap{Char, Int64}(...):
'c' => 99
'd' => 100
'e' => 101
'f' => 102
'b' => 98
'a' => 97
With this general framework for hashmaps as nested finite maps, we can implement the Hash Array Mapped Trie, which features prominently among Clojure's standard data structures, and in FunctionalCollections.jl. We assemble a bitmapped trie to use as our approx
map, and use an association list for the exact
map. For no particular reason except to demonstrate the functionality, here instead of relying on the usual hash function we specify our own hash function (in this case, it is just the hash of the hash):
Examples
julia> using PureFun, PureFun.Tries, PureFun.HashMaps
julia> PureFun.Tries.@trie(BitMapTrie64,
edgemap = PureFun.Contiguous.bitmap(64),
keyfunc = PureFun.Contiguous.biterate(6))
julia> HashMaps.@hashmap(HAMT,
approx = Main.BitMapTrie64,
exact = PureFun.Association.List,
hashfunc = hash ∘ hash)
julia> HAMT{String,Int}()
HAMT{String, Int64}()
julia> HAMT(x => Char(x) for x in 97:105)
HAMT{Int64, Char}(...):
105 => 'i'
99 => 'c'
98 => 'b'
103 => 'g'
102 => 'f'
97 => 'a'
104 => 'h'
100 => 'd'
101 => 'e'
Association.List
PureFun.Association.List
— TypeAssociation.List{K,V}()
Association.List(iter)
A dictionary implemented as a linked list of pairs. Adding/updating new items is very fast, but lookups can be as bad as $\mathcal{O}(n)$. Almost identical to Base.ImmutableDict
, but with a few extra bells and whistles to conform the expected interface for PureFun.PFDict
.
Examples
julia> using PureFun, PureFun.Association
julia> Association.List(k => Char(k) for k in rand(UInt16, 5))
PureFun.Association.List{UInt16, Char} with 5 entries:
0x5d5d => '嵝'
0x6ab9 => '檹'
0x4eb0 => '亰'
0xd55c => '한'
0xd018 => '퀘'
Function reference
Base.setindex
— Methodsetindex(d::PFDict, v, i)
Return a new dictionary with the value at key i
set to v
Base.get
— Methodget(d::PFDict, key, default)
Get the value associated with key
, or return default
.
PureFun.PFDict
— TypeAbstract supertype for immutable dictionaries.