Data types in Slogan fall into three categories: basic types, composite types and user-defined types. Basic types, the topic of this chapter, include numbers, characters, strings, symbols and booleans. Composite types are more complex because they are formed by combining values of several simpler types. Arrays, pairs, lists, hash tables, sets and records are examples of composite types. Slogan also allows the definition of new types that conform to user-specified interfaces.
Slogan classifies numbers as integer, rational, real and complex. This classification is hierarchical, in that all integers are
rational, all rational numbers are real, and all real numbers are complex. Orthogonal to these categories, a number is also either exact or
inexact. In most cases, computations that involve an inexact number will produce an inexact result. One exception to this rule is
multiplying an inexact number by the exact 0, which will produce an exact number. Operations that mathematically produce
irrational numbers for some rational arguments (e.g., sqrt) may produce inexact results even for exact arguments.
There are predicates[1] that can be used to determine the specific type of a number.
is_integer(123)
// true
is_real(123)
// true
is_real(1/23)
// true
is_integer(1/23)
// false
is_number(1/23)
// true
is_number(123)
// true
is_rational(1/23)
// true
Exact integer and rational arithmetic is supported to arbitrary precision; the size of an integer or of the denominator or numerator of a ratio is limited only by system storage constraints.
Slogan numbers are written in a straightforward manner not much different from ordinary conventions for writing numbers. An exact
integer is normally written as a sequence of numerals preceded by an optional sign. For example, 3, +19, -100000, and 208423089237489374 all represent exact integers.
An exact rational number is normally written as two sequences of numerals separated by a slash (/) and preceded by an
optional sign. For example, 3/4, -6/5, and 1/1208203823 are all exact rational numbers.
A ratio is reduced immediately to lowest terms when it is read and may in fact reduce to an exact integer.
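The reduction rule can be checked with a short sketch. It uses only the `==` operator and the predicates shown elsewhere in this chapter, and the results are what the rule above implies rather than output copied from a session.

```slogan
// a ratio is reduced to lowest terms when it is read
6/8 == 3/4
// true
// a ratio that reduces to a whole number is an exact integer
is_integer(4/2)
// true
is_exact(4/2)
// true
```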
Inexact real numbers are normally written in either floating-point or scientific notation. Floating-point notation consists of a
sequence of numerals followed by a decimal point and another sequence of numerals, all preceded by an optional sign.
Scientific notation consists of an optional sign, a sequence of numerals, an optional decimal point followed by a second
string of numerals, and an exponent; an exponent is written as the letter e followed by an optional sign and a sequence of
numerals. For example, 1.0 and -200.0 are valid inexact integers, and 1.5, 0.034, -10e-10 and 1.5e-5 are valid inexact rational numbers. The exponent is the power of ten by which the
number preceding the exponent should be scaled, so that 2e3 is equivalent to 2000.0.
Exact and inexact real numbers are written as exact or inexact integers or rational numbers; no provision is made in the syntax of Slogan numbers for non-rational real numbers, i.e., irrational numbers.
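The notation rules above can be exercised as follows. This is a small sketch that relies only on predicates already introduced in this chapter; the expected values follow from the text.

```slogan
// the exponent scales the preceding number by a power of ten
2e3 == 2000.0
// true
// numbers written with a decimal point or an exponent are inexact
is_inexact(2e3)
// true
is_inexact(1.5)
// true
// 1.0 is an inexact integer
is_integer(1.0)
// true
is_exact(1.0)
// false
```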
The exactness of a numeric representation may be overridden by preceding the constant with either 0e or 0i.
0e forces the number to be exact, and 0i forces it to be inexact. For example, 1, 0e1, 1/1, 0e1/1, 0e1.0, and 0e1e0 all represent the exact integer 1, and 0i3/10, 0.3, 0i0.3, and 3e-1 all represent the inexact rational 0.3.
is_exact(123 * 100)
// true
is_exact(123 * 100.0)
// false
is_inexact(123 * 100.0)
// true
1 == 1.0
// false
1 == 0e1.0
// true
inexact(1) == 1.0
// true
0i1 == 1.0
// true
0i1 == exact(1.0)
// false
Numbers are written by default in base 10, although the special prefixes 0b (binary), 0o (octal), 0d (decimal), and 0x (hexadecimal) can be used to specify base 2, base 8, base 10, or base 16.
For radix 16, the letters a through f or A through F serve as the additional numerals required to express digit values 10
through 15. For example, 0b10101 is the binary equivalent of decimal 21, 0o72 is the octal equivalent of decimal 58, and 0xC7 is the hexadecimal equivalent of decimal 199.
Numbers written in floating-point and scientific notations are always written in base 10.
Underscores may be added to a number to improve readability. For example, the integer 1234567 could be formatted as 1_23_4567.
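The radix prefixes and underscore separators can be compared against their plain decimal forms. The sketch below assumes nothing beyond the literal syntax described above, plus `number_to_string` with a base argument, which is introduced later in this chapter.

```slogan
0b10101 == 21
// true
0o72 == 58
// true
0xC7 == 199
// true
// hexadecimal digits may be written in either case
0xc7 == 0xC7
// true
// underscores do not change the value
1_23_4567 == 1234567
// true
// number_to_string with a base argument goes the other way
number_to_string(21, 2)
// 10101
```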
Complex number literals take the form R+Ii, where R is the real part and I is the imaginary part. For example: 3+7i.
There are functions that correspond to the arithmetic and comparison operators. These functions can accept an arbitrary number of arguments.
add(1,2,3,4,5)
// 15
number_is_lt(1,2,3,4,5)
// true
number_is_lt(1,2,3,40,5)
// false
number_is_lt(1,2,3,4,4)
// false
number_is_lteq(1,2,3,4,4)
// true
mult(20, 30, 40)
// 24000
In this section we will discuss functions that perform bitwise binary operations on integers. Some of the most useful of these functions are listed below:
band | bitwise AND |
bior | bitwise inclusive OR |
bxor | bitwise exclusive OR |
bnot | bitwise NOT |
bshift | left/right arithmetic shift |
is_bit_set | predicate to test bit by position |
If the number of bits to shift is negative, bshift performs a right-shift. Otherwise, the bits are shifted left.
Bitwise operations assume that integers are represented in two's complement, even if they are not represented that way internally.
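A few small calls make the table concrete. This is a sketch built from the function descriptions above; the argument order of `is_bit_set` (value first, then bit position) matches its use later in this chapter, and the result of `bnot` follows from the two's complement assumption.

```slogan
band(12, 10)     // 1100 AND 1010
// 8
bior(12, 10)     // 1100 OR 1010
// 14
bxor(12, 10)     // 1100 XOR 1010
// 6
bshift(1, 4)     // positive count: shift left
// 16
bshift(16, -2)   // negative count: shift right
// 4
bnot(0)          // all bits set in two's complement
// -1
is_bit_set(5, 0) // 5 is 101 in binary
// true
is_bit_set(5, 1)
// false
```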
The following program shows how to interpret an integer as a compact set of
independent bits.[2] Note that we make use of only the first 32 bits of the
integer, while the underlying value may have more bits. To view the binary representation of an integer,
the built-in number_to_string function is called. It takes an optional second argument
that specifies the base in which the result string should be formatted. To get a binary formatted string,
we pass 2 as this argument.
function turn_bit_on(bits, i)
  if (i <= 31) bior(bits, bshift(1, i))
  else bits
let a, b = turn_bit_on(0, 31), turn_bit_on(0, 31)
number_to_string(a, 2)
// 10000000000000000000000000000000
number_to_string(b, 2)
// 10000000000000000000000000000000
a = turn_bit_on(turn_bit_on(a, 1), 5)
b = turn_bit_on(turn_bit_on(b, 1), 2)
number_to_string(a, 2)
// 10000000000000000000000000100010
number_to_string(b, 2)
// 10000000000000000000000000000110
number_to_string(band(a, b), 2) // intersection
// 10000000000000000000000000000010
number_to_string(bior(a, b), 2) // union
// 10000000000000000000000000100110
is_bit_set(a, 1) // membership test
// true
is_bit_set(a, 10)
// false
Fixnums represent exact integers within a closed range [-2^(w-1), 2^(w-1) - 1], where w is the fixnum width.
The implementation-specific value of w can be determined via the function fixnum_width, and the endpoints of
the range may be determined via the functions least_fixnum and greatest_fixnum.
The names of arithmetic procedures that operate only on fixnums begin with the prefix "fx" to set them apart from their generic counterparts. The following example demonstrates some of the most useful operations on fixnums:
fxadd(1, 21)
// 22
fxadd(1, greatest_fixnum())
//> error: FIXNUM overflow
fx_is_eq(1, 1)
// true
fx_is_gt(1, 2)
// false
fx_is_gt(10, 2)
// true
fx_is_lteq(10, 2)
// false
fx_is_lteq(10, 10)
// true
fxsub(20, 32)
// -12
fxmult(20, 32)
// 640
fxdiv(20, 32)
// 0
fxdiv(20, 2)
// 10
Bit and shift operations on fixnums assume that fixnums are represented in two's complement, even if they are not represented that way internally.
number_to_string(fxior(4294967296, fxshift(1, 2)), 2)
// 100000000000000000000000000000100
number_to_string(fxior(4294967296, fxshift(fxshift(1, 2), -3)), 2)
// 100000000000000000000000000000000
Flonums are inexact real numbers. Implementations typically use the IEEE double-precision floating-point representation for flonums. Flonum-specific function names begin with the prefix "fl" to set them apart from their generic counterparts.
fladd(1.2, 4.5)
// 5.7
flmult(1.2, 4.5)
// 5.3999999999999995
fl_is_eq(0, 0.)
//> error: (Argument 1) FLONUM expected
fl_is_eq(0., 0.)
// true
fl_is_lt(-1., 0.)
// true
Characters are atomic objects representing letters, digits, special symbols
such as $ or #, and certain non-graphic control characters such as space and newline. Character literals are written
with the \ prefix. For example, the character A is written in Slogan source code as \A.
The following are special literals that represent non-graphic characters:
\newline
\return
\tab
\space
\backspace
\alarm
\vtab
\esc
\delete
\nul
Any Unicode character may be written with the syntax \xhh, \uhhhh or \Uhhhhhhhh, where each h is a hexadecimal digit and the two, four or eight digits together represent a valid Unicode scalar value.
All the comparison operators are overloaded to work with characters:
\A == \A
// true
\A == \a
// false
\A == char_upcase(\a)
// true
\c > \b
// true
There are many predicates useful for finding information about characters and for comparing them:
is_char(\A)
// true
is_char(65)
// false
is_char(integer_to_char(65))
// true
char_is_numeric(\2)
// true
char_is_alphabetic(\e)
// true
char_is_lower_case(\e)
// true
char_is_eq(\a, \a) // `==` optimized to work with characters
// true
char_is_lteq(\a, \b) // `<=` optimized to work with characters
// true
Let us write a new predicate for characters that returns true if its argument is a vowel.
This function will also introduce you to the case expression.
function is_vowel(c) case (c) \a -> true | \e -> true | \i -> true | \o -> true | \u -> true | else -> false
The case expression evaluates an expression and compares its value to those in a list of clauses.
This comparison is done using the is_eq function, which checks whether two values are stored in the same location in memory. On a successful match, the value of the clause is
returned. An optional else clause can be defined to return a default value if all matches fail.
Multiple clauses that return the same value can be compressed into a single list. In is_vowel the else clause can also be omitted because
the default value of case is false. These two points lead to the following rewrite of the function:
function is_vowel(c)
  case (c) [\a, \e, \i, \o, \u] -> true
// Usage:
is_vowel(\o)
// true
is_vowel(\b)
// false
is_vowel(\a)
// true
A string is a sequence of characters enclosed in double-quotes. Slogan supports the Unicode standard, which means Slogan strings can represent scripts from all of the world's written languages. The following are examples of valid string literals:
"hello, world"
// a string may span multiple lines.
"this is a really
long message..."
"ἐγὼ εἰμί"
Double-quotes inside a string must be escaped by a backslash.
"he said: \"hello, there\""
// he said: "hello, there"
The escape sequences that can appear in a string literal, and their purposes, are listed below:
\n newline
\t tab
\r return
\\ backslash
\b backspace
\a alarm
\v vertical-tab
\" double-quote
\e escape
\d delete
\0 nul
\u unicode character encoded in 4 hexadecimal digits
\x unicode character encoded in 2 hexadecimal digits
\U unicode character encoded in 8 hexadecimal digits
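Each escape sequence in the list above stands for a single character in the resulting string. The sketch below illustrates this with `string_length` and `string_at`, both of which are demonstrated later in this chapter; the results are inferred from the descriptions here and have not been copied from a session.

```slogan
// an escape sequence counts as one character
string_length("a\tb")
// 3
// the character at index 1 is the tab character
string_at("a\tb", 1) == \tab
// true
// an escaped double-quote is one character too
string_length("\"")
// 1
```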
Let us familiarize ourselves with some useful operations on strings:
let s = "For all its power, the computer is a harsh taskmaster."
// accessing individual characters by index:
string_at(s, 2)
// \r
s[2]
// \r
// slicing or extracting sub-strings:
substring(s, 4, 17)
// all its power
s[4:17]
// all its power
s[4:]
// all its power, the computer is a harsh taskmaster.
s[:17]
// For all its power
string_length(s[:17])
// 17
/* `count` is more generic than `string_length`, it can
also find the length of other "collections" of data, like arrays and lists. */
count(s)
// 54
// searching:
string_index_of(s, ",")
// 17
string_index_of(s, "computer")
// 23
string_append("abc", "def", "xyz")
// abcdefxyz
// split the string at commas and spaces:
string_split(s, [\,, \space])
// [For, all, its, power, the, computer, is, a, harsh, taskmaster.]
strings_join("-", string_split(s, [\,, \space]))
// For-all-its-power-the-computer-is-a-harsh-taskmaster.
// comparisons
string_is_eq("abc", "abc")
// true
"abc" == "abc"
// true
string_is_eq("aBC", "abc")
// false
string_is_ci_eq("aBC", "abc")
// true
string_is_lt("abc", "xyz")
// true
"abc" < "xyz"
// true
"abc" >= "abc"
// true
Symbols are used for a variety of purposes as symbolic names in Slogan programs. Symbol constants are written by prefixing identifiers
with the quote mark ('). All characters valid in identifiers can be used in symbols. Symbols with spaces and special characters
are written by enclosing the symbol in tick (`) quotes. The following are all valid symbols:
'abc
'$abc
'`abc def`
'`abc+def`
Strings could be used for most of the same purposes, but an important characteristic of symbols makes comparisons for equality
much more efficient: two symbols with the same sequence of characters are stored in the same memory location. This
makes it possible to test them for equality with the is_eq function, which does a fast check of whether its arguments point to
the same location in memory. String comparisons, on the other hand, require checking each character in both strings.
let a = "hello"
let b = "hello"
// `==` will compare each character in both strings
a == b
// true
// so does `string_is_eq`
string_is_eq(a, b)
// true
/* `is_eq` only checks if two objects occupy the same
location in memory */
is_eq(a, b)
// false
is_eq(a, a)
// true
// In contrast to strings, two symbols made of the same sequence of characters
// can be efficiently compared for equality just by checking their memory locations.
let x = 'hello
let y = 'hello
let z = 'Hello
is_eq(x, y)
// true
is_eq(x, z)
// false
It is possible to construct new symbols from strings and to convert symbols to strings:
is_eq('hello, string_to_symbol("hello"))
// true
"hello" == symbol_to_string('hello)
// true
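Because symbols with the same characters share one memory location, a symbol built from a string at run time should be the same object as the corresponding literal. The following sketch combines `string_append`, `string_to_symbol` and `symbol_to_string` from this chapter; the variable `name` is introduced only for illustration.

```slogan
// build the string "hello" at run time
let name = string_append("hel", "lo")
// the resulting symbol is the same object as the literal 'hello
is_eq(string_to_symbol(name), 'hello)
// true
// converting back yields a string with the same characters
symbol_to_string(string_to_symbol(name)) == "hello"
// true
// symbols are case-sensitive
is_eq(string_to_symbol("Hello"), 'hello)
// false
```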
[1] A predicate is a function that answers a question with a true or false value.
[2] Slogan has a composite type bitarray that can represent bitmaps of arbitrary sizes. This type will be introduced in the next chapter.