# Basic Data Types

Data types in Slogan fall into three categories: basic types, composite types and user-defined types. Basic types, the topic of this chapter, include numbers, characters, strings, symbols and booleans. Composite types are more complex because they are formed by combining values of several simpler types. Arrays, pairs, lists, hash tables, sets and records are examples of composite types. Slogan also allows the definition of new types that conform to user-specified interfaces.

## 4.1 Numbers

Slogan classifies numbers as integer, rational, real and complex. This classification is hierarchical, in that all integers are rational, all rational numbers are real, and all real numbers are complex. Orthogonal to these categories, a number is also either exact or inexact. In most cases, computations that involve an inexact number will produce an inexact result. One exception to this rule is multiplying an inexact number by the exact `0`, which will produce an exact number. Operations that mathematically produce irrational numbers for some rational arguments (e.g., `sqrt`) may produce inexact results even for exact arguments.

There are predicates¹ that can be used to determine the specific type of a number.

``````
is_integer(123)
// true
is_real(123)
// true
is_real(1/23)
// true
is_integer(1/23)
// false
is_number(1/23)
// true
is_number(123)
// true
is_rational(1/23)
// true
``````
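The same hierarchy can be cross-checked outside Slogan. The sketch below is written in Python, not Slogan, and relies on Python's `numbers` abstract base classes, which follow the same integer, rational, real, complex tower. The helper names are hypothetical and only mimic the predicates above. One difference to keep in mind: Python floats count as real but not rational, whereas Slogan treats a literal such as `1.5` as an inexact rational.

```python
import numbers
from fractions import Fraction

# Hypothetical helpers that mimic the Slogan predicates shown above.
def is_integer(x):
    return isinstance(x, numbers.Integral)

def is_rational(x):
    return isinstance(x, numbers.Rational)

def is_real(x):
    return isinstance(x, numbers.Real)

def is_number(x):
    return isinstance(x, numbers.Complex)

ratio = Fraction(1, 23)  # stands in for the Slogan literal 1/23

print(is_integer(123), is_real(123))          # True True
print(is_integer(ratio), is_rational(ratio))  # False True
print(is_real(ratio), is_number(ratio))       # True True
```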

Exact integer and rational arithmetic is supported to arbitrary precision; the size of an integer or of the denominator or numerator of a ratio is limited only by system storage constraints.

Slogan numbers are written in a straightforward manner not much different from ordinary conventions for writing numbers. An exact integer is normally written as a sequence of numerals preceded by an optional sign. For example, `3`, `+19`, `-100000`, and `208423089237489374` all represent exact integers.

An exact rational number is normally written as two sequences of numerals separated by a slash (`/`) and preceded by an optional sign. For example, `3/4`, `-6/5`, and `1/1208203823` are all exact rational numbers. A ratio is reduced immediately to lowest terms when it is read and may in fact reduce to an exact integer.
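Reduction to lowest terms and arbitrary precision can both be seen in a small model. The sketch below is Python, not Slogan, and uses the standard `fractions.Fraction` type as a stand-in for Slogan's exact ratios; it illustrates the behavior described above and says nothing about how Slogan implements it.

```python
from fractions import Fraction

# A ratio is normalized to lowest terms as soon as it is constructed.
reduced = Fraction(6, 8)      # stored as 3/4
collapsed = Fraction(10, 5)   # reduces all the way to the integer 2

# Numerators and denominators grow as needed; nothing overflows.
tiny = Fraction(1, 1208203823) ** 3
huge = 208423089237489374 ** 4

print(reduced, collapsed)                   # 3/4 2
print(tiny.denominator == 1208203823 ** 3)  # True
print(huge.bit_length() > 64)               # True
```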

Inexact real numbers are normally written in either floating-point or scientific notation. Floating-point notation consists of a sequence of numerals followed by a decimal point and another sequence of numerals, all preceded by an optional sign. Scientific notation consists of an optional sign, a sequence of numerals, an optional decimal point followed by a second string of numerals, and an exponent; an exponent is written as the letter e followed by an optional sign and a sequence of numerals. For example, `1.0` and `-200.0` are valid inexact integers, and `1.5`, `0.034`, `-10e-10` and `1.5e-5` are valid inexact rational numbers. The exponent is the power of ten by which the number preceding the exponent should be scaled, so that `2e3` is equivalent to `2000.0`.

Exact and inexact real numbers are written as exact or inexact integers or rational numbers; no provision is made in the syntax of Slogan numbers for non-rational real numbers, i.e., irrational numbers.

The exactness of a numeric representation may be overridden by preceding the constant by either `0e` or `0i`. `0e` forces the number to be exact, and `0i` forces it to be inexact. For example, `1`, `0e1`, `1/1`, `0e1/1`, `0e1.0`, and `0e1e0` all represent the exact integer `1`, and `0i3/10`, `0.3`, `0i0.3`, and `3e-1` all represent the inexact rational `0.3`.

``````
is_exact(123 * 100)
// true
is_exact(123 * 100.0)
// false
is_inexact(123 * 100.0)
// true
1 == 1.0
// false
1 == 0e1.0
// true
inexact(1) == 1.0
// true
0i1 == 1.0
// true
0i1 == exact(1.0)
// false
``````
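The idea of exactness can be modeled in Python (not Slogan) by treating `int` and `Fraction` values as exact and `float` values as inexact. The helper names below are hypothetical and are borrowed from the Slogan functions above for readability. Two differences from Slogan: Python's `==` considers `1` and `1.0` equal, and `0 * 1.5` yields the inexact `0.0`, so the sketch does not reproduce those two results.

```python
import numbers
from fractions import Fraction

def is_exact(x):
    """Model: ints and Fractions are exact, floats are inexact."""
    return isinstance(x, numbers.Rational)

def exact(x):
    """Return the exact ratio that the value actually stores."""
    return Fraction(x)

def inexact(x):
    """Return the nearest floating-point approximation."""
    return float(x)

print(is_exact(123 * 100))            # True
print(is_exact(123 * 100.0))          # False: the inexact operand is contagious
print(exact(1.0) == 1)                # True
print(inexact(Fraction(3, 10)))       # 0.3
print(exact(0.1) == Fraction(1, 10))  # False: 0.1 is only approximated in binary
```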

Numbers are written by default in base 10, although the special prefixes `0b` (binary), `0o` (octal), `0d` (decimal), and `0x` (hexadecimal) can be used to specify base 2, base 8, base 10, or base 16. For radix 16, the letters a through f or A through F serve as the additional numerals required to express digit values 10 through 15. For example, `0b10101` is the binary equivalent of decimal `21`, `0o72` is the octal equivalent of decimal `58`, and `0xC7` is the hexadecimal equivalent of decimal `199`. Numbers written in floating-point and scientific notations are always written in base 10.

Underscores may be added to a number to improve readability. For example, the integer `1234567` could be formatted as `1_23_4567`.
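The radix conversions quoted above are easy to verify. Python happens to use the same `0b`, `0o` and `0x` prefixes and the same underscore convention (it has no `0d` prefix), so the following Python sketch, which is not Slogan code, checks the arithmetic:

```python
binary = 0b10101
octal = 0o72
hexadecimal = 0xC7
grouped = 1_23_4567      # underscores are ignored by the reader
parsed = int("c7", 16)   # lowercase digits are accepted too

print(binary, octal, hexadecimal)  # 21 58 199
print(grouped, parsed)             # 1234567 199
```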

Complex number literals take the form `R+Ii`, where `R` is the real part and `I` is the imaginary part. For example, `3+7i`.

There are functions that correspond to the arithmetic and comparison operators. These functions can accept an arbitrary number of arguments.

``````
// 15
number_is_lt(1,2,3,4,5)
// true
number_is_lt(1,2,3,40,5)
// false
number_is_lt(1,2,3,4,4)
// false
number_is_lteq(1,2,3,4,4)
// true
mult(20, 30, 40)
// 24000
``````

### 4.1.1 Bitwise Operations

In this section we will discuss functions that perform bitwise binary operations on integers. Some of the most useful of these functions are listed below:

| Function | Description |
|---|---|
| `band` | bitwise AND |
| `bior` | bitwise inclusive OR |
| `bxor` | bitwise exclusive OR |
| `bnot` | bitwise NOT |
| `bshift` | left/right arithmetic shift |
| `is_bit_set` | predicate to test a bit by position |

If the number of bits to shift is negative, `bshift` performs a right-shift. Otherwise, the bits are shifted left.

Bitwise operations assume that integers are represented in two's complement, even if they are not represented that way internally.
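The shift direction rule and the two's complement assumption can be modeled in a few lines. The sketch below is Python, not Slogan. Its `bshift` is a hypothetical reimplementation of the rule just described, and it relies on the fact that Python integers also behave as if stored in two's complement with unlimited width.

```python
def bshift(n, count):
    """Shift left for a non-negative count, arithmetic right shift otherwise."""
    if count >= 0:
        return n << count
    return n >> -count

print(bshift(1, 4))    # 16
print(bshift(16, -2))  # 4
print(bshift(-8, -1))  # -4: the sign is preserved, as in two's complement
print(~5)              # -6: NOT flips every bit of the two's complement form
print(-1 & 0xFF)       # 255: -1 behaves like an endless run of 1 bits
```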

The following program shows how to interpret an integer as a compact set of independent bits.² Note that we make use of only the first 32 bits of the integer, while the underlying value may have more bits. To view the binary representation of an integer, the built-in `number_to_string` function is called. It takes an optional second argument that specifies the base in which the result string should be formatted. To get a binary formatted string, we have to pass `2` here.

``````
function turn_bit_on(bits, i)
  if (i <= 31) bior(bits, bshift(1, i))
  else bits

let a, b = turn_bit_on(0, 31), turn_bit_on(0, 31)

number_to_string(a, 2)
// 10000000000000000000000000000000
number_to_string(b, 2)
// 10000000000000000000000000000000

a = turn_bit_on(turn_bit_on(a, 1), 5)
b = turn_bit_on(turn_bit_on(b, 1), 2)

number_to_string(a, 2)
// 10000000000000000000000000100010
number_to_string(b, 2)
// 10000000000000000000000000000110

number_to_string(band(a, b), 2) // intersection
// 10000000000000000000000000000010

number_to_string(bior(a, b), 2) // union
// 10000000000000000000000000100110

is_bit_set(a, 1) // membership test
// true
is_bit_set(a, 10)
// false
``````

### 4.1.2 Fixnums

Fixnums represent exact integers within the closed range $[-2^{w-1},\ 2^{w-1} - 1]$, where `w` is the fixnum width. The implementation-specific value of `w` can be determined via the function `fixnum_width`, and the endpoints of the range may be determined via the functions `least_fixnum` and `greatest_fixnum`.
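To make the range concrete, here is a sketch in Python (not Slogan) that assumes the usual two's complement bounds of $-2^{w-1}$ and $2^{w-1} - 1$. The width of 62 and both helper names are hypothetical, chosen only for illustration; in Slogan the real values come from `fixnum_width`, `least_fixnum` and `greatest_fixnum`.

```python
HYPOTHETICAL_WIDTH = 62  # assumption for illustration, not a Slogan guarantee

def fixnum_range(w):
    """Return (least, greatest) for a two's complement integer of w bits."""
    return -(2 ** (w - 1)), 2 ** (w - 1) - 1

def checked_add(a, b, w=HYPOTHETICAL_WIDTH):
    """Add two values and reject any result that leaves the fixnum range."""
    least, greatest = fixnum_range(w)
    result = a + b
    if not least <= result <= greatest:
        raise OverflowError("FIXNUM overflow")
    return result

least, greatest = fixnum_range(HYPOTHETICAL_WIDTH)
print(fixnum_range(8))      # (-128, 127)
print(checked_add(10, 12))  # 22

try:
    checked_add(greatest, 1)
except OverflowError as err:
    print("error:", err)    # error: FIXNUM overflow
```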

The names of arithmetic procedures that operate only on fixnums begin with the prefix "fx" to set them apart from their generic counterparts. The following example demonstrates some of the most useful operations on fixnums:

``````
// 22
//> error: FIXNUM overflow

fx_is_eq(1, 1)
// true
fx_is_gt(1, 2)
// false
fx_is_gt(10, 2)
// true
fx_is_lteq(10, 2)
// false
fx_is_lteq(10, 10)
// true
fxsub(20, 32)
// -12
fxmult(20, 32)
// 640
fxdiv(20, 32)
// 0
fxdiv(20, 2)
// 10
``````

Bit and shift operations on fixnums assume that fixnums are represented in two's complement, even if they are not represented that way internally.

``````
number_to_string(fxior(4294967296, fxshift(1, 2)), 2)
// 100000000000000000000000000000100
number_to_string(fxior(4294967296, fxshift(fxshift(1, 2), -3)), 2)
// 100000000000000000000000000000000
``````

### 4.1.3 Flonums

Flonums are inexact real numbers. Implementations typically use the IEEE double-precision floating-point representation for flonums. Flonum-specific function names begin with the prefix "fl" to set them apart from their generic counterparts.

``````
// 5.7
flmult(1.2, 4.5)
// 5.3999999999999995
fl_is_eq(0, 0.)
//> error: (Argument 1) FLONUM expected
fl_is_eq(0., 0.)
// true
fl_is_lt(-1., 0.)
// true
``````
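The slightly surprising product above is a property of IEEE double-precision arithmetic rather than of Slogan. Python floats use the same representation, so the following Python sketch (not Slogan code) reproduces it. The strict `fl_is_eq` here is a hypothetical model of the argument check shown above.

```python
product = 1.2 * 4.5
print(product)         # 5.3999999999999995
print(product == 5.4)  # False: neither 1.2 nor 5.4 is exactly representable

def fl_is_eq(a, b):
    """Hypothetical model: refuse anything that is not a float."""
    for position, value in enumerate((a, b), start=1):
        if not isinstance(value, float):
            raise TypeError(f"(Argument {position}) FLONUM expected")
    return a == b

print(fl_is_eq(0., 0.))  # True
```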

## 4.2 Characters

Characters are atomic objects representing letters, digits, special symbols such as `$` or `#`, and certain non-graphic control characters such as `space` and `newline`. Character literals are written with the `\` prefix. For example, the character `A` is written in Slogan source code as `\A`.

The following are special literals that represent non-graphic characters:

``````
\newline
\return
\tab
\space
\backspace
\alarm
\vtab
\esc
\delete
\nul
``````

Any Unicode character may be written with the syntax `\xhh`, `\uhhhh` or `\Uhhhhhhhh`, where each `h` is a hexadecimal digit and the two, four or eight digits together represent a valid Unicode scalar value.
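The three widths name the same code points; only the number of digits differs. Python string literals accept the same `\x`, `\u` and `\U` escapes, so a Python sketch (not Slogan code) can illustrate the equivalence:

```python
two_digits = "\x41"
four_digits = "\u0041"
eight_digits = "\U00000041"

print(two_digits == four_digits == eight_digits == "A")  # True
print(ord("\u03bb"))                      # 955, the scalar value of U+03BB
print("\U0001F600" == chr(0x1F600))       # True: needs the eight-digit form
```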

All the comparison operators are overloaded to work with characters:

``````
\A == \A
// true
\A == \a
// false
\A == char_upcase(\a)
// true
\c > \b
// true
``````

There are many predicates useful for finding information about characters and for comparing them:

``````
is_char(\A)
// true
is_char(65)
// false
is_char(integer_to_char(65))
// true
char_is_numeric(\2)
// true
char_is_alphabetic(\e)
// true
char_is_lower_case(\e)
// true
char_is_eq(\a, \a) // `==` optimized to work with characters
// true
char_is_lteq(\a, \b) // `<=` optimized to work with characters
// true
``````

Let us write a new predicate for characters that returns `true` if its argument is a vowel. This function will also introduce you to the `case` expression.

``````
function is_vowel(c)
  case (c)
    \a -> true
  | \e -> true
  | \i -> true
  | \o -> true
  | \u -> true
  | else -> false
``````

A `case` expression evaluates an expression and compares its value to those in a list of clauses. The comparison uses the `is_eq` function, which checks whether two values are stored in the same location in memory. On a successful match, the value of the clause is returned. An optional `else` clause can supply a default value for when every match fails.

Multiple clauses that return the same value can be compressed into a single list. In `is_vowel` the `else` can also be omitted because the default value of `case` is `false`. These two points lead to the following rewrite of the function:

``````
function is_vowel(c)
  case (c) [\a, \e, \i, \o, \u] -> true

// Usage:
is_vowel(\o)
// true
is_vowel(\b)
// false
is_vowel(\a)
// true
``````
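The matching rule can be summarized as "return the value of the first clause that lists the argument, otherwise `false`". The sketch below models that rule in Python, not Slogan. The `case` helper is hypothetical, and it compares with Python's `in` (equality) rather than the identity check that `is_eq` performs.

```python
def case(value, clauses):
    """Return the result of the first clause that lists `value`, else False."""
    for candidates, result in clauses:
        if value in candidates:
            return result
    return False  # mirrors the default described for a `case` without `else`

def is_vowel(c):
    return case(c, [(["a", "e", "i", "o", "u"], True)])

print(is_vowel("o"))  # True
print(is_vowel("b"))  # False
```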

## 4.3 Strings

A string is a sequence of characters enclosed in double-quotes. Slogan supports the Unicode standard, which means that Slogan strings can represent scripts from all of the world's written languages. The following are examples of valid string literals:

``````
"hello, world"

// a string may span multiple lines.
"this is a really
long message..."

"ἐγὼ εἰμί"
``````

Double-quotes inside a string must be escaped by a backslash.

``````
"he said: \"hello, there\""
// he said: "hello, there"
``````

The escape sequences that can appear in a string literal, and their purposes, are listed below:

``````
\n    newline
\t    tab
\r    return
\\    backslash
\b    backspace
\a    alarm
\v    vertical-tab
\"    double-quote
\e    escape
\d    delete
\0    nul
\u    unicode character encoded in 4 hexadecimal digits
\x    unicode character encoded in 2 hexadecimal digits
\U    unicode character encoded in 8 hexadecimal digits
``````

Let us familiarize ourselves with some useful operations on strings:

``````
let s = "For all its power, the computer is a harsh taskmaster."

// accessing individual characters by index:
string_at(s, 2)
// \r
s[2]
// \r

// splicing or extracting sub-strings:
substring(s, 4, 17)
// all its power
s[4:17]
// all its power
s[4:]
// all its power, the computer is a harsh taskmaster.
s[:17]
// For all its power
string_length(s[:17])
// 17

/* `count` is more generic than `string_length`, it can
also find the length of other "collections" of data, like arrays and lists. */
count(s)
// 54

// searching:
string_index_of(s, ",")
// 17
string_index_of(s, "computer")
// 23

string_append("abc", "def", "xyz")
// abcdefxyz

// split the string at commas and spaces:
string_split(s, [\,, \space])
// [For, all, its, power, the, computer, is, a, harsh, taskmaster.]
strings_join("-", string_split(s, [\,, \space]))

// comparisons
string_is_eq("abc", "abc")
// true
"abc" == "abc"
// true
string_is_eq("aBC", "abc")
// false
string_is_ci_eq("aBC", "abc")
// true
string_is_lt("abc", "xyz")
// true
"abc" < "xyz"
// true
"abc" >= "abc"
// true
``````
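The indices in the examples above are zero-based, and ranges exclude their end index. Python slices follow the same convention, so the numbers can be cross-checked with a short Python sketch (not Slogan code). Dropping empty pieces after the split is an assumption made here so that the word list matches the one shown above.

```python
import re

s = "For all its power, the computer is a harsh taskmaster."

print(s[2])       # r
print(s[4:17])    # all its power
print(s[:17])     # For all its power
print(len(s))     # 54
print(s.index(","), s.index("computer"))  # 17 23

# Split at commas and spaces, discarding the empty piece between ", ".
words = [w for w in re.split(r"[, ]", s) if w]
print("-".join(words))  # For-all-its-power-the-computer-is-a-harsh-taskmaster.
```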

## 4.4 Symbols

Symbols are used for a variety of purposes as symbolic names in Slogan programs. Symbol constants are written by prefixing identifiers with the quote mark (`'`). All characters valid in identifiers can be used in symbols. Symbols with spaces and special characters are written by enclosing the symbol in backtick (`` ` ``) quotes. The following are all valid symbols:

``````
'abc
'$abc
'`abc def`
'`abc+def`
``````

Strings could be used for most of the same purposes, but an important characteristic of symbols makes comparisons for equality much more efficient: two symbols with the same sequence of characters are stored in the same memory location. This makes it possible to test them for equality with the `is_eq` function, which quickly checks whether its arguments point to the same location in memory. String comparisons, on the other hand, generally require checking each character in both strings.

``````
let a = "hello"
let b = "hello"

// `==` will compare each character in both strings
a == b
// true

// so does `string_is_eq`
string_is_eq(a, b)
// true

/* `is_eq` only checks whether two objects occupy the same
location in memory */
is_eq(a, b)
// false
is_eq(a, a)
// true

// In contrast to strings, two symbols made of the same sequence of characters
// can be efficiently compared for equality just by checking their memory locations.
let x = 'hello
let y = 'hello
let z = 'Hello

is_eq(x, y)
// true
is_eq(x, z)
// false
``````

It is possible to construct new symbols from strings and to convert symbols to strings:

``````
is_eq('hello, string_to_symbol("hello"))
// true
"hello" == symbol_to_string('hello)
// true
``````

¹ A predicate is a function that answers a question with a `true` or `false` value.

² Slogan has a composite type `bitarray` that can represent bitmaps of arbitrary sizes. This type will be introduced in the next chapter.